
SIEM Export Parser & KPI
Platform
Python
Tags
#CSV Parsing
#Easy to use
#Algorithms
Year
Summer 2022
"Alert monitoring monitoring"
​
After a SIEM analyst has spent 8 hours or more per day, 5 days or more per week, reviewing and handling one alert after the other to keep the organization safe, there is a satisfaction to be felt. That feeling of a job well done and a well-deserved weekend.
​
In an ideal world, all these alerts would be relevant, would happen during the day, and could all be treated.
In the real world, not all alerts are relevant, they happen 24/7, and maybe not all can be treated due to meetings, breaks, technical issues or alert handling taking more time than foreseen.
​
Taking the time to review the alert monitoring itself is time that is seldom spent properly, because everyone is always busy. Fortunately, a tool that does this for you is never too busy.
​
This tool was created to answer a few questions:
- Which alerts happen & when?
- How complete is the alert monitoring, and what are we missing?
(aka: "Give me Key Performance Indicators (KPI)")
- Of what appears missed, what was really missed?
​
​
How it started...
9/8/2022
At my customer, a well-known SIEM solution is used to perform real-time security monitoring, with the native annotation system used to mark alerts by the L1 analyst. The number of alerts received can easily be in the tens of thousands per month. While the SIEM excels at security event collection and correlation, showing alerts in real-time (or with a small window of time going back up to a few days), it is not so convenient to use when you want the alerts & their annotation status for, say, the last month.
What it can provide is an exported CSV file with a pre-selected number of fields. However, my customer wanted logic applied to this CSV that could not be programmed into the SIEM solution itself. Logic like:
- How many alerts happen during the day-shift or the night-shift? (see the sketch after this list)
- Of the alerts that happened during the day-shift, how many were marked with a specific annotation code "Closed"?
- The CSV is imperfect and requires manual corrections (e.g. to identify and remove duplicates). Can this correction be automated?
- Can this data be generated over a longer period of time without modifying the exports?
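To give an idea of the kind of logic meant here, below is a minimal sketch of the day-shift/night-shift split. The column name ("Alert Time"), the timestamp format and the shift boundaries (07:00-19:00 as day-shift) are assumptions for illustration, not the customer's actual export layout.

import csv
from datetime import datetime

# Assumed for illustration: the export has an "Alert Time" column in this
# format, and the day-shift runs from 07:00 to 19:00.
TIME_COLUMN = "Alert Time"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
DAY_SHIFT_START, DAY_SHIFT_END = 7, 19

def count_alerts_per_shift(csv_path):
    """Return (day_shift_count, night_shift_count) for one exported CSV."""
    day, night = 0, 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            hour = datetime.strptime(row[TIME_COLUMN], TIME_FORMAT).hour
            if DAY_SHIFT_START <= hour < DAY_SHIFT_END:
                day += 1
            else:
                night += 1
    return day, night

if __name__ == "__main__":
    print(count_alerts_per_shift("export.csv"))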
Up until now, this was time-intensive manual labor to be conducted each month. While a well-motivated employee could do this in a few hours, there are some issues with this:
- When the analyst is spending a few hours per month reviewing & manipulating the export for reporting purposes, that analyst is not looking at new alerts.
- The analyst is human (I'd assume), and humans make errors. Especially when skimming over ten thousand alerts to catch and correct export errors & imperfections. While still a fair approximation, it is not exact and not repeatable.
As I know that Python is very strong at reading in and parsing data, as well as representing this data in a variety of ways, I decided to create a Python script to automate this work.
The Script Design
The script design is rather simple (a minimal sketch follows below this list):
- A class is used to represent a CSV file as an object.
- This object provides several important methods:
  - to populate itself one alert at a time. Each alert (a line in the CSV file) has columns that carry meaning and are typed. For example: When did the alert occur? What happened? Various data like source IP? Involved user context? Review status?
  - The object is, unlike the CSV file, self-aware of its contents. The date, for example, allows the object to know which alerts happened on a specific day. The review status allows the object to know whether one of its alerts has been reviewed or not.
  - to allow itself to be merged with another CSV file object, whilst doing consistency checks (avoiding duplicates and not mismatching CSV objects with a different composition).
  - to allow it to report metrics on itself. Such as: how many alerts happen during a given shift? Or how many alerts had a specific review status?
- Certain generic functions were added that, for example, allow the data the CSV object returns to be represented as a graph or as tabular output. The CLI menu, and common functions like reading and writing data, are also provided this way.
- The script can be used in two ways: either on a monthly basis, or on a long-term basis while still aggregating data per month... and it does so dynamically. Where shortcuts in the menu can be taken, they should be taken, by detecting the CSV files the user supplies in a predetermined location.
- The script is meant to be used by people who don't need to know all of the above. Hence I decided to go for a simple CLI-based menu.
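As a rough illustration of that design (not the actual implementation; the column names "Alert Time", "Alert Name" and "Review Status" are invented for the example), the class could look something like this:

import csv
from datetime import datetime

class AlertCsv:
    """Sketch of a CSV-file-as-object: it knows its own alerts and can
    populate, merge and report on itself. Column names are assumptions."""

    def __init__(self, columns):
        self.columns = tuple(columns)   # the CSV composition
        self.alerts = []                # one dict per alert line

    def populate(self, csv_path):
        """Read a CSV export and add its alerts one at a time."""
        with open(csv_path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # typed columns: the timestamp is parsed, the rest stays text
                row["Alert Time"] = datetime.strptime(
                    row["Alert Time"], "%Y-%m-%d %H:%M:%S")
                self.alerts.append(row)

    def merge(self, other):
        """Merge another AlertCsv into this one, with consistency checks."""
        if other.columns != self.columns:
            raise ValueError("CSV objects have a different composition")
        known = {(a["Alert Time"], a["Alert Name"]) for a in self.alerts}
        for alert in other.alerts:
            if (alert["Alert Time"], alert["Alert Name"]) not in known:
                self.alerts.append(alert)   # skip duplicates

    def count_by_status(self, status):
        """Metric example: how many alerts carry a given review status?"""
        return sum(1 for a in self.alerts if a["Review Status"] == status)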
How does it work?
The script expects to find:
- one or more CSV files in the same folder the script itself is located in (for "file" mode operations)
- one or more CSV files in a subfolder named "historic_data" (for "folder" mode operations, aka the historic overview)
Depending on what is found, the script can auto-pick its mode, if only one or the other is present.
If multiple CSV files are present in the folder the script resides in, it will offer the end-user the choice of the intended CSV file.
If CSV files are present both in the folder the script resides in and under the historic_data subfolder, the end-user needs to choose the mode manually.
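A minimal sketch, assuming only the folder layout described above, of how that mode auto-detection could be done:

from pathlib import Path

def detect_mode(script_dir):
    """Pick 'file' mode, 'folder' mode, or leave the choice to the user,
    based on which CSV files are present."""
    script_dir = Path(script_dir)
    local_csvs = sorted(script_dir.glob("*.csv"))
    historic_csvs = sorted((script_dir / "historic_data").glob("*.csv"))

    if local_csvs and not historic_csvs:
        return "file", local_csvs
    if historic_csvs and not local_csvs:
        return "folder", historic_csvs
    # Both (or neither) are present: the end-user has to choose manually.
    return None, local_csvs + historic_csvs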
As required, the script allows:
- calculation of alerts per shift
- sanitizing the CSV ("removing" errors from it)
- calculation of certain metrics
As the code is made public, you can test it with the demo CSV files provided below.
The most elaborate mode is the historic overview over multiple months (that is, one CSV file per month under the historic_data folder).
The files are read in one by one and stored & merged into the main CSV object. Once ready, its metrics can be drawn:
Notice the CLI representation, and a GUI representation using matplotlib. (Note: this data is altered and does not reflect the situation at the customer.)
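As an illustration of those two representations, here is a sketch that prints per-month totals as a CLI table and draws them as a matplotlib bar chart. The numbers are made up purely for the example and are not taken from the actual data.

import matplotlib.pyplot as plt

# Made-up monthly totals, only to illustrate the two representations.
months = ["2022-01", "2022-02", "2022-03", "2022-04"]
closed = [812, 901, 768, 845]
unreviewed = [120, 95, 143, 88]

# CLI (tabular) representation
print(f"{'Month':<10}{'Closed':>10}{'Unreviewed':>12}")
for month, c, u in zip(months, closed, unreviewed):
    print(f"{month:<10}{c:>10}{u:>12}")

# GUI (graph) representation
plt.bar(months, closed, label="Closed")
plt.bar(months, unreviewed, bottom=closed, label="Unreviewed")
plt.ylabel("Alerts per month")
plt.legend()
plt.show()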
Of course, the intelligence of this script lies in its sanitization mode, i.e. the removal of errors that reduce the desired metrics.
For example: say that an analyst marks an alert as closed, while a duplicate alert happens within 5 minutes of that alert (before or after in time) which had NOT been marked as closed. It is agreed that the duplicate alert counts as closed as well. The script detects and corrects the marking as shown below (and writes out a sanitized version). The cycle is iterative, as duplicate alerts may themselves repeat every few minutes... it suffices that the analyst marks only one of them (e.g. for an ongoing scan over a longer period of time).
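A simplified sketch of that correction pass (the 5-minute window is as described above; the field names and the in-memory representation are assumptions, the real script operates on the CSV object):

from datetime import timedelta

WINDOW = timedelta(minutes=5)

def sanitize(alerts):
    """Propagate the 'Closed' status to duplicate alerts that occur within
    WINDOW of an already-closed alert with the same name. Each alert is a
    dict with a parsed datetime under 'Alert Time'. The pass repeats until
    nothing changes, so chains of repeated alerts are covered too."""
    changed = True
    while changed:
        changed = False
        closed = [a for a in alerts if a["Review Status"] == "Closed"]
        for alert in alerts:
            if alert["Review Status"] == "Closed":
                continue
            for ref in closed:
                same_name = alert["Alert Name"] == ref["Alert Name"]
                close_in_time = abs(alert["Alert Time"] - ref["Alert Time"]) <= WINDOW
                if same_name and close_in_time:
                    alert["Review Status"] = "Closed"
                    changed = True
                    break
    return alerts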
Wait... wait... the KPI increased because of this? This is golden for the reports :). Ad demonstrandum, let me draw the metrics again
The script was quickly dubbed the KPI-booster. Unfortunately, and logically, it can only boost the numbers once :).
​
You can find my code and sample data on GitHub.


