Comparison of Detection Methodologies in SIEM. Correlation and Search.
In some cases, it is questionable whether correlation is really necessary. However, without automated correlation and alerting functions, log management systems require significantly more labor and skilled professionals to glean any meaningful security information.
If you have a good blue team to “do” the correlation job, you can search logs to detect attacks and suspicious or malicious events. If you do not have a blue team or dedicated threat hunters, searching alone will not detect them. And unless your company is a large enterprise, most of the time you do not have a dedicated team.
Also, for some security use cases it is acceptable to detect an event 5, 10, or 15 minutes later, but not for all of them. Most of the time, detecting in real time is critical, as in “detecting a user added to the domain admins group”.
Real-time analysis of data as it comes in, powered by robust analytical software that can handle large amounts of data, is critical.
Prevention is better than a cure. Thus, you must try your best to prevent a data breach.
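A minimal Python sketch of what real-time detection means for the “user added to the domain admins group” case: each event is evaluated the moment it arrives, rather than by a query scheduled every few minutes. The `Event` fields, the `privileged_group_alerts` function, and the sample data are hypothetical. The sketch assumes that Windows Security event ID 4728 (a member was added to a security-enabled global group) is the relevant signal.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Assumption: event ID 4728 = member added to a security-enabled global group.
GROUP_MEMBER_ADDED = 4728
# Assumption: group names are compared case-insensitively.
WATCHED_GROUPS = {"domain admins"}


@dataclass(frozen=True)
class Event:
    """Hypothetical, already-parsed log event."""
    event_id: int
    group: str
    member: str
    actor: str


def privileged_group_alerts(stream: Iterable[Event]) -> Iterator[str]:
    """Yield an alert as soon as a matching event is read from the stream."""
    for event in stream:
        if (event.event_id == GROUP_MEMBER_ADDED
                and event.group.lower() in WATCHED_GROUPS):
            yield f"ALERT: {event.actor} added {event.member} to {event.group}"


# Sample data (made up for illustration).
events = [
    Event(4624, "", "alice", "alice"),
    Event(4728, "Domain Admins", "mallory", "bob"),
]
alerts = list(privileged_group_alerts(events))
```

Because the function is a generator, an alert is produced while the stream is still being consumed, which is the property a scheduled search does not have.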
Another advantage of correlation is its power in multi-stage use cases like the following, in which an insider tries to log in to a machine:
1- She only tries to log in during lunch hours.
2- If she fails to log in, she waits 15 minutes before the second try.
3- After the second try, she stops trying so as not to be detected.
4- She repeats this behavior more than 3 times in a week.
Most of the time, it is hard or impossible to detect use cases like the one above in real time with log searches or queries, even if you have a talented and dedicated security team.
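The multi-stage use case above can be sketched as a small stateful rule in Python. This is only an illustration under stated assumptions, not how any particular correlation engine implements it: the lunch window (12:00 to 13:30), the input format of `(user, timestamp)` pairs for failed logins, and the function names are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta

LUNCH_START, LUNCH_END = time(12, 0), time(13, 30)  # assumed lunch window
MIN_GAP = timedelta(minutes=15)   # wait between first and second try
WEEK = timedelta(days=7)
MIN_REPEATS = 4                   # "more than 3 times in a week"


def suspicious_days(failed_logins):
    """Map user -> sorted days with exactly two lunch-hour failures >= 15 min apart."""
    per_day = defaultdict(list)
    for user, ts in failed_logins:
        if LUNCH_START <= ts.time() <= LUNCH_END:
            per_day[(user, ts.date())].append(ts)
    days = defaultdict(list)
    for (user, day), stamps in per_day.items():
        stamps.sort()
        # Exactly two tries: she gives up after the second attempt.
        if len(stamps) == 2 and stamps[1] - stamps[0] >= MIN_GAP:
            days[user].append(day)
    return {user: sorted(d) for user, d in days.items()}


def flagged_users(failed_logins):
    """Return users with at least MIN_REPEATS suspicious days inside any 7-day span."""
    flagged = set()
    for user, days in suspicious_days(failed_logins).items():
        for i in range(len(days) - MIN_REPEATS + 1):
            if days[i + MIN_REPEATS - 1] - days[i] < WEEK:
                flagged.add(user)
                break
    return flagged


# Sample data (made up): "eve" repeats the pattern on four consecutive days.
start = datetime(2020, 3, 2, 12, 5)
logins = []
for offset in range(4):
    first = start + timedelta(days=offset)
    logins += [("eve", first), ("eve", first + timedelta(minutes=20))]
logins.append(("bob", datetime(2020, 3, 3, 12, 10)))  # single failure, ignored
result = flagged_users(logins)
```

The rule has to keep per-user state across several days, which is what makes it awkward to express as a single scheduled search.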
Elastic Stack, Splunk ES, Logz.io, Microsoft Sentinel, SumoLogic, and LogPoint are good examples of products that do correlation searches. They do not have a separate correlation engine. SureLog has a separate correlation and detection engine. QRadar uses the EventGnosis Complex Event Processing product as its correlation engine, and RSA uses Esper CEP for correlation. LogRhythm and McAfee are other SIEM solutions that have a separate correlation engine [1].
A dedicated security team can develop search queries and schedule them as correlation queries. In this case, consider:
An example of a case that extends the learning curve, from Logz.io: you have to learn both the Lucene query language and the Kibana Query Language (KQL).
Another example, from Microsoft Sentinel: it is not easy to develop this Sentinel query.
let timeframe = 1d;
let DomainList = dynamic(["tor2web.org", "tor2web.com", "torlink.co", "onion.to", "onion.ink", "onion.cab", "onion.nu", "onion.link",
"onion.it", "onion.city", "onion.direct", "onion.top", "onion.casa", "onion.plus", "onion.rip", "onion.dog", "tor2web.fi",
"tor2web.blutmagie.de", "onion.sh", "onion.lu", "onion.pet", "t2w.pw", "tor2web.ae.org", "tor2web.io", "tor2web.xyz", "onion.lt",
"s1.tor-gateways.de", "s2.tor-gateways.de", "s3.tor-gateways.de", "s4.tor-gateways.de", "s5.tor-gateways.de", "hiddenservice.net"]);
Syslog
| where TimeGenerated >= ago(timeframe)
| where ProcessName contains "squid"
| extend URL = extract("(([A-Z]+ [a-z]{4,5}:\\/\\/)|[A-Z]+ )([^ :]*)",3,SyslogMessage),
    SourceIP = extract("([0-9]+ )(([0-9]{1,3})\\.([0-9]{1,3})\\.([0-9]{1,3})\\.([0-9]{1,3}))",2,SyslogMessage),
    Status = extract("(TCP_(([A-Z]+)(_[A-Z]+)*)|UDP_(([A-Z]+)(_[A-Z]+)*))",1,SyslogMessage),
    HTTP_Status_Code = extract("(TCP_(([A-Z]+)(_[A-Z]+)*)|UDP_(([A-Z]+)(_[A-Z]+)*))/([0-9]{3})",8,SyslogMessage),
    User = extract("(CONNECT |GET )([^ ]* )([^ ]+)",3,SyslogMessage),
    RemotePort = extract("(CONNECT |GET )([^ ]*)(:)([0-9]*)",4,SyslogMessage),
    Domain = extract("(([A-Z]+ [a-z]{4,5}:\\/\\/)|[A-Z]+ )([^ :\\/]*)",3,SyslogMessage),
    Bytes = toint(extract("([A-Z]+\\/[0-9]{3} )([0-9]+)",2,SyslogMessage)),
    contentType = extract("([a-z/]+$)",1,SyslogMessage)
| extend TLD = extract("\\.[a-z]*$",0,Domain)
| where HTTP_Status_Code == "200"
| where Domain contains "."
| where Domain has_any (DomainList)
| extend timestamp = TimeGenerated, URLCustomEntity = URL, IPCustomEntity = SourceIP, AccountCustomEntity = User
Example from Splunk: detect when an account has failed to authenticate to the same Windows server 3 times, followed by a successful logon (same account, same host), within a period of 10 minutes [2].
If you check the link, you will see that the question still remains unanswered. It may be possible to develop this use case, and there are some comments on the thread, but it is obviously not that easy.
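For comparison, the sequencing logic of this use case can be sketched in a few lines of Python when events are processed in time order with per-key state. This is a hedged illustration, not an answer to the linked Splunk question: the tuple layout `(timestamp, account, host, outcome)`, the outcome strings, and the function name `failures_then_success` are assumptions.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)
FAILURES_REQUIRED = 3


def failures_then_success(events):
    """Find successes preceded by >= 3 failures for the same account and host.

    events: iterable of (timestamp, account, host, outcome), sorted by time,
    where outcome is "failure" or "success" (hypothetical format).
    Returns a list of (account, host, success_time).
    """
    recent_failures = defaultdict(deque)  # (account, host) -> failure times
    alerts = []
    for ts, account, host, outcome in events:
        failures = recent_failures[(account, host)]
        # Forget failures that fell out of the 10-minute window.
        while failures and ts - failures[0] > WINDOW:
            failures.popleft()
        if outcome == "failure":
            failures.append(ts)
        elif outcome == "success":
            if len(failures) >= FAILURES_REQUIRED:
                alerts.append((account, host, ts))
            failures.clear()  # a success resets the sequence
    return alerts


# Sample data (made up for illustration).
t0 = datetime(2020, 3, 24, 9, 0)
sample = sorted([
    (t0, "svc_backup", "SRV01", "failure"),
    (t0 + timedelta(minutes=1), "svc_backup", "SRV01", "failure"),
    (t0 + timedelta(minutes=2), "svc_backup", "SRV01", "failure"),
    (t0 + timedelta(minutes=4), "svc_backup", "SRV01", "success"),
    (t0 + timedelta(minutes=1), "alice", "SRV01", "failure"),
    (t0 + timedelta(minutes=3), "alice", "SRV01", "success"),
])
alerts = failures_then_success(sample)
```

The key design point is the state kept per (account, host) pair between events, which a sequence-aware correlation engine maintains for you.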
- Search performance.
- Ability to sequence events. Most search-based products do not support this feature. For example, Splunk Enterprise/Core does not support sequencing events, but Splunk Enterprise Security (ES) does.
- Correlation searches. Most search-based products do not support this feature. For example, Splunk Enterprise/Core does not support correlation searches, but Splunk Enterprise Security (ES) does.
- Real time detection requirements.
For example, if you use Splunk ES for real-time detection, you have to consider the following:
“Each realtime search unpreemptively locks 1 core on EVERY INDEXER and on your Search Head”
Real time search limitations from SumoLogic.
Next-Gen SIEM solutions with correlation capabilities have an easy-to-use GUI for developing correlation rules. Compared to developing correlation searches, using a correlation rule wizard is easy. Consider the use case of an account that is added and then deleted:
Microsoft Sentinel correlation search [3]: You have to develop 2 different correlation queries for this use case, a first query and a second query, whose results (account_created and account_deleted) are then joined:
account_created
| join kind=inner (account_deleted) on Computer, TargetUserName
| where deletionTime - creationTime < lookback
| where tolong(deletionTime - creationTime) >= 0
| project TimeDelta = deletionTime - creationTime, creationTime, CreateEventID, Computer, TargetUserName, UserPrincipalName, AccountUsedToCreate,
    deletionTime, DeleteEventID, AccountUsedToDelete
| extend timestamp = creationTime, AccountCustomEntity = AccountUsedToCreate, HostCustomEntity = Computer
Splunk ES correlation search [4]:
sourcetype=WinEventLog:Security (EventCode=4726 OR EventCode=4720)
| eval Date=strftime(_time, "%Y/%m/%d")
| rex "Subject:\s+\w+\s\S+\s+\S+\s+\w+\s\w+:\s+(?<SourceAccount>\S+)"
| rex "Target\s\w+:\s+\w+\s\w+:\s+\S+\s+\w+\s\w+:\s+(?<DeletedAccount>\S+)"
| rex "New\s\w+:\s+\w+\s\w+:\s+\S+\s+\w+\s\w+:\s+(?<NewAccount>\S+)"
| eval SuspectAccount=coalesce(DeletedAccount,NewAccount)
| transaction SuspectAccount startswith="EventCode=4720" endswith="EventCode=4726"
| eval duration=round(((duration/60)/60)/24, 2)
| eval Age=case(duration<=1, "Critical", duration>1 AND duration<=7, "Warning", duration>7, "Normal")
| table Date, index, host, SourceAccount, SuspectAccount, duration, Age
| rename duration as "Days Account was Active"
| sort + "Days Account was Active"
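Stripped of query syntax, the account created and deleted use case is a keyed join between creation events (4720) and deletion events (4726) with a time constraint. The following Python sketch shows that logic only. The 24-hour `LOOKBACK`, the tuple layout, and the function name `short_lived_accounts` are assumptions made for illustration and are not taken from either product.

```python
from datetime import datetime, timedelta

ACCOUNT_CREATED, ACCOUNT_DELETED = 4720, 4726  # Windows Security event IDs
LOOKBACK = timedelta(hours=24)                 # assumed threshold


def short_lived_accounts(events):
    """Return (computer, user, lifetime) for accounts deleted soon after creation.

    events: iterable of (timestamp, event_id, computer, target_user) tuples
    in any order (hypothetical, already-parsed format).
    """
    created = {}   # (computer, user) -> creation time
    matches = []
    for ts, event_id, computer, user in sorted(events):
        key = (computer, user)
        if event_id == ACCOUNT_CREATED:
            created[key] = ts
        elif event_id == ACCOUNT_DELETED and key in created:
            lifetime = ts - created.pop(key)
            if lifetime < LOOKBACK:
                matches.append((computer, user, lifetime))
    return matches


# Sample data (made up for illustration).
sample = [
    (datetime(2020, 3, 24, 9, 0), 4720, "DC01", "tmpadmin"),
    (datetime(2020, 3, 24, 11, 30), 4726, "DC01", "tmpadmin"),
    (datetime(2020, 3, 20, 8, 0), 4720, "DC01", "contractor"),
    (datetime(2020, 3, 25, 8, 0), 4726, "DC01", "contractor"),
]
short_lived = short_lived_accounts(sample)
```

Here only `tmpadmin` is reported, since `contractor` lived for five days.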
SureLog correlation wizard [5]: SureLog SIEM has a correlation engine and a correlation rule wizard. Everything is GUI-based and easy to use.
Using a correlation rule wizard is very easy compared to developing correlation searches. Most of the time, developing an advanced correlation search is called an “art”.
SIEM can be consumed in a number of ways, including as a managed service or as a co-managed SIEM. If you do not have a large enough security team, you may consider using SaaS instead of a SIEM with a powerful correlation engine. SaaS SIEM services are now popular and are considered a cheaper solution [6]. You have to consider the disadvantages of SaaS SIEM [6].
The co-managed SIEM model may fit when SaaS does not. So it is likely that any organization will be able to find a good fit for itself.
The Power of Correlation
Security is not searching. SIEM does the correlation and analysis of events to detect anomalies and threats in real-time. In essence, the SIEM is only as good as its rules [1] and threat intelligence.
With only the logs, all an analyst sees is: “Bob authenticated from a foreign country”. The analyst needs the context around this information to make a reasoned assessment of any security alert involving this authentication. The true value of logs lies in correlating them to get actionable information.
Event correlation techniques are the cornerstone of any reliable strategy that focuses on prevention rather than reaction.
“if you are spending 80 percent of your time within a SIEM tool doing alert review and analysis, then you are on the right track. If you are an organization that is instead focusing heavily on collecting more data sources, applying patches, or running compliance reports, then your SIEM implementation may not be tactical.” [7]
Active notifications are real-time detections. Instead of relying on searches to find out that a system is reporting an error, you can be actively notified by your SIEM system in real time as unexpected issues come up [1].
When you look at a single log file, it provides just a single point of view, such as an application point of view or a server point of view. Without the capability to correlate views across multiple logs from different components, it is very difficult to get a full picture of the problems that are occurring.
UEBA is another form of correlation. UEBA modules and tools use ML for detection. System requirements and cost are the main parameters to check for ML tools and modules. There are fully ML-based tools, and you can also add UEBA and ML features optionally to cost-effective Next-Gen SIEMs [8]. Before going with pure UEBA tools or modules, keep in mind that they are not a silver bullet [9,10]. State-of-the-art research on insider threat detection mostly focuses on developing unsupervised behavioral anomaly detection techniques to find anomalies or abnormal changes in user behavior over time. However, an anomalous activity is not necessarily a malicious one that can lead to an insider threat scenario. So ML or AI is not a silver bullet. A UEBA or ML/AI module aims to address the talent shortage but can exacerbate it.
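To make the point that anomalous does not mean malicious concrete, here is a deliberately simple Python sketch of an unsupervised behavioral baseline: a z-score over a user's own history. It is not how any UEBA product works. The metric (files accessed per day), the threshold of 3, and the function name `anomaly_score` are assumptions for illustration.

```python
from statistics import mean, stdev

THRESHOLD = 3.0  # assumed cut-off, in standard deviations


def anomaly_score(history, today):
    """Return how many standard deviations today's value is from the baseline."""
    mu, sigma = mean(history), stdev(history)
    return (today - mu) / sigma if sigma else 0.0


# Made-up baseline: files accessed per day by one user over a week.
history = [20, 22, 19, 21, 23, 20, 22]
score = anomaly_score(history, 60)
is_anomalous = score > THRESHOLD
```

The score only says that the day with 60 file accesses is unusual for this user. Whether it was data theft or a quarter-end reporting task still has to be decided by an analyst, which is why such a module does not remove the need for skilled staff.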
Be aware of the drawbacks of UEBA. There are several drawbacks [11] to applying ML and other artificial intelligence (AI) approaches to security. Most ML tools are black boxes by nature and can be difficult to audit. No one wants a black box making decisions without knowing what it is doing. Also, appropriate security data may not be sufficiently available [12].
Do your research, use the guidelines, and work out what is most important for you. A lot of what is sold as AI is simply marketing, says Eugene Kaspersky.
Cost of Correlation Versus Cost of Searching
When it comes to price, Next-Gen SIEMs with powerful correlation are most often considered expensive tools, but this is not correct. There are SIEM tools with cost-effective prices.
For organizations with a limited cyber-security budget, the procurement and roll-out of SIEM products (either proprietary or open-source) are often seen as too costly, whether due to the price of the license or equipment, the dependency on niche skills, or both. But there are cost-effective solutions on both sides.
Factors That Affect SIEM Cost
For example, SureLog has a flat pricing model. On the search side, the Elastic Stack license is free. Also, SureLog's hardware requirements, especially the disk size requirement, are less than 1/10 of those of its closest rival [7]. When it comes to deployment and consulting costs, there are free solutions like the Elastic Stack. SureLog's deployment and consulting costs are also cost-effective.
References
[1] https://www.peerlyst.com/posts/what-really-matters-when-selecting-a-siem-and-how-to-choose-a-siem-looking-into-the-correlation-ertugrul-akbas
[2] https://answers.splunk.com/answers/663659/need-help-writing-query-to-alert-if-an-account-has.html
[3] https://techcommunity.microsoft.com/t5/azure-sentinel/azure-sentinel-correlation-rules-the-join-kql-operator/ba-p/1041500
[4] https://gosplunk.com/accounts-deleted-within-24-hours-of-creation/
[5] https://medium.com/@eakbas/surelog-ueba-3cbf478d319d
[6] https://www.peerlyst.com/posts/siem-for-smb-in-2020-ertugrul-akbas
[7] https://www.peerlyst.com/posts/how-to-select-the-right-siem-solution-ertugrul-akbas
[8] https://www.peerlyst.com/posts/domain-generational-algorithm-dga-detection-in-surelog-ertugrul-akbas
[9] https://www.peerlyst.com/posts/ml-ai-is-a-feature-not-a-silver-bullet-and-ueba-questions-ertugrul-akbas
[10] https://www.peerlyst.com/posts/ai-in-cybersecurity-a-reality-check-steve-king
[11] https://towardsdatascience.com/the-limitations-of-machine-learning-a00e0c3040c6
[12] https://www.computerworld.com/article/3466508/the-impact-of-machine-learning-on-security.html
Copyright © 2020 by Ertuğrul AKBAŞ. All Rights Reserved
Originally published at https://www.peerlyst.com on March 24, 2020.