Anomaly Mining in Windows Event Logs in Splunk


Forward-Looking Statements

This presentation may contain forward-looking statements regarding future events, plans or the expected financial performance of our company, including our expectations regarding our products, technology, strategy, customers, markets, acquisitions and investments. These statements reflect management’s current expectations, estimates and assumptions based on the information currently available to us. These forward-looking statements are not guarantees of future performance and involve significant risks, uncertainties and other factors that may cause our actual results, performance or achievements to be materially different from results, performance or achievements expressed or implied by the forward-looking statements contained in this presentation.

For additional information about factors that could cause actual results to differ materially from those described in the forward-looking statements made in this presentation, please refer to our periodic reports and other filings with the SEC, including the risk factors identified in our most recent quarterly reports on Form 10-Q and annual reports on Form 10-K, copies of which may be obtained by visiting the Splunk Investor Relations website at www.investors.splunk.com or the SEC's website at www.sec.gov. The forward-looking statements made in this presentation are made as of the time and date of this presentation. If reviewed after the initial presentation, even if made available by us, on our website or otherwise, it may not contain current or accurate information. We disclaim any obligation to update or revise any forward-looking statement based on new information, future events or otherwise, except as required by applicable law.

In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. We undertake no obligation either to develop the features or functionalities described or to include any such feature or functionality in a future release.

Splunk, Splunk>, Data-to-Everything, D2E and Turn Data Into Doing are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names or trademarks belong to their respective owners. © 2021 Splunk Inc. All rights reserved.
© 2021 SPLUNK INC.

Anomaly Mining in Windows Event Logs
SEC1395A

Efi Kaufman
CTO, Cyber Security Center | Israel Ministry of Energy

Dr Greg Ainslie-Malik
Principal Product Manager | Splunk

Agenda

1) What is the problem we are trying to solve?
How can we spot sophisticated attacks against our environment?

2) How did we solve it?
With ML of course!

3) Demo time!
Sit back and relax...

4) What’s next?
Other considerations for making our analytics operational

Learning outcomes from this talk

SOC Maturity: Rule based approaches have a place as much as ML
Methodology: ML isn’t magic, but it can be a useful tool
Threat Hunting: You can find some interesting stuff by hunting through your logs

Quick Background About SIEM

Traditional SIEM implementations are based on rules.

● Simple rules detect an event and trigger an alert.
For example: Authentication failed -> trigger an alert.

● Correlation rules join two or more rules/events to accurately detect a pattern.
For example: multiple failed authentication attempts from the same host to different computers using different user names within ten minutes. If a successful login then occurs on any of these computers, trigger an alert.

Rules are used to search for known Indicators of Compromise (IoC) and well-known but generic attack patterns. More mature and advanced implementations leverage data from the MITRE ATT&CK framework to pinpoint specific threats and attacks.
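The correlation rule described on this slide can be sketched in Python. This is only an illustration under stated assumptions: the event tuple layout, the `correlate` function name, the host and user names, and the threshold of three distinct targets for "multiple" are all hypothetical and do not come from any SIEM product.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)   # time window taken from the slide
MIN_TARGETS = 3                  # assumed meaning of "multiple" failures


def correlate(events):
    """Return (time, source, dest) alerts for a success that follows a burst of failures.

    events: iterable of (time, source_host, dest_host, user, outcome) tuples,
    where outcome is "failure" or "success". This layout is hypothetical.
    """
    failures = defaultdict(list)  # source_host -> [(time, dest_host, user)]
    alerts = []
    for time, source, dest, user, outcome in sorted(events):
        # Keep only this source's failures that are still inside the window.
        recent = [f for f in failures[source] if time - f[0] <= WINDOW]
        failures[source] = recent
        if outcome == "failure":
            recent.append((time, dest, user))
            continue
        targets = {d for _, d, _ in recent}
        users = {u for _, _, u in recent}
        # Fire when the burst hit several computers with several user names
        # and the successful login lands on one of those computers.
        if len(targets) >= MIN_TARGETS and len(users) >= MIN_TARGETS and dest in targets:
            alerts.append((time, source, dest))
    return alerts


# Hypothetical events: three failures from one workstation, then a success.
t0 = datetime(2021, 10, 19, 9, 0)
events = [
    (t0, "ws-17", "srv-a", "alice", "failure"),
    (t0 + timedelta(minutes=1), "ws-17", "srv-b", "bob", "failure"),
    (t0 + timedelta(minutes=2), "ws-17", "srv-c", "carol", "failure"),
    (t0 + timedelta(minutes=3), "ws-17", "srv-b", "dave", "success"),
]
print(correlate(events))
```

Even this small rule needs a window, a threshold and three joined conditions, which hints at why such rules are easy to break when a field is not extracted correctly.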

Industry-wide operational SIEM challenges

“Many of the rules and policies organizations currently have in place are ineffective.
CardinalOps research data shows that an average of 25% of SIEM rules are broken and
will never fire, primarily due to fields that are not extracted correctly or log sources that are not
sending the required data. However, organizations are completely unaware that these rules are
not functioning. Additionally, only 15% of SIEM rules lead to 95% of the tickets handled by
the Security Operations Center (SOC), demonstrating that a small percentage of noisy rules
overwhelm SOC analysts with distracting false positive (FP) alerts.”

CARDINALOPS Research -
https://www.securitymagazine.com/articles/94556-enterprise-siems-unprepared-for-84-of-mitre-attck-tactics-and-techniques

SOC Analytics Maturity
Rules aren’t all bad...

Rule based analytics: Simple detections for well known events, adding new users/systems, Antivirus and IDS alerts

Statistics based analytics: More complex behavioural detections, such as an endpoint connecting to external host more than usual. Based on simple statistics such as models

Machine learning based analytics: Even more complex behavioural detections, such as what we will run through shortly...
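As an illustration of the statistics based tier, the sketch below flags an endpoint whose count of external connections today is far above its own history. The z-score test, the threshold of 3.0 and the sample counts are assumptions chosen for the example, not values from the talk.

```python
from statistics import mean, stdev


def is_unusual(history, today, threshold=3.0):
    """Flag today's count when it is more than `threshold` standard deviations above the mean."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today > mu  # flat baseline: any increase is unusual
    return (today - mu) / sigma > threshold


# Hypothetical daily counts of external connections for one endpoint.
baseline = [12, 15, 11, 14, 13, 12, 16]
print(is_unusual(baseline, 14))
print(is_unusual(baseline, 60))
```

A count of 14 sits inside the normal spread of this baseline, while 60 is far outside it.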

Current cyber security challenges
Why do we need these complex approaches?

1) Most if not all attacks are low and slow
(What time frame is your ground truth?)
2) Attackers are using legitimate admin tools (Living off the Land, LotL), making needles look like hay.
3) In OT/ICS, threat intelligence is a challenge
(What do attacks really look like? Will the next one be similar to the ones we know?)
4) The lion's share of attacks end with a quiet outcome, usually gaining persistence in the environment, establishing command and control channels and exfiltrating data.

But, in our favor:
The computing environment is relatively static and changes are seasonal (computing devices, users, traffic patterns)

Problem specifics
How can we programmatically find unusual Windows events in our environment?

[Figure annotation] These points look like normal events in our environment
[Figure annotation] This point does not look like a normal event in our environment

What made this problem different?

Cardinality
We have 1000’s of users, event codes and hosts in the data, making millions of permutations
So what?
The fit command in the MLTK disregards categorical fields that have over 100 values, one-hot-encoding those with less than 100 values

Outlier profiles
Many of the outliers we spotted during the initial analysis sat in close proximity to ‘normal’ events
So what?
Clustering techniques such as KMeans aren’t sensitive to these types of scenarios
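To see why cardinality matters, the sketch below one-hot encodes a categorical column and gives up when the column has too many distinct values. The cap of 100 mirrors the limit quoted on the slide. The `one_hot` function is a hypothetical illustration, not MLTK code, and how exactly 100 values is treated is an assumption.

```python
def one_hot(values, max_categories=100):
    """One-hot encode a categorical column, or return None when it has too many distinct values."""
    categories = sorted(set(values))
    if len(categories) > max_categories:
        return None  # the field would be disregarded
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)  # one column per distinct value
        row[index[v]] = 1
        rows.append(row)
    return rows


# A handful of event codes encodes into three columns.
small = ["4624", "4625", "4624", "1102"]
print(one_hot(small))

# A thousand hypothetical user names exceed the cap and are dropped.
users = [f"user{i}" for i in range(1000)]
print(one_hot(users))
```

Each distinct value becomes its own column, so thousands of users, hosts and event codes would either be dropped or explode into thousands of sparse columns.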

How did we solve it?
The ML pipeline

Aggregated Windows Event Log Data -> NPR -> Standard Scaler -> PCA -> DBSCAN -> Clusters

Step 1: Handling high cardinality data
Did someone say Normalised Perlich Ratio?

Highly Likely, for example a standard service user remotely logging onto a busy host

Very Unlikely, for example a standard user clearing the event log on a critical database server that they have never logged onto before
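The slides do not give the Normalised Perlich Ratio formula, so the sketch below deliberately uses a simpler stand-in: the relative frequency of a value on a given day of the week. It only illustrates the idea of replacing a high cardinality field with a numeric likelihood. The `likelihoods` function and the sample records are hypothetical.

```python
from collections import Counter


def likelihoods(records):
    """Map each (value, weekday) pair to count(value, weekday) / count(weekday)."""
    pair_counts = Counter(records)
    day_counts = Counter(day for _, day in records)
    return {pair: n / day_counts[pair[1]] for pair, n in pair_counts.items()}


# Hypothetical (user, weekday) observations.
records = [
    ("svc_backup", "Mon"), ("svc_backup", "Mon"), ("svc_backup", "Mon"),
    ("alice", "Mon"),
    ("svc_backup", "Tue"), ("alice", "Tue"),
]
scores = likelihoods(records)
print(scores[("svc_backup", "Mon")])
print(scores[("alice", "Mon")])
```

A busy service account scores high on its usual day, while a rarely seen user scores low. A pair that never occurred has no entry at all, which a real pipeline would need to handle explicitly.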

Step 2: Getting our data ready for finding anomalies
Normalisation and principal component analysis
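A minimal pure-Python sketch of this step, assuming population standard deviation for scaling and power iteration with deflation for the two principal components. The function names and sample rows are hypothetical, and a production pipeline would use a library implementation instead.

```python
import math
import random


def standard_scale(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def covariance(rows):
    """Covariance matrix of rows whose columns are already centred."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]


def top_eigenvector(matrix, iterations=500):
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    v = [rng.random() + 0.1 for _ in matrix]
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            break  # no variance left in any direction
        v = [x / norm for x in w]
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def pca_2d(rows):
    """Project scaled rows onto their first two principal components."""
    cov = covariance(rows)
    d = len(cov)
    components = []
    for _ in range(2):
        v = top_eigenvector(cov)
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflate: remove the direction just found so the next pass finds the second one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(x * c for x, c in zip(r, comp)) for comp in components] for r in rows]


# Hypothetical rows of three numeric features (for example counts and likelihoods).
rows = [[1, 2, 10], [2, 4, 9], [3, 6, 12], [4, 8, 8], [5, 10, 11]]
points = pca_2d(standard_scale(rows))
print(points)
```

Scaling first keeps a feature with large raw values, such as an event count, from dominating the components.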

Step 3: Identify patterns and outliers in the data
Centroid or density based clustering, that is the question

[Figure panels: KMeans, DBSCAN]
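The practical difference is that KMeans assigns every point to some cluster, while DBSCAN can leave isolated points unassigned as noise. Below is a minimal pure-Python sketch of DBSCAN. The `eps` and `min_samples` values and the sample points are assumptions chosen for the example.

```python
import math

NOISE = -1


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id, or NOISE (-1) for outliers."""
    def neighbours(i):
        # Every point within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbours(j)
            if len(more) >= min_samples:
                queue.extend(more)  # j is a core point, keep expanding
        cluster += 1
    return labels


# Two dense groups and one isolated point, in two dimensions.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

The isolated point receives the noise label instead of being forced into the nearest cluster, which is the behaviour an anomaly search needs.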

How did we solve it?
Once more with details

Aggregated Windows Event Log Data -> NPR -> Standard Scaler -> PCA -> DBSCAN -> Clusters

NPR: Calculated likelihoods for event codes, users, hosts and the combination of all three against the day of the week
Standard Scaler: Scaled counts and likelihoods calculated
PCA: Reducing the scaled variables down to two dimensions
DBSCAN: Grouping the data based on the two dimensions

Caveats
Anomaly detection (not always) equals threat detection

1) We are using unsupervised ML here
2) Anomalies can be explained by malicious activity, but they can also be the result of misconfiguration or human error
3) A good security practice is to treat both equally. Misconfiguration is also something that needs to be resolved!
(...and security guys always need to be nice to their operations peers)

App Demo

Testing our app against the BOTS dataset
(anyone for guess the version?)

What’s next
Operationalize our method

1) Decide what other data sources are going to be used (Windows events, firewall, IDS logs)
2) Review your data (garbage in, garbage out)
3) Determine the X-axis time interval (hourly, daily?)
4) Automate KV store lookup
5) Fine tune the minimum sample size and distance parameters
6) Create alerts
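For the distance parameter in item 5, one common heuristic (not something the talk prescribes) is to look at each point's distance to its k-th nearest neighbour and pick a value just above where those distances jump. The sketch below computes the sorted k-distances. The function name and sample points are hypothetical.

```python
import math


def k_distances(points, k):
    """Sorted distance from every point to its k-th nearest neighbour."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


# Two dense groups and one isolated point, in two dimensions.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
print(k_distances(points, k=2))
```

Most points have a small k-distance and the isolated one has a much larger value, so a distance threshold between the two separates dense regions from outliers in this toy data.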

Augment static SIEM rules with ML
Look at your static rules and ask yourself if you can build a model

Static Rule 1) Detect connections to a remote system over RDP/RDS (ATT&CK ID T1021.001)
ML Model 1) Identify and model all RDP/connections with features like time, source, dest, user (See .Conf19 IOT1410)

Static Rule 2) Identify malware renamed as legit executable but executed from a non standard location (ATT&CK ID T1036 - Masquerading technique)
ML Model 2) Model process creation events with features like time, host, parent process, exec process, process path

Machine Learning Toolkit Searches in Splunk Enterprise Security
https://docs.splunk.com/Documentation/ES/6.6.0/Admin/MLTKsearches

What else?
Where else could you mine your data for anomalies?

IT Operations: Identifying unusual patterns of IT alerts
IT Security: Spotting suspicious activity
OT Security: This is where we started
Fraud: Flagging potential fraudsters
Customer Experience: Helping customers have a positive experience

Thank You
Please provide feedback via the SESSION SURVEY
