
Search Performance Improvements
What we've done and why we did it…

Alex James – Senior Principal Product Manager (Search Technologies)
Manan Brahmkshatriya – Principal QA Engineer

25-28th Sept 2017 | Washington, DC


Forward-Looking Statements
During the course of this presentation, we may make forward-looking statements regarding future events or
the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results could
differ materially. For important factors that may cause actual results to differ from those contained in our
forward-looking statements, please review our filings with the SEC.

The forward-looking statements made in this presentation are being made as of the time and date of its live
presentation. If reviewed after its live presentation, this presentation may not contain current or accurate
information. We do not assume any obligation to update any forward-looking statements we may make. In
addition, any information about our roadmap outlines our general product direction and is subject to change
at any time without notice. It is for informational purposes only and shall not be incorporated into any contract
or other commitment. Splunk undertakes no obligation either to develop the features or functionality
described or to include any such feature or functionality in a future release.
Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in
the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2017 Splunk Inc. All rights reserved.
Session Outline

Language Improvements
Data Model Improvements
Optimizer Improvements
Further Improvement Ideas
Q&A

SPL Language
Improvements
Generating Search – typical breakdown
i.e. the time taken for the first search processor to do its job, with lots of TAs installed.

[Diagram: proportion of time spent in each stage of a generating search]
• index scan
• rawdata & decompression
• kv (auto and explicit)
• autolookup
• post filter search
• typer and tagger (together ~50% of the time)
Search Directives
Producing TAGS & EVENT TYPES is very costly
• With lots of TAs it can easily be 50% of the total cost of the search
• Tags are stored in one multi-valued field
• We treat them as ALL or NOTHING

There is now a way to selectively request just one or more TAGS (and event types)
• search 500 DIRECTIVES(REQUIRED_TAGS(tags="foo, bar"))
• search 500 DIRECTIVES(REQUIRED_EVENTTYPES(eventtypes="alpha,omega"))

Combining directives…
• search 500 DIRECTIVES(REQUIRED_EVENTTYPES(eventtypes="alpha,omega"),REQUIRED_TAGS(tags="foo,bar"))
• Produces the list of EVENT TYPES needed to correctly produce the foo and bar tags
• And merges it with the "alpha, omega" event types...

Impact
• Low – targeted searches returning a few events
• High – broad searches returning lots of events (i.e. Monitoring & Acceleration); see the sketch below
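For illustration only, a hypothetical broad monitoring search (the index, tag and field names here are assumptions, not from the slides). Requesting only the tag the search actually uses avoids paying for every other tag and event type contributed by installed TAs:
• Before: search index=security tag=authentication action=failure | stats count by user
• After: search index=security DIRECTIVES(REQUIRED_TAGS(tags="authentication")) tag=authentication action=failure | stats count by user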
How Data Model Acceleration works…

[Chart: acceleration TIME per indexer (TIME on the y-axis, INDEXERS on the x-axis)]
Data Model Acceleration (DMA)
Problem and Solution
▶ Issues prior to 7.0:
• Acceleration of warm/cold buckets was all or nothing. (I’ve started so I’ll finish...)
• So acceleration of a large warm/cold bucket could monopolize acceleration.
• Slowest indexer holds up the other indexers.
• So even temporary data imbalance could lead to loss of parallelism, and cascading delays.

▶ Solution:
• Added ability to pause / continue accelerating warm/cold buckets. (I’ve started, but something more important / hot has come along…)
• This means acceleration.max_time is now fully respected, even when processing historical data.
• Next acceleration search starts with hot buckets, thus keeping lag low, even when rebuilding acceleration from scratch.
• If the summarization search finishes early we can poll for new data (to reduce lag) so all indexers can be kept busy.
• See the new setting acceleration.poll_buckets_until_maxtime=true (a configuration sketch follows this list)
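
A minimal datamodels.conf sketch of the settings mentioned above, assuming a data model named Authentication; the stanza name and values are illustrative only, not taken from the slides:

[Authentication]
acceleration = true
# cap each summarization run; in 7.0 this is respected even for warm/cold buckets
acceleration.max_time = 3600
# new in 7.0: poll for new hot-bucket data until max_time is reached, to reduce lag
acceleration.poll_buckets_until_maxtime = true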

▶ Impact:
• 7.0 is typically twice as fast as 6.5 (or faster).
• 7.0 lag is typically 50% of 6.5's (or less).
• Data Model Acceleration Rebuilds have less impact.
Demo #1
Typer / Tagger and DMA improvements
Improved High Cardinality Processing
Using Parallel Reduce
▶ Imagine a search like this:
• search tag=authentication | stats sum(bytes) by host

▶ The main gate on parallelism / scalability is the number of distinct values of host (the split-by field)


▶ But if we implicitly shuffle before the stats:
• search tag=authentication | shuffle by host | stats sum(bytes) by host
• Reduction can happen in parallel

▶ Limited support for this in 7.0 (see the enablement sketch below):

• Needs both:
  • Global enablement (phased_execution=true in limits.conf)
  • Per-search enablement in SPL (| noop phase_mode=3)

• Works only with: stats, transaction and tstats


▶ Much more coming...
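
A sketch of the two enablement steps listed above; the limits.conf stanza placement and the example search are my assumptions, so treat this as illustrative rather than definitive:

• limits.conf on the search head:
  [search]
  phased_execution = true

• Then opt an individual search in by appending the noop directive:
  search tag=authentication | stats sum(bytes) by host | noop phase_mode=3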
Demo #2
New Optimizations in 7.0
New Optimizations in 7.0

Projection Elimination for Reporting Commands
– Before: search ERROR | eval x=a*b | lookup users uid OUTPUT username | stats count by host
– After: search ERROR | stats count by host

Predicate Splitting
– Before: | eval x = a+b | where x=10 and y=10
– After: | where y=10 | eval x = a+b | where x=10

Tag Elimination
– Before: search ERROR | where tag="Authentication" | stats count by host
– After: search ERROR DIRECTIVES(REQUIRED_TAGS(tags="Authentication")) | where tag=Authentication | stats count by host

Collapsing eval commands
– Before: | eval x=a+b | eval y=c+d
– After: | eval x=a+b, y=c+d

Predicate Normalization
– Before: search ERROR | where 10=y
– After: search ERROR y=10
– Why would you ever write this? Usually you wouldn't directly, but other rewrites can produce it, e.g. after substituting the eval into the where in:
  • search ERROR |… |… | eval x=10 |… |… | where x=y
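
To see the effect of these rewrites on one of your own searches, one approach (my suggestion, not from the slides) is to run it twice – once as-is and once with optimization disabled – and compare the two jobs in the Job Inspector:
– search ERROR | eval x=a*b | stats count by host
– search ERROR | eval x=a*b | stats count by host | noop search_optimization=false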

Further Improvement Ideas
Further Improvement Ideas (1)

Faster Lookups and Lookup Replication
• Better data structures and serialization formats

More optimization
• Projection Elimination for Fields
  − Before: search ERROR | eval x=a*b | lookup users uid OUTPUT username | fields b, username
  − After: search ERROR | lookup users uid OUTPUT username | fields b, username
• Merging predicates into inputlookup (KV Store)
  − Before: | inputlookup foo | search x=10
  − After: | inputlookup foo where x=10
• Etc.
Further Improvement Ideas (2)

Better Parallel Reduce


• Implicit support for more reporting commands
• Better timeliner and preview integration
• Continued parallel execution (for both streaming & compatible reporting splits)
• | tstats values(Authentication.app) as app, latest(Authentication.user_bunit) as user_bunit from datamodel=Authentication.Authentication by Authentication.user, Authentication.src, _time span=1s
  | eventstats dc(Authentication.src) as src_count by Authentication.user
  | search src_count>1

• Explicit Shuffle support


• search tag=authentication | shuffle by host | <any spl>

Better support for result reuse…


Sliding Window Re-use
Example of Result Reuse
▶ Lots of searches are scheduled to run on a frequent schedule (every 5m,10m,15m) but cover a larger time
range (last 1h, 3h, 24h).
▶ Which means there is a lot of re-calculation occurring
• i.e. For a search over the last hour run every 5 mins, ~55mins worth of results have already been calculated once (for
the last run) but thrown away.
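
A sketch of this scheduling pattern as it might look in savedsearches.conf; the stanza name, index and search are placeholders of mine, not from the slides:

[errors_last_hour]
search = index=web ERROR | stats count by host
enableSched = 1
# run every 5 minutes...
cron_schedule = */5 * * * *
# ...but always cover the last hour, so ~55 minutes of each run repeats prior work
dispatch.earliest_time = -1h
dispatch.latest_time = now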


[Diagram: three consecutive runs – Run 10, Run 11, Run 12 – each covering a largely overlapping one-hour window]

▶ Report Acceleration (RA) has the ability to incrementally build results already.
• Unfortunately RA doesn’t work for TSTATS searches.
• Why? TSTATS searches leverage Data Model Acceleration (DMA) and we don’t support RA over DMA.
▶ Many sliding-window searches are based on TSTATS
• We are currently investigating adding support for RA over DMA
Summary - What does this mean for you?

• Faster Searches
• Faster Enterprise Security
• Look for opportunities to use the new DIRECTIVES
• Check out the optimizer in the Job Inspector
• Upgrade to 7.0 (or at least 6.5 if that isn't possible)

Q&A
Alex James - Senior Principal Product Manager
Manan Brahmkshatriya – Principal QA Engineer

Key Takeaways

1. Splunk 7.0 is significantly faster.

2. Key improvements include: new directives, optimizer improvements and DMA improvements.

3. If you have ES the difference in DMA is very significant.

Thank You
Don't forget to rate this session in the
.conf2017 mobile app
Backup Slides
If the session runs short…
Union

Similar to append but streaming when possible:
• | union [search …| lookup cust id OUTPUT name ], [search …| eval name="SPLK"]
• Returns the same data as:
  • search …| lookup cust id OUTPUT name | append [search …| eval name="SPLK"]
• Except it runs in parallel on the indexers (using an improved version of multisearch when possible)

Useful for correlation searches, e.g. the append | stats pattern used as a pseudo join (see the sketch at the end of this slide)

Supports:
• More than 2 datasets: | union [<spl1>], [<spl2>], ... , [<splN>]
• Named dataset format (like from): | union savedsearch:mysavedsearch, [<spl2>], inputlookup:threats
• Shorthand (like append): <spl1> | union [<spl2>]

You should still use a single search or a tstats append if possible...
• Don't do this: search "error" | union [search "warning"]
• Do this: search "error" OR "warning"
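
A sketch of the pseudo-join pattern mentioned above; the indexes, sourcetypes and field names are placeholders of mine, not from the slides:

| union [ search index=web status=500 | stats count as errors by host ],
        [ search index=os sourcetype=cpu | stats avg(pct_cpu) as cpu by host ]
| stats values(errors) as errors, values(cpu) as cpu by host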
Effect of temporary data imbalance prior to 7.0

[Chart: acceleration TIME per indexer; most indexers finish in about 5 minutes, while the lagging ones take roughly 7, 13 and 16 minutes, delaying the whole run]
