
Master The Dark

The presentation by J. Cory Minton focuses on the architecture and deployment of Splunk, emphasizing the importance of proper sizing and infrastructure for optimal performance. Key takeaways include understanding the impacts of small changes, design concepts for scalability, and best practices learned from Dell EMC's internal use of Splunk. The document also highlights the significance of Splunk applications and the support provided by Dell EMC's Splunk Ninjas for effective implementation.


Master The Dark Arts

Demystifying Splunk Architecture

J. Cory Minton | Principal Systems Engineer @ Dell EMC

Date | Washington, DC
Forward-Looking Statements
During the course of this presentation, we may make forward-looking statements regarding future events or
the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results could
differ materially. For important factors that may cause actual results to differ from those contained in our
forward-looking statements, please review our filings with the SEC.

The forward-looking statements made in this presentation are being made as of the time and date of its live
presentation. If reviewed after its live presentation, this presentation may not contain current or accurate
information. We do not assume any obligation to update any forward-looking statements we may make. In
addition, any information about our roadmap outlines our general product direction and is subject to change
at any time without notice. It is for informational purposes only and shall not be incorporated into any contract
or other commitment. Splunk undertakes no obligation either to develop the features or functionality
described or to include any such feature or functionality in a future release.
Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in
the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2017 Splunk Inc. All rights reserved.
J. Cory Minton
Principal SE and Data Analytics Leader
▶ 7+ years at Dell EMC
▶ Founder: Dell EMC Splunk Ninjas
▶ Splunk SE Certified
▶ I ♥ hardware!
▶ Oracle and SAP background
▶ BS Engineering and MBA
▶ www.BigDataBeard.com

www.GoWithDaddy.com
Key Takeaways

▶ Size the infrastructure for a Splunk deployment


▶ Understand infrastructure impacts from small changes in Splunk
▶ Learn design concepts that will scale
▶ Hear how Dell EMC is doing it internally
▶ An easier way…
Problem…

Provide Fundamentals For Sizing A Splunk Deployment And Share Learned Best Practices.
Assumption #1
General understanding of Splunk platform

Rich Ecosystem of Apps: Security, VMware, Exchange, PCI, ML, UBA, ITSI

Free Splunk>
Platform for machine data

Data sources: Forwarders, Syslog / TCP / other, Stream, DB Connect, Mobile, Sensors and control systems, Mainframe data

Assumption #2
General understanding of Splunk infrastructure

▶ Search Heads: Query information across indexers and are usually CPU and memory intensive.
▶ Indexers: Write data to disk and are both CPU and I/O intensive.
▶ Forwarders: Collect and forward data; usually lightweight and not resource intensive.

Assumption #3
General understanding of Splunk data management.

HOT → WARM → COLD → FROZEN (or optional TSIDX reduction)

HOT – Newest buckets of data that are still open for write
WARM – Recent data but closed for writing (read only)
COLD – Oldest data, commonly on cheaper, slower storage
FROZEN – No longer searchable, commonly archived or deleted data

© Copyright 2017 Dell Inc.



Big & Fast


What makes Splunk grow?

Performance
✓ Volume of ingest
✓ Search performance
✓ More users
✓ Big apps

Capacity
✓ Volume of ingest
✓ Index retention periods
✓ Indexer clustering
✓ Big apps
Sizing Fundamentals
How many servers do I need?
Machine Requirements

Indexers
           Reference Minimum   Mid-Range   High-Performance
Cores      12                  24          48
RAM        12GB                64GB        128GB
Storage    800 IOPS            800 IOPS    SSD

Others
           Search Head   Heavy Forwarder   Utility
Cores      16            16                8
RAM        12GB          12GB              8GB
Storage    300 IOPS      300 IOPS          300 IOPS

Dark truth: Choose wisely…or scalability will suffer later.



Indexer Sizing
▶ vCPU = CPU
▶ Hyperthreading ≠ CPU
▶ When in doubt, 100

[Chart: Indexer Ingest GB/Day (y-axis 0 to 350) for Reference, Mid-Range, and High Performance indexers; labeled "Splunk ES, ITSI, UBA"]



Search Heads
▶ Dedicate
▶ When in doubt, 1 per 8
▶ Indexers > Search
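
The two "when in doubt" rules of thumb above can be turned into a rough server-count estimate. This is only a sketch: it assumes "When in doubt, 100" means about 100 GB/day of ingest per indexer and "1 per 8" means one search head per eight indexers. Neither reading is spelled out in the slides, and the function name `estimate_servers` is made up for illustration.

```python
import math

# Assumed readings of the slide shorthand (not stated in full in the deck):
GB_PER_DAY_PER_INDEXER = 100   # "When in doubt, 100"
INDEXERS_PER_SEARCH_HEAD = 8   # "When in doubt, 1 per 8"


def estimate_servers(daily_ingest_gb: float) -> dict:
    """Return a rough indexer and search head count for a daily ingest volume."""
    # Always plan for at least one of each, then round up.
    indexers = max(1, math.ceil(daily_ingest_gb / GB_PER_DAY_PER_INDEXER))
    search_heads = max(1, math.ceil(indexers / INDEXERS_PER_SEARCH_HEAD))
    return {"indexers": indexers, "search_heads": search_heads}


if __name__ == "__main__":
    # Example: 1 TB/day of ingest, expressed in GB.
    print(estimate_servers(1000))
```

Premium apps such as Splunk ES, ITSI, or UBA would likely call for a lower per-indexer figure than the assumed 100 GB/day, so treat the constant as a starting point to adjust.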
Utility Servers
Handy helpers…

▶ Heavy Forwarder

1:3
▶ License Master
▶ DMC
▶ Cluster Master
▶ Deployment
Sizing Fundamentals
How much storage do I need?

Assumption #3
General understanding of Splunk data management.

HOT → WARM → COLD → FROZEN (or optional TSIDX reduction)

HOT – can be DAS in server or SAN (Flash is best)
WARM – same as Hot
COLD – adds option for NAS
FROZEN – No longer searchable, so object stores are option here (last resort)



Myth About Bucket Sizing…

▶ # of buckets x bucket size
▶ Not days…

indexes.conf

# volume definitions
[volume:hotwarm_cold]
path = /mnt/fast_disk
maxVolumeDataSizeMB = 3984589

# index definition (calculation is based on a single index)
[main]
homePath = volume:hotwarm_cold/defaultdb/db
coldPath = volume:hotwarm_cold/defaultdb/colddb
thawedPath = $SPLUNK_DB/defaultdb/thaweddb
homePath.maxDataSizeMB = 512000
coldPath.maxDataSizeMB = 2560000
maxWarmDBCount = 4294967295
frozenTimePeriodInSecs = 2592000
maxDataSize = auto_high_volume
coldToFrozenDir = /mnt/big_disk/defaultdb/frozendb
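
A quick sanity check of the sample indexes.conf shows what its numbers imply. The values below are copied from the config above. The roughly 10 GB bucket size for `auto_high_volume` on a 64-bit system is an assumption added here, and the helper name `summarize` is hypothetical.

```python
# Values copied from the sample indexes.conf above.
MAX_VOLUME_MB = 3_984_589        # maxVolumeDataSizeMB
HOME_PATH_MAX_MB = 512_000       # homePath.maxDataSizeMB (hot + warm)
COLD_PATH_MAX_MB = 2_560_000     # coldPath.maxDataSizeMB
FROZEN_PERIOD_SECS = 2_592_000   # frozenTimePeriodInSecs

# Assumption: maxDataSize = auto_high_volume means roughly 10 GB per bucket
# on a 64-bit system.
BUCKET_SIZE_MB = 10 * 1024


def summarize() -> dict:
    """Derive retention, capacity fit, and bucket counts from the settings."""
    index_cap_mb = HOME_PATH_MAX_MB + COLD_PATH_MAX_MB
    return {
        "retention_days": FROZEN_PERIOD_SECS / 86_400,
        "index_cap_mb": index_cap_mb,
        "fits_in_volume": index_cap_mb <= MAX_VOLUME_MB,
        # Space is consumed as number of buckets x bucket size, not as days.
        "hot_warm_buckets": HOME_PATH_MAX_MB // BUCKET_SIZE_MB,
        "cold_buckets": COLD_PATH_MAX_MB // BUCKET_SIZE_MB,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

The point of the bucket counts is the one the slide makes: the size limits cap how many buckets fit in each tier, so data can roll to the next tier on size well before any time-based limit is reached.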
Indexer Deployment Options

▶ Distributed Deployment: Indexer data is stored once and distributed across available indexers.
▶ Clustered Deployment: A group of indexers is configured to replicate each other's data.
Distributed Deployment

▶ Single copy of data


▶ Small
▶ Starter
▶ Storage-bound

Indexer Storage Capacity

1TB Ingested Data
Written Data = ½ Ingested Data = 500GB

On the indexer, written data splits into:
▶ Raw Data (*.gz): compressed raw data, 30% of written data → 150GB
▶ Indexes (*.tsidx): uncompressed 'indexes', 70% of written data → 350GB
How Much Storage Do You Need?

Storage = Daily indexing rate x ½ x Retention policy

1TB Ingested Data per day, 60 days retention:
1TB x ½ x 60 days = 30TB

On the indexer:
▶ Raw Data: 9TB
▶ Indexes: 21TB
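
The storage arithmetic above can be sketched as a small helper. The ½ written-data ratio and the 30% / 70% raw-to-index split are the rules of thumb from the slides; the function and parameter names are made up for illustration, and real compression ratios vary with the data.

```python
def storage_estimate_tb(daily_ingest_tb: float, retention_days: int,
                        written_ratio: float = 0.5,
                        raw_share: float = 0.3) -> tuple:
    """Return (total, raw, indexes) storage in TB for a single copy of data.

    written_ratio: written data as a fraction of ingested data (slide: 1/2).
    raw_share: fraction of written data that is compressed raw data (slide: 30%).
    """
    total = daily_ingest_tb * written_ratio * retention_days
    raw = total * raw_share
    indexes = total - raw
    return total, raw, indexes


if __name__ == "__main__":
    total, raw, indexes = storage_estimate_tb(1, 60)
    print(f"total={total:.0f}TB raw={raw:.0f}TB indexes={indexes:.0f}TB")
```

With 1TB/day and 60 days of retention this reproduces the 30TB, 9TB, and 21TB figures from the slide.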
Indexer Clustering

▶ High Availability for Indexes
▶ Indexer Clustering Settings
• Replication Factor = copies of raw data
• Search Factor = copies of indexes
Splunk Indexer Availability
Multiple copies of index and raw data
• Index → # of copies of indexes → Search factor (SF)
• Raw Data → # of copies of raw data → Replication factor (RF)

1TB Ingested Data, SF=2 / RF=2:
500GB written → 500GB replicated (Copy 1: 500GB, Copy 2: 500GB)

1TB x 60 days x ½ x 2 = 60TB (RF/SF=2) ** doubled **
1TB x 60 days x ½ x 3 = 90TB (RF/SF=3) ** tripled **

STORAGE CAPACITY MULTIPLIES!
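
When the replication and search factors differ, the same arithmetic can be applied to each data type separately. The sketch below is a derivation from the slide's definitions (RF copies of raw data, SF copies of indexes) combined with the 30% / 70% split, not a formula given in the presentation. The function name is hypothetical.

```python
def clustered_storage_tb(daily_ingest_tb: float, retention_days: int,
                         replication_factor: int, search_factor: int,
                         written_ratio: float = 0.5,
                         raw_share: float = 0.3) -> float:
    """Estimate total cluster storage in TB across all copies."""
    # Splunk requires the search factor to be no larger than the replication factor.
    if search_factor > replication_factor:
        raise ValueError("search factor cannot exceed replication factor")
    single_copy = daily_ingest_tb * written_ratio * retention_days
    raw = single_copy * raw_share * replication_factor      # RF copies of raw data
    indexes = single_copy * (1 - raw_share) * search_factor  # SF copies of indexes
    return raw + indexes


if __name__ == "__main__":
    for rf, sf in [(2, 2), (3, 3), (3, 2)]:
        print(f"RF={rf} SF={sf}: {clustered_storage_tb(1, 60, rf, sf):.0f}TB")
```

With RF = SF this reduces to the slide's simple multiplication (60TB and 90TB). With RF=3 and SF=2 the estimate falls between the two, because only the smaller raw-data portion is stored a third time.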
Multisite Indexer Clustering

▶ Protects indexes across disparate locations
▶ Enables Search Affinity
▶ Site specific RF/SF settings
▶ Sizing = each site + site protected
Unofficial, But Really Helpful Tool

http://splunk-sizing.appspot.com/
Splunk Sizing Questionnaire

▶ What is the licensed daily ingest rate for Splunk (expressed in some amount of GB/Day or TB/day)?
▶ What is the retention period for Hot/Warm and Cold (days kept in each tier)?
▶ Any data being sent to frozen? If so, what is the retention period and requirement for doing so?
▶ Is indexer clustering being leveraged? If so, what are the settings for Replication and Search Factor?
▶ How many indexer and search servers are deployed? Do you have a visualization you can share of the deployment?
▶ Is Splunk being run as a single site or multiple sites? If multiple, is multi-site clustering being leveraged?
▶ Is the Enterprise Security App or ITSI for Splunk deployed?
The right solutions to optimize your Splunk deployment
The Ready Solutions formula
Dell EMC portfolio

Priorities

Compute Ready Nodes


Ready Bundles
Ready Systems
Deploy

Biz Apps
Knowledge

Services

Dell EMC Ready Solutions for Splunk

Ready System Ready Bundle

VxRack + Isilon VxRail + Isilon PowerEdge + Isilon

“Meets or EXCEEDS minimum hardware requirements”



Logistics Leader
Doug called them out on Q1 earnings call…

▶ Simplified acquisition
▶ Leveraged Ninjas
▶ Deployed apps for all Dell EMC platforms
▶ Replatforming HW in near future

Wholesale Club Retailer

▶ Flashed Splunk
▶ Bottomless cold with Isilon…over 1PB!
▶ Decreased floor space by 30%
▶ Growing to +3TB/day

Winter is coming…

Splunk at Dell EMC


Our defense against Black Friday…

▶ eCommerce IT services
▶ Marketing effectiveness
▶ Security and threats
▶ Replatforming now
Splunk Applications From Dell EMC
Extend the power of Splunk to Dell EMC Platforms

What are Splunk Apps?
Splunk applications and add-ons allow users to import data into Splunk from specific sources.
Splunk and its partners have created a rich community called Splunkbase that has 1000s of applications.

Dell EMC has apps for the following:
- VMAX
- XtremIO
- Isilon
- VNX

Why are Splunk Apps important?
Splunk apps and add-ons allow customers to incorporate new use cases and extend their Splunk environment. This leads to increased Splunk license needs as well as additional hardware.
Global Solution Centers
Validate. Evaluate. Collaborate. Innovate

Solution centers
Staffed with engineers and Blueprint solution experts

Engagements begin with your challenges
• Briefings with a team of experts
• Architectural design sessions
• Proofs of concept
Let our Splunk Ninjas help you!

Trained by Splunk

Splunk Architecture Experts

Dell EMC Portfolio Experts

Religious about Best Practices

Available across the GLOBE!!!


Email [email protected]

Don't forget to rate this session in the .conf2017 mobile app
