Architecting and Sizing Your Splunk Deployment: Simeon Yep
#splunkconf
Legal Notices
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Splunk, Splunk>, Splunk Storm, Listen to Your Data, SPL and The Engine for Machine Data are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. ©2013 Splunk Inc. All rights reserved.

Introduction
About Me
• 5+ years @ Splunk
• Experience:
  – Supporting, administering, and architecting large-scale deployments
  – OEM: technical sales
  – Strategic accounts: technical sales
• Based in HQ (San Francisco office)
• Currently: Business Development, Technical Synergies

Agenda
• Sizing Fundamentals
• Architecting Fundamentals
• Deployment Topologies

Sizing Fundamentals
• Understand the sizing factors
• Data volume
• Search volume
• Sizing sheet

Sizing Factors
• How much data (raw sizes)?
  – Daily volume
  – Peak volume
  – Retained volume (archive size)
  – Future volume?
• How much searching?
  – Use cases
  – How many people? How often?
• Jobs
  – Summarization, alerting, reporting

Data Volumes
• Estimate input volume
  – Verify raw log sizes
  – Leverage _internal metrics to get actual input volumes
• Confirm estimates with actual data
  – Create a baseline with real or simulated data
  – Find compression rates (range from 30% to 120%, typically 50%)
  – Determine retention needs
• Document use cases
  – Use case determines search needs
  – Plan for expansion as adoption grows (search and volume)

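The daily volume, compression rate, and retention period above combine into a rough disk estimate. The following is a minimal sketch, assuming the compression rate means on-disk size divided by raw size (so the typical 50% becomes 0.5). The function and parameter names are illustrative and not part of Splunk.

```python
def estimate_storage_gb(daily_raw_gb: float, retention_days: int,
                        compression_ratio: float = 0.5) -> float:
    """Rough on-disk size (GB) for the retained data.

    compression_ratio is assumed to be on-disk size / raw size. The slide
    quotes a range of 30% to 120% with 50% as typical; measure your own
    ratio against a baseline of real or simulated data.
    """
    if daily_raw_gb < 0 or retention_days < 0:
        raise ValueError("volume and retention must be non-negative")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return daily_raw_gb * compression_ratio * retention_days
```

For example, `estimate_storage_gb(100, 90)` estimates the disk needed for 100 GB/day of raw data kept for 90 days at the typical ratio. Replication, summaries, and growth are not included.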
Data Sizing Exercise
• Via filesystem
• Use the Splunk log files: metrics.log or license_usage.log
• Optionally:
  – License report view in Splunk Enterprise 6
  – S.o.S app in 5.x

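The "via filesystem" approach can be approximated by adding up the sizes of the raw log files you plan to index. This is a sketch only: it groups files by last-modified date, which is a crude stand-in for daily volume when logs are rotated daily, and the function name is hypothetical.

```python
import os
from collections import defaultdict
from datetime import date


def raw_bytes_per_day(log_dir: str) -> dict:
    """Sum file sizes under log_dir, grouped by last-modified date.

    Assumption: each file was written on the day it was last modified,
    which is only roughly true for daily-rotated logs.
    """
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file rotated away or unreadable: skip it
            totals[date.fromtimestamp(info.st_mtime)] += info.st_size
    return dict(totals)
```

Calling `raw_bytes_per_day` on a log directory returns a mapping from date to bytes. The largest value is a first guess at peak volume and the mean a first guess at daily volume, to be confirmed against the figures Splunk itself records in metrics.log or license_usage.log.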
Search Volumes
• Gather use case information
  – How much ad-hoc searching?
  – How much background searching?
• Ad-hoc searching
  – Evaluate the data being searched
  – Evaluate the time duration (real-time vs. historic)
  – Real-time searches typically incur less overhead
• Background searching
  – Alerting and monitoring
  – General reports
  – Summary indexing

Final Sizing Numbers
• Data capacity
  – Daily and peak
• User capacity
  – Concurrent and total
• Search capacity
  – Concurrent and total
*Document the use cases!!

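One way to keep the final sizing numbers and the use cases together is a small record type. The sketch below is illustrative only: the class, field names, and sanity checks are assumptions, not a Splunk artifact or the presenter's sizing sheet.

```python
from dataclasses import dataclass, field


@dataclass
class SizingSheet:
    """Final sizing numbers gathered during the sizing exercise."""
    daily_gb: float
    peak_gb: float
    concurrent_users: int
    total_users: int
    concurrent_searches: int
    total_searches: int
    use_cases: list = field(default_factory=list)

    def problems(self) -> list:
        """Return a list of inconsistencies worth a second look."""
        found = []
        if self.peak_gb < self.daily_gb:
            found.append("peak volume is below daily volume")
        if self.concurrent_users > self.total_users:
            found.append("concurrent users exceed total users")
        if self.concurrent_searches > self.total_searches:
            found.append("concurrent searches exceed total searches")
        if not self.use_cases:
            found.append("no use cases documented")
        return found
```

An empty list from `problems()` means the numbers are at least internally consistent. It says nothing about whether they are accurate.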
Architecture
• Splunk server roles: distributed/clustered deployments
• Reference server
• Rules of thumb
• Hardware factors

Splunk Distributed Roles
• Search head (regular and job server)
• Indexer
• Forwarder (universal)

Splunk Distributed Roles
• License master
• Deployment server

Recommended Configurations
Columns: Stand-alone | Indexer (distributed) | Search head (distributed) | Indexer (clustered) | Search head (clustered) | Cluster master (clustered)
Forwarding: * * * * *
Searching: √ √ * √
Indexing: √ √ * √
Deployment server: * *
License master: * √ * *
Cluster master: √
Legend: √ = common, * = uncommon

What's a Reference Server?
• Sizing based on commodity x86 servers
• Dual quad-core CPUs at 3.0 GHz (dual six-core is common)
• 8 GB of RAM (16 GB is common)
• 64-bit OS
• 4 x 10k RPM local SAS drives in RAID 1+0 (800+ IOPS)
• Variations cause corresponding changes in performance/requirements

Rules of Thumb
• These all have exceptions and qualifications
• 1 reference indexer per 100 GB/day
• 1 reference search head per 8 to 12 users
• 1 reference job server per 20 concurrent jobs
• 1 deployment server per 3000 polls/min
• Replication later…

How Many Indexers?
• Rule of thumb says: 1 per 100 GB/day
• Leaves room for:
  – Daily peaks
  – Light searching and reporting for about 3 concurrent users
• Need more indexers for:
  – Heavy reporting
  – More users
  – Slower disks, slower CPUs, fewer CPUs

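The indexer rule of thumb is a ceiling division. This minimal sketch assumes reference-class hardware and a light search load, as the slide does, and the function name is illustrative.

```python
import math


def indexers_needed(daily_gb: float, gb_per_indexer: float = 100.0) -> int:
    """Starting count of reference indexers for a given daily volume.

    Uses the 1 indexer per 100 GB/day rule of thumb. Lower gb_per_indexer
    for heavy reporting, many users, or slower hardware.
    """
    if daily_gb < 0 or gb_per_indexer <= 0:
        raise ValueError("daily_gb must be >= 0 and gb_per_indexer > 0")
    return max(1, math.ceil(daily_gb / gb_per_indexer))
```

For example, `indexers_needed(250)` returns 3. Treat the result as a starting point, since the rule has exceptions and qualifications.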
How Many Search Heads?
• Rule of thumb says: 1 per 8 to 12 concurrent users
• Limit is concurrent queries
• 30-50 web sessions
• 1:1 ratio of search query to CPU core
• Only add first search head if ≥3 indexers
• Don't add search heads; add indexers: indexers do most work
• But you need more if:
  – Running a lot of scheduled jobs on the search head

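The search head guidance can be sketched the same way. The code below reads "only add first search head if ≥3 indexers" as "no dedicated search head below three indexers", which is an interpretation of the slide. The function name and the conservative default of 8 users per search head are assumptions.

```python
import math


def search_heads_needed(concurrent_users: int, indexers: int,
                        users_per_head: int = 8) -> int:
    """Dedicated search heads suggested by the rule of thumb.

    Returns 0 when there are fewer than 3 indexers (assumed reading of
    the slide: the indexers serve searches themselves at that size).
    """
    if concurrent_users < 0 or indexers < 0 or users_per_head <= 0:
        raise ValueError("invalid sizing inputs")
    if indexers < 3:
        return 0
    return max(1, math.ceil(concurrent_users / users_per_head))
```

With 30 concurrent users and 5 indexers, this gives 4 search heads at 8 users per head and 3 at 12 users per head. Heavy scheduled-job load would push the number up.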
Search Head vs. Job Server
• Search Head Pooling (SHP): uses NFS to manage user profiles/configurations and job queue
• Search head and job server are equivalent with SHP
• Use job servers for scheduled searches (summaries, alerts, and reports)
• Use search heads for ad-hoc searching

How Many Deployment Servers?
• Rule of thumb says: 1 per 3000 polls/minute
• Just use one deployment server, and adjust the polling period
• Small deployments can share the same splunkd
• Low requirement for disk performance (good candidate for virtualization)
• Windows OS: 1 per 500 polls/minute
• Or use something other than deployment server
  – Puppet, SCCM, cfengine, chef…

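"Adjust the polling period" can be made concrete with a little arithmetic: each client polls once per interval, so the poll rate is clients × 60 / interval. The sketch below assumes polls are spread evenly over time. The function names are illustrative.

```python
import math


def polls_per_minute(clients: int, interval_s: float) -> float:
    """Average polls per minute hitting one deployment server."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return clients * 60.0 / interval_s


def min_poll_interval_s(clients: int, max_polls_per_min: int = 3000) -> int:
    """Shortest polling period (seconds) that stays within the budget.

    3000 polls/minute is the slide's rule of thumb; use 500 for a
    deployment server running on Windows.
    """
    if max_polls_per_min <= 0:
        raise ValueError("max_polls_per_min must be positive")
    return math.ceil(clients * 60 / max_polls_per_min)
```

For 10,000 forwarders, `min_poll_interval_s(10000)` gives 200 seconds, and `min_poll_interval_s(10000, 500)` gives 1200 seconds for the Windows figure.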
More is Better?
• CPUs
  – Search process utilizes up to 1 CPU core (1:1)
  – Indexers still need to do the heavy lifting (search exists on indexer AND search head)
  – Limited benefit for indexing (up to 2 CPU cores for indexing)
• Memory
  – Good for search heads and indexers (16+ GB)
• Disks
  – Faster is better (15k rpm)
  – More disks in RAID 1+0 = faster

Performance and Sizing Tips
System change | Search Speed | Indexing Speed
Faster disks | ++ | ++
Add an indexer | ++ | ++
Add a search head | + |
Report acceleration/summaries | ++ |

Architecture Factors
• What are my sizing requirements?
• Where is the data?
• Where are the users?
• What is the security policy?
• What are the retention and compliance policies?
• What is the availability requirement?
• What about the cloud?

Architecture Factors
• What are my sizing requirements?
  – Data capacity
  – Search capacity
  – User capacity
• Obtained from the sizing process

Architecture Factors
• Where is the data?
  – Local or remote to the indexing machine
  – If remote, use forwarders when possible
  – Index in local data center (zone) or index centrally
  – Persist network data to disk as a best practice
  – Use intermediate forwarders to distribute data
• Where are the users?
  – User experience affected by search head location
    ◦ Time zone tuning (5.x +)
    ◦ Distributed search over LAN vs. WAN

Architecture Factors
• What is the security policy?
  – Apply user security policies
    ◦ Auth method
    ◦ Roles
    ◦ Filters
  – Apply physical security policies
    ◦ Index location

Architecture Factors
• Retention, compliance, governance
  – Where is the data allowed to be?
  – Where is the data not allowed to go?
  – Where must the data go?
• Availability
  – Local failover, fault-tolerance, clustering
  – Geographic disaster recovery/fault-tolerance
  – Index replication!

Architecture Factors
• Same old story
• Cloud considerations
  – Authentication restrictions
  – Data transfer costs
  – Security: SSL tunnel
  – Zones

Topologies
Architecture Factors → Topology
Topology Examples

Centralized Topology
[Diagram: search head pooling, indexers, intermediate forwarders, forwarders, syslog devices]

Decentralized Topology

Hybrid Topology

Scaling and Expansion
• Add to your indexer pool for more performance or capacity
  – Mixed platform and hardware is okay
• Use search head pooling for more UI capacity
  – Requires NFS
• Create new indexes for new data types
  – Follows best practices

Index Replication (aka Clustering)
• What is it?
  – Indexes are replicated to 1 or more indexers (tunable)
  – Splunk controlled
• Basics
  – Master node (manages indexing and searching location)
  – Distributed deployment
  – Not the same as "index and forward"
• HA license vs. index replication
  – HAL: separate, fully functioning Splunk deployments
  – IR: data is made available on 1 or more indexers

Index Replication
• Rule of thumb: 1 indexer per 50 GB/day
  – Assume simple replication (2 copies in existence)
  – Increase in I/O, CPU, and disk requirement
• Need more indexers if:
  – Increase in replication factor
  – Performance or capacity needs (search and index)

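The effect of replication on indexer count can be seen by putting the two rules of thumb side by side. This sketch uses only the two figures from the slides (100 GB/day per indexer without replication, 50 GB/day with simple replication of 2 copies). It does not model higher replication factors, and the names are illustrative.

```python
import math

# GB/day one reference indexer handles, keyed by whether index
# replication (2 copies) is enabled. Figures are the slides' rules of thumb.
GB_PER_INDEXER = {False: 100.0, True: 50.0}


def indexers_for(daily_gb: float, replicated: bool) -> int:
    """Starting indexer count with or without simple index replication."""
    if daily_gb < 0:
        raise ValueError("daily_gb must be non-negative")
    return max(1, math.ceil(daily_gb / GB_PER_INDEXER[replicated]))
```

For 300 GB/day, `indexers_for(300, False)` returns 3 and `indexers_for(300, True)` returns 6, which reflects the extra I/O, CPU, and disk work of keeping a second copy.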
Index Replication (aka Clustering)
[Diagram: cluster master, search head pooling]

Index Replication
• Data is replicated and made available
• WAN configuration is not recommended
• Careful consideration when inserted into standard topologies
• Increases
  – Storage requirement
  – I/O requirement (disk, network)
  – Total indexer requirement
• Disaster recovery and high availability .conf session

Final Thoughts
• Sizing is more than data volume; it's also search load
• Centralized architecture is the baseline
• Variations on architecture are driven by
  – Sizing
  – Data location
  – User location
  – Retention/access/governance
  – Availability requirements

Next Steps
1. Download the .conf2013 Mobile App
   If not iPhone, iPad or Android, use the Web App
2. Take the survey & WIN A PASS FOR .CONF2014… Or one of these bags!
3. View the sessions listed on the next slide
   Available on the Mobile App

More Information
• Contact: [email protected]
• Documentation: http://docs.splunk.com
• Answers: http://answers.splunk.com
• Other presentations
  – Best Practices: Deploying Splunk on Physical, Virtual, and Cloud Environments
  – Architecting Splunk for High Availability and Disaster Recovery

THANK YOU