Splunk 8.2.2 Search Reference
Table of Contents

Quick Reference
    Splunk Quick Reference Guide
    Command quick reference
    Commands by category
    Command types
    Splunk SPL for SQL users
    SPL data types and clauses

Evaluation Functions
    Evaluation functions
    Comparison and Conditional functions
    Conversion functions
    Cryptographic functions
    Date and Time functions
    Informational functions
    JSON functions
    Mathematical functions
    Multivalue eval functions
    Statistical eval functions
    Text functions
    Trig and Hyperbolic functions

Search Commands
    abstract
    accum
    addcoltotals
    addinfo
    addtotals
    analyzefields
    anomalies
    anomalousvalue
    anomalydetection
    append
    appendcols
    appendpipe
    arules
    associate
    audit
    autoregress
    awssnsalert
    bin
    bucket
    bucketdir
    cefout
    chart
    cluster
    cofilter
    collect
    concurrency
    contingency
    convert
    correlate
    ctable
    datamodel
    datamodelsimple
    dbinspect
    dbxquery
    dedup
    delete
    delta
    diff
    entitymerge
    erex
    eval
    eventcount
    eventstats
    extract
    fieldformat
    fields
    fieldsummary
    filldown
    fillnull
    findtypes
    folderize
    foreach
    format
    from
    gauge
    gentimes
    geom
    geomfilter
    geostats
    head
    highlight
    history
    iconify
    inputcsv
    inputintelligence
    inputlookup
    iplocation
    join
    kmeans
    kvform
    loadjob
    localize
    localop
    lookup
    makecontinuous
    makemv
    makeresults
    map
    mcollect
    metadata
    metasearch
    meventcollect
    mpreview
    msearch
    mstats
    multikv
    multisearch
    mvcombine
    mvexpand
    nomv
    outlier
    outputcsv
    outputlookup
    outputtext
    overlap
    pivot
    predict
    rangemap
    rare
    redistribute
    regex
    reltime
    rename
    replace
    require
    rest
    return
    reverse
    rex
    rtorder
    run
    savedsearch
    script
    scrub
    search
    searchtxn
    selfjoin
    sendemail
    set
    setfields
    sichart
    sirare
    sistats
    sitimechart
    sitop
    snowincident
    snowincidentstream
    snowevent
    snoweventstream
    sort
    spath
    stats
    strcat
    streamstats
    table
    tags
    tail
    timechart
    timewrap
    tojson
    top
    transaction
    transpose
    trendline
    tscollect
    tstats
    typeahead
    typelearner
    typer
    union
    uniq
    untable
    walklex
    where
    x11
    xmlkv
    xmlunescape
    xpath
    xsDisplayConcept
    xsDisplayContext
    xsFindBestConcept
    xsListConcepts
    xsListContexts
    xsUpdateDDContext
    xsWhere
    xyseries
    3rd party custom commands

Internal Commands
    About internal commands
    collapse
    dump
    findkeywords
    makejson
    mcatalog
    noop
    prjob
    runshellscript
    sendalert
Introduction
Getting Started
If you are new to Splunk software and searching, start with the Search Tutorial. This tutorial introduces you to the Search
& Reporting application. The tutorial guides you through uploading data to your Splunk deployment, searching your data,
and building simple charts, reports, and dashboards.
After you complete the Search Tutorial, and before you start using Splunk software on your own data, review the resources described in the following sections.
Search Manual
The Search Manual is a companion manual to the Search Reference. It contains detailed information about creating and optimizing searches, including:
• Types of searches
• Retrieving events
• Specifying time ranges
• Optimizing searches
• Using subsearches
• Creating statistical tables and charts
• Grouping and correlating events
• Predicting future events
• Managing jobs
SPL commands
• The Command quick reference topic contains an alphabetical list of each command, along with a brief description
of what the command does and a link to the specific documentation for the command.
• The Commands by category topic organizes the commands by the type of action that the command performs.
This topic contains a brief description of what the command does and a link to the specific documentation for the
command.
SQL users
If you're familiar with SQL, see Splunk SPL for SQL users to see how to use your SQL knowledge to learn SPL.
Command syntax
Before you continue, see Understanding SPL syntax for the conventions and rules used in this manual.
For example, in the syntax for the bin command, the required argument is <field>. To use the bin command, at a minimum you must specify bin <field>.
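A minimal sketch of what that looks like in practice. The syntax line below is simplified and omits the bin command's optional arguments, and the span value is only an illustration:

bin [<bins-options>...] <field> [AS <newfield>]

... | bin _time span=5m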
Repeating arguments
Some arguments can be specified multiple times. The syntax displays an ellipsis (...) to show which part of an argument can be repeated. The ellipsis always appears immediately after the part of the syntax that you can repeat.
In the syntax for the convert command, for example, the required argument is <convert-function>, with an option to specify a field with the [AS <field>] clause. Notice the ellipsis at the end of the syntax, just after the close parenthesis. In this example, the syntax that is inside the parentheses, <convert-function> [AS <field>], can be repeated.
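As an illustration, the convert command follows this pattern. The syntax line below is simplified, and the field names delay and duration are hypothetical:

convert [timeformat=<string>] (<convert-function> [AS <field>])...

... | convert dur2sec(delay) AS delay_secs num(duration)

Each (<convert-function> [AS <field>]) group can be repeated as many times as needed.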
Grouped arguments
Sometimes the syntax must display arguments as a group to show that the set of arguments is used together. Parentheses ( ) are used to group arguments.
In the syntax for the replace command, for example, the grouped argument is (<wc-string> WITH <wc-string>)... . This is a required set of arguments that you can repeat multiple times.
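For instance, the replace command takes one or more of these grouped argument sets, separated by commas. The host values in this sketch are hypothetical:

... | replace 127.0.0.1 WITH localhost, 192.168.* WITH internal IN host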
Keywords
Many commands use keywords with some of the arguments or options. Examples of keywords include:
• AS
• BY
• OVER
• WHERE
You can specify these keywords in uppercase or lowercase in your search. However, for readability, the syntax in the
Splunk documentation uses uppercase on all keywords.
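For example, the AS and BY keywords appear in many statistical searches. Lowercase also works, but uppercase matches the syntax shown in the documentation. The field name host is only an illustration:

... | stats count AS total BY host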
Quoted elements
If an element is in quotation marks, you must include that element in your search. The most common quoted elements are parentheses.
For example, part of the syntax for the chart command is:

"("<eval-expression>")"... [ BY <row-split> <column-split> ] | [ OVER <row-split> ] [BY <column-split>]

There are quotation marks on the parentheses surrounding the <eval-expression>. This means that you must enclose the <eval-expression> in parentheses in your search.
In the following search example, the <eval-expression> is avg(size)/max(delay) and is enclosed in parentheses.
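The original example search is not preserved in this copy, so the following is a representative sketch; the source type and the ratio field name are illustrative:

sourcetype=access_* | chart eval(avg(size)/max(delay)) AS ratio BY host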
Argument order
In the command syntax, the command arguments are presented in the order in which the arguments are meant to be
used.
In the argument descriptions, under the Required arguments and Optional arguments sections, the arguments are listed alphabetically. For each argument, there is a Syntax and a Description. Additionally, for optional arguments, there might be a Default.
Data types
The nomenclature used for the data types in SPL syntax is described in the following table.

<unsigned int> (unsigned integer): An unsigned integer must be a positive value. Unsigned integers can be larger numbers than signed integers.
Boolean operators
When a boolean operator is included in the syntax of a command, you must always specify the operator in uppercase.
Boolean operators include:
• AND
• OR
• NOT
To learn more about the order in which boolean expressions are evaluated, along with some examples, see Boolean
expressions in the Search Manual.
To learn more about the NOT operator, see Difference between NOT and != in the Search Manual.
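For example, with hypothetical status and clientip fields, uppercase Boolean operators combine search terms like this:

(status=404 OR status=503) NOT clientip=127.0.0.1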
BY clauses
When you use a <by-clause>, one row is returned for each distinct value of the <by-clause> field. A <by-clause> displays each unique item in a separate row. Think of the <by-clause> as a grouping.
The <split-by-clause> displays each unique item in a separate column. Think of the <split-by-clause> as a splitting or
dividing.
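As an illustration, with hypothetical host and status fields, a <by-clause> groups results into rows, while a <split-by-clause> spreads the second field across columns:

... | stats count BY host

... | chart count OVER host BY status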
When the syntax contains <field>, you specify a field name from your events.
The <field> argument is required. You can specify that the field displays a different name in the search results by using
the [AS <newfield>] argument. This argument is optional.
For example, if the field is categoryId and you want the field to be named CategoryID in the output, you would specify:
categoryId AS CategoryID
The <wc-field> argument indicates that you can use wildcard characters when specifying field names. For example, if you have a set of fields that end with "log", you can specify *log to return all of those fields.
If you use a wildcard character in the middle of a value, especially as a wildcard for punctuation, the results might be unpredictable.
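Two quick sketches, with hypothetical field names: using AS to give a field a different name in the output, and using a wildcard to select a set of fields:

... | stats values(categoryId) AS CategoryID

... | fields *log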
See also
• Anatomy of a search
• Wildcards
• Field expressions
• Quotes and escaping characters
How to use this manual
This manual serves as a reference guide for the Splunk user who is looking for a catalog of the search commands with
complete syntax, descriptions, and examples for usage.
• The Command quick reference topic contains an alphabetical list of each command, along with a brief description
of what the command does and a link to the specific documentation for the command.
• The Commands by category topic organizes the commands by the type of action that the command performs.
This topic contains a brief description of what the command does and a link to the specific documentation for the
command.
Functions
Some commands use functions with their arguments. For a list of the functions, with descriptions and examples, see Evaluation functions and Statistical and charting functions.
Command topics
Each search command topic contains the following sections: Description, Syntax, Examples, and See also. Many of the
command topics also have a Usage section.
Description
Describes what the command is used for. This section might include details about how to use the command. For
more complex commands, there might be a separate Usage section.
Syntax
The syntax includes the complete syntax for each search command, and a description for each argument. Some commands have arguments with a set of options that you can specify. Each of these sets of options follows the argument descriptions.
Required arguments
Displays the syntax and describes the required arguments.
Optional arguments
Displays the syntax and describes the optional arguments. Default values, if applicable, are also listed.
Usage
Contains additional information about using the command.
Examples
This section includes examples of how to use the command.
See also
This section contains links to all related or similar commands.
The command arguments are presented in the syntax in the order in which the arguments are meant to be used.
Arguments are either Required or Optional and are listed alphabetically under their respective subheadings. For each argument, there are Syntax and Description sections. Additionally, there might be other sections, such as Default, that provide information about the argument.
Formatting conventions
Italic
When referring to another manual in the set of Splunk documentation, the name of the manual appears in italic.
Quick Reference
Note: The examples in this quick reference use a leading ellipsis (...) to indicate that there is a search before the pipe
operator. A leading pipe indicates that the search command is a generating command and prevents the command-line
interface and Splunk Web from prepending the search command to your search.
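For example, an entry shown as ... | top limit=5 host assumes that some search precedes the pipe, while a generating command such as makeresults starts the search pipeline itself and is shown with a leading pipe. Both lines below are illustrative:

... | top limit=5 host

| makeresults count=3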
See also
Splunk Answers
If you cannot find what you are looking for in this search language reference, check out Splunk Answers and see what
questions and answers other Splunk users have about the search language.
Some of these commands share functions. For a list of the functions with descriptions and examples, see Evaluation
functions and Statistical and charting functions.
If you don't find a command in the table, that command might be part of a third-party app or add-on. For information about
commands contributed by apps and add-ons, see the documentation on Splunkbase.
addcoltotals: Computes an event that contains sum of all numeric fields for previous events. (Related commands: addtotals, stats)
addinfo: Add fields that contain common information about the current search. (Related commands: search)
addtotals: Computes the sum of all numeric fields for each result. (Related commands: addcoltotals, stats)
analyzefields: Analyze numerical fields for their ability to predict another discrete field. (Related commands: anomalousvalue)
anomalousvalue: Finds and summarizes irregular, or uncommon, search results. (Related commands: analyzefields, anomalies, cluster, kmeans, outlier)
anomalydetection: Identifies anomalous events by computing a probability for each event and then detecting unusually small probabilities. (Related commands: analyzefields, anomalies, anomalousvalue, cluster, kmeans, outlier)
append: Appends subsearch results to current results. (Related commands: appendcols, appendcsv, appendlookup, join, set)
appendcols: Appends the fields of the subsearch results to current results, first results to first result, second to second, etc. (Related commands: append, appendcsv, join, set)
appendpipe: Appends the result of the subpipeline applied to the current result set to results. (Related commands: append, appendcols, join, set)
audit: Returns audit trail information that is stored in the local audit index.
bin (bucket): Puts continuous numerical values into discrete sets. (Related commands: chart, timechart)
chart: Returns results in a tabular output for charting. See also, Statistical and charting functions. (Related commands: bin, sichart, timechart)
cofilter: Finds how many times field1 and field2 values occurred together. (Related commands: associate, correlate)
concurrency: Uses a duration field to find the number of "concurrent" events for each event. (Related commands: timechart)
datamodel: Examine data model or data model dataset and search a data model dataset. (Related commands: pivot)
erex: Allows you to specify example or counter example values to automatically extract fields that have similar values. (Related commands: extract, kvform, multikv, regex, rex, xmlkv)
eval: Calculates an expression and puts the value into a field. See also, Evaluation functions. (Related commands: where)
extract (kv): Extracts field-value pairs from search results. (Related commands: kvform, multikv, xmlkv, rex)
fields: Keeps or removes fields from search results based on the field list criteria.
fieldsummary: Generates summary information for all or a subset of the fields. (Related commands: analyzefields, anomalies, anomalousvalue, stats)
filldown: Replaces NULL values with the last non-NULL value. (Related commands: fillnull)
format: Takes the results of a subsearch and formats them into a single result.
gauge: Transforms results into a format suitable for display by the Gauge chart types.
geom: Adds a field, named geom, to each event. This field contains geographic data structures for polygon geometry in JSON and is used for the choropleth map visualization. (Related commands: geomfilter)
geomfilter: Accepts two points that specify a bounding box for clipping a choropleth map. Points that fall outside of the bounding box are filtered out. (Related commands: geom)
iconify: Displays a unique icon for each different value in the list of fields that you specify. (Related commands: highlight)
inputcsv: Loads search results from the specified CSV file. (Related commands: loadjob, outputcsv)
inputlookup: Loads search results from a specified static lookup table. (Related commands: inputcsv, join, lookup, outputlookup)
join: Combine the results of a subsearch with the results of a main search. (Related commands: appendcols, lookup, selfjoin)
kmeans: Performs k-means clustering on selected fields. (Related commands: anomalies, anomalousvalue, cluster, outlier)
localize: Returns a list of the time ranges in which the search results were found. (Related commands: map, transaction)
makemv: Change a specified field into a multivalued field during a search. (Related commands: mvcombine, mvexpand, nomv)
mcollect: Converts search results into metric data and inserts the data into a metric index on the search head. (Related commands: collect, meventcollect)
meventcollect: Converts search results into metric data and inserts the data into a metric index on the indexers. (Related commands: collect, mcollect)
multisearch: Run multiple streaming searches at the same time. (Related commands: append, join)
mvcombine: Combines events in search results that have a single differing field value into one result with a multivalue field of the differing field. (Related commands: mvexpand, makemv, nomv)
mvexpand: Expands the values of a multivalue field into separate events for each value of the multivalue field. (Related commands: mvcombine, makemv, nomv)
outputlookup: Writes search results to the specified static lookup table. (Related commands: inputlookup, lookup, outputcsv)
outputtext: Outputs the raw text field (_raw) of results into the _xml field. (Related commands: outputcsv)
pivot: Run pivot searches against a particular data model dataset. (Related commands: datamodel)
rangemap: Sets RANGE field to the name of the ranges that match.
rare: Displays the least common values of a field. (Related commands: sirare, stats, top)
regex: Removes results that do not match the specified regular expression. (Related commands: rex, search)
require: Causes a search to fail if the queries and commands that precede it in the search string return zero events or results.
rex: Specify a Perl regular expression named groups to extract fields while you search. (Related commands: extract, kvform, multikv, xmlkv, regex)
rtorder: Buffers events from real-time search to emit them in ascending time order when possible.
script (run): Runs an external Perl or Python script as part of your search.
set: Performs set operations (union, diff, intersect) on subsearches. (Related commands: append, appendcols, join, diff)
setfields: Sets the field values for all results to a common value. (Related commands: eval, fillnull, rename)
sichart: Summary indexing version of the chart command. (Related commands: chart, sitimechart, timechart)
sitimechart: Summary indexing version of the timechart command. (Related commands: chart, sichart, timechart)
streamstats: Adds summary statistics to all search results in a streaming manner. (Related commands: eventstats, stats)
tags: Annotates specified fields in your search results with tags. (Related commands: eval)
timechart: Create a time series chart and corresponding table of statistics. See also, Statistical and charting functions. (Related commands: chart, bucket)
tscollect: Writes results into tsidx file(s) for later use by the tstats command. (Related commands: collect, stats, tstats)
tstats: Calculates statistics over tsidx files created with the tscollect command. (Related commands: stats, tscollect)
union: Merges the results from two or more datasets into one dataset.
uniq: Removes any search result that is an exact duplicate of a previous result. (Related commands: dedup)
where: Performs arbitrary filtering on your data. See also, Evaluation functions. (Related commands: eval)
xyseries: Converts results into a format suitable for graphing.
Commands by category
The following tables list all the search commands, categorized by their usage. Some commands fit into more than one
category based on the options that you specify.
Correlation
append: Appends subsearch results to current results.
appendcols: Appends the fields of the subsearch results to current results, first results to first result, second to second, etc.
appendpipe: Appends the result of the subpipeline applied to the current result set to results.
contingency, counttable, ctable: Builds a contingency table for two fields.
join: Combines the results from the main results pipeline with the results from a subsearch.
stats: Provides statistics, grouped optionally by fields. See Statistical and charting functions.
Data and indexes
These commands can be used to learn more about your data, add and delete data sources, or manage the data in your summary indexes.
View data
These commands return information about the data you have in your indexes. They do not modify your data or indexes in
any way.
audit: Returns audit trail information that is stored in the local audit index.
dbinspect: Returns information about the specified index.
metadata: Returns a list of source, sourcetypes, or hosts from a specified index or distributed search peer.
Manage data
These are some commands you can use to add data sources to or delete specific data from your indexes.
delete: Delete specific events or search results.
Manage summary indexes
These commands are used to create and manage your summary indexes.
collect, stash: Puts search results into a summary index.
overlap: Finds events in a summary index that overlap in time or have missed events.
sichart: Summary indexing version of chart. Computes the necessary information for you to later run a chart search on the summary index.
sirare: Summary indexing version of rare. Computes the necessary information for you to later run a rare search on the summary index.
sistats: Summary indexing version of stats. Computes the necessary information for you to later run a stats search on the summary index.
sitimechart: Summary indexing version of timechart. Computes the necessary information for you to later run a timechart search on the summary index.
sitop: Summary indexing version of top. Computes the necessary information for you to later run a top search on the summary index.
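As a sketch of how the si* commands are typically used, assuming a hypothetical summary index named summary and a saved search named my_summary_search: a scheduled search with summary indexing enabled writes the pre-computed results, and a later search reports on them with the matching non-si command.

Scheduled search: ... | sistats avg(bytes) BY host

Later report: index=summary source="my_summary_search" | stats avg(bytes) BY host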
Fields
These are commands you can use to add, extract, and modify fields or field values. The most useful command for
manipulating fields is eval and its statistical and charting functions.
Add fields
accum: Keeps a running total of the specified numeric field.
addinfo: Add fields that contain common information about the current search.
addtotals: Computes the sum of all numeric fields for each result.
eval: Calculates an expression and puts the value into a field. See also, evaluation functions.
iplocation: Adds location information, such as city, country, latitude, longitude, and so on, based on IP addresses.
lookup: For configured lookup tables, explicitly invokes the field value lookup and adds fields from the lookup table to the events.
rangemap: Sets RANGE field to the name of the ranges that match.
strcat: Concatenates string values and saves the result to a specified field.
Extract fields
These commands provide different ways to extract new fields from search results.
erex: Allows you to specify example or counter example values to automatically extract fields that have similar values.
rex: Specify a Perl regular expression named groups to extract fields while you search.
spath: Provides a straightforward means for extracting fields from structured data formats, XML and JSON.
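For example, a rex extraction that pulls a hypothetical user field out of the raw event text:

... | rex field=_raw "user=(?<user>\w+)"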
Modify fields and field values
convert: Converts field values into numerical values.
nomv: Changes a specified multivalue field into a single-value field at search time.
reltime: Converts the difference between 'now' and '_time' to a human-readable value and adds this value to the field, 'reltime', in your search results.
Find anomalies
These commands are used to find anomalies in your data. Either search for uncommon or outlying events and fields or
cluster similar events together.
analyzefields, af: Analyze numerical fields for their ability to predict another discrete field.
anomalydetection: Identifies anomalous events by computing a probability for each event and then detecting unusually small probabilities.
Geographic and location
iplocation: Returns location information, such as city, country, latitude, longitude, and so on, based on IP addresses.
geom: Adds a field, named "geom", to each event. This field contains geographic data structures for polygon geometry in JSON and is used for choropleth map visualization. This command requires an external lookup with external_type=geo to be installed.
geomfilter: Accepts two points that specify a bounding box for clipping choropleth maps. Points that fall outside of the bounding box are filtered out.
geostats: Generate statistics which are clustered into geographical bins to be rendered on a world map.
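A common pattern, with clientip as a hypothetical field, is to resolve IP addresses to locations and then bin the results geographically for a map:

... | iplocation clientip | geostats count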
Metrics
mcollect: Converts events into metric data points and inserts the data points into a metric index on the search head.
meventcollect: Converts events into metric data points and inserts the data points into a metric index on indexer tier.
mpreview, msearch: Provides samples of the raw metric data points in the metric time series in your metrics indexes. Helps you troubleshoot your metrics data.
mstats: Calculates visualization-ready statistics for the measurement, metric_name, and dimension fields in metric indexes.
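For example, a sketch of an mstats search; the metric name cpu.usage and the index my_metrics are hypothetical:

| mstats avg(cpu.usage) WHERE index=my_metrics span=5m BY host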
Prediction and trending
These commands predict future values and calculate trendlines that can be used to create visualizations.
predict: Enables you to use time series algorithms to predict future values of fields.
x11: Enables you to determine the trend in your data by removing the seasonal pattern.
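For instance, a prediction over a time series produced by timechart; the span and the daily_total field name are illustrative:

... | timechart span=1d count AS daily_total | predict daily_total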
Reports
These commands are used to build transforming searches. These commands return statistical data tables that are
required for charts and other kinds of data visualizations.
addtotals: Computes the sum of all numeric fields for each result.
autoregress: Prepares your events for calculating the autoregression, or moving average, based on a field that you specify.
chart: Returns results in a tabular output for charting. See also, Statistical and charting functions.
contingency, counttable, ctable: Builds a contingency table for two fields.
gauge: Transforms results into a format suitable for display by the Gauge chart types.
makecontinuous: Makes a field that is supposed to be the x-axis continuous (invoked by chart/timechart).
mstats: Calculates statistics for the measurement, metric_name, and dimension fields in metric indexes.
stats: Provides statistics, grouped optionally by fields. See also, Statistical and charting functions.
timechart: Create a time series chart and corresponding table of statistics. See also, Statistical and charting functions.
untable: Converts results from a tabular format to a format similar to stats output. Inverse of xyseries and maketable.
Results
These commands can be used to manage search results. For example, you can append one set of results with another,
filter more events from the results, reformat the results, and so on.
Alerting
Command Description
sendemail Emails search results, either inline or as an attachment, to one or more specified email addresses.
Appending
Use these commands to append one set of results with another set or to itself.
Command Description
append Appends subsearch results to current results.
appendcols Appends the fields of the subsearch results to current results, first results to first result, second to second, and so on.
join SQL-like joining of results from the main results pipeline with the results from the subpipeline.
Filtering
Use these commands to remove more events or fields from your current results.
Command Description
dedup Removes subsequent results that match a specified criteria.
from Retrieves data from a dataset, such as a data model dataset, a CSV lookup, a KV Store lookup, a saved search, or a table dataset.
mvcombine Combines events in search results that have a single differing field value into one result with a multivalue field of the differing field.
regex Removes results that do not match the specified regular expression.
uniq Removes any result that is an exact duplicate of a previous result.
where Performs arbitrary filtering on your data. See also, Evaluation functions.
Formatting
Command Description
fieldformat Uses eval expressions to change the format of field values when they are rendered without changing their underlying values. Does not apply to exported data.
transpose Reformats rows of search results as columns. Useful for fixing X- and Y-axis display issues with charts, or for turning sets of data into a series to produce a chart.
untable Converts results from a tabular format to a format similar to stats output. Inverse of xyseries and maketable.
Generating
Command Description
gentimes Returns results that match a time-range.
mvexpand Expands the values of a multivalue field into separate events for each value of the multivalue field.
search Searches indexes for matching events. This command is implicit at the start of every search pipeline that does not begin with another generating command.
Grouping
Command Description
cluster Clusters similar events together.
mvexpand Expands the values of a multivalue field into separate events for each value of the multivalue field.
Ordering
Use these commands to change the order of the current search results.
Command Description
head Returns the first number n of specified results.
Reading
Use these commands to read in results from external files or previous searches.
Command Description
inputcsv Loads search results from the specified CSV file.
Writing
Command Description
collect, stash Puts search results into a summary index.
meventcollect Converts events into metric data points and inserts the data points into a metric index on indexer tier.
mcollect Converts events into metric data points and inserts the data points into a metric index on the search head.
outputtext Outputs the raw text field (_raw) of results into the _xml field.
sendemail Emails search results, either inline or as an attachment, to one or more specified email addresses.
Search
Command Description
localop Run subsequent commands, that is all commands following this, locally and not on a remote peer.
redistribute Invokes parallel reduce search processing to shorten the search runtime of a set of supported SPL commands.
search Searches indexes for matching events. This command is implicit at the start of every search pipeline that does not begin with another generating command.
sendemail Emails search results, either inline or as an attachment, to one or more specified email addresses.
Subsearch
Command Description
append Appends subsearch results to current results.
appendcols Appends the fields of the subsearch results to current results, first results to first result, second to second, and so on.
appendpipe Appends the result of the subpipeline applied to the current result set to results.
foreach Runs a templated streaming subsearch for each field in a wildcarded field list.
format Takes the results of a subsearch and formats them into a single result.
join Combine the results of a subsearch with the results of a main search.
Time
Use these commands to search based on time ranges or add time information to your events.
Command Description
gentimes Returns results that match a time-range.
localize Returns a list of the time ranges in which the search results were found.
reltime Converts the difference between 'now' and '_time' to a human-readable value and adds this value to the field, 'reltime', in your search results.
Command types
There are six broad types for all of the search commands: distributable streaming, centralized streaming, transforming,
generating, orchestrating and dataset processing. These types are not mutually exclusive. A command might be
streaming or transforming, and also generating.
The following tables list the commands that fit into each of these types. For detailed explanations about each of the types,
see Types of commands in the Search Manual.
Streaming commands
• A distributable streaming command runs on the indexer or the search head, depending on where in the search
the command is invoked. Distributable streaming commands can be applied to subsets of indexed data in a
parallel manner.
• A centralized streaming command applies a transformation to each event returned by a search. Unlike
distributable streaming commands, a centralized streaming command only works on the search head.
Command Notes
addinfo Distributable streaming
addtotals Distributable streaming. A transforming command when used to calculate column totals (not row totals).
anomalydetection
append
arules
bin Streaming if specified with the span argument. Otherwise a dataset processing command.
bucketdir
dedup Streaming by default. Using the sortby argument or specifying keepevents=true makes the dedup command a dataset processing command.
fields Distributable streaming.
fillnull Distributable streaming when a field-list is specified. A dataset processing command when no field-list is specified.
join Centralized streaming, if there is a defined set of fields to join to. A dataset processing command when no field-list is specified.
lookup Distributable streaming when specified with local=false, which is the default. An orchestrating command when local=true.
search Distributable streaming if used further down the search pipeline. A generating command when it is the first command in the search.
xmlunescape
xyseries Distributable streaming if the argument grouped=false is specified, which is the default. Otherwise a transforming command.
Generating commands
A generating command generates events or reports from one or more indexes without transforming the events.
Command Notes
datamodel Report-generating
dbinspect Report-generating.
eventcount Report-generating.
from Can be either report-generating or event-generating depending on the search or knowledge object that is referenced by the command.
gentimes Event-generating.
makeresults Report-generating.
metadata Report-generating. Although metadata fetches data from all peers, any command run after it runs only on the search head.
metasearch Event-generating.
multisearch Event-generating.
pivot Report-generating.
rest
search Event-generating (distributable) when the first command in the search, which is the default. A streaming (distributable) command if used later in the search pipeline.
searchtxn Event-generating.
set Event-generating.
Transforming commands
A transforming command orders the results into a data table. The command "transforms" the specified cell values for
each event into numerical values for statistical purposes.
In earlier versions of Splunk software, transforming commands were called reporting commands.
Command Notes
addtotals Transforming when used to calculate column totals (not row totals). A distributable streaming command when used to calculate row totals, which is the default.
chart
cofilter
contingency
history
makecontinuous
mvcombine
rare
stats
table
timechart
top
xyseries Transforming if grouped=true. A streaming (distributable) command when grouped=false, which is the default setting.
Orchestrating commands
Orchestrating commands control some aspect of how a search is processed. They do not directly affect the final result
set of the search. For example, you might apply an orchestrating command to a search to enable or disable a search
optimization that helps the overall search complete faster.
Command Notes
localop
lookup Only becomes an orchestrating command when local=true. This forces the lookup command to run on the search head and not on any remote peers. A streaming (distributable) command when local=false, which is the default setting.
noop
redistribute
require
Dataset processing commands
A dataset processing command is a command that requires the entire dataset before the command can run. Some of
these commands fit into other command types in specific situations or when specific arguments are used.
Command Notes
anomalousvalue Some modes
concurrency
datamodel
dedup Using the sortby argument or specifying keepevents=true makes the dedup command a dataset processing command. Otherwise, dedup is a streaming command.
eventstats
fieldsummary
join Some modes. A centralized streaming command when there is a defined set of fields to join to.
map
outlier
reverse
sort
tail
Concepts
The Splunk platform does not store data in a conventional database. Rather, it stores data in a distributed, non-relational,
semi-structured database with an implicit time dimension. Relational databases require that all table columns be
defined up-front and they do not automatically scale by just plugging in new hardware. However, there are analogues to
many of the concepts in the database world.
Database concept: SQL query
Splunk concept: Splunk search
Notes: A Splunk search retrieves indexed data and can perform transforming and reporting operations. Results from one search can be "piped", or transferred, from command to command, to filter, modify, reorder, and group your results.

Database concept: table/view
Splunk concept: search results
Notes: Search results can be thought of as a database view, a dynamically generated table of rows, with columns.

Database concept: index
Splunk concept: index
Notes: All values and fields are indexed by Splunk software, so there is no need to manually add, update, drop, or even think about indexing columns. Everything can be quickly retrieved automatically.

Database concept: row
Splunk concept: result/event
Notes: A result in a Splunk search is a list of field (i.e., column) values, corresponding to a table row. An event is a result that has a timestamp and raw text. Typically an event is a record from a log file, such as:
173.26.34.223 - - [01/Jul/2009:12:05:27 -0700] "GET /trade/app?action=logout HTTP/1.1" 200 2953

Database concept: column
Splunk concept: field
Notes: Fields are returned dynamically from a search, meaning that one search might return a set of fields, while another search might return another set. After teaching Splunk software how to extract more fields from the raw underlying data, the same search will return more fields than it previously did. Fields are not tied to a datatype.

Database concept: database/schema
Splunk concept: index/app
Notes: A Splunk index is a collection of data, somewhat like a database has a collection of tables. Domain knowledge of that data, how to extract it, what reports to run, etc, are stored in a Splunk application.
SQL is designed to search relational database tables which are comprised of columns. SPL is designed to search
events, which are comprised of fields. In SQL, you often see examples that use "mytable" and "mycolumn". In SPL, you
will see examples that refer to "fields". In these examples, the "source" field is used as a proxy for "table". In Splunk
software, "source" is the name of the file, stream, or other input from which a particular piece of data originates, for
example /var/log/messages or UDP:514.
When translating from any language to another, often the translation is longer because of idioms in the original
language. Some of the Splunk search examples shown below could be more concise and more efficient, but for
parallelism and clarity, the SPL table and field names are kept the same as the SQL example.
• SPL searches rarely need the FIELDS command to filter out columns because the user interface provides a more
convenient method for filtering. The FIELDS command is used in the SPL examples for parallelism.
• With SPL, you never have to use the AND operator in Boolean searches, because AND is implied between terms.
However when you use the AND or OR operators, they must be specified in uppercase.
• SPL commands do not need to be specified in uppercase. In these SPL examples, the commands are
specified in uppercase for easier identification and clarity.
SELECT *
SQL example: SELECT * FROM mytable
SPL example: source=mytable

WHERE
SQL example: SELECT * FROM mytable WHERE mycolumn=5
SPL example: source=mytable mycolumn=5

AND/OR
SQL example: SELECT * FROM mytable WHERE (mycolumn1="true" OR mycolumn2="red") AND mycolumn3="blue"
SPL example: source=mytable AND (mycolumn1="true" OR mycolumn2="red") AND mycolumn3="blue"
Note: The AND operator is implied in SPL and does not need to be specified. For this example you could also use:
source=mytable (mycolumn1="true" OR mycolumn2="red") mycolumn3="blue"
AS (alias)
SQL example: SELECT mycolumn AS column_alias FROM mytable
SPL example: source=mytable | RENAME mycolumn as column_alias | FIELDS column_alias

BETWEEN
SQL example: SELECT * FROM mytable WHERE mycolumn BETWEEN 1 AND 5
SPL example: source=mytable mycolumn>=1 mycolumn<=5

GROUP BY
SQL example: SELECT mycolumn, avg(mycolumn) FROM mytable WHERE mycolumn=value GROUP BY mycolumn
SPL example: source=mytable mycolumn=value | STATS avg(mycolumn) BY mycolumn | FIELDS mycolumn, avg(mycolumn)
Note: Several commands use a by-clause to group information, including chart, rare, sort, stats, and timechart.
LIKE
SQL example: SELECT * FROM mytable WHERE mycolumn LIKE "%some text%"
SPL example: source=mytable mycolumn="*some text*"
Note: The most common search in Splunk SPL is nearly impossible in SQL - to search all fields for a substring. The following SPL search returns all rows that contain "some text" anywhere:
source=mytable "some text"

ORDER BY
SQL example: SELECT * FROM mytable ORDER BY mycolumn desc
SPL example: source=mytable | SORT -mycolumn
Note: In SPL you use a negative sign ( - ) in front of a field name to sort in descending order.

SELECT TOP
SQL example: SELECT TOP(5) mycolum1, mycolum2 FROM mytable1 WHERE mycolum3 = "bar" ORDER BY mycolum1 mycolum2
SPL example: source=mytable1 mycolum3="bar" | FIELDS mycolum1 mycolum2 | SORT mycolum1 mycolum2 | HEAD 5
INNER JOIN
SQL example: SELECT * FROM mytable1 INNER JOIN mytable2 ON mytable1.mycolumn=mytable2.mycolumn
SPL example: index=myIndex1 OR index=myIndex2 | stats values(*) AS * BY myField
Note: There are two other methods to join tables:
• Use the lookup command to add fields from an external table.
• Use a subsearch:
source=mytable1 [SEARCH source=mytable2 mycolumn2=myvalue | FIELDS mycolumn2]
If the columns that you want to join on have different names, use the rename command to rename one of the columns. For example, to rename the column in mytable2:
source=mytable1 | JOIN type=inner mycolumn [ SEARCH source=mytable2 | RENAME mycolumn2 AS mycolumn]
To rename the column in myindex1:
index=myIndex1 OR index=myIndex2 | rename myfield1 as myField | stats values(*) AS * BY myField
You can rename a column regardless of whether you use the search command, a lookup, or a subsearch.
LEFT (OUTER) JOIN
SQL example: SELECT * FROM mytable1 LEFT JOIN mytable2 ON mytable1.mycolumn=mytable2.mycolumn
SPL example: source=mytable1 | JOIN type=left mycolumn [SEARCH source=mytable2]
SELECT INTO
SQL example: SELECT * INTO new_mytable IN mydb2 FROM old_mytable
SPL example: source=old_mytable | EVAL source=new_mytable | COLLECT index=mydb2
Note: COLLECT is typically used to store expensively calculated fields back into your Splunk deployment so that future access is much faster. This current example is atypical but shown for comparison to the SQL command. The source will be renamed orig_source.
UNION ALL
SQL example: SELECT * FROM mytable1 UNION ALL SELECT * FROM mytable2
SPL example: source=mytable1 | APPEND [SEARCH source=mytable2]
See also
bool
The <bool> argument value represents the Boolean data type. The documentation specifies 'true' or 'false'. Other
variations of Boolean values are accepted in commands. For example, for 'true' you can also use 't', 'T', 'TRUE', or the
number one '1'. For 'false', you can use 'f', 'F', 'FALSE', or the number zero '0'.
int
num
float
bin-span
by-clause
Syntax: by <field-list>
Description: Fields to group by.
Example: BY addr, port
Example: BY host
eval-function
Syntax: abs | case | cidrmatch | coalesce | exact | exp | floor | if | ifnull | isbool | isint | isnotnull | isnull | isnum |
isstr | len | like | ln | log | lower | match | max | md5 | min | mvcount | mvindex | mvfilter | now | null | nullif | pi | pow |
random | replace | round | searchmatch | sqrt | substr | tostring | trim | ltrim | rtrim | typeof | upper | urldecode |
validate
Description: Function used by eval.
Example: md5(field)
Example: typeof(12) + typeof("string") + typeof(1==2) + typeof(badfield)
Example: searchmatch("foo AND bar")
Example: sqrt(9)
Example: round(3.5)
Example: replace(date, "^(\d{1,2})/(\d{1,2})/", "\2/\1/")
Example: pi()
Example: nullif(fielda, fieldb)
Example: random()
Example: pow(x, y)
Example: mvfilter(match(email, "\.net$") OR match(email, "\.org$"))
Example: mvindex(multifield, 2)
Example: null()
Example: now()
Example: isbool(field)
Example: exp(3)
Example: floor(1.9)
Example: coalesce(null(), "Returned value", null())
Example: exact(3.14 * num)
Example: case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")
Example: cidrmatch("123.132.32.0/25", ip)
Example: abs(number)
Example: isnotnull(field)
Example: substr("string", 1, 3) + substr("string", -3)
Example: if(error == 200, "OK", "Error")
Example: len(field)
Example: log(number, 2)
Example: lower(username)
Example: match(field, "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$")
Example: max(1, 3, 6, 7, "foo", field)
Example: like(field, "foo%")
Example: ln(bytes)
Example: mvcount(multifield)
Example: urldecode("https%3A%2F%2Ffanyv88.com%3A443%2Fhttp%2Fwww.splunk.com%2Fdownload%3Fr%3Dheader")
Example: validate(isint(port), "ERROR: Port is not an integer", port >= 1 AND port <= 65535, "ERROR: Port is out
of range")
Example: tostring(1==1) + " " + tostring(15, "hex") + " " + tostring(12345.6789, "commas")
Example: trim(" ZZZZabcZZ ", " Z")
evaled-field
Syntax: eval(<eval-expression>)
Description: A dynamically evaled field
field
field-list
regex-expression
Syntax: (\")?<string>(\")?
Description: A Perl Compatible Regular Expression supported by the PCRE library.
Example: ... | regex _raw="(?<!\d)10.\d{1,3}\.\d{1,3}\.\d{1,3}(?!\d)"
single-agg
sort-by-clause
Example: _time, -host
span-length
Syntax: <int:span>(<timescale>)?
Description: Span of each bin. If using a timescale, this is used as a time range. If not, this is an absolute bucket
"length."
Example: 2d
Example: 5m
Example: 10
split-by-clause
stats-agg
stats-agg-term
subsearch
Syntax: [<string>]
Description: Specifies a subsearch.
Example: [search 404 | select url]
tc-option
Example: otherstr=OTHERFIELDS
Example: usenull=f
Example: bins=10
timeformat
Syntax: timeformat=<string>
Description: Set the time format for starttime and endtime terms.
Example: timeformat=%m/%d/%Y:%H:%M:%S
timestamp
Syntax: (MM/DD/YY)?:(HH:MM:SS)?|<int>
Description: None
Example: 10/1/07:12:34:56
Example: -5
where-clause
wc-field
Evaluation Functions
Evaluation functions
Use the evaluation functions to evaluate an expression, based on your events, and return a result.
Quick reference
See the Supported functions and syntax section for a quick reference list of the evaluation functions.
Commands
You can use evaluation functions with the eval, fieldformat, and where commands, and as part of eval expressions with
other commands.
Usage
• All functions that accept strings can accept literal strings or any field.
• All functions that accept numbers can accept literal numbers or any numeric field.
For most evaluation functions, when a string argument is expected, you can specify either a literal string or a field name.
Literal strings must be enclosed in double quotation marks. In other words, when the function syntax specifies a string you
can specify any expression that results in a string. For example, you have a field called name which contains the names
of your servers. You want to append the literal string server at the end of the name. You would specify this: name +
"server".
Nested functions
In the following example, the cidrmatch function is used as the first argument in the if function.
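A sketch of that nesting (the subnet and the ip field are illustrative values):
... | eval isLocal = if(cidrmatch("123.132.32.0/25", ip), "local", "not local")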
The following example shows how to use the true() function to provide a default to the case function.
... | eval error=case(status == 200, "OK", status == 404, "Not found", true(), "Other")
There are two ways that you can see information about the supported evaluation functions: organized by category, or organized alphabetically.
Function list by category
The following table is a quick reference of the supported evaluation functions, organized by category. This table provides
a brief description for each function. Use the links in the table to learn more about each function and to see examples.
Comparison and Conditional functions:
lookup(<lookup_table>, <json_object>, <json_array>): This function performs a CSV lookup. It returns the output field or fields in the form of a JSON object.
searchmatch(X): Use this function to return TRUE if the search string (X) matches the event.
Date and Time functions:
now(): Returns the time that the search was started.
relative_time(X,Y): Adjusts the time by a relative time specifier.
strftime(X,Y): Takes a UNIX time and renders it into a human readable format.
strptime(X,Y): Takes a human readable time and renders it into UNIX time.
time(): The time that the eval function was computed. The time will be different for each event, based on when the event was processed.
Multivalue eval functions:
mvcount(MVFIELD): Returns the count of the number of values in the specified field.
mvdedup(X): Removes all of the duplicate values from a multivalue field.
Statistical eval functions:
max(X,...): Returns the maximum of the string or numeric values.
Text functions:
rtrim(X,Y): Returns X with the characters in Y trimmed from the right side.
The following table is a quick reference of the supported evaluation functions, organized alphabetically. This table
provides a brief description for each function. Use the links in the table to learn more about each function and to see
examples.
abs(X): Returns the absolute value. (Mathematical functions)
acos(X): Computes the arc cosine of X. (Trigonometry and Hyperbolic functions)
acosh(X): Computes the arc hyperbolic cosine of X. (Trigonometry and Hyperbolic functions)
asin(X): Computes the arc sine of X. (Trigonometry and Hyperbolic functions)
asinh(X): Computes the arc hyperbolic sine of X. (Trigonometry and Hyperbolic functions)
atan(X): Computes the arc tangent of X. (Trigonometry and Hyperbolic functions)
atan2(X,Y): Computes the arc tangent of X,Y. (Trigonometry and Hyperbolic functions)
atanh(X): Computes the arc hyperbolic tangent of X. (Trigonometry and Hyperbolic functions)
avg(X,...): Returns the average of numerical values as an integer. (Statistical eval functions)
case(X,"Y",...): Accepts alternating conditions and values. Returns the first value for which the condition evaluates to TRUE. (Comparison and Conditional functions)
ceiling(X): Rounds the value up to the next highest integer. (Mathematical functions)
coalesce(X,...): This function takes an arbitrary number of arguments and returns the first value that is not NULL. (Comparison and Conditional functions)
commands(X): Returns a multivalued field that contains a list of the commands used in X. (Multivalue eval functions)
cos(X): Computes the cosine of an angle of X radians. (Trigonometry and Hyperbolic functions)
cosh(X): Computes the hyperbolic cosine of X radians. (Trigonometry and Hyperbolic functions)
exact(X): Returns the result of a numeric eval calculation with a larger amount of precision in the formatted output. (Mathematical functions)
exp(X): Returns the exponential function e^X. (Mathematical functions)
false(): Returns FALSE. (Comparison and Conditional functions)
floor(X): Rounds the value down to the next lowest integer. (Mathematical functions)
hypot(X,Y): Computes the hypotenuse of a triangle. (Trigonometry and Hyperbolic functions)
if(X,Y,Z): If the condition X evaluates to TRUE, returns Y, otherwise returns Z. (Comparison and Conditional functions)
in(FIELD, VALUE-LIST): The function returns TRUE if one of the values in the list matches a value in the field you specify. (Comparison and Conditional functions)
isbool(X): Returns TRUE if the field value is Boolean. (Informational functions)
isint(X): Returns TRUE if the field value is an integer. (Informational functions)
isnotnull(X): Returns TRUE if the field value is not NULL. (Informational functions)
isnull(X): Returns TRUE if the field value is NULL. (Informational functions)
isnum(X): Returns TRUE if the field value is a number. (Informational functions)
isstr(X): Returns TRUE if the field value is a string. (Informational functions)
json_array_to_mv(<json_array>, <Boolean>): Maps the elements of a proper JSON array into a multivalue field. (JSON functions)
json_extract(<json>, <paths>): Returns a value from a piece of JSON and zero or more paths. The value is returned in either a JSON array, or a Splunk software native type value. (JSON functions)
json_keys(<json>): Returns the keys from the key-value pairs in a JSON object. The keys are returned as a JSON array. (JSON functions)
json_object(<members>): Creates a new JSON object from members of key-value pairs. (JSON functions)
like(TEXT, PATTERN): Returns TRUE if TEXT matches PATTERN. (Comparison and Conditional functions)
lookup(<lookup_table>, <json_object>, <json_array>): This function performs a CSV lookup. It returns the output field or fields in the form of a JSON object. (Comparison and Conditional functions)
ln(X): Returns the natural logarithm. (Mathematical functions)
ltrim(X,Y): Trims the characters represented in Y from the left side of the string. (Text functions)
max(X,...): Returns the maximum of the string or numeric values. (Statistical eval functions)
md5(X): Computes the md5 hash for the value X. (Cryptographic functions)
min(X,...): Returns the minimum of the string or numeric values. (Statistical eval functions)
mvappend(X,...): Returns a multivalue result based on all of the values specified. (Multivalue eval functions)
mvcount(MVFIELD): Returns the count of the number of values in the specified field. (Multivalue eval functions)
mvdedup(X): Removes all of the duplicate values from a multivalue field. (Multivalue eval functions)
mvfilter(X): Filters a multivalue field based on an arbitrary Boolean expression X. (Multivalue eval functions)
mvfind(MVFIELD,"REGEX"): Finds the index of a value in a multivalue field that matches the REGEX. (Multivalue eval functions)
mvjoin(MVFIELD,STR): Takes all of the values in a multivalue field and appends them together delimited by STR. (Multivalue eval functions)
mvrange(X,Y,Z): Creates a multivalue field with a range of numbers between X and Y, incrementing by Z. (Multivalue eval functions)
mvsort(X): Returns the values of a multivalue field sorted lexicographically. (Multivalue eval functions)
mv_to_json_array(<field>, <Boolean>): Maps the elements of a multivalue field to a JSON array. (JSON functions)
null(): This function takes no arguments and returns NULL. (Comparison and Conditional functions)
nullif(X,Y): This function is used to compare fields. The function takes two arguments, X and Y, and returns NULL if X = Y. Otherwise it returns X. (Comparison and Conditional functions)
pi(): Returns the constant pi to 11 digits of precision. (Mathematical functions)
pow(X,Y): Returns X to the power of Y, X^Y. (Mathematical functions)
random(): Returns a pseudo-random integer ranging from zero to 2^31-1. (Statistical eval functions)
rtrim(X,Y): Returns X with the characters in Y trimmed from the right side. (Text functions)
searchmatch(X): Use this function to return TRUE if the search string (X) matches the event. (Comparison and Conditional functions)
sha1(X): Computes the sha1 hash for the value X. (Cryptographic functions)
sha256(X): Computes the sha256 hash for the value X. (Cryptographic functions)
sha512(X): Computes the sha512 hash for the value X. (Cryptographic functions)
sigfig(X): Rounds X to the appropriate number of significant figures. (Mathematical functions)
sin(X): Computes the sine of X. (Trigonometry and Hyperbolic functions)
sinh(X): Computes the hyperbolic sine of X. (Trigonometry and Hyperbolic functions)
split(X,"Y"): Returns a multivalue field, splitting X by the delimiter character Y. (Multivalue eval functions)
sqrt(X): Returns the square root of the value. (Mathematical functions)
tan(X): Computes the tangent of X. (Trigonometry and Hyperbolic functions)
tanh(X): Computes the hyperbolic tangent of X. (Trigonometry and Hyperbolic functions)
time(): The time that the eval function was computed. The time will be different for each event, based on when the event was processed. (Date and Time functions)
tostring(X,Y): Converts the input, such as a number or a Boolean value, to a string. (Conversion functions)
true(): Returns TRUE. (Comparison and Conditional functions)
typeof(X): Returns a string that indicates the field type, such as Number, String, Boolean, and so forth. (Informational functions)
urldecode(X): Replaces URL escaped characters with the original characters. (Text functions)
See also
Topics:
Statistical and charting functions
Commands:
eval
fieldformat
where
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
For information about Boolean operators, such as AND and OR, see Boolean operators.
case(X,"Y",...)
Description
Accepts alternating conditions and values. Returns the first value for which the condition evaluates to TRUE.
This function takes pairs of arguments X and Y. The X arguments are Boolean expressions that are evaluated from first to
last. When the first X expression is encountered that evaluates to TRUE, the corresponding Y argument is returned. The
function defaults to NULL if none are true.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
This example uses the sample data from the Search Tutorial, but should work with any format of Apache Web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
The following example returns descriptions for the corresponding http status code.
sourcetype=access_* | eval description=case(status == 200, "OK", status == 404, "Not found", status == 500,
"Internal Server Error") | table status description
The results appear on the Statistics tab and look something like this:
status description
200 OK
200 OK
408
200 OK
404 Not found
200 OK
406
500 Internal Server Error
200 OK
For an example of how to display a default value when that status does not match one of the values specified, see the
True function.
Extended example
This example shows you how to use the case function in two different ways, to create categories and to create a custom
sort order.
This example uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file
that contains magnitude (mag), coordinates (latitude, longitude), region (place), and so forth, for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance if
you want to follow along with this example.
You want to classify earthquakes based on depth. Shallow-focus earthquakes occur at depths less than 70 km. Mid-focus
earthquakes occur at depths between 70 and 300 km. Deep-focus earthquakes occur at depths greater than 300 km.
We'll use Low, Mid, and Deep for the category names.
The eval command is used to create a field called Description, which takes the value of "Low", "Mid", or "Deep" based
on the Depth of the earthquake. The case() function is used to specify which ranges of depth fit each description. For
example, if the depth is less than 70 km, the earthquake is characterized as a shallow-focus quake; and the resulting
Description is Low.
The search also pipes the results of the eval command into the stats command to count the number of earthquakes and
display the minimum and maximum magnitudes for each Description.
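A sketch of such a search, assuming the earthquake data is searched with source=usgs (the source name is an assumption; the field names follow the description above):
source=usgs | eval Description=case(depth<=70, "Low", depth>70 AND depth<=300, "Mid", depth>300, "Deep") | stats count min(mag) max(mag) BY Description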
The results appear on the Statistics tab and look something like this:
You can also use the case function to sort the results in a custom order, such as Low, Mid, Deep. You create the custom
sort order by giving the values a numerical ranking and then sorting based on that ranking.
The results appear on the Statistics tab and look something like this:
cidrmatch(X,Y)
Description
Use this function to determine if an IP address belongs to a particular subnet. This function returns true when IP address
Y belongs to a particular subnet X. Both X and Y are string arguments. X is the CIDR subnet. Y is the IP address to match
with the subnet.
The cidrmatch function supports IPv4 and IPv6 addresses and subnets that use CIDR notation.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example uses the cidrmatch and if functions to set a field, isLocal, to "local" if the field ip matches the
subnet. If the ip field does not match the subnet, the isLocal field is set to "not local".
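A sketch of that search (the subnet value is illustrative):
... | eval isLocal = if(cidrmatch("123.132.32.0/25", ip), "local", "not local")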
The following example uses the cidrmatch function as a filter to remove events that do not match the ip address:
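A sketch of such a filter (the subnet value is illustrative):
... | where cidrmatch("192.0.2.0/24", ip)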
You can use the cidrmatch function to identify CIDR IP addresses by subnet. The following example uses cidrmatch with
the eval command to compare an IPv4 address with a subnet that uses CIDR notation to determine whether the IP
address is a member of the subnet. If there is a match, the search returns true in a new field called result.
The IP address is not in the subnet, so search displays false in the result field. The search results look something like
this.
The IP address is located within the subnet, so it is displayed in the search results, which look like this.
time ip
2020-11-19 16:43:31 192.0.2.56
It is worth noting that you can get the same results when using the search command, as shown in this example.
time ip
2020-11-19 16:43:31 192.0.2.56
The following example uses cidrmatch with the eval command to compare an IPv6 address with a subnet that uses CIDR
notation to determine whether the IP address is a member of the subnet. If there is a match, search returns true in a new
field called result.
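A sketch of that comparison (the subnet value is an assumption chosen to contain the address shown in the results below):
| makeresults | eval ip="2001:0db8:ffff:ffff:ffff:ffff:ffff:ff99" | eval result = if(cidrmatch("2001:0db8:ffff:ffff:ffff:ffff:ffff:ff00/120", ip), "true", "false")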
The IP address is located within the subnet, so search displays true in the result field. The search results look something
like this.
time ip
2020-11-19 16:43:31 2001:0db8:ffff:ffff:ffff:ffff:ffff:ff99
See also
Commands
iplocation
lookup
search
coalesce(X,...)
Description
This function takes an arbitrary number of arguments and returns the first value that is not NULL.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
You have a set of events where the IP address is extracted to either clientip or ipaddress. This example defines a new
field called ip, that takes the value of either the clientip field or ipaddress field, depending on which field is not NULL
(that is, which field exists in that event). If both the clientip and ipaddress fields exist in the event, this function returns the first
argument, the clientip field.
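A sketch of that definition, using the field names from the description:
... | eval ip=coalesce(clientip, ipaddress)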
false()
Description
This function enables you to specify a conditional that is obviously false, for example 1==0. You do not specify a field with
this function.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
if(X,Y,Z)
Description
This function takes three arguments. The first argument X must be a Boolean expression. If X evaluates to TRUE, the
result is the second argument Y. If X evaluates to FALSE, the result evaluates to the third argument Z.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example looks at the values of the field error. If error=200, the function returns err=OK. Otherwise the
function returns err=Error.
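A sketch of that evaluation, using the field names from the description:
... | eval err=if(error == 200, "OK", "Error")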
The following example uses the cidrmatch and if functions to set a field, isLocal, to "local" if the field ip matches the
subnet. If the ip field does not match the subnet, the isLocal field is set to "not local".
in(FIELD, VALUE-LIST)
Description
The function returns TRUE if one of the values in the list matches a value in the field you specify.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions with other
commands.
The eval command cannot accept a Boolean value. You must specify the IN function inside the IF function, which can
accept a Boolean value as input.
The string values must be enclosed in quotation marks. You cannot specify wildcard characters with the values to specify
a group of similar values, such as HTTP error codes or CIDR IP address ranges. Use the IN operator instead.
The IN operator is similar to the in function. You can use the IN operator with the search and tstats commands. You can
use wildcard characters in the VALUE-LIST with these commands.
Basic examples
The following example uses the where command to return in=TRUE if one of the values in the status field matches one of
the values in the list.
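A sketch of such a search (the listed status values are illustrative):
... | where in(status, "400", "401", "403", "404")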
The following example uses the in function as the first parameter for the if function. The evaluation expression returns
TRUE if the value in the status field matches one of the values in the list.
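A sketch of that evaluation (the status values are illustrative):
... | eval error=if(in(status, "404", "500", "503"), "true", "false")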
Extended example
The following example combines the in function with the if function to evaluate the status field. The value of true is
placed in the new field error if the status field contains one of the values 404, 500, or 503. Then a count is performed of
the values in the error field.
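A sketch of the full search described above (the sourcetype and status values are assumptions):
sourcetype=access_* | eval error=if(in(status, "404", "500", "503"), "true", "false") | stats count BY error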
See also
Blogs
Smooth operator | Searching for multiple field values
like(TEXT, PATTERN)
Description
This function takes two arguments, a string to match TEXT and a string expression to match PATTERN. It returns TRUE
if, and only if, TEXT matches PATTERN. The pattern matching supports an exact text match, as well as single and
multiple character matches.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns like=TRUE if the field value starts with foo:
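A sketch of that evaluation (the field and result names are generic placeholders):
... | eval is_foo=if(like(field, "foo%"), "TRUE", "FALSE")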
The following example uses the where command to return like=TRUE if the ipaddress field starts with the value 198.. The
percent ( % ) symbol is a wildcard with the like function:
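A sketch of that filter:
... | where like(ipaddress, "198.%")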
lookup(<lookup_table>, <json_object>, <json_array>)
Description
This function performs a CSV lookup. It returns the output field or fields in the form of a JSON object.
Syntax
Usage
You can use the lookup() function with the eval, fieldformat, and where commands, and as part of eval expressions.
The lookup() function takes an <input_field> from a CSV <lookup_table>, finds events in the search result that have
the <match_field>, and then identifies other field-value pairs from the CSV table that correspond to the input_field
and adds them to the matched events in the form of a JSON object.
The lookup() function requires a <lookup_table>. You can provide either a CSV lookup file or a CSV lookup definition, enclosed
within quotation marks. To provide a file, give the full filename of a CSV lookup file that is stored in the global lookups
directory ($SPLUNK_HOME/etc/system/lookups/) or in a lookup directory that matches your current app context, such as
$SPLUNK_HOME/etc/users/<user>/<app>/lookups/.
If the first quoted string does not end in ".csv", the eval processor assumes it is the name of a CSV lookup definition.
Specified CSV lookup definitions must be shared globally. CSV lookup definitions cannot be private or shared to a specific
app.
Specify a lookup definition if you want the various settings associated with the definition to apply, such as limits on
matches, case-sensitive match options, and so on.
A lookup() function can use multiple <input_field>/<match_field> pairs to identify events, and multiple <output_field>
values can be applied to those events. Here is an example of valid lookup() syntax with multiple inputs, matches, and
outputs.
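A hypothetical sketch of that syntax (the lookup file and field names are invented for illustration):
... | eval result=lookup("geo_info.csv", json_object("city", city, "state", state), json_array("latitude", "longitude"))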
For more information about uploading CSV lookup files and creating CSV lookup definitions, see Define a CSV lookup in
Splunk Web in the Knowledge Manager Manual.
The lookup() function uses two JSON functions for eval: json_object and json_array. JSON functions allow the eval
processor to efficiently group things together. For more information, see JSON functions in the Search Reference.
Examples
This simple makeresults example returns an array that illustrates what status_description values are paired in
the http_status.csv lookup table with a status_type of Successful.
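A sketch of that search, modeled on the second example below (not the verbatim original search):
| makeresults | eval type="Successful" | eval output=lookup("http_status.csv", json_object("status_type", type), json_array("status_description"))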
This search employs multiple input and match field pairs to show that an event with type="Successful" and
status="200" matches a status_description of OK in the http_status.csv lookup table.
| makeresults | eval type = "Successful", status="200" | eval output=lookup("http_status.csv",
json_object("status_type", type, "status", status), json_array("status_description"))
This example matches values of a status field in a http_status.csv lookup file with values of status fields in the
returned events. It then generates JSON objects as values of a status_details field, with the corresponding
status_description and status_type field-value pairs, and adds them to the events. Finally, it provides counts of
the JSON objects, broken out by object.
Get counts of the HTTP status description values that have been applied to your events by an HTTP status eval lookup.
This example shows how you can nest a lookup function inside another eval function. In this case it is the
json_extract JSON function. This extracts status_description field-value pairs from the json_array objects and
applies them to corresponding events. The search then returns a count of events with status_description fields,
broken out by status_description value.
Here is an example of an extracted status_description value returned by this search. Compare it to the result
returned by the third example: status_details=Created
match(SUBJECT, "REGEX")
Description
This function returns TRUE or FALSE based on whether REGEX matches SUBJECT.
This function compares the regex string REGEX to the value of SUBJECT and returns a Boolean value. It returns TRUE if
the REGEX can find a match against any substring of SUBJECT.
Usage
The match function is regex based. For example, use the backslash ( \ ) character to escape a special character, such as a
quotation mark. Use the pipe ( | ) character to specify an OR condition.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns TRUE if, and only if, field matches the basic pattern of an IP address. This examples uses
the caret ( ^ ) character and the dollar ( $ ) symbol to perform a full match.
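A sketch of that comparison, reusing the IP-address pattern shown in the eval-function examples earlier in this manual:
... | eval n=if(match(field, "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"), 1, 0)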
The following example uses the match function in an <eval-expression>. The SUBJECT is a calculated field called test.
The "REGEX" is the string yes.
If the value is stored with quotation marks, you must use the backslash ( \ ) character to escape the embedded quotation
marks. For example:
null()
Description
This function takes no arguments and returns NULL. The evaluation engine uses NULL to represent "no value". Setting a
field value to NULL clears the field value.
Usage
NULL values are field values that are missing in some results but present in other results.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
Suppose you want to calculate the average of the values in a field, but several of the values are zero. If the zeros are
placeholders for no value, the zeros will interfere with creating an accurate average. You can use the null function to
remove the zeros.
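A sketch of that approach (the field name size is an assumption):
... | eval size=if(size == 0, null(), size) | stats avg(size)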
See also
• You can use the fillnull command to replace NULL values with a specified value.
• You can use the nullif(X,Y) function to compare two fields and return NULL if X = Y.
nullif(X,Y)
Description
This function is used to compare fields. The function takes two arguments, X and Y, and returns NULL if X = Y. Otherwise
it returns X.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns NULL if fieldA=fieldB. Otherwise the function returns fieldA.
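A sketch of that comparison:
... | eval n=nullif(fieldA, fieldB)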
searchmatch(X)
Description
Use this function to return TRUE if the search string (X) matches the event.
This function takes one argument X, which is a search string. The function returns TRUE if, and only if, the event matches
the search string.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example uses the makeresults command to create some simple results. The searchmatch function is used
to determine if any of the results match the search string "x=hi y=*".
| makeresults 1 | eval _raw = "x=hi y=bye" | eval x="hi" | eval y="bye" | eval test=if(searchmatch("x=hi
y=*"), "yes", "no") | table _raw test x y
The result of the if function is yes; the results match the search string specified with the searchmatch function.
true()
Description
This function enables you to specify a condition that is obviously true, for example 1==1. You do not specify a field with
this function.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
This example uses the sample data from the Search Tutorial, but should work with any format of Apache Web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
The following example shows how to use the true() function to provide a default value to the case function. If the values
in the status field are not 200, or 404, the value used is Other.
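A sketch of that search, combining the case default shown earlier with the table output used in the previous example:
sourcetype=access_* | eval description=case(status == 200, "OK", status == 404, "Not found", true(), "Other") | table status description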
The results appear on the Statistics tab and look something like this:
status description
200 OK
200 OK
408 Other
200 OK
404 Not found
200 OK
200 OK
406 Other
200 OK
validate(X,Y,...)
Description
Use this function to return the string Y corresponding to the first expression X that evaluates to FALSE. This function is
the opposite of the case function.
This function takes pairs of arguments, Boolean expressions X and strings Y. The function returns the string Y
corresponding to the first expression X that evaluates to FALSE. This function defaults to NULL if all evaluate to TRUE.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
... | eval n=validate(isint(port), "ERROR: Port is not an integer", port >= 1 AND port <= 65535, "ERROR:
Port is out of range")
Conversion functions
The following list contains the functions that you can use to convert numbers to strings and strings to numbers.
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
printf("format",arguments)
Description
The printf function builds a string value, based on the string format and the arguments that you specify.
• You can specify zero or more arguments. The arguments can be string values, numbers, computations, or fields.
The SPL printf function is similar to the C sprintf() function and similar functions in other languages such as Python,
Perl, and Ruby.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
format
Description: The format is a character string that can include one or more format conversion specifiers. Each
conversion specifier can include optional components such as flag characters, width specifications, and precision
specifications. The format must be enclosed in quotation marks.
Syntax: "(%[flags][width][.precision]<conversion_specifier>)..."
arguments
Description: The arguments are optional and can include the width, precision, and the value to format. The value
can be a string, number, or field name.
Syntax: [width][.precision][value]
Conversion specifiers
%a or %A: Floating point number in hexadecimal format. The example for this specifier returns the value of pi to 3 decimal points, in hexadecimal format.
%d (alias %i): Signed decimal integer. The example for this specifier returns the positive or negative integer values, including any signs specified with those values.
%s (alias %z): String.
%x or %X (alias %p): Unsigned hexadecimal number (lowercase or uppercase). The example for this specifier returns the hexadecimal values that are equivalent to the numbers in the arguments, showing both uppercase and lowercase results.
%%: Percent sign.
Flag characters
single quote or apostrophe ( ' ): Adds commas as the thousands separator. For example, printf("%'d",12345) returns 12,345.
dash or minus ( - ): Left justify. If this flag is not specified, the result keeps its default justification. The printf function supports right justification of results only when it formats that way by default. For example, printf("%-4d",1) returns 1, which is left justified in the output.
zero ( 0 ): Zero pad. For example, printf("%04d",1) returns the value in the argument with leading zeros such that the number has 4 digits: 0001.
plus ( + ): Always include the sign ( + or - ). If this flag is not specified, the conversion displays a sign only for negative values. For example, printf("%+4d",1) returns +1.
<space>: Reserve space for the sign. If the first character of a signed conversion is not a sign or if a signed conversion results in no characters, a <space> is added as a prefix to the result. If both the <space> and + flags are specified, the <space> flag is ignored. For example, printf("% -4d",1) returns 1.
hash, number, or pound ( # ): Use an alternate form. For the %o conversion specifier, the # flag increases the precision to force the first digit of the result to be zero. For %x or %X conversion specifiers, a non-zero result has 0x (or 0X) prefixed to it. For %a, %A, %e, %E, %f, %F, %g, and %G conversion specifiers, the result always contains a radix character, even if no digits follow the radix character. Without this flag, a radix character appears in the result of these conversions only if a digit follows it. For %g and %G conversion specifiers, trailing zeros are not removed from the result as they normally are. For other conversion specifiers, the behavior is undefined. For example, printf("%#x", 1) returns 0x1.
Specifying field width
You can use an asterisk ( * ) with the printf function to return the field width or precision from an argument.
Examples
The following example returns the positive or negative integer values, including any signs specified with those values.
The following example returns the floating point number with 1 decimal point.
The following example returns the value of pi() in exponential format with 2 decimal points.
The field width can be expressed using a number or an argument denoted with an asterisk ( * ) character.
number: The minimum number of characters to print. If the value to print is shorter than this number, the result is padded with blank spaces. The value is not truncated even if the result is larger.
* (asterisk): The width is not specified in the format string, but as an additional integer value argument preceding the argument that has to be formatted.
Specifying precision
%d, %i, %o, %u, %x or %X: Precision specifies the minimum number of digits to be returned. If the value to be returned is shorter than this number, the result is padded with leading zeros. The value is not truncated even if the result is longer. A precision of 0 means that no character is returned for the value 0.
%a or %A, %e or %E, %f or %F: This is the number of digits to be returned after the decimal point. The default is 6.
%s: This is the maximum number of characters to be returned. By default all characters are printed until the ending null character is encountered.
Specifying an asterisk ( * ) for the precision value, for example .*: The precision is not specified in the format string, but as an additional integer value argument preceding the argument that has to be formatted.
Unsupported conversion specifiers
There are a few conversion specifiers from the C sprintf() function that are not supported, including:
Basic examples
This example creates a new field called new_field and creates string values based on the values in field_one and
field_two. The values are formatted with 4 digits before the decimal and 4 digits after the decimal. The - specifies to left
justify the string values. The 30 specifies the width of the field.
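A sketch of that eval (the exact format string is an approximation of the description above, not the verbatim original):
... | eval new_field=printf("%4.4f %-30s", field_one, field_two)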
tonumber(NUMSTR,BASE)
Description
This function converts the input string NUMSTR to a number. NUMSTR can be a field name or a value.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
BASE is optional and used to define the base of the number in NUMSTR. BASE can be 2 to 36. The default is 10 to
correspond to the decimal system.
If the tonumber function cannot parse a field value to a number, for example if the value contains a leading and trailing
space, the function returns NULL. Use the trim function to remove leading or trailing spaces.
If the tonumber function cannot parse a literal string to a number, it returns an error.
Basic examples
The following example converts the string values for the store_sales field to numbers.
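A sketch of that conversion:
... | eval n=tonumber(store_sales)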
The following example takes the hexadecimal number and uses a BASE of 16 to return the number "164".
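A sketch of that conversion (the hexadecimal literal is an assumption; 0xA4 equals 164):
... | eval n=tonumber("0A4", 16)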
The following example trims any leading or trailing spaces from the values in the celsius field before converting it to a
number.
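A sketch of that conversion, nesting the trim function as described:
... | eval temperature=tonumber(trim(celsius))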
tostring(X,Y)
Description
This function converts the input value to a string. If the input value is a number, it reformats it as a string. If the input value
is a Boolean value, it returns the corresponding string value, "True" or "False".
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
When used with the eval command, the values might not sort as expected because the values are converted to ASCII.
Use the fieldformat command with the tostring function to format the displayed values. The underlying values are not
changed with the fieldformat command.
If X is a number, the second argument Y is optional and can be "hex", "commas", or "duration".
Examples Description
tostring(X,"hex") Converts X to hexadecimal.
tostring(X,"commas") Formats X with commas. If the number includes decimals, the function rounds to nearest two decimal places.
Basic examples
... | eval n=tostring(1==1) + " " + tostring(15, "hex") + " " + tostring(12345.6789, "commas")
The following example returns foo=615 and foo2=00:10:15. The 615 seconds is converted into minutes and seconds.
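A sketch of such an eval, using the "duration" option:
... | eval foo=615 | eval foo2=tostring(foo, "duration")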
The following example formats the column totalSales to display values with a currency symbol and commas. You must
use a period between the currency value and the tostring function.
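A sketch of that formatting (the dollar sign is an assumed currency symbol):
... | fieldformat totalSales="$" . tostring(totalSales, "commas")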
See also
Commands
convert
Functions
strptime
Cryptographic functions
The following list contains the functions that you can use to compute the secure hash of string values.
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
md5(X)
Description
This function computes and returns the MD5 hash of a string value X.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns a new field n with a message-digest (MD5) 128-bit hash value for the phrase "Hello World".
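A sketch of that eval:
... | eval n=md5("Hello World")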
| makeresults count=32768 | eval message=md5("". random()) | stats values(message) as message | eval message
= mvjoin(message, "")
sha1(X)
Description
This function computes and returns the secure hash of a string value X based on the FIPS compliant SHA-1 hash
function.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
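The following example returns the SHA-1 hash of the phrase "Hello World" in a new field n:
... | eval n=sha1("Hello World")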
sha256(X)
Description
This function computes and returns the secure hash of a string value X based on the FIPS compliant SHA-256 (SHA-2
family) hash function.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
sha512(X)
Description
This function computes and returns the secure hash of a string value X based on the FIPS compliant SHA-512 (SHA-2
family) hash function.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
Date and Time functions
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
In addition to the functions listed in this topic, there are also variables and modifiers that you can use in searches.
now()
Description
This function takes no arguments and returns the time that the search was started.
Usage
The now() function is often used with other date and time functions.
The time returned by the now() function is represented in UNIX time, or in seconds since Epoch time.
When used in a search, this function returns the UNIX time when the search is run. If you want to return the UNIX time
when each result is returned, use the time() function instead.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example determines the UNIX time value of the start of yesterday, based on the value of now().
Extended example
If you are looking for events that occurred within the last 30 minutes you need to calculate the event hour, event minute,
the current hour, and the current minute. You use the now() function to calculate the current hour (curHour) and current
minute (curMin). The event timestamp, in the _time field, is used to calculate the event hour (eventHour) and event minute
(eventMin). For example:
... | eval eventHour=strftime(_time,"%H"), eventMin=strftime(_time,"%M"), curHour=strftime(now(),"%H"), curMin=strftime(now(),"%M") | where (eventHour=curHour and eventMin > curMin - 30) or (curMin < 30 and eventHour=curHour-1 and eventMin > curMin + 30) | bucket _time span=1d | chart count by _time
relative_time(X,Y)
Description
This function takes a UNIX time, X, as the first argument and a relative time specifier, Y, as the second argument and
returns the UNIX time value of Y applied to X.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example determines the UNIX time value of the start of yesterday, based on the value of now().
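For example (the output field name is illustrative):
... | eval yesterday=relative_time(now(), "-1d@d")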
strftime(X,Y)
Description
This function takes a UNIX time value, X, as the first argument and renders the time as a string using the format specified
by Y. The UNIX time must be in seconds. Use the first 10 digits of a UNIX time to use the time in seconds.
Usage
If the time is in milliseconds, microseconds, or nanoseconds you must convert the time into seconds. You can use the pow
function to convert the number.
The following search uses the pow function to convert from nanoseconds to seconds:
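... | eval time_in_seconds=nano_time/pow(10,9) | eval readable_time=strftime(time_in_seconds, "%Y-%m-%dT%H:%M:%S.%Q")
In this sketch, nano_time stands for whatever field holds the nanosecond timestamp.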
The results appear on the Statistics tab and look like this:
For a list and descriptions of format options, see Common time format variables.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns the hour and minute from the _time field.
If the _time field value is 2018-08-10 11:48:23, the value returned in the hour_min field is 11:48.
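For example:
... | eval hour_min=strftime(_time, "%H:%M")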
The following example creates a new field called starttime in your search results. For the strftime values, the now() function is used to generate the current UNIX time, and date and time variables are used to specify the ISO 8601 timestamp format:
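... | eval starttime=strftime(now(), "%Y-%m-%dT%H:%M:%S.%Q")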
starttime
2021-02-11T01:55:00.000
For more information about date and time variables, see Date and time format variables.
Extended example
The following example creates a single result using the makeresults command.
| makeresults
For example:
_time
2018-08-14 14:00:15
The _time field is stored in UNIX time, even though it displays in a human readable format. To convert the UNIX time to
some other format, you use the strftime function with the date and time format variables. The variables must be in
quotation marks.
For example, to return the week of the year that an event occurred in, use the %V variable.
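... | eval week=strftime(_time, "%V")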
_time week
2018-08-14 14:00:15 33
To return the date and time with subseconds and the time designator (the letter T) that precedes the time components of
the format, use the %Y-%m-%dT%H:%M:%S.%Q variables. For example:
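... | eval mytime=strftime(_time, "%Y-%m-%dT%H:%M:%S.%Q")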
_time mytime
2018-08-14 14:00:15 2018-08-14T14:00:15.000
strptime(X,Y)
Description
This function takes a time represented by a string X and parses the time into a UNIX timestamp. You use date and time
variables to specify the format Y that matches string X.
For example, if string X is 2018-08-13 11:22:33, the format Y must be %Y-%m-%d %H:%M:%S . The string X date must be
January 1, 1971 or later.
The _time field is in UNIX time. In Splunk Web, the _time field appears in a human readable format in the UI but is
stored in UNIX time. If you attempt to use the strptime function on the _time field, no action is performed on the values
in the field.
Usage
With the strptime function, you must specify the time format of the string X so that the function can convert the string time
into the correct UNIX time. For example, the string time 2018-07-23T17:19:01.89 matches the format %Y-%m-%dT%H:%M:%S.%N.
For a list and descriptions of format options, see Common time format variables.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
If the values in the timeStr field are hours and minutes, such as 11:59, the following example returns the time as a
timestamp:
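... | eval starttime=strptime(timeStr, "%H:%M")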
Extended example
This example shows the results of using the strptime function. The following search does several things:
• The gentimes command generates a set of times with 6 hour intervals. This command returns four fields:
starttime, starthuman, endtime, and endhuman.
• The fields command returns only the starthuman and endhuman fields.
• The eval command takes the string time values in the starthuman field and returns the UNIX time that
corresponds to the string time values.
| gentimes start=8/13/18 increment=6h | fields starthuman endhuman | eval
startunix=strptime(starthuman,"%a %B %d %H:%M:%S.%N %Y")
The results appear on the Statistics tab and look something like this:
time()
Description
This function returns the wall-clock time, in the UNIX time format, with microsecond resolution.
Usage
The value of the time() function will be different for each event, based on when that event was processed by the eval
command.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
This example shows the results of using the time() function. The following search does several things:
• The gentimes command generates a set of times with 6 hour intervals. This command returns four fields:
starttime, starthuman, endtime, and endhuman.
• The fields command returns only the starttime and starthuman fields.
• The first eval command takes the numbers in the starttime field and returns them with microseconds included.
• The second eval command creates the testtime field and returns the UNIX time at the instant the result was
processed by the eval command.
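A sketch of a search along those lines, where the printf expression used to format epoch_time is one possible choice:
| gentimes start=8/13/18 increment=6h | fields starttime starthuman | eval epoch_time=printf("%.6f", starttime) | eval testtime=time()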
The results appear on the Statistics tab and look something like this:
starttime starthuman epoch_time testtime
1534165200 Mon Aug 13 06:00:00 2018 1534165200.000000 1534376565.299300
Informational functions
The following list contains the functions that you can use to return information about a value.
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
isbool(X)
Description
This function takes one argument X and evaluates whether X is a Boolean data type. The function returns TRUE if X is
Boolean.
Usage
Use this function with other functions that return Boolean data types, such as cidrmatch and mvfind.
This function cannot be used to determine if field values are "true" or "false" because field values are either string or
number data types. Instead, use syntax such as <fieldname>=true OR <fieldname>=false to determine field values.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
isint(X)
Description
This function takes one argument X and returns TRUE if X is an integer.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example uses the isint function with the if function. A field, "n", is added to each result with a value of "int" or "not int", depending on the result of the isint function. If the value of "field" is an integer, the isint function returns TRUE and the value "int" is added to the "n" field.
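For example:
... | eval n=if(isint(field), "int", "not int")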
The following example shows how to use the isint function with the where command.
isnotnull(X)
Description
This function takes one argument X and returns TRUE if X is not NULL.
Usage
This function is useful for checking whether or not a field (X) contains a value.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example uses the isnotnull function with the if function. A field, "n", is added to each result with a value of "yes" or "no", depending on the result of the isnotnull function. If "field" contains a value, the isnotnull function returns TRUE and the value "yes" is added to the "n" field.
The following example shows how to use the isnotnull function with the where command.
isnull(X)
Description
This function takes one argument X and returns TRUE if X is NULL.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example uses the isnull function with the if function. A field, "n", is added to each result with a value of
"yes" or "no", depending on the result of the isnull function. If there is no value for "field" in a result, the isnull function
returns TRUE and adds the value "yes" to the "n" field.
The following example shows how to use the isnull function with the where command.
isnum(X)
Description
This function takes one argument X and returns TRUE if X is a number.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example uses the isnum function with the if function. A field, "n", is added to each result with a value of "yes" or "no", depending on the result of the isnum function. If the value of "field" is a number, the isnum function returns TRUE and the value "yes" is added to the "n" field.
The following example shows how to use the isnum function with the where command.
isstr(X)
Description
This function takes one argument X and returns TRUE if X is a string.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example uses the isstr function with the if function. A field, "n", is added to each result with a value of "yes" or "no", depending on the result of the isstr function. If the value of "field" is a string, the isstr function returns TRUE and the value "yes" is added to the "n" field.
The following example shows how to use the isstr function with the where command.
typeof(X)
Description
This function takes one argument and returns the data type of the argument.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example takes one argument and returns a string representation of its type. This example returns
"NumberStringBoolInvalid"
The following example creates a single result using the makeresults command.
| makeresults
For example:
_time
2018-08-14 14:00:15
To determine the data type of the _time field, use the eval command with the typeof function. For example:
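... | eval t=typeof(_time)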
_time t
2018-08-14 14:00:15 Number
JSON functions
The following table describes the functions that are available for you to use to create or manipulate JSON objects:
JSON function Description
json_extend Extend the contents of a valid JSON object with the values of an array.
json_extract Return either a JSON array or a Splunk software native type value from a field and zero or more paths.
json_extract_exact Return Splunk software native type values from a piece of JSON by matching literal strings in the event and extracting them as keys.
json_keys Return the keys from the key-value pairs in a JSON object. The keys are returned as a JSON array.
json_set Insert or overwrite values for a JSON node with the values provided and return an updated JSON object.
json_set_exact Generate or overwrite a JSON object using the key-value pairs specified.
json_valid Evaluate whether a JSON object uses valid JSON syntax and returns either TRUE or FALSE.
json_object(<members>)
Usage
If you specify a string for a <key> or <value>, you must enclose the string in double quotation marks. A <key> must be a
string. A <value> can be a string, number, Boolean, null, multivalue field, array, or another JSON object.
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Examples
These examples show different ways to use the json_object function to create JSON objects in your events.
The following example creates a multivalue field called firstnames that uses the key name and contains the values
"maria" and "arun". The JSON object created is { "name": ["maria", "arun"] } .
The following example creates a JSON object that uses a JSON array for the values.
The result is the JSON object { "cities": ["London", "Sydney", "Berlin", "Santiago"] }.
The following example creates a nested JSON object that uses other JSON objects and a multivalue or JSON
array field called gamelist.
{
"games": {
"category": {
"boardgames": {
"cooperative": [ "Pandemic", "Forbidden Island", "Castle Panic" ]
}
}
}
}
json_append(<json>, <path_value_pairs>)
This function appends values to the ends of indicated arrays within a JSON document. This function provides a JSON
eval function equivalent to the multivalue mvappend function.
Usage
The json_append function always has at least three function inputs: <json> (the name of a valid JSON document such as
a JSON object), and at least one <path> and <value> pair.
If <json> does not reference a valid JSON document, such as a JSON object, the function outputs nothing.
The json_append function evaluates <path_value_pairs> from left to right. When a path-value pair is evaluated, the
function updates the <json> document. The function then evaluates the next path-value pair against the updated
document.
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Each <path> designates an array or value within the <json> document. The json_append function adds the corresponding
<value> to the end of the value designated by the <path>. The following table explains what json_append does depending
on what the <path> specifies.
If <path> specifies... ...This is what json_append does with the corresponding <value>
An array with one or more values. json_append adds the corresponding <value> to the end of that array.
An empty array json_append adds the corresponding <value> to that array, creating an array with a single value.
A scalar or object value json_append autowraps the scalar or object value within an array and adds the corresponding <value> to the end of that array.
The json_append function ignores path-value pairs for which the <path> does not identify any valid value in the JSON
document.
When the new <value> is an array, json_append appends the array as a single element. For example, if a json_array
<path> leads to the array ["a", "b", "c"] and its <value> is the array ["d", "e", "f"], the result is ["a", "b", "c",
["d", "e", "f"]].
Appending arrays as single elements separates json_append from json_extend, a similar function that flattens arrays and
objects into separate elements as it appends them. When json_extend takes the example in the preceding paragraph, it
returns ["a", "b", "c", "d", "e", "f"].
Examples
The following examples show how you can use json_append to append values to arrays within a JSON document.
Say you have an object named ponies that contains an array named ponylist: ["Minty", "Rarity", "Buttercup"]. This
is the search you would run to append "Fluttershy" to ponylist.
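A search along these lines does that (the name of the output field is illustrative):
... | eval ponylist_updated = json_append(ponies, "ponylist", "Fluttershy")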
The output of that eval statement is {"ponylist": ["Minty", "Rarity", "Buttercup", "Fluttershy"]}.
This example has a <path> with the value Fluttershy.ponySkills. Fluttershy.ponySkills references an array of an
object that is nested within ponyDetails, the source object. The query uses json_append to add a string to the nested
object array.
json_array(<values>)
Usage
A <value> can be any kind of value such as string, number, or Boolean. You can also use the json_object function to
specify values.
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Examples
These examples show different ways to use the json_array function to create JSON arrays in your events.
The following example uses a string dubois and the json_object function for the array values.
json_array_to_mv(<json_array>, <Boolean>)
This function maps the elements of a proper JSON array into a multivalue field.
Usage
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
If the <json array> input to the function is not a valid JSON array, the function outputs nothing.
Use the <Boolean> input to specify that the json_array_to_mv function should preserve bracketing quotes on
JSON-formatted strings. The <Boolean> input defaults to false().
Syntax Description
json_array_to_mv(<json_array>, false()) or json_array_to_mv(<json_array>) By default (or when you explicitly set it to false()), the json_array_to_mv function removes bracketing quotes from JSON string data types when it converts an array into a multivalue field.
json_array_to_mv(<json_array>, true()) When set to true(), the json_array_to_mv function preserves bracketing quotes on JSON string data types when it converts an array into a multivalue field.
Example
This example demonstrates usage of the json_array_to_mv function to create multivalue fields out of JSON data.
The following example creates a simple array: ["Buttercup", "Fluttershy", "Rarity"]. Then it maps that array
into a multivalue field named my_little_ponies with the values Buttercup, Fluttershy, and Rarity. The function
removes the quote characters when it converts the array elements into field values.
... | eval ponies = json_array("Buttercup", "Fluttershy", "Rarity"), my_sweet_ponies =
json_array_to_mv(ponies)
If you change this search so it has my_sweet_ponies = json_array_to_mv(ponies,true()), you get an array with the
values "Buttercup", "Fluttershy", and "Rarity". Setting the function to true causes the function to preserve the quote
characters when it converts the array elements into field values.
json_extend(<json>, <path_value_pairs>)
Use json_extend when you want to append multiple values at once to an array. json_extend flattens arrays into their
component values and appends those values to the ends of indicated arrays within a valid JSON document.
Usage
The json_extend function always has at least three function inputs: <json> (the name of a valid JSON document such as
a JSON object), and at least one <path> and <value> pair. The <value> must be an array. When given valid inputs,
json_extend always outputs an array.
If <json> does not reference a valid JSON document, such as a JSON object, the function outputs nothing.
json_extend evaluates <path_value_pairs> from left to right. When json_extend evaluates a path-value pair, it updates
the <json> document. json_extend then evaluates the next path-value pair against the updated document.
You can use json_extend with the eval and where commands, and as part of evaluation expressions with other
commands.
Each <path> designates an array or value within the <json> document. The json_extend function adds the values of the
corresponding <array> after the last value of the array designated by the <path>. The following table explains what
json_extend does depending on what the <path> specifies.
If <path> specifies... ...This is what json_extend does with the corresponding array values
An array with one or more values. json_extend adds the corresponding array values to the end of that array.
An empty array json_extend adds the corresponding array values to that array.
A scalar or object value json_extend autowraps the scalar or object value within an array and adds the corresponding array values to the end of that array.
json_extend ignores path-value pairs for which the <path> does not identify any valid value in the JSON document.
The json_extend function flattens arrays as it appends them to the specified value. "Flattening" refers to the act of
breaking the array down into its component values. For example, if a json_extend <path> leads to the array ["a", "b",
"c"] and its <value> is the array ["d", "e", "f"], the result is ["a", "b", "c", "d", "e", "f"].
Appending arrays as individual values separates json_extend from json_append, a similar function that appends the
<value> as a single element. When json_append takes the example in the preceding paragraph, it returns ["a", "b",
"c", ["d", "e", "f"]].
Examples
The following examples show how you can use json_extend to append multiple values at once to arrays within a JSON
document.
You start with an object named fakeBandsInMovies that contains an array named fakeMovieBandList: ["The Blues
Brothers", "Spinal Tap", "Wyld Stallyns"]. This is the search you would run to extend that list with three more names
of fake bands from movies.
The output of this eval statement is {"fakeMovieBandList": ["The Blues Brothers", "Spinal Tap", "Wyld Stallyns",
"The Soggy Bottom Boys", "The Weird Sisters", "The Barden Bellas"]}
This example has an object named dndChars that contains an array named characterClasses. You want to update this
array with an object from a secondary array. Here is a search you could run to achieve that goal.
The output of this eval statement is {updatedParty = ["wizard", "rogue", "barbarian", {"artifact":"deck of many
things"}]}. Note that when json_extend flattens array2, it removes the object from the array. Otherwise the output would
be {updatedParty = ["wizard", "rogue", "barbarian", [{"artifact":"deck of many things"}]]}.
json_extract(<json>, <paths>)
This function returns a value from a piece of JSON and zero or more paths. The value is returned in either a JSON array,
or a Splunk software native type value.
If a JSON object contains a value with a special character, such as a period, json_extract can't access it. Use the
json_extract_exact function for those situations.
See json_extract_exact.
Usage
What is converted or extracted depends on whether you specify a piece of JSON, or JSON and one or more paths.
Syntax Description
json_extract(<json>) Converts a JSON field to the Splunk software native type.
json_extract(<json>, <path>) Extracts the value specified by <path> from <json>, and converts the value to the native type. This can be a JSON array if the path leads to an array.
json_extract(<json>, <path>, <path>, ...) Extracts all of the paths from <json> and returns it as a JSON array.
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Examples
These examples use this JSON object, which is in a field called cities in an event:
{
"cities": [
{
"name": "London",
"Bridges": [
{ "name": "Tower Bridge", "length": 801 },
{ "name": "Millennium Bridge", "length": 1066 }
]
},
{
"name": "Venice",
"Bridges": [
{ "name": "Rialto Bridge", "length": 157 },
{ "name": "Bridge of Sighs", "length": 36 },
{ "name": "Ponte della Paglia" }
]
},
{
"name": "San Francisco",
"Bridges": [
{ "name": "Golden Gate Bridge", "length": 8981 },
{ "name": "Bay Bridge", "length": 23556 }
]
}
]
}
The following example returns the entire JSON object from the cities field. The cities field contains only one
object. The key is the entire object. This extraction can return any type of value.
Field Results
{"cities":[{"name":"London","Bridges":[{"name":"Tower Bridge","length":801},{"name":"Millennium
Bridge","length":1066}]},{"name":"Venice","Bridges":[{"name":"Rialto Bridge","length":157},{"name":"Bridge of
extract_cities
Sighs","length":36},{"name":"Ponte della Paglia"}]},{"name":"San Francisco","Bridges":[{"name":"Golden Gate
Bridge","length":8981},{"name":"Bay Bridge","length":23556}]}]}
2. Extract the first nested JSON object in a field
The following example extracts the information about the city of London from the JSON object. This extraction can
return any type of value.
The {<num>} indexing demonstrated in this example search only works when the <path> maps to a JSON array. In this
case the {0} maps to the "0" item in the array, which is London. If the example used {1} it would select Venice from the
array.
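A search along these lines performs that extraction; the path string is one way to express the indexing described above:
... | eval London=json_extract(cities, "cities{0}")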
Field Results
London {"name":"London","Bridges":[{"name":"Tower Bridge","length":801},{"name":"Millennium Bridge","length":1066}]}
The following example extracts the information about the city of San Francisco from the JSON object. This
extraction can return any type of value.
Field Results
{"name":"San Francisco","Bridges":[{"name":"Golden Gate Bridge","length":8981},{"name":"Bay
San_Francisco
Bridge","length":23556}]}
The following example extracts the names of the cities from the JSON object. This extraction can return any type
of value.
Field Results
my_cities ["London","Venice","San Francisco"]
5. Extract a specific set of key-value pairs from each nested JSON object in a field
The following example extracts the information about each bridge from every city from the JSON object. This
extraction can return any type of value.
Field Results
[{"name":"Tower Bridge","length":801},{"name":"Millennium Bridge","length":1066},{"name":"Rialto
Bridges Bridge","length":157},{"name":"Bridge of Sighs","length":36},{"name":"Ponte della Paglia"},{"name":"Golden Gate
Bridge","length":8981},{"name":"Bay Bridge","length":23556}]
6. Extract a specific value from each nested JSON object in a field
The following example extracts the names of the bridges from all of the cities from the JSON object. This
extraction can return any type of value.
Field Results
["Tower Bridge","Millennium Bridge","Rialto Bridge","Bridge of Sighs","Ponte della Paglia","Golden Gate Bridge","Bay
Bridge_names
Bridge"]
7. Extract a specific key-value pair from a specific nested JSON object in a field
The following example extracts the name and length of the first bridge from the third city from the JSON object.
This extraction can return any type of value.
Field Results
GG_Bridge {"name":"Golden Gate Bridge","length":8981}
The following example extracts the length of the first bridge from the third city from the JSON object. This
extraction can return any type of value.
Field Results
GG_Bridge_length 8981
json_extract_exact(<json>, <keys>)
Like the json_extract function, this function returns a Splunk software native type value from a piece of JSON. The main
difference between these functions is that the json_extract_exact function does not use paths to locate and extract
values, but instead matches literal strings in the event and extracts those strings as keys.
See json_extract.
Usage
The json_extract_exact function treats strings for key extraction literally. This means that the function does not support
explicitly nested paths. You can set paths with nested json_array/json_object function calls.
Syntax Description
json_extract_exact(<json>) Converts a JSON field to the Splunk software native type. For example, it converts a JSON null to a null.
json_extract_exact(<json>, <string>) Extracts the key specified by <string> from <json>, and converts the key to the Splunk software native type. This can be a JSON array if the path leads to an array.
json_extract_exact(<json>, <string>, <string>, ...) Extracts all of the strings from <json> and returns them as a JSON array of keys.
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Example
Suppose you have a JSON event that looks like this: {"system.splunk.path":"/opt/splunk/"}
If you want to extract system.splunk.path from that event, you can't use the json_extract function because of the period
characters. Instead, you would use json_extract_exact, as shown in the following search:
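A sketch of such a search, assuming the JSON event is available in a field such as _raw:
... | eval splunk_path=json_extract_exact(_raw, "system.splunk.path")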
json_keys(<json>)
Returns the keys from the key-value pairs in a JSON object. The keys are returned as a JSON array.
Usage
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Examples
bridges
{"name": "Clifton Suspension Bridge", "length": 1352, "city": "Bristol", "country": "England"}
This example extracts the keys from the JSON object in the bridges field:
... | eval bridge_keys = json_keys(bridges)
bridge_keys
["name", "length", "city", "country"]
Return a list of keys from multiple JSON objects
Consider the following JSON objects, which are in separate rows in the bridges field:
bridges
{"name": "Clifton Suspension Bridge", "length": 1352, "city": "Bristol", "country": "England"}
{"name": "Tilikum Crossing", "length": 1700, "city": "Portland", "state": "Oregon", "country": "United States"}
This example extracts the keys from the JSON objects in the bridges field:
... | eval bridge_keys = json_keys(bridges)
bridge_keys
["name", "length", "city", "country"]
json_set(<json>, <path_value_pairs>)
Inserts or overwrites values for a JSON node with the values provided and returns an updated JSON object.
Usage
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
• If the path contains a list of keys, all of the keys in the chain are created if the keys don't exist.
• If there's a mismatch between the JSON object and the path, the update is skipped and doesn't generate an error.
For example, for object {"a": "b"}, json_set(.., "a.c", "d") produces no results since "a" has a string value and "a.c"
implies a nested object.
• If the value already exists and is of a matching non-value type, the json_set function overwrites the value by
default. A value type match isn't enforced. For example, you can overwrite a number with a string, Boolean, null,
and so on.
Examples
These examples use this JSON object, which is in a field called games in an event:
{
"category": {
"boardgames": {
"cooperative": [
{
"name": "Pandemic"
},
{
"name": "Forbidden Island"
},
{
"name": "Castle Panic"
}
]
}
}
}
1. Overwrite a value in an existing JSON array
The following example overwrites the value "Castle Panic" in the path [category.boardgames.cooperative] in
the JSON object. The value is replaced with "name":"Sherlock Holmes: Consulting Detective". The results are
placed into a new field called my_games.
The position count starts with 0. The third position is 2, which is why the example specifies {2} in the path.
Field Results
{"category":{"boardgames":{"cooperative":["name":"Pandemic", "name":"Forbidden Island", "name":"Sherlock Holmes:
my_games
Consulting Detective"]}}}
The following example inserts a list of popular games ["name":"Settlers of Catan", "name":"Terraforming
Mars", "name":"Ticket to Ride"] into the path [category.boardgames.competitive] in the JSON object.
Because the key competitive doesn't exist in the path, the key is created. The json_array function is used to
append the value list to the boardgames JSON object.
Field Results
{"category":{"boardgames":{"cooperative":["name":"Pandemic", "name":"Forbidden Island", "name":"Sherlock Holmes:
my_games
Consulting Detective"],"competitive": ["name":"Settlers of Catan", "name":"Terraforming Mars", "name":"Ticket to Ride"]}}}
{
"category": {
"boardgames": {
"cooperative": [
{
"name": "Pandemic"
},
{
"name": "Forbidden Island"
},
{
"name": "Castle Panic"
}
]
},
"competitive": [
{
"name": "Settlers of Catan"
},
{
"name": "Terraforming Mars"
},
{
"name": "Ticket to Ride"
}
]
}
}
The following example inserts a set of key-value pairs that specify if the game is available using a Boolean value.
These pairs are inserted into the path [category.boardgames.competitive] in the JSON object. The json_array
function is used to append the key-value pairs list to the boardgames JSON object.
Field Results
{"category":{"boardgames":{"cooperative":["name":"Pandemic", "name":"Forbidden Island", "name":"Sherlock Holmes:
my_games Consulting Detective"],"competitive": ["name":"Settlers of Catan", "available":true, "name":"Terraforming Mars",
"available":true, "name":"Ticket to Ride", "available":true]}}}
{
"category": {
"boardgames": {
"cooperative": [
{
"name": "Pandemic"
},
{
"name": "Forbidden Island"
},
{
"name": "Castle Panic"
}
]
},
"competitive": [
{
"name": "Settlers of Catan",
"available": true
},
{
"name": "Terraforming Mars",
"available": true
},
{
"name": "Ticket to Ride",
"available": true
}
]
}
}
If the Settlers of Catan game is out of stock, you can overwrite the value for the available key with the value
false().
For example:
Field Results
{"category":{"boardgames":{"cooperative":["name":"Pandemic", "name":"Forbidden Island", "name":"Sherlock Holmes:
my_games Consulting Detective"],"competitive": ["name":"Settlers of Catan", "available":false, "name":"Terraforming Mars",
"available":true, "name":"Ticket to Ride", "available":true]}}}
{
"category": {
"boardgames": {
"cooperative": [
{
"name": "Pandemic"
},
{
"name": "Forbidden Island"
},
{
"name": "Castle Panic"
}
]
},
"competitive": [
{
"name": "Settlers of Catan",
"available": false
},
{
"name": "Terraforming Mars",
"available": true
},
{
"name": "Ticket to Ride",
"available": true
}
]
}
}
json_set_exact(<json>, <key_value_pairs>)
Generates or overwrites a JSON object using the key-value pairs that you specify.
Usage
You can use the json_set_exact function with the eval and where commands, and as part of evaluation expressions with
other commands.
• The json_set_exact function interprets the keys as literal strings, including special characters. This function does
not interpret strings separated by period characters as keys for nested objects.
• If you supply multiple key-value pairs to json_set_exact, the function outputs an array.
• The json_set_exact function does not support or expect paths. You can set paths with nested json_array or
json_object function calls.
Example
Suppose you want to have a JSON object that looks like this:
{"system.splunk.path":"/opt/splunk"}
To generate this object, you can use the makeresults command and the json_set_exact function as shown in the
following search:
You use json_set_exact for this instead of json_set because the json_set function interprets the period characters in
{"system.splunk.path"} as nested objects. If you use json_set in the preceding search you get this JSON object:
{"system":{"splunk":{"path":"/opt/splunk"}}}
Instead of this object:
{"system.splunk.path":"/opt/splunk"}
json_valid(<json>)
Evaluates whether a piece of JSON uses valid JSON syntax and returns either TRUE or FALSE.
Usage
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Example
The following example validates a JSON object { "names": ["maria", "arun"] } in the firstnames field.
Because fields cannot hold Boolean values, the if function is used with the json_valid function to place the
string value equivalents of the Boolean values into the isValid field.
... | eval IsValid = if(json_valid(firstnames), "true", "false")
See also
Functions
Evaluation functions quick reference
mv_to_json_array function
Mathematical functions
The following list contains the functions that you can use to perform mathematical calculations.
• For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
• For the list of mathematical operators you can use with these functions, see "Operators" in the Usage section of
the eval command.
abs(X)
Description
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example creates a field called absnum, whose values are the absolute values of the numeric field number.
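... | eval absnum=abs(number)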
ceiling(X) or ceil(X)
Description
This function rounds a number X up to the next highest integer.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
You can use the abbreviation ceil(X) instead of the full name of the function.
Basic example
exact(X)
Description
This function renders the result of a numeric eval calculation with a larger amount of precision in the formatted output.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
exp(X)
Description
This function takes a number X and returns the exponential function e^X.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
floor(X)
Description
This function rounds a number X down to the nearest whole integer.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
ln(X)
Description
This function takes a number X and returns its natural logarithm.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example returns the natural logarithm of the values of bytes.
log(X,Y)
Description
This function takes either one or two numeric arguments and returns the logarithm of the first argument X using the
second argument Y as the base. If the second argument Y is omitted, this function evaluates the logarithm of number X
with base 10.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
pi()
Description
This function takes no arguments and returns the constant pi to 11 digits of precision.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example calculates the area of a circle, which is pi() multiplied by the radius to the power of 2.
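For example, assuming a numeric field named radius:
... | eval area_circle=pi()*pow(radius,2)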
pow(X,Y)
Description
This function takes two numeric arguments X and Y and returns X^Y, X to the power of Y.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example calculates the area of a circle, which is pi() multiplied by the radius to the power of 2.
round(X,Y)
Description
This function takes one or two numeric arguments X and Y, returning X rounded to the number of decimal places specified by Y. The default is to round to an integer.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
sigfig(X)
Description
This function takes one argument X, a number, and rounds that number to the appropriate number of significant figures.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
The computation for sigfig is based on the type of calculation that generates the number.
• For multiplication and division, the result should have the minimum number of significant figures of all of the
operands.
• For addition and subtraction, the result should have the same number of decimal places as the least precise
number of all of the operands.
For example, the numbers 123.0 and 4.567 contain different precision with the decimal places. The first number is less
precise because it has 1 decimal place. The second number is more precise because it has 3 decimal places.
If the calculation is 123.0 + 4.567 = 127.567, then the sigfig function returns the fewest number of decimal places. In this
example only one decimal place is returned. Because the numbers to the right of the last significant figure are greater
than 5, the result returned is 127.6
Basic examples
Example 1: The following example shows how the sigfig function works. The calculation 1.00*1111 returns the value
n=1111, but the following search using the sigfig function returns n=1110.
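For example:
... | eval n=sigfig(1.00*1111)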
In this example, 1.00 has 3 significant figures and 1111 has 4 significant figures, so the minimum number of significant figures for all operands is 3. Using the sigfig function, the final result is rounded to 3 digits, returning n=1110 and not 1111.
Example 2: There are situations where the results of a calculation can return a different accuracy to the very far right of
the decimal point. For example, the following search calculates the average of 100 values:
avg(test)
3.9900000000000055
When the count is changed to 10000, the results are different:
avg(test)
3.990000000000215
This occurs because numbers are treated as double-precision floating-point numbers.
To mitigate this issue, you can use the sigfig function to specify the number of significant figures you want returned.
However, first you need to make a change to the stats command portion of the search. You need to change the name of
the field avg(test) to remove the parentheses. For example, stats avg(test) AS test. The sigfig function expects either
a number or a field name for X. The sigfig function cannot accept a field name that looks like another function, in this
case avg.
To specify the number of decimal places you want returned, you multiply the field name by 1 and use zeros to specify the
number of decimal places. If you want 4 decimal places returned, you would multiply the field name by 1.0000. To return 2
decimal places, multiply by 1.00, as shown in the following example:
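... | makeresults count=10000 | eval test=3.99 | stats avg(test) AS test | eval test=sigfig(test*1.00)
This is a sketch that generates constant sample values with makeresults to stand in for the source data.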
test
3.99
sqrt(X)
Description
This function takes one numeric argument X and returns its square root.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
sum(X,...)
Description
This function takes an arbitrary number of arguments and returns the sum of numerical values as an integer. Each
argument must be either a field (single or multi value) or an expression that evaluates to a number. At least one numeric
argument is required. When the function is applied to a multivalue field, each numeric value of the field is included in the
total. The eval command ignores arguments that don't exist in an event or can't be converted to a number.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
Example 1: The following example creates a field called a with value 5.0, a field called b with value 9, and a field called x
with value 14 that is the sum of a and b. A field is not created for c and it is not included in the sum because a value was
not declared for that argument.
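A search along these lines matches that description:
... | eval a=5.0, b=9, x=sum(a, b, c)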
Example 2: The following example calculates the sum of three numbers and returns c=6.
However, the following example returns an error because one of the arguments in the function is a string.
To use a quoted string as a number within the function, you must convert the number to an integer, as shown in the
following example that returns c=6.
Example 3: In this example, a field with a value that is a string results in a field called a with value 1, and a field called c
with value 6,
Example 4: When an argument is a field, the eval command retrieves the value and attempts to treat it as a number,
even if it is a string. The following example creates a field called a with value somedata, and a field called c with value 5.
However, the following example returns an error because the string argument is specified directly within the function.
You can also use the statistical eval functions, max and min, on multivalue fields. See Statistical eval functions.
Multivalue eval functions
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
commands(X)
Description
This function takes a search string, or field that contains a search string, X and returns a multivalued field containing a list
of the commands used in X.
Usage
This function is generally not recommended for use except for analysis of audit.log events.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example returns a multivalued field X that contains 'search', 'stats', and 'sort'.
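For example:
... | eval x=commands("search foo | stats count | sort count")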
mvappend(X,...)
Description
This function takes an arbitrary number of arguments and returns a multivalue result of all the values. The arguments can
be strings, multivalue fields or single value fields.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
This example shows how to append several values: localhost is a literal string value and srcip is a field name.
• The inner mvappend function contains two values: localhost is a literal string value and srcip is a field name.
• The outer mvappend function contains three values: the inner mvappend function, destip is a field name, and
192.168.1.1 which is a literal IP address.
The results are placed in a new field called ipaddresses, which contains the array ["localhost", <values_in_srcip>,
<values_in_destip>, "192.168.1.1"].
Note that the previous example generates the same results as the following example, which does not use a nested
mvappend function:
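... | eval ipaddresses=mvappend("localhost", srcip, destip, "192.168.1.1")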
_time ipaddresses
2020-11-19 16:43:31 localhost
192.168.1.1
mvcount(MVFIELD)
Description
This function takes a field and returns a count of the values in that field for each result. If the field is a multivalue field,
returns the number of values in that field. If the field contains a single value, this function returns 1 . If the field has no
values, this function returns NULL.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
Extended example
In the following example, the mvcount() function returns the number of email addresses in the To, From, and Cc fields and
saves the addresses in the specified "_count" fields.
This search takes the values in the To field and uses the split function to separate the email address on the @ symbol.
The split function is also used on the Cc field for the same purpose.
If only a single email address exists in the From field, as you would expect, mvcount(From) returns 1. If there is no Cc
address, the Cc field might not exist for the event. In that situation mvcount(cc) returns NULL.
mvdedup(X)
Description
This function takes a multivalue field X and returns a multivalue field with its duplicate values removed.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
mvfilter(X)
Description
This function filters a multivalue field based on an arbitrary Boolean expression X. The Boolean expression X can
reference ONLY ONE field at a time.
Usage
This function will return NULL values of the field x as well. If you do not want the NULL values, use one of the following
expressions:
• mvfilter(!isnull(x))
• mvfilter(isnotnull(x))
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example returns all of the values in field email that end in .net or .org.
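One way to express this uses the match function:
... | eval n=mvfilter(match(email, "\.net$") OR match(email, "\.org$"))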
mvfind(MVFIELD,"REGEX")
Description
This function tries to find a value in the multivalue field MVFIELD that matches the regular expression in "REGEX". If a
match exists, the index of the first matching value is returned (beginning with zero). If no values match, NULL is returned.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
mvindex(MVFIELD,STARTINDEX, ENDINDEX)
Description
This function returns a subset of the multivalue field using the start and end index values.
Usage
The MVFIELD argument must be a multivalue field. The STARTINDEX and ENDINDEX indexes must be numbers.
The MVFIELD and STARTINDEX arguments are required. The ENDINDEX argument is optional.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
• Indexes start at zero. If you have 5 values in the multivalue field, the first value has an index of 0. The second
value has an index of 1, and so on.
• If only the STARTINDEX argument is specified, only that value is included in the results.
• When the ENDINDEX argument is specified, the range of values from STARTINDEX to ENDINDEX are included in the
results.
• Both the STARTINDEX and ENDINDEX arguments can be negative. An index of -1 is used to specify the last value in
the list.
• If the indexes are out of range or invalid, the result is NULL.
Examples
To return a range of values, specify both a STARTINDEX and ENDINDEX value. For example, the following search returns the
first 4 values in the field. The start value is 0 and the end value is 3.
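A search along these lines returns that result, assuming the source multivalue field is named names:
... | eval my_names=mvindex(names, 0, 3)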
my_names
alex,celestino,claudia,david
Extended examples
ponies
buttercup, dash, flutter, honey, ivory, minty, pinky, rarity
To return a value from the end of the list of values, the index numbers start with -1. The negative symbol indicates that the
indexing starts from the last value. For example:
Pony name buttercup dash flutter honey ivory minty pinky rarity
index number -8 -7 -6 -5 -4 -3 -2 -1
To return the last value in the list, you specify -1, which indicates to start at the end of the list and return only one value.
For example:
my_ponies
rarity
To return the 3rd value from the end, you would specify the index number -3. For example:
my_ponies
minty
To return a range of values, specify both a STARTINDEX and ENDINDEX value. For example, the following search returns the
last 3 values in the field. The start value is -3 and the end value is -1.
my_ponies
minty,pinky,rarity
mvjoin(MVFIELD,STR)
Description
This function takes two arguments, a multivalue field (MVFIELD) and a string delimiter (STR). The function concatenates
the individual values within MVFIELD using the value of STR as a separator.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
You have a multivalue field called "base" that contains the values "1" "2" "3" "4" "5". The values are separated by a space.
You want to create a single value field instead, with OR as the delimiter. For example "1 OR 2 OR 3 OR 4 OR 5".
The following search creates the base field with the values. The search then creates the joined field by using the result of
the mvjoin function.
The following example joins together the individual values of "foo" using a semicolon as the delimiter:
mvmap(X,Y)
Description
This function iterates over the values of a multi-value field (X), performs an operation (Y) on each value, and returns a
multi-value field with the list of results.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example multiplies each value in the results field by 10.
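For example (the output field name n is illustrative):
... | eval n=mvmap(results, results*10)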
The following example multiplies each value in the results field by threshold, where threshold is a single-valued field.
The following example multiplies the 2nd and 3rd values in the results field by threshold, where threshold is a
single-valued field. This example uses the mvindex function to identify specific values in the results field.
mvrange(X,Y,Z)
Description
This function creates a multivalue field for a range of numbers. This function can contain up to three arguments: a starting
number X, an ending number Y (which is excluded from the field), and an optional step increment Z. If the increment is a
timespan such as 7d, the starting and ending numbers are treated as UNIX time.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example takes the UNIX timestamp for 1/1/2018 as the start date and the UNIX timestamp for 4/19/2018 as
an end date and uses the increment of 7 days.
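A sketch of such a search; the ending timestamp shown here is illustrative:
... | eval mv=mvrange(1514834731, 1524096000, "7d")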
This example returns a multivalue field with the UNIX timestamps. The results appear on the Statistics tab and look
something like this:
_time mv
2018-04-10 12:31:03 1514834731
1515439531
1516044331
1516649131
1517253931
1517858731
1518463531
1519068331
1519673131
1520277931
1520879131
1521483931
1522088731
1522693531
1523298331
1523903131
mvsort(X)
Description
This function uses a multivalue field X and returns a multivalue field with the values sorted lexicographically.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
Basic example
mvzip(X,Y,"Z")
Description
This function takes two multivalue fields, X and Y, and combines them by stitching together the first value of X with the
first value of field Y, then the second with the second, and so on. The third argument, Z, is optional and is used to specify
a delimiting character to join the two values. The default delimiter is a comma.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
Extended example
You can nest several mvzip functions together to create a single multivalued field three_fields from three separate fields.
The pipe ( | ) character is used as the separator between the field values.
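For example, a search along these lines combines three fields (the field names field1, field2, and field3 are illustrative):
... | eval three_fields=mvzip(mvzip(field1,field2,"|"),field3,"|")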
mv_to_json_array(<field>, <infer_types>)
Usage
You can use this function with the eval and where commands, and as part of evaluation expressions with other
commands.
Because the elements of JSON arrays can have many data types (such as string, numeric, Boolean, and null), the
mv_to_json_array function lets you specify how it should map the contents of multivalue fields into JSON arrays. You can
have the field values simply written to arrays as string data types, or you can have the function infer different JSON data
types.
Use the <infer_types> input to specify that the mv_to_json_array function should attempt to infer JSON data types when
it converts field values into array elements. The <infer_types> input defaults to false.
Syntax Description
mv_to_json_array(<field>, false()) or mv_to_json_array(<field>) By default, or when you explicitly set it to false(), the mv_to_json_array function maps all values in the multivalued field to the JSON array as string data types, whether they are numeric, strings, Boolean values, or any other JSON data type. The mv_to_json_array function effectively splits the multivalue field on the comma and writes each quote-enclosed value to the array as an element with the string data type.
mv_to_json_array(<field>, true()) When you set the mv_to_json_array function to true(), the function removes one set of bracketing quote characters from each value it transfers into the JSON array. If the function does not recognize the resulting array element as a proper JSON data type (such as string, numeric, Boolean, or null), the function turns the element into a null data type.
Example
This example shows you how the mv_to_json_array function can validate JSON as it generates JSON arrays.
The array that is created from these values depends on the <infer_types> input.
When <infer_types> is set to false or omitted, the mv_to_json_array function converts the field values into array
elements without changing the values.
["\"Buttercup\"","\"Fluttershy\"","\"Rarity\"","true","null"]
When you run this search with infer_values set to true(), the mv_to_json_array function removes the extra quote and
backslash escape characters from the field values when the values are converted into array elements.
["Buttercup","Fluttershy","Rarity",true,null]
split(X,"Y")
Description
This function takes two arguments, field X and delimiting character Y. It splits the values of X on the delimiter Y and
returns X as a multivalue field.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
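For example, the following illustrative search splits a semicolon-delimited value into a multivalue field:
| makeresults | eval n = "a;b;c" | eval n = split(n, ";")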
See also
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
In addition to these functions, there is a comprehensive set of statistical functions that you can use with the stats, chart,
and related commands.
avg(X,...)
Description
This function takes an arbitrary number of arguments and returns the average of numerical values as an integer. Each
argument must be either a field (single or multi value) or an expression that evaluates to a number. At least one numeric
argument is required. When the function is applied to a multivalue field, each numeric value of the field is included in the
total. The eval command ignores arguments that don't exist in an event or can't be converted to a number.
To get the numerical average or mean of the values of two fields, x and y, note that avg(x,y) is equivalent to
sum(x,y)/(mvcount(x) + mvcount(y)).
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
Example 1: The following example creates a field called a with value 5.0, a field called b with value 9, and a field called x
with value 7 that is the average of a and b. A field is not created for c and it is not included in the total because a value
was not declared for that argument.
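One way to write this example:
| makeresults | eval a = 5.0, b = 9, x = avg(a, b, c)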
Example 2: The following example calculates the average of three numbers and returns c=2.
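A sketch of this search:
| makeresults | eval c = avg(1, 2, 3)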
However, the following example returns an error because one of the arguments in the function is a string.
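A sketch of such a search:
| makeresults | eval c = avg(1, "2", 3)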
To use a quoted string as a number within the function, you must convert the number to an integer, as shown in the
following example where c=2:
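One possible form of this search:
| makeresults | eval c = avg(1, tonumber("2"), 3)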
Example 3: In this example, a field with a value that is a string results in a field called a with value 1, and a field called c
with value 2:
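A sketch of this example:
| makeresults | eval a = "1", c = avg(a, 2, 3)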
Example 4: When an argument is a field, the eval command retrieves the value and attempts to treat it as a number,
even if it is a string. The following example creates a field called a with value somedata, and a field called c with value 2.5.
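A sketch of this example:
| makeresults | eval a = "somedata", c = avg(a, 2, 3)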
However, the following example returns an error because the string argument is specified directly within the function.
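A sketch of such a search:
| makeresults | eval c = avg("somedata", 2, 3)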
max(X,...)
Description
This function takes an arbitrary number of numeric or string arguments, and returns the maximum. Strings are greater
than numbers.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns either "foo" or the value in the name field. Splunk searches use lexicographical order,
where numbers are sorted before letters. If the value in the name field is "baz", then "foo" is returned. If the value in the
name field is "zaz", then "zaz" is returned.
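A sketch of this comparison, assuming a field named name exists in your events:
... | eval n = max("foo", name)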
This search creates a field called n with a single value, which is a series of numbers. The makemv command is used to
make the single value into multiple values, each of which appears on its own row in the results. Another new field called
maxn is created which takes the values in n and returns the maximum value, 6.
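A sketch of such a search, using literal values that match the results below:
| makeresults | eval n = "1 3 5 6 4 2" | makemv delim=" " n | eval maxn = max(n)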
_time                 maxn   n
2021-01-29 10:42:37   6      1
                             3
                             5
                             6
                             4
                             2
min(X,...)
Description
This function takes an arbitrary number of numeric or string arguments, and returns the minimum. Strings are greater than
numbers.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns either 3 or the value in the size field. Splunk searches use lexicographical order, where
numbers are sorted before letters. If the value in the size field is 9, then 3 is returned. If the value in the size field is 1,
then 1 is returned.
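A sketch of this comparison, assuming a field named size exists in your events:
... | eval n = min(3, size)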
This search creates a field called n with a single value, which is a series of numbers. The makemv command is used to make the single value into multiple values, each of which appears on its own row in the results. Another new field called minn is created which takes the values in n and returns the minimum value, 2.
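A sketch of such a search, using literal values that match the results below:
| makeresults | eval n = "3 5 6 4 7 2" | makemv delim=" " n | eval minn = min(n)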
_time                 minn   n
2021-01-29 10:42:37   2      3
                             5
                             6
                             4
                             7
                             2
random()
Description
This function takes no arguments and returns a pseudo-random integer ranging from zero to 2^31-1.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example returns a random number within a specified range. In this example, the random number is between
1 and 100,000.
This example takes a random number and uses the modulo mathematical operator ( % ) to divide the random number by
100000. This ensures that the random number returned is not greater than 100000. The number remaining after the
division is increased by 1 to ensure that the number is at least greater than or equal to 1.
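A sketch of this calculation:
| makeresults | eval n = (random() % 100000) + 1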
Text functions
The following list contains the functions that you can use with string values.
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
len(X)
Description
This function returns the character length of a string X.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
Suppose you have a set of results that looks something like this:
_time names
2020-01-09 16:35:14 buttercup
2020-01-09 16:35:14 rarity
2020-01-09 16:35:14 tenderhoof
2020-01-09 16:35:14 dash
2020-01-09 16:35:14 mistmane
You can determine the length of the values in the names field using the len function:
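One possible form of this search:
... | eval length = len(names)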
The results show a count of the character length of the values in the names field:
lower(X)
Description
This function takes one string argument and returns the string in lowercase.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example returns the value provided by the field username in lowercase.
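A sketch of this search:
... | eval username = lower(username)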
ltrim(X,Y)
Description
This function takes one or two arguments X and Y, and returns X with the characters in Y trimmed from the left side. If Y is
not specified, spaces and tabs are removed.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example trims the leading spaces and all of the occurrences of the letter Z from the left side of the string.
The value that is returned is x="abcZZ ".
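A sketch of such a search, using an illustrative literal value:
| makeresults | eval x = ltrim("   ZZZZabcZZ ", " Z")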
replace(X,Y,Z)
Description
This function returns a string formed by substituting string Z for every occurrence of regex string Y in string X. The third
argument Z can also reference groups that are matched in the regex.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
To replace a backslash ( \ ) character, you must escape the backslash twice. This is because the replace function occurs
inside an eval expression. The eval expression performs one level of escaping before passing the regular expression to
PCRE. Then PCRE performs its own escaping. See SPL and regular expressions.
Basic example
The following example returns date, with the month and day numbers switched. If the input is 1/14/2017 the return value
would be 14/1/2017.
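A sketch of this search, assuming the value is in a field named date:
... | eval n = replace(date, "^(\d{1,2})/(\d{1,2})/", "\2/\1/")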
rtrim(X,Y)
Description
This function takes one or two arguments X and Y, and returns X with the characters in Y trimmed from the right side. If Y
is not specified, spaces and tabs are removed.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
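For example, the following illustrative search trims the trailing spaces and occurrences of the letter Z from the right side of the string, returning x=" ZZabc":
| makeresults | eval x = rtrim(" ZZabcZZ ", " Z")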
spath(X,Y)
Description
This function takes two arguments, an input source field X and an spath expression Y, that is the XML or JSON formatted
location path to the value that you want to extract from X.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
If Y is a literal string, it needs quotes, spath(X,"Y"). If Y is a field name (with values that are the location paths), it doesn't
need quotes. This might result in a multivalued field. Read more about the spath command.
Basic example
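An illustrative example that extracts a value from a JSON string:
| makeresults | eval locations = "{\"city\": {\"name\": \"Springfield\"}}" | eval city = spath(locations, "city.name")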
substr(X,Y,Z)
Description
This function takes either two or three arguments. The required arguments are X, a string, and Y, a numeric. Z is optional
and a numeric. This function returns a substring of X, starting at the index specified by Y with the number of characters
specified by Z. If Z is not provided, the function returns the rest of the string.
Usage
The indexes follow SQLite semantics; they start at 1. Negative indexes can be used to indicate a start from the end of the
string.
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example concatenates "str" and "ing" together, returning "string":
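One possible form of this search:
... | eval n = substr("string", 1, 3) . substr("string", -3)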
trim(X,Y)
Description
This function takes one or two arguments X and Y and returns X with the characters in Y trimmed from both sides. If Y is
not specified, spaces and tabs are removed.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
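For example, the following illustrative search trims the leading and trailing spaces and Z characters from the string, returning x="abc":
| makeresults | eval x = trim(" ZZabcZZ ", " Z")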
upper(X)
Description
This function takes one string argument and returns the string in uppercase.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example returns the value provided by the field username in uppercase.
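A sketch of this search:
... | eval username = upper(username)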
urldecode(X)
Description
This function takes one URL string argument X and returns the unescaped or decoded URL string.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
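For example, the following illustrative search decodes an encoded URL string:
... | eval n = urldecode("http%3A%2F%2Fwww.splunk.com%2Fdownload%3Fr%3Dheader")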
Trig and Hyperbolic functions
For information about using string and numeric fields in functions, and nesting functions, see Evaluation functions.
acos(X)
Description
This function computes the arc cosine of X, in the interval [0,pi] radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following example calculates 180 divided by pi and multiplies the result by the arc cosine of 0.
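A sketch of this calculation, which returns 90 degrees:
| makeresults | eval degrees = acos(0) * (180/pi())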
acosh(X)
Description
This function computes the arc hyperbolic cosine of X, in radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
asin(X)
Description
This function computes the arc sine of X, in the interval [-pi/2,+pi/2] radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
The following example calculates 180 divided by pi and multiplies that by the arc sine of 1.
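A sketch of this calculation, which returns 90 degrees:
| makeresults | eval degrees = asin(1) * (180/pi())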
asinh(X)
Description
This function computes the arc hyperbolic sine of X, in radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
atan(X)
Description
This function computes the arc tangent of X, in the interval [-pi/2,+pi/2] radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
atan2(Y, X)
Description
This function computes the arc tangent of Y, X in the interval [-pi,+pi] radians.
Y is a value that represents the proportion of the y-coordinate. X is the value that represents the proportion of the
x-coordinate.
To compute the value, the function takes into account the sign of both arguments to determine the quadrant.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
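An illustrative example using literal values for the y and x proportions:
| makeresults | eval radians = atan2(0.50, 0.75)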
atanh(X)
Description
This function computes the arc hyperbolic tangent of X, in radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
cos(X)
Description
This function computes the cosine of an angle of X radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
cosh(X)
Description
This function computes the hyperbolic cosine of X radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
hypot(X,Y)
Description
This function computes the hypotenuse of a right-angled triangle whose legs are X and Y.
The function returns the square root of the sum of the squares of X and Y, as described in the Pythagorean theorem.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
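An illustrative example, which returns 5 for legs of length 3 and 4:
| makeresults | eval hypotenuse = hypot(3, 4)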
sin(X)
Description
This function computes the sine of X, in radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic examples
The following search calculates the sine of pi divided by 180 and then multiplied by 90.
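A sketch of this calculation, which returns 1:
| makeresults | eval n = sin(90 * pi()/180)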
sinh(X)
Description
This function computes the hyperbolic sine of X radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
tan(X)
Description
This function computes the tangent of X, in radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
tanh(X)
Description
This function computes the hyperbolic tangent of X radians.
Usage
You can use this function with the eval, fieldformat, and where commands, and as part of eval expressions.
Basic example
Statistical and Charting Functions
The functions can also be used with related statistical and charting commands. The following table lists the commands supported by the statistical and charting functions, along with the related commands that can also use these functions.

Command: chart
Related commands: sichart

Command: stats
Related commands: eventstats, streamstats, geostats, sistats

Command: timechart
Related commands: sitimechart

For the tstats and the mstats commands, see the documentation for each command for a list of the supported functions.
Functions that you can use to create sparkline charts are noted in the documentation for each function. Sparkline is a
function that applies to only the chart and stats commands, and allows you to call other functions. For more information,
see Add sparklines to search results in the Search Manual.
Most of the statistical and charting functions expect the field values to be numbers. All of the values are processed as
numbers, and any non-numeric values are ignored.
The following functions process the field values as literal string values, even though the values are numbers.
• count
• distinct_count
• earliest
• estdc
• estdc_error
• first
• last
• latest
• list
• max
• min
• mode
• values
For example, if you use the distinct_count function and the field contains values such as "1", "1.0", and "01", each value is considered a distinct string value.
The only exceptions are the max and min functions. These functions process values as numbers if possible. For example,
the values "1", "1.0", and "01" are processed as the same numeric value.
There are two ways that you can see information about the supported statistical and charting functions:
• Function list by category
• Alphabetical list of functions
Function list by category

The following table is a quick reference of the supported statistical and charting functions, organized by category. This table provides a brief description for each function. Use the links in the table to learn more about each function and to see examples.

Aggregate functions
• avg(X): Returns the average of the values in the field X.
• count(X): Returns the number of occurrences where the field that you specify contains any value (is not empty). You can also count the occurrences of a specific value in the field by using the eval command with the count function. For example: count(eval(field_name="value")).
• estdc(X): Returns the estimated count of the distinct values in the field X.
• estdc_error(X): Returns the theoretical error of the estimated count of the distinct values in the field X. The error represents a ratio of the absolute_value(estimate_distinct_count - real_distinct_count)/real_distinct_count.
• max(X): Returns the maximum value of the field X. If the values of X are non-numeric, the maximum value is found using lexicographical ordering. This function processes field values as numbers if possible, otherwise processes field values as strings.
• mean(X): Returns the arithmetic mean of the field X.
• min(X): Returns the minimum value of the field X. If the values of X are non-numeric, the minimum value is found using lexicographical ordering.
• percentile<X>(Y): Returns the X-th percentile value of the numeric field Y. Valid values of X are integers from 1 to 99.
• sumsq(X): Returns the sum of the squares of the values of the field X.

Event order functions
• first(X): Returns the first seen value of the field X. In general, the first seen value of the field is the most recent instance of this field, relative to the input order of events into the stats command.
• last(X): Returns the last seen value of the field X. In general, the last seen value of the field is the oldest instance of this field relative to the input order of events into the stats command.

Multivalue stats and chart functions
• list(X): Returns a list of up to 100 values of the field X as a multivalue entry. The order of the values reflects the order of input events.
• values(X): Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.

Time functions
• earliest(X): Returns the chronologically earliest (oldest) seen occurrence of a value of a field X.
• earliest_time(X): Returns the UNIX time of the earliest (oldest) occurrence of a value of the field. Used in conjunction with earliest(x), latest(x), and latest_time(x) to calculate the rate of increase for an accumulating counter.
• latest(X): Returns the chronologically latest (most recent) seen occurrence of a value of a field X.
• latest_time(X): Returns the UNIX time of the latest (most recent) occurrence of a value of the field. Used in conjunction with earliest(x), earliest_time(x), and latest(x) to calculate the rate of increase for an accumulating counter.
• per_day(X): Returns the values of field X, or eval expression X, for each day.
• per_hour(X): Returns the values of field X, or eval expression X, for each hour.
• per_minute(X): Returns the values of field X, or eval expression X, for each minute.
• per_second(X): Returns the values of field X, or eval expression X, for each second.
• rate(X): Returns the per-second rate change of the value of the field. Represents (latest(X) - earliest(X)) / (latest_time(X) - earliest_time(X)). Requires the earliest(X) and latest(X) values of the field to be numerical, and the earliest_time(X) and latest_time(X) values to be different.
• rate_avg(X): Returns the average rates for the time series associated with a specified accumulating counter metric.
• rate_sum(X): Returns the summed rates for the time series associated with a specified accumulating counter metric.
Alphabetical list of functions
The following table is a quick reference of the supported statistical and charting functions, organized alphabetically. This
table provides a brief description for each function. Use the links in the table to learn more about each function and to see
examples.
• avg(X): Returns the average of the values in the field X. (Aggregate functions)
• count(X): Returns the number of occurrences where the field that you specify contains any value (is not empty). You can also count the occurrences of a specific value in the field by using the eval command with the count function. For example: count(eval(field_name="value")). (Aggregate functions)
• distinct_count(X): Returns the count of distinct values in the field X. (Aggregate functions)
• earliest(X): Returns the chronologically earliest (oldest) seen occurrence of a value of a field X. (Time functions)
• earliest_time(X): Returns the UNIX time of the earliest (oldest) occurrence of a value of the field. Used in conjunction with earliest(x), latest(x), and latest_time(x) to calculate the rate of increase for an accumulating counter. (Time functions)
• estdc(X): Returns the estimated count of the distinct values in the field X. (Aggregate functions)
• estdc_error(X): Returns the theoretical error of the estimated count of the distinct values in the field X. The error represents a ratio of the absolute_value(estimate_distinct_count - real_distinct_count)/real_distinct_count. (Aggregate functions)
• first(X): Returns the first seen value of the field X. In general, the first seen value of the field is the most recent instance of this field, relative to the input order of events into the stats command. (Event order functions)
• last(X): Returns the last seen value of the field X. In general, the last seen value of the field is the oldest instance of this field relative to the input order of events into the stats command. (Event order functions)
• latest(X): Returns the chronologically latest (most recent) seen occurrence of a value of a field X. (Time functions)
• latest_time(X): Returns the UNIX time of the latest (most recent) occurrence of a value of the field. Used in conjunction with earliest(x), earliest_time(x), and latest(x) to calculate the rate of increase for an accumulating counter. (Time functions)
• list(X): Returns a list of up to 100 values of the field X as a multivalue entry. The order of the values reflects the order of input events. (Multivalue stats and chart functions)
• max(X): Returns the maximum value of the field X. If the values of X are non-numeric, the maximum value is found using lexicographical ordering. This function processes field values as numbers if possible, otherwise processes field values as strings. (Aggregate functions)
• mean(X): Returns the arithmetic mean of the field X. (Aggregate functions)
• median(X): Returns the middle-most value of the field X. (Aggregate functions)
• min(X): Returns the minimum value of the field X. If the values of X are non-numeric, the minimum value is found using lexicographical ordering. (Aggregate functions)
• mode(X): Returns the most frequent value of the field X. (Aggregate functions)
• percentile<X>(Y): Returns the X-th percentile value of the numeric field Y. Valid values of X are integers from 1 to 99. (Aggregate functions)
• per_hour(X): Returns the values of field X, or eval expression X, for each hour. (Time functions)
• per_minute(X): Returns the values of field X, or eval expression X, for each minute. (Time functions)
• per_second(X): Returns the values of field X, or eval expression X, for each second. (Time functions)
• range(X): Returns the difference between the maximum and minimum values of the field X ONLY IF the values of X are numeric. (Aggregate functions)
• rate(X): Returns the per-second rate change of the value of the field. Represents (latest(X) - earliest(X)) / (latest_time(X) - earliest_time(X)). Requires the earliest(X) and latest(X) values of the field to be numerical, and the earliest_time(X) and latest_time(X) values to be different. (Time functions)
• rate_avg(X): Returns the average rates for the time series associated with a specified accumulating counter metric. (Time functions)
• rate_sum(X): Returns the summed rates for the time series associated with a specified accumulating counter metric. (Time functions)
• stdev(X): Returns the sample standard deviation of the field X. (Aggregate functions)
• stdevp(X): Returns the population standard deviation of the field X. (Aggregate functions)
• sum(X): Returns the sum of the values of the field X. (Aggregate functions)
• sumsq(X): Returns the sum of the squares of the values of the field X. (Aggregate functions)
• values(X): Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical. (Multivalue stats and chart functions)
• var(X): Returns the sample variance of the field X. (Aggregate functions)
• varp(X): Returns the population variance of the field X. (Aggregate functions)
See also
Commands
chart
geostats
eventstats
stats
streamstats
timechart
Functions
Evaluation functions
Answers
Have questions? Visit Splunk Answers and search for a specific function or command.
Aggregate functions
Aggregate functions summarize the values from each event to create a single, meaningful value. Common aggregate
functions include Average, Count, Minimum, Maximum, Standard Deviation, Sum, and Variance.
Most aggregate functions are used with numeric fields. However, there are some functions that you can use with either
alphabetic string fields or numeric fields. The function descriptions indicate which functions you can use with alphabetic
strings.
avg(X)
Description
Returns the average of the values in the field X.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
For a list of the related statistical and charting commands that you can use with this function, see Statistical and charting
functions.
Basic examples
Example 1
The following example returns the average (mean) "size" for each distinct "host".
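A sketch of this search, with the leading search criteria elided:
... | stats avg(size) BY host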
Example 2
The following example returns the average "thruput" of each "host" for each 5 minute time span.
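A sketch of this search, using bin to create the 5 minute spans:
... | bin _time span=5m | stats avg(thruput) BY _time host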
Example 3
The following example charts the ratio of the average (mean) "size" to the maximum "delay" for each distinct "host" and
"user" pair.
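A sketch of this search:
... | chart eval(avg(size)/max(delay)) AS ratio BY host user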
Example 4
The following example displays a timechart of the average of cpu_seconds by processor, rounded to 2 decimal points.
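A sketch of this search:
... | timechart eval(round(avg(cpu_seconds), 2)) AS avg_cpu BY processor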
Extended examples
Example 1
There are situations where the results of a calculation can return a different accuracy to the very far right of the decimal
point. For example, the following search calculates the average of 100 values:
avg(test)
3.9900000000000055
When the count is changed to 10000, the results are different:
avg(test)
3.990000000000215
This occurs because numbers are treated as double-precision floating-point numbers.
To mitigate this issue, you can use the sigfig function to specify the number of significant figures you want returned.
However, first you need to make a change to the stats command portion of the search. You need to change the name of
the field avg(test) to remove the parenthesis. For example stats avg(test) AS test. The sigfig function expects either
a number or a field name for X. The sigfig function cannot accept a field name that looks like another function, in this
case avg.
To specify the number of decimal places you want returned, you multiply the field name by 1 and use zeros to specify the
number of decimal places. If you want 4 decimal places returned, you would multiply the field name by 1.0000. To return 2
decimal places, multiply by 1.00, as shown in the following example:
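One possible form of this search:
... | stats avg(test) AS test | eval test = sigfig(test * 1.00)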
test
3.99
Example 2
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
1. Run the following search to create a chart to show the average number of events in a transaction based on the
duration of the transaction.
The transaction command adds two fields to the results duration and eventcount. The eventcount field tracks
the number of events in a single transaction.
In this search, the transactions are piped into the chart command. The avg() function is used to calculate the
average number of events for each duration. Because the duration is in seconds and you expect there to be many
values, the search uses the span argument to bucket the duration into bins using logarithm with a base of 2.
2. Use the field format option to enable number formatting.
3. Click the Visualization tab and change the display to a pie chart.
Each wedge of the pie chart represents a duration for the event transactions. You can hover over a wedge to see
the average values.
count(X) or c(X)
Description
Returns the number of occurrences of the field X. To indicate a specific field value to match, format X as
eval(field="value"). Processes field values as strings. To use this function, you can specify count(X), or the abbreviation
c(X).
Usage
You can use the count(X) function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline()
charts.
Basic examples
The following example returns the count of events where the status field has the value "404".
This example uses an eval expression with the count function. See Using eval expressions in stats functions.
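A sketch of this search:
... | stats count(eval(status="404")) AS count_status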
The following example separates search results into 10 bins and returns the count of raw events for each bin.
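A sketch of this search:
... | bin size bins=10 | stats count(_raw) BY size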
The following example generates a sparkline chart to count the events that use the _raw field.
... sparkline(count)
The following example generates a sparkline chart to count the events that have the user field.
... sparkline(count(user))
The following example uses the timechart command to count the events where the action field contains the value
purchase.
Extended examples
Count the number of earthquakes that occurred for each magnitude range
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
1. Run the following search to calculate the number of earthquakes that occurred in each magnitude range. This
data set is comprised of events over a 30-day period.
♦ This search uses span=1 to define each of the ranges for the magnitude field, mag.
♦ The rename command is then used to rename the field to "Magnitude Range".
The results appear on the Statistics tab and look something like this:
Magnitude Range   Number of Earthquakes
-1-0              18
0-1               2088
1-2               3005
2-3               1026
3-4               194
4-5               452
5-6               109
6-7               11
7-8               3
Count the number of different page requests for each Web server
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
1. Run the following search to use the chart command to determine the number of different page requests, GET
and POST, that occurred for each Web server.
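A sketch of this search, assuming the tutorial's web access sourcetype:
sourcetype=access_* | stats count(eval(method="GET")) AS GET, count(eval(method="POST")) AS POST BY host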
This example uses eval expressions to specify the different field values for the stats command to count. The first
clause uses the count() function to count the Web access events that contain the method field value GET. Then,
using the AS keyword, the field that represents these results is renamed GET.
The second clause does the same for POST events. The counts of both types of events are then separated by
the web server, using the BY clause with the host field.
The results appear on the Statistics tab and look something like this:
distinct_count(X) or dc(X)
Description
Returns the count of distinct values of the field X. This function processes field values as strings. To use this function, you
can specify distinct_count(X), or the abbreviation dc(X).
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Basic examples
The following example removes duplicate results with the same "host" value and returns the total count of the remaining
results.
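A sketch of this search:
... | stats dc(host)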
The following example generates sparklines for the distinct count of devices and renames the field, "numdevices".
...sparkline(dc(device)) AS numdevices
The following example counts the distinct sources for each sourcetype, and buckets the count for each five minute spans.
...sparkline(dc(source),5m) BY sourcetype
Extended example
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range Yesterday when you run the search.
1. Run the following search to count the number of different customers who purchased something from the
Buttercup Games online store yesterday. The search organizes the count by the type of product (accessories,
t-shirts, and type of games) that customers purchased.
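A sketch of this search, assuming the tutorial's web access sourcetype:
sourcetype=access_* action=purchase | stats dc(clientip) BY categoryId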
The results appear on the Statistics tab and look something like this:
categoryId dc(clientip)
ACCESSORIES 37
ARCADE 58
NULL 8
SHOOTER 31
SIMULATION 34
SPORTS 13
STRATEGY 74
TEE 38
estdc(X)
Description
Returns the estimated count of the distinct values of the field X. This function processes field values as strings. The string
values 1.0 and 1 are considered distinct values and counted separately.
Usage
You can use this function with the chart, stats, timechart, and tstats commands.
By default, if the actual number of distinct values returned by a search is below 1000, the Splunk software does not
estimate the distinct value count for the search. It uses the actual distinct value count instead. This threshold is set by
the approx_dc_threshold setting in limits.conf.
Basic examples
The following example removes duplicate results with the same "host" value and returns the estimated total count of the
remaining results.
estdc(host)
6
The following example generates sparklines for the estimated distinct count of the devices field and renames the results
field, "numdevices".
...sparkline(estdc(device)) AS numdevices
The following example estimates the distinct count for the sources for each sourcetype. The results are displayed for each
five minute span in sparkline charts.
...sparkline(estdc(source),5m) BY sourcetype
estdc_error(X)
Description
Returns the theoretical error of the estimated count of the distinct values of the field X. The error represents a ratio of the
absolute_value(estimate_distinct_count - real_distinct_count)/real_distinct_count. This function processes field
values as strings.
Usage
You can use this function with the chart, stats, and timechart commands.
Basic examples
The following example determines the error ratio for the estimated distinct count of the "host" values.
exactperc<X>(Y)
Description
Returns the exact X-th percentile value of the numeric field Y.
Usage
You can use this function with the chart, stats, timechart, and tstats commands, and also with sparkline() charts.
The exactperc function provides the exact value, but is very resource expensive for high cardinality fields. The exactperc
function can consume a large amount of memory in the search head, which might impact how long it takes for a search to
complete.
Examples
max(X)
Description
Returns the maximum value of the field X. If the values of X are non-numeric, the maximum value is found using
lexicographical ordering.
Processes field values as numbers if possible, otherwise processes field values as strings.
Usage
You can use this function with the chart, mstats, stats, and timechart commands, and also with sparkline() charts.
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
Basic examples
Extended example
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
1. Search for earthquakes in and around California. Calculate the number of earthquakes that were recorded. Use
statistical functions to calculate the minimum, maximum, range (the difference between the min and max), and
average magnitudes of the recent earthquakes. List the values by magnitude type.
The results appear on the Statistics tab and look something like this:
mean(X)
Description
Returns the arithmetic mean of the field X. The mean values should be exactly the same as the values calculated using the avg() function.
Usage
You can use this function with the chart, mstats, stats, and timechart commands, and also with sparkline() charts.
Basic examples
Extended example
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
1. Run the following search to find the mean, standard deviation, and variance of the magnitudes of recent quakes
by magnitude type.
The results appear on the Statistics tab and look something like this:
median(X)
Description
Returns the middle-most value of the field X.
Usage
You can use this function with the chart, mstats, stats, and timechart commands.
If you have an even number of events, by default the median calculation is approximated to the higher of the two values.
This function is, by its nature, nondeterministic. This means that subsequent runs of a search using this function over
identical data can contain slight variances in their results.
If you require results that are more exact and consistent you can use exactperc50() instead. However, the
exactperc<X>(Y) function is very resource expensive for high cardinality fields. See perc<X>(Y).
Basic examples
Consider the following list of values, which counts the number of different customers who purchased something from the
Buttercup Games online store yesterday. The values are organized by the type of product (accessories, t-shirts, and type
of games) that customers purchased.
categoryId count
ACCESSORIES 37
ARCADE 58
NULL 8
SIMULATION 34
SPORTS 13
STRATEGY 74
TEE 38
When the list is sorted the median, or middle-most value, is 37.
categoryId count
NULL 8
SPORTS 13
SIMULATION 34
ACCESSORIES 37
TEE 38
ARCADE 58
STRATEGY 74
min(X)
Description
Returns the minimum value of the field X. If the values of X are non-numeric, the minimum value is found using
lexicographical ordering.
This function processes field values as numbers if possible, otherwise processes field values as strings.
Usage
You can use this function with the chart, mstats, stats, and timechart commands.
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
Basic examples
The following example returns the minimum size and maximum size of the HotBucketRoller component in the _internal
index.
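A sketch of this search:
index=_internal component=HotBucketRoller | stats min(size), max(size)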
The following example returns a list of processors and calculates the minimum cpu_seconds and the maximum
cpu_seconds.
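A sketch of this search:
index=_internal | chart min(cpu_seconds), max(cpu_seconds) BY processor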
Extended example
See the Extended example for the max() function. That example includes the min() function.
mode(X)
Description
Returns the most frequent value of the field X.
Usage
You can use this function with the chart, stats, and timechart commands.
Basic examples
The mode returns the most frequent value. Consider the following data:
David Mayer 45
Alex Garcia 29
Wei Zhang 45
Javier Garcia 37
When you search for the mode in the age field, the value 45 is returned.
...| stats mode(age)
You can also use mode with fields that contain string values. When you search for the mode in the surname field, the value
Garcia is returned.
mode(host)
www1
perc<X>(Y)
Description
The percentile functions return the X-th percentile value of the numeric field Y. You can think of this as an estimate of
where the top X% starts. For example, a 95th percentile says that 95% of the values in field Y are below the estimate and
5% of the values in field Y are above the estimate.
Valid values of X are floating point numbers between 0 and 100, such as 99.95.
There are three different percentile functions that you can use:
Function: perc<X>(Y) or the abbreviation p<X>(Y)
Description: Use the perc function to calculate an approximate threshold, such that of the values in field Y, X percent fall below the threshold. The perc function returns a single number that represents the lower end of the approximate values for the percentile requested.

Function: upperperc<X>(Y)
Description: When there are more than 1000 values, the upperperc function gives the approximate upper bound for the percentile requested. Otherwise the upperperc function returns the same percentile as the perc function.

Function: exactperc<X>(Y)
Description: The exactperc function provides the exact value, but is very resource expensive for high cardinality fields. The exactperc function can consume a large amount of memory, which might impact how long it takes for a search to complete.
The percentile functions process field values as strings.
The perc and upperperc functions are, by their nature, nondeterministic, which means that subsequent runs of
searches using these functions over identical data can contain slight variances in their results.
If you require exact and consistent results, you can use exactperc<X>(Y) instead.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands.
If there are less than 1000 distinct values, the Splunk percentile functions use the nearest rank algorithm. See
http://en.wikipedia.org/wiki/Percentile#Nearest_rank. Excel uses the NIST interpolated algorithm, which basically means
you can get a value for a percentile that does not exist in the actual data, which is not possible for the nearest rank
approach.
If there are more than 1000 distinct values for the field, the percentiles are approximated using a custom radix-tree
digest-based algorithm. This algorithm is much faster and uses much less memory, a constant amount, than an exact
computation, which uses memory in linear relation to the number of distinct values. By default this approach limits the
approximation error to < 1% of rank error. That means if you ask for 95th percentile, the number you get back is between
the 94th and 96th percentile.
You always get the exact percentiles, even for more than 1000 distinct values, by using the exactperc function instead of the perc function.
Basic examples
Extended example
Consider the following set of data, which shows the number of visitors for each hour a store is open:
hour visitors
0800 0
0900 212
1000 367
1100 489
1200 624
1300 609
1400 492
1500 513
1600 376
1700 337
This data resides in the visitor_count index. You can use the streamstats command to create a cumulative total for the
visitors.
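A sketch of such a search, assuming the hourly counts are events in the visitor_count index:
index=visitor_count | streamstats sum(visitors) AS total_visitors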
range(X)
Description
Returns the difference between the max and min values of the field X. The values of field X must be numeric.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Basic example
This example uses events that list the numeric sales for each product and quarter, for example:
The results appear on the Statistics tab and look something like this:
Extended example
See the Extended example for the max() function. That example includes the range() function.
stdev(X)
Description
Returns the sample standard deviation of the field X.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Basic examples
This example returns the standard deviation of wildcarded fields "*delay" which can apply to both, "delay" and "xdelay".
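A sketch of this search:
... | stats stdev(*delay)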
Extended example
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
1. Run the following search to find the mean, standard deviation, and variance of the magnitudes of recent quakes
by magnitude type.
The results appear on the Statistics tab and look something like this:
stdevp(X)
Description
Returns the population standard deviation of the field X.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Basic examples
Extended example
sum(X)
Description
Returns the sum of the values of the field X.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Basic examples
You can create totals for any numeric field. For example:
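A sketch of such a search:
... | stats sum(bytes)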
sum(bytes)
21502
total bytes
21502
07 6509
11 3726
15 6569
23 4698
sumsq(X)
Description
Returns the sum of the squares of the values of the field X.
The sum of the squares is used to evaluate the variance of a dataset from the dataset mean. A large sum of the squares
indicates a large variance, which tells you that individual values fluctuate widely from the mean.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Basic examples
The following table contains the temperatures taken every day at 8 AM for a week.
You calculate the mean of the these temperatures and get 48.9 degrees. To calculate the deviation from the mean for
each day, take the temperature and subtract the mean. If you square each number, you get results like this:
You can calculate the mean, sum of the squares, and standard deviation with a few statistical functions:
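A sketch of this search, assuming the temperatures are in a field named temp:
... | stats mean(temp), sumsq(temp), stdev(temp)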
mean(temp) sumsq(temp) stdev(temp)
48.857142857142854 17664 12.615183595289349
upperperc<X>(Y)
Description
Returns an approximate percentile value, based on the requested percentile X of the numeric field Y.
When there are more than 1000 values, the upperperc function gives the approximate upper bound for the percentile
requested. Otherwise the upperperc function returns the same percentile as the perc function.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Examples
var(X)
Description
Returns the sample variance of the field X.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Example
See the Extended example for the mean() function. That example includes the var() function.
varp(X)
Description
Returns the population variance of the field X.
Usage
You can use this function with the chart, mstats, stats, timechart, and tstats commands, and also with sparkline() charts.
Basic examples
Event order functions
Use the event order functions to return values from fields based on the order in which the event is processed, which is not
necessarily chronological or timestamp order.
The following table lists the timestamps from a set of events returned from a search. This table identifies which event is
returned when you use the first and last event order functions, and compares them with the earliest and latest
functions, which you can read more about at Time functions.
_time                 Event order function   Description
2020-04-28 00:15:05   first                  This event is the first event in the search results. But this event is not chronologically the earliest event.
2020-05-01 00:15:04
2020-04-30 00:15:02
2020-04-28 00:15:01
2020-05-01 00:15:05   latest                 This event is chronologically the latest event in the search results.
2020-04-27 00:15:01   earliest, last         This event is both the chronologically earliest event and the last event in the search results.
See Overview of statistical and charting functions.
first(X)
Description
Returns the first seen value of the field X. The first seen value of the field is the most recent instance of this field, based
on the order in which the events are seen by the stats command. The order in which the events are seen is not
necessarily chronological order.
Usage
• To locate the first value based on time order, use the earliest function instead.
• Works best when the search includes the sort command immediately before the statistics or charting command.
• This function processes field values as strings.
• You can use the first(X) function with the chart, stats, and timechart commands.
Basic example
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
You run the following search to locate invalid user login attempts against a specific sshd (Secure Shell Daemon). You use
the table command to see the values in the _time, source, and _raw fields.
sourcetype=secure invalid user "sshd[5258]" | table _time source _raw
The results appear on the Statistics tab and look something like this:
sourcetype=secure invalid user "sshd[5258]" | table _time source _raw | stats first(_raw)
The search returns the value for _raw field with the timestamp 2020-04-28 00:15:05, which is the first event in the original
list of values returned.
first(_raw)
Mon Apr 28 2020 00:15:05 mailsv1 sshd[5258]: Failed password for invalid user tomcat from 67.170.226.218 port 1490
ssh2
Extended example
The Basic example uses the _raw field to show how the first function works. That's useful because the _raw field
contains a timestamp. However, you can use the first function on any field.
Let's start by creating some results. You can use the makeresults command to create a series of results to test your
search syntax. Include the streamstats command to count your results:
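A sketch of this starting search:
| makeresults count=5 | streamstats count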
_time count
2020-05-09 14:35:58 1
2020-05-09 14:35:58 2
2020-05-09 14:35:58 3
2020-05-09 14:35:58 4
2020-05-09 14:35:58 5
With the count field, you can create different dates in the _time field, using the eval command.
Use 3600, the number of seconds in an hour, to create a series of hours. The calculation multiplies the value in the count
field by the number of seconds in an hour. The result is subtracted from the original _time field to get new dates
equivalent to 1 hour ago, 2 hours ago, and so forth.
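A sketch of this step:
| makeresults count=5 | streamstats count | eval _time = _time - (count * 3600)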
_time count
2020-05-09 13:45:24 1
2020-05-09 12:45:24 2
2020-05-09 11:45:24 3
2020-05-09 10:45:24 4
2020-05-09 09:45:24 5
The hours in the results begin 1 hour earlier than the original date, 2020-05-09 at 14:24. The minutes and
seconds are slightly different because the date is refreshed each time you run the search.
Use the eval command to add a field to your search with values in descending order:
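A sketch of this step, which produces the descending values shown below:
| makeresults count=5 | streamstats count | eval _time = _time - (count * 3600), field1 = 20 - count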
_time                 count   field1
2020-05-09 14:45:24   1       19
2020-05-09 13:45:24   2       18
2020-05-09 12:45:24   3       17
2020-05-09 11:45:24   4       16
2020-05-09 10:45:24   5       15
As you can see from the results, the first result contains the highest number in field1. This shows the order in which the
results were processed. The first result was processed first (20-1=19) followed by the remaining results in order.
When you add the first function to the search, the only value returned is the value in the field you specify:
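A sketch of the full search:
| makeresults count=5 | streamstats count | eval _time = _time - (count * 3600), field1 = 20 - count | stats first(field1)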
first(field1)
19
last(X)
Description
Returns the last seen value of the field X. The last seen value of the field is the oldest instance of this field, based on the
order in which the events are seen by the stats command. The order in which the events are seen is not necessarily
chronological order.
Usage
• To locate the last value based on time order, use the latest function instead.
• Works best when the search includes the sort command immediately before the statistics or charting command.
• This function processes field values as strings.
You can use the last(X) function with the chart, stats, and timechart commands.
Basic example
The following example returns the last "log_level" value for each distinct "sourcetype".
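A sketch of this search:
... | stats last(log_level) BY sourcetype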
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
You run the following search to locate invalid user login attempts against a specific sshd (Secure Shell Daemon). You use
the table command to see the values in the _time, source, and _raw fields.
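The table view comes from the same search used in the first(X) example:
sourcetype=secure invalid user "sshd[5258]" | table _time source _raw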
The results appear on the Statistics tab and look something like this:
sourcetype=secure invalid user "sshd[5258]" | table _time source _raw | stats last(_raw)
The search returns the event with the _time value 2020-04-27 00:15:01, which is the last event in the list of events.
However it is not the last chronological event.
Extended example
The Basic example uses the _raw field to show how the last function works. That's useful because the _raw field contains
a timestamp. However, you can use the last function on any field.
Let's start by creating some results. You can use the makeresults command to create a series of results to test your
search syntax. Include the streamstats command to count your results:
_time count
2020-05-09 14:35:58 1
2020-05-09 14:35:58 2
2020-05-09 14:35:58 3
2020-05-09 14:35:58 4
2020-05-09 14:35:58 5
With the count field, you can create different dates in the _time field, using the eval command.
Use 86400, the number of seconds in a day, to create a series of days. The calculation multiplies the value in the count
field by the number of seconds in a day. The result is subtracted from the original _time field to get new dates equivalent
to 1 day ago, 2 days ago, and so forth.
_time count
2020-05-08 14:45:24 1
2020-05-07 14:45:24 2
2020-05-06 14:45:24 3
2020-05-05 14:45:24 4
2020-05-04 14:45:24 5
The dates in the results begin 1 day earlier than the original date, 2020-05-09 at 14:45:24. The minutes and
seconds are slightly different because the date is refreshed each time you run the search.
Use the eval command to add a field to your search with values in descending order:
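A sketch of this step, which produces the descending values shown below:
| makeresults count=5 | streamstats count | eval _time = _time - (count * 86400), field1 = 20 - count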
_time                 count   field1
2020-05-08 14:45:24   1       19
2020-05-07 14:45:24   2       18
2020-05-06 14:45:24   3       17
2020-05-05 14:45:24   4       16
2020-05-04 14:45:24   5       15
As you can see from the results, the last result contains the lowest number in field1. This shows the order in which the
results were processed. The fifth result was processed last (20-5=15) after all of the other results.
When you add the last function to the search, the only value returned is the value in the field you specify:
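A sketch of the full search:
| makeresults count=5 | streamstats count | eval _time = _time - (count * 86400), field1 = 20 - count | stats last(field1)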
last(field1)
15
See also
Commands
eval
makeresults
list(X)
Description
Returns a list of up to 100 values of the field X as a multivalue entry. The order of the values reflects the order of input
events.
Usage
• If more than 100 values are in field X, only the first 100 are returned.
• This function processes field values as strings.
• You can use the list(X) function with the chart, stats, and timechart commands.
Basic example
To illustrate what the list function does, let's start by generating a few simple results.
1. Use the makeresults and streamstats commands to generate a set of results that are simply timestamps and a
count of the results which are used as row numbers.
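A sketch of this step:
| makeresults count=5 | streamstats count AS rowNumber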
The results appear on the Statistics tab and look something like this:
_time rowNumber
2018-04-02 20:27:11 1
2018-04-02 20:27:11 2
2018-04-02 20:27:11 3
2018-04-02 20:27:11 4
2018-04-02 20:27:11 5
Notice that each result appears on a separate row.
2. Add the stats command with the list function to the search. The numbers are returned in ascending order in a
single, multivalue result.
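A sketch of the full search:
| makeresults count=5 | streamstats count AS rowNumber | stats list(rowNumber) AS numbers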
The results appear on the Statistics tab and look something like this:
numbers
1
2
3
4
5
Notice that it is a single result. There are no alternating row background colors.
3. Compare this result with the results returned by the values function.
values(X)
Description
Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.
Usage
• By default there is no limit to the number of values returned. Users with the appropriate permissions can specify a
limit in the limits.conf file. You specify the limit in the [stats | sistats] stanza using the maxvalues setting.
• This function processes field values as strings.
• You can use the values(X) function with the chart, stats, timechart, and tstats commands.
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
Basic example
To illustrate what the values function does, let's start by generating a few simple results.
1. Use the makeresults and streamstats commands to generate a set of results that are simply timestamps and a
count of the results, which are used as row numbers.
The results appear on the Statistics tab and look something like this:
_time rowNumber
2018-04-02 20:27:11 1
2018-04-02 20:27:11 2
2018-04-02 20:27:11 3
2018-04-02 20:27:11 4
2018-04-02 20:27:11 5
Notice that each result appears on a separate row.
2. Add the stats command with the values function to the search. The results are returned in lexicographical order.
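A sketch of such a search; the larger count value is assumed here so that the lexicographical ordering is visible:
| makeresults count=1000 | streamstats count AS rowNumber | stats values(rowNumber) AS numbers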
The results appear on the Statistics tab and look something like this:
numbers
1
10
100
1000
101
102
103
104
105
106
107
108
109
11
110
Notice that it is a single result. There are no alternating row background colors.
3. Compare these results with the results returned by the list function.
Time functions
earliest(X)
Description
Returns the chronologically earliest seen occurrence of a value of the field X.
Usage
Basic example
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
You run the following search to locate invalid user login attempts against an sshd (Secure Shell Daemon). You use the
table command to see the values in the _time, source, and _raw fields.
sourcetype=secure invalid user "sshd[5258]" | table _time source _raw | stats earliest(_raw)
The results appear on the Statistics tab and look something like this:
The search returns the event with the _time value 2018-04-27 00:15:01, which is the event with the oldest timestamp.
earliest_time(X)
Description
Returns the UNIX time of the chronologically earliest-seen occurrence of a given field value.
Usage
• If you have metrics data, you can use earliest_time(x) in conjunction with earliest(x), latest(x), and
latest_time(x) to calculate the rate of increase for a counter. Alternatively you can use the rate(x) counter to do
the same thing.
Basic example
The following search runs against metric data. It is designed to return the earliest UNIX time values for every minute for
each metric_name that begins with deploy.
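A sketch of such a search; the index name is a placeholder and the time range is assumed to come from the time range picker:
| mstats earliest_time(deploy*) WHERE index=mymetricdata span=1m BY metric_name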
The results appear on the Statistics tab and look something like this:
latest(X)
Description
Returns the chronologically latest seen occurrence of a value of the field X.
Usage
Basic example
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
You run the following search to locate invalid user login attempts against a specific sshd (Secure Shell Daemon). You use
the table command to see the values in the _time, source, and _raw fields.
sourcetype=secure invalid user "sshd[5258]" | table _time source _raw | stats latest(_raw)
The results appear on the Statistics tab and look something like this:
The search returns the event with the _time value 2018-05-01 00:15:05, which is the event with the most recent
timestamp.
latest_time(X)
Description
Returns the UNIX time of the chronologically latest-seen occurrence of a given field value.
Usage
• If you have metrics data, you can use latest_time(x) in conjunction with earliest(x), latest(x), and
earliest_time(x) to calculate the rate of increase for a counter. Alternatively, you can use the rate(x) counter to
do the same thing.
Basic example
The following search runs against metric data. It is designed to return the latest UNIX time values in the past 60 minutes
for metrics with names that begin with queue..
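A sketch of such a search; the index name is a placeholder and the time range is assumed to be set to the last 60 minutes:
| mstats latest_time(queue.*) WHERE index=mymetricdata span=1m BY metric_name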
The results appear on the Statistics tab and look something like this:
per_day(X)
Description
Usage
• You can use the per_day(X) function with the timechart command.
Basic examples
The following example returns the values for the field total for each day.
... | timechart per_day(total)
The following example returns the results of the eval expression eval(method="GET"), renaming the result field to Views.
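For example, a search along these lines:
... | timechart per_day(eval(method="GET")) AS Views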
Extended example
This example uses the sample dataset from the Search Tutorial but should work with any format of Apache Web access log. Download the data
set from this topic in the Search Tutorial and follow the instructions to upload it to your Splunk deployment.
This search uses the per_day() function and eval expressions to determine how many times the web pages were viewed
and how many times items were purchased. The results appear on the Statistics tab.
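A sketch of such a search, assuming the Search Tutorial field names:
sourcetype=access_* | timechart per_day(eval(method="GET")) AS Views, per_day(eval(action="purchase")) AS Purchases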
To determine the number of Views and Purchases for each hour, minute, or second, you can add the other time functions
to the search. For example:
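For instance, a per-hour variant of the same sketch:
sourcetype=access_* | timechart per_hour(eval(method="GET")) AS Views, per_hour(eval(action="purchase")) AS Purchases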
Use the field format option to change the number formatting for the field values.
per_hour(X)
Description
Usage
• You can use the per_hour(X) function with the timechart command.
Basic examples
The following example returns the values for the field total for each hour.
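For example, by analogy with the per_day example above:
... | timechart per_hour(total)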
The following example returns the results of the eval expression eval(method="POST"), renaming the result field to Views.
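For example, a search along these lines:
... | timechart per_hour(eval(method="POST")) AS Views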
per_minute(X)
Description
Usage
• You can use the per_minute(X) function with the timechart command.
Basic examples
The following example returns the values for the field total for each minute.
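For example, by analogy with the per_day example:
... | timechart per_minute(total)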
The following example returns the results of the eval expression eval(method="GET"), renaming the result field to Views.
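For example, a search along these lines:
... | timechart per_minute(eval(method="GET")) AS Views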
per_second(X)
Description
Usage
• You can use the per_second(X) function with the timechart command.
Basic examples
The following example returns the values for the field kb for each second.
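For example, by analogy with the per_day example:
... | timechart per_second(kb)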
rate(X)
Description
Returns the per-second rate change of the value of the field. Represents (latest(X) - earliest(X)) / (latest_time(X) -
earliest_time(X)). It also handles the largest value reset if there is at least one reset.
Usage
• You can use the rate(X) function with the mstats, stats, and tstats commands.
• Provides the per-second rate change for an accumulating counter metric. Accumulating counter metrics report
the total counter value since the last counter reset. See Investigate counter metrics in Metrics.
• Requires the earliest(X) and latest(X) values of the field to be numerical, and the earliest_time(X) and
latest_time(X) values to be different.
• Requires at least two metric data points in the search time range.
• Should be used to provide rate information about single, rather than multiple, counters.
Basic example
The following search runs against metric data. It provides the hourly hit rate for a metric that provides measurements of
incoming web traffic. It uses the processor filter to ensure that it is not reporting on multiple metric series (name and
processor combinations).
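A sketch of such a search; the index name and the processor value are placeholders:
| mstats rate(traffic.incoming) WHERE index=mymetricdata AND processor=main span=1h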
The resulting chart shows you that the counter hit rate for the traffic.incoming metric spiked at 1 pm, 4 pm, and 11 am,
but otherwise remained stable.
rate_avg(X)
Description
Computes the per metric time series rates for an accumulating counter metric. Returns the averages of those rates.
For a detailed explanation of metric time series, see Perform statistical calculations on metric time series in Metrics.
Usage
• You can use the rate_avg(X) function with the mstats command.
• When you calculate the average rates for accumulating counter metrics, the cleanest way to do it is to split the
counter metric rate calculations out by metric time series and then compute the average rate across all of the
metric time series.
• Unlike rate(X), the rate_avg(X) function can calculate rates even when there is only a single metric data point
per time series per timespan. It can pull in data across timespans to calculate rates when necessary.
• The rate_avg(X) function does not support prestats=true. It needs the final list of dimensions to split by.
Basic example
In your _metrics index, you have data for the metric spl.intr.resource_usage.PerProcess.data.elapsed. This is an
accumulating counter metric. It contains a number of metric time series.
The following example search uses the rate_avg function to calculate the rate(X) for each
spl.mlog.thruput.thruput.total_k_processed time series in the time range. Then it gets the average rate across all of
the time series. Lastly, it splits the results by time, so they can be plotted on a chart.
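A sketch of such a search; the span value is assumed:
| mstats rate_avg(spl.mlog.thruput.thruput.total_k_processed) WHERE index=_metrics span=1m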
rate_sum(X)
Description
Computes the per metric time series rates for an accumulating counter metric. Returns the aggregate of those rates.
For a detailed explanation of metric time series, see Perform statistical calculations on metric time series in Metrics.
Usage
• You can use the rate_sum(X) function with the mstats command.
• When you calculate the aggregated rates for accumulating counter metrics, the cleanest way to do it is to split the
counter metric rate calculations out by metric time series and then compute the aggregate rate across all of the
metric time series.
• Unlike rate(X), the rate_sum(X) function can calculate rates even when there is only a single metric data point
per time series per timespan. It can pull in data across timespans to calculate rates when necessary.
• The rate_sum(X) function does not support prestats=true. It needs the final list of dimensions to split by.
Basic example
In your _metrics index, you have data for the metric spl.intr.resource_usage.PerProcess.data.elapsed. This is an
accumulating counter metric. It contains a number of metric time series.
The following example search uses the rate_sum function to calculate the rate(X) for each
spl.mlog.thruput.thruput.total_k_processed time series in the time range. Then it gets the aggregate rate across all of
the time series. Lastly, it splits the results by time, so they can be plotted on a chart.
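A sketch of such a search; the span value is assumed:
| mstats rate_sum(spl.mlog.thruput.thruput.total_k_processed) WHERE index=_metrics span=1m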
Time Format Variables and Modifiers
Additionally, you can use the relative_time() and now() time functions as arguments.
For more information about working with dates and time, see Time modifiers for search and About searching with time in
the Search Manual.
Refer to the list of tz database time zones for all permissible time zone values. For more information about how the Splunk
software determines a time zone and the tz database, see Specify time zones for timestamps in Getting Data In.
Subsecond time variables such as %N and %Q can be used in metrics searches of metrics indexes that are enabled
for millisecond timestamp resolution.
For more information about enabling metrics indexes to index metric data points with millisecond timestamp precision:
• For Splunk Cloud Platform, see Manage Splunk Cloud Platform indexes in the Splunk Cloud Platform Admin
Manual.
• For Splunk Enterprise, see Create custom indexes in Managing indexers and clusters of indexers.
Date and time variables
Variable Description
%c The date and time in the current locale's format as defined by the server's operating system. For example, Thu Jul 18 09:30:00 2019 for US English on Linux.
%+ The date and time with time zone in the current locale's format as defined by the server's operating system. For example, Thu Jul 18 09:30:00 PDT 2019 for US English on Linux.
Time variables
Variable Description
%Ez Splunk-specific, timezone in minutes.
%H Hour (24-hour clock) as a decimal number. Hours are represented by the values 00 to 23. Leading zeros are accepted but not required.
%I Uppercase "i". Hour (12-hour clock) with the hours represented by the values 01 to 12. Leading zeros are accepted but not required. Use with %p to specify AM or PM for the 12-hour clock.
%k Like %H, the hour (24-hour clock) as a decimal number. Leading zeros are replaced by a space, for example 0 to 23.
%M Minute as a decimal number. Minutes are represented by the values 00 to 59. Leading zeros are accepted but not
required.
%N The number of subsecond digits. The default is %9N. You can specify %3N = milliseconds, %6N = microseconds, %9N = nanoseconds.
%p AM or PM. Use with %I to specify the 12-hour clock for AM or PM. Do not use with %H.
%Q The subsecond component of a UTC timestamp. The default is milliseconds, %3Q. Valid values are %3Q = milliseconds, %6Q = microseconds, and %9Q = nanoseconds.
%s The UNIX Epoch Time timestamp, or the number of seconds since the Epoch: 1970-01-01 00:00:00 +0000 (UTC). For example, the UNIX epoch time 1484993700 is equal to Sat Jan 21 10:15:00 2017.
%X The time in the format for the current locale. For US English the format for 9:30 AM is 9:30:00.
%Z The timezone abbreviation. For example EST for US Eastern Standard Time.
%z The timezone offset from UTC, in hour and minute: +hhmm or -hhmm. For example, for 5 hours before UTC the value is -0500, which is US Eastern Standard Time.
Examples:
• Use %z to specify hour and minute, for example -0500
• Use %:z to specify hour and minute separated by a colon, for example -05:00
• Use %::z to specify hour, minute, and second separated with colons, for example -05:00:00
• Use %:::z to specify hour only, for example -05
Date variables
Variable Description
%F Equivalent to %Y-%m-%d (the ISO 8601 date format).
%x The date in the format of the current locale. For example, 7/13/2019 for US English.
Specifying days and weeks
Variable Description
%A Full weekday name. (Sunday, ..., Saturday)
%d Day of the month as a decimal number, includes a leading zero. (01 to 31)
%e Like %d, the day of the month as a decimal number, but a leading zero is replaced by a space. (1 to 31)
%V (or %U) Week of the year. The %V variable starts the count at 1, which is the most common start number. The %U variable starts the count at 0.
Specifying months
Variable Description
%b Abbreviated month name. (Jan, Feb, etc.)
%m Month as a decimal number. (01 to 12). Leading zeros are accepted but not required.
Specifying year
Variable Description
%y Year as a decimal number, without the century. (00 to 99). Leading zeros are accepted but not required.
Examples
The following table shows the results of some time format strings, if today is Jun 1st 2021:
%y-%m-%d 21-06-01
%s 1622509200.000000
The following table shows the results of some time format strings, if today is May 31st 2021 at 09:45 and 22 seconds AM
in the US Pacific daylight savings timezone (PDT):
%Y-%m-%dT%T 2021-05-31T09:45:22
%s 1622454322.000000
The following table shows the results of some time format strings, if today is May 31st 2021 at 14:45 and 37 seconds PM
in the US Pacific daylight savings timezone (PDT):
%Y-%m-%dT%H:%M:%S.%QZ 2021-05-31T14:45:37.798Z
%s 1622472337.000000
The following table shows the results of searches that use time variables:
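As an illustration, the following sketch formats the current time with some of the variables described above:
| makeresults | eval formatted=strftime(_time, "%Y-%m-%dT%H:%M:%S.%QZ")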
Time modifiers
Use time modifiers to customize the time range of a search or change the format of the timestamps in the search results.
When an event is processed by Splunk software, its timestamp is saved as the default field _time. This timestamp, which
is the time when the event occurred, is saved in UNIX time notation. Searching with relative time modifiers, earliest or
latest, finds every event with a timestamp beginning, ending, or between the specified timestamps.
For example, when you search for earliest=@d, the search finds every event with a _time value since midnight. This
example uses @d, which is a date format variable. See Date and time format variables.
When you use a time modifier in the SPL syntax, that time overrides the time specified in the Time Range Picker.
For example, suppose your search uses yesterday in the Time Range Picker. You add the time modifier earliest=-2d to
your search syntax. The search uses the time specified in the time modifier and ignores the time in the Time Range
Picker. Because the search does not specify the latest time modifier, the default value now is used for latest.
For more information, see Specify time modifiers in your search in the Search Manual.
Time ranges selected from the Time Range Picker apply to the base search and to subsearches.
However, time ranges specified directly in the base search do not apply to subsearches. Likewise, a time range specified
directly in a subsearch applies only to that subsearch. The time range does not apply to the base search or any other
subsearch.
For example, if the Time Range Picker is set to Last 7 days and a subsearch contains earliest=2d@d, then the earliest
time modifier applies only to the subsearch and Last 7 days applies to the base search.
You also have the option of searching for events based on when they were indexed. The UNIX time is saved in the
_indextime field. Similar to earliest and latest for the _time field, you can use the relative time modifiers
_index_earliest and _index_latest to search for events based on _indextime. For example, if you wanted to search for
events indexed in the previous hour, use: _index_earliest=-h@h _index_latest=@h.
When using index-time based modifiers such as _index_earliest and _index_latest, your search must also have an
event-time window which will retrieve the events. In other words, chunks of events might be ruled out based on the
non-index-time window as well as the index-time window. To be certain of retrieving every event based on index-time, you
must run your search using All Time.
Use the earliest and latest modifiers to specify custom and relative time ranges. You can specify an exact time such as
earliest="10/5/2016:20:00:00", or a relative time such as earliest=-h or latest=@w6.
When specifying relative time, you can use the now modifier to refer to the current time.
Modifier Syntax Description
earliest earliest=[+|-]<time_integer><time_unit>@<time_unit> Specify the earliest _time for the time range of your search. Use earliest=1 to specify the UNIX epoch time 1, which is UTC January 1, 1970 at 12:00:01 AM. Use earliest=0 to specify the earliest event in your data.
_index_earliest _index_earliest=[+|-]<time_integer><time_unit>@<time_unit> Specify the earliest _indextime for the time range of your search.
_index_latest _index_latest=[+|-]<time_integer><time_unit>@<time_unit> Specify the latest _indextime for the time range of your search.
latest latest=[+|-]<time_integer><time_unit>@<time_unit> Specify the latest time for the _time range of your search.
now now() or now Refers to the current time. If set to earliest, now() is the start of the search.
You can define the relative time in your search with a string of characters that indicate time amount (integer and unit). You
can also specify a "snap to" time unit, which is specified with the @ symbol followed by a time unit.
Begin your string with a plus (+) or minus (-) to indicate the offset from the current time.
Define your time amount with a number and a unit. The supported time units are listed in the following table.
earliest=-h
or
earliest=-60m
When specifying single time amounts, the number one is implied. An 's' is the same as '1s', 'm' is the same as '1m', 'h' is
the same as '1h', and so forth.
Subsecond timescales such as ms can be used in metrics searches only when they are searching over metrics indexes
that are enabled for millisecond timestamp resolution.
For more information about enabling metrics indexes to index metric data points with millisecond timestamp precision:
• For Splunk Cloud Platform, see Manage Splunk Cloud Platform indexes in the Splunk Cloud Platform Admin
Manual.
• For Splunk Enterprise, see Create custom indexes in Managing indexers and clusters of indexers.
You can specify a snap to time unit. The time unit indicates the nearest or latest time to which your time amount rounds
down. Separate the time amount from the "snap to" time unit with an "@" character.
• You can use any of the time units listed previously. For example:
♦ @w, @week, and @w0 for Sunday
♦ @month for the beginning of the month
♦ @q, @qtr, or @quarter for the beginning of the most recent quarter (Jan 1, Apr 1, Jul 1, or Oct 1).
• You can specify a day of the week: w0 (Sunday), w1, w2, w3, w4, w5 and w6 (Saturday). For Sunday, you can
specify w0 or w7.
• You can also specify offsets from the snap-to-time or "chain" together the time modifiers for more specific
relative time definitions. For example, @d-2h snaps to the beginning of today (12 AM or midnight), and then
applies the time offset of -2h. This results in a time of 10 PM yesterday.
♦ The Splunk platform always applies the offset before it applies the snap. In other words, the left-hand side
of the @ symbol is applied before the right-hand side.
• When snapping to the nearest or latest time, Splunk software always snaps backwards or rounds down to the
latest time not after the specified time. For example, if it is 11:59:00 and you "snap to" hours, you will snap to
11:00 not 12:00.
• If you do not specify a time offset before the "snap to" amount, Splunk software interprets the time as "current
time snapped to" the specified amount. For example, if it is currently 11:59 PM on Friday and you use @w6 to
"snap to Saturday", the resulting time is the previous Saturday at 12:01 A.M.
Examples
If you want to search events from the start of UNIX time, use earliest=1.
When earliest=1 and latest=now() are used, the search runs over all time.
...earliest=1 latest=now()
To return future events, specify latest=<a_big_number>. Future events are events that contain timestamps later than the
current time now().
earliest=@w0
earliest=-5d@w1 latest=@w6
With a boundary such as from November 15 at 8 PM to November 22 at 8 PM, use the timeformat %m/%d/%Y:%H:%M:%S.
earliest="11/15/2017:20:00:00" latest="11/22/2017:20:00:00"
You can specify multiple time windows using the timeformat %m/%d/%Y:%H:%M:%S. For example, to find events from 5-6 PM
or 7-8 PM on specific dates, use the following syntax.
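A sketch of that syntax; the dates and hours here are placeholders:
(earliest="11/15/2017:17:00:00" latest="11/15/2017:18:00:00") OR (earliest="11/15/2017:19:00:00" latest="11/15/2017:20:00:00")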
You can specify multiple time windows using the time modifiers and snap-to with a relative time. For example, to find
events for the last 24 hours but omit the events from Midnight to 1:00 A.M., use the following syntax:
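One way to express that, as a sketch:
(earliest=-24h latest=@d) OR (earliest=@d+1h latest=now)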
The following search time modifiers are still valid, but might be removed and their function no longer supported in a
future release.
Modifier Syntax Description
enddaysago enddaysago=<int> Set an end time for an integer number of days before Now.
endhoursago endhoursago=<int> Set an end time for an integer number of hours before Now.
endminutesago endminutesago=<int> Set an end time for an integer number of minutes before Now.
endmonthsago endmonthsago=<int> Set an end time for an integer number of months before Now.
endtime endtime=<string> Search for events before the specified time (exclusive of the specified time). Use timeformat to specify how the timestamp is formatted.
endtimeu endtimeu=<int> Search for events before the specific UNIX time.
hoursago hoursago=<int> Search events within the last integer number of hours.
minutesago minutesago=<int> Search events within the last integer number of minutes.
monthsago monthsago=<int> Search events within the last integer number of months.
startdaysago startdaysago=<int> Search the specified number of days before the present time.
starthoursago starthoursago=<int> Search the specified number of hours before the present time.
startminutesago startminutesago=<int> Search the specified number of minutes before the present time.
startmonthsago startmonthsago=<int> Search the specified number of months before the present time.
starttime starttime=<timestamp> Search from the specified date and time to the present, inclusive of the specified time.
starttimeu starttimeu=<int> Search for events starting from the specific UNIX time.
timeformat timeformat=<string> Set the timeformat for the starttime and endtime modifiers. By default: timeformat=%m/%d/%Y:%H:%M:%S
Search Commands
abstract
Description
Produces an abstract, a summary or brief representation, of the text of the search results. The original text is replaced by
the summary.
The abstract is produced by a scoring mechanism. If an event is larger than the selected maxlines, lines with more
search terms and more terms on adjacent lines are preferred over lines with fewer terms. If a line has a search term, its
neighboring lines also partially match, and might be returned to provide context. When there are gaps between the
selected lines, lines are prefixed with an ellipsis (...).
If the text of an event has fewer lines or an equal number of lines as maxlines, no change occurs.
Syntax
abstract [maxterms=<int>] [maxlines=<int>]
Optional arguments
maxterms
Syntax: maxterms=<int>
Description: The maximum number of terms to match. Accepted values are 1 to 1000.
maxlines
Syntax: maxlines=<int>
Description: The maximum number of lines to match. Accepted values are 1 to 500.
Examples
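For example, a minimal sketch that limits each result to a five-line summary (the maxlines value is illustrative):
... | abstract maxlines=5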
See also
highlight
accum
Description
For each event where field is a number, the accum command calculates a running total or sum of the numbers. The
accumulated sum can be returned to either the same field, or a new field that you specify.
Syntax
accum <field> [AS <newfield>]
Required arguments
field
Syntax: <string>
Description: The name of the field that you want to calculate the accumulated sum for. The field must contain
numeric values.
Optional arguments
newfield
Syntax: <string>
Description: The name of a new field where you want the results placed.
Basic example
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
The following search looks for events from web access log files that were successful views of strategy games. A count of
the events by each product ID is returned.
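A sketch of such a search, using the Search Tutorial field names:
sourcetype=access_* status=200 categoryId=STRATEGY | chart count AS views BY productId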
The results appear on the Statistics tab and look something like this:
productId views
DB-SG-G01 1796
DC-SG-G02 1642
FS-SG-G03 1482
PZ-SG-G05 1300
You can use the accum command to generate a running total of the views and display the running total in a new field called
"TotalViews".
The results appear on the Statistics tab and look something like this:
productId views TotalViews
DB-SG-G01 1796 1796
DC-SG-G02 1642 3438
FS-SG-G03 1482 4920
PZ-SG-G05 1300 6220
See also
addcoltotals
Description
The addcoltotals command appends a new result to the end of the search result set. The result contains the sum of each
numeric field or you can specify which fields to summarize. Results are displayed on the Statistics tab. If the labelfield
argument is specified, a column is added to the statistical results table with the name specified.
Syntax
addcoltotals [labelfield=<field>] [label=<string>] [<wc-field-list>]
Optional arguments
<wc-field-list>
Syntax: <field> ...
Description: A space delimited list of valid field names. The addcoltotals command calculates the sum only for
the fields in the list you specify. You can use the asterisk ( * ) as a wildcard to specify a list of fields with similar
names. For example, if you want to specify all fields that start with "value", you can use a wildcard such as
value*.
Default: Calculates the sum for all of the fields.
labelfield
Syntax: labelfield=<fieldname>
Description: Specify a field name to add to the result set.
Default: none
label
Syntax: label=<string>
Description: Used with the labelfield argument to add a label in the summary event. If the labelfield
argument is absent, the label argument has no effect.
Default: Total
Basic examples
1. Compute the sums of all the fields, and put the sums in a summary event called "change_name".
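A sketch of such a search:
... | addcoltotals labelfield=change_name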
2. Add a column total for two specific fields
3. Create the totals for a field that match a field name pattern
Filter fields for two name-patterns, and get totals for one of them.
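A sketch of such a search; the field name patterns are illustrative:
... | fields user* *size | addcoltotals *size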
Extended example
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
The following search looks for events from web access log files that were successful views of strategy games. A count of
the events by each product ID is returned.
The results appear on the Statistics tab and look something like this:
productId views
DB-SG-G01 1796
DC-SG-G02 1642
FS-SG-G03 1482
PZ-SG-G05 1300
You can use the addcoltotals command to generate a total of the views and display the total at the bottom of the column.
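For example, adding addcoltotals to the earlier sketch:
... | chart count AS views BY productId | addcoltotals views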
The results appear on the Statistics tab and look something like this:
productId views
DB-SG-G01 1796
DC-SG-G02 1642
FS-SG-G03 1482
productId views
PZ-SG-G05 1300
6220
You can add a field to the results that labels the total.
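One way to do this, as a sketch; the labelfield name here is illustrative and the label defaults to Total:
... | chart count AS views BY productId | addcoltotals views labelfield=label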
The results appear on the Statistics tab and look something like this:
DC-SG-G02 1642
FS-SG-G03 1482
PZ-SG-G05 1300
6220 Total
See also
Commands
addtotals
stats
addinfo
Description
Adds fields to each event that contain global, common information about the search. This command is primarily an
internally-used component of Summary Indexing.
Syntax
addinfo
The following fields are added to each event when you use the addinfo command.
Field Description
info_min_time The earliest time boundary for the search.
info_max_time The latest time boundary for the search.
info_search_time The time when the search was run.
info_sid The ID of the search.
Usage
Examples
... | addinfo
You can use this example to track heartbeats from hosts, forwarders, tcpin_connections on indexers, or any number of
system components. This example uses hosts.
You have a list of host names in a lookup file called expected_hosts. You want to search for heartbeats from your hosts
that are after an expected time range. You use the addinfo command to add information to each event that will help you
evaluate the time range.
... | stats latest(_time) AS latest_time BY host | addinfo | eval latest_age = info_max_time - latest_time |
fields - info_* | inputlookup append=t expected_hosts | fillnull value=9999 latest_age | dedup host | where
latest_age > 42
Use the stats command to calculate the latest heartbeat by host. The addinfo command adds information to each result.
This search uses info_max_time, which is the latest time boundary for the search. The eval command is used to create a
field called latest_age and calculate the age of the heartbeats relative to end of the time range. This allows for a time
range of -11m@m to -m@m. This is the previous 11 minutes, starting at the beginning of the minute, to the previous 1 minute,
starting at the beginning of the minute. The search does not work if you specify latest=null / all time because
info_max_time would be set to +infinity.
Using the lookup file, expected_hosts, append the list of hosts to the results. Using this list you can determine which hosts
are not sending a heartbeat in the expected time range. For any hosts that have a null value in the latest_age field, fill the
field with the value 9999. Remove any duplicated host events with the dedup command. Use the where command to filter
the results and return any heartbeats older than 42 seconds.
In this example, you could use the tstats command, instead of the stats command, to improve the performance of the
search.
See also
search
addtotals
Description
The addtotals command computes the arithmetic sum of all numeric fields for each search result. The results appear in
the Statistics tab.
You can specify a list of fields that you want the sum for, instead of calculating every numeric field. The sum is placed in a
new field.
If col=true, the addtotals command computes the column totals, which adds a new result at the end that represents the
sum of each field. labelfield, if specified, is a field that will be added to this summary event with the value set by the
'label' option. Alternately, instead of using the addtotals col=true command, you can use the addcoltotals command to
calculate a summary event.
Syntax
addtotals [row=<bool>] [col=<bool>] [labelfield=<field>] [label=<string>] [fieldname=<field>] [<field-list>]
Required arguments
None.
Optional arguments
field-list
Syntax: <field> ...
Description: One or more numeric fields, delimited with a space. Only the fields specified in the <field-list> are
summed. If a <field-list> is not specified, all numeric fields are included in the sum.
Usage: You can use wildcards in the field names. For example, if the field names are count1, count2, and count3
you can specify count* to indicate all fields that begin with 'count'.
Default: All numeric fields are included in the sum.
row
Syntax: row=<bool>
Description: Specifies whether to calculate the sum of the <field-list> for each event. This is similar to calculating
a total for each row in a table. The sum is placed in a new field. The default name of the field is Total. If you want
to specify a different name for the field, use the fieldname argument.
Usage: Because the default is row=true, specify the row argument only when you do not want the event totals to
appear, by specifying row=false.
Default: true
col
Syntax: col=<bool>
Description: Specifies whether to add a new event, referred to as a summary event, at the bottom of the list of
events. The summary event displays the sum of each field in the events, similar to calculating column totals in a
table.
Default: false
fieldname
Syntax: fieldname=<field>
Description: Used to specify the name of the field that contains the calculated sum of the field-list for each event.
The fieldname argument is valid only when row=true.
Default: Total
labelfield
Syntax: labelfield=<field>
Description: Used to specify a field for the summary event label. The labelfield argument is valid only when
col=true.
* To use an existing field in your result set, specify the field name for the labelfield argument. For example if the
field name is IP, specify labelfield=IP.
* If there is no field in your result set that matches the labelfield, a new field is added using the labelfield
value.
Default: none
label
Syntax: label=<string>
Description: Used to specify a row label for the summary event.
* If the labelfield argument is an existing field in your result set, the label value appears in that row in the
display.
* If the labelfield argument creates a new field, the label appears in the new field in the summary event row.
Default: Total
Usage
The addtotals command is a distributable streaming command, except when it is used to calculate column totals. When
used to calculate column totals, the addtotals command is a transforming command. See Command types.
Examples
This example uses events that list the numeric sales for each product and quarter, for example:
1. Use the chart command to summarize data
To summarize the data by product for each quarter, run this search:
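A sketch of such a search, assuming sales, products, and quarter fields in the sample events:
... | chart sum(sales) BY products quarter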
In this example, there are two fields specified in the BY clause with the chart command.
The results appear on the Statistics tab and look something like this:
To add a column that generates totals for each row, run this search:
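For example, appending addtotals to the previous sketch:
... | chart sum(sales) BY products quarter | addtotals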
The results appear on the Statistics tab.
You can also use the stats command to summarize the sales by product.
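A sketch of such a search, using the same sample events:
... | stats sum(sales) BY products
The results appear on the Statistics tab and look something like this: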
products sum(sales)
ProductA 5475
ProductB 5525
ProductC 6200
2. Specify a name for the field that contains the sums for each event
Instead of accepting the default name added by the addtotals command, you can specify a name for the field.
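For example, a sketch with an illustrative field name:
... | chart sum(sales) BY products quarter | addtotals fieldname="Total Sales"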
3. Use wildcards to specify the names of the fields to sum
Calculate the sums for the fields that begin with amount or that contain the text size in the field name. Save the sums in
the field called TotalAmount.
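A sketch of such a search:
... | addtotals fieldname=TotalAmount amount* *size*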
In this example, the row calculations are turned off and the column calculations are turned on. The total for only a single
field, sum(quota), is calculated.
• The labelfield argument specifies in which field the label for the total appears. The default label is Total.
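A sketch of such a search, assuming a quota field in the sample events:
... | stats sum(quota) BY quarter | addtotals col=true row=false labelfield=quarter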
The results appear on the Statistics tab and look something like this:
quarter sum(quota)
QTR1 3825
QTR2 4175
QTR3 4000
QTR4 3875
Total 15875
5. Calculate the field totals and add custom labels to the totals
Calculate the sum for each quarter and product, and calculate a grand total.
• The labelfield argument specifies in which field the label for the total appears, which in this example is
products.
• The label argument is used to specify the label Quarterly Totals for the labelfield, instead of using the default
label Total.
• The fieldname argument is used to specify the label Product Totals for the row totals.
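A sketch of such a search:
... | chart sum(sales) BY products quarter | addtotals col=true labelfield=products label="Quarterly Totals" fieldname="Product Totals"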
The results appear on the Statistics tab and look something like this:
See also
stats
analyzefields
Description
Using <field> as a discrete random variable, this command analyzes all numerical fields to determine the ability for each
of those fields to predict the value of the classfield. It determines the stability of the relationship between values in the
target classfield and numeric values in other fields.
As a reporting command, analyzefields consumes all input results and generates one row for each numeric field in the
output results. The values in that row indicate the performance of the analyzefields command at predicting the value of
a classfield. For each event, if the classfield value predicted by the conditional distribution of the numeric field (the value
with the highest z-probability) matches the actual class, the event is counted as accurate.
Syntax
analyzefields classfield=<field>
Field Description
field The name of a numeric field from the input search results.
cocur The co-occurrence of the field. In the results where classfield is present, this is the ratio of results in which field is also present. The cocur is 1 if the field exists in every event that has a classfield.
acc The accuracy in predicting the value of the classfield, using the value of the field. This is the ratio of the number of accurate predictions to the total number of events with that field. This argument is valid only for numerical fields.
balacc The balanced accuracy is the non-weighted average of the accuracies in predicting each value of the classfield. This is only valid for numerical fields.
Required arguments
classfield
Syntax: classfield=<field>
Description: For best results, classfield should have two distinct values, although multiclass analysis is
possible.
Examples
Example 1:
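A minimal sketch; the classfield name is a placeholder:
... | analyzefields classfield=success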
See also
anomalousvalue
anomalies
Description
Use the anomalies command to look for events or field values that are unusual or unexpected.
The anomalies command assigns an unexpectedness score to each event and places that score in a new field named
unexpectedness. Whether the event is considered anomalous or not depends on a threshold value. The threshold value
is compared to the unexpectedness score. The event is considered unexpected or anomalous if the unexpectedness
score is greater than the threshold value.
After you use the anomalies command in a search, look at the Interesting Fields list in the Search & Reporting window.
Select the unexpectedness field to see information about the values in your events.
The unexpectedness score of an event is calculated based on the similarity of that event (X) to a set of previous events
(P):
unexpectedness = [s(P and X) - s(P)] / [s(P) + s(X)]
In this formula, s( ) is a metric of how similar or uniform the data is. This formula provides a measure of how much
adding X affects the similarity of the set of events. The formula also normalizes the results for the differing event sizes.
Syntax
anomalies
[threshold=<num>]
[labelonly=<bool>]
[normalize=<bool>]
[maxvalues=<num>]
[field=<field>]
[denylist=<filename>]
[denylistthreshold=<num>]
[by-clause]
Optional arguments
threshold
Syntax: threshold=<num>
Description: A number to represent the upper limit of expected or normal events. If unexpectedness calculated
for an event is greater than this threshold limit, the event is considered unexpected or anomalous.
Default: 0.01
labelonly
Syntax: labelonly=<bool>
Description: Specifies if you want the output result set to include all events or only the events that are above the
threshold value. The unexpectedness field is appended to all events. If labelonly=true, no events are removed. If
labelonly=false, events that have an unexpectedness score less than the threshold are removed from the output
result set.
Default: false
normalize
Syntax: normalize=<bool>
Description: Specifies whether or not to normalize numeric text in the fields. All characters in the field from 0 to 9
are considered identical for purposes of the algorithm. The placement and quantity of the numbers remains
significant. When a field contains numeric data that should not be normalized but treated as categories, set
normalize=false.
Default: true
maxvalues
Syntax: maxvalues=<num>
Description: Specifies the size of the sliding set of previous events to include when determining the
unexpectedness of a field value. By default the calculation uses the previous 100 events for the comparison. If the
current event number is 1000, the calculation uses the values in events 900 to 999 in the calculation. If the current
event number is 1500, the calculation uses the values in events 1400 to 1499 in the calculation. You can specify a
number between 10 and 10000. Increasing the value of maxvalues increases the total CPU cost per event linearly.
Large values have very long search runtimes.
Default: 100
field
Syntax: field=<field>
Description: The field to analyze when determining the unexpectedness of an event.
Default: _raw
denylist
Syntax: denylist=<filename>
Description: The name of a CSV file that contains a list of events that are expected and should be ignored. Any
incoming event that is similar to an event in the denylist is treated as not anomalous, or expected, and given an
unexpectedness score of 0.0. The CSV file must be located in the $SPLUNK_HOME/var/run/splunk/csv directory on
the search head. If you have Splunk Cloud Platform and want to configure a denylist file, file a Support ticket.
denylistthreshold
Syntax: denylistthreshold=<num>
Description: Specifies a similarity score threshold for matching incoming events to denylisted events. If the
incoming event has a similarity score above the denylistthreshold, the event is marked as unexpected.
Default: 0.05
by-clause
Syntax: by <fieldlist>
Description: Use to specify a list of fields to segregate the results for anomaly detection. For each combination of
values for the specified fields, the events with those values are treated entirely separately.
Examples
The following example shows the interesting events, ignoring any events in the denylist 'boringevents'. Sort the event list
in descending order, with highest value in the unexpectedness field listed first.
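A sketch of such a search:
... | anomalies denylist=boringevents | sort -unexpectedness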
This example uses transactions to find regions of time that look unusual.
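A sketch of such a search; the maxpause value is illustrative:
... | transaction maxpause=2s | anomalies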
Look for anomalies in each source separately. A pattern in one source does not affect whether it is considered
anomalous in another source.
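A sketch of such a search:
... | anomalies by source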
This example shows how to tune a search for anomalies using the threshold value. Start with a search that uses the
default threshold value.
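A sketch of such a search:
index=_internal | anomalies by group | search group=*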
This search looks at events in the _internal index and calculates an unexpectedness score for sets of events that have
the same group value.
• The sliding set of events that are used to calculate the unexpectedness score for each unique group value
includes only the events that have the same group value.
• The search command is used to show only the events that include the group field.
The unexpectedness and group fields appear in the list of Interesting fields. Click on the field name and then click Yes to
move the field to the Selected fields list. The fields are moved and also appear in the search results. Your results should
look something like the following image.
The key-value pairs in the first event include group=pipeline, name=indexerpipe, processor=indexer, cpu_seconds=0.022,
and so forth.
With the default threshold, which is 0.01, you can see that some of these events might be very similar. The next search
increases the threshold a little:
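For example, with an assumed threshold value:
index=_internal | anomalies threshold=0.03 by group | search group=*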
With the higher threshold value, the timestamps and key-value pairs show more distinction between each of the events.
Also, you might not want to hide the events that are not anomalous. Instead, you can add another field to your events that
tells you whether or not the event is interesting to you. One way to do this is with the eval command:
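A sketch of such a search; the threshold value here is illustrative:
index=_internal | anomalies labelonly=true by group | search group=* | eval threshold=0.01 | eval score=if(unexpectedness>=threshold, "anomalous", "boring")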
This search uses labelonly=true so that the boring events are still retained in the results list. The eval command is used
to define a field named threshold and set it to the threshold value. This has to be done explicitly because the threshold
attribute of the anomalies command is not a field.
The second eval command is used to define another new field, score, that is either "anomalous" or "boring" based on
how the unexpectedness compares to the threshold value. The following image shows a snapshot of the results.
See also
anomalousvalue
Description
The anomalousvalue command computes an anomaly score for each field of each event, relative to the values of this field
across other events. For numerical fields, it identifies or summarizes the values in the data that are anomalous either by
frequency of occurrence or number of standard deviations from the mean.
For fields that are determined to be anomalous, a new field is added with the following scheme. If the field is numeric,
such as size, the new field will be Anomaly_Score_Num(size). If the field is non-numeric, such as name, the new field will be
Anomaly_Score_Cat(name).
Syntax
anomalousvalue [<av-options>] [action=annotate | filter | summary] [pthresh=<num>] [<field-list>]
Required arguments
None.
Optional arguments
<av-options>
Syntax: minsupcount=<int> | maxanofreq=<float> | minsupfreq=<float> | minnormfreq=<float>
Description: Specify one or more options to control which fields are considered for discriminating anomalies.
maxanofreq
Syntax: maxanofreq=<float>
Description: Maximum anomalous frequency is expressed as a floating point number between 0 and 1.
Omits a field from consideration if the field is too frequently anomalous. If the ratio of anomalous
occurrences of the field to the total number of occurrences of the field is greater than the maxanofreq
value, then the field is removed from consideration.
Default 0.05
minnormfreq
Syntax: minnormfreq=<float>
Description: Minimum normal frequency is expressed as a floating point number between 0 and 1. Omits
a field from consideration if the field is not anomalous frequently enough. If the ratio of anomalous
occurrences of the field to the total number of occurrences of the field is smaller than the minnormfreq value, then the field is
removed from consideration.
Default: 0.01
minsupcount
Syntax: minsupcount=<int>
Description: Minimum supported count must be a positive integer. Drops a field that has a small number
of occurrences in the input result set. If the field appears fewer than N times in the input events, the field
is removed from consideration.
Default: 100
minsupfreq
Syntax: minsupfreq=<float>
Description: Minimum supported frequency is expressed as a floating point number between 0 and 1.
Drops a field that has a low frequency of occurrence. The minsupfreq argument checks the ratio of
occurrences of the field to the total number of events. If this ratio is smaller than the minsupfreq value, the field is removed
from consideration.
Default: 0.05
action
Syntax: action=annotate | filter | summary
Description: Specify whether to return the anomaly score (annotate), filter out events that are not anomalous
values (filter), or return a summary of anomaly statistics (summary).
Default: filter
annotate
Syntax: action=annotate
Description: The annotate action adds new fields to the events containing anomalous values. The fields
that are added are Anomaly_Score_Cat(field), Anomaly_Score_Num(field), or both.
filter
Syntax: action=filter
Description: The filter action returns events with anomalous values. Events without anomalous values
are removed. The events that are returned are annotated, as described for action=annotate.
summary
Syntax: action=summary
Description: The summary action returns a table summarizing the anomaly statistics for each field
generated. The table includes how many events contained this field, the fraction of events that were
anomalous, what type of test (categorical or numerical) was performed, and so on.
useCat Use categorical anomaly detection. Categorical anomaly detection looks for rare values.
useNum Use numerical anomaly detection. Numerical anomaly detection looks for values that are far from the mean value. This anomaly detection is Gaussian distribution based.
field-list
Syntax: <field> ...
Description: The list of fields to consider.
Default: If no field list is provided, all fields are considered.
pthresh
Syntax: pthresh=<num>
Description: Probability threshold (as a decimal) that has to be met for a value to be considered anomalous.
Default: 0.01.
Usage
By default, a maximum of 50,000 results are returned. This maximum is controlled by the maxresultrows setting in the
[anomalousvalue] stanza in the limits.conf file. Increasing this limit can result in more memory usage.
Only users with file system access, such as system administrators, can edit the configuration files. Never change or
copy the configuration files in the default directory. The files in the default directory must remain intact and in their
original location. Make the changes in the local directory.
Basic examples
... | anomalousvalue
Extended example
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
Search for anomalous values in the earthquake data.
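A sketch of such a search; the source name and pthresh value are assumptions:
source=all_month.csv | anomalousvalue action=summary pthresh=0.02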
The numeric results are returned with multiple decimals. Use the field formatting icon, which looks like a pencil, to enable
number formatting and specify the decimal precision to display.
See also
anomalydetection
Description
A streaming and reporting command that identifies anomalous events by computing a probability for each event and then
detecting unusually small probabilities. The probability is defined as the product of the frequencies of each individual field
value in the event.
• For categorical fields, the frequency of a value X is the number of times X occurs divided by the total number of
events.
• For numerical fields, we first build a histogram for all the values, then compute the frequency of a value X as the
size of the bin that contains X divided by the number of events.
The anomalydetection command includes the capabilities of the existing anomalousvalue and outlier commands and
offers a histogram-based approach for detecting anomalies.
Syntax
anomalydetection [<method-option>] [<action-option>] [<pthresh-option>] [<cutoff-option>] [<field-list>]
Optional arguments
<method-option>
Syntax: method = histogram | zscore | iqr
Description: Select the method of anomaly detection. When method=zscore, performs like the anomalousvalue
command. When method=iqr, performs like the outlier command. See Usage.
Default: method=histogram
<action-option>
Syntax for method=histogram or method=zscore: action = filter | annotate | summary
Syntax for method=iqr: action = remove | transform
Description: The actions and defaults depend on the method that you specify. See the detailed descriptions for
the actions for each method below.
<pthresh-option>
Syntax: pthresh=<num>
Description: Used with method=histogram or method=zscore. Sets the probability threshold, as a decimal number,
that has to be met for an event to be deemed anomalous.
Default: For method=histogram, the command calculates pthresh for each data set during analysis. For
method=zscore, the default is 0.01. If you try to use this when method=iqr, it returns an invalid argument error.
<cutoff-option>
Syntax: cutoff=<bool>
Description: Sets the upper bound threshold on the number of anomalies. This option applies to only the
histogram method. If cutoff=false, the algorithm uses the formula threshold = 1st-quartile - 1.5 * IQR
without modification. If cutoff=true, the algorithm modifies the formula in order to come up with a smaller number
of anomalies.
Default: true
<field-list>
Syntax: <string> <string> ...
Description: A list of field names.
Histogram actions
<action-option>
Syntax: action=annotate | filter | summary
Description: Specifies whether to return all events with additional fields (annotate), to keep only the events with
anomalous values (filter), or to return a summary of anomaly statistics (summary).
Default: filter
When action=filter, the command returns anomalous events and filters out other events. Each returned event contains
four new fields. When action=annotate, the command returns all the original events with the same four new fields added
when action=filter.
Field Description
log_event_prob The natural logarithm of the event probability.
probable_cause The name of the field that best explains why the event is anomalous. No one field causes an anomaly by itself, but often some field value occurs rarely enough to make the event probability small.
probable_cause_freq The frequency of the value in the probable_cause field.
max_freq The maximum frequency of all the field values in the event.
Zscore actions
<action-option>
Syntax: action=annotate | filter | summary
Description: Specifies whether to return the anomaly score (annotate), keep only events with anomalous values
(filter), or return a summary of anomaly statistics (summary).
Default: filter
When action=filter, the command returns events with anomalous values while other events are dropped. The kept
events are annotated, like the annotate action.
When action=annotate, the command adds new fields, Anomaly_Score_Cat(field) and Anomaly_Score_Num(field), to
the events that contain anomalous values.
When action=summary, the command returns a table that summarizes the anomaly statistics for each field.
The table includes how many events contained this field, the fraction of events that were anomalous, what type of test
(categorical or numerical) was performed, and so on.
IQR actions
<action-option>
Syntax: action=remove | transform
Description: Specifies what to do with outliers. The remove action removes the event containing the outlying
numerical value. The transform action transforms the event by truncating the outlying value to the threshold for
outliers. If mark=true, the transform action prefixes the value with "000".
Abbreviations: The abbreviation for remove is rm. The abbreviation for transform is tf.
Default: action=transform
Usage
When you specify method=zscore, the anomalydetection command performs like the anomalousvalue command. You can
specify the syntax components of the anomalousvalue command when you use the anomalydetection command with
method=zscore. See the anomalousvalue command.
When you specify method=iqr, the anomalydetection command performs like the outlier command. You can specify the
syntax components of the outlier command when you specify method=iqr with the anomalydetection command. For
example, you can specify the outlier options <action>, <mark>, <param>, and <uselower>. See the outlier command.
Examples
These two searches return the same results. The arguments specified in the second search are the default values.
... | anomalydetection
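The second search, with the defaults written out, might look like this:
... | anomalydetection method=histogram action=filter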
Example 2: Return a short summary of how many anomalous events are there
Return a short summary of how many anomalous events are there and some other statistics such as the threshold value
used to detect them.
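A sketch of such a search:
... | anomalydetection action=summary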
Example 3: Return events with anomalous values
This example specifies method=zscore to return anomalous values. The search uses the filter action to filter out events
that do not have anomalous values. Events must meet the probability threshold pthresh before being considered an
anomalous value.
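A sketch of such a search; the pthresh value is illustrative:
... | anomalydetection method=zscore action=filter pthresh=0.05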
This example uses the outlier options from the outlier command. The abbreviation tf is used for the transform action in
this example.
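A sketch of such a search; the param value is illustrative:
... | anomalydetection method=iqr mark=true param=4 uselower=true action=tf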
See also
append
Description
Appends the results of a subsearch to the current results. The append command runs only over historical data and does
not produce correct results if used in a real-time search.
For more information about when to use the append command, see the flowchart in the topic About event grouping and
correlation in the Search Manual.
If you are familiar with SQL but new to SPL, see Splunk SPL for SQL users.
Syntax
Required arguments
subsearch
Syntax: [subsearch]
Description: A secondary search where you specify the source of the events that you want to append. The
subsearch must be enclosed in square brackets. See About subsearches in the Search Manual.
Optional arguments
subsearch-options
Syntax: extendtimerange=<boolean> | maxtime=<int> | maxout=<int> | timeout=<int>
Description: Controls how the subsearch is processed.
Subsearch options
extendtimerange
Syntax: extendtimerange=<boolean>
Description: Specifies whether to include the subsearch time range in the time range for the entire search. Use
the extendtimerange argument when the time range in the subsearch extends beyond the time range for the main
search. Use this argument when a transforming command, such as chart, timechart, or stats, follows the append
command in the search and the search uses time based bins.
Default: false
maxtime
Syntax: maxtime=<int>
Description: The maximum time, in seconds, to spend on the subsearch before automatically finalizing.
Default: 60
maxout
Syntax: maxout=<int>
Description: The maximum number of result rows to output from the subsearch.
Default: 50000
timeout
Syntax: timeout=<int>
Description: The maximum time, in seconds, to wait for subsearch to fully finish.
Default: 60
Usage
Examples
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
Count the number of earthquakes that occurred in and around California yesterday and then calculate the total number of
earthquakes.
This example uses a subsearch to count all the earthquakes in the California regions (place="*California"), then uses
the main search to count the number of earthquakes based on the magnitude type of the search.
You cannot use the stats command to simultaneously count the total number of events and the number of events for a
specified field. The subsearch is used to count the total number of earthquakes that occurred. This count is added to the
results of the previous search with the append command.
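A search along these lines matches this description. The source and place values are taken from the earthquake CSV file and are illustrative:
source=all_month.csv place="*California" | stats count BY magType | append [search source=all_month.csv place="*California" | stats count]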
Because both searches share the count field, the results of the subsearch are listed as the last row in the count column.
The results appear on the Statistics tab and look something like this:
magType count
H 123
MbLg 1
Md 1565
Me 2
Ml 1202
Mw 6
ml 10
2909
This search demonstrates how to use the append command in a way that is similar to using the addcoltotals command to
add the column totals.
2. Count the number of different customers who purchased items. Append the top purchaser for each type of
product.
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range Yesterday when you run the search.
Count the number of different customers who purchased something from the Buttercup Games online store yesterday,
and break this count down by the type of product (accessories, t-shirts, and type of games) they purchased. Also, list the
top purchaser for each type of product and how much that person bought of that product.
This example first searches for purchase events (action=purchase). These results are piped into the stats command and
the dc(), or distinct_count() function is used to count the number of different users who make purchases. The BY clause
is used to break up this number based on the different category of products (categoryId).
The subsearch is used to search for purchase events and count the top purchaser (based on clientip) for each category
of products. These results are added to the results of the previous search using the append command.
Here, the table command is used to display only the category of products (categoryId), the distinct count of users who
bought each type of product (dc(clientip)), the actual user who bought the most of a product type (clientip), and the
number of each product that user bought (count).
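A sketch of this search, with the tutorial sourcetype assumed, looks like this:
sourcetype=access_* action=purchase | stats dc(clientip) BY categoryId | append [search sourcetype=access_* action=purchase | top 1 clientip BY categoryId] | table categoryId, dc(clientip), clientip, count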
You can see that the append command just tacks on the results of the subsearch to the end of the previous search, even
though the results share the same field values. It does not let you manipulate or reformat the output.
3. Use the append command to determine the number of unique IP addresses that accessed the Web server.
Use the append command, along with the stats, count, and top commands to determine the number of unique IP
addresses that accessed the Web server. Find the user who accessed the Web server the most for each type of page
request.
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range Yesterday when you run the search.
Count the number of different IP addresses that accessed the Web server and also find the user who accessed the Web
server the most for each type of page request (method).
The Web access events are piped into the stats command and the dc() or distinct_count() function is used to count
the number of different users who accessed the site. The count() function is used to count the total number of times the
site was accessed. These numbers are separated by the page request (method).
The subsearch is used to find the top user for each type of page request (method). The append command is used to add
the result of the subsearch to the bottom of the table.
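Assuming the tutorial web access sourcetype, a search along these lines fits this description:
sourcetype=access_* | stats dc(clientip), count BY method | append [search sourcetype=access_* | top 1 clientip BY method]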
The results appear on the Statistics tab and look something like this:
method dc(clientip) count clientip percent
POST 64 87.194.216.51 3.705848
The first two rows are the results of the first search. The last two rows are the results of the subsearch. Both result sets
share the method and count fields.
4. Specify the maximum time for the subsearch to run and the maximum number of result rows from the
subsearch
Use the append command to determine the number of unique IP addresses that accessed the Web server. Find the user
who accessed the Web server the most for each type of page request.
This example uses the sample dataset from the Search Tutorial but should work with any format of Apache web access log. Download the data
set from this topic in the Search Tutorial and follow the instructions to upload it to your Splunk deployment. Use the time range Yesterday
when you run this search.
Count the number of different IP addresses that accessed the Web server and also find the user who accessed the Web
server the most for each type of page request (method). Limit the subsearch to 30 seconds and the maximum number of
subsearch results to 1000.
Use the extendtimerange argument to ensure that the time range used for the search includes both the time range of the
main search and the time range of the subsearch.
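A sketch of this search, with the subsearch options added to the append command and the tutorial sourcetype assumed, might look like this:
sourcetype=access_* | stats dc(clientip), count BY method | append maxtime=30 maxout=1000 extendtimerange=true [search sourcetype=access_* | top 1 clientip BY method]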
The time range used for the search is from 11/1/2017:00:00:00, the earliest time in the subsearch, to
11/30/2017:00:00:00, the latest time in the main search.
See also
appendcols
Description
Appends the fields of the subsearch results to the input search results. All fields of the subsearch are combined into
the current results, with the exception of internal fields. For example, the first subsearch result is merged with the first
main result, the second subsearch result is merged with the second main result, and so on.
Syntax
Required arguments
subsearch
Description: A secondary search added to the main search. See how subsearches work in the Search Manual.
Optional arguments
override
Syntax: override=<bool>
Description: If the override argument is false, and if a field is present in both a subsearch result and the main
result, the main result is used. If override=true, the subsearch result value is used.
Default: override=false
subsearch-options
Syntax: maxtime=<int> | maxout=<int> | timeout=<int>
Description: These options control how the subsearch is executed.
Subsearch options
maxtime
Syntax: maxtime=<int>
Description: The maximum time, in units of seconds, to spend on the subsearch before automatically finalizing.
Default: 60
maxout
Syntax: maxout=<int>
Description: The maximum number of result rows to output from the subsearch.
Default: 50000
timeout
Syntax: timeout=<int>
Description: The maximum time, in units of seconds, to wait for the subsearch to fully finish.
Default: 60
Usage
The appendcols command must be placed in a search string after a transforming command such as stats, chart, or
timechart. The appendcols command can't be used before a transforming command because it must append to an
existing set of table-formatted results, such as those generated by a transforming command. See Command types.
Note that the subsearch argument to the appendcols command doesn't have to contain a transforming command.
Examples
Example 1:
Search for "404" events and append the fields in each event to the previous search results.
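For example, a search of this general shape fits the description; the sourcetype and table fields are placeholders:
sourcetype=access_* | table clientip, method | appendcols [search sourcetype=access_* 404]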
This is a valid search string because appendcols comes after the transforming command table and adds columns to an
existing table of results.
Example 2:
This search uses appendcols to count the number of times a certain field occurs on a specific server and uses that value
to calculate other fields.
specific.server | stats dc(userID) as totalUsers | appendcols [ search specific.server AND "text" | addinfo
| where _time >= info_min_time AND _time <=info_max_time | stats count(field) as variableA ] | eval
variableB = exact(variableA/totalUsers)
• First, this search uses stats to count the number of individual users on a specific server and names that variable
"totalUsers".
• Then, this search uses appendcols to search the server and count how many times a certain field occurs on that
specific server. This count is renamed "variableA". The addinfo command is used to constrain this subsearch
within the range of info_min_time and info_max_time.
• The eval command is used to define a "variableB".
The result is a table with the fields totalUsers, variableA, and variableB.
See also
appendpipe
Description
Appends the result of the subpipeline to the search results. Unlike a subsearch, the subpipeline is not run first. The
subpipeline is run when the search reaches the appendpipe command. The appendpipe command is used to append the
output of transforming commands, such as chart, timechart, stats, and top.
Syntax
Optional Arguments
run_in_preview
Syntax: run_in_preview=<bool>
Description: Specifies whether or not to display the impact of the appendpipe command in the preview. When set to
FALSE, the search runs and the preview shows the results as if the appendpipe command is not part of the
search. However, when the search finishes, the results include the impact of the appendpipe command.
Default: True
subpipeline
Syntax: <subpipeline>
Description: A list of commands that are applied to the search results from the commands that occur in the
search before the appendpipe command.
Usage
The appendpipe command can be useful because it provides a summary, total, or otherwise descriptive row of the entire
dataset when you are constructing a table or chart. This command is also useful when you need the original results for
additional calculations.
Examples
Example 1:
index=_audit | stats count by action user | appendpipe [stats sum(count) as count by action | eval user =
"TOTAL - ALL USERS"] | sort action
The results appear on the Statistics tab and look something like this:
action user count
accelerate_search can-delete 6
add n/a 1
change_authentication admin 50
change_authentication buttercup 9
change_authentication can-delete 24
See also
arules
Description
The arules command looks for associative relationships between field values. The command returns a table with the
following columns: Given fields, Implied fields, Strength, Given fields support, and Implied fields support. The given and
implied field values are the values of the fields you supply. The Strength value indicates the relationship between (among)
the given and implied field values.
Implements the arules algorithm as discussed in Michael Hahsler, Bettina Gruen and Kurt Hornik (2012). arules: Mining
Association Rules and Frequent Itemsets. R package version 1.0-12. This algorithm is similar to the algorithms used for
online shopping websites which suggest related items based on what items other customers have viewed or purchased.
Syntax
Required arguments
field-list
Syntax: <field> <field> ...
Description: The list of field names. At least two fields must be specified.
Optional arguments
<arules-option>
Syntax: <support> | <confidence>
Description: Options for arules command.
arules options
support
Syntax: sup=<int>
Description: Specify a support limit. Associations with computed support levels smaller than this value are not
included in the output results. The support option must be a positive integer.
Default: 3
confidence
Syntax: conf=<float>
Description: Specify a confidence limit. Associations with a confidence (expressed as the Strength field) below this
limit are not included in the output results. Must be between 0 and 1.
Default: .5
Usage
Examples
Example 1: Search for the likelihood that the fields are related.
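A sketch of such a search, with placeholder field names, looks like this:
... | arules field1 field2 field3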
Example 2:
See also
associate, correlate
associate
Description
The associate command identifies correlations between fields. The command tries to find a relationship between pairs of
fields by calculating a change in entropy based on their values. This entropy represents whether knowing the value of one
field helps to predict the value of another field.
In Information Theory, entropy is defined as a measure of the uncertainty associated with a random variable. In this case
if a field has only one unique value, the field has an entropy of zero. If the field has multiple values, the more evenly those
values are distributed, the higher the entropy.
The associate command uses Shannon entropy (log base 2). The unit is in bits.
Syntax
Required arguments
None.
Optional arguments
associate-option
Syntax: supcnt | supfreq | improv
Description: Options for the associate command. See the Associate-options section.
field-list
Syntax: <field> ...
Description: A list of one or more fields. You cannot use wildcard characters in the field list. If you specify a list of
fields, the analysis is restricted to only those fields.
Default: All fields are analyzed.
Associate-options
supcnt
Syntax: supcnt=<num>
Description: Specifies the minimum number of times that the "reference key=reference value" combination must
appear. Must be a non-negative integer.
Default: 100
supfreq
Syntax: supfreq=<num>
Description: Specifies the minimum frequency of "reference key=reference value" combination as a fraction of
the number of total events.
Default: 0.1
improv
Syntax: improv=<num>
Description: Specifies a limit, or minimum entropy improvement, for the "target key". The calculated entropy
improvement must be greater than or equal to this limit.
Default: 0.5
The associate command outputs a table with columns containing the following fields.
Field Description
Reference_Key The name of the first field in a pair of fields.
Conditional_Entropy The entropy of the target key when the reference key is the reference value.
Entropy_Improvement The difference between the unconditional entropy and the conditional entropy.
Description A message that summarizes the relationship between the field values, based on the entropy calculations. The Description is a textual representation of the result. It is written in the format: "When the 'Reference_Key' has the value 'Reference_Value', the entropy of 'Target_Key' decreases from Unconditional_Entropy to Conditional_Entropy."
Support Specifies how often the reference field is the reference value, relative to the total number of events. For example, how often field A is equal to value X, in the total number of events.
Examples
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range Yesterday when you run the search.
This example demonstrates one way to analyze the relationship of fields in your web access logs.
The first part of this search retrieves web access events that returned a status that is not 200. Web access data contains
many fields. You can use the associate command to see a relationship between all pairs of fields and values in your data.
To simplify this example, restrict the search to two fields: method and status.
Because the associate command adds many columns to the output, this search uses the table command to display only
select columns.
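A search along these lines, assuming the tutorial web access sourcetype, matches this description:
sourcetype=access_* status!=200 | fields method, status | associate | table Reference_Key, Reference_Value, Target_Key, Top_Conditional_Value, Description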
The results appear on the Statistics tab and look something like this:
Reference_Key Reference_Value Target_Key Top_Conditional_Value Description
status 400 method When 'status' has the value '400', the entropy of 'method' decreases from 0.789 to 0.647.
status 404 method GET (76.37% -> 81.27%) When 'status' has the value '404', the entropy of 'method' decreases from 0.789 to 0.696.
status 406 method GET (76.37% -> 81.69%) When 'status' has the value '406', the entropy of 'method' decreases from 0.789 to 0.687.
status 408 method GET (76.37% -> 80.00%) When 'status' has the value '408', the entropy of 'method' decreases from 0.789 to 0.722.
status 500 method GET (76.37% -> 80.73%) When 'status' has the value '500', the entropy of 'method' decreases from 0.789 to 0.707.
In the results you can see that there is one method and five status values.
From the first row of results, you can see that when method=POST, the status field is 503 for those events. The associate
command concludes that, if method=POST, the Top_Conditional_Value is likely to be 503 as much as 33% of the time.
The Top_Conditional_Value is formatted to read "CV (FRV% -> FCV%)", where CV is the conditional value, FRV is the
percentage occurrence of the reference value, and FCV is the percentage of occurrence for that conditional value, in the
case of the reference value.
Return results associated with each other (that have at least 3 references to each other).
Analyze all events from host "reports" and return results associated with each other.
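Sketches of these two searches, with the index and host values treated as placeholders, look like this:
index=_internal sourcetype=splunkd | associate supcnt=3
host="reports" | associate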
See also
audit
Description
Returns audit trail information that is stored in the local audit index. This command also validates signed audit events
while checking for gaps and tampering.
Syntax
audit
Examples
index="_audit" | audit
autoregress
Description
Prepares your events for calculating the autoregression, or the moving average, by copying one or more of the previous
values for field into each event.
The first few events will lack the augmentation of prior values, since the prior values do not exist.
Syntax
Required arguments
field
Syntax: <string>
Description: The name of a field. Most usefully a field with numeric values.
Optional arguments
p
Syntax: p=<int> | p=<int>-<int>
Description: Specifies which prior events to copy values from. You can specify a single integer or a numeric
range. For a single value, such as 3, the autoregress command copies field values from the third prior event into
a new field. For a range, the autoregress command copies field values from the range of prior events. For
example, if you specify a range such as p=2-4, then the field values from the second, third, and fourth prior events
are copied into new fields.
Default: 1
newfield
Syntax: <field>
Description: If p is set to a single integer, the newfield argument specifies a field name to copy the single field
value into. Invalid if p is set to a range.
If the newfield argument is not specified, the single or multiple values are copied into fields with the names
<field>_p<num>. For example, if p=2-4 and field=count, the field names are count_p2, count_p3, count_p4.
Usage
Examples
Example 1:
For each event, copy the 3rd previous value of the 'ip' field into the field 'old_ip'.
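A search of this form matches the description:
... | autoregress ip p=3 newfield=old_ip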
Example 2:
For each event, copy the 2nd, 3rd, 4th, and 5th previous values of the 'count' field.
Since the new field argument is not specified, the values are copied into the fields 'count_p2', 'count_p3', 'count_p4', and
'count_p5'.
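A search of this form produces those fields:
... | autoregress count p=2-5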
Example 3:
Calculate a moving average of event size over the current event and the four prior events. This search omits the
moving_average for the initial events, where the field would be wrong, because summing null fields is considered null.
... | eval rawlen=len(_raw) | autoregress rawlen p=1-4 | eval moving_average=(rawlen + rawlen_p1 + rawlen_p2
+ rawlen_p3 + rawlen_p4) / 5
See also
awssnsalert
The awssnsalert command is used with the Splunk Add-on for AWS.
For information about this command, see Use the awssnsalert search command in Splunk Add-on for AWS.
bin
Description
Puts continuous numerical values into discrete sets, or bins, by adjusting the value of <field> so that all of the items in a
particular set have the same value.
The bin command is automatically called by the chart and the timechart commands. Use the bin command only for
statistical operations that the chart and the timechart commands cannot process. Do not use the bin command if you
plan to export all events to CSV or JSON file formats.
Syntax
Required arguments
field
Syntax: <field>
Description: Specify a field name.
Optional arguments
bin-options
Syntax: bins | minspan | span | <start-end> | aligntime
Description: Discretization options. See the Bins options section in this topic for the syntax and description for
each of these options.
newfield
Syntax: <string>
Description: A new name for the field.
Bin options
bins
Syntax: bins=<int>
Description: Sets the maximum number of bins to discretize into.
minspan
Syntax: minspan=<span-length>
Description: Specifies the smallest span granularity to use when automatically inferring the span from the data time range.
span
Syntax: span = <log-span> | <span-length>
Description: Sets the size of each bin, using a span length based on a logarithm-based span or based on time.
When a <span-length> of a day or more is used, the span is aligned to midnight in the timezone of the user.
<start-end>
Syntax: start=<num> | end=<num>
Description: Sets the minimum and maximum extents for numerical bins. The data in the field is analyzed and
the beginning and ending values are determined. The start and end arguments are used when a span value is not
specified.
You can use the start or end arguments only to expand the range, not to shorten the range. For example, if the
field represents seconds the values are from 0-59. If you specify a span of 10, then the bins are calculated in
increments of 10. The bins are 0-9, 10-19, 20-29, and so forth. If you do not specify a span, but specify end=1000,
the bins are calculated based on the actual beginning value and 1000 as the end value.
If you set end=10 and the values are >10, the end argument has no effect.
aligntime
Syntax: aligntime=(earliest | latest | <time-specifier>)
Description: Align the bin times to something other than base UTC time (epoch 0). The aligntime option is valid
only when doing a time-based discretization. Ignored if span is in days, months, or years.
Span options
log-span
Syntax: [<num>]log[<num>]
Description: Sets to log-based span. The first number is a coefficient. The second number is the base. If the first
number is supplied, it must be a real number >= 1.0 and < base. Base, if supplied, must be real number > 1.0
(strictly greater than 1).
Example: span=2log10
span-length
Syntax: <int>[<timescale>]
Description: A span of each bin. If discretizing based on the _time field or used with a timescale, this is treated
as a time range. If not, this is an absolute bin length.
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <month> | <subseconds>
Description: Time scale units, used when discretizing based on the _time field.
Default: sec
Usage
The bin command is usually a dataset processing command. If the span argument is specified with the command, the bin
command is a streaming command. See Command types.
Subsecond span timescales, which are time spans made up of deciseconds (ds), centiseconds (cs), milliseconds (ms), or
microseconds (us), should be numbers that divide evenly into a second. For example, 1s = 1000ms. This means that
valid millisecond span values are 1, 2, 4, 5, 8, 10, 20, 25, 40, 50, 100, 125, 200, 250, or 500ms. In addition, span = 1000ms
is not allowed. Use span = 1s instead.
Examples
Example 1:
Return the average "thruput" of each "host" for each 5 minute time span.
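A sketch of this search, assuming the thruput and host fields exist in the events, looks like this:
... | bin _time span=5m | stats avg(thruput) BY _time host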
Example 2:
Bin search results into 10 bins, and return the count of raw events for each bin.
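For example, assuming a numeric field named size:
... | bin size bins=10 | stats count BY size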
Example 3:
Create bins with an end value larger than you need to ensure that all possible values are included.
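For example, assuming a numeric field named amount whose values stay below 1000:
... | bin amount end=1000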
Example 4:
Align the time bins to 3am (local time). Set the span to 12h. The bins will represent 3am - 3pm, then 3pm - 3am (the next
day), and so on.
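A sketch of this search looks like the following, where @d+3h aligns the bins to 3am local time:
... | bin _time span=12h aligntime=@d+3h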
Example 5:
See also
chart, timechart
bucket
The bucket command is an alias for the bin command. See the bin command for syntax information and examples.
bucketdir
Description
Replaces a field value with higher-level grouping, such as replacing filenames with directories.
Returns the maxcount events, by taking the incoming events and rolling up multiple sources into directories, by preferring
directories that have many files but few events. The field with the path is PATHFIELD (e.g., source), and strings are
broken up by a separator character. The default pathfield=source; sizefield=totalCount; maxcount=20;
countfield=totalCount; sep="/" or "\\", depending on the operating system.
Syntax
Required arguments
pathfield
Syntax: pathfield=<field>
Description: Specify a field name that has a path value.
sizefield
Syntax: sizefield=<field>
Description: Specify a numeric field that defines the size of the bucket.
Optional arguments
countfield
Syntax: countfield=<field>
Description: Specify a numeric field that describes the count of events.
maxcount
Syntax: maxcount=<int>
Description: Specify the total number of events to bucket.
sep
Syntax: <char>
Description: The separating character. Specify either a forward slash "/" or double back slashes "\\", depending
on the operating system.
Usage
Examples
Example 1:
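A sketch that rolls the most common sources up into directories, with illustrative option values, looks like this:
... | top source | bucketdir pathfield=source sizefield=count maxcount=10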
See also
cluster, dedup
cefout
The cefout command is used with the Splunk App for CEF.
For information about this command, see Technical implementation in Deploy and Use Splunk App for CEF.
chart
Description
The chart command is a transforming command that returns your results in a table format. The results can then be used
to display the data as a chart, such as a column, line, area, or pie chart. See the Visualization Reference in the
Dashboards and Visualizations manual.
You must specify a statistical function when you use the chart command. See Statistical and charting functions.
Syntax
chart
[<chart-options>]
[agg=<stats-agg-term>]
( <stats-agg-term> | <sparkline-agg-term> | "("<eval-expression>")" )...
[ BY <row-split> <column-split> ] | [ OVER <row-split> ] [BY <column-split>]
[<dedup_splitvals>]
Required arguments
You must include one of the following arguments when you use the chart command.
stats-agg-term
Syntax: <stats-func> ( <evaled-field> | <wc-field> ) [AS <wc-field>]
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval
expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you
specify. You can use wildcard characters in field names.
sparkline-agg-term
Syntax: <sparkline-agg> [AS <wc-field>]
Description: A sparkline aggregation function. Use the AS clause to place the result into a new field with a name
that you specify. You can use wild card characters in field names. See Sparkline options.
eval-expression
Syntax: <eval-math-exp> | <eval-concat-exp> | <eval-compare-exp> | <eval-bool-exp> | <eval-function-call>
Description: A combination of literals, fields, operators, and functions that represent the value of your destination
field. For more information, see the Evaluation functions. See Usage.
For these evaluations to work, your values need to be valid for the type of operation. For example, with the
exception of addition, arithmetic operations might not produce valid results if the values are not numerical. If
both operands are strings, they can be concatenated. When concatenating values with a period, the search
treats both values as strings regardless of their actual type.
Optional arguments
agg
Syntax: agg=<stats-agg-term>
Description: Specify an aggregator or function. For a list of stats functions with descriptions and examples, see
Statistical and charting functions.
chart-options
Syntax: cont | format | limit | sep
Description: Options that you can specify to refine the result. See the Chart options section in this topic.
Default:
column-split
Syntax: <field> [<tc-options>]... [<where-clause>]
Description: Specifies a field to use as the columns in the result table. By default, when the results are visualized,
the columns become the data series in the chart. If the field is numerical, discretization is applied using the
tc-options argument. See the tc options and the where clause sections in this topic.
Default: The number of columns included is limited to 10 by default. You can change the number of columns by
including a <where-clause>.
When a column-split field is included, the output is a table where each column represents a distinct value of the
split-by field. This is in contrast with the by-clause, where each row represents a single unique combination of
values of the group-by fields. For additional information, see the Usage section in this topic.
dedup_splitvals
Syntax: dedup_splitvals=<boolean>
Description: Specifies whether to remove duplicate values in multivalued BY clause fields.
Default: false
row-split
Syntax: <field> [<bin-options>]...
Description: The field that you specify becomes the first column in the results table. The field values become the
row labels in the results table. In a chart, the field name is used to label the X-axis. The field values become the
X-axis values. See the Bin options section in this topic.
Default: None.
Chart options
cont
Syntax: cont=<bool>
Description: Specifies if the bins are continuous. If cont=false, the x-axis is replotted so that a noncontinuous
sequence of x-value bins shows up adjacently in the output. If cont=true, bins that have no values will display with
a count of 0 or null values.
Default: true
format
Syntax: format=<string>
Description: Used to construct output field names when multiple data series are used in conjunction with a
split-by-field. format takes precedence over sep and allows you to specify a parameterized expression with the
stats aggregator and function ($AGG$) and the value of the split-by-field ($VAL$).
limit
Syntax: limit=(top | bottom) <int>
Description: Only valid when a column-split is specified. Use the limit option to specify the number of results
that should appear in the output. When you set limit=N the top or bottom N values are retained, based on the
sum of each series and the prefix you have selected. If limit=0, all results are returned. If you opt not to provide a
prefix, the Splunk software provides the top results.
Default: top 10
sep
Syntax: sep=<string>
Description: Used to construct output field names when multiple data series are used in conjunctions with a
split-by field. This is equivalent to setting format to $AGG$<sep>$VAL$.
stats-func
Syntax: The syntax depends on the function you use. See Usage.
Description: Statistical and charting functions that you can use with the chart command. Each time you invoke
the chart command, you can use one or more functions. However, you can only use one BY clause.
Sparkline options
Sparklines are inline charts that appear within table cells in search results and display time-based trends associated with
the primary key of each row.
sparkline-agg
Syntax: sparkline (count(<wc-field>), <span-length>) | sparkline (<sparkline-func>(<wc-field>), <span-length>)
Description: A sparkline specifier, which takes the first argument of an aggregation function on a field and an
optional timespan specifier. If no timespan specifier is used, an appropriate timespan is chosen based on the time
range of the search. If the sparkline is not scoped to a field, only the count aggregate function is permitted. You
can use wild card characters in field names.
span-length
See the Span options section in this topic.
sparkline-func
Syntax: c() | count() | dc() | mean() | avg() | stdev() | stdevp() | var() | varp() | sum() | sumsq() | min() |
max() | range()
Description: Aggregation function to use to generate sparkline values. Each sparkline value is produced
by applying this aggregation to the events that fall into each particular time bin.
The size of the sparkline is defined by settings in the limits.conf file. The sparkline_maxsize setting defines the
maximum number of elements to emit for a sparkline.
For more information see Add sparklines to your search results in the Search Manual.
Bin options
The bin options control the number and size of the bins that the search results are separated, or discretized, into.
bins
Syntax: bins=<int>
Description: Sets the maximum number of bins to discretize into. For example, if bins=300, the search finds the
smallest bin size that results in no more than 300 distinct bins.
Default: 300
span
Syntax: span=<log-span> | span=<span-length>
Description: Sets the size of each bin, using a span length based on time or log-based span. See the Span
options section in this topic.
<start-end>
Syntax: end=<num> | start=<num>
Description: Sets the minimum and maximum extents for numerical bins. Data outside of the [start, end] range is
discarded.
aligntime
Syntax: aligntime=(earliest | latest | <time-specifier>)
Description: Align the bin times to something other than base UNIX time (epoch 0). The aligntime option is valid
only when doing a time-based discretization. Ignored if span is in days, months, or years.
Span options
<log-span>
Syntax: [<num>]log[<num>]
Description: Sets to a logarithm-based span. The first number is a coefficient. The second number is the base. If
the first number is supplied, it must be a real number >= 1.0 and < base. Base, if supplied, must be real number >
1.0 (strictly greater than 1).
span-length
Syntax: <span>[<timescale>]
Description: A span length based on time.
<span>
Syntax: <int>
Description: The span of each bin. If using a timescale, this is used as a time range. If not, this is an
absolute bucket "length."
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <month> | <subseconds>
Description: Time scale units.
<subseconds>
Syntax: us | ms | cs | ds
Description: Time scale in microseconds (us), milliseconds (ms), centiseconds (cs), or deciseconds (ds).
tc options
The timechart options are part of the <column-split> argument and control the behavior of splitting search results by a
field. There are options that control the number and size of the bins that the search results are separated into. There are
options that control what happens when events do not contain the split-by field, and for events that do not meet the criteria
of the <where-clause>.
tc-options
Syntax: <bin-options> | usenull=<bool> | useother=<bool> | nullstr=<string> | otherstr=<string>
Description: Options for controlling the behavior of splitting by a field.
bin-options
See the Bin options section in this topic.
nullstr
Syntax: nullstr=<string>
Description: Specifies the name of the field for data series for events that do not contain the split-by
field. The nullstr option is only applicable when the usenull option is set to true.
Default: NULL
otherstr
Syntax: otherstr=<string>
Description: Specifies the name of the field for data series that do not meet the criteria of the
<where-clause>. The otherstr option is only applicable when the useother option is set to true.
Default: OTHER
usenull
Syntax: usenull=<bool>
Description: Controls whether or not a series is created for events that do not contain the split-by field.
Default: true
useother
Syntax: useother=<bool>
Description: Specifies if a series should be added for data series not included in the graph because the
series did not meet the criteria of the <where-clause>.
Default: true
where clause
where clause
Syntax: <single-agg> <where-comp>
Description: Specifies the criteria for including particular data series when a field is given in the tc-by-clause.
The most common use of this option is to select for spikes rather than overall mass of distribution in series
selection. The default value finds the top ten series by area under the curve. Alternately one could replace sum
with max to find the series with the ten highest spikes. This has no relation to the where command.
single-agg
Syntax: count | <stats-func>(<field>)
Description: A single aggregation applied to a single field, including an evaluated field. No wildcards are allowed.
The field must be specified, except when using the count aggregate function, which applies to events as a
whole.
<stats-func>
See the Statistical functions section in this topic.
<where-comp>
Syntax: <wherein-comp> | <wherethresh-comp>
Description: The criteria for the <where-clause>.
<wherein-comp>
Syntax: (in | notin) (top | bottom)<int>
Description: A grouping criteria for the <where-clause>. The aggregated series value must be in or not in
some top or bottom grouping.
<wherethresh-comp>
Syntax: ( < | > ) <num>
Description: A threshold for the <where-clause>. The aggregated series value must be greater than or
less than the specified numeric threshold.
Usage
Evaluation expressions
You can use the chart command with an eval expression. Unless you specify a split-by clause, the eval expression
must be renamed.
Supported functions
You can use a wide range of functions with the chart command. For general information about using functions, see
Statistical and charting functions.
Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the
distinct_count function requires far more memory than the count function. The values and list functions also can
consume a lot of memory.
If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider
replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might
result in significantly lower memory usage and run times.
Some statistical commands, such as stats, process functions that are not paired with one or more fields as if they are
implicitly paired with a wildcard, so the command applies the function to all available fields. For example, | stats sum is
treated as if it is | stats sum(*).
The chart command allows this behavior only with the count function. If you do not specify a field for count, chart applies
it to all events returned by the search. If you want to apply other functions to all fields, you must make the wildcard explicit:
| chart sum(*) .
X-axis
You can specify which field is tracked on the x-axis of the chart. The x-axis variable is specified with a by field and is
discretized if necessary. Charted fields are converted to numerical quantities if necessary.
Unlike the timechart command which generates a chart with the _time field as the x-axis, the chart command produces a
table with an arbitrary field as the x-axis.
You can also specify the x-axis field after the over keyword, before any by and subsequent split-by clause. The limit
and agg options allow easier specification of series filtering. The limit and agg options are ignored if an explicit
where-clause is provided.
When a column-split field is included, the output is a table where each column represents a distinct value of the
column-split field. This is in contrast with the stats command, where each row represents a single unique combination of
values of the group-by fields. The number of columns included is limited to 10 by default. You can change the number of
columns by including a where-clause.
With the chart and timechart commands, you cannot specify the same field in a function and as the row-split field.
For example, you cannot run this search. The field A is specified in the sum function and the row-split argument.
You must specify a different field in the row-split argument.
Alternatively, you can work around this problem by using an eval expression. For example:
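As an illustration, assuming a numeric field A, the first search below fails because A appears both in the sum function and as the row-split field, while the second copies A into a new field first (the span value is illustrative):
... | chart sum(A) BY A span=log2
... | eval A1=A | chart sum(A) BY A1 span=log2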
Subsecond span timescales, which are time spans made up of deciseconds (ds), centiseconds (cs), milliseconds (ms), or
microseconds (us), should be numbers that divide evenly into a second. For example, 1s = 1000ms. This means that
valid millisecond span values are 1, 2, 4, 5, 8, 10, 20, 25, 40, 50, 100, 125, 200, 250, or 500ms. In addition, span = 1000ms
is not allowed. Use span = 1s instead.
Basic examples
2. Chart the max(delay) for each value of foo, split by the value of bar
Return max( delay) for each value of foo split by the value of bar.
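A search of this form matches the description:
... | chart max(delay) OVER foo BY bar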
3. Chart the ratio of the average to the maximum "delay" for each distinct "host" and "user" pair
Return the ratio of the average (mean) "size" to the maximum "delay" for each distinct "host" and "user" pair.
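A sketch using an eval expression within the chart command looks like this:
... | chart eval(avg(size)/max(delay)) AS ratio BY host user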
4. Chart the maximum "delay" by "size" and separate "size" into bins
Return the maximum "delay" by "size", where "size" is broken down into a maximum of 10 equal sized bins.
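A search of this form breaks size into at most 10 bins:
... | chart max(delay) BY size bins=10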
Return the number of events, grouped by date and hour of the day, using span to group per 7 days and 24 hours per half
days. The span applies to the field immediately prior to the command.
7. Align the chart time bins to local time
Align the time bins to 5am (local time). Set the span to 12h. The bins will represent 5am - 5pm, then 5pm - 5am (the next
day), and so on.
For each unique value of mvfield, chart the average value of field. Deduplicates the values in the mvfield.
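A sketch of this search, with field and mvfield as placeholder names:
... | chart avg(field) BY mvfield dedup_splitvals=true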
Extended examples
This example uses events that list the numeric sales for each product and quarter, for example:
In this example, there are two fields specified in the BY clause with the chart command.
• The products field is referred to as the <row-split> field. In the chart, this field forms the X-axis.
• The quarter field is referred to as the <column-split> field. In the chart, this field forms the data series.
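A search along these lines produces such a table, assuming the events contain a numeric sales field:
... | chart sum(sales) BY products quarter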
The results appear on the Statistics tab and look something like this:
products QTR1 QTR2 QTR3 QTR4
See the addtotals command for an example that adds a total column for each product.
2. Chart the number of different page requests for each Web server
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
Chart the number of different page requests, GET and POST, that occurred for each Web server.
This example uses eval expressions to specify the different field values for the stats command to count. The first clause
uses the count() function to count the Web access events that contain the method field value GET. Then, using the AS
keyword, the field that represents these results is renamed GET.
The second clause does the same for POST events. The counts of both types of events are then separated by the web
server, using the BY clause with the host field.
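A sketch of this search, with the tutorial sourcetype assumed, looks like this:
sourcetype=access_* | chart count(eval(method="GET")) AS GET, count(eval(method="POST")) AS POST BY host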
The results appear on the Statistics tab and look something like this:
3. Chart the number of transactions by duration
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Create a chart to show the number of transactions based on their duration (in seconds).
This search uses the transaction command to define a transaction as events that share the clientip field and fit within a
ten minute time span. The transaction command creates a new field called duration, which is the difference between the
timestamps for the first and last events in the transaction. (Because maxspan=10m, the duration value should not be
greater than this.)
The transactions are then piped into the chart command. The count() function is used to count the number of
transactions and separate the count by the duration of each transaction. Because the duration is in seconds and you
expect there to be many values, the search uses the span argument to bucket the duration into bins of log2 (span=log2).
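A search along these lines matches the description, with the tutorial sourcetype assumed:
sourcetype=access_* | transaction clientip maxspan=10m | chart count BY duration span=log2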
The results appear on the Statistics tab and look something like this:
duration count
0 970
1-2 593
2-4 208
4-8 173
8-16 26
64-128 3
128-256 3
256-512 12
512-1024 2
Click the Visualization tab. If necessary, format the results as a column chart.
In this data set, most transactions take between 0 and 2 seconds to complete.
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Create a chart to show the average number of events in a transaction based on the duration of the transaction.
The transaction command adds two fields to the results, duration and eventcount. The eventcount field tracks the
number of events in a single transaction.
In this search, the transactions are piped into the chart command. The avg() function is used to calculate the average
number of events for each duration. Because the duration is in seconds and you expect there to be many values, the
search uses the span argument to bucket the duration into bins using logarithm with a base of 2.
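A sketch of this search, again assuming the tutorial sourcetype:
sourcetype=access_* | transaction clientip maxspan=10m | chart avg(eventcount) BY duration span=log2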
Click the Visualization tab and change the display to a pie chart.
Each wedge of the pie chart represents a duration for the event transactions. You can hover over a wedge to see the
average values.
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range Yesterday when you run the search.
Chart how many different people bought something and what they bought at the Buttercup Games online store Yesterday.
This search takes the purchase events and pipes it into the chart command. The dc() or distinct_count() function is
used to count the number of unique visitors (characterized by the clientip field). This number is then charted over each
hour of the day and broken out based on the category_id of the purchase. Also, because these are numeric values, the
search uses the usenull=f argument to exclude fields that don't have a value.
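A search of roughly this shape fits the description, with the tutorial sourcetype assumed:
sourcetype=access_* action=purchase | chart dc(clientip) OVER date_hour BY categoryId usenull=f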
The results appear on the Statistics tab and look something like this:
1 4 7 2 3 0 10 5
2 2 2 2 1 1 2 0
3 3 5 3 5 0 7 1
4 3 4 0 0 1 4 0
5 3 0 3 0 1 6 1
Click the Visualization tab. If necessary, format the results as a line chart:
Each line represents a different type of product that is sold at the Buttercup Games online store. The height of each line
shows the number of different people who bought the product during that hour. In general, it looks like the most popular
items at the online shop were Arcade games.
You can format the report as a stacked column chart, which will show you the total purchases at each hour of day.
This example uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file
that contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and add it as an input.
Create a chart that lists the number of earthquakes and the magnitude of each earthquake that occurred in and around
Alaska. Run the search using the time range All time.
source=all_month.csv place=*alaska* mag>=3.5 | chart count BY mag place useother=f | rename mag AS Magnitude
This search counts the number of earthquakes that occurred in the Alaska regions. The count is then broken down for
each place based on the magnitude of the quake. Because the place value is non-numeric, the search uses the
useother=f argument to exclude events that don't match.
The results appear on the Statistics tab and look something like this:
Magnitude  145km ENE of Chirikof Island, Alaska  225km SE of Kodiak, Alaska  250km SE of Kodiak, Alaska  252km SE of Kodiak, Alaska  254km SE of Kodiak, Alaska  255km SE of Kodiak, Alaska  259km SE of Kodiak, Alaska  264km SE of Kodiak, Alaska  265km SE of Kodiak, Alaska  Gulf of Alaska
3.5 1 1 0 1 0 1 0 0 2 2
3.6 0 0 1 0 0 0 0 1 0 1
3.7 0 0 0 0 1 0 0 0 0 2
3.8 0 1 0 0 0 0 1 1 0 3
3.9 0 0 1 0 1 0 0 0 0 0
4 0 0 0 0 1 1 0 0 0 1
4.1 0 0 0 0 0 0 0 0 0 1
4.2 0 0 0 1 0 0 0 0 0 1
4.3 0 0 0 0 0 0 0 0 0 1
4.4 0 0 0 0 0 0 1 0 0 1
4.6 1 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 1
Click on the Visualization tab to view the results on a chart. This chart shows the number of earthquakes by magnitude.
See also
Commands
timechart
bin
sichart
Blogs
Search commands > stats, chart, and timechart
cluster
Description
The cluster command groups events together based on how similar they are to each other. Unless you specify a different
field, cluster groups events based on the contents of the _raw field. The default grouping method is to break down the
events into terms (match=termlist) and compute the vector between events. Set a higher threshold value for t, if you
want the command to be more discriminating about which events are grouped together.
The result of the cluster command appends two new fields to each event. You can specify what to name these fields with
the countfield and labelfield parameters, which default to cluster_count and cluster_label. The cluster_count value
is the number of events that are part of the cluster, or the cluster size. Each event in the cluster is assigned the
cluster_label value of the cluster it belongs to. For example, if the search returns 10 clusters, then the clusters are
labeled from 1 to 10.
Syntax
cluster [slc-options]...
Optional arguments
slc-options
Syntax: t=<num> | delims=<string> | showcount=<bool> | countfield=<field> | labelfield=<field> | field=<field> |
labelonly=<bool> | match=(termlist | termset | ngramset)
Description: Options for configuring simple log clusters (slc).
SLC options
t
Syntax: t=<num>
Description: Sets the cluster threshold, which controls the sensitivity of the clustering. This value needs to be a
number greater than 0.0 and less than 1.0. The closer the threshold is to 1, the more similar events have to be for
them to be considered in the same cluster.
Default: 0.8
delims
Syntax: delims=<string>
Description: Configures the set of delimiters used to tokenize the raw string. By default, everything except 0-9,
A-Z, a-z, and '_' are delimiters.
showcount
Syntax: showcount=<bool>
Description: If showcount=false, each indexer clusters its own events before clustering on the search head. When
showcount=false, the event count is not added to the event. When showcount=true, the event count for each
cluster is recorded and each event is annotated with the count.
Default: showcount=false
countfield
Syntax: countfield=<field>
Description: Name of the field to which the cluster size is to be written if showcount=true. The cluster size
is the count of events in the cluster.
Default: cluster_count.
labelfield
Syntax: labelfield=<field>
Description: Name of the field to write the cluster number to. As the events are grouped into clusters, each
cluster is counted and labelled with a number.
Default: cluster_label
field
Syntax: field=<field>
Description: Name of the field to analyze in each event.
Default: _raw
labelonly
Syntax: labelonly=<bool>
Description: Select whether to preserve incoming events and annotate them with the cluster they belong to
(labelonly=true) or output only the cluster fields as new events (labelonly=false). When labelonly=false, outputs
the list of clusters with the event that describes it and the count of events that combined with it.
Default: false
match
Syntax: match=(termlist | termset | ngramset)
Description: Select the method used to determine the similarity between events. termlist breaks down the field
into words and requires the exact same ordering of terms. termset allows for an unordered set of terms. ngramset
compares sets of trigrams (3-character substrings). ngramset is significantly slower on large field values and is
most useful for short non-textual fields, like punct.
Default: termlist
Usage
The cluster command is a streaming command or a dataset processing command, depending on which arguments are
specified with the command. See Command types.
Use the cluster command to find common or rare events in your data. For example, if you are investigating an IT
problem, use the cluster command to find anomalies. In this case, anomalous events are those that are not grouped into
big clusters or clusters that contain few events. Or, if you are searching for errors, use the cluster command to see
approximately how many different types of errors there are and what types of errors are common in your data.
Examples
Example 1
Quickly return a glimpse of anything that is going wrong in your Splunk deployment. Your role must have the appropriate
capabilities to access the internal indexes.
This search takes advantage of what Splunk software logs about its operation in the _internal index. It returns all logs
where the log_level is DEBUG, WARN, ERROR, FATAL and clusters them together. Then it sorts the clusters by the
count of events in each cluster.
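A sketch of this search, with the log_level filter expressed as an assumption about how the non-INFO levels are selected:
index=_internal source=*splunkd.log* log_level!=INFO | cluster showcount=t | table cluster_count _raw | sort -cluster_count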
The results appear on the Statistics tab and look something like this:
cluster_count raw
303010 03-20-2018 09:37:33.806 -0700 ERROR HotDBManager - Unable to create directory /Applications/Splunk/var/lib/splunk/_internaldb/db/hot_v1_49427345 because No such file or directory
151506 03-20-2018 09:37:33.811 -0700 ERROR pipeline - Uncaught exception in pipeline execution (indexer) - getting next event
16390 04-05-2018 08:30:53.996 -0700 WARN SearchResultsMem - Failed to append to multival. Original value not converted successfully to multival.
486 03-20-2018 09:37:33.811 -0700 ERROR BTreeCP - failed: failed to mkdir /Applications/Splunk/var/lib/splunk/fishbucket/splunk_private_db/snapshot.tmp: No such file or directory
216 03-20-2018 09:37:33.814 -0700 WARN DatabaseDirectoryManager - idx=_internal Cannot open file='/Applications/Splunk/var/lib/splunk/_internaldb/db/.bucketManifest99454_1652919429_tmp' for writing bucket manifest (No such file or directory)
216 03-20-2018 09:37:33.814 -0700 ERROR SearchResultsWriter - Unable to open output file: path=/Applications/Splunk/var/lib/splunk/_internaldb/db/.bucketManifest99454_1652919429_tmp error=No such file or directory
Example 2
This returns clusters of events and uses the sort command to display them in ascending order based on the cluster size,
which are the values of cluster_count. Because they don't cluster into large groups, you can consider these rare or
uncommon events.
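For example, a search along these lines sorts the clusters in ascending order of size:
index=_internal source=*splunkd.log* | cluster showcount=t | sort cluster_count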
Example 3
Cluster similar error events together and search for the most frequent type of error.
This searches your index for events that include the term "error" and clusters them together if they are similar. The sort
command is used to display the events in descending order based on the cluster size, cluster_count, so that the largest
clusters are shown first. The head command is then used to show the twenty largest clusters. Now that you've found the
most common types of errors in your data, you can dig deeper to find the root causes of these errors.
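A sketch of this search, with the threshold value t=0.9 as an illustrative choice:
error | cluster t=0.9 showcount=t | sort - cluster_count | head 20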
Example 4
Use the cluster command to see an overview of your data. If you have a large volume of data, run the following search
over a small time range, such as 15 minutes or 1 hour, or restrict it to a source type or index.
... | cluster labelonly=t showcount=t | sort - cluster_count, cluster_label, _time | dedup 5 cluster_label
This search helps you to learn more about your data by grouping events together based on their similarity and showing
you a few events from each cluster. It uses labelonly=t to keep each event in the cluster and append them with a
cluster_label. The sort command is used to show the results in descending order by its size (cluster_count), then its
cluster_label, then the indexed timestamp of the event (_time). The dedup command is then used to show the first five
events in each cluster, using the cluster_label to differentiate between each cluster.
See also
cofilter
Description
Use this command to determine how many times field1 and field2 values occur together.
This command implements one step in a collaborative filtering analysis for making recommendations. Given a user field
(field1) and an item field (field2), it finds how common each pair of items is. That is, it computes sum(A has X and A
has Y) where X and Y are distinct items and A is each distinct user.
Syntax
Required arguments
field1
Syntax: <field>
Description: The name of a field.
field2
Syntax: <field>
Description: The name of a field.
Usage
Examples
Example 1
Find the cofilter for user and item. The user field must be specified first and followed by the item field. The output is an
event for each pair of items with: the first item and its popularity, the second item and its popularity, and the popularity of
that pair of items.
| makeresults | eval user="a b c a b c a b c" | makemv user | mvexpand user | streamstats count
The results appear on the Statistics tab and look something like this:
_time count user
2020-02-19 21:17:54 1 a
2020-02-19 21:17:54 2 b
2020-02-19 21:17:54 3 c
2020-02-19 21:17:54 4 a
2020-02-19 21:17:54 5 b
2020-02-19 21:17:54 6 c
2020-02-19 21:17:54 7 a
2020-02-19 21:17:54 8 b
2020-02-19 21:17:54 9 c
The eval command with the modulus ( % ) operator is used to create the item field:
| makeresults | eval user="a b c a b c a b c" | makemv user | mvexpand user | streamstats count | eval item
= count % 5
_time count item user
2020-02-19 21:17:54 1 1 a
2020-02-19 21:17:54 2 2 b
2020-02-19 21:17:54 3 3 c
2020-02-19 21:17:54 4 4 a
2020-02-19 21:17:54 5 0 b
2020-02-19 21:17:54 6 1 c
2020-02-19 21:17:54 7 2 a
2020-02-19 21:17:54 8 3 b
2020-02-19 21:17:54 9 4 c
Add the cofilter command to the search to determine for each pair of item values, how many user values occurred with
each.
| makeresults | eval user="a b c a b c a b c" | makemv user | mvexpand user | streamstats count | eval item
= count % 5 | cofilter user item
Item 1 Item 1 user count Item 2 Item 2 user count Pair count
1 2 2 2 1
1 2 3 2 1
1 2 4 2 2
2 2 3 2 1
2 2 4 2 1
2 2 0 1 1
3 2 4 2 1
3 2 0 1 1
See also
associate, correlate
collect
Description
Adds the results of a search to a summary index that you specify. You must create the summary index before you invoke
the collect command.
You do not need to know how to use collect to create and use a summary index, but it can help. For an overview of
summary indexing, see Use summary indexing for increased search efficiency in the Knowledge Manager Manual.
Syntax
Required arguments
index
Syntax: index=<string>
Description: Name of the summary index where the events are added. The index must exist before the events
are added. The index is not created automatically.
Optional arguments
arg-options
Syntax: addtime=<bool> | file=<string> | spool=<bool> | marker=<string> | output_format=[raw | hec] | testmode=<bool> | run_in_preview=<bool> | host=<string> | source=<string> | sourcetype=<string>
Description: Optional arguments for the collect command. See the arg-options section for the descriptions for
each option.
arg-options
addtime
Syntax: addtime=<bool>
Description: Use this option to specify whether to prefix a time field on to each event. Some commands return
results that do not have a _raw field, such as the stats, chart, timechart commands. If you specify addtime=false,
the Splunk software uses its generic date detection against fields in whatever order they happen to be in the
summary rows. If you specify addtime=true, the Splunk software uses the search time range info_min_time. This
time range is added by the sistats command or _time. Splunk software adds the time field based on the first field
that it finds: info_min_time, _time, or now().
This option is not valid when output_format=hec.
Default: true
file
Syntax: file=<string>
Description: The file name where you want the events to be written. You can use a timestamp or a random
number for the file name by specifying either file=$timestamp$ or file=$random$.
Usage: ".stash" needs to be added at the end of the file name when used with "index=". Otherwise, the data is
added to the main index.
Default: <random-number>_events.stash
host
Syntax: host=<string>
Description: The name of the host that you want to specify for the events.
This option is not valid when output_format=hec.
marker
Syntax: marker=<string>
Description: A string, usually of key-value pairs, to append to each event written out. Each key-value pair must
be separated by a comma and a space.
If the value contains spaces or commas, it must be escape quoted. For example if the key-value pair is
search_name=vpn starts and stops, you must change it to search_name=\"vpn starts and stops\".
This option is not valid when output_format=hec.
output_format
Syntax: output_format=[raw | hec]
Description: Specifies the output format for the summary indexing. If set to raw, uses the traditional
non-structured log style summary indexing stash output format.
If set to hec, it generates HTTP Event Collector (HEC) JSON formatted output:
◊ All fields are automatically indexed when the stash file is indexed.
◊ The file that is written to the var/spool/splunk path ends in .stash_hec instead of .stash.
◊ Allows the source, sourcetype, and host from the original data to be used directly in the summary index.
Does not re-map these fields to the extract_host/extracted_sourcetype/... path.
◊ The index and splunk_server fields in the original data are ignored.
◊ You cannot use the addtime, host, marker, source, or the sourcetype options when output_format=hec.
Default: raw
run_in_preview
Syntax: run_in_preview=<bool>
Description: Controls whether the collect command is enabled during preview generation. Generally, you do not want to insert preview results into the summary index (run_in_preview=false). In some cases, such as when a
custom search command is used as part of the search, you might want to turn this on to ensure correct summary
indexable previews are generated.
Default: false
spool
Syntax: spool=<bool>
Description: If set to true, the summary indexing file is written to the Splunk spool directory, where it is indexed
automatically. If set to false, the file is written to the $SPLUNK_HOME/var/run/splunk directory. The file remains in
this directory unless some form of further automation or administration is done. If you have Splunk Enterprise, you
can use this command to troubleshoot summary indexing by dumping the output file to a location on disk where it
will not be ingested as data.
Default: true
source
Syntax: source=<string>
Description: The name of the source that you want to specify for the events.
This option is not valid when output_format=hec.
sourcetype
Syntax: sourcetype=<string>
Description: The name of the source type that you want to specify for the events. By specifying a sourcetype
outside of stash, you will incur license usage.
This option is not valid when output_format=hec.
Default: stash
testmode
Syntax: testmode=<bool>
Description: Toggle between testing and real mode. In testing mode the results are not written into the new index
but the search results are modified to appear as they would if sent to the index.
Default: false
Usage
The events are written to a file whose name format is: random-num_events.stash, unless overwritten, in a directory
that your Splunk deployment is monitoring. If the events contain a _raw field, then this field is saved. If the events do not
have a _raw field, one is created by concatenating all the fields into a comma-separated list of key=value pairs.
The collect command also works with real-time searches that have a time range of All time.
If you apply the collect command to events that do not have timestamps, the command designates a time for all of the
events using the earliest (or minimum) time of the search range. For example, if you use the collect command over the
past four hours (range: -4h to +0h), the command assigns a timestamp that is four hours prior to the time that the search
was launched. The timestamp is applied to all of the events without a timestamp.
If you use the collect command with a time range of All time and the events do not have timestamps, the current system
time is used for the timestamps.
For more information on summary indexing of data without timestamps, see Use summary indexing for increased
reporting efficiency in the Knowledge Manager Manual.
You can use the collect command to copy search results to another index. Construct a search that returns the data you
want to copy, and pipe the results to the collect command. For example:
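A minimal sketch (the source index name foo is an assumption; any search that returns the data you want to copy goes before the pipe):
index=foo | collect index=bar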
This search writes the results into the bar index. The sourcetype is changed to stash.
You can specify a sourcetype with the collect command. However, specifying a sourcetype counts against your license,
as if you indexed the data again.
Examples
You want to collect hourly statistics on VPN connects and disconnects by country.
The addinfo command ensures that the search results contain fields that specify when the search was run to populate
these particular index values.
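A hedged sketch consistent with this description (the index names and field names are assumptions; the hourly cadence would come from scheduling the search to run every hour):
index=vpn (action=connect OR action=disconnect) | stats count BY country, action | addinfo | collect index=summary_vpn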
See also
Commands
overlap
sichart
sirare
sistats
sitimechart
sitop
tscollect
concurrency
Description
Concurrency measures the number of events which have spans that overlap with the start of each event. Alternatively,
this measurement represents the total number of events in progress at the time that each particular event started,
including the event itself. This command does not measure the total number of events that a particular event overlapped
with during its total span.
Syntax
Required arguments
duration
Syntax: duration=<field>
Description: A field that represents a span of time. This field must be numeric and use the same units as the start field. For example, the duration field generated by the transaction command is in seconds (see Example 1), which can be used with the default start field, _time, which is also in units of seconds.
Optional arguments
start
Syntax: start=<field>
Description: A field that represents the start time.
Default: _time
output
Syntax: output=<field>
Description: A field to write the resulting number of concurrent events.
Default: "concurrency"
Usage
An event X is concurrent with event Y if X.start is between Y.start and (Y.start + Y.duration)
If your events have a time that represents event completion and a span that represents the time before the completion,
you need to subtract duration from the start time before the concurrency command:
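A hedged sketch of that adjustment (the new_start field name is illustrative):
... | eval new_start = _time - duration | concurrency start=new_start duration=duration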
Limits
There is a limitation on the quantity of overlapping items. If the maximum tracked concurrency exceeds max_count, from the [concurrency] stanza in limits.conf, a warning is produced in the UI / search output, and the values are clamped, making them potentially inaccurate. This limit defaults to 10000000 (ten million).
Basic examples
Determine the number of overlapping HTTP requests outstanding from browsers accessing splunkd at the time that each
http request begins.
This relies on the fact that the timestamp of the logged message is the time that the request came in, and the 'spent' field
is the number of milliseconds spent handling the request. As always, you must be an 'admin' user, or have altered your
roles scheme in order to access the _internal index.
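A sketch of such a search (the sourcetype name and the name of the converted field are assumptions):
index=_internal sourcetype=splunkd_ui_access | eval spent_sec = spent / 1000 | concurrency duration=spent_sec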
Calculate the number of concurrent events for each event and emit as field 'foo':
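For example (a sketch; the total_time duration field is an assumption):
... | concurrency duration=total_time output=foo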
Calculate the number of concurrent events using the 'et' field as the start time and 'length' as the duration:
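A sketch of this search:
... | concurrency duration=length start=et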
Extended examples
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Use the duration or span of a transaction to count the number of other transactions that occurred at the same time.
• This search groups events into transactions if they have the same values of JSESSIONID and clientip. An event is
the beginning of the transaction if the event contains the string "view". An event is the last event of the transaction
if the event contains the string "purchase".
• The transaction command returns a field called duration.
• The transactions are then piped into the concurrency command, which counts the number of events that occurred
at the same time based on the timestamp and duration of the transaction.
• The search also uses the eval command and the tostring() function to reformat the values of the duration field
to a more readable format, HH:MM:SS.
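A sketch of the search described in these bullets (the sourcetype is an assumption based on the tutorial data):
sourcetype=access_* | transaction JSESSIONID clientip startswith="view" endswith="purchase" | concurrency duration=duration | eval duration=tostring(duration,"duration")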
To see the values in each transaction for the JSESSIONID, clientip, concurrency, and duration fields, you can add those fields to a table command at the end of the search.
2. Count the purchases that occurred at the same time
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Use the time between each purchase to count the number of different purchases that occurred at the same time.
• This search uses the delta command and the _time field to calculate the time between one purchase event
(action=purchase) and the purchase event immediately preceding it.
• The search renames this change in time as timeDelta.
• Some of the values of timeDelta are negative. Because the concurrency command does not work with negative
values, the eval command is used to redefine timeDelta as its absolute value (abs(timeDelta)).
• The timeDelta is then used as the duration for calculating concurrent events.
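A sketch of the search described in these bullets (the sourcetype is an assumption based on the tutorial data):
sourcetype=access_* action=purchase | delta _time AS timeDelta p=1 | eval timeDelta=abs(timeDelta) | concurrency duration=timeDelta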
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Use the time between each consecutive transaction to calculate the number of transactions that occurred at the same
time.
• This search groups events into transactions if they have the same values of JSESSIONID and clientip. An event is
the beginning of the transaction if the event contains the string "view". An event is the last event of the transaction
if the event contains the string "purchase".
• The transaction command returns a field called duration.
• The transactions are then piped into the delta command, which uses the _time field to calculate the time between
one transaction and the transaction immediately preceding it.
• The search renames this change in time as timeDelta.
• Some of the values of timeDelta are negative. Because the concurrency command does not work with negative
values, the eval command is used to redefine timeDelta as its absolute value (abs(timeDelta)).
• This timeDelta is then used as the duration for calculating concurrent transactions.
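A sketch of the search described in these bullets (the sourcetype is an assumption based on the tutorial data):
sourcetype=access_* | transaction JSESSIONID clientip startswith="view" endswith="purchase" | delta _time AS timeDelta p=1 | eval timeDelta=abs(timeDelta) | concurrency duration=timeDelta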
See also
timechart
contingency
Description
In statistics, contingency tables are used to record and analyze the relationship between two or more (usually categorical)
variables. Many metrics of association or independence, such as the phi coefficient or the Cramer's V, can be calculated
based on contingency tables.
You can use the contingency command to build a contingency table, which in this case is a co-occurrence matrix for the
values of two fields in your data. Each cell in the matrix displays the count of events in which both of the cross-tabulated
field values exist. This means that the first row and column of this table is made up of values of the two fields. Each cell in
the table contains a number that represents the count of events that contain the two values of the field in that row and
column combination.
If a relationship or pattern exists between the two fields, you can spot it easily just by analyzing the information in the
table. For example, if the column values vary significantly between rows (or vice versa), there is a contingency between
the two fields (they are not independent). If there is no contingency, then the two fields are independent.
Syntax
Required arguments
<field1>
Syntax: <field>
Description: Any field. You cannot specify wildcard characters in the field name.
<field2>
Syntax: <field>
Description: Any field. You cannot specify wildcard characters in the field name.
Optional arguments
contingency-options
Syntax: <maxopts> | <mincover> | <usetotal> | <totalstr>
Description: Options for the contingency table.
Contingency options
maxopts
Syntax: maxrows=<int> | maxcols=<int>
Description: Specify the maximum number of rows or columns to display. If the number of distinct values of the
field exceeds this maximum, the least common values are ignored. A value of 0 means that the maximum limit on rows or columns is taken from the maxvalues setting in the [ctable] stanza in the limits.conf file.
Default: 1000
mincover
Syntax: mincolcover=<num> | minrowcover=<num>
Description: Specify a percentage of values per column or row that you would like represented in the output
table. As the table is constructed, enough rows or columns are included to reach this ratio of displayed values to
total values for each row or column. The maximum rows or columns take precedence if those values are reached.
Default: 1.0
usetotal
Syntax: usetotal=<bool>
Description: Specify whether or not to add row, column, and complete totals.
Default: true
totalstr
Syntax: totalstr=<field>
Description: Field name for the totals row and column.
Default: TOTAL
Usage
This command builds a contingency table for two fields. If you have fields with many values, you can restrict the number
of rows and columns using the maxrows and maxcols arguments.
Totals
By default, the contingency table displays the row totals, column totals, and a grand total for the counts of events that are
represented in the table. If you don't want the totals to appear in the results, include the usetotal=false argument with the
contingency command.
Empty values
Values which are empty strings ("") will be represented in the results table as EMPTY_STR.
Limits
There is a limit on the value of maxrows or maxcols, which means more than 1000 values for either field will not be used.
Examples
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days. Use the time range All time when you run the
searches.
You want to build a contingency table to look at the relationship between the magnitudes and depths of recent
earthquakes. You start with a simple search.
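A minimal sketch of such a starting search (assuming the uploaded file is searched by its source name):
source=all_month.csv | contingency mag depth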
There are quite a range of values for the Magnitude and Depth fields, which results in a very large table. The magnitude
values appear in the first column. The depth values appear in the first row. The list is sorted by magnitude.
The results appear on the Statistics tab. The following table shows only a small portion of the table of results returned
from the search.
mag 10 0 5 35 8 12 15 11.9 11.8 6.4 5.4 8.2 6.5 8.1 5.6 10.1 9 8.5 9.8 8.7 7.9
-0.81 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-0.59 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-0.56 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-0.45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-0.43 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
As you can see, earthquakes can have negative magnitudes. Only where an earthquake occurred that matches the
magnitude and depth will a count appear in the table.
To build a more usable contingency table, you should reformat the values for the magnitude and depth fields. Group the
magnitudes and depths into ranges.
source=all_month.csv | eval Magnitude=case(mag<=1, "0.0 - 1.0", mag>1 AND mag<=2, "1.1 - 2.0", mag>2 AND
mag<=3, "2.1 - 3.0", mag>3 AND mag<=4, "3.1 - 4.0", mag>4 AND mag<=5, "4.1 - 5.0", mag>5 AND mag<=6, "5.1 -
6.0", mag>6 AND mag<=7, "6.1 - 7.0", mag>7,"7.0+") | eval Depth=case(depth<=70, "Shallow", depth>70 AND
depth<=300, "Mid", depth>300 AND depth<=700, "Deep") | contingency Magnitude Depth | sort Magnitude
This search uses the eval command with the case() function to redefine the values of Magnitude and Depth, bucketing
them into a range of values. For example, the Depth values are redefined as "Shallow", "Mid", or "Deep". Use the sort
command to sort the results by magnitude. Otherwise the results are sorted by the row totals.
The results appear on the Statistics tab and look something like this:
Magnitude Shallow Mid Deep TOTAL
5.1 - 6.0 63 12 3 78
6.1 - 7.0 2 2 1 5
Determine if there are any components that might be causing issues in your Splunk deployment. Build a contingency table
to see if there is a relationship between the values of log_level and component. Run the search using the time range All
time and limit the number of columns returned.
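A sketch of such a search (assuming the internal index; maxcols=5 matches the description that follows):
index=_internal | contingency maxcols=5 log_level component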
These results show you any components that might be causing issues in your Splunk deployment. The component field
has more than 50 values. In this search, the maxcols argument is used to show 5 components with the highest values.
See also
associate, correlate
convert
Description
The convert command converts field values in your search results into numerical values. Unless you use the AS clause,
the original values are replaced by the new values.
Alternatively, you can use evaluation functions such as strftime(), strptime(), or tonumber() to convert field values.
Syntax
Required arguments
<convert-function>
Syntax: auto() | ctime() | dur2sec() | memk() | mktime() | mstime() | none() | num() | rmcomma() | rmunit()
Description: Functions to use for the conversion.
Optional arguments
timeformat
Syntax: timeformat=<string>
Description: Specify the output format for the converted time field. The timeformat option is used by ctime and
mktime functions. For a list and descriptions of format options, see Common time format variables in the Search
Reference.
Default: %m/%d/%Y %H:%M:%S. Note that this default does not conform to the locale settings.
<field>
Syntax: <string>
Description: Creates a new field with the name you specify to place the converted values into. The original field
and values remain intact.
Convert functions
auto()
Syntax: auto(<wc-field>)
Description: Automatically convert the fields to a number using the best conversion. Note that if not all values of
a particular field can be converted using a known conversion type, the field is left untouched and no conversion at
all is done for that field. You can use a wildcard ( * ) character in the field name.
ctime()
Syntax: ctime(<wc-field>)
Description: Convert an epoch time to an ascii human readable time. Use the timeformat option to specify exact
format to convert to. You can use a wildcard ( * ) character in the field name.
dur2sec()
Syntax: dur2sec(<wc-field>)
Description: Convert a duration format "[D+]HH:MM:SS" to seconds. You can use a wildcard ( * ) character in the
field name.
memk()
Syntax: memk(<wc-field>)
Description: Accepts a positive number (integer or float) followed by an optional "k", "m", or "g". The letter k
indicates kilobytes, m indicates megabytes, and g indicates gigabytes. If no letter is specified, kilobytes is
assumed. The output field is a number expressing quantity of kilobytes. Negative values cause data incoherency.
You can use a wildcard ( * ) character in the field name.
mktime()
Syntax: mktime(<wc-field>)
Description: Convert a human readable time string to an epoch time. Use timeformat option to specify exact
format to convert from. You can use a wildcard ( * ) character in the field name.
mstime()
Syntax: mstime(<wc-field>)
Description: Convert a [MM:]SS.SSS format to seconds. You can use a wildcard ( * ) character in the field name.
none()
Syntax: none(<wc-field>)
Description: In the presence of other wildcards, indicates that the matching fields should not be converted. You
can use a wildcard ( * ) character in the field name.
num()
Syntax: num(<wc-field>)
Description: Like auto(), except non-convertible values are removed. You can use a wildcard ( * ) character in
the field name.
rmcomma()
Syntax: rmcomma(<wc-field>)
Description: Removes all commas from value, for example rmcomma(1,000,000.00) returns 1000000.00. You
can use a wildcard ( * ) character in the field name.
rmunit()
Syntax: rmunit(<wc-field>)
Description: Looks for numbers at the beginning of the value and removes trailing text. You can use a wildcard (
* ) character in the field name.
Usage
Basic examples
Use the auto convert function to convert all field values to numeric values.
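One way to write this (a sketch; the base search is represented by ...):
... | convert auto(*)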
Convert every field value to a number value except for values in the field foo. Use the none convert function to specify
fields to ignore.
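A sketch:
... | convert auto(*) none(foo)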
Change the sendmail syslog duration format (D+HH:MM:SS) to seconds. For example, if delay="00:10:15", the resulting
value is delay="615". This example uses the dur2sec convert function.
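A sketch:
... | convert dur2sec(delay)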
Convert the values in the duration field, which contain numeric and string values, to numeric values by removing the
string portion of the values. For example, if duration="212 sec", the resulting value is duration="212". This example
uses the rmunit convert function.
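A sketch:
... | convert rmunit(duration)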
Change all memory values in the virt field to KBs. This example uses the memk convert function.
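A sketch:
... | convert memk(virt)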
Extended Examples
Convert a UNIX time to a more readable time formatted to show hours, minutes, and seconds.
• The ctime() function converts the _time value in the CSV file events to the format specified by the timeformat
argument.
• The timeformat="%H:%M:%S" argument tells the search to format the _time value as HH:MM:SS.
• The converted time ctime field is renamed c_time.
• The table command is used to show the original _time value and the ctime field.
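A sketch of the search described in these bullets (the source name is an assumption):
source=sample.csv | convert timeformat="%H:%M:%S" ctime(_time) AS c_time | table _time, c_time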
The results appear on the Statistics tab and look something like this:
_time c_time
2018-03-27 17:20:14.839 17:20:14
• The mstime() function converts the _time field values from a minutes and seconds to just seconds.
• The table command is used to show the original _time value and the converted time.
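A sketch of the search described in these bullets (the source name is an assumption):
source=sample.csv | convert mstime(_time) AS ms_time | table _time, ms_time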
_time ms_time
2018-03-27 17:20:14.839 1522196414.839
3. Convert a string time in HH:MM:SS into a number
Convert a string field time_elapsed that contains times in the format HH:MM:SS into a number. Sum the time_elapsed by
the user_id field. This example uses the eval command to convert the converted results from seconds into minutes.
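A hedged sketch of such a search (the name of the summed field is illustrative):
... | convert dur2sec(time_elapsed) | stats sum(time_elapsed) AS total_elapsed BY user_id | eval total_elapsed_minutes = total_elapsed / 60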
See also
Commands
eval
fieldformat
Functions
tonumber
strptime
correlate
Description
You can use the correlate command to see an overview of the co-occurrence between fields in your data. The results
are presented in a matrix format, where the cross tabulation of two fields is a cell value. The cell value represents the
percentage of times that the two fields exist in the same events.
The field the result is specific to is named in the value of the RowField field, while the fields it is compared against are the
names of the other fields.
Note: This command looks at the relationship among all the fields in a set of search results. If you want to analyze the
relationship between the values of fields, refer to the contingency command, which counts the co-ocurrence of pairs of
field values in events.
Syntax
correlate
Limits
There is a limit on the number of fields that correlate considers in a search. From limits.conf, stanza [correlate], the
maxfields sets this ceiling. The default is 1000.
If more than this many fields are encountered, the correlate command continues to process data for the first N (for example, the first thousand) field names encountered, but ignores data for additional fields. If this occurs, the notification from the search or
alert contains a message "correlate: input fields limit (N) reached. Some fields may have been ignored."
As with all designed-in limits, adjusting this might have significant memory or cpu costs.
Examples
Example 1:
index=_internal | correlate
Because there are different types of logs in the _internal index, you can expect to see that many of the fields do not co-occur.
Example 2:
sourcetype=access_* | correlate
You expect all Web access events to share the same fields: clientip, referer, method, and so on. But, because the
sourcetype=access_* includes both access_common and access_combined Apache log formats, you should see that the
percentages of some of the fields are less than 1.0.
Example 3:
eventtype=download | correlate
The more narrow your search is before you pass the results into correlate, the more likely it is that all the field value pairs
have a correlation of 1.0. A correlation of 1.0 means the values co-occur in 100% of the search results. For these
download events, you might be able to spot an issue depending on which pairs have less than 1.0 co-occurrence.
See also
associate, contingency
ctable
The ctable, or counttable, command is an alias for the contingency command. See the contingency command for the
syntax and examples.
datamodel
Description
Use the datamodel command to return the JSON for all or a specified data model and its datasets. You can also search against the specified data model or a dataset within that data model.
A data model is a hierarchically-structured search-time mapping of semantic knowledge about one or more datasets. A
data model encodes the domain knowledge necessary to build a variety of specialized searches of those datasets. For
more information, see About data models and Design data models in the Knowledge Manager Manual.
The datamodel search command lets you search existing data models and their datasets from the search interface.
The datamodel command is a generating command and should be the first command in the search. Generating
commands use a leading pipe character.
Syntax
| datamodel [<data model name>] [<dataset name>] [<data model search mode>] [strict_fields=<bool>]
[allow_old_summaries=<bool>] [summariesonly=<bool>]
Required arguments
None
Optional arguments
dataset name
Syntax: <string>
Description: The name of a data model dataset to search. Must be specified after the data model name. The
search returns the JSON for the single dataset.
allow_old_summaries
Syntax: allow_old_summaries=<bool>
Description: This argument applies only to accelerated data models. When you change the constraints that
define a data model but the Splunk software has not fully updated the summaries to reflect that change, the
summaries may have some data that matches the old definition and some data that matches the new definition.
By default, allow_old_summaries = false, which means that the search head does not use summary directories
that are older than the new summary definition. This ensures that the datamodel search results always reflect your
current configuration. When you set allow_old_summaries = true, datamodel uses both current summary data
and summary data that was generated prior to the definition change. You can set allow_old_summaries=true in
your search if you feel that the old summary data is close enough to the new summary data that its results are
reliable.
Default: false
summariesonly
Syntax: summariesonly=<bool>
Description: This argument applies only to accelerated data models. When set to false, the datamodel search
returns both summarized and unsummarized data for the selected data model. When set to true, the search
returns results only from the data that has been summarized in TSIDX format for the selected data model. You
can use this argument to identify what data is currently summarized for a given data model, or to ensure that a
particular data model search runs efficiently.
Default: false
strict_fields
Syntax: strict_fields=<bool>
Description: Determines the scope of the datamodel search in terms of fields returned. When
strict_fields=true, the search returns only default fields and fields that are included in the constraints of the
specified data model dataset. When strict_fields=false, the search returns all fields defined in the data model,
including fields inherited from parent data model datasets, extracted fields, calculated fields, and fields derived
from lookups.
You can also arrange for strict_fields to default to false for a specific data model. See Design data models in
the Knowledge Manager Manual.
Default: true
Mode Description
search Returns the search results exactly how they are defined.
flat Returns the same results as the search mode, except that it strips the hierarchical information from the field names. For example, where search mode might return a field named dmdataset.server, the flat mode returns a field named server.
acceleration_search Runs the search that the search head uses to accelerate the data model. This mode works only on root event datasets and root search datasets that only use streaming commands.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Examples
Return JSON for all data models available in the current app context.
| datamodel
2. Return the JSON for a specific datamodel
Return JSON for the Splunk's Internal Audit Logs - SAMPLE data model, which has the model ID internal_audit_logs.
| datamodel internal_audit_logs
Search Buttercup Games's Client_errors dataset for 404 errors and count the number of events.
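A hedged sketch of this search (the data model ID Buttercup_Games and the status field are assumptions):
| datamodel Buttercup_Games Client_errors search | search status=404 | stats count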
6. For an accelerated data model, reveal what data has been summarized over a selected time range
After the Tutorial data model is accelerated, this search uses the summariesonly argument in conjunction with timechart
to reveal what data has been summarized for the Client_errors dataset over a selected time range.
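A sketch consistent with that description (the exact data model and dataset identifiers are assumptions):
| datamodel Tutorial Client_errors search summariesonly=true | timechart count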
See also
pivot
datamodelsimple
The datamodelsimple command is used with the Splunk Common Information Model Add-on.
For information about this command, see Use the datamodelsimple command in the Common Information Model Add-on
Manual.
dbinspect
Description
Returns information about the buckets in the specified index. If you are using Splunk Enterprise, this command helps you
understand where your data resides so you can optimize disk usage as required. Searches on an indexer cluster return
results only from the primary buckets.
The Splunk index is the repository for data ingested by Splunk software. As incoming data is indexed and transformed
into events, Splunk software creates files of rawdata and metadata (index files). The files reside in sets of directories
organized by age. These directories are called buckets.
For more information, see Indexes, indexers, and clusters and How the indexer stores indexes in Managing Indexers and
Clusters of Indexers.
Syntax
| dbinspect
[index=<wc-string>]...
[<span> | <timeformat>]
[corruptonly=<bool>]
[cached=<bool>]
Required arguments
None.
Optional arguments
index
Syntax: index=<wc-string>...
Description: Specifies the name of an index to inspect. You can specify more than one index. For all internal and
non-internal indexes, you can specify an asterisk ( * ) in the index name.
Default: The default index, which is typically main.
<span>
Syntax: span=<int> | span=<int><timescale>
Description: Specifies the span length of the bucket. If using a timescale unit (second, minute, hour, day, month,
or subseconds), this is used as a time range. If not, this is an absolute bucket "length".
When you invoke the dbinspect command with a bucket span, a table of the spans of each bucket is returned.
When span is not specified, information about the buckets in the index is returned. See Information returned when
no bucket span is specified.
<timeformat>
Syntax: timeformat=<string>
Description: Sets the time format for the modTime field.
Default: timeformat=%m/%d/%Y:%H:%M:%S
<corruptonly>
Syntax: corruptonly=<bool>
Description: Specifies that each bucket is checked to determine if any buckets are corrupted and displays only
the corrupted buckets. A bucket is corrupt when some of the files in the bucket are incorrect or missing such as
Hosts.data or tsidx. A corrupt bucket might return incorrect data or render the bucket unsearchable. In most
cases the software will auto-repair corrupt buckets.
When corruptonly=true, each bucket is checked and the following informational message appears.
INFO: The "corruptonly" option will check each of the specified buckets. This search might be
slow and will take time.
Default: false
cached
Syntax: cached=<bool>
Description: If set to cached=true, the dbinspect command gets the statistics from the bucket's manifest. If set to
cached=false, the dbinspect command examines the bucket itself. For SmartStore buckets, cached=false
examines an indexer's local copy of the bucket. However, specifying cached=true examines instead the bucket's
manifest, which contains information about the canonical version of the bucket that resides in the remote store.
For more information see Troubleshoot SmartStore in Managing Indexers and Clusters of Indexers.
Default: For non-SmartStore indexes, the default is false. For SmartStore indexes, the default is true.
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <month> | <subseconds>
Description: Time scale units.
When you invoke the dbinspect command without the span argument, the following information about the buckets in the
index is returned.
Field name Description
bucketId A string comprised of <index>~<id>~<guId>, where the delimiters are tilde characters. For example, summary~2~4491025B-8E6D-48DA-A90E-89AC3CF2CE80.
endEpoch The timestamp for the last event in the bucket, which is the time-edge of the bucket furthest towards the future. Specify the timestamp in the number of seconds from the UNIX epoch.
guId The globally unique identifier (GUID) of the server that hosts the index. This is relevant for index replication.
id The local ID number of the bucket, generated on the indexer on which the bucket originated.
index The name of the index specified in your search. You can specify index=* to inspect all of the indexes, and the index field will vary accordingly.
modTime The timestamp for the last time the bucket was modified or updated, in a format specified by the timeformat flag.
path The location of the bucket. The naming convention for the bucket path varies slightly, depending on whether the bucket rolled to warm while its indexer was functioning as a cluster peer. For more information, read "How Splunk stores indexes" and "Basic cluster architecture" in Managing Indexers and Clusters of Indexers.
rawSize The volume in bytes of the raw data files in each bucket. This value represents the volume before compression and the addition of index files.
sizeOnDiskMB The size in MB of disk space that the bucket takes up expressed as a floating point number. This value represents the volume of the compressed raw data files and the index files.
splunk_server The name of the Splunk server that hosts the index in a distributed environment.
startEpoch The timestamp for the first event in the bucket (the time-edge of the bucket furthest towards the past), in number of seconds from the UNIX epoch.
corruptReason Specifies the reason why the bucket is corrupt. The corruptReason field appears only when corruptonly=true.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
If no data is returned from the index that you specify with the dbinspect command, it is possible that you do not have the
authorization to access that index. The ability to access data in the Splunk indexes is controlled by the authorizations
given to each role. See Use access control to secure Splunk data in Securing Splunk Enterprise.
Non-searchable bucket copies
For hot non-searchable bucket copies on target peers, tsidx and other metadata files are not maintained. Because
accurate information cannot be reported, the following fields show NULL:
• eventCount
• hostCount
• sourceCount
• sourceTypeCount
• startEpoch
• endEpoch
Examples
Display a chart with the span size of 1 day, using the command line interface (CLI).
| dbinspect index=_internal
This screen shot does not display all of the columns in the output table. On your computer, scroll to the right to see the
other columns.
Use the corruptonly argument to display information about corrupted buckets, instead of information about all buckets.
The output fields that display are the same with or without the corruptonly argument.
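For example (a sketch; the index choice is an assumption):
| dbinspect index=_internal corruptonly=true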
Use this command to verify that the Splunk servers in your distributed environment are included in the dbinspect results. Count the number of buckets for each server.
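A sketch of such a search (the aggregation is an assumption):
| dbinspect index=* | stats count BY splunk_server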
Use dbinspect to find the index size of buckets in GB. For current numbers, run this search over a recent time range.
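A hedged sketch (the names of the computed fields are illustrative):
| dbinspect index=_internal | eval sizeOnDiskGB = sizeOnDiskMB / 1024 | stats sum(sizeOnDiskGB) AS total_size_GB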
dbxquery
The dbxquery command is used with Splunk DB Connect.
For information about this command, see Execute SQL statements and stored procedures with the dbxquery command in
Deploy and Use Splunk DB Connect.
dedup
Description
Removes the events that contain an identical combination of values for the fields that you specify.
With the dedup command, you can specify the number of duplicate events to keep for each value of a single field, or for
each combination of values among several fields. Events returned by dedup are based on search order. For historical
searches, the most recent events are searched first. For real-time searches, the first events that are received are
searched, which are not necessarily the most recent events.
You can specify the number of events with duplicate values, or value combinations, to keep. You can sort the fields, which
determines which event is retained. Other options enable you to retain events with the duplicate fields removed, or to
keep events where the fields specified do not exist in the events.
Syntax
dedup
[<int>]
<field-list>
[keepevents=<bool>]
[keepempty=<bool>]
[consecutive=<bool>]
[sortby <sort-by-clause>]
Required arguments
<field-list>
Syntax: <string> <string> ...
Description: A list of field names to remove duplicate values from.
Optional arguments
consecutive
Syntax: consecutive=<bool>
Description: If true, only remove events with duplicate combinations of values that are consecutive.
Default: false
keepempty
Syntax: keepempty=<bool>
Description: If set to true, keeps every event where one or more of the specified fields is not present (null).
Default: false. All events where any of the selected fields are null are dropped.
The keepempty=true argument keeps every event that does not have one or more of the fields in the field list. To
keep N representative events for combinations of field values including null values, use the fillnull command to
provide a non-null value for these fields. For example:
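A sketch of this approach (the field names and filler value are illustrative):
... | fillnull value="EMPTY" source host | dedup 3 source host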
keepevents
Syntax: keepevents=<bool>
Description: If true, keep all events, but will remove the selected fields from events after the first event containing
a particular combination of values.
Default: false. Events are dropped after the first event of each particular combination.
<N>
Syntax: <int>
Description: The dedup command retains multiple events for each combination when you specify N. The number
for N must be greater than 0. If you do not specify a number, only the first occurring event is kept. All other
duplicates are removed from the results.
<sort-by-clause>
Syntax: sortby ( - | + ) <sort-field> [(- | +) <sort_field> ...]
Description: List of the fields to sort by and the sort order. Use the dash symbol ( - ) for descending order and
the plus symbol ( + ) for ascending order. You must specify the sort order for each field specified in the
<sort-by-clause>. The <sort-by-clause> determines which of the duplicate events to keep. When the list of events
is sorted, the top-most event, of the duplicate events in the sorted list, is retained.
Sort field options
<sort-field>
Syntax: <field> | auto(<field>) | str(<field>) | ip(<field>) | num(<field>)
Description: The options that you can specify to sort the events.
<field>
Syntax: <string>
Description: The name of the field to sort.
auto
Syntax: auto(<field>)
Description: Determine automatically how to sort the field values.
ip
Syntax: ip(<field>)
Description: Interpret the field values as IP addresses.
num
Syntax: num(<field>)
Description: Interpret the field values as numbers.
str
Syntax: str(<field>)
Description: Order the field values by using the lexicographic order.
Usage
The dedup command is a streaming command or a dataset processing command, depending on which arguments are specified with the command. For example, if you specify the <sort-by-clause>, the dedup command acts as a dataset processing command. All of the results must be collected before sorting. See Command types.
Avoid using the dedup command on the _raw field if you are searching over a large volume of data. If you search the _raw
field, the text of every event in memory is retained which impacts your search performance. This is expected behavior.
This behavior applies to any field with high cardinality and large size.
Multivalue fields
To use the dedup command on multivalue fields, the fields must match all values to be deduplicated.
Lexicographical order
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
Examples
Remove duplicate search results with the same source value and sort the results by the _time field in ascending order.
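One way to write this (a sketch; the base search is represented by ...):
... | dedup source sortby +_time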
Remove duplicate search results with the same source value and sort the results by the _size field in descending order.
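A sketch:
... | dedup source sortby -_size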
For search results that have the same source value, keep the first 3 that occur and remove all subsequent results.
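A sketch:
... | dedup 3 source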
5. Keep results that have the same combination of values in multiple fields
For search results that have the same source AND host values, keep the first 2 that occur and remove all subsequent
results.
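A sketch:
... | dedup 2 source host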
Remove only consecutive duplicate events. Keep non-consecutive duplicate events. In this example, duplicates must have the same combination of values for the source and host fields.
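A sketch:
... | dedup source host consecutive=true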
See also
uniq
delete
Description
Using the delete command marks all of the events returned by the search as deleted. Subsequent searches do not return
the marked events. No user, not even a user with admin permissions, is able to view this data after deletion. The delete
command does not reclaim disk space.
Removing data is irreversible. If you want to get your data back after the data is deleted, you must re-index the
applicable data sources.
You cannot run the delete command in a real-time search to delete events as they arrive.
Syntax
delete
Usage
The delete command can be accessed only by a user with the "delete_by_keyword" capability. By default, only the
"can_delete" role has the ability to delete events. No other role, including the admin role, has this ability. You should
create a special userid that you log on with when you intend to delete indexed data.
To use the delete command, run a search that returns the events you want deleted. Make sure that the search returns
ONLY the events that you want to delete, and no other events. After you confirm that the results contain the data that you
want to delete, pipe the search to the delete command.
The delete command does not trigger a roll of hot buckets to warm in the affected indexes.
The output of the delete command is a table of the quantity of events removed by the fields splunk_server (the name of
the indexer or search head), and index, as well as a rollup record for each server by index "__ALL__". The quantity of
deleted events is in the deleted field. An errors field is also emitted, which will normally be 0.
The delete command does not remove the data from your disk space. You must use the clean command from the CLI to
permanently remove the data. The clean command removes all of the data in an index. You cannot select the specific
data that you want to remove. See Remove indexes and indexed data in Managing Indexers and Clusters of Indexers.
Examples
Delete the events from the insecure index that contain strings that look like Social Security numbers. Use the regex
command to identify events that contain the strings that you want to match.
1. Run the following search to ensure that you are retrieving the correct data from the insecure index.
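A hedged sketch of these steps (the regular expression used to match Social Security number patterns is an assumption):
index=insecure | regex _raw="\d{3}-\d{2}-\d{4}"
After you confirm that these are the events you want to remove, append the delete command:
index=insecure | regex _raw="\d{3}-\d{2}-\d{4}" | delete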
Delete events from the imap index that contain the word invalid.
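For example (a sketch):
index=imap invalid | delete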
Remove all of the Splunk Search Tutorial events from your index.
source=tutorialdata.zip:*
6. Confirm that the search is retrieving the correct data.
7. Add the delete command to the end of the search criteria and run the search again.
source=tutorialdata.zip:* | delete
delta
Description
Computes the difference between nearby results using the value of a specific numeric field. For each event where <field>
is a number, the delta command computes the difference, in search order, between the <field> value for the current event
and the <field> value for the previous event. The delta command writes this difference into <newfield>.
Syntax
delta
<field> [AS <newfield>]
[p=int]
Required arguments
field
Syntax: <field-name>
Description: The name of a field to analyze. If <field> is not a numeric field, no output field is generated.
Optional arguments
newfield
Syntax: <string>
Description: The name of a new field to write the output to.
Default: delta(<field>)
p
Syntax: p=<int>
Description: Specifies how many results prior to the current result to use for the comparison to the value in field
in the current result. The prior results are determined by the search order, which is not necessarily chronological
order. If p=1, compares the current result value against the value in the first result prior to the current result. If p=2,
compares the current result value against the value in the result that is two results prior to the current result, and
so on.
Default: 1
Usage
The delta command works on the events in the order they are returned by search. By default, the events for historical
searches are in reverse time order from new events to old events.
For real-time search, the events are compared in the order they are received.
The delta command can be applied after any sequence of commands, so there is no guaranteed input order. For example, if you sort your results by an independent field and then use the delta command, the produced values are the deltas in that specific order.
Basic examples
With the logs from a cable TV provider, sourcetype=tv, you can analyze broadcasting ratings, customer preferences, and
so on. Which channels do subscribers watch the most, activity=view, and how long do the subscribers stay on those
channels?
sourcetype=tv activity="View" | sort - _time | delta _time AS timeDeltaS | eval timeDeltaS=abs(timeDeltaS) |
stats sum(timeDeltaS) by ChannelName
2. Calculate the difference between that current value and the 3rd previous value
Compute the difference between current value of count and the 3rd previous value of count and store the result in the
default field, delta(fieldname), which in this example is delta(count).
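A sketch:
... | delta count p=3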
3. Calculate the difference between that current value and the previous value and rename the result field
For each event where 'count' exists, compute the difference between count and its previous value and store the result in
the field countdiff.
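A sketch:
... | delta count AS countdiff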
Extended examples
1. Calculate the difference in the number of purchases between the top 10 buyers
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
Find the top ten people who bought something yesterday, count how many purchases they made and the difference in the
number of purchases between each buyer.
• The purchase events, action=purchase, are piped into the top command to find the top ten users, based on
clientip, who bought something.
• These results, which include a count for each clientip are then piped into the delta command to calculate the
difference between the count value of one event and the count value of the event preceding it, using the p=1
argument.
• By default, this difference is saved in a new field called delta(count).
• The first event does not have a delta(count) value.
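A sketch of the search described in these bullets (the sourcetype is an assumption based on the tutorial data):
sourcetype=access_* action=purchase | top clientip | delta count p=1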
clientip count percent delta(count)
211.166.11.101 91 1.741960 -4
109.169.32.135 60 1.148545 0
188.138.40.166 56 1.071975 -4
74.53.23.135 49 0.937979 -7
187.231.45.62 48 0.918836 -1
91.208.184.24 46 0.880551 -2
2. Calculate the difference in time between recent events
This example uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file
that contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and add it as an input.
Calculate the difference in time between each of the recent earthquakes in Alaska. Run the search using the time range
All time.
The delta command is used to calculate the difference in the timestamps, _time, between each earthquake and the one
immediately before it. By default the difference is placed in a new field called delta(_time). The time is in seconds.
• The rename command is used to change the default field name to timeDeltaS.
• An eval command is used with the abs function to convert the time into the absolute value of the time. This
conversion is necessary because the differences between one earthquake and the earthquake immediately
before it result in negative values.
• Another eval command is used with the tostring function to convert the time, in seconds, into a string value. The
duration argument is part of the tostring function that specifies to convert the value to a readable time format
HH:MM:SS.
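A sketch of the search described in these bullets (the Alaska place filter and the renamed field names are assumptions):
source=all_month.csv place=*Alaska* | delta _time | rename delta(_time) AS timeDeltaS | eval timeDeltaS=abs(timeDeltaS) | eval timeDelta=tostring(timeDeltaS,"duration")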
3. Calculate the difference in time between consecutive transactions
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
Calculate the difference in time between consecutive transactions.
• This example groups events into transactions if they have the same values of JSESSIONID and clientip.
• The beginning of a transaction is defined by an event that contains the string view. The end of a transaction is
defined by an event that contains the string purchase. The keywords view and purchase correspond to the
values of the action field. You might also notice other values for the action field, such as addtocart and remove.
• The transactions are then piped into the delta command, which uses the _time field to calculate the time between
one transaction and the transaction immediately preceding it. Specifically the difference between the timestamp
for the last event in the transaction and the timestamp in the last event in the previous transaction.
• The search renames the time change as timeDelta.
• An eval command is used with the abs function to convert the time into the absolute value of the time. This conversion is necessary because the differences between one transaction and the previous transaction result in negative values.
negative values.
• Another eval command is used with the tostring function to convert the time, in seconds, into a string value. The
duration argument is part of the tostring function that specifies to convert the value to a readable time format
HH:MM:SS.
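A sketch of the search described in these bullets (the sourcetype is an assumption based on the tutorial data):
sourcetype=access_* | transaction JSESSIONID clientip startswith="view" endswith="purchase" | delta _time AS timeDelta | eval timeDelta=abs(timeDelta) | eval timeDelta=tostring(timeDelta,"duration")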
See also
Commands
accum
autoregress
streamstats
trendline
diff
Description
Compares two search results and returns the line-by-line difference, or comparison, of the two. The two search results
compared are specified by the two position values position1 and position2. These values default to 1 and 2 to compare
the first two results.
By default, the text (_raw field) of the two search results is compared. Other fields can be compared by selecting another
field using attribute.
Syntax
Optional arguments
position1
Datatype: <int>
Description: Of the table of input search results, selects a specific search result to compare to position2.
Default: position1=1 and refers to the first search result.
position2
Datatype: <int>
Description: Of the table of input search results, selects a specific search result to compare to position1. This
value must be greater than position1.
Default: position2=2 and refers to the second search result.
attribute
Datatype: <field>
Description: The field name to be compared between the two search results.
Default: attribute=_raw, which refers to the text of the event or result.
diffheader
Datatype: <bool>
Description: If true, show the traditional diff header, naming the "files" compared. The diff header makes the output a valid diff as would be expected by the command-line patch utility.
Default: diffheader=false.
context
Datatype: <bool>
Description: If true, selects context-mode diff output as opposed to the default unified diff output.
Default: context=false, or unified.
maxlen
Datatype: <int>
Description: Controls the maximum content in bytes diffed from the two events. If maxlen=0, there is no limit.
Default: maxlen=100000, which is 100KB.
Examples
Example 1:
Compare the "ip" values of the first and third search results.
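A sketch:
... | diff position1=1 position2=3 attribute=ip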
Example 2:
See also
set
entitymerge
The entitymerge command is used with Splunk Enterprise Security.
For information about this command, see Overwrite asset or identity data with entitymerge in Splunk Enterprise Security in
Administer Splunk Enterprise Security.
erex
Description
Use the erex command to extract data from a field when you do not know the regular expression to use. The command
automatically extracts field values that are similar to the example values you specify.
The values extracted from the fromfield argument are saved to the field. The search also returns a regular expression
that you can then use with the rex command to extract the field.
Syntax
erex
[<field>]
examples=<string>
[counterexamples=<string>]
[fromfield=<field>]
[maxtrainers=<integer>]
Required arguments
examples
Syntax: examples=<string>,<string>...
Description: A comma-separated list of example values for the information to extract and save into a new field.
Use quotation marks around the list if the list contains spaces. For example: "port 3351, port 3768".
field
Syntax: <string>
Description: A name for a new field that will take the values extracted from the fromfield argument. The
resulting regular expression is generated and placed as a message under the Jobs menu in Splunk Web. That
regular expression can then be used with the rex command for more efficient extraction.
Optional arguments
counterexamples
Syntax: counterexamples=<string>,<string>,...
Description: A comma-separated list of example values that represent information not to be extracted.
fromfield
Syntax: fromfield=<field>
Description: The name of the existing field to extract the information from and save into a new field.
Default: _raw
maxtrainers
Syntax: maxtrainers=<int>
Description: The maximum number of values to learn from. Must be between 1 and 1000.
Default: 100
Usage
The values specified in the examples and counterexamples arguments must exist in the events that are piped into the erex command. If the values do not exist, the command fails.
To make sure that the erex command works against your events, first run the search that returns the events you want without the erex command. Then copy the field values that you want to extract and use those for the example values with the erex command. Click the Job menu to see the generated regular expression based on your examples.
After you run a search or open a report in Splunk Web, the erex command returns informational log messages that are
displayed in the search jobs manager window. However, these messages aren't displayed if the infocsv_log_level
setting in the limits.conf file is set to WARN or ERROR. If you do not see the informational log messages when you click Jobs
from the Activity menu, check your limits.conf file and make sure that infocsv_log_level is set to the default, which is
INFO. See How to edit a configuration file. If you're using Splunk Cloud Platform and want to change this setting, file a
Support ticket.
View the regular expression
You can see the regular expression that is generated based on the erex command by clicking the Job menu in Splunk
Web. See Example 3.
The output of the erex command is captured in the search.log file. You can see the output by searching for "Successfully
learned regex". The search.log file is located in the $SPLUNK_HOME/var/run/splunk/dispatch/ directory. The search logs
are not indexed by default. See Dispatch directory and search artifacts in the Search Manual.
Examples
The following search extracts out month and day values like 7/01 and puts the values into the monthday attribute.
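For example:
... | erex monthday examples="7/01"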
The following search extracts out month and day values like 7/01 and 7/02, but not patterns like 99/2. The extracted
values are put into the monthday attribute.
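For example:
... | erex monthday examples="7/01, 7/02" counterexamples="99/2"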
3. Extract values based on examples and return the most common values
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Determine which are the most common ports used by potential attackers.
1. Run a search to find examples of the port values, where there was a failed login attempt.
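For example, using the tutorial data:
sourcetype=secure* port "failed password"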
2. Then use the erex command to extract the port field. You must specify several examples with the erex command.
Use the top command to return the most common port values. By default the top command returns the top 10
values.
sourcetype=secure* port "failed password" | erex port examples="port 3351, port 3768" | top port
This search returns a table with the count of top ports that match the search.
The results appear on the Statistics tab and look something like this:
You can replace the erex command with the rex command and generated regular expression in your search. For
example:
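A sketch of such a search follows. The regular expression shown here is illustrative only; substitute the expression that the erex command actually generated for your data:
sourcetype=secure* port "failed password" | rex "port (?<port>\d+)" | top port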
Using the rex command with a regular expression is more cost effective than using the erex command.
See also
Commands
extract
kvform
multikv
regex
rex
xmlkv
eval
Description
The eval command calculates an expression and puts the resulting value into a search results field.
• If the field name that you specify does not match a field in the output, a new field is added to the search results.
• If the field name that you specify matches a field name that already exists in the search results, the results of the
eval expression overwrite the values in that field.
You can chain multiple eval expressions in one search using a comma to separate subsequent expressions. The search
processes multiple eval expressions left-to-right and lets you reference previously evaluated fields in subsequent
expressions.
The stats command calculates statistics based on fields in your events. The eval command creates new fields in your
events by using existing fields and an arbitrary expression.
Syntax
Required arguments
field
Syntax: <string>
Description: A destination field name for the resulting calculated value. If the field name already exists in your
events, eval overwrites the value.
expression
Syntax: <string>
Description: A combination of values, variables, operators, and functions that will be executed to determine the
value to place in your destination field.
The eval expression is case-sensitive. The syntax of the eval expression is checked before running the search,
and an exception is thrown for an invalid expression.
Usage
General
You must specify a field name for the results that are returned from your eval command expression. You can specify a
name for a new field or for an existing field.
If the field name that you specify matches an existing field name, the values in the existing field are replaced by the
results of the eval expression.
Numbers and strings can be assigned to fields, while booleans cannot be assigned. However, you can convert booleans and nulls to strings using the tostring() function, and those strings can be assigned to fields.
If you are using a search as an argument to the eval command and functions, you cannot use a saved search name; you
must pass a literal search string or a field that contains a literal search string (like the 'search' field extracted from
index=_audit events).
Numeric calculations
During calculations, numbers are treated as double-precision floating-point numbers, subject to all the usual behaviors of
floating point numbers. If the calculation results in the floating-point special value NaN, it is represented as "nan" in your
results. The special values for positive and negative infinity are represented in your results as "inf" and "-inf" respectively.
Division by zero results in a null field.
There are situations where the results of a calculation contain more digits than can be represented by a floating-point
number. In those situations precision might be lost on the least significant digits. For an example of how to correct this,
see Example 2 of the basic examples for the sigfig(X) function.
Supported functions
You can use a wide range of functions with the eval command. For general information about using functions, see
Evaluation functions.
Operators
The following table lists the basic operations you can perform with the eval command. For these evaluations to work, the
values need to be valid for the type of operation. For example, with the exception of addition, arithmetic operations might
not produce valid results if the values are not numerical. When concatenating values, Splunk software reads the values as
strings, regardless of the value.
Type Operators
Arithmetic + - * / %
Concatenation .
• The plus ( + ) operator accepts two numbers for addition, or two strings for concatenation.
• The subtraction ( - ), multiplication ( * ), division ( / ), and modulus ( % ) operators accept two numbers.
• The period ( . ) operator concatenates both strings and numbers. Numbers are concatenated in their string representation.
• The AND, OR, and XOR operators accept two Boolean values.
• The <, >, <=, >=, !=, =, and == operators accept two numbers or two strings.
• In expressions, the single equal sign ( = ) is a synonym for the double equal sign ( == ).
• The LIKE operator accepts two strings. This is a pattern match similar to what is used in SQL. For example
string LIKE pattern. The pattern operator supports literal text, a percent ( % ) character for a wildcard, and an
underscore ( _ ) character for a single character match. For example, field LIKE "a%b_" matches any string
starting with a, followed by anything, followed by b, followed by one character.
Field names
To specify a field name with multiple words, you can either concatenate the words, or use single quotation marks when
you specify the name. For example, to specify the field name Account ID you can specify AccountID or 'Account ID'.
To specify a field name with special characters, such as a period, use single quotation marks. For example, to specify the
field name Last.Name use 'Last.Name'.
You can use the value of another field as the name of the destination field by using curly brackets, { }. For example, if you
have an event with the following fields, aName=counter and aValue=1234. Use | eval {aName}=aValue to return
counter=1234.
Calculated fields
You can use eval statements to define calculated fields by defining the eval statement in props.conf. If you are using
Splunk Cloud Platform, you can define calculated fields using Splunk Web, by choosing Settings > Fields > Calculated
Fields. When you run a search, Splunk software evaluates the statements and creates fields in a manner similar to that of
search time field extraction. Setting up calculated fields means that you no longer need to define the eval statement in a
search string. Instead, you can search on the resulting calculated field directly.
You can use calculated fields to move your commonly used eval statements out of your search string and into props.conf,
where they will be processed behind the scenes at search time. With calculated fields, you can change the search from:
to this search:
In this example, the three eval statements that were in the search, which defined the accountname, from_user, and from_domain fields, are now computed behind the scenes when the search is run for any event that contains the extracted mailfrom field. You can also search on those fields independently once they are set up as calculated fields in props.conf. You could search on from_domain=email.com, for example.
For more information about calculated fields, see About calculated fields in the Knowledge Manager Manual.
If you are using the eval command in search event tokens, some of the evaluation functions might be unavailable or have
a different behavior. See Custom logic for search tokens in Dashboards and Visualizations for information about the
evaluation functions that you can use with search event tokens.
Basic Examples
Create a new field called velocity in each event. Calculate the velocity by dividing the values in the distance field by the
values in the time field.
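For example:
... | eval velocity=distance/time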
Create a field called error in each event. Using the if function, set the value in the error field to OK if the status value is
200. Otherwise set the error field value to Problem.
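For example:
... | eval error=if(status == 200, "OK", "Problem")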
Create a new field in each event called low-user. Using the lower function, populate the field with the lowercase version
of the values in the username field. Because the field name contains a dash ( - ), the name must be enclosed in single
quotation marks.
... | eval 'low-user' = lower(username)
4. Use the value of one field as the name for a new field
In this example, use each value of the field counter to make a new field name. Assign to the new field the value of the
Value field. See Field names under the Usage section.
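For example:
... | eval {counter} = Value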
... | eval error_msg = case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200,
"OK")
Use the period ( . ) character to concatenate the values in first_name field with the values in the last_name field.
Quotation marks are used to insert a space character between the two names. When concatenating, the values are read
as strings, regardless of the actual value.
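For example:
... | eval full_name = first_name." ".last_name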
You can specify multiple eval operations by using a comma to separate the operations. In the following search the
full_name evaluation uses the period ( . ) character to concatenate the values in the first_name field with the values in
the last_name field. The low_name evaluation uses the lower function to convert the full_name evaluation into lowercase.
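For example:
... | eval full_name = first_name." ".last_name, low_name = lower(full_name)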
Extended Examples
1. Coalesce a field from two different source types, create a transaction of events
This example shows how you might coalesce a field from two different source types and use that to create a transaction of
events. sourcetype=A has a field called number, and sourcetype=B has the same information in a field called
subscriberNumber.
sourcetype=A OR sourcetype=B | eval phone=coalesce(number,subscriberNumber) | transaction phone maxspan=2m
The eval command is used to add a common field, called phone, to each of the events whether they are from
sourcetype=A or sourcetype=B. The value of phone is defined, using the coalesce() function, as the values of number and
subscriberNumber. The coalesce() function takes the value of the first non-NULL field (that is, the first field that exists in the event).
Now, you're able to group events from either source type A or B if they share the same phone value.
2. Separate events into categories, count and display minimum and maximum values
This example uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file
that contains magnitude (mag), coordinates (latitude, longitude), region (place), and so forth, for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance if
you want follow along with this example.
Earthquakes occurring at a depth of less than 70 km are classified as shallow-focus earthquakes, while those with a
focal-depth between 70 and 300 km are commonly termed mid-focus earthquakes. In subduction zones, deep-focus
earthquakes may occur at much greater depths (ranging from 300 up to 700 kilometers).
To classify recent earthquakes based on their depth, you use the following search.
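A sketch of such a search, assuming the downloaded file was uploaded with source=all_month.csv and contains depth and mag fields:
source=all_month.csv | eval Description=case(depth<=70, "Shallow", depth>70 AND depth<=300, "Mid", depth>300, "Deep") | stats count min(mag) max(mag) BY Description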
The eval command is used to create a field called Description, which takes the value of "Shallow", "Mid", or "Deep"
based on the Depth of the earthquake. The case() function is used to specify which ranges of the depth fits each
description. For example, if the depth is less than 70 km, the earthquake is characterized as a shallow-focus quake; and
the resulting Description is Shallow.
The search also pipes the results of the eval command into the stats command to count the number of earthquakes and
display the minimum and maximum magnitudes for each Description.
The results appear on the Statistics tab and look something like this:
3. Find IP addresses and classify the network they belong to
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
In this search, you're finding IP addresses and classifying the network they belong to.
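A sketch of such a search; the sourcetype and the subnet range are illustrative:
sourcetype=access_* | eval network=if(cidrmatch("192.0.2.0/24", clientip), "local", "other")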
This example uses the cidrmatch() function to compare the IP addresses in the clientip field to a subnet range. The
search also uses the if() function, which says that if the value of clientip falls in the subnet range, then the network
field value is local. Otherwise, network=other.
The eval command does not do any special formatting to your results. The command creates a new field based on the
eval expression you specify.
In the fields sidebar, click on the network field. In the popup, next to Selected click Yes and close the popup. Now you
can see, inline with your search results, which IP addresses are part of your local network and which are not. Your
events list looks something like this:
Another option for formatting your results is to pipe the results of eval to the table command to display only the fields of
interest to you.
Note: This example just illustrates how to use the cidrmatch function. If you want to classify your events and quickly search for those events, the better approach is to use event types. Read more about event types in the Knowledge Manager Manual.
4. Extract information from an event into a separate field, create a multivalue field
This example uses sample email data. You should be able to run this search on any email data by replacing the sourcetype=cisco:esa with
the sourcetype value and the mailfrom field with email address field name in your data. For example, the email might be To, From, or Cc).
Use the email address field to extract the name and domain. The eval command in this search contains multiple
expressions, separated by commas.
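A sketch of the search, using the sample sourcetype mentioned above:
sourcetype="cisco:esa" mailfrom=* | eval accountname=split(mailfrom,"@"), from_user=mvindex(accountname,0), from_domain=mvindex(accountname,1) | table mailfrom, from_user, from_domain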
• The split() function is used to break the mailfrom field into a multivalue field called accountname. The first value
of accountname is everything before the "@" symbol, and the second value is everything after.
• The mvindex() function is used to set from_user to the first value in accountname and to set from_domain to the
second value in accountname.
• The results of the eval expressions are then piped into the table command.
You can see the original mailfrom values and the new from_user and from_domain values in the results table. The
results appear on the Statistics tab and look something like this:
mailfrom from_user from_domain
[email protected] na.lui sample.net
[email protected] MAILER-DAEMON hcp2mailsec.sample.net
M&[email protected] M&MService example.com
[email protected] AlexMartin oursample.de
[email protected] Exit_Desk sample.net
[email protected] buttercup-forum+SEMAG8PUC4RETTUB groups.com
[email protected] eduardo.rodriguez sample.net
[email protected] VC00110489 techexamples.com
Note: This example was written to demonstrate how to use an eval function to identify the individual values of a multivalue field. Because this particular set of email data did not have any multivalue fields, the example creates a multivalue field, accountname, from a single value field, mailfrom.
5. Classify where an email came from based on the email address domain
This example uses sample email data. You should be able to run this search on any email data by replacing the sourcetype=cisco:esa with
the sourcetype value and the mailfrom field with email address field name in your data. For example, the email might be To, From, or Cc).
This example classifies where an email came from based on the email address domain. The .com, .net, and .org
addresses are considered local, while anything else is considered abroad. There are many domain names. Of course,
domains that are not .com, .net, or .org are not necessarily from abroad. This is just an example.
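A sketch of such a search; the regular expression passed to the match() function is illustrative:
sourcetype="cisco:esa" mailfrom=* | eval accountname=split(mailfrom,"@"), from_domain=mvindex(accountname,-1), location=if(match(from_domain, "\.(com|net|org)$"), "local", "abroad") | stats count BY location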
The eval command in this search contains multiple expressions, separated by commas.
The first half of this search is similar to previous example. The split() function is used to break up the email address in
the mailfrom field. The mvindex function defines the from_domain as the portion of the mailfrom field after the @ symbol.
• If the from_domain value ends with a .com, .net, or .org, the location field is assigned the value local.
• If from_domain does not match, location is assigned the value abroad.
The eval results are then piped into the stats command to count the number of results for each location value.
The results appear on the Statistics tab and look something like this:
location count
abroad 3543
local 14136
Note: This example merely illustrates using the match() function. If you want to classify your events and quickly search for those events, the better approach is to use event types. Read more about event types in the Knowledge Manager Manual.
6. Convert the duration of transactions into more readable string formats
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
When you use the transaction command, as shown in the following search, it calculates the length of time for the
transaction. A new field, called duration, is automatically added to the results. The duration is the time between the first
and last events in the transaction.
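For example, with the tutorial data the transaction might be built like this; the transaction fields and time constraints are illustrative:
sourcetype=access_* | transaction clientip host maxspan=30s maxpause=5s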
In the Interesting fields list, click on the duration field to see the top 10 values for duration. The values are displayed in
seconds. Click Yes to add the field to the Selected fields list.
You can use the eval command to reformat a numeric field into a more readable string format. The following search uses
the tostring() function with the "duration" option to convert the values in the duration field into a string formatted as
HH:MM:SS.
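A sketch of that search, building on the previous one:
sourcetype=access_* | transaction clientip host maxspan=30s maxpause=5s | eval durationstr=tostring(duration,"duration")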
The search defines a new field, durationstr, for the reformatted duration values. In the Interesting fields list, click on
the durationstr field and select Yes to add the field to the Selected fields list. The values for the fields now appear in the
set of fields below each transaction. The following image shows how your search results should look:
See also
Functions
Evaluation functions
Commands
where
eventcount
Description
Returns the number of events in the specified indexes.
Syntax
| eventcount
[index=<string>]...
[summarize=<bool>]
[report_size=<bool>]
[list_vix=<bool>]
Required arguments
None.
Optional arguments
index
Syntax: index=<string>
Description: The name of an index to report on, or a wildcard matching multiple indexes to report on. You can specify this argument multiple times, for example index=* index=_*.
Default: If no index is specified, the command returns information about the default index.
list_vix
Syntax: list_vix=<bool>
Description: Specify whether or not to list virtual indexes. If list_vix=false, the command does not list virtual
indexes.
Default: true
report_size
Syntax: report_size=<bool>
Description: Specify whether or not to report the index size. If report_size=true, the command returns the index
size in bytes.
Default: false
summarize
Syntax: summarize=<bool>
Description: Specifies whether or not to summarize events across all peers and indexes. If summarize=false, the
command splits the event counts by index and search peer.
Default: true
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Specifying a time range has no effect on the results returned by the eventcount command. All of the events on the
indexes you specify are counted.
Specifying indexes
You cannot specify indexes to exclude from the results. For example, index!=foo is not valid syntax.
You can specify the index argument multiple times. For example:
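For example, with illustrative index names:
| eventcount summarize=false index=_audit index=main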
Examples
Example 1:
Display a count of the events in the default indexes from all of the search peers. A single count is returned.
| eventcount
Example 2:
Return the number of events in only the internal default indexes. Include the index size, in bytes, in the results.
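For example:
| eventcount index=_* report_size=true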
The results appear on the Statistics tab and should be similar to the results shown in the following table.
Example 3:
Return the event count for each index and server pair. Only the external indexes are returned.
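For example:
| eventcount summarize=false index=*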
To return the count for all of the indexes, including the internal indexes, you must specify the internal indexes separately from the external indexes:
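For example:
| eventcount summarize=false index=* index=_*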
See also
metadata, fieldsummary
eventstats
Description
Generates summary statistics from fields in your events and saves those statistics in a new field.
Only those events that have fields pertinent to the aggregation are used in generating the summary statistics. The
generated summary statistics can be used for calculations in subsequent commands in your search. See Usage.
Syntax
eventstats
[allnum=<bool>]
<stats-agg-term> ...
[<by-clause>]
Required arguments
<stats-agg-term>
Syntax: <stats-func>( <evaled-field> | <wc-field> ) [AS <wc-field>]
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval
expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you
specify. You can use wild card characters in field names.
Optional arguments
allnum
Syntax: allnum=<bool>
Description: If set to true, computes numerical statistics on each field, if and only if all of the values of that field are numerical. If you have a BY clause, the allnum argument applies to each group independently.
Default: false
<by-clause>
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
stats-func
Syntax: The syntax depends on the function that you use. Refer to the table below.
Description: Statistical and charting functions that you can use with the eventstats command. Each time you
invoke the eventstats command, you can use one or more functions. However, you can only use one BY clause.
See Usage.
The following table lists the supported functions by type of function. Use the links in the table to see descriptions
and examples for each function. For an overview about using functions with commands, see Statistical and
charting functions.
Usage
The eventstats search processor uses a limits.conf file setting named max_mem_usage_mb to limit how much memory the
eventstats command can use to keep track of information. When the limit is reached, the eventstats command
processor stops adding the requested fields to the search results.
Do not set max_mem_usage_mb=0 as this removes the bounds to the amount of memory the eventstats command
processor can use. This can lead to search failures.
Splunk Enterprise
To change the max_mem_usage_mb setting, follow these steps.
Prerequisites
◊ Only users with file system access, such as system administrators, can increase the max_mem_usage_mb
setting using configuration files.
◊ Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
Never change or copy the configuration files in the default directory. The files in the default directory must
remain intact and in their original location. Make changes to the files in the local directory.
Steps
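As a rough sketch, the change is made in a local copy of the limits.conf file; the stanza placement and the value shown here are illustrative assumptions, not recommendations:
[default]
max_mem_usage_mb = 500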
The eventstats command is similar to the stats command. You can use both commands to generate aggregations like
average, sum, and maximum.
The differences between these commands are described in the following table:
stats command: You can only use the fields in your aggregated results in subsequent commands in the search.
eventstats command: You can use the fields in your events in subsequent commands in your search, because the events have not been transformed.
How eventstats generates aggregations
The eventstats command looks for events that contain the field that you want to use to generate the aggregation. The
command creates a new field in every event and places the aggregation in that field. The aggregation is added to every
event, even events that were not used to generate the aggregation.
For example, you have 5 events and 3 of the events have the field you want to aggregate on. The eventstats command
generates the aggregation based on the data in the 3 events. A new field is added to every event and the aggregation is
added to that field in every event.
With the exception of the count function, when you pair the eventstats command with functions that are not applied to
specific fields or eval expressions that resolve into fields, the search head processes it as if it were applied to a wildcard
for all fields. In other words, when you have | eventstats avg in a search, it returns results for | eventstats avg(*).
This "implicit wildcard" syntax is officially deprecated, however. Make the wildcard explicit. Write | eventstats
<function>(*) when you want a function to apply to all possible fields.
Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the
distinct_count function requires far more memory than the count function. The values and list functions also can
consume a lot of memory.
If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might result in significantly lower memory usage and run times.
Using the first and last functions when searching based on time does not produce accurate results.
• To locate the first value based on time order, use the earliest function, instead of the first function.
• To locate the last value based on time order, use the latest function, instead of the last function.
When you use the stats and eventstats commands for ordering events based on time, use the earliest and latest
functions.
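For example, a search of the discouraged form might look like this; the field names and BY clause are illustrative:
... | eventstats first(status) AS first_status last(status) AS last_status BY host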
The following search is the same as the previous search except the first and last functions are replaced with the
earliest and latest functions.
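A sketch of that adjusted search:
... | eventstats earliest(status) AS first_status latest(status) AS last_status BY host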
Basic examples
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
1. Calculate the overall average duration
Calculate the overall average duration of a set of transactions, and place the calculation in a new field called avgdur.
Because no BY clause is specified, a single aggregation is generated and added to every event in a new field called avgdur.
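A sketch of such a search; the transaction fields and constraints are illustrative:
sourcetype=access_* | transaction clientip host maxspan=30s maxpause=5s | eventstats avg(duration) AS avgdur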
When you look at the list of Interesting Fields, you will see that avgdur has only one value.
2. Calculate the average duration grouped by a specific field
This example is the same as the previous example except that an average is calculated for each distinct value of the
date_minute field. The new field avgdur is added to each event with the average value based on its particular value of
date_minute.
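A sketch of that search with the BY clause added:
sourcetype=access_* | transaction clientip host maxspan=30s maxpause=5s | eventstats avg(duration) AS avgdur BY date_minute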
When you look at the list of Interesting Fields, you will see that avgdur has 79 values, based on the timestamp, duration,
and date_minute values.
3. Search for spikes in the volume of errors
This searches for spikes in error volume. You can use this search to trigger an alert if the count of errors is higher than
average, for example.
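One possible sketch of such a search; the search term, grouping field, and threshold are illustrative:
error | stats count BY date_hour | eventstats avg(count) AS avgcount | where count > (2 * avgcount)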
Extended example
The following example provides you with a better understanding of how the eventstats command works. This example is
actually a progressive set of small examples, where one example builds on or extends the previous example.
It's much easier to see what the eventstats command does by showing you examples, using a set of simple events.
These examples use the makeresults command to create a set of events. The streamstats and eval commands are used
to create additional fields in the events.
Let's start by creating a set of four events. One of the events contains a null value in the age field.
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
null()) | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")
• The streamstats command is used to create the count field. The streamstats command calculates a cumulative
count for each event, at the time the event is processed.
• The eval command is used to create two new fields, age and city. The eval command uses the value in the
count field.
• The case function takes pairs of arguments, such as count=1, 25. The first argument is a Boolean expression.
When that expression is TRUE, the corresponding second argument is returned.
The BY clause in the eventstats command is optional, but is used frequently with this command. The BY clause groups
the generated statistics by the values in a field. You can use any of the statistical functions with the eventstats command
to generate the statistics. See the Statistical and charting functions.
In this example, the eventstats command generates the average age for each city. The generated averages are placed
into a new field called avg(age).
The following search is the same as the previous search, with the eventstats command added at the end:
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
null()) | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle") | eventstats
avg(age) BY city
By default, the name of the new field that is generated is the name of the statistical calculation. In these examples, that
name is avg(age). You can rename the new field using the AS keyword.
In the following search, the eventstats command has been adjusted to rename the new field to average age by city.
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
null()) | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle") | eventstats
avg(age) AS "average age by city" BY city
The previous examples show how an event that does not contain a value in the age field is processed. Let's see how events that contain an alphabetic character value in the field that you want to use to generate statistics are processed.
The following search includes the word test as a value in the age field.
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
"test") | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")
_time age city count
2020-02-05 18:32:07 25 San Francisco 1
2020-02-05 18:32:07 39 Seattle 2
2020-02-05 18:32:07 31 San Francisco 3
2020-02-05 18:32:07 test Seattle 4
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
"test") | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle") | eventstats
avg(age) BY city
The alphabetic values are treated like null values. The results of the search look like this:
But suppose you don't want statistics generated when the field contains alphabetic characters or is empty. The allnum argument controls how the eventstats command processes field values. The default setting for the allnum argument is FALSE, which means that the field used to generate the statistics does not need to contain all numeric values. Fields with empty values or alphabetic character values are ignored. You've seen this in the earlier examples.
You can force the eventstats command to generate statistics only when the fields contain all numeric values. To
accomplish this, you can set the allnum argument to TRUE.
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
"test") | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle") | eventstats
allnum=true avg(age) BY city
_time age avg(age) city count
2020-02-05 18:32:07 25 28 San Francisco 1
2020-02-05 18:32:07 39 Seattle 2
2020-02-05 18:32:07 31 28 San Francisco 3
2020-02-05 18:32:07 test Seattle 4
Because the age field contains values for Seattle that are not all numbers, the entire set of values for Seattle is ignored. No average is calculated.
The allnum=true argument applies to empty values as well as alphabetic character values.
See also
Commands
stats
streamstats
Blogs
Search commands > stats, eventstats and streamstats
extract
Description
Extracts field-value pairs from the search results. The extract command works only on the _raw field. If you want to
extract from another field, you must perform some field renaming before you run the extract command.
Syntax
Required arguments
None.
Optional arguments
<extract-options>
Syntax: clean_keys=<bool> | kvdelim=<string> | limit=<int> | maxchars=<int> | mv_add=<bool> |
pairdelim=<string> | reload=<bool> | segment=<bool>
Description: Options for defining the extraction. See the Extract_options section in this topic.
<extractor-name>
Syntax: <string>
Description: A stanza in the transforms.conf file. This is used when the props.conf file does not explicitly cause
an extraction for this source, sourcetype, or host.
Extract options
clean_keys
Syntax: clean_keys=<bool>
Description: Specifies whether to clean keys. Overrides CLEAN_KEYS in the transforms.conf file.
Default: The value specified in the CLEAN_KEYS in the transforms.conf file.
kvdelim
Syntax: kvdelim=<string>
Description: A list of character delimiters that separate the key from the value.
limit
Syntax: limit=<int>
Description: Specifies how many automatic key-value pairs to extract.
Default: 50
maxchars
Syntax: maxchars=<int>
Description: Specifies how many characters to look into the event.
Default: 10240
mv_add
Syntax: mv_add=<bool>
Description: Specifies whether to create multivalued fields. Overrides the value for the MV_ADD
parameter in the transforms.conf file.
Default: false
pairdelim
Syntax: pairdelim=<string>
Description: A list of character delimiters that separate the key-value pairs from each other.
reload
Syntax: reload=<bool>
Description: Specifies whether to force reloading of the props.conf and transforms.conf files.
Default: false
segment
Syntax: segment=<bool>
Description: Specifies whether to note the locations of the key-value pairs with the results.
Default: false
Usage
Alias
The alias for the extract command is kv.
Examples
1. Specify the delimiters to use for the field and value extractions
Extract field-value pairs that are delimited by the pipe or semicolon characters ( |; ). Extract values of the fields that are
delimited by the equal or colon characters ( =: ). The delimiters are individual characters. In this example the "=" or ":" character is used to delimit the key from the value. Similarly, a "|" or ";" is used to delimit the field-value pair itself.
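For example:
... | extract pairdelim="|;" kvdelim="=:"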
2. Extract field-value pairs and reload the field extraction settings
Extract field-value pairs and reload field extraction settings from disk.
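For example:
... | extract reload=true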
Rename the _raw field to a temporary name. Rename the field you want to extract from, to _raw. In this example the field
name is uri_query.
... | rename _raw AS temp uri_query AS _raw | extract pairdelim="?&" kvdelim="=" | rename _raw AS uri_query
temp AS _raw
Extract field-value pairs that are defined in the stanza 'access-extractions' in the transforms.conf file.
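For example:
... | extract access-extractions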
See also
fieldformat
Description
With the fieldformat command you can use an <eval-expression> to change the format of a field value when the results
render. This command changes the appearance of the results without changing the underlying value of the field.
Because commands that come later in the search pipeline cannot modify the formatted results, use the fieldformat
command as late in the search pipeline as possible.
The fieldformat command does not apply to commands that export data, such as the outputcsv and outputlookup
commands. The export retains the original data format and not the rendered format. If you want the format to apply to
exported data, use the eval command instead of the fieldformat command.
Syntax
fieldformat <field>=<eval-expression>
Required arguments
<field>
Description: The name of a new or existing field, non-wildcarded, for the output of the eval expression.
<eval-expression>
Syntax: <string>
Description: A combination of values, variables, operators, and functions that represent the value of your
destination field. You can specify only one <eval-expression> with the fieldformat command. To specify multiple
formats you must use multiple fieldformat commands. See Examples.
Usage
Time format variables are frequently used with the fieldformat command. See Date and time format variables.
Functions
You can use a wide range of functions with the fieldformat command. For general information about using functions, see
Evaluation functions.
The following table lists the supported functions by type of function. Use the links in the table to learn more about each
function, and to see examples.
Type of function Supported functions and syntax
Cryptographic functions md5(X), sha1(X), sha256(X), sha512(X)
Date and Time functions now(), relative_time(X,Y), strftime(X,Y), strptime(X,Y), time()
Informational functions isbool(X), isint(X), isnotnull(X), isnull(X), isnum(X), isstr(X), typeof(X)
Multivalue eval functions mvappend(X,...), mvcount(MVFIELD), mvdedup(X), mvfind(MVFIELD,"REGEX"), mvindex(MVFIELD,STARTINDEX,ENDINDEX), mvjoin(MVFIELD,STR), mvsort(X), mvzip(X,Y,"Z")
Text functions len(X), rtrim(X,Y), trim(X,Y)
Trig and Hyperbolic functions acos(X), atan2(X,Y), sin(X)
Basic examples
1. Display numeric values with commas
This example uses the metadata command to return results for the sourcetypes in the main index.
The metadata command returns many fields. The table command is used to return only the sourcetype and totalCount
fields.
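A sketch of such a search:
| metadata type=sourcetypes index=main | table sourcetype totalCount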
The results appear on the Statistics tab and look something like this:
sourcetype totalCount
access_combined_wcookie 39532
cisco:esa 112421
csv 9510
secure 40088
vendor_sales 30244
Use the fieldformat command to reformat the appearance of the field values. The values in the totalCount field are
formatted to display the values with commas.
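A sketch of that search with the fieldformat command added:
| metadata type=sourcetypes index=main | table sourcetype totalCount | fieldformat totalCount=tostring(totalCount, "commas")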
The results appear on the Statistics tab and look something like this:
sourcetype totalCount
access_combined_wcookie 39,532
cisco:esa 112,421
csv 9,510
secure 40,088
vendor_sales 30,244
2. Display UNIX time in a readable format
Assume that the start_time field contains UNIX time. Format the start_time field to display only the hours, minutes, and
seconds that correspond to the UNIX time.
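For example:
... | fieldformat start_time = strftime(start_time, "%H:%M:%S")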
3. Display numeric values with a currency symbol
To format numerical values in a field with a currency symbol, you must specify the symbol as a literal and enclose it in
quotation marks. Use a period character as a binary concatenation operator, followed by the tostring function, which
enables you to display commas in the currency values.
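For example, assuming a numeric field named totalSales (the field name is illustrative):
... | fieldformat totalSales = "$".tostring(totalSales, "commas")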
Extended example
This example shows how to change the appearance of search results to display commas in numerical values and dates
into readable formats.
First, use the metadata command to return results for the sourcetypes in the main index.
| metadata type=sourcetypes | rename totalCount as Count firstTime as "First Event" lastTime as "Last Event"
recentTime as "Last Update" | table sourcetype Count "First Event" "Last Event" "Last Update"
• The metadata command returns the fields firstTime, lastTime, recentTime, totalCount, and type.
• In addition, because the search specifies types=sourcetypes, a field called sourcetype is also returned.
• The totalCount, firstTime, lastTime, and recentTime fields are renamed to Count, First Event, Last Event, and
Last Update.
• The First Event, Last Event, and Last Update fields display the values in UNIX time.
The results appear on the Statistics tab and look something like this:
sourcetype Count First Event Last Event Last Update
Use the fieldformat command to reformat the appearance of the output of these fields. The Count field is formatted to
display the values with commas. The First Event, Last Event, and Last Update fields are formatted to display the values
in readable timestamps.
| metadata type=sourcetypes | rename totalCount as Count firstTime as "First Event" lastTime as "Last Event"
recentTime as "Last Update" | table sourcetype Count "First Event" "Last Event" "Last Update" | fieldformat
Count=tostring(Count, "commas") | fieldformat "First Event"=strftime('First Event', "%c") | fieldformat
"Last Event"=strftime('Last Event', "%c") | fieldformat "Last Update"=strftime('Last Update', "%c")
The results appear on the Statistics tab and look something like this:
See also
eval, where
fields
Description
Keeps or removes fields from search results based on the field list criteria.
By default, the internal fields _raw and _time are included in output in Splunk Web. Additional internal fields are included
in the output with the outputcsv command. See Usage.
Syntax
fields [+|-] <wc-field-list>
Required arguments
<wc-field-list>
Syntax: <field>, <field>, ...
Description: Comma-delimited list of fields to keep or remove. You can use the asterisk ( * ) as a wildcard to
specify a list of fields with similar names. For example, if you want to specify all fields that start with "value", you
can use a wildcard such as value*.
Optional arguments
+|-
Syntax: + | -
Description: If the plus ( + ) symbol is specified, only the fields in the wc-field-list are kept in the results. If the
negative ( - ) symbol is specified, the fields in the wc-field-list are removed from the results.
Default: +
Usage
The leading underscore is reserved for names of internal fields such as _raw and _time. By default, the internal fields _raw
and _time are included in the search results in Splunk Web. The fields command does not remove these internal fields
unless you explicitly specify that the fields should not appear in the output in Splunk Web.
... | fields - _*
Be cautious removing the _time field. Statistical commands, such as timechart and chart, cannot display date or time
information without the _time field.
Other than the _raw and _time fields, internal fields do not display in Splunk Web, even if you explicitly specify the fields in
the search. For example, the following search does not show the _bkt field in the results.
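For example (the index and event count are illustrative):
index=_internal | head 5 | fields + _raw _time _bkt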
To display an internal field in the results, the field must be copied or renamed to a field name that does not include the
leading underscore character. For example:
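For example, copying the internal field into a field name without the leading underscore:
index=_internal | head 5 | eval bkt = _bkt | fields + _raw _time bkt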
When the outputcsv command is used in the search, there are additional internal fields that are automatically added to
the CSV file. The most common internal fields that are added are:
• _raw
• _time
• _indextime
To exclude internal fields from the output, specify each field that you want to exclude. For example:
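For example (the output file name is illustrative):
... | fields - _raw _indextime _sourcetype _subsecond _serial | outputcsv MyTestCsvFile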
Examples
Example 1:
Example 2:
Keep only the host and ip fields. Remove all of the internal fields. The internal fields begin with an underscore character,
for example _time.
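For example:
... | fields host, ip | fields - _*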
Example 3:
Remove unwanted internal fields from the output CSV file. The fields to exclude are _raw, _indextime, _sourcetype, _subsecond, and _serial.
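For example (the output file name is illustrative):
... | fields - _raw _indextime _sourcetype _subsecond _serial | outputcsv MyTestCsvFile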
Example 4:
Keep only the fields source, sourcetype, host, and all fields beginning with error.
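For example:
... | fields source, sourcetype, host, error*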
See also
rename, table
fieldsummary
Description
The fieldsummary command calculates summary statistics for all fields or a subset of the fields in your events. The
summary information is displayed as a results table.
Syntax
fieldsummary [maxvals=<unsigned_int>] [<wc-field-list>]
Optional arguments
maxvals
Syntax: maxvals=<unsigned_int>
Description: Specifies the maximum number of distinct values to return for each field. Cannot be negative. Set maxvals = 0
to return all available distinct values for each field.
Default: 100
wc-field-list
Syntax: <field> ...
Description: A single field name or a space-delimited list of field names. You can use the asterisk ( * ) as a
wildcard to specify a list of fields with similar names. For example, if you want to specify all fields that start with
"value", you can use a wildcard such as value*.
Usage
The fieldsummary command displays the summary information in a results table. The following information appears in the
results table:
Summary field name Description
field The field name in the event.
is_exact Whether or not the field is exact. This is related to the distinct count of the field values. If the number of values of the field exceeds maxvals, then fieldsummary will stop retaining all the values and compute an approximate distinct count instead of an exact one. 1 means it is exact, 0 means it is not.
numeric_count The count of numeric values in the field. This would not include NULL values.
values The distinct values of the field and count of each value. The values are sorted first by highest count and then by distinct value, in ascending order.
Examples
1. Return summaries for all fields
This example returns summaries for all fields in the _internal index from the last 15 minutes.
In this example, the results in the max, min, and stdev fields are formatted to display up to 4 decimal points.
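A sketch of such a search:
index=_internal earliest=-15m | fieldsummary | fieldformat max=round(max,4) | fieldformat min=round(min,4) | fieldformat stdev=round(stdev,4)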
2. Return summaries for specific fields
This example returns summaries for fields in the _internal index with names that contain "size" and "count". The search
returns only the top 10 values for each field from the last 15 minutes.
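A sketch of such a search:
index=_internal earliest=-15m | fieldsummary maxvals=10 *size* *count*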
See also
filldown
Description
Replaces null values with the last non-null value for a field or set of fields. If no list of fields is given, the filldown
command will be applied to all fields. If there are not any previous values for a field, it is left blank (NULL).
Syntax
filldown <wc-field-list>
Required arguments
<wc-field-list>
Syntax: <field> ...
Description: A space-delimited list of field names. You can use the asterisk ( * ) as a wildcard to specify a list of
fields with similar names. For example, if you want to specify all fields that start with "value", you can use a
wildcard such as value*.
Examples
Example 1:
... | filldown
Example 2:
Example 3:
Filldown null values for the count field and any field that starts with 'score'.
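For example:
... | filldown count score*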
See also
fillnull
fillnull
Description
Replaces null values with a specified value. Null values are field values that are missing in a particular result but present in another result. Use the fillnull command to replace null field values with a string. You can replace the null values in one or more fields. You can specify a string to fill the null field values or use the default value, which is zero ( 0 ).
Syntax
fillnull
[value=<string>]
[<field-list>]
Required arguments
None.
Optional arguments
field-list
Syntax: <field>...
Description: A space-delimited list of one or more fields. If you specify a field list, all of the fields in that list are
filled in with the value you specify. If you specify a field that didn't previously exist, the field is created. If you do
not specify a field list, the value is applied to all fields.
value
Syntax: value=<string>
Description: Specify a string value to replace null values. If you do not specify a value, the default value is
applied to the <field-list>.
Default: 0
Usage
The fillnull command is a distributable streaming command when a field-list is specified. When no field-list is
specified, fillnull is a dataset processing command. See Command types.
Examples
1. Fill all empty field values with a zero
2021-03-16 63 39 30 22 127 56
2021-03-15 65 94 38 42 128 60
You can fill all of the empty field values with a zero ( 0 ) by adding the fillnull command to the end of your search.
... | fillnull
2021-03-16 0 63 39 30 22 127 56
2021-03-15 65 94 38 42 0 128 60
2. Fill all empty fields with the string "NULL"
For the current search results, fill all empty field values with the string "NULL".
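For example:
... | fillnull value=NULL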
Using the search results from the previous example, adding value=NULL to the end of your search changes the results to
this:
3. Fill a field that does not exist with a specified string
If you specify a field that does not exist, the field is created and the value you specify is added to the new field.
For example, if you specify bytes in the field list, the bytes field is created and filled with the string "unknown".
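For example:
... | fillnull value="unknown" bytes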
4. Build a time series chart and fill the empty fields
Build a time series chart of web events by host and fill all empty fields with the string "NULL".
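A sketch of such a search; the sourcetype is illustrative:
sourcetype=web* | timechart count BY host | fillnull value=NULL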
See also
filldown
streamstats
findtypes
Description
Generates suggested event types by taking the results of a search and producing a list of potential event types. At most,
5000 events are analyzed for discovering event types.
Syntax
findtypes max=<int> [notcovered] [useraw]
Required arguments
max
Datatype: <int>
Description: The maximum number of events to return.
Default: 10
Optional arguments
notcovered
Description: If this keyword is used, the findtypes command returns only event types that are not already
covered.
useraw
Description: If this keyword is used, the findtypes command uses phrases in the _raw text of events to generate
event types.
Examples
Example 1:
... | findtypes
Example 2:
Discover 50 common event types and add support for looking at text phrases.
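For example:
... | findtypes max=50 useraw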
See also
typer
folderize
Description
Creates a higher-level grouping, such as replacing filenames with directories. Replaces the attr attribute value with a
more generic value, which is the result of grouping the attr value with other values from other results, where grouping
occurs by tokenizing the attr value on the sep separator value.
For example, the folderize command can group search results, such as those used on the Splunk Web home page, to
list hierarchical buckets (e.g. directories or categories). Rather than listing 200 sources, the folderize command breaks
the source strings by a separator (e.g. /) and determines if looking only at directories results in the number of results
requested.
Syntax
folderize attr=<string> [sep=<string>] [size=<string>] [minfolders=<int>] [maxfolders=<int>]
Arguments
attr
Syntax: attr=<string>
Description: Replaces the attr attribute value with a more generic value, which is the result of grouping it with
other values from other results, where grouping occurs by tokenizing the attribute (attr) value on the separator
(sep) value.
sep
Syntax: sep=<string>
Description: Specify a separator character used to construct output field names when multiple data series are
used in conjunction with a split-by field.
Default: ::
size
Syntax: size=<string>
Description: Supply a name to be used for the size of the folder.
Default: totalCount
minfolders
Syntax: minfolders=<int>
Description: Set the minimum number of folders to group.
Default: 2
maxfolders
Syntax: maxfolders=<int>
Description: Set the maximum number of folders to group.
Default: 20
Examples
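The underlying search might look something like this; the index, sourcetype, and field name are illustrative:
index=_internal sourcetype=splunkd_ui_access | stats count(uri) BY uri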
The following image shows the results of the search run using the All Time time range. Many of the results start with
/en-US/account. Because some of the URIs are very long, the image does not show the second column on the far right.
That column is the count(uri) column created by the stats command.
Using the folderize command, you can summarize the URI values into more manageable groupings.
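A sketch of that search with the folderize command added:
index=_internal sourcetype=splunkd_ui_access | stats count(uri) BY uri | folderize attr=uri sep="/"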
The following image shows the URIs grouped in the result set.
In this example, the count(uri) column is the count of the unique URIs that were returned from the stats command. The
memberCount column shows the count of the URIs in each group. For example, the /en-US/ URI was found 22 times in the
events, as shown in the count(uri) column. When the folderize command arranges the URI into groups, there is only 1
member in the /en-US/ group. Whereas the URIs that start with /services/ occurred 10088 times in the events, but there
are only 1648 unique members in the /services/* group.
foreach
Description
Use this command to run a streaming subsearch that uses a template to iterate over each field in a wildcarded field list.
Syntax
foreach
<wc-field-list>
[fieldstr=<string>]
[matchstr=<string>]
[matchseg1=<string>]
[matchseg2=<string>]
[matchseg3=<string>]
<subsearch>
Required arguments
<wc-field-list>
Syntax: <field> ...
Description: A space-delimited list of valid field names. You can use the asterisk ( * ) as a wildcard to specify a
list of fields with similar names. For example, if you want to specify all fields that start with "value", you can use a
wildcard such as value*.
subsearch
Syntax: [ subsearch ]
Description: A subsearch that includes a template for replacing the values of the fields specified. The subsearch
can use the following tokens:
Token Description
<<FIELD>> Each time the subsearch is run, this token is substituted with the whole field name, for each field that you specify in the <wc-field-list>.
<<MATCHSTR>> The part of the field name that matches the wildcards in the specifier.
Optional arguments
fieldstr
Syntax: fieldstr=<string>
Description: Replaces the <<FIELD>> token with the whole field name.
matchstr
Syntax: matchstr=<string>
Description: Replaces <<MATCHSTR>> with part of the field name that matches wildcard(s) in the specifier.
matchseg1
Syntax: matchseg1=<string>
Description: Replaces <<MATCHSEG1>> with part of the field name that matches the first wildcard.
matchseg2
Syntax: matchseg2=<string>
Description: Replaces <<MATCHSEG2>> with part of the field name that matches the second wildcard.
matchseg3
Syntax: matchseg3=<string>
Description: Replaces <<MATCHSEG3>> with part of the field name that matches the third wildcard.
Usage
If the field names contain characters other than alphanumeric characters, such as dashes, underscores, or periods, you
need to enclose the <<FIELD>> token in single quotation marks in the eval command portion of the search.
For example, the following search adds the values from all of the fields that start with similar names.
... | eval total=0 | eval test_1=1 | eval test_2=2 | eval test_3=3 | foreach test* [eval total=total +
'<<FIELD>>']
The <<FIELD>> token in the foreach subsearch is just a string replacement of the field names test*. The eval expression
does not recognize field names with non-alphanumeric characters unless the field names are surrounded by single
quotation marks. For the eval expression to work, the <<FIELD>> token needs to be surrounded by single quotation
marks.
Examples
1. Add the values from all of the fields that start with similar names
The following search adds the values from all of the fields that start with similar names. You can run this search on your
own Splunk instance.
|makeresults 1| eval total=0 | eval test1=1 | eval test2=2 | eval test3=3 | foreach test* [eval total=total
+ <<FIELD>>]
The following table shows how the subsearch iterates over each "test" field. The table shows the beginning value of the
"total" field each time the subsearch is run and the calculated total based on the value for the "test" field.
Subsearch iteration Test field Total field start value Test field value Calculation of "total" field
1 test1 0 1 0+1=1
2 test2 1 2 1+2=3
3 test3 3 3 3+3=6
2. Monitor license usage
First run the following search on the license master to return the daily license usage per sourcetype in bytes:
Use the foreach command to calculate the daily license usage in gigabytes for each field:
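A sketch of both searches, assuming the license usage events are read from license_usage.log in the _internal index (the source path and field names are illustrative):
index=_internal source=*license_usage.log type=Usage | timechart span=1d sum(b) AS volumeB BY st
index=_internal source=*license_usage.log type=Usage | timechart span=1d sum(b) AS volumeB BY st | foreach * [eval <<FIELD>> = round('<<FIELD>>'/1024/1024/1024, 3)]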
3. Add each field that matches foo* to the corresponding bar* field
Add each field that matches foo* to the corresponding bar* field and write the result to a new_* field. For example, new_X = fooX + barX.
... | foreach foo* [eval new_<<MATCHSTR>> = <<FIELD>> + bar<<MATCHSTR>>]
4. Use the <<MATCHSEG2>> token
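A template of this shape uses the <<MATCHSEG2>> token to assign the segment of the field name that matches the second wildcard (the foo*bar* field list is illustrative):
... | foreach foo*bar* [eval <<FIELD>> = "<<MATCHSEG2>>"]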
For the field, fooXbarY, this is equivalent to: ... | eval fooXbarY = "Y"
See also
eval, map
format
Description
This command is used implicitly by subsearches. This command takes the results of a subsearch, formats the results into
a single result and places that result into a new field called search.
Syntax
format
[mvsep="<mv separator>"]
[maxresults=<int>]
["<row prefix>" "<column prefix>" "<column separator>" "<column end>" "<row separator>" "<row end>"]
[emptystr="<string>"]
If you want to specify row or column options, you must specify all of the row and column options.
Required arguments
None.
Optional arguments
mvsep
Syntax: mvsep="<string>"
Description: The separator to use for multivalue fields.
Default: OR
maxresults
Syntax: maxresults=<int>
Description: The maximum results to return.
Default: 0, which means no limitation on the number of results returned.
<row prefix>
Syntax: "<string>"
Description: The value to use for the row prefix.
Default: The open parenthesis character "("
<column prefix>
Syntax: "<string>"
Description: The value to use for the column prefix.
Default: The open parenthesis character "("
<column separator>
Syntax: "<string>"
Description: The value to use for the column separator.
Default: AND
<column end>
Syntax: "<string>"
Description: The value to use for the column end.
Default: The close parenthesis character ")"
<row separator>
Syntax: "<string>"
Description: The value to use for the row separator.
Default: OR
<row end>
Syntax: "<string>"
Description: The value to use for the row end.
Default: The close parenthesis character ")"
emptystr
Syntax: emptystr="<string>"
Description: The value that the format command outputs instead of the default empty string NOT( ) if the results
generated up to that point are empty and no fields or values other than internal fields are returned. You can set
this argument to a custom string that is displayed instead of the default empty string whenever your search results
are empty.
Default: NOT( )
Usage
By default, when you do not specify any of the optional row and column arguments, the output of the format command
defaults to: "(" "(" "AND" ")" "OR" ")".
There are several reasons to specify the row and column arguments:
Subsearches
There is an implicit format at the end of a subsearch that uses the default values for column and row arguments.
For example, you can specify OR for the column separator by including the format command at the end of the
subsearch.
Examples
This search returns the syntax for a search that is based on the field values in the top 2 results. The syntax is placed into
a new field called search.
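A search of this shape produces that kind of output; the source, sourcetype, and host fields are illustrative:
... | head 2 | fields source, sourcetype, host | format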
The following search uses the eval command to create a field called "foo" that contains one value "eventtype,log_level".
The makemv command is used to make the foo field a multivalue field and specifies the comma as the delimiter between
the values. The search then outputs only the foo field and formats that field.
index=_internal |head 1 |eval foo="eventtype,log_level" | makemv delim="," foo | fields foo | format
mvsep="mvseparator" "{" "[" "AND" "]" "AND" "}"
foo search
{ [ ( foo="eventtype" mvseparator foo="log_level" ) ] }
4. Use emptystr to indicate empty results
When a search generates empty results, the format command returns internal fields and the contents of emptystr. You
can change the value of emptystr from the default to a custom string. For example, the results in the following search are
empty, so format returns a customized string "Error Found" in a new field called search.
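A minimal sketch; the makeresults and where commands are just one way to produce an empty result set:
| makeresults | where 1==2 | format emptystr="Error Found"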
search
Error Found
If your search doesn't include emptystr like the following example, the format command displays the default empty string
to indicate that the results are empty.
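For example, the same sketch without the emptystr argument:
| makeresults | where 1==2 | format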
search
NOT ( )
5. Use emptystr in a subsearch as a failsafe
Customizing your empty string as shown in the last example is one way to use emptystr. However, it is more typical to
use the format command as a subsearch that is operating as a search filter, and then use emptystr as a failsafe in case
your search returns empty results. For example, perhaps your index isn't generating results because one of the fields
you're specifying in the subsearch doesn't exist or there's a typo or some other error in your search. You can include the
emptystr argument and set it to a default source type that you know is always present, such as splunkd. Then, instead of
returning nothing, your search will return some results that you can use for further filtering.
You can use the following sample search to make sure you get results even if your search contains errors.
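A sketch of this pattern, where mytype is an illustrative source type that might not exist and splunkd is the failsafe:
index=_internal [ search index=_internal sourcetype=mytype | head 1 | fields sourcetype | format emptystr="sourcetype=splunkd" ]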
The results are events from the splunkd metrics.log file, for example:
11-16-2021 15:11:33.745 -0800 INFO Metrics - group=thruput, name=thruput, instantaneous_kbps=4.984, instantaneous_eps=20.935, average_kbps=1.667, total_k_processed=182447.000, kb=154.505, ev=649
11-16-2021 15:11:33.745 -0800 INFO Metrics - group=thruput, name=syslog_output, instantaneous_kbps=0.000, instantaneous_eps=0.000, average_kbps=0.000, total_k_processed=0.000, kb=0.000, ev=0
11-16-2021 15:11:33.745 -0800 INFO Metrics - group=thruput, name=index_thruput, instantaneous_kbps=4.971, instantaneous_eps=19.355, average_kbps=1.667, total_k_processed=182424.000, kb=154.094, ev=600
11-16-2021 15:11:33.745 -0800 INFO Metrics - group=queue, name=winparsing, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
Each event has host = PF32198D, source = C:\Program Files\Splunk\var\log\splunk\metrics.log, and sourcetype = splunkd.
See also
search
from
Description
The from command retrieves data from a dataset, such as a data model dataset, a CSV lookup, a KV Store lookup, a
saved search, or a table dataset.
Design a search that uses the from command to reference a dataset. Optionally add additional SPL such as lookups, eval
expressions, and transforming commands to the search. Save the result as a report, alert, or dashboard panel. If you use
Splunk Cloud Platform, or use Splunk Enterprise and have installed the Splunk Datasets Add-on, you can also save the
search as a table dataset.
Syntax
| from
<dataset_type>:<dataset_name> | <dataset_type> <dataset_name>
Required arguments
<dataset_type>
Syntax: <dataset_type>
Description: The type of dataset. Valid values are: datamodel, lookup, and savedsearch.
The datamodel dataset type can be either a data model dataset or a table dataset. You create data model
datasets with the Data Model Editor. You can create table datasets with the Table Editor if you use Splunk Cloud
Platform, or use Splunk Enterprise and have installed the Splunk Datasets Add-on.
The lookup dataset type can be either a CSV lookup or a KV Store lookup.
The savedsearch dataset type is a saved search. You can use from to reference any saved search as a dataset.
See About datasets in the Knowledge Manager Manual.
<dataset_name>
Syntax: <dataset_name>
Description: The name of the dataset that you want to retrieve data from. If the dataset_type is a data model,
the syntax is <datamodel_name>.<dataset_name>. If the name of the dataset contains spaces, enclose the dataset
name in quotation marks.
Example: If the data model name is internal_server, and the dataset name is splunkdaccess, specify
internal_server.splunkdaccess for the dataset_name.
In older versions of the Splunk software, the term "data model object" was used. That term has been replaced with
"data model dataset".
Optional arguments
None.
Usage
The from command is a generating command. It can be either report-generating or event-generating depending on the
search or knowledge object that is referenced by the command. See Command types.
Generating commands use a leading pipe character and should be the first command in a search. However, you can use
the from command inside the append command.
When you use the from command, you must reference an existing dataset. You can reference any dataset listed in the
Datasets listing page, such as data model datasets, CSV lookup files, CSV lookup definitions, and table datasets. You
can also reference saved searches and KV Store lookup definitions. See View and manage datasets in the Knowledge
Manager Manual.
When you create a knowledge object such as a report, alert, dashboard panel, or table dataset, that knowledge object has
a dependency on the referenced dataset. This is referred to as a dataset extension. When you make a change to the
original dataset, such as removing or adding fields, that change propagates down to the reports, alerts, dashboard panels,
and tables that have been extended from that original dataset. See Dataset extension in the Knowledge Manager Manual.
When you search the contents of a data model using the from command, by default the search returns a strictly-filtered
set of fields. It returns only default fields and fields that are explicitly identified in the constraint search that defines the
data model.
If you have edit access to your local datamodel.conf file, you can disable field filtering for specific data models by adding
the strict_fields=false setting to their stanzas. When you do this, | from searches of data models with that setting
return all fields related to the data model, including fields inherited from parent data models, fields extracted at search
time, calculated fields, and fields derived from lookups.
Examples
Search a data model that contains internal server log events for REST API calls. In this example, internal_server is the
data model name and splunkdaccess is the dataset inside the internal_server data model.
| from datamodel:internal_server.splunkdaccess
Search a lookup file that contains geographic attributes for each country, such as continent, two-letter ISO code, and
subregion.
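Assuming the built-in geo_attr_countries lookup file:
| from lookup:geo_attr_countries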
Search the contents of the KV store collection kvstorecoll that have a CustID value greater than 500 and a CustName value
that begins with the letter P. The collection is referenced in a lookup table called kvstorecoll_lookup. Using the stats
command, provide a count of the events received from the table.
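A sketch of this search; the where command uses the like function because it does not match wildcards directly:
| from lookup:kvstorecoll_lookup | where CustID>500 AND like(CustName, "P%") | stats count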
Retrieve the timestamp and client IP from the saved search called mysecurityquery.
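A search of this form retrieves those fields:
| from savedsearch:mysecurityquery | fields _time, clientip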
When the name of a dataset includes spaces, enclose the dataset name in quotation marks.
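For example, with an illustrative saved search name:
| from savedsearch:"Monthly Security Summary"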
See also
Commands
datamodel
inputlookup
inputcsv
lookup
gauge
Description
Use the gauge command to transform your search results into a format that can be used with the gauge charts. Gauge
charts are a visualization of a single aggregated metric, such as a count or a sum.
The output of the gauge command is a single numerical value stored in a field called x. You can specify a range to display
in the gauge or use the default range of 0 to 100.
For more information about using the gauge command with the gauge chart types, see Using gauges in the Gauges
section in Dashboards and Visualizations.
Syntax
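Based on the arguments described below, the command takes this general form:
gauge <value> [<range_val1> <range_val2> ...]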
Required arguments
value
Syntax: field_name | <num>
Description: A numeric field or literal number to use as the current value of the gauge. If you specify a numeric
field, the gauge command uses the first value in that field as the value for the gauge.
Optional arguments
range values
Syntax: <range_val1> <range_val2> ...
Description: A space-separated list of two or more numeric fields or numbers to use as the overall numeric range
displayed in the gauge. Each range value can be a numeric field name or a literal number. If you specify a field
name, the first value in that field is used as the range value. The total range of the gauge is from the first
range_val to the last range_val. See Usage.
Default range: 0 to 100
Usage
You can create gauge charts without using the gauge command as long as your search results in a single value. The
advantage of using the gauge command is that you can specify a set of range values instead of using the default range
values of 0 to 100.
Specifying ranges
If you specify range values, you must specify at least two values. The gauge begins at the first value and ends at the last
value that you specify.
If you specify more than two range_val arguments, the intermediate range values are used to split the total range into
subranges. Each subrange displays in different color, which creates a visual distinction.
The range values are returned as a series of fields called y1, y2, and so on.
If you do not specify range values, the range defaults to a low value of 0 and a high value of 100.
Gauge colors
With a gauge chart, a single numerical value is mapped against a set of colors. These colors can have particular business
meaning or business logic. As the value changes over time, the gauge marker changes position within this range.
The color ranges in the gauge chart are based on the range values that you specify with the gauge command. When you
specify range values, you define the overall numerical range represented by the gauge. You can define the size of the
colored bands within that range. If you want to use the color bands, add four range values to the search string. These
range values indicate the beginning and end of the range. These range values also indicate the relative sizes of the color
bands within this range.
Examples
Count the number of events and display the count on a gauge with four ranges, from 0-750, 750-1000, 1000-1250, and
1250-1500.
Start by generating the results table using this search. Run the search using the Last 15 minutes time range.
index=_internal | stats count as myCount | gauge myCount 750 1000 1250 1500
The results appear on the Statistics tab and look something like this:
x y1 y2 y3 y4
3321 750 1000 1250 1500
Click on the Visualizations tab. There are three types of gauges that you can choose from: radial, filler, and marker. The
following image shows the radial gauge that is created based on the search results.
For more information about using the gauge command with the gauge chart type, see the Gauges section in Dashboard
and Visualizations.
See also
Commands
eval
stats
gentimes
Description
Generates timestamp results starting with the exact time specified as start time. Each result describes an adjacent,
non-overlapping time range as indicated by the increment value. This terminates when enough results are generated to
pass the endtime value.
Syntax
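Based on the arguments described below, the command takes this general form:
| gentimes start=<timestamp> [end=<timestamp>] [increment=<int>(s | m | h | d)]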
Required arguments
start
Syntax: start=<timestamp>
Description: Specify the start time.
<timestamp>
Syntax: MM/DD/YYYY[:HH:MM:SS] | <int>
Description: Indicate the timeframe, using either a timestamp or an integer value. For example: 10/1/2020 for
October 1, 2020, 4/1/2021:12:34:56 for April 1, 2021 at 12:34:56, or -5 for five days ago.
Optional arguments
end
Syntax: end=<timestamp>
Description: Specify an end time.
Default: midnight, prior to the current time in local time
increment
Syntax: increment=<int>(s | m | h | d)
Description: Specify a time period to increment from the start time to the end time. Supported increments are
seconds, minutes, hours, and days.
Default: 1d
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Field Description
starttime The starting time range in UNIX time.
starthuman The start of the time range in human readable format, DDD MMM DD HH:MM:SS YYYY. For example, Sun Apr 4 00:00:00 2021.
endhuman The end of the time range in human readable format, DDD MMM DD HH:MM:SS YYYY. For example, Fri Apr 16 23:59:59 2021.
To specify future dates, you must include the end argument.
Examples
Generates daily time ranges from April 4 to April 7 in 2021. This search generates four intervals covering one day periods
aligning with the calendar days April 4, 5, 6, and 7, during 2021.
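A sketch of this search; with end=4/8/2021, the generated ranges cover April 4 through April 7:
| gentimes start=4/4/2021 end=4/8/2021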
The results appear on the Statistics tab and look something like this:
Generate daily time ranges from 30 days ago until 27 days ago.
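For example, using integer offsets in days:
| gentimes start=-30 end=-27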
4. Generate time ranges by only specifying a start date
| gentimes start=9/25/20
Although the week increment is not supported, you can generate a weekly increment by specifying increment=7d.
This example generates weekly time ranges from December 1, 2020 to April 30, 2021.
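A sketch of that search:
| gentimes start=12/1/2020 end=4/30/2021 increment=7d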
See also
Commands
makeresults
map
geom
Description
The geom command adds a field, named geom, to each result. This field contains geographic data structures for polygon
geometry in JSON. These geographic data structures are used to create choropleth map visualizations.
For more information about choropleth maps, see Mapping data in the Dashboards and Visualizations manual.
Syntax
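Based on the arguments described below, the command takes this general form:
geom [<featureCollection>] [allFeatures=<bool>] [featureIdField=<field>] [gen=<double>] [min_x=<double>] [min_y=<double>] [max_x=<double>] [max_y=<double>]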
Required arguments
None.
Optional arguments
featureCollection
Syntax: <geo_lookup>
Description: Specifies the geographic lookup file that you want to use. Two geographic lookup files are included
by default with Splunk software: geo_us_states and geo_countries. You can install your own geographic lookups from KMZ or KML files. See Usage for more information.
allFeatures
Syntax: allFeatures=<bool>
Description: Specifies that the output include every geometric feature in the feature collection. When a shape
has no values, any aggregate fields, such as average or count, display zero when this argument is used.
Additional rows are appended for each feature that is not already present in the search results when this
argument is used. See Examples.
Default: false
featureIdField
Syntax: featureIdField=<field>
Description: If the event contains the featureId in a field named something other than "featureId", use this option
to specify the field name.
gen
Syntax: gen=<double>
Description: Specifies generalization, in the units of the data. For example, gen=0.1 generalizes, or reduces the size of, the geometry by running the Ramer-Douglas-Peucker algorithm on the polygons with a parameter of 0.1 degrees.
Default: 0.1
min_x
Syntax: min_x=<double>
Description: The X coordinate for the bottom-left corner of the bounding box for the geometric shape. The range
for the coordinate is -180 to 180. See Usage for more information.
Default: -180
min_y
Syntax: min_y=<double>
Description: The Y coordinate for the bottom-left corner of the bounding box for the geometric shape. The range
for the coordinate is -90 to 90.
Default: -90
max_x
Syntax: max_x=<double>
Description: The X coordinate for the upper-right corner of the bounding box for the geometric shape. The range
for the coordinate -180 to 180.
Default: 180
max_y
Syntax: max_y=<double>
Description: The Y coordinate for the upper-right corner of the bounding box for the geometric shape. The range
is -90 to 90.
Default: 90
Usage
Specifying a lookup
To use your own lookup file in Splunk Enterprise, you can define the lookup in Splunk Web or edit the transforms.conf
file. If you use Splunk Cloud Platform, use Splunk Web to define lookups.
To create a geospatial lookup in Splunk Web, you use the Lookups option in the Settings menu. You must add
the lookup file, create a lookup definition, and can set the lookup to work automatically. See Define a geospatial
lookup in Splunk Web in the Knowledge Manager Manual.
When no arguments are specified, the geom command looks for a field named featureCollection and a field named featureId in the event. These fields are present in the default output from a geospatial lookup.
The min_x, min_y, max_x, and max_y arguments are used to clip the geometry. Use these arguments to define a bounding
box for the geometric shape. You can specify the minimum rectangle corner (min_x, min_y) and the maximum rectangle
corner (max_x, max_y). By specifying the coordinates, you are returning only the data within those coordinates.
You can use the inputlookup command to verify that the geometric features on the map are correct. The syntax is |
inputlookup <your_lookup>.
For example, to verify that the geometric features in the built-in geo_us_states lookup appear correctly on the choropleth map:
1. Run the following search:
| inputlookup geo_us_states
2. On the Visualizations tab, change to a Choropleth Map.
3. Zoom in to see the geometric features. In this example, the features are the states in the United States.
To show how the output appears with the allFeatures argument, the following search creates a simple set of fields and
values.
| stats count | eval featureId="California" | eval count=10000 | geom geo_us_states allFeatures=true
• The search uses the stats command, specifying the count field. A single result is created that has a value of zero
( 0 ) in the count field.
• The eval command is used to add the featureId field with value of California to the result.
• Another eval command is used to specify the value 10000 for the count field. You now have a single result with
two fields, count and featureId.
• When the geom command is added, two additional fields are added, featureCollection and geom.
The following image shows the results of the search on the Statistics tab.
The following image shows the results of the search on the Visualization tab. Make sure that the map is a Choropleth
Map. This image is zoomed in to show more detail.
Examples
When no arguments are provided, the geom command looks for a field named featureCollection and a field named
featureId in the event. These fields are present in the default output from a geospatial lookup.
...| geom
This example uses the built-in geo_us_states lookup file for the featureCollection.
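For example:
...| geom geo_us_states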
This example uses the built-in geo_us_states lookup and specifies state as the featureIdField. In most geospatial
lookup files, the feature IDs are stored in a field called featureId. Use the featureIdField argument when the event
contains the feature IDs in a field named something other than "featureId".
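For example:
...| geom geo_us_states featureIdField=state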
The following example specifies that the output include every geometric feature in the feature collection. If no value is
present for a geometric feature, zero is the default value. Using the allFeatures argument causes the choropleth map
visualization to render all of the shapes.
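For example:
...| geom geo_us_states allFeatures=true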
5. Use the built-in countries lookup
The following example uses the built-in geo_countries lookup. This search uses the lookup command to specify shorter
field names for the latitude and longitude fields. The stats command is used to count the feature IDs and renames the
featureIdField field as country. The geom command generates the information for the choropleth map using the
renamed field country.
... | lookup geo_countries latitude AS lat, longitude AS long | stats count BY featureIdField AS country |
geom geo_countries featureIdField="country"
This example uses the geom command attributes that enable you to clip the geometry by specifying a bounding box.
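A sketch with illustrative coordinates that clip the geometry to a box roughly covering the continental United States:
...| geom geo_us_states min_x=-130 min_y=25 max_x=-65 max_y=50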
See also
geomfilter
Description
Use the geomfilter command to specify points of a bounding box for clipping choropleth maps.
For more information about choropleth maps, see "Mapping data" in the Dashboards and Visualizations Manual.
Syntax
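Based on the arguments described below, the command takes this general form:
geomfilter [min_x=<float>] [min_y=<float>] [max_x=<float>] [max_y=<float>]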
Optional arguments
min_x
Syntax: min_x=<float>
Description: The x coordinate of the bounding box's bottom-left corner, in the range [-180, 180].
Default: -180
min_y
Syntax: min_y=<float>
Description: The y coordinate of the bounding box's bottom-left corner, in the range [-90, 90].
Default: -90
max_x
Syntax: max_x=<float>
Description: The x coordinate of the bounding box's upper-right corner, in the range [-180, 180].
Default: 180
max_y
Syntax: max_y=<float>
Description: The y coordinate of the bounding box's upper-right corner, in the range [-90, 90].
Default: max_y=90
Usage
The geomfilter command accepts two points that specify a bounding box for clipping choropleth maps. Points that fall
outside of the bounding box will be filtered out.
Examples
Example 1: This example uses the default bounding box, which will clip the entire map.
...| geomfilter
See also
geom
geostats
Description
Use the geostats command to generate statistics to display geographic data and summarize the data on maps.
The command generates statistics which are clustered into geographical bins to be rendered on a world map. The events
are clustered based on latitude and longitude fields in the events. Statistics are then evaluated on the generated clusters.
The statistics can be grouped or split by fields using a BY clause.
For map rendering and zooming efficiency, the geostats command generates clustered statistics at a variety of zoom levels in one search, and the visualization selects among them as you zoom. The number of zoom levels is controlled by the binspanlat, binspanlong, and maxzoomlevel options. The initial granularity is selected by the binspanlat and binspanlong arguments. At each level of zoom, the number of bins is doubled in both dimensions, for a total of 4 times as many bins for each zoom in.
Syntax
geostats
[ translatetoxy=<bool> ]
[ latfield=<string> ]
[ longfield=<string> ]
[ globallimit=<int> ]
[ locallimit=<int> ]
[ outputlatfield=<string> ]
[ outputlongfield=<string> ]
[ binspanlat=<float> binspanlong=<float> ]
[ maxzoomlevel=<int> ]
<stats-agg-term>...
[ <by-clause> ]
Required arguments
stats-agg-term
Syntax: <stats-func> ( <evaled-field> | <wc-field> ) [AS <wc-field>]
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval
expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you
specify. You can use wild card characters in field names. For more information on eval expressions, see Types of
eval expressions in the Search Manual.
Optional arguments
binspanlat
Syntax: binspanlat=<float>
Description: The size of the bins in latitude degrees at the lowest zoom level. If you set binspanlat lower than
the default value, the visualizations on the map might not render.
Default: 22.5. If the default values for binspanlat and binspanlong are used, a grid size of 8x8 is generated.
binspanlong
Syntax: binspanlong=<float>
Description: The size of the bins in longitude degrees at the lowest zoom level. If you set binspanlong lower than
33, the visualizations on the map might not render.
Default: 45.0. If the default values for binspanlat and binspanlong are used, a grid size of 8x8 is generated.
by-clause
Syntax: BY <field>
Description: The name of the field to group by.
globallimit
Syntax: globallimit=<int>
Description: Controls the number of named categories to add to each pie chart. There is one additional category
called "OTHER" under which all other split-by values are grouped. Setting globallimit=0 removes all limits and all
categories are rendered. Currently the grouping into "OTHER" only works intuitively for count and additive
statistics.
Default: 10
locallimit
Syntax: locallimit=<int>
Description: Specifies the limit for series filtering. When you set locallimit=N, the top N values are filtered based
on the sum of each series. If locallimit=0, no filtering occurs.
Default: 10
latfield
Syntax: latfield=<field>
Description: Specify a field from the pre-search that represents the latitude coordinates to use in your analysis.
Default: lat
longfield
Syntax: longfield=<field>
Description: Specify a field from the pre-search that represents the longitude coordinates to use in your analysis.
Default: lon
maxzoomlevel
Syntax: maxzoomlevel=<int>
Description: The maximum number of levels to create in the quadtree.
Default: 9. Specifies that 10 zoom levels are created, 0-9.
outputlatfield
Syntax: outputlatfield=<string>
Description: Specify a name for the latitude field in your geostats output data.
Default: latitude
outputlongfield
Syntax: outputlongfield=<string>
Description: Specify a name for the longitude field in your geostats output data.
Default: longitude
translatetoxy
Syntax: translatetoxy=<bool>
Description: If true, geostats produces one result per binned location. This mode is appropriate for rendering on a map. If false, geostats produces one result per category (or tuple, for a multiply split dataset) per binned location. Essentially this causes the data to be broken down by category. This mode cannot be rendered on a map.
Default: true
stats-func
Syntax: The syntax depends on the function that you use. See Usage.
Description: Statistical and charting functions that you can use with the geostats command. Each time you
invoke the geostats command, you can use one or more functions.
Usage
To display the information on a map, you must run a reporting search with the geostats command.
If you are using a lookup command before the geostats command, see Optimizing your lookup search.
Supported functions
You can use a wide range of functions with the geostats command. For general information about using functions, see
Statistical and charting functions.
• For an alphabetical list of statistical functions, see Alphabetical list of functions
A pair of limits.conf settings strike a balance between the performance of geostats searches and the amount of
memory they use during the search process, in RAM and on disk. If your geostats searches are consistently slow to
complete you can adjust these settings to improve their performance, but at the cost of increased search-time memory
usage, which can lead to search failures.
For more information, see Memory and stats search performance in the Search Manual.
Basic examples
1. Calculate the count of events clustered by the default fields
Cluster events by the default latitude and longitude fields "lat" and "lon" respectively. Calculate the count of the events.
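For example:
... | geostats count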
2. Specify the latfield and longfield and calculate the average of a field
Compute the average rating for each gender after clustering/grouping the events by "eventlat" and "eventlong" values.
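For example:
... | geostats latfield=eventlat longfield=eventlong avg(rating) BY gender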
Extended examples
3. Count each product sold by a vendor and display the information on a map
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
In addition, this example uses several lookup files that you must download (prices.csv.zip and vendors.csv.zip) and
unzip the files. You must complete the steps in the Enabling field lookups section of the tutorial for both the prices.csv
and the vendors.csv files. The steps in the tutorial are specific to the prices.csv file. For the vendors.csv file, use the
name vendors_lookup for the lookup definition. Skip the step in the tutorial that makes the lookups automatic.
This search uses the stats command to narrow down the number of events that the lookup and geostats commands
need to process.
Use the following search to count each product sold by a vendor and display the information on a map.
sourcetype=vendor_sales | stats count by Code VendorID | lookup prices_lookup Code OUTPUTNEW product_name |
table product_name VendorID | lookup vendors_lookup VendorID | geostats latfield=VendorLatitude
longfield=VendorLongitude count by product_name
• In this example, sourcetype=vendor_sales is associated with a log file that is included in the Search Tutorial
sample data. This log file contains vendor information that looks like this:
• The vendors_lookup is used to output all the fields in the vendors.csv file that match the VendorID in the vendor_sales.log file. The fields in the vendors.csv file are: Vendor, VendorCity, VendorID, VendorLatitude, VendorLongitude, VendorStateProvince, and VendorCountry.
• The prices_lookup is used to match the Code field in each event to a product_name in the table.
In this search, the CSV files are uploaded and the lookups are defined but are not automatic.
Click the Visualization tab. The results are plotted on a world map. There is a pie chart for each vendor in the results.
The larger the pie chart, the larger the count value.
In this screen shot, the mouse pointer is over the pie chart for a region in the northeastern part of the United States. A popup information box displays the latitude and longitude for the vendor, as well as a count of each product that the vendor sold.
See also
Commands
iplocation
stats
xyseries
Reference information
Mapping data in Dashboards and Visualizations
head
Description
Returns the first N results in search order. For a historical search, this means the most recent N events. For a real-time search, this means the first N captured events.
There are two types of limits that can be applied: an absolute number of results, or an expression where all results are
returned until the expression becomes false.
Syntax
head
[<N> | (<eval-expression>)]
[limit=<int>]
[null=<bool>]
[keeplast=<bool>]
Required arguments
None.
If no options or limits are specified, the head command returns the first 10 results.
Optional arguments
<N>
Syntax: <int>
Description: The number of results to return.
Default: 10
limit
Syntax: limit=<int>
Description: Another way to specify the number of results to return.
Default: 10
eval-expression
Syntax: <eval-compare-exp> | <eval-bool-exp>
Description: A valid <eval-expression> that evaluates to a Boolean. The search returns results until this
expression evaluates to false. For more information, see the evaluation functions in the Search Reference.
keeplast
Syntax: keeplast=<bool>
Description: You must specify an <eval-expression> to use the keeplast argument. Controls whether the last result
in the result set is retained. The last result returned is the result that caused the eval-expression to evaluate to
false or NULL. Set keeplast to true to retain the last result in the result set. Set keeplast to false to discard the
last result.
Default: false
null
Syntax: null=<bool>
Description: You must specify an <eval-expression> for the null argument to have any effect. Controls how an
<eval-expression> that evaluates to NULL is handled. For example, if the <eval-expression> is (x > 10) and a
value in field x does not exist, the <eval-expression> evaluates to NULL instead of true or false.
◊ If null=true, the results of the head command include events for which <eval-expression> evaluates to
NULL in the output. The head command continues to process the remaining events.
◊ If null=false, the head command treats the <eval-expression> that evaluates to NULL as if the
<eval-expression> evaluated to false. The head command stops processing events. If keeplast=true, the
event for which the <eval-expression> evaluated to NULL is also included in the output.
Default: false
Usage
Setting limits
If a numeric limit such as a numeric literal or the argument limit=<int> is used, the head command returns the first N
results where N is the selected number. Using both the numeric limit and limit=<int> results in an error.
Using an <eval-expression>
If an <eval-expression> is used, all initial results are returned until the first result where the expression evaluates to false.
The result where the <eval-expression> evaluates to false is kept or dropped based on the keeplast argument.
If both a numeric limit and an <eval-expression> are used, the smaller of the two constraints applies. For example, the
following search returns up to the first 10 results, because the <eval-expression> is always true.
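... | head limit=10 (1==1)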
However, this search returns no results because the <eval-expression> is always false.
... |head limit=10 (0==1)
Basic examples
... | head 20
Return events until the time span of the data is >= 100 seconds
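A sketch using streamstats to compute the running time span:
... | streamstats range(_time) AS timerange | head (timerange<100)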
Extended example
The following example shows the search results when an <eval-expression> evaluates to NULL, and the impact of the
keeplast and null arguments on those results.
Let's start with creating a set of events. The eval command replaces the value 3 with NULL in the count field.
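A sketch that builds this data set:
| makeresults count=7 | streamstats count | eval count=if(count==3, null(), count)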
_time count
2020-05-18 12:46:51 1
2020-05-18 12:46:51 2
2020-05-18 12:46:51
2020-05-18 12:46:51 4
2020-05-18 12:46:51 5
2020-05-18 12:46:51 6
2020-05-18 12:46:51 7
When null is set to true, the head command continues to process the results. In this example the command processes
the results, ignoring NULL values, as long as the count is less than 5. Because keeplast=true the event that stopped the
processing, count 5, is also included in the output.
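Continuing the sketch above:
| makeresults count=7 | streamstats count | eval count=if(count==3, null(), count) | head (count<5) null=true keeplast=true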
_time count
2020-05-18 12:46:51 1
2020-05-18 12:46:51 2
2020-05-18 12:46:51
2020-05-18 12:46:51 4
2020-05-18 12:46:51 5
When null is set to false, the head command stops processing the results when it encounters a NULL value. The events
with count 1 and 2 are returned. Because keeplast=true the event with the NULL value that stopped the processing, the
third event, is also included in the output.
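Continuing the sketch above:
| makeresults count=7 | streamstats count | eval count=if(count==3, null(), count) | head (count<5) null=false keeplast=true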
_time count
2020-05-18 12:46:51 1
2020-05-18 12:46:51 2
2020-05-18 12:46:51
See also
Commands
reverse
tail
highlight
Description
Highlights specified terms in the events list. Matches a string or list of strings and highlights them in the display in Splunk
Web. The matching is not case sensitive.
Syntax
highlight <string>...
Required arguments
<string>
Syntax: <string> ...
Description: A space-separated list of strings to highlight in the results. The list you specify is not case-sensitive.
Any combination of uppercase and lowercase letters that match the string are highlighted.
Usage
The string that you specify must be a field value. The string cannot be a field name.
You must use the highlight command in a search that keeps the raw events and displays output on the Events tab. You cannot use the highlight command with commands, such as stats, that produce calculated or generated results.
Examples
Example 1:
Example 2:
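A search along these lines highlights the terms "login" and "logout" in the events list; the terms are illustrative:
... | highlight login logout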
See also
rangemap
history
Description
Use this command to view your search history in the current application. This search history is presented as a set of
events or as a table.
Syntax
| history [events=<bool>]
Required arguments
None.
Optional arguments
events
Syntax: events=<bool>
Description: When you specify events=true, the search history is returned as events. This invokes the
event-oriented UI, which allows for convenient highlighting and field inspection. When you specify events=false,
the search history is returned in a table format for more convenient aggregate viewing.
Default: false
Fields returned when events=false.
api_et The earliest time of the API call, which is the earliest time for which events were requested.
api_lt The latest time of the API call, which is the latest time for which events were requested.
event_count If the search retrieved or generated events, the count of events returned with the search.
exec_time The execution time of the search, in whole seconds, as a UNIX epoch timestamp.
is_realtime Indicates whether the search was real-time (1) or historical (0).
result_count If the search is a transforming search, the count of results for the search.
scan_count The number of events retrieved from a Splunk index at a low level.
splunk_server The host name of the machine where the search was run.
Usage
The history command is a generating command and should be the first command in the search. Generating commands
use a leading pipe character.
The history command returns your search history only from the application where you run the command.
Examples
Return a table of the search history. You do not have to specify events=false, since that is the default setting.
| history
Return search history as events
| history events=true
See also
Commands
search
iconify
Description
Causes Splunk Web to display an icon for each different value in the list of fields that you specify.
The iconify command adds a field named _icon to each event. This field is the hash value for the event. Within Splunk
Web, a different icon for each unique value in the field is displayed in the events list. If multiple fields are listed, the UI
displays a different icon for each unique combination of the field values.
Syntax
iconify <field-list>
Required arguments
field-list
Syntax: <field>...
Description: Comma or space-delimited list of fields. You cannot specify a wildcard character in the field list.
Usage
Examples
Display a different icon for each unique pair of clientip and method values.
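For example:
... | iconify clientip method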
Here is how Splunk Web displays the results in your Events List:
See also
highlight
inputcsv
Description
For Splunk Enterprise deployments, loads search results from the specified .csv file, which is not modified. The filename
must refer to a relative path in $SPLUNK_HOME/var/run/splunk/csv. If dispatch=true, the path must be in
$SPLUNK_HOME/var/run/splunk/dispatch/<job id>.
If the specified file does not exist and the filename does not have an extension, then the Splunk software assumes that the filename has a .csv extension.
If you run into an issue with the inputcsv command resulting in an error, ensure that your CSV file ends with a BLANK
LINE.
Syntax
| inputcsv
[dispatch=<bool>]
[append=<bool>]
[strict=<bool>]
[start=<int>]
[max=<int>]
[events=<bool>]
<filename>
[WHERE <search-query>]
Required arguments
filename
Syntax: <filename>
Description: Specify the name of the .csv file, located in $SPLUNK_HOME/var/run/splunk/csv.
Optional arguments
dispatch
Syntax: dispatch=<bool>
Description: When set to true, this argument indicates that the filename is a .csv file in the dispatch directory.
The relative path is $SPLUNK_HOME/var/run/splunk/dispatch/<job id>/.
Default: false
append
Syntax: append=<bool>
Description: Specifies whether the data from the .csv file is appended to the current set of results (true) or
replaces the current set of results (false).
Default: false
strict
Syntax: strict=<bool>
Description: When set to true this argument forces the search to fail completely if inputcsv raises an error. This
happens even when the errors apply to a subsearch. When set to false, many inputcsv error conditions return
warning messages but do not otherwise cause the search to fail. Certain error conditions cause the search to fail
even when strict=false.
Default: false
events
Syntax: events=<bool>
Description: Specifies whether the data in the CSV file are treated as events or as a table of search results. By
default events=false returns the data in a table with field names as column headings. The table appears on the
Statistics tab. If you set events=true, the imported CSV data must have the _time and _raw fields. The data is
treated as events, which appear on the Events tab.
Default: false
max
Syntax: max=<int>
Description: Controls the maximum number of events to be read from the file. If max is not specified, there is no
limit to the number of events that can be read.
Default: 1000000000 (1 billion)
start
Syntax: start=<int>
Description: Controls the 0-based offset of the first event to be read.
Default: 0
WHERE
Syntax: WHERE <search-criteria>
Description: Use this clause to improve search performance by prefiltering data returned from the CSV file.
Supports a limited set of search query operators: =, !=, <, >, <=, >=, AND, OR, NOT. Any combination of these
operators is permitted. Also supports wildcard string searches.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
If the append argument is set to true, you can use the inputcsv command to append the data from the CSV file to the
current set of search results. With append=true, you use the inputcsv command later in your search, after the search has
returned a set of results. See Examples.
The append argument is set to false by default. If the append argument is not specified or is set to false, the inputcsv
command must be the first command in the search. Data is loaded from the specified CSV file into the search.
The WHERE clause allows you to narrow the scope of the search of the inputcsv file. It restricts the inputcsv to a smaller
number of rows, which can improve search efficiency when you are working with significantly large CSV files.
Distributed deployments
The inputcsv command is not compatible with search head pooling and search head clustering.
The command saves the *.csv file on the local search head in the $SPLUNK_HOME/var/run/splunk/ directory. The *.csv
files are not replicated on the other search heads.
Use the strict argument to make inputcsv searches fail whenever they encounter an error condition. You can set this at
the system level for all inputcsv and inputlookup searches by changing input_errors_fatal in limits.conf
If you use Splunk Cloud Platform, file a Support ticket to change the input_errors_fatal setting.
Use the strict argument to override the input_errors_fatal setting for an inputcsv search.
Examples
This example loads search results from the $SPLUNK_HOME/var/run/splunk/csv/all.csv file. Those that contain the string
error are saved to the $SPLUNK_HOME/var/run/splunk/csv/error.csv file.
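A sketch of this search:
| inputcsv all.csv | search error | outputcsv error.csv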
This example loads results 101 to 600 from either the bar file, if it exists, or from the bar.csv file.
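A sketch of this search, using a 0-based offset of 100 and a maximum of 500 results:
| inputcsv start=100 max=500 bar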
You can use comparison operators and Boolean expression to specify which results to load. This example loads all of the
events from the CSV file $SPLUNK_HOME/var/run/splunk/csv/students.csv and then filters out the events that do not
match the WHERE clause, where the values in the age field are greater than 13, less than 19, but not 16. The search
returns a count of the remaining search results.
| inputcsv students.csv WHERE (age>=13 age<=19) AND NOT age=16 | stats count
You can use the append argument to append data from a CSV file to a set of search results. In this example the combined
data is then output back to the same CSV file.
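A sketch of this pattern, with illustrative file names:
| inputcsv existing.csv | inputcsv append=true new_results.csv | outputcsv existing.csv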
You can also append the search results of one CSV file to another CSV file by using the append command and a
subsearch. This example uses the eval command to add a field to each set of data to denote which CSV file the data
originated from.
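A sketch of this pattern, with illustrative file and field names:
| inputcsv file1.csv | eval source_file="file1.csv" | append [| inputcsv file2.csv | eval source_file="file2.csv"]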
See also
outputcsv
inputintelligence
The inputintelligence command is used with Splunk Enterprise Security.
For information about this command, see Use generic intelligence in search with inputintelligence in Administer Splunk
Enterprise Security.
inputlookup
Description
Use the inputlookup command to search the contents of a lookup table. The lookup table can be a CSV lookup or a KV
store lookup.
Syntax
| inputlookup
[append=<bool>]
[strict=<bool>]
[start=<int>]
[max=<int>]
[<filename> | <tablename>]
[WHERE <search-query>]
Required arguments
<filename>
Syntax: <string>
Description: The name of the lookup file must end with .csv or .csv.gz. If the lookup does not exist, a warning
message is displayed (but no syntax error is generated).
<tablename>
Syntax: <string>
Description: The name of the lookup table as specified by a stanza name in the transforms.conf file. The lookup
table can be configured for any lookup type (CSV, external, or KV store).
Optional arguments
append
Syntax: append=<bool>
Description: If set to true, the data returned from the lookup file is appended to the current set of results rather
than replacing it. Defaults to false.
strict
Syntax: strict=<bool>
Description: When set to true this argument forces the search to fail completely if inputlookup raises an error.
This happens even when the errors apply to a subsearch. When set to false, many inputlookup error conditions
return warning messages but do not otherwise cause the search to fail. Certain error conditions cause the search
to fail even when strict=false.
Default: false
max
Syntax max=<int>
Description: Specify the maximum number of events to be read from the file. Defaults to 1000000000.
start
Syntax: start=<int>
Description: Specify the 0-based offset of the first event to read. If start=0, it begins with the first event. If
start=4, it begins with the fifth event. Defaults to 0.
WHERE clause
Syntax: WHERE <search-query>
Description: Use this clause to improve search performance by prefiltering data returned from the lookup table.
Supports a limited set of search query operators: =, !=, <, >, <=, >=, AND, OR, NOT. Any combination of these
operators is permitted. Also supports wildcard string searches.
Usage
Generating commands use a leading pipe character and should be the first command in a search. The inputlookup
command can be first command in a search or in a subsearch.
The lookup can be a file name that ends with .csv or .csv.gz, or a lookup table definition in Settings > Lookups >
Lookup definitions.
When using the inputlookup command in a subsearch, if append=true, data from the lookup file or KV store collection is
appended to the search results from the main search. When append=false the main search results are replaced with the
results from the lookup search.
The WHERE clause allows you to narrow the scope of the query that inputlookup makes against the lookup table. It
restricts inputlookup to a smaller number of lookup table rows, which can improve search efficiency when you are
working with significantly large lookup tables.
You can use the inputlookup command to verify that the geometric features on the map are correct. The syntax is |
inputlookup <your_lookup>.
1. For example, to verify that the geometric features in built-in geo_us_states lookup appear correctly on the
choropleth map, run the following search:
| inputlookup geo_us_states
2. On the Visualizations tab, zoom in to see the geometric features. In this example, the states in the United States.
Use the strict argument to make inputlookup searches fail whenever they encounter an error condition. You can set this
at the system level for all inputcsv and inputlookup searches by changing input_errors_fatal in limits.conf.
If you use Splunk Cloud Platform, file a Support ticket to change the input_errors_fatal setting.
Use the strict argument to override the input_errors_fatal setting for an inputlookup search.
Additional information
For more information about creating lookups, see About lookups in the Knowledge Manager Manual.
For more information about the App Key Value store, see About KV store in the Admin Manual.
Examples
| inputlookup usertogroup
2. Append lookup table fields to the current search results
Using a subsearch, read in the usertogroup lookup table that is defined by a stanza in the transforms.conf file. Append
the fields to the results in the main search.
| inputlookup users.csv
Search the contents of the KV store collection kvstorecoll that have a CustID value greater than 500 and a CustName
value that begins with the letter P. The collection is referenced in a lookup table called kvstorecoll_lookup. Provide a
count of the events received from the table.
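A sketch of this search, using the wildcard support in the WHERE clause:
| inputlookup kvstorecoll_lookup WHERE (CustID>500) AND (CustName="P*") | stats count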
In this example, the lookup definition explicitly defines the CustID field as a type of "number". If the field type is not
explicitly defined, the where clause does not work. Defining field types is optional.
Example 5: View internal key ID values for the KV store collection kvstorecoll, using the lookup table
kvstorecoll_lookup. The internal key ID is a unique identifier for each record in the collection. This example uses the
eval and table commands.
| inputlookup kvstorecoll_lookup | eval CustKey = _key | table CustKey, CustName, CustStreet, CustCity,
CustState, CustZip
Update field values for a single KV store collection record. This example uses the inputlookup, outputlookup, and eval
commands. The record is indicated by its internal key ID (the _key field) and this search updates the record with a new
customer name and customer city. The record belongs to the KV store collection kvstorecoll, which is accessed through
the lookup table kvstorecoll_lookup.
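A sketch of this pattern; the _key value and the customer values are placeholders:
| inputlookup kvstorecoll_lookup | search _key=1234567890abcdef12345678 | eval CustName="New Name", CustCity="New City" | outputlookup kvstorecoll_lookup append=true key_field=_key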
Write the contents of a CSV file to the KV store collection kvstorecoll using the lookup table kvstorecoll_lookup. The
CSV file is in the $SPLUNK_HOME/etc/system/lookups or $SPLUNK_HOME/etc/apps/<app_name>/lookups directory.
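A sketch of this pattern, with an illustrative CSV file name:
| inputlookup customers.csv | outputlookup kvstorecoll_lookup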
See also
Commands
inputcsv
join
lookup
outputlookup
iplocation
Description
The iplocation command extracts location information from IP addresses by using 3rd-party databases. This command
supports IPv4 and IPv6 addresses and subnets that use CIDR notation.
The IP address that you specify in the ip-address-fieldname argument, is looked up in the database. Fields from that
database that contain location information are added to each event. The setting used for the allfields argument
determines which fields are added to the events.
Because all the information might not be available for each IP address, an event can have empty field values.
For IP addresses that don't have a location, such as internal addresses, no fields are added.
Syntax
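Based on the arguments described below, the command takes this general form:
iplocation [prefix=<string>] [allfields=<bool>] [lang=<string>] <ip-address-fieldname>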
Required arguments
ip-address-fieldname
Syntax: <field>
Description: Specify an IP address field, such as clientip.
Optional arguments
allfields
Syntax: allfields=<bool>
Description: Specifies whether to add all of the fields from the database to the events. If set to true, adds the
fields City, Continent, Country, lat (latitude), lon (longitude), MetroCode, Region, and Timezone.
Default: false. Only the City, Country, lat, lon, and Region fields are added to the events.
lang
Syntax: lang=<string>
Description: Render the resulting strings in different languages. For example, use "lang=es" for Spanish. The set
of languages depends on the geoip database that is used. To specify more than one language, separate them
with a comma. This also indicates the priority in descending order. Specify "lang=code" to return the fields as two
letter ISO abbreviations.
prefix
Syntax: prefix=<string>
Description: Specify a string to prefix the field name. With this argument you can add a prefix to the added field
names to avoid name collisions with existing fields. For example, if you specify prefix=iploc_ the field names
that are added to the events become iploc_City, iploc_Country, iploc_lat, and so forth.
Default: NULL/empty string
Usage
The Splunk software ships with a copy of the GeoLite2-City.mmdb database file. This file is located in the
$SPLUNK_HOME/share/ directory.
You can replace the version of the .mmdb file that ships with the Splunk software with a copy of the paid version of the file
or with a monthly update of the free version of the file.
1. From http://dev.maxmind.com/geoip/geoip2/geolite2/, download the binary gzipped version of the GeoLite2 City
database file.
2. Copy the file to the search head on your Splunk Enterprise instance.
3. Expand the GZ file.
4. Stop any real-time searches that are running.
5. Copy the GeoLite2-City.mmdb file to the $SPLUNK_HOME/share/ directory to overwrite the file there.
6. Restart the real-time searches.
When you upgrade your Splunk platform, the GeoLite2-City.mmdb file in the share directory is replaced by the version of
the file that ships with the Splunk software. One option is to store the MMDB file in a different path.
If you prefer to update the GeoLite2-City.mmdb file yourself, for example if you use a paid version of the file, you can store
the MMDB file in a different path. The path that is used by the Splunk software to access the file must be updated.
Prerequisites
• Only users with file system access, such as system administrators, can specify a different path to the MMDB file
in the limits.conf file.
• Review the steps in How to edit a configuration file in the Admin Manual.
• You can have configuration files with the same name in your default, local, and app directories. Read Where you
can place (or find) your modified configuration files in the Admin Manual.
Never change or copy the configuration files in the default directory. The files in the default directory must remain intact
and in their original location. Make the changes in the local directory.
If you use Splunk Cloud Platform, updates to the MMDB file are provided ONLY with Splunk version upgrades. If you wish
to discuss or request this, file a Support ticket.
Steps
1. Open the local limits.conf file for the Search app. For example, $SPLUNK_HOME/etc/system/local.
2. Add the [iplocation] stanza.
3. Add the db_path setting and specify the absolute path to the GeoLite2-City.mmdb file. The db_path setting does
not support standard Splunk environment variables such as $SPLUNK_HOME.
For example: db_path = /Applications/Splunk/mmdb/GeoLite2-City.mmdb specifies a new directory called mmdb.
4. Ensure a copy of the MMDB file is stored in the ../Applications/Splunk/mmdb/ directory.
5. Because you are editing the path to the MMDB file, you should restart the Splunk server.
Alternatively, you can add the updated MMDB to the share directory using a different name and then specify that name in
the db_path setting. For example: db_path = /Applications/Splunk/share/GeoLite2-City_paid.mmdb.
The iplocation command is a distributable streaming command, which means that it can be processed on the indexers.
The share directory is not part of the knowledge bundle. If you update the MMDB file in the share directory, the updated
file is not automatically sent to the indexers in a distributed deployment. To add the MMDB file to the indexers, use the
tools that you typically use to push files to the indexers.
Examples
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
Add location information to web access events. By default, the iplocation command adds the City, Country, lat, lon,
and Region fields to the results.
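A search along these lines does this, using the tutorial's web access sourcetype and clientip field:
sourcetype=access_* | iplocation clientip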
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
Search for client errors in web access events, returning only the first 20 results. Add location information and return a
table with the IP address, City, and Country for each client error.
sourcetype=access_* status>=400 | head 20 | iplocation clientip | table clientip, status, City, Country
The results appear on the Statistics tab and look something like this:
clientip status City Country
198.35.1.75 406 Princeton United States
198.35.1.75 500 Princeton United States
221.204.246.72 503 Taiyuan China
1.192.86.205 503 Amesbury United States
91.205.189.15 406
Prefix the fields added by the iplocation command with iploc_. Add all of the fields in the GeoLite2-City.mmdb database
file to the results.
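A search along these lines does this; the sourcetype and IP field are from the tutorial data, and the search relies on the allfields argument to add every available field:
sourcetype=access_* | iplocation prefix=iploc_ allfields=true clientip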
4. Generate a choropleth map using IP addresses
Generate a choropleth map of your data like the one below using the iplocation command. See Use IP addresses to
generate a choropleth map in Dashboards and Visualizations.
The iplocation command supports IPv6 lookup through IP geolocation functionality. In the following example, iplocation
looks up the specified IP address in the default geolocation database file to determine where it is located.
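A sketch of such a search, using makeresults to supply an IPv6 address (the address shown is illustrative):
| makeresults | eval ip="2001:0db8::ff00:12:3456" | iplocation ip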
Search finds the location of the IP address and displays the following results.
Commands
lookup
search
Functions
cidrmatch
join
Description
You can use the join command to combine the results of a main search (left-side dataset) with the results of either
another dataset or a subsearch (right-side dataset). You can also combine a search result set to itself using the selfjoin
command.
The left-side dataset is the set of results from a search that is piped into the join command and then merged on the right
side with either a dataset or the results from a subsearch. The left-side dataset is sometimes referred to as the source
data.
The following search example joins the source data from the search pipeline with a subsearch on the right side. Rows
from each dataset are merged into a single row if the where predicate is satisfied.
<left-dataset>
| join left=L right=R where L.pid = R.pid [subsearch]
A maximum of 50,000 rows in the right-side dataset can be joined with the left-side dataset. This maximum default is set
to limit the impact of the join command on performance and resource consumption.
If you are familiar with SQL but new to SPL, see Splunk SPL for SQL users.
Alternative commands
For flexibility and performance, consider using one of the following commands if you do not require join semantics. These
commands provide event grouping and correlations using time and geographic location, transactions, subsearches, field
lookups, and joins.
Command Use
append To append the results of a subsearch to the results of your current search. The events from both result sets are retained.
• Use only with historical data. The append command does not produce correct results if used in a real-time search.
• If you use append to combine the events, use a stats command to group the events in a meaningful way. You
cannot use a transaction command after you use an append command.
appendcols Appends the fields of the subsearch results with the input search result fields. The first subsearch result is merged with the
first main result, the second subsearch result is merged with the second main result, and so on.
lookup Use when one of the result sets or source files remains static or rarely changes. For example, a file from an external system
such as a CSV file.
stats To group events by a field and perform a statistical function on the events. For example, to determine the average duration of
events by host name.
transaction Use the transaction command in the following situations:
• To group events by using the eval command with a conditional expression, such as if, case, or match.
• To group events by using a recycled field value, such as an ID or IP address.
• To group events by using a pattern, such as a start or end time for the event.
• To break up groups larger than a certain duration. For example, when a transaction does not explicitly end with a
message and you want to specify a maximum span of time after the start of the transaction.
• To display the raw event data for the grouped events.
For information about when to use a join, see the flowchart in About event grouping and correlation in the Search Manual.
Syntax
join
[<join-options>...]
[<field-list>] | [left=<left-alias>] [right=<right-alias>] where <left-alias>.<field>=<right-alias>.<field>
[<left-alias>.<field>=<right-alias>.<field>]...
<dataset-type>:<dataset-name> | <subsearch>
Required arguments
dataset-type
Syntax: datamodel | savedsearch | inputlookup
Description: The type of dataset that you want to use to join with the source data. The dataset must be a dataset
that you created or are authorized to use. You can specify datamodel, savedsearch, or inputlookup. The dataset
type must precede the dataset name. For example, savedsearch:<dataset-name>.
You can use either <dataset-type>:<dataset-name> or <subsearch> with the join command, but not both.
dataset-name
Syntax: <dataset-name>
Description: The name of the dataset that you want to use to join with the source data. The dataset must be a
dataset that you created or are authorized to use. The dataset name must follow the dataset type. For example, if
the dataset name is january and the dataset type is datamodel, you specify datamodel:january.
You can use either <dataset-type>:<dataset-name> or <subsearch> with the join command, but not both.
subsearch
Syntax: [<subsearch>]
Description: A secondary search or dataset that specifies the source of the events that you want to join to. The
subsearch must be enclosed in square brackets. The results of the subsearch should not exceed available
memory.
You can use either <dataset-type>:<dataset-name> or <subsearch> in a search, but not both. When
[<subsearch>] is used in a search by itself with no join keys, the Splunk software autodetects common fields and
combines the search results before the join command with the results of the subsearch.
Optional arguments
join-options
Syntax: type=(inner | outer | left) | usetime=<bool> | earlier=<bool> | overwrite=<bool> | max=<int>
Description: Arguments to the join command. Use either outer or left to specify a left outer join. See
Descriptions for the join-options argument in this topic.
field-list
Syntax: <field> <field> ...
Description: Specify the list of fields to use for the join. For example, to join fields ProductA, ProductB, and
ProductC, you would specify | join ProductA ProductB ProductC.... If <field-list> is specified, one or more of
the fields must be common to each dataset. If no fields are specified, all of the fields that are common to both
datasets are used.
left alias
Syntax: left=<left-alias>
Description: The alias to use with the left-side dataset, the source data, to avoid naming collisions. Must be
combined with the right alias and where clause, or the alias is ignored.
The left alias must be used together with the right alias.
right alias
Syntax: right=<right-alias>
Description: The alias to use with the right-side dataset to avoid naming collisions. Must be combined with the
left alias and the where clause, or the alias is ignored.
The right alias must be used together with the left alias.
where clause
Syntax: where <left-alias>.<field>=<right-alias>.<field>...
Description: Identifies the names of the fields in the left-side dataset and the right-side dataset that you want to
join on. You must specify the left and right aliases and the field name. Fields that are joined from the left and right
datasets do not have to have the same names. For example: where L.host=R.user matches events in the host
field from the left dataset with events in the user field from the right dataset.
The where clause must be used with the right and left aliases and field name.
You can specify the aliases and fields in a where clause on either side of the equal sign. For example:
where <left-alias>.<left-field>=<right-alias>.<right-field>
or
where <right-alias>.<right-field>=<left-alias>.<left-field>
type
Syntax: type=inner | outer | left
Description: Indicates the type of join to perform. The difference between an inner and a left (or outer) join is
how the events in the main search that do not match any of the events in the subsearch are treated. In both inner
and left joins, events that match are joined. The results of an inner join do not include events from the main
search that have no matches in the subsearch. The results of a left (or outer) join include all of the events in
the main search, and only those events in the subsearch that have matching field values.
Default: inner
usetime
Syntax: usetime=<bool>
Description: A Boolean value that indicates whether to use time to limit the matches in the subsearch results.
Used with the earlier option to limit the subsearch results to matches that are earlier or later than the main
search results.
Default: false
earlier
Syntax: earlier=<bool>
Description: If usetime=true and earlier=true, the main search results are matched only against earlier results
from the subsearch. If earlier=false, the main search results are matched only against later results from the
subsearch. Results that occur at the same time (second) are not eliminated by either value.
Default: true
overwrite
Syntax: overwrite=<bool>
Description: If fields in the main search results and subsearch results have the same name, indicates whether
fields from the subsearch results overwrite the fields from the main search results.
Default: true
max
Syntax: max=<int>
Description: Specifies the maximum number of subsearch results that each main search result can join with. If
set to max=0, there is no limit.
Default: 1
Usage
The join command is a centralized streaming command when there is a defined set of fields to join to. Otherwise the
command is a dataset processing command. See Command types.
Use the join command when the results of the subsearch are relatively small, for example 50,000 rows or less. To
minimize the impact of this command on performance and resource consumption, Splunk software imposes some default
limitations on the subsearch.
Limitations on the subsearch for the join command are specified in the limits.conf file. The limitations include a
maximum of 50,000 rows in the subsearch to join against and the maximum search time for the subsearch. See
Subsearches in the Search Manual.
Splunk Enterprise
To change the subsearch_maxout or subsearch_maxtime settings in your limits.conf file for join command
subsearches, follow these steps.
Prerequisites
◊ Only users with file system access, such as system administrators, can edit configuration files.
◊ Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
Never change or copy the configuration files in the default directory. The files in the default directory must
remain intact and in their original location. Make changes to the files in the local directory.
Steps:
1. Open or create a local limits.conf file at $SPLUNK_HOME/etc/system/local.
2. Under the [join] stanza, add the line subsearch_maxout = <value> or subsearch_maxtime = <value>.
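For example, a local limits.conf file that raises both settings might look like this (the values shown are illustrative):
[join]
subsearch_maxout = 100000
subsearch_maxtime = 120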
To return matches for one-to-many, many-to-one, or many-to-many relationships, include the max argument in your join
syntax and set the value to 0. By default max=1, which means that the subsearch returns only the first result from the
subsearch. Setting the value to a higher number or to 0, which is unlimited, returns multiple results from the subsearch.
Basic examples
1. A basic join
Combine the results from a main search with the results from a subsearch search vendors. The result sets are joined on
the product_id field, which is common to both sources.
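A search along these lines performs that join:
... | join product_id [search vendors]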
By default, only the first row of the subsearch that matches a row of the main search is returned. To return all of the
matching subsearch rows, include the max=<int> argument and set the value to 0. This argument joins each matching
subsearch row with the corresponding main search row.
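For example:
... | join product_id max=0 [search vendors]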
Combine the results from a search with the vendors dataset. The data is joined on the product_id field, which is common
to both datasets.
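A search along these lines performs that join, using a subsearch over the vendors data:
... | join product_id [search vendors]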
Combine the results from a search with the vendors dataset. The data is joined on a product ID field that has a different
field name in each dataset. The field in the left-side dataset is product_id. The field in the right-side dataset is pid.
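A search along these lines performs that join, using aliases for the two datasets:
... | join left=L right=R where L.product_id=R.pid [search vendors]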
You can use words for the aliases to help identify the datasets involved in the join. This example uses products and
vendors for the aliases.
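For example:
... | join left=products right=vendors where products.product_id=vendors.pid [search vendors]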
Extended examples
This example joins each matching right-side dataset row with the corresponding source data row. This example uses
products, which is a savedsearch type of dataset, for the right-side dataset. The field names in the left-side dataset and
the right-side dataset are different. This search returns all of the matching rows in the left and right datasets by including
max=0 in the search.
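A sketch of such a search, assuming the join fields are product_id on the left and pid on the right, as in the aliasing example that follows:
... | join max=0 left=L right=R where L.product_id=R.pid savedsearch:products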
Commands following the join can take advantage of the aliasing provided through the join command. For example, you
can use the aliasing in another command like stats as shown in the following example.
... | join left=L right=R where L.product_id=R.pid [search vendors] | stats count by L.product_id
The dashboards and alerts in the distributed management console show you performance information about your Splunk
deployment. The Resource Usage: Instance dashboard contains a table that shows the machine, number of cores,
physical memory capacity, operating system, and CPU architecture.
To display the information in the table, use the following search. This search includes the join command. The search
uses the information in the dmc_assets table to look up the instance name and machine name. The search then uses the
serverName field to join the information with information from the /services/server/info REST endpoint. The
/services/server/info is the URI path to the Splunk REST API endpoint that provides hardware and operating system
information for the machine. The $splunk_server$ part of the search is a dashboard token variable.
| inputlookup dmc_assets
| search serverName = $splunk_server$
| stats first(serverName) AS serverName, first(host) AS host, first(machine) AS machine
| join type=left serverName
[ | rest splunk_server=$splunk_server$ /services/server/info
| fields serverName, numberOfCores, physicalMemoryMB, os_name, cpu_arch]
| fields machine numberOfCores physicalMemoryMB os_name cpu_arch
| rename machine AS Machine, numberOfCores AS "Number of Cores",
physicalMemoryMB AS "Physical Memory Capacity (MB)", os_name AS "Operating System",
cpu_arch AS "CPU Architecture"
See also
kmeans
Description
Partitions the events into k clusters, with each cluster defined by its mean value. Each event belongs to the cluster with
the nearest mean value. Performs k-means clustering on the list of fields that you specify. If no fields are specified,
performs the clustering on all numeric fields. Events in the same cluster are moved next to each other. You have the
option to display the cluster number for each event.
Syntax
Required arguments
None.
Optional arguments
field-list
Syntax: <field> ...
Description: Specify a space-separated list of the exact fields to use for the clustering.
Default: If no fields are specified, all numerical fields are used. Events with non-numerical values in those fields
are skipped.
kmeans-options
Syntax: <reps> | <iters> | <t> | <k> | <cnumfield> | <distype> | <showcentroid>
Description: Options for the kmeans command.
kmeans options
reps
Syntax: reps=<int>
Description: Specify the number of times to repeat kmeans using random starting clusters.
Default: 10
iters
Syntax: maxiters=<int>
Description: Specify the maximum number of iterations allowed before failing to converge.
Default: 10000
t
Syntax: t=<num>
Description: Specify the algorithm convergence tolerance.
Default: 0
k
Syntax: k=<int> | <int>-<int>
Description: Specify as a scalar integer value or a range of integers. When provided as single number, selects
the number of clusters to use. This produces events annotated by the cluster label. When expressed as a range,
clustering is done for each of the cluster counts in the range and a summary of the results is produced. These
results express the size of the clusters, and a 'distortion' field which represents how well the data fits those
selected clusters. Values must be greater than 1 and less than maxkvalue (see Limits section).
Default: k=2
cnumfield
Syntax: cfield=<field>
Description: Names the field to annotate the results with the cluster number for each event.
Default: CLUSTERNUM
distype
Syntax: dt= ( l1 | l1norm | cityblock | cb ) | ( l2 | l2norm | sq | sqeuclidean ) | ( cos | cosine )
Description: Specify the distance metric to use. The l1, l1norm, and cb distance metrics are synonyms for
cityblock. The l2, l2norm, and sq distance metrics are synonyms for sqeuclidean or sqEuclidean. The cos
distance metric is a synonym for cosine.
Default: sqeuclidean
showcentroid
Syntax: showcentroid= true | false
Description: Specify whether to expose the centroid centers in the search results (showcentroid=true) or not.
Default: true
Usage
Limits
The number of clusters to collect the values into -- k -- is not permitted to exceed maxkvalue. The maxkvalue is specified
in the limits.conf file, in the [kmeans] stanza. The maxkvalue default is 1000.
When a range is given for the k option, the total distance between the beginning and ending cluster counts is not
permitted to exceed maxkrange. The maxkrange is specified in the limits.conf file, in the [kmeans] stanza. The
maxkrange default is 100.
The above limits are designed to avoid the computation work becoming unreasonably expensive.
The total number of values which are clustered by the algorithm (typically the number of input results) is limited by the
maxdatapoints parameter in the [kmeans] stanza of limits.conf. If this limit is exceeded at runtime, a warning message
displays in Splunk Web. This defaults to 100000000 or 100 million. This maxdatapoints limit is designed to avoid
exhausting memory.
Examples
Example 1: Group search results into 4 clusters based on the values of the "date_hour" and "date_minute" fields.
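A search along these lines performs the clustering:
... | kmeans k=4 date_hour date_minute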
Example 2: Group results into 2 clusters based on the values of all numerical fields.
... | kmeans
See also
kvform
Description
Extracts key-value pairs from events based on a form template that describes how to extract the values.
For Splunk Cloud Platform, you must create a private app to extract key-value pairs from events. If you are a Splunk
Cloud administrator with experience creating private apps, see Manage private apps in your Splunk Cloud Platform
deployment in the Splunk Cloud Admin Manual. If you have not created private apps, contact your Splunk account
representative for help with this customization.
Syntax
Optional arguments
form
Syntax: form=<string>
Description: Specify a .form file located in a $SPLUNK_HOME/etc/apps/*/forms/ directory.
field
Syntax: field=<field_name>
Description: Uses the field name to look for .form files that correspond to the field values for that field name. For
example, your Splunk deployment uses the splunkd and mongod sourcetypes. If you specify field=sourcetype,
the kvform command looks for the splunkd.form and mongod.form in the $SPLUNK_HOME/etc/apps/*/forms/
directory.
Default: sourcetype
Usage
• Create the forms directory in the appropriate application path. For example
$SPLUNK_HOME/etc/apps/<app_name>/forms.
• Create the .form files and add the files to the forms directory.
A .form file is essentially a text file of all static parts of a form. It might be interspersed with named references to regular
expressions of the type found in the transforms.conf file.
Specifying a form
If the form argument is specified, the kvform command uses the <form_name>.form file found in the Splunk configuration
forms directory. For example, if form=sales_order, the kvform command looks for a sales_order.form file in the
$SPLUNK_HOME/etc/apps/<app_name>/forms directory for all apps. All the events processed are matched against the form,
trying to extract values.
Specifying a field
If you specify the field argument, the kvform command looks for forms in the forms directory that correspond to the
values for that field. For example, if you specify field=error_code, and an event has the field value error_code=404, the
command looks for a form called 404.form in the $SPLUNK_HOME/etc/apps/<app_name>/forms directory.
Default value
If no form or field argument is specified, the kvform command uses the default value for the field argument, which is
sourcetype. The kvform command looks for <sourcetype_value>.form files to extract values.
Examples
Specify field=sourcetype to extract values from forms such as splunkd.form and mongod.form. If there is a form for a
source type, values are extracted from that form. If one of the source types is access_combined but there is no
access_combined.form file, that source type is ignored.
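For example:
... | kvform field=sourcetype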
See also
Commands
extract
multikv
rex
xmlkv
loadjob
Description
Loads events or results of a previously completed search job. The artifacts to load are identified either by the search job id
<sid> or a scheduled search name and the time range of the current search. If a saved search name is provided and
multiple artifacts are found within that range, the latest artifacts are loaded.
You cannot run the loadjob command on ad hoc or real-time searches.
Syntax
| loadjob
(<sid> | <savedsearch>)
[<result-event>]
[<delegate>]
[<artifact_offset>]
[<ignore_running>]
Required arguments
sid
Syntax: <string>
Description: The search ID of the job whose artifacts need to be loaded, for example: 1233886270.2. You can
locate the sid through the Job Inspector or the addinfo command.
savedsearch
Syntax: savedsearch="<user-string>:<app-string>:<search-name-string>"
Description: The unique identifier of a saved search whose artifacts need to be loaded. A saved search is
uniquely identified by the triplet {user, app, savedsearch name}, for example: savedsearch="admin:search:my
Saved Search". There is no method to specify a wildcard or match-all behavior. All portions of the triplet must be
provided.
Optional arguments
result-event
Syntax: events=<bool>
Description: events=true loads events, while events=false loads results.
Defaults: false
delegate
Syntax: job_delegate=<string>
Description: When specifying a saved search, this option selects jobs that were started by the given user.
Scheduled jobs will be run by the delegate "scheduler". Dashboard-embedded searches are run in accordance
with the saved search's dispatchAs parameter, typically the owner of the search.
Defaults: scheduler
artifact_offset
Syntax: artifact_offset=<int>
Description: Selects a search artifact other than the most recent matching one. For example, if
artifact_offset=1, the second most recent artifact will be used. If artifact_offset=2, the third most recent
artifact will be used. If artifact_offset=0, selects the most recent. A value that selects past all available artifacts
will result in an error.
Default: 0
ignore_running
Syntax: ignore_running=<bool>
Description: Skip over artifacts whose search is still running.
Default: true
Usage
Generating commands use a leading pipe character and should be the first command in a search.
The loadjob command can be used for a variety of purposes, but one of the most useful is to run a fairly expensive
search that calculates statistics. You can use loadjob searches to display those statistics for further aggregation,
categorization, field selection and other manipulations for charting and display.
After a search job has completed and the results are cached, you can use this command to access or load the results.
A search head cluster can run the loadjob command only on scheduled saved searches. A search head cluster runs
searches on results or artifacts that the search head cluster replicates.
For more information on artifact replication, see Search head clustering architecture in the Distributed Search manual.
Examples
Loads the results of the latest scheduled execution of saved search MySavedSearch in the 'search' application owned by
the user admin.
| loadjob savedsearch="admin:search:MySavedSearch"
Loads the events that were generated by the search job with id=1233886270.2.
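| loadjob 1233886270.2 events=true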
See also
Commands
addinfo
inputcsv
savedsearch
Related information
Manage search jobs
localize
Description
The localize command generates results that represent a list of time contiguous event regions. An event region is a
period of time in which consecutive events are separated, at most, by the maxpause time value. The regions found can be
expanded using the timeafter and timebefore arguments.
The regions discovered by the localize command are meant to be fed into the map command. The map command uses a
different region for each iteration.
Syntax
Optional arguments
maxpause
Syntax: maxpause=<int>(s|m|h|d)
Description: Specify the maximum (inclusive) time between two consecutive events in a contiguous time region.
Default: 1m
timeafter
Syntax: timeafter=<int>(s|m|h|d)
Description: Specify the amount of time to add to the output endtime field (expand the time region forward in
time).
Default: 30s
timebefore
Syntax: timebefore=<int>(s|m|h|d)
Description: Specify the amount of time to subtract from the output starttime field (expand the time region
backwards in time).
Default: 30s
Usage
You can expand the event range after the last event or before the first event in the region. These expansions are done
arbitrarily, possibly causing overlaps in the regions if the values are larger than maxpause.
The regions are returned in search order, or descending time for historical searches and data-arrival order for realtime
search. The time of each region is the initial pre-expanded start-time.
Each region in the results also includes:
• The range duration, in seconds.
• The region density, defined as the number of events in the range divided by the range duration, in events per second.
Examples
1. Search the time range of each previous result for the term "failure"
Searching for "error" and calling the localize command finds suitable regions around where error occurs and passes
each on to the search inside of the map command. Each iteration works with a specific time range to find potential
transactions.
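A sketch of such a search (the maxpause value is illustrative):
error | localize maxpause=5m | map search="search failure starttimeu=$starttime$ endtimeu=$endtime$"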
See also
map, transaction
localop
Description
Prevents subsequent commands from being executed on remote peers. Tells the search to run subsequent commands
locally, instead.
The localop command forces subsequent commands to be part of the reduce step of the mapreduce process.
Syntax
localop
Examples
Example 1:
The iplocation command in this case will never be run on remote peers. All events from remote peers that originate from
the initial search, which was for the terms FOO and BAR, are forwarded to the search head. The search head is where
the iplocation command is run.
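A sketch of such a search, where clientip is an assumed name for the field that holds the IP address:
FOO BAR | localop | iplocation clientip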
lookup
Description
For information about the types of lookups you can define, see About lookups in the Knowledge Manager Manual.
The lookup command supports IPv4 and IPv6 addresses and subnets that use CIDR notation.
Syntax
lookup
[local=<bool>]
[update=<bool>]
<lookup-table-name>
( <lookup-field> [AS <event-field>] )...
[ OUTPUT | OUTPUTNEW (<lookup-destfield> [AS <event-destfield>] )... ]
Note: The lookup command can accept multiple lookup and event fields and destfields. For example:
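A sketch of the general form, using placeholder names:
... | lookup <lookup-table-name> <lookup-field1> AS <event-field1>, <lookup-field2> AS <event-field2> OUTPUTNEW <lookup-destfield1> AS <event-destfield1>, <lookup-destfield2> AS <event-destfield2>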
Required arguments
<lookup-table-name>
Syntax: <string>
Description: Can be either the name of a CSV file that you want to use as the lookup, or the name of a stanza in
the transforms.conf file that specifies the location of the lookup table file.
Optional arguments
local
Syntax: local=<bool>
Description: If local=true, forces the lookup to run on the search head and not on any remote peers.
Default: false
update
Syntax: update=<bool>
Description: If the lookup table is modified on disk while the search is running, real-time searches do not
automatically reflect the update. To have a real-time search pick up the modified lookup table, specify
update=true. This setting does not apply to searches that are not real-time searches. Specifying update=true
implies local=true.
Default: false
<lookup-field>
Syntax: <string>
Description: Refers to a field in the lookup table to match against the events. You can specify multiple
<lookup-field> values.
<event-field>
Syntax: <string>
Description: Refers to a field in the events from which to acquire the value to match in the lookup table. You can
specify multiple <event-field> values.
Default: The value of the <lookup-field>.
<lookup-destfield>
Syntax: <string>
Description: Refers to a field in the lookup table to be copied into the events. You can specify multiple
<lookup-destfield> values.
<event-destfield>
Syntax: <string>
Description: A field in the events. You can specify multiple <event-destfield> values.
Default: The value of the <lookup-destfield> argument.
Usage
The lookup command is a distributable streaming command when local=false, which is the default setting. See
Command types.
When using the lookup command, if an OUTPUT or OUTPUTNEW clause is not specified, all of the fields in the lookup
table that are not the match fields are used as output fields. If the OUTPUT clause is specified, the output lookup fields
overwrite existing fields. If the OUTPUTNEW clause is specified, the lookup is not performed for events in which the
output fields already exist.
When you set up the OUTPUT or OUTPUTNEW clause for your lookup, avoid accidentally creating lookup reference
cycles, where you intentionally or accidentally reuse the same field names among the match fields and the output fields of
a lookup search.
For example, if you run a lookup search where type is both the match field and the output field, you are creating a lookup
reference cycle. You can accidentally create a lookup reference cycle when you fail to specify an OUTPUT or
OUTPUTNEW clause for lookup.
For more information about lookup reference cycles see Define an automatic lookup in Splunk Web in the Knowledge
Manager Manual.
If you are using the lookup command in the same pipeline as a transforming command, and it is possible to retain the
field that you will look up on after the transforming command, do the lookup after the transforming command rather than
before it.
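For example, assuming a lookup named status_desc that matches a status field and outputs a description field (the names are illustrative), run:
sourcetype=access_* | stats count by status | lookup status_desc status OUTPUT description
and not:
sourcetype=access_* | lookup status_desc status OUTPUT description | stats count by status description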
The lookup in the first search is faster because it only needs to match the results of the stats command and not all the
Web access events.
Basic example
1. Lookup users and return the corresponding group the user belongs to
Suppose you have a lookup table specified in a stanza named usertogroup in the transforms.conf file. This lookup table
contains (at least) two fields, user and group. Your events contain a field called local_user. For each event, the following
search checks to see if the value in the field local_user has a corresponding value in the user field in the lookup table.
For any entries that match, the value of the group field in the lookup table is written to the field user_group in the event.
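A search along these lines performs the lookup:
... | lookup usertogroup user as local_user OUTPUT group as user_group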
Extended example
1. Lookup price and vendor information and return the count for each product sold by a vendor
This example uses the tutorialdata.zip file from the Search Tutorial. You can download this file and follow the instructions to upload the tutorial
data into your Splunk deployment. Additionally, this example uses the prices.csv and the vendors.csv files. To follow along with this example in
your Splunk deployment, download these CSV files and complete the steps in the Use field lookups section of the Search Tutorial for both the
prices.csv and the vendors.csv files. When you create the lookup definition for the vendors.csv file, name the lookup vendors_lookup.
You can skip the step in the tutorial that makes the lookups automatic.
This example calculates the count of each product sold by each vendor.
The prices.csv file contains the product names, price, and code. The vendors.csv file contains vendor information, such
as the vendor name, city, and ID. For example:
Vendor               VendorCity      VendorID  VendorLatitude  VendorLongitude  VendorStateProvince  VendorCountry   Weight
Anchorage Gaming     Anchorage       1001      61.17440033     -149.9960022     Alaska               United States   3
Games of Salt Lake   Salt Lake City  1002      40.78839874     -111.9779968     Utah                 United States   3
New Jack Games       New York        1003      40.63980103     -73.77890015     New York             United States   4
Seals Gaming         San Francisco   1004      37.61899948     -122.375         California           United States   5
The search will query the vendor_sales.log file, which is part of the tutorialdata.zip file. The vendor_sales.log file
contains the VendorID, Code, and AcctID fields. For example:
Entries in the vendor_sales.log file
sourcetype=vendor_* | stats count by Code VendorID | lookup prices_lookup Code OUTPUTNEW product_name
You can extend the search to display more information about the vendor by using the vendors_lookup.
Use the table command to return only the fields that you need. In this example you want the product_name, VendorID, and
count fields. Use the vendors_lookup file to output all the fields in the vendors.csv file that match the VendorID in each
event.
sourcetype=vendor_* | stats count by Code VendorID | lookup prices_lookup Code OUTPUTNEW product_name |
table product_name VendorID count | lookup vendors_lookup VendorID
To expand the search to display the results on a map, see the geostats command.
In this example, CSV lookups are used to determine whether a specified IPv6 address is in a CIDR subnet. You can
follow along with the example by performing these steps in Splunk Web. See Define a CSV lookup in Splunk Web.
Prerequisites
♦ Your role must have the upload_lookup_files capability to upload lookup table files in Splunk Web. See
Define roles with capabilities in Securing Splunk Enterprise.
♦ A CSV lookup table file called ipv6test.csv that contains the following text.
ip,expected
2001:0db8:ffff:ffff:ffff:ffff:ffff:ff00/120,true
The ip field in the lookup table contains the subnet value, not the IP address.
Steps
You have to define a CSV lookup before you can match an IP address to a subnet.
12. Select a Destination app from the drop-down list.
13. Give your lookup definition a unique Name, like ipv6test.
14. Select File-based as the lookup Type.
15. Select ipv6test.csv as the Lookup file from the drop-down list.
16. Select the Advanced options check box.
17. Enter a Match type of CIDR(ip).
18. Click Save.
19. In the Lookup definitions list, click Permissions in the Sharing column of the ipv6test lookup definition you want
to share.
20. In the Permissions dialog box, under Object should appear in, select All apps to share globally. If you want the
lookup to be specific to this app only, select This app only.
Permissions for lookup table files must be at the same level or higher than those of the lookup definitions that
use those files.
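To verify the match, run a search along these lines, which supplies the IP address with makeresults and looks it up against the CIDR subnet:
| makeresults | eval ip="2001:0db8:ffff:ffff:ffff:ffff:ffff:ff99" | lookup ipv6test ip OUTPUT expected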
The IP address is in the subnet, so the search displays true in the expected field. The search results look
something like this.
time expected ip
2020-11-19 16:43:31 true 2001:0db8:ffff:ffff:ffff:ffff:ffff:ff99
See also
Commands
appendcols
inputlookup
outputlookup
iplocation
search
Functions
cidrmatch
Related information
About lookups in the Knowledge Manager Manual
makecontinuous
Description
Makes a field on the x-axis numerically continuous by adding empty buckets for periods where there is no data and
quantifying the periods where there is data. This x-axis field can then be invoked by the chart and timechart commands.
Syntax
Required arguments
<bins-options>
Datatype: bins | span | start-end
Description: Discretization options. See "Bins options" for details.
Optional arguments
<field>
Datatype: <field>
Description: Specify a field name.
Bins options
bins
Syntax: bins=<int>
Description: Sets the maximum number of bins to discretize into.
span
Syntax: <log-span> | <span-length>
Description: Sets the size of each bin, using a span length based on time or log-based span.
<start-end>
Syntax: end=<num> | start=<num>
Description: Sets the minimum and maximum extents for numerical bins. Data outside of the [start, end] range is
discarded.
Span options
<log-span>
Syntax: [<num>]log[<num>]
Description: Sets to log-based span. The first number is a coefficient. The second number is the base. If the first
number is supplied, it must be a real number >= 1.0 and < base. Base, if supplied, must be real number > 1.0,
meaning it must be strictly greater than 1.
span-length
Syntax: <span>[<timescale>]
Description: A span length based on time.
<span>
Syntax: <int>
Description: The span of each bin. If using a timescale, this is used as a time range. If not, this is an absolute bin
"length."
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <month> | <subseconds>
Description: Time scale units.
Usage
Examples
Example 1:
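As a minimal illustration (the field and span shown are illustrative), the following search fills in empty 10-minute bins so that the _time field is numerically continuous:
... | makecontinuous _time span=10m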
See also
chart, timechart
makemv
Description
Converts a single valued field into a multivalue field by splitting the values on a simple string delimiter. The delimiter can
be a multicharacter delimiter. Alternatively, splits field by using a regex.
Syntax
Required arguments
field
Syntax: <field>
Description: Specify the name of a field.
Optional arguments
delim
Syntax: delim=<string>
Description: A string value used as a delimiter. Splits the values in field on every occurrence of this string.
Default: A single space (" ").
tokenizer
Syntax: tokenizer=<string>
Description: A regex, with a capturing group, that is repeat-matched against the text of field. For each match, the
first capturing group is used as a value of the newly created multivalue field.
allowempty
Syntax: allowempty=<bool>
Description: Specifies whether to permit empty string values in the multivalue field. When allowempty=true and
the values are split on the delim string, consecutive occurrences of the delimiter produce empty string values in
the multivalue field. For example, if delim="," and field="a,,b", by default no value is produced for the empty
string between the two commas. When using the tokenizer argument, zero-length matches produce empty string
values when allowempty=true. By default they produce no values.
Default: false
setsv
Syntax: setsv=<bool>
Description: If true, the makemv command combines the decided values of the field into a single value, which is
set on the same field. (The simultaneous existence of a multivalue and a single value for the same field is a
problematic aspect of this flag.)
Default: false
Usage
You can use evaluation functions and statistical functions on multivalue fields or to return multivalue fields.
Examples
For sendmail search results, separate the values of "senders" into multiple values. Display the top values.
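A search along these lines does this, assuming the sendmail events are identified by an eventtype and the senders values are comma-separated:
eventtype="sendmail" | makemv delim="," senders | top senders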
2. Use a colon delimiter and allow empty values
The following search creates a result and adds three values to the my_multival field. The makemv command is used to
arrange the values into separate lines.
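A sketch of such a search, where the field value is illustrative:
| makeresults | eval my_multival="one:two::three" | makemv delim=":" allowempty=true my_multival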
See also
Commands:
mvcombine
mvexpand
nomv
Functions:
Multivalue eval functions
Multivalue stats and chart functions
split
makeresults
Description
If you do not specify any of the optional arguments, this command runs on the local machine and generates one result
with only the _time field.
Syntax
| makeresults
[<count>]
[<annotate>]
[<splunk-server>]
[<splunk-server-group>...]
Required arguments
None.
Optional arguments
<count>
Syntax: count=<num>
Description: The number of results to generate. If you do not specify the annotate argument, the results have
only the _time field.
Default: 1
<annotate>
Syntax: annotate=<bool>
Description: If annotate=true, generates results with the fields shown in the table below.
If annotate=false, generates results with only the _time field.
Default: false
Field Value
_raw None.
_time Date and time that you run the makeresults command.
host None.
source None.
sourcetype None.
splunk_server The name of the server that the makeresults command is run on.
splunk_server_group None.
<splunk-server>
Syntax: splunk_server=<string>
Description: Use to generate results on one specific server. Use 'local' to refer to the search head.
Default: local. See the Usage section.
<splunk-server-group>
Syntax: (splunk_server_group=<string>)...
Description: Use to generate results on a specific server group or groups. You can specify more than one
<splunk_server_group>.
Default: none. See the Usage section.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
The search results created by the makeresults command are created in temporary memory and are not saved to disk or
indexed.
You can use this command with the eval command to generate an empty result for the eval command to operate on. See
the Examples section.
If you use Splunk Cloud Platform, omit any server or server group argument.
If you are using Splunk Enterprise, by default results are generated only on the originating search head, which is
equivalent to specifying splunk_server=local. If you provide a specific splunk_server or splunk_server_group, then the
number of results that you specify with the count argument is generated on all of the servers or server groups that you specify.
If you specify a server, the results are generated for that server, regardless of the server group that the server is
associated with.
If you specify a count of 5 and you target 3 servers, then you will generate 15 total results. If annotate=true, the names
for each server appear in the splunk_server column. This column will show that each server produced 5 results.
Basic examples
Sometimes you want to use the eval command as the first command in a search. However, the eval command expects
events as inputs. You can create a dummy event at the beginning of a search by using the makeresults command. You
can then use the eval command in your search.
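For example, the following search creates a single result and uses the eval command to add a field to it:
| makeresults | eval newfield="some value"
The results look something like this: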
_time newfield
2020-01-09 14:35:58 some value
2. Determine if the modified time of an event is greater than the relative time
For events that contain the field scheduled_time in UNIX time, determine if the scheduled time is greater than the relative
time. The relative time is 1 minute before now. This search uses a subsearch that starts with the makeresults command.
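A sketch of such a search, where the index and sourcetype are illustrative assumptions:
index=_internal sourcetype=scheduler scheduled_time > [| makeresults | eval it=relative_time(now(), "-1m") | return $it]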
Extended examples
You can use the makeresults command to create a series of results to test your search syntax. For example, the following
search creates a set of five results:
| makeresults count=5
The results look something like this:
_time
2020-01-09 14:35:58
2020-01-09 14:35:58
2020-01-09 14:35:58
2020-01-09 14:35:58
2020-01-09 14:35:58
Each result has the same timestamp which, by itself, is not very useful. But with a few additions, you can create a set of
unique dates. Start by adding the streamstats command to count your results:
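| makeresults count=5 | streamstats count
The results look something like this: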
_time count
2020-01-09 14:35:58 1
2020-01-09 14:35:58 2
2020-01-09 14:35:58 3
2020-01-09 14:35:58 4
2020-01-09 14:35:58 5
You can now use that count to create different dates in the _time field, using the eval command.
The calculation multiplies the value in the count field by the number of seconds in a day. The result is subtracted from the
original _time field to get new dates equivalent to 24 hours ago, 48 hours ago, and so forth. The seconds in the date are
different because _time is calculated the moment you run the search.
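For example:
| makeresults count=5 | streamstats count | eval _time=_time-(86400*count)
The results look something like this: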
_time count
2020-01-08 14:45:24 1
2020-01-07 14:45:24 2
2020-01-06 14:45:24 3
2020-01-05 14:45:24 4
2020-01-04 14:45:24 5
The dates start from the day before the original date, 2020-01-09, and go back five days.
Need more than five results? Simply change the count value in the makeresults command.
2. Create hourly results for testing
You can create a series of hours instead of a series of days for testing. Use 3600, the number of seconds in an hour,
instead of 86400 in the eval command.
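For example:
| makeresults count=5 | streamstats count | eval _time=_time-(3600*count)
The results look something like this: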
_time count
2020-01-09 15:35:14 1
2020-01-09 14:35:14 2
2020-01-09 13:35:14 3
2020-01-09 12:35:14 4
2020-01-09 11:35:14 5
Notice that the hours in the timestamp are 1 hour apart.
You can specify a list of values for a field. But to have the values appear in separate results, you need to make the list a
multivalue field and then expand that multivalued list into separate results. Use this search, substituting your strings for
buttercup and her friends:
| makeresults | eval test="buttercup rarity tenderhoof dash mcintosh fleetfoot mistmane" | makemv delim=" "
test | mvexpand test
_time test
2020-01-09 16:35:14 buttercup
Let's start by creating a set of four events. One of the events contains a null value in the age field.
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
null()) | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")
• The streamstats command is used to create the count field. The streamstats command calculates a cumulative
count for each event, at the time the event is processed.
• The eval command is used to create two new fields, age and city. The eval command uses the value in the
count field.
• The case function takes pairs of arguments, such as count=1, 25. The first argument is a Boolean expression.
When that expression is TRUE, the corresponding second argument is returned.
The following search is the same as the previous search, with the eventstats command added at the end:
| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4,
null()) | eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle") | eventstats
avg(age) BY city
If you need to test something with a set of numbers, you have two options:
• You can add a field with a set of numbers that you specify. This is similar to adding a field with a set of string
values, which is shown in the previous example.
• You can add a field with a set of randomly generated numbers by using the random function, as shown below:
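| makeresults count=5 | streamstats count | eval test=random()/random()
Dividing two random() values is one way to produce arbitrary positive numbers; the exact expression shown here is illustrative.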
_time count test
2020-01-08 14:45:24 1 5.371091109260495
See also
Commands
gentimes
map
Description
The map command is a looping operator that runs a search repeatedly for each input event or result. You can run the map
command on a saved search or an ad hoc search.
Syntax
map
(<searchoption> | <savedsplunkoption>)
[maxsearches=int]
Required arguments
<savedsplunkoption>
Syntax: <string>
Description: The name of a saved search to run for each input result.
Default: No default.
<searchoption>
Syntax: search="<string>"
Description: An ad hoc search to run for each input result. For example:
...| map search="search index=_internal earliest=$myearliest$ latest=$mylatest$".
Default: No default.
Optional arguments
maxsearches
Syntax: maxsearches=<int>
Description: The maximum number of searches to run. A message is generated if there are more search results
than the maximum number that you specify. Zero ( 0 ) does not equate to unlimited searches.
Default: 10
Usage
Known limitations
You cannot use the map command after an append or appendpipe command in your search pipeline.
When using a saved search or a literal search, the map command supports the substitution of $variable$ strings that match
field names in the input results. A search with a string like $count$, for example, will replace the variable with the value of
the count field in the input search result.
When using the map command in a dashboard <form>, use double dollar signs ($$) to specify a variable string. For
example, $$count$$. See Dashboards and forms.
Search ID field
The map command also supports a search ID field, provided as $_serial_id$. The search ID field will have a number that
increases incrementally each time that the search is run. In other words, the first run search will have the ID value 1, and
the second 2, and so on.
Basic examples
Extended examples
This example illustrates how to find a Sudo event and then use the map command to trace back to the computer and the
time that users logged on before the Sudo event. Start with the following search for the Sudo event.
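A search along these lines finds the Sudo events and summarizes them by user and host:
sourcetype=syslog sudo | stats count by user host
The results look something like this: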
user   host     count
userB  serverA  3
userA  serverB  2
Pipe these results into the map command, substituting the username.
sourcetype=syslog sudo | stats count by user host | map search="search index=ad_summary username=$user$
type_logon=ad_last_logon"
It takes each of the three results from the previous search and searches in the ad_summary index for the logon event for
the user. The results are returned as a table.
See also
Commands
gentimes
search
mcollect
Description
Converts events into metric data points and inserts the metric data points into a metric index on the search head. A metric
index must be present on the search head for mcollect to work properly, unless you are forwarding data to the indexer.
If you are forwarding data to the indexer, your data will be inserted on the indexer instead of the search head.
You can use the mcollect command only if your role has the run_mcollect capability. See Define roles on the Splunk
platform with capabilities in Securing Splunk Enterprise.
Syntax
| mcollect index=<string>
[ file=<string> ]
[ split=<true | false | allnums> ]
[ spool=<bool> ]
[ prefix_field=<string> ]
[ host=<string> ]
[ source=<string> ]
[ sourcetype=<string> ]
[ marker=<string> ]
[ <field-list> ]
Required arguments
index
Syntax: index=<string>
Description: Name of the metric index where the collected metric data is added.
field-list
Syntax: <field>, ...
Description: A list of dimension fields. Required if split=true. Optional if split=false or split=allnums. If
unspecified, which implies that split=false, mcollect treats all fields as dimensions for the data point except for
the metric_name, prefix_field, and all internal fields.
Default: No default value
Optional arguments
file
Syntax: file=<string>
Description: The file name where you want the collected metric data to be written. Only applicable when
spool=false. You can use a timestamp or a random number for the file name by specifying either
file=$timestamp$ or file=$random$.
Default: $random$_metrics.csv
split
Syntax: split=<true | false | allnums>
Description: Determines how mcollect identifies the measures in an event. See How to use the split argument.
Default: false
spool
Syntax: spool=<bool>
Description: If set to true, the metrics data file is written to the Splunk spool directory,
$SPLUNK_HOME/var/spool/splunk, where the file is indexed. Once the file is indexed, it is removed. If set to false,
the file is written to the $SPLUNK_HOME/var/run/splunk directory. The file remains in this directory unless further
automation or administration is done.
Default: true
prefix_field
Syntax: prefix_field=<string>
Description: Only applicable when split=true. If specified, any data point with that field missing is ignored.
Otherwise, the field value is prefixed to the metric name. See Set a prefix field
Default: No default value
host
Syntax: host=<string>
Description: The name of the host that you want to specify for the collected metrics data. Only applicable when
spool=true.
Default: No default value
source
Syntax: source=<string>
Description: The name of the source that you want to specify for the collected metrics data.
Default: If the search is scheduled, the name of the search. If the search is ad-hoc, the name of the file that is
written to the var/spool/splunk directory containing the search results.
sourcetype
Syntax: sourcetype=<string>
Description: The name of the source type that is specified for the collected metrics data. The Splunk platform
does not calculate license usage for data indexed with mcollect_stash, the default source type. If you change the
value of this setting to a different source type, the Splunk platform calculates license usage for any data indexed
by the mcollect command.
Default: mcollect_stash
Do not change this setting without assistance from Splunk Professional Services or Splunk Support. Changing
the source type requires a change to the props.conf file.
marker
Syntax: marker=<string>
Description: A string of one or more comma-separated key/value pairs that mcollect adds as dimensions to the
metric data points it generates, for the purpose of searching on those metric data points later. For example, you
could add the name of the mcollect search that you are running, like this:
marker=savedsearch=firewall_top_src_ip. This allows you to run searches later that isolate the metric data
points created by that mcollect search, simply by adding savedsearch=firewall_top_src_ip to the search string.
Usage
You use the mcollect command to convert events into metric data points to be stored in a metric index on the search
head. The metrics data uses a specific format for the metrics fields. See Metrics data format in Metrics.
The mcollect command causes new data to be written to a metric index for every run of the search.
All metrics search commands are case sensitive. This means, for example, that mcollect treats the following as
three distinct values of metric_name: cap.gear, CAP.GEAR, and Cap.Gear.
The Splunk platform cannot index metric data points that contain metric_name fields which are empty or composed
entirely of white spaces.
If you are upgrading to version 8.0.0
After you upgrade your search head and indexer clusters to version 8.0.x of Splunk Enterprise, edit limits.conf on each
search head cluster and set the always_use_single_value_output setting under the [mcollect] stanza to false. This lets
these nodes use the "multiple measures per metric data point" schema when you convert logs to metrics with the
mcollect command or use metrics rollups. This schema increases your data storage capacity and improves metrics
search performance.
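For example, the [mcollect] stanza in each search head cluster member's local limits.conf file would contain:
[mcollect]
always_use_single_value_output = false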
How to use the split argument
The split argument determines how mcollect identifies the measurement fields in your search. It defaults to false.
When split=false, your search needs to explicitly identify its measurement fields. If necessary it can use rename or eval
conversions to do this.
• If you have single-metric events, your mcollect search must produce results with a metric_name field that
provides the name of the measure, and a _value field that provides the measure's numeric value.
• If you have multiple-metric events, your mcollect search must produce results that follow this syntax:
metric_name:<metric_name>=<numeric_value>. mcollect treats each of these fields as a measurement. mcollect
treats the remaining fields as dimensions.
When you set split=true, you use field-list to identify the dimensions in your search. mcollect converts any field that
is not in the field-list into a measurement. The only exceptions are internal fields beginning with an underscore and the
prefix_field, if you have set one.
When you set split=allnums, mcollect treats all numeric fields as metric measures and all non-numeric fields as
dimensions. You can optionally use field-list to declare that mcollect should treat certain numeric fields in the events
as dimensions.
Set a prefix field
Use the prefix_field argument to apply a prefix to the metric fields in your event data. For example, suppose a search
result contains the fields type=cpu, usage=0.78, and idle=0.22, and you run an mcollect search that includes the
following arguments:
...split=true prefix_field=type...
Because you have set split=true, and the usage and idle fields are not otherwise identified as dimensions in a
<field-list>, the Splunk software automatically converts those fields into measures. It then applies the value of the
specified prefix_field as a prefix to the metric field names. In this case, because you have specified the type field as the
prefix field, its value, cpu, becomes the metric name prefix. The results look like this:
metric_name:cpu.usage metric_name:cpu.idle
0.78 0.22
Time
If the _time field is present in the results, the Splunk software uses it as the timestamp of the metric data point. If the
_time field is not present, the current time is used.
field-list
If field-list is not specified, mcollect treats all fields as dimensions for the metric data points it generates, except for
the prefix_field and internal fields (fields with an underscore '_' prefix). If field-list is specified, the list must appear at
the end of the mcollect command arguments. If field-list is specified, all fields are treated as metric values, except for
the fields in field-list, the prefix-field, and internal fields.
The name of each metric value is the field name prefixed with the prefix_field value.
Effectively, one metric data point is returned for each qualifying field that contains a numerical value. If one search result
contains multiple qualifying metric name/value pairs, the result is split into multiple metric data points.
Examples
The following examples show how to use the mcollect command to convert events into multiple-value metric data points.
1: Generate metric data points that break out jobs and latency metrics by user
The following example specifies the metrics that should appear in the resulting metric data points, and splits them by user.
Note that it does not use the split argument, so the search has to use a rename conversion to explicitly identify the
measurements that will appear in the data points.
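A sketch of such a search follows. The index, field, and metric names here are illustrative placeholders rather than the exact search shown in the original example:

index=_audit | stats count AS jobs avg(total_run_time) AS latency BY user | rename jobs AS metric_name:jobs latency AS metric_name:latency | mcollect index=my_metric_index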
The next example sets split=true so that mcollect automatically converts fields not otherwise identified as dimensions by
the <field-list> into metrics. The search identifies user as a dimension.
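A sketch of this split=true variant, with the same placeholder names, might look like this:

index=_audit | stats count AS jobs avg(total_run_time) AS latency BY user | mcollect index=my_metric_index split=true user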
See also
Commands
collect
meventcollect
metadata
Description
The metadata command returns a list of sources, sourcetypes, or hosts from a specified index or distributed search peer.
The metadata command returns information accumulated over time. You can view a snapshot of an index over a specific
timeframe, such as the last 7 days, by using the time range picker.
See Usage.
Syntax
Required arguments
type
Syntax: type= hosts | sources | sourcetypes
Description: The type of metadata to return. This must be one of the three literal strings: hosts, sources, or
sourcetypes.
Optional arguments
index-specifier
Syntax: index=<index_name>
Description: Specifies the index from which to return results. You can specify more than one index. Wildcard
characters (*) can be used. To match non-internal indexes, use index=*. To match internal indexes, use
index=_*.
Example: | metadata type=hosts index=cs* index=na* index=ap* index=eu*
Default: The default index, which is usually the main index.
splunk_server
Syntax: splunk_server=<wc-string>
Description: Specifies the distributed search peer from which to return results.
If you are using Splunk Cloud Platform, omit this parameter.
If you are using Splunk Enterprise, you can specify only one splunk_server argument. However, you can use a
wildcard when you specify the server name to indicate multiple servers. For example, you can specify
splunk_server=peer01 or splunk_server=peer*. Use local to refer to the search head.
Default: All configured search peers return information
splunk_server_group
Syntax: splunk_server_group=<wc-string>...
Description: Limits the results to one or more server groups. If you are using Splunk Cloud, omit this parameter.
You can specify a wildcard character in the string to indicate multiple server groups.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Although the metadata command fetches data from all peers, any command run after it runs only on the search head.
The command shows the first, last, and most recent events that were seen for each value of the specified metadata type.
For example, if you search for:
| metadata type=hosts
• The firstTime field is the timestamp for the first time that the indexer saw an event from this host.
• The lastTime field is the timestamp for the last time that the indexer saw an event from this host.
• The recentTime field is the indextime for the most recent time that the index saw an event from this host. In other
words, this is the time of the last update.
• The totalcount field is the total number of events seen from this host.
• The type field is the specified type of metadata to display. Because this search specifies type=hosts, there is also
a host column.
In most cases, when the data is streaming live, the lastTime and recentTime field values are equal. If the data is
historical, however, the values might be different.
In small testing environments, the data is complete. However, in environments with large numbers of values for each
category, the data might not be complete. This is intentional and allows the metadata command to operate within
reasonable time and memory usage.
Real-time searches
Running the metadata command in a real-time search that returns a large number of results will very quickly consume all
the available memory on the Splunk server. Use caution when you use the metadata command in real-time searches.
Time ranges
Set the time range using the Time Range Picker. You cannot use the earliest or latest time range modifiers in the
search string. Time range modifiers must be set before the first piped command, and generating commands in general do
not allow anything to be specified before the first pipe.
If you specify a time range other than All Time for your search, the search results might not be precise. The metadata is
stored as aggregate numbers for each bucket on the index. A bucket is either included or not included based on the time
range you specify.
For example, you run the following search specifying a time range of Last 7 days. The time range corresponds to
January 1st to January 7th.
There is a bucket on the index that contains events from both December 31st and January 1st. The metadata from that
bucket is included in the information returned from search.
Maximum results
By default, a maximum of 10,000 results are returned. This maximum is controlled by the maxresultrows setting in the
[metadata] stanza in the limits.conf file.
Examples
3. Format the results from the metadata command
You can also use the fieldformat command to format the results of the firstTime, lastTime, and recentTime columns to be
more readable.
| metadata type=sourcetypes index=_internal | rename totalCount as Count firstTime as "First Event" lastTime
as "Last Event" recentTime as "Last Update" | fieldformat Count=tostring(Count, "commas") | fieldformat
"First Event"=strftime('First Event', "%c") | fieldformat "Last Event"=strftime('Last Event', "%c") |
fieldformat "Last Update"=strftime('Last Update', "%c")
Click the Count field label to sort the results and show the highest count first. Now, the results are more readable.
4. Return values of "sourcetype" for events in the "_audit" index on server foo
See also
dbinspect
tstats
metasearch
Description
Retrieves event metadata from indexes based on terms in the <logical-expression>.
Syntax
metasearch [<logical-expression>]
Optional arguments
<logical-expression>
Syntax: <time-opts> | <search-modifier> | [NOT] <logical-expression> | <index-expression> |
<comparison-expression> | <logical-expression> [OR <logical-expression>]
Description: Includes time and search modifiers, comparison and index expressions.
Logical expression
<comparison-expression>
Syntax: <field><cmp><value>
Description: Compare a field to a literal value or values of another field.
<index-expression>
Syntax: "<string>" | <term> | <search-modifier>
<time-opts>
Syntax: [<timeformat>] [<time-modifier>]...
Comparison expression
<cmp>
Syntax: = | != | < | <= | > | >=
Description: Comparison operators.
<field>
Syntax: <string>
Description: The name of one of the fields returned by the metasearch command. See Usage.
<lit-value>
Syntax: <string> | <num>
Description: An exact, or literal, value of a field that is used in a comparison expression.
<value>
Syntax: <lit-value> | <field>
Description: In comparison-expressions, the literal value of a field or another field name. The <lit-value> must be
a number or a string.
Index expression
<search-modifier>
Syntax: <field-specifier> | <savedsplunk-specifier> | <tag-specifier>
Time options
The search allows many flexible options for searching based on time. For a list of time modifiers, see the topic Time
modifiers for search in the Search Manual.
<timeformat>
Syntax: timeformat=<string>
Description: Set the time format for starttime and endtime terms. By default, timestamp is formatted:
timeformat=%m/%d/%Y:%H:%M:%S .
<time-modifier>
Syntax: earliest=<time_modifier> | latest=<time_modifier>
Description: Specify start and end times using relative or absolute time. For more about the time modifier index,
see Specify time modifiers in your search in the Search Manual.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Field Description
host A default field that contains the host name or IP address of the network device that generated an event.
index The repository for data. When the Splunk platform indexes raw data, it transforms the data into searchable events.
source A default field that identifies the source of an event, that is, where the event originated.
_time The _time field contains an event's timestamp expressed in UNIX time.
Examples
Example 1:
Return metadata on the default index for events with "404" and from host "webserver1".
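A sketch of such a search, based on the example description, could look like this:

| metasearch 404 host="webserver1"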
See also
Commands
metadata
search
meventcollect
Description
Converts events generated by streaming search commands into metric data points and inserts the data into a metric
index on the indexers.
You can use the meventcollect command only if your role has the run_mcollect capability. See Define roles on the
Splunk platform with capabilities in Securing Splunk Enterprise.
Syntax
| meventcollect index=<string>
[ file=<string> ]
[ split=<bool> ]
[ spool=<bool> ]
[ prefix_field=<string> ]
[ host=<string> ]
[ source=<string> ]
[ sourcetype=<string> ]
[ <field-list> ]
Required arguments
index
Syntax: index=<string>
Description: Name of the metric index where the collected metric data is added.
field-list
Syntax: <field>, ...
Description: A list of dimension fields. Required if split=true. Optional if split=false. If unspecified (which
implies that split=false), meventcollect treats all fields as dimensions for the data point, except for the
metric_name, prefix_field, and all internal fields.
Default: No default value
Optional arguments
file
Syntax: file=<string>
Description: The file name where you want the collected metric data to be written. Only applicable when
spool=false. You can use a timestamp or a random number for the file name by specifying either
file=$timestamp$ or file=$random$.
Default: $random$_metrics.csv
split
Syntax: split=<bool>
Description: Determines how meventcollect identifies the measures in an event. See How to use the split
argument.
Default: false
spool
Syntax: spool=<bool>
Description: If set to true, meventcollect writes the metrics data file to the Splunk spool directory,
$SPLUNK_HOME/var/spool/splunk, where the file is indexed automatically. If set to false, meventcollect writes the
file to the $SPLUNK_HOME/var/run/splunk directory. The file remains in this directory unless further automation or
administration is done.
Default: true
prefix_field
Syntax: prefix_field=<string>
Description: Only applicable when split=true. If specified, meventcollect ignores any data point with that field
missing. Otherwise, meventcollect prefixes the field value to the metric name. See Set a prefix field.
Default: No default value
host
Syntax: host=<string>
Description: The name of the host that you want to specify for the collected metrics data. Only applicable when
spool=true.
Default: No default value
source
Syntax: source=<string>
Description: The name of the source that you want to specify for the collected metrics data.
Default: If the search is scheduled, the name of the search. If the search is ad hoc, the name of the file in the
var/spool/splunk directory that contains the search results.
sourcetype
Syntax: sourcetype=<string>
Description: The name of the source type that you want to specify for the collected metrics data.
Default: metrics_csv
Do not change this setting without assistance from Splunk Professional Services or Splunk Support. Changing
the source type requires a change to the props.conf file.
Usage
You use the meventcollect command to convert streaming events into metric data to be stored in a metric index on the
indexers. The metrics data uses a specific format for the metrics fields. See Metrics data format in Metrics.
Only streaming commands can precede the meventcollect command so that results can be ingested on the indexers. If
you would like to run a search that uses transforming commands to generate metric data points, use mcollect instead
of meventcollect.
The meventcollect command causes new data to be written to a metric index for every run of the search. In addition, if
you run an meventcollect search over large amounts of data, it potentially can overwhelm indexers and indexer clusters
that do not have a significant amount of capacity.
All metrics search commands are case sensitive. This means, for example, that meventcollect treats the following as
three distinct values of metric_name: cap.gear, CAP.GEAR, and Cap.Gear.
The Splunk platform cannot index metric data points that contain metric_name fields which are empty or composed
entirely of white spaces.
The split argument determines how meventcollect identifies the measurement fields in your search. It defaults to false.
When split=false, your search needs to explicitly identify its measurement fields. If necessary, it can use rename or eval
conversions to do this.
• If you have single-metric events, your meventcollect search must produce results with a metric_name field that
provides the name of the measure, and a _value field that provides the measure's numeric value.
• If you have multiple-metric events, your meventcollect search must produce results that follow this syntax:
metric_name:<metric_name>=<numeric_value>. Each of these fields will be treated as a measurement.
meventcollect treats the remaining fields as dimensions.
When you set split=true, you use field-list to identify the dimensions in your search. meventcollect converts any field
that is not in the field-list into a measurement. The only exceptions are internal fields beginning with an underscore
and the prefix_field, if you have set one.
When you set split=allnums, meventcollect treats all numeric fields as metric measures and all non-numeric fields as
dimensions. You can optionally use field-list to declare that meventcollect should treat certain numeric fields in the
events as dimensions.
Use the prefix_field argument to apply a prefix to the metric fields in your event data.
...split=true prefix_field=type...
Because you have set split=true, the Splunk software automatically converts the measurement fields in the events (in this
example, usage and idle) into measures, because they are not otherwise identified in a <field-list>. Then it applies the
value of the specified prefix_field as a prefix to the metric field names. In this case, because you have specified the type
field as the prefix field, its value, cpu, becomes the metric name prefix. The results look like this:
metric_name:cpu.usage metric_name:cpu.idle
0.78 0.22
Examples
The following example shows you how to collect metrics log data into a metric index called 'my_metric_index'.
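The original search string is not reproduced here. A sketch of a streaming search that feeds meventcollect could look like the following, where the source, field names, and dimension list are placeholders:

index=_internal source=*metrics.log group=per_sourcetype_thruput | fields series, kb, ev, kbps | meventcollect index=my_metric_index split=true series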
See also
Commands
collect
mcollect
mpreview
Description
Use mpreview to get an idea of the kinds of metric time series that are stored in your metrics indexes and to troubleshoot
your metrics data.
mpreview returns a preview of the raw metric data points in a specified metric index that match a provided filter. By
default, mpreview retrieves a target of five metric data points per metric time series from each metrics time-series index
file (.tsidx file) associated with the search. You can change this target amount with the target_per_timeseries
argument.
The mpreview command cannot search data that was indexed prior to your upgrade to the 8.0.x version of the Splunk
platform.
You can use the mpreview command only if your role has the run_msearch capability. See Define roles on the Splunk
platform with capabilities in Securing Splunk Enterprise.
Syntax
| mpreview
[filter=<string>]
[<index-opt>]...
[splunk_server=<wc-string>]
[splunk_server_group=<wc-string>]...
[earliest=<time-specifier>]
[latest=<time-specifier>]
[chunk_size=<unsigned-integer>]
[target_per_timeseries=<unsigned-integer>]
Required arguments
None.
Optional arguments
chunk_size
Syntax: chunk_size=<unsigned-integer>
Description: Advanced option. This argument controls how many metric time series are retrieved at a time from
a single time-series index file (.tsidx file) when the Splunk software processes searches. Lower this setting
from its default only when you find a particular mpreview search is using too much memory, or when it infrequently
returns events. This can happen when a search groups by excessively high-cardinality dimensions (dimensions
with very large amounts of distinct values). In such situations, a lower chunk_size value can make mpreview
searches more responsive, but potentially slower to complete. A higher chunk_size, on the other hand, can help
long-running searches to complete faster, with the potential tradeoff of causing the search to be less responsive.
For mpreview, chunk_size cannot be set lower than 10.
For more information about this setting, see Use chunk_size to regulate mpreview performance.
Default: 1000
For Splunk Enterprise: The default value for the chunk_size argument is set by the chunk_size setting for
the [msearch] stanza in limits.conf.
earliest
Syntax: earliest=<time-specifier>
Description: Specify the earliest _time for the time range of your search. You can specify an exact time
(earliest="11/5/2016:20:00:00") or a relative time (earliest=-h or earliest=@w0).
For more information about setting exact times see Date and time format variables. For more information about
setting relative times, see Time modifiers. Subsecond options are available only if you are searching over a
metrics index with millisecond timestamp resolution.
filter
Syntax: filter= "<string>"
Description: An arbitrary boolean expression over the dimension or metric_name.
index-opt
Syntax: index=<index-name> (index=<index-name>)...
Description: Limits the search to results from one or more indexes. You can use wildcard characters (*). To
match non-internal indexes, use index=*. To match internal indexes, use index=_*.
latest
Syntax: latest=<time-specifier>
Description: Specify the latest time for the _time range of your search. You can specify an exact time
(latest="11/12/2016:20:00:00") or a relative time (latest=-30m or latest=@w6).
For more information about setting exact times see Date and time format variables. For more information about
setting relative times, see Time modifiers. Subsecond options are available only if you are searching over a
metrics index with millisecond timestamp resolution.
splunk_server
Syntax: splunk_server=<wc-string>
Description: Specifies the distributed search peer from which to return results. If you are using Splunk
Enterprise, you can specify only one splunk_server argument. However, you can use a wildcard when you
specify the server name to indicate multiple servers. For example, you can specify splunk_server=peer01 or
splunk_server=peer*. Use local to refer to the search head.
splunk_server_group
Syntax: splunk_server_group=<wc-string>
Description: Limits the results to one or more server groups. If you are using Splunk Cloud Platform, omit this
parameter. You can specify a wildcard character in the string to indicate multiple server groups.
target_per_timeseries
Syntax: target_per_timeseries=<unsigned-integer>
Description: Determines the target number of metric data points to retrieve per metric time series from each
metrics time-series index file (.tsidx file) associated with the mpreview search. If a time series has fewer than
target_per_timeseries data points within a .tsidx file, the search head retrieves all of the data points for that
time series within that particular .tsidx file.
If you set target_per_timeseries to 0, it returns all data points available within the given time range for each time
series. Such a search is likely to be very large in scale and therefore very slow to complete. If you must search on
a large number of metric data points, use mstats instead.
For more information about this setting, see How the target_per_timeseries argument works.
Default: 5
The default value for the target_per_timeseries argument is set by the target_per_timeseries setting for the
[msearch] stanza in limits.conf.
Usage
This search command generates a list of individual metric data points from a specified metric index that match a provided
filter. The filter can be any arbitrary boolean expression over the dimensions or the metric_name. Specify earliest and
latest to override the time range picker settings.
For more information about setting earliest and latest, see Time modifiers.
The mpreview command is designed to display individual metric data points in JSON format. If you want to aggregate
metric data points, use the mstats command.
All metrics search commands are case sensitive. This means, for example, that mpreview treats the following as
three distinct values of metric_name: cap.gear, CAP.GEAR, and Cap.Gear.
Unfiltered mpreview searches can cover extremely large numbers of raw metric data points. In some cases the sheer
number of data points covered by the search can cause such searches to be slow or unresponsive.
The target_per_timeseries argument makes the mpreview command more responsive while giving you a relatively broad
preview of your metric data. It limits the number of metric data points that mpreview can return from each metric time
series in each .tsidx file covered by the search.
For example, say you have 10 metrics .tsidx files that each contain 100 metric time series, and each time series has at
least 5 data points. If you set target_per_timeseries=5 in the search, you should expect a maximum of 10 x 100 x 5 = 5000
metric data points to be returned by the search.
On the other hand, say you have 10 metrics .tsidx files that each contain 100 metric time series, but in this case, 50 of
those time series have 3 data points and the other 50 have at least 5 data points. If you set
target_per_timeseries=5 in the search, you should expect to get 10 x ((50 x 3) + (50 x 5)) = 4000 data points.
The target_per_timeseries argument is especially useful when the number of metric data points covered by your
mpreview search is significantly larger than the number of metric time series covered by the search. It's not particularly
helpful if the number of data points in your search is only slightly larger than or equal to the number of metric time series
in the search.
You can run this search to determine the number of metric data points that could potentially be covered by an mpreview
search:
| metadata index=<metric_index_name> type=hosts datatype=metric | fields totalCount
You can run this search to determine the number of metric time series that could potentially be covered by an mpreview
search:
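The original search for this step is not shown here. One possible approach, based on the _timeseries grouping that mstats supports, is the following sketch, where the index name is a placeholder:

| mstats count(*) WHERE index=my_metric_index BY _timeseries | stats count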
If you find that mpreview is slow or unresponsive despite the target_per_timeseries argument you can also use
chunk_size to regulate mpreview behavior. Reduce the chunk_size to make the search more responsive with the potential
tradeoff of making the search slower to complete. Raise the chunk_size to help the mpreview search to complete faster,
with the potential tradeoff of making it less responsive.
Examples
This search returns individual data points from the _metrics index that match a specific filter.
| mpreview index=_metrics
The following search lowers chunk_size so that it returns 100 metric time series worth of metric data points in batches
from tsidx files that belong to the _metrics index. Ordinarily it would return 1000 metric time series in batches.
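That search could look like the following sketch:

| mpreview index=_metrics chunk_size=100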
The following search uses target_per_timeseries to return a maximum of five metric data points per time series in each
tsidx file searched in the _metrics index.
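A sketch of that search could be:

| mpreview index=_metrics target_per_timeseries=5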
See also
Commands
mcatalog
mcollect
mstats
msearch
The msearch command is an alias for the mpreview command. See the mpreview command for the syntax and examples.
See also
Commands
mcatalog
mstats
mstats
Description
Use the mstats command to analyze metrics. This command performs statistics on the measurement, metric_name, and
dimension fields in metric indexes. You can use mstats in historical searches and real-time searches. When you use
mstats in a real-time search with a time window, a historical search runs first to backfill the data.
The mstats command provides the best search performance when you use it to search a single metric_name value or a
small number of metric_name values.
Syntax
| mstats
[chart=<bool>]
[<chart-options>]
[prestats=<bool>]
[append=<bool>]
[backfill=<bool>]
[update_period=<integer>]
[fillnull_value=<string>]
[chunk_size=<unsigned int>]
<stats-metric-term>...
WHERE [<logical-expression>]...
[ (BY|GROUPBY) <field-list> ]
[<span-length>]
Required arguments
<stats-metric-term>
Syntax: <stats-func> | <stats-func-value>
Description: Provides two options for performing statistical calculations on metrics. Use <stats-func> to perform
statistical calculations on one or more metrics that you name in the argument. Use <stats-func-value> for cases
where a wildcard can be used to represent several metrics. You cannot blend the <stats-func> syntax and the
<stats-func-value> syntax in a single mstats search.
Use the <stats-func> syntax for most cases. You only need to use the <stats-func-value> syntax in cases
where a single metric may be represented by several different metric names, such as cpu.util and
cpu.utilization. In these cases you can apply a wildcard to catch all of the permutations of the metric_name.
See Stats metric term options for details on the <stats-func> and <stats-func-value> syntax options.
Optional arguments
append
Syntax: append=<bool>
Description: Valid only when prestats=true. This argument runs the mstats command and adds the results to
an existing set of results instead of generating new results.
Default: false
backfill
Syntax: backfill=<bool>
Description: Valid only with real-time searches that have a time window. When backfill=true, the mstats
command runs a search on historical data to backfill events before searching the in-memory real-time data.
Default: true
chart
Syntax: chart=<bool>
Description: When set to chart=t, the mstats data output has a format suitable for charting. The mstats charting
mode is valid only when prestats=f.
When a span is provided, the mstats chart mode format resembles that of the timechart command, and can
support at most one group-by field, which is used as the series splitting field.
When no span is provided, the chart mode follows a format similar to that of the chart or timechart commands.
Without a span, the mstats chart mode requires one or two grouping fields. The first grouping field represents the
chart x-axis. The second grouping field represents the y-axis and is a series split field.
Default: chart=f
<chart-options>
Syntax: chart.limit | chart.agg | chart.usenull | chart.useother | chart.nullstr | chart.otherstr
Description: Options that you can specify to refine the result. See the Chart options section in this topic.
chunk_size
Syntax: chunk_size=<unsigned_int>
Description: Advanced option. This argument controls how many metric time series are retrieved at a time from
a single time-series index file (.tsidx file) when the Splunk software processes searches. Lower this setting
from its default only when you find a particular mstats search is using too much memory, or when it infrequently
returns events. This can happen when a search groups by excessively high-cardinality dimensions (dimensions
with very large amounts of distinct values). In such situations, a lower chunk_size value can make mstats
searches more responsive, but potentially slower to complete. A higher chunk_size, on the other hand, can help
long-running searches to complete faster, with the potential tradeoff of causing the search to be less responsive.
For mstats, chunk_size cannot be set lower than 10000.
Default: 10000000 (10 million)
The default value for the chunk_size argument is set by the chunk_size setting for the [mstats] stanza in
limits.conf.
fillnull_value
Description: This argument sets a user-specified value that the mstats command substitutes for null values for
any field within its group-by field list. Null values include field values that are missing from a subset of the returned
events as well as field values that are missing from all of the returned events. If you do not provide a
fillnull_value argument, mstats omits rows for events with one or more null field values from its results.
Default: empty string
<field-list>
Syntax: <field>, ...
Description: Specifies one or more fields to group the results by. Required when using the BY clause.
<logical-expression>
Syntax: <time-opts>|<search-modifier>|((NOT)?
<logical-expression>)|<search-modifier>|<comparison-expression>|(<logical-expression> (OR)?
<logical-expression>)
Description: An expression describing the filters that are applied to your search. Includes time and search
modifiers, and comparison expressions. See the following sections for descriptions of each of these logical
expression components.
Cannot filter on metric_name. Does not support CASE or TERM directives. You also cannot use the WHERE
clause to search for terms or phrases.
prestats
Syntax: prestats=true | false
Description: Specifies whether to use the prestats format. The prestats format is a Splunk internal format that is
designed to be consumed by commands that generate aggregate calculations. When you use the prestats format,
you can pipe the data into the chart, stats, or timechart commands, which are designed to accept the prestats
format. When prestats is set to true, instructions with the AS clause are not relevant. The field names for the
aggregates are determined by the command that consumes the prestats format and produces the aggregate
output.
Default: false
<span-length>
Syntax: span=<int><timescale> [every=<int><timescale>]
Description: The span of each time bin. If used with a <timescale>, the <span-length> is treated as a time
range. If not, this is an absolute bucket length. If you do not specify a <span-length>, the default is auto, which
means that the number of time buckets adjusts to produce a reasonable number of results. For example, if
seconds are used initially for the <timescale> and too many results are returned, the <timescale> is changed to a
longer value, such as minutes, to return fewer time buckets.
To improve the performance of mstats searches you can optionally use the every argument in conjunction with
span to cause the search to reduce the amount of data it samples per span. In other words you could design a
search where the search head samples a span of only ten minutes of data for every hour covered by the search.
See Span length options.
update_period
Syntax: update_period=<integer>
Description: Valid only with real-time searches. Specifies how frequently, in milliseconds, the real-time summary
for the mstats command is updated. A larger number means less frequent updates to the summary and less
impact on index processing.
Default: 1000 (1 second)
<stats-func>
Syntax: <stats-func> | <mstats-specific-func> "("<metric_name>")" [AS <string>]...
Description: Perform statistical calculations on one or more metric_name fields. You can rename the result of
each function using the AS clause, unless prestats is set to true. The metric_name must be enclosed in
parentheses.
When you use the <stats-func> syntax, the WHERE clause cannot filter on metric_name.
<mstats-specific-func>
Syntax: rate_avg | rate_sum
Description: Two functions that are specific to mstats. rate_avg computes the per metric time series rates for
an accumulating counter metric and then returns the average of those rates. rate_sum does the same thing as
rate_avg except that it returns the sum of the rates. For more about counter metrics and these functions see
Investigate counter metrics in Metrics.
<stats-func-value>
Syntax: count(_value) | <function>(_value) [AS <string>] WHERE metric_name=<metric_name>
Description: Specify a basic count of the _value field or a function on the _value field. The _value field uses a
specific format to store the numeric value of the metric. You can specify one or more functions. You can rename
the result of the function using AS unless prestats=true.
When you use the <stats-func-value> syntax, the WHERE clause must filter on the metric_name. Wildcards are
okay.
The stats-func-value syntax does not support real-time searches. If you must run a real-time search, use the
stats-func syntax instead.
The following table lists the supported functions for the mstats command by type of function. Use the links in the
table to see descriptions and examples for each function.
Type of function       Supported functions and syntax
Aggregate functions    count(), max(), median(), min(), range(), stdev(), stdevp(), sumsq(), upperperc<num>(), var(), varp(), rate(), rate_sum()
Time functions         earliest_time(), latest()
For an overview of using functions with commands, see Statistical and charting functions.
Chart options
chart.limit
Syntax: chart.limit=(top | bottom)<int>
Description: Only valid when a column-split is specified. Use the chart.limit option to specify the number of
results that should appear in the output. When you set chart.limit=N the top or bottom N values are retained,
based on the sum of each series and the prefix you have selected. If chart.limit=0, all results are returned. If
you opt not to provide a top or bottom prefix before the chart.limit value, the Splunk software provides the top N
results. For example, if you set chart.limit=10 the Splunk software defaults to providing the top 10 results.
This argument is identical to the limit argument of the chart and timechart commands.
Default: top10
chart.agg
Syntax: chart.agg=( <stats-func> ( <evaled-field> | <wc-field> ) [AS <wc-field>] )
Description: A statistical aggregation function. See the table of supported functions in Stats metric term options.
The function can be applied to an eval expression, or to a field or set of fields. Use the AS clause to place the
result into a new field with a name that you specify. You can use wild card characters in field names. This
argument is identical to the agg argument of the chart and timechart commands.
Default: sum
chart.nullstr
Syntax: chart.nullstr=<string>
Description: If chart.usenull is true, this series is labeled by the value of the chart.nullstr option, and defaults
to NULL. This argument is identical to the nullstr argument of the chart and timechart commands.
chart.otherstr
Syntax: chart.otherstr=<string>
Description: If chart.useother is true, this series is labeled by the value of the chart.otherstr option, and
defaults to OTHER. This argument is identical to the otherstr argument of the chart and timechart commands.
chart.usenull
Syntax: chart.usenull=<bool>
Description: Determines whether a series is created for events that do not contain the split-by field. This
argument is identical to the usenull argument of the chart and timechart commands.
chart.useother
Syntax: chart.useother=<bool>
Description: Specifies whether a series should be added for data series not included in the graph because they
did not meet the criteria of the WHERE clause. This argument is identical to the useother argument of the chart
and timechart commands.
Logical expression options
<comparison-expression>
Syntax: <field><comparison-operator><value> | <field> IN (<value-list>)
Description: Compares a field to a literal value or provides a list of values that can appear in the field.
<search-modifier>
Syntax: <sourcetype-specifier> | <host-specifier> | <source-specifier> | <splunk_server-specifier>
Description: Search for events from specified fields. For example, search for one or a combination of hosts,
sources, and source types. See searching with default fields in the Knowledge Manager manual.
<time-opts>
Syntax: [<timeformat>] (<time-modifier>)*
Description: Describes the format of the <starttime> and <endtime> terms of the search.
<comparison-operator>
Syntax: = | != | < | <= | > | >=
Description: You can use comparison operators when searching field-value pairs. Comparison expressions with
the equal ( = ) or not equal ( != ) operator compare string values. For example, "1" does not match "1.0".
Comparison expressions with greater than or less than operators < > <= >= numerically compare two numbers
and lexicographically compare other values. See Usage.
<field>
Syntax: <string>
Description: The name of a field.
<value>
Syntax: <literal-value>
Description: In comparison expressions, this is the literal number or string value of a field.
<value-list>
Syntax: (<literal-value>, <literal-value>, ...)
Description: Used with the IN operator to specify two or more values. For example use error IN (400, 402,
404, 406) instead of error=400 OR error=402 OR error=404 OR error=406
<sourcetype-specifier>
Syntax: sourcetype=<string>
Description: Search for events from the specified sourcetype field.
<host-specifier>
Syntax: host=<string>
Description: Search for events from the specified host field.
<source-specifier>
Syntax: source=<string>
Description: Search for events from the specified source field.
<splunk_server-specifier>
Syntax: splunk_server=<string>
Description: Search for events from a specific server. Use "local" to refer to the search head.
every
Syntax: every=<int><timescale>
Description: Use in conjunction with span to search data in discrete time intervals over the full timespan of a
search. The every argument is valid only when span is set to a valid value other than auto. Set the every
timespan to a value that is greater than the span timespan.
This method of "downsampling" the search data improves search performance at the expense of data granularity.
For example, this search returns an average of the active_logins measurement for the first ten seconds of every
twenty seconds covered by the time range of the search: | mstats avg(active_logins) span=10s every=20s
Month intervals for every are exactly 30 days long. Year intervals for every are exactly 365 days long.
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <month> | <subseconds>
Description: Time scale units.
Default: sec
mstats only supports subsecond timescales such as ms when it is searching metric indexes that are configured
for millisecond timestamp resolution.
For more information about enabling metrics indexes to index metric data points with millisecond timestamp
precision, see:
• Manage Splunk Cloud Platform indexes in the Splunk Cloud Platform Admin Manual if you use Splunk Cloud
Platform.
• Create custom indexes in Managing indexers and clusters of indexers if you use Splunk Enterprise.
Time options
<timeformat>
Syntax: timeformat=<string>
Description: Set the time format for starttime and endtime terms.
Default: timeformat=%m/%d/%Y:%H:%M:%S.
For more about setting exact times with the available timeformat options, see Date and time format variables.
Subsecond options are only available if you are searching over a metrics index with millisecond timestamp
resolution.
<time-modifier>
Syntax: starttime=<string> | endtime=<string> | earliest=<time_modifier> | latest=<time_modifier>
Description: Specify start and end times using relative or absolute time.
You can also use the earliest and latest arguments to specify absolute and relative time ranges for your search.
For more about the relative <time_modifier> syntax, see Time modifiers.
For more information about setting absolute time ranges see Date and time format variables. Subsecond options
are only available if you are searching over a metrics index with millisecond timestamp resolution.
starttime
Syntax: starttime=<string>
Description: Events must be later than or equal to this time. The starttime must match the timeformat.
endtime
Syntax: endtime=<string>
Description: All events must be earlier than or equal to this time.
Usage
The mstats command is a report-generating command, except when append=true. See Command types.
Generating commands use a leading pipe character and should be the first command in a search, except when
append=true is specified with the command.
Use the mstats command to search metrics data. The metrics data uses a specific format for the metrics fields. See
Metrics data format in Metrics.
All metrics search commands are case sensitive. This means, for example, that mstats treats the following as three
distinct values of metric_name: cap.gear, CAP.GEAR, and Cap.Gear.
mstats searches cannot return results for metric data points with metric_name fields that are empty or which contain blank
spaces.
The mstats command does not support subsearches. You can use the append argument to add the results of an mstats
search to the results of a preceding mstats search. See the topic on the tstats command for an append usage example.
Aggregations
If you are using the <stats-func> syntax, numeric aggregations are only allowed on specific values of the metric_name
field. The metric name must be enclosed in parentheses. If there is no data for the specified metric_name in parentheses,
the search is still valid.
If you are using the <stats-func-value> syntax, numeric aggregations are only allowed on the _value field.
Aggregations are not allowed for values of any other field, including the _time field.
When prestats = true and you run an mstats search that uses the c and count aggregation functions without an
aggregation field, the Splunk software processes them as if they are actually count(_value). In addition, any statistical
functions that follow in the search string must reference the _value field. For example: | mstats count | timechart
count(_value)
Wildcard characters
The mstats command supports wildcard characters in any search filter, with the following exceptions:
• Real-time mstats searches cannot utilize wildcarded metric aggregations when you use the <stats-func> syntax.
For example, this search is invalid when you set it up as a real-time search:
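A hypothetical search along these lines, which wildcards the metric name inside the aggregation, illustrates the case; the metric and index names here are placeholders:

| mstats avg(cpu.*) WHERE index=my_metric_index span=30s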
WHERE clause
If you are using the <stats-func> syntax, the WHERE clause cannot filter by metric_name. Filtering by metric_name is
performed based on the metric_name fields specified with the <stats-func> argument.
If you are using the <stats-func-value> syntax, the WHERE clause must filter by metric_name.
If you do not specify an index name in the WHERE clause, the mstats command returns results from the default metrics
indexes associated with your role. If you do not specify an index name and you have no default metrics indexes
associated with your role, mstats returns no results. To search against all metrics indexes use WHERE index=*.
The WHERE clause must come before the BY or GROUPBY clause, if they are both used in conjunction with mstats.
For more information about defining default metrics indexes for a role, see Add and edit roles with Splunk Web in
Securing Splunk Enterprise.
You can also group by time. You must specify a timespan using the <span-length> argument to group by time buckets.
For example, span=1hr or span=auto. The <span-length> argument is separate from the BY clause and can be placed at
any point in the search between clauses.
You can group results by metric time series. A metric time series is a set of metric data points that share the same
metrics and the same dimension field-value pairs. Grouping by metric time series ensures that you are not mixing up data
points from different metric data sources when you perform statistical calculations on them.
Use BY _timeseries to group by metric time series. The _timeseries field is internal and won't display in your results. If
you want to display the _timeseries values in your search, add | rename _timeseries AS timeseries to the search.
For a detailed overview of the _timeseries field with examples, see Perform statistical calculations on metric time series
in Metrics.
Time dimensions
The mstats command does not recognize the following time-related dimensions.
Unsupported dimensions
date_hour, date_mday, date_minute, date_month, date_second, date_wday, date_year, date_zone,
metric_timestamp, time, timeendpos, timestamp, timestartpos
You can only use subsecond span timescales, meaning time spans that are made up of deciseconds (ds), centiseconds (cs),
milliseconds (ms), or microseconds (us), for mstats searches over metrics indexes that have been configured to have
millisecond timestamp resolution.
Subsecond span timescales should be numbers that divide evenly into a second. For example, 1s = 1000ms. This means
that valid millisecond span values are 1, 2, 4, 5, 8, 10, 20, 25, 40, 50, 100, 125, 200, 250, or 500ms. In addition, span =
1000ms is not allowed. Use span = 1s instead.
• For Splunk Cloud Platform: See Manage Splunk Cloud Platform indexes in the Splunk Cloud Platform Admin
Manual.
• For Splunk Enterprise: See Create custom indexes in Managing indexes and clusters of indexes.
If you run an mstats search over multiple metrics indexes with varying levels of timestamp resolution, the results of the
search may contain results with timestamps of different resolutions.
For example, say you have two metrics indexes. Your "metrics-second" metrics index has a second timestamp resolution.
Your "metrics-ms" metrics index has a millisecond timestamp resolution. You run the following search over both indexes:
| mstats count(*) WHERE index=metric* span=100ms

_time             count(cpu.nice)
1549496110        48
1549496110.100    2
The 1549496110 row counts results from both indexes. The count from "metrics-ms" includes only metric data points with
timestamps from 1549496110.000 to 1549496110.099. The "metrics-ms" metric data points with timestamps from
1549496110.100 to 1549496110.199 appear in the 1549496110.100 row.
Meanwhile, the metric data points in the "metrics-second" index do not have millisecond timestamp precision. The
1549496110 row only counts those "metrics-second" metric data points with the 1549496110 timestamp, and no metric data
points from "metrics-second" are counted in the 1549496110.100 row.
Splunk software regulates mstats search jobs that use span or a similar method to group results by time. When Splunk
software processes these jobs, it limits the number of "time bins" that can be allocated within a single .tsidx file.
For metrics indexes with second timestamp resolution, this only affects searches with large time ranges and very small
time spans, such as a search over a year with span = 1s. If you are searching on a metrics index with millisecond
timestamp resolution, you might encounter this limit over shorter ranges, such as a search over an hour with span = 1ms.
This limit is set by time_bin_limit in limits.conf, which is set to 1 million bins by default. If you need to run these kinds
of mstats search jobs, lower this value if they are using too much memory per search. Raise this value if these kinds of
search jobs are returning errors.
The Splunk platform estimates the number of time bins that a search requires by dividing the search time range by its
group-by span. If this produces a number that is larger than the time_bin_limit, the Splunk platform returns an error.
The search time range is determined by the earliest and latest values of the search. Some kinds of searches, such
as all-time searches, do not have earliest and latest. In such cases the Splunk platform checks within each single
TSIDX file to derive a time range for the search.
Metrics indexes have second timestamp resolution by default. You can give a metrics index a millisecond timestamp
resolution when you create it, or you can edit an existing metrics index to switch it to millisecond timestamp resolution.
If you use Splunk Cloud, see Manage Splunk Cloud Platform indexes in the Splunk Cloud Platform Admin Manual. If you
use Splunk Enterprise, see Create custom indexes in Managing indexes and clusters of indexes.
A pair of limits.conf settings strike a balance between the performance of mstats searches and the amount of memory
they use during the search process, in RAM and on disk. If your mstats searches are consistently slow to complete you
can adjust these settings to improve their performance, but at the cost of increased search-time memory usage, which
can lead to search failures.
If you use Splunk Cloud Platform, you will need to file a Support ticket to change these settings.
For more information, see Memory and stats search performance in the Search Manual.
Lexicographical order
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
You can specify a custom sort order that overrides the lexicographical order. See the blog Order Up! Custom Sort Orders.
Examples
Return the average value of the aws.ec2.CPUUtilization metric in the mymetricdata metric index. Bucket the results into
30 second time spans.
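A sketch of such a search, built from the metric and index names in the example description, could look like this:

| mstats avg(aws.ec2.CPUUtilization) WHERE index=mymetricdata span=30s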
Return the average value of both the aws.ec2.CPUUtilization metric and the os.cpu.utilization metric. Group the
results by host and bucket the results into 1 minute time spans. Both metrics are combined and considered a single metric
series.
3. Use chart=t mode to chart metric event counts by the top ten hosts
Return a chart of the number of aws.ec2.CPUUtilization metric data points for each day, split by the top ten hosts.
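The original search string is not shown here; a sketch that matches the description, with the index name as a placeholder, could look like the following:

| mstats chart=t chart.limit=top10 count(aws.ec2.CPUUtilization) WHERE index=mymetricdata BY host span=1d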
4. Filter the results on a dimension value and split by the values of another dimension
Return the average value of the aws.ec2.CPUUtilization metric for all measurements with host=foo and split the results
by the values of the app dimension.
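A sketch matching this description, with the index name as a placeholder, could be:

| mstats avg(aws.ec2.CPUUtilization) WHERE index=mymetricdata AND host=foo BY app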
Return the average and maximum of the resident set size and virtual memory size. Group the results by metric_name and
bucket them into 1 minute spans
6. Aggregate a metric across all of your default metrics indexes, using downsampling to speed up the search
Find the median of the aws.ec2.CPUUtilization metric. Do not include an index filter to search for measurements in all of
the default metrics indexes associated with your role. Speed up the search by using every to compute the median for one
minute of every five minutes covered by the search.
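A sketch matching this description could be:

| mstats median(aws.ec2.CPUUtilization) span=1m every=5m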
7. Get the rate of an accumulating counter metric and group the results by time series
See Perform statistical calculations on metric time series in Metrics for more information.
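The original search is not reproduced here; a sketch using a hypothetical counter metric name and a placeholder index could look like the following:

| mstats rate(my.counter.metric) WHERE index=mymetricdata span=1m BY _timeseries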
8. Stats-func-value example
Use the <stats-func-value> syntax to get a count of all of the measurements for the aws.ec2.CPUUtilization metric in
the mymetricdata index.
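A sketch of such a search, following the <stats-func-value> syntax described earlier in this topic, could be:

| mstats count(_value) WHERE metric_name=aws.ec2.CPUUtilization AND index=mymetricdata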
See also
Related information
Overview of metrics in Metrics
multikv
Description
Extracts field-values from table-formatted search results, such as the results of the top or tstats commands. The multikv
command creates a new event for each table row and assigns field names from the title row of the table.
Name Age Occupation
Josh 42 SoftwareEngineer
Francine 35 CEO
Samantha 22 ProjectManager
The multikv command can transform this table from one event into three events with the relevant fields. It works best
with fixed-alignment tables, though it can sometimes handle merely ordered fields.
The general strategy is to identify a header, offsets, and field counts, and then determine which components of
subsequent lines should be included into those field names. Multiple tables in a single event can be handled (if
multitable=true), but might require ensuring that the secondary tables have capitalized or ALLCAPS names in a header
row.
Auto-detection of header rows favors rows that are text, and are ALLCAPS or Capitalized.
For Splunk Cloud Platform, you must create a private app to extract field-value pairs from table-formatted search
results. If you are a Splunk Cloud administrator with experience creating private apps, see Manage private apps in your
Splunk Cloud deployment in the Splunk Cloud Admin Manual. If you have not created private apps, contact your Splunk
account representative for help with this customization.
Syntax
multikv [conf=<stanza_name>] [<multikv-option>...]
Optional arguments
conf
Syntax: conf=<stanza_name>
Description: If you have a field extraction defined in multikv.conf, use this argument to reference the stanza in
your search. For more information, refer to the configuration file reference for multikv.conf in the Admin Manual.
<multikv-option>
Syntax: copyattrs=<bool> | fields <field-list> | filter <term-list> | forceheader=<int> | multitable=<bool> |
noheader=<bool> | rmorig=<bool>
Description: Options for extracting fields from tabular events.
copyattrs
Syntax: copyattrs=<bool>
Description: When true, multikv copies all fields from the original event to the events generated from that event.
When false, no fields are copied from the original event. This means that the events will have no _time field and
the UI will not know how to display them.
Default: true
fields
Syntax: fields <field-list>
Description: Limit the fields set by the multikv extraction to this list. Ignores any fields in the table which are not
on this list.
filter
Syntax: filter <term-list>
Description: If specified, multikv skips over table rows that do not contain at least one of the strings in the filter
list. Quoted expressions are permitted, such as "multiple words" or "trailing_space ".
forceheader
Syntax: forceheader=<int>
Description: Forces the use of the given line number (1 based) as the table's header. Does not include empty
lines in the count.
Default: The multikv command attempts to determine the header line automatically.
multitable
Syntax: multitable=<bool>
Description: Controls whether or not there can be multiple tables in a single _raw in the original events.
Default: true
noheader
Syntax: noheader=<bool>
Description: Handle a table without header row identification. The size of the table will be inferred from the first
row, and fields will be named Column_1, Column_2, ... noheader=true implies multitable=false.
Default: false
rmorig
Syntax: rmorig=<bool>
Description: When true, the original events will not be included in the output results. When false, the original
events are retained in the output results, with each original emitted after the batch of generated results from that
original.
Default: true
Usage
Examples
Example 1: Extract the "COMMAND" field when it occurs in rows that contain "splunkd".
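The search string is not reproduced here; based on the fields and filter options described above, a sketch could look like this:

... | multikv fields COMMAND filter splunkd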
See also
multisearch
Description
The multisearch command is a generating command that runs multiple streaming searches at the same time. This
command requires at least two subsearches and allows only streaming operations in each subsearch. Examples of
streaming searches include searches with the following commands: search, eval, where, fields, and rex. For more
information, see Types of commands in the Search Manual.
Syntax
Required arguments
<subsearch>
Syntax: "["search <logical-expression>"]"
Description: At least two streaming searches must be specified. See the search command for detailed
information about the valid arguments for <logical-expression>.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
With the multisearch command, the events from each subsearch are interleaved. Therefore the multisearch command is
not restricted by the subsearch limitations.
Unlike the append command, the multisearch command does not run the subsearch to completion first. The following
subsearch example with the append command is not the same as using the multisearch command.
index=a | eval type = "foo" | append [search index=b | eval mytype = "bar"]
Examples
Example 1:
Search for events from both index a and b. Use the eval command to add different fields to each set of results.
| multisearch [search index=a | eval type = "foo"] [search index=b | eval mytype = "bar"]
See also
append, join
mvcombine
Description
Takes a group of events that are identical except for the specified field, which contains a single value, and combines
those events into a single event. The specified field becomes a multivalue field that contains all of the single values from
the combined events.
Syntax
Required arguments
field
Syntax: <field>
Description: The name of a field to merge on, generating a multivalue field.
Optional arguments
delim
Syntax: delim=<string>
Description: Defines the string to use as the delimiter for the values that get combined into the multivalue field.
For example, if the values of your field are "1", "2", and "3", and delim is "; " then the combined multivalue field is
"1";"2";"3".
Default: a single space, (" ")
To see the output of the delim argument, you must use the nomv command immediately after the mvcombine
command. See Usage.
Usage
You can use evaluation functions and statistical functions on multivalue fields or to return multivalue fields.
The mvcombine command accepts a set of input results and finds groups of results where all field values are identical,
except the specified field. All of these results are merged into a single result, where the specified field is now a multivalue
field.
Because raw events have many fields that vary, this command is most typically useful after paring down the set of
available fields with the fields command. The command is also useful for manipulating the results of certain reporting
commands.
418
Specifying delimiters
The mvcombine command creates a multivalue version of the field you specify, as well as a single value version of the
field. The multivalue version is displayed by default.
The single value version of the field is a flat string that is separated by a space or by the delimiter that you specify with the
delim argument.
By default the multivalue version of the field is displayed in the results. To display the single value version with the
delimiters, add the | nomv command to the end of your search. For example ...| mvcombine delim= "," host | nomv
host.
Some modes of search result investigation prefer this single value representation, such as exporting to CSV in the UI, or
running a command line search with splunk search "..." -output csv. Some commands that are not multivalue aware
might use this single value as well.
Most ways of accessing the search results prefer the multivalue representation, such as viewing the results in the UI,
exporting to JSON, requesting JSON from a command line search with splunk search "..." -output json, or requesting
JSON or XML from the REST API. For these forms of output, the selected delim has no effect.
If your primary goal is to convert a multivalue field into a single-value field, mvcombine is probably not your best option.
mvcombine is mainly meant for the creation of new multivalue fields. Instead, try either the nomv command or the mvjoin
eval function.
mvjoin eval function (see Multivalue eval functions): Use when you want to perform multivalue field to single-value
field conversion where the former multivalues are separated by a delimiter that you supply. For example, you start
with a multivalue field that contains the values 1, 2, 3, 4, 5. You can use mvjoin to transform your multivalue field
into a single-value field with OR as the delimiter. The new single value of the field is 1 OR 2 OR 3 OR 4 OR 5.
Examples
This example uses the sample dataset from the Search Tutorial. To try this example yourself, download the data set from Get the tutorial data
into Splunk and follow the instructions in the Search Tutorial to upload the data.
To understand how mvcombine works, let's explore the data.
The results show that the max and min fields have duplicate entries for the hosts that start with www. The other
hosts show no results for the max and min fields.
419
3. To remove the other hosts from your results, modify the search to add host=www* to the search criteria.
Because the max and min columns contain the exact same values, you can use the mvcombine command to
combine the host values into a multivalue result.
4. Add | mvcombine host to your search and run the search again.
index=* host=www* | stats max(bytes) AS max, min(bytes) AS min BY host | mvcombine host
Instead of three rows, one row is returned. The host field is now a multivalue field.
As mentioned in the Usage section, by default the delimited version of the results is not returned in the output. To return
the results with the delimiters, you must return the single value string version of the field.
index=* host=www* | stats max(bytes) AS max, min(bytes) AS min BY host | mvcombine delim="," host | nomv
host
The search results that are returned are shown in the following table.
420
Example 3:
In multivalue events:
Example 4:
See also
Commands:
makemv
mvexpand
nomv
Functions:
Multivalue eval functions
Multivalue stats and chart functions
split
mvexpand
Description
Expands the values of a multivalue field into separate events, one event for each value in the multivalue field. For each
result with the specified field, the mvexpand command creates a new result for every value in that multivalue field.
Syntax
Required arguments
field
Syntax: <field>
Description: The name of a multivalue field.
421
Optional arguments
limit
Syntax: limit=<int>
Description: Specify the number of values of <field> to use for each input event.
Default: 0, or no limit
Usage
You can use evaluation functions and statistical functions on multivalue fields or to return multivalue fields.
Limits
A limit exists on the amount of RAM that the mvexpand command is permitted to use while expanding a batch of results. By
default the limit is 500MB. The input chunk of results is typically maxresultrows or smaller in size, and the expansion of all
these results resides in memory at one time. The total necessary memory is the average result size multiplied by the
number of results in the chunk multiplied by the average size of the multivalue field being expanded.
If this attempt exceeds the configured maximum on any chunk, the chunk is truncated and a warning message is emitted.
If you have Splunk Enterprise, you can adjust the limit by editing the max_mem_usage_mb setting in the limits.conf file.
Prerequisites
• Only users with file system access, such as system administrators, can increase the maxresultrows and
max_mem_usage_mb settings using configuration files.
• Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
• You can have configuration files with the same name in your default, local, and app directories. Read Where you
can place (or find) your modified configuration files in the Splunk Enterprise Admin Manual.
Never change or copy the configuration files in the default directory. The files in the default directory must remain intact
and in their original location. Make changes to the files in the local directory.
If you use Splunk Cloud Platform and encounter problems because of this limit, file a Support ticket.
Examples
Example 1:
Example 2:
Create new events for the first 100 values of multivalue field, "foo".
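A sketch of such a search, assuming the multivalue field is literally named foo:
... | mvexpand foo limit=100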
422
Example 3:
The mvexpand command only works on one multivalue field. This example walks through how to expand an event with
more than one multivalue field into individual events for each field value. For example, given these events, with
sourcetype=data:
First, use the rex command to extract the field values for a and b. Then use the eval command and mvzip function to
create a new field from the values of a and b.
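One way this step might be written, assuming the raw events contain values such as a=22 and b=21 with up to five
values of each field per event (the regular expressions and the max_match value are illustrative):
sourcetype=data | rex field=_raw "a=(?<a>\d+)" max_match=5 | rex field=_raw "b=(?<b>\d+)" max_match=5 | eval fields=mvzip(a,b) | table _time fields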
The results appear on the Statistics tab and look something like this:
_time                  fields
2018-04-01 00:11:23    22,21
                       23,32
                       51,24
2018-04-01 00:11:22    1,2
                       2,3
                       5,2
Use the mvexpand command and the rex command on the new field, fields, to create new events and extract the alpha
and beta values:
Use the table command to display only the _time, alpha, and beta fields in a results table.
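A sketch of this step, continuing the previous search (the field names alpha and beta come from the extraction that
follows):
... | mvexpand fields | rex field=fields "(?<alpha>\d+),(?<beta>\d+)" | table _time alpha beta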
The results appear on the Statistics tab and look something like this:
_time                  alpha    beta
2018-04-01 00:11:23    51       24
2018-04-01 00:11:22    1        2
2018-04-01 00:11:22    2        3
2018-04-01 00:11:22    5        2
(Thanks to Splunk user Duncan for this example.)
423
See also
Commands:
makemv
mvcombine
nomv
Functions:
Multivalue eval functions
Multivalue stats and chart functions
split
nomv
Description
Converts values of the specified multivalue field into one single value. Separates the values with a newline character ("\n") as the delimiter.
Overrides the configurations for the multivalue field that are set in the fields.conf file.
Syntax
nomv <field>
Required arguments
field
Syntax: <field>
Description: The name of a multivalue field.
Usage
You can use evaluation functions and statistical functions on multivalue fields or to return multivalue fields.
Examples
Example 1:
For sendmail events, combine the values of the senders field into a single value. Display the top 10 values.
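A sketch of such a search, assuming the sendmail events are identified by an eventtype named sendmail:
eventtype=sendmail | nomv senders | top senders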
See also
Commands:
makemv
mvcombine
424
mvexpand
convert
Functions:
Multivalue eval functions
Multivalue stats and chart functions
split
outlier
Description
This command is used to remove outliers, not detect them. It removes or truncates outlying numeric values in selected
fields. If no fields are specified, then the outlier command attempts to process all fields.
To identify outliers and create alerts for outliers, see Finding and removing outliers in the Search Manual.
Syntax
Optional arguments
<outlier-options>
Syntax: <action> | <mark> | <param> | <uselower>
Description: Outlier options.
<field-list>
Syntax: <field> ...
Description: A space-delimited list of field names.
Outlier options
<action>
Syntax: action=remove | transform
Description: Specifies what to do with the outliers. The remove option removes events that contain the outlying
numerical values. The transform option truncates the outlying values to the threshold for outliers. If
action=transform and mark=true, prefixes the values with "000".
Abbreviations: The remove action can be shortened to rm. The transform action can be shortened to tf.
Default: transform
<mark>
Syntax: mark=<bool>
Description: If action=transform and mark=true, prefixes the outlying values with "000". If action=remove, the
mark argument has no effect.
Default: false
<param>
Syntax: param=<num>
425
Description: Parameter controlling the threshold of outlier detection. An outlier is defined as a numerical value
that is outside of param multiplied by the inter-quartile range (IQR).
Default: 2.5
<uselower>
Syntax: uselower=<bool>
Description: Controls whether to look for outliers for values below the median in addition to above.
Default: false
Usage
Filtering is based on the inter-quartile range (IQR), which is computed from the difference between the 25th percentile and
75th percentile values of the numeric fields. If the value of a field in an event is less than (25th percentile) - param*IQR
or greater than (75th percentile) + param*IQR , that field is transformed or that event is removed based on the action
parameter.
Examples
Example 1: For a timechart of webserver events, transform the outlying average CPU values.
... | outlier
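A fuller sketch of what such a search might look like, assuming webserver hosts match host=webserver* and that CPU
time is recorded in a cpu_seconds field (both names are illustrative):
host=webserver* | timechart avg(cpu_seconds) BY host | outlier action=transform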
See also
outputcsv
Description
If you have Splunk Enterprise, this command saves search results to the specified CSV file on the local search head in
the $SPLUNK_HOME/var/run/splunk/csv directory. Updates to $SPLUNK_HOME/var/run/*.csv using the outputcsv command
are not replicated across the cluster.
If you have Splunk Cloud Platform, you cannot use this command. Instead, you have these options:
• Export search results using Splunk Web. See Export data using Splunk Web in the Search Manual.
• Export search results using REST API. See Export data using the REST APIs in the Search Manual.
• Create an alert action that includes a CSV file as an email attachment. See Email notification action in the Alerting
Manual.
426
Syntax
Optional arguments
append
Syntax: append=<bool>
Description: If append is true, the command attempts to append to an existing CSV file, if the file exists. If the
CSV file does not exist, a file is created. If there is an existing file that has a CSV header already, the command
only emits the fields that are referenced by that header. The command cannot append to .gz files.
Default: false
create_empty
Syntax: create_empty=<bool>
Description: If set to true and there are no results, a zero-length file is created. When set to false and there are
no results, no file is created. If the file previously existed, the file is deleted.
Default: false
dispatch
Syntax: dispatch=<bool>
Description: If set to true, refers to a file in the job directory in $SPLUNK_HOME/var/run/splunk/dispatch/<job
id>/.
filename
Syntax: <filename>
Description: Specify the name of a CSV file to write the search results to. This file should be located in
$SPLUNK_HOME/var/run/splunk/csv. Directory separators are not permitted in the filename. If no filename is
specified, the command rewrites the contents of each result as a CSV row into the _xml field. Otherwise the
command writes into a file. The .csv file extension is appended to the filename if the filename has no file
extension.
override_if_empty
Syntax: override_if_empty=<bool>
Description: If override_if_empty=true and no results are passed to the output file, the existing output file is
deleted. If override_if_empty=false and no results are passed to the output file, the command does not delete
the existing output file.
Default: true
singlefile
Syntax: singlefile=<bool>
Description: If singlefile is set to true and the output spans multiple files, the command collapses it into a single file.
Default: true
usexml
Syntax: usexml=<bool>
Description: If there is no filename, specifies whether or not to encode the CSV output into XML. This option
should not be used when invoking the outputcsv from the UI.
427
Usage
There is no limit to the number of results that can be saved to the CSV file.
When the outputcsv command is used there are internal fields that are automatically added to the CSV file. The internal
fields that are added to the output in the CSV file are:
• _raw
• _time
• _indextime
• _serial
• _sourcetype
• _subsecond
To exclude internal fields from the output, use the fields command and specify the fields that you want to exclude. For
example:
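A sketch of excluding internal fields before writing the file (the field list and file name are illustrative):
... | fields - _indextime _sourcetype _subsecond _serial | outputcsv MyTestCsvFile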
Multivalued fields
The outputcsv command merges values in a multivalued field into a single space-delimited value.
Distributed deployments
The outputcsv command is not compatible with search head pooling and search head clustering.
The command saves the *.csv file on the local search head in the $SPLUNK_HOME/var/run/splunk/ directory. The *.csv
files are not replicated on the other search heads.
Examples
Output the search results to the mysearch.csv file. The CSV file extension is automatically added to the file name if you
don't specify the extension in the search.
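A sketch of such a search (any search can precede the outputcsv command):
... | outputcsv mysearch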
You can exclude unwanted internal fields from the output CSV file. In this example, the fields to exclude are _indextime,
_sourcetype, _subsecond, and _serial.
428
index=_internal sourcetype="splunkd" | head 5 | fields _raw _time | fields - _indextime _sourcetype
_subsecond _serial | outputcsv MyTestCsvfile
Output the search results to the mysearch.csv file if results are returned from the search. Do not delete the mysearch.csv
file if no results are returned.
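A sketch of such a search, using the override_if_empty argument:
... | outputcsv mysearch.csv override_if_empty=false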
See also
inputcsv
outputlookup
Description
Writes search results to a static lookup table, or KV store collection, that you specify.
Syntax
| outputlookup
[append=<bool>]
[create_empty=<bool>]
[override_if_empty=<bool>]
[max=<int>]
[key_field=<field>]
[createinapp=<bool>]
[create_context=<string>]
[output_format=<string>]
<filename> | <tablename>
Required arguments
You must specify one of the following required arguments, either filename or tablename.
filename
Syntax: <string>
Description: The name of the lookup file. The file must end with .csv or .csv.gz.
tablename
Syntax: <string>
Description: The name of the lookup table as specified by a stanza name in transforms.conf. The lookup table
can be configured for any lookup type (CSV, external, or KV store).
429
Optional arguments
append
Syntax: append=<bool>
Description: The default setting, append=false, writes the search results to the .csv file or KV store collection.
Columns that are not in the current search results are removed from the file. If set to true, attempts to append
search results to an existing .csv file or KV store collection. Otherwise it creates a file. If there is an existing .csv
file, the outputlookup command writes only the fields that are present in the previously existing .csv file. An
outputlookup search that is run with append=true might result in a situation where the lookup table or collection is
only partially updated. This means that a subsequent lookup or inputlookup search on that lookup table or
collection might return stale data along with new data. The outputlookup command cannot append to .gz files.
Default: false
create_context
Syntax: create_context= app | user | system
Description: Specifies where the lookup table file is created. Ignored in favor of the createinapp argument if both
arguments are used in the search. See Usage for details.
Default: app
create_empty
Syntax: create_empty=<bool>
Description: If set to true and there are no results, a zero-length file is created. When set to false and there are
no results, no file is created. If the file previously existed, the file is deleted.
For example, suppose there is a system-level lookup called "test" with the lookup defined in "test.csv". There is
also an app-level lookup with the same name. If an app overrides that "test.csv" in its own app directory with an
empty file (create_empty=true), the app-level lookup behaves as if the lookup is empty. However, if there is no file
at all at the app level (create_empty=false), the lookup file at the system level is used.
Default: false
createinapp
Syntax: createinapp=<bool>
Description: Specifies whether the lookup table file is created in the system directory or the lookups directory for
the current app context. Overrides the create_context argument if both arguments are used in the search. See
Usage for details.
Default: true
key_field
Syntax: key_field=<field>
Description: For KV store-based lookups, uses the specified field name as the key to a value and replaces that
value. An outputlookup search using the key_field argument might result in a situation where the lookup table or
collection is only partially updated. A subsequent lookup or inputlookup search on that collection might return
stale data along with new data. A partial update only occurs with concurrent searches, one with the outputlookup
command and a search with the inputlookup command. It is possible that the inputlookup occurs when the
outputlookup is still updating some of the records.
max
Syntax: max=<int>
Description: The number of rows to output. Include the max argument in the outputlookup command to set a limit
for a CSV file or change the limit for a KV store collection. The default number of rows to output to a KV store
collection is 50000 and is controlled by the max_rows_per_query setting in the limits.conf file.
Default: no limit for a CSV file, 50000 for a KV store.
430
output_format
Syntax: output_format=splunk_sv_csv | splunk_mv_csv
Description: Controls the output data format of the lookup. Use output_format=splunk_mv_csv when you want to
output multivalued fields to a lookup table file, and then read the fields back into Splunk using the inputlookup
command. The default, splunk_sv_csv, outputs a CSV file which excludes the _mv_<fieldname> fields.
Default: splunk_sv_csv
override_if_empty
Syntax: override_if_empty=<bool>
Description: If override_if_empty=true and no results are passed to the output file, the existing output file is
deleted. If override_if_empty=false and no results are passed to the output file, the command does not delete
the existing output file.
Default: true
Usage
The lookup table must be a CSV or GZ file, or a table name specified with a lookup table configuration in
transforms.conf. The lookup table can refer to a KV store collection or a CSV lookup. The outputlookup command cannot
be used with external lookups.
For CSV lookups, outputlookup creates a lookup table file for the results of the search. There are three locations where
outputlookup can put the file it creates:
• The system lookups directory.
• The lookups directory for the current app context.
• The lookups directory for the user running the search.
You can use the createinapp or create_context arguments to determine where outputlookup creates the lookup table for
a given search. If you try to use both of these arguments in the same search, the createinapp argument overrides the
create_context argument.
If you do not use either argument in your search, the create_context setting in limits.conf determines where
outputlookup creates the lookup table file. This setting defaults to app if there is an app context when you run the search,
or to system, if there is not an app context when you run the search.
To have outputlookup create the lookup table file in the system lookups directory, set createinapp=false or set
create_context=system. Alternatively, if you do not have an app context when you run the search, leave both arguments
out of the search and rely on the limits.conf version of create_context to put the lookup table file in the system
directory. This last approach only works if the create_context setting in limits.conf has not been set to user.
To have outputlookup create the lookup table file in the lookups directory for the current app context, set
createinapp=true or set create_context=app. Alternatively, if you do have an app context when you run the search, leave
both arguments out of the search and rely on the limits.conf version of create_context to put the lookup table file in the
app directory. This last approach only works if the create_context setting in limits.conf has not been set to user.
To have outputlookup create the lookup table file in the lookups directory for the user running the search, set
create_context=user. Alternatively, if you want all outputlookup searches to create lookup table files in user lookup
directories by default, you can set create_context=user in limits.conf. The createinapp and create_context arguments
can override this setting if they are used in the search.
431
If the lookup table file already exists in the location to which it is written, the existing version of the file is overwritten
with the results of the outputlookup search.
For permissions in CSV lookups, use the check_permission field in transforms.conf and
outputlookup_check_permission in limits.conf to restrict write access to users with the appropriate permissions when
using the outputlookup command. Both check_permission and outputlookup_check_permission default to false. Set to
true for Splunk software to verify permission settings for lookups for users. You can change lookup table file permissions
in the .meta file for each lookup file, or in Settings > Lookups > Lookup table files. By default, only users who have the
admin or power role can write to a shared CSV lookup file.
For more information about creating lookups, see About lookups in the Knowledge Manager Manual.
For more information about App Key Value Store collections, see About KV store in the Admin Manual.
Append results
Suppose you have an existing CSV file which contains columns A, D, and J. The results of your search are columns A, C,
and J. If you run a search with outputlookup append=false, then columns A, C, and J are written to the CSV file. Column
D is not retained.
If you run a search with outputlookup append=true, then only the columns that are currently in the file are preserved. In
this example columns A and J are written to the CSV file. Column C is lost because it does not already exist in the CSV
file. Column D is retained.
You can work around this issue by using the eval command to add a column to your CSV file before you run the search.
For example, if your CSV file is named foo you would do something like this:
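A sketch of what that preparation step might look like, assuming the missing column is C and the lookup file is named
foo (both are illustrative):
| inputlookup foo | eval C="" | outputlookup foo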
Then run your search and pipe the results to the fields command for the columns you want to preserve.
Multivalued fields
When you output to a static lookup table, the outputlookup command merges values in a multivalued field into a single
space-delimited value. This does not apply to a KV store collection.
Examples
| outputlookup usertogroup
432
2. Write to a lookup file in a specific system or app directory
| outputlookup users.csv
Write food inspection events for Shalimar Restaurant to a KV store collection called kvstorecoll. This collection is
referenced in a lookup table called kvstorecoll_lookup.
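A sketch of what such a search might look like; the base search that selects the food inspection events and the name
field are illustrative, while kvstorecoll_lookup is the lookup named in this example:
sourcetype=food_inspections name="SHALIMAR RESTAURANT" | outputlookup kvstorecoll_lookup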
Write the contents of a CSV file to the KV store collection kvstorecoll using the lookup table kvstorecoll_lookup. This
requires usage of both inputlookup and outputlookup commands.
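A sketch of such a search, assuming the source CSV file is named customers.csv (illustrative):
| inputlookup customers.csv | outputlookup kvstorecoll_lookup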
Update field values for a single KV store collection record. This requires you to use the inputlookup, outputlookup, and
eval commands. The record is indicated by the value of its internal key ID (the _key field) and is updated with a new
customer name and customer city. The record belongs to the KV store collection kvstorecoll, which is accessed through
the lookup table kvstorecoll_lookup.
To learn how to obtain the internal key ID values of the records in a KV store collection, see Example 5 for the
inputlookup command.
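A sketch of what such a search might look like; the key value placeholder and the CustName and CustCity field names
are illustrative:
| inputlookup kvstorecoll_lookup | search _key=<key_id> | eval CustName="New Customer Name", CustCity="New City" | outputlookup kvstorecoll_lookup append=true key_field=_key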
See also
Commands
collect
inputlookup
lookup
inputcsv
mcollect
meventcollect
outputcsv
outputtext
433
outputtext
Description
The outputtext command was created as an internal mechanism to render event texts for output.
Syntax
outputtext [usexml=<bool>]
Optional arguments
usexml
Syntax: usexml=<bool>
Description: If set to true, the copy of the _raw field in the _xml field is escaped XML. If usexml is set to false, the _xml
field is an exact copy of _raw.
Default: true
Usage
The outputtext command writes all search results to the search head. In Splunk Web, the results appear in the Statistics
tab.
Examples
... | outputtext
See also
outputcsv
overlap
Note: We do not recommend using the overlap command to fill or backfill summary indexes. Splunk Enterprise provides
a script called fill_summary_index.py that backfills your indexes or fills summary index gaps. If you have Splunk Cloud
Platform and need to backfill, open a Support ticket and specify the time range, app, search name, user and any other
details required to enable Splunk Support to backfill the required data. For more information, see "Manage summary index
gaps" in the Knowledge Manager Manual.
434
Description
Find events in a summary index that overlap in time, or find gaps in time during which a scheduled saved search might
have missed events.
• If you find a gap, run the search over the period of the gap and summary index the results using "| collect".
• If you find overlapping events, manually delete the overlaps from the summary index by using the search
language.
Important: Input events are expected to have the 'info_min_time' and 'info_max_time' fields (inclusive and exclusive,
respectively), as well as the 'info_search_id' and 'info_search_name' fields. If the index contains raw events (_raw), the
overlap command does not work. Instead, the index should contain events such as chart, stats, and timechart results.
Syntax
overlap
Examples
Example 1:
index=summary | overlap
See also
pivot
Description
The pivot command makes simple pivot operations fairly straightforward, but can be pretty complex for more
sophisticated pivot operations. Fundamentally this command is a wrapper around the stats and xyseries commands.
The pivot command does not add new behavior, but it might be easier to use if you are already familiar with how Pivot
works. See the Pivot Manual. Also, read how to open non-transforming searches in Pivot.
Run pivot searches against a particular data model object. This requires a large number of inputs: the data model, the
data model object, and pivot elements.
435
Syntax
Required arguments
datamodel-name
Syntax: <string>
Description: The name of the data model to search.
objectname
Syntax: <string>
Description: The name of a data model object to search.
pivot element
Syntax: (<cellvalue>)* (SPLITROW <rowvalue>)* (SPLITCOL colvalue [options])* (FILTER <filter expression>)*
(LIMIT <limit expression>)* (ROWSUMMARY <true | false>)* (COLSUMMARY <true | false>)* (SHOWOTHER
<true | false>)* (NUMCOLS <num>)* (rowsort [options])*
Description: Use pivot elements to define your pivot table or chart. Pivot elements include cell values, split rows,
split columns, filters, limits, row and column formatting, and row sort options. Cell values always come first. They
are followed by split rows and split columns, which can be interleaved, for example: avg(val), SPLITCOL foo,
SPLITROW bar, SPLITCOL baz.
Cell value
<cellvalue>
Syntax: <function>(fieldname) [AS <label>]
Description: Define the values of a cell and optionally rename it. Here, label is the name of the cell in the report.
The set of allowed functions depends on the data type of the fieldname:
SPLITROW <rowvalue>
Syntax: SPLITROW <field> [AS <label>] [RANGE start=<value> end=<value> max=<value> size=<value>]
[PERIOD (auto | year | month | day | hour | minute | second)] [TRUELABEL <label>] [FALSELABEL <label>]
Description: You can specify one or more of these options on each SPLITROW. The options can appear in any
order. You can rename the <field> using "AS <label>", where "label" is the name of the row in the report.
• RANGE applies only for numbers. You do not need to specify all of the options (start, end, max, and size).
• PERIOD applies only for timestamps. Use it to specify the period to bucket by.
• TRUELABEL applies only for booleans. Use it to specify the label for true values.
• FALSELABEL applies only for booleans. Use it to specify the label for false values.
436
Descriptions for column split-by elements
Other options depend on the data type of the field specified (fieldname):
• RANGE applies only for numbers. The options (start, end, max, and size) do not all have to be specified.
• PERIOD applies only for timestamps. Use it to specify the period to bucket by.
• TRUELABEL applies only for booleans. Use it to specify the label for true values.
• FALSELABEL applies only for booleans. Use it to specify the label for false values.
• Strings: is, contains, in, isNot, doesNotContain, startsWith, endsWith, isNull, isNotNull
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Examples
Example 1: This command counts the number of events in the "HTTP Requests" object in the "Tutorial" data model.
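A sketch of such a search, assuming the data model object is named HTTP_requests as in Example 2:
| pivot Tutorial HTTP_requests count(HTTP_requests) AS "Count of HTTP requests"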
437
This can be formatted as a single value report in the dashboard panel:
Example 2: Using the Tutorial data model, create a pivot table for the count of "HTTP Requests" per host.
| pivot Tutorial HTTP_requests count(HTTP_requests) AS "Count" SPLITROW host AS "Server" SORT 100 host
See also
predict
Description
The predict command forecasts values for one or more sets of time-series data. The command can also fill in missing
data in a time-series and provide predictions for the next several time steps.
The predict command provides confidence intervals for all of its estimates. The command adds a predicted value and an
upper and lower 95th percentile range to each event in the time-series. See the Usage section in this topic.
Syntax
Required arguments
<field-list>
Syntax: <field>...
Description: The names of the fields for the variable that you want to predict. You can specify one or more fields.
438
Optional arguments
<newfield>
Syntax: <string>
Description: Renames the fields that are specified in the <field-list>. You do not need to rename every field
that you specify in the <field-list>. However, for each field that you want to rename, you must specify a
separate AS <newfield> clause.
<predict_options>
Syntax: algorithm=<algorithm_name> | correlate_field=<field> | future_timespan=<number> |
holdback=<number> | period=<number> | suppress=<bool> | lowerXX=<field> | upperYY=<field>
Description: Options you can specify to control the predictions. You can specify one or more options, in any
order. Each of these options is described in the Predict options section.
Predict options
algorithm
Syntax: algorithm= LL | LLT | LLP | LLP5 | LLB | BiLL
Description: Specify the name of the forecasting algorithm to apply. LL, LLT, LLP, and LLP5 are univariate
algorithms. LLB and BiLL are bivariate algorithms. All the algorithms are variations based on the Kalman filter.
Each algorithm expects a minimum number of data points. If not enough effective data points are supplied, an
error message is displayed. For instance, the field itself might have more than enough data points, but the
number of effective data points might be small if the holdback value that you specify is large.
Default: LLP5
LL (Local level): A univariate model with no trends and no seasonality. Requires a minimum of 2 data points. The LL
algorithm is the simplest algorithm and computes the levels of the time series. For example, each new state equals the
previous state, plus the Gaussian noise.

LLT (Local level trend): A univariate model with trend, but no seasonality. Requires a minimum of 3 data points.

LLP (Seasonal local level): A univariate model with seasonality. The number of data points must be at least twice the
number of periods, using the period attribute. The LLP algorithm takes into account the cyclical regularity of the data, if
it exists. If you know the number of periods, specify the period argument. If you do not set the period, this algorithm
tries to calculate it. LLP returns an error message if the data is not periodic.

LLP5 (Combines LLT and LLP models for its prediction): If the time series is periodic, LLP5 computes two predictions,
one using LLT and the other using LLP. The algorithm then takes a weighted average of the two values and outputs that
as the prediction. The confidence interval is also based on a weighted average of the variances of LLT and LLP.

LLB (Bivariate local level): A bivariate model with no trends and no seasonality. Requires a minimum of 2 data points.
LLB uses one set of data to make predictions for another. For example, assume it uses dataset Y to make predictions
for dataset X. If holdback=10, LLB takes the last 10 data points of Y to make predictions for the last 10 data points of X.

BiLL (Bivariate local level): A bivariate model that predicts both time series simultaneously. The covariance of the two
series is taken into account.
correlate
Syntax: correlate=<field>
439
Description: Specifies the time series that the LLB algorithm uses to predict the other time series. Required
when you specify the LLB algorithm. Not used for any other algorithm.
Default: None
future_timespan
Syntax: future_timespan=<num>
Description: Specifies how many future predictions the predict command will compute. This number must be a
non-negative number. You would not use the future_timespan option if algorithm=LLB.
Default: 5
holdback
Syntax: holdback=<num>
Description: Specifies the number of data points from the end that are not to be used by the predict command.
Use in conjunction with the future_timespan argument. For example, 'holdback=10 future_timespan=10'
computes the predicted values for the last 10 values in the data set. You can then judge how accurate the
predictions are by checking whether the actual data point values fall into the predicted confidence intervals.
Default: 0
lowerXX
Syntax: lower<int>=<field>
Description: Specifies a percentage for the confidence interval and a field name to use for the lower confidence
interval curve. The <int> value is a percentage that specifies the confidence level. The integer must be a number
between 0 and 100. The <field> value is the field name.
Default: The default confidence interval is 95%. The default field name is 'lower95(prediction(X))' where X is the
name of the field to be predicted.
period
Syntax: period=<num>
Description: Specifies the length of the time period, or recurring cycle, in the time series data. The number must
be at least 2. The LLP and LLP5 algorithms attempt to compute the length of time period if no value is specified. If
you specify the span argument with the timechart command, the unit that you specify for span is the unit used for
period. For example, if your search is ...|timechart span=1d foo2| predict foo2 period=3. The spans are 1
day and the period for the predict is 3 days. Otherwise, the unit for the time period is a data point. For example, if
there are a thousand events, then each event is a unit. If you specify period=7, that means the data recycles after
every 7 data points, or events.
Default: None
suppress
Syntax: suppress=<field>
Description: Used with the multivariate algorithms. Specifies one of the predicted fields to hide from the output.
Use suppress when it is difficult to see all of the predicted visualizations at the same time.
Default: None
upperYY
Syntax: upper<int>=<field>
Description: Specifies a percentage for the confidence interval and a field name to use for the upper confidence
interval curve. The <int> value is a percentage that specifies the confidence level. This must be a number
between 0 and 100. The <field> value is the field name.
Default: The default confidence interval is 95%. The default field name is 'upper95(prediction(X))' where X is the
name of the field to be predicted.
440
Confidence intervals
The lower and upper confidence interval parameters default to lower95 and upper95. These values specify a confidence
interval where 95% of the predictions are expected to fall.
It is typical for some of the predictions to fall outside the confidence interval.
Usage
The predict command must be preceded by the timechart command. The predict command requires time series data.
See the Examples section for more details.
How it works
The predict command models the data by stipulating that there is an unobserved entity which progresses through time in
different states.
To predict a value, the command calculates the best estimate of the state by considering all of the data in the past. To
compute estimates of the states, the command hypothesizes that the states follow specific linear equations with Gaussian
noise components.
Under this hypothesis, the least-squares estimate of the states are calculated efficiently. This calculation is called the
Kalman filter, or Kalman-Bucy filter. For each state estimate, a confidence interval is obtained. The estimate is not a point
estimate. The estimate is a range of values that contain the observed, or predicted, values.
The measurements might capture only some aspect of the state, but not necessarily the whole state.
Missing values
The predict command can work with data that has missing values. The command calculates the best estimates of the
missing values.
Do not remove events with missing values. Removing the events might distort the periodicity of the data. Do not specify
cont=false with the timechart command. Specifying cont=false removes events with missing values.
Specifying span
The unit for the span specified with the timechart command must be seconds or higher. The predict command cannot
accept subseconds as an input when it calculates the period.
Examples
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
441
range All time when you run the search.
Predict future access based on the previous access numbers that are stored in Apache web access log files. Count the
number of access attempts using a span of 1 day.
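A sketch of such a search, assuming the tutorial web access events have sourcetype values beginning with access_
(illustrative):
sourcetype=access_* | timechart span=1d count | predict count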
The results appear on the Statistics tab. Click the Visualization tab. If necessary change the chart type to a Line Chart.
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Chart the number of purchases made daily for a specific product.
• This example searches for all purchase events, defined by action=purchase for the arcade category, and pipes those
results into the timechart command, as sketched after this list.
• The span=1day argument buckets the count of purchases into daily chunks.
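A sketch of the search described above; the categoryId=ARCADE filter is illustrative:
sourcetype=access_* action=purchase categoryId=ARCADE | timechart span=1day count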
The results appear on the Statistics tab and look something like this:
_time count
2018-06-11 17
2018-06-12 63
2018-06-13 94
2018-06-14 82
2018-06-15 63
2018-06-16 76
442
_time count
2018-06-17 70
2018-06-18 72
Add the predict command to the search to calculate the prediction for the number of purchases of the Arcade games that
might be sold in the near future.
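Continuing the previous sketch, the predict command is appended to the timechart results:
sourcetype=access_* action=purchase categoryId=ARCADE | timechart span=1day count | predict count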
The results appear on the Statistics tab. Click the Visualization tab. If necessary change the chart type to a Bar Chart.
Predict the values of foo using the default LLP5 algorithm, an algorithm that combines the LLP and LLT algorithms.
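A sketch of such a search, assuming foo is produced by a preceding timechart command:
... | timechart span="1m" count AS foo | predict foo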
Predict multiple fields using the same algorithm. The default algorithm in this example.
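A sketch of such a search, assuming the fields foo1, foo2, and foo3 already exist in the timechart results:
... | timechart ... | predict foo1 foo2 foo3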
When specifying confidence intervals, the upper and lower confidence interval values do not need to match. This example
predicts 10 values for a field using the LL algorithm, holding back the last 20 values in the data set.
... | timechart span="1m" count AS foo | predict foo AS foobar algorithm=LL upper90=high lower97=low
future_timespan=10 holdback=20
443
6. Predict the values using the LLB algorithm
This example illustrates the LLB algorithm. The foo3 field is predicted by correlating it with the foo2 field.
... | timechart span="1m" count(x) AS foo2 count(y) AS foo3 | predict foo3 AS foobar algorithm=LLB
correlate=foo2 holdback=100
In this example, the search abstains from using the last 5 data points and makes 5 future predictions. The predictions
correspond to the last 5 values in the data. You can judge how accurate the predictions are by checking whether the
observed values fall into the predicted confidence intervals.
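A sketch of what such a search might look like, assuming the field being predicted is named foo:
... | timechart span="1m" count AS foo | predict foo future_timespan=5 holdback=5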
8. Predict multiple fields using the same algorithm and the same future_timespan and holdback
Predict multiple fields using the same algorithm and same future_timespan and holdback.
... | timechart ... | predict foo1 foo2 foo3 algorithm=LLT future_timespan=15 holdback=5
Use aliases for the fields by specifying the AS keyword for each field.
... | timechart ... | predict foo1 AS foobar1 foo2 AS foobar2 foo3 AS foobar3 algorithm=LLT
future_timespan=15 holdback=5
Predict multiple fields using different algorithms and different options for each field.
... | timechart ... | predict foo1 algorithm=LL future_timespan=15 foo2 algorithm=LLP period=7
future_timespan=7
Predict values for foo1 and foo2 together using the bivariate algorithm BiLL.
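A sketch of such a search; the future_timespan value is illustrative:
... | timechart ... | predict foo1 foo2 algorithm=BiLL future_timespan=10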
See also
trendline, x11
rangemap
Description
Use the rangemap command to categorize the values in a numeric field. The command adds a new field called range to
each event and displays the category in the range field. The values in the range field are based on the numeric ranges
444
that you specify.
Set the range field to the names of any attribute_name that the value of the input field is within. If no range is matched,
the range value is set to the default value.
The ranges that you set can overlap. If you have overlapping values, the range field is created as a multivalue field
containing all the values that apply. For example, if low=1-10, elevated=5-15, and the input field value is 10, the range
field contains both low and elevated.
Syntax
rangemap
field=<string>
[<attribute_name>=<numeric_range>]...
[default=<string>]
Required arguments
field
Syntax: field=<string>
Description: The name of the input field. This field must contain numeric values.
Optional arguments
attribute_name=numeric_range
Syntax: <string>=<num>-<num>
Description: The <attribute_name> is a string value that is output when the <numeric_range> matches the value
in the <field>. The <attribute_name> is output to the range field. The <numeric_range> is the starting and
ending values for the range. The values can be integers or floating point numbers. The first value must be lower
than the second. The <numeric_range> can include negative values.
Example: Dislike=-5--1 DontCare=0-0 Like=1-5
default
Syntax: default=<string>
Description: If the input field does not match a range, use this to define a default value.
Default: "None"
Usage
Basic examples
Example 1:
Set range to "green" if the date_second is between 1-30; "blue", if between 31-39; "red", if between 40-59; and "gray", if
no range matches (for example, if date_second=0).
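A sketch of such a search, using the ranges described above:
... | rangemap field=date_second green=1-30 blue=31-39 red=40-59 default=gray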
445
Example 2:
Sets the value of each event's range field to "low" if its count field is 0 (zero); "elevated", if between 1-100; "severe",
otherwise.
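A sketch of such a search, using the ranges described above:
... | rangemap field=count low=0-0 elevated=1-100 default=severe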
Extended example
This example uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file
that contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and add it as an input. The following examples
use the All Earthquakes data from the Past 30 days list.
This search counts the number and magnitude of each earthquake that occurred in and around Alaska. Then a color is
assigned to each magnitude using the rangemap command.
source=all_month.csv place=*alaska* mag>=3.5 | stats count BY mag | rename mag AS magnitude | rangemap
field=magnitude light=3.9-4.3 strong=4.4-4.9 severe=5.0-9.0 default=weak
446
The results show the count for each magnitude value, along with the range assigned to each magnitude.
source=all_month.csv place=*alaska* mag>=3.5 | stats count BY mag | rename mag AS magnitude | rangemap
field=magnitude green=3.9-4.2 yellow=4.3-4.6 red=4.7-5.0 default=gray | stats sum(count) by range
range sum(count)
gray 127
green 96
red 23
yellow 43
Arrange the results in a custom sort order
By default the values in the search results are in descending order by the sum(count) field. You can apply a custom sort
order to the results using the eval command with the case function.
source=all_month.csv place=*alaska* mag>=3.5 | stats count BY mag | rename mag AS magnitude | rangemap
field=magnitude green=3.9-4.2 yellow=4.3-4.6 red=4.7-5.0 default=gray | stats sum(count) by range | eval
sort_field=case(range="red",1, range="yellow",2, range="green",3, range="gray",4) | sort sort_field
range     sum(count)  sort_field
red       23          1
yellow    43          2
green     96          3
gray      127         4
See also
Commands
eval
Blogs
Order Up! Custom Sort Orders
rare
Description
Finds the least frequent tuple of values of all fields in the field list. If the <by-clause> is specified, this command returns
rare tuples of values for each distinct tuple of values of the group-by fields.
447
This command operates identically to the top command, except that the rare command finds the least frequent instead of
the most frequent.
Syntax
Required arguments
<field-list>
Syntax: <string>,...
Description: Comma-delimited list of field names.
Optional arguments
<top-options>
Syntax: countfield=<string> | limit=<int> | percentfield=<string> | showcount=<bool> | showperc=<bool>
Description: Options that specify the type and number of values to display. These are the same <top-options>
used by the top command.
<by-clause>
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
Top options
countfield
Syntax: countfield=<string>
Description: The name of a new field to write the value of count into.
Default: "count"
limit
Syntax: limit=<int>
Description: Specifies how many tuples to return. If you specify limit=0, all values up to maxresultrows are
returned. See Limits section. Specifying a value larger than maxresultrows produces an error.
Default: 10
percentfield
Syntax: percentfield=<string>
Description: Name of a new field to write the value of percentage.
Default: "percent"
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative
prevalence of that tuple.
448
Default: true
Usage
The number of results returned by the rare command is controlled by the limit argument. The default value for the limit
argument is 10. You can change this limit up to the maximum value specified in the maxresultrows setting in the [rare]
stanza in the limits.conf file. The default maximum is 50,000, which effectively keeps a ceiling on the memory that the
rare command uses.
Examples
Return the least common values in the "url" field. Limits the number of values returned to 5.
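A sketch of such a search:
... | rare limit=5 url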
Find the least common values in the "user" field for each "host" value. By default, a maximum of 10 results are returned.
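A sketch of such a search:
... | rare user BY host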
See also
redistribute
Description
The redistribute command implements parallel reduce search processing to shorten the search runtime of a set of
supported SPL commands. Apply the redistribute command to high-cardinality dataset searches that aggregate large
numbers of search results.
The redistribute command requires a distributed search environment where indexers have been configured to operate
as intermediate reducers. Only users with roles that have the run_multi_phased_searches capability can use the
redistribute command in searches.
Syntax
449
Required arguments
None.
Optional arguments
num_of_reducers
Syntax: num_of_reducers=<int>
Description: Specifies the number of indexers in the indexer pool that are repurposed as intermediate reducers.
Default: The default value for num_of_reducers is controlled by three settings in the limits.conf
file: maxReducersPerPhase, winningRate, and reducers. If these settings are not changed, by default the Splunk
software sets num_of_reducers to 50 percent of your indexer pool, with a maximum of 4 indexers. See Usage for
more information.
by-clause
Syntax: BY <field-list>
Description: The name of one or more fields to group by. You cannot use a wildcard character to specify multiple
fields with similar names. You must specify each field separately. See Using the by-clause for more information.
Usage
In Splunk deployments that have distributed search, a two-phase map-reduce process is typically used to determine the
final result set for the search. Search results are mapped at the indexer layer and then reduced at the search head.
The redistribute command inserts an intermediary reduce phase to the map-reduce process, making it a three-phase
map-reduce-reduce process. This three-phase process is parallel reduce search processing.
In the intermediary reduce phase, a subset of the indexers become intermediate reducers. The intermediate reducers
perform reduce operations for the search commands and then pass the results on to the search head, where the final
result reduction and aggregation operations are performed. This parallelization of reduction work that otherwise would be
done entirely by the search head can result in faster completion times for high-cardinality searches that aggregate large
numbers of search results.
For information about managing parallel reduce processing at the indexer level, including configuring indexers to operate
as intermediate reducers, see Overview of parallel reduce search processing, in the Distributed Search manual.
If you use Splunk Cloud Platform, use redistribute only when your indexers are operating with a low to medium average
load. You do not need to perform any configuration tasks to use the redistribute command.
Supported commands
The redistribute command supports only streaming commands and the following nonstreaming commands:
• stats
• tstats
• streamstats
• eventstats
• sichart
• sitimechart
The redistribute command also supports the transaction command, when the transaction command is operating on
450
only one field. For example, the redistribute command cannot support the transaction command when the following
conditions are true:
For best performance, place redistribute immediately before the first supported nonstreaming command that has
high-cardinality input.
The redistribute command moves the processing of a search string from the intermediate reducers to the search head
in the following circumstances:
Using the by-clause to determine how results are partitioned on the reducers
At the start of the intermediate reduce phase, the redistribute command takes the mapped search results and
redistributes them into partitions on the intermediate reducers according to the fields specified by the by-clause argument.
If you do not specify any by-clause fields, the search processor uses the field or fields that work best with the commands
that follow the redistribute command in the search string.
Command type
The redistribute command is an orchestrating command, which means that it controls how a search runs. It does not
focus on the events processed by the search. The redistribute command instructs the distributed search query planner
to convert centralized streaming data into distributed streaming data by distributing it across the intermediate reducers.
For more information about command types, see Types of commands in the Search Manual.
The default value for the num_of_reducers argument is controlled by three settings in the limits.conf file:
maxReducersPerPhase, winningRate, and reducers.
Setting name: maxReducersPerPhase
Default value: 4
Definition: The maximum number of indexers that can be used as intermediate reducers in the intermediate reduce
phase.

Setting name: winningRate
Default value: 50
Definition: The percentage of indexers that can be selected from the total pool of indexers and used as intermediate
reducers in a parallel reduce search process. This setting applies only when the reducers setting is not configured.

Setting name: reducers
Default value: "" (empty list)
Definition: A list of valid indexers that are to be used as dedicated intermediate reducers for parallel reduce search
processing. When you run a search with the redistribute command, the valid indexers in the reducers list are the only
indexers that are used for parallel reduce operations. If the number of valid indexers in the reducers list exceeds the
maxReducersPerPhase value, the Splunk platform randomly selects a set of indexers from the reducers list that meets
the maxReducersPerPhase limit.
If you decide to add 7 of your indexers to the reducers list, the winningRate setting ceases to be applied, and the
num_of_reducers argument defaults to 4 indexers. The Splunk platform randomly selects four indexers from the reducers
list to act as intermediate reducers each time you run a valid redistribute search.
If you provide a value for the num_of_reducers argument that exceeds the limit set by the maxReducersPerPhase
setting, the Splunk platform sets the number of reducers to the maxReducersPerPhase value.
Searches that use the redistribute command ignore all data on the search head. If you plan to use the redistribute
command, the best practice is to forward all search head data to the indexer layer. See Best Practice: Forward search
head data to the indexer layer in the Distributed Search manual.
If you want to add the redistribute command to a search that uses the chart or timechart commands to produce
statistical results that can be used for chart visualizations, include either the sichart command or the sitimechart
command in the search as well. The redistribute command uses these si- commands to perform the statistical
calculations for the reporting commands on the intermediate reducers. When the redistribute command moves the
results to the search head, the chart or timechart command transforms the results into a format that can be used for
chart visualizations.
A best practice is to use the same syntax and values for both commands. For example, if you want to have | timechart
count by referrer_domain in your redistribute search, insert | sitimechart count by referrer_domain into the search
string:
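A sketch of what the resulting search might look like; the base search and the choice of referrer_domain as the
redistribute field are illustrative:
... | redistribute by referrer_domain | sitimechart count by referrer_domain | timechart count by referrer_domain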
Certain commands that the redistribute command supports explicitly return results in a sorted order. As a result of the
partitioning that takes place when the redistribute command is run, the Splunk platform loses the sorting order. If the
Splunk platform detects that an order-sensitive command, such as streamstats, is used in a redistribute search, it
automatically inserts sort into the search as it processes it.
For example, the following search includes the streamstats command, which is order-sensitive:
... | redistribute by host | stats count by host | streamstats count by host, source
The Splunk platform adds a sort segment before the streamstats segment when it processes the search. You can see
the sort segment in the search string if you inspect the search job after you run it.
... | redistribute by host | stats count by host | sort 0 str(host) | streamstats count by host, source
452
The stats and streamstats segments are processed on the intermediate reducers because they both split by the
host field, the same field that the redistribute command is distributing on. The work of the sort segment is split between
the indexers during the map phase of the search and the search head during the final reduce phase of the search.
If you require the results of a redistribute search to be sorted in that exact order, use sort to perform the sorting at the
search head. There is an additional performance cost to event sorting after the redistribute command partitions events
on the intermediate reducers.
If you want to get that same event ordering while also adding redistribute to the search to speed it up, add sort to the
search:
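A sketch of what such a search might look like; the host field and the sort expression are illustrative:
... | redistribute by host | stats count by host | sort 0 str(host)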
The stats segment of this search is processed on the intermediate reducers. The work of the sort segment is split
between the indexers during the map phase of the search and the search head during the final reduce phase of the
search.
The redistribute command does not support searches of virtual indexes. The redistribute command also does not
support unified searches if their time ranges are long enough that they run across virtual archive indexes. For more
information, see the following Splunk Analytics for Hadoop topics:
Examples
In this example, the redistribute command is applied to a stats search that is running over an extremely large
high-cardinality dataset. The redistribute command reduces the completion time for the search.
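A sketch of such a search, consistent with the description that follows:
... | redistribute by ip | stats count by ip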
The intermediate reducers process the | stats count by ip portion of the search in parallel, lowering the completion
time for the search. The search head aggregates the results.
This example uses a search over an extremely large high-cardinality dataset. The search string includes the eventstats
command, and it uses the sitimechart command to perform the statistical calculations for a timechart operation. The
search uses the redistribute command to reduce the completion time for the search. A by-clause field is not specified,
so the search processor selects one.
... | redistribute | eventstats count by user, source | where count>10 | sitimechart max(count) by source |
timechart max(count) by source
When this search runs, the intermediate reducers process the eventstats and sitimechart segments of the search in
parallel, reducing the overall completion time of the search. On the search head, the timechart command takes the
reduced sitimechart calculations and transforms them into a format that can be used for charts and visualizations.
Because a by-clause field is not identified in the search string, the intermediate reducers redistribute and partition events
on the source field.
This example uses a search over an extremely large high-cardinality dataset. This search uses the tstats command in
conjunction with the sitimechart and timechart commands. The redistribute command reduces the completion time for
the search.
| tstats prestats=t count BY _time span=1d | redistribute by _time | sitimechart span=1d count | timechart
span=1d count
You have to place the tstats command at the start of the search string with a leading pipe character. When you use the
redistribute command in conjunction with tstats, you must place the redistribute command after the tstats segment
of the search.
In this example, the tstats command uses the prestats=t argument to work with the sitimechart and timechart
commands.
The redistribute command causes the intermediate reducers to process the sitimechart segment of the search in
parallel, reducing the overall completion time for the search. The reducers then push the results to the search head,
where the timechart command processes them into a format that you can use for charts and visualizations.
This example uses a search over an extremely large high-cardinality dataset. The search uses the redistribute
command to reduce the search completion time. The search includes commands that are both supported and
unsupported by the redistribute command. It uses the sort command to sort the results after the rest of the search
has been processed. You need the sort command for event sorting because the redistribute process undoes the
sorting naturally provided by commands in the stats command family.
... | redistribute | eventstats count by user, source | where count >10 | sort 0 -num(count)
In this example, the intermediate reducers process the eventstats and where segments in parallel. Those portions of the
search complete faster than they would when the redistribute command is not used.
The Splunk platform divides the work of processing the sort portion of the search between the indexer and the search
head.
5. Speed up a search where a supported command splits by fields that are not in the redistribute command
by-clause argument
In this example, the redistribute command redistributes events across the intermediate reducers by the source field.
The search includes two commands that are supported by the redistribute command but only one of them is processed
on the intermediate reducers.
... | redistribute by source | eventstats count by source, host | where count > 10 | stats count by userid,
host
In this case, the eventstats segment of the search is processed in parallel by the intermediate reducers because it
includes source as a split-by field. The where segment is also processed on the intermediate reducers.
The stats portion of the search, however, is processed on the search head because its split-by fields are not a superset
of the set of fields that the events have been redistributed by. In other words, the stats split-by fields do not include
source.
regex
Description
Removes results that match or do not match the specified regular expression.
Syntax
regex
(<field>=<regex-expression> | <field>!=<regex-expression> | <regex-expression>)
Required arguments
<regex-expression>
Syntax: "<string>"
Description: An unanchored regular expression. The regular expression must be a Perl Compatible Regular
Expression supported by the PCRE library. Quotation marks are required.
Optional arguments
<field>
Syntax: <field>
Description: Specify the field name from which to match the values against the regular expression.
You can specify that the regex command keeps results that match the expression by using
<field>=<regex-expression>. To keep results that do not match, specify <field>!=<regex-expression>.
Default: _raw
Usage
When you use regular expressions in searches, you need to be aware of how characters such as pipe ( | ) and backslash
( \ ) are handled. See SPL and regular expressions in the Search Manual.
For general information about regular expressions, see About Splunk regular expressions in the Knowledge Manager
Manual.
The difference between the regex and rex commands
Use the regex command to remove results that match or do not match the specified regular expression.
Use the rex command to either extract fields using regular expression named groups, or replace or substitute characters
in a field using sed expressions.
If you use regular expressions in conjunction with the regex command, note that != behaves differently for the regex
command than for the search command.
You can use a regex command with != to filter for events that don't have a field value matching the regular expression, or
for which the field is null. For example, this search will include events that do not define the field Location.
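A sketch of such a regex search, with an illustrative value:
... | regex Location!="San Francisco"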
The search command behaves the opposite way. You can use a search command with != to filter for events that don't
contain a field matching the search string, and for which the field is defined. For example, this search will not include
events that do not define the field Location.
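The equivalent sketch with the search command, again with an illustrative value:
... | search Location!="San Francisco"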
If you use != in the context of the regex command, keep this behavior in mind and make sure you want to include null
fields in your results.
Examples
Example 1: Keep only search results whose "_raw" field contains IP addresses in the non-routable class A (10.0.0.0/8).
This example uses a negative lookbehind assertion at the beginning of the expression.
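A sketch of one way to write it; the regular expression, including the negative lookbehind and lookahead assertions, is illustrative:
... | regex _raw="(?<!\d)10\.\d{1,3}\.\d{1,3}\.\d{1,3}(?!\d)"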
Example 2: Keep only the results that match a valid email address. For example, [email protected].
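Assembling the groups described in the table below, joined with @, an escaped dot, and anchors, the search might look like this; the email field name is illustrative:
... | regex email="^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$"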
Part of the expression   Description
^   Specifies the beginning of the string.
([a-z0-9_\.-]+)   This is the first group in the expression. Specifies to match one or more lowercase letters, numbers, underscores, dots, or hyphens. The backslash ( \ ) character is used to escape the dot ( . ) character. The dot character is escaped, because a non-escaped dot matches any character. The plus ( + ) sign specifies to match from 1 to unlimited characters in this group. In this example this part of the expression matches buttercup in the email address [email protected].
([\da-z\.-]+)   This is the second group in the expression. Specifies to match the domain name, which can be one or more lowercase letters, numbers, underscores, dots, or hyphens. This is followed by another escaped dot character. The plus ( + ) sign specifies to match from 1 to unlimited characters in this group. In this example this part of the expression matches example in the email address [email protected].
([a-z\.]{2,6})   This is the third group. Specifies to match the top-level domain (TLD), which can be 2 to 6 letters or dots. This group matches all types of TLDs, such as .co.uk, .edu, or .asia. In this example it matches .com in the email address [email protected].
Example 3: Filter out zip codes that are formatted like a United States zip code or zip+4 code. For example, this search
would return a Canadian zip code.
Example 4: The search with regex and != in the following example creates 5 events with Country="Canada" and 5 events
with City="Ontario", and filters on events where Country does not equal "Canada".
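One way to construct such a test search; the makeresults and append portions are illustrative:
| makeresults count=5 | eval Country="Canada" | append [| makeresults count=5 | eval City="Ontario"] | regex Country!="Canada"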
This search returns the union of two groups of events: events where the field Country is defined and has a value not equal
to "Canada"; and events where the field Country is not defined. As a result, 5 events are displayed for the City field,
even though a Country field was not defined for those events. Also, the Country field is displayed, but the values are null.
The results look something like this.
See also
Commands
rex
search
reltime
Description
Adds a new field to your search results, called reltime, and sets this field to a human readable value of the difference
between now and _time.
The human-readable values look like "5 days ago", "1 minute ago", "2 years ago", and so on.
Syntax
reltime
Usage
The reltime command returns relative times in seconds, minutes, hours, days and years. For example, 12 seconds ago.
The reltime command changes the time unit when a threshold has been passed. If the time difference between a
timestamp and "now" does not meet the next threshold the smaller time unit is used. For example, if you have a
timestamp and only 23 hours and 59 seconds have passed, the relative time displays hours instead of days. When exactly
24 hours have passed, the relative time still displays hours. Only when 24 hours and 1 second have passed will the
relative time display 1 day ago.
Examples
_time
2021-06-11 14:35:58
2021-06-10 14:35:58
2021-06-08 14:35:58
2021-04-12 14:35:58
2021-04-12 14:35:59
When you add the reltime command to the end of the search, a field is added to the events. The relative time difference
between the _time field and now is calculated and added to the new field.
_time reltime
2021-06-11 14:35:58 now
2021-06-10 14:35:58 1 day ago
See also
convert
rename
Description
Use the rename command to rename one or more fields. This command is useful for giving fields more meaningful names,
such as "Product ID" instead of "pid". If you want to rename fields with similar names, you can use a wildcard character.
See the Usage section.
Syntax
Required arguments
wc-field
Syntax: <string>
Description: The name of a field and the name to replace it. Field names with spaces must be enclosed in
quotation marks. You can use the asterisk ( * ) as a wildcard to specify a list of fields with similar names. For
example, if you want to specify all fields that start with "value", you can use a wildcard such as value*.
Usage
If both the source and destination fields are wildcard expressions with the same number of wildcards, the renaming will
carry over the wildcarded portions to the destination expression. See Examples.
You cannot rename one field with multiple names. For example, if you have field A, you cannot specify | rename A as B,
A as C. This rule also applies to other commands where you can rename fields, such as the stats command.
You cannot use the rename command to merge multiple fields into one field because null, or non-present, fields are
brought along with the values.
For example, if you have events with either product_id or pid fields, ... | rename pid AS product_id would not merge
the pid values into the product_id field. It overwrites product_id with Null values where pid does not exist for the event.
See the eval command and coalesce() function.
Suppose you rename fieldA to fieldB, but fieldA does not exist.
Examples
Example 1:
Example 2:
Example 3:
Rename the "count" field. Names with spaces must be enclosed in quotation marks.
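For example, where the new name is illustrative:
... | rename count AS "Count of Events"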
See also
replace
Description
Replaces a single occurrence of the first string with another string in the specified fields. If you do not specify one or
more fields, the value is replaced in all fields.
Syntax
Required arguments
wc-string
Syntax: <string>
Description: Specify one or more field values and their replacements. You can use wildcard characters to match
one or multiple terms.
Optional arguments
field-list
Syntax: <string> ...
Description: Specify a comma or space delimited list of one or more field names for the field value replacements.
To replace values on _internal fields, you must specify the field name with the IN <fieldname> clause.
Usage
Non-wildcard replacement values specified later take precedence over those replacements specified earlier. For a
wildcard replacement, fuller matches take precedence over lesser matches. To assure precedence relationships, you are
advised to split the replace into two separate invocations. When using wildcard replacements, the result must have the
same number of wildcards, or none at all. Wildcards ( * ) can be used to specify many values to replace, or replace values
with.
Examples
Change any host value that ends with "localhost" to simply "localhost" in all fields.
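A sketch of such a replacement:
... | replace *localhost WITH localhost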
2. Replace a value in a specific field
Replaces the values in the start_month and end_month fields. You can separate the names in the field list with spaces or
commas.
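For example, with illustrative month values:
... | replace aug WITH August IN start_month end_month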
In the host field, change the order of string values that contain the word localhost so that the string "localhost" precedes
the other strings.
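A sketch of such a reordering:
... | replace "* localhost" WITH "localhost *" IN host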
Replace the values in a field with more descriptive names. Separate the value replacements with a comma.
Search for an error message and replace empty strings with a whitespace.
This example will not work unless you have values that are actually the empty string, which is not the same as not
having a value.
"Error exporting to XYZ :" | rex "Error exporting to XYZ:(?.*)" | replace "" WITH " " IN errmsg
sourcetype=* | head 5 | eval _time="XYZ" | stats count BY _time | replace *XYZ* WITH *ALL* IN _time
See also
Commands
rename
require
Description
Causes a search to fail if the queries and commands that precede it in the search string return zero events or results.
Syntax
| require
Usage
When require is used in a search string, it causes the search to fail if the queries and commands that precede it in the
search string return zero events or results. When you use it in a subsearch, it causes the parent search to fail when the
subsearch fails to return results.
Use this command to prevent the Splunk platform from running zero-result searches when this might have certain
negative side effects, such as generating false positives, running custom search commands that make costly API calls, or
creating empty search filters via a subsearch.
Do not expect the require command to mitigate all possible negative consequences of a search. When the require
command causes a search to fail, it prevents subsequent commands in the search from receiving the results, but it does
not prevent the Splunk software from invoking those commands before the search is finalized. This means that those
subsequent search command processors may receive empty "chunks" before the search is finalized.
If you are implementing a custom search command, make sure it interoperates well with the require command. Ensure
that it avoids exhibiting side effects in response to partial input.
See Create custom search commands for apps in Splunk Cloud Platform or Splunk Enterprise in the Developer Guide on
the Developer Portal.
Examples
... | require
2. Raise an exception if the subsearch returns zero events or results, and stop the parent search.
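A sketch of such a search; the index, criteria, and field names are illustrative:
error [ search index=asset_inventory category=critical | fields host | require ]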
rest
Description
The rest command reads a Splunk REST API endpoint and returns the resource data as a search result.
For information about the REST API, see the REST API User Manual.
Syntax
| rest <rest-uri>
[count=<int>]
[strict=<bool>]
[splunk_server=<wc-string>]
[splunk_server_group=<wc-string>]...
[timeout=<int>]
[<get-arg-name>=<get-arg-value>]...
Required arguments
rest-uri
Syntax: <uri>
Description: URI path to the Splunk REST API endpoint.
Optional arguments
count
Syntax: count=<int>
Description: Limits the number of results returned from each REST call. For example, you have four indexers
and one search head. You set the limit to count=25000. This results in a total limit of 125000, which is 25000 x 5.
When count=0, there is no limit.
Default: 0
get-arg-name
Syntax: <string>
Description: REST argument name.
get-arg-value
Syntax: <string>
Description: REST argument value.
splunk_server
Syntax: splunk_server=<wc-string>
Description: Specifies the distributed search peer from which to return results. You can specify only one
splunk_server argument. However, you can use a wildcard character when you specify the server name to
indicate multiple servers. For example, you can specify splunk_server=peer01 or splunk_server=peer*. Use
local to refer to the search head.
Default: All configured search peers return information
splunk_server_group
Syntax: splunk_server_group=<wc-string>...
Description: Limits the results to one or more server groups. You can specify a wildcard character in the string to
indicate multiple server groups.
strict
Syntax: strict=<bool>
Description: When set to true this argument forces the search to fail completely if rest raises an error. This
happens even when the errors apply to a subsearch. When set to false, many rest error conditions return
warning messages but do not otherwise cause the search to fail. Certain error conditions cause the search to fail
even when strict=false.
Default: false
timeout
Syntax: timeout=<int>
Description: Specify the timeout, in seconds, to wait for the REST endpoint to respond. Specify timeout=0 to
indicate no limit on the time to wait for the REST endpoint to respond.
Default: 60
Usage
The rest command authenticates using the ID of the person that runs the command.
Use the strict argument to make rest searches fail whenever they encounter an error condition. You can set this at the
system level for all rest searches by changing restprocessor_errors_fatal in limits.conf.
If you use Splunk Cloud Platform, file a Support ticket to change the restprocessor_errors_fatal setting.
Use the strict argument to override the restprocessor_errors_fatal setting for a rest search.
Examples
2. Find all saved searches with searches that include a specific sourcetype
Find all saved searches with search strings that include the speccsv sourcetype.
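A sketch of such a rest search, using the saved searches endpoint across all namespaces:
| rest /servicesNS/-/-/saved/searches | search search="*speccsv*"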
Add current search user to all events. This is useful for creating reports that only show events associated with the logged
in user.
4. Use the GET method pagination and filtering parameters
To determine if an endpoint supports these parameters, find the endpoint in the REST API Reference manual. Click
Expand on the GET method and look for a link to the Pagination and filtering parameters topic. For more information
about the Pagination and filtering parameters, see the Request and response details in the REST API Reference manual.
The following example uses the search parameter to identify if a search is scheduled and if a search is disabled. The
search looks for scheduled searches on Splunk servers that match the Monitoring Console role of "search heads".
Look only at Splunk servers that match the Monitoring Console role of "search heads".
splunk_server_group=dmc_group_search_head
return
Description
The return command is used to pass values up from a subsearch. The command replaces the incoming events with one
event, with one attribute: "search". To improve performance, the return command automatically limits the number of
incoming results with the head command and the resulting fields with the fields command.
By default, the return command uses only the first row of results. Use the count argument to specify the number of
results to use.
Syntax
Required arguments
None.
Optional arguments
<count>
Syntax: <int>
Description: Specify the number of rows.
Default: 1, which is the first row of results passed into the command.
<alias>
Syntax: <alias>=<field>...
Description: Specify the field alias and value to return. You can specify multiple pairs of aliases and values,
separated by spaces.
<field>
Syntax: <field>...
Description: Specify one or more fields to return, separated by spaces.
<$field>
Syntax: <$field>
Description: Specify one or more field values to return, separated by spaces.
Usage
The command is convenient for outputting a field name, an alias-value pair, or just a field value.
Output Example
Field name return source
In most cases, using the return command at the end of a subsearch removes the need for head, fields, rename, format,
and dedup.
Duplicate values
Suppose you specify a count of 2, such as return 2 user. You might logically expect the command to return the first two distinct users. Instead, the command looks at the first two
events, based on the ordering from the implied head command. The return command returns the users within those two
events. The command does not determine if the user value is unique. If the same user is listed in these events, the
command returns only the one user.
To return unique values, you need to include the dedup command in your search. For example:
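A sketch with illustrative subsearch criteria:
error [ search sourcetype=secure | dedup user | return 2 user ]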
Quotations in returned fields
The return command does not escape quotation marks that are in the fields that are returned. You must use an eval
command to escape the quotation marks before you use the return command. For example:
Examples
Example 1:
Search for 'error ip=<someip>', where <someip> is the most recent ip used by user 'boss'.
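A sketch consistent with this description; the subsearch criteria are illustrative:
error [ search login user=boss | return ip ]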
Example 2:
Search for 'error (user=user1 ip=ip1) OR (user=user2 ip=ip2)', where the users and IPs come from the two
most-recent logins.
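A sketch consistent with this description:
error [ search login | return 2 user ip ]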
Example 3:
See also
format, search
reverse
Description
The reverse command reverses the order of the search results. It does not affect which results are returned by the search, only the order in which the results are
displayed. For the CLI, this includes any default or explicit maxout setting.
On very large result sets, which means sets with millions of results or more, the reverse command requires large amounts
of temporary storage, I/O, and time.
Syntax
reverse
Usage
Examples
Example 1:
... | reverse
See also
Commands
head
sort
tail
rex
Description
Use this command to either extract fields using regular expression named groups, or replace or substitute characters in a
field using sed expressions.
The rex command matches the value of the specified field against the unanchored regular expression and extracts the
named groups into fields of the corresponding names.
When mode=sed, the given sed expression used to replace or substitute characters is applied to the value of the chosen
field. This sed-syntax is also used to mask sensitive data at index-time. Read about using sed to anonymize data in the
Getting Data In Manual.
If a field is not specified, the regular expression or sed expression is applied to the _raw field. Running the rex
command against the _raw field might have a performance impact.
Use the rex command for search-time field extraction or string replacement and character substitution.
Syntax
rex [field=<field>]
( <regex-expression> [max_match=<int>] [offset_field=<string>] ) | (mode=sed <sed-expression>)
Required arguments
regex-expression
Syntax: "<string>"
Description: The PCRE regular expression that defines the information to match and extract from the specified
field. Quotation marks are required.
mode
Syntax: mode=sed
Description: Specify to indicate that you are using a sed (UNIX stream editor) expression.
sed-expression
Syntax: "<string>"
Description: When mode=sed, specify whether to replace strings (s) or substitute characters (y) in the matching
regular expression. No other sed commands are implemented. Quotation marks are required. Sed mode supports
the following flags: global (g) and Nth occurrence (N), where N is a number that is the character location in the
string.
Optional arguments
field
Syntax: field=<field>
Description: The field that you want to extract information from.
Default: _raw
max_match
Syntax: max_match=<int>
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields are
multivalued fields. Use 0 to specify unlimited matches. Multiple matches apply to the repeated application of the
whole pattern. If your regex contains a capture group that can match multiple times within your pattern, only the
last capture group is used for multiple matches.
Default: 1
offset_field
Syntax: offset_field=<string>
Description: Creates a field that lists the position of certain values in the field argument, based on the regular
expression specified in regex-expression. For example, if the rex expression is "(?<tenchars>.{10})" the first
ten characters of the field argument are matched. The offset_field shows tenchars=0-9. The offset calculation
always uses zero ( 0 ) for the first position. For another example, see Examples.
Default: No default
Usage
Use the rex command to either extract fields using regular expression named groups, or replace or substitute characters
in a field using sed expressions.
Use the regex command to remove results that do not match the specified regular expression.
Regular expressions
When you use regular expressions in searches, you need to be aware of how characters such as pipe ( | ) and backslash
( \ ) are handled. See SPL and regular expressions in the Search Manual.
For general information about regular expressions, see Splunk Enterprise regular expressions in the Knowledge Manager
Manual.
Sed expressions
When using the rex command in sed mode, you have two options: replace (s) or character substitution (y).
The syntax for using sed to replace (s) text in your data is: "s/<regex>/<replacement>/<flags>", where the flags can be g for a global replacement or a number N to replace the Nth match.
The syntax for using sed to substitute (y) characters is: "y/<string1>/<string2>/"
• This substitutes the characters that match <string1> with the characters in <string2>.
Examples
Extract email values from events to create from and to fields in your events. For example, you have events such as:
Mon Mar 19 20:16:27 2018 Info: Bounced: DCID 8413617 MID 19338947 From: <[email protected]> To:
<[email protected]> RID 0 - 5.4.7 - Delivery expired (message too old) ('000', ['timeout'])
Mon Mar 19 20:16:03 2018 Info: Delayed: DCID 8414309 MID 19410908 From: <[email protected]> To:
<[email protected]> RID 0 - 4.3.2 - Not accepting messages at this time ('421', ['4.3.2 try again
later'])
Mon Mar 19 20:16:02 2018 Info: Bounced: DCID 0 MID 19408690 From: <[email protected]> To:
<[email protected]> RID 0 - 5.1.2 - Bad destination host ('000', ['DNS Hard Error looking up
mahidnrasatyambsg.com (MX): NXDomain'])
Mon Mar 19 20:15:53 2018 Info: Delayed: DCID 8414166 MID 19410657 From: <[email protected]> To:
<[email protected]> RID 0 - 4.3.2 - Not accepting messages at this time ('421', ['4.3.2 try again
later'])
When the events were indexed, the From and To values were not identified as fields. You can use the rex command to
extract the field values and create from and to fields in your search results.
The from and to lines in the _raw events follow an identical pattern. Each from line is From: and each to line is To:. The
email addresses are enclosed in angle brackets. You can use this pattern to create a regular expression to extract the
values and create the fields.
source="cisco_esa.txt" | rex field=_raw "From: <(?<from>.*)> To: <(?<to>.*)>"
You can remove duplicate values and return only the list of address by adding the dedup and table commands to the
search.
source="cisco_esa.txt" | rex field=_raw "From: <(?<from>.*)> To: <(?<to>.*)>" | dedup from to | table from
to
You can use the max_match argument to specify that the regular expression runs multiple times to extract multiple values
from a field.
For example, use the makeresults command to create a field with multiple values:
_time test
2019-12-05 11:15:28 a$1,b$2
To extract each of the values in the test field separately, you use the max_match argument with the rex command. For
example:
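One way to write such an extraction; the group names key and value are illustrative:
| makeresults | eval test="a$1,b$2" | rex field=test max_match=0 "(?<key>\w+)\$(?<value>\d+)"
With max_match=0, the key and value fields become multivalued fields containing a, b and 1, 2, respectively.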
3. Extract values from a field in scheduler.log events
Extract "user", "app" and "SavedSearchName" from a field called "savedsearch_id" in scheduler.log events. If
savedsearch_id=bob;search;my_saved_search then user=bob , app=search and SavedSearchName=my_saved_search
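A sketch of such an extraction:
... | rex field=savedsearch_id "(?<user>\w+);(?<app>\w+);(?<SavedSearchName>\w+)"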
Use sed syntax to match the regex to a series of numbers and replace them with an anonymized string.
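A sketch of such a replacement, assuming an illustrative ccnumber field that holds dash-separated card numbers:
... | rex field=ccnumber mode=sed "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g"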
This example shows how to use the rex command sed expression with capture replace using \1, \2 to reuse captured
pieces of a string.
This search creates an event with three fields, _time, search, and orig_search. The regular expression removes the
quotation marks and any leading or trailing spaces around the quotation marks.
To identify the position of certain values in a field, use the rex command with the offset_field argument and a regular
expression.
The following example starts with the makeresults command to create a field with a value:
_time foo
2021-05-21 11:36:57 abcdefghijklmnopqrstuvwxyz
Add the rex command with the offset_field argument to the search to create a field called bar. You can identify the
position of the first five values in the field foo using the regular expression "(?<firstfive>abcde)". For example:
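A sketch of the complete search, based on the makeresults example above:
| makeresults | eval foo="abcdefghijklmnopqrstuvwxyz" | rex field=foo offset_field=bar "(?<firstfive>abcde)"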
_time bar firstfive foo
2021-05-21 11:36:57 firstfive=0-4 abcde abcdefghijklmnopqrstuvwxyz
You can identify the position of several of the middle values in the field foo using the regular expression
"(?<middle>fgh)". For example:
sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port \d+)" | top src_ip ports showperc=0
This search uses rex to extract the ports field and values. Then, it displays a table of the top source IP addresses (src_ip)
and ports returned with the search, for potential attackers.
See also
rtorder
Description
Buffers events from real-time search to emit them in ascending time order when possible.
The rtorder command creates a streaming event buffer that takes input events, stores them in the buffer in ascending
time order, and emits them in that order from the buffer. This is only done after the current time reaches at least the span
of time given by buffer_span, after the timestamp of the event.
Events are also emitted from the buffer if the maximum size of the buffer is exceeded.
If an event is received as input that is earlier than an event that has already been emitted previously, the out of order
event is emitted immediately unless the discard option is set to true. When discard is set to true, out of order events are
always discarded to assure that the output is strictly in time ascending order.
Syntax
Optional arguments
buffer_span
Syntax: buffer_span=<span-length>
Description: Specify the length of the buffer.
Default: 10 seconds
discard
Syntax: discard=<bool>
Description: Specifies whether or not to always discard out-of-order events.
Default: false
max_buffer_size
Syntax: max_buffer_size=<int>
Description: Specifies the maximum size of the buffer.
Default: 50000, or the max_result_rows setting of the [search] stanza in limits.conf.
Examples
Example 1:
Keep a buffer of the last 5 minutes of events, emitting events in ascending time order once they are more than 5 minutes
old. Newly received events that are older than 5 minutes are discarded if an event after that time has already been
emitted.
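A sketch of such a search:
... | rtorder discard=t buffer_span=5m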
See also
sort
run
The run command is an alias for the script command. See the script command for the syntax and examples.
savedsearch
Description
Runs a saved search, or report, and returns the search results of a saved search. If the search contains replacement
placeholder terms, such as $replace_me$, the search processor replaces the placeholders with the strings you specify.
For example:
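For instance, mirroring Example 2 later in this topic:
| savedsearch mysearch replace_me="value"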
Syntax
Required arguments
savedsearch_name
Syntax: <string>
Description: Name of the saved search to run.
Optional arguments
savedsearch-options
Syntax: <substitution-control> | <replacement>
Description: Specify whether substitutions are allowed. If allowed, specify the key-value pair to use in the string
substitution replacement.
substitution-control
Syntax: nosubstitution=<bool>
Description: If true, no string substitution replacements are made.
Default: false
replacement
Syntax: <field>=<string>
Description: A key-value pair to use in string substitution replacement.
Usage
The savedsearch command is a generating command and must start with a leading pipe character.
The savedsearch command always runs a new search. To reanimate the results of a previously run search, use the
loadjob command.
When the savedsearch command runs a saved search, the command always applies the permissions associated with the
role of the person running the savedsearch command to the search. The savedsearch command never applies the
permissions associated with the role of the person who created and owns the search to the search. This happens even
when a saved search has been set up to run as the report owner.
See Determine whether to run reports as the report owner or user in the Reporting Manual.
Time ranges
• If you specify All Time in the time range picker, the savedsearch command uses the time range that was saved
with the saved search.
• If you specify any other time in the time range picker, the time range that you specify overrides the time range that
was saved with the saved search.
Examples
Example 1
| savedsearch mysecurityquery
Example 2
Run the saved search "mysearch". Where the replacement placeholder term $replace_me$ appears in the saved search,
use "value" instead.
See also
search, loadjob
script
Description
Calls an external Python program that can modify or generate search results.
Splunk Enterprise
Scripts must be declared in the commands.conf file and be located in the
$SPLUNK_HOME/etc/apps/<app_name>/bin/ directory. The script is executed using $SPLUNK_HOME/bin/python.
Syntax
Required arguments
script-name
Syntax: <string>
Description: The name of the scripted search command to run, as defined in the commands.conf file.
Optional arguments
maxinputs
Syntax: maxinputs=<int>
Description: Specifies how many of the input results are passed to the script per invocation of the command. The
script command is invoked repeatedly in increments according to the maxinputs argument until the search is
complete and all of the results have been displayed. Do not change the value of maxinputs unless you know what
you are doing.
Default: 50000
script-arg
Syntax: <string> ...
Description: One or more arguments to pass to the script. If you are passing multiple arguments, delimit each
argument with a space.
Usage
The script command is effectively an alternative way to invoke custom search commands. See Create custom search
commands for apps in Splunk Cloud Platform or Splunk Enterprise in the Developer Guide on the Developer Portal.
| script commandname
is synonymous with
| commandname
Some functions of the script command have been removed over time. The explicit choice of Perl or Python as an
argument is no longer functional and such an argument is ignored. If you need to write Perl search commands, you
must declare them as Perl in the commands.conf file. This is not recommended, as you need to determine a number of
underdocumented things about the input and output formats. Additionally, support for explicit filename reference for
scripts in the etc/searchscripts directory has been removed. All search commands must now be declared in the
commands.conf file.
Examples
Example 1:
Run the Python script "myscript" with arguments, myarg1 and myarg2; then, email the results.
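A sketch of such a search; the email address is illustrative:
... | script myscript myarg1 myarg2 | sendemail to="admin@example.com"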
scrub
Description
Anonymizes the search results by replacing identifying data - usernames, ip addresses, domain names, and so forth -
with fictional values that maintain the same word length. For example, it might turn the string [email protected]
into [email protected]. This lets Splunk users share log data without revealing confidential or personal
information.
Syntax
Required arguments
None
Optional arguments
public-terms
Syntax: public-terms=<filename>
Description: Specify a filename that includes the public terms NOT to anonymize.
private-terms
Syntax: private-terms=<filename>
Description: Specify a filename that includes the private terms to anonymize.
name-terms
Syntax: name-terms=<filename>
Description: Specify a filename that includes the names to anonymize.
dictionary
Syntax: dictionary=<filename>
Description: Specify a filename that includes a dictionary of terms NOT to anonymize, unless those terms are in
the private-terms file.
timeconfig
Syntax: timeconfig=<filename>
Description: Specify a filename that includes the time configurations to anonymize.
namespace
Syntax: namespace=<string>
Description: Specify an application that contains the alternative files to use for anonymizing, instead of using the
built-in anonymizing files.
Usage
By default, the scrub command uses the dictionary and configuration files that are located in the
$SPLUNK_HOME/etc/anonymizer directory. These default files can be overridden by specifying arguments to the scrub
command. The arguments exactly correspond to the settings in the splunk anonymize CLI command. For details, issue
the splunk help anonymize command.
You can add your own versions of the configuration files to the default location.
Alternatively, you can specify an application where you maintain your own copy of the dictionary and configuration files.
To specify the application, use the namespace=<string> argument, where <string> is the name of the application that
corresponds to the name that appears in the path $SPLUNK_HOME/etc/apps/<app>/anonymizer.
If the $SPLUNK_HOME/etc/apps/<app>/anonymizer directory does not exist, the Splunk software looks for the files in the
$SPLUNK_HOME/etc/slave-apps/<app>/anonymizer directory.
The scrub command anonymizes all attributes, except those that start with an underscore ( _ ), with the exception of _raw, which is anonymized, or that start with
date_. Additionally, the following attributes are not anonymized: eventtype, linecount, punct, sourcetype, timeendpos,
timestartpos.
The scrub command adheres to the default maxresultrows limit of 50000 results. This setting is documented in the
limits.conf file in the [searchresults] stanza. See limits.conf in the Admin Manual.
Examples
... | scrub
2. Anonymize the current search results using the specified private-terms file.
This search uses the abc_private-terms file that is located in the $SPLUNK_HOME/etc/anonymizer directory.
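A sketch of such a search:
... | scrub private-terms=abc_private-terms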
search
Description
Use the search command to retrieve events from indexes or filter the results of a previous search command in the
pipeline. You can retrieve events from your indexes, using keywords, quoted phrases, wildcards, and field-value
expressions. The search command is implied at the beginning of any search. You do not need to specify the search
command at the beginning of your search criteria.
You can also use the search command later in the search pipeline to filter the results from the previous command in the
pipeline.
The search command can also be used in a subsearch. See about subsearches in the Search Manual.
After you retrieve events, you can apply commands to transform, filter, and report on the events. Use the vertical bar ( | ) ,
or pipe character, to apply a command to the retrieved events.
The search command supports IPv4 and IPv6 addresses and subnets that use CIDR notation.
Syntax
search <logical-expression>
Required arguments
<expression>
Syntax: <logical-expression> | <time-opts> | <search-modifier> | NOT <logical-expression> | <index-expression>
| <comparison-expression> | <logical-expression> [OR] <logical-expression>
Description: Includes all keywords or field-value pairs used to describe the events to retrieve from the index.
Include parenthesis as necessary. Use Boolean expressions, comparison operators, time modifiers, search
modifiers, or combinations of expressions for this argument.
The AND operator is always implied between terms and expressions. For example, web error is the same as web
AND error. Specifying clientip=192.0.2.255 earliest=-1h@h is the same as clientip=192.0.2.255 AND
earliest=-1h@h. So unless you want to include it for clarity reasons, you do not need to specify the AND operator.
Logical expression options
<comparison-expression>
Syntax: <field><comparison-operator><value> | <field> IN (<value-list>)
Description: Compare a field to a literal value or provide a list of values that can appear in the field.
<index-expression>
Syntax: "<string>" | <term> | <search-modifier>
Description: Describe the events you want to retrieve from the index using literal strings and search modifiers.
<time-opts>
Syntax: [<timeformat>] (<time-modifier>)...
Description: Describe the format of the starttime and endtime terms of the search. See Time options.
<comparison-operator>
Syntax: = | != | < | <= | > | >=
Description: You can use comparison operators when searching field/value pairs. Comparison expressions with
the equal ( = ) or not equal ( != ) operator compare string values. For example, "1" does not match "1.0".
Comparison expressions with greater than or less than operators < > <= >= numerically compare two numbers
and lexicographically compare other values. See Usage.
<field>
Syntax: <string>
Description: The name of a field.
<value>
Syntax: <literal-value>
Description: In comparison-expressions, the literal number or string value of a field.
<value-list>
Syntax: (<literal-value>, <literal-value>, ...)
Description: Used with the IN operator to specify two or more values. For example use error IN (400, 402,
404, 406) instead of error=400 OR error=402 OR error=404 OR error=406
<string>
Syntax: "<string>"
Description: Specify keywords or quoted phrases to match. When searching for strings and quoted strings
(anything that's not a search modifier), Splunk software searches the _raw field for the matching events or results.
<search-modifier>
Syntax: <sourcetype-specifier> | <host-specifier> | <hosttag-specifier> | <source-specifier> |
<savedsplunk-specifier> | <eventtype-specifier> | <eventtypetag-specifier> | <splunk_server-specifier>
Description: Search for events from specified fields or field tags. For example, search for one or a combination of
hosts, sources, source types, saved searches, and event types. Also, search for the field tag, with the format:
tag::<field>=<string>.
◊ Read more about searching with default fields in the Knowledge Manager manual.
◊ Read more about using tags and field aliases in the Knowledge Manager manual.
<sourcetype-specifier>
Syntax: sourcetype=<string>
Description: Search for events from the specified sourcetype field.
<host-specifier>
Syntax: host=<string>
Description: Search for events from the specified host field.
<hosttag-specifier>
Syntax: hosttag=<string>
Description: Search for events that have hosts that are tagged by the string.
<eventtype-specifier>
Syntax: eventtype=<string>
Description: Search for events that match the specified event type.
<eventtypetag-specifier>
Syntax: eventtypetag=<string>
Description: Search for events that would match all eventtypes tagged by the string.
<savedsplunk-specifier>
Syntax: savedsearch=<string> | savedsplunk=<string>
Description: Search for events that would be found by the specified saved search.
<source-specifier>
Syntax: source=<string>
Description: Search for events from the specified source field.
<splunk_server-specifier>
Syntax: splunk_server=<string>
Description: Search for events from a specific server. Use "local" to refer to the search head.
Time options
<timeformat>
Syntax: timeformat=<string>
Description: Set the time format for starttime and endtime terms.
Default: timeformat=%m/%d/%Y:%H:%M:%S.
<time-modifier>
Syntax: starttime=<string> | endtime=<string> | earliest=<time_modifier> | latest=<time_modifier>
Description: Specify start and end times using relative or absolute time.
You can also use the earliest and latest attributes to specify absolute and relative time ranges for your search.
For more about this time modifier syntax, see Specify time modifiers in your search in the Search Manual.
starttime
Syntax: starttime=<string>
Description: Events must be later or equal to this time. Must match timeformat.
endtime
Syntax: endtime=<string>
Description: All events must be earlier or equal to this time.
Usage
The search command is an event-generating command when it is the first command in the search, before the first pipe.
When the search command is used further down the pipeline, it is a distributable streaming command. See Command
types.
When search is the first command in the search, you can use terms such as keywords, phrases, fields, boolean
expressions, and comparison expressions to specify exactly which events you want to retrieve from Splunk indexes. If you
don't specify a field, the search looks for the terms in the _raw field.
• keywords: error login, which is the same as specifying error AND login
• quoted phrases: "database error"
• boolean operators: login NOT (error OR fail)
• wildcards: fail*
• field-value pairs: status=404, status!=404, or status>200
To search field values that are SPL operators or keywords, such as country=IN, country=AS, iso=AND, or state=OR,
you must enclose the operator or keyword in quotation marks. For example: country="IN".
In addition to the implied search command at the beginning of all searches, you can use the search command later in the
search pipeline. The search terms that you can use depend on which fields are passed into the search command.
If the _raw field is passed into the search command, you can use the same types of search terms as you can when the
search command is the first command in a search.
However, if the _raw field is not passed into the search command, you must specify field-values pairs that match the fields
passed into the search command. Transforming commands, such as stats and chart, do not pass the _raw field to the
next command in the pipeline.
Boolean expressions
The order in which Boolean expressions are evaluated with the search command is:
1. Expressions within parentheses
2. NOT clauses
3. OR clauses
4. AND clauses
This evaluation order is different than the order used with the where command. The where command evaluates AND
clauses before OR clauses.
To compare two fields, do not specify index=myindex fieldA=fieldB or index=myindex fieldA!=fieldB with the search
command. When specifying a comparison_expression, the search command expects a <field> compared with a <value>.
The search command interprets fieldB as the value, and not as the name of a field.
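Use the where command instead to compare two fields, for example:
index=myindex | where fieldA=fieldB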
For not equal comparisons, you can likewise use the where command and specify the criteria in several ways, such as where fieldA!=fieldB or where NOT fieldA=fieldB.
Use the IN operator when you want to determine if a field contains one of several values.
When used with the search command, you can use a wildcard character in the list of values for the IN operator. For
example:
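A sketch with illustrative values:
... | search status IN (4*, 5*)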
You can use the NOT operator with the IN operator. For example:
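A sketch with illustrative values:
... | search NOT status IN (400, 404)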
There is also an IN function that you can use with the eval and where commands. Wild card characters are not allowed in
the values list when the IN function is used with the eval and where commands. See Comparison and Conditional
functions.
CIDR matching
The search command can perform a CIDR match on a field that contains IPv4 and IPv6 addresses.
Suppose the ip field contains these values:
10.10.10.12
50.10.10.17
10.10.10.23
If you specify ip="10.10.10.0/24", the search returns the events with the first and last values: 10.10.10.12 and
10.10.10.23.
Lexicographical order
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
You can specify a custom sort order that overrides the lexicographical order. See the blog Order Up! Custom Sort Orders.
In general, you need quotation marks around phrases and field values that include white spaces, commas, pipes,
quotations, and brackets. Quotation marks must be balanced. An opening quotation must be followed by an unescaped
closing quotation. For example:
• A search such as error | stats count will find the number of events containing the string error.
• A search such as ... | search "error | stats count" would return the raw events containing error, a pipe,
stats, and count, in that order.
Additionally, you want to use quotation marks around keywords and phrases if you do not want to search for their default
meaning, such as Boolean operators and field/value pairs. For example:
• A search for the keyword AND without meaning the Boolean operator: error "AND"
• A search for this field/value phrase: error "startswith=foo"
The backslash character ( \ ) is used to escape quotes, pipes, and itself. Backslash escape sequences are still expanded
inside quotation marks. For example:
• The sequence \| as part of a search will send a pipe character to the command, instead of having the pipe split
between commands.
• The sequence \" will send a literal quotation mark to the command, for example for searching for a literal quotation
mark or inserting a literal quotation mark into a field using rex.
• The \\ sequence will be available as a literal backslash in the command.
• For example \s in a search string will be available as \s to the command, because \s is not a known escape
sequence.
• However, in the search string \\s will be available as \s to the command, because \\ is a known escape sequence
that is converted to \.
You can use the TERM() directive to force Splunk software to match whatever is inside the parentheses as a single term
in the index. TERM is more useful when the term contains minor segmenters, such as periods, and is bounded by major
segmenters, such as spaces or commas. In fact, TERM does not work for terms that are not bounded by major breakers.
See Use CASE and TERM to match phrases in the Search Manual.
You can use the CASE() directive to search for terms and field values that are case-sensitive.
See Use CASE and TERM to match phrases in the Search Manual.
Examples
These examples demonstrate how to use the search command. You can find more examples in the Start Searching topic
of the Search Tutorial.
This example demonstrates field-value pair matching for specific values of source IP (src) and destination IP (dst).
src="10.9.165.*" OR dst="10.9.165.8"
This example demonstrates field-value pair matching with boolean and comparison operators. Search for events with
code values of either 10 or 29, and any host that isn't "localhost", and an xqp value that is greater than 5.
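A sketch of such a search:
(code=10 OR code=29) host!="localhost" xqp>5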
In this example you could also use the IN operator since you are specifying two field-value pairs on the same field. The
revised search is:
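For instance, one way to write it:
code IN(10, 29) host!="localhost" xqp>5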
3. Using wildcards
This example demonstrates field-value pair matching with wildcards. Search for events from all the web servers that have
an HTTP client or server error status.
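A sketch of such a search, consistent with the revised IN form shown below:
host=webserver* (status=4* OR status=5*)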
In this example you could also use the IN operator since you are specifying two field-value pairs on the same field. The
revised search is:
host=webserver* status IN(4*, 5*)
This example shows how to use the IN operator to specify a list of field-value pair matchings. In the events from an
access.log file, search the action field for the values addtocart or purchase.
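A sketch with an illustrative sourcetype:
sourcetype=access_* action IN (addtocart, purchase)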
This example uses the search command twice. The search command is implied at the beginning of every search with the
criteria eventtype=web-traffic. The search command is used again later in the search pipeline to filter out the results.
This search defines a web session using the transaction command and searches for the user sessions that contain more
than three events.
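A sketch of such a search; the transaction parameters (clientip, startswith, endswith) are illustrative:
eventtype=web-traffic | transaction clientip startswith="login" endswith="logout" | search eventcount > 3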
Searching with the Boolean "NOT" comparison operator is not the same as using the "!=" comparison.
The following search returns everything except fieldA="value2", including all other fields.
NOT fieldA="value2"
The following search returns events where fieldA exists and does not have the value "value2".
fieldA!="value2"
If you use a wildcard for the value, NOT fieldA=* returns events where fieldA is null or undefined, and fieldA!=* never
returns any events.
You can use the search command to match IPv4 and IPv6 addresses and subnets that use CIDR notation. For example,
this search identifies whether the specified IPv4 address is located in the subnet.
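A sketch of such a search; the subnet value is illustrative:
| makeresults | eval ip="192.0.2.56" | search ip="192.0.2.0/24"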
The IP address is located in the subnet, so search displays it in the search results, which look like this.
time ip
2020-11-19 16:43:31 192.0.2.56
Note that you can get identical results using the eval command with the cidrmatch("X",Y) function, as shown in this
example.
Alternatively, if you're using IPv6 addresses, you can use the search command to identify whether the specified IPv6
address is located in the subnet.
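A sketch of such a search; the subnet value is illustrative:
| makeresults | eval ip="2001:0db8:ffff:ffff:ffff:ffff:ffff:ff99" | search ip="2001:0db8::/32"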
The IP address is in the subnet, so the search results look like this.
time ip
2020-11-19 16:43:31 2001:0db8:ffff:ffff:ffff:ffff:ffff:ff99
See also
Commands
iplocation
lookup
Functions
cidrmatch
searchtxn
Description
Efficiently returns transaction events that match a transaction type and contain specific text.
For Splunk Cloud Platform, you must create a private app that contains your transaction type definitions. If you are a
Splunk Cloud administrator with experience creating private apps, see Manage private apps in your Splunk Cloud
Platform deployment in the Splunk Cloud Admin Manual. If you have not created private apps, contact your Splunk
account representative for help with this customization.
Syntax
Required arguments
<transaction-name>
Syntax: <transactiontype>
Description: The name of the transaction type stanza that is defined in transactiontypes.conf.
<search-string>
Syntax: <string>
Description: Terms to search for within the transaction events.
Optional arguments
eventsonly
Syntax: eventsonly=<bool>
Description: If true, retrieves only the relevant events but does not run the "| transaction" command.
Default: false
max_terms
Syntax: maxterms=<int>
Description: Integer between 1-1000 which determines how many unique field values all fields can use. Using
smaller values speeds up search, favoring more recent values.
Default: 1000
use_disjunct
Syntax: use_disjunct=<bool>
Description: Specifies if each term in <search-string> should be processed as if separated by an OR operator on
the initial search.
Default: true
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Transactions
The command works only for transactions bound together by particular field values, not by ordering or time constraints.
Suppose you have a <transactiontype> stanza in the transactiontypes.conf file called "email". The stanza contains
the following settings.
• fields=qid, pid
• search=sourcetype=sendmail_syslog to=root
The searchtxn command finds all of the events that match sourcetype="sendmail_syslog" to=root.
From those results, all fields that contain a qid or pid are used to further search for relevant transaction events.
When no additional qid or pid values are found, the resulting search is run:
Examples
Example 1:
See also
transaction
selfjoin
Description
Join search result rows with other search result rows in the same result set, based on one or more fields that you specify.
Syntax
Required arguments
<field-list>
Syntax: <field>...
Description: The field or list of fields to join on.
Optional arguments
<selfjoin-options>
Syntax: overwrite=<bool> | max=<int> | keepsingle=<bool>
Description: Options that control the search result set that is returned. You can specify one or more of these
options.
Selfjoin options
keepsingle
Syntax: keepsingle=<bool>
Description: Controls whether or not to retain results that have a unique value in the join fields. When
keepsingle=true search results that have no other results to join with are kept in the output.
Default: false
max
Syntax: max=<int>
Description: Indicates the maximum number of 'other' results to join with each main result. If max=0, there is no
limit. This argument sets the maximum for the 'other' results. The maximum number of main results is 100,000.
Default: 1
overwrite
Syntax: overwrite=<bool>
Description: When overwrite=true, causes fields from the 'other' results to overwrite fields of the main results.
The main results are used as the basis for the join.
Default: true
Usage
Self joins are more commonly used with relational database tables. They are used less commonly with event data.
An example of an events use case is with events that contain information about processes, where each process has a
parent process ID. You can use the selfjoin command to correlate information about a process with information about
the parent process.
See the Extended example.
Basic example
... | selfjoin id
Extended example
The following example shows how the selfjoin command works against a simple set of results. You can follow along with
this example on your own Splunk instance.
This example builds a search incrementally. With each addition to the search, the search is rerun and the impact of the
additions are shown in a results table. The values in the _time field change each time you rerun the search. However,
in this example the values in the results table are not changed so that we can focus on how the changes to the search
impact the results.
1. Use the makeresults command to create five results, each with the same timestamp.
| makeresults count=5
_time
2018-01-18 14:38:59
2018-01-18 14:38:59
2018-01-18 14:38:59
2018-01-18 14:38:59
2018-01-18 14:38:59
2. To keep better track of each result, use the streamstats command to add a field that numbers each result.
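| makeresults count=5 | streamstats count as a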
_time a
2018-01-18 14:38:59 1
2018-01-18 14:38:59 2
2018-01-18 14:38:59 3
2018-01-18 14:38:59 4
2018-01-18 14:38:59 5
3. Additionally, use the eval command to change the timestamps to be 60 seconds apart. Different timestamps makes this
example more realistic.
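| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)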
_time a
2018-01-18 14:38:59 1
2018-01-18 14:39:59 2
2018-01-18 14:40:59 3
2018-01-18 14:41:59 4
2018-01-18 14:42:59 5
4. Next use the eval command to create a field to use as the field to join the results on.
| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a) | eval joiner="x"
_time a joiner
2018-01-18 14:38:59 1 x
2018-01-18 14:39:59 2 x
2018-01-18 14:40:59 3 x
2018-01-18 14:41:59 4 x
2018-01-18 14:42:59 5 x
5. Use the eval command to create some fields with data.
An if function is used with a modulo (modulus) operation to add different data to each of the new fields. A modulo
operation finds the remainder after the division of one number by another number:
• The eval b command processes each result and performs a modulo operation. If the remainder of a/2 is 0, put
"something" into the field "b", otherwise put "nada" into field "b".
• The eval c command processes each result and performs a modulo operation. If the remainder of a/2 is 1, put
"something else" into the field "c", otherwise put nothing (NULL) into field "c".
| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a) | eval joiner="x" | eval b =
if(a%2==0,"something","nada"), c = if(a%2==1,"somethingelse",null())
The new fields are added and the fields are arranged in alphabetical order by field name, except for the _time field.
_time a b c joiner
2018-01-18 14:38:59 1 nada somethingelse x
2018-01-18 14:39:59 2 something x
2018-01-18 14:40:59 3 nada somethingelse x
2018-01-18 14:41:59 4 something x
2018-01-18 14:42:59 5 nada somethingelse x
6. Use the selfjoin command to join the results on the joiner field.
| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a) | eval joiner="x" | eval b =
if(a%2==0,"something","nada"), c = if(a%2==1,"somethingelse",null()) | selfjoin joiner
_time a b c joiner
2018-01-18 14:39:59 2 something somethingelse x
2018-01-18 14:40:59 3 nada somethingelse x
2018-01-18 14:41:59 4 something somethingelse x
2018-01-18 14:42:59 5 nada somethingelse x
7. To make the joined results easier to interpret, modify the eval for the b and c fields so that the row number is appended to each value.
| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a) | eval joiner="x" | eval b =
if(a%2==0,"something"+a,"nada"+a), c = if(a%2==1,"somethingelse"+a,null())
The results now have the row number appended to the values in the b and c fields.
_time a b c joiner
2018-01-18 14:38:59 1 nada1 somethingelse1 x
2018-01-18 14:39:59 2 something2 x
2018-01-18 14:40:59 3 nada3 somethingelse3 x
2018-01-18 14:41:59 4 something4 x
2018-01-18 14:42:59 5 nada5 somethingelse5 x
8. Run the selfjoin command again, this time on the modified results.
| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a) | eval joiner="x" | eval b =
if(a%2==0,"something"+a,"nada"+a), c = if(a%2==1,"somethingelse"+a,null()) | selfjoin joiner
_time a b c joiner
2018-01-18 14:39:59 2 something2 somethingelse1 x
2018-01-18 14:40:59 3 nada3 somethingelse3 x
2018-01-18 14:41:59 4 something4 somethingelse3 x
2018-01-18 14:42:59 5 nada5 somethingelse5 x
Result
• Row 1 is joined with row 2 and returned as row 2. In field b, the value nada1 is discarded because the value something2 in row 2 takes precedence. In field c, there is no value in row 2. The value somethingelse1 from row 1 is returned.
• Row 2 is joined with row 3 and returned as row 3. Since row 3 contains values for both field b and field c, the values in row 3 take precedence and the values in row 2 are discarded.
• Row 3 is joined with row 4 and returned as row 4. In field b, the value nada3 is discarded because the value something4 in row 4 takes precedence. In field c, there is no value in row 4. The value somethingelse3 from row 3 is returned.
• Row 4 is joined with row 5 and returned as row 5. Since row 5 contains values for both field b and field c, the values in row 5 take precedence and the values in row 4 are discarded.
See also
join
sendemail
Description
Use the sendemail command to generate email notifications. You can email search results to specified email addresses.
You must have a Simple Mail Transfer Protocol (SMTP) server available to send email. An SMTP server is not included
with the Splunk instance.
Syntax
sendemail to=<email_list>
[from=<email_list>]
[cc=<email_list>]
[bcc=<email_list>]
[subject=<string>]
[format=csv | table | raw]
[inline= <bool>]
[sendresults=<bool>]
[sendpdf=<bool>]
[priority=highest | high | normal | low | lowest]
[server=<string>]
[width_sort_columns=<bool>]
[graceful=<bool>]
[content_type=html | plain]
[message=<string>]
[sendcsv=<bool>]
[use_ssl=<bool>]
[use_tls=<bool>]
[pdfview=<string>]
[papersize=letter | legal | ledger | a2 | a3 | a4 | a5]
[paperorientation=portrait | landscape]
[maxinputs=<int>]
[maxtime=<int> m | s | h | d]
[footer=<string>]
Required arguments
to
Syntax: to=<email_list>
Description: List of email addresses to send search results to. Specify email addresses in a comma-separated
and quoted list. For example: "user1@example.com, user2@example.com, user3@example.com"
The set of domains to which you can send emails can be restricted by the Allowed Domains setting on the Email
Settings page. For example, that setting could restrict you to sending emails only to addresses in your organization's
email domain.
For more information, see Email notification action in the Alerting Manual.
Optional arguments
bcc
Syntax: bcc=<email_list>
Description: Blind courtesy copy line. Specify email addresses in a comma-separated and quoted list.
cc
Syntax: cc=<email_list>
Description: Courtesy copy line. Specify email addresses in a comma-separated and quoted list.
content_type
Syntax: content_type=html | plain
Description: The format type of the email.
Default: The default value for the content_type argument is set in the [email] stanza of the alert_actions.conf
file. The default value for a new or upgraded Splunk installation is html.
format
Syntax: format=csv | raw | table
Description: Specifies how to format inline results.
Default: The default value for the format argument is set in the [email] stanza of the alert_actions.conf file.
The default value for a new or upgraded Splunk installation is table.
footer
Syntax: footer=<string>
Description: Specify an alternate email footer.
Default: The default footer is:
If you believe you've received this email in error, please see your Splunk
administrator.
splunk > the engine for machine data.
To force a new line in the footer, use Shift+Enter.
from
Syntax: from=<email_list>
Description: Email address from line.
Default: "splunk@<hostname>"
inline
Syntax: inline=<boolean>
Description: Specifies whether to send the results in the message body or as an attachment. By default, an
attachment is provided as a CSV file. See the Usage section.
Default: The default value for the inline argument is set in the [email] stanza of the alert_actions.conf file.
The default value for a new or upgraded Splunk installation is false.
graceful
Syntax: graceful=<boolean>
Description: If set to true, no error is returned if sending the email fails for whatever reason. The remainder of the
search continues as if the sendemail command was not part of the search. If graceful=false and sending the
email fails, the search returns an error.
Default: false
maxinputs
Syntax: maxinputs=<integer>
Description: Sets the maximum number of search results sent via alerts per invocation of the command. The
sendemail command is invoked repeatedly in increments according to the maxinputs argument until the search is
complete and all of the results have been displayed. Do not change the value of maxinputs unless you know what
you are doing.
Default: 50000
maxtime
Syntax: maxtime=<integer>m | s | h | d
Description: The maximum amount of time that the execution of an action is allowed to take before the action is
aborted.
Example: 2m
Default: no limit
message
Syntax: message=<string>
Description: Specifies the message sent in the email.
Default: The default message depends on which other arguments are specified with the sendemail command.
◊ If sendresults=false the message defaults to "Search complete."
◊ If sendresults=true, inline=true, and either sendpdf=false or sendcsv=false, message defaults to "Search
results."
◊ If sendpdf=true or sendcsv=true, message defaults to "Search results attached."
paperorientation
Syntax: paperorientation=portrait | landscape
Description: The orientation of the paper.
Default: portrait
papersize
Syntax: papersize=letter | legal | ledger | a2 | a3 | a4 | a5
Description: Default paper size for PDFs. Acceptable values: letter, legal, ledger, a2, a3, a4, a5.
Default: letter
pdfview
Syntax: pdfview=<string>
Description: Name of a view.xml file to send as a PDF. For example, mydashboard.xml, search.xml, or foo.xml.
Generally this is the name of a dashboard, but it could also be the name of a single page application or some
other object. Specify the name only. Do not specify the filename extension. The view.xml files are located in
<SPLUNK_HOME>/data/ui/views.
priority
Syntax: priority=highest | high | normal | low | lowest
Description: Set the priority of the email as it appears in the email client: highest or 1, high or 2, normal or 3, low
or 4, lowest or 5.
Default: normal or 3
sendcsv
Syntax: sendcsv=<boolean>
Description: Specify whether to send the results with the email as an attached CSV file or not.
Default: The default value for the sendcsv argument is set in the [email] stanza of the alert_actions.conf file.
The default value for a new or upgraded Splunk installation is false.
sendpdf
Syntax: sendpdf=<boolean>
Description: Specify whether to send the results with the email as an attached PDF or not. For more information
about generating PDFs, see "Generate PDFs of your reports and dashboards" in the Reporting Manual.
Default: The default value for the sendpdf argument is set in the [email] stanza of the alert_actions.conf file.
The default value for a new or upgraded Splunk installation is false.
sendresults
Syntax: sendresults=<boolean>
Description: Determines whether the results should be included with the email. See the Usage section.
Default: The default value for the sendresults argument is set in the [email] stanza of the alert_actions.conf
file. The default value for a new or upgraded Splunk installation is false.
server
Syntax: server=<host>[:<port>]
Description: If the SMTP server is not local, use this argument to specify the SMTP mail server to use when
sending emails. The <host> can be either the hostname or the IP address. You have the option to specify the
SMTP <port> that the Splunk instance should connect to.
If you set use_ssl=true, you must specify both <host> and <port> in the server argument.
This setting takes precedence over the mailserver setting in the alert_actions.conf file. The default setting for
mailserver is localhost:25.
If an alert action is configured to send an email notification when an alert triggers, the sendemail command
might not be able to use the server you specify in the server argument. The values in the Email domains setting
on the Email Settings page might restrict the server you can use. The sendemail command uses the Mail host
that is set on the Email Settings page. For more information, see Email notification action in the Alerting
Manual.
Default: localhost
subject
Syntax: subject=<string>
Description: Specifies the subject line.
Default: "Splunk Results"
use_ssl
Syntax: use_ssl=<boolean>
Description: Specifies whether to use SSL when communicating with the SMTP server. When set to true, you
must also specify both the <host> and <port> in the server argument.
Default: false
use_tls
Syntax: use_tls=<boolean>
Description: Specify whether to use TLS (transport layer security) when communicating with the SMTP server
(starttls).
Default: false
width_sort_columns
Syntax: width_sort_columns=<boolean>
Description: This is only valid for plain text emails. Specifies whether the columns should be sorted by their
width.
Default: true
Usage
If you set sendresults=true and inline=false and do not specify format, a CSV file is attached to the email.
If you use fields as tokens in your sendemail messages, use the rename command to remove curly brace characters
such as { and } from them before they are processed by the sendemail command. The sendemail command cannot
interpret curly brace characters when they appear in tokens such as $results$.
Capability requirements
To use sendemail, your role must have the schedule_search and list_settings capabilities.
Examples
1. Send search results to the specified email address
Send the search results to the specified email address. By default, the results are formatted as a table.
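A minimal sketch of such a search, with a placeholder address:

... | sendemail to="user@example.com" sendresults=true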
2. Send search results in table format
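A sketch of a search that emails the results as an inline table, with placeholder addresses and mail server:

... | sendemail to="user1@example.com,user2@example.com" format=table subject="My results" server=mail.example.com sendresults=true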
3. Send an email notification with a PDF attachment, a message, and raw inline results
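A sketch of such a search, with a placeholder address:

... | sendemail to="user@example.com" sendpdf=true message="The search results are attached." inline=true format=raw sendresults=true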
set
Description
Performs set operations, such as union, diff, and intersect, on the result sets returned by two subsearches.
Syntax
| set (union | diff | intersect) subsearch subsearch
Required arguments
Operation Description
union Returns a set that combines the results generated by the two subsearches. Provides results that are common to both subsets only once.
diff Returns a set that combines the results generated by the two subsearches and excludes the events common to both. Does not indicate which subsearch the results originated from.
intersect Returns a set that contains only the results that are common to both subsearches.
subsearch
Syntax: "[" <string> "]"
Description: Specifies a subsearch. Subsearches must be enclosed in square brackets. For more information
about subsearch syntax, see "About subsearches" in the Search Manual.
Usage
Generating commands use a leading pipe character and should be the first command in a search.
Results
The set command considers results to be the same if all of the fields that the results contain match. Some internal fields
generated by the search, such as _serial, vary from search to search. You need to filter out some of the fields if you are
using the set command with raw events, as opposed to transformed results such as those from a stats command.
Typically in these cases, all fields are the same from search to search.
Output limitations
There is a limit on the quantity of results that come out of the invoked subsearches that the set command receives to
operate on. If this limit is exceeded, the input result set to the set command is silently truncated.
If you have Splunk Enterprise, you can adjust this limit by editing the limits.conf file and changing the maxout value in
the [subsearch] stanza. If this value is altered, the default quantity of results coming from a variety of subsearch
scenarios are altered. Note that very large values might cause extensive stalls during the 'parsing' phase of a search,
which is when subsearches run. The default value for this limit is 10000.
Only users with file system access, such as system administrators, can edit the configuration files. Never change or
copy the configuration files in the default directory. The files in the default directory must remain intact and in their
original location. Make the changes in the local directory.
If you have Splunk Cloud Platform and want to edit a configuration file, file a Support ticket.
By default the set command attempts to traverse a maximum of 50000 items from each subsearch. If the number of input
results from either search exceeds this limit, the set command silently ignores the remaining events. By default, the
maxout setting for subsearches in limits.conf prevents the number of results from exceeding this limit.
This maximum is controlled by the maxresultrows setting in the [set] stanza in the limits.conf file. Increasing this limit can
result in more memory usage.
Only users with file system access, such as system administrators, can edit the configuration files. Never change or
copy the configuration files in the default directory. The files in the default directory must remain intact and in their
original location. Make the changes in the local directory.
If you have Splunk Cloud Platform and want to edit a configuration file, file a Support ticket.
Examples
Example 1:
Return values of "URL" that contain the string "404" or "303" but not both.
| set diff [search 404 | fields url] [search 303 | fields url]
Example 2:
Return all URLs that have both 404 errors and 303 errors.
| set intersect [search 404 | fields url] [search 303 | fields url]
Note: When you use the fields command in your subsearches, it does not filter out internal fields by default. If you do not
want the set command to compare internal fields, such as the _raw or _time fields, you need to explicitly exclude them
from the subsearches:
| set intersect [search 404 | fields url | fields - _*] [search 303 | fields url | fields - _*]
See also
setfields
Description
Sets the value of the given fields to the specified values for each event in the result set. Delimit multiple definitions with
commas. Missing fields are added, present fields are overwritten.
Whenever you need to change or define field values, you can use the more general purpose eval command. See usage
of an eval expression to set the value of a field in Example 1.
Syntax
setfields <setfields-arg>, ...
Required arguments
<setfields-arg>
Syntax: string="<string>", ...
Description: A key-value pair, with the value quoted. If you specify multiple key-value pairs, separate each pair
with a comma. Standard key cleaning is performed. This means all non-alphanumeric characters are replaced
with '_' and leading '_' are removed.
Examples
Example 1:
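A minimal sketch, with hypothetical field names and values:

... | setfields ip="10.10.10.10", foo="foo bar"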
To do this with the eval command:
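Using the same hypothetical fields:

... | eval ip="10.10.10.10" | eval foo="foo bar"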
See also
sichart
Summary indexing is a method you can use to speed up long-running searches that do not qualify for report
acceleration, such as searches that use commands that are not streamable before the reporting command. For more
information, see "About report accelleration and summary indexing" and "Use summary indexing for increased reporting
efficiency" in the Knowledge Manager Manual.
Description
The summary indexing version of the chart command. The sichart command populates a summary index with the
statistics necessary to generate a chart visualization. For example, it can create a column, line, area, or pie chart. After
you populate the summary index, you can use the chart command with the exact same search that you used with the
sichart command to search against the summary index.
Syntax
sichart
[sep=<string>]
[format=<string>]
[cont=<bool>]
[limit=<int>]
[agg=<stats-agg-term>]
( <stats-agg-term> | <sparkline-agg-term> | "("<eval-expression>")" )...
[ BY <field> [<bins-options>... ] [<split-by-clause>] ] | [ OVER <field> [<bins-options>...] [BY <split-by-clause>] ]
Usage
Supported functions
You can use a wide range of functions with the sichart command. For general information about using functions, see
Statistical and charting functions.
Examples
Example 1:
Compute the necessary information to later do 'chart avg(foo) by bar' on summary indexed results.
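For example, a search like this populates the summary index with the data needed for that chart:

... | sichart avg(foo) by bar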
See also
sirare
Summary indexing is a method you can use to speed up long-running searches that do not qualify for report
acceleration, such as searches that use commands that are not streamable before the reporting command. For more
information, see "About report accelleration and summary indexing" and "Use summary indexing for increased reporting
efficiency" in the Knowledge Manager Manual.
Description
The sirare command is the summary indexing version of the rare command, which returns the least common values of a
field or combination of fields. The sirare command populates a summary index with the statistics necessary to generate a
rare report. After you populate the summary index, use the regular rare command with the exact same search string as
the rare command search to report against it.
Syntax
Required arguments
<field-list>
Syntax: <string>,...
Description: Comma-delimited list of field names.
Optional arguments
<by-clause>
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
<top-options>
Syntax: countfield=<string> | limit=<int> | percentfield=<string> | showcount=<bool> | showperc=<bool>
Description: Options that specify the type and number of values to display. These are the same <top-options>
used by the rare and top commands.
Top options
countfield
Syntax: countfield=<string>
Description: Name of a new field to write the value of count.
Default: "count"
limit
Syntax: limit=<int>
Description: Specifies how many tuples to return, "0" returns all values.
percentfield
Syntax: percentfield=<string>
Description: Name of a new field to write the value of percentage.
Default: "percent"
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative
prevalence of that tuple.
Default: true
Examples
Example 1:
Compute the necessary information to later do 'rare foo bar' on summary indexed results.
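For example:

... | sirare foo bar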
See also
sistats
Description
The sistats command is one of several commands that you can use to create summary indexes. Summary indexing is
one of the methods that you can use to speed up searches that take a long time to run.
The sistats command is the summary indexing version of the stats command, which calculates aggregate statistics over
the dataset.
The sistats command populates a summary index. You must then create a report to generate the summary statistics.
See the Usage section.
Syntax
sistats [allnum=<bool>] [delim=<string>] ( <stats-agg-term> | <sparkline-agg-term> )... [<by-clause>]
• For descriptions of each of the arguments in this syntax, refer to the stats command.
• For information about functions that you can use with the sistats command, see Statistical and charting
functions.
Usage
After you create the summary index, create a report by running a search against the summary index. You use the exact
same search string that you used to populate the summary index, substituting the stats command for the sistats
command, to create your reports.
For more information, see About report acceleration and summary indexing and Use summary indexing for increased
reporting efficiency in the Knowledge Manager Manual.
With the exception of the count function, when you pair the sistats command with functions that are not applied to
specific fields or eval expressions that resolve into fields, the search head processes it as if it were applied to a wildcard
for all fields. In other words, when you have | sistats avg in a search, it returns results for | sistats avg(*).
This "implicit wildcard" syntax is officially deprecated, however. Make the wildcard explicit. Write | sistats
<function>(*) when you want a function to apply to all possible fields.
A pair of limits.conf settings strike a balance between the performance of sistats searches and the amount of memory
they use during the search process, in RAM and on disk. If your sistats searches are consistently slow to complete you
can adjust these settings to improve their performance, but at the cost of increased search-time memory usage, which
can lead to search failures.
If you have Splunk Cloud Platform, you need to file a Support ticket to change these settings.
For more information, see Memory and stats search performance in the Search Manual.
Examples
Example 1:
Create a summary index with the statistics about the average, for each hour, of any unique field that ends with the string
"lay". For example, delay, xdelay, relay, etc.
To create a report, run a search against the summary index using this search
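For example, assuming the results were collected into the default summary index:

index=summary | stats avg(*lay) BY date_hour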
See also
For a detailed explanation and examples of summary indexing, see Use summary indexing for increased reporting
efficiency in the Knowledge Manager Manual.
sitimechart
Summary indexing is a method you can use to speed up long-running searches that do not qualify for report
acceleration, such as searches that use commands that are not streamable before the transforming command. For more
information, see "About report accelleration and summary indexing" and "Use summary indexing for increased reporting
efficiency" in the Knowledge Manager Manual.
Description
The sitimechart command is the summary indexing version of the timechart command, which creates a time-series
chart visualization with a corresponding table of statistics. The sitimechart command populates a summary index with
the statistics necessary to generate a timechart report. After you use an sitimechart search to populate the summary
index, use the regular timechart command with the exact same search string as the sitimechart search to report against
the summary index.
Syntax
sitimechart
[sep=<string>]
[partial=<bool>]
[cont=<bool>]
[limit=<int>]
[agg=<stats-agg-term>]
[<bin-options>... ]
<single-agg> [BY <split-by-clause>] | <eval-expression> BY <split-by-clause>
Usage
Supported functions
You can use a wide range of functions with the sitimechart command. For general information about using functions, see
Statistical and charting functions.
◊ For a list of functions by category, see Function list by category
◊ For an alphabetical list of functions, see Alphabetical list of functions
Examples
Example 1:
Use the collect command to populate a summary index called mysummary with the statistics about CPU usage, organized
by host:
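A sketch of such a search (the CPU usage field name is an assumption for illustration):

... | sitimechart span=1h avg(cpu_usage) BY host | collect index=mysummary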
The collect command adds the results of a search to a summary index that you specify. You must create the summary
index before you invoke the collect command.
Then use the timechart command with the same search to generate a timechart report.
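For example, again assuming the cpu_usage field:

index=mysummary | timechart span=1h avg(cpu_usage) BY host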
See also
sitop
Summary indexing is a method you can use to speed up long-running searches that do not qualify for report
acceleration, such as searches that use commands that are not streamable before the reporting command. For more
information, see Overview of summary-based search acceleration and Use summary indexing for increased reporting
efficiency in the Knowledge Manager Manual.
Description
The sitop command is the summary indexing version of the top command, which returns the most frequent value of a
field or combination of fields. The sitop command populates a summary index with the statistics necessary to generate a
top report. After you populate the summary index, use the regular top command with the exact same search string as the
sitop command search to report against it.
Syntax
sitop [<N>] [<top-options>...] <field-list> [<by-clause>]
Note: This is the exact same syntax as that of the top command.
Required arguments
<field-list>
Syntax: <field>, ...
Description: Comma-delimited list of field names.
Optional arguments
<N>
Syntax: <int>
Description: The number of results to return.
<top-options>
Syntax: countfield=<string> | limit=<int> | otherstr=<string> | percentfield=<string> | showcount=<bool> |
showperc=<bool> | useother=<bool>
Description: Options for the sitop command. See Top options.
<by-clause>
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
Top options
countfield
Syntax: countfield=<string>
Description: The name of a new field that the value of count is written to.
Default: count
limit
Syntax: limit=<int>
Description: Specifies how many tuples to return, "0" returns all values.
Default: "10"
otherstr
Syntax: otherstr=<string>
Description: If useother is true, specify the value that is written into the row representing all other values.
Default: "OTHER"
percentfield
Syntax: percentfield=<string>
Description: Name of a new field to write the value of percentage.
Default: "percent"
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative
prevalence of that tuple.
Default: true
useother
Syntax: useother=<bool>
Description: Specify whether or not to add a row that represents all values not included due to the limit cutoff.
Default: false
Examples
Example 1:
Compute the necessary information to later do 'top foo bar' on summary indexed results.
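For example:

... | sitop foo bar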
Example 2:
Populate a summary index with the top source IP addresses in a scheduled search that runs daily:
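A sketch of such a scheduled search (the event type, field name, and saved search name are assumptions for illustration; save the search with a name such as "Summary - daily top src_ip"):

eventtype=firewall | sitop src_ip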
Later, when you want to retrieve that information and report on it, run this search over the past year:
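For example, assuming the default summary index and the search name above:

index=summary search_name="Summary - daily top src_ip" | top src_ip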
Additionally, because this search specifies the search name, it filters out other data that have been placed in the summary
index by other summary indexing searches.
See also
snowincident
The snowincident command is used with the Splunk Add-on for ServiceNow.
For information about this command, see Use custom generating search commands for the Splunk Add-on for
ServiceNow in Splunk Add-on for ServiceNow.
snowincidentstream
The snowincidentstream command is used with the Splunk Add-on for ServiceNow.
For information about this command, see Use custom streaming search commands for the Splunk Add-on for ServiceNow
in Splunk Add-on for ServiceNow.
snowevent
The snowevent command is used with the Splunk Add-on for ServiceNow.
For information about this command, see Use custom generating search commands for the Splunk Add-on for
ServiceNow in Splunk Add-on for ServiceNow.
snoweventstream
The snoweventstream command is used with the Splunk Add-on for ServiceNow.
For information about this command, see Use custom streaming search commands for the Splunk Add-on for ServiceNow
in Splunk Add-on for ServiceNow.
sort
Description
The sort command sorts all of the results by the specified fields. Results missing a given field are treated as having the
smallest or largest possible value of that field if the order is descending or ascending, respectively.
If the first argument to the sort command is a number, then at most that many results are returned, in order. If no number
is specified, the default limit of 10000 is used. If the number 0 is specified, all of the results are returned. See the count
argument for more information.
Syntax
sort
[<count>]
<sort-by-clause>...
[desc]
Required arguments
<sort-by-clause>
Syntax: [ - | + ] <sort-field>, [ - | + ] <sort-field> ...
Description: List of fields to sort by and the sort order. Use a minus sign (-) for descending order and a plus sign
(+) for ascending order. When specifying more than one field, separate the field names with commas. See Sort
field options.
Optional arguments
<count>
Syntax: <int> | limit=<int>
Description: Specify the number of results to return from the sorted results. If no count is specified, the default
limit of 10000 is used. If 0 is specified, all results are returned. You can specify the count using an integer or
precede the count with a label, for example limit=10.
Using sort 0 might have a negative impact on performance, depending on how many results are returned.
Default: 10000
desc
Syntax: d | desc
Description: Reverses the order of the results. If multiple fields are specified, reverses the order of the values in
the fields in the order in which the fields are specified. For example, if there are three fields specified, the desc
argument reverses the order of the values in the first field. For each set of duplicate values in the first field,
reverses the order of the corresponding values in the second field. For each set of duplicate values in the second
field, reverses the order of the corresponding values in the third field.
<sort-field>
Syntax: <field> | auto(<field>) | str(<field>) | ip(<field>) | num(<field>)
Description: Options you can specify with <sort-field>.
<field>
Syntax: <string>
Description: The name of field to sort.
auto
Syntax: auto(<field>)
Description: Determine automatically how to sort the values of the field.
ip
Syntax: ip(<field>)
Description: Interpret the values of the field as IP addresses.
num
Syntax: num(<field>)
Description: Interpret the values of the field as numbers.
str
Syntax: str(<field>)
Description: Interpret the values of the field as strings and order the values alphabetically.
Usage
By default, sort tries to automatically determine what it is sorting. If the field contains numeric values, the collating
sequence is numeric. If the field contains IP address values, the collating sequence is for IP addresses. Otherwise, the
collating sequence is in lexicographical order. Some specific examples are:
The sort order is determined between each pair of values that are compared at any one time. This means that for some
pairs of values, the order might be lexicographical, while for other pairs the order might be numerical.
Results in descending order Description
10.1, 9.1 This set of values is sorted numerically because the values are all numeric.
9.1.a, 10.1.a This set of values is sorted lexicographically because the values are alphanumeric strings.
Lexicographical order
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
You can specify a custom sort order that overrides the lexicographical order. See the blog Order Up! Custom Sort Orders.
Basic examples
Sort the results by the ipaddress field in ascending order and then sort by the url field in descending order.
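For example:

... | sort ipaddress, -url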
Sort first 100 results in descending order of the "size" field and then by the "source" value in ascending order. This
example specifies the type of data in each of the fields. The "size" field contains numbers and the "source" field contains
strings.
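For example:

... | sort 100 -num(size), +str(source)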
Sort results by the "_time" field in ascending order and then by the "host" value in descending order.
Change the format of the event's time and sort the results in descending order by the Time field that is created with the
eval command.
... | bin _time span=60m | eval Time=strftime(_time, "%m/%d %H:%M %Z") | stats avg(time_taken) AS
AverageResponseTime BY Time | sort - Time
(Thanks to Splunk user Ayn for this example.)
You can use a label to identify the number of results to return: Return the first 12 results, sorted by the "host" field in
descending order.
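For example:

... | sort limit=12 -host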
Extended example
Sort a table of results in a specific order, such as days of the week or months of the year, that is not lexicographical or
numeric. For example, suppose you have a search that produces the following table:
Day Total
Friday 120
Monday 93
Tuesday 124
Thursday 356
Weekend 1022
Wednesday 248
Sorting on the day field (Day) returns a table sorted alphabetically, which does not make much sense. Instead, you want
to sort the table by the day of the week, Monday to Friday, with the Weekend at the end of the list.
To create a custom sort order, you first need to create a field called sort_field that defines the order. Then you can sort
on that field.
This search uses the eval command to create the sort_field and the fields command to remove sort_field from the
final results table.
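A sketch of such a search, assuming the base search already produces the Day and Total fields:

... | eval sort_field=case(Day=="Monday",1, Day=="Tuesday",2, Day=="Wednesday",3, Day=="Thursday",4, Day=="Friday",5, Day=="Weekend",6) | sort sort_field | fields - sort_field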
Day Total
Monday 93
Tuesday 124
Wednesday 248
Thursday 356
Friday 120
Weekend 1022
(Thanks to Splunk users Ant1D and Ziegfried for this example.)
For additional custom sort order examples, see the blog Order Up! Custom Sort Orders and the Extended example in the
rangemap command.
See also
reverse
spath
Description
The spath command enables you to extract information from the structured data formats XML and JSON. The command
stores this information in one or more fields. The command also highlights the syntax in the displayed events list.
You can also use the spath() function with the eval command. For more information, see the evaluation functions.
Syntax
spath [input=<field>] [output=<field>] [path=<datapath> | <datapath>]
Optional arguments
input
Syntax: input=<field>
Description: The field to read in and extract values.
Default: _raw
output
Syntax: output=<field>
Description: If specified, the value extracted from the path is written to this field name.
Default: If you do not specify an output argument, the value for the path argument becomes the field name for
the extracted value.
path
Syntax: path=<datapath> | <datapath>
Description: The location path to the value that you want to extract. The location path can be specified as
path=<datapath> or as just datapath. If you do not specify the path=, the first unlabeled argument is used as the
location path. A location path is composed of one or more location steps, separated by periods. An example of
this is 'foo.bar.baz'. A location step is composed of a field name and an optional index surrounded by curly
brackets. The index can be an integer, to refer to the position of the data in an array (this differs between JSON
and XML), or a string, to refer to an XML attribute. If the index refers to an XML attribute, specify the attribute
name with an @ symbol.
Usage
When used with no path argument, the spath command runs in "auto-extract" mode. In the "auto-extract" mode, the spath
command finds and extracts all the fields from the first 5000 characters in the input field. These fields default to _raw if
another input source is not specified. If a path is provided, the value of this path is extracted to a field named by the path
or to a field specified by the output argument, if the output argument is provided.
A location path contains one or more location steps, each of which has a context that is specified by the location steps
that precede it. The context for the top-level location step is implicitly the top-level node of the entire XML or JSON
document.
The location step is composed of a field name and an optional array index indicated by curly brackets around an integer
or a string.
Array indices mean different things in XML and JSON. For example, JSON uses zero-based indexing. In JSON,
foo.bar{3} refers to the fourth element of the bar child of the foo element. In XML, this same path refers to the third bar
child of foo.
The spath command lets you use wildcards to take the place of an array index in JSON. Now, you can use the location
path entities.hashtags{}.text to get the text for all of the hashtags, as opposed to specifying
entities.hashtags{0}.text, entities.hashtags{1}.text, and so on. The referenced path, here entities.hashtags, has
to refer to an array for this to make sense. Otherwise, you get an error just like with regular array indices.
This also works with XML. For example, catalog.book and catalog.book{} are equivalent. Both get you all the books in
the catalog.
If you are using autokv or index-time field extractions, the path extractions are performed for you at index time, and
you do not need to explicitly use the spath command to provide a path.
Similarly, if you use indexed_extractions=JSON or KV_MODE=JSON in the props.conf file, you do not need to use the
spath command explicitly.
Basic examples
1. Specify an output field and path
For the path argument, you can specify the location path with or without the path=. In this example the <datapath> is
server.name.
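For example, either of these forms extracts the value at server.name into the myfield field:

... | spath output=myfield path=server.name
... | spath output=myfield server.name

The examples that follow extract values from JSON events such as these.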
{
"foo" : [1,2]
}
{
"foo" : {
"bar" : [
{"zoo" : 1},
{"baz" : 2}
]
}
}
To specify the output and path from this nested array, use this syntax.
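A sketch (the output field name is arbitrary); the path foo.bar{0}.zoo extracts the value of zoo from the first element of the bar array:

... | spath output=zoo path=foo.bar{0}.zoo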
Use the @ symbol to specify an XML attribute. Consider the following XML list of books and authors.
<?xml version="1.0">
<purchases>
<book>
<author>Martin, George R.R.</author>
<title yearPublished=1996>A Game of Thrones</title>
<title yearPublished=1998>A Clash of Kings</title>
</book>
<book>
516
<author>Clarke, Susanna</author>
<title yearPublished=2004>Jonathan Strange and Mr. Norrell</title>
</book>
<book>
<author>Kay, Guy Gavriel</author>
<title yearPublished=1990>Tigana</title>
</book>
<book>
<author>Bujold, Lois McMasters</author>
<title yearPublished=1986>The Warrior's Apprentice</title>
</book>
</purchases>
Use this search to return the path for the book and the year it was published.
In this example, the output is a single multivalue result that lists all of the years the books were published.
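A sketch of such a search (the output field name is arbitrary):

... | spath output=yearPublished path=purchases.book.title{@yearPublished}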
Extended examples
1: GitHub
• See who has committed the most changes and to which repository
• Produce a list of the commits submitted for each user
Suppose you are Indexing JSON data using the GitHub PushEvent webhook. You can use the spath command to extract
fields called repository, commit_author, and commit_id:
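A sketch of those extractions, assuming the standard PushEvent payload layout:

... | spath output=repository path=repository.url | spath output=commit_author path=commits{}.author.name | spath output=commit_id path=commits{}.id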
To see who has committed the most changes to a repository, run the search.
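... | top commit_author BY repository

To produce a list of the commits submitted for each user, a search such as the following works:

... | stats values(commit_id) BY commit_author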
This example shows how to extract values from XML attributes and elements.
<vendorProductSet vendorID="2">
<product productID="17" units="mm" >
<prodName nameGroup="custom">
<locName locale="all">APLI 01209</locName>
</prodName>
<desc descGroup="custom">
<locDesc locale="es">Precios</locDesc>
<locDesc locale="fr">Prix</locDesc>
<locDesc locale="de">Preise</locDesc>
<locDesc locale="ca">Preus</locDesc>
<locDesc locale="pt">Preços</locDesc>
</desc>
</product>
</vendorProductSet>
To extract the values of the locDesc elements (Precios, Prix, Preise, etc.), use:
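... | spath output=locDesc path=vendorProductSet.product.desc.locDesc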
To extract the value of the locale attribute (es, fr, de, etc.), use:
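... | spath output=locale path=vendorProductSet.product.desc.locDesc{@locale}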
The mvexpand command only works on one multivalued field. This example walks through how to expand a JSON event
that has more than one multivalued field into individual events for each field value. For example, given this event with
sourcetype=json:
{
"widget": {
"text": [
{
"data": "Click here",
"size": 36
},
{
"data": "Learn more",
"size": 37
},
{
"data": "Help",
"size": 38
}
]
}
}
First, start with a search to extract the fields from the JSON. Because no path argument is specified, the spath command
runs in "auto-extract" mode and extracts all of the fields from the first 5000 characters in the input field. The fields are then
renamed and placed in a table.
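A sketch of that starting search, assuming the event shown above has sourcetype=json:

sourcetype=json | spath | rename widget.text{}.data AS data, widget.text{}.size AS size | table _time data size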
Then, use the eval function, mvzip(), to create a new multivalued field named x, with the values of the size and data:
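... | eval x=mvzip(data,size)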
Now, use the mvexpand command to create individual events based on x and the eval function mvindex() to redefine the
values for data and size.
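A sketch of that final step (the field order matches the mvzip() call above):

... | mvexpand x | eval x=split(x,",") | eval data=mvindex(x,0), size=mvindex(x,1) | table _time data size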
See also
stats
Description
Calculates aggregate statistics, such as average, count, and sum, over the results set. This is similar to SQL aggregation.
If the stats command is used without a BY clause, only one row is returned, which is the aggregation over the entire
incoming result set. If a BY clause is used, one row is returned for each distinct value specified in the BY clause.
The stats command can be used for several SQL-like operations. If you are familiar with SQL but new to SPL, see
Splunk SPL for SQL users.
The stats command calculates statistics based on fields in your events. The eval command creates new fields in your
events by using existing fields and an arbitrary expression.
Syntax
Simple:
stats (stats-function(field) [AS field])... [BY field-list]
Complete:
Required syntax is in bold.
| stats
[partitions=<num>]
[allnum=<bool>]
[delim=<string>]
( <stats-agg-term>... | <sparkline-agg-term>... )
[<by-clause>]
[<dedup_splitvals>]
Required arguments
stats-agg-term
Syntax: <stats-func>(<evaled-field> | <wc-field>) [AS <wc-field>]
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval
expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you
specify. You can use wild card characters in field names. For more information on eval expressions, see Types of
eval expressions in the Search Manual.
sparkline-agg-term
Syntax: <sparkline-agg> [AS <wc-field>]
Description: A sparkline aggregation function. Use the AS clause to place the result into a new field with a name
that you specify. You can use wild card characters in the field name.
Optional arguments
allnum
Syntax: allnum=<bool>
Description: If true, computes numerical statistics on each field if and only if all of the values of that field are
numerical.
Default: false
by-clause
Syntax: BY <field-list>
Description: The name of one or more fields to group by. You cannot use a wildcard character to specify multiple
fields with similar names. You must specify each field separately. The BY clause returns one row for each distinct
value in the BY clause fields. If no BY clause is specified, the stats command returns only one row, which is the
aggregation over the entire incoming result set.
dedup_splitvals
Syntax: dedup_splitvals=<boolean>
Description: Specifies whether to remove duplicate values in multivalued BY clause fields.
Default: false
delim
Syntax: delim=<string>
Description: Specifies how the values in the list() or values() aggregation are delimited.
Default: a single space
partitions
Syntax: partitions=<num>
Description: Partitions the input data based on the split-by fields for multithreaded reduce. The partitions
argument runs the reduce step (in parallel reduce processing) with multiple threads in the same search process
on the same machine. Compare that with parallel reduce, using the redistribute command, that runs the reduce
step in parallel on multiple machines.
When partitions=0, the value of the partitions argument is the same as the value of the default_partitions
setting in the limits.conf file.
Default: 0. Set to the same value as the default_partitions setting in the limits.conf file, which is 1 by default.
stats-func
Syntax: The syntax depends on the function that you use. Refer to the table below.
Description: Statistical and charting functions that you can use with the stats command. Each time you invoke
the stats command, you can use one or more functions. However, you can only use one BY clause. See Usage.
The following table lists the supported functions by type of function. Use the links in the table to see descriptions
and examples for each function. For an overview about using functions with commands, see Statistical and
charting functions.
Type of function Supported functions and syntax
Aggregate functions count(), distinct_count(), estdc(), estdc_error(), max(), median(), min(), mode(), range(), stdev(), stdevp(), sumsq(), upperperc<num>(), var(), varp()
Time functions earliest(), earliest_time(), latest(), latest_time(), rate()
Sparklines are inline charts that appear within table cells in search results to display time-based trends associated with
the primary key of each row. Read more about how to "Add sparklines to your search results" in the Search Manual.
sparkline-agg
Syntax: sparkline (count(<wc-field>), <span-length>) | sparkline (<sparkline-func>(<wc-field>), <span-length>)
Description: A sparkline specifier, which takes the first argument of an aggregation function on a field and an
optional timespan specifier. If no timespan specifier is used, an appropriate timespan is chosen based on the time
range of the search. If the sparkline is not scoped to a field, only the count aggregator is permitted. You can use
wildcard characters in the field name. See the Usage section.
sparkline-func
Syntax: c() | count() | dc() | mean() | avg() | stdev() | stdevp() | var() | varp() | sum() | sumsq() | min() | max() |
range()
Description: Aggregation function to use to generate sparkline values. Each sparkline value is produced by
applying this aggregation to the events that fall into each particular time bin.
Usage
When you use the stats command, you must specify either a statistical function or a sparkline function. When you use a
statistical function, you can use an eval expression as part of the statistical function. For example:
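... | stats count(eval(status=="404")) AS count_status BY sourcetype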
With the exception of the count function, when you pair the stats command with functions that are not applied to specific
fields or eval expressions that resolve into fields, the search head processes it as if it were applied to a wildcard for all
fields. In other words, when you have | stats avg in a search, it returns results for | stats avg(*).
This "implicit wildcard" syntax is officially deprecated, however. Make the wildcard explicit. Write | stats <function>(*)
when you want a function to apply to all possible fields.
Numeric calculations
During calculations, numbers are treated as double-precision floating-point numbers, subject to all the usual behaviors of
floating point numbers. If the calculation results in the floating-point special value NaN, it is represented as "nan" in your
results. The special values for positive and negative infinity are represented in your results as "inf" and "-inf" respectively.
Division by zero results in a null field.
There are situations where the results of a calculation contain more digits than can be represented by a floating-point
number. In those situations, precision might be lost on the least significant digits. For an example of how to correct this,
see Example 2 of the basic examples for the sigfig(X) function.
Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the
distinct_count function requires far more memory than the count function. The values and list functions also can
consume a lot of memory.
If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider
replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might
result in significantly lower memory usage and run times.
A pair of limits.conf settings strike a balance between the performance of stats searches and the amount of memory
they use during the search process, in RAM and on disk. If your stats searches are consistently slow to complete you can
adjust these settings to improve their performance, but at the cost of increased search-time memory usage, which can
lead to search failures.
If you use Splunk Cloud Platform, you need to file a Support ticket to change these settings.
For more information, see Memory and stats search performance in the Search Manual.
Using the first and last functions when searching based on time does not produce accurate results.
• To locate the first value based on time order, use the earliest function, instead of the first function.
• To locate the last value based on time order, use the latest function, instead of the last function.
Replace the first and last functions when you use the stats and eventstats commands for ordering events based on
time. The following search shows the function changes.
index=test sourcetype=testDb | eventstats latest(LastPass) AS LastPass, earliest(_time) AS
mostRecentTestTime BY testCaseId | where startTime==LastPass OR _time==mostRecentTestTime | stats
latest(startTime) AS startTime, latest(status) AS status, latest(histID) AS currentHistId, earliest(histID)
AS lastPassHistId BY testCaseId
Wildcards in BY clauses
The stats command does not support wildcard characters in field values in BY clauses.
Renaming fields
You cannot rename one field with multiple names. For example if you have field A, you cannot rename A as B, A as C.
The following example is not valid.
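For instance, a search along these lines is not valid, because the host field is renamed twice:

... | stats first(host) AS site, first(host) AS report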
Basic examples
2. Search the access logs, and return the total number of hits from the top 100 values of "referer_domain"
Search the access logs, and return the total number of hits from the top 100 values of "referer_domain". The "top"
command returns a count and percent value for each "referer_domain".
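A sketch of such a search (sourcetype assumed from the web access log examples):

sourcetype=access_* | top limit=100 referer_domain | stats sum(count) AS total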
3. Calculate the average time for each hour for similar fields using wildcard characters
Return the average, for each hour, of any unique field that ends with the string "lay". For example, delay, xdelay, relay,
etc.
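For example:

sourcetype=access_* | stats avg(*lay) BY date_hour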
4. Remove duplicates in the result set and return the total count for the unique results
Remove duplicates of results with the same "host" value and return the total count of the remaining results.
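For example:

... | dedup host | stats count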
5. For each unique value of mvfield, return the average value of field. Deduplicates the values in the mvfield.
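For example, following the syntax above:

... | stats avg(field) BY mvfield dedup_splitvals=true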
Extended examples
1. Compare the difference between using the stats and chart commands
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
This search uses the stats command to count the number of events for a combination of HTTP status code values and
host:
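A sketch of this search, assuming the tutorial sourcetype:

sourcetype=access_* | stats count BY status, host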
The BY clause returns one row for each distinct value in the BY clause fields. In this search, because two fields are
specified in the BY clause, every unique combination of status and host is listed on separate row.
The results appear on the Statistics tab and look something like this:
Substitute the chart command for the stats command in the search.
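For example:

sourcetype=access_* | chart count BY status, host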
With the chart command, the two fields specified after the BY clause change the appearance of the results on the
Statistics tab. The BY clause also makes the results suitable for displaying the results in a chart visualization.
• The first field you specify is referred to as the <row-split> field. In the table, the values in this field become the
labels for each row. In the chart, this field forms the X-axis.
• The second field you specify is referred to as the <column-split> field. In the table, the values in this field are used
as headings for each column. In the chart, this field forms the data series.
The results appear on the Statistics tab and look something like this:
status www1 www2 www3
200 11835 11186 11261
403 0 288 0
2. Use eval expressions to count the different types of requests against each Web server
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
Run the following search to use the stats command to determine the number of different page requests, GET and POST,
that occurred for each Web server.
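A sketch of this search, assuming the tutorial sourcetype:

sourcetype=access_* | stats count(eval(method=="GET")) AS GET, count(eval(method=="POST")) AS POST BY host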
This example uses eval expressions to specify the different field values for the stats command to count.
• The first clause uses the count() function to count the Web access events that contain the method field value GET.
Then, using the AS keyword, the field that represents these results is renamed GET.
• The second clause does the same for POST events.
• The counts of both types of events are then separated by the web server, using the BY clause with the host field.
The results appear on the Statistics tab and look something like this:
You can substitute the chart command for the stats command in this search. You can then click the Visualization tab to
see a chart of the results.
3. Count the number of earthquakes that occurred for each magnitude range
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
Run the following search to calculate the number of earthquakes that occurred in each magnitude range. This data set is
comprised of events over a 30-day period.
source=all_month.csv | chart count AS "Number of Earthquakes" BY mag span=1 | rename mag AS "Magnitude
Range"
• This search uses span=1 to define each of the ranges for the magnitude field, mag.
• The rename command is then used to rename the field to "Magnitude Range".
The results appear on the Statistics tab and look something like this:
Magnitude Range Number of Earthquakes
-1-0 18
0-1 2088
1-2 3005
2-3 1026
3-4 194
4-5 452
5-6 109
6-7 11
7-8 3
Click the Visualization tab to see the result in a chart.
Search for earthquakes in and around California. Calculate the number of earthquakes that were recorded. Use statistical
functions to calculate the minimum, maximum, range (the difference between the min and max), and average magnitudes
of the recent earthquakes. List the values by magnitude type.
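A sketch of such a search (the place filter is an assumption based on the dataset's place field):

source=all_month.csv place=*California* | stats count, max(mag), min(mag), range(mag), avg(mag) BY magType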
The results appear on the Statistics tab and look something like this:
magType count max(mag) min(mag) range(mag) avg(mag)
MbLg 1 0 0 0 0.0000000
Find the mean, standard deviation, and variance of the magnitudes of the recent quakes
Search for earthquakes in and around California. Calculate the number of earthquakes that were recorded. Use statistical
functions to calculate the mean, standard deviation, and variance of the magnitudes for recent earthquakes. List the
values by magnitude type.
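A sketch of such a search, using the same place filter:

source=all_month.csv place=*California* | stats count, mean(mag), stdev(mag), var(mag) BY magType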
The results appear on the Statistics tab and look something like this:
4. In a table display items sold by ID, type, and name and calculate the revenue for each product
This example uses the sample dataset from the Search Tutorial and a field lookup to add more information to the event data.
• Download the data set from Add data tutorial and follow the instructions to load the tutorial data.
• Download the CSV file from Use field lookups tutorial and follow the instructions to set up the lookup definition to add price and
productName to the events.
After you configure the field lookup, you can run this search using the time range, All time.
Create a table that displays the items sold at the Buttercup Games online store by their ID, type, and name. Also,
calculate the revenue for each product.
This example uses the values() function to display the corresponding categoryId and productName values for each
productId. Then, it uses the sum() function to calculate a running total of the values of the price field.
Also, this example renames the various fields, for better display. For the stats functions, the renames are done inline with
an "AS" clause. The rename command is used to change the name of the product_id field, since the syntax does not let
you rename a split-by field.
Finally, the results are piped into an eval expression to reformat the Revenue field values so that they read as currency,
with a dollar sign and commas.
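The search is not reproduced in this extract. A sketch that matches the description, assuming the tutorial's sourcetype=access_* data and the productId, categoryId, productName, and price fields:
sourcetype=access_* action=purchase | stats values(categoryId) AS Type, values(productName) AS "Product Name", sum(price) AS "Revenue" BY productId | rename productId AS "Product ID" | eval Revenue="$ ".tostring(Revenue,"commas")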
This example uses sample email data. You should be able to run this search on any email data by replacing the sourcetype=cisco:esa with
the sourcetype value in your data and the mailfrom field with the name of the email address field in your data (for example, the field might be To, From, or Cc).
Find out how much of the email in your organization comes from .com, .net, .org or other top level domains.
The eval command in this search contains two expressions, separated by a comma.
• The first part of this search uses the eval command to break up the email address in the mailfrom field. The
from_domain is defined as the portion of the mailfrom field after the @ symbol.
♦ The split() function is used to break the mailfrom field into a multivalue field called accountname. The
first value of accountname is everything before the "@" symbol, and the second value is everything after.
♦ The mvindex() function is used to set from_domain to the second value in the multivalue field
accountname.
• The results are then piped into the stats command. The count() function is used to count the results of the eval
expression.
• The eval command uses the match() function to compare the from_domain to a regular expression that looks for the different
suffixes in the domain. If the value of from_domain matches the regular expression, the count is updated for each
suffix, .com, .net, and .org. Other domain suffixes are counted as other.
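A sketch consistent with these steps. The regular expressions shown are illustrative rather than the exact ones from the original example:
sourcetype="cisco:esa" mailfrom=* | eval accountname=split(mailfrom,"@"), from_domain=mvindex(accountname,1) | stats count(eval(match(from_domain, "\.com$"))) AS ".com", count(eval(match(from_domain, "\.net$"))) AS ".net", count(eval(match(from_domain, "\.org$"))) AS ".org", count(eval(NOT match(from_domain, "\.(com|net|org)$"))) AS "other"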
The results appear on the Statistics tab and look something like this:
6. Search Web access logs for the total number of hits from the top 10 referring domains
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
This example searches the web access logs and returns the total number of hits from the top 10 referring domains.
This search uses the top command to find the ten most common referer domains, which are values of the referer field.
Some events might use referer_domain instead of referer. The top command returns a count and percent value for each
referer.
You can then use the stats command to calculate a total for the top 10 referrer accesses.
The sum() function adds the values in the count to produce the total number of times the top 10 referrers accessed the
web site.
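A sketch consistent with this description, assuming the tutorial's sourcetype=access_* data:
sourcetype=access_* | top limit=10 referer | stats sum(count) AS total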
See also
Functions
Statistical and charting functions
Commands
eventstats
rare
sistats
streamstats
top
Blogs
Getting started with stats, eventstats and streamstats
Search commands > stats, chart, and timechart
Smooth operator | Searching for multiple field values
strcat
Description
Concatenates string values from 2 or more fields. Combines string values and literals into a new field. A
destination field name is specified at the end of the strcat command.
Syntax
Required arguments
<dest-field>
Syntax: <string>
Description: A destination field to save the concatenated string values in, as defined by the <source-fields>
argument. The destination field is always at the end of the series of source fields.
<source-fields>
Syntax: (<field> | <quoted-str>)...
Description: Specify the field names and literal string values that you want to concatenate. Literal values must be
enclosed in quotation marks.
quoted-str
Syntax: "<string>"
Description: Quoted string literals.
Examples: "/" or ":"
Optional arguments
allrequired
Syntax: allrequired=<bool>
Description: Specifies whether or not all source fields need to exist in each event before values are written to the
destination field. If allrequired=f, the destination field is always written and source fields that do not exist are
treated as empty strings. If allrequired=t, the values are written to destination field only if all source fields exist.
Default: false
Usage
Examples
Example 1:
Add a field called comboIP, which combines the source and destination IP addresses. Separate the addresses with a
forward slash character.
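The search is not shown here. A sketch, assuming the IP address fields are named sourceIP and destIP:
... | strcat sourceIP "/" destIP comboIP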
Example 2:
Add a field called comboIP, which combines the source and destination IP addresses. Separate the addresses with a
forward slash character. Create a chart of the number of occurrences of the field values.
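A sketch with the same assumed field names, followed by a chart of the occurrences:
... | strcat sourceIP "/" destIP comboIP | chart count BY comboIP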
Example 3:
Add a field called address, which combines the host and port values into the format <host>::<port>.
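A sketch of this search, assuming fields named host and port:
... | strcat host "::" port address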
See also
eval
streamstats
Description
Adds cumulative summary statistics to all search results in a streaming manner. The streamstats command calculates
statistics for each event at the time the event is seen. For example, you can calculate the running total for a particular
field. The total is calculated by using the values in the specified field for every event that has been processed, up to the
current event.
Syntax
streamstats
[reset_on_change=<bool>]
[reset_before="("<eval-expression>")"]
[reset_after="("<eval-expression>")"]
[current=<bool>]
[window=<int>]
[time_window=<span-length>]
[global=<bool>]
[allnum=<bool>]
<stats-agg-term>...
[<by-clause>]
Required arguments
stats-agg-term
Syntax: <stats-func>( <evaled-field> | <wc-field> ) [AS <wc-field>]
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval
expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you
specify. You can use wild card characters in field names. For more information on eval expressions, see Types of
eval expressions in the Search Manual.
Optional arguments
allnum
Syntax: allnum=<boolean>
Description: If true, computes numerical statistics on each field only if all of the values in that field are numerical.
Default: false
by-clause
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
current
Syntax: current=<boolean>
Description: If true, the search includes the given, or current, event in the summary calculations. If false, the
search uses the field value from the previous event.
Default: true
global
Syntax: global=<boolean>
Description: Used only when the window argument is set. Defines whether to use a single window, global=true,
or to use separate windows based on the by clause. If global=false and window is set to a non-zero value, a
separate window is used for each group of values of the field specified in the by clause.
Default: true
reset_after
Syntax: reset_after="("<eval-expression>")"
Description: After the streamstats calculations are produced for an event, reset_after specifies that all of the
accumulated statistics are reset if the eval-expression returns true. The eval-expression must evaluate to true
or false. The eval-expression can reference fields that are returned by the streamstats command. When the
reset_after argument is combined with the window argument, the window is also reset when the accumulated
statistics are reset.
Default: false
reset_before
Syntax: reset_before="("<eval-expression>")"
Description: Before the streamstats calculations are produced for an event, reset_before specifies that all of the
accumulated statistics are reset when the eval-expression returns true. The eval-expression must evaluate to
true or false. When the reset_before argument is combined with the window argument, the window is also reset
when the accumulated statistics are reset.
Default: false
reset_on_change
Syntax: reset_on_change=<bool>
Description: Specifies that all of the accumulated statistics are reset when the group by fields change. The reset
is as if no previous events have been seen. Only events that have all of the group by fields can trigger a reset.
Events that have only some of the group by fields are ignored. When the reset_on_change argument is combined
with the window argument, the window is also reset when the accumulated statistics are reset. See the Usage
section.
Default: false
time_window
Syntax: time_window=<span-length>
Description: Specifies the window size for the streamstats calculations, based on time. The time_window
argument is limited by range of values in the _time field in the events. To use the time_window argument, the
events must be sorted in either ascending or descending time order. You can use the window argument with the
time_window argument to specify the maximum number of events in a window. For the <span-length>, to specify
five minutes, use time_window=5m. To specify 2 days, use time_window=2d.
Default: None. However, the value of the max_stream_window attribute in the limits.conf file applies. The default
value is 10000 events.
window
Syntax: window=<integer>
Description: Specifies the number of events to use when computing the statistics.
Default: 0, which means that all previous and current events are used.
stats-func
Syntax: The syntax depends on the function that you use. Refer to the table below.
Description: Statistical and charting functions that you can use with the streamstats command. Each time you
invoke the streamstats command, you can use one or more functions. However, you can only use one BY clause.
See Usage.
The following table lists the supported functions by type of function. Use the links in the table to see descriptions
and examples for each function. For an overview about using functions with commands, see Statistical and
charting functions.
Usage
The streamstats command is similar to the eventstats command except that it uses events before the current event to
compute the aggregate statistics that are applied to each event. If you want to include the current event in the statistical
calculations, use current=true, which is the default.
The streamstats command is also similar to the stats command in that streamstats calculates summary statistics on
search results. Unlike stats, which works on the group of results as a whole, streamstats calculates statistics for each
event at the time the event is seen.
With the exception of the count function, when you pair the streamstats command with functions that are not applied to
specific fields or eval expressions that resolve into fields, the search head processes it as if it were applied to a wildcard
for all fields. In other words, when you have | streamstats avg in a search, it returns results for | streamstats avg(*).
This "implicit wildcard" syntax is officially deprecated, however. Make the wildcard explicit. Write | streamstats
<function>(*) when you want a function to apply to all possible fields.
If your <eval-expression> contains a value instead of a field name, you must escape the quotation marks around the
value.
The following example is a simple way to see this. Start by using the makeresults command to create 3 events. Use the
streamstats command to produce a cumulative count of the events. Then use the eval command to create a simple test.
If the value of the count field is equal to 2, display yes in the test field. Otherwise display no in the test field.
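A sketch of that setup search:
| makeresults count=3 | streamstats count | eval test=if(count==2, "yes", "no")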
_time count test
2017-01-11 11:32:43 1 no
2017-01-11 11:32:43 2 yes
2017-01-11 11:32:43 3 no
Use the streamstats command to reset the count when the match is true. You must escape the quotation marks around
the word yes. The following example shows the complete search.
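A sketch of the complete search. The output field name count_after_reset is illustrative:
| makeresults count=3 | streamstats count | eval test=if(count==2, "yes", "no") | streamstats reset_after="("match(test,\"yes\")")" count AS count_after_reset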
Here is another example. You want to look for the value session is closed in the description field. Because the value is
a string, you must enclose it in quotation marks. You then need to escape those quotation marks.
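A sketch of that search, with the string escaped inside the reset_after expression:
... | streamstats reset_after="("match(description, \"session is closed\")")" count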
You have a dataset with the field "shift" that contains either the value DAY or the value NIGHT. You run this search:
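The search is not shown in this extract. A sketch consistent with the results that follow:
... | streamstats reset_on_change=true count BY shift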
If the dataset is:
shift
DAY
DAY
NIGHT
NIGHT
NIGHT
NIGHT
DAY
NIGHT
Running the command with reset_on_change=true produces the following streamstats results:
shift, count
DAY, 1
DAY, 2
NIGHT, 1
NIGHT, 2
NIGHT, 3
NIGHT, 4
DAY, 1
NIGHT, 1
The streamstats search processor uses two limits.conf settings to determine the maximum number of results that it can
store in memory for the purpose of computing statistics.
The maxresultrows setting specifies a top limit for the window argument. This sets the number of result rows that the
streamstats command processor can store in memory. The max_mem_usage_mb setting limits how much memory the
streamstats command uses to keep track of information.
When the max_mem_usage_mb limit is reached, the streamstats command processor stops adding the requested fields to
the search results.
Do not set max_mem_usage_mb=0 as this removes the bounds to the amount of memory the streamstats command
processor can use. This can lead to search failures.
Prerequisites
• Only users with file system access, such as system administrators, can increase the maxresultrows and
max_mem_usage_mb settings using configuration files.
• Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
• You can have configuration files with the same name in your default, local, and app directories. Read Where you
can place (or find) your modified configuration files in the Splunk Enterprise Admin Manual.
Never change or copy the configuration files in the default directory. The files in the default directory must remain intact
and in their original location. Make changes to the files in the local directory.
If you have Splunk Cloud Platform and want to change these limits, file a Support ticket.
Basic examples
For each event, compute the average of the foo field over the last 5 events, including the current event.
This is similar to using the trendline command to compute a simple moving average (SMA), such as trendline
sma5(foo).
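A sketch of this search:
... | streamstats window=5 avg(foo)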
2. Compute the average of a field, with a by clause, over the last 5 events
For each event, compute the average value of foo for each value of bar including only 5 events, specified by the window
size, with that value of bar.
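A sketch of this search, using separate windows for each value of bar:
... | streamstats window=5 global=f avg(foo) BY bar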
This example adds to each event a count field that represents the number of events seen so far, including that event. For
example, it adds 1 for the first event, 2 for the second event, and so on.
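A sketch of this search:
... | streamstats count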
If you did not want to include the current event, you would specify:
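A sketch of that variant:
... | streamstats current=f count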
Assume that the max_stream_window argument in the limits.conf file is the default value of 10000 events.
The following search counts the events, using a time window of five minutes.
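A sketch of such a search:
... | streamstats time_window=5m count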
• If the events are in descending time order (most recent to oldest), the value in the count field represents the
number of events in the next 5 minutes.
• If the events are in ascending time order (oldest to most recent), the count field represents the number of events
in the previous 5 minutes.
If there are more events in the time-based window than the value for the max_stream_window argument, the
max_stream_window argument takes precedence. The count will never be > 10000, even if there are actually more than
10,000 events in any 5 minute period.
Extended examples
1. Create events for testing
You can use the streamstats command with the makeresults command to create a series of events. This technique is often
used for testing search syntax. The eval command is used to create events with different hours. You use 3600, the
number of seconds in an hour, in the eval command.
The streamstats command is used to create the count field. The streamstats command calculates a cumulative count for
each event, at the time the event is processed.
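The search that produces the results below is not preserved in this extract. A sketch, consistent with the longer search shown later in this example:
| makeresults count=5 | streamstats count | eval _time=_time-(count*3600)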
_time count
2020-01-09 15:35:14 1
2020-01-09 14:35:14 2
2020-01-09 13:35:14 3
2020-01-09 12:35:14 4
2020-01-09 11:35:14 5
Notice that the hours in the timestamp are 1 hour apart.
| makeresults count=5 | streamstats count | eval _time=_time-(count*3600) | eval age = case(count=1, 25,
count=2, 39, count=3, 31, count=4, 27, count=5, null()) | eval city = case(count=1 OR count=3, "San
Francisco", count=2 OR count=4, "Seattle",count=5, "Los Angeles")
• The eval command is used to create two new fields, age and city. The eval command uses the value in the
count field.
• The case function takes pairs of arguments, such as count=1, 25. The first argument is a Boolean expression.
When that expression is TRUE, the corresponding second argument is returned.
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
You want to determine the number of the bytes used over a set period of time. The following search uses the first 5
events. Because search results typically display the most recent event first, the sort command is used to sort the 5
events in ascending order to see the oldest event first and the most recent event last. Ascending order enables the
streamstats command to calculate statistics over time.
Add the streamstats command to the search to generate a running total of the bytes over the 5 events and organize the
results by clientip.
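The searches themselves are not preserved here. A sketch consistent with the description, assuming the tutorial's sourcetype=access_* data:
sourcetype=access_* | head 5 | sort _time | streamstats sum(bytes) AS ASimpleSumOfBytes BY clientip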
When you click on the ASimpleSumOfBytes field in the list of Interesting fields, an information window shows the
cumulative sum of the bytes, as shown in this image:
The streamstats command aggregates the statistics to the original data, which means that all of the original data is
accessible for further calculations.
Add the table command to the search to display only the values in the _time, clientip, bytes, and
ASimpleSumOfBytes fields.
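Continuing the sketch above:
sourcetype=access_* | head 5 | sort _time | streamstats sum(bytes) AS ASimpleSumOfBytes BY clientip | table _time, clientip, bytes, ASimpleSumOfBytes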
Each event shows the timestamp for the event, the clientip, and the number of bytes used. The ASimpleSumOfBytes field
shows a cumulative summary of the bytes for each clientip.
3. Calculate the running total of distinct users over time
Each day you track unique users, and you would like to track the cumulative count of distinct users. This example
calculates the running total of distinct users over time.
eventtype="download" | bin _time span=1d as day | stats values(clientip) as ips dc(clientip) by day |
streamstats dc(ips) as "Cumulative total"
The bin command breaks the time into days. The stats command calculates the distinct users (clientip) and user count
per day. The streamstats command finds the running distinct count of users.
This search returns a table that includes: day, ips, dc(clientip), and Cumulative total.
The timechart command buckets the events into spans of 1 hour and counts the total values for each category. The
timechart command also fills NULL values, so that there are no missing values. Then, the streamstats command is used
to calculate the accumulated total.
This example uses streamstats to produce hourly cumulative totals for category values.
... | timechart span=1h sum(value) as total by category | streamstats global=f | addtotals | accum Total |
rename Total as accu_total
5. Calculate when a DHCP IP lease address changed for a specific MAC address
This example uses streamstats to figure out when a DHCP IP lease address changed for a MAC address,
54:00:00:00:00:00.
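The search is not reproduced in this extract. A sketch of one way to do this with streamstats; the source value and the DHCP_IP and MAC field names are assumptions:
source=dhcp MAC=54:00:00:00:00:00 | streamstats current=f last(DHCP_IP) AS new_dhcp_ip last(_time) AS time_of_change BY MAC | where DHCP_IP!=new_dhcp_ip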
You can also clean up the presentation to display a table of the DHCP IP address changes and the times they occurred.
For more details, refer to the Splunk Blogs post for this example.
See also
Commands
accum
autoregress
delta
fillnull
eventstats
makeresults
trendline
Blogs
Getting started with stats, eventstats and streamstats
table
Description
The table command returns a table that is formed by only the fields that you specify in the arguments. Columns are
displayed in the same order that fields are specified. Column headers are the field names. Rows are the field values.
Each row represents an event.
The table command is similar to the fields command in that it lets you specify the fields you want to keep in your results.
Use table command when you want to retain data in tabular format.
With the exception of a scatter plot to show trends in the relationships between discrete values of your data, you should
not use the table command for charts. See Usage.
Syntax
table <wc-field-list>
Arguments
<wc-field-list>
Syntax: <wc-field> ...
Description: A list of valid field names. The list can be space-delimited or comma-delimited. You can use the
asterisk ( * ) as a wildcard to specify a list of fields with similar names. For example, if you want to specify all fields
that start with "value", you can use a wildcard such as value*.value*.
Usage
Visualizations
To generate visualizations, the search results must contain numeric, datetime, or aggregated data such as count, sum, or
average.
Command type
The table command is a non-streaming command. If you are looking for a streaming command similar to the table
command, use the fields command.
Field renaming
The table command doesn't let you rename fields, only specify the fields that you want to show in your tabulated results.
If you're going to rename a field, do it before piping the results to table.
Truncated results
The table command truncates the number of results returned based on settings in the limits.conf file. In the [search]
stanza, if the value for the truncate_report parameter is 1, the number of results returned is truncated.
The number of results is controlled by the max_count parameter in the [search] stanza. If truncate_report is set to 0, the
max_count parameter is not applied.
Examples
Example 1
This example uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file
that contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and add it as an input to your Splunk
deployment.
Search for recent earthquakes in and around California and display only the time of the quake (Datetime), where it
occurred (Region), and the quake's magnitude (Magnitude) and depth (Depth).
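The search is not shown in this extract. A sketch consistent with the description; the rename to the display names is an assumption:
index=usgs_* source=usgs place=*California | rename time AS Datetime, place AS Region, mag AS Magnitude, depth AS Depth | table Datetime, Region, Magnitude, Depth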
This simply reformats your events into a table and displays only the fields that you specified as arguments.
Example 2
This example uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file
that contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and add it as an input to your Splunk
deployment.
Show the date, time, coordinates, and magnitude of each recent earthquake in Northern California.
index=usgs_* source=usgs place=*California | rename lat as latitude lon as longitude | table time, place,
lat*, lon*, mag
This example begins with a search for all recent earthquakes in California (place=*California).
Then it pipes these events into the rename command to change the names of the coordinate fields, from lat and lon to
latitude and longitude. (The table command doesn't let you rename or reformat fields, only specify the fields that you
want to show in your tabulated results.)
Finally, it pipes the results into the table command and specifies both coordinate fields with lat*, lon*, the magnitude
with mag, and the date and time with time.
This example just illustrates how the table command syntax allows you to specify multiple fields using the asterisk
wildcard.
Example 3
This example uses the sample dataset from the tutorial but should work with any format of Apache Web access log. Download the data set from
the Add data tutorial and follow the instructions to get the sample data into your Splunk deployment. Then, run this search using the time
range, All time.
Search for IP addresses and classify the network they belong to.
This example searches for Web access data and uses the dedup command to remove duplicate values of the IP
addresses (clientip) that access the server. These results are piped into the eval command, which uses the
cidrmatch() function to compare the IP addresses to a subnet range (192.0.0.0/16). This search also uses the if()
function, which says that if the value of clientip falls in the subnet range, then network is given the value local.
Otherwise, network=other.
The results are then piped into the table command to show only the distinct IP addresses (clientip) and the network
classification (network):
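A sketch of this search, assuming the tutorial's sourcetype=access_* data:
sourcetype=access_* | dedup clientip | eval network=if(cidrmatch("192.0.0.0/16", clientip), "local", "other") | table clientip, network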
More examples
Example 1: Create a table for fields foo, bar, then all fields that start with 'baz'.
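A sketch of such a search:
... | table foo bar baz*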
See Also
fields
tags
Description
Annotates specified fields in your search results with tags. If fields are specified, annotates tags only for those fields.
Otherwise, looks for tags for all fields.
Syntax
tags
[outputfield=<field>]
[inclname=<bool>]
[inclvalue=<bool>]
<field-list>
Required arguments
None.
Optional arguments
<field-list>
Syntax: <field> <field> ...
Description: Specify the fields that you want to output the tags from. The tags are written to the outputfield.
Default: All fields
outputfield
Syntax: outputfield=<field>
Description: If specified, the tags for all of the fields are written to this one new field. If not specified, a new field
is created for each field that contains tags. The tags are written to these new fields using the naming convention
tag::<field>. In addition, a new field is created called tags that lists all of the tags in all of the fields.
Default: New fields are created and the tags are written to the new fields.
inclname
Syntax: inclname=true | false
Description: If outputfield is specified, this controls whether or not the event field name is added to the output
field, along with the tags. Specify true to include the field name.
Default: false
inclvalue
Syntax: inclvalue=true | false
Description: If outputfield is specified, controls whether or not the event field value is added to the output field,
along with the tags. Specify true to include the event field value.
Default: false
Usage
If outputfield is specified, the tags for the fields are written to this field. By default the tag is written to the outputfield, in
the format <field>::<tag>.
If outputfield is specified, the inclname and inclvalue arguments control whether or not the field name and field values
are added to the outputfield. If both inclname and inclvalue are set to true, the format <field>::<value>::<tag> is used.
Examples
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
This search looks for web access events and counts those events by host.
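The search is not shown in this extract. A sketch that produces this kind of result, assuming the tutorial's sourcetype=access_* data:
sourcetype=access_* | stats count BY host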
host count
www1 13628
www2 12912
www3 12992
When you use the tags command without any arguments, two new fields are added to the results: tag and tag::host.
host count tag tag::host
www3 12992
There are no tags for host=www3.
The tag::host field lists all of the tags used in the events that contain that host value.
The tag::sourcetype field lists all of the tags used in the events that contain that sourcetype value.
Write tags for host and eventtype fields in the format tag::host and tag::eventtype.
Write the tags for all fields to the new field test.
host sourcetype count test
tag1
www3 access_combined_wcookie 12992 apache
4. Including the field names in the search results
Write the tags for the host and sourcetype fields into the test field in the format host::<tag> or sourcetype::<tag>.
Include the field name in the output.
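A sketch of this search, building on a count by host and sourcetype:
sourcetype=access_* | stats count BY host sourcetype | tags outputfield=test inclname=t host sourcetype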
See also
Related information
About tags and aliases in the Knowledge Manager Manual
Tag field-value pairs in Search in the Knowledge Manager Manual
Define and manage tags in Settings in the Knowledge Manager Manual
Commands
eval
tail
Description
Returns the last N results. The events are returned in reverse order, starting at the end of the result
set. The last 10 events are returned if no integer is specified.
Syntax
tail [<N>]
Required arguments
None.
Optional arguments
<N>
Syntax: <int>
Description: The number of results to return.
Default: 10
Usage
Examples
Example 1:
... | tail 20
See also
head, reverse
timechart
Description
A timechart is a statistical aggregation applied to a field to produce a chart, with time used as the X-axis. You can specify
a split-by field, where each distinct value of the split-by field becomes a series in the chart. If you use an eval expression,
the split-by clause is required. With the limit and agg options, you can specify series filtering. These options are ignored
if you specify an explicit where-clause. If you set limit=0, no series filtering occurs.
Syntax
timechart
[sep=<string>]
[format=<string>]
[partial=<bool>]
[cont=<bool>]
[limit=<chart-limit-opt>]
[agg=<stats-agg-term>]
[<bin-options>... ]
( (<single-agg> [BY <split-by-clause>] ) | (<eval-expression>) BY <split-by-clause> )
[<dedup_splitvals>]
Required arguments
eval-expression
Syntax: <math-exp> | <concat-exp> | <compare-exp> | <bool-exp> | <function-call>
Description: A combination of literals, fields, operators, and functions that represent the value of your destination
field. For these evaluations to work, your values need to be valid for the type of operation. For example, with the
exception of addition, arithmetic operations might not produce valid results if the values are not numerical.
Additionally, the search can concatenate the two operands if they are both strings. When concatenating values
with a period '.' the search treats both values as strings, regardless of their actual data type.
single-agg
Syntax: count | <stats-func>(<field>)
Description: A single aggregation applied to a single field, including an evaluated field. For <stats-func>, see
Stats function options. No wildcards are allowed. The field must be specified, except when using the count
function, which applies to events as a whole.
split-by-clause
Syntax: <field> (<tc-options>)... [<where-clause>]
Description: Specifies a field to split the results by. If field is numerical, default discretization is applied.
Discretization is defined with the tc-options. Use the <where-clause> to specify the number of columns to
include. See the tc options and the where clause sections in this topic.
Optional arguments
agg=<stats-agg-term>
Syntax: agg=( <stats-func> ( <evaled-field> | <wc-field> ) [AS <wc-field>] )
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval
expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you
specify. You can use wild card characters in field names.
bin-options
Syntax: bins | minspan | span | <start-end> | aligntime
Description: Options that you can use to specify discrete bins, or groups, to organize the information. The
bin-options set the maximum number of bins, not the target number of bins. See the Bin options section in this
topic.
Default: bins=100
cont
Syntax: cont=<bool>
Description: Specifies whether the chart is continuous or not. If set to true, the Search application fills in the time
gaps.
Default: true
dedup_splitvals
Syntax: dedup_splitvals=<boolean>
Description: Specifies whether to remove duplicate values in multivalued <split-by-clause> fields.
Default: false
fixedrange
Syntax: fixedrange=<bool>
Description: Specifies whether or not to enforce the earliest and latest times of the search. Setting
fixedrange=false allows the timechart command to constrict or expand to the time range covered by all events
in the dataset.
Default: true
format
Syntax: format=<string>
Description: Used to construct output field names when multiple data series are used in conjunction with a
split-by-field. format takes precedence over sep and allows you to specify a parameterized expression with the
stats aggregator and function ($AGG$) and the value of the split-by-field ($VAL$).
limit
Syntax: limit=(top | bottom) <int>
Description: Specifies a limit for the number of distinct values of the split-by field to return. If set to limit=0, all
distinct values are used. Setting limit=N or limit=top N keeps the N highest scoring distinct values of the
split-by field. Setting limit=bottom N keeps the lowest scoring distinct values of the split-by field. All other
values are grouped into 'OTHER', as long as useother is not set to false. The scoring is determined as follows:
◊ If a single aggregation is specified, the score is based on the sum of the values in the aggregation for that
split-by value. For example, for timechart avg(foo) BY <field>, the avg(foo) values are added up for
each value of <field> to determine the scores.
◊ If multiple aggregations are specified, the score is based on the frequency of each value of <field>. For
example, for timechart avg(foo) max(bar) BY <field>, the top scoring values for <field> are the most
common values of <field>.
Ties in scoring are broken lexicographically, based on the value of the split-by field. For example, 'BAR' takes
precedence over 'bar', which takes precedence over 'foo'. See Usage.
Default: top 10
partial
Syntax: partial=<bool>
Description: Controls if partial time bins should be retained or not. Only the first and last bin can be partial.
Default: True. Partial time bins are retained.
sep
Syntax: sep=<string>
Description: Used to construct output field names when multiple data series are used in conjunction with a
split-by field. This is equivalent to setting format to $AGG$<sep>$VAL$.
stats-func
Syntax: The syntax depends on the function that you use. See Usage.
Description: Statistical functions that you can use with the timechart command. Each time you invoke the
timechart command, you can use one or more functions. However, you can only use one BY clause.
Bin options
bins
Syntax: bins=<int>
Description: Sets the maximum number of bins to discretize into. This does not set the target number of bins. It
finds the smallest bin size that results in no more than N distinct bins. Even though you specify a number such as
300, the resulting number of bins might be much lower.
Default: 100
minspan
Syntax: minspan=<span-length>
Description: Specifies the smallest span granularity to use when automatically inferring the span from the data time range.
See Usage.
span
Syntax: span=<log-span> | span=<span-length> | span=<snap-to-time>
Description: Sets the size of each bin, using either a log-based span, a span length based on time, or a span
that snaps to a specific time. For descriptions of each of these options, see Span options.
The starting time of a bin might not match your local timezone. See Usage.
<start-end>
Syntax: end=<num> | start=<num>
Description: Sets the minimum and maximum extents for numerical bins. Data outside of the [start, end] range is
discarded.
aligntime
Syntax: aligntime=(earliest | latest | <time-specifier>)
Description: Align the bin times to something other than base UNIX time (epoch 0). The aligntime option is valid
only when doing a time-based discretization. Ignored if span is in days, months, or years.
Span options
<log-span>
Syntax: [<num>]log[<num>]
Description: Sets to log-based span. The first number is a coefficient. The second number is the base. If the first
number is supplied, it must be a real number >= 1.0 and < base. The base, if supplied, must be a real number > 1.0
(strictly greater than 1).
<span-length>
Syntax: <int>[<timescale>]
Description: A span of each bin, based on time. If the timescale is provided, this is used as a time range. If not,
this is an absolute bin length.
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <week> | <month> | <subseconds>
Description: Timescale units.
Default: <sec>
<snap-to-time>
Syntax: [+|-] [<time_integer>] <relative_time_unit>@<snap_to_time_unit>
Description: A span of each bin, based on a relative time unit and a snap to time unit. The <snap-to-time> must
include a relative_time_unit, the @ symbol, and a snap_to_time_unit. The offset, represented by the plus (+) or
minus (-) is optional. If the <time_integer> is not specified, 1 is the default. For example if you specify w as the
relative_time_unit, 1 week is assumed.
The option is used only with the week timescale unit. It cannot be used with other timescale units such as
minutes or quarters.
tc options
tc-option
Syntax: <bin-options> | usenull=<bool> | useother=<bool> | nullstr=<string> | otherstr=<string>
Description: Timechart options for controlling the behavior of splitting by a field.
bin-options
See the Bin options section in this topic.
nullstr
Syntax: nullstr=<string>
Description: If usenull=true, specifies the label for the series that is created for events that do not contain the
split-by field.
Default: NULL
otherstr
Syntax: otherstr=<string>
Description: If useother=true, specifies the label for the series that is created in the table and the graph.
Default: OTHER
usenull
Syntax: usenull=<bool>
Description: Controls whether or not a series is created for events that do not contain the split-by field. The label
for the series is controlled by the nullstr option.
Default: true
useother
Syntax: useother=<bool>
Description: You specify which series to include in the results table by using the <agg>, <limit>, and
<where-clause> options. The useother option specifies whether to merge all of the series not included in the
results table into a single new series. If useother=true, the label for the series is controlled by the otherstr
option.
Default: true
where clause
The <where-clause> is part of the <split-by-clause>. The <where-clause> is comprised of two parts, a single aggregation
and some options. See Where clause examples.
where clause
Syntax: <single-agg> <where-comp>
Description: Specifies the criteria for including particular data series when a field is given in the <tc-by-clause>.
The most common use of this option is to look for spikes in your data rather than overall mass of distribution in
series selection. The default value finds the top ten series by area under the curve. Alternately one could replace
sum with max to find the series with the ten highest spikes. Essentially the default is the same as specifying where
sum in top10. The <where-clause> has no relation to the where command.
<where-comp>
Syntax: <wherein-comp> | <wherethresh-comp>
Description: Specify either a grouping for the series or the threshold for the series.
<wherein-comp>
Syntax: (in | notin) (top | bottom)<int>
Description: A grouping criteria that requires the aggregated series value be in or not in some top or bottom
group.
<wherethresh-comp>
Syntax: (< | >) [" "] <num>
Description: A threshold criteria that requires the aggregated series value be greater than or less than some
numeric threshold. You can specify the threshold with or without a space between the sign and the number.
Usage
The timechart command accepts either the bins argument OR the span argument. If you specify both bins and span,
span is used. The bins argument is ignored.
If you do not specify either bins or span, the timechart command uses the default bins=100.
If you use the predefined time ranges in the time range picker, and do not specify the span argument, the following table
shows the default span that is used.
When you specify a minspan value, the span that is used for the search must be equal to or greater than one of the span
threshold values in the following table. For example, if you specify minspan=15m that is equivalent to 900 seconds. The
minimum span that can be used is 1800 seconds, or 30 minutes.
The span threshold values begin with 1 second, 5 seconds, 10 seconds, 30 seconds, and 60 seconds (1 minute).
The span argument always rounds down the starting date for the first bin. There is no guarantee that the bin start time
used by the timechart command corresponds to your local timezone. In part this is due to differences in daylight savings
time for different locales. To use day boundaries, use span=1d. Do not use span=86400s, span=1440m, or
span=24h.
The functions, per_day(), per_hour(), per_minute(), and per_second() are aggregator functions and are not responsible
for setting a time span for the resultant chart. These functions are used to get a consistent scale for the data when an
explicit span is not provided. The resulting span can depend on the search time range.
For example, per_hour() converts the field value so that it is a rate per hour, or sum()/<hours in the span>. If your chart
span ends up being 30m, it is sum()*2.
If you want the span to be 1h, you still have to specify the argument span=1h in your search.
You can do per_hour() on one field and per_minute() (or any combination of the functions) on a different field in the
same search.
Subsecond span timescales, which are time spans made up of deciseconds (ds), centiseconds (cs), milliseconds (ms), or
microseconds (us), should be numbers that divide evenly into a second. For example, 1s = 1000ms. This means that
valid millisecond span values are 1, 2, 4, 5, 8, 10, 20, 25, 40, 50, 100, 125, 200, 250, or 500ms. In addition, span = 1000ms
is not allowed. Use span = 1s instead.
Split-by fields
If you specify a split-by field, ensure that you specify the bins and span arguments before the split-by field. If you specify
these arguments after the split-by field, Splunk software assumes that you want to control the bins on the split-by field, not
on the time axis.
If you use chart or timechart, you cannot use a field that you specify in a function as your split-by field as well. For
example, you will not be able to run:
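A sketch of such a disallowed search, using a hypothetical field A:
... | timechart sum(A) BY A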
However, you can work around this with an eval expression, for example:
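A sketch of the workaround, again with hypothetical field names:
... | eval A1=A | timechart sum(A) BY A1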
Supported functions
You can use a wide range of functions with the timechart command. For general information about using functions, see
Statistical and charting functions.
Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the
distinct_count function requires far more memory than the count function. The values and list functions also can
consume a lot of memory.
If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider
replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might
result in significantly lower memory usage and run times.
Lexicographical order
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
You can specify a custom sort order that overrides the lexicographical order. See the blog Order Up! Custom Sort Orders.
Basic Examples
1. Chart the product of the average "CPU" and average "MEM" for each "host"
For each minute, compute the product of the average "CPU" and average "MEM" for each "host".
This example uses an eval expression that includes a statistical function, avg, to calculate the average of the cpu_seconds
field, rounded to 2 decimal places. The results are organized by the values in the processor field. When you use an eval
expression with the timechart command, you must also use a BY clause.
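The searches for these eval-expression examples are not preserved in this extract. Sketches consistent with the descriptions, using the CPU, MEM, cpu_seconds, host, and processor field names from the descriptions:
... | timechart span=1m eval(avg(CPU) * avg(MEM)) BY host
... | timechart eval(round(avg(cpu_seconds), 2)) BY processor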
For each minute, calculate the average value of "CPU" for each "host".
Calculate the average "cpu_seconds" by "host". Remove outlying values that might distort the timechart axis.
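Sketches of these two searches; the outlier command is one way to handle outlying values:
... | timechart span=1m avg(CPU) BY host
... | timechart avg(cpu_seconds) BY host | outlier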
For each minute, count the eventtypes by source_ip, where the count is greater than 10.
sshd failed OR failure | timechart span=1m count(eventtype) BY source_ip usenull=f WHERE count>10
Align the time bins to 5am (local time). Set the span to 12h. The bins will represent 5am - 5pm, then 5pm - 5am (the next
day), and so on.
For each unique value of mvfield, return the average value of field. Deduplicates the values in the mvfield.
Extended Examples
This example uses the sample dataset from the Search Tutorial and a field lookup to add more information to the event data. To try this example
for yourself:
• Download the tutorialdata.zip file from this topic in the Search Tutorial and follow the instructions to upload the file to your
Splunk deployment.
• Download the Prices.csv.zip file from this topic in the Search Tutorial and follow the instructions to set up your field lookup.
• Use the time range Yesterday when you run the search.
The tutorialdata.zip file includes a productId field that is the catalog number for the items sold at the Buttercup Games
online store. The field lookup uses the prices.csv file to add two new fields to your events: productName, which is a
descriptive name for the item, and price, which is the cost of the item.
Chart the revenue for the different products that were purchased yesterday.
• This example searches for all purchase events (defined by the action=purchase).
• The results are piped into timechart command.
• The per_hour() function sums up the values of the price field for each productName and organizes the total by
time.
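The search is not preserved here. A sketch that matches the bullets above, assuming the tutorial's sourcetype=access_* data:
sourcetype=access_* action=purchase | timechart per_hour(price) BY productName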
This search produces the following table of results in the Statistics tab. To format the numbers to the proper digits for
currency, click the format icon in the column heading. On the Number Formatting tab, select the Precision.
Click the Visualization tab. If necessary, change the chart to a column chart. On the Format menu, the General tab
contains the Stack Mode option where you can change the chart to a stacked chart.
After you create this chart, you can position your mouse pointer over each section to view more metrics for the product
purchased at that hour of the day.
Notice that the chart does not display the data in hourly spans. Because a span is not provided (such as span=1hr), the
per_hour() function converts the value so that it is a sum per hours in the time range (which in this example is 24 hours).
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
Chart the number of purchases made daily for each type of product.
• This example searches for all purchase events, defined by action=purchase, and pipes those results into the
timechart command.
• The span=1day argument buckets the count of purchases over the week into daily chunks.
• The usenull=f argument ignores any events that contain a NULL value for categoryId.
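A sketch of this search, consistent with the bullets above:
sourcetype=access_* action=purchase | timechart span=1day count BY categoryId usenull=f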
The results appear on the Statistics tab and look something like this:
2018-03-30 62 63 39 30 22 127 56
2018-03-31 65 94 38 42 34 128 60
2018-04-01 54 82 42 39 13 115 66
2018-04-02 52 63 45 42 22 124 52
2018-04-03 46 76 34 42 19 123 59
2018-04-04 57 70 36 38 20 130 56
2018-04-05 46 72 35 37 13 106 46
Click the Visualization tab. If necessary, change the chart to a column chart.
Compare the number of different items purchased each day and over the course of the week.
This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that
contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.
You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance.
This example uses the All Earthquakes data from the past 30 days.
This search counts the number of earthquakes in Alaska where the magnitude is greater than or equal to 3.5. The results
are organized in spans of 1 week, where the week begins on Monday.
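The search is not shown in this extract. A sketch consistent with the description; the week-starting-Monday span syntax and the place filter are assumptions:
source=all_month.csv place=*alaska* mag>=3.5 | timechart span=1w@w1 count BY mag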
The results appear on the Statistics tab and look something like this:
_time 3.5 3.6 3.7 3.8 4 4.1 4.2 4.3 4.4 4.5 OTHER
2018-03-26 3 3 2 2 3 1 0 2 1 1 1
2018-04-02 5 7 2 0 3 2 1 0 0 1 1
2018-04-09 2 3 1 2 0 2 1 1 0 1 2
2018-04-16 6 5 0 1 2 2 2 0 0 2 1
2018-04-23 2 0 0 0 0 2 1 2 2 0 1
4. Count the revenue for each item over time
This example uses the sample dataset from the Search Tutorial and a field lookup to add more information to the event data. Before you run this
example:
• Download the data set from this topic in the Search Tutorial and follow the instructions to upload it to your Splunk deployment.
• Download the Prices.csv.zip file from this topic in the Search Tutorial and follow the instructions to set up your field lookup.
The original data set includes a productId field that is the catalog number for the items sold at the Buttercup Games
online store. The field lookup adds two new fields to your events: productName, which is a descriptive name for the item,
and price, which is the cost of the item.
Count the total revenue made for each item sold at the shop over the last 7 days. This example shows two different
searches to generate the calculations.
Search 1
The first search uses the span argument to bucket the times of the search results into 1 day increments. The
search then uses the sum() function to add the price for each product_name.
sourcetype=access_* action=purchase | timechart span=1d sum(price) by productName usenull=f
Search 2
This second search uses the per_day() function to calculate the total of the price values for each day.
sourcetype=access_* action=purchase | timechart per_day(price) by productName usenull=f
Both searches produce similar results. Search 1 produces values with two decimal places. Search 2 produces values with
six decimal places. The following image shows the results from Search 1.
Click the Visualization tab. If necessary, change the chart to a column chart.
Now you can compare the total revenue made for items purchased each day and over the course of the week.
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
Chart a single day's views and purchases at the Buttercup Games online store.
• This search uses the per_hour() function and eval expressions to search for page views (method=GET) and
purchases (action=purchase).
• The results of the eval expressions are renamed as Views and Purchases, respectively.
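A sketch of this search, assuming the tutorial's sourcetype=access_* data:
sourcetype=access_* | timechart per_hour(eval(method="GET")) AS Views, per_hour(eval(action="purchase")) AS Purchases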
The results appear on the Statistics tab and look something like this:
_time Views Purchases
2018-04-05 01:00:00 214.000000 72.000000
The difference between the two areas indicates that many of the views did not lead to purchases. If all of the views
became purchases, you would expect the areas to overlay each other completely. There would be no difference
between the two areas.
These examples use the where clause to control the number of series values returned in the time-series chart.
Example 1: Show the 5 most rare series based on the minimum count values. All other series values will be labeled as
"other".
Example 2: Show the 5 most frequent series based on the maximum values. All other series values will be labeled as
"other".
These two searches return six data series: the five top or bottom series specified and the series labeled other. To hide
the "other" series, specify the argument useother=f.
Example 3: Show the source series count of INFO events, but only where the total number of events is larger than 100.
All other series values will be labeled as "other".
Example 4: Using the where clause with the count function measures the total number of events over the period. This
yields results similar to using the sum function.
The following two searches return the source series with a total count of events greater than 100. All other series values
will be labeled as "other".
See also
Commands
bin
chart
sitimechart
timewrap
Blogs
Search commands > stats, chart, and timechart
timewrap
Description
Displays, or wraps, the output of the timechart command so that every period of time is a different series.
You can use the timewrap command to compare data over a specific time period, such as day-over-day or
month-over-month. You can also use the timewrap command to compare multiple time periods, such as a two week
period over another two week period. See Timescale options.
Syntax
timewrap
<timewrap-span>
[align=now | end]
[series=relative | exact | short]
[time_format=<str>]
Required arguments
timewrap-span
Syntax: [<int>]<timescale>
Description: A span of each bin, based on time. The timescale is required. The int is not required. If <int> is not
specified, 1 is assumed. For example if day is specified for the timescale, 1day is assumed. See Timescale
options.
Optional arguments
align
Syntax: align=now | end
Description: Specifies if the wrapping should be aligned to the current time or the end time of the search.
Default: end
series
Syntax: series=relative | exact | short
Description: Specifies how the data series is named. If series=relative and timewrap-span is set to week, the
field names are latest_week, 1week_before, 2weeks_before, and so forth. If series=exact, use the time_format
argument to specify a custom format for the series names. If series=short, the field names are an abbreviated
version of the field names used with series=relative. With series=short, the field names are abbreviated to "s"
followed by a number representing the period of time. For example, if timewrap-span is set to week, the field
names are s0, s1, s2 and so forth. The field s0 represents the latest week. The field s1 represents 1 week before
the latest week.
Default: relative
time_format
Syntax: time_format=<str>
Description: Use with series=exact to specify a custom name for the series. The time_format is designed to be
used with the time format variables. For example, if you specify time_format="week of %d/%m/%y", this format
appears as week of 13/2/17 and week of 20/2/17. If you specify time_format=week of %b %d, this format
appears as week of Feb 13 and week of Feb 20. See the Usage section.
Default: None
Timescale options
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <week> | <month> | <quarter> | <year>
Description: Time scale units.
The timewrap command uses the abbreviation m to refer to months. Other commands, such as timechart and bin, use
the abbreviation m to refer to minutes.
Usage
You must use the timechart command in the search before you use the timewrap command.
The wrapping is based on the end time of the search. If you specify the time range of All time, the wrapping is based on
today's date. You see this in the timestamps for the _time field and in the data series names.
If you use a BY clause in the timechart command part of your search, the field names generated by the timewrap
command are appended to the field names generated with the BY clause. For example, suppose you have a search that
includes BY categoryId in the timechart command and the results look something like this:
2020-05-22 62 22 127
2020-05-23 65 34 128
2020-05-24 5 17 32
2020-05-25 62 22 127
2020-05-26 65 34 128
When you add the timewrap command, such as | timewrap w series=short, the series field names are appended to the
category ID names from the timechart BY clause.
2020-05-22 62 22 127
2020-05-23 65 34 128
2020-05-24 5 17 32
2020-05-25 62 22 127 17 54 39
2020-05-26 65 34 128
If you do not include any time specifiers with the time_format argument, all of the data series display the same name and
are compressed into each other.
Examples
Display a timechart that has a span of 1 day for each count in a week over week comparison table. Each table column,
which is the series, is 1 week of time.
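A sketch of this search:
... | timechart count span=1d | timewrap 1week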
To compare a few days with the weekly average, you need to calculate the daily totals, calculate the weekly average, and
remove the days you don't want to use. For example:
...| timechart count span=1h | timewrap d series=short | addtotals s* | eval 7dayavg=Total/7.0 | table
_time, _span, s0, s1, 7dayavg | rename s0 as now, s1 as yesterday
• Use the timewrap command to generate results over the last 7 days.
• By using the series=short argument, field names are generated in the output which start with "s", making it easy
to create totals using the addtotals command.
• Use the addtotals and eval commands to calculate the average over those 7 days.
• The table command is used to cut out days 3-7 so that only today, yesterday, and the weekly average are
returned.
• The rename command is used to rename the fields.
You can compare a day of the week to the same day in previous weeks by specifying a filter at the end of the search. For
example, to compare Wednesdays, add a filter on the day of the week at the end of the search.
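A sketch of such a search, using the strftime function to keep only the rows that fall on a Wednesday (the base search and spans are placeholders):

... | timechart count span=1h | timewrap 1week | where strftime(_time, "%A") == "Wednesday"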
_time 4weeks_before 3weeks_before 2weeks_before 1week_before latest_week
2020-02-19 01:00 2 0 0 0 1
2020-02-19 02:00 3 5 7 2 0
2020-02-19 03:00 6 4 0 1 2
2020-02-19 04:00 9 0 4 0 0
2020-02-19 05:00 2 8 7 3 1
2020-02-19 06:00 4 2 7 0 1
2020-02-19 07:00 6 9 2 2 0
If you change the timechart span to 1d instead of 1h, the output contains one row for each day instead of one row for each hour.
See also
timechart
tojson
Description
Converts events into JSON objects. You can specify which fields get converted by identifying them through exact match
or through wildcard expressions. You can also apply specific JSON datatypes to field values using datatype functions.
The tojson command converts multivalue fields into JSON arrays.
When fields are specifically named in a tojson search, the command generates JSON objects that are limited to the
values of just those named fields. If no fields are specified for tojson, tojson generates JSON objects for all fields that
would otherwise be returned by the search.
Syntax
| tojson
[<tojson-function>]...
[default_type=<datatype>]
[fill_null=<boolean>]
[include_internal=<boolean>]
[output_field=<string>]
Optional arguments
tojson-function
Syntax: [auto | bool | json | none | num | str](<wc-field>)...
Description: Applies JSON datatype functions to values of named fields. See Usage for details about how tojson
interprets these datatype functions, and how tojson applies datatypes to field values when it converts events into
JSON objects.
If you provide no fields, the tojson processor creates JSON objects for each event that include all available fields.
In other words, it applies none(*) to the search.
Default: none(*)
default_type
Syntax: default_type=<datatype>
Description: Specifies the datatype that the tojson processor should apply to fields that aren't specifically
associated with a datatype function.
Default: none
fill_null
Syntax: fill_null=<boolean>
Description: When set to true, tojson outputs a literal null value when tojson skips a value. For example,
normally, when tojson tries to apply the json datatype to a field that does not have proper JSON formatting,
tojson skips the field. However, if fill_null=true, the tojson processor outputs a null value.
Default: false
include_internal
Syntax: include_internal=<boolean>
Description: When set to true, tojson includes internal fields such as _time, _indextime, or _raw in its JSON
object output.
Default: false
output_field
Syntax: output_field=<string>
Description: Specifies the name of the field to which the tojson search processor writes the output JSON
objects.
Default: _raw
Usage
The tojson command is a streaming command, which means it operates on each event as it is returned by the search.
See Types of commands.
The tojson command applies JSON datatypes to field values according to logic encoded in its datatype functions.
You can assign specific datatype functions to fields when you write a tojson search. Alternatively, you can name a set of
fields without associating them with datatype functions, and then identify a default_type that tojson can apply to those
unaffiliated fields.
If you do not specify any fields for the tojson command, tojson returns JSON objects for each field that can possibly
be returned by the search at that point, and applies the none datatype function to the values of those fields. The none
datatype function applies the numeric datatype to field values that are purely numeric, and applies the string datatype to
all other field values.
The following table explains the logic that the various datatype functions use to apply datatypes to the values of the fields
with which they are associated.
Datatype function    Conversion logic

auto
Converts all values of the specified field into JSON-formatted output. Automatically determines the field datatypes.
• If the value is numeric, the JSON output has a numeric type and includes a literal numeric value.
• If the value is the string true or false, the JSON output has a Boolean type.
• If the value is a literal null, the JSON output has a null type and includes a null value.
• If the value is a string other than the previously mentioned strings, tojson examines the string. If it is proper
JSON, tojson outputs a nested JSON object. If it is not proper JSON, tojson includes the string in the output.

bool
Converts valid values of the specified field to the Boolean datatype, and skips invalid values, using string validation.
• If the value is a number, tojson outputs false only if that value is 0. Otherwise tojson outputs true.
• If the value is a string, tojson outputs false only if the value is false, f, or no.
• The tojson processor outputs true only if the value is true, t, or yes. If the value does not fit into those
two sets of strings, it is skipped.
• The validation for the bool datatype function is case insensitive. This means that it also interprets FALSE, False,
F, and NO as false.

json
Converts all values of the specified field to the JSON type, using string validation. Skips values with invalid JSON.

none
Outputs all values of the specified field in the JSON type. Does not apply string validation.
• If the value is a number, tojson outputs a numeric datatype in the JSON block.
• If the value is a string, tojson outputs a string datatype.

num
Converts all values of the specified field to the numeric type, using string validation.
• If the value is a number, tojson outputs that value and gives it the numeric datatype.
• If the value is a string, tojson attempts to parse the string as a number. If it cannot, it skips the value.

str
Converts all values of the specified field into the string datatype, using string validation. The tojson processor applies
the string type to all values of the specified field, even if they are numbers, Boolean values, and so on.
When a field includes multivalues, tojson outputs a JSON array and applies the datatype function logic to each element
of the array.
Examples
1. Convert all events returned by a search into JSON objects
This search of index=_internal converts all events it returns for its time range into JSON-formatted data.
Because the search string does not assign datatype functions to specific fields, by default tojson applies the none
datatype function to all fields returned by the search. This means all of their values get either the numeric or string
datatypes.
index=_internal | tojson
For example, say you start with events that look like this:
instantaneous_eps=27.194, average_kbps=5.652, total_k_processed=444500.000, kb=180.443, ev=843,
load_average=19.780
After being processed by tojson, such events have JSON formatting like this:
{
component: Metrics
date_hour: 18
date_mday: 18
date_minute: 22
date_month: december
date_second: 9
date_wday: friday
date_year: 2020
date_zone: 0
event_message: group=thruput, name=thruput, instantaneous_kbps=2.914, instantaneous_eps=13.903,
average_kbps=5.062, total_k_processed=398412.000, kb=90.338, ev=431, load_average=14.690
group: thruput
host: sh1
index: _internal
linecount: 1
log_level: INFO
name: thruput
punct: --_::._+____-_=,_=,_=.,_=.,_=.,_=.,_=.,_=,_=.
source: /opt/splunk/var/log/splunk/metrics.log
sourcetype: splunkd
splunk_server: idx2
timeendpos: 29
timestartpos: 0
}
2. Specify different datatypes for 'date' fields
The following search of the _internal index converts results into JSON objects that have only the date_* fields
from each event. The numeric datatype is applied to all date_hour field values. The string datatype is applied to all
other date field values.
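One way to write this search is to name the date fields explicitly and assign each one a datatype function (adjust the field list to match the date_* fields in your events):

index=_internal | tojson num(date_hour) str(date_mday) str(date_minute) str(date_month) str(date_second) str(date_wday) str(date_year) str(date_zone)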
{
date_hour: 18
date_mday: 18
date_minute: 28
date_month: december
date_second: 45
date_wday: friday
date_year: 2020
date_zone: 0
}
Note that all fields that do not start with date_ have been stripped from the output.
3. Limit JSON object output and apply datatypes to the field values
This search returns JSON objects only for the name, age, and isRegistered fields. It uses the auto datatype
function to have tojson automatically apply appropriate JSON datatypes to the values of those fields.
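A sketch of such a search, where name, age, and isRegistered are fields from your own data:

... | tojson auto(name) auto(age) auto(isRegistered)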
4. Convert all events into JSON objects and apply appropriate datatypes to all field values
This search converts all of the fields in each event returned by the search into JSON objects. It uses the auto
datatype function in conjunction with a wildcard to apply appropriate datatypes to the values of all fields returned
by the search.
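A sketch of such a search:

... | tojson auto(*)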
Notice that this search references the auto datatype function, which ensures that Boolean, JSON, and null field
values are appropriately typed alongside numeric and string values.
Alternatively, you can use default_type to apply the auto datatype function to all fields returned by a search:
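A minimal sketch, reusing the _internal events from the first example:

index=_internal | tojson default_type=auto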
This example generates JSON objects containing values of the isInternal field. It uses the bool datatype
function to apply the Boolean datatype to those field values.
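A sketch of such a search, where isInternal is a field from your own data:

... | tojson bool(isInternal)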
7. Designate a default datatype for a set of fields and write the JSON objects to another field
This search generates JSON objects based on the values of four fields. It uses the default_type argument to
convert the first three fields to the num datatype. It applies the string datatype to a fourth field. Finally, it writes the
finished JSON objects to the field my_JSON_field.
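A sketch of such a search, using hypothetical field names (bytes_in, bytes_out, packets, and status stand in for the four fields described):

... | tojson bytes_in bytes_out packets str(status) default_type=num output_field=my_JSON_field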
See also
Evaluation functions
JSON functions
top
Description
Finds the most common values for the fields in the field list. Calculates a count and a percentage of the frequency the
values occur in the events. If the <by-clause> is included, the results are grouped by the field you specify in the
<by-clause>.
Syntax
top [<N>] [<top-options>...] <field-list> [<by-clause>]
Required arguments
<field-list>
Syntax: <field>, <field>, ...
Description: Comma-delimited list of field names.
Optional arguments
<N>
Syntax: <int>
Description: The number of results to return.
Default: 10
<top-options>
Syntax: countfield=<string> | limit=<int> | otherstr=<string> | percentfield=<string> | showcount=<bool> |
showperc=<bool> | useother=<bool>
Description: Options for the top command. See Top options.
<by-clause>
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
Top options
countfield
Syntax: countfield=<string>
Description: For each value returned by the top command, the results also return a count of the events that have
that value. This argument specifies the name of the field that contains the count. The count is returned by default.
If you do not want to return the count of events, specify showcount=false.
Default: count
limit
Syntax: limit=<int>
Description: Specifies how many results to return. To return all values, specify zero ( 0 ). Specifying top
limit=<int> is the same as specifying top N.
Default: 10
otherstr
Syntax: otherstr=<string>
Description: If useother=true, a row representing all other values is added to the results. Use
otherstr=<string> to specify the name of the label for the row.
Default: OTHER
percentfield
Syntax: percentfield=<string>
Description: For each value returned by the top command, the results also return a percentage of the events
that have that value. This argument specifies the name of the field that contains the percentage. The percentage
is returned by default. If you do not want to return the percentage of events, specify showperc=false.
Default: percent
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative
prevalence of that tuple.
Default: true
useother
Syntax: useother=<bool>
Description: Specify whether or not to add a row that represents all values not included due to the limit cutoff.
Default: false
Usage
Default fields
When you use the top command, two fields are added to the results: count and percent.
Field    Description
count    The number of events in your search results that contain the field values that are returned by the top command. See the countfield and showcount arguments.
percent  The percentage of events in your search results that contain the field values that are returned by the top command. See the percentfield and showperc arguments.
Default maximum number of results
By default the top command returns a maximum of 50,000 results. This maximum is controlled by the maxresultrows
setting in the [top] stanza in the limits.conf file. Increasing this limit can result in more memory usage.
Only users with file system access, such as system administrators, can edit the configuration files. Never change or
copy the configuration files in the default directory. The files in the default directory must remain intact and in their
original location. Make the changes in the local directory.
See How to edit a configuration file.
If you have Splunk Cloud Platform, you need to file a Support ticket to change this limit.
Examples
Example 1: Return the 20 most common values of a field
This search returns the 20 most common values of the "referer" field. The results show the number of events (count) that
contain each referer value, and the percent that each referer value represents of the total number of events.
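A sketch of this search, using web access events as the base search:

sourcetype=access_* | top limit=20 referer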
Example 2: Return top values for one field organized by another field
This search returns the top "action" values for each "referer_domain".
Because a limit is not specified, this returns all the combinations of values for "action" and "referer_domain" as well as the
counts and percentages:
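A sketch of this search, again using web access events:

sourcetype=access_* | top action by referer_domain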
Example 3: Returns the top product purchased for each category
This example uses the sample dataset from the Search Tutorial and a field lookup to add more information to the event data.
• Download the data set from Add data tutorial and follow the instructions to load the tutorial data.
• Download the CSV file from Use field lookups tutorial and follow the instructions to set up the lookup definition to add price and
productName to the events.
After you configure the field lookup, you can run this search using the time range, All time.
This search returns the top product purchased for each category. Do not show the percent field. Rename the count field to
"total".
See also
transaction
Description
The transaction command finds transactions based on events that meet various constraints. Transactions are made up of
the raw text (the _raw field) of each member, the time and date fields of the earliest member, as well as the union of all
other fields of each member.
Additionally, the transaction command adds two fields to the raw events, duration and eventcount. The values in the
duration field show the difference, in seconds, between the timestamps for the first and last events in the transaction. The
values in the eventcount field show the number of events in the transaction.
Syntax
transaction
[<field-list>]
[name=<transaction-name>]
[<txn_definition-options>...]
[<memcontrol-options>...]
[<rendering-options>...]
Required arguments
None.
Optional arguments
field-list
Syntax: <field> ...
Description: One or more field names. The events are grouped into transactions, based on the unique values in
the fields. For example, suppose two fields are specified: client_ip and host. For each client_ip value, a
separate transaction is returned for each unique host value for that client_ip.
memcontrol-options
Syntax: <maxopentxn> | <maxopenevents> | <keepevicted>
Description: These options control the memory usage for your transactions. They are not required, but you can
use 0 or more of the options to define your transaction. See Memory control options.
name
Syntax: name=<transaction-name>
Description: Specify the stanza name of a transaction that is configured in the transactiontypes.conf file. This
runs the search using the settings defined in this stanza of the configuration file. If you provide other transaction
definition options (such as maxspan) in this search, they overrule the settings in the configuration file.
rendering-options
Syntax: <delim> | <mvlist> | <mvraw> | <nullstr>
Description: These options control the multivalue rendering for your transactions. They are not required, but you
can use 0 or more of the options to define your transaction. See Multivalue rendering options.
txn_definition-options
Syntax: <maxspan> | <maxpause> | <maxevents> | <startswith> | <endswith> | <connected> | <unifyends> |
<keeporphans>
Description: Specify the transaction definition options to define your transactions. You can use multiple options
to define your transaction.
connected
Syntax: connected=<bool>
Description: Only relevant if <field-list> is specified. If an event contains fields required by the transaction, but
none of these fields have been instantiated in the transaction (added with a previous event), this opens a new
transaction (connected=true) or adds the event to the transaction (connected=false). For multivalue fields, specifying
connected=false merges the events into one transaction if at least one value of the multivalue field is the same in
the events. See Usage.
Default: true
endswith
Syntax: endswith=<filter-string>
Description: A search or eval expression which, if satisfied by an event, marks the end of a transaction.
keeporphans
Syntax: keeporphans=true | false
Description: Specify whether the transaction command should output the results that are not part of any
transactions. The results that are passed through as "orphans" are distinguished from transaction events with a
_txn_orphan field, which has a value of 1 for orphan results.
Default: false
maxspan
Syntax: maxspan=<int>[s | m | h | d]
Description: Specifies the maximum length of time in seconds, minutes, hours, or days that the events can span.
The events in the transaction must span less than the integer specified for maxspan. Events that exceed the maxspan
limit are treated as part of a separate transaction. If the value is negative, the maxspan constraint is disabled and
there is no limit.
Default: -1 (no limit)
maxpause
Syntax: maxpause=<int>[s | m | h | d]
Description: Specifies the maximum length of time in seconds, minutes, hours, or days for the pause between
the events in a transaction. If the value is negative, the maxpause constraint is disabled and there is no limit.
Default: -1 (no limit)
maxevents
Syntax: maxevents=<int>
Description: The maximum number of events in a transaction. If the value is negative this constraint is disabled.
Default: 1000
startswith
Syntax: startswith=<filter-string>
Description: A search or eval filtering expression which if satisfied by an event marks the beginning of a new
transaction.
unifyends
Syntax: unifyends=true | false
Description: Whether to force events that match startswith and endswith constraints to also match at least one
of the fields used to unify events into a transaction.
Default: false
These options are used with the startswith and endswith arguments.
<filter-string>
Syntax: <search-expression> | (<quoted-search-expression>) | eval(<eval-expression>)
Description: A search or eval filtering expression which, if satisfied by an event, marks the start or end of a transaction, depending on whether it is used with startswith or endswith.
<search-expression>
Description: A valid search expression that does not contain quotes.
<quoted-search-expression>
Description: A valid search expression that contains quotes.
<eval-expression>
Description: A valid eval expression that evaluates to a Boolean.
Memory control options
If you have Splunk Cloud Platform, Splunk Support administers the settings in the limits.conf file on your behalf.
keepevicted
Syntax: keepevicted=<bool>
Description: Whether to output evicted transactions. Evicted transactions can be distinguished from non-evicted
transactions by checking the value of the 'closed_txn' field. The 'closed_txn' field is set to '0', or false, for evicted
transactions and '1', or true for non-evicted, or closed, transactions. The 'closed_txn' field is set to '1' if one of the
following conditions is met: maxevents, maxpause, maxspan, startswith. For startswith, because the
transaction command sees events in reverse time order, it closes a transaction when it satisfies the start
condition. If none of these conditions is specified, all transactions are output even though all transactions will have
'closed_txn' set to '0'. A transaction can also be evicted when the memory limitations are reached.
Default: false or 0
maxopenevents
Syntax: maxopenevents=<int>
Description: Specifies the maximum number of events that can be part of open transactions before transaction
eviction starts, using LRU policy.
Default: The default value for this argument is read from the transactions stanza in the limits.conf file.
maxopentxn
Syntax: maxopentxn=<int>
Description: Specifies the maximum number of not yet closed transactions to keep in the open pool before
starting to evict transactions, using LRU policy.
Default: The default value for this argument is read from the transactions stanza in the limits.conf file.
delim
Syntax: delim=<string>
Description: Specify a character to separate multiple values. When used in conjunction with the mvraw=t
argument, represents a string used to delimit the values in the _raw field.
Default: " " (whitespace)
mvlist
Syntax: mvlist= true | false | <field-list>
Description: Flag that controls how multivalued fields are processed. When set to mvlist=true, the multivalued
fields in the transaction are a list of the original events ordered in arrival order. When set to mvlist=false, the
multivalued fields in the transaction are a set of unique field values ordered alphabetically. If a comma or space
delimited list of fields is provided, only those fields are rendered as lists.
Default: false
mvraw
Syntax: mvraw=<bool>
Description: Used to specify whether the _raw field of the transaction search result should be a multivalued field.
Default: false
nullstr
Syntax: nullstr=<string>
Description: A string value to use when rendering missing field values as part of multivalued fields in a
transaction. This option applies only to fields that are rendered as lists.
Default: NULL
Usage
The transaction command is a centralized streaming command in most cases; however, it is a dataset processing
command when used with specific arguments. See Command types.
In the output, the events in a transaction are grouped together as multiple values in the Events field. Each event in a
transaction starts on a new line by default.
If there are more than 5 events in a transaction, the remaining events in the transaction are collapsed. A message
appears at the end of the transaction which gives you the option to show all of the events in the transaction.
The Splunk software does not necessarily interpret the transaction defined by multiple fields as a conjunction (field1 AND
field2 AND field3) or a disjunction (field1 OR field2 OR field3) of those fields. If there is a transitive relationship
between the fields in the fields list and if the related events appear in the correct sequence, each with a different
timestamp, the transaction command will try to use it. For example, if you searched for
event=1 host=a
event=2 host=a cookie=b
event=3 cookie=b
the transaction command can group all three events into a single transaction: host=a ties the first two events together, and cookie=b ties the second and third events together.
The transaction command requires that the incoming events be in descending time order. Some commands, such as
eval, might change the order or time labeling of events. If one of these commands precedes the transaction command,
your search returns an error unless you include a sort command in your search. The sort command must occur
immediately before the transaction command to reorder the search results in descending time order.
Multivalue fields
If one of the fields in your events is a multivalue field, you must specify connected=false to have the values in the field
merged when the transaction is created. The values are merged when at least one value in the multivalue field overlaps
between events. For example, if the multivalue field in one event contains a b c and in another event contains c d e, the
fields are merged when connected=false is specified.
Basic Examples
1. Transactions with the same host and cookie value
Group search results that have the same host and cookie value, occur within 30 seconds, and do not have a pause of
more than 5 seconds between the events.
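A sketch of such a search:

... | transaction host cookie maxspan=30s maxpause=5s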
2. Transactions with the same "from" value, time range, and pause
Group search results that have the same value of "from", with a maximum span of 30 seconds, and a pause between
events no greater than 5 seconds into a transaction.
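A sketch of such a search:

... | transaction from maxspan=30s maxpause=5s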
3. Transactions with the same alert level
You have events that include an alert_level. You want to create transactions where the alert level is the same. Using the
streamstats command, you can remember the value of the alert level for the current and previous event. Using the
transaction command, you can create a new transaction if the alert level is different. Output specific fields to a table.
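One possible sketch of such a search; the streamstats window and the choice of startswith versus endswith are illustrative and may need adjusting for your data:

... | streamstats current=f window=1 last(alert_level) AS last_alert_level | transaction startswith=eval(alert_level != last_alert_level) | table _time duration eventcount alert_level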
Extended Examples
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
Define a transaction based on Web access events that share the same IP address. The first and last events in the
transaction should be no more than thirty seconds apart and each event should not be longer than five seconds apart.
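A sketch of this search, using the tutorial's web access events:

sourcetype=access_* | transaction clientip maxspan=30s maxpause=5s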
This produces the following events list. The clientip for each event in the transaction is highlighted.
This search groups events together based on the IP addresses accessing the server and the time constraints. The search
results might have multiple values for some fields, such as host and source. For example, requests from a single IP could
come from multiple hosts if multiple people are shopping from the same office. For more information, read the topic About
transactions in the Knowledge Manager Manual.
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
Define a transaction based on Web access events that have a unique combination of host and clientip values. The first
and last events in the transaction should be no more than thirty seconds apart and each event should not be longer than
five seconds apart.
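A sketch of this search:

sourcetype=access_* | transaction clientip host maxspan=30s maxpause=5s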
Each of these events has a distinct combination of the IP address (clientip) values and host values within the limits of
the time constraints specified in the search.
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range Yesterday when you run the search.
This search defines a purchase transaction as 3 events from one IP address which occur in a 10 minute span of time.
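A sketch of this search:

sourcetype=access_* action=purchase | transaction clientip maxspan=10m maxevents=3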
This search defines a purchase event based on Web access events that have the action=purchase value. These results
are then piped into the transaction command. This search identifies purchase transactions by events that share the
same clientip, where each session lasts no longer than 10 minutes, and includes no more than 3 events.
This example uses sample email data. You should be able to run this search on any email data by replacing the sourcetype=cisco:esa with
the sourcetype value and the mailfrom field with the email address field name in your data. For example, the email address field might be To, From, or Cc.
This example defines an email transaction as a group of up to 10 events. Each event contains the same value for the mid
(message ID), icid (incoming connection ID), and dcid (delivery connection ID). The last event in the transaction contains
a Message done string.
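A sketch of this search:

sourcetype=cisco:esa | transaction mid dcid icid maxevents=10 endswith="Message done"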
By default, only the first 5 events in a transaction are shown. The first transaction contains 7 events and the last event is
hidden. The second and third transactions show the Message done string in the last event in the transaction.
This example uses sample email data. You should be able to run this search on any email data by replacing the sourcetype=cisco:esa with
the sourcetype value and the mailfrom field with the email address field name in your data. For example, the email address field might be To, From, or Cc.
This example defines an email transaction as a group of up to 10 events. Each event contains the same value for the mid
(message ID), icid (incoming connection ID), and dcid (delivery connection ID). The first and last events in the
transaction should be no more than thirty seconds apart.
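A sketch of this search, including the mvlist=true argument discussed below:

sourcetype=cisco:esa | transaction mid dcid icid maxevents=10 maxspan=30s mvlist=true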
By default, the values of multivalue fields are suppressed in search results with the default setting for mvlist, which is
false. Specifying mvlist=true in this search displays all of the values of the selected fields. This produces the following
events list:
Here you can see that each transaction has a duration that is less than thirty seconds. Also, if there is more than one
value for a field, each of the values is listed.
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
Define a transaction as a group of events that have the same session ID, JSESSIONID, and come from the same IP
address, clientip, and where the first event contains the string, "view", and the last event contains the string, "purchase".
The search defines the first event in the transaction as events that include the string, "view", using the startswith="view"
argument. The endswith="purchase" argument does the same for the last event in the transaction.
This example then pipes the transactions into the where command and the duration field to filter out all of the transactions
that took less than a second to complete. The where filter cannot be applied before the transaction command because
the duration field is added by the transaction command.
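A sketch of this search:

sourcetype=access_* | transaction clientip JSESSIONID startswith="view" endswith="purchase" | where duration > 0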
You might be curious about why the transactions took a long time, so viewing these events might help you to
troubleshoot.
You won't see it in this data, but some transactions might take a long time because the user is updating and removing
items from their shopping cart before they complete the purchase. Additionally, this search is run over all events. There
is no filtering before the transaction command. The more you can filter the search before the first pipe, the faster the
search runs.
See also
Reference
About transactions in the Search Manual
Commands
stats
concurrency
transpose
Description
Returns the specified number of rows (search results) as columns (list of field values), such that each search row
becomes a column.
Syntax
transpose
[int]
[column_name=<string>]
[header_field=<field>]
[include_empty=<bool>]
Required arguments
None.
Optional arguments
column_name
Syntax: column_name=<string>
Description: The name of the first column that you want to use for the transposed rows. This column contains the
names of the fields.
Default: column
header_field
Syntax: header_field=<field>
Description: The field in your results to use for the names of the columns (other than the first column) in the
transposed data.
Default: row 1, row 2, row 3, and so on.
include_empty
Syntax: include_empty=<bool>
Description: Specify whether to include (true) or not include (false) fields that contain empty values.
Default: true
int
Syntax: <int>
Description: Limit the number of rows to transpose. To transpose all rows, specify | transpose 0, which
indicates that the number of rows to transpose is unlimited.
Default: 5
Usage
When you use the transpose command, the field names used in the output are based on the arguments that you use with
the command. By default, the field names are: column, row 1, row 2, and so forth.
Examples
Use the default settings for the transpose command to transpose the results of a chart command.
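A sketch of such a search, charting an event count by host:

sourcetype=access_* | chart count BY host | transpose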
host count
www1 11835
www2 11186
www3 11261
When you add the transpose command to the end of the search, the results look something like this:
column row 1 row 2 row 3
host www1 www2 www3
Count the number of events by sourcetype and display the sourcetypes with the highest count first.
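A sketch of such a search, using Splunk's internal logs as the data source:

index=_internal | stats count BY sourcetype | sort -count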
Use the transpose command to convert the rows to columns and show the source types with the 3 highest counts.
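A sketch of the extended search:

index=_internal | stats count BY sourcetype | sort -count | transpose 3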
This example uses the sample dataset from the Search Tutorial.
• Download the data set from Add data tutorial and follow the instructions to get the tutorial data into your Splunk deployment.
Search all successful events and count the number of views, the number of times items were added to the cart, and the
number of purchases.
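A sketch of this search:

sourcetype=access_* status=200 | stats count AS views count(eval(action="addtocart")) AS addtocart count(eval(action="purchase")) AS purchases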
The value for count AS views is the total number of the events that match the criteria sourcetype=access_* status=200,
or the total count for all actions. The values for addtocart and purchases show the number of events for those specific
actions.
When you switch to the Visualization tab, the data displays a chart with "34282 views" as the X axis label and two
columns, one for "addtocart" and one for "purchases". Because the information about the views is placed on the X axis,
this chart is confusing.
If you change to a pie chart, you see only the "views".
Use the transpose command to convert the columns of the single row into multiple rows.
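A sketch of this search, appending transpose to the previous search:

sourcetype=access_* status=200 | stats count AS views count(eval(action="addtocart")) AS addtocart count(eval(action="purchase")) AS purchases | transpose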
Now these rows can be displayed in a column or pie chart where you can compare the values.
In this particular example, using a pie chart is misleading. The views is a total count of all the actions, not just the
addtocart and purchases actions. Using a pie chart implies that views is an action like addtocart and purchases. The
pie chart implies that the value for views is 1 part of the total, when in fact views is the total.
See also
Commands
fields
stats
untable
xyseries
trendline
Description
Computes the moving averages of fields: simple moving average (sma), exponential moving average (ema), and weighted
moving average (wma). The output is written to a new field, which you can specify.
SMA and WMA both compute a sum over the period of most recent values. WMA puts more weight on recent values
rather than past values. EMA is calculated using the following formula.
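Assuming the conventional smoothing factor alpha = 2 / (period + 1), the recurrence has the form:

EMA(t) = alpha * X(t) + (1 - alpha) * EMA(t-1)

where X(t) is the current value of the field and EMA(t-1) is the previous EMA value.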
Syntax
trendline (<trendtype><period>"("<field>")" [AS <newfield>])...
Required arguments
trendtype
Syntax: sma | ema | wma
Description: The type of trend to compute. Current supported trend types include simple moving average (sma),
exponential moving average (ema), and weighted moving average (wma).
period
Syntax: <num>
Description: The period over which to compute the trend, an integer between 2 and 10000.
<field>
Syntax: "("<field>")"
Description: The name of the field on which to calculate the trend.
Optional arguments
<newfield>
Syntax: <field>
Description: Specify a new field name to write the output to.
Default: <trendtype><period>(<field>)
Usage
Examples
Example 1: Computes a five event simple moving average for field 'foo' and writes the result to new field called
'smoothed_foo.' Also, in the same line, computes ten event exponential moving average for field 'bar'. Because no AS
clause is specified, writes the result to the field 'ema10(bar)'.
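A sketch of that search:

... | trendline sma5(foo) AS smoothed_foo ema10(bar)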
See also
tscollect
This feature is deprecated.
The tscollect command is deprecated in the Splunk platform as of version 7.3.0. Although this command continues to function, it might be
removed in a future version. This command has been superseded by data models. See Accelerate data models in the Knowledge Manager
Manual.
The tscollect command uses indexed fields to create time series index (tsidx) files in a namespace that you define.
The result tables in these files are a subset of the data that you have already indexed. This then enables you to use the
tstats command to search and report on these tsidx files instead of searching raw data. Because you are searching on a
subset of the full index, the search should complete faster than it would otherwise.
The tscollect command creates multiple tsidx files in the same namespace. The command will begin a new tsidx file
when it determines that the tsidx file it is currently creating has gotten big enough.
Only users with the indexes_edit capability can run this command. See Usage.
Syntax
... | tscollect [namespace=<string>] [squashcase=<bool>] [keepresults=<bool>]
Optional arguments
keepresults
Syntax: keepresults = true | false
Description: If true, tscollect outputs the same results it received as input. If false, tscollect returns the count of
results processed (this is more efficient since it does not need to store as many results).
Default: false
namespace
Syntax: namespace=<string>
Description: Define a location for the tsidx file(s). If namespace is provided, the tsidx files are written to a
directory of that name under the main tsidxstats directory (that is, within $SPLUNK_DB/tsidxstats). These
namespaces can be written to multiple times to add new data.
Default: If namespace is not provided, the files are written to a directory within the job directory of that search,
and will live as long as the job does. If you have Splunk Enterprise, you can configure the namespace location by
editing indexes.conf and setting the attribute tsidxStatsHomePath.
squashcase
Syntax: squashcase = true | false
Description: Specify whether or not the entire field::value token is converted to all lowercase when it is put
into the lexicon. To create indexed field tsidx files that are similar to those created by Splunk Enterprise, set
squashcase=true so that results are converted to all lowercase.
Default: false
Usage
You must have the indexes_edit capability to run the tscollect command. By default, the admin role has this capability
and the user and power roles do not have this capability.
Examples
Example 2: Retrieve events from the main index and write the values of field foo to tsidx files in the job directory.
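A sketch of this search:

index=main | fields foo | tscollect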
See also
tstats
Description
Use the tstats command to perform statistical queries on indexed fields in tsidx files. The indexed fields can be from
normal index data, tscollect data, or accelerated data models.
Syntax
| tstats
[prestats=<bool>]
[local=<bool>]
[append=<bool>]
[summariesonly=<bool>]
[include_reduced_buckets=<bool>]
[allow_old_summaries=<bool>]
[chunk_size=<unsigned int>]
[fillnull_value=<string>]
<stats-func>...
[FROM ( <namespace> | sid=<tscollect-job-id> | datamodel=<data_model_name>.<root_dataset_name> [where
nodename = <root_dataset_name>.<...>.<target_dataset_name>] )]
[WHERE <search-query> | <field> IN (<value-list>)]
[BY (<field-list> | (PREFIX(<field>))) [span=<timespan>] ]
Required arguments
<stats-func>
Syntax: (count [<field>] | <function>(PREFIX(<string>) | <field>))... [AS <string>]
Description: Either perform a basic count of a field or perform a function on a field. For a list of the supported
functions for the tstats command, refer to the table below. You must specify one or more functions. You can
apply the function to a field, or to a PREFIX() directive if you want to aggregate a raw segment in your indexed
events as if it were an extracted field-value pair. You can also rename the result using the AS keyword, unless you
are in prestats mode (prestats=true).
You cannot specify functions without applying them to fields or eval expressions that resolve into fields. You
cannot use wildcards to specify field names.
See Usage to learn more about using PREFIX(), and about searches you can run to find raw segments in your
data.
The following table lists the supported functions by type of function. Use the links in the table to see descriptions
and examples for each function. For an overview about using functions with commands, see Statistical and
charting functions.
Type of function      Supported functions and syntax
Aggregate functions   avg() count() distinct_count() estdc() max() median() min() mode() range() stdev() stdevp() sum() sumsq() upperperc<int>() var() varp()
Time functions        earliest() earliest_time() latest() latest_time() rate()
Optional arguments
append
Syntax: append=<bool>
Description: When in prestats mode (prestats=true), append=true adds the prestats results to existing results
instead of generating new results.
Default: false
allow_old_summaries
Syntax: allow_old_summaries=true | false
Description: Only applies when selecting from an accelerated data model. To return results from summary
directories only when those directories are up-to-date, set this parameter to false. If the data model definition has
changed, summary directories that are older than the new definition are not used when producing output from
tstats. This default ensures that the output from tstats will always reflect your current configuration. When set to
true, tstats will use both current summary data and summary data that was generated prior to the definition
change. Essentially this is an advanced performance feature for cases where you know that the old summaries
are "good enough".
Default: false
chunk_size
Syntax: chunk_size=<unsigned_int>
Description: Advanced option. This argument controls how many events are retrieved at a time from a single
tsidx file when the Splunk software processes searches. Lower this setting from its default only when you find a
particular tstats search is using too much memory, or when it infrequently returns events. This can happen when
a search groups by excessively high-cardinality fields (fields with very large amounts of distinct values). In such
situations, a lower chunk_size value can make tstats searches more responsive, but potentially slower to
complete. A higher chunk_size, on the other hand, can help long-running searches to complete faster, with the
potential tradeoff of causing the search to be less responsive. For tstats, chunk_size cannot be set lower than
10000.
Default: 10000000 (10 million)
The default value for the chunk_size argument is set by the chunk_size setting for the [tstats] stanza in
limits.conf. If you have Splunk Cloud Platform, file a Support ticket to change this setting.
fillnull_value
Syntax: fillnull_value=<string>
Description: This argument sets a user-specified value that the tstats command substitutes for null values for
any field within its group-by field list. Null values include field values that are missing from a subset of the returned
events as well as field values that are missing from all of the returned events. If you do not provide a
fillnull_value argument, tstats omits rows for events with one or more null field values from its results.
Default: no default value
include_reduced_buckets
Syntax: include_reduced_buckets=true | false
Description: This setting only applies when enableTSIDXReduction=true in indexes.conf. When set to false, the
tstats command generates results only from index buckets that are not reduced. Set to true if you want tstats
to use results from reduced buckets.
Default: false
local
Syntax: local=true | false
Description: If true, forces the processor to be run only on the search head.
Default: false
prestats
Syntax: prestats=true | false
Description: Specifies whether to use the prestats format. The prestats format is a Splunk internal format that is
designed to be consumed by commands that generate aggregate calculations. When using the prestats format
you can pipe the data into the chart, stats, or timechart commands, which are designed to accept the prestats
format. When prestats=true, AS instructions are not relevant. The field names for the aggregates are determined
by the command that consumes the prestats format and produces the aggregate output.
Default: false
summariesonly
Syntax: summariesonly=<bool>
Description: Only applies when selecting from an accelerated data model. When false, generates results from
both summarized data and data that is not summarized. For data not summarized as TSIDX data, the full search
behavior will be used against the original index data. If set to true, 'tstats' will only generate results from the
TSIDX data that has been automatically generated by the acceleration and non-summarized data will not be
provided.
Default: false
The FROM clause is optional. You can specify either a namespace, an sid, or a datamodel. See Selecting data for more
information about this clause.
namespace
Syntax: <string>
Description: Define a location for the tsidx files within $SPLUNK_DB/tsidxstats. If you have Splunk Enterprise, you
can configure this location by editing the local version of the indexes.conf file and setting the tsidxStatsHomePath
attribute. See How to edit a configuration file in the Admin manual.
sid
Syntax: sid=<tscollect-job-id>
Description: The job ID string of a tscollect search (that generated tsidx files).
datamodel
Syntax: datamodel=<data_model_name>.<root_dataset_name> [where nodename =
<root_dataset_name>.<...>.<target_dataset_name>]
Description: The name of a data model, concatenated with the name of the root dataset that you are searching.
If you wish to filter on a child dataset, you need to use a where clause that uses nodename to reference a specific
child dataset in a dataset hierarchy in the data model. See Selecting data for more information.
The WHERE clause is optional. This clause is used as a filter. You can specify either a search or a field and a set of
values with the IN operator.
<search-query>
Specify search criteria to filter on.
<field> IN (<value-list>)
For the field, specify a list of values to include in the search results.
BY clause arguments
The BY clause is optional. You cannot use wildcards in the BY clause with the tstats command. See Usage. If you use
the BY clause, you must specify a field-list. You can also specify a span.
<field-list>
Syntax: <field>, ...
Description: Specify one or more fields to group results.
PREFIX()
Syntax: PREFIX(<string>)
Description: Specify a raw segment in your indexed events that you want to split by as if it were an extracted
field-value pair. See Usage for more information about the PREFIX() directive, and for a search you can run to find
raw segments in your indexed data.
span
Syntax: span=<timespan>
Description: The span of each time bin. If you use the BY clause to group by _time, use the span argument to
group the time buckets. You can specify timespans such as BY _time span=1h or BY _time span=5d. If you do not
specify a <timespan>, the default is auto, which means that the number of time buckets adjusts to produce a
reasonable number of results. For example if initially seconds are used for the <timespan> and too many results
are being returned, the <timespan> is changed to a longer value, such as minutes, to return fewer time buckets.
Default: auto
<timespan>
Syntax: auto | <int><timescale>
<timescale>
Syntax: <sec> | <min> | <hr> | <day> | <month>
Description: Time scale units. For the tstats command, the <timescale> does not support
subseconds.
Default: sec
<sec>    s | sec | secs | second | seconds    Time scale in seconds.
<min>    m | min | mins | minute | minutes    Time scale in minutes.
<hr>     h | hr | hrs | hour | hours          Time scale in hours.
<day>    d | day | days                       Time scale in days.
<month>  mon | month | months                 Time scale in months.
Usage
The tstats command is a report-generating command, except when prestats=true. When prestats=true, the tstats
command is an event-generating command. See Command types.
Generating commands use a leading pipe character and should be the first command in a search, except when
prestats=true.
Selecting data
Use the tstats command to perform statistical queries on indexed fields in tsidx files. You can select the data for the
indexed fields in several ways.
When you select a data model for a tstats search, you also have to select the root dataset within that data model
that you intend to search. You cannot select all of the root datasets within a data model at once.
Search filters cannot be applied to accelerated data models. This includes both role-based and user-based
search filters.
...| tstats <stats-func> FROM datamodel=<data_model_name>.<root_dataset_name> where
nodename=<root_dataset_name>.<...>.<target_dataset_name>
For example, say you have a data model with three root datasets, each with their own dataset hierarchies.
ButtercupGamesPromos
- NYC (BaseEvent)
- TShirtStore (NYC)
- FashionShows (TShirtStore)
- Giveaways (TShirtStore)
- Chicago (BaseEvent)
- BeerAndBrautsPopup (Chicago)
- BeerSales (BeerAndBrautsPopup)
- BrautSales (BeerAndBrautsPopup)
- Tokyo (BaseSearch)
- GiantRobotBattles (Tokyo)
- UFORobotGrendizer (GiantRobotBattles)
- MechaGodzilla (GiantRobotBattles)
With this hierarchy, if you wanted to run a tstats search that selects from the dataset containing records of the
MechaGodzilla giant robot battles staged by the Tokyo office, you would use the following search:
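A sketch of that search, using the dataset names from the hierarchy above (the count aggregation is only an example):

| tstats count FROM datamodel=ButtercupGamesPromos.Tokyo where nodename=Tokyo.GiantRobotBattles.MechaGodzilla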
Search filters cannot be applied to accelerated data model datasets. This includes both role-based and
user-based search filters.
You might see a count mismatch in the events retrieved when searching tsidx files. It is not possible to
distinguish between indexed field tokens and raw tokens in tsidx files. On the other hand, it is more explicit to run
the tstats command on accelerated data models or from a tscollect command, where only the fields and
values are stored and not the raw tokens.
As with the search command, you can use the tstats command to filter events with CIDR match on fields that contain
IPv4 and IPv6 addresses. However, unlike the search command, the tstats command may not correctly filter strings
containing non-numeric wildcard octets. As a result, your searches may return unpredictable results.
If you are filtering fields with a CIDR match using the tstats command in a BY clause, you can work around this issue
and correctly refilter your results by appending your search with a search command, regex command, or WHERE clause.
Unfortunately, you can't use this workaround if the search doesn't include the filtered field in a BY clause.
Let's take a look at an example of how you could use CIDR match with the tstats command in a BY clause. Say you
create a file called data.csv containing the following lines:
ip,description
1.2.3.4,"An IP address"
5.6.7.8,"Another IP address"
this.is.a.hostname,"A hostname"
this.is.another.hostname,"Another hostname"
Then follow these steps:
1. Upload the file and set the sourcetype to csv, which ensures that all fields in the file are indexed as required by
the tstats command.
2. Run the following search against the index you specified when you uploaded the file. This example uses the main
index.
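A sketch of such a search; the CIDR value 0.0.0.0/0 is only an illustration of a CIDR filter that matches every address:

| tstats count WHERE index=main sourcetype=csv ip="0.0.0.0/0" BY ip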
ip count
1.2.3.4 1
5.6.7.8 1
this.is.a.hostname 1
this.is.another.hostname 1
Even though only two addresses are legitimate IP addresses, all four rows of addresses are displayed in the results.
Invalid IP addresses are displayed along with the valid IP addresses because the tstats command uses string matching
to satisfy search requests and doesn't directly support IP address-based searches. The tstats command does its best to
return the correct results for CIDR search clauses, but the tstats search may return more results than you want if the
source data contains mixed IP and non-IP data such as host names.
To make sure your searches only return the results you want, make sure that your data set is clean and only contains
data in the correct format. If that is not possible, use the search command or WHERE clause to do post-filtering of the
search results. For example, the following search using the search command displays correct results because the piped
search command further filters the results from the tstats command.
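A sketch of that post-filtered search, reusing the illustrative CIDR from above:

| tstats count WHERE index=main sourcetype=csv ip="0.0.0.0/0" BY ip | search ip="0.0.0.0/0"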
Alternatively, you can use the WHERE clause to filter your results, like this.
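A sketch of the WHERE-clause variant, using the cidrmatch function with the same illustrative CIDR:

| tstats count WHERE index=main sourcetype=csv ip="0.0.0.0/0" BY ip | where cidrmatch("0.0.0.0/0", ip)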
Both of these searches using the search command and the WHERE clause return only the valid IP addresses in the
results, which look like this:
ip count
1.2.3.4 1
5.6.7.8 1
Wildcard characters
The tstats command does not support wildcard characters in field values in aggregate functions or BY clauses.
For example, you cannot specify | tstats avg(foo*) or | tstats count WHERE host=x BY source*.
Aggregate functions include avg(), count(), max(), min(), and sum(). For more information, see Aggregate functions.
Any results returned where the aggregate function or BY clause includes a wildcard character are only the most recent
few minutes of data that has not been summarized. Include the summariesonly=t argument with your tstats command to
return only summarized data.
With the exception of count, the tstats command supports only statistical functions that are applied to fields or eval
expressions that resolve into fields. For example, you cannot specify | tstats sum or | tstats sum(). Instead the tstats
syntax requires that at least one field argument be provided for the function: | tstats sum(<field>).
You cannot use eval expressions inside aggregate functions with the tstats command.
While nested eval expressions are supported with the stats command, they are not supported with the tstats command.
Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the
distinct_count function requires far more memory than the count function. The values and list functions also can
consume a lot of memory.
If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider
replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might
result in significantly lower memory usage and run times.
The PREFIX() directive allows you to search on a raw segment in your indexed data as if it were an extracted field. This
causes the search to run over the tsidx file in your indexers rather than the log line. This is a practice that can
significantly reduce the CPU load on your indexers.
The PREFIX() directive is similar to the CASE() and TERM() directives in that it matches strings in your raw data. You can
use PREFIX() to locate a recurring segment in your raw event data that is actually a key-value pair separated by a
delimiter that is also a minor breaker, like = or :. You give PREFIX() the text that precedes the value (the "prefix") and
then it returns the values that follow the prefix. This enables you to group by those values and aggregate them with tstats
functions. The values can be strings or purely numeric.
For example, say you have indexed segments in your event data that look like kbps=10 or kbps=333. You can isolate the
numerical values in these segments and perform aggregations or group-by operations on them by using the PREFIX()
directive to identify kbps= as a common prefix string. Run a tstats search with PREFIX(kbps=) against your event data
and it will return 10 and 333. These values are perfect for tstats aggregation functions that require purely numeric input.
Notice that in this example you need to include the = delimiter. If you run PREFIX(kbps), the search returns =10 and =333.
Efforts to aggregate on such results may return unexpected results, especially if you are running them through
aggregation functions that require purely numeric values.
The text you provide for the PREFIX() directive must be in lower case. For example, the tstats search processor will fail
to process PREFIX(connectionType=). Use PREFIX(connectiontype=) instead. It will still match connectionType=
strings in your events.
The Splunk software separates events into raw segments when it indexes data, using rules specified in segmenters.conf.
You can run the following search to identify raw segments in your indexed events:
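A sketch that uses the walklex command to read terms directly from the tsidx lexicon; your_index is a placeholder for one of your own indexes:

| walklex index=your_index type=term | stats count BY term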
You cannot apply the PREFIX() directive to segment prefixes and values that contain major breakers such as spaces,
square or curly brackets, parentheses, semicolons, or exclamation points.
For more information about the CASE() and TERM() directives, see Use CASE() and TERM() to match phrases in the
Search Manual.
For more information about the segmentation of indexed events, see About event segmentation in Getting Data In
For more information about minor and major breakers in segments, see Event segmentation and searching in the Search
Manual.
A pair of limits.conf settings strike a balance between the performance of tstats searches and the amount of memory
they use during the search process, in RAM and on disk. If your tstats searches are consistently slow to complete you
can adjust these settings to improve their performance, but at the cost of increased search-time memory usage, which
can lead to search failures.
If you have Splunk Cloud Platform, you need to file a Support ticket to change these settings.
For more information, see Memory and stats search performance in the Search Manual.
The tstats command does not support complex aggregate functions such as
...count(eval('Authentication.action'=="failure")).
Consider the following query. This query will not return accurate results because complex aggregate functions are not
supported by the tstats command.
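For instance, a query of this shape, which assumes an accelerated Authentication data model, is not supported:

| tstats count(eval('Authentication.action'=="failure")) AS failures FROM datamodel=Authentication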
Instead, separate out the aggregate functions from the eval functions, as shown in the following search.
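A sketch of one possible rewrite, again assuming an accelerated Authentication data model, is to aggregate by the field first and apply the eval function afterward:

| tstats count AS action_count FROM datamodel=Authentication BY Authentication.action
| eval failures=if('Authentication.action'=="failure", action_count, 0)
| stats sum(failures) AS failures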
Sparkline charts
You can generate sparkline charts with the tstats command only if you specify the _time field in the BY clause and use
the stats command to generate the actual sparkline. For example:
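A minimal sketch of this pattern; the index and span are illustrative:

| tstats prestats=t count WHERE index=_internal BY _time span=1h | stats sparkline(count) AS sparkline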
You can provide any number of aggregates (aggregate-opt) to perform and also have the option of providing a filtering
query using the WHERE keyword. This query looks like a normal query you would use in the search processor. This
supports all the same time arguments as search, such as earliest=-1y.
Grouping by _time
You can provide any number of BY fields. If you are grouping by _time, supply a timespan with span for grouping the time
buckets, for example ...BY _time span=1h or ...BY _time span=3d.
tstats searches over indexes that have undergone tsidx bucket reduction will return incorrect results.
For more information see Reduce tsidx disk usage in Managing indexers and clusters of indexers.
The tstats command is unable to handle multiple time ranges. This is because the tstats command is a generating
command and doesn't perform post-search filtering, which is required to return results for multiple time ranges.
The following example of a search using the tstats command on events with relative times of 5 seconds to 1 second in
the past displays a warning that the results may be incorrect because the tstats command doesn't support multiple time
ranges.
If you want to search events in multiple time ranges, use another command such as stats, or use multiple tstats
commands with append as shown in the following example.
| tstats prestats=t count where index=_internal earliest=-5s latest=-4s | tstats prestats=t append=true
count where index=_internal earliest=-3s latest=-2s | stats count
count
264
Examples
Example 2: Returns the average of the field foo in mydata, specifically where bar is value2 and the value of baz is greater
than 5.
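A sketch of such a search, assuming mydata is a tscollect namespace:

| tstats avg(foo) FROM mydata WHERE bar=value2 baz>5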
Example 4: Gives a timechart of all the data in your default indexes with a day granularity.
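A sketch of such a search:

| tstats prestats=t count BY _time span=1d | timechart span=1d count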
Example 5: Use prestats mode in conjunction with append to compute the median values of foo and bar, which are in
different namespaces.
| tstats prestats=t median(foo) FROM mydata | tstats prestats=t append=t median(bar) FROM otherdata | stats
median(foo) median(bar)
Example 6: Uses the summariesonly argument to get the time range of the summary for an accelerated data model
named mydm.
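A sketch of such a search; the strftime formatting is illustrative:

| tstats summariesonly=t min(_time) AS min max(_time) AS max FROM datamodel=mydm | eval prettymin=strftime(min, "%c"), prettymax=strftime(max, "%c")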
Example 7: Uses summariesonly in conjunction with timechart to reveal what data has been summarized over the past
hour for an accelerated data model titled mydm.
| tstats summariesonly=t prestats=t count FROM datamodel=mydm BY _time span=1h | timechart span=1h count
Example 8: Uses the values statistical function to provide a list of all distinct values for source returned by the "Splunk's
Internal Server Logs" data model. The list is returned as a multivalue entry.
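A sketch of such a search, using the internal_server data model name shown in Example 9:

| tstats values(source) FROM datamodel=internal_server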
values(source)
/Applications/Splunk/var/log/splunk/license_usage.log
/Applications/Splunk/var/log/splunk/metrics.log
/Applications/Splunk/var/log/splunk/metrics.log.1
/Applications/Splunk/var/log/splunk/scheduler.log
/Applications/Splunk/var/log/splunk/splunkd.log
/Applications/Splunk/var/log/splunk/splunkd_access.log
If you don't have the internal_server data model defined, check under Settings->Data models for a list of the data
models you have access to.
Example 9: Uses the values statistical function to provide a list of all distinct values for source returned by the
Alerts dataset within the "Splunk's Internal Server Logs" data model.
| tstats values(source) FROM datamodel=internal_server where nodename=server.scheduler.alerts
Example 10: Gets the count and average of a raw, unindexed term using the PREFIX kbps=, then splits this by an indexed
source and another unindexed term using the PREFIX group=.
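A sketch of such a search; the index filter is illustrative:

| tstats count avg(PREFIX(kbps=)) WHERE index=* BY source PREFIX(group=)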
See also
Commands
datamodel
stats
tscollect
walklex
typeahead
Description
Returns autosuggest information for a specified prefix that is used to autocomplete word candidates in searches. The
maximum number of results returned is based on the value you specify for the count argument.
Syntax
| typeahead
prefix=<string>
count=<int>
[collapse=<bool>]
[endtimeu=<int>]
[index=<string>]
[max_time=<int>]
[starttimeu=<int>]
[use_cache=<bool>]
Required arguments
prefix
Syntax: prefix=<string>
Description: The full search string to return typeahead information.
count
Syntax: count=<int>
Description: The maximum number of results to return.
Optional arguments
collapse
Syntax: collapse=<bool>
Description: Specify whether to collapse a term that is a prefix of another term when the event count is the
same.
Default: true
endtimeu
Syntax: endtimeu=<int>
Description: Set the end time to N seconds, measured in UNIX time.
Default: now
index-specifier
Syntax: index=<string>
Description: Search the specified index instead of the default index.
max_time
Syntax: max_time=<int>
Description: The maximum time in seconds that the typeahead can run. If max_time=0, there is no limit.
starttimeu
Syntax: starttimeu=<int>
Description: Set the start time to N seconds, measured in UNIX time.
Default: 0
use_cache
Syntax: use_cache = <boolean>
Description: Specifies whether the typeahead cache will be used if use_cache is not specified in the command
line or endpoint. When use_cache is turned on, Splunk software uses cached search results when running
typeahead searches, which may have outdated results for a few minutes after you make changes to .conf files.
For more information, see Typeahead and .conf file updates.
Default: true or 1
Usage
The typeahead command is a generating command and should be the first command in the search. Generating
commands use a leading pipe character.
When you run the typeahead command, it extracts data from indexes, configurations, and search histories. This
information is used to suggest word candidates for autocomplete when users run searches.
The typeahead command uses a cache to run fast searches at the expense of accurate results. As a result, sometimes
what is in the cache and shows up in typeahead search results may not reflect recent changes to .conf files. This is
because it takes 5 or 10 minutes for the cached data to clear, depending on the performance of the server. For example, if
you rename a sourcetype in the props.conf file, it may take a few minutes for that change to display in typeahead search
results. A typeahead search that is run while the cache is being cleared returns the cached data, which is expected
behavior.
If you make a change to a .conf file, you can wait a few minutes for the cache to clear to get the most accurate and
up-to-date results from your typeahead search. Alternatively, you can turn off the use_cache argument to clear the cache
immediately, which fetches more accurate results, but is a little slower. After you manually clear the cache, you should
see the changes to your .conf file reflected in your results when you rerun the typeahead search.
For more information, see Rename source types in the Splunk Cloud Platform Getting Data In manual.
typeahead searches over indexes that have undergone tsidx bucket reduction will return incorrect results.
For more information see Reduce tsidx disk usage in Managing indexers and clusters of indexers.
Examples
When you run a typeahead search, Splunk software extracts information about field definitions from .conf files, such as
props.conf and savedsearches.conf, and displays the relevant information for the specified prefix. For example, say you
run the following search for the source prefix:
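A sketch of such a search; the count value is illustrative:

| typeahead prefix=source count=10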
The results look something like this, which tells you what your users see as autocomplete suggestions when they start to
type source in the Search bar in Splunk Web.
You can also run typeahead on saved searches. For example, say you run this search:
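A sketch of such a search; the count value is illustrative:

| typeahead prefix=savedsearch count=10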
The results look something like this, which tells you what your users see as autocomplete suggestions when they start to
type savedsearch in the Search bar in Splunk Web.
Example 3: Return typeahead information for sourcetypes in the _internal index
When you run the following typeahead search, Splunk software returns typeahead information for sourcetypes in the
_internal index.
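A sketch of such a search; the count value is illustrative:

| typeahead prefix=sourcetype count=10 index=_internal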
typelearner
The typelearner command is deprecated as of Splunk Enterprise version 5.0. This means that although the command
continues to function, it might be removed in a future version. Use the findtypes command instead.
Description
Generates suggested event types by taking previous search results and producing a list of potential searches that can be
used as event types. By default, the typelearner command initially groups events by the value of the grouping-field. The
search then unifies and merges these groups based on the keywords they contain.
Syntax
Optional arguments
grouping-field
Syntax: <field>
Description: The field with values for the typelearner command to use when initially grouping events.
Default: punct, the punctuation seen in _raw
grouping-maxlen
Syntax: maxlen=<int>
Description: Determines how many characters in the grouping-field value to look at. If set to negative, the entire
value of the grouping-field value is used to group events.
Default: 15
Examples
Example 1:
Have the search automatically discover and apply event types to search results.
... | typelearner
See also
typer
typer
Description
Calculates the 'eventtype' field for search results that match a known event type.
Syntax
Optional arguments
eventtypes
Syntax: eventtypes=<string>
Description: Provide a comma-separated list of event types to filter the set of event types that typer can return in
the eventtype field. The eventtypes argument filters out all event types except the valid event types in its list. If all
of the event types listed for eventtypes are invalid, or if no event types are listed, typer is disabled and will not
return any event types. The eventtypes argument accepts wildcards.
Default: No default (by default typer returns all available event types)
maxlen
Syntax: maxlen=<unsigned_integer>
Description: By default, the typer command looks at the first 10000 characters of an event to determine its event
type. Use maxlen to override this default. For example, maxlen=300 restricts typer to determining event types from
the first 300 characters of events.
Usage
Users with file system access, such as system administrators, can change the default setting for maxlen.
Prerequisites
• Only users with file system access, such as system administrators, can change the default setting for maxlen
using configuration files.
• Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
• You can have configuration files with the same name in your default, local, and app directories. Read Where you
can place (or find) your modified configuration files in the Splunk Enterprise Admin Manual.
Never change or copy the configuration files in the default directory. The files in the default directory must remain intact
and in their original location. Make changes to the files in the local directory.
Steps
1. Open or create a local limits.conf file for the Search app at $SPLUNK_HOME/etc/apps/search/local.
2. Under the [typer] stanza, specify the default for the maxlen setting.
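For example, a minimal local limits.conf stanza might look like the following; the value 300 is illustrative:

[typer]
maxlen = 300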
If you have Splunk Cloud Platform and want to change the default, file a Support ticket.
Examples
Example 1:
Force the search to apply event types that you have configured (Splunk Web automatically does this when you view the
"eventtype" field).
... | typer
See also
Commands
typelearner
union
Description
Merges the results from two or more datasets into one dataset. One of the datasets can be a result set that is then piped
into the union command and merged with a second dataset.
The union command appends or merges events from the specified datasets, depending on whether the dataset is
streaming or non-streaming and where the command is run. The union command runs on indexers in parallel where
possible, and automatically interleaves results on the _time field when processing events. See Usage.
If you are familiar with SQL but new to SPL, see Splunk SPL for SQL users.
Syntax
union
[<subsearch-options>]
<dataset>
[<dataset>...]
Required arguments
dataset
Syntax: <dataset-type>:<dataset-name> | <subsearch>
Description: The dataset that you want to perform the union on. The dataset can be either a named or unnamed
dataset.
◊ A named dataset is comprised of <dataset-type>:<dataset-name>. For <dataset-type> you can specify a
data model, a saved search, or an inputlookup. For example
datamodel:"internal_server.splunkdaccess".
◊ A subsearch is an unnamed dataset.
When specifying more than one dataset, use a space or a comma separator between the dataset names.
Optional arguments
subsearch-options
Syntax: maxtime=<int> maxout=<int> timeout=<int>
Description: You can specify one set of subsearch-options that apply to all of the subsearches. You can specify
one or more of the subsearch-options. These options apply only when the subsearch is treated as a
non-streaming search.
◊ The maxtime argument specifies the maximum number of seconds to run the subsearch before finalizing.
The default is 60 seconds.
◊ The maxout argument specifies the maximum number of results to return from the subsearch. The default
is 50000 results. This value is set by the maxresultrows setting in the [searchresults] stanza in the
limits.conf file.
◊ The timeout argument specifies the maximum amount of time, in seconds, to cache the subsearch
results. The default is 300 seconds.
Usage
How the union command processes datasets depends on whether the dataset is a streaming or non-streaming dataset.
The type of dataset is determined by the commands that are used to create the dataset. See Types of commands.
There are two types of streaming commands, distributable streaming and centralized streaming. For this discussion
about the union command, streaming datasets refers to distributable streaming.
Whether the datasets are streaming or non-streaming determines if the union command is run on the indexers or the
search head. The following table specifies where the command is run.
Dataset type Dataset 1 is streaming Dataset 1 is non-streaming
The type of dataset also determines how the union command is processed.
Distributable streaming
Consider the following search, which uses the union command to merge the events from three indexes. Each index
contains 60,000 events, for a total of 180,000 events.
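A sketch of such a search, following the variant shown later in this section:

| union maxout=10000000 [ search index=union_1 ] [ search index=union_2 ] [ search index=union_3 ] | stats count by index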
index count
union_1 60000
union_2 60000
union_3 60000
In this example, all of the subsearches are distributable streaming, so they are unioned by using the same processing as
the multisearch command. All 60,000 results for each index are unioned for a total of 180,000 merged events.
However, if you specify a centralized streaming command, such as the head command, in one of the subsearches the
results change.
| union maxout=10000000 [ search index=union_1 | head 60000] [ search index=union_2 ] [ search index=union_3
] | stats count by index
This search produces the following union results for a total of 160,000 merged events.
index count
union_1 60000
union_2 50000
union_3 50000
Because the head command is a centralized streaming command rather than a distributable streaming command, any
subsearches that follow the head command are processed using the append command. In other words, when a command
forces the processing to the search head, all subsequent commands must also be processed on the search head.
| search index=union_1 | head 60000 | append [ search index=union_2 ] | append [ search index=union_3 ] |
stats count by index
When the union command is used with commands that are non-streaming commands, the default for the maxout
argument is enforced. The default for the maxout argument is 50,000 events. In this example, the default for the maxout
argument is enforced starting with the subsearch that used the non-streaming command. The default is enforced for any
subsequent subsearches.
If the non-streaming command is on the last subsearch, the first two subsearches are processed as streaming. These
subsearches are unioned using the multisearch command processing. The final subsearch includes a non-streaming
command, the head command. That subsearch gets unioned using the append command processing.
In this example, the default for the maxout argument applies only to the last subsearch. That subsearch returns only
50,000 events instead of the entire set of 60,000 events. The total number events merged is 170,000. 60,000 events for
the first and second subsearches and 50,000 events from the last subsearch.
Interleaving results
When two datasets are retrieved from disk in descending time order, which is the default sort order, the union command
interleaves the results. The interleave is based on the _time field. For example, you have the following datasets:
dataset_A

_time host bytes
3 router1 23
1 dns15 231

dataset_B

_time host bytes
2 dns12 220

Both datasets are in descending order by _time. When | union dataset_A, dataset_B is run, the following dataset is the
result.

_time host bytes
3 router1 23
2 dns12 220
1 dns15 231
Examples
The following example merges events from index a and index b. New fields type and mytype are added in each
subsearch using the eval command.
| union [search index=a | eval type = "foo"] [search index=b | eval mytype = "bar"]
The following example appends the current results of the main search with the tabular results of errors from the
subsearch.
... | chart count by category1 | union [search error | chart count by category2]
The following example unions a built-in data model that is an internal server log for REST API calls and the events from
index a.
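A sketch of such a search, using the named dataset format shown in the dataset argument description:

| union datamodel:"internal_server.splunkdaccess" [ search index=a ]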
The following example sets a maximum of 20,000 results to return from the subsearch. The example specifies to limit the
duration of the subsearch to 120 seconds. The example also sets a maximum time of 600 seconds (10 minutes) to cache
the subsearch results.
... | chart count by category1 | union maxout=20000 maxtime=120 timeout=600 [search error | chart count by
category2]
See also
Related information
About subsearches in the Search Manual
About data models in the Knowledge Manager Manual
Commands
search
inputlookup
uniq
Description
The uniq command works as a filter on the search results that you pass into it. This command removes any search result
if that result is an exact duplicate of the previous result. This command does not take any arguments.
Syntax
uniq
Examples
Example 1:
Keep only unique results from all web traffic in the past hour.
See also
dedup
untable
Description
Converts results from a tabular format to a format similar to stats output. This command is the inverse of the xyseries
command.
Syntax
Required arguments
<x-field>
Syntax: <field>
Description: Field to be used as the x-axis.
<y-name-field>
Syntax: <field>
Description: Field that contains the values to be used as labels for the data series.
<y-data-field>
Syntax: <field>
Description: Field that contains the data to be charted.
Usage
When you untable a set of results and then use the xyseries command to combine the results, results that contain
duplicate values are removed.
You can use the streamstats command to create unique record numbers and use those numbers to retain all results. See
Extended examples.
Basic example
This example uses the sample data from the Search Tutorial. To try this example on your own Splunk instance, you must download the sample
data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.
To show how to use the untable command, we need results that appear in a table format. Run this search.
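A sketch of such a search, assuming the tutorial web access data:

sourcetype=access_* status=200 action=purchase | top categoryId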
The results appear on the Statistics tab and look something like this:
For each categoryId, there are two values, the count and the percent. When you untable these results, there will be three
columns in the output:
When you use the untable command to convert the tabular results, you must specify the categoryId field first. You can
use any field name you want for the type of calculation and the values. For example:
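A sketch of such a search; the field names characteristic and value are arbitrary:

sourcetype=access_* status=200 action=purchase | top categoryId | untable categoryId characteristic value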
The results appear on the Statistics tab and look something like this:
Extended example
The untable command does exactly what the name says, it converts tabular information into individual rows of results.
Suppose you have this search:
date-time1 EventCode 4136
date-time1 Message Too late now
Remember that the original set of events in this example had duplicates for date_time4. If you want to process the events
in some way and then put the events back together, you can avoid eliminating the duplicate events by using the
streamstats command.
Use the streamstats command to give each event a unique record number and use that unique number as the key field
for the untable and xyseries commands.
For example, you can add the streamstats command to your original search.
...| table _time EventCode Message | streamstats count as recno | untable recno FieldName FieldValue
recno FieldName FieldValue
2 EventCode 1234
2 Message I dont know
3 EventCode 3456
3 Message Too busy, ask again later
4 EventCode 1256
4 Message Everything is happening at once
4 EventCode 1257
4 Message And right now, as well
These events can be put back together by using the xyseries command, again using the recno field as the <x-field>. For
example:
In addition to using the streamstats command to generate a record number, you can use the rename command to restore
the timestamp information after the xyseries command. For example:
...| table _time EventCode Message | streamstats count as recno | rename _time as time | untable recno
FieldName FieldValue | xyseries recno FieldName FieldValue | rename time as _time
(Thanks to Splunk users DalJeanis and BigCosta for their help with this example.)
See also
xyseries
walklex
Description
Generates a list of terms or indexed fields from each bucket of event indexes.
Important: The walklex command does not work on hot buckets. This command only works on warm or cold buckets,
after the buckets have a merged lexicon file or single time-series index (tsidx) file.
Due to the variable nature of merged_lexicon.lex and .tsidx files, the walklex command does not always return
consistent results.
Syntax
| walklex
[ type=<walklex-type> ]
[ prefix=<string> | pattern=<wc-string> ]
<index-list>
[ splunk_server=<wc-string> ]
[ splunk_server_group=<wc-string> ]...
Required arguments
<index-list>
Syntax: index=<index-name> index=<index-name> ...
Description: Limits the search to one or more indexes. For example, index=_internal.
Optional arguments
prefix | pattern
Syntax: prefix=<string> | pattern=<wc-string>
Description: Limits results to terms that match a specific pattern or prefix. Either prefix or pattern can be
specified but not both. Includes only buckets with a merged_lexicon file or a single tsidx file. This means that hot
buckets are generally not included.
Default: pattern=*
splunk_server
Syntax: splunk_server=<wc-string>
Description: Specifies the distributed search peers from which to return results.
◊ If you are using Splunk Cloud Platform, omit this parameter.
◊ If you are using Splunk Enterprise, you can specify only one splunk_server argument. However, you can
use a wildcard when you specify the server name to indicate multiple servers. For example, you can
specify splunk_server=peer01 or splunk_server=peer*. Use local to refer to the search head.
Default: All configured search peers return information
splunk_server_group
Syntax: splunk_server_group=<wc-string>
Description: Limits the results to one or more server groups. You can specify a wildcard character in the string to
indicate multiple server groups with similar names.
◊ If you are using Splunk Cloud Platform, omit this parameter.
Default: None
type
Syntax: type = ( all | field | fieldvalue | term )
Description: Specifies which type of terms to return in the lexicon. See Usage for more information about using
the type argument options.
◊ Use field to return only the unique field names in each index bucket.
◊ Use fieldvalue to include only indexed field terms.
◊ Use term to exclude all indexed field terms of the form <field>::<value>.
Default: all
Usage
The walklex command is a generating command, which uses a leading pipe character. The walklex command must be
the first command in a search. See Command types.
When the Splunk software indexes event data, it segments each event into raw tokens using rules specified in the
segmenters.conf file. You might end up with raw tokens that are actually key-value pairs separated by an arbitrary
delimiter such as an equal ( = ) symbol.
The following search uses the walklex and where commands to find the raw tokens in your index. It uses the stats
command to count the raw tokens.
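A minimal sketch of such a search, assuming walklex returns term and count fields as in the basic examples later in this topic:

| walklex type=term index=_internal | where like(term, "%=%") | stats sum(count) BY term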
Specify the type=field argument to have walklex return only the field names from indexed fields.
The indexed fields returned by walklex can include default fields such as host, source, sourcetype, the date_* fields,
punct, and so on. It can also include additional indexed fields configured as such in props.conf and transforms.conf and
created with the INDEXED_EXTRACTIONS setting or other WRITE_META methods. The discovery of this last set of additional
indexed fields is likely to help you with accelerating your searches.
Return the set of terms that are indexed fields with indexed values
Specify type=fieldvalue argument to have walklex return the set of terms from the index that are indexed fields with
indexed values.
The type=fieldvalue argument returns the list of terms from the index that are indexed fields with indexed values. Unlike
the type=field argument, where the values returned are only the field names themselves, the type=fieldvalue argument
returns indexed field names that have any field value.
For example, if the indexed field term is runtime::0.04, the value returned by the type=fieldvalue argument is
runtime::0.04. The value returned by the type=field argument is runtime.
Return all TSIDX keywords that are not part of an indexed field structure
Specify type=term to have walklex return the keywords from the TSIDX files that are not part of any indexed field
structure. In other words, it excludes all indexed field terms of the form <field>::<value>.
Return terms of all three types
When you do not specify a type, or when you specify type=all, walklex uses the default type=all argument. This causes
walklex to return the terms in the index of all three types: field, fieldvalue, and term.
When you use type=all, the indexed fields are not called out as explicitly as the fields are with the type=field argument.
You need to split the term field on :: to obtain the field values from the indexed term.
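For example, a sketch of one way to split the indexed terms; the output field names are illustrative:

| walklex index=_internal | where like(term, "%::%") | eval field_name=mvindex(split(term, "::"), 0), field_value=mvindex(split(term, "::"), 1)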
Restrictions
The walklex command applies only to event indexes. It cannot be used with metrics indexes.
People who have search filters applied to one or more of their roles cannot use walklex unless they also have a role
with either the run_walklex capability or the admin_all_objects capability. For more information about role-based search
filters, see Create and manage roles with Splunk Web in Securing the Splunk Platform. For more information about
role-based capabilities, see Define roles on the Splunk platform with capabilities, in Securing the Splunk Platform.
Basic examples
The following example returns all of the terms in each bucket of the _internal index and finds the total count for each
term.
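A sketch of such a search:

| walklex index=_internal | stats sum(count) BY term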
The following example returns all of the terms that start with foo in each bucket of the _internal and _audit indexes.
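A sketch of such a search:

| walklex prefix=foo index=_internal index=_audit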
The following example returns all of the indexed field terms for each bucket that end with bar in the _internal index.
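A sketch of such a search:

| walklex type=fieldvalue pattern=*bar index=_internal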
The following example returns all of the field names of indexed fields in each bucket of the _audit index.
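A sketch of such a search:

| walklex type=field index=_audit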
See also
Commands
metadata
tstats
where
Description
The where command uses eval-expressions to filter search results. These eval-expressions must be Boolean expressions,
where the expression returns either true or false. The where command returns only the results for which the eval
expression returns true.
Syntax
where <eval-expression>
Required arguments
eval-expression
Syntax: <eval-mathematical-expression> | <eval-concatenate-expression> | <eval-comparison-expression> |
<eval-boolean-expression> | <eval-function-call>
Description: A combination of values, variables, operators, and functions that represent the value of your
destination field. See Usage.
The <eval-expression> is case-sensitive. The syntax of the eval expression is checked before running the search,
and an exception is thrown for an invalid expression.
The following table describes characteristics of eval expressions that require special handling.
Expression characteristic: Field names starting with numeric characters
Description: If the expression references a field name that starts with a numeric character, the field name must be
surrounded by single quotation marks.
Example: '5minutes'="late"
This expression is a field name equal to a string value. Because the field name starts with a numeric character, it must be
enclosed in single quotation marks. Because the value is a string, it must be enclosed in double quotation marks.

Expression characteristic: Field names with non-alphanumeric characters
Description: If the expression references a field name that contains non-alphanumeric characters, the field name must be
surrounded by single quotation marks.
Example: new=count+'server-1'
This expression could be interpreted as a mathematical equation, where the dash is interpreted as a minus sign. To avoid
this, you must enclose the field name server-1 in single quotation marks.

Expression characteristic: Literal strings
Description: If the expression references a literal string, the literal string must be surrounded by double quotation marks.
Example: new="server-"+count
There are two issues with this example. First, server- could be interpreted as a field name or as part of a mathematical
equation that uses a minus sign and a plus sign. To ensure that server- is interpreted as a literal string, enclose the string
in double quotation marks.
Usage
The where command uses the same expression syntax as the eval command. Also, both commands interpret quoted
strings as literals. If the string is not quoted, it is treated as a field name. Because of this, you can use the where command
to compare two different fields, which is something you cannot do with the search command.
Search: | search foo=bar
This search looks for events where the field foo contains the string value bar.
Where: ... | where foo="bar"
This search looks for events where the value in the field foo is the string value bar.
Boolean expressions
The order in which Boolean expressions are evaluated with the where command is:
1. Expressions within parentheses
2. NOT clauses
3. AND clauses
4. OR clauses
This evaluation order is different than the order used with the search command. The search command evaluates OR
clauses before AND clauses.
You can only specify a wildcard by using the like function with the where command. The percent ( % ) symbol is the
wildcard that you use with the like function. See the like() evaluation function.
Supported functions
You can use a wide range of evaluation functions with the where command. For general information about using functions,
see Evaluation functions.
Examples
1. Specify a wildcard with the where command
You can only specify a wildcard with the where command by using the like function. The percent ( % ) symbol is the
wildcard you must use with the like function. The where command returns like=TRUE if the ipaddress field starts with the
value 198..
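A sketch of such a search; the field name is illustrative:

... | where like(ipaddress, "198.%")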
See also
x11
Description
The x11 command removes the seasonal pattern in your time-based data series so that you can see the real trend in your
data. This command has a similar purpose to the trendline command, but it uses the more sophisticated and industry
popular X11 method.
The seasonal component of your time series data can be either additive or multiplicative, defined as the two types of
seasonality that you can calculate with x11: add() for additive and mult() for multiplicative. See About time-series
forecasting in the Search Manual.
Syntax
Required arguments
<fieldname>
Syntax: <field>
Description: The name of the field to calculate the seasonal trend.
Optional arguments
<type>
Syntax: add() | mult()
Description: Specify the type of x11 to compute, additive or multiplicative.
Default: mult()
<period>
Syntax: <int>
Description: The period of the data relative to the number of data points, expressed as an integer between 5 and
1000. If the period is 7, the command expects the data to be periodic every 7 data points. If you omit this
parameter, Splunk software calculates the period automatically. The algorithm does not work if the period is less
than 5 and will be too slow if the period is greater than 1000.
<newfield>
Syntax: <string>
Description: Specify a field name for the output of the x11 command.
Default: None
Examples
Example 1: In this example, the type is the default mult and the period is 15. The field name specified is count.
Because span=1d, every data point accounts for 1 day. As a result, the period in this example is 15 days. You can
change the syntax in this example to ... | x11 15(count) because the mult type is the default type.
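A sketch of such a search; the leading timechart is illustrative:

... | timechart span=1d count | x11 mult15(count)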
Example 2: In this example, the type is add and the period is 20. The field name specified is count.
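A sketch of such a search; the leading timechart is illustrative:

... | timechart span=1d count | x11 add20(count)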
See also
predict, trendline
xmlkv
Description
The xmlkv command automatically extracts key-value pairs from XML-formatted data.
Syntax
xmlkv
[<field>]
[maxinputs=<int>]
Required arguments
None.
Optional arguments
field
Syntax: <field>
Description: The field from which to extract the key and value pairs.
Default: The _raw field.
maxinputs
Syntax: maxinputs=<int>
Description: Sets the maximum number of events or search results that can be passed as inputs into the xmlkv
command per invocation of the command. The xmlkv command is invoked repeatedly in increments according to
the maxinputs argument until the search is complete and all of the results have been displayed. Do not change
the value of maxinputs unless you know what you are doing.
Default: 50000
Usage
From the following XML, name is the key and Settlers of Catan is the value in the first element.
<game>
<name>Settlers of Catan</name>
<category>competitive</category>
</game>
<game>
<name>Ticket to Ride</name>
<category>competitive</category>
</game>
Examples
Extract key-value pairs from XML tags in the _raw field. Processes a maximum of 50000 events.
... | xmlkv
Extract the key-value pairs from events or search results in increments of 10,000 per invocation of the xmlkv command
until the search has finished and all of the results are displayed.
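A sketch of such a search:

... | xmlkv maxinputs=10000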
See also
Commands
extract
kvform
multikv
rex
spath
xpath
xmlunescape
Description
Un-escapes XML characters, including entity references such as &amp;, &lt;, and &gt;, so that they return to their corresponding
characters. For example, &amp; becomes &.
Syntax
xmlunescape [maxinputs=<int>]
Optional arguments
maxinputs
Syntax: maxinputs=<int>
Description: The maximum number of inputs per invocation of the command. The xmlunescape command is
invoked repeatedly in increments according to the maxinputs argument until the search is complete and all of the
results have been displayed. Do not change the value of maxinputs unless you know what you are doing.
Default: 50000
Examples
... | xmlunescape
xpath
Description
Extracts the xpath value from field and sets the outfield attribute.
Syntax
Required arguments
xpath-string
Syntax: <string>
Description: Specifies the XPath reference.
Optional arguments
field
Syntax: field=<field>
Description: The field to find and extract the referenced xpath value from.
Default: _raw
outfield
Syntax: outfield=<field>
Description: The field to write, or output, the xpath value to.
Default: xpath
default
Syntax: default=<string>
Description: If the attribute referenced in xpath doesn't exist, this specifies what to write to the outfield. If this
isn't defined, there is no default value.
Usage
The xpath command supports the syntax described in the Python Standard Library documentation, section 19.7.2.2, Supported XPath syntax.
Examples
1. Extract values from a single element in _raw XML events
You want to extract values from a single element in _raw XML events and write those values to a specific field.
<foo>
<bar nickname="spock">
</bar>
</foo>
<foo>
<bar nickname="scotty">
</bar>
</foo>
<foo>
<bar nickname="bones">
</bar>
</foo>
Extract the nickname values from _raw XML events. Output those values to the name field.
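A sketch of such a search:

... | xpath outfield=name "//bar/@nickname"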
<DataSet xmlns="">
<identity_id>3017669</identity_id>
<instrument_id>912383KM1</instrument_id>
<transaction_code>SEL</transaction_code>
<sname>BARC</sname>
<currency_code>USA</currency_code>
</DataSet>
<DataSet xmlns="">
<identity_id>1037669</identity_id>
<instrument_id>219383KM1</instrument_id>
<transaction_code>SEL</transaction_code>
<sname>TARC</sname>
<currency_code>USA</currency_code>
</DataSet>
Extract the values from the identity_id element from the _raw XML events:
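A sketch of such a search, matching the full example shown at the end of this topic:

... | xpath outfield=identity_id "//DataSet/identity_id"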
To extract a combination of two elements, sname with a specific value and instrument_id, use this search:
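A sketch of such a search, assuming a predicate on the sname element:

... | xpath outfield=instrument_id "//DataSet[sname='BARC']/instrument_id"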
Because you specify sname='BARC', this search returns one result: instrument_id=912383KM1.
You must add field=xml to the end of your search. For example:
| makeresults
| eval xml="<DataSet xmlns=\"\">
<identity_id>1037669</identity_id>
<instrument_id>219383KM1</instrument_id>
<transaction_code>SEL</transaction_code>
<sname>TARC</sname>
<currency_code>USA</currency_code>
</DataSet>"
| xpath outfield=identity_id "//DataSet/identity_id" field=xml
See also
xsDisplayConcept
The xsDisplayConcept command is an extreme search command used with Splunk Enterprise Security.
For information about this command, see Extreme search commands in Administer Splunk Enterprise Security.
xsDisplayContext
The xsDisplayContext command is an extreme search command used with Splunk Enterprise Security.
For information about this command, see Extreme search commands in Administer Splunk Enterprise Security.
xsFindBestConcept
The xsFindBestConcept command is an extreme search command used with Splunk Enterprise Security.
For information about this command, see Extreme search commands in Administer Splunk Enterprise Security.
xsListConcepts
The xsListConcepts command is an extreme search command used with Splunk Enterprise Security.
For information about this command, see Extreme search commands in Administer Splunk Enterprise Security.
xsListContexts
The xsListContexts command is an extreme search command used with Splunk Enterprise Security.
For information about this command, see Extreme search commands in Administer Splunk Enterprise Security.
xsUpdateDDContext
The xsUpdateDDContext command is an extreme search command used with Splunk Enterprise Security.
For information about this command, see Extreme search commands in Administer Splunk Enterprise Security.
xsWhere
The xsWhere command is an extreme search command used with Splunk Enterprise Security.
For information about this command, see Extreme search commands in Administer Splunk Enterprise Security.
xyseries
This topic walks through how to use the xyseries command.
Description
Converts results into a tabular format that is suitable for graphing. This command is the inverse of the untable command.
Syntax
Required arguments
<x-field>
Syntax: <field>
Description: The name of the field to use for the x-axis label. The values of this field appear as labels for the data
series plotted on the x-axis.
<y-name-field>
Syntax: <field>
Description: The field that contains the values to use as labels for the data series.
<y-data-field>
Syntax: <field> [,<field>] ...
Description: One or more fields that contain the data to chart. When specifying multiple fields, separate the field
names with commas.
Optional arguments
format
Syntax: format=<string>
Description: Used to construct output field names when multiple data series are used in conjunction with a
split-by field, separating the <y-name-field> and the <y-data-field>. The format argument takes precedence over sep and lets
you specify a parameterized expression with the stats aggregator and function ($AGG$) and the value of the
split-by field ($VALUE$).
grouped
Syntax: grouped= true | false
Description: If true, indicates that the input is sorted by the value of the <x-field> and multifile input is allowed.
Default: false
sep
Syntax: sep=<string>
Description: Used to construct output field names when multiple data series are used in conjunction with a
split-by field. This is equivalent to setting format to $AGG$<sep>$VALUE$.
Usage
The xyseries command is a distributable streaming command, unless grouped=true is specified and then the xyseries
command is a transforming command. See Command types.
Alias
The alias for the xyseries command is maketable.
When you use the xyseries command to convert results into a tabular format, results that contain duplicate values are
removed.
You can use the streamstats command to create unique record numbers and use those numbers to retain all results. For an
example, see the Extended example for the untable command.
Example
Let's walk through an example to learn how to reformat search results with the xyseries command.
Write a search
This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on
your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time
range All time when you run the search.
Run this search in the search and reporting app:
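A sketch of such a search, assuming the tutorial web access data:

sourcetype=access_* status=200 action=purchase | top categoryId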
The top command automatically adds the count and percent fields to the results. For each categoryId, there are two
values, the count and the percent.
The search results look like this:
In this example:
• <x-field> = categoryId
• <y-name-field> = count
• <y-data-field> = percent
When you apply the xyseries command, the categoryId serves as the <x-field> in your search results. The results of the
calculation count become the columns, <y-name-field>, in your search results. The <y-data-field>, percent, corresponds
to the values in your search results.
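A sketch of the search with the xyseries command applied:

sourcetype=access_* status=200 action=purchase | top categoryId | xyseries categoryId count percent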
ACCESSORIES 13.166856
ARCADE 18.653046
SHOOTER 9.269769
SIMULATION 9.307605
STRATEGY 30.495649
TEE 13.885736
Extended example
Let's walk through an example to learn how to add optional arguments to the xyseries command.
Write a search
To add the optional arguments of the xyseries command, you need to write a search that includes a split-by-field
command for multiple aggregates. Use the sep and format arguments to modify the output field names in your search
results.
In this example:
• <x-field> = clientip
• <y-name-field> = referrer domain
• <y-data-field> = host, productId
The xyseries command needs two aggregates, in this example they are: count(host) count(productId). The first few search
results look like this:
Add optional argument: sep
Add a string to the sep argument to change the default character that separates the <y-name-field> host, and the
<y-data-field> productId. The format argument adds the <y-name-field> and separates the field name and field value by
the default ":". For example, the default for this example looks like count(host):referrer_domain.
When you specify a string to separate the <y-name-field> and <y-data-field> with the format argument, it overrides any
assignment from the sep argument. In the following example, the sep argument assigns the "-" character to separate the
<y-name-field> and <y-data-field> fields. The format argument assigns a "+" and this assignment takes precedence over
sep. In this case $VAL$ and $AGG$ represent both the <y-name-field> and <y-data-field>. As seen in the search results,
the <y-name-field>, host, and <y-data-field>, productId can correspond to either $VAL$ or $AGG$.
Add optional argument: grouped
The grouped argument determines whether the xyseries command runs as a distributable streaming command, or a
transforming command. The default state grouped=FALSE for the xyseries command runs as a streaming command.
See also
Commands
untable
Internal Commands
Consult your Splunk Administrator or Splunk Support before using any of these commands.
• collapse
• dump
• findkeywords
• mcatalog
• noop
• prjob
• runshellscript
• sendalert
collapse
The collapse command is an internal, unsupported, experimental command. See About internal commands.
Description
The collapse command condenses multifile results into as few files as the chunksize option allows. This command runs
automatically when you use outputlookup and outputcsv commands.
Syntax
Optional arguments
chunksize
Syntax: chunksize=<num>
Description: Limits the number of resulting files.
Default: 50000
force
Syntax: force=<bool>
Description: If force=true and the results are entirely in memory, re-divide the results into appropriately chunked
files.
Default: false
Examples
... | collapse
dump
The dump command is an internal, unsupported, experimental command. See About internal commands.
Description
For Splunk Enterprise deployments, export search results to a set of chunk files on local disk. For information about other
export methods, see Export search results in the Search Manual.
Syntax
Required arguments
basefilename
Syntax: basefilename=<string>
Description: The prefix of the export filename.
Optional arguments
compress
Syntax: compress=<number>
Description: The gzip compression level. Specify a number from 0 to 9, where 0 means no compression and a
higher number means more compression and slower writing speed.
Default: 2
fields
Syntax: fields=<comma-delimited-string>
Description: A list of the fields to be exported. The entire list must be enclosed in quotation marks. Invalid field
names are ignored.
format
Syntax: format= raw | csv | tsv | json | xml
Description: The output data format.
Default: raw
rollsize
Syntax: rollsize=<number>
Description: The maximum file size, in MB, at which point no more events are written to the file and it becomes a
candidate for HDFS transfer.
Default: 63 MB
Usage
The dump command preserves the order of events as the events are received by the command.
Examples
Example 1: Export all events from index "bigdata" to the location "YYYYmmdd/HH/host" at
"$SPLUNK_HOME/var/run/splunk/dispatch/<sid>/dump/" directory on local disk with "MyExport" as the prefix of export
filenames. Partitioning of the export data is achieved by the eval command preceding the dump command.
Example 2: Export all events from index "bigdata" to the local disk with "MyExport" as the prefix of export filenames.
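A sketch of such a search:

index=bigdata | dump basefilename=MyExport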
findkeywords
The findkeywords command is an internal, unsupported, experimental command. See About internal commands.
Description
Given some integer labeling of events into groups, finds searches to generate these groups.
Syntax
findkeywords labelfield=<field>
Required arguments
labelfield
Syntax: labelfield=<field>
Description: A field name.
Usage
Use the findkeywords command after the cluster command, or a similar command that groups events. The findkeywords
command takes a set of results with a field (labelfield) that supplies a partition of the results into a set of groups. The
command derives a search to generate each of these groups. This search can be saved as an event type.
Examples
Return logs for specific log_level values and group the results
Return all logs where the log_level is DEBUG, WARN, ERROR, FATAL and group the results by cluster count.
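A sketch of such a search; the index, source, and showcount settings are illustrative:

index=_internal source="*splunkd.log*" log_level IN ("DEBUG", "WARN", "ERROR", "FATAL") | cluster showcount=t | findkeywords labelfield=cluster_count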
The values of groupID are the values of cluster_count returned from the cluster command.
See also
cluster, findtypes
makejson
The makejson command is an internal, unsupported, experimental command. See About internal commands.
Description
Creates a JSON object from the specified set of fields in the search results, and places the JSON object into a new field.
Syntax
Required arguments
output
Syntax: output=<string>
Description: The name to use for the output field where the JSON object is placed.
Optional arguments
wc-field-list
Syntax: <field>(,<field>) ...
Description: Comma-delimited list of fields to use to generate a JSON object. You can use a wild card character
in the field names.
Default: All fields are included in the JSON object if a list is not specified.
Usage
You cannot use the table or fields command to specify the field order in the JSON object that gets created.
Examples
The following search creates a JSON object in a field called "data" taking in values from all available fields.
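A sketch of such a search, with the count and field values taken from the explanation that follows:

| makeresults count=5 | eval owner="vladimir", error=random()%3 | makejson output=data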
• The makeresults command creates five search results that contain a timestamp.
• The eval command creates two fields in each search result. One field is named owner and contains the value
vladimir. The other field is named error, which takes a random number and uses the modulo operator ( % ) to
return the remainder of dividing that number by 3.
• The makejson command creates a JSON object based on the values in the fields in each search result.
_time owner error_code
The makejson command is used to create a JSON object in a field called "data" using the values from only the _time and
owner fields. The error field is not included in the JSON object.
data
{"owner": "claudia", "_time": 1583876714}
{"owner": "alex", "_time": 1583873117}
{"owner": "wei", "_time": 1583822891}
{"owner": "david", "_time": 1583788535}
{"owner": "maria", "_time": 1583770930}
{"owner": "vanya", "_time": 1583710376}
{"owner": "claudia", "_time": 1583589914}
3. Create a JSON object using a wildcard list of fields
Create a JSON object in a field called "json-object" using the values from the _time field and fields that end in _owner.
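A sketch of such a search:

... | makejson _time, *_owner output=json-object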
_time                product_owner  system_owner  error  json-object
2020-03-10 22:23:24  wei            vanya         3      {"product_owner": "wei", "system_owner": "vanya", "_time": 1583904204}
2020-03-10 22:23:24  wei            vanya         2      {"product_owner": "wei", "system_owner": "vanya", "_time": 1583904204}
4. Use with schema-bound lookups
You can use the makejson command with schema-bound lookups to store a JSON object in the description field for later
processing.
Suppose that a Splunk application comes with a KVStore collection called example_ioc_indicators, with the fields key
and description. For long term supportability purposes you do not want to modify the collection, but simply want to utilize
a custom lookup within a framework, such as Splunk Enterprise Security (ES) Threat Framework.
threat                   description
maliciousdomain.example  {"threat_name": "Sample threat", "threat_confidence": 100, "threat_expiry": "2020-01-01 21:13:37 UTC", "threat_campaign": "Sample threat"}
You would then add the outputlookup command to send the search results to the lookup:
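A sketch of such a search, assuming a lookup definition named example_ioc_indicators points at the collection:

... | outputlookup append=true example_ioc_indicators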
To use this custom lookup within a framework, you would specify this in a search:
See also
Related commands
spath
mcatalog
The mcatalog command is an internal, unsupported, experimental command. See About internal commands.
Description
The mcatalog command performs aggregations on the values in the metric_name and dimension fields in the metric
indexes.
Syntax
Required arguments
values (<field>)
Syntax: values(<field>) [AS <field>]
Description: Returns the list of all distinct values of the specified field as a multivalue entry. The order of the
values is lexicographical. See Usage.
Optional arguments
append
Syntax: append=<bool>
Description: Valid only when prestats=true. This argument runs the mcatalog command and adds the results to
an existing set of results instead of generating new results.
Default: false
<field-list>
Syntax: <field>, ...
Description: Specify one or more fields to group results.
<logical-expression>
Syntax: <time-opts> | <search-modifier> | ((NOT)? <logical-expression>) | <index-expression> | <comparison-expression> | (<logical-expression> (OR)? <logical-expression>)
Description: Includes time and search modifiers, comparison, and index expressions. Does not support CASE or
TERM directives. You also cannot use the WHERE clause to search for terms or phrases.
prestats
Syntax: prestats=true | false
Description: Specifies whether to use the prestats format. The prestats format is a Splunk internal format that is
designed to be consumed by commands that generate aggregate calculations. When using the prestats format
you can pipe the data into the chart, stats, or timechart commands, which are designed to accept the prestats
format. When prestats=true, AS instructions are not relevant. The field names for the aggregates are determined
by the command that consumes the prestats format and produces the aggregate output.
Default: false
<comparison-expression>
Syntax: <field><comparison-operator><value> | <field> IN (<value-list>)
Description: Compare a field to a literal value or provide a list of values that can appear in the field.
<index-expression>
Syntax: "<string>" | <term> | <search-modifier>
Description: Describe the events you want to retrieve from the index using literal strings and search modifiers.
<time-opts>
Syntax: [<timeformat>] (<time-modifier>)*
Description: Describe the format of the starttime and endtime terms of the search
<comparison-operator>
Syntax: = | != | < | <= | > | >=
Description: You can use comparison operators when searching field/value pairs. Comparison expressions with
the equal ( = ) or not equal ( != ) operator compare string values. For example, "1" does not match "1.0".
Comparison expressions with greater than or less than operators < > <= >= numerically compare two numbers
and lexicographically compare other values. See Usage.
<field>
Syntax: <string>
Description: The name of a field.
<value>
Syntax: <literal-value>
Description: In comparison-expressions, the literal number or string value of a field.
<value-list>
Syntax: (<literal-value>, <literal-value>, ...)
Description: Used with the IN operator to specify two or more values. For example use error IN (400, 402,
404, 406) instead of error=400 OR error=402 OR error=404 OR error=406
<string>
Syntax: "<string>"
Description: Specify keywords or quoted phrases to match. When searching for strings and quoted strings
(anything that's not a search modifier), Splunk software searches the _raw field for the matching events or results.
<search-modifier>
Syntax: <sourcetype-specifier> | <host-specifier> | <source-specifier> | <splunk_server-specifier>
Description: Search for events from specified fields. For example, search for one or a combination of hosts,
sources, and source types. See searching with default fields in the Knowledge Manager manual.
<sourcetype-specifier>
Syntax: sourcetype=<string>
Description: Search for events from the specified sourcetype field.
<host-specifier>
Syntax: host=<string>
Description: Search for events from the specified host field.
<source-specifier>
Syntax: source=<string>
Description: Search for events from the specified source field.
<splunk_server-specifier>
Syntax: splunk_server=<string>
Description: Search for events from a specific server. Use "local" to refer to the search head.
Time options
<timeformat>
Syntax: timeformat=<string>
Description: Set the time format for starttime and endtime terms.
Default: timeformat=%m/%d/%Y:%H:%M:%S.
<time-modifier>
Syntax: starttime=<string> | endtime=<string> | earliest=<time_modifier> | latest=<time_modifier>
Description: Specify start and end times using relative or absolute time.
Note: You can also use the earliest and latest attributes to specify absolute and relative time ranges for your search. For
more about this time modifier syntax, see About search time ranges in the Search Manual.
starttime
Syntax: starttime=<string>
Description: Events must be later or equal to this time. Must match timeformat.
endtime
Syntax: endtime=<string>
Description: All events must be earlier or equal to this time.
Usage
You use the mcatalog command to search metrics data. The metrics data uses a specific format for the metrics fields. See
Metrics data format in Metrics. The _values field is not allowed with this command.
The mcatalog command is a generating command for reports. Generating commands use a leading pipe character. The
mcatalog command must be the first command in a search pipeline, except when append=true.
All metrics search commands are case sensitive. This means, for example, that mcatalog treats the following as
three distinct values of metric_name: cap.gear, CAP.GEAR, and Cap.Gear.
If your role does not have the list_metrics_catalog capability, you cannot use mcatalog.
See About defining roles with capabilities in the Securing Splunk Enterprise manual.
WHERE
If you do not specify an index name in the WHERE clause, the mcatalog command returns results from the default metrics
indexes associated with your role. If you do not specify an index name and you have no default metrics indexes
associated with your role, mcatalog returns no results. To search against all metrics indexes use WHERE index=*.
For more information about defining default metrics indexes for a role, see Add and edit roles with Splunk Web in
Securing Splunk Enterprise.
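For example, the following search returns all metric name values from all metrics indexes:
| mcatalog values(metric_name) WHERE index=*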
Group by
The mcatalog command does not allow grouping by time ranges, so no time-grouping argument is included in its syntax.
Time dimensions
The mcatalog command does not recognize the following time-related dimensions.
Unsupported dimensions: date_hour, date_mday, date_minute, date_month, date_second, date_wday, date_year, date_zone, metric_timestamp, time, timeendpos, timestamp, timestartpos
Lexicographical order
Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software,
this is almost always UTF-8 encoding, which is a superset of ASCII.
• Numbers are sorted before letters. Numbers are sorted based on the first digit. For example, the numbers 10, 9,
70, 100 are sorted lexicographically as 10, 100, 70, 9.
• Uppercase letters are sorted before lowercase letters.
• Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or
after letters.
You can specify a custom sort order that overrides the lexicographical order. See the blog Order Up! Custom Sort Orders.
Examples
2. Return all metric names in the default metric indexes associated with the role of the user
If the user role has no default metric indexes assigned to it, the search returns no events.
| mcatalog values(metric_name)
3. Return all IP addresses for a specific metric_name among all metric indexes
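A sketch of the shape such a search can take; the ip dimension name and the login.failure metric name are illustrative assumptions:
| mcatalog values(ip) WHERE metric_name=login.failure AND index=*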
4. Returns a list of all available dimensions in the default metric indexes associated with the role of the user
| mcatalog values(_dims)
noop
The noop command is an internal, unsupported, experimental command. See About internal commands.
Description
The noop command is an internal command that you can use to debug your search. It includes several arguments that you
can use to troubleshoot search optimization issues.
You cannot use the noop command to add comments to a search. If you are looking for a way to add comments to your
search, see Add comments to searches in the Search Manual.
Syntax
Required arguments
None.
Optional arguments
appender-expression
Syntax: log_appender = "<appender_name>; [<attribute_name> = <attribute_value>], ..."
Description: Identifies an appender from the log-searchprocess.cfg file and specifies changed values for one or more
attributes that belong to that appender. These value changes apply to the search job for the lifetime of the job.
They are not reused after the search finishes. The list of attribute value changes should be enclosed in quotes.
See Appender expression options.
log-level-expression
Syntax: log_<level> = "<channel>, ..."
Description: Sets or changes the log levels for one or more log channels at search startup. The log channel list
should be double-quoted. See Log level expression options.
optimization_type
Syntax: search_optimization.<optimization_type> = <boolean>
Description: Enables or disables a specific type of search optimization for the search. To disable multiple
optimization types, create a comma-separated list of search_optimization.<optimization_type> arguments. See
Optimization type arguments.
Default: true
sample_ratio
Syntax: sample_ratio = <int>
Description: Sets a randomly-sampled subset of results to return from a given search. It returns 1 out of every
<sample_ratio> events. For example, if you supply | noop sample_ratio=25, the Splunk software returns a
random sample of 1 out of every 25 events from the search result set. The sample_ratio argument requires that
search be the generating command of the search to which you are applying noop.
The sample_ratio argument does the same thing as the event sampling feature that you can manage through Splunk
Web. The difference is that sample_ratio enables you to apply event sampling to a subsearch, while the Splunk Web
version of event sampling applies only to the main search. See Event sampling in the Search Manual. For a brief
illustration of sample_ratio, see the sketch after this argument list.
Default: 1
search_optimization
Syntax: search_optimization = <boolean>
Description: Enables or disables all optimizations for the search.
Default: true
set_ttl
Syntax: set_ttl = <timespan>
Description: Specifies the lifetime of the search job using time modifiers like 1d for one day or 12h for twelve
hours. The search job lifetime is the amount of time that the job exists in the system before it is deleted. The
default lifetime of an ad hoc search is 10 minutes. You might use this setting to make an ad hoc search job stay
in the system for 24 hours or 7 days.
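As a rough sketch of how the sample_ratio and set_ttl arguments look in practice (the index name and the rest of the search are illustrative assumptions):
index=_internal | noop sample_ratio=25 | stats count by sourcetype
index=_internal | noop set_ttl=24h | stats count by sourcetype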
Optimization type arguments
Here are the search_optimization.<optimization_type> arguments that you can use with noop.
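For example, the following sketch disables a single optimization type. The projection_elimination type name is taken from the limits.conf stanza shown later in this topic; other type names follow the same pattern, and the rest of the search is an illustrative assumption.
index=_internal | noop search_optimization.projection_elimination=false | stats count by sourcetype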
Log level expression options
level
Syntax: log_<level>
Description: Valid values are the Splunk platform internal logging levels: debug, info, warn, error, and fatal.
You can apply different log levels to different sets of channels.
channel
Syntax: <channel>, ...
Description: Specifies one or more log channels to apply the log level to. Use wildcards to catch all channels
with a matching string of characters in their name.
Appender expression options
appender_name
Syntax: <string>
Description: The name of an appender from the log-searchprocess.cfg file. Use a wildcard * to identify all
appenders in the log-searchprocess.cfg file. The noop parser is case-sensitive. It sends an error message if you
submit an appender name with incorrect case-formatting.
attribute_name
Syntax: maxFileSize | maxBackupIndex | ConversionPattern | maxMessageSize
Description: Attributes that can be changed for a given appender. The noop parser is case-sensitive, so do not
change the case-formatting of these attributes. It sends an error message if you submit an attribute name with
incorrect case-formatting.
attribute_value
Syntax: <string>
Description: Provides an updated value for the selected appender attribute. The values you provide for the
maxFileSize, maxBackupIndex, and maxMessageSize attributes must be higher than the values that are currently set
for those attributes in the log-searchprocess.cfg file. In other words, if the maxFileSize setting for the
searchprocessAppender is currently set to 10000000, you can only submit a new maxFileSize value that is higher
than 10000000.
Usage
You can use the noop command to enable or disable search optimizations when you run a search. Enabling or disabling
search optimizations can help you troubleshoot certain kinds of search issues. For example, you might experiment with
disabling and enabling search optimizations to determine whether they are causing a search to be slow to complete.
For information about managing search optimization through limits.conf for all of the users in your Splunk platform
deployment, see Built-in optimization in the Search Manual.
The noop command can enable or disable all search optimizations for a single run of a search.
If all search optimizations are enabled for your Splunk deployment in limits.conf, you can add the following argument to
the end of a search string to disable all optimizations when you run that search:
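For example, where ... represents the rest of your search:
... | noop search_optimization=false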
If all search optimizations are disabled for your Splunk deployment in limits.conf, you can add the following argument to
the end of a search string to enable all search optimizations when you run that search:
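For example:
... | noop search_optimization=true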
You can use the optimization_type argument to selectively disable or enable specific types of search optimization.
Here is an example of a set of noop arguments that disable the predicate merge and predicate pushdown optimizations for
a search.
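A sketch of such a search follows. The predicate_merge and predicate_push type names are assumptions modeled on the limits.conf stanza naming shown later in this topic.
... | noop search_optimization.predicate_merge=false, search_optimization.predicate_push=false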
This example works only if you have enabled all optimizations in limits.conf.
When you set enabled=false for the [search_optimization] stanza in limits.conf you disable all search optimizations
for your Splunk platform deployment. With this limits.conf configuration, your searches can use noop to enable all
optimizations and selectively disable specific optimization types.
For example, if you have the [search_optimization] stanza set to enabled = false in limits.conf, the following search
enables all optimizations except projection elimination.
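For example, a sketch of such a search:
... | noop search_optimization=true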
However, when you set enabled=false for the [search_optimization] stanza in limits.conf, your searches cannot
enable specific optimization types unless specific conditions are met. See How noop interoperates with limits.conf search
optimization settings.
How the noop command interoperates with limits.conf search optimization settings
Review how you have configured search optimization for your Splunk platform deployment in limits.conf before you use
the noop command to enable or disable optimization types. The search processor respects limits.conf settings for
optimization types only when [search_optimization] is enabled.
For example, if the [search_optimization] stanza is set to enabled=true in limits.conf, the search processor checks
whether individual optimization types are enabled or disabled in limits.conf. On the other hand, if the
[search_optimization] stanza is set to enabled = false, the search processor does not check the settings for other
optimization types. It assumes all of the optimization types are set to enabled=false.
This search processor logic affects the way that the noop command works when you use it to enable or disable search
optimization for an individual search.
For example, imagine that you have the following configuration in limits.conf:
[search_optimization]
enabled=false
[search_optimization::projection_elimination]
enabled=false
With this configuration, the search processor ignores the disabled projection elimination optimization. Because
[search_optimization] is disabled, the search processor assumes all optimizations are disabled.
Say you have this configuration, and you run the following search, which uses the noop command to enable search
optimization:
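For example:
... | noop search_optimization=true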
When you do this, you enable search optimization, but the search processor sees that in limits.conf, the projection
elimination optimization is disabled. It runs the search with all optimization types enabled except projection elimination.
Instead, use the noop command in a search to enable search optimization and selectively enable the projection elimination
optimization:
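A sketch of such a search:
... | noop search_optimization=true, search_optimization.projection_elimination=true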
When this search runs, it overrides both limits.conf settings: the setting for [search_optimization] and the setting for
[search_optimization::projection_elimination]. The search runs with all optimizations enabled.
The log_<level> argument lets you set the debugging channel for a search at a specific log level, such as debug or warn.
You might use this if you need to set the log level for a specific search but do not have CLI access to the Splunk platform
implementation.
The Splunk platform changes the log level after it parses the noop command. It can do this before the search head parses
arguments from other search commands, even if it comes after those commands in the search string. For instance, the
following search properly logs some debug messages from the makeresults command despite the fact that it precedes
the noop command:
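A sketch of such a search follows. The wildcarded channel name is an assumption that relies on the wildcard matching described later in this section.
| makeresults | noop log_debug="*makeresults*"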
However, the log_<level> argument cannot set the log level for search process components that are ahead of SPL
argument processing in the order of operations. For example, LicenseMgr is one of those components. If you run a search
that sets the debug level for LicenseMgr, it still logs at the default level of info for that component even though you specify debug in the SPL.
If you have command-line access and you need to debug an issue with that component or ones like it, you can modify
$SPLUNK_HOME/etc/log-searchprocess.cfg directly to set the logging level before the search is dispatched and produce
more verbose output in search.log.
The noop command must be part of the streaming pipeline. Because the Splunk software performs argument parsing on
the search head and then pushes the search to the indexers, place the noop command before the first non-streaming
command in the search string. An easy way to do this is to put it after the first command in the search string, which is
usually search.
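For example, in the following sketch, noop sits immediately after the initial search command and ahead of the non-streaming stats command. The index name is an illustrative assumption.
index=_internal | noop search_optimization=false | stats count by sourcetype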
The log_<level> argument supports wildcard matching. You can also set different log levels for different debugging
channels in the same search.
For more information about logs and setting log levels for debugging channels, see What Splunk logs about itself in the
Troubleshooting Manual.
For debugging purposes, you can use noop to apply changed attributes for log-searchprocess.cfg appenders to
individual runs of a search. Appenders are blocks of configurations for specific sub-groups of log components. Example
appenders include searchprocessAppender, watchdog_appender, and searchTelemetryAppender. You can use the *
wildcard to select all appenders.
For example, the following search changes the maximum size of the search.log file to 50 MB and sets the maximum
number of rolled search.log files to 99.
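A sketch of such a search follows. The byte value used for 50 MB and the rest of the search are illustrative assumptions.
index=_internal | noop log_appender="searchprocessAppender; maxFileSize=50000000, maxBackupIndex=99" | head 10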
These changes are applied for the lifetime of that particular search. They are not saved or applied to other searches.
You can only change values for the following appender attributes: maxFileSize, maxBackupIndex, ConversionPattern, and
maxMessageSize. Values you supply for maxFileSize, maxBackupIndex, and maxMessageSize must be higher than the
current values for those appender attributes in log-searchprocess.cfg.
For more information about changing appender attributes for log debugging purposes, see Enable log debugging in the
Troubleshooting Manual.
prjob
The prjob command is an internal, unsupported, experimental command. See About internal commands.
Description
Use the prjob command for parallel reduce search processing of an SPL search in a distributed search environment. The
prjob command analyzes the specified SPL search and attempts to reduce the search runtime by automatically placing a
redistribute command in front of the first non-streaming SPL command like stats or transaction in the search. It
provides the same functionality as the redistribute command, but with a simpler syntax. Similar to the redistribute
command, use the prjob command to automatically speed up high cardinality searches that aggregate a large number of
search results.
Syntax
prjob [<subsearch>]
or
prjob [num_of_reducers=<int>] [subsearch]
Required arguments
subsearch
Syntax: [<subsearch>]
Description: Specifies the search string that the prjob command attempts to process in parallel.
Optional arguments
num_of_reducers
Syntax: [num_of_reducers=<int>]
Description: Specifies the number of eligible indexers from the indexer pool that may function as intermediate
reducers. For example: When a search is run on 10 indexers and the configuration is set to use 60% of the
indexer pool (with a maximum value of 5), it implies that only five indexers may be used as intermediate reducers.
If the value of num_of_reducers is set to greater than 5, only five reducers are available due to the limit. If the
value of num_of_reducers is set to less than 5, the number of reducers used shrinks from the maximum limit of 5.
The number of intermediate reducers also depends on the following settings:
• reducers
• maxReducersPerPhase and winningRate
The number of intermediate reducers is determined by the value set for reducers. If no value is set for reducers, the
search uses the values set for maxReducersPerPhase and winningRate to determine the number of intermediate reducers.
For example: In a scenario where Splunk is configured so that the value of num_of_reducers is set to 50 percent of the
indexer pool and the maxReducersPerPhase value is set to four indexers, a parallel reduce search that runs on six search
peers will be assigned to run on three intermediate reducers. Similarly, a parallel reduce search that runs on four search
peers will be assigned to run on two intermediate reducers. However, a search that runs on ten search peers would be
limited to the maximum of four intermediate reducers.
Usage
Use the prjob command instead of the redistribute command when you want to run a parallel reduce job without
determining where to insert the redistribute command or managing the by-clause field.
The prjob command may be used only as the first command of a search. Additionally, you must include the entire search
within the prjob command.
To use the prjob command, set the phased_execution_mode to multithreaded or auto and set enabled to true in the
[search_optimization::pr_job_extractor] stanza of the limits.conf configuration file.
The prjob command does not support real-time or verbose mode searches. Real-time or verbose mode searches with the
prjob command may run, but the redistribute operation is ignored. Also, you cannot use the prjob and
redistribute commands within the same search.
The prjob command supports the same commands as the redistribute command. For more information, see
redistribute. The prjob command only reduces the search runtime of an SPL search that contains at least one of the
following non-streaming commands:
• stats
• tstats
• streamstats
• eventstats
• sistats
• sichart
• sitimechart
• transaction (only on a single field)
Examples
Example 1: Using the prjob command in a search automatically places the redistribute command before the first
non-streaming SPL command in the search. This speeds up a stats search that aggregates a large number of results.
The stats count by host portion of the search is processed on the intermediate reducers, and the search head
aggregates the results.
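A sketch of the shape such a search can take; the error keyword is an illustrative assumption:
| prjob [search error | stats count by host]
Internally, this behaves as if a redistribute command had been inserted immediately before the stats command.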
Example 2: Speeds up a search that includes eventstats and uses sitimechart to perform the statistical calculations for
a timechart operation. The intermediate reducers process eventstats, where, and sitimechart operations. The search
head runs the timechart command to turn the reduced sitimechart statistics into sorted, visualization-ready results.
| prjob [search index=myindex | eventstats count by user, source | where count>10 | sitimechart max(count)
by source | timechart max(count) by source]
Example 3: Speeds up a search that uses tstats to generate events. The tstats command must be placed at the start
of the subsearch, and uses prestats=t to work with the timechart command. The sitimechart command is processed on
the intermediate reducers and the timechart command is processed on the search head.
| prjob [search index=myindex | tstats prestats=t count by _time span=1d | sitimechart span=1d count |
timechart span=1d count]
Example 4: The eventstats and where commands are processed in parallel on the reducers, while the sort command
and any other following commands are processed on the search head. This happens because the sort command is a
non-streaming command that is not supported by the prjob command.
| prjob [ search index=myindex | eventstats count by user, source | where count >10 | sort 0 -num(count) |
...]
runshellscript
The runshellscript command is an internal, unsupported, experimental command. See About internal commands.
Description
For Splunk Enterprise deployments, executes scripted alerts. This command is not supported as a search command.
Syntax
Usage
Argument Description
$0 The filename of the script.
$5 The description or trigger reason. For example, "The number of events was greater than 1."
$8 The path to the results file, results.csv. The results file contains raw results.
See also
script
sendalert
The sendalert command is an internal, unsupported, experimental command. See About internal commands.
Description
Use the sendalert command to invoke a custom alert action. The command gathers the configuration for the alert action
from the alert_actions.conf file and the saved search and custom parameters passed using the command arguments.
Then the command performs token replacement. The command determines the alert action script and arguments to run,
creates the alert action payload and executes the script, handing over the payload by using STDIN to the script process.
When running the custom script, the sendalert command honors the maxtime setting from the alert_actions.conf file
and terminates the process if the process runs longer than the configured threshold. By default the threshold is set to 5
minutes.
See "Advanced options for working with custom alert actions" in the Developing Views and Apps for Splunk Web manual.
Syntax
Required arguments
alert_action_name
Syntax: <alert_action_name>
Description: The name of the alert action configured in the alert_actions.conf file
Optional arguments
results_link
Syntax: results_link=<url>
Description: Set the URL link to the search results.
results_path
Syntax: results_path=<path>
Description: Set the location to the file containing the search results.
param.<name>
Syntax: param.<name>=<"value">
Description: The parameter name and value. You can use this name and value pair to specify a variety of things,
such as a threshold value, a team name, or the text of a message.
Usage
When you use the sendalert command in an ad hoc search, the command might be called multiple times if there are a
large number of search results. This occurs because previewing the search results on the Statistics tab is enabled by
default. If you are using an ad hoc search to test the sendalert command, turn off preview to avoid the command
being called multiple times.
When the sendalert command is included in a saved search, such as a scheduled report or a scheduled search, the
command is called only one time.
When the sendalert command is used in a search or in an alert action, the search results are stored in an archive file in
the dispatch directory using the CSV format. The file name is results.csv.gz. The default format for the search results is
SRS, a Splunk-specific binary format for the search results. The CSV format for the archive file is used so that scripts can
process the results file. The default SRS format is not designed to be parsed by scripts.
The archived search results format is controlled through the forceCsvResults setting. This setting is in the [default] stanza
in the alert_actions.conf file.
Examples
Example 1: Invoke an alert action without any arguments. The alert action script handles checking whether any
necessary parameters are missing and reports the error appropriately.
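A minimal sketch, assuming a custom alert action named myaction:
| sendalert myaction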
Example 2: Trigger the hipchat custom alert action and pass in room and message as custom parameters.
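A sketch of such a search; the room and message values are illustrative assumptions:
| sendalert hipchat param.room="SecOps" param.message="A critical error occurred on the web tier."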
Search in the CLI
You can run historical searches using the search command, and real-time searches using the rtsearch command. The
following is a table of useful search-related CLI help objects. To see the full help information for each object, type the
following into the CLI:
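./splunk help <object>
For example, ./splunk help rtsearch returns the help for real-time searches.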
Object Description
rtsearch Returns the parameters and syntax for real-time searches.
search-commands Returns a list of search commands that you can use from the CLI.
search-modifiers Returns a list of search and time-based modifiers that you can use to narrow your search.
Historical and real-time searches in the CLI work the same way as searches in Splunk Web, except that there is no
timeline rendered with the search results and there is no default time range. Instead, the results are displayed as a raw
events list or a table, depending on the type of search.
• For more information, read "Type of searches" in the Search Overview chapter of the Search Manual.
The syntax for CLI searches is similar to the syntax for Splunk Web searches, except that you can pass parameters
outside of the query to specify the time limit of the search, where to run the search, and how results are displayed.
• For more information about the CLI search options, see the next topic in this chapter, "CLI search syntax".
• For more information about how to search remote Splunk servers from your local server, see "Access and use the
CLI on a remote server" in the Splunk Enterprise Admin Manual.
CLI search syntax
The syntax for CLI searches is similar to the syntax for searches you run from Splunk Web, except that you can pass
parameters outside of the search object to control the time limit of the search, specify the server where the search is to be
run, and specify how results are displayed.
Search defaults
By default, when you run a search from the CLI, the search uses All Time as the time range. You can specify time
ranges using one of the CLI search parameters, such as earliest_time, index_earliest, or latest_time.
The first 100 events are returned when you run a historical search using the CLI. Use the maxout search parameter to
specify the number of events to return.
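For example, a sketch that narrows the time range and raises the event cap; the search string is an illustrative assumption:
./splunk search 'error' -earliest_time '-24h' -maxout 500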
Search objects
Search objects are enclosed in single quotes (' ') and can be keywords, expressions, or a series of search commands. On
Windows OS use double quotes (" ") to enclose your search object.
• For more information about searching, see Start searching in the Search Tutorial.
• For a brief description of every search command, see the Command quick reference in the Search Reference.
• For a quick reference for Splunk concepts, features, search commands, and functions, see the Quick Reference
Guide in the Search Reference.
Search objects can include not only keywords and search commands but also fields and modifiers to specify the events
you want to retrieve and the results you want to generate.
• For more information about fields, see Use fields to search in the Search Tutorial.
• For more information about default fields and how to use them, see Use default and internal fields in the
Knowledge Manager Manual.
• For more information about time modifiers, see Time modifiers for search in the Search Reference.
Search parameters
Search parameters are options that control the way the search is run or the way the search results are displayed. All of
these parameters are optional. Parameters that take Boolean values support 0, false, f, no as negatives and 1, true,
t, yes as positives.
Specify these search parameters at the end of your search, after you have specified all of the commands and command
arguments. See Example 4.
Parameter Values Default Description
header <bool> T Indicates whether to display a header in the table output mode.
Examples
2. Retrieve events that match web access errors and detach the search
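A sketch of such a search; the eventtype value is an illustrative assumption:
./splunk search 'eventtype=webaccess error' -detach true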
See more examples of Real-time searches and reports in the CLI in the Admin Manual.
There are two recommended ways to return a list of the unique host values in your data. The first is with the stats command:
./splunk search 'index=* | stats count by host | fields - count' -preview true
Alternatively, since you are only interested in the host field, you can use the metadata command:
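For example, a sketch using the metadata command:
./splunk search '| metadata type=hosts | fields host' -preview true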
Here, the -preview flag is optional and is used to view the results as they are returned. In contrast, the table command, unlike the
fields command, generally requires all of its input before it can emit any non-preview output. In that case, you would need to
use the -preview flag to view the results of the search.