Splunk Interview
2. What is Splunk?
Splunk is ‘Google’ for our machine-generated data. It’s a software/engine that can be used for searching,
visualizing, monitoring, reporting, etc. of our enterprise data. Splunk takes valuable machine data and turns it into powerful
operational intelligence by providing real-time insights into our data through charts, alerts, reports, etc.
Below are the common port numbers used by Splunk. However, we can change them if required.
Service: Default port used
Splunk Web: 8000
Splunk Management: 8089
Splunk Indexing: 9997
Splunk Index Replication: 8080
Splunk Network port (used to get data from the network port, i.e., UDP data): 514
KV Store: 8191
This is one of the most frequently asked Splunk interview questions. The main components of Splunk are the forwarder, the indexer, and the search head.
A Splunk indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are indexing incoming raw data and searching and managing the indexed data.
There are two types of Splunk forwarders, which are mentioned below:
Universal Forwarder (UF): A lightweight Splunk agent installed on a non-Splunk system to gather data locally; it cannot parse or index data.
Heavyweight Forwarder (HWF): A full instance of Splunk with advanced functionality. It generally works as a remote collector, intermediate forwarder, and possible data filter. Because it parses data before forwarding, it consumes more resources and is therefore not recommended on production systems.
props.conf
indexes.conf
inputs.conf
transforms.conf
server.conf
Enterprise license
Free license
Forwarder license
Beta license
Licenses for search heads (for distributed search)
Licenses for cluster members (for index replication)
The Splunk app is a container or directory of configurations, searches, dashboards, etc. in Splunk.
If the license master is not available, the license slave will start a 24-hour timer, after which the search will be blocked on the
license slave (though indexing continues). However, users will not be able to search for data in that slave until it can reach the
license master again.
A summary index is the index where Splunk stores the results of scheduled summary searches; the default summary index (the one Splunk Enterprise uses if we do not indicate another one) is simply named 'summary'.
If we plan to run a variety of summary index reports, we may need to create additional summary indexes.
Splunk DB Connect is a generic SQL database plugin for Splunk that allows us to easily integrate database information with
Splunk queries and reports.
16. Can you write down a general regular expression for extracting the IP address
from logs?
There are multiple ways in which we can extract the IP address from logs. For example, the rex command can extract it into a new field with an explicit character-class pattern:
rex field=_raw "(?<ip_address>([0-9]{1,3}[.]){3}[0-9]{1,3})"
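Another equivalent variant (an illustrative alternative, not from the original list) uses the \d shorthand instead of the explicit character class:
rex field=_raw "(?<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
Both patterns capture four dot-separated groups of one to three digits into a field named ip_address.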
This is another frequently asked interview question on Splunk that will test the developer's or engineer's knowledge. The transaction command is most useful in the following cases:
When the unique ID (from one or more fields) alone is not sufficient to discriminate between two transactions. This is the
case when the identifier is reused, for example, in web sessions identified by a cookie or client IP. In this case, the time
span or pauses are also used to segment the data into transactions.
When an identifier is reused, say in DHCP logs, and a particular message identifies the beginning or end of a transaction.
When it is desirable to see the raw text of events combined rather than an analysis of the constituent fields of the events.
However, when a unique identifier is sufficient on its own, the stats command performs better and should be preferred, especially in a distributed search environment.
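A hedged sketch of the trade-off (the index, sourcetype, and JSESSIONID field names are placeholders, not from the original text):
index=web sourcetype=access_combined
| transaction JSESSIONID maxpause=30m
| table JSESSIONID, duration, eventcount

index=web sourcetype=access_combined
| stats earliest(_time) AS start, latest(_time) AS end, count AS eventcount BY JSESSIONID
When the session ID alone is enough to group the events, the stats version scales better because the indexers can compute it in parallel.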
The answer to this question would be very wide, but, mostly, an interviewer would be looking for the following keywords:
Splunk places indexed data in directories, which are called ‘buckets.’ It is physically a directory containing events from a certain
period.
A bucket moves through several stages as it ages. Below are the various stages it goes through:
Hot: A hot bucket contains newly indexed data. It is open for writing. There can be one or more hot buckets for each index.
Warm: A warm bucket consists of data rolled out from a hot bucket. There are many warm buckets.
Cold: A cold bucket has data that is rolled out from a warm bucket. There are many cold buckets.
Frozen: A frozen bucket is comprised of data rolled out from a cold bucket. The indexer deletes frozen data by default, but
we can archive it. Archived data can later be thawed (data in a frozen bucket is not searchable).
Inside the index's db directory, we should see the hot bucket and any warm buckets we have. By default, Splunk sets the maximum bucket size to 10 GB for 64-bit systems and 750 MB for 32-bit systems.
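The rollover behaviour described above is controlled per index in indexes.conf. A minimal sketch, with an example index name and illustrative values rather than recommendations:
[web]
# hot and warm buckets
homePath = $SPLUNK_DB/web/db
# cold buckets
coldPath = $SPLUNK_DB/web/colddb
# thawed (restored) buckets
thawedPath = $SPLUNK_DB/web/thaweddb
# auto_high_volume gives roughly 10 GB hot buckets on 64-bit systems
maxDataSize = auto_high_volume
# number of warm buckets kept before the oldest rolls to cold
maxWarmDBCount = 300
# roll to frozen (archive or delete) after 90 days
frozenTimePeriodInSecs = 7776000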
The stats command generates summary statistics of all the existing fields in the search results and saves them as values in
new fields.
eventstats is similar to the stats command, except that the aggregation results are added inline to each event, and only if the aggregation is pertinent to that event. The eventstats command computes the requested statistics in the same way stats does, but attaches them to the original raw events.
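A small hedged example of the difference (the bytes and host field names are illustrative):
| stats avg(bytes) AS avg_bytes BY host
| eventstats avg(bytes) AS avg_bytes BY host
The stats version collapses the results into one row per host, whereas the eventstats version keeps every original event and simply appends an avg_bytes field to each of them.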
Logstash, Loggly, LogLogic, Sumo Logic, etc. are some of the top direct competitors to Splunk.
Splunk licenses specify how much data we can index per calendar day.
In terms of licensing, for Splunk, one day is from midnight to midnight on the clock of the license master.
They are included in Splunk. Therefore, there is no need to purchase them separately.
This is another frequently asked Splunk commands interview question that tests the developer's or engineer's knowledge of commands. We can restart the Splunk web server by using the following command:
splunk start splunkweb
27. What is the command used to check the running Splunk processes on
Unix/Linux?
If we want to check the running Splunk Enterprise processes on Unix/Linux, we can make use of the following command:
ps aux | grep splunk
28. What is the command used for enabling Splunk to boot start?
We can enable Splunk to start at system boot with the following command:
$SPLUNK_HOME/bin/splunk enable boot-start
Resetting the Splunk admin password depends on the version of Splunk. If we are using Splunk 7.1 and above, we have to follow the below steps: first, create a file named user-seed.conf in the following directory:
$SPLUNK_HOME/etc/system/local/
In the file, we will add the following stanza (here, in place of 'NEW_PASSWORD', we will add our own new password):
[user_info]
USERNAME = admin
PASSWORD = NEW_PASSWORD
After that, we can just restart Splunk Enterprise and use the new password to log in.
Now, if we are using versions prior to 7.1, we will follow the below steps: stop Splunk Enterprise, rename the existing password file $SPLUNK_HOME/etc/passwd to passwd.bk, restart Splunk Enterprise, log in with the default credentials (admin/changeme), and set a new password when prompted.
Note: In case we have created other users earlier and know their login details, copy and paste their credentials from the
passwd.bk file into the passwd file and restart Splunk.
We can clear the Splunk search history by deleting the following file from the Splunk server:
$SPLUNK_HOME/var/log/splunk/searches.log
34. What is Btool? How will you troubleshoot Splunk configuration files?
Splunk Btool is a command-line tool that helps us troubleshoot configuration file issues or just see what values are being used
by our Splunk Enterprise installation in the existing environment.
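For example, a hedged usage sketch (run from $SPLUNK_HOME/bin; the sourcetype name is a placeholder):
./splunk btool inputs list --debug
./splunk btool props list my_sourcetype --debug
The --debug flag shows which configuration file each effective setting comes from, which makes it easy to spot where a value is being overridden.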
35. What is the difference between the Splunk app and Splunk add-ons?
Both contain preconfigured configurations, reports, etc., but a Splunk add-on does not have a visual app (navigation and dashboards), whereas a Splunk app does include a preconfigured visual app.
The fishbucket is a subdirectory within the Splunk index that contains seek pointers and CRCs for the files we are indexing, so 'splunkd' can tell whether it has already read them. We can access it through the GUI by searching for:
index=_thefishbucket
This can be done by defining a regex to match the necessary event(s) and sending everything else to NullQueue. Here is a
basic example that will drop everything except events that contain the string login:
In props.conf:
1 <code>[source::/var/log/foo]
2
3 # Transforms must be applied in this order
4
5 # to make sure events are dropped on the
6
7 # floor prior to making their way to the
8
9 # index processor
10
TRANSFORMS-set= setnull,setparsing
11
12 </code>
13
In transforms.conf:
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = login
DEST_KEY = queue
FORMAT = indexQueue
39. How can I understand when Splunk has finished indexing a log file?
If we are having trouble with data input and we want a way to troubleshoot it, particularly if our whitelist/blacklist rules are not
working the way we expected, we will go to the following URL:
https://yoursplunkhost:8089/services/admin/inputstatus
To set the default search time range for all users in Splunk Enterprise 6.0, we have to use 'ui-prefs.conf'. If we set the value in the following directory, all our users will see it as the default setting:
$SPLUNK_HOME/etc/system/local
For example, if $SPLUNK_HOME/etc/system/local/ui-prefs.conf includes:
[search]
dispatch.earliest_time = @d
dispatch.latest_time = now
then the default time range that all users will see in the search app will be 'Today'.
The dispatch directory ($SPLUNK_HOME/var/run/splunk/dispatch) contains a directory for each search that is running or has completed. For example, a directory named 1434308943.358 will contain a CSV file of its search results, a search.log with details about the search execution, and other artifacts. Using the defaults (which we can override in limits.conf), these directories are deleted 10 minutes after the search completes, unless the user saves the search results, in which case they are deleted after 7 days.
42. What is the difference between search head pooling and search head
clustering?
Both are features provided by Splunk for the high availability of Splunk search head in case any search head goes down.
However, the search head cluster feature has only recently been introduced, while the search head pooling feature will be
removed in the next few versions.
A search head cluster is managed by a captain, which coordinates the other cluster members. Search head clustering is more reliable and efficient than search head pooling.
43. If I want to add folder access logs from a Windows machine to Splunk, how do I do it?
1. Enable Object Access Audit through group policy on the Windows machine on which the folder is located.
2. Enable auditing on the specific folder for which we want to monitor logs.
3. Install the Splunk universal forwarder on the Windows machine.
4. Configure the universal forwarder to send security logs to the Splunk indexer.
A license violation warning implies that Splunk has indexed more data than our purchased license quota. We have to identify
which index/source type has received more data recently than the usual daily data volume. We can check the Splunk license
master pool-wise available quota and identify the pool in which the violation has occurred. Once we identify the pool that is
receiving more data, we have to identify the top source type that is receiving more data than usual. Once the source type is also
identified, we find the source machine that is sending the huge number of logs and, in turn, the root cause for the same, and
troubleshoot it accordingly.
MapReduce algorithm is the secret behind Splunk’s faster data searching. It’s an algorithm typically used for batch-based large-
scale parallelization. It’s inspired by functional programming’s map() and reduce() functions.
At the indexer, Splunk keeps track of the indexed events in a directory called Fishbucket with the following default location:
/opt/splunk/var/lib/splunk
It contains seek pointers and CRCs for the files we are indexing, so splunkd can tell us if it has read them already.
47. What is the difference between the Splunk SDK and the Splunk Framework?
Splunk SDKs are designed to allow us to develop applications from scratch; they do not require Splunk Web or any components
from the Splunk App Framework. These are separately licensed from Splunk, and they do not alter the Splunk software.
Splunk App Framework resides within the Splunk web server and permits us to customize the Splunk Web UI that comes with
the product and develop Splunk apps using the Splunk web server. It is an important part of the features and functionalities of
Splunk, which does not license users to modify anything in Splunk.
48. For what purposes are inputlookup and outputlookup used in Splunk Search?
The inputlookup command is used to search the contents of a Splunk lookup table. The lookup table can be a CSV lookup or a KV store lookup. The inputlookup command is considered an event-generating command, i.e., a command that generates events or reports from one or more indexes without transforming them. Numerous commands fall into this category, including metadata, loadjob, inputcsv, etc.
Syntax:
| inputlookup [append=<bool>] [start=<int>] [max=<int>] (<filename> | <tablename>) [WHERE <search-query>]
Now coming to the outputlookup command, it writes the search results to a static lookup table or KV store collection that we specify. The outputlookup command cannot be used with external lookups.
Syntax:
| outputlookup [append=<bool>] [create_empty=<bool>] [max=<int>] [key_field=<field_name>] [override_if_empty=<bool>] (<filename> | <tablename>)
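A brief hedged usage sketch (the index and lookup file names below are examples only):
| inputlookup known_bad_ips.csv | fields ip, category

index=firewall action=blocked
| stats count BY src_ip
| outputlookup blocked_ips_summary.csv
The first search reads an existing lookup file, while the second writes a summary of blocked source IPs into a lookup that later searches can reference.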
Forwarder: You can see it as a dumb agent whose main task is to collect the data from various sources like remote
machines and transfer it to the indexer.
Indexer: The indexer processes the data in real time and stores and indexes it on the localhost or cloud server.
Search Head: It allows the end-user to interact with the data and perform various operations like searching, analyzing, and
visualizing the information.
50. How to add the colors in Splunk UI based on the field names?
Splunk UI has a number of features that allow the administrator to make the reports more presentable. One such feature that
proves to be very useful for presenting distinguished results is the custom colors. For example, if the sales of a product drop
below a threshold value, then as an administrator you can set the chart to display the values in red color.
The administrator can also change chart colors in the Splunk Web UI by editing the panels from the panel settings available above the dashboard. Moreover, you can edit the dashboard XML and use hexadecimal values to choose a color from the palette.
The data entering an indexer gets sorted into directories, which are also known as buckets. Over a period of time, these buckets roll through different stages: from hot to warm, warm to cold, cold to frozen, and finally thawed. The data moves through an indexing pipeline where event processing takes place in two stages: parsing, which breaks the data stream into individual events, and indexing, which writes those events to the index on disk.
This is what happens to the data at each stage of the indexing pipeline:
As soon as the data enters the pipeline, it goes into a hot bucket. There can be multiple hot buckets at any point in time, and you can both search and write to them.
If Splunk gets restarted or a hot bucket reaches a certain threshold size, a new hot bucket is created in its place and the existing one rolls to become a warm bucket. Warm buckets are searchable, but you cannot write anything to them.
Further, when the index reaches its maximum number of warm buckets, the oldest warm bucket is rolled to become a cold one. Splunk executes this process automatically; however, it does not rename the bucket. Hot and warm buckets are stored in the default location '$SPLUNK_HOME/var/lib/splunk/defaultdb/db/', while cold buckets move to '$SPLUNK_HOME/var/lib/splunk/defaultdb/colddb/'.
After a certain period of time, a cold bucket rolls to become a frozen bucket. Frozen buckets do not share a location with the previous buckets and are not searchable. They can either be archived or deleted based on the retention priorities. You cannot do anything with a deleted bucket, but you can retrieve a frozen bucket if it has been archived. The process of retrieving an archived bucket is known as thawing. Once a bucket is thawed, it becomes searchable again and is stored in a new location:
$SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/
Data models in Splunk are used when you have to process huge amounts of unstructured data and create a hierarchical model
without executing complex search queries on the data. Data models are widely used for creating sales reports, adding access
levels, and creating a structure of authentication for various applications.
Pivots, on the other hand, give you the flexibility to create multiple views and see the results as per the requirements. With pivots, even managers or stakeholders from non-technical backgrounds can create views and get more details about their departments.
This topic will be present in any set of Splunk interview questions and answers. Workflow actions in Splunk are highly configurable knowledge objects that enable you to interact with web resources and other fields. Splunk workflow actions can be used to create HTML links that search for field values, send HTTP POST requests to specific URLs, and run secondary searches for selected events.
Real-time dashboards
Dynamic form-based dashboards
Dashboards for scheduled reports
Alerts are the actions generated by a saved search result after a certain period of time. Once an alert has occurred, subsequent
actions like sending an email or a message will also be triggered. There are two types of alerts available in Splunk, which are
mentioned below:
Real-Time Alerts: We can divide the real-time alerts into two parts: pre-result and rolling-window alerts. The pre-result alert
gets triggered with every search, while rolling-window alerts are triggered when a specific criterion is met by the search.
Scheduled Alerts: As the name suggests, scheduled alerts can be initialized to trigger multiple alerts based on the set
criteria.
Search factor: The search factor (SF) decides the number of searchable copies an indexer cluster can maintain of the
data/bucket. For example, the search factor value of 3 shows that the cluster can maintain up to 3 copies of each bucket.
Replication factor: The replication factor (RF) determines the number of copies of the data/buckets that the indexer cluster maintains.
However, the search factor should not be greater than the replication factor.
Time zone is an important property that helps you search for events when a fraud or security issue occurs. The default time zone is taken from the browser settings or the machine you are using. Apart from event searching, it also matters when data pours in from multiple sources, so that events can be aligned correctly across different time zones.
Erex
Abstract
Typer
Rename
Anomalies
Fill down
Accum
Add totals
Fast mode: speeds up your search result by limiting the types of data.
Verbose mode: Slower as compared to the fast mode, but returns the information for as many events as possible.
Smart mode: It toggles between different modes and search behaviors to provide maximum results in the shortest period of
time.
1) Define Splunk
Splunk is a software technology that is used for searching, visualizing, and monitoring machine-generated big data. It monitors and reads different types of log files and stores the data in indexers.
An indexer is a component of Splunk Enterprise that creates and manages indexes. The primary functions of an indexer are
1) indexing raw data into an index and 2) searching and managing the indexed data.
6) What are the pros of getting data into a Splunk instance using forwarders?
The advantages of getting data into Splunk via forwarders are TCP connection, bandwidth throttling, and secure SSL
connection for transferring crucial data from a forwarder to an indexer.
License master in Splunk ensures that the right amount of data gets indexed. It ensures that the environment
remains within the limits of the purchased volume as Splunk license depends on the data volume, which comes to
the platform within a 24-hour window.
Inputs file
Transforms file
Server file
Indexes file
Props file
It is a warning error that occurs when you exceed the data limit. This warning error will persist for 14 days. With a
commercial license, you may have 5 warnings within a 1-month rolling window before your indexer's search
results and reports stop triggering. With the free version, however, the license violation warning allows only 3
warnings.
Alerts can be used when you have to monitor for and respond to specific events. For example, sending an email
notification to the user when there are more than three failed login attempts in a 24-hour period.
The map-reduce algorithm is a technique used by Splunk to increase data searching speed. It is inspired by two
functional programming functions: 1) map() and 2) reduce(). Here, the map() function is associated with the Mapper class and
the reduce() function is associated with the Reducer class.
Splunk keeps track of indexed events in a fishbucket directory. It contains CRCs and seek pointers
for the files you are indexing, so Splunk can tell whether it has read them already.
Pivots are used to create the front views of your output and then choose the proper filter for a better view of this
output. Both options are beneficial for the people from a semi-technical or non-technical background. Data models
are most commonly used for creating a hierarchical model of data. However, it can also be used when you have a
large amount of unstructured data. It helps you make use of that information without using complicated search
queries.
The search factor determines the number of searchable copies of data maintained by the indexer cluster. The replication factor determines the number of copies of data maintained by the
cluster as well as the number of copies that each site maintains.
The lookup command is generally used when you want to get some fields from an external file. It helps you narrow
the search results by referencing fields in an external file that match fields in your event data.
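For instance, a hedged sketch that enriches failed-login events with a department field (the index name, the employee_info lookup definition, and the field names are hypothetical):
index=auth action=failure
| lookup employee_info user OUTPUT department
| stats count BY department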
There are 5 default fields that are added to every event in Splunk. They are: 1) host, 2) source, 3) source
type, 4) index, and 5) timestamp.
You can extract fields using the UI from the fields sidebar, the event list, or the Settings menu. Another way to extract fields
in Splunk is to write your own regular expressions in a props configuration file.
A summary index is a special index that stores the results calculated by Splunk. It is a fast and cheap way to run a
query over a longer period of time.
You can prevent the event from being indexed by Splunk by excluding debug messages by putting them in the null
queue. You have to keep the null queue in transforms.conf file at the forwarder level itself.
Splunk DB Connect is a SQL database plugin that enables you to import tables, rows, and columns from a database and add that data to Splunk.
Splunk DB Connect helps in providing reliable and scalable integration between databases and Splunk Enterprise.
An index is the directory used by Splunk Enterprise to store data and index files. These index files contain
various buckets managed by the age of the data.
The alert manager adds workflow to Splunk. The purpose of the alert manager is to provide a common app with
dashboards to search for alerts or events.
25) What is the difference between Index time and Search time?
Index time is the period when the data is consumed, up to the point when it is written to disk. Search time takes place
while a search is run, as events are composed by the search.
27) Name the commands that fall under the "filtering results" category
The commands included in the "filtering results" category are: "where," "sort," "rex," and "search."
Free license
Beta license
Search heads license
Cluster members license
Forwarder license
Enterprise license
The SPL commands are classified into five categories: 1) Filtering Results, 2) Sorting Results, 3) Filtering Grouping
Results, 4) Adding Fields, and 5) Reporting Results.
This command is used to calculate an expression. The eval command evaluates Boolean, string, and
mathematical expressions. You can use multiple eval expressions in a single search by separating them with commas.
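A short hedged example with two eval expressions separated by a comma (the index and field names are illustrative):
index=web
| eval kb = bytes/1024, severity = if(status >= 500, "error", "ok")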
31) Name commands which are included in the reporting results category
Following are the commands which are included in the reporting results category:
Rare
Chart
Timechart
Top
Stats
Splunk on Splunk or SOS is a Splunk app that helps you to analyze and troubleshoot Splunk environment
performance and issues.
This command searches and replaces specified field values with replacement values.
34) Name features which are not available in Splunk free version?
Distributed searching
Forwarding in HTTP or TCP
Agile statistics and reporting with Real-time architecture
Offers analysis, search, and visualization capabilities to empower users of all types.
Generate ROI faster
A null queue is an approach to filter out unwanted incoming events sent by Splunk enterprise.
37) What is the main difference between source & source type
The source identifies where a particular event originates, while the sourcetype determines
how Splunk processes the incoming data stream into events according to its nature.
It is used to combine the results of a sub search with the results of the actual search. Here the fields must be
common to each result set. You can also combine a search set of results to itself using the selfjoin command in
Splunk.
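A hedged sketch of joining results with a subsearch on a common field (the index and field names are examples only):
index=orders
| join type=inner customer_id [ search index=customers | fields customer_id, customer_name ]
| table order_id, customer_id, customer_name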
To start and stop Splunk services, you can use the following commands:
./splunk start
./splunk stop
Deployment server is a Splunk instance that acts as a centralized configuration manager. It is used to deploy the
configuration to other Splunk instances.
Time zone property provides the output for a specific time zone. Splunk takes the default time zone from browser
settings. The browser takes the current time zone from the computer system, which is currently in use. Splunk takes
that time zone when users are searching and correlating bulk data coming from other sources.
Splunk DB Connect is a plugin which allows adding database data to Splunk reports. It helps in providing reliable and
scalable integration between relational databases and Splunk Enterprise.
You can make use of a bash script in order to install forwarder remotely.
A syslog server is used to collect data from devices like routers and switches, as well as application logs from the
web server. You can use rsyslog or syslog-ng to configure a syslog server.
Use the forwarder tab available on the DMC (Distributed Management Console) to monitor the status of forwarders
and the deployment server to manage them.
Sumo logic
Loglogic
Loggly
Logstash
The Key Value (KV) store allows you to store and obtain data inside Splunk. The KV store also helps you to, for example, manage a job queue, maintain the state of an application, and keep lists (such as environment assets) that searches can look up.
The deployer is a Splunk Enterprise instance which is used to deploy apps to the members of a search head cluster. It can also be used to
distribute configuration information for apps and users.
It is used when the indexes are of high volume, i.e., 10GB of data.
The regex command removes results which do not match the desired regular expression.
The outputlookup command writes the search results to a lookup table on the hard disk.
Hot
Warm
Cold
Frozen
Thawed
Input
Parsing
Indexing
Searching
The first phase: Data is generated by various sources and collected for solving queries.
The second phase: Splunk uses the data to solve the query.
The third phase: It displays the answers via graphs, reports, or charts which are understood by the audience.
Splunk is available in three different versions: 1) Splunk Enterprise, 2) Splunk Light, and 3) Splunk Cloud.
Splunk Enterprise: The Splunk Enterprise edition is used by many IT organizations. It helps you to analyze the
data from various websites and applications.
Splunk Cloud: Splunk Cloud is a SaaS (Software as a Service) offering. It offers almost the same features as the
Enterprise version, including APIs, SDKs, and apps.
Splunk Light: Splunk Light is a free version which allows you to make reports and to search and edit your log data.
The Splunk Light version has limited functionalities and features compared to the other versions.
Cisco
Facebook
Bosch
Adobe
IBM
Walmart
Salesforce
Search Processing Language, or SPL, is a language which contains functions, commands, and arguments. It is used to
get the desired output from the indexed data.
Application Monitoring
Employee Management
Physical Security
Network Security
Yes, the search result can be used to make changes in an existing search.
List
Table
Raw
The search result can be exported into JSON, CSV, XML, and PDF.
AND: It is implied between two terms, so you do not need to write it.
OR: It determines that either one of the two arguments should be true.
NOT: used to filter out events having a specific word.
The top command is used to display the common values of a field, with their percentage and count.
It calculates aggregate statistics over a dataset, such as count, sum, and average.
Scheduled alert: An alert that is based on a historical search. It runs periodically on a set schedule.
Per-result alert: An alert that is based on a real-time search which runs over all time.
Rolling-window alert: An alert that is based on a real-time search. This search is set to run within a specific
rolling time window that you define.
They are used to assign names to specific field-value pairs. The field can be event type, source, source type, or
host.
In order to increase the size of data storage, you can either add more space to index or add more indexers.
There is only one difference between Splunk apps and add-ons: Splunk apps contain built-in reports,
configurations, and dashboards, whereas Splunk add-ons contain only built-in configurations; they do not contain
dashboards or reports.
80) What is the primary difference between stats and eventstats commands?
The stats command provides summary statistics of the existing fields available in the search output and then stores them as
values in new fields. With the eventstats command, on the other hand, the aggregation results are added inline to every event,
but only if the aggregation applies to that particular event.
The sourcetype field is a default field that identifies the data structure of an event. It determines how Splunk formats the data
while indexing.
Calculated fields are fields which perform calculations using the values of two or more fields available in a specific
event.
Abstract
Erex
Addtotals
Accum
Filldown
Typer
Rename
Anomalies
xyseries command converts the search results into a format that is suitable for graphing.
spath command is used to extract fields from structured data formats like JSON and XML.
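A minimal hedged example that extracts a nested JSON field with spath (the index, path, and field names are illustrative):
index=app sourcetype=_json
| spath input=_raw path=user.details.city output=city
| stats count BY city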
You can create knowledge objects, reports, and dashboards in the Search & Reporting app.
They are results saved from a search action that show the visualization and statistics of a particular event.
The dashboard is defined as a collection of views that are made of various panels.
It is used to work with data without creating any data model. Instant pivot is available to all users.
94) How is it possible to use the host value and not the IP address or the DNS name for a TCP input?
Under the input's stanza in the inputs configuration file, set connection_host to none and specify the host value explicitly.
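A hedged inputs.conf sketch (the port number, host value, and sourcetype are example values):
[tcp://9999]
connection_host = none
host = my_static_hostname
sourcetype = my_tcp_feed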
It is a group of servers connected with each other. These servers are used to share configuration, user data, and load.
It is a group of Splunk enterprise search heads that serves as a central resource for searching.
Splunk SDKs are written on the base of Splunk REST APIs. Various languages supported by SDKs are: 1) Java, 2)
Python, 3) JavaScript, and 4) C#.
The Splunk REST API offers various processes for accessing every feature available in the product. Your program
communicates to Splunk enterprise using HTTP or HTTPS. It uses the same protocols that any web browser uses to
interact with web pages.
Splunk Enterprise Security uses accelerated data models to provide panels, dashboards, and correlation search results. It uses
the indexers for processing and storage. The accelerated data is stored within each index by default.
Indexers create various files which contain two types of data: 1) the raw data and 2) the metadata (index) files. Together, these files
constitute the Splunk Enterprise index.
Can you discuss your experience with configuring and managing the Splunk Enterprise Security (ES)
app, including data inputs, correlation searches, and creating customizations?
Configuring and managing the Splunk Enterprise Security (ES) app covers three main areas: data inputs, correlation searches,
and customizations.
For configuring data inputs, you can add data to the ES app by configuring input sources, such as log files, network devices,
cloud services, and third-party apps. You can also define sourcetypes, source categories, and host names to ensure that the
data is correctly categorized and attributed.
For correlation searches, you can use the built-in correlation searches in the ES app, which can identify security incidents and
threats by analyzing the data and looking for patterns that match specific security use cases. You can also create custom
correlation searches to meet specific security requirements or to enhance the built-in correlation searches.
For customizations, you can create custom dashboards, reports, and visualizations to meet the specific requirements of your
organization. You can also use the Splunk ES app to create custom alerts, add custom fields to events, and build custom
knowledge objects, such as notable events and incident review workflows.
Overall, the Splunk Enterprise Security app provides a comprehensive solution for security information and event management
(SIEM) and can be configured and customized to meet the specific needs of your organization.
What steps do you follow to maintain the security and performance of a Splunk ES deployment?
Here are some steps that you can follow to maintain the security and performance of a Splunk Enterprise Security (ES)
deployment:
1. Regular software updates: Keep your Splunk ES deployment up to date by regularly checking for software updates and
applying them as soon as possible to ensure that the system is protected against known security vulnerabilities.
2. Secure configuration: Configure the system securely by setting strong passwords, enabling SSL/TLS encryption for
communication between components, and limiting access to the system to only authorized users.
3. Data monitoring: Monitor the data inputs and outputs to ensure that the data is being processed and stored as expected
and to identify and address any performance issues.
4. Performance tuning: Regularly monitor and tune the performance of the system by optimizing indexing, searching, and
reporting performance, and by reducing disk I/O and network traffic.
5. Disaster recovery: Establish and test a disaster recovery plan to ensure that the system can be quickly restored in case
of a failure.
6. Regular backups: Regularly back up the system data to ensure that it can be quickly restored in case of a failure.
7. Security audit: Regularly conduct a security audit of the system to identify and address any potential security
vulnerabilities.
8. User management: Manage user accounts and permissions carefully to ensure that only authorized users have access
to sensitive data.
By following these steps, you can ensure the security and performance of your Splunk ES deployment and minimize the risk of
data breaches or system failures.
Troubleshooting issues with a Splunk Enterprise Security (ES) deployment can be a complex process, but the following steps
can help you to quickly identify and resolve common issues:
1. Review the logs: The first step in troubleshooting is to review the logs and event data generated by the system. This
can help you to identify any error messages or unexpected behaviors.
2. Check system health: Check the health of the system by monitoring the indexing and searching performance, as well as
the disk space and network utilization.
3. Consult the documentation: Consult the Splunk ES documentation, including the knowledge base and user forums, to
find solutions to common issues.
4. Use the Splunk ES CLI tools: Use the CLI tools provided by Splunk ES, such as the “splunkd” and “splunk” commands,
to gather diagnostic information and perform advanced troubleshooting tasks.
5. Work with Splunk support: If you are unable to resolve the issue on your own, you can reach out to the Splunk Support
team for assistance. They can provide expert guidance and help you to resolve the issue.
6. Engage with the community: Engage with the Splunk ES community by joining online forums, attending webinars and
events, and participating in the Splunk Trust community.
By following these steps, you can effectively troubleshoot issues with a Splunk ES deployment and ensure that the system is
running smoothly and efficiently.
Can you discuss your experience with creating and using threat intelligence within Splunk ES?
To create and use threat intelligence within Splunk Enterprise Security (ES), an administrator should follow these steps:
1. Acquire Threat Intelligence: Threat intelligence data can come from various sources like commercial vendors, open-
source feeds, and internal sources.
2. Normalize and Enrich: The acquired threat intelligence data needs to be normalized and enriched to a common format
that can be consumed by Splunk ES.
3. Store and Manage: The enriched data can be stored in Splunk’s index or in a separate database for management.
4. Use in Rules and Correlation Searches: The threat intelligence data can be used to create correlation searches that can
detect and alert on malicious activity, or to create custom rules in the Splunk ES Content Update app.
5. Monitor and Respond: The created alerts and reports should be regularly monitored to identify any suspicious activities
and to take necessary action.
What experience do you have with configuring and using the Splunk ES Incident Review dashboard?
Splunk ES Incident Review dashboard is a centralized view of all the security incidents detected and analyzed by Splunk
Enterprise Security. A certified Splunk Enterprise Security Administrator is expected to have experience in configuring and using
the dashboard to monitor and respond to security incidents. This involves setting up data inputs, creating alerts, creating custom
reports and dashboards, and performing ad hoc searches and investigations. To configure and use the dashboard, an
administrator should have a strong understanding of the underlying data models, as well as the knowledge of best practices and
guidelines for security information and event management (SIEM) and incident response.
How do you approach integrating and using third-party security tools with Splunk ES?
To approach integrating and using third-party security tools with Splunk Enterprise Security (ES), you can follow these steps:
1. Determine the security tools you need to integrate with Splunk ES.
2. Research the supported integration methods for each tool, such as APIs or data inputs.
3. Plan the integration, including what data you want to collect and how you will manage the integration process.
4. Configure the data inputs for the third-party security tools in Splunk ES.
5. Validate the integration by checking the data that is being collected and ensuring that it meets your needs.
6. Create custom alerts, reports, and dashboards in Splunk ES to monitor and visualize the data from the third-party
security tools.
7. Regularly review and update the integration as necessary to ensure that it continues to meet your needs and perform
optimally.
Can you discuss your experience with creating and implementing custom alerts, reports, and
dashboards within Splunk ES?
As a Splunk Enterprise Security Certified Admin, I have experience in creating and implementing custom alerts, reports, and
dashboards within the Splunk ES environment. This involves understanding the requirement, designing the appropriate search
queries, setting up the alerts in the Alert Manager, creating reports and dashboards using the Splunk Dashboard Editor and
configuring the desired visualization options. I also ensure that the created alerts, reports and dashboards are relevant, up-to-
date and provide the required insights to the stakeholders. Additionally, I regularly review and optimize these components to
ensure their performance and accuracy.
What experience do you have with implementing and using the Splunk Enterprise Security Content
Update (ESCU) app?
The ESCU app is a critical component of the Splunk Enterprise Security (ES) solution, which provides security teams with real-
time threat intelligence and security analytics. Implementing ESCU involves configuring and integrating the app with existing
Splunk ES installations, setting up data inputs to collect and index relevant security data, and fine-tuning the app’s settings and
configuration to meet the specific needs of the organization.
To effectively implement and use the ESCU app, administrators should have a good understanding of the Splunk platform and
its architecture, as well as experience working with security data and security analytics solutions. They should also have strong
analytical and problem-solving skills, as well as experience working with data privacy and security best practices.
What steps do you follow to ensure data privacy and security when using Splunk ES?
To ensure data privacy and security when using Splunk Enterprise Security (ES), some steps that can be followed include:
1. Implement role-based access control (RBAC): This involves creating different roles with different levels of access to the
data and applications within Splunk ES, which helps to prevent unauthorized access to sensitive information.
2. Encrypt sensitive data: To prevent sensitive data from being accessed by unauthorized parties, it can be encrypted both
in transit and at rest.
3. Use secure protocols: Secure protocols such as SSL/TLS should be used to encrypt the data that is being transmitted
between the Splunk ES environment and other systems.
4. Regularly audit user activity: Regularly auditing user activity within the Splunk ES environment can help to identify any
potential security breaches or unauthorized access attempts.
5. Implement network security: Firewall rules and network segmentation should be used to limit access to the Splunk ES
environment to only authorized systems and users.
6. Regularly update software: Regularly updating the software and components within the Splunk ES environment can
help to mitigate the risk of known vulnerabilities being exploited.
7. Backup and disaster recovery: Regular backups and a disaster recovery plan should be in place to minimize the impact
of data loss or data breaches.
How do you approach scaling a Splunk ES deployment to accommodate increasing data volume and
complexity?
When scaling a Splunk ES deployment, there are several factors to consider, including:
1. Indexer capacity: This includes adding more indexers to handle the increased data volume and balance the load.
2. Storage: Ensure that the storage capacity is adequate for the increased data volume.
3. Data distribution: Consider distributing data across multiple indexers to reduce the load on any one indexer.
4. Forwarder configuration: Ensure that forwarders are configured optimally to minimize data loss and ensure data
accuracy.
5. Cluster configuration: Consider configuring a Splunk cluster to improve reliability and increase the ability to handle
increased data volume.
6. Data retention policy: Evaluate the data retention policy and adjust it as necessary to accommodate the increased data
volume.
7. Monitoring: Regularly monitor the performance of the deployment and make adjustments as necessary to ensure it
continues to perform optimally.
Indexing
Search
Alerts
Dashboards
Pivot
Reports
Lastly, the Data model
3. What is Indexing?
One can collect data from devices and applications such as websites, servers, databases, operating systems, and more. Once
the data is collected, the indexer segments, compresses, and stores the data, and maintains the supporting metadata to
accelerate searching.
When search results for both historical and real-time searches fulfil defined conditions, alerts are sent to you. Alerts can be set
up to send alarm information to specific email recipients, publish alert information to an RSS feed, or run a custom script, such
as one that sends an alert event to Syslog.
Dashboards contain panels of modules like search boxes, fields, charts, and so on. Dashboard panels are regularly connected
to saved searches or pivots. They display the results of completed searches as well as data from real-time searches that run in
the background.
Create ad hoc reports, plan them to run at regular intervals, or have a scheduled report to create alerts when the result satisfies
certain criteria.
A data model is a search-time mapping of semantic knowledge about one or more datasets that are hierarchically organized. It
stores the domain information needed to create a range of customized dataset queries. Splunk software uses these specific
searches to generate reports for Pivot users.
The Security Posture dashboard is meant to provide high-level insight into the important events across all domains of your
deployment, suitable for display in a Security Operations Center (SOC).
A notable event represents one or more anomalous incidents detected by a correlation search across data sources.
Firstly, Forwarders
Secondly, Indexers
Lastly, Search heads
Search heads manage searches. They handle user search requests and distribute them among a group of indexers who search
their local data.
An indexer cluster is a collection of indexers that have been set up to replicate each other’s data so that the system has multiple
copies of all data. Index replication or indexer clustering is the term for this method.
Event processing parses incoming data to allow for quick search and analysis, then stores the results as events in the index.
Events indexes impose minimal structure and can accommodate any type of data, including metrics data. Moreover, events
indexes are the default index type.
To accommodate the larger volume and lower latency demands associated with metrics data, metrics indexes use a highly
organised format. Compared with putting the same data into events indexes, putting metrics data into metrics indexes results in
faster performance and less index storage usage.
Master node
Peer node
Lastly, One or more search heads to coordinate searches across all the peer nodes.
On the Splunk Enterprise Security menu bar, select Configure > General > Permissions.
Find the role you want to update.
Find the ES Component you want to add.
Select the check box for the component for the role.
Lastly, Save.
25. What are the different ways you can configure Splunk software?
Splunk software can be configured through Splunk Web, configuration files, the CLI, and the REST API.
Scripted inputs, in particular, are employed to acquire data from an API or other remote data interfaces and message queues.
To update your existing technology add-on with the newer one, click the link in the version column.
Click Update to get the newer version.
Lastly, Click Restart.
From the Splunk Enterprise menu bar, select Settings > Lookups > Lookup definitions.
Filter on mitre.
Click the Clone action for mitre_attack_lookup.
Leave Type as-is.
Type a name for the industry-standard framework.
Revise the Supported fields.
Lastly, click Save.
A type of event is not the same as an event. An event is a single instance of data, such as a single log entry. Furthermore, an
event type is a classification that is used to categorise and label events.
The transaction command locates transactions based on events that satisfy a set of criteria. Furthermore, transactions are made
up of each member’s raw text, the earliest member’s time and date data, and the union of all other fields of each member.
Firstly, Duration
Lastly, Eventcount
The values in the duration field show the difference between the timestamps for the first and last events in the transaction.
Whereas, the values in the eventcount field show the number of events in the transaction.
The User Activity dashboard displays panels representing common risk-generating user activities such as suspicious website
activity.
Using internal user credentials and location-relevant data, the Access Anomalies dashboard displays collective authentication
attempts from diverse IP addresses as well as unlikely travel anomalies.
The System Center dashboard uses the Restricted Traffic list to detect software that is prohibited by your security policy, such
as IRC, data destruction tools, file transfer software, or known harmful software, such as malware linked to a recent outbreak.
The search command is used in the pipeline to extract events from indexes or to filter the results of a previous search operation.
Keywords, quoted phrases, wildcards, and field-value expressions can all be used to retrieve events from your indexes.
Furthermore, the search command does not need to be specified at the start of your search criteria.
Interesting Services comprises a list of services in your deployment. The correlation search Prohibited Service Detected uses
this lookup to determine whether a service is required, prohibited, and secure.
keywords
quoted phrases
Boolean operators
wildcards