Splunk Lab - Enriching Data With Lookups
Overview
Welcome to the Splunk Education lab environment. In these lab exercises, you will create automatic
lookups that provide additional information for a source type, upload lookup table files, and use lookups in
searches. You will also upload a KML lookup table file and create a geospatial lookup definition, which you
will use in searches and in a choropleth visualization report.
Scenario
You will use data from the international video game company, Buttercup Games. A list of source types is
provided below.
NOTE: This is a lab environment driven by data generators with obvious limitations. This is not a
production environment. Screenshots approximate what you should see, not the exact output.
network | Email security data | cisco_esa | dcid, icid, mailfrom, mailto, mid
© 2021 Splunk Inc. All rights reserved. Enriching Data with Lookups 28 February 2022 1
Common Commands and Functions
These commands and statistical functions are commonly used in searches but may not have been explicitly
discussed in the module. Please use this table for quick reference. Click on the hyperlinked SPL to be taken to
the Search Manual for that command or function.
SPL: sort
Type: command
Description: Sorts results in descending or ascending order by a specified field. Can limit results to a
specific number.
Example: Sort the first 100 src_ip values in descending order
| sort 100 -src_ip

SPL: sum()
Type: statistical function
Description: Returns the sum of the values of a field. Can be used with stats, timechart, and chart
commands.
Example: Calculate the sum of the bytes field
| stats sum(bytes)

SPL: count or count()
Type: statistical function
Description: Returns the number of occurrences of all events or a specific field. Can be used with stats,
timechart, and chart commands.
Example: Count all events as "events" and count all events that contain a value for action as "action"
| stats count as events, count(action) as action
Refer to the Search Reference Manual for a full list of commands and functions.
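For readers who think more readily in a general-purpose language, the three entries above can be approximated in Python. This is only an analogy, not how Splunk evaluates a search: the sample events and their values are invented, and the sort uses plain string comparison, which may not match SPL's own ordering rules for IP addresses.

```python
# Hypothetical sample events; field names follow the table above, values are invented.
events = [
    {"src_ip": "10.0.0.3", "bytes": 200, "action": "purchase"},
    {"src_ip": "10.0.0.9", "bytes": 150},  # this event has no action field
    {"src_ip": "10.0.0.1", "bytes": 50, "action": "view"},
]

# Roughly "| sort 100 -src_ip": descending order, keep at most 100 results.
sorted_events = sorted(events, key=lambda e: e["src_ip"], reverse=True)[:100]

# Roughly "| stats sum(bytes)": add up the field wherever it is present.
total_bytes = sum(e["bytes"] for e in events if "bytes" in e)

# Roughly "| stats count as events, count(action) as action":
# count counts every event, count(action) only events that carry a value for action.
counts = {
    "events": len(events),
    "action": sum(1 for e in events if e.get("action") is not None),
}
```

The point worth noticing is the difference between the two counts: the event without an action field contributes to "events" but not to "action".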
Lab Exercise 1 – Creating Lookups
Description
Configure the lab environment user account. Then, create a new automatic lookup that provides additional
information to the access_combined source type.
Steps
Task 1: Log in to Splunk and change the account name and time zone.
Set up your lab environment to fit your time zone. This also allows the
instructor to track your progress and assist you if necessary.
1. Log in to your Splunk lab environment using the username and
password provided to you.
2. You may see a pop-up window welcoming you to the lab environment.
You can click Continue to Tour but this is not required. Click Skip to
dismiss the window.
3. Click on the username you logged in with (at the top of the screen) and
then choose Account Settings from the drop-down menu.
4. In the Full name box, enter your first and last name.
5. Click Save.
6. Reload your browser to reflect the recent changes to the interface.
After you complete step 6, you will see your name in the web interface.
(This area of the web interface will be referred to as user name.)
NOTE: Sometimes there can be delays in executing an action like saving in the UI or returning results
of a search. If you are experiencing a delay, please allow the UI a few minutes to execute
your action.
Scenario: The access_combined source type contains HTTP status codes, but not the code definitions.
Task 2: Add a lookup file to the access_combined source type to make the code definitions available
as fields.
11. Obtain the status_definitions.csv file (see image above for location of link to file).
12. Open the status_definitions.csv file in a text editor and note the comma-separated values format, which
defines each HTTP response status code, its description, and its status type.
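As a rough illustration of what a file-based lookup table looks like once it is loaded, the sketch below parses a small CSV into a dictionary keyed by status code. The column names are assumed from the field names used later in this lab (status, status_description, status_type), and the three sample rows and their status_type values are invented. The real status_definitions.csv may differ.

```python
import csv
import io

# Hypothetical excerpt; the actual file contents are not reproduced here.
SAMPLE_CSV = """status,status_description,status_type
200,OK,Successful
404,Not Found,Client Error
503,Service Unavailable,Server Error
"""

def load_lookup(text, key_field):
    """Parse CSV text into {key value: full row} for quick matching."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key_field]: row for row in reader}

status_table = load_lookup(SAMPLE_CSV, "status")
```

The header row names the fields. One column (here status) is matched against events, and the remaining columns are what the lookup can add to them.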
16. In the top left corner of Splunk Web, select Apps > Search & Reporting. This sets our app context to the
search app.
20. In the top left corner, select Apps > Search & Reporting.
21. Use the inputlookup command with the name of the lookup definition to verify the contents of the lookup
file and that the lookup definition was created correctly.
| inputlookup status_definitions_lookup
22. Search the online store data (index=web) over the Last 24 hours for all events that were not associated
with an "OK" status of 200.
index=web status!=200
23. For the same search results, view the Interesting Fields side bar, and click on status. Note that there
are no fields for status description or status type.
24. Add the status_description and status_type fields using the lookup definition you created in the previous
task: pipe the results to | lookup status_definitions_lookup status. Run the search and verify that the
status_description and status_type fields are included in the Interesting Fields list.
index=web status!=200
| lookup status_definitions_lookup status
NOTE: You can limit or customize the fields added by the lookup command by using the OUTPUT or
OUTPUTNEW options. For example, use lookup status_definitions_lookup status OUTPUT
status_description to add only the status_description field to the results. For more
information on the lookup command, visit the Search Reference Manual.
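The difference between OUTPUT and OUTPUTNEW can be sketched in Python as follows. This is a simplified model rather than Splunk's implementation: the lookup function, its parameters, and the sample data are hypothetical, and events without a matching row are simply passed through unchanged. It assumes OUTPUT overwrites an existing field value, while OUTPUTNEW leaves a field alone if the event already has it.

```python
def lookup(events, table, key, output=None, only_new=False):
    """Enrich events from table, matching on key.

    output=None  -> add every non-key column (like omitting the OUTPUT clause)
    only_new     -> mimic OUTPUTNEW: never overwrite a field the event already has
    """
    enriched = []
    for event in events:
        event = dict(event)  # work on a copy, leave the input untouched
        row = table.get(event.get(key))
        if row is not None:
            fields = output or [f for f in row if f != key]
            for field in fields:
                if only_new and field in event:
                    continue
                event[field] = row[field]
        enriched.append(event)
    return enriched

# Hypothetical table row and event for illustration.
table = {"404": {"status": "404", "status_description": "Not Found",
                 "status_type": "Client Error"}}
events = [{"status": "404", "status_description": "custom text"}]

with_output = lookup(events, table, "status", output=["status_description"])
with_outputnew = lookup(events, table, "status", output=["status_description"],
                        only_new=True)
```

With OUTPUT the event's existing status_description is replaced by the table value. With OUTPUTNEW the existing value is kept.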
NOTE: Step 25 is optional and requires knowledge of the stats command. You can skip this step and
follow step 26 to save your search as a report.
25. Modify the search to use the stats command to get a count by host, status_description, and
status_type.
index=web status!=200
| lookup status_definitions_lookup status
| stats count by host, status_description, status_type
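For those unfamiliar with stats, the grouping in the search above behaves roughly like the Python sketch below. The events, host names, and field values are invented, and stats_count_by is a hypothetical helper. One assumption is stated in the code: an event that lacks any of the "by" fields does not contribute to a group.

```python
from collections import Counter

# Hypothetical enriched events; values are invented for illustration.
events = [
    {"host": "www1", "status_description": "Not Found", "status_type": "Client Error"},
    {"host": "www1", "status_description": "Not Found", "status_type": "Client Error"},
    {"host": "www2", "status_description": "Service Unavailable", "status_type": "Server Error"},
    {"host": "www2"},  # no lookup match, so no enrichment fields
]

BY_FIELDS = ("host", "status_description", "status_type")

def stats_count_by(events, by_fields):
    """Count events per unique combination of the by_fields values."""
    counts = Counter()
    for event in events:
        # Assumption: events missing any by-field are left out of the result.
        if all(event.get(f) is not None for f in by_fields):
            counts[tuple(event[f] for f in by_fields)] += 1
    return counts

result = stats_count_by(events, BY_FIELDS)
```

Each key of result corresponds to one row of the statistics table, and its value to the count column.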
26. Save your search as a report with the name L1S1.
a. Click Save As > Report
b. For Title, enter L1S1.
c. Save.
d. You can View your report or exit out of the Your Report Has Been Created window by clicking
the X in the upper-right corner.
e. You can access your saved reports using the Reports tab in the application bar.
f. Re-initialize the search window by clicking Search in the application bar.
Your recently saved L1S1 report will be visible in the Reports tab.
NOTE: It may take a few moments before the automatic lookup starts working.
Task 7: Verify your automatic lookup is working.
30. In the top left corner, select Apps > Search & Reporting.
31. Search the online store data for the Last 24 hours for all events that do not have a status of 200.
index=web status!=200
32. In the search results under the Interesting Fields sidebar, notice that StatusDescription and
StatusType are showing automatically, without requiring the use of any lookup commands.
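Conceptually, an automatic lookup attaches a lookup to a source type so that enrichment happens at search time without an explicit lookup command. The Python sketch below models that idea. The configuration dictionary, the function name, and the sample row are hypothetical, and the mapping from status_description and status_type to StatusDescription and StatusType is assumed from the field names shown in this step.

```python
# Hypothetical lookup table row.
STATUS_TABLE = {
    "404": {"status_description": "Not Found", "status_type": "Client Error"},
}

# Hypothetical automatic lookup configuration, keyed by source type.
AUTOMATIC_LOOKUPS = {
    "access_combined": {
        "table": STATUS_TABLE,
        "input_field": "status",
        # lookup column -> field name as it appears on the event (assumed mapping)
        "output_fields": {
            "status_description": "StatusDescription",
            "status_type": "StatusType",
        },
    }
}

def apply_automatic_lookup(event):
    """Return the event enriched by the lookup bound to its source type, if any."""
    config = AUTOMATIC_LOOKUPS.get(event.get("sourcetype"))
    if config is None:
        return event
    row = config["table"].get(event.get(config["input_field"]))
    if row is None:
        return event
    enriched = dict(event)
    for lookup_field, event_field in config["output_fields"].items():
        enriched[event_field] = row[lookup_field]
    return enriched

enriched_event = apply_automatic_lookup({"sourcetype": "access_combined", "status": "404"})
```

Because the binding is by source type, events from other source types pass through unchanged.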
NOTE: Steps 33 - 35 are optional and require knowledge of the stats command. You can skip these
steps and follow step 36 to save your search as a report.
33. Search the online store data and count events by host, StatusDescription, and StatusType over the
Last 24 hours.
index=web status!=200
| stats count by host, StatusDescription, StatusType
34. Click on Visualization, then select Column Chart.
35. Create a separate visualization for each status description. Click on Trellis.
— Select the Use Trellis Layout checkbox.
— For Split By select StatusDescription.
Scenario: HR wants a count of logins by known Buttercup Games employees over the last 24 hours.
Exclude non-standard employee accounts from the results.
Task 8: Upload the knownusers.csv lookup table file and create a lookup definition to filter out
non-standard Buttercup employees from the lookup.
NOTE: The knownusers.csv lookup contains Buttercup employees as well as common user accounts
such as root, mail, and so on.
39. Navigate to Settings > Lookups and click + Add new next to Lookup table files.
a. Save the lookup table file with these values:
— Destination app: search
— File: knownusers.csv
— Destination filename: knownusers.csv
b. Click Save.
40. Navigate back to the Search & Reporting app and check the contents of the lookup using the
inputlookup command. There should be 76 results.
| inputlookup knownusers.csv
41. Navigate to Settings > Lookups and click + Add new next to Lookup definitions.
a. Save the lookup definition with these values:
— Destination app: search
— Name: knownusers_lookup
— Type: File-based
— Lookup file: knownusers.csv
b. Check the Advanced options checkbox.
c. For Filter lookup, write a Boolean expression that excludes root, mail, and apache users from
the lookup.
(user!=root) AND (user!=mail) AND (user!=apache)
There are multiple ways to type in a valid Boolean expression. An alternative expression that
would also work is: NOT user IN(root,mail,apache)
d. Click Save.
42. Navigate back to the Search & Reporting app and use the inputlookup command to verify that the
lookup definition does not include root, mail or apache.
| inputlookup knownusers_lookup
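The two filter expressions from step 41 can be checked for equivalence with a short Python sketch. The rows are invented apart from the three excluded account names, and the function names are hypothetical. The real knownusers.csv has more rows and may have more columns.

```python
# Hypothetical lookup rows; only root, mail, and apache come from the lab text.
rows = [
    {"user": "alice"},
    {"user": "root"},
    {"user": "bob"},
    {"user": "mail"},
    {"user": "apache"},
]

def filter_and(row):
    # (user!=root) AND (user!=mail) AND (user!=apache)
    return row["user"] != "root" and row["user"] != "mail" and row["user"] != "apache"

def filter_not_in(row):
    # NOT user IN(root,mail,apache)
    return not (row["user"] in ("root", "mail", "apache"))

kept_users = [row["user"] for row in rows if filter_and(row)]
```

Both predicates keep exactly the same rows, which is why either expression works as the Filter lookup value.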
43. Starting from the first search below, add | lookup knownusers_lookup user OUTPUT user so that the
results are limited to only Buttercup Games employees. Search over the Last 24 hours.
Original search:
index=security sourcetype=linux_secure
| stats count by user
Modified search:
index=security sourcetype=linux_secure
| lookup knownusers_lookup user OUTPUT user
| stats count by user
44. Save your search as a report with the name L1S3.
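Why adding the lookup in step 43 limits the results can be sketched in Python. The sketch assumes, as the step implies, that an event whose user has no match in the filtered lookup ends up without a user value, so it falls out of the count by user. The user names other than root are invented and the helper function is hypothetical.

```python
from collections import Counter

# Hypothetical contents of the filtered lookup (root, mail, apache already excluded).
known_users = {"alice", "bob"}

# Hypothetical login events.
events = [
    {"user": "alice"},
    {"user": "root"},
    {"user": "alice"},
    {"user": "bob"},
    {"user": "mallory"},
]

def lookup_output_user(event, known):
    """Model of '| lookup knownusers_lookup user OUTPUT user' under the stated assumption."""
    event = dict(event)
    if event.get("user") not in known:
        # Assumption: no match means the output field is left empty.
        event.pop("user", None)
    return event

enriched = [lookup_output_user(e, known_users) for e in events]

# Model of '| stats count by user': events without a user do not form a group.
counts = Counter(e["user"] for e in enriched if "user" in e)
```

Only users present in the filtered lookup remain in counts, which matches the intent of the report saved as L1S3.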
Lab Exercise 2 – Geospatial and External Lookups
Description
In this exercise, you will define an external lookup, upload a KML lookup table file, create a geospatial lookup
definition, and use these lookups in searches.
Steps
Task 1: Upload and define a geospatial lookup and verify its contents in search.
| inputlookup canada_prov
5. Save your search as a report with the name L2S1.
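A geospatial lookup conceptually answers the question "which region contains this coordinate?". The Python sketch below shows one common way to answer it, a ray-casting point-in-polygon test. It is an illustration of the general technique, not Splunk's implementation. The function names, the region names, and the square polygons are invented and do not represent real provinces.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: count how many polygon edges a ray going east from the point crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def find_region(lon, lat, regions):
    """Return the name of the first region containing the point, or None."""
    for name, polygon in regions.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None

# Hypothetical regions given as (lon, lat) vertex lists.
regions = {
    "region_a": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "region_b": [(4, 0), (8, 0), (8, 4), (4, 4)],
}

match = find_region(2, 2, regions)
```

The returned region name plays the role of the value that a choropleth map would use to shade an area.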
Task 2: Create and use an external lookup with the external_lookup.py script to return a count of online
sales events by host name.
6. Sales wants a count of online sales events by host name over the last 60 minutes. The search below finds
online sales events and counts them by clientip. Run this search over the Last 60
minutes:
index=web sourcetype=access_combined
| stats count by clientip
7. Use an external lookup to enrich your data with client host values. The script for this lookup,
external_lookup.py, has already been placed in the search app's bin directory
(SPLUNK_HOME/etc/apps/search/bin/external_lookup.py); this is required before you can define
the external lookup. Navigate to Settings > Lookups and click + Add new next to Lookup definitions.
a. Save the lookup definition with these values:
— Destination app: search
— Name: dnslookup
— Type: External
— Command: external_lookup.py clienthost clientip
— Supported fields: clienthost,clientip
b. Click Save.
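To give a feel for what an external lookup script does, here is a minimal Python sketch. It is not the contents of the real external_lookup.py. It assumes the script receives CSV rows with a header containing both supported fields, fills in the empty host field from the IP address, and writes the CSV back out. The function names and the resolver parameter are hypothetical; the resolver is injectable so the logic can be tried without network access.

```python
import csv
import io
import socket

def resolve(ip):
    """Reverse DNS lookup; returns an empty string when the address does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ""

def run_lookup(infile, outfile, host_field, ip_field, resolver=resolve):
    """Read CSV rows, fill in host_field from ip_field where it is empty, write CSV back."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if row[ip_field] and not row[host_field]:
            row[host_field] = resolver(row[ip_field])
        writer.writerow(row)

# Example with a stand-in resolver instead of real DNS (address is a documentation IP).
sample_in = io.StringIO("clienthost,clientip\n,192.0.2.10\n")
sample_out = io.StringIO()
run_lookup(sample_in, sample_out, "clienthost", "clientip",
           resolver=lambda ip: "host-" + ip)
```

In a real script the two field names would come from the command line, matching the Command value clienthost clientip in the definition above, and the input and output would be standard input and standard output.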
8. Navigate back to the Search & Reporting app and perform a search of online sales during the Last 60
minutes. Invoke the dnslookup lookup with the lookup command and pipe the results to stats count
by clienthost to count the results by clienthost.
index=web sourcetype=access_combined
| lookup dnslookup clientip
| stats count by clienthost
9. Rewrite the search to include HTTP status and HTTP status descriptions by piping to stats count by
clienthost, status, status_description. This will require an additional lookup command that uses
the status_definitions.csv lookup.
index=web sourcetype=access_combined
| lookup dnslookup clientip
| lookup status_definitions.csv status OUTPUT status_description
| stats count by clienthost, status, status_description
10. Save your search as a report with the name L2S2.