Splunk4Ninjas - Data Onboarding - Lab Guide - Jan 2024
This lab guide contains the step-by-step instructions for the hands-on exercises in the Splunk4Ninjas -
Data Onboarding workshop. This workshop simulates the following:
● Monitoring a location either on your local Splunk instance or from a Splunk Universal/Heavy Forwarder
● Getting data in from a data source that may not have a Splunk app or add-on currently available
Please ensure that you have a copy of the workshop slide deck: https://splk.it/S4N-DataOnboarding
Prerequisites
Splunk Instance
In order to take part in the workshop exercises, you will need your own Splunk instance. You will either be
provided an instance by your workshop host or you will need to enroll in the workshop to have an instance
provisioned. Your host will let you know.
If you are instructed to enroll in today’s workshop, you will need to have a Splunk.com account prior to
enrolling. If you don’t already have a Splunk.com account, please create one at
https://www.splunk.com/en_us/sign-up.html before proceeding with the enrollment link provided by your host.
Knowledge Requirements
Attendees should have a base knowledge of the Splunk interface and understand key concepts such as:
● What is an index?
● What is a sourcetype?
● What is a data model?
⚠️ Troubleshooting Connectivity
If you experience connectivity issues with accessing either your workshop environment or the event page,
please try the following troubleshooting steps. If you still experience issues please reach out to the team
running your workshop.
● Disconnect from VPN (if you’re using one)
● Clear your browser cache and restart your browser (if using Google Chrome, go to: Settings > Privacy
and security > Clear browsing data)
● Try using private browsing mode (e.g. Incognito in Google Chrome) to rule out any cache issues
● Try using another computer such as your personal computer - all you need is a web browser! Cloud
platforms like AWS can often be blocked on corporate laptops.
Table of Contents
Overview
Prerequisites
Table of Contents
Lab 0 - Register/Enroll and Login to Instance
    Description
    Steps
Lab 1 - File Monitor
    Description
    Option 1: Monitoring Through the GUI
        Monitor the Syslog Directory
        Adjust the Input Settings
        Review and Confirm Your Monitoring Input
    Option 2: Monitoring Through the CLI
        SSH into Your Provided Machine
        Monitor the Syslog Directory Through inputs.conf
        Reload and confirm the monitoring input
Lab 2 - Data Parsing
    Description
    Select the badge.log File as your Source
    Set the Source Type
    Adjust the Input Settings
    Submit and Search Your Data
    Set Up the Add-on Builder
Lab 3 - Field Extraction and CIM Compliance
    Description
    Validate the Empty Authentication Dashboard
    Extract Fields from the Badge Data
    Map Source Type to the Authentication Data Model
    Re-accelerate Your Data Model
    Verify the Authentication Dashboard
Appendix
    Troubleshooting for Lab 1, Option 2: Monitoring Through the CLI
    Optional Activity: Lab 2, Tasks 1-4 Accomplished Through the CLI
        Establish a Custom “badge” Source Type
        Use Oneshot to Ingest and View badge.log Data
Lab 0 - Register/Enroll and Login to Instance
Description
You will need a Splunk training instance to take part in the labs for the workshop. In this lab, you will register
for your own Splunk instance to use for the workshop.
_______________________________________________________________________________________
🔑 Has your workshop host already provided you with an instance link and login credentials?
If your workshop host has already provided you with an instance link as well as login credentials, you do NOT
need to follow these instructions for Lab 0. You can instead skip straight to Lab 1 - File Monitor!
_______________________________________________________________________________________
Steps
1. Browse to https://show.splunk.com - or the enrollment link provided by the host - and log in using your
Splunk.com account credentials:
___________________________________________________________________________________
If you do not have a Splunk.com account, create one at
https://www.splunk.com/en_us/sign-up.html. After creating your Splunk.com account, please navigate
back to https://show.splunk.com or use the enrollment link your host provides.
___________________________________________________________________________________
2. Once logged into Splunk Show, you will see the event page for the event that you have been invited to. If
nothing is showing, try selecting “Invited” from the dropdown list.
4. The page will refresh and the event will now display ‘Enrolled’.
5. Once the workshop starts, your individual environment will start up.
6. Scroll down the page to the Instances Information section and expand the Splunk Enterprise section
to locate the URL and login credentials for your lab environment.
If you don’t see any connection information displayed yet, it means that your lab environment has not yet
fully provisioned. Please check this page again in a few minutes.
Lab 1 - File Monitor
Description
Firewall logs collected by a Syslog server are stored locally on that server. We will bring those firewall logs
into Splunk by monitoring the files and directories where those logs are stored. With Splunk Enterprise, we
can do this either from the Splunk Web interface (GUI) or by editing the inputs.conf file through the CLI:
______________________________________________________________________________________
ℹ️ inputs.conf
To learn more about the inputs.conf file, see https://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf.
______________________________________________________________________________________
Option 1: Monitoring Through the GUI
Monitor the Syslog Directory
1. Select Source by navigating to Settings > Add Data > Monitor > Files & Directories and clicking on Browse.
2. Select the opt > data > syslog directory. Ensure all device directories and their firewall.log files are
highlighted and click the Select button in the bottom right of the pop-up.
3. Back on the Select Source page, your File or Directory field should read:
/opt/data/syslog
With this set, click Next.
Adjust the Input Settings
4. On the Input Settings page, set the Source type by selecting New and entering fgt_traffic in the
Source Type field:
6. For the Host, select Segment in path and set the Segment number to 4:
___________________________________________________________________________________
ℹ️ Segment in path
These settings specify that the “host” value assigned to the events will align with whatever comes after
the 4th slash in the directory path. We are monitoring firewall logs from the following directories:
/opt/data/syslog/device1
/opt/data/syslog/device2
/opt/data/syslog/device3
which means that the device# directory is the 4th segment of the file path and will be assigned as the
host depending on where the logs come from, i.e.:
/opt = segment 1
/data = segment 2
/syslog = segment 3
/device* = segment 4
___________________________________________________________________________________
Review and Confirm Your Monitoring Input
10. Validate your data by clicking on Start Searching and entering the following search into the search bar:
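Assuming the input was pointed at the firewall index with the fgt_traffic source type (matching the CLI
option below), the search will resemble:
index=firewall sourcetype=fgt_traffic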
Skip Option 2 if you have already completed Option 1 (GUI).
>> Jump to Lab 2 <<
Option 2: Monitoring Through the CLI
SSH into Your Provided Machine
2. Use your given SSH command to log into your individual machine.
___________________________________________________________________________________
🔑 SSH Details
The specific IP address for your instance will either be provided to you by the workshop host or will be
presented alongside your other workshop instance information within the Splunk Show portal.
If you had to enroll in today’s workshop (via Splunk Show) then please log in to https://show.splunk.com
to locate your unique SSH command.
___________________________________________________________________________________
Note: Replace the IP address with the IP address of your workshop instance.
When prompted, enter the SSH password you were provided (this is different from the Splunk login
password).
3. Upon entering the correct password, you should be logged into the remote machine:
Monitor the Syslog Directory Through inputs.conf
4. First, we will view the current inputs.conf settings. Start by switching to the root user and navigating to
the app directory:
sudo su
cd /opt/splunk/etc/apps/DataOnboarding4Ninjas/
ls
6. Splunk app configuration changes should ONLY be made in the local directory of an app. Since the local
directory does not yet exist, we need to create one:
mkdir local
7. We will now add a monitoring stanza to an inputs.conf file to instruct Splunk to monitor our firewall log
files. To do this, navigate to the newly created local directory and open a new inputs.conf file for editing:
cd local
vi inputs.conf
___________________________________________________________________________________
📝 vi (Visual Editor)
vi is a text editor commonly used on UNIX machines. There are many cheat sheets available online to
help you navigate its commands. There are also other options for editing Splunk .conf files in Linux,
such as nano.
___________________________________________________________________________________
Type “i” to enter insert mode and add the following monitoring stanza to the file:
[monitor:///opt/data/syslog/*/firewall.log]
sourcetype = fgt_traffic
index = firewall
host_segment = 4
___________________________________________________________________________________
ℹ️ This stanza defines the input as the firewall logs within the syslog directory. The data is assigned a
sourcetype of fgt_traffic, sent to the firewall index, and given a host name based on the 4th segment
of the directory path. In this case, the host name will be whatever directory follows /syslog/ in
the stated path.
___________________________________________________________________________________
10. Exit the vi editor by pressing the escape key then typing :wq and hitting enter. This is telling the editor to
save and exit back to the main CLI.
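11. Restart Splunk so the new monitoring input takes effect (this is the restart referenced in the next step;
the same command appears later in this guide):
/opt/splunk/bin/splunk restart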
Reload and confirm the monitoring input
12. We now need to confirm that events are being ingested. To do this, log back in to Splunk (you will have
been logged out due to the restart command we ran above) and navigate to Apps >
DataOnboarding4Ninjas. Run the following search over All Time:
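As in Option 1, a search against the index and source type defined in the stanza above should return the
firewall events:
index=firewall sourcetype=fgt_traffic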
Lab 2 - Data Parsing
Description
In this activity, we ingest badge.log data into Splunk a single time (rather than monitoring continuously). We
will set up a source type to correctly parse the data into individual events and then we will pivot into the
Splunk Add-on Builder app where we import our source type into a new Add-on named ‘badge’.
This lab will be completed in the UI, but several tasks can also be done in the CLI. See the Optional
Activity in the Appendix for how tasks 1-4 can be accomplished through the CLI.
Select the badge.log File as your Source
1. Navigate to Settings > Add Data > Monitor > Files & Directories and click on Browse.
4. Select Index Once and click on Next.
___________________________________________________________________________________
ℹ️ Index Once
Selecting Index Once is equivalent to a ‘oneshot’ command in the CLI. This tells Splunk to read the file
once, rather than establishing a continuous monitor of it. This can be useful when offline or segregated
systems cannot connect to Splunk: you can copy a file off such a system and ingest it once to enrich
data or help an investigation. It is also useful when testing data!
___________________________________________________________________________________
Set the Source Type
5. We now need to verify our data parsing settings. In the data preview pane on the right of the Set Source
Type page we can see that the data is not getting ingested in the correct format; it is coming in as one
big event rather than individual events, so we need to set a new source type to rectify this.
On the left of the page, expand out the Advanced section and adjust the source type settings to match
the following:
Name Value
SHOULD_LINEMERGE false
LINE_BREAKER (##)
TIME_PREFIX \d+\.\d+\.\d+\.\d+\s
MAX_TIMESTAMP_LOOKAHEAD 14
TRUNCATE 1000
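For reference, these settings correspond to a props.conf stanza like the one used in the Appendix’s CLI
walkthrough (Splunk may add further defaults when you save):
[badge]
SHOULD_LINEMERGE = false
LINE_BREAKER = (##)
TIME_PREFIX = \d+\.\d+\.\d+\.\d+\s
MAX_TIMESTAMP_LOOKAHEAD = 14
TRUNCATE = 1000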
6. Click on Apply Settings - the single event will now be broken into individual events.
___________________________________________________________________________________
After saving, there may be additional settings added to the source type. For the purposes of this lab, we
do not need to worry about these.
___________________________________________________________________________________
12. Upon saving, a badge sourcetype will now be visible in the props.conf for the DataOnboarding4Ninjas
app. The props.conf file can be found in the directory
/opt/splunk/etc/apps/DataOnboarding4Ninjas/local.
To view the contents and see the new [badge] stanza that has been added, run the following CLI
commands:
cd /opt/splunk/etc/apps/DataOnboarding4Ninjas/local
cat props.conf
Adjust the Input Settings
13. On the Input Settings screen, set the App Context to DataOnboarding4Ninjas.
Submit and Search Your Data
15. With the input settings adjusted, you can select Review and then Submit on the next page.
16. On the File input has been created successfully page click on Start Searching. This will take you to a
new search - with the search query populated - where you can see your badge.log data now coming in.
Set Up the Add-on Builder
Earlier in this lab we created a source type during the “Add Data” process. Now we are going to import that
source type into the Splunk Add-on Builder, where we can save it for future use as a Technology Add-on, or
“TA”. This way anyone can easily grab that add-on and install it to bring in badge data.
______________________________________________________________________________________
For more information about the Splunk Add-on Builder and Splunk add-ons in general, please see the
following helpful resources:
● https://docs.splunk.com/Documentation/AddonBuilder/latest/UserGuide/Overview
● https://docs.splunk.com/Documentation/AddOns/released/Overview/AboutSplunkadd-ons
______________________________________________________________________________________
17. Navigate to Apps > Splunk Add-on Builder and click on New Add-on.
18. For the Add-on Name enter “badge”. For the author you can enter your own name or leave it blank.
Note the Add-on Folder Name that is generated, as this can change based on who the author is. This is
the name of the app folder where you will find the underlying configuration files through the CLI.
19. Click on Create.
20. You can now see your badge add-on on the home screen of the Splunk Add-on Builder app. Open the
badge add-on by clicking on it.
21. We will now import the source type we created earlier so it is contained within the badge add-on. To do
this, click on Manage Source Types in the menu bar.
22. Click on the Add button and select Import From Splunk from the dropdown menu.
The sourcetype parameters defined earlier (see Set the Source Type) will now be imported into the
add-on builder and will be visible on the left under the Advanced section.
The Splunk Add-on Builder imports source type settings but it may also pull in other default settings. You
can always continue to adjust your settings here as well.
24. Expand out the Advanced section on the left of the page and confirm that the field names and values we
defined earlier are present. Click Save.
25. On the popup Warning message, click on Continue. The Warning is letting you know that the source
type will no longer be associated with the DataOnboarding4Ninjas app but will instead be associated
with the Splunk Add-on Builder app. Since this is the intention we can ignore the warning.
26. Click Save to navigate back to the Manage Source Types home screen.
___________________________________________________________________________________
ℹ️ Add-on Folders
As previously mentioned, when creating a new add-on within the Splunk Add-on Builder (see step 18),
an Add-on Folder Name directory will be created under the Splunk /etc/apps directory. Then, when
importing and saving a sourcetype within the new add-on (step 24) Splunk will save that sourcetype
within a local props.conf of the TA.
You can now find your badge add-on listed as /opt/splunk/etc/apps/<TA FOLDER NAME> and your
updated [badge] sourcetype within props.conf under /opt/splunk/etc/apps/<TA FOLDER
NAME>/local/props.conf.
To see the list of Splunk apps, including where your badge add-on sits, run:
ls /opt/splunk/etc/apps
___________________________________________________________________________________
Lab 3 - Field Extraction and CIM Compliance
Description
In this activity we configure our badge add-on to extract fields from our badge.log data and we map our data
to the Authentication data model available through Splunk’s Common Information Model (CIM). In mapping
our data to the data model, the Authentication Dashboard available in the DataOnboarding4Ninjas app will
start to populate.
Validate the Empty Authentication Dashboard
1. Navigate to Apps > DataOnboarding4Ninjas and click on Authentication Dashboard in the menu bar.
Although we have badge authentication data ingested, the searches written for this dashboard are
looking for information tied to the Authentication data model. This data model is available as part of the
Splunk Common Information Model (CIM). We are going to map our badge TA to the authentication data
model so the searches in the dashboard know to pick up the badge data we have coming in.
___________________________________________________________________________________
ℹ️ Data Models
Further information about data models can be found in the Knowledge Manager Manual:
https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/WhatisSplunkknowledge
To learn about the Splunk Common Information Model (CIM) please see:
https://docs.splunk.com/Documentation/CIM/latest/User/Overview
Further information about the Authentication data model specifically can be found here:
https://docs.splunk.com/Documentation/CIM/latest/User/Authentication
___________________________________________________________________________________
Extract Fields from the Badge Data
2. Navigate to the Splunk Add-on Builder App. On the Add-on List page click on the badge add-on.
4. We will now establish a table format to help parse the data.
To do this, on the Extract Fields page you will see your badge source type listed. Under Actions, click
Assisted Extraction.
5. On the Choose Data Format popup, select the Table data format and click Submit.
6. Results are returned broken down by delimited fields in a table format. Select Comma as the separator
and rename the extracted fields as follows:
field_1 = employee
field_2 = door
field_3 = result
Optional Step - view the underlying .conf files that get built as we adjust the Add-on in the GUI
8. Prior to saving, view your local directory configuration files via the CLI. List the files in the local directory
of your TA by running the following command:
ls /opt/splunk/etc/apps/<TA_FOLDER_NAME>/local
There is an app.conf and a props.conf file listed, but currently no transforms.conf. Once you save the
field names for the extraction, a transforms.conf will be created to store the configurations.
Optional Step - view the underlying .conf files that get built as we adjust the Add-on in the GUI
10. Verify the transforms.conf file by running the following CLI command to list the files in the local directory
of your TA:
ls /opt/splunk/etc/apps/<TA_FOLDER_NAME>/local
11. View the transforms.conf and see the new field names defined:
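The stanza name below is generated by the Add-on Builder, so yours may differ; the contents should look
roughly like this:
sudo cat /opt/splunk/etc/apps/<TA_FOLDER_NAME>/local/transforms.conf
[<generated_extraction_name>]
DELIMS = ","
FIELDS = employee,door,result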
___________________________________________________________________________________
ℹ️ transforms.conf
More information about transforms.conf can be found in the Admin Manual:
https://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf
___________________________________________________________________________________
Map Source Type to the Authentication Data Model
12. Open the Splunk Add-on Builder and click on Map to Data Models in the menu bar.
14. On the Define Event Type page, enter a name and search for the event type, then click on Save. (The
values you enter here produce the [badge_data] stanza verified in the next steps.)
Optional Step - view the underlying .conf files that get built as we adjust the add-on in the GUI
15. Verify the eventtypes.conf file by running the following CLI command to list the files in the local directory
of your TA:
ls /opt/splunk/etc/apps/<TA_FOLDER_NAME>/local
16. View the contents of eventtypes.conf and see the newly defined [badge_data] stanza:
sudo cat /opt/splunk/etc/apps/<TA_FOLDER_NAME>/local/eventtypes.conf
You should see the [badge_data] stanza and the accompanying search you defined:
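The exact contents depend on the values you entered on the Define Event Type page; assuming a search
scoped to the badge index and source type, the stanza will look roughly like:
[badge_data]
search = index=badge sourcetype=badge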
___________________________________________________________________________________
ℹ️ eventtypes.conf
More information about eventtypes.conf can be found in the Admin Manual:
https://docs.splunk.com/Documentation/Splunk/latest/Admin/eventtypesconf
___________________________________________________________________________________
Back in the Splunk GUI, you should be on the Data Model Mapping Details page, where you can add
knowledge objects to enhance the badge data. On the right side of the page click on Select Data
Model(s)...
18. Under the Data Models panel, expand out the Splunk_SA_CIM list of data models. Select the
Authentication data model and the corresponding Data Model Fields will appear on the right side of the
page.
Click on Select.
19. On the Data Model Mapping Details page, add the field alias knowledge object by clicking on New
Knowledge Object > FIELDALIAS.
___________________________________________________________________________________
ℹ️ Field Aliases
A field alias is an alternate name that you assign to a field, allowing you to search for events
using either the original field name or the alias.
More information about field alias can be found in the Knowledge Manager Manual:
https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Addaliasestofields
___________________________________________________________________________________
20. On the next page in the Data Model Mapping List panel, select the badge source type.
21. In the Event Type Fields section on the left, expand out the badge_data section (if not already
expanded) and click on “door” - this will populate the Event Type Field or Expression column with this
field name.
22. In the Data Model Fields panel on the right, click on “dest” - this will populate the Data Model Field
name.
Note: If you do not see a list of fields in the Data Model Fields panel, you may need to expand the
Authentication(23) section to view them.
23. Your mapping will now look like this:
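Behind the scenes, the Add-on Builder writes this alias into the TA’s local props.conf; the setting takes
roughly this form (the exact setting name is generated for you):
FIELDALIAS-dest = door AS dest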
24. We will now add a new knowledge object to populate the action field depending on whether a badge
scan was successful or unsuccessful. To do this we will use an EVAL knowledge object (this works the
same way as using eval in a search query).
Add the new knowledge object by clicking on New Knowledge Object > EVAL.
25. In the Data Model Mapping List panel, select the badge source type.
26. For the Event Type Field or Expression, enter the following eval expression:
if(result=="badge accepted","success","failure")
27. For the Data Model Field, add the action field by clicking on it under the Data Model Fields panel on
the right side of the page.
28. When your complete mapping looks like this, click on Done.
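As with the field alias, this mapping is saved to the TA’s local props.conf as a calculated field, roughly:
EVAL-action = if(result=="badge accepted","success","failure")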
Re-accelerate Your Data Model
29. Now that we have updated our data model mappings, we need to re-accelerate the data model so
Splunk regenerates its summaries. This will ensure that our search results are both fast and accurate.
To do this, navigate to Settings > Data Models and click on Edit next to the Authentication data model.
On the dropdown, select Edit Acceleration.
30. On the Edit Acceleration popup uncheck the Accelerate box and click on Save.
31. Now go back into Edit Acceleration again and re-accelerate the data model by re-checking the
acceleration box and clicking Save.
___________________________________________________________________________________
When fields used in a data model mapping are updated, added, or removed, the data model only needs
to be re-accelerated if you want the old data to be populated with the new or updated extractions. If you
are happy for the new mappings to apply only to newly indexed data, no re-acceleration is needed.
To learn more about periodic updating for accelerated data models, refer to the following documentation:
https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Acceleratedatamodels#After_you_enable_acceleration_for_a_data_model
___________________________________________________________________________________
Verify the Authentication Dashboard
32. Navigate to Apps > DataOnboarding4Ninjas and click on Authentication Dashboard in the menu bar.
The dashboard should now be populated. This is because the underlying dashboard searches are
looking for events that map back to the Authentication data model - events that have fields like action
and dest. While our original events may not have contained action and dest, we have now mapped our
events to those data model field names and our data is now being returned in the results.
Appendix
Troubleshooting for Lab 1, Option 2: Monitoring Through the CLI
If an error was made in the monitoring stanza - for example, a typo, or host_segment was set incorrectly -
there are a couple of steps to fix the monitoring input.
1. Adjust the monitoring stanza in inputs.conf. To do this, edit the inputs.conf file you created in the app’s
local directory:
vi /opt/splunk/etc/apps/DataOnboarding4Ninjas/local/inputs.conf
2. Type “i” to insert text into the file and adjust any errors in the monitoring stanza.
3. Exit the vi editor by pressing the escape key then typing :wq and hitting enter. This is telling the editor to
save and exit back to the main CLI.
4. Clean up the incorrectly parsed event data in Splunk using the delete command. This will prevent the
incorrect event data from showing up in search results.
To do this, log in to Splunk and go to Settings > Users. Click Edit next to the admin user and select Edit
from the dropdown menu.
5. Under Available item(s) click on the can_delete role - this will add it to the Selected item(s) list on the
right side of the screen.
6. Check the “I acknowledge…” box and click on Save.
___________________________________________________________________________________
⚠️ can_delete
The can_delete role is not assigned to admin users by default, in order to prevent accidental deletion of
data. It is recommended to remove this role once you have deleted the required data.
___________________________________________________________________________________
7. Remove the incorrectly parsed event data by running the following search over All time. This will remove
all of the previously ingested data in our firewall index:
index=firewall | delete
8. Before we can re-ingest the firewall data (with the correct settings from inputs.conf) we need to clean
the fishbucket - this tells Splunk to clear out the cached markers for where it was monitoring our inputs
and ensures that Splunk will start collecting data from the very beginning of the log files.
To do this, open your terminal app and run the following commands:
sudo su
9. Stop Splunk:
/opt/splunk/bin/splunk stop
10. Reset the fishbucket cache for each of the paths it was monitoring:
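The exact command depends on your environment; one common approach is to reset the fishbucket entry
for each monitored file with btprobe (run while Splunk is stopped), using the paths from the Lab 1 stanza:
/opt/splunk/bin/splunk cmd btprobe -d /opt/splunk/var/lib/splunk/fishbucket/splunk_private_db --file /opt/data/syslog/device1/firewall.log --reset
/opt/splunk/bin/splunk cmd btprobe -d /opt/splunk/var/lib/splunk/fishbucket/splunk_private_db --file /opt/data/syslog/device2/firewall.log --reset
/opt/splunk/bin/splunk cmd btprobe -d /opt/splunk/var/lib/splunk/fishbucket/splunk_private_db --file /opt/data/syslog/device3/firewall.log --reset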
11. Start Splunk:
/opt/splunk/bin/splunk start
12. Validate your data by logging in to Splunk and running the following search over All time:
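Assuming the corrected stanza kept the firewall index and fgt_traffic source type, the validation search is:
index=firewall sourcetype=fgt_traffic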
Optional Activity: Lab 2, Tasks 1-4 Accomplished Through the CLI
Description:
These two tasks are optional replacements for tasks 1-4 in Lab 2. You will create a sourcetype called badge
that correctly parses our badge.log data. Then you will bring in the badge.log data via a CLI command called
oneshot, assigning it to the sourcetype named badge and the index named badge. Finally, you will verify
that parsed events are coming into the UI.
Establish a Custom “badge” Source Type
sudo su
cd /opt/splunk/etc/apps/DataOnboarding4Ninjas/local/
vi props.conf
[badge]
SHOULD_LINEMERGE=false
LINE_BREAKER=(##)
TIME_PREFIX=\d+\.\d+\.\d+\.\d+\s
TIME_FORMAT=%m%d%y %H:%M
MAX_TIMESTAMP_LOOKAHEAD=14
TRUNCATE=1000
7. Exit the vi editor by pressing the escape key then typing :wq and hitting enter. This is telling the editor to
save and exit back to the main CLI.
8. Restart Splunk:
/opt/splunk/bin/splunk restart
Use Oneshot to Ingest and View badge.log Data
9. Ingest data using oneshot by running the following command in the CLI. This is the same as “Index Once”
in the Splunk GUI.
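The exact path to badge.log comes from your workshop environment; assuming a path of
/opt/data/badge.log (hypothetical), the command takes this form:
/opt/splunk/bin/splunk add oneshot /opt/data/badge.log -sourcetype badge -index badge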
10. Verify the badge data in Splunk by logging in to Splunk and navigating to the DataOnboarding4Ninjas
app.
11. Click Search in the menu bar and run the following search over All time.
index=badge sourcetype=badge
You should now see your badge data in the search results: