EnvolveLabs Training Guide (Intermediate)
Welcome to Envolve Labs Corporation! Today is your first day as a Junior Security
Operations Center (SOC) Analyst with our company. Your primary job responsibility is to
defend Envolve Labs and its employees from malicious cyber actors.
Envolve Labs is a med-tech startup based in the United States that was founded in 2012.
Our mission is to develop a new type of flexible vaccine technology that covers many
different viral strains and offers long-lasting immunity (which means no more boosters!).
Our initial research has proven this technology is highly effective – we’re planning to start
production in Q1 2023. Our investors are hopeful at the prospect of future earnings from
patents and licensing of this technology.
Until now, we’ve been laser-focused on medical research and meeting production goals.
But as our work becomes more important and successful, we’ve realized the need to invest
more in cybersecurity efforts. That’s why we’ve hired you!
Like all good companies, Envolve Labs collects log data about the activity its employees
perform on the corporate network. These security audit logs are stored in Azure Data
Explorer (ADX), a data storage service in Azure (Microsoft’s cloud). You will use the Kusto
Query Language (KQL) to parse through various types of security logs. By analyzing these
logs, you can help us determine whether we’re being targeted by malicious actors.
✓ Use Kusto Query Language (KQL) to manipulate data in Azure Data Explorer (ADX)
✓ Pivot across multiple data sets to answer targeted questions
✓ Identify malicious cyber activity in audit logs including: email, authentication, web
traffic, and endpoint logs
✓ Use multiple “pivoting” techniques to track the activity of one or more Advanced
Persistent Threat (APT) actors
✓ Leverage third-party data sets such as PassiveDNS to discover unknown actor
infrastructure based on known actor indicators (e.g. domains and IPs)
✓ Analyze third-party reporting on APT actors and their infrastructure and capabilities
✓ Validate a threat actor’s Tactics, Techniques, and Procedures (TTPs)
✓ Cluster threat activity using the Diamond Model
The attackers have gotten a head start, so let’s not waste any more time. Let’s get to
work!
2. Add a new cluster using the cluster URI provided by your instructor
• Click add cluster
• Enter Connection URI: kc7round2.eastus
The big blank space to the right of your cluster list is the query workspace. That’s where
you’ll actually write the queries used to interact with our log data.
We currently have eight types of log data. As you’ll see in ADX, each log type corresponds to
a table that exists in the SecurityLogs database:
Key Point – Over the Horizon (OTH) data: One of the tables listed above is not like the
others – PassiveDns. Rather than being an internal security log, PassiveDns is a data source
that we’ve purchased from a third-party vendor. Not all malicious cyber activity happens
within our company network, so sometimes we depend on data from other sources to
complete our investigations.
You’ll learn more about how to use each of these datasets in just a minute. First, let’s just
run some queries so you can practice using KQL and ADX.
Employees
| take 10
The take operator is a powerful tool you can use to explore rows in a table, and therefore
better understand what kinds of data are stored there.
Key Point – What to do when you don’t know what to do: Whenever you are faced with
an unfamiliar database table, the first thing you should do is sample its rows using the take
operator. That way, you know what fields are available for you to query and you can guess
what type of information you might extract from the data source.
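As a complement to take, KQL's getschema operator lists a table's column names and data types without returning any rows. A minimal sketch, using the Employees table from above:

```kusto
// List the column names and data types of the Employees table
Employees
| getschema
```

Sampling rows with take shows what the values look like, while getschema shows how each column is typed. Used together, they give a quick picture of an unfamiliar table.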
The Employees table contains information about all the employees in our organization. In
this case, we can see that the organization is named “Envolve Labs” and the domain is
“envolvelabs.com”.
1. Try it for yourself! Do a take 10 on all the other tables to see what kind of data
they contain.
We can use count to see how many rows are in a table. This tells us how much data is
stored there.
Employees
| count
We can use the where operator in KQL to apply filters to a particular field. For example, we
can find all the employees with the name “Linda” by filtering on the name column in the
Employees table.
where statements are written using a particular structure. Use this helpful chart below to
understand how to structure a where statement.
Employees
| where name has "Linda"
Employees
| where name == "Linda Taylor"
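Both queries above follow the same general shape: a table name, then | where, followed by a field, an operator, and a value. As a rough sketch of how a few common string operators differ (the search values are illustrative only):

```kusto
// has: matches a whole word (term) inside the field, case-insensitive
Employees
| where name has "Linda"

// contains: matches any substring of the field, case-insensitive
Employees
| where name contains "Lin"

// ==: matches the entire field value exactly, case-sensitive
Employees
| where name == "Linda Taylor"
```

Because == compares the whole value, stray spaces or different capitalization inside the quotes will cause the query to return no rows.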
3. Each employee at Envolve Labs is assigned an IP address. Which employee has the
IP address “192.168.2.13”?
While performing their day-to-day tasks, Envolve Labs employees send and receive emails. A
record of each of these emails is stored in the Email table.
Key Point – User Privacy and Metadata: As you can imagine, some emails are highly
sensitive. Instead of storing the entire contents of every email sent and received within the
company in a database that can be easily accessed by security analysts, we only capture
email metadata. Email metadata includes information like: the time the email was sent, the
sender, the recipient, the subject line, and any links the email may contain. Storing only
email metadata, rather than entire contents, helps protect the privacy of our employees,
while also ensuring that our security analysts can keep us safe. Sometimes even metadata
can reveal sensitive information, so it’s important that you don’t talk about log data with
other employees outside the SOC.
We can find information about the emails sent or received by a user by looking for their
email address in the sender and recipient fields of the Email table. For example, we can use
the following query to see all the emails sent by “Edward Ives”:
Email
| where sender == "[email protected]"
We can use the distinct operator to find unique values in a particular column. We can
use the following query to determine how many of the organization’s users sent emails.
Email
| where sender has "envolvelabs"
| distinct sender
| count
Note that the distinct operator returns all of the unique senders with the term
“envolvelabs” in their domain. We then use the count operator from above
to figure out exactly how many of those senders there are.
5. How many users received emails with the term “policymakers” in the subject?
If we want to figure out what websites Anita Brown visited, we can find her IP address from
the Employees table.
Employees
| where name == "Anita Brown"
The query above tells us her IP address is “192.168.0.145”. We can take her IP address and
look in the OutboundBrowsing table to determine what websites she visited.
OutboundBrowsing
| where src_ip == "192.168.0.145"
Although domain names like “google.com” are easy for humans to remember, computers
don’t know how to handle them. So, they convert them to machine readable IP addresses.
Just like your home address tells your friends how to find your house or apartment, an IP
address tells your computer where to find a page or service hosted on the internet.
Key Point – Practice Good OPSEC: If we want to find out which IP address a particular
domain resolves to, we could just browse to it. But, if the domain is a malicious one, you
could download malicious files to your corporate analysis system or tip off the attackers
that you know about their infrastructure. As cybersecurity analysts, we must follow
procedures and safeguards that protect our ability to track threats. These practices are
generally called operational security, or OPSEC.
To eliminate the need to actively resolve (that is, directly browse to or interact with a
domain to find its related IP address) every domain we’re interested in, we can rely on
passive DNS data. Passive DNS data allows us to safely explore domain-to-IP relationships,
so we can answer questions like:
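For instance, a question such as "which IP addresses has this domain resolved to?" might be answered with a query along these lines. This is only a sketch: the column name domain is an assumption about the PassiveDns table (run take 10 on it to confirm), and example.com is a placeholder.

```kusto
// Assumption: PassiveDns has a column named "domain"; confirm with "PassiveDns | take 10"
PassiveDns
| where domain == "example.com"   // placeholder domain
```

Since the query only reads stored records, nothing is sent to the domain itself, which keeps the lookup consistent with good OPSEC.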
Sometimes we need to use the output of one query as the input for a second query. The
first way we can do this is by manually typing the results into the next query.
For example, what if we want to look at all the web browsing activity from employees
named “Linda”?
First, you would need to go into the Employees table and find the IP addresses used by
these employees.
Then, you could manually copy and paste these IPs into a query against the
OutboundBrowsing table. Note that we can use the in operator to choose all rows that have
a value matching any value from a list of possible values. In other words, the ==
(comparison) operator looks for an exact match, while the in operator checks for any values
from the list.
Although this is a valid way to get the information you need, it would not be as elegant (or
timely) if you had 100 or even 1,000 employees named “Linda.”
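The two manual steps described above might look like the following sketch. The ip_addr column name is an assumption about the Employees table, and the IP addresses in the second query are placeholders standing in for whatever the first query returns.

```kusto
// Step 1: find the IP addresses of employees named "Linda"
// (assumption: the IP column in Employees is called ip_addr)
Employees
| where name has "Linda"
| distinct ip_addr

// Step 2: paste those IPs into a query against OutboundBrowsing
// (placeholder values shown; substitute the results from step 1)
OutboundBrowsing
| where src_ip in ("192.168.0.10", "192.168.0.11")
```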
On the left of the let statement is the variable name (“linda_ips” in this case). The variable
name can be whatever we want, but it is helpful to make it something meaningful that can
help us remember what values it is storing.
On the right side of the let statement is the expression you are storing. In this case, we use
the distinct operator to select values from only one column, so they are stored in an
array, or list, of values.
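Putting the two sides together, a let statement for this example could be sketched as follows. The variable name linda_ips comes from the text; the ip_addr column name is an assumption about the Employees table.

```kusto
// Store the IPs of all employees named "Linda" in a variable
let linda_ips =
    Employees
    | where name has "Linda"
    | distinct ip_addr;   // assumption: the IP column is called ip_addr
// Use the variable in place of a hand-typed list of IPs
OutboundBrowsing
| where src_ip in (linda_ips)
```

Because the list is built by the query itself, the same two statements work whether one employee or a thousand match the name filter.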
Key Point – Pivoting: Part of being a great cyber analyst is learning how to use multiple
data sources to tell a more complete story of what an attacker has done. We call this
“pivoting.” We pivot by taking one known piece of data in one dataset and looking in a
different dataset to learn something we didn’t already know. You practiced this here when
we started in one dataset – the Employees table – and used knowledge from there to find
related data in another source – OutboundBrowsing.
A security researcher tweeted that the domain “illness.med” was being used by hackers.
Apparently the hackers are sending this domain inside credential phishing emails.
Key Point – Open Source Intelligence (OSINT): Security researchers and analysts often
use free, publicly available data, like Twitter! We call this public data OSINT, and it can be a
great way to get investigative leads. Like all public data sources on the internet, you should
follow up any OSINT tip with rigorous analysis, rather than blindly trusting the source.
1. Which users in our organization were sent emails containing the domain illness.med?
2. Did we block any of the emails containing that domain? Who actually received one
of these emails? (hint: the “accepted” field in the Email table tells you whether or
not the email was blocked. Blocked emails will show as false).
3. What other domains shared the same IPs as illness.med? Can you find the full list of
domains associated with this actor based on PassiveDNS data? (hint: you can use the
in operator to check for multiple values in a field, e.g. where field in ("x", "y", "z"))
4. What email addresses did the hackers use to send these domains?
6. Did any user have their credentials stolen? How do you know?
Hint: In order to have their credentials stolen, a user would need to browse to the
credential harvesting site and enter their username and password. After this, the actor
might try to login to the user’s account using the stolen credentials. You can find details
about login activity in the AuthenticationEvents table.
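As a starting point for the questions above, the sketch below shows two possible pivots on the reported domain. The recipient field is named earlier in this guide; link (in Email) and domain and ip (in PassiveDns) are assumed column names, so confirm them with take 10 before relying on the results.

```kusto
// Pivot 1: who was sent an email containing the reported domain?
// (assumption: the Email column holding links is called "link")
Email
| where link has "illness.med"
| distinct recipient

// Pivot 2: which other domains share an IP with the reported domain?
// (assumption: PassiveDns has "domain" and "ip" columns)
let actor_ips =
    PassiveDns
    | where domain == "illness.med"
    | distinct ip;
PassiveDns
| where ip in (actor_ips)
| distinct domain
```

Any new domains returned by the second pivot can be fed back into the first one to look for further emails.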
After digging for a bit on the phishing activity, you come across another tweet from a threat
intelligence vendor SolitaryStrike:
Hint: Try looking in PassiveDns to see if you can find infrastructure relationships
between this domain and the activity you identified in questions 1-6.
8. How many users at Envolve Labs downloaded the file mentioned in the tweet
(Patient_Information.rar)?
Hint: Files that are created on employees’ devices are captured in the
FileCreationEvents log. Try looking there to see which employees downloaded this
file.
Hint: Try narrowing down on one particular device that downloaded the
Patient_Information.pptx file. Then, look in both the FileCreationEvents and
ProcessEvents logs to find new files and processes created around the time when
the file was downloaded.
10. Is the actor using other domains or file names to deliver this malware?
Hint: What unique characteristics of this malware have you identified? Try
looking for file names, process command lines, and infrastructure that appear
unique. How can you pivot on these characteristics to find other related indicators?
Hint: Actors establish persistence so they can come back later and conduct
manual tasks (called hands-on-keyboard activity) within your company’s network.
Try looking for systems creating connections to external domains and IPs, or unusual
behaviors like creation of scheduled tasks.
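The time-window pivot described in the hints above could be sketched roughly as follows. The column names filename, hostname, and timestamp are assumptions about the FileCreationEvents and ProcessEvents tables, and the hostname and times in the second query are placeholders to be replaced with values from the first.

```kusto
// Step 1: find which devices created the file named in the tweet
// (assumption: FileCreationEvents has a "filename" column)
FileCreationEvents
| where filename has "Patient_Information"

// Step 2: review process activity on one of those devices around that time
// (assumptions: "hostname" and "timestamp" columns; placeholder values below)
ProcessEvents
| where hostname == "EXAMPLE-HOST"
| where timestamp between (datetime(2023-01-01 09:00:00) .. datetime(2023-01-01 10:00:00))
```

Running the same time-window filter against FileCreationEvents for that device shows any new files written alongside those processes.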
---
Eggs-N-Ham Malware
Eggs-N-Ham implants are dropped by malicious files named after villages from the
Japanese anime called Naruto. These malicious files are hosted on actor domains and
are delivered to targets via email links. WaterNose observed the following Eggs-N-
Ham dropper file names:
Ceramic_Village.xls
Hidden_Sand.docx
Hidden_leaf.docx
Curtain_Village.zip
ba8a996a117702b946e07dd12d030956efddc159a5e775c18b1a7fb10df13902
cd6355ba77bf37be2027c2016cd37f9e08f7025e067903a45b3d37b7c11afdbf
6c35723e76ecc4fe8e5d1f6ef8bb96c8f163e020fd367c2a260295432ad11ed6
261e6dc6c25734ddaba007bedb8b474d7be4803d8e724d42637775bd7cc397aa
ham.exe
egg.exe
sam_i_am.dll
ping 8.8.8.8
whoami
plink <C2 IP>
LongNeedle Malware
Study_Results.zip
Research_Opportunity.exe
Study_Results.xls
Research_Opportunity.docx
1851a1c4e18f174c4af7f9521988b02815b647995f6d6776d727a54f4dff4cd6
94ff506cb4ff849279e84308de2b0c815fefe67d943972a12097ccc465448a7d
4930957431e73049e7564a35bd29a98bac78d78856eb614f9acca554c33fd42
6aa30523aad9901350a8e1dace08f70716259609d5b94ae68cb9a5ee36da8cc8
svhost.exe
infector.exe
recordsvr.exe
Recon:
C2:
Seussium IOCs
55.25.245.102
220.63.252.154
189.179.77.49
99.185.64.18
152.23.46.90
immune-evolve.net
vaccine.org
clan-curse.com
1. The report claims that Eggs-N-Ham implants are always dropped by files that
have Naruto-themed names. Do you agree that Eggs-N-Ham implants are
dropped exclusively by files with Naruto-themed names?
2. Look in our email logs for delivery of the Eggs-N-Ham dropper files. Then,
take the domains hosting the droppers and pivot in PassiveDNS. Identify
trends, patterns, or clusters in that infrastructure. Is it all related? Or are there
distinct patterns?
3. The report asserts that the two malware families Eggs-N-Ham and
LongNeedle are exclusive to one actor. Do you agree with this assessment?
Hint: Look for activity on Envolve Labs’ network related to the indicators
provided in the report for both malware families. Is the activity you observe
consistent with the actor described in the report?
4. Are there multiple actors targeting Envolve Labs? If so, can you describe the
Tactics, Techniques, and Procedures (TTPs) of each of them? How are they
similar? How are they different?
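One possible way to approach the pivot in question 2 is sketched below. It takes the dropper file names listed in the report, finds the hosting domains in the email logs, and groups the PassiveDns results by IP address. The column names link (in Email) and domain and ip (in PassiveDns) are assumptions, and the sketch assumes links are stored as full URLs including the scheme, since parse_url needs that to extract the host.

```kusto
// Dropper file names taken from the report
let dropper_names = dynamic(["Ceramic_Village.xls", "Hidden_Sand.docx",
                             "Hidden_leaf.docx", "Curtain_Village.zip"]);
// Domains that hosted those files, according to the email logs
// (assumption: the Email column holding links is called "link")
let dropper_domains =
    Email
    | where link has_any (dropper_names)
    | extend host = tostring(parse_url(link).Host)
    | distinct host;
// Group the hosting domains by the IPs they resolved to
// (assumption: PassiveDns has "domain" and "ip" columns)
PassiveDns
| where domain in (dropper_domains)
| summarize domains = make_set(domain) by ip
```

IPs that host several of the dropper domains suggest shared infrastructure, while domains that never overlap may point to a separate cluster.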