Advanced searching strategies in QRadar

Mar 2022
The agenda

• Ariel search and AQL


• Best practices and search techniques
• Performance investigation scenarios

2 IBM Security
The basics

Data storage technology – Where is my data?

Ariel – A proprietary flat-file, write-once-read-many database and search engine that
contains most of the security data processed by QRadar. Data is organized in
multiple DBs with the following on-disk structure:

/store/ariel/[db]/[db_specific_structure1]/…/[year]/[month]/[day]/[hour]/
file1
file2

fileN

PostgreSQL – Contains configuration data, meta data and some security artifacts
such as Offenses, Assets, Vulnerabilities, Reference Data etc.
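
To make the Ariel directory layout above concrete, here is a minimal Python sketch that builds the hourly directory for a timestamp. The function name, the example segment name, and the unpadded month/day/hour values are assumptions for illustration; the database-specific path segments are elided on the slide, so the caller has to supply them.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def ariel_hour_dir(db: str, db_specific: list[str], ts: datetime) -> PurePosixPath:
    """Build the hourly directory path for one Ariel database.

    db_specific holds the database-specific path segments, which the slide
    elides. Unpadded date parts are an assumption, not a documented format.
    """
    return PurePosixPath("/store/ariel", db, *db_specific,
                         str(ts.year), str(ts.month), str(ts.day), str(ts.hour))


# Hypothetical example: an "events" database with a made-up "segment" directory.
example = ariel_hour_dir("events", ["segment"],
                         datetime(2022, 3, 15, 9, tzinfo=timezone.utc))
print(example)
```

Because the time is encoded in the path, a search over a narrow time window only needs to touch the matching hour directories.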

The basics - Core QRadar architecture at 10K feet

How Ariel works

Ariel limits:

▪ 100 concurrent clients ~= 100 concurrent queries (the real execution concurrency depends on hardware and configuration)
▪ 2.14 billion results maximum for a single query
▪ 9.22 quintillion (10^18) results processed in a single query (3.64 million years @80K EPS)
▪ 32K maximum raw message (payload) size
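
The two result limits line up with the maximum values of signed 32-bit and 64-bit integers. That mapping is an assumption, since the slide does not state where the numbers come from. A short Python check of the arithmetic, using the slide's 80K EPS figure:

```python
INT32_MAX = 2**31 - 1   # 2,147,483,647, about 2.14 billion
INT64_MAX = 2**63 - 1   # about 9.22 * 10^18

EPS = 80_000                            # events per second, from the slide
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # average year length

# Time needed to ingest INT64_MAX events at a constant 80K EPS.
years_to_reach_limit = INT64_MAX / EPS / SECONDS_PER_YEAR

print(f"{INT32_MAX:,} results maximum")
print(f"{INT64_MAX:.3e} results processed")
print(f"{years_to_reach_limit / 1e6:.2f} million years at {EPS:,} EPS")
```

The result is a few million years, the same order of magnitude as the slide's rounded figure.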

Search basics

Ways to search:

▪ Classic UI
▪ Basic search – visual query building
▪ Advanced Search – using Ariel Query Language (AQL)
▪ Quick Filter
▪ REST API (AQL)
▪ New search experience – Analyst Workflow application (Visual + AQL builder)
▪ /opt/qradar/bin/ariel_query (REST API/AQL based CLI)
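
As an illustration of the REST API route, the Python sketch below prepares a request that would create an Ariel search. It does not send anything. It assumes the `/api/ariel/searches` endpoint with a `query_expression` parameter and an authorized-service token in the `SEC` header; the host name, token, and helper name are hypothetical, so check the API documentation for your QRadar version before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(console: str, token: str, aql: str) -> Request:
    """Prepare (but do not send) a POST that creates an Ariel search."""
    query = urlencode({"query_expression": aql})
    url = f"https://{console}/api/ariel/searches?{query}"
    return Request(url, method="POST",
                   headers={"SEC": token, "Accept": "application/json"})


# Hypothetical console host and token; nothing goes over the network here.
req = build_search_request("qradar.example.com", "my-token",
                           "SELECT username, sourceip FROM events LAST 5 MINUTES")
print(req.get_method(), req.full_url)
```

Passing an explicit timeframe in the AQL string avoids relying on the default search window.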

AQL

Most ingested data is held within two Ariel tables: events and flows.

AQL allows users to structure queries to pull data from a database table, then
manipulate the data as required to customize to the desired format.

Basic AQL structure


SELECT column_name FROM table_name
SELECT * FROM flows
SELECT *, my_favorite_property FROM events
SELECT username, sourceip FROM events

AQL
[SELECT *, column_name1, column_name2, … , column_nameN]
[FROM table_name]
[WHERE condition clauses]
[GROUP BY column_reference*]
[HAVING condition clause]
[ORDER BY column_reference*]
[LIMIT numeric_value]
[PARAMETERS list]
[TIMEFRAME]

NOTES:
- Mandatory clauses (SELECT and FROM) are shown in red on the original slide; everything else is optional.
- By default, an advanced search without a timeframe queries the last 5 minutes of Ariel data
when run from the UI, and the last 1 minute when run from the API.

*column_reference: When you use a GROUP BY or ORDER BY clause to sort
information, a meaningful query must reference column names
from your existing SELECT statement.
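
The clause order can be sketched as a small Python helper that assembles a query string. The helper and its parameter names are hypothetical, PARAMETERS is left out for brevity, and the timeframe is appended last as in the deck's later examples.

```python
def build_aql(select, table, where=None, group_by=None, having=None,
              order_by=None, limit=None, timeframe=None):
    """Assemble an AQL string; only the SELECT list and FROM table are required."""
    parts = [f"SELECT {', '.join(select)}", f"FROM {table}"]
    optional = [("WHERE", where), ("GROUP BY", group_by), ("HAVING", having),
                ("ORDER BY", order_by), ("LIMIT", limit)]
    parts += [f"{keyword} {value}" for keyword, value in optional
              if value is not None]
    if timeframe is not None:
        parts.append(timeframe)  # e.g. "LAST 2 HOURS"
    return " ".join(parts)


query = build_aql(["LOGSOURCETYPENAME(devicetype) lstn", "COUNT() c"], "events",
                  group_by="devicetype", having="c > 10000",
                  order_by="c DESC", timeframe="LAST 2 HOURS")
print(query)
```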

AQL
What fields are available in AQL?

* Most default normalized properties


* All Custom, Calculated and AQL Properties

Documentation - https://www.ibm.com/docs/en/qsip/7.4?topic=language-event-flow-simarc-fields-aql-queries
API - GET - /ariel/databases/{database_name}
UI – Ctrl + spacebar

Simple examples

• What are the top 5 log source types sending events?

SELECT LOGSOURCETYPENAME(devicetype) lstn, COUNT() c
FROM events
GROUP BY devicetype ORDER BY c DESC LIMIT 5

Simple examples

• What are the top log source types sending over 10000 events?

SELECT LOGSOURCETYPENAME(devicetype) lstn, COUNT() c
FROM events
GROUP BY devicetype HAVING c>10000 ORDER BY c DESC

AQL – Quotation mark usage
Quotation mark usage is a common question, and a common source of errors, for new users
as they develop their own queries in QRadar. Here is what you need to know.

Single-quotes
Use single quotation marks to specify literal values or variable characters.
This includes:

• username LIKE '%Jason%'
• sourceCIDR = '10.10.10.10'
• TEXT SEARCH = 'VPN Authenticated user'
• Column name alias definitions that use spaces. Example: QIDNAME(qid) as 'Event Name'

Double-quotes
Use double quotation marks around column names that contain spaces or non-ASCII
characters. For AQL, this includes:

• Custom property names with spaces, such as "Account Security ID"
• Values with non-ASCII characters, such as "Beyoncé" or "jón.hallssonar"
• References to column name aliases that use spaces or special characters
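
When queries are generated from a script, the two quoting rules can be captured in small helpers. The Python sketch below is one possible approach. The function names and the sample value `S-1-5-18` are hypothetical, and embedded quotation marks are rejected rather than escaped because the slide does not cover escaping.

```python
def aql_literal(value: str) -> str:
    """Wrap a literal value in single quotation marks."""
    if "'" in value:
        raise ValueError("embedded single quotes are not handled in this sketch")
    return f"'{value}'"


def aql_column(name: str) -> str:
    """Double-quote a column name only if it has spaces or non-ASCII characters."""
    if '"' in name:
        raise ValueError("embedded double quotes are not handled in this sketch")
    needs_quotes = " " in name or not name.isascii()
    return f'"{name}"' if needs_quotes else name


print(aql_column("username") + " LIKE " + aql_literal("%Jason%"))
print(aql_column("Account Security ID") + " = " + aql_literal("S-1-5-18"))
```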

Finding what you are looking for

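• What’s wrong with my queries?

SELECT "AQL Statement", COUNT() 'my count'
FROM events
WHERE qid=28250254 and "AQL Statement"<>NULL
GROUP BY 'AQL Statement'

SELECT 'AQL Statement', COUNT() 'my count'
FROM events
WHERE qid=28250254 and "AQL Statement"<>NULL
GROUP BY "AQL Statement"

In the first query, GROUP BY 'AQL Statement' uses single quotes, so it is treated as a literal
value rather than the custom property. In the second query, SELECT 'AQL Statement' has the
same problem. Per the quoting rules, the property name needs double quotes in both places.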

Simple analytics

• Given logs with a numeric value representing something, calculate what
percentage of those logs were over/under some desired value in a single
query. Here we calculate the number and percentage of slow queries processed
by QRadar in a single pass over the data.

SELECT SUM(IF searchtime > 100 THEN 1 ELSE 0) slow_search_count,
COUNT() total_search_count,
(slow_search_count / total_search_count) * 100 slow_search_count_pct
FROM events WHERE qid=28250295
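
The conditional-sum technique is easier to see outside AQL. This Python sketch does the same single-pass count over a list of durations. The sample values are made up, and the threshold of 100 mirrors the query without assuming a unit.

```python
def slow_search_stats(search_times, threshold=100):
    """Return (slow_count, total_count, slow_pct) in one pass over the data."""
    slow = total = 0
    for t in search_times:
        slow += 1 if t > threshold else 0   # mirrors SUM(IF ... THEN 1 ELSE 0)
        total += 1                           # mirrors COUNT()
    pct = slow * 100 / total if total else 0.0
    return slow, total, pct


# Hypothetical search durations.
stats = slow_search_stats([12, 250, 99, 101, 40])
print(stats)
```

A single pass matters because the alternative, two separate searches for the slow count and the total, would read the same data twice.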

Advanced analytics

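• Find periods of continuous high CPU usage of a QRadar host:
“Get all time intervals where any host other than EPID104 in the deployment
went over 80% CPU usage and returned back to 40% CPU usage.”
The query filters on the IdleCpu metric, so usage above 80% corresponds to
Value<20 and a return to 40% usage corresponds to Value>60.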
SELECT
sessionId,
Hostname as host,
DATEFORMAT(MIN(starttime),'YYYY-MM-dd HH:mm:ss') AS high_CPU_start_date,
LONG((MAX(starttime)-MIN(starttime))/1000) AS session_duration_sec,
LONG(COUNT()) AS events_in_session
FROM events
WHERE devicetype=368 AND "Metric ID"='IdleCpu'
SESSION BY starttime Hostname
EXPLICIT BEGIN Value<20
EXPLICIT END Value>60
GROUP BY sessionId
ORDER BY session_duration_sec DESC
LAST 120 MINUTES
PARAMETERS EXCLUDESERVERS=ARIELSERVERS4EPNAME('eventprocessor104'), PRIORITY='lowminus',
RETENTIONTIME=60000
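
The session logic (open a session when idle CPU drops below 20, close it when idle CPU rises above 60) can be sketched in Python as follows. This illustrates the idea and is not a description of how Ariel implements SESSION BY. Counting the closing sample as part of the session is an assumption, and the sample data is made up.

```python
def high_cpu_sessions(samples, begin_below=20, end_above=60):
    """samples: iterable of (time_sec, host, idle_cpu_pct), sorted by time.

    A session for a host opens when idle CPU drops below begin_below and
    closes when it rises above end_above.
    Returns (host, start, duration_sec, events_in_session), longest first.
    """
    open_sessions = {}   # host -> [start_time, last_time, event_count]
    closed = []
    for t, host, idle in samples:
        if host in open_sessions:
            session = open_sessions[host]
            session[1], session[2] = t, session[2] + 1
            if idle > end_above:
                start, last, count = open_sessions.pop(host)
                closed.append((host, start, last - start, count))
        elif idle < begin_below:
            open_sessions[host] = [t, t, 1]
    return sorted(closed, key=lambda s: s[2], reverse=True)


# Hypothetical samples for one host, one per minute.
data = [(0, "ep1", 50), (60, "ep1", 10), (120, "ep1", 15),
        (180, "ep1", 70), (240, "ep1", 80)]
sessions = high_cpu_sessions(data)
print(sessions)
```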

Data enrichment

• Given authentication events with a numeric authentication identity, enrich the
search results with meaningful usernames

SELECT DATEFORMAT(starttime,'yyyy-MM-dd HH:mm:ss') ts,
_identity_test,
REFERENCEMAP('id_to_username',_identity_test) identity_username
FROM events WHERE _identity_test <> NULL LIMIT 1

The enrichment “id_to_username” map ->
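
Conceptually, the REFERENCEMAP call in the query above is a keyed lookup against a reference map. A rough Python analogue, with made-up map contents and events:

```python
def enrich(events, id_to_username):
    """Add identity_username to each event that has a numeric identity."""
    return [{**e, "identity_username": id_to_username.get(e["_identity_test"])}
            for e in events if e.get("_identity_test") is not None]


# Hypothetical map contents and events.
id_to_username = {"1001": "alice", "1002": "bob"}
events = [{"ts": "2022-03-15 09:00:00", "_identity_test": "1001"},
          {"ts": "2022-03-15 09:00:05", "_identity_test": None}]
enriched = enrich(events, id_to_username)
print(enriched)
```

The filter on missing identities plays the role of the `_identity_test <> NULL` condition in the query.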

Full text search, pattern matching etc.

Ways to search:
• Payload/Property contains / matches / LIKE / MATCHES
• Quick Filter

SELECT DATEFORMAT(starttime,'yyyy-MM-dd HH:mm:ss') ts, sourceip, username
FROM events WHERE username ILIKE 'admin%'

SELECT DATEFORMAT(starttime,'yyyy-MM-dd HH:mm:ss') ts, sourceip, username
FROM events WHERE UTF8(payload) ILIKE '%admin%'

SELECT DATEFORMAT(starttime,'yyyy-MM-dd HH:mm:ss') ts, sourceip, username
FROM events WHERE TEXT SEARCH '*admin*'

Full text search, pattern matching etc.

Regex vs. Quick Filter:

• contains / matches / LIKE / MATCHES – all involve Java regex execution.
Regex execution can be computationally expensive and is unlikely to perform
well on large inputs.
• Quick Filter – based on the Apache Lucene full-text search (FTS) library. It creates
an index and uses it for very fast text search, trading preprocessing and index
creation resources for significantly faster searches. The index requires additional
disk space, usually around 30% of the uncompressed payload size.
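
The trade-off can be illustrated with a toy Python comparison: a regex scan touches every payload at query time, while an inverted index pays a one-time build cost and then answers term lookups directly. This is a simplified sketch of the idea with made-up payloads, not of Lucene's actual index format.

```python
import re
from collections import defaultdict

# Hypothetical payloads.
payloads = ["user=root login failed", "user=admin login ok", "disk usage normal"]

# Scan approach: run a regex over every payload at query time.
scan_hits = [i for i, p in enumerate(payloads) if re.search(r"admin", p)]

# Index approach: pay once up front to map each token to the payloads
# that contain it, then look terms up directly.
index = defaultdict(set)
for i, p in enumerate(payloads):
    for token in re.findall(r"\w+", p.lower()):
        index[token].add(i)

index_hits = sorted(index.get("admin", set()))
print(scan_hits, index_hits)
```

The `index` dictionary is the extra storage the slide refers to: it grows with the number of distinct tokens and their occurrences.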

Advanced Search topics – Quick Filter

Advanced searching using Quick Filter

▪ What is Quick Filter?
▪ Quick Filter is a way to search event log and flow data using simple words and
phrases. It is optimized for the text search use case.

▪ What technology is it based upon?
▪ Apache Lucene - lucene.apache.org/core/

▪ How does Quick Filter compare to Ariel filters?
▪ Quick Filter has the potential to perform a ‘Payload Contains/Matches’ type of search
with the performance of an indexed search. Ariel filters are faster for finding exact
matches.

▪ Is Quick Filter aware of the normalized event properties in QRadar, like Source IP?
▪ No, Quick Filter operates on the index built from the raw payload and is not aware
of the normalized QRadar fields that DSMs extract and set.

Advanced Search topics – Quick Filter

Building the index – Analysis and Tokenization

Text is split using whitespace and punctuation as delimiters, “meaningless” words and
delimiters are dropped, and the remaining tokens are indexed.
– Example 1:
• Message: Hello world, I am a string!
• Tokens: am hello i string world
• Not tokenized: , ! a
– Example 2:
• Message: abc=blah|url=https://www4.dot.com|user=root
• Tokens: abc blah https root url user www4.dot.com
• Not tokenized: = | : //
– Example 3:
• Message: Sep 1 11:27:49 152.7.19.18 sshd(pam_unix)[11467]: authentication failure;
logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=1.2.3.13 user=root
• Tokens: 0 1 1.2.3.13 11 11467 152.7.19.18 27 49 authentication euid failure logname
nodevssh pam rhost root ruser sep sshd tty uid unix user
• Not tokenized: : ( ) [ ] ; : =
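
A heavily simplified Python tokenizer reproduces Example 1. It is only a sketch of the general idea (lowercase, split on delimiters, drop stop words). It does not implement the analyzer's special rules, so it would not keep tokens such as www4.dot.com or 1.2.3.13 intact as in Examples 2 and 3. The one-word stop list is an assumption taken from Example 1.

```python
import re

STOP_WORDS = {"a"}  # assumption: only the stop word visible in Example 1


def simple_tokens(message: str) -> list[str]:
    """Lowercase, split on anything that is not a letter or digit, drop stop words."""
    words = re.split(r"[^a-z0-9]+", message.lower())
    return sorted({w for w in words if w and w not in STOP_WORDS})


tokens = simple_tokens("Hello world, I am a string!")
print(tokens)
```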

Advanced Search topics – Quick Filter

Boolean logic
• a b – search for messages that contain either a or b
• a AND b – search for messages that contain both a and b
• NOT a AND (b OR c) – search for messages that do not contain a and contain b or c
Wildcards
• b?t – search for messages with words of exactly 3 letters, starting with b and ending with t. Example:
bat, bot
• b??t – search for messages with words of exactly 4 letters, starting with b and ending with t. Example:
boat, boot
• b*t – search for messages with words of any length, starting with b and ending with t. Example: bat,
boat, boot
• *axe* – search for messages with words of any length that contain axe
Regex
• /[br]oot/ – search for messages with the word boot or root
• /(b|r)oot/ – same match as above, written with alternation
• /hack{2}/ – search for messages with the word hackk; the quantifier applies only to the preceding
character (use /(hack){2}/ to match hackhack)
• /hack{2,}/ – search for messages with the word hack followed by at least one more k (hackk, hackkk, ...)
• /.*\.doc/ – search for messages with words ending in .doc
Proximity
• "user hacker"~2 – search for messages with the words user and hacker at most two words apart
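
The wildcard and regex patterns above match against individual indexed terms, not whole messages. The Python sketch below uses `fnmatch` and `re.fullmatch` as stand-ins for Lucene's wildcard and regex matching, an approximation that holds for these simple patterns, to show which made-up terms each pattern accepts. Note in particular what a `{2}` quantifier actually repeats.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical indexed terms.
terms = ["bat", "boat", "boot", "root", "hackk", "hackhack", "report.doc"]

# Wildcards: ? is exactly one character, * is any number of characters.
wild = {p: [t for t in terms if fnmatchcase(t, p)] for p in ("b?t", "b??t", "b*t")}

# Regex: the whole term must match, hence fullmatch.
patterns = (r"[br]oot", r"(b|r)oot", r"hack{2}", r"(hack){2}", r".*\.doc")
rx = {p: [t for t in terms if re.fullmatch(p, t)] for p in patterns}

for pattern, hits in {**wild, **rx}.items():
    print(pattern, hits)
```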

Advanced Search topics – Quick Filter

Things to be aware of:

• The default operator is OR when multiple terms are searched for:
– a b = a OR b
• Quick Filter terms are not case sensitive, but operators are:
– ABC = abc
– a and b = a OR and OR b
• The following characters need to be escaped when used as a term:
– + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
Use a backslash to escape:
– (1+1):2 has to be converted to \(1\+1\)\:2
• QRadar uses the Classic Analyzer.
• Search criteria are also analyzed and tokenized!
– "hello world" => hello AND world, where the word ‘world’ is positioned one word after hello
– "hello\-there" => hello AND there
– 127.0.0.1 => Periods that are not followed by whitespace are preserved. No need to escape them in
criteria
– Words are split at hyphens, unless the word contains a number, in which case the token is not split
and the numbers and hyphens are retained as one token
• Abc-efg => abc efg
• Abc-efg2 => abc-efg2
– Internet domain names and email addresses are preserved as a single token
– File names and URL names that contain more than one underscore are split on the last underscore
before a period
• 1.2.3.4:/john_big_admin/1.pdf => 1.2.3.4 big john admin/1.pdf
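
Escaping by hand is error-prone, so a helper is handy when Quick Filter terms come from a script. The Python sketch below backslash-escapes every character in the list above. Escaping each `&` and `|` individually, rather than only the doubled `&&` and `||`, is a deliberate simplification, and the function name is hypothetical.

```python
# Characters from the escape list; & and | are escaped one at a time.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term: str) -> str:
    """Prefix every Quick Filter special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


escaped = escape_term("(1+1):2")
print(escaped)
```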

Advanced Search topics – Quick Filter

Search tips:
• Loosen the search criteria if a search does not find what you expect
• Use criteria that are as strict as possible for best performance

Example 1:
Log message = 1.2.3.4:/john_big_admin/1.pdf
Quick Filter 1 = "john_big_admin". Log not found!
Quick Filter 2 = /.*john_big_admin.*/. Log not found!
Quick Filter 3 = john AND big AND admin. Log not found!
Quick Filter 4 = *john* AND *big* AND *admin*. Log found!

Tokens - 1.2.3.4 admin/1.pdf big john

Optimal Quick Filter – admin* AND john AND big. The exact form matters little for
performance once a wildcard/prefix query is already used, i.e. admin* AND
john* AND big* is expected to perform similarly.
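
The outcomes in Example 1 follow directly from the token list. The Python sketch below checks Quick Filters 2, 3, 4 and the optimal filter against those four tokens, using `fnmatch` and `re.fullmatch` as approximations of term-level wildcard and regex matching. Quick Filter 1 is left out because it depends on how the analyzer tokenizes the phrase.

```python
import re
from fnmatch import fnmatchcase

tokens = ["1.2.3.4", "admin/1.pdf", "big", "john"]  # index tokens from the example


def has_term(pattern: str) -> bool:
    """True if any indexed token matches an exact term or a ?/* wildcard pattern."""
    return any(fnmatchcase(tok, pattern) for tok in tokens)


# Regex must match a single whole token; no token contains john_big_admin.
filter_2 = any(re.fullmatch(r".*john_big_admin.*", tok) for tok in tokens)
# Exact terms: "admin" fails because the indexed token is "admin/1.pdf".
filter_3 = all(has_term(t) for t in ("john", "big", "admin"))
filter_4 = all(has_term(t) for t in ("*john*", "*big*", "*admin*"))
optimal = all(has_term(t) for t in ("admin*", "john", "big"))
print(filter_2, filter_3, filter_4, optimal)
```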

Advanced Search topics

Evaluating your query execution:


Expand “Current Statistics” to see important query processing details.

Click “More Details” to see per-host query statistics.

Search performance best practices

How to search efficiently – Limit the scope!

▪ Narrow down the time window. Start small, expand as needed.


▪ Take advantage of indexes
▪ Limit the result set size
▪ Use as specific criteria as possible

Use Case: Find all users that interacted with “spam.ru” domain in the past month

Example of a “bad” search. Bad = expensive and slow

Search performance investigation real-world example
Case 1 – Top log source by EPS (Pulse default widget)

Search performance investigation real-world example
Solution 1 – Use QRadar default aggregated data view

Search performance investigation real-world example
Case 2 – Average event rate (EPS) (Pulse default widget)

SELECT starttime/(1000*60) as minute,
DATEFORMAT(starttime,'YYYY MM dd HH:mm:ss') as showTime,
(minute * (1000 * 60)) as 'tsTime',
"Events per Second Raw - Average 1 Min" as EPS, parent as aParent
from events
where aParent IN <- aParent is not indexed -> sequential scan
(select aParent FROM
(select parent as aParent,
"Events per Second Raw - Average 1 Min" as EPS
from events
where parent <> NULL and logsourceid=65
group by Parent
order by EPS <- implicit ascending sort order
limit 5)) <- no time criteria -> implicit last 5 minutes timeframe
group by minute, parent
order by minute ASC
last 2 hours

Search performance investigation real-world example
Solution 2 – Use an indexed criterion, provide an explicit timeframe for the inner
query, and fix the sort order

SELECT starttime/(1000*60) as minute,
DATEFORMAT(starttime,'YYYY MM dd HH:mm:ss') as showTime,
(minute * (1000 * 60)) as 'tsTime',
"Events per Second Raw - Average 1 Min" as EPS, parent as aParent
from events
where logsourceid=65 and aParent IN
(select aParent FROM
(select parent as aParent,
"Events per Second Raw - Average 1 Min" as EPS
from events
where parent <> NULL and logsourceid=65
group by Parent
order by EPS desc
limit 5
last 2 hours))
group by minute, parent
order by minute ASC
last 2 hours
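
The sort-order fix is easy to overlook. With made-up EPS values, this Python sketch shows how an ascending sort followed by a limit of 5 keeps the quietest parents, while a descending sort keeps the busiest ones, which is what a "top" widget needs.

```python
# Hypothetical average EPS per parent.
eps_by_parent = {"p1": 120, "p2": 900, "p3": 15, "p4": 400,
                 "p5": 60, "p6": 2500, "p7": 5}

# "order by EPS limit 5": ascending by default, so the five QUIETEST parents.
bottom_5 = sorted(eps_by_parent, key=eps_by_parent.get)[:5]

# "order by EPS desc limit 5": the five busiest parents.
top_5 = sorted(eps_by_parent, key=eps_by_parent.get, reverse=True)[:5]
print(bottom_5, top_5)
```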

Conclusion

QRadar provides scalable, distributed data storage and analysis capabilities
by leveraging a fast and scalable search engine, multiple search tools,
a flexible deployment architecture, and building blocks that make
great search performance achievable.

References:
• Searching Events and Flows in QRadar
• Ariel Query Language
• Searching Your QRadar Data Efficiently: Part 2 - Leveraging Indexed Values
• QRadar Quick Filter search options
• QRadar Pulse app
• QRadar Deployment Intelligence app


© Copyright IBM Corporation 2016. All rights reserved. The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind,
express or implied. Any statement of direction represents IBM's current intent, is subject to change or withdrawal, and represent only goals and objectives. IBM, the IBM logo, and other IBM products
and services are trademarks of the International Business Machines Corporation, in the United States, other countries or both. Other company, product, or service names may be trademarks or service
marks of others.
Statement of Good Security Practices: IT system security involves protecting systems and information through prevention, detection and response to improper access from within and outside your
enterprise. Improper access can result in information being altered, destroyed, misappropriated or misused or can result in damage to or misuse of your systems, including for use in attacks on others.
No IT system or product should be considered completely secure and no single product, service or security measure can be completely effective in preventing improper use or access. IBM systems,
products and services are designed to be part of a lawful, comprehensive security approach, which will necessarily involve additional operational procedures, and may require other systems, products
or services to be most effective. IBM does not warrant that any systems, products or services are immune from, or will make your enterprise immune from, the malicious or illegal conduct of any party.
BACKUP

How Ariel works – high level

▪ A client sends a query request to the Ariel Proxy Server (APS)
▪ APS processes the query parameters and distributes them to all hosts participating in the
search
▪ Hosts work on their search tasks until completion or cancellation, reporting
progress and partial results periodically, as needed
▪ APS maintains the search state and processes partial results coming from the hosts
participating in the search
▪ Once all AQS hosts finish, APS may have to do a final processing phase, such as a final sort
or a group by, before reporting search completion
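
The flow above is a scatter-gather pattern. The Python sketch below models it with a thread pool. The host names, data, and function names are made up, and progress reporting and cancellation are left out.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-host data: each host holds its own slice of events.
HOST_DATA = {"host-a": [7, 3, 9], "host-b": [4, 8], "host-c": [1, 6]}


def search_host(host: str, minimum: int) -> list[int]:
    """Stand-in for one host running its part of the search."""
    return [v for v in HOST_DATA[host] if v >= minimum]


def proxy_search(minimum: int) -> list[int]:
    """Fan the query out to every host, merge partial results, then sort once."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda h: search_host(h, minimum), HOST_DATA)
        merged = [value for part in partials for value in part]
    return sorted(merged, reverse=True)   # final processing phase on the proxy


result = proxy_search(4)
print(result)
```

The final sort runs on the proxy because no single host sees the whole result set.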

How Ariel works – lower level

▪ Multiple specialized execution thread pools – index processing, aggregations,
filtering/reading data etc.
▪ Each pool size is configurable and is sized automatically at install time based on
system resources, which allows fine-tuning the system
▪ Results of work are transferred to the next pool, to network or disk, as needed

How Ariel works – lowest level

▪ A query is divided into many small work tasks
▪ Each execution thread pool has a partitioned priority queue to enforce execution
priorities, with high priority tasks getting execution slots at a higher rate than normal
and low priority tasks
▪ Each execution pool can execute 1..N tasks, based on the pool size
▪ Tasks are given a small time slice (called a quantum) in which to execute. If a task does
not complete in time, its state is preserved and the task is put back into the entry queue
of the pool. This allows progress for all tasks (and so all searches). Queued query state
is (mostly) eliminated.
▪ The quantum is configurable – larger values prioritize the execution speed of
computationally complex searches at the expense of latency and fairness for smaller
tasks, while smaller values prioritize fairness and a sense of overall work progress.
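
The effect of the quantum can be seen in a small Python simulation of a single execution slot. Priorities and multiple pools are ignored, and the task names and work units are made up.

```python
from collections import deque


def run_with_quantum(tasks: dict[str, int], quantum: int) -> list[str]:
    """tasks maps a task name to its remaining work units.

    Each turn a task runs for at most `quantum` units; unfinished tasks keep
    their state and go back to the end of the queue. Returns completion order.
    """
    queue = deque(tasks.items())
    finished = []
    while queue:
        name, remaining = queue.popleft()
        remaining -= min(quantum, remaining)
        if remaining:
            queue.append((name, remaining))   # state preserved, re-queued
        else:
            finished.append(name)
    return finished


work = {"big_search": 10, "small_search": 2}
small_quantum = run_with_quantum(work, quantum=2)
large_quantum = run_with_quantum(work, quantum=10)
print(small_quantum, large_quantum)
```

With the small quantum the short search finishes first even though it was queued second; with the large quantum it has to wait for the big search to complete.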

Bonus – QRadar search performance retrospective
Searching 1 TB of data in less than 1 second on 1 xx28
With LazySearch all searches using filters over indexed properties perform like needle-in-a-haystack!
Needle in haystack search, returning 100’s of results