Lucene Search Syntax Guide

© LogRhythm, Inc. All rights reserved.


This document contains proprietary and confidential information of LogRhythm, Inc., which is protected by
copyright and possible non-disclosure agreements. The Software described in this Guide is furnished under
the End User License Agreement or the applicable Terms and Conditions (“Agreement”) which governs the
use of the Software. This Software may be used or copied only in accordance with the Agreement. No part
of this Guide may be reproduced or transmitted in any form or by any means, electronic or mechanical,
including photocopying and recording for any purpose other than what is permitted in the Agreement.

Disclaimer
The information contained in this document is subject to change without notice. LogRhythm, Inc. makes no
warranty of any kind with respect to this information. LogRhythm, Inc. specifically disclaims the implied
warranty of merchantability and fitness for a particular purpose. LogRhythm, Inc. shall not be liable for any
direct, indirect, incidental, consequential, or other damages alleged in connection with the furnishing or use
of this information.

Trademark
LogRhythm is a registered trademark of LogRhythm, Inc. All other company or product names mentioned
may be trademarks, registered trademarks, or service marks of their respective holders.

LogRhythm Inc.
4780 Pearl East Circle
Boulder, CO 80301
(303) 413-8745
www.logrhythm.com

LogRhythm Customer Support


[email protected]
Contents
Lucene Search Syntax
Basic Queries
Wildcard Queries and Fuzzy Matches
Complex Queries
Boolean Operators
Grouping
Field Grouping
Troubleshooting

Metadata Field Descriptions
Applications
KBytes and Packets
Classification
Host
Identity
Location
Log
Network


Lucene Search Syntax
The LogRhythm Web Console allows you to filter data on the Dashboards and the Analyze page by using
Lucene search syntax. Lucene is an open source text retrieval library released under the Apache Software
License. This guide provides an overview of building Lucene queries for use in the Web Console. For
information on how to run queries that are more complex than those detailed in this guide, see the official
Apache Lucene search website.

Basic Queries
A basic query consists of a metadata field name, a colon, and the term enclosed in standard (straight) double
quotation marks: Metadata:"term"

The Metadata used in the Web Console are details from log messages. For complete information on the
metadata available, including the syntax, a description, and the corresponding Web Console display name,
see the tables at the end of this document.

Example
If you wanted to run a query for all activity that falls under the Malware classification, you would
use: classificationName:"Malware"
If you wanted to run a query for activity originating from the user account jon.smith, you would use:
login:"jon.smith"

Lucene matches the metadata field name exactly, so you must match its capitalization correctly. The term,
however, is not case sensitive. For example, either of the following queries would cause the Web Console
widgets to return a No data available error message:

classificationname:"Malware"
Login:"jon.smith"
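Because field names are case sensitive, a small pre-check can catch capitalization mistakes before a query is pasted into the Web Console. The following Python sketch is illustrative only: KNOWN_FIELDS is a hand-copied subset of the field names used in this guide, and suggest_field is a hypothetical helper, not part of any LogRhythm tool.

```python
# A small subset of field names copied from the tables and examples in this guide.
KNOWN_FIELDS = {
    "classificationName", "login", "account",
    "originHost", "impactedHost", "priority", "normalDate",
}

def suggest_field(name):
    """Return the correctly capitalized field name, or None if it is unknown."""
    if name in KNOWN_FIELDS:
        return name
    by_lowercase = {field.lower(): field for field in KNOWN_FIELDS}
    return by_lowercase.get(name.lower())

print(suggest_field("classificationname"))  # classificationName
print(suggest_field("Login"))               # login
```

A result that differs from the input indicates that only the capitalization was wrong; a result of None means the name is not in the subset at all.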

To escape a special character that is part of the query syntax, use a backslash before the character.
Characters that require this treatment are: + - && || ! ( ) { } [ ] ^ " ~ * ? : \

Example
If you wanted to run a query for an origin user whose name is jon*, you would use: login:"jon\*"
If you wanted to run a query for an origin user whose name is jon.smith-miller, you would use:
login:"jon.smith\-miller"
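If you build queries in a script, the escaping rule above can be applied automatically. The following Python sketch assumes the character list given in this section; the helper names escape_term and quoted_query are hypothetical, and escaping & and | one character at a time (rather than only as && and ||) is a simplifying assumption.

```python
# Characters this guide lists as requiring a backslash escape.
# Assumption: "&&" and "||" are handled by escaping each & and | individually.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')

def escape_term(term):
    """Prefix every special character in the term with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)

def quoted_query(field, term):
    """Build a Metadata:"term" query with the term escaped."""
    return f'{field}:"{escape_term(term)}"'

print(quoted_query("login", "jon*"))              # login:"jon\*"
print(quoted_query("login", "jon.smith-miller"))  # login:"jon.smith\-miller"
```

The period is not in the list of special characters, so jon.smith passes through unchanged.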



Wildcard Queries and Fuzzy Matches
If a classification name is Malware XYZ, you will not get the results you are looking for if you run a query using
classificationName:"Malware". You need to enter the term exactly or use a wildcard search. Wildcard
searches use forward slashes (in place of quotation marks) to indicate regex syntax. In Lucene queries, the
set of operators for wildcard searches is limited when compared to the full Java-supported regex.

You can use a period and question mark combination (.?) as a single-character wildcard. Because .? matches
zero or one character, it also matches terms in which that character is absent.

Example
If you wanted to run a query for an origin user named either Jon or Jan, you would use: login:/J.?n/

Use a period and asterisk combination (.*) for multi-character wildcards.

Example
If you wanted to run a query for all origin users whose name begins with Jo, you would use:
login:/Jo.*/
If you wanted to run a query for all origin users whose account ends with Smith, you would use:
login:/.*Smith/
If you wanted to search for all classifications containing the word Malware, you would use:
classificationName:/.*Malware.*/
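To get a feel for what these patterns match before running them, you can approximate them locally. The following Python sketch uses Python's re module as a stand-in for Lucene's more limited regex syntax, and assumes that a pattern must match the entire term (which is why .* is needed on both sides to find a word inside a longer value). The sample names are made up, and case handling in the real index is not modeled.

```python
import re

def matches(pattern, term):
    """Approximate a /pattern/ query: the pattern must match the whole term."""
    return re.fullmatch(pattern, term) is not None

# Hypothetical user names, for illustration only.
names = ["Jon", "Jan", "Jn", "John", "Joan.Smith", "Al.Smith"]

print([n for n in names if matches("J.?n", n)])     # ['Jon', 'Jan', 'Jn']
print([n for n in names if matches("Jo.*", n)])     # ['Jon', 'John', 'Joan.Smith']
print([n for n in names if matches(".*Smith", n)])  # ['Joan.Smith', 'Al.Smith']
```

Note that J.?n also matches Jn, because .? allows the middle character to be missing.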

You can also use wildcard queries to filter results by blank or non-blank term fields.

Example
If you wanted to run a query for all log messages classified under any major activity group, you
would use: classificationName:/.*/
If you wanted to run a query for all log messages NOT classified under any major activity group,
you would use: *:* AND NOT classificationName:/.*/

To use a fuzzy match to locate terms similar to what you type, append a tilde (~) to the term, with no quotation
marks or slashes.

Example
If you wanted to run a query for origin users whose names are similar to Jon, such as Ron or John,
you would use: login:Jon~
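Fuzzy matching in Lucene is based on edit distance, that is, the number of single-character changes needed to turn one term into another. The following Python sketch computes the plain Levenshtein distance to show why Ron and John count as similar to Jon. It is an illustration only: the exact distance variant and the maximum distance that the Web Console accepts are not specified in this guide.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

for candidate in ["Ron", "John", "Jonathan"]:
    print(candidate, edit_distance("Jon", candidate))
```

Ron and John are each one edit away from Jon, while Jonathan is five edits away and would not normally be treated as a fuzzy match.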



Complex Queries
Boolean Operators
The default Boolean operator is OR. If you do not include an operator when searching on multiple criteria, the
query will be run as an OR search. You can also use || in place of OR.

Example
If you wanted to run a query for all activity that falls under the Malware classification or that originated
from a particular host, you would use any of the following:

classificationName:"Malware" originHost: "106.194.190.210"


classificationName:"Malware" OR originHost: "106.194.190.210"
classificationName:"Malware" || originHost: "106.194.190.210"

The AND operator looks for all terms to exist. You can also use && in place of AND.

Example
If you wanted to run a query to see whether Malware activity originated from a particular host, you would
use either of the following:

classificationName:"Malware" AND originHost: "106.194.190.210"


classificationName:"Malware" && originHost: "106.194.190.210"

The NOT operator excludes results with that term. You can also use ! in place of NOT.

Example
If you wanted to run a query for the origin user account jon.smith for all activity that is not classified as
Malware, you would use either of the following:

login:"jon.smith" NOT classificationName:"Malware"


login:"jon.smith" ! classificationName:"Malware"

If you need to run a NOT search by itself, use the following wildcard syntax:

*:* AND NOT Metadata:"term"
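When queries are assembled in a script, the operators above can be applied with a few small helpers. The following Python sketch is one possible approach; the function names all_of, any_of, and standalone_not are hypothetical, and the clauses are written without a space after the colon.

```python
def all_of(*clauses):
    """Join clauses with AND, so every clause must match."""
    return " AND ".join(clauses)

def any_of(*clauses):
    """Join clauses with OR, so at least one clause must match."""
    return " OR ".join(clauses)

def standalone_not(clause):
    """Build a NOT-only query by prefixing the *:* match-all clause."""
    return f"*:* AND NOT {clause}"

malware = 'classificationName:"Malware"'
host = 'originHost:"106.194.190.210"'

print(any_of(malware, host))
print(all_of(malware, host))
print(standalone_not(malware))
```

Writing AND, OR, and NOT in one place also guarantees that the operators are always capitalized.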

To search by a range, include TO between the parameters. To run an inclusive search, use square brackets [ ].
To run an exclusive search, use curly brackets { }.



Example
If you wanted to run a query for the host from which a log activity originated, INCLUSIVE of the first
and last IP address, you would use:
originHost: [106.194.190.210 TO 106.194.190.250]
If you wanted to run a query for the host from which a log activity originated, EXCLUSIVE of the
first and last IP address, you would use:
originHost: {106.194.190.210 TO 106.194.190.250}
If you wanted to run a query for a log of a certain priority ranking, INCLUSIVE of a ranking 40 or
greater, you would use:
priority: [40 TO *]
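The bracket rules for ranges can be captured in a single helper. The following Python sketch is a minimal example; range_query and its inclusive parameter are hypothetical names, and the queries it builds omit the space after the colon.

```python
def range_query(field, low, high, inclusive=True):
    """Build a range query; pass "*" for an open-ended bound."""
    left, right = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{left}{low} TO {high}{right}"

print(range_query("priority", 40, "*"))
# priority:[40 TO *]
print(range_query("originHost", "106.194.190.210", "106.194.190.250", inclusive=False))
# originHost:{106.194.190.210 TO 106.194.190.250}
```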

To run a search on a log date, you need to convert the time to epoch format in milliseconds. There are several
online tools to help you do so, including EpochConverter and Unix Time Stamp.

Example
If you wanted to run a query for all logs after October 30, 2016 at 9 A.M. local time, you would use:
normalDate:{1477839600000 TO *}

Note that there currently is no way to designate time relative to now.
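Instead of an online converter, you can compute the epoch value yourself. The following Python sketch converts the date from the example above, assuming that "local time" means UTC-6 (the offset that reproduces the value shown). It also shows one way to work around the lack of relative time: calculate the absolute value for "24 hours ago" just before running the query. The helper name to_epoch_ms is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)

# Assumption: "local time" is UTC-6, which reproduces the value in the example.
local = timezone(timedelta(hours=-6))
start = to_epoch_ms(datetime(2016, 10, 30, 9, 0, tzinfo=local))
print(f"normalDate:{{{start} TO *}}")  # normalDate:{1477839600000 TO *}

# A relative window ("the last 24 hours") must be turned into an absolute value first.
since = to_epoch_ms(datetime.now(timezone.utc) - timedelta(hours=24))
print(f"normalDate:[{since} TO *]")
```

A query built this way goes stale as time passes, so the relative value has to be recalculated each time the query is run.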



Grouping
You can query for multiple values in the same filter by enclosing all terms in parentheses. This is similar to
using the OR operator, except that you can only search one metadata field with this syntax.

Example
If you wanted to run a query for all activity that falls under the Malware or Attack classifications, you
would use: classificationName:("Malware" "Attack")
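A grouped filter like this is easy to generate from a list of values. The following Python sketch shows one way to do it; the helper name grouped is hypothetical, and the terms are assumed to contain no characters that need escaping.

```python
def grouped(field, *terms):
    """Build field:("a" "b" ...), which matches any of the quoted terms."""
    quoted = " ".join(f'"{term}"' for term in terms)
    return f"{field}:({quoted})"

print(grouped("classificationName", "Malware", "Attack"))
# classificationName:("Malware" "Attack")
```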

Field Grouping
Use parentheses to group fields in order to create combinations of any of these query types.

Example
If you wanted to run a query for the host from which Malware activity originated, exclusive of the
first and last IP addresses in two different ranges, you would use:
originHost: ({106.194.190.210 TO 106.194.190.250} OR {106.194.190.365 TO 106.194.190.395}) AND classificationName:"Malware"
If you wanted to run a query to look for Malware or Compromise activity that involved any of three
separate origin users but that does not come from a particular IP address, you would use:
login:("jon.smith" "fred.miller" "janice.jones") AND classificationName:("Malware" "Compromise") AND NOT originHost:"106.194.190.210"

Troubleshooting
If your Lucene query returns a No data available error or otherwise is not returning the results you expect,
check the following:

Upper and lower case accuracy of the metadata field and the term
Use of quotation marks, forward slashes, and/or back slashes
Use of wildcard characters
Capitalization of Boolean logic terms
Use of parentheses in complex queries
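Some of the checks above can be automated before a query is submitted. The following Python sketch is a rough linter, not a parser: it only counts quotation marks and brackets and looks for lowercase operators, and it does not understand regex queries between forward slashes. The function name lint_query is hypothetical.

```python
import re

def lint_query(query):
    """Return a list of likely problems found in a query string."""
    problems = []
    # Drop escaped characters first so that \" or \( are not counted.
    bare = re.sub(r"\\.", "", query)
    if bare.count('"') % 2:
        problems.append("unbalanced quotation marks")
    # Empty out quoted terms so their contents are ignored below.
    bare = re.sub(r'"[^"]*"', '""', bare)
    for opener, closer in ("()", "[]", "{}"):
        if bare.count(opener) != bare.count(closer):
            problems.append(f"unbalanced {opener}{closer}")
    for word in re.findall(r"\b(?:and|or|not|to)\b", bare):
        problems.append(f"operator '{word}' should be uppercase")
    return problems

print(lint_query('classificationName:("Malware" "Attack" and NOT originHost:"10.0.0.1"'))
# ['unbalanced ()', "operator 'and' should be uppercase"]
```

An empty list means only that none of these simple checks failed; the query can still return No data available for other reasons, such as a misspelled field name.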



Metadata Field Descriptions

Applications
Web Console Display Name | Lucene Search Syntax | Field Description
Action | action | An action taken by a device.
Amount | amount | Integer value representing a quantity.
Application | portProtocol | A network protocol or a web application impacted by the event generated from the log message. Note: The "unknown" category is an aggregation of applications that the SIEM has not classified.
Command | command | The name of an executed command within the metadata (for example: login, get, or put).
Duration | duration | Running time of a session, job, activity, etc.
Hash | hash | The digital signature, or mathematical equivalent, of the file that retrieves data from a URL or is the combination of other downloaded files.
Known Application | serviceName | Known application or service, such as HTTP, POP3, or Telnet. An application is "known" if the SIEM can match the protocol number from the log to a service name in the Events Database.
Object | object | Resource that is referenced or impacted by the log activity. An "object" can include a file, file path, registry key, etc.
Object Name | objectName | Resource that is referenced or impacted by the log activity. An "object" can include a file, file path, registry key, etc.
Note: The Object field contains the full path and name, but objectName only stores the object name.
Object Type | objectType | A pair with an Object and an Object Name; for example, the content type from HTTP logs.
Parent Process ID | parentProcessId | An ID number for a service or process running on a device, also known as PID.
Parent Process Name | parentProcessName | The name of a process currently running on a system.
Parent Process Path | parentProcessPath | The logical storage path for a given process.
Process Name | process | Name or value that identifies a process (for example, "inetd" or "sshd").
Process ID | processId | The ID associated with a process.
Quantity | quantity | Item quantity.
Rate | rate | Rate of an item.
Size | size | The size of an item, which depends on the log type. For example, logs relating to firewalls may show the size or length of a packet.
Subject | subject | Email subject line. For non-email logs, this field could represent the subject in some form of communicated information.
Threat ID | threatId | An identification number specified for a given threat, as defined by a third-party security system or device, such as a firewall, IPS/IDS, AV, Endpoint Protection System, etc.
Version | version | A value that represents a version (OS version, patch version, doc version, etc.).


KBytes and Packets
Web Console Display Name | Lucene Search Syntax | Field Description
Host (Impacted) KBytes Rcvd | kBytesIn | KBytes involved in the impacted host activity. Host (Impacted) KBytes Rcvd is the number of bytes the impacted host received.
Host (Impacted) KBytes Sent | kBytesOut | KBytes involved in the impacted host activity. Host (Impacted) KBytes Sent is the number of bytes the impacted host sent.
Host (Impacted) KBytes Total | impactedHostTotalKBytes | KBytes involved in the impacted host activity. Host (Impacted) KBytes Total is the sum of KBytes In and KBytes Out.
Host (Impacted) Packets Rcvd | itemsPacketsIn | Packets involved in the impacted host activity. Host (Impacted) Packets Rcvd is the number of packets the impacted host received.
Host (Impacted) Packets Sent | itemsPacketsOut | Packets involved in the impacted host activity. Host (Impacted) Packets Sent is the number of packets the impacted host sent.
Host (Impacted) Packets Total | impactedHostTotalPackets | Packets involved in the impacted host activity. Host (Impacted) Packets Total is the sum of Packets In and Packets Out.
KBytes Inbound | kBytes | Total KBytes transferred from a device, system, or process. KBytes Inbound: Total KBytes received.
KBytes Outbound | outboundKBytes | Total KBytes transferred from a device, system, or process. KBytes Outbound: Total KBytes sent.


Classification
Web Console Display Name | Lucene Search Syntax | Field Description
Classification | classificationName | One of the major activity groups (Operations, Audit, or Security) used to group log message types, along with a more specific sub-classification. For example, sub-classifications for Security might include Compromise, Attack, or Malware.
Common Event | commonEventName | A short, plain-language description of the log that determines its Classification.
CVE | cve | Common Vulnerabilities and Exposures. This field is used to refer to specific vulnerabilities for a product.
Direction | directionName | Direction of activity between a log's origin and impacted zones. Values can be Internal, External, Outbound, Local, or Unknown.
MPE Rule Name | mpeRuleName | Message Processing Engine (MPE) rule, which identifies and normalizes log messages and then assigns them to a Log Type (Common Event).
Policy | policy | The LogRhythm Policy (e.g., FIM, RIM, Agent, etc.) resulting in the log being generated.
Reason | reason | The reason code within a log message. For example: Checkpoint: reason=mlx; Syslog - AirTight IDS/IPS: REASON=1
Response Code | responseCode | The response code that is returned from a prior command.
Result | result | Anything indicating a result, including, but not limited to, a code.
Severity | severity | A value indicating the severity of the log.
Status | status | The current waiting state for a process, system state, network state, or attempted action.
Threat Name | threatName | The name of a specific threat as defined by a third-party security system or device, such as a firewall, IPS/IDS, AV, Endpoint Protection System, etc.
Vendor Info | vendorInfo | Human-readable strings that may contain clarifying information not easily encapsulated by CE/Classification or a rule name.
Vendor Message ID | vendorMessageId | Unique vendor-assigned value that identifies the log message.

Host
Web Console Display Name | Lucene Search Syntax | Field Description
Host (Impacted) | impactedHost | The host involved in the log activity, which may include the IP address, host name, or Ethernet address. Host (Impacted) is the destination.
Host (Origin) | originHost | The host involved in the log activity, which may include the IP address, host name, or Ethernet address. Host (Origin) is the source.
Hostname (Impacted) | impactedName | The name of the host involved in the log activity (for example, a DNS name or a NetBIOS name). Hostname (Impacted) is the destination.
Hostname (Origin) | originName | The name of the host involved in the log activity (for example, a DNS name or a NetBIOS name). Hostname (Origin) is the source.
Interface (Impacted) | impactedInterface | The interface number of a device or physical port number of a switch. Interface (Impacted) is the destination interface.
Interface (Origin) | originInterface | The interface number of a device or physical port number of a switch. Interface (Origin) is the source interface.
IP Address (Impacted) | impactedIp | The IP addresses for the log activity. IP Address (Impacted) is the destination address.
IP Address (Origin) | originIp | The IP addresses for the log activity. IP Address (Origin) is the source address.
Known Host (Impacted) | impactedHostName | The host record associated with a specific Entity. Known Host (Impacted) is the destination of the log activity.
Known Host (Origin) | originHostName | The host record associated with a specific Entity. Known Host (Origin) is the source of the log activity.
Mac Address (Impacted) | impactedMac | The MAC address involved in the log message. MAC Address (Impacted) is the destination.
Mac Address (Origin) | originMac | The MAC address involved in the log message. MAC Address (Origin) is the source.
NAT IP Address (Impacted) | impactedNatIp | The IP address that was translated via NAT device logs. NAT IP Address (Impacted) is the destination.
NAT IP Address (Origin) | originNatIp | The IP address that was translated via NAT device logs. NAT IP Address (Origin) is the source.
Serial Number | serialNumber | This is the serial number for a specific device or system.

Identity
Web Console Display Name | Lucene Search Syntax | Field Description
Group | group | User group or role referenced or impacted by the log activity. This group is typically an Active Directory group name or other type of logical container.
Recipient | recipient | Email address or VOIP caller number. For non-email logs, this field could represent the user who received a form of information.
Sender | sender | Email originator or VOIP caller number. For non-email logs, this field could represent the user who sent a form of information.
User (Origin) | login | The user logon that is the source of the log activity.
User (Impacted) | account | The user account that is the recipient of the action (for example, a password reset on a user account).


Location
Web Console Display Name | Lucene Search Syntax | Field Description
Country (Impacted) | impactedCountry | The country involved in the log activity. Country (Impacted) is the destination area.
Country (Origin) | originCountry | The country involved in the log activity. Country (Origin) is the source area.
Note: The Country values are derived from the LogRhythm SIEM's GeoLocation feature.
Entity (Impacted) | impactedEntityName | The resolved host entities involved in the log data. Entity (Impacted) is the destination host.
Entity (Origin) | originEntityName | The resolved host entities involved in the log data. Entity (Origin) is the source host.
Note: An "Entity" is a record that represents a logical grouping of the SIEM or log objects in the enterprise. Administrators define Entities for security management and organization.
Location (Impacted) | impactedLocation | The geographic area involved in the log activity. Location (Impacted) is the destination area.
Location (Origin) | originLocation | The geographic area involved in the log activity. Location (Origin) is the source area.
Note: The Location values are derived from the LogRhythm SIEM's GeoLocation feature.
Region (Impacted) | impactedRegion | The region involved in the log activity. Region (Impacted) is the destination area.
Region (Origin) | originRegion | The region involved in the log activity. Region (Origin) is the source area.
Note: The Region values are derived from the LogRhythm SIEM's GeoLocation feature.
Root Entity | rootEntityId | The root entity (top-most entity) for a log source.
Note: In the search syntax, provide the ID number that the root entity is mapped to in the LogRhythm Client Console, rather than the name of the root entity.
Zone (Impacted) | impactedZoneName | The resolved zone (Internal, External, or DMZ) that LogRhythm identified in the log activity. Zone (Impacted) is the destination zone.
Zone (Origin) | originZoneName | The resolved zone (Internal, External, or DMZ) that LogRhythm identified in the log activity. Zone (Origin) is the source zone.
Note: Administrators assign zones in the Host records and Network records.

Log
Web Console Display Name | Lucene Search Syntax | Field Description
First Log Date | normalMsgDate | First occurrence of a single log in an aggregated log.
Last Log Date | normalDateMax | Latest occurrence of a single log in an aggregated log.
Log Count | count | Number of logs.
Log Date | normalDate | The creation date contained in the log. This value can be in UTC or a user-selected time zone.
Log Message | logMessage | Text from the log that is parsed into metadata fields.
Log Source | logSourceName | A unique identifier that generated the log on a specific host.
Log Source Entity | entityName | A logical collection of unique networks, devices, and systems.
Log Source Host | logSourceHostName | The system or device where the Log Source originated.
Log Source Type | logSourceTypeName | Type of facility or source where the log originated.
Log Sequence Number | sequenceNumber | The order in which the log was collected, in relation to other logs.

Network
Web Console Display Name | Lucene Search Syntax | Field Description
Domain (Impacted) | domainImpacted | The Windows or DNS domain referenced or impacted by log activity.
Domain (Origin) | domainOrigin | The domain from which a log message originated.
NAT TCP/UDP Port (Impacted) | impactedNatPort | The TCP/UDP port that was translated via NAT device logs. NAT TCP/UDP Port (Impacted) is the destination.
NAT TCP/UDP Port (Origin) | originNatPort | The TCP/UDP port that was translated via NAT device logs. NAT TCP/UDP Port (Origin) is the source.
Network (Impacted) | impactedNetwork | Network involved in the log activity. Network (Impacted) is the destination network.
Network (Origin) | originNetwork | Network involved in the log activity. Network (Origin) is the source network.
Protocol | protocolName | Network protocol applicable to the log message.
Session | session | The user, system, or application session.
Session Type | sessionType | If a session code is already in use for TCP or UDP protocols, this field is used for a session type that could be ssh, console, etc. Upon the establishment of a network connection, a session type is defined for that connection.
TCP/UDP Port (Origin) | originPort | The TCP or UDP port number. TCP/UDP Port (Origin) is the source.
TCP/UDP Port (Impacted) | impactedPort | The TCP or UDP port number. TCP/UDP Port (Impacted) is the destination.
URL | url | URL referenced or impacted by the log activity.
User Agent | userAgent | A unique string which identifies the browser or application and provides system-specific details to servers hosting visited websites.
