
2017 IEEE 24th International Conference on Web Services

Drain: An Online Log Parsing Approach with Fixed Depth Tree

Pinjia He∗, Jieming Zhu∗, Zibin Zheng†, and Michael R. Lyu∗
∗Computer Science and Engineering Department, The Chinese University of Hong Kong, China
{pjhe, jmzhu, lyu}@cse.cuhk.edu.hk
†Key Laboratory of Machine Intelligence and Advanced Computing (Sun Yat-sen University), Ministry of Education
School of Data and Computer Science, Sun Yat-sen University, China
[email protected]

Abstract—Logs, which record valuable system runtime information, have been widely employed in Web service management by service providers and users. A typical log analysis based Web service management procedure is to first parse raw log messages because of their unstructured format, and then apply data mining models to extract critical system behavior information, which can assist Web service management. Most of the existing log parsing methods focus on offline, batch processing of logs. However, as the volume of logs increases rapidly, model training of offline log parsing methods, which employs all existing logs after log collection, becomes time consuming. To address this problem, we propose an online log parsing method, namely Drain, that can parse logs in a streaming and timely manner. To accelerate the parsing process, Drain uses a fixed depth parse tree, which encodes specially designed rules for parsing. We evaluate Drain on five real-world log data sets with more than 10 million raw log messages. The experimental results show that Drain has the highest accuracy on four data sets, and comparable accuracy on the remaining one. Besides, Drain obtains 51.85%∼81.47% improvement in running time compared with the state-of-the-art online parser. We also conduct a case study on an anomaly detection task using Drain in the parsing step, which demonstrates the effectiveness of Drain in log analysis.

Index Terms—Log parsing; Online algorithm; Log analysis; Web service management

978-1-5386-0752-7/17 $31.00 © 2017 IEEE    DOI 10.1109/ICWS.2017.13

I. INTRODUCTION

The prevalence of cloud computing, which enables on-demand service delivery, has made Service-oriented Architecture (SOA) a dominant architectural style. Nowadays, more and more developers leverage existing Web services to build their own systems because of their rich functionality and “plug-and-play” property. Although developing Web service based systems is convenient and lightweight, Web service management is a significant challenge for both service providers and users. Specifically, service providers (e.g., Amazon EC2 [1]) are expected to provide services with no failures or SLA (service-level agreement) violations to a large number of users. Similarly, service users need to effectively and efficiently manage the adopted services, which has been discussed in many recent works (e.g., Web service monitoring [2]). In this context, log analysis based service management techniques, which employ service logs to achieve automatic or semi-automatic service management, have been widely studied.

Logs are usually the only data resource available that records service runtime information. In general, a log message is a line of text printed by logging statements (e.g., printf(), logging.info()) written by developers. Thus, log analysis techniques, which apply data mining models to get insights into system behaviors, are in widespread use for service management. For service providers, there are studies in anomaly detection [3], [4], fault diagnosis [5], [6] and performance improvement [7]. For service users, typical examples include business model mining [8], [9] and user behavior analysis [10], [11].

Most of the data mining models used in these log analysis techniques require structured input (e.g., an event list or a matrix). However, raw log messages are usually unstructured, because developers are allowed to write free-text log messages in source code. Thus, the first step of log analysis is log parsing, where unstructured log messages are transformed into structured events. An unstructured log message, as in the following example, usually contains various forms of system runtime information: timestamp (records the occurring time of an event), verbosity level (indicates the severity level of an event, e.g., INFO), and raw message content (free-text description of a service operation).

081109 204655 556 INFO dfs.DataNode$PacketResponder: Received block blk_3587508140051953248 of size 67108864 from /10.251.42.84

Traditionally, log parsing relies heavily on regular expressions [12], which are designed and maintained manually by developers. However, this manual method is not suitable for logs generated by modern services for the following three reasons. First, the volume of logs is increasing rapidly, which makes the manual method prohibitive. For example, a large-scale service system can generate 50 GB of logs (120∼200 million lines) per hour [13]. Second, as open-source platforms (e.g., GitHub) and Web services become popular, a system often consists of components written by hundreds of developers globally [3]. Thus, the people in charge of the regular expressions may not know the original logging purpose, which makes manual management even harder. Third, logging statements in modern systems update frequently (e.g., hundreds of new logging statements every month [14]). In order to maintain a correct regular expression set, developers need to check all logging statements regularly, which is tedious and error-prone.

Log parsing has been widely studied to parse raw log messages automatically. Most existing log parsers focus on offline, batch processing. For example, Xu et al. [3] design a method to automatically generate regular expressions based on source code. However, source code is often inaccessible in practice (e.g., for Web service components). For general log parsing, recent studies propose data-driven methods [4], [15], which directly extract log templates from raw log messages. These log parsers are offline, and limited by the memory of a single computer. Besides, they fail to align with the log collecting manner. A typical log collection system has a log shipper installed on each node to forward log entries in a streaming manner to a centralized server that contains a log parser [16]. The offline log parsers need to employ all logs collected over a certain period (e.g., 1 h) for parser training. In contrast, an online log parser parses logs in a streaming manner, and it does not require an offline training step. Thus, current systems highly demand online log parsing, which has only been studied in a few preliminary works [16], [17]. However, we observe that the parsers proposed in these works are not accurate and efficient enough, which makes them not eligible for log parsing in modern Web services or Web service based systems.

In this paper, we propose an online log parsing method, namely Drain, that can accurately and efficiently parse raw log messages in a streaming manner. Drain does not require source code or any information other than raw log messages. Drain can automatically extract log templates from raw log messages and split them into disjoint log groups. It employs a parse tree with fixed depth to guide the log group search process, which effectively avoids constructing a very deep and unbalanced tree. Besides, specially designed parsing rules are compactly encoded in the parse tree nodes. We evaluate Drain on five real-world log data sets with more than 10 million raw log messages. Drain demonstrates the highest accuracy on four data sets, and comparable accuracy on the remaining one. Besides, Drain obtains 51.85%∼81.47% improvement in running time compared with the state-of-the-art online parser [16]. We also demonstrate the effectiveness of Drain in log analysis by tackling a real-world anomaly detection task [3]. In summary, our paper makes the following contributions:

• This paper presents the design of an online log parsing method (Drain), which encodes specially designed parsing rules in a parse tree with fixed depth.
• Extensive experiments have been conducted on five real-world log data sets, which demonstrate the superiority of Drain in terms of accuracy and efficiency.
• The source code of Drain has been publicly released [18], allowing for easy use by researchers and practitioners in future study.

The remainder of this paper is organized as follows. Section II presents an overview of the log parsing process. Section III describes our online log parsing method, Drain. We evaluate the performance of Drain in Section IV. Related work is introduced in Section V. Finally, we conclude this paper in Section VI.

II. OVERVIEW OF LOG PARSING

The goal of log parsing is to transform raw log messages into structured log messages, as described in Figure 1.

(raw log messages)
081109 204608 Receiving block blk_3587 src: /10.251.42.84:57069 dest: /10.251.42.84:50010
081109 204655 PacketResponder 0 for block blk_4003 terminating
081109 204655 Received block blk_3587 of size 67108864 from /10.251.42.84

    -- Log Parsing -->

(structured log messages)
blk_3587   Receiving block * src: * dest: *
blk_4003   PacketResponder * for block * terminating
blk_3587   Received block * of size * from *

Fig. 1: Overview of Log Parsing

Specifically, raw log messages are unstructured data, including timestamps and raw message contents. The raw log messages in Figure 1 are simplified HDFS raw log messages collected on the Amazon EC2 platform [3]. In the parsing process, a parser distinguishes between the constant part and the variable part of each raw log message. The constant part consists of tokens that describe a system operation template (i.e., log event), such as “Receiving block * src: * dest: *” in Figure 1, while the variable part is the remaining tokens (e.g., “blk_3587”) that carry dynamic runtime system information. A typical structured log message contains a matched log event and fields of interest (e.g., the HDFS block ID “blk_3587”). Typical log parsers [4], [15], [16], [17] regard log parsing as a clustering problem, where they cluster raw log messages with the same log event into a log group. The following section introduces our proposed log parser, which clusters the raw log messages into different log groups in a streaming manner.

III. METHODOLOGY

In this section, we briefly introduce Drain, a fixed depth tree based online log parsing method. When a new raw log message arrives, Drain will preprocess it with simple regular expressions based on domain knowledge. Then we search for a log group (i.e., a leaf node of the tree) by following the specially-designed rules encoded in the internal nodes of the tree. If a suitable log group is found, the log message will be matched with the log event stored in that log group. Otherwise, a new log group will be created based on the log message. In the following, we first introduce the structure of the fixed depth tree (i.e., parse tree). Then we explain how Drain parses raw log messages by searching the nodes of the parse tree.

A. Overall Tree Structure

When a raw log message arrives, an online log parser needs to search for the most suitable log group for it, or create a new log group. In this process, a simple solution is to compare the raw log message with the log event stored in each log group one by one. However, this solution is very slow because the number of log groups increases rapidly during parsing. To accelerate this process, we design a parse tree with fixed depth to guide the log group search, which effectively bounds the number of log groups that a raw log message needs to be compared with.

The parse tree is illustrated in Figure 2. The root node is in the top layer of the parse tree; the bottom layer contains the leaf nodes; other nodes in the tree are internal nodes.
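The fixed-depth organization described above can be sketched as a small data model. This is an illustrative assumption on our part (plain Python classes with hypothetical names such as `ParseTree.leaf_groups`), not the released Drain implementation:

```python
from dataclasses import dataclass, field

@dataclass
class LogGroup:
    # A leaf-level cluster: the current template plus the IDs of its members.
    log_event: list[str]                      # token list; "*" marks a variable position
    log_ids: list[int] = field(default_factory=list)

@dataclass
class ParseTree:
    # depth counts the layers above the leaves, as in Figure 2 (depth = 3 there).
    depth: int = 3
    root: dict = field(default_factory=dict)  # nested dicts acting as internal nodes

    def leaf_groups(self, length: int, first_tokens: list[str]) -> list:
        """Follow the 'Length: n' branch, then the first (depth - 2) tokens,
        down to the leaf's list of log groups (created on demand)."""
        node = self.root.setdefault(length, {})
        for tok in first_tokens[: self.depth - 2]:
            node = node.setdefault(tok, {})
        return node.setdefault("#groups", [])

tree = ParseTree(depth=3)
groups = tree.leaf_groups(4, ["Receive"])
groups.append(LogGroup(log_event="Receive from node *".split(), log_ids=[1]))
```

Because the leaf list is created lazily with `setdefault`, searching and inserting share the same traversal, which mirrors how an online parser must handle a log event it has never seen before.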

Root node and internal nodes encode specially-designed rules to guide the search process. They do not contain any log groups. Each path in the parse tree ends with a leaf node, which stores a list of log groups, and we only plot one leaf node here for simplicity. Each log group has two parts: log event and log IDs. The log event is the template that best describes the log messages in this group, which consists of the constant part of a log message. The log IDs record the IDs of the log messages in this group. One special design of the parse tree is that the depth of all leaf nodes is the same and is fixed by a predefined parameter depth. For example, the depth of the leaf nodes in Figure 2 is fixed to 3. This parameter bounds the number of nodes Drain visits during the search process, which greatly improves its efficiency. Besides, to avoid tree branch explosion, we employ a parameter maxChild, which restricts the maximum number of children of a node. In the following, for clarity, we define an n-th layer node as a node whose depth is n. Besides, unless otherwise stated, we use the parse tree in Figure 2 as an example in the following explanation.

Fig. 2: Structure of Parse Tree in Drain (depth = 3). The root node links to 1-st layer internal nodes such as “Length: 4”, “Length: 5”, ..., “Length: 10”; these link to 2-nd layer internal nodes such as “Send”, “Receive”, “Starting”, “*”; each path ends with a leaf node holding a list of log groups (e.g., Log Event: “Receive from node *”, Log IDs: [1, 23, 25, 46, 345, ...]).

B. Step 1: Preprocess by Domain Knowledge

According to our previous empirical study on existing log parsing methods [19], preprocessing can improve parsing accuracy. Thus, before employing the parse tree, we preprocess the raw log message when it arrives. Specifically, Drain allows users to provide simple regular expressions based on domain knowledge that represent commonly-used variables, such as IP addresses and block IDs. Then Drain will remove the tokens matched by these regular expressions from the raw log message. For example, block IDs in Figure 1 will be removed by “blk_[0-9]+”.

The regular expressions employed in this step are often very simple, because they are used to match tokens instead of log messages. Besides, a data set usually requires only a few such regular expressions. For example, the data sets used in our evaluation section require at most two such regular expressions.

C. Step 2: Search by Log Message Length

In this step and step 3, we explain how we traverse the parse tree according to the encoded rules and finally find a leaf node.

Drain starts from the root node of the parse tree with the preprocessed log message. The 1-st layer nodes in the parse tree represent log groups whose log messages are of different log message lengths. By log message length, we mean the number of tokens in a log message. In this step, Drain selects a path to a 1-st layer node based on the log message length of the preprocessed log message. For example, for the log message “Receive from node 4”, Drain traverses to the internal node “Length: 4” in Figure 2. This is based on the assumption that log messages with the same log event will probably have the same log message length. Although it is possible that log messages with the same log event have different log message lengths, this can be handled by simple postprocessing. Besides, our experiments in Section IV-B demonstrate the superiority of Drain in terms of parsing accuracy even without postprocessing.

D. Step 3: Search by Preceding Tokens

In this step, Drain traverses from the 1-st layer node searched in step 2 to a leaf node. This step is based on the assumption that tokens in the beginning positions of a log message are more likely to be constants. Specifically, Drain selects the next internal node by the tokens in the beginning positions of the log message. For example, for the log message “Receive from node 4”, Drain traverses from the 1-st layer node “Length: 4” to the 2-nd layer node “Receive” because the token in the first position of the log message is “Receive”. Then Drain will traverse to the leaf node linked with the internal node “Receive”, and go to step 4.

The number of internal nodes that Drain traverses in this step is (depth − 2), where depth is the parse tree parameter restricting the depth of all leaf nodes. Thus, there are (depth − 2) layers that encode the first (depth − 2) tokens in the log messages as search rules. In the example above, we use the parse tree in Figure 2 for simplicity, whose depth is 3, so we search by only the token in the first position. In practice, Drain can consider more preceding tokens with larger depth settings. Note that if depth is 2, Drain considers only the first layer used by step 2.

In some cases, a log message may start with a parameter, for example, “120 bytes received”. These kinds of log messages can lead to branch explosion in the parse tree, because each parameter (e.g., 120) would be encoded in an internal node. To avoid branch explosion, we only consider tokens that do not contain digits in this step. If a token contains digits, it will match a special internal node “*”. For example, for the log message above, Drain will traverse to the internal node “*” instead of “120”. Besides, we also define a parameter maxChild, which restricts the maximum number of children of a node. If a node already has maxChild children, any non-matched tokens will match the special internal node “*” among all its children.

E. Step 4: Search by Token Similarity

Before this step, Drain has traversed to a leaf node, which contains a list of log groups. The log messages in these log groups comply with the rules encoded in the internal nodes along the path. For example, the log group in Figure 2 has log event “Receive from node *”, where the log messages contain 4 tokens and start with the token “Receive”.

In this step, Drain selects the most suitable log group from the log group list. We calculate the similarity simSeq between the log message and the log event of each log group. simSeq is defined as follows:

    simSeq = ( Σ_{i=1}^{n} equ(seq1(i), seq2(i)) ) / n,    (1)

where seq1 and seq2 represent the log message and the log event respectively; seq(i) is the i-th token of the sequence; n is the log message length of the sequences; and the function equ is defined as follows:

    equ(t1, t2) = 1 if t1 = t2, and 0 otherwise,    (2)

where t1 and t2 are two tokens. After finding the log group with the largest simSeq, we compare it with a predefined similarity threshold st. If simSeq ≥ st, Drain returns the group as the most suitable log group. Otherwise, Drain returns a flag (e.g., None in Python) to indicate that there is no suitable log group.

F. Step 5: Update the Parse Tree

If a suitable log group is returned in step 4, Drain will add the log ID of the current log message to the log IDs in the returned log group. Besides, the log event in the returned log group will be updated. Specifically, Drain scans the tokens in the same positions of the log message and the log event. If the two tokens are the same, we do not modify the token in that token position. Otherwise, we update the token in that token position to a wildcard (i.e., *) in the log event.

If Drain cannot find a suitable log group, it creates a new log group based on the current log message, where the log IDs contain only the ID of the log message and the log event is exactly the log message. Then, Drain will update the parse tree with the new log group. Intuitively, Drain traverses from the root node to the leaf node that should contain the new log group, and adds the missing internal nodes and leaf node accordingly along the path. For example, assume the current parse tree is the tree on the left-hand side of Figure 3, and a new log message “Receive 120 bytes” arrives. Then Drain will update the parse tree to the tree on the right-hand side of Figure 3. Note that the new internal node in the 3-rd layer is encoded as “*” because the token “120” contains digits.

Fig. 3: Parse Tree Update Example (depth = 4). The left tree has a single path Root → “Length: 3” → “Send” → “block”, ending in a leaf with log group (Log Event: “Send block 44”, Log IDs: [1]); in the right tree, a new path Root → “Length: 3” → “Receive” → “*” has been added, ending in a leaf with log group (Log Event: “Receive 120 bytes”, Log IDs: [2]).

IV. EVALUATION

A. Experimental Settings

1) Log Data Sets: The log data sets used in our evaluation are summarized in Table I. These five real-world data sets range from supercomputer logs (BGL and HPC) to distributed system logs (HDFS and Zookeeper) to standalone software logs (Proxifier). Companies rarely release their log data to the public, because it may violate confidentiality clauses. We obtained three log data sets from other researchers with their generous support. Specifically, BGL is a log data set collected by Lawrence Livermore National Labs (LLNL) from the BlueGene/L supercomputer system [20]. HPC logs are collected from a high performance cluster, which has 49 nodes with 6,152 cores and 128GB memory per node [21]. HDFS is a log data set collected from a 203-node cluster on the Amazon EC2 platform in [3]. We also collect two log data sets for evaluation. One is collected from Zookeeper installed on a 32-node cluster in our lab. The other consists of logs of the standalone software Proxifier.

TABLE I: Summary of Log Data Sets

System     Description                             #Log Messages   Log Message Length   #Events
BGL        BlueGene/L Supercomputer                4,747,963       10~102               376
HPC        High Performance Cluster (Los Alamos)   433,490         6~104                105
HDFS       Hadoop File System                      11,175,629      8~29                 29
Zookeeper  Distributed System Coordinator          74,380          8~27                 80
Proxifier  Proxy Client                            10,108          10~27                8

2) Comparison: To prove the effectiveness of Drain, we compare its performance with four existing log parsing methods in terms of accuracy, efficiency and effectiveness on subsequent log mining tasks. Specifically, two of them are offline log parsers, and the other two are online log parsers. The ideas of these log parsers are briefly introduced as follows:

• LKE [4]: This is an offline log parsing method developed by Microsoft. It employs hierarchical clustering and heuristic rules.
• IPLoM [15]: IPLoM conducts a three-step hierarchical partitioning before template generation in an offline manner.
• SHISO [17]: In this online parser, a tree with a predefined number of children in each node is used to guide log group searching.

36
• Spell [16]: This method uses the longest common subsequence to search for log groups in an online manner. It accelerates the searching process by subsequence matching and a prefix tree.

3) Evaluation Metric and Experimental Setup: We use the F-measure [22], [23], which is a typical evaluation metric for clustering algorithms, to evaluate the accuracy of log parsing methods. The definition of accuracy is as follows:

    Accuracy = 2 * Precision * Recall / (Precision + Recall),    (3)

where Precision and Recall are defined as follows:

    Precision = TP / (TP + FP),    (4)

    Recall = TP / (TP + FN),    (5)

where a true positive (TP) decision assigns two log messages with the same log event to the same log group; a false positive (FP) decision assigns two log messages with different log events to the same log group; and a false negative (FN) decision assigns two log messages with the same log event to different log groups. This evaluation metric is also used in our previous study [19] on existing log parsers.

TABLE II: Parameter Setting of Drain

        BGL   HPC   HDFS   Zookeeper   Proxifier
depth   3     4     3      3           4
st      0.3   0.4   0.5    0.3         0.3

We run all experiments on a Linux server with an Intel Xeon E5-2670v2 CPU and 128GB DDR3 1600 RAM, running 64-bit Ubuntu 14.04.2 with Linux kernel 3.16.0. We run each experiment 10 times to avoid bias. For the preprocessing step of Drain (step 1), we remove obvious parameters in the log messages (i.e., IP addresses in HPC, Zookeeper and HDFS, core IDs in BGL, block IDs in HDFS, and application IDs in Proxifier). The parameter setting of Drain is shown in Table II. Besides, we empirically set maxChild to 100 for all experiments. The number of children of a tree node rarely exceeds maxChild, because the encoded rules in the parse tree can already distribute the logs evenly to different paths. We also re-tune the parameters of the other log parsers to optimize their performance, which is not presented here because of the space limit. We put them in our released source code [18] for further reference.

B. Accuracy of Drain

Accuracy demonstrates how well a log parser matches raw log messages with the correct log events. Accuracy is important because parsing errors can degrade the performance of subsequent log mining tasks. Intuitively, an offline log parsing method could obtain higher accuracy compared with an online one, because an offline method enjoys all raw log messages at the beginning of parsing, while an online method adjusts its parsing model gradually in the parsing process.

In this section, we evaluate the accuracy of two offline and two online log parsing methods on the data sets described in Table I. The evaluation results are in Table III. LKE fails to handle the data sets except Proxifier, because its O(n^2) time complexity makes it too slow for the other data sets. Thus, for the other four data sets, as with the existing work [19], [24], we evaluate LKE's accuracy on sample data sets with 2k log messages randomly extracted from the original ones, while all parsers are evaluated on the 2k sample data sets as in our previous paper [19].

TABLE III: Parsing Accuracy of Log Parsing Methods

                      BGL    HPC    HDFS   Zookeeper   Proxifier
Offline Log Parsers
LKE                   0.67   0.17   0.57   0.78        0.85
IPLoM                 0.99   0.65   0.99   0.99        0.85
Online Log Parsers
SHISO                 0.87   0.53   0.93   0.68        0.85
Spell                 0.98   0.82   0.87   0.99        0.87
Drain                 0.99   0.84   0.99   0.99        0.86

We observe that the proposed online parsing method, namely Drain, obtains the best accuracy on four data sets, even compared with the offline log parsing methods. For the data set Proxifier, Drain also has the second best accuracy (i.e., 0.86), and it is comparable to Spell, which obtains the highest accuracy (0.87) on this data set. LKE is not that good on some data sets, because it employs an aggressive clustering strategy, which can lead to under-partitioning. IPLoM obtains high accuracy on most data sets because of its specially-designed heuristic rules. SHISO uses the similarity of characters in log messages to search for the corresponding log events. This strategy is too coarse-grained, which causes inaccuracy. Spell is accurate, but its strategy, based only on the longest common subsequence, can lead to under-partitioning.

Drain has the overall best accuracy for three reasons. First, it compounds both the log message length and the first few tokens, which are effective and specially-designed rules, to construct the fixed depth tree. Second, Drain only uses tokens that do not contain digits to guide the searching process, which effectively avoids over-partitioning. Third, the tunable tree depth and similarity threshold st allow users to conduct fine-grained tuning on different data sets.

TABLE IV: Running Time (Sec) of Log Parsing Methods

                      BGL        HPC      HDFS      Zookeeper   Proxifier
Offline Log Parsers
LKE                   N/A        N/A      N/A       N/A         8888.49
IPLoM                 140.57     12.74    333.03    2.17        0.38
Online Log Parsers
SHISO                 10964.55   582.14   6649.23   87.61       8.41
Spell                 447.14     47.28    676.45    5.27        0.87
Drain                 115.96     8.76     325.7     1.81        0.27
Improvement           74.07%     81.47%   51.85%    65.65%      68.97%

C. Efficiency of Drain

To evaluate the efficiency of Drain, we measure the running time of Drain and the four existing log parsers on the five real-world log data sets described in Table I. In Table IV, we demonstrate the running time of these log parsers. LKE fails to handle

four data sets in a reasonable time (i.e., days or weeks), so we mark the corresponding results as not available.

Considering the online parsing methods, SHISO takes too much time on some data sets (e.g., more than 3 h on BGL). This is mainly because SHISO only limits the number of children of its tree nodes, which can cause a very deep parse tree. Spell obtains better efficiency performance, because it employs a prefix tree structure to store all log events found, which greatly reduces its running time. However, Spell does not restrict the depth of its prefix tree either, and it calculates the longest common subsequence between two log messages, which is time consuming. Compared with the existing online parsing methods, our proposed Drain requires the least running time on all five data sets. Specifically, Drain only needs 2 min to parse 4m BGL log messages and 6 min to parse 10m HDFS log messages. Drain greatly improves the running time of existing online parsing methods. The improvements on the five real-world data sets are at least 51.85%, and it reduces the running time by 81.47% on HPC. Drain also outperforms the existing offline log parsing methods. It requires less running time than IPLoM on all five data sets. Moreover, as an online log parsing method, Drain is not limited by the memory of a single computer, which is the bottleneck of most offline log parsing methods. For example, IPLoM needs to load all log messages into computer memory, and it will construct extra data structures of comparable size at runtime. Thus, although IPLoM is efficient too, it may fail to handle large-scale log data. Drain is not limited by the memory of a single computer, because it processes the log messages one by one.

TABLE V: Log Size of Sample Datasets for Efficiency Experiments

BGL        400   4k     40k    400k   4m
HPC        600   3k     15k    75k    375k
HDFS       1k    10k    100k   1m     10m
Zookeeper  4k    8k     16k    32k    64k
Proxifier  600   1200   2400   4800   9600

Because the log size of modern systems is rapidly increasing, a log parsing method is expected to handle large-scale log data. Thus, to simulate the increase of log size, we also measure the running time of these log parsers on 25 sampled log data sets with varying log sizes (i.e., numbers of log messages) as described in Table V. The log messages in these sampled data sets are randomly extracted from the real-world data sets in Table I.

The evaluation results are illustrated in Figure 4, which is in logarithmic scale. In this figure, we observe that, compared with other methods, the running time of LKE rises faster as

enjoys linear time complexity. The time complexity of Drain is O((d + cm)n), where d is the depth of the parse tree, c is the number of candidate log groups in the leaf node, m is the log message length, and n is the number of log messages. Obviously, d and m are constants. c can also be regarded as a constant, because the quantity of candidate log groups in each leaf node is nearly the same, and the number of log groups is far less than that of log messages. Thus, the time complexity of Drain is O(n). For SHISO and Spell, the depth of the parse tree could increase during the parsing process. Second, we use the specially-designed simSeq to calculate the similarity between a log message and a log event candidate. Its time complexity is O(m1 + m2), where m1 and m2 are the numbers of tokens in them respectively. In Drain, m1 = m2. By comparison, SHISO and Spell calculate the longest common subsequence between two sequences, whose time complexity is O(m1 m2).

D. Effectiveness of Drain on Real-World Anomaly Detection Task

In the previous sections, we demonstrated the superiority of Drain in terms of accuracy and efficiency. Although high accuracy is necessary for log parsing methods, it does not guarantee good performance in the subsequent log mining task. For example, because log mining could be sensitive to some critical events, a small parsing error may cause an order of magnitude performance degradation in log mining [19]. To evaluate the effectiveness of Drain on subsequent log mining tasks, we conduct a case study on a real-world anomaly detection task.

We use the HDFS log data set in this case study. Specifically, the raw log messages in the HDFS data set [3] record system operations on 575,061 HDFS blocks with a total of 29 log event types. Among these blocks, 16,838 are manually labeled as anomalies by the original authors. In the original paper [3], the authors employ Principal Component Analysis (PCA) to detect these anomalies. Next, we will briefly introduce the anomaly detection workflow, including log parsing and log mining. In the log parsing step, all the raw log messages are parsed into structured log messages. Each structured log message contains the corresponding HDFS block ID and a log event. A source code-based log parsing method is used in the original paper, which is not discussed here because source code is inaccessible in many cases (e.g., in third party libraries). In log mining, we first use the structured log messages to generate an event count matrix, where each row represents an HDFS block; each column represents a log event type; each cell counts the occurrence of an event on a certain
the log size increases. Because the time complexity of LKE HDFS block. Then we use TF-IDF [25] to preprocess the
is O(n2 ), and the time complexity of other methods is O(n), event count matrix. Intuitively, TF-IDF gives lower weights to
while n is the number of log messages. IPLoM is comparable common event types, which are less likely to contribute to the
to Drain, but it requires substantial amounts of memory as anomaly detection process. Finally, the event count matrix is
explained above. Online parsing methods (i.e., SHISO, Spell, fed into PCA, which automatically marks the blocks as normal
Drain) process log message one by one, and they all use a or abnormal.
parse tree to accelerate the log event search process. Drain is In our case study, we evaluate the performance of the
faster than others because of two main reasons. First, Drain anomaly detection task with different log parsing methods

38
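The log mining steps just described (event count matrix, TF-IDF weighting, then PCA) can be sketched as follows. This is a minimal illustration with hypothetical toy data; the original study's exact TF-IDF variant and PCA setup may differ:

```python
import math
from collections import Counter

# Hypothetical toy data: (HDFS block ID, log event type) pairs, as
# produced by a log parser. The real study covers 575,061 blocks and
# 29 event types; three blocks and three event types suffice here.
parsed = [
    ("blk_1", "E1"), ("blk_1", "E2"), ("blk_1", "E2"),
    ("blk_2", "E1"), ("blk_2", "E2"),
    ("blk_3", "E1"), ("blk_3", "E5"),   # E5 occurs in one block only
]

blocks = sorted({b for b, _ in parsed})
events = sorted({e for _, e in parsed})

# Event count matrix: one row per block, one column per event type,
# each cell counting occurrences of that event on that block.
counts = {b: Counter() for b in blocks}
for b, e in parsed:
    counts[b][e] += 1
matrix = [[counts[b][e] for e in events] for b in blocks]

# IDF-style weighting: an event type that appears in most blocks gets
# a low weight, so common events contribute little to the detector.
n_blocks = len(blocks)
idf = {e: math.log(n_blocks / sum(1 for b in blocks if counts[b][e]))
       for e in events}
weighted = [[cell * idf[e] for cell, e in zip(row, events)]
            for row in matrix]
```

The weighted matrix would then be fed to a PCA-based detector, which flags blocks whose event profile deviates from the dominant normal pattern; any off-the-shelf PCA routine can play that role.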
Fig. 4: Running Time of Log Parsing Methods on Data Sets of Different Sizes
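The complexity gap behind these running time trends can be made concrete. Below is a sketch (with illustrative tokens; the wildcard handling is an assumption, not the paper's exact definition) contrasting a position-wise similarity in the spirit of simSeq, which is linear in the number of tokens and applicable because Drain compares equal-length sequences, with the dynamic-programming longest common subsequence used in SHISO/Spell-style matching, which is quadratic:

```python
def sim_seq(seq1, seq2):
    """Position-wise similarity of two equal-length token sequences:
    O(m) comparisons for m tokens."""
    assert len(seq1) == len(seq2)
    if not seq1:
        return 1.0
    same = sum(1 for t1, t2 in zip(seq1, seq2) if t1 == t2)
    return same / len(seq1)

def lcs_len(seq1, seq2):
    """Longest common subsequence length by dynamic programming:
    O(m1 * m2) table cells for sequences of m1 and m2 tokens."""
    m1, m2 = len(seq1), len(seq2)
    dp = [[0] * (m2 + 1) for _ in range(m1 + 1)]
    for i in range(1, m1 + 1):
        for j in range(1, m2 + 1):
            if seq1[i - 1] == seq2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m1][m2]

# Illustrative message and template; "<*>" marks a wildcard position.
message = "Receiving block blk_3587 src: /10.0.0.1".split()
template = "Receiving block <*> src: <*>".split()
print(sim_seq(message, template))  # 3 of 5 positions match -> 0.6
```

For the million-message data sets used in the efficiency experiments, this per-comparison difference compounds into the running time gaps observed in Figure 4.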

TABLE VI: Anomaly Detection with Different Log Parsing Methods (16,838 True Anomalies)

              Parsing   Reported  Detected       False
              Accuracy  Anomaly   Anomaly        Alarm
IPLoM         0.99      10,998    10,720 (63%)   278 (2.5%)
SHISO         0.93      13,050    11,143 (66%)   1,907 (14.6%)
Spell         0.87      10,949    10,674 (63%)   275 (2.5%)
Drain         0.99      10,998    10,720 (63%)   278 (2.5%)
Ground truth  1.00      11,473    11,195 (66%)   278 (2.4%)

The experimental results are shown in Table VI. In this table, reported anomaly is the number of anomalies reported by the PCA model; detected anomaly is the number of true anomalies reported; false alarm is the number of wrongly reported ones. We use four existing log parsing methods to handle the parsing step of this anomaly detection task. We do not use LKE because it cannot handle this large amount of data. Ground truth is the experiment using the exactly correct parsing results.

We can observe that Drain obtains nearly the optimal anomaly detection performance. It detects 10,720 true anomalies with only 278 false alarms. Although 37% of the anomalies are not detected, this is caused by the log mining step: even when all the log messages are correctly parsed, the log mining model still leaves 34% of the anomalies at large. Note that although IPLoM demonstrates the same anomaly detection performance as Drain, their parsing results are different. We also observe that SHISO, although it has high parsing accuracy (0.93), does not perform well in this anomaly detection task. By using SHISO, we would report 1,907 false alarms, which is about 6 times worse than the other methods. This would largely increase the workload of developers, because they usually need to check the reported anomalies manually. Among the online parsing methods, Drain not only has the highest parsing accuracy, as demonstrated in Section IV-B, but also obtains nearly optimal performance in the anomaly detection case study.

V. RELATED WORK

Log Analysis for Service Management. Logs, which record system runtime information, are in widespread use for service management tasks, such as business model mining [8], [9], user behavior analysis [10], [11], anomaly detection [3], [4], [26], fault diagnosis [5], [6], and performance improvement [7]. Log parsing is a critical step to enable automated and effective log analysis [19], because most of these techniques require structured log messages as input. Thus, we believe our proposed online parsing method can benefit these techniques and future studies on log analysis.

Log Parsing. Log parsing has been widely studied in recent years. Xu et al. [3] design a source code based log parser that achieves high accuracy. However, source code is often inaccessible in practice (e.g., for Web service components). Other work proposes data-driven approaches (LKE [4], IPLoM [15], SHISO [17], Spell [16]), in which data mining techniques are employed to extract log templates and split raw log messages into different log groups accordingly. Specifically, LKE and IPLoM are offline log parsers, which are studied in our previous evaluation study on offline log parsers [19]. SHISO and Spell are online log parsers, which parse log messages in a streaming manner and are not limited by the memory of a single computer. In this paper, we propose an online log parser, namely Drain, that greatly outperforms the existing online log parsers in terms of both accuracy and efficiency. It even performs better than the state-of-the-art offline parsers.
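As a rough illustration of such streaming parsing with a fixed depth search structure, the sketch below routes each incoming message by its token count and leading token, then merges it into the most similar existing group. All names, thresholds, and the similarity rule are illustrative assumptions, not the authors' released implementation:

```python
DEPTH_TOKENS = 1      # leading tokens used for routing (illustrative)
SIM_THRESHOLD = 0.5   # minimum similarity to join an existing group

# Index: (token count, leading tokens) -> list of log groups.
# Keeping the key depth fixed bounds the search cost per message.
tree = {}

def similarity(tokens, template):
    # A position already generalized to "<*>" matches any token.
    same = sum(1 for t, u in zip(tokens, template) if u == "<*>" or t == u)
    return same / len(tokens)

def parse(message):
    """Consume one log message and update the groups in place."""
    tokens = message.split()
    key = (len(tokens), tuple(tokens[:DEPTH_TOKENS]))
    groups = tree.setdefault(key, [])
    best = max(groups, key=lambda g: similarity(tokens, g["template"]),
               default=None)
    if best is not None and similarity(tokens, best["template"]) >= SIM_THRESHOLD:
        # Positions that disagree become wildcards in the template.
        best["template"] = [u if t == u else "<*>"
                            for t, u in zip(tokens, best["template"])]
    else:
        groups.append({"template": tokens})

for line in [
    "Receiving block blk_1 src /10.0.0.1",
    "Receiving block blk_2 src /10.0.0.2",
    "Deleting block blk_1 file /tmp/a",
]:
    parse(line)

templates = sorted(" ".join(g["template"])
                   for gs in tree.values() for g in gs)
```

Because the routing key has bounded depth, each message only touches a small candidate list, which is what keeps the per-message cost low in this style of parser.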
Reliability of Web Service Systems. Many recent studies focus on enhancing the reliability of Web service systems. Cubo et al. [27] use dynamic software product lines to reconfigure service failures dynamically. Service selection and recommendation are also widely studied [28], [29]; these studies usually employ QoS (quality of service) values to characterize the reliability of different Web services. Jurca et al. [30] propose a reliable QoS monitoring technique based on client feedback. Yao et al. [31] develop a model with accountability for business and QoS compliance. Besides, Chen et al. [32] propose a performance prediction method for component-based applications. Our proposed online log parser is critical for log analysis techniques, which can complement these methods in reliability enhancement for Web service systems. Log analysis methods can also improve the reliability of many existing service systems [33], [34], [35].

VI. CONCLUSION

Log parsing is critical for log analysis based Web service management techniques. This paper proposes an online log parsing method, namely Drain, that parses raw log messages in a streaming manner. Drain adopts a fixed depth parse tree, which encodes specially designed rules in its tree nodes, to accelerate the log group search process. To evaluate the effectiveness of Drain, we conduct experiments on five real-world log data sets. The experimental results show that Drain greatly outperforms existing online log parsers in terms of accuracy and efficiency. Drain even obtains better performance than the state-of-the-art offline log parsers, which are limited by the memory of a single computer. Besides, we conduct a case study on a real-world anomaly detection task, which demonstrates the effectiveness of Drain on log analysis tasks.

ACKNOWLEDGMENT

The work described in this paper was supported by the National Natural Science Foundation of China (Project No. 61332010 and 61472338), the National Basic Research Program of China (973 Project No. 2014CB347701), and the Research Grants Council of the Hong Kong Special Administrative Region, China (No. CUHK 14234416 of the General Research Fund).

REFERENCES

[1] Amazon EC2. [Online]. Available: https://aws.amazon.com/tw/ec2/
[2] R. Ding, H. Zhou, J. Lou, H. Zhang, Q. Lin, Q. Fu, D. Zhang, and T. Xie, "Log2: A cost-aware logging mechanism for performance diagnosis," in ATC'15: Proc. of the USENIX Annual Technical Conference, 2015.
[3] W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, "Detecting large-scale system problems by mining console logs," in SOSP'09: Proc. of the ACM Symposium on Operating Systems Principles, 2009.
[4] Q. Fu, J. Lou, Y. Wang, and J. Li, "Execution anomaly detection in distributed systems through unstructured log analysis," in ICDM'09: Proc. of the International Conference on Data Mining, 2009.
[5] W. E. Wong, V. Debroy, R. Golden, X. Xu, and B. Thuraisingham, "Effective software fault localization using an RBF neural network," TR'12: IEEE Transactions on Reliability, 2012.
[6] D. Q. Zou, H. Qin, and H. Jin, "UiLog: Improving log-based fault diagnosis by log analysis," Journal of Computer Science and Technology, vol. 31, no. 5, pp. 1038–1052, 2016.
[7] Y. Sun, H. Li, I. G. Councill, J. Huang, W. C. Lee, and C. L. Giles, "Personalized ranking for digital libraries based on log analysis," in WIDM'08: Proc. of the 10th ACM Workshop on Web Information and Data Management, 2008, pp. 133–140.
[8] H. J. Cheng and A. Kumar, "Process mining on noisy logs: can log sanitization help to improve performance?" Decision Support Systems, vol. 79, pp. 138–149, 2015.
[9] H. R. Motahari-Nezhad, R. Saint-Paul, B. Benatallah, and F. Casati, "Deriving protocol models from imperfect service conversation logs," TKDE'08: IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 12, pp. 1683–1698, 2008.
[10] X. Yu, M. Li, I. Paik, and K. H. Ryu, "Prediction of web user behavior by discovering temporal relational rules from web log data," in DEXA'12: Proc. of the 23rd International Conference on Database and Expert Systems Applications, 2012, pp. 31–38.
[11] N. Poggi, V. Muthusamy, D. Carrera, and R. Khalaf, "Business process mining from e-commerce web logs," in Business Process Management, 2013, pp. 65–80.
[12] D. Lang, "Using SEC," USENIX ;login: Magazine, vol. 38, 2013.
[13] H. Mi, H. Wang, Y. Zhou, M. R. Lyu, and H. Cai, "Toward fine-grained, unsupervised, scalable performance diagnosis for production cloud computing systems," IEEE Transactions on Parallel and Distributed Systems, vol. 24, pp. 1245–1255, 2013.
[14] W. Xu, "System problem detection by mining console logs," Ph.D. dissertation, University of California, Berkeley, 2010.
[15] A. Makanju, A. Zincir-Heywood, and E. Milios, "A lightweight algorithm for message type extraction in system application logs," TKDE'12: IEEE Transactions on Knowledge and Data Engineering, 2012.
[16] M. Du and F. Li, "Spell: Streaming parsing of system event logs," in ICDM'16: Proc. of the 16th International Conference on Data Mining, 2016.
[17] M. Mizutani, "Incremental mining of system log format," in SCC'13: Proc. of the 10th International Conference on Services Computing, 2013.
[18] Drain source code. [Online]. Available: http://appsrv.cse.cuhk.edu.hk/~pjhe/Drain.py
[19] P. He, J. Zhu, S. He, J. Li, and M. R. Lyu, "An evaluation study on log parsing and its use in log mining," in DSN'16: Proc. of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2016.
[20] A. Oliner and J. Stearley, "What supercomputers say: A study of five system logs," in DSN'07, 2007.
[21] L. A. N. S. LLC. Operational data to support and enable computer science research. [Online]. Available: http://institutes.lanl.gov/data/fdata
[22] C. Manning, P. Raghavan, and H. Schutze, Introduction to Information Retrieval. Cambridge University Press, 2008.
[23] Evaluation of clustering. [Online]. Available: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
[24] L. Tang, T. Li, and C. Perng, "LogSig: Generating system events from raw textual logs," in CIKM'11: Proc. of the ACM International Conference on Information and Knowledge Management, 2011.
[25] G. Salton and C. Buckley, "Term weighting approaches in automatic text retrieval," Cornell University, Tech. Rep., 1987.
[26] W. Zhang, F. Bastani, I. L. Yen, K. Hulin, F. Bastani, and L. Khan, "Real-time anomaly detection in streams of execution traces," in HASE'12: Proc. of the 14th International Symposium on High-Assurance Systems Engineering, 2012, pp. 32–39.
[27] J. Cubo, N. Gamez, E. Pimentel, and L. Fuentes, "Reconfiguration of service failures in DAMASCo using dynamic software product lines," in SCC'15: Proc. of the 12th International Conference on Services Computing, 2015, pp. 114–121.
[28] S. Y. Hwang, W. P. Liao, and C. H. Lee, "Web services selection in support of reliable web service choreography," in ICWS'10: Proc. of the 17th International Conference on Web Services, 2010, pp. 115–122.
[29] S. Meng, Z. Zhou, T. Huang, D. Li, S. Wang, F. Fei, W. Wang, and W. Dou, "A temporal-aware hybrid collaborative recommendation method for cloud service," in ICWS'16: Proc. of the 23rd International Conference on Web Services, 2016, pp. 252–259.
[30] R. Jurca, B. Faltings, and W. Binder, "Reliable QoS monitoring based on client feedback," in WWW'07: Proc. of the 16th International Conference on World Wide Web, 2007, pp. 1003–1012.
[31] J. Yao, S. Chen, C. Wang, D. Levy, and J. Zic, "Modelling collaborative services for business and QoS compliance," in ICWS'11: Proc. of the 18th International Conference on Web Services, 2011, pp. 299–306.
[32] S. Chen, Y. Liu, I. Gorton, and A. Liu, "Performance prediction of component-based applications," JSS'05: Journal of Systems and Software, vol. 74, no. 1, pp. 35–43, 2005.
[33] A. Iwai and M. Aoyama, "Automotive cloud service systems based on service-oriented architecture and its evaluation," in CLOUD'11: Proc. of the 4th International Conference on Cloud Computing, 2011.
[34] J. Zhang, B. Iannucci, M. Hennessy, K. Gopal, S. Xiao, S. Kumar, D. Pfeffer, B. Aljedia, Y. Ren, M. Griss, S. Rosenberg, J. Cao, and A. Rowe, "Sensor data as a service: a federated platform for mobile data-centric service development and sharing," in SCC'13: Proc. of the 10th International Conference on Services Computing, 2013.
[35] Y. Duan, G. Fu, N. Zhou, X. Sun, N. C. Narendra, and B. Hu, "Everything as a service (XaaS) on the cloud: origins, current and future trends," in CLOUD'15: Proc. of the 8th International Conference on Cloud Computing, 2015, pp. 621–628.