2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)

Amity University, Noida, India. June 4-5, 2020

A Survey on Log Anomaly Detection using Deep Learning

Rakesh Bahadur Yadav, P Santosh Kumar, Sunita Vikrant Dhavale
Department of Computer Science and Engineering
Defence Institute of Advanced Technology
Pune, India

978-1-7281-7016-9/20/$31.00 ©2020 IEEE

Authorized licensed use limited to: Middlesex University. Downloaded on November 06,2020 at 14:18:26 UTC from IEEE Xplore. Restrictions apply.

Abstract—Logs generated by security systems, network devices, servers, and software applications are one of the main ways to record the operational events of equipment and software. These logs are assets for extracting meaningful information about system behavior. The increasing use of computing devices and the evolution of software systems have driven growing interest in log analysis, and the massive volume of unstructured data makes automatic analysis of these logs a necessity. Log analysis is helpful for understanding system behavior, detecting malfunctions, security scanning, and failure prediction. Machine Learning (ML) and Deep Learning (DL) methods have proved to be potent tools for data classification problems and have been applied in various fields of research. The purpose of this survey is to review recent research on log anomaly detection using deep neural networks. The survey also gives a brief account of log parsing approaches, the types of datasets used for log analysis, and various concepts proposed for log anomaly detection.

Keywords—Log Anomaly, Deep Learning, LSTM, CNN, Autoencoder, Log Parsing

I. INTRODUCTION

Anomaly detection in system logs has become critical for large enterprises as systems and applications grow more sophisticated and generate large event logs. Security systems are subject to more bugs and vulnerabilities, which an attacker may exploit to launch an attack. Researchers are working on efficient analysis of log data and exploring the possibility of identifying threats in logs in time, before they are activated. Log anomaly detection based on traditional methods is no longer useful because attacks are becoming more sophisticated. So far, data mining and machine learning approaches such as Decision Tree (DT), Support Vector Machine (SVM), and Principal Component Analysis (PCA) have been used for extracting more relevant features. These approaches give better accuracy and, at the same time, reduce complexity. However, analyzing the concealed relationships among the extracted features remains difficult with these approaches. More sophisticated methods such as deep learning overcome this limitation.

In the last few years, log anomaly detection using deep learning approaches with NLP techniques has achieved better accuracy by harnessing semantic relationships in logs. Mengying Wang et al. [1], Min Du et al. [2], Wang et al. [3], and Meng et al. [4] have achieved higher accuracy in log anomaly detection by using Long Short-Term Memory (LSTM) networks. Siyang Lu et al. [5] used a Convolutional Neural Network (CNN) based deep learning model to achieve an accuracy of 99 percent. Amir Farzad et al. [6] used an autoencoder [7] for feature extraction, followed by further DL models for identifying anomalies. Zhang et al. [8] and Brown et al. [9] used attention mechanisms with deep learning models to give more weight to particular parts of a data sequence. The lack of a detailed study of the research carried out on log datasets, and of the deep learning approaches applied to these datasets for log anomaly detection, is slowing further research in this area. To the best of our knowledge, no paper is available that studies the various methods explored for log anomaly detection using DL. In this paper, we survey and present a comparative analysis of the available datasets, log parsing methods, and deep learning models used for log anomaly detection.

II. BACKGROUND

A. Feature Extraction

1) TF-IDF: Term frequency-inverse document frequency (TF-IDF) [10] is an extensively used method for feature extraction. It is a metric that reflects how important a word is to a document in a corpus. TF-IDF gives little importance to words that are very common across the entire corpus, e.g., “a”, “the”, and “of”, and gives more importance to words that occur frequently in a particular document but are not very common in the entire corpus.

2) Word Embedding: Mikolov et al. [11] introduced the word2vec model to compute a dense vector representation of words that captures their semantics. Word2vec presented two novel models for transforming a word into a word embedding. The first model, Continuous Bag Of Words (CBOW), predicts the central word from the surrounding words within a given window size. The second model, skip-gram, predicts the surrounding words from the central word. The foremost advantage of word embeddings is that different words used in similar contexts remain in proximity in the vector space. Furthermore, we can apply basic algebraic operations on word embeddings to get interpretable results. For example, the difference between the vectors of “man” and “woman” is similar to the difference between the vectors of “king” and “queen”. This means that if we add the vectors of “king” and “woman” and subtract the vector of “man”, we get approximately the vector of “queen”. Word embeddings have proved to be very useful in many NLP applications, such as sentiment analysis, parsing, and anomaly detection.

B. Machine Learning

1) Decision Tree: The Decision Tree (DT) is a supervised learning technique used for classification and prediction in many areas, such as data mining, information extraction, and machine learning. A DT can be depicted in a simple tree-like graphical representation, and the resulting decision can be explained easily. A leaf node specifies the class of the instances. Instances are classified by sorting them down the tree from the root to some leaf node.

2) Support Vector Machine: In a Support Vector Machine (SVM), different classes of instances are separated by drawing a hyperplane in a high-dimensional space. When there is no apparent separator between the classes, SVM works by moving the data into a relatively high-dimensional space where a hyperplane can classify the observations. To transform the data from a low-dimensional to a higher-dimensional space, SVM uses kernel functions, which systematically find the support vector classifier in the high-dimensional space.

3) PCA: PCA is a widely used method for dimensionality reduction. PCA transforms high-dimensional data into low-dimensional data without losing important information. The basic idea of PCA is to remove redundant data and highly correlated features while retaining significant features.

C. Deep Learning

1) RNN: A Recurrent Neural Network (RNN) is an artificial neural network that can capture sequential or temporal information. It has a memory that stores the previous output, which is fed back as input, like a loop, for making predictions. RNNs can look back only a few steps because this memory is limited.

2) LSTM: Long Short-Term Memory (LSTM) [12] networks are a type of RNN that can capture long-term temporal dependencies over sequences. LSTMs have shown promising results in various Natural Language Processing (NLP) tasks such as sentiment analysis, text classification, and log anomaly detection.

3) Bi-LSTM: Bidirectional LSTM (Bi-LSTM) is an extension of LSTM that splits the hidden layer of neurons of a regular LSTM into two opposite directions, i.e., forward and backward. Bi-LSTM can capture the information in a log sequence from both input directions.

4) Autoencoder: An autoencoder [7] is an unsupervised neural network technique. It compresses and encodes the data and then reconstructs, from the encoded data, an output that is close to the original input. Autoencoders are used for dimensionality reduction, similar to PCA; moreover, they can also learn non-linear transformations through a non-linear activation function.

III. CHALLENGES IN LOG ANALYSIS

Logs are a collection of ordered statements captured as evidence of the instructions a system executes to perform a task. The sources of logs are spread across every entity of the information technology infrastructure, i.e., network devices, security devices, servers, storage, etc. This diversified nature of logs creates many challenges for their processing.

A. Unstructured data

Logs are mainly in an unstructured or semi-structured format, which varies across devices, operating systems, software versions, and OEMs. There is no defined formal structure or syntax for log files. Centralized collection and processing of all this data is another big challenge.

B. Instability

Xu Zhang et al. [8] acknowledged the problem of log instability. They identified a few reasons for it, such as the evolution of logging statements through source code modification and the noise introduced during log data processing. There is no fixed rule for which distinct logs, or how many logs, are generated for any task.

C. Log burst

The volume of log data is increasing manyfold due to the increasing emphasis on security and intelligence in devices and software. The centralized collection of all logs for storage, correlation, and processing is making this a big data problem. As a result, the log burst is another problem for log analysis and for extracting information from raw log data.

D. Availability of public dataset

Logs are unstructured most of the time, and their contents are sensitive; hence, given security concerns, they cannot be disclosed publicly. This limits the availability of public datasets for research work.

The above discussion also supports the fact that it is infeasible for humans to comprehend log data and perform anomaly detection on it manually.

IV. LOG ANOMALY DATASETS

The type of log data source profoundly influences the procedure for log anomaly detection. Log data can be gathered from independent devices, software, or servers, from supercomputers, and from distributed systems, while Security Information and Event Management (SIEM) systems collect a variety of logs from different systems in one place. Jieming Zhu et al. [13] worked with sixteen log datasets from different categories of systems and assessed the performance of 13 log parsers. Log datasets can be categorized by the source of the logs and the volume of log entries as supercomputer logs, distributed system logs, software system logs, etc. A summary of the log datasets and their main features is given in Table I.

High Performance Cluster (HPC) logs are from the supercomputer setup of 49 nodes at Los Alamos National Laboratory, with each node configured with 6,152 cores and 128GB of memory.

Amir Farzad et al. [6] use four datasets, i.e., BGL, Thunderbird, OpenStack, and IMDB [22], for their experiments. They used the IMDB dataset to demonstrate that their proposed model generalizes to text classification tasks. The IMDB dataset consists of 50,000 movie review sentences, with equal numbers that are positive and negative.

Xu Zhang et al. [8] proposed a model to overcome the issue of instability in log data for log anomaly detection. They worked with the HDFS dataset and also created an unstable testing dataset from the original HDFS dataset by inserting some unstable events and log sequences. They also use the Service X dataset from Microsoft, which is real-world industrial log data.

The work of Xiaojuan Wang et al. [3] is based on one month of log data (1.09 GB, 18727517 log entries) obtained from a NetEngine40E series router installed in a real network.

V. METHODOLOGY

Candace Suh-Lee et al. [23] proposed extending text processing and Natural Language Processing (NLP) techniques to unstructured log data. The work identifies some properties of log data relative to ordinary text data:

• log entries repeat a large portion of their messages
• fewer natural language words are used
• the vocabulary size for log data is large, as it contains error codes, status codes, and numbers in different formats.

Hence, text processing techniques may be useful only with specific processing steps and methods. Preprocessing and manual labeling are always a point of discussion and a challenge for researchers because labeling is a costly activity for voluminous log data. Preprocessing of log data, feature extraction, and applying techniques to the processed data are the core steps in anomaly detection from a massive volume of log data. Understanding the log format and applying domain knowledge is always a valuable addition to anomaly detection, and it improves accuracy.

Log Parsing: A text log message can be viewed in two parts, i.e., a fixed portion and variable portions. The fixed text remains the same across messages, while the variable parts contain different values depending on the parameters. This observation leads to the concepts of event and template for log messages. Shilin He et al. [24] worked with system log analysis and some log parsing tools. Parsing methods are categorized as clustering-based (e.g., LKE [25], LogSig [26]) and heuristic-based (e.g., iPLoM [27], SLCT [28]). Pinjia He et al. [17] propose a parallel log parser (POP) and compare its accuracy with other log parsers. Log parsing produces structured data from unstructured log data.

Next, once structured data has been obtained through preprocessing, the data should be processed as required by the model to be used, i.e., machine learning or deep learning. TF-IDF [10] and word embedding techniques are useful for processing textual data. TF-IDF is a statistical technique based on the frequency of the words that appear in a document. Word embeddings, on the other hand, are successful in capturing semantic information, which is more relevant for log anomaly detection. Text logs contain words from natural languages, and word embedding is an NLP technique for feature extraction that transforms words into vectors of real numbers.

T.F. Yen et al. [29] preprocessed the log data using domain knowledge and normalized it in terms of timestamp, IP-address-to-host mapping, static IP address assignment, etc., as the logs are from DHCP servers, and policy-, host-, and traffic-based features are extracted.

This application of domain knowledge to log data increases accuracy, but the methods cannot be generalized. M. Du et al. [2], Weibin Meng et al. [4], and Xu Zhang et al. [8] created log event templates and log sequences using the log parsers Spell [30], FT-Tree [31], and Drain [32], respectively. Further, a variety of approaches are available depending on the problem statement. Approaches for anomaly detection are statistical, machine learning, or deep learning, depending on the modeling technique used. Anomalies can be analyzed in the sequence of logs and also in the semantics of logs.

VI. DEEP LEARNING AND LOG ANOMALY DETECTION

TABLE I. SUMMARY OF LOG ANOMALY DATASETS

Dataset Name | Log Source                     | No. of Messages | Log Source Type     | References
HDFS         | Hadoop Distributed File System | 11175629        | Distributed System  | [14], [15], [2], [8], [16], [17], [5], [4]
Spark        | Spark Job                      | 33236604        | Distributed System  | [18]
ZooKeeper    | ZooKeeper Service              | 74380           | Distributed System  | [17]
OpenStack    | OpenStack Software             | 207820          | Distributed System  | [2], [6], [19]
BGL          | Blue Gene/L supercomputer      | 4747963         | Supercomputer       | [20], [6], [17], [4]
HPC          | High Performance Cluster       | 433489          | Supercomputer       | [21]
Thunderbird  | Thunderbird Supercomputer      | 211212192       | Supercomputer       | [20], [6], [1]
Proxifier    | Proxifier Software             | 21329           | Standalone Software | [17]
Router Logs  | NetEngine40E Router            | 18727517        | Router              | [3]

TABLE II. SUMMARY OF CHALLENGES ADDRESSED AND METHODS USED BY VARIOUS AUTHORS ON LOG ANOMALY DETECTION

Year | Citation | Challenges Addressed | DL Model | Data Set | NLP / Other Method | Pre-Processing / Parsing
2019 | Xiaojuan Wang [3] | Extracted time-period anomaly and causes for surge of logs; also represented semantics. | LSTM | NetEngine40E Router Log | Directed Graph | Parsed on behavior type
2019 | Xu Zhang [8] | Instability of log data; worked for contextual information of log sequences. | Bi-LSTM with Attention | HDFS, other security system of Microsoft | FastText [36] | Drain [32]
2019 | Weibin Meng [4] | To detect sequential and quantitative anomalies simultaneously; worked for the semantic relation of logs lost when only log templates are used. | LSTM | BGL, HDFS | Template2Vec [4] | FT-Tree [31]
2019 | Amir Farzad [6] | LSTM and BiLSTM models with autoencoders are used for log message classification and anomaly detection. | Auto-LSTM, Auto-BLSTM, Auto-GRU | BGL, IMDB, OpenStack, Thunderbird | Word Frequency | --
2018 | Siyang Lu [5] | Compared CNN performance for log anomaly detection with LSTM and Multilayer Perceptron (MLP). | CNN based model | HDFS | LogKey2Vec | Log key sequences and session key
2018 | Andy Brown [9] | Effect of attention for sequence modeling. | LSTM with 5 attention mechanisms | LANL cybersecurity dataset | -- | Language modeling and tokenization
2018 | Mengying Wang [1] | Use of NLP techniques like word2vec and TF-IDF for log anomaly detection. | LSTM | Thunderbird | Word2Vec, TF-IDF | Data cleaning of logs
2017 | Min Du [2] | “Log Key” and “Parameter value” anomaly detection and diagnosis using workflows. | LSTM | HDFS, OpenStack | Log Key, Parameter value and Workflow | Spell [30]
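
Several rows of Table II share the same front end: raw messages are reduced to log keys (templates), and a model learns which key tends to follow a window of earlier keys. The Python sketch below illustrates that idea only, under explicit assumptions. The regex-based `to_log_key` helper is a hypothetical stand-in for a real parser such as Spell or Drain, a plain frequency table replaces the LSTM used in the surveyed work, and the window size `h`, the cutoff `g`, and the sample keys are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stand-in for a log parser: mask variable fields so that
# messages produced by the same logging statement map to one log key.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<IP>"),
    (re.compile(r"\bblk_-?\d+\b"), "<BLK>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_log_key(message):
    """Replace IP addresses, block ids and numbers with placeholder tokens."""
    for pattern, token in _MASKS:
        message = pattern.sub(token, message)
    return message


def train(sequences, h=2):
    """Count which key follows each window of h keys.

    Assumption: a frequency table is used here instead of an LSTM.
    """
    table = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - h):
            table[tuple(seq[i:i + h])][seq[i + h]] += 1
    return table


def find_anomalies(seq, table, h=2, g=1):
    """Return positions whose key is not among the g most frequent successors.

    Windows never seen during training have no successors, so they are flagged.
    """
    flagged = []
    for i in range(len(seq) - h):
        window = tuple(seq[i:i + h])
        top = [key for key, _ in table.get(window, Counter()).most_common(g)]
        if seq[i + h] not in top:
            flagged.append(i + h)
    return flagged


# Usage with made-up log keys "A", "B", "C", "D".
table = train([["A", "B", "C", "A", "B", "C"]])
print(find_anomalies(["A", "B", "C", "A", "B", "D"], table))  # [5]
```

In the deep learning approaches summarized above, a trained sequence model would take the place of `train`, and the top-`g` check would be applied to its predicted probabilities rather than to raw counts.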

T.F. Yen et al. [29] worked with SIEM log data, collected at approximately 1.4 billion logs per day, to detect suspicious activity specific to enterprise settings and user profile behavior. This work faced the challenges of scalability, data noise, and the unavailability of ground truth. The proposed method requires the creation of a feature vector for each internet host using the historical data. They utilize unsupervised clustering with data-specific features for the identification of potential security threats. The lack of ground truth shows in the need for experts to perform manual labeling. The method is rule based, and processing the historical logs required expert domain knowledge.

Min Du et al. [2] presented an architecture for detecting anomalies in log data that does not require any prior domain knowledge. The proposed approach identifies log key and parameter value anomalies from logs and also creates a workflow that is used for diagnosis. An LSTM neural network is applied to predict the likelihood of the next log key. Additionally, a similar LSTM neural network detects anomalies in the log parameter sequence. The algorithm also uses manual feedback about false positives to improve future accuracy. The use of LSTM rests on the observation that a log sequence resembles a natural language sequence and can be processed similarly.

Siyang Lu et al. [5] worked with log keys and a CNN model that takes the embedding of the log keys as input. Two-level parsing is applied to the raw log data to get the log keys and the vectors for log key sequences in execution order. They presented a comparison of the CNN model with LSTM and Multilayer Perceptron models and found that the CNN model's accuracy is better than that of the other models used.

Amir Farzad et al. [6] proposed deep learning based models for log message anomaly detection and presented a comparison between the proposed models using the BGL, Thunderbird, OpenStack, and IMDB datasets. The IMDB dataset is used to demonstrate that the proposed model generalizes to other text classification problems. In the architecture, word frequency is used to represent textual log messages in numeric form. The positive and negative labeled data are passed to two different autoencoders for training to obtain an improved representation of the original data, and this output is used as input for the deep learning algorithm.

Mengying Wang et al. [1] also explore the prospect of using Natural Language Processing algorithms to identify anomalies in log messages. In the experiments, the word2vec and TF-IDF feature extraction algorithms are used, and classification is performed with an LSTM deep learning algorithm. They concluded that word2vec performs better than TF-IDF for log message anomaly detection tasks.

W. Meng et al. [4] designed an attention-based LSTM model for detecting both types of anomaly, i.e., sequential and quantitative, simultaneously. It uses FT-Tree for log parsing and also proposes a novel word representation method, template2vec, based on synonyms and antonyms, to extract semantics effectively for anomaly detection. This approach overcomes the problem in [2] of missing valuable information from logs when only the log template index is considered, so that the semantic relations of logs cannot be revealed.

Xiaojuan Wang et al. [3] worked on router logs collected from NetEngine40E and analyzed them by type of behavior, attribute, and status. The proposed model is an LSTM neural network that predicts surges in logs by analyzing the number of logs in a time period. An attribute syntax forest is also used for semantic analysis of the attribute information. The work has been extended by training unsupervised machine learning models, i.e., Isolation Forest [33], OneClassSVM [34], and the density-based algorithm LocalOutlierFactor [35], on attribute information and values to find the logs that cause the log surge.

Xu Zhang et al. [8] proposed “Robust Log”, one of the recent works in log anomaly detection. They developed a Bi-LSTM classification model that uses a vector representation of each log event reflecting its semantic information, called the semantic vector of the log. Word vectorization using fastText [36] and TF-IDF based aggregation are performed to create the vector from the log event. Robust Log claims to perform well on unstable log events as well, which is tested by creating a synthetic HDFS log dataset.

Table II presents a summary of the challenges addressed by researchers along with the deep learning models, datasets, and methods used in various research works on log analysis.

VII. CONCLUSION

Applying deep learning to log analysis is a rapidly growing practice for effectively extracting knowledge from unstructured textual log messages. This work adds to the consolidation work done in the domain of log anomaly detection using deep learning. In this paper, we surveyed various deep learning algorithms implemented for log anomaly detection. We also summarized different NLP feature extraction techniques used to capture the semantic and contextual information of log messages. Word embedding has shown significant results in capturing semantic information from log messages. Unsupervised deep learning models such as autoencoders provided satisfactory results while avoiding the huge task of manually labeling log messages. We presented a comparative study of the challenges addressed and the DL methods applied. Our work also summarizes the standard datasets that are useful for log analysis studies. We observed that DL algorithms perform much better than traditional data mining and machine learning methods. Many DL methods have been explored, but there is still much scope for improving results by optimizing hyperparameters and using other DL models.

REFERENCES

[1] M. Wang, L. Xu, and L. Guo, “Anomaly detection of system logs based on natural language processing and deep learning,” in 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144, IEEE, 2018.
[2] M. Du, F. Li, G. Zheng, and V. Srikumar, “Deeplog: Anomaly detection and diagnosis from system logs through deep learning,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1285–1298, 2017.
[3] X. Wang, D. Wang, Y. Zhang, L. Jin, and M. Song, “Unsupervised learning for log data analysis based on behavior and attribute features,” in Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science, pp. 510–518, 2019.
[4] W. Meng, Y. Liu, Y. Zhu, S. Zhang, D. Pei, Y. Liu, Y. Chen, R. Zhang, S. Tao, P. Sun, et al., “Loganomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs,” in Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19. International Joint Conferences on Artificial Intelligence Organization, vol. 7, pp. 4739–4745, 2019.
[5] S. Lu, X. Wei, Y. Li, and L. Wang, “Detecting anomaly in big data system logs using convolutional neural network,” in 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech), pp. 151–158, IEEE, 2018.

[6] A. Farzad and T. A. Gulliver, “Log message anomaly detection and classification using auto-b/lstm and auto-gru,” arXiv preprint arXiv:1911.08744, 2019.
[7] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
[8] X. Zhang, Y. Xu, Q. Lin, B. Qiao, H. Zhang, Y. Dang, C. Xie, X. Yang, Q. Cheng, Z. Li, et al., “Robust log-based anomaly detection on unstable log data,” in Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 807–817, 2019.
[9] A. Brown, A. Tuor, B. Hutchinson, and N. Nichols, “Recurrent neural network attention mechanisms for interpretable system log anomaly detection,” in Proceedings of the First Workshop on Machine Learning for Computing Systems, pp. 1–8, 2018.
[10] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information processing & management, vol. 24, no. 5, pp. 513–523, 1988.
[11] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.
[12] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[13] J. Zhu, S. He, J. Liu, P. He, Q. Xie, Z. Zheng, and M. R. Lyu, “Tools and benchmarks for automated log parsing,” in 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 121–130, IEEE, 2019.
[14] W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, “Large-scale system problem detection by mining console logs,” Proceedings of SOSP’09, 2009.
[15] W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan, “Detecting large-scale system problems by mining console logs,” in Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, pp. 117–132, 2009.
[16] S. Bursic, A. D’Amelio, and V. Cuculo, “Anomaly detection from log files using unsupervised deep learning,” 09 2019.
[17] P. He, J. Zhu, S. He, J. Li, and M. R. Lyu, “Towards automated log parsing for large-scale log data analysis,” IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 6, pp. 931–944, 2017.
[18] S. Lu, B. Rao, X. Wei, B. Tak, L. Wang, and L. Wang, “Log-based abnormal task detection and root cause analysis for spark,” in 2017 IEEE International Conference on Web Services (ICWS), pp. 389–396, IEEE, 2017.
[19] B. Debnath, M. Solaimani, M. A. G. Gulzar, N. Arora, C. Lumezanu, J. Xu, B. Zong, H. Zhang, G. Jiang, and L. Khan, “Loglens: A real-time log analysis system,” in 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), pp. 1052–1062, IEEE, 2018.
[20] A. Oliner and J. Stearley, “What supercomputers say: A study of five system logs,” in 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), pp. 575–584, IEEE, 2007.
[21] M. C. Dani, H. Doreau, and S. Alt, “K-means application for anomaly detection and log classification in hpc,” in International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, pp. 201–210, Springer, 2017.
[22] A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies-volume 1, pp. 142–150, Association for Computational Linguistics, 2011.
[23] C. Suh-Lee, J.-Y. Jo, and Y. Kim, “Text mining for security threat detection discovering hidden information in unstructured log messages,” in 2016 IEEE Conference on Communications and Network Security (CNS), pp. 252–260, IEEE, 2016.
[24] S. He, J. Zhu, P. He, and M. R. Lyu, “Experience report: System log analysis for anomaly detection,” in 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), pp. 207–218, IEEE, 2016.
[25] Q. Fu, J.-G. Lou, Y. Wang, and J. Li, “Execution anomaly detection in distributed systems through unstructured log analysis,” in 2009 ninth IEEE international conference on data mining, pp. 149–158, IEEE, 2009.
[26] L. Tang, T. Li, and C.-S. Perng, “Logsig: Generating system events from raw textual logs,” in Proceedings of the 20th ACM international conference on Information and knowledge management, pp. 785–794, 2011.
[27] A. A. Makanju, A. N. Zincir-Heywood, and E. E. Milios, “Clustering event logs using iterative partitioning,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 1255–1264, 2009.
[28] R. Vaarandi, “A data clustering algorithm for mining patterns from event logs,” in Proceedings of the 3rd IEEE Workshop on IP Operations & Management (IPOM 2003) (IEEE Cat. No. 03EX764), pp. 119–126, IEEE, 2003.
[29] T.-F. Yen, A. Oprea, K. Onarlioglu, T. Leetham, W. Robertson, A. Juels, and E. Kirda, “Beehive: Large-scale log analysis for detecting suspicious activity in enterprise networks,” in Proceedings of the 29th Annual Computer Security Applications Conference, pp. 199–208, 2013.
[30] M. Du and F. Li, “Spell: Streaming parsing of system event logs,” in 2016 IEEE 16th International Conference on Data Mining (ICDM), pp. 859–864, IEEE, 2016.
[31] S. Zhang, W. Meng, J. Bu, S. Yang, Y. Liu, D. Pei, J. Xu, Y. Chen, H. Dong, X. Qu, et al., “Syslog processing for switch failure diagnosis and prediction in datacenter networks,” in 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), pp. 1–10, IEEE, 2017.
[32] P. He, J. Zhu, Z. Zheng, and M. R. Lyu, “Drain: An online log parsing approach with fixed depth tree,” in 2017 IEEE International Conference on Web Services (ICWS), pp. 33–40, IEEE, 2017.
[33] F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in 2008 Eighth IEEE International Conference on Data Mining, pp. 413–422, IEEE, 2008.
[34] B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, “Estimating the support of a high-dimensional distribution,” Neural computation, vol. 13, no. 7, pp. 1443–1471, 2001.
[35] L. Xu, Y.-R. Yeh, Y.-J. Lee, and J. Li, “A hierarchical framework using approximated local outlier factor for efficient anomaly detection,” Procedia Computer Science, vol. 19, pp. 1174–1181, 2013.
[36] A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, and T. Mikolov, “Fasttext.zip: Compressing text classification models,” arXiv preprint arXiv:1612.03651, 2016.
