
Aggregation of Elastic Stack Instruments for Collecting, Storing and Processing of Security Information and Events

Igor Kotenko
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS), 39, 14th Liniya, St. Petersburg 199178, Russia
St. Petersburg National Research University of Information Technologies, Mechanics and Optics, 49, Kronverkskiy prospekt, St. Petersburg, Russia
[email protected]

Artem Kuleshov and Igor Ushakov
The Bonch-Bruevich Saint-Petersburg State University of Telecommunications (SPbSUT), 22 (b.1) pr. Bolshevikov, St. Petersburg 193232, Russia
{gart9515, ushakovia}@gmail.com

Abstract—The paper suggests an approach to constructing a system for collecting, storing and processing security data and events based on the aggregation of instruments provided by Elastic Stack. Based on an analysis of the monitoring and incident management tasks for computer security and a comparative analysis of existing technologies and architectural solutions, the technical requirements for such systems are identified, and on their basis the architecture of the proposed solution is formed. The paper describes the developed system for collecting, storing and analyzing data from various components of information security systems. Results of experiments with the developed prototype are presented.

Keywords—Big Data; security information and event management; SIEM systems; Elastic Stack.

I. INTRODUCTION

Technological progress does not stand still, and information security systems develop and evolve with it. Systems for security information and event management (SIEM) are no exception. Previously, the functionality of classic SIEM solutions for large and medium-sized companies more or less satisfied existing requirements. Nowadays, however, new mechanisms and functions are required that can timely and adequately identify, process and analyze current information flows and security events, and manage incidents for a much larger number of devices with significantly increased volumes and speeds of information flows [1-4]. The problem is that modern SIEM systems are insufficiently adapted to the timely processing of the large amounts of security information and events needed to assess the current state, perform incident management and develop countermeasures.

In this paper we set the task of developing the architecture and implementing a research prototype of a system for collecting, storing and processing security information and events based on big data technology, as the basis for a new generation SIEM system, as well as performing a preliminary analysis of the operating parameters of this system. A specific feature of the task we set is the ability to apply big data technologies for monitoring and to select the most productive architecture. The main contribution of the paper is the integration of a set of existing open source packages into a complete security monitoring system. The proposed solution differs from existing ones in its integration of Elastic Stack, Nginx and Docker software for collecting, storing and processing large amounts of data in order to analyze security information and events. The solution is aimed at providing high-performance message processing with regard to possible overloads and further expansion of the system. One rationale for the work is that existing commercial solutions are too expensive for SMEs to adopt.

The paper has the following structure. Section II gives a concise analysis of relevant works and the most representative SIEM products and discusses their advantages and disadvantages. Section III identifies the technical requirements for a next generation system for collecting, storing and processing security information and events; on their basis a general architecture of the proposed solution is formed, and, taking into account the characteristics of the products available on the market, the instruments for implementation are selected. Section IV describes the proposed approach to the implementation of a system for collecting, storing and processing security information and events that meets the specified requirements, and presents its implementation. Section V presents the results of the experiments and a comparison of our prototype with several other architectures from research papers. The conclusion summarizes the findings and identifies directions for future research.

II. RELEVANT SOLUTIONS

Let us select two areas for the review of relevant solutions: research works, and software implementations of commercial and open source products.

Aiming at an objective and clear understanding of existing SIEM system architectures, let us first consider some research papers.
[5] presents a generalized architecture of new generation SIEM systems, which can be divided into the levels of network, data, events and applications. The following main system components are outlined: a collector, a universal translator of events, a highly reliable data bus, a scalable event processor, a repository, a system for decision making and response, a component for attack modeling and security analysis, a predictive security analyzer and a visualization system.

[6] identifies three architectural levels for building SIEM systems: data analysis, data management and data collection. This approach makes it possible to analyze the complexity of processing and the number of events to be handled at each level. It concludes that the most loaded and computing power consuming level is data collection.

The research described in [7] was aimed at analyzing the central component of any SIEM system, the data storage system. The benefits of storing information in a hybrid repository are presented. The suggested approach to storage provides convenient and reliable data sharing across heterogeneous storage systems and can deliver sufficiently high performance.

Systems capable of working with big data are being integrated into a growing number of products, and SIEM products are no exception. A representative of such systems is Hadoop [8]. It works on the principle of batch processing, in which direct analysis does not depend on the data flow. The platform provides instruments for processing structured and unstructured large files. The resource management component MapReduce [9] is responsible for this functionality. The basic idea of MapReduce is to distribute tasks across a large number of nodes organized in a cluster.

[10] presents the result of an implementation of this approach (the speed reached 30,000 MB of processed data per second). Such performance requires a large number of computing resources; in particular, 1800 nodes were used, each of which included two Intel Xeon 2 GHz CPUs and 4 GB RAM. The referenced paper discusses batch processing; for processing data in a stream, stream processing is used instead, which allows tracking messages in real time.
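To make the MapReduce idea concrete, the following minimal single-process Python sketch illustrates the map, shuffle and reduce phases on invented log lines; it demonstrates only the principle and does not use Hadoop's actual API:

```python
# Map phase: emit (key, value) pairs; here we count occurrences of
# event-type tokens (the "EVT_" naming is invented for illustration).
from collections import defaultdict

def map_phase(log_line):
    for token in log_line.split():
        if token.startswith("EVT_"):
            yield token, 1

def reduce_phase(key, values):
    # Reduce phase: fold all values emitted for one key.
    return key, sum(values)

logs = ["jan12 host1 EVT_LOGIN ok",
        "jan12 host2 EVT_LOGIN fail EVT_ALERT"]

groups = defaultdict(list)        # shuffle: group emitted values by key
for line in logs:
    for key, value in map_phase(line):
        groups[key].append(value)

print(dict(reduce_phase(k, v) for k, v in groups.items()))
# -> {'EVT_LOGIN': 2, 'EVT_ALERT': 1}
```

In Hadoop the same map and reduce functions are distributed over the cluster nodes, and the shuffle step between the two phases is performed by the framework.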
This task is performed, for example, by Apache Storm [11]. Cisco has provided an analysis of this approach [12], which is capable of handling over a million packets per second on a single-node system.

Gartner [13] highlights some of the most advanced SIEM systems, including HP ArcSight and IBM QRadar (commercial) and AlienVault OSSIM (open source).

SIEM HP ArcSight [14] has modules for event monitoring and behavioral analysis and a system of rules for security event processing. Due to the closed source code it is difficult to add new features. HP ArcSight uses its own product CORR as a DBMS. By default the system cannot store events that come in with no particular pattern or mask; i.e., to add new information an additional intervention into the system is required. The system implements centralized data storage: geographically distributed organizations are required to transmit all events to the central database, where all processing is performed. The ArcSight ESM system core is licensed by the volume of logs per day. In addition to the core, a variety of settings and options must be licensed, for example, the number of users, the development of one's own connectors, the number of event sources (counted separately per source type), modules for compliance, log management, etc., which means that ArcSight is focused on large corporations.

IBM QRadar SIEM [15] integrates system log management, anomaly detection, configuration management and vulnerability remediation into a single, unified solution for security information and event management. The system can reveal the attributes of the most critical incidents. QRadar uses the advantages of a unified architecture for the analysis of system logs, data flows, vulnerabilities, user data and information resources. However, this approach makes increasing the resources used by the system a technically challenging process. The acquisition cost of IBM QRadar SIEM is composed of many factors of the distribution and configuration of the system. The license price depends on the processing capabilities, measured in events per second. The high price makes this solution unaffordable for small and medium-sized enterprises.

OSSIM (Open Source Security Information Management) [16] is a SIEM system based on open source code from the company AlienVault. OSSIM implements functions of collection, analysis and correlation of events, as well as intrusion detection. It includes a host intrusion detection system (HIDS), a network intrusion detection system (NIDS), an intrusion detection system for wireless networks (WIDS), components for monitoring network nodes, network anomaly analysis, a vulnerability scanner, a system for exchanging threat information between users, and a set of plug-ins for parsing and correlating syslog records from a variety of external devices and services. The main disadvantage of these solutions is the limited functionality for the aggregation of received messages.

As SIEM products developed gradually, with the advent of the big data concept in the information security domain most of them were forced to introduce an additional set of Hadoop methods (from the Apache Software Foundation) to work with large flows of information. Today the situation has not changed, and Hadoop is used as the main solution. It is worth considering that the Hadoop project was developed with the aim of building a software infrastructure for distributed computing. The YARN module, released in the autumn of 2013, made this technology universal for data processing and is considered a great development for the project; however, in our opinion, it is redundant for solving the problems of collecting, indexing and distributing data.

III. REQUIREMENTS, GENERIC ARCHITECTURE AND THE CHOICE OF INSTRUMENTS FOR IMPLEMENTATION

Based on the results of the analysis of relevant works, the present paper aims at the development of a generic architecture and a research prototype of a system for collecting, storing and processing data and security events based on big data technology, as the basis for a next generation SIEM system.
It is assumed that the solution must comply with a complex of requirements. Let us enumerate the basic requirements: (1) robust collection of information from many different sources; (2) elasticity of the system, that is, optimal (rational) distribution of the load and low dependence of the performance of the components on changes of individual components; (3) the possibility of efficiently integrating one's own algorithms into the analytics subsystem for further development of the SIEM system; (4) minimization of the time needed to deploy and configure the infrastructure, services and other system tasks; (5) implementation of efficient mechanisms ensuring fault tolerance and high availability at the software level; (6) support for horizontal scaling, that is, the ability to divide the system into individual components and distribute them onto separate physical machines; (7) prevalence of the system's development tools on the market and an active user community.

The system architecture should be built on microservices, which will provide system virtualization, fault tolerance and fast horizontal scaling, and will also make it possible to automate and simplify deployment. For the efficient integration of analysis algorithms, a well-developed API must be supported for popular programming languages such as Java, C++ or Python. There should be no hard linking of the prototype to a particular event source; i.e., the reception of information must be accessible through a mask or universal pattern. It should be possible to use multiple network streams or similar components for load balancing, to avoid overload on the server side. The main component of the architecture of the SIEM system is a set of tools performing the functions of collecting, storing and processing security information and events together with analytic capabilities, which jointly represent the system of monitoring and incident management [3, 4].

The generalized theoretical architecture of the developed SIEM system has four levels: network, data, events and analytic processing. The network level consists of aggregation agents that are present on different nodes and have the function of aggregating and sending information from different sources, such as system logs, security logs from firewall systems, sensors, etc. The main components of the data level include the functions of indexing and preparation for data storage. Due to the initial separation and load balancing, indexing can be performed on multiple nodes in parallel, which helps to achieve elasticity and horizontal scalability. Before saving, messages are indexed and processed; this keeps the search and correlation engines from consuming much computing power, which helps to process a lot of information in a short period of time. The events level consists of two components: (1) analysis and evaluation of events for data aggregation and structuring; (2) real-time monitoring controlling the main parameters of information security. Separating the data and events levels helps to create backup storages and the components working with them, and to run complex parallel tasks of aggregation, storage and preliminary analysis of received messages; eventually this helps to create a fault-tolerant infrastructure. The level of analytic processing includes the functions of security analysis, anomaly detection and countermeasure generation; this level can be installed on a separate physical server. The main idea of the described architecture is to provide complex scenarios for managing security incidents. The system core includes the functions of big data aggregation and storage, and can store the data for future forensic analysis.

On the contemporary market there are many commercial and open source solutions for the collection, storing and processing of security information and events.

One of the most popular solutions of this kind is the commercial software product Splunk Enterprise [17]. This product is an analytical platform for collecting and analyzing machine data with automatic load balancing. It has the ability to increase productivity by adding typical servers, which provides horizontal scalability and fault tolerance of the system. Splunk has a documented API and SDKs for popular programming languages. This framework handles data in any format, including dynamic data from software applications, application servers, web servers, operating systems and many other sources. However, Splunk is a commercial product with closed source code, which limits and slows down the development of the API and makes scaling of the system paid.

Another common commercial solution is ManageEngine EventLog Analyzer [18], with similar advantages and disadvantages. The product EventLog Analyzer supports load balancing using an internal service. The algorithms, methods and platform core are not available at the software level, and the entire setup and operation is done via an API that evolves depending on the needs of the consumer, which reduces the flexibility of the system.

In 2014, Cisco Systems published the source code of the solution OpenSOC [19]. It is a solution for creating cyber threat monitoring centers, based on the Apache big data open source stack. This source code became the basis for developing the Apache Metron project [20]. This solution has an architecture that was initially focused on processing large arrays of data from multiple distributed information systems. Currently, the Apache Metron project is not actively developed by the open community; statistics from Google search and GitHub data [21] on code contributions to the project show weak interest from potential users. The development of the project is now performed almost solely by the company Hortonworks, which offers a commercial implementation of the solution. Unfortunately, the task-adapted architecture of Apache Metron has a poorly implemented system of inverse scaling, and building even the simplest system requires deploying a large number of system components and resource-intensive services, most of which will not be used. With average resources it is sometimes impossible to provide the loading of data from the sources for which this solution was designed, while the components of the solution require a lot of manpower to deploy and configure. The Apache Metron project deserves attention when the tasks aim at the maintenance of high-load and distributed information systems.

One of the most efficient open source packages specializing in the specified problems is Graylog [22].
This is a free centralized system for collecting, storing and analyzing information, which uses the functions of Elasticsearch. It is worth considering that, despite frequent updates of Graylog and a developed user community, the integration of the latest versions of Elasticsearch into the project takes a lot of time. Proof of this is that in 2017 the latest version of Graylog, 2.2.1, only works with Elasticsearch version 2.4.4 [23], which is obsolete.

Due to the partial inability of these solutions to satisfy all the requirements formulated above, in the present work we suggest the following approach. Based on the results of the analysis of the available industry tools, we selected the open source software complex Elastic Stack [24], as it is suitable for meeting the previously established requirements. Besides, it is a free and popular solution that has proven its efficiency in a large number of implementations worldwide. The main software components are: Elasticsearch, a search and analytical system; Logstash, a software pipeline for data processing; Kibana, a tool for visualization and navigation through the system; and Beats, a set of programs for collecting and transporting system logs and files. The software stack consisting of Elasticsearch, Logstash and Kibana (named ELK hereinafter) was specifically designed to solve the problems of collecting, storing and processing system logs. It should be noted that the ELK software stack is a mature product; on its basis the commercial solution Elastic X-Pack was implemented, combining the above technology with auxiliary analytical capabilities. Let us consider the basic components of Elastic Stack that are used in the project.

Elasticsearch is a distributed search and analytical system core that supports the REST API architectural style and data exchange via JSON. It centralizes incoming data and supports a clustered architecture, allowing the system to scale out. It is a transparent and reliable system with a failure detection mechanism, which in turn ensures high availability. The Elasticsearch kernel performs real-time search over large volumes of heterogeneous data structures (documents). A document is the basic unit of information that can be indexed; documents are specified in JSON format. The system has a well-developed API, and the list of supported languages for interoperability includes Java, Python, C++ and others.
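As a minimal sketch of this REST/JSON interaction (the index name "security-events", the type "event" and the field names are hypothetical; the endpoints are the standard document and search APIs of Elasticsearch 5.x):

```python
import requests

ES = "http://localhost:9200"

# Index one JSON document; Elasticsearch assigns the document id.
doc = {"host": "host1", "process": "sshd",
       "message": "failed password for user root",
       "@timestamp": "2017-06-01T12:00:00Z"}
r = requests.post(ES + "/security-events/event", json=doc)
print(r.json()["result"])                     # "created"

# Full-text search over the stored documents.
query = {"query": {"match": {"message": "failed"}}}
hits = requests.post(ES + "/security-events/_search", json=query).json()
print(hits["hits"]["total"], "matching documents")
```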
Logstash is a software pipeline for data processing: it simultaneously collects data from many different sources, pre-processes them and sends them to the storage subsystem. It has a built-in parser that makes it possible to normalize heterogeneous data, determine geographical coordinates by IP, and process information from various sources regardless of their format and structure.

Kibana is a software component that implements visualization and navigation in the Elastic Stack. It presents data as customizable interactive dashboards in real time and implements a large number of built-in widgets (histograms, graphs, maps and other standard tools). It has a well-developed API.

Beats is a set of data collector programs with low resource requirements that are installed on client devices to collect system logs and files. There is a wide choice of collectors, as well as the ability to write one's own. Filebeat transmits to the server information from dynamically updated system logs and files containing text information. Winlogbeat is used for similar actions with the Windows system logs. Packetbeat is a network packet analyzer that transmits information about network activity between application servers; it captures network traffic, decodes the protocols and retrieves the required data. Metricbeat periodically collects operating system metrics that characterize, for example, CPU and memory usage, the number of packets transmitted in the network, the state of the system, and the services running on the server.

IV. ARCHITECTURE AND IMPLEMENTATION OF THE PROTOTYPE

The implemented prototype has the architecture shown in Fig. 1. Data collected by Winlogbeat and Metricbeat is sent as an XML document via the Internet to the server where the ELK server is deployed. The data is received by a Logstash instance configured for data collection, which serves as a collector. Then a special Logstash instance indexes the incoming information from the collectors and sends the data to Elasticsearch for subsequent storing and processing. All the analytical processing is performed by the software component Kibana. To realize the function of viewing the Kibana representation by domain name on the Internet, an Nginx server is needed, on which the output port of Kibana is proxied to port 80. User commands received via the API are transmitted from Kibana to Elasticsearch, where they are processed and the resulting data is returned.

Fig. 1. The architecture of the prototype. On the device side, Metricbeat and Winlogbeat collect Windows system logs, system logs, messages, application reports, security services messages and application server logs; on the central side, a Logstash collector instance, a Logstash indexing instance, Elasticsearch, Kibana and an Nginx proxy server run in Docker containers.

The architecture of the prototype can be conditionally divided into the following components: (1) the subsystem for sending data from client devices; (2) the subsystem for pipelining and data delivery; (3) the fault tolerance and load balancing mechanism; (4) the subsystem of the search and analytical core, combined with the storage subsystem; (5) the visualization subsystem. Let us consider the first four subsystems.

The prototype is developed with the goal of being the basis of a SIEM system; therefore it is necessary to provide in the prototype broad coverage of the information available for analysis.
Currently, the prototype supports the collection of events from the syslog protocol, Windows event logs, hardware telemetry, OS and services, as well as information about network traffic flows from netflow/sflow, using Filebeat, Winlogbeat, Metricbeat and Packetbeat, respectively. In the future, if specific data needs to be sent, it is possible to write one's own Beat collectors on the basis of the provided library Libbeat and its well-developed API.

For the processing and delivery of data in the Elastic Stack, the Logstash implementation is used, as shown in Fig. 2.

Fig. 2. The scheme of Logstash functioning: within a Logstash instance, data flows from the data source through the Input plugin, Filter plugin and Output plugin to Elasticsearch.

It is divided into three major functional blocks, implemented in the form of three extensions (plugins): Input, Filter and Output. The functional block Input plugin specifies the particular event source read by the Logstash pipeline; in the prototype, these sources are Beats. It accepts documents in JSON format that contain data from system logs, system metrics, information from protocols and other data available in accordance with the selected collector. The block Filter plugin performs intermediate processing of an event. It makes it possible to structure the data by extracting only the necessary information, such as date, time, IP address, error code, etc., storing it in data structures, and to send the data further to the output plugin for onward transmission to Elasticsearch. The filter used is selected depending on the characteristics of the event. Note that this unit is resource consuming, so the Filter plugin actively uses parallel computing. The Output plugin specifies the further route for the processing of documents in JSON format; this is the final phase of the pipeline's functioning. In the prototype the data is transferred to the analytics and storage subsystem, Elasticsearch.
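The following Python sketch mirrors what the Filter stage does to a single event; the log format, the regular expression and the field names are illustrative and are not the grok patterns used by Logstash itself:

```python
import json
import re

# An assumed raw log line: date, time, source IP, level, code, message.
RAW = "2017-06-01 12:00:00 10.0.0.5 ERROR 401 authentication failure"

PATTERN = re.compile(
    r"(?P<date>\S+) (?P<time>\S+) (?P<ip>\d+\.\d+\.\d+\.\d+) "
    r"(?P<level>\w+) (?P<code>\d+) (?P<message>.*)")

def filter_stage(raw_line):
    # Extract only the necessary fields and re-emit the event as a
    # structured JSON document ready for the output stage.
    match = PATTERN.match(raw_line)
    return json.dumps(match.groupdict()) if match else None

print(filter_stage(RAW))
```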
The architecture of the pipeline for data processing and delivery in the prototype is shown in Fig. 3.

Fig. 3. The architecture of the pipeline for data processing and delivery: three Logstash shipping instances, each with Winlog, File, Packet and Metric input plugins and a message queue output plugin, receive data from the file/Winlog and metrics/network data sources and feed a common message queue; a Logstash indexing instance (Input plugin, Filter plugin, Output plugin) reads the queue and sends the data to Elasticsearch.

If the rate of incoming events exceeds the processing speed, the Logstash implementation begins to drop events. To prevent this loss, we proposed using the message broker Redis as a buffer (it is possible to use industry standard solutions such as Redis, Kafka or RabbitMQ).

In the implemented prototype the subsystem of pipeline processing on the basis of Logstash was divided into two separate program services: (1) the service for receiving data from the Beats event sources and sending it further to the buffer of the message broker; (2) the service for receiving data from the buffer, processing it further and sending it to Elasticsearch.
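A minimal sketch of this buffering scheme is shown below; it assumes the redis-py package and a local Redis server, and the queue name is hypothetical. The shipping service pushes events into a Redis list, and the indexing service pops them at its own pace, so bursts are absorbed instead of dropped:

```python
import json
import redis

broker = redis.Redis(host="localhost", port=6379)

def ship(event):
    # Producer side (shipping service): enqueue the JSON event.
    broker.lpush("logstash-buffer", json.dumps(event))

def take_one(timeout=5):
    # Consumer side (indexing service): blocking pop; returns None
    # if the queue stays empty for `timeout` seconds.
    item = broker.brpop("logstash-buffer", timeout=timeout)
    return json.loads(item[1]) if item else None

ship({"host": "host1", "message": "test event"})
print(take_one())
```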
The use of multiple Logstash instances with a division of functional roles makes it possible to balance the load between the data sources and the Logstash cluster. To avoid the impossibility of entering data of a specific type when the Logstash instance for that type is not available, we use a specially configured Logstash pipeline supporting a plurality of input plugin modules. For example, if the subsystem had only one instance with the Logstash file input plugin, then when it failed it would be impossible to receive data from Filebeat. Increasing the number of input plugin modules makes it possible to scale horizontally, while the separated parallel receiving pipelines increase system reliability and eliminate a single point of failure.

The Elasticsearch output plugin module is also configured for automatic load balancing over a multitude of nodes in the Elasticsearch cluster. If one of the nodes fails, the data stream is not interrupted, which eliminates a single point of failure, ensures high availability of the cluster, and routes traffic to the active nodes in the cluster.
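The input-side scheme described above can be sketched as a toy Python model (it is not the Logstash implementation): several independent receiving pipelines feed one queue, so the failure of one input stops only that event type, while the others continue to arrive:

```python
import queue
import threading
import time

events = queue.Queue()

def receiver(name, source):
    # Each input runs as its own pipeline; an exception kills only it.
    try:
        for event in source():
            events.put((name, event))
    except OSError as exc:
        print(name, "input failed:", exc)

def winlog_source():
    yield {"log": "windows event"}

def file_source():
    raise OSError("file input lost")      # simulated input failure

for name, src in [("winlog", winlog_source), ("file", file_source)]:
    threading.Thread(target=receiver, args=(name, src), daemon=True).start()

time.sleep(0.2)
while not events.empty():
    print(events.get())                    # the winlog event still arrives
```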
In the prototype an Elasticsearch cluster was deployed. In the terminology of Elastic Stack, a cluster is a set of nodes (servers) that hold all the information and provide the ability to index and search across all nodes. In the terminology of ELK, a set of document structures sharing similar characteristics is called an index. For convenience, indexes are divided into types, i.e. documents with common fields. Potentially an index can grow to sizes that exceed the physical capabilities of a node; an index is therefore divided into several parts called shards. This allows distributing the data across multiple nodes, and distributing and parallelizing operations on the shards, which increases performance and throughput. To prevent failures and provide a fault tolerance mechanism, Elasticsearch allows making copies of the shards of an index, which are called replicas. A shard and its replica are never placed on the same node.
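The number of shards and replicas is declared when an index is created; the following sketch uses the standard index creation endpoint of Elasticsearch 5.x, while the index name and the counts are illustrative:

```python
import requests

settings = {
    "settings": {
        "index": {
            "number_of_shards": 3,      # split the index across nodes
            "number_of_replicas": 1     # keep one copy of every shard
        }
    }
}
r = requests.put("http://localhost:9200/security-events", json=settings)
print(r.json())                          # {"acknowledged": true, ...}
```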
In the prototype the system elements are deployed in Docker containers, which allows automating deployment and management. To accelerate development and testing, as well as to easily upgrade services to their next versions in the future, we selected a microservices approach and deployment in a container virtualization environment. This separated the subsystem into modules, making it easier to upgrade individual services and test their compatibility with each other. At the same time, containers solve the problems of reliability and backup of infrastructure services, and enable engineers to focus on the logic of the solution and abstract from infrastructure problems.
V. THE RESULTS OF THE EXPERIMENTS

With the purpose of validating the proposed solutions, a series of experiments was conducted. The test bench for the study had the following characteristics: 16 GB of DDR3 RAM; a 4-core processor with a frequency of 1.9 GHz; Elasticsearch version 5.1.1, Kibana 5.1.1, Logstash 5.1.1, Winlogbeat 5.1.1, Nginx 1.10.2, Java(TM) SE Runtime Environment (build 1.8.0_73-b02); operating system CentOS 7, Linux 3.10.0-327.36.3.el7.x86_64.

One of the important criteria of operation of the monitoring system is the system throughput, i.e. the number of packets from a device processed in a certain period of time. When a packet arrives at the ELK server port, Elasticsearch takes it and starts to index the information. After indexing, it saves the received data in the database and updates the API. The update occurs not immediately, but at the time interval set by the parameter refresh_interval, which by default is 1 sec. The update of the API requires processing power, so if a large number of packets is loaded and a delay in monitoring is acceptable, the interval is increased, which increases the indexing speed. All experiments were carried out with a refresh interval equal to 1 sec.
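This trade-off can be controlled at run time through the index settings endpoint; in the sketch below the parameter is the real refresh_interval setting of Elasticsearch 5.x, while the index name is hypothetical. Raising the interval defers the costly refresh and increases indexing speed at the price of delayed visibility of new events:

```python
import requests

# Default is "1s"; "30s" trades monitoring freshness for throughput.
body = {"index": {"refresh_interval": "30s"}}
r = requests.put("http://localhost:9200/security-events/_settings",
                 json=body)
print(r.json())                          # {"acknowledged": true}
```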
For the experiments, a computer was connected to the server via WLAN with a maximum throughput of 10 Mbps, and different libraries of pre-prepared system logs were sent from it. During the first experiment a library without additional information about the source was sent, i.e. one message per packet. The average indexing and processing rate amounted to 13,000 packets per second with an average CPU utilization of 65%. Fig. 4 shows the CPU utilization over time, and Fig. 5 shows the number of packets processed over time. However, for SIEM systems such tests are not very informative. In order for the system to be able to track, for example, from which node a message arrived, which process was the sender, what the time of the incident was, etc., messages are packaged into JSON packets containing this additional information. During the second experiment a library of JSON packets with additional information was sent. As a result, the size of the packets increased relative to the first experiment and was on average equal to 330 bytes. This had an impact on the processing speed, which on average was equal to 4000 packets per second, with CPU utilization averaging 40% (Fig. 6 and Fig. 7).
Fig. 4. CPU utilization while processing packets without additional metrics
Fig. 5. Speed of packet processing without additional metrics
Fig. 6. CPU load when processing JSON packets with extra metrics
Fig. 7. Speed of packet processing with extra metrics
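The throughput measurement can be reproduced in outline with a simple replay script such as the sketch below; the endpoint and event shape are assumptions, and the real experiments sent Beats traffic through Logstash rather than raw HTTP:

```python
import time
import requests

ES = "http://localhost:9200/security-events/event"

def replay(events):
    # Send the prepared library and report the achieved events/second.
    start = time.perf_counter()
    for event in events:
        requests.post(ES, json=event)
    return len(events) / (time.perf_counter() - start)

# Events padded to roughly the 330-byte JSON size used in the experiment.
library = [{"seq": i, "host": "host1", "message": "x" * 280}
           for i in range(1000)]
print("events/sec:", replay(library))
```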
Another primary mechanism for SIEM systems is analytics, for which the ability to quickly extract the necessary information from the database is critical. To check the search time, requests to load indexed and saved messages (nodes) containing a 5-letter word were experimentally sent to Elasticsearch. From the obtained results, presented in Fig. 8 (where the Y-axis specifies the time in ms and the X-axis the number of the test), we found that the average time of searching for nodes by the phrase is 1.28 ms.

Fig. 8. Time of search of nodes by the phrase vs. test number
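A sketch of this latency measurement: the same match query is issued repeatedly and the "took" field, the server-side execution time in milliseconds that Elasticsearch reports with every search response, is averaged (the index and field names are hypothetical):

```python
import requests

ES = "http://localhost:9200/security-events/_search"
query = {"query": {"match": {"message": "error"}}}   # a 5-letter word

took = [requests.post(ES, json=query).json()["took"] for _ in range(100)]
print("average search time, ms:", sum(took) / len(took))
```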
The experiments demonstrated that the computational power of the system was sufficient to increase the processing speed further, but no speed increase occurred; the reason for this is the bandwidth limitation of the network. However, the set goal of processing and presenting a large number of packets in real time was achieved. Given that Elastic Stack has support for the netflow protocol, the analysis of network protocols in real time is also possible. The large potential of the system at low consumption of computing power, demonstrated in the experiments, makes it possible to obtain a significant gain in comparison with analogues and provides an opportunity for further research with large volumes of incoming security information and events.

Let us consider several solutions that were created by researchers and compare them with our prototype. Table I represents the comparison of several systems.

TABLE I. COMPARISON PARAMETERS OF THE SYSTEMS

Massive Distributed and Parallel Log Analysis [26]
- Volume of input data: from 100 to 500 MB
- Number of servers/nodes: up to 8 slave servers
- Processing time: 450 seconds for 500 MB of traffic processed on 8 servers
- Method of data processing: stream data processing; data from logs is loaded to one server storage
- Main task of the developed prototype: development of the architecture for processing shared log information

VSS Monitoring. Leveraging a Big Data Model in the Network Monitoring Domain [27]
- Volume of input data: more than 3.5 exabytes
- Number of servers/nodes: not presented
- Processing time: traffic speeds from 100 Mb/s to 100 Gb/s
- Method of data processing: data processing based on batch or stream mechanisms
- Main task of the developed prototype: the ability to split network analytics from the storage system; the ability to get more effectiveness by integrating different data, equipment and big data infrastructure

Toward a Standard Benchmark for Computer Security Research. WINE [28]
- Volume of input data: more than 100 TB, including up to 10 million URL reputation domains, up to 2.5 million e-mail accounts, and 130 million antivirus telemetry records
- Number of servers/nodes: 240,000 sensors throughout the world
- Processing time: not presented
- Method of data processing: data processing based on batch or stream mechanisms; Hadoop/Spark support
- Main task of the developed prototype: the ability to process data gathered from millions of hosts worldwide, using key security fields

Using Large Scale Distributed Computing to Unveil Advanced Persistent Threats [29]
- Volume of input data: more than 74 GB, about 144 million events
- Number of servers/nodes: one physical server with 16 cores
- Processing time: 1500 sec using 15 cores
- Method of data processing: batch data processing mechanism
- Main task of the developed prototype: the developed scheme of attack detection has to integrate all information security incidents collected by the organization

RF (Random Forest) [25]
- Volume of input data: 500 GB
- Number of servers/nodes: 10 nodes joined in a cluster; 500 trees used for the algorithm
- Processing time: average time 517.8 s
- Method of data processing: classification is based on votes, where each tree verifies whether each object belongs to each class
- Main task of the developed prototype: a machine learning algorithm able to process data with a big number of features

Spark-MLRF [25]
- Volume of input data: 500 GB
- Number of servers/nodes: 10 nodes joined in a cluster; 500 trees used for the algorithm
- Processing time: average time 186.2 s
- Method of data processing: based on Apache Spark MLlib
- Main task of the developed prototype: a method of parallel operation of the Random Forest algorithm

PRF (Parallel Random Forest) [25]
- Volume of input data: 500 GB
- Number of servers/nodes: 10 nodes joined in a cluster; 500 trees used for the algorithm
- Processing time: average time 101.3 s
- Method of data processing: based on the Apache Spark platform
- Main task of the developed prototype: a hybrid method of parallel optimization of the Random Forest algorithm

Developed system based on Elastic Stack instruments
- Volume of input data: 1 GB of stream data indexing; for data processing a library with 2.5 million rows (768 MB) was used
- Number of servers/nodes: 1 physical server: 16 GB of RAM, 4-core 1.9 GHz processor, 10 Mb/s throughput
- Processing time: average time 794 ms to transmit 330-byte JSON packets
- Method of data processing: stream and indexing data processing
- Main task of the developed prototype: a distributed search and analytic core that has to search a big amount of different types of data

Let us consider the parameters used in the table.

1. Volume of input data. The analyzed papers present the results of tests and experiments in which the authors describe test bed prototypes using different volumes of data [25-29]. Compared with papers [25-29], our prototype uses 1 GB of data processed in a streaming way. This is a little more than in paper [25], but much lower than in the other papers [26-29]. In general this is enough to confirm that our system is workable, and we can also conclude that the requirements for processing and analyzing data in real time are satisfied.

2. Number of servers/nodes used in the prototypes. Most of the papers we considered use parallel and load balancing mechanisms to split the data into multiple streams [25-28].
This parameter determines the ability of a system to balance traffic load between different nodes in a network. In our prototype we used one physical server with the characteristics presented in Table I; however, different components of the system were located on different virtual machines.

3. Processing time. This parameter shows the speed of indexing and the analysis time of the data loaded into the system. It is not possible to make a comparison between the papers [25-29] using this parameter, because we had no possibility to run the experiments on one platform. However, taking into consideration the power and time of data processing obtained in our experimental results, we may suppose that Elastic Stack is one of the most productive solutions in the big data area.

4. Method of data processing. This parameter describes the type of data processing [25-29]. According to Table I, most of the studies use the stream method of data processing [25-28]. Our prototype also uses stream data processing mechanisms, which is a key property for building new generation SIEM systems.

5. Main task of the developed prototype. Each research paper considered in Table I solves a specific challenge. All the prototypes considered in this paper are able to process huge amounts of stream data for the fast detection of information security incidents. The goal of the prototype we developed is to solve the challenge of creating a shared search and analytic core for the incident control and management system, able to search information through huge volumes of different data types.
VI. CONCLUSION

The paper presented the architecture and a prototype of a system for collecting, storing and processing security information and events based on big data technologies. To solve the problem, an analysis of relevant papers and of modern SIEM products and solutions implementing the collection, storage and analysis of system events and telemetry data was performed. Based on this, a generalized architecture of the monitoring system meeting the requirements was suggested. We selected a solution based on Elastic Stack and implemented a research prototype system on its basis. We conducted several experiments demonstrating the performance of the developed system. The ability of this solution to collect, store, structure and analyze data of any type with high performance, flexibility and extensibility provides wide opportunities for further product development in the area of security, monitoring and control of information systems. Future research and development will focus on further improvement of the system architecture, on the study of the interaction of the components with each other during security information and event processing, and on the analysis and experimental evaluation of the functioning of the system for different streams of security information and events.

VII. ACKNOWLEDGMENT

The work is performed by the grant of RSF #15-11-30029 in SPIIRAS.

REFERENCES

[1] Big Data Analytics for Security Intelligence. Cloud Security Alliance, September 2013, pp. 1-12.
[2] R. Zuech, T.M. Khoshgoftaar, and R. Wald, "Intrusion Detection and Big Heterogeneous Data: a Survey", Journal of Big Data, Springer, December 2015, pp. 1-42.
[3] I.V. Kotenko and I.B. Saenko, "Creating New Generation Cybersecurity Monitoring and Management Systems", Herald of the Russian Academy of Sciences, vol. 84, no. 6, 2014, pp. 993-1001.
[4] I. Kotenko, O. Polubelova, and I. Saenko, "Data Repository for Security Information and Event Management in Service Infrastructures", in SECRYPT 2012 - Proc. of the International Conference on Security and Cryptography, 2012, pp. 308-313.
[5] I. Kotenko and A. Chechulin, "Common Framework for Attack Modeling and Security Evaluation in SIEM Systems", in 2012 IEEE Intern. Conference on Green Computing and Communications, 2012, pp. 94-101.
[6] I. Kotenko, A. Chechulin, and E. Novikova, "Attack Modelling and Security Evaluation for Security Information and Event Management", in SECRYPT 2012 - Intern. Conference on Security and Cryptography, 2012, pp. 391-394.
[7] I. Kotenko, O. Polubelova, and I. Saenko, "The Ontological Approach for SIEM Data Repository Implementation", in 2012 IEEE International Conference on Green Computing and Communications, 2012, pp. 761-766.
[8] Apache Hadoop 2.7.2, Web: http://hadoop.apache.org/docs/current/.
[9] J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Google Inc., 2004, pp. 1-13.
[10] K. Shim, "MapReduce Algorithms for Big Data Analysis", Databases in Networked Information Systems, Lecture Notes in Computer Science, vol. 7813, 2013, pp. 44-48.
[11] Apache Storm, Web: http://storm.apache.org/.
[12] O. Santos, Network Security with NetFlow and IPFIX: Big Data Analytics for Information Security, Cisco Press, 2015, 320 p.
[13] K.M. Kavanagh, O. Rochford, and T. Bussa, 2016 Magic Quadrant for SIEM, Gartner, 10 August 2016.
[14] (2017, Jun.) HPE Security ArcSight ESM, Web: https://saas.hpe.com/en-us/software/siem-security-information-event-management.
[15] (2017, Jun.) IBM Security QRadar SIEM, Web: http://www-03.ibm.com/software/products/en/qradar-siem.
[16] (2017, Jun.) AlienVault OSSIM. [Online] Available: https://www.alienvault.com/products/ossim.
[17] (2017, Jun.) Splunk Enterprise. [Online] Available: https://www.splunk.com/en_us/products/splunk-enterprise.html.
[18] (2017, Jun.) ManageEngine EventLog Analyzer. [Online] Available: https://www.manageengine.com/products/eventlog/.
[19] (2017, Jun.) Cisco Systems OpenSOC. [Online] Available: https://github.com/OpenSOC/.
[20] (2017, Jun.) Apache Metron, Web: http://metron.incubator.apache.org/.
[21] (2017, Jun.) GitHub Apache Metron. [Online] Available: https://github.com/apache/incubator-metron/pulls.
[22] (2017, Jun.) Graylog. [Online] Available: https://www.graylog.org/.
[23] (2017, Jun.) Graylog Documentation. [Online] Available: http://docs.graylog.org/en/2.2/pages/configuration/elasticsearch.html.
[24] (2017, Jun.) Elastic Stack. [Online] Available: https://www.elastic.co/.
[25] J. Chen, Z. Tang, K. Bilal, S. Yu, C. Weng, and K. Li, "A Parallel Random Forest Algorithm for Big Data in a Spark Cloud Computing Environment", IEEE Transactions on Parallel and Distributed Systems, vol. 28, issue 4, 2017, pp. 919-933.
[26] X. Shu, J. Smiy, D. Yao, and H. Lin, "Massive Distributed and Parallel Log Analysis for Organizational Security", in IEEE Globecom Workshops, December 2013, pp. 194-199.
[27] Leveraging a Big Data Model in the Network Monitoring Domain, White Paper, VSS Monitoring, 2014.
[28] T. Dumitras and D. Shou, "Toward a Standard Benchmark for Computer Security Research: the Worldwide Intelligence Network Environment (WINE)", in BADGERS'11, 2011, pp. 89-96.
[29] P. Giura and W. Wang, "Using Large Scale Distributed Computing to Unveil Advanced Persistent Threats", Science Journal, vol. 1, no. 3, 2013, pp. 93-105.
