01-2020 DL CNN

This document summarizes a research paper on applying deep learning techniques to real-time web intrusion detection. The paper proposes an optimal convolutional neural network and long short-term memory (CNN-LSTM) model combined with spatial feature learning to detect attacks in HTTP traffic streams. The AI-based intrusion detection system (AI-IDS) is implemented as flexible and scalable Docker images. Experiments on public datasets and real-time data show the system can accurately detect sophisticated attacks, unknown patterns, and encoded/obfuscated attacks in real-time web traffic.


SPECIAL SECTION ON SCALABLE DEEP LEARNING FOR BIG DATA

Received March 22, 2020, accepted April 3, 2020, date of publication April 10, 2020, date of current version April 24, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.2986882

AI-IDS: Application of Deep Learning to Real-Time Web Intrusion Detection
AECHAN KIM 1,2, MOHYUN PARK 2, AND DONG HOON LEE 1 (Member, IEEE)
1 Graduate School of Information Security, Korea University, Seoul 02841, South Korea
2 Financial Security Institute (FSI), Yongin 16881, South Korea

Corresponding author: Dong Hoon Lee ([email protected])

ABSTRACT Deep Learning has been widely applied to problems in detecting various network attacks. However, no cases on network security have shown applications of various deep learning algorithms in real-time services beyond experimental conditions. Moreover, owing to the integration of high-performance computing, it is necessary to apply systems that can handle large-scale traffic. Given the rapid evolution of web-attacks, we implemented and applied our Artificial Intelligence-based Intrusion Detection System (AI-IDS). We propose an optimal convolutional neural network and long short-term memory network (CNN-LSTM) model, normalized UTF-8 character encoding for Spatial Feature Learning (SFL) to adequately extract the characteristics of real-time HTTP traffic without encryption, calculating entropy, and compression. We demonstrated its excellence through repeated experiments on two public datasets (CSIC-2010, CICIDS2017) and fixed real-time data. By training payloads that analyzed true or false positives with a labeling tool, AI-IDS distinguishes sophisticated attacks, such as unknown patterns, encoded or obfuscated attacks from benign traffic. It is a flexible and scalable system that is implemented based on Docker images, separating user-defined functions by independent images. It also helps to write and improve Snort rules for signature-based IDS based on newly identified patterns. As the model calculates the malicious probability by continuous training, it could accurately analyze unknown web-attacks.

INDEX TERMS Computer networks, intrusion detection, neural networks, large-scale systems, intelligent
systems, real time systems, security, CNN-LSTM.

The associate editor coordinating the review of this manuscript and approving it for publication was Moayad Aloqaily.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 70245
A. Kim et al.: AI-IDS: Application of Deep Learning to Real-Time Web Intrusion Detection

I. INTRODUCTION
As technology evolves, cyber-criminals are also improving their attack methods, tools, and techniques to exploit organizations. In particular, public web-services are common services that anyone can access, and many companies provide their services through open webpages. If a web-service fails or is compromised, it can cause a drop in corporate reputation or revenue. In general, security managers prevent intrusions from external attacks by registering deny-all blacklist policies for unused services in the firewall, but web-services in the Internet Demilitarized Zone (DMZ) cannot be blocked by firewalls because they are always open to public access. As such, identifying normal access and differentiating it from malicious attacks is an important task in cybersecurity. In reality, many security incidents have originated with web-attacks such as information disclosure, service failures, and malware infections.

Hypertext transfer protocol (HTTP) [1] is an application-level protocol for distributed, collaborative, hypertext information systems. Today's HTTP is evolving: information is transferred from web pages, and it is also used for exchanging or sending system commands to various devices, such as command-lines, update scripts, and mobile apps. Web-attacks often exploit vulnerabilities in applications in open web services rather than perform a host-level system penetration. The attacker sends exploit code that uses a vulnerability in a specific domain or path file of the webserver. The webserver or device that is injected with the code can subsequently be compromised by the attacker [2].
Traditionally, intrusion detection is a major research field in network security, as it is important to identify unusual access to secure internal networks. An Intrusion Detection System (IDS) is used to promptly identify intrusions, attacks, or violations of security policies in a network or host system [3]. An IDS that inspects network packets to detect attacks is called a Network Intrusion Detection
System (NIDS). An NIDS collects traffic mirrored by network devices, such as switches, routers, and network terminal access points (TAPs), and acts as a surveillance device for monitoring network infringements and policy violations [4]. Many organizations operate an NIDS with firewalls and an application firewall (L7) to protect web-servers that are on the same network and system. An NIDS mostly runs signature-based detection using Snort IDS rules. The analyst writes a user-defined pattern into the rules to detect an attack. When there is a malicious payload in the network traffic, the rule triggers security events, including detection time, source/destination IP (metadata), and some raw packets (payloads). String or pattern matching is reliable and generates very few false alarms but does not identify unknown or irregular pattern attacks. Recent sophisticated cyber-attacks use irregular patterns such as encoding and obfuscation to bypass security systems. To solve these problems, we applied AI-IDS to detect variant attacks that cannot be identified by legacy signature-based NIDS.

A. LIST OF CONTRIBUTIONS
The main contributions of this paper are summarized as follows:

1) APPLYING DEEP LEARNING TO REAL-WORLD NETWORKS
We have successfully applied AI-IDS to big-data scale traffic. The AI-IDS is a flexible and scalable system that is implemented based on Docker images, and separates user-defined functions by independent images.

2) PROPOSE A FAST AND EFFECTIVE PREPROCESSING METHOD
We implemented fast and effective spatial feature learning through normalized UTF-8 character encoding, without applying computationally intensive algorithms, such as entropy, compression, and encryption. For example, the entropy of a string requires probability calculation, followed by multiplication and logarithm. Instead, our proposed method can preprocess strings with a single operation.

3) PROPOSE OPTIMIZED CNN-LSTM MODEL FOR BIG DATA
We demonstrated the process of model design in detail via performance evaluation between CNN-LSTM, LSTM-CNN, and DNN models based on fixed real-time data from HTTP request packets. Hyper-parameters were determined for each model through repeated experiments. An optimized neural network model was validated through experiments on public datasets (CSIC-2010, CICIDS2017) and fixed real-time data.

4) PROOF OF EFFICACY AND APPLICATION
We proved that AI-IDS could detect unknown attacks, such as obfuscated or encoded malicious payloads; it can help improve existing Snort rules and write new rules for newly identified patterns.

B. CONDITIONS AND ASSUMPTIONS
This study uses the following conditions and assumptions:

1) AI-IDS: AN OPEN SOURCE SOFTWARE
AI-IDS software contains the following license and notice: Licensed under the MIT License. You can access the source-code directly on Github in our repositories [5].

2) PARALLEL OPERATIONS: IDS, TAS
An IDS and a Traffic Analysis System (TAS) operate independently and do not affect each other. We used a signature-based NIDS for intrusion detection and a Splunk StreamApp-based TAS for collecting real-time traffic. A TAS is equivalent to a packet monitoring system.

3) APPLICATION-LEVEL PACKETS INSPECTION
We focused on HTTP, which is commonly used in web services, and on its request headers and payload data. We did not address low-risk attacks from protocols below the application layer, such as user datagram protocol (UDP).

The remainder of this paper is organized as follows. Section II presents related works, limitations of meta-datasets, and the motivation for this study. Section III introduces the security operations for deep learning. Section IV shows our spatial feature learning algorithms for big-data, optimal CNN-LSTM model, and AI-IDS infrastructure. Section V shows the experimental results. Section VI introduces the efficacy and applications. The last Section VII shows the conclusion.

II. BACKGROUND
This section describes related studies on deep learning-based IDS, the limitations of meta-datasets, and the motivation for this study.

A. RELATED WORKS
Recent studies on intrusion detection using various deep learning (DL) techniques have been published since 2017. Table 1 lists related studies on intrusion detection using DL algorithms by model, features, datasets, and performance measures. Liu et al. [6] showed that when compared with other IDS classifiers, intrusion detection models based on a convolutional neural network (CNN) have the highest detection rate and precision. The feasibility of applying a CNN to intrusion detection has been demonstrated. The authors argue that the performance of CNN-based techniques is better than that of other machine learning classification techniques. Wang et al. [7] designed an IDS using a CNN to automatically train and look for traffic characteristics, effectively reducing the false alarm rate (FAR). This study shows that deep learning techniques can be used to extract and learn the characteristics of network traffic in detail. Yin et al. [8] proposed an RNN-IDS and compared it with ANN, random forest (RF), SVM, and other machine
learning methods. An RNN-IDS is suitable for modeling a classification model with high accuracy, and its performance is superior to that of traditional machine learning classification methods in both binary and multiclass classification.

TABLE 1. Related works on intrusion detection.

Shone et al. [9] presented a non-symmetric deep auto-encoder (NDAE) for unsupervised feature learning. This study improves the classification performance on KDD99 and NSL-KDD99 by comparing an auto-encoder with a non-symmetric deep auto-encoder (NDAE). Wu et al. [10] devised CNN and RNN for attack detection; however, their model differs from the model used in this study because it performed separate experiments on the CNN and RNN model. Naseer et al. [11] investigated the suitability of deep learning approaches for anomaly-based intrusion detection systems. Ding and Zhai [12] compared the performance of models using multi-class classification with the performance of traditional machine learning methods.
Otoum et al. [13] devised DL for an IDS available on wireless sensor networks (WSNs), and also compared the restricted Boltzmann machine-based clustered IDS (RBC-IDS) and an adaptive machine learning-based IDS: the adaptively supervised and clustered hybrid IDS (ASCH-IDS). Chouhan et al. [14] proposed a Channel Boosted and Residual learning-based deep Convolutional Neural Network (CBR-CNN) architecture for the detection of network intrusions. This study used Stacked Auto-encoders (SAE) and unsupervised training, and the performance of the proposed CBR-CNN technique is evaluated with an NSL-KDD dataset. Vinayakumar et al. [15] developed an IDS to detect and classify unforeseen and unpredictable cyberattacks using a DNN. The performance was tested with the DNN model and compared to the results of the NSL-KDD, UNSW-NB15, Kyoto, WSN-DS, and CICIDS2017 datasets. Chiba et al. [16] proposed a DNN model for an anomaly-based network IDS in a cloud environment using recent datasets, such as CICIDS2017, NSL-KDD version 2015, and CIDDS-001, with a hybrid optimization framework (IGASAA) based on the Improved Genetic Algorithm (IGA) and Simulated Annealing Algorithm (SAA). Zhang et al. [17] used a deep belief network (DBN) model to identify SQL injection attacks in
network traffic. Faker and Dogdu [18] experimented with improving the performance of intrusion detection systems on the CICIDS2017 and UNSW-NB15 datasets using a DNN and two ensemble techniques, RF and gradient boosting tree (GBT).
Previous research has suggested new ideas or algorithms for improving deep learning algorithms. Aloqaily et al. [19] introduced an automated secure continuous cloud service availability framework for smart connected vehicles that enables an intrusion detection mechanism against security attacks.
However, most previous deep learning-based studies have difficulty applying intrusion detection in real-world environments because the data for the models were usually pre-processed into metadata formats in an experimental environment. Few studies have shown how to apply them in real time in the real world.

B. LIMITATIONS OF META-DATASETS
Previous studies [8], [12], [15], [16] mainly focused on extracting or analyzing features from metadata rather than paying attention to exploited raw packets. Because network traffic changes with trends, real-world accuracy drops significantly without continuous re-training, even if a system is 99.9% accurate in an experimental environment. The following is a description of the KDD99, NSL-KDD, and PU-IDS datasets that have been widely used in previous works.

1) KDD CUP 1999 DATASETS
The KDD Cup 1999 dataset [20], which is based on the DARPA dataset, is the most widespread dataset for intrusion detection. The dataset contains TCP high-level attributes, such as the connection window, but no IP addresses. KDD99 involves more than 20 different types of attacks and comes with redundant records in the test-set [21].

2) NSL-KDD DATASETS
NSL-KDD [22] is a dataset that has been enhanced from KDD99, removing much of the duplicated data from KDD99 and creating a more advanced sub-dataset. The dataset consists of separate and predefined training data and test data for intrusion detection. NSL-KDD uses the same attributes as KDD99, and its attacks belong to four categories: R2L, Probe, U2R, and DoS; records outside these belong to the other category [23].

3) PU-IDS DATASETS
PU-IDS [24] is a derivative of the NSL-KDD dataset, and its author has developed a generator that extracts the statistics of the input dataset and then creates a new dataset. The traffic generator produces the same attributes and format as the NSL-KDD dataset.
While previous studies mainly used KDD99 and NSL-KDD, these are not suitable as datasets for real-time detection. These datasets deal with metadata and therefore make it difficult to identify invalid attacks in a practical environment, because metadata are not attack attempts. Moreover, most public datasets contain redundant information and an unbalanced number of categories. For instance, Ring et al. [21] compared the characteristics of intrusion detection datasets used in previous works. This study shows that various previously published datasets model repetitive and inefficient attacks, such as DoS, UDP flooding, and brute force, which differ from recent web-attack trends. In fact, types of attacks and trends in the data are constantly changing; therefore, it is necessary to develop a general-purpose model that is not biased toward current trends.
Another problem with most published datasets is that models are often over-fitted due to duplicated or flow-based metadata, so the performance of the model is significantly inflated in experimental conditions. If the model is applied in practical services, it will face a serious problem with false-positive alarms. Likewise, the work of Sabhnani and Serpen [25] has shown that when using the KDD99 dataset, it is not possible to successfully train pattern classifications or machine learning algorithms for misuse detection.
Nevertheless, most previous studies measured the model performance of deep learning or machine learning techniques in experiments using KDD99 datasets. Yin et al. [8] also used the KDDTest +- dataset to compare performance with the RNN model, and a recent work by Vinayakumar et al. [15] experimented using the DNN model on public data such as KDD99, NSL-KDD, and UNSW-NB15, with machine learning techniques such as LR, NB, RF, and DT. Gu et al. [26] demonstrated that validated training data is an essential determinant for successful research that can greatly enhance the detection capability. Moreover, Moustafa et al. [27] compared the characteristics of various public datasets and suggested that datasets that are not based on reality can lead to misguided research.

C. MOTIVATION
One of the challenges faced by security operations is inefficient operation due to false-positive alarm events. It wastes IDS resources and reduces the performance for effective deep learning; therefore, the issue of false-positives should be addressed properly to detect threats in big-data infrastructure. Misuse detection, which is broadly applied in SOCs, uses predefined signatures for filtering and detecting attacks. It relies on human inputs by constantly updating the signature database. This method is accurate in finding known attacks but is completely ineffective for unknown attacks. In most cases in real-world environments, misuse detection generates a high false-positive rate similar to anomaly-based IDS. In the study of Mishra et al. [4], performance optimization was needed during the detection process to deal with false-positive issues. However, most previous works do not adequately address the false-positives issue in the real world due to performance evaluations with limited datasets in experimental environments. To mitigate the false-positive problem, high-quality
training data is a basic determinant for improving DL model performance.
The most common issues with existing solutions based on learning models include
– First, the learning models produce a high false-positive rate with a wide range of attacks.
– Second, the learning models are not adaptive to the real world, as meta-datasets like KDD Cup 99 were mainly used to evaluate the performance of the learning model.
– Third, previous studies were unable to foresee today's huge network traffic; therefore, scalable solutions are required to maintain a high performance with a rapidly increasing high-speed network size.
– Finally, no cases have been published on DL applications for the detection of unknown attacks on real-world computer networks. These challenges form the primary motivation for the application of deep learning-based NIDS.

III. SECURITY OPERATIONS FOR DEEP LEARNING
This section introduces the security operations for deep learning applications and the data design for practical training.

A. OVERVIEW OF SECURITY OPERATIONS
We detected and analyzed intrusion attempts into financial networks to prevent electronic financial incidents. The Security Operations Center (SOC) also plays the role of an Information Sharing and Analysis Center (KF-ISAC). Fig. 1 shows that the SOC collected real-time network traffic and detected malicious network traffic by directly installing an NIDS, a TAS, a TAP, and a virtual private network (VPN) in the Internet DMZ area of many financial companies in South Korea. The SOC operated continuously, 24 hours a day, 7 days a week, and 365 days a year. About 20 people work in shifts and generate daily analytical information for training. The IDS and TAS data were transferred to the SOC via VPN from financial institutions, and the SOC collected approximately 1 billion real-time HTTP requests per day (Sep. 2019).

FIGURE 1. AI-IDS applied Security Operations Center (SOC).

An NIDS is a signature-based misuse detection system based on Snort rules, and a TAS is a system that collects network traffic in a user-defined function. A TAS is a type of system that collects network traffic and enables users to analyze traffic by collecting various protocols, including HTTP, SMTP, and SSH. It allows analysts to analyze anomalies by collecting various network protocols from the network layer to the application layer. We use the Splunk StreamApp [28] as the traffic collection and analysis system. For the effective detection and analysis of cyber-attacks, we recommend running NIDS and TAS in parallel. If security events are alerted on an NIDS, a TAS could inspect the same malicious payloads on the network.
An NIDS and a TAS inspected a variety of protocols, such as SSDP, DNS, SMTP, POP3, HTTP, and SSL, from the network layer to the application layer on the network. As UDP-based protocols do not establish a three-way handshake, their traffic is difficult to attribute to an IP address and can easily be forged. Thus, we did not analyze invalid UDP-type or denial of service (DoS) attacks to maintain stable performance. The SSL protocol was excluded from our scope because it is not possible for an analyst to review the malicious payload.
In managed security monitoring operations, security managers process security events in the order of Detection, Analysis, and Prevention. "Detection" means collecting security alerts generated by user-defined Snort rules on NIDS or TAS, which include the detection time, source IP, destination IP, port information, and signature messages listed in Table 2. "Analysis" refers to classifying events into true or false positives by reviewing detection information. "Prevention" means registering malicious IP addresses to blacklists, which are then blocked from accessing service websites. Prevention is applied to very obvious attack patterns, and it is recommended to block access from certain attacks only after being verified by an analyst or system. The proposed AI-IDS is used as a supplementary system alongside legacy signature-based NIDSs for network layer security.

TABLE 2. Attributes of analysis information.

B. DATA DESIGN FOR PRACTICAL TRAINING
The proposed AI-IDS trains on labeled analysis information based on inbound HTTP data from the managed services instead of metadata sets from a constrained environment. We detect about 200,000 attacks on about 1 billion HTTP requests
per day on legacy signature-based NIDS, and we perform about 10,000 automatic and manual analyses. During general security operations, malicious detection information is triggered by the NIDS when an attack packet occurs in the network communication. Daily training data in the production environment is labeled in real time by security analysts using labeling tools.
The labeled analysis information is shared with AI-IDS and used as training data for prediction in neural networks. We implemented deep learning models for three attack types in real-time HTTP traffic: "Password guessing and Authentication bypass (AUT)," "SQL Injection (SQL)," and "Application vulnerability attack (APP)." For UDP-type attacks, such as "information gathering" or "denial of service," it is difficult to identify the attacker's IP address when compared to TCP, because the session is not connected perfectly, and the traffic contains meaningless repeated data; therefore, we excluded it from the deep learning model. Besides, HTTP traffic related to malware infection events is often detected when the malware connects to the C2 server after infection. Unlike general intrusion events, whose traffic is sent from an external IP to an internal IP, malware events' traffic is usually in the opposite direction.
The security event shown in Table 2 consists of the detection time, detection site, direction, source IP, source port, destination IP, destination port, signature name, raw packet (pcap file), and flag. "Detection Time" is the time the signature generated the event, and "Detection Site" is the location identifier where IDS and TAS were installed. "Direction" shows the direction of attacks based on assets between the source IP and the destination IP. "Source IP/port (src_ip, src_port)" is the IP/Port address that requests a connection from the client to the server, and "Destination IP/port (dest_ip, dest_port)" is the IP/Port address from the server to the client. Most of the above metadata are managed as Critical IP or Threat Intelligence by security administrators.
The number of HTTP requests collected per day was approximately 1 billion, of which about ten thousand yielded analysis information on attack events detected in HTTP. We target a normal-to-abnormal ratio of 5:5; however, the malicious analysis information amounts to 65 MB for the last year, whereas benign HTTP traffic is 6 GB per day. To equalize the two for training in the deep learning model, the 65 MB of HTTP payloads analyzed during one year was replicated 100 times by concatenation and shuffling, so that the analysis information matched the roughly 6 GB per day of normal traffic. Malicious events identified by analysts were used as data for re-training. The training data was approximately 6 GB per day, and the analysis information from the duration of 1 year was changed sequentially like a sliding window.

IV. DESIGN AND IMPLEMENTATION
This section introduces a fast and effective spatial feature learning based on normalized UTF-8 character encoding, the detailed AI-IDS architecture, and the structure of a neural network model for large-scale web traffic.

A. SPATIAL FEATURE LEARNING BASED ON NORMALIZED UTF-8 CHARACTER ENCODING
Feature extraction is one of the most important tasks in designing an efficient learning algorithm. Mamun et al. [29] devised a combined preprocessing technique using attributes of information theory such as entropy, encoding, and compression. Theoretically, the entropy of encrypted or irregular data is high as there are many uncertainties in the data stream. The entropy value indicates the degree of uncertainty of the information, but it is difficult to extract the feature by matching the unique characters of the given data 1:1. For example, entropy can express the uncertainty of information as a number in the range of 0 to 1, but a collision problem arises because the same entropy can be calculated even if different data are given. For this reason, it is difficult to extract unique features of a given string as it is. Thus, we use normalized UTF-8 encoding so that the deep learning model recognizes the data with its own characteristics. It is simple and fast, because it does not include unnecessary entropy calculations, compression, encryption, or anything else. Assuming that billions of HTTP requests of up to 1,000 bytes each are preprocessed per day, the UTF-8 encoding method achieves fast preprocessing at only about 1 × 2^8 × 1,000 billion. The biggest advantage of UTF-8 is that it cannot be confused with another single encoding method, so there is no possibility of it being wrongly handled as another encoding, such as UTF-16 or the national-language encodings EUC-KR (Korean) and GB2312 (Simplified Chinese). As both browsers and web servers are now developed assuming UTF-8, it is a very efficient way to preprocess HTTP traffic.
The UTF-8 encoding in Algorithms 1-2 converts up to 256 distinct byte values into floats, so special characters in the packet, including Simplified Chinese as seen in WebDAV attacks, can also be encoded into numbers. Preprocessing with 7 bits or fewer (ASCII only) makes it difficult to handle the variety of characters found in a real environment. We used the normalized UTF-8 encoding and the "parse" module, as shown in Figs. 2 and 3. Each input byte was replaced with its unique value in the range of 0 to 255 (256 features), and the input string was converted into float values between −1.000 and 1.000 by y_s = −(y_s − 128)/128, where y_s denotes the transformed value at position s of one training sample, s ∈ {0, 1, 2, ..., 999}.
Fig. 2 shows a preprocessing example for "http://target.com/manager/html/." When comparing our proposed UTF-8 encoding with entropy-based encoding, our proposed method is a normalized calculation expression. The entropy of a string requires probability calculation, followed by multiplication and logarithm. Entropy-based preprocessing involves two steps of calculating the probability of each string and then calculating
the log. Instead, our proposed method can preprocess strings with a single operation and has no data transformation or substitution in the process. Normalized UTF-8 encoding generates input values so that the deep learning model can train immediately.

Algorithm 1 UTF-8 Character Encoding
Input: content_string (web traffic)
Output: *.npy (preprocessed file)
FUNCTION save_data(content_string_list, npy_filename)
    numpy_array <- numpy.empty()
    FOR content_string in content_string_list
        byte_array <- []
        FOR character in content_string
            byte_array.append(character.encode('utf-8'))
        ENDFOR
        int8_array <- []
        FOR byte in byte_array
            int8_array.append(byte.toint8())
        ENDFOR
        //
        // float_array <- []
        // FOR int_8 in int8_array
        //     float_array.append((int_8 - 128.0) / -128.0)
        // ENDFOR
        //
        // content_array <- numpy.array(float_array)
        // numpy_array.append(content_array)
        //
        // for data size issues,
        // the actual array is saved from int8_array
        // and the float is calculated just before training
        content_array <- numpy.array(int8_array)
        numpy_array.append(content_array)
    ENDFOR
    numpy.save(npy_filename, numpy_array)
ENDFUNCTION

Algorithm 2 Spatial Feature Learning (train/test)
Input: *.h5 (model), *.npy (preprocessed file)
Output: performance metrics
FUNCTION train_model(model, train_file_list, x_dim)
    npy_list <- list(load(filename) for filename in train_file_list)
    data <- concatenate(npy_list)
    train_size <- data.shape[0]
    x_train <- array(data[:, :-1])
    x_train <- (x_train - 128.0) / -128.0
    x_train <- x_train.reshape(train_size, x_dim, 1)
    y_train <- data[:, [-1]].reshape(train_size, 1)
    y_prediction <- model.predict(x_train, batch_size = 4096)
    y_merged <- (y_prediction.round() * 2 + y_train).flatten()
    value, counts <- unique(y_merged, return_counts = True)
    value_str <- list(map(lambda x: str(int(x)), value))
    metrics <- dict(zip(value_str, counts))
    loss <- binary_crossentropy(y_train, y_prediction)
    metrics['Loss'] <- average(loss)
    RETURN metrics
ENDFUNCTION

B. AI-IDS ARCHITECTURE
Fig. 3 shows an enlarged representation of the AI-IDS and Index Cluster, as shown in Fig. 1. Individual modules are configured as Docker images/containers that output files after the Docker process. No Docker container affects another, and they all run independently. However, Docker volumes share the same data across the series of processes, from pre-processing (parsing), training and testing, to prediction. Our AI-IDS process is as follows: (i) data save and splitting: collecting web traffic and splitting training data for each model; (ii) data preprocessing and training by labeled analysis information; (iii) prediction for suspicious payloads on new web-traffic.
Previous [30] research designed a CNN that can be trained The following is a detailed process description of the
as a corpus to process natural language between sentences AI-IDS, as shown in Fig. 3:
and words. However, it functions closer to image process-
ing than natural language processing because cybersecurity 1) DATA SAVE AND SPLITTING
corpora have a different attribution compared with natural Index Cluster collects true or false positive analysis informa-
language. A corpus in the field of cybersecurity is diffi- tion and normal traffic for training data and then stores it in
cult to create because it needs to understand string classi- ‘‘(labeled analysis info) data_save’’ (name of docker image).
fications and attributes, for example, ‘‘get, post, head, put, ‘‘data_save’’ saves legacy NIDS payload data along with its
php, cgi, admin, wget, ‘POST /manager/html’, ‘User-Agent: analysis results, and also previously labeled data by AI-IDS.
Mozilla/5.0’.’’ In our initial model, we were trying to train Simultaneously, ‘‘(new traffic) data_save’’ stores real-time
the security corpus into the CNN filter and LSTM layers. HTTP traffic for prediction in ‘‘ai_payloads’’ for the previous
However, as there is currently not enough research on cyber- 3h to −10 min. ‘‘data_split’’ saves data in ‘‘ai_payloads’’
security corpora, we have implemented deep learning on all by splitting the data according to the intrusion attack types
strings of the HTTP data. If a cybersecurity’s corpus was (AUT, SQL, APP) to generate training data for each deep
created, deep learning model performance is expected to be learning model. Each process module has one or more input
improved. and output files. The real-time web traffic is transferred into

VOLUME 8, 2020 70251


A. Kim et al.: AI-IDS: Application of Deep Learning to Real-Time Web Intrusion Detection

FIGURE 2. Comparison of normalized UTF-8 encoding and entropy-based encoding.

application-level strings for AI-IDS, and the ‘‘data_backup’’ TABLE 3. Samples of prediction output.
module backs up raw-data which has been collected more
than 24 hours in the past.

2) DATA PREPROCESSING AND TRAINING


Output files for parsing are shared through the “ai_payloads” volume. Each split raw-data file (in the form of .csv files) is classified by its attack type. It is then preprocessed by “parse,” and the preprocessed data (.npy files) are saved into a Docker volume named “ai_parse_data.” The raw web-traffic strings are transformed into trainable float data for deep learning through our UTF-8 encoding with zero-padding. After the “train/test” module searches for the optimum performance, the h5 files in “ai_model” store the model’s best hyper-parameters and states achieved by deep learning, for classifying malicious and benign traffic.
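The transformation from raw strings to trainable floats can be sketched in a few lines. This is a minimal illustration, not the production code: the function name `encode_payload` is hypothetical, the default length of 1,000 bytes is the fixed input size given in Section IV-C, bytes are treated as unsigned values (an assumption), and the scaling follows the `(x - 128.0) / -128.0` expression of Algorithm 2.

```python
def encode_payload(text: str, length: int = 1000) -> list:
    """UTF-8 encode, truncate or zero-pad to `length` bytes, then normalize."""
    raw = text.encode("utf-8")[:length]        # byte values in 0..255 (assumed unsigned)
    padded = raw + bytes(length - len(raw))    # zero-padding for short payloads
    # Scaling from Algorithm 2: (x - 128.0) / -128.0, so 0 maps to 1.0 and 128 to 0.0
    return [(b - 128.0) / -128.0 for b in padded]

x = encode_payload("POST /manager/html")
print(len(x), x[0], x[-1])
```

Under this scaling a padding byte (0) becomes 1.0, so the padded tail of a short request is a constant run that the model can learn to ignore.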
3) PREDICTION FOR SUSPICIOUS PAYLOADS
To predict suspicious payloads, “predict” inspects the real-time data (payload_data.tmp) stored in “ai_payloads” using the h5 model trained by the “train/test” module. “predict” also stores its output as JSON files, including metadata, suspicious payloads, and the predicted probability of maliciousness. The predicted data in the “ai_prediction” volume are potentially suspicious events identified by each model and are stored periodically (saved 8 times a day) until they are finally reviewed or labeled by a security analyst. The output files are accumulated into the labeling tools in Fig. 7.
Table 3 shows the contents of a sample JSON file generated by “predict” and stored in “ai_prediction.” The fields are stored in the data frame in the following order: “_time,” “payload_id,” “model_name,” “prediction,” “src_ip,” “src_port,” “dest_ip,” “dest_port” and “src_content,” where “src_content” holds the payload, such as “POST /manager/html.” “payload_id” is the prediction event id, whose value can be identified and is generated by “hexdigest (sha1(_time@src_ip: src_port-> dest_ip: dest_port)).”
The infrastructure of the deployed center system consists of Splunk Architecture and our AI-IDS. Splunk Architecture consists of a Search Head Cluster with multiple search heads and an Index Cluster with dozens of indexers. One of the search heads was built as an independent and dedicated system to interface with AI-IDS. The AI-IDS was developed with Docker 18.09.5, Python 3.6.7, Tensorflow 1.13.1, Keras 2.2.4 and Splunk SPL 7.2.3. The test-bed system is an HP DL380G9: 2.1 GHz 2P/8C(16C) CPU, 416 GB RAM, Tesla P100 16 GB × 1EA GPU, 960G × 2(RAID-1) SSD, 6 TB × 2(RAID-1) HDD and 10 Gbps LAN. The operating server consisted of an HP DL390G10: 2.4 GHz 2P/20C(40C)

CPU, 1 TB RAM, Tesla V100 32 GB RAM × 2EA GPU, 960G × 4(RAID-5) SSD, 10 TB × 4(RAID-5) HDD and 10 Gbps LAN.
As shown in Fig. 1, the sensor systems were located in several financial institutions, and IDS alert events and TAS traffic were collected and transmitted to the SOC via VPN from financial institutions. The experiment was conducted on the test-bed system, and deployment to the operating system took place only after performance and functional verification were completed.

FIGURE 3. AI-IDS architecture.

C. OPTIMIZED CNN-LSTM MODEL
Table 4 and Fig. 4 show the CNN-LSTM structure and its hyper-parameters. Each UTF-8 encoded HTTP record, including the variable-length HTTP header and payloads, is the initial input value of the proposed neural network model and is made into a fixed-length input value of 1,000 bytes (1 dimension × 1,000 bytes). Strings corresponding to the header and body of the HTTP request from the 0-th byte to the n-th byte are aligned, and the rest of the data is zero-padded. AI-IDS preprocessing continuously collects data for 3 hours in 1 cycle. AI-IDS runs 8 learning cycles a day and predicts every 3 hours on real-time traffic, which allows for monitoring around the clock. In the training phase, the labels indicating “malicious,” “benign,” and “unknown” are recorded at the end of the 1,000 bytes of an HTTP request, and the model calculates a malicious probability when all neural network operations are completed.
The initial input data at the CNN layer generate 1 × 1,000 × 12 feature data through convolution with 1 × 4 × 12 filters. After 1/5 max-pooling, 1 × 200 × 12 pieces of data are stored in the memory in normalized form. In the second convolutional layer, 1 × 200 × 60 data are generated through convolution with a 1 × 4 × 5 filter, and then 1 × 40 × 60 data are generated as a result of 1/5 max-pooling and normalization. Data output from the CNN layer is used as an input to the RNN layer, and data processed into cells of 1 × 40 × 60 are sequentially input to the Forward LSTM and Bidirectional LSTM. The first LSTM cell is calculated in the forward direction with 16 cells, the second LSTM cell

is processed in a bidirectional flow, and the final 32 LSTM cell outputs are transferred to the DNN layer by combining the accumulated forward and backward cells.
The output value of the calculated RNN is input into each of the 12 fully connected DNN layers. Until the DNN output layer, dropout was set to 0.1, and the LeakyReLU function was applied. The sigmoid activation function was used at the DNN output layer, and the model was trained for prediction on malicious payloads using the Adam optimizer along with binary cross-entropy (BCE) as the loss function. The probability calculated in the output layer is included in the JSON output file shown in Table 3, and the output files are shared with the Index Cluster, as shown in Figs. 1 and 3.
The analyst reviews the probability calculated by AI-IDS and examines the payload to determine whether an attack warning is valid or not. During the training phase, AI-IDS uses labeled analysis information from an analyst: (i) attack alert events detected by IDS and (ii) valid attack events that the analyst has confirmed. As the AI-IDS aims to detect new threats in the predict phase, the security events detected by legacy signature-based IDS are considered duplicate data. It calculates malicious probability for new and real-time payloads and outputs prediction results.
The composition and depth of the CNN, RNN, and DNN layers are set to the optimal parameters found through repeated experiments in the training phase. The structure and parameters of the neural network are slightly different when iterative experiments are performed on various datasets to select optimal performance parameters. The proposed model is devised with an intuitive design based on the theoretical basis of a previous study, and we verified the model's validity through repeated experiments. In the next section, we present the detailed experimental results to demonstrate the validity and performance of our proposed model.

TABLE 4. Summary of proposed CNN-LSTM model.

FIGURE 4. Structure of optimized convolutional recurrent neural networks.

V. EXPERIMENTS
This section presents the performance measurements, experimental design, and results: a comparison of the CNN-LSTM, LSTM-CNN, and DNN models and the experimental results on the KF-ISAC, CSIC-2010 HTTP, and CICIDS 2017 datasets.
We have defined the following experimental statements for the deep learning application:
• Selection of experimental data: CSIC-2010, CICIDS 2017 HTTP dataset, real-time HTTP data
• Design of optimal model structure using deep learning: CNN-LSTM model, LSTM-CNN model
• Determination of hyper-parameters: This is required for individual neural networks, such as CNN, RNN, DNN: conv_depth, conv_filter, lstm_units, and dense_units
• Model validation: experiments on two public datasets (CSIC-2010, CICIDS2017) and fixed real-time data
In the first experiment, we selected the optimal model by comparing CNN-LSTM with LSTM-CNN based on real-time HTTP traffic on a fixed date. In the second experiment, we validated the optimal model using two public datasets (CSIC-2010, CICIDS 2017 HTTP dataset) and real-time data. Recently, various models have been introduced that optimize performance by combining CNN, RNN (LSTM), and DNN. Liu et al. [31] and Wu et al. [10] devised CNN and RNN models for intrusion detection, but their work differs from this study because they experimented with the CNN and RNN as separate models. In this paper, a DNN was selected as the last layer to output a single result; we chose the model that can best characterize the data between CNN-LSTM and LSTM-CNN.

A. PERFORMANCE MEASUREMENT
We used a confusion matrix to evaluate the performance of the deep learning model. A confusion matrix is a popular indicator of the performance of classification models. The matrix in Table 5 shows the number of correctly and incorrectly classified results, compared to the actual outcomes in the test data. One of the advantages of a confusion matrix as an evaluation tool is that it allows for a more detailed analysis. The matrix is n by n, where n is the number of classes. The simplest classifiers, called binary classifiers, have only two classes: positive/negative. The performance of a binary classifier is summarized in a confusion matrix that cross-tabulates predicted and observed examples into four categories [8], [32].
In our deep learning model, Precision and F-Score are more important performance indicators than others. Moreover, the purpose of AI-IDS is to obtain a higher accuracy with a lower false-positive rate.

TABLE 5. A confusion matrix.

TABLE 6. Rules for performance measurement.
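The indicators built from the confusion matrix can be sketched as follows. This is an illustration using the standard definitions of accuracy, precision, recall, specificity, and F-score (taken here as the harmonic mean of precision and recall, i.e. F1). The function name and the counts in the usage line are invented for the example and are not results from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification indicators; assumes non-zero denominators."""
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),  # share of actual negatives that are found
        "f_score": 2 * precision * recall / (precision + recall),
    }

# Illustrative counts only (hypothetical, not taken from the experiments).
m = classification_metrics(tp=90, fp=10, tn=80, fn=20)
print(m)
```

Precision falls as false positives grow, which is why it matters for an analyst-facing system: every false positive is an event someone has to review.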
We describe the four indicators that make up the confusion matrix in Table 5, as follows:
• True Positive (TP): the number of cases correctly predicted as positive.
• False Positive (FP): the number of cases incorrectly predicted as positive (labeled as negative).
• True Negative (TN): the number of cases correctly predicted as negative.
• False Negative (FN): the number of cases incorrectly predicted as negative (labeled as positive).
We use the following notation in Table 6 for the model evaluation:
• Accuracy (ACC): the proportion of the number of correctly predicted cases to the total number of labeled records.

• Precision: the proportion of the number of cases correctly predicted as positive to the number of cases predicted as positive; high precision relates to a low false-positive rate.
• Recall (Sensitivity, Detection Rate): the proportion of the number of cases correctly predicted as positive to the number of cases labeled as positive.
• Specificity: the proportion of the number of cases correctly predicted as negative to the number of cases labeled as negative.
• F-Score: the harmonic mean (a weighted average) of Precision and Recall; this score considers both false positives and false negatives.

B. EXPERIMENTAL DESIGN
We describe the details of the experimental datasets in the following paragraphs and in Table 7.

TABLE 7. Experimental datasets.

1) REAL-TIME HTTP DATASETS (KF-ISAC)
KF-ISAC HTTP data is real-time HTTP stream data during fixed dates from a TAS. The proposed model trains on a mix of benign HTTP data and labeled malicious payloads that have been analyzed over the past year. It evaluates performance by separating training and test data at an 8:2 ratio. The label in the training data is located at the end of the preprocessed data.

2) CSIC-2010 HTTP DATASETS
CSIC-2010 HTTP data [33] was provided by Aberystwyth University. The contributors collected HTTP packets to detect web attacks. The dataset contains 36,000 normal requests and more than 25,000 anomalous requests. The data consists of normal HTTP data for training and normal/abnormal data for testing. We generated 1,941,300 records by augmenting 20 times from the original 77,652 records and split the set in a ratio of train 8: test 2, except for 6 error records during data import.

3) CICIDS2017 HTTP DATASETS
The CICIDS2017 datasets [34] generated in 2017 by the Canadian Institute for Cybersecurity overcome these issues. The CICIDS2017 benchmark dataset contains the abstract behavior of 25 users based on HTTP, HTTPS, FTP, SSH, and email protocols. We use only the HTTP datasets, including web attacks, and generated 586,180 records by augmenting 20 times from the original 29,309 records. The dataset consists of the entire abnormal/normal pcap file, the unlabeled HTTP attack, and the metadata, including label data.

C. EXPERIMENTAL RESULTS
1) MODEL SELECTION
We implemented CNN-LSTM and LSTM-CNN structures for an optimal deep learning model selection and then performed 10 iterations using the real-time HTTP data shown in Table 7. The training data of KF-ISAC consisted of approximately 6.6 million records extracted and proposed on a specific date, and each of the normal/attack classes was composed of approximately 3.3 million records. The training and test datasets were split at a ratio of 8:2. The results of the experiment shown in Fig. 5 are the average values of the results of 50 epochs. The overall model performance of CNN-LSTM is better than that of LSTM-CNN in areas such as accuracy, precision, and F-Score. In particular, there are large differences in Precision and F-Score because of the True/False Positive Rate. The model performance from the highest to the lowest is CNN-LSTM, LSTM-CNN, and DNN. CNN-LSTM reduces the dimension by max-pooling at the initial step, but LSTM-CNN takes more time to train because the dimension and parameters are increased through the LSTM cell. The DNN is relatively fast but has low scores for Accuracy and Specificity.

2) DETERMINATION OF HYPER-PARAMETERS
We chose the best-performing deep learning model according to the experimental results. The CNN-LSTM model needs to determine the optimal hyper-parameters for stable operations.

We considered a high precision such that the time needed to train or to validate events by true/false-positive rates in a practical environment is minimized. The experiment used the real-time HTTP (KF-ISAC) data shown in Table 7.
The CNN layer determines the conv_depth, conv_filter, conv_kernel_width, and conv_pool variables. In detail, one value has to be selected from conv_depth ∈ [2], conv_filter ∈ [2, 4, 8, 12], conv_kernel_width ∈ [4] and conv_pool ∈ [3, 4, 5]. The RNN layer determines the lstm_units and lstm_depth variables. In detail, one value has to be selected from lstm_units ∈ [16] and lstm_depth ∈ [1, 2, 4, 8]. The DNN layer determines dense_depth, dense_units, dense_dropout, and dense_relu_alpha. In detail, one value of dense_depth ∈ [1, 2, 4, 8], dense_units ∈ [4, 8, 12, 16] and dense_dropout ∈ [0.1, 0.5] is selected. The experiment was conducted 270 times; once one or more of the five indicators converged to zero or one, the search moved on to the next parameters.
Aiming for a high F-Score and a high precision, which means a minimum of false positives, the hyper-parameters of the optimal model are as follows: 2 for convolution depth, 12 for convolution filter, 4 for convolution kernel size, 5 for max-pooling size, 16 for LSTM cell, 2 for LSTM depth (1 forward LSTM, 1 Bidirectional LSTM), 12 dense units, 8 for dense depth, and 0.1 for dense dropout.

3) MODEL VALIDATION
To validate the performance of deep learning models, we used real-time data and public HTTP datasets (CSIC-2010, CICIDS 2017 HTTP datasets), and experimented with 50 epochs on the previously selected model. The experimental results on real-time data showed that the proposed CNN-LSTM model can be used for general HTTP data with high performance. The AI-IDS is a deep learning-based model with no pre-feature extraction, and therefore all strings can be processed.
For all experiments for each dataset, the model parameters were modified to obtain the results above and to optimize the performance on different datasets. Considering that our model has 14,000 trainable parameters, the CSIC-2010 and CICIDS-2017 datasets are relatively small, which leads to overfitting and low performance. The experimental results shown in Fig. 6 showed a high accuracy of 91–93% for each dataset in CSIC-2010 and CICIDS-2017. The precision was in the range of 86–98%, and the F-Score was in the range of 80–82%, which is lower than the experimental results of the previous real-time data. The experimental results showed that the performance of the model is affected by the number of samples and the diversity of the training data. It was difficult to cross-validate our model with the two published datasets owing to small samples. If we had a large amount of non-repeated HTTP data, the experimental performance would improve and would return more reliable results. Considering the above results, our model is more suitable for a large amount of data, and we demonstrated the excellence of our model by training with various datasets of more than 6 million HTTP traffic records.

FIGURE 5. Performance comparison of NN models.

FIGURE 6. Experimental results on public datasets.

VI. EFFICACY AND APPLICATION
This section describes cases of how AI-IDS detects variant attacks that bypass detection on legacy signature-based NIDS, and how Snort rules can be rewritten or improved.
The AI-IDS in Fig. 7 performs “predict” based on the completed h5 model shown in Fig. 3, and it inspects real-time data and outputs a prediction for each payload. When the prediction value is 100%, the payload is taken to be malicious, but the results are not perfectly reliable because an initial AI-IDS result may contain an analysis error. Thus, an analyst needs to confirm the final step until a stable level has been reached. We classified payloads with a prediction value in the range of 50–100% as suspicious, and an average of 100–500 such events occurred every 3 h. We assumed that AI-IDS classifies payloads as either normal or malicious: prediction values below 50% are labeled as “benign,” and values of 50–100% are classified as “malicious” payloads. In Fig. 7, the analyst labels suspicious payloads in the program as “benign,” “malicious,” or “unknown” using a conditional search. The labeled data is used for daily retraining. The labeling program shows the prediction value (%) generated by the optimal CNN-LSTM model, and the analyst can use it as a reference for identifying the actual malicious payloads. Some of the analysis information, such as src_ip/port and dest_ip/port, can be used to register a blocking policy in the firewall.

FIGURE 7. Labeling tools for deep learning on AI-IDS.

TABLE 8. Detecting variant patterns on AI-IDS.
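The 50% labeling rule can be written as a small triage step. This is a sketch under stated assumptions: the function name `suggest_label`, the event dictionaries, and their `payload_id` values are hypothetical, and the prediction is taken to be a probability between 0.0 and 1.0. The analyst's final label ("benign," "malicious," or "unknown") still overrides the suggestion.

```python
def suggest_label(prediction: float, threshold: float = 0.5) -> str:
    """Map a model probability to the label suggested to the analyst."""
    return "malicious" if prediction >= threshold else "benign"

# Hypothetical prediction records; real records carry more fields (see Table 3).
events = [
    {"payload_id": "evt-1", "prediction": 0.97},
    {"payload_id": "evt-2", "prediction": 0.12},
    {"payload_id": "evt-3", "prediction": 0.50},
]

# Events at or above the threshold go to the analyst's review queue.
review_queue = [e["payload_id"] for e in events
                if suggest_label(e["prediction"]) == "malicious"]
print(review_queue)
```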
The effects of applying AI-IDS are summarized as follows:
–First, it can detect variant bypass attacks that are not detected on legacy signature-based NIDS. For all AI-IDS predictions, security events on the legacy NIDS are automatically excluded, such that no duplicate events can occur.
–Second, it is possible to write or modify Snort rules for new patterns. If a legacy NIDS has existing rules but cannot detect attacks, the cause is usually a Snort grammatical error or a missing pattern in the rules. However, it can also be a detection failure due to low performance or functional failure.

A. DETECTION OF OBFUSCATED VARIANTS
Table 8 shows an example of variant attack detection. A common intrusion pattern is a scan of an admin page or file upload page, usually accessed by an attacker via a known open source path. Suppose that there is an admin page such as “http://target.com/admin/index.php” and a rule that detects “/admin/index.php” in the legacy NIDS. The AI-IDS examines incoming payloads with the trained CNN-LSTM model in real time to detect abnormal URI accesses, including variant attacks on “index.php” parameters and subpaths. It also detects similar and different new variant attacks for all attack types and patterns, as well as the examples shown in Table 8. In the case of SQL Injection, the detection accuracy of variant patterns is close to 100%.
The AI-IDS can also detect unknown variant patterns or obfuscated attacks, as shown in Fig. 8. An attacker can use URL encoding or base64 to slip arbitrary payloads past the security system and attack web servers effectively. In one case, an attacker uses the Char( ) function to insert code into noticeView.jsp in an attempt to acquire system information. In other cases, the attacker attempts to send spear-phishing mails, trying to communicate with an external SMTP server by injecting irregular or encoded code into AspCms_SiteSetting.asp (AspCMS). Recent malicious HTTP payloads contain irregular patterns that are difficult to detect as simple strings. A commercial NIDS detects most known attacks or patterns but does not detect strings that do not have a registered pattern. By contrast, the AI-IDS can detect variant and obfuscated attacks that cannot be detected with legacy NIDS.

FIGURE 8. Detecting encoded and obfuscated payloads.

B. IMPROVEMENT OF SIGNATURE-BASED SNORT RULES
The second effect is to improve the signature-based Snort rules. Because the AI-IDS does not duplicate events already detected by the legacy signature-based NIDS, analysts can identify new threat patterns by analyzing suspicious payloads: (i) a new rule can be written for a new pattern; (ii) if a rule exists but fails to detect the attack, the rule can be improved by correcting an error in its options, such as offset, depth, distance, or within. We manually write or improve about five detection rules per month on average. If an event occurs in the AI-IDS while the rules are applied normally, we have to suspect a detection failure on the signature-based NIDS. Table 7 shows that AI-IDS detects a vulnerability attack (CVE-2018-9174) of DedeCMS. When NIDS rules for related attacks are not registered in the currently operating NIDS, new rules can be registered based on the detection of AI-IDS.

Step 1. AI-IDS detection for suspicious payloads

Step 2. Analysis of related existing rules (Why not detectable?)

Step 3. Writing a new Snort rule or improving an existing Snort rule (general use in case)

AI-IDS detects attacks that are not detected in the existing NIDS in step 1. As AI-IDS double-checks with the existing NIDS, basically all events detected by AI-IDS are not detected by the NIDS. The analyst examines the existing rules in step 2 to review why the existing NIDS did not detect the payloads. It is usually found that attackers used several methods to randomly change the encoding or the attack strings to bypass the NIDS. In addition, if a signature is over-customized to an individual case, some attacks go undetected even when the attack pattern changes only slightly. Therefore, step 3 modified existing signatures to

VOLUME 8, 2020 70259


A. Kim et al.: AI-IDS: Application of Deep Learning to Real-Time Web Intrusion Detection

rewrite detection rules that typically detect PHP webshell [9] N. Shone, T. Nguyen Ngoc, V. Dinh Phai, and Q. Shi, ‘‘A deep learning
code attacks. However, if a general-purpose detection rule approach to network intrusion detection,’’ IEEE Trans. Emerg. Topics
Comput. Intell., vol. 2, no. 1, pp. 41–50, Feb. 2018.
is written without regard to the environment, appropriate [10] K. Wu, Z. Chen, and W. Li, ‘‘A novel intrusion detection model for
optimization tasks are required as the number of detections a massive network using convolutional neural networks,’’ IEEE Access,
increases. vol. 6, pp. 50850–50859, 2018.
[11] S. Naseer, Y. Saleem, S. Khalid, M. K. Bashir, J. Han, M. M. Iqbal,
and K. Han, ‘‘Enhanced network anomaly detection based on deep neural
VII. CONCLUSION networks,’’ IEEE Access, vol. 6, pp. 48231–48246, 2018.
We proposed an optimal CNN-LSTM model based on [12] Y. Ding and Y. Zhai, ‘‘Intrusion detection system for NSL-KDD dataset
using convolutional neural networks,’’ in Proc. 2nd Int. Conf. Comput. Sci.
SFL and successfully applied payload-level deep learn- Artif. Intell. (CSAI), 2018, pp. 81–85.
ing techniques in a high-performance computing environ- [13] S. Otoum, B. Kantarci, and H. T. Mouftah, ‘‘On the feasibility of deep
ment. The AI-IDS distinguishes between normal and abnor- learning in sensor network intrusion detection,’’ IEEE Netw. Lett., vol. 1,
no. 2, pp. 68–71, Jun. 2019.
mal traffic on HTTP traffic that could not be detected in [14] N. Chouhan, A. Khan, and H.-U.-R. Khan, ‘‘Network anomaly detection
legacy signature-based NIDS because AI-IDS can formalize using channel boosted and residual learning based deep convolutional
unknown patterns, help write or improve signature-based neural network,’’ Appl. Soft Comput., vol. 83, Oct. 2019, Art. no. 105612.
[15] R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran,
rules for new vulnerabilities, variants, and bypass attacks. A. Al-Nemrat, and S. Venkatraman, ‘‘Deep learning approach
Network meta-data, without its payload is usually difficult for intelligent intrusion detection system,’’ IEEE Access, vol. 7,
to identify whether it is malicious or not. Thus, we review pp. 41525–41550, 2019.
the HTTP header and body of web attacks in detail. We also [16] Z. Chiba, N. Abghour, K. Moussaid, A. El Omri, and M. Rida, ‘‘Intelligent
approach to build a deep neural network based IDS for cloud environ-
used real-time web traffic for deep learning, but initially, ment using combination of machine learning algorithms,’’ Comput. Secur.,
we learned that AI-IDS needed to be re-validated for pre- vol. 86, pp. 291–317, Sep. 2019.
dicted suspicious events due to false positives alarms. The [17] H. Zhang, B. Zhao, H. Yuan, J. Zhao, X. Yan, and F. Li, ‘‘SQL injection
detection based on deep belief network,’’ in Proc. 3rd Int. Conf. Comput.
AI-IDS performs continuous optimization by re-training Sci. Appl. Eng. (CSAE), 2019, p. 20.
analysis information that is labeled ‘‘benign,’’ ‘‘malicious,’’ [18] O. Faker and E. Dogdu, ‘‘Intrusion detection using big data and deep
and ‘‘unknown.’’ Thus, it should be used as an assistant learning techniques,’’ in Proc. ACM Southeast Conf. ZZZ - ACM SE, 2019,
pp. 86–93.
system until it reaches a high-quality level. If the quality goes [19] M. Aloqaily, S. Otoum, I. A. Ridhawi, and Y. Jararweh, ‘‘An intrusion
beyond the ability of humans by continually learning, it could detection system for connected vehicles in smart cities,’’ Ad Hoc Netw.,
be executed as an automated analysis. Ultimately, the goal of vol. 90, Jul. 2019, Art. no. 101842.
[20] KDD Cup 1999 Data. Accessed: Nov. 17, 2019. [Online]. Available:
AI-IDS is to outperform human analysis quality and to help http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
analysts handle large quantities of unknown security events. [21] M. Ring, S. Wunderlich, D. Scheuring, D. Landes, and A. Hotho, ‘‘A survey
Previous works have mainly considered accuracy (ACC) in of network-based intrusion detection data sets,’’ Comput. Secur., vol. 86,
pp. 147–167, Sep. 2019.
terms of performance measures, but scalability and precision
[22] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, ‘‘A detailed analysis
are also important indicators for applying deep learning in of the KDD CUP 99 data set,’’ in Proc. IEEE Symp. Comput. Intell. Secur.
the real-world. In practical security services, re-validation Defense Appl., Jul. 2009, pp. 1–6.
for predicted events is a required task because of the low [23] A. Shenfield, D. Day, and A. Ayesh, ‘‘Intelligent intrusion detection
systems using artificial neural networks,’’ ICT Express, vol. 4, no. 2,
tolerance for analysis errors. pp. 95–99, Jun. 2018.
[24] R. Singh, H. Kumar, and R. K. Singla, ‘‘A reference dataset for network
traffic activity based intrusion detection system,’’ Int. J. Comput. Commun.
REFERENCES
Control, vol. 10, no. 3, pp. 390–402, 2015.
[1] R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, and T. Berners-Lee, [25] M. Sabhnani and G. Serpen, ‘‘Why machine learning algorithms fail in
Hypertext Transfer Protocol—HTTP/1.1, document RFC 2616, IETF, Jun. misuse detection on KDD intrusion detection data set,’’ Intell. Data Anal.,
1999. vol. 8, no. 4, pp. 403–415, Oct. 2004.
[2] D. Atienza, Á. Herrero, and E. Corchado, ‘‘Neural analysis of HTTP traffic [26] J. Gu, L. Wang, H. Wang, and S. Wang, ‘‘A novel approach to intru-
for Web attack detection,’’ in Proc. Comput. Intell. Secur. Inf. Syst. Conf. sion detection using SVM ensemble with feature augmentation,’’ Comput.
Cham, Switzerland: Springer, 2015, pp. 201–212. Secur., vol. 86, pp. 53–62, Sep. 2019.
[3] B. Mukherjee, L. T. Heberlein, and K. N. Levitt, ‘‘Network intrusion [27] N. Moustafa, J. Hu, and J. Slay, ‘‘A holistic review of network anomaly
detection,’’ IEEE Netw., vol. 8, no. 3, pp. 26–41, May 1994. detection systems: A comprehensive survey,’’ J. Netw. Comput. Appl.,
[4] P. Mishra, V. Varadharajan, U. Tupakula, and E. S. Pilli, ‘‘A detailed inves- vol. 128, pp. 33–55, Feb. 2019.
tigation and analysis of using machine learning techniques for intrusion [28] Splunk. (2019). Splunk Stream (STM). Accessed: Mar. 20, 2020. [Online].
detection,’’ IEEE Commun. Surveys Tuts., vol. 21, no. 1, pp. 686–728, Available: https://splunkbase.splunk.com/app/1809/
1st Quart., 2019. [29] M. Mamun, R. Lu, and M. Gaudet, ‘‘Tell them from me: An encrypted
[5] FSI AI-IDS Software for Splunk. [Online]. Available: https://github. application profiler,’’ in Proc. Int. Conf. Netw. Syst. Secur., in Lecture Notes
com/ackim-fsi/AI-IDS in Computer Science, vol. 11928. Cham, Switzerland: Springer, 2019,
[6] Y. Liu, S. Liu, and X. Zhao, ‘‘Intrusion detection algorithm based on con- pp. 456–471.
volutional neural network,’’ Beijing Ligong Daxue Xuebao/Trans. Beijing [30] Y. Zhang and B. Wallace, ‘‘A sensitivity analysis of (and practitioners’
Inst. Technol., vol. 37, no. 12, pp. 1271–1275, 2017. guide to) convolutional neural networks for sentence classification,’’ 2015,
[7] W. Wang, Y. Sheng, J. Wang, X. Zeng, X. Ye, Y. Huang, and M. Zhu, arXiv:1510.03820. [Online]. Available: http://arxiv.org/abs/1510.03820
‘‘HAST-IDS: Learning hierarchical spatial-temporal features using deep [31] H. Liu, B. Lang, M. Liu, and H. Yan, ‘‘CNN and RNN based payload
neural networks to improve intrusion detection,’’ IEEE Access, vol. 6, classification methods for attack detection,’’ Knowl.-Based Syst., vol. 163,
pp. 1792–1806, 2018. pp. 332–341, Jan. 2019.
[8] C. Yin, Y. Zhu, J. Fei, and X. He, ‘‘A deep learning approach for intru- [32] Y. Xin, L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou, and
sion detection using recurrent neural networks,’’ IEEE Access, vol. 5, C. Wang, ‘‘Machine learning and deep learning methods for cybersecu-
pp. 21954–21961, 2017. rity,’’ IEEE Access, vol. 6, pp. 35365–35381, 2018.


AECHAN KIM received the B.S. degree in industrial engineering
from the Seoul National University of Science and Technology,
Seoul, South Korea, in 2009, and the M.S. degree in financial
information security from Korea University, Seoul, in 2014, where
he is currently pursuing the Ph.D. degree with the Graduate School
of Information Security. He is also an Assistant Manager with the
Financial Security Institute (FSI), Yongin, South Korea. His
research interests include applied deep learning and intrusion
detection on computer networks.

MOHYUN PARK received the B.S. degree in computer science from
Seoul National University, Seoul, South Korea, in 2013. He is
currently the Manager with the Financial Security Institute (FSI),
Yongin, South Korea. His research interests include applied deep
learning and intrusion detection on computer networks.

DONG HOON LEE (Member, IEEE) received the B.S. degree from the
Department of Economics, Korea University, Seoul, in 1985, and the
M.S. and Ph.D. degrees in computer science from The University of
Oklahoma, Norman, in 1988 and 1992, respectively. He is currently
a Professor with the Graduate School of Information Security, Korea
University, where he has been with the Faculty of Computer Science
and Information Security, since 1993. His research interests include
cryptographic protocols, applied cryptography, functional encryption,
software protection, mobile security, vehicle security, and
ubiquitous sensor network (USN) security.
