Network Attack Detection and Visual Payload Labeling
Abstract
In recent years, Internet of Things (IoT) devices have been playing an important role in business, education, medicine, and other fields. The number of devices connected to the Internet is already much larger than the world's population. However, because of their accessibility, these devices can easily face all kinds of attacks from the Internet. Most attacks against IoT devices are based on Web applications, so protecting the security of Web services can effectively improve the situation of the IoT ecosystem. Conventional Web attack detection methods rely heavily on labeled samples, and the detection results of artificial intelligence models are hard to interpret. Hence, this article introduces a supervised detection algorithm based on benign samples. The Seq2Seq algorithm is chosen and applied to detect malicious Web requests. Meanwhile, the attention mechanism is introduced to label the attack payload and to highlight abnormal characters. The experimental results show that, on the premise of training with benign samples only, the precision of the proposed model is 97.02% and the recall is 97.60%. This shows that the model can detect Web attack requests effectively. At the same time, the model can label the attack payload visually and make the results ‘‘interpretable.’’
Keywords
Seq2Seq, IoT devices, Web attack detection, autoencoder, attention mechanism, attack visualization
complexity of its configuration. These factors are relevant to the fact that Web attacks are happening with increasing frequency, through which attackers retrieve or change sensitive data or even execute arbitrary code on remote systems.

Most attacks against IoT devices are based on Web applications. To deal with the various attack methods, it has become a trend for security researchers to apply machine learning and deep learning to Web application attack detection.

A trust-aware probability marking traceback scheme has been proposed to locate malicious sources quickly.5 The nodes are marked with different marking probabilities according to their trust, which is deduced by trust evaluation. The high marking probability for low-trust nodes can locate malicious sources quickly, and the low marking probability for high-trust nodes can reduce the number of markings to improve the network lifetime, so both the security and the network lifetime are improved in this scheme. Wu et al.6 proposed a safety detection mechanism based on the analysis of big data. Fuzzy cluster analysis, game theory, and reinforcement learning are integrated seamlessly to perform the safety detection. The simulation and experimental results show the advantages of this scheme in terms of high efficiency and low error rate.

Adeva and Atxa7 proposed another attack identification method. This method extracts metadata from the Web log, including date, source address, size, type, and so on. Besides, it selects the best features through feature assessment and classifies log samples or identifies attacks with the Naive Bayes algorithm,8 the k-nearest neighbors (KNN) algorithm,9 and the Rocchio algorithm.10 The method has a detection rate of more than 90%. However, it is only a post hoc test and cannot defend against attacks in time. Raghuveer and Chandrasekhar11 proposed a detection model combining support vector machine (SVM),12 fuzzy neural network,13 and K-means.14 This model clusters and generates various subsets by the K-means algorithm and then trains different neuro-fuzzy models to gain eigenvectors. Rathore et al.15 designed a cross-site scripting (XSS) classifier based on social networking service (SNS) websites; three types of eigenvalues (URL, HTML labels, and SNS) are processed by manual selection, and the classifier constructs eigenvectors from them. Then, 10 machine-learning algorithms, including RF (Random Forest),16 ADTree (Alternating Decision Tree),17 SVM, LR (Logistic Regression),18 and so on, verify and identify whether the eigenvectors are attacks. This method compares each algorithm and obtains the best algorithm model, but it has limited scalability, and human factors greatly affect the detection results, as it requires manual maintenance and obtaining large numbers of eigenvectors manually. Through Web visit statistics, Yang et al. observed that normal HTTP requests were the majority and their behavior patterns were similar, while malicious requests were the minority and their behavior patterns were changeable.19 They proposed an unsupervised algorithm based on text clustering to distinguish normal requests from malicious requests. It proved to have a high detection rate and a low false alarm rate. Zhang et al. picked the first 300 bytes of characters in Web communication traffic through statistical analysis of Webshell traffic. The sequence vector was generated based on the American Standard Code for Information Interchange (ASCII).20 Then CNN (convolutional neural network)21 and LSTM (long short-term memory)22 models were trained on the sequence vectors to classify the sample data. This method obtained rather good results, with a 98.2% detection rate and a 97.84% recall rate.

The above detection methods work well when labeled data sets are given. However, there are still some problems that need resolving:

1. The lack of labeled data: There are numerous normal request samples but few variegated attack samples in the real environment, which causes obstacles to the model's learning and training.
2. The lack of sample classes: In the training stage, if there are only SQL injection and XSS attacks in the sample data sets, it is hard to identify command execution or new payload attacks in the real environment. Besides, Web applications run by different users vary greatly; even SQL injection has numerous forms. Obviously, we cannot be sure that data collected in the past will train a model that can detect unknown attacks, and the results in the experimental environment can differ greatly from those in the real environment.
3. The interpretability of the results: If the model identifies SQL injection, security researchers can find the exact location of the attack payload so that they can maintain the Web applications accordingly. But common Web maintainers may not understand the significance of the alarm. Even though they constrain the attacks at that very moment, they still cannot repair the Web applications, so the security risk still exists.

As the services provided by IoT devices are often subject to Web attacks, an attack detection model based on Seq2Seq23 is proposed in this article to improve the security of IoT devices and address the shortcomings of current Web attack detection technologies.
This model helps acquire a great many normal samples, identify various kinds of Web attacks efficiently, and locate the attack payload in a timely manner. We summarize the major contributions as follows:

1. We propose a visual payload labeling model to detect network attacks. Under the premise of using only benign training samples, our model has good precision and recall.
2. As our model relies on comparing predicted values and thresholds to classify benign and malicious Web requests, it can identify whether a Web request is malicious rather than defend against only a specific type of Web attack.
3. Our model not only distinguishes normal requests from attack requests but also interprets the detection results by visually labeling the attack payloads. In the encoding stage, we encode HTTP request samples through the Bi-LSTM algorithm and maintain the context semantics of the request. In the decoding stage, we introduce the attention mechanism, calculate the probability distribution of each character in the sequence vector, and mark the exact location of the attack payload. The detection results of our model are therefore interpretable. Website maintainers are able to locate the attack payload swiftly, repair security risks in time, and protect the data security of enterprises or organizations.

Web attack detection model based on Seq2Seq

Detection model framework

Figure 1 presents the whole framework of the Web attack detection model based on Seq2Seq. The model is mainly divided into three modules: the data preparation module, the attack detection module, and the attack payload visualization module. In the data preparation module, preprocessing the original HTTP request samples, establishing the vocabulary, and generating sequence vectors that meet the model's input requirements happen in sequence. In the attack detection module, the main task is to construct and train the attack detection model as well as to test and classify the test sample sets. In the attack payload visualization module, the attack payload is visually labeled: normal elements (characters) are labeled white, while abnormal elements (characters) are labeled red.
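To make the data preparation module concrete, the following is a minimal sketch of character-level vocabulary construction and sequence-vector generation. It is only an illustration of our reading of the module; every name in it (build_vocab, to_sequence, the special tokens) is hypothetical and not taken from the authors' code.

    # Sketch: character-level vocabulary and fixed-length sequence vectors
    # built from raw HTTP request strings. All names are illustrative.
    PAD, GO, EOS, UNK = "<PAD>", "<GO>", "<EOS>", "<UNK>"

    def build_vocab(requests):
        """Collect every character seen in the benign training requests."""
        chars = sorted({ch for req in requests for ch in req})
        vocab = [PAD, GO, EOS, UNK] + chars
        return {ch: idx for idx, ch in enumerate(vocab)}

    def to_sequence(request, vocab, max_len=200):
        """Map one HTTP request to a fixed-length vector of character ids."""
        ids = [vocab.get(ch, vocab[UNK]) for ch in request[:max_len]]
        ids += [vocab[PAD]] * (max_len - len(ids))   # right-pad short requests
        return ids

    # Example usage on a toy benign sample.
    vocab = build_vocab(["GET /index.php?id=1 HTTP/1.1"])
    vec = to_sequence("GET /index.php?id=1 HTTP/1.1", vocab)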
h_t = f(h_{t-1}, x_t)    (2)

h'_t = f(h'_{t-1}, x_t)    (3)

f is the encoding function of the Bi-LSTM, h_t is the output of the forward LSTM hidden layer, and h'_t is the output of the reverse LSTM hidden layer. For the semantic coding C, the output information of the encoder's hidden layer is generally aggregated to obtain the semantic vector of the middle layer

C = q[H_1, H_2, H_3, ..., H_t]    (4)

A common simple method is to use the hidden layer output of the last moment as the semantic vector C, that is

C = q[H_1, H_2, H_3, ..., H_t] = H_t    (5)
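For concreteness, a minimal sketch of the Bi-LSTM encoder of equations (2) to (5) is given below, assuming TensorFlow 1.x (the version listed later in the experimental environment) and taking the last hidden output as the semantic vector C as in equation (5). The constants and variable names are illustrative only and are not the authors' implementation.

    import tensorflow as tf

    VOCAB_SIZE, EMB_DIM, HIDDEN, MAX_LEN = 100, 64, 128, 200

    # Character ids of one batch of HTTP requests, shape [batch, MAX_LEN].
    inputs = tf.placeholder(tf.int32, [None, MAX_LEN])
    embedding = tf.get_variable("embedding", [VOCAB_SIZE, EMB_DIM])
    embedded = tf.nn.embedding_lookup(embedding, inputs)   # [batch, MAX_LEN, EMB_DIM]

    # Forward and backward LSTM cells of the Bi-LSTM encoder (equations (2) and (3)).
    cell_fw = tf.nn.rnn_cell.LSTMCell(HIDDEN)
    cell_bw = tf.nn.rnn_cell.LSTMCell(HIDDEN)

    # static_bidirectional_rnn expects a length-T list of [batch, EMB_DIM] tensors.
    step_inputs = tf.unstack(embedded, axis=1)
    outputs, state_fw, state_bw = tf.nn.static_bidirectional_rnn(
        cell_fw, cell_bw, step_inputs, dtype=tf.float32)

    # Equation (5): use the hidden output of the last time step as the semantic vector C.
    semantic_vector_C = outputs[-1]                         # [batch, 2 * HIDDEN]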
Attack detection algorithm based on measuring the loss of the model. The last section introduced the basic framework of Seq2Seq. Seq2Seq needs modifying before it is applied to detect attacks. In Figure 5, we take training samples as the input and the output of the model at the same time, and this model is also called an autoencoder. This model is almost the same as the Seq2Seq model diagram in Figure 3. The main difference is that the output layer also uses the same data as the input layer. It should be noted that in the decoding stage, the first input of the sequence is replaced by ‘‘<GO>,’’ and the last output of the sequence is replaced by ‘‘<EOS>.’’
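A minimal sketch of this input/output arrangement for the autoencoder follows; the token ids and the helper name are illustrative assumptions, not the authors' code.

    # Sketch: build the decoder input and target for one training request so
    # that the autoencoder reproduces its own input. Token ids are illustrative.
    GO, EOS = 1, 2                      # ids of "<GO>" and "<EOS>" in the vocabulary

    def make_decoder_pair(sequence):
        """sequence: character ids of one benign HTTP request."""
        decoder_input = [GO] + sequence          # first decoder input is "<GO>"
        decoder_target = sequence + [EOS]        # last expected output is "<EOS>"
        return decoder_input, decoder_target

    # The encoder input, decoder input, and decoder target all come from the
    # same benign sample, which is what makes the model an autoencoder.
    enc_in = [17, 5, 9, 23]
    dec_in, dec_target = make_decoder_pair(enc_in)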
To train only with positive (benign) samples and perform attack detection, this article designs an attack detection algorithm based on measuring the loss of the model. The procedures are as follows:

3. Calculate the mean and standard deviation of the total loss in equation (2), and calculate the threshold using the following formula

threshold = mean(total_loss) + C * std(total_loss)    (10)

In the above formula, mean refers to calculating the mean value, and std refers to calculating the standard deviation. C is a constant, and we need to adjust it in experiments so that the threshold can gradually approach the optimal threshold. Generally speaking, C should ensure that the threshold value is greater than the maximum loss value (Loss.Max) of the test set.

4. The model predicts benign samples and malicious samples at the same time. If Loss > threshold, the sample is judged malicious; otherwise (Loss < threshold), the sequence is judged to be a normal sample.
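As a rough illustration of this loss-based decision rule (equation (10) and step 4), here is a minimal numpy sketch, assuming the per-sample reconstruction losses of the trained autoencoder are already available; the values are illustrative.

    import numpy as np

    def loss_threshold(benign_losses, C):
        """Equation (10): threshold = mean(total_loss) + C * std(total_loss)."""
        return np.mean(benign_losses) + C * np.std(benign_losses)

    def classify(loss, threshold):
        """Step 4: a request whose reconstruction loss exceeds the threshold is malicious."""
        return "malicious" if loss > threshold else "benign"

    # Illustrative values only: losses of benign validation requests.
    benign_losses = np.array([0.10, 0.12, 0.08, 0.11, 0.09])
    thr = loss_threshold(benign_losses, C=3.0)
    print(classify(0.55, thr))   # -> malicious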
Attack payload visualization module based on attention mechanism

Seq2Seq model with attention mechanism. To solve the problem that the results of the conventional detection model cannot be explained, this section optimizes the Seq2Seq model with the attention mechanism and uses the characteristics of this mechanism to mark the specific location of the attack payload, so as to realize the visualization function of the attack payload. The optimized model is shown in Figure 6.

1. Encoder

After introducing the attention mechanism, the semantic vector C is obtained as a weighted average of the output H of the encoder's hidden layer, as follows

C_i = \sum_{j=1}^{T} a_{ij} H_j    (11)

a_{ij} represents the corresponding weight of each hidden layer and is calculated by the following formula

a_{ij} = softmax(e_{ij})    (12)

e_{ij} is a score calculated from the output H_i of the encoder's hidden layer and the output H'_j of the decoder's hidden layer. The score is calculated by the following formula

e_{ij} = score(H_i, H'_j)    (13)

For the score function, Luong et al.24 define the following three forms, which can be selected according to the problem

score(H_i, H'_j) = H_i^T H'_j,  or  H_i^T W_a H'_j,  or  V_a tanh(W_a [H_i; H'_j])    (14)

2. Decoder

The decoding stage is determined by the current semantic vector C_t and the output H'_t of the decoder's hidden layer. First, we concatenate the two vectors and use tanh as the activation function

H_t = tanh(W_c [C_t; H'_t])    (15)

Finally, the predicted output Y_t is calculated

Y_t = argmax P(Y_t) = softmax(W_c H_t)    (16)

We should note that the output Y_t at this time is a probabilistic sequence. By looking up the maximum probability value in the sequence, the corresponding word is retrieved from the vocabulary list and decoded.
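The following numpy sketch illustrates equations (11) to (16) with the dot-product form of the Luong score;24 the shapes, weights, and values are illustrative assumptions only.

    import numpy as np

    def softmax(x):
        e = np.exp(x - np.max(x))
        return e / e.sum()

    # Encoder hidden outputs H_1..H_T and one decoder hidden output H'_j.
    H = np.random.rand(6, 8)            # T = 6 time steps, hidden size 8
    H_dec = np.random.rand(8)

    # Equation (13) with the dot-product score: e_ij = H_i^T H'_j.
    e = H @ H_dec                       # shape [T]

    # Equation (12): attention weights a_ij = softmax(e_ij).
    a = softmax(e)

    # Equation (11): context (semantic) vector C_i as the weighted average of H.
    C = (a[:, None] * H).sum(axis=0)    # shape [8]

    # Equation (15): combine context and decoder state through tanh.
    Wc = np.random.rand(16, 8)          # maps [C_t; H'_t] (16) to H_t (8)
    H_att = np.tanh(np.concatenate([C, H_dec]) @ Wc)

    # Equation (16): project to the vocabulary and take the most probable word.
    W_out = np.random.rand(8, 30)       # vocabulary size 30
    Y = softmax(H_att @ W_out)
    predicted_id = int(np.argmax(Y))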
[Figure 7. The softmax output layer produces a probabilistic sequence for each output Y_1, Y_2, ..., Y_t.]
Attack payload labeling principle based on attention mechanism. Formula (15) shows that the output H_t of the attention layer is determined by the semantic vector C_t and by H'_t, the output of the decoder's hidden layer. At this time, the semantic vector C_t represents the weight of the current input with respect to the output of the model, which is similar to the human attention mechanism: the focus of visual attention has a large weight, and, on the contrary, low-value information has a low weight.

Assume that the input X_t is the No.i element of the vocabulary at a certain time. The output Y_t at the current time is then a probabilistic sequence that is as long as the vocabulary and whose elements sum to 1, as shown in Figure 7 and Formula (16). On the premise of correct prediction, the No.i element of the sequence should be the maximum value of the sequence and far greater than the values of the other elements. (Figure 7 is a simple demonstration: the No.i element of the probabilistic sequence is set to 1 and the rest are set to 0.) Based on this conclusion, the following steps can be taken to optimize the model:

1. The test samples are predicted by the trained model to obtain the output probabilistic sequence
Y_ij = [a_i1, a_i2, a_i3, ..., a_it]    (17)

In the formula, Y_ij refers to the No.j element (over the vocabulary) of the No.i sequence, T is the length of the vocabulary, and a_ij records the current value.

2. Count all outputs of the samples and collect them as alpha

alpha = [a_ij]    (18)

Calculate the mean value and standard deviation of alpha, and use the following formula to calculate the threshold. In the formula, C is a constant to be determined, the mean value is calculated by mean, and the standard deviation is calculated by std

threshold = mean(alpha) - C * std(alpha)    (19)

3. By adjusting the constant C, make sure the threshold value is less than the minimum weight of the benign samples in the test set and greater than the maximum weight of the malicious samples, as in Formula (20). Meanwhile, it is necessary to observe whether the sample labeling conforms to the objective facts. If it does, the threshold value is selected; otherwise, the adjustment continues

malicious_max < threshold ≤ benign_min    (20)

4. If a sequence in the test set is checked by the model and the model predicts an element a_ij < threshold in the probabilistic sequence of Y_ij, it indicates that Y_ij is abnormal and it is labeled red; otherwise, if a_ij > threshold, it indicates that Y_ij is normal and it is labeled white.

Experiment and assessment

Data set

This article applies the HTTP DATASET CSIC 2010 data set to do the experiments and analysis.25 After processing, we stored 20,331 benign samples and 16,243 malicious samples. According to the detection method introduced in this article, we divide the data sets into three parts: training samples, testing samples, and detection samples. Among them, we take 30% of the benign samples randomly to do the threshold test training, and 1001 benign samples and 1001 malicious samples randomly to do the abnormal element threshold test. Apart from that, this model adopts only benign samples in training, but in the comparative experiments, benign samples and malicious samples are adopted simultaneously. The specific allocation of the data sets is shown in Table 1.

Environment for experiment

The model introduced in this article is mainly developed under the Windows system. The code involved in the experiment is mainly based on Python's tensorflow framework.26 The function static_bidirectional_rnn in tensorflow is applied to realize the encoder of the Bi-LSTM algorithm. The function Seq2Seq in tensorflow is applied to realize the encoder with the attention mechanism. The Python Scikit-learn toolkit helps to realize the assessment of the model.27 Detailed configurations are shown in Table 2.

Experiment process

Classification threshold parameter optimization. In sections ‘‘Attack detection module based on Seq2Seq’’ and ‘‘Attack payload visualization module based on attention mechanism,’’ the calculation methods of the threshold value for model classification and of the threshold value for exceptional determination have been introduced in detail, but the formulas cannot directly give the final threshold value. Further experimental tests are necessary to get the optimal threshold value. Formula (10) shows that the constant C needs to be adjusted to obtain a reasonable threshold value to achieve the goal of sample classification. We tested the accuracy change with the constant C from 1 to 7 in steps of 2, and specifically tested the accuracy with the constant 0. The relationship between the threshold value and the accuracy is shown in Table 3.
Table 2. Hardware and software configuration of the experimental environment.

Operating system: Microsoft Windows 10 Build 17134, Professional edition, 64-bit operating system
System configuration: CPU: Intel i7-7700; Memory: 8 GB; Hard disk: 1 TB; GPU: NVIDIA GeForce GTX 1060, Display memory: 6 GB
The Python standard library and version: Python 3.6.2; tensorflow-gpu == 1.12.0; numpy == 1.16.0; scikit-learn == 0.19.2; matplotlib == 2.2.2; colorama == 0.4.1

Table 3. Threshold test results.

Number  Constant C  Threshold  Accuracy (%)
1       0           0.076447   72.57
2       1           0.215574   94.27
3       3           0.493829   99.20
4       5           0.772084   99.54
5       7           1.050339   99.58

It is understandable that the higher the threshold is, the higher the accuracy on benign samples is; but the model also needs to detect malicious samples, so while ensuring the accuracy, the smaller the threshold is, the more consistent it is with the classification standard of the model. As shown in the table above, when the constant C is 5 and 7, the accuracy no longer increases significantly, and the threshold value is 0.772084.

Threshold parameter optimization of abnormal elements. Because we cannot quantify the accuracy of the abnormal-element classification threshold directly, we only calculate an estimated value by Formula (19): threshold = mean(alpha) - C * std(alpha). If the value meets the expectation, it is adopted; conversely, it is adjusted. According to the statistics of 1001 benign samples and 1001 malicious samples, mean(alpha) = 0.67052 and std(alpha) = 0.4568767 were obtained. The threshold adjustment calculations are as follows:

1. Set the initial value C = 0, the step size to 0.1, and the maximum value to 1.5;
2. Calculate the threshold value by Formula (19);
3. Print ten malicious samples and ten normal samples randomly to observe whether the attack payloads are labeled correctly;
4. If the result does not meet the expectation, repeat (1) to (3) until it does, and store the current threshold value.

After many rounds of experiments, we set the threshold value to 0.076589 (constant C = 1.3), at which the output meets the target of labeling the attack payload. Figure 8 shows an example of labeling attack payloads.
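As a sketch of how such labeling can be rendered, the snippet below applies the abnormal-element rule of equations (17) to (20) and prints suspect characters in red with colorama (which appears in the experimental environment). The request, weights, and constant are illustrative and are not taken from Figure 8.

    import numpy as np
    from colorama import Fore, Style, init

    init(autoreset=True)

    def element_threshold(alpha, C):
        """Equation (19): threshold = mean(alpha) - C * std(alpha)."""
        return np.mean(alpha) - C * np.std(alpha)

    def label_request(request, char_weights, threshold):
        """Step 4: characters whose weight falls below the threshold are printed red."""
        out = []
        for ch, a in zip(request, char_weights):
            if a < threshold:
                out.append(Fore.RED + ch + Style.RESET_ALL)   # abnormal element
            else:
                out.append(ch)                                # normal element
        return "".join(out)

    # Illustrative weights: low values around the injected payload "' or 1=1".
    request = "GET /item.php?id=1' or 1=1"
    weights = np.array([0.9] * 18 + [0.05] * 8)
    thr = element_threshold(weights, C=1.3)
    print(label_request(request, weights, thr))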
Experiment indicators

To better evaluate the attack detection model based on the attention mechanism and the Bi-LSTM algorithm, the experimental results are evaluated using the confusion matrix. The confusion matrix, also known as the error matrix, can be used to visually evaluate the performance of classification model algorithms, as shown in Table 4.
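As a minimal sketch of this evaluation, assuming binary labels in which 1 denotes a malicious request, the indicators can be computed with scikit-learn (listed in the experimental environment); the labels below are illustrative.

    from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

    # Illustrative labels only: 1 = malicious request, 0 = benign request.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print("precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)
    print("recall:   ", recall_score(y_true, y_pred))      # TP / (TP + FN)
    print("F1:       ", f1_score(y_true, y_pred))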
Table 6. Model performance indicators.

Detection models    Precision (%)  Recall (%)  F1 (%)
SVM                 92.07          94.95       93.49
TF-IDF_RF           93.81          89.12       91.41
Word2vec_MLP        96.28          96.29       96.29
Character_CNN       99.48          94.66       97.01
Attention_Bi-LSTM   97.02          97.60       97.31

Conclusion

Based on the experimental data sets, tests on Attention_Bi-LSTM, SVM, TF-IDF_RF, Word2vec_MLP, and Character_CNN are carried out. The results indicate that SVM and TF-IDF_RF have relatively low detection rates: their precision is 92.07% and 93.81%, respectively, and their recall is 94.95% and 89.12%, respectively. The precision and recall of Word2vec_MLP are average, 96.28% and 96.29%, respectively. This means that extracting word vectors by Word2vec can maintain the samples' semantics and support classification as well. The precision of Character_CNN reaches 99.48%, but the recall is 94.66%, which shows that Character_CNN has a relatively high missed-detection rate. The precision, recall, and F1 values of Attention_Bi-LSTM are as high as 97.02%, 97.60%, and 97.31%, respectively. Also, Attention_Bi-LSTM has the largest AUC (the area under the ROC curve). This shows that, on the premise of benign training samples alone, this model can detect attack requests effectively and has rather high precision and recall. Besides, its exclusive function of labeling the attack payload helps to achieve attack visualization.

However, the model has some shortcomings. The model constructs sequence vectors by using character embedding. Although this shortcuts the steps of manual word segmentation and feature extraction, it enlarges the amount of calculation: there are only around 20,000 samples in the data sets, but the training time is more than 10 h. We will consider adopting an N-gram embedding method in further experiments or improving the hardware resources of the experiments.

Acknowledgements

The authors thank anonymous reviewers and editors for providing helpful comments on earlier drafts of the manuscript.

References

1. Gu Y, Wang Y, Liu Z, et al. Sleepguardian: an RF-based healthcare system guarding your sleep from afar, 2019, arXiv:1908.06171v1.
2. Gu Y, Zhang Y, Li J, et al. Sleepy: wireless channel data driven sleep monitoring via commodity WiFi devices. IEEE T Big Data. Epub ahead of print 28 June 2018. DOI: 10.1109/TBDATA.2018.2851201.
3. Gu Y, Ren F and Li J. Paws: passive human activity recognition based on wifi ambient signals. IEEE Internet Thing J 2015; 3(5): 796–805.
4. Gu Y, Zhang X, Li C, et al. Your WiFi knows how you behave: leveraging WiFi channel data for behavior analysis. In: Proceedings of the 2018 IEEE global communications conference (GLOBECOM), Abu Dhabi, UAE, 9–13 December 2018, pp.1–6. New York: IEEE.
5. Liu X, Dong M, Ota K, et al. Trace malicious source to guarantee cyber security for mass monitor critical infrastructure. J Comput Syst Sci 2018; 98: 1–26.
6. Wu J, Ota K, Dong M, et al. Big data analysis-based security situational awareness for smart grid. IEEE T Big Data 2016; 4(3): 408–417.
7. Adeva JJG and Atxa JMP. Intrusion detection in web applications using text mining. Eng Appl Artif Intell 2007; 20(4): 555–566.
8. Tang B, He H, Baggenstoss PM, et al. A bayesian classification approach using class-specific features for text categorization. IEEE Trans Knowl Data Eng 2016; 28(6): 1602–1606.
9. Li D, Zhang B and Li C. A feature-scaling-based k-nearest neighbor algorithm for indoor positioning systems. IEEE Internet Thing J 2015; 3(4): 590–597.
10. Ding Q, Zhang J, Wang J, et al. Based on knn and rocchio improved text classification technology. Autom Instrum 2017; 8: 41.
11. Chandrasekhar AM and Raghuveer K. Intrusion detection technique by using k-means, fuzzy neural network and SVM classifiers. In: Proceedings of the international conference on computer communication and informatics, Coimbatore, India, 4–6 January 2013, pp.1–7. New York: IEEE.
12. Suykens JAK and Vandewalle J. Least squares support vector machine classifiers. Neural Proces Lett 1999; 9(3): 293–300.
13. Lin CT, George Lee CS, Lin CT, et al. Neural fuzzy systems: a neuro-fuzzy synergism to intelligent systems, vol. 205. Upper Saddle River, NJ: Prentice Hall, 1996.
14. Krishna K and Murty NM. Genetic k-means algorithm. IEEE T Syst Man Cyb: Part B 1999; 29(3): 433–439.
15. Rathore S, Sharma PK and Park JH. Xssclassifier: an efficient XSS attack detection approach based on machine learning classifier on SNSs. J Inform Process Syst 2017; 13(4): 1014–1028.
16. Breiman L. Random forests. Machine Learn 2001; 45(1): 5–32.
17. Freund Y and Mason L. The alternating decision tree learning algorithm. In: Proceedings of the sixteenth international conference on machine learning, ICML '99, Bled, Slovenia, 27–30 June 1999, pp.124–133. New York: ACM.
18. Castilla E, Martín N and Pardo L. A logistic regression analysis approach for sample survey data based on Phi-divergence measures. In: Gil E, Gil E, Gil J, et al. (eds) Mathematics of the uncertain. New York: Springer, 2018, pp.465–474.
19. Yang X, Wei LI, Sun M, et al. Web attack detection method on the basis of text clustering. CAAI Trans Intel Syst 2014; 9: 40–46.
20. Zhang H, Guan H, Yan H, et al. Webshell traffic detection with character-level features based on deep learning. IEEE Access 2018; 6: 75268–75277.
21. Yandong LI, Hao Z and Lei H. Survey of convolutional neural network. J Comput Appl 2016; 36: 2508–2515.
22. Greff K, Srivastava RK, Koutník J, et al. LSTM: a search space odyssey. IEEE T Neural Netw Learn Syst 2016; 28(10): 2222–2232.
23. Google. seq2seq, 2017, https://google.github.io/seq2seq/
24. Luong MT, Pham H and Manning CD. Effective approaches to attention-based neural machine translation, 2015, arXiv:1508.04025v5.
25. Giménez CT, Villegas AP and Marañón GA. Http data set CSIC 2010. Information Security Institute of CSIC (Spanish Research National Council), 2010, https://www.impactcybertrust.org/dataset_view?idDataset=940
26. Abadi M, Barham P, Chen J, et al. Tensorflow: a system for large-scale machine learning. In: Proceedings of the 12th USENIX symposium on operating systems design and implementation (OSDI 16), Savannah, GA, 2–4 November 2016, pp.265–283. New York: ACM.
27. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in python. J Machine Learn Res 2011; 12: 2825–2830.