Adv-Bot: Realistic Adversarial Botnet Attacks Against Network Intrusion Detection Systems
Islam Debicha a,b,∗, Benjamin Cochez a,∗, Tayeb Kenaza c, Thibault Debatty b, Jean-Michel Dricot a, Wim Mees b
a Cybersecurity Research Center, Université Libre de Bruxelles, 1000 Brussels, Belgium
b Cyber Defence Lab, Royal Military Academy, 1000 Brussels, Belgium
c Computer Security Laboratory, Ecole Militaire Polytechnique, Algiers, Algeria
Abstract
Due to the numerous advantages of machine learning (ML) algorithms, many applications now incorporate them. However, many studies in the field of image classification have shown that ML models can be fooled by a variety of adversarial attacks. These attacks exploit the inherent vulnerability of ML algorithms. This raises many questions in the cybersecurity field, where a growing number of researchers have recently been investigating the feasibility of such attacks against machine learning-based security systems, such as intrusion detection systems. The majority of this research demonstrates that it is possible to fool a model using features extracted from a raw data source, but it does not take into account the real implementation of such attacks, i.e., the reverse transformation from theory to practice. The real implementation of these adversarial attacks would be influenced by various constraints that make their execution more difficult. As a result, the purpose of this study is to investigate the actual feasibility of adversarial attacks, specifically evasion attacks, against network-based intrusion detection systems (NIDS), demonstrating that it is entirely possible to fool these ML-based IDSs using our proposed adversarial algorithm while assuming as many constraints as possible in a black-box setting. In addition, since it is critical to design defense mechanisms to protect ML-based IDSs against such attacks, a defensive scheme is presented. Realistic botnet traffic traces are used to assess this work. Our goal is to create adversarial botnet traffic that can avoid detection while still performing all of its intended malicious functionality.
Keywords: Intrusion detection system, Botnet attacks, Machine learning, Evasion attacks, Adversarial detection.
1. Introduction

Sophisticated methods of cybercrime, such as botnets, are becoming increasingly rampant. Given the severity of the threat, it is essential to have a defense mechanism that can detect all types of botnet traffic. Among these mechanisms, the use of intrusion detection systems dedicated to network analysis, known as NIDS, is gaining popularity. Given the difficulty some signature-based NIDSs have in detecting new botnet attacks, or even variants of known botnet attacks, NIDSs based on machine learning algorithms have become more prevalent. These machine learning-based NIDSs can detect not only variants of known attacks, but also novel attacks, also known as zero-day attacks [1, 2].

Despite this encouraging achievement, many recent studies have shown that it is possible to fool the ML algorithms used in these detection methods [3, 4], as has been discovered previously in other application areas, such as computer vision [5, 6]. These ML algorithms have been shown to be vulnerable to a variety of adversarial attacks, including poisoning, extraction, and evasion. Evasion attacks are studied in this paper because they are more realistic in cybersecurity scenarios than the other two types of attacks [7]. Evasion attacks can fool any ML model during its inference process by adding slight, often imperceptible, perturbations to the original instance, creating so-called adversarial instances.

The literature shows that there has been considerable research on the impact of adversarial attacks on machine learning models [8, 9, 10]. However, their feasibility in domain-constrained applications, such as intrusion detection systems, is still in its early stages [11, 12, 13]. Adversarial attacks can be performed in either white-box or black-box settings. Many white-box adversarial attacks, originally developed for computer vision applications [14, 15, 16], have been applied directly to network traffic without properly addressing domain constraints [17, 18, 19].

Aside from the issue of functionality preservation, white-box attacks necessitate knowledge of the target system's internal architecture. It is unlikely that an attacker would have access to the internal configuration of the ML model [8], making a white-box attack in a realistic environment less likely [7]. As a result, recent research [20, 21] has focused on designing adversarial network traffic in a black-box setting where the attacker has little or no knowledge of the defender's NIDS. Model querying [22] and transferability [14] are two methods for launching black-box attacks.

An attacker can extract useful information, such as the classifier's decision, by sending multiple queries to the target NIDS,

∗ Corresponding author
Email addresses: [email protected] (Islam Debicha), [email protected] (Benjamin Cochez)
Table 2: Description of common features used by flow-based NIDS for botnet attacks [47]

Feature           Description                                               Type         Category
Dur               Duration of the flow                                      Float        Modifiable (green)
Out/InBytes       Bytes outgoing/entering the network                       Integer      Modifiable (green)
TotPkts           Total of exchanged packets                                Integer      Modifiable (green)
TotBytes          Total of exchanged bytes                                  Integer      Dependent (yellow)
BytesPerPkt       Total bytes per packet                                    Float        Dependent (yellow)
BytesPerSec       Total bytes per second                                    Float        Dependent (yellow)
PktsPerSec        Total packets per second                                  Float        Dependent (yellow)
RatioOutIn        Ratio between outgoing and entering bytes                 Float        Dependent (yellow)
ConnectionState   The state of the connection                               Categorical  Unmodifiable (red)
FlowDirection     Direction of the flow                                     Categorical  Unmodifiable (red)
Src/DstPortType   Whether the port is a private, registered, or known port  Categorical  Unmodifiable (red)
IPSrc/DestType    The network IP source or destination                      Boolean      Unmodifiable (red)
Src/DstTos        Source and destination type of service                    Integer      Unmodifiable (red)

Concerning the knowledge assumptions, it is considered that the attacker has limited access to network data and has no information about the model and parameters of the NIDS used by the company. This means he can only intercept traffic that passes through his machine and gather limited information about the benign traffic that passes through the bot's machine. In terms of requirements, our scenario assumes that the attacker has previously breached the network by infecting a machine with malware in order to connect to the C&C server. As a result, the attacker is able to gather information about the benign traffic. Regarding the flow exporter, the attacker does not need knowledge about it or about the extracted features used by the NIDS, since he acts directly on factors that he can manipulate in the traffic space. Given the recurring use of certain network factors (in particular, packet duration, number, and size) in botnet detection by state-of-the-art NIDSs, the attacker can assume that these factors are part of the set of features used by the NIDS. The attacker can retrieve these feature lists from known exporters such as Argus^1, CIC-FlowMeter^2, or nProbe^3, and from scientific papers [48, 49, 3] explaining the features used by NIDS models. The attacker has the ability to communicate with and target the infected computing device. It is therefore also considered that the attacker can manipulate, in both directions of communication, the duration of packet transmission, the number of packets sent and received, as well as the number of bytes sent and received, as shown in Table 2, while respecting the semantic and syntactic constraints related to the network protocols used and maintaining the underlying logic of his malicious communications. In order to maintain the malware's full behavior, the attacker cannot directly act on certain features, such as the source and destination IP addresses or the type of service. In fact, the features highlighted in red (Unmodifiable) in Table 2 cannot be modified by the attacker, either directly or indirectly. Only the three green (Modifiable) features, Dur, Out/InBytes, and TotPkts, can be manipulated by the attacker. It should be noted that modifying the green features will result in some indirect changes to the yellow (Dependent) features. These changes must be taken into account in order to properly respect the semantic and syntactic constraints. To manipulate the network factors explained just before, the attacker can use the following three approaches:

1. Time manipulation attack: with this approach, it is possible to act on two time-related aspects during the attack. On the one hand, the frequency of the attack packets can be reduced by increasing the sending time between packets of the same flow, as in the work of Zhijun et al. [50], which shows the implication of this variant for DoS attacks. On the other hand, the frequency of attack packets can be moderately accelerated by decreasing the time taken to send the packets. These two variants directly influence the "Duration" feature and indirectly the "BytesPerSec" and "PktsPerSec" features.

2. Packet quantity manipulation attack: this attack can be carried out in two different manners. The first is packet injection, which consists of injecting new packets into the network flow, either by creating them directly with tools such as Hping^4 or Scapy^5, or by breaking a packet into several fragments using packet fragmentation, so as to preserve the content and underlying logic of the packet without damaging its overall behavior. Packet fragmentation is, for example, used by attacks such as the TCP fragment attack [51] or a DoS attack. A variant of this possibility would be to resend the same packet multiple times using a tool like Tcpreplay^6. The other way is packet retention, which consists of not sending a packet immediately, but rather sending it after a certain amount of time, thus pushing that packet into the next flow and avoiding having more packets in a single flow. With the same idea of retention, another possibility is to modify the general communication scheme used in the botnet attack so as to reduce the number of packets sent between the botnet devices. In both cases, this attack directly influences the "TotPkts" feature, and indirectly the "PktsPerSec" and "BytesPerPkt" features.

3. Byte quantity manipulation attack: similar to the previous approach, there are two ways to perform this attack. The first is byte expansion. In the case where the communication is not encrypted, a suggestion would be to directly modify the payload to obtain the desired number of bytes. In case it is encrypted, it is assumed that the attacker knows the cryptographic suite used to encrypt the two-way communication channel. This would allow him to add padding using a tool like Scapy, calculate its encrypted version, and then verify that the packet size is the desired one.

^1 https://openargus.org/
^2 https://github.com/CanadianInstituteForCybersecurity/CICFlowMeter
^3 https://www.ntop.org/products/netflow/nprobe/
^4 http://www.hping.org/
^5 https://scapy.net/
^6 https://tcpreplay.appneta.com/
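The time and packet-quantity manipulations (approaches 1 and 2 above) can be sketched at the flow level. The snippet below is a deliberately simplified illustration, not the authors' implementation: a flow is modeled as a plain list of (timestamp, payload-size) pairs, and the helper names are hypothetical.

```python
# Toy flow model: a flow is a list of (timestamp_seconds, payload_bytes) packets.
# Hypothetical helpers illustrating approaches 1 and 2, not the authors' code.

def stretch_inter_packet_delay(packets, factor):
    """Time manipulation: scale the gaps between consecutive packets of a flow.
    This changes Dur directly and PktsPerSec/BytesPerSec indirectly."""
    if not packets:
        return []
    out = [packets[0]]
    for (prev_t, _), (t, size) in zip(packets, packets[1:]):
        gap = (t - prev_t) * factor
        out.append((out[-1][0] + gap, size))
    return out

def retain_late_packets(packets, flow_timeout):
    """Packet retention: packets falling past the flow timeout are deferred to
    the next flow, lowering TotPkts of the current flow."""
    start = packets[0][0]
    current = [p for p in packets if p[0] - start <= flow_timeout]
    deferred = [p for p in packets if p[0] - start > flow_timeout]
    return current, deferred

def flow_features(packets):
    """Recompute the time-related features of Table 2 for a (sub-)flow."""
    dur = packets[-1][0] - packets[0][0]
    tot_pkts = len(packets)
    tot_bytes = sum(size for _, size in packets)
    return {
        "Dur": dur,
        "TotPkts": tot_pkts,
        "PktsPerSec": tot_pkts / dur if dur else float(tot_pkts),
        "BytesPerSec": tot_bytes / dur if dur else float(tot_bytes),
    }
```

Stretching the inter-packet gaps increases "Dur" and lowers "PktsPerSec" and "BytesPerSec" without touching any payload, while retention moves trailing packets into the next flow and so reduces "TotPkts" of the current one.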
The addition of padding is known to both communicating parties, allowing the receiver to remove the unnecessary part and thus recover the original payload. The second method is byte squeezing. The idea is to reduce the number of bytes sent. To do this, it would be possible to compress the data to directly reduce the size of the payload using a tool like Zlib [52]. This is a common tool already used in the IoT domain to reduce the bytes exchanged, as explained in [53]. Another possibility would be to modify the general behavior of the botnet system by minimizing the content of the payload to be sent. This second possibility can only be done before the machine is infected by the malicious bot software, but can be proactively prepared by an attacker. Furthermore, whether it is byte expansion or byte squeezing, the attack directly influences the "OutBytes" and "InBytes" features, and indirectly the "TotBytes", "BytesPerPkt", "BytesPerSec", and "RatioOutIn" features.

[Figure 2: An illustration of the adversarial botnet traffic generation steps — on the attacker side, surrogate models trained with sniffed traffic (steps 1 and 2); transferability (step 3) carries the adversarial traffic across the Internet to the defender NIDS protecting the enterprise network; the bot is the final destination (step 4).]

As shown in Figure 2, there are four steps in the process of creating adversarial botnet traffic. During step 1, the attacker generates adversarial traffic that is specifically designed to bypass the surrogate models that the attacker previously trained using sniffed traffic. The attacker then receives and analyzes the adversarial traffic that managed to avoid detection by the surrogate models during step 2. During step 3, the attacker uses the transferability property to send adversarial botnet traffic to the defender NIDS. In step 4, the adversarial botnet traffic that successfully bypassed the defender NIDS arrives at its final destination, the bot.

3.2. Datasets

To perform reproducible experiments, the CTU-13 [54] and CSE-CIC-IDS2018 [55] datasets were used, providing results that are comparable across multiple datasets as well as a wider range of attacks.

• CTU-13: This dataset, provided by the Czech Technical University in Prague, contains labeled traffic from 13 different application scenarios. Each of them follows the same process, involving a different botnet attack variant (Neris, Rbot, Virut, Murlo, Menti, NSIS.ay, and Sogou). The process involves monitoring a network for both benign network communications and malicious traffic executed by the botnet attack. All of this traffic is extracted as PCAP files and then formatted as flows via the Argus tool, which turns raw network data into flows.

• CSE-CIC-IDS2018: This dataset contains a set of computer attack scenarios, such as Brute-force, Heartbleed, Botnet, DoS, DDoS, and Web attacks. There are two variants of botnet attacks: Zeus and Ares. The network traffic for each scenario was extracted as a PCAP file and formatted by CICFlowMeter to provide a set of 83 network features for thousands of labeled network flows. This dataset, being relatively recent, provides consistent data following the dataset creation methodology presented in [56], yielding a reliable and realistic dataset.

Other than botnets, CTU-13 and CSE-CIC-IDS2018 contain a variety of attack types. Because our research focuses solely on botnet attacks, all other attack types were removed to create a dataset containing only botnet attack records. To avoid incorporating potential biases when creating these new datasets, we relied on the work done by Venturi et al. [47]. Features were filtered to keep only those consistent for the study of botnet attacks and common to both datasets used. These features are those described in Table 2. Regarding CTU-13, the botnet attacks considered in this work are Neris, Virut, and Rbot. Other attacks were not included because they do not have enough malicious instances to provide consistent results. For CSE-CIC-IDS2018, the Zeus and Ares botnet attacks, being indistinguishable in the original dataset, were extracted into the same dataset (Zeus & Ares).

To ensure the practicality of the present work, the CTU-13 and CSE-CIC-IDS2018 datasets were divided into two equivalent datasets, stratified according to the labels. These datasets are equivalent in terms of size and distribution, which represents more than 32,000 instances for each side. The first is used for training and evaluation of the model used by the defender. The second is used by the attacker to train the surrogate models independently. The attacker can obtain this data by sniffing the network. This is particularly possible in the case of a botnet scenario, as the attacker communicates bidirectionally with the infected device. The datasets for each side (defender and attacker) are split into a training dataset and a test dataset for validation, with proportions of 75% and 25% respectively. Each training and test subset is evenly split in terms of malicious and benign traffic. The datasets are separated in this manner to have the most balanced representation, so as to avoid the problem of unbalanced data, which is out of the scope of this study. The attacker and defender datasets thus follow similar but not identical distributions, since they are not the same datasets. The arrangement of the datasets is illustrated in Figure 3.

3.3. Preprocessing

General preprocessing has already been performed on the dataset provided by Venturi et al. [47], resulting in clean data where outliers, empty values, and non-TCP traffic were removed. Data filtering and "one-hot encoding" of categorical features were carried out.
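The partitioning described above (a 50/50 defender/attacker split, then a stratified 75/25 train/test split on each side) can be sketched without any ML library; `stratified_split` is a hypothetical helper, not the authors' code.

```python
import random
from collections import defaultdict

# Hypothetical helper sketching the partitioning above: a stratified split that
# preserves the label distribution, used once for the 50/50 defender/attacker
# split and again for each side's 75/25 train/test split.

def stratified_split(indices, labels, fraction, seed=0):
    """Return two index lists; `fraction` is the share of each label that goes
    to the first part."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, y in zip(indices, labels):
        by_label[y].append(idx)
    first, second = [], []
    for idxs in by_label.values():
        rng.shuffle(idxs)
        cut = round(len(idxs) * fraction)
        first.extend(idxs[:cut])
        second.extend(idxs[cut:])
    return sorted(first), sorted(second)
```

Applying it twice (50/50, then 75/25 on each half) reproduces the arrangement of Figure 3 while keeping the benign/botnet proportions identical on every side.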
[Figure 3: Partitioning of the datasets for experiments — the defender dataset and the attacker dataset (50% of the data each) are each split into a train dataset (75%) and a test dataset (25%).]

[Figure 4: Machine learning pipeline for both attacker and defender.]

This encoding transforms the categorical data into a binary format that can be exploited by the model. However, some inconsistencies are present in the dataset provided by Venturi et al. [47]. By default, when the "OutBytes" and "InBytes" features are set to 0, the "RatioOutIn" feature is set to the maximum value present in the dataset, simulating an infinite value. Since the ratio in this case is 0/0, we chose to replace it with 0 instead of a pseudo-infinite value, representing a zero-byte exchange.

For training the neural network algorithms, some additional preprocessing was applied to the training and test data. First, the data were normalized using a min-max scaling method, projecting each value into the interval between zero and one, according to Eq. 4. Then, the labels undergo one-hot encoding to transform them into binary values so that they can be processed by the MLP.

X_scaled = (X − min(X)) / (max(X) − min(X))        (4)

where X is an instance and X_scaled is the normalized instance.

3.4. ML-based NIDS

As neural networks are increasingly used in the context of NIDS based on machine learning algorithms, a couple of them are chosen, as well as more classical ML algorithms. On the defender side, a Multilayer Perceptron (MLP), a Random Forest (RF), and a K-Nearest Neighbors (KNN) algorithm are used as models for the IDS. For the attacker, the same algorithms are chosen, with different parameters and hyperparameters, and trained on a different dataset. These three ML models were chosen in our work given their popular use in the IDS community [57, 58, 59]. All these algorithms follow the same training and testing process, as shown in Figure 4.

Meta-parameters are chosen randomly to avoid having the same model parameters between the attacker and the defender. Examples of meta-parameters are the number of neighbors k defined in the KNN algorithm, the number of hidden layers or neurons in the DNN, or the number of estimators used in the RF. The choice to use several algorithms allows for more comparable results, especially in the case of transferability. Having several ML models allows for a better understanding of the impact of transferability of adversarial network traffic, both intra- and cross-transferability. The meta-parameters of all models can be seen in Table 3.

Table 3: Meta-parameters of the selected ML models

Classifier      Attacker parameters        Defender parameters
MLP             Hidden layers (HL) = 3     Hidden layers (HL) = 2
                Neurons by HL = 128        Neurons by HL = 256
                Activation = ReLU          Activation = ReLU
                Optimizer = Adam           Optimizer = Adam
Random Forest   Nb estimators = 300        Nb estimators = 200
                Criterion = Gini           Criterion = Gini
                Bootstrap = True           Bootstrap = True
KNN             Nb neighbors = 5           Nb neighbors = 3

3.5. Proposed evasion attacks

In order to generate adversarial perturbations, which are added to the malicious instances to make them appear benign, we propose two evasion attack variations, formulated in Eq. 5 and Eq. 6.

x_adv^t(f) = Proj[ x_adv^(t−1)(f) + sign(benign_mean(f) − x_0(f)) · (c · t) · mean_ratio(f) ]        (5)

where x_0(f) is the initial value of the targeted network factor f in the instance, the sign function specifies the direction of the perturbation, c is a multiplicative coefficient that regulates the perturbation rate generated at each step t, benign_mean(f) is the mean of the benign set for the targeted network factor f, which the attacker can obtain by sniffing network traffic, mean_ratio(f) is the ratio between the mean of the malicious set and that of the benign set for the targeted network factor f, and Proj is a projection function that projects the values of modified features that violate syntactic and semantic constraints into the space of valid values. These modified features are only network factors that the attacker can manipulate directly or indirectly, as represented by the green (Modifiable) and yellow (Dependent) groups in Table 2. If the attacker manipulates one of the modifiable features shown in green in Table 2, the Proj function makes sure that the dependent features change values accordingly. For instance, changing the flow duration induces a change in the "total packets per second" feature, which equals the number of packets divided by the flow duration. In a nutshell, the projection function is what allows, when features are modified, to
ensure that the domain constraints are respected. This enables the adversarial instances created to be valid and reversible in the targeted domain. In this manner, the malware's intended behavior is preserved.

x_adv^t(f) = Proj[ x_adv^(t−1)(f) + sign(benign_mean(f) − x_0(f)) · (c · t) · |mean_diff(f)| ]        (6)

where mean_diff(f), analogously to mean_ratio(f), denotes the difference between the mean of the malicious set and that of the benign set for the targeted network factor f.
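A minimal sketch of one perturbation step of Eq. 5/Eq. 6, together with a toy Proj function that re-derives the dependent features of Table 2, is given below. It illustrates the idea under simplified assumptions and is not the authors' Algorithm 1; `perturb_step` and `project` are hypothetical names. Passing mean_ratio(f) as the scale corresponds to Eq. 5, and |mean_diff(f)| to Eq. 6.

```python
# Toy sketch of one perturbation step of Eq. 5 / Eq. 6 with a simplified Proj.
# The feature dependencies mirror Table 2 (e.g., PktsPerSec = TotPkts / Dur).

def perturb_step(x, f, x0, benign_mean, scale, c, t):
    """One iteration: scale = mean_ratio(f) gives Eq. 5, scale = |mean_diff(f)|
    gives Eq. 6; the perturbation is pushed toward the benign mean."""
    direction = 1.0 if benign_mean[f] >= x0[f] else -1.0
    x = dict(x)
    x[f] = x[f] + direction * (c * t) * scale[f]
    return project(x)

def project(x):
    """Proj: keep modifiable features valid and re-derive dependent features so
    the semantic constraints of the flow keep holding."""
    x["Dur"] = max(x["Dur"], 1e-6)              # duration stays positive
    x["TotPkts"] = max(round(x["TotPkts"]), 1)  # packet counts are integers >= 1
    x["OutBytes"] = max(round(x["OutBytes"]), 0)
    x["InBytes"] = max(round(x["InBytes"]), 0)
    x["TotBytes"] = x["OutBytes"] + x["InBytes"]
    x["PktsPerSec"] = x["TotPkts"] / x["Dur"]
    x["BytesPerSec"] = x["TotBytes"] / x["Dur"]
    x["BytesPerPkt"] = x["TotBytes"] / x["TotPkts"]
    x["RatioOutIn"] = x["OutBytes"] / x["InBytes"] if x["InBytes"] else 0.0
    return x
```

Note how manipulating a single green feature (here "Dur") propagates, through the projection, to every yellow feature that depends on it, which is exactly what keeps the adversarial instance semantically valid.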
Table 4: Feature combinations with their corresponding masks

Combination   Mask   Target factors
1             0001   Duration
2             0010   TotPackets
3             0011   Duration, TotPackets
4             0100   InBytes
5             0101   InBytes, Duration
6             0110   InBytes, TotPackets
7             0111   InBytes, TotPackets, Duration
8             1000   OutBytes
9             1001   OutBytes, Duration
10            1010   OutBytes, TotPackets
11            1011   OutBytes, TotPackets, Duration
12            1100   OutBytes, InBytes
13            1101   OutBytes, InBytes, Duration
14            1110   OutBytes, InBytes, TotPackets
15            1111   OutBytes, InBytes, TotPackets, Duration

…domain-constrained systems such as NIDS, and that there is still room for improvement [41, 60], although some of them have the potential to reduce the impact of adversarial instances [14, 61]. For this reason, a defense is proposed in this work.
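The mask scheme of Table 4 can be reproduced in a few lines; the bit order (OutBytes, InBytes, TotPackets, Duration, most-significant bit first) is inferred from the table, and the function names are hypothetical.

```python
# Sketch reproducing the mask scheme of Table 4. The bit order (OutBytes,
# InBytes, TotPackets, Duration, most-significant bit first) is inferred from
# the table; the function names are hypothetical.

FACTORS = ("OutBytes", "InBytes", "TotPackets", "Duration")

def targets_for_mask(mask):
    """Return the target factors selected by a 4-bit combination mask."""
    return [f for bit, f in enumerate(FACTORS) if mask & (1 << (3 - bit))]

def all_combinations():
    """Enumerate the 15 non-empty combinations listed in Table 4."""
    return {mask: targets_for_mask(mask) for mask in range(1, 16)}
```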
[Figure 8: Proposed defense approach using MLP sub-detectors.]

…undergo a contextual discounting and then a Bayesian fusion, as defined in Eq. 7, where the prediction matrix is multiplied by the corresponding weights, and the three resulting matrices are then summed. Once this step is complete, the new values are normalized to obtain the final probabilities, where Pa represents the probability that the instance is adversarial and Pc the probability that it is clean. The result is a final prediction of whether an instance is adversarial or not, and thus whether it is subject to rejection by the detector.

Pa = ( Σ_{i=1}^{3} Pa_i · w_i ) / ( Σ_{i=1}^{3} (Pa_i + Pc_i) · w_i )        (7)

…is investigated to see the impact of transferability between the same and different models as well as the training data. A study of the time taken by each attack is also considered, as well as an analysis of the differences in perturbation between the initial malicious instance and the adversarial instance. Then, a general comparison is made of the performance of the proposed adversarial generation algorithm across the botnet attacks present in each dataset. The last section includes the results of the proposed defense against the adversarial instances generated by the previously proposed evasion attack algorithm.

4.1. Initial performance of ML-IDS models in clean settings
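The discounting-and-fusion step of Eq. 7 above reduces to a weighted normalization over the three sub-detectors. The following toy sketch (hypothetical names; the weights w_i are assumed to be given) illustrates it:

```python
# Toy sketch of the discounting-and-fusion step of Eq. 7: each sub-detector i
# outputs (Pa_i, Pc_i), the weighted sums are taken, and the result is
# normalized so that Pa + Pc = 1. The weights w_i are assumed to be given.

def fuse(predictions, weights):
    """predictions: list of (Pa_i, Pc_i); weights: list of w_i."""
    pa = sum(p_adv * w for (p_adv, _), w in zip(predictions, weights))
    pc = sum(p_clean * w for (_, p_clean), w in zip(predictions, weights))
    total = pa + pc
    return pa / total, pc / total  # final (Pa, Pc)
```

Discounting a weak sub-detector (a small w_i) shrinks its contribution to both sums, which is why an unreliable detector cannot drag down the fused decision.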
Attacker side:

Rbot
Attacker/Attacker   MLP   RF    KNN   Mean
MLP                  0%    5%    0%    2%
RF                  18%    0%   16%   11%
KNN                  2%    7%    0%    3%

Virut
Attacker/Attacker   MLP   RF    KNN   Mean
MLP                  0%   69%    1%   23%
RF                  28%    0%   23%   17%
KNN                  7%   72%    0%   26%

Grand Mean: 22.2%

Defender side:

Rbot
Attacker/Defender   MLP   RF    KNN   Mean
MLP                  1%    6%    0%    2%
RF                  18%   22%   15%   18%
KNN                  2%    8%    1%    4%

Virut
Attacker/Defender   MLP   RF    KNN   Mean
MLP                 25%   69%    2%   32%
RF                  28%   90%   24%   47%
KNN                 29%   71%    2%   34%

Grand Mean: 37.8%
…RF, and KNN, respectively, as shown in Table 10, resulting in an average of 28% for the defender models. This is a relative success for the attacker, since the malicious traffic is only detected 28% of the time and thus has almost a three-quarter chance of evading the defender's installed intrusion detection systems in a black-box setting.

Comparing Table 9 and Table 10, we can observe that the adversarial instances generated by the attacker are, on average, more effective against his own models than against the models of the defender, who uses different training data and hyperparameters. It can also be observed that intra-transferability has more impact than cross-transferability. Even if this loss is not negligible, the results show good performance on average, through both intra- and cross-transferability.

The time taken to generate an adversarial instance for each of the botnet attacks and models is shown in Table 11. MLP is, for each of the attacks, the algorithm that consumes the most time to generate adversarial instances, followed by RF, which is an ensemble method. On the other hand, KNN is the fastest algorithm to generate adversarial instances for the four botnet attacks, which would make it an interesting option if a trade-off between effectiveness and time had to be made.

Table 11: Time taken (in seconds) to generate 3000 adversarial instances for all botnet attacks using Algorithm 1

Model\Botnet   Neris   Rbot    Virut   Zeus & Ares   Mean
MLP            32.48   10.21   11.69   15.22         17.40
RF             16.95    5.48    9.55    6.85          9.71
KNN             3.17    1.92    1.95    0.14          1.79
Grand Mean: 9.63

4.3. Proposed defense effectiveness

As we discussed in Section 4.2, the attacker is able to evade the defender's NIDS with a high success rate (i.e., a low detection rate) by relying solely on the transferability property of adversarial instances. Our proposed defense aims to negate the effect of the transferability property by adding an adversarial detector to filter out adversarial instances, allowing the NIDS to process only clean traffic.

We first evaluate the performance of the proposed adversarial detector and its sub-detectors by generating adversarial instances based on the CTU-13 and CSE-CIC-IDS2018 datasets. During this analysis, each sub-detector is evaluated as shown in Figure 7. To measure this performance, various metrics are used: recall as defined in Eq. 1, precision as defined in Eq. 2, and F1-score as defined in Eq. 3.

As shown in Figure 11, the first two sub-detectors perform quite well, reaching more than 96% for each metric on CTU-13 and 99% on CSE-CIC-IDS2018, while the last one is less effective, with performance around 70%. We can also observe that the final detector performance after Bayesian fusion provides good performance,
reaching 97% for CTU-13 and 100% for CSE-CIC-IDS2018.

The poorer performance of the third detector can be explained by the fact that it is trained with the group of non-modifiable features. Since the value of these features does not change, the detector is not able to distinguish adversarial instances from clean instances, thus behaving randomly.

We also note that the fusion of the three detectors slightly improved the overall performance of the proposed defense compared to the individual detectors. The inferior performance of the third detector does not seem to diminish the performance of the proposed defense, thanks to the contextual discounting mechanism, which allows the performance of each individual detector to be taken into account during the fusion stage.

[Figure 11: Performance of our proposed defense against adversarial traffic — precision, recall, and F1-score on CTU-13 and CSE-CIC-IDS2018 for sub-detector 1 (modifiable features), sub-detector 2 (dependent features), sub-detector 3 (unmodifiable features), and the proposed defense approach (fusion).]

…tal adversarial botnet traffic sent by the attacker, thus protecting the NIDS from getting evaded by these adversarial instances.

Table 12c represents the detection rate of the defender's NIDS protected by our proposed adversarial defense. In fact, it shows the impact of adversarial instances that have made it through our adversarial detector and reached the NIDS. It can be seen from this table that the NIDS has, on average, a detection rate of 96.9% for any type of machine learning model used for adversarial generation, across all botnet attacks.

These results indicate that the NIDS is significantly more robust once the adversarial detection method is used, going from an average detection rate of 21.3% without defense, as shown in Table 13a, to 96.9% when using the proposed adversarial detector. This also shows that the NIDS is hardly affected by adversarial instances capable of passing the adversarial detector.

Table 12: Proposed adversarial defense effectiveness

(a) Detection rate of the defender MLP-based NIDS against adversarial instances without adversarial defense

Defender using MLP-based NIDS without adversarial defense
Attacker model\Botnet   Neris   Rbot   Virut   Zeus & Ares   Mean
MLP                     18%     1%     25%     0%            11%
RF                      41%     18%    28%     50%           34%
KNN                     35%     2%     29%     50%           29%
Grand Mean: 24.8%

(b) Detection rate of our proposed adversarial defense

Our proposed adversarial defense performance
Attacker model\Botnet   Neris   Rbot   Virut   Zeus & Ares   Mean
MLP                     95%     99%    98%     100%          98%
on both non-adversarial and adversarial samples. When applying the two models to a sample, one may assume that if the two models classify it differently, the sample is adversarial. In principle, if the provided sample is not adversarial, both the base model and the robust model should identify it correctly; if the sample is adversarial, the base model will misclassify it, while the robust model should still classify it properly.

The adversarial training detection defense is built by training a second, defense-side MLP-based IDS on datasets containing an equal mix of adversarial and non-adversarial samples. We then use the same adversarial instances crafted by the attacker to attack both models and record their predictions. By comparing these two sets of predictions, we infer a final set of predictions that classifies each instance as adversarial or non-adversarial.

Table 13: Detection rate comparison between adversarial training detection and our proposed adversarial defense

(a) Detection rate of our proposed adversarial defense

Attacker model \ Botnet    Neris   Rbot   Virut   Zeus & Ares   Mean
MLP                        95%     99%    98%     100%          98%
RF                         88%     94%    91%     84%           89%
KNN                        90%     99%    97%     86%           93%
Grand Mean                                                      93.4%

(b) Detection rate of adversarial training detection

ing valid adversarial network traffic by adding small perturbations, thus evading NIDS protection with high probability while maintaining the underlying logic of the botnet attack. To the best of our knowledge, this is the first complete black-box botnet attack that evades NIDS by exploiting the transferability property, without using any query method and with very limited knowledge of the target NIDS, acting on the traffic space while respecting the domain constraints.

The second component of the proposed framework is a reactive defense that limits the impact of the proposed attack. Inspired by adversarial detection, this defense does not change the initial performance of the NIDS, since it provides an additional layer of security independent of the model. The defense is considered modular because it uses an ensemble method, bagging, yet can use any type of machine learning algorithm. In addition to this ensemble method, it includes a contextual discounting method that improves the overall performance of the defense. The results showed that the proposed defense detects most adversarial botnet traffic, with promising results compared to state-of-the-art defenses. Since the proposed framework is easily adaptable to other domains, evaluating its performance in other highly constrained domains would be interesting future work.
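The two-model comparison described above can be sketched in a few lines: a base IDS trained on clean traffic and a robust IDS trained on a mix of clean and adversarial traffic each classify a sample, and disagreement flags it as adversarial. The toy data, feature dimensions, and the choice of scikit-learn's MLPClassifier are illustrative assumptions, not the paper's actual models or dataset:

```python
# Minimal sketch of disagreement-based adversarial detection (assumptions:
# synthetic flow features, scikit-learn MLPClassifier as both IDS models).
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Toy stand-ins for flow-level feature vectors (not real botnet traffic).
X_clean = rng.normal(0.0, 1.0, size=(400, 10))
y_clean = (X_clean[:, 0] > 0).astype(int)

# Crude "adversarial" variants: malicious flows shifted by a small
# perturbation so the base model tends to misclassify them.
X_adv = X_clean[y_clean == 1] - 1.5
y_adv = np.ones(len(X_adv), dtype=int)

# Base IDS: trained on clean traffic only.
base = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
base.fit(X_clean, y_clean)

# Robust IDS: trained on an equal-style mix of clean and adversarial samples.
X_mix = np.vstack([X_clean, X_adv])
y_mix = np.concatenate([y_clean, y_adv])
robust = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
robust.fit(X_mix, y_mix)

def flag_adversarial(X):
    """A sample is flagged as adversarial when the two models disagree."""
    return base.predict(X) != robust.predict(X)

print("flag rate on adversarial batch:", flag_adversarial(X_adv).mean())
print("flag rate on clean batch:", flag_adversarial(X_clean).mean())
```

On this toy data the disagreement rate is much higher on the perturbed batch than on the clean one, which is exactly the signal the detection scheme exploits.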
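The modular fusion defense summarized in the conclusion, an ensemble of sub-detectors whose votes are contextually discounted, can be illustrated with a simplified probability-level sketch. The reliability values, detector outputs, and the renormalized weighted fusion below are illustrative assumptions, not the paper's exact evidential formulation:

```python
# Simplified sketch of contextual discounting over an ensemble of
# sub-detectors: each detector's per-class vote is mixed toward the
# uniform prior in proportion to its estimated unreliability on that
# class, then the discounted opinions are averaged and renormalized.
import numpy as np

def discounted_fusion(probs, reliability):
    """Fuse detector outputs with per-class (contextual) discounting.

    probs:       (n_detectors, n_classes) probability vectors for one sample.
    reliability: (n_detectors, n_classes) values in [0, 1]; a fully
                 unreliable vote collapses to the uniform distribution.
    """
    n_det, n_cls = probs.shape
    uniform = np.full(n_cls, 1.0 / n_cls)
    discounted = reliability * probs + (1.0 - reliability) * uniform
    fused = discounted.mean(axis=0)
    return fused / fused.sum()

# Three hypothetical sub-detectors judging one flow: [P(benign), P(adversarial)].
probs = np.array([[0.9, 0.1],   # detector 1: confident benign
                  [0.2, 0.8],   # detector 2: confident adversarial
                  [0.4, 0.6]])  # detector 3: mildly adversarial

# Illustrative reliabilities, e.g. estimated on validation data: detector 1
# is known to be weak on adversarial traffic, so its vote is discounted there.
reliability = np.array([[0.9, 0.2],
                        [0.8, 0.9],
                        [0.7, 0.8]])

fused = discounted_fusion(probs, reliability)
print("fused distribution:", fused)
```

A plain average of the three votes would end in a tie here; after discounting detector 1's unreliable benign vote, the fused decision tips toward the adversarial class, which is the intended effect of weighting each vote by contextual reliability.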
Islam Debicha received his master's degree in Computer Science with a focus on network security in 2018. He is pursuing a joint Ph.D. in machine learning-based intrusion detection systems at ULB and ERM, Brussels, Belgium. He works at the ULB Cybersecurity Research Center and the Cyber Defence Lab on evasion attacks against machine learning-based intrusion detection systems. He is the author or co-author of 6 peer-reviewed scientific publications. His research interests include defenses against adversarial examples, machine learning applications in cybersecurity, data fusion, and network security.

Benjamin Cochez received his master's degree in Cybersecurity from the Université Libre de Bruxelles (ULB). He works as a cybersecurity consultant for a consultancy company based in Brussels, and his main areas of interest are cloud security, endpoint security, machine learning, and adversarial learning.

Tayeb Kenaza received a Ph.D. degree in computer science from Artois University, France, in 2011. Since 2017, he has been the Head of the Computer Security Laboratory at the Military Polytechnic School of Algiers, where he is currently a full professor in the Computer Science Department. His research and publication interests include computer network security, wireless communication security, intelligent systems, and data mining.

Thibault Debatty obtained a master's degree in applied engineering sciences at the Royal Military Academy (RMA) in Belgium, followed by a master's degree in applied computer science at the Vrije Universiteit Brussel (VUB). He then obtained a Ph.D. with a specialization in distributed computing at both Telecom Paris and the RMA. He is now an associate professor at the RMA, where he teaches courses in networking, distributed information systems, and information security. He is also president of the jury of the Master of Science in Cybersecurity organized by the Université Libre de Bruxelles (ULB), the RMA, and four other institutions.
Jean-Michel Dricot leads research on network security with a specific focus on the IoT and wireless networks. He teaches communication networks, mobile networks, the Internet of Things, and network security. Prior to his tenure at the ULB, he obtained a Ph.D. in network engineering, with a focus on wireless sensor network protocols and architectures. In 2010, Jean-Michel Dricot was appointed professor at the Université Libre de Bruxelles, with tenure in mobile and wireless networks. He is the author or co-author of more than 100 papers published in peer-reviewed international journals and conferences and has served as a reviewer for European projects.
Wim Mees is Professor in computer science and cyber security at the Royal Military Academy, where he leads the Cyber Defence Lab. He also teaches in the Belgian inter-university Master in Cybersecurity and in the Master in Enterprise Architecture at the IC Institute. He has participated in and coordinated numerous national and European research projects, as well as EDA and NATO projects and task groups.