Botnet Detection Using Recurrent Variational Autoencoder
Jeeyung Kim∗ , Alex Sim∗ , Jinoh Kim† , Kesheng Wu∗
∗ Lawrence Berkeley National Laboratory, Berkeley, CA, USA
† Texas A&M University, Commerce, TX, USA
Email: ∗ {jeeyungkim, asim, kwu}@lbl.gov, † [email protected]
Abstract—Botnets are increasingly used by malicious actors, creating an increasing threat to a large number of internet users. To address this growing danger, we propose to study methods to detect botnets, especially those that are hard to capture with the commonly used methods, such as the signature-based ones and the existing anomaly-based ones. More specifically, we propose a novel machine learning based method, named Recurrent Variational Autoencoder (RVAE), for detecting botnets through sequential characteristics of network traffic flow data including attacks by botnets. We validate the robustness of our method with the CTU-13 dataset, where we have chosen the testing dataset to contain different types of botnets than the training dataset. Tests show that RVAE is able to detect botnets with the same accuracy as the best known results published in the literature. In addition, we propose an approach to assign anomaly scores based on probability distributions, which allows us to detect botnets in streaming mode as new networking statistics become available. This on-line detection capability would enable real-time detection of unknown botnets.

Index Terms—anomaly detection system, botnet detection, network security, online detection, Recurrent Neural Network, Variational Autoencoder

I. INTRODUCTION

Botnets are one of the most significant threats to cybersecurity, as they are considered a source of many malicious activities [1]. The machines in a botnet are typically hijacked without the owner's knowledge. These machines are then commanded to act together to attack more machines and find more valuable targets. They are also frequently used to perform distributed denial-of-service (DDoS) attacks, click-fraud, spamming and crypto-mining. These botnets can also harbor malware and ransomware for delivery to the victims of their attacks. Therefore, a critical task of cybersecurity research is to detect botnets and stop their attacks.

At the same time, the malicious software for infecting machines and operating botnets is evolving to evade detection, rendering many of the commonly used botnet detection techniques ineffective. For example, the protocol used by botnets has changed. Initially, Internet Relay Chat (IRC), in which the bot master controls the other bots, was adopted as the communication method. After that, peer-to-peer (P2P), where individual bots serve as both clients and servers, was widely used, and then HTTP-based botnets hijacking legitimate communication channels began to flourish [2]. Moreover, botnet attack techniques are continuously evolving. In 2016, a new botnet named Mirai started to control hundreds of thousands of Internet of Things (IoT) devices, mounting a high-profile DDoS threat. Mirai has spawned other botnets that imitate its infection strategy [3]. Other sophisticated botnet systems, such as Smominru, which is known as a crypto-mining botnet, have been a rampant threat since 2018. There are also examples where botnets have operated without being detected for an extended period of time by gradually changing their mode of operation. Detecting these botnets requires the detection algorithms to evolve with time and to adapt quickly with extensive retraining.

There is a significant body of published literature on detecting malicious botnets. The existing approaches to botnet detection fall into two broad categories: honeypots and Intrusion Detection Systems (IDS). A honeypot is a computer mechanism that is used as a trap to draw the attention of attackers toward attacking that computer system [4]. The honeypot approach has several limitations in terms of scale and the types of attacks it can detect [4]. On the other hand, IDS methods, which monitor a network or systems for malicious activity, are further divided into two categories: signature-based and anomaly-based methods. A signature-based method is configured with a set of rules or signatures to classify types of network traffic. This approach requires a relatively small amount of computation and can often work in real-time without slowing down normal network operations. The effectiveness of signature-based methods is widely studied, but they are only able to identify well-known botnets [4], [5].

On the other hand, anomaly-based techniques detect botnets based on a number of network traffic anomalies such as high network latency, high volumes of traffic and unusual system behavior [4]. They model the normal behavior of network traffic and design a decision engine which treats any divergence or statistical deviation from the norm as a threat [6]. Traditionally, many studies have attempted to use statistical features or heuristic methods to detect botnet anomalies [7], [8]. Recently, with the motivation of building more generalized botnet detectors for anomaly-based IDS, there have been many studies on machine learning (ML) methods to analyze botnet behavior for anomaly detection. With an anomaly detection system utilizing ML, previously unseen types of botnet attacks can be detected based on their behavior [2], [9]–[12].

However, many studies suggesting ML methods for botnet detection are limited in that they do not share the testing
dataset, which makes them incomparable to each other [2], [9]–[12]. In addition, most previous studies do not take into account sequential patterns within network traffic data, even though botnet traffic shows repeated behavior patterns due to the pre-programmed nature of bots [13]. There are some studies considering periodic behaviors, but they are still limited because they only consider sequential characteristics within the same source IP address rather than within the overall network traffic, which makes them difficult to use in an online detection system [14], [15]. Furthermore, some existing studies narrow their scope by evaluating their methods on only one of the IRC, P2P, and HTTP traffic types. For these reasons, those methods fail to validate their ability to cope with various types of botnets or previously unseen botnet families, which limits their practical use given the continual introduction of new botnet families [14]–[17]. Only a method which shows effectiveness on various types of botnets can be considered reliable and practically useful.

The goal of this paper is to propose an ML method which is capable of reflecting periodicity within network data as well as detecting previously unseen types of botnets in an on-line manner. We make three main contributions in this paper:

• We adapt the Recurrent Variational Autoencoder (RVAE) architecture for anomaly detection. This detection method can be trained on normal data and detect anomalies that vary over time. This is a novel feature of our detection approach.
• We devise a strategy for on-line detection of anomalies using the output from the RVAE network.
• We verify that the on-line detection approach can detect changing botnets by splitting the popular test data set CTU-13 into training and testing sets with different types of botnets. Tests show that we are able to detect botnets effectively when different types of botnets are used for testing, compared to the existing methods.

II. RELATED WORKS AND BACKGROUND

Various ML methods have been utilized for botnet detection. We use the fundamental structure of the Recurrent VAE (RVAE), which combines a Variational Autoencoder (VAE) and a Recurrent Neural Network (RNN) in that it uses an RNN structure as the encoder/decoder of the VAE instead of a Multi-Layer Perceptron (MLP). In subsection II-A, we introduce previous work utilizing ML techniques for botnet detection and discuss the limitations of each work. Subsequently, in subsection II-B, setting aside the topic of botnet detection, we describe each part of the proposed ML model, how each model works, and what problems each method was created to deal with. We also explain why each method needs to be utilized for botnet detection.

A. Machine Learning Approach for Anomaly-based IDS

Variational Autoencoder: In [16], Guoc et al. introduce a VAE as an unsupervised method for detecting anomalies and also focus on explaining anomalies with a gradient-based fingerprinting technique, but the method is limited in that it assumes the ratio of anomalies is already known and it does not consider the sequential pattern of the data, which can increase detection performance. Van et al. propose a revised VAE structure, called Dirac Delta VAE, for achieving better anomaly detection performance in [17]. It narrows down the range of the latent space, which makes it easier for classifiers to detect anomalies. However, the proposed method cannot be trained end-to-end because it uses a separate classifier in the latent space. Furthermore, the authors conducted experiments using one type of botnet for training and testing. While it is not a VAE, Ruggiero et al. utilize a Denoising Autoencoder (DAE) for botnet anomaly detection in [18]. The authors propose a way of using both a DAE and a filter-encoder architecture to extract features for a botnet classifier. Furthermore, there are various studies utilizing VAEs for anomaly detection of network traffic, as can be seen in [19] and [20]. While many works have been done so far, most are limited in that the methods overlook sequential characteristics within network traffic.

Recurrent Neural Network: RNNs have been widely used in studies exploiting sequential characteristics of network traffic data. Kapil et al. propose a supervised approach to detect botnet hosts by tracking a host's network activity over time using an RNN architecture and extract graph-based features of NetFlow data for botnet detection in [14]. However, the extracted features are obtained per host IP address. When relying on the periodicity of each source IP address, detecting malicious botnets in a timely manner is impossible because we need to wait and collect every flow with that IP address before classifying one connection as malicious or non-malicious. In addition, the method is restricted in terms of generalization in that it is trained and tested on limited botnet scenarios. Besides, Pablo et al. assign a symbol to the size and the port and embed the code into a distributed representation like a word embedding [15]. It shows the potential of using an RNN as a botnet detection model, but it is limited in that it does not show comparable performance on imbalanced network traffic. In [21], Egon et al. present a method of determining which alerts are correlated by applying Neural Networks and clustering. It utilizes text strings output from an IDS as RNN input. It is somewhat limited as the method requires that specific output, such as text strings. While many works have been done so far, most are limited in that the methods cannot be applied to an online anomaly detection system because they analyze traffic per host.

Other Machine Learning Approaches: Besides VAE and RNN, many recent studies have attempted to make use of various Machine Learning (ML) approaches to reduce the dependence on human heuristics. In [22], the authors regard every feature as a sentence and embed it. With the embedded features, a classifier can be trained to detect malicious botnets. In [9], the authors propose a way of detecting HTTP botnets using an MLP. Kamaldeep et al. introduce a framework for P2P botnet detection using Random Forest in [10]. In [11], Elaheh et al. introduce a method of selecting effective features for machine learning based botnet detection approaches. The authors also assess its effectiveness on a dataset which is constructed focusing on generality, realism and representativeness.
Ongun et al. present how to extract features which are good for ML models in [21]. The authors also compare a statistics-aggregated feature processing method with a connection-level feature processing method and validate those methods with Random Forest and Gradient Boosting, which are ML techniques. In [23], the authors treat network traffic features as an image. By bringing in a pre-trained Convolutional Neural Network (CNN) model, which is suitable for image data, the authors perform transfer learning to adapt it to network traffic data.

B. Background of ML Structures Employed in the Proposed Model

Variational Autoencoder (VAE): VAE is a generative model which utilizes a deep neural network structure to represent a transformation. The encoder, which consists of a neural network, extracts a latent variable z from the input x using the reparameterization trick. The decoder, which also consists of a neural network, reconstructs x from the z created by the encoder. A more detailed discussion of VAE can be found in [24].

Gated Recurrent Unit (GRU): We use an RNN structure, which contains directed cycles that are beneficial for representing data with sequential patterns. In particular, we utilize the GRU model, which has the capability to remember values over long sequences compared to a vanilla RNN. This allows the GRU to extract long-term periodic characteristics of network traffic data. A more detailed discussion of GRU can be found in [25].

Recurrent Variational Autoencoder (RVAE): RVAE is a structure combining seq2seq with VAE, whose encoder and decoder consist of auto-regressive models. As it utilizes an RNN instead of an MLP or CNN to generate sequential outputs, it takes into account not only the current input while generating but also its neighborhood. For the prior distribution, it uses a Gaussian distribution like VAE. The last hidden state is used to obtain the mean and variance of a multivariate Gaussian in the latent space. The latent variable is employed as the initial hidden state of the decoder, which also has an RNN structure. A more detailed discussion of the Recurrent VAE can be found in [26] and [27].

Generally, seq2seq models are actively used in the text and music fields. With sequential patterns, they generate music or text sequences. The advantage of the RVAE is that it utilizes sequential patterns to generate the data. With this structure, we expect the model to perform well on network traffic data for detecting anomalies.

III. PROPOSED MODEL

Fig. 1. Procedure of the proposed method

In order to identify botnets, we propose a novel flow-based botnet detection system that copes with the periodicity of traffic flows. The overall procedure of our proposed system is shown in Fig. 1, and it consists of three steps as follows:

• Data pre-processing: The data instances are grouped based on a predefined time interval (e.g., 60 sec for the size of the time window), and they are aggregated by host IP addresses. This step also includes calculating statistical features and normalizing numerical values.
• Anomaly scoring: At every time window, anomaly scores of every flow are calculated, which provide the degree of maliciousness of individual connections. For scoring, we establish a function that consists of the RVAE and produces anomaly scores by comparing the input with the output of the model, which is the reconstructed input.
• Anomaly detection: Based on the calculated anomaly scores, the anomaly detection function classifies individual connections as either Malicious or Non-malicious. In particular, our method does not rely on a threshold; rather, it utilizes a pair of probability density functions (PDFs) estimated from the normal and botnet instances in the training dataset, respectively.

Fig. 2 demonstrates a snapshot of the data pre-processing and anomaly scoring processes. In the data processing phase, every flow sorted in chronological order is aggregated to obtain statistical features within the windows. These flow-based features are used as input to the RVAE, and are fed in time order. In the botnet detection system, the encoder is expected to be trained in a way that distills the common characteristics within the sequential data into the latent variable z. The decoder reconstructs the sequential inputs utilizing z. In the end, the reconstruction loss is obtained as the output of the anomaly scoring process.

Fig. 2. Botnet detection system using RVAE with sequential dataset
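The paper describes this pipeline only in prose, so the short Python sketch below is our own illustration of the sequencing step shown in Fig. 2. It assumes a `features` table with one row per (time window, source IP) pair holding the normalized flow-based features; the function name, column names, and the handling of gaps are our assumptions, not the authors' code.

```python
import numpy as np
import pandas as pd

def build_sequences(features: pd.DataFrame, n_windows: int = 3) -> np.ndarray:
    """Stack per-window feature vectors into sequences of n_windows steps.

    Expects columns 'window' (window index), 'src_ip', and the remaining
    columns as normalized flow-based features; returns an array of shape
    (num_sequences, n_windows, n_features) for the recurrent model.
    """
    feature_cols = [c for c in features.columns if c not in ("window", "src_ip")]
    sequences = []
    # Keep windows in chronological order, then slide over each source IP.
    for _, per_ip in features.sort_values("window").groupby("src_ip"):
        values = per_ip[feature_cols].to_numpy(dtype=np.float32)
        # For brevity, gaps (windows where this IP sent no traffic) are ignored.
        for start in range(len(values) - n_windows + 1):
            sequences.append(values[start:start + n_windows])
    if not sequences:
        return np.empty((0, n_windows, len(feature_cols)), dtype=np.float32)
    return np.stack(sequences)
```

In the example of Fig. 2, three 60-second windows are used (n_windows = 3); Section IV-B describes how the per-window features themselves are computed.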
A. Anomaly Scores from Recurrent Variational Autoencoder

TABLE I
NOTATIONS USED

Notation    Description
h_{E,T}     The last hidden state of the encoder
W_µ         Linear transformation to get µ(x)
W_σ         Linear transformation to get σ(x)
z           The latent variable
h_{D,2}     The second hidden state of the decoder
h_{D,t}     The hidden state of the decoder at timestep t
W_hh        Linear transformation from the previous hidden state
W_hx        Linear transformation from the input
W_s         Linear transformation from the hidden state to get the output
θ           The parameters of the decoder
x_1         The first input of the decoder
φ           The parameters of the encoder
ỹ_t         Output, i.e., the reconstruction at timestep t
ỹ_t^n       The nth feature of the reconstruction at timestep t
D_KL        Kullback–Leibler divergence

The notations used in this paper are listed in Table I. We first feed the pre-processed network traffic data into the GRU structure. h_{E,T} is used to produce the mean and variance of the Gaussian distribution which represents the latent space. With µ and σ, z can be obtained, and z is used as the initial hidden state of the decoder:

µ(x) = W_µ h_{E,T}
σ(x) = W_σ h_{E,T}                                              (1)
z = µ(x) + σ(x) ∗ ε,   ε ∼ N(0, 1)

The second hidden state of the decoder follows as:

h_{D,2} = sigmoid(W_hh z + W_hx x_1)                            (2)

The first input of the decoder (x_1) is zero-padded. Finally, the output we obtain from the RVAE is formulated as:

ỹ_t = sigmoid(W_s h_{D,t})                                      (3)

The loss function that we want to minimize is:

J(x) = −E_{q_φ(z|x)}[log p_θ(x|z)] + β · D_KL[q_φ(z|x) || p_θ(z)]   (4)

We train the model with only non-malicious instances, and in the evaluation phase we calculate reconstruction errors and use them as anomaly scores for both non-malicious and malicious instances. As we use binary cross entropy as the error function, the anomaly score is formulated as:

L = Σ_{n=1}^{N} [(1 − y_t^n) log(1 − ỹ_t^n) + y_t^n log ỹ_t^n]      (5)

At each time window, we can obtain the anomaly scores of every connection which belongs to that time window. In other words, whether a traffic connection should be considered malicious or not is indicated by the output (L) of the anomaly detection system. In the following section, we present how to detect botnets with these anomaly scores.
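Equations (1)–(5) map directly onto standard PyTorch modules. The sketch below is our own minimal illustration rather than the authors' released code: it assumes the sizes reported in Section IV-C (a 2-layer bidirectional GRU encoder, 512 hidden units, a 100-dimensional latent variable), interprets σ(x) as a log-variance, and uses an ordinary GRU cell for the decoder in place of the simplified recurrence of Eq. (2).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RVAE(nn.Module):
    """Sketch of a recurrent VAE in the spirit of Eqs. (1)-(4)."""

    def __init__(self, n_features, hidden_size=512, latent_size=100):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden_size, num_layers=2,
                              bidirectional=True, batch_first=True)
        # Eq. (1): linear maps from the last encoder hidden state.
        self.W_mu = nn.Linear(2 * hidden_size, latent_size)
        self.W_sigma = nn.Linear(2 * hidden_size, latent_size)
        # z initializes the decoder hidden state.
        self.z_to_hidden = nn.Linear(latent_size, hidden_size)
        self.decoder = nn.GRU(n_features, hidden_size, num_layers=1,
                              batch_first=True)
        # Eq. (3): project decoder hidden states back to the feature space.
        self.W_s = nn.Linear(hidden_size, n_features)

    def forward(self, x):
        # x: (batch, seq_len, n_features), values scaled to [0, 1].
        _, h = self.encoder(x)                      # (layers*2, batch, hidden)
        h_last = torch.cat([h[-2], h[-1]], dim=-1)  # last layer, both directions
        mu = self.W_mu(h_last)
        log_var = self.W_sigma(h_last)              # assumption: sigma as log-variance
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)   # Eq. (1)
        h0 = torch.tanh(self.z_to_hidden(z)).unsqueeze(0)
        # First decoder input is zero-padded; later steps reuse the shifted
        # input sequence (an assumption, since the paper leaves this open).
        dec_in = torch.cat([torch.zeros_like(x[:, :1, :]), x[:, :-1, :]], dim=1)
        out, _ = self.decoder(dec_in, h0)
        y = torch.sigmoid(self.W_s(out))            # Eq. (3)
        return y, mu, log_var

def loss_fn(x, y, mu, log_var, beta=1.0):
    """Eq. (4): binary cross entropy reconstruction plus beta-weighted KL term."""
    recon = F.binary_cross_entropy(y, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + beta * kld

def anomaly_score(x, y):
    """Per-sequence score in the spirit of Eq. (5) (written here as positive BCE)."""
    return F.binary_cross_entropy(y, x, reduction="none").sum(dim=(1, 2))
```

In this reading, only non-malicious sequences would be passed through loss_fn during training, while anomaly_score is evaluated for every connection at detection time.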
B. Anomaly Detection

In many studies, a threshold on the anomaly scores is used to distinguish whether the source IP addresses in a time window are malicious or not [16], [18], [19]. The threshold can be set in many ways. It is a simple and intuitive method; however, in most cases it requires information about the dataset, such as the ratio of botnets or at least approximate values of the anomaly scores of botnet samples. Unfortunately, there are few cases where such information about the traffic data is known in advance. Furthermore, detecting botnet attacks with a threshold hampers on-line operation of the anomaly detection framework. Suppose the threshold is set to 10%, meaning that samples within the top 10% of anomaly scores are classified as botnets. Then we have to wait until the anomaly scores of all samples in the testing dataset are available, because we must sort every anomaly score. Therefore, this method is limited when it comes to timely detection, which means that it is not practical.

Instead, we suggest a more efficient method using the estimated probability distribution of the reconstruction errors. In the training phase, we collect reconstruction errors from normal and abnormal instances. Then, we find the distribution and its parameters that represent the distribution of reconstruction errors of the abnormal and normal instances, respectively, by exploring various types of distributions and selecting the one with the minimum sum of squared estimate errors (SSE). We call the distribution with the smallest SSE the best-fit PDF. We search for the best-fit PDF among many different candidates such as the gamma distribution, generalized logistic distribution, folded Cauchy distribution, Mielke distribution and beta distribution, among others. In the testing phase, the estimated PDFs can be utilized to obtain the likelihood that a sample belongs to each distribution. Comparing the likelihood values of the two different distributions, we assume that each sample of the test data set belongs to the distribution with the greater likelihood. Utilizing the best-fit PDFs from the training stage does not require information about the network traffic dataset, and it provides a botnet detection system that can be used on-line.

IV. EXPERIMENTS

We have run several experiments to validate the reliability of the proposed method from different aspects. We show two aspects from the experiments. The first is to show that the proposed method performs better than both Random Forest and the existing standard VAE, which we call MLP-VAE in this paper, on various measures. Second, we explain how the reconstruction errors are distributed and how to utilize them in detecting botnets.

A. Evaluation Datasets

We use the CTU-13 dataset, which is widely used in recent studies on botnet detection [15]–[19], [21], [22], [28]. A botnet scenario is a particular infection of the virtual machines using a specific malware. Thirteen of these scenarios were created, and each of them was designed to be representative of some malware behavior [28]. To compare the results of MLP-VAE and Random Forest, we reproduced nearly the same experimental settings as those in [16] and [21]. The authors of [16] and [21], which propose the VAE and Random Forest structures that we select as baselines, demonstrate the robustness of their methods on scenarios 1, 2 and 9 of the CTU-13 dataset, which contain only the Neris botnet. The Neris botnet is an IRC based bot infecting other machines through Spam and Click Fraud. In our reproduced experiments, all methods show similar performance on every metric, as can be seen in Table III. In particular, Random Forest performs very well on the testing datasets because the botnet families in the testing dataset are already used for training. In other words, the Random Forest method is able to capture dominant features to classify the anomalies. However, when considering evolving botnets, a method cannot be validated by evaluating it on a dataset consisting of botnets which have previously been identified.

Thus, we decided to follow the dataset separation criteria suggested in [28] in order to test the model in more general cases. In [28], the authors constructed the dataset in such a way that none of the botnet families used in the training and cross-validation datasets should be used in the testing dataset. The authors state that this ensures that the detection methods can generalize and detect new behaviors. By splitting the CTU-13 data in the suggested way, we can mimic the real situation where the operation of botnets changes over time in terms of protocols and attack types. Compared to the restricted dataset (scenarios 1, 2, 9), the dataset we use for the experiments includes various types of botnets that have IRC-based, P2P-based and HTTP-based communication methods and conduct attacks such as Spam, Click Fraud, Port Scan, DDoS and FastFlux. The description of the dataset is in Table II.

TABLE II
CTU-13 DATASET

Dataset                  Scenarios
Training & Validation    3, 4, 5, 7, 10, 11, 12, 13
Testing                  1, 2, 6, 8, 9

B. Data Pre-processing

The CTU-13 dataset consists of NetFlow files which are composed of source and destination IP addresses and ports, time, protocol, duration, number of packets, number of bytes, state, and service. We process the data to use aggregated flow statistics, which is the approach many existing works adopt in order to obtain flow-based features [15]–[19], [21]. We group the NetFlow data at every time interval T, and aggregate features within every group based on the source IP addresses to get flow-based features. With this processing method, we can detect an IP address showing malicious behavior in a particular time window. Many existing works experimentally find the most appropriate time window T, which is crucial: while too small a time window might not capture traffic characteristics over a longer period of time, too large a time window cannot provide timely detection because one has to wait for the end of the window [2], [15], [16], [18], [19], [21], [28]. We experimented with different window durations to find the ideal value for the statistical aggregation. We then sort the entire data within the time window by the time of the source IP connection group, because the RNN model is sensitive to the order of the inputs. For RVAE, we use the network traffic connections collected within N windows as the sequential inputs to the model. This can be seen in the data processing part of Fig. 2. In the case of Fig. 2, three windows of 60-second duration are used.

In terms of source/destination ports and destination IP addresses, we count the number of unique records associated with the source IP address in the time window. In addition, for the source IP addresses, we count the number of connections with the source IP address in the time window. For service, state, and protocol, we count the number of different values in each category associated with the source IP address in the time window. Finally, we normalize the numerical values to be between 0 and 1. As a result, the number of features used in this experiment is smaller than the number of features used in [21] and [16].
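The aggregation just described can be prototyped in a few lines of pandas. The sketch below is illustrative only; the column names for the NetFlow fields, the function name, and the choice to fit the min-max normalization on the same table (rather than on the training split only, as would be done in practice) are our assumptions.

```python
import pandas as pd

def aggregate_flows(flows: pd.DataFrame, window_seconds: int = 60) -> pd.DataFrame:
    """Aggregate NetFlow records per (time window, source IP).

    Assumed columns: 'start_time' (datetime64), 'src_ip', 'dst_ip',
    'src_port', 'dst_port', 'proto', 'state', 'service'.
    """
    flows = flows.copy()
    flows["window"] = flows["start_time"].dt.floor(f"{window_seconds}s")
    grouped = flows.groupby(["window", "src_ip"])
    features = grouped.agg(
        n_flows=("dst_ip", "size"),            # connections from this source IP
        n_dst_ips=("dst_ip", "nunique"),       # unique destination IPs
        n_src_ports=("src_port", "nunique"),   # unique source ports
        n_dst_ports=("dst_port", "nunique"),   # unique destination ports
        n_protocols=("proto", "nunique"),      # distinct protocol values
        n_states=("state", "nunique"),         # distinct state values
        n_services=("service", "nunique"),     # distinct service values
    ).reset_index()
    # Min-max normalization of the numerical features to [0, 1].
    numeric = features.columns.difference(["window", "src_ip"])
    mins, maxs = features[numeric].min(), features[numeric].max()
    features[numeric] = (features[numeric] - mins) / (maxs - mins).replace(0, 1)
    return features
```

The resulting table is exactly the per-(window, source IP) feature table that the sequence-construction sketch after Fig. 2 consumes.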
C. Experimental Setting

For splitting the datasets into training, validation and testing sets, we followed the suggested separation criteria, as mentioned in Section IV-A. The architecture of MLP-VAE follows what is used in [16], [# of features → 512 → 512 → 1024 → 100]. For the RNN architecture, we use a 2-layer bidirectional GRU. We use 512 dimensions for the hidden states and 100 dimensions for the latent variable, as in MLP-VAE. We also apply ReLU activation [29] to MLP-VAE as well as RVAE. For RVAE, Kullback-Leibler annealing is applied so that the weight multiplied by the KLD increases linearly over 500 gradient updates. We train for 500 epochs with the Adam optimizer and a batch size of 128. Also, the learning rate is set to 0.01 for both VAE models. We use 5 different evaluation metrics to validate the performance: Area Under the Receiver Operating Characteristic curve (AUROC), Area Under the Precision-Recall Curve (AUPRC), Precision, Recall and F1 score, which are common metrics for anomaly-based IDS. We save the model showing the best value of AUPRC over the 5-fold cross-validation sets. The source code is written with the PyTorch library (https://pytorch.org).

D. Baseline

We compared our experimental results to the reproduced results of MLP-VAE and Random Forest. For MLP-VAE, the experimental results are based on the same data processing method with the same optimizer, learning rate and size of latent variable as in our experiment.

V. RESULTS AND DISCUSSION

We conducted experiments to validate the proposed method from different aspects. For quantitative validation, we compare the performance of our proposed method with other methods on different metrics. For qualitative validation, we plot the distribution of the reconstruction errors of the normal and botnet cases. Moreover, we plot the estimated best-fit PDF. We validate our detection method utilizing the best-fit PDF as described in Section III-B. To compare methods on various metrics with the same processing and detection method, we reproduce MLP-VAE and Random Forest from [16] and [21], respectively. While the AUROC value reported in the literature [16] is 0.936, our reproduced AUROC with MLP-VAE is 0.966. Even if the experimental setup may differ slightly, the reproduced results show that our implementation is valid, as it achieves a slightly higher result than that of the original paper.

TABLE III
RESULTS COMPARISON - TRAINED AND TESTED ON SCENARIOS 1, 2 AND 9

Model          Recall  Precision  F1     AUPRC  AUROC
RVAE           0.978   0.957      0.967  0.960  0.966
MLP-VAE        0.974   0.959      0.966  0.959  0.966
Random Forest  1.000   0.998      0.999  1.000  1.000

TABLE IV
RESULTS COMPARISON - TRAINED AND TESTED ON THE DATASETS IN TABLE II

Model          Recall  Precision  F1     AUPRC  AUROC
RVAE           0.969   0.892      0.929  0.972  0.975
MLP-VAE        0.944   0.891      0.917  0.967  0.962
Random Forest  0.424   0.982      0.592  0.888  0.901

Results comparison among methods: In Table III, the Random Forest method shows nearly perfect performance on every metric, even though the VAE models show comparable performance. This is because the training/testing datasets, which are based on scenarios 1, 2, 9, share the same characteristics. Random Forest is effective in finding dominant features in these characterized datasets. However, as mentioned in Section IV-A, validating the models on such characterized datasets is not what we focus on in this paper.

In Table IV, we show the results from training and testing on the generalized dataset that we mentioned in Section IV-A. In this experiment, we pre-processed our data using three windows of 60-second duration. While both the training and testing datasets used in Table III consist of only the Neris botnet, the training and testing datasets used in Table IV consist of different botnets: Rbot, Virut, Sogou, and NSIS.ay are used for training, and Neris, Menti, and Murlo are used for testing. Because each botnet shows different characteristics, there is an overall performance degradation with Random Forest, which is affected by the dominant features of the training dataset. Nonetheless, the VAE methods validate their reliability by showing robust performance on the generalized dataset. In addition, we find that the RVAE method outperforms the MLP-VAE method overall, based on the same features and the same size of latent variables, on both datasets, as seen in Table III and Table IV. It can be concluded that botnets in network traffic flow data should be detected utilizing sequential and periodic patterns.

Probability Density Function of reconstruction errors: As shown in Fig. 3, the distribution of the reconstruction errors of botnet samples can be distinguished from the distribution of the reconstruction errors of normal samples. As we only use non-malicious samples for training, we expect the reconstruction errors of malicious samples to be larger than those of the non-malicious samples. Comparing the medians of those two distributions, we intuitively notice that the median of the distribution of non-malicious reconstruction errors is larger than the median of the distribution of botnet reconstruction errors, even if the estimated PDF may not perfectly represent the samples in the testing dataset, as the best-fit PDF is determined with the validation dataset.

In particular, one can find a group of botnet samples which have smaller reconstruction errors compared to the other botnet samples in Fig. 3b. We focus on the samples whose reconstruction errors are smaller than 4. We find that 66% of the samples of scenario 6 labeled as botnet show reconstruction errors less than 4, while only 0.0% – 4.0% of samples show reconstruction errors less than 4 in the other scenarios (1, 2, 8, 9). Scenario 6 utilizes proprietary command and control channels, unlike the other scenarios, most of which use IRC, HTTP and P2P communication methods [28]. The samples of the group having small reconstruction errors show low values for DNS, smtp, ssl, the number of IP addresses, the number of ports, and the number of different IP addresses in the time window. These characteristics mainly represent non-malicious samples rather than botnet samples. We conclude

TABLE V
RESULTS COMPARISON

Window duration (s)  Recall  Precision  F1     AUPRC  AUROC
5                    1.000   0.865      0.928  0.791  0.881
60                   0.969   0.892      0.929  0.972  0.975
300                  0.998   0.537      0.699  0.905  0.972
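To make the detection rule of Section III-B (and the reconstruction-error distributions discussed above) concrete, the following sketch fits a few candidate distributions from scipy.stats to the reconstruction errors, keeps the one with the smallest SSE against an empirical histogram, and labels a sample as malicious when the botnet PDF assigns it the higher likelihood. It is our own illustration: the paper does not name its fitting library, and the candidate list, bin count, and function names are assumptions.

```python
import numpy as np
from scipy import stats

# A subset of the candidate families named in Section III-B.
CANDIDATES = [stats.gamma, stats.genlogistic, stats.foldcauchy, stats.mielke, stats.beta]

def best_fit_pdf(errors: np.ndarray, bins: int = 100):
    """Return the frozen distribution with the smallest SSE against the histogram."""
    hist, edges = np.histogram(errors, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2.0
    best, best_sse = None, np.inf
    for dist in CANDIDATES:
        try:
            params = dist.fit(errors)
        except Exception:        # some fits fail on ill-conditioned data
            continue
        sse = np.sum((hist - dist.pdf(centers, *params)) ** 2)
        if sse < best_sse:
            best, best_sse = dist(*params), sse
    return best

def classify(scores: np.ndarray, normal_pdf, botnet_pdf) -> np.ndarray:
    """Label a sample malicious when the botnet PDF assigns it higher likelihood."""
    return botnet_pdf.pdf(scores) > normal_pdf.pdf(scores)
```

The two best-fit PDFs are estimated once from the training-phase reconstruction errors, so each new time window can be classified as soon as its scores are available, which is what enables the on-line detection described in Section III-B.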