CNN-BiLSTM: A Hybrid Deep Learning Approach for Network Intrusion Detection System in Software-Defined Networking With Hybrid Feature Selection
INDEX TERMS Network intrusion detection system (NIDS), software-defined networking (SDN), CNN-BiLSTM, deep learning.
© 2023 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 11, 2023
R. Ben Said et al.: CNN-BiLSTM: A Hybrid Deep Learning Approach
may be vulnerable to failures. In addition, an attacker may inundate the network with hazardous cyber-attacks, such as Distributed Denial of Service (DDoS) and Denial of Service (DoS) attacks. As a result, genuine queries may be refused due to the heavy consumption of channel bandwidth and network resources [4].

The dynamic and configurable nature of SDN may increase the false detection of attacks. An SDN controller may be hijacked by an attacker and used to route traffic according to the attacker's own needs, with consequences for the whole network. Although an IDS is a legitimate way to keep a network safe and can detect and stop intrusions, it is not directly clear whether SDN's more complicated flow management may increase the chances of service failures. Although the purpose of an IDS is to monitor network traffic and warn the administrator of any malicious threats in the network, the particularities of SDN may raise exposure to sophisticated cyber-attacks, which has led to concerns about developing innovative IDS approaches to secure SDN, protect user communication [5], and reveal new security problems as they appear.

Feature selection is becoming an important phase of the preparation for network classification problems. It helps reduce the main dataset by removing irrelevant and redundant features that have no impact on the classification results. Feature selection picks, according to some criterion, a subset of the original features that preserves the properties of the original dataset. According to Kantardzic [6], classification improves when the features of large datasets are reduced by using basic techniques. There are two types of feature selection algorithms: filter-based and wrapper-based. Filter methods are computationally efficient, have fast processing speeds, and are less likely to overfit [7]. Common filter methods include ANOVA, Chi-square, and Pearson's correlation. On the other hand, wrapper methods provide the best subset of relevant features, but they require more computational time, which decreases system performance. Recursive feature elimination, forward selection, and backward elimination are commonly employed wrappers. As network traffic increases, intrusion detection systems face problems in terms of data dimensions and complexity.

One of the proposed solutions for improving network intrusion detection system (NIDS) monitoring is to use Deep Learning (DL) models. While the unified vision of SDN and deep learning methods opens new possibilities for the security of IDS deployment, the effectiveness of the detection method depends on the quality of the training datasets. Even though deep learning for NIDSs has lately shown promising results for a number of issues, the majority of the studies overlooked the impact of data redundancy and unbalanced datasets. As a consequence, the resilience of the anomaly detection system may be adversely affected, resulting in suboptimal model performance.

Our model aims to enhance SDN security by employing a convolutional neural network-bidirectional long short-term memory (CNN-BiLSTM) hybrid algorithm. Previous DL models required a high number of training parameters to be effective. Forcing the model to use a large number of parameters may slow down its training phase and increase its computational cost. As a result, it adds extra computing overhead to the SDN architecture. Our approach suggests using deep learning over SDN to optimize NIDS deployment. The NIDS implementation of the SDN controller features a deep learning approach for network monitoring. To better identify anomalies, this study proposes the use of tree-based deep-learning approaches. Ten features were used in a multi-class classification to determine whether or not an intrusion has occurred and to classify the kind of intrusion. Previous works, such as [8] and [9], have used deep learning algorithms in NIDS to secure SDN. However, they only used general datasets, such as NSL-KDD and UNSW-NB15, which are not dedicated to SDN. Other authors [10] used the InSDN dataset, but their model employed CNN-LSTM, which is considered weak compared to the CNN-BiLSTM model. The CNN-BiLSTM model uses BiLSTM, which processes each input sequence in both the forward and backward directions and therefore has two LSTM layers. This compensates for the absence of contextual semantic information in LSTM. Every time step of the input sequence is covered by the bidirectional structure, which provides the output layer with information about both the past and the future. The BiLSTM networks employed in our model were used to better find LSTM-dependent features and improve the accuracy of classification [11].

The major contributions of this paper are the following:
• We propose an approach that includes a hybrid model combining a convolutional neural network (CNN) with Bidirectional Long Short-Term Memory (BiLSTM). We used the L2 regularizer and dropout (0.5) techniques in the training model. Our hybrid CNN-BiLSTM model achieved high classification performance.
• Numerous experiments were conducted to validate the performance of the proposed hybrid model for multi-class and binary classifications. Additionally, we balanced the datasets using random oversampling; after that, we used a random forest classifier and the recursive feature elimination algorithm to select the most important features so that the proposed model could train on those features with high performance. Each experiment was validated by using the InSDN dataset.
• The efficiency of the proposed model was verified using two distinct benchmark datasets: NSL-KDD and UNSW-NB15.
• The efficacy and performance of the suggested deep learning algorithms were tested using a set of measures including accuracy, recall, F1-score, precision, and training time. The findings indicate that our model outperforms the CNN, AlexNet, LeNet5, and CNN-LSTM models for the majority of the evaluation measures.

The remainder of this paper is organized into four sections. Section II outlines several related works concerning network
intrusion detection. The proposed method is presented in detail in Section III. Section IV is devoted to the presentation of the experimental results as well as the analysis achieved on the dataset. Section V summarizes the results of this study and discusses future work.

II. RELATED WORK
Deep Learning is a subfield of machine learning that has been used successfully in a variety of research areas such as image processing, face recognition and natural language processing. Recently, various deep learning methods have been evaluated in NIDS research; specifically, CNN and LSTM models have been experimented with because of their potential ability to represent contextual and sequence or temporal information, respectively, in the training datasets.

Elsayed et al. developed a hybrid intrusion detection system that combines CNNs and LSTMs. The proposed model can capture both the spatial and temporal features of the network traffic. In this study, the authors used two regularization techniques, L2 regularization and dropout, to deal with the overfitting problem [8]. The authors used a recently published novel dataset of NIDS in SDN (InSDN) [12]. The proposed model improves the intrusion detection performance for zero-day attacks, and the results achieved by this model can reach 96.32% accuracy.

Kaiyuan et al. proposed a network intrusion detection algorithm that combines hybrid sampling with deep hierarchical networks, that is, CNN and BiLSTM. In the first stage, they used One-Side Selection (OSS) to reduce the noise samples in the majority category and then increased the minority samples using the synthetic minority oversampling technique (SMOTE) [9]. Using this hybrid technique, they balanced the dataset so that the model could learn features of minority samples and greatly reduce the model training time. In the second stage, they used a CNN to extract spatial features and a BiLSTM to extract temporal features. By using this method, the proposed model achieved classification accuracies of 83.58% and 77.16% on the NSL-KDD and UNSW-NB15 datasets, respectively. However, this study did not deal with NIDS in SDN.

On the other hand, the incorporation of data preprocessing methods with machine learning, such as data reduction, data augmentation, and feature selection, into SDN has received interest. In [10], a method was suggested to address the difficulties in the KDD Cup 99 dataset by conducting large-scale experimental research with the NSL-KDD dataset to achieve high accuracy in intrusion detection. The experiment was carried out on five effective machine learning algorithms (Naïve Bayes, CART, RF, SVM, and J48). The correlation feature selection approach was applied to minimize the feature complexity, providing only 13 features in the NSL-KDD dataset.

In TSDL [13], a deep neural network model with two stages was constructed and suggested for NIDS, employing a stacked auto-encoder combined with SoftMax in the output layer as a classifier. TSDL was created and developed to identify attacks using multiclass classification. Down-sampling and several pre-processing techniques have been used on various datasets to increase the detection rate and monitoring effectiveness. The UNSW-NB15 detection accuracy was 89.134%. In this study, UNSW-NB15 was the only dataset used to verify the effectiveness of the proposed model, and the study did not address network intrusion detection systems in the SDN case.

The authors of [14] suggested many neural network models for NIDS, including seq2seq structures using LSTM (Long Short-Term Memory), variational auto-encoder, and fully connected networks. Several datasets, including NSL-KDD, KYOTO-HONEYPOT, MAWILAB, and UNSW-NB15, were used to create and execute the suggested technique to distinguish between malicious and legitimate packets in the network. Different preprocessing methods have been utilized, such as normalization and one-hot encoding, to prepare data, smooth training, feature manipulation, and selection in neural networks. These parameters are intended but are not limited to allowing neural networks to learn complicated characteristics from the diverse scope of a packet.

A deep neural network architecture [15] was built on the KDD Cup 99 dataset to check intrusion attempts using four hidden layers. Data preparation and decreased data usage were achieved using feature scaling and encoding. For several datasets, more than 50 characteristics were utilized to complete this assignment. Therefore, advanced GPU hardware was employed to handle such a large number of characteristics while reducing the training time. Although the authors obtained a high accuracy, the NIDS in SDN was not the focus of their study.

Tang et al. [16] used a Deep Neural Network (DNN) to detect flow-based anomalies in SDN. To reduce the computational cost of attack detection, only six main characteristics from the NSL-KDD database were used. The structure of the DNN model consists of three hidden layers with 12, 6, and 3 neurons, respectively. The suggested model has a global accuracy of 75%, which is less than the threshold required for real-world environment implementations. Using the same NSL-KDD database, the authors improved their suggested model using a Gated Recurrent Unit (GRU) to achieve better accuracy [17]. The upgraded model detection rate increased to 89%. Although the authors improved their model, they did not use feature selection, which may have significantly enhanced the obtained results. Table 1 summarizes and compares the most important studies on this topic.

The earlier DL models, on the other hand, needed an enormous number of training parameters (as all neighboring layers are entirely linked together). Using a significant number of parameters may hold back the training phase and increase the computing cost of the detection model. As a result, in an SDN architecture, it adds unnecessary computational load. By employing deep learning over SDN, the suggested approach improves NIDS implementation. It incorporates a deep learning technique for network monitoring into the SDN controller's NIDS implementation. In this
TABLE 1. Different deep learning and machine learning techniques for NIDS in traditional networks and SDNs.
FIGURE 3. Model of the overall framework of the proposed network intrusion detection method used in SDN.
The KDD CUP99 and the NSL-KDD are well-known datasets in the context of network intrusion detection, and Revathi's research demonstrates that the NSL-KDD datasets are ideal for evaluating different intrusion detection algorithms [13]. Each intrusion record in this dataset includes a 42-dimensional feature that is subdivided into a traffic-type label, a 3-dimensional symbolic feature, and a 38-dimensional numeric feature. The label mostly comprises normal data as well as the data of four categories of attack (R2L, DoS, U2R, and Probe). In this study's experiments, the test set (KDDTest) and training set (KDDTrain) from the NSL-KDD dataset were utilized as the model's test and training sets, respectively.

The Australian Centre for Cyber Security research team has produced a more recent dataset, named UNSW-NB15 [35], to address the concerns detected in the NSL-KDD and KDD Cup 99 datasets. A full connection record partition comprised 82337 test connection records and 175343 train connection records across 10 classes. The partitioned dataset included 42 features with class labels for normal traffic and nine distinct intrusion types.

B. DATA PREPROCESSING
Data preprocessing, which is also referred to as data engineering, is crucial for the success of the learning process. It involves various procedures such as column and row cleaning, feature encoding, and data normalization, which ensure that the data are adequately prepared for analysis. All these procedures were applied to all datasets. This subsection discusses the details of these procedures as follows.

1) DROP ROWS AND MISSING VALUES
To ensure data integrity, all rows were inspected thoroughly to detect missing values. This process is a standard practice in data preparation, as the "id" column typically contains no significant data. Eliminating the "id" column allows for a focus on examining only essential features.

2) CATEGORICAL LABELENCODER ENCODING
Categorical transformation is significant for enhancing the learning capacity of classifiers that are designed to handle only numerical data. Specifically, in the context of the InSDN dataset, attributes like "proto," "service," and "state" contain categorical information that has been converted into numeric representations.

We have opted to employ the LabelEncoder technique for encoding purposes [36]. This choice is well suited for the InSDN dataset because it assigns a distinct numerical label to each unique categorical value. This transformation enables the classifier to comprehend and derive insights from the encoded labels. While LabelEncoder differs from OneHotEncoder in that it does not generate distinct binary columns, it nonetheless preserves the categorical characteristics of the feature [37].

3) BALANCING THE DATASET
Among the several oversampling methods available, random oversampling is the simplest method for balancing an unbalanced dataset. It provides data parity by duplicating the minority class samples. This does not result in any information loss; however, the dataset is susceptible to overfitting because identical information is duplicated. In this study, we used random oversampling to deal with imbalanced datasets.

4) HYBRID FEATURE SELECTION
Data were gathered from network packets to identify intrusions. As a result, manually classifying the huge quantity of network data gathered by the system is a time-consuming operation. Apart from gathering network data, evaluating it
TABLE 2. Selected 10 features with their description.

principle. It has been utilized in a variety of scenarios, including image processing and face recognition. The CNN architecture is made up of three essential parts: a convolution layer, a pooling layer, and a fully connected layer. In the convolution layer, a linear operation is applied by sliding a filter (kernel) of a given size over the output of the preceding layer. The rectified linear unit (ReLU) is the most widely used activation function in CNNs; it introduces nonlinearity by setting all negative values in the feature map to zero. A CNN may consist of multiple convolution layers, with the initial layer specifically designed to extract basic features such as corners and edges. The subsequent layers extract more complex features, and the pooling layer reduces the feature size, thus lowering the computing cost. The final fully connected layer is used for classification.
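As a rough illustration of the three operations just described, the following minimal sketch applies a sliding-kernel convolution, ReLU, and max pooling to a one-dimensional vector in plain Python. The input values, the kernel, and the helper names (`conv1d`, `relu`, `max_pool1d`) are hypothetical and chosen only for readability; the sketch uses no padding and a stride of one, which is an assumption and not necessarily the configuration used in this work.

```python
def conv1d(signal, kernel):
    """Valid (no padding) sliding-window product, as used in CNN convolution layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Set every negative entry of the feature map to zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


signal = [1.0, 3.0, 2.0, 0.0, 4.0, 1.0]  # hypothetical flow feature vector
kernel = [1.0, 0.0, -1.0]                # hypothetical difference-like filter

feature_map = conv1d(signal, kernel)     # [-1.0, 3.0, -2.0, -1.0]
activated = relu(feature_map)            # [0.0, 3.0, 0.0, 0.0]
pooled = max_pool1d(activated, size=2)   # [3.0, 0.0]
```

Pooling halves the length of the activated feature map here, which is the size reduction that lowers the computing cost of the later layers.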
TABLE 3. Experimental setup for model training.

consists of feature selection, as explained above, while the second and third phases are responsible for extracting and deriving data, respectively. Both CNN and BiLSTM are well-known deep learning algorithms. CNNs are capable of extracting spatial dimensional information from data. BiLSTM is unique in that it retains contextual background data for an extended period of time and enables the derivation of data characteristics at the temporal level. It is vital to examine the feature link at the spatial level while extracting features from a network intrusion detection system. As a result, this model integrates CNN and BiLSTM to extract features and then builds a hybrid model.

Because the CNN and BiLSTM inputs have different formats, the derived spatial features are reshaped at the CNN output to conform to the BiLSTM network's input format. In our case, the fully connected layer of the CNN produced 1 × 128, 1 × 64, and 1 × 32 feature vectors as the output. When applied to the input layer of the BiLSTM model, the input value is fixed at 70. One layer of BiLSTM units was used to extract temporal features. Three convolution layers were created using kernel sizes of 5, 3, and 5. Each convolution layer is followed by a max pooling layer with a size of 2 × 2 to reduce the dimensionality of the batch-normalized feature maps. The high-dimensional characteristics extracted in the CNN stage are passed to the next stage, which has three layers: a fully connected layer, a BiLSTM layer, and an output layer. In both the BiLSTM and fully connected layers, there were 10 neurons for multiclass classification and one neuron for binary-class classification. Finally, at the output, the sigmoid layer was employed to represent the likelihood of each input flow for classification purposes.

We employed the L2 regularizer and dropout approaches to mitigate the impact of overfitting and improve the detection model's capacity on unseen data. We ran multiple tests to find the appropriate regularization hyperparameter λ given the dropout technique's probability values. The value of λ was set to 0.1 for the L2 regularizer, and a dropout with a probability P of 0.5 is employed after the pooling layer and the fully connected layer.
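To make the two regularization techniques concrete, the sketch below computes an L2 penalty with λ = 0.1 and applies dropout with P = 0.5 in plain Python. It is a minimal sketch under stated assumptions: the penalty is taken as λ times the sum of squared weights, and dropout uses the inverted variant that rescales the surviving units by 1/(1 − P), as is common in modern frameworks. The weights, activations, and loss value are hypothetical, and the exact formulation used in this work's training framework may differ.

```python
import random


def l2_penalty(weights, lam=0.1):
    """L2 regularization term: lam * sum of squared weights (lam = 0.1 as in the text)."""
    return lam * sum(w * w for w in weights)


def dropout(activations, p=0.5, rng=None, training=True):
    """Inverted dropout: zero each unit with probability p, rescale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(activations)  # dropout is disabled at inference time
    rng = rng or random.Random()
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


weights = [0.5, -1.0, 2.0]   # hypothetical layer weights
data_loss = 0.30             # hypothetical classification loss
total_loss = data_loss + l2_penalty(weights, lam=0.1)  # 0.30 + 0.1 * 5.25

acts = [1.0, 2.0, 3.0, 4.0]  # hypothetical activations after a pooling layer
dropped = dropout(acts, p=0.5, rng=random.Random(0))
```

With P = 0.5, each surviving activation is doubled, so the expected value of every unit stays the same between training and inference.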
TABLE 4. Hyperparameter settings.

We begin with a low dropout probability and progressively increase it because dropout might induce some errors inside the learning model, and we want to limit the propagation of this loss to the subsequent layers. We trained all the models using 2-fold cross-validation with 20 epochs for each fold.

3) HYPER-PARAMETER SETTING
In some circumstances, the model may result in lower accuracy or even overfitting or underfitting. Performing hyperparameter adjustment is crucial for achieving high model performance. For that reason, the randomized search approach was utilized to refine the hyper-parameters and improve accuracy. Table 4 shows the hyperparameter values for the proposed model.

4) EVALUATION METRIC
The model was tested using a standard performance assessment. The model's accuracy, precision, recall, and F1-score are the four principal indicators to evaluate. These four indicators are derived from the four basic attributes of the confusion matrix as follows:
a. True Positive (TP): indicates that attack data is correctly classified as an attack.
b. True Negative (TN): indicates that normal data is correctly classified as normal.
c. False Positive (FP): indicates that normal data is incorrectly classified as an attack.
d. False Negative (FN): indicates that attack data is incorrectly classified as normal.

The evaluation indicators accuracy (3), precision (4), recall (5), and F1-score (6) below are the principal indicators to evaluate our model:

Accuracy (A):
$$A = \frac{TP + TN}{TP + FP + TN + FN} \tag{3}$$

Precision (p):
$$p = \frac{TP}{TP + FP} \tag{4}$$

Recall (r):
$$r = \frac{TP}{TP + FN} \tag{5}$$

F1-Score (f1):
$$f_1 = \frac{2\,p\,r}{p + r} = \frac{2\,TP}{2\,TP + FP + FN} \tag{6}$$

V. EXPERIMENT RESULTS AND DISCUSSION
A. BINARY CLASS CLASSIFICATION RESULTS FOR ALL DATASETS
Several deep learning and machine learning algorithms are currently being used to detect network intrusions. In network intrusion detection, random forests and standard convolutional neural networks are widely used. As a result, the algorithms in this paper are compared to the classic classification algorithms widely used in intrusion detection. In this study, classification performance was compared using AlexNet, LeNet5, CNN, and CNN-LSTM networks.

The suggested model was compared to numerous deep learning models, including CNN, AlexNet, LeNet5, and CNN-LSTM, using the optimal hyperparameter tuning found in this study. Additionally, it was tested on the InSDN, UNSW-NB15, and NSL-KDD datasets. Table 5 and Figure 6 show the binary classification results for each model on the InSDN dataset. The CNN-BiLSTM has the top performance for binary classification across all evaluation metrics. Our model has an average accuracy of 97.77%. The suggested model outperformed the previous models, i.e., CNN, AlexNet, LeNet5, and CNN-LSTM, by 2.11%, 31.19%, 5.68%, and 2.74%, respectively. This illustrates the significance and ability of CNN-BiLSTM for efficiently detecting anomalies with the smallest number of features.

We also conducted tests on the UNSW-NB15 and NSL-KDD datasets to further validate the suggested technique in this research. Table 5 summarizes the experimental findings. The classification performance of all classifiers is between 84% and 95%. In classification performance, the approach described in this research outperformed AlexNet, LeNet5, CNN, and CNN-LSTM by 8.13%, 9.23%, 9.04%, and 7.62%, respectively, on the UNSW-NB15 dataset.
It also outperformed the same models on the NSL-KDD dataset by 9.86%, 9.94%, 11.86%, and 3.71%, respectively.

TABLE 5. Binary classification results for all datasets and the running time.

As illustrated in Table 5, among all classification models on the InSDN dataset, the CNN-BiLSTM produced the best detection accuracy. CNN, AlexNet, LeNet-5, CNN-LSTM, and CNN-BiLSTM take 421.15 s, 1633.62 s, 242.01 s, 736.56 s, and 978.42 s of training time, respectively. Although CNN-BiLSTM took more training time than LeNet-5, this was tolerable in consideration of the much higher accuracy. CNN-BiLSTM performed much better than AlexNet.

B. MULTI-CLASS CLASSIFICATION RESULTS
1) RESULTS AND DISCUSSION ON INSDN DATASET
Table 6 provides detailed findings for multi-class classification using several classifiers on the InSDN dataset. In terms of different metrics, the performance on the various attacks varies significantly amongst classifiers.
• The accuracy of the different models is depicted in Figure 7, and the proposed CNN-BiLSTM model demonstrates the highest accuracy among the models analyzed. It outperforms the other models with an accuracy of 97.12%. The CNN-LSTM model follows closely with an accuracy of 96.69%. The LeNet5 and AlexNet models achieved accuracies of 93.97% and 89.79%, respectively, while the CNN model had the lowest accuracy at 85.94%.
• The recall of the models ranged from 50% to 100%. The proposed CNN-BiLSTM model had the highest recall of 100% for the following classes: U2R, DDoS, DoS, R2L, Botnet, Web Attack, and Normal. The CNN-LSTM model had the highest recall of 90.18% for the class
TABLE 9. Average accuracy of the proposed model compared to other models.

the highest accuracy, recall, precision, and F1-score for all five classes of network traffic. The CNN-LSTM model is the next best model, with slightly lower accuracy but still very high performance. The other three models (CNN, AlexNet, and LeNet5) have lower accuracy but still relatively high performance.

3) RESULTS AND DISCUSSION ON UNSW-NB15 DATASET
Table 8 shows the performance of different models on an attack detection task. The models are evaluated using different metrics, including accuracy, recall, precision, and F1-score. The proposed CNN-BiLSTM model achieves the highest accuracy of 84.23%. It also has the highest recall and F1-score for most of the classes. This suggests that the proposed model can effectively detect attacks with high accuracy. A more detailed breakdown of the results in Table 8 for the proposed model on the UNSW-NB15 dataset follows:
• Figure 10 shows the accuracy of the different models; the proposed model has an accuracy of 84.23%, which is higher than all of the other models.
• Recall: The proposed model has a recall of 49.29% for backdoors, 54.95% for analysis, 68.69% for fuzzers, 86.08% for shellcode, 97.98% for reconnaissance, 93.4% for exploits, 79.31% for DoS, and 91.47% for worms. This suggests that the proposed model is capable of efficiently identifying a wide range of attacks.
• Precision: The proposed model has a precision of 49% for backdoors, 42.56% for analysis, 54.29% for fuzzers, 66.83% for shellcode, 91.2% for reconnaissance, 75.45% for exploits, 72.31% for DoS, and 88.87% for worms. This implies that the proposed model is not overfitting the data and is able to generalize to unseen attack samples.
• Figure 11 shows the F1-score of all models for multi-class classification on UNSW-NB15; the proposed model has an F1-score of 44.35% for backdoors, 42.18% for analysis, 51.16% for fuzzers, 70.51% for shellcode, 94.01% for reconnaissance, 81.82% for exploits, 74.88% for DoS, and 90.58% for worms. This suggests that the proposed model can effectively detect most types of attack with good accuracy and precision.

Overall, the results show that the proposed CNN-BiLSTM model is a promising approach for attack detection. It can effectively detect most types of attack with high accuracy and precision.
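Per-class recall, precision, and F1-score values such as those listed above are typically obtained by treating each class in turn as the positive class (one-vs-rest) and counting TP, FP, and FN for it. The following is a minimal sketch of that computation in plain Python; the function name `per_class_metrics` and the tiny label lists are hypothetical and are not taken from the experiments reported here.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """One-vs-rest precision, recall, and F1 per class, plus overall accuracy."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but the true class was different
            fn[t] += 1  # the true class t was missed
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, report


# Hypothetical labels for six flows; one DoS flow is misclassified as Normal.
y_true = ["DoS", "DoS", "Normal", "Normal", "Worms", "DoS"]
y_pred = ["DoS", "Normal", "Normal", "Normal", "Worms", "DoS"]
accuracy, report = per_class_metrics(y_true, y_pred)
```

In this toy example the single misclassified DoS flow lowers the recall of DoS and the precision of Normal, which shows why the per-class values can differ noticeably from the overall accuracy.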
4) COMPARISON OF THE PROPOSED HYBRID MODEL WITH DIFFERENT MODELS
An improved level of accuracy was achieved with the proposed model. Based on Table 9, our model had an accuracy of 98.42%, which was the highest among comparable studies.

VI. CONCLUSION
In this research, we propose a solution for network intrusion detection based on CNN and BiLSTM. The suggested model makes use of the integration of CNN and BiLSTM to enhance detection capacity across a variety of datasets. Additionally, we demonstrated that our suggested technique outperformed a single CNN, LeNet5, AlexNet, and CNN-LSTM in terms of training time and accuracy. Furthermore, by balancing the dataset using the random oversampling approach and selecting features using a random forest classifier along with the recursive feature elimination approach, the detection model's performance was improved, indicating that the hybrid model has a significant potential for usage in real-time NIDS. In further work, we will attempt to explore the potential of transfer learning techniques to improve the performance of our model, as well as to deploy the suggested model in a real SDN system and evaluate its throughput and latency performance.

ACKNOWLEDGMENT
The authors would like to thank their technical support team for making it possible to use the TRUBA HPC system efficiently in this article.

REFERENCES
[1] Y. Jarraya, T. Madi, and M. Debbabi, "A survey and a layered taxonomy of software-defined networking," IEEE Commun. Surveys Tuts., vol. 16, no. 4, pp. 1955–1980, 4th Quart., 2014, doi: 10.1109/COMST.2014.2320094.
[2] N. McKeown and T. Anderson, "OpenFlow: Enabling innovation in campus networks," SIGCOMM Comput. Commun. Rev., vol. 38, no. 2, pp. 69–74, 2008.
[3] M. Latah and L. Toker, "An efficient flow-based multi-level hybrid intrusion detection system for software-defined networks," CCF Trans. Netw., vol. 3, nos. 3–4, pp. 261–271, Dec. 2020, doi: 10.1007/s42045-020-00040-z.
[4] M. S. ElSayed, N.-A. Le-Khac, M. A. Albahar, and A. Jurcut, "A novel hybrid model for intrusion detection systems in SDNs based on CNN and a new regularization technique," J. Netw. Comput. Appl., vol. 191, Oct. 2021, Art. no. 103160.
[5] N. Sultana, N. Chilamkurti, W. Peng, and R. Alhadad, "Survey on SDN based network intrusion detection system using machine learning approaches," Peer-Peer Netw. Appl., vol. 12, no. 2, pp. 493–501, Mar. 2019, doi: 10.1007/s12083-017-0630-0.
[6] M. Kantardzic, Data Mining: Concepts, Models, Methods, and Algorithms. Hoboken, NJ, USA: Wiley, 2011.
[11] L. Zhang, J. Huang, Y. Zhang, and G. Zhang, "Intrusion detection model of CNN-BiLSTM algorithm based on mean control," in Proc. IEEE 11th Int. Conf. Softw. Eng. Service Sci. (ICSESS), Oct. 2020, pp. 22–27.
[12] M. S. Elsayed, N.-A. Le-Khac, and A. D. Jurcut, "InSDN: A novel SDN intrusion dataset," IEEE Access, vol. 8, pp. 165263–165284, 2020, doi: 10.1109/ACCESS.2020.3022633.
[13] S. Revathi and A. Malathi, "A detailed analysis on NSL-KDD dataset using various machine learning techniques for intrusion detection," Int. J. Eng. Res. Technol., vol. 2, no. 12, pp. 1848–1853, 2013.
[14] F. A. Khan, A. Gumaei, A. Derhab, and A. Hussain, "A novel two-stage deep learning model for efficient network intrusion detection," IEEE Access, vol. 7, pp. 30373–30385, 2019.
[15] R. K. Malaiya, D. Kwon, J. Kim, S. C. Suh, H. Kim, and I. Kim, "An empirical evaluation of deep learning for network anomaly detection," in Proc. Int. Conf. Comput., Netw. Commun. (ICNC), Mar. 2018, pp. 893–898.
[16] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, "Deep learning approach for network intrusion detection in software defined networking," in Proc. Int. Conf. Wireless Netw. Mobile Commun. (WINCOM), Oct. 2016, pp. 258–263.
[17] S. Boukria and M. Guerroumi, "Intrusion detection system for SDN network using deep learning approach," in Proc. Int. Conf. Theor. Applicative Aspects Comput. Sci. (ICTAACS), vol. 1, Dec. 2019, pp. 1–6.
[18] T. Bakhshi and B. Ghita, "Anomaly detection in encrypted internet traffic using hybrid deep learning," Secur. Commun. Netw., vol. 2021, pp. 1–16, Sep. 2021.
[19] A. R. Narayanadoss, T. Truong-Huu, P. M. Mohan, and M. Gurusamy, "Crossfire attack detection using deep learning in software defined ITS networks," in Proc. IEEE 89th Veh. Technol. Conf. (VTC-Spring), Apr. 2019, pp. 1–6.
[20] M. K. Prasath and B. Perumal, "A meta-heuristic Bayesian network classification for intrusion detection," Int. J. Netw. Manage., vol. 29, no. 3, p. e2047, May 2019.
[21] A. Dawoud, S. Shahristani, and C. Raun, "Deep learning and software-defined networks: Towards secure IoT architecture," Internet Things, vols. 3–4, pp. 82–89, Oct. 2018.
[22] C. Song, Y. Park, K. Golani, Y. Kim, K. Bhatt, and K. Goswami, "Machine-learning based threat-aware system in software defined networks," in Proc. 26th Int. Conf. Comput. Commun. Netw. (ICCCN), Jul. 2017, pp. 1–9.
[23] A. Alshamrani, A. Chowdhary, S. Pisharody, D. Lu, and D. Huang, "A defense system for defeating DDoS attacks in SDN based networks," in Proc. 15th ACM Int. Symp. Mobility Manage. Wireless Access, Nov. 2017, pp. 83–92.
[24] S. A. Mehdi, J. Khalid, and S. A. Khayam, "Revisiting traffic anomaly detection using software defined networking," in Proc. Int. Symp. Recent Adv. Intrusion Detection, Menlo Park, CA, USA, Sep. 2011, pp. 161–180.
[25] R. Sathya and R. Thangarajan, "Efficient anomaly detection and mitigation in software defined networking environment," in Proc. 2nd Int. Conf. Electron. Commun. Syst. (ICECS), Feb. 2015, pp. 479–484.
[26] S. T. Selvi and K. Govindarajan, "DDoS detection and analysis in SDN-based environment using support vector machine classifier," in Proc. 6th Int. Conf. Adv. Comput. (ICoAC), Dec. 2014, pp. 205–210.
[27] Q. Niyaz, W. Sun, and A. Y. Javaid, "A deep learning based DDoS detection system in software-defined networking (SDN)," 2016, arXiv:1611.07400.
[28] S. M. Mousavi and M. St-Hilaire, "Early detection of DDoS attacks against software defined network controllers," J. Netw. Syst. Manage., vol. 26, no. 3, pp. 573–591, Jul. 2018.
[29] A. Le, P. Dinh, H. Le, and N. C. Tran, "Flexible network-based intrusion detection and prevention system on software-defined networks," in Proc.
[7] I. Srba and M. Bieliková, ‘‘Encouragement of collaborative learning based Int. Conf. Adv. Comput. Appl. (ACOMP), Nov. 2015, pp. 106–111.
on dynamic groups,’’ in Proc. Eur. Conf. Technol. Enhanced Learn. Cham, [30] S. Nanda, F. Zafari, C. DeCusatis, E. Wedaa, and B. Yang, ‘‘Predicting
Switzerland: Springer, 2012, pp. 432–437. network attack patterns in SDN using machine learning approach,’’ in Proc.
[8] M. Abdallah, N. A. Le Khac, H. Jahromi, and A. D. Jurcut, ‘‘A hybrid IEEE Conf. Netw. Function Virtualization Softw. Defined Netw., Nov. 2016,
CNN-LSTM based approach for anomaly detection systems in SDNs,’’ in pp. 167–172.
Proc. 16th Int. Conf. Availability, Rel. Secur., Aug. 2021, pp. 1–12, doi: [31] H. Li, F. Wei, and H. Hu, ‘‘Enabling dynamic network access con-
10.1145/3465481.3469190. trol with anomaly-based IDS and SDN,’’ in Proc. ACM Int. Workshop
[9] K. Jiang, W. Wang, A. Wang, and H. Wu, ‘‘Network intrusion detection Secur. Softw. Defined Netw. Netw. Function Virtualization, Mar. 2019,
combined hybrid sampling with deep hierarchical network,’’ IEEE Access, pp. 13–16.
vol. 8, pp. 32464–32476, 2020. [32] P. Manso, J. Moura, and C. Serrão, ‘‘SDN-based intrusion detection system
[10] N. M. Jacob and M. Y. Wanjala, ‘‘A review of intrusion detection systems,’’ for early detection and mitigation of DDoS attacks,’’ Information, vol. 10,
Global J. Comput. Sci. Inf. Technol. Res., vol. 5, no. 4, pp. 1–5, 2017. no. 3, p. 106, Mar. 2019.
[33] M. A. Albahar, “Recurrent neural network model based on a new regularization technique for real-time intrusion detection in SDN environments,” Secur. Commun. Netw., vol. 2019, pp. 1–9, Nov. 2019.
[34] M. Ring, S. Wunderlich, D. Scheuring, D. Landes, and A. Hotho, “A survey of network-based intrusion detection data sets,” Comput. Secur., vol. 86, pp. 147–167, Sep. 2019.
[35] N. Moustafa and J. Slay, “UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set),” in Proc. Mil. Commun. Inf. Syst. Conf. (MilCIS), Nov. 2015, pp. 1–6.
[36] E. Bisong, Building Machine Learning and Deep Learning Models on Google Cloud Platform. Cham, Switzerland: Springer, 2019.
[37] E. Jackson and R. Agrawal, Performance Evaluation of Different Feature Encoding Schemes on Cybersecurity Logs. Piscataway, NJ, USA: IEEE, 2019.
[38] J. Li, K. Cheng, S. Wang, F. Morstatter, R. P. Trevino, J. Tang, and H. Liu, “Feature selection: A data perspective,” ACM Comput. Surv., vol. 50, no. 6, pp. 1–45, Jan. 2018.
[39] J. Ling and C. Wu, “Feature selection and deep learning based approach for network intrusion detection,” in Proc. 3rd Int. Conf. Mechatronics Eng. Inf. Technol., 2019, pp. 764–770, doi: 10.2991/icmeit-19.2019.122.
[40] L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32, 2001.
[41] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997, doi: 10.1162/neco.1997.9.8.1735.
[42] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, “Deep recurrent neural network for intrusion detection in SDN-based networks,” in Proc. 4th IEEE Conf. Netw. Softwarization Workshops (NetSoft), Jun. 2018, pp. 202–206.
[43] P. Choobdar, M. Naderan, and M. Naderan, “Detection and multi-class classification of intrusion in software defined networks using stacked auto-encoders and CICIDS2017 dataset,” Wireless Pers. Commun., vol. 123, no. 1, pp. 437–471, Mar. 2022.

RACHID BEN SAID received the M.S. degree in computer engineering from the National High School for Electricity and Mechanics, Hassan II University of Casablanca, Casablanca, Morocco, in 2014. He is currently pursuing the Ph.D. degree in computer engineering with Ankara University, Ankara, Turkey. His research interests include the application of deep learning and machine learning in the fields of software-defined networking, networking security, and NLP.

ZAKARIA SABIR received the Ph.D. degree in computer engineering and the master’s degree in information systems security from the National School of Applied Sciences (ENSA), Ibn Tofail University, Kenitra, Morocco. His research interests include connected vehicles, named data networking, security, blockchain, and intelligent transportation systems.

IMAN ASKERZADE received the B.S. and M.S. degrees from the Department of Physics, Moscow State University, Russia, and the Ph.D. degree from the Azerbaijan Academy of Sciences, Moscow State University, Azerbaijan, in 1995. Since 2012, he has been a Professor with the Computer Engineering Department, Ankara University, Ankara, Turkey. His research interests include fuzzy logic, quantum computing, modeling, and simulation.