2022 Iraqi International Conference on Communication & Information Technologies (IICCIT-2022), Basrah University, Basrah, Iraq

Botnet Mobile Detection Using Machine & Deep Learning Techniques
2022 Iraqi International Conference on Communication and Information Technologies (IICCIT) | 978-1-6654-7220-3/22/$31.00 ©2022 IEEE | DOI: 10.1109/IICCIT55816.2022.10010653

Hasan Abdulkader
Dept. of Electrical and Computer Engineering (ECE)
Altinbas University
Istanbul, Turkey
[email protected]

Mustafa Al-farttoosi
Computer and Electrical Engineering
Altinbas University
Istanbul, Turkey
[email protected]

Abstract: In this study, we discuss one of the most dangerous and most widespread attacks, namely botnets. We also describe the bot attack mechanism and introduce methods that use machine learning (ML) and deep learning techniques to detect attacks delivered by SMS or malware. The study investigates the application of several ML-based methods, including logistic regression, random forest, and deep neural networks. A deep learning algorithm was applied to artificial neural networks (ANN) in more than one way. Each dataset was divided into two halves to obtain more accurate results than classical learning. Special preprocessing of the datasets was applied to improve the performance of the classification algorithms. The accuracy obtained from deep learning was encouraging. A result of 99.79% was obtained for the SMS attacks and 98.48% for the malware attacks.

Keywords: Botnet, Deep learning, Ham, Machine learning, Malware, Spam.

I. INTRODUCTION

The next mobile generation (5G) will include many innovative technologies, such as device-to-device (D2D) communication, machine-to-machine communication, massive MIMO, autonomous vehicles, software-defined networking (SDN), and virtual network functions. This development will increase the number of users, user equipment (UE), and IoT machines. According to the Ericsson Mobility Report, eMBB is predicted to reach 1.5 billion subscribers by the end of 2024 [1]; 5G will connect approximately 7 trillion wireless devices or things and shrink the average service creation time from 90 hours to 90 minutes [2]. 5G is the first mobile technology designed to meet the requirements of connected devices for health, industrial applications, transportation, banking, and many other IoT use cases. However, it will also increase the potential security challenges, safety issues, and cyber attacks, since most user equipment devices already have an open system, which makes them a potential and tempting target for hackers. This is especially true of IoT devices, which are considered vulnerable to cyber attacks. In general, the main reasons that make IoT devices and UE vulnerable to hacking include [3]:

• Lack of encryption or poor encryption
• Default password
• Poor support
• Lack of user awareness

Among the potential malicious attacks, the most powerful is the botnet attack. Kaspersky Lab reported in 2016 that botnet-assisted DDoS attacks represented 78.9 percent of all detected attacks [4]. In the future, Internet of Things (IoT) devices and UE may play the main role in health, industrial, banking, and other vital fields, so any threat to these devices will be a serious problem for institutions and factories.

In establishments that use these devices, an attacker may steal sensitive, confidential, or private information, as well as credentials and financial information, from mobile devices using botnet attacks that aim to create a distributed denial of service (DDoS). Similar to botnets in legacy mobile networks, future botnets will target the 5G network.

TABLE I. TYPES OF A BOTNET ATTACK

Attack | Impact of Attack
DDoS | Preventing a single system from servicing legitimate requests
Adware | Active advertising of a commercial offering without the user's permission or awareness
Spyware | Sending information to the botmaster about a victim's activity, such as credit card numbers, passwords, and other information that can be sold on the black market
Email spam | Flooding people with emails disguised as messages from people but containing malicious links

Networks of many UE (mobile or IoT) are under the control of a malicious actor, called the bot master, who periodically issues orders. In a centralized 5G mobile botnet, the compromised devices are controlled through a central command and control (C&C) server [5]. The bot master is responsible for choosing the mobile devices that will be compromised by malware and turned into bots. A compromised device, in turn, acts as a bot proxy server, which serves as a means of communication between the bot master and the other slave bots. The bot device also receives requests from the bot master for a specific duration of time. These attacks form a very large network of malicious devices with an unstable topology, because user equipment moves from one macro cell to another.


Fig. 1. Centralized 5G Botnet.

A. Datasets

Our study covered two datasets. The first was a collection of clean and malicious SMS messages, while the second was a collection of clean and malicious software. The datasets therefore correspond to the two types of botnet attack on mobile phones that are the focus of our investigation, SMS and malware, and each table records whether or not a sample was classified as a botnet attack. The SMS dataset and the malware dataset were chosen from the repository because most of the programs included in the database are programs that were installed on the computer and received using mobile devices, some of which are harmful and some of which are useful. These datasets have been used by many researchers to test machine learning (ML) and deep learning techniques. The SMS dataset consists of the message content and the message classification (spam or ham), forming a bivariate dataset, whereas the malware dataset, which contains 54 features, is made up of classes of applications and their content, forming a multivariate dataset.

The SMS dataset consisted of 5,574 text messages from the UCI Machine Learning repository, registered in 2012 [6] [7]. A subset of 3,375 randomly selected non-spam (ham) SMS messages from the NUS SMS Corpus was included, as were 425 manually extracted spam messages from the Grumbletext website (a UK community where mobile phone users made public allegations about spam).

As far as the malware dataset is concerned, it was also provided with the collection and identification of benign and malware files. Malware samples were obtained from several trusted websites, which include VirusTotal and Kaggle [8]. The malware dataset used in this study contained 138,047 samples, with 41,323 normal samples and 96,724 malware samples.

B. Algorithms

According to a broad definition of artificial intelligence (AI), the term refers to a specific approach that allows computers to imitate human behavior and duplicate or excel over human decision-making to accomplish complicated tasks independently or with minimum human participation [9]. As a result, it is concerned with a wide range of fundamental issues, including the representation of information, reasoning, learning, planning, perception, and communication. It relates to a wide range of algorithms and procedures.

Handcrafted feature engineering is no match for the power of deep neural networks. Using their innovative architecture, they can extract discriminative feature representations with minimum human effort. Because of this, deep learning (DL) is better able to deal with huge, noisy, and unstructured datasets than other approaches. Features are often learned hierarchically, starting with basic ones and building up to more complex features. However, there are several strategies for learning features in combination with the model development phase, depending on the kind of data and the choice of deep neural network architecture.

The term "machine learning" refers to the process by which computers learn from the data they are given, using certain algorithms, to complete a job without being specifically programmed. DL is a kind of ML that employs complicated deep neural network structures fashioned after the human brain and provided with specific algorithms to learn even deep layers. Its impressive capability makes it possible to process unstructured data, such as documents, photos, and text.

Applications like checking messages, software, or any of the currently popular issues have been solved using ML. In general, ML implies that a computer program's performance improves over time for a certain set of tasks and performance measurements [10]. The goal is to automate the process of creating analytical models to carry out cognitive tasks like image classification, object identification, or natural language translation.

II. LITERATURE REVIEW

The topic of malware detection and SMS spam detection has attracted much interest since it touches all users of IT, especially users of mobile appliances.

Spam detection using ML and DL algorithms is a statistical adaptive learning model. For SMS spam classification, many researchers have used ML approaches in the past [11-15]. M. Rubin Julis et al. [11] compared multiple ML classifiers and attained a 97% accuracy rate with a support vector machine. Support vector machines also provided the best accuracy, 97.3%, among a variety of ML methods evaluated by Navaney et al. [12]. Nilam Nur Amir Sjarif et al. [13] used random forest classifiers to reach 97.5% accuracy when combined with the TF-IDF approach. Term Frequency (TF) is the frequency of any "term" in a given "document," and Inverse Document Frequency (IDF) is constant per corpus and accounts for the ratio of documents that include that specific "term," so these are two methods used to quantify the words in a document. For the identification of SMS spam, Tian Xia et al. [14] suggested the Hidden Markov Model (HMM). By including information regarding word order, their model was able to address problems with low term frequency. Their proposed HMM model has a 98% accuracy rate. Sheikhi et al. [15]

advocated using feature selection and the Neural Network model for SMS spam identification and got an excellent accuracy rate of nearly 98%.

There are a few other studies on malware detection using ML. The study in [16], "A Multi-Dimensional Machine Learning Approach to Predict Sophisticated Malware," focused on predicting advanced malware comparable to Stuxnet by employing four regression algorithms, including random forest regression and linear and polynomial regression. According to the findings of that study, linear and polynomial regression were inefficient, whereas random forest regression delivered superior predictions with additional data.

The study in [17], "Machine Learning Aided Android Malware Classification," conducted research towards finding and classifying malware in mobile apps using ML. It was shown that the permission-based technique could distinguish between malware and goodware in 89% of situations, while the classification performance of source code analysis was above 95%. SVM had a 95.1% accuracy rate, while ensemble learning had a 95.6% accuracy rate.

Hemalatha and Selvabrunda [18] suggested ML classifiers to identify mobile malware, with a mixed kernel function unique to support vector machines and selected fundamental information, such as data content, time, and order, utilizing various network-based functions. The MalGenome dataset was used in the evaluation. The implementation, based on a mixed kernel function using SVM, yielded an accuracy of 96.89% when compared to previous models.

Furthermore, Cuan et al. showed how they employed ML approaches to identify malicious PDF files. First, an SVM classifier was built that was capable of detecting 99.7% of the malware. However, the classifier could easily be fooled by a forged malicious file: the authors reported that they successfully used a gradient-descent attack to evade the SVM algorithm [19].

III. METHODOLOGY

A. Dataset modification

Because our topic is restricted to the mobile botnet, two datasets were chosen. The first was the SMS dataset, which consisted of spam and ham messages. For each message, the dataset provided two fields: a field describing the message and a message field. The total number of messages in the dataset was 5,574. The second dataset, the malware dataset, contained 138,047 applications and had 54 features, such as "Size of Optional Header, Address of Entry Point, Major Linker Version, Minor Linker Version and Size of Code." Besides the features, the dataset labeled each application as "legitimate," which could be True or False. The program was developed on the Google Colab platform using Python and specialized libraries, such as scikit-learn and pandas. We used pandas to analyze the data and to manipulate the tabular data in DataFrames. Pandas allows importing data from various file formats, such as comma-separated values, JSON, Parquet, SQL database tables or queries, and Microsoft Excel. In our case, the data were stored as CSV files. In the following, we explain in detail how we used them.

a) SMS: We discovered that the SMS dataset was not evenly distributed. Unbalanced datasets are common, and the problem arises when ML algorithms try to find unusual cases in huge datasets where such cases are few. Since the classes had different numbers of members, the method preferred sorting samples into the class with the most cases, the majority class, while still presenting the illusion of a high-fidelity model. The minority class, least present in the dataset, would not participate enough in the learning process, so the accuracy reported by the prediction models we built was misleading. Because the proportion of ham was much higher than that of spam, the oversampling method was used to overcome this problem: it increases the number of samples from the minority class by adding duplicates of minority-class messages, which yields an over-sampled, balanced dataset.

b) Malware: Initially, we faced difficulty in training on the data due to the small number of valid programs, totaling 41,323 samples, compared to the number of malware programs, which numbered 96,724 samples. The difference between malware and legitimate apps is therefore 55,401 samples, which is very large. After splitting the dataset into two sub-datasets, we found that the model trained on one half considerably outperformed the model trained on the other. We therefore shuffled the data with the pandas method df.sample() and rebuilt the index with df.reset_index(), which proved to work effectively. The call was written as:

    malData.sample(frac=1).reset_index(drop=True)

To make the data more balanced even after splitting it into two halves, and as part of preparing the data for ML, we normalized the malware dataset by applying a Python lambda function to each column:

    X_Data1.apply(lambda x: (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0)))

The objective of normalization was to let the values of the features range between 0 and 1 according to the following equation:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

The idea of dividing the dataset into two halves was proposed, and the algorithms were trained on each half, which later allowed us to find more accurate results. The accuracy measures were then averaged to obtain a single value of accuracy.
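The preprocessing steps described in this section (oversampling the minority class, shuffling, min-max normalization, and splitting into two halves) can be sketched with pandas as follows. This is a minimal illustration under stated assumptions, not the authors' program: the helper names `oversample_minority`, `min_max_normalize`, and `split_in_halves`, the toy data, the column names, and the fixed random seed are all hypothetical, and the guard against constant columns is an addition that the paper does not mention.

```python
import pandas as pd


def oversample_minority(df, label_col, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes have equal size."""
    counts = df[label_col].value_counts()
    target = int(counts.max())
    parts = []
    for label, count in counts.items():
        rows = df[df[label_col] == label]
        if count < target:
            # Sampling with replacement adds duplicates of existing minority rows.
            extra = rows.sample(n=target - int(count), replace=True, random_state=seed)
            rows = pd.concat([rows, extra])
        parts.append(rows)
    # Shuffle so the duplicated rows are not grouped together.
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)


def min_max_normalize(features):
    """Rescale every numeric column to the range [0, 1]."""
    col_min = features.min(axis=0)
    col_range = features.max(axis=0) - col_min
    # Assumption of this sketch: a constant column is mapped to 0 instead of dividing by zero.
    col_range = col_range.replace(0, 1)
    return (features - col_min) / col_range


def split_in_halves(df, seed=0):
    """Shuffle the rows, then cut the frame into two halves of (nearly) equal size."""
    shuffled = df.sample(frac=1, random_state=seed).reset_index(drop=True)
    middle = len(shuffled) // 2
    return shuffled.iloc[:middle], shuffled.iloc[middle:].reset_index(drop=True)


# Toy SMS data: 6 ham messages and 2 spam messages (made-up values).
sms = pd.DataFrame({"label": ["ham"] * 6 + ["spam"] * 2,
                    "text": [f"message {i}" for i in range(8)]})
balanced = oversample_minority(sms, "label")
print(balanced["label"].value_counts())

# Toy malware features (illustrative column names and values).
apps = pd.DataFrame({"SizeOfCode": [512, 4096, 1024, 2048],
                     "MajorLinkerVersion": [9, 14, 9, 11]})
half_a, half_b = split_in_halves(min_max_normalize(apps))
print(len(half_a), len(half_b))
```

Random oversampling only repeats existing minority rows, so it balances the class counts without creating new information. If the duplication is done before the train/test split, copies of the same message can end up on both sides of the split, which is worth keeping in mind when reading accuracy figures.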

Fig. 2. Oversampling Dataset technology.

The entire work was based on scikit-learn. For SMS, each of the two halves was split into 70% training data (1,950 messages per half, 3,900 in total) and 30% test data (837 messages per half, 1,674 in total). The malware dataset was likewise split into two halves, with 80% training data (55,219 samples for the first half and 55,218 for the second half, 110,437 in total) and 20% test data (13,805 samples for each half, 27,610 in total).

After that, we applied these datasets to the three algorithms (Random Forest, Logistic Regression, and Artificial Neural Network) and trained the models with the datasets before and after oversampling.

• Random forest: it selects random samples from a given dataset, constructs a decision tree for each sample, and gets a prediction result from each decision tree. After that, it performs a vote over the predicted results and, lastly, selects the prediction with the most votes ('Spam or Ham' in the SMS dataset and 'Legit or non Legit apps' in the malware dataset) as the final prediction.

• Logistic Regression: a classification algorithm that is used to predict the probability of a categorical dependent variable. Here, the dependent variable is a binary variable that contains data coded as 1 (spam, malware) and 0 (ham, legit).

• Artificial neural network: for the ANN model we used the Keras library. We used a batch size of 32 for the whole dataset, set the number of epochs to 100, and replaced the terms "Spam & Ham" and "malware & legit" with '1 & 0.' The ANN used for SMS has a total of 4 layers (one input layer, 2 hidden layers, and one output layer) and 31,393 trainable parameters. For malware, the number of layers was 5 (one input layer, 3 hidden layers, and one output layer), with 1,057 trainable parameters.

Lastly, we compare the performance of the models based on measures such as sensitivity and Cohen's kappa (for the SMS dataset), F1-score (for the malware dataset), and accuracy for both datasets. To this end, we compute the predictions of the 'testing inputs' and compare them to the actual 'testing outputs.'

Cohen's kappa (κ) is a statistic that measures inter-rater reliability for qualitative (categorical) items. It is a more robust measure than a simple percentage agreement calculation because it considers the possibility of an agreement occurring by chance. It is a pairwise reliability measure between two annotations.

Cohen's kappa statistic measures the agreement between two raters, where Po is the observed relative agreement between the raters (equivalent to accuracy), and Pe is the hypothetical probability of chance agreement. The equation is given below.

$$\kappa = \frac{P_o - P_e}{1 - P_e}$$

The sensitivity for the SMS dataset is the ratio of the number of samples correctly identified as positive to the number of samples that were actually positive. In other words, sensitivity (also known as recall or the true-positive rate) is the fraction of actual positive samples that the model detects, as described by the equation below:

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

The F1-score (for the malware dataset) is used to compare ML algorithms and is a statistical measure for rating the performance and quality of a model. It can be calculated with the equation below:

$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

where Precision is the number of correct positive predictions relative to the total number of positive predictions, and Recall is the number of correct positive predictions relative to the total number of actual positives.

The confusion matrix has 4 cells, given by the sequence "TP, FP, FN, TN." "TP" is the "true positive" value, the number of positive samples that were accurately categorized; "FP" is the "false positive" value, the number of negative samples classified as positive; "FN" is the "false negative" value, the number of actual positive samples classified as negative; and "TN" is the number of accurately classified negative samples. Thus, the confusion matrix allows us to visualize the performance of the classification models.

IV. RESULTS

The mobile botnet data were divided into two parts, SMS and malware, and each dataset was divided into two halves to find the quality measures for the first and second half. Finally, the average value of each measure was computed to obtain more accurate results for the implemented algorithms (logistic regression, random forest, and Artificial Neural Network). The program was developed on Google Colab using Python and specialized libraries, and we observed the following.
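The quality measures used for this comparison (accuracy, sensitivity, F1-score, and Cohen's kappa) all follow from the four confusion-matrix cells, and the sketch below computes them in plain Python. It is an illustration only: the function name `classification_metrics` is hypothetical, the counts are made-up toy values rather than results from the paper, and the positive class is assumed to be the one coded as 1 (spam or malware).

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, sensitivity, precision, F1-score and Cohen's kappa from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Sensitivity (recall): share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # Precision: share of positive predictions that were correct.
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + sensitivity:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f1 = 0.0
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_o = accuracy
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "precision": precision, "f1": f1, "kappa": kappa}


# Toy example 1: a classifier with a mix of correct and wrong predictions.
mixed = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(mixed)

# Toy example 2: a classifier that assigns every sample to the negative (majority) class.
majority_only = classification_metrics(tp=0, fp=0, fn=30, tn=70)
print(majority_only)
```

The second example shows why accuracy alone can mislead on unbalanced data: a model that never predicts the positive class still scores 70% accuracy on a 70/30 class mix, while its sensitivity, F1-score, and kappa are all zero.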

TABLE I: MALWARE DATASET RESULTS WITHOUT AND WITH NORMALIZATION

Results | Classifiers | Accuracy of the test dataset | Accuracy of train dataset | F1-Score | TP | FP | FN | TN
Average Result Without Normalization | Logistic regression | 70.06% | 70.06% | 0% | 19250 | 0 | 8360 | 0
Average Result Without Normalization | Random Forest | 98.45% | 98.27% | 97.34% | 19080 | 170 | 277 | 8083
Average Result Without Normalization | Artificial Neural Network | 70.08% | 70.1% | 0.15% | 19249 | 1 | 8359 | 1
Normalized Average Result | Logistic regression | 97.25% | 97.33% | 95.33% | 18919 | 331 | 410 | 7950
Normalized Average Result | Random Forest | 97.51% | 97.48% | 95.69% | 19144 | 106 | 569 | 7791
Normalized Average Result | Artificial Neural Network | 98.44% | 98.48% | 97.38% | 19067 | 183 | 162 | 8198

A. SMS

In previous studies, most of the results were obtained by analyzing the complete, unbalanced dataset. We compare these with the results achieved by the methods we propose, which use a data balancing method, having explained above the reason for the discrepancy in the results and our attempt to improve them. The oversampling method was therefore used to deal with the unbalanced data: it increases the frequency of the minority category 'spam' and makes it equal to the majority category 'ham,' so that each category contained 4,825 messages and the total number of data points was 9,650. Using this method, we obtain a more complete and balanced dataset, which drastically improves the performance of the proposed models, as shown in Table II. The results before balancing were the worst. In terms of accuracy, the lowest result was 97.15% for the Random Forest algorithm and the best was 98.08% for ANN. The lowest sensitivity was 80.33% for the Random Forest algorithm, and the highest was 86.92% for ANN. Finally, Cohen's kappa reached its lowest value, 87.06%, for the Random Forest algorithm and its highest value, 91.46%, for the ANN algorithm. After balancing, the results became more accurate and stable: the accuracy of the ANN model reached 99.757% and that of Random Forest reached the highest value of 99.9%; the sensitivity of the ANN model reached 99.645% and that of Random Forest reached 100%; and Cohen's kappa had its lowest value, 99.51%, for the ANN model and its highest value, 99.79%, for the Random Forest algorithm.

TABLE II. SMS DATASET RESULTS

Results | Classifiers | Accuracy | Sensitivity | Cohen's Kappa | TP | FP | FN | TN
Unbalanced Average Result | Logistic regression | 97.52% | 83.44% | 88.91% | 1453 | 0 | 33 | 186
Unbalanced Average Result | Random Forest | 97.15% | 80.33% | 87.06% | 1453 | 0 | 37 | 182
Unbalanced Average Result | Artificial Neural Network | 98.08% | 86.92% | 91.46% | 1452 | 1 | 29 | 190
Oversampling Average Result | Logistic regression | 99.83% | 99.925% | 99.653% | 1450 | 4 | 4 | 1437
Oversampling Average Result | Random Forest | 99.9% | 100% | 99.79% | 1453 | 1 | 0 | 1441
Oversampling Average Result | Artificial Neural Network | 99.757% | 99.645% | 99.51% | 1453 | 1 | 10 | 1431

B. Malware

In the preliminary tests, the dataset was used in full without preprocessing, and the best of the models focused heavily on the SVM algorithm. The results were similar to those explained in the literature review section. Before modifying the dataset, we noticed a dispersion in the logistic regression readings because this algorithm is usually used for binary classification. Training an ANN tends to constantly decrease the measured error between its output and the reference output. However, the features in the dataset are huge numeric entries, which saturates the neuron outputs and prevents convergence toward acceptable models. Thus, the dataset was normalized, i.e., the huge feature values were converted to values between 0 and 1. The same method was applied when the dataset was separated into two equal sub-datasets. The results of the models trained separately on the first and second half datasets, and the average values of the quality measures, are given in Table I.

Before normalization, the lowest accuracy on the test data was 70.06% for the logistic regression algorithm, and the best result was 98.45% for Random Forest. The lowest accuracy on the training data was 70.06% for the logistic regression algorithm and the highest was 98.27% for Random Forest. Finally, the F1-score reached its lowest value, 0%, for logistic regression, and the ANN models reached 0.15%.

However, the F1-score of the Random Forest algorithm was 97.34%. After normalization, the results became more accurate: on the testing dataset, the lowest accuracy was 97.25% for logistic regression and the highest was 98.44% for ANN. For the training data, the lowest accuracy was 97.33% for the logistic regression algorithm, and the highest was 98.48% for the ANN model. Finally, concerning the F1-score, the worst result, 95.33%, was observed for the logistic regression algorithm, and the best value, 97.38%, was recorded for the ANN model.

CONCLUSION

In this work, artificial intelligence-based methods were proposed for detecting botnet attacks. A preprocessing phase for the datasets was presented, consisting of splitting the data into two halves and making it normalized and balanced. ML methods, including ANN, random forest, and logistic regression, were tested for detecting SMS attacks and malware by classifying normal and harmful samples. The results of the experiments were ranked in order of preference based on how the models performed on SMS attacks. The ML methods achieved high results thanks to the oversampling of the data, compared to the same methods before the balancing process. In principle, the approach can be used to identify a variety of attacks, bots, and other forms of unwanted network behavior better than previously created models.

ML was also used for malware detection, and the results using the original dataset were sorted in order of preference. The ranking changed after the data was shuffled and normalized: the ANN and logistic regression models obtained very good results, while the random forest classifier saw a slight decrease. The findings suggest that the dataset was handled better after normalization.

REFERENCES

[1] "5G estimated to reach 1.5 billion subscriptions in 2024 - Ericsson." [Online]. Available: https://www.ericsson.com/en/press-releases/2018/11/5g-estimated-to-reach-1.5-billion-subscriptions-in-2024--ericsson-mobility-report.
[2] 5G-PPP Security WG, "5G-PPP Phase1 Security Landscape," white paper, 2017.
[3] I. Ahmad, T. Kumar, M. Liyanage, J. Okwuibe, M. Ylianttila, and A. Gurtov, "Overview of 5G Security Challenges and Solutions," IEEE Commun. Stand. Mag., vol. 2, no. 1, pp. 36–43, 2018, DOI: 10.1109/MCOMSTD.2018.1700063.
[4] O. Kupreev, J. Strohschneider, and A. Khalimonenko, Kaspersky DDOS Intelligence Report for Q3 2016, tech. report, SecureList, 31 Oct. 2016; https://securelist.com/kaspersky-ddos-intelligence-report-for-q3-2016/76464.
[5] G. Mantas, N. Komninos, J. Rodriguez, E. Logota, and H. Marques, "Security for 5G Communications," Fundam. 5G Mob. Networks, pp. 207–220, 2015, DOI: 10.1002/9781118867464.ch9.
[6] SMS Spam Collection Data Set from UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection
[7] SMS Spam Collection v.1, http://www.dt.fee.unicamp.br/~tiago/smsspamcollection
[8] K. Inc. Kaggle, Retrieved from https://www.kaggle.com/nsaravana/malwaredetection#Malware%20dataset.cs, 2019.
[9] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach (4th ed.). Pearson, 2021.
[10] M. I. Jordan and T. M. Mitchell, "Machine learning: Trends, perspectives, and prospects," Science, 349(6245), 255–260, 2015. https://doi.org/10.1126/science.aaa8415
[11] M. Rubin Julis and S. Alagesan, "Spam Detection In Sms Using Machine Learning through Textmining," International Journal of Scientific & Technology Research, Volume 9, Issue 02, February 2020.
[12] P. Navaney, G. Dubey, and A. Rana, "SMS Spam Filtering using Supervised Machine Learning Algorithms," in 8th International Conference on Cloud Computing, Data Science & Engineering, 978-1-5386-1719-9/18, 2018 IEEE.
[13] N. Nur Amir Sjarif, N. F. Mohd Azmi, and S. Chuprat, "SMS Spam Message Detection using Term Frequency-Inverse Document Frequency and Random Forest Algorithm," in The Fifth Information Systems International Conference 2019, Procedia Computer Science 161 (2019) 509–515, ScienceDirect.
[14] T. Xia and X. Chen, "A Discrete Hidden Markov Model for SMS Spam Detection," Applied Sciences, MDPI, Appl. Sci. 2020, 10, 5011; doi:10.3390/app10145011.
[15] S. Sheikhi, M. T. Kheirabadi, and A. Bazzazi, "An Effective Model for SMS Spam Detection Using Content-based Features and Neural Network," International Journal of Engineering, IJE Transactions B: Applications, Vol. 33, No. 2 (February 2020) 221–228.
[16] S. Bahtiyar, M. B. Yaman, and C. Y. Altıniğne, "A multi-dimensional machine learning approach to predict advanced malware," Computer Networks, 160, 118–129, 2019. https://doi.org/10.1016/j.comnet.2019.06.015
[17] N. Milosevic, A. Dehghantanha, and K. K. R. Choo, "Machine learning aided Android malware classification," Computers and Electrical Engineering, 61, 266–274, 2017. https://doi.org/10.1016/j.compeleceng.2017.02.013
[18] S. Hemalatha, "Mobile Malware Detection using Anomaly Based Machine Learning Classifier Techniques," International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, Volume-8, Issue 11S2, September 2019.
[19] B. Cuan, A. Damien, C. Delaplace, and M. Valois, "Malware detection in PDF files using machine learning," ICETE 2018 - Proceedings of the 15th International Joint Conference on e-Business and Telecommunications, 2, 412–419, 2018. https://doi.org/10.5220/0006884705780585.

Authorized licensed use limited to: ULAKBIM UASL - Altinbas Universitesi. Downloaded on January 14,2023 at 09:28:52 UTC from IEEE Xplore. Restrictions apply.

