
International Journal of Scientific Research in Computer Science, Engineering and Information Technology

ISSN : 2456-3307 (www.ijsrcseit.com)


doi : https://doi.org/10.32628/CSEIT228654

A Literature Review on Machine Learning for Cyber Security Issues
Jay Kumar Jain1, Akhilesh A. Waoo1, Dipti Chauhan2
1 Department of Computer Science and Engineering, AKS University, Satna, Madhya Pradesh, India
2 Professor, Department of Computer Science and Engineering, PIEMR, Indore, Madhya Pradesh, India

ABSTRACT
Machine learning primarily aims to automate tasks that would otherwise need human help by building algorithms from relevant data. A subset of artificial intelligence (AI), machine learning (ML) focuses on the development of systems that can learn from past data, recognize patterns, and reach logical conclusions with little to no human involvement. Cyber security involves guarding digital systems such as computers, servers, mobile devices, and networks, and the data associated with them, against hostile attacks. Combining cyber security and ML has two main components: providing cyber security where machine learning is used, and using machine learning to enable cyber security. This union can benefit us in a number of ways, including giving machine learning models better security, enhancing the effectiveness of cyber security techniques, and supporting the efficient detection of zero-day threats with minimal human involvement. In this review paper, we discuss these two distinct notions of combining ML and cyber security. We also discuss the benefits, problems, and difficulties of combining them. In addition, we explore several attacks and present a thorough analysis of various tactics in two different categories. Finally, we offer a few suggestions for future research.
Keywords- Cyber-security issues, machine learning, algorithms, detection.

Article Info
Publication Issue : Volume 8, Issue 6, November-December-2022
Page Number : 374-385
Article History
Accepted: 20 Nov 2022
Published: 04 Dec 2022

Copyright: © the author(s), publisher and licensee Technoscience Academy. This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Jay Kumar Jain et al Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol., November-December-2022, 8 (6) : 374-385

I. INTRODUCTION

Most of the devices we use in the modern era of computing are now part of an Internet of Things (IoT) ecosystem that is connected to the Internet. The Internet, an unsecured (open) communication channel, is the conduit through which these devices exchange and transfer their data [1]. This data is usually sensitive (e.g., healthcare data, insurance data, banking data, other finance-related data, and social security numbers). Online attackers (hackers) are continuously on the lookout for ways to compromise it; for example, they can launch attacks such as man-in-the-middle, replay, credential guessing, impersonation, malware insertion, session key computation, and data manipulation [2]. As a result, researchers regularly propose different security measures to lessen these dangers. Security protocols, or cyber security protocols, can be divided into the following categories: authentication protocols, access control protocols, intrusion detection protocols, key management protocols, and security techniques using blockchains. A summary of these protocols is provided below.
• Authentication protocols: Authentication is the process of determining whether a person or a device is who they claim to be. It can be carried out using credentials or factors that are directly related to the user or device (for example, username, password, smartcard, and biometrics). Authentication can be user to user, user to device, or device to device. User authentication protocols can also be divided into three groups based on the number of factors used: one-factor, two-factor, and three-factor user authentication protocols.
• Access control protocols: Access control is the process of restricting unauthorized access by a person or a device. After all phases of a user/device access control protocol have been completed, users or devices can securely access other users or devices. The two types of access control protocols are (1) user access control and (2) device access control. A device access control protocol is used to keep out unauthorized devices, whereas a user access control protocol is used to keep out unauthorized users. Access control methods include certificate-based and certificate-less approaches. Authorization is the process, carried out by an authority (a server), through which an entity (a client) is deemed to have permission to use a resource. It is typically done in conjunction with authentication so that the server can identify the client making the access request. It establishes who is permitted to use the resource.
• Intrusion detection protocols: An intrusion is anything or anyone acting with malicious intent. This could be a hacker-controlled system that launches attacks over the Internet or a malicious script. Typically, hackers attempt to introduce malware into online devices (systems) to degrade their performance or compromise their security. We therefore require a class of protocols, known as "intrusion detection protocols," for the detection and mitigation of intrusions. Intrusion detection can be carried out in a variety of ways, including signature-based intrusion detection, anomaly-based intrusion detection, and hybrid intrusion detection, which combines the signature-based and anomaly-based approaches. Malware detection techniques based on machine learning or deep learning are becoming increasingly popular.
• Key management protocols: These protocols are used to securely manage keys between different entities, including devices (such as Internet of Things (IoT) devices and smart vehicles) and users (such as a smart home user, doctor, or traffic inspector). Typically, a trusted registration
authority registers all of the communication system's entities and then stores the secret credentials (i.e., secret keys) in their memory. We require a key management method for creating new keys, storing them in devices, establishing keys, and revoking keys. After establishing a shared secret key (also known as a session key), which can be accomplished through the phases of an authenticated key agreement protocol, the devices/users can transmit their information securely [28].
• Security protocols that use blockchain technology: Blockchain is one of the rising technologies of this era. Data is kept on a blockchain in the form of blocks that are linked together using hash values. Blockchain maintains data using distributed ledger technology (DLT). The distributed ledger is accessible to all legitimate participants (and occasionally miners) in the network. The data that we store on the blockchain is protected from a variety of potential cyberattacks; consequently, blockchain-based security mechanisms can protect against a variety of cyberattacks [3].
Machine learning (ML) is the process by which computing systems learn from data and use algorithms to carry out tasks without being explicitly programmed. Deep learning (DL), a subfield of AI, is a form of ML. DL is built on a complex set of algorithms modeled on the human brain, which enables the processing of unstructured data, including text, images, and documents. ML describes a computer's capacity to reason and act independently of human intervention; DL, however, often requires even less ongoing human assistance. As a result, it performs image, video, and unstructured

Combining machine learning and cyber security can benefit us in a number of ways, for instance through improved cyber security techniques, increased machine learning model security, and more effective zero-day attack detection with less human involvement. However, the combination may encounter a variety of issues and security obstacles, which should be handled with caution. We therefore need a review study in this field that addresses the "uniting of cyber security and machine learning," i.e., issues and challenges, various attacks, different protection schemes with a comparative study, and future research directions on which other researchers should concentrate. We have therefore attempted such a study in this work [4,5].

II. RELATED WORK

Samson Ho et al. [6] propose a Convolutional Neural Network-based Intrusion Detection System (IDS) for enhancing Internet security. The proposed IDS model classifies all network packet traffic as benign or malicious in order to detect network intrusions. The Canadian Institute for Cybersecurity's CICIDS2017 dataset was used to train and validate the proposed model. The model was evaluated in terms of overall accuracy, attack detection rate, false alarm rate, and training overhead, and its effectiveness was compared with that of nine other well-known classifiers.
Praneeth Narisetty et al. [7] comparatively evaluate support vector machines (SVM), ANNs, CNNs, Random Forests (RF), and deep learning algorithms on the recent CICIDS2017 dataset. The deep learning algorithm fared worse than SVM, ANN, RF, and CNN. As future work, the authors plan to use this dataset to study port scan attempts as well as other attack types by combining AI and deep learning algorithms with Apache Hadoop and Spark technologies. The study determines which algorithm achieves the best accuracy in order to ascertain
whether or not a cyberattack occurred, comparing four algorithms: SVM, ANN, RF, and CNN.
Emrah Tufan et al. [7] investigate network intrusion attempts using anomaly-based machine learning models, which offer better security than traditional misuse-based approaches. Two models, an ensemble learning model and a convolutional neural network model, were built and applied to a data set acquired from a real-world institutional production environment. The models were also applied to the UNSW-NB15 benchmark data set to show their validity and reliability. To keep the scope of the study manageable, the type of attack was restricted to probing attacks. The results showed high accuracy rates, with the CNN model being marginally more accurate.

Iqbal H. Sarker et al. [8] introduce "IntruDTree," a machine-learning-based intrusion detection tree security model. The model first ranks security features by importance and then builds a tree-based generalized intrusion detection model from the important features selected. By reducing the feature dimensions, it lowers computational complexity while remaining accurate in its predictions for unseen test cases. The performance of the IntruDTree model was evaluated on cybersecurity datasets by computing precision, recall, F-score, accuracy, and ROC values. To assess the efficacy of the resulting security model, its results were also compared with those of several popular traditional machine learning techniques, including the naive Bayes classifier, logistic regression, support vector machines, and k-nearest neighbor.

Mohamed M. et al. [9] propose an IDS with two layers. The first layer classifies the network connection according to the service being used and then identifies a minimal set of features that improves the detection accuracy of malicious activity on that service. The second layer uses those features to classify each network connection as an attack or normal activity using a pattern recognition technique. Two multivariate normal statistical models, the normal behavior model and the attack behavior model, are produced during the training phase. In the testing and operating phases, these two models are used to classify a network connection as attack or normal activity using a maximum likelihood estimation function. The experimental results demonstrate the advantage of the proposed IDS over comparable IDSs for network intrusion detection: it achieves a detection rate (DR) of 97.5%, a false alarm rate (FAR) of 0.001, a Matthews correlation coefficient (MCC) of 95.7%, and an overall accuracy of 99.8% using just four features.

III. UNITING CYBER SECURITY AND MACHINE LEARNING

3.1 Machine learning in cyber security
Attacks such as replay, man-in-the-middle (MiTM), impersonation, credential leakage, password guessing, session key leakage, unauthorized data update, malware injection, flooding, denial of service (DoS), and distributed denial of service (DDoS), among others, can be carried out against connected systems in cyberspace. We therefore need security mechanisms to recognize and stop these attacks. Given a pre-processed dataset, machine learning models (ML algorithms) can learn about various cyber-attacks in offline or online mode. In online mode, the ML algorithms identify any indication of intrusion (such as a cyberattack) in real time. Figure 1 shows the scenario of "machine learning in cyber security." In this case, a system that is connected to the Internet (such as a laptop, desktop, smartphone, or IoT device) can be used to carry out a variety of online operations, such as financial transactions, online access to healthcare data, social
security numbers, etc. Hackers are constantly looking for weaknesses in these systems, and when they find one, they launch an attack. Depending on the context, several ML techniques, including supervised learning, unsupervised learning, reinforcement learning, and deep learning, can be used for the detection and mitigation of cyber-attacks. Which of these techniques best suits a system depends on its communication environment and the resources available to it. Because cloud servers have ample processing and storage capacity, they can be used to train models on, and test predictions of, cyberattacks.

Fig. 1. Scenario of machine learning in cyber security.

3.2 Cyber security in machine learning
Figure 2 presents a scenario for "cyber security in machine learning," also known as machine learning (ML) security. ML models are employed for the analysis and prediction of numerous events. However, certain attacks, such as model poisoning attacks and dataset poisoning attacks, can negatively impact the performance of ML models [6]. These attacks may cause ML models to predict the associated phenomena incorrectly. In a "dataset poisoning attack," an attacker introduces adversarial examples (altered values) into the dataset, which leads the ML model to make incorrect predictions. In a "model poisoning attack," the attacker's goal is to corrupt the models by tampering with their internal operations and changing their parameters. In a "privacy breach attack," the attacker seeks to extract useful information about the model and to expose sensitive data; a membership inference attack is one example of a privacy breach. Additionally, in a "runtime disruption attack," the attacker subverts the ML workflow by attacking the model's execution process, which affects the accuracy of the prediction results. To defend against these attacks, specific cyber security methods are needed (such as encryption techniques, signature generation and verification techniques, and hashing processes). When the ML models and the related datasets are secured using these methods, the predicted results remain accurate.

Fig. 2. Scenario of cyber security in machine learning.

IV. ADVANTAGES OF UNITING CYBER SECURITY AND MACHINE LEARNING

Machine learning and cyber security are each important to the other and can enhance each other's performance. The following are some advantages of their union.
• Comprehensive security of ML models: The ML models are susceptible to a variety of attacks, as discussed earlier. These attacks may affect how well the ML models function, perform, and predict. However, they can be prevented by implementing specific cyber security measures. When cyber security mechanisms are used, the functioning, performance, and input datasets of the ML models are safeguarded, and we obtain accurate predictions and outcomes [7].
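As one illustration of how a hashing process can help safeguard a training dataset against the dataset poisoning attack described in Section 3.2, the following minimal Python sketch records a keyed digest (HMAC-SHA256) of a dataset and verifies it again before training. It is only a sketch under stated assumptions, not a method taken from the reviewed papers: the function names `fingerprint_dataset` and `verify_dataset`, the sample records, and the key are hypothetical, and a real deployment would also need secure key storage.

```python
import hashlib
import hmac


def fingerprint_dataset(data: bytes, key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of the raw dataset bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_dataset(data: bytes, key: bytes, expected: str) -> bool:
    """Check the dataset against a previously recorded digest.

    compare_digest runs in constant time, so it does not leak how many
    leading characters of the digest matched.
    """
    return hmac.compare_digest(fingerprint_dataset(data, key), expected)


# Hypothetical example data: two labelled records in CSV form.
key = b"demo-key-not-for-production"
clean = b"duration,bytes,label\n0.2,181,normal\n0.0,239,attack\n"
recorded = fingerprint_dataset(clean, key)

# An attacker flips one label (a simple form of dataset poisoning).
poisoned = clean.replace(b"239,attack", b"239,normal")

print(verify_dataset(clean, key, recorded))     # unchanged data passes
print(verify_dataset(poisoned, key, recorded))  # tampered data fails
```

A check of this kind only detects tampering that happens after the digest was recorded; it cannot tell whether the data was already poisoned at the source.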

• Enhanced effectiveness of cyber security methods: When we apply ML algorithms to cyber security schemes (such as intrusion detection systems), their efficacy increases (i.e., improved accuracy and detection rate with a lower false positive rate). Depending on the communication environment and the associated systems, ML techniques including supervised learning, unsupervised learning, reinforcement learning, and deep learning algorithms can be applied.
• Effective zero-day attack detection: ML-based intrusion detection systems appear to be particularly effective at detecting zero-day attacks (i.e., unknown malware attacks). This is because they carry out the detection using deployed ML models. An ML model works by gathering and matching specific features; if the features of a program match those of a malicious program, then that program can be regarded as malicious. The ML models are capable of carrying out this detection automatically. Thus, by combining cyber security with machine learning, it is possible to identify zero-day threats effectively.
• Limited need for human intervention: In ML-based systems, deployed ML models handle the majority of the tasks. When we combine ML and cyber security, most of the tasks these systems perform are completed entirely or largely without human assistance.
• Rapid scanning and mitigation: Because they use specific ML algorithms, ML-based intrusion detection systems are particularly effective at detecting the presence of threats. As a result, the combination of machine learning with cyber security systems performs intrusion screening very quickly and also offers a quick response to any sign of intrusion. The main concern is the selection of an appropriate ML algorithm.

IV. OVERVIEW OF VARIOUS THREATS AND ATTACKS
We describe the following attacks in detail in this section since they can happen in different computing environments.
• Eavesdropping: Eavesdropping is a passive attack, sometimes referred to as a sniffing or snooping attack. In this attack, an adversary tries to overhear the private conversation of the communicating parties.
• Traffic analysis: This attack is also passive in nature. An adversary A intercepts the ongoing conversation and examines the messages to gather information about the nature, pattern, and behavior of the communication as well as its location and timing. The intercepted information also helps A carry out related cyberattacks.
• Replay attack: In this attack, A deliberately retransmits intercepted messages that were sent previously. A does this to deceive or mislead the recipient and to coerce legitimate users into acting in accordance with A's wishes.
• Man-in-the-middle (MiTM) attack: In this active attack, A establishes separate connections with the communicating entities and relays messages between them. The two communicating entities believe they are speaking directly to one another. A can therefore intercept, delete, modify, or insert messages without being noticed [8].
• Impersonation attack: This type of attack is also active in nature. A impersonates one of the network's legitimate parties by determining its identity, and then sends modified or entirely new messages on that party's behalf to the other legitimate party.
• Denial-of-service (DoS) attack: In a DoS attack, A floods the victim's computational resources with a large number of bogus requests (e.g., HTTP flood messages). As a result, the legitimate user's service requests cannot be handled, and the network's services become inaccessible to that user. A distributed denial-of-service (DDoS) attack is another type of DoS attack, in which A uses a network of devices (a botnet) to send many requests to the victim's machine at once, quickly using up all of the system's processing capacity. DoS or DDoS attacks can be carried out using many flooding
techniques, such as SYN flooding, HTTP flooding, UDP flooding, etc.
• Malware attack: These attacks are carried out by having malicious scripts executed on the victim's computer. Malware is a file or a piece of code that, once introduced or installed, carries out unauthorized operations on a system, such as data theft, unauthorized drive or data encryption, data manipulation, or data deletion. Examples include keyloggers, spyware, viruses, ransomware, worms, and trojan horses [6].
• Scripting attack: These attacks involve the leakage of data from an online database maintained by a web server (e.g., an online banking database). For instance, "password cracking," "SQL injection," and "cross-site scripting (XSS)" attacks can be used to obtain sensitive data from the system, such as passwords, credit card information, and debit card details.
• Privileged insider attack: Any privileged user of the system who has access to the registration data of various users and devices can carry out this attack. Because the insider has access to sensitive data, this attack is far more difficult to defend against and has a more damaging effect.
• Physical theft of smart devices: Today, the majority of computing environments are run using smart devices, such as smart household appliances, smart healthcare devices, and smart manufacturing machines. These smart devices are deployed without physical security. If an adversary A physically steals them, A can extract sensitive data from them using power analysis attacks. Once the sensitive data has been extracted, unauthorized operations such as computing session keys can be carried out [9].
• Birthday attack: This cryptographic attack exploits the mathematics of the birthday problem from probability theory. Birthday attacks can be used for malicious purposes such as credential (password) guessing. The attack relies on the fact that, for a fixed number of possible values, collisions between random attack attempts are more likely than intuition suggests, as described by the birthday paradox. The birthday paradox (also known as the birthday problem) concerns the probability that some pair of individuals in a group of n randomly chosen individuals shares a birthday. The mathematics underlying this problem inspired the birthday attack, a well-known cryptographic attack that uses this probabilistic approach to reduce the difficulty of finding a collision in a hash function [10].
• Dictionary attack: A dictionary attack is a specific kind of brute force attack on a cryptographic system. The attacker tries to defeat the system's security by systematically entering every word in a dictionary as a password. A dictionary attack can also be used to determine the key required to decrypt an encrypted message or document. The attacker uses a regularly updated library of phrases or keywords to try to bypass encryption or gain access. Dictionary words or numerical sequences can be submitted to the target automatically. Weak password practices, such as changing passwords to ones that merely contain consecutive numbers, symbols, or letters, make dictionary attacks easier. The attack works because some people use common words as passwords. Systems that employ multi-word passwords are often resistant to these attacks. Furthermore, it is difficult for an attacker to crack passwords consisting of a random mix of digits, uppercase, and lowercase characters [10].
• Stolen verifier attack: In this attack, an attacker first steals some devices (such as smart IoT devices) and then launches a power analysis attack on their memory units to extract sensitive data (such as secret passwords and keys). The attacker also eavesdrops on some of the exchanged messages in order to launch further network attacks, such as unauthorized session key computation, password guessing, MiTM, and impersonation attacks.
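The probability behind the birthday attack described in this section can be made concrete. For n values drawn uniformly at random from a space of N possible values, the chance that at least two coincide is 1 - (N/N)((N-1)/N)...((N-n+1)/N), and a 50% chance is reached after roughly sqrt(2 ln 2 · N) draws. The Python sketch below evaluates both expressions. It assumes ideal, uniformly distributed outputs, and the function names `collision_probability` and `draws_for_half` are hypothetical names chosen for this illustration.

```python
import math


def collision_probability(n: int, space: int) -> float:
    """Probability that n uniform random draws from `space` values
    contain at least one repeated value (the birthday problem)."""
    if n > space:
        return 1.0  # pigeonhole principle: a repeat is guaranteed
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (space - i) / space
    return 1.0 - p_all_distinct


def draws_for_half(space: int) -> int:
    """Approximate number of draws needed for a 50% collision chance,
    using the standard estimate sqrt(2 * ln 2 * space)."""
    return math.ceil(math.sqrt(2 * math.log(2) * space))


print(collision_probability(23, 365))  # just above 0.5
print(draws_for_half(365))             # 23
print(draws_for_half(2 ** 64))         # roughly 1.18 * 2**32
```

Under these assumptions, a hash with a 64-bit output is expected to yield a collision after only a few billion attempts, far fewer than the 2**64 values in its output space, which is why the birthday bound matters when choosing a hash output length.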

• Attack on the computation of the session key encryption mechanisms. Additionally, it makes it
without authorization: In this malicious behavior, a possible for the user to alter the model. As the
hacker attempts to compute the session key that is confidentiality of the sensitive data may be
established between the network's legitimate compromised, it raises the privacy risks connected
organizations. The attacker employs a variety of with the data [14]. The many privacy-preserving
techniques to complete this task, including physical strategies to safeguard the model's privacy were
device theft, insider access, and stolen verifier attacks. described by paper not et al. [15,16]. Additionally,
It is generally advised to compute the session keys they talked about how to "randomize the behavior of
using both short term secrets (i.e., random secret the model" [17] in order to use noise creation to give
nonce values) and long term secrets (i.e., pseudo identities, secret keys). This technique provides unique keys to various entities during several sessions. Consequently, even if one session key is disclosed to the attacker, the other session keys will still be protected, and the remaining portion of the connection will still be secure.
• Attacks on machine learning (ML) models can be roughly grouped into four categories: (a) dataset poisoning, (b) model poisoning, (c) privacy breach attack, and (d) runtime disruption attack [11].
• Dataset poisoning attack: In this attack, A uses a variety of techniques to access the training and test data in order to interfere with the ML task's regular operation. A can attack the data server from which the raw data are extracted by using adversarial examples. Compromising the data sources enables the insertion of false data, which may change how the ML model operates and, in turn, alter the output of the ML-based system [12].
• Model poisoning attack: In a model poisoning attack, A changes parameters and thereby interferes with the classifier so that it produces faulty output. The parameters concerned are those used to create the ML model. A can alter the sensitivity limits and the rate of accession and can cause under- or over-fitting, which further disrupts the normal execution of ML tasks [13].
• Privacy violation: Using a variety of techniques, an attacker may compromise the internal workings of the model and the user's sensitive data. Data leakage can occur during the ML task's training and deployment phases because of unprotected files and a lack of differential privacy.
• Runtime disruption attack: A uses this attack to delay or terminate the ongoing ML task. A typically attacks the server during the deployment process and then attempts to disrupt the running ML process remotely. The result is a disruption of the ML task's regular operation, which wastes time and resources. Through various attacks, including phishing, denial-of-service (DoS), and SQL injection, A penetrates the runtime server by identifying its weak points (vulnerabilities). Decentralizing the ML workspace can reduce the impact of this attack. Blockchain-based mechanisms can further be used to divide the work and implement "distributed machine learning," which helps to safeguard the accuracy and privacy of user data and the related information.

V. ISSUES AND CHALLENGES OF UNITING CYBER SECURITY AND MACHINE LEARNING

Although combining machine learning and cyber security has several benefits, it also raises various problems and difficulties that must be treated with extreme caution. The list below discusses a few of them.
• Problems with compatibility: The combination of machine learning and cyber security involves a variety of machine learning and security approaches, including convolutional neural networks (CNNs), clustering, classification, and signature generation and verification algorithms. Additionally, the data that serves as the primary input for the analytic process
originates from a variety of sources, including IoT devices. These Internet of Things devices operate over different communication methods. Compatibility issues may therefore arise when these various algorithms are merged. We must be highly selective about which algorithms and schemes complement each other, and compatibility-related concerns need to be handled with extreme caution [18].
• Overloading: As stated above, we use a variety of methods to combine machine learning with cyber security. Running these algorithms requires additional resources; without them, the system will not operate properly. Combining and using different algorithms may therefore overwhelm the system and impair how well it actually functions. For instance, we cannot devote the entire system's resources to security-related operations, because ML-related tasks also require resources. We should therefore choose the algorithms carefully and in accordance with the resources of the communication environment. For instance, for the secure communication of IoT devices we would choose the Advanced Encryption Standard (AES) algorithm, a symmetric-key encryption scheme, instead of any public key cryptographic algorithm, because it has lower computation, communication, and storage costs than public key cryptographic algorithms. In that case, system resources remain available for other crucial tasks.
• Accuracy: When combining machine learning and cyber security, we employ a variety of ML processes, or models, to make predictions about certain physical events (e.g., the chance of a roadside accident in an intelligent transportation system). The ML models rely on certain datasets to function; errors in either the dataset or the ML model's settings might therefore cause serious problems. For instance, the predictions may not be entirely accurate [19].
• Security system flaws: When combining ML and cyber security, we may employ a variety of cyber security systems. If these mechanisms have any weaknesses, the system's security may suffer as a result. Attackers often look for zero-day vulnerabilities to exploit later. In such circumstances, the system's sensitive data may be exposed, altered, or rendered unavailable. Security protocol designers should therefore use extreme caution when creating new security protocols. The security of a newly constructed protocol can be verified formally with specific tools, such as the Automated Validation of Internet Security Protocols and Applications (AVISPA) [20], which tests the security of the protocol against replay and man-in-the-middle attacks. In addition, the Burrows-Abadi-Needham (BAN) logic test [21] can be used to determine whether the protocol provides secure mutual authentication among the communicating entities. Beyond these, we can analyze the formal security of a security protocol using the Real-or-Random (ROR) model [22], which highlights the potential for an attack on the planned authentication, access control, or key management protocol that involves unauthorized session key computation. In this way, the security of the designed protocol can be assessed and examined.

VI. FUTURE RESEARCH

The security of a newly constructed protocol can be verified formally with specific tools, such as the Automated Validation of Internet Security Protocols and Applications (AVISPA) [20], which tests the security of the protocol against replay and man-in-the-middle attacks. In addition, the Burrows-Abadi-Needham (BAN) logic test [21] can be used to determine whether the protocol provides secure mutual authentication among the communicating entities. Beyond these, we can analyze the formal security of a security protocol using the Real-or-Random (ROR) model [22], which highlights the potential for an attack on the
planned authentication, access control, or key management protocol that involves unauthorized session key computation. In this way, the security of the designed protocol can be assessed and examined [23]. New security protocols are therefore needed that offer more security and functionality features and can also withstand zero-day vulnerabilities [24].
• The compatibility of various tools and mechanisms: The "uniting of cyber security and ML" makes use of a variety of methods and tools, i.e., numerous security techniques (such as hashing techniques, machine learning algorithms like CNNs and clustering, signature creation and verification algorithms, and encryption algorithms). They also call for various hardware and configurations. Under such conditions, the compatibility of these mechanisms and tools can give rise to problems [25].
• Performance and overloading: We use a number of the previously mentioned techniques to combine machine learning and cyber security. Running these various algorithms requires extra resources; otherwise, the tasks will not be completed properly. The combination and use of numerous algorithms may therefore overload the system and hinder its actual operation. Whether in ML or security, we should carefully select the algorithms and work to develop new, lightweight algorithms that consume fewer system resources [26].
• Increased system accuracy: Since ML models rely on certain datasets to function, errors in either the dataset or the ML model's configuration might lead to issues. For instance, the obtained predictions may not be entirely accurate, or the algorithm may produce incorrect predictions. It is therefore important for researchers to look for solutions to these problems. By developing new techniques, mistakes in datasets can be found and the systems' accuracy can be increased [27].

VII. CONCLUSION

Machine learning is well known and used extensively in a variety of fields. Natural language processing (NLP), which can be used to understand what a person or a piece of text is saying, and image processing for recognition are two of the most popular applications. In several ways, cybersecurity differs from other machine learning use cases. Utilizing machine learning for cybersecurity has its own requirements and obstacles. We will go through three particular difficulties with using ML in cybersecurity as well as three typical but more demanding challenges. Only machine learning can classify complex events and scenarios at scale to enable enterprises to address the challenge of cybersecurity now and in the years to come. This is because more devices and threats are coming online every day, while human security resources are in short supply. In uniting cyber security and machine learning, we presented the details of two distinct concepts: "cyber security in machine learning" and "machine learning in cyber security." The benefits, problems, and challenges of combining ML with cyber security were then reviewed. In addition, we described several attacks and offered a comparison of various strategies in two separate categories. Finally, some recommendations for future research are given.

VIII. REFERENCES

[1]. Chauhan, D., and J. K. Jain. "A Journey from IoT to IoE." International Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075.
[2]. Z. Lv, L. Qiao, J. Li, H. Song, Deep-learning-enabled security issues in the internet of things, IEEE Internet Things J. 8 (12) (2021) 9531–9538.
[3]. Y. Wang, J. Yu, B. Yan, G. Wang, Z. Shan, BSV-PAGS: Blockchain based special vehicles priority access guarantee scheme, Comput. Commun. 161 (2020) 28–40.
[4]. N. Magaia, R. Fonseca, K. Muhammad, A.H.F.N. Segundo, A.V. Lira Neto, V.H.C. de Albuquerque, Industrial internet-of-things security enhanced with deep learning approaches for smart cities, IEEE Internet Things J. 8 (8) (2021) 6393–6405.
[5]. S.A. Parah, J.A. Kaw, P. Bellavista, N.A. Loan, G.M. Bhat, K. Muhammad, V.H.C. de Albuquerque, Efficient security and authentication for edge-based internet of medical things, IEEE Internet Things J. 8 (21) (2021) 15652–15662.
[6]. Ho, Samson, et al. "A novel intrusion detection model for detecting known and innovative cyberattacks using convolutional neural network." IEEE Open Journal of the Computer Society 2 (2021): 14-25.
[7]. Praneeth Narisetty, Pavan Narra "A MACHINE LEARNING APPROACH FOR DETECTING CYBERATTACKS IN NETWORKS", International Journal of Emerging Technologies and Innovative Research (www.jetir.org | UGC and issn Approved), ISSN:2349-5162, Vol.9, Issue 6, page no. ppg26-g31, June-2022, Available at : http://www.jetir.org/papers/JETIR2206605.pdf
[8]. Tufan, Emrah, Cihangir Tezcan, and Cengiz Acartürk. "Anomaly-based intrusion detection by machine learning: A case study on probing attacks to an institutional network." IEEE Access 9 (2021): 50078-50092.
[9]. Sarker, Iqbal H., et al. "Intrudtree: a machine learning based cyber security intrusion detection model." Symmetry 12.5 (2020): 754.
[10]. Abdeldayem, Mohamed M. "Intrusion Detection System Based on Pattern Recognition." Arabian Journal for Science and Engineering (2022): 1-9.
[11]. Y. Sun, A.K. Bashir, U. Tariq, F. Xiao, Effective malware detection scheme based on classified behavior graph in IIoT, Ad Hoc Netw. 120 (2021) 102558.
[12]. J. Yang, Z. Bian, J. Liu, B. Jiang, W. Lu, X. Gao, H. Song, No-reference quality assessment for screen content images using visual edge model and AdaBoosting neural network, IEEE Trans. Image Process. 30 (2021) 6801–6814.
[13]. Y. Zhao, J. Yang, Y. Bao, H. Song, Trustworthy authorization method for security in industrial internet of things, Ad Hoc Netw. 121 (C) (2021).
[14]. T.S. Messerges, E.A. Dabbish, R.H. Sloan, Examining smart-card security under the threat of power analysis attacks, IEEE Trans. Comput. 51 (5) (2002) 541–552.
[15]. M.R.K. Soltanian, I.S. Amiri, Chapter 3 - Problem solving, investigating ideas, and solutions, in: M.R.K. Soltanian, I.S. Amiri (Eds.), Theoretical and Experimental Methods for Defending Against DDOS Attacks, Syngress, 2016, pp. 33–45.
[16]. T. Lei, Z. Qin, Z. Wang, Q. Li, D. Ye, EveDroid: Event-aware android malware detection against model degrading for IoT devices, IEEE Internet Things J. 6 (4) (2019) 6668–6680.
[17]. J. Steinhardt, P.W. Koh, P. Liang, Certified defenses for data poisoning attacks, in: 31st International Conference on Neural Information Processing Systems, in: NIPS’17, Curran Associates Inc. Long Beach, California, USA, 2017, pp. 3520–3532.
[18]. M. Aladag, F.O. Catak, E. Gul, Preventing data poisoning attacks by using generative models, in: 1st International Informatics and Software Engineering Conference, UBMYK, Ankara, Turkey, 2019, pp. 1–5, http://dx.doi.org/10.1109/UBMYK48245.2019.8965459.
[19]. C. Huang, S. Chen, Y. Zhang, W. Zhou, J.J.P.C. Rodrigues, V.H.C. de Albuquerque, A robust approach for privacy data protection: IoT security assurance using generative adversarial imitation learning, IEEE Internet Things J.
(2021) 1, http://dx.doi.org/10.1109/JIOT.2021.3128531.
[20]. N. Papernot, P. McDaniel, X. Wu, S. Jha, A. Swami, Distillation as a defense to adversarial perturbations against deep neural networks, in: 2016 IEEE Symposium on Security and Privacy, 2016, pp. 582–597, http://dx.doi.org/10.1109/SP.2016.41.
[21]. N. Papernot, A marauder’s map of security and privacy in machine learning, in: 11th ACM Workshop on Artificial Intelligence and Security, Toronto, Canada, 2018.
[22]. S. Pirbhulal, W. Wu, K. Muhammad, I. Mehmood, G. Li, V.H.C. de Albuquerque, Mobility enabled security for optimizing IoT based intelligent applications, IEEE Netw. 34 (2) (2020) 72–77.
[23]. Chauhan, Dipti, Jay Kumar Jain, and Sanjay Sharma. "An end-to-end header compression for multihop IPv6 tunnels with varying bandwidth." 2016 Fifth international conference on eco-friendly computing and communication systems (ICECCS). IEEE, 2016.
[24]. Jain, Jay Kumar, Devendra Kumar Jain, and Anuradha Gupta. "Performance analysis of node-disjoint multipath routing for mobile ad-hoc networks based on QOS." International Journal of Computer Science and Information Technologies 3.5 (2012): 5000-5004.
[25]. Waoo, A., and Sanjay Sharma. "Threshold Sensitive Stable Election Multi-path Energy Aware Hierarchical Protocol for Clustered Heterogeneous Wireless Sensor Networks." International Journal of Recent Trends in Engineering & Research 3.09 (2017): 158-16.
[26]. Jain, Jay Kumar, and Sanjay Sharma. "Performance Evaluation of Hybrid Multipath Progressive Routing Protocol for MANETs." International Journal of Computer Applications 71.18 (2013).
[27]. Jain, Jay Kumar, and Akhilesh A. Waoo. "An Analytical Study of Energy Efficient Routing Approaches in Wireless Sensor Network." THEETAS 2022: Proceedings of The International Conference on Emerging Trends in Artificial Intelligence and Smart Systems, THEETAS 2022, 16-17 April 2022, Jabalpur, India. European Alliance for Innovation, 2022.
[28]. J. K. Jain, C. S. Dangi and D. Chauhan, "An Efficient Multipath Productive Routing Protocol for Mobile Ad-hoc Networks," 2020 IEEE International Conference for Innovation in Technology (INOCON), 2020, pp. 1-5, doi: 10.1109/INOCON50539.2020.9298291.

Cite this article as :

Jay Kumar Jain, Akhilesh A. Waoo, Dipti Chauhan, "A Literature Review on Machine Learning for Cyber Security Issues", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 8, Issue 6, pp.374-385, November-December-2022. Available at doi : https://doi.org/10.32628/CSEIT228654
Journal URL : https://ijsrcseit.com/CSEIT228654
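
As a supplementary illustration of the dataset poisoning attack discussed in the paper (the insertion of false data into the training set so that the ML model behaves differently), the following Python sketch may be helpful. It is only a toy example under stated assumptions: the one-dimensional "risk score" feature, the nearest-centroid classifier, and all names (`train_centroids`, `predict`, `accuracy`, the sample values) are hypothetical and are not taken from the reviewed works.

```python
# Toy sketch of a dataset poisoning attack (all data and names are hypothetical).
# A nearest-centroid classifier labels a 1-D "risk score" as benign or malicious.


def train_centroids(samples):
    """Return {label: mean feature value} for a list of (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Assign the label whose centroid lies closest to the given value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def accuracy(centroids, samples):
    """Fraction of (value, label) pairs that the model classifies correctly."""
    correct = sum(1 for value, label in samples if predict(centroids, value) == label)
    return correct / len(samples)


# Assumed clean training data: benign traffic scores low, malicious scores high.
clean_train = [(1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
               (8.0, "malicious"), (9.0, "malicious"), (10.0, "malicious")]
test_set = [(2.5, "benign"), (3.5, "benign"),
            (7.5, "malicious"), (9.5, "malicious")]

# Poisoning step: the adversary inserts false records, i.e. high-score
# samples that are deliberately mislabeled as benign.
poison = [(9.0, "benign"), (10.0, "benign"), (11.0, "benign"), (12.0, "benign")]
poisoned_train = clean_train + poison

clean_model = train_centroids(clean_train)
poisoned_model = train_centroids(poisoned_train)

print("accuracy with clean data:   ", accuracy(clean_model, test_set))
print("accuracy with poisoned data:", accuracy(poisoned_model, test_set))
```

In this sketch the injected records pull the "benign" centroid toward the malicious region, so a malicious test sample with score 7.5 is classified as benign once the training data are poisoned. Real attacks and defenses are considerably more involved; the example only shows why the integrity of the training data matters.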
