Detection of Cyber Attack in Network Using Machine Learning Techniques
ABSTRACT
CONTENTS
Certificate
Declaration
Acknowledgement
Abstract
1. Introduction
1.1 Existing System
1.2 Proposed System
2. Literature Survey
3. System Requirements
3.1 Software Requirements
3.2 Hardware Requirements
4. System Design
5. Implementation
5.1 Coding
6. Results
7. Conclusion
8. Bibliography
1. INTRODUCTION
Cyber terrorism is one of the most significant issues in this day and age. Cyber
terror, which has caused a great deal of trouble for people and institutions, has
reached a level that could threaten public and national security, and is carried out
by various groups such as criminal organizations, professional persons and cyber
activists. For this reason, Intrusion Detection Systems (IDS) have been developed to
defend against cyber attacks. Deep learning and support vector machine (SVM)
algorithms have been used to detect port scan attempts based on the new CICIDS2017
dataset, with accuracy rates of 97.80% and 69.79%, respectively. Besides SVM, other
algorithms such as Random Forest, CNN and ANN can be introduced; the accuracies
obtained are SVM: 93.29%, CNN: 63.52%, Random Forest: 99.93% and ANN: 99.11%.
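The comparison described above can be sketched as follows. This is a minimal
illustration, not the project's actual training code: it assumes scikit-learn is
installed, uses synthetic data from make_classification as a stand-in for CICIDS2017
flow features, and leaves out the CNN because it would need a deep learning
framework. The function name compare_classifiers and all hyperparameters are
hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(seed=0):
    """Train each model on the same split and return {name: test accuracy}."""
    # Synthetic stand-in for labelled flow features (0 = benign, 1 = attack).
    X, y = make_classification(
        n_samples=600, n_features=20, n_informative=8, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )

    # Scale features; SVM and neural networks are sensitive to feature ranges.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    models = {
        "SVM": SVC(kernel="rbf", random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, acc in compare_classifiers().items():
        print(f"{name}: {acc:.4f}")
```

Using one shared train/test split keeps the accuracy figures comparable across
models; the numbers on synthetic data say nothing about results on the real dataset.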
Advantages
• Protection from malicious attacks on your network.
• Removal and/or quarantine of malicious elements within a preexisting network.
• Prevents unauthorized users from accessing the network.
• Denies programs access to certain resources that could be infected.
• Secures confidential information.
2. LITERATURE SURVEY
2.1 R. Christopher, “Port scanning techniques and the defense against them,”
SANS Institute, 2001.
Port Scanning is one of the most popular techniques attackers use to discover services
that they can exploit to break into systems. All systems that are connected to a LAN or
the Internet via a modem run services that listen to well-known and not so well-known
ports. By port scanning, the attacker can find the following information about the
targeted systems: what services are running, what users own those services, whether
anonymous logins are supported, and whether certain network services require
authentication. Port scanning is accomplished by sending a message to each port, one at
a time. The kind of response received indicates whether the port is used and can be
probed for further weaknesses. Port scanners are important to network security
technicians because they can reveal possible security vulnerabilities on the targeted
system. Just as port scans can be run against your systems, port scans can be detected
and the amount of information about open services can be limited utilizing the proper
tools. Every publicly available system has ports that are open and available for use. The
object is to limit the exposure of open ports to authorized users and to deny access to
the closed ports.
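The probe-and-observe idea described above can be sketched with Python's standard
socket module. This is a minimal TCP connect check, assuming the target is a machine
you own or are authorized to test; the function name scan_ports and the port list
are hypothetical, and real scanners use many other probe types.

```python
import socket


def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Only probe machines you own or are authorized to test.
    print(scan_ports("127.0.0.1", [22, 80, 443, 8000]))
```

A completed connection indicates a listening service; a refusal or timeout indicates
a closed or filtered port, which is the distinction the paragraph above relies on.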
2.2 S. Staniford, J. A. Hoagland, and J. M. McAlerney, “Practical automated
detection of stealthy portscans,” Journal of Computer Security, vol. 10, no. 1-2, pp.
105–136, 2002.
that almost all of them turn out to have come from compromised hosts and thus are very
likely to be hostile. So we think it reasonable to consider a portscan as at least
potentially hostile, and to report it to the administrators of the remote network from
whence it came. However, this paper is focussed on the technical questions of how to
detect portscans, which are independent of what significance one imbues them with, or
how one chooses to respond to them. Also, we are focussed here on the problem of
detecting a portscan via a network intrusion detection system (NIDS). We try to take
into account some of the more obvious ways an attacker could use to avoid detection,
but to remain with an approach that is practical to employ on busy networks. In the
remainder of this section, we first define portscanning, give a variety of examples at
some length, and discuss ways attackers can try to be stealthy. In the next section, we
discuss a variety of prior work on portscan detection. Then we present the algorithms
that we propose to use, and give some very preliminary data justifying our approach.
Finally, we consider possible extensions to this work, along with other applications that
might be considered. Throughout, we assume the reader is familiar with Internet
protocols, with basic ideas about network intrusion detection and scanning, and with
elementary probability theory, information theory, and linear algebra. There are two
general purposes that an attacker might have in conducting a portscan: a primary one,
and a secondary one. The primary purpose is that of gathering information about the
reachability and status of certain combinations of IP address and port (either TCP or
UDP). (We do not directly discuss ICMP scans in this paper, but the ideas can be
extended to that case in an obvious way.) The secondary purpose is to flood intrusion
detection systems with alerts, with the intention of distracting the network defenders or
preventing them from doing their jobs. In this paper, we will mainly be concerned with
detecting information gathering portscans, since detecting flood portscans is easy.
However, the possibility of being maliciously flooded with information will be an
important consideration in our algorithm design. We will use the term scan footprint for
the set of port/IP combinations which the attacker is interested in characterizing. It is
helpful to conceptually distinguish the footprint of the scan, from the script of the scan,
which refers to the time sequence in which the attacker tries to explore the footprint.
The footprint is independent of aspects of the script, such as how fast the scan is,
whether it is randomized, etc. The footprint represents the attacker’s information
gathering requirements for her scan, and she designs a scan script that will meet those
requirements, and perhaps other non-information-gathering requirements (such as not
being detected by an NIDS). The most common type of portscan footprint at present is a
horizontal scan. By this, we mean that an attacker has an exploit for a particular service,
and is interested in finding any hosts that expose that service. Thus she scans the port of
interest on all IP addresses in some range of interest. Also at present, this is mainly
being done sequentially on TCP port 53 (DNS).
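To make the notion of a horizontal scan concrete, the sketch below flags any source
that contacts the same port on several distinct hosts. It is a simple threshold
heuristic written for illustration, not the detection algorithm proposed in the
cited paper; the function name, the record layout (src_ip, dst_ip, dst_port) and the
threshold are assumptions.

```python
from collections import defaultdict


def find_horizontal_scans(connections, min_hosts=3):
    """Flag (source, port) pairs that touch at least `min_hosts` distinct hosts.

    `connections` is an iterable of (src_ip, dst_ip, dst_port) tuples.
    """
    targets = defaultdict(set)
    for src_ip, dst_ip, dst_port in connections:
        # Group by source and port; collect the distinct hosts that were probed.
        targets[(src_ip, dst_port)].add(dst_ip)
    return {
        key: len(hosts) for key, hosts in targets.items() if len(hosts) >= min_hosts
    }


if __name__ == "__main__":
    log = [
        ("10.0.0.9", "192.168.1.1", 53),
        ("10.0.0.9", "192.168.1.2", 53),
        ("10.0.0.9", "192.168.1.3", 53),
        ("10.0.0.5", "192.168.1.1", 80),
        ("10.0.0.5", "192.168.1.1", 443),
    ]
    print(find_horizontal_scans(log))  # {('10.0.0.9', 53): 3}
```

Because the grouping ignores timing and order, it depends only on the footprint of
the scan and not on its script, so a slow or randomized scan is counted the same way.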
2.3 M. C. Raja and M. M. A. Rabbani, “Combined analysis of support vector
machine and principle component analysis for ids,” in IEEE International
Conference on Communication and Electronics Systems, 2016, pp. 1–5.
Compared to the past, the security of networked systems has become a critical
universal issue that influences individuals, enterprises and governments. The rate of
attacks against networked systems has increased dramatically, and the strategies used
by the attackers continue to evolve. Concerns include, for example, the privacy of
important information, the security of stored data platforms, and the availability of
knowledge. Given these problems, cyber terrorism is one of the most important issues
in today’s world. Cyber terror, which has caused many problems for individuals and
institutions, has reached a level that could threaten public and national security,
and is carried out by various groups such as criminal organizations, professional
persons and cyber activists. Intrusion detection is one of the solutions against these
attacks. A free and effective approach for designing Intrusion Detection Systems
(IDS) is machine learning. In this study, deep learning and support vector machine
(SVM) algorithms were used to detect port scan attempts based on the new CICIDS2017
dataset. A Network Intrusion Detection System (IDS) is a software-based application
or a hardware device that is used to identify malicious behavior in the network
[1,2]. Based on the detection technique, intrusion detection is classified into
anomaly-based and signature-based. IDS developers employ various techniques for
intrusion detection. Information security is the process of protecting information
from unauthorized access, usage, disclosure, destruction, modification or damage. The
terms “information security”, “computer security” and “information assurance” are
often used interchangeably. These areas are related to each other and share the
common goals of providing availability, confidentiality, and integrity of
information. Studies show that the first step of an attack is discovery [1].
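The combination named in the title of this reference, principal component analysis
followed by an SVM, can be sketched as a scikit-learn pipeline. This is an
illustration under stated assumptions and not the cited authors' implementation: the
data is synthetic, and the function name train_pca_svm and the number of components
are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def train_pca_svm(n_components=5, seed=0):
    """Fit scaler -> PCA -> SVM and return (model, test accuracy)."""
    # Synthetic stand-in for labelled network traffic features.
    X, y = make_classification(
        n_samples=500, n_features=30, n_informative=6, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    # PCA reduces the 30 scaled features to a few components before the SVM.
    model = make_pipeline(
        StandardScaler(), PCA(n_components=n_components), SVC(kernel="rbf")
    )
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


if __name__ == "__main__":
    _, accuracy = train_pca_svm()
    print(f"PCA + SVM test accuracy: {accuracy:.4f}")
```

Wrapping the steps in one pipeline ensures the scaler and PCA are fitted on the
training split only, so no information from the test split leaks into the model.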
3. SYSTEM REQUIREMENTS
4. SYSTEM DESIGN
UML DIAGRAMS
There are several types of UML diagrams and each one of them serves a different
purpose regardless of whether it is being designed before the implementation or after
(as part of documentation).
The two broadest categories that encompass all other types are behavioral UML
diagrams and structural UML diagrams. As the names suggest, some UML diagrams
analyze and depict the structure of a system or process, whereas others describe
the behavior of the system, its actors, and its building components. The diagram
types are broken down as follows:
Class Diagram
Use Case Diagram
Sequence Diagram
Just like any other thing in life, in order to get something done properly, you need the
right tools. For documenting software, processes or systems, you need the right tools
that offer UML annotations and UML diagram templates. There are different
software documentation tools that can help you draw a UML diagram. They are
generally divided into these main categories:
Paper and pen – this one is a no-brainer. Pick up a paper and a pen, open up a UML
syntax cheat sheet from the web and start drawing any diagram type you need.
Online tools – there are several online applications that can be used to draw a UML
diagram. Most of them offer free trials or a limited number of diagrams on the free
tier. If you are looking for a long-term solution for drawing UML diagrams, it is
generally more beneficial to buy a premium subscription for one of the applications.
Free Online Tools – these do pretty much the same thing that the paid ones do. The
main difference is that the paid ones also offer tutorials and ready-made templates for
specific UML diagrams. A great free tool is draw.io.
Desktop application – a typical desktop application to use for UML diagrams and
almost any other sort of diagram is Microsoft Visio. It offers advanced options and
functionality. The only downside is that you have to pay for it.
CLASS DIAGRAM
A class diagram describes the attributes and operations of a class and also the
constraints imposed on the system. Class diagrams are widely used in the
modeling of object-oriented systems because they are the only UML diagrams that
can be mapped directly to object-oriented languages.
[Figure: class diagram. Labels: User, agriculture, System. Operations: Start(),
Localhost(), Register & Login to Application(), Real Time Malware Detection(),
Data Stores in SQL(), User Add Data(), Attack Classification based on model(),
Detection of Attack(), Visualisation(), end().]
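As an illustration of how a class diagram maps to an object-oriented language, the
sketch below turns a few of the operations listed in the diagram into Python classes.
It is a hypothetical skeleton, not the project's code: the method bodies, the
in-memory list standing in for the SQL store, and the classify callable are all
assumptions.

```python
class User:
    """Actor side of the diagram: opens the application and signs in."""

    def __init__(self, name):
        self.name = name
        self.logged_in = False

    def register_and_login(self):
        # Placeholder: a real application would check stored credentials.
        self.logged_in = True
        return self.logged_in


class System:
    """System side of the diagram: stores records and classifies them."""

    def __init__(self, classify):
        self.classify = classify  # any callable mapping a record to a label
        self.records = []  # stand-in for the SQL store in the diagram

    def add_data(self, record):
        self.records.append(record)

    def detect_attacks(self):
        # Pair every stored record with the label the classifier assigns to it.
        return [(record, self.classify(record)) for record in self.records]


if __name__ == "__main__":
    user = User("analyst")
    user.register_and_login()
    system = System(lambda r: "attack" if r["ports_probed"] > 100 else "benign")
    system.add_data({"ports_probed": 500})
    print(system.detect_attacks())
```

Each box in the diagram becomes a class, and each listed operation becomes a method.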
USE CASE DIAGRAM
A use case diagram is a dynamic or behavior diagram in UML. Use case diagrams
model the functionality of a system using actors and use cases. Use cases are a set of
actions, services, and functions that the system needs to perform. In this context, a
"system" is something being developed or operated, such as a web site. The "actors"
are people or entities operating under defined roles within the system.
[Figure: use case diagram. Use cases: Start, Localhost, Detection of Attack,
Visualisation, End.]
SEQUENCE DIAGRAM
[Figure: sequence diagram. Lifelines: User, System. Messages: Start, Localhost,
Detection of Attack, Visualisation.]
5.1 CODING:
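settings.py
"""
Django settings for QuickShorts_NewsAggregator project.
Generated by 'django-admin startproject' using Django 3.0.8.
"""
import os

# Project root, used below to locate db.sqlite3.
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

# SECRET_KEY, DEBUG and ALLOWED_HOSTS are omitted from this listing.
INSTALLED_APPS = [
    # ... earlier entries are omitted from this listing ...
    'django.contrib.staticfiles',
    'news',
]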

MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

ROOT_URLCONF = 'QuickShorts_NewsAggregator.urls'

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

WSGI_APPLICATION = 'QuickShorts_NewsAggregator.wsgi.application'

# Database
# https://docs.djangoproject.com/en/3.0/ref/settings/#databases
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}

# Password validation
# https://docs.djangoproject.com/en/3.0/ref/settings/#auth-password-validators
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]

# Internationalization
# https://docs.djangoproject.com/en/3.0/topics/i18n/
LANGUAGE_CODE = 'en-us'
TIME_ZONE = 'UTC'
USE_I18N = True
USE_L10N = True
USE_TZ = True

# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/3.0/howto/static-files/
STATIC_URL = '/static/'
views.py
from django.shortcuts import render
import requests
from bs4 import BeautifulSoup

# Getting news from Times of India
toi_r = requests.get("https://timesofindia.indiatimes.com/briefs")
toi_soup = BeautifulSoup(toi_r.content, 'html5lib')
toi_headings = toi_soup.find_all('h2')
toi_headings = toi_headings[0:-13]  # removing footers
toi_news = []
for th in toi_headings:
    toi_news.append(th.text)

# Getting news from Indian Express
ie_r = requests.get("https://indianexpress.com/article")
ie_soup = BeautifulSoup(ie_r.content, 'html5lib')
ie_headings = ie_soup.find_all('h3')
ie_headings = ie_headings[0:-2]
ie_news = []
for ieh in ie_headings:
    ie_news.append(ieh.text)


def index(req):
    # The body of this view is missing from the listing. The template name
    # and context keys below are assumed from index.html, which loops over
    # toi_news and ie_news.
    return render(req, 'index.html', {'toi_news': toi_news, 'ie_news': ie_news})
urls.py
"""
...
Class-based views
    Add an import: from other_app.views import Home
    Add a URL to urlpatterns: path('', Home.as_view(), name='home')
Including another URLconf
    Import the include() function: from django.urls import include, path
    Add a URL to urlpatterns: path('blog/', include('blog.urls'))
"""
from django.contrib import admin
from django.urls import path
from news import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', views.index, name="home"),
]
index.html
<!DOCTYPE html>
<html>
<head>
  <title>QuickShorts News Aggregator</title>
  <link rel="stylesheet"
        href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css"
        integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm"
        crossorigin="anonymous">
</head>
<body>
  <div class="jumbotron">
    <center>
      <h1>QuickShorts News Aggregator</h1>
      <a href="/" class="btn btn-danger">Refresh News</a>
    </center>
  </div>
  <div class="container">
    <div class="row">
      <div class="col-6">
        <h3 class="text-center">News from Times of India</h3>
        {% for n in toi_news %}
        <h5> - {{n}} </h5>
        <hr>
        {% endfor %}
        <br>
      </div>
      <div class="col-6">
        <h3 class="text-center">News from Indian Express</h3>
        {% for ien in ie_news %}
        <h5> - {{ien}} </h5>
        <hr>
        {% endfor %}
        <br>
      </div>
    </div>
  </div>
  <script
    src="https://code.jquery.com/jquery-3.3.1.min.js"
    integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8="
    crossorigin="anonymous"></script>
  <script
    src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.12.9/umd/popper.min.js"
    integrity="sha384-ApNbgh9B+Y1QKtv3Rn7W3mgPxhU9K/ScQsAP7hUibX39j7fakFPskvXusvfa0b4Q"
    crossorigin="anonymous"></script>
  <script
    src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/js/bootstrap.min.js"
    integrity="sha384-JZR6Spejh4U02d8jOt6vLEHfe/JQGiRRSQQxSfFWpi1MquVdAyjUar5+76PVCmYl"
    crossorigin="anonymous"></script>
</body>
</html>
6. RESULTS
Data preprocessing
Data EDA
ML Deploy
From the accuracy scores we conclude that the Decision Tree (DT) and Random Forest
(RF) models give the best accuracy, so we build a pickle file of the trained model
for predicting on user input.
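Saving a trained model to a pickle file and reloading it for prediction can be
sketched as follows. This is a minimal example assuming scikit-learn and synthetic
data; the function names, the file path and the feature layout are hypothetical and
do not reproduce the project's deployed application.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def train_and_save(path, seed=0):
    """Train a Random Forest on synthetic data and pickle it to `path`."""
    X, y = make_classification(n_samples=300, n_features=10, random_state=seed)
    model = RandomForestClassifier(n_estimators=50, random_state=seed).fit(X, y)
    with open(path, "wb") as fh:
        pickle.dump(model, fh)
    return X, model


def load_and_predict(path, features):
    """Load the pickled model and return its label for one feature vector."""
    with open(path, "rb") as fh:
        model = pickle.load(fh)
    # predict expects a 2-D input, so wrap the single sample in a list.
    return model.predict([features])[0]


if __name__ == "__main__":
    X, _ = train_and_save("model.pkl")
    print("Predicted label:", load_and_predict("model.pkl", X[0]))
```

Pickle files should only be loaded from trusted sources, since unpickling can
execute arbitrary code.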
Application
Predict attack -
7. CONCLUSION
In this work, support vector machine, ANN, CNN, Random Forest and deep learning
algorithms were compared on the recent CICIDS2017 dataset. Results show that the
deep learning algorithm performed significantly better than SVM, ANN, RF and CNN.
In future work, we plan to cover not only port scan attempts but also other attack
types, using machine learning and deep learning algorithms together with Apache
Hadoop and Spark technologies on this dataset. All of these algorithms help us
detect cyber attacks in a network. The approach works as follows: many attacks
have occurred over the years, and when such attacks are recognized, the feature
values at which they occurred are stored in datasets. Using these datasets, we
predict whether a cyber attack has taken place. These predictions can be made
with four algorithms: SVM, ANN, RF and CNN. This paper helps identify which
algorithm achieves the best accuracy and therefore gives the best results in
identifying whether a cyber attack has happened.
FUTURE SCOPE
As a future enhancement, we will add further ML algorithms to increase accuracy.
8. REFERENCES
[1] K. Graves, CEH: Official certified ethical hacker review guide: Exam 312-50.
John Wiley & Sons, 2007.
[2] R. Christopher, “Port scanning techniques and the defense against them,” SANS
Institute, 2001.
[3] M. Baykara, R. Daş, and I. Karadoğan, “Bilgi güvenliği sistemlerinde
kullanılan araçların incelenmesi,” in 1st International Symposium on Digital
Forensics and Security (ISDFS13), 2013, pp. 231–239.
[4] S. Staniford, J. A. Hoagland, and J. M. McAlerney, “Practical automated
detection of stealthy portscans,” Journal of Computer Security, vol. 10, no. 1-2, pp.
105–136, 2002.
[5] S. Robertson, E. V. Siegel, M. Miller, and S. J. Stolfo, “Surveillance detection in
high bandwidth environments,” in DARPA Information Survivability Conference
and Exposition, 2003. Proceedings, vol. 1. IEEE, 2003, pp. 130–138.
[6] K. Ibrahimi and M. Ouaddane, “Management of intrusion detection systems
based-kdd99: Analysis with lda and pca,” in Wireless Networks and Mobile
Communications (WINCOM), 2017 International Conference on. IEEE, 2017, pp.
1–6.
[7] N. Moustafa and J. Slay, “The significant features of the unsw-nb15 and the
kdd99 data sets for network intrusion detection systems,” in Building Analysis
Datasets and Gathering
Experience Returns for Security (BADGERS), 2015 4th International Workshop on.
IEEE, 2015, pp. 25–31.
[8] L. Sun, T. Anthony, H. Z. Xia, J. Chen, X. Huang, and Y. Zhang, “Detection and
classification of malicious patterns in network traffic using benford’s law,” in Asia-
Pacific Signal and Information Processing Association Annual Summit and
Conference (APSIPA ASC), 2017. IEEE, 2017, pp. 864–872.
[9] S. M. Almansob and S. S. Lomte, “Addressing challenges for intrusion detection
system using naive bayes and pca algorithm,” in Convergence in Technology
(I2CT), 2017 2nd International Conference for. IEEE, 2017, pp. 565–568.