Internship Report

Uploaded by

LIKHITH

Cyberattack Detection Using Machine Learning Models

1. Introduction:
Cybersecurity is a critical aspect of any organization's infrastructure, as it involves protecting
networks, systems, and data from cyberattacks. As cyber threats become more sophisticated, machine
learning (ML) has emerged as a powerful tool for detecting and responding to these attacks. In this
project, we applied various machine learning algorithms to a network traffic dataset to predict and
detect cyberattacks. The goal was to develop a model that could identify attack types and classify
traffic as either normal or suspicious.
2. Dataset Overview:
The dataset used in this project contains network traffic data and features related to the
communication between source and destination devices. The dataset consists of the following
features:
• Timestamp: The time when a network packet was captured.
• Source_IP: The source IP address from which the traffic originated.
• Destination_IP: The destination IP address to which the traffic was sent.
• Protocol: The communication protocol used (e.g., TCP, UDP).
• Packet_Length: The length of the packet transmitted.
• Duration: The duration of the session.
• Source_Port: The source port used for the communication.
• Destination_Port: The destination port.
• Bytes_Sent: The number of bytes sent in the packet.
• Bytes_Received: The number of bytes received.
• Flags: The flags associated with the network packet.
• Flow_Packets/s: The rate of packet flow per second.
• Flow_Bytes/s: The rate of byte flow per second.
• Avg_Packet_Size: The average size of the packets in the flow.
• Total_Fwd_Packets: The total number of forward packets in the flow.
• Total_Bwd_Packets: The total number of backward packets in the flow.
• Fwd_Header_Length: The length of the header for forward packets.
• Bwd_Header_Length: The length of the header for backward packets.
• Sub_Flow_Fwd_Bytes: The total number of forward bytes in the sub-flow.
• Sub_Flow_Bwd_Bytes: The total number of backward bytes in the sub-flow.
• Inbound: Whether the traffic is inbound (binary value).
• Attack_Type: The type of attack (e.g., DDoS, intrusion).
• Label: The target variable indicating whether the traffic is normal or suspicious.
These features were used to train various machine learning models to classify traffic and detect
potential cyberattacks.
3. Preprocessing and Feature Engineering:
Before training the models, the dataset was preprocessed to handle missing values, encode categorical
variables, and scale numeric features. The following steps were performed:
• Categorical Encoding: Categorical features like Source_IP, Destination_IP, Protocol, and
Flags were encoded using LabelEncoder. This step transformed the categorical values into
numerical representations, making them suitable for machine learning models.
• Date-time Conversion: The Timestamp feature was converted to a Unix timestamp,
representing the time in seconds, to facilitate machine learning model processing.
• Data Scaling: Numeric features like Bytes_Sent, Bytes_Received, and others were scaled
using StandardScaler to ensure all features were on the same scale and improve model
performance.
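The three preprocessing steps above can be sketched in plain Python. This is a minimal illustration of what LabelEncoder and StandardScaler compute, not the project's actual code; the sample values, the timestamp format, and the UTC assumption are hypothetical.

```python
from datetime import datetime, timezone
from statistics import fmean, pstdev


def label_encode(values):
    """Map each distinct category to an integer, in sorted order (as LabelEncoder does)."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


def to_unix_seconds(stamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp string (assumed to be UTC) into Unix seconds."""
    parsed = datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
    return parsed.timestamp()


def standard_scale(column):
    """Shift to zero mean and divide by the population standard deviation."""
    mean = fmean(column)
    std = pstdev(column) or 1.0  # constant column: divide by 1 instead of 0
    return [(x - mean) / std for x in column]


# Hypothetical sample values, not taken from the report's dataset.
protocols = ["TCP", "UDP", "TCP", "ICMP"]
timestamps = ["2024-01-01 00:00:00", "2024-01-01 00:00:10"]
bytes_sent = [100.0, 200.0, 300.0, 400.0]

encoded, classes = label_encode(protocols)
unix_times = [to_unix_seconds(t) for t in timestamps]
scaled = standard_scale(bytes_sent)
```

Label encoding assigns integers by sort order, so the numeric codes for fields such as Source_IP carry no meaningful magnitude; distance-based models like KNN will still treat them as ordered values.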
4. Model Training and Evaluation:
Several machine learning models were trained on the preprocessed dataset, including:
i. K-Nearest Neighbors (KNN): A classification algorithm that predicts the label of a data point
based on the majority class of its nearest neighbors.
ii. Logistic Regression: A linear classifier used to model the probability of an attack.
iii. Decision Tree Classifier: A tree-based model that classifies traffic through a sequence of
feature splits.
iv. Support Vector Machine (SVM): A classification model that separates the classes with a
maximum-margin hyperplane.
v. Random Forest Classifier: An ensemble method that uses multiple decision trees to improve
classification accuracy.
vi. Neural Networks: A deep learning model with multiple layers designed to capture complex
patterns in the data.
Each model was evaluated using metrics such as accuracy, precision, recall, and F1-score. These
metrics help assess the models' performance in detecting cyberattacks and distinguishing them from
normal traffic.
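To make the first model in the list concrete, the following is a minimal sketch of the KNN majority vote. The two-dimensional points, the labels, and the choice of k = 3 are illustrative assumptions; the report does not state the value of k or the distance metric that was used.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to `query`."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)), key=lambda i: dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical scaled features; 0 = normal, 1 = suspicious.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
labels = [0, 0, 0, 1, 1, 1]

near_normal = knn_predict(points, labels, (0.1, 0.1))
near_attack = knn_predict(points, labels, (2.0, 2.0))
```

Because the vote depends on distances, KNN is sensitive to feature scale, which is one reason the scaling step in Section 3 matters for this model.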
5. Model Evaluation:
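Each model was evaluated on a held-out test set. The confusion matrices below cover 286 test samples
(152 of Class 0 and 134 of Class 1); rows are true classes and columns are predicted classes.
Precision, recall, and F1-score are support-weighted averages over the two classes.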
(a) K-Nearest Neighbors (KNN)
• Confusion Matrix: [[65, 87], [56, 78]]
• Accuracy: 50.00%
• Precision: 0.507
• Recall: 0.50
• F1-Score: 0.498
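The KNN figures above can be reproduced from the confusion matrix alone. The sketch below assumes that rows are true classes, columns are predicted classes, and that the reported precision, recall, and F1-score are support-weighted averages; the function name is hypothetical.

```python
def weighted_metrics(cm):
    """Return accuracy and support-weighted precision, recall, F1 for a 2x2 matrix.

    cm[i][j] counts samples of true class i that were predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(2)) / total
    precision = recall = f1 = 0.0
    for c in range(2):
        tp = cm[c][c]
        support = sum(cm[c])             # samples whose true class is c
        predicted = cm[0][c] + cm[1][c]  # samples predicted as class c
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


accuracy, precision, recall, f1 = weighted_metrics([[65, 87], [56, 78]])
```

With weighted averaging, recall always equals accuracy, which gives a quick consistency check for each result block in this section.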
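(b) Logistic Regression
• Confusion Matrix: [[152, 0], [134, 0]]
• Accuracy: 53.15%
• Precision: 0.751
• Recall: 0.531
• F1-Score: 0.369
• The model assigned every test sample to Class 0, so its accuracy equals the share of Class 0 in
the test set (152 of 286) and no Class 1 sample was detected.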
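A confusion matrix with an empty second column leaves the Class 1 precision undefined (0 / 0), and the weighted precision then depends on how that case is scored. The report does not say which convention was used; the sketch below shows that the reported 0.751 is consistent with counting the undefined value as 1, while counting it as 0 gives 0.282. The function and parameter names are hypothetical; scikit-learn exposes the same choice through the `zero_division` argument of `precision_score`.

```python
def weighted_precision(cm, undefined=0.0):
    """Support-weighted precision; `undefined` is used when a class is never predicted."""
    total = sum(map(sum, cm))
    result = 0.0
    for c, row in enumerate(cm):
        predicted = sum(r[c] for r in cm)  # column sum: samples predicted as class c
        p = cm[c][c] / predicted if predicted else undefined
        result += sum(row) / total * p     # weight by the class's true support
    return result


cm = [[152, 0], [134, 0]]

# Accuracy of always predicting the most frequent class.
majority_share = max(sum(row) for row in cm) / sum(map(sum, cm))

as_zero = weighted_precision(cm, undefined=0.0)
as_one = weighted_precision(cm, undefined=1.0)
```

The accuracy of 53.15% is exactly the majority-class share, so this model performs no better than a baseline that ignores the features.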
(c) Decision Tree Classifier
• Confusion Matrix: [[79, 73], [69, 65]]
• Accuracy: 50.35%
• Precision: 0.504
• Recall: 0.50
• F1-Score: 0.504
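(d) Support Vector Machine (SVM)
• Confusion Matrix: [[152, 0], [134, 0]]
• Accuracy: 53.15%
• Precision: 0.751
• Recall: 0.531
• F1-Score: 0.369
• Like Logistic Regression, the SVM assigned every test sample to Class 0 and detected no Class 1
sample.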
(e) Random Forest Classifier
• Confusion Matrix: [[81, 71], [81, 53]]
• Accuracy: 46.85%
• Precision: 0.466
• Recall: 0.469
• F1-Score: 0.467
• Classification Report:
• Precision: 0.50 for Class 0, 0.43 for Class 1
• Recall: 0.53 for Class 0, 0.40 for Class 1
• F1-Score: 0.52 for Class 0, 0.41 for Class 1
(f) Neural Network (Basic Architecture)
• Test Accuracy: 100.00%
• The neural network, with a simple dense-layer architecture, classified every test sample
correctly.
(g) Neural Network with Dropout (Overfitting Prevention)
• Test Accuracy: 100.00%
• Dropout layers were added to reduce overfitting; the model retained perfect test accuracy with
this regularization in place.
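The mechanism behind a dropout layer can be sketched without a deep learning framework. The version below is "inverted" dropout, the variant Keras uses: surviving activations are scaled by 1 / (1 - rate) during training so that the expected output is unchanged, and the layer does nothing at inference time. The function name and the sample activations are hypothetical.

```python
import random


def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)  # inference: pass values through unchanged
    rng = rng or random.Random()
    keep = 1.0 - rate
    # A unit survives with probability `keep` and is scaled up to compensate.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


layer_output = [1.0] * 10_000
dropped = dropout(layer_output, rate=0.5, rng=random.Random(0))
inference = dropout(layer_output, rate=0.5, training=False)
```

With rate = 0.5, roughly half of the units are zeroed on each training step and the rest are doubled, so the mean activation stays close to its original value.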

(h) Neural Network with Convolutional (CNN) Layers

• Test Accuracy: 100.00%

• Convolutional layers were added to detect local patterns in the data; this model also achieved
perfect accuracy during testing.
6. Feature Analysis and Visualization:
To better understand the relationships between features and model predictions, several visualizations
were generated:
• Effect of Dropout on Model Generalization: Dropout is a regularization technique that helps
improve model generalization by introducing randomness during training. By randomly
deactivating neurons, dropout prevents individual neurons from becoming overly reliant on
specific features in the training data.

The graph illustrates the performance of a neural network during training. The left plot shows
accuracy, where the training accuracy generally increases while the validation accuracy plateaus
after a few epochs, indicating potential overfitting. The right plot shows loss, where both training
and validation loss decrease initially, but the validation loss starts to increase after some point,
again suggesting overfitting. This behaviour is common in neural networks and highlights the
importance of techniques like dropout, as implemented in the code with Dropout(0.5) layers, to
mitigate overfitting and improve generalization.

• Model Performance Analysis (CNN Model): The left graph shows the model achieving perfect
accuracy after one epoch, with training accuracy matching validation accuracy. The right graph
shows a rapid decline in loss for both training and validation, stabilizing near zero. Because the
two curves coincide, they do not show the train/validation gap typical of overfitting; perfect
accuracy after a single epoch instead raises the question of whether an input feature reveals the
label, so generalization to new traffic remains a concern.
• Correlation Matrix Heatmap: A heatmap was used to visualize the correlations between
numerical features. Features like Bytes_Sent, Bytes_Received, and Flow_Packets/s showed
strong correlations with one another, marking them as closely related inputs to examine when
predicting cyberattacks.

• Stacked Bar Chart: A stacked bar chart was created to display the distribution of attack types
across different source IP addresses. This helped identify patterns related to specific attack
types originating from particular sources.
• Network Graph: A network graph was generated to visualize the interaction between source
and destination ports. This graph illustrated how network traffic flows between different ports
and provided insights into potential attack vectors.

• Web Traffic Analysis Over Time: The graph illustrates the variation in bytes sent and
received over time, displaying high fluctuations. The consistent peaks and troughs suggest
dynamic web traffic patterns. Bytes sent and received follow similar trends, indicating
synchronous data exchange. The detailed time-based visualization aids in identifying patterns,
anomalies, or potential bottlenecks in network performance.
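The values behind the correlation matrix heatmap described above are pairwise Pearson coefficients between numeric columns. The sketch below computes them in plain Python; the column values are hypothetical and the helper names are illustrative, not the project's code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict of pairwise correlations keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical sample values, not taken from the report's dataset.
columns = {
    "Bytes_Sent": [100.0, 200.0, 300.0, 400.0],
    "Bytes_Received": [110.0, 190.0, 310.0, 390.0],
    "Duration": [4.0, 1.0, 3.0, 2.0],
}
matrix = correlation_matrix(columns)
```

A coefficient near 1 or -1 between two features means they carry largely the same information; it does not by itself show how well either one predicts the Label column.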
7. Conclusion:
The deep learning models achieved perfect test accuracy in detecting cyberattacks, outperforming
the traditional machine learning models (KNN, Logistic Regression, SVM, Decision Tree, and
Random Forest), whose accuracy ranged from 46.85% to 53.15%. The neural network models showed
significant potential in learning complex patterns from the network traffic data, and dropout
regularization was used to limit overfitting.
• Top Performing Model: The Neural Network with Dropout matched the perfect test accuracy of
the other neural networks while adding regularization against overfitting.
