DDoS Detection Using Machine Learning
March 2, 2024
[2]: ddos = pd.read_csv("APA-DDoS-Dataset.csv")  # pandas is assumed to be imported as pd in an earlier cell
[3]: ddos
(output truncated: preview of the raw DataFrame — 151200 rows × 23 columns)
[5]: ddos.info()
<class 'pandas.core.frame.DataFrame'>
(output truncated: RangeIndex of 151200 entries, 23 columns, all non-null)
[6]: ddos.isna().sum()
[6]: ip.src 0
ip.dst 0
tcp.srcport 0
tcp.dstport 0
ip.proto 0
frame.len 0
tcp.flags.syn 0
tcp.flags.reset 0
tcp.flags.push 0
tcp.flags.ack 0
ip.flags.mf 0
ip.flags.df 0
ip.flags.rb 0
tcp.seq 0
tcp.ack 0
frame.time 0
Packets 0
Bytes 0
Tx Packets 0
Tx Bytes 0
Rx Packets 0
Rx Bytes 0
Label 0
dtype: int64
[7]: ddos.duplicated().sum()
[7]: 0
There are no duplicates or nulls that need to be dropped, so we can proceed with our analysis.
[8]: ddos.groupby('Label').size()
[8]: Label
Benign 75600
DDoS-ACK 37800
DDoS-PSH-ACK 37800
dtype: int64
(figure omitted: seaborn pairplot of the numeric features, hued by Label; the run emitted a UserWarning that the pairplot `size` parameter has been renamed to `height`)
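The cell that produced the pairplot is not captured in this export. A minimal sketch of a call that would generate it, assuming seaborn's pairplot hued by the Label column (the original run presumably passed the deprecated size= argument, hence the warning above):

import seaborn as sns
import matplotlib.pyplot as plt

# Pairwise scatter plots of the numeric features, coloured by traffic class
sns.pairplot(ddos, hue='Label', height=2.5)
plt.show()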
By observing the generated pairplot, we can notice that several features take only a single value across the entire column. These carry no information and can be dropped; the quick check below identifies exactly which ones.
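A quick programmatic check (not part of the original notebook) that lists those constant columns with pandas' nunique:

# Columns with only one distinct value carry no information for classification
constant_columns = ddos.nunique()[ddos.nunique() == 1].index.tolist()
print(constant_columns)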
[16]: numeric_data = ddos.select_dtypes(include='number')  # keep only the numeric columns
correlation_matrix = numeric_data.corr()
fig, ax = plt.subplots(figsize=(15, 8))
sns.heatmap(correlation_matrix, annot=True, ax=ax, cmap="RdPu")
plt.title('Correlation Between the Variables')
plt.show()
[9]: columns_to_drop = ['tcp.dstport', 'ip.proto', 'tcp.flags.syn', 'tcp.flags.reset',
                        'tcp.flags.ack', 'ip.flags.mf', 'ip.flags.rb', 'tcp.seq', 'tcp.ack']
ddos_new = ddos.drop(columns=columns_to_drop).copy()
ddos_new
(output truncated: preview of ddos_new — 151200 rows × 14 columns; the first rows are labelled DDoS-PSH-ACK and the last rows Benign)
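The cell that created the Label_new column is missing from this export; judging by the output below, it collapses the two attack classes into a single DDoS class. A minimal sketch, assuming a simple string replacement:

# Hypothetical reconstruction: map both attack labels to one 'DDoS' class
ddos_new['Label_new'] = ddos_new['Label'].replace(
    {'DDoS-ACK': 'DDoS', 'DDoS-PSH-ACK': 'DDoS'})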
ddos_new.drop(columns=['Label'], inplace=True)                 # drop the original three-class label
ddos_new.rename(columns={'Label_new': 'Label'}, inplace=True)  # keep the binary label under the name 'Label'
ddos_new
(output truncated: preview of ddos_new after relabelling — the Label column now contains only 'DDoS' and 'Benign')
[12]: y = ddos_new['Label']
y
[12]: 0 DDoS
1 DDoS
2 DDoS
3 DDoS
4 DDoS
…
151195 Benign
151196 Benign
151197 Benign
151198 Benign
151199 Benign
Name: Label, Length: 151200, dtype: object
[14]: y
There are many distinct IP addresses, so we encode them using one-hot encoding. The addresses have no ordinality, so label encoding would introduce a spurious ordering and bias the model.
[15]: from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

X = ddos_new.drop(columns=['Label', 'frame.time']).copy()  # assumption: the frame.time timestamp is not used as a feature
categorical_columns = ['ip.src', 'ip.dst']  # assumption: the IP address columns are the categorical features

# Create a ColumnTransformer: one-hot encode the IP columns, pass the numeric columns through
preprocessor = ColumnTransformer(
    transformers=[
        ('cat', OneHotEncoder(sparse=False, handle_unknown='ignore'), categorical_columns)
    ],
    remainder='passthrough',
    verbose_feature_names_out=False  # keep plain column names such as ip.src_192.168.1.1
)

X_encoded = preprocessor.fit_transform(X)            # the fit/transform step is not shown in the export
column_names = preprocessor.get_feature_names_out()  # assumed way of recovering the expanded column names
X = pd.DataFrame(X_encoded, columns=column_names)
/usr/local/lib/python3.10/dist-packages/sklearn/preprocessing/_encoders.py:975:
FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will
be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its
default value.
  warnings.warn(
[16]: X
(output truncated: preview of the encoded feature matrix X — 151200 rows; one-hot columns such as ip.src_192.168.20.1 followed by the numeric features)
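The train/test split cell does not appear in this export, although X_train and y_train are used below. A minimal sketch, assuming a standard stratified hold-out split (the split ratio and random_state are guesses):

from sklearn.model_selection import train_test_split

# Hold out a test set; stratify on the label to preserve the class balance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)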
[18]: X_train
[18]: (output truncated: preview of X_train — the encoded feature columns restricted to the training rows)
[19]: y_train
0.3 Building the Model
[21]: X_train
(output truncated: the same X_train preview as above)
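The cell that trains the decision tree is missing from this export, although its predictions (y_pred_decision_tree) are evaluated in the next cell. A minimal sketch, assuming scikit-learn's DecisionTreeClassifier with default settings:

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Fit a decision tree on the training split and predict on the held-out data
decision_tree = DecisionTreeClassifier(random_state=42)
decision_tree.fit(X_train, y_train)
y_pred_decision_tree = decision_tree.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred_decision_tree) * 100:.2f}%")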
[44]: cm = confusion_matrix(y_test, y_pred_decision_tree)
class_labels = ["Benign", "DDoS"]
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", cbar=False,
            xticklabels=class_labels, yticklabels=class_labels)
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
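The random-forest training cell is also missing, but rf_model is used for the precision-recall curve below. A minimal sketch, assuming RandomForestClassifier with default hyperparameters:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_curve

# Train a random forest and report hold-out accuracy
rf_model = RandomForestClassifier(random_state=42)
rf_model.fit(X_train, y_train)
y_pred_rf = rf_model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred_rf) * 100:.2f}%")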
precision, recall, _ = precision_recall_curve(
    y_test, rf_model.predict_proba(X_test)[:, 1])
# note: if y_test still holds the string labels 'Benign'/'DDoS', pass pos_label='DDoS' above

# Plot the F1 score achieved at each point along the precision-recall curve
fig, ax = plt.subplots(figsize=(8, 8))
f1 = 2 * (precision * recall) / (precision + recall)
plt.plot(recall, f1, label='F1 Score')
plt.xlabel('Recall')
plt.ylabel('F1 Score')
plt.title('F1 Score Curve')
plt.legend(loc='best')
plt.show()
Accuracy: 100.00%
0.6 XGBoost
[47]: xgb_model = XGBClassifier()
xgb_model.fit(X_train, y_train)
y_pred = xgb_model.predict(X_test)
Accuracy: 100.00%
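The statement that prints this accuracy is not captured in the export; presumably it was something along these lines (using sklearn's accuracy_score):

from sklearn.metrics import accuracy_score

# Hold-out accuracy of the XGBoost model
print(f"Accuracy: {accuracy_score(y_test, y_pred) * 100:.2f}%")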
[34]: plot_importance(xgb_model)  # XGBoost's built-in feature-importance plot
plt.show()
[51]: y_prob = xgb_model.predict_proba(X_test)[:, 1]
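The plotting code that consumes y_prob is not included in the export; a sketch of a typical ROC curve built from these probabilities (assuming 'DDoS' is the positive class; if the labels were numerically encoded, use pos_label=1 instead):

from sklearn.metrics import roc_curve, auc

# ROC curve for the DDoS class from the predicted probabilities
fpr, tpr, _ = roc_curve(y_test, y_prob, pos_label='DDoS')
plt.plot(fpr, tpr, label=f'ROC curve (AUC = {auc(fpr, tpr):.3f})')
plt.plot([0, 1], [0, 1], linestyle='--', color='grey')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc='best')
plt.show()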