
All Classifiers

The document evaluates several classification algorithms for phishing detection: decision trees, random forests, gradient boosting, histogram-based gradient boosting, LightGBM, and SVMs with four kernels. It loads and standardizes the data, specifies the evaluation metrics, runs 10-fold cross-validation on each classifier, and prints the mean accuracy, recall, precision, and F1 scores.

Uploaded by

Khải Trần
Copyright © All Rights Reserved

import pandas as pd

import sklearn
from sklearn import preprocessing, svm
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import KFold, cross_validate, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import RandomForestClassifier
# The next import is needed only on scikit-learn < 1.0, where
# HistGradientBoostingClassifier was still experimental.
from sklearn.experimental import enable_hist_gradient_boosting
from sklearn.ensemble import HistGradientBoostingClassifier
# Third-party gradient boosting libraries, installed separately from scikit-learn.
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

def mean_score(scoring):
    """Return the mean of each per-fold array in a cross_validate result."""
    return {name: values.mean() for name, values in scoring.items()}
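
To make the helper concrete: `cross_validate` returns a dict of NumPy arrays with one value per fold, keyed `fit_time`, `score_time`, and `test_<metric>` for every requested metric, so the averaged result includes the two timing entries as well. The sketch below shows this on a small synthetic dataset. `make_classification` and the names `X_demo`, `y_demo`, and `demo_means` are assumptions made for illustration only; they stand in for the real data in dataset.csv.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier


def mean_score(scoring):
    """Return the mean of each per-fold array in a cross_validate result."""
    return {name: values.mean() for name, values in scoring.items()}


# Synthetic stand-in data (assumption: the real features come from dataset.csv).
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

demo_scores = cross_validate(
    DecisionTreeClassifier(random_state=0),
    X_demo,
    y_demo,
    cv=5,
    scoring={"accuracy": "accuracy", "f1": "f1"},
)

# One averaged value per key: fit_time, score_time, test_accuracy, test_f1.
demo_means = mean_score(demo_scores)
for name, value in demo_means.items():
    print(f"{name}: {value:.4f}")
```

Only the `test_*` entries are model scores; the timing entries can be dropped if they are not of interest.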

Loading data.
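Load the data and shuffle it with a fixed seed.
df = pd.read_csv("dataset.csv", index_col=0)
# random_state=42 is an arbitrary seed added here; the original call passed
# none, so the shuffle was not reproducible despite the description.
df = sklearn.utils.shuffle(df, random_state=42)
X = df.drop("Result", axis=1).values
X = preprocessing.scale(X)  # standardize each feature to zero mean, unit variance
y = df["Result"].values
df.head()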
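
One caveat with scaling `X` up front: `preprocessing.scale` computes each feature's mean and standard deviation from all rows, so during cross-validation every training split has already been scaled with statistics that include its test rows. A common alternative is to place the scaler inside a `Pipeline`, which refits it on the training portion of each fold. The following is a minimal sketch of that alternative, not the approach used in this document; the synthetic data and the names `X_demo`, `y_demo`, and `scaled_svc` are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: the real features come from dataset.csv).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

# The scaler is fitted on each training split only, then applied to the test split.
scaled_svc = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

pipeline_scores = cross_validate(scaled_svc, X_demo, y_demo, cv=5, scoring="accuracy")
print(pipeline_scores["test_score"].mean())
```

With a single scoring string the result key is `test_score`; with a dict of metrics, as used in this document, the keys are `test_<name>`.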

Evaluation metrics
Specify the evaluation metrics for the classification models.
Use 10-fold cross-validation for evaluation.
scoring = {'accuracy': 'accuracy',
           'recall': 'recall',
           'precision': 'precision',
           'f1': 'f1'}
fold_count = 10
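
When `cv` is an integer and the estimator is a classifier with a binary or multiclass target, `cross_validate` splits the data with an unshuffled `StratifiedKFold` of that many folds. Passing a splitter object instead makes the folds explicit and lets them be shuffled reproducibly. The sketch below illustrates this under stated assumptions: the data is synthetic, and the seed `0` and the names `cv_splitter` and `test_folds` are illustrative rather than taken from the original.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

# Synthetic, mildly imbalanced stand-in data (assumption, for illustration only).
X_demo, y_demo = make_classification(
    n_samples=200, n_features=6, weights=[0.7, 0.3], random_state=0
)

fold_count = 10
cv_splitter = StratifiedKFold(n_splits=fold_count, shuffle=True, random_state=0)

# Collect the test indices of every fold and report size and class balance.
test_folds = [test_idx for _, test_idx in cv_splitter.split(X_demo, y_demo)]
for fold_number, test_idx in enumerate(test_folds, start=1):
    positive_rate = y_demo[test_idx].mean()
    print(f"fold {fold_number}: {len(test_idx)} rows, positive rate {positive_rate:.2f}")

# Every row lands in exactly one test fold.
covered = np.sort(np.concatenate(test_folds))
```

The splitter can then be passed in place of the integer, for example `cross_validate(clf, X, y, cv=cv_splitter, scoring=scoring)`.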

Classification models for phishing detection:


Decision Tree
dtree_clf = DecisionTreeClassifier()
cross_val_scores = cross_validate(dtree_clf, X, y, cv=fold_count,
                                  scoring=scoring)
dtree_score = mean_score(cross_val_scores)
print(dtree_score)

Random Forest
rforest_clf = RandomForestClassifier()
cross_val_scores = cross_validate(rforest_clf, X, y, cv=fold_count,
                                  scoring=scoring)
rforest_clf_score = mean_score(cross_val_scores)
print(rforest_clf_score)

Gradient Boosting
gradientBooster_clf = GradientBoostingClassifier()
cross_val_scores = cross_validate(gradientBooster_clf, X, y, cv=fold_count,
                                  scoring=scoring)
gradientBooster_clf_score = mean_score(cross_val_scores)
print(gradientBooster_clf_score)

Histogram-Based Gradient Boosting
histGradientBooster_clf = HistGradientBoostingClassifier()
cross_val_scores = cross_validate(histGradientBooster_clf, X, y,
                                  cv=fold_count, scoring=scoring)
histGradientBooster_clf_score = mean_score(cross_val_scores)
print(histGradientBooster_clf_score)

LightGBM for Classification
LGBM_clf = LGBMClassifier()
cross_val_scores = cross_validate(LGBM_clf, X, y, cv=fold_count,
                                  scoring=scoring)
LGBM_clf_score = mean_score(cross_val_scores)
print(LGBM_clf_score)

SVM
### linear
linear_clf = svm.SVC(kernel='linear')
cross_val_scores = cross_validate(linear_clf, X, y, cv=fold_count,
                                  scoring=scoring)
linear_svc_clf_score = mean_score(cross_val_scores)
print(linear_svc_clf_score)
### poly
poly_clf = svm.SVC(kernel='poly')
cross_val_scores = cross_validate(poly_clf, X, y, cv=fold_count,
                                  scoring=scoring)
poly_svc_clf_score = mean_score(cross_val_scores)
print(poly_svc_clf_score)
### rbf
rbf_clf = svm.SVC(kernel='rbf')
cross_val_scores = cross_validate(rbf_clf, X, y, cv=fold_count,
                                  scoring=scoring)
rbf_svc_clf_score = mean_score(cross_val_scores)
print(rbf_svc_clf_score)
### sigmoid
sigmoid_clf = svm.SVC(kernel='sigmoid')
cross_val_scores = cross_validate(sigmoid_clf, X, y, cv=fold_count,
                                  scoring=scoring)
sigmoid_svc_clf_score = mean_score(cross_val_scores)
print(sigmoid_svc_clf_score)
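
Because every model above follows the same create, cross-validate, average, print pattern, the comparison can also be written as one loop that gathers the means into a table. The sketch below shows the idea on synthetic data with three of the classifiers and 5 folds to keep it quick. The names `classifiers`, `rows`, and `summary`, the synthetic data, and the fixed seeds are assumptions for illustration and are not part of the original code.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the real features come from dataset.csv).
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)

scoring = {"accuracy": "accuracy", "recall": "recall",
           "precision": "precision", "f1": "f1"}

classifiers = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "svm_rbf": SVC(kernel="rbf"),
}

rows = {}
for name, clf in classifiers.items():
    fold_scores = cross_validate(clf, X_demo, y_demo, cv=5, scoring=scoring)
    # Keep only the metric entries; drop fit_time and score_time.
    rows[name] = {
        key: values.mean()
        for key, values in fold_scores.items()
        if key.startswith("test_")
    }

# One row per classifier, one column per metric.
summary = pd.DataFrame.from_dict(rows, orient="index")
print(summary.round(3))
```

Adding another model then means adding one entry to the dict, and the resulting table can be sorted by any metric, for example `summary.sort_values("test_f1", ascending=False)`.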
