Data Analysis in Python - ML

Uploaded by

Khushal Khan

Anaconda is a distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment. The distribution includes data-science packages and is available for Windows, Linux, and macOS. It can be downloaded from

https://www.anaconda.com/products/individual

After installing Anaconda, search for it on your computer and launch it by clicking its icon. When it opens, find and launch Jupyter, or type jupyter notebook in a terminal.

The home screen opens in a browser after a terminal window is displayed for a few seconds.
Go to New and select Python 3 from the drop-down list.

A new notebook opens. You may rename it by clicking the text box that displays "Untitled".

Enter your commands in the empty cells. Green indicates the active cell (edit mode), while blue indicates an inactive one (command mode). Press the Esc key to make a cell inactive. Click the inactive cell or press the Enter key to make it active again.

To create a new cell, click the + icon at the top, or, while the cell is inactive, press B to add a cell below or A to add a cell above the current cell.

To remove a cell, press D twice while the cell is inactive. The number shown in the In [ ] prompt beside a cell indicates the order in which the command was executed.

# The # symbol starts a comment.

# Like R, Python evaluates a bare number, but referring to a variable that has not been given a value raises an error.

print('Bismillah') # You can print a value to the screen using print().

# = is used as the assignment operator.

A = 1

A # displays the contents of A

type(A) # tells the class of the object A that we created above

Variable types do not need to be declared. Python infers the type of a variable from the value assigned to it.

Variable names are case sensitive and cannot start with a number. They can contain letters, numbers, and underscores.

a, b, c = 17, 3.14, "test" # We can assign values to multiple variables at a time.

# The + operator concatenates (joins) two strings.

a = "hello "

b = "world"

print(a + b)
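The points above about inferred types and string concatenation can be tried with a short sketch. The variable names and values below are made up for illustration.

```python
# Illustrative values only; the variable names are hypothetical.
count, ratio, name = 17, 3.14, "test"

# type() reports the class Python inferred for each value.
print(type(count).__name__, type(ratio).__name__, type(name).__name__)

# + joins strings, so a number has to be converted with str() first.
message = name + " " + str(count)
print(message)

# Names are case sensitive: Count and count would be two different variables.
```

Writing name + count without str() would raise a TypeError, because + does not mix strings and numbers.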

Python Data Structures

• Integers: Whole numbers, e.g. 2, 3, 5, 0, -1

• Floats: Numbers with a decimal point, e.g. 1.50

• Strings: Single (''), double (""), or triple (""" or ''') quotes can be used. For example, "python" and 'python' are the same string. A quote character of the other kind can appear inside the string without ending it, e.g. "datatype's"

• Tuple ( ): A collection of items that may have different types. Tuples are "immutable", i.e., they cannot be modified after creation.
myTuple = ('abc', 2.5, A)

myTuple[2] # returns the third element (index 2, since indexing starts at 0)

myTuple.index(2.5) # returns the index position of 2.5

• List [ ]: Lists are "mutable", i.e., their elements can be modified.

myList = ['abc', 'def', 'ghij']

myList.append('klm')

myList

myList.count('def') # counts the occurrences of 'def' in the list

myList2 = [1,2,3]

myList3 = [4,5,6]
myList2 + myList3
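The difference between a mutable list and an immutable tuple can be seen in a minimal sketch like the one below. The names record and samples are hypothetical and not part of the handout.

```python
record = ("abc", 2.5, 1)      # tuple: fixed after creation
samples = ["abc", "def"]      # list: can be changed in place

samples[0] = "xyz"            # allowed, lists are mutable
samples.append("ghij")

try:
    record[0] = "xyz"         # not allowed, tuples are immutable
    tuple_changed = True
except TypeError as err:
    tuple_changed = False
    print("TypeError:", err)

print(samples)
print(record)
```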

• Array [ ]: Vectors (1-D) and matrices (more than 1-D) for numerical data manipulation are defined in NumPy. We need to import numpy into our Python session.
import numpy as np # np is the alias the community uses; functions are then accessed as np.array etc.
# Any alias may be used here. A plain "import numpy" also works, but the functions must then be accessed as numpy.array etc.
Lists are containers for elements of differing data types, whereas arrays are containers for elements of the same data type.

myArray2 = np.array(myList2)

myArray3 = np.array(myList3)

myArray2 + myArray3

myArray2.dot(myArray3)
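To make the contrast between list + and array + concrete, here is a plain-Python sketch of what the three operations compute for the same values. It deliberately avoids NumPy; the names list2 and list3 are stand-ins for myList2 and myList3.

```python
list2 = [1, 2, 3]
list3 = [4, 5, 6]

# + on lists concatenates them into one longer list.
joined = list2 + list3

# + on two 1-D arrays adds element by element; zip reproduces that here.
elementwise = [a + b for a, b in zip(list2, list3)]

# .dot() on two 1-D arrays multiplies element by element and sums the products.
dot = sum(a * b for a, b in zip(list2, list3))

print(joined)
print(elementwise)
print(dot)
```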

Data Analysis in Python

To perform various tasks, a set of instructions is combined into a function. A function is defined with the keyword def and can be defined anywhere. Related functions are put together as modules, which in turn constitute packages.
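As a small illustration, the sketch below defines one function with def and then calls an equivalent function from statistics, a standard-library module. The function name mean and the sample numbers are chosen here for illustration only.

```python
import statistics  # a standard-library module that packages related functions


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


numbers = [2, 3, 5, 0, -1]
own_result = mean(numbers)                 # our own function
module_result = statistics.mean(numbers)   # the same task, taken from a module
print(own_result, module_result)
```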

import pandas as pd # pandas is designed for quick and easy data manipulation, reading, aggregation, and visualization.

import numpy as np # NumPy processes arrays that store values of the same data type. It facilitates math operations on arrays and their vectorization.

import matplotlib.pyplot as plt # to plot histograms and other statistical graphs

Working Directory Setting:

pwd # tells us where we are (same as Linux)

ls # lists what we have there (files and folders)

dir() # lists the variables (objects) we have; also try vars()

cd # changes the directory

import os # The os module provides functions for creating and removing a directory (folder), fetching its contents, and changing and identifying the current directory.
# This is the more reliable way, because the commands above are Jupyter conveniences and might not work in other Python IDEs.
os.chdir("C:/Desktop/DataAnalysis/") # changes to the required directory

os.getcwd() # to make sure we are at the right place
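The os calls above can be tried safely with a temporary folder, as in the sketch below. The temporary directory is only a stand-in for a real project folder such as the one in the handout.

```python
import os
import tempfile

start_dir = os.getcwd()                      # remember where we started

with tempfile.TemporaryDirectory() as tmp:   # hypothetical stand-in for a project folder
    os.chdir(tmp)                            # change into it
    inside = os.getcwd()                     # confirm where we are
    print("now in:", inside)
    print("contents:", os.listdir("."))      # same idea as ls
    os.chdir(start_dir)                      # go back before the folder is deleted

print("back in:", os.getcwd())
```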

#Reading Dataframe exported previously

combined_PRAD = pd.read_csv("PRAD_labeled.csv", index_col=0)

#Tells you what type of variable it is.

type(combined_PRAD)

#Description about variables contained by the dataframe

combined_PRAD.info()

# gives the total number of elements in the dataframe (rows multiplied by columns)
combined_PRAD.size

# gives the shape of the data: how many rows and columns are present
combined_PRAD.shape

# gives the number of dimensions of the dataset (2 for a dataframe)

combined_PRAD.ndim

#Outputs the first five rows of the data

combined_PRAD.head()

#Gives the list of last 10 rows in the dataset

combined_PRAD.tail(10)

# Import the preprocessing module, which provides the label encoder

from sklearn import preprocessing
# The label_encoder object converts word labels into integer codes.
label_encoder = preprocessing.LabelEncoder()
# Encode the labels in column 'labels'.
combined_PRAD['labels'] = label_encoder.fit_transform(combined_PRAD['labels'])
# List the unique entries.
combined_PRAD['labels'].unique()

# counting the number of samples in each class

combined_PRAD["labels"].value_counts()
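What label encoding does can be sketched in plain Python: each distinct class name, taken in sorted order, is replaced by an integer code. The label values below are made up, and the sketch imitates the idea rather than calling scikit-learn.

```python
from collections import Counter

labels = ["tumor", "normal", "tumor", "tumor", "normal"]   # made-up labels

classes = sorted(set(labels))                      # sorted unique class names
to_code = {name: code for code, name in enumerate(classes)}
encoded = [to_code[name] for name in labels]       # word labels -> integer codes

print(classes)            # the class at position i receives code i
print(encoded)
print(Counter(encoded))   # same idea as value_counts(): samples per class
```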

# Assign the numerical data (every column except the last) to "X" and the labels column to "y" for use in the next steps
X = combined_PRAD.iloc[:, :-1]
y = combined_PRAD["labels"]

# Split the data into a training set (70%) and a test set (30%)
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.30, random_state=42)
# Standardize the features: fit the scaler on the training set only, then apply it to the test set
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
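The reason for calling fit_transform on the training set but only transform on the test set can be shown with a plain-Python sketch of standardization. The numbers are made up; the mean and the population standard deviation are learned from the training values and then reused unchanged for the test values.

```python
from statistics import mean, pstdev

train = [2.0, 4.0, 6.0, 8.0]   # made-up values of one feature
test = [5.0, 10.0]

mu = mean(train)               # learned from the training set only
sigma = pstdev(train)          # population standard deviation of the training set

train_scaled = [(v - mu) / sigma for v in train]
test_scaled = [(v - mu) / sigma for v in test]   # reuses the training mu and sigma

print(train_scaled)
print(test_scaled)
```

After scaling, the training values have mean 0 and standard deviation 1. The test values generally do not, which is expected, because no information from the test set is used to set the scale.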

######### Plotting a t-SNE plot to check whether the problem is linear or not #########

import matplotlib.pyplot as plt

import seaborn as sns
from sklearn.manifold import TSNE
fig, ax = plt.subplots()
m = TSNE(learning_rate=50)
X_tsne = m.fit_transform(X)
# colour the points by the full label column y, because t-SNE was run on all rows of X
combined_PRAD["y"] = y
combined_PRAD["comp-1"] = X_tsne[:,0]
combined_PRAD["comp-2"] = X_tsne[:,1]

sns.scatterplot(x="comp-1", y="comp-2", hue=combined_PRAD.y.tolist(),
                palette=sns.color_palette('husl', 2),
                data=combined_PRAD).set(title="Cancer data T-SNE projection")
plt.savefig("TSNE-plot.png", dpi = 600)

from sklearn.metrics import accuracy_score

from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score
from sklearn.metrics import classification_report
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.neighbors import KNeighborsClassifier

KNN = KNeighborsClassifier(n_neighbors=7, metric='minkowski', p=1)

KNN.fit(X_train,Y_train)
# predict samples in the test set
prediction = KNN.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)
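The idea behind the classifier above can be sketched in plain Python: with metric='minkowski' and p=1 the distance is the Manhattan distance, and a query point takes the majority label of its nearest training points. The data and the helper names manhattan and knn_predict are hypothetical, and k=3 is used here instead of 7 to keep the example small.

```python
from collections import Counter


def manhattan(a, b):
    """Minkowski distance with p=1: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: manhattan(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up training data: two well-separated groups.
train_X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_y = [0, 0, 0, 1, 1, 1]

near_origin = knn_predict(train_X, train_y, (1, 1))
near_corner = knn_predict(train_X, train_y, (5, 4))
print(near_origin, near_corner)
```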

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
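How these scores follow from the counts in a binary confusion matrix can be worked through by hand. The sketch below uses made-up labels, treats class 1 as the positive class, and lays the counts out with true classes as rows and predicted classes as columns.

```python
y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # made-up true labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]   # made-up predictions

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)   # true positives
tn = sum(t == 0 and p == 0 for t, p in pairs)   # true negatives
fp = sum(t == 0 and p == 1 for t, p in pairs)   # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)   # false negatives

confusion = [[tn, fp],
             [fn, tp]]                          # rows: true class, columns: predicted class

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)     # of the predicted positives, how many are correct
recall = tp / (tp + fn)        # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(confusion)
print(accuracy, precision, recall, round(f1, 3))
```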

# KNN is the trained K Nearest Neighbors model from above
# Get predictions from the KNN model
Y_pred = KNN.predict(X_test)
# Create the confusion matrix
cm = confusion_matrix(Y_test, Y_pred)

# Display the confusion matrix using ConfusionMatrixDisplay

disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                              display_labels=KNN.classes_)  # KNN.classes_ contains the class labels
disp.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("KNN.png", dpi=600)
plt.show()

# roc curve and auc

from sklearn.metrics import roc_curve
from sklearn.metrics import roc_auc_score
from matplotlib import pyplot

KNN_probs = KNN.predict_proba(X_test)
KNN_probs = KNN_probs[:, 1]
KNN_auc = roc_auc_score(Y_test, KNN_probs)
KNN_fpr, KNN_tpr, _ = roc_curve(Y_test, KNN_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(KNN_fpr, KNN_tpr ,label='KNN =%.3f' % (KNN_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()
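The AUC value shown in the legend has a simple interpretation: it is the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. A minimal plain-Python sketch of that definition, using made-up scores and a hypothetical helper named roc_auc, is:

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]             # made-up labels
scores = [0.1, 0.4, 0.35, 0.8]    # made-up predicted probabilities of class 1

auc_value = roc_auc(y_true, scores)
print(auc_value)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.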

#SVC_linear
from sklearn.svm import SVC
svm_linear = SVC(kernel='linear', probability=True, random_state=40)
svm_linear.fit(X_train,Y_train).decision_function(X_test)
prediction = svm_linear.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

# svm_linear is the trained Support Vector Machine with linear kernel from above

# Get predictions from the SVM linear model

Y_pred_svm_linear = svm_linear.predict(X_test)

# Create the confusion matrix

cm_svm_linear = confusion_matrix(Y_test, Y_pred_svm_linear)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_svm_linear = ConfusionMatrixDisplay(confusion_matrix=cm_svm_linear,
                                         display_labels=svm_linear.classes_)
disp_svm_linear.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("svm_linear.png", dpi=600)
plt.show()
svm_linear_probs = svm_linear.predict_proba(X_test)
svm_linear_probs = svm_linear_probs[:, 1]
svm_linear_auc = roc_auc_score(Y_test, svm_linear_probs)
svm_linear_fpr, svm_linear_tpr, _ = roc_curve(Y_test,
svm_linear_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(svm_linear_fpr, svm_linear_tpr ,label='SVM_Linear =%.3f' %
(svm_linear_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

#SVC_poly
from sklearn.svm import SVC
# Training a SVM classifier using SVC polynomial
svm_poly = SVC(kernel='poly', probability=True, random_state=40)
svm_poly.fit(X_train,Y_train).decision_function(X_test)
prediction = svm_poly.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

Y_pred_svm_poly = svm_poly.predict(X_test)

# Create the confusion matrix

cm_svm_poly = confusion_matrix(Y_test, Y_pred_svm_poly)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_svm_poly = ConfusionMatrixDisplay(confusion_matrix=cm_svm_poly,
display_labels=svm_poly.classes_)
disp_svm_poly.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("svm_poly.png", dpi=600)
plt.show()

svm_poly_probs = svm_poly.predict_proba(X_test)
svm_poly_probs = svm_poly_probs[:, 1]
svm_poly_auc = roc_auc_score(Y_test, svm_poly_probs)
svm_poly_fpr, svm_poly_tpr, _ = roc_curve(Y_test, svm_poly_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(svm_poly_fpr, svm_poly_tpr ,label='SVM_poly =%.3f' %
(svm_poly_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

#SVC_RBF
from sklearn.svm import SVC
# Training a SVM classifier using SVC class
svm_rbf = SVC(kernel='rbf', probability=True, random_state=40)
svm_rbf.fit(X_train,Y_train).decision_function(X_test)
prediction = svm_rbf.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

# svm_rbf is the trained SVM with RBF kernel from above

# Get predictions from the SVM RBF model

Y_pred_svm_rbf = svm_rbf.predict(X_test)

# Create the confusion matrix

cm_svm_rbf = confusion_matrix(Y_test, Y_pred_svm_rbf)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_svm_rbf = ConfusionMatrixDisplay(confusion_matrix=cm_svm_rbf,
                                      display_labels=svm_rbf.classes_)
disp_svm_rbf.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("svm_rbf.png", dpi=600)
plt.show()

svm_rbf_probs = svm_rbf.predict_proba(X_test)
svm_rbf_probs = svm_rbf_probs[:, 1]
svm_rbf_auc = roc_auc_score(Y_test, svm_rbf_probs)
svm_rbf_fpr, svm_rbf_tpr, _ = roc_curve(Y_test, svm_rbf_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(svm_rbf_fpr, svm_rbf_tpr ,label='SVM_rbf =%.3f' %
(svm_rbf_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

from sklearn.linear_model import LogisticRegression

LR = LogisticRegression()
LR.fit(X_train,Y_train).decision_function(X_test)
prediction = LR.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

# LR is the trained logistic regression model from above

# Get predictions from the logistic regression model

Y_pred_LR = LR.predict(X_test)

# Create the confusion matrix

cm_LR = confusion_matrix(Y_test, Y_pred_LR)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_LR = ConfusionMatrixDisplay(confusion_matrix=cm_LR,
                                 display_labels=LR.classes_)
disp_LR.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("LR.png", dpi=600)
plt.show()

LR_probs = LR.predict_proba(X_test)
LR_probs = LR_probs[:, 1]
LR_auc = roc_auc_score(Y_test, LR_probs)
LR_fpr, LR_tpr, _ = roc_curve(Y_test, LR_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(LR_fpr, LR_tpr ,label='LR =%.3f' % (LR_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

#Naive bayes
from sklearn.naive_bayes import GaussianNB
NB = GaussianNB()
NB.fit(X_train, Y_train)
prediction = NB.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# NB is the trained Gaussian naive Bayes model from above

# Get predictions from the naive Bayes model

Y_pred_NB = NB.predict(X_test)

# Create the confusion matrix

cm_NB = confusion_matrix(Y_test, Y_pred_NB)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_NB = ConfusionMatrixDisplay(confusion_matrix=cm_NB,
                                 display_labels=NB.classes_)
disp_NB.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("NB.png", dpi=600)
plt.show()

NB_probs = NB.predict_proba(X_test)
NB_probs = NB_probs[:, 1]
NB_auc = roc_auc_score(Y_test, NB_probs)
NB_fpr, NB_tpr, _ = roc_curve(Y_test, NB_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(NB_fpr, NB_tpr ,label='NB =%.3f' % (NB_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

#DECISION TREE CLASSIFIER

from sklearn.tree import DecisionTreeClassifier
DT= DecisionTreeClassifier(random_state=0)
DT.fit(X_train, Y_train)
prediction = DT.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

# DT is the trained decision tree model from above

# Get predictions from the decision tree model

Y_pred_DT = DT.predict(X_test)

# Create the confusion matrix

cm_DT = confusion_matrix(Y_test, Y_pred_DT)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_DT = ConfusionMatrixDisplay(confusion_matrix=cm_DT,
                                 display_labels=DT.classes_)
disp_DT.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("DT.png", dpi=600)
plt.show()

DT_probs = DT.predict_proba(X_test)
DT_probs = DT_probs[:, 1]
DT_auc = roc_auc_score(Y_test, DT_probs)
DT_fpr, DT_tpr, _ = roc_curve(Y_test, DT_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(DT_fpr, DT_tpr ,label='DT =%.3f' % (DT_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

from sklearn.neural_network import MLPClassifier

MLP = MLPClassifier()
MLP.fit(X_train, Y_train)
prediction = MLP.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

# MLP is the trained multi-layer perceptron model from above

# Get predictions from the MLP model

Y_pred_MLP = MLP.predict(X_test)

# Create the confusion matrix

cm_MLP = confusion_matrix(Y_test, Y_pred_MLP)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_MLP = ConfusionMatrixDisplay(confusion_matrix=cm_MLP,
                                  display_labels=MLP.classes_)
disp_MLP.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("MLP.png", dpi=600)
plt.show()

MLP_probs = MLP.predict_proba(X_test)
MLP_probs = MLP_probs[:,1]
MLP_auc = roc_auc_score(Y_test, MLP_probs)
MLP_fpr, MLP_tpr, _ = roc_curve(Y_test, MLP_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(MLP_fpr, MLP_tpr ,label='MLP =%.3f' % (MLP_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

from sklearn.ensemble import AdaBoostClassifier

# define the model
AB = AdaBoostClassifier()
AB.fit(X_train, Y_train)
prediction = AB.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

# AB is the trained AdaBoost model from above

# Get predictions from the AdaBoost model

Y_pred_AB = AB.predict(X_test)

# Create the confusion matrix

cm_AB = confusion_matrix(Y_test, Y_pred_AB)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_AB = ConfusionMatrixDisplay(confusion_matrix=cm_AB,
                                 display_labels=AB.classes_)
disp_AB.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("AB.png", dpi=600)
plt.show()
AB_probs = AB.predict_proba(X_test)
AB_probs = AB_probs[:, 1]
AB_auc = roc_auc_score(Y_test, AB_probs)
AB_fpr, AB_tpr, _ = roc_curve(Y_test, AB_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(AB_fpr, AB_tpr ,label='AB =%.3f' % (AB_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

from sklearn.ensemble import RandomForestClassifier

# define the model
RF = RandomForestClassifier()
RF.fit(X_train, Y_train)
prediction = RF.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)

#Printing precision, recall and f1_scores

print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))

# RF is the trained random forest model from above

# Get predictions from the random forest model

Y_pred_RF = RF.predict(X_test)

# Create the confusion matrix

cm_RF = confusion_matrix(Y_test, Y_pred_RF)

# Display the confusion matrix using ConfusionMatrixDisplay

disp_RF = ConfusionMatrixDisplay(confusion_matrix=cm_RF,
                                 display_labels=RF.classes_)
disp_RF.plot(cmap=plt.cm.Greens)
plt.xlabel('Predicted label', color='black')
plt.ylabel('True label', color='black')
plt.gca().tick_params(axis='x', colors='white')  # Modify ticks on x-axis
plt.gca().tick_params(axis='y', colors='white')  # Modify ticks on y-axis
plt.gcf().set_size_inches(10, 6)
plt.savefig("RF.png", dpi=600)
plt.show()

RF_probs = RF.predict_proba(X_test)
RF_probs = RF_probs[:,1]
RF_auc = roc_auc_score(Y_test, RF_probs)
RF_fpr, RF_tpr, _ = roc_curve(Y_test, RF_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(RF_fpr, RF_tpr ,label='RF =%.3f' % (RF_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
#plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

#####Combined AUROC and PR Curves

KNN_probs = KNN.predict_proba(X_test)
AB_probs = AB.predict_proba(X_test)
DT_probs = DT.predict_proba(X_test)
LR_probs = LR.predict_proba(X_test)
RF_probs = RF.predict_proba(X_test)
NB_probs = NB.predict_proba(X_test)
MLP_probs = MLP.predict_proba(X_test)
svm_linear_probs =svm_linear.predict_proba(X_test)
svm_rbf_probs = svm_rbf.predict_proba(X_test)
svm_poly_probs = svm_poly.predict_proba(X_test)

# keep probabilities for the positive outcome only

KNN_probs = KNN_probs[:, 1]
AB_probs = AB_probs[:, 1]
DT_probs = DT_probs[:, 1]
LR_probs = LR_probs[:, 1]
RF_probs = RF_probs[:, 1]
NB_probs = NB_probs[:, 1]
MLP_probs = MLP_probs[:,1]
svm_linear_probs = svm_linear_probs[:, 1]
svm_poly_probs = svm_poly_probs[:, 1]
svm_rbf_probs = svm_rbf_probs[:, 1]

# calculate scores
KNN_auc = roc_auc_score(Y_test, KNN_probs)
AB_auc = roc_auc_score(Y_test, AB_probs)
DT_auc = roc_auc_score(Y_test, DT_probs)
LR_auc = roc_auc_score(Y_test, LR_probs)
NB_auc = roc_auc_score(Y_test, NB_probs)
RF_auc = roc_auc_score(Y_test, RF_probs)
MLP_auc = roc_auc_score(Y_test, MLP_probs)
svm_poly_auc = roc_auc_score(Y_test, svm_poly_probs)
svm_linear_auc = roc_auc_score(Y_test, svm_linear_probs)
svm_rbf_auc = roc_auc_score(Y_test, svm_rbf_probs)

# calculate roc curves

KNN_fpr, KNN_tpr, _ = roc_curve(Y_test, KNN_probs)
AB_fpr, AB_tpr, _ = roc_curve(Y_test, AB_probs)
DT_fpr, DT_tpr, _ = roc_curve(Y_test, DT_probs)
NB_fpr, NB_tpr, _ = roc_curve(Y_test, NB_probs)
LR_fpr, LR_tpr, _ = roc_curve(Y_test, LR_probs)
RF_fpr, RF_tpr, _ = roc_curve(Y_test, RF_probs)
MLP_fpr, MLP_tpr, _ = roc_curve(Y_test, MLP_probs)
svm_linear_fpr, svm_linear_tpr, _ = roc_curve(Y_test,
svm_linear_probs)
svm_poly_fpr, svm_poly_tpr, _ = roc_curve(Y_test, svm_poly_probs)
svm_rbf_fpr, svm_rbf_tpr, _ = roc_curve(Y_test, svm_rbf_probs)

# plot the roc curve for the model

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(KNN_fpr, KNN_tpr ,label='KNN =%.3f' % (KNN_auc))
pyplot.plot(AB_fpr, AB_tpr ,label='AB =%.3f' % (AB_auc))
pyplot.plot(NB_fpr, NB_tpr ,label='NB =%.3f' % (NB_auc))
pyplot.plot(DT_fpr, DT_tpr ,label='DT =%.3f' % (DT_auc))
pyplot.plot(LR_fpr, LR_tpr ,label='LR =%.3f' % (LR_auc))
pyplot.plot(RF_fpr, RF_tpr ,label='RF =%.3f' % (RF_auc))
pyplot.plot(MLP_fpr, MLP_tpr ,label='MLP =%.3f' % (MLP_auc))
pyplot.plot(svm_linear_fpr, svm_linear_tpr ,label='SVM_Linear =%.3f' %
(svm_linear_auc))
pyplot.plot(svm_poly_fpr, svm_poly_tpr ,label='SVM_Poly =%.3f' %
(svm_poly_auc))
pyplot.plot(svm_rbf_fpr, svm_rbf_tpr ,label='SVM_RBF =%.3f' %
(svm_rbf_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('True Positive Rate', fontsize=15)
pyplot.xlabel('False Positive Rate', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
plt.savefig("AUC_ROC.png", dpi = 600)
pyplot.show()

from sklearn.metrics import precision_score

from sklearn.metrics import recall_score
from sklearn.metrics import precision_recall_curve

KNN_precision, KNN_recall, _ = precision_recall_curve(Y_test, KNN_probs)
AB_precision, AB_recall, _ = precision_recall_curve(Y_test, AB_probs)
DT_precision, DT_recall, _ = precision_recall_curve(Y_test, DT_probs)
LR_precision, LR_recall, _ = precision_recall_curve(Y_test, LR_probs)
NB_precision, NB_recall, _ = precision_recall_curve(Y_test, NB_probs)
RF_precision, RF_recall, _ = precision_recall_curve(Y_test, RF_probs)
MLP_precision, MLP_recall, _ = precision_recall_curve(Y_test, MLP_probs)
svm_linear_precision, svm_linear_recall, _ = precision_recall_curve(Y_test, svm_linear_probs)
svm_poly_precision, svm_poly_recall, _ = precision_recall_curve(Y_test, svm_poly_probs)
svm_rbf_precision, svm_rbf_recall, _ = precision_recall_curve(Y_test, svm_rbf_probs)

from sklearn.metrics import auc

# calculate the precision-recall auc
KNN_auc = auc(KNN_recall, KNN_precision)
AB_auc = auc(AB_recall, AB_precision)
DT_auc = auc(DT_recall, DT_precision)
NB_auc = auc(NB_recall, NB_precision)
LR_auc = auc(LR_recall, LR_precision)
RF_auc = auc(RF_recall, RF_precision)
MLP_auc = auc(MLP_recall, MLP_precision)
svm_linear_auc = auc(svm_linear_recall, svm_linear_precision)
svm_poly_auc = auc(svm_poly_recall, svm_poly_precision)
svm_rbf_auc = auc(svm_rbf_recall, svm_rbf_precision)

# plot the precision-recall curve for each model (recall on the x-axis, precision on the y-axis)

fig = pyplot.figure(figsize=(7, 5))
pyplot.plot(KNN_recall, KNN_precision, label='KNN =%.3f' % (KNN_auc))
pyplot.plot(AB_recall, AB_precision, label='AB =%.3f' % (AB_auc))
pyplot.plot(NB_recall, NB_precision, label='NB =%.3f' % (NB_auc))
pyplot.plot(DT_recall, DT_precision, label='DT =%.3f' % (DT_auc))
pyplot.plot(LR_recall, LR_precision, label='LR =%.3f' % (LR_auc))
pyplot.plot(RF_recall, RF_precision, label='RF =%.3f' % (RF_auc))
pyplot.plot(MLP_recall, MLP_precision, label='MLP =%.3f' % (MLP_auc))
pyplot.plot(svm_linear_recall, svm_linear_precision, label='SVM_Linear =%.3f' % (svm_linear_auc))
pyplot.plot(svm_poly_recall, svm_poly_precision, label='SVM_Poly =%.3f' % (svm_poly_auc))
pyplot.plot(svm_rbf_recall, svm_rbf_precision, label='SVM_RBF =%.3f' % (svm_rbf_auc))
params = {'legend.fontsize': 10,
'legend.handlelength': 2}
pyplot.rcParams.update(params)
pyplot.legend()
pyplot.ylabel('Precision', fontsize=15)
pyplot.xlabel('Recall', fontsize=15)
pyplot.xticks(fontsize=12)
pyplot.yticks(fontsize=12)
#Show legend
pyplot.legend() #
plt.savefig("PR_curve.png", dpi = 600)
pyplot.show()
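Each point on a precision-recall curve comes from one decision threshold applied to the predicted probabilities. The plain-Python sketch below computes those points for made-up scores. The helper name precision_recall_at is hypothetical, and returning a precision of 1.0 when nothing is predicted positive is a convention assumed here.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when every score >= threshold is predicted as class 1."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, pred))
    precision = tp / (tp + fp) if (tp + fp) else 1.0   # assumed convention
    recall = tp / (tp + fn)
    return precision, recall


y_true = [0, 0, 1, 1]             # made-up labels
scores = [0.1, 0.4, 0.35, 0.8]    # made-up predicted probabilities of class 1

points = [precision_recall_at(y_true, scores, thr) for thr in sorted(set(scores))]
for thr, (precision, recall) in zip(sorted(set(scores)), points):
    print(thr, round(precision, 3), recall)
```

Raising the threshold usually trades recall for precision, which is why recall is placed on the x-axis and precision on the y-axis when the curve is drawn.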

13 pages
Numpy Data Analysis and Visualisation With Python
No ratings yet
Numpy Data Analysis and Visualisation With Python
75 pages
Lab 2 DWM
No ratings yet
Lab 2 DWM
13 pages
DV Lab2 Updated
No ratings yet
DV Lab2 Updated
12 pages
Module 1.Foundations of Data Science
No ratings yet
Module 1.Foundations of Data Science
17 pages
StatisticsMachineLearningPythonDraft PDF
100% (1)
StatisticsMachineLearningPythonDraft PDF
219 pages
EX - No: 1 Date:: Download Install Explore The Features of Numpy, Scipy, Jupiter, Statsmodels and Pandas Packages
No ratings yet
EX - No: 1 Date:: Download Install Explore The Features of Numpy, Scipy, Jupiter, Statsmodels and Pandas Packages
38 pages
NumPy and Pandas
No ratings yet
NumPy and Pandas
72 pages
Python Notes
No ratings yet
Python Notes
16 pages
Eda Unit 1
No ratings yet
Eda Unit 1
7 pages
FDS Record-1-4
No ratings yet
FDS Record-1-4
18 pages
Machine Learning Using Phython
No ratings yet
Machine Learning Using Phython
25 pages
Data Analysis and Visualisation With Python
No ratings yet
Data Analysis and Visualisation With Python
75 pages
Data Processing With Python and R
No ratings yet
Data Processing With Python and R
6 pages
Fds Lab Manual
No ratings yet
Fds Lab Manual
59 pages
Python Libraries
No ratings yet
Python Libraries
27 pages
NumPy & Pandas
No ratings yet
NumPy & Pandas
27 pages
Cs3361-Data Science Lab Manual
No ratings yet
Cs3361-Data Science Lab Manual
44 pages
Python
No ratings yet
Python
132 pages
Pandas Notes
No ratings yet
Pandas Notes
54 pages
quickStartGuide Py
No ratings yet
quickStartGuide Py
30 pages
Dav 2 Unit
No ratings yet
Dav 2 Unit
55 pages
Pandas What Can Pandas Do For You ?: Statsmodels SM Seaborn Sns
No ratings yet
Pandas What Can Pandas Do For You ?: Statsmodels SM Seaborn Sns
9 pages
AML LAB MANUAL Yash
No ratings yet
AML LAB MANUAL Yash
60 pages
oG1M8adGXOGe DHBiQVrXgXHO6GrHU01tHWZgd tpRqUW65xGX9ufzrZMtM6hjBWlvlYViPn6r2Cgghq2M8oiXNNdf0HeL-DQvJKWM
No ratings yet
oG1M8adGXOGe DHBiQVrXgXHO6GrHU01tHWZgd tpRqUW65xGX9ufzrZMtM6hjBWlvlYViPn6r2Cgghq2M8oiXNNdf0HeL-DQvJKWM
42 pages
Unit6 - Working With Data
No ratings yet
Unit6 - Working With Data
29 pages
DAL EXT 1 and 2
No ratings yet
DAL EXT 1 and 2
125 pages
Python for Data Science: Data Science Mastery by Nikhil Khan, #1
From Everand
Python for Data Science: Data Science Mastery by Nikhil Khan, #1
Nikhil Khan
No ratings yet
Introduction to PHP, Part 2, Second Edition
From Everand
Introduction to PHP, Part 2, Second Edition
Adam Majczak
No ratings yet
Internship - Final Report
No ratings yet
Internship - Final Report
22 pages
Maptek Vulcan Pit Optimiser
No ratings yet
Maptek Vulcan Pit Optimiser
1 page
Las Ict Csa 9 Q3 Week 1
No ratings yet
Las Ict Csa 9 Q3 Week 1
10 pages
Fujifilm Amulet Felicia
No ratings yet
Fujifilm Amulet Felicia
2 pages
ASSIGNMENT BQS554 Mac - Ogos 2021 B
No ratings yet
ASSIGNMENT BQS554 Mac - Ogos 2021 B
4 pages
Energy Efficient Performance Analysis of NOMA For Wireless Down-Link in Heterogeneous Networks Under Imperfect SIC
No ratings yet
Energy Efficient Performance Analysis of NOMA For Wireless Down-Link in Heterogeneous Networks Under Imperfect SIC
6 pages
DSP Lab Manual 2021 22
No ratings yet
DSP Lab Manual 2021 22
66 pages
TOW Waste-Calendar 2024-To-2025 Printable 8.5x11in FINAL
No ratings yet
TOW Waste-Calendar 2024-To-2025 Printable 8.5x11in FINAL
4 pages
2024-03-16
No ratings yet
2024-03-16
6 pages
IOT LabCombined Ex1 Ex10 21BAI1509 MADHURMANNComplete
No ratings yet
IOT LabCombined Ex1 Ex10 21BAI1509 MADHURMANNComplete
60 pages
Smartphone Cradle Receiver: Operating Instructions Manual de Instrucciones
No ratings yet
Smartphone Cradle Receiver: Operating Instructions Manual de Instrucciones
56 pages
Lab 4
No ratings yet
Lab 4
7 pages
Unit 1 BCT Prof. Nilima S. Dandge
No ratings yet
Unit 1 BCT Prof. Nilima S. Dandge
13 pages
Autoinvoice Setup
No ratings yet
Autoinvoice Setup
10 pages
NIC52918763 20241201T174511 Ecard
No ratings yet
NIC52918763 20241201T174511 Ecard
1 page
DP WebCam 15121 Drivers
No ratings yet
DP WebCam 15121 Drivers
1,130 pages
The Command Center-Lesson 1
No ratings yet
The Command Center-Lesson 1
31 pages
Certificado Modbus TCP
No ratings yet
Certificado Modbus TCP
3 pages
Cargill Rosario Paraguay 777 (ARG-002434554)
No ratings yet
Cargill Rosario Paraguay 777 (ARG-002434554)
2 pages
Sagar Devops Resume
No ratings yet
Sagar Devops Resume
3 pages
5-1 Phaser 3635MFP Parts List Draft 4
No ratings yet
5-1 Phaser 3635MFP Parts List Draft 4
34 pages
Types of Computing Devices
No ratings yet
Types of Computing Devices
3 pages
Robotics TP
No ratings yet
Robotics TP
15 pages
Omnicom VMS Solution
No ratings yet
Omnicom VMS Solution
2 pages
FDSD Reviewer
No ratings yet
FDSD Reviewer
5 pages
5 - Ministry On Building Permit For - Qatar Telecom (OOREDOO)
No ratings yet
5 - Ministry On Building Permit For - Qatar Telecom (OOREDOO)
1 page
4 Bit Even Synchr. Counter Detail
100% (1)
4 Bit Even Synchr. Counter Detail
5 pages
Exception Handling in Java
No ratings yet
Exception Handling in Java
18 pages
CTR Nsa Network Infrastructure Security Guide 20220615
No ratings yet
CTR Nsa Network Infrastructure Security Guide 20220615
60 pages
Tourism and Travel Management Formatted Paper
No ratings yet
Tourism and Travel Management Formatted Paper
10 pages