Student Name: Gaurav Raut    University ID: 2038584
6CS012 Workshop 4
Question 1:
Train a scikit-learn MLPClassifier to classify the dataset.
In [ ]: # Importing the required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
Out[32]: (200, 4)
Out[33]: (200,)
Out[35]: 0
Out[40]: target
0 0
1 0
2 1
3 0
4 0
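The cells that produced the outputs above did not survive the export. Below is a minimal sketch of how a (200, 4) feature matrix, a (200,) target vector and the inspected DataFrame could be generated; the make_classification parameters (other than n_samples and n_features), the three-class setting inferred from the 3x3 confusion matrices later in the notebook, and the name df are assumptions.

In [ ]: # Hypothetical reconstruction of the dataset-creation cells.
        # n_samples/n_features follow Out[32] and Out[33]; three classes are
        # inferred from the 3x3 confusion matrices below; the rest is assumed.
        X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                                   n_redundant=1, n_classes=3, random_state=1)
        print(X.shape)                      # (200, 4)
        print(y.shape)                      # (200,)

        # Pack features and target into a DataFrame for inspection and plotting.
        df = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(4)])
        df["target"] = y
        print(df.isnull().sum().sum())      # 0 -> no missing values
        df[["target"]].head()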
Relationship Plots
Here, the features were plotted to visualize their relationships with one another. Statistical analysis is the
process of understanding how the variables in a dataset are related and how they depend on one another. By
visualizing the data properly, we can see patterns and trends that indicate these relationships.
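The relationship plots themselves were lost in the export; a minimal sketch of how they could be produced with seaborn is shown below, assuming the DataFrame df and its target column from the sketch above.

In [ ]: # Hypothetical reconstruction: pair plot of the features, coloured by class.
        sns.pairplot(df, hue="target", diag_kind="hist")
        plt.show()

        # A correlation heatmap is another common view of pairwise relationships.
        sns.heatmap(df.drop(columns="target").corr(), annot=True, cmap="coolwarm")
        plt.show()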
(160, 5) (40, 5)
In [59]: classifier
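The cells that defined, trained and evaluated this classifier were lost in the export. The sketch below shows how the (160, 5) / (40, 5) split reported above and the baseline classifier, confusion matrix and classification report could have been produced; the test size, the baseline hyper-parameters and the variable names (train, test, classifier, matrix) are assumptions, not the original values.

In [ ]: # Hypothetical reconstruction of the train/test split and baseline MLP.
        train, test = train_test_split(df, test_size=0.2, random_state=1)
        print(train.shape, test.shape)      # (160, 5) (40, 5)

        X_train, y_train = train.drop(columns="target"), train["target"]
        X_test, y_test = test.drop(columns="target"), test["target"]

        # Baseline classifier -- the original settings are not recoverable,
        # so scikit-learn defaults are used apart from max_iter.
        classifier = MLPClassifier(max_iter=1000, random_state=1)
        classifier.fit(X_train, y_train)

        y_pred = classifier.predict(X_test)
        matrix = confusion_matrix(y_test, y_pred)
        print(matrix)
        print(classification_report(y_test, y_pred))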
[[ 9  5  3]
 [ 2 14  0]
 [ 0  0 17]]

              precision    recall  f1-score   support

    accuracy                           0.80        50
   macro avg       0.80      0.80      0.79        50
weighted avg       0.85      0.80      0.81        50
Question 2:
Write a paragraph to explain how the confusion matrix and
other metrics indicate whether the MLP or the decision tree is
most applicable.
The confusion matrix is extremely useful for measuring recall, precision, specificity and accuracy, and most
importantly for building the AUC-ROC curve.....
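As an illustration of how these metrics follow from the confusion matrix, the short sketch below derives per-class precision, recall and the overall accuracy directly from the baseline matrix reported above (the variable name matrix is an assumption).

In [ ]: # Per-class metrics derived directly from the baseline confusion matrix.
        matrix = np.array([[ 9,  5,  3],
                           [ 2, 14,  0],
                           [ 0,  0, 17]])

        tp = np.diag(matrix)                    # true positives per class
        precision = tp / matrix.sum(axis=0)     # column sums = predicted counts
        recall = tp / matrix.sum(axis=1)        # row sums = actual counts
        accuracy = tp.sum() / matrix.sum()

        print("precision:", np.round(precision, 2))
        print("recall:   ", np.round(recall, 2))
        print("accuracy: ", round(accuracy, 2)) # 0.80, matching the report above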
Question 3:
Experiment with 3 hyper-parameters included in the lecture and
write a short summary of what you have learnt.
Since the MLP appears to be well suited to this classification task, we can experiment with changing some of its
hyper-parameters to see whether the performance of the model improves.
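The three experiments below vary the hyper-parameters by hand. A sweep like this can also be organised systematically with GridSearchCV; the sketch below is not part of the original notebook and assumes the X_train/y_train variables from the earlier sketch.

In [ ]: # Hypothetical, systematic version of the manual experiments below.
        from sklearn.model_selection import GridSearchCV

        param_grid = {
            "hidden_layer_sizes": [(350,), (400,), (500,)],
            "solver": ["adam", "sgd"],
            "learning_rate": ["constant", "adaptive", "invscaling"],
        }
        search = GridSearchCV(MLPClassifier(max_iter=1000, random_state=1),
                              param_grid, cv=3, scoring="accuracy")
        search.fit(X_train, y_train)
        print(search.best_params_, search.best_score_)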
Experiment 1:
Changing the following parameters: hidden layer size: 500, batch_size: auto, activation function: relu, solver: adam.
In [66]: classifier1 = MLPClassifier(hidden_layer_sizes=(500,), max_iter=1000, activation='relu',
                                     solver='adam', random_state=1, batch_size='auto',
                                     learning_rate='constant',  # assumed: this argument was truncated in the export
                                     learning_rate_init=0.001, verbose=True)
In [67]: classifier1
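The training and evaluation cells for this experiment were lost in the export. A minimal sketch of how matrix1 (printed further below) could have been produced follows, assuming the X_train/X_test split from the earlier sketch; matrix2 and matrix3 in the later experiments would be computed the same way.

In [ ]: # Hypothetical reconstruction of the evaluation for experiment 1.
        classifier1.fit(X_train, y_train)
        y_pred1 = classifier1.predict(X_test)

        matrix1 = confusion_matrix(y_test, y_pred1)
        print(classification_report(y_test, y_pred1))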
In [89]: print(matrix1)
[[10  5  4]
 [ 1 14  0]
 [ 0  0 16]]

              precision    recall  f1-score   support

    accuracy                           0.80        50
   macro avg       0.82      0.82      0.79        50
weighted avg       0.87      0.80      0.82        50
Experiment 2:
Changing the following parameters: hidden layer size: 350, batch_size: auto, learning_rate: adaptive, activation function: relu, solver: sgd.
In [80]: classifier2 = MLPClassifier(hidden_layer_sizes=(350,), max_iter=1000, activation='relu',
                                     solver='sgd', random_state=1, batch_size='auto',
                                     learning_rate='adaptive', learning_rate_init=0.001,
                                     verbose=True)
In [91]: print(matrix2)
[[ 9  2  4]
 [ 1 17  0]
 [ 1  0 16]]

              precision    recall  f1-score   support

    accuracy                           0.84        50
   macro avg       0.83      0.84      0.83        50
weighted avg       0.87      0.84      0.85        50
Experiment 3:
Changing the following parameters: hidden layer size: 400, batch_size: auto, learning_rate: invscaling, activation function: relu, solver: adam (default), max_iter: 500.
In [93]: classifier3 = MLPClassifier(hidden_layer_sizes=(400,), max_iter=500, activation='relu',
                                     random_state=1, batch_size='auto', learning_rate='invscaling',
                                     learning_rate_init=0.001, verbose=True)
In [94]: classifier3
In [99]: print(matrix3)
[[10  5  3]
 [ 1 14  0]
 [ 0  0 17]]

              precision    recall  f1-score   support

    accuracy                           0.82        50
   macro avg       0.83      0.83      0.81        50
weighted avg       0.88      0.82      0.83        50
End of Assignment!