Summary
1. The document implements a KNN classifier with Euclidean distance to predict whether passengers on the Titanic survived, based on features such as passenger class, sex, and fare.
2. Different values of K are tested; the highest testing accuracy, 81%, is achieved at K=7.
3. Confusion matrices and accuracy scores are calculated for the Euclidean, Minkowski, and Manhattan distance metrics at various K values, with the best results obtained at K=7 for Manhattan distance.


Name: Mussab Bin Shahid

Sap-Id: 2024
Assignment: Machine-Learning
Dataset: https://github.com/rashida048/Datasets

Problem
We use the Titanic dataset, in which the 'Survived' column is 1 if the passenger survived and 0 if they did not. Our goal is to predict the 'Survived' feature. The dataset is simple, and intuition alone suggests that some columns cannot help with this prediction: for example, 'PassengerId', 'Name', 'Ticket', and 'Cabin' do not seem useful for predicting whether a passenger survived.
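As a quick sketch of this column pruning and of the sex encoding used in the next section (shown on a hypothetical three-row frame standing in for the real CSV, not the actual dataset):

```python
import pandas as pd

# Hypothetical mini-frame standing in for titanic_data.csv.
titanic = pd.DataFrame({
    'PassengerId': [1, 2, 3],
    'Name': ['A', 'B', 'C'],
    'Pclass': [3, 1, 3],
    'Sex': ['male', 'female', 'female'],
    'Fare': [7.25, 71.28, 7.92],
    'Survived': [0, 1, 1],
})

# Keep only the intuitively predictive columns and encode Sex numerically.
kept = titanic[['Pclass', 'Sex', 'Fare', 'Survived']].copy()
kept['Sex'] = kept['Sex'].replace({'male': 0, 'female': 1})
print(kept['Sex'].tolist())  # [0, 1, 1]
```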

KNN implementation With Euclidean Distance


# Calculating distance using Euclidean distance
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'

# Importing the dataset
titanic = pd.read_csv('titanic_data.csv')
titanic.head(5)
titanic1 = titanic[['Pclass', 'Sex', 'Fare', 'Survived']]

# The model needs numeric input, so encode Sex as 0 for male and 1 for female
titanic1['Sex'] = titanic1.Sex.replace({'male': 0, 'female': 1})
X = titanic1[['Pclass', 'Sex', 'Fare']]
y = titanic1['Survived']

# Splitting test data and training data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Implementing the KNN classifier with Euclidean distance
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors=3, metric='euclidean')
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
y_pred

# Calculating accuracy and computing the confusion matrix
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print("Accuracy : ", accuracy_score(y_test, y_pred))
cm

# df = pd.DataFrame({'Real Values': y_test, 'Predicted Values': y_pred})
# df  # uncomment to compare the real data with the predictions

Choosing Value of K=3


from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
from sklearn.metrics import accuracy_score
print ("Accuracy : ", accuracy_score(y_test, y_pred))
cm
Accuracy : 0.7802690582959642
array([[118,  21],
       [ 28,  56]], dtype=int64)

With k=3, the testing accuracy is about 78%.

Choosing Value of K=7


Accuracy : 0.8116591928251121
array([[118,  21],
       [ 21,  63]], dtype=int64)
This is our sweet spot: at k=7 the model reaches its highest testing accuracy, 81%. I checked other K values as well:
At k=5 it is 78%
At k=9 it is 80%
At k=11 it is 79%
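The manual K sweep above can also be written as a loop. The following is a minimal from-scratch sketch of the mechanics on made-up toy points (not the Titanic data, and not the sklearn pipeline used above), just to illustrate how each K value is scored:

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    # Euclidean distance from x to every training point, sorted ascending.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)) ** 0.5, label)
        for row, label in zip(train_X, train_y)
    )
    # Majority vote among the k nearest labels.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Made-up (Pclass, Sex, Fare)-like points with 0/1 labels.
train_X = [(3, 0, 7.3), (1, 1, 71.3), (3, 1, 7.9), (1, 0, 53.1), (2, 1, 30.0), (3, 0, 8.1)]
train_y = [0, 1, 1, 0, 1, 0]
test_X = [(3, 1, 7.5), (1, 0, 60.0)]
test_y = [1, 0]

for k in (1, 3, 5):
    preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
    acc = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
    print(f"k={k}: accuracy={acc:.2f}")
```

The same loop shape works with the sklearn classifier: rebuild it with each `n_neighbors` value, refit, and record the test accuracy.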

Confusion Matrix and Accuracy


The confusion matrix is a table that shows the number of correct and incorrect predictions on a classification problem when the true values of the test set are known.
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print("Accuracy : ", accuracy_score(y_test, y_pred))
cm
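Accuracy can be read directly off the confusion matrix: the diagonal entries are the correct predictions, so accuracy is the diagonal sum over all predictions. Using the K=7 matrix reported above (rows are actual classes, columns are predicted classes, following sklearn's convention):

```python
# K=7 confusion matrix reported above.
tn, fp = 118, 21   # actual 0: predicted 0 / predicted 1
fn, tp = 21, 63    # actual 1: predicted 0 / predicted 1

total = tn + fp + fn + tp        # 223 test passengers
accuracy = (tn + tp) / total     # correct predictions over all predictions
print(round(accuracy, 4))        # → 0.8117, matching the printed score
```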
Accuracy with Minkowski Distance

It also has its sweet spot at K=7.
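This is expected: Minkowski distance with p=2 is exactly Euclidean distance, so the two metrics pick the same neighbours and the sweet spot lands at the same K. A small self-contained illustration of the general formula (the function name and points are made up for the example):

```python
def minkowski(a, b, p):
    # General Minkowski distance: (sum |x - y|^p)^(1/p).
    # p=1 gives Manhattan distance, p=2 gives Euclidean distance.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

a, b = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(minkowski(a, b, 1))  # Manhattan: |1-4| + |2-6| + |3-3| = 7.0
print(minkowski(a, b, 2))  # Euclidean: sqrt(9 + 16 + 0) = 5.0
```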

Accuracy with Manhattan k=3

Accuracy with Manhattan k=5

Accuracy with Manhattan k=7

Accuracy with Manhattan k=9


K=7 is again our sweet spot, where we get the greatest accuracy; above this value of K, accuracy declines.
