10/17/24, 4:06 PM Assignment3

In [2]: #Pranav Kulkarni(T512004)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [4]: #Loading data into dataframe

data = pd.read_csv("Admission_Predict.csv")

In [6]: data.head()

Out[6]:
     Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
0             1        337          118                  4  4.5  4.5  9.65         1             0.92
1             2        324          107                  4  4.0  4.5  8.87         1             0.76
2             3        316          104                  3  3.0  3.5  8.00         1             0.72
3             4        322          110                  3  3.5  2.5  8.67         1             0.80
4             5        314          103                  2  2.0  3.0  8.21         0             0.65

In [8]: data.tail()

Out[8]:
     Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
395         396        324          110                  3  3.5  3.5  9.04         1             0.82
396         397        325          107                  3  3.0  3.5  9.11         1             0.84
397         398        330          116                  4  5.0  4.5  9.45         1             0.91
398         399        312          103                  3  3.5  4.0  8.78         0             0.67
399         400        333          117                  4  5.0  4.0  9.66         1             0.95

In [10]: data.shape

Out[10]: (400, 9)

In [12]: data.columns

Out[12]: Index(['Serial No.', 'GRE Score', 'TOEFL Score', 'University Rating', 'SOP',
'LOR ', 'CGPA', 'Research', 'Chance of Admit '],
dtype='object')

In [14]: data.drop("Serial No.",axis=1,inplace=True)

In [16]: data

Out[16]:
     GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
0          337          118                  4  4.5  4.5  9.65         1             0.92
1          324          107                  4  4.0  4.5  8.87         1             0.76
2          316          104                  3  3.0  3.5  8.00         1             0.72
3          322          110                  3  3.5  2.5  8.67         1             0.80
4          314          103                  2  2.0  3.0  8.21         0             0.65
...        ...          ...                ...  ...  ...   ...       ...              ...
395        324          110                  3  3.5  3.5  9.04         1             0.82
396        325          107                  3  3.0  3.5  9.11         1             0.84
397        330          116                  4  5.0  4.5  9.45         1             0.91
398        312          103                  3  3.5  4.0  8.78         0             0.67
399        333          117                  4  5.0  4.0  9.66         1             0.95

400 rows × 8 columns

In [18]: data["Chance of Admit "]=data["Chance of Admit "].apply(lambda x: 1 if x>0.5 else 0)
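
The cell above turns the continuous admission chance into a binary label. A minimal sketch of that thresholding in plain Python, using the five `Chance of Admit` values shown by `data.head()`. The helper name `to_label` and its `threshold` parameter are illustrative and not part of the notebook, and the sketch assumes that values at or below the cutoff map to 0, which is consistent with the 0/1 counts printed later.

```python
def to_label(chance, threshold=0.5):
    """Return 1 when the chance exceeds the threshold, otherwise 0 (assumed)."""
    return 1 if chance > threshold else 0

# First five "Chance of Admit" values from data.head()
head_chances = [0.92, 0.76, 0.72, 0.80, 0.65]
head_labels = [to_label(c) for c in head_chances]
print(head_labels)  # [1, 1, 1, 1, 1]

# A value exactly at the threshold is labelled 0 because the comparison is strict
print(to_label(0.5))  # 0
```

With a 0.5 cutoff most rows end up in class 1, which matters when reading the accuracy figures further down.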

In [20]: data

Out[20]:
     GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
0          337          118                  4  4.5  4.5  9.65         1                1
1          324          107                  4  4.0  4.5  8.87         1                1
2          316          104                  3  3.0  3.5  8.00         1                1
3          322          110                  3  3.5  2.5  8.67         1                1
4          314          103                  2  2.0  3.0  8.21         0                1
...        ...          ...                ...  ...  ...   ...       ...              ...
395        324          110                  3  3.5  3.5  9.04         1                1
396        325          107                  3  3.0  3.5  9.11         1                1
397        330          116                  4  5.0  4.5  9.45         1                1
398        312          103                  3  3.5  4.0  8.78         0                1
399        333          117                  4  5.0  4.0  9.66         1                1

400 rows × 8 columns

In [22]: #Find missing values

print("Missing values:\n")
data.isnull().sum()

Missing values:

Out[22]: GRE Score            0
         TOEFL Score          0
         University Rating    0
         SOP                  0
         LOR                  0
         CGPA                 0
         Research             0
         Chance of Admit      0
         dtype: int64

In [24]: data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 400 entries, 0 to 399
Data columns (total 8 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   GRE Score          400 non-null    int64
 1   TOEFL Score        400 non-null    int64
 2   University Rating  400 non-null    int64
 3   SOP                400 non-null    float64
 4   LOR                400 non-null    float64
 5   CGPA               400 non-null    float64
 6   Research           400 non-null    int64
 7   Chance of Admit    400 non-null    int64
dtypes: float64(3), int64(5)
memory usage: 25.1 KB

In [26]: data.corr()

Out[26]:
                   GRE Score  TOEFL Score  University Rating       SOP       LOR      CGPA  Research  Chance of Admit
GRE Score           1.000000     0.835977           0.668976  0.612831  0.557555  0.833060  0.580391         0.390875
TOEFL Score         0.835977     1.000000           0.695590  0.657981  0.567721  0.828417  0.489858         0.393121
University Rating   0.668976     0.695590           1.000000  0.734523  0.660123  0.746479  0.447783         0.279316
SOP                 0.612831     0.657981           0.734523  1.000000  0.729593  0.718144  0.444029         0.285939
LOR                 0.557555     0.567721           0.660123  0.729593  1.000000  0.670211  0.396859         0.353341
CGPA                0.833060     0.828417           0.746479  0.718144  0.670211  1.000000  0.521654         0.455949
Research            0.580391     0.489858           0.447783  0.444029  0.396859  0.521654  1.000000         0.216193
Chance of Admit     0.390875     0.393121           0.279316  0.285939  0.353341  0.455949  0.216193         1.000000
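
Each entry of `data.corr()` is a Pearson correlation coefficient, which is the default method in pandas. The sketch below computes the same statistic by hand for GRE versus TOEFL on the five rows shown by `data.head()`; the function name `pearson` is illustrative. Because it uses only five rows, the value differs from the 0.835977 reported for all 400 rows.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# GRE and TOEFL scores of the five rows shown by data.head()
gre = [337, 324, 316, 322, 314]
toefl = [118, 107, 104, 110, 103]

r_head = pearson(gre, toefl)
print(round(r_head, 3))
```

Since the target column was binarized before this cell ran, the `Chance of Admit` row and column measure correlation with the 0/1 label rather than with the original probability.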

In [28]: plt.figure(figsize=(6,6))
sns.heatmap(data.corr(), annot=True, cmap='Oranges')
plt.show()

In [30]: data.hist(bins = 50,figsize = (15,11));

In [32]: data_admit = data[data['Chance of Admit ']==1]

data_non_admit = data[data['Chance of Admit ']==0]
print("Admitted count : " ,data_admit.shape[0])
print("Non - Admitted count : " ,data_non_admit.shape[0])

Admitted count : 365

Non - Admitted count : 35
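
The two counts above show a strongly imbalanced target. As a rough sketch of what that implies, the arithmetic below computes the minority share and the accuracy of a trivial model that always predicts the majority class. The variable names are illustrative; the counts are the ones printed above.

```python
admitted = 365      # "Admitted count" printed above
non_admitted = 35   # "Non - Admitted count" printed above
total = admitted + non_admitted

# Share of the minority class in the whole dataset
minority_share = non_admitted / total

# Accuracy of a model that always predicts the majority class
majority_baseline = max(admitted, non_admitted) / total

print(f"minority share: {minority_share:.4f}")       # 0.0875
print(f"majority baseline: {majority_baseline:.4f}")  # 0.9125
```

Any classifier trained on this target should be compared against that 0.9125 baseline rather than against 0.5.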

In [34]: data['Chance of Admit '].value_counts().plot(kind='pie',figsize=(5,5),autopct='%

plt.title("Chance of Admit in total")
plt.show()

In [36]: data['LOR '].value_counts().plot(kind='pie',figsize=(5,5),autopct='%1.1f%%')

plt.title("LOR Point Chart")
plt.show()

In [38]: data['SOP'].value_counts().plot(kind='pie',figsize=(5,5),autopct='%1.1f%%')
plt.title("SOP Point Chart")
plt.show()

In [40]: data["University Rating"].value_counts().plot(kind='pie',figsize=(5,5),autopct='

plt.title("University Rating Chart")
plt.show()

In [42]: #highest GRE score

print("maximum GRE Score : ",data['GRE Score'].max())
#lowest GRE score
print("minimum GRE Score : ",data['GRE Score'].min())

maximum GRE Score : 340

minimum GRE Score : 290

In [44]: sns.pairplot(data,hue = "Research")

Out[44]: <seaborn.axisgrid.PairGrid at 0x24e2de89940>

In [46]: sns.pairplot(data,hue = "SOP");

In [48]: sns.pairplot(data,hue = "University Rating");

In [50]: sns.pairplot(data)

Out[50]: <seaborn.axisgrid.PairGrid at 0x24e375efe00>

In [52]: X= data.drop("Chance of Admit ",axis =1 )

y= data["Chance of Admit "]

In [54]: X.nunique()

Out[54]: GRE Score             49
         TOEFL Score           29
         University Rating      5
         SOP                    9
         LOR                    9
         CGPA                 168
         Research               2
         dtype: int64

In [56]: from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y,test_size = 0.2, random

# Shape of train Test Split

print(X_train.shape,y_train.shape)
print(X_test.shape,y_test.shape)

(320, 7) (320,)
(80, 7) (80,)
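
With only 35 rows in class 0, a purely random 80/20 split can put noticeably more or fewer of them in the test set than their overall share suggests. One common remedy is a stratified split, which scikit-learn's `train_test_split` supports through its `stratify` parameter; whether the notebook's call uses it cannot be seen from the truncated line. The sketch below illustrates the idea in plain Python on a label list with the same class counts. The function `stratified_split` and the seed value are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in both parts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Same class counts as the dataset: 365 admitted, 35 not admitted
labels = [1] * 365 + [0] * 35
train_idx, test_idx = stratified_split(labels)

test_zero = sum(1 for i in test_idx if labels[i] == 0)
print(len(train_idx), len(test_idx), test_zero)  # 320 80 7
```

Under stratification the test set holds 7 class-0 rows, matching the 8.75% minority share of the full data.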

In [58]: from sklearn.tree import DecisionTreeClassifier

# instantiate the model

tree = DecisionTreeClassifier()

# fit the model

tree.fit(X_train, y_train)

Out[58]: DecisionTreeClassifier()
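
`DecisionTreeClassifier()` with default arguments chooses splits by Gini impurity. As a small sketch of that quantity, the code below evaluates it for the root node of the training set. The training class counts are an inference, not printed by the notebook: 365 and 35 rows overall, minus the 70 and 10 test rows listed as support in the classification report, leave 295 and 25. The function name `gini` is illustrative.

```python
def gini(counts):
    """Gini impurity 1 - sum(p_k^2) for a list of per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Assumed training-set class counts: [non-admit, admit] = [35 - 10, 365 - 70]
root_counts = [25, 295]
root_gini = gini(root_counts)
print(round(root_gini, 4))  # 0.144

print(gini([0, 40]))   # 0.0, a pure node
print(gini([20, 20]))  # 0.5, the two-class maximum
```

The root is already fairly pure (0.144 against a maximum of 0.5), so the tree has relatively little impurity left to remove with its splits.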

In [60]: y_train_tree = tree.predict(X_train)

y_test_tree = tree.predict(X_test)

In [62]: from sklearn.metrics import accuracy_score

#computing the accuracy of the model performance
acc_train_tree = accuracy_score(y_train,y_train_tree)
acc_test_tree = accuracy_score(y_test,y_test_tree)

print("Decision Tree : Accuracy on training Data: {:.3f}".format(acc_train_tree))

print("Decision Tree : Accuracy on test Data: {:.3f}".format(acc_test_tree))

Decision Tree : Accuracy on training Data: 1.000

Decision Tree : Accuracy on test Data: 0.863

In [64]: from sklearn.metrics import classification_report

#computing the classification report of the model

print(classification_report(y_test, y_test_tree))

              precision    recall  f1-score   support

           0       0.44      0.40      0.42        10
           1       0.92      0.93      0.92        70

    accuracy                           0.86        80
   macro avg       0.68      0.66      0.67        80
weighted avg       0.86      0.86      0.86        80
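
To make the report easier to read, the sketch below recomputes its per-class figures from a 2x2 table of counts. The counts are an assumption: they are not printed by the notebook, but they are the integers consistent with the supports (10 and 70) and the rounded precision and recall values above. The helper names are illustrative.

```python
# Assumed confusion counts, rows = actual class, columns = predicted class.
# Inferred from the report's supports and rounded scores, not printed by the notebook.
cm = [[4, 6],
      [5, 65]]

def precision(cm, k):
    """Correct predictions of class k divided by all predictions of class k."""
    predicted_k = sum(row[k] for row in cm)
    return cm[k][k] / predicted_k

def recall(cm, k):
    """Correct predictions of class k divided by all actual members of class k."""
    return cm[k][k] / sum(cm[k])

def f1(cm, k):
    """Harmonic mean of precision and recall for class k."""
    p, r = precision(cm, k), recall(cm, k)
    return 2 * p * r / (p + r)

total = sum(sum(row) for row in cm)
accuracy = (cm[0][0] + cm[1][1]) / total
baseline = sum(cm[1]) / total  # accuracy of always predicting class 1

for k in (0, 1):
    print(k, round(precision(cm, k), 2), round(recall(cm, k), 2), round(f1(cm, k), 2))
print("accuracy", accuracy, "baseline", baseline)
```

Under these counts, always predicting class 1 would score 0.875 on this test split, slightly above the tree's 0.8625, so the class-0 row (recall 0.40) says more about the model than the overall accuracy does.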

In [66]: plt.barh(X.columns,tree.feature_importances_)
plt.title("Feature Importances while constructing Tree")
plt.show()

In [68]: #visualization of Confusion Matrix

from sklearn.metrics import confusion_matrix
cm=confusion_matrix(y_test,y_test_tree)
cmn = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
fig, ax = plt.subplots(figsize=(4,4))
sns.heatmap(cmn, annot=True, fmt='.2f',cmap='Oranges')
plt.title("Confusion Matrix")
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show(block=False);

In [70]: training_accuracy = []
test_accuracy = []
# try max_depth from 1 to 15
depth = range(1,16)
for n in depth:
    tree_test = DecisionTreeClassifier(max_depth=n)
    tree_test.fit(X_train, y_train)
    # record training set accuracy
    training_accuracy.append(tree_test.score(X_train, y_train))
    # record generalization accuracy
    test_accuracy.append(tree_test.score(X_test, y_test))

#plotting the training & testing accuracy for max_depth from 1 to 15

plt.plot(depth, training_accuracy, label="training accuracy")
plt.plot(depth, test_accuracy, label="test accuracy")
plt.title("Accuracy vs max_depth")
plt.ylabel("Accuracy")
plt.xlabel("max_depth")
plt.legend();

In [72]: from sklearn.tree import export_text

from sklearn.tree import DecisionTreeClassifier

# instantiate the model

tree = DecisionTreeClassifier(max_depth=3)

# fit the model

tree.fit(X_train, y_train)
text_representation = export_text(tree)
print(text_representation)

|--- feature_5 <= 7.66

In [74]: import sklearn.tree as tr

fig = plt.figure(figsize=(20,15))
_ = tr.plot_tree(tree,
                 feature_names=X.columns,
                 class_names=np.array(["Non admit","Admit"]),
                 filled=True)
