ML Assignment 2

The document is a Jupyter Notebook that works through a university admissions dataset (GRE and TOEFL scores, university rating, SOP, LOR, CGPA, and research experience) to predict whether an applicant is likely to be admitted. It uses pandas, Dask, and scikit-learn to read the data, check for missing values, drop the serial-number column, and convert the "Chance of Admit" column into a binary label with a 0.75 threshold. A decision tree (scikit-learn's DecisionTreeRegressor, fitted to the 0/1 labels) is then trained on 300 rows and evaluated on the remaining 100, where it reaches an accuracy of 86%.

Uploaded by

lucifer267302


8/1/24, 12:40 PM TEIT-10(1) - Jupyter Notebook

In [6]: import numpy as np
        import pandas as pd
        import dask.dataframe as dd
        import seaborn as sns
        import matplotlib.pyplot as mtp

In [7]: df = pd.read_csv("Admission_Predict.csv")

In [8]: df

Out[8]:
     Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
  0           1        337          118                  4  4.5  4.5  9.65         1             0.92
  1           2        324          107                  4  4.0  4.5  8.87         1             0.76
  2           3        316          104                  3  3.0  3.5  8.00         1             0.72
  3           4        322          110                  3  3.5  2.5  8.67         1             0.80
  4           5        314          103                  2  2.0  3.0  8.21         0             0.65
...         ...        ...          ...                ...  ...  ...   ...       ...              ...
395         396        324          110                  3  3.5  3.5  9.04         1             0.82
396         397        325          107                  3  3.0  3.5  9.11         1             0.84
397         398        330          116                  4  5.0  4.5  9.45         1             0.91
398         399        312          103                  3  3.5  4.0  8.78         0             0.67
399         400        333          117                  4  5.0  4.0  9.66         1             0.95

400 rows × 9 columns

In [10]: dfd = dd.read_csv("Admission_Predict.csv")

In [11]: dfd
Out[11]: Dask DataFrame Structure:
               Serial No.  GRE Score  TOEFL Score  University Rating      SOP      LOR     CGPA  Research  Chance of Admit
npartitions=1
                    int64      int64        int64              int64  float64  float64  float64     int64          float64
                      ...        ...          ...                ...      ...      ...      ...       ...              ...
Dask Name: read-csv, 1 graph layer

localhost:8888/notebooks/TEIT-10 (1).ipynb 1/5


In [12]: df.head()

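Out[12]:
     Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
  0           1        337          118                  4  4.5  4.5  9.65         1             0.92
  1           2        324          107                  4  4.0  4.5  8.87         1             0.76
  2           3        316          104                  3  3.0  3.5  8.00         1             0.72
  3           4        322          110                  3  3.5  2.5  8.67         1             0.80
  4           5        314          103                  2  2.0  3.0  8.21         0             0.65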

In [13]: df.isnull()

Out[13]:
     Serial No.  GRE Score  TOEFL Score  University Rating    SOP    LOR   CGPA  Research  Chance of Admit
  0       False      False        False              False  False  False  False     False            False
  1       False      False        False              False  False  False  False     False            False
  2       False      False        False              False  False  False  False     False            False
  3       False      False        False              False  False  False  False     False            False
  4       False      False        False              False  False  False  False     False            False
...         ...        ...          ...                ...    ...    ...    ...       ...              ...
395       False      False        False              False  False  False  False     False            False
396       False      False        False              False  False  False  False     False            False
397       False      False        False              False  False  False  False     False            False
398       False      False        False              False  False  False  False     False            False
399       False      False        False              False  False  False  False     False            False

400 rows × 9 columns

In [14]: df.isnull().sum()

Out[14]: Serial No.           0
         GRE Score            0
         TOEFL Score          0
         University Rating    0
         SOP                  0
         LOR                  0
         CGPA                 0
         Research             0
         Chance of Admit      0
         dtype: int64


In [15]: df.sum()

Out[15]: Serial No.            80200.00
         GRE Score            126723.00
         TOEFL Score           42964.00
         University Rating      1235.00
         SOP                    1360.00
         LOR                    1381.00
         CGPA                   3439.57
         Research                219.00
         Chance of Admit         289.74
         dtype: float64

In [16]: df = df.drop('Serial No.',axis=1)

In [17]: df

Out[17]:
     GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
  0        337          118                  4  4.5  4.5  9.65         1             0.92
  1        324          107                  4  4.0  4.5  8.87         1             0.76
  2        316          104                  3  3.0  3.5  8.00         1             0.72
  3        322          110                  3  3.5  2.5  8.67         1             0.80
  4        314          103                  2  2.0  3.0  8.21         0             0.65
...        ...          ...                ...  ...  ...   ...       ...              ...
395        324          110                  3  3.5  3.5  9.04         1             0.82
396        325          107                  3  3.0  3.5  9.11         1             0.84
397        330          116                  4  5.0  4.5  9.45         1             0.91
398        312          103                  3  3.5  4.0  8.78         0             0.67
399        333          117                  4  5.0  4.0  9.66         1             0.95

400 rows × 8 columns

In [18]: df.shape()

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[18], line 1
----> 1 df.shape()

TypeError: 'tuple' object is not callable

In [19]: df.shape

Out[19]: (400, 8)
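
The TypeError in cell 18 comes from calling `shape` as if it were a method, while pandas exposes it as an attribute that holds a tuple. A minimal sketch of the same failure, using a plain tuple as a stand-in for `df.shape` (the variable names here are illustrative, not from the notebook):

```python
# Stand-in for df.shape: an attribute holding a (rows, columns) tuple.
shape = (400, 8)

# Reading the tuple works as expected.
rows, cols = shape

# Calling the tuple reproduces the error seen in cell 18.
try:
    shape()
except TypeError as exc:
    message = str(exc)

print(rows, cols)
print(message)
```

Dropping the parentheses, as cell 19 does, is the fix.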


In [20]: df['Chance of Admit '] = [1 if each > 0.75 else 0 for each in df['Chance of Admit ']]

In [21]: df.head()
Out[21]:
     GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
  0        337          118                  4  4.5  4.5  9.65         1                1
  1        324          107                  4  4.0  4.5  8.87         1                1
  2        316          104                  3  3.0  3.5  8.00         1                0
  3        322          110                  3  3.5  2.5  8.67         1                1
  4        314          103                  2  2.0  3.0  8.21         0                0
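
The thresholding step in cell 20 can be checked by hand against the first five rows. The sketch below wraps the same rule in a small helper; `binarize` is a hypothetical name introduced here for illustration, and the 0.75 cutoff is the one used in the notebook.

```python
def binarize(values, threshold=0.75):
    """Return 1 for each value strictly above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


# First five "Chance of Admit" values from the table printed before thresholding.
chances = [0.92, 0.76, 0.72, 0.80, 0.65]
labels = binarize(chances)
print(labels)  # [1, 1, 0, 1, 0]
```

These labels match the last column of Out[21]. In pandas the same result can be obtained without a Python loop, for example with `(df['Chance of Admit '] > 0.75).astype(int)`.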

In [22]: x = df[['GRE Score', 'TOEFL Score', 'University Rating', 'SOP', 'LOR ', 'CGPA',
                 'Research']]  # input features
         y = df['Chance of Admit ']  # target: the binarized admit label

In [23]: from sklearn.model_selection import train_test_split

In [24]: x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.25,random_s

In [25]: print("Size of split data")
         print(f"x_train {x_train.shape}")
         print(f"y_train {y_train.shape}")
         print(f"x_test {x_test.shape}")
         print(f"y_test {y_test.shape}")

Size of split data
x_train (300, 7)
y_train (300,)
x_test (100, 7)
y_test (100,)

In [26]: from sklearn.tree import DecisionTreeRegressor

In [27]: model_dt = DecisionTreeRegressor(random_state=1)

In [29]: model_dt.fit(x_train,y_train)

Out[29]: DecisionTreeRegressor(random_state=1)

In [30]: y_pred_dt = model_dt.predict(x_test) #int
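
The notebook fits a DecisionTreeRegressor to 0/1 labels. Because the target is binary, scikit-learn's DecisionTreeClassifier is the estimator intended for this kind of problem, and it always returns class labels from `predict`. The sketch below shows that variant on synthetic data, not on the admissions file; the feature matrix, the labelling rule, and every variable name are assumptions made for illustration, so its score is not comparable to the notebook's result.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 200 rows, 3 numeric features,
# and a binary label derived from a simple linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Same 75/25 split ratio as the notebook; stratify keeps the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A classifier predicts class labels directly, so the outputs are always 0 or 1.
clf = DecisionTreeClassifier(random_state=1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(f"Accuracy on the synthetic test split: {accuracy_score(y_test, y_pred):.2f}")
```

A regressor's leaf values are averages of the training targets, so they can fall between 0 and 1 when a leaf is not pure; a classifier avoids that case by construction.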


In [31]: from sklearn.metrics import ConfusionMatrixDisplay, accuracy_score

from sklearn.metrics import classification_report

In [33]: ConfusionMatrixDisplay.from_predictions(y_test,y_pred_dt)
mtp.title('Decision Tree')
mtp.show()
print(f" Accuracy is {accuracy_score(y_test,y_pred_dt)}")
print(classification_report(y_test,y_pred_dt))

 Accuracy is 0.86
              precision    recall  f1-score   support

           0       0.86      0.89      0.88        56
           1       0.86      0.82      0.84        44

    accuracy                           0.86       100
   macro avg       0.86      0.86      0.86       100
weighted avg       0.86      0.86      0.86       100
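
The figures in the report can be cross-checked by hand. The counts used below are an assumption: they are inferred from the supports (56 and 44), the recalls, and the overall accuracy printed above, not read from the confusion-matrix plot, which is not reproduced in this text. The helper name `prf` is hypothetical.

```python
# Confusion counts inferred from the report (assumption, not notebook output).
tn, fp = 50, 6   # true class 0, support 56
fn, tp = 8, 36   # true class 1, support 44


def prf(correct, predicted_total, actual_total):
    """Return (precision, recall, f1) for one class, rounded to 2 decimals."""
    precision = correct / predicted_total
    recall = correct / actual_total
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 2), round(recall, 2), round(f1, 2)


class0 = prf(tn, tn + fn, tn + fp)   # class 0: predicted-0 total, actual-0 total
class1 = prf(tp, tp + fp, tp + fn)   # class 1: predicted-1 total, actual-1 total
accuracy = (tn + tp) / (tn + fp + fn + tp)

print(class0, class1, accuracy)
```

With these counts the rounded precision, recall, and F1 values agree with both rows of the report, and 86 correct predictions out of 100 give the reported accuracy.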
