
Experiment 1: Data Preprocessing with Python

This experiment covers data preprocessing in Python: importing the required libraries, loading a dataset, handling missing data, encoding categorical variables, splitting the data into training and test sets, and feature scaling, using numpy, pandas, and scikit-learn.

Importing the libraries

import numpy as np                # numerical arrays and np.nan
import matplotlib.pyplot as plt   # plotting (kept for later visualization steps)
import pandas as pd               # reading and manipulating the dataset

Importing the dataset (Data.csv)

Contents of Data.csv:

Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,,Yes
France,35,58000,Yes
Spain,,52000,No
France,48,79000,Yes
Germany,50,83000,No
France,37,67000,Yes

dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values   # features: Country, Age, Salary
y = dataset.iloc[:, -1].values    # target: Purchased

print(X)

[['France' 44.0 72000.0]
 ['Spain' 27.0 48000.0]
 ['Germany' 30.0 54000.0]
 ['Spain' 38.0 61000.0]
 ['Germany' 40.0 nan]
 ['France' 35.0 58000.0]
 ['Spain' nan 52000.0]
 ['France' 48.0 79000.0]
 ['Germany' 50.0 83000.0]
 ['France' 37.0 67000.0]]

print(y)

['No' 'Yes' 'No' 'No' 'Yes' 'Yes' 'No' 'Yes' 'No' 'Yes']
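
Before imputing, it can help to confirm which columns actually contain missing values. This check is not part of the original listing; a minimal sketch using pandas would be:

print(dataset.isnull().sum())   # count of missing entries per column

Given the Data.csv above, this should report one missing value each in Age and Salary, and none in Country or Purchased.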

Taking care of missing data

from sklearn.impute import SimpleImputer

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')  # replace NaN with the column mean
imputer.fit(X[:, 1:3])                     # fit on the numeric columns: Age, Salary
X[:, 1:3] = imputer.transform(X[:, 1:3])   # write the imputed values back into X

print(X)

[['France' 44.0 72000.0]
 ['Spain' 27.0 48000.0]
 ['Germany' 30.0 54000.0]
 ['Spain' 38.0 61000.0]
 ['Germany' 40.0 63777.77777777778]
 ['France' 35.0 58000.0]
 ['Spain' 38.77777777777778 52000.0]
 ['France' 48.0 79000.0]
 ['Germany' 50.0 83000.0]
 ['France' 37.0 67000.0]]
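
The mean strategy fills the missing Age with 38.78 and the missing Salary with 63777.78 (the column means). If the columns contained outliers, a median-based fill can be preferable; a minimal variation, not part of the original experiment, would be:

imputer = SimpleImputer(missing_values=np.nan, strategy='median')  # median is more robust to outliers
X[:, 1:3] = imputer.fit_transform(X[:, 1:3])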

Encoding categorical data

Encoding the Independent Variable

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [0])],
                       remainder='passthrough')   # one-hot encode Country (column 0), keep the rest
X = np.array(ct.fit_transform(X))

print(X)

[[1.0 0.0 0.0 44.0 72000.0]
 [0.0 0.0 1.0 27.0 48000.0]
 [0.0 1.0 0.0 30.0 54000.0]
 [0.0 0.0 1.0 38.0 61000.0]
 [0.0 1.0 0.0 40.0 63777.77777777778]
 [1.0 0.0 0.0 35.0 58000.0]
 [0.0 0.0 1.0 38.77777777777778 52000.0]
 [1.0 0.0 0.0 48.0 79000.0]
 [0.0 1.0 0.0 50.0 83000.0]
 [1.0 0.0 0.0 37.0 67000.0]]
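
OneHotEncoder orders the dummy columns by the sorted category values, so the first three columns correspond to France, Germany, and Spain respectively. This can be confirmed, although the original listing does not show it, by inspecting the fitted encoder:

print(ct.named_transformers_['encoder'].categories_)
# roughly: [array(['France', 'Germany', 'Spain'], dtype=object)]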

Encoding the Dependent Variable

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
y = le.fit_transform(y)   # 'No' -> 0, 'Yes' -> 1

print(y)

[0 1 0 0 1 1 0 1 0 1]
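
LabelEncoder assigns integer codes in sorted order of the labels, which is why 'No' becomes 0 and 'Yes' becomes 1. If needed, the mapping can be recovered from the fitted encoder (this check is not part of the original listing):

print(le.classes_)   # ['No' 'Yes'] — the index in this array is the encoded value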

Splitting the dataset into the Training set and Test set

from sklearn.model_selection import train_test_split

# 80% of the 10 rows go to training (8 rows) and 20% to testing (2 rows);
# random_state=1 makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

print(X_train)

[[0.0 0.0 1.0 38.77777777777778 52000.0]
 [0.0 1.0 0.0 40.0 63777.77777777778]
 [1.0 0.0 0.0 44.0 72000.0]
 [0.0 0.0 1.0 38.0 61000.0]
 [0.0 0.0 1.0 27.0 48000.0]
 [1.0 0.0 0.0 48.0 79000.0]
 [0.0 1.0 0.0 50.0 83000.0]
 [1.0 0.0 0.0 35.0 58000.0]]

print(X_test)

[[0.0 1.0 0.0 30.0 54000.0]
 [1.0 0.0 0.0 37.0 67000.0]]

print(y_train)

[0 1 0 0 1 1 0 1]

print(y_test)

[0 1]
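
Because the dataset is tiny, the class balance in the 2-row test set depends entirely on the random split. If a proportional split of the 'Yes'/'No' labels is wanted, train_test_split accepts a stratify argument; a possible variation, not used in the original experiment, is:

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y)   # keep the class ratio in both splits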

Feature Scaling

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)   # fit the scaler on the training data only, then transform
X_test = sc.transform(X_test)         # reuse the training mean/std to avoid data leakage

print(X_train)

[[-0.77459667 -0.57735027 1.29099445 -0.19159184 -1.07812594]
 [-0.77459667 1.73205081 -0.77459667 -0.01411729 -0.07013168]
 [ 1.29099445 -0.57735027 -0.77459667 0.56670851 0.63356243]
 [-0.77459667 -0.57735027 1.29099445 -0.30453019 -0.30786617]
 [-0.77459667 -0.57735027 1.29099445 -1.90180114 -1.42046362]
 [ 1.29099445 -0.57735027 -0.77459667 1.14753431 1.23265336]
 [-0.77459667 1.73205081 -0.77459667 1.43794721 1.57499104]
 [ 1.29099445 -0.57735027 -0.77459667 -0.74014954 -0.56461943]]

print(X_test)

[[-0.77459667 1.73205081 -0.77459667 -1.46618179 -0.9069571 ]
 [ 1.29099445 -0.57735027 -0.77459667 -0.44973664 0.20564034]]
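
Note that the scaler above standardises the one-hot dummy columns as well as Age and Salary, which is why the first three columns no longer contain only 0s and 1s. A common alternative, shown here only as a sketch and not part of the original experiment, is to scale just the numeric columns and leave the dummy variables untouched:

sc = StandardScaler()
X_train[:, 3:] = sc.fit_transform(X_train[:, 3:])   # scale Age and Salary only
X_test[:, 3:] = sc.transform(X_test[:, 3:])         # apply the same training statistics to the test set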
