Exp 3 Data Wrangling SDK Ok
# Name: ____________________
# Class: SE    Branch: E&TC
# Roll no: _________    Subject: Python for Data Analytics
# Data Wrangling
#Example no.01
# import libraries
import pandas as pd
import numpy as np
import seaborn as sns
kashti=sns.load_dataset('titanic')
# keep an independent copy so that later changes to kashti do not affect ks1
ks1=kashti.copy()
#ks2=kashti
#ks=sns.load_dataset('titanic')
kashti.head()
Output:-
   survived  pclass     sex   age  sibsp  parch     fare  embarked  class  \
0         0       3    male  22.0      1      0   7.2500         S  Third
1         1       1  female  38.0      1      0  71.2833         C  First
2         1       3  female  26.0      0      0   7.9250         S  Third
3         1       1  female  35.0      1      0  53.1000         S  First
4         0       3    male  35.0      0      0   8.0500         S  Third
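# Assigning one DataFrame name to another (for example b = a) does not copy any data;
# both names point to the same object, while .copy() creates an independent one.
# A minimal sketch of the difference, using a made-up three-row frame
# (the names original, alias and independent are illustrative only):

```python
import pandas as pd

# Toy frame standing in for the real dataset (values are made up).
original = pd.DataFrame({"age": [22.0, 38.0, 26.0]})

alias = original               # second name for the SAME object
independent = original.copy()  # separate object with its own data

# Modify the frame in place through the first name.
original.loc[0, "age"] = 99.0

print(alias.loc[0, "age"])        # the alias sees the change
print(independent.loc[0, "age"])  # the copy keeps the old value
```

# Rebinding a name (for example a = a.dropna()) is a different case: it points the
# name at a new object and leaves any alias attached to the old one.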
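# drop every row that has at least one missing value (returns a new DataFrame)
kashti.dropna()
# assign the result back to update the main DataFrame
kashti=kashti.dropna()
# confirm that no missing values remain
kashti.isnull().sum()
kashti.shape
Output:- (182, 15)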
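# Calling dropna() with no arguments removes a row as soon as any one column is missing,
# which can discard a large share of the data. The sketch below, on a made-up four-row
# frame (not the Titanic data), shows the how, subset, axis and thresh parameters that
# make the rule less aggressive:

```python
import numpy as np
import pandas as pd

# Toy frame with gaps in two columns (values are made up).
toy = pd.DataFrame({
    "age":  [22.0, np.nan, 26.0, np.nan],
    "deck": ["C", "E", None, None],
    "fare": [7.25, 71.28, 7.92, 8.05],
})

any_missing = toy.dropna()                   # drop rows with at least one missing value
all_missing = toy.dropna(how="all")          # drop rows only if every value is missing
age_only = toy.dropna(subset=["age"])        # consider the 'age' column only
sparse_cols = toy.dropna(axis=1, thresh=3)   # keep columns with >= 3 non-missing values

print(any_missing.shape)   # only the first row is complete
print(all_missing.shape)   # no row is entirely empty
print(age_only.shape)      # two rows have an age
print(sparse_cols.shape)   # only 'fare' has enough values
```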
#Example no.6
ks1.isnull().sum()
Output:-
survived 0
pclass 0
sex 0
age 19
sibsp 0
parch 0
fare 0
embarked 2
class 0
who 0
adult_male 0
deck 0
embark_town 2
alive 0
alone 0
dtype: int64
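# Raw counts are easier to judge next to the share of rows they represent.
# A possible sketch on a made-up frame: isnull().mean() gives the fraction of missing
# values per column, because True counts as 1 and False as 0.
# (The names toy and report are illustrative only.)

```python
import numpy as np
import pandas as pd

# Toy frame with gaps (values are made up).
toy = pd.DataFrame({
    "age": [22.0, np.nan, 26.0, np.nan],
    "embarked": ["S", "C", None, "S"],
    "fare": [7.25, 71.28, 7.92, 8.05],
})

missing_count = toy.isnull().sum()       # number of missing values per column
missing_pct = toy.isnull().mean() * 100  # percentage of missing values per column

# Combine both into one table and keep only columns that have gaps.
report = pd.DataFrame({"missing": missing_count, "percent": missing_pct})
report = report[report["missing"] > 0].sort_values("missing", ascending=False)
print(report)
```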
#Example no.7
# replacing missing values with the average of that column
# finding an average (mean)
mean=ks1['age'].mean()
mean
Output:- 35.77945652173913
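# mean() works on a column that still contains NaN because pandas skips missing values
# by default (skipna=True). A small sketch with made-up ages; the median is included
# only as a common alternative when a column has outliers, not something the
# original uses:

```python
import numpy as np
import pandas as pd

# Four made-up ages, one of them missing.
ages = pd.Series([20.0, 30.0, np.nan, 40.0])

mean_default = ages.mean()             # NaN is skipped: (20 + 30 + 40) / 3
mean_strict = ages.mean(skipna=False)  # NaN propagates into the result
median_age = ages.median()             # middle value of the non-missing ages

print(mean_default, mean_strict, median_age)
```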
#Example no.8
# replace NaN in 'age' with the column mean and assign the result back to the column
ks1['age']=ks1['age'].fillna(mean)
ks1['age']
ks1.isnull().sum()
Output:-
survived 0
pclass 0
sex 0
age 0
sibsp 0
parch 0
fare 0
embarked 2
class 0
who 0
adult_male 0
deck 0
embark_town 2
alive 0
alone 0
dtype: int64
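# The counts above still show two missing values in the text columns embarked and
# embark_town, for which a mean does not exist. One common option, shown here only as
# an assumption and not as part of the original experiment, is to fill a text column
# with its most frequent value (the mode). Sketch on a made-up frame:

```python
import numpy as np
import pandas as pd

# Toy frame with one numeric and one text column (values are made up).
toy = pd.DataFrame({
    "age": [22.0, np.nan, 26.0, 36.0],
    "embarked": ["S", "C", None, "S"],
})

# Numeric column: fill with the mean of the non-missing values, (22 + 26 + 36) / 3.
toy["age"] = toy["age"].fillna(toy["age"].mean())

# Text column: mode() returns a Series, so take its first entry.
most_common = toy["embarked"].mode()[0]
toy["embarked"] = toy["embarked"].fillna(most_common)

print(toy.isnull().sum())
```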
kashti.dtypes
Output:-
survived int64
pclass int64
sex object
age float64
sibsp int64
parch int64
fare float64
embarked object
class category
who object
adult_male bool
deck category
embark_town object
alive object
alone bool
dtype: object
ks1.dtypes
Output:-
survived int64
pclass int64
sex object
age float64
sibsp int64
parch int64
fare float64
embarked object
class category
who object
adult_male bool
deck category
embark_town object
alive object
alone bool
dtype: object
# use astype() to convert a column from one data type to another
kashti['survived']=kashti['survived'].astype('int64')
kashti.dtypes
Output:-
survived int64
pclass int64
sex object
age float64
sibsp int64
parch int64
fare float64
embarked object
class category
who object
adult_male bool
deck category
embark_town object
alive object
alone bool
dtype: object
kashti['survived']=kashti['survived'].astype('float64')
kashti.dtypes
Output:-
survived float64
pclass int64
sex object
age float64
sibsp int64
parch int64
fare float64
embarked object
class category
who object
adult_male bool
deck category
embark_town object
alive object
alone bool
dtype: object
kashti['survived']=kashti['survived'].astype('int64')
kashti.dtypes
Output:-
survived int64
pclass int64
sex object
age float64
sibsp int64
parch int64
fare float64
embarked object
class category
who object
adult_male bool
deck category
embark_town object
alive object
alone bool
dtype: object
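# astype() also accepts a dictionary, so several columns can be converted in one call.
# Note that 'int64' (NumPy integer) and 'Int64' (pandas nullable integer) are different
# types: only the second can hold missing values. A minimal sketch on a made-up frame
# (the column values and the flag name failed are illustrative only):

```python
import numpy as np
import pandas as pd

# Toy frame; 'sibsp' contains one missing value (values are made up).
toy = pd.DataFrame({
    "survived": [0.0, 1.0, 1.0],
    "sibsp": [1.0, np.nan, 0.0],
    "sex": ["male", "female", "female"],
})

# Convert two columns at once with a column -> dtype mapping.
toy = toy.astype({"survived": "int64", "sex": "category"})

# NumPy int64 cannot represent NaN, so this conversion raises an error.
try:
    toy["sibsp"].astype("int64")
    failed = False
except (ValueError, TypeError):
    failed = True

# The nullable 'Int64' dtype keeps the gap as <NA>.
toy["sibsp"] = toy["sibsp"].astype("Int64")

print(failed)
print(toy.dtypes)
```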
# convert age from years to days; the result is stored in kashti, so ks1 below still shows years
kashti['age']=ks1['age']*365
ks1.head(10)
Output:-
    survived  pclass     sex        age  sibsp  parch      fare  embarked
 1         1       1  female  38.000000      1      0   71.2833         C
 3         1       1  female  35.000000      1      0   53.1000         S
 6         0       1    male  54.000000      0      0   51.8625         S
10         1       3  female   4.000000      1      1   16.7000         S
11         1       1  female  58.000000      0      0   26.5500         S
21         1       2    male  34.000000      0      0   13.0000         S
23         1       1    male  28.000000      0      0   35.5000         S
27         0       1    male  19.000000      3      2  263.0000         S
31         1       1  female  35.779457      1      0  146.5208         C
52         1       1  female  49.000000      1      0   76.7292         C
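# Overwriting a column with converted values loses the original unit. A possible
# alternative, sketched on made-up ages, is to store the result under a new name;
# age_days and age_months are hypothetical column names, and 365 days per year
# ignores leap years:

```python
import pandas as pd

# Toy frame with ages in years (values are made up).
toy = pd.DataFrame({"age": [38.0, 35.0, 4.0]})

# Add a derived column and keep the original one.
toy["age_days"] = toy["age"] * 365

# assign() does the same but returns a new DataFrame instead of modifying in place.
toy = toy.assign(age_months=toy["age"] * 12)

print(toy)
```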