Ds Exp1 Manju
EX NO : 1
DATA CLEANING
AIM
To read the given data and perform data cleaning and save the cleaned data to a file.
EXPLANATION
Data cleaning is the process of preparing data for analysis by removing or modifying
data that is incorrect, incomplete, irrelevant, duplicated or improperly formatted. Data
cleaning is not simply about erasing data, but rather about finding a way to maximize a
dataset's accuracy without necessarily deleting information.
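The difference between erasing data and modifying it can be seen on a small example. The sketch below is illustrative only: the `toy` DataFrame and its values are made up for this purpose and are not taken from Data_set.csv.

```python
import pandas as pd

# Hypothetical four-row dataset with one missing rating (not from Data_set.csv).
toy = pd.DataFrame({
    "show_name": ["Show A", "Show B", "Show C", "Show D"],
    "rating": [8.0, None, 9.0, 7.0],
})

# Option 1: erase every row that contains a null value.
dropped = toy.dropna()

# Option 2: keep every row and fill the gap with the column mean.
filled = toy.copy()
filled["rating"] = filled["rating"].fillna(filled["rating"].mean())

print("rows after dropping:", len(dropped))
print("rows after filling:", len(filled))
```

Dropping loses the whole row, including its valid `show_name`, while filling keeps the row at the cost of an estimated value.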
ALGORITHM
STEP 1
Read the given Data
STEP 2
Get the information about the data
STEP 3
Replace the null values in the data with the column mode, mean or median
STEP 4
Save the cleaned data to a file
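The four steps can be sketched end to end on a tiny inline dataset. This is a simplified illustration, not the experiment's program: the CSV text is made up, nulls are handled by filling (median for every numeric column, mode for every text column), and the result is written to an in-memory buffer instead of a file on disk.

```python
import io

import pandas as pd

# Hypothetical raw data with one gap in each column (not from Data_set.csv).
raw_csv = io.StringIO(
    "show_name,rating,watchers\n"
    "Show A,8.0,100\n"
    "Show B,,300\n"
    ",9.0,\n"
    "Show A,7.0,200\n"
)

# STEP 1: read the data.
data = pd.read_csv(raw_csv)

# STEP 2: get information about the data, including null counts per column.
data.info()
null_counts_before = data.isnull().sum()

# STEP 3: fill nulls, choosing the fill value by column type
# (an assumption made for this sketch).
for column in data.columns:
    if pd.api.types.is_numeric_dtype(data[column]):
        data[column] = data[column].fillna(data[column].median())
    else:
        data[column] = data[column].fillna(data[column].mode()[0])

# STEP 4: save the cleaned data (here to an in-memory buffer).
cleaned_buffer = io.StringIO()
data.to_csv(cleaned_buffer, index=False)
cleaned_text = cleaned_buffer.getvalue()
```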
PROGRAM CODE
**Data Cleaning - Data_set.csv**
import numpy as np
import pandas as pd
import seaborn as sbn
df = pd.read_csv("/content/Data_set.csv")
print(df)
df.head(10)
df.info()
df.isnull()
df.isnull().sum()
# Fill each text column with its own most frequent value (mode)
df['show_name'] = df['show_name'].fillna(df['show_name'].mode()[0])
df['aired_on'] = df['aired_on'].fillna(df['aired_on'].mode()[0])
df['original_network'] = df['original_network'].fillna(df['original_network'].mode()[0])
df.head()
df['rating'] = df['rating'].fillna(df['rating'].mean())
df['current_overall_rank'] = df['current_overall_rank'].fillna(df['current_overall_rank'].mean())
df.head()
df['watchers'] = df['watchers'].fillna(df['watchers'].median())
df.head()
df.info()
df.isnull().sum()
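Step 4 of the algorithm, saving the cleaned data, could be done with `DataFrame.to_csv`. The sketch below is a minimal illustration under stated assumptions: `cleaned` is a small stand-in for the cleaned `df`, and the output file name `Data_set_cleaned.csv` and its temporary directory are hypothetical, since the experiment does not name an output file.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the cleaned DataFrame; in the experiment this would be `df`.
cleaned = pd.DataFrame({
    "show_name": ["Show A", "Show B"],
    "rating": [8.5, 7.9],
    "watchers": [1200.0, 950.0],
})

# Hypothetical output location; a temporary directory keeps the sketch runnable anywhere.
out_path = os.path.join(tempfile.mkdtemp(), "Data_set_cleaned.csv")

# index=False keeps the row index from being written as an extra column.
cleaned.to_csv(out_path, index=False)

# Read the file back to confirm it was saved and contains no null values.
reloaded = pd.read_csv(out_path)
remaining_nulls = int(reloaded.isnull().sum().sum())
print("null values in saved file:", remaining_nulls)
```

With the experiment's data, the same call would take the form `df.to_csv(<output path>, index=False)`.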
OUTPUT
Data Cleaning – Data_set.csv
RESULT
Thus the given data is read and cleaned, and the cleaned data is saved to a file.
212220220024