22IZ023 Nikhil - Exercise 5 - Data Preprocessing
Aim
To perform data preprocessing techniques such as handling missing values,
standardization, and normalization using Python.
Logic Description
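Data preprocessing cleans and transforms raw data to improve its quality for
analysis. This exercise covers handling missing values (dropping incomplete rows
or filling them with the column mean or median), standardizing a column to zero
mean and unit variance, and normalizing a column to the range [0, 1].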
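The three techniques can be illustrated with a minimal sketch. It assumes a small made-up DataFrame: the sample values and the "Quantity" column are hypothetical and are not the dataset used in this exercise. Only the "Price" column and the DataFrame names match the source code. Scaling the mean-filled copy is also an assumption made here for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Hypothetical sample data (not the exercise's dataset); NaN marks a missing value.
df = pd.DataFrame({
    "Price": [10.0, np.nan, 30.0, 40.0],
    "Quantity": [1.0, 2.0, np.nan, 4.0],
})

# Handling missing values, option 1: drop every row that has a missing value.
df_dropna = df.dropna()

# Option 2: replace each missing value with its column mean.
mean_imputer = SimpleImputer(strategy="mean")
df_mean_filled = pd.DataFrame(mean_imputer.fit_transform(df), columns=df.columns)

# Option 3: replace each missing value with its column median.
median_imputer = SimpleImputer(strategy="median")
df_median_filled = pd.DataFrame(median_imputer.fit_transform(df), columns=df.columns)

# Standardization: rescale 'Price' to zero mean and unit variance.
# Assumption: the mean-filled copy is used so that no NaN reaches the scaler.
df_standardized_price = df_mean_filled.copy()
df_standardized_price[["Price"]] = StandardScaler().fit_transform(
    df_mean_filled[["Price"]]
)

# Normalization: rescale 'Price' linearly so its minimum is 0 and its maximum is 1.
df_normalized_price = df_mean_filled.copy()
df_normalized_price[["Price"]] = MinMaxScaler().fit_transform(
    df_mean_filled[["Price"]]
)
```

The scalers expect two-dimensional input, which is why the column is selected with double brackets (`df[["Price"]]`) rather than as a single Series.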
Algorithm
Package/Tools Description
Source Code:-
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, MinMaxScaler
# Display outputs
print("\nStandardized 'Price' column:\n",
      df_standardized_price["Price"].head())
print("\nNormalized 'Price' column:\n",
      df_normalized_price["Price"].head())
# Optionally: Save the processed datasets to CSV
df_dropna.to_csv("df_dropna.csv", index=False)
df_mean_filled.to_csv("df_mean_filled.csv", index=False)
df_median_filled.to_csv("df_median_filled.csv", index=False)
df_standardized_price.to_csv("df_standardized_price.csv", index=False)
df_normalized_price.to_csv("df_normalized_price.csv", index=False)
Output Terminal:-
Test Cases
Inferences
Result
Data preprocessing was successfully implemented using Python, ensuring the
dataset is clean and well-prepared for further analysis.