Hands-on Data Preprocessing in Python
• Check for null (missing) values and count them in each column.
• Replace missing numerical values with the mean of the respective column.
• Replace missing numerical values with the median of the respective column.
• Fill missing categorical values with the most frequent value in the column.
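The missing-value tasks above can be sketched with pandas as follows. The DataFrame and its column names (age, salary, city) are made up for illustration; substitute the columns of your own dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "age": [25.0, None, 40.0, 31.0],
    "salary": [50000.0, 62000.0, None, 58000.0],
    "city": ["Lahore", None, "Karachi", "Lahore"],
})

# Count missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# Mean imputation for one numeric column.
df["age"] = df["age"].fillna(df["age"].mean())

# Median imputation for another numeric column.
df["salary"] = df["salary"].fillna(df["salary"].median())

# Most-frequent-value imputation for a categorical column.
# mode() returns a Series (there can be ties), so take its first entry.
df["city"] = df["city"].fillna(df["city"].mode()[0])

print(df)
```

The median is usually the safer choice when a numeric column is skewed or contains outliers, because a few extreme values can pull the mean far from the typical value.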
3. Data Cleaning
o One-hot encoding.
Intro to ML by Sir Asif Ahsan
o Label encoding.
• Handle categorical columns with many categories (more than 10 unique values).
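A minimal pandas sketch of the encoding tasks above. The column names are hypothetical. The handout does not say how to handle high-cardinality columns, so frequency encoding is shown here as one possible approach, not the required one; the toy zip_code column has only three unique values and simply stands in for a column with more than 10.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": ["S", "M", "L", "M"],
    "zip_code": ["54000", "75500", "54000", "44000"],
})

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["color"], prefix="color")
print(one_hot)

# Label encoding: each category becomes an integer code.
# Categories are sorted alphabetically here, so L=0, M=1, S=2.
df["size_code"] = df["size"].astype("category").cat.codes

# Assumed approach for a high-cardinality column: frequency encoding,
# i.e. replace each category with its share of the rows.
freq = df["zip_code"].value_counts(normalize=True)
df["zip_code_freq"] = df["zip_code"].map(freq)

print(df)
```

One-hot encoding a column with many categories adds one column per category, which is why a more compact encoding is often preferred once the number of unique values grows. Note also that label codes imply an order (0 < 1 < 2) that the categories may not actually have.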
5. Feature Scaling
o Z-score.
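A short sketch of the Z-score computation, z = (x - mean) / standard deviation, reading this item as standardization under Feature Scaling. The same score is also commonly used to flag outliers (for example, rows with |z| above 3). The column name height_cm is hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the column name is an assumption for illustration.
df = pd.DataFrame({"height_cm": [150.0, 160.0, 170.0, 180.0, 190.0]})

mean = df["height_cm"].mean()
# ddof=0 gives the population standard deviation;
# pandas defaults to the sample version (ddof=1).
std = df["height_cm"].std(ddof=0)

# Z-score: how many standard deviations each value lies from the mean.
df["height_z"] = (df["height_cm"] - mean) / std

print(df)
```

After this transformation the column has mean 0 and standard deviation 1, which puts features measured in different units on a comparable scale.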
7. Feature Engineering
• Create new features based on existing columns (e.g., age groups, salary ranges).
• Combine multiple columns into one (e.g., full name from first and last name).
• Extract information from columns (e.g., extracting year from a date column).
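The three feature-engineering tasks above might look like this in pandas. The data, the column names, and the age-group boundaries (18 and 65) are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "first_name": ["Ali", "Sara", "Omar"],
    "last_name": ["Khan", "Malik", "Raza"],
    "age": [17, 34, 68],
    "joined": ["2019-03-15", "2021-11-02", "2023-07-30"],
})

# New feature from an existing column: bin ages into groups.
# right=False makes the bins [0, 18), [18, 65), [65, 120).
df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 18, 65, 120],
    labels=["minor", "adult", "senior"],
    right=False,
)

# Combine two columns into one.
df["full_name"] = df["first_name"] + " " + df["last_name"]

# Extract the year from a date column (parse the strings first).
df["join_year"] = pd.to_datetime(df["joined"]).dt.year

print(df)
```

The same pd.cut pattern works for salary ranges; only the bin edges and labels change.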
8. Data Transformation
• Split a column into multiple columns (e.g., splitting a full name into first and last
names).
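Splitting one column into several can be sketched as below, assuming names are separated by a single space. The data is hypothetical.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({"full_name": ["Ali Khan", "Sara Malik", "Mary Ann Lee"]})

# Split on the first space only (n=1), so everything after the first
# word lands in last_name; expand=True returns one column per part.
df[["first_name", "last_name"]] = df["full_name"].str.split(" ", n=1, expand=True)

print(df)
```

With n=1, "Mary Ann Lee" becomes first name "Mary" and last name "Ann Lee". Real name data is rarely this regular, so check a sample of the results before relying on the split.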
Additional Challenges
• Detect and correct inconsistent data (e.g., inconsistent spellings in text columns).
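One way to approach inconsistent spellings is to list the unique values, normalize whitespace and letter case, and then map the remaining variants by hand. The sketch below assumes a hypothetical city column; the corrections dictionary is something you would build after inspecting your own data.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "city": ["Lahore", " lahore", "LAHORE ", "Lhr", "Karachi", "karachi"],
})

# Detect: list the distinct spellings and how often each occurs.
print(df["city"].value_counts())

# Correct, step 1: strip stray whitespace and normalize letter case.
cleaned = df["city"].str.strip().str.title()

# Correct, step 2: map the remaining known variants to one canonical
# spelling. This mapping is an assumption built by inspecting the data.
corrections = {"Lhr": "Lahore"}
df["city_clean"] = cleaned.replace(corrections)

print(df["city_clean"].value_counts())
```

Here six distinct raw spellings collapse into two clean categories.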
1. Complete each task on the provided dataset or any dataset of your choice.