04 05 PDE Missing Value

The document discusses missing value imputation, which is important for machine learning because many algorithms do not support missing data. Common imputation methods include replacing missing values with the mean, median, or mode of the existing values, or with zero. More advanced methods segment the data on relevant attributes before imputing.

Missing Value Imputation

Real-world data often has missing values, for reasons such as observations that were never recorded or data corruption.

Impact
• Handling missing data is important because many machine learning algorithms do not support data with missing values.

Solution
• Remove rows with missing data from your dataset.
• Impute missing values with the mean or median of the observed values in your dataset.

Note
• Use business knowledge to choose a separate approach for each variable.
• It is advisable to impute rather than remove rows when the sample size is small or a large proportion of observations have missing values.
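
The two solutions above can be compared on a small example. The following is a minimal sketch in plain Python, not taken from the slides: the toy `rows` data, the use of `None` to mark a missing value, and the helper names `drop_missing` and `impute_with` are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical toy dataset: None marks a missing value.
rows = [
    {"city": "A", "rainfall": 100.0},
    {"city": "B", "rainfall": None},
    {"city": "C", "rainfall": 140.0},
    {"city": "D", "rainfall": 120.0},
]

def drop_missing(rows, column):
    """Option 1: remove every row where `column` is missing."""
    return [r for r in rows if r[column] is not None]

def impute_with(rows, column, statistic):
    """Option 2: replace missing values with a statistic of the observed ones."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = statistic(observed)
    # Build new dicts so the input rows are left unchanged.
    return [
        {**r, column: fill if r[column] is None else r[column]}
        for r in rows
    ]

dropped = drop_missing(rows, "rainfall")         # loses city B entirely
imputed = impute_with(rows, "rainfall", median)  # keeps city B, fills its rainfall
```

Dropping shrinks the dataset from four rows to three, while imputing keeps all four, which is why imputation tends to be preferred when the sample is small.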

Start-Tech Academy
Missing Value Imputation: Methods
1. Impute with zero
• Impute missing values with zero
2. Impute with median/mean/mode
• For numerical variables, impute missing values with the mean or median
• For categorical variables, impute missing values with the mode
3. Segment-based imputation
• Identify relevant segments
• Calculate the mean/median/mode of each segment
• Impute each missing value with the statistic of its segment
• For example, if rainfall hardly varies across cities within a particular state, we can impute a city's missing rainfall value with the average for that state