
5/22/22, 12:26 AM Normalization vs Standardization - GeeksforGeeks

Normalization vs Standardization
Difficulty Level : Basic Last Updated : 12 Nov, 2021

Feature scaling is one of the most important data preprocessing steps in machine learning. Algorithms that compute distances between features are biased toward numerically larger values if the data is not scaled. Tree-based algorithms, by contrast, are fairly insensitive to the scale of the features. Feature scaling also helps machine learning and deep learning algorithms train and converge faster.

Normalization and Standardization are the most popular feature scaling techniques, and at the same time the most frequently confused.

Let’s resolve that confusion.

Normalization, or Min-Max Scaling, is used to transform features onto a similar scale. Each new point is calculated as:

X_new = (X - X_min)/(X_max - X_min)

This scales the range to [0, 1], or sometimes [-1, 1]. Geometrically speaking, the transformation squishes the n-dimensional data into an n-dimensional unit hypercube. Normalization is useful when there are no outliers, as it cannot cope with them: a single extreme value determines X_min or X_max and compresses all the other values. For example, we would normalize age but not income, because a few people have very high incomes while ages are distributed close to uniformly.
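The formula above can be sketched directly in NumPy (the age values here are made up purely for illustration):

```python
import numpy as np

# Toy feature column; values chosen only for illustration.
ages = np.array([18.0, 25.0, 40.0, 60.0, 75.0])

# Min-max scaling: X_new = (X - X_min) / (X_max - X_min)
scaled = (ages - ages.min()) / (ages.max() - ages.min())

print(scaled)  # smallest value maps to 0.0, largest to 1.0
```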

Standardization, or Z-Score Normalization, transforms features by subtracting the mean and dividing by the standard deviation; the result is often called the Z-score.

X_new = (X - mean)/std

Standardization can be helpful in cases where the data follows a Gaussian distribution, although this does not have to be true. Geometrically speaking, it translates the data so that the mean of the original data moves to the origin, and squishes or expands the points so that the standard deviation becomes 1. For a Gaussian feature this simply maps it onto the standard normal distribution, which is still normal; the shape of the distribution is not affected, only its location and spread.
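The Z-score formula can be sketched the same way (the income values are invented for illustration):

```python
import numpy as np

# Toy feature column; values chosen only for illustration.
incomes = np.array([30_000.0, 45_000.0, 52_000.0, 61_000.0, 250_000.0])

# Z-score: X_new = (X - mean) / std
z = (incomes - incomes.mean()) / incomes.std()

# The transformed feature has zero mean and unit standard deviation.
print(z.mean(), z.std())
```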

https://www.geeksforgeeks.org/normalization-vs-standardization/ 1/2

Standardization is much less affected by outliers than normalization, because the transformed features have no predefined range.

Difference between Normalization and Standardization

1. Normalization uses the minimum and maximum values of a feature for scaling; standardization uses the mean and standard deviation.
2. Normalization is used when features are on different scales; standardization is used when we want to ensure zero mean and unit standard deviation.
3. Normalization scales values to [0, 1] or [-1, 1]; standardization is not bounded to any range.
4. Normalization is strongly affected by outliers; standardization is much less affected.
5. Scikit-Learn provides the transformer MinMaxScaler for normalization and StandardScaler for standardization.
6. Normalization squishes the n-dimensional data into an n-dimensional unit hypercube; standardization translates the data so its mean is at the origin and squishes or expands it to unit standard deviation.
7. Normalization is useful when we don't know the distribution of a feature; standardization is useful when the feature distribution is Normal (Gaussian).
8. Normalization is often called Min-Max Scaling; standardization is often called Z-Score Normalization.
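The two scikit-learn transformers mentioned in the table, MinMaxScaler and StandardScaler, can be compared side by side. A minimal sketch, with 100.0 included as a deliberate outlier in invented data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# One feature column with a deliberate outlier (100.0).
X = np.array([[1.0], [2.0], [3.0], [100.0]])

mm = MinMaxScaler().fit_transform(X)   # bounded to [0, 1]
ss = StandardScaler().fit_transform(X)  # zero mean, unit std

# With the outlier present, min-max squashes the first three values
# near 0, while z-scores keep them distinguishable relative to the std.
print(mm.ravel())
print(ss.ravel())
```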
