Feature Scaling Notes
Generally, when we get raw data, the feature values vary on different scales. It is important
to bring all the feature values onto the same scale so that the value of one feature does not
dominate the others and hinder the performance of the learning algorithm. This process of
bringing all the feature values onto the same scale is called feature scaling. Ensuring
standardised feature values implicitly weights all features equally in their representation.
Feature scaling (or re-scaling) of the features is most of the time performed so that they have
the properties of a standard normal distribution, with mean = 0 and standard deviation = 1.
There are several ways to apply feature scaling to a dataset:
1. The easiest way of scaling is to use the preprocessing.scale() function.
A NumPy array of values is given as input, and the output is a NumPy array with scaled
values. This scales the values in such a way that their mean will be 0 and their standard
deviation will be 1 (a runnable sketch follows this list).
2. Another method is to scale the features between a given minimum and maximum value,
generally between 0 and 1 (see the second sketch below).
Function - preprocessing.MinMaxScaler(feature_range=(0, 1), copy=True)
Formula used -
X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min
where min and max are the bounds given in feature_range.
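A minimal sketch of method 1, assuming scikit-learn and NumPy are installed; the array values
are made up purely for illustration:

    import numpy as np
    from sklearn import preprocessing

    # Toy feature matrix; each column is one feature on its own scale
    X = np.array([[1.0, 200.0],
                  [2.0, 400.0],
                  [3.0, 600.0]])

    # preprocessing.scale() standardises each column independently
    X_scaled = preprocessing.scale(X)

    print(X_scaled)
    print("means:", X_scaled.mean(axis=0))  # approximately [0. 0.]
    print("stds: ", X_scaled.std(axis=0))   # approximately [1. 1.]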
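And a sketch of method 2 using MinMaxScaler on the same made-up values; fit_transform()
learns the per-column min and max and then applies the formula above:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler

    X = np.array([[1.0, 200.0],
                  [2.0, 400.0],
                  [3.0, 600.0]])

    # Scale each column to the [0, 1] range
    scaler = MinMaxScaler(feature_range=(0, 1))
    X_scaled = scaler.fit_transform(X)

    print(X_scaled)
    # [[0.  0. ]
    #  [0.5 0.5]
    #  [1.  1. ]]

Unlike the one-shot scale() call, the fitted scaler object remembers the training min/max, so
the same transformation can later be applied to new data with scaler.transform().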