ML Normalization Techniques - Overview & Practical Guide
Normalization is a preprocessing step where data is rescaled to fit within a specific range or
distribution, improving model performance and convergence. Below is a detailed explanation of
each type, along with Python examples.
1. Min-Max Normalization
Formula:
X' = (X - X_min) / (X_max - X_min)
Description:
Rescales each feature to a fixed range, typically [0, 1]. Simple and interpretable, but sensitive to outliers, since X_min and X_max are taken directly from the data.
Use Cases:
Neural networks and distance-based algorithms (e.g., k-NN, k-means) where features should share a common bounded scale.
Python Implementation:
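A minimal sketch using scikit-learn's MinMaxScaler; the sample data array is illustrative, matching the one used in the later examples.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative data: rows are samples, columns are features.
data = np.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])

# Fit the scaler on the data and rescale each column to [0, 1].
scaler = MinMaxScaler()
min_max_data = scaler.fit_transform(data)
print("Min-Max Normalized Data:\n", min_max_data)
```

Each column is scaled independently: its minimum maps to 0 and its maximum to 1.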
2. Z-Score Normalization (Standardization)
Formula:
X' = (X - μ) / σ
Where:
μ: Mean of the feature.
σ: Standard deviation.
Description:
Centers each feature at zero mean and unit variance. Less sensitive to outliers than min-max scaling, and the output is not bounded to a fixed range.
Use Cases:
Algorithms that assume roughly zero-centered or Gaussian-like features, such as linear and logistic regression, SVMs, and PCA.
Python Implementation:
import numpy as np
from sklearn.preprocessing import StandardScaler
data = np.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
scaler = StandardScaler()
z_score_data = scaler.fit_transform(data)
print("Z-Score Normalized Data:\n", z_score_data)
3. Max Absolute Scaling
Formula:
X' = X / max(|X|)
Description:
Divides each feature by its maximum absolute value, scaling the data to [-1, 1] without shifting or centering it, so sparsity is preserved.
Use Cases:
Sparse data (e.g., TF-IDF matrices), where centering would destroy sparsity.
Python Implementation:
import numpy as np
from sklearn.preprocessing import MaxAbsScaler
data = np.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
scaler = MaxAbsScaler()
max_abs_data = scaler.fit_transform(data)
print("Max Absolute Normalized Data:\n", max_abs_data)
4. Robust Scaling
Formula:
X' = (X - median) / IQR
Where:
median: The median of the feature.
IQR: The interquartile range, Q3 - Q1 (75th percentile minus 25th percentile).
Description:
Centers features on the median and scales by the IQR, so extreme outliers have little influence on the resulting scale.
Use Cases:
Datasets with significant outliers, where min-max or z-score scaling would be distorted.
Python Implementation:
import numpy as np
from sklearn.preprocessing import RobustScaler
data = np.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
scaler = RobustScaler()
robust_data = scaler.fit_transform(data)
print("Robust Scaled Data:\n", robust_data)
5. Logarithmic Normalization
Formula:
X' = log(X + c)
Where:
c: A small constant (commonly 1) added so the logarithm is defined at zero.
Description:
Compresses large value ranges and reduces right skew; applicable only to non-negative data.
Python Implementation:
import numpy as np
data = np.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
log_data = np.log1p(data) # log1p is log(X + 1)
print("Logarithmic Normalized Data:\n", log_data)
6. L2 Normalization
Formula:
X' = X / ||X||₂
Where:
||X||₂ = √(Σ Xᵢ²), the Euclidean norm of the sample vector X.
Description:
Scales each sample (row) so that its Euclidean norm equals 1. Note that, unlike the previous techniques, this operates per sample rather than per feature.
Use Cases:
Text classification and similarity computations (e.g., cosine similarity), where only the direction of a vector matters, not its magnitude.
Python Implementation:
import numpy as np
from sklearn.preprocessing import Normalizer
data = np.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
scaler = Normalizer(norm='l2')
l2_data = scaler.fit_transform(data)
print("L2 Normalized Data:\n", l2_data)