PCA Numerical Example (4D to 2D): Fully Detailed Step-by-Step Solution
GOAL
We apply Principal Component Analysis (PCA) to reduce a 4-dimensional dataset to 2
dimensions, working through every intermediate calculation.
Given Data Matrix X — 5 Samples, 4 Features
X = [
[4, 2, 0, 1],
[2, 4, 0, 3],
[2, 2, 2, 3],
[4, 4, 2, 5],
[6, 6, 4, 7]
]
Step 1: Standardize the Data (Z-score Normalization)
Mean of features (μ):
μ = [3.6, 3.6, 1.6, 3.8]
Sample standard deviation (σ), computed with n − 1 in the denominator:
σ ≈ [1.673, 1.673, 1.673, 2.280]
Standardized matrix Z = (X − μ) / σ ≈
[
[0.239, -0.956, -0.956, -1.228],
[-0.956, 0.239, -0.956, -0.351],
[-0.956, -0.956, 0.239, -0.351],
[0.239, 0.239, 0.239, 0.526],
[1.434, 1.434, 1.434, 1.403]
]
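These values can be reproduced with a few lines of NumPy. The sketch below is only an added illustration (the variable names mu, sigma, Z are ours, not part of the worked solution); note ddof=1, i.e. the sample standard deviation, which is what the σ values above use.

import numpy as np

# 5 samples (rows) x 4 features (columns)
X = np.array([[4, 2, 0, 1],
              [2, 4, 0, 3],
              [2, 2, 2, 3],
              [4, 4, 2, 5],
              [6, 6, 4, 7]], dtype=float)

mu = X.mean(axis=0)              # [3.6, 3.6, 1.6, 3.8]
sigma = X.std(axis=0, ddof=1)    # sample std: [1.673, 1.673, 1.673, 2.280]
Z = (X - mu) / sigma             # standardized (z-scored) data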
Step 2: Covariance Matrix (C)
With Z standardized, C = Zᵀ Z / (n − 1), which is also the correlation matrix of the original features.
C ≈
[1, 0.643, 0.643, 0.629],
[0.643, 1, 0.643, 0.891],
[0.643, 0.643, 1, 0.891],
[0.629, 0.891, 0.891, 1]
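One way to check this step in NumPy (a self-contained sketch that rebuilds Z from Step 1; np.cov with rowvar=False treats each column as a variable and divides by n − 1):

import numpy as np

X = np.array([[4, 2, 0, 1], [2, 4, 0, 3], [2, 2, 2, 3],
              [4, 4, 2, 5], [6, 6, 4, 7]], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

C = np.cov(Z, rowvar=False)      # equivalent to Z.T @ Z / (len(Z) - 1)
# Because Z is standardized, C is also the correlation matrix of X.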
Step 3: Eigenvalues and Eigenvectors
Eigenvalues of C ≈ [3.181, 0.448, 0.357, 0.014]
Total variance = 4 (the trace of C: each standardized feature contributes variance 1)
Explained Variance:
PC1 ≈ 3.181 / 4 ≈ 79.5%
PC2 ≈ 0.448 / 4 ≈ 11.2%
Together: ≈ 90.7%
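A sketch for verifying the eigenvalues and the explained-variance split; np.linalg.eigh is used because C is symmetric, and since it returns eigenvalues in ascending order, both arrays are reversed:

import numpy as np

X = np.array([[4, 2, 0, 1], [2, 4, 0, 3], [2, 2, 2, 3],
              [4, 4, 2, 5], [6, 6, 4, 7]], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
C = np.cov(Z, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(C)                   # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]     # largest first
explained = eigvals / eigvals.sum()                    # eigvals.sum() == trace(C) == 4
print(eigvals)               # ≈ [3.181, 0.448, 0.357, 0.014]
print(explained[:2].sum())   # ≈ 0.907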
Step 4: Select Top 2 Eigenvectors (Principal Components)
The eigenvectors of the two largest eigenvalues (3.181 and 0.448) form the columns of the projection matrix W. The overall sign of each eigenvector is arbitrary.
W ≈
[0.452, -0.884],
[0.502, 0.196],
[0.502, 0.196],
[0.540, 0.376]
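In code, selecting the top two components is just taking the two eigenvector columns with the largest eigenvalues. The self-contained sketch below (our own variable names) does this with argsort; the columns it returns may be negated relative to the table above, since eigenvector signs are arbitrary:

import numpy as np

X = np.array([[4, 2, 0, 1], [2, 4, 0, 3], [2, 2, 2, 3],
              [4, 4, 2, 5], [6, 6, 4, 7]], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))

order = np.argsort(eigvals)[::-1]   # eigenvalue indices, largest first
W = eigvecs[:, order[:2]]           # 4 x 2 projection matrix: columns are PC1, PC2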
Step 5: Project Data onto Principal Components
Z_PCA = Z · W ≈
[-1.52, -1.05],
[-0.98, 0.57],
[-0.98, 0.57],
[0.63, 0.08],
[2.85, -0.18]
Samples 2 and 3 project to the same point: they differ only in features 2 and 3, which carry equal weights in both PC1 and PC2, so only PC3 separates them.
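The projection is a single matrix product, and the result can be sanity-checked: the sample variance of each projected column should equal the corresponding eigenvalue. A self-contained sketch (column signs may be flipped relative to the table above):

import numpy as np

X = np.array([[4, 2, 0, 1], [2, 4, 0, 3], [2, 2, 2, 3],
              [4, 4, 2, 5], [6, 6, 4, 7]], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order[:2]]

Z_pca = Z @ W                         # 5 x 2: each sample in (PC1, PC2) coordinates
print(Z_pca)
print(Z_pca.var(axis=0, ddof=1))      # ≈ [3.181, 0.448], the two largest eigenvalues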
Final Output: 2D data with ≈ 90.7% of the original variance retained.