Principal Component Analysis
What is PCA?
• Principal Component Analysis (PCA) is a dimensionality reduction
technique used in data analysis and machine learning. Its main goal is
to simplify a dataset while retaining as much information (variance)
as possible.
Why use PCA?
• Datasets can have many variables (features), which makes them
harder to analyze, visualize, and model.
• Some features may be correlated or redundant.
• PCA transforms the original data into a new set of uncorrelated
variables, called principal components, ordered by the amount of
variance they explain.
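To make the last point concrete, here is a minimal sketch assuming NumPy is available. The synthetic two-feature dataset, the seed, and all variable names are illustrative assumptions, not part of the original material. It checks that two strongly correlated features become uncorrelated after projection, and that the components come out ordered by variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): two strongly correlated features.
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=500)
X = np.column_stack([x1, x2])

# Center the data, then diagonalize its covariance matrix.
Xc = X - X.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Xc, rowvar=False))

# eigh returns eigenvalues in ascending order; reorder to descending variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Project the centered data onto the principal components.
Z = Xc @ eigenvectors

original_corr = np.corrcoef(X, rowvar=False)[0, 1]   # close to 1
component_corr = np.corrcoef(Z, rowvar=False)[0, 1]  # numerically zero
component_var = Z.var(axis=0, ddof=1)                # matches the eigenvalues

print(original_corr, component_corr, component_var)
```

The variance of each projected column equals the corresponding eigenvalue, which is why sorting by eigenvalue is the same as sorting by explained variance.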
Steps
• Standardize the data (needed when features are measured on different scales)
• Compute the covariance matrix
• Calculate the eigenvalues and eigenvectors of the covariance matrix
  - Eigenvectors represent the directions (principal components).
  - Eigenvalues show how much variance each principal component captures.
• Sort the components by variance (eigenvalue), largest first
  - Keep the top k components, which together explain the most variance.
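The steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not a reference implementation: the function name `pca`, its `standardize` flag, and the synthetic data are hypothetical, and the code assumes no feature is constant (a zero standard deviation would cause a division by zero).

```python
import numpy as np

def pca(X, k, standardize=True):
    """Return (scores, components, explained_ratio) for the top k components."""
    X = np.asarray(X, dtype=float)

    # Step 1: center each feature; optionally scale to unit variance.
    Xc = X - X.mean(axis=0)
    if standardize:
        Xc = Xc / X.std(axis=0, ddof=1)  # assumes no constant feature

    # Step 2: covariance matrix of the features (columns).
    cov = np.cov(Xc, rowvar=False)

    # Step 3: eigen-decomposition; eigh is suited to symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Step 4: sort components by variance, largest first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Step 5: keep the top k components and project the data onto them.
    components = eigenvectors[:, :k]
    scores = Xc @ components
    explained_ratio = eigenvalues[:k] / eigenvalues.sum()
    return scores, components, explained_ratio

# Usage on synthetic data (assumption): the third feature nearly duplicates the first.
rng = np.random.default_rng(42)
base = rng.normal(size=(200, 2))
X = np.column_stack([base[:, 0], base[:, 1],
                     base[:, 0] + 0.05 * rng.normal(size=200)])

scores, components, explained_ratio = pca(X, k=2)
print(scores.shape, explained_ratio)
```

Because one of the three features is redundant here, two components should account for nearly all of the variance. In practice, k is often chosen by inspecting the cumulative explained-variance ratio.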