Dimension Reduction (PCA/SVD)


Dimension Reduction

(PCA/SVD)
What is Dimension Reduction?
IT IS THE PROCESS OF REDUCING THE NUMBER OF
VARIABLES/FEATURES UNDER CONSIDERATION.

TWO APPROACHES:
✔ FEATURE SELECTION
✔ FEATURE EXTRACTION
Reducing Number of Features
Original or raw data are often sparse and contain many different features.
We reduce these features by selecting the ones that are related and
meaningful to the problem and excluding unrelated features, which
reduces the computation required.

Technically, it is the transformation of data from a high-dimensional
space into a low-dimensional space so that the low-dimensional
representation retains some meaningful properties of the original data.
Examples
Example of selling a house:
Raw Features: Locality, size of rooms, amenities (gym, pool),
train station or airport nearby, marketplace nearby, availability
of water and electricity, type of neighbours, etc.
Related Features: Locality, size of rooms, amenities (gym, pool),
train station or airport nearby, marketplace nearby, availability
of water and electricity.
Excluded Feature: Type of neighbours.
Examples
A + B + C + D + E = Z
Number of operations = 4 additions
If we precompute A + B as a single feature AB, and D = 0 (so it can be dropped),
then
AB + C + E = A + B + C + D + E = Z
Number of operations (AB + C + E) = 2 additions
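A minimal sketch of this arithmetic in Python (the values of A–E are made up for illustration):

```python
A, B, C, D, E = 3, 5, 2, 0, 7

# Original computation: 4 additions
Z = A + B + C + D + E

# Feature extraction: precompute AB = A + B once.
# Feature selection: drop D, since it is always 0.
AB = A + B
Z_reduced = AB + C + E   # only 2 additions at evaluation time

assert Z == Z_reduced    # same result, fewer operations
```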
Principal Component Analysis (PCA)
⮚It is an unsupervised linear transformation technique that is widely used
across different fields, most prominently for feature extraction and
dimensionality reduction.
⮚PCA helps us to identify patterns in data based on the correlation between
features.
⮚ PCA aims to find the directions of maximum variance in high-dimensional data
and projects it onto a new subspace with equal or fewer dimensions than the
original one.
⮚The orthogonal axes (principal components) of the new subspace can be
interpreted as the directions of maximum variance, given the constraint that
the new feature axes are orthogonal to each other.
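A minimal sketch of this idea with NumPy (the random 2-D data below is an illustrative assumption): the eigenvectors of the covariance matrix are the orthogonal principal components, ordered by how much variance they explain.

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: two features, 200 samples (illustrative only)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.5, 0.5]])

# Centre the data, then eigendecompose the covariance matrix
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # ascending order

# Sort by descending variance; columns are the principal components
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order]

# Project onto the first principal component (2-D -> 1-D)
X_1d = X_centered @ components[:, :1]
print(components)          # orthogonal directions of maximum variance
print(X_1d.shape)          # (200, 1)
```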
Principal Component Analysis (PCA)
⮚If we had to project the data onto a single feature or dimension, PCA picks the single direction in the two-dimensional
plane containing the two features that maximizes the variance; the remaining principal component is the line orthogonal to it.

⮚Popular applications of PCA include exploratory data analyses and de-noising of signals in stock market trading, and the
analysis of genome data and gene expression levels in the field of bioinformatics.

⮚PCA is mainly used as the dimensionality reduction technique in various AI applications such as computer vision, image
compression, etc.

⮚It can also be used for finding hidden patterns when the data has high dimensionality. Some fields where PCA is used are
finance, data mining, psychology, etc.
Singular Value Decomposition (SVD)
⮚The Singular-Value Decomposition, or SVD for short, is a matrix decomposition method for
reducing a matrix to its constituent parts in order to make certain subsequent matrix calculations
simpler.
⮚A way to factorize a matrix into singular vectors and singular values
⮚Data reduction
⮚A data-driven generalization of the Fourier transform (FFT)
⮚"Tailored" to the specific problem
⮚A simple, interpretable linear algebra problem
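A minimal sketch with NumPy (the matrix values are made up): np.linalg.svd returns U, the singular values as a vector, and Vᵀ, matching the description above.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Factorize A into its constituent parts: U, the singular values, and V^T
# (full_matrices=False gives the compact "economy" form)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(s)                                 # singular values, returned as a vector

# Rebuild the diagonal Sigma and reconstruct A
A_reconstructed = U @ np.diag(s) @ Vt
print(np.allclose(A, A_reconstructed))   # True
```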
Proof of SVD
An SVD routine takes a matrix and returns the U, Sigma and V^T factors. The diagonal matrix Sigma is typically returned as a
vector of singular values.
To prove the SVD, we want to solve for U, S, and V with:
A = U S Vᵀ,  UᵀU = I,  VᵀV = I
We have 3 unknowns. Hopefully, we can solve for them with the 3 equations above. The transpose of A is
Aᵀ = (U S Vᵀ)ᵀ = V Sᵀ Uᵀ = V S Uᵀ  (since S is diagonal, Sᵀ = S)
Knowing
UᵀU = I and VᵀV = I,
we compute AᵀA:
AᵀA = (V S Uᵀ)(U S Vᵀ) = V S² Vᵀ
The last equation is equivalent to the eigenvector definition for the matrix AᵀA; we just put all the eigenvectors into a matrix:
AᵀA V = V S²,  where AᵀA is the matrix, V holds all the eigenvectors, and S² holds the eigenvalues.
Proof of SVD
V holds all the eigenvectors vᵢ of AᵀA, and S holds the square roots of all the
eigenvalues of AᵀA. We can repeat the same process for AAᵀ and
arrive at a similar equation:
AAᵀ U = U S²
Now, we just solve for U, V and S in
A = U S Vᵀ

and the theorem is proved.
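As a sanity check, these relations can be verified numerically; the random matrix below is only an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))              # arbitrary example matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T
S2 = np.diag(s ** 2)

# A^T A V = V S^2 : columns of V are eigenvectors of A^T A
print(np.allclose(A.T @ A @ V, V @ S2))  # True

# A A^T U = U S^2 : columns of U are eigenvectors of A A^T
print(np.allclose(A @ A.T @ U, U @ S2))  # True
```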


Recap
To recap, the SVD factorizes A = U S Vᵀ, where the columns of V are the eigenvectors of AᵀA, the columns of U are the
eigenvectors of AAᵀ, and the diagonal of S holds the square roots of their shared non-zero eigenvalues (the singular values).
Applications
Google – PageRank

Facebook – Facial recognition

OTT platforms – Building correlation patterns

YouTube – Recommendation system

Image compression & image recovery
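A hedged sketch of the image-compression application (random pixel values stand in for a real grayscale image): keeping only the top k singular values gives a rank-k approximation that needs far fewer numbers to store.

```python
import numpy as np

rng = np.random.default_rng(2)
image = rng.random((256, 256))           # stand-in for a grayscale image

U, s, Vt = np.linalg.svd(image, full_matrices=False)

k = 20                                   # number of singular values to keep
compressed = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Storage: k columns of U, k rows of V^T, and k singular values
original_size = image.size
compressed_size = k * (U.shape[0] + Vt.shape[1] + 1)
print(compressed_size / original_size)   # ~0.16 of the original storage
print(np.linalg.norm(image - compressed) / np.linalg.norm(image))  # relative error
```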


Thank you
