Pattern Recognition and Machine Learning
Dimensionality Reduction
Dipanjan Roy
Associate Professor
School of AIDE
Indian Institute of Technology Jodhpur
Numerous examples of high-dimensional data:
• Documents (e.g., news articles)
• Face images
• Neural population recordings
• MEG readings
• Gene expression data
• High-dimensional brain fMRI data
Motivation and context
Why do dimensionality reduction?
• Computational: compress data ⇒ time/space efficiency
• Statistical: fewer dimensions ⇒ better generalization
• Visualization: understand structure of data
• Anomaly detection: describe normal data, detect outliers
Dimensionality reduction in this course:
• Linear methods (this week)
• Nonlinear methods (later)
Why reduce dimensions?
High dimensionality has many costs:
– Redundant and irrelevant features degrade performance of some ML algorithms
– Difficulty in interpretation and visualization
– Computation may become infeasible: what if your algorithm scales as O(n^3)?
– Curse of dimensionality
Types of problems
• Prediction x → y: classification, regression
  Applications: face recognition, gene expression prediction
  Techniques: kNN, SVM, least squares (+ dimensionality reduction preprocessing)
• Structure discovery x → z: find an alternative representation z of the data x
  Applications: visualization
  Techniques: clustering, linear dimensionality reduction
• Density estimation p(x): model the data
  Applications: anomaly detection, language modeling
  Techniques: clustering, linear dimensionality reduction
Linear dimensionality reduction
Best k-dimensional subspace for projection depends on the task:
– Unsupervised: retain as much data variance as possible
  Example: principal component analysis (PCA)
– Classification: maximize separation among classes
  Example: linear discriminant analysis (LDA)
– Regression: maximize correlation between projected data and response variable
  Example: partial least squares (PLS)
Basic idea of linear dimensionality reduction
Represent each face as a high-dimensional vector x ∈ R^361
Encode each face as z = U^T x, with z ∈ R^10
How do we choose U?
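To make the encoding concrete before answering that question, here is a minimal sketch using an arbitrary orthonormal U (the dimensions 361 and 10 come from the slide; the face vector and U are random placeholders, whereas PCA will choose U from the data):

    import numpy as np

    d, k = 361, 10                     # face vector dimension, target dimension
    x = np.random.rand(d)              # a face image flattened to x in R^361 (illustrative data)

    # Any matrix U with orthonormal columns defines an encoding z = U^T x.
    # Here U is random; PCA (next slides) chooses U to preserve the data best.
    U, _ = np.linalg.qr(np.random.randn(d, k))   # d x k, orthonormal columns

    z = U.T @ x                        # encoded representation, z in R^10
    print(z.shape)                     # (10,)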
Outline
• Principal component analysis (PCA)
– Basic principles
– Case studies
• Linear discriminant analysis (LDA)
• Fisher discriminant analysis (FDA)
• Canonical correlation analysis (CCA)
• Independent Component Analysis (ICA)
• Summary
Dimensionality reduction setup
Given n data points in d dimensions: x_1, . . . , x_n ∈ R^d
X = (x_1 · · · x_n) ∈ R^{d×n}
Want to reduce dimensionality from d to k
Choose k directions u_1, . . . , u_k
U = (u_1 · · · u_k) ∈ R^{d×k}
For each u_j, compute "similarity" z_j = u_j^T x
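In matrix form, the encodings of all points can be computed at once; a minimal sketch with synthetic data, following the column-per-point convention above:

    import numpy as np

    n, d, k = 100, 20, 3                 # n points, d dimensions, target dimension k
    X = np.random.randn(d, n)            # data matrix, one column per point (d x n)

    U, _ = np.linalg.qr(np.random.randn(d, k))   # k directions u_1..u_k as columns (d x k)

    Z = U.T @ X                          # all similarities at once: Z[j, i] = u_j^T x_i
    print(Z.shape)                       # (k, n) = (3, 100)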
PCA objective 1: reconstruction error
U serves two functions:
• Encode: z = U^T x, z_j = u_j^T x
• Decode: x̃ = U z = Σ_{j=1}^k z_j u_j
Want the reconstruction error ‖x − x̃‖ to be small
Objective: minimize total squared reconstruction error
min_{U ∈ R^{d×k}} Σ_{i=1}^n ‖x_i − U U^T x_i‖^2
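A minimal numerical sketch of this objective on synthetic data; it takes as given the standard result (developed in the following slides) that the top-k eigenvectors of the empirical covariance minimize this error:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 500, 10, 3
    X = rng.normal(size=(d, n))
    X -= X.mean(axis=1, keepdims=True)            # center the data

    def recon_error(U, X):
        # total squared reconstruction error: sum_i ||x_i - U U^T x_i||^2
        return np.sum((X - U @ (U.T @ X)) ** 2)

    # PCA choice of U: top-k eigenvectors of the empirical covariance
    C = X @ X.T / n
    eigvals, eigvecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    U_pca = eigvecs[:, -k:]

    # A random orthonormal U for comparison
    U_rand, _ = np.linalg.qr(rng.normal(size=(d, k)))

    print(recon_error(U_pca, X) <= recon_error(U_rand, X))   # True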
PCA objective 2: projected variance
Empirical distribution: uniform over x_1, . . . , x_n
Expectation (think sum over data points): Ê[f(x)] = (1/n) Σ_{i=1}^n f(x_i)
Variance (think sum of squares if centered): v̂ar[f(x)] = Ê[f(x)^2] − Ê[f(x)]^2
Assume the data is centered: Ê[x] = 0
Objective: maximize the variance of the projected data, max_{U} Ê[‖U^T x‖^2] = (1/n) Σ_{i=1}^n ‖U^T x_i‖^2
Dimensionality reduction from Multi Trial Recordings
Trial averaged and concatenated PCA
Equivalence in two objectives
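The two objectives are equivalent because, for U with orthonormal columns, ‖x − U U^T x‖^2 = ‖x‖^2 − ‖U^T x‖^2, so reconstruction error and projected variance sum to the total variance. A quick numerical check of this (a minimal sketch with synthetic centered data):

    import numpy as np

    rng = np.random.default_rng(1)
    n, d, k = 200, 8, 2
    X = rng.normal(size=(d, n))
    X -= X.mean(axis=1, keepdims=True)                 # centered data

    U, _ = np.linalg.qr(rng.normal(size=(d, k)))       # any orthonormal U

    recon_err = np.sum((X - U @ (U.T @ X)) ** 2) / n   # average reconstruction error
    proj_var  = np.sum((U.T @ X) ** 2) / n             # average projected variance
    total_var = np.sum(X ** 2) / n                     # total variance of the data

    print(np.isclose(recon_err + proj_var, total_var)) # True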
Finding one principal component
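The leading principal component is the top eigenvector of the empirical covariance; one standard way to find it is power iteration. A minimal sketch on synthetic data (this may differ from the exact derivation on the slide):

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(5, 1000))
    X -= X.mean(axis=1, keepdims=True)      # center

    C = X @ X.T / X.shape[1]                # empirical covariance (d x d)

    # Power iteration: repeatedly apply C and renormalize to find the top eigenvector
    u = rng.normal(size=C.shape[0])
    for _ in range(100):
        u = C @ u
        u /= np.linalg.norm(u)

    # Compare with the eigenvector from a direct eigendecomposition
    _, V = np.linalg.eigh(C)
    u_top = V[:, -1]
    print(np.isclose(abs(u @ u_top), 1.0))  # True: same direction up to sign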
How many principal components?
• Similar to the question of "How many clusters?"
• The magnitude of the eigenvalues indicates the fraction of variance captured.
• Eigenvalues on a face image dataset (figure: eigenvalue spectrum λ_i versus component index i, dropping from 1353.2 for the largest component to 287.1 by around the tenth)
• Eigenvalues typically drop off sharply, so we don't need that many.
• Of course, variance isn't everything...
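A common heuristic is to keep enough components to explain a chosen fraction of the variance; a minimal sketch (the 95% threshold and the synthetic data are illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.normal(size=(50, 300)) * np.linspace(5, 0.1, 50)[:, None]  # d x n, decaying scales
    X -= X.mean(axis=1, keepdims=True)

    eigvals = np.linalg.eigvalsh(X @ X.T / X.shape[1])[::-1]   # eigenvalues, largest first
    explained = np.cumsum(eigvals) / eigvals.sum()             # cumulative fraction of variance

    k = int(np.searchsorted(explained, 0.95)) + 1              # smallest k capturing 95% variance
    print(k, explained[k - 1])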
Summary of PCA
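Putting the pieces together, a minimal end-to-end PCA sketch (center, eigendecompose the covariance, project; function and variable names are illustrative, not a reference implementation):

    import numpy as np

    def pca_fit(X, k):
        """X: d x n data matrix (columns are points). Returns (U, mean)."""
        mu = X.mean(axis=1, keepdims=True)
        Xc = X - mu                                  # center the data
        C = Xc @ Xc.T / X.shape[1]                   # empirical covariance
        _, V = np.linalg.eigh(C)                     # eigenvectors, ascending eigenvalues
        U = V[:, -k:][:, ::-1]                       # top-k eigenvectors, largest first
        return U, mu

    def pca_encode(X, U, mu):
        return U.T @ (X - mu)                        # z = U^T (x - mean)

    def pca_decode(Z, U, mu):
        return U @ Z + mu                            # reconstruction x~ = U z + mean

    X = np.random.default_rng(4).normal(size=(20, 500))
    U, mu = pca_fit(X, k=5)
    Z = pca_encode(X, U, mu)
    print(Z.shape, np.sum((X - pca_decode(Z, U, mu)) ** 2))   # (5, 500), total reconstruction error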
Reducing Matrix Dimensions
◾ Often, our data can be represented by an m-by-n matrix
◾ And this matrix can be closely approximated by the product of three matrices that share a small common dimension r:
  A (m × n) ≈ U (m × r) · Σ (r × r) · V^T (r × n)
Jure Leskovec & Mina Ghashami
SVD Definition
A (m × n) = U (m × r) · Σ (r × r) · V^T (r × n)
◾ A: Input data matrix
  m × n matrix (e.g., m documents, n terms)
◾ U: Left singular vectors
  m × r matrix (m documents, r concepts)
◾ Σ: Singular values
  r × r diagonal matrix (strength of each 'concept')
  (r: rank of the matrix A)
◾ V: Right singular vectors
  n × r matrix (n terms, r concepts)
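A minimal sketch of these shapes with NumPy's SVD (the reduced form returns r = min(m, n) factors; keeping only the leading ones gives a low-rank approximation):

    import numpy as np

    m, n = 6, 4
    A = np.random.default_rng(5).normal(size=(m, n))     # input data matrix (m x n)

    U, s, Vt = np.linalg.svd(A, full_matrices=False)     # reduced SVD
    print(U.shape, s.shape, Vt.shape)                    # (6, 4) (4,) (4, 4)

    # Reconstruct A from the factors: A = U Sigma V^T
    A_rec = U @ np.diag(s) @ Vt
    print(np.allclose(A, A_rec))                         # True

    # Singular values are non-negative and sorted in decreasing order
    print(np.all(s[:-1] >= s[1:]), np.all(s >= 0))       # True True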
SVD steps for estimation of eigenvectors
A = σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + · · ·
σ_i … scalar, u_i … vector, v_i … vector
If we set σ_2 = 0, the second term drops out: the corresponding columns of U and V no longer contribute to A.
It is always possible to decompose a real matrix A into A = U Σ V^T, where
◾ U, Σ, V: unique
◾ U, V: column orthonormal
  U^T U = I; V^T V = I (I: identity matrix)
  (Columns are orthogonal unit vectors)
◾ Σ: diagonal
  Entries (singular values) are non-negative and sorted in decreasing order (σ_1 ≥ σ_2 ≥ . . . ≥ 0)
Nice proof of uniqueness: https://www.cs.cornell.edu/courses/cs322/2008sp/stuff/TrefethenBau_Lec4_SVD.pdf
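Consistent with the slide title, the singular vectors of a centered data matrix are exactly the PCA eigenvectors; a minimal check, assuming the column-per-point d × n convention used earlier (so the left singular vectors match the covariance eigenvectors):

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.normal(size=(5, 200))
    X -= X.mean(axis=1, keepdims=True)              # center (d x n, columns are points)

    # Eigenvectors of the empirical covariance (PCA directions)
    C = X @ X.T / X.shape[1]
    _, V = np.linalg.eigh(C)
    u_pca = V[:, -1]                                # top principal component

    # Left singular vectors of the centered data matrix
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    u_svd = U[:, 0]                                 # leading left singular vector

    print(np.isclose(abs(u_pca @ u_svd), 1.0))      # True: same direction up to sign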
Large-scale Brain Networks in M/EEG and dimension reduction?
• What is happening at faster time-scales?
• What are the specific neuronal interactions?
• Can we use MEG to answer these questions?
  – Excellent temporal resolution (milliseconds)
  – Good spatial resolution
  – Non-invasive
SVD on High-dimensional Brain Data
(Figure: spectrum, cross-spectral matrix, Karhunen-Loève transform, global coherence)
Sahoo et al. (2020), NeuroImage
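For context: one common summary of a cross-spectral matrix is its global coherence, often defined at each frequency as the largest eigenvalue divided by the sum of all eigenvalues. A hedged sketch on a simulated cross-spectral matrix (the exact pipeline in Sahoo et al. (2020) may differ):

    import numpy as np

    rng = np.random.default_rng(7)

    # Simulated cross-spectral matrix at one frequency: Hermitian, positive semi-definite
    n_sensors = 32
    Z = rng.normal(size=(n_sensors, 100)) + 1j * rng.normal(size=(n_sensors, 100))
    S = Z @ Z.conj().T / Z.shape[1]

    # Global coherence: fraction of total cross-spectral power captured by the
    # leading eigenvector (i.e., the first Karhunen-Loeve / SVD component)
    eigvals = np.linalg.eigvalsh(S)            # real, ascending (S is Hermitian)
    global_coherence = eigvals[-1] / eigvals.sum()
    print(global_coherence)                    # between 1/n_sensors and 1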
Linear dimensionality reduction
Best k-dimensional subspace for projection depends on the task:
– Unsupervised: retain as much data variance as possible
  Example: principal component analysis (PCA)
– Classification: maximize separation among classes
  Example: linear discriminant analysis (LDA)
– Regression: maximize correlation between projected data and response variable
  Example: partial least squares (PLS)
LDA for two classes
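Before the formal development, a minimal preview sketch of the standard two-class LDA direction w ∝ S_W^{-1}(μ_2 − μ_1), on synthetic data (the data and variable names are illustrative):

    import numpy as np

    rng = np.random.default_rng(8)

    # Two synthetic classes in d = 2 dimensions
    X1 = rng.normal(loc=[0, 0], scale=1.0, size=(100, 2))
    X2 = rng.normal(loc=[3, 1], scale=1.0, size=(100, 2))

    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)

    # Within-class scatter matrix
    S_W = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)

    # LDA projection direction: separates class means relative to within-class scatter
    w = np.linalg.solve(S_W, mu2 - mu1)
    w /= np.linalg.norm(w)

    # Project both classes onto w: their 1-D projections should be well separated
    print((X1 @ w).mean(), (X2 @ w).mean())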