
Principal Component Analysis

Rohan Bansal

1 Introduction
Principal Component Analysis is a tool used to reduce the dimensions of a set of variables while still retaining the majority of the information. Working with a high-dimensional dataset often leads to difficulties: a low-dimensional dataset is easier to analyze and visualize, and it is less expensive to store. A given set of variables might be correlated, which causes certain redundancies. PCA removes these redundancies by transforming the original variables into an uncorrelated set of variables. This is analogous to working with a colour image by collapsing the red, green and blue channels into shades of white and black. In simpler terms, it is always easier to work in 2D than in 3D.

In PCA, we try to project a d-dimensional variable, say x ∈ R^d, onto an orthogonal set of k vectors in R^d, say u = [b_1, b_2, ..., b_k]. Then we can write the projection of the vector x in terms of the vectors b_1, b_2, ..., b_k. Let us denote the projection of x by π_u(x):

π_u(x) = Σ_{i=1}^{k} λ_i b_i

Using the property of orthogonality, we get

⟨π_u(x) − x, b_i⟩ = 0          (1)

Substituting π_u(x) = Bλ in eq. (1), where B is the d × k matrix consisting of b_1, b_2, ..., b_k as columns and λ = [λ_1, ..., λ_k]^T, we get

π_u(x) = B(B^T B)^{-1} B^T x

This is the basic intuition of PCA. Our next job is to find the k-dimensional subspace maximizing the (uncentered) variance of this d-dimensional variable inside the subspace [1]. So our optimisation problem can be written as [1]:

max_B  B^T x x^T B          (2)

subject to  B^T B = I
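As a quick numerical check of the projection formula, the following sketch projects a random vector onto a 2-dimensional subspace of R^4; the basis B is generated at random and is purely illustrative.

import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 2

# Random orthonormal basis B (columns b_1, ..., b_k) via QR decomposition.
B, _ = np.linalg.qr(rng.normal(size=(d, k)))
x = rng.normal(size=d)

# Projection pi_u(x) = B (B^T B)^{-1} B^T x; with orthonormal B this reduces to B B^T x.
proj = B @ np.linalg.solve(B.T @ B, B.T @ x)

# Orthogonality check from eq. (1): <pi_u(x) - x, b_i> = 0 for every column b_i.
print(B.T @ (proj - x))        # numerically close to the zero vector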

2 PCA Algorithms
2.1 Singular Value Decomposition
In principal component analysis we find the directions in the data with the most variation, i.e. the eigenvectors corresponding to the largest eigenvalues of the covariance matrix, and project the data onto these directions. Suppose U is the matrix whose columns are the eigenvectors corresponding to the k largest eigenvalues of the covariance matrix of X; then the PCA transformation is given by Y = U^T X.
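A minimal numpy sketch of this procedure is given below (not the exact code used in the implementation later); the function name pca_svd and the choice to centre the data before the decomposition are illustrative.

import numpy as np

def pca_svd(X, n_components):
    """PCA via singular value decomposition.

    X is an (n_samples, d) data matrix; returns the projected data,
    the principal directions and the variance captured by each of them.
    """
    # Centre the data so the SVD of X corresponds to the eigendecomposition
    # of the covariance matrix.
    X_centred = X - X.mean(axis=0)
    # Thin SVD: X_centred = U_svd @ diag(S) @ Vt
    U_svd, S, Vt = np.linalg.svd(X_centred, full_matrices=False)
    # Rows of Vt are the eigenvectors of the covariance matrix, ordered by
    # decreasing singular value, hence by decreasing variance.
    components = Vt[:n_components]            # shape (k, d)
    Y = X_centred @ components.T              # projected data, shape (n, k)
    explained_variance = S[:n_components] ** 2 / (len(X) - 1)
    return Y, components, explained_variance

# Example: project 6-dimensional data onto its top 2 principal components.
X = np.random.rand(100, 6)
Y, components, var = pca_svd(X, n_components=2)
print(Y.shape, var)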

2.2 Stochastic Gradient Descent


Stochastic Gradient Descent (SGD) is a simple method to carry out the optimization in eq. (2). It iteratively updates the matrix B at each data point x, just like gradient descent. The analysis can be difficult, given that the optimization objective is not convex. So we start with a unit-norm iterate B_0, and then keep updating it according to equation (3):

B_{t+1} = (I + η x x^T) B_t          (3)

where η is the step-size parameter [5]. The algorithm is highly efficient in terms of memory and runtime per iteration, requiring storage of a single d-dimensional vector and performing only vector-vector and vector-scalar products in each iteration [5].
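A minimal sketch of this update for the top direction (k = 1) is shown below, assuming centred data and adding a renormalisation step at each iteration to keep the iterate at unit norm; this is a common variant and not necessarily the exact scheme analysed in [5].

import numpy as np

def sgd_pca_top_direction(X, eta=0.01, n_epochs=5, seed=0):
    """Stochastic-gradient (Oja-style) estimate of the top principal direction.

    Implements the update b <- (I + eta * x x^T) b from eq. (3) for k = 1,
    followed by renormalisation so that b stays a unit vector.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    b = rng.normal(size=d)
    b /= np.linalg.norm(b)                     # start from a random unit vector B_0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            x = X[i]
            b = b + eta * x * (x @ b)          # (I + eta * x x^T) b, using only vector ops
            b /= np.linalg.norm(b)             # keep the iterate on the unit sphere
    return b

# Example: the recovered direction should align with the top eigenvector.
X = np.random.randn(500, 6) @ np.diag([3.0, 1.0, 1.0, 0.5, 0.5, 0.1])
b = sgd_pca_top_direction(X - X.mean(axis=0))
print(b)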

2.3 Matrix Stochastic Gradient


We try to formulate a better algorithm than SGD because SGD does not work well on non-convex optimization problems. We parameterise our subspace using the projection matrix M = BB^T and relax our constraints by taking the convex hull of the feasible region [2]. This changes our constraint to trace(M) = k with 0 ⪯ M ⪯ I, which is a convex feasible set. The remaining steps are the same as in SGD; a rank-k solution is sampled from the average of the iterates.
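As a rough illustration, the sketch below performs an MSG-style iteration under these assumptions: a gradient step on M followed by projection of its eigenvalues onto [0, 1] with trace k, where the common offset used in the projection is found by bisection. This is a simplified sketch, not the exact routine of [2].

import numpy as np

def project_to_fantope(M, k, tol=1e-8):
    """Project a symmetric matrix onto {M : 0 <= eig(M) <= 1, trace(M) = k}.

    The projection shifts the eigenvalues by a common offset and clips them
    to [0, 1] so that they sum to k; the offset is found by bisection.
    """
    eigvals, eigvecs = np.linalg.eigh(M)
    lo, hi = -1.0 - eigvals.max(), 1.0 - eigvals.min()
    while hi - lo > tol:
        shift = (lo + hi) / 2.0
        if np.clip(eigvals + shift, 0.0, 1.0).sum() > k:
            hi = shift
        else:
            lo = shift
    clipped = np.clip(eigvals + (lo + hi) / 2.0, 0.0, 1.0)
    return (eigvecs * clipped) @ eigvecs.T     # V diag(clipped) V^T

def msg_pca(X, k, eta=0.05, seed=0):
    """Matrix Stochastic Gradient sketch: gradient step on M, then projection."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    M = np.zeros((d, d))
    M_avg = np.zeros((d, d))
    for i in rng.permutation(n):
        x = X[i]
        M = project_to_fantope(M + eta * np.outer(x, x), k)
        M_avg += M / n
    # A rank-k subspace can be read off from the top-k eigenvectors of the average.
    _, vecs = np.linalg.eigh(M_avg)
    return vecs[:, -k:]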

2.3.1 Capped MSG


This algorithm puts a further constraint on the rank of the iterates, which makes it computationally faster and more efficient than MSG [2]. For K = k, it is similar to the incremental algorithm of [2].

3 Implementation
Here, we implement PCA (computed via SVD) on a random dataset. The dataset is created with the 'random' library and the analysis is done with the sklearn library.
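A minimal sketch of this workflow is given below; the dataset is regenerated at random, so its values will differ from the table that follows, and the plotting of the variance figures is omitted.

import random
import pandas as pd
from sklearn.decomposition import PCA

# Build a random dataset with twelve columns, mirroring the one tabulated below.
random.seed(0)
columns = [f"x{i}" for i in range(1, 7)] + [f"y{i}" for i in range(1, 7)]
data = {c: [random.randint(0, 1000) for _ in range(100)] for c in columns}
df = pd.DataFrame(data)

# sklearn's PCA uses SVD internally; keep all components to inspect the variance.
pca = PCA()
scores = pca.fit_transform(df)

# Variance captured by each principal component (cf. Figures 1 and 2).
for i, ratio in enumerate(pca.explained_variance_ratio_, start=1):
    print(f"PC{i}: {ratio:.3f}")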

x1 x2 x3 x4 x5 x6 y1 y2 y3 y4 y5 y6
754 787 742 759 785 722 290 243 294 245 265 254
492 501 495 489 494 483 502 515 474 507 452 468
57 63 54 53 71 56 442 459 460 444 477 483
769 744 753 733 766 782 738 756 782 795 772 750
863 894 886 884 894 876 557 572 567 537 602 546
Dataset

Figure 1: Variance captured by each component

We choose PC1 and PC2 as the components to represent the data, as they capture the maximum variance.

Figure 2: Variance comparison

We can easily observe that PC1 captures the maximum variance.

4 Conclusion
4.1 PCA in noisy settings
So far, we have considered clean stochastic settings for the implementation of our algorithms. Here, we move on to noisy settings, i.e., we deal with noisy gradients and missing data [6]. Oja's method works well in the case of bounded noise. It cannot be applied if the noise is unbounded, because the maximization objective can then never be achieved.

4.2 Kernel PCA
Kernel methods represent an important class of machine learning algorithms that enjoy both strong theoretical guarantees and strong empirical performance [7]. Standard PCA only allows linear dimensionality reduction. However, if the data has more complicated structure which cannot be well represented in a linear subspace, standard PCA will not be very helpful. Fortunately, kernel PCA allows us to generalize standard PCA to nonlinear dimensionality reduction [8].
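As an illustration, a nonlinear dataset such as two concentric circles cannot be separated by any linear projection, but kernel PCA with an RBF kernel can unfold it. A minimal sklearn sketch follows; the choice of gamma is illustrative.

from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles: a structure no linear 1-D projection can separate.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear_scores = PCA(n_components=2).fit_transform(X)

# RBF kernel PCA maps the data into a feature space where the circles separate.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0)
kernel_scores = kpca.fit_transform(X)

print(linear_scores[:3])
print(kernel_scores[:3])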

4.3 Partial Least Squares


PLS can be regarded as a substitute for the method of multiple regression. Suppose we have a dataset consisting of two sets of variables (which may have different dimensions); we need to find a lower-dimensional subspace that captures the maximum covariance between the two sets. It is often posed as the following problem: given a dataset of n samples of two sets of variates (or views), x ∈ R^{d_x} and y ∈ R^{d_y}, respectively, what is the k-dimensional subspace that captures most of the covariance between the two views [4]? The rest of the method is the same as that discussed earlier for PCA, with some changes in the optimization objective.
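A minimal sketch using sklearn's cross-decomposition module (PLSSVD, which extracts directions of maximal covariance between two views) is shown below; this is a batch illustration of the objective, not the stochastic method of [4], and the two random views are purely illustrative.

import numpy as np
from sklearn.cross_decomposition import PLSSVD

# Two views of the same samples, with different dimensionalities.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))                   # shared structure
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
Y = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))

# Find a k = 2 dimensional subspace per view capturing maximal covariance.
pls = PLSSVD(n_components=2)
X_scores, Y_scores = pls.fit_transform(X, Y)
print(X_scores.shape, Y_scores.shape)                # (200, 2) (200, 2)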

4.4 Canonical Correlation Analysis


Canonical Correlation Analysis (CCA) is a ubiquitous statistical technique for finding maximally correlated linear components of two sets of random variables [9]. It is posed as the problem given in [9, para. 1], which is a non-convex problem. Another difficulty, compared to PCA and most other machine learning problems, is that the constraints also involve stochastic quantities that depend on the unknown distribution D, and the CCA objective does not decompose over samples. To address the non-convexity, we extend the problem to a Matrix Stochastic Gradient formulation for CCA [9].
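For completeness, a minimal sketch of batch CCA with sklearn follows; this uses the standard iterative estimator rather than the stochastic MSG-style method of [9], and the data is synthetic and illustrative.

import numpy as np
from sklearn.cross_decomposition import CCA

# Two views whose correlation is driven by a shared latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 1))
X = np.hstack([latent + 0.2 * rng.normal(size=(300, 1)) for _ in range(5)])
Y = np.hstack([latent + 0.2 * rng.normal(size=(300, 1)) for _ in range(3)])

cca = CCA(n_components=1)
X_c, Y_c = cca.fit_transform(X, Y)

# The first pair of canonical variates should be strongly correlated.
print(np.corrcoef(X_c[:, 0], Y_c[:, 0])[0, 1])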

References
[1] Raman Arora, Andrew Cotter, Karen Livescu and Nathan Srebro, ”Stochas-
tic optimization for PCA and PLS,” in Allerton, 2012.
[2] Raman Arora, Andrew Cotter and Nathan Srebro, ”Stochastic Optimization
of PCA with Capped MSG,” Advances in Neural Information Processing
Systems 26 (NIPS 2013).
[3] Mianjy, Poorya and Raman Arora. “Stochastic PCA with l2 and l1 Regu-
larization.” ICML (2018).
[4] Arora, Raman, Poorya Mianjy and Teodor Marinov. "Stochastic Optimization for Multiview Representation Learning using Partial Least Squares." ICML (2016), pp. 1786–1794.

[5] Shamir, Ohad. “Convergence of Stochastic Gradient Descent for PCA.”
ICML (2016).
[6] Marinov, Teodor Vanislavov, Poorya Mianjy and Raman Arora. “Streaming
Principal Component Analysis in Noisy Settings.” ICML (2018).

[7] Ullah, Enayat, Poorya Mianjy, Teodor V. Marinov and Raman Arora. "Streaming Kernel PCA with Õ(√n) Random Features." CoRR abs/1808.00934 (2018).
[8] Schölkopf, Bernhard, Smola, Alexander, and Müller, Klaus-Robert. "Kernel principal component analysis," in Advances in Kernel Methods – Support Vector Learning, pp. 327–352. MIT Press, 1999.
[9] Arora, R., Marinov, T.V., Mianjy, P. (2017). Stochastic Approximation for
Canonical Correlation Analysis. NIPS.
