
Step-by-Step Explanation of PCA

Step 1: Standardization

The aim of this step is to standardize the range of the continuous initial variables so that each one of
them contributes equally to the analysis.

More specifically, the reason it is critical to perform standardization prior to PCA is that PCA is quite sensitive to the variances of the initial variables. That is, if there are large differences between the ranges of the initial variables, those variables with larger ranges will dominate over those with small ranges (for example, a variable that ranges between 0 and 100 will dominate over a variable that ranges between 0 and 1), which will lead to biased results. So, transforming the data to comparable scales can prevent this problem.

Mathematically, this can be done by subtracting the mean and dividing by the standard deviation for
each value of each variable.

Once the standardization is done, all the variables will be transformed to the same scale.
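To make this step concrete, here is a minimal NumPy sketch of standardization; the data matrix X is a hypothetical example (rows are observations, columns are variables), not one taken from the text above.

    import numpy as np

    # Hypothetical data matrix: rows are observations, columns are variables.
    X = np.array([[ 92.0, 80.0],
                  [ 60.0, 30.0],
                  [100.0, 70.0]])

    # Standardize each variable: subtract its mean and divide by its standard deviation.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)

    print(X_std.mean(axis=0))  # roughly 0 for every variable
    print(X_std.std(axis=0))   # exactly 1 for every variable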

Step 2: Covariance Matrix Computation

The aim of this step is to understand how the variables of the input data set vary from the mean with respect to each other, or in other words, to see if there is any relationship between them. Sometimes variables are highly correlated in such a way that they contain redundant information, so in order to identify these correlations, we compute the covariance matrix.

The covariance matrix is a p × p symmetric matrix (where p is the number of dimensions) that has as entries the covariances associated with all possible pairs of the initial variables. For example, for a 3-dimensional data set with 3 variables x, y, and z, the covariance matrix is a 3×3 matrix of this form:

    Cov(x,x)  Cov(x,y)  Cov(x,z)
    Cov(y,x)  Cov(y,y)  Cov(y,z)
    Cov(z,x)  Cov(z,y)  Cov(z,z)

Covariance Matrix for 3-Dimensional Data.


Since the covariance of a variable with itself is its variance (Cov(a,a) = Var(a)), the main diagonal (top left to bottom right) actually contains the variances of each initial variable. And since covariance is commutative (Cov(a,b) = Cov(b,a)), the entries of the covariance matrix are symmetric with respect to the main diagonal, which means that the upper and lower triangular portions are equal.
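As a rough sketch of how this matrix can be computed in practice (continuing in NumPy, with randomly generated data standing in for the standardized data of Step 1):

    import numpy as np

    # Stand-in for standardized data: 100 observations of 3 variables (x, y, z).
    rng = np.random.default_rng(0)
    X_std = rng.standard_normal((100, 3))

    # np.cov treats rows as variables by default, so pass rowvar=False
    # (equivalently, np.cov(X_std.T)). The result is a 3 x 3 symmetric matrix.
    cov_matrix = np.cov(X_std, rowvar=False)

    print(cov_matrix.shape)                       # (3, 3)
    print(np.allclose(cov_matrix, cov_matrix.T))  # True: symmetric about the main diagonal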

What do the covariances that we have as entries of the matrix tell us about the correlations
between the variables?

It’s actually the sign of the covariance that matters:


 If positive: the two variables increase or decrease together (they are correlated)
 If negative: one increases when the other decreases (they are inversely correlated)

Now that we know that the covariance matrix is nothing more than a table that summarizes the correlations between all the possible pairs of variables, let’s move to the next step.

Step 3: Compute the eigenvectors and eigenvalues of the covariance matrix to identify the
principal components

Eigenvectors and eigenvalues are the linear algebra concepts that we need to compute from the
covariance matrix in order to determine the principal components of the data.

What you first need to know about eigenvectors and eigenvalues is that they always come in pairs, so
that every eigenvector has an eigenvalue. Also, their number is equal to the number of dimensions of the
data. For example, for a 3-dimensional data set, there are 3 variables, therefore there are 3 eigenvectors
with 3 corresponding eigenvalues.

It is eigenvectors and eigenvalues that are behind all the magic of principal components, because the eigenvectors of the covariance matrix are actually the directions of the axes along which there is the most variance (most information), and these are what we call the Principal Components. Eigenvalues are simply the coefficients attached to the eigenvectors, giving the amount of variance carried in each Principal Component.

By ranking your eigenvectors in order of their eigenvalues, highest to lowest, you get the principal
components in order of significance.

Principal Component Analysis Example:

Let’s suppose that our data set is 2-dimensional with 2 variables x, y, and that the covariance matrix has eigenvectors v1 and v2 with corresponding eigenvalues λ1 and λ2.

If we rank the eigenvalues in descending order, we get λ1>λ2, which means that the eigenvector that
corresponds to the first principal component (PC1) is v1 and the one that corresponds to the second
principal component (PC2) is v2.

After having the principal components, to compute the percentage of variance (information) accounted for by each component, we divide the eigenvalue of each component by the sum of all eigenvalues. If we apply this to the example above, we find that PC1 and PC2 carry respectively 96 percent and 4 percent of the variance of the data.
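Since the numeric eigenvalues and eigenvectors are not reproduced here, the sketch below uses a hypothetical 2 × 2 covariance matrix chosen so that the variance split comes out close to the 96 / 4 percent figure quoted above.

    import numpy as np

    # Hypothetical covariance matrix of a 2-dimensional data set (x, y).
    cov_matrix = np.array([[1.2, 0.9],
                           [0.9, 0.8]])

    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

    # Rank from highest to lowest eigenvalue: column i of `eigenvectors`
    # is then the direction of the (i+1)-th principal component.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Percentage of variance (information) carried by each principal component.
    explained = eigenvalues / eigenvalues.sum()
    print(explained)   # approximately [0.96, 0.04]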
Step 4: Create a Feature Vector

As we saw in the previous step, computing the eigenvectors and ordering them by their eigenvalues in descending order allows us to find the principal components in order of significance. In this step, what we do is choose whether to keep all these components or discard those of lesser significance (those with low eigenvalues), and form with the remaining ones a matrix of vectors that we call the Feature Vector.

So, the feature vector is simply a matrix that has as columns the eigenvectors of the components that we decide to keep. This makes it the first step towards dimensionality reduction, because if we choose to keep only k eigenvectors (components) out of p, the final data set will have only k dimensions.

Principal Component Analysis Example:

Continuing with the example from the previous step, we can either form a feature vector with both of
the eigenvectors v1 and v2:

Or discard the eigenvector v2, which is the one of lesser significance, and form a feature vector with v1
only:

Discarding the eigenvector v2 will reduce dimensionality by 1 and will consequently cause a loss of information in the final data set. But given that v2 was carrying only 4 percent of the information, the loss is therefore not important, and we will still have the 96 percent of the information that is carried by v1.

So, as we saw in the example, it’s up to you to choose whether to keep all the components or discard the ones of lesser significance, depending on what you are looking for. If you just want to describe your data in terms of new variables (principal components) that are uncorrelated, without seeking to reduce dimensionality, leaving out the less significant components is not necessary.
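As a minimal sketch of this step, assuming the eigenvectors have already been sorted by descending eigenvalue as in the previous sketch (the numbers below are hypothetical):

    import numpy as np

    # Eigenvectors sorted by descending eigenvalue (hypothetical values);
    # column i is the direction of the (i+1)-th principal component.
    eigenvectors = np.array([[0.78, -0.63],
                             [0.63,  0.78]])

    # Keep the top k components: the feature vector is a p x k matrix
    # whose columns are the retained eigenvectors.
    k = 1
    feature_vector = eigenvectors[:, :k]

    print(feature_vector.shape)   # (2, 1): only v1 is kept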

Step 5: Recast the Data Along the Principal Components Axes

In the previous steps, apart from standardization, you do not make any changes to the data; you just select the principal components and form the feature vector, but the input data set always remains in terms of the original axes (i.e., in terms of the initial variables).

In this step, which is the last one, the aim is to use the feature vector formed using the eigenvectors of
the covariance matrix, to reorient the data from the original axes to the ones represented by the
principal components (hence the name Principal Components Analysis). This can be done by
multiplying the transpose of the original data set by the transpose of the feature vector.
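Here is a minimal sketch of this final projection, assuming (as in the earlier sketches) that X_std holds the standardized data with observations in rows and that feature_vector is the p × k matrix of retained eigenvectors; both arrays below are hypothetical placeholders.

    import numpy as np

    # Hypothetical inputs: standardized data and a feature vector keeping one component.
    X_std = np.random.default_rng(0).standard_normal((100, 2))
    feature_vector = np.array([[0.78],
                               [0.63]])

    # FinalData = FeatureVector^T x StandardizedData^T, transposed back so that
    # rows are again observations; this is the same as X_std @ feature_vector.
    final_data = (feature_vector.T @ X_std.T).T

    print(final_data.shape)   # (100, 1): the data expressed along PC1 only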
Covariance Matrix

Example 1: The marks scored by 3 students in Physics and Biology are given below:
Student   Physics (X)   Biology (Y)
A         92            80
B         60            30
C         100           70

Calculate Covariance Matrix from the above data.


Solution:
The entries of the sample covariance matrix are given by var(x) = Σ(xᵢ − μx)² / (n − 1) and cov(x, y) = Σ(xᵢ − μx)(yᵢ − μy) / (n − 1).
Here, μx = 84, n = 3
var(x) = [(92 − 84)² + (60 − 84)² + (100 − 84)²] / (3 − 1) = 448
Also, μy = 60, n = 3
var(y) = [(80 − 60)² + (30 − 60)² + (70 − 60)²] / (3 − 1) = 700
Now, cov(x, y) = cov(y, x) = [(92 − 84)(80 − 60) + (60 − 84)(30 − 60) + (100 − 84)(70 − 60)] / (3 − 1) = 520.
The sample covariance matrix is therefore:

    [ 448  520 ]
    [ 520  700 ]
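The hand calculation above can be checked with NumPy, whose np.cov uses the n − 1 (sample) divisor by default:

    import numpy as np

    # Marks of the three students: column 0 is Physics (X), column 1 is Biology (Y).
    marks = np.array([[ 92, 80],
                      [ 60, 30],
                      [100, 70]], dtype=float)

    # rowvar=False tells np.cov that the columns are the variables.
    print(np.cov(marks, rowvar=False))
    # [[448. 520.]
    #  [520. 700.]]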

Properties of Covariance Matrix


The Properties of Covariance Matrix are mentioned below:
 A covariance matrix is always square, implying that the number of rows in a covariance matrix is always
equal to the number of columns in it.
 A covariance matrix is always symmetric, implying that the transpose of a covariance matrix is always
equal to the original matrix.
 A covariance matrix is always positive semi-definite.
 The eigenvalues of a covariance matrix are always real and non-negative.
Eigen Values and Eigen Vector

Eigenvalues Definition
Eigenvalues are the scalar values associated with the eigenvectors in a linear transformation. The word ‘eigen’ is of German origin and means ‘characteristic’.

Eigenvalues are scalar values associated with a square matrix that measure how the matrix transforms a vector. If a matrix A multiplies a vector v, and the result is a scalar multiple of v, then that scalar is the eigenvalue corresponding to the eigenvector v. Eigenvalues are widely used in fields like physics, engineering, and data science.
Hence, these characteristic values indicate the factor by which eigenvectors are stretched in their direction. The direction of the vector does not change, except when the eigenvalue is negative, in which case the direction is simply reversed.
The equation for eigenvalue is given by
Av = λv
Where,
 A is the matrix,
 v is the associated eigenvector, and
 λ is the scalar eigenvalue.

What are Eigenvectors?


Eigenvectors of a square matrix are defined as the non-zero vectors which, when multiplied by the square matrix, give a scalar multiple of the vector, i.e. we define an eigenvector of matrix A to be “v” if it satisfies the condition Av = λv.
The scalar multiple λ in the above case is called the eigenvalue of the square matrix. We always have to find the eigenvalues of the square matrix first, before finding the eigenvectors of the matrix.
For any square matrix A of order n × n, the eigenvector is a column matrix of order n × 1. If we find the eigenvector of the matrix A from Av = λv, then “v” is called the right eigenvector of the matrix A and is always multiplied on the right-hand side, as matrix multiplication is not commutative in nature. In general, when we find an eigenvector it is the right eigenvector.
We can also find the left eigenvector of the square matrix A by using the relation vA = λv. Here, v is the left eigenvector and is always multiplied on the left-hand side. If matrix A is of order n × n, then v is a row matrix of order 1 × n.
Eigenvector Equation
The Eigenvector equation is the equation that is used to find the eigenvector of any square
matrix. The eigenvector equation is,
Av = λv
Where,
 A is the given square matrix,
 v is the eigenvector of matrix A, and
 λ is the scalar eigenvalue.
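As a small numerical illustration of this equation (the matrix A below is an arbitrary example, not one taken from the text):

    import numpy as np

    # Arbitrary 2 x 2 symmetric matrix used purely as an illustration.
    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    # np.linalg.eig returns the eigenvalues and the (right) eigenvectors;
    # column i of `vectors` is the eigenvector for values[i].
    values, vectors = np.linalg.eig(A)
    print(values)   # eigenvalues 3 and 1 (order may vary)

    # Verify Av = λv for each eigenpair.
    for lam, v in zip(values, vectors.T):
        print(np.allclose(A @ v, lam * v))   # True for every pair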
