
HOW DO YOU DO A PRINCIPAL COMPONENT ANALYSIS?

1. Standardize the range of the continuous initial variables (a short sketch follows this list)
2. Compute the covariance matrix to identify correlations
3. Compute the eigenvectors and eigenvalues of the covariance matrix to identify the principal components
4. Create a feature vector to decide which principal components to keep
5. Recast the data along the principal component axes
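Step 1 amounts to a z-score per variable. Here is a minimal sketch, assuming NumPy and using a small made-up table with one row per observation and one column per variable:

```python
import numpy as np

# Hypothetical raw data: 4 observations of 2 variables (values invented for illustration)
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2]])

# Standardize each variable to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
```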
 PCA condenses the information in a large set of variables into a smaller set of new variables by applying a linear transformation to them. The transformation is chosen so that linearly correlated variables are turned into uncorrelated variables.
 Correlation tells us that there is redundancy of information, and if this redundancy can be reduced, the information can be compressed. For example, if two variables in the set are highly correlated, we gain little extra information by retaining both, because one can be nearly expressed as a linear function of the other.
 In such cases, PCA transfers the variance of the second variable onto the first by translating and rotating the original axes and projecting the data onto the new axes. The directions of projection are determined using the eigenvalues and eigenvectors of the covariance matrix. As a result, the first few transformed features (termed principal components) are rich in information, whereas the last features contain mostly noise and negligible information.
 This transfer of variance allows us to retain only the first few principal components, reducing the number of variables significantly with minimal loss of information.
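To illustrate this concentration of variance, here is a minimal, self-contained sketch (assuming NumPy and scikit-learn are available; the correlated data is invented for the example):

```python
import numpy as np
from sklearn.decomposition import PCA

# Two strongly correlated variables: the second is ~2x the first plus a little noise
rng = np.random.default_rng(0)
x = rng.normal(size=500)
data = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=500)])

pca = PCA(n_components=2).fit(data)

# Nearly all of the variance is captured by the first principal component
print(pca.explained_variance_ratio_)   # roughly [0.999, 0.001]
```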
1. Assemble a data matrix: The first step is to assemble all the data points into a matrix in which each column is one data point. A data matrix D of n 3D points is therefore a 3 x n matrix, with one column per point and its x, y and z coordinates in the rows.
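A minimal sketch of such a matrix in NumPy (the four points are made up for illustration):

```python
import numpy as np

# Hypothetical data matrix D: each column is one 3D data point, so D is 3 x n
D = np.array([[2.5, 0.5, 2.2, 1.9],   # x coordinates
              [2.4, 0.7, 2.9, 2.2],   # y coordinates
              [3.1, 1.1, 2.8, 2.6]])  # z coordinates
```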

2. Calculate the mean: The next step is to calculate the mean (average) of all data points. Note that if the data is 3D, the mean is also a 3D point with x, y and z coordinates; similarly, if the data is m-dimensional, the mean is also m-dimensional. The mean is calculated as the average of the columns of D, i.e. mean = (1/n) * (sum of all data points).

3. Subtract the mean from the data matrix: We next create another matrix M by subtracting the mean from every data point (every column) of D.
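Continuing the sketch above (reusing the hypothetical matrix D), steps 2 and 3 take only a couple of lines:

```python
# Step 2: the mean is one 3D point (the average of the columns of D)
mean = D.mean(axis=1, keepdims=True)   # shape (3, 1)

# Step 3: subtract the mean from every data point to get the centred matrix M
M = D - mean                           # same shape as D
```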

4. Calculate the covariance matrix: Remember that we want to find the direction of maximum variance. The covariance matrix captures the information about the spread of the data. The diagonal elements of a covariance matrix are the variances along the X, Y and Z axes; the off-diagonal elements are the covariances between pairs of dimensions (X and Y, Y and Z, Z and X). The covariance matrix C is calculated using the following product:

C = (1 / (n - 1)) * M * M^T

where T denotes the transpose operation. The matrix C is of size m x m, where m is the number of dimensions (3 in our example). The figure below shows how the covariance matrix changes depending on the spread of the data in different directions.
Figure: Left: When the data is evenly spread in all directions, the covariance matrix has equal diagonal elements and zero off-diagonal elements. Center: When the data spread is elongated along one of the axes, the diagonal elements are unequal, but the off-diagonal elements are zero. Right: In general, the covariance matrix has both diagonal and off-diagonal elements.
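Continuing the running sketch, the covariance matrix of the centred data can be computed as follows (the 1/(n - 1) normalization matches NumPy's np.cov default):

```python
# Step 4: covariance matrix of the centred data (columns of M are data points)
n = M.shape[1]
C = (M @ M.T) / (n - 1)        # shape (3, 3)

# Sanity check against NumPy's estimator (np.cov treats rows as variables)
assert np.allclose(C, np.cov(M))
```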
Variance can only be used to explain the spread of the data in the directions parallel to the axes of the feature space. For a variable x with mean x̄, the variance is

σ(x, x) = (1 / (n − 1)) Σᵢ (xᵢ − x̄)²
For this data, we could calculate the variance in the x-direction and the variance in the y-direction. However, the horizontal spread and the vertical spread of the data do not explain the clear diagonal correlation. The figure clearly shows that, on average, if the x-value of a data point increases, the y-value increases as well, resulting in a positive correlation. This correlation can be captured by extending the notion of variance to what is called the 'covariance' of the data:
σ(x, y) = (1 / (n − 1)) Σᵢ (xᵢ − x̄)(yᵢ − ȳ)

For 2D data, we thus obtain σ(x, x), σ(y, y), σ(x, y) and σ(y, x). These four values can be summarized in a matrix, called the covariance matrix:

[ σ(x, x)  σ(x, y) ]
[ σ(y, x)  σ(y, y) ]

The covariance matrix is always a symmetric matrix with the variances on its diagonal and the covariances off-diagonal.
So the covariance matrix defines both the spread (variance) and the orientation (covariance) of our data. If we want to represent the covariance matrix by a single vector and its magnitude, we should simply find the vector that points in the direction of the largest spread of the data and whose magnitude equals the spread (variance) in that direction.
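That vector turns out to be the eigenvector of the covariance matrix with the largest eigenvalue (as described in step 5 below). A small illustration with an invented 2 x 2 covariance matrix of positively correlated data:

```python
import numpy as np

# Hypothetical covariance matrix of positively correlated 2D data
Sigma = np.array([[2.0, 1.5],
                  [1.5, 2.0]])

# eigh is meant for symmetric matrices; eigenvalues are returned in ascending order
eigvals, eigvecs = np.linalg.eigh(Sigma)

direction = eigvecs[:, -1]   # direction of largest spread: ~[0.707, 0.707]
spread = eigvals[-1]         # variance along that direction: 3.5
print(direction, spread)
```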
5. Calculate the eigenvectors and eigenvalues of the covariance matrix: The principal components are the eigenvectors of the covariance matrix. The first principal component is the eigenvector corresponding to the largest eigenvalue, the second principal component is the eigenvector corresponding to the second-largest eigenvalue, and so on.
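Continuing the running sketch (using the covariance matrix C from step 4), a sorted eigen-decomposition and a feature vector that keeps the first two components might look like this:

```python
# Step 5: eigenvectors and eigenvalues of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)

# Sort both in descending order of eigenvalue
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]          # column i is the i-th principal component

# Feature vector: keep the first two principal components (eig1, eig2)
feature_vector = eigvecs[:, :2]      # shape (3, 2)
```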

Feature Vector = (eig1, eig2), i.e. a matrix whose columns are the eigenvectors we decide to keep.


 Forming Principal Components:

This is the final step, where we actually form the principal components using all of the math we did so far. To do so, we take the transpose of the feature vector and left-multiply it with the transpose of the scaled version of the original dataset.

NewData = FeatureVector^T x ScaledData^T

where:
NewData is the matrix consisting of the principal components,
FeatureVector is the matrix formed from the eigenvectors we chose to keep, and
ScaledData is the scaled version of the original dataset.
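In the running sketch, M already stores each data point as a column, so it plays the role of ScaledData^T (here the data was only mean-centred rather than fully standardized):

```python
# Final step: project the centred data onto the chosen principal components
new_data = feature_vector.T @ M   # shape (2, n): each column is a point in PC space
```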
