Dimensionality Reduction: PCA
Principal components are new variables that are constructed as linear combinations or
mixtures of the initial variables.
These combinations are done in such a way that the new variables (i.e., principal
components) are uncorrelated and most of the information within the initial variables is
compressed into the first components.
So, the idea is that 10-dimensional data gives you 10 principal components, but PCA tries to
put the maximum possible information in the first component, then the maximum remaining
information in the second, and so on.
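To make this concrete, here is a minimal NumPy sketch (not from the notes; the data and numbers are made up for illustration) showing that the principal components are linear combinations of the original variables, that they come out uncorrelated, and that their variance decreases from the first component onwards:

import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 observations of 3 correlated variables (illustrative only).
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.5, 0.1],
                                          [0.0, 1.0, 0.3],
                                          [0.0, 0.0, 0.2]])

Xc = X - X.mean(axis=0)                           # centre each variable
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]                 # largest variance first
eigvecs = eigvecs[:, order]

PCs = Xc @ eigvecs                                # each component is a linear combination of the variables

# Off-diagonal entries are ~0: the principal components are uncorrelated,
# and the diagonal (their variances) decreases from PC1 onwards.
print(np.round(np.cov(PCs, rowvar=False), 3))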
[Figure: scree plot of the variance (eigenvalue) associated with each principal component]
Organizing information in principal components this way allows you to reduce dimensionality
without losing much information: you discard the components with low information and treat
the remaining components as your new variables.
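The scree plot above is simply the per-component variance (eigenvalue) plotted against the component index. Below is a small sketch of how those numbers are computed and used to decide how many components to keep; the synthetic data and the 95 percent threshold are illustrative assumptions, not from the notes:

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))   # 5 correlated variables

eigvals = np.linalg.eigvalsh(np.cov(X - X.mean(axis=0), rowvar=False))[::-1]
ratio = eigvals / eigvals.sum()

# A scree plot is just these ratios against the component index;
# keep components up to the "elbow", or until a cumulative threshold is reached.
for k, (r, c) in enumerate(zip(ratio, np.cumsum(ratio)), start=1):
    print(f"PC{k}: {r:.2%} of variance, {c:.2%} cumulative")

n_keep = int(np.searchsorted(np.cumsum(ratio), 0.95) + 1)  # e.g. keep 95% of the variance
print("components to keep:", n_keep)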
Standardization first: if the initial variables have very different ranges, those with larger
ranges will dominate over those with smaller ranges. So, transforming the data to comparable
scales can prevent this problem.

Worked example (two variables, ten observations). The second variable (x2) and the
mean-centred values of both variables are:

x2:         2.4    0.7    2.9    2.2    3.0    2.7    1.6    1.1    1.6    0.9
x1 - mean:  0.69  -1.31   0.39   0.09   1.29   0.49   0.19  -0.81  -0.31  -0.71
x2 - mean:  0.49  -1.21   0.99   0.29   1.09   0.79  -0.31  -0.81  -0.31  -1.01

Step 3: Arrange Eigenvalues
Eigenvectors and eigenvalues are the linear algebra concepts that we need to
compute from the covariance matrix in order to determine the principal
components of the data.
What you first need to know about eigenvectors and eigenvalues is that they
always come in pairs, so that every eigenvector has an eigenvalue. Also, their
number is equal to the number of dimensions of the data. For example, for a
3-dimensional data set, there are 3 variables, therefore there are 3 eigenvectors
with 3 corresponding eigenvalues.
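As a concrete check, here is a short NumPy sketch (mine, not from the notes) that builds the 2x2 covariance matrix from the mean-centred rows of the table above and extracts its two eigenvalue/eigenvector pairs:

import numpy as np

# Worked example from the table above: x2 is given in original units,
# x1 only in mean-centred form.
x2 = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9])
x1c = np.array([0.69, -1.31, 0.39, 0.09, 1.29, 0.49, 0.19, -0.81, -0.31, -0.71])

x2c = x2 - x2.mean()                      # reproduces the "x2 - mean" row of the table

cov = np.cov(np.stack([x1c, x2c]))        # 2x2 covariance matrix of the centred variables
eigvals, eigvecs = np.linalg.eigh(cov)    # one eigenvalue per eigenvector, one pair per dimension

print(np.round(x2c, 2))                   # [ 0.49 -1.21  0.99  0.29  1.09  0.79 -0.31 -0.81 -0.31 -1.01]
print(np.round(eigvals, 4))               # two eigenvalues for 2-dimensional data
print(np.round(eigvecs, 4))               # the corresponding eigenvectors are the columns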
Let’s suppose that our data set is 2-dimensional, with two variables x and y, and that the
covariance matrix has eigenvectors v1 and v2 with corresponding eigenvalues λ1 and λ2.
If we rank the eigenvalues in descending order, we get λ1 > λ2, which means that the
eigenvector corresponding to the first principal component (PC1) is v1, and the one
corresponding to the second principal component (PC2) is v2.
To get the percentage of variance (information) carried by each component, we divide its
eigenvalue by the sum of all the eigenvalues. If we apply this to the example above, we find
that PC1 and PC2 carry 96 percent and 4 percent of the variance, respectively.
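A quick numerical check of that split, repeating the eigen-decomposition of the example's covariance matrix so the snippet stays self-contained (a sketch, not part of the original notes):

import numpy as np

# Mean-centred example data from the earlier table.
x1c = np.array([0.69, -1.31, 0.39, 0.09, 1.29, 0.49, 0.19, -0.81, -0.31, -0.71])
x2c = np.array([0.49, -1.21, 0.99, 0.29, 1.09, 0.79, -0.31, -0.81, -0.31, -1.01])
eigvals, eigvecs = np.linalg.eigh(np.cov(np.stack([x1c, x2c])))

order = np.argsort(eigvals)[::-1]         # rank the eigenvalues in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

share = eigvals / eigvals.sum()           # fraction of the total variance per component
print(np.round(share * 100, 1))           # roughly [96.3  3.7]: PC1 carries ~96 percent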
As we saw in the previous step, computing the eigenvectors and ordering them by their
eigenvalues in descending order allows us to find the principal components in order of
significance.
In this step, what we do is choose whether to keep all these components or discard those of
lesser significance (those with low eigenvalues), and form with the remaining ones a matrix
of vectors that we call the feature vector.
So, the feature vector is simply a matrix that has as columns the eigenvectors of the
components that we decide to keep.
This makes it the first step towards dimensionality reduction, because if we choose to
keep only p eigenvectors (components) out of n, the final data set will have only p
dimensions.
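A minimal sketch of forming the feature vector, with made-up 4-dimensional data; the variable names and the choice p = 2 are illustrative only:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))    # 4-dimensional toy data
Xc = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]                         # most significant components first
eigvecs = eigvecs[:, order]

p = 2                                                     # keep p of the n = 4 components
feature_vector = eigvecs[:, :p]                           # columns = the eigenvectors we keep
print(feature_vector.shape)                               # (4, 2): the final data set will have p = 2 dimensions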
Principal Component Analysis Example:
Continuing with the example from the previous step, we can either form a feature vector with
both of the eigenvectors v1 and v2, or discard the eigenvector v2, which is the one of lesser
significance, and form a feature vector with v1 only.
Discarding the eigenvector v2 will reduce dimensionality by 1 and will consequently cause a
loss of information in the final data set.
But given that v2 was carrying only 4 percent of the information, the loss is not important:
we still keep the 96 percent of the information that is carried by v1.
So, as we saw in the example, it’s up to you to choose whether to keep all
the components or discard the ones of lesser significance, depending on
what you are looking for.
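For the two-variable example, the two possible feature vectors can be formed as follows (a sketch reusing the mean-centred rows from the earlier table):

import numpy as np

x1c = np.array([0.69, -1.31, 0.39, 0.09, 1.29, 0.49, 0.19, -0.81, -0.31, -0.71])
x2c = np.array([0.49, -1.21, 0.99, 0.29, 1.09, 0.79, -0.31, -0.81, -0.31, -1.01])

eigvals, eigvecs = np.linalg.eigh(np.cov(np.stack([x1c, x2c])))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

feature_both = eigvecs            # keep v1 and v2: no reduction, all information kept
feature_v1 = eigvecs[:, :1]       # discard v2: lose ~4 percent of the information

print(np.round(eigvals / eigvals.sum(), 3))   # approximately [0.963 0.037]
print(feature_both.shape, feature_v1.shape)   # (2, 2) versus (2, 1)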
STEP 5: RECAST THE DATA ALONG THE PRINCIPAL COMPONENTS AXES
In the previous steps, apart from standardization, you do not make any changes to the data;
you just select the principal components and form the feature vector, but the input data set
always remains in terms of the original axes (i.e., in terms of the initial variables).
In this step, the aim is to use the feature vector formed using the eigenvectors of
the covariance matrix, to reorient the data from the original axes to the ones
represented by the principal components (hence the name Principal Components
Analysis).
This can be done by multiplying the transpose of the feature vector by the transpose of the
standardized original data set, or equivalently by multiplying the data matrix (observations
as rows) by the feature vector.
Step 5 Example: Transform the Original Dataset
Use the equation Z = X V, where X is the mean-centred data matrix with observations as rows
and V is the feature vector whose columns are the kept eigenvectors.
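A sketch of this transformation on the running example, keeping only PC1 (variable names are mine):

import numpy as np

# Mean-centred example data: rows are the 10 observations, columns the 2 variables.
x1c = np.array([0.69, -1.31, 0.39, 0.09, 1.29, 0.49, 0.19, -0.81, -0.31, -0.71])
x2c = np.array([0.49, -1.21, 0.99, 0.29, 1.09, 0.79, -0.31, -0.81, -0.31, -1.01])
X = np.column_stack([x1c, x2c])

eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
V = eigvecs[:, np.argsort(eigvals)[::-1]][:, :1]   # feature vector: keep PC1 only

Z = X @ V                                          # recast the data along the principal axis
print(Z.shape)                                     # (10, 1): one new variable per observation
print(np.round(Z.ravel(), 3))                      # sign depends on the eigenvector's orientation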
Step 6: Reconstructing Data
So, in order to reconstruct the original data, we follow:
Row Original DataSet = Row Zero Mean Data + Original Mean
where the row zero-mean data is obtained by projecting the reduced data back onto the
original axes (Z V^T); if any components were discarded, the reconstruction is only an
approximation of the original data.
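A sketch of the reconstruction on the same example. Note that x1's original mean is not given in these notes, so the value 1.81 below is only an assumed figure for illustration; 1.91 is the mean of the x2 row shown earlier:

import numpy as np

x1c = np.array([0.69, -1.31, 0.39, 0.09, 1.29, 0.49, 0.19, -0.81, -0.31, -0.71])
x2c = np.array([0.49, -1.21, 0.99, 0.29, 1.09, 0.79, -0.31, -0.81, -0.31, -1.01])
X = np.column_stack([x1c, x2c])
original_mean = np.array([1.81, 1.91])    # 1.81 is assumed for illustration; 1.91 is x2's mean

eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
V = eigvecs[:, np.argsort(eigvals)[::-1]][:, :1]   # PC1 only

Z = X @ V                                 # step 5: reduced data
zero_mean_back = Z @ V.T                  # project back onto the original axes
reconstructed = zero_mean_back + original_mean     # row zero-mean data + original mean

print(np.round(reconstructed, 2))         # approximate original data (the ~4 percent in v2 is lost)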
Page 302, ML in Action by Peter Harrington
Dimensionality reduction techniques allow us to make data easier to use and often remove noise
to make other machine learning tasks more accurate. It’s often a preprocessing step that can be
done to clean up data before applying it to some other algorithm.
A number of techniques can be used to reduce the dimensionality of our data. Among these,
independent component analysis, factor analysis, and principal component analysis are popular
methods.
The most widely used method is principal component analysis.
Principal component analysis identifies the most important features by finding the directions
in which the data varies the most.
It does this by rotating the axes so that the first axis aligns with the direction of largest
variance in the data. Each subsequent axis is chosen orthogonal to the previous ones, again in
the direction of the largest remaining variance. Eigenvalue analysis of the covariance matrix
gives us this set of orthogonal axes.
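Pulling the steps together, here is a compact sketch in the spirit of the pca() routine discussed in Machine Learning in Action; the argument names and the test data are mine, not the book's code:

import numpy as np

def pca(data, top_n_feat=2):
    # Reduce `data` (rows = observations) to `top_n_feat` dimensions and
    # also return a reconstruction in the original coordinate system.
    mean_vals = data.mean(axis=0)
    mean_removed = data - mean_vals                      # centre the data
    cov_mat = np.cov(mean_removed, rowvar=False)         # covariance of the variables
    eig_vals, eig_vecs = np.linalg.eigh(cov_mat)         # orthogonal axes from eigenvalue analysis
    order = np.argsort(eig_vals)[::-1][:top_n_feat]      # directions of largest variance first
    red_eig_vecs = eig_vecs[:, order]
    low_d_data = mean_removed @ red_eig_vecs             # recast the data along the principal axes
    recon = low_d_data @ red_eig_vecs.T + mean_vals      # rotate back and re-add the mean
    return low_d_data, recon

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 3))  # toy 3-dimensional data
low, recon = pca(X, top_n_feat=2)
print(low.shape, np.abs(X - recon).max())                # (100, 2) and the error from dropping one axis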