
PRINCIPAL COMPONENT ANALYSIS
BY EESHA TUR RAZIA BABAR
2 PCA

• Dimensionality reduction technique to help represent data in lower dimensions


• Helpful in visualizations and modeling
• This naturally comes at some cost in accuracy, since information is discarded

• The goal is to transform a high-dimensional dataset (one with a large number of features) into a low-dimensional one (with a smaller number of features) without losing too much information.
• Datasets can include images or simple structured datasets.

• Helps deal with the curse of dimensionality, which leads to complex models and difficulty in visualization.
• Helps us remove multicollinearity, a situation in which some input features are correlated with each other and provide redundant information.
• PCA reduces the number of features/variables of a data set while preserving as much information as possible.
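A minimal sketch of this idea in Python, using scikit-learn (assuming it is available; the data is random toy data and the variable names are purely illustrative): standardize a 10-feature dataset, keep two principal components, and check how much variance they retain.

```python
# Toy example: reduce 10 features to 2 principal components.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))               # 100 samples, 10 features (toy data)

X_std = StandardScaler().fit_transform(X)    # step 1: standardize
pca = PCA(n_components=2)                    # keep the 2 strongest components
X_2d = pca.fit_transform(X_std)              # project onto the principal axes

print(X_2d.shape)                            # (100, 2)
print(pca.explained_variance_ratio_)         # share of variance each PC retains
```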
3 HOW PCA WORKS
4 PCA STEPS

1. Standardize the range of continuous initial variables


2. Compute the covariance matrix to identify correlations
3. Compute the eigenvectors and eigenvalues of the covariance matrix to identify the
principal components
4. Create a feature vector to decide which principal components to keep
5. Recast the data along the principal component axes
5 STEP 1: STANDARDIZATION

• Standardize the range of the continuous initial variables


• Variables with larger ranges will dominate over those with smaller ranges
• This could lead to biased results

• After standardization, all the variables are transformed to the same scale.
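A small illustrative sketch of z-score standardization in NumPy (the height/weight numbers are made up for illustration):

```python
# Z-score standardization: each variable ends up with mean 0 and standard deviation 1.
import numpy as np

X = np.array([[170.0, 65.0],
              [160.0, 58.0],
              [180.0, 90.0]])          # toy data: height (cm), weight (kg)

X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_std.mean(axis=0))              # ~[0, 0]
print(X_std.std(axis=0))               # [1, 1]
```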
6 STEP 2: COVARIANCE MATRIX COMPUTATION

• The goal of this step is to understand whether there is any relationship between the input variables
• Correlated variables = redundant information

• We’ll use the covariance matrix to identify these correlations


• It is a p × p symmetric matrix (where p is the number of dimensions)
• For example, for a 3-dimensional dataset, the covariance matrix is a 3×3 matrix of this form:
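The matrix shown on the slide did not survive extraction; for three variables $x$, $y$ and $z$, the standard form is:

\[
\Sigma =
\begin{bmatrix}
\mathrm{Cov}(x,x) & \mathrm{Cov}(x,y) & \mathrm{Cov}(x,z) \\
\mathrm{Cov}(y,x) & \mathrm{Cov}(y,y) & \mathrm{Cov}(y,z) \\
\mathrm{Cov}(z,x) & \mathrm{Cov}(z,y) & \mathrm{Cov}(z,z)
\end{bmatrix}
\]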
7 CONT.

• Covariance of a variable with itself is its variance (Cov(a,a)=Var(a))


• Thus, along the main diagonal (top left to bottom right) we actually have the variances of each initial variable

• Covariance is commutative (Cov(a,b)=Cov(b,a)),


• This means that the upper and the lower triangular portions are equal.
8 CONT.

• Sign of the covariance matters:


• If positive: the two variables increase or decrease together (positively correlated)
• If negative: one increases when the other decreases (inversely correlated)
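A quick illustration of the sign of the covariance using NumPy's np.cov on made-up data:

```python
# Covariance signs on toy data: y1 moves with x (positive), y2 moves against x (negative).
import numpy as np

x  = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y1 =  2 * x + np.array([0.1, -0.2, 0.0, 0.2, -0.1])   # increases with x
y2 = -2 * x + np.array([0.1, -0.2, 0.0, 0.2, -0.1])   # decreases as x increases

print(np.cov(x, y1)[0, 1])   # positive covariance
print(np.cov(x, y2)[0, 1])   # negative covariance
```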
9 STEP 3: COMPUTE THE EIGENVECTORS AND EIGENVALUES OF THE
COVARIANCE MATRIX TO IDENTIFY THE PRINCIPAL COMPONENTS

• Eigenvectors and eigenvalues are the linear algebra concepts that we need to compute from the
covariance matrix
• In this context, the eigenvectors are also called Principal Components.
• Principal components are new variables that are constructed as linear combinations or
mixtures of the initial variables.
• Combinations are done in such a way that the new variables (i.e., principal components) are
uncorrelated
• Most of the information within the initial variables is squeezed or compressed into the first
components.
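A minimal sketch of this computation in NumPy on toy data (np.linalg.eigh is used because the covariance matrix is symmetric):

```python
# Eigenvectors/eigenvalues of the covariance matrix: the directions and strengths of the PCs.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

cov = np.cov(X_std, rowvar=False)            # 3 x 3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh: for symmetric matrices; eigenvalues in ascending order
print(eigvals)                               # variance carried along each direction
print(eigvecs)                               # columns are the (unsorted) principal directions
```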
10 SCREE PLOT
11 NOTE THAT

• Principal components are less interpretable and don’t have any real meaning since they
are constructed as linear combinations of the initial variables.
• Geometrically, principal components represent the directions of the data that explain
a maximal amount of variance.
• In simple terms, lines that capture most information of the data
12 HOW PCA CONSTRUCTS THE PRINCIPAL
COMPONENTS
• There are as many principal components as there are variables in the data
• The first principal component accounts for the largest possible variance; each subsequent component accounts for the largest remaining variance while being uncorrelated with the previous ones.
13 EIGENVECTORS AND EIGENVALUES

• They come in pairs: every eigenvector has a corresponding eigenvalue.


• Their number is equal to the number of dimensions of the data
• For example, for a 3-dimensional data set, there are 3 variables, therefore there are 3
eigenvectors with 3 corresponding eigenvalues.
14 CONT.

• Eigenvectors of the covariance matrix are the directions of the axes along which there is the most variance (most information); these directions are what we call Principal Components.

• Eigenvalues are simply the coefficients attached to the eigenvectors, giving the amount of variance carried by each Principal Component.

• We construct a total of n principal components, where n is the number of dimensions of the dataset.

• By sorting the Principal Components by their eigenvalues, highest to lowest, you get the principal components in order of significance.

• In the last step, we decide how many Principal Components to keep, choose the most significant ones, and assemble them into a matrix of vectors called the Feature Vector. These are our features in the new, lower-dimensional space.
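A compact sketch of the sorting and projection steps on toy data (the function and variable names are illustrative, not from the original slides):

```python
# Sort components by eigenvalue, keep the top k, and project the data (the "feature vector" step).
import numpy as np

def pca_project(X_std, k):
    cov = np.cov(X_std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # highest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :k]                           # feature vector: top-k eigenvectors as columns
    return X_std @ W, eigvals                    # data recast onto the principal axes

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
scores, eigvals = pca_project(X_std, k=2)
print(scores.shape)                              # (200, 2)
```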
15 EIGEN VALUES AND EIGEN VECTOR EXAMPLE

16 EXAMPLE 1: STEP 1: FIND MEAN
17 EXAMPLE 1: STEP 2: FIND COVARIANCE
18 EXAMPLE 1: STEP 3: FIND EIGEN VALUE
19 EXAMPLE 1: STEP 4: FIND EIGEN VALUE
20 EXAMPLE 1: STEP 4: FIND EIGEN VALUE
21 EXAMPLE 1: STEP 4: FIND EIGEN VECTOR
22 EXAMPLE 1: STEP 4: FIND PRINCIPAL COMPONENT
DATA AND SCATTER PLOT
SUBTRACTING MEAN
CONT.
27 COVARIANCE MATRIX
28 EIGEN VALUES AND EIGEN VECTORS


29 EIGEN VECTOR: UNIT LENGTH VECTOR
30 SORT BY EIGEN VALUES


31 PRINCIPAL COMPONENTS CALCULATION
32 CONT.
33 INTERPRET THE PRINCIPAL COMPONENTS
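The figures and tables for these worked-example slides did not survive extraction, so the following is only an illustrative sketch on made-up data: it shows how each principal component can be read as a linear combination (loadings) of the original variables.

```python
# Illustrative only (the original example data is not reproduced here):
# each principal component is a linear combination of the original variables,
# so its coefficients (loadings) tell us which variables it mainly represents.
import numpy as np

rng = np.random.default_rng(1)
height = rng.normal(170, 10, 100)
weight = 0.9 * height + rng.normal(0, 5, 100)     # strongly correlated with height
shoe   = rng.normal(40, 2, 100)                   # mostly independent of the other two

X = np.column_stack([height, weight, shoe])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(X_std, rowvar=False))
order = np.argsort(eigvals)[::-1]
for i, idx in enumerate(order, start=1):
    print(f"PC{i}: eigenvalue={eigvals[idx]:.2f}, loadings={eigvecs[:, idx].round(2)}")
# PC1 loads heavily on height and weight (the correlated pair); shoe size dominates another PC.
```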
34 HOW MANY COMPONENTS SHOULD WE
EXTRACT?
• One of the motivations for PCA was to reduce the number of features.
• The question arises, “How do we determine how many components to extract?”
• For example, should we retain only the first principal component, as it explains nearly
half the variability? Or, should we retain all eight components, as they explain 100% of
the variability?
• Retaining all eight components does not help us to reduce the number of dimensions
• The answer lies somewhere between these two extremes.
35 HOW MANY COMPONENTS SHOULD WE
EXTRACT?
• The criteria used for deciding how many components to extract are the following:
1. The Eigenvalue Criterion
2. The Proportion of Variance Explained Criterion
3. The Scree Plot Criterion
4. The Minimum Communality Criterion
36 1. THE EIGENVALUE CRITERION

• An eigenvalue of 1 would mean that the component would explain about “one variable's worth” of the
variability.
• The rationale for using the eigenvalue criterion is that each component should explain at least one
variable's worth of the variability, and therefore, the eigenvalue criterion states that only components with
eigenvalues greater than 1 should be retained.
• Note that, if there are fewer than 20 variables, the eigenvalue criterion tends to recommend extracting
too few components, while, if there are more than 50 variables, this criterion may recommend extracting
too many.
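A one-line check of this criterion on a hypothetical set of eight eigenvalues (numbers invented for illustration):

```python
# Eigenvalue criterion: keep only components whose eigenvalue exceeds 1,
# i.e. components that explain at least "one variable's worth" of variability.
import numpy as np

eigvals = np.array([3.2, 1.8, 1.1, 0.6, 0.4, 0.4, 0.3, 0.2])   # hypothetical, sorted
n_keep = int(np.sum(eigvals > 1))
print(n_keep)   # 3 components retained under this criterion
```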
37 2. THE PROPORTION OF VARIANCE EXPLAINED
CRITERION
• The analyst specifies how much of the total variability they would like the principal components to account for.
• Components are then selected one by one until the desired proportion of explained variability is attained.
• For example, suppose we would like our components to explain 85% of the variability in the variables.
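A sketch of this criterion on the same hypothetical eigenvalues, keeping components until 85% of the variance is explained:

```python
# Proportion-of-variance criterion: keep the smallest number of components whose
# cumulative explained variance reaches the analyst's target (85% here).
import numpy as np

eigvals = np.array([3.2, 1.8, 1.1, 0.6, 0.4, 0.4, 0.3, 0.2])   # hypothetical, sorted
explained = eigvals / eigvals.sum()
cumulative = np.cumsum(explained)
n_keep = int(np.searchsorted(cumulative, 0.85) + 1)
print(cumulative.round(2))   # [0.4  0.62 0.76 0.84 0.89 ...]
print(n_keep)                # 5 components needed to reach 85%
```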
38 3. THE SCREE PLOT CRITERION
• A scree plot is a graphical plot of the eigenvalues against the component
number.

• Scree plots are useful for finding an upper bound (maximum) for the number of
components that should be retained.

• Most scree plots look broadly similar in shape, starting high on the left, falling
rather quickly, and then flattening out at some point.

• This is because the first component usually explains much of the variability, the
next few components explain a moderate amount, and the latter components
only explain a small amount of the variability.

• The scree plot criterion is this: The maximum number of components that
should be extracted is just before where the plot first begins to straighten out
into a horizontal line.

• Sometimes, the curve in a scree plot is so gradual that no such elbow point is
evident; in that case, turn to the other criteria.
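A minimal scree plot sketch with matplotlib on the same hypothetical eigenvalues (assuming matplotlib is available):

```python
# Scree plot: eigenvalues against component number; look for the "elbow" where the curve flattens.
import numpy as np
import matplotlib.pyplot as plt

eigvals = np.array([3.2, 1.8, 1.1, 0.6, 0.4, 0.4, 0.3, 0.2])   # hypothetical, sorted
components = np.arange(1, len(eigvals) + 1)

plt.plot(components, eigvals, "o-")
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```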
