Dimensionality Reduction: Principal Component Analysis (PCA)

PCA is a technique used to reduce dimensionality in data. It transforms the data by projecting it onto a set of orthogonal principal components or axes that account for maximum variance in the data. The first principal component accounts for as much variation in the data as possible, and each succeeding component accounts for as much remaining variation as possible. PCA is useful for reducing the size of datasets for analysis and speeding up machine learning algorithms. However, it has limitations when variables are not linearly related or there are outliers in the data.

Dimensionality Reduction
Principal Component Analysis (PCA)

Contents
• Dimensionality Reduction
• Curse of Dimensionality
• Principal Component Analysis
• Intuition behind PCA
• Uses
• Limitations
• Example

Dimensionality Reduction

• 1-d, 5 data points
• 2-d, 25 data points
• 3-d, 25 data points


Dimensionality Reduction
Hughes Phenomenon
• As the number of features increases, the classifier's performance improves until we reach the optimal number of features.
• Adding more features beyond that point, while keeping the training set the same size, degrades the classifier's performance.

Dimensionality Reduction
• Dimensionality, in statistics, refers to how many attributes a dataset has.
• Need for reduction → the 'curse of dimensionality'.
• The curse of dimensionality refers to an exponential increase in the size of data caused by a large number of dimensions.
• As the number of dimensions of a dataset increases, it becomes more and more difficult to process it.
• Dimensionality reduction is a solution: reduce the size of the data by extracting the relevant information and discarding the rest as noise (see the sketch after this list).
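As a rough, hypothetical illustration of this exponential blow-up (not from the slides; the resolution of 10 bins per axis is an arbitrary choice), the short Python sketch below counts how many cells are needed to cover a space at a fixed resolution as the number of dimensions grows:

# Illustrative sketch: the number of cells needed to cover a space at a
# fixed resolution grows exponentially with the number of dimensions.
bins_per_axis = 10  # assumed resolution, chosen only for illustration

for d in (1, 2, 3, 5, 10):
    cells = bins_per_axis ** d
    print(f"{d:>2} dimensions -> {cells:,} cells at {bins_per_axis} bins per axis")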

Principal Component Analysis
• PCA is one of the most popular linear dimensionality reduction techniques.
• It is a projection-based method.
• It transforms the data by projecting it onto a set of orthogonal (at right angles, i.e. perpendicular) axes.
• PCA creates new variables from the old ones (see the sketch below).
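A minimal sketch of this idea in Python, assuming scikit-learn and NumPy are available; the synthetic data and variable names are purely illustrative. PCA fits a set of orthogonal axes and returns the projections onto them as new variables.

# Minimal PCA sketch with scikit-learn; the data here is synthetic.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x = rng.normal(size=200)
# Two correlated "old" variables.
data = np.column_stack([x, 0.5 * x + rng.normal(scale=0.3, size=200)])

pca = PCA(n_components=2)
scores = pca.fit_transform(data)       # the new variables (principal components)

print(pca.components_)                 # rows are the orthogonal axes (unit vectors)
print(pca.explained_variance_ratio_)   # share of variance captured by each axis

The first row of components_ is the direction of maximum variance; each later row is orthogonal to the ones before it.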

Principal Component Analysis
• Understanding PCA through animation.

• Each blue dot on the plot represents a point from the data, given by its x & y coordinates.
• A line p (red line) is drawn through the center of the dataset, i.e. through the mean of x & y.
• Every point on the graph is projected onto this line, shown by two sets of points, red & green.
• The spread or variance of the data along line p is given by the distance between the two big red points.
• As line p rotates, the distance between the two red points changes according to the angle line p makes with the x-axis.
• The purple lines joining each point to its projection represent the error that arises when we approximate a point by its projection.

Principal Component Analysis
• The approximation error is small when the new variables closely approximate the old variables.
• The squared sum of the lengths of all the purple lines gives the total approximation error.
• The angle that minimizes the squared sum of errors also maximizes the distance between the red points.
• The direction of maximum spread is called the principal axis.
• We apply the same procedure to find the next principal axis, which must be orthogonal to the other principal axes.
• Once we have all the principal axes, the dataset is projected onto them. The columns of the projected (transformed) dataset are called principal components (a small numerical sketch follows this list).
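The slides describe the principal axes geometrically, as directions of maximum spread. A common, equivalent way to compute them, used here as an assumption rather than something stated in the slides, is as eigenvectors of the covariance matrix of the mean-centered data. A small NumPy sketch on synthetic 2-d data:

# Sketch: principal axes as eigenvectors of the covariance matrix (NumPy).
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=300)
data = np.column_stack([x, 0.8 * x + rng.normal(scale=0.4, size=300)])

centered = data - data.mean(axis=0)      # move the origin to the mean of x & y
cov = np.cov(centered, rowvar=False)     # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order

order = np.argsort(eigvals)[::-1]        # largest spread first
axes = eigvecs[:, order]                 # columns are the principal axes
components = centered @ axes             # projected dataset (principal components)

print("variance along each principal axis:", eigvals[order])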

When should you use PCA?
• Reducing the dimensionality of the dataset reduces its size.
• If your learning algorithm is too slow because the input dimension is too high, using PCA to speed it up is a reasonable option (see the example after this list).
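An illustrative example, assuming scikit-learn is available; the digits dataset and the choice of 20 components are arbitrary. PCA is placed in front of a classifier to cut the input dimension before training, which typically shortens training time:

# Illustrative sketch: PCA in front of a classifier to reduce input dimension.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)        # 64 input features per sample

pipeline = make_pipeline(
    PCA(n_components=20),                  # keep 20 components instead of 64
    LogisticRegression(max_iter=1000),
)
print(cross_val_score(pipeline, X, y, cv=5).mean())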

Limitations of PCA
• If the number of variables is large, it becomes hard to interpret the principal components.
• PCA is most suitable when the variables have a linear relationship among them.
• PCA is sensitive to large outliers (a small demonstration follows this list).
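A small, hypothetical demonstration of the outlier issue (synthetic data, assuming scikit-learn): adding a single extreme point can noticeably rotate the first principal axis away from the bulk of the data.

# Sketch: a single extreme point can rotate the first principal axis.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
x = rng.normal(size=100)
data = np.column_stack([x, x + rng.normal(scale=0.2, size=100)])

axis_clean = PCA(n_components=1).fit(data).components_[0]
with_outlier = np.vstack([data, [[15.0, -15.0]]])   # one extreme point
axis_outlier = PCA(n_components=1).fit(with_outlier).components_[0]

print("first axis without the outlier:", axis_clean)
print("first axis with the outlier:  ", axis_outlier)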
