
Principal Component Analysis

PCA is a technique used to reduce the dimensionality of data while retaining important patterns and information. It works by transforming the data into a new set of variables (principal components) that maximize the variance in the data. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. PCA is commonly used for data compression and to identify patterns in high-dimensional data.

Uploaded by saira tahir
Copyright © All Rights Reserved

Principal Component Analysis
Introduction:
PCA is a way of identifying patterns in data and expressing the data so as to highlight its similarities and differences.
Its goal is to reduce the dimensionality of the data, such as an image represented as a vector, while retaining as much information as possible.
Once the patterns are found, the data can be compressed, i.e. reduced in number of dimensions, without losing much information.
Only the most dominant factors are kept.
Varieties of Samples:
There are three cases: single sample, independent samples, and dependent samples.
Single-sample t: only one group; the sample mean is tested against a hypothetical mean.
Independent-samples t: two means from two groups with no relation between the groups, e.g., people randomly assigned to one group or the other.
Dependent-samples t: two means, where either the same people are in both groups or the people are related, e.g., husband-wife, left hand-right hand, hospital patient and visitor.
The t Distribution
We use t when the population variance is unknown (the usual case) and the sample size is small (N < 100, the usual case). If you use a statistics package to test hypotheses about means, you will use t.
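As a rough illustration of the single-sample case, the t statistic can be sketched as the difference between the sample mean and the hypothetical mean, divided by the estimated standard error. The function name `one_sample_t` and the data values below are hypothetical and chosen only for this example; only the Python standard library is used.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """Return (t, degrees_of_freedom) for a single-sample t test."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    # statistics.stdev uses the n - 1 denominator, i.e. it estimates the
    # standard deviation from the sample because the population variance
    # is unknown.
    sd = statistics.stdev(sample)
    standard_error = sd / math.sqrt(n)
    t = (mean - hypothesized_mean) / standard_error
    return t, n - 1


# Hypothetical data: five measurements tested against a hypothetical mean of 5.0.
scores = [5.1, 4.9, 5.6, 5.8, 6.0]
t_value, df = one_sample_t(scores, 5.0)
print(f"t = {t_value:.3f}, df = {df}")
```

The same function can serve the dependent-samples case: pass the list of pairwise differences (e.g., husband minus wife) and a hypothetical mean of 0. The independent-samples case needs a different standard error and is not covered by this sketch.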
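PCA steps: transform an $N \times d$ data matrix $X$ into an $N \times m$ matrix $Y$:
Center the data (subtract the mean of each variable).
Calculate the $d \times d$ covariance matrix of the centered data: $C = \frac{1}{N-1} X^{T} X$, i.e. $C_{i,j} = \frac{1}{N-1} \sum_{q=1}^{N} X_{q,i} X_{q,j}$.
$C_{i,i}$ (diagonal) is the variance of variable $i$.
$C_{i,j}$ (off-diagonal) is the covariance between variables $i$ and $j$.
Calculate the eigenvectors of the covariance matrix (they are orthonormal).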
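The steps above can be sketched in NumPy as follows. This is a minimal sketch, not a definitive implementation: the function name `pca`, the variable names, and the toy data are hypothetical. The last two operations (sorting the eigenvectors by eigenvalue and projecting onto the top `m` of them) are not spelled out in the steps listed here; they are the usual continuation and are included as an assumption.

```python
import numpy as np


def pca(X, m):
    """Project an (N, d) data matrix onto its top-m principal components."""
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]

    # Step 1: center the data by subtracting each variable's mean.
    X_centered = X - X.mean(axis=0)

    # Step 2: d x d covariance matrix with the 1 / (N - 1) normalization.
    C = X_centered.T @ X_centered / (n_samples - 1)

    # Step 3: eigenvectors of the covariance matrix. np.linalg.eigh is meant
    # for symmetric matrices; it returns eigenvalues in ascending order and
    # orthonormal eigenvectors as columns.
    eigenvalues, eigenvectors = np.linalg.eigh(C)

    # Assumed continuation: sort by descending eigenvalue, keep m directions.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    W = eigenvectors[:, order][:, :m]

    # Coordinates of the centered data in the new m-dimensional basis.
    Y = X_centered @ W
    return Y, W, eigenvalues


# Hypothetical toy data: 10 samples, 2 variables.
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])
Y, W, eigenvalues = pca(X, m=1)
print(Y.shape)
print(eigenvalues)
```

Each eigenvalue is the variance of the data along its eigenvector, so the sorted eigenvalues show how much variance each retained or discarded direction carries.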
Principal Components
All principal components (PCs) start at the origin of the coordinate axes (the data mean, once the data are centered).
The first PC is the direction of maximum variance from the origin.
Subsequent PCs are orthogonal to the first PC and to each other, and each describes the maximum residual variance.
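To make the geometric picture concrete, the sketch below finds the first PC of a small 2-D data set by brute force: it scans directions through the origin and keeps the one with the largest projected variance. This is an illustration only, not how PCA is normally computed; the data points, the 0.1 degree step, and the helper name `projected_variance` are all hypothetical. In two dimensions the second PC is fully determined by orthogonality, so it is simply the first direction rotated by 90 degrees.

```python
import math


def projected_variance(centered_points, angle):
    """Sample variance of centered 2-D points projected onto the unit vector at `angle`."""
    ux, uy = math.cos(angle), math.sin(angle)
    projections = [x * ux + y * uy for x, y in centered_points]
    # The points are already centered, so the mean projection is zero.
    return sum(p * p for p in projections) / (len(centered_points) - 1)


# Hypothetical data lying roughly along the line y = x.
points = [(1.0, 1.9), (2.0, 1.2), (3.0, 3.1), (4.0, 3.8), (5.0, 5.0)]

# Move the origin to the data mean so that all PCs start there.
mean_x = sum(x for x, _ in points) / len(points)
mean_y = sum(y for _, y in points) / len(points)
centered = [(x - mean_x, y - mean_y) for x, y in points]

# First PC: scan directions over [0, 180) degrees in 0.1 degree steps.
angles = [math.radians(k / 10) for k in range(1800)]
first_angle = max(angles, key=lambda a: projected_variance(centered, a))

# Second PC: orthogonal to the first; it carries the residual variance.
second_angle = first_angle + math.pi / 2

pc1 = (math.cos(first_angle), math.sin(first_angle))
pc2 = (math.cos(second_angle), math.sin(second_angle))
var1 = projected_variance(centered, first_angle)
var2 = projected_variance(centered, second_angle)

print("first PC direction:", pc1, "variance:", var1)
print("second PC direction:", pc2, "variance:", var2)
```

Because the two directions are orthogonal, the two variances add up to the total variance of the data, which is why dropping the second PC loses only the smaller share.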
