PCA Mynotes

This document discusses orthogonal transforms and dimensionality reduction techniques like PCA. It defines orthogonal and unitary matrices and explains how orthogonal transforms preserve energy through Parseval's theorem. It provides examples of the discrete cosine transform (DCT) and discrete Fourier transform (DFT), and describes how PCA finds the optimal linear transformation by using the eigenvectors of the data's covariance matrix. PCA achieves dimensionality reduction and compression by projecting the data onto its principal components.


Signal and Image Compression / Dimensionality Reduction (PCA)
• Consider a square matrix A:

$$A = \begin{bmatrix} \dfrac{1}{\sqrt{2}} & \dfrac{1}{\sqrt{2}} \\[6pt] \dfrac{1}{\sqrt{2}} & -\dfrac{1}{\sqrt{2}} \end{bmatrix}$$
Orthogonal matrix
• A square matrix with real entries whose rows and columns are orthonormal.
• Orthonormal: orthogonal + unit norm.
• Each row/column vector has unit norm (length), and the dot product between distinct vectors is zero.
• Complex version: unitary matrix.

For the 2×2 example, with entries $a_{ij}$:

$$a_{00}a_{10} + a_{01}a_{11} = 0 \quad \text{(orthogonal basis vectors)}$$
$$a_{00}^2 + a_{01}^2 = a_{10}^2 + a_{11}^2 = 1 \quad \text{(norm is unity)}$$
• Consider a square matrix of size N×N:

$$A^T A = I \ \text{ if } A \text{ is orthogonal, so } A^{-1} = A^T$$
$$A^{-1} = A^{*T} \ \text{ for a unitary matrix}$$
Orthogonal transform

• It is a transformation using an orthogonal matrix.
• Let x be the input vector and y the transformed vector; then y = T[x].
• That is, y = Ax, where A is the transformation matrix and A is orthogonal. Here T is linear.
y  Ax
 Transform domain (Analysis)

 Inverse transform (synthesis)


x  A1 y, or x  AT y

 No need to find the inverse of a matrix to


get back x!
Complex A, orthogonal matrix is called unitary matrix
1 *T
A A
Why orthogonal transforms?
• They tend to redistribute energy, concentrating most of it in a few transformed values.
• Very useful in compression.
• Orthogonality preserves energy (Parseval's theorem).
• Correlation: speech, images (x in the earlier slide), etc. have highly correlated samples or pixels, i.e., they vary gradually with occasional discontinuities.
• Decorrelation: transformation with an orthogonal matrix A yields decorrelated values y.
• Correlated data has redundancy, i.e., more samples than are required for representation.
• Decorrelation: no correlation between sample values.
Parseval's theorem
• Energy is preserved.
• Consider an orthogonal matrix A with y = Ax. Then

$$y^T y = x^T A^T A x = x^T x$$
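The same toy example verifies Parseval's theorem numerically (a sketch; the input vector is arbitrary):

```python
import numpy as np

A = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)
x = np.array([3.0, 1.0])
y = A @ x

print(x @ x)   # 10.0 -- energy of the input
print(y @ y)   # 10.0 -- energy of the transform coefficients: preserved
```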
Discrete cosine transform (DCT)
(An Orthogonal Transformation)
N 1
  (2n  1)k 
y (k )  C (k ). x(n) cos  
n 0  2 N
N 1
  (2n  1)k 
x(n)   C (k ) y (k ) cos  
k 0  2 N 
1 2
C (k )  for k  0, and for k  0
N N
Let N = 2:

$$\begin{bmatrix} y(0) \\ y(1) \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\[4pt] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x(0) \\ x(1) \end{bmatrix}, \qquad y = Ax$$

$$\begin{bmatrix} x(0) \\ x(1) \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\[4pt] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} y(0) \\ y(1) \end{bmatrix}, \qquad x = A^{-1}y = A^T y$$
DFT (Unitary Transform)

$$y(k) = \frac{1}{\sqrt{N}}\sum_{n=0}^{N-1} x(n)\, e^{-j\frac{2\pi}{N}nk}, \qquad k = 0, 1, \ldots, N-1 \qquad (1)$$
$$x(n) = \frac{1}{\sqrt{N}}\sum_{k=0}^{N-1} y(k)\, e^{\,j\frac{2\pi}{N}nk}, \qquad n = 0, 1, \ldots, N-1 \qquad (2)$$
• Consider N = 4 (4-point DFT):

$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix} \begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \end{bmatrix}, \qquad y = Bx$$

$$\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & j & -1 & -j \\ 1 & -1 & 1 & -1 \\ 1 & -j & -1 & j \end{bmatrix} \begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \end{bmatrix}, \qquad x = B^{-1}y = B^{*T}y$$
Advantages of DCT over DFT
• Basis vectors are real, i.e., A is real, so the computational complexity is lower.
• Better energy compaction.
• Meaning of energy compaction: for a given number k of retained (truncated) transform coefficients, the sum of squared errors between the source signal x[n] and the signal reconstructed from those k coefficients is small, because most of the signal energy is packed into a few coefficients.
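A small numpy experiment illustrating compaction (a sketch: a smooth ramp stands in for correlated data, and both matrices are built as on the earlier slides); reconstructing from only the k largest-magnitude coefficients typically gives a smaller error for the DCT:

```python
import numpy as np

N, k = 32, 4
x = np.linspace(0.0, 1.0, N)   # smooth, highly correlated "signal"

# DCT and unitary DFT matrices, as defined on the earlier slides.
n = np.arange(N)
A = np.array([[np.sqrt((1.0 if kk == 0 else 2.0) / N)
               * np.cos(np.pi * (2 * nn + 1) * kk / (2 * N))
               for nn in range(N)] for kk in range(N)])
B = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

def truncation_mse(T, inv, x, k):
    """MSE after keeping only the k largest-magnitude transform coefficients."""
    y = T @ x
    y_trunc = np.zeros_like(y)
    keep = np.argsort(np.abs(y))[-k:]
    y_trunc[keep] = y[keep]
    return np.mean(np.abs(x - inv(y_trunc)) ** 2)

print("DCT MSE:", truncation_mse(A, lambda y: A.T @ y, x, k))
print("DFT MSE:", truncation_mse(B, lambda y: (B.conj().T @ y).real, x, k))
```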
Applications of the DCT
• The most important application is image compression: JPEG uses the DCT.
• After quantization, the coefficients are coded.
PCA (Orthogonal Transform)
• Optimum transform (in the MSE sense), better than the DFT and DCT.
• Why optimum? Because the transformation matrix is derived from the input data (X).
• The rows of the transformation matrix are the eigenvectors (orthogonal directions) of the covariance matrix of the input data, as sketched below.
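A minimal numpy sketch of this construction on toy data (three correlated variables, not the lab's face images):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 400))   # 3 variables, 400 observations (columns)
X[1] += 0.9 * X[0]                  # introduce correlation between variables

C = np.cov(X)                          # 3x3 covariance matrix of the input data
eigvals, eigvecs = np.linalg.eigh(C)   # eigendecomposition (symmetric matrix)
order = np.argsort(eigvals)[::-1]      # sort by decreasing eigenvalue
A = eigvecs[:, order].T                # eigenvectors become the ROWS of A

Y = A @ X                              # transformed data
print(np.round(np.cov(Y), 3))          # ~diagonal: the components are decorrelated
```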
In the lab on PCA
• The input X consists of 400 face images, each of size 112×92 = 10304 pixels, so X is 10304×400.
• There are 10304 random variables, so the covariance matrix is of size 10304×10304.
• From this matrix we get 10304 eigenvectors, each of size 10304×1.
• These are used as the rows of the transformation matrix A (the first row corresponding to the largest eigenvalue, and so on).
• The first few rows are the principal components (how many depends on how many we retain while taking the inverse to get a reconstructed X with as low an MSE as possible).
• Note: the eigenvectors are orthogonal and have unit norm.
• We have Y = AX.
• Since A is 10304×10304 and X is 10304×400, Y is 10304×400 (the transformed, or output, matrix).
• If we compute X = AᵀY, we get back X with no error (the MSE between the original X and the reconstructed X is 0).
• However, suppose we retain only 100 eigenvectors, dropping the remaining rows. Then A becomes 100×10304 and Aᵀ becomes 10304×100.
• The reconstruction X̂ = AᵀY is still 10304×400; this works because Y is reduced to 100×400 by keeping only the corresponding 100 rows of Y.
• The reconstructed image has some error, but by retaining about 400 rows we saw in the lab that the error is minimal (with only 400 training images, the covariance matrix has rank at most 400, so at most 400 eigenvalues are nonzero).
• This means each image, which originally has 10304 values (pixels), can be represented in the PCA domain using only 400 values, achieving compression / dimensionality reduction. A small sketch of this truncation follows below.
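A minimal numpy sketch of the whole lab pipeline at toy scale (64-pixel "images" instead of 10304-pixel faces; mean-centering, which the notes do not spell out, is standard practice for PCA):

```python
import numpy as np

rng = np.random.default_rng(0)
D, M, k = 64, 40, 10                      # pixels per image, number of images, retained rows
X = rng.standard_normal((D, 2)) @ rng.standard_normal((2, M)) \
    + 0.05 * rng.standard_normal((D, M))  # correlated columns plus a little noise

mu = X.mean(axis=1, keepdims=True)
Xc = X - mu                               # center the data (standard PCA practice)

C = (Xc @ Xc.T) / (M - 1)                 # D x D covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)
A = eigvecs[:, np.argsort(eigvals)[::-1]].T   # rows sorted by decreasing eigenvalue

A_k = A[:k, :]                # keep only the first k principal components
Y_k = A_k @ Xc                # k x M: each image is now described by k values
X_hat = A_k.T @ Y_k + mu      # D x M reconstruction from only k values per image

print("MSE with", k, "of", D, "components:", np.mean((X - X_hat) ** 2))
```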
