
Machine Learning

COCSC403/CACSC403

Dr. Poonam Rani


Associate Professor, NSUT (CSE)
[email protected]

Lecture 13-14: Dimensionality Reduction - PCA


Contents
➢ Feature selection techniques
1. Filter methods
2. Wrapper methods
3. Embedded methods

➢ Feature extraction

➢ PCA

➢ Managing Missing features


1. Filter methods

(Figure: candidate features x1, x2, x3, ..., xn-1, xn and the target X; each feature is scored against the target independently of any learning model, and low-scoring features are filtered out.)


2. Wrapper methods

(Figure: features x1, x2, x3, ..., xn-1, xn and the target X. Candidate feature subsets are evaluated by training a model on each one, e.g. A → M1, AB → M2, ABC → M3, ABCD → M4, ..., and the subset whose model performs best is kept.)
Dimensionality reduction
❑ Dimensionality reduction can be divided into:

➢ Feature selection:
   Find a subset of the original set of variables, or features, that is small enough yet still suitable for modelling the problem.

➢ Feature extraction/scaling:
   Transform the data from a high-dimensional space to a lower-dimensional space, i.e. a space with fewer dimensions.


Feature Selection
➢ Filter Method

➢ Wrapper Method

➢ Embedded Method

➢ Hybrid Method



Parameters for Feature Selection
❑ Similarity of the information contributed by the features:

➢ Correlation

❑ Quantum (amount) of information contributed by the features:

➢ Entropy

➢ Mutual Information


Parameters for Feature Selection
❑ Correlation

❖ Pearson's correlation coefficient (ρ):

ρ(X, Y) = Cov(X, Y) / (σ(X) σ(Y))

where
Cov(X, Y) - covariance of X and Y
σ(X) - standard deviation of X
σ(Y) - standard deviation of Y
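A minimal sketch of how this coefficient can drive a filter-style selection, assuming a hypothetical NumPy feature matrix X and target y (the names and the threshold are illustrative, not from the slides):

```python
import numpy as np

# Hypothetical data: 100 samples, 4 candidate features, one target
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 2.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=100)

def pearson(a, b):
    """rho(a, b) = Cov(a, b) / (sigma(a) * sigma(b))."""
    a_c, b_c = a - a.mean(), b - b.mean()
    return (a_c @ b_c) / np.sqrt((a_c @ a_c) * (b_c @ b_c))

# Filter step: score each feature against the target, keep the strong ones
scores = np.array([abs(pearson(X[:, j], y)) for j in range(X.shape[1])])
selected = np.where(scores > 0.3)[0]   # 0.3 is an arbitrary illustrative threshold
print(scores.round(3), selected)
```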



Parameters for Feature Selection
❑ Entropy
o Entropy (H) can be formulated as:

H(X) = E[I(X)] = E[−ln P(X)]

where
X - a discrete random variable
P(X) - its probability mass function
E - the expected value operator
I(X) - the information content of X (itself a random variable)

o To assess feature fi: exclude fi and compute the entropy over the remaining features.

o If that entropy is low, the information contributed by feature fi is high.

o Entropy is mostly used for unsupervised learning; a short sketch of the basic computation follows below.
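A small sketch of the entropy computation for one discrete feature, assuming the feature is given as an array of category labels (hypothetical values):

```python
import numpy as np

def entropy(feature):
    """H(X) = E[-ln P(X)] = -sum over observed values of p(x) * ln p(x)."""
    _, counts = np.unique(feature, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

# Hypothetical categorical feature
f1 = np.array(["a", "a", "b", "b", "b", "c"])
print(entropy(f1))   # entropy (in nats) of the empirical distribution of f1
```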


Parameters for Feature Selection
❑ Mutual Information

o The reduction in uncertainty about X obtained from knowledge of Y.

o It is calculated as:

I(X; Y) = Σ_{y∈Y} Σ_{x∈X} P(x, y) log( P(x, y) / (P(x) P(y)) )

where
P(x, y) - joint probability function of X and Y
P(x) - marginal probability distribution of X
P(y) - marginal probability distribution of Y

o It is computed to measure how much information a feature shares about the class, as sketched below.
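One practical way to obtain this score is scikit-learn's mutual_info_classif; a short sketch, assuming a hypothetical feature matrix X and class vector y:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

# Hypothetical data: X has shape (n_samples, n_features), y holds class labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)

# Estimated I(feature; class) for every column of X
mi = mutual_info_classif(X, y, random_state=0)
print(mi.round(3))   # higher score = more information shared with the class
```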



Feature Scaling

➢ Feature scaling is a technique that standardizes the independent features in the data to a fixed range.

➢ It is performed during data pre-processing.

➢ Feature scaling algorithms bring features such as Age, Salary, or BHK into a fixed range, say [-1, 1] or [0, 1].

➢ After scaling, no single feature can dominate the others.

Example of two features on very different scales (see the sketch below):
X1: 500  300  200
X2:   6    8    2
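A short sketch of two common scalings applied to the X1/X2 example above (NumPy only; arranging the values into a matrix is the only assumption):

```python
import numpy as np

# Three samples, two features: column 0 is X1 (large range), column 1 is X2 (small range)
X = np.array([[500.0, 6.0],
              [300.0, 8.0],
              [200.0, 2.0]])

# Min-max scaling to [0, 1]: (x - min) / (max - min), column-wise
x_min, x_max = X.min(axis=0), X.max(axis=0)
X_minmax = (X - x_min) / (x_max - x_min)

# Standardization (z-score): (x - mean) / std, column-wise
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_minmax)
print(X_std)   # after either transform, neither feature dominates a distance measure
```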



Without Feature Scaling

(Figure: example with classes Yes / No.)

Without Feature Scaling: Distance Measure

(Figure: a distance measure computed on the unscaled features.)


Techniques for Feature Scaling

(Figure; two common techniques, min-max scaling and standardization, appear in the sketch above.)


Methods of Dimensionality Reduction

➢ Principal Component Analysis (PCA)

➢ Linear Discriminant Analysis (LDA)

➢ Generalized Discriminant Analysis (GDA)

❑ Most Common Linear Method:

➢ Principal Component Analysis (PCA)



Curse of dimensionality


Example: describing a ball

(Figure: features of a Ball vs an Orange - sphere, eatable or not, used for play, colour (white / red), size = 5 cm.)


Real life example

(Figure slides: illustrative figures for the real-life example.)

Dimensionality reduction

(Figure slides: illustrative figures for dimensionality reduction.)


Principal Component Analysis (PCA)


Principal Component Analysis
➢ Principal Component Analysis (PCA) is a dimension-reduction tool that can be used to reduce a large set of variables to a small set that still contains most of the information in the large set.

➢ It is a mathematical procedure that transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components.

➢ The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.


PCA Steps


1. Standardization of the data

2. Calculate the covariance matrix

3. Calculate the eigenvalues and eigenvectors

4. Compute the principal components

5. Reduce the dimensions
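A library call wraps these five steps (scikit-learn's PCA uses an equivalent SVD internally); a minimal sketch, assuming a hypothetical feature matrix X built from synthetic data:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Hypothetical data: 100 samples with 5 correlated features
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.hstack([base, base @ rng.normal(size=(2, 3)) + 0.05 * rng.normal(size=(100, 3))])

X_scaled = StandardScaler().fit_transform(X)   # step 1: standardization
pca = PCA(n_components=2)                      # steps 2-4: covariance, eigen-decomposition, components
X_reduced = pca.fit_transform(X_scaled)        # step 5: project onto the top components

print(pca.explained_variance_ratio_)           # share of variance kept by each component
print(X_reduced.shape)                         # (100, 2)
```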



PCA Example
➢ Step 1: Find the mean values of the data.

Data: two variables x and y (the values appear mean-centred in the table on the next slide).

Mean values:
x' = 1.81
y' = 1.91


PCA Example

➢ Step 2: Subtract the mean so that the centred data passes through the origin. The centred data has mean 0.

(x' - x)   (y' - y)
 -0.69      -0.49
  1.31       1.21
 -0.39      -0.99
 -0.09      -0.29
 -1.29      -1.09
 -0.49      -0.79
 -0.19       0.31
  0.81       0.81
  0.31       0.31
  0.71       1.01

Covariance = Σ_{i=1}^{n} (x − x')(y − y') / (n − 1)

➢ Step 3: Find the covariance matrix: it shows how two variables vary together.
PCA Example

➢ Step 3: Find the covariance matrix: it shows how two variables vary together.

Covariance = Σ_{i=1}^{n} (x − x')(y − y') / (n − 1)     (here n = 10, so the denominator is 9)

X = (x' - x)   Y = (y' - y)   X^2      Y^2      X*Y
   -0.69          -0.49       0.4761   0.2401   0.3381
    1.31           1.21       1.7161   1.4641   1.5851
   -0.39          -0.99       0.1521   0.9801   0.3861
   -0.09          -0.29       0.0081   0.0841   0.0261
   -1.29          -1.09       1.6641   1.1881   1.4061
   -0.49          -0.79       0.2401   0.6241   0.3871
   -0.19           0.31       0.0361   0.0961  -0.0589
    0.81           0.81       0.6561   0.6561   0.6561
    0.31           0.31       0.0961   0.0961   0.0961
    0.71           1.01       0.5041   1.0201   0.7171
Sum:  0               0       5.549    6.449    5.539
Sum/9:                        0.61656  0.71656  0.61544
PCA Example

The covariance matrix for the example is:

C = [ 0.616  0.615 ]
    [ 0.615  0.716 ]

Since the off-diagonal elements of the covariance matrix are positive, x and y increase together (they vary in the same direction).

o Step 4: Calculate the eigenvalues and eigenvectors of the covariance matrix:

eigenvalues = [ 0.0490 ]      eigenvectors = [ -0.735  -0.678 ]
              [ 1.2840 ]                     [  0.678  -0.735 ]

(Each column of the eigenvector matrix corresponds to the eigenvalue in the same row position.)

The most important (principal) eigenvector points in the direction along which the variables most strongly vary.
PCA Example
o Step 4: Calculate the eigenvalues and eigenvectors of the covariance matrix.

Solve |C − λI| = 0:

| [ 0.616  0.615 ] − λ [ 1  0 ] | = 0
| [ 0.615  0.716 ]     [ 0  1 ] |

| 0.616 − λ     0.615     |
| 0.615         0.716 − λ | = 0     →     eigenvalues λ = 0.0490 and λ = 1.2840

Calculate the eigenvectors:

[ 0.616 − 0.049   0.615         ] [ x1 ] = 0
[ 0.615           0.716 − 0.049 ] [ y1 ]

and

[ 0.616 − 1.284   0.615         ] [ x2 ] = 0
[ 0.615           0.716 − 1.284 ] [ y2 ]

eigenvalues = [ 0.0490 ]      eigenvectors = [ -0.735  -0.678 ]
              [ 1.2840 ]                     [  0.678  -0.735 ]

The most important (principal) eigenvector is the one with the largest eigenvalue; it points in the direction along which the variables most strongly vary.
PCA Example
o Step 5: The eigenvectors with the highest eigenvalues are selected for PCA.

➢ The remaining dimensions can then be ignored.

➢ For n-dimensional data → n eigenvectors → select p eigenvectors.

➢ For dimensionality reduction, p < n.

Final data = Feature Vector × (Scaled Data)^T
Final data = [selected eigenvectors]^T × (Scaled Data)^T

➢ Final data is the final dataset, with data items in columns and dimensions along rows.

➢ The example data has 2 dimensions, so it was expressed in terms of x and y; after projection it is expressed in terms of the eigenvectors. A NumPy sketch of the whole example follows below.
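A NumPy sketch that reproduces the worked example end to end, starting from the mean-centred values given on the earlier slides (step 1 is therefore already applied):

```python
import numpy as np

# Mean-centred data from the example (columns: x' - x, y' - y)
D = np.array([[-0.69, -0.49], [ 1.31,  1.21], [-0.39, -0.99], [-0.09, -0.29],
              [-1.29, -1.09], [-0.49, -0.79], [-0.19,  0.31], [ 0.81,  0.81],
              [ 0.31,  0.31], [ 0.71,  1.01]])

C = np.cov(D, rowvar=False)              # covariance matrix, (n - 1) denominator
eig_vals, eig_vecs = np.linalg.eigh(C)   # eigh: decomposition of a symmetric matrix
print(C.round(3))                        # ~[[0.617, 0.615], [0.615, 0.717]]
print(eig_vals.round(4))                 # ~[0.0491, 1.284]

# Keep the eigenvector with the largest eigenvalue (p = 1 < n = 2);
# the sign of an eigenvector is arbitrary, so it may differ from the slide.
order = np.argsort(eig_vals)[::-1]
W = eig_vecs[:, order[:1]]               # feature vector, shape (2, 1)

# Final data = [selected eigenvectors]^T x (scaled data)^T
final = W.T @ D.T                        # shape (1, 10): data expressed along the first PC
print(final.round(3))
```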
Pros and Cons of Dimensionality Reduction

Advantages of Dimensionality Reduction

➢ It helps in data compression, and hence reduces storage space.

➢ It reduces computation time.

➢ It also helps remove redundant features, if any.

Disadvantages of Dimensionality Reduction

➢ It may lead to some amount of data loss.

➢ PCA only captures linear correlations between variables, which is sometimes undesirable.

➢ PCA fails in cases where mean and covariance are not enough to define the dataset.

➢ We may not know how many principal components to keep; in practice, some rules of thumb are applied.
Independent Component Analysis (ICA)

➢ Unlike principal component analysis, which focuses on maximizing the variance of the data points, independent component analysis focuses on independence, i.e. on finding independent components.

➢ Problem: extract the independent sources' signals from a mixed signal composed of the signals from those sources.
   Given: a mixed signal from five different independent sources.
   Aim: decompose the mixed signal into its independent sources.


(Figure: decomposing a mixed signal into its independent source signals.)


Independent Component Analysis (ICA)

➢ Decomposing each microphone's recorded mixture into the independent sources' speech signals can be done with the machine learning technique independent component analysis:

[ X1, X2, ..., Xn ]  =>  [ Y1, Y2, ..., Yn ]

where X1, X2, ..., Xn are the observed mixed signals and Y1, Y2, ..., Yn are the new features: independent components that are independent of each other. A short sketch follows below.

Restrictions on ICA:

➢ The independent components generated by ICA are assumed to be statistically independent of each other.

➢ The independent components generated by ICA must have a non-Gaussian distribution.

➢ The number of independent components generated by ICA is equal to the number of observed mixtures.
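A minimal sketch of such a decomposition with scikit-learn's FastICA, under the assumption that the mixtures are arranged as a matrix X with one column per observed signal (the two synthetic sources below are illustrative):

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two hypothetical non-Gaussian sources mixed by an unknown matrix A
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]   # sine wave + square wave
A = np.array([[1.0, 0.5],
              [0.5, 2.0]])                         # mixing matrix
X = S @ A.T                                        # observed mixtures [X1, X2]

ica = FastICA(n_components=2, random_state=0)
Y = ica.fit_transform(X)   # estimated independent components [Y1, Y2]
print(Y.shape)             # (2000, 2); sources are recovered up to order and scale
```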
Principal Component Analysis vs Independent Component Analysis

Principal Component Analysis:
▪ It reduces the dimensions to avoid the problem of overfitting.
▪ It deals with the principal components.
▪ It focuses on maximizing the variance.
▪ It focuses on the mutual orthogonality of the principal components.
▪ It doesn't focus on the mutual independence of the components.

Independent Component Analysis:
▪ It decomposes the mixed signal into its independent sources' signals.
▪ It deals with the independent components.
▪ It doesn't focus on the issue of variance among the data points.
▪ It doesn't focus on the mutual orthogonality of the components.
▪ It focuses on the mutual independence of the components.
Linear Discriminant Analysis (LDA)

➢ Linear Discriminant Analysis (LDA) is most commonly used as a dimensionality reduction technique in the pre-processing step for pattern-classification and machine learning applications.

➢ The goal is to project a dataset onto a lower-dimensional space with good class separability, in order to avoid overfitting ("curse of dimensionality") and to reduce computational costs.

➢ In addition to finding the component axes that maximize the variance of the data (PCA), we are also interested in the axes that maximize the separation between multiple classes (LDA).


Linear Discriminant Analysis (LDA)

➢ The goal of LDA is to project a feature space (a dataset of n-dimensional samples) onto a smaller subspace k (where k ≤ n − 1) while maintaining the class-discriminatory information.

➢ In general, dimensionality reduction not only helps reduce computational costs for a given classification task, but can also help avoid overfitting by minimizing the error in parameter estimation ("curse of dimensionality").


PCA vs LDA
➢ Both Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) are linear transformation techniques that are commonly used for dimensionality reduction.

➢ PCA is an "unsupervised" algorithm: it ignores class labels.

➢ LDA is "supervised": it uses class labels, as sketched below.
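The supervised/unsupervised difference is visible directly in the fit calls; a short sketch using scikit-learn and its built-in Iris data:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)       # unsupervised: class labels not used
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)   # supervised: labels guide the axes
print(X_pca.shape, X_lda.shape)                    # (150, 2) (150, 2)
```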



LDA Steps
➢ Compute the d-dimensional mean vectors for the different classes from the dataset.
➢ Compute the scatter matrices (between-class and within-class scatter matrices).
➢ Compute the eigenvectors (e1, e2, ..., ed) and corresponding eigenvalues (λ1, λ2, ..., λd) for the scatter matrices.
➢ Sort the eigenvectors by decreasing eigenvalue and choose the k eigenvectors with the largest eigenvalues to form a d×k matrix W (where every column represents an eigenvector).
➢ Use this d×k eigenvector matrix to transform the samples onto the new subspace. This can be summarized by the matrix multiplication Y = X × W (where X is the n×d matrix holding the n samples and Y is the transformed n×k sample matrix in the new subspace). A NumPy sketch of these steps is given below.
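A NumPy sketch of these steps under the stated setup (X is an n×d sample matrix, y holds the class labels; the function name is illustrative):

```python
import numpy as np

def lda_transform(X, y, k):
    """Project X onto the top-k discriminant directions via the scatter matrices."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]

    S_W = np.zeros((d, d))   # within-class scatter
    S_B = np.zeros((d, d))   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)                       # step 1: class mean vectors
        S_W += (Xc - mean_c).T @ (Xc - mean_c)         # step 2: scatter matrices
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_B += len(Xc) * (diff @ diff.T)

    # Step 3: eigenvectors/eigenvalues of S_W^{-1} S_B
    eig_vals, eig_vecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)

    # Step 4: sort by decreasing eigenvalue and keep the top k columns as W (d x k)
    order = np.argsort(eig_vals.real)[::-1]
    W = eig_vecs[:, order[:k]].real

    # Step 5: Y = X x W, the n x k samples in the new subspace
    return X @ W

# Usage (hypothetical): Y = lda_transform(X, y, k=2)
```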
Managing Missing Features

1. Remove the instance (drop the sample containing the missing value).

2. Create a sub-model (predict the missing feature from the other features).

3. Automatic strategy (impute the missing value automatically, e.g. with the column mean); see the sketch below.
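The automatic strategy usually means imputation; a minimal sketch with scikit-learn's SimpleImputer, using a small hypothetical matrix with NaN gaps:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical data with missing entries marked as NaN
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan],
              [4.0, 5.0]])

# Strategy 1: remove the instances that contain a missing value
X_dropped = X[~np.isnan(X).any(axis=1)]

# Strategy 3: automatic imputation - fill each gap with the column mean
imputer = SimpleImputer(strategy="mean")   # "median" or "most_frequent" also possible
X_filled = imputer.fit_transform(X)

print(X_dropped)
print(X_filled)
```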



Assignment
• Chi-Square Test
• SURVEY PAPER STUDY (PPT, complete notes and video lecture)
  1. ALL DIMENSIONALITY REDUCTION ALGORITHMS
     • PCA, LDA, ICA and t-SNE
  2. MISSING VALUE HANDLING
  3. IMPLEMENTATION OF PCA / ICA / LDA / t-SNE
     1. RESEARCH PAPER IMPLEMENTATION ON ANY TOPIC
     2. COMPARISON PPTS



Thanks

Dr Poonam Rani - Machine Learning