Exp 15

The document outlines the implementation of Principal Component Analysis (PCA) for dimension reduction in data sets, emphasizing its benefits such as reduced storage requirements and improved model performance. It provides a step-by-step PCA algorithm and includes practice problems to compute principal components from given data. Additionally, it demonstrates the use of PCA in a machine learning context with code examples for data processing and model evaluation.

Uploaded by

8367748261durga

Experiment-15:

Write a program to implement Principal Component Analysis.

Principal Component Analysis | Dimension Reduction

Dimension Reduction-

In pattern recognition, Dimension Reduction is defined as-

• It is the process of converting a data set with many dimensions into a data set with fewer dimensions.
• It ensures that the converted data set conveys similar information concisely.

Example-

Consider the following example-

• The following graph shows two dimensions x1 and x2.
• x1 represents the measurement of several objects in cm.
• x2 represents the measurement of the same objects in inches.

In machine learning,
• These two dimensions convey similar information.
• Used together, they also introduce a lot of noise in the system.
• So, it is better to use just one dimension.

Using dimension reduction techniques-

• We convert the data from 2 dimensions (x1 and x2) to 1 dimension (z1).
• This makes the data easier to explain.
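The cm and inches example can be sketched in NumPy. This is a minimal illustration, not part of the original experiment: the measurement values are made up, and the names `x1_cm`, `x2_in` and `z1` are chosen here to mirror the text.

```python
import numpy as np

# Hypothetical measurements of five objects in centimetres (illustrative values).
x1_cm = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
# The same objects measured in inches (1 inch = 2.54 cm).
x2_in = x1_cm / 2.54

# A correlation of 1 means x2 adds no information beyond x1.
correlation = np.corrcoef(x1_cm, x2_in)[0, 1]

# Centre the data and project it onto the direction of greatest variance
# to obtain the single new dimension z1.
data = np.column_stack([x1_cm, x2_in])
centered = data - data.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
z1 = centered @ eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order

print("correlation:", round(correlation, 6))
print("share of variance kept by z1:", round(eigenvalues[-1] / eigenvalues.sum(), 6))
```

Because the two columns differ only by a constant factor, one dimension is enough to carry essentially all of the variance.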
Benefits-

Dimension reduction offers several benefits such as-

• It compresses the data and thus reduces the storage space requirements.
• It reduces the time required for computation, since fewer dimensions require less computation.
• It eliminates redundant features.
• It can improve model performance.

Dimension Reduction Techniques-

The two popular and well-known dimension reduction techniques are-

1. Principal Component Analysis (PCA)

2. Fisher Linear Discriminant Analysis (LDA)

Principal Component Analysis-

• Principal Component Analysis is a well-known dimension reduction technique.
• It transforms the variables into a new set of variables called principal components.
• These principal components are linear combinations of the original variables and are orthogonal to each other.
• The first principal component accounts for as much of the variation in the original data as possible.
• The second principal component captures as much of the remaining variance as possible while staying orthogonal to the first.
• There can be at most two principal components for a two-dimensional data set.
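These properties can be checked numerically. The sketch below uses a randomly generated two-dimensional data set (an assumption made purely for illustration) and verifies that the two component directions are orthogonal and that the variance along each component equals the corresponding eigen value.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
# Hypothetical two-dimensional data with correlated columns.
a = rng.normal(size=200)
data = np.column_stack([a, 0.5 * a + 0.1 * rng.normal(size=200)])

centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Reorder so that the first component has the largest variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

pc1, pc2 = eigenvectors[:, 0], eigenvectors[:, 1]
scores = centered @ eigenvectors               # each column is one principal component
dot_product = float(pc1 @ pc2)                 # close to 0: the directions are orthogonal
score_variances = scores.var(axis=0, ddof=1)   # matches the eigen values

print("pc1 . pc2 =", round(dot_product, 12))
print("variances along the components:", score_variances)
```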

PCA Algorithm-

The steps involved in PCA Algorithm are as follows-

Step-01: Get data.

Step-02: Compute the mean vector (µ).
Step-03: Subtract mean from the given data.
Step-04: Calculate the covariance matrix.
Step-05: Calculate the eigen vectors and eigen values of the covariance matrix.
Step-06: Choose components and form a feature vector.
Step-07: Derive the new data set.
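The seven steps can be sketched as one NumPy function. This is an illustrative outline rather than a reference implementation: the function name `pca`, its return values and the sample data are assumptions introduced here, and the covariance is divided by n (not n - 1).

```python
import numpy as np

def pca(data, n_components):
    """Apply Steps 02 to 07 to `data` of shape (n_samples, n_features)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)                                 # Step-02
    centered = data - mean                                   # Step-03
    cov = centered.T @ centered / len(data)                  # Step-04 (divides by n)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)          # Step-05
    order = np.argsort(eigenvalues)[::-1]                    # largest eigen value first
    feature_vector = eigenvectors[:, order[:n_components]]   # Step-06
    new_data = centered @ feature_vector                     # Step-07
    return new_data, feature_vector, eigenvalues[order]

# Step-01: a small hypothetical data set (four samples, two features).
samples = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
reduced, components, variances = pca(samples, n_components=1)

print(reduced.ravel())
print(variances)
```

`np.linalg.eigh` is used because a covariance matrix is symmetric; it returns eigen values in ascending order, hence the explicit reordering.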

PRACTICE PROBLEMS BASED ON PRINCIPAL COMPONENT ANALYSIS-

Problem-01:

Given data = { 2, 3, 4, 5, 6, 7 ; 1, 5, 3, 6, 7, 8 }.
Compute the principal component using PCA Algorithm.

OR

Consider the two dimensional patterns (2, 1), (3, 5), (4, 3), (5, 6), (6, 7), (7, 8).
Compute the principal component using PCA Algorithm.

OR

Compute the principal component of the following data-

CLASS 1
X = 2, 3, 4
Y = 1, 5, 3
CLASS 2
X = 5, 6, 7
Y = 6, 7, 8

Solution-

We use the above discussed PCA Algorithm-

Step-01:

Get data.
The given feature vectors are-
• x1 = (2, 1)
• x2 = (3, 5)
• x3 = (4, 3)
• x4 = (5, 6)
• x5 = (6, 7)
• x6 = (7, 8)

Step-02:

Calculate the mean vector (µ).

Mean vector (µ)
= ((2 + 3 + 4 + 5 + 6 + 7) / 6, (1 + 5 + 3 + 6 + 7 + 8) / 6)
= (4.5, 5)

Thus, mean vector (µ) = (4.5, 5).

Step-03:

Subtract mean vector (µ) from the given feature vectors.

• x1 – µ = (2 – 4.5, 1 – 5) = (-2.5, -4)
• x2 – µ = (3 – 4.5, 5 – 5) = (-1.5, 0)
• x3 – µ = (4 – 4.5, 3 – 5) = (-0.5, -2)
• x4 – µ = (5 – 4.5, 6 – 5) = (0.5, 1)
• x5 – µ = (6 – 4.5, 7 – 5) = (1.5, 2)
• x6 – µ = (7 – 4.5, 8 – 5) = (2.5, 3)

Feature vectors (xi) after subtracting mean vector (µ) are-

(-2.5, -4), (-1.5, 0), (-0.5, -2), (0.5, 1), (1.5, 2), (2.5, 3)
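Steps 02 and 03 can be reproduced in a few lines of NumPy. This is a minimal sketch; the array names `X`, `mu` and `centered` are chosen here for illustration.

```python
import numpy as np

# The six feature vectors from Step-01, one per row.
X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)

mu = X.mean(axis=0)   # Step-02: mean vector
centered = X - mu     # Step-03: subtract the mean from every row

print("mean vector:", mu)
print("mean-subtracted vectors:")
print(centered)
```

After centring, each column sums to zero, which is a quick sanity check on the subtraction.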

Step-04:

Calculate the covariance matrix.

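Covariance matrix is given by-

Covariance matrix = (1 / n) Σ (xi – µ)(xi – µ)^T, summed over i = 1 to n (here n = 6)

Now, writing each 2 x 2 matrix with its rows separated by ";"-

m1 = (x1 – µ)(x1 – µ)^T = [ 6.25, 10 ; 10, 16 ]
m2 = (x2 – µ)(x2 – µ)^T = [ 2.25, 0 ; 0, 0 ]
m3 = (x3 – µ)(x3 – µ)^T = [ 0.25, 1 ; 1, 4 ]
m4 = (x4 – µ)(x4 – µ)^T = [ 0.25, 0.5 ; 0.5, 1 ]
m5 = (x5 – µ)(x5 – µ)^T = [ 2.25, 3 ; 3, 4 ]
m6 = (x6 – µ)(x6 – µ)^T = [ 6.25, 7.5 ; 7.5, 9 ]

Now,
Covariance matrix
= (m1 + m2 + m3 + m4 + m5 + m6) / 6

On adding the above matrices and dividing by 6, we get-

Covariance matrix = [ 2.92, 3.67 ; 3.67, 5.67 ]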
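The same covariance matrix can be computed in NumPy as an average of outer products. This is a sketch under the document's convention of dividing by n = 6; note that `np.cov` divides by n - 1 unless `bias=True` is passed. The variable names are illustrative.

```python
import numpy as np

X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)
centered = X - X.mean(axis=0)

# m_i = (x_i - mu)(x_i - mu)^T for each feature vector.
outer_products = [np.outer(row, row) for row in centered]

# Average of the six matrices (division by n = 6, as in the text).
cov_n = sum(outer_products) / len(X)

# Cross-check: np.cov with bias=True also divides by n.
cov_numpy = np.cov(centered, rowvar=False, bias=True)

print(np.round(cov_n, 2))
```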

Step-05:

Calculate the eigen values and eigen vectors of the covariance matrix.
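λ is an eigen value for a matrix M if it is a solution of the characteristic equation |M – λI| = 0.
So, we have-

| 2.92 – λ, 3.67 ; 3.67, 5.67 – λ | = 0

From here,
(2.92 – λ)(5.67 – λ) – (3.67 x 3.67) = 0
16.56 – 2.92λ – 5.67λ + λ² – 13.47 = 0
λ² – 8.59λ + 3.09 = 0

Solving this quadratic equation, we get λ ≈ 8.22, 0.38. (Solved exactly, the larger root is nearer 8.21; the 0.01 difference does not affect the eigen vector found below to two decimal places.)

Thus, the two eigen values are λ1 = 8.22 and λ2 = 0.38.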
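The eigen values can be checked numerically. The sketch below keeps the covariance matrix at full precision (its entries are 17.5/6, 22/6 and 34/6, obtained from the mean-subtracted vectors of Step-03) and solves the characteristic equation with the quadratic formula; the variable names are illustrative. At full precision the larger root comes out near 8.21 rather than 8.22.

```python
import numpy as np

# Covariance matrix from Step-04, kept at full precision.
M = np.array([[17.5, 22.0], [22.0, 34.0]]) / 6.0

# For a 2 x 2 matrix the characteristic equation is
# lambda^2 - trace(M) * lambda + det(M) = 0.
trace = np.trace(M)
det = np.linalg.det(M)
discriminant = np.sqrt(trace**2 - 4.0 * det)
lambda_1 = (trace + discriminant) / 2.0
lambda_2 = (trace - discriminant) / 2.0

# Cross-check with NumPy's solver for symmetric matrices (ascending order).
eigenvalues = np.linalg.eigvalsh(M)

print(round(lambda_1, 2), round(lambda_2, 2))
```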

Clearly, the second eigen value is very small compared to the first eigen value.
So, the second eigen vector can be left out.

The eigen vector corresponding to the greatest eigen value is the principal component for the given data set.
So, we find the eigen vector corresponding to eigen value λ1.

We use the following equation to find the eigen vector-

MX = λX
where-
• M = Covariance Matrix
• X = Eigen vector
• λ = Eigen value

Substituting the values in the above equation, we get-

[ 2.92, 3.67 ; 3.67, 5.67 ] [ X1 ; X2 ] = 8.22 [ X1 ; X2 ]

Multiplying out, we get-

2.92X1 + 3.67X2 = 8.22X1
3.67X1 + 5.67X2 = 8.22X2

On simplification, we get-
5.3X1 = 3.67X2 ………(1)
3.67X1 = 2.55X2 ………(2)

From (1) and (2), X1 = 0.69X2

From (2), the eigen vector is-

[ X1 ; X2 ] = [ 2.55 ; 3.67 ]

Thus, principal component for the given data set is-

Principal component = [ 2.55 ; 3.67 ]
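The eigen vector can be verified with NumPy. This is a sketch only: `np.linalg.eigh` returns a unit-length eigen vector whose sign is arbitrary, so it is compared with the hand calculation through the ratio X1 / X2 rather than entry by entry. The covariance entries 17.5/6, 22/6 and 34/6 are the unrounded values behind 2.92, 3.67 and 5.67.

```python
import numpy as np

M = np.array([[17.5, 22.0], [22.0, 34.0]]) / 6.0  # covariance matrix from Step-04

eigenvalues, eigenvectors = np.linalg.eigh(M)     # eigen values in ascending order
lambda_1 = eigenvalues[-1]
v = eigenvectors[:, -1]                           # unit-length eigen vector for lambda_1
if v[0] < 0:
    v = -v                                        # the sign is arbitrary; make X1 positive

ratio = v[0] / v[1]                               # X1 / X2, about 0.69
residual = M @ v - lambda_1 * v                   # numerically zero if MX = lambda X holds

print("unit eigen vector:", np.round(v, 3))
print("X1 / X2:", round(ratio, 2))
```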

Lastly, we project the data points onto the new subspace as-
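The projection can be sketched in NumPy. This is an illustration under stated assumptions: it takes the eigen vector as (2.55, 3.67), which satisfies equation (2), and projects each mean-subtracted feature vector onto it with a dot product. Whether the eigen vector should first be scaled to unit length is a convention the text does not state, so both variants are computed.

```python
import numpy as np

X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)
centered = X - X.mean(axis=0)

# Eigen vector for the largest eigen value, taken from equation (2).
eigen_vector = np.array([2.55, 3.67])

# Projection onto the unnormalised eigen vector (one number per data point).
projected = centered @ eigen_vector

# Same projection with the eigen vector scaled to unit length (assumed convention).
unit_vector = eigen_vector / np.linalg.norm(eigen_vector)
projected_unit = centered @ unit_vector

print(np.round(projected, 3))
print(np.round(projected_unit, 3))
```

Both variants order the six points identically along the new axis; they differ only by a constant scale factor.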
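# -*- coding: utf-8 -*-
"""EXP15.ipynb"""

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load the data set (path as used in the notebook).
dataset = pd.read_csv('/content/PCA.csv')

dataset

# Count missing values in each column.
dataset.isna().sum()

# Features are all columns except the last; the last column is the target.
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values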

from sklearn.model_selection import train_test_split

# Hold out 20% of the rows for testing.
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

from sklearn.preprocessing import StandardScaler

# Standardise the features: fit on the training set only, then reuse on the test set.
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

from sklearn.decomposition import PCA

# Keep the first two principal components.
pca = PCA(n_components=2)
x_train = pca.fit_transform(x_train)
x_test = pca.transform(x_test)

# Fraction of the total variance explained by each kept component.
variance = pca.explained_variance_ratio_
variance

from sklearn.linear_model import LogisticRegression

# Train a logistic regression classifier on the two principal components.
classifier = LogisticRegression()
classifier.fit(x_train, y_train)

y_pred = classifier.predict(x_test)

# Compare actual and predicted labels side by side.
data_p = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
data_p

from sklearn.metrics import confusion_matrix, accuracy_score

cm = confusion_matrix(y_test, y_pred)
cm

import seaborn as sns

# Plot the confusion matrix as an annotated heatmap.
sns.heatmap(cm, annot=True)

# find accuracy_score
print("The accuracy of the given model:", accuracy_score(y_test, y_pred) * 100)

# find precision_score (micro-averaged over all classes)
from sklearn.metrics import precision_score

precision = precision_score(y_test, y_pred, average='micro')

print('Precision:', precision * 100)

# calculate recall
from sklearn.metrics import recall_score
recall = recall_score(y_test, y_pred, average='micro')
print('Recall:', recall * 100)

# calculate f1 score
from sklearn.metrics import f1_score
f1score = f1_score(y_test, y_pred, average='micro')
print('F1 score:', f1score * 100)
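One consequence of `average='micro'` is worth illustrating: when every sample has exactly one label, micro-averaged precision, recall and F1 all equal the accuracy, so the three scores above repeat the accuracy figure. The sketch below shows this with plain NumPy on made-up labels; the arrays `y_true` and `y_hat` are hypothetical and unrelated to PCA.csv.

```python
import numpy as np

# Hypothetical true and predicted labels for a three-class problem.
y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_hat = np.array([0, 2, 2, 2, 1, 0, 1, 1])

classes = np.unique(np.concatenate([y_true, y_hat]))
# Pool true positives, false positives and false negatives over all classes.
tp = sum(np.sum((y_hat == c) & (y_true == c)) for c in classes)
fp = sum(np.sum((y_hat == c) & (y_true != c)) for c in classes)
fn = sum(np.sum((y_hat != c) & (y_true == c)) for c in classes)

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
accuracy = np.mean(y_true == y_hat)

print(micro_precision, micro_recall, micro_f1, accuracy)
```

To see per-class differences, `average='macro'` or `average=None` would be the options to try in scikit-learn.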
