
DEPARTMENT OF ELECTRONICS AND TELECOMMUNICATION ENGINEERING

Sem/Branch: VIII / EXTC    Course: Machine Learning for Signal Processing – Laboratory    Course Code: DJ19ECEL8014

Experiment No.: 2    Name: Raj Mehta    SAP ID: 60002200083    Date: 2/2/24

Aim: PRINCIPAL COMPONENT ANALYSIS IN PYTHON


To project the given data along the principal component and reconstruct the given data
x1data = [2 3 4 5 6 7]; x2data = [1 5 3 6 7 8].
Reduce the two-dimensional features to one dimension with PCA in Python and reconstruct
the given data.

Tools Required: PYTHON


Theory:
Principal Component Analysis (PCA) is a widely used technique in data analysis and
dimensionality reduction. The main idea behind PCA is to identify the principal components
of a dataset, which are the directions in the feature space that explain the largest amount of
variance in the data. By projecting the data onto these principal components, we can reduce the
dimensionality of the dataset while preserving most of the information.
The first step in PCA is to standardize the data by subtracting the mean and dividing by the
standard deviation. This step ensures that all the features have a similar scale and avoids biases
towards features with larger values. Then, the covariance matrix of the data is computed, which
captures the relationships between the features.
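As an illustration, the standardization and covariance steps can be written in a few lines of NumPy. This is only a minimal sketch, assuming the data is arranged as a matrix X with one sample per row and the two given features as columns; note that the experiment's own code later removes the mean but does not divide by the standard deviation.

import numpy as np

# samples x features, built from x1data and x2data
X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)
X_std = (X - X.mean(axis=0)) / X.std(axis=0)   # subtract mean, divide by standard deviation
C = np.cov(X_std, rowvar=False)                # 2 x 2 covariance matrix of the features
print(C)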
The next step is to compute the eigenvectors and eigenvalues of the covariance matrix. The
eigenvectors are the principal components of the data, and the eigenvalues represent the amount
of variance explained by each principal component. The eigenvectors are sorted in descending
order of their corresponding eigenvalues, so that the first principal component explains the
largest amount of variance, the second principal component explains the second largest amount
of variance, and so on.
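A short NumPy sketch of this step might look as follows, continuing from the covariance matrix C above (variable names are illustrative):

eigvals, eigvecs = np.linalg.eigh(C)   # eigh suits symmetric matrices; eigenvalues come out ascending
order = np.argsort(eigvals)[::-1]      # indices sorted by descending eigenvalue
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]            # columns are the principal components
explained = eigvals / eigvals.sum()    # fraction of variance explained by each component
print(explained)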
Once the principal components have been identified, the data can be projected onto them by
computing the dot product between the data and the principal components. This results in a
new representation of the data in terms of the principal components, which can be used for
visualization or further analysis.
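Continuing the sketch, projecting the standardized data onto the leading k components is a single matrix product (W and scores are illustrative names):

k = 1                          # number of components to keep
W = eigvecs[:, :k]             # 2 x k projection matrix
scores = X_std @ W             # 6 x k projected data (principal-component scores)
print(scores)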
PCA can also be used for dimensionality reduction, by selecting a subset of the principal
components that explain most of the variance in the data. This can be done by setting a
threshold on the total variance explained, or by selecting a fixed number of principal
components.
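As a sketch of the threshold approach, the smallest number of components whose cumulative explained variance reaches, say, 95% can be found from the explained array above (threshold and cumulative are illustrative names):

threshold = 0.95
cumulative = np.cumsum(explained)                      # running total of explained-variance fractions
k = int(np.searchsorted(cumulative, threshold) + 1)    # smallest k reaching the threshold
print(k, cumulative)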
Overall, PCA is a powerful technique for analyzing and visualizing high-dimensional data. Its
applications include data compression, feature extraction, and pattern recognition, and it is
widely used in fields such as image processing, finance, and biology.
Implementation Algorithm
1. Remove the mean from the given data.
2. Find the covariance matrix of the mean-removed data.
3. Find the eigenvectors and eigenvalues of the covariance matrix. The eigenvectors are
perpendicular to each other. One eigenvalue is much larger than the other; keep the eigenvector
corresponding to the largest eigenvalue.
4. Multiply the mean-removed data by this eigenvector. This gives the 1-D data.
5. Multiply the 1-D data by the transpose of the eigenvector and add the mean back; this gives
the reconstructed data with small error. (A Python sketch of these steps is given after this list.)
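Since the aim specifies Python, the following is a minimal NumPy sketch of the five steps above applied to the given data (mean removal only, matching the algorithm rather than full standardization). The variable names are illustrative; the MATLAB listing in the CODE section implements the same procedure.

import numpy as np

x1 = np.array([2, 3, 4, 5, 6, 7], dtype=float)
x2 = np.array([1, 5, 3, 6, 7, 8], dtype=float)
X = np.column_stack((x1, x2))                  # 6 samples x 2 features

# Step 1: remove the mean of each feature
mean = X.mean(axis=0)
Xc = X - mean

# Step 2: covariance matrix (1/N scaling, as in the MATLAB listing)
C = (Xc.T @ Xc) / len(Xc)

# Step 3: eigen-decomposition; keep the eigenvector with the largest eigenvalue
eigvals, eigvecs = np.linalg.eigh(C)
pc = eigvecs[:, np.argmax(eigvals)]

# Step 4: project onto the principal component (1-D data)
one_d = Xc @ pc

# Step 5: reconstruct from the 1-D data and add the mean back
X_rec = np.outer(one_d, pc) + mean

print("1-D data:", one_d)
print("Reconstructed:\n", X_rec)
print("Mean squared error:", np.mean((X_rec - X) ** 2))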
CONCLUSION:
Projecting the given data x1data = [2 3 4 5 6 7]; x2data = [1 5 3 6 7 8] along the principal component
and reconstructing it has demonstrated the effectiveness of Principal Component Analysis (PCA) in
reducing the dimensionality of the data while preserving most of the information.
The first step was to centre the data by removing the mean of each feature. Then, the covariance
matrix was computed, and the eigenvectors and eigenvalues were
determined. The eigenvectors were sorted in descending order of their corresponding eigenvalues, and
the first principal component was identified.
The data was projected onto the first principal component by computing the dot product between the
data and the principal component. This resulted in a new representation of the data in terms of the first
principal component. The reconstruction of the data was achieved by multiplying the projected data by
the transpose of the principal component and adding the mean of the original data.
The effectiveness of the projection and reconstruction was evaluated by comparing the reconstructed
data with the original data. The results showed that the reconstructed data closely approximated the
original data, indicating that the projection along the principal component preserved most of the
information in the data.
Overall, the experiment demonstrated the effectiveness of PCA in reducing the dimensionality of the
data and preserving most of the information. PCA can be a powerful tool in data analysis and
visualization, and its applications include data compression, feature extraction, and pattern recognition.

CODE:

clc;
clear all;
close all;

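% Given two-feature data (x1data and x2data from the aim)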
A = [2, 3, 4, 5, 6, 7];
B = [1, 5, 3, 6, 7, 8];

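% Keep copies of the original data for the reconstruction-error check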
A_org = A;
B_org = B;

meanA = mean(A);
meanB = mean(B);
disp(meanA)
disp(meanB)
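% Step 1: remove the mean from each feature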
for i = 1:length(A)
A(i) = A(i)-meanA;
end
for i = 1:length(B)
B(i) = B(i)-meanB;
end

disp(A)
disp(B)

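% Step 2: covariance entries a = var(A), d = var(B), b = cov(A,B), each with 1/N scaling
% (MATLAB's built-in cov uses 1/(N-1))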
a = 0;
for i=1:length(A)
a = a + (A(i) * A(i));
end
a = a/length(A);
disp(a)

d = 0;
for i=1:length(B)
d = d + (B(i) * B(i));
end
d = d/length(B);
disp(d)

b = 0;
for i=1:length(A)
b = b + (A(i) * B(i));
end
b = b/length(A);
disp(b)

%Covariance Matrix
C = [a b ; b d];
disp(C)

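% Step 3: eigenvalues and eigenvectors of the covariance matrix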
e = eig(C);
disp(e);

[D, E] = eig(C); % columns of D are the eigenvectors, diagonal of E holds the eigenvalues


disp(D)
disp(E)

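% Keep the eigenvector corresponding to the larger eigenvalue (the principal component)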
if(e(1,1)>e(2,1))
f = D( : , 1);
else
f = D(: , 2);
end
disp(f)
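% Step 4: arrange the mean-removed data as rows and project onto the principal component (1-D data)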
X = [A ; B];
X = transpose(X);

OneD_Data = X * f;

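% Step 5: reconstruct the data from the 1-D projection and add the means back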
Reconstructed = OneD_Data * transpose(f);

Reconstructed(:,1) = Reconstructed(:,1) + meanA;


Reconstructed(:,2) = Reconstructed(:,2) + meanB;
disp(Reconstructed);

Recon_A = transpose(Reconstructed(:,1));
Recon_B = transpose(Reconstructed(:,2));
disp(Recon_A);
disp(Recon_B);
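% Reconstruction error per feature: sum of squared errors scaled by 1/(2N)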
mse_A = 0; mse_B = 0;
for i=1:length(A)
mse_A = mse_A + ((Recon_A(i) - A_org(i))* (Recon_A(i) - A_org(i)));
mse_B = mse_B + ((Recon_B(i) - B_org(i))* (Recon_B(i) - B_org(i)));
end
disp(mse_A/(2*length(A)));
disp(mse_B/(2*length(B)));
OUTPUT:
