Principal component analysis (PCA): Explained and implemented

By Raghavan · 6 min read · Aug 19, 2018

It is very common in data science tasks involving a large number of features to be advised to use PCA, aka principal component analysis. We will start with a brief introduction to the what and why of PCA, and then look into implementing it, with explanation along the way.

The what and why of PCA

When there are a lot of variables (features), say n > 10, we are often advised to do PCA. PCA is a statistical technique that reduces the dimensionality of the data and helps us understand and plot the data with fewer dimensions than the original. As the name says, PCA computes the principal components of the data. Principal components are vectors that are linearly uncorrelated, each capturing some of the variance in the data. From the principal components, the top p with the most variance are picked.

[Figure: PCA example — a scatter plot of the data with the principal axes drawn through it.]

In the plot above, what PCA does is draw the axis along which the data has maximum variance, then the axis with the second-largest variance, and so on. But wait, why is an axis of maximum variance important? Let's consider a classification problem (similar arguments can be drawn for other problems too). Our goal is to separate the data by drawing a line (or a plane) between the classes. If we find the direction of maximum variance, that solves part of the problem; all that remains is to use a suitable algorithm to draw the line or plane that splits the data.

Let's implement PCA

Let's begin by generating random data with 3 dimensions and 40 samples: two classes with 20 samples per class.

import numpy as np

np.random.seed(1)

vec1 = np.array([0, 0, 0])
mat1 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
sample_for_class1 = np.random.multivariate_normal(vec1, mat1, 20).T
assert sample_for_class1.shape == (3, 20), "The dimension of the sample_for_class1 matrix is not 3x20"

vec2 = np.array([1, 1, 1])
mat2 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
sample_for_class2 = np.random.multivariate_normal(vec2, mat2, 20).T
assert sample_for_class2.shape == (3, 20), "The dimension of the sample_for_class2 matrix is not 3x20"

all_data = np.concatenate((sample_for_class1, sample_for_class2), axis=1)
assert all_data.shape == (3, 40), "The dimension of the all_data matrix is not 3x40"

Now we will compute the scatter matrix (or covariance matrix) of this data. Why the scatter matrix? The scatter matrix records the relationships between the variables, which is exactly what we need to find the direction of maximum variance. (More on the scatter matrix and covariance here.) Either the scatter matrix or the covariance matrix can be used, since the covariance matrix is just a scaled version of the scatter matrix.
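As a quick sanity check on that last point, here is a small sketch (an addition, not code from the original post) showing that the scatter matrix is just (n − 1) times the sample covariance matrix numpy computes, so both lead to the same principal directions. It assumes the all_data array defined above.

# Sketch only: the scatter matrix equals (n - 1) times the sample covariance matrix.
n_samples = all_data.shape[1]                       # 40 samples
mean_vec = all_data.mean(axis=1, keepdims=True)     # 3x1 vector of feature means
centered = all_data - mean_vec                      # subtract the mean from every sample
scatter = centered.dot(centered.T)                  # 3x3 scatter matrix
cov = np.cov(all_data)                              # numpy's covariance (divides by n - 1)
print(np.allclose(scatter / (n_samples - 1), cov))  # True: same matrix up to scaling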
mean_dim1 = np.mean(all_data[0, :])
mean_dim2 = np.mean(all_data[1, :])
mean_dim3 = np.mean(all_data[2, :])
mean_vector = np.array([[mean_dim1], [mean_dim2], [mean_dim3]])
print('The Mean Vector:\n', mean_vector)

scatter_matrix = np.zeros((3, 3))
for i in range(all_data.shape[1]):
    scatter_matrix += (all_data[:, i].reshape(3, 1) - mean_vector).dot(
        (all_data[:, i].reshape(3, 1) - mean_vector).T)
print('The Scatter Matrix is :\n', scatter_matrix)

Output:

('The Mean Vector:\n', array([[0.41667492],
       [0.69848315],
       [0.49242335]]))
('The Scatter Matrix:\n', array([[38.4878051 , 10.50787213, 11.13746016],
       [10.50787213, 36.23651274, 11.96598642],
       [11.13746016, 11.96598642, 49.73596619]]))

Now we know how the variables are related to each other. We are just one step away from computing the principal components. Before that, we will review a basic concept from linear algebra: eigenvalues and eigenvectors.

Eigenvalues and Eigenvectors

Eigenvectors and eigenvalues are a property of a matrix, satisfying the following equation:

A x = λ x

where A denotes the matrix, x denotes an eigenvector, and λ denotes the corresponding eigenvalue. To understand the significance of eigenvectors and eigenvalues, let's look at the following animation.

[Animation: several vectors under a linear transformation — the blue and purple vectors only scale, while the red vector changes direction.]

Here the original matrix is multiplied by a vector (that is, the vector undergoes a linear transformation). What we can observe is that the lines colored blue and purple do not change direction, they only scale up, while the line colored red changes direction in addition to scaling. An eigenvector of a matrix (which can be seen as a linear transformation) is a condensed vector that summarizes one axis of that transformation. Another property of eigenvectors is that even if we scale the vector by some amount before multiplying, we still get the same multiple of it as a result. We might also say that eigenvectors are the axes along which the linear transformation acts, stretching or compressing input vectors. They are the lines of change that represent the action of the larger matrix, the very "line" in linear transformation.

Eigenvectors and eigenvalues are defined for square matrices. We prefer our eigenvectors to always have unit length. If we have an eigenvector [3, 2], we compute its length as

√(3² + 2²) = √13

Then, to make the eigenvector unit length, we divide by that length:

[3, 2] / √13 = [3/√13, 2/√13]

Since this is just a scalar operation, it does not affect the direction of the vector. A simple reason we do this is that we need all of our eigenvectors to be comparable, so that we can choose the most valuable among them (the one along which the data has maximum variance). A matrix can have more than one eigenvector, and at most d eigenvectors if the matrix is d × d.
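To make the unit-length computation concrete, here is a small sketch (again an addition to the post) that normalizes the example vector [3, 2]; it assumes numpy is imported as np, as earlier.

# Sketch only: normalize the example eigenvector [3, 2] to unit length.
v = np.array([3.0, 2.0])
length = np.sqrt(np.sum(v ** 2))   # sqrt(3^2 + 2^2) = sqrt(13)
v_unit = v / length                # [3/sqrt(13), 2/sqrt(13)]
print(np.linalg.norm(v_unit))      # 1.0 -- the direction is unchanged, only the length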
Back to PCA

Let's continue where we left off: we have the scatter matrix, which holds the information about how each variable is related to the others. Now we use numpy to compute its eigenvectors and eigenvalues.

eig_val_sc, eig_vec_sc = np.linalg.eig(scatter_matrix)

By default numpy (like most stats libraries) returns eigenvectors of unit length, which can be verified by checking that each column of eig_vec_sc has norm 1. We can end up with up to 3 eigenvectors, since our scatter matrix is 3 × 3. Choosing the k most important eigenvectors out of d is the same as dropping the other d − k. We now sort the eigenvectors by their eigenvalues and drop the one with the smallest eigenvalue.

# We make a list of tuples containing (eigenvalue, eigenvector)
eig_pairs = [(np.abs(eig_val_sc[i]), eig_vec_sc[:, i]) for i in range(len(eig_val_sc))]

# We then sort the list of tuples by the eigenvalue
eig_pairs.sort(key=lambda x: x[0], reverse=True)

# Verify that the list is correctly sorted by decreasing eigenvalues
for i in eig_pairs:
    print(i[0])

Output:

65.16936779078195
32.69471296321799
26.596203282097097

Now we choose the k largest eigenvectors (here k = 2):

matrix_w = np.hstack((eig_pairs[0][1].reshape(3, 1), eig_pairs[1][1].reshape(3, 1)))
print('Matrix W:\n', matrix_w)

('Matrix W:\n', array([[-0.49210223, -0.64670286],
       [-0.47927902, -0.35756937],
       [-0.72672348,  0.67373552]]))

Finally, we have the new axes onto which we can project our samples: we just multiply the original data by our chosen eigenvectors and plot.

transformed = matrix_w.T.dot(all_data)
assert transformed.shape == (2, 40), "The matrix is not 2x40 dimensional."

[Figure: Transformed samples with class labels — the 40 samples projected onto the two principal components, colored by class 1 and class 2.]

Given that we generated the raw data randomly, the plot may vary when you recreate it. Comparing, for the same data, against the PCA from sklearn (a sketch of this comparison is included at the end of the post):

[Figure: Transformed samples via sklearn decomposition PCA — essentially the same projection, colored by class 1 and class 2.]

Bingo!!! With this understanding by our side, we can define PCA as the process of finding the axes in our feature space from which the samples in our data are maximally separable.
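For completeness, here is a minimal sketch of the sklearn comparison mentioned above (this snippet is an addition, not the author's code). It assumes the all_data array from earlier; note that sklearn's PCA expects samples as rows and centers the data, and the sign of each component may be flipped relative to matrix_w, since eigenvector orientation is arbitrary.

# Sketch only: reproducing the projection with scikit-learn.
from sklearn.decomposition import PCA

pca = PCA(n_components=2)                             # keep the top two principal components
sklearn_transformed = pca.fit_transform(all_data.T)   # sklearn expects samples as rows (40 x 3)
print(sklearn_transformed.shape)                      # (40, 2)
print(pca.components_)                                # principal directions, comparable to matrix_w.T up to sign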
2minread + Jun 24,2018 Se Q @ Raghavan Discriminative vs Generative models htis:Imecium.com/@reghavan99olprincpal-component-analysis-pca-explained-andmplemented.eeab7cb73072 10113 2114924, 12:05 AM Principal component analysis (PCA): Explained and implemented | by Raghavan | Medium This is a continuation of this post which Classifier is a very common machine learning introduced Kalman filtering, in this post, we. technique used . Two most popular of them. 2minread » Apr 14,2019 2min read » Jun 2, 2017 Sur Qi See all from Raghavan Recommended from Medium htis:Imecium.com/@reghavan99olprincipal-component-analysic-pea-explained-andmplemented.eab7cb73072 wi3 2114124, 12:05 AM A=WTSw @ shubham Panchal in Towards Data Science Principal Component Analy: Everything You Need To Know Covariance, eigenvalues, variance and everything... Y2minread « Sep 22,2022 Sie Qi Wi Lists Predictive Modeling w/ Python 20 stories . 895 saves Ei” 1189stories - 664 saves Principal component analysis (PCA): Explained and implemented | by Raghavan | Medium I TE) Natural Language Processing ® Rukshan Pramoditha in Data Science 365, 3 Easy Steps to Perform Dimensionality Reduction Using.. Running the PCA algorithm twice is the most effective way of performing PCA + + 10min read + Jan 3, 2028 Sw Qe oo Practical Guides to Machine Learning JOstories - 1046 saves data science and Al 40 stories - 70 saves htis:Imecium.com/@reghavan99olprincpal-component-analysis-pca-explained-andmplemented.eeab7cb73072 sata 2114924, 12:05 AM B Visnatkarda PCA vs SVD: Simplified Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) are two... 3minread + Aug 16,2023 & c + Ss Q oo (Conan |» auesrens (Corae +1 areca] @ Mahima keiti jimension reduction using Factor Analysis This article delves into the application of unsupervised machine learning through. 4minread » Dec 28,2023 + S2 Q woo ‘See more recommendations Principal component analysis (PCA): Explained and implemented | by Raghavan | Medium FY Huda swati Understanding Principal Component Analysis (PCA) What is PCA? 8minread + Sep 25,2023 ~ Q oo 40 as Feature 2 30 25. @ AneeshaB Soman Linear Discriminant Analysis (LDA) LDA is a supervised dimensionality reduction and classification technique. Its primary... 6minread » Oct 26,2023 Om Qe os htis:Imecium.com/@reghavan99olprincpal-component-analysis-pca-explained-andmplemented.eeab7cb73072 1313
