
FEATURE EXTRACTION

-SAHENI PATRA

RAJA NARENDRA LAL KHAN WOMEN'S COLLEGE (AUTONOMOUS)

Subject: Machine Learning (ML)
Semester: 5th
Roll No: 204239
Registration No: 1390757
Student ID: 2020-1469
Department: Computer Science
Session: 2021-2022
TABLE OF CONTENTS
• What is Feature Extraction
• Types of Methods
• What is PCA
• Important Terminologies
• How PCA Works
• Standardization
• Covariance Matrix Computation
• Eigenvalues and Eigenvectors
• Feature Vector
• Applications of PCA
• Factor Analysis
• Singular Value Decomposition
WHAT IS FEATURE EXTRACTION?
Feature extraction means selecting a particular set of features. For training, we extract the features and feed them to the machine learning model as input.

Feature extraction involves transforming high-dimensional data into a lower-dimensional space. The high-dimensional space contains all the attributes of the original dataset, while the lower-dimensional space keeps only the relevant features.

Consider this example: given an image of a bike, a feature extraction algorithm identifies its different parts (wheels, frame, handlebars) as subsets of the image; extracting these subsets is feature extraction.
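As a minimal sketch of this idea (assuming scikit-learn and its digits dataset, which are not part of the slides), the 64 pixel features of each digit image can be transformed into just 10 derived features:

```python
# A minimal sketch of feature extraction as dimensionality reduction,
# using scikit-learn's 8x8 digits dataset (64 raw pixel features).
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data          # shape (1797, 64): the high-dimensional space
pca = PCA(n_components=10)      # keep only 10 derived features (illustrative choice)
X_small = pca.fit_transform(X)  # shape (1797, 10): the lower-dimensional space

print(X.shape, "->", X_small.shape)
```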
TYPES OF METHODS

• Principal Component Analysis (PCA)

• Factor Analysis (FA)

• Singular Value Decomposition (SVD)


WHAT IS PCA?
PCA is an unsupervised learning algorithm used for dimensionality reduction in machine learning. It is a statistical process that converts a set of observations of correlated features into a set of linearly uncorrelated features with the help of an orthogonal transformation.

Here, we take a very complex dataset with many variables, run it through PCA, and reduce its dimensionality.

• Here, for ease, suppose we obtain two principal components, PC1 and PC2.

• Comparing the two principal components, we find that the data points are sufficiently spread out along PC1.

• Along PC2 they are less spread out, which makes observation and further calculation much more difficult. Therefore, we accept PC1 and not PC2, as the data points are more spread out along it.
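A small sketch of this comparison (the synthetic data and variance check are illustrative assumptions, not from the slides):

```python
# Sketch: PC1 captures far more spread (variance) than PC2 on correlated data,
# which is why PC1 is the component we keep.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x = rng.normal(size=500)
X = np.column_stack([x, x + 0.1 * rng.normal(size=500)])  # two correlated features

pca = PCA(n_components=2).fit(X)
print("variance explained by PC1, PC2:", pca.explained_variance_ratio_)
# PC1's share is close to 1.0: points are widely spaced along PC1
# and barely spread along PC2.
```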
IMPORTANT TERMINOLOGIES
HOW PCA WORKS?
The steps to perform PCA are as follows (a NumPy sketch of these steps appears after the list):
1. Standardize the data.
2. Compute the covariance matrix of the features from the dataset.
3. Perform eigendecomposition on the covariance matrix.
4. Order the eigenvectors in decreasing order based on the magnitude of their corresponding eigenvalues.
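A minimal NumPy sketch of these four steps on toy data (the data itself is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))          # toy dataset: 100 samples, 5 features

# 1. Standardize the data.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Compute the covariance matrix of the features.
cov = np.cov(Z, rowvar=False)          # shape (5, 5)

# 3. Eigendecomposition of the symmetric covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Order eigenvectors by decreasing eigenvalue magnitude.
order = np.argsort(np.abs(eigenvalues))[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

print("sorted eigenvalues:", eigenvalues.round(3))
```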
STANDARDIZATION
In this step, we standardize our dataset. Without standardization, the features with high variance in a particular column would appear more important than the features with lower variance, purely because of their scale.

If the importance of features is independent of their variance, we divide each data item in a column by the standard deviation of that column. We name the resulting matrix Z.

The process involves removing the mean from the variable values and scaling the data with respect to the standard deviation.
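A short sketch of this step with NumPy (the small matrix is an illustrative assumption):

```python
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Remove the mean and scale by the standard deviation, column by column.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
print(Z.mean(axis=0).round(6))  # each column now has mean 0...
print(Z.std(axis=0))            # ...and standard deviation 1
```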
COVARIANCE MATRIX COMPUTATION
The covariance matrix is used to express the correlation between any two or more attributes in a multidimensional dataset.
• Positive covariance indicates that the value of one variable increases as the other increases.
• Negative covariance indicates that the value of one variable decreases as the other increases.
(The slide shows a covariance table for more than two attributes in a multidimensional dataset.)
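A sketch of computing such a covariance matrix with NumPy (synthetic data, chosen so that both signs of covariance appear):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = 2 * a + rng.normal(size=200)   # b rises with a    -> positive covariance
c = -a + rng.normal(size=200)      # c falls as a rises -> negative covariance

data = np.column_stack([a, b, c])
cov = np.cov(data, rowvar=False)   # 3x3 covariance matrix of the attributes
print(cov.round(2))                # cov[0, 1] > 0 and cov[0, 2] < 0
```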
EIGENVALUES & EIGENVECTORS
Eigenvalues and eigenvectors are mathematical values extracted from the covariance matrix. We now calculate the eigenvalues and eigenvectors of the covariance matrix of Z. The eigenvectors of the covariance matrix are the directions of the axes with the most information, and the corresponding eigenvalues measure the amount of variance carried along each of those directions.

• Eigenvectors do not change direction under the linear transformation defined by the matrix; they are only scaled.

• Eigenvalues are the scalars by which the eigenvectors are scaled, i.e. their magnitudes.
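A small sketch checking both properties on a symmetric matrix (the matrix is an illustrative assumption):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                    # a small symmetric (covariance-like) matrix
eigenvalues, eigenvectors = np.linalg.eigh(A)

v, lam = eigenvectors[:, 1], eigenvalues[1]   # largest eigenpair
# The transformation only scales the eigenvector; its direction is unchanged.
print(np.allclose(A @ v, lam * v))            # True
print("scaling factor (eigenvalue):", lam)
```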
FEATURE VECTORS
A feature vector is simply a matrix whose columns are the eigenvectors of the components we decide to keep.

Here we decide whether to keep or disregard the less significant principal components generated in the previous steps.
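A sketch of building the feature vector and projecting the standardized data onto it, reusing the names from the earlier steps (Z, cov, and the eigenvectors are recomputed here so the snippet stands alone):

```python
import numpy as np

rng = np.random.default_rng(7)
Z = rng.normal(size=(100, 5))                 # stand-in for standardized data
cov = np.cov(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]

k = 2                                         # components we decide to keep
feature_vector = eigenvectors[:, order[:k]]   # (5, 2): kept eigenvectors as columns
Z_reduced = Z @ feature_vector                # (100, 2): data in the reduced space
print(Z_reduced.shape)
```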
APPLICATIONS OF PCA IN MACHINE LEARNING
• PCA is used to visualize multidimensional data.
• It is used to reduce the number of dimensions in healthcare data.
• PCA can help compress and resize images.
• PCA helps to find patterns in high-dimensional datasets.
FACTOR ANALYSIS

Just like PCA, Factor Analysis is a model that reduces the information in a larger number of variables into a smaller number of variables. In Factor Analysis these are called "latent variables".

Factor Analysis is based on the common factor model. It starts from the principle that there are a certain number of factors in a dataset, and that each of the measured variables captures a part of one or more of those factors.
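A minimal Factor Analysis sketch with scikit-learn, assuming six measured variables driven by two latent factors (the synthetic data is purely illustrative):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))             # the hidden common factors
loadings = rng.normal(size=(2, 6))             # how each variable loads on them
X = latent @ loadings + 0.1 * rng.normal(size=(300, 6))

fa = FactorAnalysis(n_components=2).fit(X)
print(fa.components_.shape)                    # (2, 6): estimated factor loadings
```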
SINGULAR VALUE DECOMPOSITION

Singular Value Decomposition (SVD) is a widely used technique to decompose a matrix into several component matrices, exposing many of the useful and interesting properties of the original matrix.

An SVD analysis yields a more compact representation of the correlations present in the original matrix.
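A short NumPy sketch of the decomposition and an exact reconstruction check (the matrix is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 4))

# A = U @ diag(s) @ Vt, with s holding the singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)           # (6, 4) (4,) (4, 4)
print(np.allclose(A, U @ np.diag(s) @ Vt))  # True: the components rebuild A
```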
THANK YOU
