
Module 03

The Curse of Dimensionality in Machine Learning arises when working with high-dimensional data, leading to increased computational complexity, overfitting, and spurious correlations. Techniques like dimensionality reduction, feature selection, and careful model design are essential for mitigating its effects and improving algorithm performance. Navigating this challenge is crucial for unlocking the potential of high-dimensional datasets and ensuring robust machine-learning solutions.

What is the Curse of Dimensionality?


The Curse of Dimensionality refers to the phenomenon where the efficiency and effectiveness of algorithms deteriorate rapidly as the dimensionality of the data increases.

In high-dimensional spaces, data points become sparse, making it challenging to discern meaningful patterns or relationships due to the vast amount of data required to adequately sample the space.

The Curse of Dimensionality significantly impacts machine learning algorithms in various ways. It leads to increased computational complexity, longer training times, and higher resource requirements. Moreover, it escalates the risk of overfitting and spurious correlations, hindering the algorithms' ability to generalize well to unseen data.

How to Overcome the Curse of Dimensionality?
To overcome the curse of dimensionality, you can consider the following
strategies:

Dimensionality Reduction Techniques:


Feature Selection: Identify and select the most relevant features from the original dataset while discarding irrelevant or redundant ones. This reduces the dimensionality of the data, simplifying the model and improving its efficiency.

Feature Extraction: Transform the original high-dimensional data into a lower-dimensional space by creating new features that capture the essential information. Techniques such as Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE) are commonly used for feature extraction.
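As a rough illustration of these two approaches, the sketch below applies a univariate feature-selection filter and PCA with scikit-learn. The dataset and the numbers of retained features are illustrative assumptions, not choices made in these notes.

```python
# Minimal sketch: feature selection vs. feature extraction with scikit-learn.
# The dataset and the numbers of kept features are illustrative choices only.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA

X, y = load_breast_cancer(return_X_y=True)     # 30-dimensional example data

# Feature selection: keep the 10 features most associated with the labels.
X_selected = SelectKBest(f_classif, k=10).fit_transform(X, y)

# Feature extraction: project onto the top 5 principal components.
X_pca = PCA(n_components=5).fit_transform(X)

print(X.shape, X_selected.shape, X_pca.shape)  # (569, 30) (569, 10) (569, 5)
```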

Data Preprocessing:
Normalization: Scale the features to a similar range to prevent certain
features from dominating others, especially in distance-based algorithms.

Handling Missing Values: Address missing data appropriately through imputation or deletion to ensure robustness in the model training process.
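A minimal preprocessing sketch is shown below, assuming a small numeric matrix with missing values encoded as NaN; the concrete values and the mean-imputation strategy are illustrative assumptions.

```python
# Minimal preprocessing sketch: imputation followed by normalization.
# The toy matrix and the chosen strategies are illustrative assumptions.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, np.nan],
              [3.0, 600.0]])

preprocess = make_pipeline(
    SimpleImputer(strategy="mean"),  # fill missing values with the column mean
    StandardScaler(),                # scale each feature to zero mean, unit variance
)
X_clean = preprocess.fit_transform(X)
print(X_clean)
```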

PCA
As the number of features or dimensions in a dataset increases, the amount of
data required to obtain a statistically significant result increases exponentially.
This can lead to issues such as overfitting, increased computation time, and reduced accuracy of machine learning models. This is known as the curse of dimensionality: the set of problems that arise while working with high-dimensional data.
As the number of dimensions increases, the number of possible combinations
of features increases exponentially, which makes it computationally difficult to
obtain a representative sample of the data. It becomes expensive to perform
tasks such as clustering or classification because the algorithms need to
process a much larger feature space, which increases computation time and
complexity. Additionally, some machine learning algorithms can be sensitive to
the number of dimensions, requiring more data to achieve the same level of
accuracy as lower-dimensional data.

To address the curse of dimensionality, feature engineering techniques are used, which include feature selection and feature extraction. Dimensionality reduction is a type of feature extraction technique that aims to reduce the number of input features while retaining as much of the original information as possible.

In this article, we will discuss one of the most popular dimensionality reduction techniques, i.e., Principal Component Analysis (PCA).

What is Principal Component Analysis (PCA)?
The Principal Component Analysis (PCA) technique was introduced by the mathematician Karl Pearson in 1901. It works on the condition that when data in a higher-dimensional space is mapped to a lower-dimensional space, the variance of the data in the lower-dimensional space should be maximized.

Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of correlated variables into a set of uncorrelated variables. PCA is the most widely used tool in exploratory data analysis and in machine learning for predictive models.

Principal Component Analysis (PCA) is an unsupervised learning technique used to examine the interrelations among a set of variables. It is also known as general factor analysis, where regression determines a line of best fit.

The main goal of Principal Component Analysis (PCA) is to reduce the dimensionality of a dataset while preserving the most important patterns or relationships between the variables, without any prior knowledge of the target variables.

Principal Component Analysis (PCA) is used to reduce the dimensionality of a data set by finding a new set of variables, smaller than the original set of variables, retaining most of the sample's information, and useful for the regression and classification of data.

Principal Component Analysis

1. Principal Component Analysis (PCA) is a technique for dimensionality reduction that identifies a set of orthogonal axes, called principal components, that capture the maximum variance in the data. The principal components are linear combinations of the original variables in the dataset and are ordered in decreasing order of importance. The total variance captured by all the principal components is equal to the total variance in the original dataset.

2. The first principal component captures the most variation in the data, while the second principal component captures the maximum variance that is orthogonal to the first principal component, and so on.

3. Principal Component Analysis can be used for a variety of purposes, including data visualization, feature selection, and data compression. In data visualization, PCA can be used to plot high-dimensional data in two or three dimensions, making it easier to interpret. In feature selection, PCA can be used to identify the most important variables in a dataset. In data compression, PCA can be used to reduce the size of a dataset without losing important information.

4. In Principal Component Analysis, it is assumed that the information is carried in the variance of the features; that is, the higher the variation in a feature, the more information that feature carries.

Overall, PCA is a powerful tool for data analysis and can help to simplify
complex datasets, making them easier to understand and work with.
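The points above can be checked numerically. Below is a minimal scikit-learn sketch on an arbitrary illustrative dataset; the dataset choice and the printed checks are assumptions for demonstration, not part of the original notes.

```python
# Sketch of points 1-4 above: orthogonal components, ordered by explained variance,
# together accounting for all of the variance in the original data.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA()                       # keep all components so their variances can be inspected
X_transformed = pca.fit_transform(X)

print(pca.explained_variance_ratio_)         # decreasing order of importance
print(pca.explained_variance_ratio_.sum())   # ~1.0: total variance is preserved
# Orthogonality check: the principal axes are mutually orthonormal.
print(np.allclose(pca.components_ @ pca.components_.T, np.eye(X.shape[1])))
```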

Step-By-Step Explanation of PCA
(Principal Component Analysis)
Step 1: Find the mean of X and the mean of Y.

Step 2: Build the covariance matrix from Cov(X,X), Cov(Y,Y), and Cov(X,Y); Cov(X,Y) and Cov(Y,X) are the same, so the matrix is symmetric. In the worked example the sum Σ(xᵢ − x̄)(yᵢ − ȳ) comes to 5.539, which is only the numerator, so it must be divided by n − 1 = 9 (the example has n = 10 data points) before being written into the matrix, giving Cov(X,Y) = 5.539/9 ≈ 0.615. Do the same for every entry.

Step 3: Find the eigenvalues λ1 and λ2 of the covariance matrix by solving det(C − λI) = 0.

Step 4: For each eigenvalue (λ2 and λ1 in the worked example), find the corresponding eigenvector. The eigenvector belonging to the largest eigenvalue is the first principal component.
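The worked figures themselves are in the original handwritten pages, but the same calculation can be sketched in NumPy. The ten (x, y) points below are an assumption, chosen because they reproduce the 5.539 numerator quoted above; substitute the actual values from the example if they differ.

```python
# From-scratch sketch of the PCA steps above with NumPy.
# The ten (x, y) points are assumed values consistent with the 5.539 numerator.
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

# Step 1: mean of X and Y.
mean = data.mean(axis=0)
centred = data - mean

# Step 2: covariance matrix (np.cov divides by n - 1 by default).
cov = np.cov(centred, rowvar=False)

# Steps 3-4: eigenvalues and eigenvectors of the (symmetric) covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # sort by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project the centred data onto the first principal component.
projected = centred @ eigvecs[:, :1]
print(cov, eigvals, projected.shape, sep="\n")
```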
LDA

Worked procedure (two classes X1 and X2 of 2-D points):

Step 1: Find the means. μ1 is the mean vector of class X1, i.e. {mean of all x-coordinates of X1, mean of all y-coordinates of X1}; μ2 is computed the same way for class X2.

Step 2: Calculate the scatter matrices S1 and S2 (S1 relates to class X1, S2 to class X2). For each point of X1, find its difference from the mean μ1, treat that difference as a column vector, multiply it by its transpose, and add up all the resulting 2×2 matrices; the sum is S1. Do the same for the points of X2 to obtain S2.
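Here is a NumPy sketch of that scatter-matrix computation. The two small class samples are assumed illustrative values, not the ones from the original worked pages, and the final line (the discriminant direction) is the standard next step rather than something stated explicitly in the notes above.

```python
# NumPy sketch of the LDA scatter-matrix procedure above.
# X1 and X2 are assumed illustrative class samples.
import numpy as np

X1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
X2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)   # class mean vectors

def scatter(X, mu):
    """Sum of (x - mu)(x - mu)^T over all points of one class."""
    diffs = X - mu
    return diffs.T @ diffs                    # equals the sum of outer products

S1, S2 = scatter(X1, mu1), scatter(X2, mu2)
SW = S1 + S2                                  # within-class scatter matrix

# Standard next step: the discriminant direction w = SW^(-1) (mu1 - mu2).
w = np.linalg.solve(SW, mu1 - mu2)
print(S1, S2, w, sep="\n")
```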
Linear Discriminant Analysis (LDA), also known as Normal Discriminant Analysis
or Discriminant Function Analysis, is a dimensionality reduction technique
primarily utilized in supervised classification problems. It facilitates the
modeling of distinctions between groups, effectively separating two or more
classes. LDA operates by projecting features from a higher-dimensional space
into a lower-dimensional one. In machine learning, LDA serves as a supervised
learning algorithm specifically designed for classification tasks, aiming to
identify a linear combination of features that optimally segregates classes
within a dataset.

For example, suppose we have two classes and we need to separate them efficiently. Classes can have multiple features. Using only a single feature to classify them may result in some overlap between the two classes, so we keep on increasing the number of features for proper classification.

Assumptions of LDA
LDA assumes that the data has a Gaussian distribution and that
the covariance matrices of the different classes are equal. It also assumes that
the data is linearly separable, meaning that a linear decision boundary can
accurately classify the different classes.
Suppose we have two sets of data points belonging to two different classes
that we want to classify. As shown in the given 2D graph, when the data points
are plotted on the 2D plane, there’s no straight line that can separate the two
classes of data points completely. Hence, in this case, LDA (Linear Discriminant
Analysis) is used which reduces the 2D graph into a 1D graph in order to
maximize the separability between the two classes.

(Figure: a linearly separable dataset)
Here, Linear Discriminant Analysis uses both axes (X and Y) to create a new
axis and projects data onto a new axis in a way to maximize the separation of
the two categories and hence, reduces the 2D graph into a 1D graph.
Two criteria are used by LDA to create a new axis:

1. Maximize the distance between the means of the two classes.

2. Minimize the variation within each class.

(Figure: the perpendicular distance between the new axis and the data points)
In the above graph, it can be seen that a new axis (in red) is generated and
plotted in the 2D graph such that it maximizes the distance between the means
of the two classes and minimizes the variation within each class. In simple
terms, this newly generated axis increases the separation between the data
points of the two classes. After generating this new axis using the above-
mentioned criteria, all the data points of the classes are plotted on this new axis
and are shown in the figure given below.

But Linear Discriminant Analysis fails when the means of the distributions are shared, as it becomes impossible for LDA to find a new axis that makes both classes linearly separable. In such cases, we use non-linear discriminant analysis.

How does LDA work?

LDA works by projecting the data onto a lower-dimensional space that
maximizes the separation between the classes. It does this by finding a set of
linear discriminants that maximize the ratio of between-class variance to
within-class variance. In other words, it finds the directions in the feature space
that best separate the different classes of data.
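Concretely, the ratio being maximized is the Fisher criterion J(w) = (wᵀ S_B w) / (wᵀ S_W w), where S_B and S_W are the between-class and within-class scatter matrices. The sketch below shows the same idea through scikit-learn's implementation; the dataset is an arbitrary illustrative choice.

```python
# Sketch of LDA as supervised dimensionality reduction with scikit-learn.
# The wine dataset (13 features, 3 classes) is an illustrative choice only.
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_wine(return_X_y=True)

# With 3 classes, LDA can project onto at most 3 - 1 = 2 discriminant axes.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)

print(X.shape, X_lda.shape)           # (178, 13) -> (178, 2)
print(lda.explained_variance_ratio_)  # share of between-class variance per axis
```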

SVD:
The Singular Value Decomposition (SVD) of a matrix is a factorization of that
matrix into three matrices. It has some interesting algebraic properties and
conveys important geometrical and theoretical insights about linear
transformations. It also has some important applications in data science. In this
article, I will try to explain the mathematical intuition behind SVD and its
geometrical meaning.
Mathematics behind SVD:
The SVD of an m×n matrix A is given by the formula A = U Σ Vᵀ

where:

U: an m×m matrix of the orthonormal eigenvectors of AAᵀ.

Vᵀ: the transpose of an n×n matrix containing the orthonormal eigenvectors of AᵀA.

Σ: a diagonal matrix with r elements equal to the square roots of the positive eigenvalues of AAᵀ or AᵀA (both matrices have the same positive eigenvalues anyway).
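A quick numerical check of this factorization with NumPy is sketched below; the 3×2 matrix is an arbitrary illustrative example.

```python
# Numerical check of A = U Sigma V^T with NumPy on an assumed 3x2 example matrix.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=True)   # U: 3x3, s: singular values, Vt: 2x2

# Rebuild the m x n diagonal matrix Sigma from the singular values.
Sigma = np.zeros_like(A)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vt))             # True: the factorization holds
# Singular values are the square roots of the positive eigenvalues of A^T A.
print(np.allclose(s**2, np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]))
```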
