6CS4-02 ML PPT Unit-3
Machine Learning
(6CS4-02)
Unit-III
AIET, Jaipur MACHINE LEARNING CS-VI Sem
Reference Books
T1. Pattern Recognition and Machine Learning — Christopher M. Bishop, Springer
Introduction to Statistical Learning Theory
1. Feature Extraction
   1. Principal component analysis
   2. Singular value decomposition
2. Feature selection
3. Feature ranking
   1. Subset selection
   2. Filter
   3. Wrapper
   4. Embedded methods
4. Evaluating Machine Learning algorithms
5. Model Selection
Dimensionality Reduction
Reducing the dimension of the feature space is called “dimensionality reduction.”
There are many ways to achieve dimensionality reduction, but most of these
techniques fall into one of two classes:
Feature Selection
Feature Extraction
Feature Extraction
Feature extraction is a process of dimensionality
reduction by which an initial set of raw data is reduced to
more manageable groups for processing. A characteristic
of these large data sets is a large number of variables that
require a lot of computing resources to process.
1. Do you want to reduce the number of variables, but aren’t able to identify variables
to completely remove from consideration?
2. Do you want to ensure your variables are independent of one another?
3. Are you comfortable making your independent variables less interpretable?
If you answered “yes” to all three questions, then PCA is a good method to use. If you
answered “no” to question 3, you should not use PCA.
By projecting our data into a smaller space, we reduce the dimensionality of our feature
space… but because each new "direction" is a linear combination of the original variables,
information from all of the original variables is retained in the model.
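The projection described above can be sketched in pure Python for the two-dimensional case. This is a minimal illustration, not production PCA (in practice one would use a library routine such as sklearn.decomposition.PCA); the function name and closed-form 2x2 eigen-decomposition are choices made here for clarity:

```python
import math

def pca_2d(points):
    """Project 2-D points onto their first principal component.

    Steps: center the data, form the 2x2 sample covariance matrix,
    take its leading eigenvector in closed form, and project.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    xs = [p[0] - mx for p in points]
    ys = [p[1] - my for p in points]
    # Sample covariance entries (divisor n - 1)
    sxx = sum(x * x for x in xs) / (n - 1)
    syy = sum(y * y for y in ys) / (n - 1)
    sxy = sum(x * y for x, y in zip(xs, ys)) / (n - 1)
    # Leading eigenvalue of [[sxx, sxy], [sxy, syy]]
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    lam = tr / 2 + math.sqrt(tr * tr / 4 - det)
    # Corresponding eigenvector (handle the axis-aligned case)
    if abs(sxy) > 1e-12:
        v = (lam - syy, sxy)
    else:
        v = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    v = (v[0] / norm, v[1] / norm)
    # 1-D scores: projection of each centered point onto v
    return [x * v[0] + y * v[1] for x, y in zip(xs, ys)]
```

For points that already lie on a line, the first component captures all of the variance and the scores are simply signed distances along that line.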
“The mean value of the product of the deviations of two variates from their
respective means.”
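The quoted definition translates directly into code. A sketch (the quote averages over n; the version below uses the unbiased n - 1 divisor commonly applied to samples):

```python
def covariance(xs, ys):
    """Sample covariance: the product of each pair's deviations from
    their respective means, averaged (here with divisor n - 1)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
```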
References
PCA:
https://medium.com/analytics-vidhya/understanding-principle-component-analysis-pca-step-by-step-e7a4bb4031d9
SVD:
https://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
Feature Selection
Overview
• Why we need FS:
1. To improve performance (in terms of speed, predictive power, simplicity of the model).
2. To visualize the data for model selection.
3. To reduce dimensionality and remove noise.
Perspectives
1. searching for the best subset of features.
Perspectives:
Search of a Subset of Features
• FS can be considered a search problem, where
each state of the search space corresponds to a
concrete subset of selected features.
• The selection can be represented as a binary
array, with each element set to 1 if the
corresponding feature is currently selected by the
algorithm, and 0 if it is not.
• There are 2^M possible subsets in total, where M is
the number of features in the data set.
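The binary-mask view can be sketched with a small, illustrative helper (the function name is hypothetical) that enumerates every candidate subset, making the 2^M blow-up concrete:

```python
from itertools import product

def all_feature_subsets(features):
    """Enumerate every subset of a feature list as a (mask, subset) pair.

    Each mask is a binary tuple: bit j is 1 if feature j is selected.
    A data set with M features yields 2**M candidate subsets, which is
    why exhaustive search quickly becomes infeasible.
    """
    subsets = []
    for mask in product([0, 1], repeat=len(features)):
        chosen = [f for f, bit in zip(features, mask) if bit == 1]
        subsets.append((mask, chosen))
    return subsets
```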
Perspectives:
Search of a Subset of Features
Search Space: [figure]
Perspectives:
Search of a Subset of Features
• Search Directions:
– Sequential Forward Generation (SFG): It starts with an empty set of features
S. As the search proceeds, features are added to S according to some criterion
that distinguishes the best feature from the others. S grows until it reaches the
full set of original features. The stopping criterion can be a threshold on the
number of relevant features m, or simply the generation of all possible
subsets in brute-force mode.
Perspectives:
Search of a Subset of Features
• Search Directions:
– Bidirectional Generation (BG): Begins the search in both directions,
performing SFG and Sequential Backward Generation (SBG, which starts
from the full feature set and removes features) concurrently. The searches
stop in two cases: (1) when one search finds the best subset of m features
before reaching the exact middle of the search space, or (2) when both
searches reach the middle. BG takes advantage of both SFG and SBG.
Perspectives:
Selection Criteria
– Information Measures.
• Information serves to measure the uncertainty of the
receiver when she/he receives a message.
• Shannon’s Entropy:
• Information gain:
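The two formulas referenced above did not survive the slide export; they are standard and can be reconstructed as:

```latex
% Shannon's entropy of a class variable Y
H(Y) = -\sum_{y} p(y) \log_2 p(y)

% Information gain of a feature X with respect to Y
IG(Y; X) = H(Y) - H(Y \mid X)
         = H(Y) - \sum_{x} p(x)\, H(Y \mid X = x)
```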
Perspectives:
Selection Criteria
– Dependence Measures.
• Also known as measures of association or correlation.
• Their main goal is to quantify how strongly two variables
are correlated, or present some association with each
other, such that knowing the value of one of them,
we can predict the value of the other.
• Pearson correlation coefficient:
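The coefficient's formula was lost in the export; it is the covariance of the two variables divided by the product of their standard deviations. A pure-Python sketch:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient in [-1, 1]:
    covariance(x, y) / (std(x) * std(y))."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```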
Perspectives:
Selection Criteria
– Consistency Measures.
• They attempt to find a minimum number of features that
separate the classes as well as the full set of features can.
Perspectives
• Filters:
– Measuring uncertainty, distances, dependence or
consistency is usually cheaper than measuring the
accuracy of a learning process, so filter methods are
usually faster.
– They do not rely on a particular learning bias, so the
selected features can be used to learn different
models with different DM techniques.
– They can handle larger data sets, due to the simplicity and
low time complexity of the evaluation measures.
Perspectives
• Wrappers:
– Can achieve the purpose of improving the particular
learner's predictive performance.
– Use internal statistical validation to control
overfitting, ensembles of learners, and
hybridizations with heuristic learning such as Bayesian
classifiers or Decision Tree induction.
– Filter models cannot allow a learning algorithm to
fully exploit its bias, whereas wrapper methods can.
Perspectives
• Embedded FS:
– Similar to the wrapper approach in that the
features are specifically selected for a certain
learning algorithm, but here the features
are selected during the learning process itself.
– They can take better advantage of the available data by
not requiring the training data to be split into
training and validation sets, and can reach a
solution faster by avoiding the re-training of a
predictor for each feature subset explored.
Comparison
• Filters: a much faster alternative. Filters do not test any particular
algorithm, but rank the original features according to their
relationship with the problem (the labels) and simply select the top of
them. Correlation and mutual information are the most widespread
criteria. There are many easy-to-use tools, such as scikit-learn's
feature_selection module.
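As a minimal illustration of the filter idea (scikit-learn's feature_selection module, e.g. SelectKBest, provides production-ready versions), a pure-Python correlation filter might look like this; the function names are choices made here, not library APIs:

```python
import math

def correlation_filter(X, y, k):
    """Rank features by |Pearson correlation| with the labels and
    return the indices of the top k. X is a list of sample rows."""
    def pearson(col, y):
        n = len(y)
        mc, my = sum(col) / n, sum(y) / n
        cov = sum((c - mc) * (t - my) for c, t in zip(col, y))
        sc = math.sqrt(sum((c - mc) ** 2 for c in col))
        sy = math.sqrt(sum((t - my) ** 2 for t in y))
        return cov / (sc * sy) if sc and sy else 0.0  # constant column -> 0

    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        scores.append((abs(pearson(col, y)), j))
    scores.sort(reverse=True)          # highest |correlation| first
    return [j for _, j in scores[:k]]
```

Note that this evaluates each feature independently of the learner, which is exactly what makes filters fast but also blind to feature interactions.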
Aspects:
Output of Feature Selection
• Feature Ranking Techniques:
– We expect as output a ranked list of features,
ordered according to an evaluation measure.
– They return the relevance of each feature.
– To perform actual FS, the simplest way is to
choose the first m features for the task at hand,
whenever an appropriate value of m is known.
Aspects:
Output of Feature Selection
• Minimum Subset Techniques:
– The number of relevant features is a parameter
that is often not known to the practitioner.
– Hence a second category of techniques focuses
on obtaining the minimum possible subset,
without ordering the features.
– Whatever falls within the subset is relevant;
everything outside it is irrelevant.
Aspects:
Evaluation
• Goals:
– Inferability: For predictive tasks, an improvement in
the prediction of unseen examples with respect to
direct usage of the raw training data.
– Interpretability: Since raw data is hard for humans to
comprehend, DM is also used to generate a more
understandable representation of structure that can explain
the behavior of the data.
– Data Reduction: It is better and simpler to handle data
with lower dimensions, in terms of both efficiency and
interpretability.
Aspects:
Evaluation
• We can derive assessment measures
from these three goals:
– Accuracy
– Complexity
Aspects:
Drawbacks
• The resulting subsets of many FS models are strongly dependent
on the training set size.
• It is not always true that a high-dimensional input can be
reduced to a small subset of features, because the target
feature may be related to many input features, and the
removal of any of them would seriously affect learning
performance.
• A backward-removal strategy is very slow when working with
large-scale data sets, because in the first stages of the
algorithm it has to make decisions based on huge quantities of
data.
• In some cases, the FS outcome is still left with a relatively
large number of relevant features, which can inhibit the use of
complex learning methods.
Aspects:
Using Decision Trees for FS
• Decision trees can be used to implement a
trade-off between the performance of the
selected features and the computation time
which is required to find a subset.
Loss/Objective function
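The slide's formula did not survive the export. As one representative example (an assumption here, not necessarily the slide's original), the mean squared error objective used in regression:

```python
def mse_loss(y_true, y_pred):
    """Mean squared error: the average of squared differences
    between targets and predictions. Lower is better; minimizing
    this objective drives the model's parameter updates."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
```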
High bias implies our estimate based on the observed data is not close to the true parameter
(aka underfitting).
High variance implies our estimates are sensitive to sampling: they will vary a lot if computed
on a different sample of data (aka overfitting).
Holdout validation
Within holdout validation we have 2 choices: Single holdout and repeated holdout.
a) Single Holdout
Implementation
The basic idea is to split our data into a training set and a holdout test set. Train the model on the
training set and then evaluate model performance on the test set. We take only a single holdout,
hence the name. Let's walk through the steps:
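The step-by-step figures did not survive the export; a minimal pure-Python sketch of the single-holdout split (the 80/20 ratio and fixed seed are illustrative choices, not prescribed by the slides):

```python
import random

def train_test_split(data, test_ratio=0.2, seed=42):
    """Single holdout: shuffle once, carve off a test set,
    and keep the rest for training."""
    rng = random.Random(seed)   # fixed seed -> reproducible split
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

# Usage: fit a model on `train`, then evaluate it once on `test`.
train, test = train_test_split(list(range(100)), test_ratio=0.2)
```

Because the test set is touched only once, its score is an unbiased estimate of performance on unseen data; the drawback is that the estimate depends on which examples happened to land in the holdout.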
Resources:
https://heartbeat.fritz.ai/model-evaluation-selection-i-30d803a44ee
https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-machine-learning-tips-and-tricks
https://docs.microsoft.com/en-us/azure/machine-learning/media/algorithm-cheat-sheet/machine-learning-algorithm-cheat-sheet.svg