1 Machine Learning
Fig 1: The plots display sales (in thousands of units) as a function of the TV, Radio, and Newspaper advertising budgets (in thousands of dollars) across 200 markets.
Input – Output Variables
• In the advertising data, the TV, Radio, and Newspaper budgets are the input variables, typically denoted X; sales is the output variable, denoted Y.
Independent vs Dependent Variables
• Input variables are also called predictors, features, or independent variables; the output variable is also called the response or dependent variable.
Function Approximation
• Let us consider another dataset, the income dataset.
• The left-hand panel shows the plot of income versus years of education for 30 individuals.
• Using the plot, you may be able to predict income given the years of education.
• But the function that relates income to years of education is not known.
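As a rough illustration of estimating such an unknown function, the sketch below fits a straight line to hypothetical (education, income) pairs; all values are invented, since the actual income dataset is not reproduced here.

import numpy as np

# Hypothetical (years of education, income in $1000s) pairs -- invented values,
# standing in for the 30 individuals in the income dataset.
years = np.array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=float)
income = np.array([22, 24, 27, 31, 36, 42, 50, 58, 65, 73], dtype=float)

# One simple estimate f_hat of the unknown f: a least-squares straight line.
coef = np.polyfit(years, income, deg=1)
f_hat = np.poly1d(coef)

print(f_hat(16.0))   # predicted income for 16 years of education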
Function Approximation
• We assume there is some fixed but unknown function f such that Y = f(X) + ε, where ε is a random error term that is independent of X and has mean zero.
• Machine learning methods aim to estimate f; we denote the estimate by f̂ and the resulting prediction by Ŷ = f̂(X).
Why Approximate f
• There are two main reasons why we may wish to estimate f: prediction and inference.
Prediction
• In prediction, we use the estimate f̂ to predict the response: Ŷ = f̂(X). Here f̂ is often treated as a black box; we care about the accuracy of Ŷ, not about the exact form of f̂.
Reducible – Irreducible Errors
• The accuracy of Ŷ as a prediction for Y depends on two quantities: the reducible error and the irreducible error.
• f̂ will not be a perfect estimate of f; the resulting error is reducible, because we can potentially improve the accuracy of f̂ by using a better machine learning method.
• Even if we could estimate f perfectly, the prediction would still contain error, because Y also depends on the noise ε, which cannot be predicted from X. This is the irreducible error.
• Formally, for a given f̂ and X: E[(Y − Ŷ)²] = [f(X) − f̂(X)]² + Var(ε), where the first term is reducible and the second is irreducible.
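A minimal simulation (with a made-up f and noise level) shows why the irreducible error sets a floor on prediction accuracy: even the perfect estimate f̂ = f cannot do better than Var(ε).

import numpy as np

rng = np.random.default_rng(0)

def f(x):                                   # made-up "true" function
    return 3.0 + 2.0 * x

x = rng.uniform(0, 10, size=100_000)
eps = rng.normal(0.0, 2.0, size=x.size)     # Var(eps) = 4
y = f(x) + eps

# Even the perfect estimate f_hat = f cannot beat the irreducible error.
mse_perfect = np.mean((y - f(x)) ** 2)
print(mse_perfect)                          # close to Var(eps) = 4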
Inference
• In inference, we want to understand how Y changes as a function of X₁, ..., Xₚ, so f̂ cannot be treated as a black box.
• Typical questions: Which predictors are associated with the response? What is the relationship between the response and each predictor? Can the relationship be adequately summarized by a linear model, or is it more complicated?
• For example, in the advertising data we may ask which media contribute to sales, and how large an increase in sales is associated with a given increase in TV advertising.
Parametric Approach
• Parametric methods reduce the problem of estimating f to the problem of estimating a set of parameters. They involve a two-step, model-based approach.
• Step 1: Make an assumption about the functional form of f. For example, assume f is linear in X: f(X) = β₀ + β₁X₁ + β₂X₂ + ... + βₚXₚ.
• Step 2: Use the training data to fit (train) the model, i.e., estimate the parameters β₀, β₁, ..., βₚ. The most common fitting approach for the linear model is ordinary least squares.
• The main disadvantage is that the chosen form will usually not match the true unknown f; if the model is too far from the true f, our estimate will be poor.
• We can choose more flexible parametric models, but these require estimating more parameters and can lead to overfitting: following the errors, or noise, too closely. A sketch of the two-step approach follows this list.
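The sketch below illustrates the two-step parametric approach with a hypothetical linear data-generating process; all values and parameter settings are invented for illustration.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training data: two predictor budgets and a response (invented).
X = rng.uniform(0, 100, size=(200, 2))            # 200 markets, 2 predictors
beta_true = np.array([5.0, 0.05, 0.10])           # made-up "true" parameters
y = beta_true[0] + X @ beta_true[1:] + rng.normal(0, 1, size=200)

# Step 1 assumed f(X) = beta_0 + beta_1*X_1 + beta_2*X_2.
# Step 2: estimate the parameters by ordinary least squares.
X_design = np.column_stack([np.ones(len(X)), X])  # add intercept column
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(beta_hat)                                   # should be close to beta_true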
Non-parametric Approach
• Non-parametric methods do not make explicit assumptions about the functional form of f.
• Instead, they seek an estimate of f that gets as close to the data points as possible without being too rough or too wiggly.
• Advantage: by avoiding an assumed functional form, they can fit a much wider range of possible shapes for f.
• Disadvantage: since the problem is not reduced to estimating a small number of parameters, a very large number of observations is needed to obtain an accurate estimate of f.
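One classic non-parametric method is k-nearest-neighbours regression; the sketch below (on invented data) estimates f at a point by averaging the responses of nearby training points, assuming no functional form for f.

import numpy as np

rng = np.random.default_rng(2)

# Hypothetical one-dimensional training data (values invented for illustration).
x_train = rng.uniform(0, 10, size=500)
y_train = np.sin(x_train) + rng.normal(0, 0.3, size=500)

def knn_predict(x0, k=10):
    """k-nearest-neighbours regression: average the responses of the k
    training points closest to x0 -- no functional form assumed for f."""
    nearest = np.argsort(np.abs(x_train - x0))[:k]
    return y_train[nearest].mean()

print(knn_predict(2.5))   # close to sin(2.5) ~ 0.60 given enough data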
Prediction Accuracy vs Model Interpretability
Trade Off
• Of the many machine learning methods available, some are less flexible (more restrictive) and some are more flexible.
• In general, as the flexibility of a method increases, its interpretability decreases.
• For example, linear regression is a relatively inflexible approach, but it is highly interpretable: the effect of each predictor on the response is summarized by a single coefficient.
Trade Off
• On the other hand, generalized additive models (GAMs) extend the
linear model to allow for certain non-linear relationships.
• Hence, GAMs are more flexible than linear regression.
• However, they are less interpretable than linear regression, because
the relationship between each predictor and the response is now
modeled using a curve.
• Finally, fully non-linear methods such as bagging, boosting, and
support vector machines with non-linear kernels are highly flexible
approaches that are harder to interpret.
Trade Off
• Hence, we can state that when inference is the goal, we should use
simple and relatively inflexible machine learning methods.
• However, there might be situations when we are only interested in prediction, not in the interpretability of the predictive model.
• For instance, if we want to build a model to predict the price of a stock, we would be interested in an algorithm that predicts accurately; interpretability is not a concern.
• In such cases, we should use the most flexible model available.
Assessing Model Accuracy
No Free Lunch Theorem
• During this course we will introduce a wide range of machine learning models.
• These models are more complex than the standard linear regression approach.
• The question is: why do we need so many different machine learning approaches, rather than a single best method?
• In statistics and machine learning, the no free lunch theorem applies: no one method dominates all others over all possible data sets.
• For a given data set, one specific approach may give the best results, but another approach may give better results on a similar but different data set.
• Hence, for each data set we need to explore and decide which approach provides the best results.
• Selecting the approach that performs best on a given data set is one of the most challenging parts of machine learning.
Measuring Quality of Fit
• To evaluate the performance of a machine learning method, we need to measure how well its predictions match the observed data.
• In regression, the most commonly used measure is the mean squared error (MSE): MSE = (1/n) Σᵢ₌₁ⁿ (yᵢ − f̂(xᵢ))², which is small when the predicted responses are close to the true responses.
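In code, the MSE is a one-liner; a minimal sketch:

import numpy as np

def mse(y, y_hat):
    """Mean squared error between observed responses y and predictions y_hat."""
    return np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)

print(mse([3.0, 5.0], [2.5, 5.5]))   # (0.25 + 0.25) / 2 = 0.25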
Training MSE
• We compute the MSE using the training data that we used to fit the model; hence, we call it the training MSE.
• However, in practice we are not concerned with the performance of the model on the training data.
• Rather, we are interested in the accuracy of the predictions we get on previously unseen test data.
• Why are we interested in unseen test data rather than training data?
• Suppose our goal is to develop a machine learning model to predict the stock price based on historical stock returns.
• We can use the last 6 months of stock return data to train our model.
• We would not be interested in how well the model predicts the stock price for a past date.
• Rather, we would be interested in how well the model can predict the stock price the next day or the next month.
Training MSE
• Similarly, suppose we have clinical data that includes weight, blood pressure, height, age, and family history of disease for a number of patients.
• We also have information about whether each patient has diabetes.
• This data can be used to train a machine learning model to predict the risk of diabetes based on clinical observations.
• In practice, we are interested in accurately predicting diabetes risk for future patients based on their clinical observations.
• We do not care how accurately the model predicts diabetes risk for the patients used to train the model.
• We already know which of those patients have diabetes.
Test MSE
• Suppose (x₀, y₀) is a previously unseen test observation that was not used to train the model.
• The test MSE is the average squared prediction error over such test observations: Ave(y₀ − f̂(x₀))².
• We want to choose the method that gives the lowest test MSE, not the lowest training MSE.
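The sketch below makes the training/test distinction concrete: it fits polynomials of increasing flexibility to hypothetical data and reports both MSEs. The data-generating function, noise level, and sample sizes are invented for illustration.

import numpy as np

rng = np.random.default_rng(3)

def f(x):                                  # made-up "true" function
    return np.sin(1.5 * x)

def sample(n):
    x = rng.uniform(0, 5, size=n)
    return x, f(x) + rng.normal(0, 0.3, size=n)

x_train, y_train = sample(50)
x_test, y_test = sample(1000)              # previously unseen observations

for degree in [1, 3, 10, 20]:              # increasing flexibility
    f_hat = np.poly1d(np.polyfit(x_train, y_train, deg=degree))
    train_mse = np.mean((y_train - f_hat(x_train)) ** 2)
    test_mse = np.mean((y_test - f_hat(x_test)) ** 2)
    print(degree, round(train_mse, 3), round(test_mse, 3))

# Training MSE keeps falling as the degree grows, while test MSE
# falls at first and then rises again once the model overfits.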
Model Selection
• How do we select a model that minimizes the test MSE?
• In certain situations, a test data set might be available: a set of observations that we did not use to train the machine learning method.
• In this case, we can evaluate the models on the test observations and select the model with the smallest test MSE.
Model Selection
• On the other hand, in certain situations, the test observations are not
available.
• In such situations, we can select the model with the smallest training
MSE.
• Even though the training MSE and test MSE appear to be closely
related, there is no guarantee that the model with the lowest training
MSE will also have the lowest test MSE.
• For many machine learning methods, the training set MSE can be
quite small, but the test MSE is often much larger.
Model Selection
• [Figure: fits of increasing flexibility to the data (left-hand panel) and the corresponding training and test MSE curves as a function of flexibility (right-hand panel).]
Model Selection
• The test MSE is shown as the red curve in the right-hand panel.
• Both the training MSE and the test MSE initially decline as the level of flexibility increases.
• At a certain point, however, the test MSE levels off and then starts to increase again.
Model Selection
• When we overfit the training data, the test MSE will be very large
because the supposed patterns that the method found in the training
data simply don’t exist in the test data.
• Note that regardless of whether or not overfitting has occurred, we
almost always expect the training MSE to be smaller than the test
MSE because most machine learning methods either directly or
indirectly seek to minimize the training MSE.
• Overfitting refers specifically to the case in which a less flexible model
would have yielded a smaller test MSE.
Bias – Variance Trade Off
• The U-shape observed in the test MSE curves is the result of two competing properties of machine learning methods: bias and variance.
• The expected test MSE at a point x₀ can be decomposed into three fundamental quantities: E[(y₀ − f̂(x₀))²] = Var(f̂(x₀)) + [Bias(f̂(x₀))]² + Var(ε).
• To minimize the expected test error, we need a method that simultaneously achieves low variance and low bias; the test MSE can never fall below Var(ε), the irreducible error.
Meaning of Bias – Variance Trade Off
• Variance refers to the amount by which f̂ would change if we estimated it using a different training data set. Ideally f̂ should not vary too much between training sets; a method with high variance can change substantially in response to small changes in the data. More flexible methods generally have higher variance.
• Bias refers to the error introduced by approximating a real-life problem, which may be extremely complicated, by a much simpler model. For example, linear regression assumes a linear relationship, which rarely holds exactly in practice. More flexible methods generally have lower bias.
Meaning of Bias – Variance Trade Off
• We can generalize the concept. As the model becomes more flexible, the
variance increases and the bias decreases.
• By analyzing the relative rate of change of these two quantities, we can
determine whether the test MSE will increase or decrease.
• As the flexibility of the model increases, the bias tends to initially decrease
faster than the variance increases.
• As a result, the expected test MSE decreases.
• After some point an increase in flexibility has little impact on the bias but it
starts to significantly increase the variance.
• Due to this, the test MSE increases.
• You can note this pattern of decreasing test MSE followed by increasing test
MSE in the right-hand panels of Figures 9–11.
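The decomposition can be checked numerically. The sketch below (with an invented true function and noise level) refits polynomials of each degree to many simulated training sets, estimates the squared bias and variance of f̂(x₀) at a fixed test point, and confirms that their sum plus Var(ε) traces the U-shaped expected test MSE.

import numpy as np

rng = np.random.default_rng(4)

def f(x):                                   # made-up "true" function
    return np.sin(1.5 * x)

x0 = 2.0                                    # fixed test point
x_grid = np.linspace(0, 5, 50)              # fixed training inputs
n_sims, sigma = 500, 0.3

for degree in [1, 3, 10]:
    preds = np.empty(n_sims)
    for s in range(n_sims):
        y = f(x_grid) + rng.normal(0, sigma, size=x_grid.size)
        f_hat = np.poly1d(np.polyfit(x_grid, y, deg=degree))
        preds[s] = f_hat(x0)                # prediction at x0 from this training set
    bias_sq = (preds.mean() - f(x0)) ** 2
    variance = preds.var()
    expected_test_mse = bias_sq + variance + sigma ** 2
    print(degree, round(bias_sq, 4), round(variance, 4), round(expected_test_mse, 4))

# As the degree (flexibility) grows, squared bias falls and variance rises;
# their sum plus Var(eps) tracks the expected test MSE at x0.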