
UNIT I
Introduction to Machine Learning and Supervised Learning

Code: U18CST7002
Presented by: Nivetha R
Department: CSE
Learning Multiple Classes

• Definition: A classification task with more than two classes.
• Example:
• Classes: Family Car, Sports Car, Luxury Sedan.
• Goal: Learn the boundaries separating each class.
• Handling Doubt:
• Doubt Cases: when no hypothesis, or more than one hypothesis, predicts 1 for an instance.
• Rejection: the classifier rejects instances in doubt regions for further human review, as sketched below.
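As a concrete illustration, here is a minimal Python sketch of the doubt-handling rule above. The three one-vs-rest threshold hypotheses and their cut-off values are hypothetical, not taken from the slides.

```python
# Sketch: one hypothesis per class; an instance is labeled only when exactly
# one hypothesis outputs 1, otherwise it falls in a doubt region.
def h_family(x):  return int(20000 <= x["price"] <= 40000)   # hypothetical rule
def h_sports(x):  return int(x["engine_power"] >= 200)       # hypothetical rule
def h_luxury(x):  return int(x["price"] > 60000)             # hypothetical rule

def classify(x):
    votes = [name for name, h in [("Family Car", h_family),
                                  ("Sports Car", h_sports),
                                  ("Luxury Sedan", h_luxury)] if h(x) == 1]
    # Exactly one vote: confident prediction; zero or several: reject.
    return votes[0] if len(votes) == 1 else "REJECT (doubt region)"

print(classify({"price": 30000, "engine_power": 120}))  # Family Car
print(classify({"price": 70000, "engine_power": 250}))  # two hypotheses fire -> reject
```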
Example 1

Consider the problem of assigning the label “family car” (indicated by “1”) or “not family car” (indicated by “0”) to cars. Given the following training set for the problem, and assuming that the hypothesis space is defined by (p1 ≤ price ≤ p2) AND (e1 ≤ engine power ≤ e2), find the version space for the problem.
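Before the worked solution, here is a sketch of how one part of the version space, the most specific hypothesis S (the tightest rectangle around the positive examples), could be computed. The training data below is hypothetical, standing in for the table on the slide.

```python
# Hypothetical training data: (price, engine power) pairs.
positives = [(32000, 120), (38000, 150), (45000, 180)]   # label 1
negatives = [(15000, 60), (70000, 300)]                   # label 0

# S = tightest rectangle (p1 <= price <= p2) AND (e1 <= power <= e2)
# that contains every positive example.
p1 = min(p for p, e in positives)
p2 = max(p for p, e in positives)
e1 = min(e for p, e in positives)
e2 = max(e for p, e in positives)
S = (p1, p2, e1, e2)

def predict(h, x):
    p1, p2, e1, e2 = h
    price, power = x
    return int(p1 <= price <= p2 and e1 <= power <= e2)

# S is consistent with the training set if it labels every negative 0.
assert all(predict(S, x) == 0 for x in negatives)
print("Most specific hypothesis S:", S)
```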
Example 1-solution

Regression

• Dependent Variable (y):
• The variable we are trying to predict or explain.
• Also known as the response variable.
• Independent Variable (x):
• The variable used to predict the dependent variable.
• Also known as the predictor variable or feature.
• Linear Relationship:
• The relationship between the dependent and independent variables is modeled as a straight line.
• The general form of the linear equation: y = mx + c
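A minimal sketch of fitting y = mx + c by least squares with NumPy; the data points are invented for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # roughly y = 2x

m, c = np.polyfit(x, y, deg=1)             # degree-1 polynomial = straight line
print(f"y = {m:.2f}x + {c:.2f}")
print("prediction at x = 6:", m * 6 + c)
```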
Regression

Video: https://www.youtube.com/watch?v=CtsRRUddV2s
Regression

Regression is a type of supervised learning where the output is a numeric value, not a Boolean class.

• Interpolation: finding a function f(x) that passes through all data points when there is no noise: r = f(x).
• Extrapolation: predicting values outside the range of the training set.
• Regression with Noise: the output includes random noise: r = f(x) + ε, where ε is the noise term.
Regression

Polynomial Regression:
1. Extends linear regression by considering polynomial relationships between the dependent and independent variables.
2. Can model non-linear data more accurately, as the sketch below shows.
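A short sketch on synthetic data, comparing a straight-line fit with a degree-3 polynomial fit to a cubic signal:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 30)
y = x**3 - 2 * x + rng.normal(0, 1, size=x.shape)   # cubic signal + noise

linear = np.polyfit(x, y, deg=1)   # underfits the curvature
cubic  = np.polyfit(x, y, deg=3)   # matches the underlying function

# Compare the mean squared residual of the two fits.
for name, coeffs in [("linear", linear), ("cubic", cubic)]:
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"{name} fit MSE: {mse:.2f}")
```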
Model Selection and Generalization
• Model Selection: Choosing the best model based on
performance metrics.
• Generalization: The model's ability to perform well on
new, unseen data.
The Problem of Learning
• Hypothesis Elimination: each training example eliminates the hypotheses that are inconsistent with it (in the ideal case, about half of those remaining).
• Ill-Posed Problem: with a limited number of training examples, the solution is not unique; multiple hypotheses remain consistent with the data.
• Inductive Bias: the set of assumptions made so that there is a unique solution with the given data.
Examples:
• Assuming the shape of a rectangle for family cars.
• Assuming a linear function in linear regression.
Model Selection and Generalization
• Hypothesis Class Capacity: the class of functions that can be learned depends on the capacity of the hypothesis class.
Model Selection:
• Inductive Bias: necessary for learning, and impacts model selection.
• Goal: choose the hypothesis class H that generalizes well to new data.
• Generalization: the ability of a model to make accurate predictions on new, unseen data.
Complexity and Generalization:
• Underfitting: H is too simple (e.g., fitting a line to third-order polynomial data).
• Overfitting: H is too complex (e.g., fitting a sixth-order polynomial to noisy data).
• Optimal Complexity: match the complexity of H with the complexity of the underlying function.
Model Selection and Generalization
• Triple Trade-Off (Dietterich 2003)
Factors:
• Complexity of the hypothesis class H.
• Amount of training data.
• Generalization error on new examples.
Balancing:
• Increasing the amount of training data reduces generalization error.
• Increasing model complexity first reduces, then increases, generalization error.
Validation and Cross-Validation:
• Validation Set: used to test generalization ability.
• Cross-Validation: dividing the data into training and validation sets multiple times to ensure robust model selection.
Model Selection and Generalization
Steps for Model Selection (sketched in code below):
1. Divide Data: split the dataset into training, validation, and test sets.
2. Fit Models: train models on the training set for different hypothesis classes Hi.
3. Evaluate Models: use the validation set to measure generalization error.
4. Select Best Model: choose the model with the least validation error.
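A sketch of this procedure on synthetic data, using polynomial degrees 1-6 as the hypothesis classes Hi:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 90)
y = x**3 - 2 * x + rng.normal(0, 2, size=x.shape)

# Step 1: divide the data into training, validation, and test sets.
idx = rng.permutation(len(x))
train, val, test = idx[:50], idx[50:70], idx[70:]

def mse(coeffs, i):
    return np.mean((np.polyval(coeffs, x[i]) - y[i]) ** 2)

# Step 2: fit one model per hypothesis class on the training set.
fits = {d: np.polyfit(x[train], y[train], deg=d) for d in range(1, 7)}

# Steps 3-4: evaluate on the validation set and select the least error.
best = min(fits, key=lambda d: mse(fits[d], val))
print("validation MSE by degree:",
      {d: round(mse(c, val), 2) for d, c in fits.items()})

# The held-out test set gives an unbiased estimate for the chosen model.
print(f"selected degree {best}; test MSE = {mse(fits[best], test):.2f}")
```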
Test Set
• Purpose: Evaluate the final model's performance on unseen
data.
• Importance: Avoids overfitting to validation set, providing
an unbiased estimate of model performance.
Analogy:
• Training Set: Problems solved in class.
• Validation Set: Exam questions.
• Test Set: Real-world problems.
Bayes Theorem in ML
• Bayes' theorem is used in machine learning where we need to predict classes precisely and accurately.
• It is used to calculate conditional probabilities in machine learning applications that involve classification tasks.
• A simplified version of Bayes' theorem (Naïve Bayes classification) is also used to reduce computation time and average project cost.
• It helps determine the probability of an event given prior knowledge.
• It is used to calculate the probability of one event occurring given that another event has already occurred.
• It relates conditional probability and marginal probability.
• It is extensively applied in finance, health and medicine, research and surveys, and the aeronautical sector.
Bayes Theorem in ML
• In machine learning, we try to determine the best hypothesis from some hypothesis space H, given the observed training data X.
• In Bayesian learning, the best hypothesis means the most probable hypothesis, given the data X plus any initial knowledge about the prior probabilities of the various hypotheses in H.
• Prior knowledge can be combined with observed data to determine the final probability of a hypothesis.
• In Bayesian learning, prior knowledge is provided by asserting:
– a prior probability for each candidate hypothesis, and
– a probability distribution over the observed data for each possible hypothesis.
• Bayesian methods can accommodate hypotheses that make probabilistic predictions.
• New instances can be classified by combining the predictions of multiple hypotheses, weighted by their probabilities.
Bayesian Classification
• Uses Bayes' theorem to calculate the probability of each class given the input data.
• Selects the class with the highest posterior probability.
• Hence, Bayes' theorem can be written as:
posterior = (likelihood × prior) / marginal
i.e., P(C|x) = P(x|C) P(C) / P(x)
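A small numerical sketch of this computation, with made-up priors and likelihoods for a two-class (spam/ham) problem:

```python
priors = {"spam": 0.4, "ham": 0.6}        # P(C), hypothetical values
likelihoods = {"spam": 0.8, "ham": 0.1}   # P(x | C) for an observed feature x

marginal = sum(likelihoods[c] * priors[c] for c in priors)           # P(x)
posteriors = {c: likelihoods[c] * priors[c] / marginal for c in priors}

print(posteriors)                          # spam: ~0.84, ham: ~0.16
print("predicted class:", max(posteriors, key=posteriors.get))
```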
Bayes Decision Theory
Losses and Risk

• Loss: the cost associated with making an incorrect decision. It quantifies the penalty for errors in decision-making.
Example: in medical diagnosis, predicting no disease when the patient has one (a false negative) can delay treatment. The loss could include additional medical costs and health deterioration.
• Risk: the expected loss, which considers both the probability of the various outcomes and their associated losses. Minimizing risk factors in both the likelihood and the cost of each decision.
Example: in loan approval, the bank uses this risk to decide on loan approvals, aiming to minimize financial losses.
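A sketch of risk-based decision making: the expected risk of an action is its loss under each class, weighted by the class posteriors, i.e. R(action | x) = Σk loss(action, Ck) P(Ck | x). The loss matrix and posterior values below are hypothetical loan-approval numbers.

```python
posteriors = {"repays": 0.85, "defaults": 0.15}   # P(C_k | x) for one applicant

# loss[action][true class]: denying a good customer costs lost interest;
# approving a defaulter costs the principal (values are illustrative).
loss = {
    "approve": {"repays": 0.0, "defaults": 10.0},
    "deny":    {"repays": 1.0, "defaults": 0.0},
}

# Expected risk of each action, then pick the minimum-risk action.
risk = {a: sum(loss[a][c] * posteriors[c] for c in posteriors) for a in loss}
print(risk)                                        # approve: 1.5, deny: 0.85
print("chosen action:", min(risk, key=risk.get))   # deny minimizes expected loss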
Discriminant Functions
Discriminant functions are mathematical functions used
to distinguish between different classes in a dataset.
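For example, with the Bayesian choice g_i(x) = log P(x|C_i) + log P(C_i), classification assigns x to the class with the largest discriminant. A minimal sketch, assuming made-up one-dimensional Gaussian class models:

```python
import math

classes = {
    # class: (mean, std, prior) for a single feature; all values hypothetical
    "family": (30000.0, 5000.0, 0.5),
    "luxury": (70000.0, 10000.0, 0.5),
}

def g(x, mean, std, prior):
    # log of a Gaussian density plus log prior
    log_likelihood = (-0.5 * ((x - mean) / std) ** 2
                      - math.log(std * math.sqrt(2 * math.pi)))
    return log_likelihood + math.log(prior)

x = 42000.0   # a car's price
scores = {c: g(x, *params) for c, params in classes.items()}
print("chosen class:", max(scores, key=scores.get))
```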
Association Rules
• Association rules are used to find interesting relationships or patterns between variables in large datasets.
• An association rule is an implication of the form X→Y, where X is the antecedent and Y is the consequent of the rule.
• Example: market basket analysis, where the goal is to discover associations between products purchased together by customers.
Keys used to measure Association Rules
• Support:
• The support of an association rule X→Y measures how frequently the items X and Y appear together in the dataset.
• Support(X→Y) = (number of transactions containing both X and Y) / (total number of transactions)
• Confidence:
• The confidence of an association rule X→Y measures the probability that Y is purchased given that X is purchased.
• Confidence(X→Y) = Support(X ∪ Y) / Support(X)
• Lift:
• Lift (also known as interest) measures the strength of an association rule compared to the random co-occurrence of X and Y.
• Lift(X→Y) = Confidence(X→Y) / Support(Y)
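A short sketch computing all three measures for the rule {milk} → {bread} on a tiny hypothetical transaction set:

```python
transactions = [{"milk", "bread"}, {"milk", "diapers"},
                {"milk", "bread", "diapers"}, {"bread", "eggs"}]
n = len(transactions)
X, Y = {"milk"}, {"bread"}

support_XY = sum(1 for t in transactions if X | Y <= t) / n   # P(X and Y)
support_X  = sum(1 for t in transactions if X <= t) / n       # P(X)
support_Y  = sum(1 for t in transactions if Y <= t) / n       # P(Y)

confidence = support_XY / support_X
lift = confidence / support_Y

print(f"support={support_XY:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
# support=0.50, confidence=0.67, lift=0.89 -> lift < 1: a weak association here
```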
Apriori algorithm
• The Apriori algorithm is a popular method for mining frequent itemsets and generating association rules. It operates in two main steps (a minimal sketch follows below):
• Finding Frequent Itemsets:
• The algorithm first identifies itemsets that have sufficient support.
• It uses the fact that any subset of a frequent itemset must also be frequent to reduce the search space.
• Generating Rules:
• Once frequent itemsets are identified, the algorithm generates rules by dividing the itemsets into antecedents and consequents and calculating their confidence.
• Rules that do not meet the minimum confidence threshold are discarded.
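A minimal, illustrative implementation of both steps (a from-scratch sketch, not an optimized version of the algorithm):

```python
from itertools import combinations

def apriori(transactions, min_support=0.5, min_confidence=0.6):
    """Minimal Apriori: find frequent itemsets, then generate rules."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 1: frequent itemsets, level by level (1-itemsets, 2-itemsets, ...).
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    level, k = {s for s in items if support(s) >= min_support}, 1
    while level:
        frequent.update({s: support(s) for s in level})
        # Join frequent k-itemsets into (k+1)-itemset candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune: every k-subset of a candidate must itself be frequent.
        level = {c for c in candidates
                 if support(c) >= min_support
                 and all(frozenset(s) in frequent for s in combinations(c, k))}
        k += 1

    # Step 2: rules X -> Y with confidence = support(X ∪ Y) / support(X).
    rules = []
    for itemset in (s for s in frequent if len(s) > 1):
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = frequent[itemset] / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), conf))
    return frequent, rules

transactions = [{"milk", "bread"}, {"milk", "diapers"},
                {"milk", "bread", "diapers"}, {"bread"}]
freq, rules = apriori(transactions, min_support=0.5, min_confidence=0.7)
for X, Y, c in rules:
    print(f"{X} -> {Y} (confidence {c:.2f})")
```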
References
1. Ethem Alpaydin, “Introduction to Machine Learning”, Second Edition, MIT Press, 2013.
2. Tom M. Mitchell, “Machine Learning”, McGraw-Hill Education, 2013.
3. Stephen Marsland, “Machine Learning: An Algorithmic Perspective”, CRC Press, 2009.
4. Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T. Lin, “Learning from Data”, AMLBook, 2012.
5. K. P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
6. M. Mohri, A. Rostamizadeh, and A. Talwalkar, “Foundations of Machine Learning”, MIT Press, 2012.
