Lecture 8
Arpit Rana
6th August 2024
Supervised Learning Process
[Diagram: the Learner (𝚪: S → h) searches the Hypothesis Space 𝓗 and outputs a Final Hypothesis or Model h, which maps a test instance to a prediction.]
Supervised Learning: Example
● Alternate: whether there is a suitable alternative restaurant nearby.
● Bar: whether the restaurant has a comfortable bar area to wait in.
● Fri/Sat: true on Fridays and Saturdays.
● Hungry: whether we are hungry right now.
● Patrons: how many people are in the restaurant (values are None, Some, and Full).
● Price: the restaurant’s price range ($, $$, $$$).
● Raining: whether it is raining outside.
● Reservation: whether we made a reservation.
● Type: the kind of restaurant (French, Italian, Thai, or Burger).
● WaitEstimate: the host’s wait estimate: 0–10, 10–30, 30–60, or >60 minutes.
Supervised Learning: Example
[Diagram: training data S — a set of instances from the instance space 𝑿, labelled by an unknown target function 𝑓.]

Size of the instance space:
|𝑿| = 2 × 2 × 2 × 2 × 3 × 3 × 2 × 2 × 4 × 4 = 9216

Size of the hypothesis space of Boolean functions over this instance space:
|𝓗| = 2^9216
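The arithmetic above can be checked in a few lines of Python; the domain sizes are taken directly from the restaurant attributes:

```python
# Domain sizes of the ten restaurant attributes:
# Alternate, Bar, Fri/Sat, Hungry (Boolean); Patrons, Price (3 values);
# Raining, Reservation (Boolean); Type, WaitEstimate (4 values).
domain_sizes = [2, 2, 2, 2, 3, 3, 2, 2, 4, 4]

num_instances = 1
for d in domain_sizes:
    num_instances *= d
print(num_instances)             # 9216 distinct instances

# Each Boolean labelling of all 9216 instances is a distinct hypothesis,
# so the space of Boolean functions over 𝑿 has 2**9216 members.
num_hypotheses = 2 ** num_instances
print(len(str(num_hypotheses)))  # 2775 decimal digits
```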
Hypothesis Space vs. Hypothesis

There are three different levels of specificity for using the term Hypothesis or Model:

● Hypothesis space: e.g., polynomials.
● Hyperparameter: degree = 1 fixes a sub-space, the linear polynomials h(x) = a·x + b.
● Parameters: a = 2, b = 3 pick out a single hypothesis, h(x) = 2x + 3.
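A minimal sketch of the three levels in code; the data here is synthetic, generated from the illustrative target 2x + 3 plus a little noise:

```python
import numpy as np

rng = np.random.default_rng(0)

# Level 1 - hypothesis space: polynomials (here via np.polyfit).
# Level 2 - hyperparameter: degree = 1 restricts us to lines a*x + b.
# Level 3 - parameters: a, b are chosen by fitting the training data.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 3.0 + rng.normal(0.0, 0.01, size=x.shape)  # noisy 2x + 3

a, b = np.polyfit(x, y, deg=1)   # learn the two parameters
print(round(a, 2), round(b, 2))  # close to a = 2, b = 3
```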
● We can say that the prior probability P(h) is high for a smooth degree-1 or -2 polynomial
and lower for a degree-12 polynomial with large, sharp spikes.
Hypothesis Space Selection is Subjective
The observed dataset S alone does not allow us to make conclusions about unseen instances.
We need to make some assumptions!
● These assumptions induce the bias (a.k.a. inductive or learning bias) of a learning algorithm.
Sample Error
The sample error of hypothesis h with respect to the target function f and data sample S is:

error_S(h) = (1/n) ∑_{x∈S} δ(f(x) ≠ h(x))

where n = |S| and δ(c) = 1 if condition c is true, and 0 otherwise.

It is impossible to assess the true error, so we try to estimate it using the sample error.
True Error
The true error of hypothesis h with respect to the target function f and the distribution D is the probability that h will misclassify an instance drawn at random according to D:

error_D(h) = Pr_{x∼D} [ f(x) ≠ h(x) ]
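As a sketch of the relationship between the two errors, here the true error is known exactly for a toy target and hypothesis, and the sample error over a random sample approximates it (the target, the hypothesis, and the uniform distribution are all illustrative assumptions):

```python
import random

random.seed(0)

# Instance space: integers 0..9 drawn uniformly (the distribution D).
def f(x):          # target function: "x is small"
    return x < 5

def h(x):          # learned hypothesis, wrong only on x == 5
    return x < 6

# True error: h misclassifies exactly one of the ten equally likely
# instances, so error_D(h) = 1/10.
true_error = sum(f(x) != h(x) for x in range(10)) / 10

# Sample error on a random sample S of n instances drawn from D.
n = 10_000
S = [random.randrange(10) for _ in range(n)]
sample_error = sum(f(x) != h(x) for x in S) / n

print(true_error, sample_error)   # sample error close to 0.1
```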
Generalization Error

The generalization error of h is its expected error on instances not seen during training — the true error that the sample error only estimates. If the hypothesis space is too restrictive to capture the target function, the generalization error stays high: it leads to underfitting!
Choosing a Hypothesis Space - I
● the bias they impose (regardless of the training data set), and

Bias: the tendency of a predictive hypothesis to deviate from the expected value when averaged over different training sets.
Variance: the amount of change in the hypothesis due to fluctuation in the training data.

● the model complexity (i.e., how intricate the relationships a model can capture) of a hypothesis space.
○ Can be estimated by the number of parameters of a hypothesis.
Note-1: Sometimes the term model capacity is used to refer to model complexity and
expressiveness together.
Note-2: In general, the required amount of training data depends on the model complexity,
representativeness of the training sample, and the acceptable error margin.
Choosing a Hypothesis Space - II
There is a tradeoff between the expressiveness of a hypothesis space and the computational
complexity of finding a good hypothesis within that space.
● After learning h, computing h(x) when h is a linear function is guaranteed to be fast, while computing an arbitrarily complex function may not even be guaranteed to terminate.

For example:
● In Deep Learning, representations are not simple, but the h(x) computation still takes only a bounded number of steps with appropriate hardware.
Bias-Variance vs. Model’s Complexity
The relationship between bias and variance is closely related to the machine learning concepts
of overfitting, underfitting, and model’s complexity.
[Figure: bias, variance, and total error plotted against model complexity; the optimal model complexity minimizes the total error.]
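The tradeoff can be seen in a small simulation — the sin target, the noise level, and the two degrees are illustrative assumptions. Refitting a low-degree and a high-degree polynomial on many resampled training sets, the high-degree model's prediction at a query point fluctuates far more (high variance), while the low-degree model's average prediction misses the target value by more (high bias):

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0.0, 1.0, 12)
x_test = 0.25                        # query point where true_f = 1.0

def predictions(degree, trials=300):
    """Prediction at x_test from models fit on freshly resampled data."""
    preds = []
    for _ in range(trials):
        y = true_f(x_train) + rng.normal(0.0, 0.3, size=x_train.shape)
        coeffs = np.polyfit(x_train, y, deg=degree)
        preds.append(np.polyval(coeffs, x_test))
    return np.array(preds)

low, high = predictions(degree=1), predictions(degree=9)

# Variance: the complex model's prediction swings far more across
# training sets.  Bias: the simple model's average prediction is
# further from the target value true_f(0.25) = 1.
print(low.var(), high.var())
print(abs(low.mean() - 1.0), abs(high.mean() - 1.0))
```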
Learning as a Search
Given a hypothesis space, data, and a bias, the problem of learning can be reduced to one of
search.
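As a minimal sketch of this reduction, take a toy hypothesis space of threshold rules h_t(x) = (x ≥ t) and a four-example training set (both are illustrative assumptions); learning then becomes an exhaustive search for the hypothesis with the lowest sample error:

```python
# Toy training data: (x, label) pairs; the pattern is "label = 1 iff x >= 3".
data = [(1, 0), (2, 0), (3, 1), (4, 1)]

def sample_error(t):
    """Fraction of training examples misclassified by h_t(x) = (x >= t)."""
    return sum((x >= t) != bool(y) for x, y in data) / len(data)

# Learning as search: enumerate the finite space H = {h_0, ..., h_5} and
# keep the hypothesis with the lowest sample error.  The bias here is the
# choice of threshold rules as the hypothesis space.
best_t = min(range(6), key=sample_error)
print(best_t, sample_error(best_t))   # t = 3 classifies the sample perfectly
```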
[Diagram: the Learner (𝚪: S → h) searches the Hypothesis Space 𝓗 and outputs a Final Hypothesis or Model h, which maps a test instance to a prediction.]
Next lecture: Evaluation (8th August 2024)