
Machine Learning COMS-4771

Machine Learning Theory
The Mistake-Bound Model
Lecture 6

Based entirely on Avrim Blum's notes (see the link on the web page).

Goals of ML theory:
- Develop and analyze models of learning that capture the key aspects of
  machine learning.
- Help understand what types of guarantees we can hope to achieve, and
  what types of learning problems we can hope to solve.

Two main learning models:

1. Distributional (PAC) setting (a "batch" model):
   - Assumption: training and test examples come (independently) from some
     fixed probability distribution D over the instance space, labeled by
     an unknown target function (from a known hypothesis class).
   - Basic question: how much data do I need to see so that if I do well
     on it, I can expect to do well on new points drawn from D?

2. On-line setting: no distributional assumptions. An adversary selects
   the order in which examples are presented. There is no separate
   training set: the learner attempts to predict on every example seen,
   and we count the number of mistakes.

We are going to start with (2).


Mistake Bound Model

Learning proceeds in stages:
- The learner gets an unlabeled example x ∈ X.
- The learner predicts its classification.
- The learner is told the correct label f(x).

Goal: minimize the number of mistakes.

Definition
Algorithm A learns a class of functions C with mistake bound M if A makes
at most M mistakes on any sequence of examples consistent with some f ∈ C.

In this model we can't talk about past performance predicting future
performance, or about the number of samples needed to learn (e.g., we may
end up seeing the same example over and over again, so we learn nothing
new about f; but that's OK, because we won't be making many mistakes
either).

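To make the protocol concrete, here is a minimal Python sketch of the
mistake-counting loop (the learner interface with predict and update
methods is illustrative, not from the notes):

    def run_online(learner, examples, f):
        """Run the mistake-bound protocol: for each unlabeled x the learner
        commits to a prediction, is then told the correct label f(x), and
        we count how often it was wrong."""
        # hypothetical learner interface: predict(x) -> bool, update(x, y)
        mistakes = 0
        for x in examples:
            y_hat = learner.predict(x)   # prediction before seeing the label
            y = f(x)                     # correct label is revealed
            if y_hat != y:
                mistakes += 1
            learner.update(x, y)         # learner may adjust its state
        return mistakes
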
Example: Disjunctions

X = {0,1}^n. Assumption: examples are consistent with a disjunction (an OR
over a subset of the features).

- Start with h(x) = x_1 ∨ x_2 ∨ ··· ∨ x_n.
- Get x, answer according to h.
- We can't make a mistake on a positive example. On a mistake on a
  negative example: throw out all variables of h that are set to 1 in x.
- Each mistake removes at least one variable, so the total number of
  mistakes is at most n. (See the sketch after this list.)
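A sketch of this elimination learner, assuming examples arrive as
(x, label) pairs with the label revealed only after the prediction:

    def learn_disjunction(examples, n):
        """Online learner for monotone disjunctions over {0,1}^n. Start with
        h = x_1 OR ... OR x_n; on each mistake (necessarily on a negative
        example) discard every variable set to 1 in that example."""
        relevant = set(range(n))      # variable indices currently kept in h
        mistakes = 0
        for x, y in examples:         # x is a 0/1 tuple, y the true label
            prediction = any(x[i] for i in relevant)
            if prediction != y:       # on consistent data this means y == 0
                mistakes += 1
                relevant = {i for i in relevant if x[i] == 0}
        return relevant, mistakes     # mistakes <= n
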
Example: Disjunctions

No deterministic algorithm can do better:

1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1

Any labeling of these n unit-vector examples is consistent with some
disjunction (label the i-th example positive iff x_i is included in the
disjunction). So regardless of what the algorithm predicts, there is
always a consistent disjunction that disagrees with the algorithm on
every single example, forcing n mistakes.
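A sketch of that adversary argument (using the same hypothetical learner
interface as above):

    def force_n_mistakes(learner, n):
        """Present the unit vectors e_1, ..., e_n and always answer the
        opposite of the learner's prediction. The resulting labels are
        realized by the disjunction of exactly the coordinates labeled
        positive, so the sequence is consistent with the class."""
        positives = []
        for i in range(n):
            e_i = tuple(1 if j == i else 0 for j in range(n))
            y = not learner.predict(e_i)   # adversary flips the prediction
            if y:
                positives.append(i)
            learner.update(e_i, y)
        return positives                   # n mistakes were forced
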
Halving Algorithm

- Learns an arbitrary concept class C with at most log_2(|C|) mistakes.
  (The number of monotone disjunctions on n variables is 2^n, so this
  makes at most n mistakes: one mistake per bit.)
- Take a majority vote over all h ∈ C consistent with all examples seen
  so far.
- Proof: each mistake cuts down the number of remaining concepts at least
  in half. (A sketch follows this list.)
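A minimal sketch of the halving algorithm, representing each hypothesis
as a callable h(x) -> bool (an illustrative assumption):

    def halving_predict(consistent, x):
        """Majority vote over the hypotheses still consistent with all
        previously seen labeled examples."""
        votes = sum(1 if h(x) else -1 for h in consistent)
        return votes >= 0             # break ties toward positive

    def halving_update(consistent, x, y):
        """Discard every hypothesis that disagrees with the revealed label.
        On a mistake, at least half of the hypotheses voted wrongly, so the
        set at least halves: at most log_2(|C|) mistakes in total."""
        return [h for h in consistent if h(x) == y]
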
What if our concept class has functions of different sizes (e.g., decision
trees)?
- Let C_b be the set of functions in C that take at most b bits to
  represent, so |C_b| ≤ 2^b.
- "Guess and double" on b. If the target function takes s bits and
  2^b ≤ s ≤ 2^(b+1), the number of mistakes is at most
  1 + 2 + 4 + ··· + 2s < 4s.
- We can get rid of the factor of 4:
  - Give each h a weight of (1/2)^size(h). If our description language is
    a prefix code (no concept's encoding is a prefix of another's), the
    total sum of weights is at most 1.
  - Take a weighted vote. Each mistake removes at least 1/2 of the total
    weight left. So if the target has size s, the number of mistakes is at
    most log_2(2^s) = s. We are paying one mistake per bit of the
    description of the target function (if you don't care about
    computation time). (A sketch follows.)
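A sketch of the weighted vote, again treating hypotheses as callables and
assuming a size(h) function supplied by the description language:

    def init_weights(C, size):
        """Weight each hypothesis by (1/2)^size(h); with a prefix-free
        description language these weights sum to at most 1 (Kraft)."""
        return {h: 0.5 ** size(h) for h in C}

    def weighted_predict(weights, x):
        # weighted vote over the remaining (still-consistent) hypotheses
        pos = sum(w for h, w in weights.items() if h(x))
        neg = sum(w for h, w in weights.items() if not h(x))
        return pos >= neg

    def weighted_update(weights, x, y):
        """Remove hypotheses that got x wrong. On a mistake the losing side
        held at least half the remaining weight, so each mistake halves the
        total; the target keeps weight (1/2)^s, giving at most s mistakes."""
        return {h: w for h, w in weights.items() if h(x) == y}
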
Standard Optimal Algorithm

Online learning as a 2-player game between Algorithm and Adversary:

- The adversary selects an x which splits C into two sets: C_0(x), the
  functions that label x negative, and C_1(x), the functions that label x
  positive.
- The algorithm gets to pick one of C_0(x) and C_1(x) to throw away (by
  predicting positive and making a mistake it throws out C_1(x); by
  predicting negative and making a mistake it throws out C_0(x)).
- Repeat until only one function is left.

The value of this game (to the adversary) is the number of rounds the
game is played.

- This gives a well-defined number opt(C): the optimal mistake bound for
  concept class C (the minimum over all algorithms).
- It also gives a well-defined optimal strategy for each player: given an
  example x, we "just" calculate opt(C_0(x)) and opt(C_1(x)) (by applying
  this idea recursively), and throw out whichever set has the larger
  mistake bound. (A recursive sketch follows.)
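A recursive sketch of opt, assuming C is a frozenset of callable
hypotheses and X a tuple of instances (exponential time; purely to pin
down the definition):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def opt(C, X):
        """Optimal worst-case mistake bound for a finite class C over X,
        computed via the game recursion."""
        if len(C) <= 1:
            return 0                  # one candidate left: no forced mistakes
        best = 0
        for x in X:
            C1 = frozenset(h for h in C if h(x))   # functions labeling x positive
            C0 = C - C1                            # functions labeling x negative
            if not C0 or not C1:
                continue              # x doesn't split C; no mistake forced
            # the algorithm throws out the side with the larger bound,
            # paying one mistake and continuing on the smaller side
            best = max(best, 1 + min(opt(C0, X), opt(C1, X)))
        return best
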
Is the Halving Algorithm optimal?

- Halving algorithm: throw out the larger set.
- Optimal algorithm: throw out the set with the larger mistake bound.
- These are not always the same (so halving is not always optimal)! It is
  possible for the mistake bound of a set of functions C to be much
  smaller than log_2(|C|).
- You will come up with an example in your next homework.

What if there is no perfect function?

- Think of the functions in C as experts giving advice. You want to do as
  well as the best expert in hindsight (regret bounds).
- Next lecture: a flexible version of the halving algorithm (instead of
  throwing out inconsistent functions, reduce their weight): Weighted
  Majority.
- Next lecture: efficient algorithms that can handle a large number of
  irrelevant features (Winnow).
