
COS-511: Learning Theory Spring 2017

Lecture 3: VC Dimension & The Fundamental Theorem


Lecturer: Roi Livni

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
They may be distributed outside this class only with the permission of the Instructor.

3.1 Uniform Convergence and VC Dimension Cont.

So far we have defined the notion of learnability and the property of uniform convergence. Recall that a class of functions C is learnable if, given a finite IID sample, we can find a hypothesis h whose generalization error is close to that of the best hypothesis in C (see Def. 1.1 for PAC learnability).
We discussed the notion of an ERM algorithm: an algorithm that chooses a hypothesis minimizing the empirical error. We have seen that finite classes are learnable via an ERM algorithm. The proof relied on showing that after seeing O((1/ε²) log(|C|/δ)) examples we can estimate the performance of all hypotheses in C uniformly.
The last result motivated the property of uniform convergence. A class has the uniform convergence property
if, with enough samples, we can estimate uniformly the performance of all hypotheses in the class (see Def.
2.6). It is not hard to see that having the uniform convergence property means that an ERM algorithm will
succeed in learning.

3.1.1 VC Dimension

Here we introduce the notion of the VC dimension of a hypothesis class. The VC dimension is a purely combinatorial quantity: it is a property of the hypothesis class alone, completely independent of any distribution D over the domain or the labels. Nevertheless, we will see that it is the main property governing the learnability of the hypothesis class.

Definition 3.1. For a sample S, define H_S to be the restriction of H to S:

H_S = {h' : S → Y : h'(s) = h(s) for some h ∈ H and all s ∈ S}.

Definition 3.2 (Shattered Set). Given a domain χ and a hypothesis class H, a finite set A ⊆ χ is said to be shattered if for every subset A' ⊆ A there is h ∈ H such that for all x ∈ A, h(x) = 1 iff x ∈ A'. Formally, A is shattered iff

|H_A| = 2^{|A|}.

Definition 3.3 (VC dimension). The VC dimension of a hypothesis class, VC-dim(H), is defined as the
maximal cardinality of a finite set A that is shattered.

VC-dim(H) = max{|A| : A is shattered by H}
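
For finite classes over finite domains the definition can be checked by exhaustive search. The following Python sketch (my own illustration, not part of the notes) computes |H_A| for every subset A and returns the VC dimension by brute force; it runs in exponential time and is meant only to make the definitions concrete:

from itertools import combinations

def restriction(hypotheses, A):
    # H_A: the set of distinct behaviors of the class on the points of A.
    return {tuple(h(x) for x in A) for h in hypotheses}

def is_shattered(hypotheses, A):
    # A is shattered iff |H_A| = 2^{|A|}.
    return len(restriction(hypotheses, A)) == 2 ** len(A)

def vc_dimension(hypotheses, domain):
    # Largest k such that some size-k subset of the domain is shattered.
    d = 0
    for k in range(1, len(domain) + 1):
        if any(is_shattered(hypotheses, A) for A in combinations(domain, k)):
            d = k
    return d

# Example: threshold functions h_t(x) = 1 iff x >= t, on the domain {0, 1, 2}.
domain = [0, 1, 2]
H = [lambda x, t=t: 1 if x >= t else 0 for t in (-1, 0.5, 1.5, 2.5)]
print(vc_dimension(H, domain))  # prints 1: thresholds shatter only singletons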


3.1.1.1 Examples

Example 3.1 (Axis-aligned rectangles). Consider Example 2.1, where H consists of all target functions of the form

f_{z_1,z_2,z_3,z_4}(x_1, x_2) = 1 if z_1 ≤ x_1 ≤ z_2 and z_3 ≤ x_2 ≤ z_4, and 0 otherwise.

We will show that VC-dim(H) = 4.

First we show that VC-dim(H) ≥ 4; for that we need to exhibit a set of size 4 that is shattered. Let us denote

e_1 = (1, 0),  e_2 = (0, 1).

We will show that the set S = {±e_1, ±e_2} is shattered. Choose any target function h over S; we need to find z_1, z_2, z_3, z_4 such that f_{z_1,z_2,z_3,z_4}(±e_i) = h(±e_i). If h(−e_1) = 1 put z_1 = −2, else put z_1 = 0. Similarly, if h(e_1) = 1 put z_2 = 2, else put z_2 = 0. Define z_3, z_4 in the same way: if h(−e_2) = 1 put z_3 = −2, else put z_3 = 0; if h(e_2) = 1 put z_4 = 2, else put z_4 = 0.
One can then check that f_{z_1,z_2,z_3,z_4} = h. Since this holds for an arbitrary h, we have shown that |H_S| = 2^4. In other words, all target functions over S are realizable.
Next we need to show that VC-dim(H) < 5; for that we need to show that no set of size 5 can be shattered. Choose a set S = {x^{(1)}, x^{(2)}, ..., x^{(5)}}. Let A be a set of 4 extremal points (i.e., points attaining the maximal or minimal value in some coordinate). Since A contains just 4 points, there is some x^{(i)} ∉ A. However, if f_{z_1,z_2,z_3,z_4}(x) = 1 for all x ∈ A, we must have f_{z_1,z_2,z_3,z_4}(x^{(i)}) = 1 as well (since, for example, z_2 ≥ max_j {x_1^{(j)}} ≥ x_1^{(i)}, and so on). Thus we cannot realize a target with f(x) = 1 for x ∈ A and f(x) = 0 for x ∉ A.
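
As a sanity check on the lower bound, here is a small Python sketch (my own; the points and the z-assignments are hard-coded from the construction above) verifying that all 2^4 labelings of S are realized:

from itertools import product

S = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # -e1, e1, -e2, e2

def rect(z1, z2, z3, z4):
    # Indicator of the axis-aligned rectangle [z1, z2] x [z3, z4].
    return lambda x: 1 if (z1 <= x[0] <= z2 and z3 <= x[1] <= z4) else 0

for h in product([0, 1], repeat=4):      # h = target labels for -e1, e1, -e2, e2
    z1 = -2 if h[0] == 1 else 0
    z2 = 2 if h[1] == 1 else 0
    z3 = -2 if h[2] == 1 else 0
    z4 = 2 if h[3] == 1 else 0
    f = rect(z1, z2, z3, z4)
    assert tuple(f(x) for x in S) == h   # the construction realizes h
print("all 16 labelings of S realized")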

Example 3.2 (Halfspaces with bias). Next we consider a class H, similar to Example 2.2, of halfspaces with bias:

H = {f_{w,b}(x) = sign(w · x + b) : w ∈ R^d, b ∈ R}.

We will prove that VC-dim(H) = d + 1.

First we show that VC-dim(H) ≥ d + 1; for that we need to exhibit a set of size d + 1 that is shattered.
As before, define e_i = (0, ..., 0, 1, 0, ..., 0), with the 1 in the i-th coordinate, and set e_0 = 0; then S = {e_0, e_1, ..., e_d} is a set of size d + 1. For every target function h(x) = ±1, set w_i = h(e_i), set b = h(e_0) · 1/2, and finally set w = (w_1, ..., w_d). Then

f_{w,b}(e_i) = sign(w · e_i + b) = sign(w_i + b) = sign(h(e_i) ± 1/2) = h(e_i).

Also, f_{w,b}(e_0) = sign(w · 0 + b) = sign(b) = h(e_0). Thus we realized an arbitrary target function over S.
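
The construction is easy to verify mechanically. The following Python sketch (my own, for a small d of my choosing) checks that the assignment w_i = h(e_i), b = h(e_0)/2 realizes every labeling of S:

from itertools import product

d = 3
S = [[0.0] * d] + [[1.0 if j == i else 0.0 for j in range(d)] for i in range(d)]

def sign(t):
    return 1 if t >= 0 else -1

for h in product([-1, 1], repeat=d + 1):        # h[i] = target label of e_i
    w = [float(h[i]) for i in range(1, d + 1)]  # w_i = h(e_i)
    b = h[0] * 0.5                              # b = h(e_0) / 2
    labels = tuple(sign(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in S)
    assert labels == h                          # the construction realizes h
print("all", 2 ** (d + 1), "labelings of S realized")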
Next we wish to show that VC-dim(H) < d + 2. For this we use Radon's theorem. Recall that the convex hull of a set A, denoted A^c, is

A^c = { Σ_{i=1}^t λ_i x_i : λ_i > 0, Σ_{i=1}^t λ_i = 1, x_i ∈ A }.

Theorem 3.4 (Radon's Theorem). Every set S ⊆ R^d of size d + 2 can be divided into two disjoint sets whose convex hulls intersect.

Given a set S of size d + 2, divide it into two disjoint sets A_1, A_2 whose convex hulls intersect. We now show that no f_{w,b} can have f_{w,b}(x) = 1 for all x ∈ A_1 and f_{w,b}(x) = −1 for all x ∈ A_2.
Indeed, suppose otherwise, and let a ∈ A_1^c ∩ A_2^c. Then there are positive λ_i^{(1)} and λ_i^{(2)}, each summing to one, such that

a = Σ_{x_i ∈ A_1} λ_i^{(1)} x_i = Σ_{x_i ∈ A_2} λ_i^{(2)} x_i.

Next we show that w · a + b ≥ 0, i.e., f_{w,b}(a) = 1:

w · a + b = w · ( Σ_{x_i ∈ A_1} λ_i^{(1)} x_i ) + b
          = Σ_{x_i ∈ A_1} λ_i^{(1)} (w · x_i) + Σ_{x_i ∈ A_1} λ_i^{(1)} b
          = Σ_{x_i ∈ A_1} λ_i^{(1)} (w · x_i + b) ≥ 0.

In exactly the same way, using A_2, we show that w · a + b < 0, which is a contradiction.
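
To see the argument in action, the following sketch (the example points in R^2 are my own choice, not from the notes) takes d + 2 = 4 points whose Radon partition consists of the two diagonals of the unit square and searches, necessarily in vain, for a halfspace realizing the forbidden labeling:

import random

A1 = [(0.0, 0.0), (1.0, 1.0)]  # one diagonal of the unit square
A2 = [(1.0, 0.0), (0.0, 1.0)]  # the other; the segments cross at (0.5, 0.5)

def f(w, b, x):
    # Halfspace classifier sign(w . x + b), with sign(0) taken as +1.
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

random.seed(0)
found = False
for _ in range(100000):
    w = (random.uniform(-5, 5), random.uniform(-5, 5))
    b = random.uniform(-5, 5)
    if all(f(w, b, x) == 1 for x in A1) and all(f(w, b, x) == -1 for x in A2):
        found = True
        break
print(found)  # False: no sampled halfspace labels A1 with +1 and A2 with -1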

3.2 The Fundamental Theorem of Statistical Learning Theory

3.2.1 VC = ERM = Learnability

Theorem 3.5 (The Fundamental Theorem of Statistical Learning). Let C be a concept class of functions from a domain χ to {−1, 1}, and let the loss function ℓ be the 0−1 loss. Then the following are equivalent:

1. C is (agnostic) PAC learnable.

2. C is (realizable) PAC learnable.

3. C has finite VC dimension.

4. C has the uniform convergence property.

5. C is learnable by an ERM_C algorithm.

Further, if the VC dimension of C is d, then the sample complexity of the class (attained by an ERM algorithm) is given by

m(ε, δ) = O( (d/ε) log(1/δ) )

in the realizable model, and

m(ε, δ) = O( (d/ε²) log(1/δ) )

in the agnostic setting.
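
For a rough sense of scale, here is a small sketch evaluating the two bounds (treating the constants hidden in the O(·) as 1, which is my simplification):

import math

def m_realizable(d, eps, delta):
    return d / eps * math.log(1 / delta)

def m_agnostic(d, eps, delta):
    return d / eps ** 2 * math.log(1 / delta)

d, eps, delta = 5, 0.1, 0.05  # e.g. halfspaces with bias in R^4: d + 1 = 5
print(round(m_realizable(d, eps, delta)))  # ~150 examples
print(round(m_agnostic(d, eps, delta)))    # ~1498: the 1/eps^2 factor dominates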

To summarize, the fundamental theorem states that for the 0−1 loss the VC dimension completely characterizes the learnable classes and that, as far as the PAC model goes, ERM algorithms are optimal.

The implications 1 → 2 and 5 → 1 are trivial. The proof that 4 → 5 is essentially the proof we gave for the special case of finite hypothesis classes (Cor. 2.5). We next prove 2 → 3; in the next lecture we will show 3 → 4.
