Lecture Summary

The document discusses Jensen's inequality and uniform convergence. It talks about increasing the hypothesis space to find a good model. It also discusses expectations within expectations when using Rademacher elements. The document covers bounding the expectation using a growth function and the size of the data. It states that the six statements in the theorem are equivalent. It then shifts to discussing what big data means and using samples to handle it for frequent item set mining. It covers deriving sample size lower bounds for PAC learning and guarantees around learning with infinite versus finite hypothesis spaces.


Jensen's inequality

We prove this by uniform convergence:

\forall h \in H: |L_D(h) - L_{\mathcal{D}}(h)| \le \epsilon


This means that increasing the hypothesis space increases the chance of finding a good
model.

Instead of taking a union bound over all hypotheses, we look at the worst case, the
supremum of the deviation over H. The supremum does not have to be attained by any
element. Just like the limit of a function, it is the least upper bound, and it may
never be reached by anything we actually sample.
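
To make this concrete, the quantity being bounded (written in my notation, which may differ slightly from the slides) is the expected worst-case deviation, where L_D is the empirical loss on the sample D and L_{\mathcal{D}} is the true loss on the distribution \mathcal{D}:

  \mathbb{E}_{D \sim \mathcal{D}^m} \Big[ \sup_{h \in H} \big( L_{\mathcal{D}}(h) - L_D(h) \big) \Big]

Bounding this single supremum replaces a union bound over all (possibly infinitely many) hypotheses.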

During the second step, by definition, the value is the average (expectation) of the
differences of the losses on each of the two datasets.

To get rid of the infinity, we observe that the equation involves both a "training" and
a "testing" sample; during AI training we also have a training and a test set.
For the expectation it does not matter if we swap elements between D and D'; both are
samples from the same distribution, so it does not matter from which dataset a point comes.
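
A hedged sketch of this symmetrization step, with D' an independent "ghost" sample of the same size m (the exact form on the slides may differ):

  \mathbb{E}_{D} \Big[ \sup_{h \in H} \big( L_{\mathcal{D}}(h) - L_D(h) \big) \Big] \;\le\; \mathbb{E}_{D, D'} \Big[ \sup_{h \in H} \big( L_{D'}(h) - L_D(h) \big) \Big]

It uses that L_{\mathcal{D}}(h) = \mathbb{E}_{D'}[L_{D'}(h)] and that the supremum of an expectation is at most the expectation of the supremum, which is where Jensen-style reasoning enters.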

Using Rademacher variables we can generate a vector \sigma \in \{-1,1\}^m and observe
more things.
Since the equality holds for any fixed vector of length m, we can swap the expectations
around.
This is interesting: we have an expectation within an expectation.
An expectation is an average; an average is a loop, so a nested expectation is a nested loop.
We may have infinitely many hypotheses, but on the sample only finitely many behaviours
are distinguishable, so we are talking about finite things.
In the finite case the supremum is the same as taking the maximum inside the expectation.
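
A minimal Python sketch (my own illustration, not from the lecture) of "nested expectation is a nested loop": the outer loop averages over random Rademacher vectors sigma, and the inner max plays the role of the supremum over the finitely many distinguishable hypotheses. The argument loss_diff_vectors is a hypothetical list of per-example loss-difference vectors, one per hypothesis.

import random

def estimate_rademacher_term(loss_diff_vectors, m, n_trials=10000):
    # Outer loop: Monte Carlo estimate of the expectation over sigma.
    total = 0.0
    for _ in range(n_trials):
        sigma = [random.choice((-1, 1)) for _ in range(m)]
        # Inner step: maximum over the finite set of hypotheses
        # (the supremum reduced to a max).
        best = max(
            sum(s * v for s, v in zip(sigma, vec)) / m
            for vec in loss_diff_vectors
        )
        total += best
    return total / n_trials

# Toy usage with three hypotheses on m = 4 points:
# print(estimate_rademacher_term([[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]], m=4))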

Step 4 implies that our expectation has a bound.


The size of the data is 2m. Then we know that the size of H restricted to those 2m
points is bounded by the growth function \tau_H(2m) (last week).
Rewriting with the growth function, we then know the expectation has a bound.
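
A hedged sketch of the resulting bound (this is the standard growth-function form; the exact constants in the lecture may differ), with \tau_H the growth function:

  \mathbb{E}_{D, D'} \Big[ \sup_{h \in H} \big( L_{D'}(h) - L_D(h) \big) \Big] \;\le\; c \cdot \sqrt{ \frac{ \log \tau_H(2m) }{ m } }

for some constant c. The right-hand side goes to 0 as m grows, as long as \log \tau_H(2m) grows more slowly than m.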

Step 5, almost done.


We're talking about a non-negative random variable.
Markov's inequality tells us that with probability at least 1-\delta the variable is
less than or equal to a fraction (its expectation divided by \delta).
We only need to show that this fraction can be made as small as we want, which happens
if m is big enough.
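
As a worked reminder of the Markov step (standard inequality, my phrasing): for a non-negative random variable X and any a > 0,

  \Pr[X \ge a] \;\le\; \frac{\mathbb{E}[X]}{a}

Taking a = \mathbb{E}[X]/\delta gives \Pr[X \ge \mathbb{E}[X]/\delta] \le \delta, i.e. with probability at least 1-\delta we have X \le \mathbb{E}[X]/\delta. Combined with the bound on the expectation above, this fraction can be made as small as we like by taking m large enough.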

The theorem that follows from this says that all six statements are equivalent (this is
the fundamental theorem of statistical learning).

-- AFTER BREAK
Conceptual
We started by discussing what big data means
With big data, everything is statistically significant, regardless of how small the difference is.

Using samples is the way to handle big data and learn from it.
We used frequent itemset mining as the example.
However, the mining algorithm does multiple scans over the database; if we don't sample
and the database doesn't fit in RAM, that is very costly.
We then asked ourselves how big our sample should be to learn from it.
By lowering the required frequency threshold on the sample, you can catch the itemsets
that are frequent in reality and reduce the missed ones.
However, there CAN still be an itemset that does not appear as frequent in the sample
but is frequent in reality.
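
A minimal hypothetical sketch of the sampling idea (illustration only; the function and parameter names are my own, not the lecture's): draw a random sample of transactions and mine it with a lowered threshold theta - eps, so that itemsets frequent in the full database are unlikely to be missed, at the cost of some false positives.

import random
from collections import Counter
from itertools import combinations

def sample_frequent_itemsets(transactions, sample_size, theta, eps, max_len=2):
    # Work on a random sample instead of scanning the full database.
    sample = random.sample(transactions, sample_size)
    counts = Counter()
    for t in sample:
        items = sorted(set(t))
        # Brute-force count of all itemsets up to max_len (fine for a sketch).
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    lowered = theta - eps  # lowered relative frequency threshold
    return {s for s, c in counts.items() if c / sample_size >= lowered}
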
Then we noticed that we can phrase itemset mining as a machine learning problem.
Classification is supervised learning.
We derived a lower bound on how big the sample should be.
We call our learning PAC, and then investigated how we can do this PAC learning,
using hypothesis sets etc. to describe how to learn.
It IS doable to learn with finite hypothesis sets (see the bound sketched below).
The no-free-lunch theorem tells us that if you can express too much, you cannot learn.
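
As a reminder of the finite case (standard realizable PAC bound, possibly with different constants than in the lecture): a finite hypothesis set H can be learned from

  m \;\ge\; \frac{1}{\epsilon} \Big( \ln |H| + \ln \frac{1}{\delta} \Big)

samples, so the required sample size grows only logarithmically in |H|.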

If the VC dimension is infinite, then the class cannot be PAC learned.


Today we saw that if the VC dimension is finite, we can in fact DO PAC learning.
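
A hedged sketch of the corresponding sample complexity (the standard form; constants omitted): a class H with finite VC dimension d is agnostically PAC learnable with

  m = O\!\left( \frac{ d + \ln(1/\delta) }{ \epsilon^2 } \right)

samples, and with 1/\epsilon in place of 1/\epsilon^2 in the realizable case (up to logarithmic factors).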

We were still trying to find sample bounds for frequent itemset mining, and what the
guarantees are; this will be Friday.

One remaining question is whether we can learn more with a larger class of hypotheses.
The answer is: not really, which is kind of surprising.
Can we guarantee that we eventually reach an accuracy of alpha? The answer is no.

Another side note: this course teaches that there are requirements on the amount of
data needed in order to learn things.
If we only have a limited number of clients and a great number of nodes, then we will
not be able to learn.
This is the opposite of big data, but it shows the other side of what we learned.
