Note 2
Nan Jiang
This note introduces the basics of concentration inequalities and examples of their applications (often
with the union bound), which will be useful for the rest of this course.
1 Hoeffding’s Inequality
Theorem 1. Let $X_1, \ldots, X_n$ be independent random variables on $\mathbb{R}$ such that $X_i$ is bounded in the interval $[a_i, b_i]$. Let $S_n = \sum_{i=1}^n X_i$. Then for all $t > 0$,
$$\Pr[S_n - E[S_n] \ge t] \le e^{-2t^2 / \sum_{i=1}^n (b_i - a_i)^2}, \qquad (1)$$
$$\Pr[S_n - E[S_n] \le -t] \le e^{-2t^2 / \sum_{i=1}^n (b_i - a_i)^2}. \qquad (2)$$
Remarks:
• By the union bound, we have $\Pr[|S_n - E[S_n]| \ge t] \le 2e^{-2t^2 / \sum_{i=1}^n (b_i - a_i)^2}$.
• We often care about the convergence of the empirical mean to the true average, so we can divide $S_n$ by $n$: $\Pr\left[\left|\frac{S_n}{n} - E\left[\frac{S_n}{n}\right]\right| \ge t\right] \le 2e^{-2n^2 t^2 / \sum_{i=1}^n (b_i - a_i)^2}$.
• A useful rephrasing of the result when all variables share the same support $[a, b]$: with probability at least $1 - \delta$, $\left|\frac{S_n}{n} - \frac{E[S_n]}{n}\right| \le (b - a)\sqrt{\frac{1}{2n} \ln \frac{2}{\delta}}$.
• The number of variables, $n$, is a constant in the theorem statement. When $n$ is a random variable itself, for Hoeffding's inequality to apply, $n$ cannot depend on the realization of $X_1, \ldots, X_n$.
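As a quick sanity check of the two-sided bound, one can compare it against a Monte Carlo estimate of the deviation probability for coin tosses. The sketch below is ours (function names are not from the note), using Bernoulli variables supported on $[0, 1]$:

```python
import math
import random

def hoeffding_bound(n, t, a=0.0, b=1.0):
    """Two-sided Hoeffding bound on Pr[|S_n/n - E[S_n]/n| >= t]
    for n independent variables supported on [a, b]."""
    return 2 * math.exp(-2 * n * t ** 2 / (b - a) ** 2)

def deviation_prob(n, t, p=0.5, trials=20000, seed=0):
    """Monte Carlo estimate of Pr[|mean of n Bernoulli(p) tosses - p| >= t]."""
    rng = random.Random(seed)
    hits = sum(
        abs(sum(rng.random() < p for _ in range(n)) / n - p) >= t
        for _ in range(trials)
    )
    return hits / trials

n, t = 100, 0.1
print(deviation_prob(n, t), "<=", hoeffding_bound(n, t))
```

The empirical frequency should land below the bound; Hoeffding's inequality is distribution-free over $[a, b]$, so it is typically loose for any particular distribution.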
Example: Consider the following Markov chain:
[Figure: a Markov chain over states $s_1, s_2, s_3, s_4$; from $s_1$, the chain moves to $s_2$ with probability $p$ and to $s_4$ with probability $1 - p$.]
Say we start at $s_1$ and sample a path of length $T$ ($T$ is a constant). Let $n$ be the number of times
we visit $s_1$; we can use the transitions out of $s_1$ to estimate $p$.
1. Can we directly apply Hoeffding’s inequality here with n as the number of coin tosses? If
you want to derive a concentration bound for this problem, look up Azuma’s inequality.
2. What if we sample a path until we visit s1 N times for some constant N ? Can we apply
Hoeffding’s inequality with N as the number of random variables?
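To see why question 1 is subtle, one can simulate the chain and observe that $n$ itself varies with the coin outcomes. The sketch below assumes, purely for concreteness (the figure leaves the remaining transitions implicit), that $s_2 \to s_3 \to s_1$ and $s_4 \to s_1$ deterministically:

```python
import random

def sample_tosses(p, T, seed):
    """Walk the chain for T steps from s1, recording the coin toss at each
    visit to s1. Assumed (hypothetical) deterministic transitions:
    s2 -> s3 -> s1 and s4 -> s1."""
    rng = random.Random(seed)
    state, tosses = "s1", []
    for _ in range(T):
        if state == "s1":
            heads = rng.random() < p
            tosses.append(heads)
            state = "s2" if heads else "s4"
        elif state == "s2":
            state = "s3"
        else:  # s3 or s4 returns to s1
            state = "s1"
    return tosses

ns = [len(sample_tosses(0.3, 100, seed)) for seed in range(1000)]
print(min(ns), max(ns))  # n differs across paths: it is a random variable
```

Under this assumed structure, heads lead to longer cycles, so $n$ is correlated with the toss outcomes within a length-$T$ path; this is exactly the dependence that blocks a direct application of Hoeffding's inequality. In contrast, if we stop after $N$ visits to $s_1$ for a constant $N$, the $N$ recorded tosses are i.i.d. Bernoulli($p$).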
A popular objective for MAB is the pseudo-regret, which poses the exploration-exploitation challenge:
$$\text{Regret}_T = \sum_{t=1}^T (\mu^\star - \mu_{i_t}).$$
Another objective is the simple regret,
$$\mu^\star - \mu_{\hat{i}},$$
where $\hat{i}$ is the arm that the learner picks after $T$ rounds of interactions. This poses the "pure exploration" challenge, since all that matters is to make a good final guess, and the regret incurred within the $T$ rounds does not matter. A related objective is called Best-Arm Identification, which asks whether $\hat{i} \in \arg\max_{i \in [K]} \mu_i$; Best-Arm Identification results often require additional gap conditions.
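Both objectives are straightforward to compute once the arm means and the pull sequence are fixed; a minimal sketch (function names are ours):

```python
def pseudo_regret(mus, pulls):
    """Pseudo-regret: sum over rounds of mu_star - mu_{i_t} for the arms pulled."""
    mu_star = max(mus)
    return sum(mu_star - mus[i] for i in pulls)

def simple_regret(mus, i_hat):
    """Simple regret: mu_star - mu_{i_hat} for the learner's final guess."""
    return max(mus) - mus[i_hat]

mus = [0.2, 0.5, 0.9]
print(pseudo_regret(mus, [0, 2, 2, 1]))  # (0.9-0.2) + 0 + 0 + (0.9-0.5)
print(simple_regret(mus, 2))
```

Note the asymmetry: pseudo-regret charges the learner for every round, while simple regret only scores the final guess $\hat{i}$.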
Now we want accurate estimation for all arms simultaneously. That is, we want to bound the probability of the event that any $\hat{\mu}_i$ deviates from $\mu_i$ too much. This is where the union bound is useful (with each of the $K$ arms pulled $T/K$ times):
$$\Pr\left[\bigcup_{i=1}^K \{|\hat{\mu}_i - \mu_i| \ge \epsilon\}\right] \quad \text{(the event that estimation is $\epsilon$-inaccurate for at least 1 arm)}$$
$$\le \sum_{i=1}^K \Pr[|\hat{\mu}_i - \mu_i| \ge \epsilon] \le 2Ke^{-2T\epsilon^2/K}. \quad \text{(union bound, then Hoeffding's inequality)}$$
To rephrase this result: with probability at least $1 - \delta$, $|\hat{\mu}_i - \mu_i| \le \sqrt{\frac{K}{2T} \ln \frac{2K}{\delta}}$ holds for all $i$ simultaneously.
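Under uniform allocation ($T/K$ pulls per arm), the simultaneous confidence width is easy to compute in code; a sketch with hypothetical Bernoulli arms (names are ours):

```python
import math
import random

def simultaneous_eps(K, T, delta):
    """Width eps such that, w.p. >= 1 - delta, |mu_hat_i - mu_i| <= eps for
    all K arms when each arm is pulled T/K times (union bound + Hoeffding)."""
    return math.sqrt(K / (2 * T) * math.log(2 * K / delta))

def estimate_arms(mus, pulls_per_arm, rng):
    """Pull each Bernoulli arm equally often; return the empirical means."""
    return [sum(rng.random() < mu for _ in range(pulls_per_arm)) / pulls_per_arm
            for mu in mus]

mus = [0.2, 0.5, 0.7, 0.9]
K, T, delta = len(mus), 4000, 0.05
eps = simultaneous_eps(K, T, delta)
mu_hat = estimate_arms(mus, T // K, random.Random(1))
print(eps, max(abs(m - mh) for m, mh in zip(mus, mu_hat)))
```

Note the $\ln(2K/\delta)$ factor: the price of simultaneity over $K$ arms is only logarithmic in $K$, which is what makes the union bound so effective here.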
Finally, we use the estimation error to bound the decision loss: recall that $\hat{i} = \arg\max_{i \in [K]} \hat{\mu}_i$, and let $i^\star = \arg\max_{i \in [K]} \mu_i$. On the event that $|\hat{\mu}_i - \mu_i| \le \epsilon$ for all $i$, we have $\mu^\star - \mu_{\hat{i}} = (\mu_{i^\star} - \hat{\mu}_{i^\star}) + (\hat{\mu}_{i^\star} - \hat{\mu}_{\hat{i}}) + (\hat{\mu}_{\hat{i}} - \mu_{\hat{i}}) \le \epsilon + 0 + \epsilon = 2\epsilon$, where the middle term is non-positive because $\hat{i}$ maximizes $\hat{\mu}_i$.
The theorem itself is stated as a best-arm identification lower bound, but it is also a lower bound
for simple regret minimization. This is because all arms except the best one are $\epsilon$ worse than $\mu^\star$, so
missing the optimal arm means a simple regret of at least $\epsilon$.
See the proof in [1] (Theorem 2); the technique is due to [2] and can also be used to prove the lower
bound on the regret of MAB.
where $E[\cdot]$ is w.r.t. $P_{X,Y}$. Given only a finite sample, one natural thing to do is empirical risk minimization, i.e., find the classifier that has the lowest training error rate on the data:
$$\hat{f} = \arg\min_{f \in \mathcal{F}} \hat{E}[I[f(X) \ne Y]], \quad \text{where } \hat{E}[I[f(X) \ne Y]] := \frac{1}{n} \sum_{i=1}^n I[f(X_i) \ne Y_i].$$
The question is, can we give any guarantee on how good the learned classifier $\hat{f}$ is compared to the optimal one $f^\star$, as a function of $n$? In other words, we want to bound
$$E[I[\hat{f}(X) \ne Y]] - E[I[f^\star(X) \ne Y]].$$
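As a toy instantiation of ERM, consider a (hypothetical) finite class of threshold classifiers on $[0, 1]$; the data-generating process and all names below are ours, just to fix ideas:

```python
import random

def erm(F, data):
    """Empirical risk minimization: return the f in the finite class F
    with the lowest training error rate on the sample."""
    def train_err(f):
        return sum(f(x) != y for x, y in data) / len(data)
    return min(F, key=train_err)

# Hypothetical finite class: threshold classifiers x -> I[x >= t].
F = [lambda x, t=t: int(x >= t) for t in [i / 10 for i in range(11)]]

rng = random.Random(0)

def label(x):
    """Labels follow a true threshold at 0.35, flipped with probability 0.1."""
    y = int(x >= 0.35)
    return y if rng.random() < 0.9 else 1 - y

data = [(x, label(x)) for x in (rng.random() for _ in range(500))]
f_hat = erm(F, data)
```

By construction $\hat{f}$ attains the minimum training error over $\mathcal{F}$; the question above is whether its *true* error is also close to that of the best classifier in the class.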
We provide the analysis below, which mainly uses Hoeffding's inequality and the union bound. First of all,
References
[1] Akshay Krishnamurthy, Alekh Agarwal, and John Langford. PAC reinforcement learning with
rich observations. In Advances in Neural Information Processing Systems, pages 1840–1848, 2016.
[2] Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit
problem. Machine Learning, 47(2-3):235–256, 2002.