
Foundations of Statistics and Machine Learning:
testing and uncertainty quantification with e-values
(and their link to likelihood, betting)
Today, Lecture 6: Confidence Sequences

1. Standard Confidence Intervals (and how to construct them using p-values)
2. Bayesian Credible Intervals
3. Anytime-Valid Confidence Intervals (and how to construct them using e-values)
4. Subjective Objectivity: Luckiness
Confidence Intervals
Neyman and Pearson, 1930s

Given a 1-dimensional model $\{P_\theta : \theta \in \Theta\}$, let $\mathrm{p}_\theta := \mathrm{p}_\theta(X^n)$ be a p-value for data $X^n = (X_1, \dots, X_n)$ relative to the null hypothesis $\{P_\theta\}$. Then:
• $P_\theta(\mathrm{p}_\theta \le \alpha) \le \alpha$
• Set $\mathrm{CS}_{n,1-\alpha} := \{\theta : \mathrm{p}_\theta > \alpha\}$
• …so $\mathrm{CS}^{\mathbf{c}}_{n,1-\alpha}$ (the complement) is the set of $\theta$ that you can reject:
  $\forall \theta \in \Theta: P_\theta(\theta \in \mathrm{CS}^{\mathbf{c}}_{n,1-\alpha}) \le \alpha$, so
  $\forall \theta \in \Theta: P_\theta(\theta \in \mathrm{CS}_{n,1-\alpha}) \ge 1 - \alpha$

We call the set $\mathrm{CS}_{n,1-\alpha}$ a confidence set.

Confidence Intervals - II

• $P_\theta(\mathrm{p}_\theta \le \alpha) \le \alpha$
• Set $\mathrm{CS}_{n,1-\alpha} := \{\theta : \mathrm{p}_\theta > \alpha\}$ — the $\theta$ you have not been able to reject
• …so for $\mathrm{CS}^{\mathbf{c}}_{n,1-\alpha}$: $\forall \theta \in \Theta: P_\theta(\theta \in \mathrm{CS}^{\mathbf{c}}_{n,1-\alpha}) \le \alpha$, so
  $\forall \theta \in \Theta: P_\theta(\theta \in \mathrm{CS}_{n,1-\alpha}) \ge 1 - \alpha$   (*)

We call the set $\mathrm{CS}_{n,1-\alpha}$ a confidence set.

• It should really be called a "random" set, since it is data-dependent (i.e. random).
• In fact we call any random set satisfying (*) a confidence set, irrespective of how it is constructed (here we constructed it with p-values, but other roads lead to (*) as well).
• Usually $\mathrm{CS}_{n,1-\alpha}$ will be an interval, but for complicated models it need not be.
• To make the set interpretable, some authors insist on making it an interval. We can always do that by enlarging it (if $\mathrm{CS}_{n,1-\alpha}$ is a confidence set then so is any $\mathrm{CI}_{n,1-\alpha} \supset \mathrm{CS}_{n,1-\alpha}$).
Example 2: Confidence Intervals

Fix a confidence level $\alpha$ and sample size $n$, and let $Y = (X_1, \dots, X_n)$, $X_i$ iid $\sim P_\theta$, $\theta \in \Theta$ unknown.
• A (strict) $1-\alpha$ confidence interval $\mathrm{CI}_{n,1-\alpha} = (\ell(Y), r(Y))$ for $\theta$ is a "random set" with $\ell(Y) < r(Y)$ such that for all $\theta \in \Theta$:
  $P_\theta\big(\theta \in (\ell(Y), r(Y))\big) = 1 - \alpha$
Example 2: CIs, normal distributions

• Fix $n$ and let $Y = (X_1, \dots, X_n)$, $X_i$ iid $\sim N(\mu, 1)$;
  $\bar X = (\sum X_i)/n$ is the empirical average.
• Standard 95% confidence interval for $\mu$:
  $(\ell(Y), r(Y)) = (\bar X - 1.96/\sqrt{n},\ \bar X + 1.96/\sqrt{n})$
Confidence Intervals

• Standard 95% confidence interval:
  $(\ell(Y), r(Y)) = (\bar X - 1.96/\sqrt{n},\ \bar X + 1.96/\sqrt{n})$
• Note that there are many other valid CIs as well. For example, $(\ell(Y), r(Y)) = (-\infty,\ \bar X + 1.65/\sqrt{n})$.
• The standard CI is optimal in the sense that every other CI has larger expected width.
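A minimal sketch of this standard z-interval in Python (assuming NumPy; the function name z_ci is ours, not from the slides):

```python
import numpy as np

def z_ci(x, z=1.96):
    """Standard z-confidence interval for the mean of N(mu, 1) data."""
    n = len(x)
    xbar = x.mean()
    half = z / np.sqrt(n)          # half-width 1.96/sqrt(n) for 95%
    return xbar - half, xbar + half

rng = np.random.default_rng(0)
x = rng.normal(loc=7.0, scale=1.0, size=100)
print(z_ci(x))                     # roughly (6.8, 7.2)
```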
(Correct) Inductive Behavior Interpretation

• Suppose in our career we do many independent experiments with data $Y^{(1)}, Y^{(2)}, \dots$
  • note: each $Y^{(j)}$ consists itself of many data points
…and we always output a 95% confidence interval; then by the law of large numbers we can be (essentially) certain that the true parameter will be in our interval at least 95% of the time.
• Exactly analogous to the hypothesis testing case.
(Correct) Inductive Behavior Interpretation

• Suppose in our career we do many independent experiments with data $Y^{(1)}, Y^{(2)}, \dots$
  • note: each $Y^{(j)}$ consists itself of many data points
…and we always output a strict 95% confidence interval; then by the law of large numbers we can be (essentially) certain that the true parameter will be in our interval about 95% of the time.
• Just like in hypothesis testing, we cannot say anything about any individual experiment though!
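A small simulation of this long-run guarantee (our illustration, assuming NumPy): each repetition plays the role of one experiment in our career, with its own true parameter, and the fraction of intervals covering their own parameter settles near 0.95.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, z = 50, 100_000, 1.96
mus = rng.uniform(-5, 5, size=reps)            # a different true mu per "experiment"
xbars = rng.normal(mus, 1.0 / np.sqrt(n))      # sampling distribution of each mean
covered = np.abs(xbars - mus) <= z / np.sqrt(n)
print(covered.mean())                          # ~0.95 by the law of large numbers
```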
Thought Experiment

• Each sample $Y^{(j)} = (X_{j,1}, \dots, X_{j,n_j})$ consists of data points where $X_{j,i}$ is the difference between two measurements of a patient's blood pressure, one before and one after taking medication of type $j$.
• So research group 1 tries medication of type 1 (say, paracetamol) on sample $Y^{(1)}$, research group 2 tries medication 2 (say, green tea) on $Y^{(2)}$, etc.
• We assume $X_{j,i} \sim N(\mu^{(j)}, \sigma^2)$ for some known $\sigma^2$.
• Suppose that a medication for lowering blood pressure is considered effective if $\boldsymbol{\mu \le -10}$.
• Suppose that, while there are many possible medications around, none of these achieves the goal $\mu \le -10$. So $\boldsymbol{\mu^{(1)}, \mu^{(2)}, \dots}$ are all $\boldsymbol{> -10}$.
Thought Experiment, Continued

Even though $\mu^{(1)}, \mu^{(2)}, \dots$ are all $> -10$, every now and then we might observe an experiment $j$ with $\hat\mu^{(j)} \ll -10$
(say $(\ell(Y^{(j)}), r(Y^{(j)})) = (-12.3, -10.1)$).
We might now be tempted to conclude that for this experiment/medication, we are 95% certain that
$\mu^{(j)} < -10$.
But this would be wrong: the 'world' we set up is such that $\mu^{(j)}$ is never $< -10$.
Similar to what we saw for hypothesis testing in the previous lecture, we cannot use CIs to give conditional conclusions: they only say something about long-run averages.
Bayesian Credible Intervals

• We may also take a Bayesian stance towards uncertainty quantification of a parameter:
• Fix $0 < \alpha_1, \alpha_2$ with $\alpha = \alpha_1 + \alpha_2 < 1$.
• We take the $\alpha_1$ and $1 - \alpha_2$ quantiles of the posterior density $w(\theta \mid Y)$ and call the set $\mathrm{CrI}_{n,1-\alpha}$ of $\theta$ in between a "$(1-\alpha)$-Bayesian posterior credible interval".
• The posterior then satisfies: $W(\theta \in \mathrm{CrI}_{n,1-\alpha} \mid Y) = 1 - \alpha$.
Example: Normal Location Family

• Let $\mathcal{M} = \{p_\mu : \mu \in \mathbb{R}\}$ be the family of normal densities with mean $\mu$ and fixed variance $\sigma^2$, and let $w(\mu) \propto e^{-\frac{(\mu-\mu_0)^2}{2\rho^2}}$ be the density of a normal with mean $\mu_0$ and variance $\rho^2 = \frac{\sigma^2}{k}$.
• Then the Bayes posterior is also normal and given by
  $w(\mu \mid X^n) \propto e^{-\frac{\sum_{i=1,\dots,n}(X_i-\mu)^2}{2\sigma^2} - \frac{k(\mu-\mu_0)^2}{2\sigma^2}} \propto e^{-\frac{(\mu-\hat\mu)^2}{2\sigma^2/(n+k)}}$
  …$k$ counts additional "virtual" data points: $\hat\mu = \left(\sum_{i=1}^n X_i + k\mu_0\right)/(n+k)$
  • similar to the uniform prior in the Bernoulli case, which "added" 2 virtual points, at 0 and 1
  • Note $k$ is not required to be an integer
• Very special property ("self-conjugacy") of the normal distributions!
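A sketch of this conjugate update and the equal-tailed ($\alpha_1 = \alpha_2 = \alpha/2$) credible interval (assuming NumPy/SciPy; the names normal_posterior and credible_interval are ours):

```python
import numpy as np
from scipy import stats

def normal_posterior(x, mu0, k, sigma=1.0):
    """Posterior N(mu_hat, sigma^2/(n+k)) for N(mu, sigma^2) data with
    prior N(mu0, sigma^2/k): k plays the role of 'virtual' data points."""
    n = len(x)
    mu_hat = (x.sum() + k * mu0) / (n + k)
    return mu_hat, sigma / np.sqrt(n + k)

def credible_interval(x, mu0, k, alpha=0.05):
    m, s = normal_posterior(x, mu0, k)
    return stats.norm.ppf(alpha / 2, m, s), stats.norm.ppf(1 - alpha / 2, m, s)

rng = np.random.default_rng(2)
x = rng.normal(0.3, 1.0, size=100)
print(credible_interval(x, mu0=0.0, k=0.01))   # k near 0: almost X̄ ± 1.96/√n
```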
Similarity in Form of Bayes CrI and CI

• For the normal family + prior, in the limit $k \downarrow 0$ the Bayes posterior is given by
  $w(\mu \mid X^n) \propto e^{-\frac{n(\mu-\bar X)^2}{2\sigma^2}}$
• "The prior gets less and less informative as $k \downarrow 0$"
• Question: why does $\hat\mu$ turn into $\bar X$?
• But this is just a normal distribution: $W(\cdot \mid X^n) = N\!\left(\hat\mu, \frac{\sigma^2}{n}\right)$
• Hence the Bayesian $1-\alpha$ credible interval for a noninformative (high-variance) prior is essentially indistinguishable from a standard $1-\alpha$ confidence interval!
Similarity in Form of Bayes CrI and CI

• For the normal location family, the Bayesian credible interval based on a noninformative prior and the standard CI essentially coincide.
• For 1-dimensional probability models "that satisfy the Central Limit Theorem" and continuous, strictly positive priors, they coincide asymptotically.
  • Examples: exponential families such as Bernoulli, Poisson, … (next week), the noncentral t-family, …
• Non-Example: mixture models
• Non-Example: $\theta$ just represents an aspect of a distribution, such as its mean, rather than a full $P_\theta$, and "the model is nonparametric"
  • e.g. testing a mean on a bounded support, last week
  • Bayesian credible intervals for such a situation are completely different
Dissimilarity in Meaning of Bayes CrI and CI

• Each sample $Y^{(j)} = (X_{j,1}, \dots, X_{j,n_j})$ consists of data points with $X_{j,i}$ the difference between two measurements of a patient's blood pressure, one before and one after taking medication of type $j$.
• Research group 1 tries medication of type 1 (say, paracetamol) on sample $Y^{(1)}$, research group 2 tries medication 2 (say, green tea) on $Y^{(2)}$, etc.
• We assume $X_{j,i} \sim N(\mu^{(j)}, \sigma^2)$ for some known $\sigma^2$.
• Suppose that a medication for lowering blood pressure is considered effective if $\boldsymbol{\mu \le -10}$.
• When analyzing standard CIs we considered the scenario that, while there are many medications around, none of these achieves the goal $\mu \le -10$. So $\boldsymbol{\mu^{(1)}, \mu^{(2)}, \dots}$ are all $\boldsymbol{> -10}$.
Previous Thought Experiment, Continued

Standard CIs: suppose we observe an experiment $j$ with $\hat\mu^{(j)} \ll -10$
(say $(\ell(Y^{(j)}), r(Y^{(j)})) = (-12.3, -10.1)$).
We might be tempted to conclude that for this experiment/medication, we are 95% certain that $\mu^{(j)} < -10$.
But this would be wrong: the 'world' might have been set up in such a way that $\mu^{(j)}$ is never $< -10$ (e.g. $\mu^{(j)} = 0$ for all $j$ cannot be ruled out).
We cannot use standard CIs to give conditional conclusions: they only say something about long-run averages.
We can use Bayesian CrIs to give conditional conclusions if we believe our prior. In the Bayesian setup, the situation that $\mu^{(1)}, \mu^{(2)}, \dots$ are all $> -10$ would be exceedingly unlikely. If each $\mu^{(j)}$ were itself independently sampled from the distribution (density) $w(\theta)$, then $w(\theta \mid Y^{(j)})$ would be the correct density upon observing $Y^{(j)}$.
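An illustrative simulation of this contrast (ours, assuming NumPy). We use the standard CI as a stand-in for the noninformative-prior CrI and select only "effective-looking" experiments whose interval lies entirely below −10: when each $\mu^{(j)}$ really is drawn from a prior, conditional coverage stays near 95%; in a fixed world where all $\mu^{(j)} = -9.9 > -10$, it collapses to 0.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 200_000
half = 1.96 / np.sqrt(n)                       # standard 95% CI half-width

def conditional_coverage(mus):
    xbar = rng.normal(mus, 1.0 / np.sqrt(n))
    sel = xbar + half < -10                    # keep "effective-looking" experiments
    cover = np.abs(xbar - mus) <= half
    return cover[sel].mean(), sel.mean()

# Bayesian world: each mu_j really drawn from a prior N(0, 5^2):
# conditional coverage stays close to 0.95
print(conditional_coverage(rng.normal(0.0, 5.0, size=reps)))
# Fixed world: every mu_j = -9.9 (no medication works):
# among the selected experiments, coverage drops to 0
print(conditional_coverage(np.full(reps, -9.9)))
```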
Today, Lecture 6: Confidence Sequences

1. Standard Confidence Intervals (and how to construct them using p-values)
2. Bayesian Credible Intervals
3. Anytime-Valid Confidence Intervals (and how to construct them using e-values)
4. Subjective Objectivity: Luckiness
Standard CIs: invalid under optional stopping

• Just like the Neyman-Pearson tests on which they are often based, standard CIs cannot handle (become invalid under) optional stopping.
• Bayesian credible intervals can handle optional stopping if (there it is again) you really believe your prior, but if you choose it pragmatically (which you usually do), they cannot.
  • This has to follow from the fact that standard CIs cannot handle optional stopping, for the Bayesian CrI and the standard CI are essentially the same with normal distributions.
Z-test ⇒ Z-Confidence Interval

standard 95% CI: $\bar X \pm 1.96/\sqrt{n}$

Suppose $H_0: \mu = 7$ is true, yet you keep sampling until $H_0$ can be rejected (falls outside of the CI) or some $n_{\max}$ has been achieved. We plot the probability that $\mu$ is contained in your CI at time $n_{\max}$ as a function of $n_{\max}$.
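A sketch of this experiment (ours, assuming NumPy): we sample under $H_0: \mu = 7$ and record how often the running standard CI ever excludes 7 before $n_{\max}$; the escape probability far exceeds the nominal 5%.

```python
import numpy as np

rng = np.random.default_rng(4)
mu0, reps, n_max = 7.0, 2000, 1000
escaped = 0
for _ in range(reps):
    x = rng.normal(mu0, 1.0, size=n_max)
    ns = np.arange(1, n_max + 1)
    xbar = np.cumsum(x) / ns
    # did mu0 ever fall outside the running standard 95% CI, xbar ± 1.96/sqrt(n)?
    if np.any(np.abs(xbar - mu0) > 1.96 / np.sqrt(ns)):
        escaped += 1
print(escaped / reps)   # well above 0.05, and it keeps growing with n_max
```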
Anytime-Valid Confidence Interval ("Confidence Sequence")

standard CI: $\bar X \pm 1.96/\sqrt{n}$
anytime-valid CI based on a "non-informative" prior distribution

Suppose $H_0: \mu = 7$ is true, yet you keep sampling until $H_0$ can be rejected (falls outside of the CI) or some $n_{\max}$ has been achieved. We plot the probability that $\mu$ is contained in your CI at time $n_{\max}$ as a function of $n_{\max}$.

standard CI: $\bar X \pm 1.96/\sqrt{n}$
AV CI, "non-informative" prior: $\bar X \pm$ a half-width of order $\sqrt{\log n / n}$
AV CI, prior optimized for a specific $n^*$: around $n = n^*$, $\bar X \pm 2.72/\sqrt{n^*}$
What about Bayes?

In this simple problem, the Bayesian 95% posterior credible interval (with noninformative prior) is indistinguishable from the standard 95% CI and therefore not anytime-valid.
Theorem (Ville's Inequality): Let $S_1, S_2, \dots$ be an e-process and set $S^* := \sup_{n} S_n(X^n)$. Then for all $P \in H_0$:
$P\!\left(S^* \ge \tfrac{1}{\alpha}\right) \le \alpha$

The probability that in a real casino you will ever multiply your initial capital by more than 20 is bounded by 1/20.
Towards Anytime-Valid CIs

• Let $S^{(1)}, S^{(2)}, \dots$ be an e-process. Then for all $P_0 \in H_0$:
  $P_0\!\left(\exists n: S^{(n)} \ge \tfrac{1}{\alpha}\right) \le \alpha$
Anytime-Valid Confidence Intervals
Darling and Robbins, 1967

• e-processes can be used to construct AVCIs ("confidence sequences")
• Given model $\{P_\theta : \theta \in \Theta\}$, let $S_\theta$ be an e-process for $H_0 = \{P_\theta\}$, for all $\theta \in \Theta$:
  $\mathrm{CI}_{n,1-\alpha} := \left\{\theta : S_\theta^{(n)} < \tfrac{1}{\alpha}\right\}$ — the $\theta$ you have not been able to reject
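A generic sketch of this recipe (ours, assuming NumPy): given any user-supplied e-process value $S_\theta^{(n)}$, keep the $\theta$ not yet rejected. The helper name e_value, the grid, and the assumption that the kept set is a nonempty interval are ours.

```python
import numpy as np

def av_ci(e_value, x, theta_grid, alpha=0.05):
    """Anytime-valid CI: keep every theta whose e-process for H0 = {P_theta}
    has not (yet) crossed 1/alpha on the data seen so far.
    `e_value(theta, x)` must return S_theta^(n)(x) for the supplied data."""
    s = np.array([e_value(th, x) for th in theta_grid])
    kept = theta_grid[s < 1.0 / alpha]
    if kept.size == 0:
        return None                # every theta rejected (prob. <= alpha under truth)
    return kept.min(), kept.max()  # assumes the kept set is an interval
```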
AV CIs for the normal location family

• $p_\theta(x^n) = \frac{1}{(2\pi)^{n/2}} \exp\!\left(-\frac{1}{2}\sum (x_i - \theta)^2\right)$
• Equip with a normal prior $\theta' \sim W = N(0, \rho^2)$
• The Bayes factor relative to $H_0 = \{P_\theta\}$ is given by
  $S_\theta = \frac{\int p_{\theta'}(X^n)\, w(\theta')\, d\theta'}{p_\theta(X^n)} = \frac{p_W(X^n)}{p_\theta(X^n)}$
  $\mathrm{CI}_{n,1-\alpha} = \left\{\theta : \frac{p_\theta(X^n)}{p_W(X^n)} > \alpha\right\}$

… always wider than the Bayes credible posterior interval based on the same prior
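For this Gaussian case the Bayes factor has a closed form: $\log S_\theta = -\tfrac{1}{2}\log(1+n\rho^2) + \tfrac{\rho^2 S^2}{2(1+n\rho^2)} - \theta S + \tfrac{n}{2}\theta^2$ with $S = \sum x_i$, which is quadratic in $\theta$, so the AV CI can be computed exactly by solving a quadratic. A sketch (ours, assuming NumPy; av_ci_normal is our name):

```python
import numpy as np

def av_ci_normal(x, rho=1.0, alpha=0.05):
    """AV CI {theta : S_theta < 1/alpha} for N(theta, 1) data, where
    S_theta = p_W(x^n)/p_theta(x^n) with prior W = N(0, rho^2).
    log S_theta is quadratic in theta, so the set is an exact interval."""
    n, S = len(x), x.sum()
    c = -0.5 * np.log(1 + n * rho**2) + rho**2 * S**2 / (2 * (1 + n * rho**2))
    # solve (n/2) theta^2 - S theta + c + log(alpha) < 0
    r = np.sqrt(S**2 - 2 * n * (c + np.log(alpha)))   # discriminant is always > 0
    return (S - r) / n, (S + r) / n

rng = np.random.default_rng(5)
print(av_ci_normal(rng.normal(0.0, 1.0, size=100)))   # interval around 0
```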
AV CIs vs. Bayesian Credible Sets

• Standard CI = Bayesian 95% credible interval (noninformative prior)
• AV confidence interval based on the BF with prior variance $\to \infty$: approximately the $\sqrt{\log n / n}$-width interval shown earlier
Anytime-Valid Confidence Interval

[Figure] Red is the standard confidence interval; green is the anytime-valid confidence interval that I just gave.
The Running Intersection

• The AV confidence intervals are invariably wider than the standard ones. But in fact we can considerably improve them, so that at some $n$ they sometimes (not always) are even tighter than the standard ones.
• We do this by taking the running intersection:

The Running Intersection $\mathrm{CI}_{n,1-\alpha}$

• Given model $\{P_\theta : \theta \in \Theta\}$, let $S_\theta$ be an e-process for $H_0 = \{P_\theta\}$, for all $\theta \in \Theta$:
  $\mathrm{CI}_{n,1-\alpha} := \left\{\theta : S_\theta^{(n)} < \tfrac{1}{\alpha}\right\}$ — the $\theta$ you have not been able to reject at time $n$
• …$\mathrm{CI}_{n,1-\alpha} := \bigcap_{i=1..n} \mathrm{CI}_{i,1-\alpha}$ — the $\theta$ you have not yet been able to reject at $n$
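A minimal sketch of the running intersection (ours), e.g. fed with the intervals from av_ci_normal above computed at successive $n$:

```python
def running_intersection(intervals):
    """Intersect the AV CIs over time: the sequence can only shrink, never
    grow, and remains a valid (1 - alpha) confidence sequence."""
    lo, hi = float("-inf"), float("inf")
    out = []
    for l, r in intervals:           # one (l, r) per time point n = 1, 2, ...
        lo, hi = max(lo, l), min(hi, r)
        out.append((lo, hi))
    return out
```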
Running Intersection, Illustration
AV CIs for the normal location family, revisited

Another way to make AV CIs tighter in a specific "region of $n$":
• $p_\theta(x^n) = \frac{1}{(2\pi)^{n/2}} \exp\!\left(-\frac{1}{2}\sum (x_i - \theta)^2\right)$
• The Bayes factor relative to $H_0 = \{P_\theta\}$ is given by
  $S_\theta = \frac{\int p_{\theta'}(X^n)\, \boldsymbol{w_\theta}(\theta')\, d\theta'}{p_\theta(X^n)} = \frac{p_{\boldsymbol{W_\theta}}(X^n)}{p_\theta(X^n)}$
  $\mathrm{CI}_{n,1-\alpha} = \left\{\theta : \frac{p_\theta(X^n)}{p_{\boldsymbol{W_\theta}}(X^n)} > \alpha\right\}$
• We may make the prior $W := W_\theta$ dependent on $\theta$ (e.g. normal with mean $\theta$)
• …now we do not have such a clear correspondence to credible intervals anymore (because there is no "credible interval based on the same prior")
standard CI: $\bar X \pm 1.96/\sqrt{n}$
AV CI, "non-informative" prior: $\bar X \pm$ a half-width of order $\sqrt{\log n / n}$
AV CI, prior optimized for a specific $n^*$: around $n = n^*$, $\bar X \pm 2.72/\sqrt{n^*}$, implemented by setting the two-point prior
$W_\theta(\theta_+) = W_\theta(\theta_-) = \tfrac{1}{2}$ for $\theta_+ = \theta + c/\sqrt{n^*}$, $\theta_- = \theta - c/\sqrt{n^*}$ (for a suitable constant $c$)
The Different Role of Priors in Bayesian and E-Based Methods

• Both the Bayesian credible interval and the e-process-based anytime-valid interval relied on a prior.
• Still, they use the prior in a different way, and they lead to very different conclusions. Understanding this difference is important!
• Unfortunately many Bayesian statisticians don't understand it…
• I will now illustrate!
[Figure] Yellow: Bayes 95% credible interval based on the noninformative prior = standard confidence interval = $\bar X \pm 1.96/\sqrt{n}$. Blue: 95% AV interval based on the same prior.
Subjective and Objective, at the Same Time: Luckiness

• E-posteriors and the AV CIs they induce rely on a prior, just like Bayesian posteriors…
…but they remain valid irrespective of the prior you use.
…suppose for example you have a pretty mistaken prior belief that $\theta = 0$, with variance 0.5…
Subjective and Objective, at the Same Time: Luckiness

• The AV CIs induced by e-variables rely on a prior, just like Bayesian credible intervals…
…but they remain valid irrespective of the prior you use:
with a bad prior, the e-confidence interval gets wide rather than wrong.
More Details

• We can easily construct anytime-valid confidence intervals also in nonparametric settings.
• With simple nulls, Bayes factor testing is essentially equivalent to e-value based testing, but Bayes credible intervals are very different from e-based (anytime-valid) confidence intervals.
Nonparametric Anytime-Valid CI
…recall from last week: Testing the Mean of a Bounded Random Variable
Waudby-Smith and Ramdas, JRSS B, 2024; Orabona & Jun, IEEE Trans. Inf. Th., 2023

$X_1, X_2, \dots$ iid $\sim P$, $X_i \in [-1,1]$
We assume nothing at all about $P$.
Last week we tested whether the mean is $\mu$; now we use the exact same technique to make an AV CI for $\mu$.
Nonparametric Anytime-Valid CI

$X \in [-1,1]$: set $\boldsymbol{s_{\lambda,[\mu]}(x) := 1 + \lambda(x - \mu)}$,
defined for any $\lambda \in \Lambda_{[\mu]} := \{\lambda : \min_{x \in [-1,1]} s_{\lambda,[\mu]}(x) \ge 0\}$

$s_{\lambda,[\mu]}(X)$ is an e-variable for $H_0: \mathbf{E}[X] = \mu$
…since under any $P \in H_0$: $\mathbf{E}_P[s_{\lambda,[\mu]}(X)] = 1 + \lambda(\mu - \mu) = 1$
Nonparametric Anytime-Valid CI

Also: $S^{(1)}_{[\mu]}, S^{(2)}_{[\mu]}, \dots$ with $S^{(n)}_{[\mu]} = \prod_{i=1..n} s_{\lambda,[\mu]}(X_i)$ is an e-process
• follows easily from the i.i.d. assumption
• …and so is $S^{(n)}_{[\mu]} = \prod_{i=1..n} s_{\boldsymbol{\lambda \mid X^{i-1}},[\mu]}(X_i)$, where the bet $\lambda$ may depend on the past

We simply set: $\mathrm{CI}_{n,1-\alpha} = \left\{\mu : S^{(n)}_{[\mu]} < \tfrac{1}{\alpha}\right\}$ — the $\mu$ you have not been able to reject at time $n$

Running intersection: $\mathrm{CI}_{n,1-\alpha} = \left\{\mu : \forall i \in 1..n: S^{(i)}_{[\mu]} < \tfrac{1}{\alpha}\right\}$ — the $\mu$ you have not yet been able to reject at time $n$
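A sketch combining these pieces (ours, assuming NumPy): the betting e-process with a simple predictable plug-in bet $\lambda_i$ (not the tuned bets of Waudby-Smith and Ramdas), a grid over candidate $\mu$, and the running intersection.

```python
import numpy as np

def betting_cs(x, alpha=0.05, grid=None):
    """Confidence sequence for the mean of X_i in [-1,1] via the betting
    e-process S_[mu]^(n) = prod_i (1 + lambda_i (X_i - mu)), with a simple
    predictable plug-in bet lambda_i. Returns the running-intersection CI
    at every n."""
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 397)
    log_s = np.zeros_like(grid)      # log e-process, one entry per candidate mu
    lo, hi, out, xbar = -1.0, 1.0, [], 0.0
    for i, xi in enumerate(x):
        # predictable bet: lean towards the running mean; clipping to half of
        # Lambda_[mu] keeps 1 + lambda (x - mu) >= 0.5 for all x in [-1, 1]
        lam = np.clip(xbar - grid, -0.5 / (1 - grid), 0.5 / (1 + grid))
        log_s += np.log1p(lam * (xi - grid))
        xbar = (xbar * i + xi) / (i + 1)
        kept = grid[log_s < np.log(1 / alpha)]
        if kept.size:                # running intersection: can only shrink
            lo, hi = max(lo, kept.min()), min(hi, kept.max())
        out.append((lo, hi))
    return out

rng = np.random.default_rng(6)
x = rng.uniform(-1, 1, size=500) * 0.5 + 0.2   # mean 0.2, support inside [-1, 1]
print(betting_cs(x)[-1])                        # interval around 0.2
```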
Nonparametric Anytime-Valid CI

Variation:
$S^{(1)}_{[\mu]}, S^{(2)}_{[\mu]}, \dots$ with $S^{(n)}_{[\mu]} = \int \prod_{i=1..n} s_{\lambda,[\mu]}(X_i)\, w(\lambda)\, d\lambda$ is an e-process

We simply set: $\mathrm{CI}_{n,1-\alpha} = \left\{\mu : S^{(n)}_{[\mu]} < \tfrac{1}{\alpha}\right\}$ — the $\mu$ you have not been able to reject at time $n$
Variation

For fixed $\lambda$: $S^{(n)}_{\lambda,[\mu]} := \prod_{i=1}^n s_{\lambda,[\mu]}(X_i)$ is an e-variable

Now put a "prior" $w_{[\mu]}$ on $\Lambda_{[\mu]}$: $S^{(n)}_{[\mu]} := \int_{\Lambda_{[\mu]}} S^{(n)}_{\lambda,[\mu]}\, w_{[\mu]}(\lambda)\, d\lambda$
Since $S^{(n)}_{[\mu]}$ is a mixture of e-variables, it is itself an e-variable

Now set $S_{i,[\mu]} = \int s_{\lambda,[\mu]}(x_i)\, w_{[\mu]}(\lambda \mid x^{i-1})\, d\lambda$
with "posterior" $w_{[\mu]}(\lambda \mid x^{i-1}) \propto w_{[\mu]}(\lambda) \prod_{j=1}^{i-1} s_{\lambda,[\mu]}(X_j)$
Then we have $S^{(n)}_{[\mu]} = \prod_{i=1..n} S_{i,[\mu]}$

…just like in the Bayes factor case with a simple null:
the Bayes marginal is a product of Bayes predictive distributions;
the marginal e-variable is a product of past-conditional e-variables
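A sketch of the mixture construction over a discrete grid of bets (ours, assuming NumPy); a uniform weight vector plays the role of the "prior" $w_{[\mu]}$, and the grid is clipped to half of $\Lambda_{[\mu]}$ so every factor stays nonnegative.

```python
import numpy as np

def mixture_e_process(x, mu, lambdas=None, w=None):
    """Mixture e-variable S_[mu]^(n) = sum_k w_k * prod_i (1 + lambda_k (X_i - mu)):
    a mixture of e-variables is itself an e-variable."""
    if lambdas is None:
        lambdas = np.linspace(-0.5 / (1 - mu), 0.5 / (1 + mu), 21)  # inside Lambda_[mu]
    if w is None:
        w = np.full(len(lambdas), 1.0 / len(lambdas))               # uniform "prior"
    # one running log-product per candidate bet lambda_k
    log_s = np.sum(np.log1p(np.outer(lambdas, x - mu)), axis=1)
    return float(w @ np.exp(log_s))
```

The same quantity could equivalently be computed as a product of past-conditional e-variables with the "posterior" weights from the slide above.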
More Details

• We can easily construct anytime-valid confidence intervals also in nonparametric settings.
  • We can also make standard confidence intervals for such settings, but in general for such cases it is difficult to make Bayesian credible intervals that work well in practice.
  • In a Bayesian approach, we would need to put a prior density on the infinite-dimensional set $\mathcal{P}$ of all distributions on $[-1,1]$
  • …in practice you always "forget" many distributions
  • …e-based approach: we only need to learn/"put a prior on" a single parameter, $\lambda$
More Details

1. We can easily construct anytime-valid confidence intervals also in nonparametric settings.
2. With simple nulls, Bayes factor testing is essentially equivalent to e-value based testing, but Bayes credible intervals are very different from e-based (anytime-valid) confidence intervals.
   Reason: in the Bayesian interpretation of Bayes factor testing, you implicitly put a massive prior mass of ½ on $H_0$; in the Bayes credible interval approach, every $\theta \in \Theta$ gets prior mass 0
   (its density is > 0, its mass is not).
Where we stand and where we will go

• You have now learned the basic concepts of this course!
  • likelihood ratios, e-variables, test martingales, e-processes
  • anytime-valid tests for simple/composite nulls, simple/composite alternatives
  • anytime-valid confidence intervals
  • basic Neyman-Pearson testing/confidence intervals/Bayesian testing/credible intervals, and differences with the e-based approach
• Coming weeks:
  • Significantly extend the math: exponential families, generic construction of optimal e-variables, concentration inequalities and their connection to the e-world
  • More examples (including a programming homework exercise about 2x2 tables)
  • More philosophy ("evidence")
