Today, Lecture 4
1. Bayesian Statistics
• Bayesian prediction, testing
2. E-Processes with simple nulls
• Simple alternative: GRO e-variable
• Composite alternative: learning in Bayesian & non-Bayesian manner
3. Bayesian vs. Neyman-Pearson vs. E-Process Testing with simple nulls
Models
• NOTE: not …
• p_θ(X^n), viewed as a function of θ (the likelihood)
The Bayesian Posterior
• Posterior: w(θ | X^n) ∝ p_θ(X^n) · w(θ), i.e. likelihood × prior, renormalized
Bayesian Prediction / Predictive Estimation
• As a Bayesian you prefer to output the full posterior
• But what if you are asked to make a specific prediction for the next outcome? Then you have to come up with a distribution after all
• Bayesian predictive distribution: p̄(X_{n+1} | X^n) = ∫ p_θ(X_{n+1}) · w(θ | X^n) dθ
Laplace Rule of Succession
• For the Bernoulli model with a uniform prior on θ, the Bayes predictive probability becomes p̄(X_{n+1} = 1 | x^n) = (n_1 + 1)/(n + 2), where n_1 is the number of 1s in x^n (checked in the sketch below)
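A minimal sketch (Python; the function and variable names are my own, not from the lecture) checking Laplace's rule against the integral form of the Bayes predictive distribution under a uniform prior:

```python
# Laplace's rule of succession as the Bayes predictive probability for a
# Bernoulli model with a uniform prior on theta. Names are illustrative.

def laplace_rule(n_ones: int, n: int) -> float:
    """Predictive probability that the next outcome is 1."""
    return (n_ones + 1) / (n + 2)

def bayes_predictive_numeric(x: list[int], grid: int = 100_000) -> float:
    """Same quantity via numerical integration of
    p(X_{n+1}=1 | x^n) = integral of theta * w(theta | x^n) dtheta,
    with w(theta | x^n) proportional to theta^{n1} (1-theta)^{n0} (uniform prior)."""
    n1 = sum(x)
    n0 = len(x) - n1
    num = den = 0.0
    for k in range(1, grid):
        theta = k / grid
        post_unnorm = theta ** n1 * (1 - theta) ** n0
        num += theta * post_unnorm
        den += post_unnorm
    return num / den

data = [1, 0, 1, 1, 0, 1]                    # n = 6 observations, n1 = 4 ones
print(laplace_rule(sum(data), len(data)))    # (4+1)/(6+2) = 0.625
print(bayes_predictive_numeric(data))        # approx. 0.625
```

With 4 ones out of 6 outcomes, both routes give (4+1)/(6+2) = 0.625.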
Posterior Odds & Bayes Factor
• Posterior odds = prior odds × Bayes factor:
P(H_1 | X^n) / P(H_0 | X^n) = ( w(H_1) / w(H_0) ) · ( p_{H_1}(X^n) / p_{H_0}(X^n) )
Hypothesis Testing via Bayes Factors
• Bayes factor testing: an alternative to Neyman-Pearson testing
• First, a very special case: H_0 and H_1 are both point (simple) hypotheses, just like last week
• E.g. our example: Bayes factor = p_1(X^n) / p_0(X^n), where p_j is the density of X^n under H_j
Example: testing whether a coin is fair
• …wait! Last week we saw the same formula as an e-process for testing H_0 against H_1!?
Today, Lecture 4
1. Bayesian Statistics
• Bayesian prediction, testing
2. E-Processes with simple nulls
• Simple alternative: GRO e-variable
• Composite alternative: learning in Bayesian & non-Bayesian manner
3. Bayesian vs. Neyman-Pearson vs. E-Process Testing with simple nulls
E-Processes and Betting
• If the null is true, you do not expect to gain any money, under any stopping time, no matter what strategy p̄_1 you use
• If you think the alternative is a specific p_1, then using p̄_1 = p_1 is a good idea
• “constant” strategy
• If you think H_0 is wrong, but you do not know which alternative is true, then… you can try to learn p_1
• Use a p̄_1 that better and better mimics the true, or just “best”, fixed p_1
Simple H_1, log-optimal betting
If null and alternative are simple, H_0 = {P_0}, H_1 = {P_1}, and X_1, X_2, … are i.i.d. according to P_1, then using p̄_1 = p_1 is a good idea. Why?
• For any choice of e-variable S_i = s(X_i), we have, with S^(n) = ∏_{i=1}^n s(X_i),
(1/n) · log S^(n) = (1/n) · Σ_{i=1}^n log S_i → E_{X∼P_1}[log s(X)],   P_1-a.s.
(illustrated by the simulation sketch below)
• …hence if we measure evidence against H_0 with the same e-variable s(X_i) at each i, we would like to pick the s*(X) maximizing E_{X∼P_1}[log s(X)] over all e-variables s(X) for H_0: it leads a.s. to exponentially more money than any other e-variable!
• The argument can be extended: ∏_{i=1}^n s*(X_i) remains best even among all (non-time-constant) e-processes
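A small simulation of the convergence claimed above, assuming (purely for illustration) P_0 = Ber(1/2) and P_1 = Ber(0.7) with s the LR e-variable; the limit E_{X∼P_1}[log s(X)] then equals KL(P_1 ‖ P_0):

```python
import math, random

# Illustrative assumption: P0 = Bernoulli(1/2) (null), P1 = Bernoulli(0.7) (true alternative).
p0, p1 = 0.5, 0.7

def lr_evalue(x: int) -> float:
    """Likelihood-ratio e-variable s(X) = p1(X)/p0(X) for one Bernoulli outcome."""
    return (p1 if x == 1 else 1 - p1) / (p0 if x == 1 else 1 - p0)

random.seed(1)
n = 100_000
log_capital = 0.0
for _ in range(n):
    x = 1 if random.random() < p1 else 0        # data generated by P1
    log_capital += math.log(lr_evalue(x))

# Expected growth rate E_{X~P1}[log s(X)] = KL(P1 || P0) for the LR e-variable.
expected = p1 * math.log(p1 / p0) + (1 - p1) * math.log((1 - p1) / (1 - p0))
print(log_capital / n, expected)                # both approx. 0.082
```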
Simple H_1, log-optimal betting
• It turns out that the maximum is achieved for s*(X) = p_1(X) / p_0(X): the LR (likelihood-ratio) e-variable
• We say: betting according to p_1(X_i) at each X_i is log-optimal or GRO (GRO = Growth-Rate Optimal)
• We say that the LR e-variable s*(X) is log-optimal/GRO
• Note that many sub-log-optimal e-variables exist as well, e.g. λ + (1 − λ) · p_1(X)/p_0(X) for any λ ∈ [0,1], or the Neyman-Pearson e-variable (growth rates compared in the sketch below)
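A sketch comparing growth rates for the λ-mixtures just mentioned, again under the illustrative assumption P_0 = Ber(1/2), P_1 = Ber(0.7); the pure LR e-variable (λ = 0) achieves the largest E_{P_1}[log s(X)]:

```python
import math

# Illustrative assumption: P0 = Ber(1/2), P1 = Ber(0.7).
p0, p1 = 0.5, 0.7

def growth_rate(lam: float) -> float:
    """E_{X~P1}[log s_lam(X)] for s_lam(X) = lam + (1-lam) * p1(X)/p0(X).
    Each s_lam is an e-variable for H0, since E_{P0}[s_lam(X)] = lam + (1-lam)*1 = 1."""
    total = 0.0
    for x, prob in [(1, p1), (0, 1 - p1)]:
        lr = (p1 if x else 1 - p1) / (p0 if x else 1 - p0)
        total += prob * math.log(lam + (1 - lam) * lr)
    return total

for lam in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(lam, round(growth_rate(lam), 4))
# lam = 0 (the pure LR e-variable) gives the largest growth rate;
# lam = 1 (never bet) gives growth rate 0.
```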
Composite H_1
• If you think H_0 is wrong, but you do not know which alternative is true, then… you can try to learn p_1
• Use a p̄_1 that better and better mimics the true, or just “best”, fixed p_1
Example: H_0: X_i ∼ Ber(1/2), H_1: X_i ∼ Ber(θ), θ ≠ 1/2. Set:
p̄_1(X_{n+1} = 1 | x^n) := (n_1 + 1)/(n + 2), where n_1 is the nr of 1s in x^n
…we use notation for conditional probabilities, but we should really think of p̄_1 as a sequential betting strategy, with the “conditional probabilities” indicating how to bet/invest in the next round, given the past data
Composite H_1
• If you think H_0 is wrong, but you do not know which alternative is true, then… you can try to learn p_1
• Use a p̄_1 that better and better mimics the true, or just “best”, fixed p_1
Example: H_0: X_i ∼ Ber(1/2), set:
p̄_1(X_{n+1} = 1 | x^n) := (n_1 + 1)/(n + 2), where n_1 is the nr of 1s in x^n
…still, formally, using telescoping-in-reverse, we find that p̄_1 also uniquely defines a marginal probability distribution for X^n, for each n, and our accumulated capital at time n is again given by the likelihood ratio:
p̄_1(X^n) / p_0(X^n) = ∏_{i=1}^n p̄_1(X_i | X^{i−1}) / p_0(X_i | X^{i−1})
Composite H_1
Example: H_0: X_i ∼ Ber(1/2), set:
p̄_1(X_{n+1} = 1 | x^n) := (n_1 + 1)/(n + 2), where n_1 is the nr of 1s in x^n
Using telescoping-in-reverse, we find that p̄_1 also uniquely defines a marginal probability distribution for X^n, for each n, and our accumulated capital at time n is again given by the likelihood ratio (verified numerically in the sketch below):
∏_{i=1}^n p̄_1(X_i | X^{i−1}) / p_0(X_i) = p̄_1(X^n) / p_0(X^n) = ( ∫ p_θ(X^n) w(θ) dθ ) / p_0(X^n),
with w the uniform prior on θ
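A sketch of the resulting e-process for the fair-coin example (assuming, as above, that the Laplace conditionals correspond to a uniform prior on θ); it checks numerically that the product of conditional ratios equals the Bayes-marginal likelihood ratio:

```python
from math import comb

# Assumptions: H0 = Ber(1/2); p1-bar given by Laplace's rule (uniform prior on theta).
# We check the telescoping identity: the product of conditional likelihood ratios
# equals the Bayes-marginal likelihood ratio. Names are illustrative.

def eprocess_capital(x):
    """Accumulated capital after betting with Laplace conditionals against Ber(1/2)."""
    capital, n1 = 1.0, 0
    for i, xi in enumerate(x):
        p_one = (n1 + 1) / (i + 2)                 # Laplace rule based on first i outcomes
        p1_bar = p_one if xi == 1 else 1 - p_one   # conditional prob. of the observed bit
        capital *= p1_bar / 0.5                    # divide by p0(x_i) = 1/2
        n1 += xi
    return capital

def bayes_marginal_ratio(x):
    """Uniform-prior marginal p1bar(x^n) = 1 / ((n+1) * C(n, n1)), over p0(x^n) = 2^-n."""
    n, n1 = len(x), sum(x)
    return (1.0 / ((n + 1) * comb(n, n1))) / (0.5 ** n)

x = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
print(eprocess_capital(x), bayes_marginal_ratio(x))   # the two values coincide
```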
Composite H_1: plug-in vs. Bayes
H_1 = { N(μ, 1) : μ ∈ ℝ }
• plug-in: normal density with mean ( Σ_{i=1}^n X_i + a ) / (n + 1), variance 1
• Bayes with normal prior N(a, ρ): Bayes predictive distribution with the same mean but variance 1 + ρ_n > 1 (“out-model”), where ρ_n is the posterior variance of μ (conjugate formulas sketched below)
• Other models: differences even more substantial
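For reference, the standard conjugate-normal computation behind this slide, assuming the prior N(a, ρ) is parametrized by its variance ρ (the slide does not state the parametrization explicitly):

```latex
% Model: X_i ~ N(mu, 1); prior: mu ~ N(a, rho), with rho the prior VARIANCE (assumption).
\[
\mu \mid x^n \;\sim\; N\!\left(\frac{a/\rho + \sum_{i=1}^n x_i}{1/\rho + n},\;\; \frac{1}{1/\rho + n}\right),
\qquad
X_{n+1} \mid x^n \;\sim\; N\!\left(\frac{a/\rho + \sum_{i=1}^n x_i}{1/\rho + n},\;\; 1 + \frac{1}{1/\rho + n}\right).
\]
% The plug-in strategy uses the same mean with variance exactly 1, so it stays inside H_1;
% the Bayes predictive has variance strictly larger than 1 and falls outside H_1 ("out-model").
% For rho = 1 the common mean is (a + sum_i x_i)/(n+1), matching the plug-in mean on the slide.
```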
General Insight for Simple Nulls, Composite Alternatives
• If the null is simple, every Bayes factor defines an e-process: writing q for the Bayes marginal distribution of X^n under H_1,
E_{X^n∼P_0}[ q(X^n) / p_0(X^n) ] = ∫ p_0(x^n) · ( q(x^n) / p_0(x^n) ) dx^n = ∫ q(x^n) dx^n = 1
(checked exactly for the coin example in the sketch below)
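A quick exact check of this identity for the coin example (assumption: q is the uniform-prior Bayes marginal, n = 8), summing over all 2^n sequences rather than sampling:

```python
from itertools import product
from math import comb

# Exact check that the Bayes-factor e-variable has expectation 1 under the null,
# for H0 = Ber(1/2) and q = the uniform-prior Bayes marginal (assumed setup).

n = 8

def q(x):
    """Uniform-prior Bayes marginal probability of a binary sequence x."""
    n1 = sum(x)
    return 1.0 / ((len(x) + 1) * comb(len(x), n1))

def p0(x):
    """Fair-coin probability of the sequence x."""
    return 0.5 ** len(x)

expectation = sum(p0(x) * (q(x) / p0(x)) for x in product([0, 1], repeat=n))
print(expectation)   # 1.0 (up to floating-point rounding)
```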
Today, Lecture 4
1. Bayesian Statistics
• Bayesian prediction, testing
2. E-Processes with simple nulls
• Simple alternative: GRO e-variable
• Composite alternative: learning in Bayesian & non-Bayesian manner
3. Bayesian vs. Neyman-Pearson vs. E-Process Testing with simple nulls
Similarities & Differences
Bayes Factor vs. Neyman-Pearson vs. E-Testing
• In Bayesian testing, the roles of H_0 and H_1 are symmetrical
• In NP and E-testing they are not
• Type-I error control is the most important
• May seem like a bug, but turns out to be a feature when moving to confidence intervals
• Even though the philosophies are different, we can still try to compare the methods more closely
• As a Bayesian you can report the full posterior, but it is also fine to merely use the posterior as a tool if your goal is to make a specific decision (which, like in NP theory, can e.g. be ‘accept’ or ‘reject’)
• It then makes sense to reject the null if the Bayes posterior for H_0 is smaller than α, since then the conditional (on the data) Type-I error, i.e. the probability that H_0 is true given that you reject it, is bounded by α:
P(H_0 is true | δ(X^n) = reject) ≤ α
The Bayesian’s Conditional Type-I Error
P(H_0 is true | δ(X^n) = reject) ≤ α
• P(H_0 is true | {X^n : δ(X^n) = reject})
= E_{X^n ∼ P | δ(X^n) = reject}[ P(H_0 is true | X^n) ]   (tower rule; P is the Bayes marginal distribution of X^n)
≤ E_{X^n ∼ P | δ(X^n) = reject}[ α ] = α   (the rule rejects only when P(H_0 is true | X^n) ≤ α)
BF in “some sense” less conservative than E
• With α = 0.05 = 1/20 and w(H_0) = w(H_1) = 1/2, P(H_0 | X^n) ≤ 1/20 is equivalent to Bayes factor ≥ 19 (worked out below)
• The Bayesian would reject the null if BF ≥ 19 and would get a conditional Type-I error probability bound of 0.05
• The E-statistician, who uses Bayesian learning for H_1, would reject the null if BF ≥ 20 and get an unconditional Type-I error probability bound of 0.05
• Conditional bounds imply unconditional ones (why?) but not vice versa
• It seems the Bayesian gets a better bound with a less conservative rule!?
• This is possible because the Bayesian makes much stronger assumptions:
E-bounds hold irrespective of whether the (uniform) prior on H_1 is “correct”; Bayesian bounds rely on correctness of this prior
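Spelling out the arithmetic behind the “19 vs. 20” contrast (this only restates the slide’s own numbers):

```latex
% With w(H_0) = w(H_1) = 1/2 the posterior odds equal the Bayes factor, so
\[
P(H_0 \mid X^n) \;=\; \frac{1}{1 + \mathrm{BF}} \;\le\; \frac{1}{20}
\quad\Longleftrightarrow\quad \mathrm{BF} \;\ge\; 19,
\]
% whereas the E-statistician applies Ville's / Markov's inequality to the e-process,
% which requires the threshold 1/alpha = 20 for the unconditional Type-I bound:
\[
P_{H_0}\bigl(\exists\, n:\ S^{(n)} \ge 20\bigr) \;\le\; \tfrac{1}{20} \;=\; 0.05 .
\]
```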
BF usually more conservative than NP
• With α = 0.05 = 1/20 and w(H_0) = w(H_1) = 1/2, P(H_0 | X^n) < 1/20 is equivalent to BF > 19
• Suppose H_0, H_1 are simple (so the Bayes factor = LR), α = 0.05
• NP: reject the null if LR ≥ ℓ, where ℓ is such that P_{H_0}(LR ≥ ℓ) = 0.05, i.e. p ≤ 0.05
• (in contrast to BF and E, the NP test does not depend on the actual alternative P_1 ∈ H_1 or a prior thereon; this is one advantage of it!)
How difficult is p < 0.05 as a function of n? (fair-coin example; reproduced by the sketch below)
n                    10     20     30     50     100     200     500
nr of 1s needed      ≥ 9    ≥ 15   ≥ 20   ≥ 32   ≥ 59    ≥ 113   ≥ 269
as fraction of n     90%    75%    67%    64%    59%     56%     54%
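A sketch reproducing the table, assuming the p-values are one-sided binomial tail probabilities under the fair-coin null (an assumption of mine, but consistent with the tabulated thresholds):

```python
from math import comb

# Smallest number of 1s (heads) k such that the one-sided binomial p-value
# P_{H0}(#1s >= k) is at most alpha, for a fair coin and n tosses.
def threshold(n: int, alpha: float = 0.05) -> int:
    for k in range(n + 1):
        p_value = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
        if p_value <= alpha:
            return k
    return n + 1  # not reached for the n used here

for n in [10, 20, 30, 50, 100, 200, 500]:
    k = threshold(n)
    print(n, k, f"{100 * k / n:.0f}%")
# Matches the table: 9/10 = 90%, 15/20 = 75%, ..., 269/500 is about 54%.
```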