Ch1 - Bayesian Analysis

Uploaded by sayantan bhunia

Statistical Decision Theory and Bayesian Analysis
Chapter 1: Basic Concepts

9139511

Outline

• Introduction
• Basic Elements
• Bayesian Expected Loss
• Frequentist Risk
• Randomized Decision Rules
• Decision Principles
• Misuse of Classical Inference Procedures

Introduction

• Decision theory is concerned with the problem of making decisions.
• Statistical decision theory is concerned with making decisions in the presence of statistical knowledge which sheds light on some of the uncertainties involved in the decision problem (this unknown quantity will be denoted θ).
• Classical statistics is directed towards the use of sample information in making inferences.
• In decision theory, an attempt is made to combine the sample information with the loss function and prior information.

Basic Elements

• The unknown quantity θ which affects the decision process is commonly called the state of nature.
• The symbol Θ will be used to denote the set of all possible states of nature; θ is also called the parameter, and Θ the parameter space.
• Decisions are more commonly called actions in the literature. Particular actions will be denoted by a, while the set of all possible actions under consideration will be denoted 𝒜.
• The loss function L(θ, a) is defined for all (θ, a) ∈ Θ × 𝒜. For technical convenience, only loss functions satisfying L(θ, a) ≥ −K > −∞ are considered.
• The outcome (observations) will be denoted X (possibly a vector); the set of possible outcomes is the sample space, denoted 𝒳.
• P_θ(A) denotes the probability of the event A when θ is the true state of nature.
Bayesian Expected Loss

• Definition 1.
  If π*(θ) is the believed probability distribution of θ at the time of decision making, the Bayesian expected loss of an action a is

      ρ(π*, a) = E^{π*}[L(θ, a)] = ∫_Θ L(θ, a) dF^{π*}(θ)

Frequentist Risk

• Definition 2.
  A decision rule δ(x) is a function from 𝒳 into 𝒜. If X = x is the observed value of the sample information, then δ(x) is the action that will be taken. Two decision rules, δ₁(x) and δ₂(x), are considered equivalent if P_θ(δ₁(X) = δ₂(X)) = 1 for all θ.

Frequentist Risk (cont.)

• Definition 3.
  The risk function of a decision rule δ(x) is defined by

      R(θ, δ) = E_θ^X[L(θ, δ(X))] = ∫_𝒳 L(θ, δ(x)) dF(x | θ)

• Definition 4.
  A decision rule δ₁ is R-better than a decision rule δ₂ if R(θ, δ₁) ≤ R(θ, δ₂) for all θ ∈ Θ, with strict inequality for some θ. A rule δ₁ is R-equivalent to δ₂ if R(θ, δ₁) = R(θ, δ₂) for all θ ∈ Θ.

• Definition 5.
  A decision rule δ is admissible if there exists no R-better decision rule. A decision rule δ is inadmissible if there does exist an R-better decision rule.

• Example (loss matrix L(θᵢ, aⱼ)):

           a₁   a₂   a₃
      θ₁    1    3    4
      θ₂   −1    5    5
      θ₃    0   −1   −1

Frequentist Risk (cont.)

• Let D = {all decision rules δ : R(θ, δ) < ∞ for all θ ∈ Θ}.

• Definition 6.
  The Bayes risk of a decision rule δ, with respect to a prior distribution π on Θ, is defined as

      r(π, δ) = E^π[R(θ, δ)]

Randomized Decision Rules

• Definition 7.
  A randomized decision rule δ*(x, ·) is, for each x, a probability distribution on 𝒜, with the interpretation that if x is observed, δ*(x, A) is the probability that an action in the set A will be chosen.
Randomized Decision Rules (cont.)

• Definition 8.
  The loss function L(θ, δ*(x, ·)) of the randomized rule δ* is defined to be

      L(θ, δ*(x, ·)) = E^{δ*(x, ·)}[L(θ, a)]

  The risk function of δ* will then be defined to be

      R(θ, δ*) = E_θ^X[L(θ, δ*(X, ·))]

• Matching Pennies (loss matrix):

           a₁   a₂
      θ₁   −1    1
      θ₂    1   −1

• Definition 9.
  Let D* be the set of all randomized decision rules δ* for which R(θ, δ*) < ∞ for all θ ∈ Θ. The concepts introduced in Definitions 4 and 5 will henceforth be considered to apply to all randomized rules in D*.

Decision Principles

• The Conditional Bayes Principle
  Choose an action a ∈ 𝒜 which minimizes ρ(π*, a). Such an action will be called a Bayes action and will be denoted a^{π*}.

• The Bayes Risk Principle
  A decision rule δ₁ is preferred to a rule δ₂ if

      r(π, δ₁) < r(π, δ₂)

  A decision rule which minimizes r(π, δ) is optimal; it is called a Bayes rule, and will be denoted δ^π. The quantity r(π) = r(π, δ^π) is then called the Bayes risk for π.

Decision Principles (cont.)

• The Minimax Principle
  A decision rule δ₁* is preferred to a rule δ₂* if

      sup_{θ∈Θ} R(θ, δ₁*) < sup_{θ∈Θ} R(θ, δ₂*)

• Definition 10.
  A rule δ*_M is a minimax decision rule if it minimizes sup_{θ∈Θ} R(θ, δ*) among all randomized rules in D*, i.e., if

      sup_{θ∈Θ} R(θ, δ*_M) = inf_{δ*∈D*} sup_{θ∈Θ} R(θ, δ*)
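For the Matching Pennies loss matrix shown earlier, randomization is what makes a good minimax rule attainable: either nonrandomized action has worst-case loss 1, while choosing a₁ with probability p gives losses 1 − 2p at θ₁ and 2p − 1 at θ₂, which are equalized at 0 by p = 1/2. A small grid-search sketch (an illustration, not from the slides; this is a no-data problem, so risk equals expected loss):

```python
# Matching Pennies losses: L(θ1,a1) = -1, L(θ1,a2) = 1, L(θ2,a1) = 1, L(θ2,a2) = -1.
def max_risk(p):
    """Worst-case expected loss of the randomized rule 'choose a1 with probability p'."""
    risk_theta1 = -1 * p + 1 * (1 - p)   # = 1 - 2p
    risk_theta2 = 1 * p + -1 * (1 - p)   # = 2p - 1
    return max(risk_theta1, risk_theta2)

grid = [i / 1000 for i in range(1001)]
p_star = min(grid, key=max_risk)
print(p_star, max_risk(p_star))  # 0.5 0.0: the minimax rule randomizes evenly
```

The nonrandomized extremes p = 0 and p = 1 both have worst-case risk 1, so here inf over D* (value 0) is strictly smaller than the best achievable with nonrandomized rules.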

Decision Principles (cont.)

• The Invariance Principle
  The invariance principle basically states that if two problems have identical formal structures, then the same decision rule should be used in each problem. This leads to a restriction to so-called "invariant" decision rules.

Misuse of Classical Inference Procedures

• For a large enough sample size, the classical test will be virtually certain to reject, because a point null hypothesis is almost certainly not exactly true, and this will always be confirmed by a large enough sample.
• On the other hand, a "statistically significant" difference between the true parameter and the null hypothesis can be a practically unimportant difference.

  Example: X₁, X₂, ..., Xₙ ~ N(θ, 1), with test statistic

      (X̄ − θ₀) / (1/√n)
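The large-sample point can be made concrete numerically: if the true mean is θ = θ₀ + ε for a small ε, the rejection probability of the level-0.05 two-sided z-test tends to 1 as n grows. A sketch using the standard normal CDF (ε = 0.01 is an arbitrary illustrative value, not from the slides):

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power(eps, n, z_crit=1.96):
    """P(reject H0: θ = θ0) for the two-sided z-test when the true mean is θ0 + eps."""
    shift = eps * math.sqrt(n)  # mean of the test statistic under the true θ
    return (1 - norm_cdf(z_crit - shift)) + norm_cdf(-z_crit - shift)

for n in (100, 10_000, 1_000_000):
    print(n, round(power(0.01, n), 3))
# The rejection probability climbs toward 1 as n grows,
# even though ε = 0.01 may be practically negligible.
```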
Misuse of Classical Inference Procedures (cont.)

• The Frequentist Perspective: the risk function has not eliminated the dependence on θ:

      R(θ, δ) = E_θ^X[L(θ, δ(X))] = ∫_𝒳 L(θ, δ(x)) dF(x | θ)

• The Conditional Perspective: test H₀: θ = 0 against H₁: θ = 1 at level α = 0.01, with rejection region R = {x : x ∈ {1, 2}}:

         x        1        2        3
      f(x | 0)  0.005    0.005    0.99
      f(x | 1)  0.0051   0.9849   0.01

• The standard frequentist, observing x = 1, would report that the decision is H₁ and that the test had error probabilities of 0.01.
• But the likelihood ratio at x = 1,

      λ(x) = sup_{θ∈Θ₀} L(θ | x) / sup_{θ∈Θ} L(θ | x) = 0.005 / 0.0051 ≈ 0.98,

  is very close to 1. To a conditionalist, this means the data do very little to distinguish θ = 0 from θ = 1.
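The error probabilities and the likelihood ratio in this example can be verified directly from the table (a small sketch, not part of the original slides):

```python
# Sampling distributions from the table.
f0 = {1: 0.005, 2: 0.005, 3: 0.99}    # f(x | θ = 0)
f1 = {1: 0.0051, 2: 0.9849, 3: 0.01}  # f(x | θ = 1)
reject = {1, 2}                        # rejection region of the level-0.01 test

type_I = sum(f0[x] for x in reject)                   # P(reject | H0) = 0.01
type_II = sum(p for x, p in f1.items() if x not in reject)  # P(accept | H1) = 0.01
lam = f0[1] / f1[1]  # likelihood ratio at the observed x = 1

print(round(type_I, 4), round(type_II, 4), round(lam, 4))  # 0.01 0.01 0.9804
```

Both frequentist error probabilities are a reassuring 0.01, yet at the actually observed x = 1 the likelihoods under θ = 0 and θ = 1 are nearly identical.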

The Likelihood Principle

• The Likelihood Principle: In making inferences or decisions about θ after x is observed, all relevant experimental information is contained in the likelihood function for the observed x. Furthermore, two likelihood functions contain the same information about θ if they are proportional to each other:

      L₁(θ | x) ∝ L₂(θ | x)

Sufficient Statistics

• Definition: Let X be a random variable whose distribution depends on the unknown parameter θ but is otherwise known. A function T of X is said to be a sufficient statistic for θ if the conditional distribution of X, given T(X) = t, is independent of θ.
• Theorem: Assume that T is a sufficient statistic for θ, and let δ₀*(x, ·) be any randomized rule in D*. Then there exists a randomized rule δ₁*(t, ·), depending only on T(x), which is R-equivalent to δ₀*.

Convexity

• Theorem: Assume that 𝒜 is a convex subset of ℝᵐ, and that for each θ ∈ Θ the loss function L(θ, a) is a convex function of a. Let δ* be a randomized decision rule in D* for which E^{δ*(x, ·)}[|a|] < ∞ for all x ∈ 𝒳. Then the nonrandomized rule

      δ(x) = E^{δ*(x, ·)}[a]

  satisfies L(θ, δ(x)) ≤ L(θ, δ*(x, ·)) for all x and θ.

Conclusion

• Bayesian analysis combines the sample information with the loss function and prior information.
• It can remedy some of the defects of classical statistics.
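The theorem can be illustrated with the convex loss L(θ, a) = (θ − a)²: replacing a randomized rule by the mean of its actions can only lower the loss. A Monte Carlo sketch (the two-point randomized rule and θ = 0.3 are arbitrary illustrative choices, not from the slides):

```python
import random

random.seed(0)
theta = 0.3
loss = lambda a: (theta - a) ** 2  # squared-error loss, convex in a

# A randomized rule that, for the observed x, picks action 0.0 or 1.0 with prob 1/2 each.
actions = [0.0, 1.0]
draws = [random.choice(actions) for _ in range(100_000)]

randomized_loss = sum(loss(a) for a in draws) / len(draws)  # ≈ E^{δ*}[L(θ, a)] = 0.29
collapsed = sum(actions) / len(actions)                     # δ(x) = E^{δ*}[a] = 0.5
print(loss(collapsed), randomized_loss)
# loss(0.5) = 0.04 is well below ≈ 0.29, as the theorem (via Jensen's inequality) predicts.
```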
Example

• Consider the situation of a drug company deciding whether or not to market a new pain reliever. Two of the many factors affecting its decision are the proportion of people for which the drug will prove effective, θ₁, and the proportion of the market the drug will capture, θ₂. Both θ₁ and θ₂ will generally be unknown.
• Assume n people are interviewed, and the number X who would buy the drug is observed. It might be reasonable to assume that X is B(n, θ₂):

      f(x | θ₂) = (n choose x) θ₂ˣ (1 − θ₂)ⁿ⁻ˣ

• Loss function:

      L(θ₂, a) = θ₂ − a       if θ₂ − a > 0
                 2(a − θ₂)    if θ₂ − a ≤ 0

• From historical records, assume the prior distribution

      π(θ₂) = 10 · I₍₀.₁, ₀.₂₎(θ₂)

Example (cont.)

• Bayesian expected loss of an action a:

      ρ(π, a) = E^π[L(θ₂, a)] = ∫₀¹ L(θ₂, a) π(θ₂) dθ₂

              = 0.15 − a            if a ≤ 0.1
                15a² − 4a + 0.3     if 0.1 ≤ a ≤ 0.2
                2a − 0.3            if a ≥ 0.2

• Frequentist risk of the rule δ(x) = x/n:

      R(θ₂, δ) = E_{θ₂}^X[L(θ₂, δ(X))]
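The piecewise formula for ρ(π, a) can be checked by numerical integration against the prior π(θ₂) = 10·I₍₀.₁,₀.₂₎(θ₂), and the Bayes action read off from the quadratic piece (a minimal sketch, assuming the loss and prior stated above):

```python
def loss(theta2, a):
    """Piecewise loss from the example: underestimation costs θ2 - a, overestimation 2(a - θ2)."""
    return theta2 - a if theta2 - a > 0 else 2 * (a - theta2)

def rho(a, steps=100_000):
    """Bayesian expected loss: midpoint-rule integral of L(θ2, a) · 10 over (0.1, 0.2)."""
    h = 0.1 / steps
    return sum(loss(0.1 + (i + 0.5) * h, a) * 10 * h for i in range(steps))

def rho_closed(a):
    """The piecewise closed form stated above."""
    if a <= 0.1:
        return 0.15 - a
    if a <= 0.2:
        return 15 * a**2 - 4 * a + 0.3
    return 2 * a - 0.3

# The quadratic piece 15a^2 - 4a + 0.3 is minimized at a = 2/15 ≈ 0.133: the Bayes action.
print(round(rho(2 / 15), 4), round(rho_closed(2 / 15), 4))  # 0.0333 0.0333
```

So the conditional Bayes principle picks a^π ≈ 0.133, slightly below the prior midpoint 0.15 because overestimating θ₂ is penalized twice as heavily as underestimating it.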
