Lecture08 Hypothesis Testing Inf Stats FA24
Applied Probability
Hypothesis Testing
Ch-9 from Michael Baron’s Book
Ch-7 from Forsyth’s Book
Introduction
Null and Alternative Hypothesis
• Suppose we are interested in the burning rate of solid propellant in aircrew
escape systems, which is a random variable (RV) described by a probability distribution.
• Specifically, we are interested in the mean burning rate μ, i.e. in deciding whether
or not this value is 50 cm/s. This can be formally expressed as:
H0: μ = 50 cm/s
H1: μ ≠ 50 cm/s
Type I and Type II errors
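• When testing hypotheses, all we see is a random sample.
• Therefore, even with the best statistical skills, our
decision to accept or to reject H0 may still be wrong
because of sampling error.
• Four situations are possible:
o H0 is true and we accept H0: correct decision
o H0 is true and we reject H0: Type I error
o H0 is false and we accept H0: Type II error
o H0 is false and we reject H0: correct decision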
Type I and Type II errors (Cont)
• Each error occurs with a certain probability that we hope to
keep small. A good test results in an erroneous decision only
if the observed data is somewhat extreme.
• Type I errors are usually considered the more undesirable, e.g.
convicting an innocent defendant (innocence being the null
hypothesis). Their probability is therefore bounded by a
pre-assigned small number α, and the test is then designed to
minimize the probability of a Type II error.
• The Type I error probability is called the significance level of
the test, or α-error.
• The power of a statistical test is the probability of rejecting the
null hypothesis H0 when the alternative hypothesis is true,
i.e. 1 − P(Type II error).
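The relationship between α, the Type II error probability and power can be sketched numerically for the burning-rate example. This is a minimal illustration that assumes a two-sided z-test with a known population standard deviation; the values σ = 2 cm/s, n = 25 and a true mean of 51 cm/s are hypothetical and not taken from the lecture, and `z_test_power` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a two-sided z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # reject H0 when |Z| > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # mean of Z under the true mean
    # P(Z > z_crit) + P(Z < -z_crit) when Z ~ Normal(shift, 1)
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Hypothetical numbers (assumptions): sigma = 2 cm/s, n = 25, true mean 51 cm/s
power = z_test_power(mu0=50, mu_true=51, sigma=2, n=25)
beta = 1 - power  # probability of a Type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true mean equals the hypothesized mean, the same function returns α, since every rejection is then a Type I error.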
The P-value
• The P-value is the probability that we would have seen
our data (or something more extreme) just by
chance if the null hypothesis (null value) is true.
• A small P-value means data like ours would be unlikely
if the null value were true, which leads to rejecting
the null hypothesis.
• More formally, the P-value is the smallest level of
significance that would lead to rejection of the
null hypothesis H0 with the given data.
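As a rough sketch of the definition, the P-value of an observed z-statistic can be computed from the standard normal CDF. The function name `p_value_from_z` and its `alternative` argument are illustrative choices, and the sketch assumes the test statistic is standard normal under H0.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """P-value of an observed z-statistic, assuming Z ~ Normal(0, 1) under H0."""
    std_normal = NormalDist()
    if alternative == "two-sided":
        # Probability of a statistic at least as far from 0 as the one observed
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

p = p_value_from_z(2.5)
print(f"two-sided p-value for z = 2.5: {p:.4f}")
print("reject H0 at alpha = 0.05" if p <= 0.05 else "do not reject H0 at alpha = 0.05")
```

H0 is rejected at level α exactly when the P-value is at most α, which matches the "smallest level of significance" reading of the definition.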
The P-value (Cont)
• By convention, p-values < 0.05 are often accepted as
“statistically significant” in the medical literature, but this is
an arbitrary cutoff.
• A cutoff of p < 0.05 means that, when the null hypothesis is
true, only about 5 of 100 experiments would give a result that
appears significant just by chance (a Type I error). This small
error rate is why 0.05 is commonly used as the threshold
for rejecting the null hypothesis.
o The null hypothesis is rejected when the observed result is too
unusual to be plausibly explained by chance alone, so the data
is taken as evidence against H0.
• It is customary to call the test statistic (and the data)
significant when the null hypothesis H0 is rejected.
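The "about 5 of 100 experiments" statement can be checked with a small simulation. This is only a sketch: it assumes normally distributed data with a known σ and a two-sided z-test, and the population values (mean 50, σ = 2, samples of size 30) and the helper name `false_positive_rate` are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def false_positive_rate(n_experiments=2000, n=30, mu0=50.0, sigma=2.0,
                        alpha=0.05, seed=0):
    """Fraction of simulated experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_experiments):
        # Draw a sample from a population where H0 (mean = mu0) really holds
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / n_experiments

rate = false_positive_rate()
print(f"fraction of experiments rejecting a true H0: {rate:.3f}")
```

The printed fraction should land near α; it will not equal 0.05 exactly because the number of simulated experiments is finite.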
Explanation with Example
Example (Contd)
Summary: Hypothesis Testing
Significance
• Let the null hypothesis be that the average body temperature is 95 °F.
• Collect the temperatures of a random sample of N people; the
sample mean is unlikely to be exactly 95 °F.
• We must explain the difference between
the sample mean and our hypothesized value.
• The hypothesis may be wrong, or the difference may
just be due to the sample being randomly chosen.
• The significance of the evidence against the hypothesis is assessed by
finding what fraction of samples would give
sample means like the one we observe if the hypothesis
is true.
Evaluating Significance
• Use the t-statistic with N − 1 degrees of freedom if N < 30,
or the Z-statistic if N ≥ 30
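The rule above can be sketched in code. This is an illustration under stated assumptions, not the lecture's own procedure: the statistic is taken to be (sample mean − hypothesized mean) / (sample standard deviation / √N), the t-distribution tail is obtained by numerically integrating its density with Simpson's rule (the Python standard library has no t CDF), and the temperature readings are made-up values. The names `t_pdf`, `t_two_sided_p` and `one_sample_test` are hypothetical helpers.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist, fmean, stdev

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (intended for small df)."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value for a t-statistic, via Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # P(-|t| <= T <= |t|)
    return max(0.0, 1 - central)

def one_sample_test(sample, mu0):
    """Return (statistic kind, statistic value, two-sided p-value)."""
    n = len(sample)
    stat = (fmean(sample) - mu0) / (stdev(sample) / sqrt(n))
    if n < 30:
        return "t", stat, t_two_sided_p(stat, n - 1)
    return "z", stat, 2 * (1 - NormalDist().cdf(abs(stat)))

# Hypothetical body temperatures in °F, tested against H0: mean = 95
temps = [97.9, 98.4, 98.6, 98.2, 98.8, 98.1, 98.5, 98.3]
kind, stat, p = one_sample_test(temps, mu0=95)
print(f"{kind}-statistic = {stat:.2f}, two-sided p-value = {p:.4g}")
```

For large N the two branches give nearly the same answer, which is why the normal table is commonly used once N reaches about 30.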
Rejection Region of Null Hypothesis
P-value: Rejection Region, i.e. Extreme Fraction
A Hypothesis
P-value: Election Polling
Z-Table
P-value Caution
Some Problems From Forsyth
Example 7.1
• Assume the mean weight of a male chow-eating
mouse is 35 g and the standard error of a sample
of 44 such mice is 0.827 g. What fraction of
samples of 44 such mice will have a sample mean
in the range 33–37 g?
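Solution: Example 7.1
• The statistic to use is $T = \frac{s - 35}{0.827}$, where $s$ is the sample mean
• The question is asking for the probability that $T$
takes a value in the range
$\left[\frac{33 - 35}{0.827}, \frac{37 - 35}{0.827}\right]$, i.e. $[-2.41, 2.41]$
• So, treating $T$ as approximately standard normal (43 degrees of
freedom is large), the fraction is
$P(-2.41 \le T \le 2.41) = 2\Phi(2.41) - 1 \approx 0.984$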
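The calculation for Example 7.1 can be checked numerically. This sketch assumes the standardized sample mean is close enough to standard normal (43 degrees of freedom) that the Z-table applies; the variable names are illustrative.

```python
from statistics import NormalDist

mu = 35.0        # hypothesized population mean weight (from the example)
stderr = 0.827   # standard error for samples of 44 mice (from the example)

# Standardize the endpoints of the range 33-37
lower = (33 - mu) / stderr
upper = (37 - mu) / stderr

# Fraction of samples whose mean falls in the range, under the normal approximation
std_normal = NormalDist()
fraction = std_normal.cdf(upper) - std_normal.cdf(lower)
print(f"standardized range = [{lower:.3f}, {upper:.3f}], fraction = {fraction:.3f}")
```

The same number comes from reading Φ at the upper endpoint off a Z-table and computing 2Φ − 1, because the range is symmetric about the mean.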
Z-Table
Example 7-2
• Assume the population mean weight of a chow-eating
female mouse is 27.8 g and the standard error for a
sample of 48 such mice is 0.70 g. Estimate the fraction
of samples of size 48 that will have a mean weight
greater than 29 g.
• Try!
Ex 7-2: Solution
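One way to work Example 7-2 is sketched below. It assumes the standardized sample mean can be treated as standard normal (47 degrees of freedom), so the answer is an upper-tail probability from the Z-table; this is an assumption of the sketch rather than a transcription of the slide's solution.

```python
from statistics import NormalDist

mu = 27.8       # hypothesized population mean weight (from the example)
stderr = 0.70   # standard error for samples of 48 mice (from the example)

# Standardize the threshold of 29
z = (29 - mu) / stderr

# Upper-tail probability: fraction of samples with mean weight above 29
fraction = 1 - NormalDist().cdf(z)
print(f"z = {z:.3f}, fraction of samples with mean above 29 = {fraction:.3f}")
```

Only a one-sided tail is needed here because the question asks about means greater than 29, not about means far from 27.8 in either direction.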
Z-Table
Acknowledgements
• Some parts are taken from the StatQuest YouTube
channel
• Next time
o More significance tests e.g. F-test, chi-squared test etc