Inferential Statistics: Estimation & Hypothesis Testing
Estimation
Hypothesis testing
Estimation
Types of estimates
• Interval estimate: a range of values used to estimate a population parameter.
• It indicates error in two ways: by the extent of its range and by the probability of the true population parameter lying within the range.
“I estimate that the true average salary offered to our MBA
student will be between Rs. 10 lakhs and Rs. 12.1 lakhs, and it
is very likely that the true salary will fall within this interval.”
OR
“I estimate that the true average salary offered to our MBA
student will be between Rs. 10.25 lakhs and Rs. 11.85 lakhs,
and it is 99% likely that the true salary will fall within this
interval.”
Criteria for a good estimator
• Unbiasedness: the sample mean is an unbiased estimator of the population mean, because the mean of the sampling distribution of the sample mean equals the population mean.
• Efficiency: refers to the standard error of the statistic. If we
compare two statistics from a sample of the same size and try
to decide which one is the more efficient estimator, we would
pick the statistic that has the smaller standard error.
Criteria for a good estimator
• Consistency: A statistic is a consistent estimator of the
population parameter if as the sample size increases, it
becomes certain that the value of the statistic comes very
close to the value of the population parameter.
• Sufficiency: An estimator is sufficient if it makes so much use
of the information in the sample that no other estimator
could extract from the sample additional information.
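The first two criteria can be illustrated with a small simulation, a sketch with made-up parameters (a normal population with mean 50 and standard deviation 10):

```python
import random
import statistics

# Illustrative sketch (not from the slides): simulate samples from a
# normal population with known mean 50 to check two criteria above.
# Unbiasedness: the average of many sample means is close to 50.
# Consistency/efficiency: the spread of the sample means shrinks as n grows.
random.seed(42)

def sample_means(n, reps=2000, mu=50.0, sigma=10.0):
    """Return `reps` sample means, each computed from a sample of size n."""
    return [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
            for _ in range(reps)]

small = sample_means(n=10)     # high-variance estimates
large = sample_means(n=250)    # much tighter around the true mean

unbiased_check = statistics.fmean(large)   # close to the true mean, 50
spread_small = statistics.stdev(small)     # about sigma / sqrt(10)
spread_large = statistics.stdev(large)     # about sigma / sqrt(250)
```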
Point Estimation: Mean
• The sample mean: $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$.
Point Estimation: Variance
• The sample variance: $S^2(n) = \frac{\sum_{i=1}^{n} (X_i - \bar{X}_n)^2}{n - 1}$.
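As a quick sketch (the data values are made up), this formula divides by n − 1, which is also what Python's `statistics.variance` computes:

```python
import statistics

# Minimal sketch of the sample variance S^2(n): divide the sum of
# squared deviations from the sample mean by n - 1. The data values
# below are made up for illustration.
data = [0.18, 0.19, 0.17, 0.20, 0.18, 0.19]
n = len(data)
xbar = sum(data) / n
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)   # S^2(n)
```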
Point estimates: Sampling distribution
• Variance of the mean: $\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$.
• We can estimate this variance of the mean by: $\widehat{\mathrm{Var}}(\bar{X}_n) = \frac{S^2(n)}{n}$.
Point estimates: Sampling distribution
• However, in many experiments the data are correlated.
• In that case, estimating the variance of the mean by $S^2(n)/n$ is dangerous, because it typically underestimates the true variance of the mean:
$E[S^2(n)] \neq \sigma^2$, and
$E\!\left[\frac{S^2(n)}{n}\right] \neq \mathrm{Var}(\bar{X}_n)$.
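A small simulation sketch shows this underestimation; the AR(1) process with positive correlation used here is an assumption chosen for illustration, not from the slides:

```python
import random

# Illustrative sketch (assumption: an AR(1) process with correlation
# phi = 0.8; all parameters are made up). For positively correlated
# data, the naive estimate S^2(n)/n systematically underestimates
# Var(X_bar_n), which we approximate empirically over many runs.
random.seed(1)

def ar1_series(n, phi=0.8):
    """Positively correlated series: X_t = phi * X_{t-1} + noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + random.gauss(0.0, 1.0)
        out.append(x)
    return out

n, reps = 100, 2000
means, naive = [], []
for _ in range(reps):
    xs = ar1_series(n)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / (n - 1)
    means.append(m)
    naive.append(s2 / n)                   # the naive estimate S^2(n)/n

grand = sum(means) / reps
true_var = sum((m - grand) ** 2 for m in means) / (reps - 1)
avg_naive = sum(naive) / reps              # far below true_var here
```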
Hypothesis testing
Introduction
• Hypothesis testing begins with an assumption, called a
hypothesis, that we can make about a population parameter.
• Then we collect sample data, calculate sample statistics, and
use this information to decide how likely it is that our
assumption is true.
• Say we assume a certain value of the population mean. To test the validity of our assumption, we collect sample data and determine the difference between the hypothesized (assumed) value of the mean and the sample mean.
• The smaller the difference, the greater the likelihood that our hypothesized value for the mean is correct.
Introduction
• Suppose a portfolio manager tells you that the average rate of return earned on the portfolios she handles is 13%.
• How do you test the validity of her statement?
• We collect a sample of the portfolios handled by her and
calculate the average rate of return for this sample.
• If the sample average is 15%, we would be willing to accept
her claim! However, what if the sample average is 3.5%?
• We may or may not be able to interpret this result through common sense. How close is close enough for us to accept?
Example
• If we assume that the true mean thickness is 0.18 mm and
we know that the population standard deviation is 0.025 mm,
how likely is it that we would get a sample mean of 0.19 mm
from that population?
• That is, if the true mean is 0.18 mm and standard deviation is
0.025 mm, what are the chances of getting a sample mean
that differs from 0.18 by 0.01 ( = 0.19 – 0.18) mm or more?
• First, we calculate the standard error of the mean from the
population standard deviation. That turns out to be 0.0025
mm. How?
• Now we calculate the difference between the hypothesized
value and the sample mean in terms of standard errors.
Example
• And we find that the sample mean lies 4 standard errors away (to the right) from the hypothesized population mean. Why?
• Finding a data point 4 standard deviations away in a standard
normal distribution is very unlikely. Why?
• We find that the difference between the sample mean and
the hypothesized population mean is too large, and the
chance that the population would produce such a random
sample is too low.
• Hence, we conclude that the value claimed by the paper
manufacturer is not true and we reject his hypothesis about
the mean paper thickness.
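The calculation above can be sketched in a few lines. The slides do not show the sample size on this page; a standard error of 0.0025 mm implies n = 100, since 0.025 / √100 = 0.0025, so n = 100 is an assumption here:

```python
import math

# Sketch of the paper-thickness calculation. n = 100 is an assumption:
# it is the sample size implied by a standard error of 0.0025 mm
# when the population standard deviation is 0.025 mm.
mu0, sigma, n, xbar = 0.18, 0.025, 100, 0.19

se = sigma / math.sqrt(n)    # standard error of the mean: 0.0025 mm
z = (xbar - mu0) / se        # distance in standard errors: 4, to the right
```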
Hypothesis testing: Technicalities
• We must state the assumed or hypothesized value of
population parameter before we begin sampling.
• The assumption we wish to test is called the null hypothesis
and is symbolized H0 (called “H-zero”).
• Suppose we want to test the hypothesis that the population
mean is 500. We would say that: “The null hypothesis is that
the population mean is equal to 500.” H0 : µ = 500.
(The term null hypothesis arises from agricultural applications of statistics. To test the effectiveness of a new fertilizer, the hypothesis tested was that it had no effect, that is, there was no difference between treated and untreated samples.)
Technicalities
• Whenever we reject the null hypothesis, the conclusion we accept instead is called the alternative hypothesis, denoted H1 or Ha (“H-one” or “H-a”).
• For our null hypothesis (on the previous page), what are the possible alternative hypotheses? There are three options!
• The alternative hypothesis could be that the population mean is not equal to, less than, or greater than 500.
• Why does it matter which alternative hypothesis we are considering?
• Amongst other things, technically, this tells us whether we are
solving a one-tailed test or two-tailed test.
• And you thought only our ancestors had tails!!!
Pop quiz: Formulate the null hypothesis
• A person comes for a job interview. The selection panel needs to decide whether to hire the person.
H0: Person not capable. (Candidate needs to prove capability.)
• An innovator proposes a new way to pack products, and claims that this will speed up the manufacturing process. The organization needs to decide whether to adopt the new method.
H0: New method is not effective (status quo).
• Accounting department of an organization needs to decide
whether receipts submitted by an employee are legitimate.
H0: Claims are legitimate. (Proof of contradiction needed.)
Technicalities
• The purpose of hypothesis testing is not to question the
computed value of the sample statistic, but to make a
judgment about the sample statistic and the hypothesized
value of the population parameter.
• How does the significance level play a role in hypothesis
testing?
• If we assume the hypothesis is correct, then the significance level indicates the percentage of sample means that lies outside certain limits.
• If the sample statistic falls in the rejection zone, then there is
a significant difference between hypothesized value and
sample value.
Technicalities
• And we reject the null hypothesis.
• On the other hand, if the sample statistic does not fall in the
rejection region, it does not prove that the null hypothesis is
true, it simply does not provide statistical evidence to reject
the null hypothesis.
• How do we, then, select the significance level?
• The higher the significance level we use for testing a hypothesis, the higher the probability of rejecting a null hypothesis when it is true.
Technicalities
• The significance test analyzes the strength of the sample
evidence against the null hypothesis.
• The test is conducted to investigate whether the data
contradict the null hypothesis.
• The approach is an indirect one: proof by contradiction.
• The alternative hypothesis is judged acceptable if the sample
data is inconsistent with the null hypothesis.
• How do we know sample data is inconsistent with the null
hypothesis?
• We calculate the value of a test statistic and, from it, the p-value.
Technicalities
• Test statistic: the sample value against which we compare the hypothesized value.
• Using these (the test statistic and the population parameter), for a large sample, we calculate the z-value as:
$z_n = \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2 / n}}.$
• When the population standard deviation is not known, we calculate the $t_n$ value in a similar way as $z_n$.
• $z_n$ follows a standard normal distribution, while $t_n$ has a t-distribution.
What distribution?
$t_n = \frac{\bar{X}_n - \mu}{\sqrt{S^2(n)/n}}$
• The variable $t_n$ is approximately normal as n increases.
Technicalities
• p-value: using the sampling distribution of the sample mean, we calculate the probability that a sample value like the one observed would occur if the null hypothesis were true.
• This provides a measure of how unusual the observed sample value is compared to what the null hypothesis predicts.
• The p-value is the probability, when H0 is true, of a sample value at least as contradictory to H0 as the actual observed value.
• The smaller the p-value, the more strongly the data contradict H0.
• A p-value of, say, 0.8 means that the observed data are not unusual if H0 were true.
• A p-value of 0.001 means that such data would be very unlikely if H0 were true.
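The two-sided p-value for a z statistic can be sketched with the standard normal CDF written via the error function, Φ(z) = 0.5·(1 + erf(z/√2)):

```python
import math

# Sketch: turning a z statistic into a two-sided p-value.
def phi(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sided_p(z):
    """Probability, under H0, of a value at least as contradictory as z."""
    return 2.0 * (1.0 - phi(abs(z)))

p_unusual = two_sided_p(4.0)      # tiny: such data would contradict H0
p_ordinary = two_sided_p(0.655)   # about 0.51: not unusual under H0
```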
Example
• A market researcher wants to know whether the customers
prefer a product or not.
• To check, a study was designed.
• A market research analyst collected product preferences from
627 respondents and mapped the responses (about the
product) on a 5-point scale.
• Preference is on an ordinal scale. A mean preference below 3 would mean consumers prefer our product (and are likely to buy), whereas a mean above 3 would mean that they are unlikely to buy our product.
• We will test how the population mean compares to the
middle value of 3.
Example
Hypothesis: Let µ be the population mean preference on this 5-point scale. The null hypothesis specifies a value for µ. Since we want to check whether the population mean departs from the moderate response of 3, we form
H0: µ = 3.0
Alternative hypothesis is then: H1: µ ≠ 3.0.
Data from the sample n = 627
• The response and corresponding frequencies are as follows:
Response Count
1. Very likely to buy 45
2. Somewhat likely to buy 142
3. Indifferent 239
4. May not buy 153
5. Won’t buy at all 48
Example
$\bar{Y} = 3.027, \quad s = 1.032.$
Standard error: $\sigma_{\bar{Y}} = \frac{s}{\sqrt{n}} = \frac{1.032}{\sqrt{627}} = 0.041.$
$z = \frac{\bar{Y} - \mu_0}{\sigma_{\bar{Y}}} = \frac{3.027 - 3}{0.041} = 0.655.$
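These numbers can be reproduced directly from the frequency table above:

```python
import math

# Reproducing the survey calculation from the frequency table
# (n = 627 responses on the 1-5 preference scale).
counts = {1: 45, 2: 142, 3: 239, 4: 153, 5: 48}
n = sum(counts.values())                                  # 627

mean = sum(v * c for v, c in counts.items()) / n          # sample mean
ss = sum(c * (v - mean) ** 2 for v, c in counts.items())  # sum of squares
s = math.sqrt(ss / (n - 1))                               # sample std. dev.
se = s / math.sqrt(n)                                     # standard error
z = (mean - 3.0) / se                     # test statistic for H0: mu = 3
```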
Example
• If the population mean preference were 3.0, then the probability equals 0.51 that a sample mean from n = 627 subjects would fall at least as far from 3.0 as the observed sample mean of 3.027.
Conclusion: Since the p-value of 0.51 is not small, the data do not contradict the null hypothesis. If the null hypothesis were true, the observed data would not be unusual. It is plausible that the population mean response is 3.0, showing no tendency to like or dislike the product.
Hypothesis testing
• Assume that X1, X2, X3, …, Xn are normally distributed (or approximately so) and that we would like to test whether µ = µ0, where µ0 is a fixed hypothesized value of µ.
• If $|\bar{X}_n - \mu_0|$ is large, then our hypothesis is likely not true.
• To conduct such a test (of whether the hypothesis is true or not), we need a statistic whose distribution is known when the hypothesis is true.
• It turns out that if our hypothesis is true (µ = µ0), then the statistic $t_n$ has a t-distribution with n − 1 df.
Hypothesis testing
• We form our two-tailed test of H0: µ = µ0 as:
If $|t_n| > t_{n-1,\,1-\alpha/2}$: reject H0.
If $|t_n| \le t_{n-1,\,1-\alpha/2}$: “accept” H0.
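This decision rule can be sketched as follows. The critical point t_{n−1, 1−α/2} would come from a t table; 1.984 is the value for n = 100 and α = 0.05, and the sample below is made up for illustration:

```python
import math

# Sketch of the two-tailed decision rule. t_crit = 1.984 is
# t_{99, 0.975} from a t table (assumed example: n = 100, alpha = 0.05).
def t_statistic(data, mu0):
    """t_n = (xbar - mu0) / sqrt(S^2(n) / n)."""
    n = len(data)
    xbar = sum(data) / n
    s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)
    return (xbar - mu0) / math.sqrt(s2 / n)

def two_tailed_decision(t_n, t_crit):
    """Reject H0 when |t_n| exceeds the critical point."""
    return "Reject H0" if abs(t_n) > t_crit else "Do not reject H0"

# Toy sample of 100 thickness-like values centered near 0.19.
data = [0.19 + 0.001 * ((i % 7) - 3) for i in range(100)]
t = t_statistic(data, mu0=0.18)
decision = two_tailed_decision(t, t_crit=1.984)
```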
Errors in hypothesis testing
• Rejecting a null hypothesis when it is true is called a Type I error, and its probability (which is the significance level of the test) is denoted by α (alpha). This error is under the experimenter's control.
• Alternatively, accepting a null hypothesis when it is false is called a Type II error, and its probability is denoted by β (beta).
• Obviously, there is a trade-off between α and β: decreasing one tends to increase the other.
• When would you prefer controlling Type I error? Example…
• When would you prefer controlling Type II error? Example…
Errors in hypothesis testing
• We call δ = 1 − β the power of the test: the probability of rejecting H0 when it is false.
• For a fixed α, the power of the test can only be increased by increasing n.
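A simulation sketch of this last point (all numbers made up: a z test of H0: µ = 0 with α = 0.05, true mean 0.3, σ = 1):

```python
import random

# Illustrative sketch: for fixed alpha, the power of a z test of
# H0: mu = 0 grows with n when the true mean is actually 0.3.
random.seed(7)

def rejection_rate(n, mu_true=0.3, reps=1000):
    """Fraction of replications with |z| > 1.96, i.e. H0 rejected."""
    count = 0
    for _ in range(reps):
        xbar = sum(random.gauss(mu_true, 1.0) for _ in range(n)) / n
        z = xbar * (n ** 0.5)   # (xbar - 0) / (sigma / sqrt(n)), sigma = 1
        if abs(z) > 1.96:
            count += 1
    return count / reps

power_small = rejection_rate(n=20)    # lower power
power_large = rejection_rate(n=100)   # higher power at the same alpha
```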
Interval estimation
Interval estimation
• General form of interval estimate is:
point estimate ± margin of error.
Interval estimation: Mean
• Start with the point estimator of mean, which is the sample
mean.
• Then, assuming we know the population standard deviation and using the CLT, we can conclude that the sampling distribution of the mean is normally distributed with a standard error of $\sigma/\sqrt{n}$.
Interval estimation: Mean
• Since, according to CLT, sampling distribution is normally
distributed, it is worthwhile to review properties of the
standard normal distribution.
• From these properties, we know that for a standard normal
distribution 95.5% of the data lies within 2 standard
deviations from the mean.
• This translates to the fact that: The probability is 0.955 that
the mean of the sample will be within ±2 standard errors from
the population mean.
• Stated differently, 95.5% of all the sample means are within
±2 standard errors from µ, and hence µ is within ±2 standard
errors of 95.5% of all the sample means.
Interval estimation: Mean
• One can generalize and find the number of standard errors
that contain any percentage of data and make similar
conclusions.
• For the probability of 95.5% that the population mean is
within 2 standard errors of the sample mean, we are allowing
for 4.5% error, and hence, quantifying the error.
• For this example, the interval quoted will be:
point estimate ± 2 standard error.
• In general, we can say that the interval is:
point estimate ± zα/2 (standard error).
• What is the z-value?
• What if we don’t know the population variance?
Interval Estimation
$F_n(z) = \Pr\{Z_n \le z\}.$
Interval Estimation
• The Central Limit Theorem states that:
$F_n(z) \to \Phi(z)$ as $n \to \infty$,
where $\Phi(z)$ is the CDF of the standard normal distribution, with mean 0 and variance 1.
Standard Normal distribution
• The Standard Normal distribution is N(0,1).
• The cumulative distribution function (CDF) at any given value z can be found using standard statistical tables.
• Conversely, if we know the probability, we can compute the corresponding value of z such that
$F(z_1) = \Pr\{Z \le z_1\} = 1 - \frac{\alpha}{2}.$
• This value is $z_{1-\alpha/2}$ and is called the critical point for N(0,1).
• Similarly, the other critical point ($z_2 = -z_{1-\alpha/2}$) is such that:
$F(z_2) = \Pr\{Z \le z_2\} = \frac{\alpha}{2}.$
Interval Estimation
• It follows, for large n:
$\Pr\left\{ -z_{1-\alpha/2} \le Z_n \le z_{1-\alpha/2} \right\}$
$= \Pr\left\{ -z_{1-\alpha/2} \le \frac{\bar{X}_n - \mu}{\sqrt{S^2(n)/n}} \le z_{1-\alpha/2} \right\}$
$= \Pr\left\{ \bar{X}_n - z_{1-\alpha/2}\sqrt{\frac{S^2(n)}{n}} \le \mu \le \bar{X}_n + z_{1-\alpha/2}\sqrt{\frac{S^2(n)}{n}} \right\}$
$\approx 1 - \alpha.$
Interval Estimation
• Therefore, if n is sufficiently large, an approximate 100(1 − α) percent confidence interval for µ is given by:
$\bar{X}_n \pm z_{1-\alpha/2} \sqrt{\frac{\sigma^2}{n}}.$
Interval Estimation
• What if n is not “sufficiently large”?
• Or what if we don’t know the population standard deviation σ?
• It can be shown that the CLT still applies if we replace σ² by the sample variance S²(n):
$t_n = \frac{\bar{X}_n - \mu}{\sqrt{S^2(n)/n}}$
• The variable $t_n$ is approximately normal as n increases.
Interval Estimation
• The confidence level has a long-run relative frequency
interpretation.
• The unknown population mean μ is a fixed number.
• A confidence interval constructed from any particular
sample either does or does not contain μ.
• However, if we repeatedly select random samples of that size and each time construct a confidence interval, with say 95% confidence, then in the long run 95% of the CIs would contain μ.
• This happens because 95% of the time the sample mean $\bar{Y}$ would fall within $1.96\,\sigma_{\bar{Y}}$ of μ.
• So 95% of the time, the inference about μ is correct.
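This long-run interpretation can be checked by simulation (a sketch with a made-up population: µ = 10, σ = 2, known):

```python
import random

# Sketch of the long-run interpretation: build a 95% CI from each of
# many samples and count how often the interval contains mu.
random.seed(3)
mu, sigma, n, reps = 10.0, 2.0, 50, 2000

hits = 0
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    se = sigma / n ** 0.5                        # known-sigma standard error
    lo, hi = xbar - 1.96 * se, xbar + 1.96 * se  # this sample's 95% CI
    hits += lo <= mu <= hi

coverage = hits / reps                           # close to 0.95 in the long run
```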
Interval Estimation
• Every time we take a new sample of the same size, the confidence interval is going to be a little different from the previous one.
• This is because the sample mean $\bar{Y}$ varies from sample to sample.
• In practice, however, we select just one sample of fixed size n
and construct one confidence interval using the observations
in that sample.
• We do not know whether any particular CI truly contains μ.
• Our 95% confidence in that interval is based on long-term
properties of the procedure.
Confidence intervals ↔ Hypothesis testing
• Conclusions using testing of hypothesis should be consistent
with those drawn from interval estimation!
• Continuing with the marketing example, if we construct a 95% confidence interval for the customer preference, we have:
$\bar{Y} \pm z_{\alpha/2}\,\sigma_{\bar{Y}} = 3.027 \pm 1.96 \times 0.041 = [2.947,\ 3.107].$
[Chapter 17 of the textbook]
COMPARISONS
Comparing two groups
• Often we want to compare two or more groups in terms of
their characteristics, preferences, etc. and draw conclusions
on their differences.
• We look at comparing two groups in this section. Comparing
more than two groups is discussed using ANOVA.
• Suppose we want to compare the average consumer spend at a Big Bazaar in Mumbai with that in Chennai; such (cross-sectional) studies can be done using hypothesis testing.
• Remember that we need to have independent random
samples from each of two groups.
Example: Do men work more?
• Do men spend less time on housework than women?
• If so, by how much?
• A survey (published in 1994) reported the following data (housework hours per week):

Group     Sample size   Mean   Std. Dev.
Male      4262          18.1   12.9
Female    6764          32.6   18.2
Example: Do men work more?
• A natural way to compare two population means is to estimate the difference between them, µ2 − µ1.
• We can use sample data from these populations to estimate this by the difference between the sample means, and then treat the difference as the variable of interest.
• Let $\sigma_{\bar{Y}_2 - \bar{Y}_1}$ denote the standard error of the sampling distribution of the estimated difference. It can be shown that, if the two samples are independent, the variance of the difference is simply the sum of the individual variances of the sampling distributions of the respective populations.
• In this case this value is
$\sigma_{\bar{Y}_2 - \bar{Y}_1}^2 = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}.$
Example: Do men work more?
• Hypothesis test:
H0: µ2 − µ1 = 0.
H1: µ2 − µ1 > 0.
Example: Do men work more?
• Confidence interval: the 95% CI is given by:
$(\bar{Y}_2 - \bar{Y}_1) \pm z_{\alpha/2}\,\sigma_{\bar{Y}_2 - \bar{Y}_1} = 14.5 \pm 1.96 \times 0.3 = [13.9,\ 15.1].$
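The standard error, test statistic, and confidence interval above can all be reproduced from the survey table:

```python
import math

# Reproducing the two-group calculation with the survey numbers above.
n1, m1, s1 = 4262, 18.1, 12.9    # men
n2, m2, s2 = 6764, 32.6, 18.2    # women

diff = m2 - m1                                   # estimated mu2 - mu1: 14.5
se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)      # standard error of the diff
lo, hi = diff - 1.96 * se, diff + 1.96 * se      # 95% confidence interval
z = diff / se                                    # very large: reject H0
```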
Comparing two means
• The example was a preview of the technique used to set the
hypothesis test for comparing two means.
• There are variants of this situation which one should be careful about.
• Particularly, when the variances of the two populations are assumed to be equal; when the variances can be pooled; or when the study is longitudinal (paired observations).
• In all these cases, the overall technique remains the same; only the way we calculate the t-statistic and the degrees of freedom differs.