
Chapter-8

Estimation & Hypothesis Testing

Compiled by Bacha Edosa


Introduction
• Inference theory deals with methods of drawing valid conclusions and predictions about the population using the information contained in the sample.
• Inference is the process of making interpretations or conclusions from sample data for the totality of the population.
• On the basis of the information contained in the sample we try to draw conclusions about the population. This process is called statistical inference.
• In statistics there are two ways through which inference can be made:
– Statistical estimation
– Statistical hypothesis testing
[Figure: flow diagram with the labels Population, Sample, Numerical data, Analyzed data, and Inference]


Statistical Estimation
• One of the major applications of statistics is estimating population parameters from sample statistics.
• In estimation procedures, statistics calculated from random samples are used to estimate the values of population parameters.
• Often, due to constraints such as inadequate funds, manpower, or time, we are not in a position to survey all the units in a population.
• In such situations we resort to sampling, that is, we survey only a part of the population.
Estimation …
The procedure of estimating a population parameter by using sample information is referred to as estimation.
Estimation is a procedure in which we use the information contained in a sample to make inferences about the true parameter of interest.
An estimator is a sample statistic that is used to estimate the population parameter.
An estimate is a particular value that the estimator takes for a given sample.


Properties of a good estimator
• The desirable properties of a good estimator are the following:
1. It should be unbiased:
• The expected value of the estimator must be equal to the parameter to be estimated.
2. It should be consistent:
• As the sample size increases, the value of the estimator should approach the value of the parameter being estimated.
3. It should be relatively efficient:
• Among the estimators of a parameter, the one with the smallest variance is the most efficient.
• This property compares two or more estimators of the same parameter.


Properties …
4. It should be sufficient:
• An estimator is said to be sufficient if it uses all the information about the population parameter that is contained in the sample.
• That is, the estimator should extract the maximum possible information about the parameter from the sample on which it is calculated.


Types of Estimation
• There are two types of estimation:
i) Point estimation
ii) Interval estimation
Point Estimation
A point estimate is a single numerical value used to estimate the corresponding population parameter.
A point estimate of a population parameter is a single value of a sample statistic.
Point estimation…
The point estimate for the population mean \(\mu\) is
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
The point estimate for the population variance \(\sigma^2\) is
\[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
Example: A sample mean of \(\bar{x} = 3\) is the point estimate of the unknown population mean.
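The two point estimates can be computed directly from their definitions. The following is a minimal Python sketch. The data values and the function name point_estimates are hypothetical, chosen only so that the sample mean equals 3 as in the example above.

```python
import statistics

# Hypothetical sample (not from the slides), chosen so that the mean is 3.
sample = [1, 2, 3, 4, 5]


def point_estimates(data):
    """Return the sample mean and the sample variance (divisor n - 1)."""
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    return mean, variance


mean, variance = point_estimates(sample)
print("point estimate of the mean:", mean)
print("point estimate of the variance:", variance)

# Cross-check against the standard library, which also divides by n - 1.
print("statistics.variance:", statistics.variance(sample))
```

The divisor n − 1 rather than n is what makes the sample variance an unbiased estimator of the population variance.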


Interval Estimation
An interval estimate provides more information about a population characteristic than a point estimate does.
It deals with identifying the lower and upper limits of a parameter.
Such interval estimates are called confidence intervals.
The general formula for all confidence intervals is:
Point Estimate ± (Critical Value)(Standard Error)
that is, Point Estimate ± Margin of Error
Confidence Level \((1-\alpha)\)
• The confidence level is the probability that the interval estimate will contain the unknown population parameter.
• It is a percentage less than 100%.
Example: A confidence interval for the population mean is:
\[ \bar{X} \pm \text{critical value} \times \text{standard error of } \bar{X} \]


Confidence interval (C.I.) estimation of the population mean, μ
• There are different cases to be considered when constructing confidence intervals.
Case 1: If the sample size is large or if the population is normal with known variance
a) If the population is normal with known variance
• The sampling distribution of the sample mean is normal with mean \(\mu\) and variance \(\sigma^2/n\).
• A \((1-\alpha)100\%\) confidence interval for the population mean μ is given by
\[ \bar{X} \pm Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]
C.I. …
where \(\bar{X}\) is the sample mean,
\(Z_{\alpha/2}\) is the value of the standard normal variable to the right of which a probability of \(\alpha/2\) lies, that is, \(P(Z > Z_{\alpha/2}) = \alpha/2\), and
\(\sigma/\sqrt{n}\) is the standard error.
b) If the sample size n is large and sampling is from a non-normal population
Using the Central Limit Theorem, a \((1-\alpha)100\%\) confidence interval for the population mean μ is given by
i) \(\bar{X} \pm Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\) if \(\sigma^2\) is known
ii) \(\bar{X} \pm Z_{\alpha/2} \times \frac{S}{\sqrt{n}}\) if \(\sigma^2\) is unknown. We estimate \(\sigma^2\) by its point estimator \(S^2\).
C.I. …
Here are the Z values corresponding to the most commonly used confidence levels.
\((1-\alpha)100\%\)    \(\alpha\)    \(\alpha/2\)    \(Z_{\alpha/2}\)
90                  0.10        0.05          1.645
95                  0.05        0.025         1.96
99                  0.01        0.005         2.575 ≅ 2.58
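The tabulated critical values can be reproduced numerically. The sketch below uses Python's standard-library NormalDist class. The helper name z_critical is an assumption introduced here for illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Value with probability alpha/2 to its right under the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

The computed 99% value is about 2.576, which tables commonly round to 2.575 or 2.58.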


Case 2: If the sample size is small and the population variance is not known
• A \((1-\alpha)100\%\) confidence interval for the population mean μ is given by
\[ \bar{X} \pm t_{\alpha/2}(n-1) \times \frac{S}{\sqrt{n}} \]
where \(t_{\alpha/2}(n-1)\) is the critical value of the t-distribution with n − 1 degrees of freedom,
\(S\) is the sample standard deviation, and
n is the sample size.
N.B.: Degrees of freedom are the maximum number of logically independent values that are free to vary in a data sample.
Examples
1. A sample of size 25 drawn from a normal population gave a mean of 32. The population standard deviation is 4.2. Find:
a) A 95% confidence interval for the population mean.
b) A 99% confidence interval for the population mean.
c) Compare a and b.
Solution:
a) \(\bar{X} = 32\), \(\sigma = 4.2\), \(1 - \alpha = 0.95 \Rightarrow \alpha = 0.05\), \(\alpha/2 = 0.025\)
⇒ \(Z_{\alpha/2} = 1.96\) from the table.
⇒ The required interval is \(\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}\)
= 32 ± 1.96 × 4.2/√25
= 32 ± 1.65
= (30.35, 33.65)
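Examples …
b) \(1 - \alpha = 0.99 \Rightarrow \alpha = 0.01\), \(\alpha/2 = 0.005\), so \(Z_{\alpha/2} = 2.58\) from the table.
⇒ 32 ± 2.58 × 4.2/√25 = 32 ± 2.17 = (29.83, 34.17)
c) The 99% interval is wider than the 95% interval: a higher confidence level requires a wider interval.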
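Example 1 can be checked numerically. The sketch below computes both intervals from the formula for a normal population with known σ. The function name z_interval is hypothetical, and the critical values come from Python's NormalDist rather than from the printed table, so results may differ in the second decimal from hand calculations that use rounded table values.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    margin = z * sigma / sqrt(n)              # critical value x standard error
    return xbar - margin, xbar + margin


ci95 = z_interval(32, 4.2, 25, 0.95)
ci99 = z_interval(32, 4.2, 25, 0.99)
print("95%:", tuple(round(v, 2) for v in ci95))
print("99%:", tuple(round(v, 2) for v in ci99))

# The 99% interval is wider than the 95% interval.
print("widths:", round(ci95[1] - ci95[0], 2), round(ci99[1] - ci99[0], 2))
```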


Examples…
2. A drug company is testing a new drug that is supposed to reduce blood pressure. For the six people used as subjects, the average drop in blood pressure is 2.28 points, with a standard deviation of 0.95 points. What is the 95% confidence interval for the mean change in pressure?
Solution:
\(\bar{X} = 2.28\), \(S = 0.95\), \(1 - \alpha = 0.95 \Rightarrow \alpha = 0.05\), \(\alpha/2 = 0.025\)
⇒ \(t_{\alpha/2} = 2.571\) with df = 5 from the table.
⇒ The required interval is \(\bar{X} \pm t_{\alpha/2}\,S/\sqrt{n}\)
= 2.28 ± 2.571 × 0.95/√6
= 2.28 ± 0.997
= (1.28, 3.28)
• That is, we can be 95% confident that the mean decrease in blood pressure is between 1.28 and 3.28 points.
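The small-sample interval of Example 2 can be sketched in Python as follows. The standard library has no t-distribution quantile function, so the critical value 2.571 (df = 5) is passed in as read from the table. The function name t_interval is an assumption made for this illustration.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean using a tabulated t critical value."""
    margin = t_crit * s / sqrt(n)   # t_{alpha/2}(n-1) x S / sqrt(n)
    return xbar - margin, xbar + margin


# Values from Example 2; 2.571 is t_{0.025}(5) taken from the t table.
low, high = t_interval(2.28, 0.95, 6, 2.571)
print(round(low, 2), round(high, 2))
```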


Examples…
3. An experiment involves selecting a random sample of 256 middle managers. One item of interest is annual income. The sample mean is $45,420 and the sample standard deviation is $2,050.
i) What is the estimated mean income of all middle managers (the point estimate of the population mean)?
ii) What is the 95 percent confidence interval for the population mean?
iii) What degree of confidence is being used?
iv) Interpret the result.
Examples…
i) The point estimate of the population mean is $45,420.
ii) C.I. for the population mean:
\(\bar{X} \pm Z_{\alpha/2} \times \frac{S}{\sqrt{n}}\), with \(Z_{0.05/2} = Z_{0.025} = 1.96\)
⇒ 45,420 ± 1.96 × 2,050/√256 = 45,420 ± 251.125
= (45,168.875, 45,671.125), or
⇒ 45,168.875 < μ < 45,671.125
iii) The degree of confidence being used is 95%.
iv) We are 95% confident that the mean annual income of all middle managers is between 45,168.875 and 45,671.125.
Sample size determination in estimation of population mean
In estimating the population mean μ by the sample mean with absolute margin of error d and risk probability α, the required sample size is given by:
\[ n = \left( \frac{Z_{\alpha/2} \times \sigma}{d} \right)^2, \quad \text{where } |\bar{X} - \mu| = d \]
Example
• A random sample of size n is needed to estimate the population mean with risk probability α = 0.05 and maximum margin of error 8.5. What is the sample size if the population variance is 200?
Solution: \(\alpha = 0.05\), \(d = 8.5\), \(\sigma^2 = 200\) (so \(\sigma = \sqrt{200} \approx 14.142\)), \(Z_{0.025} = 1.96\)
\[ n = \left( \frac{Z_{\alpha/2} \times \sigma}{d} \right)^2 = \left( \frac{1.96 \times 14.142}{8.5} \right)^2 = 10.6 \cong 11 \]
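The sample size calculation can be sketched in Python as below. The function name sample_size is hypothetical. The result is rounded up, since a fractional observation cannot be sampled and rounding down would exceed the allowed margin of error.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size(sigma, d, alpha):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / d) ** 2)


# Values from the example: variance 200, margin of error 8.5, alpha 0.05.
n = sample_size(sqrt(200), 8.5, 0.05)
print(n)
```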


Statistical Hypothesis testing
In the previous section, we studied how to estimate the mean using point and interval estimation.
The other aspect of statistical inference is known as the statistical test of hypothesis.
The branch of statistics that provides criteria for deciding about a characteristic of the population (a parameter), based on the information obtained from sample data, is known as testing of hypothesis.
Hypothesis testing for population mean
• A statistical hypothesis is a statement (or an assertion) about a parameter of a population or about its distribution that may or may not be supported by sample data.
• Test statistic: a statistic whose value serves to determine whether or not to reject the hypothesis being tested. It is a random variable.
• A given statement concerning a parameter could be true or false.
• Hence we have two complementary hypotheses, namely, the null hypothesis and the alternative hypothesis.
a) Null hypothesis (H0)
It is the hypothesis to be tested for possible rejection under the assumption that it is true.
It is the hypothesis of equality or the hypothesis of no difference.
b) Alternative hypothesis (Ha/H1)
It is the hypothesis that is complementary to the null hypothesis.
It is accepted if H0 is rejected, and it is not accepted if H0 is retained.
It is the hypothesis of difference.
Continued …
Statistical test: a procedure used to evaluate a statistical hypothesis, deciding whether to reject the hypothesis on the basis of sample data.
The decisions we make are of two types:
• either reject H0 and conclude that H1 is accepted, or
• retain H0 and conclude that we do not have enough evidence to reject H0.
Types of errors
• There are two types of errors in hypothesis testing:
i) Type I error
ii) Type II error
Type I error
• If the statistical test rejects H0 when it is true, the error is a type I error.
• Type I error is the error committed in rejecting the null hypothesis when it is true.
• The probability of committing a type I error is called the level of significance and is denoted by α.
Type II error
If the test accepts H0 when it is false, the error is a type II error.
Type II error is the error committed in accepting the null hypothesis when it is false.
The probability of committing a type II error is denoted by β.
In both types of errors, a wrong decision has occurred.
An ideal test procedure is one that is planned so as to safeguard against both of these errors.
Types of errors …
However, in practical situations, for a fixed sample size an attempt to reduce one of the errors increases the other.
In view of this dilemma and the fact that wrong rejection of H0 is usually the more serious error, we hold α at a predetermined low level, such as 0.1, 0.05, or 0.01, when choosing a rejection region.
A level of significance of 5% implies that, when H0 is true, in about 5 samples out of 100 we are likely to reject it.
In other words, when H0 is true, we will correctly retain it in about 95% of samples.
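The meaning of a 5% significance level can be illustrated by simulation: when H0 is true, a two-sided Z test at α = 0.05 should reject in roughly 5% of repeated samples. The sketch below is only an illustration. All numbers in it (μ0 = 50, σ = 10, n = 30, 4000 trials, the seed) and the function name type_one_error_rate are assumptions, not values from this chapter.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_one_error_rate(trials=4000, n=30, mu0=50.0, sigma=10.0,
                        alpha=0.05, seed=1):
    """Fraction of samples drawn under a true H0 in which H0 is rejected."""
    rng = random.Random(seed)                         # fixed seed, repeatable run
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true here: every sample really comes from N(mu0, sigma^2).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        z_cal = (xbar - mu0) / (sigma / sqrt(n))
        if abs(z_cal) > z_crit:
            rejections += 1                           # a type I error
    return rejections / trials


rate = type_one_error_rate()
print(f"observed type I error rate: {rate:.3f}")
```

The observed rate should be close to, but not exactly, 0.05, because it is itself an estimate based on a finite number of simulated samples.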
Types of errors …
• The following table gives a summary of possible results of any hypothesis testing procedure:
Decision     H0 is true          H0 is false
Reject H0    Type I error        Correct decision
Accept H0    Correct decision    Type II error


General steps in hypothesis testing on population mean, μ
Step-1: Formulate the null hypothesis (H0) and the alternative hypothesis (H1).
Suppose the assumed or hypothesized value of μ is denoted by μ0. Then one can formulate two-sided and one-sided hypotheses as follows:
1. H0: μ = μ0 versus H1: μ ≠ μ0 (two-sided test)
2. H0: μ = μ0 versus H1: μ < μ0 (one-sided test)
3. H0: μ = μ0 versus H1: μ > μ0 (one-sided test)
General steps …
Step-2: Specify the significance level α.
Step-3: Using the sampling distribution of an appropriate statistic, determine the critical region of size α (\(Z_{tab}\), \(t_{tab}\), \(\chi^2_{tab}\) or \(F_{tab}\) from the given tables).
Step-4: Determine the value of the test statistic from the sample data (\(Z_{cal}\), \(t_{cal}\), \(\chi^2_{cal}\) or \(F_{cal}\)).
Step-5: State the decision rule and make the decision by comparing the tabulated value from Step-3 with the calculated value from Step-4 (accept H0 or reject it).
Step-6: Draw the conclusion and interpret it.
Decision Table
To test \(H_0: \mu = \mu_0\) against the three alternatives, the rules are summarized as:
Alternative hypothesis    Accept H0 if                              Reject H0 if                                        Inconclusive if
\(\mu \neq \mu_0\)        \(-Z_{\alpha/2} < Z_C < Z_{\alpha/2}\)    \(Z_C < -Z_{\alpha/2}\) or \(Z_C > Z_{\alpha/2}\)    \(Z_C = -Z_{\alpha/2}\) or \(Z_C = Z_{\alpha/2}\)
\(\mu < \mu_0\)           \(Z_C > -Z_{\alpha}\)                     \(Z_C < -Z_{\alpha}\)                               \(Z_C = -Z_{\alpha}\)
\(\mu > \mu_0\)           \(Z_C < Z_{\alpha}\)                      \(Z_C > Z_{\alpha}\)                                \(Z_C = Z_{\alpha}\)


Case-I: When the population variance (σ²) is known and the parent population is normal.
The relevant test statistic is
\[ Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]
Decision: Reject H0 if
i) \(Z_{cal} > Z_{\alpha/2}\) or \(Z_{cal} < -Z_{\alpha/2}\) for a two-sided test
ii) \(Z_{cal} > Z_{\alpha}\) for a right-sided test
iii) \(Z_{cal} < -Z_{\alpha}\) for a left-sided test


Case-II: When sampling from a non-normal population and the sample size is large
The relevant test statistic is
a) \(Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\) if \(\sigma^2\) is known
b) \(Z_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}\) if \(\sigma^2\) is unknown
Decision: Reject H0 if
i) \(Z_{cal} > Z_{\alpha/2}\) or \(Z_{cal} < -Z_{\alpha/2}\) for a two-sided test
ii) \(Z_{cal} > Z_{\alpha}\) for a right-sided test
iii) \(Z_{cal} < -Z_{\alpha}\) for a left-sided test
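The Z test statistic and the three decision rules of Case-I and Case-II can be sketched in Python as follows. The function name z_test and the labels "two-sided", "greater" and "less" are assumptions introduced for this illustration. The argument sd stands for σ when it is known and for S when a large sample is used. The usage lines take their numbers from Examples 1 and 2 of this chapter.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sd, n, alpha, alternative="two-sided"):
    """Return (Z_cal, reject_H0) for a test on the population mean."""
    z_cal = (xbar - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":        # H1: mu != mu0
        reject = abs(z_cal) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":        # H1: mu > mu0 (right-sided)
        reject = z_cal > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":           # H1: mu < mu0 (left-sided)
        reject = z_cal < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z_cal, reject


# Example 1: n = 81, mean 70, S = 10, H0: mu = 72, two-sided.
z1, reject1 = z_test(70, 72, 10, 81, 0.05)
# Example 2: n = 49, mean 8.4, sigma = 3.2, H0: mu = 10, left-sided.
z2, reject2 = z_test(8.4, 10, 3.2, 49, 0.05, alternative="less")
print(round(z1, 2), reject1)
print(round(z2, 2), reject2)
```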
Case-III: When sampling is from a normally distributed population with unknown population variance and small sample size (n < 30)
The relevant test statistic is
\[ t_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t(n-1) \]
Decision: Reject H0 if
i) \(t_{cal} > t_{\alpha/2}(n-1)\) or \(t_{cal} < -t_{\alpha/2}(n-1)\) for a two-sided test
ii) \(t_{cal} > t_{\alpha}(n-1)\) for a right-sided test
iii) \(t_{cal} < -t_{\alpha}(n-1)\) for a left-sided test
Examples
1. A sample of 81 students gave an average mark of 70 with a standard deviation of 10. Can we conclude that the population mean of marks is 72 at α = 0.05?
Solution: \(n = 81\), \(\bar{X} = 70\), \(S = 10\), \(\mu_0 = 72\), \(Z_{0.025} = 1.96\)
\(H_0: \mu = 72\) vs \(H_1: \mu \neq 72\)
\[ Z_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{70 - 72}{10/\sqrt{81}} = -1.80 \]
Since \(-1.96 < Z_{cal} = -1.80 < 1.96\), \(Z_{cal}\) is in the acceptance region, so \(H_0\) is not rejected.
Interpretation: At the 5% level of significance, there is not enough evidence to conclude that the average mark of all students differs from 72.
Examples …
2. Test at α = 0.05 whether the mean of a random sample of size n = 49 is "significantly less than 10" if the distribution from which the sample was taken is normal, \(\bar{X} = 8.4\) and \(\sigma = 3.2\).
Solution: \(n = 49\), \(\bar{X} = 8.4\), \(\sigma = 3.2\), \(\alpha = 0.05\)
\(H_0: \mu = 10\) vs \(H_1: \mu < 10\), \(Z_{0.05} = 1.645\)
\[ Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{8.4 - 10}{3.2/\sqrt{49}} = -3.5 \]
• Since \(Z_{cal} = -3.5 < -Z_{0.05} = -1.645\), the null hypothesis is rejected.
Examples …
3. A sample of 16 students gave an average mark of 53.8 with a standard deviation of 5.2. Can we conclude that the population mean of marks is 50 at α = 0.05?
Solution: \(n = 16\), \(\bar{X} = 53.8\), \(S = 5.2\), \(\mu_0 = 50\)
\(H_0: \mu = 50\) vs \(H_1: \mu \neq 50\)
\(t_{\alpha/2}(n-1) = t_{0.025}(15) = 2.131\)
\[ t_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{53.8 - 50}{5.2/\sqrt{16}} = 2.92 \]
Since \(t_{cal} = 2.92 > t_{0.025}(15) = 2.131\), we reject \(H_0\).
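Example 3 can be verified with a short Python sketch. Because the standard library offers no t-distribution quantiles, the critical value 2.131 is supplied as read from the t table with 15 degrees of freedom. The function name t_test_two_sided is hypothetical.

```python
from math import sqrt


def t_test_two_sided(xbar, mu0, s, n, t_crit):
    """Return (t_cal, reject_H0) for H1: mu != mu0 with a tabulated t value."""
    t_cal = (xbar - mu0) / (s / sqrt(n))
    return t_cal, abs(t_cal) > t_crit


# Values from Example 3; 2.131 is t_{0.025}(15) taken from the t table.
t_cal, reject = t_test_two_sided(53.8, 50, 5.2, 16, 2.131)
print(round(t_cal, 2), reject)
```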
