
Note3 CHAPTER2


QMT220: Statistical Methods Estimation techniques

Chapter 2

Estimation Techniques

2.1 Introduction

Suppose that a population has an unknown parameter, such as the mean, or the
variance, or the proportion of ‘successes’. Then an estimate of the unknown
parameter can be made from the information supplied by a random sample (or
samples) taken from the population.

Estimator: A statistic used to estimate the value of a parameter is called an
estimator. It is denoted by a capital letter (e.g. U, T, …).

Estimate: The numerical value taken by the estimator. It is denoted by a small
letter (e.g. u, t, …).

There are two types of estimation: point estimation and interval estimation.

2.2 Point Estimation

A point estimate is a single numerical value used as an estimate of a parameter.

Three Properties of a Good Estimator

There are many estimators which could be formed, but the best (or most efficient)
estimator is the one which

i) is unbiased
ii) has the smallest variance
iii) is consistent

2.2.1 Unbiased Estimator

Consider a population with unknown parameter θ. If W is some statistic
derived from a random sample taken from the population, then W is an
unbiased estimator for θ if

E(W) = θ

Sometimes the estimator of θ is also written as $\hat{\theta}$.

@Copyright is Prohibited: 1
Prepared By:Norshahida Shaadan; Statistics Department , FSKM Shah Alam

Example: 2

If X has the binomial distribution with parameters n and θ, show that the
sample proportion X/n is an unbiased estimator of θ.

2.2.2 Most Efficient Estimator

An estimator is said to be the most efficient estimator if it is unbiased and has
the smallest variance among the unbiased estimators being compared.

Example:1

If X1, X2, X3 is a random sample taken from a population with mean μ and
variance σ², find which of the following estimators for μ are unbiased, and
which is the most efficient of these.

i) $T_1 = \dfrac{X_1 + X_2 + X_3}{3}$

ii) $T_2 = \dfrac{X_1 + 2X_2}{3}$

iii) $T_3 = \dfrac{X_1 + 2X_2 + 3X_3}{3}$
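One way to check the three estimators is to note that, for a linear combination T = Σ c_i X_i of independent observations, E(T) = (Σ c_i) μ and Var(T) = (Σ c_i²) σ². The sketch below applies this rule; the function name `linear_estimator_summary` is illustrative and not part of the notes.

```python
from fractions import Fraction

def linear_estimator_summary(coeffs):
    """For T = sum(c_i * X_i) with independent X_i (mean mu, variance sigma^2),
    return (a, b) such that E(T) = a * mu and Var(T) = b * sigma^2."""
    mean_factor = sum(coeffs)
    var_factor = sum(c * c for c in coeffs)
    return mean_factor, var_factor

third = Fraction(1, 3)
estimators = {
    "T1": [third, third, third],          # (X1 + X2 + X3) / 3
    "T2": [third, 2 * third, 0],          # (X1 + 2 X2) / 3
    "T3": [third, 2 * third, 3 * third],  # (X1 + 2 X2 + 3 X3) / 3
}

summary = {name: linear_estimator_summary(c) for name, c in estimators.items()}
for name, (a, b) in summary.items():
    status = "unbiased" if a == 1 else "biased"
    print(f"{name}: E(T) = {a} mu, Var(T) = {b} sigma^2 ({status})")
```

Under these assumptions T1 and T2 are unbiased while T3 is not, and T1 has the smaller variance of the two unbiased estimators.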

2.2.3 Consistent Estimator

If W is an unbiased estimator for an unknown parameter θ, then W is a consistent
estimator for θ if Var(W) → 0 as n → ∞, where n is the size of the sample from
which W is obtained.

i) Estimation of a population mean

From a population of unknown mean μ take a random sample of size n, and let

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

Then the most efficient estimator for μ, which we write as $\hat{\mu}$, is the
sample mean $\bar{X}$; that is, $\hat{\mu} = \bar{X}$.


NOTE: $\bar{X}$ is an unbiased and consistent estimator for μ.

ii) Estimation of a population variance

From a population with unknown variance σ² take a random sample of size n, and let

$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1} = \frac{n\sum x^2 - \left(\sum x\right)^2}{n(n-1)}$$

where S² is the sample variance.

The point estimator for the population variance σ² is the sample variance S².

iii) Estimation of a population proportion

From a binomial population in which p is the (unknown) proportion of successes, a
random sample of size n is taken.

Let $P_s$ be the random variable 'the proportion of successes in the sample'.

Then an unbiased estimator for p is $P_s$.

Proof: $E(P_s) = E\left(\dfrac{X}{n}\right) = \dfrac{1}{n}E(X) = \dfrac{np}{n} = p$

The estimator is also consistent, since $\mathrm{Var}(P_s) = \dfrac{pq}{n}$, where q = 1 − p,
and $\dfrac{pq}{n} \to 0$ as n → ∞.

Example: 5

A random sample of 50 children from a large school is chosen and the number who
are left handed is noted. It is found that 6 are left handed. Obtain an unbiased
estimate of the proportion of children in the school who are left handed.


2.2.4 Pooled Estimators From Two Samples

Estimates of the population mean, variance, proportion, etc., may be made by
pooling the values from two samples.

2.3.1 Pooled estimators of population means and of population variances

From a population with unknown mean μ and unknown variance σ² we take two
random samples:

             Size    Mean          Variance
Sample I     n1      $\bar{X}_1$   $S_1^2$
Sample II    n2      $\bar{X}_2$   $S_2^2$

Then

$$\hat{\mu} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$$

where $\hat{\mu}$ is an unbiased estimator for the population mean μ.

Also

$$\hat{\sigma}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

where $\hat{\sigma}^2$ is an unbiased estimator for the population variance σ².
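The two pooled formulas translate directly into code. The following is a minimal sketch; the function names and the summary statistics passed to them are hypothetical and chosen only to show the arithmetic.

```python
def pooled_mean(n1, mean1, n2, mean2):
    """Pooled estimate of the population mean from two samples."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

def pooled_variance(n1, var1, n2, var2):
    """Pooled estimate of the population variance, where var1 and var2
    are sample variances computed with the n - 1 divisor."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Hypothetical summary statistics (not taken from the notes).
mu_hat = pooled_mean(10, 5.0, 20, 8.0)          # (50 + 160) / 30
sigma2_hat = pooled_variance(10, 4.0, 20, 1.0)  # (36 + 19) / 28
print(mu_hat, round(sigma2_hat, 4))
```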


Example : 6

Two samples, size 40 and 50 respectively, are taken from a population with unknown
mean μ and unknown variance σ². Using the data from the two samples, obtain
unbiased estimates of μ and σ².

Sample I
x1    18   19   20   21   22
f      3   17    5   10    5

Sample II
x2    18   19   20   21   22   23
f     10   21    8    6    3    5

2.3.2 Pooled estimator of population proportion

From a binomial population which has unknown proportion p of 'successes', we take
two samples:

             Size    Proportion
Sample 1     n1      $P_{s1}$
Sample 2     n2      $P_{s2}$

Then $\hat{p}$, an unbiased estimator for the population proportion p, is given by

$$\hat{p} = \frac{n_1 P_{s1} + n_2 P_{s2}}{n_1 + n_2}$$

Example 7 :

An opinion poll in a certain city indicated that 69 people in a random sample of 120
said that they would vote for Encik Firdaus, while in a second random sample of 160,
93 said that they would vote for Encik Firdaus. Find an unbiased estimate of the
proportion of people in the city who will vote for Encik Firdaus.
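As a rough check on Example 7, note that n1·Ps1 and n2·Ps2 are simply the numbers of 'successes' in each sample, so the pooled estimate is total successes over total sample size. The variable names below are illustrative.

```python
# Counts from Example 7: supporters and sample sizes in the two polls.
successes = [69, 93]
sizes = [120, 160]

# Pooled estimate: (n1*Ps1 + n2*Ps2) / (n1 + n2) = total successes / total size.
p_hat = sum(successes) / sum(sizes)
print(round(p_hat, 4))
```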

2.3 Interval Estimation – Confidence Interval

An interval estimate of an unknown population parameter is a random interval
constructed so that it has a given probability of including the parameter.

Consider a population with unknown parameter θ. If we can find an interval (a, b) such
that P(a < θ < b) = 0.95, we say that (a, b) is a 95% confidence interval for θ.
In this case, 0.95 is the probability that the interval includes θ.

NOTE: It is not the probability that θ lies in the interval: θ is a fixed value, and
it is the end points a and b that vary from sample to sample.



DEFINITION: Interval Estimate: An interval estimate of a parameter is an interval or
range of values used to estimate the parameter.

DEFINITION: Confidence Level: The confidence level of an interval estimate of a
parameter is the probability that the interval estimate will contain the parameter.

NOTE:

Before determining the confidence interval for a parameter, it is important to
know:

i) the population distribution (the distribution must be normal);
ii) whether the population variance is known or unknown;
iii) if it is unknown, whether the sample size is small (n < 30) or large (n ≥ 30).

2.3.1 Confidence Interval For A Population Mean

Consider a population with mean μ and variance σ². Now take a random sample
X1, X2, . . . , Xn from the population and consider the distribution of $\bar{X}$, where

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

To estimate the unknown population mean:

a) Using Z-distribution:

Case I: With Known Population Variance σ²

If $\bar{X}$ is the mean of a random sample of size n taken from a normal population with
known variance σ², then a 100(1 − α)% confidence interval for μ, the population
mean, is given by

$$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

lower limit: $\bar{X} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$        upper limit: $\bar{X} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$


Formula: This can be written as $\bar{X} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$

Example 8:

A certain medication is known to increase the pulse rate of its users. The standard
deviation of the pulse rate is known to be 5 beats per minute. A sample of 20 users
had an average pulse rate of 104 beats per minute. Find the 99% confidence interval
of the true mean.
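The Case I formula can be sketched in a few lines of Python. The helper name `z_interval` is illustrative; the critical value comes from the standard library's `statistics.NormalDist` rather than from a printed table, so the last decimal may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence):
    """Confidence interval for a mean when the population
    standard deviation sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin

# Example 8: sample mean 104, sigma = 5, n = 20, 99% confidence.
low_8, high_8 = z_interval(104, 5, 20, 0.99)
print(round(low_8, 2), round(high_8, 2))
```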

Case II: With unknown Population Variance σ² but large sample (n ≥ 30).

For this case, the unknown population variance is replaced by its point estimate,
the sample variance s². The sample standard deviation is s.

Formula: This can be written as $\bar{X} \pm z_{\alpha/2}\dfrac{s}{\sqrt{n}}$

Example 9:

A random sample of size 40 is taken from a normal population. The sample mean
and the standard deviation are as follows: $\bar{x} = 60.5$ and s = 8. Construct a 95%
confidence interval for the unknown population mean. Interpret its meaning.

b) Using t-distribution:

Case: The t-distribution is used if the population variance σ² is unknown
and the sample size is small (n < 30).

Formula: This can be written as $\bar{X} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}}$, where $t_{\alpha/2}$ is taken from the
t-distribution with v = n − 1 degrees of freedom.


Example 10:

Ten randomly selected automobiles were stopped and the tread depth of the right
front tire was measured. The mean was 0.32 inch and the standard deviation was
0.08 inch. Find the 95% confidence interval of the mean depth. (Assume that the
variable is approximately normally distributed).
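A small-sample interval such as the one in Example 10 can be sketched as follows. Python's standard library has no t-distribution quantile function, so the critical value t(0.025, 9) = 2.262 is entered by hand from a standard t table; the helper name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(mean, s, n, t_crit):
    """Confidence interval for a mean using a t critical value
    with n - 1 degrees of freedom supplied by the caller."""
    margin = t_crit * s / sqrt(n)
    return mean - margin, mean + margin

# Example 10: n = 10, mean 0.32, s = 0.08, 95% confidence.
T_0025_DF9 = 2.262  # t table value for alpha/2 = 0.025, v = 9
low_10, high_10 = t_interval(0.32, 0.08, 10, T_0025_DF9)
print(round(low_10, 4), round(high_10, 4))
```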

Example 11:

Ten packets of a particular brand of biscuits are chosen at random and their masses
noted. The results in grams are:

397.3 399.6 401.0 392.9 396.8 400.0 397.6 392.1 400.8

Assuming that the sample is taken from a normal population with mean mass μ,
calculate the 95% confidence interval for μ. Interpret the interval.

2.3.2 Confidence Interval For The Difference Between Two Population Means

The 100(1 − α)% confidence interval for the difference between two population means
(μ1 − μ2):

Type of samples: Independent.

Condition 1: Known population variances

Formula: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Condition 2: Unknown population variances and large samples

Formula: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Example 12:

An experiment was conducted in which two types of engines, A and B, were
compared. Gas mileage in miles per gallon was measured. Fifty experiments were
conducted using engine type A and 75 experiments were done for engine type B.
The gasoline used and other conditions were held constant. The average gas
mileage for engine A was 36 miles per gallon and the average for engine B was 42
miles per gallon. Assuming that the population standard deviations are 6 and 8 for
engines A and B, respectively:

a) Determine the 95% confidence interval for the mean difference of gas mileage
between engine A and engine B.

b) Draw your conclusion based on the answer in (a).
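The Condition 1 formula can be checked numerically for Example 12. This is a sketch only: the helper name `two_mean_z_interval` is illustrative, and the difference is taken as μA − μB (reversing the order simply changes the signs of both limits).

```python
from math import sqrt
from statistics import NormalDist

def two_mean_z_interval(mean1, sd1, n1, mean2, sd2, n2, confidence):
    """Confidence interval for mu1 - mu2 with known (or large-sample)
    standard deviations and independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Example 12: engine A (36, sigma 6, n 50) versus engine B (42, sigma 8, n 75).
low_12, high_12 = two_mean_z_interval(36, 6, 50, 42, 8, 75, 0.95)
print(round(low_12, 2), round(high_12, 2))
```

Because both limits are negative, the interval does not contain 0, which is the fact part (b) asks the reader to interpret.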

Example 13:

In an attempt to compare the starting salaries of college graduates majoring in
education and social sciences, random samples of 50 recent college graduates in
each major were selected and the following information was obtained:

Major             Mean     Standard deviation
Education         40554    2225
Social Science    38348    2375

a) Obtain the 90% confidence interval for the mean difference of starting salaries
between education and social science major.

b) Do you think that the difference for the two groups in the general population is
significant?

Condition 3: Unknown population variances but small sample

The 100(1 − α)% confidence interval for the difference between two population means
(μ1 − μ2):

Case I: Assuming Equal Population Variances (σ1² = σ2²)

Formula: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

The critical value is taken from the t-distribution with v = n1 + n2 − 2 degrees of
freedom.

The unknown common variance is estimated by pooling the sample variances as
follows:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

where $s_p$ is the pooled estimate of the population standard deviation.


Example 14:

The following data, recorded in days represent the length of time to recovery for
patients randomly treated with one of two medications to clear up severe bladder
infections:

Medication A           Medication B
n1 = 14                n2 = 16
$\bar{x}_1$ = 17       $\bar{x}_2$ = 19
$s_1^2$ = 1.5          $s_2^2$ = 1.8

Find the 99% confidence interval for μ1 − μ 2 in the mean recovery time for the two
medications, assuming normal populations with equal variances.
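A sketch of the pooled-variance interval for Example 14 follows. The critical value t(0.005, 28) = 2.763 is read from a standard t table and entered by hand; the helper name `pooled_t_interval` is illustrative.

```python
from math import sqrt

def pooled_t_interval(mean1, var1, n1, mean2, var2, n2, t_crit):
    """Confidence interval for mu1 - mu2 assuming equal population
    variances; t_crit must have n1 + n2 - 2 degrees of freedom."""
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    margin = t_crit * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Example 14: 99% confidence, v = 14 + 16 - 2 = 28.
T_0005_DF28 = 2.763  # t table value for alpha/2 = 0.005, v = 28
low_14, high_14 = pooled_t_interval(17, 1.5, 14, 19, 1.8, 16, T_0005_DF28)
print(round(low_14, 2), round(high_14, 2))
```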

Example 15:

The following results are the weights (in grams) of 52 and 63 eggs produced by hens
fed on ordinary corn and vitamins enriched corn respectively. The allocation of hens
to type of corn was done randomly.

Type of corn        Number of eggs    Mean weight of eggs    Standard deviation
Ordinary            22                44.6                   8.56
Vitamin enriched    13                48.1                   6.24

Assume that the weight of eggs is normally distributed for both populations, with
equal variances (σ1² = σ2²). Is there a significant difference between the two
population mean weights? Use a 95% confidence interval. Draw your conclusion
based on the confidence interval.

Case II: Assuming Unequal Population Variances (σ1² ≠ σ2²)

Formula: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Use the t-distribution with v degrees of freedom, where

$$v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\left(s_1^2/n_1\right)^2/(n_1 - 1) + \left(s_2^2/n_2\right)^2/(n_2 - 1)}$$


Example 16:

A study was conducted by the Department of Zoology at the Virginia Polytechnic
Institute and State University to estimate the difference in the amount of the
chemical orthophosphorus measured at two different stations on the James River.
Orthophosphorus is measured in milligrams per liter. 15 samples were collected from
station 1 and 12 samples were collected from station 2. The 15 samples from station 1
had an average orthophosphorus content of 3.84 milligrams per liter and a
standard deviation of 3.07 milligrams per liter, while the 12 samples from station 2 had
an average content of 1.49 milligrams per liter and a standard deviation of
0.80 milligrams per liter. Find a 95% confidence interval for the difference in the true
average orthophosphorus contents at these two stations, assuming the observations
came from normal populations with different variances.
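The unequal-variance formulas can be sketched for Example 16 as below. Two assumptions are made here that the notes do not state: the computed degrees of freedom are rounded down to a whole number before consulting the table, and the critical value t(0.025, 16) = 2.120 is entered by hand from a standard t table. The function name `unequal_variance_df` is illustrative.

```python
from math import floor, sqrt

def unequal_variance_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the unequal-variance case."""
    a, b = s1**2 / n1, s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

# Example 16: station 1 (3.84, s 3.07, n 15), station 2 (1.49, s 0.80, n 12).
v_16 = unequal_variance_df(3.07, 15, 0.80, 12)
df_16 = floor(v_16)   # assumption: round down before using the table
T_0025_DF16 = 2.120   # t table value for alpha/2 = 0.025, v = 16

margin_16 = T_0025_DF16 * sqrt(3.07**2 / 15 + 0.80**2 / 12)
low_16 = (3.84 - 1.49) - margin_16
high_16 = (3.84 - 1.49) + margin_16
print(round(v_16, 2), df_16, round(low_16, 2), round(high_16, 2))
```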

2.4 Paired Observations

Observations are taken from a special experimental situation in which the two
populations are not randomly assigned to experimental units. The observations in a
pair have something in common. The samples taken are related and the variances
of the two populations are not necessarily equal.

Type of samples: Dependent.

The 100(1 − α)% confidence interval for the difference between two population means
(μD = μ1 − μ2):

Formula: $\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n_d}}$, using the t-distribution with v = $n_d$ − 1 degrees of freedom

where μD = population mean difference in the observations
      $\bar{d}$ = sample mean of the differences
      $s_d$ = standard deviation of the differences
      $n_d$ = number of differences


Example 17:

A study published in Chemosphere reported the levels of dioxin TCDD of 10
Massachusetts Vietnam veterans who were possibly exposed to Agent Orange. The
TCDD levels in plasma and in fat tissue are listed in the table below:

Veteran                      1     2     3     4     5     6     7      8     9      10
TCDD levels in plasma        2.5   3.1   2.1   3.5   3.1   1.8   6.0    3.0   36.0   4.7
TCDD levels in fat tissue    4.9   5.9   4.4   6.9   7.0   4.2   10.0   5.5   41.0   4.4

Find the 95% confidence interval for the difference in the means between TCDD
levels in plasma and TCDD levels in tissue. Draw your conclusion.
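For paired data the interval is built from the within-pair differences. The sketch below works through Example 17 with the differences taken as plasma minus fat tissue (an assumed ordering) and with t(0.025, 9) = 2.262 entered by hand from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

plasma = [2.5, 3.1, 2.1, 3.5, 3.1, 1.8, 6.0, 3.0, 36.0, 4.7]
fat = [4.9, 5.9, 4.4, 6.9, 7.0, 4.2, 10.0, 5.5, 41.0, 4.4]

# Differences within each pair (plasma - fat tissue).
d = [p - f for p, f in zip(plasma, fat)]
d_bar = mean(d)
s_d = stdev(d)  # sample standard deviation (n - 1 divisor)
n_d = len(d)

T_0025_DF9 = 2.262  # t table value for alpha/2 = 0.025, v = n_d - 1 = 9
margin_17 = T_0025_DF9 * s_d / sqrt(n_d)
low_17, high_17 = d_bar - margin_17, d_bar + margin_17
print(round(d_bar, 2), round(s_d, 3), round(low_17, 2), round(high_17, 2))
```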

2.5 Confidence Interval For A Population Variance

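The 100(1 − α)% confidence interval for a population variance σ² is given by:

Formula: $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}}$

where $\chi^2_{\alpha/2,\,v}$ and $\chi^2_{1-\alpha/2,\,v}$ are values of the chi-squared distribution with
v = n − 1 degrees of freedom that leave areas of α/2 and 1 − α/2, respectively, to the right.

Note: A 100(1 − α)% confidence interval for a population standard deviation is obtained
by taking the square root of each endpoint of the interval for σ².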

Example 18:

The following are the weights in decagrams of 10 packages of grass seed distributed
by a certain company: 46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2 and 46.0.
Find the 95% confidence interval for the variance of all such packages of grass seed
distributed by this company, assuming a normal population.
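The variance interval for Example 18 can be sketched as follows. The chi-squared critical values for 9 degrees of freedom (19.023 with area 0.025 to the right, 2.700 with area 0.975 to the right) are entered by hand from a standard table, since the Python standard library does not provide chi-squared quantiles.

```python
from statistics import variance

weights = [46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2, 46.0]
n = len(weights)
s2 = variance(weights)  # sample variance with the n - 1 divisor

CHI2_0025_DF9 = 19.023  # area 0.025 to the right, v = 9
CHI2_0975_DF9 = 2.700   # area 0.975 to the right, v = 9

low_18 = (n - 1) * s2 / CHI2_0025_DF9
high_18 = (n - 1) * s2 / CHI2_0975_DF9
print(round(s2, 3), round(low_18, 3), round(high_18, 3))
```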


2.5.1 Confidence interval For The Ratio Between Two Population Variances

To compare two population variances, researchers commonly use the confidence
interval for their ratio, employing the f-distribution. The 100(1 − α)% confidence interval
for the ratio between two population variances is given by:

Formula: $\dfrac{s_1^2}{s_2^2}\cdot\dfrac{1}{f_{\alpha/2}(v_1, v_2)} < \dfrac{\sigma_1^2}{\sigma_2^2} < \dfrac{s_1^2}{s_2^2}\, f_{\alpha/2}(v_2, v_1)$

where $f_{\alpha/2}(v_1, v_2)$ is an f-value with v1 = n1 − 1 and v2 = n2 − 1 degrees of
freedom.

Example 19:

Two independent random samples are chosen from two normal populations with the
following information:

Population 1      Population 2
n1 = 13           n2 = 25
s1 = 5            s2 = 8

i. Construct the 95% confidence interval for the ratio of the variances, σ1²/σ2².
ii. Based on your answer in (i), can we conclude that σ1² ≠ σ2²? Give your reasons.

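A rough numerical check of Example 19 is sketched below. The two critical values, f(0.025; 12, 24) = 2.54 and f(0.025; 24, 12) = 3.02, are entered by hand from a standard F table (upper-tail area 0.025); they are assumptions of this sketch and should be confirmed against the table used in the course.

```python
# Example 19: n1 = 13, n2 = 25, s1 = 5, s2 = 8, so v1 = 12 and v2 = 24.
s1, s2 = 5, 8
ratio = s1**2 / s2**2  # s1^2 / s2^2 = 25 / 64

F_0025_12_24 = 2.54  # f_{alpha/2}(v1, v2), table value (assumed)
F_0025_24_12 = 3.02  # f_{alpha/2}(v2, v1), table value (assumed)

low_19 = ratio / F_0025_12_24
high_19 = ratio * F_0025_24_12
contains_one = low_19 < 1 < high_19
print(round(low_19, 3), round(high_19, 3), contains_one)
```

If the interval contains 1, the data do not rule out equal variances, which bears on part (ii).
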
2.6 Confidence Interval for a population proportion

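A 100(1 − α)% confidence interval for a population proportion is

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

where $\hat{p}$ is the sample proportion and $\hat{q} = 1 - \hat{p}$.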

Example 20:

A random sample of 50 children from a large school is chosen and the number who
are left-handed is noted. It is found that 6 of them are left-handed. Find a 95%
confidence interval for the proportion of children in the school who are left-handed.
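The proportion formula can be sketched for Example 20 as follows. The helper name `proportion_interval` is illustrative, and the critical value is taken from `statistics.NormalDist` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Example 20: 6 left-handed children out of 50, 95% confidence.
low_20, high_20 = proportion_interval(6, 50, 0.95)
print(round(low_20, 4), round(high_20, 4))
```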

Example 21:

In a market survey, 25 people out of a random sample of 100 from a certain area said
that they used a particular brand of soap. Find a 90% confidence interval for the
population proportion.


2.6.1 Confidence interval for the difference between two population proportions.

Assumptions:
• The two samples are taken from independent populations
• The samples taken from each population are sufficiently large.

The 100(1 − α)% confidence interval for π1 − π2 is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

Example 22:

In a sample of 200 surgeons, 15% thought the government should control healthcare.
In a sample of 200 general practitioners, 21% felt this way. Find the 95% confidence
interval for the difference in the proportions. Is there a difference in the proportions?
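As a sketch of the two-proportion interval applied to Example 22, with the difference taken as surgeons minus general practitioners (an assumed ordering) and an illustrative helper name:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(p1, n1, p2, n2, confidence):
    """Large-sample confidence interval for the difference
    between two population proportions."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin

# Example 22: 15% of 200 surgeons versus 21% of 200 general practitioners.
low_22, high_22 = two_proportion_interval(0.15, 200, 0.21, 200, 0.95)
print(round(low_22, 4), round(high_22, 4))
```

Whether the resulting interval contains 0 is the fact needed to answer the final question of the example.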

2.7 Sample size determination

In survey research, sample size determination is necessary. Generalizing from sample
data to a population parameter is valid only when the sample is selected at random.
Therefore the researcher must employ probabilistic (random) sampling for data
collection. The question is then how to determine the minimum sample size so that
the results of the research can represent the whole population.

2.7.1 Using sample mean to estimate population mean

Formula:

If $\bar{x}$ is used as an estimate of μ, we can be 100(1 − α)% confident that the error will
not exceed a specified amount e when the sample size is

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$

Discussion:
Deduce the formula using a confidence interval for a population mean with the
z-distribution.


Example 23:

A study is carried out to determine the average zinc concentration in a river. How large
a sample of locations is required if we want to be 95% confident that our estimate of
μ is off by less than 0.05? From previous research, it is known that the population
standard deviation is 0.3.
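The sample size formula for a mean can be sketched as below. Rounding the result up to the next whole number is an assumption of this sketch (a fractional observation is not possible, and rounding down would give a larger error than specified); the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence):
    """Smallest n for which the error of x-bar does not exceed e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)  # round up to a whole observation

# Example 23: sigma = 0.3, e = 0.05, 95% confidence.
n_23 = sample_size_for_mean(0.3, 0.05, 0.95)
print(n_23)
```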

2.7.2 Using sample proportion to estimate population proportion

Formula:

Case 1: If an estimate of the population proportion p is available

If $\hat{p}$ is used as an estimate of p, we can be 100(1 − α)% confident that the error will
be less than a specified amount e when the sample size is approximately

$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{e^2}$$

Discussion:
Deduce the formula using the confidence interval for a single population proportion.

Example 24:

For a test market, find the sample size needed to estimate the true proportion of
consumers satisfied with a certain new product to within ± 0.04:

i) at the 90% confidence level, assuming the proportion is 0.5;
ii) at the 95% confidence level, assuming that the estimate of p is 0.02.
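Both parts of Example 24 use the Case 1 formula with different inputs. A minimal sketch follows, again assuming the result is rounded up to a whole number and using an illustrative function name.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_hat, e, confidence):
    """Approximate n so that the error of p-hat is less than e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p_hat * (1 - p_hat) / e**2)  # round up

n_24_i = sample_size_for_proportion(0.5, 0.04, 0.90)    # part (i)
n_24_ii = sample_size_for_proportion(0.02, 0.04, 0.95)  # part (ii)
print(n_24_i, n_24_ii)
```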


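Case 2: If no estimate of the population proportion p is available

Sometimes no estimate of p is available. Since $\hat{p}\hat{q} = \hat{p}(1 - \hat{p})$ is never larger
than 1/4 (its value when $\hat{p}$ = 0.5), the following formula gives a sample size that is
large enough whatever the value of p:

$$n = \frac{z_{\alpha/2}^2}{4e^2}$$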
