Buss. Stat CH-2

1. This chapter discusses statistical estimation and how to use sample statistics to estimate unknown population parameters. 2. There are two types of statistical estimates - point estimates which are single values, and interval estimates which provide a range of values. 3. Good estimators are unbiased, efficient, and consistent. The sample mean and proportion are consistent estimators as their standard errors decrease with larger sample sizes.

Chapter Two- Statistical Estimation, Department of AcFn

CHAPTER TWO
STATISTICAL ESTIMATION
2.1. Basic Concepts
Statistical inference is the process of drawing conclusions about a population from
the properties of a sample. An important aspect of statistical inference is using estimates to
approximate the value of an unknown population parameter. This chapter studies
different kinds of estimators and lays the foundations for making statistical inferences
about the population mean and proportion.
[Figure: diagram linking Population, Sample, Numerical data, Analyzed data, and Inference.]

Definitions
Confidence Interval: An interval estimate with a specific level of confidence

Confidence Level: The probability that the interval estimate will contain the parameter.
Consistent Estimator: An estimator which gets closer to the value of the parameter as
the sample size increases.

Degrees of Freedom: The number of data values which are allowed to vary once a
statistic has been determined.

Estimator: A sample statistic which is used to estimate a population parameter. A good
estimator is unbiased, consistent, and relatively efficient.

Estimate: A specific value which an estimator takes for a given sample.

Interval Estimate: A range of values used to estimate a parameter.

Point Estimate: A single value used to estimate a parameter.

2.1.1. Estimator and Estimates


Any sample statistic used to estimate a population parameter is called an
estimator. For example, the sample mean X̄ is an estimator of the population mean μ, and the
sample proportion p̄ is an estimator of the population proportion P.

Business Statistics - AcFn 2132 Page 12



Statistical estimation refers to the procedure of using a sample statistic (measure) to
estimate a population parameter.

2.1.2. Types of Statistical Estimates

We can make two types of estimates about a population: a point estimate and an
interval estimate.
A. Point estimate: A single number used to estimate an unknown population
parameter. It is computed from the sample data. The sample mean, X̄, is a point estimator of the
population mean, μ, and the sample proportion, p̄, is a point estimator of the population
proportion, P.
B. Interval estimate: A range of values used to estimate the population parameter. It
places the unknown population parameter between two limits and takes into account
the error associated with the sampling procedure. It indicates that error in two ways:
by the extent of its range and by the probability of the true population parameter
lying within that range.
Example: The mean age of men attending a show is between 28 and 36 years, which
can be written as 28 ≤ μ ≤ 36.

2.1.3. Criteria of a Good Estimator


There are three criteria developed to compare statistical estimators in terms of their
worth as an estimator:

1. Unbiasedness: An unbiased estimator is a statistic whose expected value equals
the population parameter being estimated.

Example: The sample mean, X̄, is an unbiased estimator of the population mean, μ; that is, E(X̄) = μ.
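To make the idea of an expected value concrete, the following Python sketch lists every possible sample of size 2 from a small population and averages the resulting sample means. The population values are hypothetical and chosen only for illustration; they do not come from this chapter.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population, used only to illustrate unbiasedness.
population = [2, 4, 6, 8, 10]
mu = mean(population)

# Every possible sample of size 2 drawn without replacement.
sample_means = [mean(sample) for sample in combinations(population, 2)]

# Averaging over all possible samples gives the expected value of the sample mean.
expected_value_of_xbar = mean(sample_means)

print(mu, expected_value_of_xbar)
```

Individual sample means differ from μ, but their average over all possible samples matches it, which is what unbiasedness requires.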

2. Efficiency: Efficiency refers to the size of the standard error (standard deviation) of the
statistic. When two estimators are compared at the same sample size, the one with the
smaller standard error is the more efficient and therefore the more reliable.

3. Consistency: A statistic is a consistent estimator of a population parameter if, as the
sample size increases, it becomes almost certain that the value of the statistic comes
very close to the value of the population parameter. A consistent estimator therefore
becomes more reliable with large samples, and its standard error
becomes smaller as the sample size gets larger.


The sample mean and the sample proportion are consistent estimators because, by their
formulas, the standard error becomes small as n gets large:

\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \qquad \text{and} \qquad \sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} \]
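The shrinking standard error can be checked numerically. The sketch below evaluates both standard error formulas for increasing n. The values σ = 10 and p = 0.5 are assumptions chosen for illustration, and the function names are hypothetical.

```python
from math import sqrt

def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p * q / n), with q = 1 - p."""
    return sqrt(p * (1 - p) / n)

# Illustrative values only: sigma = 10 and p = 0.5 are assumptions.
for n in (25, 100, 400):
    print(n, round(se_mean(10, n), 4), round(se_proportion(0.5, n), 4))
```

Each fourfold increase in n halves both standard errors, since n enters through its square root.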

2.2. Interval Estimates and Confidence Intervals
The probability that we associate with an interval estimate is called the confidence level.
This probability indicates how confident we are that the interval estimate will include
the population parameter. A higher probability means greater confidence.
Interval estimation establishes an interval, consisting of a lower limit and an upper limit,
in which the true value of the population parameter is expected to fall. This interval is
called a “confidence interval” in inferential statistics.

2.2.1 Interval Estimation of a Population Mean

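A. Large Sample Case (n ≥ 30) and Standard Deviation Is Known

\[ \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

This form applies when \( n \le 5\%\,N \) or N is very large; otherwise, multiply the
standard error by the finite population multiplier

\[ \sqrt{\frac{N-n}{N-1}} \]

Here \( 1-\alpha \) is the confidence coefficient, and \( Z_{\alpha/2} \) is the Z value
providing an area of \( \alpha/2 \) in the upper tail of the standard normal distribution.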
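A minimal sketch of how the finite population multiplier might be applied is given below. The function name `standard_error` is hypothetical, the 5% threshold follows the rule stated above, and the numbers in the usage lines are assumptions chosen for illustration.

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the mean, with the finite population
    multiplier applied when n exceeds 5% of the population size N."""
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return se

# Very large (or unknown) population: no correction is applied.
print(round(standard_error(5, 49), 4))

# Hypothetical finite population: n = 50 is more than 5% of N = 800.
print(round(standard_error(8, 50, N=800), 4))
```

The multiplier is always below 1 when n > 1, so the corrected standard error is smaller than the uncorrected one.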

A review of the normal distribution will illustrate the probability in terms of the
interval estimate around the mean.


Note that the shaded area represents the
probability between the interval endpoints, expressed as z-scores.
Example: Suppose the mean is 0 and the
standard deviation is 1.
Then a 95% (0.95) probability interval around
the mean is −1.96 ≤ z ≤ 1.96.
Note that the area to the right of z = 1.96 is
0.025 and the area to the left of z = −1.96 is
also 0.025; added together they equal
0.05. This area, which is not part of the
probability interval of interest, is called the
significance level, α. In this
example α = 0.05 (5%) and α/2 = 0.025.

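The z-score of the standard normal distribution is used to determine the
interval endpoints that correspond to the chosen probability, or degree of certainty, for
the interval estimate.
An interval estimator for the mean is given by the following formula:

\[ \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

or

\[ \bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
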
For reasonably large samples, the central limit theorem implies the following:

1. 90% of the sample means selected from a population will be within 1.64 standard
errors (σ/√n) of the population mean μ.
2. 95% of the sample means selected from a population will be within 1.96 standard
errors of the population mean μ.
3. 99% of the sample means will lie within 2.58 standard errors of the
population mean μ.
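These percentages can be checked against the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module to compute the area within 1.64, 1.96, and 2.58 standard units of the mean; the helper name `central_area` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def central_area(k):
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1.64, 1.96, 2.58):
    print(k, round(central_area(k), 3))
```

The three areas come out close to 0.90, 0.95, and 0.99, which is why these multipliers are paired with the 90%, 95%, and 99% confidence levels.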

Example: Find an interval estimate of the population mean from a random sample of
size 49 if the population standard deviation is 5 and the sample mean is 15. Use a
95% confidence level.

Solution:

\[ \sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{49}} = 0.7143, \]

and the given X̄ = 15.

Since the level of significance α is 5% (100% − 95%), or 0.05, α/2 = 0.025. In the
standard normal distribution reference table, an area of 0.025 in each tail of the
normal distribution corresponds to \( Z_{0.025} = 1.96 \).


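So, the 95% confidence interval for the example is

\[ \bar{X} \pm Z_{\alpha/2}\,\sigma_{\bar{X}} = 15 \pm 1.96(0.7143) = 15 \pm 1.40 \]

So Confidence Interval = (13.60, 16.40).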
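The calculation for this example (n = 49, σ = 5, X̄ = 15, 95% confidence) can be sketched in Python as follows. The function name `z_interval` is hypothetical. The Z value is computed with `NormalDist().inv_cdf` rather than read from a table, so it is about 1.95996 instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper-tail area alpha/2
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(15, 5, 49, 0.95)
print(round(low, 2), round(high, 2))  # approximately 13.6 and 16.4
```

Raising the confidence level in the call (for example to 0.99) widens the interval, because a larger Z value multiplies the same standard error.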

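The confidence interval (1 − α) for the population mean, when the sample size is
large (n ≥ 30), is:

\[ \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

or, in terms of the endpoints,

\[ \left( \bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\; \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right) \]

The maximum error of the estimate, E, with level of confidence 1 − α, is the error
associated with the estimate of the population mean from the sample mean and is
given by the formula below:

\[ E = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

When both the mean and the standard deviation are estimated from the sample and
n ≥ 30, the confidence interval is as above, with the sample standard deviation s in place of σ:

\[ E = Z_{\alpha/2} \frac{s}{\sqrt{n}} \]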

Example: Find the 99% confidence interval estimate of the true population mean
income if a sample of 100 families gives a sample mean of $28,500. From previous
experience we know that the population standard deviation is $5,000.
Solution: Using α = 1 − 0.99 = 0.01, the tail probability is α/2 = 0.01/2 = 0.005; thus,
\( Z_{0.005} = 2.58 \).

So our CI estimate is:

\[ 28{,}500 \pm 2.58 \frac{5{,}000}{\sqrt{100}} = 28{,}500 \pm 1{,}290 = (27{,}210,\; 29{,}790) \]

B. Small Sample Case (n < 30) with Unknown Standard Deviation

In the case of a small sample size (n < 30) with unknown standard deviation, the t
distribution is applied:

\[ \bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \]

where:
• \( 1-\alpha \) is the confidence coefficient.
• \( t_{\alpha/2} \) is the t value providing an area of α/2 in the upper tail of a t
distribution with n − 1 degrees of freedom.
• s is the sample standard deviation.

• The population is assumed to have a normal probability distribution.


I. Using the t Distribution to Find Interval Estimates
The t distribution is commonly called Student’s t distribution, or simply Student’s
distribution. The t distribution is required for estimation whenever the sample
size is less than 30 and the population standard deviation is not known. Furthermore, in
using the t distribution, we assume that the population is normal or approximately
normal.

II. Characteristics of the t Distribution


The t distribution shares some characteristics of the normal distribution and differs
from it in others. The t distribution is similar to the standard normal distribution in
these ways:
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are equal to 0 and are located at the center of the
distribution.
4. The curve never touches the x axis.
The t distribution differs from the standard normal distribution in the following ways:
1. The variance is greater than 1.
2. The t distribution is actually a family of curves based on the concept of degrees of
freedom, which is related to sample size.
3. As the sample size increases, the t distribution approaches the standard normal
distribution.

Figure. Normal distribution, t distribution for sample size n=21, and t distribution for sample size
n=6.

III. Degrees of Freedom


Many statistical distributions use the concept of degrees of freedom, and the formulas
for finding the degrees of freedom vary for different statistical tests. The degrees of
freedom are the number of values that are free to vary after a sample statistic has been
computed, and they tell the researcher which specific curve to use when a distribution
consists of a family of curves.

For example, if the mean of 5 values is 10, then 4 of the 5 values are free to vary. But
once 4 values are selected, the fifth value must be a specific number to get a sum of 50,
since 50/5 = 10. Hence, the degrees of freedom are 5 - 1 = 4, and this value tells the
researcher which t curve to use.
Degree of freedom = n-1
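The "free to vary" idea can be illustrated with a short Python sketch. The four freely chosen values below are hypothetical, and the function name `last_value` is an assumption made for this illustration.

```python
def last_value(free_values, n, target_mean):
    """Value forced on the n-th observation once n - 1 values are chosen
    and the mean of all n values is fixed."""
    return n * target_mean - sum(free_values)

# Four values chosen freely (hypothetical); the mean of all five must be 10.
free = [8, 12, 9, 11]
fifth = last_value(free, n=5, target_mean=10)

print(fifth)                    # the only value that makes the sum 50
print((sum(free) + fifth) / 5)  # the mean of all five values
```

Whatever four values are picked, the fifth is determined by the fixed mean, which is why only n − 1 = 4 values are free.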

Example 1: An agricultural chemical retail firm wants to estimate the average number
of gallons of weed killer sold per day for the purpose of accurately forecasting and
controlling inventory. Twelve business days were monitored, and average daily sales of
10 gallons were recorded. The sample yielded a standard deviation of 2 gallons. Calculate
the confidence limits at the 95% level.
Solution:
n = 12, X̄ = 10, s = 2, CL = 0.95

α = 1 − CL = 1 − 0.95 = 0.05, df = n − 1 = 12 − 1 = 11
α/2 = 0.05/2 = 0.025, and \( t_{\alpha/2} = t_{0.025} \) at df = 11 is 2.201

Confidence interval:

\[ \bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 10 \pm 2.201 \frac{2}{\sqrt{12}} = 10 \pm 2.201(0.57735) = 10 \pm 1.27 = (8.73,\; 11.27) \]
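The same computation can be sketched in Python. Because the standard library has no t distribution, this sketch assumes the critical value is read from a t table (2.201 for df = 11) and passed in as an argument. The function name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Small-sample confidence interval for the mean.
    t_crit is the table value t_{alpha/2} with n - 1 degrees of freedom."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Weed killer example: n = 12, mean 10 gallons, s = 2 gallons, t = 2.201.
low, high = t_interval(10, 2, 12, 2.201)
print(round(low, 2), round(high, 2))  # approximately 8.73 and 11.27
```

Passing the table value explicitly keeps the sketch free of third-party dependencies; a statistics library with a t distribution could supply it instead.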

Example 2: A drug company is testing a new drug which is supposed to reduce blood
pressure. For the six people used as subjects, the average drop
in blood pressure is 2.28 points, with a standard deviation of 0.95 points. What is the 95%
confidence interval for the mean change in pressure?
Solution:

X̄ = 2.28, s = 0.95, 1 − α = 0.95 ⇒ α = 0.05, α/2 = 0.025
⇒ \( t_{\alpha/2} = 2.571 \) with df = 5, from the table.
The required interval is

\[ \bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 2.28 \pm 2.571 \frac{0.95}{\sqrt{6}} = 2.28 \pm 0.997 = (1.28,\; 3.28) \]

2.2.2. Interval Estimation for a Population Proportion


Samples are often used to estimate a proportion of occurrences in a population. For
example, the government uses a sampling procedure to estimate the unemployment rate,
or the proportion of unemployed people, in the country's workforce.

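Let’s express the proportion of successes in a sample by \( \bar{p} \).

Mean of the sampling distribution of the proportion is:

\[ \mu_{\bar{p}} = p \]

The interval is \( p \pm z\sigma_{p} \) for the population and \( \bar{p} \pm z\sigma_{\bar{p}} \) for the sample.

Standard error of the proportion is:

\[ \sigma_{\bar{p}} = \sqrt{\frac{pq}{n}}, \qquad q = 1 - p \]

Confidence limits are:

\[ \bar{p} \pm Z \sqrt{\frac{pq}{n}} \]

When p is unknown, the sample proportion \( \bar{p} \) is used in its place.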

Example: When a sample of 70 retail executives was surveyed regarding the poor
performance of the retail industry, 66% believed that decreased sales were due to
unseasonably warm temperatures, resulting in consumers’ delaying purchase of cold-
weather items.
a) Estimate the standard error of the proportion of retail executives who blame warm
weather for low sales.
b) Find the upper and lower confidence limits for this proportion, given a 95% confidence
level.

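Solution: n = 70 and \( \bar{p} = 0.66 \), so q = 1 − 0.66 = 0.34.

a) \[ \sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.66(0.34)}{70}} = 0.0566 \]

b) \[ \bar{p} \pm Z\sqrt{\frac{pq}{n}} = 0.66 \pm 1.96(0.0566) = 0.66 \pm 0.111 = [0.549,\; 0.771] \]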
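The two parts of this example can be reproduced with a short Python sketch. The function name `proportion_interval` is hypothetical, and the Z value is computed with `NormalDist().inv_cdf` rather than taken from a table. With n = 70 and a sample proportion of 0.66, the standard error is about 0.0566 and the 95% limits work out to roughly 0.549 and 0.771.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Standard error and confidence limits for a population proportion,
    using the sample proportion p_hat in place of the unknown p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return se, p_hat - z * se, p_hat + z * se

se, low, high = proportion_interval(0.66, 70, 0.95)
print(round(se, 4), round(low, 3), round(high, 3))
```

The limits are symmetric about 0.66, which gives a quick sanity check on any hand calculation.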

2.2.3. Determining Sample Size in Estimation

So far, we have used the symbol n for sample size instead of a specific
number. Now we need to know how large the sample should be. If it is too small, we
may fail to achieve the objective of our analysis. But if it is too large, we waste resources
when we gather the sample.
Sampling error is controlled by selecting a sample that is adequate in size. In general,
the more precision we want, the larger the sample we will need to take.
The correct sample size depends on three factors. These factors are:

a. The desired level of confidence,
b. The margin of error the researcher will tolerate, and
c. The variability in the population being studied.

A. Sample Size for Estimating the Mean


The confidence level determines the level of significance, α, used in estimating the
population mean. The knowledge from the previous section can be used to find the appropriate
sample size, n, for estimating the population mean with a given degree of certainty or
probability.

Since the maximum margin of error, E, is given by the formula

\[ E = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

solving for n gives the sample size needed for a tolerated level of error, E:

\[ n = \left[ \frac{Z_{\alpha/2}\,\sigma}{E} \right]^2 \quad \text{(n is expected to be large, } n \ge 30\text{)} \]

\[ n = \left[ \frac{t_{\alpha/2}\,s}{E} \right]^2 \quad \text{(n is expected to be small, } n < 30\text{)} \]

Example 1: The average price of gasoline is expected to be $1.45 per gallon, and the
standard deviation for a specific National State is $0.10 per gallon. It is believed that the
mean price per gallon has changed. How many gas stations should be sampled
to estimate the new National State's mean with a maximum error of the estimate of
$0.01 and a 90% level of confidence?
Solution: α = 0.10, so α/2 = 0.05. From the reference table, \( Z_{0.05} = 1.65 \); σ = 0.10 and E = 0.01. So,

\[ n = \left[ \frac{1.65 \times 0.10}{0.01} \right]^2 = (16.5)^2 = 272.25 \]

So n = 273 (round up to the next integer).

Example 2: We wish to know the average thickness of washers in a shipment. We are
willing to take a risk of 5 times in 100 that the error in our estimate will be 0.002 inch (E)
or more. From a sample of another lot we estimate the standard deviation to be 0.00359 inch
with 9 degrees of freedom. How large a sample is needed?
Solution: α = 5/100 = 0.05 (5%), so 1 − α = 0.95.
From the reference table, with α/2 = 0.025 and degrees of freedom df = 9, t = 2.262; E = 0.002.
Then, using the formula,

\[ n = \left[ \frac{2.262 \times 0.00359}{0.002} \right]^2 \approx 16.5 \]

So, n = 17 (round up to the next integer).
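Both sample size examples follow the same pattern: multiply the critical value by the standard deviation, divide by the tolerated error, square, and round up. A minimal Python sketch, with the hypothetical function name `sample_size_for_mean` and critical values taken from the tables as in the examples:

```python
from math import ceil

def sample_size_for_mean(critical_value, std_dev, max_error):
    """Smallest n satisfying critical_value * std_dev / sqrt(n) <= max_error.
    critical_value is Z_{alpha/2} (large n) or t_{alpha/2} (small n)."""
    return ceil((critical_value * std_dev / max_error) ** 2)

# Gasoline example: Z = 1.65, sigma = 0.10, E = 0.01.
print(sample_size_for_mean(1.65, 0.10, 0.01))

# Washer example: t = 2.262 (df = 9), s = 0.00359, E = 0.002.
print(sample_size_for_mean(2.262, 0.00359, 0.002))
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed E.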

B. Sample Size for Estimating the Proportion


The procedures for determining sample sizes for estimating a population proportion are
similar to those for estimating a population mean. The sample size needed is
determined by the formula:

\[ n = pq \left[ \frac{Z_{\alpha/2}}{E} \right]^2, \qquad q = 1 - p \]
Example: Suppose the Prime Minister of a country wants an estimate for the proportion
of the ‘Kebele’ administrators who support the country’s current economic policy. The
Prime Minister wants the estimate to be within ± 0.04 of the true proportion and a 95%
level of confidence. The secretary of the office of the Prime Minister estimated the
proportion supporting the current policy to be 0.60. What sample size is required?
Solution: 1 − α = 0.95, α = 0.05, α/2 = 0.025, and 0.5 − 0.025 = 0.4750, so Z = 1.96
(the table value for an area of 0.4750 between the mean and Z). E = 0.04, p = 0.60,
and q = 0.40. So,

\[ n = pq \left[ \frac{Z_{\alpha/2}}{E} \right]^2 = (0.60)(0.40) \left[ \frac{1.96}{0.04} \right]^2 = 576.24 \]

which rounds up to 577.
Therefore, a sample of 577 ‘Kebele’ administrators is required.
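The sample size formula for a proportion can be sketched in Python as follows. The function name `sample_size_for_proportion` is hypothetical. The second call uses p = 0.5, a common conservative choice when no prior estimate of the proportion is available; that choice is an assumption of this sketch and is not stated in the example.

```python
from math import ceil

def sample_size_for_proportion(p, z, max_error):
    """Smallest n for which z * sqrt(p * q / n) <= max_error, with q = 1 - p."""
    return ceil(p * (1 - p) * (z / max_error) ** 2)

# Prime Minister example: p = 0.60, Z = 1.96, E = 0.04.
print(sample_size_for_proportion(0.60, 1.96, 0.04))

# No prior estimate: p = 0.5 maximizes p * q and so gives the largest n.
print(sample_size_for_proportion(0.50, 1.96, 0.04))
```

Because p(1 − p) peaks at p = 0.5, a prior estimate away from 0.5 always reduces the required sample size.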

Exercises
1. The operations manager of a certain Tele-center is in the process of developing an
operation plan. For that purpose, he takes a random sample of 60 calls from the
company records and finds that the sample mean length of a call is 4.26
minutes. Past history for these types of calls has shown that the population
standard deviation for call length is about 1.1 minutes. Assuming that the
population is normally distributed and that he wants 95% confidence, help
him estimate the population mean.
2. A survey conducted by a CSA found that the sample mean age of men was 44
years and the sample mean age of women was 47 years. Altogether, 454 people
from Oromia were included in the reader poll –340 women and 114 men.
Assume that the population standard deviation of age for both men and women
is 8 years. Develop a 99% confidence interval estimate for the mean age of the
population of men.
3. Suppose that a survey is being conducted in a company that has 800 workers. A
random sample of 50 of these workers reveals that the average sample age is 34.3
years, and the sample standard deviation is 8 years. Assuming normality,
construct a 98% confidence interval to estimate the average age of all workers in
this company.
4. A recent study showed that the modern working person experiences an average
of 2.1 hours per day of distractions (phone calls, e-mails, impromptu visits, etc.).
A random sample of 50 workers for a large corporation found that these workers
were distracted an average of 1.8 hours per day and the population standard
deviation was 20 minutes. Estimate the true mean population distraction time
with 90% confidence, and compare your answer to the results of the study.
5. A survey of 30 emergency room patients found that the average waiting time for
treatment was 174.3 minutes. Assuming that the population standard deviation
is 46.5 minutes, find the best point estimate of the population mean and the 99%
confidence interval of the population mean.
6. Ten randomly selected people were asked how long they slept at night. The
mean time was 7.1 hours, and the standard deviation was 0.78 hour. Find the
95% confidence interval of the mean time. Assume the variable is normally
distributed.
7. If a random sample of 27 items produces a mean of 128.4 and standard deviation
of 20.6 what is the 98% confidence interval for μ ? Assume that x is normally
distributed for the population. What is the point estimate?
8. A meteorologist who sampled 13 thunderstorms found that the average speed at
which they traveled across a certain state was 15 miles per hour. The standard
deviation of the sample was 1.7 miles per hour. Find the 99% confidence interval
of the mean. If a meteorologist wanted to use the highest speed to predict the
times it would take storms to travel across the state in order to issue warnings,
what figure would she likely use? (Answer: 13.6 < μ < 16.4; 16.4 miles per hour)
9. A recent study of 28 employees of XYZ Company showed that the mean of the
distance they traveled to work was 14.3 miles. The standard deviation of the
sample mean was 2 miles. Find the 95% confidence interval of the true mean. If a
manager wanted to be sure that most of his employees would not be late, how
much time would he suggest they allow for the commute if the average speed
were 30 miles per hour?
10. For a group of 22 college football players, the mean heart rate after a morning
workout session was 86 beats per minute, and the standard deviation was 5. Find
the 90% confidence interval of the true mean for all college football players after
a workout session.
11. The national average for the number of students per teacher for all U.S. public
schools is 15.9. A random sample of 12 school districts from a moderately
populated area showed that the mean number of students per teacher was 19.2
with a variance of 4.41. Estimate the true mean number of students per teacher
with 95% confidence. How does your estimate compare with the national
average?
12. A gasoline service station shows a standard deviation of Birr 6.25 for the charges
made by the credit card customers. Assume that the station’s management
would like to estimate the population mean gasoline bill for its credit card
customers to be within an error of Birr 1.00. For a 95% confidence level, how
large a sample would be necessary?
13. A random sample of shoppers at a convenience store is selected to see how much
they spent on that visit. The standard deviation of the population is $6.43. How
large a sample must be selected if the researcher wants to be 99% confident of
finding whether the true mean differs from the sample mean by $1.50?
14. A random sample of 205 college students were asked if they believed that places
could be haunted, and 65 responded yes. Estimate the true proportion of college
students who believe in the possibility of haunted places with 99% confidence.
According to Time magazine, 37% of Americans believe that places can be
haunted.
15. A survey conducted by Sallie Mae and Gallup of 1404 respondents found that
323 students paid for their education by student loans. Find the 90% confidence
interval of the true proportion of students who paid for their education by student loans.
16. The national average for the percentage of high school graduates taking the SAT
is 49%, but the state averages vary from a low of 4% to a high of 92%. A random
sample of 300 graduating high school seniors was polled across a particular
tristate area, and it was found that 195 had taken the SAT. Estimate the true
proportion of high school graduates in this region who take the SAT with 95%
confidence.
17. A researcher wishes to estimate, with 95% confidence, the proportion of people
who own a home computer. A previous study shows that 40% of those
interviewed had a computer at home. The researcher wishes to be accurate
within 2% of the true proportion. Find the minimum sample size necessary.
18. It is believed that 25% of U.S. homes have a direct satellite television receiver.
How large a sample is necessary to estimate the true proportion of homes which
do with 95% confidence and within 3 percentage points? How large a sample is
necessary if nothing is known about the proportion?
19. America’s young people are heavy Internet users; 87% of Americans ages 12 to
17 are Internet users (The Cincinnati Enquirer, February 7, 2006). MySpace was
voted the most popular website by 9% in a sample survey of Internet users in this
age group. Suppose 1400 youths participated in the survey. What is the margin
of error, and what is the interval estimate of the population proportion for which
MySpace is the most popular website? Use a 95% confidence level.
