
Business Statistics Lecture Notes Chapter II

The document discusses statistical estimation and interval estimates. It defines key terms like estimators, point estimates, interval estimates, and confidence intervals. It also covers how to calculate confidence intervals for a population mean using both large sample and small sample cases.

Uploaded by

minilek

Chapter Two

Statistical Estimation
2.1. Basic Concepts

Statistical inference is the process of making judgments about a population based on the
properties of a sample. An important aspect of statistical inference is using estimates to
approximate the value of an unknown population parameter. This chapter studies different kinds
of estimators and lays the foundations for making statistical inferences about the population
mean and proportion.

[Figure: diagram linking population, sample, numerical data, analyzed data, and inference.]

Definitions
Confidence Interval: an interval estimate with a specified level of confidence.
Confidence Level: the probability that the interval-estimation procedure produces an interval
that contains the parameter.
Consistent Estimator: an estimator whose value gets closer to the value of the parameter as the
sample size increases.
Degrees of Freedom: the number of data values that are free to vary once a statistic has
been determined.
Estimator: a sample statistic used to estimate a population parameter. A good estimator should
be unbiased, consistent, and relatively efficient.
Estimate: a specific value that an estimator takes for a particular sample.
Interval Estimate: a range of values used to estimate a parameter.
Point Estimate: a single value used to estimate a parameter.
2.1.1. Estimator and Estimates
Any sample statistic used to estimate a population parameter is called an estimator. The sample
mean \(\bar{X}\) is an estimator of the population mean \(\mu\), and the sample proportion
\(\bar{p}\) is an estimator of the population proportion \(P\).
Statistical estimation refers to the procedure of using a sample statistic (measure) to estimate
a population parameter.

Business Statistics - AcFn 2132

2.1.2. Types of Statistical Estimates

We can make two types of estimates about a population: a point estimate and an interval
estimate.
A. Point estimate: a single number, computed from sample data, that is used to estimate an
unknown population parameter. The sample mean \(\bar{X}\) is a point estimator of the population
mean \(\mu\), and the sample proportion \(\bar{p}\) is a point estimator of the population
proportion \(P\).


B. Interval estimate: a range of values used to estimate the population parameter. It places the
unknown population parameter between two limits and takes into account the error associated with
the sampling procedure. It indicates this error in two ways: by the extent of its range and by
the probability of the true population parameter lying within that range.
Example: The mean age of men attending a show is between 28 and 36 years, which can be
written as \(28 \le \mu \le 36\).

2.1.3. Criteria of a Good Estimator


Three criteria are used to compare statistical estimators:
1. Unbiasedness: an unbiased estimator is a statistic whose expected value equals the
population parameter being estimated. Example: the sample mean \(\bar{X}\) is an unbiased
estimator of the population mean \(\mu\).
2. Efficiency: refers to the size of the standard error (standard deviation) of the statistic.
When two estimators are compared for the same sample size, the one with the smaller
standard error is the more efficient and therefore the more reliable.
3. Consistency: a statistic is a consistent estimator of a population parameter if, as the
sample size increases, it becomes almost certain that the value of the statistic comes very
close to the value of the population parameter. If an estimator is consistent, it becomes
more reliable with large samples. The standard error of a consistent estimator becomes
smaller as the sample size gets larger.
The sample mean and the sample proportion are consistent estimators: from their formulas, as
n gets large the standard error becomes small, that is,

\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \sigma_{\bar{p}} = \sqrt{\frac{pq}{n}}. \]
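The shrinking standard error can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function names and the values σ = 5 and p = 0.5 are illustrative assumptions.

```python
import math

def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p * q / n), with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

# Illustrative values (assumed, not from the notes): sigma = 5, p = 0.5
for n in (25, 100, 400):
    print(n, round(se_mean(5, n), 4), round(se_proportion(0.5, n), 4))
```

Each fourfold increase in n halves both standard errors, because n enters the formulas through its square root.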

2.2 Interval Estimates and Confidence Intervals


The probability that we associate with an interval estimate is called the confidence level. This
probability indicates how confident we are that the interval estimate will include the population
parameter; a higher probability means more confidence. Interval estimation establishes an
interval consisting of a lower limit and an upper limit within which the true value of the
population parameter is expected to fall. This interval is called a confidence interval.

2.2.1 Interval Estimation of a Population Mean


A. Large Sample Case (n ≥ 30) with Known Standard Deviation

\[ \bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

when n ≤ 5% of N or N is very large; otherwise multiply the standard error by the finite
population correction factor \(\sqrt{\frac{N-n}{N-1}}\).
Here \(1-\alpha\) is the confidence coefficient, and \(z_{\alpha/2}\) is the z value
providing an area of \(\alpha/2\) in the upper tail of the standard normal distribution.
A review of the normal distribution illustrates the probability in terms of the interval estimate
around the mean.



Note that the shaded area under the normal curve represents the probability that the z-score
lies within the interval.
Example: If the mean is 0 and the standard deviation is 1, then a 95% (0.95) probability
interval around the mean is \(-1.96 \le z \le 1.96\). Note that the area to the right of
z = 1.96 is 0.025 and the area to the left of z = -1.96 is also 0.025. Added together, they are
called the significance level: \(\alpha = 0.025 + 0.025 = 0.05\), and \(1 - \alpha = 0.95\). In
this example alpha = 0.05, or 5%.
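The tail areas above can be turned into critical values without a printed table. The sketch below is an illustration added to these notes; it uses Python's standard-library `statistics.NormalDist`, and the helper name `z_critical` is an assumption.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z such that the central area under the standard normal curve equals `confidence`."""
    alpha = 1 - confidence          # total area in the two tails
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

The function leaves α/2 in each tail, which is why it asks the inverse CDF for the point with area 1 − α/2 to its left.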

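The z-score is used to determine the interval endpoints that correspond to the degree of
certainty (probability) chosen for the interval estimator.
An interval estimator for the mean is given by the following formula:

\[ \bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

or

\[ \bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}}, \quad \text{where } \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}. \]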
For reasonably large samples, the central limit theorem implies the following:

1. 90% of the sample means selected from a population will be within 1.645 standard
errors of the population mean μ.
2. 95% of the sample means selected from a population will be within 1.96 standard
errors of the population mean μ.
3. 99% of the sample means will lie within 2.58 standard errors of the population mean μ.
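These percentages can be checked by simulation. The sketch below is an added illustration, not part of the original notes; the uniform(0, 1) population, the sample size of 36, the number of trials, and the seed are all assumptions chosen for the demonstration.

```python
import math
import random

def coverage(z, n=36, trials=5000, seed=1):
    """Fraction of sample means within z standard errors of the population mean.

    Samples are drawn from a uniform(0, 1) population (an assumption for
    illustration), whose mean is 0.5 and whose variance is 1/12.
    """
    rng = random.Random(seed)
    mu = 0.5
    se = math.sqrt(1 / 12) / math.sqrt(n)   # sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar - mu) <= z * se:
            hits += 1
    return hits / trials

for z in (1.645, 1.96, 2.58):
    print(z, coverage(z))
```

The printed fractions should land close to 0.90, 0.95, and 0.99, even though the population itself is not normal.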

Example: Find an interval estimate of the population mean from a random sample of size 49
if the population standard deviation is 5 and the sample mean is 15. Use a 95% confidence
level.



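Solution: \(\sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{49}} = 0.7143\), and the given \(\bar{X} = 15\).

Since the level of significance is α = 5% (100 - 95), or 0.05, we have α/2 = 0.025. In the
standard normal distribution reference table, \(z_{\alpha/2}\) marks the endpoints that leave a
probability of 0.025 in each tail; thus, \(z_{0.025} = 1.96\).

So, the 95% confidence interval for the example is

\[ \bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}} = 15 \pm 1.96(0.7143) = 15 \pm 1.40 \]

So Confidence Interval = (13.60, 16.40).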
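The same calculation can be sketched in Python for the example with n = 49, σ = 5, and a sample mean of 15. This is an added illustration; the function name `z_interval` is an assumption, and the critical value comes from the standard library rather than a printed table.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    margin = z * sigma / math.sqrt(n)                    # z * standard error
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=15, sigma=5, n=49, confidence=0.95)
print(round(low, 2), round(high, 2))
```

Rounded to two decimals, the limits are 13.6 and 16.4.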

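The \((1-\alpha)\) confidence interval for the population mean, when the sample size is large
(\(n \ge 30\)), is:

\[ \bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

or the endpoints \(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\) and \(\bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\).

The maximum error of the estimate, E, with level of confidence \(1-\alpha\), is the error
associated with estimating the population mean from the sample mean and is given by the formula
below:

\[ E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

When both the mean and the standard deviation are estimated from the sample and \(n \ge 30\),
the confidence interval is \(\bar{X} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}\), where s is the sample
standard deviation.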

Example: Find the 99% confidence interval estimate of the true population mean income if a
sample of 100 families gives a sample mean of $28,500. From previous experience we know that
the population standard deviation is $5,000.
Solution: Using α = 1 - 0.99 = 0.01, the tail probability is α/2 = 0.01/2 = 0.005; thus,
\(z_{0.005} = 2.576 \approx 2.58\).

So our CI estimate is:

\[ 28{,}500 \pm 2.58\left(\frac{5{,}000}{\sqrt{100}}\right) = 28{,}500 \pm 1{,}290 = (27{,}210,\ 29{,}790) \]



B. Small Sample Case (n < 30) with Unknown Standard Deviation
In the case of a small sample size (n < 30) and an unknown population standard deviation, the
t distribution is applied:

\[ \bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} \]

where:
• \(1-\alpha\) is the confidence coefficient.
• \(t_{\alpha/2}\) is the t value providing an area of α/2 in the upper tail of a t
distribution with n - 1 degrees of freedom.
• s is the sample standard deviation.
• The population is assumed to have a normal probability distribution.
I. Using the t Distribution to Find Interval Estimates
The t distribution is commonly called Student's t distribution, or simply Student's distribution.
The t distribution is used for estimation whenever the sample size is less than 30 and
the population standard deviation is not known. Furthermore, in using the t distribution, we
assume that the population is normal or approximately normal.
II. Characteristics of the t Distribution
The t distribution shares some characteristics of the normal distribution and differs from it in
others. The t distribution is similar to the standard normal distribution in these ways:
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are equal to 0 and are located at the center of the distribution.
4. The curve never touches the x axis.
The t distribution differs from the standard normal distribution in the following ways:
1. The variance is greater than 1.
2. The t distribution is actually a family of curves based on the concept of degrees of freedom,
which is related to sample size.
3. As the sample size increases, the t distribution approaches the standard normal distribution.
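The first and third differences can be made concrete with the known variance of Student's t distribution, which is df / (df − 2) for more than 2 degrees of freedom. That formula is a standard result that does not appear in these notes, and the helper name `t_variance` is an assumption for illustration.

```python
def t_variance(df):
    """Variance of Student's t distribution; finite only for df > 2."""
    if df <= 2:
        raise ValueError("variance is finite only for df > 2")
    return df / (df - 2)

# The variance is always above 1 and approaches 1 as the degrees of freedom grow.
for df in (5, 20, 100, 1000):
    print(df, round(t_variance(df), 4))
```

As df increases the variance falls toward 1, the variance of the standard normal distribution.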



Figure: Normal distribution, t distribution for sample size n = 21, and t distribution for sample size n = 6.
III. Degrees of Freedom
Many statistical distributions use the concept of degrees of freedom, and the formulas for finding
the degrees of freedom vary for different statistical tests. The degrees of freedom are the number
of values that are free to vary after a sample statistic has been computed, and they tell the
researcher which specific curve to use when a distribution consists of a family of curves.

For example, if the mean of 5 values is 10, then 4 of the 5 values are free to vary. But once 4
values are selected, the fifth value must be a specific number to get a sum of 50, since 50/5 = 10.
Hence, the degrees of freedom are 5 - 1 = 4, and this value tells the researcher which t curve to
use.
Degree of freedom = n-1
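The "free to vary" idea in the example above can be sketched in a few lines of Python. This is an added illustration; the function name and the four freely chosen values are hypothetical.

```python
def last_value(free_values, mean, n):
    """Value forced on the n-th observation once n - 1 values and the mean are fixed."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values may be chosen freely")
    return mean * n - sum(free_values)   # required total minus what is already used

# Four arbitrarily chosen values (hypothetical); the mean of all five must be 10
print(last_value([8, 12, 9, 14], mean=10, n=5))
```

Whatever four values are chosen, the fifth is fixed by the required sum of 50, so only n − 1 = 4 values are free.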

Example: An agricultural chemical retail firm wants to estimate the average number of gallons
of weed killer sold per day for the purpose of accurately forecasting and controlling inventory.
Twelve business days were monitored, and average daily sales of 10 gallons were recorded. The
sample yielded a standard deviation of 2 gallons. Calculate the confidence limits at the 95% level.
Solution:

n = 12, \(\bar{X}\) = 10, s = 2, CL = 0.95
α = 1 - CL = 1 - 0.95 = 0.05
df = n - 1 = 12 - 1 = 11
α/2 = 0.05/2 = 0.025
\(t_{\alpha/2} = t_{0.025}\) at df = 11 is 2.201

Confidence interval:

\[ \bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 10 \pm 2.201\left(\frac{2}{\sqrt{12}}\right) = 10 \pm 2.201(0.57735) = 10 \pm 1.27 = (8.73,\ 11.27) \]
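The weed killer example can be reproduced with a short Python sketch. This is an added illustration; the function name `t_interval` is an assumption. Python's standard library has no t quantile function, so the critical value is passed in after being read from a t table.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean using a t critical value read from a table.

    t_crit must be looked up for n - 1 degrees of freedom; it is not computed here.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Weed killer example: n = 12, mean = 10, s = 2, t_{0.025} with df = 11 is 2.201
low, high = t_interval(xbar=10, s=2, n=12, t_crit=2.201)
print(round(low, 2), round(high, 2))
```

Passing the table value explicitly keeps the sketch dependency-free and mirrors the manual lookup step in the solution.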



Example: A drug company is testing a new drug which is supposed to reduce blood pressure.
From the six people who are used as subjects, it is found that the average drop in blood pressure
is 2.28 points, with a standard deviation of 0.95 points. What is the 95% confidence interval for
the mean change in pressure?
Solution:

\(\bar{X} = 2.28\), \(s = 0.95\), \(1 - \alpha = 0.95 \Rightarrow \alpha = 0.05\), \(\alpha/2 = 0.025\)

\(t_{\alpha/2} = 2.571\) with df = 5 from the table.
The required interval is

\[ \bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 2.28 \pm 2.571\left(\frac{0.95}{\sqrt{6}}\right) = 2.28 \pm 1.00 = (1.28,\ 3.28) \]

2.2.2. Interval Estimation for a Population Proportion


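Samples are often used to estimate a proportion of occurrences in a population. For example, the
government estimates by a sampling procedure the unemployment rate, or the proportion of
unemployed people, in the country's workforce.

Let us denote the proportion of successes in a sample by \(\bar{p}\), with \(\bar{q} = 1 - \bar{p}\).

The mean of the sampling distribution of the proportion is: \(\mu_{\bar{p}} = p\)

For the population: \(P \pm z\,\sigma_{\bar{p}}\)

For the sample: \(\bar{p} \pm z\,\hat{\sigma}_{\bar{p}}\)

The standard error of the proportion is: \(\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}}\), estimated from the
sample by \(\hat{\sigma}_{\bar{p}} = \sqrt{\frac{\bar{p}\bar{q}}{n}}\).

The confidence limits are:

\[ \bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}\,\bar{q}}{n}} \]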
Example: When a sample of 70 retail executives was surveyed regarding the poor performance
of the retail industry, 66% believed that decreased sales were due to unseasonably warm
temperatures, resulting in consumers’ delaying purchase of cold-weather items.
a) Estimate the standard error of the proportion of retail executives who blame warm weather
for low sales.
b) Find the upper and lower confidence limits for this proportion, given a 95% confidence level.



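Solution: n = 70 and \(\bar{p} = 0.66\), so \(\bar{q} = 0.34\).

a) \(\hat{\sigma}_{\bar{p}} = \sqrt{\frac{\bar{p}\,\bar{q}}{n}} = \sqrt{\frac{0.66(0.34)}{70}} = 0.0566\)

b) \(\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}\,\bar{q}}{n}} = 0.66 \pm 1.96(0.0566) = 0.66 \pm 0.111 = [0.549,\ 0.771]\).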
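For the retail executives example (n = 70, sample proportion 0.66), the standard error and confidence limits can be sketched in Python as follows. This is an added illustration; the function name `proportion_interval` is an assumption.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """Return (standard error, lower limit, upper limit) for a sample proportion."""
    q_hat = 1 - p_hat
    se = math.sqrt(p_hat * q_hat / n)                    # estimated standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    return se, p_hat - z * se, p_hat + z * se

se, low, high = proportion_interval(p_hat=0.66, n=70, confidence=0.95)
print(round(se, 4), round(low, 3), round(high, 3))
```

Both limits lie the same distance, about 0.111, from the sample proportion of 0.66.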

2.2.3. Determining Sample Size in Estimation


So far we have used the symbol n for the sample size instead of a specific number.
Now we need to know how large the sample should be. If it is too small, we may fail to achieve
the objective of our analysis. But if it is too large, we waste resources when we gather the
sample. Sampling error is controlled by selecting a sample that is adequate in size. In general, the
more precision we want, the larger the sample we will need to take.
The correct sample size depends on three factors. These factors are:
a) The desired level of confidence,
b) The margin of error the researcher will tolerate, and
c) The variability in the population being studied.
A. Sample Size for Estimating the Mean
The chosen confidence level fixes the level of significance, α, used in estimating the population
mean. The results of the previous section can be used to find the sample size n needed to
estimate the population mean with a given degree of certainty. The maximum margin of error, E,
is given by the formula:

\[ E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

Solving for n gives the sample size needed for a specified level of error E:

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 \ \text{(n is expected to be large, } n \ge 30\text{)} \qquad n = \left(\frac{t_{\alpha/2}\,s}{E}\right)^2 \ \text{(n is expected to be small, } n < 30\text{)} \]

Example: The average price of gasoline in a particular state is expected to be $1.45 per gallon,
with a standard deviation of $0.10 per gallon. It is believed that the mean price per
gallon has changed. How many gas stations should be sampled in order to estimate the
state's new mean price with a maximum error of the estimate of $0.01 and a 90% level of
confidence?
Solution: α = 0.10, so α/2 = 0.05. From the reference table, \(z_{0.05} = 1.65\); σ = 0.10 and E = 0.01. So,

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{1.65 \times 0.10}{0.01}\right)^2 = 272.25 \]

So n = 273 (round up to the next integer).
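The sample-size formula for a mean, including the rounding-up step, can be sketched in Python. This is an added illustration; the function name `sample_size_mean` is an assumption, and the critical value (z or t) is supplied by the caller from a table.

```python
import math

def sample_size_mean(crit, sd, error):
    """Smallest n with crit * sd / sqrt(n) <= error, i.e. ceil((crit * sd / error)^2)."""
    return math.ceil((crit * sd / error) ** 2)

# Gasoline example: z = 1.65, sigma = 0.10, E = 0.01
print(sample_size_mean(crit=1.65, sd=0.10, error=0.01))
```

`math.ceil` is used instead of ordinary rounding because rounding down would leave the margin of error slightly larger than E.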

Example: We wish to know the average thickness of washers in a shipment. We are willing to
take a risk of 5 times in 100 that the error in our estimate will be 0.002 inch (E) or more. From a
sample of another lot we estimate the standard deviation to be 0.00359 inch with 9 degrees of
freedom. What sample size is needed?
Solution: α = 5/100 = 0.05, so 1 - α = 0.95.
From the reference table, with α/2 = 0.025 and df = 9, t = 2.262; E = 0.002. Then, using
the formula,

\[ n = \left(\frac{t_{\alpha/2}\,s}{E}\right)^2 = \left(\frac{2.262 \times 0.00359}{0.002}\right)^2 = 16.5 \]

So, n = 17 (round up to the next integer).

B. Sample Size for Estimating the Proportion


The procedures for determining sample sizes for estimating a population proportion are similar
to those for estimating a population mean. The sample size needed is determined by the
formula:

\[ n = \bar{p}\,\bar{q}\left(\frac{z_{\alpha/2}}{E}\right)^2, \quad \bar{q} = 1 - \bar{p} \]

Example: Suppose the Prime Minister of a country wants an estimate of the proportion of
'Kebele' administrators who support the country's current economic policy. The Prime Minister
wants the estimate to be within ±0.04 of the true proportion at a 95% level of confidence. The
secretary of the office of the Prime Minister estimated the proportion supporting the current
policy to be 0.60. What sample size is required?
Solution: 1 − α = 0.95, α = 0.05, α/2 = 0.025. The table area between 0 and z is
0.5 − 0.025 = 0.4750, which gives z = 1.96. Also E = 0.04, \(\bar{p}\) = 0.60, and \(\bar{q}\) = 0.40. So,

\[ n = \bar{p}\,\bar{q}\left(\frac{z_{\alpha/2}}{E}\right)^2 = (0.60)(0.40)\left(\frac{1.96}{0.04}\right)^2 = 576.24 \]

which rounds up to 577. Therefore, a sample of 577 'Kebele' administrators is required.
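The sample-size formula for a proportion can be sketched in Python in the same way. This is an added illustration; the function name `sample_size_proportion` is an assumption.

```python
import math

def sample_size_proportion(p, z, error):
    """Smallest n for estimating a proportion within +/- error: ceil(p * q * (z / E)^2)."""
    q = 1 - p
    return math.ceil(p * q * (z / error) ** 2)

# Kebele administrators example: prior estimate 0.60, z = 1.96, E = 0.04
print(sample_size_proportion(p=0.60, z=1.96, error=0.04))

# With no prior estimate, p = 0.5 maximizes p * q and gives the most conservative n
print(sample_size_proportion(p=0.5, z=1.96, error=0.04))
```

Since the product pq is largest at p = 0.5, using 0.5 when nothing is known about the proportion guarantees a sample that is large enough.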



Exercises
1. The operations manager of a certain tele-center is in the process of developing an
operation plan. For that purpose, he takes a random sample of 60 calls from the company
records and finds that the mean sample length for a call is 4.26 minutes. Past history for
these types of calls has shown that the population standard deviation for call length is
about 1.1 minutes. Assuming that the population is normally distributed and he wants to
have 95% confidence, help him in estimating the population mean.
2. A survey conducted by a CSA found that the sample mean age of men was 44 years and
the sample mean age of women was 47 years. Altogether, 454 people from Oromia were
included in the reader poll –340 women and 114 men. Assume that the population
standard deviation of age for both men and women is 8 years. Develop a 99% confidence
interval estimate for the mean age of the population men.
3. Suppose that a survey is being conducted in a company that has 800 workers. A random
sample of 50 of these workers reveals that the average sample age is 34.3 years, and the
sample standard deviation is 8 years. Assuming normality, construct a 98% confidence
interval to estimate the average age of all workers in this company.
4. A recent study showed that the modern working person experiences an average of 2.1
hours per day of distractions (phone calls, e-mails, impromptu visits, etc.). A random
sample of 50 workers for a large corporation found that these workers were distracted an
average of 1.8 hours per day and the population standard deviation was 20 minutes.
Estimate the true mean population distraction time with 90% confidence, and compare
your answer to the results of the study.
5. A survey of 30 emergency room patients found that the average waiting time for
treatment was 174.3 minutes. Assuming that the population standard deviation is 46.5
minutes, find the best point estimate of the population mean and the 99% confidence
interval of the population mean.
6. Ten randomly selected people were asked how long they slept at night. The mean time
was 7.1 hours, and the standard deviation was 0.78 hour. Find the 95% confidence
interval of the mean time. Assume the variable is normally distributed.



7. If a random sample of 27 items produces a mean of 128.4 and standard deviation of 20.6,
what is the 98% confidence interval for μ? Assume that x is normally distributed for the
population. What is the point estimate?
8. A meteorologist who sampled 13 thunderstorms found that the average speed at which
they traveled across a certain state was 15 miles per hour. The standard deviation of the
sample was 1.7 miles per hour. Find the 99% confidence interval of the mean. If a
meteorologist wanted to use the highest speed to predict the times it would take storms to
travel across the state in order to issue warnings, what figure would she likely use?
(Answer: 13.6 < μ < 16.4; 16.4 miles per hour)
9. A recent study of 28 employees of XYZ Company showed that the mean of the distance
they traveled to work was 14.3 miles. The standard deviation of the sample was 2
miles. Find the 95% confidence interval of the true mean. If a manager wanted to be sure
that most of his employees would not be late, how much time would he suggest they
allow for the commute if the average speed were 30 miles per hour?
10. For a group of 22 college football players, the mean heart rate after a morning workout
session was 86 beats per minute, and the standard deviation was 5. Find the 90%
confidence interval of the true mean for all college football players after a workout
session.
11. The national average for the number of students per teacher for all U.S. public schools is
15.9. A random sample of 12 school districts from a moderately populated area showed
that the mean number of students per teacher was 19.2 with a variance of 4.41. Estimate
the true mean number of students per teacher with 95% confidence. How does your
estimate compare with the national average?
12. A gasoline service station shows a standard deviation of Birr 6.25 for the charges made
by the credit card customers. Assume that the station's management would like to
estimate the population mean gasoline bill for its credit card customers to be within an
error of Birr 1.00. For a 95% confidence level, how large a sample would be necessary?
13. A random sample of shoppers at a convenience store is selected to see how much they
spent on that visit. The standard deviation of the population is $6.43. How large a sample
must be selected if the researcher wants to be 99% confident of finding whether the true
mean differs from the sample mean by $1.50?



14. A random sample of 205 college students were asked if they believed that places could be
haunted, and 65 responded yes. Estimate the true proportion of college students who
believe in the possibility of haunted places with 99% confidence. According to Time
magazine, 37% of Americans believe that places can be haunted.
15. A survey conducted by Sallie Mae and Gallup of 1404 respondents found that 323
students paid for their education by student loans. Find the 90% confidence interval of the
true proportion of students who paid for their education by student loans.
16. The national average for the percentage of high school graduates taking the SAT is 49%,
but the state averages vary from a low of 4% to a high of 92%. A random sample of 300
graduating high school seniors was polled across a particular tristate area, and it was
found that 195 had taken the SAT. Estimate the true proportion of high school graduates
in this region who take the SAT with 95% confidence.
17. A researcher wishes to estimate, with 95% confidence, the proportion of people who own
a home computer. A previous study shows that 40% of those interviewed had a computer
at home. The researcher wishes to be accurate within 2% of the true proportion. Find the
minimum sample size necessary.
18. It is believed that 25% of U.S. homes have a direct satellite television receiver. How large
a sample is necessary to estimate the true proportion of homes that do with 95%
confidence and within 3 percentage points? How large a sample is necessary if nothing is
known about the proportion?
19. America’s young people are heavy Internet users; 87% of Americans ages 12 to 17 are
Internet users (The Cincinnati Enquirer, February 7, 2006). MySpace was voted the most
popular website by 9% in a sample survey of Internet users in this age group. Suppose
1400 youths participated in the survey. What is the margin of error, and what is the
interval estimate of the population proportion for which MySpace is the most popular
website? Use a 95% confidence level.
