Business Statistics Lecture Notes Chapter II
Statistical Estimation
2.1. Basic Concepts
Statistical inference is the process of making judgments about a population based on the
properties of a sample. An important aspect of statistical inference is using estimates to
approximate the value of an unknown population parameter. This chapter studies different kinds
of estimators and lays the foundations for making statistical inferences about the population
mean and proportion.
[Figure: Population → Sample → Numerical data → Analyzed data → Inference about the population]
Definitions
Confidence Interval: an interval estimate with a specific level of confidence.
Confidence Level: the probability that the interval estimate will contain the parameter.
Consistent Estimator: an estimator whose value gets closer to the value of the parameter as the
sample size increases.
Degrees of Freedom: the number of data values that are free to vary once a statistic has
been determined.
Estimator: a sample statistic that is used to estimate a population parameter. A good estimator
should be unbiased, consistent, and relatively efficient.
Estimate: a specific value that an estimator takes for a given sample.
Interval Estimate: a range of values used to estimate a parameter.
Point Estimate: a single value used to estimate a parameter.
2.1.1. Estimator and Estimates
Any sample statistic used to estimate or measure a population parameter is called an
estimator. The sample
We can make two types of estimates about a population: a point estimate and an interval
estimate.
A. Point estimate: a single number, computed from sample data, that is used to estimate an
unknown population parameter. The sample mean, \(\bar{X}\), is a point estimator for the
population mean, \(\mu\); and
the standard error, \(\sqrt{\dfrac{N-n}{N}}\).
where \(1-\alpha\) is the confidence coefficient, and \(z_{\alpha/2}\) is the z value
providing an area of \(\alpha/2\) in the upper tail of the standard normal distribution.
A review of the normal distribution will illustrate the probability in terms of the interval estimate
around the mean.
The z-score of the standard normal distribution is used to determine the interval endpoints
that correspond to the degree of certainty (the confidence level) one wishes to use for the
interval estimator.
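An interval estimator for the mean is given by the following formula:
\[ \bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]
or
\[ \bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]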
For reasonably large samples, the central limit theorem implies the following:
1. 90% of the sample means selected from a population will be within 1.645 standard
errors (\(\sigma/\sqrt{n}\)) of the population mean μ.
2. 95% of the sample means selected from a population will be within 1.96 standard
errors of the population mean μ.
3. 99% of the sample means selected from a population will be within 2.58 standard
errors of the population mean μ.
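The multipliers in the list above can be checked numerically. The following is a minimal sketch that assumes Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return the z value that leaves alpha/2 in the upper tail,
    where alpha = 1 - confidence (hypothetical helper)."""
    alpha = 1.0 - confidence
    # inv_cdf gives the point below which the stated probability lies
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

Printed tables usually round these values to two or three decimal places, which is why slightly different multipliers appear in different textbooks.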
Example: Find an interval estimate of the population mean from a random sample of size 49
if the population standard deviation is 5 and the sample mean is 15. Use a 95% confidence
level for the true population mean.
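Solution: The level of significance is \(\alpha = 1 - 0.95 = 0.05\), so \(\alpha/2 = 0.025\).
From the standard normal distribution reference table, the z value that leaves a probability
of 0.025 in the upper tail is \(z_{0.025} = 1.96\). Therefore,
\[ \bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 15 \pm 1.96\left(\frac{5}{\sqrt{49}}\right) = 15 \pm 1.4 \]
or the endpoints \(13.6 \le \mu \le 16.4\).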
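The calculation for this example (sample mean 15, population standard deviation 5, sample size 49, 95% confidence) can be reproduced with a short script. This is only a sketch under the assumption that the population standard deviation is known; the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z for alpha/2 in the upper tail
    margin = z * sigma / sqrt(n)                        # z times the standard error
    return sample_mean - margin, sample_mean + margin


low, high = z_interval(15, 5, 49, 0.95)
print(f"({low:.2f}, {high:.2f})")
```

The same function applies to any example in which the population standard deviation is given.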
The maximum error of the estimate, E, with level of confidence \(1-\alpha\), is the error
associated with estimating the population mean from the sample mean. It is given by the
formula below:
\[ E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]
or the confidence interval, when both the mean and the standard deviations
Example: Find the 99% confidence interval estimate of the true population mean income if a
sample of 100 families gives a sample mean of $28,500. From previous experience we know that
the population standard deviation is $5,000.
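Solution: Using \(\alpha = 1 - 0.99 = 0.01\), the tail probability is \(\alpha/2 = 0.01/2 = 0.005\);
thus, \(z_{0.005} = 2.58\). The interval estimate is
\[ 28{,}500 \pm 2.58\left(\frac{5{,}000}{\sqrt{100}}\right) = 28{,}500 \pm 1{,}290 \]
that is, from 27,210 to 29,790 dollars.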
For example, if the mean of 5 values is 10, then 4 of the 5 values are free to vary. But once 4
values are selected, the fifth value must be a specific number to get a sum of 50, since 50/5 = 10.
Hence, the degrees of freedom are 5 - 1 = 4, and this value tells the researcher which t curve to
use.
Degrees of freedom = n − 1
Example: An agricultural chemical retail firm wants to estimate the average number of gallons
of weed killer sold per day for the purpose of accurately forecasting and controlling inventory.
Twelve business days were monitored, and average daily sales of 10 gallons were recorded. The
sample yielded a standard deviation of 2 gallons. Calculate the confidence limits at the 95% level.
Solution:
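n = 12, \(\bar{X} = 10\), s = 2, CL = 0.95
\(\alpha = 1 - \text{CL} = 1 - 0.95 = 0.05\)
df = n − 1 = 12 − 1 = 11
\(\alpha/2 = 0.05/2 = 0.025\)
\(t_{\alpha/2} = t_{0.025}\) at df = 11 is 2.201
Confidence interval:
\[ \bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 10 \pm 2.201\left(\frac{2}{\sqrt{12}}\right) = 10 \pm 2.201(0.57735) = 10 \pm 1.27 = (8.73,\ 11.27) \]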
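The weed killer calculation can be sketched in code as follows. Python's standard library has no t distribution, so this sketch assumes the critical value (2.201 for 11 degrees of freedom) is read from a t table and passed in; the function name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(sample_mean, s, n, t_crit):
    """Confidence interval for a mean when sigma is unknown.

    t_crit is the table value of t for alpha/2 with n - 1 degrees of
    freedom; it is supplied by the caller (hypothetical helper).
    """
    margin = t_crit * s / sqrt(n)  # t times the estimated standard error
    return sample_mean - margin, sample_mean + margin


low, high = t_interval(10, 2, 12, 2.201)
print(f"({low:.2f}, {high:.2f})")
```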
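\(\bar{X} = 2.28\), \(S = 0.95\), \(1 - \alpha = 0.95\), \(\alpha = 0.05\), \(\alpha/2 = 0.025\)
\(t_{\alpha/2} = 2.571\) with df = 5, from the table.
The required interval will be
\[ \bar{X} \pm t_{\alpha/2}\,\frac{S}{\sqrt{n}} = 2.28 \pm 2.571\left(\frac{0.95}{\sqrt{6}}\right) = 2.28 \pm 0.997 = (1.28,\ 3.28) \]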
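For the population: \(P \pm z\,\sigma_{p}\)
For the sample: \(\bar{p} \pm z\,\sigma_{\bar{p}}\)
Standard error of the proportion: \(\sigma_{\bar{p}} = \sqrt{\dfrac{pq}{n}}\), where \(q = 1 - p\).
Confidence limits: \(\bar{p} \pm z\sqrt{\dfrac{pq}{n}}\)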
Example: When a sample of 70 retail executives was surveyed regarding the poor performance
of the retail industry, 66% believed that decreased sales were due to unseasonably warm
temperatures, resulting in consumers’ delaying purchase of cold-weather items.
a) Estimate the standard error of the proportion of retail executives who blame warm weather
for low sales.
b) Find the upper and lower confidence limits for this proportion, given a 95% confidence level.
a) \(\sigma_{\bar{p}} = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{0.66(0.34)}{70}} = 0.0566\)
b) \(\bar{p} \pm z\sqrt{\dfrac{pq}{n}} = 0.66 \pm 1.96(0.0566) = 0.66 \pm 0.111 = [0.549,\ 0.771]\)
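Both parts of the retail executives example (sample proportion 0.66, sample size 70, 95% confidence) can be verified with a few lines of code. This is a minimal sketch that assumes the normal approximation to the sampling distribution of the proportion; the function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence):
    """Standard error and confidence limits for a proportion (hypothetical helper)."""
    q_hat = 1 - p_hat
    se = sqrt(p_hat * q_hat / n)                        # standard error of the proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return se, p_hat - z * se, p_hat + z * se


se, low, high = proportion_interval(0.66, 70, 0.95)
print(f"SE = {se:.4f}, limits = ({low:.3f}, {high:.3f})")
```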
Solving the maximum-error formula \(E = z_{\alpha/2}\,\sigma/\sqrt{n}\) for n gives the sample
size needed for a chosen level of error, E:
\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^{2} \]
Example: An average price for gasoline is expected to be $1.45 per gallon, if the standard
deviation for a specific National State is $0.10 per gallon. It is believed that the mean price per
gallon has changed. How many samples (gas stations) should be studied so as to estimate the
Example: We wish to know the average thickness of washers in a shipment. We are willing to
take a risk of 5 times in 100 that the error in our estimate will be 0.002 inch (E) or more. From a
sample of another lot we estimate the standard deviation is 0.00359 with 9 degrees of freedom.
Solution: \(\alpha = 5/100 = 0.05\) (5%), so \(1 - \alpha = 0.95\).
From the reference table, with \(\alpha/2 = 0.025\) and degrees of freedom df = 9, t = 2.262;
E = 0.002. Then use the sample size formula, with t in place of z and s in place of σ.
Example: Suppose the Prime Minister of a country wants an estimate for the proportion of the
‘Kebele’ administrators who support the country’s current economic policy. The Prime Minister
wants the estimate to be within ±0.04 of the true proportion and a 95% level of confidence. The
secretary of the office of the Prime Minister estimated the proportion supporting the current
policy to be 0.60. What sample size is required?
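Solution: \(1 - \alpha = 0.95\), \(\alpha = 0.05\), \(\alpha/2 = 0.025\), and 0.5 − 0.025 = 0.4750; the z
value for an area of 0.4750 is 1.96. Also E = 0.04, \(\bar{p} = 0.60\), and \(\bar{q} = 0.40\). So,
\[ n = \bar{p}\,\bar{q}\left[\frac{z_{\alpha/2}}{E}\right]^{2} = (0.60)(0.40)\left[\frac{1.96}{0.04}\right]^{2} = 576.24 \approx 577 \]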
Therefore, a sample of 577 ‘Kebele’ administrators is required to estimate the proportion who
support the country’s current economic policy.
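The sample size for a proportion can be computed in the same way. The sketch below uses the values from the Prime Minister example (planning proportion 0.60, maximum error 0.04, z = 1.96); the function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil


def sample_size_for_proportion(p, max_error, z):
    """Sample size n = p * q * (z / E)^2, rounded up (hypothetical helper)."""
    q = 1 - p
    return ceil(p * q * (z / max_error) ** 2)


n = sample_size_for_proportion(0.60, 0.04, 1.96)
print(n)  # 0.24 * 49^2 = 576.24, rounded up to 577
```

When no prior estimate of the proportion is available, p = 0.5 is the conservative choice because it maximizes the product pq and therefore the required sample size.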