Chapter 4 - BUSINESS STATISTICS

Uploaded by

mesele

CHAPTER FOUR: ESTIMATION

INTRODUCTION
A marketing manager in an organization needs to estimate the likely market share his company
can achieve in the marketplace. A quality assurance manager may be interested in estimating
the proportion of defective finished products before shipment to the customer. The manager of
the credit department needs to estimate the average collection period for collecting dues from
the customers. In many cases the values of population parameters are unknown. If parameters
are unknown, it is generally not sufficient to make some convenient assumption about their
values; rather, those unknown parameters should be estimated. One aspect of inferential
statistics is estimation, which is the process of estimating the value of a parameter from
information obtained from a sample.

Types of Estimates

Point estimate
An estimator is a sample statistic used to estimate the population parameter. An estimate is a
specific value of that statistic. A point estimate is a single number or value derived from the
sample and used to estimate a population parameter. A random sample of observations is taken
from the population of interest and the observed values are used to obtain a point estimate of
the relevant parameter.
a. The sample mean, $\bar{x}$, is a point estimate of the population mean, $\mu$.

b. The sample proportion $p$ is a good estimator of the population proportion $P$. The population
proportion $P$ is equal to the number of elements in the population belonging to the category of
interest divided by the total number of elements in the population:
$P = \frac{X}{N}$
where $X$ is the number of successes in the population and $N$ is the population size.
The sample proportion is
$p = \frac{x}{n}$
where $x$ is the number of successes in the sample and $n$ is the sample size. In general:
The statistic $\bar{x}$ estimates $\mu$
$s$ estimates $\sigma$
$s^2$ estimates $\sigma^2$
$p$ estimates $P$

The properties of good estimators


A. The estimator should be an unbiased estimator. That is, the expected value or the mean
of the estimates obtained from samples of a given size is equal to the parameter being
estimated. Since $E(\bar{x}) = \mu$, the sample mean, $\bar{x}$, is an unbiased estimator of the
population mean. Any systematic deviation of the estimator away from the parameter of
interest is called bias.
B. The estimator should be consistent. For a consistent estimator, as sample size increases,
the value of the estimator approaches the value of the parameter estimated. The sample
mean is a consistent estimator of $\mu$. This is so because the standard deviation of $\bar{x}$ is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. As the sample size $n$ increases, the standard deviation of $\bar{x}$ decreases and hence
the probability that $\bar{x}$ will be close to its expected value, $\mu$, increases.
C. The estimator should be a relatively efficient estimator. That is, of all the statistics that
can be used to estimate a parameter, the relatively efficient estimator has the smallest
variance or standard deviation.
D. An estimator is said to be sufficient if it contains all the information in the data about the
parameter it estimates. The sample mean is a sufficient estimator of $\mu$. Other estimators like
the median and mode do not consider all values, but the mean considers all values (added
and divided by the sample size).
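
The consistency property in B can be illustrated numerically. The following is a minimal sketch, not part of the original chapter: the function name `standard_error` and the value σ = 10 are assumptions chosen only for illustration. It evaluates σ/√n for growing sample sizes to show how the spread of the sample mean shrinks.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean for samples of size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation, used only for illustration.
sigma = 10.0
for n in (25, 100, 400):
    # Quadrupling the sample size halves the standard error.
    print(n, standard_error(sigma, n))
```

Each fourfold increase in n halves the standard error, which is why larger samples give estimates that cluster more tightly around the parameter.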

Interval Estimates
An interval estimate of a parameter is an interval or a range of values used to estimate the
parameter. In an interval estimate, the parameter is specified as being between two values.
The interval within which a population parameter is expected to lie is usually referred to as the
confidence interval. A degree of confidence (usually a percent) can be assigned before an
interval estimate is made. The confidence level of an interval estimate of a parameter is the
probability that the interval estimate will contain the parameter, assuming that a large number
of samples are selected and that the estimation process on the same parameter is repeated.
Three confidence levels are used extensively:
1. 90% confidence interval
2. 95% confidence interval and
3. 99% confidence interval
A 90% confidence interval means that about 90% of the similarly constructed intervals will
contain the parameter being estimated. A 95% confidence interval means that about 95% of
the similarly constructed intervals will contain the parameter being estimated. If we use the
99% confidence interval we expect about 99% of the intervals to contain the parameter being
estimated.
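
The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 50 and a known standard deviation of 6, and the function name `coverage_rate` is illustrative rather than anything from the chapter. It builds many 95% intervals and counts how often they contain the true mean.

```python
import math
import random
from statistics import NormalDist


def coverage_rate(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials


# Assumed population values, chosen only for the demonstration.
rate = coverage_rate(mu=50.0, sigma=6.0, n=30, confidence=0.95, trials=2000)
print(f"share of intervals containing the mean: {rate:.3f}")
```

The printed share should be close to, but not exactly, 0.95, which matches the "about 95% of the similarly constructed intervals" wording.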

Another interpretation of the 95% confidence interval is that 95% of the sample means for a
specified sample size will lie within 1.96 standard errors (standard deviations of the sample
mean) of the hypothesized population mean. For 99%, the sample means will lie within 2.58
standard errors of the hypothesized population mean.

Where do the values 1.96 and 2.58 come from?

The middle 95% of the sample means lie equally on either side of the mean. Logically,
0.95/2 = 0.4750, so 47.5% of the area is to the right of the mean and 47.5% of the area is to the
left of the mean. The Z value for this probability is 1.96. The Z to the right of the mean is
+1.96 and the Z to the left is −1.96.

Confidence level (c)    α = 1 − c    α/2      Z(α/2)
90%                     0.10         0.05     1.645
95%                     0.05         0.025    1.96
99%                     0.01         0.005    2.58

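
The Z values in the table can be reproduced from the standard normal distribution. The sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module; the helper name `z_critical` is an assumption introduced here for illustration.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value Z(alpha/2) for a given confidence level."""
    alpha = 1 - confidence
    # The upper tail area is alpha/2, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z = {z_critical(level):.3f}")
```

The 99% value comes out as roughly 2.576, which the table rounds to 2.58.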
Constructing Confidence Interval
Recall the Central Limit Theorem, which applies to the sampling distribution of the mean of a
sample. Consider samples of size $n$ drawn, with replacement and order important, from a
population whose mean is $\mu$ and standard deviation is $\sigma$. The population can have any
frequency distribution. The sampling distribution of $\bar{X}$ will have a mean $\mu_{\bar{x}} = \mu$ and a standard
deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, and approaches a normal distribution as $n$ gets large. This allows us to
use the normal distribution curve for computing confidence intervals.
$\Rightarrow Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ has (approximately) a normal distribution with mean = 0 and variance = 1

$\Rightarrow \mu = \bar{X} \pm Z \frac{\sigma}{\sqrt{n}} = \bar{X} \pm \varepsilon$, where $\varepsilon$ is a measure of error.
$\Rightarrow \varepsilon = Z \frac{\sigma}{\sqrt{n}}$
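
The interval formula above translates directly into a few lines of code. This is a minimal sketch assuming the population standard deviation is known; the function name `z_confidence_interval` is hypothetical, and the numbers in the usage lines (mean 54, σ = 6.0, n = 50) are the ones used in the chapter's first worked example.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) limits of x-bar +/- Z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # the error term epsilon
    return sample_mean - margin, sample_mean + margin


lower, upper = z_confidence_interval(sample_mean=54, sigma=6.0, n=50)
print(f"{lower:.1f} to {upper:.1f} days")
```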

Case 1 when sample size is large and population standard deviation is known

Example: 1 A researcher wishes to estimate the number of days it takes an automobile dealer
to sell a Chevrolet Aveo. A sample of 50 cars had a mean time on the dealer’s lot of 54 days.
Assume the population standard deviation to be 6.0 days. Find the best point estimate of the
population mean and the 95% confidence interval of the population mean.
Solution
The best point estimate of the mean is 54 days.

For the 95% confidence interval use $z = 1.96$:
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 54 \pm 1.96 \left( \frac{6.0}{\sqrt{50}} \right) = 54 \pm 1.66$, that is, from 52.3 to 55.7.

Hence one can say with 95% confidence that the interval between 52.3 and 55.7 days does
contain the population mean, based on a sample of 50 automobiles.

Example: 2 A survey of 30 emergency room patients found that the average waiting time for
treatment was 174.3 minutes. Assuming that the population standard deviation is 46.5 minutes,
find the best point estimate of the population mean and the 99% confidence interval of the
population mean.
Solution
The best point estimate is 174.3 minutes.
The 99% confidence interval is
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 174.3 \pm 2.58 \left( \frac{46.5}{\sqrt{30}} \right) = 174.3 \pm 21.9$, that is, from 152.4 to 196.2.
Hence, one can be 99% confident that the mean waiting time for emergency room treatment is
between 152.4 and 196.2 minutes.
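Example: 3 A machine produces components. A random sample of 64 parts is selected from the
output, and this sample has a mean length of 90 cm. The customer will reject the part if it is
either less than 88 cm or more than 92 cm. Assume the population standard deviation is 1.6 cm.
Does the 95% confidence interval for the true mean length of all the components produced
ensure acceptance by the customer?

Solution:
To answer the question of acceptance by the customer, you should first work out the 95%
confidence interval for the population mean $\mu$ (here $\mu$ is the mean length of the components in
the population). The formula for the confidence interval is
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 90 \pm 1.96 \left( \frac{1.6}{\sqrt{64}} \right) = 90 \pm 0.392$, that is, from 89.608 cm to 90.392 cm.
This interval lies entirely within the customer's limits of 88 cm and 92 cm, so on the basis of
the mean length the output would be accepted.
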
Case 2 - when n is large (n ≥ 30) and population standard deviation is unknown
If the sample size is at least 30, the sample standard deviation can substitute for the population
standard deviation and the results are deemed satisfactory. If the population standard deviation
is not known, the standard deviation of the sample, $s$, is used to approximate the population
standard deviation, so that the standard error of the mean is estimated by
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$
This indicates that the error in estimating the population mean
decreases as the sample size increases. The 95% and 99% confidence intervals are constructed
as follows:
95% confidence interval: $\bar{x} \pm 1.96 \frac{s}{\sqrt{n}}$
99% confidence interval: $\bar{x} \pm 2.58 \frac{s}{\sqrt{n}}$
1.96 and 2.58 are the Z values corresponding to the middle 95% and 99% of the observations
respectively. In general, a confidence interval for the mean is computed by
$\bar{x} \pm Z \frac{s}{\sqrt{n}}$
where $Z$ reflects the selected level of confidence.

Example. An experiment involves selecting a random sample of 256 middle managers for
studying their annual income. The sample mean is computed to be Br. 35,420 and the sample
standard deviation is Br. 2,050.
a. What is the point estimate of the population mean
b. What is the 95% confidence interval of the mean (rounded to the nearest 10)
c. What are the 95% confidence upper and lower limits?
d. Interpret the finding.

Solution
a. The sample mean is 35,420, so this is the point estimate of the population mean $\mu$. It is
estimated from the sample mean.
b. The confidence interval is between 35,170 and 35,670, found by
$\bar{x} \pm 1.96 \frac{s}{\sqrt{n}} = 35420 \pm 1.96 \left( \frac{2050}{\sqrt{256}} \right) = 35420 \pm 251.125$, that is, from 35,168.875 to 35,671.125.
c. The end points of the confidence interval are called the confidence limits. In this case
they are rounded to 35,170 and 35,670. 35,170 is the lower limit and 35,670 is the upper
limit.
d. Interpretation
If we select 100 samples of size 256 from the population of all middle managers and compute
the sample means and confidence intervals, the population mean annual income would be
found in about 95 out of the 100 confidence intervals. About 5 out of the 100 confidence
intervals would not contain the population mean annual income.

Check Your Progress

A research firm conducted a survey to determine the mean amount smokers spend on cigarettes
during a week. A sample of 49 smokers revealed that the sample mean is Br. 20 with a standard
deviation of Br. 5. Construct a 95% confidence interval for the mean amount spent.

Case 3-Confidence interval for small sample (n <30) (Student t Distribution)

If the sample size is less than 30 and population standard deviation is unknown, the standard
normal distribution, Z, is not appropriate. The student’s t or the t distribution is used.

Characteristics of the Student’s t Distribution


Assuming that the population of interest is normal or approximately normal, the following are
the characteristics of the t distribution
1. It is a continuous distribution
2. It is bell-shaped and symmetrical
3. There is not one t distribution, but rather a 'family' of t distributions. All have the same
mean of zero but their standard deviations differ according to the sample size, n. The t
distribution differs for different sample sizes.
4. It is more spread out and flatter at the center than is the Z. However as the sample size
increases the curve representing t distribution approaches the Z distribution.
For a given confidence level, say 95%, the t value is greater than the Z value. This is so
because there is more variability in sample means computed from smaller samples, so the
interval must be wider to achieve the same level of confidence. t values are found by referring
to the appropriate degrees of freedom in the t table. Degrees of freedom refers to the number
of values in a sample that are free to vary once the sample mean is fixed.
Degrees of freedom (df) = n − 1, where n is the sample size.

Computing t value

The t variable representing the Student's t distribution is defined as
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
where $\bar{x}$ is the sample mean of $n$ measurements, $\mu$ is the population mean
and $s$ is the sample standard deviation.
Note that $t$ is just like $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
except that we replace $\sigma$ with $s$. Unlike our methods for large
samples, $\sigma$ cannot be approximated by $s$ when the sample size is less than 30, and we cannot
use the normal distribution. The table for the t distribution is constructed for selected levels of
confidence for degrees of freedom up to 30. To use the table we need to know two numbers:
the tail area (based on 1 minus the confidence level selected, with $\alpha/2$ in each tail for a
two-sided interval) and the degrees of freedom.

(1 − confidence level selected) is $\alpha$, the Greek letter alpha. This is the error we commit in
estimating.
The Concept of Degrees of Freedom

Example 1. A traffic department in a town is planning to determine the mean number of accidents
at a high-risk intersection. Only a random sample of 10 days' measurements was obtained.
The numbers of accidents per day were
8, 7, 10, 15, 11, 6, 8, 5, 13, 12
Construct a 95% confidence interval for the mean number of accidents per day.
a) Compute $\bar{x}$ and $s$
$\bar{x} = \frac{95}{10} = 9.5$ per day
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{94.5}{9}} = 3.24$ per day
The confidence level is 95%, so
$\alpha = 1 - 0.95 = 0.05$
$\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$
The degrees of freedom are df = n − 1 = 10 − 1 = 9. From the t table, $t_{0.025,\,9} = 2.262$.
The confidence interval is
$\bar{x} \pm t_{0.025,\,9} \frac{s}{\sqrt{n}}$
$9.5 \pm (2.262) \frac{3.24}{\sqrt{10}}$
$9.5 \pm 2.3$
7.2 to 11.8
With 95% confidence the mean number of accidents at this particular intersection is between 7.2
and 11.8.
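
The same calculation can be sketched in code. To stay within the standard library, the critical value is passed in as read from a t table (2.262 for 9 degrees of freedom at a tail area of 0.025) rather than computed; the function name `t_confidence_interval` is an assumption made for this illustration.

```python
import math
from statistics import mean, stdev


def t_confidence_interval(data, t_critical):
    """Return (lower, upper) limits of x-bar +/- t * s / sqrt(n).

    t_critical must be looked up in a t table for n - 1 degrees
    of freedom and the chosen tail area.
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation, divisor n - 1
    margin = t_critical * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


accidents = [8, 7, 10, 15, 11, 6, 8, 5, 13, 12]
lower, upper = t_confidence_interval(accidents, t_critical=2.262)
print(f"{lower:.1f} to {upper:.1f} accidents per day")
```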
Example 2
A tire manufacturer wishes to investigate the tread life of its tires. A sample of 10 tires
driven 50,000 miles revealed a sample mean of 0.32 inch of tread remaining with a standard
deviation of 0.09 inch.
1. Construct a 95 percent confidence interval for the population mean.

Given in the problem:
$n = 10$
$\bar{x} = 0.32$
$s = 0.09$
Compute the confidence interval using the t distribution (since $\sigma$ is unknown):
$\bar{X} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} = 0.32 \pm 2.262 \left( \frac{0.09}{\sqrt{10}} \right) = 0.32 \pm 0.064$, that is, from about 0.256 to 0.384 inch.

Check Your Progress
A quality controller of a company plans to inspect the average diameter of small bolts made. A
random sample of 6 bolts was selected. The sample mean is computed to be 2.0016 mm and the
sample standard deviation 0.0012 mm. Construct the 99% confidence interval for the mean
diameter of all bolts made.

Confidence interval for a population proportion


Proportion is the fraction, ratio or percent indicating the part of the sample or the population
having a particular trait of interest. To construct a confidence interval, the sample should be
large enough for the normal approximation to apply (a common rule of thumb is that $np$ and
$n(1 - p)$ are both at least 5). The confidence interval for a population proportion is estimated as
$p \pm Z \sigma_p$
where $\sigma_p$ is the standard error of the proportion and
$\sigma_p = \sqrt{\frac{p(1 - p)}{n}}$
Therefore the confidence interval for a population proportion is constructed by
$p \pm Z \sqrt{\frac{p(1 - p)}{n}}$
Example. Suppose 1600 of 2000 union members sampled said they plan to vote for the
proposal to merge with a national union. Union by-laws state that at least 75% of all members
must approve for the merger to be enacted. Using the 0.95 degree of confidence, what is the
interval estimate for the population proportion? Based on the confidence interval, what
conclusion can be drawn?
First calculate the sample proportion: $p = \frac{1600}{2000} = 0.8$. The sample proportion is 80%.
The interval is computed as follows.
$p \pm Z \sqrt{\frac{p(1 - p)}{n}} = 0.8 \pm 1.96 \sqrt{\frac{0.80(1 - 0.80)}{2000}} = 0.8 \pm 1.96 \sqrt{0.00008}$
= 0.78247 and 0.81753, rounded to 0.782 and 0.818.

Based on the sample results, when all union members vote the proposal will probably pass,
because the whole interval between 0.782 and 0.818 lies above 0.75.
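
A short sketch of the proportion interval, using the union example's figures as input. The function name `proportion_confidence_interval` is hypothetical, and the critical value is taken from the standard normal distribution rather than the rounded table value, so the last digits may differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) limits of p +/- Z * sqrt(p(1-p)/n)."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)  # Z times the standard error
    return p - margin, p + margin


lower, upper = proportion_confidence_interval(1600, 2000, 0.95)
print(f"{lower:.3f} to {upper:.3f}")
# The merger needs at least 75% approval; check the whole interval.
print("interval entirely above 0.75:", lower > 0.75)
```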

Check Your Progress


A sample of 200 people was asked to identify their major source of news information; 110
stated that their major source was television news coverage. Construct a 90% confidence
interval for the proportion of people in the population who consider television their major
source of news information.

Determining the sample size

The size of a sample must be determined scientifically. Care must be taken not to select a sample
that is too large or too small. There are two misconceptions about how many to sample:
a) A sample consisting of 5% (or a similar constant percentage) of the population is adequate
for all problems. In fact, 5% can be too much for one population, say 10 million, and too
little for another, say 200.
b) A sample must be selected from a heavily populated area. To avoid such
problems the sample size should be mathematically determined.

Sample Size for the Mean

There are three factors that determine the size of the sample, none of which has any direct
relationship to the size of the population:
- The degree of confidence selected
- The maximum allowable error
- The variation in the population
a. The degree of confidence. This is usually 95% or 99%, but it may be any level. It is
specified by the statistician. The higher the degree of confidence, the larger the sample
required. If we wanted to be completely sure that the true mean lies within an interval, we
would have to survey the entire population. Example. Suppose the parameter to be estimated
is the arithmetic mean, and the degree of confidence selected is 90%. Based on a sample, it was
estimated that the population mean is in the interval between 850 and 1050. Logically, if
the degree of confidence were increased to 95% or 99% while keeping the interval equally
narrow, the sample size would have to increase.


b. Maximum error allowed. It is the maximum error that will be tolerable at a specified level of
confidence. Suppose a statistician is interested in estimating the mean income of residents of
an area. There are indications that the family incomes range from a probable low of 19,000
to a high of about 39,000. On the assumption that these are reasonable estimates, does it
seem likely that the statistician would be satisfied with this statement resulting from a
sample of area residents: "The population mean is between 23,000 and 35,000"? Probably
not, because confidence limits that wide indicate little or nothing about the population
mean. Instead, the statistician stated: "Using the 0.95 confidence level, the total error in
predicting the population mean should not exceed 200." The maximum allowable error
is denoted $E$, where $E = |\bar{x} - \mu|$. This means that, based on a sample of size n, if the estimate of
the population mean is computed to be 35,000, then we can say that the population mean is
in the interval between 34,800 and 35,200, found by 35,000 − 200 and 35,000 + 200. For the
0.95 degree of confidence selected, the maximum error of ±200 corresponds to Z = 1.96. To
determine the value of one standard error of the mean, $\sigma_{\bar{x}}$, simply divide the total error of
200 by 1.96:
$\sigma_{\bar{x}} = \frac{200}{1.96} = 102.04$


c. Variation in the population. The standard deviation is a measure of variation. Thus the
standard deviation of the population must be estimated. A more convenient computational
formula for determining n is
$n = \left( \frac{Z \sigma}{E} \right)^2$
where E = allowable error
Z = Z value for the degree of confidence selected
$\sigma$ = estimate of the population standard deviation
Example 1. A marketing research firm wants to conduct a survey to estimate the average
amount spent on entertainment by each person visiting a popular pub. The people who plan the
survey would like to be able to determine the average amount spent by all people visiting the
pub to within Br. 120, with 95% confidence. From past operations of the pub, an estimate of
the population standard deviation is $\sigma$ = Br. 400. What is the minimum required sample size?
Z = 1.96
E = 120
$\sigma$ = 400
Required, n?
$n = \left( \frac{1.96 \times 400}{120} \right)^2 = 42.68 \approx 43$
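
The sample-size formula for the mean can be sketched as below. The function name `sample_size_for_mean` is an assumption made for this illustration; the result is always rounded up, since a fractional observation cannot be sampled and rounding down would miss the required precision.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, max_error, confidence=0.95):
    """Minimum n so that Z * sigma / sqrt(n) does not exceed max_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / max_error) ** 2)


# Pub survey example: sigma estimated at Br. 400, error within Br. 120.
n_required = sample_size_for_mean(sigma=400, max_error=120, confidence=0.95)
print(n_required)
```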

Check Your Progress

A processor of carrots cuts the green top off each carrot, washes the carrots, and inserts six to a
package. Twenty packages are inserted in a box for shipment. To test the weight of the boxes, a
few were checked. The mean weight was 10 kg and the standard deviation 0.25 kg. How many
boxes must the processor sample to be 95% confident that the sample mean does not differ
from the population mean by more than 0.1 kg?

Sample size for proportion


The procedure used to determine the sample size for the mean also applies when
proportions are involved.
Three things must be specified.
- Decide on the level of confidence
- Indicate how precise the estimate of the population proportion must be
- Approximate the population proportion, P, either from past experience or from a small
pilot survey; this approximation is denoted p
The formula for determining the sample size n for a proportion is
$n = p(1 - p) \left( \frac{Z}{E} \right)^2$
where: p = estimated proportion
Z = Z value for the selected confidence level
E = the maximum tolerable error

Example 1. A member of parliament wants to determine her popularity in her region. She
indicates that the proportion of voters who will vote for her must be estimated within ±2
percent of the population proportion. Further, the 95% degree of confidence is to be used. In
past elections she received 40% of the popular vote in that area. She doubts whether it has
changed much. How many registered voters should be sampled?
Z = 1.96
p = 0.40
E = 0.02
$n = p(1 - p) \left( \frac{Z}{E} \right)^2 = 0.40(1 - 0.40) \left( \frac{1.96}{0.02} \right)^2 = 2304.96 \approx 2305$


This sample size might be too large, too small or exactly correct, depending on the accuracy
of p.

Note: if there is no logical estimate of p, the sample size can be estimated by letting p = 0.5.

Example 2. Suppose the president wants an estimate of the proportion of the population that
supports his current policy on unemployment. The president wants the estimate to be within
0.04 of the true proportion. Assume a 95% level of confidence and the proportion supporting
current policy to be 0.60.
a) How large a sample is required?
b) How large would the sample have to be if the estimate were not available?
Solution:
a) E = 0.04
Z = 1.96
p = 0.60
$n = 0.6(1 - 0.6) \left( \frac{1.96}{0.04} \right)^2 = 576.24 \approx 577$
b) E = 0.04
Z = 1.96
p = 0.50 (since there is no estimate)
$n = 0.5(1 - 0.5) \left( \frac{1.96}{0.04} \right)^2 = 600.25 \approx 601$ (rounded up, as in part a)
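
The proportion sample-size formula can be sketched in the same way. The function name `sample_size_for_proportion` and its default of p = 0.5 when no estimate is supplied are assumptions made for this illustration, following the note that p = 0.5 is used when no logical estimate is available.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(max_error, confidence=0.95, p=0.5):
    """Minimum n from n = p(1 - p)(Z / E)^2, rounded up.

    p defaults to 0.5, which gives the largest (most cautious) n.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(p * (1 - p) * (z / max_error) ** 2)


print(sample_size_for_proportion(0.02, 0.95, p=0.40))  # parliament example
print(sample_size_for_proportion(0.04, 0.95, p=0.60))  # part a
print(sample_size_for_proportion(0.04, 0.95))          # part b, no estimate
```

Because p(1 − p) is largest at p = 0.5, leaving the estimate out never produces a sample that is too small for the stated error.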
Check Your Progress

The marketing department of a company wishes to study the loyalty pattern of consumers.
Loyalty patterns range from extremely loyal to brand switcher. If the department wishes to
estimate the proportion of consumers who are extremely loyal to this brand, what sample size
would be necessary to estimate this proportion within 0.05 with 95% confidence?
