03 Statistical Inference v0 2 05062022 050648pm


BBA – 2 (Spring – 2022)

Statistical Inference

Estimation

Ahmad Jalil Ansari


Contact: [email protected]
Part II
Student t-Distribution

Learning Objectives
 Concept of Estimates and Estimators
 Types of Estimates
• Point Estimate
• Interval Estimate
Student t Distribution
 Introduced in 1908 by W. S. Gosset, who published under the pen name "Student"
 It is used when
 Sample size is less than 30
 Population standard deviation is not known
 When using the Student t-distribution it is assumed that the
population is normal or approximately normal
Characteristics of t Distribution
 Like the normal distribution, the t-distribution is symmetrical
 It is flatter than the normal distribution (higher kurtosis). This means
 It is lower at the mean and higher at the tails than the normal distribution
 It has more area in the tails than the normal distribution
 Intervals are wider than those based on the normal distribution
 There is a different t-distribution for every possible sample size.
(In other words, there is a different t-distribution for each
possible number of degrees of freedom)
 As the sample size increases, the t-distribution loses its flatness and
becomes closer to the normal distribution
 When the sample size is 30 or more, we use the normal distribution
instead of the t-distribution
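The claim that t-based intervals are wider, and that the gap shrinks as the sample grows, can be checked with a short sketch. It assumes the two-sided 95% t values quoted in the examples of these slides (d.f. = 6, 9 and 27) and compares them with the normal z value; the names `T_95` and `z_95` are illustrative only.

```python
from statistics import NormalDist

# Two-sided 95% t values quoted in these slides, keyed by degrees of freedom.
T_95 = {6: 2.447, 9: 2.262, 27: 2.052}

# z value that leaves 2.5% in each tail of the standard normal (about 1.96).
z_95 = NormalDist().inv_cdf(0.975)

for df, t in sorted(T_95.items()):
    # A ratio above 1 means the t-based interval is wider than the z-based one.
    print(f"d.f. = {df:2d}: t = {t:.3f}, z = {z_95:.3f}, t/z = {t / z_95:.3f}")
```

Every ratio is above 1, and it moves toward 1 as the degrees of freedom increase.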
Using t Distribution
 The t table is more compact than the normal table and shows
values for only a few percentages (like 10%, 5%, 2%, 1%)
 These percentages are not the probability that the population
parameter lies within our confidence interval; they are the
probability that it lies outside the interval (the chance of error)
 To use the t table we must specify the degrees of freedom,
which for the t-distribution equal n - 1
 The confidence interval is
\( \bar{x} \pm t \times SE \), where t is the table value for the required confidence level
So if we make an estimate at the 90% confidence level from a sample of size
14, we look in the t table at the intersection of chance of error = 10% and
d.f. = 13, which gives 1.771, so the confidence interval is
\( \bar{x} \pm 1.771 \times SE \)
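For readers without a printed table, the lookup described above can be approximated numerically. The following is a minimal sketch using only the Python standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names and the search range of 0 to 100 are assumptions made for illustration, not part of the slides.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3  # symmetry: half the area lies below zero

def t_critical(confidence, n):
    """Two-sided critical value for a sample of size n (d.f. = n - 1)."""
    df = n - 1
    target = 1 - (1 - confidence) / 2  # chance of error is split over two tails
    lo, hi = 0.0, 100.0                # assumed range; ample for usual table values
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 90% confidence, n = 14 (d.f. = 13); the slides quote 1.771 from the table.
print(round(t_critical(0.90, 14), 3))
```

A statistics library would normally be used instead; the sketch only shows where the table values come from.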
Using t Distribution - Example
A power generating plant manager wants to estimate the coal needed
this year. He took a sample by measuring coal usage for 10 weeks.
The sample has a mean of 11,400 tons and a standard deviation of
700 tons.
a) Find the estimated standard error of the mean
b) Construct a 95% confidence interval for the mean
Sample size = n = 10
Mean of sample = \( \bar{x} \) = 11,400 tons/week
Sample standard deviation = s = 700 tons
Using t Distribution - Example
a) Estimated population standard deviation = \( \hat{\sigma} = s = 700 \)
The estimated standard error of the mean
= \( \hat{\sigma}_{\bar{x}} = \hat{\sigma} / \sqrt{n} = 700 / \sqrt{10} = 221.36 \)
b) As the sample size is less than 30 and the population standard deviation
is not known, it is a case of the t-distribution. For a 95% confidence interval
we look up the t value for a total area of 0.05 in both tails with d.f. = 9,
which is 2.262
Confidence Interval = \( \bar{x} \pm 2.262\,\hat{\sigma}_{\bar{x}} \)
= \( 11{,}400 \pm 2.262 \times 221.36 \) = (10,899, 11,901)
Therefore, with 95% confidence, the mean amount of coal needed
per week is estimated to be between 10,899 and 11,901 tons
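The arithmetic of the coal example can be retraced in a few lines. This is a sketch under the example's own figures (n = 10, mean 11,400, s = 700, table value 2.262); the helper name `t_confidence_interval` is hypothetical.

```python
import math

def t_confidence_interval(mean, s, n, t_value):
    """Return the estimated standard error and the interval mean ± t * SE."""
    se = s / math.sqrt(n)      # estimated standard error of the mean
    margin = t_value * se      # half-width of the interval
    return se, (mean - margin, mean + margin)

# t_value = 2.262 is the table entry for 95% confidence and d.f. = 9.
se, (low, high) = t_confidence_interval(mean=11_400, s=700, n=10, t_value=2.262)
print(f"SE = {se:.2f}, interval = ({low:.0f}, {high:.0f})")
```

Passing the t value in as an argument keeps the sketch independent of any particular table or library.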
t-Table
Using t Distribution - Example
For the following sample sizes and confidence levels, find the
appropriate t values for constructing confidence intervals
a) n = 28; 95%  b) n = 8; 98%  c) n = 13; 90%  d) n = 10; 95%

Answers (using d.f. = n - 1)
a) 2.052  b) 2.998  c) 1.782  d) 2.262
Using t Distribution - Example
Using a random sample of seven homemakers, it was determined
that the distance they walked in their housework had an average of
39.2 miles per week with standard deviation 3.2 miles per week.
Construct a 95% confidence interval for the mean
Sample size = n = 7
Mean of sample = \( \bar{x} \) = 39.2
Sample standard deviation = s = 3.2
As the sample size is less than 30 and the population standard deviation is
not known, it is a case of the t-distribution. For a 95% confidence interval we
look up the t value for a total area of 0.05 in both tails with d.f. = 6, which is 2.447
Standard error of mean = \( \hat{\sigma}_{\bar{x}} = s / \sqrt{n} = 3.2 / \sqrt{7} = 1.2095 \)
Confidence Interval for 95% confidence level = \( \bar{x} \pm t\,\hat{\sigma}_{\bar{x}} \)
= \( 39.2 \pm 2.447 \times 1.2095 \) = (36.24, 42.16)
Therefore, with 95% confidence, the mean distance walked is estimated to be
between 36.24 and 42.16 miles per week
Sample Size in estimation
Using an optimal sample size is important in any analysis
 If it is too small, we fail to achieve the objective of our
analysis
 If it is too big, we waste resources in gathering
the sample
 Unless we study the whole population, some sampling error
will always be there, i.e. we may lose some useful
information
 Sampling error may be controlled by selecting a sample
which is adequate in size
 In general, if more precision is required, a larger sample is
needed
Sample Size for estimating
Before making any analysis, it is necessary to find the
best sample size so that our estimate is close to the
true value.

The best sample size depends on:

 Maximum error of the estimate: how close to the
true mean do you want to be (2 units, 5 units, etc.)
 Population standard deviation: either known from
historical data or estimated
 Degree of confidence: how confident do you wish to be
Sample Size for estimating a mean
A university is performing a survey of the annual earnings of its
graduates. It knows that the standard deviation of these earnings
for the entire population (1000) is about $1500. How large a sample
should the university use in order to estimate the mean
earnings within $500 at the 95% confidence level?

The university is going to:

 Take a sample of some size n
 Determine the mean of the sample, \( \bar{x} \)
 Use \( \bar{x} \) as a point estimate of the population mean \( \mu \)

It wants to be 95% certain that the true mean annual earnings are
not more than $500 above or below the point estimate
Sample Size for estimating a mean
Confidence Interval = \( \bar{x} \pm z\,\sigma_{\bar{x}} \) = \( \bar{x} \pm 500 \)
⟹ \( z\,\sigma_{\bar{x}} = 500 \)
⟹ \( 1.96\,\sigma_{\bar{x}} = 500 \)
⟹ \( \sigma_{\bar{x}} = 500 / 1.96 = 255 \) = SE
Standard error of mean = \( \sigma_{\bar{x}} = \sigma / \sqrt{n} \)
⟹ \( 255 = 1500 / \sqrt{n} \)
⟹ \( \sqrt{n} = 1500 / 255 = 5.882 \)
⟹ \( n = 5.882^2 = 34.6 \approx 35 \)
Therefore, the university should take a sample of 35 graduates to
get the precision it wants in estimating the mean earnings
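The derivation above collapses to the single formula n = (z σ / E)², rounded up to the next whole unit. A minimal sketch, with a hypothetical function name and the example's figures as inputs:

```python
import math

def sample_size_for_mean(sigma, max_error, z):
    """Smallest n for which z * sigma / sqrt(n) does not exceed max_error."""
    return math.ceil((z * sigma / max_error) ** 2)

# Figures from the university example: sigma = $1500, error = $500, z = 1.96.
n = sample_size_for_mean(sigma=1500, max_error=500, z=1.96)
print(n)
```

Rounding up rather than to the nearest integer is a deliberate choice here: a smaller sample would give a margin of error slightly above the target.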
Sample Size for estimating a Proportion
A university wants to know what proportion of its students favors the new
grading system. What sample size should it use to be 90% certain of
estimating the true proportion of the population of 40,000 students that is
in favor of the new system within plus or minus 0.02?
Confidence Interval = \( p \pm z\,\sigma_p \) = \( p \pm 0.02 \)
⟹ \( 1.64\,\sigma_p = 0.02 \)
⟹ \( \sigma_p = 0.02 / 1.64 = 0.0122 \)
⟹ \( \sqrt{pq/n} = 0.0122 \)
⟹ \( pq/n = 0.0122^2 = 0.00014884 \)
Since p is unknown, let p = 0.5, which requires the largest sample. Then q = 0.5
⟹ \( n = (0.5 \times 0.5) / 0.00014884 = 1{,}680 \)
Therefore, the university should take a sample of 1,680 students to
get the precision it wants in estimating the proportion of students
favoring the new system
Sample size required for other assumed values of p:
p      n
0.2    1,075
0.3    1,411
0.4    1,613
0.5    1,680
0.6    1,613
0.7    1,411
0.8    1,075
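The table of p against n can be regenerated to see why p = 0.5 is the cautious choice. The sketch below is illustrative only: the function name is hypothetical, and it uses exact fractions so that rounding up is not disturbed by floating-point noise. Because it carries full precision, its figures come out slightly higher than the slide's (for example 1,681 rather than 1,680 at p = 0.5); the slide rounds the standard error to 0.0122 before squaring.

```python
import math
from fractions import Fraction

def sample_size_for_proportion(p, max_error, z):
    """Smallest n for which z * sqrt(p * q / n) does not exceed max_error."""
    p, e, z = Fraction(p), Fraction(max_error), Fraction(z)
    return math.ceil(p * (1 - p) * (z / e) ** 2)

# Sweep the assumed proportion; inputs are strings so the fractions are exact.
sizes = {p: sample_size_for_proportion(p, "0.02", "1.64")
         for p in ["0.2", "0.3", "0.4", "0.5", "0.6", "0.7", "0.8"]}
for p, n in sizes.items():
    print(p, n)
```

The largest requirement occurs at p = 0.5, so a sample sized for that value is adequate whatever the true proportion turns out to be.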
Sample Size for estimating a Proportion
For a test market, find the sample size needed to estimate the true
proportion of consumers satisfied with the new product within ±0.04 at
the 90% confidence level.
Margin of error = \( z\,\sigma_p = 0.04 \)
⟹ \( 1.64\,\sigma_p = 0.04 \)
⟹ \( \sigma_p = 0.04 / 1.64 \)
⟹ \( \sqrt{pq/n} = 0.04 / 1.64 \)
⟹ \( pq/n = (0.04 / 1.64)^2 \)
Let p = 0.5, then q = 0.5
⟹ \( n = (0.5 \times 0.5) / (0.04 / 1.64)^2 = 420.25 \approx 421 \) (rounded up)
Therefore, for the test market, a sample of 421 customers should be used
to get the precision needed in estimating the proportion of customers
satisfied with the new product
Thank you
