
Chapter 7: Estimation

The document provides an overview of statistical inference, focusing on statistical estimation methods, including point and interval estimation. It explains how to estimate population parameters, calculate confidence intervals, and the assumptions for using t-distribution and z-distribution. Key concepts such as confidence levels, significance levels, and factors affecting confidence interval width are also discussed.

Uploaded by

michot felegu
Copyright
© All Rights Reserved

Statistical Inference

Samrawit.F (Msc)

Objective
At the end of this session you are expected to:
• Define statistical estimation, estimator, and estimate
• Differentiate between the statistical estimation methods
• Understand the assumptions for using the t-distribution and the z-distribution
• Estimate a population mean and a mean difference
• Calculate a CI for a single population mean
• Calculate a CI for a single population proportion

Introduction
• Statistical inference is the process by which we acquire information about populations from samples.
• There are two procedures for making inferences:
– Statistical estimation
– Hypothesis testing

Estimation, Estimator & Estimate
• Estimation is the process of estimating population parameters from sample statistics (that is, by computing a statistic from sample data).
• The statistic itself is called an estimator and can be of two types: point or interval.
• The value or values that the estimator assumes are called estimates.

Statistical Estimation
• There are two ways to estimate population values from sample values:
– Point estimation
• Uses a sample statistic to estimate a population parameter with a single value
• Point estimation ignores sampling error!
– Interval estimation
• Uses a sample statistic to estimate a population parameter while making allowance for sample variation (error)

Point estimate
• A point estimate is a single numerical value used to estimate the corresponding population parameter.
• Point estimation involves calculating a single value to estimate the population parameter (mean or proportion/prevalence).
• The probability that a single sample statistic is exactly equal to the population parameter value is extremely small. For this reason, a point estimate is rarely reported on its own.
• Thus, a point estimate is of the form: [value]

Interval estimation
• An interval estimate is an interval, defined by two numbers, within which the population parameter could lie.
• It provides more information about a population characteristic than a point estimate does.
• It accounts for the sample-to-sample variation (variability) of sample statistics.
• An interval estimate is of the form: [lower limit, upper limit]

Cont’d…
• It provides a confidence level for the estimate.
• Such interval estimates are called confidence intervals.

Point Estimate
• A single numerical value used to estimate the
corresponding population parameter.
Sample statistics are estimators of population parameters:

Sample mean, x̄ → µ
Sample variance, S² → σ²
Sample proportion, p → P or π
Sample odds ratio, ÔR → OR
Sample relative risk, R̂R → RR
Sample correlation coefficient, r → ρ

School of Public Health


Point Estimator
– A point estimator draws inference about a population
by estimating the value of an unknown parameter using
a single value or a point.

[Figure: the population distribution and its parameter, with the sample distribution and the point estimator]

Interval Estimator
– An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.
– The interval estimator is affected by the sample size.

[Figure: the population distribution and its parameter, with the sample distribution and the interval estimator]

Cont’d…
• A confidence interval (CI) for a population characteristic is an interval of plausible values for the characteristic.
• It is constructed so that, with a chosen degree of confidence (the confidence level), the value of the characteristic will be captured inside the interval.
• A confidence interval is always accompanied by a probability that defines the risk of being wrong.
• This probability of error is usually called the level of significance, α (alpha).

Cont’d…
• The level of confidence (1 - α) is the probability that the interval estimate contains the population parameter.
• The most common choices of level of confidence are 0.95, 0.99, and 0.90.
• For example, if the level of confidence is 90%, we are 90% confident that the interval contains the population parameter.

• Confidence interval: A range of values, computed from a poll, experiment, or survey, that would be expected to contain the population parameter of interest. Confidence intervals are constructed using significance levels / confidence levels.
• Confidence level: The proportion of intervals that would contain the population parameter if the poll/test/survey were repeated over and over again. Confidence level = 1 - α.
• Significance level: In a hypothesis test, the significance level, α, is the probability of rejecting the null hypothesis when it is in fact true.
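
The repeated-sampling reading of the confidence level can be checked with a small simulation. This is only a sketch: the population values (`mu`, `sigma`), the sample size, the number of repetitions, and the function name `coverage` are all illustrative assumptions, not values from the slides.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=25, conf=0.95, reps=2000, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    se = sigma / n ** 0.5                         # standard error (sigma known)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        if xbar - z * se <= mu <= xbar + z * se:
            hits += 1
    return hits / reps

print(coverage())  # should land close to the chosen confidence level
```

With a 95% confidence level, roughly 95 out of every 100 simulated intervals should capture the true mean; the remaining share corresponds to α.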

• Lower limit = Point estimate - (Critical value) × (Standard error)

• Upper limit = Point estimate + (Critical value) × (Standard error)

Note the following:

• A wide interval suggests imprecise estimation.
• A narrow CI reflects a large sample size, low variability, or both.
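
The two limit formulas above can be sketched as one small function. The function name and the example inputs (a point estimate of 10.0, a critical value of 1.96, a standard error of 0.5) are hypothetical and serve only to show the arithmetic.

```python
def confidence_limits(point_estimate, critical_value, standard_error):
    """Return (lower limit, upper limit) of a confidence interval."""
    margin = critical_value * standard_error  # margin of error
    return point_estimate - margin, point_estimate + margin

# Hypothetical inputs, for illustration only
lower, upper = confidence_limits(10.0, 1.96, 0.5)
print(lower, upper)
```

A larger standard error or a larger critical value widens the interval symmetrically around the point estimate.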

Standard error

• The standard error of the mean is equal to the standard deviation divided by the square root of the sample size, n.
• This shows that the larger the sample size, the smaller the standard error.
• The standard error of the mean can provide a rough estimate of the interval in which the population mean is likely to fall.
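
As a quick illustration of how the standard error shrinks as n grows, the sketch below evaluates SD / √n for a few sample sizes. The SD of 20 and the sample sizes are assumed values chosen for easy arithmetic.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean: standard deviation over sqrt(sample size)."""
    return sd / sqrt(n)

# Hypothetical SD of 20 with increasing sample sizes
for n in (25, 100, 400):
    print(n, standard_error(20.0, n))
```

Quadrupling the sample size halves the standard error, because n enters through its square root.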

Factors affecting the width of the confidence interval
– The width of the interval estimate is a function of:
• The population standard deviation: increasing the population standard deviation leads to a wider confidence interval.
• The confidence level: increasing the confidence level produces a wider interval.
• The sample size: increasing the sample size decreases the width of the interval estimate while the confidence level can remain unchanged.
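
The three factors can be compared numerically with a short sketch, assuming a z-based interval with known σ. The function name `ci_width` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, conf):
    """Full width of a z-based CI for a mean with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    return 2 * z * sigma / sqrt(n)

base = ci_width(sigma=10, n=100, conf=0.95)          # reference case
wider_sd = ci_width(sigma=20, n=100, conf=0.95)      # larger SD
wider_conf = ci_width(sigma=10, n=100, conf=0.99)    # higher confidence level
narrower_n = ci_width(sigma=10, n=400, conf=0.95)    # larger sample
print(base, wider_sd, wider_conf, narrower_n)
```

Changing one input at a time against the reference case shows each effect in isolation.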

Key points
• For a given confidence level (e.g. 90%, 95%, 99%), the width of the CI depends on the SE of the estimate, which in turn depends on:

1. Sample size: The larger the sample size, the narrower the CI (the sample statistic will tend to lie closer to the population parameter, and the more precise our estimate).

Lack of precision means that in repeated sampling the values of the sample statistic are spread out or scattered; the result of sampling is not repeatable.

Cont’d…

2. Standard deviation: The more variation among the individual values, the wider the CI and the less precise the estimate. Note that increasing the sample size reduces the SE, not the SD itself.

Estimating the Population Mean
Assumptions
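1. Population is normally/approximately normally distributed
– When the population variance/standard deviation is known and the sample size is large or small → z-distribution

$$P\left(\bar{x} - Z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

– When the population variance/standard deviation is unknown and the sample size is small → t-distribution

$$P\left(\bar{x} - t_{(1-\alpha/2),\,(n-1)}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{(1-\alpha/2),\,(n-1)}\,\frac{s}{\sqrt{n}}\right) = 1 - \alpha$$

2. Population is non-normally distributed and sample size is large (i.e. n > 30)

• Population variance is known → z-distribution

$$P\left(\bar{x} - Z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

• Population variance is unknown → z-distribution

$$P\left(\bar{x} - Z_{1-\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + Z_{1-\alpha/2}\,\frac{s}{\sqrt{n}}\right) = 1 - \alpha$$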
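
The assumptions above amount to a small decision rule, sketched here as a function. The function name and parameter names are hypothetical, and combinations that the list above does not mention return `None` rather than a guess.

```python
def choose_distribution(population_normal, sigma_known, n):
    """Return 'z' or 't' following the assumptions listed above.

    Returns None for combinations the list does not cover.
    """
    large = n > 30  # "large sample" threshold used in the list above
    if population_normal:
        if sigma_known:
            return "z"   # normal population, sigma known, any sample size
        if not large:
            return "t"   # normal population, sigma unknown, small sample
        return None      # normal, sigma unknown, large n: not listed above
    if large:
        return "z"       # non-normal population, large sample
    return None          # non-normal population, small sample: not listed

print(choose_distribution(population_normal=True, sigma_known=False, n=15))
```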

Solution

Steps
1. The sample mean is 199 and the sample SD is 20 mmHg; we are asked to construct a 95% CI.
2. 1 - α = 0.95, so α = 0.05.
3. Since the interval is two-tailed, α/2 = 0.025.
4. Looking up a tail area of 0.025 in the Z table gives the corresponding value 1.96 (1.96 is also called the critical value).
5. Calculate the CI using the formula.
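
Steps 2 to 5 can be traced in a few lines of Python. Note that the sample size for this problem is not recoverable from the text here, so `n_hypothetical = 100` below is purely an assumed value that lets the last step run; the resulting limits are illustrative, not the slide's answer.

```python
from math import sqrt
from statistics import NormalDist

conf = 0.95
alpha = 1 - conf                          # step 2: alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # steps 3-4: critical value, about 1.96

def mean_ci(xbar, sd, n, crit):
    """CI for a mean: xbar +/- crit * sd / sqrt(n)."""
    se = sd / sqrt(n)
    return xbar - crit * se, xbar + crit * se

n_hypothetical = 100  # ASSUMPTION: the actual sample size is not given in this text
lower, upper = mean_ci(199, 20, n_hypothetical, z)  # step 5
print(round(z, 2), round(lower, 2), round(upper, 2))
```

`NormalDist().inv_cdf` plays the role of the Z table lookup: it returns the value below which 97.5% of the standard normal distribution lies.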

Exercise
• Suppose we are interested in estimating the
prevalence rate of breast cancer among 50- to 54-
year-old women whose mothers have had breast
cancer. Suppose that in a random sample of 10,000
such women, 400 are found to have had breast
cancer at some point in their lives.
1. Estimate the prevalence of breast cancer among
50- to 54-year-old women whose mothers have had
breast cancer.
2. Derive a 95% CI for the prevalence rate of breast
cancer among 50- to 54-year-old women.
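
One way to approach this exercise is the normal-approximation interval for a proportion, p ± z × √(p(1 - p)/n). The sketch below applies it to the figures given (400 cases in 10,000 women); the choice of this particular interval and the function name are assumptions, since the intended method is not shown in the text.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, conf=0.95):
    """Point estimate and normal-approximation CI for a proportion."""
    p_hat = successes / n                         # point estimate (prevalence)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    se = sqrt(p_hat * (1 - p_hat) / n)            # standard error of p_hat
    return p_hat, p_hat - z * se, p_hat + z * se

p_hat, lower, upper = proportion_ci(400, 10_000)
print(p_hat, round(lower, 4), round(upper, 4))
```

With a sample this large the normal approximation is reasonable; for small samples or proportions near 0 or 1 another interval may be preferable.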

Exercise 2

1. Scores on the exam are normally distributed with a population standard deviation of 5.6. A random sample of 40 scores on the exam has a mean of 32. Estimate the population mean with:
a) 80% confidence
b) 90% confidence
c) 98% confidence

Interpretation: We are 80% confident that the population mean score is between 30.87 and 33.13.
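
The three intervals in Exercise 2 can be computed with a z-interval, since the population is normal and σ is known. This is a minimal sketch with a hypothetical function name; for the 80% case it reproduces the limits quoted in the interpretation above.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf):
    """z-based CI for a mean when the population SD is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Exercise 2 inputs: mean 32, sigma 5.6, n = 40
for conf in (0.80, 0.90, 0.98):
    lo, hi = z_interval(32, 5.6, 40, conf)
    print(f"{conf:.0%}: ({lo:.2f}, {hi:.2f})")
```

The interval widens as the confidence level rises from 80% to 98%, in line with the factors affecting CI width discussed earlier.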

Interpretation of CI



Any questions?
