Chapter 8 Estimation

Uploaded by

lindazd1223

College of Health and Medical Science

Department of Epidemiology and Biostatistics

Statistical Estimation Techniques

Hamdi Fekredin (BSc, MPH)

October, 2024

Estimation techniques
Learning objectives
Upon completion of the session, students will be able to:
• Identify the different estimation techniques in one-sample situations
• Estimate the sample size for a cross-sectional study

Introduction
In the real world, the values of population parameters are fixed
and usually not known.
Instead, we must try to say something about the way in which
a variable is distributed using the information contained in a
sample of observations.
The process of drawing conclusions about an entire population
based on the data in a sample is known as statistical inference.
Statistical inference has two broad categories: estimation and hypothesis testing.

Estimation
Estimation is concerned with estimating the values of specific population
parameters from sample statistics.
It uses the information in a sample to estimate the
characteristics (parameters) of the source population.

Examples: A sample survey revealed:

• The proportion of smokers in a certain population group aged 15 to 24.

• The mean SBP in the sampled population.

The next question is: what can we infer about the characteristics of
the population from which the sample was drawn?

Estimation, Estimator & Estimate
♣ Estimation is the computation of a statistic from sample data,
yielding a value that approximates its target, an unknown true
population parameter value.

♣ The statistic itself is called an estimator and can be of two
types: point or interval.

♣ The value or values that the estimator takes are called
estimates.

Two methods of estimation are commonly used:
point estimation and interval estimation.

Point estimation involves the calculation of a single number to
estimate the population parameter.
Interval estimation specifies a range of reasonable values for
the parameter.

Point versus Interval Estimators
♣ An estimator that represents a "single best guess" is called a
point estimator.

♣ When the estimate takes the form of a "range of plausible
values", it is called an interval estimator.

• Thus, a point estimate is of the form: [ value ],
• whereas an interval estimate is of the form: [ lower limit,
upper limit ].

The sample mean (x̄) is an unbiased estimator of the population mean (µ).

Estimating the Sampling Error

• Any estimate derived from a sample is subject to
sampling error.
• This comes from the fact that only a part of the population
was observed, instead of the whole.
• Different samples could have produced different results.

• The amount of variation that exists among the estimates from
the different possible samples is the sampling error.
• The set of sample means in repeated random samples of size n from a
given population has variance σ²/n.
• The standard deviation of this set of sample means is σ/√n and is
referred to as the standard error of the mean (SEM) or the standard
error.
• The SEM is estimated by s/√n if σ is unknown.

• The sampling error depends on the sample size (n) and the
variability of the individual sample points (σ).

• As n increases, the sample mean (x̄) and the sample variance
s² approach the values of the true population parameters, µ
and σ², respectively.

Example
• Suppose that the mean ± SD of DBP in 20 older men is 78.5 ± 10.3 mm
Hg.

1. What is our best estimate of µ?

Our best estimate of µ is the sample mean, 78.5 mm Hg.

2. What is the SEM?

The SEM of this estimate is 10.3/√20 = 2.3 mm Hg.

3. Compare the SEM with the SD.

The SEM (2.3) is much smaller than the SD (10.3), because the mean of
20 observations varies less than a single observation does.

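The SEM arithmetic in the DBP example can be checked with a few lines of Python. This is only a sketch: the function name `standard_error` is chosen here for illustration and does not come from the slides.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Estimated standard error of the mean, s / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sd / math.sqrt(n)


# DBP example from the slide: SD = 10.3 mm Hg, n = 20
sem = standard_error(10.3, 20)
print(f"SEM = {sem:.1f} mm Hg")

# Quadrupling the sample size halves the SEM
print(f"SEM with n = 80: {standard_error(10.3, 80):.2f} mm Hg")
```

Because n sits under a square root, the sample size has to be quadrupled to halve the standard error.
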
1. Point Estimate
• A single numerical value used to estimate the corresponding
population parameter.
Sample statistics are estimators of population parameters:

Sample statistic (estimator)        Population parameter
Sample mean, x̄                      µ
Sample variance, s²                 σ²
Sample proportion, p                P or π
Sample odds ratio, ÔR               OR
Sample relative risk, R̂R            RR
Sample correlation coefficient, r   ρ

2. Interval Estimation
• Interval estimation specifies a range of reasonable values for
the population parameter based on a point estimate.
• A confidence interval is a particular type of interval estimator.

Confidence Intervals
• Give a plausible range of values, built around the estimate, that is likely to include
the “true” (population) value with a given confidence level.
• An interval estimate provides more information about a
population characteristic than does a point estimate.
• CIs also give information about the precision of an estimate.

• When sampling variability is high, the CI will be wide to reflect
the uncertainty of the observation.

• Wider CIs indicate less certainty.

• CIs can also answer the question of whether or not an
association exists (analogous to p-values…).

• A narrow CI reflects a large sample size, low variability,
or both.

General Formula:
The general formula for all CIs is:

point estimate ± (measure of how confident we want to be) × (standard error)

• Point estimate: the value of the statistic in the sample
(e.g., mean, proportion, etc.)
• Measure of confidence: taken from a Z table or a t table, depending on the
sampling distribution of the statistic.

A confidence interval has 3 components:

1) A point estimate (e.g. the sample mean)

2) The standard error of the point estimate (e.g. SEM = σ/√n)

3) A confidence coefficient (the critical value)

Lower limit = Point Estimate - (Critical Value) x (Standard Error)
Upper limit = Point Estimate + (Critical Value) x (Standard Error)

Confidence Level
• The confidence level is the degree of confidence that the interval will contain the unknown
population parameter.
• It is expressed as a percentage (less than 100%).

Example: 95%
• Also written (1 - α) = 0.95

Definition of 95% CI
1. Probabilistic interpretation:
• If all possible random samples of a given size were drawn
and each were used to compute its own CI, then 95% of all such CIs
would contain the unknown population parameter; the remaining 5%
would not.

2. Practical interpretation:
• When sampling is from a normally distributed population with known
standard deviation, we are 100(1-α)% [e.g., 95%] confident that the
single computed interval contains the unknown population
parameter.

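The probabilistic interpretation can be illustrated by simulation. The sketch below draws many samples from a normal population with known σ, builds a 95% CI from each, and counts how often the interval covers the true mean. The population values (µ = 50, σ = 10), the sample size, the number of repetitions and the seed are arbitrary assumptions made for the illustration, not values from the slides.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, reps=4000, seed=1):
    """Fraction of simulated known-sigma z-intervals that contain mu."""
    rng = random.Random(seed)
    # Two-sided critical value, e.g. about 1.96 for a 95% level
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


print(f"Observed coverage: {coverage():.3f}")
```

The observed fraction should land close to 0.95. Any single interval either contains µ or does not; the 95% describes the long-run behaviour of the procedure.
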
Estimation for Single Population

1. CI for a Population Mean (normally distributed)

A. Known variance (large sample size)

Consider the task of computing a CI estimate of µ for a
population distribution that is normal with σ known.
• Data are available from a random sample of size n.

Assumptions
• Population standard deviation (σ) is known

• Population is normally distributed

• If the population is not normal, use a large sample

A 100(1-α)% CI for µ is:

$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

• α is chosen by the researcher; the most common values
of α are 0.1, 0.05 and 0.01.
• Commonly used CLs are 90%, 95%, and 99%

Finding the Critical Value
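
As an alternative to reading a Z table, the two-sided critical value for a chosen confidence level can be computed with Python's standard library. This is a sketch; the helper name `z_critical` is introduced here for illustration only.

```python
from statistics import NormalDist


def z_critical(level: float) -> float:
    """Two-sided standard normal critical value for a confidence level."""
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    alpha = 1 - level
    # Upper alpha/2 point of the standard normal distribution
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The three commonly used levels give approximately 1.645, 1.960 and 2.576.
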

Margin of Error
(Precision of the estimate)

Factors Affecting Margin of Error

The length of the CI for a mean (twice the margin of error) is determined by n, s,
and α.
As n increases, the length of the CI decreases.

As s increases, the length of the CI increases.

As the confidence level increases (α decreases), the length
of the CI increases.

Example:
1. Waiting times (in hours) at a particular hospital are believed to
be approximately normally distributed with a variance of
2.25 hr².

a. A sample of 20 outpatients revealed a mean waiting time of
1.52 hours. Construct the 95% CI for the estimate of the
population mean.

b. Suppose that the mean of 1.52 hours had resulted from a
sample of 32 patients. Find the 95% CI.

c. What effect does a larger sample size have on the CI?

a. $1.52 \pm 1.96\sqrt{\dfrac{2.25}{20}} = 1.52 \pm 1.96(0.335) = 1.52 \pm 0.66 \Rightarrow (0.86, 2.18)$

• We are 95% confident that the true mean waiting time is
between 0.86 and 2.18 hrs.
• 95% of the intervals formed in this manner will contain the true
mean.

b. $1.52 \pm 1.96\sqrt{\dfrac{2.25}{32}} = 1.52 \pm 1.96(0.265) = 1.52 \pm 0.52 \Rightarrow (1.00, 2.04)$

c. A larger sample size makes the CI narrower (more
precision).
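
The waiting-time calculations can be reproduced with a short script. This is a sketch that assumes the known-variance z-interval described above; the function name `z_interval` is hypothetical. Carrying unrounded intermediate values gives limits near (0.86, 2.18) for n = 20 and (1.00, 2.04) for n = 32; rounding the standard error before multiplying shifts the limits by about 0.01.

```python
import math
from statistics import NormalDist


def z_interval(mean, variance, n, level=0.95):
    """CI for a population mean when the population variance is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * math.sqrt(variance / n)  # critical value x standard error
    return mean - margin, mean + margin


# Waiting-time example: mean 1.52 hr, variance 2.25 hr^2
for n in (20, 32):
    lower, upper = z_interval(1.52, 2.25, n)
    print(f"n = {n}: ({lower:.2f}, {upper:.2f}), width = {upper - lower:.2f}")
```

The printed widths make part (c) concrete: the larger sample gives the shorter interval.
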

• When constructing CIs, it has been assumed that the standard
deviation of the underlying population, σ, is known.
• What if σ is not known?

• In this case, σ can be replaced by the sample standard deviation s
when computing the SE, provided the sample size is large enough (n > 30).
With a large sample size, the sampling distribution of the mean is approximately normal.

• Example: In a sample of 35 patients, the patients were on average 17.2
minutes late for appointments, with an SD of 8
minutes. What is the 90% CI for µ?
Ans: 17.2 ± 1.645 × 8/√35 = 17.2 ± 2.22, i.e. (14.98, 19.42).
• Since the sample size is fairly large (> 30) and the
population SD is unknown, we take the distribution
of the sample mean to be normal based on
the CLT and use the sample SD in place of the population σ.

B. Unknown variance
(small sample size, n ≤ 30)
• What if σ for the underlying population is unknown and
the sample size is small?

• As an alternative, we use Student’s t distribution.

Student’s t Distribution
• The t is a family of continuous probability distributions

• Bell shaped

• Symmetric about zero (the mean)

• Flatter than the Normal(0,1). This means:

  - The variability of a t is greater than that of a Z that is
    Normal(0,1).
  - Thus, there is more area under the tails and less at the center.
  - Because variability is greater, the resulting confidence intervals
    will be wider.
• Note: t approaches z as n increases

Student’s t Table

t distribution values
• Compared with the Z value

Example

• Standard error =

• t-value for a 90% CI with 19 df = 1.729

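A small-sample interval based on t can be sketched as follows. The data for this slide's example are not recoverable from the source, so the sketch reuses the DBP figures from the earlier example (n = 20, mean 78.5, SD 10.3), which also give 19 degrees of freedom. The critical value 1.729 is the tabulated t value quoted on the slide and is passed in as a plain number; the function name `t_interval` is an assumption for illustration.

```python
import math


def t_interval(mean, sd, n, t_crit):
    """CI for a mean with unknown sigma, using a tabulated t critical value."""
    se = sd / math.sqrt(n)   # estimated standard error, s / sqrt(n)
    margin = t_crit * se
    return mean - margin, mean + margin


# Illustration only: DBP data (n = 20, so df = 19), 90% CI with t = 1.729
lower, upper = t_interval(78.5, 10.3, 20, 1.729)
print(f"90% CI: ({lower:.2f}, {upper:.2f}) mm Hg")
```

With the z value 1.645 in place of 1.729 the interval would be slightly narrower, which is the sense in which t-based intervals are wider.
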
2. CIs for population proportion, p

Is based on the three elements of a CI:

• Point estimate

• SE of the point estimate

• Confidence coefficient

Lower limit = Point Estimate - (Critical Value) x (Standard
Error of Estimate)
Upper limit = Point Estimate + (Critical Value) x (Standard
Error of Estimate)

Hence,

$\hat{p} \pm 1.96\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

is an approximate 95% CI for the true proportion p, where $\hat{p}$ is the sample proportion.

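Example 1
• A random sample of 100 people shows that 25 are left-handed.
Form a 95% CI for the true proportion of left-handers.

Sample proportion = 25/100 = 0.25; SE = √(0.25 × 0.75/100) = 0.043
95% CI: 0.25 ± 1.96(0.043) = (0.165, 0.335)

Interpretation
• We are 95% confident that the true proportion of left-handers in the
population is between 16.5% and 33.5%.
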
Example
• It was found that 28.1% of 153 cervical-cancer cases had never
had a Pap smear prior to the time of diagnosis. Calculate
a 95% CI for the percentage of cervical-cancer cases who never
had a Pap test.

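Both proportion examples can be worked with the same large-sample formula, point estimate ± critical value × standard error. The sketch below assumes the normal approximation is adequate for these sample sizes; the function name `proportion_ci` is introduced here for illustration.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, level=0.95):
    """Approximate (normal-theory) CI for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # SE of the sample proportion
    return p_hat - z * se, p_hat + z * se


# Left-handedness example: 25 of 100
lower, upper = proportion_ci(25 / 100, 100)
print(f"Left-handers: ({lower:.3f}, {upper:.3f})")

# Pap smear example: 28.1% of 153 cases
lower, upper = proportion_ci(0.281, 153)
print(f"Never had a Pap smear: ({lower:.1%}, {upper:.1%})")
```

For the Pap smear data the interval runs from roughly 21% to 35%.
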
Sample size Determination
Too small a sample size:
• May fail to detect an important effect

• Estimates of effect may be too imprecise (wide CIs)

Too large a sample size:

• May result in wasted resources

As a rough guide, making generalizations about an entire population
typically needs a total sample size of about 200-400.

Confidence interval approach
• Given the confidence interval

mean (proportion) ± $z_{\alpha/2}$ × s.e.

• the absolute precision (margin of error), denoted by d, is given as

d = $z_{\alpha/2}$ × s.e.

• where s.e. is the standard error of the estimator of the
parameter of interest.

Steps to determine sample size:
1. Specify the tolerable error (i.e., the desired precision and confidence
level via d and α)

2. Identify the appropriate equation relating the tolerable error (d, α) to
the sample size (n)

3. Estimate the unknown quantities in the equation

4. Solve for n

5. Evaluate (and return to the first step if necessary)

The sample size calculation should relate to the study’s outcome
variable.

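Estimating a single population
mean/proportion

For a mean: $n = \dfrac{z_{\alpha/2}^{2}\,\sigma^{2}}{d^{2}}$

For a proportion: $n = \dfrac{z_{\alpha/2}^{2}\,p(1-p)}{d^{2}}$
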
Examples
1. A survey is being planned to determine what proportion of
families in a certain area are medically indigent. Previous studies
suggest that the proportion is 0.35. A 95%
confidence interval is desired with d = 5%. What size sample of
families should be selected?
2. Suppose that you are interested in the proportion of
infants who are breastfed beyond 18 months of age in a rural area.
Suppose that in a similar area, the proportion (p) of breastfed
infants was found to be 0.20. What sample size is required to
estimate the true proportion within ±3 percentage points with 95%
confidence? Let p = 0.20, d = 0.03, α = 5%.

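The two proportion examples can be solved with the single-proportion formula n = z²p(1-p)/d². The sketch below uses z = 1.96 for 95% confidence and rounds up to the next whole unit, a common conservative convention; the function name `sample_size_proportion` is an assumption made for illustration.

```python
import math


def sample_size_proportion(p, d, z=1.96):
    """Sample size to estimate a proportion p within +/- d (absolute)."""
    if not 0 < p < 1 or d <= 0:
        raise ValueError("p must lie in (0, 1) and d must be positive")
    return math.ceil(z**2 * p * (1 - p) / d**2)


# Example 1: medically indigent families, p = 0.35, d = 0.05
print(sample_size_proportion(0.35, 0.05))

# Example 2: breastfeeding beyond 18 months, p = 0.20, d = 0.03
print(sample_size_proportion(0.20, 0.03))
```

Under these assumptions the first survey needs about 350 families and the second about 683 infants. Tightening d from 5 to 3 percentage points is the main reason the second figure is so much larger.
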
Example
3. Suppose that for a certain group of cancer patients, we are
interested in estimating the mean age at diagnosis. We would like
a 95% CI and want a margin of error of 2 years.

If the population variance is 124 years² (SD ≈ 11.1 years), how large should our sample
be?

$n = \dfrac{(1.96)^{2}(124)}{2^{2}} = 119.1$, i.e. about 120 patients after rounding up.

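The mean-age example follows the single-mean formula n = z²σ²/d². The sketch below assumes that the figure 124 is the population variance, which is how the slide's arithmetic uses it, and rounds up to the next whole patient. The function name `sample_size_mean` is hypothetical.

```python
import math


def sample_size_mean(variance, d, z=1.96):
    """Sample size to estimate a mean within +/- d, given the variance."""
    if variance <= 0 or d <= 0:
        raise ValueError("variance and d must be positive")
    return math.ceil(z**2 * variance / d**2)


# Age at diagnosis: variance 124 (SD about 11.1 years), d = 2 years
n = sample_size_mean(124, 2)
print(f"Required sample size: {n}")
```

The unrounded value is about 119.1, so rounding to the nearest integer gives 119 while rounding up gives 120.
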
Suppose there is no prior information about the proportion
(p) who breastfeed.

For a fixed absolute precision (d), the required sample
size increases as p increases from 0 to 0.5, and then
decreases in the same way as the prevalence
approaches 1. Taking p = 0.5 therefore gives the largest (most conservative) sample size.

• An estimate of p is not always available.

• However, the formula may still be used for sample size
calculation based on various assumed values of
p (here with d = 0.05 and 95% confidence):
p = 0.1 → n = (1.96)²(0.1)(0.9)/(0.05)² = 138
p = 0.2 → n = (1.96)²(0.2)(0.8)/(0.05)² = 246
p = 0.3 → n = (1.96)²(0.3)(0.7)/(0.05)² = 323
p = 0.5 → n = (1.96)²(0.5)(0.5)/(0.05)² = 384
p = 0.7 → n = (1.96)²(0.7)(0.3)/(0.05)² = 323
p = 0.8 → n = (1.96)²(0.8)(0.2)/(0.05)² = 246

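The table of sample sizes for different assumed values of p can be regenerated in a few lines. This sketch fixes d = 0.05 and z = 1.96 and rounds to the nearest integer so that the figures match the slide; rounding up instead would add one in some rows. The function name `n_required` is chosen here for illustration.

```python
def n_required(p, d=0.05, z=1.96):
    """Unrounded sample size for estimating a proportion p within +/- d."""
    return z**2 * p * (1 - p) / d**2


# Tabulate the rounded sample size for each assumed proportion
sizes = {p: round(n_required(p)) for p in (0.1, 0.2, 0.3, 0.5, 0.7, 0.8)}
for p, n in sizes.items():
    print(f"p = {p}: n = {n}")

# The requirement peaks at p = 0.5 because p(1 - p) is largest there
worst_case = max(sizes, key=sizes.get)
print(f"Largest n occurs at p = {worst_case}")
```

The symmetry around 0.5 reflects the product p(1-p), which takes the same value for p and 1-p.
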
Some Considerations

Using design effect
• The loss of effectiveness from using cluster sampling,
instead of simple random sampling, is the design effect.
• The design effect is the ratio of the actual variance,
under the sampling method actually used, to the variance
computed under the assumption of simple random
sampling.
• As rules of thumb:
  - For simple and systematic random sampling, the design effect is taken as one.
  - For cluster sampling, the design effect is taken as two.
  - For multi-stage sampling, the design effect is taken as equal to the number of stages.
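
In practice the design effect is usually applied by multiplying the sample size computed under simple random sampling by it. The slides do not spell this step out, so the sketch below is an assumption about how the values above would be used; the function name `adjust_for_design_effect` and the starting figure of 384 (the p = 0.5, d = 0.05 case) are illustrative.

```python
import math


def adjust_for_design_effect(n_srs, deff):
    """Inflate a simple-random-sampling sample size by a design effect."""
    if n_srs <= 0 or deff <= 0:
        raise ValueError("n_srs and deff must be positive")
    return math.ceil(n_srs * deff)


n_srs = 384  # sample size under simple random sampling (illustrative)
for label, deff in (("simple random", 1), ("cluster", 2), ("three-stage", 3)):
    print(f"{label}: n = {adjust_for_design_effect(n_srs, deff)}")
```

Where an estimate of the design effect is available from an earlier survey with a similar design, it would normally be preferred to these rule-of-thumb values.
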
Quiz

1. List and explain at least two types of probability sampling

2. State the central limit theorem
3. Differentiate between point and interval estimation