Lec 7 & 8 (Statistical Estimation)
Statistical Estimation
02/09/2025 1
Learning Objectives
What is a sampling distribution?
• A sampling distribution is the distribution of all possible values of a
statistic computed from samples of the same size randomly selected from
the same population.
• In order to make an inference (e.g. estimate) about the
parameter from the sample statistic, one has to know or make
some assumptions about the distribution of the sample
statistic.
Cont..
• Due to random variation, different samples from the same population
will have different sample means.
• If we repeatedly take samples of the same size n from a population, the
means of the samples form the sampling distribution of the mean for samples of size n.
E.g. take a sample of size n from a population of size N and calculate the statistic, e.g. the mean.
• Take another sample (same size) and calculate the mean.
• Repeat, and repeat, and repeat ...
• Do you expect all the sample means to be the same? No.
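The repeated-sampling idea above can be tried out with a short simulation. This is only a sketch: the population of ages is made up for illustration, and the helper name `sample_means` is hypothetical.

```python
import random
import statistics

def sample_means(population, n, repeats, seed=0):
    """Draw `repeats` simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

# Hypothetical population: ages of 1,000 individuals (made-up values).
rng = random.Random(42)
population = [rng.randint(18, 80) for _ in range(1000)]

# Five samples of the same size from the same population.
means = sample_means(population, n=25, repeats=5)
print("Population mean:", round(statistics.mean(population), 2))
print("Sample means:   ", [round(m, 2) for m in means])
```

Each run of `rng.sample` gives a different subset of the population, so the five means differ from one another while staying close to the population mean.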
Cont..
• Sampling variability: the value of any statistic (mean or
proportion) varies in repeated random sampling.
• Sample statistics vary, but with less variation than the individual observations.
• In practice we do not take repeated samples from a population i.e.
we do not encounter sampling distribution empirically, but it is
necessary to know their properties in order to draw statistical
inferences.
Cont..
When sampling a discrete, finite population, a sampling
For example:
Age of individuals is a random variable.
Cont..
3. Repeat the sampling procedure until all possible different
samples have been drawn.
• For each sample, calculate the sample value of interest
(statistic), such as the sample mean or proportion.
Properties of sampling distribution
1. The mean of the sampling distribution of x̄ is the same as the population mean
(μx̄ = μ).
2. The standard deviation of the sampling distribution of x̄ is equal to the
population standard deviation divided by the square root of the sample size
(σ/√n). It is called the standard error.
3. If the original population is approximately normal, the sampling distribution
of x̄ is normal even at small sample sizes.
If the original population is non-normal, the sampling distribution of x̄ will be
approximately normal by the central limit theorem (CLT), provided n is large enough
(as a rule of thumb, n > 30).
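These properties can be checked numerically. The sketch below assumes a strongly skewed population, Exponential(1), whose mean and standard deviation are both 1; the function name and the choice n = 40 are illustrative only.

```python
import math
import random
import statistics

def simulate_sample_means(n, repeats=5000, seed=1):
    """Means of `repeats` samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]

n = 40
mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
means = simulate_sample_means(n)

# Property 1: the mean of the sample means should be close to mu.
print("mean of sample means:", round(statistics.fmean(means), 3), "| mu:", mu)
# Property 2: their standard deviation should be close to sigma / sqrt(n).
print("sd of sample means:  ", round(statistics.stdev(means), 3),
      "| sigma/sqrt(n):", round(sigma / math.sqrt(n), 3))
```

A histogram of `means` would also look roughly bell-shaped even though the population itself is skewed, which is what property 3 describes.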
Cont..
When sample sizes are large, sampling distribution generated
Cont..
• The beauty of the CLT is that it allows us to make probability
statements about x̄ without regard for the distribution of X, provided n
is large.
• Since x̄ is approximately N(μ, σ²/n), we can standardize to obtain
$$ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$
• And use our standard normal tables to find the probability that x̄ lies in
any particular interval.
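As a sketch of such a probability statement, the code below standardizes the interval limits and uses the standard normal CDF in place of printed tables. The numbers (μ = 100, σ = 15, n = 36) are hypothetical and the function name is an assumption for illustration.

```python
import math
from statistics import NormalDist

def prob_sample_mean_between(a, b, mu, sigma, n):
    """P(a < sample mean < b), using the normal approximation from the CLT."""
    se = sigma / math.sqrt(n)       # standard error of the mean
    z_a = (a - mu) / se             # standardized lower limit
    z_b = (b - mu) / se             # standardized upper limit
    std_normal = NormalDist()       # mean 0, standard deviation 1
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)

# Hypothetical values: population mean 100, sd 15, samples of size 36.
p = prob_sample_mean_between(98, 102, mu=100, sigma=15, n=36)
print("P(98 < sample mean < 102) =", round(p, 4))
```

Here the standard error is 15/√36 = 2.5, so the limits 98 and 102 correspond to Z = −0.8 and Z = 0.8.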
Note
The standard deviation represents the variability in the
individual data.
The standard error represents the variability in the sample
estimates: it measures how much the sample statistic varies
from sample to sample.
Inferential statistics
Statistical inference
Statistical Estimation
parameters.
Researchers are usually interested in looking at estimates
Cont..
Types of Estimation
1. Point Estimation
2. Interval Estimation
1. Point Estimation
Cont..
From a single sample we can calculate a sample
Cont..
• The problem is that two different samples are very likely to
result in different sample means, and thus there is some degree
of uncertainty involved.
• A point estimate does not provide any information about the
inherent variability of the estimator; we do not know how
close x̄ is to μ in any given situation.
Properties of a Good Estimator
a. Unbiasedness
A sample statistic whose mean is equal to the
population parameter it estimates is unbiased.
The sample mean is an unbiased estimator of the
population mean μ; the sample median is also unbiased
for μ when the distribution is symmetric.
b. Minimum variance
Among competing estimators, the one with the
minimum standard error is preferred.
For a symmetric (e.g. normal) distribution the mean
typically has the smaller standard error.
If the distribution is skewed, the median is often
the more stable measure of centre.
Cont..
c. Consistency
As sample size increases, variation of the
estimator from the true population value
decreases
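A small simulation can illustrate these three properties. It is a sketch under assumed conditions: data are drawn from a normal population with mean 50 and standard deviation 10 (made-up values), and `estimator_summary` is a hypothetical helper.

```python
import random
import statistics

def estimator_summary(estimator, n, repeats=2000, seed=7):
    """Average and standard deviation of an estimator over repeated samples."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(50, 10) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)

for n in (25, 100):
    for name, estimator in (("mean", statistics.fmean), ("median", statistics.median)):
        average, std_error = estimator_summary(estimator, n)
        print(f"n={n:3d} {name:6s} average={average:6.2f} standard error={std_error:5.2f}")
```

Both averages sit near 50 (unbiasedness for this symmetric population), the mean shows the smaller standard error (minimum variance), and both standard errors shrink when n grows from 25 to 100 (consistency).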
2. Interval estimation
• Interval estimation is a statement that a
population parameter has a value lying between
two specified limits.
An interval estimate provides more information
about a population characteristic than a point
estimate.
Because the value of the sample statistic varies from
sample to sample, a single-value estimate of the
parameter is generally not sufficient on its own.
Cont..
We need to take into account the sample to sample variation of
the statistic.
Cont..
Interval estimate (Confidence interval) -
consists of two numbers, a lower limit
and an upper limit which serve as the
bounding values within which the
parameter is expected to lie with a certain
degree of confidence.
Cont..
• A CI in general:
Takes into consideration variation in
sample statistics from sample to sample
Based on observation from one sample
Gives information about closeness to
unknown population parameters
Stated in terms of level of confidence
Never 100% sure
Cont..
• Confidence level: the degree of confidence that the interval
contains the unknown population parameter,
expressed as a percentage (less than 100%).
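$$ \left[\, \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\;\; \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \,\right] \quad \text{for estimating a mean} $$
If σ is unknown, it can be estimated by the sample standard deviation s.
$$ \left[\, p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\;\; p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \,\right] \quad \text{for estimating a proportion} $$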
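The two interval formulas can be written as small functions. This is a minimal sketch, assuming the large-sample (z-based) intervals; the function names are hypothetical, and the critical value is taken from Python's `statistics.NormalDist` rather than a printed table.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided standard normal critical value, e.g. about 1.96 for 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def ci_mean(xbar, sigma, n, confidence=0.95):
    """z interval for a mean; pass s for sigma when sigma is estimated."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

def ci_proportion(p_hat, n, confidence=0.95):
    """Large-sample z interval for a proportion."""
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical inputs, for illustration only.
print(ci_mean(xbar=120.0, sigma=15.0, n=100))
print(ci_proportion(p_hat=0.30, n=200))
```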
Cont..
Interpretation:
Cont..
For a given confidence level (e.g. 90%, 95%, 99%) the
width of the confidence interval depends on
the standard error of the estimate, which in turn
depends on the:
• sample size (n)
• standard deviation (σ)
You can make the precision as high as you want by
taking a large enough sample.
The margin of error is proportional to 1/√n, so it decreases as √n increases.
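A quick numerical check of this relationship, as a sketch with an assumed σ = 12 and the 95% critical value 1.96 (both chosen only for illustration):

```python
import math

def margin_of_error(sigma, n, z=1.96):
    """Margin of error of a z interval for a mean: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

# Quadrupling the sample size halves the margin of error.
for n in (25, 100, 400):
    print(f"n = {n:3d}  margin of error = {margin_of_error(12, n):.3f}")
```

Because the margin shrinks with √n rather than n, each further halving of the margin costs four times as many subjects.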
1) C.I. for a single population mean (normally distributed)
Example
A physical therapist wished to estimate, with 99% confidence,
the mean maximal strength of a particular muscle in a certain
group of individuals.
He assumes that strength scores are approximately normally
distributed with a variance of 144.
A sample of 15 subjects who participated in the experiment
yielded a mean of 84.3.
Solution:
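One way to work this example, as a sketch assuming the known-variance z interval and the table value $z_{0.005} \approx 2.576$ for 99% confidence:

```latex
\begin{aligned}
\sigma &= \sqrt{144} = 12, \qquad n = 15, \qquad \bar{x} = 84.3 \\
\text{standard error} &= \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{15}} \approx 3.098 \\
\text{99\% C.I.} &= \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
                 = 84.3 \pm 2.576 \times 3.098 \\
                &= 84.3 \pm 7.98 = (76.32,\; 92.28)
\end{aligned}
```

Under these assumptions, we are 99% confident that the population mean strength lies between about 76.3 and 92.3.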
E.g. 2. A random sample of 100 cancer patients
treated with a new drug has a mean survival time of
46.9 months.
3) C.I. for a population proportion (large sample size)
Exercise
Cont..
Find: a) 95%
b) 90%
c) 99% confidence intervals for the proportion of
all infected people in that locality during the
peak malaria transmission period.
Cont..
Solution:
Sample proportion p = 60 / 150 = 0.4
Standard error = √(0.4 × 0.6 / 150) = 0.04
a) A 95% C.I. for the population proportion (the proportion of
all infected people in that locality) = 0.4 ± 1.96 (0.04)
= (0.4 ± 0.078) = (0.322, 0.478).
b) A 90% C.I. = 0.4 ± 1.645 (0.04) = (0.4 ± 0.066) = (0.334, 0.466).
c) A 99% C.I. = 0.4 ± 2.576 (0.04) = (0.4 ± 0.103) = (0.297, 0.503).
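The same three intervals can be reproduced in a few lines. This sketch uses the figures from the solution (60 infected out of 150) and takes critical values from `statistics.NormalDist` instead of a table; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Large-sample z interval for a population proportion."""
    p_hat = successes / n
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z * std_error, p_hat + z * std_error

for confidence in (0.90, 0.95, 0.99):
    lower, upper = proportion_interval(60, 150, confidence)
    print(f"{confidence:.0%} C.I.: ({lower:.3f}, {upper:.3f})")
```

The interval widens as the confidence level rises, since a larger critical value multiplies the same standard error.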
Sample size determination
• How many subjects should be sampled from the larger
population to have a representative sample?
If too many…
• Shortage of resources
– Data collection
– Analysis
• Waste of resources
Cont..
If too few…
• May fail to detect an important effect
• Estimates of effect may be too imprecise (wide CI’s)
Cont..
Why is it important to consider sample size?
• In studies concerned with estimating some characteristic of a
population (e.g. the prevalence of asthmatic children), sample
size calculations are important to ensure that estimates are
obtained with required precision or confidence.
Cont..
• In planning any investigation we must decide how
many people need to be studied in order to answer
the study objectives
• In studies concerned with detecting an effect
– (e.g. a difference between two treatments, or the risk
of a diagnosis when a certain risk factor is present
versus absent),
Cont..
– Sample size calculations are important to ensure that
if an effect deemed to be clinically or biologically
important exists,
– Then, there is a high chance of it being detected
– i.e. that the analysis will be statistically significant.
Cont..
Sample size determination depends on the:
• Objective of the study
• Availability of resources
Incorrect sample size will lead to:
• Wrong conclusions
• Ethical problems
• Delay in completion
Sample size determination
• Given confidence interval
Sample size for single population proportion
Single population proportion
• Let p denote the proportion of success; then the minimum sample size is
$$ n = \frac{Z_{\alpha/2}^{2}\; p\,(1 - p)}{d^{2}} $$
Cont..
Where:
n is the minimum sample size
p is an estimate of the prevalence rate for the population
(if it is unknown we use 50%)
d is the margin of sampling error tolerated
Zα/2 is the standard normal value at the (1-α)100%
confidence level; α is usually 5%, giving Zα/2 = 1.96
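With the quantities defined above, the single-proportion sample size n = Z²α/2 · p(1 − p) / d² can be sketched as a function. The function name, the default values and the example inputs are illustrative assumptions; the result is rounded up because a fraction of a subject cannot be sampled.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p=0.5, d=0.05, confidence=0.95):
    """Minimum n for estimating a single proportion to within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z at alpha/2
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

# Illustrative inputs: prevalence 20%, margin of error 5%, 95% confidence.
print(sample_size_proportion(p=0.20, d=0.05))
# Unknown prevalence: p = 0.5 gives the largest (most conservative) n.
print(sample_size_proportion(p=0.50, d=0.05))
```

Using p = 0.5 when the prevalence is unknown maximizes p(1 − p), so the resulting n is large enough whatever the true proportion turns out to be.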
Point to be considered
Example
Exercise
Cont..
= 3.8416 × 1600 / 25
= 245.8624 ≈ 246