Quantitative Methods - 1

The document discusses sampling methods and sampling distributions. It describes different types of random sampling methods like simple random sampling, stratified sampling, and cluster sampling. It also discusses key concepts related to sampling distributions like the central limit theorem and how sample statistics can estimate population parameters. Estimators like the sample mean and sample proportion are discussed as the most common estimators of the population mean and proportion respectively.

Uploaded by

Sakshi

Sampling and Sampling Distributions

QAM – I by Shalabh Singh


Sampling Methods

Sampling methods can be:

• Random (each member of the population has an equal chance of being selected)
• Non-random

The actual process of sampling causes sampling errors. For example, the sample may not be large enough or representative of the population. Factors not related to the sampling process cause non-sampling errors. A defective counting device, for example, can cause a non-sampling error.

Random Sampling Methods

• Simple random sample (each sample of the same size has an equal chance of being selected)
• Stratified sample (divide the population into groups called strata and then take a random sample from each stratum)
• Cluster sample (divide the population into groups called clusters and then randomly select some of the clusters; all the members of the selected clusters are in the cluster sample)
• Systematic sample (randomly select a starting point and take every n-th piece of data from a listing of the population)
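
The four random sampling methods listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the population of 100 numbered units, the five equal groups (used both as strata and as clusters), the function names, and the seed are all assumptions made for the example, not part of the slides.

```python
import random

def simple_random_sample(population, k, rng):
    # Every subset of size k is equally likely to be drawn.
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    # Take a simple random sample from every stratum.
    return [x for group in strata.values() for x in rng.sample(group, k_per_stratum)]

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters and keep all of their members.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [x for name in chosen for x in clusters[name]]

def systematic_sample(population, step, rng):
    # Random starting point among the first `step` units, then every step-th unit.
    start = rng.randrange(step)
    return population[start::step]

rng = random.Random(0)                      # fixed seed so the run is repeatable
population = list(range(1, 101))            # hypothetical population of 100 units
groups = {f"g{i}": population[i * 20:(i + 1) * 20] for i in range(5)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(groups, 2, rng)
clus = cluster_sample(groups, 2, rng)
syst = systematic_sample(population, 10, rng)
print(len(srs), len(strat), len(clus), len(syst))
```

Note the difference between the stratified and the cluster sample: the first draws a few units from every group, the second keeps every unit from a few groups.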
The Literary Digest Poll (1936)

[Figure: two panels, each showing a population of Democrats and Republicans. Unbiased sample: an unbiased, representative sample drawn at random from the entire population. Biased sample: a biased, unrepresentative sample drawn from people who have cars and/or telephones and/or read the Digest.]

Sampling Distributions

• Concept of Sampling Distribution
• Distributions of Sample Mean and Sample Proportion
• Central Limit Theorem
• t distribution

Using Statistics

• Statistical Inference:

On the basis of sample statistics derived from limited and incomplete sample information:
✓ Predict and forecast values of population parameters...
✓ Test hypotheses about values of population parameters...
✓ Make decisions...

On the basis of observations of a sample, a part of a population:
✓ Make generalizations about the characteristics of a population...

Sample Statistics as Estimators of Population Parameters

• A sample statistic is a numerical measure of a summary characteristic of a sample.
• A population parameter is a numerical measure of a summary characteristic of a population.
• An estimator of a population parameter is a sample statistic used to estimate or predict the population parameter.
• An estimate of a parameter is a particular numerical value of a sample statistic obtained through sampling.
• A point estimate is a single value used as an estimate of a population parameter.

Estimators

• The sample mean, $\bar{X}$, is the most common estimator of the population mean, $\mu$.
• The sample variance, $s^2$, is the most common estimator of the population variance, $\sigma^2$.
• The sample standard deviation, $s$, is the most common estimator of the population standard deviation, $\sigma$.
• The sample proportion, $\hat{p}$, is the most common estimator of the population proportion, $\pi$.

Parameter and Statistic
• Parameter:
▪ A statistical measure computed using the population observations.
▪ Let $X_1, X_2, \ldots, X_N$ be the population units.
▪ Population mean: $\mu = \dfrac{1}{N} \sum_{i=1}^{N} X_i$
▪ Population variance: $\sigma^2 = \dfrac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2$

• Statistic:
▪ A statistical measure computed using the sample observations.
▪ Let $x_1, x_2, \ldots, x_n$ be the sample units.
▪ Sample mean: $\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$
▪ Sample variance: $s^2 = \dfrac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$ or $s_1^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
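
As a small illustration of the formulas above, the sketch below computes the population mean and variance and both versions of the sample variance with Python's standard `statistics` module. The four population values and the two-unit sample are illustrative numbers chosen for this example; `pvariance` divides by the number of observations and `variance` divides by one less than that.

```python
import statistics

# Illustrative population of N = 4 units (assumed values for this sketch).
population = [18, 20, 22, 24]
mu = statistics.fmean(population)           # population mean
sigma2 = statistics.pvariance(population)   # divisor N

# One possible sample of n = 2 units from that population.
sample = [18, 22]
xbar = statistics.fmean(sample)             # sample mean
s2 = statistics.pvariance(sample)           # divisor n
s1_2 = statistics.variance(sample)          # divisor n - 1

print(mu, sigma2, xbar, s2, s1_2)
```

With only two observations the two sample variances differ by a factor of two, which shows why the choice of divisor matters for small samples.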

• In practice, parameter values are not known.
• They are estimated using sample observations.
• Parameter values are fixed.
• The value of a statistic varies from sample to sample.
• Unbiased estimator
▪ If E(statistic) = parameter,
▪ then the statistic is said to be an unbiased estimator of the parameter.
▪ The sample mean is an unbiased estimator of the population mean.

Sampling Distributions
• Unknown parameters are estimated using sample observations.
• Parameter values are fixed.
• The value of a statistic varies from sample to sample.
• Each sample has some probability of being chosen.
• So each value of a statistic is associated with a probability.
• A statistic is therefore a random variable.
• The distribution of a statistic is called its sampling distribution.
• The distribution of a statistic may not be the same as the distribution of the population.
A Population Distribution, a Sample from a Population, and the Population and Sample Means

[Figure: frequency distribution of the population with the population mean ($\mu$) marked; below it, the sample points (shown as X marks) with the sample mean ($\bar{X}$) marked.]

A Small Case
Sampling Distribution of $\bar{x}$

• Process of Statistical Inference

1. Start with a population whose mean $\mu$ is unknown ($\mu = ?$).
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample mean $\bar{x}$.
4. The value of $\bar{x}$ is used to make inferences about the value of $\mu$.

• Let us consider the following population of size 4:
• 18, 20, 22, 24
• Population mean = (18 + 20 + 22 + 24) / 4 = 21
• Population variance = [(18 − 21)² + (20 − 21)² + (22 − 21)² + (24 − 21)²] / 4 = 5
• Consider all possible samples of size 2, drawn with replacement (4 × 4 = 16 equally likely ordered samples)

• The value of the sample mean depends on the chosen sample.
• Each sample is chosen with a certain probability.
• So each possible value of the sample mean is associated with some probability.
• The distribution of the sample mean is the list of all its possible values along with the corresponding probabilities.

Sample mean    18     19     20     21     22     23     24
Probability    1/16   2/16   3/16   4/16   3/16   2/16   1/16
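
The table above can be reproduced by brute-force enumeration. The sketch below assumes the samples of size 2 are ordered and drawn with replacement, which is what the sixteenths in the table imply; exact fractions are used, so probabilities print in lowest terms (for example 2/16 appears as 1/8).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]
n = 2

# All ordered samples of size n drawn with replacement: 4 * 4 = 16 samples.
samples = list(product(population, repeat=n))

# Count how many samples give each value of the sample mean.
counts = Counter(Fraction(sum(s), n) for s in samples)
dist = {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

for mean, prob in dist.items():
    print(mean, prob)

# Mean and variance of the sampling distribution of the sample mean.
expected = sum(m * p for m, p in dist.items())
variance = sum(m ** 2 * p for m, p in dist.items()) - expected ** 2
print(expected, variance)
```

The last line gives the expected value and variance of the sample mean directly from its distribution.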

• In other words, the statistic T = $\bar{x}$ (sample mean) can be considered as a random variable.
• The distribution of T is given by the following table:

t       P(T = t)   t·P(T = t)   t²·P(T = t)
18      1/16       1.125        20.250
19      2/16       2.375        45.125
20      3/16       3.750        75.000
21      4/16       5.250        110.250
22      3/16       4.125        90.750
23      2/16       2.875        66.125
24      1/16       1.500        36.000
Total   1          21.000       443.500

E(T) = 21
E(T²) = 443.5
Var(T) = E(T²) − [E(T)]² = 443.5 − 441 = 2.5

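• In general, $E(\bar{x}) = \mu$ and $\mathrm{Var}(\bar{x}) = \sigma^2 / n$.
• For independent observations $x_1, \ldots, x_n$, each with mean $\mu$ and variance $\sigma^2$, $E(\bar{x})$ and $\mathrm{Var}(\bar{x})$ can also be obtained as follows:

$E(\bar{x}) = E\left(\dfrac{1}{n} \sum_{i=1}^{n} x_i\right) = \dfrac{1}{n} \sum_{i=1}^{n} E(x_i) = \dfrac{1}{n} \sum_{i=1}^{n} \mu = \dfrac{1}{n} \, n\mu = \mu$

$\mathrm{Var}(\bar{x}) = \mathrm{Var}\left(\dfrac{1}{n} \sum_{i=1}^{n} x_i\right) = \dfrac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(x_i) = \dfrac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \dfrac{1}{n^2} \, n\sigma^2 = \dfrac{\sigma^2}{n}$

• Common notation:
$\mu_{\bar{x}} = E(\bar{x}) = \mu$, $\sigma^2_{\bar{x}} = \mathrm{Var}(\bar{x}) = \sigma^2 / n$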
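
These two results can also be checked by simulation. The sketch below is an illustration only: the population (the six faces of a fair die), the sample size, the number of repetitions, and the seed are assumptions chosen for the example. Sampling is with replacement, so the draws are independent.

```python
import random
import statistics

rng = random.Random(42)              # fixed seed so the run is repeatable
faces = [1, 2, 3, 4, 5, 6]           # hypothetical population: a fair die
mu = statistics.fmean(faces)         # 3.5
sigma2 = statistics.pvariance(faces) # 35/12, about 2.917

n = 10        # sample size
reps = 20000  # number of simulated samples

# Draw many independent samples of size n and record each sample mean.
means = [statistics.fmean(rng.choices(faces, k=n)) for _ in range(reps)]

sim_mean = statistics.fmean(means)
sim_var = statistics.pvariance(means)
print(f"theory:     mean={mu:.4f}  variance={sigma2 / n:.4f}")
print(f"simulation: mean={sim_mean:.4f}  variance={sim_var:.4f}")
```

The simulated values should land close to, but not exactly on, the theoretical ones, and the agreement tightens as the number of repetitions grows.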

Relationships between Population Parameters and the Sampling Distribution of the Sample Mean

The expected value of the sample mean is equal to the population mean:

$E(\bar{X}) = \mu_{\bar{X}} = \mu_X$

The variance of the sample mean is equal to the population variance divided by the sample size:

$V(\bar{X}) = \sigma^2_{\bar{X}} = \dfrac{\sigma^2_X}{n}$

The standard deviation of the sample mean, known as the standard error of the mean, is equal to the population standard deviation divided by the square root of the sample size:

$SD(\bar{X}) = \sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{n}}$

Standard Error
• Different samples of the same size from the same population will yield different sample means.
• A measure of the variability in the different values of the sample mean is given by the standard error of the sample mean:
$\text{standard error}(\bar{x}) = \sigma_{\bar{x}} = \sqrt{\mathrm{Var}(\bar{x})} = \sigma / \sqrt{n}$
• The standard error of a statistic is the standard deviation of its sampling distribution.
• In our example, $\sigma_{\bar{x}} = \sqrt{2.5} = 1.5811$.
• The standard error decreases when the sample size is increased.
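
To see how quickly the standard error shrinks, the short sketch below evaluates it for a few sample sizes, using the population standard deviation √5 from the small example; the particular sample sizes and the helper name `standard_error` are assumptions chosen for illustration.

```python
import math

def standard_error(sigma, n):
    # Standard error of the sample mean: sigma / sqrt(n).
    return sigma / math.sqrt(n)

sigma = math.sqrt(5)  # population standard deviation of 18, 20, 22, 24

# Each sample size is four times the previous one.
table = {n: standard_error(sigma, n) for n in (2, 8, 32, 128)}
for n, se in table.items():
    print(f"n={n:3d}  standard error={se:.4f}")
```

Because of the square root, the sample size has to be quadrupled to halve the standard error.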
Sampling from a Normal Population

When sampling from a normal population with mean $\mu$ and standard deviation $\sigma$, the sample mean, $\bar{X}$, has a normal sampling distribution:

$\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$

This means that, as the sample size increases, the sampling distribution of the sample mean remains centered on the population mean, but becomes more compactly distributed around that population mean.

[Figure: Sampling Distribution of the Sample Mean. Density curves of the normal population and of the sampling distributions for n = 2, n = 4 and n = 16.]

Central Limit Theorem
When the population from which we are selecting a random sample does not have a normal distribution, the central limit theorem is helpful in identifying the shape of the sampling distribution of $\bar{x}$.

CENTRAL LIMIT THEOREM

In selecting random samples of size n from a population, the sampling distribution of the sample mean can be approximated by a normal distribution as the sample size becomes large.

Central Limit Theorem
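• When the population distribution is normal with mean μ and standard deviation σ,
• then $\bar{x}$ is exactly normal with mean μ and standard deviation $\sigma / \sqrt{n}$, for any sample size n.
• When the population distribution is not normal (but has a finite variance),
• then $\bar{x}$ is still approximately normal with mean μ and standard deviation $\sigma / \sqrt{n}$, and the approximation improves as n → ∞.
• As a practical rule of thumb, the approximation is considered adequate for n ≥ 30.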
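
A small simulation can make the central limit theorem concrete. The sketch below is an illustration under assumed settings (an exponential population with mean 1, 5,000 repetitions per sample size, a fixed seed, and a hand-written `skewness` helper); it tracks how the skewness of the sample mean shrinks toward 0, the value for a normal distribution, as n grows.

```python
import random
import statistics

def skewness(values):
    # Mean of the cubed standardized deviations; 0 for a symmetric distribution.
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(1)  # fixed seed so the run is repeatable
reps = 5000             # simulated samples per sample size

results = {}
for n in (2, 5, 30):
    # Each sample mean averages n draws from an exponential population with mean 1.
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    results[n] = (statistics.fmean(means), statistics.pstdev(means), skewness(means))

for n, (m, s, g) in results.items():
    print(f"n={n:2d}  mean={m:.3f}  sd={s:.3f}  skewness={g:.3f}")
```

The mean of the sample means stays near the population mean 1, the spread falls roughly like 1/√n, and the skewness drops as n increases, although it is still visibly positive at n = 30 for a population this skewed.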

1,800 Randomly Selected Values from an Exponential Distribution

[Figure: Distribution of Sample Mean. Histograms of the sample mean $\bar{x}$ for n = 2, n = 5 and n = 30.]

1,800 Randomly Selected Values from a Uniform Distribution

[Figure: frequency histogram of the 1,800 values of X, ranging from 0.0 to 5.0.]

[Figure: Distribution of Sample Mean. Frequency histograms of the sample mean $\bar{x}$ for n = 2, n = 5 and n = 30.]
• Example:
• Suppose a population has mean μ = 8 and standard deviation σ = 3.
• Suppose a random sample of size n = 36 is selected.
• What is the probability that the sample mean is between 7.75 and 8.25?
• Even if the population is not normally distributed, the central limit theorem can be used (n > 30).
• So the distribution of the sample mean is approximately normal with mean 8 and standard error 3/√36 = 3/6 = 0.5.
• That is, $\bar{x} \sim N(8, (3/6)^2)$ in mean-variance notation.
• $P[7.75 \le \bar{x} \le 8.25] = P[-0.5 \le Z \le 0.5] \approx 0.3829$
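
The probability in this example can be evaluated with `statistics.NormalDist` from the Python standard library. This is a minimal sketch that relies on the normal approximation from the central limit theorem; the variable names are chosen for illustration.

```python
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Standard error of the sample mean: sigma / sqrt(n) = 3 / 6 = 0.5.
se = sigma / n ** 0.5

# Approximate sampling distribution of the sample mean.
# NormalDist takes the mean and the standard deviation (not the variance).
xbar = NormalDist(mu, se)

p = xbar.cdf(8.25) - xbar.cdf(7.75)
print(f"standard error = {se}, P(7.75 <= xbar <= 8.25) = {p:.4f}")
```

The interval is half a standard error on either side of the mean, so the result is the standard normal probability between −0.5 and 0.5.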
• Example:
• After deducting grants based on need, the average cost to attend the University of Southern California (USC) is $27,175 (U.S. News & World Report, America’s Best Colleges, 2009 ed.). Assume the population standard deviation is $7,400. Suppose that a random sample of 60 USC students will be taken from this population.
• a. What is the value of the standard error of the mean?
• b. What is the probability that the sample mean will be more than $27,175?
• Ans: a. 7400/√60 ≈ $955; b. 0.50 (the sampling distribution is symmetric about the population mean)
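
Both answers can be checked numerically. The sketch below assumes the sample mean is approximately normal (n = 60 is large enough for the central limit theorem); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 27175, 7400, 60

# a. Standard error of the mean.
se = sigma / n ** 0.5

# b. Probability that the sample mean exceeds the population mean.
p_above = 1 - NormalDist(mu, se).cdf(mu)

print(f"standard error = {se:.2f}, P(xbar > {mu}) = {p_above:.2f}")
```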

• Example:
Barron’s reported that the average number of weeks an individual is unemployed is 17.5 weeks (Barron’s, February 18, 2008). Assume that for the population of all unemployed individuals the population mean length of unemployment is 17.5 weeks and that the population standard deviation is 4 weeks. What is the probability that a simple random sample of 50 unemployed individuals will provide a sample mean within 1 week of the population mean?

• Ans: .9198 (using the standard error rounded to 0.57, so z = 1.75 in the normal table; without intermediate rounding the probability is about .923)
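
The sketch below works through this example twice: once without intermediate rounding, and once with z = 1.75, which is an assumption about how a table-based answer of .9198 would arise (standard error rounded to 0.57). The variable names are illustrative.

```python
from statistics import NormalDist

sigma, n = 4, 50
margin = 1  # "within 1 week of the population mean"

# Standard error of the mean and the corresponding z value.
se = sigma / n ** 0.5
z = margin / se

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(-z <= Z <= z), first with the unrounded z, then with the table value 1.75.
p_exact = 2 * std_normal.cdf(z) - 1
p_table = 2 * std_normal.cdf(1.75) - 1

print(f"se = {se:.4f}, z = {z:.4f}")
print(f"unrounded: {p_exact:.4f}, table-style: {p_table:.4f}")
```

The two results differ only in the third decimal place, which is typical of the gap between table lookups and direct computation.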
