
Class4 Newbold Chap06 Spring2024 01

Chapter 6 of 'Statistics for Business and Economics' focuses on sampling and sampling distributions, explaining the concepts of populations, samples, and the importance of sampling in statistical analysis. It discusses simple random samples, classical statistical inference, and the central limit theorem, highlighting how sample statistics can be used to make inferences about population parameters. Additionally, the chapter covers the estimation process, sampling distributions for means and proportions, and the conditions under which normal approximations can be applied.

Uploaded by

Ava Robledo
Copyright: © All Rights Reserved

Statistics for Business and Economics

Chapter 6

Sampling and Sampling Distributions

Tools of Statistics

• Descriptive statistics
  - Collecting, presenting, and describing data

• Inferential statistics
  - Drawing conclusions and/or making decisions concerning a population based only on sample data; called Classical Statistical Inference

Ch 6
Populations and Samples

• A Population is the set of all items or individuals of interest
  - Examples: All likely voters in the next election
  - All parts produced today
  - All sales receipts for November

• A Sample is a subset of the population
  - Examples: 1000 voters selected at random for interview
  - A few parts selected for destructive testing
  - Random receipts selected for audit

Population vs. Sample

[Figure: the population is shown as the letters a through z; the sample is the subset b, c, g, i, n, o, r, u, y drawn from it.]

Why Sample?

• Less time consuming and less costly than a census
• It is possible to obtain statistical results of sufficiently high precision based on samples.

Simple Random Samples

• Suppose that we want to select a sample of n objects from a population of N objects, where N is 'large'.

Simple Random Samples

• A simple random sample is selected such that every object has an equal probability of being selected and the objects are selected independently: the selection of one object does not change the probability of selecting any other object.

Simple Random Samples

• Simple random samples are the ideal sample.
• In a number of real-world sampling studies, analysts develop alternative sampling procedures to lower the costs of sampling.
• The basis for judging whether these strategies are acceptable is how closely they approximate a simple random sample.

Simple Random Samples

• Every object in the population has an equal chance of being selected
• Objects are selected independently
• Samples can be obtained from a table of random numbers or computer random number generators
• A simple random sample is the ideal against which other sampling methods are compared

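The selection rule described above can be sketched with a computer random number generator. This is a minimal illustration, not part of the original slides: the receipt IDs, the sample size, and the seed are hypothetical. The sketch draws without replacement, which only approximates independent selection when N is large relative to n.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct objects; every object is equally likely to be chosen."""
    rng = random.Random(seed)          # the seed only makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement


# Hypothetical population: receipt IDs 1..10000 (N is large relative to n)
receipts = list(range(1, 10001))
audit_sample = simple_random_sample(receipts, n=25, seed=2024)
print(audit_sample)
```
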
Classical Statistical Inference

• Making statements about a population by examining (random) sample results

Sample statistics (computed from the sample, e.g., the sample mean) → Inference (why, and how?) → Population parameters (unknown, but can be estimated from sample evidence, e.g., the population mean)

[Figure: a sample drawn from within a population.]

Classical Statistical Inference

Drawing conclusions and/or making decisions concerning a population based on sample results.

• Estimation
  - e.g., estimate the population mean weight using the sample mean weight
• Hypothesis Testing
  - e.g., use sample evidence to test the claim that the population mean weight is 120 pounds

Sampling Distributions

• A sampling distribution is the distribution of all of the possible values of a statistic for a given sample size selected from a population

Chapter Outline

Sampling Distributions
• Sampling Distribution of the Sample Mean
• Sampling Distribution of the Sample Proportion
• Sampling Distribution of the Sample Variance

Sampling Distributions of Sample Means

Estimation Process

• The population mean μ is unknown.
• Draw a random sample from the population and compute the sample mean, e.g., $\bar{X} = 50$.
• REDO, many times: new random samples give $\bar{X} = 41$, $\bar{X} = 60$, $\bar{X} = 55$, $\bar{X} = 48$, …
• Obtain the mean of $\bar{X}$ and the variance of $\bar{X}$.
• $\bar{X}$ is a random variable.

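The "redo, many times" idea can be sketched as a small simulation. The population below is hypothetical (values roughly uniform between 0 and 100), as are the sample size, the number of repetitions, and the seed. The point is only to show that the sample mean varies from sample to sample, with a mean close to μ and a variance close to σ²/n.

```python
import random
import statistics

rng = random.Random(6)  # arbitrary seed, for reproducibility

# Hypothetical population; in practice mu and sigma^2 would be unknown
population = [rng.uniform(0, 100) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population)

n = 30        # size of each random sample
reps = 5_000  # how many times the sampling is repeated

# One sample mean per repetition: each is a draw of the random variable X-bar
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)

print("population mean:", round(mu, 2), " mean of sample means:", round(mean_of_means, 2))
print("sigma^2 / n:", round(sigma2 / n, 2), " variance of sample means:", round(var_of_means, 2))
```
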
Results for the Sampling Distribution of the Sample Mean

Let $\bar{X}$ denote the sample mean of a random sample of n observations from a population with mean $\mu$ and variance $\sigma^2$. Then

1. The sampling distribution of $\bar{X}$ has mean
$$E(\bar{X}) = \mu$$
2. The sampling distribution of $\bar{X}$ has standard deviation
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
This is called the standard error of $\bar{X}$.
3. If the population distribution is normal, then the random variable
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$
has a standard normal distribution with mean 0 and variance 1.

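Results 1 and 2 follow from a short derivation, sketched here under the assumption that the n observations $X_1, \dots, X_n$ are independent draws from the same population:

```latex
\begin{aligned}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\, n\mu = \mu \\[4pt]
\operatorname{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
            = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n} \\[4pt]
\sigma_{\bar{X}} &= \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}
\end{aligned}
```

The variance step uses independence: the variance of a sum of independent random variables is the sum of their variances.
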
Recall: Central Limit Theorem

• Let $X_1, X_2, \ldots, X_n$ be a set of n independent random variables having identical distributions with mean $\mu$ and variance $\sigma^2$, with $X$ as the sum and $\bar{X}$ as the mean of these random variables. As n becomes large, the central limit theorem states that the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{X - n\mu}{\sqrt{n\sigma^2}}$$
approaches the standard normal distribution.

If the Population is not Normal

• We can apply the Central Limit Theorem:
  - Even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough.

Properties of the sampling distribution:

$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

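If the Population is not Normal
(continued)

Sampling distribution properties:

• Central tendency: $E[\bar{x}] = \mu_{\bar{x}} = \mu$
• Variation: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

[Figure: a non-normal population distribution with mean μ, and below it the sampling distribution of the sample mean, which becomes normal as n increases; the curve for the larger sample size is narrower than the curve for the smaller sample size.]
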
How Large is Large Enough?

• For most distributions, n > 25 will give a sampling distribution that is nearly normal
  NOTE: For econometrics applications, n > 120.
• For normal population distributions, the sampling distribution of the mean is always normally distributed

Intervals

• Goal: determine a range within which sample means are likely to occur, given a population mean and variance
• By the Central Limit Theorem, we know that the distribution of $\bar{X}$ is approximately normal if n is large enough, with mean μ and standard deviation $\sigma_{\bar{X}}$
• Let $z_{\alpha/2}$ be the z-value that leaves area α/2 in the upper tail of the normal distribution (i.e., the interval $-z_{\alpha/2}$ to $z_{\alpha/2}$ encloses probability 1 – α)
• Then
$$\mu \pm z_{\alpha/2}\,\sigma_{\bar{X}}$$
is the interval that includes $\bar{X}$ with probability 1 – α

[Figure: standard normal density with central area 1 – α between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area α/2 in each tail.]

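As a rough sketch of the interval above, the following computes μ ± z(α/2)·σ/√n using the standard library's normal distribution. The function name and the numbers (μ = 50, σ = 10, n = 25, α = 0.05) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_interval(mu, sigma, n, alpha=0.05):
    """Range that contains the sample mean with probability 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z-value leaving alpha/2 in the upper tail
    standard_error = sigma / sqrt(n)         # standard deviation of the sample mean
    return mu - z * standard_error, mu + z * standard_error


# Hypothetical population values
low, high = sample_mean_interval(mu=50, sigma=10, n=25, alpha=0.05)
print(round(low, 2), round(high, 2))
```

With these inputs the standard error is 2, so the interval is about 50 ± 1.96 × 2.
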
Sampling Distributions of Sample Proportions

Population Proportions, P

Recall the binomial distribution:
P = the proportion of the population having some characteristic

• Sample proportion ($\hat{P}$) provides an estimate of P:
$$\hat{P} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$
• $0 \le \hat{P} \le 1$
• X has a binomial distribution, so $\hat{P} = X/n$ follows a rescaled binomial distribution, but it can be approximated by a normal distribution when nP(1 – P) > 9

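Sampling Distribution of $\hat{P}$

• Normal approximation:

[Figure: sampling distribution of $\hat{P}$, with probability $P(\hat{P})$ on the vertical axis (0 to .3) and $\hat{P}$ on the horizontal axis (0 to 1).]

Properties:
$$E(\hat{P}) = P \quad \text{and} \quad \sigma_{\hat{P}}^2 = \operatorname{Var}\!\left(\frac{X}{n}\right) = \frac{P(1-P)}{n}$$
(where P = population proportion)
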
Z-Value for Proportions

Standardize $\hat{P}$ to a Z value with the formula:
$$Z = \frac{\hat{P} - P}{\sigma_{\hat{P}}} = \frac{\hat{P} - P}{\sqrt{\dfrac{P(1-P)}{n}}}$$

Sampling Distribution of the Sample Proportion

Let $\hat{P}$ denote the sample proportion of successes in a random sample from a population with proportion of success P. Then

1. The sampling distribution of $\hat{P}$ has mean P
$$E(\hat{P}) = P$$
2. The sampling distribution of $\hat{P}$ has standard deviation
$$\sigma_{\hat{P}} = \sqrt{\frac{P(1-P)}{n}}$$
3. If the sample size is large, the random variable
$$Z = \frac{\hat{P} - P}{\sigma_{\hat{P}}}$$
is approximately distributed as a standard normal. The approximation is good if
$$nP(1-P) > 9.$$

Example

• If the true (population) proportion of voters who support Proposition A is P = .4, what is the probability that a sample of size 200 yields a sample proportion between .40 and .45?
• i.e.: if P = .4 and n = 200, what is $P(.40 \le \hat{P} \le .45)$?

Find $\sigma_{\hat{P}}$:
$$\sigma_{\hat{P}} = \sqrt{\frac{P(1-P)}{n}} = \sqrt{\frac{.4(1-.4)}{200}} = .03464$$

Convert to standard normal:
$$P(.40 \le \hat{P} \le .45) = P\!\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$$

Use standard normal table: P(0 ≤ Z ≤ 1.44) = .4251

[Figure: the sampling distribution of $\hat{P}$ with the area between .40 and .45 shaded, standardized to the standard normal distribution with the area .4251 between Z = 0 and Z = 1.44 shaded.]

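The example's arithmetic can be checked with a few lines of Python. This is a sketch that uses the standard library's normal CDF in place of the printed table; the variable names are illustrative. Rounding z to two decimals mimics the table lookup, while the unrounded z gives a slightly different value.

```python
from math import sqrt
from statistics import NormalDist

P, n = 0.4, 200                       # population proportion and sample size from the example
sigma_p_hat = sqrt(P * (1 - P) / n)   # standard deviation of the sample proportion

z_low = (0.40 - P) / sigma_p_hat
z_high = (0.45 - P) / sigma_p_hat

std_normal = NormalDist()             # standard normal: mean 0, standard deviation 1
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Same calculation with z rounded to two decimals, as when reading a table
prob_table = std_normal.cdf(round(z_high, 2)) - std_normal.cdf(round(z_low, 2))

print(round(sigma_p_hat, 5), round(z_high, 2), round(prob_table, 4), round(prob, 4))
```
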
Sampling Distributions of Sample Variances

Sample Variance

• Let $x_1, x_2, \ldots, x_n$ be a random sample from a population. The sample variance is
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
• The square root of the sample variance is called the sample standard deviation
• The sample variance is different for different random samples from the same population

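A minimal sketch of the sample variance formula, written out term by term and compared with the standard library's `statistics.variance`, which also divides by n – 1. The data values are hypothetical.

```python
import statistics


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


data = [7, 8, 9, 12, 4]      # hypothetical sample
s2 = sample_variance(data)   # sample variance
s = s2 ** 0.5                # sample standard deviation
print(s2, round(s, 4), statistics.variance(data))
```
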
Digression:
Concept of Degrees of Freedom (df)

Idea: Number of observations that are free to vary after the sample mean has been calculated

Example: Suppose the mean of 3 numbers is known to be 8.0

• Let X1 = 7 and let X2 = 8. What is X3?
• If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)

Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary for a given mean)

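The arithmetic behind the example can be sketched as follows: once the mean and n – 1 of the values are fixed, the remaining value is forced. The helper name `last_value` is hypothetical.

```python
def last_value(known_values, mean, n):
    """Value forced on the n-th observation once the mean and n - 1 values are fixed."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 values must be supplied")
    # The n values must sum to n * mean, so the last one is whatever is left over
    return n * mean - sum(known_values)


x3 = last_value([7, 8], mean=8.0, n=3)
print(x3)
```
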
Sampling Distribution of Sample Variances

• What is the sampling distribution of the sample variance?
• Can it be normal?
• Well, it is a sum of squares…

The Chi-Square, $\chi^2$ from N(0,1)

• Let Z ~ N(0,1) be a normally distributed random variable with zero mean and unit variance.
• For one degree of freedom, this means that:
  - Draw from the normal and square it: $Z^2 \sim \chi^2_1$
  - i.e., the $\chi^2$ distribution with 1 degree of freedom is the same as the distribution of the square of a single standard normally distributed quantity.

[Figure: density of N(0,1) (top) and the density of $\chi^2_1$ obtained by squaring draws from it (bottom).]

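The "draw from the normal and square it" step can be sketched as a simulation. The seed and the number of draws are arbitrary. The check relies on two known facts: a chi-square variable with 1 degree of freedom has mean 1, and P(Z² ≤ 1) equals P(–1 ≤ Z ≤ 1), about 0.6827.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(1)  # arbitrary seed, for reproducibility

# Square independent N(0,1) draws; each square is one chi-square(1) draw
draws = [rng.gauss(0, 1) ** 2 for _ in range(100_000)]

mean_sq = fmean(draws)                                       # should be near 1
share_below_1 = sum(d <= 1 for d in draws) / len(draws)      # estimate of P(Z^2 <= 1)
theory = NormalDist().cdf(1) - NormalDist().cdf(-1)          # P(-1 <= Z <= 1)

print(round(mean_sq, 3), round(share_below_1, 4), round(theory, 4))
```
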
The Chi-square Distribution

• The chi-square distribution is a family of distributions, depending on degrees of freedom (NOT SYMMETRIC):
• d.f. = n – 1

[Figure: chi-square densities for d.f. = 1, d.f. = 5, and d.f. = 15, each plotted against $\chi^2$ from 0 to 28.]

• Text Appendix Table 7 contains chi-square probabilities

[Figure: CDF of the chi-square distribution.]

Sampling Distribution of Sample Variances

• The sampling distribution of $s^2$, where $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$, has mean $\sigma^2$:
$$E(s^2) = \sigma^2$$
• If the population distribution is normal, then
$$\operatorname{Var}(s^2) = \frac{2\sigma^4}{n-1}$$

Sampling Distribution of Sample Variances

• If the population distribution is normal, then
$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$
has a $\chi^2$ distribution with n – 1 degrees of freedom.

Chi-square Example

• A commercial freezer must hold a selected temperature with little variation. Specifications call for a standard deviation of no more than 4 degrees (a variance of 16 degrees²).
• A sample of 14 freezers is to be tested
• What is the upper limit (K) for the sample variance such that the probability of exceeding this limit, given that the population standard deviation is 4, is less than 0.05?
$$P(s^2 > K) = 0.05\,?$$

Finding the Chi-square Value

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
is chi-square distributed with (n – 1) = 13 degrees of freedom, $\chi^2_{n-1}$

• Use the chi-square distribution with area 0.05 in the upper tail:

$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)

[Figure: chi-square density with upper-tail probability α = .05 to the right of $\chi^2_{13} = 22.36$.]

Chi-square Example
(continued)

$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)

So:
$$P(s^2 > K) = P\!\left(\frac{(n-1)s^2}{16} > \chi^2_{13}\right) = 0.05$$
or
$$\frac{(n-1)K}{16} = 22.36 \quad \text{(where n = 14)}$$
so
$$K = \frac{(22.36)(16)}{(14-1)} = 27.52$$

If $s^2$ from the sample of size n = 14 is greater than 27.52, there is strong evidence to suggest the population variance exceeds 16.

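The limit K can be recomputed from the table value, and the 0.05 exceedance probability checked by simulation. This is a sketch under stated assumptions: the critical value 22.36 is taken from the slides rather than computed, the freezer temperatures are simulated as normal with standard deviation 4, and the mean of 0, the seed, and the number of trials are arbitrary.

```python
import random

n = 14               # sample size from the example
sigma2 = 16          # population variance under the specification
chi2_crit = 22.36    # upper 5% point of chi-square with 13 d.f., from the table

K = chi2_crit * sigma2 / (n - 1)   # upper limit for the sample variance


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


rng = random.Random(13)  # arbitrary seed, for reproducibility
trials = 20_000
exceed = 0
for _ in range(trials):
    # Simulated temperature deviations; the mean of 0 does not affect the variance
    sample = [rng.gauss(0, 4) for _ in range(n)]
    if sample_variance(sample) > K:
        exceed += 1

share_exceeding = exceed / trials  # should be close to 0.05
print(round(K, 2), round(share_exceeding, 3))
```
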
Intervals for Chi-Square

[Figure: chi-square density with area α/2 in each tail and area 1 – α between the lower and upper critical values.]

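Stata
clear
* arbitrary seed, so the random draws are reproducible
set seed 12345
set obs 1000
* X ~ chi-square with 10 degrees of freedom (mean 10, variance 20)
gen cx1 = rchi2(10)
sum cx1
* histogram with a normal density overlaid
hist cx1, normal
* distribution plot (distplot is user-written: ssc install distplot)
distplot cx1
* standardized normal probability plot
pnorm cx1
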
THE CENTRAL LIMIT THEOREM
The “World is Normal” Theorem

Central Limit Theorem

• Let $X_1, X_2, \ldots, X_n$ be a set of n independent random variables having identical distributions with mean $\mu$ and variance $\sigma^2$, with $X$ as the sum and $\bar{X}$ as the mean of these random variables. As n becomes large, the central limit theorem states that the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{X - n\mu}{\sqrt{n\sigma^2}}$$
approaches the standard normal distribution.

Central Limit Theorem

Even if the population is not normal, the standardized sample means (Z values) from the population will be approximately normal as long as the sample size is large enough…

Central Limit Theorem

As the sample size gets large enough (n↑), the sampling distribution becomes almost normal regardless of the shape of the population.

[Figure: sampling distribution of $\bar{x}$.]

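A small simulation illustrates the theorem for a clearly non-normal population. The exponential population (mean 1, standard deviation 1), the sample size, the number of repetitions, and the seed are all hypothetical choices. The sketch standardizes each sample mean and checks how often Z falls within ±1.96, which would be about 95% of the time if Z were exactly standard normal.

```python
import random
from math import sqrt
from statistics import fmean

rng = random.Random(42)    # arbitrary seed, for reproducibility
mu, sigma = 1.0, 1.0       # exponential with rate 1: mean 1, standard deviation 1
n, reps = 30, 10_000       # sample size and number of repeated samples


def z_value():
    """Standardized mean of one random sample from the skewed population."""
    x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return (x_bar - mu) / (sigma / sqrt(n))


zs = [z_value() for _ in range(reps)]
coverage = sum(abs(z) <= 1.96 for z in zs) / reps   # near 0.95 if Z is close to N(0,1)

print(round(fmean(zs), 3), round(coverage, 3))
```
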
THE CENTRAL LIMIT THEOREM
Hence: The “World is Normal” Theorem
