Class4 Newbold Chap06 Spring2024 01
Chapter 6
Ch 6
Tools of Statistics
Descriptive statistics
Collecting, presenting, and describing data
Inferential statistics
Drawing conclusions and/or making decisions concerning a population based only on sample data; called Classical Statistical Inference
Populations and Samples
Population vs. Sample
[Figure: a Population of 26 items (a through z) and a Sample of 9 items drawn from it (b, c, g, i, n, o, r, u, y)]
Why Sample?
Simple Random Samples
Classical Statistical Inference
Sample statistics: from the sample, compute a statistic, say the sample mean
Drawing conclusions and/or making decisions
concerning a population based on sample results.
Estimation
e.g., Estimate the population mean
weight using the sample mean
weight
Hypothesis Testing
e.g., Use sample evidence to test
the claim that the population mean
weight is 120 pounds
Sampling Distributions
Sampling Distributions of Sample Means
Estimation Process
Population (mean, μ, is unknown) → Random Sample → Sample Mean $\bar{X}$
Repeated random samples give different sample means: $\bar{X} = 50$, $\bar{X} = 41$, $\bar{X} = 60$, $\bar{X} = 55$, $\bar{X} = 48$, …
REDO, many times…
Obtain mean of $\bar{X}$ and Sample Variance of $\bar{X}$…
$\bar{X}$ is a random variable
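The REDO step can be sketched in Python. This is only an illustration under stated assumptions: the population of 10,000 values centred near 50, the sample size of 25, and the 2,000 repetitions are all hypothetical choices, not values from the slides.

```python
import random
import statistics

def simulate_sample_means(population, n, reps, seed=0):
    """Draw `reps` simple random samples of size n and return their means."""
    rng = random.Random(seed)
    # rng.sample draws without replacement, i.e. a simple random sample.
    return [statistics.mean(rng.sample(population, n)) for _ in range(reps)]

# Hypothetical population (an assumption for illustration only).
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

means = simulate_sample_means(population, n=25, reps=2_000)

# Each sample gives a different mean, so the sample mean is a random variable.
print("first five sample means:", [round(m, 1) for m in means[:5]])
print("mean of the sample means:", round(statistics.mean(means), 2))
print("variance of the sample means:", round(statistics.variance(means), 2))
print("population mean:", round(statistics.mean(population), 2))
```

The mean of the simulated sample means should sit close to the population mean, while their variance is much smaller than the population variance.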
Results for the Sampling Distribution of the Sample Mean
Let $\bar{X}$ denote the sample mean of a random sample of n observations from a population with mean $\mu$ and variance $\sigma^2$. Then
1. The sampling distribution of $\bar{X}$ has mean
$E(\bar{X}) = \mu$
2. The sampling distribution of $\bar{X}$ has standard deviation
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
This is called the standard error of $\bar{X}$.
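A short worked illustration of the standard error, using hypothetical numbers (a population standard deviation of 10 and sample sizes of 25 and 100 are assumptions, not values from the slides). Quadrupling the sample size halves the standard error:

```latex
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{25}} = 2
\qquad\text{and}\qquad
\sigma_{\bar{X}} = \frac{10}{\sqrt{100}} = 1
```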
Recall: Central Limit Theorem
If the Population is not Normal
We can apply the Central Limit Theorem:
Even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough.
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
If the Population is not Normal (continued)
Sampling distribution properties:
Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
[Figure: Population Distribution and Sampling Distribution of the sample mean (becomes normal as n increases); the larger sample size gives a narrower distribution than the smaller sample size]
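The effect of sample size on a non-normal population can be checked with a small simulation. This sketch assumes an exponential population with mean 1 and standard deviation 1 (a hypothetical choice, picked only because it is clearly skewed) and uses skewness as a rough measure of how far the sampling distribution is from normal.

```python
import random
import statistics

def skewness(values):
    """Mean cubed deviation divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

rng = random.Random(42)
results = {}
for n in (2, 10, 50):
    means = sample_means(n, reps=5_000, rng=rng)
    # Store (mean, standard deviation, skewness) of the simulated sample means.
    results[n] = (statistics.fmean(means), statistics.pstdev(means), skewness(means))
    print(n, [round(x, 3) for x in results[n]])
```

As n grows, the centre stays near the population mean, the spread shrinks toward σ/√n, and the skewness moves toward zero, which is the pattern the slide describes.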
How Large is Large Enough?
Intervals
Goal: determine a range within which sample means are likely to occur, given a population mean and variance
By the Central Limit Theorem, we know that the distribution of $\bar{X}$ is approximately normal if n is large enough, with mean $\mu$ and standard deviation $\sigma_{\bar{X}}$
Let $z_{\alpha/2}$ be the z-value that leaves area $\alpha/2$ in the upper tail of the normal distribution (i.e., the interval $-z_{\alpha/2}$ to $z_{\alpha/2}$ encloses probability $1 - \alpha$)
[Figure: standard normal curve with area 1 – α between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area α/2 in each tail]
Then $\mu \pm z_{\alpha/2}\,\sigma_{\bar{X}}$ is the interval that includes $\bar{X}$ with probability $1 - \alpha$
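A minimal sketch of the interval calculation, assuming hypothetical inputs (μ = 50, σ = 10, n = 25, α = 0.05 are illustrative values, not from the slides). The function name `sample_mean_interval` is also just a label chosen here; `NormalDist` comes from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_interval(mu, sigma, n, alpha):
    """Return (lower, upper) = mu -/+ z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # leaves area alpha/2 in the upper tail
    standard_error = sigma / sqrt(n)
    return mu - z * standard_error, mu + z * standard_error

# Hypothetical population values, for illustration only.
low, high = sample_mean_interval(mu=50, sigma=10, n=25, alpha=0.05)
print(round(low, 2), round(high, 2))  # 46.08 53.92
```

With these assumed values, about 95% of sample means from samples of size 25 would be expected to fall between roughly 46.08 and 53.92.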
Sampling Distributions of Sample Proportions
Population Proportions, P
Recall Binomial distribution:
P = the proportion of the population having some characteristic
Sample proportion ($\hat{P}$) provides an estimate of P:
$\hat{P} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$
$0 \le \hat{P} \le 1$
$\hat{P}$ has a binomial distribution, but can be approximated by a normal distribution when $nP(1 - P) > 9$
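Sampling Distribution of $\hat{P}$
Normal approximation:
[Figure: Sampling Distribution, with $P(\hat{P})$ on the vertical axis (0 to .3) and $\hat{P}$ on the horizontal axis (0 to 1)]
Properties:
$E(\hat{P}) = P$ and $\sigma_{\hat{P}}^2 = \mathrm{Var}\left(\frac{X}{n}\right) = \frac{P(1 - P)}{n}$
(where P = population proportion)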
Z-Value for Proportions
$Z = \frac{\hat{P} - P}{\sigma_{\hat{P}}} = \frac{\hat{P} - P}{\sqrt{\frac{P(1 - P)}{n}}}$
Sampling Distribution of the Sample Proportion
Let $\hat{P}$ denote the sample proportion of successes in a random sample from a population with proportion of success P. Then
1. The sampling distribution of $\hat{P}$ has mean P
$E(\hat{P}) = P$
2. The sampling distribution of $\hat{P}$ has standard deviation
$\sigma_{\hat{P}} = \sqrt{\frac{P(1 - P)}{n}}$
3. If the sample size is large, the random variable
$Z = \frac{\hat{P} - P}{\sigma_{\hat{P}}}$
is approximately distributed as a standard normal. The approximation is good if
$nP(1 - P) > 9$.
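The three results above can be checked by simulation. This is a sketch under assumed values: P = 0.40, n = 200, and 5,000 repetitions are hypothetical choices for illustration, and `simulate_sample_proportions` is a helper name introduced here.

```python
import random
import statistics
from math import sqrt

def simulate_sample_proportions(p, n, reps, seed=0):
    """Return `reps` sample proportions, each from n independent trials with success probability p."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

P, n = 0.40, 200  # hypothetical population proportion and sample size
p_hats = simulate_sample_proportions(P, n, reps=5_000)

sigma_p_hat = sqrt(P * (1 - P) / n)  # theoretical standard deviation of P-hat
z_values = [(p_hat - P) / sigma_p_hat for p_hat in p_hats]

print("nP(1 - P) > 9:", n * P * (1 - P) > 9)
print("mean of P-hat:", round(statistics.fmean(p_hats), 4), "vs P =", P)
print("sd of P-hat:", round(statistics.pstdev(p_hats), 4), "vs", round(sigma_p_hat, 4))
print("mean and sd of Z:", round(statistics.fmean(z_values), 3), round(statistics.pstdev(z_values), 3))
```

The standardized values should have mean near 0 and standard deviation near 1, consistent with the standard normal approximation.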
Example
Convert to standard normal:
$P(.40 \le \hat{P} \le .45) = P\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$
[Figure: the Sampling Distribution of $\hat{P}$ between .40 and .45 is standardized to the Standardized Normal Distribution between 0 and 1.44; the area is .4251]
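The standardization in this example can be reproduced with Python's standard library. The sketch takes P = .40 and the standard deviation .03464 directly from the slide's expression; the variable names are introduced here for illustration.

```python
from statistics import NormalDist

P = 0.40               # population proportion used in the slide's standardization
sigma_p_hat = 0.03464  # standard deviation of P-hat as given on the slide

z_low = (0.40 - P) / sigma_p_hat
z_high = (0.45 - P) / sigma_p_hat

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# The slide rounds the upper z-value to 1.44 before using the normal table.
prob_rounded = std_normal.cdf(1.44) - std_normal.cdf(0.0)

print("z range:", round(z_low, 2), "to", round(z_high, 2))
print("probability with unrounded z:", round(prob, 4))
print("probability with z = 1.44:", round(prob_rounded, 4))
```

The small gap between the two printed probabilities comes only from rounding z to two decimals for a table lookup.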
Sample Variance
Let $x_1, x_2, \ldots, x_n$ be a random sample from a population. The sample variance is
$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
Basic Idea:
Number of observations that are free to vary after sample mean has been calculated
Digression: Concept of Degrees of Freedom (df)
Idea: Number of observations that are free to vary after sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
Let X1 = 7
THEN IF: Let X2 = 8
What is X3?
X3 must be 9, because the three values have to sum to 3 × 8.0 = 24. Only n – 1 = 2 of the values are free to vary.
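The degrees-of-freedom example (three numbers with mean 8.0, the first two being 7 and 8) can be traced in a few lines of Python. The helper name `last_value` is introduced here for illustration.

```python
import statistics

def last_value(mean, known_values):
    """With the mean fixed, the final observation is determined by the others."""
    n = len(known_values) + 1
    return n * mean - sum(known_values)

x3 = last_value(8.0, [7, 8])
data = [7, 8, x3]

# Deviations from the sample mean always sum to zero, which is why only n - 1 are free.
deviations = [x - statistics.mean(data) for x in data]
s2 = sum(d ** 2 for d in deviations) / (len(data) - 1)

print("X3 =", x3)                                # X3 = 9.0
print("sum of deviations =", sum(deviations))    # 0.0
print("sample variance =", s2)                   # 1.0
print("statistics.variance agrees:", statistics.variance(data) == s2)
```

`statistics.variance` divides by n – 1, so it matches the hand calculation.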
Sampling Distribution of Sample Variances
Can it be normal?
The Chi-Square, $\chi^2$ from N(0,1)
Let Z ~ N(0,1) be a normally-distributed random variable with zero mean and unit variance.
For one degree of freedom, this means that: $\chi^2_1 = Z^2$
Draw from Normal and square it:
i.e. The $\chi^2$ distribution with 1 degree of freedom is the same as the distribution of the square of a single normally distributed quantity (with zero mean and unit variance).
[Figure: density of N(0,1) plotted from -4 to 5, and density of $\chi^2_1$ plotted from -4 to 6]
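"Draw from Normal and square it" can be tried directly. This sketch draws 100,000 standard normal values (the count and seed are arbitrary choices) and squares them. A chi-square variable with k degrees of freedom has mean k and variance 2k, so with k = 1 the squares should have mean near 1 and variance near 2. Also, since P(|Z| ≤ 1.96) is about 0.95, roughly 95% of the squares should fall below 1.96².

```python
import random
import statistics

rng = random.Random(7)

# Square of a single N(0,1) draw, repeated many times.
squares = [rng.gauss(0, 1) ** 2 for _ in range(100_000)]

mean_sq = statistics.fmean(squares)
var_sq = statistics.pvariance(squares)
share_below = sum(v <= 1.96 ** 2 for v in squares) / len(squares)

print("mean of Z^2:", round(mean_sq, 3))        # expected near 1
print("variance of Z^2:", round(var_sq, 3))     # expected near 2
print("share below 1.96^2:", round(share_below, 3))  # expected near 0.95
```

All squared values are non-negative and heavily concentrated near zero, which is why the resulting distribution is not symmetric.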
The Chi-square Distribution
The chi-square distribution is a family of distributions, depending on degrees of freedom (NOT SYMMETRIC):
d.f. = n – 1
[Figure: chi-square densities for d.f. = 1 and d.f. = 5, plus a third unlabeled panel, each plotted against $\chi^2$ from 0 to 28]
CDF of Chi-Square Distribution:
Sampling Distribution of Sample Variances
The sampling distribution of $s^2$ has mean $E(s^2) = \sigma^2$
If the population distribution is normal, then $\mathrm{Var}(s^2) = \frac{2\sigma^4}{n - 1}$
If the population distribution is normal, then $\frac{(n - 1)s^2}{\sigma^2} \sim \chi^2_{n-1}$
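A simulation sketch of the scaled sample variance (n – 1)s²/σ², assuming a normal population. The values n = 14 and σ = 4 are illustrative assumptions, and the population mean is set to 0 because the variance does not depend on it. A chi-square variable with n – 1 degrees of freedom has mean n – 1 and variance 2(n – 1), which gives two quick checks.

```python
import random
import statistics

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def scaled_variances(n, sigma, reps, seed=0):
    """Return (n - 1) * s^2 / sigma^2 for `reps` normal samples of size n."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        sample = [rng.gauss(0, sigma) for _ in range(n)]
        out.append((n - 1) * sample_variance(sample) / sigma ** 2)
    return out

n, sigma = 14, 4.0  # assumed values for illustration
stats = scaled_variances(n, sigma, reps=20_000)

mean_stat = statistics.fmean(stats)
var_stat = statistics.pvariance(stats)
print("mean:", round(mean_stat, 2), "expected near", n - 1)
print("variance:", round(var_stat, 2), "expected near", 2 * (n - 1))
```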
Chi-square Example
A commercial freezer must hold a selected temperature with little variation. Specifications call for a standard deviation of no more than 4 degrees (a variance of 16 degrees²).
A sample of 14 freezers is to be tested
What is the upper limit (K) for the sample variance such that the probability of exceeding this limit, given that the population standard deviation is 4, is less than 0.05?
$P(s^2 > K) = 0.05$?
Finding the Chi-square Value
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$ is chi-square distributed with (n – 1) = 13 degrees of freedom: $\chi^2 \sim \chi^2_{n-1}$
[Figure: chi-square density with upper-tail probability α = .05 to the right of the critical value $\chi^2_{13}$]
$\chi^2_{13} = 22.36$
Chi-square Example (continued)
$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)
So: $P(s^2 > K) = P\left(\frac{(n - 1)s^2}{16} > \chi^2_{13}\right) = 0.05$
or $\frac{(n - 1)K}{16} = 22.36$ (where n = 14)
so $K = \frac{(22.36)(16)}{(14 - 1)} = 27.52$
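The arithmetic for K can be reproduced and then sanity-checked by simulation. This sketch takes the chi-square critical value 22.36 from the slide rather than computing it, and it assumes freezer temperatures are normally distributed with standard deviation 4. The mean of 0 is arbitrary, since the sample variance does not depend on it.

```python
import random

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n = 14             # freezers in the sample
sigma2 = 16.0      # population variance under the specification
chi2_crit = 22.36  # upper 5% point for 13 d.f., taken from the slide

K = chi2_crit * sigma2 / (n - 1)
print("K =", round(K, 2))  # K = 27.52

# Simulation check (assumes normal temperatures): how often does s^2 exceed K?
rng = random.Random(3)
reps = 20_000
exceed = sum(
    sample_variance([rng.gauss(0, sigma2 ** 0.5) for _ in range(n)]) > K
    for _ in range(reps)
) / reps
print("simulated P(s^2 > K):", round(exceed, 3))  # expected near 0.05
```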
Intervals for Chi-Square
[Figure: chi-square density with area α/2 in each tail and area 1 – α in between]
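Stata
clear
set obs 1000
* X ~ chi2(10 df)
gen cx1 = rchi2(10)
* summary statistics (the mean should be near 10, the degrees of freedom)
sum cx1
* histogram with a normal density overlaid
hist cx1, normal
* distribution plot (distplot is user-written; if missing: ssc install distplot)
distplot cx1
* standardized normal probability plot
pnorm cx1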
THE CENTRAL LIMIT THEOREM
The “World is Normal” Theorem
Central Limit Theorem
Ch 5
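Let $X_1, X_2, \ldots, X_n$ be independent, identically distributed random variables with mean $\mu$ and variance $\sigma^2$, with sum $X$ and mean $\bar{X}$. As n becomes large, the distribution of
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{X - n\mu}{\sqrt{n\sigma^2}}$
approaches the standard normal distribution.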
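The standardized statistic in the Central Limit Theorem can be written in terms of the sample mean or in terms of the sum. Writing $X$ for the sum of the n observations, so that $\bar{X} = X/n$, and using $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, multiplying numerator and denominator by n shows the two forms agree:

```latex
Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}
  = \frac{\frac{X}{n} - \mu}{\sigma / \sqrt{n}}
  = \frac{X - n\mu}{n \cdot \sigma / \sqrt{n}}
  = \frac{X - n\mu}{\sqrt{n}\,\sigma}
  = \frac{X - n\mu}{\sqrt{n\sigma^2}}
```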
As the sample size gets large enough (n↑), the sampling distribution becomes almost normal regardless of shape of population
THE CENTRAL LIMIT THEOREM
Hence: The “World is Normal” Theorem