Notes 515 Fall 10 Chap 6

This document defines key concepts in sampling distributions: - Parameters describe populations while statistics describe samples. The sampling distribution of a statistic is the distribution of its values from all possible samples of the same size. - For the sample mean from a normal population, its sampling distribution is normal. For other populations, the Central Limit Theorem says the sampling distribution of the mean is approximately normal for large sample sizes. - Other important sampling distributions include the t, chi-square, and F distributions, which are used to make statistical inferences when population parameters are unknown.

Uploaded by

Sultana Raja
Copyright
© All Rights Reserved

STAT 515 -- Chapter 6: Sampling Distributions

Definition: Parameter = a number that characterizes a
population (example: population mean $\mu$) – it’s typically
unknown.

Statistic = a number that characterizes a sample
(example: sample mean $\bar{X}$) – we can calculate it from
our sample data.

We use the sample mean $\bar{X}$ to estimate the population
mean $\mu$.

Suppose we take a sample and calculate $\bar{X}$.
Will $\bar{X}$ equal $\mu$? Will $\bar{X}$ be close to $\mu$?

Suppose we take another sample and get another $\bar{X}$.
Will it be the same as the first $\bar{X}$? Will it be close to the first $\bar{X}$?

• What if we took many repeated samples (of the same
size) from the same population, and each time,
calculated the sample mean?
What would that set of $\bar{X}$ values look like?

The sampling distribution of a statistic is the
distribution of values of the statistic in all possible
samples (of the same size) from the same population.

Consider the sampling distribution of the sample mean
$\bar{X}$ when we take samples of size n from a population
with mean $\mu$ and variance $\sigma^2$.

Picture:

The sampling distribution of $\bar{X}$ has mean $\mu$ and
standard deviation $\sigma/\sqrt{n}$.
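The mean and standard deviation of the sampling distribution can be checked by simulation. The sketch below is only an illustration: the population (normal with $\mu$ = 50 and $\sigma$ = 10), the sample size n = 25, and the number of repeated samples are all assumed values, not part of the notes.

```python
import random
import statistics

# Hypothetical population: normal with mu = 50, sigma = 10 (illustration only).
MU, SIGMA, N = 50.0, 10.0, 25
REPS = 20_000                      # number of repeated samples

rng = random.Random(515)           # fixed seed so the run is reproducible

def sample_mean(n):
    """Draw one random sample of size n and return its sample mean."""
    return statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))

# Approximate the sampling distribution with many repeated samples.
xbars = [sample_mean(N) for _ in range(REPS)]

mean_of_xbars = statistics.fmean(xbars)
sd_of_xbars = statistics.stdev(xbars)
theoretical_sd = SIGMA / N ** 0.5  # sigma / sqrt(n)

print(f"mean of the sample means: {mean_of_xbars:.3f} (theory: {MU})")
print(f"sd of the sample means:   {sd_of_xbars:.3f} (theory: {theoretical_sd})")
```

The two simulated values should land close to $\mu$ and $\sigma/\sqrt{n}$, though not exactly on them, since only finitely many samples are drawn.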
Notation:

Point Estimator: A statistic which is a single number
meant to estimate a parameter.

It would be nice if the average value of the estimator
(over repeated sampling) equaled the target parameter.

An estimator is called unbiased if the mean of its
sampling distribution is equal to the parameter being
estimated.
Examples:
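One way to see unbiasedness is to average an estimator over many repeated samples. The sketch below assumes a hypothetical normal population ($\mu$ = 0, $\sigma$ = 2) and samples of size n = 5. It compares the sample mean, the sample variance with divisor n – 1, and a variance estimator with divisor n, which is included only as a contrast.

```python
import random

MU, SIGMA, N = 0.0, 2.0, 5    # hypothetical population and sample size
REPS = 40_000
rng = random.Random(6)

sum_xbar = sum_s2 = sum_v_n = 0.0
for _ in range(REPS):
    x = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(x) / N
    ss = sum((xi - xbar) ** 2 for xi in x)   # sum of squared deviations
    sum_xbar += xbar
    sum_s2 += ss / (N - 1)                   # sample variance, divisor n - 1
    sum_v_n += ss / N                        # alternative estimator, divisor n

avg_xbar = sum_xbar / REPS
avg_s2 = sum_s2 / REPS
avg_v_n = sum_v_n / REPS

print(f"average sample mean:            {avg_xbar:.3f} (target mu = {MU})")
print(f"average variance (divisor n-1): {avg_s2:.3f} (target sigma^2 = {SIGMA**2})")
print(f"average variance (divisor n):   {avg_v_n:.3f} (target sigma^2 = {SIGMA**2})")
```

The first two averages should sit near their targets. The divisor-n version should come out systematically low, near $(n-1)\sigma^2/n$, which is what bias looks like in a simulation.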

Another nice property of an estimator: we want the
spread of its sampling distribution to be as small as
possible.

The standard deviation of a statistic’s sampling
distribution is called the standard error of the statistic.
The standard error of the sample mean $\bar{X}$ is $\sigma/\sqrt{n}$.

Note: As the sample size gets larger, the spread of the
sampling distribution gets smaller.

When the sample size is large, the sample mean varies
less across samples.
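A quick calculation shows how fast the standard error shrinks. The value $\sigma$ = 12 and the sample sizes below are assumed purely for illustration.

```python
import math

SIGMA = 12.0  # hypothetical population standard deviation

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

for n in (9, 36, 144, 576):
    print(f"n = {n:3d}: standard error = {standard_error(SIGMA, n):.2f}")
```

Each sample size in the loop is four times the previous one, and each standard error is half the previous one: cutting the standard error in half requires quadrupling n.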

Evaluating an estimator:
(1) Is it unbiased?
(2) Does it have a small standard error?
Central Limit Theorem

We have determined the center and the spread of the
sampling distribution of $\bar{X}$. What is the shape of its
sampling distribution?

Case I: If the distribution of the original data is
normal, the sampling distribution of $\bar{X}$ is normal. (This
is true no matter what the sample size is.)

Case II: Central Limit Theorem: If we take a random
sample (of size n) from any population with mean $\mu$ and
standard deviation $\sigma$, the sampling distribution of $\bar{X}$ is
approximately normal, if the sample size is large.

How large does n have to be?

Our rule of thumb: If n ≥ 30, we can apply the CLT
result.

Pictures:

The larger n gets, the closer the sampling distribution
looks to a normal distribution.
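The effect of n on the shape can be sketched with a simulation. The example assumes a strongly right-skewed population (exponential with mean 1, chosen only for illustration) and uses the skewness of the simulated sample means as a rough measure of how far the shape is from a symmetric, normal-looking one.

```python
import random
import statistics

rng = random.Random(2010)
REPS = 10_000   # repeated samples per sample size

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_means(n, reps=REPS):
    """Means of `reps` random samples of size n from an exponential population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

skew_by_n = {n: skewness(sample_means(n)) for n in (2, 10, 30, 100)}
for n, sk in skew_by_n.items():
    print(f"n = {n:3d}: skewness of the sample means = {sk:.2f}")
```

The skewness should drop toward 0 as n increases, matching the idea that the sampling distribution of $\bar{X}$ looks more normal for larger samples even though the population itself is far from normal.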
Why is the CLT important? Because when $\bar{X}$ is
(approximately) normally distributed, we can answer
probability questions about the sample mean.

Standardizing values of $\bar{X}$:
If $\bar{X}$ is normal with mean $\mu$ and standard deviation
$\sigma/\sqrt{n}$, then
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
has a standard normal distribution.

Example: Suppose we’re studying the failure time (at
high stress) of a certain engine part. The failure times
have a mean of 1.4 hours and a standard deviation of
0.9 hours.

If our sample size is 40 engine parts, then what is the
sampling distribution of the sample mean?
What is the probability that the sample mean will be
greater than 1.5?
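One way to work the engine-part example numerically is sketched below. Since n = 40 meets the n ≥ 30 rule of thumb, the sample mean is treated as approximately normal with mean 1.4 and standard deviation $0.9/\sqrt{40}$. The code uses Python's `statistics.NormalDist` in place of a normal table, so the answer may differ slightly from a table lookup with a rounded z.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.4, 0.9, 40           # values given in the example

se = sigma / sqrt(n)                  # standard error of the sample mean
z = (1.5 - mu) / se                   # standardized value of 1.5
p_greater = 1 - NormalDist().cdf(z)   # P(Z > z), approximate by the CLT

print(f"standard error = {se:.4f}")
print(f"z = {z:.2f}")
print(f"P(sample mean > 1.5) is approximately {p_greater:.3f}")
```

The probability comes out at roughly 0.24.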

Example: Suppose lawyers’ salaries have a mean of
$90,000 and a standard deviation of $30,000 (highly
skewed). Given a sample of lawyers, can we find the
probability the sample mean is less than $100,000
if n = 5? If n = 30?
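A sketch of how the lawyers' salary question could be handled follows. It applies the n ≥ 30 rule of thumb from these notes: because the population is highly skewed, the normal approximation is not used for n = 5, while for n = 30 it is. The function name `prob_mean_below` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 90_000, 30_000   # values given in the example
CLT_MIN_N = 30               # rule of thumb used in these notes

def prob_mean_below(cutoff, n):
    """Approximate P(sample mean < cutoff) via the CLT, or None if n is too small."""
    if n < CLT_MIN_N:
        return None  # skewed population and small n: approximation not justified
    return NormalDist(mu, sigma / sqrt(n)).cdf(cutoff)

for n in (5, 30):
    p = prob_mean_below(100_000, n)
    if p is None:
        print(f"n = {n}: cannot find the probability (shape of the sampling distribution unknown)")
    else:
        print(f"n = {n}: P(sample mean < 100000) is approximately {p:.3f}")
```

For n = 30 the standardized value is about 1.83, which gives a probability of roughly 0.966.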
Other Sampling Distributions

In practice, the population standard deviation $\sigma$ is
typically unknown.

We estimate $\sigma$ with the sample standard deviation s.

But the quantity $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ no longer has a standard
normal distribution.

Its sampling distribution is as follows:

• If the data come from a normal population, then the
statistic
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
has a t-distribution (“Student’s t”)
with n – 1 degrees of freedom (the parameter of the
t-distribution).

• The t-distribution resembles the standard normal
(symmetric, mound-shaped, centered at zero) but it is
more spread out.
• The fewer the degrees of freedom, the more spread out
the t-distribution is.
• As the d.f. increase, the t-distribution gets closer to the
standard normal.
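The extra spread of the t-distribution can be seen in a small simulation. The sketch assumes a hypothetical standard normal population and samples of size n = 5 (so 4 d.f.). For each sample it computes both Z (using the true $\sigma$) and T (using s), then compares how often each falls outside ±1.96.

```python
import random
import statistics
from math import sqrt

rng = random.Random(1908)
MU, SIGMA, N = 0.0, 1.0, 5    # hypothetical normal population; n - 1 = 4 d.f.
REPS = 20_000

def t_and_z(n):
    """Return (T, Z) computed from one random sample of size n."""
    x = [rng.gauss(MU, SIGMA) for _ in range(n)]
    xbar = statistics.fmean(x)
    s = statistics.stdev(x)                  # sample standard deviation
    z = (xbar - MU) / (SIGMA / sqrt(n))      # uses the true sigma
    t = (xbar - MU) / (s / sqrt(n))          # uses the estimate s
    return t, z

pairs = [t_and_z(N) for _ in range(REPS)]
tail_t = sum(abs(t) > 1.96 for t, _ in pairs) / REPS
tail_z = sum(abs(z) > 1.96 for _, z in pairs) / REPS

print(f"fraction of |Z| > 1.96: {tail_z:.3f}")
print(f"fraction of |T| > 1.96: {tail_t:.3f}")
```

The Z fraction should be near 0.05, while the T fraction should be noticeably larger, reflecting the heavier tails that come from estimating $\sigma$ with s in a small sample.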

Picture:
Table VI gives values of the t-distribution with specific
areas to the right of these values:

Verify:
In the t-distribution with 3 d.f., the area to the right of _______
is .025. (Notation: For 3 d.f., $t_{.025}$ = _______ )

In t with 14 d.f., area to the right of _______ is .05.

In t with 25 d.f., area to the right of _______ is .999.
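Table lookups like these can be cross-checked numerically. The sketch below does not use any statistics library: it integrates the t density with Simpson's rule and then searches for the value with a given area to its right by bisection. The function names, the number of integration steps, and the search interval are all choices made for this illustration.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_area_right(x, df, steps=2000):
    """Area under the t density to the right of x (Simpson's rule on [0, |x|])."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3                  # area between 0 and |x|
    return 0.5 - central if x >= 0 else 0.5 + central

def t_value(area, df):
    """Value that has the given area to its right, found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_area_right(mid, df) > area:
            lo = mid                         # too much area to the right: move right
        else:
            hi = mid
    return (lo + hi) / 2

for area, df in ((0.025, 3), (0.05, 14), (0.999, 25)):
    print(f"{df} d.f., area {area} to the right: t = {t_value(area, df):.3f}")
```

The printed values can be compared against Table VI. Note that an area of .999 to the right forces a negative value; by symmetry it is the negative of the table entry for area .001.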


The $\chi^2$ (Chi-square) Distribution

Suppose our sample (of size n) comes from a normal
population with mean $\mu$ and standard deviation $\sigma$.

Then
$$\frac{(n-1)s^2}{\sigma^2}$$
has a $\chi^2$ distribution with n – 1 degrees of
freedom.

• The $\chi^2$ distribution takes on positive values.
• It is skewed to the right.
• It is less skewed for higher degrees of freedom.
• The mean of a $\chi^2$ distribution with n – 1 degrees of
freedom is n – 1 and the variance is 2(n – 1).

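Fact: If we add the squares of n independent standard
normal r.v.’s, the resulting sum has a $\chi^2_n$ distribution.

Note that
$$\frac{(n-1)s^2}{\sigma^2} = \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2$$

We sacrifice one d.f. by estimating $\mu$ with $\bar{X}$, so it is $\chi^2_{n-1}$.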
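The stated mean and variance of the $\chi^2$ distribution can be checked by simulating $(n-1)s^2/\sigma^2$ directly. The population below (normal with $\mu$ = 10, $\sigma$ = 3) and the sample size n = 6 are assumed for illustration, giving 5 d.f.

```python
import random
import statistics

rng = random.Random(7)
MU, SIGMA, N = 10.0, 3.0, 6    # hypothetical normal population; n - 1 = 5 d.f.
REPS = 20_000

def scaled_variance(n):
    """Compute (n - 1) s^2 / sigma^2 from one random sample of size n."""
    x = [rng.gauss(MU, SIGMA) for _ in range(n)]
    return (n - 1) * statistics.variance(x) / SIGMA ** 2

values = [scaled_variance(N) for _ in range(REPS)]
sim_mean = statistics.fmean(values)
sim_var = statistics.variance(values)

print(f"simulated mean:     {sim_mean:.2f} (theory: n - 1 = {N - 1})")
print(f"simulated variance: {sim_var:.2f} (theory: 2(n - 1) = {2 * (N - 1)})")
```

All simulated values are positive, and their mean and variance should fall near n – 1 and 2(n – 1).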
Table VII gives values of a $\chi^2$ r.v. with specific areas to
the right of those values.

Examples:

For $\chi^2$ with 6 d.f., area to the right of __________ is .90.

For $\chi^2$ with 6 d.f., area to the right of __________ is .05.

For $\chi^2$ with 80 d.f., area to the right of _________ is .10.
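A candidate table value can be checked by computing the area to its right. The sketch below relies on a closed form that holds only for even degrees of freedom: with 2m d.f., the right-tail area at x is $e^{-x/2} \sum_{j=0}^{m-1} (x/2)^j / j!$. This shortcut is not from the notes, and the three candidate values are ones I read from a standard chi-square table, so they should be checked against Table VII.

```python
from math import exp

def chi2_area_right(x, df):
    """Area to the right of x under a chi-square density with EVEN d.f."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0            # j = 0 term of the sum
    for j in range(1, df // 2):
        term *= half / j              # builds (x/2)^j / j! step by step
        total += term
    return exp(-half) * total

# Candidate values (assumed, taken from a standard table) and their d.f.
for x, df in ((2.204, 6), (12.592, 6), (96.578, 80)):
    print(f"{df} d.f.: area to the right of {x} = {chi2_area_right(x, df):.3f}")
```

If a candidate is right, the computed area should match the area named in the exercise to about three decimal places.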


The F Distribution

The quantity
$$\frac{\chi^2_{n_1-1}/(n_1-1)}{\chi^2_{n_2-1}/(n_2-1)}$$
where the two $\chi^2$ r.v.’s are independent, has an
F-distribution with n1 – 1 “numerator degrees of freedom”
and n2 – 1 “denominator degrees of freedom.”

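So, if we have independent samples (of sizes n1 and n2)
from two normal populations, with sample variances
$s_1^2$, $s_2^2$ and population variances $\sigma_1^2$, $\sigma_2^2$, note:
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$$
has an F-distribution with (n1 – 1, n2 – 1) d.f.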


Table VIII gives values of F r.v. with area .10 to the right.
Table IX gives values of F r.v. with area .05 to the right.
Table X gives values of F r.v. with area .025 to the right.
Table XI gives values of F r.v. with area .01 to the right.

Verify:

For F with (3, 9) d.f., 2.81 has area 0.10 to right.

For F with (15, 13) d.f., 3.82 has area 0.01 to right.
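The first table value can also be checked by simulating the ratio of two scaled sample variances. The sketch assumes two hypothetical normal populations (standard deviations 2 and 5) and sample sizes 4 and 10, which give (3, 9) d.f., and estimates the area to the right of 2.81.

```python
import random
import statistics

rng = random.Random(39)
N1, N2 = 4, 10                 # sample sizes giving (3, 9) d.f.
SIGMA1, SIGMA2 = 2.0, 5.0      # hypothetical population standard deviations
REPS = 20_000

def f_statistic():
    """(s1^2 / sigma1^2) / (s2^2 / sigma2^2) from two independent normal samples."""
    x1 = [rng.gauss(0.0, SIGMA1) for _ in range(N1)]
    x2 = [rng.gauss(0.0, SIGMA2) for _ in range(N2)]
    return (statistics.variance(x1) / SIGMA1 ** 2) / (statistics.variance(x2) / SIGMA2 ** 2)

f_values = [f_statistic() for _ in range(REPS)]
frac_above = sum(f > 2.81 for f in f_values) / REPS

print(f"fraction of simulated F values above 2.81: {frac_above:.3f}")
```

The fraction should come out near 0.10, up to simulation error. Dividing each sample variance by its own population variance is what makes the ratio follow an F-distribution even though the two populations have different spreads.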

• These sampling distributions will be important in
many inferential procedures we will learn.
