Topic06 Written

The document discusses populations, samples, parameters, and statistics. It defines key terms like population, sample, parameter, and statistic. It also describes common statistics like mean, median, variance, and standard deviation. Methods for calculating these statistics from sample data are provided.

Uploaded by

oreowhite111


Topic 6: Sampling Distributions

6.1 Populations, Samples, and Processes

• We are constantly exposed to collections of facts, or data.

• The discipline of statistics provides methods for organizing and summarizing data,
both graphically and numerically (descriptive statistics), and for drawing conclusions
based on information contained in the data (inferential statistics).

• An investigation will typically focus on a well-defined collection of objects constituting a
population of interest. For example, a population might consist of all students on our
campus.

• When the desired information is available for all objects in the population, we have
a census. Constraints on time, money, and other resources usually make a census
impractical or infeasible, so a subset of the population (a sample) is selected instead.

• In a probability problem, properties of the population under study are assumed known.
Questions regarding a sample taken from the population are posed and answered.

• In a statistics problem, having obtained a sample from a population, an investigator

would frequently like to use sample information to draw some type of conclusion (make
inferences) about the population.

• The relationship between the two disciplines can be summarized by saying that
probability reasons from the population to the sample, whereas inferential statistics
reasons from the sample to the population.

Example 6.1. Consider drivers’ use of manual lap belts in cars equipped with automatic
shoulder belt systems.

– In a probability question, we might assume that 50% of all drivers of cars
(population) equipped in this way in a certain metropolitan area regularly use their
lap belt.
– We may want to ask questions about a sample selected from the population. For
example, how likely is it that a sample of 100 such drivers will include at least 70
who regularly use their lap belt?
– In inferential statistics, we have sample information available.
– We would like to use sample information to answer a question about the structure
of the entire population from which the sample was selected.
– For example, a sample of 100 drivers of such cars revealed that 65 regularly use
their lap belt. We may want to ask whether this provides substantial evidence for
concluding that more than 50% of all such drivers in this area regularly use their
lap belt.
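
The two questions in Example 6.1 can be explored numerically. The following is a minimal sketch, assuming drivers are sampled independently so that the number of lap-belt users in a sample of 100 follows a binomial distribution with p = 0.5. The helper name binomial_tail is introduced here for illustration only.

```python
from math import comb


def binomial_tail(n: int, p: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Probability question: if 50% of the population uses the lap belt,
# how likely is a sample of 100 drivers to contain at least 70 users?
p_at_least_70 = binomial_tail(100, 0.5, 70)

# Tail probability behind the inference question: how unusual would 65 or
# more users be if the population proportion were exactly 50%?
p_at_least_65 = binomial_tail(100, 0.5, 65)

print(f"P(X >= 70) = {p_at_least_70:.6f}")
print(f"P(X >= 65) = {p_at_least_65:.6f}")
```

Under this assumed model both tail probabilities are small. An observed count of 65 would therefore be hard to reconcile with a population proportion of exactly 50%, which is the kind of reasoning inferential statistics formalizes.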

• A variable is any characteristic whose value may change from one object to another in the
population. A variable that takes numerical values is called a quantitative variable; a
variable that takes non-numerical values is called a qualitative variable or categorical
variable. Examples include

x = market classification of a book (categorical)

y = age of the author of a book (quantitative)

Example 6.2. A manufacturer of computer chips claims that less than 10% of his products
are defective. When 1,000 chips were drawn from a large production run, 7.5% were found to be
defective.

1. What is the population of interest?

2. What is the sample?

3. Explain briefly how the manufacturer can test the claim.

Example 6.3. For each of the following variables, decide if it is quantitative or categorical. If
it is quantitative, decide if it is discrete or continuous.

1. The number of miles joggers run per week.

2. The starting salaries of university graduates.

3. The months in which a company’s employees choose to take their vacations.

4. The grades received by students in a statistics course.

6.2 Population Parameters and Sample Statistics

Statistical inference is almost always directed toward drawing some type of conclusion about
one or more population parameters. To do so requires that an investigator obtain sample
data from each of the populations under study. Conclusions can then be based on the computed
values of various sample quantities or sample statistics.

                      Parameter    Statistic
Mean                  µ            X̄
Variance              σ²           S²
Standard deviation    σ            S
Proportion            p            P̂

Parameter

• Target

• Unknown

• Constant - one single value

Statistic

• Known

• Random variable - a list of possible values with associated probabilities

• Use the statistic to infer the parameter.

We now look at various sample statistics.

The sample mean x̄ of observations x1, x2, . . ., xn is given by

\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Example 6.4. Given the sample: 55, 73, 75, 80, 80, 85, 90, 92, 93, 98. Compute the sample
mean.

The mean is greatly affected by the presence of even a single outlier (an unusually large or small
observation), which makes it an inappropriate measure of center under some circumstances.


The sample median is obtained by first ordering the n observations from smallest to
largest (with any repeated values included so that every sample observation appears in the
ordered list).

Then look for the observation in the middle. If n is odd, there is a single middle
observation, and it is the median.

If n is even, there are two observations in the middle; take their average.

The sample median is sometimes denoted by x̃, and the population median is denoted
by µ̃.

Example 6.5. Given the sample: 55, 73, 75, 80, 80, 85, 90, 92, 93, 98. Find the sample
median.
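
The definitions of the sample mean and sample median can be checked on the sample used in Examples 6.4 and 6.5. This is a minimal sketch; the function names sample_mean and sample_median are illustrative choices, not part of the notes.

```python
def sample_mean(xs):
    """Sum of the observations divided by the sample size."""
    return sum(xs) / len(xs)


def sample_median(xs):
    """Middle value of the ordered sample (average of the two middle values if n is even)."""
    ordered = sorted(xs)  # repeated values stay in the ordered list
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


sample = [55, 73, 75, 80, 80, 85, 90, 92, 93, 98]
mean_value = sample_mean(sample)
median_value = sample_median(sample)
print("sample mean:", mean_value)
print("sample median:", median_value)
```

Because n = 10 is even, the median here is the average of the fifth and sixth ordered observations (80 and 85).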

Example 6.6. For the following two sets of data, compute the sample mean and the sample
median.

1. Data: 1, 1, 1, 1, 1.

2. Data: 1, 1, 1, 1, 100.

What can you conclude about the properties of the sample mean and the sample median?
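
One way to explore Example 6.6 is to compute both statistics for the two data sets and compare them. The sketch below uses Python's standard statistics module; the variable names are illustrative.

```python
from statistics import mean, median

data_without_outlier = [1, 1, 1, 1, 1]
data_with_outlier = [1, 1, 1, 1, 100]

for label, data in [("no outlier", data_without_outlier),
                    ("one outlier", data_with_outlier)]:
    # The mean uses every value, so one extreme value pulls it away from
    # the bulk of the data; the median only depends on the middle position.
    print(f"{label}: mean = {mean(data)}, median = {median(data)}")

mean_shift = mean(data_with_outlier) - mean(data_without_outlier)
median_shift = median(data_with_outlier) - median(data_without_outlier)
```

Replacing a single observation by 100 moves the mean substantially while leaving the median unchanged, which suggests that the median is the more robust measure of center when outliers are present.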

The sample variance s² of observations x1, x2, . . ., xn is given by

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} \]

Example 6.7. Show that

\[ S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}. \]
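
One possible route for Example 6.7 is to expand the square and use the fact that the observations sum to n times the sample mean. The steps below are a sketch of that argument, with all sums running over i = 1, ..., n.

```latex
\begin{align*}
S_{xx} &= \sum_{i=1}^{n} (x_i - \bar{x})^2
        = \sum_{i=1}^{n} \left( x_i^2 - 2\bar{x}\,x_i + \bar{x}^2 \right) \\
       &= \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + n\bar{x}^2 \\
       &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2
          \qquad \text{since } \sum_{i=1}^{n} x_i = n\bar{x} \\
       &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
        = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}.
\end{align*}
```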

Example 6.8. Given the sample: 55, 73, 75, 80, 80, 85, 90, 92, 93, 98. Compute the sample
variance.
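
The two forms of the sample variance formula can be compared on the sample from Example 6.8. This is a minimal sketch; the variable names are illustrative.

```python
sample = [55, 73, 75, 80, 80, 85, 90, 92, 93, 98]
n = len(sample)
mean_value = sum(sample) / n

# Definition: sum of squared deviations about the sample mean, divided by n - 1.
s2_definition = sum((x - mean_value) ** 2 for x in sample) / (n - 1)

# Shortcut: sum of squares minus (sum)^2 / n, divided by n - 1.
s2_shortcut = (sum(x * x for x in sample) - sum(sample) ** 2 / n) / (n - 1)

# The sample standard deviation is the positive square root of the variance.
s = s2_definition ** 0.5

print(f"s^2 (definition) = {s2_definition:.4f}")
print(f"s^2 (shortcut)   = {s2_shortcut:.4f}")
print(f"s                = {s:.4f}")
```

Both forms agree up to floating-point rounding. The shortcut form needs only the sum and the sum of squares of the observations, which is convenient for hand calculation.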

Why is the denominator n − 1 but not n?

• The population variance, denoted by σ² (so that σ is the population standard
deviation), can be computed by

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]

when the population is finite and consists of N values. Observe here that the divisor is
N and not N − 1.

• Note that σ² involves squared deviations about the population mean µ. If we
actually knew the value of µ, then we could define the sample variance as the average
squared deviation of the sample xi's about µ.

• However, the value of µ is almost never known, so the sum of squared deviations about
x̄ must be used. But the xi's tend to be closer to their own average x̄ than to the population
average µ.

• To compensate for this, the divisor n − 1 is used rather than the sample size n. In other
words, if we used a divisor n in the sample variance, then the resulting quantity would
tend to underestimate σ² (produce estimated values that are too small on the average),
whereas dividing by the slightly smaller n − 1 corrects this underestimation.
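
The effect of the divisor can be illustrated by simulation. The sketch below assumes a normal population with mean 0 and σ = 2 (so σ² = 4) and samples of size n = 5; these values, the seed, and the function name are arbitrary choices made for illustration.

```python
import random


def average_variance_estimates(n, sigma, num_samples, seed=0):
    """Average the divisor-n and divisor-(n-1) variance estimates over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(num_samples):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        sxx = sum((x - xbar) ** 2 for x in xs)  # squared deviations about x-bar
        total_div_n += sxx / n
        total_div_n_minus_1 += sxx / (n - 1)
    return total_div_n / num_samples, total_div_n_minus_1 / num_samples


avg_div_n, avg_div_n_minus_1 = average_variance_estimates(
    n=5, sigma=2.0, num_samples=20_000
)
print(f"true variance:           {2.0 ** 2:.3f}")
print(f"average with divisor n:   {avg_div_n:.3f}")
print(f"average with divisor n-1: {avg_div_n_minus_1:.3f}")
```

In runs of this kind, the divisor-n average settles near (n − 1)/n times σ², while the divisor-(n − 1) average settles near σ² itself.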

6.3 Sampling Distributions

• Consider selecting two different samples of size n from the same population distribution.
The observations xi in the second sample will virtually always differ at least a bit from
those in the first sample.

• For example, a first sample of n = 3 cars of a particular type might result in fuel efficiencies
x1 = 30.7, x2 = 29.4, x3 = 31.1, whereas a second sample may give x1 = 28.8, x2 = 30.0,
and x3 = 32.5.

• Before we obtain data, there is uncertainty about the value of each xi. Because of this
uncertainty, before the data become available we view each observation as a random
variable and denote the sample by X1, X2, . . . , Xn.

• This variation in observed values in turn implies that the value of any function of the
sample observations, or statistic (such as the sample mean or the sample standard
deviation), also varies from sample to sample. That is, prior to obtaining x1, x2, . . . , xn,
there is uncertainty as to the value of x̄, the value of s, and so on.

• Any statistic, being a random variable, has a probability distribution. The probability
distribution of a statistic is sometimes referred to as its sampling distribution to
emphasize that it describes how the statistic varies in value across all samples that might
be selected.

• The probability distribution of any particular statistic depends not only on the population
distribution and the sample size n but also on the method of sampling. In our course, we
will be dealing with (simple) random samples.

Definition 6.1. The random variables X1, X2, . . . , Xn are said to form a (simple) random
sample of size n if

1. The Xi's are independent random variables.

2. Every Xi has the same probability distribution.

In other words, the random variables Xi are independent and identically distributed (iid).

Example 6.9. A certain brand of MP3 player comes in three configurations: a model with 2
GB of memory, costing $80, a 4 GB model priced at $100, and an 8 GB version with a price
tag of $120. If 20% of all purchasers choose the 2 GB model, 30% choose the 4 GB model,
and 50% choose the 8 GB model, then the probability distribution of the cost X of a single
randomly selected MP3 player purchase is given by

x       80     100    120
p(x)    0.2    0.3    0.5

Suppose on a particular day only two MP3 players are sold. Let X1 = the revenue from the
first sale and X2 = the revenue from the second. Suppose that X1 and X2 are independent,
each with the probability distribution shown above so that X1 and X2 constitute a random
sample from the distribution.

Find the sampling distribution of X̄ and the sampling distribution of S².

x1    x2    p(x1, x2)    x̄    s²

P(X̄ = x̄)

P(S² = s²)
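
The table in Example 6.9 can be filled in by enumerating all nine outcomes (x1, x2). The sketch below does this with exact fractions, assuming independence so that p(x1, x2) = p(x1) p(x2); the variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Probability distribution of the cost X of a single purchase.
pmf = {80: Fraction(20, 100), 100: Fraction(30, 100), 120: Fraction(50, 100)}

mean_dist = defaultdict(Fraction)  # sampling distribution of the sample mean
var_dist = defaultdict(Fraction)   # sampling distribution of the sample variance

for (x1, p1), (x2, p2) in product(pmf.items(), repeat=2):
    joint = p1 * p2                      # independence: p(x1, x2) = p(x1) * p(x2)
    xbar = Fraction(x1 + x2, 2)
    # Sample variance with n = 2, so the divisor n - 1 equals 1.
    s2 = (x1 - xbar) ** 2 + (x2 - xbar) ** 2
    mean_dist[xbar] += joint
    var_dist[s2] += joint

print("xbar  P(Xbar = xbar)")
for value in sorted(mean_dist):
    print(f"{float(value):5.0f}  {float(mean_dist[value]):.2f}")

print("s^2   P(S^2 = s^2)")
for value in sorted(var_dist):
    print(f"{float(value):5.0f}  {float(var_dist[value]):.2f}")
```

Each pair contributes its joint probability to one value of the sample mean and one value of the sample variance. Collecting pairs that share a value gives the two sampling distributions.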

6.3.1 The Distribution of the Sample Mean

The importance of the sample mean X̄ springs from its use in drawing conclusions about the
population mean µ. Some of the most frequently used inferential procedures are based on the
properties of the sampling distribution of X̄.

Proposition 6.1. Let X1, X2, ..., Xn be a random sample from a distribution with mean
value µ_X = µ and standard deviation σ_X = σ. Then,

(a) \( E(\bar{X}) = \mu_{\bar{X}} = \mu \),

(b) \( V(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \) and \( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \).

Example 6.10. Let X1, X2, ..., Xn be a random sample of size n taken from a population with
mean µ and variance σ². Let T0 = X1 + X2 + · · · + Xn. Find E(T0) and V(T0).

Example 6.11. Let X1, X2, ..., X25 be a random sample of size 25 taken from a population
with mean µ = 28,000 and standard deviation σ = 5000. Also let T0 = X1 + X2 + · · · + X25.

a. Find E(X̄) and σ_X̄.

b. Find E(T0) and σ_T0.
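
For a random sample, linearity of expectation gives E(T0) = nµ, and independence gives V(T0) = nσ², so σ_T0 = √n · σ. The sketch below applies these results, together with Proposition 6.1, to the numbers in Example 6.11; the variable names are illustrative.

```python
from math import sqrt

n = 25
mu = 28_000.0
sigma = 5_000.0

# Sample mean: E(Xbar) = mu and sd(Xbar) = sigma / sqrt(n).
mean_xbar = mu
sd_xbar = sigma / sqrt(n)

# Sample total: E(T0) = n * mu and sd(T0) = sqrt(n) * sigma.
mean_total = n * mu
sd_total = sqrt(n) * sigma

print(f"E(Xbar) = {mean_xbar:.0f}, sd(Xbar) = {sd_xbar:.0f}")
print(f"E(T0)   = {mean_total:.0f}, sd(T0)   = {sd_total:.0f}")
```

Averaging shrinks the standard deviation by a factor of √n, whereas summing inflates it by the same factor.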

6.3.2 The Central Limit Theorem (CLT)

Theorem 6.1. Let X1, X2, ..., Xn be a random sample of size n from a population
distribution with mean µ and variance σ². Then, regardless of the population distribution of
X1, X2, ..., Xn, if n is sufficiently large (typically n > 30), X̄ has approximately a normal
distribution with \( \mu_{\bar{X}} = \mu \) and \( \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \).
The total T0 = X1 + X2 + · · · + Xn also has approximately a normal distribution with
\( \mu_{T_0} = n\mu \) and \( \sigma_{T_0}^2 = n\sigma^2 \).

• There are population distributions for which even an n of 40 or 50 does not suffice, but
such distributions are rarely encountered in practice. On the other hand, the rule of
thumb is often conservative; for many population distributions, an n much less than 30
would suffice. For example, in the case of a uniform population distribution, the CLT
gives a good approximation for n ≥ 12.

• If the Xi's are normally distributed, so is X̄ for every sample size n.
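
The remark about the uniform distribution can be explored by simulation. The sketch below assumes a Uniform(0, 1) population (µ = 0.5, σ² = 1/12) and n = 12. It checks how often the sample mean falls within 1.96 standard deviations of µ, which a normal distribution would do about 95% of the time. The seed, the number of samples, and the function name are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import stdev


def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n from Uniform(0, 1) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.random() for _ in range(n)) / n for _ in range(num_samples)]


n = 12
mu = 0.5
sigma = sqrt(1 / 12)           # standard deviation of Uniform(0, 1)
sd_xbar = sigma / sqrt(n)      # theoretical standard deviation of the sample mean

means = simulate_sample_means(n, num_samples=20_000)
observed_sd = stdev(means)
coverage = sum(abs(m - mu) <= 1.96 * sd_xbar for m in means) / len(means)

print(f"theoretical sd of Xbar: {sd_xbar:.4f}")
print(f"simulated sd of Xbar:   {observed_sd:.4f}")
print(f"fraction within 1.96 sd of mu: {coverage:.3f}")
```

A coverage close to 0.95 is consistent with the normal approximation already working well at n = 12 for this population.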

Example 6.12. The amount of a particular impurity in a batch of a certain chemical product
is a random variable with mean value 4.0 g and standard deviation 1.5 g.

a. If 50 batches are independently prepared, what is the approximate probability that the
sample average amount of impurity X̄ is between 3.5 g and 3.8 g?

b. Now consider randomly selecting 100 batches, and let T0 represent the total amount of
impurity in these batches. Find the mean and standard deviation of T0.

c. Find the approximate probability that this total is at most 425 g.
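
Example 6.12 can be worked with the normal approximation supplied by the CLT. The sketch below uses NormalDist from Python's standard statistics module (available from Python 3.8); the variable names are illustrative, and the probabilities are approximations that rely on n being large.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 4.0, 1.5  # per-batch mean and standard deviation, in grams

# (a) n = 50: Xbar is approximately normal with mean mu and sd sigma / sqrt(n).
n_a = 50
xbar_dist = NormalDist(mu, sigma / sqrt(n_a))
p_a = xbar_dist.cdf(3.8) - xbar_dist.cdf(3.5)

# (b) n = 100: the total T0 has mean n * mu and sd sqrt(n) * sigma.
n_b = 100
mean_total = n_b * mu
sd_total = sqrt(n_b) * sigma

# (c) T0 is approximately normal, so P(T0 <= 425) comes from its cdf.
p_c = NormalDist(mean_total, sd_total).cdf(425)

print(f"(a) P(3.5 <= Xbar <= 3.8) ~ {p_a:.4f}")
print(f"(b) E(T0) = {mean_total:.0f} g, sd(T0) = {sd_total:.0f} g")
print(f"(c) P(T0 <= 425) ~ {p_c:.4f}")
```

Answers read from a printed normal table may differ slightly in the third or fourth decimal place because z-values are rounded to two decimals there.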

Example 6.13. Let X = the number of different people sent text messages during a particular
day by a randomly selected student at a large university. Suppose the mean value of X is 7
and the standard deviation is 6 (values very close to those reported in the article “Cell Phone
Use and Grade Point Average Among Undergraduate University Students”, College Student J.,
2011: 544–551). Among 100 randomly selected such students, how likely is it that the sample
mean number of different people texted exceeds 5?

Notice that the distribution being sampled is discrete, but the CLT is applicable whether the
variable of interest is discrete or continuous.
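
A minimal sketch of the calculation for Example 6.13, again assuming the CLT approximation is adequate at n = 100 even though X is discrete. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 7.0, 6.0, 100

# By the CLT, Xbar is approximately normal with mean mu and sd sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

z = (5 - mu) / (sigma / sqrt(n))      # standardized value of 5
p_exceeds_5 = 1 - xbar_dist.cdf(5)    # P(Xbar > 5)

print(f"z = {z:.2f}")
print(f"P(Xbar > 5) ~ {p_exceeds_5:.4f}")
```

The value 5 lies more than three standard deviations of X̄ below µ = 7, so under this approximation the sample mean is almost certain to exceed it.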

Exercises Sections 5.3 and 5.4 of textbook: 37, 38, 49, 51, 52, 53, 54
