Chapter 5 - Sampling and Sampling Distribution

This document summarizes key concepts about sampling and sampling distributions from Chapter 5: 1. Sampling can be done from finite or infinite populations, with or without replacement. Simple random sampling selects samples such that each possible sample has an equal chance of being selected. 2. Parameters describe populations while statistics describe samples. The sample mean and proportion are used to estimate the population mean and proportion. 3. The sampling distribution of a statistic like the sample mean describes how the statistic varies across all possible samples of the same size. It allows estimating how close a sample estimate is to the true population parameter.

Uploaded by

SANE KARIMI

Advanced Statistics

Chapter 5 - Sampling and Sampling Distribution

Statistics in Practice

Selecting a Sample
– Sampling from a finite population
• A finite population is a population of size N, where N is finite (i.e., not too
large for its elements to be listed)
• SIMPLE RANDOM SAMPLE (FINITE POPULATION)
– A simple random sample of size n from a finite population of size N is a
sample selected such that each possible sample of size n has the same
probability of being selected.
• Sampling from the finite population can be with/without replacement
– Sampling from an infinite population
• RANDOM SAMPLE (INFINITE POPULATION)
– A random sample of size n from an infinite population is a sample selected
such that the following conditions are satisfied.
– 1. Each element selected comes from the same population.
– 2. Each element is selected independently.
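
The finite-population definitions above can be sketched in Python. This is only an illustration: the population of 2500 numbered elements and the function name `simple_random_sample` are assumptions made for the example, not part of the original slides.

```python
import random

def simple_random_sample(population, n, replace=False, seed=None):
    """Draw n elements at random, with or without replacement."""
    rng = random.Random(seed)
    if replace:
        # With replacement: the same element may be drawn more than once.
        return rng.choices(population, k=n)
    # Without replacement: every subset of size n is equally likely.
    return rng.sample(population, n)

population = list(range(1, 2501))   # hypothetical IDs for N = 2500 elements
without = simple_random_sample(population, 30, replace=False, seed=1)
with_repl = simple_random_sample(population, 30, replace=True, seed=1)
print(without)
print(with_repl)
```

Sampling without replacement never repeats an element, whereas sampling with replacement can.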

– The random sample selection procedure (each element is selected independently) is
meant to prevent selection bias.
• Selection bias would also occur if the interviewer selected a group of five
customers who entered the restaurant together and asked all of them to
participate in the sample.
• Example: mean travel time of buses vs. cars, estimated from the Bluetooth MAC IDs of
travelers or vehicles.
– Infinite population:
• Examples include parts manufactured on a production line, repeated experimental trials
in a laboratory, transactions occurring at a bank, telephone calls arriving at a
technical support center, and customers entering a retail store.
• A common quality control application involves a production process where there is
no limit on the number of elements that can be produced.
• Consider the population of customers arriving at a fast-food restaurant. Suppose an
employee is asked to select and interview a sample of customers in order to develop a
profile of customers who visit the restaurant.

Infer Parameters From Statistics


[Figure: the parameter (population mean μ, the mean height of all 109 students) is
inferred from the statistic (the sample mean of 10 students).]

Parameters vs. Statistics


                   Parameter                          Statistic
Definition         A number describing a population   A number computed from a sample
Notation           (μ, σ)                             (x̄, s)
Characteristics    Fixed                              Varies from sample to sample

Behavior of Sample Mean


– If the goal is to infer the population mean of a quantitative variable (e.g.
student height) based on a sample, we need to understand how the sample
mean behaves.
[Figure: a sample of the heights of n students (X1, X2, ..., Xn) gives the statistic,
the sample mean, which is used to infer the parameter, the mean height μ.]

Sample Mean
– Imagine we take a sample of size n from a population and define random variables
X1, X2, ..., Xn representing the values that could be obtained. Then X1, X2, ..., Xn are
independent with the same mean (μ) and variance (σ²), and their mean (the sample
mean) is

$$\bar{x} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

– The sample mean is a random variable with mean μ and variance σ²/n. (Why?)

Y = cX
– E(Y) = c E(X)
– Var(Y) = c² Var(X)
Y = X1 + X2 + ... + Xn
– If X1, X2, ..., Xn are independent with the same mean (μ) and variance (σ²), then
– E(Y) = nμ
– Var(Y) = nσ²
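
The "why" follows from combining the two rules above with c = 1/n. A short derivation, assuming only that the X_i are independent with a common mean μ and variance σ²:

$$
E(\bar{x}) = E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\,(n\mu) = \mu,
\qquad
Var(\bar{x}) = \frac{1}{n^2}\,Var\!\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\,(n\sigma^2) = \frac{\sigma^2}{n}
$$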

– Let us begin by citing two examples in which sampling was used to answer a
research question about a population.
– 1. Members of a political party in Texas were considering supporting a
particular candidate for election to the U.S. Senate, and party leaders wanted
to estimate the proportion of registered voters in the state favoring the
candidate. A sample of 400 registered voters in Texas was selected and 160 of
the 400 voters indicated a preference for the candidate. Thus, an estimate of
the proportion of the population of registered voters favoring the candidate is
160/400 = 0.40.
– 2. A tire manufacturer is considering producing a new tire designed to provide
an increase in mileage over the firm’s current line of tires. To estimate the
mean useful life of the new tires, the manufacturer produced a sample of 120
tires for testing. The test results provided a sample mean of 36,500 miles.
Hence, an estimate of the mean useful life for the population of new tires was
36,500 miles.

Real Example:
– The director of personnel for Electronics Associates, Inc. (EAI), has been
assigned the task of developing a profile of the company’s 2500 managers. The
characteristics to be identified include the mean annual salary for the
managers and the proportion of managers having completed the company’s
management training program.
– Using the 2500 managers as the population for this study, we can find the
annual salary and the training program status for each individual by referring
to the firm’s personnel records. The data set containing this information for all
2500 managers in the population is in the file named EAI. 1500 of the 2500
managers completed the training program.
– Numerical characteristics of a population are called parameters
– Population mean: μ = $51,800
– Population standard deviation: σ = $4000
– Proportion of the population that completed the training program: p = 0.60

– Now, suppose that the necessary information on all the EAI managers was not
readily available in the company’s database. The question we now consider is
how the firm’s director of personnel can obtain estimates of the population
parameters by using a sample of managers rather than all 2500 managers in
the population.
– Suppose that a sample of 30 managers will be used. Clearly, the time and the
cost of developing a profile would be substantially less for 30 managers than
for the entire population. If the personnel director could be assured that a
sample of 30 managers would provide adequate information about the
population of 2500 managers, working with a sample would be preferable to
working with the entire population.

Point Estimation
– To estimate the value of a population parameter, we compute a
corresponding characteristic of the sample, referred to as a sample statistic.
– A simple random sample of 30 managers and the corresponding data on
annual salary and management training program participation are as shown
below:

Sampling Distribution
– In the preceding section we said that the sample mean x̄ is the point estimator
of the population mean μ, and the sample proportion p̄ is the point estimator of
the population proportion p.
– For the simple random sample of 30 EAI managers, the point estimate of μ is
x̄ = $51,814 and the point estimate of p is p̄ = 0.63. Suppose we select another
simple random sample of 30 EAI managers and obtain the following point
estimates:
• Sample mean: x̄ = $52,670 and sample proportion: p̄ = 0.70
– Now, suppose we repeat the process of selecting a simple random sample of
30 EAI managers over and over again, each time computing the values of x̄
and p̄.

Sampling distribution
– The sample mean x̄ is a random variable.
– x̄ has a mean or expected value, a standard deviation, and a probability
distribution.
– The various possible values of x̄ are the result of different simple random
samples.
– The probability distribution of x̄ is called the sampling distribution of x̄.
– Knowledge of this sampling distribution and its properties will enable us to
make probability statements about how close the sample mean is to the
population mean μ.

– For the x̄ values from 500 simple random samples, we note that the largest
concentration of the values, and the mean of the 500 values, is near the
population mean μ = $51,800.
– The sampling distribution of x̄ is the probability distribution of all possible
values of the sample mean x̄.
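
The repeated-sampling idea can be tried out with a short simulation. The sketch below is hedged: the EAI data file is not available here, so it builds a synthetic population of 2500 salaries that is assumed to be roughly normal with mean 51,800 and standard deviation 4,000, then draws 500 simple random samples of size 30.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical stand-in for the EAI file: 2500 salaries, assumed roughly normal.
population = [rng.gauss(51_800, 4_000) for _ in range(2500)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n, repetitions = 30, 500
sample_means = [
    statistics.fmean(rng.sample(population, n))   # one simple random sample
    for _ in range(repetitions)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"population mean     : {pop_mean:,.0f}")
print(f"mean of 500 x-bars  : {mean_of_means:,.0f}")
print(f"sd of 500 x-bars    : {sd_of_means:,.0f}")
print(f"sigma / sqrt(n)     : {pop_sd / n ** 0.5:,.0f}")
```

The 500 sample means cluster around the population mean, and their spread is close to σ/√n rather than σ.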

Sampling Distribution of x̄
1. Population has a normal distribution
• In many situations it is reasonable to assume that the population from which we
are selecting a random sample has a normal, or nearly normal, distribution. When
the population has a normal distribution, the sampling distribution of x̄ is normally
distributed for any sample size.
2. Population does not have a normal distribution
• When the population from which we are selecting a random sample does not have
a normal distribution, the central limit theorem is helpful in identifying the shape
of the sampling distribution of x̄.

CENTRAL LIMIT THEOREM

– In selecting random samples of size n from a population, the sampling
distribution of the sample mean x̄ can be approximated by a normal
distribution as the sample size becomes large.
– General statistical practice is to assume that, for most applications, the
sampling distribution of x̄ can be approximated by a normal distribution
whenever the sample size is 30 or more. In cases where the population is
highly skewed or outliers are present, samples of size 50 may be needed.

1. X follows normal distribution
[Figure: distributions of X, the height of students, and of x̄, the mean height of
5 students.]

2. X is non-normal
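
A quick simulation can illustrate the non-normal case. The sketch below assumes an exponential population with mean 1 (a strongly right-skewed choice made only for illustration) and tracks the skewness of the sample means as n grows; a value near 0 indicates a symmetric, more normal-looking shape.

```python
import random

def skewness(values):
    """Sample skewness; close to 0 for a symmetric, bell-shaped distribution."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return sum(((v - mean) / sd) ** 3 for v in values) / n

def sample_means(n, repetitions, rng):
    """Means of `repetitions` samples of size n from an exponential population (mean 1)."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(repetitions)]

rng = random.Random(0)
skew_by_n = {n: skewness(sample_means(n, 10_000, rng)) for n in (1, 5, 30)}
for n, skew in skew_by_n.items():
    print(f"n = {n:2d}: skewness of the sample means = {skew:.2f}")
```

The skewness shrinks as n increases, which is the central limit theorem at work, although at n = 30 a little skew is still visible for a population this asymmetric.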

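Expected value of x̄

$$E(\bar{x}) = \mu$$

[Figure: sampling distribution of x̄, centered at μ with standard deviation σ/√n.]

Standard deviation of x̄

Finite population:
$$\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}}\,\frac{\sigma}{\sqrt{n}}$$
Infinite population:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

– In many practical sampling situations, we find that the population involved,
although finite, is “large,” whereas the sample size is “small”. In such cases the
finite population correction factor $\sqrt{(N-n)/(N-1)}$ is close to 1, and
$\sigma_{\bar{x}} = \sigma/\sqrt{n}$ can be used whenever
• the population is infinite; or
• the population is finite and the sample size is less than or equal to 5% of the
population size; that is, n/N ≤ 0.05.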
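
As a rough check of the 5% rule, the sketch below computes the standard deviation of the sample mean for the EAI values (N = 2500, n = 30, σ = 4000) with and without the finite population correction. The function name `standard_error` is an assumption made for this example.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard deviation of the sample mean; N=None means an infinite population."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))   # finite population correction factor
    return se

sigma, n, N = 4000, 30, 2500                 # EAI values from the text
print(f"n/N                     = {n / N:.3f}")
print(f"without correction      = {standard_error(sigma, n):.1f}")
print(f"with correction (N=2500) = {standard_error(sigma, n, N):.1f}")
```

With n/N = 0.012 the correction changes the result by well under 1%, so using σ/√n is reasonable here.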

Central Limit Theorem (CLT)

– The distribution of the sample mean (x̄) approaches a normal distribution with
mean μ and standard deviation σ/√n as the sample size (n) becomes large (n > 30).
Equivalently,

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

– follows the standard normal distribution regardless of the distribution of X.

Standard error
– To compute $\sigma_{\bar{x}}$, we need to know σ, the standard deviation of the population.
– To further emphasize the difference between $\sigma_{\bar{x}}$ and σ, we refer to the
standard deviation of x̄, $\sigma_{\bar{x}}$, as the standard error of the mean.
– The term standard error refers to the standard deviation of a point estimator.
– The value of the standard error of the mean is helpful in determining how far
the sample mean may be from the population mean.

Practical Value of Sampling Distribution


– The practical reason we are interested in the sampling distribution of x̄ is that
it can be used to provide probability information about the difference
between the sample mean and the population mean.
– Suppose the personnel director believes the sample mean will be an
acceptable estimate of the population mean if the sample mean is within
$500 of the population mean.
– What is the probability that the sample mean computed using a simple
random sample of 30 EAI managers will be within $500 of the population
mean?

– With a population mean of $51,800, the personnel director wants to know the
probability that x̄ is between $51,300 and $52,300. This probability is given by
the darkly shaded area of the sampling distribution shown in the figure below.
– Because the sampling distribution is normally distributed, with mean 51,800
and standard error of the mean 730.3, we can use the standard normal
probability table to find the area or probability.
– We first calculate the z value at the upper and lower endpoints of the interval
(52,300 and 51,300, respectively):

$$z = \frac{52{,}300 - 51{,}800}{730.3} = 0.68, \qquad z = \frac{51{,}300 - 51{,}800}{730.3} = -0.68$$

$$P(51{,}300 \le \bar{x} \le 52{,}300) = P(-0.68 \le z \le 0.68) = 0.7517 - 0.2483 = 0.5034$$

– The preceding computations show that a simple random sample of 30 EAI
managers has a 0.5034 probability of providing a sample mean that is within
$500 of the population mean.
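
The same probability can be computed without a printed table. The sketch below uses the error function from Python's standard library for the normal cumulative probability; the helper name `normal_cdf` is an assumption made for this example.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 51_800, 4_000, 30            # EAI values from the text
se = sigma / math.sqrt(n)                   # standard error of the mean
z = 500 / se                                # z value for a $500 deviation

p_exact = normal_cdf(z) - normal_cdf(-z)
z_table = round(z, 2)                       # z rounded to two decimals, as in a table
p_table = normal_cdf(z_table) - normal_cdf(-z_table)

print(f"standard error = {se:.1f}, z = {z:.4f}")
print(f"P(within $500) = {p_exact:.4f} (unrounded z), {p_table:.4f} (z rounded to {z_table})")
```

Rounding z to two decimals, as a table lookup does, gives a value that agrees with the 0.5034 in the text to about three decimals; the unrounded z gives a slightly larger probability.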

Relationship between the sample size and the sampling distribution

– First note that E(x̄) = μ regardless of the sample size. Thus, the mean of all
possible values of x̄ is equal to the population mean μ regardless of the
sample size n.
– However, note that the standard error of the mean, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, is inversely
related to the square root of the sample size. Whenever the sample size is increased,
the standard error of the mean decreases.
– With a larger sample, x̄ therefore varies less and tends to lie closer to the
population mean.
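
To see the effect of sample size numerically, the sketch below compares the EAI sample of 30 with a hypothetical sample of 100 (the value n = 100 is an assumption chosen for illustration), using the same population standard deviation of $4000 and the same $500 tolerance.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

sigma, tolerance = 4_000, 500               # EAI values from the text
results = {}
for n in (30, 100):                         # n = 100 is a hypothetical larger sample
    se = sigma / math.sqrt(n)
    z = tolerance / se
    results[n] = (se, normal_cdf(z) - normal_cdf(-z))
    print(f"n = {n:3d}: standard error = {se:6.1f}, P(within $500) = {results[n][1]:.4f}")
```

The expected value stays at μ in both cases; only the spread of x̄ around it changes.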

Example 1 . Travel Time


– Suppose GRT has conducted a survey of 20 runs and recorded the travel time of
each run.

– What are the estimates of the mean and standard deviation of the transit travel
time?
– The sample mean of transit time is 19.7 min and the sample standard
deviation is 4.91 min. If GRT would like to use the sample mean (19.7 min) as
an estimate of the population mean (the true but unknown mean), how good is
this estimate?
– What is the true mean? How can we find it?
– If we survey another 20 runs, what would the estimate be? Would it be very
different from 19.7 min?
– What are the factors influencing the quality of the estimate?

Example 2. Averages of Rolling “N” Dice


– Consider the distribution of the average outcome from rolling n = 1, 2, ... dice

      n = 1              n = 2              n = 3
Average  Count     Average  Count     Average  Count
1        1         1        1         1.00     1
2        1         1.5      2         1.33     3
3        1         2        3         1.67     6
4        1         2.5      4         2.00     10
5        1         3        5         2.33     15
6        1         3.5      6         2.67     21
Total    6         4        5         3.00     25
                   4.5      4         3.33     27
                   5        3         3.67     27
                   5.5      2         4.00     25
                   6        1         4.33     21
                   Total    36        4.67     15
                                      5.00     10
                                      5.33     6
                                      5.67     3
                                      6.00     1
                                      Total    216

[Figure: Excel bar charts of the counts for n = 1, 2, and 3.]
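
The counts for the average of n dice can be reproduced by enumerating all 6^n equally likely outcomes. A minimal Python sketch, in which the function name `average_counts` is an assumption for the example and outcomes are grouped by their total so that no floating-point keys are needed:

```python
from collections import Counter
from itertools import product

def average_counts(n):
    """Count how many of the 6**n outcomes give each total (average = total / n)."""
    return Counter(sum(roll) for roll in product(range(1, 7), repeat=n))

for n in (1, 2, 3):
    counts = average_counts(n)
    print(f"n = {n} ({sum(counts.values())} outcomes)")
    for total in sorted(counts):
        print(f"  average {total / n:.2f}: {counts[total]}")
```

The flat distribution for one die becomes triangular for two dice and already bell-shaped for three.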

Example 3. Random Number Generation


– An Excel spreadsheet generates pseudo-random numbers using the function RAND().
Each random number follows a uniform distribution between 0 and 1. Suppose 12
numbers are generated using RAND().
– (1) Determine the mean and standard deviation for the mean of those 12
random numbers.
– (2) What distribution does the mean follow (approximately)?
– (3) Find the probability that the mean of the 12 random numbers is
• (a) < 0.4
• (b) > 0.65

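Solution of Example 3
– (1) For a uniform distribution between 0 and 1, μ = 0.5 and σ² = 1/12. For the
mean of n = 12 numbers:

$$E(\bar{x}) = \mu = 0.5, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{1/12}{12}} = \frac{1}{12} \approx 0.0833$$

– (2) The CLT rule of thumb does not cover a small sample (n < 30), so strictly the
answer to this question is that “the distribution of the sample mean is
unknown”. However, for a uniformly distributed population, it has been
found by simulation (see Notes about CLT) that the mean of even a small
sample (n > 5) follows an approximately normal distribution. Therefore, in this
case, x̄ is approximately normal with mean 0.5 and standard deviation 0.0833.

– (3)
(a) $$P(\bar{x} < 0.4) = P\left(z < \frac{0.4 - 0.5}{0.0833}\right) = P(z < -1.2) = 0.1151$$

(b) $$P(\bar{x} > 0.65) = P\left(z > \frac{0.65 - 0.5}{0.0833}\right) = P(z > 1.8) = 0.0359$$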
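
The two probabilities can be checked numerically. The sketch below assumes the normal approximation for the mean of 12 uniform numbers, and adds a small Monte Carlo run, with Python's `random.random()` standing in for Excel's RAND(), to see how well that approximation holds.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative probability via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n = 12
mu = 0.5                          # mean of a Uniform(0, 1) variable
sigma = math.sqrt(1 / 12)         # standard deviation of a Uniform(0, 1) variable
se = sigma / math.sqrt(n)         # standard deviation of the mean of 12 numbers

p_a = normal_cdf((0.4 - mu) / se)           # P(mean < 0.4), normal approximation
p_b = 1 - normal_cdf((0.65 - mu) / se)      # P(mean > 0.65), normal approximation

# Monte Carlo check: simulate the mean of 12 uniform numbers many times.
rng = random.Random(1)
repetitions = 20_000
means = [sum(rng.random() for _ in range(n)) / n for _ in range(repetitions)]
sim_a = sum(m < 0.4 for m in means) / repetitions
sim_b = sum(m > 0.65 for m in means) / repetitions

print(f"standard deviation of the mean = {se:.4f}")
print(f"(a) normal approx = {p_a:.4f}, simulated = {sim_a:.4f}")
print(f"(b) normal approx = {p_b:.4f}, simulated = {sim_b:.4f}")
```

The simulated shares should land close to the normal-approximation values, which supports treating the mean of 12 uniform numbers as approximately normal.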



Sampling Distribution of the Mean (σ Unknown)

– What if the sample is not large enough…
Statistics of Normal Distributions
– Let X1, X2, ..., Xn be independent variables that are all normal with the same mean
(μ) and variance (σ²); then

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{x})^2$$

– are both random variables.

Theorem on t Distribution
– Define the following random variable

$$T = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

– Then T has a Student t distribution with (n − 1) degrees of freedom (df = n − 1),
for any sample size.
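
A small simulation shows why replacing σ with s matters for small samples. The sketch below assumes a normal population with illustrative values (μ = 10, σ = 2, n = 5, none of which come from the slides) and measures how often |T| exceeds 1.96, a cutoff that a standard normal variable exceeds only 5% of the time.

```python
import math
import random

def t_statistic(sample, mu):
    """T = (sample mean - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - mu) / (s / math.sqrt(n))

rng = random.Random(7)
mu, sigma, n, repetitions = 10.0, 2.0, 5, 20_000    # illustrative values only

t_values = [
    t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
    for _ in range(repetitions)
]
tail = sum(abs(t) > 1.96 for t in t_values) / repetitions
print(f"share of |T| > 1.96 with n = {n}: {tail:.3f} (a standard normal gives 0.050)")
```

The share is noticeably larger than 0.05, reflecting the heavier tails of the t distribution with 4 degrees of freedom.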

Central Limit Theorem (CLT)

– If n > 30, the Student t distribution can be approximated by the normal
distribution, i.e.,

$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

– follows approximately the standard normal distribution, where s is the sample
standard deviation of any available sample.

Example 4
The customer waiting times at a certain post office are assumed to be normally
distributed. A co-op student will be monitoring 30 noontime customers, timing
their arrivals and service with a watch. As closely as you can, find an interval
bracketing the probability that she will observe a deviation from the mean in
waiting time exceeding 2 minutes considering the following:
(a) σ = 6.3 minutes
(b) s = 6.3 minutes (unknown standard deviation)

P = 0.6467

P = 0.6453

Summary - Distribution of the Sample Mean

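– Do we know σ²?
• Yes, and the data are normal: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• Yes, but the data are not normal: if n > 30, $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ (Central Limit Theorem); otherwise ?
• No, and the data are normal: $t_{n-1} = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ (n − 1 degrees of freedom)
• No, and the data are not normal: if n > 30, $z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ (Central Limit Theorem); otherwise ?

? No simple solution; therefore outside the scope of this course.
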

Behavior of Sample Proportion


– If the goal is to infer the population proportion of a categorical variable (e.g.
soft drink preference) based on a sample, we need to understand how the
sample proportion behaves.
[Figure: a sample of n students (X1, X2, ..., Xn) gives the statistic, the sample
proportion p*, which is used to infer the parameter, the proportion p of students
who like Coke.]

Central Limit Theorem (CLT)


– The sample proportion (p*) approaches a normal distribution with mean p
and variance p(1 − p)/n as the sample size (n) becomes very large (np > 10 and
n(1 − p) > 10). Equivalently,

$$z = \frac{p^* - p}{\sqrt{p(1-p)/n}}$$

– follows the standard normal distribution.

Recall
Bernoulli Distribution (Expectation and Variance of X)
E(X) = p
Var(X) = p(1 − p)
Because p* is the mean of n independent Bernoulli variables, E(p*) = p and
Var(p*) = p(1 − p)/n.

Example 5:
– For the EAI managers, μ is $51,800 and p is 0.60. Suppose we select
another simple random sample of 30 EAI managers and obtain the point
estimate p̄ = 0.70. What is the probability of obtaining a sample proportion
this large or smaller?

$$z = \frac{0.70 - 0.60}{\sqrt{0.60\,(1 - 0.60)/30}} = \frac{0.10}{0.0894} = 1.118$$

P(p̄ ≤ 0.70) = P(z ≤ 1.118) = 0.868
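
The value 0.868 matches the probability that the sample proportion is at most 0.70 under the normal approximation. A minimal Python sketch of that calculation, under that reading of the question (the helper name `normal_cdf` is an assumption for the example):

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p, n, p_bar = 0.60, 30, 0.70              # EAI values from the example
se = math.sqrt(p * (1 - p) / n)           # standard error of the sample proportion
z = (p_bar - p) / se
prob = normal_cdf(z)                      # P(sample proportion <= 0.70)

print(f"standard error = {se:.4f}, z = {z:.3f}, P = {prob:.3f}")
```

Here np = 18 and n(1 − p) = 12, so the normal approximation is usually considered acceptable.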

Sampling Distribution of the Sample Variance


– Let X1, X2, ..., Xn be independent variables that are all normal with the same mean
(μ) and variance (σ²); then

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{x})^2$$

– are both random variables.

Theorem on χ² Distribution
– Provided that the data come from (at least approximately) a normal population,
the pivotal statistic

$$\chi^2 = \frac{(n-1)\,s^2}{\sigma^2}$$

follows a chi-squared distribution with (n − 1) “degrees of freedom” (for any
sample size!).


Example 6:
– A strain gage has a measurement accuracy of 2 mm (standard deviation). It is used
to measure the deformation of 30 concrete slabs. What is the probability that the
standard deviation of these measurements is at most 2.2 mm?

$$\chi^2 = \frac{(n-1)\,s^2}{\sigma^2} = \frac{29\,(2.2)^2}{(2)^2} = 35.09$$

P(s ≤ 2.2) = P(χ² ≤ 35.09) = 0.798 (29 degrees of freedom)
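
The value 0.798 corresponds to the probability that the sample standard deviation is at most 2.2 mm. Python's standard library has no chi-squared distribution function, so the sketch below checks the result by simulation instead, assuming normally distributed measurement errors with σ = 2 mm.

```python
import math
import random

def sample_sd(values):
    """Sample standard deviation with the (n - 1) divisor."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

sigma, n, threshold = 2.0, 30, 2.2          # values from the example
chi2 = (n - 1) * threshold ** 2 / sigma ** 2

rng = random.Random(3)
repetitions = 20_000
hits = sum(
    sample_sd([rng.gauss(0.0, sigma) for _ in range(n)]) <= threshold
    for _ in range(repetitions)
)
share = hits / repetitions

print(f"chi-squared statistic = {chi2:.2f} with {n - 1} degrees of freedom")
print(f"simulated P(s <= {threshold}) = {share:.3f}")
```

The simulated share should fall near 0.80, in line with the chi-squared probability quoted in the example.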

Review Exercise
1) A quality control process accepts or rejects each batch of 0.5" steel rods based
on the test results of a random sample of 100. A batch is acceptable if the mean
diameter of a sample from it falls between 0.4995" and 0.5005"; otherwise, it is
rejected. Previous evaluations have established that the standard deviation for
individual rod diameters is 0.003".
• (a) What is the probability that a batch of steel rods that have a mean diameter of
0.5003" will be accepted?
• (b) What is the probability that a near-perfect batch having μ = 0.4999" (which
means it should be accepted) will be rejected?

2) A civil engineer has computed the following results for the strength of certain
materials from 20 specimens: x̄ = 31.4 MPa, s = 2.85 MPa. Determine the
approximate probability of getting a result this rare or rarer (this large or larger)
if the true mean strength is 29 MPa.

3) A civil engineer claims that the true mean strength of a given material is 29
MPa. To check this claim he tested 20 specimens and computed the following
results: x̄ = 31.4 MPa, s = 2.85 MPa. If the computed t-value from this sample is
between −t0.05 and t0.05, he is satisfied with his claim. What would be his
conclusion?

4) The customer waiting times at a certain post office are assumed to be
normally distributed, with unknown mean and standard deviation. A co-op
student will be monitoring 25 noontime customers, timing their arrivals and
service with a watch. As closely as you can, find an interval bracketing the
probability that she will observe a standard deviation in waiting time exceeding 5
minutes when the true value of the population standard deviation is:
(a) σ= 4.3 minutes
(b) σ= 6.0 minutes.

5) If each strand in a rope has a breaking strength, with mean 100 N and standard
deviation 10 N, and the breaking strength of a rope is the sum of the
(independent) breaking strengths of all the strands, what is the probability that a
rope made up of 64 strands will support a weight of 6300 N?

6) Dana conducted a survey to measure the height of students, with the results
listed below. Take 20 observations at random from the following 30 observations
and repeat this process 5 times. Answer the following questions.
190 162 165 182 160 178 159 167 182 165
175 185 185 169 170 173 198 193 155 180
175 180 178 186 178 174 175 169 187 178

(a) Calculate the Z value under each of the following assumptions:
– σ is known
– σ is unknown
(b) Calculate the t value under the same assumptions and compare it with the
results of (a). What can you conclude?
(c) What does Z mean? What can you infer from the Z value?

7) The following are the transportation modes that students use to travel between
their residence and the AUT campus in summer and winter. Answer the following
questions for the different modes.
(a) How many samples should you take to be able to use the CLT?
(b) Use the number you found in part (a) to calculate the Z value.
(c) How can you interpret this Z value?
(d) Compare the results for summer and winter.
Summer
Walk Walk Walk Bike Walk Walk Walk Walk Bike Walk Drive Drive Walk Bus
Walk Walk Walk Bike Bus Bike Walk Walk Walk Walk Bike Bus Drive Bike
Walk Walk Walk Walk Walk Walk Walk Walk Walk Walk Bus Walk Walk Walk
Bus Bus Bike Bus Bike Walk Bus Bike Walk Bike Bike Walk Walk Walk
Walk Walk Walk Bike Walk Bike Walk Bus Walk Drive Bus Walk Drive Bike
Walk Bus Walk Walk Walk Walk Walk Walk Walk Bus Bus Walk Walk Bus
Bike Walk Walk Drive Walk Bus Bus Walk

Winter
Walk Bus Drive Bus Walk Bus Walk Bus Bike Bus Drive Walk Bus Bus
Walk Bus Bus Bus Walk Walk Walk Walk Walk Bike Bus Walk Bus Walk
Bus Drive Others Drive Bus Walk Walk Walk Bus Walk Bus Bus Bus Bus
Bus Walk Walk Bus Walk Walk Bus Bus Walk Bus Walk Bus Walk Walk
Bike Bus Walk Bus Walk Drive Bus Walk Drive Bus Bus Bus Walk Walk
Bus Walk Bus Walk Bus Bus Walk Walk Bus Bus Bus Walk Drive Walk
Walk Walk Bus Bus Bus Bus Bus Walk

8) A student took 24 concrete samples and performed a compressive strength test on
each. The results are presented below in MPa. Her supervisor insisted that the tests
be done at a high level of accuracy and precision and restricted her to a standard
deviation of no more than 2 MPa for the results. What is the probability that the
samples cannot meet this restriction?
26.98 33.52 31.56 34.04 32.82 34.43 28.78 27.55

34.18 33.42 32.4 27.47 29.69 27.37 26.36 32.22

33.98 34.64 31.04 28.66 28.7 34.77 28.8 26.59

9) The wall thickness of 25 glass 2-liter bottles was measured by a quality-control
engineer. The sample mean was 4.05 millimeters, and the sample standard
deviation was 0.08 millimeter. Find a 95% lower confidence bound for mean wall
thickness. Interpret the interval you have obtained. Based on the confidence
interval you just built, do you agree with the statement that the mean of the
thickness is less than 4.0 millimeters?

10) A water distribution subsystem consists of pipes AB, BC, and AC as shown in
the figure below. Because of differences in elevation and in hydraulic head loss in
the pipes and associated uncertainties, the capacity of each pipe (which is
defined as the maximum rate of flow) is given as follows, in cfs (cubic feet per
second):
AB: capacity is normal with mean 5 and coefficient of variation 0.1 (coefficient of
variation = standard deviation/mean)
BC: capacity is uniformly distributed between 2 and 8
AC: capacity is equal to 8 or 9 with equal likelihood
(1) Determine the probability that the capacity of the branch ABC will exceed 4 cfs.
(Hint: Define this event as a combination of the events related to the capacity of AB
and BC.)
(2) Determine the probability that the total capacity of the subsystem shown above
will exceed 13 cfs. (Hint: Use conditional probability.)

[Figure: pipe network with nodes A, B, and C.]

11) Consider the class (or all students who participated in the survey) as a population;
you are interested in the students’ average height. Let X = the height of a randomly
selected student in cm. Use the survey data to answer the following questions:
a) Determine the (population) mean, standard deviation, and probability distribution of X.
b) Following a), with the known population parameters (mean and variance of X), imagine
you pick a sample of 5 students at random (N = 5) and let Y = the average height of these
sampled students. Determine the mean and standard deviation of Y. What distribution do
you expect Y to follow?
c) Repeat b) for N = 10 and 30. What patterns do you observe (or how do the mean,
standard deviation and distribution of Y change with the sample size N)?
d) With the information about the population, if N = 30, what is the probability that the
difference between Y (sample mean) and the population mean (true) is less than 2 cm?
e) With the information about the population, if N = 5, what is the probability that the
difference between Y (sample mean) and the population mean (true) is less than 2 cm? You
may assume that the population is normally distributed.
f) With the information about the population, if we want to make sure that the probability
that the difference between Y (sample mean) and the population mean (true) is less than 1
cm is over 95%, how many students should we sample?

