Engineering Probability & Statistics

This document discusses key concepts in probability and statistics, including sampling distributions, population parameters, sample statistics, and estimators. It explains that sample statistics can be used to estimate population parameters, and discusses important properties of good estimators like unbiasedness. Different sampling methods are also introduced, along with how the distribution of sample means relates to the population distribution.

Engineering

Probability & Statistics


Lecture 6
5 LEARNING OBJECTIVES
After studying this chapter you should be able to:
• Take random samples from populations
• Distinguish between population parameters and sample statistics
• Apply the central limit theorem
• Derive sampling distributions of sample means and proportions
• Explain why sample statistics are good estimators of population parameters
• Judge one estimator as better than another based on desirable properties of estimators
• Apply the concept of degrees of freedom
• Identify special sampling methods
Using Statistics
• Statistical Inference:
  - Predict and forecast values of population parameters...
  - Test hypotheses about values of population parameters...
  - Make decisions...
  On the basis of sample statistics derived from limited and incomplete sample information.
• Make generalizations about the characteristics of a population... on the basis of observations of a sample, a part of a population.
Literary Digest Poll (1936)

What is wrong with the Poll?
5-2 Sample Statistics as Estimators of
Population Parameters
• Sample statistic? - a numerical measure of a summary characteristic of a sample.
• Population parameter? - a numerical measure of a summary characteristic of a population.

• An estimator of a population parameter is a sample


statistic used to estimate or predict the population
parameter.
• An estimate of a parameter is a particular numerical value
of a sample statistic obtained through sampling.
• A point estimate is a single value used as an estimate of a
population parameter.
How about interval
estimate?
Estimators

• The sample mean, $\bar{X}$, is the most common estimator of the population mean, $\mu$.
• The sample variance, $s^2$, is the most common estimator of the population variance, $\sigma^2$.
• The sample standard deviation, $s$, is the most common estimator of the population standard deviation, $\sigma$.
• The sample proportion, $\hat{p}$, is the most common estimator of the population proportion, $p$.
Population and Sample Proportions

• The population proportion, $p$, is equal to the number of elements in the population belonging to the category of interest, $X$, divided by the total number of elements in the population, $N$:

$$p = \frac{X}{N}$$

• The sample proportion, $\hat{p}$, is the number of elements in the sample belonging to the category of interest, $x$, divided by the sample size, $n$:

$$\hat{p} = \frac{x}{n}$$
Sampling Methods
• Stratified sampling: in stratified sampling, the population is partitioned into two or more subpopulations called strata, and from each stratum a sample of the desired size is selected at random.
• Cluster sampling: in cluster sampling, the population is divided into groups called clusters, a random sample of the clusters is selected, and then samples from these selected clusters are obtained.
• Systematic sampling: in systematic sampling, we start at a random point in the sampling frame, and from this point select every kth value in the frame to form the sample.

Source: https://www.phamduytung.com/blog/2019-05-04-sampling-method/
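The three methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame of 100 unit IDs, the three groups "A", "B", "C", and the function names are all hypothetical, and the systematic sampler starts at a random point among the first k items, which is one common way to choose the random start.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw n_per_stratum items at random from every stratum."""
    return {name: rng.sample(items, n_per_stratum) for name, items in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Pick some clusters at random, then sample only within the picked ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

def systematic_sample(frame, k, rng):
    """Start at a random point among the first k items, then take every kth item."""
    start = rng.randrange(k)
    return frame[start::k]

rng = random.Random(6)                # fixed seed so the run is repeatable
frame = list(range(1, 101))           # hypothetical sampling frame of 100 unit IDs
groups = {"A": frame[:40], "B": frame[40:70], "C": frame[70:]}  # hypothetical groups

strat = stratified_sample(groups, 5, rng)   # 5 units from each of the 3 strata
clust = cluster_sample(groups, 2, 5, rng)   # 5 units from each of 2 random clusters
syst = systematic_sample(frame, 10, rng)    # every 10th unit from a random start
print(strat, clust, syst, sep="\n")
```

Note the difference in coverage: the stratified sample contains units from every group, while the cluster sample contains units only from the groups that happened to be selected.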
Population Distribution, Random Sample
from Population, and Their Means

How do you observe the relationship between means of sample & population?
5-3 Sampling Distribution

• The sampling distribution of $\bar{X}$ is the probability distribution of all possible values the random variable $\bar{X}$ may assume when a sample of size n is taken from a specified population.
Sampling Distribution (Continued)

Uniform population of integers from 1 to 8 (N=8)

$X$     $P(X)$   $X P(X)$   $X-\mu$   $(X-\mu)^2$   $P(X)(X-\mu)^2$
1       0.125    0.125      -3.5      12.25         1.53125
2       0.125    0.250      -2.5       6.25         0.78125
3       0.125    0.375      -1.5       2.25         0.28125
4       0.125    0.500      -0.5       0.25         0.03125
5       0.125    0.625       0.5       0.25         0.03125
6       0.125    0.750       1.5       2.25         0.28125
7       0.125    0.875       2.5       6.25         0.78125
8       0.125    1.000       3.5      12.25         1.53125
Total   1.000    4.500                              5.25000

[Figure: Uniform Distribution (1,8), $P(X)$ against $X$]

$E(X) = \mu = 4.5$
$V(X) = \sigma^2 = 5.25$
$SD(X) = \sigma = 2.2913$
Sampling Distribution (Continued)

• There are 8*8 = 64 different but equally-likely samples of size 2 (n=2) that can be drawn (with replacement) from a uniform population of the integers, 1 to 8.
• Each of these samples has a sample mean. For example, the mean of the sample (1,4) is 2.5, and the mean of the sample (8,4) is 6.

Samples of Size 2 from Uniform (1,8)
     1    2    3    4    5    6    7    8
1   1,1  1,2  1,3  1,4  1,5  1,6  1,7  1,8
2   2,1  2,2  2,3  2,4  2,5  2,6  2,7  2,8
3   3,1  3,2  3,3  3,4  3,5  3,6  3,7  3,8
4   4,1  4,2  4,3  4,4  4,5  4,6  4,7  4,8
5   5,1  5,2  5,3  5,4  5,5  5,6  5,7  5,8
6   6,1  6,2  6,3  6,4  6,5  6,6  6,7  6,8
7   7,1  7,2  7,3  7,4  7,5  7,6  7,7  7,8
8   8,1  8,2  8,3  8,4  8,5  8,6  8,7  8,8

Sample Means from Uniform (1,8), n = 2
     1    2    3    4    5    6    7    8
1   1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5
2   1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0
3   2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5
4   2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0
5   3.0  3.5  4.0  4.5  5.0  5.5  6.0  6.5
6   3.5  4.0  4.5  5.0  5.5  6.0  6.5  7.0
7   4.0  4.5  5.0  5.5  6.0  6.5  7.0  7.5
8   4.5  5.0  5.5  6.0  6.5  7.0  7.5  8.0
Sampling Distribution (Continued)
The probability distribution of the sample mean is called the sampling distribution of the sample mean.

Sampling Distribution of the Mean

$\bar{X}$   $P(\bar{X})$   $\bar{X} P(\bar{X})$   $\bar{X}-\mu_{\bar{X}}$   $(\bar{X}-\mu_{\bar{X}})^2$   $P(\bar{X})(\bar{X}-\mu_{\bar{X}})^2$
1.0     0.015625   0.015625   -3.5   12.25   0.191406
1.5     0.031250   0.046875   -3.0    9.00   0.281250
2.0     0.046875   0.093750   -2.5    6.25   0.292969
2.5     0.062500   0.156250   -2.0    4.00   0.250000
3.0     0.078125   0.234375   -1.5    2.25   0.175781
3.5     0.093750   0.328125   -1.0    1.00   0.093750
4.0     0.109375   0.437500   -0.5    0.25   0.027344
4.5     0.125000   0.562500    0.0    0.00   0.000000
5.0     0.109375   0.546875    0.5    0.25   0.027344
5.5     0.093750   0.515625    1.0    1.00   0.093750
6.0     0.078125   0.468750    1.5    2.25   0.175781
6.5     0.062500   0.406250    2.0    4.00   0.250000
7.0     0.046875   0.328125    2.5    6.25   0.292969
7.5     0.031250   0.234375    3.0    9.00   0.281250
8.0     0.015625   0.125000    3.5   12.25   0.191406
Total   1.000000   4.500000                  2.625000

[Figure: Sampling Distribution of the Mean, $P(\bar{X})$ against $\bar{X}$]

$E(\bar{X}) = \mu_{\bar{X}} = 4.5$
$V(\bar{X}) = \sigma^2_{\bar{X}} = 2.625$
$SD(\bar{X}) = \sigma_{\bar{X}} = 1.6202$
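The sampling distribution of the mean for this example can be reproduced by enumerating all 64 samples. The sketch below is one way to do it in Python (the variable names are illustrative, not from the lecture); exact fractions are used so that no rounding enters the comparison.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = range(1, 9)   # integers 1..8, each with probability 1/8
n = 2                      # sample size

# Population mean and variance
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# All 8**2 = 64 equally likely ordered samples, drawn with replacement
samples = list(product(population, repeat=n))
counts = Counter(Fraction(sum(s), n) for s in samples)
dist = {xbar: Fraction(c, len(samples)) for xbar, c in counts.items()}

# Mean and variance of the sampling distribution of the sample mean
mean_xbar = sum(xbar * p for xbar, p in dist.items())
var_xbar = sum(p * (xbar - mu) ** 2 for xbar, p in dist.items())

print(float(mu), float(sigma2))            # 4.5 5.25
print(float(mean_xbar), float(var_xbar))   # 4.5 2.625
print(float(dist[Fraction(9, 2)]))         # 0.125
```

The variance of the sample mean, 2.625, is exactly the population variance 5.25 divided by n = 2.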
Properties of the Sampling Distribution of
the Sample Mean
• Comparing the population distribution and the sampling distribution of the mean:
  - The sampling distribution is more bell-shaped and symmetric.
  - Both have the same center.
  - The sampling distribution of the mean is more compact, with a smaller variance.

[Figure: Uniform Distribution (1,8), $P(X)$ against $X$]
[Figure: Sampling Distribution of the Mean, $P(\bar{X})$ against $\bar{X}$]
Nonnormal Population Distribution and
Normal Sampling Distribution of Sample
Mean When a Large Sample is Used
Relationships between Population Parameters and the
Sampling Distribution of the Sample Mean

The expected value of the sample mean is equal to the population mean:

$$E(\bar{X}) = \mu_{\bar{X}} = \mu$$

The variance of the sample mean is equal to the population variance divided by the sample size n:

$$V(\bar{X}) = \sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$$

The standard deviation of the sample mean, known as the standard error of the mean, is equal to the population standard deviation divided by the square root of the sample size:

$$SD(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
The Central Limit Theorem
Effects of Central Limit Theorem
The Central Limit Theorem
Mercury makes a 2.4 liter V-6 engine, the Laser XRi, used in
speedboats. The company’s engineers believe the engine delivers
an average power of 220 horsepower and that the standard
deviation of power delivered is 15 HP. A potential buyer intends to
sample 100 engines (each engine is to be run a single time). What
is the probability that the sample mean will be less than 217 HP?
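One way to work this example, assuming n = 100 is large enough for the central limit theorem to justify a normal sampling distribution for the sample mean, is to standardize 217 HP using the standard error of the mean. The sketch below does this in Python; the helper `normal_cdf` is an illustrative function built on the standard library's `math.erf`, not something from the lecture.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 220.0, 15.0, 100          # values given in the problem
standard_error = sigma / math.sqrt(n)    # 15 / 10 = 1.5
z = (217.0 - mu) / standard_error        # (217 - 220) / 1.5 = -2.0
prob = normal_cdf(z)                     # P(sample mean < 217)

print(standard_error, z, round(prob, 4))   # 1.5 -2.0 0.0228
```

So, under these assumptions, there is only about a 2.3% chance of observing a sample mean below 217 HP if the engineers' figures are correct.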
Sampling from a Normal Population
When sampling from a normal population with mean $\mu$ and standard deviation $\sigma$, the sample mean, $\bar{X}$, has a normal sampling distribution:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

This means that, as the sample size increases, the sampling distribution of the sample mean remains centered on the population mean, but becomes more compactly distributed around that population mean.
Student’s t Distribution
If the population standard deviation, $\sigma$, is unknown, replace $\sigma$ with the sample standard deviation, $s$. If the population is normal, the resulting statistic:

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

has a t distribution with (n-1) degrees of freedom.
• The t is a family of bell-shaped and symmetric distributions, one for each number of degrees of freedom.
• The expected value of t is 0.
• The variance of t is greater than 1, but approaches 1 as the number of degrees of freedom increases. The t is flatter and has fatter tails than does the standard normal.
• The t distribution approaches a standard normal as the number of degrees of freedom increases.

[Figure: density curves of the standard normal, t with df=20, and t with df=10]
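The claim about the variance can be made concrete. The slides do not give a formula, but a standard result is that a t distribution with df > 2 degrees of freedom has variance df / (df - 2). The short Python sketch below (the function name is illustrative) evaluates it for a few values of df.

```python
def t_variance(df):
    """Variance of Student's t with df degrees of freedom (finite only for df > 2)."""
    if df <= 2:
        raise ValueError("the variance of t is finite only for df > 2")
    return df / (df - 2)

for df in (10, 20, 100, 1000):
    print(df, round(t_variance(df), 4))
# 10 1.25
# 20 1.1111
# 100 1.0204
# 1000 1.002
```

Every value exceeds 1, and the excess shrinks toward 0 as df grows, which matches the statement that t approaches the standard normal.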
The Sampling Distribution of the Sample
Proportion, p̂
+ The sampling distribution of the sample proportion is based on the binomial distribution with parameters n and p, where n is the sample size and p is the population proportion.

+ The sample proportion is the proportion of successes in n binomial trials. It is the number of successes, X, divided by the number of trials, n.

Sample proportion:
$$\hat{p} = \frac{X}{n}$$

As the sample size, n, increases, the sampling distribution of $\hat{p}$ approaches a normal distribution with mean $p$ and standard deviation

$$\sqrt{\frac{p(1-p)}{n}}$$
Sample proportion
In recent years, convertible sports coupes have become very popular in Japan. Toyota is currently shipping Celicas to Los Angeles, where a customizer does a roof lift and ships them back to Japan. Suppose that 25% of all Japanese in a given income and lifestyle category are interested in buying Celica convertibles. A random sample of 100 Japanese consumers in the category of interest is to be selected. What is the probability that at least 20% of those in the sample will express an interest in a Celica convertible?

n = 100, p = .25
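A possible solution uses the normal approximation to the sampling distribution of the sample proportion, with mean p and standard deviation sqrt(p(1-p)/n). The Python sketch below assumes that approximation is adequate for n = 100 and applies no continuity correction; `normal_cdf` is an illustrative helper built on `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 100, 0.25                             # values given in the problem
std_error = math.sqrt(p * (1.0 - p) / n)     # about 0.0433
z = (0.20 - p) / std_error                   # about -1.15
prob = 1.0 - normal_cdf(z)                   # P(sample proportion >= 0.20)

print(round(std_error, 4), round(z, 2), round(prob, 3))
```

Under these assumptions the probability comes out at roughly 0.88, so a sample proportion of at least 20% is quite likely when the population proportion is 25%.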
5-4 Estimators and Their Properties
An estimator of a population parameter is a sample statistic used to estimate the parameter. The most commonly used estimator of the:

Population Parameter                       Sample Statistic
Mean ($\mu$)                      is the   Mean ($\bar{X}$)
Variance ($\sigma^2$)             is the   Variance ($s^2$)
Standard Deviation ($\sigma$)     is the   Standard Deviation ($s$)
Proportion ($p$)                  is the   Proportion ($\hat{p}$)

• Desirable properties of estimators include:
  - Unbiasedness
  - Efficiency
  - Consistency
  - Sufficiency
Unbiased and Biased Estimators

An unbiased estimator is on target on average.
A biased estimator is off target on average.

[Figure: an unbiased and a biased estimator, with the bias marked on the biased one]
Consistency and Sufficiency

An estimator is said to be consistent if its probability of being close to the parameter it estimates increases as the sample size increases.

[Figure: Consistency, comparing n = 10 and n = 100]

An estimator is said to be sufficient if it contains all the information in the data about the parameter it estimates.
Properties of the Sample Variance

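The sample variance (the sum of the squared deviations from the sample mean divided by (n-1)) is an unbiased estimator of the population variance:

$$E(s^2) = E\left(\frac{\sum (x - \bar{x})^2}{n-1}\right) = \sigma^2$$

Dividing by n instead of (n-1) gives an estimator that is biased downward:

$$E\left(\frac{\sum (x - \bar{x})^2}{n}\right) < \sigma^2$$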
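Both statements can be checked exactly on the lecture's uniform population of the integers 1 to 8 by averaging each estimator over all 64 equally likely samples of size 2. The Python sketch below is illustrative (the names are not from the lecture) and uses exact fractions.

```python
from fractions import Fraction
from itertools import product

population = range(1, 9)                        # population variance is 5.25
n = 2
samples = list(product(population, repeat=n))   # 64 equally likely samples

def sum_sq_dev(sample):
    """Sum of squared deviations of the sample values from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

# Expected value of each estimator = its average over all possible samples
mean_s2 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
mean_biased = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(float(mean_s2), float(mean_biased))   # 5.25 2.625
```

Dividing by (n-1) recovers the population variance 5.25 on average, while dividing by n gives 2.625, which is too small.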
Degrees of freedom

+ You are asked to choose 10 numbers. You have the freedom to choose the 10 numbers as you please -> you have 10 degrees of freedom.

+ What happens if a condition is imposed on the numbers (e.g. the sum of all the numbers you choose must be 100)? How many degrees of freedom do you have now?

+ In general, if you have to choose n numbers, and a condition on their total is imposed, you will have only (n-1) degrees of freedom.
Degrees of Freedom (Continued)

The number of degrees of freedom is equal to the total number of


measurements (these are not always raw data points), less the total
number of restrictions on the measurements. A restriction is a
quantity computed from the measurements.

The sample mean is a restriction on the sample measurements, so


after calculating the sample mean there are only (n-1) degrees of
freedom remaining with which to calculate the sample variance.
The sample variance is based on only (n-1) free data points:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
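A small numerical illustration of the restriction: the deviations from the sample mean always sum to zero, so once (n-1) of them are known the last one is fixed. The data values in this Python sketch are hypothetical, chosen only so the arithmetic is easy to follow.

```python
data = [12.0, 15.0, 9.0, 14.0, 10.0]    # hypothetical measurements
n = len(data)
xbar = sum(data) / n                     # sample mean = 12.0

deviations = [x - xbar for x in data]    # [0.0, 3.0, -3.0, 2.0, -2.0]

# The deviations sum to zero, so the last one is implied by the first n-1
implied_last = -sum(deviations[:-1])     # -2.0

# Sample variance: squared deviations divided by the (n-1) free data points
s2 = sum(d ** 2 for d in deviations) / (n - 1)

print(deviations, implied_last, s2)      # [0.0, 3.0, -3.0, 2.0, -2.0] -2.0 6.5
```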
End Lecture 6