Lecture 3
The normal probability density function:
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ for $-\infty < x < \infty$
[Figure: normal density curve, f(x) plotted against x]
Example 4.1: Let X1, X2, and X3 be independent random variables that are
normally distributed with means and variances as shown.
Mean Variance
X1 10 1
X2 20 2
X3 30 3
Example 4.3: Let X1, X2, X3, and X4 be independent random variables that are normally
distributed with means and variances as shown. Find the mean and variance of Q =
X1 - 2X2 + 3X3 - 4X4 + 5.
Mean Variance
X1 12 4
X2 -5 2
X3 8 5
X4 10 1
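For independent random variables, the mean of a linear combination is the same combination of the means plus the constant, and the variance is the sum of the squared coefficients times the variances. The sketch below works Example 4.3 under the assumption that the third term of Q is 3X3; the function name `linear_combination_moments` is illustrative only.

```python
def linear_combination_moments(coeffs, means, variances, constant=0):
    """Mean and variance of constant + sum(a_i * X_i) for independent X_i."""
    mean = sum(a * m for a, m in zip(coeffs, means)) + constant
    # The additive constant shifts the mean but does not affect the variance.
    variance = sum(a * a * v for a, v in zip(coeffs, variances))
    return mean, variance


# Example 4.3, assuming Q = X1 - 2*X2 + 3*X3 - 4*X4 + 5
coeffs = [1, -2, 3, -4]
means = [12, -5, 8, 10]
variances = [4, 2, 5, 1]

q_mean, q_var = linear_combination_moments(coeffs, means, variances, constant=5)
print("E(Q) =", q_mean)
print("V(Q) =", q_var)
```

Because the X's are normal and independent, Q is itself normal with this mean and variance.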
[Figure: normal density curves f(w), f(x), and f(y)]
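The Standard Normal Distribution
The standard normal random variable, Z, is the normal random variable with mean $\mu = 0$ and standard deviation $\sigma = 1$: Z ~ N(0, 1).
[Figure: standard normal density f(z) plotted for z from -5 to 5, centered at $\mu = 0$ with $\sigma = 1$]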
Finding Probabilities of the Standard
Normal Distribution: P(0 < Z < 1.56)
Standard Normal Probabilities
Each table entry is the area under the standard normal curve between 0 and z.
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
Reading row 1.5 and column .06 gives P(0 < Z < 1.56) = 0.4406.
[Figure: standard normal density f(z) plotted for z from -5 to 5, with the area between 0 and 1.56 shaded]
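As a cross-check on the table lookup, the same area can be computed numerically. This is a minimal sketch that builds the standard normal cumulative distribution function from Python's `math.erf`; the helper names `phi` and `area_from_zero` are illustrative, not part of the lecture.

```python
import math


def phi(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_from_zero(z):
    """Area between 0 and z, the quantity tabulated in the slides."""
    return phi(z) - 0.5


p = area_from_zero(1.56)
print(f"P(0 < Z < 1.56) = {p:.4f}")
```

Subtracting 0.5 converts the cumulative probability into the "area from 0 to z" convention used by the table.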
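Finding Probabilities of the Standard
Normal Distribution: P(1 < Z < 2)
To find P(1 < Z < 2):
1. Find table area for 2.00: P(0 < Z < 2.00) = 0.4772
2. Find table area for 1.00: P(0 < Z < 1.00) = 0.3413
3. Subtract: the area between 1 and 2 is 0.4772 - 0.3413 = 0.1359
Equivalently, in terms of cumulative areas:
P(1 < Z < 2) = .9772 - .8413 = 0.1359
[Figure: standard normal density f(z) plotted for z from -5 to 5, with the area between 1 and 2 shaded]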
Finding Values of the Standard Normal
Random Variable: P(0 < Z < z) = 0.40
To find z such that P(0 < Z < z) = 0.40, look in the body of the table of standard
normal probabilities for the area closest to 0.40. The closest entry is 0.3997, in
row 1.2 and column .08, so z is approximately 1.28.
[Figure: standard normal density f(z) with an area of 0.40 shaded between 0 and z]
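The reverse lookup (from an area to a z value) can also be done numerically. A small sketch using `statistics.NormalDist.inv_cdf` from the Python standard library; since `inv_cdf` expects a cumulative probability, the area from 0 is shifted by 0.5. The function name `z_for_area_from_zero` is an illustrative choice.

```python
from statistics import NormalDist


def z_for_area_from_zero(area):
    """Return z such that P(0 < Z < z) equals the given area (0 <= area < 0.5)."""
    # inv_cdf works with P(Z <= z), which is 0.5 + the area from 0 to z.
    return NormalDist().inv_cdf(0.5 + area)


z = z_for_area_from_zero(0.40)
print(f"z = {z:.3f}")
```

The numerical value is slightly more precise than the two-decimal z read from the table.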
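To find the symmetric interval around 0 that contains an area of .99:
Total area in center = .99
Area in center to the left of 0 = .495 (and, by symmetry, .495 to the right)
Look to the table of standard normal probabilities to find that:
$z_{.005} = 2.575$ and $-z_{.005} = -2.575$
so that P(-2.575 < Z < 2.575) = .99
[Figure: standard normal density f(z) plotted for z from -5 to 5, with the central area between $-z_{.005}$ and $z_{.005}$ shaded]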
The Transformation of Normal
Random Variables
The area within $k\sigma$ of the mean is the same for all normal random variables. So an area
under any normal distribution is equivalent to an area under the standard normal. In this
example: $P(40 < X < 60) = P(-1 < Z < 1)$, since $\mu = 50$ and $\sigma = 10$.
The transformation of X to Z:
$Z = \frac{X - \mu_x}{\sigma_x}$
Transformation:
(1) Subtraction: $(X - \mu_x)$
(2) Division: $(X - \mu_x)/\sigma_x$
[Figure: normal density f(x) with $\mu = 50$, $\sigma = 10$ plotted for x from 0 to 100, shown above the standard normal density f(z)]
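$X \sim N(160, 30^2)$
$P(100 \le X \le 180)$
$= P\left(\frac{100 - \mu}{\sigma} \le \frac{X - \mu}{\sigma} \le \frac{180 - \mu}{\sigma}\right)$
$= P\left(\frac{100 - 160}{30} \le Z \le \frac{180 - 160}{30}\right)$
$= P(-2 \le Z \le 0.6666)$
$= 0.4772 + 0.2475 = 0.7247$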
Using the Normal Transformation
Example
$X \sim N(127, 22^2)$
$P(X < 150)$
$= P\left(\frac{X - \mu}{\sigma} < \frac{150 - \mu}{\sigma}\right)$
$= P\left(Z < \frac{150 - 127}{22}\right)$
$= P(Z < 1.045)$
$= 0.5 + 0.3520 = 0.8520$
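The two worked examples above can be checked by standardizing the limits and evaluating the standard normal cumulative probability. This is a sketch using `statistics.NormalDist`; the function name `normal_interval_probability` and its optional `lower`/`upper` arguments are illustrative assumptions.

```python
from statistics import NormalDist


def normal_interval_probability(mu, sigma, lower=None, upper=None):
    """P(lower < X < upper) for X ~ N(mu, sigma^2), via Z = (X - mu) / sigma."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    p_lower = 0.0 if lower is None else std_normal.cdf((lower - mu) / sigma)
    p_upper = 1.0 if upper is None else std_normal.cdf((upper - mu) / sigma)
    return p_upper - p_lower


# X ~ N(160, 30^2): P(100 <= X <= 180)
p_between = normal_interval_probability(160, 30, lower=100, upper=180)
# X ~ N(127, 22^2): P(X < 150)
p_below = normal_interval_probability(127, 22, upper=150)

print(f"P(100 <= X <= 180) = {p_between:.4f}")
print(f"P(X < 150)         = {p_below:.4f}")
```

Small differences from the slide values in the fourth decimal place come from the table's rounding and interpolation.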
The Transformation of Normal
Random Variables
That is, with $\mu = 50$ and $\sigma = 10$, P(X > 70) can be found easily because 70 is 2 standard deviations above the mean
of X: $70 = \mu + 2\sigma$. P(X > 70) is equivalent to P(Z > 2), an area under the standard normal
distribution.
Example: $X \sim N(5.7, 0.5^2)$
$P(X > x) = 0.01$ and $P(Z > 2.33) \approx 0.01$
$x = \mu + z\sigma = 5.7 + (2.33)(0.5) = 6.865$
z .02 .03 .04
2.2 ... 0.4868 0.4871 0.4875
2.3 ... 0.4898 0.4901 0.4904
2.4 ... 0.4922 0.4925 0.4927
[Figure: normal density f(x) for x from 3.2 to 8.2 with a right-tail Area = 0.01 above $x_{.01} = 6.865$; on the Z scale the tail starts at $z_{.01} = 2.33$]

Example: $X \sim N(2450, 400^2)$
$P(a < X < b) = 0.95$ and $P(-1.96 < Z < 1.96) = 0.95$
$x = \mu \pm z\sigma = 2450 \pm (1.96)(400) = 2450 \pm 784 = (1666, 3234)$
$P(1666 < X < 3234) = 0.95$
z .05 .06 .07
1.8 ... 0.4678 0.4686 0.4693
1.9 ... 0.4744 0.4750 0.4756
2.0 ... 0.4798 0.4803 0.4808
[Figure: normal density f(x) for x from 1000 to 4000 with .0250 in each tail; on the Z scale the central area lies between -1.96 and 1.96]
Finding Values of a Normal Random
Variable, Given a Probability
Normal Distribution: $\mu = 2450$, $\sigma = 400$
1. Draw pictures of the normal distribution in question and of the standard normal distribution.
2. Shade the area corresponding to the desired probability.
3. From the table of the standard normal distribution, find the z value or values.
4. Use the transformation from z to x to get value(s) of the original random variable.
[Figure: normal density f(x) with $\mu = 2450$, $\sigma = 400$ for x from 1000 to 4000, and the standard normal density f(z) for z from -5 to 5; each shows a shaded central area of .9500, split as .4750 on either side of the mean]
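Steps 3 and 4 above amount to inverting the cumulative distribution and applying $x = \mu + z\sigma$. The sketch below reproduces the two examples from this part of the lecture with `statistics.NormalDist.inv_cdf`; the function names `central_interval` and `upper_tail_value` are illustrative.

```python
from statistics import NormalDist


def central_interval(mu, sigma, prob):
    """Symmetric (a, b) around mu with P(a < X < b) = prob for X ~ N(mu, sigma^2)."""
    tail = (1.0 - prob) / 2.0
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def upper_tail_value(mu, sigma, tail_area):
    """Value x with P(X > x) = tail_area for X ~ N(mu, sigma^2)."""
    return NormalDist(mu, sigma).inv_cdf(1.0 - tail_area)


# X ~ N(2450, 400^2), central probability 0.95
low, high = central_interval(2450, 400, 0.95)
# X ~ N(5.7, 0.5^2), right-tail area 0.01
x_01 = upper_tail_value(5.7, 0.5, 0.01)

print(f"P({low:.0f} < X < {high:.0f}) = 0.95")
print(f"x with P(X > x) = 0.01: {x_01:.3f}")
```

The slides use z values rounded to two decimals (1.96 and 2.33), so the hand-computed answers differ slightly from the unrounded ones.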
The Normal Approximation of Binomial
Distribution
[Figure: binomial probabilities P(x) for n = 7, p = 0.5 beside the normal density f(x) with mean 3.5 and standard deviation 1.323]
Binomial with n = 7 and p = 0.50: P(X <= 4) = 0.7734
Normal with mean = 3.50 and standard deviation = 1.323: P(X <= 4.5) = 0.7751
Approximating a Binomial Probability
Using the Normal Distribution
$P(a \le X \le b) \approx P\left(\frac{a - np}{\sqrt{np(1-p)}} \le Z \le \frac{b - np}{\sqrt{np(1-p)}}\right)$
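To see how close the approximation is, the exact binomial probability can be compared with the normal one for the n = 7, p = 0.5 case shown earlier. This sketch adds a half-unit continuity correction (the reason the normal probability in that comparison is evaluated at 4.5 rather than 4); the function names and the `continuity` flag are illustrative assumptions.

```python
import math
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p, continuity=True):
    """Normal approximation to P(X <= k) with mean np and variance np(1 - p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    x = k + 0.5 if continuity else k  # continuity correction shifts by half a unit
    return NormalDist(mean, sd).cdf(x)


exact = binomial_cdf(4, 7, 0.5)
approx = normal_approx_cdf(4, 7, 0.5)
print(f"exact binomial  P(X <= 4)   = {exact:.4f}")
print(f"normal approx.  P(X <= 4.5) = {approx:.4f}")
```

Even for n as small as 7 the two values agree to about two decimal places when p = 0.5; the agreement is generally worse when p is far from 0.5.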
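Confidence Interval for $\mu$ when $\sigma$ is Known
$P\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
[Figure: standard normal density f(z) plotted for z from -4 to 4]
Before sampling, there is a 0.95 probability that the interval
$\mu \pm 1.96\frac{\sigma}{\sqrt{n}}$
will include the sample mean (and 5% that it will not).
The sample mean falls within $1.96\sigma/\sqrt{n}$ of $\mu$ exactly when $\mu$ falls within $1.96\sigma/\sqrt{n}$ of the sample mean.
That is, $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$ is a 95% confidence interval for $\mu$.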
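A 95% Interval around the Population
Mean
Sampling Distribution of the Mean
Approximately 95% of sample means can be expected to fall within the
interval $\left[\mu - 1.96\frac{\sigma}{\sqrt{n}}, \mu + 1.96\frac{\sigma}{\sqrt{n}}\right]$.
Conversely, about 2.5% can be expected to be above $\mu + 1.96\frac{\sigma}{\sqrt{n}}$ and
2.5% can be expected to be below $\mu - 1.96\frac{\sigma}{\sqrt{n}}$.
So 5% can be expected to fall outside
the interval $\left[\mu - 1.96\frac{\sigma}{\sqrt{n}}, \mu + 1.96\frac{\sigma}{\sqrt{n}}\right]$.
[Figure: sampling distribution of $\bar{x}$ with 95% of the area between $\mu - 1.96\sigma/\sqrt{n}$ and $\mu + 1.96\sigma/\sqrt{n}$ and 2.5% in each tail; sample means marked below the axis, with 2.5% falling below the interval and 2.5% falling above it]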
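We define $z_{\alpha/2}$ as the z value that cuts off a right-tail area of $\alpha/2$ under the standard
normal curve. $(1-\alpha)$ is called the confidence coefficient. $\alpha$ is called the error
probability.
$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = (1 - \alpha)$
$(1-\alpha)100\%$ Confidence Interval:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
[Figure: standard normal density f(z) plotted for z from -5 to 5, with central area $(1-\alpha)$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area $\alpha/2$ in each tail]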
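The critical value $z_{\alpha/2}$ for any confidence level can be obtained by inverting the standard normal cumulative distribution at $1 - \alpha/2$. A minimal sketch with `statistics.NormalDist`; the function name `z_critical` and the three confidence levels in the loop are illustrative choices, not taken from the slides.

```python
from statistics import NormalDist


def z_critical(confidence):
    """z value cutting off a right-tail area of alpha/2, where alpha = 1 - confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

A higher confidence level gives a larger critical value and therefore a wider interval for the same $\sigma$ and n.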
Critical Values of z and Levels of
Confidence
[Figure: two standard normal densities f(z), each plotted for z from -5 to 5, illustrating critical values of z for different levels of confidence]
Whenever $\sigma$ is not known (and the population is assumed normal), the correct distribution to use is
the t distribution with n-1 degrees of freedom.
Note, however, that for large degrees of freedom, the t distribution is approximated well by the Z
distribution.
df t0.100 t0.050 t0.025 t0.010 t0.005
9 1.383 1.833 2.262 2.821 3.250
. . . . . .
17 1.333 1.740 2.110 2.567 2.898
18 1.330 1.734 2.101 2.552 2.878
19 1.328 1.729 2.093 2.539 2.861
20 1.325 1.725 2.086 2.528 2.845
21 1.323 1.721 2.080 2.518 2.831
22 1.321 1.717 2.074 2.508 2.819
23 1.319 1.714 2.069 2.500 2.807
24 1.318 1.711 2.064 2.492 2.797
25 1.316 1.708 2.060 2.485 2.787
26 1.315 1.706 2.056 2.479 2.779
27 1.314 1.703 2.052 2.473 2.771
28 1.313 1.701 2.048 2.467 2.763
29 1.311 1.699 2.045 2.462 2.756
30 1.310 1.697 2.042 2.457 2.750
40 1.303 1.684 2.021 2.423 2.704
60 1.296 1.671 2.000 2.390 2.660
120 1.289 1.658 1.980 2.358 2.617
∞ 1.282 1.645 1.960 2.326 2.576
[Figure: t density f(t) with Area = 0.025 in each tail]
Example
A blood analyst wants to estimate the average AFP index of the Vietnamese
people. A random blood sample of size 15 yields an average of $\bar{x} = 10.37$ ng/ml
and a standard deviation of s = 3.5 ng/ml. Assuming a normal population of
the AFP values, give a 95% confidence interval for the average AFP value
of the Vietnamese population. (AFP = alpha-fetoprotein)
df t0.100 t0.050 t0.025 t0.010 t0.005
--- ----- ----- ------ ------ ------
1 3.078 6.314 12.706 31.821 63.657
. . . . . .
13 1.350 1.771 2.160 2.650 3.012
14 1.345 1.761 2.145 2.624 2.977
15 1.341 1.753 2.131 2.602 2.947
. . . . . .
The critical value of t for df = (n - 1) = (15 - 1) = 14 and a right-tail area of 0.025 is:
$t_{0.025} = 2.145$
The corresponding confidence interval or interval estimate is:
$\bar{x} \pm t_{0.025}\frac{s}{\sqrt{n}} = 10.37 \pm 2.145\frac{3.5}{\sqrt{15}} = 10.37 \pm 1.94 = [8.43, 12.31]$
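The arithmetic of the AFP interval can be reproduced in a few lines. Python's standard library has no t-distribution quantile function, so this sketch takes the critical value 2.145 straight from the table row for df = 14; the function name `t_interval` and the constant name are illustrative.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """Confidence interval xbar +/- t_crit * s / sqrt(n) for a normal population."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Critical value read from the t table: df = 14, right-tail area 0.025.
T_025_DF14 = 2.145

low, high = t_interval(xbar=10.37, s=3.5, n=15, t_crit=T_025_DF14)
print(f"95% CI for mean AFP: [{low:.2f}, {high:.2f}] ng/ml")
```

With a statistics package that provides t quantiles, the table lookup could be replaced by a function call, but the interval formula stays the same.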
Large Sample Confidence Intervals for
the Population Mean
Example: An environmental scientist wants to estimate the average amount of NOx in a given region. A random sample
of 100 data points gives $\bar{x} = 357.60$ ppm and s = 140.00 ppm. Give a 95% confidence interval for $\mu$, the average
amount of NOx in any sample taken.
$\bar{x} \pm z_{0.025}\frac{s}{\sqrt{n}} = 357.60 \pm 1.96\frac{140.00}{\sqrt{100}} = 357.60 \pm 27.44 = [330.16, 385.04]$
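For a large sample the same calculation uses a z critical value in place of t. The sketch below redoes the NOx example, obtaining the critical value from `statistics.NormalDist` rather than the table; the function name `large_sample_mean_interval` is illustrative.

```python
import math
from statistics import NormalDist


def large_sample_mean_interval(xbar, s, n, confidence=0.95):
    """Large-sample interval xbar +/- z * s / sqrt(n), using s in place of sigma."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * s / math.sqrt(n)
    return xbar - margin, xbar + margin


low, high = large_sample_mean_interval(xbar=357.60, s=140.00, n=100)
print(f"95% CI for mean NOx: [{low:.2f}, {high:.2f}] ppm")
```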
Exercise 1
Exercise 2
Large-Sample Confidence Intervals
for the Population Proportion, p
For estimating p, a sample is considered large enough when both np and nq are greater
than 5.
A large-sample $(1-\alpha)100\%$ confidence interval for the population proportion, p:
$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
where the sample proportion, $\hat{p}$, is equal to the number of successes in the sample, x,
divided by the number of trials (the sample size), n, and $\hat{q} = 1 - \hat{p}$.
Example
A marketing research firm wants to estimate the share that foreign companies
have in the American market for certain products. A random sample of 100
consumers is obtained, and it is found that 34 people in the sample are users
of foreign-made products; the rest are users of domestic products. Give a
95% confidence interval for the share of foreign products in this market.
$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{100}}$
$= 0.34 \pm (1.96)(0.04737)$
$= 0.34 \pm 0.0928$
$= [0.2472, 0.4328]$
Thus, the firm may be 95% confident that foreign manufacturers control
anywhere from 24.72% to 43.28% of the market.
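The market-share interval can be verified with a short calculation. This sketch also checks the large-sample condition, applied here to the sample values $n\hat{p}$ and $n\hat{q}$ as an assumption since p itself is unknown; the function name `proportion_interval` is illustrative.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    # Large-sample check, using the sample proportions in place of p and q.
    if n * p_hat <= 5 or n * q_hat <= 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_interval(successes=34, n=100)
print(f"95% CI for foreign share: [{low:.4f}, {high:.4f}]")
```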
Exercise 3
Confidence Intervals for the Population Variance:
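The Chi-Square ($\chi^2$) Distribution
The chi-square distribution becomes more symmetric, approaching a normal shape,
as the degrees of freedom increase.
[Figure: chi-square densities $f(\chi^2)$ for df = 30 and df = 50, plotted for $\chi^2$ from 0 to 100]
In sampling from a normal population, the random variable
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
has a chi-square distribution with (n - 1) degrees of freedom.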
Confidence Interval for the Population
Variance
A $(1-\alpha)100\%$ confidence interval for the population variance* (where the
population is assumed normal) is:
$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right]$
where $\chi^2_{\alpha/2}$ is the value of the chi-square distribution with n - 1 degrees of freedom
that cuts off an area of $\alpha/2$ to its right and $\chi^2_{1-\alpha/2}$ is the value of the distribution that
cuts off an area of $\alpha/2$ to its left (equivalently, an area of $1 - \alpha/2$ to its right).
* Note: Because the chi-square distribution is skewed, the confidence interval for the
population variance is not symmetric.
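Example
With n = 30 and $s^2 = 18540$, the 95% confidence interval for the population variance is:
$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right] = \left[\frac{(30-1)18540}{45.7}, \frac{(30-1)18540}{16.0}\right] = [11765, 33604]$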
Example (continued)
df .995 .990 .975 .950 .900 .100 .050 .025 .010 .005
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
28 12.46 13.56 15.31 16.93 18.94 37.92 41.34 44.46 48.28 50.99
29 13.12 14.26 16.05 17.71 19.77 39.09 42.56 45.72 49.59 52.34
30 13.79 14.95 16.79 18.49 20.60 40.26 43.77 46.98 50.89 53.67
[Figure: Chi-Square Distribution, df = 29: density $f(\chi^2)$ plotted for $\chi^2$ from 0 to 70, with central area 0.95 and area 0.025 in each tail]
$\chi^2_{0.975} = 16.05$, $\chi^2_{0.025} = 45.72$
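The variance interval divides $(n-1)s^2$ by the two chi-square critical values. Python's standard library has no chi-square quantile function, so this sketch takes the critical values as inputs; it uses 45.7 and 16.0, the rounded versions of the table entries 45.72 and 16.05 for df = 29 that the worked example uses. The function name `variance_interval` and its parameter names are illustrative.

```python
def variance_interval(s2, n, chi2_right, chi2_left):
    """Interval [(n-1)s^2 / chi2_right, (n-1)s^2 / chi2_left] for the variance.

    chi2_right cuts off alpha/2 to its right (the larger critical value);
    chi2_left cuts off alpha/2 to its left (the smaller critical value).
    """
    df = n - 1
    return df * s2 / chi2_right, df * s2 / chi2_left


# n = 30, s^2 = 18540; critical values for df = 29, rounded as in the example.
low, high = variance_interval(s2=18540, n=30, chi2_right=45.7, chi2_left=16.0)
print(f"95% CI for the variance: [{low:.0f}, {high:.0f}]")
```

Note that the larger critical value produces the lower limit, which is why the interval is not symmetric around $s^2$.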
Sample-Size Determination
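For example: A $(1-\alpha)$ Confidence Interval for $\mu$: $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$,
where the half-width $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is the Bound, B.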
Exercise 4
Sample Size and Standard Error
The sample size determines the bound of a statistic, since the standard
error of a statistic shrinks as the sample size increases:
[Figure: sampling distributions of a statistic for sample size n and for sample size 2n; the standard error of the statistic is smaller for sample size 2n]
Minimum Sample Size: Mean and
Proportion
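Minimum required sample size in estimating the population
mean, $\mu$:
$n = \frac{z_{\alpha/2}^2 \sigma^2}{B^2}$
Bound of estimate:
$B = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Example with $z_{\alpha/2} = 1.96$, $\sigma = 400$, and B = 120:
$n = \frac{z_{\alpha/2}^2 \sigma^2}{B^2} = \frac{(1.96)^2 (400)^2}{120^2} = 42.684 \approx 43$ (rounded up)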
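The sample-size formula follows from solving the bound equation for n, and the result is rounded up because a fractional observation is not possible. A minimal sketch of that calculation; the function name `min_sample_size_mean` is illustrative.

```python
import math


def min_sample_size_mean(z, sigma, bound):
    """Smallest integer n with z * sigma / sqrt(n) <= bound."""
    return math.ceil((z * sigma / bound) ** 2)


n = min_sample_size_mean(z=1.96, sigma=400, bound=120)
print("minimum sample size:", n)
```

Halving the bound B roughly quadruples the required sample size, since n grows with $1/B^2$.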