Lecture+Notes+5+-+EDA+-+Continuous+Random+Variable

The document discusses the normal distribution, detailing its density function, properties, and transformation formula. It highlights characteristics of the normal curve, including symmetry, inflection points, and the total area under the curve being equal to 1. Additionally, it provides areas under the standard normal curve with corresponding Z values.


EDA (Eng’g Data Analysis)

Reference: Intro to Statistics by R. Walpole Lecture Notes 5


SOME CONTINUOUS PROBABILITY DISTRIBUTIONS

A. NORMAL DISTRIBUTION (Reference: Introduction to Statistics by R. Walpole – Chapter 7)

• The density function of the normal random variable X, with mean μ and variance σ², is

$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \qquad \text{for } -\infty < x < +\infty$$

[Figure: Normal curve over X values from −∞ to +∞, centered at μ, with a point of inflection one σ on either side of μ.]

Properties of the Normal Curve

a. The mode, which is the point on the horizontal axis where the curve is a
maximum, occurs at x = μ.
b. The curve is symmetric about a vertical axis through the mean μ.
c. The curve has its points of inflection at x = μ ± σ, is concave downward if
(μ − σ) < X < (μ + σ), and is concave upward otherwise.
d. The normal curve approaches the horizontal axis asymptotically as we
proceed in either direction away from the mean.
e. The total area under the curve and above the horizontal axis is equal to 1.
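
The properties above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the values μ = 18, σ = 2.5 are chosen only for illustration, and the area is approximated with a simple trapezoidal rule over μ ± 8σ.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density n(x; mu, sigma) of a normal random variable."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def area_under_curve(mu, sigma, lo, hi, steps=20000):
    """Trapezoidal-rule approximation of the area under the density on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, mu, sigma)
    return total * h

mu, sigma = 18.0, 2.5  # illustrative values only

peak = normal_pdf(mu, mu, sigma)          # property (a): maximum at x = mu
left = normal_pdf(mu - 1.0, mu, sigma)    # property (b): symmetric about mu
right = normal_pdf(mu + 1.0, mu, sigma)
area = area_under_curve(mu, sigma, mu - 8 * sigma, mu + 8 * sigma)  # property (e)

print(f"peak={peak:.5f} left={left:.5f} right={right:.5f} area={area:.6f}")
```

The integration range is truncated, so the computed area is only approximately 1; the tails beyond eight standard deviations are negligible at this precision.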

Transformation Formula

$$Z = \frac{X - \mu}{\sigma}$$

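[Figure: Normal curve over X values, where μ may be any value (μ < 0 or μ > 0) and σ may be greater or less than 1; and the standard normal curve over Z values, where μ = 0 (at Z = 0) and σ = 1.]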

The Areas Under the Standard Normal Curve


(Reference: Introduction to Statistics by R. Walpole – Table A.4 – page 480)

[Figure: Standard normal curve; the tabulated area lies under the curve to the left of Z.]

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
- 3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
- 3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
- 3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
- 3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
- 3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
- 2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
- 2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
- 2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
- 2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
- 2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
- 2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
- 2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
- 2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
- 2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
- 2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
- 1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
- 1.8 0.0359 0.0352 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
- 1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
- 1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
- 1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
- 1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0722 0.0708 0.0694 0.0681
- 1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
- 1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
- 1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
- 1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
- 0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
- 0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
- 0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
- 0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
- 0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
- 0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
- 0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
- 0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
- 0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
- 0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9278 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
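
Individual table entries can be reproduced in code. The sketch below is an illustration rather than part of the original notes: it assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`, and the helper names `table_area` and `table_row` are hypothetical. Printed values may differ from the table in the last digit where rounding is borderline.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def table_area(z):
    """Area under the standard normal curve to the left of z, rounded to 4 places."""
    return round(STD_NORMAL.cdf(z), 4)

def table_row(row_label):
    """Ten entries of one table row.

    row_label is the Z value to one decimal place, e.g. -1.2 or 1.6.
    The column (0.00 to 0.09) moves Z further from zero, as in the table.
    The special "-0.0" row is not handled by this sketch.
    """
    sign = -1 if row_label < 0 else 1
    return [table_area(row_label + sign * col / 100) for col in range(10)]

print(table_row(-1.2))
print(table_row(1.6))
```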


Example

Given the normally distributed variable X with mean 18 and standard deviation 2.5, find:
a. P(X < 15)
b. P(X > 17)
c. P(19 < X < 22)
d. The value of k such that P(X > k) = 0.1814

Solution:
Let X = continuous random variable

The symbol, <, means the required probability (area) is to the left of a specific value and
the symbol, >, means the required probability (area) is to the right of a specific value.

NOTE:
• THE TABLE OF AREAS UNDER THE STANDARD NORMAL CURVE SHOWN ABOVE ALWAYS
GIVES THE AREA TO THE LEFT OF A SPECIFIC VALUE OF Z.

a. P(X < 15)

Solve for the Z-value:

$$Z_1 = \frac{X - \mu}{\sigma} = \frac{15 - 18}{2.5} = -1.20$$

[Figure: Normal curve (μ = 18, σ = 2.5) with the area P(X < 15) to the left of X = 15, and the standard normal curve (σ = 1) with the corresponding area P(Z < z1) to the left of Z1 = −1.20.]

$$P(X < 15) = P(Z < -1.20) = 0.1151$$

From the table, the area (probability) to the left of Z = −1.20 is 0.1151:

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
- 1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985


b. P(X > 17)

Solve for the Z-value:

$$Z_2 = \frac{X - \mu}{\sigma} = \frac{17 - 18}{2.5} = -0.40$$

[Figure: Normal curve (μ = 18, σ = 2.5) with the area P(X > 17) to the right of X = 17, and the standard normal curve (σ = 1) with the corresponding area P(Z > z2) to the right of Z2 = −0.40.]

$$P(X > 17) = P(Z > -0.40) = P(S) - P(Z < -0.40)$$

$$P(X > 17) = P(Z > -0.40) = 1.0 - 0.3446 = 0.6554$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
- 0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121

c. P(19 < X < 22)

Solve for the Z-values:

$$Z_3 = \frac{X - \mu}{\sigma} = \frac{19 - 18}{2.5} = 0.40$$

$$Z_4 = \frac{X - \mu}{\sigma} = \frac{22 - 18}{2.5} = 1.60$$

[Figure: Normal curve (μ = 18, σ = 2.5) with the area P(19 < X < 22) between X = 19 and X = 22, and the standard normal curve (σ = 1) with the corresponding area P(0.40 < Z < 1.60) between Z3 = 0.40 and Z4 = 1.60.]

$$P(19 < X < 22) = P(0.40 < Z < 1.60) = P(Z < 1.60) - P(Z < 0.40)$$

$$P(0.40 < Z < 1.60) = 0.9452 - 0.6554 = 0.2898$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545

d. The value of k such that P(X > k) = 0.1814

From the transformation formula, solve for X:

$$Z = \frac{X - \mu}{\sigma} \quad\Longrightarrow\quad X = Z\sigma + \mu$$

$$P(X > k) = 0.1814 = P(Z > z_5) = P(S) - P(Z < z_5)$$

$$0.1814 = 1.0 - P(Z < z_5)$$

$$P(Z < z_5) = 0.8186$$

Thus, on the table, the z-value leaving an area of 0.8186 to the left is 0.91 (0.9 + 0.01).

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389

[Figure: Normal curve (μ = 18, σ = 2.5) with the area P(X > k) to the right of X = k, and the standard normal curve (σ = 1) with the corresponding area P(Z > z5) to the right of Z = z5.]

Therefore, the X-value is

$$X = k = Z\sigma + \mu = (0.91)(2.5) + 18 = 20.275$$
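
As a cross-check on the table lookups, the four parts of this example can be sketched in code. This assumes Python 3.8 or later (`statistics.NormalDist`); the variable names are illustrative. Because the code does not round Z to two decimals, its results should agree with table-based answers only to about table precision.

```python
from statistics import NormalDist

# X ~ Normal(mean 18, standard deviation 2.5), as in the example above.
X = NormalDist(mu=18, sigma=2.5)

p_a = X.cdf(15)                 # a. P(X < 15): area to the left of 15
p_b = 1 - X.cdf(17)             # b. P(X > 17): complement of the left area
p_c = X.cdf(22) - X.cdf(19)     # c. P(19 < X < 22): difference of two left areas
k = X.inv_cdf(1 - 0.1814)       # d. P(X > k) = 0.1814 means P(X < k) = 0.8186

print(f"a. {p_a:.4f}  b. {p_b:.4f}  c. {p_c:.4f}  d. k = {k:.3f}")
```

`inv_cdf` performs the reverse lookup of part d directly: it returns the X value that leaves the given area to its left.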

Example

In a mathematics examination the average grade was 82 and the standard deviation was
5. All students with grades from 88 to 94 received a grade of “B”. If the grades are
approximately normally distributed and 8 students received a B grade, how many
students took the examination?

Solution:

Let X = continuous random variable


A = the event that a student received a “B” grade = (8 students)
μ = 82
σ=5

Assume that the grades are rounded to the nearest whole number, thus
X1 = 88 – 0.5 = 87.5
X2 = 94 + 0.5 = 94.5

And
$$Z_1 = \frac{X_1 - \mu}{\sigma} = \frac{87.5 - 82}{5} = 1.10$$

$$Z_2 = \frac{X_2 - \mu}{\sigma} = \frac{94.5 - 82}{5} = 2.50$$

Also

$$P(A) = \frac{n}{N} = \frac{\text{number of students with a "B" grade}}{\text{total number of students who took the exam}} = P(X_1 < X < X_2)$$

$$P(X_1 < X < X_2) = P(Z_1 < Z < Z_2) = \frac{8}{N}$$

[Figure: Normal curve (μ = 82, σ = 5) with the area P(87.5 < X < 94.5) between X1 = 87.5 and X2 = 94.5, and the standard normal curve (σ = 1) with the corresponding area P(1.10 < Z < 2.50) between Z1 = 1.10 and Z2 = 2.50.]

$$P(Z_1 < Z < Z_2) = P(Z < Z_2) - P(Z < Z_1) = P(Z < 2.50) - P(Z < 1.10)$$

$$P(Z_1 < Z < Z_2) = 0.9938 - 0.8643 = 0.1295 = \frac{8}{N}$$

$$N = \frac{8}{0.1295} \approx 62 \text{ students}$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
Example

The tensile strength of a certain metal component is normally distributed with mean of
10,000 kg / sq. cm and a standard deviation of 100 kg / sq. cm. Measurements are
recorded to the nearest 50 kg / sq. cm.
a. What proportion of these components exceed 10,150 kg / sq. cm in tensile
strength?
b. If specifications require that all components have tensile strength between
9,800 and 10,200 kg / sq. cm inclusive, what proportion of pieces would we
expect to scrap?

Solution:

Let X = continuous random variable


μ = 10,000
σ = 100

a. What proportion of these components exceed 10,150 kg / sq. cm in tensile
strength?

• Since measurements are recorded to the nearest 50 kg / sq. cm,

X1 = 10150 + ½ (50) = 10175

$$Z_1 = \frac{X_1 - 10000}{100} = \frac{10175 - 10000}{100} = 1.75$$

[Figure: Normal curve (μ = 10000, σ = 100) with the area P(X > 10175) to the right of X1 = 10175, and the standard normal curve (σ = 1) with the corresponding area P(Z > 1.75) to the right of Z1 = 1.75.]

$$P(X > 10175) = P(Z > 1.75) = P(S) - P(Z < 1.75)$$

$$P(X > 10175) = P(Z > 1.75) = 1.0 - 0.9599 = 0.0401$$

Therefore, 4.01% of the components exceed 10,150 kg / sq. cm in tensile strength.

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
b. If specifications require that all components have tensile strength between
9,800 and 10,200 kg / sq. cm inclusive, what proportion of pieces would we
expect to scrap?

• Since measurements are recorded to the nearest 50 kg / sq. cm,

X2 = 9800 − ½ (50) = 9775
X3 = 10200 + ½ (50) = 10225

$$Z_2 = \frac{X_2 - 10000}{100} = \frac{9775 - 10000}{100} = -2.25$$

$$Z_3 = \frac{X_3 - 10000}{100} = \frac{10225 - 10000}{100} = 2.25$$

[Figure: Normal curve (μ = 10000, σ = 100) with the area P(9775 < X < 10225) between X2 = 9775 and X3 = 10225, and the standard normal curve (σ = 1) with the corresponding area P(−2.25 < Z < 2.25) between Z2 = −2.25 and Z3 = 2.25.]

The proportion of pieces that would be expected to be scrapped = P(S) – P(X2 < X < X3)

$$P(X_2 < X < X_3) = P(Z_2 < Z < Z_3) = P(Z < Z_3) - P(Z < Z_2)$$

$$P(X_2 < X < X_3) = P(Z < 2.25) - P(Z < -2.25)$$

$$P(X_2 < X < X_3) = 0.9878 - 0.0122 = 0.9756$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
- 2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890

Therefore, the proportion of pieces that would be scrapped = 1 – 0.9756 = 0.0244 = 2.44%
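
Both parts of the tensile-strength example can be reproduced with a short sketch (assuming Python 3.8 or later; the names `half_step`, `p_exceed`, and `p_scrap` are illustrative). The half-step of 25 kg / sq. cm reflects the recording of measurements to the nearest 50.

```python
from statistics import NormalDist

strength = NormalDist(mu=10_000, sigma=100)
half_step = 50 / 2  # measurements are recorded to the nearest 50 kg / sq. cm

# a. Proportion exceeding 10,150: area to the right of 10,175.
p_exceed = 1 - strength.cdf(10_150 + half_step)

# b. Proportion scrapped: everything outside 9,775 to 10,225.
p_within = strength.cdf(10_200 + half_step) - strength.cdf(9_800 - half_step)
p_scrap = 1 - p_within

print(f"a. {p_exceed:.4f}  b. {p_scrap:.4f}")
```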

Example

A lawyer commutes daily from his suburban home to his midtown office. On the average
the trip one way takes 24 minutes, with a standard deviation of 3.8 minutes. Assume the
distribution of trips to be normally distributed.
a. What is the probability that a trip will take at least ½ hour?
b. If the office opens at 9:00 A.M. and he leaves his house at 8:45 A.M. daily, what
percentage of the time is he late for work?
c. If he leaves the house at 8:35 A.M and coffee is served at the office from 8:50
A.M until 9:00 A.M., what is the probability that he misses coffee?
d. Find the length of time above which we find the slowest 15% of the trips.
e. Find the probability that 2 of the next 3 trips will take at least ½ hour.

Solution:

Let X = continuous random variable


μ = 24
σ = 3.8


a. What is the probability that a trip will take at least ½ hour?

Required: P(X > 30 minutes)

$$Z_1 = \frac{30 - 24}{3.8} = 1.58$$

[Figure: Normal curve (μ = 24, σ = 3.8) with the area P(X > 30) to the right of X1 = 30, and the standard normal curve (σ = 1) with the corresponding area P(Z > 1.58) to the right of Z1 = 1.58.]

$$P(X > 30) = P(Z > Z_1) = P(S) - P(Z < Z_1) = 1.0 - P(Z < 1.58)$$

$$P(X > 30) = 1.0 - 0.9429 = 0.0571$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441

Therefore, there is a 0.0571 (5.71%) chance that a trip takes 30 minutes or longer.

b. If the office opens at 9:00 A.M. and he leaves his house at 8:45 A.M. daily, what
percentage of the time is he late for work?

Required: P(X > 15 minutes), since 9:00 – 8:45 = 15 minutes. If the lawyer's trip
takes longer than 15 minutes, he will be late for work.

$$Z_2 = \frac{15 - 24}{3.8} = -2.37$$

[Figure: Normal curve (μ = 24, σ = 3.8) with the area P(X > 15) to the right of X = 15, and the standard normal curve (σ = 1) with the corresponding area P(Z > −2.37) to the right of Z2 = −2.37.]

$$P(X > 15) = P(Z > -2.37) = P(S) - P(Z < -2.37)$$

$$P(X > 15) = P(Z > -2.37) = 1.0 - 0.0089 = 0.9911$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
- 2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084

Therefore, the lawyer is late for work 99.11% of the time (probability 0.9911).

c. If he leaves the house at 8:35 A.M. and coffee is served at the office from 8:50
A.M. until 9:00 A.M., what is the probability that he misses coffee?

Required: P(X > 25 minutes), since 9:00 – 8:35 = 25 minutes. If the lawyer's trip
takes longer than 25 minutes, he arrives after coffee service ends and misses his coffee.

$$Z_3 = \frac{25 - 24}{3.8} = 0.26$$

[Figure: Normal curve (μ = 24, σ = 3.8) with the area P(X > 25) to the right of X = 25, and the standard normal curve (σ = 1) with the corresponding area P(Z > 0.26) to the right of Z3 = 0.26.]

$$P(X > 25) = P(Z > 0.26) = P(S) - P(Z < 0.26)$$

$$P(X > 25) = P(Z > 0.26) = 1.0 - 0.6026 = 0.3974$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141

Therefore, there is a 0.3974 (39.74%) chance that the lawyer will miss his hot coffee.

d. Find the length of time above which we find the slowest 15% of the trips.
• The slowest trips are those with the longest trip times, thus
P(X > k) = 15% = 0.1500

[Figure: Normal curve (μ = 24, σ = 3.8) with the area P(X > k) = 0.1500 to the right of X = k, and the standard normal curve (σ = 1) with the corresponding area P(Z > z4) = 0.1500 to the right of Z4.]

$$P(X > k) = P(Z > Z_4) = 0.1500$$

$$P(Z > Z_4) = P(S) - P(Z < Z_4) = 0.1500$$

$$1.0 - P(Z < Z_4) = 0.1500$$

$$P(Z < Z_4) = 1.0 - 0.1500 = 0.8500$$

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621

The nearest Z-value leaving an area to the left equal to 0.8500 is 1.04 (1.0 + 0.04).

Therefore:

$$X = Z\sigma + \mu$$

$$X = k = (1.04)(3.8) + 24 = 27.952 \text{ minutes}$$

The slowest 15% of the trips take 27.952 minutes or longer.

e. Find the probability that 2 of the next 3 trips will take at least ½ hour.

Solution: this part follows the binomial distribution.

Let Y = random variable representing the number of times that the
lawyer’s trip takes 30 minutes or longer.
n = 3
Y = 2
p = P(X > 30) = 0.0571 (from part a of this problem)
q = (1 – p) = (1 – 0.0571) = 0.9429

$$b(y; n, p) = \binom{n}{y} p^{y} q^{n-y}$$

$$b(2; 3, 0.0571) = \binom{3}{2} (0.0571)^{2} (0.9429)^{3-2}$$

$$b(2; 3, 0.0571) = \frac{3!}{2!\,1!} (0.0571)^{2} (0.9429)^{1} = 0.0092$$

The probability that exactly 2 of the lawyer’s next 3 trips take at least ½ hour is 0.0092.
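
The five parts of the commuting example can be sketched in code. To mimic a table lookup, the hypothetical helper `left_area` rounds each Z value to two decimals before evaluating the area; this rounding is an assumption made here so that the results line up with table-based answers. Python 3.8 or later is assumed for `statistics.NormalDist` and `math.comb`.

```python
from math import comb
from statistics import NormalDist

STD = NormalDist()          # standard normal: mean 0, standard deviation 1
MU, SIGMA = 24, 3.8         # trip time in minutes

def left_area(x):
    """Area to the left of x, with Z rounded to two decimals as in a table lookup."""
    z = round((x - MU) / SIGMA, 2)
    return STD.cdf(z)

p_a = 1 - left_area(30)     # a. trip takes at least 30 minutes
p_b = 1 - left_area(15)     # b. trip longer than 15 minutes (late for work)
p_c = 1 - left_area(25)     # c. trip longer than 25 minutes (misses coffee)

# d. Z leaving 0.85 to the left, rounded to the nearest table value, then X = Z*sigma + mu.
z_d = round(STD.inv_cdf(1 - 0.15), 2)
k_d = z_d * SIGMA + MU

# e. Binomial probability that exactly 2 of 3 trips take at least 30 minutes.
p_e = comb(3, 2) * p_a**2 * (1 - p_a)

print(f"a. {p_a:.4f}  b. {p_b:.4f}  c. {p_c:.4f}  d. {k_d:.3f}  e. {p_e:.4f}")
```

Without the rounding step the answers shift slightly in the third or fourth decimal place, which is the usual difference between software and table lookups.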

NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

If X is a binomial random variable with mean μ = np and variance σ² = npq, then the
limiting form of the distribution of

$$Z = \frac{X - np}{\sqrt{npq}}$$

as n → ∞, is the standard normal distribution n(z; 0, 1).

Example

The probability that a patient recovers from a delicate heart operation is 0.90. Of the next
100 patients having this operation, what is the probability that
a. between 84 and 95 inclusive survive?
b. Fewer than 86 survive?

Solution:
μ = np = (100)(0.90) = 90
σ = √(npq) = √((100)(0.90)(0.10)) = 3

a. What is the probability that between 84 and 95 inclusive survive?

X1 = 84
X2 = 95

$$Z_1 = \frac{84 - 90}{3} = -2.0$$

$$Z_2 = \frac{95 - 90}{3} = 1.67$$

[Figure: Normal curve (μ = 90, σ = 3) with the area P(84 < X < 95) between X1 = 84 and X2 = 95, and the standard normal curve (σ = 1) with the corresponding area P(−2.0 < Z < 1.67) between Z1 = −2.0 and Z2 = 1.67.]

$$P(84 < X < 95) = P(-2.0 < Z < 1.67) = P(Z < 1.67) - P(Z < -2.0)$$

$$P(84 < X < 95) = P(-2.0 < Z < 1.67) = 0.9525 - 0.0228 = 0.9297$$

The probability that 84 to 95 of the 100 patients will survive the
delicate heart operation is 92.97%.

b. What is the probability that fewer than 86 survive?

X3 = 86

$$Z_3 = \frac{86 - 90}{3} = -1.33$$

[Figure: Normal curve (μ = 90, σ = 3) with the area P(X < 86) to the left of X = 86, and the standard normal curve (σ = 1) with the corresponding area P(Z < −1.33) to the left of Z3 = −1.33.]

$$P(X < 86) = P(Z < -1.33) = 0.0918$$

The probability that fewer than 86 of the 100 patients will survive
the delicate heart operation is 9.18%.
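
To see how close the approximation is, part a can be compared with the exact binomial sum. The sketch below is an illustration, not part of the original solution: it computes the exact probability, the normal approximation as used above, and a variant with the half-unit (continuity) adjustment of the kind applied to rounded data in the earlier examples. Python 3.8 or later is assumed, and the names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.90
q = 1 - p
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * q))

def binomial_pmf(x):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * q**(n - x)

# Exact probability that between 84 and 95 patients (inclusive) survive.
exact = sum(binomial_pmf(x) for x in range(84, 96))

# Normal approximation using the boundaries 84 and 95 directly.
plain = approx.cdf(95) - approx.cdf(84)

# Normal approximation with half-unit boundaries 83.5 and 95.5.
corrected = approx.cdf(95.5) - approx.cdf(83.5)

print(f"exact={exact:.4f}  plain={plain:.4f}  corrected={corrected:.4f}")
```

Because X counts patients and takes only whole-number values, extending each boundary by half a unit usually brings the normal area closer to the exact binomial sum.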

Example

If 20% of the residents in a certain city prefer a white telephone over any other color
available, what is the probability that among the next 1000 telephones installed in this city
a. between 170 and 185 inclusive will be white?
b. At least 210 but not more than 225 will be white?
Solution:
μ = np = (1000)(0.20) = 200
σ = √(npq) = √((1000)(0.20)(0.80)) = √160 ≈ 12.65

a. What is the probability that among the next 1000 telephones installed in this city
between 170 and 185 inclusive will be white?

Solution: X1 = 170
X2 = 185

$$Z_1 = \frac{170 - 200}{12.65} = -2.37$$

$$Z_2 = \frac{185 - 200}{12.65} = -1.19$$

[Figure: Normal curve (μ = 200, σ ≈ 12.65) with the area P(170 < X < 185) between X1 = 170 and X2 = 185, and the standard normal curve (σ = 1) with the corresponding area P(−2.37 < Z < −1.19) between Z1 = −2.37 and Z2 = −1.19.]

$$P(170 < X < 185) = P(-2.37 < Z < -1.19) = P(Z < -1.19) - P(Z < -2.37)$$

$$P(170 < X < 185) = P(-2.37 < Z < -1.19) = 0.1170 - 0.0089 = 0.1081$$

b. What is the probability that among the next 1000 telephones installed in this city at
least 210 but not more than 225 will be white?

Solution: X3 = 210
X4 = 225

$$Z_3 = \frac{210 - 200}{12.65} = 0.79$$

$$Z_4 = \frac{225 - 200}{12.65} = 1.98$$

[Figure: Normal curve (μ = 200, σ ≈ 12.65) with the area P(210 < X < 225) between X3 = 210 and X4 = 225, and the standard normal curve (σ = 1) with the corresponding area P(0.79 < Z < 1.98) between Z3 = 0.79 and Z4 = 1.98.]

$$P(210 < X < 225) = P(0.79 < Z < 1.98) = P(Z < 1.98) - P(Z < 0.79)$$

$$P(210 < X < 225) = P(0.79 < Z < 1.98) = 0.9761 - 0.7852 = 0.1909$$

SAMPLING THEORY
(Reference: Introduction to Statistics by R. Walpole – Chapter 8)

Research Methodology
• refers to the detailed description of the procedure, instruments, and participants.
This includes the sampling procedure, that is, how participants are selected
for the study.
Target Population
• the entire group of people that a researcher identifies to study and
about which conclusions are to be drawn.
Sample
• the part of the population that is included in the study and from which the
research information comes.
Sampling
• the process of selecting the participants from the target population to
be included in the study.

SAMPLING TECHNIQUES

A. Random Sampling – the method of selecting a sample of size n from a universe of
size N such that each member of the population has an equal chance of being included in
the sample and all possible combinations of size n have an equal chance of being
selected as the sample.

1. Lottery Sampling – usually carried out by assigning a number to each member of
the population. The names of the members are written on pieces of paper and
placed in a box or drum, from which the required number of sample units is
drawn.
2. Table of Random Numbers – the selection of each member of the population is left
entirely to chance, and every member of the population has an equal chance
of being chosen.

B. Systematic Sampling – sample units are obtained by drawing every kth item on a
list, say every 4th, 7th, or 10th item.

1. Stratified Sampling – in this method the population is first divided into groups
(strata) based on homogeneity, in order to avoid drawing samples whose
members come from only one stratum. The number of sampling units drawn from
each stratum is proportionate to the total number of units in that stratum: the
larger the stratum, the more sample units are drawn from it; the smaller the
stratum, the fewer.
2. Cluster Sampling – the cluster sample is sometimes referred to as an area sample
because it is frequently applied on a geographical basis. On this basis, districts or
blocks of a municipality or city are selected. These districts or blocks constitute the
clusters. Cluster sampling is useful in selecting the sample when blocks in a
community or city are occupied by heterogeneous groups.
3. Multi-stage Sampling – this technique uses several stages or phases in getting the
sample from the general population. Multi-stage sampling is useful in conducting
nationwide surveys or any survey involving a large universe.

C. Non-random Sampling – under this methodology, not all members of the population
are given equal chances to be chosen. Certain elements in the population are deliberately
left out in the choice of the sample for varied reasons.

1. Purposive Sampling – this is based on certain criteria laid down by the researcher.
People who satisfy the criteria are interviewed.

2. Quota Sampling – this is a relatively quick and inexpensive method to operate. Each
interviewer is given definite instructions about the section of the public he is to
question, but the final choice of the actual persons is left to his own convenience
or preference and is not predetermined by some carefully operated randomized
plan. Each interviewer then proceeds to fill the prescribed quota.
3. Convenience Sampling – for example, a researcher might want to find out the
popularity of a radio program. Since the researcher has a telephone, he might
simply use it and “randomly” pick his samples from the telephone directory. This
method, of course, is biased against people who do not have telephones.
4. Snowball Sampling – the selection of samples through referrals made by people
who possess characteristics that are of interest to the researcher.

Sample size determination

SLOVIN’S FORMULA

$$n = \frac{N}{1 + Ne^{2}}$$

where: n = sample size


N = population size
e = desired margin of error (percent allowance for non-precision
because of the use of the sample instead of the population).

Sample size for specified margins of error

Sample size (n) by population size (N) and margin of error (e)

N ±1% ±2% ±3% ±4% ±5% ±10%
500 * * * * 222 83
1,500 * * 638 441 316 94
2,500 * 1,250 769 500 345 96
3,000 * 1,364 811 517 353 97
4,000 * 1,538 870 541 364 98
5,000 * 1,667 909 556 370 98
6,000 * 1,765 938 566 375 98
7,000 * 1,842 959 574 378 99
8,000 * 1,905 976 580 381 99
9,000 * 1,957 989 584 383 99
10,000 5,000 2,000 1,000 588 385 99
50,000 8,333 2,381 1,087 617 397 100
* In these cases, the assumption of normal approximation is poor, and the sample size
formula does not apply.

Example
Determine the reliable sample size from 10,000 students of UC if the margin of error is
2%. Freshmen are 3500; sophomores are 3000; juniors are 2000; and seniors are 1500.

Solution:
Using Slovin’s formula, the total number of students to be selected for a reliable sample is

$$n = \frac{N}{1 + Ne^{2}} = \frac{10000}{1 + 10000(0.02)^{2}} = 2000 \text{ students}$$

Then determine the number of students to be selected from the different year levels by
stratified sampling:

$$\text{Freshmen: } n_1 = \frac{3500}{10000}(2000) = 700 \text{ freshman students}$$

$$\text{Sophomores: } n_2 = \frac{3000}{10000}(2000) = 600 \text{ sophomore students}$$

$$\text{Juniors: } n_3 = \frac{2000}{10000}(2000) = 400 \text{ junior students}$$

$$\text{Seniors: } n_4 = \frac{1500}{10000}(2000) = 300 \text{ senior students}$$

Lastly, select the students from each year level by systematic sampling, for example
using the student records from the MIS of the school.

Year level | Number of students to be selected
Freshmen | 700
Sophomores | 600
Juniors | 400
Seniors | 300
Total sample size | 2000
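
Slovin's formula and the proportional allocation above can be sketched together in a few lines. The function name `slovin` and the dictionary layout are illustrative choices, not part of the notes; results are rounded to whole students.

```python
def slovin(population_size, margin_of_error):
    """Sample size n = N / (1 + N * e^2) from Slovin's formula."""
    return population_size / (1 + population_size * margin_of_error**2)

# Year-level enrolment from the example above.
strata = {"Freshmen": 3500, "Sophomores": 3000, "Juniors": 2000, "Seniors": 1500}

N = sum(strata.values())
n = round(slovin(N, 0.02))

# Proportional (stratified) allocation: each stratum gets its share of n.
allocation = {level: round(size / N * n) for level, size in strata.items()}

print(n, allocation)
```

With other enrolment figures the rounded stratum counts may not add up exactly to n, in which case one stratum has to be adjusted by hand.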

SAMPLING DISTRIBUTIONS
• The probability distribution of a statistic.

Central Limit Theorem (if the sample size is large and population standard
deviation is known)

• If random samples of size n are drawn from a large or infinite population with
mean μ and variance σ², then the sampling distribution of the sample mean $\bar{X}$
is approximately normally distributed with mean $\mu_{\bar{X}} = \mu$ and standard deviation
$\sigma_{\bar{X}} = \sigma/\sqrt{n}$. Hence

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is a value of a standard normal variable Z.

Example

The random variable X, representing the number of cherries in a cherry puff, has the
following probability distribution

x 4 5 6 7
P(X = x) 0.20 0.40 0.30 0.10

a. Find the mean μ and the variance σ² of X.
b. Find the mean $\mu_{\bar{X}}$ and the variance $\sigma_{\bar{X}}^{2}$ of the mean $\bar{X}$ for random
samples of 36 cherry puffs.
c. Find the probability that the average number of cherries in the 36 cherry
puffs will be less than 5.5.

Solution:
a. Find the mean μ and the variance σ² of X.
• The mean and variance are determined using the formulas for a
discrete random variable.

$$\mu = \sum_{i=1}^{n} x_i \, P(X = x_i)$$

$$\mu = (4)(0.20) + (5)(0.40) + (6)(0.30) + (7)(0.10) = 5.3$$

$$\sigma^{2} = \sum_{i=1}^{n} x_i^{2} \, P(X = x_i) - \mu^{2}$$

$$\sigma^{2} = \left[(4)^{2}(0.20) + (5)^{2}(0.40) + (6)^{2}(0.30) + (7)^{2}(0.10)\right] - (5.3)^{2}$$

$$\sigma^{2} = 0.81$$

b. Find the mean $\mu_{\bar{X}}$ and the variance $\sigma_{\bar{X}}^{2}$ of the mean $\bar{X}$ for random samples
of 36 cherry puffs.
For n = 36

$$\mu_{\bar{X}} = \mu = 5.3$$

$$\sigma_{\bar{X}}^{2} = \frac{\sigma^{2}}{n} = \frac{0.81}{36} = 0.0225$$

c. Find the probability that the average number of cherries in the 36 cherry puffs
will be less than 5.5.
n = 36
$\bar{X}$ = 5.5
P($\bar{X}$ < 5.5) = ?

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{5.5 - 5.3}{\sqrt{0.81}/\sqrt{36}} = 1.33$$

[Figure: Normal curve ($\mu_{\bar{X}}$ = 5.3, $\sigma_{\bar{X}}$ = 0.15) with the area P($\bar{X}$ < 5.5) to the left of $\bar{X}$ = 5.5, and the standard normal curve (σ = 1) with the corresponding area P(Z < 1.33) to the left of Z = 1.33.]

$$P(\bar{X} < 5.5) = P(Z < 1.33) = 0.9082$$
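
The three parts of the cherry-puff example can be sketched in code, assuming Python 3.8 or later. The variable names are illustrative, and because the code does not round Z to 1.33, the final probability differs from the table answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Probability distribution of X, the number of cherries in a cherry puff.
dist = {4: 0.20, 5: 0.40, 6: 0.30, 7: 0.10}

# a. Mean and variance of the discrete random variable X.
mu = sum(x * p for x, p in dist.items())
var = sum(x**2 * p for x, p in dist.items()) - mu**2

# b. Mean and variance of the sample mean for samples of size 36.
n = 36
mean_of_xbar = mu
var_of_xbar = var / n

# c. P(sample mean < 5.5), using the normal approximation from the central limit theorem.
xbar = NormalDist(mu=mean_of_xbar, sigma=sqrt(var_of_xbar))
prob = xbar.cdf(5.5)

print(f"mu={mu:.2f} var={var:.2f} var_xbar={var_of_xbar:.4f} P={prob:.4f}")
```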

Example

The heights of 1000 students are approximately normally distributed with a mean of 174.5
cm and standard deviation of 6.9 cm. If 200 random samples of size 25 are drawn from
this population and the means recorded to the nearest tenth of a cm, determine
a. the mean and standard error of the sampling distribution of 𝑋̅;
b. the number of sample means that fall between 172.5 and 175.8 cm
inclusive;
c. the number of sample means falling below 172 cm.

Solution:
Let X = continuous random variable representing the heights of students
Given: μ = 174.5 cm
σ = 6.9 cm
N = 200 samples
n = 25
means are recorded to the nearest tenth of a centimeter.

a. Determine the mean and standard error of the sampling distribution of 𝑋̅.
𝜇𝑋̅ = 𝜇 = 174.5 𝑐𝑚

𝜎 2 (6.9)2
𝜎𝑋̅ 2 = = = 1.9044
𝑛 25

𝜎2
𝜎𝑋̅ = √ = √𝜎𝑋̅ 2 = √1.9044 = 1.38
𝑛

b. Determine the number of sample means that fall between 172.5 and 175.8 cm
inclusive.
0.1
𝑋̅1 = 172.5 − = 172.45
2

20 | 2 7 cblamsis
EDA (Eng’g Data Analysis)
Reference: Intro to Statistics by R. Walpole Lecture Notes 5
0.1
𝑋̅2 = 175.8 + = 175.85
2

𝑋̅1 − 𝜇 172.45 − 174.5


𝑍1 = 𝜎 = = −1.49
1.38
√𝑛

𝑋̅2 − 𝜇 175.85 − 174.5


𝑍2 = 𝜎 = = 0.98
1.38
√𝑛

Let A = the event that a sample mean falls between 172.5 and 175.8 cm inclusive, and let $n_A$ = the number of such sample means.
$$P(A) = \frac{n_A}{N} = \frac{n_A}{200} = P(172.45 < \bar{X} < 175.85)$$

𝑃(172.45 < 𝑋̅ < 175.85) = 𝑃(−1.49 < 𝑍 < 0.98)

𝑃(−1.49 < 𝑍 < 0.98) = 𝑃(𝑍 < 0.98) − 𝑃(𝑍 < −1.49) = 0.8365 − 0.0681 = 𝟎. 𝟕𝟔𝟖𝟒
Thus
$$P(A) = \frac{n_A}{N} = \frac{n_A}{200} = 0.7684$$

$$n_A = (0.7684)(200) = 153.68 \approx 154 \text{ sample means}$$

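[Figure: normal curve of $\bar{X}$ centered at 174.5 with $\sigma_{\bar{X}}$ = 1.38, area P(172.5 < $\bar{X}$ < 175.8) shaded between $\bar{X}_1$ = 172.5 and $\bar{X}_2$ = 175.8; corresponding standard normal curve (σ = 1) with area P(−1.49 < Z < 0.98) shaded between Z1 = −1.49 and Z2 = 0.98.]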
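
Parts a and b can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the method of the notes: `normal_cdf` and the variable names are illustrative, the normal table is replaced by the error function, and both Z values are rounded to two decimals to match a table lookup.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 174.5, 6.9   # population mean and standard deviation (cm)
n = 25                   # size of each sample
num_samples = 200        # number of samples drawn
unit = 0.1               # means are recorded to the nearest 0.1 cm

# a. Standard error of the sample mean.
se = sigma / math.sqrt(n)

# b. Extend each limit by half a recording unit, then standardize.
lower = 172.5 - unit / 2
upper = 175.8 + unit / 2
z1 = round((lower - mu) / se, 2)
z2 = round((upper - mu) / se, 2)

prob_a = normal_cdf(z2) - normal_cdf(z1)
count_a = round(prob_a * num_samples)

print(f"se = {se:.2f}, z1 = {z1}, z2 = {z2}")
print(f"P(A) = {prob_a:.4f}, expected sample means = {count_a}")
```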

c. Determine the number of sample means falling below 172 cm.


$$\bar{X}_3 = 172 - \frac{0.1}{2} = 171.95$$

$$Z_3 = \frac{\bar{X}_3 - \mu}{\sigma/\sqrt{n}} = \frac{171.95 - 174.5}{1.38} = -1.85$$


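[Figure: normal curve of $\bar{X}$ centered at 174.5 with $\sigma_{\bar{X}}$ = 1.38, area P($\bar{X}$ < 172) shaded to the left of $\bar{X}_3$ = 172; corresponding standard normal curve (σ = 1) with area P(Z < −1.85) shaded to the left of Z = −1.85.]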

Let B = the event that a sample mean falls below 172 cm, and let $n_B$ = the number of such sample means.
$$P(B) = \frac{n_B}{N} = \frac{n_B}{200} = P(\bar{X} < 171.95) = P(Z < -1.85) = 0.0322$$

$$n_B = (0.0322)(200) = 6.44 \approx 6 \text{ sample means}$$
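
The same steps for part c can be wrapped in a small helper. The function name `expected_count_below` and its parameters are hypothetical and chosen only for illustration; as before, the error function stands in for the normal table and Z is rounded to two decimals.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_count_below(limit, mu, sigma, n, num_samples, unit):
    """Return (z, probability, count) for sample means falling below `limit`.

    The limit is lowered by half a recording unit because the means
    are recorded to the nearest `unit`.
    """
    se = sigma / math.sqrt(n)
    z = round((limit - unit / 2 - mu) / se, 2)
    prob = normal_cdf(z)
    return z, prob, round(prob * num_samples)

z3, prob_b, count_b = expected_count_below(
    limit=172, mu=174.5, sigma=6.9, n=25, num_samples=200, unit=0.1
)
print(f"z3 = {z3}, P(B) = {prob_b:.4f}, expected sample means = {count_b}")
```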

t Distribution (if the sample size is small and population standard deviation is
unknown)

If $\bar{X}$ and $S^2$ are the mean and variance, respectively, of a random sample of size n taken
from a population that is normally distributed with mean μ and variance $\sigma^2$, then

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

is a value of a random variable T having the t distribution with v = n − 1 degrees of
freedom.

Critical Values of the t Distribution


(Reference: Introduction to Statistics by R. Walpole – Table A.5 – page 481)

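[Figure: t curve with the area α shaded to the right of tα.]

Right-tail area α (columns) by degrees of freedom v (rows):
v 0.10 0.05 0.025 0.01 0.005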
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
4 1.533 2.132 2.776 3.747 4.604
5 1.476 2.015 2.571 3.365 4.032
6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250
10 1.372 1.812 2.228 2.764 3.169
11 1.363 1.796 2.201 2.718 3.106
12 1.356 1.782 2.179 2.681 3.055
13 1.350 1.771 2.160 2.650 3.012
14 1.345 1.761 2.145 2.624 2.977
15 1.341 1.753 2.131 2.602 2.947
16 1.337 1.746 2.120 2.583 2.921
17 1.333 1.740 2.110 2.567 2.898
18 1.330 1.734 2.101 2.552 2.878
19 1.328 1.729 2.093 2.539 2.861
20 1.325 1.725 2.086 2.528 2.845
21 1.323 1.721 2.080 2.518 2.831
22 1.321 1.717 2.074 2.508 2.819
23 1.319 1.714 2.069 2.500 2.807
24 1.318 1.711 2.064 2.492 2.797
25 1.316 1.708 2.060 2.485 2.787
26 1.315 1.706 2.056 2.479 2.779
27 1.314 1.703 2.052 2.473 2.771
28 1.313 1.701 2.048 2.467 2.763
29 1.311 1.699 2.045 2.462 2.756
infinity 1.282 1.645 1.960 2.326 2.576

For v degrees of freedom, each entry in the table above is the critical value tα for which the area under the t curve to the right of tα equals α, that is, P(T > tα) = α.
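
Table entries can also be computed rather than looked up. The sketch below is an illustration under stated assumptions, not the procedure used to build Table A.5: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names `t_pdf`, `t_right_tail` and `t_critical` are hypothetical, and the search assumes the critical value lies between 0 and 50, which holds for v ≥ 2 and the α values tabulated above.

```python
import math

def t_pdf(x, v):
    """Density of the t distribution with v degrees of freedom."""
    coeff = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coeff * (1 + x * x / v) ** (-(v + 1) / 2)

def t_right_tail(t, v, steps=1000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, v) + t_pdf(t, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    return 0.5 - total * h / 3

def t_critical(alpha, v, upper=50.0):
    """Critical value t_alpha with P(T > t_alpha) = alpha, by bisection.

    Assumes 0 < alpha < 0.5 and that the answer lies in [0, upper].
    """
    lo, hi = 0.0, upper
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_right_tail(mid, v) > alpha:
            lo = mid   # tail still too large: move right
        else:
            hi = mid   # tail too small: move left
    return (lo + hi) / 2

# Recompute the row v = 23 for comparison with the table.
for alpha in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(alpha, round(t_critical(alpha, 23), 3))
```

The printed values can be compared with the row v = 23 of the table. Statistical libraries provide an inverse t CDF that would replace all three functions.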

Example

Given a random sample of size 24 from a normal distribution, find k such that
a. P(– 2.069 < T < k) = 0.965
b. P(k < T < 2.807) = 0.095
c. P(– k < T < k) = 0.90

Solution:
The degrees of freedom, 𝑣 = n – 1 = 24 – 1 = 23

a. Find k such that P(– 2.069 < T < k) = 0.965

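The area between T = −2.069 and T = k is 0.965. Because this area exceeds 0.50, T = k must lie to the
right of T = 0.

[Figure: t curve with area β to the left of T = −2.069, area 0.965 between T = −2.069 and T = k, and area α to the right of T = k.]

P(T < −2.069) = β and P(T > k) = α, so that β + 0.965 + α = 1 and k = tα for v = 23.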

Determine β from the t table with v = 23 degrees of freedom. By symmetry, P(T < −2.069) = P(T > 2.069),
so look up the value 2.069 without its negative sign.

α        0.10   0.05   0.025 (= β)   0.01   0.005
v = 23   1.319  1.714  2.069         2.500  2.807

Thus: β = 0.025

And: α = 1 − (0.965 + 0.025) = 0.01

Solve for k in the t table with v = 23 and α = 0.01:

α        0.10   0.05   0.025  0.01   0.005
v = 23   1.319  1.714  2.069  2.500  2.807

Thus: k = 2.500

b. Find k such that P(k < T < 2.807) = 0.095

The area between T = k and T = 2.807 is 0.095. Since this is less than the area between T = 0 and
T = 2.807 (0.50 − 0.005 = 0.495), T = k must lie to the right of T = 0.


[Figure: t curve with area 0.095 between T = k and T = 2.807, and area α to the right of T = 2.807.]

𝑃(𝑇 > 2.807) = 𝛼

Determine α from the t table with v = 23 degrees of freedom and T = 2.807:

α        0.10   0.05   0.025  0.01   0.005
v = 23   1.319  1.714  2.069  2.500  2.807

Thus: α = 0.005

And: α1 = (0.095 + α) =(0.095 + 0.005) = 0.10 = area to the right of T = k.

Solve for k in the t table with v = 23 and α1 = 0.10:

α        0.10   0.05   0.025  0.01   0.005
v = 23   1.319  1.714  2.069  2.500  2.807

Thus: k = 1.319

c. Find k such that P(– k < T < k) = 0.90.

The area between T = −k and T = k is 0.90, so T = −k lies to the left of T = 0 and T = k lies to the
right. The two tail areas are equal because the t curve is symmetric about T = 0.

[Figure: t curve with area 0.90 between T = −k and T = k, and area α in each tail.]

Determine α:
$$\alpha = \frac{1}{2}(1.0 - 0.90) = 0.05$$

Solve for k in the t table with v = 23 and α = 0.05:

α        0.10   0.05   0.025  0.01   0.005
v = 23   1.319  1.714  2.069  2.500  2.807

Thus: k = 1.714 and −k = −1.714, since the t curve is symmetric about T = 0.
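
The three lookups in this example follow one pattern: convert a tabulated t value to its tail area, combine areas, then convert the resulting area back to a t value. A minimal sketch of that bookkeeping is shown below, using only the row v = 23 from the table. The names `ROW_23`, `tail_area` and `critical_value` are illustrative assumptions, and the lookup works only for values that appear exactly in the table.

```python
# Row v = 23 of the t table: right-tail area alpha -> critical value t_alpha.
ROW_23 = {0.10: 1.319, 0.05: 1.714, 0.025: 2.069, 0.01: 2.500, 0.005: 2.807}

def tail_area(t_value, row):
    """Return alpha with t_alpha == |t_value| (the sign is dropped by symmetry)."""
    for alpha, t_alpha in row.items():
        if t_alpha == abs(t_value):
            return alpha
    raise ValueError("t value is not tabulated in this row")

def critical_value(alpha, row):
    """Return t_alpha; alpha is rounded to absorb floating-point error."""
    return row[round(alpha, 3)]

# a. P(-2.069 < T < k) = 0.965
beta = tail_area(-2.069, ROW_23)                    # area left of -2.069
k_a = critical_value(1 - (0.965 + beta), ROW_23)    # area right of k

# b. P(k < T < 2.807) = 0.095
alpha = tail_area(2.807, ROW_23)                    # area right of 2.807
k_b = critical_value(0.095 + alpha, ROW_23)         # area right of k

# c. P(-k < T < k) = 0.90
k_c = critical_value((1 - 0.90) / 2, ROW_23)        # equal area in each tail

print(k_a, k_b, k_c)
```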

Example

A manufacturing firm claims that the batteries used in their electronic games will last an
average of 30 hours. To maintain this average, 16 batteries are tested each month. If the
computed t value falls between – t0.025 and t0.025, the firm is satisfied with its claim. What
conclusion should the firm draw from a sample that has a mean 𝑋̅ = 27.5 hours and a
standard deviation s = 5 hours? Assume the distribution of battery lives to be
approximately normal.

Solution:
n = 16
𝑋̅ = 27.5 ℎ𝑟𝑠
s = 5 hrs.
μ = 30 hrs.

Determine the t-value:

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{27.5 - 30}{5/\sqrt{16}} = -2.0$$

Determine the t-values that bound the acceptance region, with v = n − 1 = 16 − 1 = 15 degrees of freedom and α = 0.025.

[Figure: t curve with the acceptance region between T = −t0.025, 15 and T = t0.025, 15, and an area α = 0.025 in each tail.]

α        0.10   0.05   0.025  0.01   0.005
v = 15   1.341  1.753  2.131  2.602  2.947

Thus: T = – t0.025, 15 = – 2.131 and T = t0.025, 15 = 2.131

Conclusion:
Since t = −2.0 is greater than −2.131 and less than 2.131, it falls in the acceptance region.
The firm should therefore be satisfied with its claim that the batteries last an average of 30 hours.

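[Figure: t curve with the acceptance region between T = −2.131 and T = 2.131, an area α = 0.025 in each tail, and the computed value t = −2.0 marked inside the acceptance region.]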
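
The battery check reduces to one formula and one comparison. In this sketch the helper name `t_statistic` is an illustrative assumption, and the critical value 2.131 is taken from the table row v = 15 rather than computed.

```python
import math

def t_statistic(sample_mean, mu, s, n):
    """t = (sample_mean - mu) / (s / sqrt(n))."""
    return (sample_mean - mu) / (s / math.sqrt(n))

n = 16                  # batteries tested each month
t = t_statistic(sample_mean=27.5, mu=30.0, s=5.0, n=n)

t_crit = 2.131          # t_{0.025, 15}, read from the t table
satisfied = -t_crit < t < t_crit

print(f"t = {t:.1f}, acceptance region = ({-t_crit}, {t_crit}), satisfied = {satisfied}")
```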

Example

A cigarette manufacturer claims that his cigarettes have an average nicotine content of
1.83 milligrams. A random sample of 8 cigarettes of this type has nicotine contents of
2.0, 1.7, 2.1, 1.9, 2.2, 2.1, 2.0, and 1.6 milligrams. What is the T-value?

Solution:
n=8
μ = 1.83
Nicotine content, Xi (mg)    (Xi)²
2.0                          4.00
1.7                          2.89
2.1                          4.41
1.9                          3.61
2.2                          4.84
2.1                          4.41
2.0                          4.00
1.6                          2.56
∑ Xi = 15.60                 ∑ (Xi)² = 30.72

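Determine the sample mean, $\bar{X}$, and standard deviation, s.

$$\bar{X} = \frac{\sum X_i}{n} = \frac{15.6}{8} = 1.95 \text{ milligrams}$$

$$s = \left[ \frac{n \sum X_i^2 - \left( \sum X_i \right)^2}{n(n-1)} \right]^{1/2}$$

$$s = \left[ \frac{8(30.72) - (15.6)^2}{8(8-1)} \right]^{1/2} = 0.207 \text{ milligrams}$$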

Determine the T-value:

$$T = t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{1.95 - 1.83}{0.207/\sqrt{8}} = 1.640$$
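
The nicotine example can be verified directly from the raw data. This sketch uses `statistics.stdev` from Python's standard library, which computes the sample standard deviation with the n − 1 divisor and is therefore equivalent to the shortcut formula above. The variable names are illustrative.

```python
import math
import statistics

nicotine = [2.0, 1.7, 2.1, 1.9, 2.2, 2.1, 2.0, 1.6]  # milligrams
mu = 1.83                                             # claimed mean

n = len(nicotine)
x_bar = statistics.mean(nicotine)
s = statistics.stdev(nicotine)     # sample standard deviation (n - 1 divisor)

# Shortcut formula from the notes, for comparison.
sum_x = sum(nicotine)
sum_x2 = sum(x * x for x in nicotine)
s_shortcut = math.sqrt((n * sum_x2 - sum_x**2) / (n * (n - 1)))

t = (x_bar - mu) / (s / math.sqrt(n))

print(f"x_bar = {x_bar:.2f}, s = {s:.3f}, s_shortcut = {s_shortcut:.3f}, t = {t:.3f}")
```

Carrying s at full precision instead of rounding it to 0.207 changes t only beyond the third decimal place.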
