Chapter V Continuous Probability Distributions PDF - Dr. Ashraf

This document discusses continuous probability distributions and the normal distribution. It defines continuous random variables and probability density functions which are used to describe continuous distributions. Specifically, it covers the normal probability distribution and how it is characterized by the mean and standard deviation parameters. The normal distribution is a commonly used continuous probability distribution with a bell-shaped curve.

Uploaded by

Abdullah Mamun
Copyright
© © All Rights Reserved

STS 201, Chapter-V, Prof. Dr. M. Ashraf Hossain

Continuous Probability Distributions


Topics: Continuous random variable, Continuous probability
distributions…, Normal probability distribution, Standard normal
probability distribution, etc.
Recall:
Random variables (RVs): RVs are quantities that take on different
values depending on chance, or probability.
Discrete variables: The word discrete means countable, e.g. the number of
students in a class, cars sold by a car dealer in one month, students who
were protesting the tuition increase last semester, applicants who have
applied for a vacant position at a company, or typographical errors in a
rough draft of a book; all are countable. For each of these, if the variable
is X, then x = 0, 1, 2, 3, …. Note that X can become very large.
So, a discrete random variable takes distinct values that can be counted.
Continuous variables (CVs): CVs take on any value within the
limits of the variable. In general, quantities such as pressure, height,
mass, weight, density, volume, temperature, and distance
are examples. For each of these, if the variable is X, then x is greater than zero
(x > 0) and less than some maximum possible value, but it can take on any value
within this range, e.g. the amount of water in a 5-litre plastic bottle
(any value between 0 and 5 litres).
Continuous Random Variables?
(A continuous variable is a variable that takes on any value within
the limits of the variable.)
• The following are examples of continuous random variables: the length of
time it takes a truck driver to go from Rangpur to Cox’s Bazar; the depth of drilling
to find oil in the Bay of Bengal; the weight of a truck at a truck-weighing station; and
the amount of water in a 5-litre plastic bottle.
For each of these, if the variable is X, then x > 0 and less than some maximum possible
value, but it can take on any value within this range.

For the following situations, determine whether a discrete or continuous random
variable is involved.
Example 1: The number of hairs on a cat, a monkey, or a sea otter. Since hairs
are something we can count, this is a discrete random variable, even though the
number of hairs may be so large that we wouldn't actually want to count them
(there are no "half hairs" or fractional amounts of hair, only whole numbers of hairs).
Example 2: The length of a cat, a monkey, or a sea otter is typically considered
a continuous variable, since the length will typically not be an exact number of
feet but will differ by some fraction of a foot.
Example 3: The age of a cat, monkey, or sea otter can be treated as discrete or
continuous. We usually report age only as a number of years, but sometimes we
talk about, e.g., a sea otter being three and a half years old. Technically, since
age can be treated as a continuous random variable, that is how it is considered
unless we have a reason to treat it as a discrete variable.
Continuous Random Variables
• A continuous random variable (CRV) can assume any value in an interval on
the real line or in a collection of intervals. It is not possible to talk about the
probability of the CRV assuming a particular value. Instead, we talk about the
probability of the random variable assuming a value within a given interval.
• The probability density function (pdf) is used to describe probabilities for CRVs
(and the CRVs are described with pdf curves).
• The area under the density curve between two points corresponds to
the probability that the variable falls between those two values: the probability of
the CRV assuming a value within some given interval (say from x1 to x2) is defined
to be the area under the graph of the pdf between x1 and x2.
• The cumulative distribution function (cdf) of X is defined by F(x) = P(X ≤ x).
• A random variable is uniformly distributed whenever the probability is proportional
to the length of the interval. Uniform pdf: f(x) = 1/(b − a) for a < x < b; f(x) = 0 elsewhere.
Expected value of x: E(x) = (a + b)/2. Variance of x: Var(x) = (b − a)^2/12, where a =
smallest value the variable can assume and b = largest value the variable can assume.
Example of Uniformly Distributed Random Variable:
Example (Slater's Buffet): Slater customers are charged for the amount of
salad they take. Sampling suggests that the amount of salad taken is uniformly
distributed between 5 ounces and 15 ounces. The pdf, f(x) = 1/10 for 5 < x < 15,
and f(x) = 0 elsewhere, where x = salad plate filling weight.
Problem exercise (for example): What is the probability that a customer will
take between 12 and 15 ounces of salad?
P(12 < x < 15) = (1/10)(3) = .3
[Figure: uniform pdf f(x) = 1/10 plotted against salad weight x (oz.), with axis marks at 5, 10, 12, and 15]
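The salad calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `uniform_prob` is made up for illustration, and it simply multiplies the length of the interval by the constant density 1/(b − a).

```python
def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X uniformly distributed on (a, b)."""
    # Only the part of (lo, hi) that overlaps (a, b) carries probability.
    left = max(lo, a)
    right = min(hi, b)
    overlap = max(0.0, right - left)
    return overlap / (b - a)


a, b = 5, 15                    # smallest and largest salad weight (oz.)
p = uniform_prob(12, 15, a, b)  # area of a rectangle: width 3, height 1/10
mean = (a + b) / 2              # E(x) = (a + b)/2
variance = (b - a) ** 2 / 12    # Var(x) = (b - a)^2 / 12
print(p, mean, variance)
```

The probability agrees with the hand calculation (.3); the mean and variance follow from the uniform formulas given earlier.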
Continuous Probability Distributions
A probability distribution in which the random variable X can take on any
value (is continuous). Because there are infinitely many values that X could assume,
the probability of X taking on any one specific value is zero. ...
• Continuous distributions: Normal distribution, Standard normal, t distribution,
Chi-square, F distribution, Beta distribution, Cauchy distribution, Exponential
distribution, Gamma distribution, Logistic distribution, Weibull distribution, etc.
• In probability theory, a pdf or density of a CRV is a function whose value at any
given sample (or point) in the sample space (the set of possible values taken by the
random variable) can be interpreted as providing a relative likelihood that the value
of the random variable would equal that sample.
• The normal distribution is one important example of continuous distributions.
Normal Probability Density Function
• Recall: continuous random variables are described with probability density
function (pdf) curves
• Normal pdfs are recognized by their typical bell-shape
Fig.: Age distribution of a pediatric* population with overlying Normal pdf
*Pediatrics is the branch of medicine dealing with the health and medical care
of infants, children, and adolescents from birth up to the age of 18.
Area Under the Curve (AUC)
• pdfs should be viewed almost like a histogram
• Top Figure: The darker bars of the histogram
correspond to ages ≤ 9 (~40% of distribution)
• Bottom Figure: shaded area under the curve (AUC)
corresponds to ages ≤ 9 (~40% of area)
Normal Probability Distributions
Two types of means and standard deviations
• The mean and standard deviation from the pdf (μ & σ) are parameters
• The mean and standard deviation from a sample (“x-bar” & s) are statistics
• Statistics and parameters are related, but are not the same thing!
Continuous probability distribution: A probability distribution in which the
random variable X can take on any value (is continuous). Because there are
infinitely many values that X could assume, the probability of X taking on any
one specific value is zero. Therefore we often speak in ranges of values
(e.g., p(X > 0) = .50). The normal distribution is one example of a continuous
distribution. The probability that X falls between two values (a and b) equals
the integral (area under the curve) of the pdf f(x) from a to b:
$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
Mean and Standard Deviation of Normal Density
• Mean μ and standard deviation σ
• Points of inflection* lie one σ below and one σ above μ
• Practice sketching Normal curves
• Feel the inflection points (where the curvature changes direction)
• Label horizontal axis with σ landmarks
* In mathematics, a point of inflection is a change of curvature from convex to
concave at a particular point on a curve. Here: "the point of inflection of the
bell-shaped curve".
The Normal Distribution
(The most usual & important example of the continuous distributions)
Normal pdfs have two parameters: μ, the expected value (mean), and σ, the standard deviation (sigma)
• μ controls location: changing μ shifts the distribution left or right
• σ controls spread: changing σ increases or decreases the spread
[Figure: Graph of the Normal Probability Density Function, f(X) against X, centered at μ]
Symmetric Distribution, where μ = Mean = Median = Mode


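The Normal Distribution:
as mathematical function (pdf)
Normal distribution is defined by its mean and standard deviation.
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
This is a bell shaped curve with different centers and spreads depending on μ and σ.
E(X) = μ = $\int_{-\infty}^{\infty} x\,\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx$
Variance(X) = σ² = $\left(\int_{-\infty}^{\infty} x^{2}\,\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx\right) - \mu^{2}$
Standard Deviation(X) = σ
The Normal pdf is a probability function, so no matter what the values of μ and σ,
it must integrate to 1!
$\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx = 1$
Note constants: π = 3.14159 and e = 2.71828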
The beauty of the Normal Curve:
68-95-99.7 Rule (AUC: Area Under the Curve)
No matter what μ and σ are, the area between μ−σ and μ+σ is about 68%; the area
between μ−2σ and μ+2σ is about 95%; and the area between μ−3σ and μ+3σ is about
99.7%. Almost all values fall within 3 standard deviations.
• 68% of the AUC within ±1σ of μ
• 95% of the AUC within ±2σ of μ
• 99.7% of the AUC within ±3σ of μ
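68-95-99.7 Rule in Math terms…
$\int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx = .68$
$\int_{\mu-2\sigma}^{\mu+2\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx = .95$
$\int_{\mu-3\sigma}^{\mu+3\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx = .997$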
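The three areas can be checked numerically without doing the integrals by hand. The sketch below relies on the standard identity that the area within k standard deviations of the mean equals erf(k/√2), which holds for any μ and σ; the function name `central_area` is only illustrative.

```python
import math


def central_area(k):
    """Area under a Normal pdf between mu - k*sigma and mu + k*sigma."""
    # The result does not depend on mu or sigma, only on k.
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(central_area(k), 4))
```

The printed values are slightly more precise than the rounded 68%, 95%, and 99.7% quoted in the rule.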
Symmetry in the Tails
Because the Normal curve is symmetrical and the total AUC is exactly 1, we can
easily determine the AUC in the tails. For example, when the central AUC is 95%,
the remaining 5% splits equally: 2.5% in each tail.
Example: 68-95-99.7 Rule
Wechsler adult intelligence scores: Normally distributed with μ = 100 and σ = 15; X ~ N(100, 15)
• 68% of scores within μ ± σ = 100 ± 15 = 85 to 115
• 95% of scores within μ ± 2σ = 100 ± (2)(15) = 70 to 130
• 99.7% of scores within μ ± 3σ = 100 ± (3)(15) = 55 to 145
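The σ landmarks for any Normal variable can be tabulated the same way. A small sketch, with the hypothetical helper name `sigma_landmarks`, applied to the Wechsler parameters:

```python
def sigma_landmarks(mu, sigma):
    """Return the intervals mu ± k*sigma for k = 1, 2, 3."""
    return {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}


# Wechsler adult intelligence scores: mu = 100, sigma = 15
landmarks = sigma_landmarks(100, 15)
for k, (low, high) in landmarks.items():
    print(f"within {k} sigma: {low} to {high}")
```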
Normal Probability Distributions
How good is the rule for real data?
Example: Male Height
Male height in West: Normal with μ = 70.0˝ and σ = 2.8˝
• 68% within μ ± σ = 70.0 ± 2.8 = 67.2˝ to 72.8˝
• 32% in tails (below 67.2˝ and above 72.8˝)
• 16% below 67.2˝ and 16% above 72.8˝ (symmetry)

7: Normal Probability Distributions


Normal Distribution: Probability Density Functions (pdf)
[Figure: Normal densities f(y) plotted against y from 0 to 200 for N(100,400), N(100,100), N(100,900), N(75,400), and N(125,400)]
Standard Normal Distribution: Z-distribution
• In short, the Z-distribution is another name for the Standard Normal distribution.
• The Z-distribution is a specific instance of the Normal distribution
that has a mean of 0 and a standard deviation of 1.
• The visual way to understand it would be the following image:
[Figure: probability density function plot; the red curve is the standard normal distribution]

The Standard Normal Distribution (Z)
All normal distributions can be converted into the standard normal curve by
subtracting the mean and dividing by the standard deviation:
$Z = \frac{X - \mu}{\sigma}$
Comparing X and Z units: X = 100 and X = 200 on the X scale (μ = 100, σ = 50)
correspond to Z = 0 and Z = 2.0 on the Z scale (μ = 0, σ = 1).
$p(Z) = \frac{1}{(1)\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{Z-0}{1}\right)^{2}} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^{2}}$
Somebody calculated all the integrals for the standard normal and put them in a
table! So we never have to integrate!
Even better, computers now do all the integration.
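As one way of letting the computer do the integration, the sketch below standardizes a value and evaluates the standard normal cumulative probability through the error function in Python's `math` module. The function names `standardize` and `std_normal_cdf` are illustrative, not from the slides.

```python
import math


def standardize(x, mu, sigma):
    """Convert x to a z score: subtract the mean, divide by the SD."""
    return (x - mu) / sigma


def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# X = 200 on a scale with mu = 100 and sigma = 50, as in the comparison above
z = standardize(200, mu=100, sigma=50)
print(z, round(std_normal_cdf(z), 4))
```

A printed standard normal table gives the same cumulative probability for the same z, rounded to four decimals.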
Determining Normal Probabilities
When values do not fall directly on σ landmarks:
1. State the problem; 2. Standardize the value(s) (z score);
3. Sketch, label, and shade the curve; 4. Use Table B

Step 1: State the Problem
What percentage of gestations* are less than 40 weeks?
Let X ≡ gestational length
We know from prior research: X ~ N(39, 2) weeks
Pr (X ≤ 40) = ?
*Gestation (the development of something over a period of time): the process or
period of developing inside the womb between conception and birth.

Step 2: Standardize
• Standard Normal variable ≡ “Z” ≡ a Normal random variable with μ = 0 and σ = 1,
Z ~ N(0,1)
• Use Table B to look up cumulative probabilities for Z

Step 2 (cont.)
• Turn value into z score: $z = \frac{x - \mu}{\sigma}$
z-score = no. of σ-units above (positive z) or below (negative z) distribution mean μ
For example, the value 40 from X ~ N(39, 2) has
$z = \frac{40 - 39}{2} = 0.5$
Steps 3 & 4: Sketch & Table B
[Figure: sketch of the Z ~ N(0,1) curve with the unknown area marked "?"]
Example: a Z variable of 1.96 has cumulative probability 0.9750.
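In place of Table B, the table lookup for the gestation problem can be sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). Reading N(39, 2) as mean 39 and standard deviation 2 follows the z-score calculation above; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 39, 2            # gestational length X ~ N(39, 2), in weeks

# Step 2: standardize the value 40
z = (40 - mu) / sigma

# Step 4: cumulative probability for Z ~ N(0, 1), replacing the table lookup
p_table = NormalDist().cdf(z)

# The same probability computed directly on the X scale
p_direct = NormalDist(mu, sigma).cdf(40)

print(z, round(p_table, 4), round(p_direct, 4))
```

Both routes give the same answer, which is the point of standardizing: Pr(X ≤ 40) equals Pr(Z ≤ 0.5), roughly 69% of gestations.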

Values Corresponding to Normal Probabilities
1). State the problem; 2). Find Z-score corresponding to percentile (Table B);
3). Sketch; 4). Unstandardize: $x = \mu + z_p\,\sigma$
e.g., What is the 97.5th percentile on the Standard Normal curve? z.975 = 1.96
z percentiles
• zp ≡ the Normal z variable with cumulative probability p
• Use Table B to look up the value of zp
• Look inside the table for the closest cumulative probability entry
• Trace the z score to row and column
Notation: Let zp represent the z score with cumulative probability p,
e.g., z.975 = 1.96
Example
• Suppose SAT scores roughly follow a normal distribution in the
U.S. population of college-bound students (with range restricted
to 200-800), and the average math SAT is 500 (μ) with a
standard deviation (σ) of 50, then:
68% of students will have scores between 450 and 550
95% will be between 400 and 600
99.7% will be between 350 and 650
• BUT… What if you wanted to know the math SAT score
corresponding to the 90th percentile (= 90% of students are lower)?
P(X ≤ Q) = .90, i.e. find Q such that
$\int_{200}^{Q} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^{2}}\,dx = .90$
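Solving that integral equation for Q by hand is impractical, but an inverse cdf does it directly. The sketch below uses `statistics.NormalDist.inv_cdf` and assumes the restriction of scores to 200-800 can be ignored, since 200 lies six standard deviations below the mean and the area below it is negligible.

```python
from statistics import NormalDist

mu, sigma = 500, 50                 # math SAT mean and standard deviation

# z score with cumulative probability 0.90
z_90 = NormalDist().inv_cdf(0.90)

# Unstandardize: x = mu + z * sigma
q = mu + z_90 * sigma

print(round(z_90, 4), round(q, 1))
```

Under these assumptions the 90th percentile lands a little under 1.3 standard deviations above the mean.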
Example
• For example: What’s the probability of getting a math SAT score of
575 or less, μ = 500 and σ = 50?
$Z = \frac{575 - 500}{50} = 1.5$
i.e., a score of 575 is 1.5 standard deviations above the mean
$P(X \le 575) = \int_{200}^{575} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^{2}}\,dx \;\longrightarrow\; \int_{-\infty}^{1.5} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^{2}}\,dz$
Yikes!
But looking up Z = 1.5 in the standard normal chart (or entering it
into SAS) is no problem: P(Z ≤ 1.5) = .9332
Practice problem
If birth weights in a population are normally distributed with a mean of
109 oz and a standard deviation of 13 oz,
a. What is the chance of obtaining a birth weight of 141 oz or heavier
when sampling birth records at random?
b. What is the chance of obtaining a birth weight of 120 oz or lighter?

a). $Z = \frac{141 - 109}{13} = 2.46$
From the chart or SAS → Z of 2.46 corresponds to a right tail
(greater than) area of: P(Z ≥ 2.46) = 1 − (.9931) = .0069 or .69%
b). $Z = \frac{120 - 109}{13} = .85$
From the chart or SAS → Z of .85 corresponds to a left tail area of:
P(Z ≤ .85) = .8023 = 80.23%
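Both parts of the practice problem can be reproduced in a few lines. This sketch uses `statistics.NormalDist` in place of the chart or SAS and rounds each z score to two decimals, as a printed table would, so the results line up with the values above; the helper name `z_score` is illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 109, 13          # birth weight mean and SD in oz, from the problem


def z_score(x, mu=MU, sigma=SIGMA):
    """Standardize x and round to two decimals, as a printed z table would."""
    return round((x - mu) / sigma, 2)


std_normal = NormalDist()    # Z ~ N(0, 1)

z_a = z_score(141)
p_heavy = 1 - std_normal.cdf(z_a)   # right tail: P(Z >= z_a)

z_b = z_score(120)
p_light = std_normal.cdf(z_b)       # left tail: P(Z <= z_b)

print(z_a, round(p_heavy, 4))
print(z_b, round(p_light, 4))
```

Without the rounding step the unrounded z scores give slightly different probabilities, which is worth keeping in mind when comparing software output with table answers.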
Looking up probabilities in the standard normal table
What is the area to the left of Z = 1.51 in a standard normal curve?
Z = 1.51: Area is 93.45%
Step 1: State Problem
Question: What gestational length is smaller than 97.5% of gestations?
Let X represent gestational length. We know from prior research that
X ~ N(39, 2)
A value that is smaller than .975 of gestations has a cumulative probability
of .025
Step 2 (z percentile)
Less than 97.5% (right tail) = greater than 2.5% (left tail)
z lookup: z.025 = −1.96

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
–1.9   .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
Probabilities Between Points
a represents a lower boundary
b represents an upper boundary
Pr (a ≤ Z ≤ b) = Pr (Z ≤ b) − Pr(Z ≤ a)

Between Two Points
Pr(-2 ≤ Z ≤ 0.5) = Pr(Z ≤ 0.5) − Pr(Z ≤ -2)
.6687 = .6915 − .0228
[Figure: area between -2 and 0.5 shown as the area below 0.5 minus the area below -2]
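The subtraction of two cumulative probabilities translates directly into code. A minimal sketch, with the hypothetical helper `prob_between` built on `statistics.NormalDist`:

```python
from statistics import NormalDist


def prob_between(a, b, mu=0.0, sigma=1.0):
    """Pr(a <= X <= b) for X ~ Normal(mu, sigma) as cdf(b) - cdf(a)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)


p = prob_between(-2, 0.5)    # standard normal by default
print(round(p, 4))
```

Passing `mu` and `sigma` lets the same function work on the original X scale, so standardizing first is optional when software is available.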
Unstandardize and sketch
$x = \mu + z_p\,\sigma = 39 + (-1.96)(2) = 35.08 \approx 35$
The 2.5th percentile is 35 weeks
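The percentile lookup and the unstandardizing step can be checked together. The sketch assumes, as in the gestation example, that N(39, 2) means mean 39 and standard deviation 2, and uses `NormalDist.inv_cdf` in place of reading the z table backwards.

```python
from statistics import NormalDist

mu, sigma = 39, 2                        # gestational length, weeks

# z percentile: z score with cumulative probability 0.025
z_025 = NormalDist().inv_cdf(0.025)

# Unstandardize: x = mu + z_p * sigma
x = mu + z_025 * sigma

print(round(z_025, 2), round(x, 2))
```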


Are your data normally distributed?
Not all continuous random variables are normally distributed!! It is important
to evaluate how well the data are approximated by a normal distribution.
1. Look at the histogram! Does it appear bell shaped?
2. Compute descriptive summary measures: are mean, median, and mode similar?
3. Do 2/3 of observations lie within 1 std. dev. of the mean? Do 95% of
observations lie within 2 std. dev. of the mean?
4. Look at a normal probability plot: is it approximately linear?
5. Run tests of normality (such as Kolmogorov-Smirnov). But be cautious: the
results are highly influenced by sample size!
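Checks 2 and 3 are easy to automate. The sketch below is illustrative only: the helper `fraction_within` and the nine-value `sample` are made up for demonstration and are not the class data discussed next.

```python
from statistics import mean, median, stdev


def fraction_within(data, k):
    """Fraction of observations within k sample SDs of the sample mean."""
    m, s = mean(data), stdev(data)
    inside = [x for x in data if abs(x - m) <= k * s]
    return len(inside) / len(data)


# Hypothetical sample, used only to show the calculation
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print("mean:", mean(sample), "median:", median(sample))
print("within 1 SD:", round(fraction_within(sample, 1), 3))
print("within 2 SD:", round(fraction_within(sample, 2), 3))
```

For roughly Normal data the two fractions should be near 2/3 and 95%; fractions far from those values are a hint to look at the histogram and the normal probability plot.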
Data from a BBA class…
• Median = 6, Mean = 7.1, Mode = 0; SD = 6.8, Range = 0 to 24
• Median = 3, Mean = 3.4, Mode = 3; SD = 2.5, Range = 0 to 12
• Median = 5, Mean = 5.4, Mode = none; SD = 1.8, Range = 2 to 9
• Median = 7:00, Mean = 7:04, Mode = 7:00; SD = :55, Range = 5:30 to 9:00
Normal probability plot: coffee…
Right-skewed! (concave up)
Normal probability plot: love of writing…
Neither right-skewed nor left-skewed, but big gap at 6.
Normal probability plot: exercise…
Right-skewed! (concave up)
Normal probability plot: wake-up time
Closest to a straight line…
Formal tests for normality
Results:
Coffee: Strong evidence of non-normality (p<.01)
Writing love: Moderate evidence of non-normality (p=.01)
Exercise: Weak to no evidence of non-normality (p>.10)
Wakeup time: No evidence of non-normality (p>.25)
Take example of Wake-up Time … for μ ± σ (% of AUC?)
7:04 ± 0:55 = 6:09 to 7:59
Take example of Wake-up Time … for μ ± 2σ (% of AUC?)
7:04 ± 2 × 0:55 = 5:14 to 8:54
Re-expression of Non-Normal Random Variables
• Many variables are not Normal but can be re-expressed with
a mathematical transformation to be Normal
• Examples of mathematical transforms used for this purpose:
  • logarithmic
  • exponential
  • square roots
• Review logarithmic transformations…
Logarithms are exponents of their base
Common log (base 10)
log(10^0) = 0
log(10^1) = 1
log(10^2) = 2
Natural ln (base e)
ln(e^0) = 0
ln(e^1) = 1
[Figure: Base 10 log function]
Example: Logarithmic Re-expression
• Prostate Specific Antigen (PSA) is used to screen for prostate cancer
• In non-diseased populations, it is not Normally distributed, but its
logarithm is: ln(PSA) ~ N(−0.3, 0.8)
• 95% of ln(PSA) within
μ ± 2σ
= −0.3 ± (2)(0.8)
= −1.9 to 1.3
Take exponents of “95% range”
• e^(−1.9) = 0.15 and e^(1.3) = 3.67
• Thus, 2.5% of the non-diseased population have values greater than 3.67,
so use 3.67 as the screening cutoff
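The back-transformation from the log scale can be sketched as follows. It assumes, as the μ ± 2σ calculation does, that N(−0.3, 0.8) gives the mean and standard deviation of ln(PSA); the variable names are illustrative.

```python
import math

mu, sigma = -0.3, 0.8            # parameters of ln(PSA)

# 95% range on the log scale: mu ± 2 sigma
low_log = mu - 2 * sigma
high_log = mu + 2 * sigma

# Take exponents to return to the original PSA scale
low_psa = math.exp(low_log)
high_psa = math.exp(high_log)

print(round(low_log, 1), round(high_log, 1))
print(round(low_psa, 2), round(high_psa, 2))
```

Because the exponential function is increasing, the upper 2.5% on the log scale is still the upper 2.5% on the PSA scale, which is why the back-transformed upper limit can serve as the cutoff.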
Chapter Notes
