Continuous Random Variables and Probability Density Functions

A short explanation on Random Variables

Uploaded by

Peter

A random variable X is said to be continuous if its set of possible values is an entire interval of
numbers – that is, if for some A < B, any number x between A and B is possible. Let X be a
continuous random variable. Then a probability density function (pdf) of X is a function f(x) such
that for any two numbers a and b with $a \le b$,

$$P(a \le X \le b) = \int_a^b f(x)\,dx.$$

That is, the probability that X takes on a value in the interval [a, b] is the area above this interval
and under the graph of the density function f(x) which is referred to as the density curve.

[Fig. 1: $P(a \le X \le b)$ = the area under the density curve f(x) between a and b.]

The two conditions that must be satisfied for f(x) to be a legitimate pdf:

(i) $f(x) \ge 0$ for all x.

(ii) $\int_{-\infty}^{\infty} f(x)\,dx$ = area under the entire graph of f(x) = 1.

If X is a continuous random variable, then for any number c, P(X = c) = 0. Furthermore, for any
two numbers a and b with a < b,

$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b).$$

The cumulative distribution function F(x) for a continuous random variable X is defined for
every number x by
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy.$$

For each x, F(x) is the area under the density curve to the left of x. This is illustrated in Figure 2,
where F(x) increases smoothly as x increases.

[Fig. 2: A pdf and its associated cdf. The area under f(x) to the left of x = 8 equals the height F(8) of the cdf at x = 8.]

Let X be a continuous random variable with pdf f(x) and cdf F(x). Then for any number a,

$$P(X > a) = 1 - F(a),$$

and for any two numbers a and b with a < b,

$$P(a \le X \le b) = F(b) - F(a).$$

The desired probability is the shaded area under the density curve between a and b, and it equals
the difference between the two shaded cumulative areas.

[Fig. 3: Computing $P(a \le X \le b)$ from cumulative probabilities.]

Example 1

Suppose the pdf of the magnitude X of a dynamic load on a bridge (in newtons) is given by

$$f(x) = \begin{cases} \dfrac{1}{8} + \dfrac{3}{8}x, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$$

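For any number x between 0 and 2,

$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_0^x \left(\frac{1}{8} + \frac{3}{8}y\right)dy = \frac{x}{8} + \frac{3}{16}x^2.$$

Thus,

$$F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x}{8} + \dfrac{3}{16}x^2, & 0 \le x \le 2 \\ 1, & 2 < x \end{cases}$$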


The probability that the load is between 1 and 1.5 is

$$P(1 \le X \le 1.5) = F(1.5) - F(1) = \left[\frac{1}{8}(1.5) + \frac{3}{16}(1.5)^2\right] - \left[\frac{1}{8}(1) + \frac{3}{16}(1)^2\right] = \frac{19}{64} \approx .297$$

The probability that the load exceeds 1 is

$$P(X > 1) = 1 - P(X \le 1) = 1 - F(1) = 1 - \left[\frac{1}{8}(1) + \frac{3}{16}(1)^2\right] = \frac{11}{16} \approx .688$$

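
The arithmetic in Example 1 can be checked numerically. The following is a minimal sketch in Python; the names `pdf`, `cdf` and `integrate` are illustrative choices rather than part of the notes, and the midpoint rule is only one simple way to approximate an area.

```python
def pdf(x):
    """Load density from Example 1: 1/8 + (3/8)x on [0, 2], zero elsewhere."""
    return 1 / 8 + 3 * x / 8 if 0 <= x <= 2 else 0.0


def cdf(x):
    """Closed-form cdf obtained by integrating the pdf from 0 to x."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x / 8 + 3 * x ** 2 / 16


def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


p_between = cdf(1.5) - cdf(1)   # P(1 <= X <= 1.5), should equal 19/64
p_exceeds = 1 - cdf(1)          # P(X > 1), should equal 11/16

print("P(1 <= X <= 1.5):", p_between)
print("P(X > 1):        ", p_exceeds)
print("total area:      ", integrate(pdf, 0, 2))
```

The total area printed on the last line should be very close to 1, which confirms condition (ii) for this density.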
If X is a continuous random variable with pdf f(x) and cdf F(x), then at every x at which the
derivative $F'(x)$ exists, $F'(x) = f(x)$.

Continuous Probability Functions: The conditional expectation of X given that event A has
occurred is

$$E[X \mid A] = \int_{-\infty}^{\infty} x\, f(x \mid A)\,dx.$$

If X is discrete, the integral is replaced by a summation:

$$E[X \mid A] = \sum_i x_i\, P(x_i \mid A).$$

The probability density function (pdf), if it exists, is given by

$$f(x) = \frac{dF(x)}{dx},$$

where $F(x) \triangleq F_X(x)$, since we are dealing with only a single random variable.


Properties of pdf
If f(x) exists then:

(a) $\int_{-\infty}^{\infty} f(\xi)\,d\xi = F(\infty) - F(-\infty) = 1$

(b) $F(x) = \int_{-\infty}^{x} f(\xi)\,d\xi = P(X \le x)$

(c) $F(x_2) - F(x_1) = \int_{-\infty}^{x_2} f(\xi)\,d\xi - \int_{-\infty}^{x_1} f(\xi)\,d\xi = \int_{x_1}^{x_2} f(\xi)\,d\xi = P(x_1 < X \le x_2)$.

Interpretation of f(x)

$$P(x < X \le x + \Delta x) = F(x + \Delta x) - F(x).$$

If F(x) is continuous in its first derivative then, for sufficiently small $\Delta x$,

$$F(x + \Delta x) - F(x) = \int_x^{x+\Delta x} f(\xi)\,d\xi \approx f(x)\,\Delta x.$$

Observe that if f(x) exists, then F(x) is continuous and P(X = x) = 0.
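
The quality of the approximation $F(x+\Delta x) - F(x) \approx f(x)\,\Delta x$ can be seen numerically. The sketch below assumes, purely for illustration, the bridge-load density of Example 1 on its support [0, 2]; the helper `gap` is a hypothetical name introduced here.

```python
def f(x):
    """Example 1 pdf on its support 0 <= x <= 2."""
    return 1 / 8 + 3 * x / 8


def F(x):
    """Example 1 cdf on its support 0 <= x <= 2."""
    return x / 8 + 3 * x ** 2 / 16


def gap(x, dx):
    """Exact interval probability minus the f(x)*dx approximation."""
    return (F(x + dx) - F(x)) - f(x) * dx


x = 1.0
for dx in (0.1, 0.01, 0.001):
    print(f"dx={dx}: exact={F(x + dx) - F(x):.8f}  "
          f"approx={f(x) * dx:.8f}  gap={gap(x, dx):.2e}")
```

For this particular density the gap works out to $3(\Delta x)^2/16$, so it shrinks a hundredfold each time $\Delta x$ is divided by ten.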

Some examples of Continuous probability functions are:

The univariate normal (Gaussian) pdf: The pdf is given by

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

There are two independent parameters: $\sigma$, the standard deviation ($\sigma^2$ is the variance) and $\mu$,
the mean. If a random variable X obeys the normal probability law with mean $\mu$ and standard
deviation $\sigma$, we write $X \sim N(\mu, \sigma^2)$. The normal pdf is widely encountered in all
branches of science, engineering, social and demographic studies. For example, the masses of
lecturers in a university, the intelligence quotients of children, the heights of a growing child, the
yields of agricultural produce on a farm, and the noise voltage produced by a thermally agitated
resistor are all postulated to be approximately normal over a large range of values.
Properties of the Normal Distribution

(a) The normal distribution function depends on the mean $\mu$ and the standard deviation $\sigma$.
(b) The normal distribution curve is bell-shaped.
(c) The curve is asymptotic to the x-axis.

(d) The function is continuous from $-\infty$ to $+\infty$.


(e) The curve is symmetrical about the vertical line through the mean.

Since a normal distribution function is a probability function, the total area under its curve is 1.

Statisticians have found it rather convenient to choose a normal curve with mean 0 and standard
deviation 1. Such a normal distribution curve is called a standard normal distribution curve or a
standard normal curve.
[Fig. 4: A standard normal (z) curve, centred at z = 0.]

Here the p(x) axis becomes an axis of symmetry. If the probability distribution has mean 0 and
standard deviation 1, we say that the distribution has been standardized, and z is called the
standardized score or z-score:

$$z = \frac{x - \mu}{\sigma}.$$

A table of standard normal distribution is available in most statistical tables.

When $X \sim N(\mu, \sigma^2)$, probabilities involving X are computed by "standardizing." The
standardized variable is $(X - \mu)/\sigma$. Subtracting $\mu$ shifts the mean from $\mu$ to zero, and then
dividing by $\sigma$ scales the variable so that the standard deviation is 1 rather than $\sigma$.

If X has a normal distribution with mean $\mu$ and standard deviation $\sigma$, then

$$Z = \frac{X - \mu}{\sigma}$$

has a standard normal distribution. Thus, writing $\Phi(z) = P(Z \le z)$ for the standard normal cdf,

$$P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$$

$$P(X \le a) = \Phi\left(\frac{a-\mu}{\sigma}\right), \qquad P(X \ge b) = 1 - \Phi\left(\frac{b-\mu}{\sigma}\right)$$

The standardized normal table can be used. The proposition can be proved by writing the cdf of
$Z = (X - \mu)/\sigma$ as

$$P(Z \le z) = P(X \le \sigma z + \mu) = \int_{-\infty}^{\sigma z + \mu} f(x; \mu, \sigma)\,dx$$

Using a result from calculus, this integral can be differentiated with respect to z to yield the
desired pdf f(z; 0, 1).
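[Fig. 5: Equality of nonstandard and standard normal curve areas. The area under the $N(\mu, \sigma^2)$ curve to the left of x equals the area under the N(0, 1) curve to the left of $(x - \mu)/\sigma$.]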
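
Standardizing is easy to mirror in code. The sketch below is one possible way to do it in Python, using the identity $\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$ in place of a printed table; the function names `phi` and `normal_prob` and the sample values are assumptions made for illustration.

```python
import math


def phi(z):
    """Standard normal cdf, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), computed by standardizing."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


# The same standardized interval gives the same probability,
# whatever the original mean and standard deviation were.
print(normal_prob(-1, 1, 0, 1))       # Z within one sd of 0
print(normal_prob(90, 110, 100, 10))  # X within one sd of 100
```

Both calls standardize to the interval from -1 to 1, so they print the same value (about 0.68), which is the equality of areas pictured in Fig. 5.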
Example 2

The life-length of an electronic device manufactured by Company A is normally distributed with


mean 45 and standard deviation 8, while that of a similar electronic device manufactured by
Company B has mean life-length 48 and standard deviation 4, all measurements being in hours.
Which of the electronic devices is to be preferred if it is required for:

(i) a 48 hour period;


(ii) a 52 hour period.

giving reasons for your answer.


Solution

Let $X_1$ be a random variable which is a measure of the life-length of the electronic device from
Company A and let $X_2$ be a random variable which is a measure of the life-length of the
electronic device from Company B.
We shall compare:
(i) $P(X_1 > 48)$ and $P(X_2 > 48)$

$$P(X_1 > 48) = P\left(\frac{X_1 - \mu}{\sigma} > \frac{48 - \mu}{\sigma}\right) = P\left(Z > \frac{48 - 45}{8}\right) = P(Z > 0.375) = 1 - P(Z \le 0.375)$$
= 1 – 0.6461
= 0.3539

$$P(X_2 > 48) = P\left(\frac{X_2 - \mu}{\sigma} > \frac{48 - 48}{4}\right) = P(Z_2 > 0) = 1 - P(Z_2 \le 0)$$
= 1 – 0.5000
= 0.5
Comment: The electronic device of Company B is to be preferred to the electronic device of
Company A because it has a greater chance of lasting more than 48 hours than the electronic
device of Company A.

(ii) $$P(X_1 > 52) = P\left(\frac{X_1 - 45}{8} > \frac{52 - 45}{8}\right) = P(Z_1 > 0.875) = 1 - P(Z_1 \le 0.875)$$
= 1 – 0.8092
= 0.1908

$$P(X_2 > 52) = P\left(\frac{X_2 - 48}{4} > \frac{52 - 48}{4}\right) = P(Z_2 > 1) = 1 - P(Z_2 \le 1)$$
= 1 – 0.8413
= 0.1587
Comment: The electronic device of Company A is to be preferred to the electronic device of
Company B because it has a greater chance of lasting more than 52 hours than the electronic
device of Company B.
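
The comparison in Example 2 can be reproduced without tables. This is a minimal sketch that assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the names `life_a`, `life_b` and `survival` are illustrative only.

```python
from statistics import NormalDist

life_a = NormalDist(mu=45, sigma=8)   # Company A life-length, hours
life_b = NormalDist(mu=48, sigma=4)   # Company B life-length, hours


def survival(dist, hours):
    """P(X > hours): the chance that the device outlasts the period."""
    return 1.0 - dist.cdf(hours)


for hours in (48, 52):
    p_a = survival(life_a, hours)
    p_b = survival(life_b, hours)
    preferred = "A" if p_a > p_b else "B"
    print(f"{hours} h: P(A) = {p_a:.4f}, P(B) = {p_b:.4f}, prefer {preferred}")
```

The computed probabilities can differ from the table values in the last decimal place, because the table is read at rounded z-values; the preferred company in each case is unaffected.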
Example 3
The yields, in kilogrammes, of tomatoes from 10 identical plots on a farm are:

30, 36, 39, 48, 27, 42, 39, 48, 51, 30


Calculate the (a) mean and (b) standard deviation of the yields.

Assuming that the yields are normally distributed with these values of the mean and standard
deviation, find the yield W such that the probability of the yield from a plot being greater than W
kilogrammes is 5%.
Solution

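(a) Let the mean of the distribution be $\bar{x}$ and let the standard deviation be S.

  x      $x - \bar{x}$     $(x - \bar{x})^2$
  30        -9                 81
  36        -3                  9
  39         0                  0
  48         9                 81
  27       -12                144
  42         3                  9
  39         0                  0
  48         9                 81
  51        12                144
  30        -9                 81
 Total: $\sum x = 390$,  $\sum (x - \bar{x})^2 = 630$

$$\bar{x} = \frac{\sum x}{n} = \frac{390}{10} = 39$$

$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{630}{10}} = 7.937$$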

(b) Let X be a random variable which is a measure of the yield from a plot. We are required to
find W such that P(X > W) = 0.05.

It follows that $P(X \le W) = 1 - 0.05 = 0.95$.

To find the value of Z whose cumulative probability is 0.95, we read the standard normal
distribution table in reverse. From the table, this value is 1.645, so

$$Z = \frac{W - \mu}{\sigma} = 1.645$$

$$\frac{W - 39}{7.937} = 1.645$$

$$W - 39 = 1.645 \times 7.937$$

$$W = 39 + 13.06$$

= 52.06 kg
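
Example 3 can be checked in a few lines. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; it uses `pstdev` (the divisor-n standard deviation) because that is the formula applied in the solution above. Variable names are illustrative.

```python
from statistics import NormalDist, mean, pstdev

yields = [30, 36, 39, 48, 27, 42, 39, 48, 51, 30]  # kg, from Example 3

x_bar = mean(yields)    # sample mean
s = pstdev(yields)      # standard deviation with divisor n, as in the solution

# W is the 95th percentile: P(X > W) = 0.05 means P(X <= W) = 0.95.
w = NormalDist(mu=x_bar, sigma=s).inv_cdf(0.95)

print("mean:", x_bar)
print("standard deviation:", round(s, 3))
print("W:", round(w, 2))
```

Here `inv_cdf(0.95)` plays the role of reading the normal table in reverse; it uses a slightly more precise z-value than 1.645, so W may differ from the hand calculation in the second decimal place.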

Example 5
In a very large collection of plants, it is found that 20% have heights greater than 36.3 cm and
67% have heights greater than 29.9 cm. Assuming that height x cm is normally distributed in this
collection, find the mean and standard deviation of the heights of the plants in the collection.

What is the probability that the height x cm of a plant will exceed 33.0 cm?
Solution

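Let X be a random variable which is a measure of the height of a plant.

$$P(X > 36.3) = 0.2$$
$$P(X \le 36.3) = 0.8$$

$$P\left(\frac{X - \mu}{\sigma} \le \frac{36.3 - \mu}{\sigma}\right) = 0.8$$

$$\Rightarrow P\left(0 \le Z \le \frac{36.3 - \mu}{\sigma}\right) = 0.3$$

From the table,

$$\frac{36.3 - \mu}{\sigma} = 0.845$$

$$\Rightarrow 36.3 = \mu + 0.845\sigma \qquad (1)$$

$$P(X > 29.9) = 0.67$$
$$P(X \le 29.9) = 0.33$$

$$P\left(\frac{X - \mu}{\sigma} \le \frac{29.9 - \mu}{\sigma}\right) = 0.33$$

$$\Rightarrow P\left(Z \le \frac{29.9 - \mu}{\sigma}\right) = 0.33$$

From the table,

$$\frac{29.9 - \mu}{\sigma} = -0.44$$

$$\Rightarrow 29.9 = \mu - 0.44\sigma \qquad (2)$$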

Solving Equations (1) and (2) simultaneously (subtracting (2) from (1)), we have

$$6.4 = 1.285\sigma$$
$$\Rightarrow \sigma = 4.981 \text{ cm}$$

From (1), we have

$$36.3 = \mu + 0.845\sigma$$
$$\Rightarrow \mu = 36.3 - (0.845 \times 4.981) = 36.3 - 4.2089$$

= 32.0911 cm

≈ 32.1 cm

(b) $$P(X > 33.0) = 1 - P\left(Z \le \frac{33.0 - 32.1}{4.981}\right)$$

$$= 1 - P(Z \le 0.1807)$$

= 1 – 0.5714

= 0.4286
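
The two-equation solution of Example 5 can be reproduced as follows. This is a sketch assuming Python 3.8 or later; `inv_cdf` replaces the table lookups, so the z-values (about 0.8416 and -0.4399) are slightly more precise than the 0.845 and -0.44 used above, and the results differ a little in the second decimal place.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal, mean 0 and standard deviation 1

# 20% exceed 36.3 cm  ->  P(X <= 36.3) = 0.80
# 67% exceed 29.9 cm  ->  P(X <= 29.9) = 0.33
z1 = std.inv_cdf(0.80)
z2 = std.inv_cdf(0.33)

# Equations (1) and (2): 36.3 = mu + z1*sigma and 29.9 = mu + z2*sigma.
sigma = (36.3 - 29.9) / (z1 - z2)
mu = 36.3 - z1 * sigma

p_exceeds_33 = 1.0 - NormalDist(mu, sigma).cdf(33.0)

print("sigma:", round(sigma, 3))
print("mu:", round(mu, 3))
print("P(X > 33.0):", round(p_exceeds_33, 4))
```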
The Gamma Distribution and Its Relatives: The graph of any normal pdf is bell-shaped and thus
symmetric. There are many practical situations in which the variable of interest to the
experimenter might have a skewed distribution. A family of pdf's that yields a wide variety of
skewed distributional shapes is the gamma family. For $\alpha > 0$, the gamma function $\Gamma(\alpha)$ is
defined by

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx. \qquad (3)$$

The properties of the gamma function are:

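(i) For any $\alpha > 1$, $\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$ [via integration by parts]

(ii) For any positive integer n, $\Gamma(n) = (n - 1)!$

(iii) $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$

By Equation (3), if we let

$$f(x; \alpha) = \begin{cases} \dfrac{x^{\alpha-1} e^{-x}}{\Gamma(\alpha)}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

then $f(x; \alpha) \ge 0$ and $\int_0^{\infty} f(x; \alpha)\,dx = \Gamma(\alpha)/\Gamma(\alpha) = 1$, so $f(x; \alpha)$ satisfies the two basic
properties of a pdf.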
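
The three properties of the gamma function can be spot-checked with Python's built-in `math.gamma`. The sketch below also approximates the integral in Equation (3) directly; the helper `gamma_integral`, its truncation point and its step count are assumptions chosen for illustration, and it is intended only for $\alpha \ge 1$, where the integrand is bounded near zero.

```python
import math


def gamma_integral(alpha, upper=60.0, n=100_000):
    """Midpoint-rule approximation of Equation (3), truncated at `upper`.

    Intended for alpha >= 1; the tail beyond `upper` is negligible.
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x ** (alpha - 1) * math.exp(-x)
    return h * total


# (i) recursion, (ii) factorial, (iii) value at one half
print(math.gamma(3.5), 2.5 * math.gamma(2.5))
print(math.gamma(5), math.factorial(4))
print(math.gamma(0.5), math.sqrt(math.pi))

# Direct numerical evaluation of the defining integral
print(gamma_integral(2.5), math.gamma(2.5))
```

Each printed pair should agree to many decimal places.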

The family of Gamma Distributions: A continuous random variable X is said to have a gamma
distribution if the pdf of X is

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$

where the parameters $\alpha$ and $\beta$ satisfy $\alpha > 0$, $\beta > 0$. The standard gamma distribution has
$\beta = 1$, so the pdf of a standard gamma random variable is given by (4).
[Fig. 6: (a) Gamma density curves $f(x; \alpha, \beta)$ for several $(\alpha, \beta)$ pairs; (b) standard gamma density curves $f(x; \alpha)$ for several values of $\alpha$.]

Figure 6(a) illustrates the graphs of the gamma pdf $f(x; \alpha, \beta)$ of Equation (5) for several $(\alpha, \beta)$ pairs,
whereas (b) presents graphs of the standard gamma pdf. For the standard pdf, when $\alpha \le 1$, $f(x; \alpha)$
is strictly decreasing as x increases from 0; when $\alpha > 1$, $f(x; \alpha)$ rises from 0 at x = 0 to a
maximum and then decreases. The parameter $\beta$ in (5) is called the scale parameter because
values other than 1 either stretch or compress the pdf in the x direction.

$E(X)$ and $E(X^2)$ can be obtained from a reasonably straightforward integration, and then
$V(X) = E(X^2) - [E(X)]^2$. The mean and variance of a random variable X having the gamma
distribution $f(x; \alpha, \beta)$ are

$$E(X) = \mu = \alpha\beta, \qquad V(X) = \sigma^2 = \alpha\beta^2$$

Exponential ($\lambda > 0$): The exponential pdf is a special case of the general gamma pdf in which
$\alpha = 1$ and $\beta$ has been replaced by $1/\lambda$. X is said to have an exponential distribution with
parameter $\lambda$ ($\lambda > 0$) if the pdf of X is

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

The mean and variance of X are

$$\mu = \alpha\beta = \frac{1}{\lambda}, \qquad \sigma^2 = \alpha\beta^2 = \frac{1}{\lambda^2}$$

Both the mean and standard deviation of the exponential distribution equal $1/\lambda$. The exponential
pdf can easily be integrated to give the cdf of X as:

$$F(x; \lambda) = \begin{cases} 0, & x < 0 \\ 1 - e^{-\lambda x}, & x \ge 0 \end{cases}$$

The exponential law occurs in waiting-time problems, lifetime of machinery and in describing
the intensity variations of incoherent light. Thus, the exponential distribution is frequently used
as a model for the distribution of times between occurrence of successive events, such as
customers arriving at a service facility or calls coming in to a switchboard.

Suppose that the number of events occurring in any time interval of length t has a Poisson
distribution with parameter $\alpha t$ (where $\alpha$, the rate of the event process, is the expected number of
events occurring in 1 unit of time) and that numbers of occurrences in non-overlapping intervals
are independent of one another. Then the distribution of elapsed time between the occurrence of
two successive events is exponential with parameter $\lambda = \alpha$. For example, the cdf of the time $X_1$ until
the first event occurs is given by

$$P(X_1 \le t) = 1 - P(X_1 > t) = 1 - P[\text{no events in } (0, t)] = 1 - \frac{e^{-\alpha t}(\alpha t)^0}{0!} = 1 - e^{-\alpha t}$$

which is exactly the cdf of the exponential distribution.
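
The link between the Poisson count and the exponential waiting time can be verified numerically. In the sketch below, the rate `alpha = 2.0` and the time points are arbitrary values assumed for illustration, and the function names are not from the notes.

```python
import math


def poisson_pmf(k, mean):
    """P(N = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def exponential_cdf(t, lam):
    """Exponential cdf: 1 - exp(-lam * t) for t >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-lam * t) if t >= 0 else 0.0


alpha = 2.0  # assumed rate: expected number of events per unit time
for t in (0.25, 1.0, 3.0):
    via_poisson = 1.0 - poisson_pmf(0, alpha * t)   # 1 - P(no events in (0, t))
    via_cdf = exponential_cdf(t, alpha)             # P(X1 <= t)
    print(f"t={t}: {via_poisson:.6f} {via_cdf:.6f}")
```

The two columns agree for every t, as the derivation predicts.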

Uniform ($b > a$): A continuous random variable X is said to have a uniform distribution on the
interval [a, b] if the pdf of X is

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$

The uniform pdf is used in communication theory, in queuing models and in situations where we
have no a priori knowledge favouring the distribution of outcomes except for the end points; for
example, we don't know when a business call will come, only that it must come within a known interval.
Rayleigh ($\sigma > 0$):

$$f(x) = \frac{x}{\sigma^2}\, e^{-x^2/(2\sigma^2)}\, u(x),$$

where the function u(x) is the unit step, that is, $u(x) = 1$ for $x \ge 0$ and $u(x) = 0$ for $x < 0$. Thus,
$f(x) = 0$ for x < 0. Examples of where the Rayleigh pdf shows up are in rocket-landing errors,
random fluctuations in the envelope of certain waveforms and radial distribution of misses
around the bull's eye at a rifle range.

[Fig. 2.6: The Rayleigh, exponential and uniform probability density functions (pdf's).]

The Chi-Squared Distribution: The chi-squared distribution is important because it is the basis
for a number of procedures in statistical inference. Let v be a positive integer. Then a random
variable X is said to have a chi-squared distribution with parameter v if the pdf of X is the
gamma density with $\alpha = v/2$ and $\beta = 2$. The pdf of the chi-squared random variable is

$$f(x; v) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{(v/2)-1} e^{-x/2}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

The parameter v is called the number of degrees of freedom of X. The symbol for "chi-squared" is
$\chi^2$.

The normal, gamma, exponential and uniform families provide a wide variety of probability
models for continuous variables, but there are many practical situations in which no member of
these families fits a set of observed data very well. Statisticians use other families of
continuous distributions for such situations. They are:

The Weibull Distribution: A random variable X is said to have a Weibull distribution with
parameters $\alpha$ and $\beta$ ($\alpha > 0$, $\beta > 0$) if the pdf of X is

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}}\, x^{\alpha-1} e^{-(x/\beta)^{\alpha}}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

When $\alpha = 1$, the pdf reduces to the exponential distribution (with $\lambda = 1/\beta$), so the exponential
distribution is a special case of both the gamma and Weibull distributions. However, there are
gamma distributions that are not Weibull distributions and vice versa, so one family is not a
subset of the other.

Integrating to obtain $E(X)$ and $E(X^2)$ yields

$$\mu = \beta\,\Gamma\left(1 + \frac{1}{\alpha}\right), \qquad \sigma^2 = \beta^2\left\{\Gamma\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\left(1 + \frac{1}{\alpha}\right)\right]^2\right\}$$

The computation of $\mu$ and $\sigma^2$ thus necessitates using the gamma function. The integration
$\int_0^x f(y; \alpha, \beta)\,dy$ is easily carried out to obtain the cdf of X. The cdf of a Weibull random variable
with parameters $\alpha$ and $\beta$ is

$$F(x; \alpha, \beta) = \begin{cases} 0, & x < 0 \\ 1 - e^{-(x/\beta)^{\alpha}}, & x \ge 0 \end{cases}$$

The Lognormal Distribution: A nonnegative random variable X is said to have a lognormal
distribution if the random variable Y = ln(X) has a normal distribution. The resulting pdf of a
lognormal random variable when ln(X) is normally distributed with parameters $\mu$ and $\sigma$ is

$$f(x; \mu, \sigma) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-[\ln(x) - \mu]^2/(2\sigma^2)}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

The mean and variance of X are

$$E(X) = e^{\mu + \sigma^2/2}, \qquad V(X) = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$$

A lognormal curve has a positive skew. Because ln(X) has a normal distribution, the cdf of X can
be expressed in terms of the cdf $\Phi(z)$ of a standard normal random variable Z. For $x \ge 0$,

$$F(x; \mu, \sigma) = P(X \le x) = P[\ln(X) \le \ln(x)] = P\left(Z \le \frac{\ln(x) - \mu}{\sigma}\right) = \Phi\left(\frac{\ln(x) - \mu}{\sigma}\right)$$

Example 6

The article "Reliability of Wood Joist Floor Systems with Creep" suggests that the lognormal
distribution with $\mu = 0.375$ and $\sigma = 0.25$ is a plausible model for X = the modulus of elasticity
(MOE, in $10^6$ psi) of wood joist floor systems constructed from #2 grade hem-fir. Find

(a) The expected value of MOE.


(b) The variance of MOE.

(c) The probability that MOE is between 1 and 2.


(d) What value c is such that only 1% of all systems have an MOE exceeding c?

Solution

(a) The mean value of MOE is $E(X) = e^{.375 + (.25)^2/2} = e^{.40625} = 1.50$

(b) The variance is $V(X) = e^{.8125}\left(e^{.0625} - 1\right) = 0.1453$

(c) $$P(1 \le X \le 2) = P[\ln(1) \le \ln(X) \le \ln(2)] = P(0 \le \ln(X) \le .693)$$

$$= P\left(\frac{0 - .375}{.25} \le Z \le \frac{.693 - .375}{.25}\right) = \Phi(1.27) - \Phi(-1.50) = 0.8312$$

(d) $$.99 = P(X \le c) = P\left(Z \le \frac{\ln(c) - .375}{.25}\right)$$

from which (ln(c) - .375)/.25 = 2.33 and c = 2.605

Thus, 2.605 is the 99th percentile of the MOE distribution.
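
The four answers of Example 6 can be recomputed directly from the lognormal formulas. This is a sketch assuming Python 3.8 or later (for `statistics.NormalDist`); the names are illustrative, and because it does not round z-values to two decimals the results may differ from the table-based answers in the third decimal place.

```python
import math
from statistics import NormalDist

mu, sigma = 0.375, 0.25   # parameters of ln(X), from Example 6
std = NormalDist()        # standard normal


def lognormal_cdf(x):
    """F(x) = Phi((ln x - mu) / sigma) for x > 0."""
    return std.cdf((math.log(x) - mu) / sigma)


mean_moe = math.exp(mu + sigma ** 2 / 2)                                 # (a)
var_moe = math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)     # (b)
p_1_to_2 = lognormal_cdf(2) - lognormal_cdf(1)                           # (c)
c_99 = math.exp(mu + sigma * std.inv_cdf(0.99))                          # (d)

print(round(mean_moe, 3), round(var_moe, 4), round(p_1_to_2, 4), round(c_99, 3))
```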
The Beta Distribution: All families of continuous distributions listed except for the uniform
distribution have positive density over an infinite interval (though typically the density function
decreases rapidly to zero beyond a few standard deviations from the mean). The beta distribution
provides positive density only for X in an interval of finite length.

A random variable X is said to have a beta distribution with parameters $\alpha$, $\beta$ (both positive), A
and B if the pdf of X is

$$f(x; \alpha, \beta, A, B) = \begin{cases} \dfrac{1}{B - A} \cdot \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \left(\dfrac{x - A}{B - A}\right)^{\alpha-1} \left(\dfrac{B - x}{B - A}\right)^{\beta-1}, & A \le x \le B \\ 0, & \text{otherwise} \end{cases}$$

The case A = 0, B = 1 gives the standard beta distribution. The mean and variance of X are

$$\mu = A + (B - A)\cdot\frac{\alpha}{\alpha + \beta}, \qquad \sigma^2 = \frac{(B - A)^2\,\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$$

Example 7
Suppose that in constructing a single-family house, the time X (in days) necessary for laying the
foundation has a beta distribution with A = 2, B = 5, $\alpha = 2$ and $\beta = 3$. Then $\alpha/(\alpha + \beta) = .4$, so
E(X) = 2 + (3)(.4) = 3.2. For these values of $\alpha$ and $\beta$, the pdf of X is a simple polynomial
function. The probability that it takes at most 3 days to lay the foundation is

$$P(X \le 3) = \int_2^3 \frac{1}{3} \cdot \frac{4!}{1!\,2!} \left(\frac{x - 2}{3}\right)\left(\frac{5 - x}{3}\right)^2 dx = \frac{4}{27}\int_2^3 (x - 2)(5 - x)^2\,dx = \frac{4}{27} \cdot \frac{11}{4} = \frac{11}{27} \approx 0.407$$

The standard beta distribution is commonly used to model variation in the proportion or
percentage of a quantity occurring in different samples, such as the proportion of a 24-hour day
that an individual is asleep or the proportion of a certain element in a chemical compound.
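
The integral in Example 7 can be confirmed numerically from the general beta pdf. The sketch below uses a simple midpoint rule; the helper names `beta_pdf` and `integrate` are illustrative assumptions.

```python
import math

A, B = 2, 5            # interval end points, from Example 7
alpha, beta = 2, 3     # shape parameters, from Example 7


def beta_pdf(x):
    """General beta density on [A, B]; zero outside the interval."""
    if not A <= x <= B:
        return 0.0
    const = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    u = (x - A) / (B - A)            # (B - x)/(B - A) equals 1 - u
    return const / (B - A) * u ** (alpha - 1) * (1 - u) ** (beta - 1)


def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


p_at_most_3 = integrate(beta_pdf, 2, 3)
mean_x = A + (B - A) * alpha / (alpha + beta)

print("P(X <= 3):", round(p_at_most_3, 4))   # compare with 11/27
print("E(X):", mean_x)
```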

The pdfs of all continuous distributions share the properties:

(a) $f(x) \ge 0$

(b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

When F(x) is not continuous, its finite derivative does not exist and, in the classical sense, the
pdf does not exist. If F(x) is continuous for every x and its derivative exists everywhere except at
a countable set of points, then we say that X is a continuous random variable. The expected value, if it
exists, of a real random variable X with pdf $f_X(x)$ is defined as

$$E(X) = \int_{-\infty}^{\infty} x\, f_X(x)\,dx.$$


The Normal Approximation to the Binomial Distribution: Let X be a binomial random variable
based on n trials with success probability p. Then if the binomial probability histogram is not too
skewed, X has approximately a normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$, where q = 1 - p. In particular,
for x = a possible value of X,

$$P(X \le x) = B(x; n, p) \approx \left(\text{area under the normal curve to the left of } x + .5\right) = \Phi\left(\frac{x + .5 - np}{\sqrt{npq}}\right)$$

In practice, the approximation is adequate provided that both $np \ge 10$ and $nq \ge 10$. If either
$np < 10$ or $nq < 10$, the binomial distribution is too skewed for the (symmetric) normal curve to
give an accurate approximation. For example, Fig. 2.7 displays a binomial probability histogram for the binomial
distribution with n = 20, p = .6 [so $\mu = 20(.6) = 12$ and $\sigma = \sqrt{20(.6)(.4)} = 2.19$]. A normal curve
with mean value and standard deviation equal to the corresponding values for the binomial
distribution has been superimposed on the probability histogram.

[Fig. 2.7: Binomial probability histogram for n = 20, p = .6 with the normal approximation curve ($\mu = 12$, $\sigma = 2.19$) superimposed.]
Example 8
Suppose that 10% of all steel shafts produced by a certain process are nonconforming but can be
reworked (rather than having to be scrapped). Consider a random sample of 200 shafts, and let X
denote the number among these that are nonconforming and can be reworked. What is the
(approximate) probability that X is
(a) At most 30?
(b) Less than 30?
(c) Between 15 and 25 (inclusive)?
Solution
From the data: p = .10, $\mu = np = 200(.10) = 20$ and $\sigma = \sqrt{200(.10)(.90)} = 4.24$. Since
$np = 20 \ge 10$ and $nq = 200(.9) = 180 \ge 10$, the approximation can be applied.

(a) $$P(X \le 30) = B(30; 200, .10) \approx \Phi\left(\frac{30 + .5 - 20}{4.24}\right) = \Phi(2.48) = 0.9934$$

(b) $$P(X < 30) = P(X \le 29) = B(29; 200, .10) \approx \Phi\left(\frac{29 + .5 - 20}{4.24}\right) = \Phi(2.24) = 0.9875$$

(c) $$P(15 \le X \le 25) = B(25; 200, .10) - B(14; 200, .10) \approx \Phi\left(\frac{25.5 - 20}{4.24}\right) - \Phi\left(\frac{14.5 - 20}{4.24}\right)$$

$$= \Phi(1.30) - \Phi(-1.30)$$
= 0.9032 – 0.0968
= 0.8064
The probability $P(15 \le X \le 25)$ is being approximated by the area under the normal curve
between 14.5 and 25.5; that is, the continuity correction is applied at both the upper and lower limits.
When the objective of our investigation is to make an inference about a population proportion p,
interest will focus on the sample proportion of successes X/n rather than on X itself. Because this
proportion is just X multiplied by the constant 1/n, it will also have approximately a normal
distribution (with mean $\mu = p$ and standard deviation $\sigma = \sqrt{pq/n}$), provided that both $np \ge 10$ and
$nq \ge 10$. This normal approximation is the basis for several inferential procedures.
ASSIGNMENT
1. A theoretical justification based on a certain material failure mechanism underlies the
assumption that ductile strength X of a material has a lognormal distribution. Suppose the
parameters are $\mu = 5$ and $\sigma = .1$.
(a) Compute E(X) and V(X).
(b) Compute $P(X > 125)$.
(c) Compute $P(110 \le X \le 125)$.
(d) What is the value of median ductile strength?
(e) If ten different samples of an alloy steel of this type were subjected to a strength test,
how many would you expect to have strength of at least 125?
(f) If the smallest 5% of strength values were unacceptable, what would the minimum
acceptable strength be?

2. Suppose only 40% of all drivers in a certain city regularly wear a seat belt. A random
sample of 500 drivers is selected. What is the probability that
(a) Between 180 and 230 (inclusive) of the drivers in the sample regularly wear a seat belt?
(b) Fewer than 175 of those in the sample regularly wear a seat belt? Fewer than 150?
