
4.

FUNCTIONS OF RANDOM VARIABLES


Given Y = g(X), or Y = g(X₁, X₂) = a₁X₁ + a₂X₂.
If the PDF of X is known, can the PDF of Y be obtained?

4.1 SINGLE VARIABLE CASE


Suppose Y = g(X), where g is monotonically increasing, e.g. Y = 5X², X ≥ 0.
The relationship is deterministic: if X is known exactly (e.g. X = 2),
then Y is also known exactly (e.g. Y = 20).

If X is random, then Y is random.


If P(1 < X < 2) = 0.1, then P(5 < Y < 20) = 0.1 since Y = 5X².
In general, if P(x₀ < X < x₀+dx) = p, then P(y₀ < Y < y₀+dy) = p, where y₀ = g(x₀).

For a monotonically increasing function:

[Figure: Y = g(X); the probability of occurrence within the interval (x, x+dx) is fX(x)dx, and it maps to the probability fY(y)dy of occurrence within the interval (y, y+dy).]

PDF of X = fX(x), PDF of Y = fY(y).
Probability in variable X is mapped to variable Y. Hence,

fY(y)dy = fX(x)dx

fY(y) = fX(x) (dx/dy), where x = g⁻¹(y) (inverse function)
For a monotonically decreasing function, dx/dy is negative:

[Figure: for decreasing g, the interval (y, y+dy) with probability fY(y)dy corresponds to the interval (x, x+dx) with probability fX(x)dx.]

Probability in variable X is mapped to variable Y.
But probabilities must be positive, hence

|fY(y)dy| = |fX(x)dx|

fY(y) = fX(x) |dx/dy|, where x = g⁻¹(y)

This is the change-of-variable theorem for a monotonic function
(one that is either always increasing or always decreasing).

Example 4.1
Given Y = ln X. If fX(x) is LN(λ, ζ), find fY(y).

Y = ln X  ⇒  x = e^y  ⇒  dx/dy = e^y

Recall: fX(x) = [1/(√(2π) ζx)] exp{−½[(ln x − λ)/ζ]²},  0 ≤ x,  i.e. LN(λ, ζ)

fY(y) = fX(x) |dx/dy| = [1/(√(2π) ζ e^y)] exp{−½[(ln(e^y) − λ)/ζ]²} · e^y

      = [1/(√(2π) ζ)] exp{−½[(y − λ)/ζ]²}  ~  N(λ, ζ)

(All x must be converted to y by putting x = e^y.)
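As a quick numerical check of this result, here is a Monte Carlo sketch using only the standard library (the parameter values λ = 1.0 and ζ = 0.25 are arbitrary choices, not from the notes): sampling X ~ LN(λ, ζ) and taking Y = ln X should give a sample mean near λ and a sample standard deviation near ζ.

```python
import math
import random
import statistics

random.seed(0)
lam, zeta = 1.0, 0.25   # illustrative LN(lambda, zeta) parameters

# random.lognormvariate(mu, sigma) returns exp(Z) with Z ~ N(mu, sigma),
# so it samples LN(lam, zeta) directly.
x = [random.lognormvariate(lam, zeta) for _ in range(200_000)]
y = [math.log(xi) for xi in x]          # apply Y = ln X

print(statistics.mean(y))   # should be close to lam = 1.0
print(statistics.stdev(y))  # should be close to zeta = 0.25
```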


For discrete r.v.,
if Y = g(X) and P(X = x0) = pX(x0) = p,
then P(Y = g(x0)) = p.
Hence, given pX(x), the PMF of Y is
pY(y) = pX(x) = pX[g⁻¹(y)]

Example
X = no. of functional bulldozers, Y = X²
e.g. pY(4) = pX(2) = 0.384

X    Y    P(X = xᵢ)
3    9    0.512
2    4    0.384
1    1    0.096
0    0    0.008
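The same mapping can be written as a one-line dictionary transformation (a sketch; the PMF values are the ones tabulated above):

```python
# PMF of Y = X^2 for the bulldozer example: pY(y) = pX(g^{-1}(y)).
pX = {3: 0.512, 2: 0.384, 1: 0.096, 0: 0.008}

# g(x) = x^2 is one-to-one on the support {0, 1, 2, 3}, so each y = x^2
# simply inherits the probability of its single root x.
pY = {x ** 2: p for x, p in pX.items()}

print(pY[4])   # pX(2) = 0.384
```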

4.2 MULTI-VALUE SINGLE VARIABLE FUNCTION


Given Y = g(X), the inverse X = g⁻¹(Y) can take multiple values,
e.g. Y = X²  ⇒  X = ±√Y. How to find fY(y)?

For a two-valued function, if Y = y, then the inverse is
X = x₁ or X = x₂. For the above, X = +√y or X = −√y.
Hence, P(Y = y) = P(X = x₁ or X = x₂) = P(X = x₁ ∪ X = x₂)
= P(X = x₁) + P(X = x₂)

Therefore, if Y = g(X) and X = g⁻¹(y) = x₁, x₂, x₃, …, xₙ:

pY(y) = Σ pX(xᵢ), summed over i = 1, …, n    (discrete r.v.)

fY(y) = Σ fX(xᵢ) |dxᵢ/dy|, summed over i = 1, …, n    (continuous r.v.)
Two-valued function (non-monotonic)

[Figure: the interval (y, y+dy) with probability fY(y)dy maps back to two x-intervals, with probabilities fX(x₁)dx₁ and fX(x₂)dx₂.]

The probability fY(y)dy is mapped to two regions:
fX(x₁)dx₁ and fX(x₂)dx₂
Hence, |fY(y)dy| = |fX(x₁)dx₁| + |fX(x₂)dx₂|

fY(y) = fX(x₁) |dx₁/dy| + fX(x₂) |dx₂/dy|

Multi-valued function (non-monotonic)
(Don’t worry, functions with 3 or more values are not tested.)

[Figure: the interval (y, y+dy) with probability fY(y)dy maps back to several x-intervals, with probabilities fX(x₁)dx₁, fX(x₂)dx₂ and fX(x₃)dx₃.]

The probability fY(y)dy is mapped to multiple regions. Hence,

fY(y) = Σ fX(xᵢ) |dxᵢ/dy|, summed over i = 1, …, n
Example 4.2 – Y = X², X ~ N(0,1), find fY(y).

If Y = y, then X = x₁ = +√y (first root) or X = x₂ = −√y (second root).

fY(y) = Σ fX(xᵢ) |dxᵢ/dy|    (multi-valued function, continuous r.v.)

a) x₁ = +√y  ⇒  dx₁/dy = 1/(2√y)

fY(y)₁ = [1/√(2π)] exp(−x₁²/2) · [1/(2√y)] = [1/(2√(2πy))] exp(−y/2)

b) x₂ = −√y  ⇒  dx₂/dy = −1/(2√y), so |dx₂/dy| = 1/(2√y)

fY(y)₂ = [1/√(2π)] exp(−x₂²/2) · [1/(2√y)] = [1/(2√(2πy))] exp(−y/2)

c) Combining (a) and (b) gives

fY(y) = [1/(2√(2πy))] exp(−y/2) + [1/(2√(2πy))] exp(−y/2) = [1/√(2πy)] exp(−y/2)
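As a numerical sanity check of the combined density (a sketch using only the standard library; the integration grid size is an arbitrary choice): P(Y ≤ 1) obtained by integrating fY(y) = exp(−y/2)/√(2πy) should match P(−1 ≤ X ≤ 1) = 2Φ(1) − 1 ≈ 0.6827 for X ~ N(0,1). The midpoint rule is used because fY has an integrable singularity at y = 0.

```python
import math
from statistics import NormalDist

def f_Y(y):
    # Derived density of Y = X^2 for X ~ N(0,1).
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

# P(Y <= 1) by midpoint integration (midpoint avoids evaluating at y = 0).
n = 100_000
h = 1.0 / n
p_num = sum(f_Y((i + 0.5) * h) for i in range(n)) * h

# Should equal P(-1 <= X <= 1) = 2*Phi(1) - 1 for a standard normal X.
p_exact = 2 * NormalDist().cdf(1.0) - 1
print(p_num, p_exact)   # both close to 0.6827
```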
Scaling a random variable

How does multiplying (or dividing) by a constant affect a random variable?

Consider Y = bX, where b = constant.

Both the mean and standard deviation are multiplied by b, i.e.

μY = b μX,  σY = |b| σX  (note b can be negative!)

Distribution type remains unchanged, i.e.


If X ~ normal, Y ~ normal
If X ~ lognormal, Y ~ lognormal

Coefficient of variation also remains unchanged
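A brief simulation illustrating these scaling rules (a sketch; μX = 10, σX = 2 and b = −3 are arbitrary choices, not from the notes):

```python
import random
import statistics

# Scaling check: Y = b*X multiplies the mean by b and the std dev by |b|,
# leaving the coefficient of variation unchanged.
random.seed(1)
b = -3.0
x = [random.gauss(10.0, 2.0) for _ in range(100_000)]
y = [b * xi for xi in x]

mx, sx = statistics.mean(x), statistics.stdev(x)
my, sy = statistics.mean(y), statistics.stdev(y)

print(my / mx)                      # close to b = -3
print(sy / sx)                      # close to |b| = 3
print(abs(sy / my), abs(sx / mx))   # equal coefficients of variation
```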

Covariance of two random variables

Recall that for a single random variable, the variance is

σX² = E[(X − μX)²]

For two random variables X and Y, the covariance is

cov(X, Y) = E[(X − μX)(Y − μY)]

It is convenient to normalize the covariance as follows:

ρXY = cov(X, Y) / (σX σY)

ρXY is the Pearson product-moment correlation coefficient (no units),
or simply the correlation coefficient, with −1 ≤ ρ ≤ 1.
Correlation coefficient (r = sample correlation coefficient)

[Figure: scatter plots of sample data showing negative correlation (points near a straight line of negative slope), zero correlation (uncorrelated, no trend), and positive correlation (points near a straight line of positive slope).]
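The three cases can be simulated in a few lines of standard-library Python (a sketch; the mixing weight 0.5 and the sample size are arbitrary choices):

```python
import math
import random

def sample_corr(u, v):
    # Sample correlation coefficient r = sample cov / (s_u * s_v).
    n = len(u)
    mu_u = sum(u) / n
    mu_v = sum(v) / n
    cov = sum((ui - mu_u) * (vi - mu_v) for ui, vi in zip(u, v)) / n
    su = math.sqrt(sum((ui - mu_u) ** 2 for ui in u) / n)
    sv = math.sqrt(sum((vi - mu_v) ** 2 for vi in v) / n)
    return cov / (su * sv)

random.seed(2)
x = [random.gauss(0, 1) for _ in range(50_000)]
noise = [random.gauss(0, 1) for _ in range(50_000)]
pos = [xi + 0.5 * ni for xi, ni in zip(x, noise)]   # positively correlated with x
neg = [-xi + 0.5 * ni for xi, ni in zip(x, noise)]  # negatively correlated with x

print(sample_corr(x, pos))    # close to +1/sqrt(1.25), about +0.89
print(sample_corr(x, neg))    # close to -0.89
print(sample_corr(x, noise))  # close to 0 (uncorrelated)
```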

Correlation vs Dependence (background, not tested)

X and Y are independent:
P(XY) = P(X)P(Y)    (discrete)
fXY(x, y) = fX(x) fY(y), where fXY is the joint probability density function (continuous r.v.)

X and Y are uncorrelated:

E[(X − μX)(Y − μY)] = 0

• Independent implies uncorrelated.
• However, uncorrelated variables may not necessarily be independent!
• Special case: for two jointly normal variables X and Y, uncorrelated
  implies independent (this does not apply to other distributions).
Correlation vs Dependence (background, not tested)

• Correlation is a measure of linear dependence.
• Uncorrelated implies no linear dependence, but there can be
  nonlinear dependence!

For example, take Y = X². X and Y are completely dependent
(Y is fully specified by X). However, ρXY = 0
(uncorrelated, i.e. no linear relationship):

ρXY = E[(X − μX)(X − μX)²] / (σX σY) = E[(X − μX)³] / (σX σY) = 0    (symmetric distribution)

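A quick simulation of the Y = X² case above (standard library only; the sample size is arbitrary): the sample covariance of X and Y comes out near zero even though Y is a deterministic function of X.

```python
import random

# Y = X^2 with X symmetric about 0: Y is completely determined by X, yet
# the sample covariance (and hence correlation) is close to zero.
random.seed(3)
x = [random.gauss(0, 1) for _ in range(200_000)]
y = [xi ** 2 for xi in x]

n = len(x)
mx = sum(x) / n
my = sum(y) / n
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
print(cov)   # near 0: no linear relationship, despite full dependence
```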
4.3 FUNCTION OF MULTIPLE RANDOM VARIABLES


(more advanced, so we only consider special cases)

4.3.1 Sum (or Difference) of Normal Random Variables

Consider a case of two normal variates X₁ and X₂, where

X₁ ~ N(μX₁, σX₁), X₂ ~ N(μX₂, σX₂), with correlation ρX₁X₂.

What is the distribution of Y = a₁X₁ + a₂X₂?

The distribution of Y will be N(μY, σY)
(a sum of normal variables is also normal), where

μY = E(Y) = E(a₁X₁ + a₂X₂) = a₁μX₁ + a₂μX₂ = Σ aᵢ μXᵢ    (i = 1, 2)

σY² = E[(Y − μY)²] = E[{a₁(X₁ − μX₁) + a₂(X₂ − μX₂)}²]

= E[a₁²(X₁ − μX₁)² + 2a₁a₂(X₁ − μX₁)(X₂ − μX₂) + a₂²(X₂ − μX₂)²]

= a₁²σX₁² + 2a₁a₂ E[(X₁ − μX₁)(X₂ − μX₂)] + a₂²σX₂²

= a₁²σX₁² + a₂²σX₂² + 2a₁a₂ ρX₁X₂ σX₁σX₂ = ΣᵢΣⱼ aᵢaⱼ ρXᵢXⱼ σXᵢσXⱼ

where E[(X₁ − μX₁)(X₂ − μX₂)] = cov(X₁, X₂) and

ρX₁X₂ = cov(X₁, X₂) / (σX₁ σX₂)

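The two formulas above generalize to any linear combination Y = Σ aᵢXᵢ of jointly normal variables, which can be sketched as a small helper (the function name and example numbers are illustrative, not from the notes):

```python
import math

# Linear combination Y = sum(a_i * X_i) of jointly normal variables:
#   mu_Y    = sum_i a_i * mu_i
#   sigma_Y = sqrt( sum_i sum_j a_i * a_j * rho_ij * sigma_i * sigma_j )
def linear_combo_normal(a, mu, sigma, rho):
    mu_y = sum(ai * mi for ai, mi in zip(a, mu))
    var_y = sum(a[i] * a[j] * rho[i][j] * sigma[i] * sigma[j]
                for i in range(len(a)) for j in range(len(a)))
    return mu_y, math.sqrt(var_y)

# Two-variable case with a1 = a2 = 1 and rho = 0.5 (illustrative numbers):
mu_y, sd_y = linear_combo_normal([1, 1], [3.0, 4.0], [1.0, 2.0],
                                 [[1.0, 0.5], [0.5, 1.0]])
print(mu_y, sd_y)   # 7.0 and sqrt(1 + 4 + 2*0.5*1*2) = sqrt(7)
```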
Correlation does not imply causation !!!


B causes A (reverse causation or reverse causality)
Observation: The faster that windmills are observed to rotate, the more
wind is observed.
Wrong conclusion: Wind is caused by the rotation of windmills.

Third factor causes both A and B


Observation: As ice cream sales increase, drowning deaths increase.
Wrong conclusion: Ice cream consumption causes drowning.
Actual explanation: Ice cream is sold in hot summer months, and during
summer, people are more likely to swim.

Relationship is coincidental
Observation: Russian state leaders have alternated between bald and
non-bald for 200 years.
Actual explanation: Purely coincidental.

Littlewood's law states that a person can expect to experience events with
odds of one in a million (i.e. "miracles") at a rate of about one per month.
https://en.wikipedia.org/wiki/Littlewood%27s_law
Example 4.3 – Combined load on column (dead + live + wind load)

Given S = D + L + W, with D ~ N(4.2, 0.3), L ~ N(6.5, 0.8)
and W ~ N(3.4, 0.7), with ρDL = 0.1, ρLW = 0 and ρDW = 0.
If the strength of the column is R ~ N(21.15, 3.1725), find the
probability of failure of the column. Assume R and S are uncorrelated.

The distribution of S will be N(μS, σS), where

μS = Σ aᵢ μXᵢ = μD + μL + μW = 4.2 + 6.5 + 3.4 = 14.1

σS² = ΣᵢΣⱼ aᵢaⱼ ρXᵢXⱼ σXᵢσXⱼ = σD² + σL² + σW² + 2ρDL σD σL

= 0.3² + 0.8² + 0.7² + 2(0.1)(0.3)(0.8) = 1.268  ⇒  σS = 1.126

Let X = strength − load = R − S. X will be N(μX, σX), where

μX = μR − μS = 21.15 − 14.1 = 7.05

σX² = σR² + σS² = 3.1725² + 1.126² = 11.333  ⇒  σX = 3.366

Failure occurs when the load exceeds the strength, i.e. S > R,
or R − S < 0, or equivalently X < 0:

P(X < 0) = Φ[(0 − μX)/σX] = Φ[(0 − 7.05)/3.366] = Φ(−2.094) = 0.018
Expanding the double sum for three variables:

ΣᵢΣⱼ aᵢaⱼ ρXᵢXⱼ σXᵢσXⱼ    (i, j = 1, 2, 3)

= a₁a₁ ρX₁X₁ σX₁σX₁ + a₁a₂ ρX₁X₂ σX₁σX₂ + a₁a₃ ρX₁X₃ σX₁σX₃
+ a₂a₁ ρX₂X₁ σX₂σX₁ + a₂a₂ ρX₂X₂ σX₂σX₂ + a₂a₃ ρX₂X₃ σX₂σX₃
+ a₃a₁ ρX₃X₁ σX₃σX₁ + a₃a₂ ρX₃X₂ σX₃σX₂ + a₃a₃ ρX₃X₃ σX₃σX₃

= a₁²σ₁² + a₂²σ₂² + a₃²σ₃² + 2a₁a₂ ρX₁X₂ σX₁σX₂
+ 2a₁a₃ ρX₁X₃ σX₁σX₃ + 2a₂a₃ ρX₂X₃ σX₂σX₃

Note that ρX₁X₁ = ρX₂X₂ = ρX₃X₃ = 1 and ρX₁X₂ = ρX₂X₁, etc.
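Example 4.3 and the double-sum expansion can be combined in a few lines (a sketch using only the standard library; the numbers are the ones from the example):

```python
import math
from statistics import NormalDist

# Example 4.3: S = D + L + W with the given means, std devs and correlations.
mu_S = 4.2 + 6.5 + 3.4                                  # 14.1
var_S = 0.3**2 + 0.8**2 + 0.7**2 + 2 * 0.1 * 0.3 * 0.8  # only rho_DL nonzero

# X = R - S, with R and S uncorrelated, so the variances add.
mu_X = 21.15 - mu_S                                     # 7.05
sd_X = math.sqrt(3.1725**2 + var_S)                     # about 3.366

# Probability of failure: P(X < 0) = Phi((0 - mu_X) / sd_X).
p_fail = NormalDist().cdf((0 - mu_X) / sd_X)
print(round(p_fail, 3))   # 0.018
```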

More complex example

S = 1 − 2D + 3L − 4W, with D ~ N(4.2, 0.3), L ~ N(6.5, 0.8) and
W ~ N(3.4, 0.7), with ρDL = 0.1, ρLW = −0.2 and ρDW = −0.3.

The distribution of S will be N(μS, σS), where

μS = Σ aᵢ μXᵢ = 1 − 2μD + 3μL − 4μW = 1 − 2(4.2) + 3(6.5) − 4(3.4) = ...

σS² = ΣᵢΣⱼ aᵢaⱼ ρXᵢXⱼ σXᵢσXⱼ

= 2²σD² + 3²σL² + 4²σW² − 2(2)(3) ρDL σD σL − 2(3)(4) ρLW σL σW
+ 2(2)(4) ρDW σD σW

= 4(0.3²) + 9(0.8²) + 16(0.7²) − 2(2)(3)(0.1)(0.3)(0.8)
− 2(3)(4)(−0.2)(0.8)(0.7) + 2(2)(4)(−0.3)(0.3)(0.7)

= ...
4.3.2 Product (or Quotient) of Lognormal Random Variables

Consider Y = a₀ X₁^a₁ X₂^a₂, where

X₁ ~ LN(λX₁, ζX₁), X₂ ~ LN(λX₂, ζX₂), with correlation ρX₁X₂.

ln Y = ln a₀ + a₁ ln X₁ + a₂ ln X₂
ln X₁ ~ N(λX₁, ζX₁) and ln X₂ ~ N(λX₂, ζX₂)

ln Y is a sum of normal r.v. with mean and variance:

E(ln Y) = ln a₀ + a₁ E(ln X₁) + a₂ E(ln X₂) = ln a₀ + a₁λX₁ + a₂λX₂
var(ln Y) = a₁²ζX₁² + a₂²ζX₂² + 2a₁a₂ ρ(ln X₁, ln X₂) ζX₁ζX₂

Assume ρ(ln X₁, ln X₂) ≈ ρX₁X₂.

The distribution of Y is LN(λY, ζY), where

λY = E(ln Y) = ln a₀ + Σ aᵢ λXᵢ

ζY² = var(ln Y) = ΣᵢΣⱼ aᵢaⱼ ρXᵢXⱼ ζXᵢζXⱼ

In summary: Y = a₀ X₁^a₁ X₂^a₂ is a product of lognormals. Taking logs,

ln Y = ln a₀ + a₁ ln X₁ + a₂ ln X₂

is a sum of normals. Calculate E[ln Y] = λY and stdev(ln Y) = ζY.
Example 4.4 – Settlement of footing on sand, S = PBI/M

P (applied pressure): LN, λ = −0.005, ζ = 0.1
B (footing dimension): LN, λ = 1.792, ζ = 0
I (influencing factor): LN, λ = −0.516, ζ = 0.1
M (modulus of compressibility): LN, λ = 3.455, ζ = 0.15

Assume P, B, I and M are independent LN variates;
find (a) the mean settlement, (b) P(S < 0.2).

S is a product (and quotient) of lognormals, hence S will be LN(λS, ζS), where

λS = λP + λB + λI − λM = −0.005 + 1.792 − 0.516 − 3.455 = −2.184

ζS² = ζP² + ζB² + ζI² + ζM² = 0.0425  ⇒  ζS = 0.206

(a) mean settlement, E(S) = exp(λS + 0.5 ζS²) = 0.115

(b) P(S < 0.2) = Φ[(ln 0.2 − (−2.184))/0.206] = Φ(2.789) = 0.9974

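Example 4.4 can be reproduced directly (a sketch using only the standard library; the numbers are the ones given in the example):

```python
import math
from statistics import NormalDist

# Example 4.4: S = P*B*I/M with independent lognormal P, B, I, M, so
# ln S = ln P + ln B + ln I - ln M is a sum/difference of normals.
lam_S = -0.005 + 1.792 + (-0.516) - 3.455       # lambda_S = -2.184
var_S = 0.1**2 + 0.0**2 + 0.1**2 + 0.15**2      # zeta_S^2 = 0.0425
zeta_S = math.sqrt(var_S)                       # about 0.206

mean_S = math.exp(lam_S + 0.5 * var_S)          # (a) mean settlement
p = NormalDist().cdf((math.log(0.2) - lam_S) / zeta_S)   # (b) P(S < 0.2)

print(round(mean_S, 3))   # 0.115
print(p)                  # about 0.997
```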
More complex example

S = 2P³B⁴I⁵/M⁶ (physically wrong, just for example)

Assume P and B are correlated with ρPB = 0.1,
and I and M are correlated with ρIM = −0.2.

S will be LN(λS, ζS), where

λS = ln 2 + 3λP + 4λB + 5λI − 6λM = ...

ζS² = 3²ζP² + 4²ζB² + 5²ζI² + 6²ζM²
+ 2(3)(4) ρPB ζP ζB − 2(5)(6) ρIM ζI ζM
= ...
