Functions of Random Variables


4. FUNCTIONS OF RANDOM VARIABLES


Given Y = g(X), or Y = g(X1, X2) = a1X1+a2X2
If PDF of X is known, can the PDF of Y be obtained?

4.1 SINGLE VARIABLE CASE


Suppose Y = g(X), where g is monotonically increasing, e.g. $Y = 5X^2$, $X \ge 0$.
The relationship is deterministic, i.e. if X is known exactly (e.g. X = 2),
then Y is also known exactly (e.g. Y = 20).

If X is random, then Y is random.


If P(1 < X < 2) = 0.1, then P(5 < Y < 20) = 0.1, since $Y = 5X^2$.
If $P(x_0 < X < x_0 + dx) = p$, then $P(y_0 < Y < y_0 + dy) = p$, where $y_0 = g(x_0)$.

For monotonically increasing function

[Figure: Y = g(X), monotonically increasing. An interval dx on the X axis, with probability of occurrence fX(x)dx, maps to an interval dy on the Y axis, with probability of occurrence fY(y)dy.]

PDF of X = $f_X(x)$, PDF of Y = $f_Y(y)$.
Probability in variable X is mapped to variable Y. Hence,
$f_Y(y)\,dy = f_X(x)\,dx$
$f_Y(y) = f_X(x)\,\dfrac{dx}{dy}$, where $x = g^{-1}(y)$ (inverse function)
For monotonically decreasing function, i.e. $\dfrac{dx}{dy}$ is negative

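[Figure: Y = g(X), monotonically decreasing. An interval dx on the X axis, with probability fX(x)dx, maps to an interval dy on the Y axis, with probability fY(y)dy.]
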
Probability in variable X is mapped to variable Y.
But probabilities must be positive, hence
$|f_Y(y)\,dy| = |f_X(x)\,dx|$
$f_Y(y) = f_X(x)\left|\dfrac{dx}{dy}\right|$, where $x = g^{-1}(y)$

This is the change of variable theorem for a monotonic function
(either always increasing or always decreasing).
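
The change of variable formula can be checked numerically. The sketch below is only an illustration: it assumes X ~ Exponential(1), which is not taken from these notes, and uses $Y = 5X^2$ on $x \ge 0$. The helper names (`transformed_pdf`, `integrate`) are hypothetical. It compares P(1 < X < 2) with P(5 < Y < 20), which should agree.

```python
import math

def transformed_pdf(f_x, g_inv, dg_inv_dy, y):
    """f_Y(y) = f_X(g^{-1}(y)) * |d g^{-1}(y) / dy| for a monotonic g."""
    return f_x(g_inv(y)) * abs(dg_inv_dy(y))

# Assumed input distribution (illustration only): X ~ Exponential(1), x >= 0.
def f_x(x):
    return math.exp(-x) if x >= 0 else 0.0

# Y = 5 X^2 on x >= 0, so x = sqrt(y / 5) and dx/dy = 1 / (2 sqrt(5 y)).
def g_inv(y):
    return math.sqrt(y / 5.0)

def dg_inv_dy(y):
    return 1.0 / (2.0 * math.sqrt(5.0 * y))

def integrate(func, a, b, n=20000):
    """Midpoint-rule integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# P(1 < X < 2) computed from f_X
p_x = integrate(f_x, 1.0, 2.0)
# P(5 < Y < 20) computed from the transformed density f_Y
p_y = integrate(lambda y: transformed_pdf(f_x, g_inv, dg_inv_dy, y), 5.0, 20.0)

print(p_x, p_y)  # the two probabilities should match
```

The absolute value in `transformed_pdf` means the same helper also covers a monotonically decreasing g.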

Example 4.1
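Given Y = ln X. If $f_X(x)$ is $LN(\lambda, \zeta)$, find $f_Y(y)$.

$Y = \ln X \Rightarrow X = e^{Y} \Rightarrow \dfrac{dx}{dy} = e^{y} = x$

Recall: $f_X(x) = \dfrac{1}{x\zeta\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{\ln x - \lambda}{\zeta}\right)^2\right] \sim LN(\lambda, \zeta)$, $x > 0$

$f_Y(y) = f_X(x)\left|\dfrac{dx}{dy}\right| = \dfrac{1}{x\zeta\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{\ln(e^{y}) - \lambda}{\zeta}\right)^2\right] \cdot x$

$= \dfrac{1}{\zeta\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{y - \lambda}{\zeta}\right)^2\right] \sim N(\lambda, \zeta)$

(All x must be converted to y by putting $x = e^{y}$)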
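
As a quick numerical check of Example 4.1, the sketch below applies the change of variable formula to a lognormal density and compares the result with the normal density. The parameter values ($\lambda = 1.0$, $\zeta = 0.5$) and the evaluation points are assumptions chosen for illustration only.

```python
import math
from statistics import NormalDist

# Assumed illustrative parameters (not from the notes)
lam, zeta = 1.0, 0.5

def lognormal_pdf(x, lam, zeta):
    """PDF of LN(lam, zeta) for x > 0."""
    z = (math.log(x) - lam) / zeta
    return math.exp(-0.5 * z * z) / (x * zeta * math.sqrt(2.0 * math.pi))

def f_y(y, lam, zeta):
    """f_Y(y) = f_X(x) |dx/dy| with x = e^y and dx/dy = e^y."""
    x = math.exp(y)
    dx_dy = math.exp(y)
    return lognormal_pdf(x, lam, zeta) * abs(dx_dy)

normal = NormalDist(mu=lam, sigma=zeta)

# Largest difference between the transformed density and N(lam, zeta)
max_diff = max(abs(f_y(y, lam, zeta) - normal.pdf(y))
               for y in [-1.0, 0.0, 0.5, 1.0, 2.5])
print(max_diff)  # should be at rounding-error level
```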


For discrete r.v.,
if Y = g(X) and P(X = x0) = pX(x0) = p,
then P(Y = g(x0)) = p.
Hence, given $p_X(x)$, the PMF of Y is
$p_Y(y) = p_X(x) = p_X[g^{-1}(y)]$

Example
X = no. of functional bulldozers
$Y = X^2$

X    Y    P(X = x_i)
3    9    0.512
2    4    0.384
1    1    0.096
0    0    0.008

e.g. $p_Y(4) = p_X(2) = 0.384$
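
The discrete mapping can be sketched in a few lines. The function name `pmf_of_function` is hypothetical; the PMF values are those of the bulldozer example. Probabilities are accumulated per value of y, so the same helper would still work if several x values mapped to the same y.

```python
from collections import defaultdict

def pmf_of_function(p_x, g):
    """Map a PMF of X (dict x -> probability) to the PMF of Y = g(X)."""
    p_y = defaultdict(float)
    for x, p in p_x.items():
        p_y[g(x)] += p  # add up the probability of every x that gives this y
    return dict(p_y)

# PMF of X = number of functional bulldozers (from the example)
p_x = {0: 0.008, 1: 0.096, 2: 0.384, 3: 0.512}

p_y = pmf_of_function(p_x, lambda x: x ** 2)
print(p_y)
```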

4.2 MULTI-VALUE SINGLE VARIABLE FUNCTION


Given Y = g(X), the inverse $X = g^{-1}(Y)$ can take multiple values,
e.g. $Y = X^2 \Rightarrow X = \pm\sqrt{Y}$. How to find $f_Y(y)$?

For a two-valued function, if Y = y, then the inverse is
$X = x_1$ or $X = x_2$. For the above, $X = +\sqrt{y}$ or $X = -\sqrt{y}$.
Hence, $P(Y = y) = P(X = x_1 \text{ or } X = x_2) = P(X = x_1 \cup X = x_2)$
$= P(X = x_1) + P(X = x_2)$

Therefore, if $Y = g(X)$ and $X = g^{-1}(y) = x_1, x_2, x_3, \ldots, x_n$,
$p_Y(y) = \sum_{i=1}^{n} p_X(x_i)$ for discrete r.v.
$f_Y(y) = \sum_{i=1}^{n} f_X(x_i)\left|\dfrac{dx_i}{dy}\right|$ for continuous r.v.
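Two-valued function (non-monotonic)

[Figure: non-monotonic Y = g(X). An interval dy on the Y axis, with probability fY(y)dy, corresponds to two intervals dx1 and dx2 on the X axis, with probabilities fX(x1)dx1 and fX(x2)dx2.]
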
Probability $f_Y(y)\,dy$ is mapped to two regions:
$f_X(x_1)\,dx_1$ and $f_X(x_2)\,dx_2$
Hence, $|f_Y(y)\,dy| = |f_X(x_1)\,dx_1| + |f_X(x_2)\,dx_2|$

$f_Y(y) = f_X(x_1)\left|\dfrac{dx_1}{dy}\right| + f_X(x_2)\left|\dfrac{dx_2}{dy}\right|$

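Multi-valued function (non-monotonic)
(Don't worry, functions with 3 or more values are not tested.)

[Figure: non-monotonic Y = g(X). An interval dy on the Y axis, with probability fY(y)dy, corresponds to three intervals dx1, dx2 and dx3 on the X axis, with probabilities fX(x1)dx1, fX(x2)dx2 and fX(x3)dx3.]
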

Probability $f_Y(y)\,dy$ is mapped to multiple regions. Hence,

$f_Y(y) = \sum_{i=1}^{n} f_X(x_i)\left|\dfrac{dx_i}{dy}\right|$

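Example 4.2 - $Y = X^2$, X ~ N(0,1), find $f_Y(y)$.

If Y = y, then $X = x_1 = +\sqrt{y}$ (first root)
or $X = x_2 = -\sqrt{y}$ (second root)

$f_Y(y) = \sum_{i=1}^{n} f_X(x_i)\left|\dfrac{dx_i}{dy}\right|$ for multi-valued functions and continuous r.v.

a) $x_1 = +\sqrt{y} \Rightarrow \dfrac{dx_1}{dy} = \dfrac{1}{2\sqrt{y}}$

$f_X(x_1)\left|\dfrac{dx_1}{dy}\right| = \dfrac{1}{\sqrt{2\pi}}\exp\left(-\dfrac{x_1^2}{2}\right)\dfrac{1}{2\sqrt{y}} = \dfrac{1}{2\sqrt{2\pi y}}\exp\left(-\dfrac{y}{2}\right)$

b) $x_2 = -\sqrt{y} \Rightarrow \dfrac{dx_2}{dy} = -\dfrac{1}{2\sqrt{y}}$

$f_X(x_2)\left|\dfrac{dx_2}{dy}\right| = \dfrac{1}{\sqrt{2\pi}}\exp\left(-\dfrac{x_2^2}{2}\right)\left|-\dfrac{1}{2\sqrt{y}}\right| = \dfrac{1}{2\sqrt{2\pi y}}\exp\left(-\dfrac{y}{2}\right)$

c) Combining (a) and (b) gives

$f_Y(y) = \dfrac{1}{2\sqrt{2\pi y}}\exp\left(-\dfrac{y}{2}\right) + \dfrac{1}{2\sqrt{2\pi y}}\exp\left(-\dfrac{y}{2}\right) = \dfrac{1}{\sqrt{2\pi y}}\exp\left(-\dfrac{y}{2}\right)$, $y > 0$
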
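
The result of Example 4.2 can be checked numerically. The sketch below sums the contributions of both roots, compares the sum with the closed form $\exp(-y/2)/\sqrt{2\pi y}$, and also compares it with the slope of $P(Y \le y) = P(-\sqrt{y} \le X \le \sqrt{y})$. The evaluation point y0 = 1.5 and the step h are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist

std_normal = NormalDist(0.0, 1.0)  # X ~ N(0, 1)

def f_y(y):
    """Sum of f_X(x_i) |dx_i/dy| over both roots x = +sqrt(y) and x = -sqrt(y)."""
    root = math.sqrt(y)
    jacobian = 1.0 / (2.0 * root)  # |dx_i/dy| is the same for both roots
    return std_normal.pdf(root) * jacobian + std_normal.pdf(-root) * jacobian

def closed_form(y):
    """Result (c) of the example: exp(-y/2) / sqrt(2 pi y)."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)

def cdf_y(y):
    """P(Y <= y) = P(-sqrt(y) <= X <= sqrt(y))."""
    r = math.sqrt(y)
    return std_normal.cdf(r) - std_normal.cdf(-r)

y0, h = 1.5, 1e-5
# Central-difference estimate of the PDF from the CDF
slope = (cdf_y(y0 + h) - cdf_y(y0 - h)) / (2.0 * h)

print(f_y(y0), closed_form(y0), slope)  # all three should agree closely
```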
Scaling a random variable

How does multiplying (or dividing) by a constant affect a random variable?

Consider Y = bX, where b = constant

The mean is multiplied by b and the standard deviation by |b|, i.e.

$\mu_Y = b\mu_X$, $\sigma_Y = |b|\sigma_X$ (note b can be negative!)

Distribution type remains unchanged, i.e.

If X ~ normal, Y ~ normal
If X ~ lognormal, Y ~ lognormal (for b > 0)

Coefficient of variation also remains unchanged (in magnitude)
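
A small numerical illustration of scaling, using an assumed data set and an assumed negative constant b = -3 (neither is from the notes). The data are treated as a full population, hence `pstdev`.

```python
import statistics

# Assumed illustrative data, treated as the whole population
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
b = -3.0  # assumed scaling constant; negative on purpose
y = [b * xi for xi in x]

mean_x, sd_x = statistics.mean(x), statistics.pstdev(x)
mean_y, sd_y = statistics.mean(y), statistics.pstdev(y)

# Coefficient of variation, taken in magnitude
cov_x = sd_x / abs(mean_x)
cov_y = sd_y / abs(mean_y)

print(mean_x, sd_x, cov_x)
print(mean_y, sd_y, cov_y)
```

The mean picks up the sign of b, the standard deviation does not, and the coefficient of variation is the same for X and Y.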

Covariance of two random variables


Recall that for a single random variable, the variance is
$\sigma_X^2 = E[(X - \mu_X)^2]$
For two random variables X and Y, the covariance is

$\mathrm{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$

It is convenient to normalize the covariance as follows:

$\rho_{XY} = \dfrac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$ = Pearson product-moment correlation coefficient (no units), or simply correlation coefficient

$-1 \le \rho_{XY} \le 1$
Correlation coefficient
r = sample correlation coefficient

[Figure: scatter plots showing negative correlation, zero correlation (uncorrelated) and positive correlation; the negative and positive panels each include a straight-line case.]

Correlation vs Dependence (background, not tested)

X and Y are independent:
P(XY) = P(X)P(Y) (discrete)
$f_{XY}(x, y) = f_X(x) f_Y(y)$ (continuous), where $f_{XY}$ is the joint probability density function
(background only, not tested)

X and Y are uncorrelated:
$E[(X - \mu_X)(Y - \mu_Y)] = 0$

• Independent implies uncorrelated
• However, uncorrelated variables may not necessarily
be independent!!
• Special case: for two jointly normal variables X and Y,
uncorrelated implies independence. (does not apply to
other distributions)
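• Correlation is a measure of linear dependence
• Uncorrelated implies no linear dependence, but there can be nonlinear dependence!

For example, $Y = X^2$ with X symmetric about $\mu_X = 0$:
X and Y are completely dependent (Y is fully specified by X).

[Figure: plot of Y = X^2 against X]

However, $\rho_{XY} = 0$ (uncorrelated, i.e. no linear relationship):

$\rho_{XY} = \dfrac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y} = \dfrac{E[(X - \mu_X)(X - \mu_X)^2]}{\sigma_X \sigma_Y} = \dfrac{E[(X - \mu_X)^3]}{\sigma_X \sigma_Y} = 0$ (symmetric)

The $\mu_Y$ term drops out because $E[X - \mu_X] = 0$, and $Y = X^2 = (X - \mu_X)^2$ when $\mu_X = 0$.
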
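
A minimal sketch of "dependent but uncorrelated", using an assumed discrete X that takes the values -2, -1, 0, 1, 2 with equal probability (an illustration, not from the notes) and $Y = X^2$. The helper `covariance` is a hypothetical name for the population covariance.

```python
import statistics

# Assumed equally likely values of X, symmetric about zero
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 for xi in x]  # Y is fully determined by X

def covariance(u, v):
    """Population covariance E[(U - mu_U)(V - mu_V)] for equally likely pairs."""
    mu_u, mu_v = statistics.mean(u), statistics.mean(v)
    return sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / len(u)

cov_xy = covariance(x, y)
rho_xy = cov_xy / (statistics.pstdev(x) * statistics.pstdev(y))

print(cov_xy, rho_xy)  # both zero even though Y depends completely on X
```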
4.3 FUNCTION OF MULTIPLE RANDOM VARIABLES


(more advanced, so we only consider special cases)

4.3.1 Sum (or Difference) of Normal Random Variables

Consider a case of two normal variates X1 and X2, where

$X_1 \sim N(\mu_{X_1}, \sigma_{X_1})$, $X_2 \sim N(\mu_{X_2}, \sigma_{X_2})$ with correlation $\rho_{X_1X_2}$

What is the distribution of $Y = a_1X_1 + a_2X_2$?

The distribution of Y will be $N(\mu_Y, \sigma_Y)$
(sum of normal variables is also normal)


The parameters of $N(\mu_Y, \sigma_Y)$ are obtained as follows:

$\mu_Y = E(Y) = E(a_1X_1 + a_2X_2) = a_1\mu_{X_1} + a_2\mu_{X_2} = \sum_{i=1}^{2} a_i\mu_{X_i}$

$\sigma_Y^2 = E[(Y - \mu_Y)^2] = E[\{a_1(X_1 - \mu_{X_1}) + a_2(X_2 - \mu_{X_2})\}^2]$

$= E[a_1^2(X_1 - \mu_{X_1})^2 + 2a_1a_2(X_1 - \mu_{X_1})(X_2 - \mu_{X_2}) + a_2^2(X_2 - \mu_{X_2})^2]$

$= a_1^2\sigma_{X_1}^2 + 2a_1a_2E[(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})] + a_2^2\sigma_{X_2}^2$

$= a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + 2a_1a_2\rho_{X_1X_2}\sigma_{X_1}\sigma_{X_2} = \sum_{i=1}^{2}\sum_{j=1}^{2} a_ia_j\rho_{X_iX_j}\sigma_{X_i}\sigma_{X_j}$

using

$E[(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})] = \mathrm{cov}(X_1, X_2)$ and $\rho_{X_1X_2} = \dfrac{\mathrm{cov}(X_1, X_2)}{\sigma_{X_1}\sigma_{X_2}}$
Correlation does not imply causation !!!


B causes A (reverse causation or reverse causality)
Observation: The faster that windmills are observed to rotate, the more
wind is observed.
Wrong conclusion: Wind is caused by the rotation of windmills.

Third factor causes both A and B


Observation: As ice cream sales increase, drowning deaths increase.
Wrong conclusion: Ice cream consumption causes drowning.
Actual explanation: Ice-cream is sold in hot summer months, and during
summer, people are more likely to swim.

Relationship is coincidental
Observation: Russian state leaders alternate from bald to non-bald for 200
years
Actual explanation: Purely coincidental

Littlewood's law states that a person can expect to experience events with
odds of one in a million (i.e. a miracle) at the rate of about one per month.
https://en.wikipedia.org/wiki/Littlewood%27s_law
Example 4.3 - Combined load on column

[Figure: column carrying dead + live + wind load]

Given S = D + L + W, D ~ N(4.2, 0.3), L ~ N(6.5, 0.8)
and W ~ N(3.4, 0.7), with $\rho_{DL} = 0.1$, $\rho_{LW} = 0$ and
$\rho_{DW} = 0$. If the strength of the column R ~ N(21.15, 3.1725),
find the probability of failure of the column.
Assume R and S are uncorrelated.

The distribution of S will be $N(\mu_S, \sigma_S)$, where

$\mu_S = \sum_{i=1}^{3} a_i\mu_{X_i} = \mu_D + \mu_L + \mu_W = 4.2 + 6.5 + 3.4 = 14.1$

$\sigma_S^2 = \sum_{i=1}^{3}\sum_{j=1}^{3} a_ia_j\rho_{X_iX_j}\sigma_{X_i}\sigma_{X_j} = \sigma_D^2 + \sigma_L^2 + \sigma_W^2 + 2\rho_{DL}\sigma_D\sigma_L$

$= 0.3^2 + 0.8^2 + 0.7^2 + 2 \times 0.1 \times 0.3 \times 0.8 = 1.268 \Rightarrow \sigma_S = 1.126$


Let X = strength - load = R - S. X will be $N(\mu_X, \sigma_X)$

$\mu_X = \mu_R - \mu_S = 21.15 - 14.1 = 7.05$

$\sigma_X^2 = \sigma_R^2 + \sigma_S^2 = 3.1725^2 + 1.268 = 11.333 \Rightarrow \sigma_X = 3.366$

Failure occurs when load > strength, i.e. S > R,
or R - S < 0, or equivalently X < 0.

$P(X < 0) = P\left(\dfrac{X - \mu_X}{\sigma_X} < \dfrac{0 - 7.05}{3.366}\right) = \Phi(-2.094) = 0.018$
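
The arithmetic of Example 4.3 can be reproduced with a short script. This is only a sketch of the same steps; the variable names are arbitrary, and $\Phi$ is evaluated with `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

# Load components (mean, standard deviation) from the example
mu_d, sd_d = 4.2, 0.3   # dead load D
mu_l, sd_l = 6.5, 0.8   # live load L
mu_w, sd_w = 3.4, 0.7   # wind load W
rho_dl = 0.1            # only D and L are correlated

# Total load S = D + L + W
mu_s = mu_d + mu_l + mu_w
var_s = sd_d**2 + sd_l**2 + sd_w**2 + 2 * rho_dl * sd_d * sd_l

# Strength R, uncorrelated with S
mu_r, sd_r = 21.15, 3.1725

# Safety margin X = R - S; failure when X < 0
mu_x = mu_r - mu_s
sd_x = math.sqrt(sd_r**2 + var_s)  # variances add, not standard deviations
p_fail = NormalDist(mu_x, sd_x).cdf(0.0)

print(mu_s, var_s, sd_x, p_fail)
```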
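For three variables, the double sum expands as

$\sum_{i=1}^{3}\sum_{j=1}^{3} a_ia_j\rho_{X_iX_j}\sigma_{X_i}\sigma_{X_j}$

$= a_1a_1\rho_{X_1X_1}\sigma_{X_1}\sigma_{X_1} + a_1a_2\rho_{X_1X_2}\sigma_{X_1}\sigma_{X_2} + a_1a_3\rho_{X_1X_3}\sigma_{X_1}\sigma_{X_3}$

$+ a_2a_1\rho_{X_2X_1}\sigma_{X_2}\sigma_{X_1} + a_2a_2\rho_{X_2X_2}\sigma_{X_2}\sigma_{X_2} + a_2a_3\rho_{X_2X_3}\sigma_{X_2}\sigma_{X_3}$

$+ a_3a_1\rho_{X_3X_1}\sigma_{X_3}\sigma_{X_1} + a_3a_2\rho_{X_3X_2}\sigma_{X_3}\sigma_{X_2} + a_3a_3\rho_{X_3X_3}\sigma_{X_3}\sigma_{X_3}$

$= a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + a_3^2\sigma_{X_3}^2 + 2a_1a_2\rho_{X_1X_2}\sigma_{X_1}\sigma_{X_2}$

$+ 2a_1a_3\rho_{X_1X_3}\sigma_{X_1}\sigma_{X_3} + 2a_2a_3\rho_{X_2X_3}\sigma_{X_2}\sigma_{X_3}$

Note that $\rho_{X_1X_1} = \rho_{X_2X_2} = \rho_{X_3X_3} = 1$, $\rho_{X_1X_2} = \rho_{X_2X_1}$, etc.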

More complex example.


S = 1 - 2D + 3L - 4W, D ~ N(4.2, 0.3), L ~ N(6.5, 0.8) and
W ~ N(3.4, 0.7), with $\rho_{DL} = 0.1$, $\rho_{LW} = -0.2$ and $\rho_{DW} = -0.3$.

The distribution of S will be $N(\mu_S, \sigma_S)$, where

$\mu_S = a_0 + \sum_{i=1}^{3} a_i\mu_{X_i} = 1 - 2\mu_D + 3\mu_L - 4\mu_W = 1 - 2(4.2) + 3(6.5) - 4(3.4) = \ldots$

$\sigma_S^2 = \sum_{i=1}^{3}\sum_{j=1}^{3} a_ia_j\rho_{X_iX_j}\sigma_{X_i}\sigma_{X_j}$

$= (-2)^2\sigma_D^2 + 3^2\sigma_L^2 + (-4)^2\sigma_W^2 + 2(-2)(3)\rho_{DL}\sigma_D\sigma_L + 2(3)(-4)\rho_{LW}\sigma_L\sigma_W + 2(-2)(-4)\rho_{DW}\sigma_D\sigma_W$

$= 4(0.3^2) + 9(0.8^2) + 16(0.7^2) + 2(-2)(3)(0.1)(0.3)(0.8) + 2(3)(-4)(-0.2)(0.8)(0.7) + 2(-2)(-4)(-0.3)(0.3)(0.7)$

$= \ldots$

The constant term 1 shifts the mean only; it does not affect the variance.
4.3.2 Product (or quotient) of Lognormal Random Variables
Consider $Y = a_0X_1^{a_1}X_2^{a_2}$, where

$X_1 \sim LN(\lambda_{X_1}, \zeta_{X_1})$, $X_2 \sim LN(\lambda_{X_2}, \zeta_{X_2})$ with correlation $\rho_{X_1X_2}$

$\ln Y = \ln a_0 + a_1\ln X_1 + a_2\ln X_2$
$\ln X_1 \sim N(\lambda_{X_1}, \zeta_{X_1})$ and $\ln X_2 \sim N(\lambda_{X_2}, \zeta_{X_2})$

ln Y is the sum of normal r.v. with mean and variance:

$E(\ln Y) = \ln a_0 + a_1E(\ln X_1) + a_2E(\ln X_2) = \ln a_0 + a_1\lambda_{X_1} + a_2\lambda_{X_2}$
$\mathrm{var}(\ln Y) = a_1^2\zeta_{X_1}^2 + a_2^2\zeta_{X_2}^2 + 2a_1a_2\rho_{\ln X_1 \ln X_2}\zeta_{X_1}\zeta_{X_2}$
Assume $\rho_{\ln X_1 \ln X_2} \approx \rho_{X_1X_2}$.

The distribution of Y is $LN(\lambda_Y, \zeta_Y)$, where

$\lambda_Y = E(\ln Y) = \ln a_0 + \sum_{i=1}^{2} a_i\lambda_{X_i}$ and $\zeta_Y^2 = \mathrm{var}(\ln Y) = \sum_{i=1}^{2}\sum_{j=1}^{2} a_ia_j\rho_{X_iX_j}\zeta_{X_i}\zeta_{X_j}$

Summary:

$Y = a_0X_1^{a_1}X_2^{a_2}$ (X1 and X2 are lognormal)
Take log:
$\ln Y = \ln a_0 + a_1\ln X_1 + a_2\ln X_2$ (ln Y, ln X1 and ln X2 are normal)

Calculate: $E[\ln Y] = \lambda_Y$
$\mathrm{Stdev}(\ln Y) = \zeta_Y$
Example 4.4 - Settlement of footing, S
Settlement of footing on sand, S = PBI/M
P (applied pressure): LN, $\lambda = -0.005$, $\zeta = 0.1$
B (footing dimension): LN, $\lambda = 1.792$, $\zeta = 0$
I (influencing factor): LN, $\lambda = -0.516$, $\zeta = 0.1$
M (modulus of compressibility): LN, $\lambda = 3.455$, $\zeta = 0.15$
Assume P, B, I, and M are independent LN variates,
find (a) mean settlement (b) P(S < 0.2)

Product of lognormals, hence S will be $LN(\lambda_S, \zeta_S)$, where
$\lambda_S = \lambda_P + \lambda_B + \lambda_I - \lambda_M = -2.184$
$\zeta_S^2 = \zeta_P^2 + \zeta_B^2 + \zeta_I^2 + \zeta_M^2 = 0.0425 \Rightarrow \zeta_S = 0.206$
(a) mean settlement, $\mu_S = \exp(\lambda_S + 0.5\zeta_S^2) = 0.115$
(b) $P(S < 0.2) = \Phi\left(\dfrac{\ln 0.2 - (-2.184)}{0.206}\right) = \Phi(2.789) = 0.9974$
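
The steps of Example 4.4 can be reproduced with the sketch below. The dictionary layout and variable names are arbitrary choices; the variates are independent, so there are no cross terms in the variance of ln S. Because the script carries $\zeta_S$ unrounded, the argument of $\Phi$ may differ slightly in the third decimal from the hand calculation.

```python
import math
from statistics import NormalDist

# (exponent a_i, lambda_i, zeta_i) for each variate in S = P * B * I / M
terms = {
    "P": (1, -0.005, 0.10),
    "B": (1, 1.792, 0.00),
    "I": (1, -0.516, 0.10),
    "M": (-1, 3.455, 0.15),  # division by M gives exponent -1
}

# Parameters of ln S (independent variates: no correlation terms)
lam_s = sum(a * lam for a, lam, _ in terms.values())
var_ln_s = sum((a * zeta) ** 2 for a, _, zeta in terms.values())
zeta_s = math.sqrt(var_ln_s)

# (a) mean of the lognormal variate S
mean_s = math.exp(lam_s + 0.5 * var_ln_s)

# (b) P(S < 0.2) = Phi((ln 0.2 - lambda_S) / zeta_S)
p_less = NormalDist().cdf((math.log(0.2) - lam_s) / zeta_s)

print(lam_s, zeta_s, mean_s, p_less)
```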

More complex example

$S = 2P^3B^4I^5/M^6$ (Physically wrong, just for example)

Assume P and B are correlated with $\rho_{PB} = 0.1$
I and M are correlated with $\rho_{IM} = -0.2$

S will be $LN(\lambda_S, \zeta_S)$, where

$\lambda_S = \ln 2 + 3\lambda_P + 4\lambda_B + 5\lambda_I - 6\lambda_M = \ldots$
$\zeta_S^2 = 3^2\zeta_P^2 + 4^2\zeta_B^2 + 5^2\zeta_I^2 + (-6)^2\zeta_M^2$
$+ 2(3)(4)\rho_{PB}\zeta_P\zeta_B + 2(5)(-6)\rho_{IM}\zeta_I\zeta_M$
$= \ldots$
