
Lecture 6: Random Variable Math

Data Analysis MA2224

2024-02-12

Transforming Random Variables

Functions of Random Variables

A random variable Y is used as a model for a random outcome from some single situation. But often we care about
some alteration of that random variable.
For example, X could be the number of three-pointers an NBA player makes in a game, but maybe what we care about
is the number of points they get from threes in a game, which would be Y = 3X. This is a function or transformation
of the original random variable, Y = f(X).
Or suppose you want to know how long it takes you to drive to a destination 100 miles from Tandon. Let T be the
random time in hours it takes for a single trip.
Now we pick a model for this situation, so let’s choose

$$f(t) = \begin{cases} \dfrac{2}{t^3} & t > 1 \\[4pt] 0 & \text{else} \end{cases}$$

Using this model we can determine the probability of having to travel certain amounts of time or our “expected” travel
time.
CDF:

$$F(t) = \int_1^t \frac{2}{x^3}\,dx = \left[-\frac{1}{x^2}\right]_1^t = 1 - \frac{1}{t^2}$$

90th Percentile Trip time

$$F(t) = 0.9 \;\rightarrow\; 1 - \frac{1}{t^2} = 0.9 \;\rightarrow\; t^2 = 10 \;\rightarrow\; t = \sqrt{10} \approx 3.16 \text{ hrs}$$
Average trip time


$$E(T) = \int_1^\infty t \cdot \frac{2}{t^3}\,dt = \int_1^\infty \frac{2}{t^2}\,dt = \left[-\frac{2}{t}\right]_1^\infty = 0 - (-2) = 2 \text{ hrs}$$
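A quick way to sanity-check these numbers (my addition, not part of the notes) is inverse-transform sampling: solving $F(t) = u$ for $t$ gives $t = 1/\sqrt{1-u}$, so uniform draws can be turned into draws of T. A minimal Python sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Inverse-transform sampling: solving F(t) = 1 - 1/t^2 = u gives t = 1/sqrt(1 - u)
u = rng.uniform(size=1_000_000)
t = 1.0 / np.sqrt(1.0 - u)

print(t.mean())             # close to E(T) = 2 (converges slowly: T has infinite variance)
print(np.quantile(t, 0.9))  # close to sqrt(10) = 3.16, the 90th percentile
```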
Finding a Density Function (NOT TESTED)

[Figure: the travel-time density f(t) over time (hrs), with E(T) = 2 hrs and the 90th percentile ≈ 3.16 hrs marked.]

Finding a Density Function (NOT TESTED)

However, what if you are pretty confident in the probability model for your travel time (maybe through repeated trips)
but you actually care about the average velocity you travel during this trip? (recall distance = velocity × time)

$$V = \frac{100}{T}$$
This says that average velocity is a transformation of the time random variable. What can we learn about this?
For example, since E(T) = 2, does that mean E(V) = 100/2 = 50? (The answer is NO.)
Well, since V is a random variable, it has a pdf. Let's try to find it:

$$V = \frac{100}{T}$$

What would f(v) be? One method we can use is to find the CDF of V first (THIS IS NOT TESTED):

$$F_V(v) = P(V \le v) = P\left(\frac{100}{T} \le v\right) = P\left(T \ge \frac{100}{v}\right) = 1 - P\left(T < \frac{100}{v}\right)$$

$$F_V(v) = 1 - F_T\left(\frac{100}{v}\right) = 1 - \left(1 - \frac{1}{(100/v)^2}\right) = \frac{v^2}{10000}$$

Now using the fact that the pdf is the derivative of the cdf we get:

$$f(v) = \frac{dF_V}{dv} = \frac{v}{5000}, \qquad 0 < v \le 100$$
And finally let’s find the expected average velocity:

$$E(V) = \int_0^{100} v \cdot \frac{v}{5000}\,dv = \left[\frac{v^3}{15000}\right]_0^{100} = \frac{100^3}{15000} = 66.67 \text{ mph}$$
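If you want to convince yourself of the derived CDF (again not tested, and this sketch is my addition), simulate V = 100/T and compare the empirical CDF against $v^2/10000$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate V = 100/T, with T drawn by inverse transform from F(t) = 1 - 1/t^2
u = rng.uniform(size=1_000_000)
v = 100.0 * np.sqrt(1.0 - u)  # same as 100 / (1/sqrt(1 - u))

# Empirical CDF vs. the derived F_V(v) = v^2/10000 at a few points
for point in (25.0, 50.0, 75.0):
    print(point, np.mean(v <= point), point**2 / 10000)  # the two columns nearly match
```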
The Difficulty

[Figure: the velocity density f(v) = v/5000 over 0 < v ≤ 100 mph, with E(V) ≈ 66.67 mph marked.]

The Difficulty

The main takeaway here is that this process is not fun, and you should be happy you don't have to learn it.
Things get worse as the pdfs get messier. Other techniques for finding these pdfs exist, but you don't have to worry
about those either.
The question now is, "can we learn anything without finding the full density function?"

A Partial Solution

The answer is yes, we can learn the expectation of V using the pdf of T only.

Expectation of a Function of a Random Variable.

Given a random variable Y with pdf $f_Y(y)$, and a new random variable built from Y, $W = g(Y)$, we can
get the expectation (mean) of W using the pdf of Y as

$$E(W) = E(g(Y)) = \begin{cases} \sum_y g(y) f_Y(y) & Y \text{ discrete} \\[6pt] \displaystyle\int_{-\infty}^{\infty} g(y) f_Y(y)\,dy & Y \text{ continuous} \end{cases}$$

Which means: just integrate (or sum) the original pdf multiplied by the function you want, to get ONLY THE
AVERAGE of W.
Or more loosely:

Expectation of Something Theorem.


$$E(\text{something}) = \begin{cases} \sum_y \text{something} \times \text{pmf} & Y \text{ discrete} \\[6pt] \displaystyle\int_{-\infty}^{\infty} \text{something} \times \text{pdf}\; dy & Y \text{ continuous} \end{cases}$$

This rule says that rather than go through all that CDF nonsense, AS LONG AS WE ONLY CARE ABOUT THE
AVERAGE we can just work with the pdf we have.
Example

In the travel time example,

$$V = g(T) = \frac{100}{T}$$

And so we can calculate the average velocity easily as

$$E(V) = E\left(\frac{100}{T}\right) = \int_1^\infty \frac{100}{t}\cdot\frac{2}{t^3}\,dt = \int_1^\infty \frac{200}{t^4}\,dt = \left[-\frac{200}{3t^3}\right]_1^\infty = \frac{200}{3} = 66.67 \text{ mph}$$

where we have no need to find that f(v) function.
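A short simulation (my addition) confirms both the theorem's answer and the earlier warning that E(V) ≠ 100/E(T):

```python
import numpy as np

rng = np.random.default_rng(2)

# Draw T from f(t) = 2/t^3 (t > 1) and average the transformed values directly
t = 1.0 / np.sqrt(1.0 - rng.uniform(size=1_000_000))

print(np.mean(100.0 / t))  # ~66.67 = E(100/T), the theorem's answer
print(100.0 / np.mean(t))  # ~50 = 100/E(T), which is NOT E(V)
```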

Example

The amount of sadness you will experience this semester is given by the random variable S with pdf

$$f(s) = \begin{cases} \dfrac{8}{s^3} & s \ge 2 \\[4pt] 0 & \text{elsewhere} \end{cases}$$

Where 1 sadness unit is equivalent to eating 4 slices of pineapple pizza off a subway floor.

1. How much sadness can a random student expect this semester?

$$E(S) = \int_2^\infty s \cdot \frac{8}{s^3}\,ds = \int_2^\infty \frac{8}{s^2}\,ds = \left[-\frac{8}{s}\right]_2^\infty = 0 - (-4) = 4$$

2. If I'm having a good year and promise to halve the sadness of all who take my class this semester, how much
sadness can a student in my class expect?

$$E\left(\frac{S}{2}\right) = \int_2^\infty \frac{s}{2} \cdot \frac{8}{s^3}\,ds = \int_2^\infty \frac{4}{s^2}\,ds = \left[-\frac{4}{s}\right]_2^\infty = 2$$

Notice:

$$E\left(\frac{S}{2}\right) = \frac{1}{2}E(S) = \frac{4}{2} = 2$$
Example (continued)

3. If my year has been crappy and I vow to square the sadness of all who take my class this semester, how much
sadness can a student in my class expect?

$$E(S^2) = \int_2^\infty s^2 \cdot \frac{8}{s^3}\,ds = \int_2^\infty \frac{8}{s}\,ds = \left[8\ln(s)\right]_2^\infty = \infty$$

Notice:

$$E(S^2) = \infty \ne E(S)^2 = 4^2 = 16$$
There will be rules for these.
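A simulation sketch (my addition) shows all three answers at once. Sampling uses the inverse CDF: here $F(s) = 1 - 4/s^2$, so $S = 2/\sqrt{1-U}$:

```python
import numpy as np

rng = np.random.default_rng(3)

# Inverse-transform sampling from f(s) = 8/s^3 (s >= 2): F(s) = 1 - 4/s^2,
# so S = 2/sqrt(1 - U)
s = 2.0 / np.sqrt(1.0 - rng.uniform(size=1_000_000))

print(np.mean(s))      # ~4, matching E(S)
print(np.mean(s / 2))  # ~2, matching E(S/2) = E(S)/2
# E(S^2) is infinite, so the running mean of S^2 never settles as n grows:
for n in (10**3, 10**4, 10**5, 10**6):
    print(n, np.mean(s[:n] ** 2))
```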

Example 2

For any random variable Y with pdf f(y) and mean $E(Y) = \mu_Y$, determine

$$E(Y - \mu_Y)$$

which is the average deviation (the ± distance from the random variable to its mean).
$$\int_{-\infty}^{\infty} (y - \mu_Y) f(y)\,dy = \int_{-\infty}^{\infty} y f(y)\,dy - \int_{-\infty}^{\infty} \mu_Y f(y)\,dy = \int_{-\infty}^{\infty} y f(y)\,dy - \mu_Y \int_{-\infty}^{\infty} f(y)\,dy$$

$$E(Y - \mu_Y) = E(Y) - \mu_Y(1) = \mu_Y - \mu_Y = 0$$

Note: The mean µ or E(Y) is a constant and so can be treated as any other number.

The Variance

An important use of the previous rule is to determine the variance of a random variable.

Variance and Standard Deviation (Not For Calculation).

The Signed Distance or Deviation of a random variable Y is $Y - E(Y)$

The Squared Distance of a random variable Y is $(Y - E(Y))^2$

The Average Squared Distance or Variance of a random variable Y is $V(Y) = \sigma^2 = E\left[(Y - E(Y))^2\right]$

The Standard Deviation of a random variable Y is $SD(Y) = \sigma = \sqrt{V(Y)}$

The variance and standard deviation are measures of how compact the probability distribution is. Larger variance
means the probability is spread out over the support.
Multiple Random Variables

Combining Random Variables

A transformation of a random variable takes Y and turns it into something else, g(Y); in this class it is primarily
used to get the variance formula.
More commonly, we need to take two or more random variables $Y_1, Y_2$ and combine them in some way, usually by
addition, $Y_1 + Y_2$, or subtraction, $Y_1 - Y_2$.
This creates a new problem, which leads to a "sort of" solution that works for this class and many other applications
of probability.

Joint Distributions

The probability functions we have been looking at (pmfs or pdfs) tell you "how the random variable moves", or "where
the outcomes should land".
Once you have 2 or more random variables, $Y_1, Y_2, \ldots, Y_k$, you need a function that tells you how all of them
move together.

Joint Distribution Function.


For the random variables $Y_1, Y_2, \ldots, Y_k$, the multivariate function $f(y_1, y_2, \ldots, y_k)$ tells you the probability
(discrete) or density (continuous) of each random variable being a specific value. For discrete random variables:

$$f(y_1, y_2, \ldots, y_k) = P(Y_1 = y_1 \text{ and } Y_2 = y_2 \text{ and } \ldots \text{ and } Y_k = y_k)$$

The Difficulty

The difficulty with this is pretty simple.


Multivariate functions are covered in Calculus 3, which is not required for this class, so we can't deal with this stuff
directly.
A Partial Solution

A Partial Solution

As was hinted at, there is a simplification that makes this problem go away

Decomposition of Functions of Multiple Random Variables.


If $Y_1$ and $Y_2$ are random variables having joint distribution function $f_{Y_1,Y_2}(y_1, y_2)$, and if $Y_1$ and $Y_2$ are
independent, then

$$f_{Y_1,Y_2}(y_1, y_2) = f_{Y_1}(y_1)\, f_{Y_2}(y_2)$$

Translation: as long as we assume we are dealing with independent random variables, all the multivariate
calculus stuff can be broken into single-variable stuff. This is what we will do.
$f_{Y_1,Y_2}(y_1, y_2)$ is the joint distribution from before, and the usual functions $f_{Y_1}(y_1)$ and $f_{Y_2}(y_2)$ are called
Marginal Probability Functions.
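To make the factorization concrete, here is a small sketch (my addition, using two fair dice as an assumed example): the joint pmf of two independent rolls is just the product of the marginal pmfs.

```python
from fractions import Fraction

# Marginal pmf of one fair six-sided die
marginal = {face: Fraction(1, 6) for face in range(1, 7)}

# Under independence, the joint pmf factors into a product of marginals
joint = {(a, b): marginal[a] * marginal[b] for a in marginal for b in marginal}

print(joint[(3, 5)])        # 1/36 = (1/6)(1/6)
print(sum(joint.values()))  # 1, so this is a valid joint distribution
```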

Sums of Independent Random Variables

The ideal way of dealing with random variables is to have the pdf, pmf, or joint distribution; then you can figure
out anything you want.
So if we have the random variable T = Y1 + Y2 + Y3 + Y4 + Y5 and I know the distribution of T, then I can
calculate probabilities, means, variances, etc.
However, this distribution is terrible to figure out, and you may need Calculus 3 to do anything with it.
In general, joint distributions of combined random variables will be off limits, just like the distributions of functions of
random variables were off limits.
However, expectations and variances will be easy to figure out.

1. T = a for a constant.
• E(T ) = E(a) = a
• V (T ) = V (a) = 0
2. T = aY for a constant and Y a random variable.
• E(T ) = E(aY ) = aE(Y )
• V (T ) = V (aY ) = a2 V (Y )
3. T = X + Y for X and Y independent.
• E(T ) = E(X + Y ) = E(X) + E(Y )
• V (T ) = V (X + Y ) = V (X) + V (Y )
4. T = aX + bY + c for X and Y independent random variables, and a, b, c constants.
• E(T ) = E(aX + bY + c) = aE(X) + bE(Y ) + c
• V (T ) = V (aX + bY + c) = a2 V (X) + b2 V (Y )
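A quick check of rule 4 (my addition; the exponential and normal choices for X and Y are arbitrary assumptions, since the rule holds for any independent pair):

```python
import numpy as np

rng = np.random.default_rng(4)

# Arbitrary independent choices: X ~ Exponential(mean 2), Y ~ Normal(5, sd 3)
x = rng.exponential(scale=2.0, size=1_000_000)      # E(X) = 2, V(X) = 4
y = rng.normal(loc=5.0, scale=3.0, size=1_000_000)  # E(Y) = 5, V(Y) = 9

a, b, c = 3.0, -2.0, 7.0
t = a * x + b * y + c

print(t.mean())  # ~ aE(X) + bE(Y) + c = 6 - 10 + 7 = 3
print(t.var())   # ~ a^2 V(X) + b^2 V(Y) = 36 + 36 = 72
```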
Sums Summed Up

Sums Summed Up

All rolled up into one rule and extended to n random variables, this gives you

Sums of Independent Random Variables.

Given $T = a_0 + \sum_{i=1}^n a_i Y_i$ for independent random variables $Y_1, Y_2, \ldots, Y_n$ and constants $a_0, a_1, \ldots, a_n$:

$$E(T) = E\left(a_0 + \sum_{i=1}^n a_i Y_i\right) = a_0 + \sum_{i=1}^n a_i E(Y_i)$$

$$V(T) = V\left(a_0 + \sum_{i=1}^n a_i Y_i\right) = \sum_{i=1}^n a_i^2 V(Y_i)$$

Example: Variance Calculation

These rules can be used to simplify our variance calculation


From before

$$V(Y) = E\left((Y - E(Y))^2\right) = E\left((Y - \mu)^2\right) = E(Y^2 - 2Y\mu + \mu^2)$$

Keeping in mind that $E(Y) = \mu$ is a constant number,

$$V(Y) = E(Y^2) - E(2Y\mu) + E(\mu^2) = E(Y^2) - 2\mu E(Y) + \mu^2$$

$$V(Y) = E(Y^2) - 2\mu^2 + \mu^2 = E(Y^2) - \mu^2$$

Variance Calculation Formula.


For a random variable Y, the variance can be calculated from a pdf or pmf using

$$\sigma^2 = V(Y) = E(Y^2) - \mu^2$$

where

$$E(Y^2) = \begin{cases} \sum_y y^2 f_Y(y) & Y \text{ discrete} \\[6pt] \displaystyle\int_{-\infty}^{\infty} y^2 f_Y(y)\,dy & Y \text{ continuous} \end{cases}$$
Examples

Example 1

Behold the following independent random variables.

RV Expectation Variance
W 13 9
X -4 25
Y 2 144
Z 0 49

1. Let D = X − W . Determine E(D), V (D) and SD(D).

$$E(D) = E(X) - E(W) = -4 - 13 = -17$$

$$V(D) = V(X) + V(W) = 25 + 9 = 34 \qquad SD(D) = \sqrt{34} \approx 5.83$$

2. Let R = W + X + Y + Z. Determine E(R), V (R) and SD(R).

$$E(R) = E(W) + E(X) + E(Y) + E(Z) = 13 + (-4) + 2 + 0 = 11$$

$$V(R) = V(W) + V(X) + V(Y) + V(Z) = 9 + 25 + 144 + 49 = 227 \qquad SD(R) = \sqrt{227} \approx 15.07$$

3. Let M = 3X − Y + 12. Determine E(M ), V (M ) and SD(M ).

$$E(M) = E(3X) - E(Y) + E(12) = 3E(X) - E(Y) + 12 = 3(-4) - 2 + 12 = -2$$

$$V(M) = 9V(X) + V(Y) = 225 + 144 = 369 \qquad SD(M) = \sqrt{369} \approx 19.21$$

4. Let Q = 6 − Y + 2X − 4Z + W . Determine E(Q), V (Q) and SD(Q).

$$E(Q) = 6 - E(Y) + 2E(X) - 4E(Z) + E(W) = 6 - 2 + 2(-4) - 4(0) + 13 = 9$$

$$V(Q) = V(Y) + 4V(X) + 16V(Z) + V(W) = 144 + 100 + 784 + 9 = 1037 \qquad SD(Q) = \sqrt{1037} \approx 32.20$$

5. Determine E(W 2 ), E(X 2 ), E(Y 2 ) and E(Z 2 ).

$$E(W^2) = V(W) + [E(W)]^2 = 9 + 13^2 = 178$$
$$E(X^2) = V(X) + [E(X)]^2 = 25 + (-4)^2 = 41$$
$$E(Y^2) = V(Y) + [E(Y)]^2 = 144 + 2^2 = 148$$
$$E(Z^2) = V(Z) + [E(Z)]^2 = 49 + 0^2 = 49$$
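Since only means and variances are given, any distributions with those moments reproduce these answers; here is a sketch (my addition) using normal random variables as an arbitrary stand-in:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000

# Only the moments are specified, so normality here is an arbitrary assumption
w = rng.normal(13, 3, n)   # E = 13, V = 9
x = rng.normal(-4, 5, n)   # E = -4, V = 25
y = rng.normal(2, 12, n)   # E = 2,  V = 144
z = rng.normal(0, 7, n)    # E = 0,  V = 49

q = 6 - y + 2 * x - 4 * z + w
print(q.mean(), q.var())   # ~9 and ~1037, matching part 4
print(np.mean(w**2))       # ~178, matching E(W^2) from part 5
```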
Example 2

Example 2

Consider the following pdfs of W, Y, and R, which are independent:

$$g(w) = \begin{cases} 2 - 3w^2 & 0 \le w \le 1 \\ 0 & \text{elsewhere} \end{cases} \qquad t(y) = \begin{cases} 0.05 & -10 \le y \le 10 \\ 0 & \text{elsewhere} \end{cases}$$

$$k(r) = \begin{cases} 0.2 & r = 0, 2, 10, 20, 100 \\ 0 & \text{elsewhere} \end{cases}$$

1. Determine E(W), V(W), E(Y), V(Y), E(R), and V(R).

$$E(W) = \int_0^1 w(2 - 3w^2)\,dw = \left[w^2 - 0.75w^4\right]_0^1 = 0.25$$

$$E(Y) = \int_{-10}^{10} y(0.05)\,dy = \left[0.025y^2\right]_{-10}^{10} = 0$$

$$E(R) = 0(0.2) + 2(0.2) + 10(0.2) + 20(0.2) + 100(0.2) = 26.4$$

$$E(W^2) = \int_0^1 w^2(2 - 3w^2)\,dw = \left[\frac{2}{3}w^3 - \frac{3}{5}w^5\right]_0^1 = \frac{1}{15}$$

$$E(Y^2) = \int_{-10}^{10} y^2(0.05)\,dy = \left[\frac{0.05}{3}y^3\right]_{-10}^{10} = \frac{50}{3} + \frac{50}{3} = \frac{100}{3} = 33.33$$

$$E(R^2) = 0^2(0.2) + 2^2(0.2) + 10^2(0.2) + 20^2(0.2) + 100^2(0.2) = 2100.8$$

$$V(W) = \frac{1}{15} - (0.25)^2 = 0.004$$

$$V(Y) = 33.33 - (0)^2 = 33.33$$

$$V(R) = 2100.8 - (26.4)^2 = 1403.84$$

2. Let U = 4Y − 5W − R + 9. Determine E(U) and SD(U).

$$E(U) = 4E(Y) - 5E(W) - E(R) + 9 = 4(0) - 5(0.25) - 26.4 + 9 = -18.65$$

$$V(U) = 16V(Y) + 25V(W) + V(R) = 16(33.33) + 25(0.004) + 1403.84 = 1937.22$$

$$SD(U) = \sqrt{1937.22} = 44.01$$
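A numerical check of these moments (my addition; `np.trapezoid` is the NumPy ≥ 2.0 name for the older `np.trapz`):

```python
import numpy as np

# Continuous pdfs: approximate the moment integrals on fine grids
w = np.linspace(0, 1, 100_001)
ew = np.trapezoid(w * (2 - 3 * w**2), w)             # E(W) = 0.25
vw = np.trapezoid(w**2 * (2 - 3 * w**2), w) - ew**2  # V(W) = 1/15 - 0.0625

y = np.linspace(-10, 10, 100_001)
ey = np.trapezoid(y * 0.05, y)                       # E(Y) = 0
vy = np.trapezoid(y**2 * 0.05, y) - ey**2            # V(Y) = 100/3

# Discrete pmf: direct sums over k(r)
r = np.array([0, 2, 10, 20, 100])
er = np.sum(r * 0.2)                                 # E(R) = 26.4
vr = np.sum(r**2 * 0.2) - er**2                      # V(R) = 1403.84

eu = 4 * ey - 5 * ew - er + 9
vu = 16 * vy + 25 * vw + vr
print(eu, np.sqrt(vu))                               # ~ -18.65 and ~44.01
```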
Warnings and Notes (Important)

Warnings (Important)

1. When dealing with joint distributions, an important property is the Covariance (or, similarly, the Correlation)
of the random variables. This is basically a measure that tells you how "in sync" the two random variables
are, or how much information about one random variable is contained in the other.
Essentially all you need to know about this is that you can ignore it for almost all of the class.
2. DON'T DO THIS. A big mistake people make is to incorrectly combine the rules, like so:

$$\text{WRONG (BUT WORKS)} \rightarrow E(Y_1 + Y_2 + Y_3 + Y_4 + Y_5) = E(5Y) = 5E(Y)$$

$$\text{WRONG (DON'T DO)} \rightarrow V(Y_1 + Y_2 + Y_3 + Y_4 + Y_5) = V(5Y) = 25V(Y)$$

Confusion seems to happen here because this does actually work for expectation, mathematically, but it badly
inflates the variance. The reason is that doing something 5 times and adding the results up is not the same as
doing something once and multiplying the answer by 5 (the simulation after this list illustrates the gap).
3. Notice that there are NO RULES FOR STANDARD DEVIATION, so you always have to work with
variances first before you get the standard deviation.
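A minimal sketch of warning 2 (my addition, with standard normal Y's as an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(6)

# Five independent copies of Y, each with V(Y) = 1
ys = rng.normal(0.0, 1.0, size=(5, 1_000_000))

sum_of_five = ys.sum(axis=0)  # Y1 + Y2 + Y3 + Y4 + Y5
five_times_one = 5 * ys[0]    # 5Y: one draw scaled by 5

print(sum_of_five.var())      # ~5  = V(Y1) + ... + V(Y5)
print(five_times_one.var())   # ~25 = 5^2 V(Y), much more spread out
```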

Main Points

1. $T = a_0 + \sum_{i=1}^n a_i Y_i$ for independent random variables $Y_1, Y_2, \ldots, Y_n$ and constants $a_0, a_1, \ldots, a_n$:

$$E(T) = E\left(a_0 + \sum_{i=1}^n a_i Y_i\right) = a_0 + \sum_{i=1}^n a_i E(Y_i)$$

$$V(T) = V\left(a_0 + \sum_{i=1}^n a_i Y_i\right) = \sum_{i=1}^n a_i^2 V(Y_i)$$

2. Easier variance formula. Remember that E(Y ) = µ and it is a constant.

Variance Formula to remember.

$$V(Y) = E(Y^2) - \mu^2$$

3. You have to determine the variance of sums before you can get the standard deviation.
