SlidesCourse 7 8 Oct
Definition: The set of all possible values of a random variate (RV) X is called its “domain” or “range”.
In practice we very often do NOT consider the elementary events of a random experiment,
the sample space and the probability function related to that sample space.
Instead we directly consider a random variate related to that random experiment.
HOW CAN WE DESCRIBE THE RANDOM “BEHAVIOUR” OF A RV?
We say: we describe the “DISTRIBUTION” of a RV.
For discrete RVs the probability mass function (pmf) f(x) is very important:
Definition: pmf: f(x) = f_X(x) = P(X=x) for x = 0, 1, 2, ...
For discrete RVs the letter K is also used instead of X. Then the pmf is denoted by f_K(k).
Example: We consider the RV X = “sum shown when rolling 2 dice”.
Find the domain and the pmf of X:
Solution: Domain: Clearly the sum of 2 dice can take the values {2, 3, 4, …, 12}
f(2) = f(12) = 1/6^2 = 1/36
The plot above, the formula P = favourable/possible and counting help to find all pmf
values: P(X=2) = 1/36, P(X=3) = 2/36, P(X=4) = 3/36, ..., P(X=7) = 6/36,
P(X=8) = 5/36, P(X=9) = 4/36, P(X=10) = 3/36, P(X=11) = 2/36, P(X=12) = 1/36
Figure: the pmf f_X(x) of X (can also be denoted f_K(k))
This example makes it obvious that by knowing the pmf we know the exact random
behaviour (i.e. the distribution) of the RV X.
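The counting argument for the two dice can be reproduced in R. The following is only a minimal sketch: it enumerates the 36 equally likely outcomes and tabulates the sums; the variable names sums and pmf are chosen here for illustration.

```r
# All 36 equally likely outcomes of rolling two fair dice, reduced to their sum
sums <- as.vector(outer(1:6, 1:6, "+"))

# Relative frequency of each sum among the 36 outcomes = pmf of X
pmf <- table(sums) / 36
print(pmf)

# The pmf values must add up to 1
print(sum(pmf))
```

The table lists the values 2, ..., 12 together with their probabilities, so it can be compared directly with the hand-counted values above.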
Properties of the pmf f(x) := P(X=x):
1) 0 ≤ f(x) ≤ 1, as each f(x) is a probability
2) sum_{x in domain} f(x) = 1, i.e. the sum of all non-zero values is 1
Example 1b: A discrete RV Y has the pmf f(0) = 0.4, f(1)=0.4, f(2)=0.2
Check that f is a pmf and find the probability P( Y > 0.5 ).
Solution: P( Y > 0.5 ) = f(1) + ...
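Both parts of Example 1b can also be checked with a few lines of R. This is a sketch only; the names y, f, is_pmf and p_gt are hypothetical helpers introduced here, not part of the course material.

```r
# Domain and pmf of Y as given in Example 1b
y <- 0:2
f <- c(0.4, 0.4, 0.2)

# Property 1: every value lies in [0, 1]; property 2: the values sum to 1
is_pmf <- all(f >= 0 & f <= 1) && isTRUE(all.equal(sum(f), 1))
print(is_pmf)

# P(Y > 0.5): add the pmf values of all domain points larger than 0.5
p_gt <- sum(f[y > 0.5])
print(p_gt)
```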
The most general approach to describe the behaviour of a RV is to use the
********************************
Continuous RV: Definition of the probability density function (pdf) or density
X has a continuous CDF and Prob(X=c) = 0 for all c in R
Is this CDF F(x) continuous? YES, as the left and right limits at 0 and at 0.5 are the same.
Find the density:
I) The CDF is continuous.
II) pdf: f(x) = F’(x) = 2 for 0 ≤ x < 0.5 (constant 2 on (0,0.5))
0 else
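The relation f(x) = F’(x) can be illustrated numerically. The CDF itself is not shown in this text, so the sketch below ASSUMES F(x) = 2x on [0, 0.5], 0 below and 1 above, which is the continuous CDF that matches the stated density; the function names cdf and d_cdf are hypothetical.

```r
# Assumed CDF (reconstructed from the stated density, not given explicitly in the text)
cdf <- function(x) pmin(pmax(2 * x, 0), 1)

# Central-difference approximation of the derivative F'(x)
d_cdf <- function(x, h = 1e-6) (cdf(x + h) - cdf(x - h)) / (2 * h)

# Evaluate away from the kinks at 0 and 0.5, where F is not differentiable
x0 <- c(-0.3, 0.1, 0.25, 0.4, 0.8)
dens <- d_cdf(x0)
print(cbind(x0, dens))
```

Inside (0, 0.5) the numerical derivative is approximately 2, outside it is 0, in line with the density found above.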
# floor(): right-continuous step function, value k on [k, k+1)
plot(c(-0.5,5.5),c(-1,5.5),type="n",xlab="x",ylab="y",main="floor()")
lines(c(-0.5,0),c(-1,-1))
for(k in 0:5) lines(c(k,k+1),rep(k,2)) # step of height k
points(0:5,0:5) # the left end point of each step belongs to the step
# ceiling(): left-continuous step function, value k+1 on (k, k+1]
dev.new() # portable replacement for windows()
plot(c(-0.5,5.5),c(-1,5.5),type="n",xlab="x",ylab="y",main="ceiling()")
lines(c(-0.5,0),c(0,0))
for(k in 0:4) lines(c(k,k+1),rep(k+1,2)) # step of height k+1
points(0:5,0:5) # the right end point of each step belongs to the step
Example A0 continued:
       0           for x < 0
F(x) = floor(x)/5  for 0 ≤ x < 5
       1           for 5 ≤ x
pmf (discrete pdf): f(x) = P(X=x)
b) Find the table representation of the pmf and the CDF.
Note that for a discrete random variate X whose domain is a subset of the
integers we can calculate: f(x) = F(x) - F(x-0.0001) for x = 0, 1, 2, 3, ...
x     -1   0   1    2    3    4    5    6
F(x)   0   0   1/5  2/5  3/5  4/5  1    1
f(x)   0   0   1/5  1/5  1/5  1/5  1/5  0
We can see that X has the discrete uniform distribution with
domain {1, 2, 3, 4, 5}.
c) Calculate probabilities like P(3≤X≤6).
Answer: P(3≤X≤6) = f(3)+f(4)+f(5)+f(6) = 3/5 + 0 = 0.6
NOTE: it is easiest to use the pmf; we can also use F(6) - F(2) = 1 - 2/5 = 0.6
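The table for Example A0 and the probability in c) can be reproduced in R. A minimal sketch, using the CDF from the example and the difference trick F(x) - F(x-0.0001); the names cdf, pmf, tab, p_pmf and p_cdf are chosen here for illustration.

```r
# CDF of Example A0
cdf <- function(x) ifelse(x < 0, 0, ifelse(x < 5, floor(x) / 5, 1))

# pmf at the integers as the jump height of the CDF
x <- -1:6
pmf <- cdf(x) - cdf(x - 0.0001)

# Table representation of CDF and pmf
tab <- rbind(x = x, F = cdf(x), f = pmf)
print(tab)

# P(3 <= X <= 6), once via the pmf and once via the CDF
p_pmf <- sum(pmf[x >= 3 & x <= 6])
p_cdf <- cdf(6) - cdf(2)
print(c(p_pmf, p_cdf))
```

Note that the CDF version uses F(2), not F(3), because the general formula F(b) - F(a) gives P(a < X ≤ b) and the value 3 has to be included.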
Example B0:
f(x) = -c x for -1 ≤ x < 0
0 else
a) Select the value c such that f is a density.
b) Find the CDF.
c) Check that the CDF you found is really a CDF.
d) Check if the CDF you found is continuous.
Solution:
a) integral_(-1,0) -c x dx = -c integral_(-1,0) x dx = -c (x^2/2)_(-1,0) =
= -c (0 - 1/2) = c/2
A density must have integral 1, so we have to select c = 2.
b) Way 1: F(x) = integral_(-1,x) -2y dy = -2 (y^2/2)_(-1,x) =
= -2 (x^2/2 - 1/2) = 1 - x^2 for -1 ≤ x < 0
Way 2: F(x) = integral -2x dx = -2 x^2/2 + const = -x^2 + const for -1 ≤ x < 0
We know that F(0) = 1, so const must fulfil F(0) = -0^2 + const = 1 and thus const = 1:
F(x) = -x^2 + const = 1 - x^2 for -1 ≤ x < 0
Final result for b):
F(x) = 0        for x < -1
       1 - x^2  for -1 ≤ x < 0
       1        for 0 ≤ x
c) limit x to -infinity: “F(-9999999)” = 0; limit x to +infinity: “F(9999999)” = 1;
F(x) is non-decreasing, so F is a CDF.
d) F(-1) = 0 from left and right and F(0) = 1 from left and right, thus F(x) is continuous.
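The results for Example B0 can be cross-checked with numerical integration in R. This is only a sketch using the base function integrate(); the names dens, cdf, total, x0 and num_cdf are introduced here for illustration.

```r
# Density of Example B0 with c = 2 and the closed-form CDF found in b)
dens <- function(x) ifelse(x >= -1 & x < 0, -2 * x, 0)
cdf  <- function(x) ifelse(x < -1, 0, ifelse(x < 0, 1 - x^2, 1))

# a) the density must integrate to 1
total <- integrate(dens, lower = -1, upper = 0)$value
print(total)

# b) compare the closed-form CDF with the numerically integrated density
x0 <- c(-0.9, -0.5, -0.1)
num_cdf <- sapply(x0, function(b) integrate(dens, lower = -1, upper = b)$value)
print(cbind(x0, closed_form = cdf(x0), numeric = num_cdf))
```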
Example B:
       c(1+x)    for -1 ≤ x < 0
f(x) = c(1-x)^2  for 0 ≤ x < 1
       0         else
Is X discrete or continuous? Answer: X is continuous, as f(x) is non-zero on the whole interval (-1,1).
a) Find c such that f(x) is a density. b) Find the CDF.
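For part a) of Example B the constant can be found by integrating the density without the factor c. By hand the two pieces give 1/2 and 1/3, so c = 6/5; the sketch below checks this numerically. The names g, area and c_val are hypothetical and only used for this illustration.

```r
# Density of Example B without the constant c
g <- function(x) {
  ifelse(x >= -1 & x < 0, 1 + x,
  ifelse(x >= 0 & x < 1, (1 - x)^2, 0))
}

# Integrate both pieces separately and choose c so that the total area is 1
area  <- integrate(g, lower = -1, upper = 0)$value +
         integrate(g, lower = 0, upper = 1)$value
c_val <- 1 / area
print(c(area = area, c = c_val))
```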
To answer b) and c) below we use the formula for probabilities calculated from the CDF
F(x), which is correct for discrete, continuous and mixed distributions:
General Formula: P(a<X≤b) = F(b)-F(a)
b) Find P(0.5 < X ≤ 1.21) and P(0.5 ≤ X ≤ 1.21)
P(0.5 < X ≤ 1.21) = F(1.21) - F(0.5) = 0.55 - 0.1 = 0.45
P(0.5 ≤ X ≤ 1.21) = 0.45 as P(X=0.5) is 0.
c) Find P(1 < X ≤ 1.21) and P(1 ≤ X ≤ 1.21)
P(1 < X ≤ 1.21) = F(1.21) – F(1) = 0.55 – 0.5 = 0.05
P(1 ≤ X ≤ 1.21) = F(1.21) – F(1) + P(X=1) = 0.55 – 0.5 + 0.3 = 0.35
How can we describe the distribution of Example C?
1) continuous uniform distribution for X < 1 with P(X<1) = 0.2
2) P(X=1) = 0.3 ... point mass at 1.
3) continuous distribution for X > 1 with density proportional to
1/sqrt(x) for 1 < x < 4; P(X>1) = 1-F(1) = 0.5.
Note that this mixture of a discrete and a continuous distribution can
only be described by the CDF.
This non-continuous CDF has neither a pdf nor a pmf.
When we try to find a pdf we could use (as some authors do) f(1) = infinity,
but this notation does not describe the distribution exactly, as the weight
of the point mass is not given.
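The mixed distribution of Example C can be written down as a CDF in R. The sketch below rests on two ASSUMPTIONS that are inferred from the numbers above and not stated in the text: the uniform part lives on (0, 1) (this follows from F(0.5) = 0.1 and P(X<1) = 0.2), and the density for 1 < x < 4 is 0.25/sqrt(x) (the constant that gives this part the mass 0.5). The name cdf_C and the other variable names are hypothetical.

```r
# Assumed CDF of Example C: uniform part on (0,1), jump of 0.3 at x = 1,
# then a part with density 0.25/sqrt(x) on (1,4)
cdf_C <- function(x) {
  ifelse(x < 0, 0,
  ifelse(x < 1, 0.2 * x,
  ifelse(x < 4, 0.5 + 0.5 * (sqrt(x) - 1), 1)))
}

# The height of the jump at 1 approximates the point mass P(X = 1)
jump_at_1 <- cdf_C(1) - cdf_C(1 - 1e-9)

# General formula P(a < X <= b) = F(b) - F(a)
p_b        <- cdf_C(1.21) - cdf_C(0.5)   # P(0.5 < X <= 1.21)
p_c_open   <- cdf_C(1.21) - cdf_C(1)     # P(1 < X <= 1.21)
p_c_closed <- p_c_open + jump_at_1       # P(1 <= X <= 1.21)
print(c(jump_at_1, p_b, p_c_open, p_c_closed))
```

The jump of the CDF carries the weight of the point mass, which is exactly the information that the notation f(1) = infinity cannot express.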
#######################################################
Homework Questions:
Q 3.6: We consider the random variate X with the properties:
P(X=0)=0.5 and all points in the interval (0,2) have the same
probability (and no points outside of [0,2) ).
Hint for Question 3.6:
1. P(X=0) = 0.5
2. all points in (0,2) have the same probability
add: 3. Prob(X<0 or X>=2)=0
UNIT 4: Part 1
Expectation and Variance
X discrete RV with pmf: f(0)=0.2;f(1)=0.6;f(2)=0.1;f(3)=0.1
How can we generate realisations of X ?
y <- sample(x=0:3,size=1.e6,replace=TRUE,prob=c(0.2,0.6,0.1,0.1))
length(y)
# [1] 1000000
table(y)
# y
#      0      1      2      3
# 199575 599888 100349 100188
How can we describe 2 main properties of a RV?
Expectation: E(X) = sum( x f(x) ) … the mean value of the RV
Variance: Var(X) = V(X) … measure for the average squared distance to the mean value
Var(X) = E( (X-E(X))^2 ) = sum( (x-E(X))^2 f(x) ) = sum( x^2 f(x) ) - E(X)^2
The variance is the average squared distance from the expectation.
Question: A RV has E(X) = 2.5 and variance 0. Find its domain and pmf!
Answer: domain = {2.5}; pmf: f(x) = 1 for x = 2.5
                                  0 else
This is called a deterministic experiment.
Example A: X discrete RV with pmf: f(0) = 0.2; f(1) = 0.6; f(2) = 0.1; f(3)=0.1
a) Guess if the expectation is 1, > 1 or < 1
Due to the simulation we guess that E(X) > 1
b) Write R-code to generate from X and check if your guess was correct!
y <- sample(x=0:3,size=1.e6,replace=TRUE,prob=c(0.2,0.6,0.1,0.1))
mean(y)
# [1] 1.100125
c) Calculate the expectation E(X) and compare it with the result of b).
E(X) = 0*0.2 + 1*0.6 + 2*0.1 + 3*0.1 = 1.1
We write a function that calculates expectation and variance
of discrete RVs:
meanVarRV <- function(x=0:3,prob=c(0.2,0.6,0.1,0.1)){
# calculates the mean and the variance of a discrete RV
fx <- prob
EX <- sum(x*fx) # Expectation
VX <- sum((x-EX)^2*fx) # Variance
return( c(EX=EX,VarX=VX))
}
meanVarRV(x=0:3,prob=c(0.2,0.6,0.1,0.1))
# EX VarX
# 1.10 0.69
d) Calculate the Variance.
Var(X) = sum( x^2 f(x)) – E(X)^2 = 0 * 0.2 + 1*0.6+ 4*0.1 + 9 *0.1 - 1.1^2 = 0.69
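The hand calculation in d) uses the shortcut formula, while the variance can also be computed directly from its definition. A short sketch showing that both give the same value for the pmf of Example A; the variable names are chosen here for illustration.

```r
# pmf of Example A
x  <- 0:3
fx <- c(0.2, 0.6, 0.1, 0.1)

EX <- sum(x * fx)                     # expectation

var_def   <- sum((x - EX)^2 * fx)     # definition: E((X - E(X))^2)
var_short <- sum(x^2 * fx) - EX^2     # shortcut:   E(X^2) - E(X)^2
print(c(EX = EX, var_def = var_def, var_short = var_short))
```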
Example B: X discrete RV with pmf: f(0) = 0.2; f(1) = 0.6; f(2) = 0.1; f(3)=0.1
Consider a game with a final payment of X^2. How much must be paid
for such a game so that it is a fair game?
a) What is the expected payment of this game?
E(payment(X)) = E(X^2) = 0^2*0.2 + 1^2*0.6 + 2^2*0.1 + 3^2*0.1 = 1.9
b) What payment does the gamer have to make so that it is a fair game?
The payment has to be 1.9:
Win = -1.9 + X^2
E(Win) = -1.9 + E(X^2) = -1.9 + 1.9 = 0 … fair game.
# Example B
X<- sample(x=0:3,size=1.e6,replace=T,prob=c(0.2,0.6,0.1,0.1))
mean(X^2)# expected payment in the game
#[1] 1.895456
mean(-1.9+X^2)# expectation of fair game
#[1] -0.004544
b)
E(X) = integral_(-inf,inf) x f(x) dx =
= integral_(-1,0) x (x+1) dx + integral_(0,0.5) x dx =
= integral_(-1,0) (x^2 + x) dx + (x^2/2)_(0,0.5) =
= (x^3/3 + x^2/2)_(-1,0) + (x^2/2)_(0,0.5) =
= (0 - (-1/3 + 1/2)) + (0.25/2 - 0) = -1/6 + 1/8 = -1/24 = -0.04166667
c)
E(X^2) = integral_(-inf,inf) x^2 f(x) dx =
= integral_(-1,0) x^2 (x+1) dx + integral_(0,0.5) x^2 dx =
= integral_(-1,0) (x^3 + x^2) dx + (x^3/3)_(0,0.5) =
= (x^4/4 + x^3/3)_(-1,0) + (x^3/3)_(0,0.5) =
= (0 - (1/4 - 1/3)) + (0.125/3 - 0) = 1/12 + 1/24 = 3/24 = 1/8
V(X) = E(X^2) - (E(X))^2 = 1/8 - (-0.04166667)^2 = 0.1232639
d) E(Z) = E(5 + 4 X) = 5 + 4 E(X) = …
V(Z) = V(5 + 4 X) = 4^2 V(X) = …
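The integrals in b) and c) can be cross-checked numerically. The density is not restated in this part of the text; the sketch ASSUMES f(x) = x+1 on [-1, 0) and f(x) = 1 on [0, 0.5), which is what the integrands above imply. The helper name moment is hypothetical.

```r
# Assumed density, read off from the integrands used in b) and c)
dens <- function(x) {
  ifelse(x >= -1 & x < 0, x + 1,
  ifelse(x >= 0 & x < 0.5, 1, 0))
}

# k-th moment E(X^k), integrating the two pieces of the density separately
moment <- function(k) {
  integrate(function(x) x^k * dens(x), lower = -1, upper = 0)$value +
  integrate(function(x) x^k * dens(x), lower = 0, upper = 0.5)$value
}

total <- moment(0)          # should be 1 for a density
EX    <- moment(1)
EX2   <- moment(2)
VX    <- EX2 - EX^2
print(c(total = total, EX = EX, EX2 = EX2, VX = VX))
```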
Example: We consider two urns with 3 balls each, numbered (0, 1, 2) and (1, 2, 3).
U1, U2 ... numbers obtained from randomly drawing one ball from urn 1 and urn 2, respectively
Find the exact value of E(U1 . U2) and Var(U1 . U2).
Solution: We define Y = U1 . U2
and first find the pmf of Y:
f_Y(0) = P(Y=0) =P(U1=0) = 1/3;
f_Y(1) = P(U1=1 and U2=1)= 1/9
f_Y(2) = P((1, 2) or (2,1))= 2/9
f_Y(3) = P((1,3) )= 1/9
f_Y(4) = P((2, 2))= 1/9
f_Y(6) = P((2, 3))= 1/9
pmf <- c(3,1,2,1,1,0,1)/9
Y <- 0:6
sum(pmf*Y) # E(Y)
# [1] 2
sum(pmf*Y^2) - sum(pmf*Y)^2 # Var(Y)
# [1] 3.777778
E(Y) = sum( y f(y) ) = (0*3 + 1*1 + 2*2 + 3*1 + 4*1 + 6*1)/9 = 2
E(Y^2) = sum( y^2 f(y) ) = (0*3 + 1*1 + 4*2 + 9*1 + 16*1 + 36*1)/9 = 70/9 = 7.77778
Var(Y) = 70/9 - 2^2 = 34/9 = 3.77778
Comparing the results of the R calculation and the exact formula shows that they agree.
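The exact values E(Y) = 2 and Var(Y) = 34/9 can additionally be compared with a simulation. The sketch below ASSUMES that the two draws are independent and each ball is equally likely, as in the solution above; the seed, the sample size n and the variable names are arbitrary choices made for this illustration.

```r
set.seed(1)     # arbitrary seed, only for reproducibility
n <- 1e5        # number of simulated draws

# Independent draws from urn 1 (0,1,2) and urn 2 (1,2,3)
U1 <- sample(0:2, size = n, replace = TRUE)
U2 <- sample(1:3, size = n, replace = TRUE)
Y  <- U1 * U2

sim_mean <- mean(Y)
sim_var  <- var(Y)

# Exact values by enumerating the 9 equally likely pairs
prod_all   <- as.vector(outer(0:2, 1:3))
exact_mean <- mean(prod_all)
exact_var  <- mean(prod_all^2) - exact_mean^2

print(c(sim_mean = sim_mean, exact_mean = exact_mean))
print(c(sim_var = sim_var, exact_var = exact_var))
```

The simulated values fluctuate around the exact ones; a larger n reduces the deviation.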