Lec 3
Mathematical Statistics
Lecture 3 - Random
Variables
Akerke Zhailaubek
[email protected]
Lecture overview:
1. Random variables
2. Discrete random variables (DRV)
3. Expected values for DRV
4. Variance and standard deviation for DRV
5. Continuous Random Variables
6. Expected values, variance, mode of CRV
7. Median, quartiles of CRV
Random Variable
§ A variable is represented by a symbol (X,Y, A etc.) and it
can take on any of a specified set of values from a sample
space.
§ When the value of a variable is the outcome of an
experiment, the variable is called a random variable
§ The list of all possible outcomes of an experiment is also
called the sample space
Discrete Random Variable
§Random variables can be discrete or
continuous
§A continuous random variable is one where
the outcome can be any value on a
continuous scale (will cover it in more detail
later)
§A discrete random variable (DRV) takes only
values on a discrete scale
Example 1
Determine whether or not each of the following is a
discrete random variable
a) The average height of a group of boys
b) The number of times a coin is tossed before a head
appears
c) The number of months in a year
Discrete Random Variable
§ Capital letters like 𝑿 are used for the random variable,
and a small letter such as 𝒙 for a particular value of
the random variable 𝑋
§ The probability that 𝑋 is equal to a particular value 𝑥 is
written as 𝑃(𝑋 = 𝑥) or sometimes as 𝑝(𝑥). Thus
§ 𝑥 is a particular value of a random variable 𝑋
§ 𝑃(𝑋 = 𝑥) refers to the probability that 𝑋 is equal to a
particular value 𝑥
Example 2
A coin is tossed six times and the number of heads,
𝑋, is noted. What are all possible values of 𝑋?
Discrete Random Variable
§To specify a discrete random variable
completely, you need to know its set of
possible values and the probability with
which it takes each one
§ A table with possible outcomes and
corresponding probabilities is called a
probability distribution
Example 3
§ A fair die is rolled. Find the probability of getting any
particular number.
𝒙           1    2    3    4    5    6
𝑷(𝑿 = 𝒙)   1/6  1/6  1/6  1/6  1/6  1/6
Probability Function
§ A discrete random variable can be specified with a
function. Such a function is called a probability function
§ A probability function is the correspondence between a
value of a DRV and its probability
§ In order to find the probability function
1. Write down the sample space
2. Write down the probability distribution
3. Write down the probability function
Example 4
Three fair coins are tossed. The number of heads, X, is
counted. Find the probability function
Solution
1. Write down the possible outcomes
2. Write down the probability distribution
3. Write down the probability function
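The three steps for Example 4 can be sketched in Python. This is only an illustration, not the worked solution from the slides: the helper name `heads_distribution` is hypothetical, and exact fractions are used so the probabilities appear as eighths.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins):
    """Return {number of heads: probability} for n fair coins."""
    # Step 1: write down the sample space (all H/T sequences).
    outcomes = list(product("HT", repeat=n_coins))
    # Step 2: count how many outcomes give each number of heads.
    counts = Counter(outcome.count("H") for outcome in outcomes)
    # Step 3: each outcome is equally likely, so P(X = x) = count / total.
    total = len(outcomes)
    return {x: Fraction(c, total) for x, c in sorted(counts.items())}


dist = heads_distribution(3)
for x, p in dist.items():
    print(f"P(X = {x}) = {p}")
```

The resulting dictionary plays the role of the probability distribution table, and its values add up to one.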
The Sum of All Probabilities for a
DRV
§ Look at the probability distributions of DRVs from
Examples 3 and 4. What is the sum of all probabilities?
The Sum of All Probabilities is 1
§For a discrete random variable the sum of all
probabilities is one
§In plain English this means that one of the
possible outcomes is certain to happen
§The probability of any outcome for a DRV
cannot be negative or larger than one
§The fact that the sum of all probabilities is
one is used to normalize a probability function
Example 5
A tetrahedral die has the numbers 1, 2, 3 and 4 on its
faces. The die is biased in such a way that the probability of
the die landing on any number x is k/x, where k is a
constant. Find the probability distribution for X, the number
the die lands on after a single roll.
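Solution:
Since the sum of all probabilities is one:
$\frac{k}{1} + \frac{k}{2} + \frac{k}{3} + \frac{k}{4} = 1$, so $\frac{25k}{12} = 1$ and $k = \frac{12}{25}$
𝒙           1      2     3     4
𝑃(𝑋 = 𝑥)   12/25  6/25  4/25  3/25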
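The normalization step in Example 5 can be checked with a short Python sketch. The variable names are illustrative; the only input taken from the example is that the probability of face x is k/x for x = 1, 2, 3, 4.

```python
from fractions import Fraction

faces = [1, 2, 3, 4]

# Unnormalized weights 1/x; the true probabilities are k * (1/x).
weights = {x: Fraction(1, x) for x in faces}

# The probabilities must sum to one, so k = 1 / (sum of the weights).
k = 1 / sum(weights.values())

dist = {x: k * w for x, w in weights.items()}

print("k =", k)
for x, p in dist.items():
    print(f"P(X = {x}) = {p}")
```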
Finding the Probability that 𝑿 is within a
Range
§Using the probability function, it is possible to
find the probability that a DRV is smaller than a
certain value, larger than a certain value, or
between two values
b) $P(2 \le X \le 4) = P(X = 2, 3 \text{ or } 4) = 0.2 + 0.3 + 0.25 = 0.75$
Cumulative Distribution Function for a
DRV
§If a particular value of 𝑋 is 𝑥, the
probability that 𝑋 is less than or equal to 𝑥
is written as 𝐹(𝑥)
§𝐹(𝑥) is found by adding together all the
probabilities for those outcomes that are
less than or equal to 𝑥
§This is written as 𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥)
§A cumulative distribution function can be
also specified with a table
Example 7
§Two fair coins are tossed. 𝑋 is the number of
heads on the two coins. Draw a table to show
the cumulative distribution function for 𝑋.
§Solution:
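One way to build such a table is to accumulate the probabilities in increasing order of 𝑥. The sketch below assumes the probabilities 1/4, 1/2, 1/4 for 0, 1 and 2 heads, which follow from the four equally likely outcomes HH, HT, TH, TT; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# Probability distribution for the number of heads on two fair coins.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# F(x) = P(X <= x): running total of the probabilities in order of x.
xs = sorted(pmf)
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))

print("x     ", *xs)
print("F(x)  ", *(cdf[x] for x in xs))
```

The last entry of a cumulative distribution table is always 1, which gives a quick check on the arithmetic.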
The Expected Value of a DRV
§The expected value of a DRV is defined as
$E(X) = \sum x \cdot P(X = x) = \sum x\,p(x)$
Value 0 1 2 3 4
Frequency 3 10 5 4 9
Example 8
§ The number of television sets per household in a
survey of 100 houses in a town gave the following
frequency distribution
Number of sets   0    1    2    3
Frequency        10   75   10   5
a) Find the mean of this set of data
b) Draw the probability distribution table for the
variable 𝑋, where 𝑋 is the number of TV sets in a
house picked at random in the town
c) Find the expected value of 𝑋
Example 8
Solution:
a) $\text{mean} = \frac{\sum fx}{\sum f} = \frac{0 \times 10 + 1 \times 75 + 2 \times 10 + 3 \times 5}{10 + 75 + 10 + 5} = 1.1$
b)
𝒙                0     1     2     3
𝑝(𝑥) = 𝑓/∑𝑓     0.1   0.75  0.1   0.05
c) $E(X) = \sum x\,p(x) = 0 \times 0.1 + 1 \times 0.75 + 2 \times 0.1 + 3 \times 0.05 = 1.1$
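The link between the sample mean and the expected value in Example 8 can be sketched in Python. The function names `pmf_from_frequencies` and `expected_value` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction


def pmf_from_frequencies(freq):
    """Turn a frequency table {value: frequency} into probabilities f / sum(f)."""
    total = sum(freq.values())
    return {x: Fraction(f, total) for x, f in freq.items()}


def expected_value(pmf):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())


tv_sets = {0: 10, 1: 75, 2: 10, 3: 5}

# a) mean of the data: sum(f * x) / sum(f)
sample_mean = Fraction(sum(x * f for x, f in tv_sets.items()), sum(tv_sets.values()))

# b) probability distribution, c) expected value
pmf = pmf_from_frequencies(tv_sets)
print(float(sample_mean), float(expected_value(pmf)))
```

Both numbers agree because dividing each frequency by the total frequency before summing is the same calculation as dividing the weighted sum by the total.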
Expected Value of 𝑿²
§The expected value of the square of a DRV is
defined as
$E(X^2) = \sum x^2 \cdot P(X = x)$
Example 9
§ A discrete random variable 𝑋 has a probability
distribution
𝒙           1      2     3     4
𝒙²          1      4     9     16
𝑃(𝑋 = 𝑥)   12/25  6/25  4/25  3/25
b) Expected value of 𝑋²:
$E(X^2) = \sum x^2 \cdot P(X = x) = 1 \times \frac{12}{25} + 4 \times \frac{6}{25} + 9 \times \frac{4}{25} + 16 \times \frac{3}{25} = \frac{120}{25}$
Variance of a DRV
§Another important characteristic of a DRV
is variance
§The variance of 𝑋 is usually written as
𝑉𝑎𝑟(𝑋) and is found by using
$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$
§Variance measures how widely the values of 𝑋
are scattered around the expected value
§The square root of the variance is called
the standard deviation (SD or s)
Example 10
For the probability distribution below find
𝒙           0    1    2    3
𝑃(𝑋 = 𝑥)   1/8  3/8  3/8  1/8
a) Expected value
b) Variance
a) Expected value
$E(X) = 0 \times \frac{1}{8} + 1 \times \frac{3}{8} + 2 \times \frac{3}{8} + 3 \times \frac{1}{8} = 1.5$
b) Variance
𝒙           0    1    2    3
𝒙²          0    1    4    9
𝑃(𝑋 = 𝑥)   1/8  3/8  3/8  1/8
$\mathrm{Var}(X) = 0 \times \frac{1}{8} + 1 \times \frac{3}{8} + 4 \times \frac{3}{8} + 9 \times \frac{1}{8} - 1.5^2 = 3 - 2.25 = 0.75$
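The calculation in Example 10 follows the pattern E(X), then E(X²), then Var(X) = E(X²) − [E(X)]². A minimal Python sketch of that pattern is given here; the helper names are illustrative.

```python
import math
from fractions import Fraction


def expected_value(pmf, g=lambda x: x):
    """E(g(X)) = sum of g(x) * P(X = x); g defaults to the identity."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = expected_value(pmf)
    return expected_value(pmf, lambda x: x * x) - mean ** 2


pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mean = expected_value(pmf)
var = variance(pmf)
sd = math.sqrt(var)  # standard deviation is the square root of the variance

print(mean, var, round(sd, 3))
```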
Expected Value and Variance of a
Function
Example 11
§ A discrete random variable 𝑋 has a probability
distribution
𝒙           1      2     3     4
𝑃(𝑋 = 𝑥)   12/25  6/25  4/25  3/25
a) Write down the probability distribution for 𝑌, where
𝑌 = 2𝑋 + 1
b) Find 𝐸(𝑌)
c) Find 𝑉𝑎𝑟(𝑌)
Solution
a) Each value 𝑦 = 2𝑥 + 1 keeps the probability of the corresponding 𝑥:
𝒙            1      2     3     4
𝑦 = 2𝑥 + 1  3      5     7     9
𝑃(𝑌 = 𝑦)    12/25  6/25  4/25  3/25
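b) Find 𝐸(𝑌)
$E(X) = 1 \times \frac{12}{25} + 2 \times \frac{6}{25} + 3 \times \frac{4}{25} + 4 \times \frac{3}{25} = \frac{48}{25}$
Using $E(aX + b) = aE(X) + b$:
$E(Y) = 2 \times E(X) + 1 = 2 \times \frac{48}{25} + \frac{25}{25} = \frac{121}{25}$
Directly from the distribution of 𝑌:
$E(Y) = 3 \times \frac{12}{25} + 5 \times \frac{6}{25} + 7 \times \frac{4}{25} + 9 \times \frac{3}{25} = \frac{121}{25}$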
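c) Find 𝑉𝑎𝑟(𝑌)
$\mathrm{Var}(X) = 1^2 \times \frac{12}{25} + 2^2 \times \frac{6}{25} + 3^2 \times \frac{4}{25} + 4^2 \times \frac{3}{25} - \left(\frac{48}{25}\right)^2 = \frac{24}{5} - \frac{2304}{625} = \frac{696}{625}$
Using $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$:
$\mathrm{Var}(Y) = a^2 \times \mathrm{Var}(X) = 4 \times \frac{696}{625} = \frac{2784}{625}$
Directly from the distribution of 𝑌:
$\mathrm{Var}(Y) = 3^2 \times \frac{12}{25} + 5^2 \times \frac{6}{25} + 7^2 \times \frac{4}{25} + 9^2 \times \frac{3}{25} - \left(\frac{121}{25}\right)^2 = \frac{2784}{625}$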
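Both routes to E(Y) and Var(Y) in Example 11 can be compared numerically. The sketch below is illustrative only: it builds the distribution of Y = aX + b with a = 2 and b = 1, then checks it against the shortcut rules E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X). The helper names are hypothetical.

```python
from fractions import Fraction


def expected_value(pmf):
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    mean = expected_value(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mean ** 2


pmf_x = {1: Fraction(12, 25), 2: Fraction(6, 25), 3: Fraction(4, 25), 4: Fraction(3, 25)}
a, b = 2, 1

# Distribution of Y = aX + b: same probabilities, transformed values.
pmf_y = {a * x + b: p for x, p in pmf_x.items()}

# Route 1: work directly with the distribution of Y.
mean_y_direct = expected_value(pmf_y)
var_y_direct = variance(pmf_y)

# Route 2: use the rules for a linear function of X.
mean_y_rule = a * expected_value(pmf_x) + b
var_y_rule = a ** 2 * variance(pmf_x)

print(mean_y_direct, mean_y_rule)
print(var_y_direct, var_y_rule)
```

Note that the constant b shifts the mean but has no effect on the variance.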
Question
56789
Data are coded using 𝑌 = 89
.
The mean of the coded data is 5.1.
The standard deviation of the coded data is 2.5.
Find
a) The mean of the original data
b) The standard deviation of the original data
Solution
a) The mean of the original data
Solution
b) The standard deviation of the original data
Lecture outline
Continuous Random Variables
(CRV)
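• $\text{Probability} = \dfrac{\text{frequency}}{\text{total frequency}} = \dfrac{\frac{1}{k} \times \text{area under the curve}}{\frac{1}{k} \times \text{total area}} = \dfrac{\text{area under the curve}}{\text{total area}}$
• If the total area is 1, then probability is the area under
the curve: $P(a < X < b) = \int_a^b f(x)\,dx$, where
$f(x)$ is the equation of the curve
• $\text{total area} = k \times \text{total frequency}$, so the total area is 1 when
$k = \dfrac{1}{\text{total frequency}}$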
Revisit: Discrete random variables
• We have been studying discrete random variables
(drv’s) and binomial, Poisson and geometric
distributions.
• We saw that, because the variable X was discrete, X
could only take specific values in a range.
• Also, we could express the probabilities using a
function, called a probability function.
• We used the summation notation, ∑ 𝑃(𝑋 = 𝑥), to
sum the probabilities.
• The sum of all of the probabilities equalled one.
CRV
• Continuous random variables can take any value within an interval.
• As with drv’s, we can express the probabilities for
crv’s using a function, called a probability density
function (pdf).
• Again as with drv’s, the total probability for a crv is
equal to one; for a crv this is the total area under the pdf.
• We use integration to add up the
probabilities, since the Sigma (summation) method only works for
discrete values of X.
Interpretation of p.d.f. graphs
• 𝑓 𝑥 is a p.d.f.
3. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
Example
The random variable 𝑋 has probability density function:
a) Sketch 𝑓(𝑥).
b) Find the value of 𝑘.
Solution Notice that 𝑓(2) = 𝑘
using either the first expression or
the second one.
This is because this pdf is
continuous at 𝑥 = 2.
Total area $= k + \frac{1}{2} \times 2 \times 4k = 5k$
Since the total area must equal 1, $k = \frac{1}{5}$
Alternative solution: You can also
use definite integrals to find the
total area
Example
The random variable 𝑋 has the probability density
function:
$f(x) = \begin{cases} \frac{3x^2}{8}, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
Relationship between 𝑭(𝒙) and 𝒇(𝒙)
• We can find 𝐹(𝑥) by integrating 𝑓(𝑥)
• We can find 𝑓(𝑥) by differentiating 𝐹(𝑥).
Find
a) F(2.5), b) F(x).
(Although the question doesn’t say ‘continuous
random variable’, this is implied by the terminology
probability density function. Remember, drv’s have
probability functions.)
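Solution
a) $F(2.5) = \int_{-\infty}^{2.5} f(t)\,dt = \int_{-\infty}^{1} f(t)\,dt + \int_{1}^{2.5} f(t)\,dt = 0 + \int_{1}^{2.5} \frac{1}{4}t\,dt = \left[\frac{t^2}{8}\right]_{1}^{2.5} = \frac{1}{8}\left(2.5^2 - 1^2\right) = \frac{21}{32}$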
Solution
b) To find F(x) we need to use the definition:
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
First, find 𝐹(𝑥) for 𝑥 < 1 and 𝑥 > 3, outside of the given
range.
For 𝑥 < 1,
$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{x} 0\,dt = 0$
For 𝑥 > 3,
$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{3} f(t)\,dt + \int_{3}^{x} 0\,dt = 1$
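b) For 1 ≤ 𝑥 ≤ 3, two methods are available (Method 1 and Method 2):
Definite integral: $F(x) = \int_{1}^{x} \frac{t}{4}\,dt = \left[\frac{t^2}{8}\right]_{1}^{x} = \frac{x^2}{8} - \frac{1}{8}$
Indefinite integral: $F(x) = \frac{x^2}{8} + C$; since $F(3) = 1$, $\frac{9}{8} + C = 1$, so $C = -\frac{1}{8}$
So, the cdf 𝐹(𝑥) is
$F(x) = \begin{cases} 0, & x < 1 \\ \frac{x^2}{8} - \frac{1}{8}, & 1 \le x \le 3 \\ 1, & x > 3 \end{cases}$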
Caution!
Don’t forget to define 𝐹(𝑥) over the whole range (−∞, ∞)
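The piecewise cdf can be checked numerically. The sketch below assumes the pdf used in the working, f(t) = t/4 on 1 ≤ t ≤ 3 and zero elsewhere, and compares the closed-form F(x) with a simple midpoint-rule integral; `integrate` is a hypothetical helper written for this illustration, not a library routine.

```python
def f(t):
    """pdf assumed from the working: t/4 on [1, 3], zero elsewhere."""
    return t / 4 if 1 <= t <= 3 else 0.0


def F(x):
    """cdf defined over the whole range (-inf, inf)."""
    if x < 1:
        return 0.0
    if x > 3:
        return 1.0
    return x ** 2 / 8 - 1 / 8


def integrate(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


print(F(2.5), integrate(f, 1, 2.5))   # both close to 21/32
print(F(0), F(3), F(10))              # 0 below the range, 1 at and above the top
```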
Example
The random variable X has probability density function:
Find F(x).
This time f(x) has two parts defined and so we need to
consider the two parts separately. We can use either
method 1 or 2 as above; it is personal preference.
Solution
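From the range given in the pdf, we know that 𝐹(𝑥) = 0 for
𝑥 ≤ 1 and 𝐹(𝑥) = 1 for 𝑥 > 4.
For 1 ≤ 𝑥 ≤ 2,
$F(x) = \int_{-\infty}^{1} f(t)\,dt + \int_{1}^{x} f(t)\,dt = F(1) + \int_{1}^{x} 0.2\,dt = 0 + 0.2(x - 1) = 0.2(x - 1)$
For 2 ≤ 𝑥 ≤ 4,
$F(x) = \int_{-\infty}^{2} f(t)\,dt + \int_{2}^{x} f(t)\,dt = F(2) + \int_{2}^{x} f(t)\,dt$
$= 0.2(2 - 1) + \int_{2}^{x} \frac{1}{5}(t - 1)\,dt = 0.2 + \left[\frac{t^2}{10} - \frac{t}{5}\right]_{2}^{x}$
$= 0.2 + \frac{x^2}{10} - \frac{x}{5} - (0.4 - 0.4) = \frac{x^2}{10} - \frac{x}{5} + \frac{1}{5}$
Again, we must write out 𝐹(𝑥) in full across the whole
range (−∞, ∞):
$F(x) = \begin{cases} 0, & x < 1 \\ 0.2(x - 1), & 1 \le x \le 2 \\ \frac{x^2}{10} - \frac{x}{5} + \frac{1}{5}, & 2 < x \le 4 \\ 1, & x > 4 \end{cases}$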
Example
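The random variable X has cumulative distribution
function:
a) $P(X \le 1.5) = F(1.5) = \frac{1}{5} \times 1.5 + \frac{3}{20} \times 1.5^2 = 0.6375$
b) $P(0.5 \le X \le 1.5) = F(1.5) - F(0.5) = 0.6375 - \left(\frac{1}{5} \times 0.5 + \frac{3}{20} \times 0.5^2\right) = 0.5$
c) $P(X = 1) = 0$
d) $\frac{d}{dx}F(x) = \frac{1}{5} + \frac{3}{10}x$, so
$f(x) = \begin{cases} \frac{1}{5} + \frac{3}{10}x, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$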
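The four parts can be reproduced with a short sketch. It assumes the cdf implied by the working, F(x) = x/5 + 3x²/20 on 0 ≤ x ≤ 2 (0 below and 1 above that range), and approximates the derivative with a central difference; the function names are illustrative.

```python
def F(x):
    """cdf assumed from the working above."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x / 5 + 3 * x ** 2 / 20


def f_exact(x):
    """pdf obtained by differentiating F on 0 <= x <= 2."""
    return 1 / 5 + 3 * x / 10 if 0 <= x <= 2 else 0.0


def f_numeric(x, h=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


p_a = F(1.5)             # a) P(X <= 1.5)
p_b = F(1.5) - F(0.5)    # b) P(0.5 <= X <= 1.5)
p_c = F(1.0) - F(1.0)    # c) a single point carries no probability for a crv

print(p_a, p_b, p_c)
print(f_exact(1.0), f_numeric(1.0))
```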
Mean and variance of crv
If X is a continuous random variable with pdf f(x), then the
mean 𝜇 and the variance 𝜎² are:
$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
$\sigma^2 = \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$
Exercise
A random variable Y has a probability density function:
Find
a) E(Y), b) Var(Y)
Solution
a) We will use the definition $E(Y) = \int_{-\infty}^{\infty} y f(y)\,dy$
However, since f(y) = 0 for all values of y outside [1, 3],
yf(y) is also zero for all values of y outside [1, 3].
So, we can integrate using y = 1 and y = 3 as the limits
of integration.
$E(Y) = \int_{1}^{3} y \cdot \frac{y}{4}\,dy = \int_{1}^{3} \frac{y^2}{4}\,dy = \left[\frac{y^3}{12}\right]_{1}^{3} = \frac{3^3}{12} - \frac{1^3}{12} = \frac{13}{6}$
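b) We will use the definition
$\mathrm{Var}(Y) = \int_{-\infty}^{\infty} y^2 f(y)\,dy - [E(Y)]^2$
As before, we can integrate using y = 1 and y = 3 as the
limits of integration.
$\mathrm{Var}(Y) = \int_{1}^{3} y^2 \cdot \frac{y}{4}\,dy - \left(\frac{13}{6}\right)^2 = \int_{1}^{3} \frac{y^3}{4}\,dy - \left(\frac{13}{6}\right)^2 = \left[\frac{y^4}{16}\right]_{1}^{3} - \left(\frac{13}{6}\right)^2 = \frac{3^4}{16} - \frac{1^4}{16} - \left(\frac{13}{6}\right)^2$
$= \frac{81}{16} - \frac{1}{16} - \frac{169}{36} = \frac{11}{36}$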
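The integrals above can be checked exactly. The sketch assumes the pdf implied by the working, f(y) = y/4 on 1 ≤ y ≤ 3; because this pdf is a single power of y, each integral reduces to the power rule. `integrate_monomial` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction


def integrate_monomial(coeff, power, a, b):
    """Exact integral of coeff * y**power over [a, b] using the power rule."""
    n = power + 1
    return coeff * (Fraction(b) ** n - Fraction(a) ** n) / n


# Assumed pdf: f(y) = (1/4) * y**1 on [1, 3].
coeff, power, a, b = Fraction(1, 4), 1, 1, 3

total_area = integrate_monomial(coeff, power, a, b)        # integral of f(y)
mean = integrate_monomial(coeff, power + 1, a, b)          # integral of y * f(y)
second_moment = integrate_monomial(coeff, power + 2, a, b) # integral of y^2 * f(y)
var = second_moment - mean ** 2

print(total_area, mean, second_moment, var)
```

Checking that the total area is 1 first is a cheap way to catch a wrong pdf or wrong limits before computing the mean and variance.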
Finding E(aX+b) and Var(aX+b)
Exercise (continued)
A random variable Y has a probability density function:
Find
a) E(Y), b) Var(Y), c) E(2Y-3), d)Var(2Y-3)
Solution
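A sketch of parts (c) and (d), assuming the values E(Y) = 13/6 and Var(Y) = 11/36 from parts (a) and (b) and the standard rules E(aY + b) = aE(Y) + b and Var(aY + b) = a²Var(Y), with a = 2 and b = −3:

```latex
\begin{aligned}
E(2Y - 3) &= 2E(Y) - 3 = 2 \times \frac{13}{6} - 3 = \frac{13}{3} - \frac{9}{3} = \frac{4}{3} \\
\mathrm{Var}(2Y - 3) &= 2^2 \times \mathrm{Var}(Y) = 4 \times \frac{11}{36} = \frac{11}{9}
\end{aligned}
```

The constant −3 changes the mean but does not appear in the variance.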
Example
A random variable X has probability density function:
Find
a) E(X)
b) P(X > μ)
Solution
We can easily find the value of x at the maximum point
by using x = –b/2a, or by completing the square.
b) Now P(X > μ) = P(X > 1)
Alternative solution for (b)
P(X > μ) = P(X > 1)
you could integrate f(x) between
x = 1 and x = 3:
Example 9
The random variable X has probability density function:
Find a) E(X)
Solution
We will use the same definitions for E(X) and Var(X),
but we must integrate each part of f(x) separately using
the appropriate limits of integration from the intervals
given.
1st integral
2nd integral
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
1st integral
2nd integral
Exercise
The random variable X has probability density function:
Find b) Var(X)
Solution
$\mathrm{Var}(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - [E(X)]^2$
1st integral
2nd integral
Mode
Example 11
The random variables X and Y have probability density
functions f(x) and f(y) respectively:
Solution
a) First, sketch f(x).
Median and quartiles
Example 12
A continuous random variable X has probability density
function:
Find
a) The cdf of X, b) the median value of X
Solution
b) To find the median, Q2, we will use the fact that
𝐹(𝑄2) = 0.5.
The letter 𝑚 is often used for Q2.
So substitute 𝑥 = 𝑚 into 𝐹(𝑥) and set it equal to 0.5:
$2m^2 - m^4 = 0.5$
Rearranging and simplifying we get:
$2m^4 - 4m^2 + 1 = 0$
This is a quadratic in $m^2$:
$m^2 = \frac{4 \pm \sqrt{16 - 8}}{4} = 1 \pm \frac{\sqrt{2}}{2}$
Since $0 \le m \le 1$, we take the negative sign:
$m = \sqrt{1 - \frac{\sqrt{2}}{2}} = 0.541$ (3 d.p.)
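The median can also be found numerically, which is useful when F(m) = 0.5 has no convenient closed form. The sketch below assumes the cdf from the working, F(x) = 2x² − x⁴ on 0 ≤ x ≤ 1, and uses bisection; the `bisect` function here is a hand-written illustration, not the standard-library module of the same name.

```python
import math


def F(x):
    """cdf on 0 <= x <= 1, taken from the working above."""
    return 2 * x ** 2 - x ** 4


def bisect(func, target, lo, hi, tol=1e-10):
    """Solve func(x) = target on [lo, hi], assuming func is increasing there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return (lo + hi) / 2


median = bisect(F, 0.5, 0.0, 1.0)
closed_form = math.sqrt(1 - math.sqrt(2) / 2)

print(round(median, 3), round(closed_form, 3))
```

The same function finds the quartiles by changing the target to 0.25 or 0.75.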