Lecture notes 2
2 Random variables
2.1 Random variable and expectation
2.2 Two random variables
2.3 Averages and their convergence
2 Random variables
This chapter provides a short introduction to the concept of a random variable; this concept is
central in data science and will be important in the rest of the course. Indeed, we will want to
forecast using ideas from probability theory, both to design efficient prediction tools and to
quantify the uncertainties associated with them.
• We can describe the relation between X and an already-defined variable. For example, if we
say that X = − log(U) where U is Uniform in [0, 1], we have fully described X, and we can
simulate it on the computer, at least if we assume that we can simulate Uniform variables
on the computer.
Figure 2.1: Two random variables: a biased coin flip (left), and a Uniform variable in [0, 1]. Here,
b − a is equal to d − c, so the variable is equally likely to land in (a, b) or in (c, d).
• We can describe the probability density function of X, denoted by fX, a function from the
state space to R+, the set of positive reals. For example, the density function of a Uniform(0, 1)
variable is x ↦ 1(x ∈ (0, 1)), and the density function of the Exponential(1) variable is
x ↦ exp(−x). From the probability density function, we can compute the probability that the
random variable lands in any subset of the state space. For example, for any a < b in the state
space, we have P(X ∈ (a, b)) = ∫_a^b fX(x) dx; in words, the area under the curve of fX between
a and b. Here we represent probabilities as integrals of fX, which is convenient because integrals
can be computed by certain humans and all computers.
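Since such probabilities are integrals of fX, a computer can approximate them numerically. A minimal sketch (the function name is illustrative), using a midpoint Riemann sum for the Exponential(1) density, whose integral over (a, b) is exp(−a) − exp(−b) in closed form:

```python
import math

def prob_interval(density, a, b, n=100_000):
    """Approximate P(X in (a, b)) as the integral of the density over (a, b),
    using a midpoint Riemann sum with n slices."""
    h = (b - a) / n
    return sum(density(a + (i + 0.5) * h) for i in range(n)) * h

# Exponential(1) density on the positive reals.
f_exp = lambda x: math.exp(-x)

approx = prob_interval(f_exp, 1.0, 2.0)
exact = math.exp(-1.0) - math.exp(-2.0)   # closed-form value of the integral
```

The two numbers agree to many decimal places, which is the sense in which "all computers" can compute such integrals.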
We can prove that X = − log(U), where U is Uniform in [0, 1], is indeed a random variable with
density function x ↦ exp(−x), via the change of variable formula.
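Before the proof, here is a quick empirical check by simulation (a sketch; variable names are illustrative): samples of − log(U) should behave like Exponential(1) samples, whose mean is 1.

```python
import math
import random

random.seed(0)

# Simulate X = -log(U) with U ~ Uniform(0, 1), many times.
samples = [-math.log(random.random()) for _ in range(100_000)]

# If X is indeed Exponential(1), its mean is 1, so the empirical
# average of the samples should be close to 1.
empirical_mean = sum(samples) / len(samples)
```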
Change of variable. Suppose that X is a random variable with density fX, and that s is a
one-to-one function. Define Y = s(X). What is the density fY?

(change of variable)  fY(y) = fX(s⁻¹(y)) × |d s⁻¹/dy (y)|.  (2.1)

In the above equation, s⁻¹ is the inverse of s: s⁻¹(y) is the number such that s(s⁻¹(y)) = y. The
last term on the right is the absolute value of the derivative of s⁻¹ evaluated at y.
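We can sanity-check (2.1) numerically on this very example: with s(u) = − log(u), we have s⁻¹(y) = exp(−y), fU = 1 on (0, 1), and the formula should return exp(−y). A sketch, approximating the derivative of s⁻¹ by a finite difference (names are illustrative):

```python
import math

def f_uniform(u):
    """Density of Uniform(0, 1)."""
    return 1.0 if 0.0 < u < 1.0 else 0.0

def density_of_transform(s_inv, f_x, y, h=1e-6):
    """Change-of-variable formula: f_Y(y) = f_X(s_inv(y)) * |d s_inv / dy|,
    with the derivative approximated by a central finite difference."""
    deriv = (s_inv(y + h) - s_inv(y - h)) / (2.0 * h)
    return f_x(s_inv(y)) * abs(deriv)

# Y = s(U) with s(u) = -log(u), so s_inv(y) = exp(-y).
s_inv = lambda y: math.exp(-y)

# At y = 1.5 the formula should give exp(-1.5), the Exponential(1) density.
fy = density_of_transform(s_inv, f_uniform, 1.5)
```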
To summarise, we can think of random variables as mathematical objects (more precisely, as
functions), and as concrete objects that we can simulate on the computer (see Figure 2.2 on the
Exponential variable). Both views are useful.
Figure 2.2: Two views on the Exponential random variable. Left: generate X = − log(U) with
U ∼ Uniform(0, 1). Right: probability density function x ↦ exp(−x).
Properties. Once we have defined a variable, we can look at its properties. The expectation of
a random variable X, E[X], also known as its mean, is defined by E[X] = ∫_{−∞}^{+∞} x fX(x) dx, where
fX is the probability density function of X. The integral is not always well defined, so the
expectation may fail to exist: this is the case for example with a Cauchy variable, which has
density x ↦ π⁻¹(1 + x²)⁻¹. Similarly we can define E[h(X)] = ∫_{−∞}^{+∞} h(x) fX(x) dx for a
function h, for example E[X²] = ∫ x² fX(x) dx. It is helpful to know that expectations are defined
as integrals. But this does not mean that we have to resort to (scary!) integral calculations every
time we meet an expectation, thanks to fundamental properties recalled below (linearity, and later
on, the tower property).
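For instance, for the Exponential(1) variable, E[X²] = ∫ x² exp(−x) dx = 2, and a computer can approximate this integral directly. A sketch (the truncation point and names are illustrative):

```python
import math

def expectation(h_func, density, lo, hi, n=200_000):
    """Approximate E[h(X)] = integral of h(x) f_X(x) dx with a midpoint
    Riemann sum over [lo, hi]; the tail beyond hi is truncated."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        total += h_func(x) * density(x) * step
    return total

# Second moment of Exponential(1): the exact value is 2.
# The upper bound 50.0 is an assumed truncation point; the neglected
# tail is of order exp(-50) and thus negligible here.
second_moment = expectation(lambda x: x * x, lambda x: math.exp(-x), 0.0, 50.0)
```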
The expectation is linear, which means the following: if X is equal (with probability one) to a
constant real number c, then E[X] = c, and for any pair of random variables X and Y and any two
real numbers a and b,

(linearity)  E[aX + bY] = aE[X] + bE[Y].
Using linearity, we can find the expectation of a Uniform(a, b) variable X from the expectation of
a Uniform(0, 1) variable U , which is equal to 1/2. Since X is a + (b − a)U , E[X] = a + (b − a)/2.
We can also use the linearity of expectation to show that E[(X − E[X])²] is equal to E[X²] − E[X]².
This is called the variance of X and denoted by V[X]. The variance satisfies, for instance,
V[aX + b] = a²V[X] for any real numbers a and b.
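The identity E[(X − E[X])²] = E[X²] − E[X]² can also be seen numerically: computed on the same simulated sample, the two sides agree up to floating-point error. A sketch with Uniform(0, 1) samples, whose variance is 1/12:

```python
import random

random.seed(1)

# A simulated sample from Uniform(0, 1); its variance is 1/12.
xs = [random.random() for _ in range(50_000)]
n = len(xs)
mean = sum(xs) / n

# Left-hand side: E[(X - E[X])^2], estimated on the sample.
lhs = sum((x - mean) ** 2 for x in xs) / n
# Right-hand side: E[X^2] - E[X]^2, estimated on the same sample.
rhs = sum(x * x for x in xs) / n - mean ** 2
```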
Figure 2.3: Two views on the Normal random variable. Left: normalized histogram of generated
values simulated using (2.5). Right: probability density function ϕ defined in (2.4).
(Normal pdf)  ϕ : x ↦ (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)),  (2.4)
for x ∈ R. This is one way of defining it. It is “standard” if µ = 0 and σ = 1. Alternatively, we can
describe a Normal variable by its relation to the (already defined) Uniform variable. Suppose that
U1 and U2 are independent (more on independence below) Uniform(0, 1) variables. Define Z as

(Box–Muller)  Z = √(−2 log U1) cos(2π U2).  (2.5)

Then Z follows a standard Normal distribution.
Independence. We now consider a pair of real-valued random variables X and Y . We can put
them in a “vector” V = (X, Y ), in which case, we have one random vector of length 2. The random
vector, like any other random variable, can be described with its probability density function fX,Y.
The variables X and Y are independent if the joint density factorizes:

(independence)  fX,Y(x, y) = fX(x) fY(y).  (2.6)

For example the density of a vector (U1, U2) of independent Uniform(0, 1) variables is the function
(x, y) ↦ 1(x ∈ (0, 1)) × 1(y ∈ (0, 1)). To simulate independent variables, we just simulate them
separately, without sharing of information, recycling or communication between the two simulators.
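As an illustration, we can simulate two Uniform(0, 1) variables separately, with no shared information, and combine them into a Normal draw via the Box–Muller construction (a sketch; we assume this is the construction relating Normals to Uniforms mentioned above):

```python
import math
import random

random.seed(2)

def standard_normal():
    """Draw two independent Uniform(0, 1) variables, simulated separately,
    and combine them into one Normal(0, 1) draw (Box-Muller)."""
    u1 = 1.0 - random.random()   # in (0, 1], avoids log(0)
    u2 = random.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

zs = [standard_normal() for _ in range(100_000)]
mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs) - mean ** 2
```

The empirical mean and variance come out close to 0 and 1, as expected for a standard Normal.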
Conditioning. Consider again a pair of real-valued random variables X and Y , not necessarily
independent. Conditioning on X, or more precisely on the event {X = x}, means considering
the distribution of the random variables while fixing the value of X to some x ∈ R. As if the
random variable X was solidified into the value x. For example, suppose that X ∼ Normal(0, 1)
and Y = aX + b. Then if we “condition” on {X = x}, Y is equal to ax + b.
Conditioning on {X = x}, the variable Y might have a different distribution than if we do not
condition on {X = x}. We denote the conditional density of Y given {X = x} by y 7→ fY |X (y|x).
For any joint distribution fX,Y , we can always write
(general factorization) fX,Y (x, y) = fX (x) fY |X (y|x) = fY (y) fX|Y (x|y) . (2.7)
From this we get the expression fY |X (y|x) = fX,Y (x, y) /fX (x), so we can obtain an expression for
the conditional density using the joint and the marginal densities.
Note that, since (2.7) is always true, independence as in (2.6) implies that fY|X(y|x) = fY(y) and
that fX|Y(x|y) = fX(x). This corresponds to our intuitive idea of “independence”:
knowing the value of X does not change our understanding of the distribution of Y, and vice versa.
The notion of independence is symmetric in X and Y .
Tower property. We write E[Y |X] or E[Y |X = x] for the expectation of the random variable Y
when we condition on, or know, the value of X, say x. We have the following very useful property,
for any pair of random variables X and Y:

(tower property)  E[Y] = E[E[Y |X]].  (2.8)

For example, suppose that X and W are two independent Normal(0, 1) variables and that Y =
aX + bW. If we condition on the event {X = x}, then Y becomes ax + bW and its distribution
is Normal(ax, b²), thus E[Y |X] = aX. On the other hand, unconditionally the expectation E[Y]
is 0. We can find this by linearity: E[Y] = aE[X] + bE[W] = 0. Or by the tower property: E[Y] =
E[E[Y |X]] = E[aX] = 0.
Products. A useful property of independent variables is that the expectation E[XY] is equal
to the product E[X] E[Y]. Indeed, using the tower property,

E[XY] = E[E[XY |X]] = E[X E[Y |X]] = E[X] E[Y],

where we have used E[Y |X] = E[Y] by independence. Another useful property is that, for any two
functions g and h, if X and Y are independent then g(X) and h(Y ) are independent.
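A quick numerical sanity check of the product rule, with two independently simulated Uniform(0, 1) samples (a sketch; for these, E[X] E[Y] = 1/4):

```python
import random

random.seed(3)

n = 100_000
# Two independent Uniform(0, 1) samples, simulated separately.
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]

e_xy = sum(x * y for x, y in zip(xs, ys)) / n   # estimate of E[XY]
e_x = sum(xs) / n                               # estimate of E[X]
e_y = sum(ys) / n                               # estimate of E[Y]
# For independent variables, E[XY] should match E[X] E[Y] (= 1/4 here).
```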
The covariance of X and Y is defined as

(covariance)  Cov(X, Y) = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X] E[Y].  (2.9)

The second equality can be checked by expanding the product and using the linearity of
expectations. The covariance is not a very intuitive notion, but note that if X and Y are
independent, then Cov(X, Y) = 0, because then E[XY] = E[X] E[Y]. If Cov(X, Y) = 0 we say that
X and Y are uncorrelated. Independent variables are always uncorrelated.
Uncorrelated but dependent. There are plenty of pairs of variables X and Y such that
Cov(X, Y) = 0 and yet X and Y are dependent. Consider X following a symmetric distribution
around 0, such as a centered Normal distribution or a Uniform distribution on [−1, 1].
Define Y as Y = X². Then X brings a lot of information on Y (in fact, X determines Y
completely), so intuitively the two variables are dependent. On the other hand, we can compute
Cov(X, Y) = E[X³] − E[X] E[X²]. Since X is symmetric around 0, we have E[X³] = 0 and
E[X] = 0, so Cov(X, Y) = 0: the variables are uncorrelated. The covariance is also bilinear:

(bilinearity of covariance)  Cov(aX + bY, cW) = ac Cov(X, W) + bc Cov(Y, W).  (2.10)
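The uncorrelated-but-dependent example above is easy to reproduce by simulation (a sketch; the tolerance is illustrative):

```python
import random

random.seed(4)

n = 100_000
# X ~ Uniform(-1, 1) is symmetric around 0; Y = X^2 is determined by X.
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]

mean_x = sum(xs) / n
mean_y = sum(ys) / n
# The empirical covariance is close to 0 despite the strong dependence.
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
```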
Figure 2.4: Bivariate Normal random variable. Left: samples. Right: contours of the probability
density function.
The correlation of X and Y is defined as

(correlation)  Cor(X, Y) = Cov(X, Y) / √(V[X] V[Y]).  (2.11)
Some properties of the correlation are derived from those of the covariance (symmetry, invariance
by shifts). Some properties are specific to the correlation:
• Invariance by scalings: Cor(aX, Y) = Cor(X, Y) for all a > 0 (and Cor(aX, Y) = −Cor(X, Y)
for a < 0). Invariance by shifts and scalings means that the correlation is insensitive to the
units used for X and Y.
• The correlation is always between −1 and +1. Indeed, for any pair of random variables X
and Y with finite first two moments (i.e. E[X²] and E[Y²] are finite), the Cauchy–Schwarz
inequality states that E[XY]² ≤ E[X²] E[Y²]. If we apply this inequality to the variables
X − E[X] and Y − E[Y], we obtain Cov(X, Y)² ≤ V[X] V[Y] and thus Cor(X, Y) ∈ [−1, 1].
Furthermore, the equality holds only if Y = aX + b for some real numbers a and b. Therefore,
we have Cor(X, Y) = 1 (resp. = −1) if and only if Y = aX + b with a > 0 (resp. with a < 0).
Maximally correlated variables are perfectly aligned.
The latter property hints at a limitation of the correlation coefficient: it really only captures linear
associations.
then we can explicitly invert Σ and compute its determinant. After some work, we can write, for
any pair (x1, x2), the joint density fX1,X2(x1, x2) of (X1, X2) ∼ Normal(µ, Σ) as

(1 / (2πσ1σ2 √(1 − ρ²))) exp( −(1 / (2(1 − ρ²))) [ ((x1 − µ1)/σ1)² + ((x2 − µ2)/σ2)² − 2ρ ((x1 − µ1)/σ1) ((x2 − µ2)/σ2) ] ).
Note that if ρ = 0, then the off-diagonal elements of Σ are zero, i.e. Cov(X1 , X2 ) = 0. But also
the joint density factorizes into a product of marginal densities as in (2.6). In that case, X1 and
X2 are independent. So for variables that are jointly Normal, lack of correlation is equivalent to
independence. It is not true for general pairs of random variables.
Average and expectation. One justification for approximating expectations by averages is the
law of large numbers. Assume that E[|X|] = ∫ |x| fX(x) dx < ∞, and that X1, X2, . . . is a
sequence of independent identical copies of X. Then, writing X̄n = n⁻¹ Σ_{t=1}^{n} Xt for the
average,

(law of large numbers)  X̄n −→ E[X] almost surely, as n → ∞.  (2.12)
The convergence “almost sure” or “a.s.” means that P( lim_{n→∞} n⁻¹ Σ_{t=1}^{n} Xt = E[X] ) = 1; in
words: in every experiment where we would generate such a sequence X1, X2, etc., there is an integer
n large enough so that X̄n is close to E[X]. This is called an “asymptotic” result, because it
describes a phenomenon occurring when n → ∞ and it does not say anything about any finite
value of n.
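The law of large numbers is easy to visualize with a running average (a minimal sketch, with X ∼ Uniform(0, 1) so that E[X] = 1/2):

```python
import random

random.seed(5)

# Running averages of independent Uniform(0, 1) draws; E[X] = 1/2.
n = 100_000
running_sum = 0.0
averages = []
for t in range(1, n + 1):
    running_sum += random.random()
    averages.append(running_sum / t)

# Early averages fluctuate; by n = 100_000 the average has settled near 0.5.
final_average = averages[-1]
```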
If we assume more, we get more. For example if we assume that V[X] < ∞, then we have
Chebyshev’s inequality that states that for all ε > 0 and for all n ≥ 1:
(Chebyshev)  P(|X̄n − E[X]| > ε) ≤ V[X] / (nε²).  (2.13)
Accordingly, the probability that X̄n is more than ε away from E[X] goes to zero as 1/n when
n → ∞. But Chebyshev is a non-asymptotic result: it works for all n. Under the same assumption
V[X] < ∞, the Central Limit Theorem is a purely asymptotic result that states
(CLT)  √n (X̄n − E[X]) −→ Normal(0, V[X]) in distribution, as n → ∞.  (2.14)
The convergence is “in distribution”: the random variable on the left becomes more and more like
the random variable on the right of the arrow.
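A sketch of (2.14) in action: repeat the averaging experiment many times and look at the spread of √n(X̄n − E[X]); for Uniform(0, 1) samples, its variance should be close to V[X] = 1/12 (the sample sizes below are illustrative):

```python
import math
import random

random.seed(6)

n = 1_000      # sample size for each average
reps = 5_000   # number of repeated experiments

# For each experiment, record sqrt(n) * (mean - E[X]) with X ~ Uniform(0, 1).
scaled = []
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    scaled.append(math.sqrt(n) * (xbar - 0.5))

# By the CLT, these values look like Normal(0, V[X]) with V[X] = 1/12.
emp_mean = sum(scaled) / reps
emp_var = sum(z * z for z in scaled) / reps - emp_mean ** 2
```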
With the same reasoning, if we replace V[X] by the empirical variance σ̂x² defined as Ĉov(x1:n, x1:n),
and if we replace V[Y] by σ̂y², then from Eq. (2.11) we obtain the empirical correlation as

(empirical correlation)  Ĉor(x1:n, y1:n) = n⁻¹ Σ_{t=1}^{n} (xt − x̄n)(yt − ȳn) / √(σ̂x² σ̂y²)
= Σ_{t=1}^{n} (xt − x̄n)(yt − ȳn) / √( Σ_{t=1}^{n} (xt − x̄n)² · Σ_{t=1}^{n} (yt − ȳn)² ).  (2.16)
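A direct implementation of the empirical correlation (2.16) (a sketch; names are illustrative):

```python
import math

def empirical_correlation(xs, ys):
    """Empirical correlation of two samples of equal length, as in (2.16)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - x_bar) ** 2 for x in xs)
                    * sum((y - y_bar) ** 2 for y in ys))
    return num / den

# Perfectly linearly related samples: correlation +1 (increasing relation).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]
```

On such perfectly aligned samples the function returns 1, matching the discussion of maximal correlation above.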
The ability to approximate “theoretical” quantities such as expectations using samples will be
key in the developments of the next chapters.