
CME 106 - Introduction to Probability and Statistics for Engineers (https://stanford.edu/~shervine/teaching/cme-106)

Probability cheatsheet

By Afshine Amidi (https://twitter.com/afshinea) and Shervine Amidi (https://twitter.com/shervinea)

Introduction to Probability and Combinatorics

❐ Sample space ― The set of all possible outcomes of an experiment is known as the sample space of the experiment and is denoted by S.

❐ Event ― Any subset E of the sample space is known as an event. That is, an event is a set consisting of possible outcomes of the experiment. If the outcome of the experiment is contained in E, then we say that E has occurred.

❐ Axioms of probability ― For each event E, we denote by P(E) the probability of event E occurring.

Axiom 1 ― Every probability is between 0 and 1 included, i.e.:

$$0 \leqslant P(E) \leqslant 1$$

Axiom 2 ― The probability that at least one of the elementary events in the entire sample space will occur is 1, i.e.:

$$P(S) = 1$$

Axiom 3 ― For any sequence of mutually exclusive events $E_1, ..., E_n$, we have:

$$P\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i)$$

❐ Permutation ― A permutation is an arrangement of r objects from a pool of n objects, in a given order. The number of such arrangements is given by P(n, r), defined as:

$$P(n, r) = \frac{n!}{(n - r)!}$$

❐ Combination ― A combination is an arrangement of r objects from a pool of n objects, where the order does not matter. The number of such arrangements is given by C(n, r), defined as:

$$C(n, r) = \frac{P(n, r)}{r!} = \frac{n!}{r!(n - r)!}$$

Remark: we note that for 0 ⩽ r ⩽ n, we have P(n, r) ⩾ C(n, r).
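
As an illustration, both counts can be checked in plain Python (math.perm and math.comb assume Python 3.8+; the values of n and r below are arbitrary):

```python
import math

n, r = 5, 3
# Ordered arrangements of r objects out of n: P(n, r) = n! / (n - r)!
p = math.factorial(n) // math.factorial(n - r)
# Unordered arrangements: C(n, r) = P(n, r) / r!
c = p // math.factorial(r)

print(p, math.perm(n, r))   # 60 60
print(c, math.comb(n, r))   # 10 10
```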

Conditional Probability

❐ Bayes' rule ― For events A and B such that P(B) > 0, we have:

$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$

Remark: we have P(A ∩ B) = P(A)P(B|A) = P(A|B)P(B).
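
As a quick numerical illustration (the probabilities below are made up), with A = "condition present" and B = "test positive":

```python
# Hypothetical numbers: A = "condition present", B = "test positive"
p_a = 0.01          # P(A)
p_b_given_a = 0.95  # P(B|A)
p_b = 0.06          # P(B)

p_a_given_b = p_b_given_a * p_a / p_b   # Bayes' rule
print(round(p_a_given_b, 3))            # 0.158
```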


❐ Partition ― Let $\{A_i, i \in [\![1, n]\!]\}$ be such that for all $i$, $A_i \neq \emptyset$. We say that $\{A_i\}$ is a partition if we have:

$$\forall i \neq j, \; A_i \cap A_j = \emptyset \quad \textrm{ and } \quad \bigcup_{i=1}^{n} A_i = S$$

Remark: for any event B in the sample space, we have $P(B) = \displaystyle\sum_{i=1}^{n} P(B|A_i)P(A_i)$.

❐ Extended form of Bayes' rule ― Let $\{A_i, i \in [\![1, n]\!]\}$ be a partition of the sample space. We have:

$$P(A_k|B) = \frac{P(B|A_k)P(A_k)}{\displaystyle\sum_{i=1}^{n} P(B|A_i)P(A_i)}$$
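
For instance, a hypothetical three-machine setup (defect rates invented for illustration) combines the remark above with the extended rule:

```python
# A_i = "item comes from machine i" (a partition of S), B = "item is defective"
p_a = [0.5, 0.3, 0.2]             # P(A_i), sums to 1
p_b_given_a = [0.01, 0.02, 0.05]  # P(B|A_i), hypothetical defect rates

# Total probability: P(B) = sum_i P(B|A_i) P(A_i)
p_b = sum(pb * pa for pb, pa in zip(p_b_given_a, p_a))

# Extended Bayes' rule: P(A_k|B) for each machine k
posterior = [pb * pa / p_b for pb, pa in zip(p_b_given_a, p_a)]
print(round(p_b, 3), [round(q, 3) for q in posterior])  # 0.021 [0.238, 0.286, 0.476]
```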

❐ Independence ― Two events A and B are independent if and only if we have:

$$P(A \cap B) = P(A)P(B)$$

Random Variables

Definitions

❐ Random variable ― A random variable, often noted X, is a function that maps every element in a sample space to the real line.

❐ Cumulative distribution function (CDF) ― The cumulative distribution function F, which is monotonically non-decreasing and is such that $\underset{x \rightarrow -\infty}{\lim} F(x) = 0$ and $\underset{x \rightarrow +\infty}{\lim} F(x) = 1$, is defined as:

$$F(x) = P(X \leqslant x)$$

Remark: we have $P(a < X \leqslant b) = F(b) - F(a)$.

❐ Probability density function (PDF) ― The probability density function f is the probability that X
takes on values between two adjacent realizations of the random variable.

Relationships involving the PDF and CDF

❐ Discrete case ― Here, X takes discrete values, such as outcomes of coin flips. By noting f and F the PDF and CDF respectively, we have the following relations:

$$F(x) = \sum_{x_i \leqslant x} P(X = x_i) \quad \textrm{ and } \quad f(x_j) = P(X = x_j)$$

On top of that, the PDF is such that:

$$0 \leqslant f(x_j) \leqslant 1 \quad \textrm{ and } \quad \sum_{j} f(x_j) = 1$$

❐ Continuous case ― Here, X takes continuous values, such as the temperature in the room. By noting f and F the PDF and CDF respectively, we have the following relations:

$$F(x) = \int_{-\infty}^{x} f(y) \, dy \quad \textrm{ and } \quad f(x) = \frac{dF}{dx}$$

On top of that, the PDF is such that:

$$f(x) \geqslant 0 \quad \textrm{ and } \quad \int_{-\infty}^{+\infty} f(x) \, dx = 1$$
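
For example, the relation between F and f in the continuous case can be checked numerically; the sketch below uses an Exp(1) density purely as an illustration and assumes scipy is available:

```python
import numpy as np
from scipy import integrate

lam, x = 1.0, 2.0
pdf = lambda y: lam * np.exp(-lam * y)              # f(y) = lambda * e^(-lambda*y) for y >= 0

cdf_numeric, _ = integrate.quad(pdf, 0.0, x)        # F(x) = integral of f from 0 to x
cdf_closed = 1.0 - np.exp(-lam * x)                 # known closed-form CDF of Exp(lambda)
print(round(cdf_numeric, 6), round(cdf_closed, 6))  # both ~ 0.864665
```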

Expectation and Moments of the Distribution

In the following sections, we are going to keep the same notations as before and the formulas will
be explicitly detailed for the discrete (D) and continuous (C) cases.

❐ Expected value ― The expected value of a random variable, also known as the mean value or the first moment, is often noted E[X] or μ and is the value that we would obtain by averaging the results of the experiment infinitely many times. It is computed as follows:

$$(D) \quad E[X] = \sum_{i=1}^{n} x_i f(x_i) \quad \textrm{ and } \quad (C) \quad E[X] = \int_{-\infty}^{+\infty} x f(x) \, dx$$

❐ Generalization of the expected value ― The expected value of a function of a random variable g(X) is computed as follows:

$$(D) \quad E[g(X)] = \sum_{i=1}^{n} g(x_i) f(x_i) \quad \textrm{ and } \quad (C) \quad E[g(X)] = \int_{-\infty}^{+\infty} g(x) f(x) \, dx$$

❐ k-th moment ― The k-th moment, noted $E[X^k]$, is the value of $X^k$ that we expect to observe on average over infinitely many trials. It is computed as follows:

$$(D) \quad E[X^k] = \sum_{i=1}^{n} x_i^k f(x_i) \quad \textrm{ and } \quad (C) \quad E[X^k] = \int_{-\infty}^{+\infty} x^k f(x) \, dx$$

Remark: the k-th moment is a particular case of the previous definition with $g : X \mapsto X^k$.

❐ Variance ― The variance of a random variable, often noted Var(X) or σ², is a measure of the spread of its distribution function. It is determined as follows:

$$\textrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2$$

❐ Standard deviation ― The standard deviation of a random variable, often noted σ, is a measure of the spread of its distribution function which is compatible with the units of the actual random variable. It is determined as follows:

$$\sigma = \sqrt{\textrm{Var}(X)}$$
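
For instance, applying the discrete formulas to a fair six-sided die:

```python
import math

xs = [1, 2, 3, 4, 5, 6]
f = [1 / 6] * 6                                        # pmf of a fair die

mean = sum(x * p for x, p in zip(xs, f))               # E[X]
second_moment = sum(x**2 * p for x, p in zip(xs, f))   # E[X^2]
var = second_moment - mean**2                          # Var(X) = E[X^2] - E[X]^2
sigma = math.sqrt(var)                                 # standard deviation

print(mean, round(var, 4), round(sigma, 4))            # 3.5 2.9167 1.7078
```
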
❐ Characteristic function ― A characteristic function ψ(ω) is derived from a probability density function f(x) and is defined as:

$$(D) \quad \psi(\omega) = \sum_{i=1}^{n} f(x_i) e^{i\omega x_i} \quad \textrm{ and } \quad (C) \quad \psi(\omega) = \int_{-\infty}^{+\infty} f(x) e^{i\omega x} \, dx$$

❐ Euler's formula ― For θ ∈ ℝ, the Euler formula is the name given to the identity:

$$e^{i\theta} = \cos(\theta) + i \sin(\theta)$$


❐ Revisiting the k-th moment ― The k-th moment can also be computed with the characteristic function as follows:

$$E[X^k] = \frac{1}{i^k}\left[\frac{\partial^k \psi}{\partial \omega^k}\right]_{\omega = 0}$$

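This identity can be checked symbolically; the sketch below (sympy assumed available) differentiates the characteristic function of an Exp(λ) variable, which is listed in the table of continuous distributions further down:

```python
import sympy as sp

omega, lam = sp.symbols('omega lambda', positive=True)
psi = 1 / (1 - sp.I * omega / lam)   # characteristic function of Exp(lambda)

k = 2
# E[X^k] = (1 / i^k) * [d^k psi / d omega^k] evaluated at omega = 0
moment = sp.simplify((sp.diff(psi, omega, k) / sp.I**k).subs(omega, 0))
print(moment)                        # 2/lambda**2, i.e. E[X^2] for Exp(lambda)
```
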
❐ Transformation of random variables ― Let the variables X and Y be linked by some function. By noting $f_X$ and $f_Y$ the distribution functions of X and Y respectively, we have:

$$f_Y(y) = f_X(x) \left|\frac{dx}{dy}\right|$$
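
As a sanity check (purely illustrative), one can draw X ∼ U(0, 1), set Y = −ln(X)/λ so that x = e^(−λy), and compare a histogram of Y with the density λe^(−λy) predicted by the formula above:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0
x = rng.uniform(0.0, 1.0, 500_000)   # X ~ U(0, 1), so f_X(x) = 1 on (0, 1)
y = -np.log(x) / lam                 # Y = -ln(X)/lambda, i.e. x = exp(-lambda*y)

# Formula: f_Y(y) = f_X(x) * |dx/dy| = 1 * lambda * exp(-lambda*y)
hist, edges = np.histogram(y, bins=50, range=(0.0, 3.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
predicted = lam * np.exp(-lam * centers)
print(np.max(np.abs(hist - predicted)))   # small (typically below 0.05)
```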

❐ Leibniz integral rule ― Let g be a function of x and potentially c, and a, b boundaries that may depend on c. We have:

$$\frac{\partial}{\partial c}\left(\int_{a}^{b} g(x) \, dx\right) = \frac{\partial b}{\partial c} \cdot g(b) - \frac{\partial a}{\partial c} \cdot g(a) + \int_{a}^{b} \frac{\partial g}{\partial c}(x) \, dx$$
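
The rule can be verified symbolically on a made-up integrand and boundaries (sympy assumed available):

```python
import sympy as sp

c, x = sp.symbols('c x', positive=True)
a, b = c, c**2                   # hypothetical boundaries depending on c
g = sp.exp(c * x)                # hypothetical integrand depending on x and c

lhs = sp.diff(sp.integrate(g, (x, a, b)), c)
rhs = (sp.diff(b, c) * g.subs(x, b) - sp.diff(a, c) * g.subs(x, a)
       + sp.integrate(sp.diff(g, c), (x, a, b)))
print(sp.simplify(lhs - rhs))    # 0
```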

Probability Distributions

❐ Chebyshev's inequality ― Let X be a random variable with expected value μ. For k, σ > 0, we have the following inequality:

$$P(|X - \mu| \geqslant k\sigma) \leqslant \frac{1}{k^2}$$

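A quick simulation (Exp(1) samples, chosen arbitrarily) illustrates that the bound holds, although it is often loose:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(1.0, 1_000_000)   # Exp(1): mu = 1, sigma = 1
mu, sigma = 1.0, 1.0

for k in (1.5, 2.0, 3.0):
    empirical = np.mean(np.abs(x - mu) >= k * sigma)
    print(k, round(float(empirical), 4), "<=", round(1 / k**2, 4))
```
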
❐ Discrete distributions ― Here are the main discrete distributions to have in mind:

| Distribution | $P(X = x)$ | $\psi(\omega)$ | $E[X]$ | $\textrm{Var}(X)$ |
|---|---|---|---|---|
| $X \sim \textrm{B}(n, p)$ | $\binom{n}{x} p^x q^{n-x}$ | $(pe^{i\omega} + q)^n$ | $np$ | $npq$ |
| $X \sim \textrm{Po}(\mu)$ | $\dfrac{\mu^x}{x!} e^{-\mu}$ | $e^{\mu(e^{i\omega} - 1)}$ | $\mu$ | $\mu$ |
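
The moments in this table can be cross-checked with scipy.stats (the parameter values below are arbitrary):

```python
from scipy import stats

n, p, mu = 10, 0.3, 4.0
print(stats.binom.stats(n, p, moments="mv"))    # mean np = 3.0, variance npq = 2.1
print(stats.poisson.stats(mu, moments="mv"))    # mean mu = 4.0, variance mu = 4.0
```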

❐ Continuous distributions ― Here are the main continuous distributions to have in mind:

| Distribution | $f(x)$ | $\psi(\omega)$ | $E[X]$ | $\textrm{Var}(X)$ |
|---|---|---|---|---|
| $X \sim \mathcal{U}(a, b)$ | $\dfrac{1}{b - a}$ | $\dfrac{e^{i\omega b} - e^{i\omega a}}{(b - a) i\omega}$ | $\dfrac{a + b}{2}$ | $\dfrac{(b - a)^2}{12}$ |
| $X \sim \mathcal{N}(\mu, \sigma)$ | $\dfrac{1}{\sqrt{2\pi}\sigma} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$ | $e^{i\omega\mu - \frac{1}{2}\omega^2\sigma^2}$ | $\mu$ | $\sigma^2$ |
| $X \sim \textrm{Exp}(\lambda)$ | $\lambda e^{-\lambda x}$ | $\dfrac{1}{1 - \frac{i\omega}{\lambda}}$ | $\dfrac{1}{\lambda}$ | $\dfrac{1}{\lambda^2}$ |
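
The same kind of check works for the continuous table (note that scipy parametrizes the exponential by scale = 1/λ; the parameters below are arbitrary):

```python
from scipy import stats

a, b = 1.0, 5.0
print(stats.uniform.stats(loc=a, scale=b - a, moments="mv"))  # (a+b)/2 = 3.0, (b-a)^2/12 ~ 1.333
print(stats.norm.stats(loc=2.0, scale=0.5, moments="mv"))     # mu = 2.0, sigma^2 = 0.25
print(stats.expon.stats(scale=1 / 3.0, moments="mv"))         # 1/lambda ~ 0.333, 1/lambda^2 ~ 0.111
```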

Jointly Distributed Random Variables

❐ Joint probability density function ― The joint probability density function of two random variables X and Y, that we note $f_{XY}$, is defined as follows:

$$(D) \quad f_{XY}(x_i, y_j) = P(X = x_i \textrm{ and } Y = y_j)$$

$$(C) \quad f_{XY}(x, y) \, \Delta x \Delta y = P(x \leqslant X \leqslant x + \Delta x \textrm{ and } y \leqslant Y \leqslant y + \Delta y)$$

❐ Marginal density ― We define the marginal density for the variable X as follows:

$$(D) \quad f_X(x_i) = \sum_{j} f_{XY}(x_i, y_j) \quad \textrm{ and } \quad (C) \quad f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y) \, dy$$
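
In the discrete case, the joint pmf can be stored as a table and the marginals obtained by summing out one index; a toy numpy example (values invented for illustration):

```python
import numpy as np

# Toy joint pmf f_XY(x_i, y_j): rows index the x_i, columns index the y_j
f_xy = np.array([[0.10, 0.20, 0.10],
                 [0.05, 0.25, 0.30]])
assert np.isclose(f_xy.sum(), 1.0)

f_x = f_xy.sum(axis=1)   # marginal of X: sum over the y_j
f_y = f_xy.sum(axis=0)   # marginal of Y: sum over the x_i
print(f_x, f_y)          # [0.4 0.6] [0.15 0.45 0.4]
```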

❐ Cumulative distribution ― We define the cumulative distribution $F_{XY}$ as follows:

$$(D) \quad F_{XY}(x, y) = \sum_{x_i \leqslant x} \sum_{y_j \leqslant y} f_{XY}(x_i, y_j) \quad \textrm{ and } \quad (C) \quad F_{XY}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(x', y') \, dx' \, dy'$$

❐ Conditional density ― The conditional density of X with respect to Y, often noted $f_{X|Y}$, is defined as follows:

$$f_{X|Y}(x) = \frac{f_{XY}(x, y)}{f_Y(y)}$$

❐ Independence ― Two random variables X and Y are said to be independent if we have:

$$f_{XY}(x, y) = f_X(x) f_Y(y)$$

❐ Moments of joint distributions ― We define the moments of joint distributions of random variables X and Y as follows:

$$(D) \quad E[X^p Y^q] = \sum_{i} \sum_{j} x_i^p y_j^q f(x_i, y_j) \quad \textrm{ and } \quad (C) \quad E[X^p Y^q] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x^p y^q f(x, y) \, dx \, dy$$

❐ Distribution of a sum of independent random variables ― Let $Y = X_1 + ... + X_n$ with $X_1, ..., X_n$ independent. We have:

$$\psi_Y(\omega) = \prod_{k=1}^{n} \psi_{X_k}(\omega)$$
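
A simulation sketch (two independent Exp(1) variables, chosen arbitrarily) compares the empirical characteristic function of the sum with the product of the individual ones:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x1 = rng.exponential(1.0, n)    # X1 ~ Exp(1)
x2 = rng.exponential(1.0, n)    # X2 ~ Exp(1), independent of X1
y = x1 + x2

def psi_exp(omega, lam=1.0):
    # characteristic function of Exp(lambda): 1 / (1 - i*omega/lambda)
    return 1.0 / (1.0 - 1j * omega / lam)

for omega in (0.5, 1.0, 2.0):
    empirical = np.mean(np.exp(1j * omega * y))   # estimate of psi_Y(omega)
    product = psi_exp(omega) ** 2                 # psi_X1(omega) * psi_X2(omega)
    print(omega, np.round(empirical, 3), np.round(product, 3))
```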

❐ Covariance ― We define the covariance of two random variables X and Y, that we note $\sigma_{XY}^2$ or more commonly Cov(X, Y), as follows:

$$\textrm{Cov}(X, Y) \triangleq \sigma_{XY}^2 = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y$$

❐ Correlation ― By noting $\sigma_X, \sigma_Y$ the standard deviations of X and Y, we define the correlation between the random variables X and Y, noted $\rho_{XY}$, as follows:

$$\rho_{XY} = \frac{\sigma_{XY}^2}{\sigma_X \sigma_Y}$$

Remark 1: we note that for any random variables X, Y, we have $\rho_{XY} \in [-1, 1]$.

Remark 2: if X and Y are independent, then $\rho_{XY} = 0$.
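
Both quantities can be estimated from samples; a small numpy sketch with a made-up linear dependence between X and Y:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 100_000)
y = 0.8 * x + rng.normal(0.0, 0.6, 100_000)   # Y depends linearly on X plus noise

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))        # sample covariance
rho_xy = cov_xy / (x.std() * y.std())                    # sample correlation
print(round(float(cov_xy), 3), round(float(rho_xy), 3))  # both close to 0.8 by construction
print(np.round(np.corrcoef(x, y)[0, 1], 3))              # same correlation via numpy
```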
