
Probability Theory Review

Axiom (Basic Counting Principle)


Suppose we perform two experiments, one after the other. If the first experiment has m1 outcomes and the second experiment has m2 outcomes, then there are m1·m2 outcomes for the two-step experiment.

Definition (Permutation)
An ordering of n distinct objects x1, x2, ..., xn is called a permutation of x1, x2, ..., xn.

Definition (Factorial)
We define 0! = 1 and
n factorial = n! = n(n−1)···(2)(1) for n = 1, 2, 3, ...

Proposition
There are n! permutations of n distinct objects.

Proposition
Suppose that a sequence S of n items has n1 identical items of type 1, n2 identical items of type 2, ..., and nk identical items of type k. Then the number of orderings of S is n!/(n1! n2! ··· nk!).
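The multiset-ordering formula can be checked by brute force on a small example; the sketch below (the helper name `multiset_orderings` is ours, not from the notes) counts distinct orderings both ways.

```python
from itertools import permutations
from math import factorial

def multiset_orderings(counts):
    """n! / (n1! n2! ... nk!) for a multiset with the given type counts."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

# Brute-force check on "AABBC" (two A's, two B's, one C):
# 5! / (2! 2! 1!) = 120 / 4 = 30 distinct orderings.
assert multiset_orderings([2, 2, 1]) == len(set(permutations("AABBC"))) == 30
```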

Definition ( k -combination)
Let x1, x2, ..., xn be n distinct elements. A k-combination of x1, x2, ..., xn is a k-element subset of the set {x1, x2, ..., xn}.

Further, there are C(n, k) = n!/(k!(n−k)!) k-combinations of n distinct elements.
Definition (Binomial Coefficient)
For k ≤ n, we define the binomial coefficient as follows:
C(n, k) = n!/(k!(n−k)!)
For k > n, we set C(n, k) = 0.
Identities

(a) C(n, k) = C(n, n−k)
(b) C(n, k) = C(n−1, k) + C(n−1, k−1)
Theorem (The Binomial Theorem)

n n k n− k n
(x+ y) = ∑   yx , n =1,2,3,...

k= 0  k 
Theorem (Trinomial Theorem)
(x + y + z)^n = Σ_{i,j,k ≥ 0, i+j+k = n} n!/(i! j! k!) x^i y^j z^k
Definition (Sample Space)
The sample space is the set of all possible outcomes. Denoted S.

Definition (Event)
An event is a subset of the sample space.

Definition (Probability Function)


A probability function is a function P : {events} → [0, 1] with the following properties:
(a) For all events E ⊂ S, 0 ≤ P(E) ≤ 1
(b) P(S) = 1
(c) For all mutually exclusive E1, E2, ... ⊂ S (i.e. Ei ∩ Ej = ∅ for all i ≠ j), we have P(∪_{i=1}^{∞} Ei) = Σ_{i=1}^{∞} P(Ei) (additivity)

Proposition
Let S be a finite sample space, and suppose that all outcomes are equally likely. Then, for all E ⊂ S, P(E) = |E|/|S|.

Proposition
If E ⊂ F ⊂ S, then P(E) ≤ P(F).

Proposition

For all events E ⊂ S, we have that P(E^c) = 1 − P(E), where E^c = S \ E is the complement of E.

Proposition
For all events E, F ⊂ S, P(E ∪ F) = P(E) + P(F) − P(EF)

Proposition (General Inclusion-Exclusion Formula)


For any events E1, E2, ..., En ⊂ S,
P(∪_{i=1}^{n} Ei) = Σ_{i=1}^{n} P(Ei) − Σ_{1 ≤ i1 < i2 ≤ n} P(Ei1 Ei2) + Σ_{1 ≤ i1 < i2 < i3 ≤ n} P(Ei1 Ei2 Ei3) − ... + (−1)^(n+1) P(E1 E2 ··· En)
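Inclusion-exclusion can be checked directly with equally likely outcomes; the sketch below (function name `union_prob` is ours) compares the alternating sum against the size of the union.

```python
from itertools import combinations
from fractions import Fraction

def union_prob(events, sample_size):
    """P(E1 ∪ ... ∪ En) via inclusion-exclusion, assuming equally likely outcomes."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = (-1) ** (r + 1)
        for subset in combinations(events, r):
            total += sign * Fraction(len(set.intersection(*subset)), sample_size)
    return total

# Three overlapping events in a 12-point sample space.
E = [{0, 1, 2, 3}, {2, 3, 4, 5}, {0, 5, 6}]
assert union_prob(E, 12) == Fraction(len(E[0] | E[1] | E[2]), 12) == Fraction(7, 12)
```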

Definition (Conditional Probability)


If P(B) > 0, then we define P(A | B), the conditional probability of A given B, by the following:
P(A | B) = P(AB)/P(B)

Proposition (Multiplication Rule)


Let E1, E2, ..., En ⊂ S be events such that P(E1 E2 ··· E(n−1)) > 0. Then,
P(E1 E2 ··· En) = P(E1) P(E2 | E1) P(E3 | E1 E2) ··· P(En | E1 E2 ··· E(n−1))

Proposition
Let A, B ⊂ S be events such that 0 < P(B) < 1. Then,
P(A) = P(A | B) P(B) + P(A | B^c) P(B^c)

Proposition (Generalization of Previous)


Let A, B1, B2, ..., Bn ⊂ S be events such that
(a) 0 < P(Bi) < 1 for all i
(b) Bi ∩ Bj = ∅ for all i ≠ j
(c) ∪_{i=1}^{n} Bi = S.
Then, P(A) = P(A | B1) P(B1) + P(A | B2) P(B2) + ... + P(A | Bn) P(Bn) = Σ_{i=1}^{n} P(A | Bi) P(Bi)

Definition (Partition)
Let B1, B2, ..., Bn ⊂ S be events such that
(a) 0 < P(Bi) < 1 for all i
(b) Bi ∩ Bj = ∅ for all i ≠ j
(c) ∪_{i=1}^{n} Bi = S.
Then {B1, ..., Bn} is called a partition of S.

Proposition (Bayes’ Formula)


Let P(A) > 0 and let B1, ..., Bn be a partition of S with 0 < P(Bi) < 1 for all i. Then,
P(Bj | A) = P(A | Bj) P(Bj) / (P(A | B1) P(B1) + ... + P(A | Bn) P(Bn)).
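Bayes' formula is a one-liner in code; the numbers below are hypothetical diagnostic-test figures chosen only to illustrate the computation, not taken from the notes.

```python
def bayes(prior, likelihood, j):
    """P(Bj | A) from priors P(Bi) and likelihoods P(A | Bi) over a partition."""
    return likelihood[j] * prior[j] / sum(l * p for l, p in zip(likelihood, prior))

# Hypothetical numbers: 1% prevalence (B1 = diseased, B2 = healthy),
# P(positive | diseased) = 0.95, P(positive | healthy) = 0.10.
posterior = bayes([0.01, 0.99], [0.95, 0.10], 0)
assert 0.087 < posterior < 0.088  # a positive test still leaves P(diseased) below 9%
```

The low posterior despite a sensitive test is the classic base-rate effect: the 99% healthy group contributes most of the positive results.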

Definition (Independent)
Events A, B ⊂ S are called independent if P ( AB ) = P ( A) P ( B ) .
Proposition
Let A, B ⊂ S be events such that P ( B ) > 0 . Then, A and B are independent
if and only if P ( A | B ) = P ( A) .

Proposition
If A and B are independent events, then A and B^c are independent.

Definition (Pairwise Independent)


Events A1, A2, ..., An are said to be pairwise independent if P(Ai Aj) = P(Ai) P(Aj) for all i ≠ j.

Definition (Independent)
Events A1, A2, ..., An are said to be independent if P(Ai1 Ai2 ··· Aik) = P(Ai1) P(Ai2) ··· P(Aik) for all 1 ≤ i1 < i2 < ... < ik ≤ n, 2 ≤ k ≤ n.

Definition (Discrete Random Variable)


A discrete random variable is a function X : S → ∆ where ∆ is a countable set.

Definition (Probability Mass Function)


The probability mass function of X is defined as follows: p : ∆ → [0, 1],
p(x) = P({s ∈ S | X(s) = x}) for x ∈ ∆. We abbreviate p(x) = P(X = x).

Definition (Cumulative Distribution Function)


The cumulative distribution function of a real-valued discrete random variable
X is the function F : R →[0,1] defined by
F ( x) = P ( X ≤ x) = P ({s ∈ S | X ( s ) ≤ x})

Fact
Let X have distribution F . Then
(a ) P ( X > x) = 1 − P ( X ≤ x) = 1 − F ( x) for all x ∈ R.
(b ) P ( x < X ≤ y ) = F ( y ) − F ( x ) for all y > x

Definition (Expected Value)


The expectation (expected value) of a random variable X is defined by
E[X] = Σ_{x : p(x) > 0} x p(x)

Proposition
Let X be a discrete random variable with probability mass function p. For any function g : R → R,
E[g(X)] = Σ_{x ∈ D(p)} g(x) p(x)
This is true whenever the sum makes sense.

Definition (Variance)
If X is a random variable with a finite mean µ , we define the variance of X
by:
Var ( X ) = E[( X − µ) 2 ] .

Proposition
Var(X) = E[X²] − (E[X])²

Proposition
For all a, b ∈ R, Var(aX + b) = a² Var(X)

Definition (Standard Deviation)


The standard deviation of a random variable X is defined by
SD(X) = √Var(X)

Definition (Bernoulli Random Variable)


A random variable is called a Bernoulli random variable if its probability mass
function is given by: p(0) =1 − p and p (1) = p for some 0 ≤ p ≤1 .

Definition (Binomial Random Variable)


A random variable with the following probability mass function is called a Binomial random variable with parameters n and p:
p(k) = P(X = k) = C(n, k) p^k (1 − p)^(n−k) for all k = 0, 1, ..., n.
Proposition
If X ~ Binomial(n, p), then E[X] = np and Var(X) = np(1 − p)
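The mean and variance formulas can be confirmed by summing the pmf directly; a minimal sketch with arbitrary illustrative parameters n = 10, p = 0.3:

```python
from math import comb, isclose

n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k**2 * pk for k, pk in enumerate(pmf)) - mean**2

assert isclose(sum(pmf), 1.0)           # the pmf sums to 1
assert isclose(mean, n * p)             # E[X] = np
assert isclose(var, n * p * (1 - p))    # Var(X) = np(1 - p)
```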

Definition (Poisson Random Variable)


A Poisson random variable approximates a Binomial(n, p) random variable with n very large and p small, with mean λ = np, λ > 0.

Proposition
Let Xn ~ Binomial(n, λ/n). Then, for all k ≥ 0, lim_{n→∞} P(Xn = k) = e^(−λ) λ^k / k!
Fact
p(k) = e^(−λ) λ^k / k! for k = 0, 1, 2, ... defines a probability mass function.
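The Poisson limit above is visible numerically: with λ = np held fixed, the Binomial(n, λ/n) pmf approaches the Poisson(λ) pmf as n grows. A quick sketch (parameter values are illustrative):

```python
from math import comb, exp, factorial

def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(lam, k):
    return exp(-lam) * lam**k / factorial(k)

# The approximation error at k = 3 shrinks as n increases with lam = 2 fixed.
lam, k = 2.0, 3
errors = [abs(binom_pmf(n, lam / n, k) - poisson_pmf(lam, k)) for n in (10, 100, 1000)]
assert errors[0] > errors[1] > errors[2] and errors[2] < 1e-3
```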

Definition (Poisson( λ )-distributed)


Let λ > 0. A random variable with the probability mass function given by
p(k) = e^(−λ) λ^k / k!
is called Poisson(λ)-distributed.

Proposition
If X ~ Poisson(λ), then E[X] = Var(X) = λ.

Fact
p(k) = (1 − p)^(k−1) p, k ≥ 1 defines a probability mass function.

Definition (Geometric(p)-distributed)
Let p ∈ (0, 1). A random variable with the probability mass function
p(k) = (1 − p)^(k−1) p, k ≥ 1
is called geometric(p)-distributed with parameter p.

Proposition
If X ~ geometric(p), then E[X] = 1/p and Var(X) = (1 − p)/p².
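These moments can be checked by truncating the infinite sum; the cutoff 200 below is an illustrative choice that makes the tail negligible for p = 0.4.

```python
from math import isclose

p = 0.4
ks = range(1, 200)  # truncate the infinite sum; the tail beyond this is negligible
pmf = [(1 - p) ** (k - 1) * p for k in ks]

mean = sum(k * pk for k, pk in zip(ks, pmf))
var = sum(k**2 * pk for k, pk in zip(ks, pmf)) - mean**2

assert isclose(mean, 1 / p)             # E[X] = 1/p = 2.5
assert isclose(var, (1 - p) / p**2)     # Var(X) = (1-p)/p^2 = 3.75
```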

Definition (Continuous Random Variable)


We say X : S → R is a continuous random variable if there exists a function f : R → [0, ∞) with the following properties:
(a) f(x) ≥ 0 for all x ∈ R.
(b) ∫_{−∞}^{∞} f(x) dx = 1
(c) P(X ∈ A) = ∫_A f(x) dx for all intervals A ⊂ R

Definition (Density Function)


The function f defined above is called the density function of X .

Fact
Let X be a continuous random variable. Then, P( X = a ) = 0 for all a ∈ R.

Definition (Uniform( α, β )-Distributed)


A random variable is uniformly distributed on the interval (α, β) if its density function is given by
f(x) = 1/(β − α) for x ∈ (α, β), and f(x) = 0 otherwise.

Proposition
Let X be a continuous random variable. Its cumulative distribution function F
has the following properties:
(a) F is increasing
(b) lim_{x→−∞} F(x) = 0, lim_{x→∞} F(x) = 1
(c) F′(x) = f(x), where f is the density of X.

Definition (Expectation)
We define the expectation of a continuous random variable X with density f by
E[X] = ∫_{−∞}^{∞} x f(x) dx, whenever this integral makes sense.

Proposition
Let X be a continuous random variable with density function f_X, and let g : R → R. Then
E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx, whenever the integral makes sense.

Lemma
For a non-negative random variable Y, E[Y] = ∫_0^∞ P(Y > y) dy.

Corollary
For all a, b ∈ R, E[aX + b] = aE[X] + b

Definition (Exponential ( λ )-Distributed)


Let λ > 0. A random variable is said to be exponential(λ)-distributed if it has the following density function:
f(x) = λ e^(−λx) for x ≥ 0, and f(x) = 0 otherwise.
Proposition
1 1
Let X ~ exponential (λ) . Then, E [ X ] = , Var ( X ) = .
λ λ2

Proposition (Memoryless Property)


Let X be exponential (λ) -distributed. Then, for all s, t > 0 ,
P( X > s + t | X > t ) = P( X > s )
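The memoryless property follows from the exponential survival function P(X > x) = e^(−λx); a minimal sketch with arbitrary illustrative values of λ, s, t:

```python
from math import exp, isclose

lam, s, t = 0.5, 2.0, 3.0

def tail(x):
    """Survival function P(X > x) = e^(-lam x) for X ~ exponential(lam)."""
    return exp(-lam * x)

# P(X > s + t | X > t) = P(X > s + t) / P(X > t), which collapses to P(X > s).
assert isclose(tail(s + t) / tail(t), tail(s))
```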

Theorem
Let X be a continuous random variable with density f X . Let g be a
continuous, strictly monotone function. Then the random variable Y = g ( X ) has
the following density:
f_Y(y) = f_X(g^(−1)(y)) |d/dy g^(−1)(y)| if y = g(x) for some x, and f_Y(y) = 0 otherwise.


Definition
We say that X is normally distributed with parameters µ and σ 2 if the density
is given by the following:
f(x) = (1/√(2πσ²)) e^(−(1/2)((x−µ)/σ)²), x ∈ R.

Fact
f(x) = (1/√(2πσ²)) e^(−(1/2)((x−µ)/σ)²), x ∈ R, is a probability density.

Theorem (de Moivre and Laplace)


Let X 1 , X 2 ,... be independent random variables with P( X i = 1) = p and
P ( X i = 0) = 1 − p . Let S n = X 1 + ... + X n . Then,
lim_{n→∞} P(a ≤ (Sn − np)/√(np(1 − p)) ≤ b) = ∫_a^b (1/√(2π)) e^(−x²/2) dx

Proposition
Let X ~ N(µ, σ²). Then, E[X] = µ, Var(X) = σ².

Definition (Moment-Generating Function)


Let X be a random variable. The moment-generating function of X is defined
by
M(t) = E[e^(tX)] for t ∈ R.

Theorem
Suppose M (t ) is finite on some open interval containing the origin. Then,
(a ) E [ X ] = M ' ( 0)
(b ) For all k ≥ 1 , E [ X k ] = M (k ) (0)
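The moment-extraction property can be checked numerically with finite differences; the sketch below uses the Poisson(λ) MGF, M(t) = exp(λ(e^t − 1)), with an illustrative λ = 3.

```python
from math import exp, isclose

lam = 3.0

def mgf(t):
    """MGF of Poisson(lam): E[e^(tX)] = exp(lam (e^t - 1))."""
    return exp(lam * (exp(t) - 1))

# Central finite differences approximate the derivatives of M at 0.
h = 1e-5
m1 = (mgf(h) - mgf(-h)) / (2 * h)             # ≈ M'(0)  = E[X]
m2 = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h**2   # ≈ M''(0) = E[X^2]

assert isclose(m1, lam, rel_tol=1e-6)           # E[X] = lam
assert isclose(m2, lam + lam**2, rel_tol=1e-4)  # E[X^2] = Var + mean^2 = lam + lam^2
```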

Definition (Joint Cumulative Distribution Function)


The joint-cumulative distribution function of random variables X and Y is
defined by F : R × R → [ 0,1] :
F ( a , b) = P ( X ≤ a , Y ≤ b )

Definition (Joint Probability Mass Function)


If X and Y are discrete random variables, then we define their joint probability
mass function by:
p ( x, y ) = P ( X = x , Y = y )

Definition (Marginal Distribution)


Given p(x, y), the marginal distributions of X and Y are given by:
p_X(x) = Σ_{y ∈ ∆} p(x, y)
p_Y(y) = Σ_{x ∈ ∆} p(x, y)

Definition (Jointly Continuous)


We say that X and Y are jointly continuous if there is a function f : R × R → [0, ∞) such that
P(X ∈ A, Y ∈ B) = ∫_A ∫_B f(x, y) dy dx
for all intervals A, B ⊂ R.

Definition (Joint Probability Density Function)


The function f defined above is called the joint probability density function of
X and Y .

Fact
If X and Y are jointly continuous, then both X and Y are continuous
random variables.

Definition (Independent)
Two random variables X and Y are called independent if for all intervals
A, B ⊂ R ,
P ( X ∈ A, Y ∈ B ) = P ( X ∈ A) P (Y ∈ B )
Proposition
(a ) Discrete random variables X and Y are independent if and only if
p ( x, y ) = p X ( x) pY ( y ) for all x, y .
(b ) Random variables X and Y with a joint density function f ( x, y ) are
independent if and only if f ( x, y ) = f X ( x) f Y ( y ) for all x, y .

Proposition
Let X and Y be two independent, continuous random variables with densities
f X , f Y . Then, X + Y is a continuous random variable with density function:

f_{X+Y}(a) = ∫_{−∞}^{∞} f_X(a − y) f_Y(y) dy

Proposition
Let X1 and X2 be normally distributed with parameters (µ1, σ1²) and (µ2, σ2²) respectively. If X1 and X2 are independent, then X1 + X2 ~ N(µ1 + µ2, σ1² + σ2²).

Proposition
If X and Y are independent and Poisson-distributed with parameters λ and
µ respectively, then X + Y ~ Poisson ( λ + µ ) .
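The closure of the Poisson family under independent sums can be verified by convolving the two pmfs, per the discrete analogue of the convolution formula above; λ and µ values here are illustrative.

```python
from math import exp, factorial, isclose

def poisson_pmf(lam, k):
    return exp(-lam) * lam**k / factorial(k)

lam, mu = 1.5, 2.5
# P(X + Y = n) = sum over k of P(X = k) P(Y = n - k); it matches Poisson(lam + mu).
for n in range(10):
    conv = sum(poisson_pmf(lam, k) * poisson_pmf(mu, n - k) for k in range(n + 1))
    assert isclose(conv, poisson_pmf(lam + mu, n))
```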
