
SOR1211 – Probability

Elementary Probability:
1. The set of all possible outcomes of a statistical experiment is called the sample space and is
represented by S.

2. Each outcome in a sample space is called an element.

S := { element_1, element_2, …, element_n }

3. Sample spaces with a large or infinite number of sample points are best described by a statement or rule method.

S := { x | statement on x }

That is, S is the set of all x such that x satisfies the statement.

4. An event is a subset of a sample space. That subset represents all of the elements for which the event is true. The probability measure, denoted by P, assigns a probability to each such subset.

5. The complement of an event A with respect to S is the subset of all elements of S that are not in A. The complement of A is denoted by A'.

6. The σ-algebra refers to a collection of subsets to which a probability can be assigned.

7. The intersection of two events A and B, denoted by A ∩ B, is the event containing all the
elements common to A and B.

8. Two events A and B are mutually exclusive, or disjoint, if A ∩ B = ∅, that is, if A and B have no elements in common.

9. The union of two events A and B, denoted by A ∪ B, is the event containing all the elements that belong to A or B or both. That is, if A = {a, b, c} and B = {b, c, d, e}, then A ∪ B = {a, b, c, d, e}.

10. If an operation can be performed in n_1 ways, and if for each of these ways a second can be performed in n_2 ways, then the two operations can be performed in n_1 n_2 ways. The generalized multiplication rule extends this to k operations, which can be performed in n_1 n_2 … n_k ways.

11. If the operation of selecting r elements from n is carried out with replacement, then the number of sample points is;

|S| = n^r

12. If the operation is carried out without replacement, then;

|S| = n × (n − 1) × … × (n − r + 1)

13. A permutation is an arrangement of all or part of a set of objects. The number of permutations of n objects is;

n(n − 1)(n − 2) … (1) = n!

14. In general, n distinct objects taken r at a time can be arranged in n(n − 1)(n − 2) … (n − r + 1) ways.

nPr = n! / (n − r)!

15. The number of circular permutations of n objects is (n − 1)!

16. The number of distinct permutations of n objects of which n_1 are of one kind, n_2 of a second kind, …, and n_k of a k-th kind is;

n! / (n_1! n_2! … n_k!)

17. The number of ways of partitioning a set of n objects into r cells with n_1 elements in the first cell, n_2 elements in the second, and so forth is;

C(n; n_1, n_2, …, n_r) = n! / (n_1! n_2! … n_r!)

whereby n_1 + n_2 + … + n_r = n

18. The number of combinations of n distinct objects taken r at a time is;

nCr = n! / (r! (n − r)!)
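
As a quick numeric check of rules 11–18, here is a minimal Python sketch (the values n = 10, r = 3 and the word STATISTICS are illustrative assumptions, not part of the notes):

```python
import math

n, r = 10, 3  # illustrative values

# Rule 11: ordered selection with replacement gives n^r sample points.
print(n ** r)                 # 1000

# Rules 12/14: ordered selection without replacement gives nPr = n!/(n-r)!.
print(math.perm(n, r))        # 720

# Rule 15: circular permutations of n objects: (n-1)!.
print(math.factorial(n - 1))  # 362880

# Rule 18: unordered selection gives nCr = n!/(r!(n-r)!).
print(math.comb(n, r))        # 120

# Rule 16: distinct permutations with repeated kinds, e.g. the letters of
# "STATISTICS" (3 S's, 3 T's, 2 I's, 1 A, 1 C): n!/(n1! n2! ... nk!).
counts = [3, 3, 2, 1, 1]
denom = 1
for c in counts:
    denom *= math.factorial(c)
print(math.factorial(sum(counts)) // denom)  # 50400
```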

19. The probability of an event A is the sum of the weights of all sample points in A. Therefore;

0 ≤ P(A) ≤ 1, P(∅) = 0 and P(S) = 1

20. If A_1, A_2, A_3, … is a sequence of mutually exclusive events, then;

P(A_1 ∪ A_2 ∪ A_3 ∪ …) = P(A_1) + P(A_2) + P(A_3) + …

21. If an experiment can result in any one of N different equally likely outcomes, and if exactly n of these outcomes correspond to event A, then the probability of event A is;

P(A) = n / N

22. If A and B are two events, then;

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

23. If A and B are mutually exclusive, then;

P(A ∪ B) = P(A) + P(B)

24. For three events A, B and C;

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)

25. If A and A' are complementary events, then;

P(A) + P(A') = 1

26. The probability of an event B occurring when it is known that some event A has occurred is called conditional probability;

P(B|A) = P(A ∩ B) / P(A)

27. If A and B are independent, then P(B|A) = P(B) and P(A|B) = P(A).

28. If in an experiment the events A and B can both occur, then;

P(A ∩ B) = P(A) P(B|A)

provided P(A) > 0.

This is the multiplication or product rule.

29. Two events A and B are independent ⇔ P(A ∩ B) = P(A) P(B).

30. A collection of events A = {A_1, …, A_n} is mutually independent if for any subset A_{i1}, …, A_{ik} of A, with k ≤ n, we have;

P(A_{i1} ∩ … ∩ A_{ik}) = P(A_{i1}) … P(A_{ik})

31. If the events B_1, B_2, …, B_k constitute a partition of the sample space S such that P(B_i) ≠ 0 for i = 1, 2, …, k, then for any event A of S;

P(A) = ∑_{i=1}^{k} P(B_i ∩ A) = ∑_{i=1}^{k} P(B_i) P(A|B_i)

This is the theorem of total probability or the rule of elimination.

32. Bayes' Theorem: If the events B_1, B_2, …, B_k constitute a partition of the sample space S such that P(B_i) ≠ 0 for i = 1, 2, …, k, then for any event A in S such that P(A) ≠ 0;

P(B_r|A) = P(B_r ∩ A) / ∑_{i=1}^{k} P(B_i ∩ A) = P(B_r) P(A|B_r) / ∑_{i=1}^{k} P(B_i) P(A|B_i)
Proof:

P(B_r|A) = P(B_r ∩ A) / P(A) = P(B_r) P(A|B_r) / P(A)

Since B_1, B_2, …, B_k constitute a partition then;

B_1 ∪ B_2 ∪ … ∪ B_k = S and B_i ∩ B_j = ∅ for i ≠ j.

Let A be any event in S.

A = S ∩ A
A = (B_1 ∪ B_2 ∪ … ∪ B_k) ∩ A
A = (B_1 ∩ A) ∪ (B_2 ∩ A) ∪ … ∪ (B_k ∩ A)

Since (B_1 ∩ A), (B_2 ∩ A), …, (B_k ∩ A) are mutually exclusive then;

P(A) = P(B_1 ∩ A) + P(B_2 ∩ A) + … + P(B_k ∩ A)

P(A) = ∑_{i=1}^{k} P(B_i ∩ A) = ∑_{i=1}^{k} P(B_i) P(A|B_i)

Hence;

P(B_r|A) = P(B_r) P(A|B_r) / ∑_{i=1}^{k} P(B_i) P(A|B_i)
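
A minimal numeric sketch of the theorem of total probability and Bayes' theorem; the scenario and all numbers (three machines with the stated output shares and defect rates) are invented for illustration:

```python
# Invented scenario: machines B1, B2, B3 produce 30%, 45%, 25% of output,
# with defect rates 2%, 3%, 2%. A is the event "a random item is defective".
prior = [0.30, 0.45, 0.25]       # P(B_i), a partition of S
likelihood = [0.02, 0.03, 0.02]  # P(A|B_i)

# Theorem of total probability: P(A) = sum over i of P(B_i) P(A|B_i).
p_a = sum(p * l for p, l in zip(prior, likelihood))
print(p_a)  # 0.0245

# Bayes' theorem: P(B_r|A) = P(B_r) P(A|B_r) / P(A).
posterior = [p * l / p_a for p, l in zip(prior, likelihood)]
print(posterior)       # e.g. P(B2|A) is about 0.551
print(sum(posterior))  # the posteriors sum to 1
```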

33. Further proofs, assuming the events A, B and C are mutually independent:

P(A' ∩ B ∩ C) = P(B ∩ C) − P(A ∩ B ∩ C)

= P(B) P(C) − P(A) P(B) P(C)

= P(B) P(C) [1 − P(A)]

= P(A') P(B) P(C)

P(A ∩ B' ∩ C') = P(A) − P(A ∩ B) − P(A ∩ C) + P(A ∩ B ∩ C)

= P(A) − P(A) P(B) − P(A) P(C) + P(A) P(B) P(C)

= P(A) [1 − P(B)] [1 − P(C)]

= P(A) P(B') P(C')

P(A ∩ B') = P(A) − P(A ∩ B)

= P(A) − P(A) P(B)

= P(A) [1 − P(B)]

= P(A) P(B')

P(A' ∩ B') = 1 − P(A) − P(B) + P(A ∩ B)

= 1 − P(A) − P(B) + P(A) P(B)

= [1 − P(A)] − P(B) [1 − P(A)]

= [1 − P(A)] [1 − P(B)]

= P(A') P(B')

Random Variables:
1. If each point in a sample space is assigned a numerical value denoted by x, these values are, of course, random quantities determined by the outcome of the experiment. A random variable, denoted by X, is a function that associates a real number with each element in the sample space. The random variable for which 0 and 1 are chosen to describe the two possible values is called a Bernoulli random variable.

2. If a sample space contains a finite number of possibilities or an unending sequence with as many elements as there are whole numbers, it is called a discrete sample space.

3. If a sample space contains an infinite number of possibilities equal to the number of points on
a line segment, it is called a continuous sample space.

4. Frequently, it is convenient to represent all the probabilities of a random variable X by a formula. The set of ordered pairs (x, f(x)) is a probability function, probability mass function, or probability distribution of the discrete random variable X if, for each possible outcome x;

 f(x) ≥ 0
 P(X = x) = f(x)
 ∑_x f(x) = 1

5. The cumulative distribution function F(x) of a discrete random variable X with probability distribution f(x) is;

F(x) = P(X ≤ x) = ∑_{t ≤ x} f(t)

for −∞ < x < ∞.

The cumulative distribution function is a monotone nondecreasing function defined not only for the values assumed by the given random variable but for all real numbers.

The probability distribution is better visualised graphically by means of a probability histogram.
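
A small sketch of a probability mass function and its cumulative distribution function; the fair six-sided die is an illustrative assumption:

```python
# pmf of a fair die: f(x) = P(X = x) = 1/6 for x = 1..6.
f = {x: 1 / 6 for x in range(1, 7)}

assert all(p >= 0 for p in f.values())   # f(x) >= 0
assert abs(sum(f.values()) - 1) < 1e-12  # the f(x) sum to 1

def F(x):
    """Cumulative distribution: F(x) = P(X <= x) = sum of f(t) for t <= x."""
    return sum(p for t, p in f.items() if t <= x)

print(F(3))    # 0.5
print(F(2.5))  # ~0.333 -- F is defined for all real x, not just the support
print(F(10))   # 1.0, and F never decreases
```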

6. In dealing with continuous variables, the function f(x) is referred to as the probability density function (pdf). For a continuous random variable X, defined over the set of real numbers;

 f(x) ≥ 0 for all x ∈ R
 ∫_{−∞}^{∞} f(x) dx = 1
 P(a < X < b) = ∫_a^b f(x) dx

7. The cumulative distribution function F(x) of a continuous random variable X with density function f(x) is;

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt

for −∞ < x < ∞.

Mathematical Expectation and Variance:

1. The average value, known as the mean of the random variable X or the mean of the probability distribution of X, can be written as μ_X or simply as μ when it is clear to which random variable we refer. It is common practice for the term mean to be interchanged with mathematical expectation, denoted by E(X).

2. Let X be a random variable with probability distribution f(x). The mean, or expected value, of a discrete X is;

μ = E(X) = ∑_x x f(x)

For a continuous X;

μ = E(X) = ∫_{−∞}^{∞} x f(x) dx

3. The most important measure of variability of a random variable X is obtained by taking the expectation of g(X) = (X − μ)^2. The quantity is referred to as the variance of the random variable X or the variance of the probability distribution of X and is denoted by Var(X) or the symbol σ_X^2, or simply by σ^2 when it is clear to which random variable we refer.

Let X be a random variable with probability distribution f(x) and mean μ. The variance of a discrete X is;

σ^2 = E[(X − μ)^2] = ∑_x (x − μ)^2 f(x)

For a continuous X;

σ^2 = E[(X − μ)^2] = ∫_{−∞}^{∞} (x − μ)^2 f(x) dx

The positive square root of the variance (σ) is called the standard deviation of X.

4. The variance of a random variable X is;

σ^2 = E(X^2) − μ^2
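
A sketch verifying the shortcut σ^2 = E(X^2) − μ^2 against the definition on a small discrete distribution (the pmf values are invented):

```python
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}  # invented pmf, sums to 1

mu = sum(x * p for x, p in pmf.items())                   # E(X) = 1.7
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]
ex2 = sum(x ** 2 * p for x, p in pmf.items())             # E(X^2)

print(var_def)        # ~0.81, from the definition
print(ex2 - mu ** 2)  # ~0.81, from the shortcut formula
```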

5. If a and b are constants, then;

E(aX + b) = ∑_x (ax + b) P(X = x)

E(aX + b) = a ∑_x x P(X = x) + b ∑_x P(X = x)

E(aX + b) = a E(X) + b

Setting a = 0 ⇒ E(b) = b

Setting b = 0 ⇒ E(aX) = a E(X)

E(aX^2 + b) = ∑_x (ax^2 + b) P(X = x)

E(aX^2 + b) = a ∑_x x^2 P(X = x) + b ∑_x P(X = x)

E(aX^2 + b) = a E(X^2) + b

E(b − aX^2) = ∑_x (b − ax^2) P(X = x)

E(b − aX^2) = b ∑_x P(X = x) − a ∑_x x^2 P(X = x)

E(b − aX^2) = b − a E(X^2)

6. The expected value of the sum or difference of two or more functions of a random variable X
is the sum or difference of the expected values of the functions. That is;

E[g(X) ± h(X)] = E[g(X)] ± E[h(X)]

7. If a and b are constants, then;

Var(aX + b) = E[(aX + b)^2] − [E(aX + b)]^2

Var(aX + b) = E[a^2 X^2 + 2abX + b^2] − [a E(X) + b]^2

Var(aX + b) = a^2 E[X^2] − a^2 [E(X)]^2

Var(aX + b) = a^2 Var(X)

Var(aX^2 + b) = E[(aX^2 + b)^2] − [E(aX^2 + b)]^2

Var(aX^2 + b) = E[a^2 X^4 + 2abX^2 + b^2] − [a E(X^2) + b]^2

Var(aX^2 + b) = a^2 E(X^4) − a^2 [E(X^2)]^2

Var(aX^2 + b) = a^2 Var(X^2)

Var(b − aX^2) = E[(b − aX^2)^2] − [E(b − aX^2)]^2

Var(b − aX^2) = E[b^2 − 2abX^2 + a^2 X^4] − [b − a E(X^2)]^2

Var(b − aX^2) = b^2 − 2ab E(X^2) + a^2 E(X^4) − b^2 + 2ab E(X^2) − a^2 [E(X^2)]^2

Var(b − aX^2) = a^2 Var(X^2)

Binomial and Multinomial Distributions:

1. The Bernoulli process consists of a series of trials, aptly called Bernoulli trials, whereby each trial has two possible outcomes, a success or a failure. The probability of a success is denoted by p and all trials are independent.

2. The number X of successes in n Bernoulli trials is called a binomial random variable. The probability distribution of this discrete random variable is called the binomial distribution, denoted by b(x; n, p).

3. The probability distribution of the binomial random variable X, the number of successes in n independent trials, is;

b(x; n, p) = nCx p^x q^{n−x}

whereby x = 0, 1, 2, 3, … n

4. The mean and variance of the binomial distribution b(x; n, p) are μ = np and σ^2 = npq respectively.
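
A sketch of the binomial pmf that also checks μ = np and σ^2 = npq numerically; n = 10 and p = 0.3 are illustrative choices:

```python
import math

def binomial_pmf(x, n, p):
    """b(x; n, p) = nCx * p^x * q^(n-x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # illustrative parameters
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(sum(probs))  # ~1.0: the pmf sums to one
mean = sum(x * q for x, q in enumerate(probs))
var = sum(x ** 2 * q for x, q in enumerate(probs)) - mean ** 2
print(mean, n * p)           # both ~3.0: mu = np
print(var, n * p * (1 - p))  # both ~2.1: sigma^2 = npq
```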

5. The binomial experiment becomes a multinomial experiment if we let each trial have more
than two possible outcomes.

6. If a given trial can result in the k outcomes E_1, E_2, …, E_k with probabilities p_1, p_2, …, p_k, then the probability distribution of the random variables X_1, X_2, …, X_k, representing the number of occurrences of E_1, E_2, …, E_k in n independent trials, is;

f(x_1, x_2, …, x_k; p_1, p_2, …, p_k, n) = C(n; x_1, x_2, …, x_k) p_1^{x_1} p_2^{x_2} … p_k^{x_k}

whereby

∑_{i=1}^{k} x_i = n and ∑_{i=1}^{k} p_i = 1
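
A sketch evaluating the multinomial pmf; the three-category probabilities 1/2, 1/3, 1/6 and the counts (3, 2, 1) are illustrative assumptions:

```python
import math

def multinomial_pmf(xs, ps):
    """f(x_1..x_k; p_1..p_k, n) = C(n; x_1,...,x_k) * p_1^x_1 * ... * p_k^x_k."""
    n = sum(xs)                      # the x_i must sum to n
    coeff = math.factorial(n)
    for x in xs:
        coeff //= math.factorial(x)  # n! / (x_1! x_2! ... x_k!)
    prob = float(coeff)
    for x, p in zip(xs, ps):
        prob *= p ** x
    return prob

# P of exactly 3, 2, 1 outcomes in the three categories over n = 6 trials.
print(multinomial_pmf([3, 2, 1], [1/2, 1/3, 1/6]))  # ~0.1389
```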

Expectation Derivation:

E(X(X − 1)) = ∑_{x=0}^{n} x(x − 1) nCx p^x (1 − p)^{n−x}

= n(n − 1) p^2 ∑_{x=2}^{n} [(n − 2)! / ((x − 2)! (n − x)!)] p^{x−2} (1 − p)^{n−x}

= n(n − 1) p^2 (p + 1 − p)^{n−2}

= n(n − 1) p^2

E(X) = np

Variance Derivation:

E(X^2) = E(X(X − 1)) + E(X) = n(n − 1)p^2 + np = n^2 p^2 − np^2 + np

Var(X) = E(X^2) − [E(X)]^2 = n^2 p^2 − np^2 + np − (np)^2

Var(X) = np − np^2

Var(X) = np(1 − p)

Geometric Distribution:

1. If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the probability distribution of the random variable X, the number of the trial on which the first success occurs, is;

g(x; p) = p q^{x−1}

whereby x = 1, 2, 3, … Equivalently, since μ = 1/p, g(x; p) = (1/μ)(1 − 1/μ)^{x−1}.

2. The mean and variance of a random variable following the geometric distribution are;

μ = 1/p and σ^2 = (1 − p)/p^2

Expectation Derivation:

E(X) = ∑_{x=1}^{∞} x p (1 − p)^{x−1}

= p ∑_{x=1}^{∞} x (1 − p)^{x−1}

Since ∑_{x=1}^{∞} x y^{x−1} = ∂/∂y (1 − y)^{−1} = (1 − y)^{−2} for |y| < 1, evaluating at y = 1 − p;

= p (1 − (1 − p))^{−2} = p / p^2

E(X) = 1/p

Variance Derivation:

E(X(X − 1)) = ∑_{x=1}^{∞} x(x − 1) p (1 − p)^{x−1}

= p(1 − p) ∑_{x=2}^{∞} x(x − 1)(1 − p)^{x−2}

Since ∑_{x=2}^{∞} x(x − 1) y^{x−2} = ∂²/∂y² (1 − y)^{−1} = 2(1 − y)^{−3}, evaluating at y = 1 − p;

= p(1 − p) · 2/p^3

= 2(1 − p)/p^2

Var(X) = E(X(X − 1)) + E(X) − [E(X)]^2

Var(X) = 2(1 − p)/p^2 + 1/p − 1/p^2

Var(X) = (1 − p)/p^2
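
A simulation sketch checking μ = 1/p and σ^2 = (1 − p)/p^2; the value p = 0.25, the seed, and the sample size are arbitrary choices:

```python
import random

random.seed(1)
p, trials = 0.25, 200_000  # arbitrary parameter and sample size

def first_success(p):
    """Number of the trial on which the first success occurs."""
    x = 1
    while random.random() >= p:  # failure with probability q = 1 - p
        x += 1
    return x

samples = [first_success(p) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials

print(mean, 1 / p)            # both ~4:  mu = 1/p
print(var, (1 - p) / p ** 2)  # both ~12: sigma^2 = (1-p)/p^2
```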

Poisson Distribution and the Poisson Process:


1. Experiments yielding numerical values of a random variable X , the number of outcomes
occurring during a given time interval or in a specified region, are called Poisson
experiments.

2. The Poisson process is defined by the following properties. The number of outcomes occurring in one time interval or specified region of space is independent of the number that occur in any other disjoint time interval or region. The probability that a single outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region and does not depend on the number of outcomes occurring outside this time interval or region. The probability that more than one outcome will occur in such a short time interval or fall in such a small region is negligible.

3. The number X of outcomes occurring during a Poisson experiment is called a Poisson random variable, and its probability distribution, denoted by p(x; λt), is called the Poisson distribution. The mean number of outcomes is computed from μ = λt, whereby t is the specific time interval or region of interest and λ is the rate of occurrence of outcomes.

4. The probability distribution of the Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region denoted by t, is;

p(x; λt) = e^{−λt} (λt)^x / x!

whereby x = 0, 1, 2, …

The corresponding cumulative probability is;

P(r; λt) = ∑_{x=0}^{r} p(x; λt)

5. Both the mean and the variance of the Poisson distribution p(x ; λt) are λt .

6. Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞, p → 0, and np → μ remains constant;

b(x; n, p) → p(x; μ)
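
A sketch of this limit: fixing μ = np = 2 (an arbitrary choice) and letting n grow, the binomial probabilities approach the Poisson ones:

```python
import math

def binom(x, n, p):
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson(x, mu):
    return math.exp(-mu) * mu ** x / math.factorial(x)

mu = 2.0  # held fixed while n grows and p = mu/n shrinks
for n in (10, 100, 1000):
    p = mu / n
    print(n, binom(3, n, p), poisson(3, mu))
# The middle column approaches the Poisson value ~0.1804 as n increases.
```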

Expectation Derivation:

E(X) = ∑_{x=0}^{∞} x e^{−λ} λ^x / x!

= λ e^{−λ} ∑_{x=1}^{∞} λ^{x−1} / (x − 1)!

= λ e^{−λ} e^{λ}

E(X) = λ

Variance Derivation:

E(X(X − 1)) = ∑_{x=0}^{∞} x(x − 1) e^{−λ} λ^x / x!

= λ^2 e^{−λ} ∑_{x=2}^{∞} λ^{x−2} / (x − 2)!

= λ^2 e^{−λ} e^{λ} = λ^2

Var(X) = E(X^2) − [E(X)]^2 = E(X(X − 1)) + E(X) − [E(X)]^2

Var(X) = λ^2 + λ − λ^2

Var(X) = λ

Normal Distribution:
1. The normal distribution is often referred to as the Gaussian distribution, and its density is a bell-shaped curve. A continuous random variable X having the bell-shaped distribution is called a normal random variable.

2. The mathematical equation for the probability distribution of the normal variable depends on the mean and standard deviation. Hence, the density of X is denoted by;

n(x; μ, σ)

3. The density of the normal random variable X, with mean μ and variance σ^2, is;

n(x; μ, σ) = (1 / (σ√(2π))) e^{−(x−μ)^2 / (2σ^2)}

whereby −∞ < x < ∞

4. A normal curve is said to have the following properties. The mode, which is the point on the horizontal axis where the curve is a maximum, occurs at x = μ. The curve is symmetric about a vertical axis through the mean (μ). The curve has its points of inflection at x = μ ± σ; it is concave downward if μ − σ < x < μ + σ and is concave upward otherwise. The normal curve approaches the horizontal axis asymptotically as we proceed in either direction away from the mean. The total area under the curve and above the horizontal axis is equal to 1.

5. The mean and variance of n(x; μ, σ) are μ and σ^2, respectively. Hence, the standard deviation is σ.

6. The distribution of a normal random variable with mean 0 and variance 1 is called a standard
normal distribution.

Expectation Derivation:

E(X − μ) = (1 / (σ√(2π))) ∫_{−∞}^{∞} (x − μ) e^{−(1/2)((x−μ)/σ)^2} dx = 0

since the integrand is an odd function of (x − μ). Hence;

E(X) = μ

Variance Derivation:

Var(X) = E((X − μ)^2)

Substituting y = x − μ;

Var(X) = (1 / (σ√(2π))) ∫_{−∞}^{∞} y^2 e^{−y^2/(2σ^2)} dy

Integrating by parts with u = y and dv = y e^{−y^2/(2σ^2)} dy, so that v = −σ^2 e^{−y^2/(2σ^2)};

Var(X) = [−σ^2 y e^{−y^2/(2σ^2)}]_{−∞}^{∞} / (σ√(2π)) + (σ^2 / (σ√(2π))) ∫_{−∞}^{∞} e^{−y^2/(2σ^2)} dy

Var(X) = 0 + σ^2

Var(X) = σ^2
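
A crude numeric check of the density n(x; μ, σ): its total area is 1 and E[(X − μ)^2] = σ^2. The parameters, grid step and ±10σ integration range below are arbitrary choices:

```python
import math

mu, sigma = 5.0, 2.0  # arbitrary parameters

def density(x):
    """n(x; mu, sigma) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Riemann sum over mu +/- 10 sigma, which captures essentially all the mass.
dx = 0.001
xs = [mu - 10 * sigma + i * dx for i in range(int(20 * sigma / dx))]

print(sum(density(x) * dx for x in xs))                  # ~1.0
print(sum((x - mu) ** 2 * density(x) * dx for x in xs))  # ~4.0 = sigma^2
```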

Exponential Distribution:
1. The exponential distribution is a special case of the gamma distribution. The gamma function is defined as;

Γ(α) = ∫_0^{∞} x^{α−1} e^{−x} dx

for α > 0

2. The continuous random variable X has a gamma distribution, with parameters α and β, if its density function is given by;

f(x; α, β) = (1 / (β^α Γ(α))) x^{α−1} e^{−x/β} for x > 0, and 0 elsewhere

whereby α > 0 and β > 0

3. The mean and variance of the gamma distribution are;

μ = αβ and σ^2 = αβ^2

4. The exponential distribution is the special case α = 1, with density f(x) = λ e^{−λx}, whereby the rate λ = 1/β = 1/μ. The mean and variance of the exponential distribution are;

μ = β and σ^2 = β^2
Mean/Expectation Derivation:

E(X) = ∫_0^{∞} λx e^{−λx} dx

= [−x e^{−λx}]_0^{∞} + ∫_0^{∞} e^{−λx} dx

= 0 + [−e^{−λx}/λ]_0^{∞}

E(X) = 1/λ

Variance Derivation:

E(X^2) = ∫_0^{∞} λx^2 e^{−λx} dx

= [−x^2 e^{−λx}]_0^{∞} + ∫_0^{∞} 2x e^{−λx} dx

= 2/λ^2

Var(X) = E(X^2) − [E(X)]^2

Var(X) = 2/λ^2 − (1/λ)^2

Var(X) = 1/λ^2
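
A numeric sketch of both results for an arbitrary rate λ = 0.5, integrating the density f(x) = λe^{−λx} far into the tail:

```python
import math

lam = 0.5   # arbitrary rate
dx = 0.001
xs = [i * dx for i in range(int((50 / lam) / dx))]  # [0, 100) covers the mass

mean = sum(x * lam * math.exp(-lam * x) * dx for x in xs)
ex2 = sum(x ** 2 * lam * math.exp(-lam * x) * dx for x in xs)

print(mean, 1 / lam)                  # both ~2.0:  E(X) = 1/lambda
print(ex2 - mean ** 2, 1 / lam ** 2)  # both ~4.0:  Var(X) = 1/lambda^2
```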

Normal Approximation to the Binomial Distribution:

For large n, a binomial random variable X with parameters n and p is approximately normally distributed with;

μ = np

σ = √(np(1 − p))
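
A sketch comparing the exact binomial probability with the normal approximation (the parameters n = 100, p = 0.4, the cutoff 45, and the 0.5 continuity correction are illustrative choices):

```python
import math

n, p = 100, 0.4  # illustrative parameters
mu = n * p                          # 40
sigma = math.sqrt(n * p * (1 - p))  # sqrt(24)

# Exact: P(X <= 45) summed from the binomial pmf.
exact = sum(math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(46))

# Approximate: standard normal cdf at z = (45 + 0.5 - mu) / sigma,
# applying the continuity correction of 0.5.
z = (45 + 0.5 - mu) / sigma
approx = 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(exact, approx)  # both ~0.87
```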
