Module 1 (3)
Syllabus
Probability concepts review - Axioms of probability, concepts of random variables, probability mass
function, probability density function, cumulative distribution functions, expectation. Concepts of joint
and multiple random variables; joint, conditional and marginal distributions. Correlation and independence.
Set
A set is a collection of items, called elements.
Random Experiments
Before rolling a die you do not know the result. This is an example of a random experiment. In
particular, a random experiment is a process by which we observe something uncertain. After the
experiment, the result of the random experiment is known.
Here are some examples of random experiments and their sample spaces:
Example
We toss a coin three times and observe the sequence of heads/tails. The sample space here may be
defined as
S={(H,H,H),(H,H,T),(H,T,H),(T,H,H),(H,T,T),(T,H,T),(T,T,H),(T,T,T)}.
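The sample space above can be enumerated programmatically; a minimal sketch using the standard library:

```python
from itertools import product

# Enumerate the sample space of three coin tosses: every head/tail
# sequence of length 3, matching the set S listed above.
S = list(product("HT", repeat=3))
print(len(S))  # 8
print(S[0])    # ('H', 'H', 'H')
```

Each element of `S` is a tuple such as `('H', 'T', 'H')`, one per outcome of the experiment.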
If A and B are events, the event A∪B occurs if at least one of A or B occurs. Similarly, A∩B occurs
if both A and B occur. More generally, if A1,A2,⋯,An are events, then the event A1∪A2∪⋯∪An occurs
if at least one of A1,A2,⋯,An occurs, and the event A1∩A2∩⋯∩An occurs if all of A1,A2,⋯,An occur.
Probability
We assign a probability measure P(A) to an event A. This is a value between 0 and 1 that shows how
likely the event is. If P(A) is close to 0, it is very unlikely that the event A occurs. On the other hand, if
P(A) is close to 1, A is very likely to occur.
The main subject of probability theory is to develop tools and techniques to calculate probabilities of
different events. Probability theory is based on some axioms that act as the foundation for the theory,
so let us state and explain these axioms.
Axioms of Probability:
Axiom 1: For any event A, P(A) ≥ 0.
Axiom 2: The probability of the sample space is P(S) = 1.
Axiom 3: If A1, A2, A3, ⋯ are disjoint (mutually exclusive) events, then
P(A1 ∪ A2 ∪ A3 ∪ ⋯) = P(A1) + P(A2) + P(A3) + ⋯
Example
In a presidential election, there are four candidates. Call them A, B, C, and D. Based on our polling
analysis, we estimate that A has a 20 percent chance of winning the election, while B has a 40 percent
chance of winning. What is the probability that A or B win the election?
Solution
Notice that the events that {A wins}, {B wins}, {C wins}, and {D wins} are disjoint since more than
one of them cannot occur at the same time. For example, if A wins, then B cannot win. From the third
axiom of probability, the probability of the union of two disjoint events is the summation of individual
probabilities. Therefore,
P(A wins or B wins) = P(A wins) + P(B wins)
= 0.2 + 0.4
= 0.6
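The calculation above is a direct application of the third axiom; a quick numeric sketch (the 20% and 40% figures come from the example, while the values for C and D are assumed here purely so the four probabilities sum to 1):

```python
# Win probabilities: A and B from the example; C and D are assumed
# for illustration only.
p = {"A": 0.20, "B": 0.40, "C": 0.25, "D": 0.15}

# The four win events are disjoint, so by the third axiom their
# probabilities simply add.
p_a_or_b = p["A"] + p["B"]
print(round(p_a_or_b, 2))  # 0.6
```

Note the `round` call: adding binary floats like 0.2 and 0.4 gives 0.6000000000000001, a floating-point artifact rather than a probability error.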
Finding Probabilities
Suppose that we are given a random experiment with a sample space S. To find the probability of an
event, there are usually two steps: first, we use the specific information that we have about the random
experiment. Second, we use the probability axioms.
Example
We roll a fair die. What is the probability of the event E = {1, 5}, i.e., that the outcome is 1 or 5?
Solution
Let's first use the specific information that we have about the random experiment. The problem states
that the die is fair, which means that all six possible outcomes are equally likely, i.e.,
P({1})=P({2})=⋯=P({6}).
Now we can use the axioms of probability. In particular, since the events {1},{2},⋯,{6} are disjoint
we can write
1 = P(S)
= P({1} ∪ {2} ∪ ⋯ ∪ {6})
= P({1}) + P({2}) + ⋯ + P({6})
= 6P({1}).
Thus, P({1})=P({2})=⋯=P({6})=1/6.
Again since {1} and {5} are disjoint, we have
P(E)=P({1,5})=P({1})+P({5})=2/6=1/3.
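The same calculation can be sketched in code, using exact fractions to avoid rounding:

```python
from fractions import Fraction

# Fair die: each outcome has probability 1/6, as derived above.
P = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

# E = {1, 5} is a union of disjoint singletons, so P(E) is a sum.
E = {1, 5}
p_E = sum(P[x] for x in E)
print(p_E)  # 1/3
```

`fractions.Fraction` keeps the arithmetic exact, so 1/6 + 1/6 reduces to 1/3 with no floating-point noise.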
A random variable is a mathematical concept that assigns numerical values to the outcomes of a
sample space. There are two types of random variables: discrete and continuous.
A random variable is considered discrete when it takes specific, distinct values
within an interval.
Example: If two unbiased coins are tossed, find the random variable associated with that event.
Solution:
Suppose two unbiased coins are tossed. We define a random variable as a function that maps the sample
space of an experiment to the real numbers. Mathematically, a random variable is expressed as
X: S → R
where S is the sample space. If X takes the values x1, x2, ⋯, xm, its distribution is given by
P(X = xi) = pi, where 1 ≤ i ≤ m
and 0 ≤ pi ≤ 1 for 1 ≤ i ≤ m.
Here, let X be the number of heads. The sample space is S = {HH, HT, TH, TT} and
X = {0, 1, 2}, where m = 3.
P(X = 1) = P(number of heads is 1) = P({HT, TH}) = 1/2 × 1/2 + 1/2 × 1/2 = 1/2.
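The full PMF of the number of heads can be built by enumerating the four equally likely outcomes, a sketch of the two-coin example above:

```python
from itertools import product
from fractions import Fraction

# PMF of the number of heads in two fair coin tosses: each of the
# 4 outcomes (HH, HT, TH, TT) has probability 1/4.
pmf = {}
for outcome in product("HT", repeat=2):
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, 0) + Fraction(1, 4)

print(pmf[1])  # 1/2 -- from the two outcomes HT and TH
```

The result agrees with the hand calculation: P(X=0) = 1/4, P(X=1) = 1/2, P(X=2) = 1/4.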
For example,
suppose a die is thrown and X = the outcome of the die. Here, the sample space is S = {1, 2, 3, 4, 5, 6}.
The output of the function will be:
P(X=1) = 1/6
P(X=2) = 1/6
P(X=3) = 1/6
P(X=4) = 1/6
P(X=5) = 1/6
P(X=6) = 1/6
A random variable X is said to be discrete if it takes on a finite number of values. The probability
function associated with it is called the PMF (Probability Mass Function), with
0 ≤ pi ≤ 1 and ∑ pi = 1.
Example: Suppose X takes the values 0, 1, 2 with P(X = 1) = 0.3 and P(X = 2) = 0.5. Find P(X = 0).
xi        0     1     2
P(X=xi)   p1    0.3   0.5
Since the probabilities must sum to 1,
p1 + 0.3 + 0.5 = 1
p1 = 0.2
Then P(X = 0) is 0.2.
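The missing entry follows from the normalization condition; as a one-line sketch:

```python
# P(X=1) = 0.3 and P(X=2) = 0.5 are given; a PMF must sum to 1,
# so the remaining probability p1 = P(X=0) is the difference.
p1 = 1 - (0.3 + 0.5)
print(round(p1, 1))  # 0.2
```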
A random variable X is said to be continuous if it takes on an infinite number of values. The probability
function associated with it is called the PDF (Probability Density Function).
Example: Find the constant k such that the following is a valid density, and find P(1 ≤ X ≤ 2):
f(x) = kx³ for 0 ≤ x ≤ 3, and f(x) = 0 otherwise.
Solution:
If f is a density function, the total probability must equal 1:
∫₀³ f(x) dx = 1
∫₀³ kx³ dx = 1
k[x⁴/4]₀³ = 1
k(3⁴ − 0⁴)/4 = 1
k(81/4) = 1
k = 4/81
Thus,
P(1 ≤ X ≤ 2) = ∫₁² (4/81)x³ dx = (4/81) × (2⁴ − 1⁴)/4 = (4/81) × 15/4
P(1 ≤ X ≤ 2) = 15/81 = 5/27
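Both results can be sanity-checked numerically; a sketch using a simple midpoint Riemann sum (stdlib only, not a production integrator):

```python
# Numeric check of the density example: f(x) = k*x**3 on [0, 3]
# with k = 4/81.
def integrate(f, a, b, n=100_000):
    # Midpoint Riemann sum over n subintervals.
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

k = 4 / 81
total = integrate(lambda x: k * x**3, 0, 3)   # should be close to 1
p_1_2 = integrate(lambda x: k * x**3, 1, 2)   # should be close to 15/81
print(round(total, 6), round(p_1_2, 6))
```

With 100,000 subintervals the midpoint rule is accurate to far better than 1e-6 for a cubic, so both values match the closed-form answers.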
For any discrete random variable X, where P is the probability of each value, we define its mean as
Mean(μ) = E(X) = ∑ x·P(X = x)
where the sum runs over all values x that X can take.
The variance of a random variable tells us how the random variable is spread about the mean value of the
random variable. The variance of a random variable is calculated using the formula
Var(X) = E(X²) − [E(X)]²
where
E(X²) = ∑ x²·P(X = x)
E(X) = ∑ x·P(X = x)
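As a worked sketch of these formulas, the mean and variance of a fair die:

```python
from fractions import Fraction

# Mean and variance of a fair die, via the formulas above.
values = range(1, 7)
p = Fraction(1, 6)
mean = sum(x * p for x in values)        # E(X)
e_x2 = sum(x**2 * p for x in values)     # E(X^2)
var = e_x2 - mean**2                     # Var(X) = E(X^2) - [E(X)]^2
print(mean, var)  # 7/2 35/12
```

The exact answers E(X) = 7/2 and Var(X) = 35/12 are standard results for a fair six-sided die.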
Now, for any new random variable Y defined as a function of X, i.e. Y = g(X), the
cumulative distribution function of Y is
F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y)
For a random variable, its probability distribution is described by a probability function: the
probability that a random variable X takes the value x is denoted f(x) = P(X = x). Commonly used
probability distributions include:
Binomial Distribution
Poisson Distribution
Bernoulli’s Distribution
Exponential Distribution
Normal Distribution
Example 1: Find the mean value for the continuous random variable f(x) = x², 1 ≤ x ≤ 3.
Solution:
Given,
f(x) = x²
1 ≤ x ≤ 3
E(X) = ∫₁³ x·f(x) dx = ∫₁³ x³ dx
E(X) = [x⁴/4]₁³
E(X) = (81 − 1)/4 = 80/4 = 20
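A numeric sketch confirming Example 1, again with a simple midpoint Riemann sum:

```python
# Check of Example 1: E(X) = integral of x * x^2 over [1, 3] = 20.
def integrate(f, a, b, n=100_000):
    # Midpoint Riemann sum over n subintervals.
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * x**2, 1, 3)
print(round(mean, 3))  # 20.0
```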
Example 2: Find the mean value for the continuous random variable f(x) = eˣ, 1 ≤ x ≤ 3.
Solution:
Given,
f(x) = eˣ
1 ≤ x ≤ 3
E(X) = ∫₁³ x·eˣ dx
Integrating by parts, E(X) = [x·eˣ − eˣ]₁³ = (3e³ − e³) − (e − e)
E(X) = 2e³
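The same midpoint-sum check applied to Example 2:

```python
import math

# Check of Example 2: E(X) = integral of x * e^x over [1, 3] = 2e^3.
def integrate(f, a, b, n=100_000):
    # Midpoint Riemann sum over n subintervals.
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * math.exp(x), 1, 3)
print(abs(mean - 2 * math.exp(3)) < 1e-4)  # True
```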
The Probability Density Function (PDF) describes the probability distribution of continuous random
variables. The Cumulative Distribution Function (CDF) is a kind of probability distribution that applies
to both continuous and discrete random variables.
Applicability: the PDF is applicable only for continuous random variables; the CDF is applicable for
both continuous and discrete random variables.
Value: the value of a PDF is always non-negative (it may exceed 1); the value of a CDF always lies
between 0 and 1.
Expectation
The expectation (expected value) of a random variable is its probability-weighted average. In
reinforcement learning, for example, the value of a state s is the expected return from s to the end
of the episode.
The Probability Mass Function (PMF) provides the probability distribution for discrete variables. For
example, when rolling a die there are 6 distinct possible outcomes, which define the entire sample space
{1, 2, 3, 4, 5, 6}. Note that we only have whole numbers, i.e. no 1.2 or 3.75.
Concepts of joint and multiple random variables, joint, conditional and marginal distributions.
Joint probability is the probability of two events occurring simultaneously. Marginal probability is the probability
of an event irrespective of the outcome of another variable. Conditional probability is the probability of one event
occurring in the presence of a second event.
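The three notions can be illustrated with a small joint table; the entries below are assumed purely for illustration:

```python
from fractions import Fraction

# Toy joint distribution of two binary variables X and Y.
# joint[(x, y)] = P(X=x, Y=y); values are assumed for illustration.
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

# Marginal: sum the joint over the other variable.
p_x1 = joint[(1, 0)] + joint[(1, 1)]
# Conditional: P(Y=1 | X=1) = P(X=1, Y=1) / P(X=1).
p_y1_given_x1 = joint[(1, 1)] / p_x1
print(p_x1, p_y1_given_x1)  # 1/2 3/4
```

Marginals come from summing rows or columns of the joint table, and conditionals from dividing a joint entry by the relevant marginal.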
What is the relationship between correlation and independence?
If ρ(X,Y) = 0, we say that X and Y are "uncorrelated." If two variables are independent, then their
correlation is 0. However, as with covariance, the converse does not hold: a correlation of 0 does not
imply independence.
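A classic counterexample for this last point: let X be uniform on {−1, 0, 1} and Y = X². Y is a function of X, so the two are clearly dependent, yet their covariance (and hence correlation) is zero:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, Y = X^2: dependent but uncorrelated.
xs = [-1, 0, 1]
p = Fraction(1, 3)
e_x = sum(x * p for x in xs)             # E(X)  = 0
e_y = sum(x**2 * p for x in xs)          # E(Y)  = 2/3
e_xy = sum(x * x**2 * p for x in xs)     # E(XY) = E(X^3) = 0
cov = e_xy - e_x * e_y                   # Cov(X, Y) = E(XY) - E(X)E(Y)
print(cov)  # 0
```

Knowing Y pins X down to at most two values, so the variables are far from independent even though the covariance vanishes by symmetry.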