Random Variables and Distributions - New
Speaker:
Dr. Venkateswara Rao,
NIT Warangal
Note: These slides were assembled by Dr. K. V. Rao, with grateful acknowledgement of the many others who made their course materials available online.
Concepts
• Random variables (discrete and continuous)
• Probability distributions over discrete/continuous r.v.'s
• Notions of joint, marginal, and conditional probability
distributions
• Properties of random variables (and of functions of
random variables)
• Expectation and variance/covariance of random
variables
• Examples of probability distributions
Probability and Statistics
Probability is the chance of an outcome in an experiment (also called event).
Probability deals with predicting the likelihood of future events. Statistics involves the analysis of the
frequency of past events.
Example: Consider there is a drawer containing 100 socks: 30 red, 20 blue and
50 black socks.
We can use probability to answer questions about the selection of a
random sample of these socks.
• Q1. What is the probability that we draw two blue socks or two red socks from the
drawer?
• Q2. What is the probability that, if we pull out three socks, we have a matching pair?
• Q3. What is the probability that we draw five socks and they are all black?
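Questions like Q1 and Q3 can be answered by direct counting; a minimal sketch in Python (the 30/20/50 sock counts are taken from the example above):

```python
# Exact counting for the sock-drawer questions: 30 red, 20 blue,
# 50 black socks (100 total), drawn without replacement.
from math import comb

# Q1: draw two socks; P(both blue or both red)
p_two_blue_or_red = (comb(20, 2) + comb(30, 2)) / comb(100, 2)

# Q3: draw five socks; P(all five are black)
p_five_black = comb(50, 5) / comb(100, 5)

print(round(p_two_blue_or_red, 4))  # P(2 blue or 2 red) ≈ 0.1263
print(round(p_five_black, 4))       # P(5 black) ≈ 0.0281
```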
Statistics
Instead, if we have no knowledge about the type of socks in the drawers, then we enter into
the realm of statistics. Statistics helps us to infer properties about the population on the basis
of the random sample.
• Q1: A random sample of 10 socks from the drawer produced one blue, four red, and five black socks.
What can we infer about the numbers of black, blue, and red socks in the drawer?
• Q2: We randomly sample 10 socks, record the number of black socks, and then return the
socks to the drawer. The process is repeated five times. The mean number of black socks across these
trials is 7. What is the true number of black socks in the drawer?
• etc.
Probability vs. Statistics
In other words:
• In probability, we are given a model and asked what kind of data we are likely to see.
• In statistics, we are given data and asked what kind of model is likely to have generated
it.
Examples :
• You have a fair coin (equal probability of heads or tails). You will toss it 100 times.
What is the probability of 60 or more heads? There is only one correct answer,
obtained by a standard computation.
• You have a coin of unknown provenance. To investigate whether it is fair you toss it
100 times and count the number of heads. Let’s say you count 60 heads. Your job as
a statistician is to draw a conclusion (inference) from this data.
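The fair-coin computation above has a single numeric answer; a sketch of that standard computation:

```python
# P(X >= 60) for X ~ Binomial(100, 0.5): sum the upper tail of the pmf.
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

p_60_or_more = binom_tail(100, 60, 0.5)
print(round(p_60_or_more, 4))  # ≈ 0.0284
```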
Random Variable
• Probability is often associated with at least one event.
– Rolling a die or pulling a color ball out of a bag.
• In these examples the outcome of the event is random
• So the variable that represents the outcome of these
events is called a random variable.
(Figure: joint probability table for the coin flip and die roll, with the marginal probabilities in the margins.)
Example: Discrete case
• P(Heads) = ?
• = P(Heads, die = 1) + P(Heads, die = 2) + … + P(Heads, die = 6).
• = 1/12 + 1/12 + … + 1/12 = 1/2
• Formally, P(X = x) = Σ_y P(X = x, Y = y)
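The marginalization rule above can be checked numerically; a sketch for the fair coin and fair die, where each joint outcome has probability 1/12:

```python
# Marginalizing a joint distribution: P(X = x) = sum_y P(X = x, Y = y).
from itertools import product

# Joint distribution of a fair coin X and a fair die Y (independent).
joint = {(x, y): 1 / 12 for x, y in product(['H', 'T'], range(1, 7))}

# Sum over all die values y to recover the marginal P(X = 'H').
p_heads = sum(p for (x, y), p in joint.items() if x == 'H')
print(round(p_heads, 2))  # 0.5
```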
Conditional Probability
• Conditional probability is the probability of an event given that another
event has occurred: P(A | B) = P(A, B) / P(B).
Binomial probability function (the slide tabulates p(x) for x = 0, 1, 2):
p(x) = [n! / (x!(n − x)!)] p^x (1 − p)^(n − x)
Continuous probability distribution
• For a continuous r.v. X, the probability P(X = x) of any single point is zero.
• Instead we use f(x) to denote the probability density; probabilities are
expressed as integrals between two points, P(a ≤ X ≤ b).
• For a continuous r.v. X, we can only talk about probability within
an interval X ∈ (x, x + δx).
• f(x)δx is approximately the probability that X ∈ (x, x + δx)
as δx → 0.
– f(x) ≥ 0, and
– ∫ f(x) dx = 1 over the whole range.
• Suppose that the diameter of a metal cylinder has p.d.f.
f(x) = 1.5 − 6(x − 50.0)² for 49.5 ≤ x ≤ 50.5, and 0 elsewhere.
∫_{49.5}^{50.5} (1.5 − 6(x − 50.0)²) dx = [1.5x − 2(x − 50.0)³]_{49.5}^{50.5} = 1
∫_{49.8}^{50.1} (1.5 − 6(x − 50.0)²) dx = [1.5x − 2(x − 50.0)³]_{49.8}^{50.1} = 0.432
• The density is the derivative of the cumulative distribution function:
f(x) = dF(x)/dx
P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)
P(a ≤ X ≤ b) = P(a < X < b)
Continuous probability distribution
• Cumulative Distribution Function: F(x) = P(X ≤ x)
• Example:
F(x) = P(X ≤ x) = ∫_{49.5}^{x} (1.5 − 6(y − 50.0)²) dy = 1.5x − 2(x − 50.0)³ − 74.5
P(X ≤ 50.0) = F(50.0) = 0.5
P(X ≤ 49.7) = F(49.7) = 0.104
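The cylinder example's closed-form CDF makes these values easy to check numerically; a quick sketch:

```python
# CDF of the cylinder-diameter example: F(x) = 1.5x - 2(x - 50)^3 - 74.5
# on the support [49.5, 50.5].

def F(x):
    return 1.5 * x - 2 * (x - 50.0) ** 3 - 74.5

print(round(F(50.5), 4))            # total probability: 1.0
print(round(F(50.0), 4))            # P(X <= 50.0) = 0.5
print(round(F(49.7), 4))            # P(X <= 49.7) = 0.104
print(round(F(50.1) - F(49.8), 4))  # P(49.8 <= X <= 50.1) = 0.432
```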
(Figures: cumulative distribution functions for a discrete case and a continuous case.)
The Classification Problem (informal definition)
Katydid or Grasshopper?
Slides credit (for this example):
Dr Eamonn Keogh University of California - Riverside
For any domain of interest, we can measure features. For the insects:
abdomen length, thorax length, antennae length, mandible size,
spiracle diameter, and leg length.
We can store features in a database (My_Collection):

Insect ID | Abdomen Length | Antennae Length | Insect Class
1  | 2.7 | 5.5 | Grasshopper
2  | 8.0 | 9.1 | Katydid
3  | 0.9 | 4.7 | Grasshopper
4  | 1.1 | 3.1 | Grasshopper
5  | 5.4 | 8.5 | Katydid
6  | 2.9 | 1.9 | Grasshopper
7  | 6.1 | 6.6 | Katydid
8  | 0.5 | 1.0 | Grasshopper
9  | 8.3 | 6.6 | Katydid
10 | 8.1 | 4.7 | Katydid

The classification problem can now be expressed as:
• Given a training database (My_Collection), predict the class
label of a previously unseen instance.
(Figure: scatter plot of the collection, antenna length vs. abdomen length, both on a 1-10 scale.)
With a lot of data, we can build a histogram. Let
us just build one for “Antenna Length” for now…
(Figure: histograms of antenna length for Katydids and Grasshoppers.)
We can leave the
histograms as they are,
or we can summarize
them with two normal
distributions.
• We can simply ask ourselves: given the distributions of antennae lengths we have
seen, is it more probable that our insect is a Grasshopper or a Katydid?
• There is a formal way to discuss the most probable classification…
p(cj | d) = probability of class cj, given that we have observed d
Example: the antennae length is 7.
P(Grasshopper | 7) = 3 / (3 + 9) = 0.250
P(Katydid | 7) = 9 / (3 + 9) = 0.750
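The posterior computation above can be sketched directly; the counts 3 and 9 are read off the training histograms at antennae length 7:

```python
# Histogram-count classification: at antennae length 7 the training
# histograms contain 3 grasshoppers and 9 katydids.
counts = {'Grasshopper': 3, 'Katydid': 9}
total = sum(counts.values())

# Posterior probability of each class is its share of the counts.
posteriors = {c: n / total for c, n in counts.items()}
print(posteriors['Grasshopper'])  # 0.25
print(posteriors['Katydid'])      # 0.75
```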
Summary of basic rules
Mean or Expectation
• Example: Tossing a single unfair die
– For fun, imagine a weighted die (cheating!)
Imagine the following probability values.
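The slide's probability table is not reproduced here, so the values below are hypothetical; a sketch of computing the expectation of a weighted die:

```python
# Hypothetical weighting (not the slide's table): the die favors 6.
probs = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5}
assert abs(sum(probs.values()) - 1.0) < 1e-12  # must be a valid pmf

# E[X] = sum over outcomes of x * P(X = x).
expectation = sum(x * p for x, p in probs.items())
print(round(expectation, 4))  # 4.5 for these assumed weights
```

A fair die would give E[X] = 3.5; the weighting toward 6 pulls the mean up.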
• Variance (σ²)
– A positive quantity that measures the spread of the distribution
of the random variable about its mean value
– Larger values of the variance indicate that the distribution is
more spread out
– Definition: Var(X) = E((X − E(X))²)
= E(X²) − (E(X))²
• Standard Deviation
– The positive square root of the variance
– Denoted by σ
The variance of a Random Variable
Var(X) = E((X − E(X))²)
= E(X² − 2X·E(X) + (E(X))²)
= E(X²) − 2E(X)·E(X) + (E(X))²
= E(X²) − (E(X))²
= Σ x²p − μ²
(Figure: two distributions f(x) with identical mean values but different variances.)
Covariance
E(X) = 2.59, E(Y) = 1.79
E(XY) = Σ_{i=1}^{4} Σ_{j=1}^{3} i·j·p_ij = 4.86
Cov(X, Y) = E(XY) − E(X)·E(Y)
= 4.86 − (2.59 × 1.79) = 0.224
Correlation
• When two sets of data are strongly linked together we
say they have a High Correlation.
– Correlation is Positive when the values increase together, and
– Correlation is Negative when one value decreases as the other
increases
– The value shows how strong the correlation is (not how steep
the line is), and whether it is positive or negative.
Correlation
• Example: Ice Cream Sales
• Correlation:
Corr(X, Y) = Cov(X, Y) / √(Var(X)·Var(Y))
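A sketch of the covariance and correlation formulas on toy data; the numbers below are illustrative stand-ins in the spirit of the ice-cream example, not the slide's dataset:

```python
# Correlation: Corr(X, Y) = Cov(X, Y) / sqrt(Var(X) * Var(Y)).
from math import sqrt

xs = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1]  # e.g. temperature (illustrative)
ys = [215, 325, 185, 332, 406, 522]         # e.g. ice cream sales (illustrative)

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n
var_y = sum((y - my) ** 2 for y in ys) / n

corr = cov / sqrt(var_x * var_y)
print(round(corr, 3))  # strongly positive, close to 1
```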
5C3 = 5!/(3!·2!) = 10
P(3 heads and 2 tails) = 5C3 × P(heads)³ × P(tails)²
= 10 × (½)⁵ = 31.25%
Binomial distribution function:
X= the number of heads tossed in 5 coin tosses
(Figure: p(x) vs. x for x = 0, 1, 2, 3, 4, 5 heads.)
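The 31.25% figure can be checked by computing the whole pmf of X:

```python
# pmf of X = number of heads in 5 fair coin tosses.
from math import comb

pmf = {x: comb(5, x) * 0.5**5 for x in range(6)}
print(pmf[3])                    # 10 * (1/2)^5 = 0.3125
print(sum(pmf.values()) == 1.0)  # the pmf sums to 1: True
```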
Definitions: Bernoulli
Bernoulli trial: a single trial with probability of success p and
probability of failure 1 − p; the result follows a Bernoulli
distribution (a special case of the binomial with n = 1).
Probability of success: P(X = 1) = 1C1 · p¹ · (1 − p)^(1−1) = p
Multinomial distribution (three outcomes D, R, G):
P(D = x, R = y, G = z) = [n! / (x!·y!·z!)] · p_D^x · p_R^y · (1 − p_D − p_R)^z
Multinomial example
Specific Example: if you are randomly choosing 8 people from an
audience that contains 50% democrats, 30% republicans, and 20%
green party, what’s the probability of choosing exactly 4 democrats, 3
republicans, and 1 green party member?
P(D = 4, R = 3, G = 1) = [8! / (4!·3!·1!)] × (.5)⁴ × (.3)³ × (.2)¹ = 0.0945
You can see that it gets hard to calculate very fast! The
multinomial has many uses in genetics where a person
may have 1 of many possible alleles (that occur with
certain probabilities in a given population) at a gene
locus.
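The multinomial example can be computed directly; a sketch, where `multinomial_pmf` is a helper written here, not a library function:

```python
# Multinomial pmf: n! / (x1! ... xk!) * p1^x1 * ... * pk^xk.
from math import factorial

def multinomial_pmf(counts, probs):
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    p = float(coef)
    for c, q in zip(counts, probs):
        p *= q ** c
    return p

# 4 democrats, 3 republicans, 1 green out of 8 people.
p = multinomial_pmf([4, 3, 1], [0.5, 0.3, 0.2])
print(round(p, 4))  # ≈ 0.0945
```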
Poisson Distribution
Examples
• Number of customers arriving in 20 minutes
• Number of strikes per year in India
• Number of defects per lot (group) of DVDs
Poisson Probability Distribution Function
p(x) = λ^x e^(−λ) / x!   (x = 0, 1, 2, 3, ...)
Example, with λ = 3.6:
p(4) = (3.6)⁴ e^(−3.6) / 4! = .1912
Uniform Probability Distribution
• Example: the flight time from NY to Chicago, uniformly distributed
between 120 and 140 minutes, has probability distribution
f(x) = 1/20 for 120 ≤ x ≤ 140, 0 elsewhere
Uniform Probability Distribution (general form)
f(x) = 1/(b − a) for a ≤ x ≤ b, 0 elsewhere
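With a uniform density, interval probabilities are just interval length divided by (b − a); a sketch using the 120-140 minute flight-time example (the 125-135 interval is an illustrative choice, not from the slide):

```python
# Uniform(a, b): P(lo <= X <= hi) = overlap length / (b - a).
a, b = 120.0, 140.0  # flight-time example

def uniform_prob(lo, hi, a=a, b=b):
    lo, hi = max(lo, a), min(hi, b)  # clip to the support
    return max(hi - lo, 0.0) / (b - a)

print(uniform_prob(125, 135))  # 10 / 20 = 0.5
```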
f(x) = (1 / (σ√(2π))) e^(−(x − μ)² / (2σ²))
Where:
μ is the mean
σ is the standard deviation
π = 3.14159...
e = 2.71828...
Normal Probability Distribution
Characteristics
(Figures: bell curves illustrating the mean and standard deviation;
the mean can take any value, e.g. −10, 0, or 20, and a larger standard
deviation, e.g. σ = 25 vs. σ = 15, gives a flatter, more spread-out curve.)
Normal Probability Distribution
Characteristics
Probabilities for the normal random variable are
given by areas under the curve. The total area
under the curve is 1 (.5 to the left of the mean and
.5 to the right).
(Figure: normal curve with an area of .5 shaded on each side of the mean.)
The Standard Normal Distribution
• The Standard Normal Distribution is a normal distribution with the special
properties that its mean is zero and its standard deviation is one (μ = 0, σ = 1).
(Figure: standard normal curve in z, centered at 0.)
Cumulative Probability
• The probability that z ≤ 1 is the area under the curve to
the left of 1: P(z ≤ 1).
(Figure: standard normal curve with the area to the left of z = 1 shaded.)
What is P(z ≤ 1)?
Exercise 1
(Figure: standard normal curve with z = 2.46 marked.)
Exercise 2
(Figure: standard normal curve with z = −1.29 and z = 1.29 marked.)
Note that, because of the symmetry, the area to the left of −1.29 is the
same as the area to the right of 1.29.
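Standard normal cumulative probabilities can be computed from the error function, Φ(z) = ½(1 + erf(z/√2)); a sketch answering the questions above:

```python
# Standard normal CDF via the error function.
from math import erf, sqrt

def phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

print(round(phi(1.0), 4))                # P(z <= 1) = 0.8413
print(round(phi(1.29) - phi(-1.29), 4))  # area between -1.29 and 1.29
```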
References
• Michael J. Evans and Jeffrey S. Rosenthal, Probability and Statistics:
The Science of Uncertainty, Second Edition, University of Toronto.
HAPPY LEARNING
Thank you !