L08-ContinuousRandomVariables
So far, our sample spaces have all been discrete sets, and thus the outputs of our random variables have been
restricted to discrete values. What if the sample space is continuous, such as Ω = R? Then the output of a
random variable X : Ω → R could take on a continuum of values.
Example: Let’s say we record the time elapsed from the start of class to when the last person arrives. This is
a continuous random variable T that takes values from 0 to 80 minutes. What is the probability that T = 5?
Well, if the precision of my watch only goes up to minutes, then I might find the last person arrives during
the fifth minute. But if my precision is in seconds, it is less likely that the last person arrives exactly 5
minutes late. Then if I have a stopwatch that goes to hundredths of a second, it seems almost impossible that
the last person will come in at 5 minutes on the dot. As the precision of our measurements gets better and
better, the probability goes down. If we were able to measure at infinite precision, the probability P (T = 5)
would be zero! However, the probability that the last person arrives between 5 and 6 minutes late is nonzero.
In other words, P (5 ≤ T ≤ 6) is not zero.
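A small simulation makes this concrete. As an illustrative assumption (the notes don't specify a distribution for T), we model the last arrival time as uniform on [0, 80] minutes and watch the fraction of trials landing within ±ε of 5 shrink as the precision ε improves:

```python
import random

# Illustrative assumption: model the last arrival time T as uniform on
# [0, 80] minutes, and measure how often T falls within +/- eps of 5
# as the measurement precision eps shrinks.
random.seed(0)
N = 1_000_000
times = [random.uniform(0, 80) for _ in range(N)]

fracs = []
for eps in [0.5, 0.05, 0.005]:
    frac = sum(1 for t in times if abs(t - 5) < eps) / N
    fracs.append(frac)
    print(f"eps = {eps:5.3f} min -> fraction within eps of 5: {frac:.6f}")
```

Each tenfold improvement in precision cuts the empirical probability by roughly a factor of ten, consistent with P(T = 5) = 0 in the limit.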
Important! A pdf is not a probability! Only when you integrate it does it give you a probability. In fact,
you can have pdf’s where the value f (x) is greater than one for some x. (We’ll see some examples below).
Exercise: Are the following functions pdf’s? If so, what are the cdf’s?
f(x) = 1/2 − 2x² if x ∈ [−1, 1], and f(x) = 0 elsewhere.
Answer: No, f(1) = −3/2, and a pdf must always be non-negative.
f(x) = sin(x) if x ∈ [π/2, π], and f(x) = 0 elsewhere.
Answer: Yes, here we have f (x) ≥ 0, and we can verify that it integrates to one over the range it is non-zero:
∫_{π/2}^{π} f(x) dx = ∫_{π/2}^{π} sin(x) dx = [−cos(x)]_{x=π/2}^{x=π} = −cos(π) + cos(π/2) = 1
The cdf is F(a) = 0 for a < π/2, F(a) = −cos(a) for a ∈ [π/2, π], and F(a) = 1 for a > π.
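We can also confirm the integral numerically. A midpoint Riemann sum (a simple sketch, not a library routine) over [π/2, π] should come out very close to 1:

```python
import math

# Numerical check of the calculation above: a midpoint Riemann sum of
# sin(x) over [pi/2, pi] should be very close to 1.
def midpoint_integral(f, a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

total = midpoint_integral(math.sin, math.pi / 2, math.pi)
print(total)  # ~ 1.0
```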
Uniform Distribution:
https://en.wikipedia.org/wiki/Uniform_distribution_(continuous)
The uniform distribution is the continuous equivalent of “equally likely” outcomes that we had in the discrete
case. The pdf for a uniform random variable on the interval [α, β] is
f(x) = 1/(β − α) if x ∈ [α, β], and f(x) = 0 otherwise.
If X is such a random variable, we write X ∼ U (α, β). Notice that if β − α < 1, then f (x) will be greater
than one on [α, β].
Exercise: What is the cdf for U (α, β)? Verify that the density for U (α, β) integrates to one.
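The second part of the exercise can be checked numerically. The sketch below codes the uniform density as defined above (with example endpoints chosen so that β − α < 1) and verifies that the density exceeds one yet still integrates to one:

```python
# Sketch of the U(alpha, beta) density from above. With beta - alpha < 1
# the density exceeds one at some points, yet it still integrates to one.
def uniform_pdf(x, alpha, beta):
    return 1.0 / (beta - alpha) if alpha <= x <= beta else 0.0

alpha, beta = 0.0, 0.5      # example interval of width < 1, so f(x) = 2 on it
print(uniform_pdf(0.25, alpha, beta))   # 2.0

# Midpoint Riemann sum of the density over [alpha, beta]:
n = 100_000
h = (beta - alpha) / n
area = sum(uniform_pdf(alpha + (i + 0.5) * h, alpha, beta) for i in range(n)) * h
print(area)  # ~ 1.0
```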
Exercise: Is the following function a valid cdf, and if so, what is its associated pdf?
F(a) = exp(a) if a ≤ 0, and F(a) = 1 otherwise.
Answer: Yes, we can verify that lim_{a→−∞} F(a) = 0 and that F(0) = 1. The pdf for this random variable
comes by taking the derivative:
f(x) = dF(a)/da |_{a=x} = exp(x) if x ≤ 0, and f(x) = 0 otherwise.
Exponential Distribution:
https://en.wikipedia.org/wiki/Exponential_distribution
pdf: f (x) = λe^{−λx} for x ≥ 0 (and f (x) = 0 for x < 0)
cdf: F (a) = 1 − e^{−λa} for a ≥ 0 (and F (a) = 0 for a < 0)
Notation: X ∼ Exp(λ)
What it’s good for: Exponential distributions model events that occur randomly in time at some specified
rate (the rate is the λ parameter). For example, we might want to model the arrival times of people coming to
class, or radioactive decay (release of an ionizing particle from a radioactive material), or the time you will
receive your next email. These are all “random processes” that can be modeled as exponential distributions
with some rate λ.
The exponential distribution is a continuous limit of the geometric distribution (see book example also). If
students are arriving to class at a rate of λ, let T be the random variable representing the time the next student
arrives. What is P (T > t), i.e., the probability the next student comes in beyond time t? If we break the
interval [0, t] into n discrete chunks of size t/n, then the next arrival can be modeled as a geometric distribution
with probability of arriving in a particular chunk of p = λt/n. So, using the geometric distribution, P (T > t)
is approximately (1 − p)^n. We get a better approximation by increasing n, i.e., by breaking the interval into
smaller chunks. In the limit we have:
P (T > t) = lim_{n→∞} (1 − λt/n)^n = e^{−λt}
So, P (T ≤ t) = 1 − P (T > t) = 1 − e^{−λt}, which is the cdf for the exponential distribution.
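The limit above can be checked directly: (1 − λt/n)^n approaches e^{−λt} as n grows. The rate λ and time t below are arbitrary example values:

```python
import math

# Check the limit derived above: (1 - lam*t/n)**n -> exp(-lam*t) as n grows.
# The rate lam and time t are arbitrary example values.
lam, t = 1.5, 2.0
exact = math.exp(-lam * t)
for n in [10, 100, 10_000]:
    approx = (1 - lam * t / n) ** n
    print(f"n = {n:6d}: {approx:.6f}  (exact: {exact:.6f})")
```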
Exercise: For the exponential distribution, Exp(λ), what is a value for λ and x that makes f (x) > 1?
Pareto Distribution:
https://en.wikipedia.org/wiki/Pareto_distribution
pdf: f (x) = α/x^{α+1} for x ≥ 1, and f (x) = 0 for x < 1.
cdf: F (a) = 1 − 1/a^α for a ≥ 1, and F (a) = 0 for a < 1.
Notation: X ∼ Par (α)
What it’s good for: The Pareto distribution was originally used to model wealth distribution in economics.
It models how a large portion of a society’s wealth is owned by a small percentage of the population. In
general, it is good for anything where small outcomes are common and large outcomes become increasingly
rare, i.e., heavy-tailed quantities (see Wikipedia for many other uses of Pareto). Computer science applications
include: file sizes in internet traffic (lots of small files, fewer large ones), hard drive error lengths (lots of
small errors, fewer large errors).
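We can sample from Par(α) and check the cdf empirically. The sketch below uses inverse-transform sampling (a standard technique, not covered in these notes): if U ∼ U(0, 1), then X = (1 − U)^{−1/α} has the Pareto cdf F(a) = 1 − 1/a^α for a ≥ 1:

```python
import random

# Inverse-transform sampling for Par(alpha): if U ~ U(0,1), then
# X = (1 - U)**(-1/alpha) has cdf F(a) = 1 - 1/a**alpha for a >= 1.
random.seed(0)
alpha = 2.0
N = 200_000
samples = [(1 - random.random()) ** (-1 / alpha) for _ in range(N)]

# Compare the empirical fraction of samples <= a with the closed-form cdf.
a = 2.0
empirical = sum(1 for x in samples if x <= a) / N
theoretical = 1 - 1 / a ** alpha
print(empirical, theoretical)   # both ~ 0.75
```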
Quantiles:
The pth quantile of a random variable X is the smallest number q_p such that F (q_p) = P (X ≤ q_p) = p.
Another way to say this is q_p = F^{−1}(p). The 50% quantile is called the median.
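For the exponential distribution, the quantile function has a closed form obtained by inverting its cdf F(a) = 1 − e^{−λa}, giving q_p = −ln(1 − p)/λ (a derivation consistent with the cdf above, not stated in the notes):

```python
import math

# Quantile function for Exp(lam), from inverting F(a) = 1 - exp(-lam*a):
#   q_p = -ln(1 - p) / lam
def exp_quantile(p, lam):
    return -math.log(1 - p) / lam

lam = 2.0
median = exp_quantile(0.5, lam)     # equals ln(2)/lam
print(median)

# Sanity check: plugging q_p back into the cdf recovers p.
p = 0.9
q = exp_quantile(p, lam)
print(1 - math.exp(-lam * q))   # ~ 0.9
```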
Notice that we can compute probabilities in an interval using the quantile function: