Chapter 2 - Random Variables (With Solutions)

This document provides an overview of random variables and some key probability distributions. It defines a random variable as a function that maps outcomes of an experiment or trial to numerical values. Random variables can be discrete, taking on countable values, or continuous, having uncountable values. The document outlines the probability mass function (PMF) and cumulative distribution function (CDF) for characterizing discrete random variables and provides examples of Bernoulli, binomial, and other common probability distributions.


ST5201: Statistical Foundations of Data Science


Chapter 2: Random variables

Wei Yang, Wang


[email protected]
Department of Statistics and Data Science
National University of Singapore (NUS)
1
Outline

● Introduction: What is a random variable?

● Discrete random variables and corresponding distributions

● Continuous random variables and corresponding distributions

● Functions of a random variable

Textbook reading: Section 2.1 – Section 2.3

2
What is a random variable?

● A rule of association: associate each outcome in Ω with a number representing the event of interest

● Example: Examine 2 light bulbs. We are interested in the number of defective bulbs (∈ {0, 1, 2}),
but not whether a specific bulb is defective or not. Such a rule is illustrated below:

3
What is a random variable?

Usually, people are interested in a particular measurable characteristic of the outcomes in different
experiments, even though the sample space Ω may consist of qualitative or quantitative outcomes
● Manufacturers: proportion of defective light bulbs from a lot
● Market researchers: consumers' preference for a proposed product on a scale of 1-10
● Research physicians: change in a certain reading from patients who are prescribed a new drug
● Data scientists: proportion of correct model predictions

4
What is a random variable?

A random variable serves as such a rule of association:

● A variable: takes on different possible numerical values

● random: different values depending on which outcome occurs

When an experiment is conducted, an outcome (subject to uncertainty) occurs ⇒ a corresponding
"random" number is realized according to the rule

● Manufacturers: the number of defective light bulbs from a lot of fixed size N is an integer in [0, N]
● Market researchers: preference score of consumers about the proposed product is an integer in [1, 10]
● Research physicians: change in a certain reading from a patient who is prescribed a new drug is any real
number from −∞ to ∞
● Data scientists: the number of correctly classified images out of N images is an integer in [0, N]
5
Random variables

Definition
For a given sample space Ω of an experiment, a random variable (r.v.) is a function X whose domain is Ω
and whose range is a subset of the real numbers

6
Random variables

Notations:

● Upper-case letters X, Y, Z . . . to denote r.v.’s


● Lower-case letters x, y, z, . . . to denote their possible values
● e.g., in the example of Examine 2 light bulbs:
○ Let X(Upper-case) denote the number of defective bulbs
○ X takes on values x = 0, 1, 2. ⇔ Range: {0, 1, 2}.
○ Note: usually, we are interested in the range of r.v.s, and the corresponding probabilities. Say, P(X = 0), P(X = 1),
and P(X = 2) in this case.

● 2 types of r.v.’s: Discrete versus Continuous

7
Discrete random variables

Definition
A discrete r.v. is a r.v. that can take on only a finite or at most a countably infinite number of values

Range is

● finite: e.g., {0, 1, 2} in the example of Examine 2 light bulbs


● infinite: e.g., how many times will you throw a die before you get a six? Range = {0, 1, 2, · · · }

To understand the behaviour of a r.v. X, one wants to specify the probability attached to each value in its range

● i.e. P(X = x), for any x in the range of X,

where P is the probability measure defined in Lecture 1. It can be found based on the probability of outcomes.
8
Probability mass function

For discrete r.v.’s,

● Let A = {ω ∈ Ω|X(ω) = x} be the set/event containing all the outcomes ω ∈ Ω which are mapped to
x by X.
● Following from the rules of probability, for any x, P(X = x) = P(A): simply add the probabilities of all outcomes in A

Definition
The probability mass function (pmf), or frequency function, of a discrete r.v. with range {x1, x2, · · · } is a
function p s.t. p(xi) = P(X = xi) for each xi, and p(x) = 0 for all other x

9
Cumulative distribution function
Definition
The cumulative distribution function (cdf) of any r.v. is defined by F(x) = P(X ≤ x), for −∞ < x < ∞

For all cdf’s:


1. F(x) is always non-decreasing
2. limx→−∞ F(x) = 0
3. limx→∞ F(x) = 1
For cdf’s of discrete r.v.’s:
1. F(x) = Σ{xi|xi≤x} p(xi) is a step-function
2. jumps occur at all xi in the range of X
3. jump size at xi is p(xi)

Remark: Both pmf and cdf are necessary and sufficient ways to uniquely characterize a discrete r.v.
10
Examples

Examine 3 light bulbs (D - Defective, G - Good):


● Ω = {DDD, DDG, DGD, GDD, GGD, GDG, DGG, GGG}
● Let X be the total number of defective bulbs observed
● X is a discrete r.v. taking on values in {0, 1, 2, 3}
● With counting methods (all 8 outcomes equally likely),

P(X = 3) = 1/8, P(X = 2) = 3/8,
P(X = 1) = 3/8, P(X = 0) = 1/8.

11
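The pmf above can be reproduced by brute-force enumeration. A minimal Python sketch (not part of the slides), assuming each bulb is independently Defective or Good with equal probability, so that all 8 outcomes of Ω are equally likely:

```python
from itertools import product
from fractions import Fraction

# Enumerate Ω = {D, G}^3; each of the 8 outcomes has probability 1/8.
pmf = {}
for omega in product("DG", repeat=3):
    x = omega.count("D")                     # X(ω) = number of defective bulbs
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 8)

for x in sorted(pmf):
    print(x, pmf[x])   # 0 1/8, 1 3/8, 2 3/8, 3 1/8
```

Using Fraction keeps the probabilities exact, so the result matches the counting argument on the slide.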
Examples

● The pmf of X is given by

● The height of the vertical bar at any point xi is p(xi), and the heights of all the bars sum to 1
● Refer to the set {0, 1, 2, 3} as the support of the pmf
○ the values at which p(x) is non-zero

○ identical to the range of X

12
Examples

● The cdf of X is given by

● F(x) stays at 0 from x = −∞ until x = 0 (the smallest possible value of X)
● jumps at the support of the pmf
● stays at 1 from x = 3 to x = ∞
● always remember to define the function for all x from −∞ to ∞!
13
r.v. vs probability distribution
● Recall:

cdf is a necessary and sufficient way to uniquely characterize a r.v.

cdf characterizes the distribution of probabilities on each possible value/subset of a r.v., so does pmf (for a
discrete r.v.)

● We can characterize a r.v. with its distribution, and say “(the r.v.) X follows/has a distribution with pmf p(x)
(or cdf F(x))”.

● Later, we will introduce distributions/r.v.s with specific names, such as


○ A Bernoulli r.v.: a r.v. follows a Bernoulli distribution
○ a normal r.v.: a r.v. follows a normal distribution
○ ...

14
Bernoulli trials and Bernoulli r.v.

● Bernoulli trial: an experiment whose outcomes can be classified as, generically “success (S)” or
“failure (F)”
● A function X which maps S to 1 and F to 0 defines the simplest discrete r.v. which takes on only 2
possible values

Definition
A Bernoulli r.v. with parameter or probability of success 0 < p < 1 takes on only 2 values, 0 & 1, with p(1) =
p, p(0) = 1 − p, p(x) = 0 if x ≠ 0 and x ≠ 1.

● Write X ~ Ber(p)
● pmf of X, where q = 1 − p: p(x) = p^x q^(1−x), for x = 0, 1

We will omit the "p(x) = 0 otherwise" clause in the following definitions for simplicity, but keep it in mind in your work!
15
Examples
● Many examples of Bernoulli trials: Toss a coin; Examine a light bulb; Whether it rains in NUS tomorrow; ...
● Being the simplest experiment resulting in only 2 possible outcomes, different Bernoulli trials appear very
often in the real world
● In the real world, many more random phenomena correspond to experiments built systematically from, or related to, 2 or more Bernoulli trials
● R.v.'s constructed from such experiments include, for example,
○ binomial r.v.
○ geometric r.v.
○ negative binomial r.v.
○ hypergeometric r.v.
○ Poisson r.v.

16
The Binomial distribution

● A Bernoulli trial associated with probability of success p is independently repeated n times (n ≥ 1 is fixed)
● Interest in X: total number of successes observed from n trials
● X is a binomial r.v. with parameters n (number of trials) and p (probability of success)

Definition
A r.v. X is said to have a binomial distribution with parameters n and p, write X ∼ Bin(n, p), if its pmf is
defined by

p(x) = (n choose x) p^x (1 − p)^(n−x), for x = 0, 1, · · · , n

● By the binomial theorem, Σ_{x=0}^{n} (n choose x) p^x (1 − p)^(n−x) = (p + (1 − p))^n = 1, which confirms the pmf is valid

17
The Binomial distribution

● The pmf p(x) = P(X = x) for x = 0, 1, 2, · · · , n can be found by counting methods with combinations:
○ Any outcome of the experiment is a sequence of Successes (1) and Failures (0)
○ A particular sequence of x 1s and n − x 0s occurs with probability p^x (1 − p)^(n−x) (following from the
multiplication law and independence between trial results)
○ There are (n choose x) different ways to arrange x 1s and n − x 0s
○ Adding up p^x (1 − p)^(n−x) over these (n choose x) sequences gives p(x) (following from the addition law and
disjointness of all these sequences)

● Remark: When we set n = 1 in the experiment, we see that Ber(p) ≡ Bin(1, p)

18
The Binomial distribution

The shape of the pmf: symmetric versus skewed, as shown by pmf's with n = 10 & p = 0.5 (above) or p = 0.1
(below)

19
Example

Examine 3 light bulbs:


● X, total number of defective bulbs among 3 light bulbs, is of interest
● n = 3 identical Bernoulli trials (3 bulbs to be examined), each resulting in 1 of 2 outcomes, {Defective, Normal},
with the same probability of success (defective) p
● The 3 trials are independent
● X ∼ Bin(3, p)
● Remark: Here p is unknown to us but known to be a certain real number between 0 and 1

20
Example

Problem:

Suppose that it is known that a manufacturer produces defective fuses subject to a probability of 0.05. In a lot of 100 produced fuses, what are the probabilities that

1. there are 2 defective fuses?
2. there are less than 5 defective fuses?

Solution:

Let X be the number of defective fuses in a lot of size 100. Then, X is a binomial r.v. with parameters n = 100 and p = 0.05.

21
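The two probabilities can be checked numerically from the Bin(100, 0.05) pmf. A minimal Python sketch (not part of the slides); `binom_pmf` is a helper name introduced here:

```python
from math import comb

def binom_pmf(x, n, p):
    """pmf of Bin(n, p) at x: (n choose x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 100, 0.05
p_two = binom_pmf(2, n, p)                           # 1. P(X = 2)
p_less5 = sum(binom_pmf(x, n, p) for x in range(5))  # 2. P(X < 5) = P(X <= 4)
```

This gives P(X = 2) ≈ 0.081 and P(X < 5) ≈ 0.436.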
The Geometric distribution
A Bernoulli trial associated with probability of success p is repeatedly performed indep. until the 1st success is observed

Interest in X, the total number of trials performed

X is a geometric r.v. with probability of success p where p is the probability of observing a success from every Bernoulli trial

Definition
A r.v. X is said to have a geometric distribution with parameter p, write X ∼ Geo(p), if its pmf is defined by

p(x) = (1 − p)^(x−1) p, for x = 1, 2, 3, · · ·

Remark: infinite sample space; Σ_{x=1}^{∞} (1 − p)^(x−1) p = p / (1 − (1 − p)) = 1 (the latter sum is called a geometric sum)

22
The Negative Binomial distribution
A Bernoulli trial associated with probability of success p is repeatedly performed indep. until the rth success is
observed

Interest in X, the total number of trials performed

X is a negative binomial r.v. with parameters r and p, where p is the probability of observing a success from every
Bernoulli trial

Definition
A r.v. X is said to have a negative binomial distribution with parameters r and p, write X ∼ NegBin(r, p), if
its pmf is defined by

p(x) = (x − 1 choose r − 1) p^r (1 − p)^(x−r), for x = r, r + 1, r + 2, · · ·

Remark: infinite sample space; total number of trials ≥ r


23
Quiz

Question:

Suppose that it is known that a manufacturer produces defective fuses subject to a probability of 0.05. By examining the produced fuses in series one-by-one, what are the probabilities that

1. the first defective fuse appears at the 5th examined fuse?
2. the 5th defective fuse appears at the 50th examined fuse?

24
Quiz

Question:

Suppose that it is known that a manufacturer produces defective fuses subject to a probability of 0.05. By examining the produced fuses in series one-by-one, what are the probabilities that

1. the first defective fuse appears at the 5th examined fuse?
2. the 5th defective fuse appears at the 50th examined fuse?

Solution:

Let X be the number of fuses to be examined. Then, X ∼ Geo(0.05) ≡ NegBin(1, 0.05) for Question 1, and X ∼ NegBin(5, 0.05) for Question 2.

25
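The two answers can be evaluated directly from the Geo and NegBin pmf's. A minimal Python sketch (not part of the slides):

```python
from math import comb

p = 0.05

# 1. Geo(0.05): first defective at the 5th fuse = 4 good fuses, then a defective
p_first_at_5 = (1 - p)**4 * p

# 2. NegBin(5, 0.05): 5th defective at the 50th fuse = 4 defectives among the
#    first 49 fuses, then a defective at trial 50
p_fifth_at_50 = comb(49, 4) * p**5 * (1 - p)**45
```

This gives ≈ 0.0407 for Question 1 and ≈ 0.0066 for Question 2.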
The Hypergeometric distribution

From a population of 2 kinds of objects (Success & Failure) with r Successes and N − r Failures, a total of n
< N objects are drawn without replacement

26
The Hypergeometric distribution

One usual interest is about X, the total number of S drawn

X is a hypergeometric r.v. with parameters r, N, and n, where r is the number of successes in the population, N is the
population size, and n is the sample size

Definition
A r.v. X is said to have a hypergeometric distribution with parameters r, N and n, write X ∼ Hyper(r, N, n),
if its pmf is defined by

p(x) = (r choose x)(N − r choose n − x) / (N choose n),

for any integer x satisfying max(0, n − (N − r)) ≤ x ≤ min(r, n)


27
The Hypergeometric distribution

p(x) is derived easily due to the fact that the order of the n selected objects does not matter

● total number of ways to sample n out of N objects is (N choose n)
● total number of ways to sample x out of r S's together with n − x out of N − r F's is (r choose x)(N − r choose n − x)
Hypergeometric versus binomial:

A Bin(n, p) r.v. can be alternatively defined by the same mechanism as above with the sample size set as the
number of trials n and p = r/N, except that the objects are drawn with replacement

28
Example

Problem:

It is common that manufacturers perform quality control of their products by sampling a few products from a lot. When the number of defective items exceeds a threshold value, the lot will not be shipped. Suppose that in a lot of size 20, 4 of the products are defective. When there are > 2 defective items among 10 inspected, the lot will be rejected. What is the probability that this lot will be rejected?

Solution:

Let X be the number of defective items observed in a sample of size 10. Then, X is a hypergeometric r.v. with parameters r = 4, N = 20 and n = 10, which takes on values 0, 1, 2, 3, 4.

29
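The rejection probability can be computed from the Hyper(4, 20, 10) pmf. A minimal Python sketch (not part of the slides); `hyper_pmf` is a helper name introduced here:

```python
from math import comb

def hyper_pmf(x, r, N, n):
    """pmf of Hyper(r, N, n) at x."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

r, N, n = 4, 20, 10
p_reject = sum(hyper_pmf(x, r, N, n) for x in (3, 4))  # P(X > 2)
```

Here P(X > 2) = [C(4,3)C(16,7) + C(4,4)C(16,6)] / C(20,10) ≈ 0.291.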
The Poisson distribution
Definition
A r.v. X is said to have a Poisson distribution with parameter λ > 0, write X ∼ Poi(λ), if its pmf is defined by

p(x) = e^(−λ) λ^x / x!, for x = 0, 1, 2, · · ·

● e ≈ 2.71828 denotes the base of the natural logarithm
● A binomial r.v. with an infinite number of trials: derived as the limit of Bin(n, p)
○ Poisson approximation to binomial probabilities of Y ∼ Bin(n, p):

P(Y = x) ≈ e^(−λ) λ^x / x!, when n → ∞ and p → 0 s.t. np = λ is moderate


30
Applications of Poisson distribution

Poisson distribution is a good model for the number of occurrences of a rare incidence in a fixed period of
time, in a given space, etc.

● # of buses passing a stop in a given hour
● # of people entering a store on a given day
● # of new births in a given day
● # of misprints on a page
● # of occurrences of the DNA sequence "ACGT" in a gene
● # of patients arriving in an emergency room between 11 and 12 pm

31
The Poisson pmf

32
Example

Problem:

Flaws (bad records) on a used video tape occur on the average of 1 flaw per 1200 feet, and the number of flaws follows a Poisson distribution. What are the probabilities that

1. there is no flaw on a tape of 4800 feet?
2. there are more than 2 flaws on a tape of 4800 feet?

Solution:

Let X be the number of flaws on a tape of 4800 feet. Then X ∼ Poi(4800/1200 × 1) = Poi(4).

33
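Both answers follow from the Poi(4) pmf. A minimal Python sketch (not part of the slides); `pois_pmf` is a helper name introduced here:

```python
from math import exp, factorial

def pois_pmf(x, lam):
    """pmf of Poi(lam) at x: e^{-lam} lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

lam = 4800 / 1200 * 1   # 4 expected flaws on 4800 feet

p_no_flaw = pois_pmf(0, lam)                                 # 1. P(X = 0)
p_more_than_2 = 1 - sum(pois_pmf(x, lam) for x in range(3))  # 2. P(X > 2)
```

This gives P(X = 0) = e^(−4) ≈ 0.018 and P(X > 2) ≈ 0.762.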
Example - Poisson vs Binomial

Problem:

In a huge community, it is known that 0.7% of the population is colour-blind. What is the probability that at most 10 in a group of 1000 people from the community are colour-blind?

Solution:

Assume that the number of colour-blind people in a group of 1000 people from this community is Y ∼ Bin(1000, 0.007). The required probability is P(Y ≤ 10).

As n = 1000 is large and p = 0.007 ≈ 0, Y has approximately a Poisson distribution with parameter 1000 × 0.007 = 7, i.e. X ∼ Poi(7). Hence, P(Y ≤ 10) ≈ P(X ≤ 10).
34
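The quality of the approximation can be checked by computing both probabilities exactly. A minimal Python sketch (not part of the slides):

```python
from math import comb, exp, factorial

n, p = 1000, 0.007
lam = n * p   # 7

# Exact binomial probability P(Y <= 10)
binom_le10 = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(11))

# Poisson approximation P(X <= 10) with X ~ Poi(7)
pois_le10 = sum(exp(-lam) * lam**x / factorial(x) for x in range(11))
```

Both probabilities are ≈ 0.90 and agree to about two decimal places.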
The Poisson process
● Alternatively the Poisson distribution can be derived from a mathematical model, called a Poisson process
having rate λ > 0, for describing a random phenomenon regarding occurrence of a certain incidence or
“event” in time.
● λ is the rate per unit time at which events occur
○ E.g. λ = 2 may stand for 2 events per minute/hour/week

● The events in the experiment satisfy:

1. Numbers of occurrences of the event in mutually exclusive/non-overlapping time intervals are independent
2. The probability of exactly 1 occurrence of the event in a sufficiently small interval of length h is roughly λh
3. The probability of ≥ 2 occurrences of the event in a sufficiently small interval of length h is essentially zero

● The number of events occurring in an interval of length t is N(t) ∼ Poi(λt)


Note: Unit of t must match unit of time for λ.
35
Quiz

Problem:

Suppose an office receives phone calls in accordance with assumptions 1-3 on the previous slide. The average number of calls is 0.5 per minute. What are the probabilities that there are

1. no calls in a 5 minute interval?
2. exactly 14 calls in half an hour?

36
Quiz

Problem:

Suppose an office receives phone calls in accordance with assumptions 1-3 on the previous slide. The average number of calls is 0.5 per minute. What are the probabilities that there are

1. no calls in a 5 minute interval?
2. exactly 14 calls in half an hour?

Solutions:

1. Recall that P(N(t) = x) = e^(−λ) λ^x / x! with λ = rate × t. Here x = 0, λ = 5 × 0.5 = 2.5. Hence, P(N(5) = 0) = e^(−2.5) ≈ 0.082.
2. Here x = 14, λ = 30 × 0.5 = 15. Hence, P(N(30) = 14) = e^(−15) 15^14 / 14! ≈ 0.102.

37
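The two answers can be checked with N(t) ∼ Poi(λt). A minimal Python sketch (not part of the slides):

```python
from math import exp, factorial

rate = 0.5  # calls per minute

# 1. No calls in 5 minutes: N(5) ~ Poi(5 * 0.5) = Poi(2.5)
p_no_calls = exp(-5 * rate)

# 2. Exactly 14 calls in 30 minutes: N(30) ~ Poi(15)
lam = 30 * rate
p_14_calls = exp(-lam) * lam**14 / factorial(14)
```

This gives ≈ 0.082 and ≈ 0.102, matching the solutions above.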
Continuous random variables

● Among real-world "experiments", many "natural" variables/quantities of interest have sample
spaces with uncountably many possible outcomes, e.g.,
○ temperature range on any day

○ annual income of a company

○ lifetime of a light bulb

○ weight loss after exercise

● Define a one-to-one function X which maps these uncountably many outcomes to themselves
(the identity function, f(x) = x)
● Viewing X as a r.v., the range of X is an interval (possibly bounded) or a union of intervals, and this
kind of r.v.’s are called continuous random variables.
38
Continuous random variables

If we follow what we did for discrete r.v.'s, we would need P(X = x) for every possible value x in the range
of the cont. r.v. X.

However, it is impractical & illegitimate to talk about P(X = x) for every x!

● Uncountably many values of x, hence uncountably many probabilities P(X = x), to deal with

● For any cont. r.v., we must have, for any possible value x in the range of X,

P(X = x) = 0,

since uncountably many positive probabilities could not sum to P(Ω) = 1

39
Probability density function

Replace the pmf by the probability density function (pdf)

40
Probability density function

Definition
The probability density function (pdf) of a cont. r.v. X is an integrable function f(x) ≥ 0 satisfying
∫_{−∞}^{∞} f(x) dx = 1

● f is piecewise continuous
● The prob that X takes on a value in the interval [a, b] equals the area under the curve of f
between a & b:

P(a ≤ X ≤ b) = ∫_a^b f(x) dx
41
PMF vs PDF

42
Properties of continuous r.v.
Property
1. P(X = x) = 0 for any x
2. P(a ≤ X ≤ b) = ∫_a^b f(x) dx for any a ≤ b
3. For small δ, if f is continuous at c, P(c − δ/2 ≤ X ≤ c + δ/2) ≈ f(c)δ

● The value of f(c) is not itself a probability. However, from property 3, the probability that X is in a small interval
around c is proportional to f(c)
● (Integral) property 3 in differential notation: P(x ≤ X ≤ x + dx) ≈ f(x) dx
43
CDF of continuous r.v.
Definition
The cdf of a cont. r.v. X with pdf f(x) is defined by

F(x) = P(X ≤ x) = ∫_{−∞}^x f(t) dt

● The cdf of a cont. r.v. is also continuous

● By the fundamental theorem of calculus, the pdf is the first derivative of the cdf: f(x) = F′(x)

● In terms of the cdf, the prob that X takes on a value in [a, b] is P(a ≤ X ≤ b) = F(b) − F(a)
44
Uniform r.v.

The simplest cont. r.v. is a uniform r.v.

Definition
A r.v. X is called a uniform r.v. with parameters a and b, write X ∼ Unif(a, b), if, for b > a, its pdf is given by

f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise

When a = 0, b = 1, we have X ∼ Unif(0, 1), called the standard uniform r.v.

The probability that X is in any interval of length h within (a, b) equals h/(b − a). For the standard uniform, it is h.

45
Uniform r.v.

What is the cdf of a Unif(0, 1) r.v.?

46

Uniform r.v.

Similarly, the cdf of X ∼ Unif(a, b) is given by

F(x) = 0 for x < a, (x − a)/(b − a) for a ≤ x ≤ b, and 1 for x > b

47
Example

The direction of imperfection on a tyre is subject to uncertainty. Let X be the angle, measured clockwise, between a
vertical reference line and the line connecting the centre of the tyre to the imperfection. A possible pdf for X is:

What is the probability that the angle is between 90° and 180°?

48
Exponential r.v.

Definition

A r.v. X is called an exponential r.v. with parameter λ, write X ∼ Exp(λ), if, for λ > 0, its pdf is given by

f(x) = λe^(−λx) for x ≥ 0, and f(x) = 0 otherwise

● 1 parameter λ > 0, like a Poisson r.v.
● Family of exponential densities indexed by different values of λ
● Can be shown to be the random time between occurrences of 2 consecutive "events" of a Poisson
process with the same parameter
● The larger λ, the more rapidly the pdf drops off

49
Exponential r.v.

(plots of the pdf and cdf)

50
Exponential r.v.
Memoryless property
For X ∼ Exp(λ), and s, t > 0,

P(X > s + t | X > s) = P(X > t) = e^(−λt),

which is independent of s.

Let X be the lifetime of some product: if the product is good at time t, the distribution of the remaining
time that it is good is the same as the original lifetime distribution when it was new

The exponential is the only cont. distribution possessing this property; in fact, memorylessness
characterizes the exponential distribution.

Good to model lifetimes/waiting times until occurrence of some specific event

51
Exponential r.v.

Problem:

Suppose that the length of wait for a taxi at a taxi stand is an exponential r.v. with parameter λ = 0.1. Someone is in front of you in the queue for a taxi. Find the probabilities that you will have to wait

1. more than 10 minutes
2. between 10 and 20 minutes
3. more than 20 minutes after having already waited more than 10 minutes

Solution:

Let X be the duration of wait. Then X ∼ Exp(0.1).

1. P(X > 10) = 1 − F(10) = e^(−0.1×10) = e^(−1) ≈ 0.368
2. P(10 < X < 20) = F(20) − F(10) = (1 − e^(−2)) − (1 − e^(−1)) ≈ 0.233
3. P(X > 20 | X > 10) = P(X > 10) ≈ 0.368, by the memoryless property

52
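The three answers, including the memoryless property in part 3, can be verified numerically. A minimal Python sketch (not part of the slides):

```python
from math import exp

lam = 0.1

def F(x):
    """cdf of Exp(0.1): F(x) = 1 - e^{-lam x} for x >= 0."""
    return 1 - exp(-lam * x)

p1 = 1 - F(10)                  # 1. P(X > 10) = e^{-1}
p2 = F(20) - F(10)              # 2. P(10 < X < 20)
p3 = (1 - F(20)) / (1 - F(10))  # 3. P(X > 20 | X > 10); memoryless -> equals p1
```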
Gamma r.v.
Definition
A r.v. X is called a gamma r.v. with shape parameter α and rate parameter λ, write X ∼ G(α, λ), if, for α, λ >
0, its pdf is given by

f(x) = (λ^α / Γ(α)) x^(α−1) e^(−λx) for x ≥ 0, and f(x) = 0 otherwise

Γ(α) = ∫_0^∞ u^(α−1) e^(−u) du is called the gamma function

When α = 1, f(x) reduces to an exponential density

Family of gamma densities for different values of α & λ: a fairly flexible class for modeling nonnegative
r.v.'s
53
Gamma density / pdf (λ = 1)

α = 0.5 (solid), α = 1 (dotted); α = 5 (solid), α = 10 (dotted)

54
Beta r.v.
Definition
A r.v. X is called a beta r.v. with parameters a and b, write X ∼ B(a, b), if, for a, b > 0, its pdf is given by

f(x) = [Γ(a + b) / (Γ(a)Γ(b))] x^(a−1) (1 − x)^(b−1) for 0 ≤ x ≤ 1, and f(x) = 0 otherwise
When a = b = 1, f(x) reduces to a standard uniform density

A useful alternative to Unif(0, 1) for modeling r.v.’s on [0, 1]

Family of beta densities for different values of a & b: a fairly flexible class for modeling r.v.’s that are
restricted on [0, 1]

55
Beta density / pdf

a = 2, b = 2; a = 6, b = 6; a = 6, b = 2; a = 0.5, b = 4

56
Normal distribution
Definition
A r.v. X has a normal/Gaussian distribution with parameters μ and σ, write X ∼ N(μ, σ²), if, for −∞ < μ < ∞,
σ > 0, its pdf is given by

f(x) = (1 / (σ√(2π))) e^(−(x − μ)² / (2σ²)), −∞ < x < ∞
π ≈ 3.1415927 represents the familiar mathematical constant related to a circle

Mean parameter μ and standard deviation parameter σ

Family of normal densities for different values of μ & σ

57
Normal distribution

Play a central role in probability and statistics

The most widely used models for diverse phenomena such as measurement errors in scientific
experiments, reaction times in psychological experiments, etc.

Many r.v.’s, such as height and time, have distributions that are well approximated by a normal
distribution

Central Limit Theorem (CLT), to be introduced later (the sum of independent r.v.'s is approximately normal),
justifies the use of the normal distribution in many applications

An important special case: the normal distribution with mean μ = 0 and sd σ = 1, called the standard
normal distribution and denoted by Z ∼ N(0, 1)
58
The normal density / pdf

● f(x) is a bell-shaped / mound-shaped curve


● f(x) is symmetric about μ

59
The normal density / pdf

● μ: centre; locates the maximum/peak of the curve

● σ: shape/spread of the distribution.

60
Linear transformation of a Normal r.v.
Definition
Suppose that X ∼ N(μ, σ²). Then, Y = a + bX, for fixed constants a and b ≠ 0, is a normal r.v. with mean a + bμ
and variance b²σ².

A streamlined proof for b > 0:

1. From the CDF: F_Y(y) = P(a + bX ≤ y) = P(X ≤ (y − a)/b) = F_X((y − a)/b)

2. Thus the PDF: f_Y(y) = (1/b) f_X((y − a)/b)

So, substituting the N(μ, σ²) density for f_X gives the N(a + bμ, b²σ²) density for Y.

61
Standard Normal r.v.

Definition
Suppose that X ∼ N(μ, σ²). Then, Z = (X − μ)/σ is a standard normal r.v. with mean 0 and variance 1.

Notice that (X − μ)/σ is a linear transformation of X; apply the previous result with a = −μ/σ and b = 1/σ

The above linear transformation on the r.v. X defined by subtracting the mean of X followed by dividing
the result by the sd of X is called standardization of X

Usually reserve Z to denote a standard normal r.v.

62
Probability computation for Normal r.v.
Definition
The cdf of a N(0, 1) r.v., Z, is defined by, for −∞ < x < ∞,

Φ(x) = P(Z ≤ x) = ∫_{−∞}^x (1/√(2π)) e^(−t²/2) dt
Area under the standard normal density between −∞ and x

No closed-form expression for this integral

Z-table is used to check values of Φ(x) for x ≥ 0

Because of symmetry, Φ(x) = 1 − Φ(−x); this can be used for x < 0.


63
Probability computation for Normal r.v.

cdf of a N(μ, σ²) r.v. in terms of Φ(·)

The cdf of X ∼ N(μ, σ²) is given by F(x) = Φ((x − μ)/σ)

Computing probabilities of a N(μ, σ²) r.v.

For X ∼ N(μ, σ²), P(c ≤ X ≤ d) = Φ((d − μ)/σ) − Φ((c − μ)/σ), where −∞ < c ≤ d < ∞.

64
Z-table

Table 2: "Cumulative Normal Distribution-Value of P Corresponding to a z-score zP for the Standard
Normal Curve", page A7 of the textbook

65
Example

Lower tailed probability:

P(Z<0.34) = Φ(0.34) = 0.6331

P(Z<1.96) = Φ(1.96) = 0.9750

P(Z<-1.96) = 1 - Φ(1.96) = 0.0250

66
Example

Upper tailed probability:

P(Z > 1.96) = 1 − P(Z ≤ 1.96) = 1 − Φ(1.96) = 0.0250

P(Z > −1.96) = Φ(1.96) = 0.9750

Between 2 points:

P(a < Z < b) = Φ(b) − Φ(a)

67
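The Z-table lookups above can be reproduced without a table, since Φ can be written in terms of the error function. A minimal Python sketch (not part of the slides):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal cdf: Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1 + erf(x / sqrt(2)))

print(round(Phi(0.34), 4))               # lower tail, ≈ 0.6331
print(round(1 - Phi(1.96), 4))           # upper tail, ≈ 0.025
print(round(Phi(1.96) - Phi(-1.96), 4))  # between 2 points, ≈ 0.95
```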
Quiz

Let X be the gestational length in weeks. We know from prior research X ∼ N(39.18, 2²). What is the
probability of gestations being less than 40 weeks?

68
Quiz

Let X be the gestational length in weeks. We know from prior research X ∼ N(39.18, 2²). What is the
probability of gestations being less than 40 weeks?

P(X < 40) = Φ((40 − 39.18)/2) = Φ(0.41) = 0.6591

69
The “inverse” problem

Definition
For a r.v. X ∼ N(μ, σ2) and probability 0 < p < 1, we are interested in d s.t. P(X ≤ d) = p. Here, d is called
the quantile of X at p, and this problem is called “inverse” problem of computing probs of a normal r.v.
When a normally distributed r.v. X is of interest,

● what is the value of d in the following claim for any given 0 < p < 1,
● frequency that a realization from X is ≤ d is p

Locate p in the pool of numbers in the Z-table, and then compute d based on the x value associated with p

70
Example

Let X be the gestational length in weeks. We know from prior research X ∼ N(39, 2²). What is the
gestational length l such that 40% of all gestational lengths are shorter?

Find d s.t. P(X ≤ d) = 0.40, where P(X ≤ d) = Φ((d − 39)/2)

Since p = 0.40 < 0.5, the z-score (d − 39)/2 < 0; hence transform: Φ((d − 39)/2) = 0.40 ⇔ Φ(−(d − 39)/2) = 0.60

Find z s.t. Φ(z) = 0.60 from the Z-table: z ≈ 0.25, so d ≈ 39 − 2 × 0.25 = 38.5
71
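The table lookup can be cross-checked with Python's standard library, which provides an inverse normal cdf. A minimal sketch (not part of the slides):

```python
from statistics import NormalDist

X = NormalDist(mu=39, sigma=2)   # gestational length in weeks
d = X.inv_cdf(0.40)              # quantile at p = 0.40, ≈ 38.49

# Standardizing recovers z = Phi^{-1}(0.40) ≈ -0.2533
z = (d - 39) / 2
```

The Z-table value 0.25 is a two-decimal rounding of |z| ≈ 0.2533, which is why the table-based answer is 38.5 rather than 38.49.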
Functions of a r.v.

Given a r.v. X with a density function, we are often interested in another r.v. Y = g(X) defined by a known
function g (either one-to-one or many-to-one) of X

e.g., interested in the revenue of a shop (Y), which depends on the sales (X) (assuming that the r.v. sales is
fully understood)

The composite function g ◦ X defines a new r.v. Y from Ω to R
72
Functions of a r.v.

Same as for the linear transformation Y = a + bX with X ∼ N(μ, σ²):

1. Obtain the cdf of Y in terms of X

2. Differentiate the cdf w.r.t. y

Change of variable technique

Let X be a cont. r.v. having pdf f and cdf F. Suppose that g(x) is a strictly monotonic (increasing or
decreasing) & differentiable (& thus cont.) function of x on some interval I. Suppose that f(x) = 0 if x is not
in I. Then, the r.v. Y defined by Y = g(X) has a pdf given by

f_Y(y) = f(g⁻¹(y)) |d g⁻¹(y)/dy|,

where g⁻¹(y) is defined to be the value of x s.t. g(x) = y.
73


Functions of a r.v.

Proof: Assume g(x) is an increasing function. Suppose y = g(x) for some x; then, with Y = g(X),

F_Y(y) = P(g(X) ≤ y) = P(X ≤ g⁻¹(y)) = F(g⁻¹(y))

Differentiating w.r.t. y yields

f_Y(y) = f(g⁻¹(y)) · (d/dy) g⁻¹(y)

g⁻¹(y) is non-decreasing, so its derivative is non-negative.

74
Example

Let Z ∼ N(0, 1). Define Y = e^Z. Find the pdf of Y.

The exponential function is increasing, and Y = e^Z is positive for all −∞ < z < ∞.

By change of variable, g⁻¹(y) = ln(y) and (d/dy) g⁻¹(y) = 1/y, so, with φ the standard normal pdf, for y > 0,

f_Y(y) = φ(ln y) / y = (1/(y√(2π))) e^(−(ln y)²/2)

Note: This non-negative r.v. is called a lognormal r.v., as a logarithmic transformation of Y gives a normal r.v.

75
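The change-of-variable result can be sanity-checked numerically: the derived pdf should match the numerical derivative of F_Y(y) = Φ(ln y). A minimal Python sketch (not part of the slides):

```python
from math import erf, exp, log, pi, sqrt

def phi(z):                 # standard normal pdf
    return exp(-z * z / 2) / sqrt(2 * pi)

def Phi(z):                 # standard normal cdf
    return 0.5 * (1 + erf(z / sqrt(2)))

def lognormal_pdf(y):       # change-of-variable result: phi(ln y) / y
    return phi(log(y)) / y

# Central-difference derivative of F_Y(y) = Phi(ln y) at y = 2
y, h = 2.0, 1e-6
numeric = (Phi(log(y + h)) - Phi(log(y - h))) / (2 * h)
```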
Quiz

What is the pdf of Y = Z² where Z ∼ N(0, 1)?

Note: the square function is neither increasing nor decreasing, so the change of variable technique is not
directly applicable.

76
Quiz

What is the pdf of Y = Z² where Z ∼ N(0, 1)?

Note: the square function is neither increasing nor decreasing, so the change of variable technique is not
directly applicable. Work with the cdf instead: for y > 0,

F_Y(y) = P(Z² ≤ y) = P(−√y ≤ Z ≤ √y) = Φ(√y) − Φ(−√y) = 2Φ(√y) − 1

Differentiating w.r.t. y,

f_Y(y) = φ(√y)/√y = (1/√(2πy)) e^(−y/2), y > 0

Note: This is a gamma r.v. with parameters ½ and ½, which is also called the chi-squared (χ²) r.v. with 1 degree of freedom.
77
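The same numerical check works here: the pdf obtained by differentiating F_Y(y) = 2Φ(√y) − 1 should match (2πy)^(−1/2) e^(−y/2). A minimal Python sketch (not part of the slides):

```python
from math import erf, exp, pi, sqrt

def Phi(z):                 # standard normal cdf
    return 0.5 * (1 + erf(z / sqrt(2)))

def chi2_1_pdf(y):          # derived pdf of Y = Z^2: e^{-y/2} / sqrt(2*pi*y)
    return exp(-y / 2) / sqrt(2 * pi * y)

# Central-difference derivative of F_Y(y) = 2*Phi(sqrt(y)) - 1 at y = 1.5
y, h = 1.5, 1e-6
numeric = (2 * Phi(sqrt(y + h)) - 2 * Phi(sqrt(y - h))) / (2 * h)
```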
