
Discrete random variable

Cumulative distribution function

We characterize a random variable by its cumulative distribution function
(CDF), in addition to the probability mass function. We write the CDF as F_X(x).

F_X(x) = P(X ≤ x)

Recall the probability mass function

p_X(x) = P(X = x)

Do you find any connection between them?

6
PMF vs. CDF

If you toss a fair coin 8 times, let X denote the number of heads you get. The
probability mass function and cumulative distribution function of X are related as follows.

• PMF → CDF: take the cumulative sum.

P(X ≤ 1) = P(X = 0) + P(X = 1)

• CDF → PMF: take the difference.

P(X = 2) = P(X ≤ 2) - P(X ≤ 1)
7
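As a quick numerical illustration of these two conversions, here is a minimal Python sketch using the fair-coin, n = 8 setup from this slide (the variable names are mine):

```python
from math import comb

n, p = 8, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# PMF -> CDF: cumulative sum
cdf = []
running = 0.0
for prob in pmf:
    running += prob
    cdf.append(running)

# CDF -> PMF: take differences
pmf_back = [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, n + 1)]

print(cdf[1])        # P(X <= 1) = P(X = 0) + P(X = 1)
print(pmf_back[2])   # P(X = 2) recovered from the CDF
```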
Expectation and variance

The two most important summaries of a random variable.

• Expectation measures the average of the random variable

E[X] = Σ_x x P(X = x)

• Variance measures how volatile the random variable is

Var[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2

8
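These formulas can be checked directly on any finite PMF. A small Python sketch, reusing the Binomial(8, 0.5) PMF from the previous slide (helper names are mine):

```python
from math import comb

n, p = 8, 0.5
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

# E[X] = sum_x x * P(X = x)
mean = sum(x * px for x, px in pmf.items())

# Var[X] = E[X^2] - (E[X])^2
second_moment = sum(x**2 * px for x, px in pmf.items())
var = second_moment - mean**2

print(mean, var)   # 4.0 and 2.0, matching n*p and n*p*(1-p)
```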
Binomial random variable

A binomial random variable is the sum of a sequence of independent Bernoulli random variables.

• Repeat a binary experiment n times; you are interested in counting how many times an event happens, where the event has probability p of happening in each trial.

• We call this count a binomial random variable, or say that the random variable follows a binomial distribution, written Binomial(n, p) or Bin(n, p).

• We use the notation X ∼ Binomial(n, p) to indicate that a random variable X follows the binomial distribution.

• The probability mass function is

P(X = k) = C(n, k) p^k (1-p)^(n-k),  k = 0, 1, ..., n
9
Binomial random variable

Consider n = 6 and k = 3.

• There are many ways for the event to happen 3 times out of 6 trials:
C(n, k) = n! / (k! (n-k)!) of them.

• The event happens k times and fails to happen n - k times, so the
probability of each particular sequence is p^k (1-p)^(n-k).

• P(X = k) = C(n, k) p^k (1-p)^(n-k),  k = 0, 1, ..., n

10
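The counting argument translates directly into code. A minimal sketch, taking p = 0.5 for concreteness (the function name is mine):

```python
from math import comb

def binom_pmf(k, n, p):
    # choose which k of the n trials succeed, times the probability of one such sequence
    return comb(n, k) * p**k * (1 - p)**(n - k)

# n = 6, k = 3 from the slide: 20 possible sequences, each with probability p^3 (1-p)^3
print(comb(6, 3))             # 20
print(binom_pmf(3, 6, 0.5))   # 20 * 0.5^6 = 0.3125
```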
Conditions required to be Binomial

Conditions that need to be met for the binomial distribution

• The trials must be independent of each other.

• The total number of trials, n, must be fixed.

• Each trial gives a binary outcome, success or failure.

• The probability of success, p, must be the same for all trials.

11
Urn example

You have an urn with 10 blue and 20 red balls inside. You pick 9 balls with
replacement. Let X be the number of blue balls. What is P(X = 3)?

• "With replacement" is critical here: the same ball can be drawn twice.

• We have P(blue) = 10/30 = 1/3 and P(red) = 2/3.

• The probability of any particular sequence of 9 balls with 3 blue and 6 red is (1/3)^3 (2/3)^6.

• How many sequences of 9 balls contain exactly 3 blue balls? C(9, 3) ways.

• Therefore the probability is P(X = 3) = C(9, 3) (1/3)^3 (2/3)^6.

• X is a binomial random variable, X ∼ Binomial(9, 1/3).


12
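Plugging in the numbers as a sanity check (a small Python sketch; the answer comes out to about 0.273):

```python
from math import comb

p_blue = 10 / 30
answer = comb(9, 3) * p_blue**3 * (1 - p_blue)**6
print(answer)   # about 0.273
```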
Another example

You send out promotions to 10 customers, and estimate that with


probability 0.3 a customer will purchase the product.

• Let X denote the number of customers who buy the product; X ∼ Binomial(10, 0.3).

• What's the probability that at least one customer buys?

• 1 - P(X = 0) = 1 - (1 - 0.3)^10

• What's the probability that more than 7 customers buy?

• P(X = 8) + P(X = 9) + P(X = 10) =
C(10, 8) (0.3)^8 (1-0.3)^2 + C(10, 9) (0.3)^9 (1-0.3)^1 + (0.3)^10

• Notice that C(10, 0) = C(10, 10) = 1.

13
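A quick numerical check of both answers (a small Python sketch):

```python
from math import comb

n, p = 10, 0.3

# at least one customer buys
print(1 - (1 - p)**n)   # about 0.972

# more than 7 customers buy
print(sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(8, 11)))   # about 0.0016
```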
Geometric random variable

• Bernoulli PMF describes the probability of success / failure in a single experiment.

• Binomial PMF describes the probability of k successes among n experiments.

• Sometimes, we are interested in how many times you have to try until the first success.

14
Geometric random variable

You are repeating a Bernoulli experiment; you won't stop until you see the
desired outcome.

It is possible to repeat many times: there is no upper bound on the number
of trials, especially if p is very small.

15
Geometric random variable

• Alice keeps buying lottery tickets until she wins a hundred million dollars.
She is interested in the random variable "number of lottery tickets
bought until winning $100M".

• Alice tries to catch a taxi. How many occupied taxis will drive past
before she finds a vacant one?

• The number of trials required to get a single success is a geometric


random variable.

16
Geometric random variable

We repeatedly toss a biased coin, P(head) = p. The geometric random


variable is the number X of tosses to get the first head.

• X can take any positive integer value 1, 2, .... There is no upper bound.

• X = k means that you get a head on the k-th toss, and tails on all k - 1 tosses before it.

• P(X = k) = P(TTT...TH) = (1-p)^(k-1) p

• Σ_{k=1}^∞ P(X = k) = 1 (why?)

17
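A small Python sketch of the geometric PMF; p = 0.3 is an arbitrary choice of mine, and the infinite series is truncated to show numerically that it sums to 1:

```python
p = 0.3

def geom_pmf(k, p):
    # k - 1 tails followed by one head
    return (1 - p)**(k - 1) * p

# the infinite series sums to 1; truncate at a large k to see it numerically
print(sum(geom_pmf(k, p) for k in range(1, 200)))   # about 1.0
```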
PMF of geometric random variable

P(X = k) = P(TTT...TH) = (1-p)^(k-1) p

18
Geometric random variable

• What does X ≥ k imply?

• You failed in the first k - 1 trials. The k-th might be a success, but
who knows.

• What does X > k imply?

• You failed in the first k trials.

19
Geometric random variable

• P(X ≥ k) = Σ_{j=k}^∞ p (1-p)^(j-1) = (1-p)^(k-1). You get the sum by summing the geometric series.

• Intuitively, P(X ≥ k) asks for the probability that the first k - 1 tosses are all tails.

• Consider tossing a coin: what's the probability that the first k - 1 tosses are all tails? (1-p)^(k-1). You can calculate it directly rather than summing the series.

• X > k is equivalent to X ≥ k + 1, so
P(X > k) = P(X ≥ k + 1) = (1-p)^k

20
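A small numerical check that the closed form matches the series; p = 0.3 and k = 5 are arbitrary values of mine:

```python
p, k = 0.3, 5

# tail probability by summing the series (truncated far out)
series = sum((1 - p)**(j - 1) * p for j in range(k, 500))

# closed form: the first k - 1 tosses are all tails
closed = (1 - p)**(k - 1)

print(series, closed)   # both about 0.2401
print((1 - p)**k)       # P(X > k) = 0.16807
```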
Geometric: mean and variance

If X ∼ Geo(p), its mean and variance are

E[X] = 1/p,    Var[X] = (1-p)/p^2
See page 156 of Ross for a proof.
Intuitively, the smaller p, the more times you have to try to get the first
success.

21
Memoryless of geometric random variable

What is P(X > a + b | X > a)?


• X > a + b | X > a means: given that your first a experiments failed, the event that the first a + b experiments all failed.

• Basically, you keep running your poor experiments and get b more failures.

• We can calculate it by the definition of conditional probability:

P(X > a + b | X > a) = P(X > a + b) / P(X > a)
                     = (1-p)^(a+b) / (1-p)^a
                     = (1-p)^b
                     = P(X > b)
You forgot about your first a failures, and started the clock fresh!
22
Memoryless of geometric random variable

Just like you never grow old

P(X > a + b | X > a) = P(X > b)

Another way to write it

P(X > b | X > a) = P(X > b - a),  for all b > a

23
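The memoryless property is easy to verify numerically. A minimal sketch, with p, a, b chosen arbitrarily by me:

```python
p, a, b = 0.3, 4, 6

def tail(k, p):
    # P(X > k) = (1 - p)^k for a geometric random variable
    return (1 - p)**k

lhs = tail(a + b, p) / tail(a, p)   # P(X > a + b | X > a)
rhs = tail(b, p)                    # P(X > b)
print(lhs, rhs)                     # both 0.7^6 = 0.117649
```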
Example of memoryless

Someone is playing baccarat in a casino. The probability of a win is 10%.

• How many times is he expected to play to get the first win?


• E[X] = 1/p = 10
• Given he has already lost 9 plays, how many times is he expected to
play to get the first win?
• E[X | X > 9]. This is a conditional expectation: compute the
conditional probabilities, then take the expectation.
• Due to the memoryless property,
P(X > a + b | X > a) = P(X > b). In terms of the PMF,
P(X = a + b | X > a) = P(X = b), so essentially
E[X | X > 9] = E[9 + X] = E[X] + 9 = 19.

24
Example of memoryless

E[X | X > 9] = Σ_{x=10}^∞ x P(X = x | X > 9)
             = Σ_{x=10}^∞ x P(X = x - 9)    (by memorylessness)
             = 10 × P(X = 1) + 11 × P(X = 2) + · · ·
             = Σ_{x=1}^∞ (9 + x) P(X = x)
             = 9 Σ_{x=1}^∞ P(X = x) + Σ_{x=1}^∞ x P(X = x)
             = 9 + E[X] = 19

25
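The same conditional expectation can be checked numerically by truncating the series far out (a small sketch using the p = 0.1 casino example):

```python
p = 0.1   # win probability from the casino example

def geom_pmf(k, p):
    return (1 - p)**(k - 1) * p

# E[X | X > 9] as a (truncated) conditional expectation
tail9 = (1 - p)**9   # P(X > 9)
cond_mean = sum(x * geom_pmf(x, p) / tail9 for x in range(10, 2000))
print(cond_mean)     # about 19, i.e. E[X] + 9
```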
Negative binomial random variable

• Bernoulli PMF describes the probability of success / failure in a


single experiment.

• Binomial PMF describes the probability of k successes among n experiments.

• Geometric PMF describes how many times you have to try until the first success.

• We go one step further than the geometric: how many times do you have to try until r successes?

26
Negative binomial random variable

• Formally, you perform independent trials, each with probability p of being a success. X denotes the number of trials performed until a total of r successes is accumulated.

• Denote it as X ∼ NegBin(p, r).

• PMF:

P(X = n) = C(n-1, r-1) p^r (1-p)^(n-r),  n = r, r+1, r+2, ...

• Clearly the minimal value of X is r (you get r successes straight away); there is no upper bound.

27
Negative binomial random variable

Intuition of the PMF


• Consider the probability of X = n: we collect the r-th success exactly at the n-th trial.

• So the last trial (the n-th) must be the r-th success, right? That's when we stop.

• The remaining r - 1 successes must be in the previous n - 1 trials.

• How many ways are there to place r - 1 successes among n - 1 trials? C(n-1, r-1).

• What's the probability of one such arrangement?
p^(r-1) (1-p)^((n-1)-(r-1)) = p^(r-1) (1-p)^(n-r)

• Do not forget that your last trial is a success! One more factor of p.

• P(X = n) = C(n-1, r-1) p^(r-1) (1-p)^(n-r) × p = C(n-1, r-1) p^r (1-p)^(n-r)

28
Negative binomial random variable

Consider r = 3. The last one must be a success.

You have r - 1 successes in the previous n - 1 trials.

29
Mean and variance

Suppose X ∼ NegBin(p, r).

E[X] = r/p,    Var[X] = r(1-p)/p^2
See page 159 of Ross for a proof.

30
Example

• Cristiano Ronaldo scores a goal with probability 0.7 on each attempt. What
is the probability that he scores his 3rd goal on his 5th shooting attempt in
one football game?

• X ∼ NegBin(0.7, 3), the number of attempts until the 3rd goal.

• P(X = 5) = C(4, 2) (0.7)^3 (1-0.7)^2 = 0.185

• On average, he needs to shoot 3/0.7 = 4.29 times to get 3 goals.

31
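A quick check of this example (a small sketch; the function name is mine and it implements the PMF from slide 27):

```python
from math import comb

def negbin_pmf(n, r, p):
    # r - 1 successes in the first n - 1 trials, then a success on trial n
    return comb(n - 1, r - 1) * p**r * (1 - p)**(n - r)

print(negbin_pmf(5, 3, 0.7))   # about 0.185
print(3 / 0.7)                 # expected number of attempts, about 4.29
```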
Hypergeometric random variable

Suppose an urn has N balls, of which m are white and N - m are black.
You randomly choose n balls from the urn without replacement. Let X
denote the number of white balls selected. X is a hypergeometric
random variable, X ∼ HyperGeo(N, m, n).

P(X = i) = C(m, i) C(N-m, n-i) / C(N, n),  i = 0, 1, ..., m

• Intuitively, the denominator is the total number of ways to choose n of the N balls.

• Getting i white balls means drawing i out of the m white balls and n - i out of the N - m black balls.
• E[X] = mn/N,  Var[X] = (mn/N) [ (n-1)(m-1)/(N-1) + 1 - mn/N ]

32
Example

You are inspecting a shipment of 10 electronic devices; suppose 2 of them are
defective. You randomly pick 3 devices to inspect, and the policy is to
reject the shipment if you see any defective device. What's the
probability that you will reject this shipment?

• X ∼ HyperGeo(10, 2, 3).

• P(reject) = P(X = 1) + P(X = 2) = 1 - P(X = 0)

• P(X = 0) = C(2, 0) C(8, 3) / C(10, 3) = 0.47

• P(reject) = 1 - 0.47 = 0.53

33
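A quick numerical check (a small sketch; the function name is mine and it implements the PMF from the previous slide):

```python
from math import comb

def hypergeom_pmf(i, N, m, n):
    # i white balls out of m, the other n - i out of the N - m black balls
    return comb(m, i) * comb(N - m, n - i) / comb(N, n)

p0 = hypergeom_pmf(0, 10, 2, 3)   # no defective device in the sample
print(p0, 1 - p0)                 # about 0.47 and 0.53
```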
Large n, small p

• I write a book with 10,000 words. The probability that a word has a typo
is 1/1000. I am interested in how many typos there can be in the book.

• Let X be the number of typos in the book. We want to find P(X = 3).

• This is a binomial random variable, right? We have 10,000
experiments, and p = 1/1000 for each experiment. We are interested
in how many times a typo happens among the 10,000 experiments.

• P(X = 3) = C(10000, 3) (1/1000)^3 (999/1000)^9997

• The calculation above... is too crazy. Maybe a simpler


approximation is better.

• This is a representative type of problems: many binary outcomes


experiments (large n), but probability of a success is tiny (small p).

34
Large n, small p

There are many settings in real life that may have huge n, but small p.

• The number of car crashes every day.

• The number of customers entering your store within a given time period.

• The number of mutations on a strand of DNA.

We describe such situations with the Poisson random variable.

35
Poisson random variable

• A Poisson random variable takes non-negative integers as values. It
has one non-negative parameter λ. We write it as X ∼ Poisson(λ).

• P(X = k) = e^(-λ) λ^k / k!,  k = 0, 1, .... No upper bound.

• Σ_{k=0}^∞ P(X = k) = e^(-λ) (1 + λ + λ^2/2! + · · ·) = 1. Exponential series.

36
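A minimal Python sketch of the Poisson PMF; λ = 3 is an arbitrary choice of mine, and the series is truncated to show numerically that the probabilities sum to 1:

```python
from math import exp, factorial

lam = 3.0

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

# the exponential series makes the probabilities sum to 1
print(sum(poisson_pmf(k, lam) for k in range(100)))   # about 1.0
```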
Poisson as an approximation to binomial

• When n is large but p is very small, a binomial random variable can be well approximated by a Poisson with λ = n × p.

• In the plot above, np = 3 is fixed but n and p differ. You can see
the approximation is better when n is large and p is tiny.

• When np is fixed, the smaller p, the better the approximation.

37
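The approximation can be seen numerically by fixing np = 3 and increasing n, as in the plot. A small sketch (k = 2 is an arbitrary choice of mine):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

# keep n*p = 3 fixed and let n grow: the binomial PMF approaches the Poisson PMF
for n in (10, 100, 10_000):
    p = 3 / n
    print(n, binom_pmf(2, n, p), poisson_pmf(2, 3))
```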
Poisson: mean and variance

If X ∼ Poisson(λ), the mean and variance are both λ:

E[X] = λ,    Var[X] = λ

• The derivation is omitted here; see page 145 of Ross for the proof.
• One interesting property of the Poisson random variable is that its mean
and variance are the same.
• How do you remember it?
• Hint: Binomial(n, p) can be approximated by Poisson(λ) where λ = np.
The binomial mean is np and its variance is np(1-p) ≈ np when p is tiny.

38
Example

Assume that on a given day 1000 cars are out in the city. On average,
3 out of 1000 cars run into a traffic accident per day. Suppose the
accidents are independent of each other.

1. What's the probability that we see at least 2 accidents in a day?

2. Binomial? Each car is a Bernoulli trial (accident or not): large n
but small p. Use Poisson with λ = np = 3!

3. P(X ≥ 2) = 1 - P(X = 0) - P(X = 1) = 1 - e^(-3) 3^0/0! - e^(-3) 3^1/1! = 0.80

4. If you know there is at least one accident, what's the probability that
the total number of accidents is at least two?

5. P(X ≥ 1) = 1 - P(X = 0) = 1 - e^(-3) = 0.950
   P(X ≥ 2 | X ≥ 1) = P(X ≥ 2)/P(X ≥ 1) = 0.80/0.950 = 0.84

39
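A quick check of the two answers (a minimal sketch):

```python
from math import exp

lam = 3.0   # expected number of accidents per day

p0 = exp(-lam)         # P(X = 0)
p1 = exp(-lam) * lam   # P(X = 1)

p_at_least_2 = 1 - p0 - p1
p_at_least_1 = 1 - p0

print(p_at_least_2)                   # about 0.80
print(p_at_least_2 / p_at_least_1)    # about 0.84
```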
Example

Arsenal are playing Chelsea tomorrow in an English Premier League match.
Suppose the numbers of goals they are expected to score are Poisson with
mean rates 2.2 and 2.7 respectively. Assume they score independently.
What's the probability of a 1 - 1 draw?

P(Arsenal gets 1) = e^(-2.2) 2.2^1 / 1! = e^(-2.2) × 2.2
P(Chelsea gets 1) = e^(-2.7) 2.7^1 / 1! = e^(-2.7) × 2.7
P(1 - 1 draw) = P(Arsenal gets 1) × P(Chelsea gets 1)
              = e^(-2.2) × 2.2 × e^(-2.7) × 2.7
              = 0.044

40
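A quick check of the final number (a minimal sketch):

```python
from math import exp

p_arsenal_1 = exp(-2.2) * 2.2   # P(Arsenal scores exactly 1)
p_chelsea_1 = exp(-2.7) * 2.7   # P(Chelsea scores exactly 1)

print(p_arsenal_1 * p_chelsea_1)   # about 0.044
```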
