Bernoulli Distribution
PMF and Expectation
We have:
Experiment with different outcomes.
Random Variable, X: Numeric values that can be assigned
to groups of outcomes in our sample space.
PMF, p_X:
Input: numeric values of our random variable, k
Output: probabilities of getting those values, P(X=k)
Expectation of X:
The average value over many trials of the experiment. (aka mean)
E[X] = Σ_k k · P(X = k)
(sum over all possible values k: each value itself, weighted by the fraction of time that value happens)
Summary from last time
Linearity of expectation:
E[aX + b] = aE[X] + b
Law of the Unconscious Statistician (LOTUS), aka
Expectation of a general function: E[g(X)] = Σ_x g(x) · p(x)
Moments of a random variable
The nth moment of a random variable is defined as:
E[X^n] = Σ_x x^n · p(x)
• Expectation = 1st moment
Problem: Let Y = outcome of a single die roll.
Calculate the 2nd moment of Y.
Solution: E[Y^2] = (1/6)(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2) = 91/6
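As a quick sanity check, here is a minimal Python sketch (the nth_moment helper is hypothetical, not from the slides) that computes moments directly from a PMF, assuming a fair six-sided die:

from fractions import Fraction

def nth_moment(pmf, n):
    # E[X^n] = sum over k of k^n * P(X = k)
    return sum(k ** n * prob for k, prob in pmf.items())

# PMF of a fair six-sided die: each outcome 1..6 has probability 1/6
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}

print(nth_moment(die_pmf, 1))   # 1st moment (mean): 7/2
print(nth_moment(die_pmf, 2))   # 2nd moment: 91/6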
Variance
Consider the following distributions (PMFs):
Variance
If X is a random variable with mean E[X] = µ then
the variance of X, denoted Var(X), is:
Var(X) = E[(X – µ)^2]
Note: Var(X) ≥ 0
Also known as the 2nd Central Moment, or square of the Standard Deviation
Variance as a spread
Variance, aka the average squared difference:
E[(X – E[X])^2] ≥ 0
[Figure: two example PMFs plotted over values from –100 to 100, illustrating a small spread vs. a large spread around the mean]
Computing Variance
Var(X) = E[(X – E[X])^2]
= E[(X – µ)^2]        (X is an RV with mean µ, i.e., E[X] = µ)
= Σ_x (x – µ)^2 p(x)
= Σ_x (x^2 – 2µx + µ^2) p(x)
= Σ_x x^2 p(x) – 2µ Σ_x x p(x) + µ^2 Σ_x p(x)
= E[X^2] – 2µE[X] + µ^2
= E[X^2] – 2µ^2 + µ^2
= E[X^2] – µ^2
= E[X^2] – (E[X])^2
Variance of a 6-sided die
Let Y = outcome of a single die roll.
Calculate the variance of Y.
Solution:
E[Y] = (1/6)(1 + 2 + 3 + 4 + 5 + 6) = 7/2        (1st moment, mean, expectation)
E[Y^2] = (1/6)(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2) = 91/6        (2nd moment)
Var(Y) = E[Y^2] – (E[Y])^2 = 91/6 – 49/4 = 35/12 ≈ 2.92
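A small simulation sketch (assuming numpy is available; the sample size is an arbitrary illustrative choice) that checks this value empirically:

import numpy as np

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=1_000_000)   # one million simulated die rolls

# Empirical variance should be close to the exact value 35/12 ≈ 2.9167
print(rolls.var())           # empirical average of (Y - mean)^2
print(91/6 - (7/2) ** 2)     # exact: E[Y^2] - (E[Y])^2 = 35/12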
Properties of Variance
Var(X) = E[X^2] – (E[X])^2
Var(aX + b) = a^2 Var(X)        (shifting by b does not change the spread)
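A short sketch (illustrative only, assuming numpy; a = 3 and b = 5 are hypothetical choices) that checks the second property numerically for a die roll Y:

import numpy as np

rng = np.random.default_rng(1)
y = rng.integers(1, 7, size=1_000_000)   # simulated die rolls
a, b = 3, 5

# Var(aY + b) should match a^2 * Var(Y); the shift b does not affect spread
print((a * y + b).var())     # ≈ 9 * 35/12 ≈ 26.25
print(a ** 2 * y.var())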
Jacob Bernoulli
Jacob Bernoulli (1654–1705), also known as “James”, was a Swiss mathematician.
A Bernoulli random variable X ~ Ber(p) takes value 1 (success) w.p. p and 0 (failure) w.p. (1 – p).
E[X] = p
Var(X) = p(1 – p)
Examples:
• Coin flip
• Random binary digit
• Whether a disk drive crashed
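A minimal sketch (the bernoulli_stats helper is hypothetical) that derives E[X] = p and Var(X) = p(1 – p) directly from the two-point PMF:

def bernoulli_stats(p):
    # PMF of X ~ Ber(p): P(X = 0) = 1 - p, P(X = 1) = p
    mean = 0 * (1 - p) + 1 * p                    # E[X] = p
    second_moment = 0**2 * (1 - p) + 1**2 * p     # E[X^2] = p, since 1^2 = 1
    return mean, second_moment - mean ** 2        # Var(X) = p - p^2 = p(1 - p)

print(bernoulli_stats(0.3))   # (0.3, 0.21)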
Binomial Random Variable
Consider n independent trials of Ber(p) random variables.
• X is # successes in n trials
Binomial RV, X:
X ~ Bin(n, p):   P(X = k) = p(k) = (n choose k) p^k (1 – p)^(n – k)
X ∈ {0, 1, …, n}
Examples:
• # of heads in n coin flips
• # of 1’s in randomly generated length n bit string
• # of disk drives crashed in a 1000-computer cluster
(assuming disks crash independently)
Three coin flips: p(k) = (n choose k) p^k (1 – p)^(n – k)
Three fair (“heads” with p = 0.5) coins are flipped.
• X is number of heads
• X ~ Bin(3, 0.5)
Compute the PMF of X.
P(X = k) = p(k) = (3 choose k) (1/2)^k (1 – 1/2)^(3 – k), where k ∈ {0, 1, 2, 3}
P(X = 0) = (3 choose 0) (1/2)^0 (1/2)^3 = 1/8
P(X = 1) = (3 choose 1) (1/2)^1 (1/2)^2 = 3/8
P(X = 2) = (3 choose 2) (1/2)^2 (1/2)^1 = 3/8
P(X = 3) = (3 choose 3) (1/2)^3 (1/2)^0 = 1/8
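The same PMF computed in Python (a sketch using math.comb for the binomial coefficient; the binomial_pmf helper is not part of the slides):

from math import comb

def binomial_pmf(k, n, p):
    # P(X = k) for X ~ Bin(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Three fair coin flips: X ~ Bin(3, 0.5)
for k in range(4):
    print(k, binomial_pmf(k, 3, 0.5))   # 0.125, 0.375, 0.375, 0.125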
PMF of Binomial: p(k) = (n choose k) p^k (1 – p)^(n – k)
[Figure: P(X = k) plotted against k for binomial distributions with different parameters]
Genetic inheritance: p(k) = (n choose k) p^k (1 – p)^(n – k)
Each person has 2 genes per trait (e.g., eye color).
• Child receives 1 gene (equally likely) from each parent
• Brown is “dominant”, blue is ”recessive”:
• Child has brown eyes if either (or both) genes are brown; blue eyes only if both genes are blue.
• Parents each have 1 brown and 1 blue gene.
• 4 children total
P(3 children with brown eyes)?
Solution:
Define: X = # children with brown eyes. X ~ Bin(4, p)
p = P(child has brown eyes)
p = 1 – P(child has blue eyes) = 1 – (1/2) (1/2) = 0.75
→ X ~ Bin(4, 0.75)
P(X = 3) = (4 choose 3) 0.75^3 × 0.25^1 ≈ 0.4219
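The same answer as a short, self-contained sketch (values taken from the setup above):

from math import comb

p = 1 - 0.5 * 0.5            # P(child has brown eyes) = 0.75
n, k = 4, 3
print(comb(n, k) * p**k * (1 - p)**(n - k))   # ≈ 0.4219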
Properties of Bin(n,p)
Consider X ~ Bin(n,p).
E[X] = np
Var(X) = np(1 – p)
E[X^2] = n^2p^2 – np^2 + np
Proof: Var(X) = E[X^2] – (E[X])^2, so
E[X^2] = Var(X) + (E[X])^2
= np(1 – p) + (np)^2
= n^2p^2 – np^2 + np
Note: Ber(p) = Bin(1,p)
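A sketch that checks these formulas against direct PMF sums, for one illustrative choice of n and p (not from the slides):

from math import comb

n, p = 10, 0.3
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

mean = sum(k * pk for k, pk in pmf.items())
second = sum(k**2 * pk for k, pk in pmf.items())

print(mean, n * p)                              # E[X] = np = 3.0
print(second - mean**2, n * p * (1 - p))        # Var(X) = np(1 - p) = 2.1
print(second, n**2 * p**2 - n * p**2 + n * p)   # E[X^2] = 11.1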
Hamming Codes (error correcting codes)
You want to send a 4-bit string over a network.
• Add 3 “parity” bits and send 7 bits total
• Each bit is independently corrupted (flipped) in transit w.p. 0.1
• Define X = number of bits corrupted: X ~ Bin(7,0.1)
• Parity bits allow us to correct at most 1 bit error.
P(a correctable message is received)?
Solution:
Define: E = correctable message is received
P(E) = P(X = 0) + P(X = 1), where X ~ Bin(7, 0.1)
P(E) = (7 choose 0) 0.1^0 0.9^7 + (7 choose 1) 0.1^1 0.9^6 ≈ 0.8503
What if we didn’t use error correcting codes?
Define: P(E) = P(X = 0), where X ~ Bin(4, 0.1)
P(E) = (4 choose 0) 0.1^0 0.9^4 ≈ 0.6561
Using error correction improves reliability by about 30%!
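Both probabilities in one sketch (the binom_pmf helper name is illustrative):

from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# With the Hamming code: 7 bits sent, message correctable if 0 or 1 bits flip
p_corrected = binom_pmf(0, 7, 0.1) + binom_pmf(1, 7, 0.1)   # ≈ 0.8503

# Without error correction: all 4 bits must arrive intact
p_plain = binom_pmf(0, 4, 0.1)                              # ≈ 0.6561

print(p_corrected, p_plain, p_corrected / p_plain)          # ratio ≈ 1.30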
Break
Q: How do you make an octopus laugh?
A: You give it ten tickles!
Attendance: tinyurl.com/cs109summer2018
Binomial IRL
In real networks:
◦ Large bit strings (n ≈ 10^4)
◦ Tiny probability of bit corruption (p ≈ 10^–6)
◦ X ~ Bin(10^4, 10^–6) is unpleasant to work with
Instead, use a Poisson random variable: X ~ Poi(λ), where λ is the expected number of events per unit of time. For example, with X ~ Poi(2), λ = 2:
P(X = 5) = (λ^k / k!) e^(–λ) = (2^5 / 5!) e^(–2) ≈ 0.0361
Earthquakes: X ~ Poi(λ), p(k) = (λ^k / k!) e^(–λ)
There are an average of 2.8 major earthquakes in the world each year.
What is the probability of more than 1 major earthquake happening next
year?
Solution:
Define: X = # major earthquakes next year
X ~ Poi(2.8)
WTF: P(> 1 earthquake happening next year)
P(X > 1) = 1 – [P(X = 0) + P(X = 1)]
= 1 – (2.8^0 / 0!) e^(–2.8) – (2.8^1 / 1!) e^(–2.8)
≈ 1 – 0.06 – 0.17 = 0.77
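The same computation as a sketch, with the Poisson PMF written out directly:

from math import exp, factorial

def poisson_pmf(k, lam):
    # P(X = k) for X ~ Poi(lam)
    return lam**k * exp(-lam) / factorial(k)

lam = 2.8
print(1 - poisson_pmf(0, lam) - poisson_pmf(1, lam))   # ≈ 0.77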
Poisson process
Given: A unit of time (1 year, 1 sec, whatever)
Events arrive at rate λ per unit of time
WTF: X, # occurrences per unit of time
Then X ~ Poi(λ): the number of occurrences per unit of time follows a Poisson distribution.
Bit corruption
• Send a bit string of length n = 10^4
• Probability of independent bit corruption p = 10^–6
What is P(message arrives uncorrupted)?
Solution 1:
Let Y ~ Bin(10^4, 10^–6).
P(Y = 0) = (10^4 choose 0) (10^–6)^0 (1 – 10^–6)^(10^4) ≈ 0.990049829
Solution 2:
Let X ~ Poi(λ = 10^4 × 10^–6 = 0.01).
P(X = 0) = (λ^k / k!) e^(–λ) = (0.01^0 / 0!) e^(–0.01) ≈ 0.990049834
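A sketch comparing the exact Binomial answer with the Poisson approximation (values as in the problem setup):

from math import comb, exp, factorial

n, p = 10**4, 10**-6
lam = n * p    # 0.01

exact = comb(n, 0) * p**0 * (1 - p)**n        # Binomial: P(Y = 0)
approx = lam**0 * exp(-lam) / factorial(0)    # Poisson:  P(X = 0)

print(exact)    # ≈ 0.990049829
print(approx)   # ≈ 0.990049834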
Poisson is Binomial in Limit
[Figure: P(X = k) plotted against k for Bin(10, 0.3), Bin(100, 0.03), and Poi(3), nearly overlapping]
Poisson approximates Binomial well when n is large, p is small, and λ = np is “moderate”, e.g.:
• n > 20 and p < 0.05
• n > 100 and p < 0.1
• n → ∞, p → 0
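A sketch illustrating the convergence in the figure: hold λ = 3 fixed and let n grow, so Bin(n, λ/n) approaches Poi(3). The value k = 2 is an arbitrary illustrative choice:

from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

lam, k = 3, 2
for n in (10, 100, 1000):
    print(n, binom_pmf(k, n, lam / n))   # approaches the Poisson value below
print("Poi:", poisson_pmf(k, lam))       # ≈ 0.2240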
A Real License Plate Seen at Stanford
Discrete RV distributions, part 1
X ~ Ber(p), X ∈ {0, 1}:   1 = success w.p. p,   0 = failure w.p. (1 – p)