
Unexpected Expectations

Evan Chen Andrew Critch


[email protected] [email protected]

June, 2015

Sometimes the easiest way to solve a problem is to define some random variables whose
expectations make things more intuitive to think and write about. When the problem
doesn’t involve any randomness at its outset, folks call this a “probabilistic method”
because you introduce probability where it wasn’t invited. But actually this method is
cool even when it’s not the only probabilistic thing going on, so we’ll look at some of
those applications, too.
Note from the second author: I’ve made very few changes to Evan’s original version of
these notes; all the credit for assembling this excellent collection of problems goes to him!

1 Definitions and Notation


A random variable is just a quantity that depends on some world or input that we
assign probabilities to. For instance, consider a six-sided die roll (with equal probability
assigned to each outcome), and let D6 be the number that ends up on the top face of the
die. Here D6 is a random variable. The subscript “6” isn’t needed; it’s just a common
way to indicate a random variable having 6 equally likely outcomes.
We can discuss the probability of certain events, which we’ll denote P(•). For instance,
we can write things like
P(D6 = 1) = P(D6 = 2) = · · · = P(D6 = 6) = 1/6,
P(D6 = 0) = 0,
P(D6 ≥ 4) = 1/2.

If we let D6∗ be the number that ends up on the bottom of the die, we can also say
things like
P(D6 = D6∗) = 0 and P(D6 + D6∗ = 7) = 1.
Now suppose we rolled the same die again, and called the bottom face E6 . Even though
E6 has the same distribution over its outputs as D6∗ (namely, 1 . . . 6 each happening with
probability 1/6), it interacts with D6 in very different ways:
P(D6 = E6) = 1/6 and P(D6 + E6 = 7) = 1/6.
This is why it’s important to think of a random variable as a function of a random input,
rather than just a list of values with a probability associated to each one, because in the
latter sense, D6∗ and E6 would be identical. Folks often call the outputs of a random
variable its “values”, which suppresses awareness that each random variable is actually a
function. But to avoid confusion one needs to remember that the variables are functions,
either implicitly or explicitly.
The conditional probability of X given Y is defined as

P(X = x | Y = y) := P(X = x and Y = y)/P(Y = y)

and is undefined when P(Y = y) = 0. In our dice example, P(D6 = 3 | D6∗ = 4) = 1.


We say X is independent of Y and write X ⊥⊥ Y if for all x, y,

P(X = x and Y = y) = P(X = x)P(Y = y)

This is equivalent to the statement

P(X = x | Y = y) = P(X = x)

i.e., that finding out the value of Y tells you nothing about the distribution of X, and by
symmetry, conversely. In our dice example, you can check that

D6 ⊥⊥ E6 and D6 ̸⊥⊥ D6∗.

The expected value or expectation of a random variable X is the probability-weighted
average of its values:

E[X] := Σ_x P(X = x) · x.

For our dice roll D6 ,


E[D6] = (1/6) · 1 + (1/6) · 2 + · · · + (1/6) · 6 = 3.5.
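As a quick sanity check (our addition, not part of the original notes; the helper name roll is ours), a short Python simulation makes the function-of-a-random-input point concrete: D6∗ and E6 have identical distributions, but only the underlying rolls distinguish how they interact with D6.

    import random

    def roll():
        # One roll of a fair die: returns (top, bottom); the two faces sum to 7.
        top = random.randint(1, 6)
        return top, 7 - top

    trials = 100_000
    total = same_die = fresh_die = 0
    for _ in range(trials):
        d6, d6_star = roll()   # D6 and D6*: top and bottom of the SAME roll
        e6 = roll()[1]         # E6: bottom face of an independent second roll
        total += d6
        same_die += (d6 + d6_star == 7)   # always true
        fresh_die += (d6 + e6 == 7)       # holds with probability 1/6

    print(total / trials)      # ~3.5: E[D6]
    print(same_die / trials)   # 1.0: P(D6 + D6* = 7) = 1
    print(fresh_die / trials)  # ~0.167: P(D6 + E6 = 7) = 1/6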

1.1 Graph theory terminology


A number of problems here will refer to an independent set, which is a set of vertices
(in a graph) for which no two are connected by an edge. See
http://en.wikipedia.org/wiki/Graph_theory#Definitions if you aren't yet familiar with
basic terms from graph theory, like vertices, edges, subgraphs, and so on.

2 Properties of Expected Value


2.1 A Motivating Example
It is an unspoken law that any introduction to expected value begins with the following
classical example.
Example 2.1. At MOP, there are n people, each of whom has a name tag. We shuffle the
name tags and randomly give each person one of the name tags. Let S be the number of
people who receive their own name tag. Prove that the expected value of S is 1.
This result might seem surprising, as one might intuitively expect E[S] to depend on
the choice of n.
For simplicity, let us call a person a fixed point if they receive their own name tag.
(This is actually the term used to describe points which are unchanged by a permutation,
so the usual phrasing of this question is “what is the expected number of fixed points of
a random permutation?”) Thus S is just the number of fixed points, and we wish to show
that E[S] = 1. If we're interested in the expected value, then according to our definition
we should go through all n! permutations, count up the total number of fixed points, and
then divide by n! to get the average. Since we want E[S] = 1, we expect to see a total of
n! fixed points.
Let us begin by illustrating the case n = 4 first, calling the people W , X, Y , Z.
W X Y Z Σ
1 W X Y Z 4
2 W X Z Y 2
3 W Y X Z 2
4 W Y Z X 1
5 W Z X Y 1
6 W Z Y X 2
7 X W Y Z 2
8 X W Z Y 0
9 X Y W Z 1
10 X Y Z W 0
11 X Z W Y 0
12 X Z Y W 1
13 Y W X Z 1
14 Y W Z X 0
15 Y X W Z 2
16 Y X Z W 1
17 Y Z W X 0
18 Y Z X W 0
19 Z W X Y 0
20 Z W Y X 1
21 Z X W Y 1
22 Z X Y W 2
23 Z Y W X 0
24 Z Y X W 0
Σ 6 6 6 6 24
We’ve listed all 4! = 24 permutations, and indeed we see that there are a total of 24
fixed points (the per-row counts appear in the rightmost Σ column). Unfortunately, if we
look at that rightmost column, there doesn’t seem to be a pattern, and it seems hard to
prove that this holds for other values of n.
However, suppose that rather than trying to add by rows, we add by columns. There’s
a very clear pattern if we try to add by the columns: we see a total of 6 fixed points in
each column. Indeed, the six fixed W points correspond to the 3! = 6 permutations of
the remaining letters X, Y , Z. Similarly, the six fixed X points correspond to the 3! = 6
permutations of the remaining letters W , Y , Z.
This generalizes very nicely: if we have n letters, then each letter appears as a fixed
point (n − 1)! times. Thus the expected value is
 
E[S] = (1/n!) · ((n − 1)! + (n − 1)! + · · · + (n − 1)!)  [n copies]  = (1/n!) · n · (n − 1)! = 1.
Cute, right? Now let’s bring out the artillery.

2.2 Linearity of Expectation


The crux result of this section is the following theorem.


Theorem 2.2 (Linearity of Expectation). Given any random variables X1 , X2 , . . . , Xn ,


and constants ai , we always have

E[a1 X1 + a2 X2 + · · · + an Xn ] = a1 E[X1 ] + a2 E[X2 ] + · · · + an E[Xn ].

This theorem is highly intuitive if the X1 , X2 , . . . , Xn are independent of each other –


if we roll 100 dice, we expect an average of 350. The wonderful thing is that this holds
even if the variables are not independent. And the basic idea is just the double-counting
we did in the earlier example: even if the variables depend on each other, if you look only
at the expected value, you can still add just by columns. The proof of the theorem is just
a bunch of sigma signs which say exactly the same thing, so we won’t bother including it.
Anyways, we can now nuke our original problem. The trick is to define indicator
variables as follows: for each i = 1, 2, . . . , n let
Si := 1 if person i gets his own name tag, and 0 otherwise.

Obviously,
S = S1 + S2 + · · · + Sn .
Moreover, it is easy to see that E[Si] = P(Si = 1) = 1/n for each i: if we look at any
particular person, the probability they get their own name tag is simply 1/n. Therefore,

E[S] = E[S1] + E[S2] + · · · + E[Sn] = 1/n + 1/n + · · · + 1/n = 1,

with n terms in the sum.

Now that was a lot easier! By working in the context of expected value, we get a
framework where the “double-counting” idea is basically automatic. In other words,
linearity of expectation lets us only focus on small, local components when computing an
expected value, without having to think about a lot of interactions between cases and
quantities that would otherwise distract us.
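To see this computationally (a minimal simulation we add here; the function name is ours), the following Python experiment estimates E[S] for several n and finds it is always about 1:

    import random

    def avg_fixed_points(n, trials=100_000):
        # Estimate E[S]: the average number of people who get their own tag.
        total = 0
        for _ in range(trials):
            tags = list(range(n))
            random.shuffle(tags)
            total += sum(person == tag for person, tag in enumerate(tags))
        return total / trials

    for n in (2, 5, 50):
        print(n, avg_fixed_points(n))  # each estimate is ~1.0, independent of n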

2.3 More Examples


Example 2.3 (HMMT 2006). At a nursery, 2006 babies sit in a circle. Suddenly, each
baby randomly pokes either the baby to its left or to its right. What is the expected
value of the number of unpoked babies?

Solution. Number the babies 1, 2, . . . , 2006. Define


Xi := 1 if baby i is unpoked, and 0 otherwise.

We seek E[X1 + X2 + · · · + X2006]. Note that any particular baby has probability
(1/2)² = 1/4 of being unpoked (if both its neighbors miss). Hence E[Xi] = 1/4 for each i, and

E[X1 + X2 + · · · + X2006] = E[X1] + E[X2] + · · · + E[X2006] = 2006 · (1/4) = 1003/2.
Seriously, this should feel like cheating.
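If the cheating makes you nervous, here is a brief Monte Carlo check (our addition; the function name is ours) that 1003/2 is plausible:

    import random

    def avg_unpoked(n=2006, trials=10_000):
        # Estimate the expected number of unpoked babies in a circle of n.
        total = 0
        for _ in range(trials):
            poked = [False] * n
            for i in range(n):
                # baby i pokes its left or right neighbor, chosen at random
                poked[(i + random.choice((-1, 1))) % n] = True
            total += poked.count(False)
        return total / trials

    print(avg_unpoked())  # ~501.5 = 1003/2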


2.4 Conditional expectations


While E[X + Y] = E[X] + E[Y] always holds, in general E[XY] ≠ E[X]E[Y]. However,
there is something you can say in general about the expectation of a product, for which
we need a definition:
The conditional expectation of Y given X is a function of X that says what the
expectation of Y is once the value of X is known:

E[Y | X = x] := Σ_y y · P(Y = y | X = x).

For example, if X is a (uniformly) randomly chosen face of a die, and Y is a randomly


chosen other face of that die, then
E[Y | X] = Σ_{i≠X} i · (1/5) = (21 − X)/5,

since Y is uniform over the five faces other than X.

To emphasize again, E[Y | X] denotes a function depending on the value of X, which


in particular makes it a random variable. So now we can ask about the value of an
expression like E[E[Y | X]]. It’s an exercise in comfort-with-notation to prove that, in
general,

Proposition 2.4 (Conditional expectations). For any random variables X and Y ,

E[f (X)Y ] = E[f (X) · E[Y | X]]

In particular, when f (X) = X and f (X) = 1, we get

E[XY ] = E[X · E[Y | X]] and E[Y ] = E[E[Y | X]]

To illustrate with our dice example,

E[XY] = E[X · E[Y | X]]
      = E[X · (21 − X)/5]
      = E[(21X − X²)/5]
      = (21/5) · E[X] − (1/5) · E[X²]
      = (21/5) · (21/6) − (1/5) · (91/6) = 35/3.
Proposition 2.5 (Independence and multiplicative expectations). Given any random
variables X and Y ,

X ⊥⊥ Y ⇒ E[Y | X] = E[Y] ⇒ E[XY] = E[X]E[Y]

where (exercise) each of these implications is strict, i.e., the reverse does not hold.

It’s reasonably common to forget that these implications are not reversible, so it’s a
reasonable exercise to come up with examples to illustrate it. But while you’re thinking
about that, don’t forget: unlike when multiplying two random variables, linearity of
expectation does not require independence!
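A simulation sketch we add here (not in the original notes) confirms E[XY] = E[X · E[Y | X]] = 35/3 numerically for the dice example above:

    import random

    trials = 200_000
    sum_xy = sum_x_cond = 0.0
    for _ in range(trials):
        x = random.randint(1, 6)                               # a random face X
        y = random.choice([i for i in range(1, 7) if i != x])  # a random OTHER face Y
        sum_xy += x * y
        sum_x_cond += x * (21 - x) / 5                         # X · E[Y | X]

    print(sum_xy / trials)      # ~11.67 = 35/3 = E[XY]
    print(sum_x_cond / trials)  # converges to the same value: E[X · E[Y | X]]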


2.5 Practice Problems


The first two problems are somewhat straightforward applications of the methods de-
scribed above.
Problem 2.6 (AHSME 1989). Suppose that 7 boys and 13 girls line up in a row. Let
S be the number of places in the row where a boy and a girl are standing next to each
other. For example, for the row GBBGGGBGBGGGBGBGGBGG we have S = 12.
Find the expected value of S.
Problem 2.7 (AIME 2006 #6). Let S be the set of real numbers that can be represented
as repeating decimals of the form 0.abc (with the block abc repeating), where a, b, c are
distinct digits. Find the sum of the elements of S.
The next few problems are harder; in these problems linearity of expectation is not
the main idea of the solution. All problems below were written by Lewis Chen.
Problem 2.8 (NIMO 4.3). One day, a bishop and a knight were on squares in the same
row of an infinite chessboard, when a huge meteor storm occurred, placing a meteor in
each square on the chessboard independently and randomly with probability p. Neither
the bishop nor the knight was hit, but their movement may have been obstructed by
the meteors. For what value of p is the expected number of valid squares that the bishop
can move to (in one move) equal to the expected number of squares that the knight can
move to (in one move)?
Problem 2.9 (NIMO 5.6). Tom has a scientific calculator. Unfortunately, all keys
are broken except for one row: 1, 2, 3, + and -. Tom presses a sequence of 5 random
keystrokes; at each stroke, each key is equally likely to be pressed. The calculator then
evaluates the entire expression, yielding a result of E. Find the expected value of E.
(Note: Negative numbers are permitted, so 13-22 gives E = −9. Any excess operators
are parsed as signs, so -2-+3 gives E = −5 and -+-31 gives E = 31. Trailing operators
are discarded, so 2++-+ gives E = 2. A string consisting only of operators, such as -++-+,
gives E = 0.)

3 Direct Existence Proofs


In its simplest form, we can use expected value to show existence as follows: suppose
we know that the average score of the USAMO 2014 was 12.51. Then there exists a
contestant who got at least 13 points, and a contestant who got at most 12 points. This
is similar in spirit to the pigeonhole principle, but the probabilistic phrasing is far more
robust.

3.1 A First Example


Let’s look at a very simple example, taken from the midterm of a class at San Jose
State University.

Example 3.1 (SJSU M179 Midterm). Prove that any subgraph of Kn,n with at least
n² − n + 1 edges has a perfect matching.

(For a phrasing of the problem without graph theory: given n red points and n blue
points, suppose we connect at least n² − n + 1 pairs of opposite colors. Prove that we can
select n segments, no two of which share an endpoint.)

We illustrate the case n = 4 in the figure.


Figure 1: The case n = 4. There are n² − n + 1 = 13 edges, and the matching is
highlighted in green.

This problem doesn’t “feel” like it should be very hard. After all, there’s only a total
of n² possible edges, so having n² − n + 1 edges means we have practically all edges
present. (On the other hand, n² − n + 1 is actually the best bound possible. Can you
construct a counterexample with n² − n edges?)
So let’s be really careless and just randomly pair off one set of points with the other,
regardless of whether there is actually an edge present. We call the score of such a pairing
the number of pairs which are actually connected by an edge. We wish to show that
some pairing has score n, as this will be the desired perfect matching.
So what’s the expected value of a random pairing? Number the pairs 1, 2, . . . , n and
define Xi := 1 if the ith pair is connected by an edge, and 0 otherwise.
Then the score of the configuration is X = X1 + X2 + · · · + Xn. Given any red point and
any blue point, the probability they are connected by an edge is at least (n² − n + 1)/n².
This means that E[Xi] ≥ (n² − n + 1)/n², so

E[X] = E[X1] + · · · + E[Xn] = n · E[X1] ≥ (n² − n + 1)/n = n − 1 + 1/n.

Since E[X] > n − 1 and X takes only integer values, there must be some configuration
which achieves X ≥ n, i.e. X = n. Thus, we're done.
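The proof is nonconstructive, but it suggests a procedure: keep sampling random pairings until one of score n appears. A small Python sketch (our own function name and toy graph, for illustration only):

    import random

    def random_matching_search(n, edges, attempts=10_000):
        # Sample random pairings until one where every pair is an edge appears;
        # the expectation argument guarantees such a pairing exists whenever
        # len(edges) >= n*n - n + 1.
        edge_set = set(edges)
        for _ in range(attempts):
            blues = list(range(n))
            random.shuffle(blues)
            if all((red, blue) in edge_set for red, blue in enumerate(blues)):
                return list(enumerate(blues))
        return None

    # K_{4,4} minus 3 edges: 13 = 4^2 - 4 + 1 edges remain
    edges = [(r, b) for r in range(4) for b in range(4)
             if (r, b) not in {(0, 1), (1, 2), (2, 3)}]
    print(random_matching_search(4, edges))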

3.2 Application: Ramsey Numbers


Let’s do another simple example. Before we begin, I will quickly introduce a silly algebraic
lemma, taken from [5, page 30].

Lemma 3.2. For any positive integers n and k,

(n choose k) < (1/e) · (en/k)^k.

Here e ≈ 2.718 . . . is the base of the natural logarithm.



Proof. Note that (n choose k) < n^k/k!, and then use calculus to prove that
k! ≥ e(k/e)^k. Specifically,

ln 1 + ln 2 + · · · + ln k ≥ ∫_{x=1}^{k} ln x dx = k ln k − k + 1,

whence exponentiating works.

Algebra isn’t much fun, but at least it’s easy. Let’s get back to the combinatorics.

Example 3.3 (Ramsey Numbers). Let n and k be integers with n ≤ 2^{k/2} and k ≥ 3.
Then it is possible to 2-color the edges of the complete graph on n vertices with the
following property: one cannot find k vertices for which the (k choose 2) edges among
them are monochromatic.

Remark. In the language of Ramsey numbers, this proves that R(k, k) > 2^{k/2}.

Solution. Again we just randomly color the edges and hope for the best. We use a coin
flip to determine the color of each of the (n choose 2) edges. Let's call a collection of k
vertices bad if all (k choose 2) edges among them are the same color. The probability
that any given collection is bad is

(1/2)^{(k choose 2) − 1}.

The number of collections is (n choose k), so the expected number of bad collections is

E[number of bad collections] = (n choose k) / 2^{(k choose 2) − 1}.

We just want to show this is less than 1. You can check this fairly easily using Lemma 3.2;
in fact, we have a lot of room to spare.
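For concreteness, here is a quick numeric check we add (function name ours) of the quantity (n choose k)/2^{(k choose 2)−1} at the extreme n = ⌊2^{k/2}⌋:

    from math import comb, floor

    def expected_bad(n, k):
        # E[# of monochromatic k-vertex collections] under a random 2-coloring of K_n.
        return comb(n, k) / 2 ** (comb(k, 2) - 1)

    for k in (3, 10, 20):
        n = floor(2 ** (k / 2))          # the largest n allowed by the hypothesis
        print(k, n, expected_bad(n, k))  # each printed value is < 1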

3.3 Practice Problems


The first two problems are from [2]; the last one is from [4].

Problem 3.4. Show that one can construct a (round-robin) tournament outcome with
more than 1000 people such that for any set of 1000 people, some contestant outside that
set beats all of them.

Problem 3.5 (BAMO 2004). Consider a set of n real numbers, not all zero, with sum
zero. Prove that one can label the numbers as a1 , a2 , . . . , an such that

a1 a2 + a2 a3 + · · · + an a1 < 0.

Problem 3.6 (Russia 1996). In the Duma there are 1600 delegates, who have formed
16,000 committees of 80 people each. Prove that one can find two committees having no
fewer than four common members.

4 Heavy Machinery
Here are some really nice ideas used in modern theory. Unfortunately I couldn’t find
many olympiad problems that used them. If you know of any, please let me know!


4.1 Alteration
In previous arguments we often proved a result by showing E[bad] < 1. A second method
is to select some things, find the expected value of the number of “bad” situations, and
subtract that off. An example will make this clear.
Example 4.1 (Weak Turán). A graph G has n vertices and average degree d. Prove
that it is possible to select an independent set of size at least n/(2d).
Proof. Rather than selecting n/(2d) vertices randomly and hoping the number of edges is 0,
we'll instead select each vertex with probability p. (We will pick a good choice of p later.)
That means the expected number of vertices we will take is np. Now there are (1/2)nd
edges, so the expected number of “bad” situations (i.e. an edge in which both vertices
are taken) is (1/2)nd · p².

Now we can just get rid of all the bad situations. For each bad edge, delete one of its
endpoints arbitrarily (possibly with overlap). This costs us at most (1/2)nd · p² vertices,
so the expected value of the number of vertices left is

np − (1/2)ndp² = np(1 − (1/2)dp).

It seems like a good choice of p is 1/d, which now gives us an expected value of n/(2d),
as desired.

A stronger result is Problem 5.5.
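The alteration argument translates directly into a one-round randomized procedure; here is a minimal Python sketch (our own function name and toy graph, not from the notes):

    import random

    def alteration_independent_set(n, edges, d):
        # Keep each vertex with probability p = 1/d, then delete one endpoint
        # of every edge that survives with both endpoints kept.
        p = 1.0 / d
        kept = {v for v in range(n) if random.random() < p}
        for u, v in edges:
            if u in kept and v in kept:
                kept.discard(u)      # arbitrary choice of endpoint
        return kept                  # independent set; E[size] >= n/(2d)

    # cycle on 12 vertices: n = 12, average degree d = 2, so E[size] >= 3
    edges = [(i, (i + 1) % 12) for i in range(12)]
    print(alteration_independent_set(12, edges, 2))

The returned set is always independent; only its size is random, and on average it meets the n/(2d) bound.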

4.2 Union Bounds and Markov’s Inequality


A second way to establish existence is to establish a nonzero probability. One way to do
this is using a union bound.
Proposition 4.2 (Union Bound). Consider several events A1 , A2 , . . . , Ak . If

P(A1 ) + P(A2 ) + · · · + P(Ak ) < 1

then there is a nonzero probability that none of the events occur.


The following assertion is sometimes useful for this purpose.
Theorem 4.3 (Markov’s Inequality). Let X be a random variable taking only nonnegative
values. Suppose E[X] = c. Then

P(X ≥ rc) ≤ 1/r.
This is intuitively obvious: if the average score on the USAMO was 7, then at most 1/6
of the contestants got a perfect score of 42. The inequality is also sometimes called
Chebyshev's inequality or the first Chebyshev inequality.
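In fact the proof is one line; we include it here for completeness (it is not in the original notes). Since X is nonnegative,

c = E[X] ≥ E[X · 1{X ≥ rc}] ≥ rc · P(X ≥ rc),

and dividing by rc gives P(X ≥ rc) ≤ 1/r.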

4.3 Lovász Local Lemma


The Lovász Local Lemma (abbreviated LLL) is in some sense a refinement of the union
bound idea – if the events in question are “mostly” independent, then the probability no
events occur is still nonzero.
We present below the “symmetric” version of the Local Lemma. An asymmetric version
also exists (see Wikipedia).


Theorem 4.4 (Lovász Local Lemma). Consider several events, each occurring with
probability at most p, and such that each event is independent of all the others except at
most d of them. Then if

epd ≤ 1

the probability that no events occur is positive. (Here e = 2.71828 . . . is again the base of
the natural logarithm.)

More precisely, the independence hypothesis says: if we denote the events with binary
variables X1, X2, . . ., then we require that for each i, there is a set Di of size at most
d + 1 containing Xi such that Xi is independent of its complement Di^c. In other words,
measuring the value of all the variables in Di^c together will tell you nothing about the
distribution of Xi.

Note that we don’t use the number of events, only the number of dependencies.
As the name implies, the local lemma is useful in situations where in a random
algorithm, it appears that things do not depend much on each other. The following
Russian problem is such an example.

Example 4.5 (Russia 2006). At a tourist camp, each person has at least 50 and at
most 100 friends among the other persons at the camp. Show that one can hand out a
T-shirt to every person such that the T-shirts have (at most) 1331 different colors, and
any person has 20 friends whose T-shirts all have pairwise different colors.

The constant C = 1331 is extremely weak. We’ll reduce it to C = 48 below.

Solution. Give each person a random T-shirt. For each person P , we consider the event
E(P ) meaning “P ’s neighbors have at most 19 colors of shirts”. We wish to use the
Local Lemma to prove that there is a nonzero probability that no events occur.
If we have two people A and B, and they are neither friends nor have a mutual friend
(in graph theoretic language, the distance between them is at least two), then the events
E(A) and E(B) do not depend on each other at all. So any given E(P ) depends only on
friends, and friends of friends. Because any P has at most 100 friends, and each of these
friends has at most 99 friends other than P, E(P) depends on at most 100 + 100 · 99 = 100²
other events. Hence in the lemma we can set d = 100².
For a given person, look at their 50 ≤ k ≤ 100 neighbors. The probability that there
are at most 19 colors among the neighbors is clearly at most

(C choose 19) · (19/C)^k.

To estimate the binomial coefficient, we can again use our silly Lemma 3.2 to get that
this is at most

(1/e) · (eC/19)^19 · (19/C)^k = e^18 · (19/C)^{k−19} ≤ e^18 · (19/C)^{31}.

Thus, we can put p = e^18 · (19/C)^31. Then the Lemma implies we are done as long as

e^19 · (19/C)^31 · 100² ≤ 1.

It turns out that C = 48 is the best possible outcome here. Establishing the inequality
when C = 1331 just amounts to some rough estimation with the e's.
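To see the rough estimation concretely, one can simply evaluate e·p·d numerically (a check we add; the function name is ours):

    from math import e

    def lll_lhs(C, d=100**2):
        # e * p * d with p = e^18 * (19/C)^31, the estimate derived above.
        p = e**18 * (19 / C) ** 31
        return e * p * d

    print(lll_lhs(1331))  # astronomically small: LLL applies with room to spare
    print(lll_lhs(48))    # ~0.59, still below 1
    print(lll_lhs(47))    # ~1.14, just above 1: the estimate breaks down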



5 More Practice Problems


These problems are mostly taken from [2, 4].
Problem 5.1 (IMC 2002). An olympiad has six problems and 200 contestants. The
contestants are very skilled, so each problem is solved by at least 120 of the contestants.
Prove that there exist two contestants such that each problem is solved by at least one
of them.
Problem 5.2 (Romania 2004). Prove that for any complex numbers z1, z2, . . . , zn
satisfying |z1|² + |z2|² + · · · + |zn|² = 1, one can select ε1, ε2, . . . , εn ∈ {−1, 1} such that

|ε1 z1 + ε2 z2 + · · · + εn zn| ≤ 1.

Problem 5.3 (Shortlist 1999 C4). Let A be a set of N distinct residues (mod N²). Prove
that there exists a set B of N residues (mod N²) such that A + B = {a + b | a ∈ A, b ∈ B}
contains at least half of all the residues (mod N²).
Problem 5.4 (Iran TST 2008/6). Suppose 799 teams participate in a round-robin
tournament. Prove that one can find two disjoint groups A and B of seven teams each
such that all teams in A defeated all teams in B.
Problem 5.5 (Caro-Wei Theorem). Consider a graph G with vertex set V. Prove that
one can find an independent set with size at least

Σ_{v∈V} 1/(deg v + 1).

Remark. Note that, by applying Jensen's inequality, our independent set has size at
least n/(d + 1), where d is the average degree. This result is called Turán's Theorem (or
the complement thereof).
Problem 5.6 (USAMO 2012/6). For integer n ≥ 2, let x1, x2, . . . , xn be real numbers
satisfying x1 + x2 + · · · + xn = 0 and x1² + x2² + · · · + xn² = 1. For each subset
A ⊆ {1, 2, . . . , n}, define

SA = Σ_{i∈A} xi.

(If A is the empty set, then SA = 0.) Prove that for any positive number λ, the number
of sets A satisfying SA ≥ λ is at most 2^{n−3}/λ².
Problem 5.7 (Online Math Open, Ray Li). Kevin has 2^n − 1 cookies, each labeled with
a unique nonempty subset of {1, 2, . . . , n}. Each day, he chooses one cookie uniformly at
random out of the cookies not yet eaten. Then, he eats that cookie, and all remaining
cookies that are labeled with a subset of that cookie. Compute the expected value of the
number of days that Kevin eats a cookie before all cookies are gone.
Problem 5.8. Let n be a positive integer. Let ak denote the number of permutations
of n elements with k fixed points. Compute

a1 + 4a2 + 9a3 + · · · + n²an.
Problem 5.9 (Russia 1999). In a certain school, every boy likes at least one girl. Prove
that we can find a set S of at least half the students in the school such that each boy in
S likes an odd number of girls in S.
Problem 5.10. Let n be a positive integer. Suppose 11n points are arranged in a circle,
colored with one of n colors, so that each color appears exactly 11 times. Prove that one
can select a point of every color such that no two are adjacent.


References

[1] pythag011 at http://www.aops.com/Forum/viewtopic.php?f=133&t=481300.

[2] Ravi B's collection of problems, available at
http://www.aops.com/Forum/viewtopic.php?p=1943887#p1943887.

[3] Problem 6 talk (c > 1) by Po-Shen Loh, USA leader, at the IMO 2014.

[4] Also his MOP lecture notes: http://math.cmu.edu/~ploh/olympiad.shtml.

[5] Lecture notes by Holden Lee from an MIT course:
http://web.mit.edu/~holden1/www/coursework/math/18997/notes.pdf.

Thanks to all the sources above. Other nice reads that I went through while preparing
this, but eventually did not use:

1. Alon and Spencer's The Probabilistic Method. The first four chapters are here:
http://cs.nyu.edu/cs/faculty/spencer/nogabook/.

2. A MathCamp lecture that gets the girth-chromatic number result:
http://math.ucsb.edu/~padraic/mathcamp_2010/class_graph_theory_probabilistic/lecture2_girth_chromatic.pdf


6 Unexpected Expectations: Solution Sketches


2.6 Answer: 9.1. Make an indicator variable for each adjacent pair. . .

2.7 Answer: 360. Pick a, b, c randomly and compute E[0.abc]. Then multiply by |S|.

2.8 Answer: p = 1/2. Writing q = 1 − p for the probability that a given square is empty,
the knight expects 8q valid squares and the bishop expects 4(q + q² + q³ + · · ·) = 4q/p;
setting 8q = 4q/p gives p = 1/2.


2.9 Answer: 1866. Any expression with a + or - in it has a complementary expression


with that sign switched, such that any numbers after the sign are cancelled out in
expectation. Thus we need only consider numbers occurring before a sign. Also, any
expression with a 3 in it has a complementary expression with a 1 instead of the 3, so in
expectation every numeral is a 2. The probability of hitting a numeral in any keystroke
is p = 3/5, so the total expectation is

2(p + 10p² + 10²p³ + 10³p⁴ + 10⁴p⁵) = 2p · ((10p)⁵ − 1)/(10p − 1) = (6/5) · (7775/5) = 6 · 311 = 1866.
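The answer is small enough to verify by brute force over all 5⁵ = 3125 keystroke sequences; here is a sketch we add (our own parser, written to follow the problem's stated rules):

    from itertools import product
    from fractions import Fraction

    def evaluate(keys):
        # Parse per the problem: runs of operators act as a single sign,
        # and trailing operators (with no digits after them) are discarded.
        total, i, n = 0, 0, len(keys)
        while i < n:
            sign = 1
            while i < n and keys[i] in '+-':
                if keys[i] == '-':
                    sign = -sign
                i += 1
            j = i
            while j < n and keys[j].isdigit():
                j += 1
            if j > i:
                total += sign * int(keys[i:j])
            i = j
        return total

    vals = [evaluate(''.join(s)) for s in product('123+-', repeat=5)]
    print(Fraction(sum(vals), len(vals)))  # 1866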
3.4 Suppose there are n people, and decide each game outcome with a coin flip. Let U
be the set of “unbeaten” subsets S of size 1000, i.e. such that nobody outside S beats all
of S.
E[|U|] = Σ_{|S|=1000} P(S is unbeaten)
       = Σ_{|S|=1000} P(∀t ∈ S^c, ∃s ∈ S : s beats t)
       = Σ_{|S|=1000} Π_{t∈S^c} (1 − 2^{−1000})
       = (n choose 1000) · (1 − 2^{−1000})^{n−1000},

which is less than 1 for very large n (exponentials eventually dominate polynomials).
Hence for large n, sometimes |U| = 0, as needed.

3.5 Choose the ordering uniformly randomly. Then, with the cyclic convention a_{n+1} = a_1,

E[ai ai+1] = E[ai · E[ai+1 | ai]] = E[ai · (−ai/(n − 1))] = −E[ai²]/(n − 1) < 0

since the numbers are not all zero. Hence the expectation of the given sum is strictly
negative, and so the sum itself is sometimes negative.

3.6 Let ni be the number of committees which the ith delegate is in. Pick two distinct
committees A and B randomly, so

E[|A ∩ B|] = Σ_i ni(ni − 1)/(16,000 · 15,999).

Letting f(n) = n(n − 1), by Jensen's inequality and the fact that the average delegate is
on 16,000 · 80/1600 = 800 committees,

E[|A ∩ B|] ≥ 1600 · f(800)/(16,000 · 15,999) = 639,200/159,990 > 3.99,

and since |A ∩ B| is an integer, some two committees have at least 4 people in common.


5.1 Pick the two contestants, 1 and 2, independently at random. Let Xi be the indicator
that both contestants miss problem i, so each E[Xi] ≤ (80/200)² = 4/25, and their
expected number of both-missed problems is at most 6 · (4/25) = 24/25 < 1 . . .

5.2 Select each of the εi randomly with a coin flip. Let LHS denote the left-hand side of
the desired inequality. Since |z|² = z·z̄ for any z,

LHS² = Σ_k |zk|² + Σ_{i<j} εi εj (zi z̄j + z̄i zj),

and since εi and εj are independent, E(εi εj) = 0, so E(LHS²) = Σ_k |zk|² = 1, hence
LHS² ≤ 1 sometimes, as needed.

5.3 Select the elements of B = {b1, . . . , bN} uniformly randomly (we'll even allow
repetitions, for simplicity). For each r (mod N²) and each i,

P(r ∉ A + bi) = P(bi ∉ r − A) = (N² − N)/N² = 1 − 1/N,

so P(r ∉ A + B) = (1 − 1/N)^N < 1/e < 1/2. Thus E[|A + B|] > (1 − 1/e)N² . . .

5.4 Let Dk be the set of teams which defeat the kth team (here 1 ≤ k ≤ 799), and
dk = |Dk|. Select a 7-element set A = {a1, . . . , a7} uniformly at random, so
P(A ⊆ Dk) = (dk choose 7)/(799 choose 7). Letting N be the number of teams dominated
by A (i.e. beaten by every member of A),

E[N] = Σ_k (dk choose 7)/(799 choose 7).

The function (x choose 7)/(799 choose 7) is convex, and the average value of dk is
798/2 = 399, so by Jensen's inequality E[N] ≥ 799 · (399 choose 7)/(799 choose 7) ≈ 6.02 > 6,
hence sometimes N ≥ 7; take B to be 7 of the teams dominated by such an A.

5.5 A fairly natural approach is to use a greedy algorithm: randomly choose a vertex,
append it to W , remove it and its neighbors from G, repeat until nothing is left, and then
W will be an independent set. One can prove by induction on |G| that E[|W |] satisfies
the given bound.
A simpler proof is to randomly order the vertices {v1, . . . , vn} of G, and take W to be
the subset of those vi which occur before all their neighbors. Then W is independent,
and since vi precedes all deg(vi) of its neighbors with probability 1/(deg(vi) + 1),

E[|W|] = Σ_i P(vi ∈ W) = Σ_i 1/(deg(vi) + 1).

5.6 Since SA = −S_{A^c}, choosing A uniformly gives P(SA ≥ λ) = P(S_{A^c} ≤ −λ) =
P(SA ≤ −λ). Thus P(SA ≥ λ) = (1/2) P(SA² ≥ λ²), and SA² is always nonnegative, so we
can apply the Markov inequality to it. Now

E(SA²) = Σ_i P(i ∈ A) xi² + Σ_{i≠j} P(i, j ∈ A) xi xj
       = (1/2) Σ_i xi² + (1/4) Σ_{i≠j} xi xj.

Since 0 = S_{{1,...,n}}² = 1 + Σ_{i≠j} xi xj, we have

E(SA²) = (1/2)(1) + (1/4)(−1) = 1/4,

hence by the Markov inequality P(SA² ≥ λ²) ≤ 1/(4λ²). Therefore P(SA ≥ λ) ≤ 1/(8λ²),
and the number of sets A with SA ≥ λ is at most 2^n/(8λ²) = 2^{n−3}/λ², as needed.

5.7 The number of days equals the number of times a cookie is chosen (rather than
merely eliminated). Let C be the set of cookies chosen by the process and S = {1, . . . , n},
so

E[#days] = E[|C|] = Σ_{∅≠A⊆S} P(A ∈ C).

Any cookie A is eliminated at the unique stage where a superset A′ of A is chosen
(possibly A itself), and A ∈ C exactly when A = A′. By symmetry each of the 2^{n−|A|}
supersets of A is equally likely to be the first one chosen, so

P(A ∈ C) = 1/#{supersets of A} = 1/2^{n−|A|}.

Thus

E[#days] = Σ_{∅≠A⊆S} 2^{|A|−n} = Σ_{k=1}^{n} (n choose k) · 2^{k−n} = (3^n − 1)/2^n = (3/2)^n − (1/2)^n.

5.8 For a random permutation let X be the number of fixed points, so the required
expression is exactly n! · E[X²]. We already know E[X] = 1 from Example 2.1, and by a
similar argument, the expected number of pairs of fixed points in a random permutation
is

E[(X choose 2)] = Σ_{i<j} P(i, j both fixed) = (n choose 2) · (1/n) · (1/(n − 1)) = 1/2.

Then E[X²] = 2E[(X choose 2)] + E[X] = 2, so the given expression is 2 · n!.




5.9 Let Lb be the set of girls liked by a given boy b, and let B and G be the sets of chosen
boys and girls. For a fixed G, we may as well take B to be the set of all boys who like an
odd number of girls in G, so the challenge is to choose G. Doing so uniformly randomly
means each girl has probability 50% of being included, and for each boy b,

P(b ∈ B) = P(|Lb ∩ G| is odd) = 50%

because Lb is nonempty and a uniformly random subset of a nonempty set is 50% likely
to have odd size. Hence E[|B ∪ G|] = 50% of all students, so some choice of G attains at
least half.

5.10 Label the points s1, . . . , s_{11n} = s0 in order around the circle. Choose one point of
each color randomly to form a set A, and consider the indicators Bi = ind(si, si+1 ∈ A)
of the “bad” events where an adjacent pair occurs in A. For each i, p = P(Bi) is 11^{−2} if
si and si+1 are different colors, or 0 if they're the same color, so p ≤ 11^{−2}. Unfortunately,
the bound E[Σi Bi] ≤ 11n/11² = n/11 is only below 1 if n < 11, so for large n this bound
alone is not strong enough to show that sometimes all the Bi = 0.

However, each Bi is independent of Bj for all j except when sj or sj+1 has the same
color as si or si+1, so we can try to apply the Lovász Local Lemma. There are 21 other
pairs (sj, sj+1) sharing a color with si, and at most another 21 pairs sharing a color with
si+1, so Bi ⊥⊥ Bj for all but at most d = 42 values of j. Now,

epd = e · 42/121 < (28/10) · (42/121) < (30 · 40)/1210 < 1,

so by LLL there is a positive probability that all the Bi = 0, as needed.
