Random Walks
2 Basics
Suppose G is an undirected $d$-regular graph on $n$ nodes. A random walk starts at some node $v$, chooses a neighbor $w$ of $v$ uniformly at random, moves to $w$, and repeats. After $k$ steps we have a probability distribution over the vertices where the walk might be. This corresponds to a vector $v^{(k)}$ with one coordinate for each node (representing the probability that we are at that node), satisfying $\sum_{i \in V} v^{(k)}_i = 1$.
(More generally, we can consider starting our random walk with a probability distribution $v^{(0)}$. Then starting at node $i$ corresponds to the distribution $v^{(0)}_i = 1$, $v^{(0)}_j = 0$ for $j \neq i$.)
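To make the update concrete, here is a minimal sketch in Python (my own illustration, not part of the notes; the function name and NumPy representation are choices of convenience). It builds $M = \frac{1}{d}A$ from the adjacency matrix and applies $v \mapsto Mv$ for $k$ steps.

    import numpy as np

    def walk_distribution(A, v0, k):
        """Distribution after k steps of the walk started from distribution v0."""
        A = np.asarray(A, dtype=float)
        d = A[0].sum()          # d-regular: every row of the adjacency matrix sums to d
        M = A / d               # transition matrix M = (1/d) A
        v = np.asarray(v0, dtype=float)
        for _ in range(k):
            v = M @ v           # v^(k+1) = M v^(k)
        return v

Starting at node $i$ corresponds to taking $v^{(0)} = e_i$, the indicator vector of node $i$.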
Example
Consider $K_3$, the triangle. We start at some vertex $v_1$ with probability 1. After 1 step, we have equal probability of walking towards each of the other vertices, so we are at $v_1$ with probability 0 and at $v_2$, $v_3$ with probability $\frac{1}{2}$ each. And so forth (pictured below, with the probability of being at a given vertex in steps 0 through 3 labelled).
[Figure: the triangle $K_3$ with vertices $v_1, v_2, v_3$, each vertex labelled with its probabilities at steps 0 through 3; for $v_1$ these are $1, 0, \frac{1}{2}, \frac{1}{4}, \ldots$] The transition matrix is
\[
M = \begin{pmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \end{pmatrix}.
\]
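As a quick sanity check of this example (my own code, not from the notes), the following Python snippet iterates $M$ on $K_3$ and reproduces the probabilities $1, 0, \frac{1}{2}, \frac{1}{4}$ of being at $v_1$:

    import numpy as np

    A = np.array([[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]], dtype=float)   # adjacency matrix of K_3
    M = A / 2                                # d = 2, so M = A/2

    v = np.array([1.0, 0.0, 0.0])            # start at v_1 with probability 1
    for k in range(4):
        print(k, v[0])                       # prints 1.0, 0.0, 0.5, 0.25
        v = M @ v                            # one step of the walk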
A stationary distribution of the random walk is a vector (probability distribution) $\sigma$ which is unchanged by one step: i.e., $\sigma$ such that $M\sigma = \sigma$. Equivalently, $\sigma$ is an eigenvector for $M$ with eigenvalue 1 (and since it is a probability distribution, $\sum_{i \in V} \sigma_i = 1$ and $\sigma_i \geq 0$).
Then note the following:
Lemma 2.1 $A$ and $M$ have the same eigenvectors, with eigenvalues scaled by $\frac{1}{d}$.
Proof. We have that $\lambda$ is an eigenvalue of $A$ iff there is some $x$ with $Ax = \lambda x$. But this occurs iff $Mx = \frac{1}{d}Ax = \frac{\lambda}{d}x$, i.e., iff $\frac{\lambda}{d}$ is an eigenvalue of $M$ (with the same eigenvector $x$).
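A small numerical illustration of the lemma (my own check; the triangle $K_3$ is just a convenient example):

    import numpy as np

    # For a d-regular graph, M = A/d has the same eigenvectors as A,
    # with every eigenvalue scaled by 1/d.
    A = np.array([[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]], dtype=float)   # K_3, d = 2
    d = 2
    M = A / d

    eig_A = np.sort(np.linalg.eigvalsh(A))
    eig_M = np.sort(np.linalg.eigvalsh(M))
    assert np.allclose(eig_M, eig_A / d)     # eigenvalues of M are those of A scaled by 1/d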
3 A Question
Recall from previous lectures that $\sigma = \frac{1}{n}\mathbf{1}$ is a stationary distribution (since $\mathbf{1}$ is an eigenvector for $A$). Also, $\sigma$ is unique iff $G$ is connected. This lets us ask the following question:
Question: If we start from an arbitrary initial distribution $v$ and iterate the random walk, will we converge to $\sigma$? (Assuming $G$ is connected, so $\sigma = \frac{1}{n}\mathbf{1}$ is unique.)
Answer: As stated, no. For example, if G = X ∪ Y is bipartite, then we know whether
we’re in X or Y after k steps by the parity of k (so no convergence). (It turns out that this
is the only thing that can go wrong, but as many of the graphs we will be interested in turn
out to be bipartite, this is a serious drawback!)
The answer is yes, however, if we change our random walk to allow ourselves to “stall” (stay at the same vertex $v$) at any $v$ with probability $\frac{1}{2}$. This new kind of random walk has transition matrix $M'_{ij} = \frac{1}{2d}$ if $(i,j) \in E$, $0$ if $i \neq j$ and $(i,j) \notin E$, and $\frac{1}{2}$ if $i = j$. Thus $M' = \frac{1}{2}I + \frac{1}{2d}A = \frac{1}{2}I + \frac{1}{2}M$. Furthermore, $Mx = \lambda x$ iff $\left(\frac{1}{2}I + \frac{1}{2}M\right)x = \left(\frac{1}{2} + \frac{\lambda}{2}\right)x$, so $M'$ has the same eigenvectors as $M$, with each eigenvalue $\lambda$ replaced by $\frac{1}{2} + \frac{\lambda}{2}$.
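To see the difference (my own illustration, using the 4-cycle as the bipartite example): the plain walk oscillates between the two sides forever, while the lazy walk $M' = \frac{1}{2}I + \frac{1}{2}M$ converges to the uniform distribution.

    import numpy as np

    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)   # 4-cycle: bipartite, d = 2
    M = A / 2
    M_lazy = 0.5 * np.eye(4) + 0.5 * M           # stall with probability 1/2

    v_plain = np.array([1.0, 0.0, 0.0, 0.0])     # start at the first vertex
    v_lazy = v_plain.copy()
    for _ in range(50):
        v_plain = M @ v_plain
        v_lazy = M_lazy @ v_lazy

    print(v_plain)   # still stuck on one side: [0.5, 0, 0.5, 0]
    print(v_lazy)    # essentially uniform: ~[0.25, 0.25, 0.25, 0.25]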
4 Bounding Eigenvalues
Recall our notation from last lecture: we let $\lambda_i = \lambda_i(L)$ be the $i$th eigenvalue of the Laplacian (arranged in increasing order $0 = \lambda_1 < \lambda_2 \leq \lambda_3 \leq \cdots \leq \lambda_n$). Then $\lambda_i(A) = d - \lambda_i$ (arranged in decreasing order), and hence $\lambda_i(M) = \frac{1}{d}(d - \lambda_i) = 1 - \frac{\lambda_i}{d}$ and $\lambda_i(M') = 1 - \frac{\lambda_i}{2d}$. Denote this last $\lambda'_i = \lambda_i(M')$.
How large can $\lambda_n$ be? We claim $\lambda_n \leq 2d$.
Proof.
Let $x$ be the $n$th eigenvector, with corresponding eigenvalue $\lambda_n$:
\[ Lx = \lambda_n x. \]
What happens to the $i$th coordinate when $x \to Lx$?
\[ x_i \to \sum_{(i,j) \in E} (x_i - x_j) \]
Suppose $x_i$ has the maximum absolute value. We can assume without loss of generality that $x_i > 0$. Then for all $j$ we have $x_i - x_j \leq 2x_i$, so:
\[ \lambda_n x_i = (Lx)_i = \sum_{(i,j) \in E} (x_i - x_j) \leq 2d\,x_i, \]
and hence $\lambda_n \leq 2d$.
(Note that the only way to have this be an equality is if all coordinates xi have the same
magnitude and edges only connect pairs of nodes with opposite sign. This means we have a
bipartite graph.)
(Note that $\lambda'_n \geq 0$ holds because $\lambda'_i = 1 - \frac{\lambda_i}{2d}$ (and $\lambda_n \leq 2d$); the formula for $\lambda'_i$ holds because we set the probability of “stalling” in the random walk to be $\frac{1}{2}$. Had we chosen a smaller probability, $\lambda'_n$ could be negative.)
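A quick numerical check of the bound (my own; the 4-cycle and the triangle are just convenient examples): the 4-cycle is bipartite and attains $\lambda_n = 2d$ exactly, while the triangle stays strictly below.

    import numpy as np

    def max_laplacian_eigenvalue(A):
        d = int(A[0].sum())                  # degree of the regular graph
        L = d * np.eye(len(A)) - A           # Laplacian L = dI - A
        return np.linalg.eigvalsh(L)[-1], 2 * d

    C4 = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
    K3 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)

    print(max_laplacian_eigenvalue(C4))   # (4.0, 4): bipartite, equality lambda_n = 2d
    print(max_laplacian_eigenvalue(K3))   # (3.0, 4): not bipartite, strictly below 2d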
5 Convergence of Random Walks
5.1 Proof of eventual convergence
Now we are going to prove that regardless of the initial probability distribution, a random
walk on a graph (with stalling) always converges to the stationary distribution σ. The
stationary distribution $\sigma$ is defined as before to be the eigenvector of $M$ (and $M'$) with eigenvalue 1. (Note that an eigenvector with eigenvalue 1 in $M$ also has eigenvalue 1 in $M'$.)
Writing $v = \sum_{i=1}^{n} \alpha_i w_i$ in an orthonormal eigenbasis $\{w_i\}$ of $M'$, and letting $q$ denote the stationary distribution, after $k$ steps we have
\[
\big\|q - (M')^k v\big\|_1 \;\le\; \sqrt{n}\,\big\|q - (M')^k v\big\|_2 \;\le\; \sqrt{n}\left(\sum_{i=2}^{n} \alpha_i^2 (\lambda'_2)^{2k}\right)^{1/2} \;=\; \sqrt{n}\,(\lambda'_2)^k\left(\sum_{i=2}^{n} \alpha_i^2\right)^{1/2} \;\le\; \sqrt{n}\,(\lambda'_2)^k.
\]
The last inequality follows since $\left(\sum_{i=2}^{n} \alpha_i^2\right)^{1/2} \le \|v\|_2 \le \|v\|_1 = 1$.
$q$ is the first eigenvector, so $q_i = \frac{1}{n}$. Hence the maximum relative error is:
\[
n \max_{i \in V} \left| q_i - \big((M')^k v\big)_i \right| \;=\; n\,\big\|q - (M')^k v\big\|_\infty \;\le\; n\,\big\|q - (M')^k v\big\|_1 \;\le\; n\sqrt{n}\,(\lambda'_2)^k \;=\; n^{1.5}\left(1 - \frac{\lambda_2}{2d}\right)^k.
\]
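The following sketch (my own; the 3-dimensional hypercube is just a convenient $d$-regular example) compares the actual maximum relative error of the lazy walk with the bound $n^{1.5}\left(1 - \frac{\lambda_2}{2d}\right)^k$:

    import numpy as np

    n, d = 8, 3
    # Adjacency matrix of the 3-cube: vertices are 3-bit strings,
    # adjacent when they differ in exactly one bit.
    A = np.array([[1.0 if bin(i ^ j).count("1") == 1 else 0.0 for j in range(n)]
                  for i in range(n)])

    L = d * np.eye(n) - A
    lam2 = np.sort(np.linalg.eigvalsh(L))[1]      # second Laplacian eigenvalue
    M_lazy = 0.5 * np.eye(n) + 0.5 * (A / d)      # lazy walk M' = I/2 + M/2

    q = np.full(n, 1.0 / n)                       # stationary distribution
    v = np.zeros(n)
    v[0] = 1.0                                    # start at a single vertex
    for k in range(1, 21):
        v = M_lazy @ v
        actual = n * np.max(np.abs(q - v))        # maximum relative error after k steps
        bound = n ** 1.5 * (1 - lam2 / (2 * d)) ** k
        print(k, actual, bound)                   # actual stays below the bound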
How many steps ($k$) must we take to be within max relative error $\delta$?
We need $n^{1.5}\left(1 - \frac{\lambda_2}{2d}\right)^k \le \delta$, so take
\[
k = \frac{2d}{\lambda_2}\left(\tfrac{3}{2}\ln n + \ln(1/\delta)\right) \;\le\; \frac{3d}{\lambda_2}\left(\ln n + \ln(1/\delta)\right).
\]
To bound $k$ in terms of the graph expansion $\alpha$, we use the fact that $\lambda_2 \ge \frac{\alpha^2}{2d}$ (which was proved at great expense in a previous lecture). Thus we can take
\[
k = \frac{3d}{\alpha^2/(2d)}\left(\ln n + \ln(1/\delta)\right) = \frac{6d^2}{\alpha^2}\left(\ln n + \ln(1/\delta)\right).
\]
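A worked numeric example (my own; the values of $n$, $d$, $\lambda_2$, $\alpha$, $\delta$ below are made up purely for illustration) plugging into the two bounds on $k$:

    import math

    n, d, delta = 10**6, 10, 0.01
    lam2 = 1.0      # assumed value of lambda_2, for illustration only
    alpha = 2.0     # assumed expansion, so alpha^2/(2d) = 0.2 <= lambda_2

    # k <= (3d/lambda_2)(ln n + ln(1/delta))   and   k <= (6d^2/alpha^2)(ln n + ln(1/delta))
    k_spectral = (3 * d / lam2) * (math.log(n) + math.log(1 / delta))
    k_expansion = (6 * d**2 / alpha**2) * (math.log(n) + math.log(1 / delta))
    print(round(k_spectral), round(k_expansion))   # about 553 and 2763 steps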
6 Connection to Markov Chains
We now present some of the more general phrasing and results used by mathematicians when
talking about random walks.
A Markov Chain $C$ has a state set $V$ (assume $|V| = n$, finite) and transition matrix $M$, where $M_{ij}$ is the probability of going from state $i$ to state $j$.
Let $G_C$ be the “state graph” of $C$. The nodes of $G_C$ are $V$. $G_C$ contains a directed edge $(i, j)$ if $M_{ij} > 0$. (Note that the matrix equivalent to iterating $C$ is $v \to v^T M$.)
We present two definitions.
Definition 6.1 Markov Chain C is “irreducible” if the state graph GC is strongly connected.
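A small sketch (my own, not from the notes) of testing irreducibility: build the state graph $G_C$ from $M$ and check strong connectivity by searching forward and backward from state 0.

    def reachable(adj, start):
        """States reachable from `start` by depth-first search."""
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    def is_irreducible(M):
        n = len(M)
        fwd = [[j for j in range(n) if M[i][j] > 0] for i in range(n)]  # edges (i, j)
        bwd = [[j for j in range(n) if M[j][i] > 0] for i in range(n)]  # reversed edges
        # Strongly connected iff every state is reachable from state 0
        # and every state can reach state 0.
        return len(reachable(fwd, 0)) == n and len(reachable(bwd, 0)) == n

    # Example: a chain that only moves "forward" is not irreducible.
    M = [[0.5, 0.5, 0.0],
         [0.0, 0.5, 0.5],
         [0.0, 0.0, 1.0]]
    print(is_irreducible(M))   # False: state 0 cannot be reached from state 2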
Here is a basic result about Markov Chains, which we will give without proof.