
Lecture Notes on Random Walks

Lecturer: Jon Kleinberg


Scribed by: Kate Jenkins, Russ Woodroofe

1 Introduction to Random Walks


It will be useful to consider random walks on large graphs in order to study actions on other objects. For example:
1) We will model card shuffling as a random walk on the n! permutations of n objects.
2) We will look at a 2-dimensional lattice of particles (which will represent the states of some system).
Two representative questions we might ask are:
1) Is a graph G connected? Of course, we can check this in polynomial time. If G is large, however, we would also like to check this in small space. Deterministically, it is known how to do this in O(log² n) space (or even a little better), although these algorithms are not polynomial in time. We will use random walk techniques to give a probabilistic algorithm which takes O(log n) space and expected polynomial time.
2) (From card shuffling) Given a large set, we often would like to pick an element approximately uniformly at random. How quickly can we do this? The techniques we use for this problem will also be useful for approximately counting the size of the set, if it is not known.

2 Basics
Suppose G is an undirected d-regular graph on n nodes. Then a random walk starts at some node v, chooses a neighbor w of v uniformly at random, moves to w, and repeats. After k steps, we have a probability distribution over which vertex we might be at. This corresponds to a vector $v^{(k)}$ with one coordinate for each node (representing the probability that we are at that node), satisfying $\sum_{i \in V} v_i^{(k)} = 1$.
(More generally, we can consider starting our random walk with a probability distribution $v^{(0)}$. Starting at node i then corresponds to the distribution with $v_i^{(0)} = 1$ and $v_j^{(0)} = 0$ for $j \neq i$.)

Example
Consider K3, the triangle. We start at some vertex v_1 with probability 1. After 1 step, we have equal probability of walking to each of the other vertices, so we are at v_1 with probability 0, and at v_2, v_3 with probability 1/2 each. And so forth. (Figure: the triangle on v_1, v_2, v_3, labelled with the probability of being at each vertex in steps 0 through 3: at v_1 it is 1, 0, 1/2, 1/4, ..., while at each of v_2 and v_3 it is 0, 1/2, 1/4, 3/8, ....)

Recall that the adjacency matrix A of a graph G = (V, E) is an n × n matrix (where n = |V|) with $A_{ij} = 1$ if $(i, j) \in E$, and 0 otherwise. The Laplacian is $L = dI - A$. We define $M = \frac{1}{d}A$ to be the transition matrix of G (so $M_{ij}$ is the probability that we move from i to j in a step starting at i). So in the example above,

$$M = \begin{pmatrix} 0 & 1/2 & 1/2 \\ 1/2 & 0 & 1/2 \\ 1/2 & 1/2 & 0 \end{pmatrix}$$
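The example above can be checked numerically. Here is a minimal numpy sketch (the ordering of the vertices v_1, v_2, v_3 is an arbitrary choice):

```python
import numpy as np

# Adjacency matrix of K3 (the triangle) and its transition matrix M = (1/d)A
A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
d = 2
M = A / d

# Start at v_1 with probability 1 and iterate the distribution: v -> M^T v
v = np.array([1.0, 0.0, 0.0])
history = [v]
for _ in range(3):
    v = M.T @ v          # M is symmetric here, so M.T equals M
    history.append(v)

# Probability of being at v_1 in steps 0 through 3
print([h[0] for h in history])
```

This reproduces the probabilities 1, 0, 1/2, 1/4 at v_1 from the example.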
A stationary distribution of the random walk is a vector (probability distribution) σ which is unchanged by one step: i.e., σ such that Mσ = σ. Equivalently, σ is an eigenvector of M with eigenvalue 1 (and since it is a probability distribution, $\sum_{i \in V} \sigma_i = 1$ and $\sigma_i \geq 0$). Then note the following:

Lemma 2.1 A and M have the same eigenvectors, with the eigenvalues scaled by 1/d.

Proof. We have that λ is an eigenvalue of A iff there is some x with Ax = λx. But this occurs iff $Mx = \frac{1}{d}Ax = \frac{\lambda}{d}x$, i.e., iff λ/d is an eigenvalue of M (with the same eigenvector x).
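Lemma 2.1 is easy to confirm numerically on the triangle (a sketch; any d-regular graph would do):

```python
import numpy as np

# Eigenvalues of A for K3 are 2, -1, -1; Lemma 2.1 says those of M = (1/d)A
# are the same values scaled by 1/d, i.e. 1, -1/2, -1/2.
A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
d = 2
M = A / d

eig_A = np.sort(np.linalg.eigvalsh(A))
eig_M = np.sort(np.linalg.eigvalsh(M))
print(eig_A, eig_M)
```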

3 A Question
Recall from previous lectures that σ = (1/n)𝟙 is a stationary distribution (since the all-ones vector 𝟙 is an eigenvector of A). Also, σ is unique iff G is connected. This lets us ask the following question:
Question: If we start from an arbitrary initial distribution v and iterate the random walk, will we converge to σ? (Assuming G is connected, so σ = (1/n)𝟙 is unique.)
Answer: As stated, no. For example, if G = X ∪ Y is bipartite, then we know whether we are in X or in Y after k steps by the parity of k (so no convergence). (It turns out that this is the only thing that can go wrong, but as many of the graphs we will be interested in turn out to be bipartite, this is a serious drawback!)
The answer is yes, however, if we change our random walk to allow it to "stall" (stay at the same vertex v) at any v with probability 1/2. This new kind of random walk has transition matrix

$$M'_{ij} = \begin{cases} \frac{1}{2d} & \text{if } (i,j) \in E \\ 0 & \text{if } i \neq j,\ (i,j) \notin E \\ \frac{1}{2} & \text{if } i = j \end{cases}$$

Thus $M' = \frac{1}{2}I + \frac{1}{2d}A = \frac{1}{2}I + \frac{1}{2}M$. Furthermore, $Mx = \lambda x$ iff $(\frac{1}{2}I + \frac{1}{2}M)x = (\frac{1}{2} + \frac{\lambda}{2})x$, and thus M and M' have the same eigenvectors, and λ an eigenvalue of M corresponds to $\frac{1}{2} + \frac{1}{2}\lambda$ an eigenvalue of M'.
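Both the bipartite obstruction and the fix can be seen numerically. A sketch on the 4-cycle (a hypothetical choice of bipartite example, with d = 2):

```python
import numpy as np

# C4 is bipartite with sides {0, 2} and {1, 3}. The plain walk started at
# vertex 0 keeps all its mass on one side or the other, depending on the
# parity of k, while the lazy walk M' = (1/2)I + (1/2)M converges to uniform.
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
d = 2
M = A / d
M_lazy = 0.5 * np.eye(4) + 0.5 * M

plain = lazy = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(50):                 # 50 steps: an even number
    plain = M.T @ plain
    lazy = M_lazy.T @ lazy

print(plain)   # mass still confined to the side {0, 2}
print(lazy)    # close to uniform [1/4, 1/4, 1/4, 1/4]
```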

4 Bounding Eigenvalues
Recall our notation from last lecture: we let $\lambda_i = \lambda_i(L)$ be the ith eigenvalue of the Laplacian (arranged in increasing order $0 = \lambda_1 < \lambda_2 \leq \lambda_3 \leq \ldots \leq \lambda_n$). Then $\lambda_i(A) = d - \lambda_i$ (arranged in decreasing order), and hence $\lambda_i(M) = \frac{1}{d}(d - \lambda_i) = 1 - \frac{\lambda_i}{d}$ and $\lambda_i(M') = 1 - \frac{\lambda_i}{2d}$. Denote this last $\lambda'_i = \lambda_i(M')$.
How large can λn be?

Claim 4.1 The Laplacian matrix L of a d-regular graph G has $\lambda_n \leq 2d$ (and $\lambda_n = 2d \iff$ G is bipartite).

Proof.
Let x be the nth eigenvector, with corresponding eigenvalue $\lambda_n$:

$$Lx = \lambda_n x$$

What happens to the ith coordinate when $x \to Lx$?

$$x_i \to \sum_{(i,j) \in E} (x_i - x_j)$$

Suppose $x_i$ has the maximum absolute value. We can assume without loss of generality that $x_i > 0$. Then for all j, $x_i - x_j \leq 2x_i$, so:

$$\lambda_n x_i = (Lx)_i = \sum_{(i,j) \in E} (x_i - x_j) \leq 2dx_i$$

(Note that the only way to have this be an equality is if all coordinates $x_j$ have the same magnitude and edges only connect pairs of nodes with opposite sign. This means we have a bipartite graph.)

Since $\lambda'_i = 1 - \lambda_i/2d$, the claim implies that in M' we have:

$$1 = \lambda'_1 > \lambda'_2 \geq \lambda'_3 \geq \ldots \geq \lambda'_n \geq 0$$

(Note that $\lambda'_n \geq 0$ holds because $\lambda'_i = 1 - \lambda_i/2d$, which holds because we set the probability of "stalling" in the random walk to be 1/2. Had we chosen a smaller probability, $\lambda'_n$ could be negative.)
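Claim 4.1 can be checked on small graphs (a numpy sketch; the 4-cycle and the triangle are hypothetical test cases, both 2-regular):

```python
import numpy as np

# L = dI - A for a d-regular graph; Claim 4.1 says the largest eigenvalue
# of L is at most 2d, with equality exactly when the graph is bipartite.
def laplacian_eigs(A, d):
    L = d * np.eye(len(A)) - A
    return np.sort(np.linalg.eigvalsh(L))

C4 = np.array([[0., 1., 0., 1.],   # 4-cycle: bipartite, d = 2
               [1., 0., 1., 0.],
               [0., 1., 0., 1.],
               [1., 0., 1., 0.]])
K3 = np.array([[0., 1., 1.],       # triangle: not bipartite, d = 2
               [1., 0., 1.],
               [1., 1., 0.]])

print(laplacian_eigs(C4, 2)[-1])   # equals 2d = 4
print(laplacian_eigs(K3, 2)[-1])   # strictly less than 2d
```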

5 Convergence of Random Walks
5.1 Proof of eventual convergence
Now we are going to prove that regardless of the initial probability distribution, a random walk on a graph (with stalling) always converges to the stationary distribution σ. The stationary distribution σ is defined as before to be the eigenvector of M (and M') with eigenvalue 1. (Note that an eigenvector with eigenvalue 1 for M also has eigenvalue 1 for M'.)

Theorem 5.1 A random walk on a d-regular graph G (with self-loops as in M') converges to σ from any initial distribution v.

Proof. The eigenvectors of M' form a basis $\omega_1, \ldots, \omega_n$, so we can write v in terms of this basis: $v = \sum_i \alpha_i \omega_i$. Hence,

$$M'v = \sum_i \alpha_i M' \omega_i = \sum_i \alpha_i \lambda'_i \omega_i$$

Iterating for k steps,

$$v \to (M')^k v = \sum_i \alpha_i (\lambda'_i)^k \omega_i$$

For $i \neq 1$, we have $\lambda'_i < 1$ and hence $(\lambda'_i)^k \to 0$ as $k \to \infty$. These terms converge to 0 exponentially quickly, and the rate is determined by $\lambda'_2$ (the second largest eigenvalue). Since $\lambda'_1 = 1$, we get $\lim_{k\to\infty} (M')^k v = \alpha_1 \omega_1 = \sigma$.

5.2 Rate of convergence


How quickly does v converge to the stationary distribution?
We will use the following fact for $x \in \mathbb{R}^d$:

$$\|x\|_1 \leq \sqrt{d}\,\|x\|_2 \leq \sqrt{d}\,\|x\|_1$$

Let $q = \alpha_1 \omega_1$ (the first term of the above sum). Then,

$$\begin{aligned}
\|q - (M')^k v\|_1 &\leq \sqrt{n}\, \|q - (M')^k v\|_2 \\
&= \sqrt{n}\, \Big\| \sum_{i=2}^n \alpha_i (\lambda'_i)^k \omega_i \Big\|_2 \\
&= \sqrt{n}\, \Big( \sum_{i=2}^n \alpha_i^2 (\lambda'_i)^{2k} \Big)^{1/2} \\
&\leq \sqrt{n}\, \Big( \sum_{i=2}^n \alpha_i^2 (\lambda'_2)^{2k} \Big)^{1/2} \\
&= \sqrt{n}\,(\lambda'_2)^k \Big( \sum_{i=2}^n \alpha_i^2 \Big)^{1/2} \\
&\leq \sqrt{n}\,(\lambda'_2)^k
\end{aligned}$$

The last inequality follows since $\big(\sum_{i=2}^n \alpha_i^2\big)^{1/2} \leq \|v\|_2 \leq \|v\|_1 = 1$.
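The 1-norm bound can be tested numerically. A sketch on the 6-cycle (a hypothetical example with d = 2), starting from a point mass:

```python
import numpy as np

# Lazy walk M' on the 6-cycle; check the bound
# ||q - (M')^k v||_1 <= sqrt(n) * (lambda'_2)^k for k = 1, ..., 39.
n, d = 6, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1.0
M_lazy = 0.5 * np.eye(n) + 0.5 * A / d

lam2 = np.sort(np.linalg.eigvalsh(M_lazy))[-2]   # second largest eigenvalue

q = np.full(n, 1.0 / n)            # stationary distribution
v = np.zeros(n); v[0] = 1.0        # point mass at vertex 0
ok = True
for k in range(1, 40):
    v = M_lazy.T @ v
    ok = ok and np.abs(v - q).sum() <= np.sqrt(n) * lam2 ** k + 1e-12

print(lam2, ok)   # lambda'_2 = 3/4 for the lazy 6-cycle; the bound holds
```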
P

5.3 Maximum Relative Error


The maximum relative error per node is

$$\max_{i \in V} \frac{\big| q_i - [(M')^k v]_i \big|}{q_i}$$

q is the first eigenvector, so $q_i = 1/n$. Hence the maximum relative error is:

$$= n \max_{i \in V} \big| q_i - [(M')^k v]_i \big| = n\, \|q - (M')^k v\|_\infty \leq n\, \|q - (M')^k v\|_1 \leq n\sqrt{n}\,(\lambda'_2)^k = n^{1.5}\Big(1 - \frac{\lambda_2}{2d}\Big)^k$$

How many steps k must we take to be within maximum relative error δ?
We need $n^{1.5}(1 - \frac{\lambda_2}{2d})^k \leq \delta$, so take

$$k = \frac{2d}{\lambda_2}\Big( (3/2)\ln n + \ln(1/\delta) \Big) \leq \frac{3d}{\lambda_2}\Big( \ln n + \ln(1/\delta) \Big)$$

To bound k in terms of the graph expansion α, we use the fact that $\lambda_2 \geq \frac{\alpha^2}{2d}$ (which was proved at great expense in a previous lecture). Thus we can take

$$k = \frac{3d}{\alpha^2/2d}\Big( \ln n + \ln(1/\delta) \Big) = \frac{6d^2}{\alpha^2}\Big( \ln n + \ln(1/\delta) \Big)$$
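Plugging hypothetical numbers into these formulas: for the 6-cycle, d = 2 and $\lambda_2 = 1$, so for maximum relative error δ = 0.01 we can take

```python
import numpy as np

# k = (2d / lambda_2) * ((3/2) ln n + ln(1/delta)), rounded up; then check
# that the error bound n^{1.5} (1 - lambda_2 / 2d)^k is at most delta.
n, d, lam2, delta = 6, 2, 1.0, 0.01
k = int(np.ceil((2 * d / lam2) * (1.5 * np.log(n) + np.log(1 / delta))))
err_bound = n ** 1.5 * (1 - lam2 / (2 * d)) ** k
print(k, err_bound)
```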

6 Connection to Markov Chains
We now present some of the more general phrasing and results used by mathematicians when talking about random walks.
A Markov Chain C has a state set V (assume |V| = n, finite) and a transition matrix M, where $M_{ij}$ is the probability of going from state i to state j.
Let $G_C$ be the "state graph" of C. The nodes of $G_C$ are V, and $G_C$ contains a directed edge (i, j) if $M_{ij} > 0$. (Note that the matrix form of iterating C is $v^T \to v^T M$.)
We present two definitions.
We present two definitions.

Definition 6.1 Markov Chain C is "irreducible" if the state graph $G_C$ is strongly connected.

Definition 6.2 Markov Chain C is "aperiodic" if the gcd of all cycle lengths in $G_C$ is 1.

Here is a basic result about Markov Chains, which we will give without proof.

Theorem 6.3 Let C be a finite, irreducible, aperiodic Markov Chain. Then:

(1) There exists a unique stationary distribution σ (i.e. $\sigma^T M = \sigma^T$).
(2) $\sigma_i > 0$ for all $i \in V$.
(3) The expected time to return to state i, starting from state i, is $1/\sigma_i$.
(4) If N(i, t) is the number of visits to i in the first t steps, then $N(i,t)/t \to \sigma_i$ almost surely.
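Properties (1) and (2) are easy to see numerically for a small chain. A sketch with a hypothetical 3-state transition matrix (irreducible, and aperiodic since every state has a self-loop):

```python
import numpy as np

# A 3-state Markov Chain; each row of M sums to 1.
M = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.3, 0.3, 0.4]])

# The stationary distribution is the left eigenvector of M with eigenvalue 1,
# i.e. an eigenvector of M^T; normalize it to sum to 1.
w, V = np.linalg.eig(M.T)
sigma = np.real(V[:, np.argmax(np.real(w))])
sigma = sigma / sigma.sum()

print(sigma)        # unique, with all entries positive (properties 1 and 2)
print(1 / sigma)    # expected return times 1/sigma_i (property 3)
```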
