Limits of Graph Sequences
Contents
1 Warm-up: convergence in distribution
1.1 Starting case: empirical distribution on $k$ categories
1.2 Convergence of distributions on $\mathbb{R}^1$
1.3 Alternative view of convergence in distribution
1.4 Lessons
Figure 1: $S_2$ and $S_3$ are the simplices corresponding to the range of possible probability distributions on 2 and 3 categories, respectively. Each point in a simplex is a probability distribution, since each point is a vector with nonnegative entries summing to 1.
Figure 2: The empirical cdf (orange) is a step function which converges to the
true cdf (blue) as the sample size increases.
So, at any finite $n$, $p_n(x)$ takes jumps of size $1/n$ at (at most) $n$ distinct points. See Figure 2 for an example.
Theorem 1.1 (Glivenko–Cantelli). If $X_i \overset{\text{iid}}{\sim} \rho$, where $\rho$ is the true cdf, then
$$\max_x \, |p_n(x) - \rho(x)| \xrightarrow{\text{a.s.}} 0.$$
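As a sanity check, here is a minimal numerical sketch of the theorem (assuming NumPy and SciPy are available; the true distribution $N(0,1)$ and the sample sizes are illustrative choices):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

for n in [100, 1000, 10000]:
    xs = np.sort(rng.standard_normal(n))    # n iid samples from N(0, 1)
    # Empirical cdf evaluated just after each jump point: p_n(x_(i)) = i/n.
    ecdf = np.arange(1, n + 1) / n
    rho = norm.cdf(xs)                       # true cdf at the jump points
    # The sup of |p_n - rho| is attained at a jump point; check both the
    # top (i/n) and the bottom ((i-1)/n) of each jump of size 1/n.
    dev = max(np.max(np.abs(ecdf - rho)), np.max(np.abs(ecdf - 1 / n - rho)))
    print(f"n = {n:6d}   max |p_n - rho| = {dev:.4f}")
```

The printed deviation shrinks as $n$ grows, consistent with the almost-sure convergence above.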
1.3 Alternative view of convergence in distribution
An alternative way to formalize convergence of distributions is to test them against integrals: we say a sequence of distributions $\mu_n$ converges to a distribution $\mu$ when
$$\int_{\mathbb{R}} f(x)\,\mu_n(dx) \;\longrightarrow\; \int_{\mathbb{R}} f(x)\,\mu(dx)$$
for all bounded and continuous $f$ (such $f$ are known as "test functions"). If this convergence holds, then we say that $\mu_n \xrightarrow{d} \mu$, i.e. that $\mu_n$ converges in distribution to $\mu$.
How do we apply this notion to the convergence of empirical distributions? In particular, the empirical distribution is a jump function, meaning it does not have a pdf in the standard sense. However, we can treat the empirical pdf as a mixture of Dirac delta functions, as follows. Put $\mu_n$ to be the empirical distribution from $n$ samples $x_1, \dots, x_n$, and let $\mu$ denote the distribution that all data were drawn from. Define the pdf of $\mu_n$ to be
$$\frac{1}{n} \sum_{i=1}^{n} \delta(x - x_i),$$
where the Dirac delta $\delta$ satisfies
$$\int_{\mathbb{R}} f(x)\,\delta(x)\,dx = f(0)$$
for any $f : \mathbb{R} \to \mathbb{R}$ continuous at $0$.
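Since $\delta$ is not a function in the ordinary sense, one hedged way to check this identity numerically is to stand in a narrow normalized Gaussian for $\delta$ (the bandwidth `eps` and the test function below are arbitrary illustrative choices):

```python
import numpy as np

def delta_approx(x, eps=1e-3):
    # Narrow normalized Gaussian; tends to the Dirac delta as eps -> 0.
    return np.exp(-x**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

dx = 1e-5
x = np.arange(-1, 1, dx)
f = np.cos                                        # any f continuous at 0
integral = np.sum(f(x) * delta_approx(x)) * dx    # Riemann sum for the integral
print(integral, "vs f(0) =", f(0))                # both approximately 1.0
```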
Now, putting $m_n$ to be this mixture of Dirac delta functions, we get
$$\begin{aligned}
\int_{\mathbb{R}} f(x)\, m_n(x)\, dx &= \frac{1}{n} \sum_{i=1}^{n} \int_{\mathbb{R}} f(x)\, \delta(x - x_i)\, dx && (1) \\
&= \frac{1}{n} \sum_{i=1}^{n} \int_{\mathbb{R}} f(y + x_i)\, \delta(y)\, dy && (2) \\
&= \frac{1}{n} \sum_{i=1}^{n} f(x_i). && (3)
\end{aligned}$$
So for $\mu_n \xrightarrow{d} \mu$, we need
$$\frac{1}{n} \sum_{i=1}^{n} \delta(x - x_i) \xrightarrow{d} \mu,$$
which we now know means that
$$\frac{1}{n} \sum_{i=1}^{n} f(x_i) \xrightarrow{n \to \infty} \int_{\mathbb{R}} f(x)\, \mu(dx)$$
for all bounded and continuous test functions $f$. Since
$$\frac{1}{n} \sum_{i=1}^{n} f(x_i) \in \mathbb{R} \quad \text{and} \quad \int_{\mathbb{R}} f(x)\, \mu(dx) \in \mathbb{R},$$
this reduces convergence in distribution to the ordinary convergence of sequences of real numbers, and for each fixed $f$ that convergence holds almost surely by the law of large numbers.
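To see (3) in action, here is a quick Monte Carlo sketch (assuming NumPy and SciPy; taking $\mu = N(0,1)$ and a particular bounded continuous test function purely for illustration):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

rng = np.random.default_rng(1)
f = lambda x: np.tanh(x) ** 2    # bounded, continuous test function

# Right-hand side: the integral of f against mu = N(0, 1).
target, _ = quad(lambda x: f(x) * norm.pdf(x), -np.inf, np.inf)

for n in [100, 10_000, 1_000_000]:
    x = rng.standard_normal(n)   # n iid samples from mu
    print(f"n = {n:8d}   (1/n) sum f(x_i) = {f(x).mean():.4f}   target = {target:.4f}")
```

The sample averages settle onto the target integral as $n$ grows, exactly as the displayed convergence predicts.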
1.4 Lessons
1. Observed data sets get represented as objects with lots of discreteness.
2. They tend towards continuous limit objects.
Figure 3: Graph isomorphism is determined by the graphs’ structure, not by
the label associated with each node.
Figure 4: Take three arbitrary nodes in g: do they have the same structure as
f ? Here, we identify two successful matchings between a triplet in g and f .
Figure 5: The subgraph induced by the three green nodes in $g$ is isomorphic to $f$. The same applies to the three pink nodes, and to several other sets of node triplets in $g$. The total number of such triplets that are isomorphic to $f$ is denoted by $\mathrm{Iso}(f, g)$.
Figure 6: There exists some $f' \supseteq f$ such that $f' \simeq G[k]$.
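In symbols, the event depicted in Figure 6 decomposes the injective homomorphism density over the possible supergraphs $f'$ of $f$ on the same node set. The original display is not preserved in these notes, so the indexing below is a reconstruction following the standard treatment of homomorphism densities:
$$t_{\mathrm{injective}}(f, g) \;=\; \sum_{\substack{f' \supseteq f \\ V(f') = V(f)}} t_{\mathrm{iso}}(f', g).$$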
Since this is a linear system of equations, we can invert it to express $t_{\mathrm{iso}}(f, g)$ as a linear combination of the $t_{\mathrm{injective}}(f, g)$. This implies that if we have all of the $t_{\mathrm{injective}}(f, g)$, we can always calculate all of the $t_{\mathrm{iso}}(f, g)$, and vice versa. Therefore, the isomorphism densities converge if and only if the injective homomorphism densities converge.
2.3 Homomorphisms
We previously went from a strong notion of graph matching, graph isomorphism,
to a weaker notion, injective homomorphism. We showed that convergence of
Figure 7: In a non-injective mapping, two different nodes in the domain graph
f can be mapped to the same node in the image graph g.
one implies the other, so that we could get away with working with the simpler
notion of injective homomorphism.
Naturally, the next step is to define an even weaker notion of graph matching, which will also be easier to calculate. To that end, we remove the restriction of injectivity (i.e. the requirement that distinct nodes of $f$ map to distinct nodes of $g$) to obtain the graph homomorphism.
Definition 7. A homomorphism from $f$ to $g$ is a mapping $\varphi : V(f) \to V(g)$ such that if $(i, j) \in E(f)$, then $(\varphi(i), \varphi(j)) \in E(g)$. Figure 7 shows an example where $f$ is injectively mapped to $g$, and another example where $f$ is non-injectively mapped to $g$.
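A brute-force sketch of Definition 7 (graphs are represented as node lists and edge sets over integer labels; the function name and representation are illustrative choices, not notation from the notes):

```python
from itertools import product

def count_homomorphisms(f_nodes, f_edges, g_nodes, g_edges):
    """Count maps phi: V(f) -> V(g) with (phi(i), phi(j)) in E(g) for every (i, j) in E(f)."""
    # Treat g as undirected by symmetrizing its edge set.
    g_adj = set(g_edges) | {(j, i) for (i, j) in g_edges}
    count = 0
    # Enumerate all maps V(f) -> V(g), repeats allowed (no injectivity).
    for phi in product(g_nodes, repeat=len(f_nodes)):
        assignment = dict(zip(f_nodes, phi))
        if all((assignment[i], assignment[j]) in g_adj for (i, j) in f_edges):
            count += 1
    return count

# Example: f = a single edge (K2), g = a triangle (K3); prints 6.
print(count_homomorphisms([0, 1], [(0, 1)], [0, 1, 2], [(0, 1), (1, 2), (0, 2)]))
```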
The notion of graph homomorphism, in contrast to the injective homomor-
phism, is akin to sampling nodes of g with replacement, in the sense that we are
allowing φ to assign the same node in g to more than one node from f .
Just as we defined $G[k]$ as the subgraph induced by picking $k$ nodes without replacement, we define $G'[k]$ to be the subgraph induced by picking $k$ (not necessarily distinct) nodes of $g$, i.e. with replacement. This leads to a new notion of motif density:
$$t_{\mathrm{hom}}(f, g) = \frac{\mathrm{Hom}(f, g)}{n^k} = \mathbb{P}(f \subseteq G'[k]).$$
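For instance, with $f = K_2$ (a single edge, so $k = 2$) and $g = K_3$ (a triangle, so $n = 3$), every ordered pair of distinct nodes of $g$ spans an edge, so $\mathrm{Hom}(f, g) = 3 \cdot 2 = 6$ and $t_{\mathrm{hom}}(f, g) = 6/3^2 = 2/3$; the missing $1/3$ comes from the three maps that collapse both endpoints of $f$ onto a single node of $g$.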