
Yuval Wigderson The Entropy Method for Sidorenko’s Conjecture October 19, 2018

This talk is based on Balázs Szegedy’s paper [1] “An information theoretic approach to Sidorenko’s con-
jecture”; however, much of the presentation is in a different language (he does everything in terms of the
Kullback–Leibler divergence, a quantity that is for certain purposes more amenable than entropy, though
in my opinion it is harder to understand intuitively). Another good resource for this topic is Gowers’s
blog post [2] “Entropy and Sidorenko’s conjecture—after Szegedy”, which is presented in the language of
entropy, though it only seeks to prove very special cases of Szegedy’s results.

1 Basics of information theory


Suppose X is a random variable on a finite space 𝒳, with probability density p(x) = Pr(X = x) for x ∈ 𝒳.
Definition 1. The (Shannon) entropy of X is defined by

H(X) = Σ_{x∈𝒳} p(x) log(1/p(x)) = E[log(1/p(X))]

with the convention that 0 log 0 = 0.
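As a quick sanity check of this definition (not from Szegedy’s paper; the function name and the choice of base-2 logarithms are my own), here is a minimal Python sketch:

import math

def entropy(p):
    # Shannon entropy of a finite distribution given as {outcome: probability};
    # the convention 0 * log(0) = 0 is handled by skipping zero-probability outcomes.
    return sum(px * math.log2(1.0 / px) for px in p.values() if px > 0)

print(entropy({'H': 0.5, 'T': 0.5}))   # 1.0 = log2(2): the uniform distribution on 2 outcomes
print(entropy({'H': 0.9, 'T': 0.1}))   # ~0.469: a biased coin conveys less information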


The Shannon entropy is supposed to measure the amount of information conveyed by X, as measured
by the number of bits it should take to store the outcome of X. By Jensen’s inequality, we have that

H(X) = E[log(1/p(X))] ≤ log E[1/p(X)] = log |𝒳|

Moreover, equality is achieved by the uniform distribution on 𝒳.


For two (possibly dependent) random variables X, Y on spaces 𝒳, 𝒴, we can define their joint entropy
by the formula

H(X, Y) = Σ_{x∈𝒳, y∈𝒴} p(x, y) log(1/p(x, y))

where p(x, y) = Pr(X = x, Y = y). Equivalently, if we think of the pair (X, Y) as a random variable on the
space 𝒳 × 𝒴, then H(X, Y) is just the entropy of this single variable. We can similarly define the conditional
entropy H(Y | X) to be the entropy of the random variable (Y | X) on the space 𝒳 × 𝒴, defined by

Pr((Y | X) = (x, y)) = Pr(Y = y | X = x)

Equivalently, we can define the conditional entropy by the formula


H(Y | X) = Σ_{x∈𝒳, y∈𝒴} p(x, y) log(p(x)/p(x, y))

Regardless of definition, it measures the amount of information gained by learning Y , given that we already
know X. From the definitions and some simple computations, it follows that

H(X, Y ) = H(X) + H(Y | X)

which makes intuitive sense: the amount of information gained by learning X and Y is the same as the
amount of information gained by first learning X and then learning Y , already knowing X. Inductively
applying this fact, one can derive the chain rule:

H(X1 , . . . , Xn ) = H(X1 ) + H(X2 | X1 ) + · · · + H(Xn | X1 , . . . , Xn−1 )
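A minimal numerical check of these identities (the joint distribution below is an arbitrary example of mine, not anything from the talk):

import math
from collections import defaultdict

def H(p):
    return sum(q * math.log2(1.0 / q) for q in p.values() if q > 0)

# an arbitrary joint distribution p(x, y) on {0,1} x {0,1}
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

pX = defaultdict(float)
for (x, y), q in joint.items():
    pX[x] += q

# H(Y | X) = sum_x p(x) * H(Y | X = x), equivalent to the formula given above
H_Y_given_X = sum(pX[x] * H({y: joint[(x, y)] / pX[x] for y in (0, 1)}) for x in pX)

print(H(joint))             # H(X, Y)
print(H(pX) + H_Y_given_X)  # H(X) + H(Y | X): the two agree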


[1] https://arxiv.org/pdf/1406.6738.pdf
[2] https://gowers.wordpress.com/2015/11/18/entropy-and-sidorenkos-conjecture-after-szegedy/


If X and Y are independent, then

H(X, Y) = H(X) + H(Y) and H(Y | X) = H(Y)

At the other extreme, if the value of X determines the value of Y (i.e. if Y = f (X) for some deterministic
function f ), then
H(X, Y) = H(X) and H(Y | X) = 0
The important consequence of these results, for our purposes, is that we can always add “dummy” variables
to our entropy: if X determines Y , then we can always add Y to any entropy involving X, and we can also
add Y to the conditioning in any conditional entropy conditioning on X.
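As a tiny check of this (my own toy example, not from the notes): if Y = f(X) is determined by X, then adjoining Y changes nothing.

import math

def H(p):
    return sum(q * math.log2(1.0 / q) for q in p.values() if q > 0)

pX = {1: 0.2, 2: 0.3, 3: 0.5}

def f(x):
    return x % 2                                 # Y = f(X), a deterministic function of X

joint = {(x, f(x)): q for x, q in pX.items()}    # the joint law of (X, Y)
print(H(pX), H(joint))                           # equal, so H(X, Y) = H(X) and H(Y | X) = 0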
The final notion we will need is really one in probability theory. Suppose that 𝒳1, 𝒳2, 𝒳3 are three finite
probability spaces, coming with random variables X1, X2, X3 on them; suppose too that every point in 𝒳i
has positive probability under Xi, to avoid degeneracy. Suppose we have maps ψi : 𝒳i → 𝒳3 for i = 1, 2 so
that ψi(Xi) = X3 in distribution for i = 1, 2. Let 𝒳4 be the fiber product of this configuration, i.e.

𝒳4 = {(x1, x2) ∈ 𝒳1 × 𝒳2 : ψ1(x1) = ψ2(x2)}

Then a random variable X on 𝒳4 is called a coupling of X1 and X2 over X3 if its marginal distributions
on 𝒳1 and 𝒳2 equal X1, X2, respectively. Moreover, the most natural and most important coupling for our
purposes will be the conditionally independent coupling X4, defined by

Pr(X4 = (x1, x2)) = Pr(X1 = x1) Pr(X2 = x2) / Pr(X3 = ψ1(x1))

Intuitively, 𝒳4 consists of all possible outcomes from 𝒳1 × 𝒳2 that agree on their induced outcome in 𝒳3,
and a coupling is just a probability distribution on such outcomes. The conditionally independent coupling
is the coupling that is “as independent as possible”: if we observe a sample from X3, then X4 will be the
independent distribution on all outcomes from 𝒳1, 𝒳2 that yield the outcome we observed in 𝒳3. The most
important property that we will need of couplings is that the conditionally independent coupling maximizes
entropy: i.e. for every coupling X, we have that

H(X) ≤ H(X4 )

This follows from the so-called submodularity of entropy, and is a relative version of the fact above, that
the uniform distribution maximizes entropy. Heuristically, it just says that the conditionally independent
coupling is the one that is most random: given the outcome of X3 , we impose no further dependencies on
the outcome from X1 , X2 .
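Here is a small sketch of this construction (the spaces, maps, and numbers are invented for illustration): it builds the conditionally independent coupling from the formula above, checks that its entropy equals H(X1) + H(X2) − H(X3) (the inclusion-exclusion identity discussed later in these notes), and compares it against one other coupling with the same marginals.

import math

def H(p):
    return sum(q * math.log2(1.0 / q) for q in p.values() if q > 0)

def psi(x):            # the map to the common space: here, the first character
    return x[0]

p1 = {'a1': 0.3, 'a2': 0.3, 'b1': 0.4}   # law of X1
p2 = {'aX': 0.4, 'aY': 0.2, 'bX': 0.4}   # law of X2; same pushforward as X1
p3 = {'a': 0.6, 'b': 0.4}                # law of X3, the common pushforward

# the conditionally independent coupling on the fiber product
cond_indep = {(x1, x2): p1[x1] * p2[x2] / p3[psi(x1)]
              for x1 in p1 for x2 in p2 if psi(x1) == psi(x2)}

# another coupling with the same marginals (checked by hand)
other = {('a1', 'aX'): 0.3, ('a2', 'aX'): 0.1, ('a2', 'aY'): 0.2, ('b1', 'bX'): 0.4}

print(H(cond_indep))             # ~2.12
print(H(p1) + H(p2) - H(p3))     # the same value: inclusion-exclusion for this coupling
print(H(other))                  # ~1.85: a different coupling has strictly smaller entropy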

2 Sidorenko’s Conjecture
Recall that we denote by v(G), e(G) the number of vertices and edges, respectively, of a graph G. For two
graphs H, G, we define
t(H, G) = |hom(H, G)| / v(G)^{v(H)}
This is the fraction of maps V (H) → V (G) that map edges to edges, or equivalently the probability that a
random map V (H) → V (G) will map edges to edges. Sidorenko’s conjecture says that for all bipartite H
and all G,
t(H, G) ≥ t(K2, G)^{e(H)}
If G is a random graph, then t(H, G) = t(K2, G)^{e(H)} + o(1) with high probability, and thus, if Sidorenko’s
conjecture is true, it is asymptotically tight.
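A brute-force check of the conjecture on a small instance (the graphs below are arbitrary choices of mine, and the computation is exponential in v(H), so it only works for tiny graphs):

from itertools import product

def t(H_edges, v_H, G_adj):
    # homomorphism density t(H, G): fraction of maps V(H) -> V(G) sending edges to edges
    n = len(G_adj)
    homs = sum(1 for phi in product(range(n), repeat=v_H)
               if all(G_adj[phi[u]][phi[v]] for (u, v) in H_edges))
    return homs / n ** v_H

# G: a 5-cycle with one chord, as an adjacency matrix
G = [[0, 1, 0, 0, 1],
     [1, 0, 1, 0, 0],
     [0, 1, 0, 1, 1],
     [0, 0, 1, 0, 1],
     [1, 0, 1, 1, 0]]

# H: the 4-cycle on vertices 0..3
C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]

lhs = t(C4, 4, G)
rhs = t([(0, 1)], 2, G) ** len(C4)     # t(K2, G)^{e(H)}
print(lhs, rhs, lhs >= rhs)            # Sidorenko's inequality holds for this pair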


Szegedy’s “information theoretic” approach to Sidorenko’s conjecture is based on the idea that if X is a
random variable supported on hom(H, G), then

log |hom(H, G)| ≥ H(X)

and therefore
log t(H, G) ≥ H(X) − v(H) log v(G)
Similarly,

log t(K2, G)^{e(H)} = e(H)(log(2e(G)) − 2 log v(G))

Thus, to prove Sidorenko’s conjecture, it suffices to find a random variable X supported on hom(H, G)
so that

H(X) ≥ e(H)(log(2e(G)) − 2 log v(G)) + v(H) log v(G)
     = e(H) log(2e(G)) + (v(H) − 2e(H)) log v(G)

To write this in a slightly nicer form, let V be a uniformly random vertex of G, and E a uniformly random
oriented edge (i.e. we care about the order of the endpoints of E). Equivalently, V is uniform on hom(K1 , G)
and E is uniform on hom(K2 , G). Then we have that

H(V) = log v(G) and H(E) = log(2e(G))

Thus, to prove Sidorenko’s conjecture for H, it suffices to find a random variable X on hom(H, G) such that

H(X) ≥ e(H)H(E) + (v(H) − 2e(H))H(V )
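As a sanity check of this reformulation (again with toy graphs of my own choosing): the uniform distribution on hom(H, G) has entropy log |hom(H, G)|, the largest any such X can have, so comparing it against the right-hand side is exactly checking Sidorenko’s inequality for this particular pair (H, G). The witness variables introduced below are useful because their entropy can be bounded without counting homomorphisms.

import math
from itertools import product

# G: a path on 4 vertices, as an adjacency matrix
G = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
n = len(G)
H_V = math.log(n)                                  # H(V) = log v(G)
H_E = math.log(sum(map(sum, G)))                   # H(E) = log(2 e(G))

# H: a path on 3 vertices, so v(H) = 3 and e(H) = 2
H_edges, v_H = [(0, 1), (1, 2)], 3
homs = sum(1 for phi in product(range(n), repeat=v_H)
           if all(G[phi[u]][phi[v]] for (u, v) in H_edges))

target = len(H_edges) * H_E + (v_H - 2 * len(H_edges)) * H_V
print(math.log(homs), target, math.log(homs) >= target)   # log 10 vs log 9 here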

We will actually also require the stronger (but very natural) condition that the marginal distribution on any
edge of H is just the distribution of E. More formally, we define the following.
Definition 2. A witness variable on a (bipartite) graph H is an infinite family of random variables X(G),
one for each graph G, with the following properties:

1. X(G) is a random variable on the space hom(H, G)


2. For every edge uv ∈ E(H), the marginal distribution of X(G) on uv (formally, the induced distribution
coming from the projection hom(H, G) → hom(uv, G)) is uniform
3. H(X(G)) ≥ e(H)H(E(G)) + (v(H) − 2e(H))H(V(G))

Generally, we’ll suppress the G, and just talk about the variable X rather than X(G). However, note that the
definition of a witness variable requires that a variable on hom(H, G) be defined for every G. In particular,
hom(H, G) must be non-empty for every G with at least one edge (if G has no edges, Sidorenko’s conjecture
is trivial for it), and taking G = K2 shows that H must be bipartite.
The argument presented above proves the following result:
Theorem 3. If H has a witness variable, then H satisfies Sidorenko’s conjecture.
Thus, the task at hand is to find witness variables on graphs. To build witness variables on new graphs,
we will use inductive procedures, where we use various building operations to create new graphs from old
graphs, and similarly combine the witness variables to get a witness variable on the new graph. To start, we
need one graph with a witness variable.
Proposition 4. K2 has a witness variable, namely E, the uniform distribution on hom(K2 , G).


Proof. E is certainly supported on hom(K2 , G) and has the correct marginal, by definition. So all we need
to check is that for any graph G,

H(E) ≥ e(K2 )H(E) + (v(K2 ) − 2e(K2 ))H(V )

But this certainly holds, since v(K2 ) = 2e(K2 ), so the second term disappears, and we are just left with the
equality H(E) = H(E).
Our main technique for building up new graphs from old graphs will be gluing.
Definition 5. Given two graphs H1, H2, vertex subsets S1 ⊆ V(H1), S2 ⊆ V(H2), and a bijection f : S1 ↔
S2, we define the glued graph H = H1 ∪f H2 in the natural way. We let

V (H) = (V (H1 ) t V (H2 ))/(s ∼ f (s) : s ∈ S1 )

and let E(H) be the image of E(H1 ) t E(H2 ) under the gluing projection, except that we delete all parallel
edges we get. We will denote by S the image of S1 (and S2 ) in H.
Lemma 6. Suppose H1, H2 have witness variables XH1, XH2, and suppose that S1, S2 are independent sets in
H1, H2, respectively. Suppose too that there is a bijection f : S1 ↔ S2 under which the marginal distributions
XH1|S1 and XH2|S2 are identical (i.e. f induces a measure-preserving map between them). Let H = H1 ∪f H2,
and let X be the conditionally independent coupling of XH1, XH2. Then X is a witness variable for H.
Proof. First, we need to check that X is indeed supported on hom(H, G). This follows from the fact that
a homomorphism H → G is the same as two homomorphisms H1 → G, H2 → G that agree on S (i.e. that
are identified under f ), and thus hom(H, G) is indeed the fiber product of hom(H1 , G), hom(H2 , G) over
hom(S, G).
Next, we will check the marginals. Since S1 , S2 are independent sets, any edge in H must be contained
in either H1 or H2 (and not both); without loss of generality, it’s in H1 . But since X is a conditionally
independent coupling, the distribution of this edge does not depend on the mapping of H2 \ S, and thus the
marginal of X on this edge is the same as the corresponding marginal of XH1, which we assumed was E.
Finally, we need to check the entropy inequality. Let XS be the marginal of X on S, i.e. the induced random
variable on hom(S, G). Since X determines and is determined by the images of all the vertices, and since
V(H1) ∪ V(H2) covers V(H), we have that H(X) = H(XH1, XH2, XS). We now apply the chain rule:

H(X) = H(XH1 , XH2 , XS ) = H(XS ) + H(XH1 | XS ) + H(XH2 | XS , XH1 )

Now, recall that conditional on XS , XH2 is independent of XH1 . So we may remove XH1 from the final
conditioning, and get
H(X) = H(XS ) + H(XH1 | XS ) + H(XH2 | XS )
On the other hand, we may also write

H(XH1 ) = H(XH1 , XS ) = H(XS ) + H(XH1 | XS )

and similarly for XH2 . Plugging this in, we get that

H(X) = H(XH1 ) + H(XH2 ) − H(XS )

It is worth noting that one of the most important properties of entropy is its submodularity, which means
that we can always get an inequality of this form. However, in the conditionally independent setting, we
actually get equality (which is good, because the inequality goes the wrong way for our purposes); this
can be thought of as a form of inclusion-exclusion for conditionally independent random variables. By our
assumption that XH1 , XH2 are witness variables, we get that

H(X) ≥ [e(H1 )H(E) + (v(H1 ) − 2e(H1 ))H(V )] + [e(H2 )H(E) + (v(H2 ) − 2e(H2 ))H(V )] − H(XS )
= [e(H1 ) + e(H2 )]H(E) + [v(H1 ) + v(H2 ) − 2e(H1 ) − 2e(H2 )]H(V ) − H(XS )


Observe that since S is an independent set, e(H1) + e(H2) = e(H) and v(H1) + v(H2) = v(H) + |S|. Finally,
observe that XS is some (potentially very complicated) random variable on V(G)^S. Therefore, we have that

H(XS) ≤ log |V(G)^S| = |S| log v(G) = |S| H(V)

Putting this all together, we get that

H(X) ≥ e(H)H(E) + (v(H) − 2e(H))H(V )

Therefore, if we can build up a graph H by starting with a single edge and always gluing along identically-
distributed independent sets, we get a witness variable for H, and thus show that H satisfies Sidorenko’s
conjecture. Of course, checking that two independent sets in two different graphs have the same distribution
might be hard; one convenient tool to do this in simple cases is the following lemma:
Lemma 7. Suppose H has no isolated vertices. Let D be a random variable on V (G) defined by

Pr(D = v) = deg(v) / 2e(G)

If X is a witness variable for H, then the marginal of X on any vertex of H is precisely D.


Proof. Observe that D is the marginal of E on either of its endpoints. Since E is the marginal of X on every
edge of H, and every vertex of H lies on some edge (H has no isolated vertices), D must be the marginal of
X on any vertex of H.
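A quick check of the first observation in this proof, with a small graph of my own choosing (exact arithmetic via Fractions):

from fractions import Fraction

G = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}      # adjacency list of a small graph (assumed)
two_e = sum(len(nbrs) for nbrs in G.values())          # 2 e(G)

D = {v: Fraction(len(nbrs), two_e) for v, nbrs in G.items()}   # Pr(D = v) = deg(v) / 2e(G)

# marginal of the uniform oriented edge E on its first endpoint
marginal = {v: Fraction(0) for v in G}
for u, nbrs in G.items():
    for v in nbrs:                                     # each oriented edge (u, v) has mass 1 / 2e(G)
        marginal[u] += Fraction(1, two_e)

print(D == marginal)    # True: D is the marginal of E on either endpoint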
Corollary 8. If H1 , H2 have witness variables and H is formed by gluing them along a single non-isolated
vertex (i.e. |S| = 1), then H has a witness variable.
Proof. This follows immediately from Lemma 6. Indeed, XH1|S and XH2|S must both be distributed as D by
Lemma 7, since S consists of a single non-isolated vertex, and in particular they are identically distributed.
Corollary 9. Every tree has a witness variable, and thus satisfies Sidorenko’s conjecture.
Proof. We prove this by induction on the number of vertices; the base case v(H) = 2 is the K2 case proved
above. Inductively, we may pick a leaf of H and write H as a gluing of a smaller tree H1 and a single edge,
glued along a single vertex. Then by the induction hypothesis and by Corollary 8, we are done.
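Unwinding this induction gives a concrete (and, I believe, standard) description of the resulting witness variable on a tree: map the root to a sample of D, then map each child to a uniform neighbour of its parent’s image, independently. A sampling sketch, where the graph G, the tree, and all names are my own illustrative choices:

import random

G = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}     # a small graph (assumed)
parent = {1: 0, 2: 1, 3: 1}                          # a tree on {0,1,2,3}; vertex 0 is the root

def sample_tree_witness():
    # root goes to D (probability proportional to degree), each child to a
    # uniform neighbour of its parent's image, independently of everything else
    image = {0: random.choices(list(G), weights=[len(G[v]) for v in G])[0]}
    for child in sorted(parent):                     # parents are listed before children here
        image[child] = random.choice(G[image[parent[child]]])
    return image

print(sample_tree_witness())      # a random homomorphism of the tree into G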

Corollary 10. Let T be a tree, let S ⊆ V (T ) be an independent set, and let f : S → S be the identity map.
Then H = T ∪f T has a witness variable, and thus satisfies Sidorenko’s conjecture.
In particular, by taking T to be a path, all even cycles satisfy Sidorenko’s conjecture.
Proof. This again follows immediately from Lemma 6. We proved above that T has a witness variable XT ,
and since we are gluing along the identity map, we automatically have that XT induces the same distribution
on S1 = S2 = S. Thus, we can take the conditionally independent coupling and get the desired witness
variable on H.
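To make the even-cycle case concrete, here is a sketch (with an arbitrary small G of my own) that builds the witness variable for C4 exactly along these lines: a path a–m–b is two edges glued at m, and C4 is two such paths glued along the independent set {a, b} via the conditionally independent coupling. The final line checks the entropy inequality from Definition 2.

import math
from collections import defaultdict

def H(p):
    return sum(q * math.log(1.0 / q) for q in p.values() if q > 0)

G = {0: [1, 4], 1: [0, 2], 2: [1, 3, 4], 3: [2, 4], 4: [0, 2, 3]}   # a small graph (assumed)
two_e = sum(len(nbrs) for nbrs in G.values())
H_E, H_V = math.log(two_e), math.log(len(G))

# Witness variable for the path a - m - b (two edges glued at m):
# m ~ D, then a and b are conditionally independent uniform neighbours of m.
X_path = {(a, m, b): 1.0 / (two_e * len(G[m]))
          for m in G for a in G[m] for b in G[m]}

# Marginal of X_path on the glued set S = {a, b}.
X_S = defaultdict(float)
for (a, m, b), q in X_path.items():
    X_S[(a, b)] += q

# Witness variable for C4: conditionally independent coupling of two paths over S.
X_C4 = {(a, m1, b, m2): q1 * q2 / X_S[(a, b)]
        for (a, m1, b), q1 in X_path.items()
        for (a2, m2, b2), q2 in X_path.items()
        if (a, b) == (a2, b2)}

lhs, rhs = H(X_C4), 4 * H_E + (4 - 2 * 4) * H_V   # e(C4) = v(C4) = 4
print(lhs, rhs, lhs >= rhs)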
In his original paper, Sidorenko proved using Cauchy–Schwarz and Hölder that trees and even cycles
satisfy his conjecture, so these results in themselves are not too exciting. However, the next result
uses the same tools to show that all so-called “tree-arrangeable” graphs satisfy Sidorenko’s conjecture, which
was the state of the art before Szegedy’s work.
Definition 11. Suppose we are given a bipartite graph H = (A, B, E). We are allowed to extend H in two
ways to get a new bipartite graph H′:
• We may add a single vertex v to A and connect it to a single vertex b ∈ B, or


• We may add a single vertex v to B and connect it to a subset of N(b), where b is some vertex in B and
N(b) is its neighborhood in A.
A graph is called tree-arrangeable if it may be built up from K2 using a sequence of these allowed operations.
The name “tree-arrangeable” comes from an equivalent description, which is a bit less intuitive in my opinion,
and also less amenable to our purposes.

Tree-arrangeable graphs were introduced by Kim, Lee, and Lee, who proved that they satisfy Sidorenko’s
conjecture. Using the entropy tools we’ve already developed, we can provide a simple alternative proof.
Theorem 12. Every tree-arrangeable graph has a witness variable, and thus satisfies Sidorenko’s conjecture.
Proof. We will prove this by induction on the number of vertices of H = (A, B, E). In fact, the inductive
statement will be stronger: we will prove that not only does H have a witness variable XH , but that for
every vertex u ∈ B, the marginal distribution of its neighborhood N (u) is just the distribution of deg(u)
independent neighbors of a sample of D.
The base case is H = K2 , for which both these statements are clear; the witness variable is just E, and
the only neighborhoods are single vertices, for which we already know the marginals to be distributed as D.
For the inductive case, suppose this is true for all H with n − 1 vertices. For a tree-arrangeable graph
H′ on n vertices, we may write it as an extension of some H on n − 1 vertices according to one of the
two operations defined above. First, suppose that H′ is gotten from H by adding a new vertex v to A
and connecting it to some b ∈ B; equivalently, H′ is gotten by gluing H and K2 along S = {b}. Then by
Corollary 8, we know that the conditionally independent coupling of XH with E will be a witness variable
for H′. To check the condition about neighborhoods, observe that all neighborhoods in H′ are the same as
neighborhoods in H, except for N(b). Its neighborhood in H′ is just its neighborhood in H, plus the new
vertex v. But since XH′ is a conditionally independent coupling along b, we know that the distribution of
v is independent from the distribution of the other neighbors of b. Since we assumed inductively that they
were distributed as independent neighbors of D, the same is true in H′.
Now, assume instead that H′ is gotten from H by adding a new vertex v to B and connecting it to
some subset of NH(b) for some fixed b ∈ B. Equivalently, we can think of H′ as a gluing of H with a star
K1,m along m neighbors of b. Since K1,m can be built as a series of gluings of edges, we know by the same
argument as above that in its witness variable, the marginal on the m leaves is just m independent neighbors
of D. Moreover, by the inductive hypothesis, we know that the distribution of any m neighbors of b is also
that of m independent neighbors of D. So by Lemma 6, this gluing will provide us a witness variable XH′
for H′. To check the neighborhood condition, again observe that the only neighborhood of a vertex in B
that changed is the neighborhood of v. But we automatically get that the distribution of N (v) is that of m
independent neighbors of D, since that was the marginal of this set in both XK1,m and XH . This completes
the proof.
Observe that any bipartite graph which has one vertex b that is complete to the other side is tree-
arrangeable. Indeed, we may build up such a graph by first adding a bunch of vertices to A, each connected
to b, according to the first operation, and then adding vertices to B connected to whichever vertices of A
we’d like, by the second operation. Thus, this result generalizes the theorem of Conlon, Fox, and Sudakov,
which says that bipartite graphs with one complete vertex satisfy Sidorenko’s conjecture.
Lemma 6 said that when we glue along identically-distributed independent sets, we can build new witness
variables by a conditionally independent coupling. We can in fact do even better, and glue along forests. To
do this, let H be a bipartite graph with no isolated vertices, and say that X is a strong witness variable for
H if we have the inequality
H(X) ≥ e(H)H(E) + (v(H) − 2e(H))H(D)
(the only difference being that H(V) was replaced by H(D)). Note that since H has no isolated vertices, we
have that v(H) ≤ 2e(H), and thus the second term is non-positive. Thus, since H(D) ≤ H(V), this inequality is
indeed stronger than the inequality defining a witness variable. Note that for H = K2 , we still have equality
if X = E, and thus K2 has a strong witness variable. Therefore, for the purposes of inductively building


up graphs from an edge and from gluings, the base case still works if we wish to maintain this stronger
condition.
Lemma 13. Suppose H1, H2 have strong witness variables XH1, XH2, and suppose that S1, S2 are vertex
sets in H1, H2, respectively, with a bijection f : S1 ↔ S2 under which the marginal distributions XH1|S1 and
XH2|S2 are identical. Suppose too that S1, S2 span forests (i.e. contain no cycles), and that f is an isomorphism
of these forests. Let H = H1 ∪f H2, and let X be the conditionally independent coupling of XH1, XH2. Then
X is a strong witness variable for H.
Proof. As in the proof of Lemma 6, we can use conditional independence and the chain rule to write

H(X) = H(XH1, XH2, XS)
     = H(XS) + H(XH1 | XS) + H(XH2 | XS, XH1)
     = H(XS) + H(XH1 | XS) + H(XH2 | XS)
     = H(XS) + (H(XH1) − H(XS)) + (H(XH2) − H(XS))
     = H(XH1) + H(XH2) − H(XS)

By assumption,

H(XH1) ≥ e(H1)H(E) + (v(H1) − 2e(H1))H(D)
H(XH2) ≥ e(H2)H(E) + (v(H2) − 2e(H2))H(D)

Moreover, since H is gotten by gluing H1 and H2 along S,

v(H) = v(H1) + v(H2) − v(S)
e(H) = e(H1) + e(H2) − e(S)

We get this formula for the edge count because we delete all parallel edges when gluing, and since f is an
isomorphism, every edge in S comes from both S1 and S2 , and thus one copy of it is deleted. So if we can
prove that
H(XS ) ≤ e(S)H(E) + (v(S) − 2e(S))H(D)
then we will get that

H(X) ≥ (e(H1 ) + e(H2 ) − e(S))H(E) + [(v(H1 ) + v(H2 ) − v(S)) − 2(e(H1 ) + e(H2 ) − e(S))]H(D)
= e(H)H(E) + (v(H) − 2e(H))H(D)

as desired.
So it suffices to prove that H(XS ) ≤ e(S)H(E) + (v(S) − 2e(S))H(D). Note that since X is a conditionally
independent coupling, we have that the marginals on all edges in S are E, and the marginals on all vertices
in S are D. We will actually prove the following more general fact:
Proposition 14. If F is a forest and Y is a random variable on hom(F, G) such that the marginal of Y on
every edge of F is E, and the marginal on every vertex is D, then

H(Y ) ≤ e(F )H(E) + (v(F ) − 2e(F ))H(D)

It’s interesting to note that once we finish this proof, we will know that F has a strong witness variable,
and for that variable the reverse inequality is also true. Thus, for forests, the entropy inequality for strong
witness variables is actually an equality.
Proof. We prove this by induction on v(F). In the case v(F) = 2, we have that F is either K2, in which case
we just need H(Y) ≤ H(E), which is true with equality, or else F is two independent vertices, in which case
we need H(Y) ≤ 2H(D). This is also true: the marginal of Y on each vertex is D, so by subadditivity of
entropy, H(Y) ≤ H(D) + H(D) = 2H(D). So the base case is proved.


For the inductive case, since F is a forest, we can find some v ∈ V(F) so that deleting v disconnects
F. Therefore, we can write V(F) = V1 ∪ V2, where V1 ∩ V2 = {v}, there is no edge between V1 \ {v} and
V2 \ {v}, and |V1|, |V2| < v(F). Then Y is some coupling of Y|V1 and Y|V2, and thus its entropy is upper-bounded by
the entropy of the conditionally independent coupling. Therefore, by the inclusion-exclusion property of the
conditionally independent coupling, we get that

H(Y ) ≤ H(Y |V1 ) + H(Y |V2 ) − H(Y |v )

Note that Y |V1 and Y |V2 are also distributions whose marginals on edges are E and on vertices are D, so by
the inductive hypothesis,

H(Y ) ≤ (e(V1 )H(E) + (|V1 | − 2e(V1 ))H(D)) + (e(V2 )H(E) + (|V2 | − 2e(V2 ))H(D)) − H(Y |v )
= (e(V1 ) + e(V2 ))H(E) + (|V1 | + |V2 | − 1 − 2(e(V1 ) + e(V2 )))H(D)
= e(F )H(E) + (v(F ) − 2e(F ))H(D)

This completes the proof of the proposition, and thus we are done.
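The equality for forests noted before this proof can be seen numerically on a small example. Below, Y is the strong witness variable one gets by gluing two edges at the middle vertex of a 3-vertex path (the graph G is an arbitrary choice of mine), and its entropy matches e(F)H(E) + (v(F) − 2e(F))H(D) exactly.

import math

def H(p):
    return sum(q * math.log(1.0 / q) for q in p.values() if q > 0)

G = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}    # a small graph (assumed)
two_e = sum(len(nbrs) for nbrs in G.values())

H_E = math.log(two_e)
H_D = H({v: len(nbrs) / two_e for v, nbrs in G.items()})

# strong witness variable for the path a - m - b: m ~ D, then two independent
# uniform neighbours of m
Y = {(a, m, b): 1.0 / (two_e * len(G[m])) for m in G for a in G[m] for b in G[m]}

print(H(Y))                  # equals ...
print(2 * H_E - H_D)         # ... e(F) H(E) + (v(F) - 2 e(F)) H(D), with e(F) = 2, v(F) = 3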
Corollary 15 (Stated somewhat informally). Consider the class of graphs that can be built up from a
single edge by repeated gluings along isomorphic forests, where all the gluings are done so that the marginal
distributions are equal along the glued forests. Then any graph in this class satisfies Sidorenko’s conjecture.

Proof. This follows immediately from Lemma 13; the only thing to be slightly careful about is that the
notion of a strong witness variable is only stronger than the usual notion of a witness variable under the
assumption that the graph has no isolated vertices. But this is no real issue; if H has isolated points, begin
by deleting them and finding a strong witness variable for the remaining part of the graph. In particular, it is
a witness variable. Now, adding back the isolated vertices (e.g. by gluing them along an empty independent
set, using Lemma 6), we get a witness variable for H, so it satisfies Sidorenko’s conjecture.

Unfortunately, I haven’t been able to think of any natural examples of graphs that can be constructed by
gluings along forests, but cannot be constructed by gluings along independent sets. For unnatural examples,
one can simply take two arbitrarily complicated graphs that have witness variables, and glue them along an
edge; I think that in general, such examples cannot be constructed by only gluing along independent sets. I
suspect that hypercubes can be built by gluings along forests, which would allow us to reprove a theorem of
Hatami (using the techniques of weakly norming graphs) that shows that all hypercubes satisfy Sidorenko’s
conjecture; however, I have not yet found a way to construct such a sequence of gluings.
In Szegedy’s paper, he proves Sidorenko’s conjecture for an even larger class of graphs, which he calls thick
graphs. However, even their definition is difficult to understand: they are graphs that arise in certain ways
from certain complicated hypergraphs (called reflection complexes), and these hypergraphs can themselves
be built up by various gluing operations akin to the ones above. The language of reflection complexes allows
one to keep track of the gluing operations in a systematic fashion (in particular, it makes easy the important
step of verifying that the sets along which we’re gluing have identical marginals), so it does appear to be
the correct framework to work in when one wants to derive the most general theorem possible from the
entropy method. However, it is also a fairly complicated framework to define and to work in, and I don’t
understand it well enough to explain it. In theory, if a graph is thick, then it should be possible to prove
that it satisfies Sidorenko’s conjecture using only simple operations, as above, by manually keeping track of
the information that the reflection complex stores automatically. However, I have been unable to actually
do this for even fairly simple graphs (e.g. Szegedy reproves that hypercubes satisfy Sidorenko’s conjecture
using this technique, but I have been unable to understand the sequence of simple entropy steps that his
proof should be encoding).
