---
layout: post
title: The union-closed conjecture
date: 2022-11-27
modified_date: 2023-06-22
categories: maths
permalink: /blog/union-closed.html
use_math: true
---
There has recently been a big breakthrough on a problem called the union-closed conjecture, thanks to a new paper by Justin Gilmer and some follow-up work by others. It was interesting to me because it used information-theoretic ideas on a problem with no obvious information-theoretic content, and also because the breakthrough was surprisingly straightforward for such a famous unsolved problem.
We work with a finite base set $[n] = \{1, 2, \dots, n\}$. A family $\mathcal F$ of subsets of $[n]$ is *union-closed* if $A \cup B \in \mathcal F$ whenever $A \in \mathcal F$ and $B \in \mathcal F$.

Then consider the following conjecture:
Conjecture 1. Let $\mathcal F$ be a union-closed family of subsets of $[n]$, containing at least one nonempty set. Then there exists an $i \in [n]$ that is in at least $\alpha \lvert\mathcal F\rvert$ of the sets in $\mathcal F$. (Here $\alpha \in (0, 1]$ is a parameter: the bigger the $\alpha$, the stronger the statement.)
Frankl's union-closed conjecture is Conjecture 1 with $\alpha = \frac12$.
Until a few days ago, the union-closed conjecture had been proved in a few special cases, but the best bound known in general gave only $\alpha$ of order $1/\log\lvert\mathcal F\rvert$, which tends to $0$ for large families. Gilmer's paper proved Conjecture 1 with the constant $\alpha = 0.01$, and within days three follow-up papers -- by Alweiss, Huang and Sellke; by Chase and Lovett; and by Sawin -- sharpened his method to reach $\alpha = \psi := (3 - \sqrt 5)/2 \approx 0.38$.
The purpose of this blogpost is to try to explain the argument to myself. This will require some -- but not very much! -- knowledge of information theory. I found the paper of Chase and Lovett the easiest to read, so this account is based on that.
The key to Gilmer's approach is to take $A$ and $B$ to be IID uniform random sets from $\mathcal F$, and to compare the entropy of $A \cup B$ with that of $A$. The main tool is the following.
Theorem 2. Let $A$ and $B$ be IID random subsets of $[n]$ drawn according to some distribution with $\mathbb P(i \in A) \leq p$ for all $i \in [n]$. Then

$$ H(A \cup B) \geq \frac{1-p}{1-\psi}\, H(A) , $$

where $H$ denotes the Shannon entropy and $\psi = (3 - \sqrt{5})/2$.
First, let's see how Theorem 2 proves Conjecture 1 with $\alpha = \psi$. Suppose, for a contradiction, that $\mathcal F$ is union-closed but that every $i \in [n]$ is in fewer than $\psi\lvert\mathcal F\rvert$ of the sets in $\mathcal F$. Let $A$ and $B$ be IID uniform random sets from $\mathcal F$. Then $\mathbb P(i \in A) \leq p$ for all $i$, for some $p < \psi$, so Theorem 2 gives

$$ H(A \cup B) \geq \frac{1-p}{1-\psi}\, H(A) > H(A) = \log \lvert\mathcal F\rvert . $$

But on the other hand, since $\mathcal F$ is union-closed, the random set $A \cup B$ also takes values in $\mathcal F$, so

$$ H(A \cup B) \leq \log \lvert\mathcal F\rvert . $$

These two equations are a contradiction -- so our assumption was wrong, and some $i \in [n]$ must be in at least $\psi\lvert\mathcal F\rvert$ of the sets.
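To make this concrete, here is a small Python sketch -- the family below and all names are invented for illustration -- that checks a family is union-closed and finds the frequency of its most common element:

```python
from itertools import combinations
from math import sqrt

PSI = (3 - sqrt(5)) / 2  # about 0.38197

def is_union_closed(family):
    """True if the union of every pair of member sets is also a member."""
    return all(a | b in family for a, b in combinations(family, 2))

def max_frequency(family):
    """Largest fraction of the member sets containing any single element."""
    elements = set().union(*family)
    return max(sum(1 for s in family if i in s) for i in elements) / len(family)

# An arbitrary union-closed family over {1, 2, 3}, for illustration only.
family = {frozenset(), frozenset({1}), frozenset({2}),
          frozenset({1, 2}), frozenset({1, 2, 3})}

assert is_union_closed(family)
assert max_frequency(family) >= PSI  # element 1 is in 3 of the 5 sets
```

Of course, checking examples proves nothing -- but it is a useful sanity check on the statement.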
Let us note that it will require some new ideas to push past $\alpha = \psi$: as discussed in the notes at the end, Chase and Lovett show that $\psi$ cannot be improved for "almost union-closed" families, so any further progress has to use the union-closed property more fully.
It remains to prove Theorem 2.
The proofs all rely on an important technical lemma:
Lemma 3. For $0 \leq p, q \leq 1$, we have

$$ h(pq) \geq \frac{1}{2(1-\psi)} \big( q\, h(p) + p\, h(q) \big) , $$

where $h(p) = -p\log p - (1-p)\log(1-p)$ is the binary entropy and $\psi = (3 - \sqrt{5})/2$.
(Note that there is equality at $p = q = 1 - \psi$: since $(1-\psi)^2 = \psi$, both sides equal $h(1-\psi)$ there, using the symmetry $h(1-x) = h(x)$. This is where the value of $\psi$ comes from.)
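Lemma 3 can be sanity-checked numerically. The following Python sketch -- grid size and tolerance are arbitrary choices of mine -- confirms the inequality on a grid over the unit square and the equality at $p = q = 1 - \psi$:

```python
from math import log, sqrt

PSI = (3 - sqrt(5)) / 2

def h(p):
    """Binary entropy in nats, with h(0) = h(1) = 0 by convention."""
    if p <= 0 or p >= 1:
        return 0.0
    return -p * log(p) - (1 - p) * log(1 - p)

def gap(p, q):
    """Left side of Lemma 3 minus right side."""
    return h(p * q) - (q * h(p) + p * h(q)) / (2 * (1 - PSI))

# The inequality holds (up to floating-point noise) on a grid...
grid = [i / 200 for i in range(201)]
assert all(gap(p, q) >= -1e-12 for p in grid for q in grid)

# ...and is tight at p = q = 1 - psi, since (1 - PSI)**2 == PSI
# and h is symmetric about 1/2.
assert abs(gap(1 - PSI, 1 - PSI)) < 1e-12
```

A grid check is not a proof, but it makes the equality case easy to believe.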
Lemma 3 seems to be rather fiddly to prove. Chase and Lovett show that it follows from the one-variable version

$$ h(p^2) \geq \frac{p\, h(p)}{1-\psi} \qquad \text{for } 0 \leq p \leq 1 . $$
Chase and Lovett leave the proof of this one-variable inequality as a somewhat fiddly calculus exercise. Here's a graph of

$$ f(p) = h(p^2) - \frac{p\, h(p)}{1-\psi} $$

-- that is, the difference between the two sides of the one-variable version:
{:style="display:block; margin-left:auto; margin-right:auto; width: 600px"}
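The graph can be reproduced (or replaced by a brute-force check) in a few lines of Python -- a sketch, with grid resolution chosen arbitrarily; the interior minimum of the gap sits at $p = 1 - \psi \approx 0.618$:

```python
from math import log, sqrt

PSI = (3 - sqrt(5)) / 2

def h(p):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if p <= 0 or p >= 1:
        return 0.0
    return -p * log(p) - (1 - p) * log(1 - p)

def f(p):
    """Gap between the two sides of the one-variable inequality."""
    return h(p * p) - p * h(p) / (1 - PSI)

# Interior grid: f also vanishes at the endpoints 0 and 1, so we
# exclude them to locate the interior double root.
grid = [i / 1000 for i in range(1, 1000)]
assert all(f(p) >= -1e-12 for p in grid)  # the inequality holds on the grid
p_min = min(grid, key=f)                  # grid point where the gap is smallest
print(round(p_min, 3))                    # close to 1 - PSI
```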
OK, so the proof of Theorem 2.
It is convenient to adopt some notation. Let $A_i \in \{0, 1\}$ be the indicator of the event $i \in A$, so that $A$ is identified with the random binary string $(A_1, A_2, \dots, A_n)$, and write $A_{<i} = (A_1, A_2, \dots, A_{i-1})$ for its first $i-1$ coordinates; similarly for $B$ and for $A \cup B$, where $(A \cup B)_i = \max(A_i, B_i)$.
We start by writing
$$ \begin{align*} H(A \cup B) &= \sum_{i=1}^n H\big((A \cup B)_i \mid (A \cup B)_{< i} \big) \\ & \geq \sum_{i=1}^n H\big((A \cup B)_i \mid A_{< i}, B_{< i} \big) , \end{align*} $$
where the equality on the first line is the chain rule for entropy, and the inequality on the second line holds because $(A \cup B)_{<i}$ is a function of $(A_{<i}, B_{<i})$, and conditioning on more information cannot increase entropy. Let's look at one of the terms from the sum. We have, by definition,
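Both facts used here -- the chain rule and "conditioning on more information cannot increase entropy" -- are easy to check numerically. A Python sketch; the three-bit distribution is an arbitrary example of mine, not anything from the papers:

```python
from math import log
from itertools import product

def H(pmf):
    """Shannon entropy (in nats) of a distribution {outcome: probability}."""
    return -sum(p * log(p) for p in pmf.values() if p > 0)

def marginal(pmf, keep):
    """Marginal distribution of the coordinates listed in `keep`."""
    out = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def cond_H(pmf, target, given):
    """Conditional entropy H(target | given) = H(target, given) - H(given)."""
    return H(marginal(pmf, given + target)) - H(marginal(pmf, given))

# Coordinates (a, b, u) with u = a OR b, where a and b are independent
# Bernoulli(0.4) bits -- an arbitrary choice for this sketch.
joint = {}
for a, b in product((0, 1), repeat=2):
    p = (0.4 if a else 0.6) * (0.4 if b else 0.6)
    joint[(a, b, a | b)] = joint.get((a, b, a | b), 0.0) + p

# Chain rule: H(a, u) = H(a) + H(u | a).
assert abs(H(marginal(joint, (0, 2))) -
           (H(marginal(joint, (0,))) + cond_H(joint, (2,), (0,)))) < 1e-12

# Conditioning on more cannot increase entropy: u = a | b is a function of
# (a, b), so H(x | a, b) <= H(x | u) for any coordinate x; here x = a.
assert cond_H(joint, (0,), (0, 1)) <= cond_H(joint, (0,), (2,)) + 1e-12
```

This mirrors the inequality above: conditioning on $(A_{<i}, B_{<i})$ gives away at least as much information as conditioning on the function $(A \cup B)_{<i}$ of them.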
$$ H\big((A \cup B)_i \mid A_{< i}, B_{< i} \big) = \mathbb E_{a,b}\, H\big((A \cup B)_i \mid A_{< i} = a, B_{< i} = b\big) , $$
where the expectation is over independent samples $a$ of $A_{<i}$ and $b$ of $B_{<i}$. Writing $p_a = \mathbb P(A_i = 1 \mid A_{<i} = a)$ and $p_b = \mathbb P(B_i = 1 \mid B_{<i} = b)$, we have
$$ H\big((A \cup B)_i \mid A_{< i} = a, B_{< i} = b\big) = h\big((1-p_a)(1-p_b)\big) , $$
because, conditionally on $A_{<i} = a$ and $B_{<i} = b$, the bit $(A \cup B)_i$ is the OR of two independent Bernoulli bits with success probabilities $p_a$ and $p_b$, so it equals $0$ with probability $(1-p_a)(1-p_b)$.
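A quick Python check of this single-coordinate claim, with arbitrary illustrative values for $p_a$ and $p_b$:

```python
from math import log
from itertools import product

def h(p):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if p <= 0 or p >= 1:
        return 0.0
    return -p * log(p) - (1 - p) * log(1 - p)

pa, pb = 0.3, 0.55  # arbitrary values for p_a and p_b, for this check only

# Enumerate the four outcomes of the two independent bits and
# accumulate the distribution of their OR.
pmf_or = {0: 0.0, 1: 0.0}
for a, b in product((0, 1), repeat=2):
    p = (pa if a else 1 - pa) * (pb if b else 1 - pb)
    pmf_or[a | b] += p

# The entropy of the OR matches h((1 - pa) * (1 - pb)).
entropy = -sum(p * log(p) for p in pmf_or.values() if p > 0)
assert abs(entropy - h((1 - pa) * (1 - pb))) < 1e-12
```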
So, applying Lemma 3 together with the symmetry $h(1-x) = h(x)$, and then the independence of $A$ and $B$,
$$ \begin{align*} H\big((A \cup B)_i \mid A_{< i}, B_{< i} \big) &= \mathbb E_{a,b}\, H\big((A \cup B)_i \mid A_{< i} = a, B_{< i} = b\big) \\ &\geq \mathbb E_{a,b}\, \frac{1}{2(1-\psi)} \big((1-p_b)h(p_a) + (1-p_a)h(p_b) \big) \\ &= \frac{1}{2(1-\psi)} \Big( \mathbb E_b (1 - p_b)\, \mathbb E_a\, h(p_a) + \mathbb E_a (1 - p_a)\, \mathbb E_b\, h(p_b) \Big) . \end{align*} $$
Now, $\mathbb E_a\, h(p_a) = H(A_i \mid A_{<i})$, by the definition of conditional entropy, while $\mathbb E_a (1 - p_a) = \mathbb P(A_i = 0) = 1 - \mathbb P(i \in A) \geq 1 - p$; and the same holds with $B$ in place of $A$.
So we have
$$ \begin{align*} H\big((A \cup B)_i \mid A_{< i}, B_{< i} \big) &\geq \frac{1}{2(1-\psi)} \big( (1-p)H(A_i \mid A_{<i}) + (1-p)H(B_i \mid B_{<i}) \big) \\ &= \frac{1-p}{2(1-\psi)} \big( H(A_i \mid A_{<i}) + H(B_i \mid B_{<i}) \big) \\ &= \frac{1-p}{1-\psi}\, H(A_i \mid A_{<i}) , \end{align*} $$
since $A$ and $B$ are identically distributed, so $H(B_i \mid B_{<i}) = H(A_i \mid A_{<i})$.
Finally, putting this all together,
$$ \begin{align*} H(A \cup B) &\geq \sum_{i=1}^n H\big((A \cup B)_i \mid A_{< i}, B_{< i} \big) \\ &\geq \sum_{i=1}^n \frac{1-p}{1-\psi}\, H(A_i \mid A_{<i}) \\ &= \frac{1-p}{1-\psi} \sum_{i=1}^n H(A_i \mid A_{<i}) \\ &= \frac{1-p}{1-\psi}\, H(A) , \end{align*} $$
where the last step is the chain rule again. Thus Theorem 2 is proved.
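For small $n$, Theorem 2 can also be verified exactly by enumeration. A Python sketch -- the family and the uniform distribution on it are arbitrary choices for illustration:

```python
from math import log, sqrt
from itertools import product

PSI = (3 - sqrt(5)) / 2

def H(pmf):
    """Shannon entropy (in nats) of a distribution {outcome: probability}."""
    return -sum(p * log(p) for p in pmf.values() if p > 0)

# A uniform distribution over an arbitrary small family of subsets of
# {0, 1, 2} (not necessarily union-closed -- Theorem 2 doesn't need that).
family = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({2})]
pmf_A = {s: 1 / len(family) for s in family}

n = 3
p = max(sum(q for s, q in pmf_A.items() if i in s) for i in range(n))

# Exact distribution of A ∪ B for IID A and B.
pmf_union = {}
for (a, qa), (b, qb) in product(pmf_A.items(), repeat=2):
    u = a | b
    pmf_union[u] = pmf_union.get(u, 0.0) + qa * qb

lhs = H(pmf_union)
rhs = (1 - p) / (1 - PSI) * H(pmf_A)
assert lhs >= rhs - 1e-12  # the conclusion of Theorem 2 holds here
```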
Some further notes:

- Pebody also proved the same result, on 23 November, two days after the other three papers. A couple of the steps in his proof of the main lemma are marked "To be proved", but his argument seems perhaps a bit less computer-aided than the others.
- In the paper of Chase and Lovett, they show that $\alpha = \psi$ is in fact tight for "almost union-closed" families, where at least $(1-\epsilon)\lvert\mathcal F\rvert^2$ of the pairs $A, B \in \mathcal F$ have their union $A \cup B$ in $\mathcal F$. (Gil Kalai points out that this is different to saying that at least $(1-\epsilon)\lvert\mathcal F \cup \mathcal F\rvert$ of the unions $A \cup B$ are in $\mathcal F$.)
- Sawin's paper also gave a sketch of how the result might be pushed a little beyond $\psi$ by making $A$ and $B$ not independent (independence isn't actually required for this style of argument to work). In December, Yu and then Cambie filled in the details, and got $\alpha = 0.3823\ldots$ (compared to $\psi = 0.3819\ldots$).
- In June 2023, Liu developed Sawin's improvement a bit further, and computed $\alpha = 0.3827\ldots$.
- I had somehow missed this earlier, but Boppana gives a delightfully short proof of the main lemma, in its one-variable form $f(x) := h(x^2) - \frac{x\, h(x)}{1-\psi} \geq 0$. In fact, he proved it in 1985, in a different context. It's simple to calculate the third derivative $f'''$ and to see that it has at most two roots in $[0,1]$; by Rolle's theorem, $f$ itself then has at most five roots, counted with multiplicity. These are all accounted for: the double root at $0$, the double root at $1-\psi$, and the root at $1$. So $f$ cannot change sign on $[0,1]$, and it's clear that this sign is positive. (I haven't thought about this closely enough to work out why going via the third derivative works, when going via the first or second presumably doesn't.)