
ON THE ZEROS OF PARTITION FUNCTIONS

WITH MULTI-SPIN INTERACTIONS

Alexander Barvinok

June 13, 2024

Abstract. Let $X_1, \ldots, X_n$ be probability spaces, let $X$ be their direct product, let
$\phi_1, \ldots, \phi_m: X \longrightarrow \mathbb{C}$ be random variables, each depending only on a few coordinates
of a point $x = (x_1, \ldots, x_n)$, and let $f = \phi_1 + \ldots + \phi_m$. The expectation $\mathbf{E}\, e^{\lambda f}$, where
$\lambda \in \mathbb{C}$, appears in statistical physics as the partition function of a system with multi-spin
interactions, and also in combinatorics and computer science, where it is known
as the partition function of edge-coloring models, tensor network contractions or a
Holant polynomial. Assuming that each $\phi_i$ is 1-Lipschitz in the Hamming metric of
$X$, that each $\phi_i(x)$ depends on at most $r \ge 2$ coordinates $x_1, \ldots, x_n$ of $x \in X$, and
that for each $j$ there are at most $c \ge 1$ functions $\phi_i$ that depend on the coordinate $x_j$,
we prove that $\mathbf{E}\, e^{\lambda f} \ne 0$ provided $|\lambda| \le (3c\sqrt{r-1})^{-1}$ and that the bound is sharp
up to a constant factor. Taking a scaling limit, we prove a similar result for functions
$\phi_1, \ldots, \phi_m: \mathbb{R}^n \longrightarrow \mathbb{C}$ that are 1-Lipschitz in the $\ell^1$ metric of $\mathbb{R}^n$ and where the
expectation is taken with respect to the standard Gaussian measure in $\mathbb{R}^n$. As a
corollary, the value of the expectation can be efficiently approximated, provided $\lambda$
lies in a slightly smaller disc.

1. Introduction and the main results


We investigate the complex zeros and computational complexity of functionals
of a particular type, which appear under different names in statistical physics, com-
puter science, and combinatorics. The functionals can be viewed as the partition
function of a spin system with multiple spin interactions (statistical physics) or as
the partition function of a hypergraph edge-coloring model, also known as a ten-
sor network contraction, or as a Holant polynomial (combinatorics and computer
science).
1991 Mathematics Subject Classification. 82B20, 30C15, 68R05, 68W25.
Key words and phrases. partition function, spin systems, hypergraphs, edge-coloring models,
zeros, algorithm, interpolation method, numerical integration.
This research was partially supported by NSF Grant DMS 1855428.

In what follows, we consider functions on the direct product $X_1 \times \ldots \times X_n$ of
probability spaces. For a point $x \in X$, $x = (x_1, \ldots, x_n)$, we refer to $x_j \in X_j$ as the
$j$-th coordinate of $x$. The Hamming distance between two points $x, y \in X$ is the
number of the coordinates where they differ:

$$\operatorname{dist}(x, y) = \left|\left\{j : x_j \ne y_j\right\}\right|, \quad \text{where} \quad x = (x_1, \ldots, x_n) \quad \text{and} \quad y = (y_1, \ldots, y_n).$$

We prove the following main result.


(1.1) Theorem. Let $X_1, \ldots, X_n$ be probability spaces, let $X = X_1 \times \ldots \times X_n$ be
the product space and let $\phi_1, \ldots, \phi_m: X \longrightarrow \mathbb{C}$ be measurable functions. Assume
that
(1) Each function $\phi_i$ is 1-Lipschitz in the Hamming metric, that is,

$$|\phi_i(x) - \phi_i(y)| \le 1$$

provided $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ differ in one coordinate;

(2) Each function $\phi_i$ depends only on at most $r \ge 2$ coordinates, that is, for
each $i = 1, \ldots, m$ there is a subset $J_i \subset \{1, \ldots, n\}$ with $|J_i| \le r$ such that

$$\phi_i(x_1, \ldots, x_n) = \phi_i(y_1, \ldots, y_n) \quad \text{whenever} \quad x_j = y_j \ \text{for all} \ j \in J_i;$$

(3) For every $j = 1, \ldots, n$, there are at most $c \ge 1$ functions $\phi_i$ that depend on
the coordinate $x_j$, that is, $|\{i : j \in J_i\}| \le c$ for all $j$.
Let

$$f = \sum_{i=1}^m \phi_i$$

and suppose that $\lambda$ is a complex number such that

$$|\lambda| \le \frac{1}{3c\sqrt{r-1}}.$$

Then

$$\mathbf{E}\, e^{\lambda f} \ne 0.$$

Moreover, if, additionally,

$$|\phi_i(x)| \le L \quad \text{for all} \quad x \in X \quad \text{and} \quad i = 1, \ldots, m \tag{1.1.1}$$

and some $L > 0$, then

$$e^{|\lambda| m L} \ge \left|\mathbf{E}\, e^{\lambda f}\right| \ge e^{-|\lambda| m L} \left(\cos \frac{\pi}{4\sqrt{r-1}}\right)^n. \tag{1.1.2}$$
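
As a sanity check of Theorem 1.1, the following sketch evaluates the expectation by brute force on a small instance and compares it with the two bounds in (1.1.2). The instance is an illustrative assumption, not taken from the paper: five binary coordinates with the uniform measure and five indicator functions of cyclically consecutive triples, so that r = 3, c = 3 and L = 1. The helper names `expectation` and `all_equal_on` are hypothetical.

```python
import cmath
import itertools
import math

def expectation(phis, lam, n):
    """E exp(lam * sum_i phi_i(x)) for x uniform on {0,1}^n, by direct enumeration."""
    total = 0j
    for x in itertools.product((0, 1), repeat=n):
        total += cmath.exp(lam * sum(phi(x) for phi in phis))
    return total / 2 ** n

def all_equal_on(triple):
    """Indicator that x agrees on the three coordinates in `triple` (values 0 or 1, hence 1-Lipschitz)."""
    a, b, c = triple
    return lambda x: 1.0 if x[a] == x[b] == x[c] else 0.0

# Assumed toy instance: supports J_i are the triples (j, j+1, j+2) mod 5.
n = 5
supports = [(j, (j + 1) % n, (j + 2) % n) for j in range(n)]
phis = [all_equal_on(t) for t in supports]
m, r, c, L = len(phis), 3, 3, 1.0            # each coordinate lies in exactly 3 triples
radius = 1 / (3 * c * math.sqrt(r - 1))      # the bound on |lambda| in Theorem 1.1

# Sample lambda on the boundary circle |lambda| = radius.
values = [abs(expectation(phis, radius * cmath.exp(2j * math.pi * t / 12), n))
          for t in range(12)]
lower = math.exp(-radius * m * L) * math.cos(math.pi / (4 * math.sqrt(r - 1))) ** n
upper = math.exp(radius * m * L)
print(lower, min(values), max(values), upper)
```

Enumeration costs 2^n evaluations, so this is only feasible for tiny n; it illustrates the statement and is not the approximation algorithm discussed later.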

We prove Theorem 1.1 in Section 2, and in Section 4 we show that the bound for
|λ| is optimal, up to a constant factor. The method of proof extends those of [Ba17]
and [BR19]. The dependence on r is worth noting. One approach frequently used
for problems of this type is the cluster expansion method, see [Je24] for a recent
survey. In the situation of Theorem 1.1 it apparently gives only |λ| = Ω(1/rc)
as a bound for the zero-free region, while also requiring |φi | to remain uniformly
bounded [Ca+22].
The dependence on r allows us to obtain zero-free regions for the partition func-
tion of another model that can be considered as a scaling limit of that described by
Theorem 1.1. We consider the standard Gaussian probability measure in $\mathbb{R}^n$ with
density

$$\frac{1}{(2\pi)^{n/2}} \exp\left\{-\frac{1}{2}\left(x_1^2 + \ldots + x_n^2\right)\right\} \quad \text{for} \quad x = (x_1, \ldots, x_n)$$

and prove the following result.
(1.2) Theorem. Let $\phi_1, \ldots, \phi_m: \mathbb{R}^n \longrightarrow \mathbb{C}$ be functions on Euclidean space $\mathbb{R}^n$,
endowed with the standard Gaussian probability measure. Assume that
(1) Each function $\phi_i$ is 1-Lipschitz in the $\ell^1$ metric of $\mathbb{R}^n$, that is,

$$|\phi_i(x_1, \ldots, x_n) - \phi_i(y_1, \ldots, y_n)| \le \sum_{j=1}^n |x_j - y_j|;$$

(2) Each function $\phi_i$ depends only on at most $r \ge 2$ coordinates;

(3) For every $j = 1, \ldots, n$, there are at most $c \ge 1$ functions $\phi_i$ that depend on
the coordinate $x_j$.
Let

$$f = \sum_{i=1}^m \phi_i$$

and suppose that $\lambda$ is a complex number such that

$$|\lambda| < \frac{1}{6c\sqrt{r-1}}.$$

Then

$$\mathbf{E}\, e^{\lambda f} \ne 0.$$

Moreover, if, additionally,

$$|\phi_i(x)| \le L \quad \text{for all} \quad x \in \mathbb{R}^n \quad \text{and} \quad i = 1, \ldots, m \tag{1.2.1}$$

and some $L > 0$, then

$$e^{|\lambda| m L} \ge \left|\mathbf{E}\, e^{\lambda f}\right| \ge e^{-|\lambda| m L} e^{-\pi^2 n/32r}. \tag{1.2.2}$$
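
To illustrate the non-vanishing claim of Theorem 1.2 on a function that is not log-concave in the usual sense, the sketch below takes n = 2 and f = |x_1| + |x_2| (an assumed toy example with c = 1 and r = 2, so the disc has radius 1/6). Since the two coordinates are independent, the expectation factors into a square of a one-dimensional Gaussian integral, which is estimated here with a plain trapezoid rule; the helper names and the truncation of the real line to [-12, 12] are assumptions of this sketch.

```python
import cmath
import math

def gaussian_expectation_1d(g, half_width=12.0, steps=24000):
    """Trapezoid-rule estimate of E g(x) for x ~ N(0, 1), truncated to [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0j
    for s in range(steps + 1):
        x = -half_width + s * h
        weight = 0.5 if s in (0, steps) else 1.0   # end points get half weight
        total += weight * g(x) * math.exp(-x * x / 2)
    return total * h / math.sqrt(2 * math.pi)

def partition(lam):
    """E exp(lam * (|x_1| + |x_2|)) for independent standard Gaussians x_1, x_2."""
    one = gaussian_expectation_1d(lambda x: cmath.exp(lam * abs(x)))
    return one * one

def closed_form_real(lam):
    """For real lam, E exp(lam*|x|) = exp(lam^2/2) * (1 + erf(lam/sqrt(2))); squared for two coordinates."""
    return (math.exp(lam * lam / 2) * (1 + math.erf(lam / math.sqrt(2)))) ** 2

print(abs(partition(0.16j)), abs(partition(-0.16)))
```

The functions here are unbounded, so only the non-vanishing part of the theorem applies; the lower bound (1.2.2) requires the additional assumption (1.2.1).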

We prove Theorem 1.2 in Section 3.


Suppose we want to approximate the value of $\mathbf{E}\, e^{\lambda f}$ efficiently. As we are dealing
with complex numbers, it is convenient to adopt the following definition: we say
that a complex number $z_1 \ne 0$ approximates another complex number $z_2 \ne 0$ within
relative error $\epsilon$ if $z_1 = e^{w_1}$ and $z_2 = e^{w_2}$ for some $w_1$ and $w_2$ such that $|w_1 - w_2| \le \epsilon$.
Let us fix some $0 < \delta < 1$. From the by now standard method of polynomial
interpolation [Ba16], [PR17], it follows that for any $\lambda$ such that

$$|\lambda| \le \frac{1-\delta}{3c\sqrt{r-1}} \ \text{(in Theorem 1.1)} \quad \text{or} \quad |\lambda| \le \frac{1-\delta}{6c\sqrt{r-1}} \ \text{(in Theorem 1.2)}$$

approximating the value of $\mathbf{E}\, e^{\lambda f}$ within relative error $0 < \epsilon < 1$ reduces to the
computation of the $m^k$ expectations

$$\mathbf{E}\left(\phi_{i_1} \cdots \phi_{i_k}\right) \quad \text{for all} \quad 1 \le i_1, i_2, \ldots, i_k \le m,$$

where

$$k = O_\delta\left(\ln(m+n) - \ln \epsilon\right).$$

We sketch the algorithm in Section 5. In many applications of Theorem 1.1 the
probability spaces $X_j$ are finite. Assuming that each $X_j$ contains at most $q$ elements,
computing each expectation $\mathbf{E}\left(\phi_{i_1} \cdots \phi_{i_k}\right)$ by direct enumeration takes
$O\left(q^{rk}\right)$ time, provided we have access to the values of $\phi_i(x)$ for a given $x \in X$.
If the parameter $r$ is fixed in advance, we obtain a quasi-polynomial algorithm of
$(q(m+n))^{O(\ln(m+n) - \ln \epsilon)}$ complexity. Moreover, the general technique of Patel and
Regts [PR17] allows one to speed it up to a genuinely polynomial time algorithm
of $\left(q(m+n)\epsilon^{-1}\right)^{O(1)}$ complexity, provided the parameter $c$ is also fixed in advance.
In the context of Theorem 1.2, the expectations $\mathbf{E}\left(\phi_{i_1} \cdots \phi_{i_k}\right)$ are represented by
integrals in the space of dimension at most $kr$ and often can be efficiently computed
or approximated with high accuracy, assuming again that $r$ is fixed in advance. In
that case, we also obtain an algorithm of quasi-polynomial complexity.
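
The enumeration cost quoted above comes from the fact that a product of k functions depends on at most rk coordinates, so only those need to be summed over. A minimal sketch of this step, assuming the uniform measure on {0, ..., q-1} in every coordinate and functions that read a point as a dictionary from coordinate index to value (both are conventions of this sketch, as is the name `joint_moment`):

```python
import itertools

def joint_moment(phis, supports, indices, q):
    """E(phi_{i_1} ... phi_{i_k}) under the uniform measure on {0, ..., q-1}^n.

    Only the coordinates in the union of the supports J_i are enumerated,
    so at most q**(r*k) products are evaluated.
    """
    coords = sorted(set().union(*(supports[i] for i in indices)))
    total = 0.0
    for values in itertools.product(range(q), repeat=len(coords)):
        x = dict(zip(coords, values))      # partial point: only the needed coordinates
        prod = 1.0
        for i in indices:
            prod *= phis[i](x)
        total += prod
    return total / q ** len(coords)

# Assumed example: two functions on binary coordinates with overlapping supports.
phis = [lambda x: x[0] * x[1], lambda x: x[1] * x[2]]
supports = [{0, 1}, {1, 2}]
print(joint_moment(phis, supports, (0, 1), q=2))
```

Summing `joint_moment` over all index tuples of length k gives the k-th moment of f, which is the quantity the interpolation method consumes.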
As we mentioned, the expectation E eλf appears in several different, though
closely related contexts.
(1.3) Statistical physics: partition functions of systems with multi-spin
interactions. Here we describe the statistical physics context of Theorem 1.1.
Suppose that we have a system of n particles, where the j-th particle can be in a
state described by a point xj ∈ Xj , also called the spin of the particle. The vector
x = (x1 , . . . , xn ) of all spins is called a spin configuration. The particles interact
in various ways, and the energy of a spin configuration is given by a function
$f(x_1, \ldots, x_n)$. Then for a real $\lambda > 0$, interpreted as the inverse temperature, the
value of $\mathbf{E}\, e^{-\lambda f}$ is the partition function of the system, see, for example, [FV18] for
the general setup. If the function f is written as a sum of φi , as in Theorem 1.1,
then the energy of the system is a sum over subsystems, each containing at most
r particles, and each particle participating in at most c subsystems. The classical
works of Lee and Yang [YL52], [LY52] relate the complex zeros of the partition
function $\lambda \longmapsto \mathbf{E}\, e^{\lambda f}$ to the phenomenon of phase transition. Generally, zero-free
regions like the one described by Theorem 1.1 correspond to regimes with no phase
transition.
In terms of Theorem 1.1, the most studied case is that of r = 2, which includes
the classical Ising, Potts and Heisenberg models, see [FV18]. In that case, we have
a (finite) graph G = (V, E) with set V of vertices and set E of edges. The particles
are identified with the vertices of G, the spins are ±1 in the case of the Ising model,
elements of some finite set in the case of the Potts model, or vectors in Euclidean
space, in the case of the Heisenberg model. The interactions are pairwise and
described by the functions attached to the edges of E. Hence, in the context of
Theorem 1.1, we have r = 2 and c is the largest degree of a vertex of G. Starting
with [LY52], the complex zeros of the partition function of systems with pairwise
interactions have been actively studied in a great many papers. We refer to [FV18] for
earlier, and to [B+21], [G+22], [PR20], [P+23] for more recent works.
Zero-free regions for systems with multiple spin interactions were considered by
Suzuki and Fisher [SF71], again in connection with phase transition, see also [L+19]
and [L+16] for recent developments. In this case, the particles are identified with
the vertices of a hypergraph, while functions attached to the edges (sometimes called
hyperedges) of the hypergraph describe interactions. In terms of Theorem 1.1, the
maximum number of vertices of an edge of the hypergraph is r, while c is the largest
degree of a vertex. The papers [SF71] and [L+19], respectively [L+16], consider
rather special ferromagnetic, respectively antiferromagnetic, types of interaction,
so their results are not directly comparable to ours. While [SF71], [L+19] and
[L+16] say more about those specific models, our Theorem 1.1 allows one to handle
a wider class of interactions.
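
For a concrete instance of the r = 2 case, consider the Ising model on a cycle. The sketch below is an illustration under stated assumptions: spins are uniform in {-1, 1}, each edge carries the function x_u x_v / 2 (the factor 1/2 makes it 1-Lipschitz in the Hamming metric), and the brute-force value is compared with the classical transfer-matrix formula for a cycle. The function name is hypothetical.

```python
import cmath
import itertools
import math

def ising_cycle_expectation(n, lam):
    """E exp(lam * f), f = sum over cycle edges of x_u * x_v / 2, spins uniform in {-1, 1}."""
    total = 0j
    for x in itertools.product((-1, 1), repeat=n):
        f = sum(x[u] * x[(u + 1) % n] / 2 for u in range(n))
        total += cmath.exp(lam * f)
    return total / 2 ** n

n = 6
r, c = 2, 2                                   # pairwise interactions; every cycle vertex has degree 2
radius = 1 / (3 * c * math.sqrt(r - 1))       # zero-free radius from Theorem 1.1, here 1/6
lam = radius * cmath.exp(0.7j)                # an arbitrary point on the boundary circle

brute = ising_cycle_expectation(n, lam)
# Transfer-matrix formula for the cycle with coupling K = lam / 2, divided by 2^n.
closed = cmath.cosh(lam / 2) ** n + cmath.sinh(lam / 2) ** n
print(brute, closed)
```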

(1.4) Statistical physics: partition functions of interacting particles with
an external field. Here we give an example of how Theorem 1.2 applies to systems
of pairwise interacting particles, cf. [FV18]. For an integer $N > 1$, let us consider
$N$ particles represented as vectors $x^{(1)}, \ldots, x^{(N)}$ in $\mathbb{R}^d$, chosen independently at
random from the standard Gaussian distribution in $\mathbb{R}^d$. Suppose that there are
pairwise repulsive constant forces with potentials

$$\phi_{ij} = -\left\|x^{(i)} - x^{(j)}\right\|,$$

where $\|\cdot\|$ is the standard Euclidean norm in $\mathbb{R}^d$. Hence the total energy of the
system is

$$f\left(x^{(1)}, \ldots, x^{(N)}\right) = -\sum_{i \ne j} \left\|x^{(i)} - x^{(j)}\right\|,$$

which is minimized when the particles are far away from each other. However,
the probability of a configuration with large pairwise distances is small, so that
the Gaussian density can be interpreted as an external force pushing the particles
towards the origin. In terms of Theorem 1.2, we have $n = dN$, $m = \binom{N}{2}$, $r = 2d$
and $c = N$. Theorem 1.2 establishes a zero-free region of the order

$$|\lambda| = \Omega\left(\frac{1}{N\sqrt{d}}\right)$$

for the partition function $\mathbf{E}\, e^{-\lambda f}$. This model is apparently related to the old
problem of finding a configuration of points on the unit sphere in $\mathbb{R}^d$ that maximizes
the sum of pairwise distances between points, see [B+23] (we note that for large $d$
the standard Gaussian measure in $\mathbb{R}^d$ is concentrated around the sphere $\|x\| = \sqrt{d}$).

Essentially identical models described by Theorem 1.1 were studied in connection
with questions in combinatorics and theoretical computer science.
(1.5) Combinatorics and computer science: edge-coloring models and
tensor network contractions. Let $G = (V, E)$ be a graph. Suppose now that
every edge of $G$ can be in one of the $k$ states $\{1, \ldots, k\}$, typically called “colors”.
In terms of Theorem 1.1, we have $n = |E|$ and $X_1 = \ldots = X_n = \{1, \ldots, k\}$, so
that $X = \{1, \ldots, k\}^E$. Suppose further that to each vertex $v$ of $G$ a function
$\psi_v: X_1 \times \ldots \times X_n \longrightarrow \mathbb{C}$ is attached that depends only on the colors of the edges
of $G$ that contain $v$. The expression

$$\sum_{x \in X} \prod_{v \in V} \psi_v(x) \tag{1.5.1}$$

is known as the tensor network contraction, or the partition function of the edge
coloring model, or as a Holant polynomial, see [Re18] for relations between different
models, and references. For relations with spin systems in statistical physics and
also knot invariants, see [HJ93]. Assuming that $\psi_v(x) \ne 0$ for all $x$ and $v$, we can
write $\psi_v(x) = \exp\{\phi_v(x)\}$ and (1.5.1) can be written as the scaled expectation
$\mathbf{E}\, e^{\lambda f}$ of Theorem 1.1, assuming the uniform probability measure in each space $X_j$.
In the context of Theorem 1.1, we have $c = 2$, while $r$ is equal to the largest degree
of a vertex of $G$. Note that compared to the graph interpretation of the Ising and
Potts models from Section 1.3, the parameters $r$ and $c$ switch places.
An expression similar to (1.5.1) can be built for hypergraphs, in which case $r$ is
still the maximum degree of a vertex, while $c$ becomes the maximum size of an edge.
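
The switch of roles between r and c can be seen by computing both parameters from the supports of the functions. In the sketch below, the helper `theorem_parameters` and the small example graph are assumptions made for illustration; the same graph is read once as a spin model (coordinates are vertices, one function per edge) and once as an edge-coloring model (coordinates are edges, one function per vertex).

```python
from collections import Counter

def theorem_parameters(supports):
    """Return (r, c): the largest support size and the largest number of supports sharing a coordinate."""
    r = max(len(J) for J in supports)
    counts = Counter(j for J in supports for j in set(J))
    c = max(counts.values())
    return r, c

# Assumed example graph: a star with center 0 and leaves 1, 2, 3, plus the edge {1, 2}.
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]

# Spin model (Section 1.3): each function depends on the two endpoints of an edge.
spin_supports = edges

# Edge-coloring model (Section 1.5): each function depends on the edges at a vertex.
vertices = sorted({v for e in edges for v in e})
coloring_supports = [[k for k, e in enumerate(edges) if v in e] for v in vertices]

print(theorem_parameters(spin_supports), theorem_parameters(coloring_supports))
```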
In combinatorics and computer science, there is a lot of interest in efficient
approximation of (1.5.1), also in connection with zero-free regions [GG16], [B+22],
[Ca+22], [G+21], [Re18], since many interesting counting problems on graphs and
hypergraphs can be stated as a problem of computing (1.5.1). To illustrate, we
consider just one example of perfect matchings in a hypergraph.
Let $H = (V, E)$ be a hypergraph with set $V$ of vertices and set $E$ of edges. We
choose $X = \{1, 2\}^E$ so that every edge of $H$ can be colored into one of the two
colors, which we interpret as the edge being selected or not selected. We define
$\psi_v(x) = 1$ if exactly one edge containing $v$ is selected, and $\psi_v(x) = 0$ otherwise.
Then (1.5.1) is exactly the number of perfect matchings in $H$. Since deciding whether
a hypergraph contains a perfect matching is a well-known NP-complete problem,
there is little hope to approximate (1.5.1) efficiently. One can try to come up with
a more approachable version of the problem by modifying the definition of ψv so
that ψv (x) = 1 if exactly one edge containing v is selected, and ψv (x) = 1 − δ
otherwise, for some 0 < δ < 1. In this case, the sum (1.5.1), while taken over
all collections of edges of H, is “exponentially tilted” towards perfect matchings.
Thus every perfect matching is counted with weight 1, and an arbitrary collection
of edges is counted with a weight exponentially small in the number of vertices
where the perfect matching condition is violated. In other words, the weight of a
collection of edges is exponentially small in the number of vertices that belong to
any number of edges in the collection other than 1.
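
The tilted sum can be computed by brute force on a small example, which makes the interpolation between "all collections of edges" and "perfect matchings only" visible. The sketch below is an illustration, with the function name `tilted_sum` and the test hypergraph (the complete graph on four vertices) chosen as assumptions; the boundary values δ = 0 and δ = 1 are included only as limiting cases of the range 0 < δ < 1 discussed above.

```python
import itertools

def tilted_sum(edges, num_vertices, delta):
    """Sum over edge subsets of prod_v psi_v, where psi_v = 1 if v lies in exactly
    one selected edge and psi_v = 1 - delta otherwise."""
    total = 0.0
    for selected in itertools.product((0, 1), repeat=len(edges)):
        cover = [0] * num_vertices
        for chosen, edge in zip(selected, edges):
            if chosen:
                for v in edge:
                    cover[v] += 1
        violations = sum(1 for k in cover if k != 1)
        total += (1 - delta) ** violations     # weight decays with the number of violated vertices
    return total

# Assumed example: K4 viewed as a hypergraph with edges of size 2; it has 3 perfect matchings.
k4_edges = list(itertools.combinations(range(4), 2))
for delta in (0.0, 0.5, 1.0):
    print(delta, tilted_sum(k4_edges, 4, delta))
```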
The larger $\delta$ we are able to choose, the more (1.5.1) is tilted towards perfect
matchings. It follows from Theorem 1.1 via the interpolation method that for
a fixed $r$, there is a quasi-polynomial algorithm approximating (1.5.1) for some
$\delta = \Omega\left(1/(c\sqrt{r})\right)$, where $c$ is the maximum number of vertices in an edge of $H$ and
$r$ is the largest degree of a vertex. The results [Ca+22] and [Re18] appear to allow
only for $\delta = \Omega(1/(cr))$. We note that Theorem 1.1 allows us to choose

$$\psi_v = \exp\left\{-\Omega\left(\frac{|k-1|}{c\sqrt{r}}\right)\right\},$$

where $k$ is the number of selected edges containing $v$, so as to assign smaller weights
to the vertices $v$ with bigger violation of the perfect matching condition, up to the
smallest weight of $\psi_v = \exp\left\{-\Omega\left(\sqrt{r}/c\right)\right\}$, when all edges containing $v$ are selected.
We note also that Theorem 1.1 allows us to select edges with a non-uniform
probability, and hence to zoom in on the perfect matchings even more: if the hypergraph
$H$ is $r$-regular, that is, if each vertex is contained in exactly $r$ edges, it makes sense
to select each edge independently with probability $1/r$, so that for each vertex $v$
the expected number of selected edges containing $v$ is exactly 1. If $H$ is not regular,
one can choose instead the maximum entropy distribution, which also ensures
that the expected number of selected edges containing any given vertex is exactly
1. The maximum entropy distribution exists if and only if there exists a fractional
perfect matching, that is, an assignment of non-negative real weights to the edges
of $H$ such that for every vertex of $H$ the sum of weights of the edges containing it
is exactly 1, see [Ba23] for details.
(1.6) Numerical integration in higher dimensions. Computationally efficient
numerical integration of multivariate functions is an old problem that is often asso-
ciated with “the curse of dimensionality”. The most spectacular success is achieved
for integration of log-concave functions on Rn via the Monte Carlo Markov Chain
method, see [LV07] for a survey. Deterministic methods achieved much less success.
In [GS24], the authors, using the deterministic decay of correlations approach,
considered integration of functions over the unit cube $[0, 1]^n$. The model of [GS24] fits
the setup of our Theorem 1.1, if we choose $X_1 = \ldots = X_n = [0, 1]$ endowed with
the Lebesgue probability measure. The results of [GS24] appear to be weaker than
our Theorem 1.1, in the dependence on both parameters r and c, as well as in the
class of allowed functions φi .
Theorem 1.2 allows us to integrate efficiently some functions that are decidedly
not log-concave, for example if we choose φi (x) = |xi | for some i, thus reaching
outside the realm of functions efficiently integrated by randomized methods.
(1.7) Other applications. Zero-free regions of partition functions of the type
covered by Theorem 1.1 turn out to be relevant to the decay of correlations [Ga23],
[Re23], to the mixing time of Markov Chains [Ch+22], to the validity of the Central
Limit Theorem for combinatorial structures [MS19], as well as to other related
algorithmic applications [J+22].

2. Proof of Theorem 1.1


We start with a lemma.
(2.1) Lemma. Let $X$ be a probability space and let $f: X \longrightarrow \mathbb{C}$ be an integrable
function. Suppose that $f(x) \ne 0$ for all $x \in X$ and, moreover, for any two points
$x, y \in X$, the angle between $f(x)$ and $f(y)$, considered as non-zero vectors in $\mathbb{R}^2 = \mathbb{C}$,
does not exceed $\theta$ for some $0 \le \theta < 2\pi/3$. Then

$$|\mathbf{E}\, f| \ge \left(\cos \frac{\theta}{2}\right) \mathbf{E}\, |f|.$$

Proof. This is Lemma 3.3 from [BR19], see also Lemma 3.6.3 of [Ba16].
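
A quick numerical illustration of Lemma 2.1 on a finite probability space with the uniform measure (the sample values and the helper name `lemma_2_1_sides` are assumptions of this sketch). The second example, two unit vectors at angle exactly θ, shows that the factor cos(θ/2) cannot be improved.

```python
import cmath
import math

def lemma_2_1_sides(values, theta):
    """Return (|E f|, cos(theta/2) * E|f|) for f uniform on the finite list `values`."""
    mean = sum(values) / len(values)
    mean_abs = sum(abs(v) for v in values) / len(values)
    return abs(mean), math.cos(theta / 2) * mean_abs

theta = 1.0
# Three non-zero values whose arguments 0, 0.5, 1.0 pairwise differ by at most theta.
values = [cmath.rect(1.0, 0.0), cmath.rect(2.0, 0.5), cmath.rect(0.5, 1.0)]
lhs, rhs = lemma_2_1_sides(values, theta)

# Extremal case: two unit vectors at angle exactly theta give equality.
tight_lhs, tight_rhs = lemma_2_1_sides([cmath.rect(1.0, 0.0), cmath.rect(1.0, theta)], theta)
print(lhs, rhs, tight_lhs, tight_rhs)
```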
We will need some technical inequalities.
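(2.2) Lemma. The following inequalities hold:
(1)

$$\left(\cos \frac{\alpha}{\sqrt{k}}\right)^k \ge \cos \alpha \quad \text{for} \quad 0 \le \alpha \le \frac{\pi}{2} \quad \text{and} \quad k \ge 1;$$

(2)

$$\sin(\tau \alpha) \ge \tau \sin \alpha \quad \text{for} \quad 0 \le \alpha \le \frac{\pi}{2} \quad \text{and} \quad 0 \le \tau \le 1;$$

(3)

$$|e^z - 1| \le 2|z| \quad \text{for} \quad z \in \mathbb{C} \quad \text{such that} \quad |z| \le 1.$$

Proof. The inequality of Part (1) is proved, for example, in [Ba23]. To prove the
inequality of Part (2), let

$$f(\alpha) = \sin(\tau \alpha) \quad \text{and} \quad g(\alpha) = \tau \sin \alpha.$$

Then $f(0) = g(0) = 0$ and

$$f'(\alpha) = \tau \cos(\tau \alpha) \ge \tau \cos \alpha = g'(\alpha),$$

from which the proof follows. To prove the inequality of Part (3), we note that

$$|e^z - 1| = \left|\sum_{k=1}^{\infty} \frac{z^k}{k!}\right| \le \sum_{k=1}^{\infty} \frac{|z|^k}{k!},$$

and hence it suffices to check the inequality assuming that $z$ is non-negative real.
Since the function

$$\frac{e^z - 1}{z} = \sum_{k=0}^{\infty} \frac{z^k}{(k+1)!}$$

is increasing for $z \ge 0$, it suffices to check the inequality at $z = 1$, where it states
that $|e - 1| \le 2$.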
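
The three inequalities of Lemma 2.2 are easy to spot-check numerically. The sketch below evaluates them on a finite grid; the grid size, the tolerance and the function name are assumptions, and a grid check is of course not a proof.

```python
import cmath
import math

def check_lemma_2_2(grid=50, tol=1e-12):
    """Spot-check Parts (1)-(3) of Lemma 2.2 on a finite grid; True if no violation is found."""
    ok = True
    for a in range(grid + 1):
        alpha = (math.pi / 2) * a / grid
        for k in (1, 2, 5, 50):
            # Part (1): cos(alpha/sqrt(k))^k >= cos(alpha)
            ok &= math.cos(alpha / math.sqrt(k)) ** k >= math.cos(alpha) - tol
        for t in range(grid + 1):
            tau = t / grid
            # Part (2): sin(tau*alpha) >= tau*sin(alpha)
            ok &= math.sin(tau * alpha) >= tau * math.sin(alpha) - tol
    for a in range(grid + 1):
        for t in range(grid):
            z = cmath.rect(a / grid, 2 * math.pi * t / grid)   # points of the unit disc
            # Part (3): |e^z - 1| <= 2|z|
            ok &= abs(cmath.exp(z) - 1) <= 2 * abs(z) + tol
    return ok

print(check_lemma_2_2())
```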
Let $X_1, \ldots, X_n$ be probability spaces and let $f: X_1 \times \ldots \times X_n \longrightarrow \mathbb{C}$ be an
integrable function. For a subset $S \subset \{1, \ldots, n\}$, by $\mathbf{E}_S f$ we denote the conditional
expectation obtained by integrating $f$ over the coordinates $x_j \in X_j$ with $j \in S$.
Thus $\mathbf{E}_S f$ is a function of the coordinates $x_j$ with $j \notin S$. In particular, for $S = \emptyset$,
we have $\mathbf{E}_S f = f$ and for $S = \{1, \ldots, n\}$, we have $\mathbf{E}_S f = \mathbf{E}\, f$.
Now we are ready to prove Theorem 1.1.
(2.3) Proof of Theorem 1.1. Let

$$\theta = \frac{\pi}{2\sqrt{r-1}}. \tag{2.3.1}$$

For a subset $S \subset \{1, \ldots, n\}$, we prove by induction on $|S|$ the following

(2.3.2) Statement: Let functions $\phi_i$, $f$ and a complex number $\lambda$ be as in Theorem
1.1. For every $S \subset \{1, \ldots, n\}$, we have $\mathbf{E}_S e^{\lambda f} \ne 0$, by which we mean that
$\mathbf{E}_S e^{\lambda f} \ne 0$ for any choice of the coordinates $x_j \in X_j$ with $j \notin S$. Moreover, for
every $j \notin S$, the value of $\mathbf{E}_S e^{\lambda f} \ne 0$, considered as a vector in $\mathbb{R}^2 = \mathbb{C}$, rotates
by not more than an angle of $\theta$ when only the $x_j$ coordinate of $x = (x_1, \ldots, x_n)$
changes, while all other coordinates stay the same.

Once we have (2.3.2) for $S = \{1, \ldots, n\}$, we conclude that $\mathbf{E}\, e^{\lambda f} \ne 0$, which is
what we want to prove.
Suppose that $|S| = 0$, so that $S = \emptyset$ and

$$\mathbf{E}_S e^{\lambda f} = e^{\lambda f} = \exp\left\{\lambda \sum_{i=1}^m \phi_i\right\}.$$

Let $x', x'' \in X$ be two points that differ only in the $x_j$ coordinate. Since at most $c$
of the functions $\phi_i$ depend on the coordinate $x_j$ and each function $\phi_i$ is 1-Lipschitz,
we have

$$|\lambda f(x') - \lambda f(x'')| = |\lambda| |f(x') - f(x'')| \le |\lambda| c \le \frac{1}{3\sqrt{r-1}} < \theta,$$

which establishes the base of the induction.


Suppose now that Statement 2.3.2 holds for every $S$ with $|S| \le k$, where $0 \le
k < n$. Let us choose an arbitrary $S \subset \{1, \ldots, n\}$ such that $|S| = k + 1$ and an
index $j \in S$. Applying Lemma 2.1 and the induction hypothesis, we conclude that

$$\left|\mathbf{E}_S e^{\lambda f}\right| = \left|\mathbf{E}_{\{j\}} \mathbf{E}_{S \setminus \{j\}} e^{\lambda f}\right| \ge \left(\cos \frac{\theta}{2}\right) \mathbf{E}_{\{j\}} \left|\mathbf{E}_{S \setminus \{j\}} e^{\lambda f}\right|. \tag{2.3.3}$$

It follows that

$$\mathbf{E}_S e^{\lambda f} \ne 0.$$

Assuming that $k + 1 < n$, let us pick some index not in $S$, without loss of generality
index $n$. We need to prove that as the coordinate $x_n$ changes from some value $x_n'$
to some value $x_n''$, while other coordinates remain the same, the value of $\mathbf{E}_S e^{\lambda f}$
rotates through an angle of at most $\theta$. Let

$$I = \{i : \phi_i \ \text{depends on} \ x_n\}, \quad \text{so} \quad |I| \le c.$$

Without loss of generality, $I \ne \emptyset$. For each $i \in I$, we define two functions $\phi_i'$ and $\phi_i''$,
obtained from $\phi_i$ by fixing the coordinate $x_n$ to $x_n'$ and $x_n''$ respectively. Although
$\phi_i'$ and $\phi_i''$ are functions of the first $n - 1$ coordinates $x_1, \ldots, x_{n-1}$ of $x \in X$, we
formally consider them as functions on $X$, by ignoring the last coordinate $x_n$.
Since each function $\phi_i$ is 1-Lipschitz in the Hamming metric, we have

$$|\phi_i'(x) - \phi_i''(x)| \le 1 \quad \text{for all} \quad x \in X \quad \text{and} \quad i \in I. \tag{2.3.4}$$

Thus the value of $\mathbf{E}_S e^{\lambda f}$, where we fix $x_n = x_n'$, is

$$\mathbf{E}_S \exp\left\{\lambda \sum_{i \in I} \phi_i' + \lambda \sum_{i \notin I} \phi_i\right\}, \tag{2.3.5}$$

while the value of $\mathbf{E}_S e^{\lambda f}$, where we fix $x_n = x_n''$, is

$$\mathbf{E}_S \exp\left\{\lambda \sum_{i \in I} \phi_i'' + \lambda \sum_{i \notin I} \phi_i\right\}. \tag{2.3.6}$$

We pass from (2.3.5) to (2.3.6) step by step, replacing one $\phi_i'$ by $\phi_i''$ at each step.
Our goal is to prove that at each step, the expectation rotates by at most $\theta/c$. Once
we prove that, it would follow that the expectation $\mathbf{E}_S e^{\lambda f}$ rotates by at most $\theta$,
when we replace the value of $x_n = x_n'$ by $x_n = x_n''$.
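Let us pick an index in $I$, without loss of generality index $m$. We define

$$f' = \phi_m' + \psi \quad \text{and} \quad f'' = \phi_m'' + \psi, \tag{2.3.7}$$

where $\psi$ is the sum of some functions $\phi_i'$, $\phi_i''$ and $\phi_i$, where for each $i \ne m$ we select
exactly one of the three functions $\phi_i'$, $\phi_i''$ or $\phi_i$ into the sum for $\psi$. Hence our goal
is to show that the angle between $\mathbf{E}_S e^{\lambda f'}$ and $\mathbf{E}_S e^{\lambda f''}$ does not exceed $\theta/c$.
Since each of the functions $\phi_i'$ and $\phi_i''$ is obtained from $\phi_i$ by specifying some value
of $x_n$, we can apply the induction hypothesis both to $f'$ and to $f''$. Let $S_0 \subset S$
be the set of indices $j \in S$ such that $\phi_m'$ and $\phi_m''$ depend on $x_j$. In particular,
$|S_0| \le r - 1$.
If $S_0 = \emptyset$, then

$$\mathbf{E}_S e^{\lambda f'} = e^{\lambda \phi_m'} \mathbf{E}_S e^{\lambda \psi} \quad \text{and} \quad \mathbf{E}_S e^{\lambda f''} = e^{\lambda \phi_m''} \mathbf{E}_S e^{\lambda \psi}.$$

Using (2.3.4), we conclude the angle between $\mathbf{E}_S e^{\lambda f'} \ne 0$ and $\mathbf{E}_S e^{\lambda f''} \ne 0$ does
not exceed

$$|\lambda \phi_m''(x) - \lambda \phi_m'(x)| \le |\lambda| \le \frac{1}{3c\sqrt{r-1}} < \frac{\theta}{c}.$$

Suppose now that $S_0 \ne \emptyset$. Since $\phi_m'$ and $\phi_m''$ do not depend on the coordinates
$x_j$ with $j \notin S_0$, from (2.3.7) we have

$$\begin{aligned}
\mathbf{E}_S e^{\lambda f''} &= \mathbf{E}_S \left(e^{\lambda \phi_m''} e^{\lambda \psi}\right) = \mathbf{E}_S \left(e^{\lambda \phi_m'' - \lambda \phi_m'} e^{\lambda \phi_m' + \lambda \psi}\right) = \mathbf{E}_S \left(e^{\lambda \phi_m'' - \lambda \phi_m'} e^{\lambda f'}\right) \\
&= \mathbf{E}_{S_0} \left(e^{\lambda \phi_m'' - \lambda \phi_m'} \mathbf{E}_{S \setminus S_0} e^{\lambda f'}\right)
\end{aligned}$$

and hence

$$\begin{aligned}
\mathbf{E}_S e^{\lambda f''} - \mathbf{E}_S e^{\lambda f'} &= \mathbf{E}_{S_0} \left(e^{\lambda \phi_m'' - \lambda \phi_m'} \mathbf{E}_{S \setminus S_0} e^{\lambda f'}\right) - \mathbf{E}_{S_0} \mathbf{E}_{S \setminus S_0} e^{\lambda f'} \\
&= \mathbf{E}_{S_0} \left(\left(e^{\lambda \phi_m'' - \lambda \phi_m'} - 1\right) \mathbf{E}_{S \setminus S_0} e^{\lambda f'}\right).
\end{aligned}$$

By (2.3.4) and by Part (3) of Lemma 2.2, we have

$$\left|e^{\lambda \phi_m'' - \lambda \phi_m'} - 1\right| \le 2|\lambda|.$$

Therefore,

$$\left|\mathbf{E}_S e^{\lambda f''} - \mathbf{E}_S e^{\lambda f'}\right| \le 2|\lambda| \, \mathbf{E}_{S_0} \left|\mathbf{E}_{S \setminus S_0} e^{\lambda f'}\right|. \tag{2.3.8}$$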
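Iterating (2.3.3) with $f$ replaced by $f'$, we obtain that

$$\begin{aligned}
\left|\mathbf{E}_S e^{\lambda f'}\right| = \left|\mathbf{E}_{S_0} \mathbf{E}_{S \setminus S_0} e^{\lambda f'}\right| &\ge \left(\cos \frac{\theta}{2}\right)^{|S_0|} \mathbf{E}_{S_0} \left|\mathbf{E}_{S \setminus S_0} e^{\lambda f'}\right| \\
&\ge \left(\cos \frac{\theta}{2}\right)^{r-1} \mathbf{E}_{S_0} \left|\mathbf{E}_{S \setminus S_0} e^{\lambda f'}\right|.
\end{aligned} \tag{2.3.9}$$

Combining (2.3.8) and (2.3.9), we conclude that

$$\frac{\left|\mathbf{E}_S e^{\lambda f''} - \mathbf{E}_S e^{\lambda f'}\right|}{\left|\mathbf{E}_S e^{\lambda f'}\right|} \le \frac{2|\lambda|}{\cos^{r-1}(\theta/2)}.$$

Recalling the bound for $|\lambda|$, formula (2.3.1) for $\theta$ and using Part (1) of Lemma 2.2,
we obtain

$$\begin{aligned}
\frac{\left|\mathbf{E}_S e^{\lambda f''} - \mathbf{E}_S e^{\lambda f'}\right|}{\left|\mathbf{E}_S e^{\lambda f'}\right|} &\le \frac{2}{3c\sqrt{r-1}} \left(\cos \frac{\pi}{4\sqrt{r-1}}\right)^{-(r-1)} \\
&\le \frac{2}{3c\sqrt{r-1}} \cdot \frac{1}{\cos(\pi/4)} = \frac{2\sqrt{2}}{3c\sqrt{r-1}}.
\end{aligned}$$

It follows now that the angle, call it $\alpha$, between $\mathbf{E}_S e^{\lambda f'}$ and $\mathbf{E}_S e^{\lambda f''}$ is acute and
that

$$\sin \alpha \le \frac{2\sqrt{2}}{3c\sqrt{r-1}}.$$

From (2.3.1) and Part (2) of Lemma 2.2, we have

$$\sin \frac{\theta}{c} = \sin \frac{\pi}{2c\sqrt{r-1}} \ge \frac{1}{c\sqrt{r-1}} \sin \frac{\pi}{2} = \frac{1}{c\sqrt{r-1}} \ge \sin \alpha.$$

Hence the angle between $\mathbf{E}_S e^{\lambda f'}$ and $\mathbf{E}_S e^{\lambda f''}$ indeed does not exceed $\theta/c$ and
$\mathbf{E}_S e^{\lambda f}$ rotates by not more than an angle of $\theta$ when the value of one of the
coordinates $x_j$ with $j \notin S$ changes, while the others remain the same. This completes
the induction step in proving Statement 2.3.2.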
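
The numerical chain at the end of the induction step can be evaluated directly for given r and c. The sketch below returns the four quantities that appear in it, in increasing order, taking |λ| at its largest allowed value; the function name `rotation_chain` is an assumption of this sketch.

```python
import math

def rotation_chain(r, c):
    """The chain of bounds from the induction step at |lambda| = 1/(3c*sqrt(r-1)).

    Returns (2|lambda| / cos(theta/2)^(r-1), 2*sqrt(2)/(3c*sqrt(r-1)),
             1/(c*sqrt(r-1)), sin(theta/c)), which should be non-decreasing.
    """
    s = math.sqrt(r - 1)
    theta = math.pi / (2 * s)                         # formula (2.3.1)
    lam = 1 / (3 * c * s)
    ratio_bound = 2 * lam / math.cos(theta / 2) ** (r - 1)
    middle = 2 * math.sqrt(2) / (3 * c * s)
    return ratio_bound, middle, 1 / (c * s), math.sin(theta / c)

print(rotation_chain(2, 1))
print(rotation_chain(10, 3))
```

For r = 2 the first two quantities coincide, and for r = 2, c = 1 the last two coincide as well, which shows where the constant 3 in the bound for |λ| has little slack in this argument.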
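It remains to prove (1.1.2) assuming (1.1.1). Clearly, we have

$$|f(x)| \le mL \quad \text{for all} \quad x \in X$$

and the upper bound in (1.1.2) follows. To prove the lower bound, iterating (2.3.3),
we get

$$\left|\mathbf{E}\, e^{\lambda f}\right| \ge \left(\cos \frac{\theta}{2}\right)^n \mathbf{E} \left|e^{\lambda f}\right| \ge e^{-|\lambda| m L} \left(\cos \frac{\pi}{4\sqrt{r-1}}\right)^n,$$

as required.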
(2.4) Remark. Let $X_1, \ldots, X_n$ and $X = X_1 \times \ldots \times X_n$ be probability spaces and
let $\phi_1, \ldots, \phi_m: X \longrightarrow \mathbb{C}$ be measurable functions. We assume that $\phi_i$ is 1-Lipschitz
for all $i = 1, \ldots, m$ and that each $\phi_i(x)$ depends on at most $r \ge 2$ coordinates of
$x = (x_1, \ldots, x_n)$, $x \in X$. Suppose further that $\lambda_1, \ldots, \lambda_m$ are complex numbers
such that

$$\sum_{i:\ \phi_i \ \text{depends on} \ x_j} |\lambda_i| \le \frac{1}{3\sqrt{r-1}} \quad \text{for} \quad j = 1, \ldots, n.$$

Then a straightforward modification of our proof in Section 2.3 implies that

$$\mathbf{E} \exp\left\{\sum_{i=1}^m \lambda_i \phi_i\right\} \ne 0.$$

Moreover, if (1.1.1) holds, then

$$\exp\left\{L \sum_{i=1}^m |\lambda_i|\right\} \ge \left|\mathbf{E} \exp\left\{\sum_{i=1}^m \lambda_i \phi_i\right\}\right| \ge \exp\left\{-L \sum_{i=1}^m |\lambda_i|\right\} \left(\cos \frac{\pi}{4\sqrt{r-1}}\right)^n.$$

3. Proof of Theorem 1.2


First, we consider the case of bounded functions $\phi_i$, so that the condition (1.2.1)
is satisfied.
We are going to use Theorem 1.1. For an integer $N \ge 1$, we consider $nN$
probability spaces $X_1 = \ldots = X_{nN} = \{-1, 1\}$ and their direct product $X =
\{-1, 1\}^{nN}$, all endowed with the uniform probability measure. We write a point
$x \in X$ as $x = (x_{jk})$, where $j = 1, \ldots, n$ and $k = 1, \ldots, N$. For $i = 1, \ldots, m$, we
define functions $\psi_i: \{-1, 1\}^{nN} \longrightarrow \mathbb{C}$ by

$$\psi_i(x) = \frac{\sqrt{N}}{2} \phi_i\left(\frac{x_{11} + \ldots + x_{1N}}{\sqrt{N}}, \ldots, \frac{x_{j1} + \ldots + x_{jN}}{\sqrt{N}}, \ldots, \frac{x_{n1} + \ldots + x_{nN}}{\sqrt{N}}\right)$$

and

$$g_N = \sum_{i=1}^m \psi_i.$$

Since the functions $\phi_i$ are 1-Lipschitz in the $\ell^1$ metric of $\mathbb{R}^n$, the functions $\psi_i$ are
1-Lipschitz in the Hamming metric of $X$. Moreover, each function $\psi_i$ depends on
not more than $r_N = rN$ coordinates and at most $c$ functions $\psi_i$ depend on any
particular coordinate $x_{jk}$. Finally, from (1.2.1), we conclude that

$$|\psi_i(x)| \le \frac{\sqrt{N}}{2} L \quad \text{for} \quad i = 1, \ldots, m.$$

Therefore, from formula (1.1.2) of Theorem 1.1, we conclude that

$$\left|\mathbf{E}\, e^{\lambda_N g_N}\right| \ge \exp\left\{-\frac{1}{2} |\lambda_N| m \sqrt{N} L\right\} \left(\cos \frac{\pi}{4\sqrt{rN - 1}}\right)^{nN}$$

provided

$$|\lambda_N| \le \frac{1}{3c\sqrt{rN}}.$$

Therefore,

$$\left|\mathbf{E} \exp\left\{\lambda \frac{2}{\sqrt{N}} g_N\right\}\right| \ge \exp\{-|\lambda| m L\} \left(\cos \frac{\pi}{4\sqrt{rN - 1}}\right)^{nN} \tag{3.1}$$

provided

$$|\lambda| \le \frac{1}{6c\sqrt{r}}.$$
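By the Central Limit Theorem, as $N \longrightarrow \infty$, the random vector

$$\left(\frac{x_{11} + \ldots + x_{1N}}{\sqrt{N}}, \ldots, \frac{x_{j1} + \ldots + x_{jN}}{\sqrt{N}}, \ldots, \frac{x_{n1} + \ldots + x_{nN}}{\sqrt{N}}\right)$$

converges in distribution to the standard Gaussian measure in $\mathbb{R}^n$. Since the
functions $\phi_i$ are continuous and bounded, we have

$$\lim_{N \longrightarrow \infty} \mathbf{E} \exp\left\{\lambda \frac{2}{\sqrt{N}} g_N\right\} = \mathbf{E}\, e^{\lambda f},$$

where the expectation in the right hand side is taken with respect to the standard
Gaussian measure in $\mathbb{R}^n$, see, for example, Section 7.2 of [GS20]. Since

$$\lim_{N \longrightarrow \infty} \left(\cos \frac{\pi}{4\sqrt{rN - 1}}\right)^{nN} = \lim_{N \longrightarrow \infty} \left(1 - \frac{\pi^2}{32(rN - 1)}\right)^{nN} = \exp\left\{-\frac{\pi^2 n}{32r}\right\},$$

from (3.1) we obtain the lower bound in the inequality (1.2.2). The upper bound
in (1.2.2) is trivial.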
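
The limit that turns the cosine factor of (3.1) into the Gaussian factor of (1.2.2) can be checked numerically. In the sketch below, the function names and the sample values of n and r are assumptions chosen for illustration.

```python
import math

def cosine_power(n, r, N):
    """(cos(pi / (4*sqrt(r*N - 1))))**(n*N), the factor on the right hand side of (3.1)."""
    return math.cos(math.pi / (4 * math.sqrt(r * N - 1))) ** (n * N)

def gaussian_factor(n, r):
    """exp(-pi^2 * n / (32 * r)), the limiting factor in (1.2.2)."""
    return math.exp(-math.pi ** 2 * n / (32 * r))

n, r = 3, 2
for N in (10, 1000, 10 ** 6):
    print(N, cosine_power(n, r, N), gaussian_factor(n, r))
```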
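It remains to consider the general case of not necessarily bounded functions $\phi_i$.
Shifting, if necessary,

$$\phi_i := \phi_i - \phi_i(0) \quad \text{for} \quad i = 1, \ldots, m,$$

without loss of generality we assume that

$$\phi_i(0) = 0 \quad \text{for} \quad i = 1, \ldots, m.$$

Then

$$|\phi_i(x_1, \ldots, x_n)| \le \sum_{j=1}^n |x_j| \quad \text{for} \quad i = 1, \ldots, m. \tag{3.2}$$

For $L > 0$, we define the truncation of $\phi_i$ by

$$\phi_{i,L}(x) = \begin{cases} \phi_i(x) & \text{if} \ |\phi_i(x)| \le L \\ L\phi_i(x)/|\phi_i(x)| & \text{if} \ |\phi_i(x)| > L. \end{cases}$$

Then

$$|\phi_{i,L}(x) - \phi_{i,L}(y)| \le |\phi_i(x) - \phi_i(y)|,$$

so the functions $\phi_{i,L}$ satisfy the conditions of Theorem 1.2, and are bounded. Hence for

$$f_L = \sum_{i=1}^m \phi_{i,L}$$

we have

$$\mathbf{E}\, e^{\lambda f_L} \ne 0 \quad \text{provided} \quad |\lambda| \le \frac{1}{6c\sqrt{r-1}}.$$

From (3.2) it follows that

$$\lim_{L \longrightarrow +\infty} \mathbf{E}\, e^{\lambda f_L} = \mathbf{E}\, e^{\lambda f}$$

and that the convergence is uniform on any compact set of $\lambda$ in $\mathbb{C}$. Then by the
Hurwitz Theorem, see, for example, Section 1 of [Kr01], we have two options: either
$\mathbf{E}\, e^{\lambda f} = 0$ for all $\lambda \in \mathbb{C}$ or $\mathbf{E}\, e^{\lambda f} \ne 0$ for all $\lambda$ in the open disc $|\lambda| < 1/\left(6c\sqrt{r-1}\right)$.
Since for $\lambda = 0$ we clearly have $\mathbf{E}\, e^{\lambda f} = 1$, the first option is not realized.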

4. Optimality
Our goal is to show that the bound

$$|\lambda| \le \frac{1}{3c\sqrt{r-1}}$$

in Theorem 1.1 is optimal, up to an absolute constant factor, replacing ‘3’.

First, we show that the dependence on $r$ is optimal. We reverse engineer an
example where the bound is sharp from our proof of Theorem 1.2 in Section 3.
The functions

$$z \longmapsto \int_0^{+\infty} e^{zx} e^{-x^2/2} \, dx \quad \text{and} \quad z \longmapsto \int_{-\infty}^0 e^{zx} e^{-x^2/2} \, dx,$$
where we integrate over a real variable $x$, are non-constant entire functions of $z \in \mathbb{C}$
and hence the range of each, by the (little) Picard Theorem, is the whole plane $\mathbb{C}$, except
perhaps one point. Therefore, there are two points $u, v \in \mathbb{C}$ such that
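
$$\int_{-\infty}^0 e^{ux} e^{-x^2/2} \, dx + \int_0^{+\infty} e^{vx} e^{-x^2/2} \, dx = 0.$$

Let us define $\psi: \mathbb{R} \longrightarrow \mathbb{C}$ by

$$\psi(x) = \begin{cases} vx & \text{if} \ x \ge 0 \\ ux & \text{if} \ x < 0. \end{cases}$$

Then

$$\int_{-\infty}^{+\infty} e^{\psi(x)} e^{-x^2/2} \, dx = 0. \tag{4.1}$$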

It is easy to check that $\psi$ is Lipschitz, more precisely,

$$|\psi(x) - \psi(y)| \le \tau |x - y| \quad \text{where} \quad \tau = \max\{|u|, |v|\}.$$

For an $L > 0$, we consider the truncation

$$\psi_L(x) = \begin{cases} \psi(x) & \text{if} \ |\psi(x)| \le L \\ L\psi(x)/|\psi(x)| & \text{if} \ |\psi(x)| > L. \end{cases}$$

Then

$$|\psi_L(x) - \psi_L(y)| \le |\psi(x) - \psi(y)| \le \tau |x - y|, \tag{4.2}$$

so $\psi_L$ is also Lipschitz and, in addition, bounded.

Let us fix some $\rho > 1$. We have

$$\lim_{L \longrightarrow \infty} \int_{-\infty}^{+\infty} e^{z\psi_L(x)} e^{-x^2/2} \, dx = \int_{-\infty}^{+\infty} e^{z\psi(x)} e^{-x^2/2} \, dx \tag{4.3}$$

and from (4.2) it follows that the convergence is uniform in $z$ on all compact sets
in $\mathbb{C}$. Since the right hand side of (4.3) for $z = 0$ is equal to $\sqrt{2\pi} \ne 0$, while for $z = 1$ it is
equal to 0 by (4.1), by the Hurwitz Theorem, see, for example, Section 1 of [Kr01],
we conclude that for all sufficiently large $L$ we must have

$$\int_{-\infty}^{+\infty} e^{z\psi_L(x)} e^{-x^2/2} \, dx = 0 \quad \text{for some} \ z \ \text{with} \ |z| < \rho. \tag{4.4}$$
Let us pick some $L$ such that (4.4) holds. For an integer $n$, we consider $n$
probability spaces $X_1 = \ldots = X_n = \{-1, 1\}$ and their product $X = \{-1, 1\}^n$, all
endowed with the uniform probability measure. We define $\phi: X \longrightarrow \mathbb{C}$ by

$$\phi(x) = \frac{\sqrt{n}}{2\tau} \psi_L\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n x_i\right) \quad \text{where} \quad x = (x_1, \ldots, x_n). \tag{4.5}$$

It follows from (4.2) that $\phi$ is 1-Lipschitz in the Hamming metric of $\{-1, 1\}^n$. Hence
$\phi$ satisfies the conditions of Theorem 1.1 with $c = 1$ and $r = n$.
Since by the Central Limit Theorem the normalized sum

$$\frac{1}{\sqrt{n}} \sum_{i=1}^n x_i$$

converges in distribution to the standard Gaussian measure as $n$ grows, and the
function $\psi_L$ is continuous and bounded, we have

$$\mathbf{E} \exp\left\{z\psi_L\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n x_i\right)\right\} \longrightarrow \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{z\psi_L(x)} e^{-x^2/2} \, dx \tag{4.6}$$

and the convergence in (4.6) is uniform in $z$ on all compact subsets of $\mathbb{C}$, see,
for example, Section 7.2 of [GS20]. The right hand side of (4.6) is equal to 1 for
$z = 0$. Then by (4.4) and the Hurwitz Theorem, for all sufficiently large $n$ we must
have

$$\mathbf{E} \exp\left\{z\psi_L\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n x_i\right)\right\} = 0 \quad \text{for some} \ z \ \text{with} \ |z| < \rho.$$

Therefore, from (4.5) we conclude that for all sufficiently large $n$, there is $\lambda$ satisfying

$$|\lambda| < \frac{2\tau\rho}{\sqrt{n}} \quad \text{and} \quad \mathbf{E}\, e^{\lambda \phi} = 0,$$

which proves that the bound in Theorem 1.1 is indeed optimal in terms of $r$, up to
an absolute constant factor.
To prove that the dependence on c is optimal, for an integer k ≥ 1, we introduce
(4.7) fk (x) = φ(x) + . . . + φ(x),
| {z }
k times

where $\phi$ is defined by (4.5). While the parameter r remains the same for all $f_k$
defined by (4.7), the parameter c changes, c = k. Furthermore,
$$\mathbf{E}\, e^{\lambda f_k} = \mathbf{E}\, e^{(k\lambda)\phi},$$
and hence the zero-free region for λ should scale as
$$|\lambda| = O\left(\frac{1}{k}\right) = O\left(\frac{1}{c}\right).$$
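To spell out the scaling step, recall from (4.7) that $f_k$ is the sum of k copies of $\phi$, so that m = k, c = k and r = n. Combining this with the zero of $\mathbf{E}\, e^{\lambda\phi}$ found above gives, as a short worked derivation:

```latex
\begin{align*}
f_k = k\phi
  &\;\Longrightarrow\; \mathbf{E}\, e^{\lambda f_k} = \mathbf{E}\, e^{(k\lambda)\phi}, \\
\mathbf{E}\, e^{\lambda_0 \phi} = 0 \ \text{with}\ |\lambda_0| < \frac{2\tau\rho}{\sqrt{n}}
  &\;\Longrightarrow\; \mathbf{E}\, e^{\lambda f_k} = 0 \ \text{for}\ \lambda = \frac{\lambda_0}{k},
  \quad |\lambda| < \frac{2\tau\rho}{k\sqrt{n}} = \frac{2\tau\rho}{c\sqrt{r}}.
\end{align*}
```

Thus $\mathbf{E}\, e^{\lambda f_k}$ has a zero within distance $O(1/(c\sqrt{r}))$ of the origin, which matches the radius $(3c\sqrt{r-1})^{-1}$ of Theorem 1.1 up to a constant factor.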

Question: computational complexity. It would be interesting to find out
whether the dependence on r is optimal from the computational complexity point
of view, that is, whether for real-valued f and $\lambda \gg 1/(c\sqrt{r})$, the approximation of
$\mathbf{E}\, e^{\lambda f}$ becomes computationally difficult, possibly conditioned on P ≠ NP or another
commonly believed hypothesis. The argument with the “cloning” of f as in (4.7)
shows that the dependence on c is indeed optimal.

5. Approximations
Here we sketch how Theorem 1.1, respectively Theorem 1.2, allows us to approximate
$\mathbf{E}\, e^{\lambda f}$ provided
$$|\lambda| \le \frac{1-\delta}{3c\sqrt{r-1}}, \quad \text{respectively,} \quad |\lambda| \le \frac{1-\delta}{6c\sqrt{r-1}} \tag{5.1}$$

for some 0 < δ < 1, fixed in advance. The approach was used many times before,
in particular in [Ba17] in the context closest to ours.
The algorithm is based on the following result.
(5.2) Lemma. Let p(z) be a univariate polynomial of degree N in a complex variable
z. Suppose that for some β > 1, we have p(z) ≠ 0 for all z satisfying |z| < β
and let us choose a branch of g(z) = ln p(z) in the disc |z| < β. For an integer
k ≥ 1, let $T_k(z)$ be the Taylor polynomial of g(z) of degree k computed at z = 0, that
is,
$$T_k(z) = g(0) + \sum_{s=1}^k \frac{g^{(s)}(0)}{s!} z^s.$$
Then
$$|g(z) - T_k(z)| \le \frac{N}{(k+1)(\beta-1)\beta^k} \quad \text{for all } |z| \le 1.$$

Proof. This is Lemma 2.2.1 from [Ba16]. 


As follows from Lemma 5.2, to approximate g(1) by $T_k(1)$ within an additive
error $0 < \epsilon < 1$ and hence to approximate p(1) by $\exp\{T_k(1)\}$ within a relative
error $0 < \epsilon < 1$, it suffices to choose $k = O_\beta(\ln N - \ln \epsilon)$. Moreover, the derivatives
$g^{(s)}(0)$ can be computed from p(0) and the derivatives $p^{(s)}(0)$ for s = 1, . . . , k in
$O(k^2)$ time, see Section 2.2.2 of [Ba16].
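The passage from the derivatives of p to those of g = ln p can be sketched in a few lines of Python. This is one standard way to do it, assuming p is given by its coefficient list; it is not claimed to be the procedure of [Ba16]. Differentiating p = e^g gives p' = g'p, and comparing coefficients yields a triangular recurrence that costs $O(k^2)$ arithmetic operations. The function names and the test polynomial are illustrative.

```python
import cmath


def log_taylor_coeffs(a, k):
    """Return [g_0, ..., g_k], the Taylor coefficients at 0 of g(z) = ln p(z).

    `a` lists the coefficients of p(z) = a[0] + a[1] z + ..., with a[0] != 0,
    and g_s stands for g^{(s)}(0) / s!.
    """
    def coeff(i):
        return a[i] if i < len(a) else 0

    g = [0j] * (k + 1)
    g[0] = cmath.log(a[0])  # principal branch of ln p(0)
    for s in range(1, k + 1):
        # From p' = g' p:  s a_s = sum_{j=1}^{s} j g_j a_{s-j}.
        acc = sum(j * g[j] * coeff(s - j) for j in range(1, s))
        g[s] = (coeff(s) - acc / s) / a[0]
    return g


def lemma_bound(N, beta, k):
    """Right hand side of the estimate in Lemma 5.2."""
    return N / ((k + 1) * (beta - 1) * beta ** k)


# Illustrative polynomial p(z) = (1 - z/2)(1 + z/3) = 1 - z/6 - z^2/6,
# with roots 2 and -3, so that N = 2 and beta = 2 are admissible.
a = [1, -1 / 6, -1 / 6]
k = 10
T_k_at_1 = sum(log_taylor_coeffs(a, k))    # T_k(1)
error = abs(T_k_at_1 - cmath.log(sum(a)))  # |g(1) - T_k(1)|, since p(1) = sum(a)
```

For this polynomial the computed error can be compared with lemma_bound(2, 2, k), which decays geometrically in k as the lemma predicts.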
In the context of Theorem 1.1, let us pick an arbitrary $x_0 \in X$ and let $\gamma = f(x_0)$.
In the context of Theorem 1.2, let $\gamma = f(0)$. Since
$$\mathbf{E}\, e^{\lambda(f-\gamma)} = e^{-\lambda\gamma}\, \mathbf{E}\, e^{\lambda f},$$
without loss of generality, we assume that $f(x_0) = 0$ (Theorem 1.1) or $f(0) = 0$
(Theorem 1.2).
Then in the context of Theorem 1.1, respectively Theorem 1.2, we have
$$|f(x)| \le nm \ \text{ for all } x \in X, \quad \text{respectively,} \quad |f(x_1, \ldots, x_n)| \le m \sum_{j=1}^n |x_j| \ \text{ for all } (x_1, \ldots, x_n) \in \mathbb{R}^n. \tag{5.3}$$

Given a complex λ satisfying (5.1), we consider a function


$$z \longmapsto \mathbf{E}\, e^{\lambda z f} = \mathbf{E} \exp\left\{ \lambda z \sum_{i=1}^m \phi_i \right\}. \tag{5.4}$$

While (5.4) is not a polynomial, we can approximate it closely enough by its Taylor
polynomial $p_N(z)$ and then use Lemma 5.2 to approximate $\ln p_N(z)$ by a polynomial
of low degree. Using (5.3) and the bounds (1.1.2) and (1.2.2), for a given
$0 < \epsilon < 1$, one can compute
$$N = \left( (m + n) \ln \frac{1}{\epsilon} \right)^{O(1)},$$
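such that the Taylor polynomial of (5.4),
$$p_N(z) = \sum_{s=0}^N \frac{\lambda^s z^s}{s!}\, \mathbf{E} f^s = \sum_{s=0}^N \frac{\lambda^s z^s}{s!}\, \mathbf{E} \left( \sum_{i=1}^m \phi_i \right)^s$$
satisfies
$$p_N(z) \ne 0 \quad \text{provided} \quad |z| \le \frac{1}{1-\delta}$$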
and $p_N(1)$ approximates $\mathbf{E}\, e^{\lambda f}$ within a relative error of $\epsilon/3$ (in the context of
Theorem 1.2, we replace $\phi_i$ by their appropriate truncations), see [Ba17] for estimates.
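As an illustration of the Taylor polynomial of (5.4), here is a brute-force Python sketch in the setting of Theorem 1.1. It assumes n is small enough to enumerate $\{-1, 1\}^n$ and uses a hypothetical f built from pair interactions $\phi_i(x) = x_i x_{i+1}/2$ along a path (so r = 2 and c = 2); none of these choices come from the paper.

```python
import cmath
from itertools import product
from math import factorial


def f(x):
    """Hypothetical f = phi_1 + ... + phi_{n-1} with phi_i(x) = x_i x_{i+1} / 2."""
    return sum(x[i] * x[i + 1] / 2 for i in range(len(x) - 1))


def moments(f, n, N):
    """Return [E f^0, ..., E f^N] under the uniform measure on {-1, 1}^n."""
    points = list(product((-1, 1), repeat=n))
    return [sum(f(x) ** s for x in points) / len(points) for s in range(N + 1)]


def taylor_polynomial(lam, z, mom):
    """Evaluate p_N(z) = sum_s (lam z)^s / s! * E f^s, where N = len(mom) - 1."""
    return sum((lam * z) ** s / factorial(s) * mom[s] for s in range(len(mom)))


def exact_expectation(f, n, lam):
    """Evaluate E exp(lam f) by direct enumeration."""
    points = list(product((-1, 1), repeat=n))
    return sum(cmath.exp(lam * f(x)) for x in points) / len(points)


n, N = 6, 12
lam = 0.1 + 0.1j  # |lam| < 1/6, the radius of Theorem 1.1 for c = 2 and r = 2
p_N_at_1 = taylor_polynomial(lam, 1, moments(f, n, N))
gap = abs(p_N_at_1 - exact_expectation(f, n, lam))
```

Here gap measures how well $p_N(1)$ reproduces the exact expectation; the enumeration costs $2^n$ evaluations of f, so this is only a sanity check and not the algorithm sketched in this section.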
Using Lemma 5.2 with $\beta = (1 - \delta)^{-1}$, we further approximate $p_N(1)$ within a
relative error of $\epsilon/3$ using only $k = O_\delta(\ln(m + n) - \ln \epsilon)$ first derivatives,
$$p_N^{(s)}(0) = \lambda^s\, \mathbf{E} \left( \sum_{i=1}^m \phi_i \right)^s \quad \text{for } s = 1, \ldots, k,$$
which in turn reduces to computing $m^{O_\delta(\ln(m+n) - \ln \epsilon)}$ different expectations
$$\mathbf{E} \left( \phi_{i_1} \cdots \phi_{i_k} \right),$$
see, for example, [Ba17] for a similar computation.
This results in a quasi-polynomial algorithm for approximating $\mathbf{E}\, e^{\lambda f}$. As we
remarked, Patel and Regts [PR17], see also [L+19], developed methods for faster
computation of the relevant derivatives of g(z) = ln p(z), which in some cases
allow one to obtain a genuinely polynomial algorithm of $((m + n)/\epsilon)^{O(1)}$ complexity,
provided the parameters r and c are fixed in advance.
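The whole scheme can be tried on a toy instance. The Python sketch below uses only the first k moments $\mathbf{E} f^s$, converts them into Taylor coefficients of the logarithm via p' = g'p, and exponentiates the truncated series. It makes several assumptions that are not in the paper: f is a hypothetical sum of pair interactions on a path, the moments are obtained by enumerating $\{-1, 1\}^n$ rather than by expanding into expectations of products of the $\phi_i$, and all names are illustrative.

```python
import cmath
from itertools import product
from math import factorial


def f(x):
    """Hypothetical f on {-1, 1}^n: the sum of phi_i(x) = x_i x_{i+1} / 2."""
    return sum(x[i] * x[i + 1] / 2 for i in range(len(x) - 1))


def approximate_expectation(f, n, lam, k):
    """Approximate E exp(lam f) from the moments E f^s with s <= k only."""
    points = list(product((-1, 1), repeat=n))
    # a[s] = p^{(s)}(0) / s! = lam^s E f^s / s!; enumeration stands in for
    # expanding E f^s into expectations of products of the phi_i.
    a = [lam ** s * sum(f(x) ** s for x in points) / len(points) / factorial(s)
         for s in range(k + 1)]
    # Taylor coefficients of g = ln p from p' = g' p; g[0] = ln a[0] = ln 1 = 0.
    g = [0j] * (k + 1)
    for s in range(1, k + 1):
        acc = sum(j * g[j] * a[s - j] for j in range(1, s))
        g[s] = (a[s] - acc / s) / a[0]
    return cmath.exp(sum(g))  # exp{T_k(1)}


def exact_expectation(f, n, lam):
    """Evaluate E exp(lam f) by direct enumeration, for comparison."""
    points = list(product((-1, 1), repeat=n))
    return sum(cmath.exp(lam * f(x)) for x in points) / len(points)


n, k = 6, 8
lam = 0.1 + 0.1j  # inside the disc |lam| <= 1/6 of Theorem 1.1 for c = 2, r = 2
approx = approximate_expectation(f, n, lam, k)
exact = exact_expectation(f, n, lam)
```

Comparing approx with exact for increasing k shows how quickly the truncated logarithm converges on this instance; in the actual algorithm, the cost is dominated by computing the k moments.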
References
[B+22] Z. Bai, Y. Cao, and H. Wang, Zero-freeness and approximation of real Boolean Holant
problems, Theoretical Computer Science 917 (2022), 12–30.
[B+23] A. Barg, P. Boyvalenkov, and M. Stoyanova, Bounds for the sum of distances of
spherical sets of small size, Discrete Mathematics 346 (2023), no. 5, Paper No. 113346,
19 pp.
[Ba16] A. Barvinok, Combinatorics and Complexity of Partition Functions, Algorithms and
Combinatorics 30, Springer, Cham, 2016.
[Ba17] A. Barvinok, Computing the partition function of a polynomial on the Boolean cube,
A Journey through Discrete Mathematics, Springer, Cham, 2017, pp. 135–164.
[BR19] A. Barvinok and G. Regts, Weighted counting of solutions to sparse systems of
equations, Combinatorics, Probability & Computing 28 (2019), no. 5, 696–719.
[Ba23] A. Barvinok, Smoothed counting of 0-1 points in polyhedra, Random Structures &
Algorithms 63 (2023), no. 1, 27–60.
[B+21] F. Bencs, E. Davies, V. Patel, and G. Regts, On zero-free regions for the anti-ferromagnetic
Potts model on bounded-degree graphs, Annales de l’Institut Henri Poincaré D 8 (2021),
no. 3, 459–489.
[Ca+22] K. Casel, P. Fischbeck, T. Friedrich, A. Göbel, and J. A. Gregor Lagodzinski,
Zeros and approximations of Holant polynomials on the complex plane, Computational
Complexity 31 (2022), no. 2, Paper No. 11, 52 pp.
[Ch+22] Z. Chen, K. Liu and E. Vigoda, Spectral independence via stability and applications to
Holant-type problems, 2021 IEEE 62nd Annual Symposium on Foundations of Computer
Science–FOCS 2021, IEEE Computer Society, Los Alamitos, CA, 2022, pp. 149–160.
[FV18] S. Friedli and Y. Velenik, Statistical Mechanics of Lattice Systems. A concrete
mathematical introduction, Cambridge University Press, Cambridge, 2018.
[GG16] A. Galanis and L. A. Goldberg, The complexity of approximately counting in 2-spin
systems on k-uniform bounded-degree hypergraphs, Information and Computation 251
(2016), 36–66.
[Ga23] D. Gamarnik, Correlation decay and the absence of zeros property of partition
functions, Random Structures & Algorithms 62 (2023), no. 1, 155–180.
[GS24] D. Gamarnik and D. Smedira, Integrating high-dimensional functions deterministically,
preprint arXiv:2402.08232 (2024).
[G+22] A. Galanis, L. A. Goldberg, and A. Herrera-Poyatos, The complexity of approximating the
complex-valued Ising model on bounded degree graphs, SIAM Journal on Discrete
Mathematics 36 (2022), no. 3, 2159–2204.
[GS20] G.R. Grimmett and D.R. Stirzaker, Probability and Random Processes. Fourth edition,
Oxford University Press, Oxford, 2020.
[G+21] H. Guo, C. Liao, P. Lu and C. Zhang, Zeros of Holant problems: locations and
algorithms, ACM Transactions on Algorithms 17 (2021), no. 1, Art. 4, 25 pp.
[HJ93] P. de la Harpe and V. F. R. Jones, Graph invariants related to statistical mechanical
models: examples and problems, Journal of Combinatorial Theory. Series B 57 (1993),
no. 2, 207–227.
[J+22] V. Jain, W. Perkins, A. Sah, and M. Sawhney, Approximate counting and sampling via
local central limit theorems, STOC ’22–Proceedings of the 54th Annual ACM SIGACT
Symposium on Theory of Computing, Association for Computing Machinery (ACM),
New York, 2022, pp. 1473–1486.
[Je24] M. Jenssen, The cluster expansion in combinatorics, Surveys in Combinatorics 2024
(F. Fischer and R. Johnson, eds.), London Math. Soc. Lecture Note Ser., vol. 493,
Cambridge University Press, Cambridge, 2024, pp. 55–88.
[Kr04] S.G. Krantz, Complex Analysis: the Geometric Viewpoint. Second edition, Carus
Mathematical Monographs, 23, Mathematical Association of America, Washington,
DC, 2004.
[LY52] T.D. Lee and C.N. Yang, Statistical theory of equations of state and phase transitions.
II. Lattice gas and Ising model, Physical Review (2) 87 (1952), 410–419.
[L+19] J. Liu, A. Sinclair and P. Srivastava, The Ising partition function: zeros and
deterministic approximation, Journal of Statistical Physics 174 (2019), no. 2, 287–315.
[LV07] L. Lovász and S. Vempala, The geometry of logconcave functions and sampling
algorithms, Random Structures & Algorithms 30 (2007), no. 3, 307–358.
[L+16] P. Lu, K. Yang and C. Zhang, FPTAS for hardcore and Ising models on hypergraphs,
Art. No. 51, 14 pp., 33rd Symposium on Theoretical Aspects of Computer Science,
LIPIcs. Leibniz Int. Proc. Inform., 47, Schloss Dagstuhl. Leibniz-Zentrum für Informatik,
Wadern, 2016, also preprint arXiv:1509.05494.
[MS19] M. Michelen and J. Sahasrabudhe, Central limit theorems from the roots of probability
generating functions, Advances in Mathematics 358 (2019), 106840, 27 pp.
[PR17] V. Patel and G. Regts, Deterministic polynomial-time approximation algorithms for
partition functions and graph polynomials, SIAM Journal on Computing 46 (2017),
no. 6, 1893–1919.
[PR20] H. Peters and G. Regts, Location of zeros for the partition function of the Ising model
on bounded degree graphs, Journal of the London Mathematical Society (2) 101 (2020),
no. 2, 765–785.
[P+23] V. Patel, G. Regts and A. Stam, A near-optimal zero-free disk for the Ising model,
preprint arXiv:2311.05574 (2023).
[Re18] G. Regts, Zero-free regions of partition functions with applications to algorithms and
graph limits, Combinatorica 38 (2018), no. 4, 987–1015.
[Re23] G. Regts, Absence of zeros implies strong spatial mixing, Probability Theory and
Related Fields 186 (2023), no. 1–2, 621–641.
[SF71] M. Suzuki and M.E. Fisher, Zeros of the partition function for the Heisenberg,
Ferroelectric, and general Ising models, Journal of Mathematical Physics 12(2) (1971),
235–246.
[YL52] C.N. Yang and T.D. Lee, Statistical theory of equations of state and phase transitions.
I. Theory of condensation, Physical Review (2) 87 (1952), 404–409.

Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1043, USA
E-mail address: [email protected]
