
SOME OPEN PROBLEMS

STEFAN STEINERBERGER

Abstract. This list contains some open problems that I came across and
that are not well known (no Riemann hypothesis...). Some are extremely
difficult, others might be doable and some might be easy (one of the problems
is that I do not know which is which). The presentation is pretty casual,
the relevant papers/references usually have more details – if you have any
questions, comments, remarks or additional references, please email me!

Contents

Part 1. Combinatorics
1. The Motzkin-Schmidt problem
2. Great Circles on S^2
3. Strange Patterns in Ulam's Sequence
4. Topological Structures in Irrational Rotations on the Torus
5. Graphical Designs
6. How big is the boundary of a graph?
7. The Constant in the Komlos Conjecture
8. The Inverse of the Star Discrepancy
9. Erdős Distinct Subset Sums Problem
10. Finding Short Paths in Graphs with Spectral Graph Theory
11. A Curve through the ℓ^1-norm of eigenvectors of Erdős-Renyi graphs
12. Matching Oscillations in High-Frequency Eigenvectors
13. Balanced Measures are small?
14. Different areas of triangles induced by points on S^1

Part 2. Analysis
15. Sums of distances between points on the sphere
16. A Directional Poincaré Inequality: flows of vector fields
17. Auto-Convolution Inequalities and additive combinatorics
18. Maxwell's Conjecture on Point Charges
19. Quadratic Crofton and sets that barely see themselves
20. Opaque Sets
21. A Greedy Energy Sequence on the Unit Interval
22. The Kritzinger sequence
23. A Special Property that some Lattices have?
24. Roots of Classical Orthogonal Polynomials
25. An Estimate for Probability Distributions
26. Hermite-Hadamard Inequalities
27. A Strange Inequality for the number of critical points
28. An improved Isoperimetric Inequality?
29. Geometric Probability
30. The Polarization Constant in R^n
31. Bad Science Matrices
32. Sums of Square Roots Close to an Integer
33. A Beurling estimate for GFA?

Part 3. Fourier Analysis
34. An Exponential Sum for Sequences of Reals
35. The Bourgain-Clozel-Kahane Root Uncertainty Principle
36. An Uncertainty Principle
37. Littlewood's Cosine Root Problem
38. The 'Complexity' of the Hardy-Littlewood Maximal Function
39. A Compactness Problem
40. Some Line Integrals
41. A Cube in R^n hitting a lattice point
42. A kind of uncertainty principle?

Part 4. Partial Differential Equations
43. The Hot Spots Conjecture
44. Norms of Laplacian Eigenfunctions
45. Laplacian Eigenfunctions with Dirichlet conditions and mean value 0
46. A PDE describing roots of polynomials under differentiation
47. The Khavinson-Shapiro conjecture
48. Dirichlet Eigenfunctions with Mean Value 0

Part 5. Problems involving Optimal Transport
49. Optimal Transport with concave cost
50. A Wasserstein transport problem
51. A Wasserstein Inequality in two dimensions
52. Transporting Mass from Ω to ∂Ω
53. Why is this Linear System usually solvable?

Part 6. Linear Algebra
54. Eigenvector Phase Retrieval
55. Matrix products
56. The Kaczmarz algorithm
57. Approximate Solutions of linear systems
58. Finding the Center of a Sphere from Many Points on the Sphere
59. Small Subsingular Values
60. Solving Equations with More Variables than Equations

Part 7. Miscellaneous/Recreational
61. Geodesics on compact manifolds
62. The Traveling Salesman Constant
63. Number of Positions in Chess
64. Ulam Sets
65. An amusing sequence of functions

Part 8. Solved Problems
66. A Wasserstein Uncertainty Principle with Applications
67. A Sign Pattern for the Hypergeometric Function 1F2
68. A Refinement of Smale's Conjecture?
69. A Type of Kantorovich-Rubinstein Inequality?
70. A Pretty Inequality involving the Cubic Jacobi Theta Function
71. An Inequality for Eigenvalues of the Graph Laplacian?
72. Pasteczka's conjecture

Last update: June 10, 2024. Partially supported by the NSF (DMS-1763179, DMS-2123224) and the Alfred P. Sloan Foundation.

Part 1. Combinatorics
1. The Motzkin-Schmidt problem
Let x1 , . . . , xn be n points in [0, 1]2 . The goal is to find a line ℓ such that the
ε−neighborhood of ℓ contains at least 3 points. How small can one choose ε (de-
pending on n) to ensure that this is always possible? A simple pigeonholing argu-
ment shows that ε ≤ 3/n always works. As far as I know, this trivial bound has
never been improved. The question, due to T. Motzkin and W. M. Schmidt (inde-
pendently), is whether ε = o(1/n) is possible. J. Beck has established this result
assuming that the points are somewhat evenly distributed in the square (J. Beck,
Almost collinear triples among N points on the plane). I would also be interested in
what happens when the points lie on S2 and one wants to capture at least 3 using
neighborhoods of great circles.

Figure 1. Finding strips that contain three points.

There is a harder version of the question (in the sense that if one could show that
no such constant exists, then this implies Motzkin-Schmidt).
Jean was very generous with problems. [...] He told me the following problem, which, to the best of my knowledge, is still open. Is it possible to find n points in the unit square such that the 1/n-neighborhood of any line contains no more than C of them for some absolute constant C? The motivation for this problem comes from a possible construction of spherical harmonics as a combination of Gaussian beams, which would have L^∞-norm bounded by a constant independently of the degree. (Varjú, Remembering Jean Bourgain (1954-2018), AMS Notices June 2021, p. 957)
Using suitable rectangles centered at the points of dimensions δ_n × 2, rotating them and controlling the L^2-norm in the usual manner from above and below, one sees that there exists a strip of width

\[ \delta_n \le n \left( \sum_{\substack{i,j=1 \\ i \neq j}}^{n} \frac{1}{\|x_i - x_j\|} \right)^{-1} \]

containing at least 3 points. We note that this bound is always ≤ c/n but not necessarily better than that if the points are nicely distributed in the square.

2. Great Circles on S2
Let C_1, ..., C_n denote the 1/n-neighborhoods of n great circles on S^2. Here's a natural question: how much do they have to overlap? I proved (Discrete & Computational Geometry, 2018) that

\[ \sum_{\substack{i,j=1 \\ i \neq j}}^{n} |C_i \cap C_j|^s \gtrsim_s \begin{cases} n^{2-2s} & \text{if } 0 \le s < 2, \\ n^{-2} \log n & \text{if } s = 2, \\ n^{1-3s/2} & \text{if } s > 2, \end{cases} \]

and these bounds are sharp.

Figure 2. Great circles, their 1/n-neighborhoods and intersection pattern.

The case s = 1 is interesting: it is essentially equivalent to the L^2-norm of the sum of characteristic functions: using χ_{C_i} to denote the characteristic function of C_i on S^2, we see that

\[ \sum_{i=1}^{n} |C_i| + \sum_{\substack{i,j=1 \\ i \neq j}}^{n} |C_i \cap C_j| = \int_{S^2} \sum_{i=1}^{n} \chi_{C_i}^2 + \sum_{\substack{i,j=1 \\ i \neq j}}^{n} \chi_{C_i} \chi_{C_j} \, dx = \int_{S^2} \left( \sum_{i=1}^{n} \chi_{C_i} \right)^2 dx. \]

This means there are arrangements of great circles where

\[ \Big\| \sum_{i=1}^{n} \chi_{C_i} \Big\|_{L^1(S^2)} \sim 1 \sim \Big\| \sum_{i=1}^{n} \chi_{C_i} \Big\|_{L^2(S^2)}. \]
It would be very interesting to understand the behavior of the Lp −norm for some
p > 2 and this seems to be a very difficult problem.
Open Question. What is the best lower bound on

\[ \Big\| \sum_{i=1}^{n} \chi_{C_i} \Big\|_{L^p(S^2)} \]

as n increases?
These results are partially inspired by trying to understand how curvature impacts
the Kakeya phenomenon. By the Kakeya phenomenon in d = 2 dimensions, we
mean the following relatively elementary proposition.
Proposition (Folklore). Let ℓ_1, ..., ℓ_n be any set of n lines in R^2 such that any two lines intersect in some point and denote their 1/n-neighborhoods by T_1, ..., T_n. Let s ≥ 0, then there exists c_s > 0 such that

\[ \sum_{\substack{i,j=1 \\ i \neq j}}^{n} |T_i \cap T_j|^s \ge c_s \begin{cases} n^{2-2s} & \text{if } 0 \le s < 1, \\ \log n & \text{if } s = 1, \\ n^{1-s} & \text{if } s > 1. \end{cases} \]
We see, essentially, that there has to be some unavoidable overlap; this is shown by the log n at s = 1. The results cited above show that on the sphere the log is necessary at s = 2. What happens in negative curvature? On the Poincaré disk?

Figure 3. The difference between R^2 and S^2 illustrated: the curvature of S^2 increases transversality, which decreases the area of intersection.

If we consider the δ-neighborhoods C_{1,δ}, C_{2,δ}, ..., C_{n,δ} of n fixed great circles where no two great circles coincide, and if p_1, ..., p_n denote one of their 'poles', then

\[ \lim_{\delta \to 0} \frac{1}{\delta^{2s}} \sum_{\substack{i,j=1 \\ i \neq j}}^{n} |C_{i,\delta} \cap C_{j,\delta}|^s = \sum_{\substack{i,j=1 \\ i \neq j}}^{n} \frac{1}{(1 - \langle p_i, p_j \rangle^2)^{s/2}}. \]

This seems like an interesting minimization problem in its own right.

Update (Nov 2020). This notion of Riesz energy

\[ \sum_{\substack{i,j=1 \\ i \neq j}}^{n} \frac{1}{(1 - \langle p_i, p_j \rangle^2)^{s/2}} \]

has been investigated by Chen, Hardin, Saff ('On the search for tight frames...', arXiv 2020) who show that minimizing configurations are well separated.

3. Strange Patterns in Ulam’s Sequence


In the 1960s, Stanislaw Ulam defined the following integer sequence: start with 1, 2
and then add the smallest integer that can be uniquely written as the sum of two
distinct earlier terms. It is not quite clear why he defined this sequence. It starts
1, 2, 3, 4, 6, 8, 11, 13, . . .
I found (Experimental Mathematics, 2017) that this sequence seems to obey a strange quasi-periodic law: indeed, the first 10^7 terms of the sequence satisfy

\[ \cos(2.5714474995 \, a_n) < 0 \qquad \text{except for } a_n \in \{2, 3, 47, 69\}. \]

I don't know what that number 2.571... is or why this should be true. This type of computation has since been extended to higher values; it seems to hold true up to at least 10^12 terms. Moreover, the sequence 2.571447... a_n mod 2π seems to have a limiting distribution that is compactly supported and has a strange shape (see the Figure). What is going on here?

Figure 4. The empirical density of 2.571... a_n mod 2π seems to be compactly supported.

The same seems to be true if one starts with initial values different from 1, 2. For some choices, the arising sequence seems to be a union of arithmetic progressions; whenever that is not the case, it seems to be 'chaotic' in the same sense: there exists a constant α (depending on the initial values) such that α a_n mod 2π has a strange limiting distribution. There are now several papers showing that this strange type of phenomenon seems to persist even in other settings. I would like to understand what this sequence does – even the most basic things are not known: the sequence is known to be infinite since a_n + a_{n−1} can be uniquely written as the sum of two earlier terms and thus

\[ a_{n+1} \le a_n + a_{n-1}. \]

This also shows that the sequence grows at most exponentially and that is the best
bound I am aware of. Empirically, the sequence has density ∼ 7% and it seems like
an ≤ 14n for all n sufficiently large.
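The sequence and the sign pattern are easy to experiment with; here is a small sketch (the constant 2.5714474995 is the one quoted above, and only a short initial segment is generated):

```python
# Sketch: generate the Ulam sequence 1, 2, 3, 4, 6, 8, 11, 13, ... and test the
# empirical sign condition cos(2.5714474995 * a_n) < 0 on an initial segment.
import math

def ulam(n_terms):
    seq = [1, 2]          # kept in increasing order
    members = {1, 2}
    m = 2
    while len(seq) < n_terms:
        m += 1
        reps = 0
        for a in seq:
            if a >= m - a:
                break     # enforce a < m - a so each pair is counted once
            if (m - a) in members:
                reps += 1
                if reps > 1:
                    break
        if reps == 1:     # uniquely a sum of two distinct earlier terms
            seq.append(m)
            members.add(m)
    return seq

terms = ulam(100)
exceptions = [a for a in terms if math.cos(2.5714474995 * a) >= 0]
print(terms[:8], exceptions)
```

On this segment the only exceptions are exactly the four values {2, 3, 47, 69} quoted in the text.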

Update (May 2022). Rodrigo Angelo (‘A hidden signal in Hofstadter’s H se-
quence’) discovered another example of a sequence with this property and gives
rigorous proofs for this example.

4. Topological Structures in Irrational Rotations on the Torus


Let xn = nα mod 1 where α ∈ R\Q. Let us consider the first n elements x1 , . . . , xn
and then construct the following graph G on n vertices {1, 2, . . . , n}: first, we
connect (1, 2), then (2, 3), then (3, 4) and so on until (n, 1). This results in a cycle
on n elements. Then we find the permutation π : {1, 2, . . . , n} → {1, 2, . . . , n} for
which
xπ(1) < xπ(2) < · · · < xπ(n) .
We then connect π(1) to π(2) and π(2) to π(3) and so on until π(n) is being
connected to π(1). I used (arXiv, August 2020) these types of graphs to construct
a test for whether x1 , . . . , xn are i.i.d. samples of a random variable: in that
case, the arising Graphs are expanders and close to Ramanujan. However, if one
builds this graph from the sequence of irrational rotations on the torus (certainly
not i.i.d.), very interesting graphs arise. It seems like they correspond to nice
underlying limiting objects? What are those?

Figure 5. The graph arising from x_n = nφ mod 1, where φ = (1+√5)/2 (left) and the van der Corput sequence in base 2 (right).

Simultaneously, if one takes the standard van der Corput sequence in base 2, one
seems to end up with a really nice manifold with some strange holes where things
are glued together in a fun way.

Update (Dec 2020). We found (‘Finding Structure in Sequences of Real Numbers


via Graph Theory: a Problem List’, arXiv, Dec 2020) that there are many sequences
that lead to interesting Graphs when applying this construction. For most of them
it is not at all clear why this happens.
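The two-cycle construction described above is easy to implement; the following is a minimal sketch applied to the golden-ratio rotation (`two_cycle_graph` is a hypothetical helper name, not from the paper):

```python
# Sketch: union of the "time" cycle (1,2),...,(n,1) with the "space" cycle
# that connects the values in increasing order, as described in the text.
import math
from collections import Counter

def two_cycle_graph(xs):
    n = len(xs)
    edges = set()
    for i in range(n):
        edges.add(frozenset((i, (i + 1) % n)))               # time cycle
    order = sorted(range(n), key=lambda i: xs[i])            # permutation pi
    for i in range(n):
        edges.add(frozenset((order[i], order[(i + 1) % n]))) # space cycle
    return edges

phi = (1 + math.sqrt(5)) / 2
n = 200
xs = [(i + 1) * phi % 1 for i in range(n)]
edges = two_cycle_graph(xs)
deg = Counter(v for e in edges for v in e)
print(len(edges), max(deg.values()), min(deg.values()))
```

By construction the result is a connected graph with maximum degree at most 4 (two cycles superimposed, with possible coinciding edges).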
8

5. Graphical Designs
This problem is somewhere between PDEs and Combinatorics. Let G = (V, E) be
a finite, undirected, simple graph. Then we can define functions on the Graph as
mappings f : V → R. We can also define a Laplacian on the Graph, this is simply
a map that sends functions to other functions. One possible choice is
\[ (Lf)(v) = f(v) - \frac{1}{\deg(v)} \sum_{w \sim v} f(w). \]

Figure 6. Generalized Petersen Graph (12,4) on 24 vertices: a subset of 8 vertices integrates the first 22 eigenfunctions exactly.

There are a couple of different definitions of the Laplacian and I don’t know what’s
the best choice for this problem. However, these different definitions of a Laplacian
all agree on d−regular graphs and the phenomenon at hand is already interesting
for d−regular graphs. Once one has a Laplacian matrix, one has eigenvectors
and eigenvalues. We will interpret these eigenvectors again as functions on the
graph. What one observes is that for many interesting graphs, there are subsets of the vertices W ⊂ V such that, for many different eigenvectors ϕ_k of the Graph Laplacian,

\[ \sum_{v \in W} \phi_k(v) = 0. \]

Figure 7. The Truncated Tetrahedral Graph on 12 vertices: a subset of 4 vertices integrates the first 11 eigenfunctions exactly.
9

What is interesting is that such sets seem to inherit a lot of rich structure. I proved (Journal of Graph Theory) that if there are large subsets W such that the equation holds for many different eigenvectors ϕ_k, then the Graph has to be 'non-Euclidean' in the sense that the volume of balls grows quite quickly (exponentially, depending on the precise parameters). This shows that we expect graphs with nice structure like this to be more like an expander than, say, a path graph. Konstantin Golubev showed (Lin. Alg. Appl.) that this framework naturally encodes some classical results from combinatorics: both the Erdős-Ko-Rado theorem and the Deza-Frankl theorem can be stated as being special types of these 'Graphical Designs'. There are many other connections: (1) for certain types of graphs, this
seems to be related to results from coding theory and (2) such points would also
be very good points when one tries to sample an unknown function in just a few
vertices; this is because the definition can be (3) regarded as a Graph equivalent of
the classical notion of ‘spherical design’ on Sd−1 . In fact, seeing as designs can be
regarded as an algebraic definition that gives rise to Platonic bodies in R3 , I like to
think of these ‘graphical designs’ as the analogue of ‘Platonic bodies in a graph’.
There are a great many questions:
(1) when do such sets exist?
(2) are there nice extremal examples?
(3) how does one find them quickly?
(4) how can one prove non-existence?
The theory of spherical designs on Sd−1 is quite rich and full of intricate problems
so I would assume the Graph analogue to be at least as difficult and probably
more difficult. But there are many more graphs than there are spheres (only one
per dimension: Sd ), so there should be many more interesting examples that may
themselves be tied to interesting algebraic-combinatorial structures.
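A toy brute-force search for such sets on the 3-dimensional hypercube graph, as a concrete sketch of question (3). To handle repeated eigenvalues, the code checks the basis-independent condition that the indicator of W is orthogonal to an entire eigenspace; this way of organizing the computation is my own, not taken from the papers above.

```python
# Sketch: search 4-element subsets W of the cube graph Q3 and count how many
# whole Laplacian eigenspaces satisfy sum_{v in W} phi(v) = 0 for every phi
# in the eigenspace (equivalently: 1_W is orthogonal to the eigenspace).
from itertools import combinations
import numpy as np

n = 8
A = np.zeros((n, n))
for u in range(n):
    for b in range(3):
        A[u, u ^ (1 << b)] = 1          # cube edges flip one bit
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)

spaces = {}                              # group eigenvectors by eigenvalue
for lam, vec in zip(np.round(eigvals, 8), eigvecs.T):
    spaces.setdefault(lam, []).append(vec)
nontrivial = [(lam, np.array(vs)) for lam, vs in sorted(spaces.items()) if lam > 1e-8]

def annihilated(W):
    ind = np.zeros(n)
    ind[list(W)] = 1.0
    return [lam for lam, basis in nontrivial if np.linalg.norm(basis @ ind) < 1e-8]

best = max((annihilated(W) for W in combinations(range(n), 4)), key=len)
print(best)
```

For Q3 the even-weight vertices {0, 3, 5, 6} annihilate the full eigenspaces for λ = 2 and λ = 4, i.e. they integrate every eigenfunction except the very last one exactly.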

6. How big is the boundary of a graph?


Let G = (V, E) be a graph. In ‘The Boundary of a Graph and its Isoperimetric
Inequality’ (Jan 2022) we introduced the following notion of a ‘boundary’ of a
graph: we say that u ∈ V is part of the boundary ∂G if there exists another vertex
v ∈ V such that
\[ \frac{1}{\deg(u)} \sum_{(u,w) \in E} d(w,v) < d(u,v). \]

Figure 8. Graphs, their boundary ∂G (red) and V \ ∂G (blue).

The same paper proves an isoperimetric inequality: each vertex v ∈ V will detect a 'large' number of vertices as boundary vertices.
10

Theorem. If G is a connected graph with maximal degree ∆, then for all v ∈ V

\[ \# \left\{ u \in V : \frac{1}{\deg(u)} \sum_{(u,w) \in E} d(w,v) < d(u,v) \right\} \ge \frac{1}{2\Delta} \frac{|V|}{\operatorname{diam}(G)}. \]

This inequality is presumably close to optimal and it implies

\[ |\partial G| \ge \frac{1}{2\Delta} \frac{|V|}{\operatorname{diam}(G)}. \]

I would be interested in understanding whether one can obtain other results of this flavor (perhaps invoking other graph parameters). I would also be interested in the isoperimetric problem: what are the graphs with minimal boundary? Put differently, what are the 'balls' in the graph universe?
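The boundary is completely algorithmic to compute from the definition; a minimal sketch using BFS distances (adjacency as a dict of neighbor sets; both example graphs are my own toy choices):

```python
# Sketch: compute the boundary of a graph exactly as in the definition above.
from collections import deque

def bfs_dist(adj, s):
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

def boundary(adj):
    dist = {v: bfs_dist(adj, v) for v in adj}
    bd = set()
    for u in adj:
        for v in adj:
            # u is a boundary vertex if its neighbors are, on average,
            # strictly closer to some v than u itself
            avg = sum(dist[w][v] for w in adj[u]) / len(adj[u])
            if avg < dist[u][v]:
                bd.add(u)
                break
    return bd

path5 = {i: {j for j in (i - 1, i + 1) if 0 <= j < 5} for i in range(5)}
k4 = {i: set(range(4)) - {i} for i in range(4)}
print(boundary(path5), boundary(k4))
```

As one would hope, the path graph has exactly its two endpoints as boundary, while every vertex of a complete graph is a boundary vertex.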

Update (Sep 2022). Chiem, Dudarov, Lee, Lee & Liu (‘A characterization of
graphs with at most four boundary vertices’) have characterized all graphs with at
most 4 boundary vertices.

7. The Constant in the Komlos Conjecture


Let A ∈ Rn×n have the property that all columns have ℓ2 −norm at most 1.
Komlos Conjecture. There exists a universal constant K > 0
such that for all such matrices A

min ∥Ax∥ℓ∞ ≤ K.
x∈{−1,1}n

The best known result is K = O(√(log n)) (Banaszczyk) and the conjecture is that there exists a universal K = O(1) independent of the dimension. What makes the conjecture even more charming is that the constant K might actually be quite small.

Question. Is it possible to get good lower bounds on K?

Remarkably little seems to be known about this. It is not that easy to construct a matrix showing that K > 1.5 and apparently for a while it was considered an unreasonable guess that K = 2. The best known result that I know of is due to Kunisky ('The discrepancy of unsatisfiable matrices and a lower bound for the Komlos conjecture constant') showing that

\[ K \ge 1 + \sqrt{2} = 2.4142\ldots \]

What makes the problem of constructing lower bounds hard is that, given an n × n matrix, one needs to check 2^n vectors to verify that for all of them ∥Ax∥_{ℓ∞} is large.

What I found is that, at least with regards to numerical experimentation, finite projective planes seem to be interesting candidates (the matrix example showing K ≥ √3 is the incidence matrix of the Fano plane). The problem then has a completely combinatorial flavor. Given a finite projective plane X, what is the best

 
\[ \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \end{pmatrix} \]

Figure 9. A matrix showing K ≥ √3.

constant c_X such that for every 2-coloring χ : X → {−1, 1}, there always exists a line ℓ in the projective plane such that

\[ \Big| \sum_{x \in \ell} \chi(x) \Big| \ge c_X? \]

It is well-known (follows from the spectral bound, for example) that

\[ c_X \ge (1 - o(1)) \cdot (\#X)^{1/4}. \]

It is known (Joel Spencer, Coloring the projective plane, 1988) that this is the correct order of magnitude and, for some universal α > 1,

\[ c_X \le \alpha \, (\#X)^{1/4}. \]
We are now interested in the value of α for the following reason.

Proposition. If X is a projective plane of order q (so that every line contains q + 1 points), then

\[ K \ge \frac{c_X}{\sqrt{q+1}}. \]
Of course, unsurprisingly, the constant cX is also not easy to compute. What one
can do in practice is to try heuristic coloring schemes to obtain upper bounds and
hope that they somehow capture the underlying behavior.

X      PG(2,2)  PG(2,3)  PG(2,4)  PG(2,5)  PG(2,7)  PG(2,13)  PG(2,23)
c_X    = 3      = 2      = 3      ≤ 4      ≤ 4      ≤ 8       ≤ 12

Table 1. Some upper bounds on c_X

If, for example, it were the case that for X = PG(2, 23) we indeed have c_X = 12, then this would correspond to a lower bound K ≥ √6 ∼ 2.44.... I have seen people use SAT solvers for related problems but it's not clear to me whether there is any hope of doing something like this here. It might also be interesting to obtain an upper bound on how well one could hope to do with this kind of construction.
Question. Is it possible to make Spencer's bound

\[ c_X \le \alpha \, (\#X)^{1/4} \]

effective? Can one choose α = 10, for example?
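For the Fano plane the constant c_X can be verified by brute force over all 2^7 colorings; the seven lines below are read off the incidence matrix shown above.

```python
# Sketch: every 2-coloring of the Fano plane PG(2,2) has a line with |sum| >= 3
# (there is always a monochromatic line), and some coloring achieves exactly 3,
# so c_X = 3 as in Table 1 -- giving K >= 3/sqrt(3) = sqrt(3).
from itertools import product

lines = [{0, 1, 2}, {0, 3, 6}, {0, 4, 5}, {1, 3, 5},
         {1, 4, 6}, {2, 3, 4}, {2, 5, 6}]

c_X = min(
    max(abs(sum(chi[x] for x in line)) for line in lines)
    for chi in product((-1, 1), repeat=7)
)
print(c_X)
```

The same exhaustive approach is hopeless already for PG(2,5) (2^31 colorings), which is why the table entries beyond small orders are only heuristic upper bounds.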

Update (Aug 2022). Victor Reis reports the bounds in Table 2. They do seem to indicate that there are actually rather effective two-colorings (in the sense of the α being close to 2 or even smaller), which would suggest that finite projective planes will not lead to good lower bounds for the Komlos conjecture.

X      PG(2,5)  PG(2,13)  PG(2,23)  PG(2,31)  PG(2,41)
c_X    = 4      ≤ 6       ≤ 8       ≤ 10      ≤ 12

X      PG(2,47)  PG(2,67)  PG(2,79)  PG(2,83)
c_X    ≤ 14      ≤ 16      ≤ 16      ≤ 20

Table 2. Bounds on c_X by Victor Reis

8. The Inverse of the Star Discrepancy


A central problem in discrepancy theory is the challenge of distributing points {x_1, ..., x_n} in [0,1]^d as evenly as possible. Naturally there are different measures of regularity; one such measure is

\[ \text{star-discrepancy} = \max_{y \in [0,1]^d} \left| \frac{1}{n} \# \{ 1 \le i \le n : x_i \in [0,y] \} - \operatorname{vol}([0,y]) \right|, \]

where [0, y] = [0, y_1] × ··· × [0, y_d] has volume vol([0, y]) = y_1 y_2 ⋯ y_d. A fundamental question in the area is the following.
Problem. Suppose we are in [0,1]^d and want the star-discrepancy to be smaller than 0 < ε < 1; how many points do we need?

The cardinality of the smallest set of points in [0,1]^d achieving a star-discrepancy smaller than ε is sometimes denoted by N_∞^*(d, ε). The best known bounds are

\[ \frac{d}{\varepsilon} \lesssim N_\infty^*(d, \varepsilon) \lesssim \frac{d}{\varepsilon^2}, \]

where the upper bound is a probabilistic argument by Heinrich, Novak, Wasilkowski & Wozniakowski (2001). The lower bound was established by Hinrichs (2004) using Vapnik-Chervonenkis classes and the Sauer-Shelah lemma.

The upper bound construction is relatively easy: take iid random points. This leads
to a fascinating dichotomy
• either random points are essentially as regular as possible
• or there are more regular constructions we do not yet know about.
I could well imagine that random is best possible but am personally hoping for the
existence of better sets (because such sets would probably be pretty interesting). I
gave a new proof of the lower bound (‘An elementary proof of a lower bound for the
inverse of the star discrepancy’) which is relatively simple and entirely elementary
– can any of these ideas be used to construct ‘good’ sets of points?
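The star-discrepancy itself is computable for small point sets. A small sketch for d = 2; the assumption here is that evaluating at "corner" candidates built from the point coordinates and 1 (with both closed and open boxes) recovers the supremum, which is enough for these toy examples.

```python
# Sketch: evaluate |count/n - vol| at corner candidates; open boxes capture the
# supremum approached from below, closed boxes the attained values.
from itertools import product

def star_discrepancy_2d(pts):
    n = len(pts)
    cand = sorted({x for p in pts for x in p} | {1.0})
    best = 0.0
    for y1, y2 in product(cand, cand):
        vol = y1 * y2
        closed = sum(1 for (a, b) in pts if a <= y1 and b <= y2) / n
        opened = sum(1 for (a, b) in pts if a < y1 and b < y2) / n
        best = max(best, abs(closed - vol), abs(opened - vol))
    return best

d1 = star_discrepancy_2d([(0.5, 0.5)])      # one centered point: 0.75
d2 = star_discrepancy_2d([(0.9, 0.9)] * 4)  # four clumped points: 0.9
print(d1, d2)
```

Even this naive evaluation makes the dichotomy above experimentally accessible for small n.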

9. Erdős Distinct Subset Sums Problem


This problem is fairly well known; indeed, in a 1989 Rostock Math Kolloquium survey of problems, Erdős calls it "perhaps my first serious conjecture which goes back to 1931 or 32". Let a_1 < ··· < a_n be a set of n positive integers such that all subset sums are distinct: from the sum of the subset it is possible to uniquely identify the subset. The powers of 2, for example, have this property. Erdős conjectured (and offered $500) that a_n ≥ c · 2^n. Currently, the best known bound is

\[ a_n \ge (c - o(1)) \frac{2^n}{\sqrt{n}}, \]
where different estimates for c have been given over the years:

c ≥ 1/4 (Erdős & Moser)
c ≥ 2/3^{3/2} (Alon & Spencer)
c ≥ 1/√π (Elkies)
c ≥ 1/√3 (Bae, Guy)
c ≥ √(3/(2π)) (Aliev)
c ≥ √(2/π) (Dubroff, Fox & Xu)
I gave another proof for √(2/π) using Fourier Analysis ('Some Remarks on the Erdős Distinct Subset Sums Problem') which suggests a couple of additional things about the structure of such sets. In particular, it seems to suggest that optimal sets may have several values rather close to the maximal value and then additional values at some of the points (max a_i)/2, (max a_i)/3, (max a_i)/4, .... Elkies (J Comb Theory A, 1986) already has the following gorgeous reformulation of the problem.

Erdős Distinct Subset Sum Problem, Fourier Version. Suppose a_1, ..., a_n are distinct integers such that

\[ \int_0^1 \prod_{i=1}^{n} \cos(2\pi a_i x)^2 \, dx = \frac{1}{2^n}, \]

does this imply max_i a_i ≳ 2^n? We know max_i a_i ≳ 2^n/√n.

There is another interesting problem that arises from relaxing the conditions: what
if we do not require subset sums to be distinct but merely require them to attain
many different values (some duplicates are allowed as long as there are few).

Problem. Are there sets {a_1, ..., a_n} ⊂ N such that the subset sums attain (1 − o(1)) · 2^n distinct values and a_n = o(2^n)?
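The distinct-subset-sums property is easy to test directly. The five-element set below (often attributed to Conway and Guy) has all 32 subset sums distinct while its maximum, 13, is smaller than 2^4 = 16 — a concrete hint that powers of 2 are not optimal.

```python
# Sketch: count distinct subset sums; a set of size n has the distinct
# subset sums property iff the count equals 2^n.
from itertools import combinations

def num_subset_sums(a):
    sums = {sum(c) for r in range(len(a) + 1) for c in combinations(a, r)}
    return len(sums)

powers = [1, 2, 4, 8, 16]
conway_guy = [6, 9, 11, 12, 13]
print(num_subset_sums(powers), num_subset_sums(conway_guy))
```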

10. Finding Short Paths in Graphs with Spectral Graph Theory


This section describes a curious phenomenon that I do not understand (described
in greater detail in ‘A Spectral Approach to the Shortest Path Problem’, arXiv
April 2020). Given a (connected) graph G = (V, E) and a vertex u ∈ V , we can

look for the following function ϕ : V → R:

\[ \phi = \arg\min_{\substack{f : V \to \mathbb{R} \\ f(u)=0, \ f \not\equiv 0}} \frac{\sum_{(w_1,w_2) \in E} (f(w_1) - f(w_2))^2}{\sum_{w \in V} f(w)^2}. \]
Basically, ϕ is a function that vanishes in the vertex u but changes as little as possible from one vertex to the next (subject to a normalization in ℓ^2). ϕ is actually easy to compute: it is an eigenvector of an explicit square matrix (essentially the Graph Laplacian after one has removed the row and column that belongs to u). We can assume w.l.o.g. that ϕ is positive everywhere except in u.

Figure 10. Paths taken by the Spectral Method.

If one then starts at a vertex v ∈ V with v ≠ u and is interested in a short path to the vertex u, one can do the following: look among all neighbors of where you currently are for the one that has the smallest ϕ-value. Go there and repeat. This will provably lead to a path from v to u.
Question. Very often, this path will be fairly short (i.e. compara-
ble in length to the shortest path). Why?
We emphasize that this is not always the case; however, we found that in many cases these paths are quite good. What can be proven? Is it possible to find families of graphs for which these spectral paths always coincide with the shortest paths?

Part of the motivation is that this question can be interpreted as a discrete version
of the Hot Spots conjecture (in particular, if the graph discretizes a convex domain
in the usual grid-like fashion, then we expect ϕ to be monotonically increasing away
from the vertex and to assume its maximum on the boundary).
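The whole procedure fits in a few lines: compute ϕ as the bottom eigenvector of the Laplacian with the row and column of u removed, then descend greedily. The grid example below is my own choice (note that ϕ strictly decreases along the greedy path, so it always terminates at u).

```python
# Sketch of the spectral path heuristic described above.
import numpy as np

def spectral_path(adj, u, v):
    n = len(adj)
    A = np.zeros((n, n))
    for a in adj:
        for b in adj[a]:
            A[a, b] = 1
    L = np.diag(A.sum(axis=1)) - A
    keep = [i for i in range(n) if i != u]
    vals, vecs = np.linalg.eigh(L[np.ix_(keep, keep)])
    phi = np.zeros(n)
    phi[keep] = np.abs(vecs[:, 0])   # bottom eigenvector; sign fixed >= 0
    path = [v]
    while path[-1] != u and len(path) <= n:
        path.append(min(adj[path[-1]], key=lambda w: phi[w]))
    return path

# 5 x 5 grid graph, vertex (i, j) encoded as i*5 + j
adj = {i * 5 + j: {(i + di) * 5 + (j + dj)
                   for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if 0 <= i + di < 5 and 0 <= j + dj < 5}
       for i in range(5) for j in range(5)}
path = spectral_path(adj, u=0, v=24)
print(path)
```

On the grid this typically produces a short monotone path from corner to corner, matching the hot-spots intuition mentioned above.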

Update (Nov 2020). Yancey & Yancey (arXiv:2011.08998) discuss some coun-
terexamples and propose graph curvature as an interesting condition.

11. A Curve through the ℓ1 −norm of eigenvectors of Erdős-Renyi


graphs
This is a strange phenomenon that Alex Cloninger and I discovered a while back. It
is mentioned in our paper (‘On The Dual Geometry of Laplacian Eigenfunctions’,
arXiv April 2018) but it’s not widely known.

Let G(n, p) be a standard Erdős-Renyi random graph and let L = D − A be the


associated Laplacian matrix. This matrix has eigenvalues 0 = λ1 ≤ λ2 ≤ · · · ≤ λn .
We only looked at the case where p is fixed and n is large and the graph is usually (even highly) connected. In that case there is one trivial (constant) eigenfunction ϕ_1 = 1/√n. We note that, since the eigenvectors are normalized in ℓ^2, we have

\[ \|v_i\|_{\ell^1} \le \sqrt{n} \cdot \|v_i\|_{\ell^2} = \sqrt{n}. \]
Moreover, ∥v_i∥_{ℓ^1} is a measure for how localized an eigenvector is: the more it concentrates its mass on few vertices, the smaller the norm is. If the eigenvector is completely flat (i.e. constant), then the norm is maximal and given by √n.

Figure 11. The ℓ^1-norm of v_1, ..., v_n lies on a nice curve (the underlying graph is G(n, p) with n = 5000 and p = 0.4).

Question. When we plot ∥v_i∥_{ℓ^1} for i = 1, ..., n, they seem to lie on a curve (see Fig. 11). Why would they do that?
This somehow means that eigenvectors at the edge of the spectrum are more lo-
calized: that is perhaps not too surprising. What is truly surprising is that the
eigenvectors seem to uniformly lie very close to this curve. There seems to be
a strong measure concentration phenomenon at work: the curve always looks the
same for many different random realizations of G(n, p) (for fixed n, p).

A second question is what happens in the middle. What we see there is that

\[ \mathbb{E} \max_{1 \le i \le n} \frac{\|v_i\|_{\ell^1}}{\sqrt{n}} \sim 0.8. \]

The relevant i seems to be i ∼ n/2. If X ∼ N(0,1) is a standard Gaussian, then E|X| = √(2/π) ∼ 0.7978.... Coincidence?
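The plot is easy to reproduce in a few lines (smaller n than in the figure, and the sampled graph is of course a hypothetical random instance):

```python
# Sketch: l1-norms of Laplacian eigenvectors of a sampled G(n, p). By
# Cauchy-Schwarz every ratio ||v||_1 / sqrt(n) is at most 1; the trivial
# constant eigenvector attains 1, delocalized ones sit near sqrt(2/pi).
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 0.4
A = np.triu(rng.random((n, n)) < p, k=1).astype(float)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
_, vecs = np.linalg.eigh(L)                 # columns are l2-normalized
ratios = np.abs(vecs).sum(axis=0) / np.sqrt(n)
print(ratios[0], ratios[1:].max(), ratios.min())
```

Plotting `ratios` against the index i reproduces the curve of Figure 11 qualitatively even at this small size.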

Update (Dec 2020). I asked the question on mathoverflow. Ofer Zeitouni pointed out that works by Rudelson-Vershynin and Eldan et al. suggest the lower bound

\[ \mathbb{E} \min_{1 \le i \le n} \frac{\|v_i\|_{\ell^1}}{\sqrt{n}} \gtrsim \frac{1}{(\log n)^c}. \]

12. Matching Oscillations in High-Frequency Eigenvectors.


This section discusses a phenomenon that is perhaps best introduced with an example: consider the Thomassen graph on 94 vertices and the Graph Laplacian L = D − A with eigenvalues ordered as

\[ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{94} = 0. \]

This graph is 3-regular and the three largest eigenvalues are distinct. The Figure shows the signs of ϕ_2, ϕ_3 (left and middle) and the sign of ϕ_2 · ϕ_3.

Figure 12. Sign of the 2nd and the 3rd eigenvector and of their
product.

The second and the third eigenvector have sign changes across most of the edges: they oscillate essentially as quickly as the graph allows. In contrast, the (pointwise) product of these high-frequency eigenvectors appears to be much smoother and exhibits a sign pattern typical of low-frequency eigenvectors: positive and negative entries are clustered together and meet across a smooth interface.

I gave a theoretical explanation in (‘The product of two high-frequency Graph


Laplacian eigenfunctions is smooth’, Discrete Mathematics) but it seems like it’s a
rather rich phenomenon and maybe I barely scratched the surface? It also seems
as if this might actually be useful in applications...?
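The phenomenon is easy to reproduce on a path graph, where the Rayleigh quotient f^T L f / f^T f serves as a quantitative smoothness measure (this is my own toy example, not the Thomassen graph from the paper):

```python
# Sketch: the two highest-frequency eigenvectors of a path graph oscillate
# maximally, yet their pointwise product has a tiny Rayleigh quotient,
# i.e. it is a very smooth (low-frequency) function on the graph.
import numpy as np

n = 40
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1
L = np.diag(A.sum(axis=1)) - A
vals, vecs = np.linalg.eigh(L)

def rayleigh(f):
    return float(f @ L @ f) / float(f @ f)

top, second = vecs[:, -1], vecs[:, -2]
prod = top * second
print(rayleigh(top), rayleigh(second), rayleigh(prod))
```

On the path this is fully explicit: the product of the two top cosine modes aliases to a combination of the first and third modes, which is why its Rayleigh quotient collapses.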

13. Balanced Measures are small?


Let G = (V, E) be some finite, combinatorial graph. We can define an infinite sequence in the set of vertices by starting with some arbitrary vertex and then setting

\[ \sum_{i=1}^{k} d(x_{k+1}, x_i) = \max_{v \in V} \sum_{i=1}^{k} d(v, x_i). \]

In words: the new vertex that is being added has the property of maximizing the
sum of distances to the previous vertices. This procedure is also interesting in the
continuous setting (see ‘Sums of distances between points on the sphere’)

Figure 13. The Frucht graph (left) and an Erdős-Renyi random


graph on n = 50 vertices (right). The rule ends up only selecting
the red vertices (not necessarily with equal frequency).

One could now wonder about the long-term behavior of this construction and what
one observes is that vertices are not taken with equal frequency, things are quite
lopsided and the actual frequency with which a vertex is selected converges to a
very special kind of probability distribution.
Definition 1. A probability measure µ on the vertices V is balanced if for all w ∈ V

\[ \mu(w) > 0 \implies \sum_{u \in V} d(w,u)\mu(u) = \max_{v \in V} \sum_{u \in V} d(v,u)\mu(u). \]

So basically the support of the measure is simultaneously the point to which the
global transport of the entire measure would be the most expensive. We note that
balanced measures are not necessarily unique.

Figure 14. m path graphs of length 2ℓ + 1 glued together at the endpoints. Top: two different balanced measures leading to (bottom) two different embeddings emphasizing different aspects.

The thing that is now really interesting is the following.


Problem. Balanced measures tend to be supported in a rather
small number of vertices. Why?

There are certainly graphs where this is not the case: examples are given by the dodecahedral graph or the Desargues graph, for which the uniform measure is balanced. In practice, it seems very difficult for graphs to have the support of a balanced measure contain a large number of vertices and it would be interesting to have a more quantitative understanding of this.
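The greedy sequence itself takes only a few lines to run; the sketch below uses a hypothetical sampled graph (ties broken by smallest index, with a path added only to force connectivity) and reports how many distinct vertices are ever selected — empirically, few.

```python
# Sketch: run the greedy max-sum-of-distances sequence and inspect its support.
import random
from collections import deque

random.seed(1)
n = 40
adj = {v: set() for v in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < 0.15:
            adj[i].add(j)
            adj[j].add(i)
for i in range(n - 1):               # path on top, so the graph is connected
    adj[i].add(i + 1)
    adj[i + 1].add(i)

def bfs_dist(s):
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return [dist[v] for v in range(n)]

D = [bfs_dist(v) for v in range(n)]
seq = [0]                            # start at an arbitrary vertex
totals = D[0][:]                     # running sums of distances to seq
for _ in range(500):
    nxt = max(range(n), key=lambda v: totals[v])
    seq.append(nxt)
    totals = [totals[v] + D[nxt][v] for v in range(n)]
print(len(set(seq)), sorted(set(seq)))
```

Tracking the selection frequencies of the support vertices gives an empirical approximation of a balanced measure.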

14. Different areas of triangles induced by points on S1



Suppose we are given n distinct points x_1, x_2, ..., x_n ∈ S^1 = {x ∈ R^2 : ∥x∥ = 1}. Any three points x_i, x_j, x_k induce a triangle ∆(x_i, x_j, x_k) which has some area. The question is: how many different areas are we going to see?
Question. Is it true that
# {area [∆(xi , xj , xk )] : 1 ≤ i < j < k ≤ n} ≳ n2 ?
One could even ask the stronger question whether any fixed point, being part of ∼ n² triangles, is bound to see at least ∼ n² different areas, i.e. whether
$$\#\{\mathrm{area}[\Delta(x_1, x_j, x_k)] : 1 \le j < k \le n\} \gtrsim n^2\,?$$
There are a couple of different reasons why I like the problem (besides it being a
totally elementary question). The first is that the extremizers are not obvious. It
is completely clear that one would expect n equispaced points to generate the least
number of different triangle areas and this is probably true. However, the set of
points
$$X_n = \{e^{ik\alpha} : 1 \le k \le n\} \subset \mathbb{C} \cong \mathbb{R}^2$$
only differs up to a constant. It is easy to see, since the areas of a triangle are determined by the three sides and we constrain the points to lie on S¹, that
$$\#\{\mathrm{area}[\Delta(x_i, x_j, x_k)] : 1 \le i < j < k \le n\} \lesssim \left(\#\{\|x_i - x_j\| : 1 \le i, j \le n\}\right)^2$$
and irrational rotations on the torus only have ∼ n pairwise distances since $|e^{ik\alpha} - e^{im\alpha}| = |e^{i(k-m)\alpha} - 1|$.
The second reason I like the problem is the formula
$$\mathrm{area}\left[\Delta(e^{is}, e^{it}, e^{iu})\right]^2 = 4 \sin^2\left(\frac{s-t}{2}\right)\sin^2\left(\frac{u-s}{2}\right)\sin^2\left(\frac{t-u}{2}\right).$$
This naturally calls for the angles to be in some form of generalized arithmetic
progression but it's clear that there is a nonlinear twist to it. The third reason I like the problem is that it is known (Erdős–Purdy and others) that n points in the plane induce at least ⌊(n − 1)/2⌋ different triangle areas: taking equispaced points on two parallel lines shows that this is sharp. This means the curvature of S¹ has to somehow play a role.
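A quick experiment is easy to set up; this is a sketch (areas are rounded to absorb floating-point noise, and for equispaced points the count is bounded above by the number of gap multisets {i, j, k} with i + j + k = n, which grows like n²/12):

```python
import numpy as np
from itertools import combinations

# Count distinct triangle areas among n equispaced points on the unit circle.
# For equispaced points the area depends only on the multiset of arc gaps, so
# the count is at most the number of partitions of n into 3 positive parts.
def distinct_areas(pts):
    areas = set()
    for a, b, c in combinations(pts, 3):
        u, v = b - a, c - a
        areas.add(round(abs(u[0] * v[1] - u[1] * v[0]) / 2, 6))
    return len(areas)

for n in [8, 12, 16, 20]:
    t = 2 * np.pi * np.arange(n) / n
    pts = np.column_stack([np.cos(t), np.sin(t)])
    print(n, distinct_areas(pts))
```

Replacing the equispaced angles by k·α for irrational α gives a comparison set with only ∼ n pairwise distances.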

Update (May 2023). I asked the question on mathoverflow, no reply.



Part 2. Analysis
15. Sums of distances between points on the sphere
It is known that for any set {x₁, ..., x_n} ⊂ S², we have
$$\sum_{i,j=1}^{n} \|x_i - x_j\| \le \frac{4}{3}\, n^2 - c\sqrt{n}.$$

Here, ∥·∥ is the Euclidean distance between points in R³, the number 4/3 is the average distance between two randomly chosen points on the sphere and c > 0. The inequality follows from the Stolarsky Invariance Principle (cf. works of Dmitriy Bilyk) and the lower bound on the L²-spherical cap discrepancy due to Jozsef Beck.

No good explicit construction of points with this property is known (the only constructions that are known to attain this bound are randomized constructions involving either jittered sampling or determinantal point processes). A strange mystery (discussed in 'Polarization and Greedy Energy on the Sphere') is the following: start with an arbitrary initial set of points {x₁, ..., x_m} and then define a greedy sequence by setting the next point so as to maximize the sum of the distances to the existing points,
$$x_{n+1} = \arg\max_{x \in S^2} \sum_{i=1}^{n} \|x - x_i\|.$$
This seems to actually lead to a sequence of maximal growth. We know ('Polarization and Greedy Energy on the Sphere') that, for n sufficiently large (depending on the initial conditions),
$$\sum_{i,j=1}^{n} \|x_i - x_j\| \ge \frac{4}{3}\, n^2 - 100n$$

and it’s completely clear from the proof that this far from optimal (in fact, it is
optimal for a much wider class of sequences that is clearly not as well behaved).
The procedure is very stable: one does not need to take the actual maximum, it is
enough to pick a value that is sufficiently close. One can even sometimes replace
xn+1 by some other arbitrary point (perhaps a really bad one) and, as long as one
does not do that too often, this does not seem to cause any issues, it appears to
be an incredibly robust procedure. It also seemingly works in higher dimensions.
Why? We also note that there are at least two different problems here:
(1) finding a good construction for fixed n,
(2) finding a sequence $(x_k)_{k=1}^{\infty}$ such that $(x_k)_{k=1}^{n}$ is good uniformly in n.
It is clear that the second problem is harder than the first but it also appears, numerically, as if the greedy sequence leads to a sequence $(x_k)_{k=1}^{\infty}$ such that $(x_k)_{k=1}^{n}$ is optimal (up to constants) uniformly in n.
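This is easy to simulate; a sketch, exploiting the robustness noted above (we only take an approximate maximizer, the best among a few thousand random candidate points):

```python
import numpy as np

# Greedy sequence on S^2: each new point (approximately) maximizes the sum of
# distances to the existing points; near-maximizers suffice, so we take the
# best of many random candidates. The normalized pair sum approaches 4/3.
rng = np.random.default_rng(0)

def random_sphere(m):
    v = rng.normal(size=(m, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

pts = random_sphere(2)                      # arbitrary initial points
for _ in range(198):
    cand = random_sphere(4000)
    dist = np.linalg.norm(cand[:, None, :] - pts[None, :, :], axis=2)
    pts = np.vstack([pts, cand[np.argmax(dist.sum(axis=1))]])

n = len(pts)
total = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2).sum()
print(total / n**2)  # just below the limit 4/3
```

The 4000-candidate shortcut is exactly the "sufficiently close to the maximum" relaxation mentioned above.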

The problem is also interesting on graphs (see ‘Balanced measures are small?’)

16. A Directional Poincare Inequality: flows of vector fields


I showed (Arkiv Math. 2016) a curious refinement of the Poincaré inequality on the torus T^d. A special case on the 2-dimensional torus T² reads as follows: for functions f : T² → R with mean value 0,
$$\|f\|_{L^2(\mathbb{T}^2)}^2 \le c\, \|\nabla f\|_{L^2(\mathbb{T}^2)} \left\| \frac{\partial f}{\partial x} + \sqrt{2}\, \frac{\partial f}{\partial y} \right\|_{L^2(\mathbb{T}^2)}.$$
Moreover, the inequality fails when √2 is replaced by e and probably fails when √2 is replaced by π. This is a funny inequality because, as opposed to the classical Poincaré inequality, this one does not have a difference in regularity: there are infinitely many orthogonal functions where the LHS is (up to a constant) comparable to the RHS. So in some sense it is an absolutely sharp form of Poincaré where some portion of the derivative is exchanged against a directional derivative.

One would assume that this is generally possible. For example, let V be a vector field on T². When do we have, for some fixed universal δ = δ(V) > 0, that for all f ∈ C^∞(T²) with mean value 0
$$\|\nabla f\|_{L^2(\mathbb{T}^2)}^{1-\delta}\, \|\langle \nabla f, V \rangle\|_{L^2(\mathbb{T}^2)}^{\delta} \ge c_V\, \|f\|_{L^2(\mathbb{T}^2)}\,?$$

How does δ depend on V ? One would expect that it depends on the mixing prop-
erties of V : the better it mixes, the larger δ can be. It should be connected to how
quickly the flow of the vector field transports you from the vicinity of one point to
the vicinity of another point. Is it true that δ can never be larger than 1/2?

17. Auto-Convolution Inequalities and additive combinatorics


All these questions are relevant in additive combinatorics (see papers) but interesting in their own right. Let f ∈ C^∞(−1/4, 1/4) satisfy f ≥ 0 and have total integral 1. Then the convolution f ∗ f is compactly supported on (−1/2, 1/2) and, by Fubini, still has total integral 1. How large is ∥f ∗ f∥_{L^∞} going to be? By the pigeonhole principle, we have ∥f ∗ f∥_{L^∞} ≥ 1. However, this is clearly lossy since it could only be sharp if f ∗ f were the characteristic function of (−1/2, 1/2), which is not the convolution of a function with itself (this can be seen by looking at the Fourier transform, which assumes negative values).
Theorem (Alex Cloninger & S, Proc. Amer. Math. Soc.). Let f : R → R≥0 be supported in [−1/4, 1/4]. Then
$$\sup_{x \in \mathbb{R}} \int_{\mathbb{R}} f(t)\, f(x-t)\, dt \ge 1.28 \left( \int_{-1/4}^{1/4} f(x)\, dx \right)^2.$$

It seems likely that the optimal constant is closer to ∼ 1.5.

The question also has a dual formulation (which also has relevance in Additive
Combinatorics): whenever we have an L1 −function f , then there exists a shift such
that f(x) and f(x − t) have small inner product. We observe that, for f ≥ 0,
$$\min_{0 \le t \le 1} \int_{\mathbb{R}} f(x) f(x-t)\, dx \le \int_0^1 \int_{\mathbb{R}} f(x) f(x-t)\, dx\, dt \le \int_0^{\infty} \int_{\mathbb{R}} f(x) f(x-t)\, dx\, dt = \frac{\|f\|_{L^1}^2}{2}.$$
So the statement itself is not complicated but the optimal constant seems to be
quite complicated. We currently know that

Theorem (Rick Barnard & S., J. Number Theory). We have
$$\min_{0 \le t \le 1} \int_{\mathbb{R}} f(x) f(x+t)\, dx \le 0.42\, \|f\|_{L^1}^2$$
and 0.42 cannot be replaced by 0.37.

Noah Kravitz (arXiv, April 2020) proved that the question is equivalent to an old
question about the cardinality of difference bases. We conclude with a related
question of G. Martin & K. O’Bryant: if f ∈ L1 (R) is nonnegative, can Hölder’s
inequality be improved? Is there a universal δ > 0 such that
$$\frac{\|f * f\|_{L^\infty}\, \|f * f\|_{L^1}}{\|f * f\|_{L^2}^2} \ge 1 + \delta\,?$$
They produce an example ($f(x) = 1/\sqrt{2x}$ if 0 < x < 1/2, f(x) = 0 otherwise) showing that δ cannot exceed 0.13.
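The quantities in the Martin & O'Bryant question are easy to approximate numerically. A sketch, using a box function as a test case (for f = 2·1_{[0,1/2]} the autoconvolution is an explicit tent function and the ratio equals 3/2 exactly):

```python
import numpy as np

# Discretize f = 2 * 1_{[0,1/2]} (total integral 1) and estimate the ratio
# ||f*f||_inf * ||f*f||_1 / ||f*f||_2^2 via Riemann sums. Here f*f is a tent
# of height 2 supported on [0,1], so the exact ratio is (2 * 1) / (4/3) = 3/2.
h = 1e-4
x = np.arange(0, 0.5, h)
f = np.full_like(x, 2.0)
g = np.convolve(f, f) * h          # Riemann-sum approximation of f*f
sup_norm = g.max()                 # ~ 2
l1_norm = g.sum() * h              # ~ 1
l2_sq = (g**2).sum() * h           # ~ 4/3
print(sup_norm * l1_norm / l2_sq)  # ~ 1.5
```

Swapping in other nonnegative test functions (discretized on the same grid) gives quick empirical upper bounds for δ.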

Update (Aug 2022). Ramos & Madrid (Comm. Pure Appl. Anal., 2021) proved that the optimal constant c in
$$\min_{0 \le t \le 1} \int_{\mathbb{R}} f(x) f(x+t)\, dx \le c\, \|f\|_{L^1}^2$$
is strictly less than what one obtains from the argument by Rick and myself.

Update (Nov 2022). Ethan White ('An almost-tight L2 autoconvolution inequality') essentially solves an L²-version of the convolution problem (in the sense of getting the sharp constant up to 4 digits).

18. Maxwell’s Conjecture on Point Charges


This is a strikingly simple question that can be traced to the work of James Clerk
Maxwell. Let x₁, x₂, x₃ ∈ R² and define f : R² → R via
$$f(x) = \sum_{i=1}^{3} \frac{1}{\|x - x_i\|}.$$

How often can ∇f vanish or, phrased differently, how many critical points can there
be? It is easy to see that all the critical points have to be in the convex hull of
the three points of x1 , x2 , x3 . Once one does some basic experimentation, one sees
that there seem to be at most 4 critical points (if the triangle is very flat, it is
possible that there are only 2). Gabrielov, Novikov, Shapiro (Proc. London Math. Soc., 2007) showed that there are at most 12. A more recent 2015 Physica D paper of Y.-L. Tsai (Maxwell's conjecture on three point charges with equal magnitudes) shows that there are at most 4 critical points – the proof is heavily computational and seems hard to generalize. Is there a 'simple' proof for n = 3? What about other potential functions? Now suppose there are 4 points x₁, x₂, x₃, x₄. In that case, it is not even known whether the number of critical points is finite! There are even related one-dimensional problems that are wide open such as

Conjecture (Gabrielov, Novikov, Shapiro). Let (x₁, y₁), ..., (x_ℓ, y_ℓ) ∈ R². Then for any choice of (ζ₁, ..., ζ_ℓ) and any α ≥ 1/2, the function V_α : R → R given by
$$V_\alpha(x) = \sum_{i=1}^{\ell} \frac{\zeta_i}{\left((x - x_i)^2 + y_i^2\right)^{\alpha}}$$
has at most 2ℓ − 1 real critical points.

Update (July 2023) . Vladimir Zolotov (‘Upper bounds for the number of isolated
critical points via Thom-Milnor theorem’) gives a number of very strong bounds
for this and related problems.
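The one-dimensional conjecture is easy to probe numerically. A sketch (an assumed setup with random positive charges and α = 1, counting sign changes of the derivative on a fine grid):

```python
import numpy as np

# Count critical points of V_alpha for l = 3 random charges by counting sign
# changes of the numerical derivative on a fine grid. The conjecture predicts
# at most 2l - 1 = 5; the count is odd since V > 0 and V decays at infinity.
rng = np.random.default_rng(0)
l, alpha = 3, 1.0
xs = rng.uniform(-1, 1, l)      # positions x_i
ys = rng.uniform(0.1, 1, l)     # offsets y_i
zs = rng.uniform(0.5, 2, l)     # charges zeta_i

t = np.linspace(-10, 10, 2_000_001)
V = sum(z / ((t - x) ** 2 + y**2) ** alpha for x, y, z in zip(xs, ys, zs))
dV = np.diff(V)
crit = int(np.sum(np.sign(dV[1:]) != np.sign(dV[:-1])))
print(crit)  # an odd number, at most 2l - 1 = 5 if the conjecture holds here
```

Randomizing the seed never seems to produce more than 2ℓ − 1 sign changes, which is what the conjecture predicts.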

19. Quadratic Crofton and sets that barely see themselves


This is a problem at the interface of the calculus of variations and integral/random geometry. The problem is easily stated in any dimension but already the case of convex domains Ω ⊂ R² is interesting. Let us first consider a one-dimensional rectifiable set L ⊂ R² with length L > 0. We define a notion of energy as
$$E(L) = \int_{L} \int_{L} \frac{|\langle n(x), y - x \rangle\, \langle y - x, n(y) \rangle|}{\|x - y\|^3}\, d\sigma(x)\, d\sigma(y),$$
where x, y are elements in the set, n(x) and n(y) denote the normal vectors in x and y, respectively, and σ is the arclength measure.

(Figure: two points x, y ∈ L together with their normal vectors n(x), n(y).)

One way of thinking about the functional is that it measures the behavior of the set
when projected onto itself in the following sense: consider x, y ∈ L and let us take
small neighborhoods around x and y (we may think of these as approximately being
short line segments). We could then ask for the expected size of the projection of
one such line segment onto the other under a ‘random’ projection. Equivalently,
we can ask for the likelihood that a ‘random’ line intersects both line segments.
Questions. Given a convex set Ω, which L ⊂ Ω minimize E(L)?
A partial answer ('Quadratic Crofton and sets that see themselves as little as possible') is given by the following theorem, which shows that, at least for a positive proportion of lengths L, the answer is fairly simple.
Theorem. Let Ω ⊂ Rⁿ be a bounded, convex domain with C¹-boundary. There exists a constant c_Ω such that if
$$0 \le L - m|\partial\Omega| \le c_\Omega \quad \text{for some } m \in \mathbb{N},$$
then among all (n − 1)-dimensional piecewise differentiable Σ ⊂ Ω with surface area H^{n−1}(Σ) = L the energy
$$E(\Sigma) = \int_{\Sigma} \int_{\Sigma} \frac{|\langle n(x), y - x \rangle\, \langle y - x, n(y) \rangle|}{\|x - y\|^{n+1}}\, d\sigma(x)\, d\sigma(y)$$
is minimized by m copies of the boundary and a segment of a hyperplane.

One can also rephrase the question (see the referenced paper). By Crofton's formula, the expected number of intersections of a 'random' line with a set is only determined by the co-dimension 1 volume of the set (i.e. the length in R²).
Questions. Which set minimizes the variance of intersections?
One motivating image could be the following. Suppose we cover Earth with sensors
to detect cosmic radiation: how should we arrange the sensors so that a random
ray is roughly captured by the same number of sensors? The real-life case of this
scenario is covered by the above Theorem: since sensors are expensive, we are in
the setting where 0 < L ≪ 1 and one should put the sensors in a single hyperplane.

20. Opaque Sets


This is very much related to the previous question (in fact, it inspired work that led
to the problems in the previous section). The problem goes back to a 1916 paper of
Mazurkiewicz and a 1959 paper of Bagemihl. Already the special case of the unit
square is interesting.
What is the length of the shortest one-dimensional set such that
each line intersecting [0, 1]2 also intersects the set?

Figure 15. Left: three sides of the boundary give an opaque set with length 3. Right: the conjectured shortest opaque set for the unit square with length $\sqrt{2} + \sqrt{3/2} \sim 2.63$.

If Ω ⊂ R² is convex, then any opaque set has length at least |∂Ω|/2. It is known that this cannot be improved in general (take an extremely thin rectangle and use two short sides and one long side). There are many proofs of this (often using variants of Crofton's Formula or, equivalently, averaging over projections). What is somewhat shocking is that, even for the unit square [0, 1]², the best lower bounds aren't much better: the best bound is something like 2.00002 [1]. It would be very nice to have some better bounds.

21. A Greedy Energy Sequence on the Unit Interval


This is a very curious phenomenon. Identify the one-dimensional torus T with T ≅ [0, 1] and consider the function f : T → R given by
$$f(x) = x^2 - x + \frac{1}{6}.$$
[1] Kawamura, Akitoshi; Moriyama, Sonoko; Otachi, Yota; Pach, Janos (2019), A lower bound on opaque sets, Computational Geometry, 80: 13-22.

(The full phenomenon seems to hold for much more general functions but this seems to be the easiest special case.) This function has a maximum at 0 and mean value 0. We can now consider sequences obtained in the following way:
$$x_{n+1} = \arg\min_{x \in \mathbb{T}} \sum_{k=1}^{n} f(x - x_k).$$
What happens is that the arising sequence $(x_n)_{n=1}^{\infty}$ seems to be very regularly distributed in all the usual ways: for any subinterval J ⊂ [0, 1], we have
$$\#\{1 \le i \le N : x_i \in J\} \sim |J| \cdot N + \text{very small error term}.$$
There are many other ways of phrasing the phenomenon, for example it seems to be that
$$\sum_{k,\ell=1}^{N} f(x_k - x_\ell) \quad \text{grows very slowly (logarithmically?) in } N.$$

We only know the much weaker bound
$$\sum_{k,\ell=1}^{N} f(x_k - x_\ell) \lesssim N.$$

Another observation is that
$$\left\| \sum_{k=1}^{n} f(x - x_k) \right\|_{L^\infty} \quad \text{grows very slowly (logarithmically?) in } n.$$
The best known result is in a paper with Louis Brown (J. Complexity) that shows
$$\left\| \sum_{k=1}^{n} f(x - x_k) \right\|_{L^\infty} \lesssim n^{1/3} \quad \text{for infinitely many } n.$$
We note that this is the sum of n functions of size ∼ 1: for it to grow only logarithmically, a lot of cancellation has to take place. The function f has mean value 0, so cancellation implies that the x_k have to be somehow evenly spread. One could phrase many of these things in terms of exponential sum estimates which seem to be small, i.e.
$$\left| \sum_{k=1}^{n} e^{2\pi i \ell x_k} \right| \quad \text{is relatively small.}$$
One explicit conjecture one could make is
$$\sum_{\ell=1}^{n} \frac{1}{\ell} \left| \sum_{k=1}^{n} e^{2\pi i \ell x_k} \right| \lesssim \log n$$
but I would be interested in anything that could be said. Extensions to higher dimensions or other domains would be very, very interesting. I first studied (Monatshefte Math. 2020) such sequences for the special case (long story, explained in the paper why)
$$f(x) = -\log|2\sin(\pi x)|.$$
Florian Pausinger then proved that, when initialized on sets with exactly one element, sequences of this type are always variations of the van der Corput sequence (Annali di Matematica Pura ed Applicata, 2020). I later realized that this function f can be much more elegantly phrased in the complex plane and that led to nearly optimal results (arXiv, June 2020) for this particular function – but the proof is quite special and uses a number of tricks that are highly tailored to this particular function; the phenomenon seems to be much, much more robust. Louis Brown and I (J. Complexity, 2020) proved Wasserstein bounds that get really good in dimensions d ≥ 3. But the one-dimensional problem seems to be hard and quite interesting.
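The construction itself is a few lines of code. A sketch (restricting the arg min to a fine grid, which is only an approximation of the true minimizer):

```python
import numpy as np

# Greedy minimization for f(x) = x^2 - x + 1/6 on T = [0,1): each new point
# minimizes sum_k f(x - x_k) over a grid. The resulting points end up far
# more evenly distributed than iid random points.
frac = lambda t: t - np.floor(t)
f = lambda t: frac(t) ** 2 - frac(t) + 1 / 6

grid = np.arange(0.0, 1.0, 1e-4)
pts = [0.0]
energy = f(grid - pts[0])
for _ in range(400):
    new = float(grid[np.argmin(energy)])
    pts.append(new)
    energy += f(grid - new)

pts = np.array(pts)
N = len(pts)
disc = max(abs(np.sum(pts < x) - N * x) for x in np.linspace(0, 1, 101))
print(disc / N)  # much smaller than the ~ N^{-1/2} of random points
```

The running `energy` array is exactly the sum whose L^∞-norm is conjectured to grow only logarithmically.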

22. The Kritzinger sequence


Ralph Kritzinger ('Uniformly distributed sequences generated by a greedy minimization of the L2 discrepancy') defined the following sequence $(x_n)_{n=1}^{\infty}$. One starts with x₁ = 1/2 and then sets, in a greedy fashion,
$$x_{N+1} = \arg\min_{0 \le x \le 1} \left( -2 \sum_{n=1}^{N} \max\{x, x_n\} + (N+1)x^2 - x \right).$$

This seems maybe a bit arbitrary at first glance but arises naturally when trying to
pick xN +1 in such a way that the L2 −distance between the empirical distribution
and the uniform distribution is as small as possible (see the paper). What is
particularly nice about this greedy sequence is that its consecutive elements are 'nice':
$$\frac{1}{2},\ \frac{1}{4},\ \frac{5}{6},\ \frac{1}{8},\ \frac{7}{10},\ \frac{5}{12},\ \frac{13}{14},\ \ldots$$
We observe that x_n can be written as x_n = p/(2n) with p odd (additional cancellation may occur, so the denominator is always a divisor of 2n). The sequence seems to be very regularly distributed in the sense that
$$\max_{0 \le x \le 1} \left| \#\{1 \le i \le N : x_i \le x\} - Nx \right| \quad \text{as a function of } N \text{ is very small.}$$

Kritzinger proves
$$\max_{0 \le x \le 1} \left| \#\{1 \le i \le N : x_i \le x\} - Nx \right| \lesssim \sqrt{N}$$
but one could imagine the upper bound being as small as log N. It doesn't seem to matter much whether x₁ = 1/2. In fact, even starting with an arbitrary initial set {x₁, ..., x_m} ⊂ [0, 1], one observes this high degree of regularity. Why?
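The greedy rule is easy to run. A sketch (arg min over a fine grid; note that in exact arithmetic the second step is a tie between 1/4 and its mirror image 3/4, and either branch exhibits the same behavior):

```python
import numpy as np

# Greedy Kritzinger sequence on a grid: minimize the stated objective at each
# step. Each x_n should, up to grid resolution, be an odd multiple of 1/(2n):
# the objective is piecewise smooth with concave kinks at the existing points,
# so the minimum sits at a stationary point x = (2k+1)/(2(N+1)).
grid = np.linspace(0, 1, 10**5 + 1)
xs = []
for N in range(7):
    obj = (N + 1) * grid**2 - grid - 2 * sum(np.maximum(grid, xn) for xn in xs)
    xs.append(float(grid[np.argmin(obj)]))

for n, x in enumerate(xs, start=1):
    print(n, x, 2 * n * x)  # 2*n*x is close to an odd integer
```

Replacing the starting configuration by an arbitrary initial set reproduces the same regularity.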

Update (July 2022). The Kritzinger sequence turns out to coincide with the sequence that one obtains when greedily minimizing the Wasserstein W2 distance between the empirical measure and the Lebesgue measure on [0, 1]. Using some other ideas I was able to show ('On Combinatorial Properties of Greedy Wasserstein Minimization') that for infinitely many N ∈ N
$$\max_{0 \le x \le 1} \left| \int_0^x \#\{1 \le i \le N : x_i \le y\} - Ny\; dy \right| \lesssim N^{1/3}.$$

This in particular implies that the sequence is quite a bit more regular than iid random points (for which this quantity would be ∼ N^{1/2} with overwhelming likelihood).

23. A Special Property that some Lattices have?


This is a purely geometric problem that arose out of some calculus of variations considerations (see the paper). Consider the standard hexagonal lattice Λ in R² and fix the density (say, the volume of each little triangle is 1). Let r > 0 be an arbitrary real number and consider
$$\Lambda_r = \{x \in \Lambda : \|x\| = r\}.$$

Figure 16. The Hexagonal Lattice and a slight perturbation.

We can now perturb the lattice a little bit: by this I mean that we perturb the basis
vectors a tiny bit (but in such a way that the density, the volume of a fundamental
cell, is preserved). This ‘wiggling of the lattice’ leads to a ‘wiggling of the points’
Λr (by this we mean exactly what it sounds like: each point in Λr has a basis
representation a1 v1 +a2 v2 where v1 , v2 are the basis vectors of the hexagonal lattice
and we now consider a1 w1 + a2 w2 where w1 , w2 are the slightly perturbed vectors).
After wiggling the points in this way, some will move closer and some will move
further away.
Theorem (Faulhuber & S, J. Stat. Phys). The sum of the distances increases under small perturbations.
I believe this to be quite a curious property: it shows, in a certain sense, ‘points
in the hexagonal lattice are, on average, closer to the origin than the points of any
nearby lattice’. It seems a bit like optimal sphere packing but also like something
else. I would believe that most of the lattices that are optimal w.r.t. sphere packing
have this property but it’s not clear to me whether there are others.
Question. Which other lattices have this property? Even in R3
this already seems tricky. What about D4 or E8? Leech?

Our proof for the hexagonal lattice is actually quite simple: the set Λ_r has a rotational symmetry by 120° so instead of studying Λ_r, it suffices to study a triple of points having this symmetry and then the computation becomes explicit. In principle this should also work for other lattices but one has to identify proper symmetries and then see whether one can do the computations.
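For the first shell (the six nearest neighbors) the statement can be checked directly. A sketch, perturbing the basis by random area-preserving linear maps:

```python
import numpy as np

# First shell of the hexagonal lattice: six points a1*v1 + a2*v2 at distance 1.
# Replace (v1, v2) by (M v1, M v2) with det M = 1 (density preserved) and
# check that the sum of distances of the wiggled points to the origin never
# decreases.
v1 = np.array([1.0, 0.0])
v2 = np.array([0.5, np.sqrt(3) / 2])
shell = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]
base = sum(np.linalg.norm(a * v1 + b * v2) for a, b in shell)  # = 6

rng = np.random.default_rng(1)
for _ in range(1000):
    M = np.eye(2) + rng.normal(scale=1e-3, size=(2, 2))
    M /= np.sqrt(abs(np.linalg.det(M)))        # renormalize to det M = 1
    w1, w2 = M @ v1, M @ v2
    s = sum(np.linalg.norm(a * w1 + b * w2) for a, b in shell)
    assert s >= base - 1e-12
print("no perturbation decreased the shell sum; base =", base)
```

The same loop with a different `shell` (and different basis vectors) is a quick way to hunt for other lattices with this property.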

Update (Dec 2022) Paige Helms (‘Extremal Property of the Square Lattice’,
arXiv Dec 2022) has established a similar result for the square lattice Z2 .

24. Roots of Classical Orthogonal Polynomials


Consider the differential equation −(p(x)y′)′ + q(x)y′ = λy, where p(x) is a polynomial of degree at most 2 and q(x) is a polynomial of degree at most 1. This setting includes the classical Jacobi polynomials, Hermite polynomials, Legendre polynomials, Chebychev polynomials and Laguerre polynomials.
In 1885, Stieltjes studied a special case, the Jacobi polynomials given by
$$-(1 - x^2)\, y''(x) - (\beta - \alpha - (\alpha + \beta + 2)x)\, y'(x) = n(n + \alpha + \beta + 1)\, y(x)$$
and proved that the solution, a polynomial of degree n, has the following nice interpretation: its roots are exactly the minimal energy configuration of
$$E = -\sum_{\substack{i,j=1 \\ i \ne j}}^{n} \log|x_i - x_j| - \sum_{i=1}^{n} \left( \frac{\alpha+1}{2} \log|x_i - 1| + \frac{\beta+1}{2} \log|x_i + 1| \right).$$

Differentiating E, this results in an interesting relationship between the roots:
$$\sum_{\substack{k=1 \\ k \ne i}}^{n} \frac{1}{x_k - x_i} = \frac{\alpha+1}{2}\, \frac{1}{x_i - 1} + \frac{\beta+1}{2}\, \frac{1}{x_i + 1} \qquad \text{for all } 1 \le i \le n.$$

I managed to extend this result to all classical polynomials (Proc. AMS 2018).
Theorem. Let p(x), q(x) be polynomials of degree at most 2 and 1, respectively. Then the set {x₁, ..., x_n}, assumed to be in the domain of definition, satisfies
$$p(x_i) \sum_{\substack{k=1 \\ k \ne i}}^{n} \frac{2}{x_i - x_k} = q(x_i) - p'(x_i) \qquad \text{for all } 1 \le i \le n$$

if and only if
$$y(x) = \prod_{k=1}^{n} (x - x_k) \quad \text{solves} \quad -(p(x)y')' + q(x)y' = \lambda y \quad \text{for some } \lambda \in \mathbb{R}.$$

What’s particularly interesting is that one can use this to define a system of ODEs
for which the stationary state corresponds exactly to roots of classical orthogonal
polynomials. More precisely, consider
n
d X 2
xi (t) = −p(xi ) + p′ (xi (t)) − q(xi (t)) (⋄)
dt k=1
x k (t) − x i (t)
k̸=i

We can then show that the underlying system of ODEs converges exponentially
quickly to the true solution.

Theorem. The system (⋄) converges for all initial values x₁(0) < · · · < x_n(0) to the zeros x₁ < · · · < x_n of the degree n polynomial solving the equation. Moreover,
$$\max_{1 \le i \le n} |x_i(t) - x_i| \le c\, e^{-\sigma_n t},$$
where c > 0 depends on everything and σ_n ≥ λ_n − λ_{n−1}.


This allows one to find roots of an orthogonal polynomial pn by simply running an
ODE. It is actually completely independent of pn−1 or pn+1 , there are no recurrence
relations, no solution formulas, it’s just an ODE.
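For Legendre polynomials (p(x) = 1 − x², q = 0, λ = n(n+1)) the scheme fits in a few lines. A sketch with forward Euler:

```python
import numpy as np

# Integrate the system for p(x) = 1 - x^2, q(x) = 0 (Legendre) with forward
# Euler. A fixed point of the Euler map is an exact stationary point of the
# ODE, so the iterates converge to the true zeros of P_n.
n, dt, steps = 5, 1e-4, 50_000
x = np.linspace(-0.9, 0.9, n)               # ordered initial values
for _ in range(steps):
    diff = x[:, None] - x[None, :]          # diff[i, k] = x_i - x_k
    np.fill_diagonal(diff, np.inf)          # drop the k = i term
    S = np.sum(-2.0 / diff, axis=1)         # sum_k 2 / (x_k - x_i)
    x = x + dt * (-(1 - x**2) * S - 2 * x)  # p'(x) = -2x, q = 0

roots = np.polynomial.legendre.leggauss(n)[0]   # zeros of P_5
print(np.max(np.abs(np.sort(x) - roots)))       # essentially machine precision
```

No recurrence relations are used anywhere; only the vector field of (⋄) is evaluated.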
Question. Do analogous systems of ODEs exist for other types of
orthogonal polynomials? Is it possible to get results in a similar
spirit?

Figure 17. Evolution of the system of ODEs for 0 ≤ t ≤ 0.01 approaches the zeros of the Legendre polynomial P₁₀₀ in (−1, 1).

25. An Estimate for Probability Distributions


This question seems quite elementary: it's really a question about real functions. Suppose we are given a probability distribution f(x)dx on the positive real line [0, ∞) and X, Y are independent random variables drawn from that distribution. We can try to analyze the event
$$\{X + Y \ge 2z\},$$
where z is some large parameter. There are two ways this event can happen: either one of the random variables is smaller than z (in which case the other one has to be bigger) or they are both bigger than z. A fascinating result of Feldheim & Feldheim (arXiv:1609.03004) says that

$$\limsup_{z \to \infty} \frac{\mathbb{P}(X + Y \ge 2z \text{ and } \min(X, Y) \le z)}{\mathbb{P}(X + Y \ge 2z \text{ and } \min(X, Y) \ge z)} = \infty.$$

I would like to understand whether one can quantify how this result goes to infinity.
Suppose we have a random variable that is not compactly supported (and maybe
has a smooth density?)
• Question 1. Is there always a z > 0 such that
$$\frac{\mathbb{P}(X + Y \ge 2z \text{ and } \min(X, Y) \le z)}{\mathbb{P}(X + Y \ge 2z \text{ and } \min(X, Y) \ge z)} \ge \frac{(2 \log 2)\, z}{\mathrm{med}(X)}\,?$$
• Question 2. Is there always a z > 0 such that
$$\frac{\mathbb{P}(X + Y \ge 2z \text{ and } \min(X, Y) \le z)}{\mathbb{P}(X + Y \ge 2z \text{ and } \min(X, Y) \ge z)} \ge \frac{2z}{\mathbb{E}X}\,?$$
The numbers come from assuming that exponential distributions are the worst case (they might not be). In case the constants are wrong: is the growth of the RHS linear in z? If that is wrong: what is it? Note that all these probabilities can be written explicitly as integrals of f(x)f(y)dxdy over certain regions.
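For the exponential distribution the ratio can be computed in closed form: with X, Y ∼ Exp(1) one gets P(X+Y ≥ 2z, min ≥ z) = e^{−2z} and P(X+Y ≥ 2z) = (1+2z)e^{−2z}, so the ratio is exactly 2z, matching Question 2 with EX = 1. A quick Monte Carlo sanity check:

```python
import numpy as np

# Monte Carlo check: for Exp(1) variables the ratio
# P(X+Y >= 2z, min <= z) / P(X+Y >= 2z, min >= z) equals 2z exactly.
rng = np.random.default_rng(0)
X = rng.exponential(size=2_000_000)
Y = rng.exponential(size=2_000_000)
for z in [1.0, 2.0, 3.0]:
    big = X + Y >= 2 * z
    num = np.mean(big & (np.minimum(X, Y) <= z))
    den = np.mean(big & (np.minimum(X, Y) >= z))
    print(z, num / den)  # close to 2z
```

Swapping in other samplers gives quick empirical evidence for (or against) the linear growth of the RHS.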

I was originally interested in whether the assumption of the random variable not
having a compactly supported distribution is necessary. It turns out that it is: I
proved (Stat. Prob. Lett.)
Theorem. If X, Y are i.i.d. random variables drawn from an absolutely continuous probability distribution with density f(x)dx on R≥0, then
$$\sup_{z > 0}\, \mathbb{P}(X \le z \text{ and } X + Y \ge 2z) \ge \frac{1}{24 + 8 \log_2\left(\mathrm{med}(X)\, \|f\|_{L^\infty}\right)},$$
where med X denotes the median of the probability distribution. This estimate is sharp up to constants and the supremum can be restricted to 0 ≤ z ≤ med(X).
It would be interesting to know whether it is possible to determine the sharp constants and the extremal distribution.

26. Hermite-Hadamard Inequalities


The Hermite-Hadamard inequality states that if f : [a, b] → R is convex, then
$$\frac{1}{b-a} \int_a^b f(x)\, dx \le \frac{f(a) + f(b)}{2}.$$
It is not difficult: a convex function stays below a line. However, once one goes
to higher dimensions, things become extremely difficult. I proved (J. Geom. Anal,
2018) that if Ω ⊂ Rn is convex and f : Ω → R is convex and positive on ∂Ω, then
$$\frac{1}{|\Omega|} \int_{\Omega} f\, dx \le \frac{c_n}{|\partial\Omega|} \int_{\partial\Omega} f\, d\sigma,$$
where c_n is a universal constant. It was then shown by Beck, Brandolini, Burdzy, Henrot, Langford, Larson, Smits and S (J. Geom. Anal.) that this inequality does indeed hold for subharmonic functions as well and that $c_n \le 2n^{3/2}$. The sharp constant was obtained by Simon Larson (2020) who proved that c_n = n.

I proved in the original paper (J. Geom. Anal, 2018) that if f : Ω → R is merely
subharmonic, i.e. ∆f ≥ 0, then we still have
$$\int_{\Omega} f\, dx \le c_n\, |\Omega|^{1/n} \int_{\partial\Omega} f\, d\sigma.$$

Jianfeng Lu and I then proved (Proc. AMS 2020) that one can take c_n = 1. Jeremy Hoskins and I proved (arXiv, Dec 2019) that $c_2 < 1/\sqrt{2\pi} \sim 0.39\ldots$ and have obtained a candidate domain that leads to a constant of ∼ 0.358. We believe that this is probably close to the best possible domain; it is shown in the figure. It's currently not even known whether an extremal shape exists. Does it exist? And does the curvature of its boundary vanish at exactly one point?
Figure 18. A candidate for the extremal shape in n = 2 dimensions.

Question. What can be said about the extremal domain? Does


its curvature vanish in exactly one point?
There is also another interesting phenomenon: all these inequalities are proven for subharmonic functions. This is of course more general since every convex function is subharmonic but not vice versa. It is also clear from the characterization of these inequalities that the extremal functions for the subharmonic Hermite-Hadamard inequalities are not going to be convex, they will merely be harmonic. So we would expect stronger statements in the case where the function f is convex.
Question. What are the optimal constants for the Hermite-Hadamard inequalities
$$\frac{1}{|\Omega|} \int_{\Omega} f\, dx \le \frac{c_n}{|\partial\Omega|} \int_{\partial\Omega} f\, d\sigma$$
and
$$\int_{\Omega} f\, dx \le c_n\, |\Omega|^{1/n} \int_{\partial\Omega} f\, d\sigma$$
when f is assumed to be convex (and Ω ⊂ Rⁿ is convex)?
We know, from the subharmonic case, that c_n ≤ n for the first inequality and c_n ≤ 1 for the second inequality. But at this point even the growth/decay of these constants as a function of n is not clear when we restrict to convex functions. I mentioned in the original paper (J. Geom. Anal, 2018) that this problem seems to have some connection to an optimal transport problem where one transports the interior volume to the surface along lines in the most even way.

27. A Strange Inequality for the number of critical points


A while back I ran into a curious inequality – it sort of dropped out of other things
I was doing (‘Wasserstein Distance, Fourier Analysis . . . ’, 2018).
Theorem. Let f : T → R be continuously differentiable with mean value 0. Then
$$(\#\,\text{critical points of } f) \cdot \|f\|_{L^2(\mathbb{T})} \gtrsim \frac{\|f'\|_{L^1(\mathbb{T})}^2}{\|f'\|_{L^\infty(\mathbb{T})}}.$$

It says something interesting: if a function has large derivatives, then it is either big or it has a lot of wiggles (= critical points). I always thought that this was a really curious kind of statement. I would like to understand this better. Note that there is a relatively easy inequality
$$(\#\,\text{critical points of } f) \cdot \|f\|_{L^\infty(\mathbb{T})} \gtrsim \|f'\|_{L^1(\mathbb{T})}.$$

Are there more such inequalities? Are they part of a family? I would be especially
interested in higher-dimensional analogues.

28. An improved Isoperimetric Inequality?


The classical isoperimetric inequality in R^d says that a large set has a large boundary: for Ω ⊂ R^d,
$$|\partial\Omega| \ge c \cdot |\Omega|^{\frac{d-1}{d}}.$$

Let now Ω ⊂ Rd be a bounded domain with smooth boundary and let x ∈ Ω be an


arbitrary point in the domain. We define a subset (∂Ω)x ⊆ ∂Ω via

(∂Ω)x = {y ∈ ∂Ω : the geodesic from x to y arrives non-tangentially} .

We note that the geodesic is defined as the shortest path γ : [0, 1] → Ω with
γ(0) = x and γ(1) = y. We say that it arrives non-tangentially if ⟨γ ′ (1), ν⟩ ̸= 0,
where ν is the normal vector of ∂Ω in y. Of course (∂Ω)x is a subset of the full
boundary ∂Ω. We were interested in whether this non-tangential boundary (∂Ω)x
still obeys some form of isoperimetric principle.

Figure 19. Various examples of (∂Ω)_x.

It is not terribly difficult to show (and was done in 'The Boundary of a Graph and its Isoperimetric Inequality', Jan 2022) that for convex domains Ω ⊂ R^d
$$\forall\, x \in \Omega \qquad |(\partial\Omega)_x| \ge (d-1)\, \frac{|\Omega|}{\mathrm{diam}(\Omega)}.$$
The constant d − 1 cannot be optimal (but is optimal up to a factor of 2 in d = 2). It seems natural to ask: what is the optimal constant c_d such that for convex Ω ⊂ R^d
$$\forall\, x \in \Omega \qquad |(\partial\Omega)_x| \ge c_d\, \frac{|\Omega|}{\mathrm{diam}(\Omega)}\,?$$
The other natural question is to ask what sort of conditions one needs on the domain
Ω for this non-tangential isoperimetric principle to hold.

29. Geometric Probability


Let Ω ⊂ R^d be a domain with finite volume. Suppose X, Y are two i.i.d. random variables that are uniformly distributed in Ω. It is clear by scaling that
$$\mathbb{E}\, \|X - Y\|_{\ell^2(\mathbb{R}^d)} \ge c_d\, |\Omega|^{1/d}.$$
An old result of Blaschke shows that the sharp constant c_d is given by the ball. This is perhaps not too surprising: two randomly chosen points from the ball are closer to each other than two randomly chosen points from any other domain of the same volume (see also Bonnet, Gusakova, Thäle, Zaporozhets, arXiv 2020).
Problem. One would naturally expect an inequality of the flavor
$$\mathbb{V}\, \|X - Y\|_{\ell^2(\mathbb{R}^d)} \ge c_d\, |\Omega|^{2/d}.$$
Which domain minimizes the variance, which gives the smallest constant c_d? Is it even clear that the extremal domain is convex?

30. The Polarization Constant in Rn


The following question is due to Benitez, Sarantopoulos & Tonge ('Lower bounds for norms of products of polynomials', Math. Proc. Cambridge Philos. Soc. 1998).
Problem. Let {v₁, ..., v_n} ⊂ Rⁿ be any set of n vectors all of which have length 1. Is
$$\max_{\|x\| = 1} |\langle x, v_1 \rangle \cdot \langle x, v_2 \rangle \cdots \langle x, v_n \rangle| \ge n^{-n/2}\,?$$

The inequality, if true, would be sharp for orthonormal systems. Pinasco ('The n-th linear polarization constant...', arXiv Aug 2022) has proven the result for n ≤ 14. It is also relatively easy to prove the result up to constants: by averaging over points on the sphere and linearity of expectation, we see that
$$\mathbb{E} \log\left( |\langle x, v_1 \rangle \cdot \langle x, v_2 \rangle \cdots \langle x, v_n \rangle| \right) = n \int_{S^{n-1}} \log |\langle x, v_1 \rangle|\, dx.$$

This integral looks unpleasant but we know that, in sufficiently high dimensions, the inner product |⟨x, v⟩| is distributed like a Gaussian with variance 1/n. Thus we expect that
$$\int_{S^{n-1}} \log |\langle x, v_1 \rangle|\, dx \sim \int_{\mathbb{R}} \frac{\sqrt{n}\, e^{-\frac{n x^2}{2}}}{\sqrt{2\pi}} \log |x|\, dx.$$
The integral can be evaluated:
$$\int_{\mathbb{R}} \frac{\sqrt{n}\, e^{-\frac{n x^2}{2}}}{\sqrt{2\pi}} \log |x|\, dx = -\frac{\gamma + \log 2}{2} - \frac{1}{2} \log n,$$
where γ ∼ 0.577 . . . is the Euler-Mascheroni constant. Thus

$$\max_{\|x\| = 1} |\langle x, v_1 \rangle \cdot \langle x, v_2 \rangle \cdots \langle x, v_n \rangle| \ge 0.52^n \cdot n^{-n/2}.$$

Note that several arguments leading to better constants are known (see Pinasco).
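The sharpness claim for orthonormal systems is a one-line check: for v_i = e_i the product is Π|x_i|, which by AM–GM is maximized on the unit sphere at the diagonal x = (1, ..., 1)/√n with value exactly n^{−n/2}. A sketch:

```python
import numpy as np

# Orthonormal case v_i = e_i: the product of the inner products is prod |x_i|.
# By AM-GM it is at most n^{-n/2} on the unit sphere, attained at the diagonal.
n = 5
bound = n ** (-n / 2)
x_star = np.ones(n) / np.sqrt(n)
print(np.prod(np.abs(x_star)), bound)   # equal

rng = np.random.default_rng(0)
v = rng.normal(size=(10**6, n))
v /= np.linalg.norm(v, axis=1, keepdims=True)
print(np.prod(np.abs(v), axis=1).max() <= bound + 1e-12)  # True
```

For general unit vectors v₁, ..., v_n one would replace the product of coordinates by the product of inner products and maximize over the sphere.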

31. Bad Science Matrices


Suppose A ∈ R^{n×n} has rows that are normalized in ℓ²(Rⁿ). How large can
$$\beta(A) = \frac{1}{2^n} \sum_{x \in \{-1,1\}^n} \|Ax\|_{\ell^\infty} \quad \text{be?}$$

This has a number of different interpretations: as a linear map that sends the unit
cube {−1, 1}^n to points with, typically, at least one large coordinate. It can also be
understood as a series of n statistical tests that, while individually fair, are likely
to have at least one produce a small p−value when tested on a sequence of fair
coin flips. Various other geometric interpretations are conceivable. I proved (‘Bad
Science Matrices’) that
    max_{A∈R^{n×n}} (1/2^n) Σ_{x∈{−1,1}^n} ∥Ax∥_{ℓ∞} = (1 + o(1)) · √(2 log n).

The problem is that

(1) the o(1) term is too big: the optimal rate is attained for random ±1/√n
matrices but the extremizers do not seem to behave like that at all, they
seem to be very structured and very much not random
(2) needless to say, it would be nice to refine the o(1) term but this might be
extremely complicated; maybe it can be done when n is a power of 2 via
some clever recursive construction
(3) it is not at all clear how to find candidate extremizers and, even if they are
identified, it is not clear how to prove that they are extremal
The best bounds I currently know in dimensions 2 ≤ n ≤ 8 are as follows.

n 2 3 4 5 6 7 8
βn ≥ 1.41 1.57 1.73 1.79 1.86 1.93 2

Figure 20. Lower bounds for βn = max β(A) when n ≤ 8
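The quantity β(A) can be evaluated exactly for small n by brute force over all 2^n sign vectors; for instance, the normalized 2×2 Hadamard matrix already attains the n = 2 entry of the table. A minimal sketch (the choice of test matrix is illustrative):

```python
import itertools
import math
import numpy as np

def beta(A):
    # average of the sup-norm of Ax over all 2^n sign vectors x
    n = A.shape[1]
    total = 0.0
    for x in itertools.product([-1, 1], repeat=n):
        total += np.abs(A @ np.array(x, dtype=float)).max()
    return total / 2 ** n

# rows normalized in l2(R^2); this attains the n = 2 lower bound 1.41...
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / math.sqrt(2)
print(beta(H))  # sqrt(2) = 1.4142...
```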

To illustrate the difficulty, the best example for n = 5 I currently know is

    A = (1/(2√3)) ·
        ⎡  2    2    0    0    2 ⎤
        ⎢ −2    2    0    2    0 ⎥
        ⎢ −2    0    0   −2    2 ⎥
        ⎢ √3    0  −√3   √3   √3 ⎥
        ⎣  0   √3   √3  −√3  −√3 ⎦

which shows that

    max_{A∈R^{5×5}} β(A) ≥ (2 + 3√3)/4 ∼ 1.799 . . . .
It appears magnificently structured but it’s maybe less clear what the structure is.

32. Sums of Square Roots Close to an Integer


This problem is somewhere between Analysis, Fourier Analysis and Number Theory.
It seems to originate in Computer Graphics (comparing distances in Euclidean
spaces) and is a special case of a well-known problem in complexity theory.

If we have the sum of k square roots of integers 1 ≤ ai ≤ n, how close can this be
to an integer? For example
    √3 + √20 + √23 ∼ 11 + 0.0000182859.
Note that if all the integers are squares themselves, then this sum of square roots is
an integer and we are not interested in that case, we are interested in ‘near-misses’.
Problem. Fix k ∈ N and use ∥x∥ to denote the distance between
x ∈ R and the nearest integer. How does
    min {∥√a_1 + · · · + √a_k∥ : 1 ≤ a_1, . . . , a_k ≤ n, √a_1 + · · · + √a_k ∉ N}
scale as a function of n?

The case k = 1 is pretty easy: if n is not a square number, then

    ∥√n∥ ≳ n^{−1/2}.
The best case is clearly when n is merely 1 removed from a square number and
then this follows simply from Taylor expansion. The case k = 2 was considered by
Angluin-Eisenstat.
Theorem. If 1 ≤ a, b ≤ n are integers and √a + √b ∉ N, then

    ∥√a + √b∥ ≳ n^{−3/2}.
This can be seen by squaring √a + √b: if the number is extremely close to an
integer, then so is its square. The square is a + b + √(4ab) and we know from the
case k = 1 how close √(4ab) can be to an integer. This also gives us optimal cases:
we want to pick a and b so that 4ab is very close to a square number.

k = 3 is the first open case. We know that if 1 ≤ a, b, c ≤ n are integers and
√a + √b + √c ∉ N, then

    ∥√a + √b + √c∥ ≳ n^{−7/2}
with the following nice argument that we learned from Arturas Dubickas and Roger
Heath-Brown (independently): multiplying over all 8 possible choices of signs,

    Π (±√a ± √b ± √c) ∈ Z,

and since each of the 8 factors is of size ≲ √n, none of these expressions can be closer
than ∼ n^{−7/2} to an integer without being one. Conversely, we have the following nice
example due to Nick Marshall
    ∥√((k−1)² + 2) + √((k+1)² + 2) + √((2k)² − 8)∥ ∼ 4/k⁵

which shows that

    n^{−7/2} ≲ ∥√a + √b + √c∥ ≲ n^{−5/2}.
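Both the near-miss at the start of this section and Marshall's family are easy to check in floating point (a sketch; double precision is ample at these scales):

```python
from math import sqrt

def dist_to_nearest_int(x):
    return abs(x - round(x))

# the near-miss from the beginning of the section
print(dist_to_nearest_int(sqrt(3) + sqrt(20) + sqrt(23)))  # ~ 1.8e-05

# Marshall's family: the distance to the nearest integer decays like 4/k^5
for k in (10, 20, 40):
    s = sqrt((k - 1) ** 2 + 2) + sqrt((k + 1) ** 2 + 2) + sqrt((2 * k) ** 2 - 8)
    print(k, dist_to_nearest_int(s) * k ** 5 / 4)  # ratios close to 1
```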

A natural guess would be n^{−3}: the numbers √a mod 1 are pretty rigid but if we
sum 3 of them, maybe there is enough randomness that they start behaving like iid
random variables would?

I proved (‘Sums of square roots that are close to an integer’) that, when k gets
large, there are examples with

    0 < ∥√a_1 + · · · + √a_k∥ ≲ n^{−c·k^{1/3}}.
Surely one would expect the true rate to be closer to n−k .

Update (April 2024). Siddharth Iyer (‘Distribution of sums of square roots modulo
1’) used an entirely different approach to show ≲ n^{−k/2}.
Update (June 2024). Additional results were obtained by Arturas Dubickas (J.
Complexity, 2024).

33. A Beurling estimate for GFA?


Suppose we are given n disks of radius 1 in the Euclidean plane R2 which are
arranged in such a way that the disks are all disjoint but touching in the sense that
they form one connected component (see the picture below for two examples). We
use x_1, . . . , x_n ∈ R² to denote the centers of these disks, let α > 0 be some arbitrary
real parameter and define the notion of energy E : R² \ {x_1, . . . , x_n} → R_{≥0}

    E(x) = Σ_{k=1}^{n} 1/∥x − x_k∥^α.

Figure 21. Left: a small number of disks colored by the color of


the incoming gradient flows. Black disks are not hit by any gradient
flows. More exposed disks are more likely to be hit. Right: several
thousand disks, 250 incoming gradient flows (equispaced in angle).

We now consider the following simple randomized procedure.


(1) Suppose x1 , . . . , xn have their center of mass in the origin (0, 0). (It’s not
important that it’s exactly the origin, just fixing translation invariance).
(2) Pick r ≫ e^{e^n} to be an extremely large number.
(3) Pick, uniformly at random, a point x_0 on the circle {x ∈ R² : ∥x∥ = r}.

(4) Consider the gradient ascent started in x0 with respect to the energy E.
This gives rise to a curve γ that starts in γ(0) = x0 and satisfies
γ̇(t) = (∇E)(γ(t)).
(5) We stop the curve once it is within distance 2 of one of the existing points
x1 , . . . , xn ∈ R2 . Once the curve stops, we are at distance 2 of exactly one
existing disk (with likelihood 1).

The main problem is now the following

Question. The ‘most popular’ disk, the one most likely to be hit:
how popular is it? What is the best possible upper bound on

    max_{1≤j≤n} P(gradient flow hits x_j)

in terms of n alone?

The question is a question of scaling in n, not about precise constants. It is kind of


clear what one would probably expect: the worst case should be when the n disks
are all arranged on a line and, in that case, the two endpoints are the most exposed
and should be the ones most likely to be hit. How likely that is will then depend
on α: the larger α, the more likely they are to be hit.

Motivation. Such a result could be understood as an analogue of Beurling’s


estimate for Brownian motion. I proved (‘Random Growth via Gradient Flow
Aggregation’) that when 0 < α < 1, then we have

    max_{1≤j≤n} P(gradient flow hits x_j) ≤ n^{(α−1)/(2α+2)}.

This estimate is nearly optimal when α is close to 0 but surely not optimal in
general. It also gives no nontrivial results when α ≥ 1 but one would expect n^{−c_α}
for some c_α > 0 and all 0 < α < ∞. When α → 0, the optimal bound has to be
close to n^{−1/2} and when α → ∞, there is no nontrivial bound, one only gets ≲ 1.
Such Beurling estimates would, in turn, give rise to improved diameter estimate for
the Gradient Flow Aggregation process, see the paper for more details.
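The procedure above is easy to play with numerically. A minimal sketch (the configuration of two touching disks, the choice α = 1, the starting point and the step size are all illustrative assumptions; following the normalized gradient traces the same curve as γ̇ = ∇E up to reparametrization):

```python
import numpy as np

centers = np.array([[0.0, 0.0], [2.0, 0.0]])  # two touching unit disks
alpha = 1.0

def grad_E(x):
    # gradient of E(x) = sum_k ||x - x_k||^(-alpha)
    d = x - centers
    r = np.linalg.norm(d, axis=1)
    return (-alpha * d / r[:, None] ** (alpha + 2)).sum(axis=0)

x, steps = np.array([6.0, 3.0]), 0
while np.linalg.norm(x - centers, axis=1).min() > 2 and steps < 10000:
    g = grad_E(x)
    x += 0.01 * g / np.linalg.norm(g)  # unit-speed ascent step
    steps += 1

# the flow terminates within distance 2 of exactly one of the disks
print(steps, np.argmin(np.linalg.norm(x - centers, axis=1)))
```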

Part 3. Fourier Analysis


34. An Exponential Sum for Sequences of Reals
This question is motivated by an inequality I proved for sequences exhibiting Poissonian
Pair Correlation (J. Number Theory, 2020). The special role that √n mod 1
plays in these types of gap statistics suggests that for x_n = √n, uniformly in N,

    Σ_{k=1}^{N} |(1/N) Σ_{n=1}^{N} e^{2πik x_n}|² ≲ 1.

Is this true? If so, then this would be best possible. Is it possible to describe
other sequences having this property? Such sequences are candidates for having
interesting gap statistics.
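The conjectured inequality is easy to probe numerically (an experiment, not evidence of a proof; the cutoff N = 200 is an arbitrary choice):

```python
import numpy as np

N = 200
x = np.sqrt(np.arange(1, N + 1))          # x_n = sqrt(n)
k = np.arange(1, N + 1)[:, None]          # frequencies k = 1, ..., N
inner = np.exp(2j * np.pi * k * x[None, :]).sum(axis=1)
total = np.sum(np.abs(inner / N) ** 2)
print(total)  # appears to stay of size O(1) as N grows
```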

35. The Bourgain-Clozel-Kahane Root Uncertainty Principle


Let f : R → R be in L¹(R) and even. We define

    A(f) := inf {r > 0 : f(x) ≥ 0 if |x| > r}
    A(f̂) := inf {r > 0 : f̂(y) ≥ 0 if |y| > r}.

We have
Theorem (Bourgain, Clozel & Kahane). Let f : R → R be a nonzero, integrable,
even function such that f(0) ≤ 0, f̂ ∈ L¹(R) and f̂(0) ≤ 0. Then

    A(f)A(f̂) ≥ 0.1687,

and 0.1687 cannot be replaced by 0.41.
Felipe Goncalves, Diogo Oliveira e Silva and I improved this lower bound to 0.2025
and showed that it cannot be replaced by 0.353. What’s really quite interesting is
that the extremal function has to have infinitely many double roots. It would be
nice to understand how it behaves.

There are now several papers concerned with questions of this type. One that is
especially worth emphasizing is a very nice result by Cohn - Goncalves (‘An optimal
uncertainty principle in twelve dimensions via modular forms’). They prove that
the optimal constant is √2 in 12 dimensions.

36. An Uncertainty Principle


This question is motivated by a basic question: when averaging a function f by
convolving with a function u (resulting in the ‘averaged function’ u ∗ f ), what func-
tion u should one consider? The question is intentionally vague and I would be
interested in good axiomatic results (‘the ‘smoothest’ average should satisfy prop-
erties P1 , P2 , . . . and the only functions satisfying all these properties are ...’).

One such axiomatic approach resulted in a really interesting uncertainty principle


(‘Scale Space...’, arXiv, May 2020). It says that for α > 0 and β > n/2, there exists
c_{α,β,n} > 0 such that for all u ∈ L¹(R^n)

    ∥|ξ|^β · û∥^α_{L∞(R^n)} · ∥|x|^α · u∥^β_{L¹(R^n)} ≥ c_{α,β,n} ∥u∥^{α+β}_{L¹(R^n)}.

These inequalities arise naturally when looking for the ‘best’ or ‘smoothest’ convo-
lution kernel. I would be interested in what can be said about the extremizers.
Question. What can be said about the extremizers of this functional?
One interesting question would be whether the extremizer
‘exploits’ the L∞−bound fully and assumes it infinitely many times
such that |û(ξ)| ∼ |ξ|^{−β}. This would imply that the extremizer is
not smooth.
When n = 1 and β = 1, then for many values of α, the characteristic function of an
interval centered at the origin seems to play a special role. When n = 1 and β = 2,
then u(x) = 1 − |x| seems to play a special role (up to symmetries). It is not clear to
me whether they are global extremizers but it seems conceivable.

Discrete versions of these statements on Z have been proven in joint work with
Noah Kravitz (arXiv, July 2020). More precisely, we showed the following: suppose
u : {−n, . . . , n} → R is a symmetric function normalized to Σ_{k=−n}^{n} u(k) = 1. We
show that every convolution operator is not-too-smooth, in the sense that

    sup_{f∈ℓ²(Z)} ∥∇(f ∗ u)∥_{ℓ²(Z)} / ∥f∥_{ℓ²} ≥ 2/(2n + 1),

and we show that equality holds if and only if u is constant on the interval {−n, . . . , n}.
In the setting where smoothness is measured by the ℓ2 -norm of the discrete second
derivative and we further restrict our attention to functions u with nonnegative
Fourier transform, we establish the inequality
    sup_{f∈ℓ²(Z)} ∥∆(f ∗ u)∥_{ℓ²(Z)} / ∥f∥_{ℓ²(Z)} ≥ 4/(n + 1)²,

with equality if and only if u is the triangle function u(k) = (n + 1 − |k|)/(n + 1)².
It would be interesting to have variants of this type of statements for other ways
of measuring smoothness, other Lp −spaces.... – this seems to be quite interesting
and quite unexplored!
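The first of the two discrete inequalities can be checked on the Fourier side: f ↦ ∇(f ∗ u) is a Fourier multiplier with symbol (e^{it} − 1)û(t), so its ℓ² operator norm is the sup of |e^{it} − 1| |û(t)|, and for the constant kernel this sup works out to exactly 2/(2n + 1). A sketch (grid maximization, so accurate only to grid resolution; the case n = 5 is an arbitrary choice):

```python
import numpy as np

def grad_conv_norm(u, n):
    # operator norm of f -> grad(f*u) on l2(Z): sup over t of |e^{it}-1|*|u_hat(t)|
    t = np.linspace(0.0, 2 * np.pi, 20001)
    k = np.arange(-n, n + 1)
    u_hat = (u[None, :] * np.exp(1j * np.outer(t, k))).sum(axis=1)
    return np.max(np.abs(np.exp(1j * t) - 1.0) * np.abs(u_hat))

n = 5
u_const = np.full(2 * n + 1, 1.0 / (2 * n + 1))
print(grad_conv_norm(u_const, n), 2 / (2 * n + 1))  # both ~ 0.18181
```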

I would also be quite interested in what can be said about the optimal function
u when restricted to functions supported on (−∞, 0]. This would have practical
applications: when smoothing some real numbers (say, the stock price or the current
temperature) we cannot look into the future. Thus the average has to be taken
with respect to the past (see also S & Tsyvinski, ‘On Vickrey’s Income Averaging’).

Update (Feb 2024). Sean Richardson (‘A Sharp Fourier Inequality and the Epanechnikov
Kernel’) solved the problem for the second derivatives without the assumption
that the kernel have a non-negative Fourier transform. The extremal kernel is
explicit and very close in shape to a parabola max(1 − |x|², 0).

37. Littlewood’s Cosine Root Problem


Let A ⊂ N. How many roots does the function

    f(x) = Σ_{k∈A} cos(kx)

necessarily have on [0, 2π]?

Littlewood originally conjectured that such a function should have ∼ |A| roots
which is now known to be false (Borwein, Erdelyi, Ferguson, Lockhart, Annals).
The best unconditional lower bound is due to Sahasrabudhe (Advances, 2016) which
shows that

    number of roots ≳ (log log log |A|)^{1/2−ε}.

Erdelyi (2017) improved the exponent 1/2 − ε to 1 − ε. Surely it must be much bigger than that!

A warning example can be found in the paper of Sahasrabudhe: the trigonometric
polynomial

    2 cos(θ) + Σ_{r=2}^{2n} sin(rπ/2) cos(rθ)

has only 2 roots and all coefficients in {0, −1, 1, 2}. So it is not enough to work with
the fact that the coefficients are small in the sense of having a small absolute value;
it is actually important that they are in {0, 1}, which naturally restricts the number
of approaches that one could try.
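The warning example can be confirmed numerically by counting sign changes on a fine grid (a sketch; the grid only detects transversal crossings, which suffices here since both roots are simple):

```python
import numpy as np

def warning_poly(theta, n):
    # 2 cos(theta) + sum_{r=2}^{2n} sin(r*pi/2) cos(r*theta); the coefficient
    # sin(r*pi/2) vanishes for even r and alternates -1, +1 over odd r
    total = 2 * np.cos(theta)
    for r in range(2, 2 * n + 1):
        total = total + np.sin(r * np.pi / 2) * np.cos(r * theta)
    return total

theta = np.linspace(0.0, 2 * np.pi, 100001, endpoint=False)
vals = warning_poly(theta, 10)
print(np.count_nonzero(np.sign(vals[:-1]) != np.sign(vals[1:])))  # 2
```

(The two roots sit at θ = π/2 and θ = 3π/2; for small n this can even be checked by hand via Chebyshev expansions.)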

38. The ‘Complexity’ of the Hardy-Littlewood Maximal Function


Given a function f : R → R, the Hardy-Littlewood maximal function is defined via

    (Mf)(x) = sup_{r>0} (1/2r) ∫_{x−r}^{x+r} |f(z)| dz.

This is a fairly classical object. The following object is less classical: define, for a
given function f : R → R and a given x ∈ R,

    rf(x) = inf { r > 0 : (1/2r) ∫_{x−r}^{x+r} f(z) dz = sup_{s>0} (1/2s) ∫_{x−s}^{x+s} f(z) dz }.

So rf(x) is simply the smallest radius r such that the average of f over (x − r, x + r)
is the same as the largest possible average around x.
Vague Problem. rf should assume many different values.
I proved (Studia Math, 2015) that if f is periodic and rf assumes only two values
{0, γ} and r−f also only assumes the same two values {0, γ}, then
f (x) = a + b sin (cx + d)
and c is determined by γ. The proof requires transcendental number theory (the
Lindemann-Weierstrass theorem), I always thought that was strange. Maybe we
even have:
Conjecture. If f ∈ L∞ (R) and rf assumes only finitely many
values, then
f (x) = a + b sin (cx + d).
Motivated by some heuristics (see paper), maybe we also have
Conjecture. Suppose f : R → R is C 1 and satisfies
f ′ (x + 1) − f (x + 1) = −f ′ (x − 1) − f (x − 1) whenever f (x) < 0.
Then
f (x) = a + b sin (cx + d) for some a, b, c, d ∈ R.
In general, it would be nice to have a better understanding of rf and how it depends
on f . Can rf (R) assume infinitely many values while not containing an interval?

39. A Compactness Problem


This is a phenomenon that I find really interesting: it should exist in many settings
but I only know how to prove it on T2 . Let f ∈ C ∞ (T2 ) have mean value 0.
Consider the problem of maximizing the average value of f over a closed geodesic
(straight periodic lines). This means we are interested in
    sup_{γ closed geodesic} (1/|γ|) ∫_γ f dH¹,

where γ ranges over all closed geodesics γ : S1 → T2 and |γ| denotes their length.

The idea is that such an extremal geodesic somehow cannot be very long unless
the function oscillates a lot. If the function is very nice and smooth, then that
supremum should be attained by a relatively short geodesic.
Theorem (S, Bull. Aust. Math. Soc., 2019). Let f : T2 → R be at least s ≥ 2
times differentiable and have mean value 0. Then
    sup_{γ closed geodesic} (1/|γ|) ∫_γ f dH¹

is assumed for a closed geodesic γ : S¹ → T² of length no more than

    |γ|^s ≲_s (max_{|α|=s} ∥∂^α f∥_{L¹(T²)}) · ∥∇f∥_{L²} · ∥f∥^{−2}_{L²}.

I always thought this was a really interesting result. I would expect that it's not
quite optimal (there should be a loss of derivatives on the right-hand side). I would
also expect that there are analogous results on higher-dimensional tori Td . I would
in fact expect that such results actually exist in a wide variety of settings: a natural
starting point might be a setting where geodesics and Fourier Analysis work well
together.
Question What is the sharp form in T2 ? Is it possible to prove
analogous results on Td or in other settings? What is the correct
formulation of this underlying phenomenon without geodesics?
It’s not clear to me how to phrase this problem in a setting where geodesics don’t
make sense. What’s a proper way to encode this principle in Euclidean space?

40. Some Line Integrals


This question is motivated by a result that Felipe Goncalves, Diogo Oliveira e Silva
and I proved (Journal of Spectral Theory). In particular, any progress on this
particular problem would lead to some refined statement about the n−point corre-
lation of eigenfunctions of Schrödinger operators. The problem itself is completely
elementary. Let T^d ≅ [0, 1]^d be the standard d−dimensional torus and define the
function

    f_d(x) = sign( Π_{k=1}^{d} cos(2πx_k) ),

where sign is simply the sign of the real number (with sign(0) = 0). This is simply
a nice function that assumes values in {−1, 0, 1} in a checkerboard pattern. Here's
the question: let a, b ∈ R^d and let
γ(t) = at + b mod 1.

Figure 22. The sign of sin (x) sin (y) for (x, y) ∈ T2 and a closed
geodesic that spends significantly more time in the positive region
than in the negative region. The flow γ(t) = (t, t) would spend
even more time in the positive region but that is not allowed: the
coefficients have to be different.

We will also assume that all the entries of the vector a are distinct. These linear
flows γ can be periodic or not periodic. We only care about the ones that are
periodic; this means that aL_γ ∈ Z^d for some minimal 0 < L_γ ∈ R, in which case
this linear flow has length L_γ∥a∥. What can be said about

    (1/(L_γ∥a∥)) ∫_{one period} f_d(γ(t)) dt ?
Typically it will be close to 0. What is the largest value it can assume? For d = 2
we solve the problem explicitly and find some very short geodesic that is the unique
maximizer. As d ≥ 3, the techniques from our paper might still apply but it seems
more challenging to get good values.

Update (Dec 2022). Dou, Goh, Liu, Legate, Pettigrew (‘Cosine Sign Correlation’,
arXiv) proved the following result: if a_1, a_2, a_3 are three different positive
integers and if x ∼ U[0, 2π] is a uniformly distributed random variable, then the
likelihood that cos(a_1 x), cos(a_2 x), cos(a_3 x) are all positive or all negative is ≥ 1/9,
with equality if and only if {a_1, a_2, a_3} = {m, 3m, 9m} for some m ∈ N. They
conjecture that for 4 different numbers the extremal set is {1, 3, 11, 33} (and multiples
thereof).
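For the extremal triple {1, 3, 9} the value 1/9 can be observed directly by measuring the set where the three cosines share a sign (a grid-approximation sketch; the grid size is an arbitrary choice):

```python
import numpy as np

# fraction of x in [0, 2*pi] where cos(x), cos(3x), cos(9x) all share a sign
x = np.linspace(0.0, 2 * np.pi, 2_000_001)
s1, s2, s3 = np.cos(x) > 0, np.cos(3 * x) > 0, np.cos(9 * x) > 0
same_sign = (s1 & s2 & s3) | (~s1 & ~s2 & ~s3)
print(same_sign.mean())  # ~ 1/9 = 0.1111...
```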

41. A Cube in Rn hitting a lattice point


This problem is from a paper of Henk & Tsintsifas (‘Lattice Point Coverings’).
Problem. Is there a universal constant c (independent of everything)
such that each cube Q ⊂ R^n of side length c (possibly translated
away from the origin and rotated) always intersects Z^n?
It is clear that there exists such a constant c_n for each dimension and a result
of Banaszczyk implies c_n ≲ √(log n). But maybe there exists a uniform constant?
(This can be understood as a relaxation of the Komlos conjecture).

There is a somewhat dual question: given a cube Q ⊂ R^n whose center is the origin
0 ∈ R^n, can the cube be rotated in such a way so as to capture a lot more lattice
points than predicted by its volume? For the sake of concreteness, we ask

Problem. For which dimensions n ∈ N (if any) is it possible to
rotate the cube centered at the origin with sidelength 1000 so that
it contains 1001^n lattice points?

42. A kind of uncertainty principle?


Let A ⊂ N be a finite set of integers. What is the smallest r > 0 such that there
exists a function f : T → R with

(1) supp f ⊆ (−r, r)
(2) f(x) ≥ 0 but is not identically 0 and ∫_T f(x) dx > 0
(3) for all a ∈ A we have f̂(a) = 0.

In words, what is the smallest possible support of a probability measure centered
at 0 that vanishes on a prescribed set of integers? There is an easy lower bound

    r ≳ 1/(min_{a∈A} a).
One would be inclined to think if A is some lacunary set, say A = {2^n : 1 ≤ n ≤ 1000},
then probably this simple lower bound is sharp. But it is clear that if we have, say,
A = {n, n + 1, . . . , 2n}, then one would expect that the support has to be quite a
bit bigger than 1/n. Is there an estimate like

    r ≳ (Σ_{a∈A} 1/a²)^{1/2}

or something along these lines?

Motivation. This question is motivated by the following particularly simple proof


of a fascinating result of Kozma-Oravecz (see also ‘Local sign changes of polynomials’):
the function

    g(x) = Σ_{a∈A} c_a cos(ax)    has a root in each interval of length    Σ_{a∈A} 1/a.

It is clear that if A = {b}, then there exists such a function f supported on an
interval of length b^{−1}. Then the convolution is of the form

    (f ∗ g)(x) = Σ_{a∈A, a≠b} d_a cos(ax)

and the result follows by induction. If one had a better understanding of the ques-
tion above, one could try to remove more than 1 frequency at a time which might
lead to stronger results.

It is clear that the question is equally interesting in higher dimension Td or on the


sphere Sd with trigonometric polynomials replaced by spherical harmonics.

Part 4. Partial Differential Equations


43. The Hot Spots Conjecture
Let Ω ⊂ R² be convex (or maybe only simply connected). Let u_2 be the smallest
nontrivial eigenfunction of the Neumann Laplacian, i.e.

    −∆u_2 = µ_2 u_2   in Ω
    ∂_n u_2 = 0       on ∂Ω.

Are maximum and minimum assumed at the boundary? This famous conjecture
of J. Rauch has inspired a lot of work. I proved (Comm. PDE, 2020) that if Ω is
a convex domain of dimensions ∼ N × 1, then maximum and minimum are at most
distance ∼ 1 from a pair of points whose distance is the diameter of Ω. This is the
optimal form of this statement (think of a rectangle), I always wondered whether
the argument could possibly be sharpened to say more about Hot Spots.


Figure 23. Maximum and minimum are attained close (at most
a universal multiple of the inradius away) to the points achieving
maximal distance (the ‘tips’ of the domain).

Update (Aug 2020). In a recent paper (‘An upper bound on the hot spots con-
stants’), it is shown that whenever the conjecture fails, it cannot fail too badly: if
Ω ⊂ Rd is a bounded, connected domain with smooth enough boundary, then
    ∥u∥_{L∞(Ω)} ≤ 60 ∥u∥_{L∞(∂Ω)}.
One naturally wonders about the optimal constant in this inequality. The proof
shows that 60 can be replaced by 4 in sufficiently high dimensions. An example of
Kleefeld shows that the constant is at least 1.001.

Update (May 2022). Mariano, Panzo & Wang (‘Improved upper bounds for
the Hot Spots constant of Lipschitz domains’) have improved the constant in
∥u∥_{L∞(Ω)} ≤ c∥u∥_{L∞(∂Ω)} to c ≤ √e + ε in high dimensions.
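In one dimension the conjecture is transparent: on an interval the first nontrivial Neumann eigenfunction is cos(πx), whose extrema sit at the two boundary points. A finite-difference sketch (the standard three-point Laplacian with Neumann boundary rows; the discretization is my own illustrative choice):

```python
import numpy as np

m = 200
L = np.zeros((m, m))
for i in range(m):
    L[i, i] = 2.0
    if i > 0:
        L[i, i - 1] = -1.0
    if i < m - 1:
        L[i, i + 1] = -1.0
L[0, 0] = L[-1, -1] = 1.0              # Neumann boundary rows
vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
u2 = vecs[:, 1]                        # first nontrivial eigenvector
print(sorted([int(np.argmax(u2)), int(np.argmin(u2))]))  # [0, 199]
```

(The discrete eigenvector is exactly cos(πk(i + 1/2)/m), which is monotone, so maximum and minimum land on the two boundary indices.)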

44. Norms of Laplacian Eigenfunctions


Let (M, g) be some smooth compact d−dimensional Riemannian manifold (possibly
without boundary if that makes life easier) and consider the sequence of Laplacian
eigenfunctions

    −∆ϕ_k = λ_k ϕ_k

which we assume to be normalized as ∥ϕ_k∥_{L²} = 1. There has been a lot of work on
trying to understand how large these eigenfunctions can be. A classical result is

    ∥ϕ_k∥_{L∞} ≲ λ_k^{(d−1)/4}

and this is sharp on the sphere. Another celebrated result is the local Weyl law: if
the manifold is assumed to have volume 1, then

    Σ_{k=1}^{n} ϕ_k(x)² = n + O(n^{(d−1)/d}).

Question. How large can

    Σ_{k=1}^{n} ∥ϕ_k∥_{L∞}    be?

Is there a bound along the lines of

    Σ_{k=1}^{n} ∥ϕ_k∥_{L∞} ≲ n (log n)^c ?

The sum is, trivially, at least ∼ n. If we consider a high-dimensional torus T^d,
d ≥ 5, then the multiplicity of the eigenspaces should grow polynomially in the
frequency and a random rotation of each eigenspace should lead to a basis for which
the sum is ∼ n √(log n), though it is not clear to me how easy it is to make this precise.

There is a naturally dual question. Considering that

    1 = ∥ϕ_k∥²_{L²} ≤ ∥ϕ_k∥_{L¹} ∥ϕ_k∥_{L∞},
there is the equally interesting problem of bounding ∥ϕk ∥L1 from below. This
problem has received much less attention. It is not entirely clear to me to what
extent it is really dual. It arose naturally in ‘Quantum Entanglement and the
Growth of Laplacian Eigenfunctions’.
Question. How small can

    Σ_{k=1}^{n} ∥ϕ_k∥_{L¹}    be?

Is there a bound along the lines of

    Σ_{k=1}^{n} ∥ϕ_k∥_{L¹} ≳ n/(log n)^c ?

Maybe c = 0? I would expect this to be true for some c ≤ 1 and
maybe even c ≤ 1/2.

45. Laplacian Eigenfunctions with Dirichlet conditions and mean


value 0
I learned about this charming problem from Raghavendra Venkatraman. Suppose
we have some domain Ω ⊂ R^n and consider

    −∆u_k = λ_k u_k

with Dirichlet boundary conditions u_k|_{∂Ω} = 0. If the boundary conditions were
Neumann, then the first eigenfunction would be constant and all the other ones
would, by orthogonality, have mean value 0.
Dirichlet: already the first eigenfunction is not constant and there is no reason
why any of them should have mean value 0. One would expect that, for a generic
domain, none of them have mean value 0. Conversely, on the d−dimensional ball,

we see that only ∼ n^{1/d} of the first n eigenfunctions have a mean value different
from 0. In general, having many of these eigenfunctions have mean value 0 should
be a sign of great symmetry in the domain.
Problem. Is it true that among the first n Dirichlet eigenfunctions
on a domain in d−dimensions at least n^{1/d} have a mean value
different from 0? One might even conjecture that if there
are ≤ Cn^{1/d}, then it already has to be the ball independently of C
(meaning that rate ∼ n^{1/d} already implies it is the ball).
The ball is the most symmetric and should be the worst. Raghav and I proved
(‘Dirichlet eigenfunctions with nonzero mean value’) that, up to logs, at least
n^{1/(2d)} of the first n have a mean value different from 0. Clearly, the square root
should not be there.

46. A PDE describing roots of polynomials under differentiation


Suppose we are given an initial probability measure µ = f(x) dx. We can then consider
n iid copies of this random variable and create the random polynomial

    p_n(x) = Π_{k=1}^{n} (x − x_k).

Note that in this model of randomness, the roots are random (other popular models
for random polynomials choose random coefficients).
Question. What is the distribution of roots of the (t · n)−th de-
rivative of the polynomial for 0 < t < 1?
In ‘A Nonlocal Transport Equation Describing Roots of Polynomials Under Dif-
ferentiation’ I gave a formal derivation of a PDE describing the evolution of the
density
 
    ∂u/∂t + (1/π) ∂/∂x arctan(Hu/u) = 0,

where Hu is the Hilbert transform of u.

Question. Does this PDE have a solution for smooth, compactly


supported initial data on R?

The best result in this direction is the beautiful work of Alexander Kiselev and
Changhui Tan (‘The Flow of Polynomial Roots Under Differentiation’) who study
the analogous problem on S1 (with polynomials replaced by trigonometric polyno-
mials). However, nothing so far seems to be known on the real line.

The PDE seems to be of independent interest: using work of Shlyakhtenko-Tao, I
gave another (again formal) derivation showing that this process is also equivalent
to fractional free convolution in Free Probability Theory (see ‘Free Convolution Powers via Roots
of Polynomials’). This claim has since been proven rigorously (see for example,
Hoskins-Kabluchko or Arizmendi-Garza-Vargas-Perales). This naturally suggests
that the PDE should have a solution, it should have infinitely many conservation
laws etc. etc. etc.
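The underlying mechanism is easy to observe numerically: by Rolle's theorem the roots of each derivative interlace the previous ones, so after tn differentiations the surviving roots stay real and move into the bulk. A small sketch (the roots 0, . . . , 11 and the choice t = 1/2 are illustrative; this only visualizes the root flow, not the PDE itself):

```python
import numpy as np

p = np.poly(np.arange(12.0))    # coefficients of the polynomial with roots 0, ..., 11
for _ in range(6):              # differentiate t*n = 6 times
    p = np.polyder(p)
roots = np.roots(p)
print(np.max(np.abs(roots.imag)))            # ~ 0: the roots stay real
print(roots.real.min(), roots.real.max())    # strictly inside (0, 11)
```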

47. The Khavinson-Shapiro conjecture


This conjecture is due to Khavinson & Shapiro, I first saw it in the charming article
‘A Tale of Ellipsoids in Potential Theory’ by Khavinson & Lundberg.

The starting point is the following basic fact.


Proposition. Let Ω ⊂ R^n be an ellipsoid and consider the Dirichlet problem

    ∆u = 0   in Ω
    u = p    on ∂Ω,

where p : R^n → R is a polynomial. Then u is a polynomial.
So, basically, the harmonic extension of a polynomial to the inside of the domain
remains a polynomial. The question is: does this property characterize ellipsoids?

It’s an extraordinarily beautiful conjecture but it’s not so clear how to approach it.

48. Dirichlet Eigenfunctions with Mean Value 0


Let Ω ⊂ R^d be some bounded domain and consider the sequence of Laplacian
eigenfunctions

    −∆ϕ_k = λ_k ϕ_k

with Dirichlet boundary conditions: ϕ_k|_{∂Ω} = 0. In contrast to Neumann boundary
conditions², there is no good reason why one should have

    ∫_Ω ϕ_k dx = 0.

If one encounters such a vanishing integral, it’s a reasonable guess that this comes
from some underlying symmetry in the domain. Among all domains, the ball is the
most symmetric.
Fact. Among the first ∼ n eigenfunctions on the d−dimensional
ball, only ∼ n^{1/d} satisfy

    ∫_Ω ϕ_k dx ≠ 0.
This leads to a very natural problem.
Problem. Show that all domains Ω ⊂ R^d have the property that
among the first n eigenfunctions there are at least c_d n^{1/d} satisfying

    ∫_Ω ϕ_k dx ≠ 0.

A very tempting conjecture is that the ball is the only domain
with the property of having as few as ≤ c_d n^{1/d} eigenfunctions with
nonzero mean.
Raghav Venkatraman and I (‘Dirichlet eigenfunctions with nonzero mean value’)
proved the result with n^{1/(2d)} (up to logs).
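In one dimension the phenomenon is explicit on the interval (the d = 1 ball): ϕ_k(x) = sin(kπx) has mean 2/(kπ) for odd k and mean 0 for even k, so exactly half of the first n eigenfunctions have nonzero mean, a positive proportion consistent with the conjectured rate c_d n^{1/d}. A sketch:

```python
import numpy as np

# Dirichlet eigenfunctions on [0, 1]: phi_k(x) = sin(k*pi*x)
x = np.linspace(0.0, 1.0, 100001)
means = [np.sin(k * np.pi * x).mean() for k in range(1, 11)]
nonzero = sum(abs(m) > 1e-3 for m in means)
print(nonzero)  # 5: exactly the odd k among k = 1, ..., 10
```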

²If we impose Neumann boundary conditions, then the first eigenfunction is constant ϕ_1 ≡ c
and then, by orthogonality, all other eigenfunctions have mean value 0.

Part 5. Problems involving Optimal Transport


49. Optimal Transport with concave cost
Perhaps the easiest instance of this problem is the following: let X = {x_1, . . . , x_n}
and Y = {y_1, . . . , y_n} be two sets of real numbers, what can be said about the
optimal transport cost

    W_p^p(X, Y) = min_{π∈S_n} Σ_{i=1}^{n} |x_i − y_{π(i)}|^p,

where π : {1, 2, . . . , n} → {1, 2, . . . , n} ranges over all permutations? The answer is


trivial when p ≥ 1: order both sets in increasing order and send the i−th largest
element from X to the i−th largest element from Y . The problem becomes highly
nontrivial when the cost function is concave.

Figure 24. 20 (right: 50) red points on R being optimally


matched to 20 (right: 50) blue points on R (shown displaced to
illustrate the matching) and subject to cost c(x, y) = |x − y|1/2 .

For concave functions of the distance, the picture which emerges is


rather different. Here the optimal maps will not be smooth, but dis-
play an intricate structure which – for us – was unexpected; it seems
equally fascinating from the mathematical and the economic point
of view. [...] To describe one effect in economic terms: the con-
cavity of the cost function favors a long trip and a short trip over
two trips of average length [...] it can be efficient for two trucks
carrying the same commodity to pass each other traveling opposite
directions on the highway: one truck must be a local supplier, the
other on a longer haul. (Gangbo & McCann)

What we (Andrea Ottolini and S, ‘Greedy Matching in Optimal Transport with


concave cost’) found is that the obvious greedy algorithm (find the red and blue
point at minimal distance among all pairs, connect the two, remove them both from
their respective sets and repeat) does amazingly well when the cost function is very
concave (say, c(x, y) = d(x, y)0.01 ) with errors as small as 0.01%.

Another amazing matching is the Dyck matching introduced by Caracciolo-D’Achille-


Erba-Sportiello (‘The Dyck bound in the concave 1-dimensional random assignment
model’). Their idea is to introduce g : [0, 1] → Z,

    g(x) = #{1 ≤ i ≤ n : x_i ≤ x} − #{1 ≤ i ≤ n : y_i ≤ x}.

The function g increases whenever x crosses an element of X and decreases
every time it crosses an element of Y. The Dyck matching is then obtained by
matching across level sets of the function g (see Fig. 25). The Dyck matching is
independent of the cost function.

Figure 25. A collection of n = 5 red/blue points, the function


g(x) (left, rescaled for clarity) and the Dyck matching (right).

Question. What we see empirically is that the greedy matching
is very good for c(x, y) = d(x, y)^p and p close to 0 while the Dyck
matching is very good when c(x, y) = d(x, y)^p when p is close to
1. Both matchings are okay in the other regime for random points
(with errors as small as 10%). Can this be made precise? One
particularly fascinating regime is that of iid random points (where
the Dyck matching is known to have optimal order but the constant
is not known; for the greedy algorithm even the order is open).
Generally it seems that very little is known for optimal transport with concave cost.
The associated transport plans seem to have all sorts of intriguing mixture of local
and global behavior.
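The ‘two trucks passing each other' effect is already visible with two points on each side, and the greedy matcher finds it. A minimal sketch (brute force over all permutations for the optimal plan; the particular instance is an illustrative choice):

```python
import itertools

def cost(X, Y, perm, p):
    return sum(abs(x - Y[j]) ** p for x, j in zip(X, perm))

def optimal(X, Y, p):
    # brute-force optimal assignment (only feasible for tiny n)
    return min(itertools.permutations(range(len(Y))),
               key=lambda perm: cost(X, Y, perm, p))

def greedy(X, Y):
    # repeatedly match the closest remaining red/blue pair
    reds, blues = list(enumerate(X)), list(enumerate(Y))
    match = {}
    while reds:
        (i, xv), (j, yv) = min(((a, b) for a in reds for b in blues),
                               key=lambda ab: abs(ab[0][1] - ab[1][1]))
        match[i] = j
        reds.remove((i, xv))
        blues.remove((j, yv))
    return tuple(match[i] for i in range(len(X)))

X, Y, p = [0.0, 3.0], [2.9, 10.0], 0.5
print(optimal(X, Y, p))  # (1, 0): the non-monotone "crossing" plan wins
print(greedy(X, Y))      # (1, 0): greedy finds the same plan here
```

For the convex cost p = 2 the same instance is optimally matched monotonically, illustrating how the concave regime rewards one short and one long trip.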

50. A Wasserstein transport problem


Consider the unit cube [0, 1]^d. For which values of p, d is there a sequence (x_n)_{n=1}^{∞}
such that, uniformly in N,

    W_p( (1/N) Σ_{k=1}^{N} δ_{x_k}, dx ) ≲ N^{−1/d},

where W_p is the p−Wasserstein distance? For each fixed dimension, the problem
gets harder as p increases. Here is what I know:
(1) A result of Cole Graham (‘Irregularity of distribution in Wasserstein dis-
tance’, 2019) implies that for d = 1, no such value p exists since there is no
such sequence even for p = 1.
(2) In d ≥ 2, Louis Brown and I (‘On the Wasserstein distance between classical
sequences and the Lebesgue measure’, 2020) constructed a sequence that
has this rate for p ≤ 2. The argument requires a nontrivial amount of
Number Theory (the existence of certain badly approximable vectors) and
it would be very desirable to have a more stable, robust, explicit, simple
construction.
(3) Boissard & Le Gouic (On the mean speed of convergence of empirical
and occupation measures in Wasserstein distance, 2014) have an argument
showing that for d > 2p, points chosen uniformly at random satisfy the
inequality. Can this be extended to a sequence?
Is it true that for, say, d = 2, this is impossible for p sufficiently large?
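For experimenting with these questions in d = 1, the W1 distance to Lebesgue measure can at least be computed exactly from the empirical CDF, since on [0, 1] one has W1(µ, dx) = ∫|F_N(t) − t| dt. A minimal sketch (the function name is mine):

```python
def w1_to_uniform(points):
    """Exact W1 distance between the empirical measure of `points`
    (assumed to lie in [0, 1]) and Lebesgue measure on [0, 1],
    via W1 = int_0^1 |F_N(t) - t| dt.  Between consecutive order
    statistics F_N is constant, so each piece integrates exactly."""
    xs = sorted(points)
    N = len(xs)
    knots = [0.0] + xs + [1.0]
    total = 0.0
    for i in range(N + 1):            # F_N = i/N on (knots[i], knots[i+1])
        a, b, c = knots[i], knots[i + 1], i / N
        if c <= a:                    # |c - t| = t - c on the whole interval
            total += (b - a) * ((a + b) / 2 - c)
        elif c >= b:                  # |c - t| = c - t on the whole interval
            total += (b - a) * (c - (a + b) / 2)
        else:                         # the integrand crosses zero at t = c
            total += ((c - a) ** 2 + (b - c) ** 2) / 2
    return total

N = 10
print(w1_to_uniform([(i + 0.5) / N for i in range(N)]))   # midpoints: 1/(4N) = 0.025
```

For the first N terms of a badly approximable Kronecker sequence this quantity is at most on the order of log N/N (it is bounded by the star-discrepancy), which is consistent with Graham's obstruction to the rate N^{-1}.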

51. A Wasserstein Inequality in two dimensions


Let (M, g) be a smooth compact d−dimensional Riemannian manifold without boundary and let G(x, y) denote the Green function of the Laplacian, i.e. G has mean value 0 and
\[
\int_M -\Delta_x G(x,y)\, f(y)\, dy = f(x).
\]
I proved (’A Wasserstein Inequality and Minimal Green Energy on Compact Manifolds’) that for any {x_1, . . . , x_n} ⊂ M
\[
W_2\left(\frac{1}{n}\sum_{k=1}^{n}\delta_{x_k},\, dx\right) \lesssim_M \left(\frac{1}{n^2}\sum_{\substack{k,\ell=1 \\ k\neq\ell}}^{n} G(x_k, x_\ell)\right)^{1/2} + \begin{cases} \frac{\sqrt{\log n}}{\sqrt{n}} & \text{if } d = 2 \\ n^{-1/d} & \text{if } d \geq 3. \end{cases}
\]

This inequality is sharp up to constants when d ≥ 3.



Question. Is the log n term for d = 2 necessary?
I do not know and could imagine it going either way. It would be very interesting if it were not necessary: then the argument in (Brown & S, Positive-definite Functions, Exponential Sums and the Greedy Algorithm: a curious Phenomenon) would lead to an explicit greedy construction of an infinite sequence with
\[
W_2\left(\frac{1}{n}\sum_{k=1}^{n}\delta_{x_k},\, dx\right) \lesssim n^{-1/2}
\]

on general manifolds (which would partially answer the preceding question).

52. Transporting Mass from Ω to ∂Ω


This is a curious problem that is naturally connected to some rather interesting questions (see below). The setup is as follows: we are given a domain Ω ⊂ R^d. Let us assume for simplicity that the domain is convex and very nice. We start with the Hausdorff measure H^d on Ω. There is now a game that can be played: for any fixed x ∈ Ω and any two points y, z ∈ Ω such that
\[
x = ty + (1-t)z
\]
we are allowed to take the measure at x and transport a fraction of t to y and a fraction of 1 − t to z. In particular, y and z can be transported to the boundary.

[Figure: a point x lying between y and z, illustrating x = ty + (1 − t)z and f(x) ≤ tf(y) + (1 − t)f(z).]

We play this game until all the measure is on the boundary; call the resulting measure µ. The total measure on the boundary is then
\[
\mu(\partial\Omega) = H^d(\Omega).
\]

We are interested in how ‘evenly’ it can be distributed: the question is therefore: how small can
\[
\left\| \frac{d\mu}{d H^{d-1}(\partial\Omega)} \right\|_{L^\infty}
\]
be? Here the derivative is understood in the sense of Radon-Nikodym. If the measure on the boundary were perfectly flat, then
\[
\left\| \frac{d\mu}{d H^{d-1}(\partial\Omega)} \right\|_{L^\infty} = \frac{H^d(\Omega)}{H^{d-1}(\partial\Omega)}.
\]

This problem has a curious relationship with Hermite-Hadamard inequalities for convex functions: more precisely, for convex, nonnegative f : Ω → R, we have
\[
\int_\Omega f\, dx \leq \left\| \frac{d\mu}{d\sigma} \right\|_{L^\infty} \cdot \int_{\partial\Omega} f\, d\sigma.
\]
In particular, if the centers of mass of Ω and ∂Ω are distinct, then the constant is strictly larger than H^d(Ω)/H^{d-1}(∂Ω) and the best possible measure is not flat.

Question. Which measure is the flattest, i.e. what is the smallest value of
\[
\left\| \frac{d\mu}{d\sigma} \right\|_{L^\infty}
\]
that can be achieved by this type of transport?

This technique was used in (The Hermite-Hadamard inequality in higher dimensions, J. Geom. Anal) to obtain some bounds; however, none of these arguments attempted to be optimal in any way (they are quite loose in terms of the constants).

53. Why is this Linear System usually solvable?


This problem is somewhere between Linear Algebra and Optimal Transport. It’s
about the existence of certain measures on discrete spaces so that the optimal trans-
port has a certain invariance property, so we put it here. It can be described in a
very easy way in the language of linear algebra.

This strange phenomenon was discovered in ‘Curvature on graphs via equilibrium


measures’ (J. Graph Theory 2023). Let G = (V, E) be some finite, connected graph and let D denote its graph distance matrix,
\[
D_{ij} = d(v_i, v_j).
\]
This is a symmetric integer-valued matrix. We are interested in the linear system
Dx = 1,
where 1 = (1, 1, 1 . . . , 1) is the all 1’s vector. For reasons that I cannot explain,
it seems that this linear system tends to have a solution for most graphs. The
smallest counterexamples are 2 graphs on 7 vertices and 14 graphs on 8 vertices.
It is known that there are infinitely many examples of graphs for which the linear
system does not have a solution: this is proven in this paper3.
3William Dudarov, Noah Feinberg, Raymond Guo, Ansel Goh, Andrea Ottolini, Alicia Stepin,
Raghavenda Tripathi, Joia Zhang, On the image of graph distance matrices, arXiv

Question. Is there some sense in which the set of graphs G = (V, E) for which the linear system Dx = 1 does not have a solution is ‘small’? Are such graphs exceptional?
It is also proven, in the paper cited above, that for Erdős-Renyi graphs G(n, p) the
matrix D is invertible with likelihood tending to 1 as n → ∞.
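It is easy to probe this phenomenon on small examples; the sketch below (my code; vertices labeled 0, ..., n−1) builds D via BFS and checks solvability through the least-squares residual:

```python
import numpy as np
from collections import deque

def distance_matrix(adj):
    """Graph distance matrix via BFS from every vertex.
    `adj` maps each vertex 0..n-1 to a list of neighbours."""
    n = len(adj)
    D = np.zeros((n, n))
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v, d in dist.items():
            D[s, v] = d
    return D

def solves_dx_equals_one(adj, tol=1e-9):
    """Does D x = (1,...,1) have a solution?  Decide via the residual
    of the least-squares solution."""
    D = distance_matrix(adj)
    x, *_ = np.linalg.lstsq(D, np.ones(len(adj)), rcond=None)
    return np.linalg.norm(D @ x - 1) < tol, x

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # the path graph P4
ok, x = solves_dx_equals_one(path)
print(ok, x)   # True; x = (1/3, 0, 0, 1/3)
```

Looping this over all small graphs is how one would rediscover the counterexamples on 7 and 8 vertices mentioned above.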

Part 6. Linear Algebra


54. Eigenvector Phase Retrieval
Suppose A ∈ Rn×n has eigenvalue λ ∈ R, suppose that eigenvalue has multiplicity
1 and there is a unique eigenvector (up to sign) Ax = λx. Knowing A and λ, I can
find x by solving
(A − λ · Idn×n )x = 0.
This can be done in O(n^3) time.

Suppose now someone, additionally, gives you the n numbers (|x_i|)_{i=1}^n, the absolute values of the entries of x. Is it possible to quickly recover the missing signs x_i = ε_i |x_i|? Since we have strictly more information, the problem does not become harder and can still be solved in O(n^3). But it somehow feels as if this additional information should help us (and potentially help us a lot). Hau-tieng Wu and I (‘Recovering eigenvectors from the absolute value of their entries’, on arXiv) propose an algorithm that works some of the time. The problem should become a lot easier when |λ| is very large, i.e. when |λ| ∼ ∥A∥.
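For very small n one can of course brute-force the signs; the sketch below is my own naive baseline (exponential in n, and not the algorithm from the paper), trying all 2^n sign patterns and keeping the one with the smallest residual:

```python
import itertools
import numpy as np

def recover_signs(A, lam, absx):
    """Brute-force eigenvector phase retrieval for small n: among all
    2^n sign patterns, return the vector x with entries +-absx that
    minimises ||(A - lam*I) x||.  Naive baseline only."""
    n = len(absx)
    M = A - lam * np.eye(n)
    best, best_res = None, np.inf
    for signs in itertools.product([-1.0, 1.0], repeat=n):
        x = np.array(signs) * absx
        res = np.linalg.norm(M @ x)
        if res < best_res:
            best, best_res = x, res
    return best

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2                       # random symmetric matrix
lams, vecs = np.linalg.eigh(A)
lam, v = lams[-1], vecs[:, -1]          # top eigenpair (generically simple)
xhat = recover_signs(A, lam, np.abs(v))
print(np.allclose(xhat, v) or np.allclose(xhat, -v))   # True
```

The question above is whether the extra information allows something dramatically faster than both this brute force and the generic O(n^3) solve.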

55. Matrix products


Let A, B ∈ R^{n×n} be two symmetric, positive-definite matrices. Under which circumstances is it true that ‘ordered products are always bigger than unordered products’, i.e. for example
∥AABABA∥ ≤ ∥AAAABB∥?
We know
(1) that inequalities of this type are always true when n = 2 (joint work with
R. Alaifari and L. Pierce, Proc. AMS, 2020)
(2) that individual such inequalities can be true for all n
(3) that there are such inequalities that are false for all n ≥ 3 (as shown by S.
Drury, Electron. J. Linear Algebra, 2009)
I would assume that such inequalities are ‘generally’ true. There are many ways
of making this precise: one way would be to say that for any fixed inequality, the
measure of matrices (A, B) for which that fixed inequality fails becomes small as
n → ∞. Moreover, one would assume that, as the products get longer, there should be fewer and fewer counterexamples.
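An empirical check of the n = 2 case is a few lines (my code; the ‘random SPD’ matrices are generated as M M^T + 0.1·I, an assumption of mine, and the check uses the example inequality from above, which the n = 2 result covers):

```python
import numpy as np

rng = np.random.default_rng(1)

def random_spd(n=2):
    M = rng.standard_normal((n, n))
    return M @ M.T + 0.1 * np.eye(n)   # symmetric positive-definite

def norm_of_word(word, A, B):
    """Spectral norm of the product encoded by e.g. 'AABABA'."""
    P = np.eye(A.shape[0])
    for c in word:
        P = P @ (A if c == 'A' else B)
    return np.linalg.norm(P, 2)

# For n = 2 the ordered product should always dominate:
violations = 0
for _ in range(1000):
    A, B = random_spd(), random_spd()
    if norm_of_word('AABABA', A, B) > norm_of_word('AAAABB', A, B) * (1 + 1e-10):
        violations += 1
print('violations:', violations)
```

Running the same experiment for n ≥ 3 and longer words would be one way to make the ‘generally true’ intuition quantitative.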

56. The Kaczmarz algorithm


The Kaczmarz method is an interesting algorithm for solving linear systems of equations Ax = b. It interprets such systems as the intersection of hyperplanes: using a_i to denote the i−th row of A, we are looking for a solution of
\[
\langle a_i, x \rangle = b_i,
\]

for all i. The Kaczmarz method is an iterative scheme: given an approximate


solution xk , let us pick an equation, say the i−th equation, and modify xk the
smallest possible amount necessary to make it correct: set xk+1 = xk + δai , where
δ is such that ⟨a_i, x_{k+1}⟩ = b_i. Formally,
\[
x_{k+1} = x_k + \frac{b_i - \langle a_i, x_k \rangle}{\|a_i\|^2}\, a_i.
\]
Strohmer & Vershynin proved that if the i−th equation is chosen with likelihood proportional to ∥a_i∥^2, then this algorithm converges exponentially and
\[
\mathbb{E}\, \|x_k - x\|_2^2 \leq \left(1 - \frac{1}{\|A\|_F^2 \|A^{-1}\|_2^2}\right)^{k} \|x_0 - x\|_2^2.
\]
I proved (Randomized Kaczmarz..., arXiv, June 2020) that this algorithm has a
particular connection to the singular vectors of the matrix A. More precisely,
Theorem. Let v_ℓ be a (right) singular vector of A associated to the singular value σ_ℓ. Then
\[
\mathbb{E}\, \langle x_k - x, v_\ell \rangle = \left(1 - \frac{\sigma_\ell^2}{\|A\|_F^2}\right)^{k} \langle x_0 - x, v_\ell \rangle.
\]
This suggests that x_k − x will, for large values of k, mainly be a combination of small singular vectors (i.e. singular vectors associated to small singular values σ_ℓ). This has an interesting geometric interpretation that I would like to understand better: it basically means you bounce around the hyperplanes in a way that prefers certain angles. What I would like to understand better are more refined statistics of the vector
\[
\left( \langle x_k - x, v_\ell \rangle \right)_{\ell=1}^{n} \quad \text{as } k \text{ increases.}
\]
The Theorem mentioned above analyzes the expectation of a fixed entry as k in-
creases but surely there is no strong form of concentration. Presumably the variance
is gigantic? What happens geometrically to the point? In the same paper, I also
showed that
Theorem. If x_k ≠ x and P(x_{k+1} = x) = 0, then
\[
\mathbb{E}\left\langle \frac{x_k - x}{\|x_k - x\|},\ \frac{x_{k+1} - x}{\|x_{k+1} - x\|} \right\rangle^2 = 1 - \frac{1}{\|A\|_F^2} \left\| A\, \frac{x_k - x}{\|x_k - x\|} \right\|^2.
\]
This emphasizes the same principle: one bounces around randomly but at different
speeds in different subspaces. This gives some insight into what is happening – in
particular, can these ideas be somehow used to accelerate the convergence of the
algorithm?

57. Approximate Solutions of linear systems


Let A ∈ R^{n×n} be invertible, x ∈ R^n unknown and b = Ax given. We are interested in approximate solutions: vectors y ∈ R^n such that ∥Ay − b∥ is small. I proved (‘Approximate Solutions of Linear Systems at a universal rate’) that for 0 < ε < 1 there is a composition of k orthogonal projections onto the n hyperplanes generated by the rows of A, where
\[
k \leq \frac{1}{\varepsilon^2}\, \log\left(\frac{n}{\varepsilon^2}\right),
\]

which maps the origin to a vector y ∈ Rn satisfying ∥Ay − Ax∥ ≤ ε · ∥A∥ · ∥x∥. This
upper bound on k is independent of the spectral properties of the matrix A.

The proof is probabilistic. This leads to a natural question.


Problem. Is the log factor necessary?
It would be very nice if it could be removed: note that, in some sense and after some
rescaling, the quantity n/ε2 is close to the effective numerical rank, the dimension
of the space where the matrix can be large. Removing the log would mean that
projections explore the space effectively. This might be too good to be true and
may simply be a good indicator that the log cannot be removed.

58. Finding the Center of a Sphere from Many Points on the Sphere
Here is a particularly funny way of solving linear systems Ax = b where A ∈ Rn×n is
assumed to be invertible (taken from ‘Surrounding the Solution of a Linear System
of equations from all sides’, arXiv, Sep. 2020). Denote the rows of A by a1 , . . . , an .
Then, for any y ∈ Rn and any 1 ≤ i ≤ n,
\[
y \qquad \text{and} \qquad y + 2 \cdot \frac{b_i - \langle y, a_i \rangle}{\|a_i\|^2}\, a_i
\]
have the same distance to the solution x. This means we can very quickly generate points that all have the same distance from the solution by starting with a random guess for the solution and then iterating this procedure. Indeed, generating m
points on a sphere around the solution x has computational cost O(n · m), it is very
cheap. In particular, it is very cheap to generate c · n points on the sphere like that,
where c is a constant.
Problem. Given at least n + 1 points on a sphere in Rn , how
would one quickly determine an accurate approximation of its cen-
ter? Does it help if one has c · n points?
The problem can, of course, be solved by setting up a linear system – the question
is whether it can be done (computationally) cheaper if one is okay with only having
an approximation of the center.

A very natural approach is to simply average the points. This is not very good when the points are clustered in some region of space, though. I proved that if you pick the rows of A with likelihood proportional to ∥a_i∥^2 and then average, then the arising sequence of points satisfies
\[
\mathbb{E}\left\| x - \frac{1}{m}\sum_{k=1}^{m} x_k \right\| \leq \frac{1 + \|A\|_F \|A^{-1}\|}{\sqrt{m}} \cdot \|x - x_1\|.
\]

This gives rise to an algorithm that is as fast as the Random Kaczmarz method.
A better way of approximating the center would presumably give rise to a faster
method!
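The reflection step is cheap to verify numerically; in the sketch below (my code, random example system) all generated points have the same distance to the true solution up to rounding:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
b = A @ x

def reflect(y, i):
    """Reflect y across the hyperplane <a_i, y> = b_i; since x lies on
    that hyperplane, the distance to x is preserved."""
    a = A[i]
    return y + 2 * (b[i] - a @ y) / (a @ a) * a

y = rng.standard_normal(n)            # random initial guess
points = [y]
for _ in range(100):
    points.append(reflect(points[-1], rng.integers(n)))

dists = [np.linalg.norm(p - x) for p in points]
print(max(dists) - min(dists))        # all distances agree up to rounding
```

Any method that estimates the center of such a point cloud better than plain averaging would, per the discussion above, yield a faster solver.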

59. Small Subsingular Values


Suppose A ∈ Rm×n with m > n is a tall rectangular matrix with many more rows
than columns. We assume furthermore that the rows are all normalized in ℓ2 . We

can now define, for any 0 < α < 1, the restricted singular value
\[
\sigma_{\alpha,\min}(A) = \min_{\substack{S \subset \{1,2,\dots,m\} \\ |S| = \alpha m}} \ \inf_{x \neq 0} \frac{\|A_S x\|}{\|x\|},
\]

where AS is the restriction of A to rows indexed by S. It’s clear that this quantity
will grow as α grows and coincides with the classical smallest singular value of
A when α = 1. Haddock, Needell, Rebrova & Swartworth (Quantile Kaczmarz,
SIMAX 2022) proved that for certain types of random matrices one has
\[
\sigma_{\alpha,\min}(A) \gtrsim \alpha^{3/2} \sqrt{\frac{m}{n}} \quad \text{with high likelihood.}
\]
I’d be interested in understanding what the best kind of matrix for this problem
would be, the one maximizing these quantities. Note that since the rows are all
normalized in ℓ2 , we can think of the rows as points on the unit sphere.

Let us consider the case where A ∈ Rm×n has each row sampled uniformly at
random from the surface measure of Sn−1 and suppose that the matrix is large,
m, n ≫ 1, and that the ratio m/n is large. Trying to find a subset S ⊂ {1, 2, . . . , m}
such that AS has a small singular value might be difficult, however, we can flip the
question: for a given x ∈ S^{n−1}, how would we choose S to make
\[
\|A_S x\|^2 = \sum_{i \in S} \langle x, a_i \rangle^2 \quad \text{as small as possible?}
\]
This is a much easier problem: compute ⟨x, a_i⟩^2 for 1 ≤ i ≤ m and then pick S to be the set of desired size corresponding to the smallest of these numbers. Using
rotational invariance of Gaussian vectors, we can suppose that x = (1, 0, . . . , 0). Then we expect, in high dimensions, that
\[
\langle a_i, x \rangle \sim \frac{1}{\sqrt{n}}\, \gamma \quad \text{where } \gamma \sim N(0,1).
\]

Figure 26. Removing a small spherical cap around the vector x.

This suggests a certain picture: large inner products are those where many rows a_i are nicely aligned with x and we know with which likelihood to expect them (these are just all the points in the two spherical caps centered at x and −x). This would then suggest that, in the limit as m, n, m/n → ∞, we have an estimate along the lines of
\[
\frac{\sigma_{\alpha,\min}^2(A)}{m/n} = \frac{1}{\sqrt{2\pi}} \int_{-\beta}^{\beta} e^{-x^2/2}\, x^2\, dx
\]

where the parameter α is implicitly defined via
\[
\frac{1}{\sqrt{2\pi}} \int_{-\beta}^{\beta} e^{-x^2/2}\, dx = \alpha.
\]

Question. Is there a universal estimate, for A ∈ R^{m×n} (maybe subject to m, n → ∞ and maybe also m/n → ∞) with all rows normalized to 1, along the lines of
\[
\sigma_{\alpha,\min}(A) \leq c_\alpha \sqrt{\frac{m}{n}},
\]
where c_α is the number predicted by what one obtains when the rows are sampled uniformly at random on the sphere? Or is there an improvement by picking the rows to be a highly structured set of points?
Motivation. These questions arise naturally in the context of the quantile Kaczmarz method (see Haddock, Needell, Rebrova & Swartworth (Quantile Kaczmarz, SIMAX 2022) and my paper on quantile Kaczmarz (Information and Inference)). However, I like them independently of that; it seems like a very nice geometric question.

60. Solving Equations with More Variables than Equations.


This is a fun problem from joint work with Ofir Lindenbaum (arXiv March 2020,
to appear in Signal Processing).

There is an underlying vector x ∈ R^d all of whose entries are either −1, 0, or 1, and most of them are 0. In fact, we may assume that only a relatively small number are ±1. We would like to understand what x looks like but we only have access to

y = Ax + ω,

where A ∈ Rn×d is a random matrix filled with independent N (0, 1) Gaussians and
ω ∈ Rn is a random Gaussian vector.


It is not terribly difficult to see that if n is very, very large, then it is fairly easy
to reconstruct x. The question is: how small can you make n and still reconstruct
x with high likelihood? What is remarkable is that this is doable even when n is
smaller than the number of variables d. Ofir and I propose a fun algorithm: you
take random subsets of the n rows, then do a least squares reconstruction and then
average this over many random subsets. The method seems to differ from other
methods and does work rather well even when n is quite small.
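A minimal sketch of that subset-averaging idea (my code; note that the interesting regime in the paper is n smaller than d, while this toy example uses an overdetermined system where recovery is easy):

```python
import numpy as np

def subset_average_recovery(A, y, subset_size, trials=200, seed=0):
    """Average the least-squares solutions computed on random row
    subsets.  Bare-bones sketch of the idea described above; the
    actual RLS method adds further refinements."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    acc = np.zeros(d)
    for _ in range(trials):
        S = rng.choice(n, size=subset_size, replace=False)
        sol, *_ = np.linalg.lstsq(A[S], y[S], rcond=None)
        acc += sol
    return acc / trials

rng = np.random.default_rng(1)
d, n, k = 30, 60, 3
x = np.zeros(d); x[:k] = 1.0                  # sparse signal (entries in {-1,0,1})
A = rng.standard_normal((n, d))
y = A @ x + 0.01 * rng.standard_normal(n)
xhat = subset_average_recovery(A, y, subset_size=45)
print(np.round(xhat)[:5])
```

Replacing the uniform random subsets by weighted or adaptively chosen ones is the kind of ‘natural variation’ the question asks about.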

In fact, in some parameter regimes (small number of variables, little information),


this method outperforms all the other methods. The method itself is quite simple
and it seems like that one should be able to further improve it by playing with it.
Question. Are there natural variations on this idea?
Many of the other proposed methods come with a wide variety of variations; our particular approach seems not to have been explored very much, so maybe there are some interesting variations that work even better?

Update (Mar 2021). Ofir Lindenbaum and I found a tweak of the method which we
call RLS (Refined Least Squares for Support Recovery, arXiv, March 2021) which
leads to state-of-the-art results in many regimes. It seems very likely that we have
not yet fully exhausted the possibility of the method.

Part 7. Miscellaneous/Recreational
61. Geodesics on compact manifolds

This question is about whether the vector field V(x, y) = (√2, 1) on the two-dimensional flat torus T^2 has, in some sense, the best mixing properties. Let (M, g) be a smooth, compact two-dimensional Riemannian manifold without boundary: let x ∈ M be a particular starting point and let γ : [0, ∞) → M be a geodesic starting in x (in some arbitrary direction; parametrized according to arclength).
For any ε > 0, we can define Lε as the smallest number such that
{γ(t) : 0 ≤ t ≤ Lε } is ε − dense on the manifold.
Put differently, Lε is how long we have to go along the geodesic so that it visits
every point on the manifold up to distance at most ε. Here’s the question: how

Figure 27. The best space-filling geodesic?

long does Lε have to be given ε? Since its ε−neighborhood is the entire manifold,
we expect Lε · ε ≳ vol(M ).
Problem. Suppose (M, g) has the property that there exists a
fixed geodesic such that
c
Lε ≤
ε
for one fixed universal c and all sufficiently small ε. What does this
tell us about (M, g)?
One example would be M = T2 with the canonical metric and the geodesic mov-
ing in a direction whose ratio of x and y−coordinates is badly approximable. It
seems reasonable to assume that one can glue many tori together but the question is whether these are fundamentally the only type of examples. Does hyperbolicity help?

One would perhaps assume that in a very hyperbolic two-dimensional setting one has something like
\[
L_\varepsilon \lesssim \frac{\log(1/\varepsilon)}{\varepsilon}
\]
for fairly generic geodesics?

Update (Dec 2022). Apparently this type of property is known as the existence of
a ‘superdense’ geodesic, two very recent papers in this spirit are
• J. Beck and W. Chen, Generalization of a density theorem of Khinchin and
diophantine approximation
• J. Southerland, Superdensity and bounded geodesics in moduli space

62. The Traveling Salesman Constant


Pick n points i.i.d. from [0, 1]^2. The length of the shortest traveling salesman path is known to satisfy
\[
\text{length of shortest path} \sim \beta \sqrt{n},
\]
where β is a universal constant (this is the Beardwood-Halton-Hammersley theorem from 1959). They gave the estimates
\[
\frac{5}{8} \leq \beta \leq 0.92116\ldots
\]

The best known lower bound is due to Gaudio & Jaillet (Oper. Res. Lett., 2020) and is β ≥ 0.6277. I proved (Adv. Appl. Prob.) that β ≤ β_{BHH} − 10^{−6} though, if numerical evaluation of integrals is permissible, the improvement is a bit bigger.
Numerical experiments suggest that β ∼ 0.7. It seems like such a fundamental
question, it would be nice to understand this a bit better.

63. Number of Positions in Chess


This is a very old question going back to Shannon’s estimate for the complexity of chess. C. Shannon roughly estimated the number of admissible positions in chess to be
\[
\sim \frac{64!}{32!\,(8!)^2\,(2!)^6} \sim 4.6 \cdot 10^{42}.
\]
Shannon’s way of counting is rough, it excludes some admissible positions and in-
cludes some impossible ones. The best known upper bound is ≤ 1046 . I (Int. J.
Game Theory, 2015) showed that if one excludes promotion (a pawn at the end of
the board may be exchanged), one can bound the number from above by ≤ 2 · 1042 .
I believe the actual number is quite a bit smaller. None of these counting schemes properly accounts for the pawns. A pawn on a2 can never move over and end up on h3. I think properly counting that should decrease the number a lot. The commonly established wisdom is that the truth is somewhere between 10^{40} and 10^{50}
but I think it’s actually less than that, maybe even less than 1038 . This is arguably
not very important but I am slightly bothered by the fact that everybody seems to
be so sure that it’s ≥ 1040 .

Update (Dec 2021). Gourion (arXiv:2112.09386) proposes a new upper bound of 4 × 10^{37} for the number of positions without promotion.

64. Ulam Sets


Motivated by the strange behavior of Ulam sequences, Noah Kravitz and I (Integers,
2018) looked into Ulam sets: for a set of elements in a vector space {x1 , . . . , xn },
keep adding the shortest vector that can be uniquely written as the sum of two
distinct earlier terms. We observed that even the simplest settings, R2 , R3 , Z ×
Z5 ,..., lead to very strange structures: some seemingly random, some extremely
structured. What is happening here?
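The one-dimensional Ulam sequence that motivates all of this is easy to generate; below is my own naive sketch (an O(n²) representation count per candidate, fine for small experiments):

```python
def ulam(n_terms, a=1, b=2):
    """First terms of the Ulam sequence: starting from a, b, repeatedly
    append the smallest integer with a *unique* representation as a sum
    of two distinct earlier terms."""
    seq = [a, b]
    while len(seq) < n_terms:
        candidate = seq[-1] + 1
        while True:
            reps = sum(1 for i, s in enumerate(seq)
                       for t in seq[i + 1:] if s + t == candidate)
            if reps == 1:               # exactly one representation
                break
            candidate += 1
        seq.append(candidate)
    return seq

print(ulam(10))   # [1, 2, 3, 4, 6, 8, 11, 13, 16, 18]
```

The vector-space variant replaces ‘smallest integer’ by ‘shortest vector’, which is where the strange two- and three-dimensional structures appear.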

Update (Aug. 2020). Bade, Cui, Labelle, Li (arXiv, August 2020) have looked at
these types of sets in other settings as well. Lots and lots of structure!

Update (Dec. 2022). Andrei Mandelshtam (On fractal patterns in Ulam words,
arXiv:2211.14229 ) has found some highly intricate structure in Ulam words.

65. An amusing sequence of functions


Let us consider the sequence
\[
f_n(x) = \sum_{k=1}^{n} \frac{|\sin(k\pi x)|}{k}.
\]

Figure 28. The set in R3 generated from (1, 0, 0), (0, 1, 0), (0, 0, 1)
(projected onto the plane that is orthogonal to (1, 1, 1)).

This sequence arose out of some fairly unrelated questions (that were further pur-
sued in a paper with X. Cheng and G. Mishne, J. Number Theory) but turned out
to be quite curious.

Theorem (S, Mathematics Magazine 2018). The function f_n(x) has a strict local minimum at x = p/q for all n ≥ q^2.

The asymptotically sharp scaling is given by n ≥ (1 + o(1)) q^2/π. It is not difficult to see that f_n grows like log n and thus f_∞ does not exist. But as n becomes large, there does seem to be some sort of universal function that emerges. Is it possible to make some more precise statements about f_n?
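The local-minimum phenomenon is easy to verify numerically; my sketch below checks it at p/q = 1/2 (so q = 2) with n = 10 ≥ q²:

```python
import math

def f(n, x):
    """The partial sum f_n(x) = sum_{k=1}^n |sin(k pi x)| / k."""
    return sum(abs(math.sin(k * math.pi * x)) / k for k in range(1, n + 1))

# Strict local minimum at x = 1/2: the terms with even k contribute a
# cusp of slope ~ pi * (n/2) * |x - 1/2|, the odd terms are smooth there.
x0, delta, n = 0.5, 1e-4, 10
print(f(n, x0) < f(n, x0 - delta), f(n, x0) < f(n, x0 + delta))   # True True
```

Repeating this for x = 5/13 with n = 50000 reproduces the big cusp in Figure 29.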

Figure 29. The function f_{50000} on [0.1, 0.9] and zoomed in (right). The big cusp in the right picture is located at x = 5/13, the two smaller cusps are at x = 8/21 and x = 7/18.

More generally, if (M, g) is a compact manifold and
\[
-\Delta \phi_k = \lambda_k \phi_k
\]
is a sequence of L^2−normalized eigenfunctions, is it possible to say anything about the function f_n : M → R given by
\[
f_n(x) = \sum_{k=1}^{n} \frac{|\phi_k(x)|}{\sqrt{\lambda_k}}\,?
\]

Part 8. Solved Problems


66. A Wasserstein Uncertainty Principle with Applications
This question arose out of understanding level sets of sums of Laplacian eigenfunc-
tions (Calc. Var. PDE 2020) but is actually a topic that is of independent interest
and has more to do with calculus of variations and geometric measure theory.

Let Ω = [0, 1]d (presumably this holds on much more general domains, manifolds,
etc.) and let f : [0, 1]d → R denote a function with mean value 0. Then
µ = max(f, 0)dx and ν = max(−f, 0)dx
are two measures with the same total mass (since f has mean value 0). How much
does it cost to ‘transport’ µ to ν? If we assume that transporting a ε−unit of
measure distance D costs ε · D, then this naturally leads to the ‘Earth-Mover’
Wasserstein distance W1 . The size of W1 (µ, ν) depends on the function, of course.

Here’s a basic idea: if W1 (µ, ν) is quite small, then the transport is cheap. But if
the transport is cheap, then most of the positive part of f has to lie pretty close
to most of the negative part of f . But that should somehow force the zero set
{x : f(x) = 0} to have large (d − 1)−dimensional volume. In (Calc Var Elliptic Equations, 2020) I proved in d = 2 dimensions, i.e. for f : [0, 1]^2 → R, that
\[
W_1(f_+, f_-) \cdot H^1\left(\left\{x \in (0,1)^2 : f(x) = 0\right\}\right) \gtrsim \frac{\|f\|_{L^1}^2}{\|f\|_{L^\infty}}.
\]
This result is sharp. Amir Sagiv and I generalized this to higher dimensions (SIAM J. Math. Anal). The currently sharpest form in higher dimensions is due to Carroll, Massaneda & Ortega-Cerda (Bull. London Math. Soc.) and reads
\[
W_1(f_+, f_-) \cdot H^{d-1}\left(\left\{x \in (0,1)^d : f(x) = 0\right\}\right) \gtrsim_d \|f\|_{L^1} \left(\frac{\|f\|_{L^1}}{\|f\|_{L^\infty}}\right)^{2 - \frac{1}{d}}.
\]
Here, it is not clear whether the power is optimal or not. Of course, for all these inequalities it would also be interesting to have the same underlying thought expressed in other ways: certainly the idea behind these things can be expressed in many different ways.

Update (Nov. 2020). A sharp form of this principle has been established in
Fabio Cavalletti, Sara Farinelli, Indeterminacy estimates and the
size of nodal sets in singular spaces, arXiv:2011.04409

67. A Sign Pattern for the Hypergeometric Function 1 F2


This is motivated by the immediately preceding section: some curious structure
arises naturally when studying the local stability of the inequality.
Question. Let α > 0. We define, for integers k ≥ 1, the sequence
\[
a_k = {}_1F_2\left(\frac{1+\alpha}{2};\ \frac{3}{2},\ \frac{3+\alpha}{2};\ -\frac{\pi^2}{16}(2k-1)^2\right).
\]
For which α is it true that a_k ≥ 0 for odd values of k and a_k ≤ 0 for even values of k?

If α is an integer, the hypergeometric function simplifies tremendously and it is not


hard to check that the desired property is satisfied for α ∈ {2, 3, 4, 5, 6}. It should
be true for all integers α ≥ 2. In fact, I would expect it to be true for all real α ≥ 2.
Once it is true for some fixed α > 0, it implies that for all smooth, even functions f : [−1/2, 1/2] → R,
\[
\max(\hat f) \geq \frac{\alpha+1}{\alpha\pi} \int_{-1/2}^{1/2} \left(1 - |2x|^\alpha\right) f(x)\, dx,
\]
where
\[
\max(\hat f) = \max\left( \sup_{k\in\mathbb{N}} \left(2k+\tfrac{1}{2}\right)\hat f\!\left(2k+\tfrac{1}{2}\right),\ -\inf_{k\in\mathbb{N}} \left(2k+\tfrac{3}{2}\right)\hat f\!\left(2k+\tfrac{3}{2}\right) \right).
\]
Update (Oct. 2021). The sign pattern has been established in Yong-Kum Cho and Young Woong Park, The zeros of certain Fourier transforms: Improvements of Pólya’s results, arXiv:2110.01885

68. A Refinement of Smale’s Conjecture?


Let f : C → C be a polynomial normalized to f (0) = 0 and |f ′ (0)| = 1. Smale
proved in 1981 that there exists a critical point (z ∈ C such that f ′ (z) = 0)
satisfying
|f (z)| ≤ 4|z|.
The question is whether 4 can be replaced by 1.
A Stronger Conjecture? Let g : C → C be a polynomial with
|g(0)| = 1 and consider the subset
A = {z ∈ C : |g(z)| < 1} ⊂ C.
Let B be the connected component of A whose closure contains 0.
Then the polynomial zg(z) contains a critical point in B.
This, if true, would slightly refine Smale’s conjecture (which says that there is a
critical point in A). In practice, the statement seems to be true – in most cases,
the number of roots of zg(z) in B seems to be the same as the number of roots of
g(z) in B (which is at least 1). For a while I thought that this stronger statement
might be true until Peter Müller constructed a counterexample of degree 5.

The counterexamples are ‘barely’ counterexamples, so I am naturally still wondering


whether something along these lines might be true...

69. A Type of Kantorovich-Rubinstein Inequality?


Let f : [0, 1]^d → R and let µ be a probability measure on [0, 1]^d. Is there an inequality
\[
\left| \int_{[0,1]^d} f(x)\,dx - \int_{[0,1]^d} f(x)\,d\mu \right| \leq c \cdot \|\nabla f\|_{L^{d,1}} \cdot W_\infty(\mu, dx),
\]

where Lp,q is the Lorentz space and W∞ the ∞−Wasserstein distance. This in-
equality is ‘almost’ (in a suitable sense) proven in ‘On a Kantorovich-Rubinstein

inequality’ (arXiv: Oct 2020). The most general question is whether there exist inequalities of the type
\[
\left| \int_{[0,1]^d} f(x)\,dx - \int_{[0,1]^d} f(x)\,d\mu \right| \leq c \cdot \|\nabla f\|_{L^{p,q}} \cdot W_r(\mu, dx).
\]

The case p = q = ∞ and r = 1 is, of course, the famous Kantorovich-Rubinstein


inequality that also holds for more general combination of measures (it is not nec-
essary for one of them to be dx).

Update (March 2022). The conjectured inequality has been established by Filippo
Santambrogio in the preprint ‘Sharp Wasserstein estimates for integral sampling
and Lorentz summability of transport densities’ (cvgmt: 5463 and Journal of Func-
tional Analysis)

70. A Pretty Inequality involving the Cubic Jacobi Theta Function


Here is a pretty inequality: for all 0 < q < 1,
\[
\sum_{m,n=-\infty}^{\infty} q^{m^2+mn+n^2} \geq \frac{2\pi}{\sqrt{3}\, \log(1/q)}.
\]
It came up naturally in unrelated work (‘On the Logarithmic Energy of Points on S^2’, arXiv Nov. 2020). The inequality seems to be remarkably accurate as q → 1. I think a way of proving it for q ∈ (q_0, 1) for some absolute q_0 would be to combine an identity of Borwein & Borwein (1991)

\[
\sum_{m,n=-\infty}^{\infty} q^{m^2+mn+n^2} = \theta_3(q)\,\theta_3(q^3) + \theta_2(q)\,\theta_2(q^3),
\]
where
\[
\theta_2(q) = \sum_{k=-\infty}^{\infty} q^{(k+1/2)^2} \qquad \text{and} \qquad \theta_3(q) = \sum_{k=-\infty}^{\infty} q^{k^2}
\]
with the identities
\[
\theta_2(q) = (q^2; q^2)_\infty \cdot \exp\left( -\frac{\pi^2}{12 \log q} + \frac{\log q}{12} + \sum_{k=1}^{\infty} \frac{1}{k \sinh\left(\frac{\pi^2 k}{\log q}\right)} \right)
\]
and
\[
\theta_3(q) = (q^2; q^2)_\infty \cdot \exp\left( -\frac{\pi^2}{12 \log q} + \frac{\log q}{12} + \sum_{k=1}^{\infty} \frac{(-1)^k}{k \sinh\left(\frac{\pi^2 k}{\log q}\right)} \right)
\]
and
\[
(q^2; q^2)_\infty = \exp\left( -\frac{\pi^2}{12 \log(1/q)} - \frac{1}{2}\log\left(\frac{\log(1/q)}{\pi}\right) + \frac{\log(1/q)}{12} - \sum_{k=1}^{\infty} \frac{1}{k}\, \frac{\hat q^{\,k}}{1 - \hat q^{\,k}} \right),
\]
where \(\hat q\) is an abbreviation for
\[
\hat q = \exp\left( -\frac{2\pi^2}{\log(1/q)} \right).
\]
For q ∈ (0, q0 ), one could probably establish it using a computer.
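Indeed, a crude numerical check is immediate (my sketch; the truncation range is overkill for the moderate q tested, and the inequality becomes razor-thin as q → 1, so floating point is only trustworthy away from 1):

```python
import math

def theta_sum(q, R=60):
    """Truncated version of sum_{m,n} q^{m^2 + mn + n^2}; since
    m^2 + mn + n^2 >= (m^2 + n^2)/2, the tail beyond |m|, |n| <= R
    is negligible for the values of q used below."""
    return sum(q ** (m * m + m * n + n * n)
               for m in range(-R, R + 1) for n in range(-R, R + 1))

for q in (0.01, 0.1, 0.2):
    lhs = theta_sum(q)
    rhs = 2 * math.pi / (math.sqrt(3) * math.log(1 / q))
    print(q, lhs >= rhs, lhs - rhs)
```

The observed gap lhs − rhs shrinks rapidly as q grows, illustrating how accurate the bound is near q = 1.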

Question. Is there a ‘nice’ proof? Is there a more fundamental reason why the inequality is true? What if m^2 + mn + n^2 is replaced by another positive-definite quadratic form?
Update (Feb 2023). Nian-Hong Zhou tells me that there is a very simple proof by applying the Poisson Summation Formula, since all the Fourier coefficients are positive, and gives Equation 5 on page 205 of Schoenberg’s Elliptic Modular Functions as a reference. So this one really has a nice and simple solution!

71. An Inequality for Eigenvalues of the Graph Laplacian?


Let G = (V, E) be a connected, finite graph and let L = D − A be the (Kirchhoff) Graph Laplacian. It has eigenvalues
\[
0 = \lambda_1 < \lambda_2 \leq \lambda_3 \leq \cdots \leq \lambda_n.
\]
Question. Is there a universal constant c > 0 such that
\[
\mathrm{diam}(G)^2 \leq c \sum_{k=2}^{n} \frac{1}{\lambda_k}\,?
\]

A famous inequality of Alon-Milman shows that already the first term of the sum is essentially big enough provided one compensates for vertices of large degree: if we denote the maximal degree by ∆, then
\[
\mathrm{diam}(G)^2 \leq \frac{2\Delta\left(\log \#V\right)^2}{\lambda_2}.
\]
The question is now whether one can replace the dependency on ∆ and #V by
summing over the remainder of the spectrum.
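For cycle graphs both sides can be computed exactly; my sketch below uses the closed-form eigenvalues 2 − 2cos(2πk/n) = 4sin²(πk/n) of C_n together with the identity Σ_{k=1}^{n−1} 1/(4 sin²(πk/n)) = (n² − 1)/12:

```python
import numpy as np

def cycle_laplacian_check(n=10):
    """For the cycle C_n, return (diam(G)^2, sum_{k>=2} 1/lambda_k),
    using the known Laplacian eigenvalues 2 - 2 cos(2 pi k / n)."""
    lams = sorted(2 - 2 * np.cos(2 * np.pi * np.arange(n) / n))[1:]
    spectral_sum = sum(1 / l for l in lams)
    diam = n // 2
    return diam ** 2, spectral_sum

print(cycle_laplacian_check(10))   # diam^2 = 25, sum = (n^2 - 1)/12 = 8.25
```

So for cycles the conjectured inequality holds with a constant around 3, which is one concrete data point for what c would have to be.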

Motivation. This inequality would imply that the notion of graph curvature
introduced in ‘Graph Curvature via Resistance Distance’ satisfies a Bonnet-Myers-
type inequality: if the curvature is bounded from below by K > 0, then
diam(G) ≤ c · K −1/2 .

Update (May 2023). Using an identity of McKay, we can rephrase the question in terms of the average commute time. The commute time between two vertices i, j is the expected time a random walk needs to go from i to j and then back to i. The question is then whether
\[
\frac{1}{|V|^2} \sum_{v,w \in V} \mathrm{commute}(v, w) \gtrsim \frac{|E|}{|V|}\, \mathrm{diam}(G)^2.
\]

The inequality is sharp up to constants for cycle graphs Cn where the average
commute time is ∼ n2 . One of the reasons the inequality is interesting is that |E|
appears: adding more edges increases the global commute time.

Update (July 2023). I asked the question on mathoverflow and Yuval Peres pro-
duced the following counterexample.


Let ℓ = ⌊√(n/2)⌋ and let G_1, . . . , G_ℓ be disjoint cliques of size ℓ. Let K be a clique on the remaining n − ℓ^2 > n/2 nodes. Connect every node in G_i to every node in G_{i+1} for i < ℓ and connect every node in G_ℓ to every node in K. This defines a graph of diameter ℓ. For i < ℓ/2 the effective resistance between a node v in G_i and a node w in K is Θ(1/ℓ), so the commute time between v and w is Θ(ℓ^3). Consequently, the average commute time between all pairs is Θ(ℓ^3), which contradicts the conjectured inequality.

72. Pasteczka’s conjecture


Pasteczka is interested in convex domains Ω ⊂ R^n such that for all convex functions f : Ω → R
\[
\frac{1}{|\Omega|} \int_\Omega f\, dx \leq \frac{1}{|\partial\Omega|} \int_{\partial\Omega} f\, d\sigma.
\]
Pasteczka (Annales Universitatis Paedagogicae Cracoviensis Studia Mathematica,
2018) remarks that by plugging in f (x) = xi (the i−th coordinate function) and
f (x) = −xi (both of which are convex), we can deduce that such a domain Ω needs
to satisfy
center of mass of Ω = center of mass of ∂Ω.
He conjectures that this condition implies that the convex Hermite-Hadamard in-
equality holds with constant 1. Or, put differently, the worst case is given by linear
functions. This would be very nice if it were true – maybe too nice?

Update (June 2023). I asked whether anyone had any thoughts on Pasteczka’s
conjecture on mathoverflow, no replies.

Update (July 2023). Noah Kravitz and Mitchell Lee (‘Hermite–Hadamard in-
equalities for nearly-spherical domains’) just uploaded a short paper to the arXiv
that proves Pasteczka’s conjecture for domains sufficiently close to the ball.

Update (July 2023). Nazarov and Ryabogin prove Pasteczka’s conjecture in two dimensions and disprove it in general in dimensions ≥ 3. This resolves Pasteczka’s conjecture but, naturally, raises a new question (especially when combining it with the inequality of Kravitz-Lee): when do domains admit a Hermite-Hadamard inequality?
Department of Mathematics, University of Washington, Seattle
Email address: [email protected]
