Outstanding Challenges in Combinatorics On Words (12w5068) : 1 Overview of The Field
James Currie (University of Winnipeg),
Jeffrey Shallit (University of Waterloo)
Feb. 19 – Feb. 24, 2012
for word equations by G. S. Makanin [16] is an important sample outcome. Over the last 20 years or so com-
binatorics on words has developed into a quickly growing topic of its own; a few textbooks [13, 14, 15, 2]
have also emerged as very influential.
2 Recent Developments
Several striking results have been achieved in this area recently. Among these are the resolution of long-
standing problems and conjectures:
• A 1972 conjecture by F. Dejean [8] stated a precise bound on the size of unavoidable repetitions in
infinite words. This conjecture was finally confirmed through the work of M. Rao, J. Currie, N. Ram-
persad, and A. Carpi [19, 7, 4].
• The centralizer of a language is the maximal language commuting with it. The question, raised in
1970 by J. H. Conway [6], whether the centralizer of a rational language is always rational has been
negatively answered with a celebrated result [12]. In fact, even complete co-recursively enumerable
centralizers exist for finite languages.
• The satisfiability of word equations with constants is in PSPACE [18]. It follows from the proof of that
result that satisfiability would even be in NP if one could show that the minimal solutions of a word
equation, when they exist, are at most single exponential in the size of the equation.
• The solution of J.-P. Duval’s 1982 conjecture and the Ehrenfeucht-Silberger problem about the relation
between the period of a word and the maximum length of its unbordered factors by T. Harju, S. Holub,
and D. Nowotka in 2004 settled long-standing questions [10, 11].
• Does there exist an infinite word over a finite subset of N such that no three consecutive blocks of
the same size and the same sum exist? G. Pirillo and S. Varricchio raised that question in 1994 in the
context of semigroup theory. L. Halbeisen and N. Hungerbühler independently formulated the problem in
different terminology in 2000. Just recently that question was
answered affirmatively by J. Cassaigne, J. D. Currie, L. Schaeffer, and J. Shallit [5].
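The property in the last item can be tested on finite words by brute force. The following is a minimal sketch, not the method of [5]; the function name has_additive_cube is an illustrative choice and does not come from the cited work.

```python
from itertools import accumulate


def has_additive_cube(word):
    """Return True if `word` (a sequence of integers) contains three
    consecutive blocks of the same length and the same sum."""
    # prefix[i] holds the sum of word[:i], so any block sum is one subtraction.
    prefix = [0, *accumulate(word)]
    n = len(word)
    for length in range(1, n // 3 + 1):
        for start in range(n - 3 * length + 1):
            first = prefix[start + length] - prefix[start]
            second = prefix[start + 2 * length] - prefix[start + length]
            third = prefix[start + 3 * length] - prefix[start + 2 * length]
            if first == second == third:
                return True
    return False
```

For instance, has_additive_cube([1, 2, 2, 1, 0, 3]) is True because the blocks (1, 2), (2, 1), (0, 3) all sum to 3, while has_additive_cube([0, 1, 1, 0, 2, 0]) is False.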
In addition, tools in several subareas are clearly coming to maturity. At one point, the connections be-
tween combinatorics on words and transcendence results seemed to be one-way only, and somewhat ad hoc;
today, this connection is better understood, and several tools have emerged in this intersection of discrete
mathematics with algebra. Major progress has also been made on variations of the run-length problem,
which ties together combinatorics on words with ideas from data compression. As a final example, properties
of automatic sequences, expressed in a certain logic, have extremely recently been shown to be decidable.
3 Presentation Highlights
3.1 Run-length and maximal exponent problems
3.1.1 Videotaped lecture
For the first of our videotaped lectures, Maximal Exponent Repeats, Maxime Crochemore presented an
overview of the run-length problem and related questions. This talk was a summary of basic issues related
to repetitions in strings, concentrating on algorithmic and combinatorial aspects. This area is important from
both a theoretical and a practical point of view. Repetitions are highly periodic factors (substrings) in strings
and are related to periodicities, regularities, and compression. The repetitive structure of strings leads to
higher compression rates, and conversely, some compression techniques are at the core of fast algorithms for
detecting repetitions. There are several types of repetitions in strings: squares, cubes, and maximal repetitions
also called runs. For these repetitions, we distinguish between the factors (sometimes qualified as distinct)
and their occurrences (also called positioned factors). The combinatorics of repetitions is a very intricate area,
full of open problems. For example, we know that the number of (distinct) primitively-rooted squares in a
string of length n is no more than 2n − Θ(log n), and is conjectured to be n, and that their number of occurrences
can be Θ(n log n). Similarly, we know that there are at most 1.029n and at least 0.944n maximal repetitions,
and the conjecture is again that the exact bound is n.
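To make the objects in this summary concrete, here is a small brute-force sketch that lists the distinct primitively-rooted squares and the runs of a short string. It is a naive illustration under our own naming (is_primitive, primitively_rooted_squares, runs), not one of the fast algorithms discussed in the talk. A run is reported as a half-open interval (start, end).

```python
def is_primitive(u):
    """A nonempty word is primitive if it is not a proper power of a shorter word."""
    # u occurs inside uu at a position strictly between 0 and len(u)
    # exactly when u is a proper power.
    return (u + u).find(u, 1) == len(u)


def primitively_rooted_squares(w):
    """Set of distinct factors uu of w with u primitive."""
    n = len(w)
    return {
        w[i:i + 2 * p]
        for p in range(1, n // 2 + 1)
        for i in range(n - 2 * p + 1)
        if w[i:i + p] == w[i + p:i + 2 * p] and is_primitive(w[i:i + p])
    }


def runs(w):
    """Set of maximal repetitions of w, each as a half-open interval (start, end)."""
    n = len(w)
    found = set()
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if w[k] != w[k + p]:
                k += 1
                continue
            start = k
            while k + p < n and w[k] == w[k + p]:
                k += 1
            # w[start:k + p] has period p and cannot be extended on either side.
            # Keep it only if it is at least two periods long; storing intervals
            # in a set merges the copies found for multiples of the least period.
            if k - start >= p:
                found.add((start, k + p))
    return found
```

On the string "aabaab" this gives the squares "aa" and "aabaab" and the three runs (0, 2), (3, 5), and (0, 6).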
I will review what’s known about matching fractions and suggest some areas for investigation.
(1967/68) gives an asymptotic formula for the number of 0’s and 1’s in any fixed linear subsequence of
TM, i.e., (t_{an+b}). In quite recent work, Mauduit and Rivat (2009) found precise formulas for quadratic
polynomials. The cases of cubes and higher-degree polynomials remain elusive so far. From work of Dartyge
and Tenenbaum (2006) it follows that there are ≫ N^{2/h!} symbols “0” (or “1”) in any subsequence indexed by
a polynomial of degree h. The aim of the present talk is to give an overview of the various results known
in this area, to pose some (old and new) conjectures, and to improve by elementary/combinatorial means the
general lower bound to ≫ N^{4/(3h+1)}.
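The counting question in this abstract is easy to explore numerically. The sketch below assumes the usual definition of the Thue-Morse sequence, t_n = parity of the number of 1 bits in the binary expansion of n. The helper names thue_morse and count_symbol are illustrative only.

```python
def thue_morse(n):
    """t_n: parity of the number of 1 bits in the binary expansion of n."""
    return bin(n).count("1") % 2


def count_symbol(poly, N, symbol=0):
    """Count the indices n < N for which t_{poly(n)} equals `symbol`.

    `poly` is any function mapping nonnegative integers to nonnegative integers.
    """
    return sum(1 for n in range(N) if thue_morse(poly(n)) == symbol)
```

For example, count_symbol(lambda n: 3 * n, 8) returns 7 and count_symbol(lambda n: n * n, 8) returns 3. Such small experiments only illustrate the quantities being bounded; they say nothing about the asymptotic statements themselves.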
Eric Rowland spoke on Counting equivalence classes of words in F2
Abstract: In the last decade several papers have appeared concerning the size of an equivalence class of
words in a free group under its automorphism group. A central theme of the area is that information about
the equivalence class of a word can be obtained from statistics of its (contiguous) subwords. Here we are
interested in the free group on 2 generators. We give a new characterization of words of minimal length,
and we introduce a natural operation that “grows” words from smaller words. The growth operation gives
rise to a notion of maximally minimal words (in a way that can be made precise), which we call root words.
Equivalence classes containing root words have special structure, and the hope is that understanding this
structure will lead to an exact enumeration of equivalence classes in F2 containing a minimal word of length
n.
Julien Leroy gave a lecture entitled The S-adic conjecture.
Abstract: An infinite word is S-adic if it can be obtained by successive iterations of morphisms belonging
to the set S. Sturmian words are well-known examples of S-adic words with card(S) = 4. The S-adic
conjecture tries to determine the link that should exist between S-adicity and sub-linear factor complexity.
More precisely, it says that there is a stronger notion of S-adicity which is equivalent to sub-linear factor
complexity, i.e., an infinite word would have a sub-linear complexity if and only if it is “strongly S-adic”.
In this talk, I will present some recent results about that conjecture. First I will present some examples
that allow us to reject some natural ideas that one could have. Then I will briefly explain a general method
to compute an S-adic expansion of any uniformly recurrent infinite word with sub-linear complexity. That
method allows us to solve the conjecture in the particular case of uniformly recurrent infinite words with first
difference of complexity bounded by 2.
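As a concrete illustration of factor complexity for a word generated by morphisms, the sketch below iterates the Fibonacci morphism a → ab, b → a (the simplest case, where the set S contains a single morphism) and counts the distinct factors of each length in a long prefix. It is not the general S-adic construction from the talk, and the names apply_morphism, iterate, and factor_complexity are our own.

```python
def apply_morphism(morphism, word):
    """Apply a morphism, given as a dict letter -> image, to a word."""
    return "".join(morphism[letter] for letter in word)


def iterate(morphism, seed, steps):
    """Apply the morphism `steps` times, starting from `seed`."""
    word = seed
    for _ in range(steps):
        word = apply_morphism(morphism, word)
    return word


def factor_complexity(word, n):
    """Number of distinct factors of length n occurring in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


FIBONACCI = {"a": "ab", "b": "a"}
prefix = iterate(FIBONACCI, "a", 15)  # prefix of the Fibonacci word, length 1597
complexities = [factor_complexity(prefix, n) for n in range(1, 11)]
```

Here complexities equals [2, 3, ..., 11], matching the value n + 1 that characterizes Sturmian words. The count is taken on a finite prefix, so it is only reliable for lengths n that are small compared with the prefix length.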
This result is extremal with respect to the CFT since a consequence of the CFT is that, for any infinite
recurrent word x, either the function p_x(n) is bounded, and in such a case x is periodic, or p_x(n) ≥ n + 1
for infinitely many integers n. As a byproduct of the techniques used in the paper we extend a result of
Harju and Nowotka stating that any finite Fibonacci word f_n for n ≥ 5 has only one critical point. Indeed
we determine the exact number of critical points in any finite standard Sturmian word.
Young researcher Robert Mercas presented a different perspective on results of Fine and Wilf: On
Pseudo-repetitions in words
Abstract: The notion of repetition of factors in words has been studied since the beginnings of the
combinatorics on words area. One of the recent generalizations of this concept was introduced by L. Kari et
al., and considers a word to be an f-repetition if it is the iterated concatenation of one of its prefixes and the
image of this prefix through an anti-/morphic involution f. In this paper, we extend the notion of f-repetitions
to arbitrary anti-/morphisms, and investigate a series of algorithmic problems arising in this context. Further,
we present a series of results in the fashion of the Fine and Wilf theorem for f-repetitions, when f is an
iso(anti)morphism.
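For reference, the classical Fine and Wilf theorem [9] that these results generalize states that a word with periods p and q and length at least p + q − gcd(p, q) also has period gcd(p, q). The sketch below checks this classical statement exhaustively on short binary words. It does not cover the f-repetition variants of the abstract, and the helper names are our own.

```python
from itertools import product
from math import gcd


def has_period(w, p):
    """True if w[i] == w[i + p] for every valid index i."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def fine_wilf_holds(w, p, q):
    """Check the Fine and Wilf implication for one word and one pair of periods."""
    d = gcd(p, q)
    if has_period(w, p) and has_period(w, q) and len(w) >= p + q - d:
        return has_period(w, d)
    return True  # hypothesis not satisfied, so there is nothing to check


def check_all_binary_words(max_len):
    """Verify the implication for all binary words of length up to max_len."""
    for n in range(1, max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            for p in range(1, n + 1):
                for q in range(p, n + 1):
                    if not fine_wilf_holds(w, p, q):
                        return False
    return True
```

The length bound cannot be lowered: the word "aaabaaa" has length 7 = 4 + 5 − gcd(4, 5) − 1 and periods 4 and 5, but not period 1.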
Several other speakers contributed to this topic:
Dirk Nowotka spoke on Avoidability under Permutations; Arturo Carpi (Università degli Studi di Perugia)
gave a lecture on Unrepetitive walks in digraphs (and the repetition threshold); Sébastien Ferenczi
presented Word combinatorics of interval-exchange transformations for every permutation; Anna Frid presented a
brief problem involving Morphisms on Permutations.
4 Open problems
A session on open problems identified several questions:
Then define P(u, v) = {p(u, v, w) : w primitive}. Characterize the u, v such that P(u, v) = {0, 1}.
2. If x is a prefix of infinitely many square-free words over {1, 2, 3} and y is a suffix of infinitely many
square-free words over {1, 2, 3}, must there exist some square-free word xuy over {1, 2, 3}?
3. Does the paperfolding word contain arbitrarily large Abelian powers?
4. Can the iterated hairpin completion of a singleton w (a) be regular (b) be context-free but not regular?
5. Improve the bounds on run lengths and sums of exponents of runs:
The paper was just accepted for publication in Information and Computation on April 17, 2012 and
will contain an acknowledgement to the inspiring workshop at BIRS.
• The recent BIRS workshop was a great opportunity for Boris Adamczewski to advance his collabora-
tion with Jason Bell. In particular, it allowed them to
– (Almost) Finish the writing of a joint paper on diagonals of multivariate algebraic functions.
– Discuss a current project on Mahler’s functions
– Discuss various new questions. They also restarted a joint project with Berthé and Zamboni.
• In addition Bell was able to
– Work on characterization of subsets of R accepted by Büchi automata with respect to two
multiplicatively independent bases (with Julien Leroy and Emilie Charlier).
– Begin work on various questions of Jeff Shallit on k-automatic sequences with specific properties.
• Štěpán Holub solved the workshop open problem on the paperfolding word, shortly after his return
from BIRS.
• Thomas Stoll also participated in discussion regarding the paperfolding word, and also
– Answered a question of Shallit asking whether it is possible to give a bound for min{n : t_{k_1 n} =
e_1, t_{k_2 n} = e_2}, where t_n is the Thue-Morse sequence, k_1, k_2 are arbitrary distinct integers, and
(e_1, e_2) is any of the four possibilities. He got the result during the BIRS workshop and they are now
working toward the natural generalized result (with two students, one in Marseille and
one from Stanford).
– Also with Shallit, Stoll found during BIRS the proof of a conjecture of Eric Rowland that the
2-kernel of t_{n+l} is of size f(l), where f(l) satisfies some nice explicit recursive relations and is
k-regular.
– Researchers from France (Stoll, Cassaigne, Ochem) were also inspired to think about the (diffi-
cult) problem of avoiding sumsquares in words over finite alphabets.
Researchers were extremely enthusiastic about the workshop, and the wonderful atmosphere at BIRS. Thanks
to BIRS for putting this resource at the disposal of the international mathematical community.
References
[1] S. Adjan & P.S. Novikov, Infinite periodic groups I–III. (Russian) Izv. Akad. Nauk SSSR Ser. Mat. 32
(1968).
[2] Jean-Paul Allouche and Jeffrey Shallit, Automatic Sequences: Theory, Applications, Generalizations.
Cambridge University Press, 2003.
[3] Jean Berstel, Axel Thue’s papers on repetitions in words: a translation, Publications du LaCIM 20,
Université du Québec à Montréal (1994).
[4] A. Carpi, On Dejean’s conjecture over large alphabets, Theoret. Comput. Sci. 385 (2007) 137–151.
[5] J. Cassaigne, J. Currie, L. Schaeffer, J. Shallit, Avoiding Three Consecutive Blocks of the Same Size and
Same Sum, arXiv:1106.5204v3
[6] J. H. Conway, Regular Algebra and Finite Machines, Chapman and Hall, 1971.
[7] J. D. Currie, N. Rampersad, A proof of Dejean’s conjecture, Math. Comp. 80 (2011), 1063–1070.
[8] F. Dejean, Sur un théorème de Thue, J. Combin. Theory Ser. A 13 (1972) 90–99.
[9] N. J. Fine and H. S. Wilf, Uniqueness theorem for periodic functions, Proc. Amer. Math. Soc. 16
(1965) 109–114.
[10] T. Harju and D. Nowotka, Periodicity and unbordered words: A proof of the extended Duval conjecture,
J. ACM 54(4) (2007).
[11] S. Holub, A proof of the extended Duval conjecture, Theor. Comput. Sci., 339 (2005) 61–67.
[12] M. Kunc, Regular solutions of language inequalities and well quasi-orders, Proc. of ICALP 2004, 870–
888, 2004.
[13] M. Lothaire, Combinatorics on Words, Encyclopedia of Mathematics and its Applications 17, Addison-
Wesley, Reading (1983).
[14] M. Lothaire, Algebraic Combinatorics on Words, Cambridge University Press, Cambridge (2002).
[15] M. Lothaire, Applied Combinatorics on Words, Cambridge University Press, Cambridge (2005).
[16] G. S. Makanin, The problem of solvability of equations in a free semigroup, Mat. Sb. (N.S.) 103(145)
(1977) 147–236
[17] M. Morse, Recurrent geodesics on a surface of negative curvature, Trans. AMS 22 (1921), 84–100.
[18] W. Plandowski, Satisfiability of word equations with constants is in PSPACE, J. ACM 51(3) 483–496
(2004).
[19] M. Rao, Last Cases of Dejean’s Conjecture, Theor. Comput. Sci. 412 (2011), 3010–3018.
[20] M. P. Schützenberger, Une théorie algébrique du codage, Séminaire Dubreil-Pisot, 1955–1956, Exposé No.
15.
[21] A. I. Shirshov, Bases of free Lie algebras, Algebra i Logika 1, 1962, 14–19.
[22] A. Thue, Über unendliche Zeichenreihen, Norske Vid. Selsk. Skr. I. Mat. Nat. Kl. Christiania No. 7
(1906).
[23] A. Thue, Über die gegenseitige Lage gleicher Teile gewisser Zeichenreihen, Norske Vid. Selsk. Skr. I.
Mat. Nat. Kl. Christiania 1 (1912) 1–67.
[24] B. L. van der Waerden, Beweis einer Baudetschen Vermutung, Nieuw. Arch. Wisk. 15 (1927) 212–216.