
Journal of Automata, Languages and Combinatorics 14 (2009) 2, –

© Otto-von-Guericke-Universität Magdeburg

ON FAIR WORDS

Anton Černý
Department of Information Science, Kuwait University
P.O. Box 5969, Safat 13060, Kuwait
e-mail: [email protected]

ABSTRACT
A word is called fair if it contains, for each pair of distinct symbols a, b, the same
number of occurrences of the scattered subword ab as of ba. We provide formulas for
the number of fair words of length up to 10 on a k-letter alphabet and some methods
for constructing fair words. We use a generalization of Parikh vector called p-matrix
to count the number of occurrences of subwords of shape ab in a word.
Keywords: Parikh mapping, Parikh matrix, fair word, palindrome

1. Introduction

Imagine that members of $k \ge 1$ rival groups $a_1, a_2, \ldots, a_k$ want to pass
through a narrow door. In which order should they pass? A solution to this problem
is fair if, for any two distinct groups $a_i$, $a_j$, the number of member pairs in
which a member of $a_i$ precedes a member of $a_j$ is the same as the number of pairs
in which the order is reversed. Any passing order can be denoted by a word on the
alphabet $\Sigma = \{a_1, a_2, \ldots, a_k\}$ containing as many occurrences of $a_i$
as there are members of the group $a_i$. A word describing a fair solution will be
called fair. Such a word does not always exist. Palindromes are fair, but they are
not the only fair words.
Our interest in these words arose when considering matrix generalizations
of the Parikh mapping [8]. A very promising generalization seems to be the
Parikh matrices [4, 5, 7, 9, 10], which count the number of occurrences of the factors
of the word $a_1 a_2 \ldots a_k$ in words on the ordered alphabet
$\Sigma = \{a_1, a_2, \ldots, a_k\}$. Their extensions have been studied as well
[6, 1, 12]. However, we feel that there is no particular reason to be interested only
in occurrences of factors of the word $a_1 a_2 \ldots a_k$ given by one preferred
ordering of the alphabet. The p-matrices we propose here do not suffer from this lack
of symmetry. This, of course, does not necessarily imply that they will prove to be a
more useful tool, in particular since the structure of words is not reflected by the
conventional matrix calculus, as is the case for Parikh matrices. In a recent study
[11], generalized Parikh matrices were used to investigate the difference between the
numbers of occurrences of the subwords $ab$ and $ba$ in binary words. The words with
difference 0 are the fair words (called subword balanced¹ in [11]). Theorem 29
restates the result from there.
We first introduce here p-matrices, p-morphisms and the induced p-equivalence of
words. Then we switch to investigation of fair words; in particular, we are interested
in counting fair words of a given length. We provide an inclusion-exclusion principle,
explicit formulas for the number of fair words of length up to 10 and a conjecture
regarding the general formula.

2. Precedence Matrices

Throughout the paper we consider a fixed ordered alphabet
$\Sigma = \{a_1, a_2, \ldots, a_k\}$, $k \ge 1$. We denote by $\Sigma^*$ the set of all
words on $\Sigma$, by $\varepsilon$ the empty word and by $w^R$ the mirror image
(reverse) of the word $w$. If not specified otherwise, “word” means a word on $\Sigma$,
and $i, j$ denote two arbitrary values, $1 \le i, j \le k$. We use the term “subword”
in the sense of “scattered subword”. Thus a word $x$ is a subword of a word
$y = c_1 c_2 \ldots c_m$, where $c_1, c_2, \ldots, c_m \in \Sigma$, $m \ge 1$, if
$x = c_{i_1} c_{i_2} \ldots c_{i_r}$ for some $1 \le i_1 < i_2 < \cdots < i_r \le m$,
$r \ge 1$. The $r$-tuple $(i_1, i_2, \ldots, i_r)$ is called an occurrence of $x$ in
$y$. The number of occurrences of $x$ in $y$ is denoted by $|y|_x$. By default, the
empty word $\varepsilon$ is a subword of every word $y$, with $|y|_\varepsilon = 1$.
The Parikh vector of a word $w$ is the vector
$\psi(w) = (|w|_{a_1}, |w|_{a_2}, \ldots, |w|_{a_k})$. The Parikh matrix of a word $w$
is the $(k+1) \times (k+1)$ upper triangular matrix in which all elements on the main
diagonal are equal to 1 and, for $1 \le i \le j \le k$, the element at position
$(i, j+1)$ is $|w|_{a_i a_{i+1} \ldots a_j}$.
We consider the set $F_k$ of all $k \times k$ matrices over the set $\mathbb{Z}$ of all
integers; by “matrix” we mean an element of $F_k$. The zero matrix from $F_k$ is
denoted by $0_k$. For a matrix $X$ and $1 \le i, j \le k$, we denote by $X_{i,j}$ the
element of $X$ at position $(i, j)$. We introduce the following “product” operation
$\circ$ on $F_k$. For $A, B \in F_k$ we define
$$
(A \circ B)_{i,j} =
\begin{cases}
A_{i,i} + B_{i,i} & \text{if } i = j, \\
A_{i,j} + B_{i,j} + A_{i,i} B_{j,j} & \text{if } i \ne j.
\end{cases}
$$

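The operation $\circ$ is associative:
$((A \circ B) \circ C)_{i,i} = A_{i,i} + B_{i,i} + C_{i,i} = (A \circ (B \circ C))_{i,i}$
and, if $i \ne j$,
$((A \circ B) \circ C)_{i,j} = A_{i,j} + B_{i,j} + C_{i,j} + A_{i,i} B_{j,j} + A_{i,i} C_{j,j} + B_{i,i} C_{j,j} = (A \circ (B \circ C))_{i,j}$.
The neutral element for $\circ$ is $0_k$.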
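The definition of the operation can be tried out numerically. The following is a minimal sketch, not part of the original paper: matrices are plain lists of lists, and the names `circ`, `random_matrix` and `check_associativity` are chosen here for illustration. It tests associativity and the neutral element on random integer matrices.

```python
import random

def circ(A, B):
    """Return A ∘ B for two k x k integer matrices given as lists of lists."""
    k = len(A)
    return [[A[i][i] + B[i][i] if i == j
             else A[i][j] + B[i][j] + A[i][i] * B[j][j]
             for j in range(k)] for i in range(k)]

def random_matrix(k, rng, lo=-5, hi=5):
    """A random k x k matrix with integer entries in [lo, hi]."""
    return [[rng.randint(lo, hi) for _ in range(k)] for _ in range(k)]

def check_associativity(k=3, trials=200, seed=0):
    """Test (A∘B)∘C == A∘(B∘C) and A∘0 == 0∘A == A on random samples."""
    rng = random.Random(seed)
    zero = [[0] * k for _ in range(k)]
    for _ in range(trials):
        A, B, C = (random_matrix(k, rng) for _ in range(3))
        if circ(circ(A, B), C) != circ(A, circ(B, C)):
            return False
        if circ(A, zero) != A or circ(zero, A) != A:
            return False
    return True

print(check_associativity())
```

A random test of this kind is only a sanity check; the entry-wise computation above is what actually establishes associativity.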
For a matrix $A$, the matrix $A^-$ defined by
$$
A^-_{i,j} =
\begin{cases}
-A_{i,i} & \text{if } i = j, \\
A_{i,i} A_{j,j} - A_{i,j} & \text{if } i \ne j
\end{cases}
$$
satisfies $A^- \circ A = A \circ A^- = 0_k$. Transposition interacts with $\circ$ in
the same way as with the usual matrix multiplication: for $1 \le i, j \le k$,
$i \ne j$, we have
$((A \circ B)^T)_{i,i} = A_{i,i} + B_{i,i} = (B^T \circ A^T)_{i,i}$ and
$((A \circ B)^T)_{i,j} = A_{j,i} + B_{j,i} + A_{j,j} B_{i,i} = (B^T \circ A^T)_{i,j}$.
¹ We prefer not to use the term “balanced words” since it usually denotes a different
type of words; see [13].

Proposition 1 $(F_k, \circ)$ is a group (non-commutative for $k \ge 2$). For
$A, B \in F_k$, $(A \circ B)^T = B^T \circ A^T$.
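The inverse and the transposition rule of Proposition 1 can be checked in the same spirit. This is an illustrative sketch only; `inverse`, `transpose` and `check_group_laws` are names introduced here, and the random test does not replace the entry-wise argument.

```python
import random

def circ(A, B):
    """A ∘ B as defined in the text."""
    k = len(A)
    return [[A[i][i] + B[i][i] if i == j
             else A[i][j] + B[i][j] + A[i][i] * B[j][j]
             for j in range(k)] for i in range(k)]

def inverse(A):
    """The matrix A^-: negated diagonal, A_ii * A_jj - A_ij off the diagonal."""
    k = len(A)
    return [[-A[i][i] if i == j else A[i][i] * A[j][j] - A[i][j]
             for j in range(k)] for i in range(k)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def check_group_laws(k=3, trials=200, seed=1):
    """Test A^- ∘ A == A ∘ A^- == 0 and (A∘B)^T == B^T ∘ A^T on random samples."""
    rng = random.Random(seed)
    zero = [[0] * k for _ in range(k)]
    for _ in range(trials):
        A = [[rng.randint(-5, 5) for _ in range(k)] for _ in range(k)]
        B = [[rng.randint(-5, 5) for _ in range(k)] for _ in range(k)]
        if circ(inverse(A), A) != zero or circ(A, inverse(A)) != zero:
            return False
        if transpose(circ(A, B)) != circ(transpose(B), transpose(A)):
            return False
    return True

# Non-commutativity for k = 2: the two orders of E_a and E_b differ.
E_a, E_b = [[1, 0], [0, 0]], [[0, 0], [0, 1]]
print(check_group_laws(), circ(E_a, E_b) != circ(E_b, E_a))
```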

For a symbol $a_s \in \Sigma$, $1 \le s \le k$, let $E_{a_s}$ be the matrix defined by
$(E_{a_s})_{i,j} = 1$ if $i = j = s$ and $(E_{a_s})_{i,j} = 0$ otherwise. The
precedence morphism (or p-morphism) on $\Sigma$ is the morphism of monoids
$\Phi_\Sigma : (\Sigma^*, \cdot) \to (F_k, \circ)$ (we will mostly omit the subscript
$\Sigma$) defined by $\Phi_\Sigma(a_s) = E_{a_s}$. For a word $w$, the matrix
$\Phi_\Sigma(w)$ is called the precedence matrix (or p-matrix) of $w$. The following
lemma can be easily proved by induction on the length of the word $w$.

Lemma 2 For each word $w$,
$$
\Phi(w)_{i,j} =
\begin{cases}
|w|_{a_i} & \text{if } i = j, \\
|w|_{a_i a_j} & \text{if } i \ne j.
\end{cases}
\qquad \Box
$$
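Lemma 2 can be confirmed by brute force on short words. The sketch below is an illustration under our own naming (`phi`, `count_pairs`, `lemma2_holds`, `check_all`): it builds the p-matrix by applying the morphism letter by letter and compares every entry with a direct count of scattered subwords.

```python
from itertools import product

def circ(A, B):
    """A ∘ B as defined in Section 2."""
    k = len(A)
    return [[A[i][i] + B[i][i] if i == j
             else A[i][j] + B[i][j] + A[i][i] * B[j][j]
             for j in range(k)] for i in range(k)]

def letter_matrix(s, k):
    """E_{a_s}: a single 1 at position (s, s), zeros elsewhere."""
    return [[1 if i == j == s else 0 for j in range(k)] for i in range(k)]

def phi(word, alphabet):
    """p-matrix of `word`, computed as the ∘-product of its letter matrices."""
    k = len(alphabet)
    M = [[0] * k for _ in range(k)]
    for ch in word:
        M = circ(M, letter_matrix(alphabet.index(ch), k))
    return M

def count_pairs(word, x, y):
    """Number of occurrences of the scattered subword xy in `word`."""
    return sum(1 for p in range(len(word)) for q in range(p + 1, len(word))
               if word[p] == x and word[q] == y)

def lemma2_holds(word, alphabet):
    M = phi(word, alphabet)
    for i, x in enumerate(alphabet):
        for j, y in enumerate(alphabet):
            expected = word.count(x) if i == j else count_pairs(word, x, y)
            if M[i][j] != expected:
                return False
    return True

def check_all(alphabet="abc", max_len=6):
    """Check Lemma 2 for every word of length at most max_len."""
    return all(lemma2_holds("".join(w), alphabet)
               for n in range(max_len + 1)
               for w in product(alphabet, repeat=n))

print(check_all())
```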

Thus the p-matrix of a word $w$ shows, for each pair of distinct symbols $a_i$, $a_j$,
how many times $a_i$ precedes $a_j$ in $w$. The main diagonal is the Parikh vector
$\psi(w)$, hence the precedence mapping is a generalization of the Parikh mapping.
Using the fact that, for $i \ne j$,
$|w|_{a_i a_j} + |w|_{a_j a_i} = |w|_{a_i} |w|_{a_j}$, the following corollaries are
easily obtained.

Corollary 3 For $i \ne j$, $\Phi(w)_{i,j} + \Phi(w)_{j,i} = \Phi(w)_{i,i} \Phi(w)_{j,j}$. □

Corollary 4 $(\Phi(w)^-)^T = (\Phi(w)^T)^-$. □

Corollary 5 $\Phi(w^R) = \Phi(w)^T$. □

Corollary 6 $\Phi(w)^-$ is obtained from $\Phi(w^R) = \Phi(w)^T$ by changing the signs
of the elements on the main diagonal. □

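Example 7 Let $\Sigma = \{a, b, c\}$. Then
$$
\Phi(bc^2acb) = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & 3 & 3 \end{pmatrix},
\qquad
\Phi(bcac^2b) = \Phi\big((bc^2acb)^R\big) = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 3 \\ 1 & 3 & 3 \end{pmatrix}.
$$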
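The two matrices of Example 7 can be recomputed directly. The sketch below assumes a single-pass routine `p_matrix` (our name): when a letter $a_s$ is appended, every earlier $a_i$ with $i \ne s$ now precedes it, which is exactly the update prescribed by $\Phi(w) \circ E_{a_s}$.

```python
def p_matrix(word, alphabet):
    """p-matrix of `word` over the ordered `alphabet`, built left to right."""
    idx = {s: i for i, s in enumerate(alphabet)}
    k = len(alphabet)
    m = [[0] * k for _ in range(k)]
    for ch in word:
        s = idx[ch]
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]  # each earlier a_i precedes the new a_s
        m[s][s] += 1
    return m

def transpose(A):
    return [list(row) for row in zip(*A)]

w = "bccacb"                      # the word b c^2 a c b
M = p_matrix(w, "abc")
M_rev = p_matrix(w[::-1], "abc")  # the mirror image b c a c^2 b

print(M)      # [[1, 1, 1], [1, 2, 3], [2, 3, 3]]
print(M_rev)  # [[1, 1, 2], [1, 2, 3], [1, 3, 3]]
print(M_rev == transpose(M))  # Corollary 5
```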

Remark 8 One can easily see that the precedence matrix for a different ordering of
the set Σ can be obtained by permuting the rows and columns accordingly. Thus the
precedence matrix of a word does not strongly depend on the particular ordering of
the alphabet Σ.

Remark 9 If |Σ| = 2 then the upper triangular part of the p-matrix of a word w is
identical to the triangular part of the Parikh matrix of w above (and not including)
the main diagonal. Corollary 3 then implies, that there is a one-to-one correspondence
between Parikh matrices and p-matrices in this case. Therefore many results related
to Parikh matrices on a 2-letter alphabet (e. g., those from [5, 2]) can be directly
translated to the results related to p-matrices.

Let $G_k$ be the set of all matrices $A$ from $F_k$ satisfying, for each pair of
distinct indices $1 \le i, j \le k$, the condition
$A_{i,j} + A_{j,i} = A_{i,i} A_{j,j}$.

Proposition 10 $(G_k, \circ)$ is a subgroup of $(F_k, \circ)$.

Proof. Let $A, B \in G_k$. Then, for $i \ne j$ (since $B^-_{j,i} = B_{i,j}$),
$$
\begin{aligned}
(A \circ B^-)_{i,j} + (A \circ B^-)_{j,i}
&= A_{i,j} + B^-_{i,j} + A_{i,i} B^-_{j,j} + A_{j,i} + B^-_{j,i} + A_{j,j} B^-_{i,i} \\
&= A_{i,j} + A_{j,i} + B^-_{i,j} + B^-_{j,i} + A_{i,i} B^-_{j,j} + A_{j,j} B^-_{i,i} \\
&= A_{i,i} A_{j,j} + B_{i,i} B_{j,j} - A_{i,i} B_{j,j} - A_{j,j} B_{i,i} \\
&= A_{i,i} (A_{j,j} - B_{j,j}) + B_{i,i} (B_{j,j} - A_{j,j}) \\
&= A_{i,i} (A \circ B^-)_{j,j} - B_{i,i} (A \circ B^-)_{j,j} \\
&= (A \circ B^-)_{i,i} (A \circ B^-)_{j,j}. \qquad \Box
\end{aligned}
$$

However, not every element of $G_k$ with non-negative values is a p-matrix. The
necessary condition from Corollary 3, which every precedence matrix satisfies, can be
generalized. We will first formulate the generalized condition in terms of subwords
of length 1 and 2 and then we will use Lemma 2 to translate it to p-matrix terms.

Theorem 11 Let $b_1, b_2, \ldots, b_r$ be $r \ge 2$ distinct symbols from $\Sigma$. Let
$w \in \Sigma^*$. Denote by $m_1(w)$, $m_2(w)$ the smallest and the second smallest,
and by $M_1(w)$, $M_2(w)$ the greatest and the second greatest², respectively, of the
values $|w|_{b_1}, |w|_{b_2}, \ldots, |w|_{b_r}$. Then
$$
m_1(w) m_2(w) \le \sum_{i=1}^{r} |w|_{b_i b_{(i+1) \bmod r}} \le M_1(w) M_2(w),
$$
where the indices are read cyclically, i.e., the symbol following $b_r$ is $b_1$.

Proof. We will prove the first inequality by induction on $|w|$. The second inequality
can be proved in a similar way. For $w = \varepsilon$ the inequality is trivially true.
Consider the word $w = xa$, $a \in \Sigma$, and assume that the assertion is true for
$x$. If $a \notin \{b_1, b_2, \ldots, b_r\}$ then it is true for $w$ as well, since all
the values involved are the same for $w$ and $x$. Let now
$a \in \{b_1, b_2, \ldots, b_r\}$. We may assume $a = b_1$. Then, for
$1 \le i \le r - 1$, $|w|_{b_i b_{i+1}} = |x|_{b_i b_{i+1}}$,
$|w|_{b_{i+1}} = |x|_{b_{i+1}}$, $|w|_{b_1} = |x|_{b_1} + 1$ and
$|w|_{b_r b_1} = |x|_{b_r b_1} + |x|_{b_r} = |x|_{b_r b_1} + |w|_{b_r}$. Therefore
$$
\sum_{i=1}^{r} |w|_{b_i b_{(i+1) \bmod r}}
= \sum_{i=1}^{r} |x|_{b_i b_{(i+1) \bmod r}} + |w|_{b_r}
\ge m_1(x) m_2(x) + |w|_{b_r}.
$$
There are three possible cases.
1. $m_1(w) = m_1(x)$ and $m_2(w) = m_2(x)$.
Then $m_1(x) m_2(x) + |w|_{b_r} \ge m_1(w) m_2(w)$.
2. $m_1(w) = m_1(x)$ and $m_2(w) = m_2(x) + 1$.
Then $|w|_{b_r} = |x|_{b_r} \ge m_1(x)$ and
$m_1(x) m_2(x) + |w|_{b_r} \ge m_1(x)(m_2(x) + 1) = m_1(w) m_2(w)$.
3. $m_1(w) = m_1(x) + 1$ and $m_2(w) = m_2(x)$.
Then $m_1(w) = |w|_{b_1}$ and $|w|_{b_r} \ge m_2(w) = m_2(x)$. Therefore
$m_1(x) m_2(x) + |w|_{b_r} \ge (m_1(x) + 1) m_2(x) = m_1(w) m_2(w)$. □
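The lower bound just proved can be checked exhaustively on short words. The following sketch uses names introduced here (`count_pairs`, `cyclic_sum`, `lower_bound_holds`) and runs over all words up to a given length and all ordered selections of $r \ge 2$ distinct symbols. It checks only the left-hand inequality, which is the one proved above and the one used in Example 13. The right-hand inequality is deliberately left out: for $w = abc$ and $(b_1, b_2, b_3) = (a, b, c)$ the cyclic sum is $|w|_{ab} + |w|_{bc} + |w|_{ca} = 2$ while $M_1(w) M_2(w) = 1$, so the upper bound as printed appears to need an extra hypothesis or a correction when $r \ge 3$.

```python
from itertools import permutations, product

def count_pairs(word, x, y):
    """Occurrences of the scattered subword xy (x != y) in `word`."""
    seen_x = total = 0
    for ch in word:
        if ch == y:
            total += seen_x
        elif ch == x:
            seen_x += 1
    return total

def cyclic_sum(word, symbols):
    """Sum of |w|_{b_i b_{i+1}} over the cycle b_1 -> b_2 -> ... -> b_r -> b_1."""
    r = len(symbols)
    return sum(count_pairs(word, symbols[i], symbols[(i + 1) % r])
               for i in range(r))

def lower_bound_holds(alphabet="abc", max_len=6):
    """Check m1(w) * m2(w) <= cyclic sum for all short words and selections."""
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            for r in range(2, len(alphabet) + 1):
                for symbols in permutations(alphabet, r):
                    counts = sorted(w.count(s) for s in symbols)
                    if counts[0] * counts[1] > cyclic_sum(w, symbols):
                        return False
    return True

print(lower_bound_holds())
print(cyclic_sum("abc", "abc"))  # 2, the value quoted in the lead-in
```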
² Possibly $m_1(w) = m_2(w)$ and/or $M_1(w) = M_2(w)$.

Corollary 12 Let $A \in F_k$ and let $1 \le j_1, j_2, \ldots, j_r \le k$, $r \ge 2$, be
distinct indices. Denote by $m_1$, $m_2$ the smallest and the second smallest, and by
$M_1$, $M_2$ the greatest and the second greatest³, respectively, of the values
$A_{j_1,j_1}, A_{j_2,j_2}, \ldots, A_{j_r,j_r}$. If $A$ is a p-matrix then
$$
m_1 m_2 \le \sum_{i=1}^{r} A_{j_i, j_{(i+1) \bmod r}} \le M_1 M_2. \qquad \Box
$$

Example 13 The matrix
$$
M = \begin{pmatrix} 2 & 0 & 5 \\ 4 & 2 & 2 \\ 1 & 4 & 3 \end{pmatrix}
$$
satisfies, for each pair of distinct indices $1 \le i, j \le 3$, the condition
$M_{i,j} + M_{j,i} = M_{i,i} M_{j,j}$, but it is not a p-matrix, since
$M_{1,2} + M_{2,3} + M_{3,1} = 3 < 2 \cdot 2$, where $M_{1,1} = M_{2,2} = 2$ are the two
smallest of the values $M_{1,1}, M_{2,2}, M_{3,3}$.
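Example 13 is small enough to confirm by exhaustive search: a word with this p-matrix would have Parikh vector $(2, 2, 3)$, and there are only 210 such words. The sketch below is illustrative; `satisfies_pair_condition` and `is_p_matrix` are names chosen here, and the search is only practical for short words.

```python
from itertools import permutations

def p_matrix(word, alphabet):
    """p-matrix of `word` over the ordered `alphabet`."""
    idx = {s: i for i, s in enumerate(alphabet)}
    k = len(alphabet)
    m = [[0] * k for _ in range(k)]
    for ch in word:
        s = idx[ch]
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return m

def satisfies_pair_condition(A):
    """Membership in G_k: A_ij + A_ji == A_ii * A_jj for all i != j."""
    k = len(A)
    return all(A[i][j] + A[j][i] == A[i][i] * A[j][j]
               for i in range(k) for j in range(i))

def is_p_matrix(A, alphabet):
    """Search all words whose Parikh vector is the diagonal of A."""
    letters = "".join(s * A[i][i] for i, s in enumerate(alphabet))
    return any(p_matrix(w, alphabet) == A for w in set(permutations(letters)))

M = [[2, 0, 5], [4, 2, 2], [1, 4, 3]]
print(satisfies_pair_condition(M))  # True
print(is_p_matrix(M, "abc"))        # False
```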

3. Fair Words

Two words $x$, $y$ on $\Sigma$ will be called p-equivalent (denoted $x \equiv_\Sigma y$;
we will mostly omit the subscript $\Sigma$) if $\Phi(x) = \Phi(y)$. If two words are
p-equivalent, then they are of the same length and contain the same number of
occurrences of each symbol. The shortest distinct p-equivalent words are $abba$ and
$baab$. Since $\Phi$ is a morphism, the p-equivalence is a congruence with respect to
concatenation. Corollary 5 implies that the p-equivalence is preserved by the mirror
image operation.

Proposition 14 Let $x, y, y', z$ be words such that $y \equiv y'$. Then
$xyz \equiv xy'z$ and $y^R \equiv y'^R$. □

On the other hand, the fact that $xy \equiv x'y'$ and $|x| = |x'|$ does not imply
$x \equiv x'$, as illustrated by the following example.

Example 15 $abba \equiv baab$ but $ab \not\equiv ba$. (Actually, no two proper prefixes
of $abba$ and $baab$ of the same length are equivalent.)
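The claim that $abba$ and $baab$ are the shortest distinct p-equivalent words can be confirmed for a binary alphabet by grouping all words of a given length by their p-matrix. This is a sketch with names of our choosing (`equivalence_classes`); it lists only classes with more than one member.

```python
from collections import defaultdict
from itertools import product

def p_matrix(word, alphabet):
    """p-matrix of `word` over the ordered `alphabet`."""
    idx = {s: i for i, s in enumerate(alphabet)}
    k = len(alphabet)
    m = [[0] * k for _ in range(k)]
    for ch in word:
        s = idx[ch]
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return m

def equivalence_classes(alphabet, n):
    """Non-trivial p-equivalence classes among the words of length n."""
    classes = defaultdict(list)
    for w in product(alphabet, repeat=n):
        word = "".join(w)
        key = tuple(map(tuple, p_matrix(word, alphabet)))
        classes[key].append(word)
    return sorted(sorted(c) for c in classes.values() if len(c) > 1)

for n in range(5):
    print(n, equivalence_classes("ab", n))
```

For lengths 0 to 3 no two distinct binary words share a p-matrix; at length 4 the only non-trivial class is the pair from Example 15.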

A word $x$ on $\Sigma$ is fair if $\Phi(x) = \Phi(x)^T$. Fairness means that no symbol
is preferred to any other. All words of length 0 and 1 are fair, and so are all words
on a singleton alphabet. All palindromes are fair, but there are other fair words as
well. Hence fair words may be considered to be a generalization of palindromes.

Example 16 One can easily check that the fair words of length up to 6 are exactly
the palindromes (the alphabet size does not matter). Starting from length 7,
non-palindromic fair words exist: $ab^3a^2b$, $ab^3a^2bab$, $ab^2a^2baba^2b$ and
$abc^2babcbacb^2ca$ are a few examples.
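Example 16 lends itself to a direct search. The sketch below, with helper names of our own (`is_fair`, `non_palindromic_fair`), enumerates words over a small alphabet, keeps the fair ones that are not palindromes, and also tests the four words listed in the example.

```python
from itertools import product

def p_matrix(word, alphabet):
    """p-matrix of `word` over the ordered `alphabet`."""
    idx = {s: i for i, s in enumerate(alphabet)}
    k = len(alphabet)
    m = [[0] * k for _ in range(k)]
    for ch in word:
        s = idx[ch]
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return m

def is_fair(word):
    """A word is fair iff its p-matrix is symmetric."""
    alphabet = sorted(set(word))
    m = p_matrix(word, alphabet)
    k = len(alphabet)
    return all(m[i][j] == m[j][i] for i in range(k) for j in range(i))

def non_palindromic_fair(alphabet, n):
    """All fair words of length n over `alphabet` that are not palindromes."""
    words = ("".join(w) for w in product(alphabet, repeat=n))
    return [w for w in words if w != w[::-1] and is_fair(w)]

print([len(non_palindromic_fair("abc", n)) for n in range(7)])
print(non_palindromic_fair("ab", 7))

# The four words of Example 16, written out in full.
examples = ["abbbaab", "abbbaabab", "abbaababaab", "abccbabcbacbbca"]
print([is_fair(w) for w in examples])
```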

The following Propositions 17 and 19 follow directly from Corollary 3.


³ Possibly $m_1 = m_2$ and/or $M_1 = M_2$.

Proposition 17 If $x$ is a fair word, then at most one element on the main diagonal
of the matrix $\Phi(x)$ (i.e., in the Parikh vector $\psi(x)$) is odd. □

Corollary 18 A fair word of length $n \ge 0$ contains at most $\lceil n/2 \rceil$
distinct symbols. □

Proposition 19 If a word $x$ is fair then $\Phi(x)$ is uniquely determined by the
Parikh vector $\psi(x)$. □

Corollary 20 Any two fair words having the same Parikh vector are equivalent. □
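The step from Corollary 3 to Propositions 17 and 19 is short; spelled out, it might read as follows (a worked derivation added for convenience, using only the definitions above).

```latex
% Let x be fair, i.e. \Phi(x) = \Phi(x)^T, and let i \ne j.
% Corollary 3 with \Phi(x)_{i,j} = \Phi(x)_{j,i} gives
\[
  2\,\Phi(x)_{i,j} \;=\; \Phi(x)_{i,j} + \Phi(x)_{j,i}
                   \;=\; \Phi(x)_{i,i}\,\Phi(x)_{j,j}
                   \;=\; |x|_{a_i}\,|x|_{a_j}.
\]
% Proposition 19: every off-diagonal entry is determined by the diagonal,
\[
  \Phi(x)_{i,j} \;=\; \tfrac{1}{2}\,|x|_{a_i}\,|x|_{a_j}.
\]
% Proposition 17: the product |x|_{a_i} |x|_{a_j} is even for every pair
% i \ne j, so two distinct symbols cannot both occur an odd number of times.
```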

Lemma 21 Let $x$ be a fair word and $y$ a word. Then the word $yxy^R$ is fair.
Proof. $\Phi(yxy^R)^T = (\Phi(y) \circ \Phi(x) \circ \Phi(y^R))^T = \Phi(y^R)^T \circ \Phi(x)^T \circ \Phi(y)^T = \Phi(y) \circ \Phi(x) \circ \Phi(y^R) = \Phi(yxy^R)$. □

Lemma 22 If $uvz$ is a fair word and $v' \equiv v$ then $uv'z$ is a fair word and
$uv'z \equiv uvz$.
Proof. Since both $v' \equiv v$ and $v'^R \equiv v^R$, Proposition 14 implies
$(uv'z)^R = z^R v'^R u^R \equiv z^R v^R u^R = (uvz)^R \equiv uvz \equiv uv'z$. □
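Both constructions can be tried on concrete words. In the sketch below (helper names are ours), Lemma 21 wraps the non-palindromic fair word $ab^3a^2b$ in $y \ldots y^R$ with $y = cab$, and Lemma 22 replaces the first factor $abba$ of the palindrome $abba\,c\,abba$ by the p-equivalent word $baab$.

```python
def p_matrix(word, alphabet):
    """p-matrix of `word` over the ordered `alphabet`."""
    idx = {s: i for i, s in enumerate(alphabet)}
    k = len(alphabet)
    m = [[0] * k for _ in range(k)]
    for ch in word:
        s = idx[ch]
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return m

def is_fair(word, alphabet):
    m = p_matrix(word, alphabet)
    k = len(alphabet)
    return all(m[i][j] == m[j][i] for i in range(k) for j in range(i))

# Lemma 21: y x y^R is fair whenever x is fair.
x, y = "abbbaab", "cab"
wrapped = y + x + y[::-1]

# Lemma 22: swapping a factor for a p-equivalent one keeps the word fair.
original = "abba" + "c" + "abba"
swapped = "baab" + "c" + "abba"

print(wrapped, is_fair(wrapped, "abc"))
print(swapped, is_fair(swapped, "abc"),
      p_matrix(swapped, "abc") == p_matrix(original, "abc"))
```

Neither `wrapped` nor `swapped` is a palindrome, so both constructions produce fair words outside the palindromes.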

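Problem 23 Can every fair word be obtained from a word constructed as in Lemma 21
for $|x| \le 1$ by finitely many substitutions of some proper factor $v$ by $v'$ as in
Lemma 22?

Lemma 24 Let $x, y, y'$ be words. If $xy \equiv xy'$ or $yx \equiv y'x$ then
$y \equiv y'$.

Proof. Follows from the fact that $(F_k, \circ, 0_k)$ is a group: the common factor
$\Phi(x)$ can be cancelled from $\Phi(x) \circ \Phi(y) = \Phi(x) \circ \Phi(y')$ and
from $\Phi(y) \circ \Phi(x) = \Phi(y') \circ \Phi(x)$. □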

Let Γ, ∆ be alphabets and h : Γ∗ → ∆∗ a morphism.

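Lemma 25 If $x, y \in \Gamma^*$ are such that $x \equiv_\Gamma y$ then
$h(x) \equiv_\Delta h(y)$.

Proof. Let $\Gamma = \{b_1, b_2, \ldots, b_m\}$, $\Delta = \{c_1, c_2, \ldots, c_n\}$ and
let $1 \le p, q \le m$, $1 \le r, s \le n$, $p \ne q$, $r \ne s$. For simplicity, for
any word $v \in \Gamma^*$ we denote $|v|_p = |v|_{b_p}$, $|v|_{p,q} = |v|_{b_p b_q}$, and
for any word $w \in \Delta^*$, $|w|_r = |w|_{c_r}$, $|w|_{r,s} = |w|_{c_r c_s}$. Then, for
each occurrence of $b_p$ in $x$, $h(x)$ contains a factor $h(b_p)$ containing
$|h(b_p)|_{r,s}$ subwords $c_r c_s$. For each subword $b_p b_q$ of $x$ there are
$|h(b_p)|_r |h(b_q)|_s$ subwords $c_r c_s$ of $h(x)$ such that $c_r$, $c_s$ are
contained in the factors $h(b_p)$, $h(b_q)$ being images of the particular occurrences
of $b_p$, $b_q$, respectively. The same holds for each pair of distinct occurrences of
the same symbol $b_p$, of which there are $\binom{|x|_p}{2}$. Hence
$$
\begin{aligned}
\Phi_\Delta(h(x))_{r,r} &= |h(x)|_r \\
&= \sum_{p} |x|_p \, |h(b_p)|_r \\
&= \sum_{p} |y|_p \, |h(b_p)|_r \\
&= |h(y)|_r \\
&= \Phi_\Delta(h(y))_{r,r}
\end{aligned}
$$
and
$$
\begin{aligned}
\Phi_\Delta(h(x))_{r,s} &= |h(x)|_{r,s} \\
&= \sum_{p} |x|_p \, |h(b_p)|_{r,s}
 + \sum_{p} \binom{|x|_p}{2} |h(b_p)|_r \, |h(b_p)|_s
 + \sum_{p \ne q} |x|_{p,q} \, |h(b_p)|_r \, |h(b_q)|_s \\
&= \sum_{p} |y|_p \, |h(b_p)|_{r,s}
 + \sum_{p} \binom{|y|_p}{2} |h(b_p)|_r \, |h(b_p)|_s
 + \sum_{p \ne q} |y|_{p,q} \, |h(b_p)|_r \, |h(b_q)|_s \\
&= |h(y)|_{r,s} \\
&= \Phi_\Delta(h(y))_{r,s}. \qquad \Box
\end{aligned}
$$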

The morphism h will be called fair if all the words h(a), a ∈ Γ, are fair. It will be
called unfair if none of the words h(a) is fair.

Corollary 26 If $h$ is fair then $h(x)$ is fair for every fair word $x \in \Gamma^*$.

Proof. Consider the mirror image morphism $h^R : \Gamma^* \to \Delta^*$ defined for
$b \in \Gamma$ as $h^R(b) = h(b)^R$. Then $h^R(b) \equiv_\Delta h(b)$ and Lemma 22
implies that replacing each factor $h(b)$ in $h(x^R)$ by $h^R(b)$ yields a word
equivalent to $h(x^R)$. Hence
$h(x) \equiv_\Delta h(x^R) \equiv_\Delta h^R(x^R) = h(x)^R$. □

Corollary 27 A word obtained from a fair word by erasing all occurrences of some
symbol is fair. □

Example 28 Consider the morphism $h : \{0, 1\}^* \to \{a, b\}^*$ defined as $h(0) = ab$,
$h(1) = ba$. Then $h(01) = abba \equiv_{\{a,b\}} baab = h(10)$ but
$01 \not\equiv_{\{0,1\}} 10$. Therefore equivalence of morphic images does not
necessarily imply the equivalence of the original words.
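The behaviour of morphisms described in Lemma 25, Corollary 26 and Example 28 can be observed on small inputs. In the sketch below a morphism is a dictionary from symbols to words; the morphism `g` and the helper names are assumptions made for the illustration, while `h28` is the morphism of Example 28.

```python
def p_matrix(word, alphabet):
    """p-matrix of `word` over the ordered `alphabet`."""
    idx = {s: i for i, s in enumerate(alphabet)}
    k = len(alphabet)
    m = [[0] * k for _ in range(k)]
    for ch in word:
        s = idx[ch]
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return m

def equivalent(x, y, alphabet):
    return p_matrix(x, alphabet) == p_matrix(y, alphabet)

def is_fair(word, alphabet):
    return equivalent(word, word[::-1], alphabet)  # Corollary 5

def apply_morphism(h, word):
    return "".join(h[ch] for ch in word)

# Lemma 25: images of equivalent words are equivalent (g is a hypothetical example).
g = {"a": "ca", "b": "b"}
print(equivalent(apply_morphism(g, "abba"), apply_morphism(g, "baab"), "abc"))

# Example 28: equivalent images, non-equivalent originals.
h28 = {"0": "ab", "1": "ba"}
print(equivalent(apply_morphism(h28, "01"), apply_morphism(h28, "10"), "ab"),
      equivalent("01", "10", "01"))

# Corollary 26: a fair morphism maps the fair word ab^3a^2b to a fair word.
fair_h = {"a": "abba", "b": "bab"}
print(is_fair(apply_morphism(fair_h, "abbbaab"), "ab"))
```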

The language $L_{sym}$ consisting of all fair words was considered in [11] for the case
$|\Sigma| = 2$ and its position within the Chomsky hierarchy has been determined. The
proof from [11] is valid for larger alphabets as well, hence the result can be
extended as follows.

Theorem 29 For $|\Sigma| \ge 2$, the language $L_{sym}$ is context-sensitive, but not
context-free.

A natural question is what portion of all words are fair words. We denote as F (k, n)
the number of fair words of length n on a k-symbol alphabet. Table 1 shows these
numbers, as well as the percentage of the total number of words of the given length,
for a few small values k, n.
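For small parameters the numbers $F(k, n)$ can be recomputed by exhaustive enumeration. The sketch below is a plain brute-force count, not the method used in the paper (which the text does not describe); symbols are represented by the integers $0, \ldots, k-1$ and the function names are ours. The running time grows like $k^n$, so it is only meant for the top-left corner of Table 1.

```python
from itertools import product

def is_fair(word, k):
    """`word` is a tuple of symbol indices 0..k-1; fair iff its p-matrix is symmetric."""
    m = [[0] * k for _ in range(k)]
    for s in word:
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return all(m[i][j] == m[j][i] for i in range(k) for j in range(i))

def count_fair(k, n):
    """F(k, n): number of fair words of length n over a k-symbol alphabet."""
    return sum(is_fair(w, k) for w in product(range(k), repeat=n))

for n in range(9):
    total = 2 ** n
    fair = count_fair(2, n)
    print(f"n={n}: F(2,n)={fair}, {100 * fair / total:.2f}%")
```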

The table indicates that the values of $F(k, n)$ tend to increase with increasing $n$,
while the proportion of fair words tends to decrease. We do not know whether the limit
of this proportion exists; we assume it does and is equal to 0 (see Conjecture 34). Of
interest are the slight drops of $F(k, n)$ when passing from an odd $n$ to the next
even $n$ (in Table 1, from $n = 7$ onwards). Again, we do not know whether this
pattern persists for larger values of $n$, nor how to explain it. Conjecture 34
indicates why, for even $n$, the absolute difference between $F(k, n-1)$ and $F(k, n)$
is smaller than that between $F(k, n)$ and $F(k, n+1)$. For small values of $n$ we are
able to establish a recurrence relation for $F(k, n)$. To obtain it, we first define,
for $0 \le i \le p \le k$,
$$
a_{p,k,i} = (-1)^{p-i} \binom{k-i}{p-i}.
$$

Lemma 30
$$
a_{p,k,i} =
\begin{cases}
1 & \text{if } i = p, \\[4pt]
-\displaystyle\sum_{j=i}^{p-1} \binom{k-j}{p-j} a_{j,k,i} & \text{if } i < p.
\end{cases}
$$

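Proof. The assertion for $i = p$ is obvious. Assume $i < p$. Then
$$
\begin{aligned}
a_{p,k,i} + \sum_{j=i}^{p-1} \binom{k-j}{p-j} a_{j,k,i}
&= \sum_{j=i}^{p} \binom{k-j}{p-j} a_{j,k,i} \\
&= \sum_{j=i}^{p} (-1)^{j-i} \binom{k-j}{p-j} \binom{k-i}{j-i} \\
&= \binom{k-i}{p-i} \sum_{j=0}^{p-i} (-1)^{j} \binom{p-i}{j} \\
&= 0. \qquad \Box
\end{aligned}
$$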
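Lemma 30 is a finite identity between binomial coefficients, so it can be checked mechanically for small parameters. A minimal sketch, with function names of our choosing:

```python
from math import comb

def a(p, k, i):
    """The coefficient a_{p,k,i} = (-1)^(p-i) * C(k-i, p-i)."""
    return (-1) ** (p - i) * comb(k - i, p - i)

def lemma30_holds(max_k=10):
    """Check the recurrence of Lemma 30 for all 0 <= i <= p <= k <= max_k."""
    for k in range(max_k + 1):
        for p in range(k + 1):
            for i in range(p + 1):
                if i == p:
                    expected = 1
                else:
                    expected = -sum(comb(k - j, p - j) * a(j, k, i)
                                    for j in range(i, p))
                if a(p, k, i) != expected:
                    return False
    return True

print(lemma30_holds())
```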

Now we consider the number $F_p(k, n)$ of fair words of length $n$ on a $k$-symbol
alphabet containing exactly $p \ge 0$ distinct symbols. We have the following
inclusion-exclusion principle.

Theorem 31 For $0 \le p \le k$,
$$
F_p(k, n) = \sum_{i=0}^{p} (-1)^{p-i} \binom{k-i}{p-i} \binom{k}{i} F(i, n).
$$

Proof. Induction on $p$. The assertion is trivial for $p = 0$. Let us assume it is
true for all values smaller than some $p \ge 1$. To evaluate $F_p(k, n)$, for each of
the $\binom{k}{p}$ $p$-tuples of symbols we count the $F(p, n)$ fair words consisting
of the symbols of the $p$-tuple only. The words not containing all the $p$ distinct
symbols are counted here as well, even several times, as the $p$-tuples intersect.
Each word containing exactly $j < p$ distinct symbols is counted $\binom{k-j}{p-j}$
times, since this is the number of ways the $j$-tuple can be completed to a
$p$-tuple. Using the induction hypothesis, an interchange of the order of summation
and, in the second-to-last step, Lemma 30, we get
$$
\begin{aligned}
F_p(k, n)
&= \binom{k}{p} F(p, n) - \sum_{j=0}^{p-1} \binom{k-j}{p-j} F_j(k, n) \\
&= \binom{k}{p} F(p, n) - \sum_{j=0}^{p-1} \binom{k-j}{p-j} \sum_{i=0}^{j} a_{j,k,i} \binom{k}{i} F(i, n) \\
&= \binom{k}{p} F(p, n) - \sum_{j=0}^{p-1} \sum_{i=0}^{j} \binom{k-j}{p-j} a_{j,k,i} \binom{k}{i} F(i, n) \\
&= \binom{k}{p} F(p, n) - \sum_{i=0}^{p-1} \sum_{j=i}^{p-1} \binom{k-j}{p-j} a_{j,k,i} \binom{k}{i} F(i, n) \\
&= \binom{k}{p} F(p, n) + \sum_{i=0}^{p-1} a_{p,k,i} \binom{k}{i} F(i, n) \\
&= \sum_{i=0}^{p} a_{p,k,i} \binom{k}{i} F(i, n). \qquad \Box
\end{aligned}
$$

         k = 2            k = 3            k = 4            k = 5
 n  F(k,n)        %  F(k,n)        %  F(k,n)        %  F(k,n)        %
 0       1   100.00       1   100.00       1   100.00       1   100.00
 1       2   100.00       3   100.00       4   100.00       5   100.00
 2       2    50.00       3    33.33       4    25.00       5    20.00
 3       4    50.00       9    33.33      16    25.00      25    20.00
 4       4    25.00       9    11.11      16     6.25      25     4.00
 5       8    25.00      27    11.11      64     6.25     125     4.00
 6       8    12.50      27     3.70      64    1.563     125     0.80
 7      20    15.63      93     4.25     280     1.71     665     0.85
 8      18     7.03      87     1.33     268     0.41     645     0.17
 9      52    10.16     333     1.69    1264     0.48    3625     0.19
10      48     4.69     309     0.52    1192     0.11    3465     0.04
11     152     7.42    1323     0.75    6160     0.15
12     138     3.37    1185     0.22    5620     0.03
13     472     5.76    5709     0.36
14     428     2.61    5007     0.11
15    1520     4.64   26775     0.19
16    1392     2.12   23217     0.05
17    5044     3.85
18    4652     1.78
19   17112     3.26
20   15884     1.52

Table 1: Frequency of fair words. For each alphabet size, the second column gives the
percentage of the words of the given length that are fair.
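Theorem 31 can be compared with a direct count for a small alphabet. The sketch below is illustrative and uses brute-force enumeration with names of our own: `F` counts all fair words, `F_exact` counts those using exactly $p$ distinct symbols, and `theorem31_rhs` evaluates the right-hand side of the theorem.

```python
from functools import lru_cache
from itertools import product
from math import comb

def is_fair(word, k):
    """`word` is a tuple of symbol indices 0..k-1; fair iff its p-matrix is symmetric."""
    m = [[0] * k for _ in range(k)]
    for s in word:
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return all(m[i][j] == m[j][i] for i in range(k) for j in range(i))

@lru_cache(maxsize=None)
def F(k, n):
    """Number of fair words of length n over k symbols (brute force)."""
    return sum(is_fair(w, k) for w in product(range(k), repeat=n))

def F_exact(p, k, n):
    """Number of fair words of length n over k symbols using exactly p of them."""
    return sum(is_fair(w, k) for w in product(range(k), repeat=n)
               if len(set(w)) == p)

def theorem31_rhs(p, k, n):
    return sum((-1) ** (p - i) * comb(k - i, p - i) * comb(k, i) * F(i, n)
               for i in range(p + 1))

k = 3
print(all(F_exact(p, k, n) == theorem31_rhs(p, k, n)
          for n in range(7) for p in range(k + 1)))
```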

Now we can formulate the recurrence relation for F (k, n) for values of n that are
small compared to k.

Theorem 32 If $0 \le n \le 2k - 2$ then
$$
F(k, n) = \sum_{i=0}^{\lceil n/2 \rceil} \sum_{p=i}^{\lceil n/2 \rceil} (-1)^{p-i} \binom{k-i}{p-i} \binom{k}{i} F(i, n).
$$

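Proof. Using Corollary 18 and Theorem 31, and interchanging the order of summation in
the last step, we obtain
$$
\begin{aligned}
F(k, n) &= \sum_{p=0}^{\lceil n/2 \rceil} F_p(k, n) \\
&= \sum_{p=0}^{\lceil n/2 \rceil} \sum_{i=0}^{p} a_{p,k,i} \binom{k}{i} F(i, n) \\
&= \sum_{i=0}^{\lceil n/2 \rceil} \sum_{p=i}^{\lceil n/2 \rceil} a_{p,k,i} \binom{k}{i} F(i, n). \qquad \Box
\end{aligned}
$$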
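The recurrence of Theorem 32 expresses $F(k, n)$ through the values $F(i, n)$ with $i \le \lceil n/2 \rceil < k$. The sketch below evaluates the double sum with brute-force values of $F(i, n)$ and compares the result with a direct count; the function names are ours, and the enumeration is only feasible for small $k$ and $n$.

```python
from functools import lru_cache
from itertools import product
from math import comb

def is_fair(word, k):
    """`word` is a tuple of symbol indices 0..k-1; fair iff its p-matrix is symmetric."""
    m = [[0] * k for _ in range(k)]
    for s in word:
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
    return all(m[i][j] == m[j][i] for i in range(k) for j in range(i))

@lru_cache(maxsize=None)
def F(k, n):
    """Number of fair words of length n over k symbols (brute force)."""
    return sum(is_fair(w, k) for w in product(range(k), repeat=n))

def theorem32_rhs(k, n):
    """Double sum of Theorem 32; requires 0 <= n <= 2k - 2."""
    top = (n + 1) // 2  # ceil(n / 2)
    return sum((-1) ** (p - i) * comb(k - i, p - i) * comb(k, i) * F(i, n)
               for i in range(top + 1) for p in range(i, top + 1))

for k in range(2, 5):
    print(k, [(F(k, n), theorem32_rhs(k, n)) for n in range(2 * k - 1)])
```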

Proposition 33 For $n > 0$, $F(0, n) = 0$ and for $k \ge 0$,
$$
\begin{aligned}
F(k, 0) &= 1, \\
F(k, 1) = F(k, 2) &= k, \\
F(k, 3) = F(k, 4) &= k^2, \\
F(k, 5) = F(k, 6) &= k^3, \\
F(k, 7) &= k(k^3 + 2k - 2), \\
F(k, 8) &= k(k^3 + k - 1), \\
F(k, 9) &= k(k^4 + 5k^2 - 5k), \\
F(k, 10) &= k(k^4 + 3k^2 - k - 2).
\end{aligned}
$$
Proof. Those of the assertions which are not obvious or do not follow immediately
from Table 1 can be proved as an exercise in formal algebraic transformations. For
example, the last assertion is true for $k \le 5$, as seen from Table 1. For
$k \ge 6$, Theorem 32 yields
$$
F(k, 10) = \sum_{i=1}^{5} \sum_{p=i}^{5} (-1)^{p-i} \binom{k-i}{p-i} \binom{k}{i} \, i(i^4 + 3i^2 - i - 2);
$$
this expression evaluates to $k(k^4 + 3k^2 - k - 2)$. □
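The closed forms for $n = 7, \ldots, 10$ can be cross-checked in two ways: against the entries of Table 1 for $k = 2, \ldots, 5$, and against the double sum of Theorem 32 for larger $k$, where the values $F(i, n)$ are taken from the formula itself. The sketch below is such a consistency check, not a proof; the dictionaries `FORMULA` and `TABLE1` and the function name are ours, with the table values copied from Table 1.

```python
from math import comb

# Closed forms from Proposition 33.
FORMULA = {
    7: lambda k: k * (k**3 + 2 * k - 2),
    8: lambda k: k * (k**3 + k - 1),
    9: lambda k: k * (k**4 + 5 * k**2 - 5 * k),
    10: lambda k: k * (k**4 + 3 * k**2 - k - 2),
}

# Rows n = 7..10 of Table 1 for k = 2, 3, 4, 5.
TABLE1 = {
    7: [20, 93, 280, 665],
    8: [18, 87, 268, 645],
    9: [52, 333, 1264, 3625],
    10: [48, 309, 1192, 3465],
}

def theorem32_from_formula(k, n):
    """Double sum of Theorem 32 with F(i, n) replaced by the closed form."""
    top = (n + 1) // 2  # ceil(n / 2)
    return sum((-1) ** (p - i) * comb(k - i, p - i) * comb(k, i) * FORMULA[n](i)
               for i in range(1, top + 1) for p in range(i, top + 1))

for n in sorted(FORMULA):
    matches_table = [FORMULA[n](k) for k in range(2, 6)] == TABLE1[n]
    matches_sum = all(theorem32_from_formula(k, n) == FORMULA[n](k)
                      for k in range(6, 13))
    print(n, matches_table, matches_sum)
```

For instance, for $n = 10$ and $k = 6$ both the double sum and the closed form give 8376.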
Proposition 33 leads us to the conjecture that each $F(k, n)$, $n \ge 1$, can be
expressed by a polynomial in $k$ of degree $\lceil n/2 \rceil$ with leading coefficient
equal to 1. Moreover, for $k \ge 1$, $F(k, n)$ is a multiple of $k$, since the number
of fair words beginning with one particular symbol is the same for each of the $k$
symbols. Finally, for $k = 1$ the value of the polynomial should be $F(1, n) = 1$:

Conjecture 34 For $n \ge 1$, $k \ge 1$,
$F(k, n) = k\,(k^{\lceil n/2 \rceil - 1} + p(k))$, where $p(k)$ is a polynomial with
integer coefficients in the variable $k$ of degree at most $\lceil n/2 \rceil - 2$, or
the zero polynomial, and $p(1) = 0$. □

We finish by extending the notion of fairness to (one-way) infinite words (see [3],
Chapter 2, for related definitions). An infinite word $w = b_1 b_2 \ldots$,
$b_i \in \Sigma$, is fair if infinitely many prefixes of $w$ are fair. The following
lemma provides an easy method of constructing infinite fair words.

Lemma 35 Let $h : \Sigma^* \to \Sigma^*$ be a fair morphism. If $h(a) \in a\Sigma^+$ for
some $a \in \Sigma$ then $\lim_{n \to \infty} h^n(a)$ is a fair infinite word.

Proof. Corollary 26 implies that all the words $h^n(a)$ are fair. □

Example 36 The fair morphism $a \mapsto aba$, $b \mapsto b$ yields the fair periodic
infinite word
abababababa ...
The fair morphism $a \mapsto abba$, $b \mapsto bab$ yields the fair infinite word
abbababbababbabababbabab ...
The morphism $a \mapsto ab$, $b \mapsto bb$ yields the infinite word
abbbbbbbbbbbb ..., which is not fair.
The infinite word $abab^2ab^3ab^4 \ldots$ is an example of a non-periodic infinite word
having just 3 fair prefixes. Indeed, it can easily be seen that the prefix
$abab^2 \ldots ab^r ab^i$, $r \ge 0$, $0 \le i \le r + 1$, contains
$r(r+1)(r+2)/6$ subwords $ba$. Hence it is fair iff this number equals
$(r+1)[r(r+1)/2 + i]/2$, i.e., half the product of the numbers of occurrences of $a$
and $b$, which implies $i = (r - r^2)/6$. Thus $i = 0$ and $r = 0$ or $r = 1$; the only
fair prefixes are $\varepsilon$, $a$, and $aba$.
The unfair morphism $a \mapsto ab$, $b \mapsto ba$ yields the fair infinite word of
Thue-Morse abbabaabbaababba ... (generated as well by iteration of the square of the
same morphism, $a \mapsto abba$, $b \mapsto baab$, which is itself fair).
The morphism $a \mapsto ab$, $b \mapsto a$, which is neither fair nor unfair, yields the
infinite word of Fibonacci, which is fair [11, Theorem 9].
Every infinite word having infinitely many palindromes as prefixes is fair. The fair
morphism $a \mapsto ab^3a^2b$, $b \mapsto ab^3a^2b$ yields a fair periodic infinite word
with just three palindromic prefixes $\varepsilon$, $a$, and $ab^3a$, since none of its
prefixes ends with $a^2b^3a$.
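Several claims of Example 36 can be observed on finite prefixes. The sketch below iterates a morphism given as a dictionary and lists the lengths of the fair prefixes of a finite word by updating the p-matrix one letter at a time. The helper names are ours, and a finite prefix can of course only illustrate, not prove, a statement about an infinite word.

```python
def iterate(h, start, times):
    """Apply the morphism h (a dict symbol -> word) `times` times to `start`."""
    word = start
    for _ in range(times):
        word = "".join(h[c] for c in word)
    return word

def fair_prefix_lengths(word, alphabet):
    """Lengths n such that the prefix of length n has a symmetric p-matrix."""
    idx = {s: i for i, s in enumerate(alphabet)}
    k = len(alphabet)
    m = [[0] * k for _ in range(k)]
    lengths = [0]  # the empty prefix is fair
    for n, ch in enumerate(word, start=1):
        s = idx[ch]
        for i in range(k):
            if i != s:
                m[i][s] += m[i][i]
        m[s][s] += 1
        if all(m[i][j] == m[j][i] for i in range(k) for j in range(i)):
            lengths.append(n)
    return lengths

def palindromic_prefix_lengths(word):
    return [n for n in range(len(word) + 1) if word[:n] == word[:n][::-1]]

fair_word = iterate({"a": "abba", "b": "bab"}, "a", 3)
print(fair_word[:24])  # abbababbababbabababbabab

growing = "".join("a" + "b" * j for j in range(1, 30))  # a b a b^2 a b^3 ...
print(fair_prefix_lengths(growing, "ab"))  # [0, 1, 3]

thue_morse = iterate({"a": "ab", "b": "ba"}, "a", 4)
print(thue_morse, fair_prefix_lengths(thue_morse, "ab"))

periodic = "abbbaab" * 6
print(palindromic_prefix_lengths(periodic))  # [0, 1, 5]
```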

4. Conclusion

We presented just some basic properties of p-matrices and fair words. More questions
remain open than answered. Is the necessary condition from Corollary 12 sufficient
as well? Knowing explicit formulas for F (k, n) for larger values of n could lead to
proving or disproving Conjecture 34. Answering Problem 23 could be one step towards
a general formula for F (k, n). Fair infinite words may be studied in the framework of
more general investigations suggested in [11].

References

[1] Ö. Egecioglu, O. H. Ibarra, A Matrix q-Analogue of the Parikh Map. In:
J.-J. Lévy, E. W. Mayr, J. C. Mitchell (eds.), IFIP TCS . Kluwer, 2004,
125–138.
[2] S. Fossé, G. Richomme, Some characterizations of Parikh matrix equivalent
binary words. Inf. Process. Lett. 92 (2004) 2, 77–82.
[3] M. Lothaire, Combinatorics on words. Cambridge University Press, 1997.
[4] A. Mateescu, Algebraic Aspects of Parikh Matrices. In: J. Karhumäki,
H. A. Maurer, G. Paun, G. Rozenberg (eds.), Theory Is Forever . Lecture
Notes in Computer Science 3113, Springer, 2004, 170–180.
[5] A. Mateescu, A. Salomaa, Matrix Indicators For Subword Occurrences And
Ambiguity. Int. J. Found. Comput. Sci. 15 (2004) 2, 277–292.
[6] A. Mateescu, A. Salomaa, K. Salomaa, S. Yu, A sharpening of the Parikh
mapping. ITA 35 (2001) 6, 551–564.
[7] A. Mateescu, A. Salomaa, S. Yu, Subword histories and Parikh matrices. J.
Comput. Syst. Sci. 68 (2004) 1, 1–21.
[8] R. J. Parikh, On Context-Free Languages. J. ACM 13 (1966) 4, 570–581.
[9] A. Salomaa, Connections between subwords and certain matrix mappings.
Theor. Comput. Sci. 340 (2005) 1, 188–203.
[10] A. Salomaa, On the injectivity of the Parikh matrix mappings. Fundam. Inf.
64 (2005) 1–4, 391–404.
[11] A. Salomaa, Subword Balance in Binary Words, Languages and Sequences.
Fundam. Inf. 75 (2007) 1–4, 469–482.
[12] T.-F. Şerbănuţă, Extending Parikh matrices. Theor. Comput. Sci. 310 (2004)
1–3, 233–246.
[13] L. Vuillon, Balanced words. Bull. Belg. Math. Soc. Simon Stevin 10 (2003) 5,
787–805.

(Received: July 30, 2006; revised: April 22, 2009)
