
On the Determinization of Weighted Automata

Daniel Kirsten∗† Ina Mäurer‡


Technische Universität Dresden Universität Leipzig
Institut für Algebra Institut für Informatik
01062 Dresden, Germany 04109 Leipzig, Germany
[email protected] [email protected]

November 29, 2005

Abstract
In this paper, we generalize an algorithm and some related results by Mohri [25] for deter-
minization of weighted finite automata (WFA) over the tropical semiring. We present the
underlying mathematical concepts of his algorithm in a precise way for arbitrary semirings.
We define a class of semirings in which we can show that the twins property is sufficient for
the termination of the algorithm. We also introduce single-valued WFA and give a partial
correction of a claim by Mohri [25] by showing several characterizations of single-valued
WFA, e.g., the formal power series computed by a single-valued WFA is subsequential iff it
has bounded variation. Also, it is decidable in polynomial time whether a given WFA over
the tropical semiring is single-valued.

1 Introduction
Weighted finite automata are of great theoretical and practical interest in computer science.
They play a crucial role in the structure theory of recognizable languages in free monoids and
trace monoids. However, weighted finite automata also have practical applications in speech
recognition and image compression [6, 9, 14, 17, 18, 25]. The behaviour of a weighted finite
automaton, for short WFA, can be described as a formal power series, i.e., a mapping from a
free monoid into some semiring.
Although WFA were already studied in the seventies [4, 7, 11], there are many recent articles
which focus mainly on two streams: WFA over the min–plus (tropical) and max–plus semirings
and string-to-string transducers [2, 3, 8, 20].
In contrast to unweighted automata, there are WFA which do not admit a subsequential
(deterministic) equivalent. However, Mohri developed an algorithm which determinizes WFA
over the tropical semiring [25] and which is implemented within the AT&T FSM Library™. This
algorithm is not perfect: there are WFA on which Mohri’s algorithm does not terminate
although subsequential equivalents exist. Nevertheless, his algorithm is very successful on
WFA which occur in speech recognition. Mohri proves that in the tropical semiring the twins
property is a sufficient condition for the termination of his algorithm [25].
Mohri develops his ideas for WFA over the tropical semiring. Here we wish to investigate
generalizations of his results to other semirings.

∗ Corresponding author.
† Supported by the research grant KI 822/1–1 of the German Research Community (Deutsche
Forschungsgemeinschaft).
‡ Supported by the GK 334 of the German Research Community (Deutsche Forschungsgemeinschaft)
and the Freistaat Sachsen.

In Section 3, we generalize Mohri’s algorithm to an abstract notion of determinization of
WFA over arbitrary semirings. This determinization utilizes so-called factorizations which
depend on the semiring. We introduce the notion of a maximal factorization and show that in
zero-divisor free semirings, maximal factorizations are optimal in comparison to arbitrary factor-
izations. Moreover, we prove that the twins property is a sufficient condition for the termination
of our abstract determinization for a certain class of semirings which we call commutative min-
semirings. Mohri’s determinization turns out as the particular case of our approach in the
tropical semiring with a maximal factorization. Consequently, we achieve several results by
Mohri as specific cases of our approach, but we also inherit the problem that our approach
does not always terminate even if the given WFA admits a subsequential equivalent.
In Section 4, we deal with unambiguous and single-valued WFA. A WFA is single-valued if
all accepting paths of the same word have the same weight. This is a natural generalization of
unambiguous WFA. By applying Eilenberg’s cross section theorem, we show that every single-
valued WFA admits an unambiguous equivalent. We also discuss a different proof of this result
which relies on a construction due to Schützenberger and Sakarovitch [27]. Similar results
are known for transducers over the tropical semiring and for certain string-to-string transducers
[29, 15, 19, 31]. However, these approaches rely on the cancellativity of the multiplication in the
semiring while our approach just requires idempotency of the semiring addition. We achieve the
following characterization: the behaviour |T | of some single-valued WFA T is subsequential iff
|T | has bounded variation iff Mohri’s algorithm terminates on T iff T has the twins property.
An example shows that the assumption that T is single-valued cannot be omitted. This partly
corrects a claim by Mohri [25]¹. Finally, we use techniques from [1, 3] to show that it is
decidable in polynomial time whether a given WFA over the tropical semiring is single-valued.

2 Prerequisites
2.1 Basic Definitions and Notations
A semiring (K, ⊕, ⊗, 0, 1) consists of a set K together with two binary operations ⊕ and ⊗ such
that (K, ⊕, 0) is a commutative monoid, (K, ⊗, 1) is a monoid whose operation ⊗ distributes over ⊕,
and 0 acts as a zero for all elements. Some k ∈ K, k ≠ 0, is called a zero-divisor if there is some
l ∈ K, l ≠ 0, such that k ⊗ l = 0 or l ⊗ k = 0. If K does not have zero-divisors, then K is called
zero-divisor-free. A semiring K is called idempotent if k ⊕ k = k for every k ∈ K.
We call K′ ⊆ K a subsemiring of K if K′ is closed under ⊕ and ⊗ and 0, 1 ∈ K′. For every
K′ ⊆ K, we call the closure of the set K′ ∪ {0, 1} under ⊕ and ⊗ the subsemiring generated by K′
and denote it by ⟨K′⟩. If ⟨K′⟩ is finite for every finite set K′ ⊆ K, then K is called locally finite [10].
Important examples of locally finite semirings are the semirings of the form (K, min, max, 0, 1)
where min and max are defined by some total order on a set K.
By (N, +, ·, 0, 1) we denote the semiring over N = {0, 1, . . . } with addition and multiplication of
natural numbers. A frequently used semiring is the tropical semiring 𝕋 = (R+ ∪ {∞}, min, +, ∞, 0).
We denote by (B, ∨, ∧, false, true) the Boolean semiring. It consists of the set B = {true, false}
with logical disjunction and conjunction.
Let ∆ be some finite alphabet with a total order ≤. We extend ≤ to ∆∗ as follows. Firstly,
for u, v ∈ ∆∗, we set u ≤ v if u is shorter than v. Secondly, for a, b ∈ ∆ and u, v, w ∈ ∆∗
satisfying a < b and |u| = |v|, we set wau ≤ wbv. By defining an operation min on the free
monoid ∆∗ by ≤, we obtain a semiring (∆∗ ∪ {0}, min, ·, 0, ε), where 0 denotes a new maximal
element.
Mappings from a free monoid to some semiring K are called formal power series. For a
semiring K and a finite set Q, we denote by MQ×Q (K) the set of all Q×Q-matrices over K. We
denote by M1×Q (K) and MQ×1 (K) the set of all row resp. column matrices. For a row matrix
λ ∈ M1×Q (K) (correspondingly for column matrices), we write λ[q] for the q-th entry of λ. For
a ∈ K, we abbreviate the row matrix (a, . . . , a) ∈ M1×Q (K) by aQ. We define the multiplication of
matrices by the operations of K, as usual. We use ⇢ for partial and → for total mappings. As
usual, we define the sum ⊕ over the empty set ∅ as the zero 0 of K.

¹ This claim is corrected in the electronic version of [25] on Mohri’s homepage.

2.2 Weighted finite automata


We recall some notions on weighted automata and recommend [4, 5, 22, 26, 28] for overviews.
Let K be an arbitrary semiring and let Σ be an alphabet.
A weighted finite automaton (for short WFA) over K is a tuple [Q, E, λ, ϱ], where

• Q is a non-empty, finite set of states,

• E is a finite subset of Q × Σ × K × Q, and

• λ ∈ M1×Q (K), ϱ ∈ MQ×1 (K).

We call the tuples in E transitions.


There are two equivalent ways to define the behaviour of WFA. At first, we define the
behaviour using matrices. Let T = [Q, E, λ, ϱ] be a WFA. The set of transitions E defines a
homomorphism θ : Σ∗ → MQ×Q (K) in the following way: for every a ∈ Σ, let θ(a) be the matrix
in MQ×Q (K) such that for every p, q ∈ Q, we have

$$\theta(a)[p, q] := \bigoplus_{(p,a,k,q) \in E} k.$$

This mapping θ : Σ → MQ×Q (K) induces a unique homomorphism θ : Σ∗ → MQ×Q (K).

The WFA T computes a formal power series |T| : Σ∗ → K by letting

|T|(w) := λ ⊗ θ(w) ⊗ ϱ

for every w ∈ Σ∗. We call two WFA T and T′ equivalent if |T| = |T′|, i.e., if |T|(w) = |T′|(w)
for every w ∈ Σ∗. If some formal power series f : Σ∗ → K is computed by a WFA, then we call
f recognizable.
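
The matrix definition of the behaviour can be sketched in a few lines of Python. The sketch below fixes the tropical semiring (min as ⊕, + as ⊗) and represents states by indices 0, 1, ...; the two-state automaton and the names `vec_mat` and `behaviour` are illustrative assumptions, not taken from the paper.

```python
import math

INF = math.inf  # the zero of the tropical semiring (min, +)

def vec_mat(u, m):
    """Row vector times matrix in the tropical semiring: entry q is min over p of u[p] + m[p][q]."""
    return tuple(min(u[p] + m[p][q] for p in range(len(u))) for q in range(len(m[0])))

def behaviour(lam, theta, rho, word):
    """Compute |T|(w) = lam (x) theta(w) (x) rho by multiplying letter by letter."""
    u = tuple(lam)
    for a in word:
        u = vec_mat(u, theta[a])
    return min(u[q] + rho[q] for q in range(len(u)))

# Hypothetical WFA with two states: each state carries its own loops.
lam = (0, 0)                      # both states are initial with weight 0
rho = (0, 0)                      # both states are accepting with weight 0
theta = {
    "a": [[1, INF], [INF, 2]],    # a-loops with weights 1 and 2
    "b": [[3, INF], [INF, 1]],    # b-loops with weights 3 and 1
}

print(behaviour(lam, theta, rho, "ab"))  # prints 3, the cheaper of 1 + 3 and 2 + 1
```

Multiplying the row vector from the left keeps the cost linear in the length of the word, since θ(w) itself is never built.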
Now we give an equivalent method to define the behaviour of a WFA. Again let T =
[Q, E, λ, ϱ] be a WFA. We can regard λ and ϱ as mappings λ, ϱ : Q → K such that for every
q ∈ Q, we have λ(q) = λ[q] and ϱ(q) = ϱ[q].
Let n ≥ 1. A path π of length n is a sequence

(q0 , a0 , k0 , q1 ) (q1 , a1 , k1 , q2 ) . . . (qn−1 , an−1 , kn−1 , qn )

of transitions in E. The word a0 . . . an−1 is called the label of π. We say that π starts at q0 and
ends at qn. We define σ(π) := k0 ⊗ k1 ⊗ · · · ⊗ kn−1, the weight of π. We assume that for every
q ∈ Q there is a path of length 0 which starts and ends at q, is labeled with ε and weighted with
1. For every p, q ∈ Q and every w ∈ Σ∗, we denote by $p \overset{w}{\leadsto} q$ the set of all paths with label w
which start at p and end at q. Then for every w ∈ Σ∗ one can show

$$|T|(w) = \bigoplus_{p,q \in Q,\ \pi \in p \overset{w}{\leadsto} q} \lambda(p) \otimes \sigma(\pi) \otimes \varrho(q) = \bigoplus_{p,q \in Q} \lambda(p) \otimes \Bigl( \bigoplus_{\pi \in p \overset{w}{\leadsto} q} \sigma(\pi) \Bigr) \otimes \varrho(q).$$

For every q ∈ Q, we call λ(q) resp. ϱ(q) the initial weight resp. terminal weight of q. Let
I := { q ∈ Q | λ(q) ≠ 0 } and F := { q ∈ Q | ϱ(q) ≠ 0 }. We call the states in I resp. F the initial
states resp. accepting states of T. Let π = (q0 , a0 , k0 , q1 ) (q1 , a1 , k1 , q2 ) . . . (qn−1 , an−1 , kn−1 , qn )
be a path. Then π is successful if q0 ∈ I and qn ∈ F. For every 0 ≤ i ≤ j ≤ n, we let

π(i, j) := (qi , ai , ki , qi+1 ) . . . (qj−1 , aj−1 , kj−1 , qj )

be the subpath of π from qi to qj. If π and π′ are paths such that π′ starts at the same state
where π ends, then we denote by ππ′ the concatenation of π and π′. For every p, q, r ∈ Q and
u, v ∈ Σ∗, we write the concatenation of $p \overset{u}{\leadsto} q$ and $q \overset{v}{\leadsto} r$ as
$p \overset{u}{\leadsto} q \overset{v}{\leadsto} r$. For subsets P, R ⊆ Q,
we denote by $P \overset{w}{\leadsto} R$ the union of the sets $p \overset{w}{\leadsto} r$ for every p ∈ P, r ∈ R.
A subsequential WFA is a tuple T = [Q, δ, σ, q0 , k0 , ϱ] such that:
• Q is a finite set of states,

• δ : Q × Σ ⇢ Q and σ : Q × Σ ⇢ K are partial mappings such that for every q ∈ Q,
a ∈ Σ : δ(q, a) is defined iff σ(q, a) is defined,

• q0 ∈ Q, k0 ∈ K,

• ϱ : Q → K is a mapping.

We extend δ and σ to words w ∈ Σ∗ as follows: for every q ∈ Q, we set δ(q, ε) := q and
σ(q, ε) := 1. For every q ∈ Q, w ∈ Σ∗, and a ∈ Σ, we set δ(q, wa) := δ(δ(q, w), a) and
σ(q, wa) := σ(q, w) ⊗ σ(δ(q, w), a) provided that δ(q, w) and δ(δ(q, w), a) are defined.
The formal power series recognized by T is defined for every w ∈ Σ∗ by

$$|T|(w) := \begin{cases} k_0 \otimes \sigma(q_0, w) \otimes \varrho(\delta(q_0, w)), & \text{if } \delta(q_0, w) \text{ is defined,} \\ 0, & \text{otherwise.} \end{cases}$$

A subsequential formal power series is a formal power series which is recognized by a subsequen-
tial WFA. It is easy to transform a subsequential WFA T into a WFA computing |T |.
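
As a minimal sketch of this definition, the following Python function evaluates a subsequential WFA on a word. The semiring is passed in through a multiplication `times` and a zero element `zero`; the dictionary encoding of δ and σ and the small tropical example are assumptions made only for illustration.

```python
import math

def run_subsequential(delta, sigma, q0, k0, rho, word, times, zero):
    """Return k0 (x) sigma(q0, w) (x) rho(delta(q0, w)), or zero if delta(q0, w) is undefined."""
    q, weight = q0, k0
    for a in word:
        if (q, a) not in delta:
            return zero                         # the run is blocked
        weight = times(weight, sigma[(q, a)])   # multiply from the right, in reading order
        q = delta[(q, a)]
    return times(weight, rho[q])

# Hypothetical subsequential WFA over the tropical semiring (times is +, zero is infinity).
delta = {("s", "a"): "s", ("s", "b"): "t"}
sigma = {("s", "a"): 1, ("s", "b"): 2}
rho = {"s": 0, "t": 5}
add = lambda x, y: x + y

print(run_subsequential(delta, sigma, "s", 3, rho, "aab", add, math.inf))  # prints 12
print(run_subsequential(delta, sigma, "s", 3, rho, "ba", add, math.inf))   # prints inf
```

Weights are multiplied from the right in reading order, so the sketch stays correct for non-commutative semirings.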
A formal power series from Σ∗ to B can be considered as a subset of Σ∗ or as a formal
language over Σ. In the same way, recognizable formal power series from Σ∗ to B can be
considered as recognizable languages. We call a subsequential WFA over the Boolean semiring
B a deterministic finite automaton.

3 Determinization
In this section, we deal with several approaches to determinize WFA. We explain a straightfor-
ward idea, Mohri’s algorithm, and a generalization of Mohri’s algorithm to arbitrary semirings.
This generalization utilizes so-called factorizations which depend on the semiring. We intro-
duce the notion of a maximal factorization and show that maximal factorizations are in some
sense optimal in comparison to arbitrary factorizations. This result is of practical importance:
whenever one implements the determinization algorithm, one should prefer a maximal factor-
ization. In Section 3.6, we will define min-semirings and show that for min-semirings the twins
property is sufficient for the existence of a determinization of a WFA by our generalization using
a maximal factorization.
Throughout this section, let K be a semiring and let Σ be an alphabet.

3.1 A straightforward idea


Let T = [Q, E, λ, ϱ] be a WFA. At first, we describe a straightforward idea to construct an
equivalent, subsequential WFA T′ = [Q′, δ, σ, q0 , k0 , ϱ′]. We just sketch this method, because it
is a very particular case of a more general approach, which we will discuss below.
The set of states Q′ is the least subset of M1×Q (K) which contains λ, and is closed under
multiplication with matrices θ(a) for every a ∈ Σ. We use λ from T as initial state of T′ and let
k0 := 1. For every letter a ∈ Σ and every u ∈ Q′, we set δ(u, a) := u ⊗ θ(a), σ(u, a) := 1, and
ϱ′(u) := u ⊗ ϱ.
The drawback of this idea is that Q′ might be infinite. If K is finite, or more generally, if K is
locally finite, the construction shows that for every WFA there exists a subsequential equivalent.
In the Boolean semiring, every matrix in M1×Q (B) represents a subset of Q. Now, it is quite
clear that in the Boolean semiring, this idea is just the classical determinization of WFA by a
power set construction.
One can slightly modify this idea by leaving δ(u, a) and σ(u, a) undefined if u ⊗ θ(a) = 0Q.
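
For the Boolean semiring, the idea above can be sketched as follows. Row vectors are tuples of booleans, so each state of the result corresponds to a subset of Q; the function name `determinize_boolean` and the example automaton (words over {a, b} ending in a) are illustrative assumptions.

```python
from collections import deque

def determinize_boolean(lam, theta, rho, alphabet):
    """Close {lam} under u -> u (x) theta(a) in the Boolean semiring (or, and)."""
    def step(u, m):
        return tuple(any(u[p] and m[p][q] for p in range(len(u))) for q in range(len(m[0])))

    start = tuple(lam)
    states, delta = {start}, {}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for a in alphabet:
            v = step(u, theta[a])
            if not any(v):
                continue              # leave delta(u, a) undefined when u (x) theta(a) = 0_Q
            delta[(u, a)] = v
            if v not in states:
                states.add(v)
                queue.append(v)
    final = {u: any(u[q] and rho[q] for q in range(len(u))) for u in states}
    return start, states, delta, final

# Hypothetical nondeterministic automaton: state 0 loops on a and b, and a also leads to state 1.
lam = (True, False)
rho = (False, True)
theta = {
    "a": [[True, True], [False, False]],
    "b": [[True, False], [False, False]],
}
start, states, delta, final = determinize_boolean(lam, theta, rho, "ab")
print(len(states))  # prints 2
```

The loop terminates here because M1×Q (B) is finite; over a semiring that is not locally finite the same closure may never stabilize.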

3.2 Mohri’s algorithm


In [25], Mohri gives an improvement of the above approach. In comparison to the above
approach, Mohri’s algorithm produces WFA with fewer states as we will prove in Section 3.4.
We sketch his approach for the tropical semiring 𝕋 = (R+ ∪ {∞}, min, +, ∞, 0). Let T =
[Q, E, λ, ϱ] be a WFA over 𝕋. As before, we want to construct an equivalent, subsequential
WFA T′ = [Q′, δ, σ, q0 , k0 , ϱ′]. Again, the states Q′ are a subset of M1×Q (𝕋).
For every u ∈ M1×Q (𝕋), let min(u) := min_{q∈Q} u[q]. There is a matrix u′ such that u =
min(u) + u′. If u ≠ ∞Q, then u′ is uniquely determined and in this case, we denote u′ in a rather
sloppy way by − min(u) + u.
We construct T′. We may assume λ ≠ ∞Q since otherwise, a determinization of T is obvious.
Let u ∈ M1×Q (𝕋) and a ∈ Σ with u + θ(a) ≠ ∞Q, where u + θ(a) denotes the product of u and
θ(a) in 𝕋, i.e., (u + θ(a))[q] = min_{p∈Q} (u[p] + θ(a)[p, q]). We abbreviate u + θ(a) by v. If the WFA
is in state u and reads the letter a, then it does not change the state to v. It rather factorizes v
into min(v) and − min(v) + v, and changes the inner state to − min(v) + v. Hence, the transition
should be weighted with min(v). For every u ∈ M1×Q (𝕋) and every a ∈ Σ with u + θ(a) ≠ ∞Q,
we define
• δ(u, a) := − min(u + θ(a)) + (u + θ(a)) and
• σ(u, a) := min(u + θ(a)).
If u + θ(a) = ∞Q, then δ(u, a) and σ(u, a) remain undefined. We define k0 := min(λ) and
q0 := − min(λ) + λ. As above, we set ϱ′(u) := u + ϱ.
We define the set of states Q′ as the least subset of M1×Q (𝕋) which contains q0 and is closed
under δ, i.e., for every u ∈ Q′ and every a ∈ Σ, we have δ(u, a) ∈ Q′ if δ(u, a) is defined.
The set Q′ is not necessarily finite, even if there is some subsequential WFA which is equivalent
to T. If Q′ is finite, then we define T′ = [Q′, δ|Q′×Σ , σ|Q′×Σ , q0 , k0 , ϱ′|Q′ ].
In [25], Mohri gives an algorithm which computes the WFA T′. This algorithm terminates
iff Q′ is finite. Besides other results, he shows that T′ is indeed equivalent to T, provided that
Q′ is finite. We continue the examination of Mohri’s approach in Section 4.2.
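
The construction of this section can be sketched in Python as a worklist loop. This is a sketch of the definitions above, not a reproduction of the algorithm text in [25]. Since the state set may be infinite, the sketch gives up after `max_states` states; this cap, the function names and both example automata are assumptions added for illustration.

```python
import math
from collections import deque

INF = math.inf

def mohri_determinize(lam, theta, rho, alphabet, max_states=1000):
    """Explore the states reachable from -min(lam) + lam under delta (tropical semiring)."""
    def vec_mat(u, m):
        return tuple(min(u[p] + m[p][q] for p in range(len(u))) for q in range(len(m[0])))

    def factorize(v):
        k = min(v)
        return k, tuple(x - k for x in v)      # (min(v), -min(v) + v); inf entries stay inf

    k0, q0 = factorize(tuple(lam))             # assumes lam is not the all-infinity vector
    states, delta, sigma = {q0}, {}, {}
    queue = deque([q0])
    while queue:
        u = queue.popleft()
        for a in alphabet:
            v = vec_mat(u, theta[a])
            if min(v) == INF:
                continue                       # delta(u, a) and sigma(u, a) stay undefined
            k, r = factorize(v)
            delta[(u, a)], sigma[(u, a)] = r, k
            if r not in states:
                if len(states) >= max_states:
                    raise RuntimeError("state limit reached; the state set may be infinite")
                states.add(r)
                queue.append(r)
    rho_det = {u: min(u[q] + rho[q] for q in range(len(u))) for u in states}
    return q0, k0, states, delta, sigma, rho_det

# Hypothetical WFA whose two a-loops have equal weight: the construction stops after two states.
theta_ok = {"a": [[1, 3], [INF, 1]]}
q0, k0, states, delta, sigma, rho_det = mohri_determinize((0, INF), theta_ok, (INF, 0), "a")
print(sorted(states))  # prints [(0, 2), (0, inf)]

# Hypothetical WFA whose two a-loops have different weights: the states (0, n) never repeat.
theta_bad = {"a": [[1, INF], [INF, 2]]}
try:
    mohri_determinize((0, 0), theta_bad, (0, 0), "a", max_states=50)
except RuntimeError as err:
    print(err)
```

In the first example every word aⁿ with n ≥ 1 has weight n + 2 in both automata; the second example only illustrates that the set of reachable states can grow without bound.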

3.3 A generalization of Mohri’s algorithm


If one tries to generalize Mohri’s algorithm to arbitrary semirings, then one encounters the
problem of adapting the factorization of matrices v ∈ M1×Q (𝕋) into min(v) and − min(v) + v
to arbitrary semirings. Owing to the great diversity of semirings, we have to choose a rather abstract
method.
Let K be an arbitrary semiring and Q be a nonempty, finite set. We call two mappings
f : M1×Q (K) \ {0Q } → M1×Q (K) and g : M1×Q (K) \ {0Q } → K a factorization of dimension Q
if for every u ∈ M1×Q (K) \ {0Q } we have
u = g(u) ⊗ f(u).

Let (f, g) be a factorization. For every u ∈ M1×Q (K) \ {0Q }, we have f(u) ≠ 0Q and g(u) ≠ 0.
If for every u ∈ M1×Q (K) \ {0Q } we have f(u) = u and g(u) = 1, then we call (f, g) the trivial
factorization. Every semiring K admits a factorization, namely the trivial factorization.
Let T = [Q, E, λ, ϱ] be a WFA over K, and let (f, g) be a factorization of dimension Q. We
may assume that λ ≠ 0Q. We define:
• δ̃ : M1×Q (K) × Σ ⇢ M1×Q (K),
δ̃(u, a) := f(u ⊗ θ(a)), for every u ∈ M1×Q (K), a ∈ Σ with u ⊗ θ(a) ≠ 0Q,

• σ̃ : M1×Q (K) × Σ ⇢ K,
σ̃(u, a) := g(u ⊗ θ(a)), for every u ∈ M1×Q (K), a ∈ Σ with u ⊗ θ(a) ≠ 0Q, and

• ϱ̃ : M1×Q (K) → K,
ϱ̃(u) := u ⊗ ϱ, for every u ∈ M1×Q (K).
Let Q̃ be the least subset of M1×Q (K) which contains f(λ) and is closed under δ̃, i.e., for every
u ∈ Q̃ and a ∈ Σ, we have δ̃(u, a) ∈ Q̃ provided that δ̃(u, a) is defined. If Q̃ is finite, then we
call

T̃ := [ Q̃, δ̃|Q̃×Σ , σ̃|Q̃×Σ , f(λ), g(λ), ϱ̃|Q̃ ]

the determinization of T by (f, g). If Q̃ is infinite, then the determinization of T by (f, g) is not
defined. To avoid some inconvenient technical details, we say that the determinization of T by
(f, g) is not defined if λ = 0Q.
Let us mention that Mohri defines his algorithm using the factorization

$$g(u) := \bigoplus_{q \in Q} u[q] \quad\text{and}\quad f(u) := g(u)^{-1} \otimes u$$

for every u ∈ M1×Q (K) \ {0Q } (cf. [25, p. 285]). The main problem is that it is left open how
to interpret the power −1 in general semirings. For example, consider the semiring (N, +, ·, 0, 1),
let Q = {q1 , q2 } and u = (4, 10), i.e., g(u) = 14. Regardless of how we interpret 14⁻¹, we cannot
define f(u) in a way that the key property g(u) · f(u) = u, i.e., 14 · f(u) = (4, 10) is satisfied.
If (f, g) is the trivial factorization, then the determinization of T by (f, g) is the WFA which
we obtain by the idea sketched in Section 3.1. If K is locally finite then the determinization of
T with respect to the trivial factorization is defined.
If K is the tropical semiring 𝕋, and we set g(u) = min(u) and f(u) = − min(u) + u for
u ∈ M1×Q (𝕋) \ {∞Q }, then the determinization of T by (f, g) yields the same as in Section 3.2.
Theorem 3.1. Let K be an arbitrary semiring, T = [Q, E, λ, ϱ] be a WFA over K and (f, g)
be a factorization of dimension Q. If the determinization of T by (f, g) is defined, then it is
equivalent to T.
Proof. Let T̃ be the determinization of T by (f, g). We denote the initial state of T̃ by ũ, i.e.,
ũ := f(λ). Let w ∈ Σ∗ be arbitrary. We show the following two assertions by an induction on
the length of w ∈ Σ∗.
1. If δ̃(ũ, w) is defined, then λ ⊗ θ(w) = g(λ) ⊗ σ̃(ũ, w) ⊗ δ̃(ũ, w).

2. If δ̃(ũ, w) is not defined, then λ ⊗ θ(w) = 0Q.

For w = ε, we have δ̃(ũ, ε) = ũ and σ̃(ũ, ε) = 1. Moreover, θ(ε) is the identity matrix. Thus, the
equation in (1) reduces to λ = g(λ) ⊗ ũ. This is true, because (f, g) is a factorization. Further,
(2) is obviously true for w = ε.
Now, let w ∈ Σ∗ satisfy (1) and (2), and let a ∈ Σ be arbitrary.
Case 1: δ̃(ũ, wa) is defined. This implies that σ̃(ũ, wa), δ̃(ũ, w) and σ̃(ũ, w) are defined. We
only have to consider (1), because (2) is obviously true. Applying the definition of δ̃ and σ̃, the fact that
(f, g) is a factorization and the inductive hypothesis, we obtain

g(λ) ⊗ σ̃(ũ, wa) ⊗ δ̃(ũ, wa) = g(λ) ⊗ σ̃(ũ, w) ⊗ σ̃(δ̃(ũ, w), a) ⊗ δ̃(δ̃(ũ, w), a)
= g(λ) ⊗ σ̃(ũ, w) ⊗ g(δ̃(ũ, w) ⊗ θ(a)) ⊗ f(δ̃(ũ, w) ⊗ θ(a))
= g(λ) ⊗ σ̃(ũ, w) ⊗ δ̃(ũ, w) ⊗ θ(a) = λ ⊗ θ(w) ⊗ θ(a) = λ ⊗ θ(wa).

This proves (1).

Case 2: δ̃(ũ, wa) is not defined. We only have to consider (2), because (1) is obviously true.
Case 2.1: δ̃(ũ, w) is defined. It implies that δ̃(δ̃(ũ, w), a) is not defined, i.e., δ̃(ũ, w) ⊗ θ(a) = 0Q.
In combination with the inductive hypothesis, we obtain:

λ ⊗ θ(wa) = λ ⊗ θ(w) ⊗ θ(a) = g(λ) ⊗ σ̃(ũ, w) ⊗ δ̃(ũ, w) ⊗ θ(a) = g(λ) ⊗ σ̃(ũ, w) ⊗ 0Q = 0Q.

Case 2.2: δ̃(ũ, w) is not defined. By the inductive hypothesis, we have λ ⊗ θ(w) = 0Q. Thus,
we conclude
λ ⊗ θ(wa) = λ ⊗ θ(w) ⊗ θ(a) = 0Q ⊗ θ(a) = 0Q.
Now, it is easy to show that T and T̃ are equivalent. Let w ∈ Σ∗ be arbitrary. If δ̃(ũ, w) is not
defined, then |T̃|(w) = 0. By (2), we have λ ⊗ θ(w) = 0Q, and thus, |T|(w) = 0.
If δ̃(ũ, w) is defined, then by (1) we obtain

|T̃|(w) = g(λ) ⊗ σ̃(ũ, w) ⊗ ϱ̃(δ̃(ũ, w)) = g(λ) ⊗ σ̃(ũ, w) ⊗ δ̃(ũ, w) ⊗ ϱ = λ ⊗ θ(w) ⊗ ϱ = |T|(w).

3.4 Maximal factorizations


The existence of the determinization of some WFA T depends on the choice of f and g. Thus,
we examine factorizations. In particular, we are interested in knowing which factorizations are
more suitable than others.
We call a factorization (f, g) maximal if for every u ∈ M1×Q (K) and every k ∈ K with
k ⊗ u ≠ 0Q we have f(u) = f(k ⊗ u). The restriction “k ⊗ u ≠ 0Q” is not just due to the
fact that f(0Q ) is not defined. Even if we modify the notion of a factorization in a way that
f(0Q ) and g(0Q ) are defined, it is definitely necessary to keep the restriction “k ⊗ u ≠ 0Q” in
the notion of a maximal factorization. Otherwise, we could show for every u ∈ M1×Q (K) that
f(u) = f(0 ⊗ u) = f(0Q ) and the notion of a maximal factorization becomes meaningless.
If (f, g) is a maximal factorization, then for every u ∈ M1×Q (K) \ {0Q } we have f(f(u)) =
f(g(u) ⊗ f(u)) = f(u).
We have the following lemma for maximal factorizations:

Lemma 3.2. Let K be a semiring, T = [Q, E, λ, ϱ] be a WFA and (f, g) be a maximal
factorization. Let u ∈ M1×Q (K) and w ∈ Σ+ be a non-empty word satisfying u ⊗ θ(w) ≠ 0Q. Then,
δ̃(u, w) is defined and we have δ̃(u, w) = f(u ⊗ θ(w)) = δ̃(f(u), w).

Proof. At first, we show by an induction on the length of w that δ̃(u, w) is defined and the first
equation is true. If w is a letter, then this equation is the definition of δ̃. So let w ∈ Σ+ and
a ∈ Σ satisfy u ⊗ θ(wa) ≠ 0Q, i.e., u ⊗ θ(w) ⊗ θ(a) ≠ 0Q. By induction, δ̃(u, w) is defined and
the first equation is true for w. Consequently,

δ̃(u, wa) = δ̃(δ̃(u, w), a) = δ̃(f(u ⊗ θ(w)), a) = f(f(u ⊗ θ(w)) ⊗ θ(a)).

Because (f, g) is maximal, we can multiply the argument of the outermost occurrence of f by
g(u ⊗ θ(w)) and continue:

= f(g(u ⊗ θ(w)) ⊗ f(u ⊗ θ(w)) ⊗ θ(a)) = f(u ⊗ θ(w) ⊗ θ(a)) = f(u ⊗ θ(wa)).

We show the second equation. Because u ⊗ θ(w) = g(u) ⊗ f(u) ⊗ θ(w) and u ⊗ θ(w) ≠ 0Q, we
have f(u) ⊗ θ(w) ≠ 0Q, i.e., δ̃(f(u), w) is defined. By the first equation,

δ̃(u, w) = f(u ⊗ θ(w)) = f(g(u) ⊗ f(u) ⊗ θ(w)) = f(f(u) ⊗ θ(w)) = δ̃(f(u), w).

If we consider Lemma 3.2 for w = ε, we see δ̃(u, ε) = u but f(u ⊗ θ(ε)) = δ̃(f(u), ε) = f(u).
Hence, both equations in Lemma 3.2 hold for w = ε iff u = f(u).
The following theorem shows that maximal factorizations are optimal in comparison to other
factorizations if the semiring K is zero-divisor free, or more generally, if for every u ∈ M1×Q (K) \
{0Q }, g(u) is not a zero-divisor.
Theorem 3.3. Let T = [Q, E, λ, ϱ] be a WFA over K satisfying λ ≠ 0Q. Let (f, g) be a
maximal factorization of dimension Q and assume that for every u ∈ M1×Q (K) \ {0Q }, g(u) is
not a zero-divisor. Let (f′, g′) be an arbitrary factorization of dimension Q.
1. If the determinization of T by (f′, g′) exists, then the determinization of T by (f, g) exists.

2. Assume that the determinization of T by (f′, g′) exists, and let Q′ be the set of its states.
Let Q̃ be the set of states of the determinization of T by (f, g).
We have Q̃ = f(Q′), and thus, |Q̃| ≤ |Q′|.
Proof. At first, note that for every u ∈ M1×Q (K) \ {0Q } we have g′(u) ⊗ f′(u) = u = g(u) ⊗ f(u)
and since (f, g) is maximal, f(f′(u)) = f(u) = f(f(u)). Consider the following mappings:
• δ′ : M1×Q (K) × Σ ⇢ M1×Q (K),
δ′(u, a) := f′(u ⊗ θ(a)), for every u ∈ M1×Q (K), a ∈ Σ satisfying u ⊗ θ(a) ≠ 0Q, and

• δ̃ : M1×Q (K) × Σ ⇢ M1×Q (K),
δ̃(u, a) := f(u ⊗ θ(a)), for every u ∈ M1×Q (K), a ∈ Σ satisfying u ⊗ θ(a) ≠ 0Q.

Let Q′ be the least subset of M1×Q (K) which contains f′(λ) and is closed under δ′, i.e., for every
u ∈ Q′ and every a ∈ Σ, we have δ′(u, a) ∈ Q′ provided that δ′(u, a) is defined. Let Q̃ be the
least subset of M1×Q (K) which contains f(λ) and is closed under δ̃ in the same sense. To prove
(1) and (2), we show Q̃ = f(Q′). As mentioned above, we have f(f′(λ)) = f(λ). To show
Q̃ = f(Q′), we derive the following two assertions for every u ∈ M1×Q (K) \ {0Q } and every
a ∈ Σ.
a. δ′(u, a) is defined iff δ̃(f(u), a) is defined.

b. If δ′(u, a) is defined, then we have δ̃(f(u), a) = f(δ′(u, a)).

We have u ⊗ θ(a) = g(u) ⊗ f(u) ⊗ θ(a). Because g(u) ≠ 0, and g(u) is not a zero-divisor, we
have u ⊗ θ(a) = 0Q iff f(u) ⊗ θ(a) = 0Q. This proves (a).
Now, assume that δ′(u, a) and δ̃(f(u), a) are defined, i.e., u ⊗ θ(a) ≠ 0Q. We show:

f(δ′(u, a)) = f(f′(u ⊗ θ(a))) = f(u ⊗ θ(a)) = f(g(u) ⊗ f(u) ⊗ θ(a))
= f(f(u) ⊗ θ(a)) = δ̃(f(u), a).

If both (f, g) and (f′, g′) are maximal factorizations, then f and f′ are mutually inverse
“bijections” between the determinizations of T by (f, g) and by (f′, g′). Hence, all determinizations
of some WFA T by maximal factorizations yield the same subsequential WFA up to renaming
the states.
Note that by Theorem 3.3 the determinization by some maximal factorization is optimal
among determinizations by arbitrary factorizations. It is possible that the determinization of
a WFA T by some maximal factorization (f, g) does not exist, even if there is an equivalent
subsequential WFA. If the determinization of T by some maximal factorization exists, then it is
not necessarily the smallest subsequential WFA equivalent to T. It is quite possible that there
are better approaches to determinize WFA which are entirely different from our generalization of
Mohri’s approach. For example, in the semiring (Z ∪ {∞}, min, +, ∞, 0), where Z denotes the
set of integers, an algorithm can enumerate an infinite list of all subsequential WFA and check
whether there is an equivalent WFA in the list.² This algorithm is far from being practical, but
it terminates iff an equivalent WFA exists.

3.5 Examples of maximal factorizations


In the previous section, we have seen that maximal factorizations are optimal in some sense,
and thus, maximal factorizations are of practical interest. In this section, we present maximal
factorizations for several semirings. Let Q be a nonempty, finite set.
In the Boolean semiring B, for every factorization (f, g) and every u ∈ M1×Q (B) \ {0Q },
we have f (u) = u and g(u) = 1. Hence, there is only the trivial factorization. The trivial
factorization in the Boolean semiring is a maximal factorization.
We consider the semiring (N, +, ·, 0, 1). For every u ∈ M1×Q (N)\{0Q }, let g(u) be the greatest
common divisor of the entries of u. There is a mapping f : M1×Q (N) \ {0Q } → M1×Q (N) such
that u = g(u) · f (u). Clearly, (f, g) is maximal. For u ∈ M1×Q (N) \ {0Q } and n ≥ 1, we have
g(n · u) = n · g(u) and f (n · u) = f (u). A generalization to the semiring (Z, +, ·, 0, 1) is obvious.
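
A small Python sketch can illustrate the gcd factorization over (N, +, ·, 0, 1) and how the choice of factorization affects the state set. The generic worklist loop `determinize`, the state cap `max_states` and the example automaton with the doubling matrix are assumptions introduced for illustration.

```python
from collections import deque
from functools import reduce
from math import gcd

def gcd_factorization(u):
    """Return (f(u), g(u)) with g(u) the gcd of the entries of u; u must be non-zero."""
    g = reduce(gcd, u)
    return tuple(x // g for x in u), g

def trivial_factorization(u):
    return u, 1

def vec_mat(u, m):
    """Row vector times matrix over the natural numbers."""
    return tuple(sum(u[p] * m[p][q] for p in range(len(u))) for q in range(len(m[0])))

def determinize(lam, theta, alphabet, factorize, max_states=1000):
    """Collect the states reachable from f(lam); give up once max_states is reached."""
    start, k0 = factorize(tuple(lam))
    states, delta, sigma = {start}, {}, {}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for a in alphabet:
            v = vec_mat(u, theta[a])
            if not any(v):
                continue                      # u (x) theta(a) = 0_Q: transition undefined
            r, k = factorize(v)
            delta[(u, a)], sigma[(u, a)] = r, k
            if r not in states:
                if len(states) >= max_states:
                    raise RuntimeError("state limit reached; the state set may be infinite")
                states.add(r)
                queue.append(r)
    return start, k0, states, delta, sigma

print(gcd_factorization((4, 10)))            # prints ((2, 5), 2)

# Hypothetical WFA in which reading a doubles both entries of the current vector.
theta = {"a": [[2, 0], [0, 2]]}
start, k0, states, delta, sigma = determinize((4, 10), theta, "a", gcd_factorization)
print(len(states))                           # prints 1
```

With `trivial_factorization` the same call visits (4, 10), (8, 20), (16, 40) and so on and runs into the state cap, which matches the statement that a maximal factorization never produces more states than an arbitrary one.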
For the tropical semiring 𝕋, Mohri’s factorization g(u) = min(u) and f(u) = − min(u) + u
for u ∈ M1×Q (𝕋) \ {∞Q }, as explained in Section 3.2, is maximal. We call this factorization
Mohri’s factorization.
Now, let K be some semiring such that (K \ {0}, ⊗) is a group. We may assume that
Q = {q1 , . . . , q|Q| }. Let u ∈ M1×Q (K) \ {0Q } and i be the smallest integer with u[qi ] ≠ 0.
We set g(u) := u[qi ] and f(u) := g(u)⁻¹ ⊗ u. This pair of mappings (f, g) obviously forms a
factorization. Let u ∈ M1×Q (K) \ {0Q } and k ∈ K \ {0} be arbitrary. The least integer i such
that u[qi ] ≠ 0 is also the least integer i such that (k ⊗ u)[qi ] ≠ 0. We have (k ⊗ u)[qi ] = k ⊗ u[qi ],
i.e., g(k ⊗ u) = k ⊗ g(u), and obtain

f(k ⊗ u) = g(k ⊗ u)⁻¹ ⊗ k ⊗ u = (k ⊗ g(u))⁻¹ ⊗ k ⊗ u = g(u)⁻¹ ⊗ k⁻¹ ⊗ k ⊗ u = f(u).

Hence, (f, g) is a maximal factorization.

Now, consider the string semiring (∆∗ ∪ {0}, min, ·, 0, ε) over an alphabet ∆ with some
ordering ≤. For every u ∈ M1×Q (∆∗ ∪ {0}) \ {0Q }, let g(u) be the longest common prefix of the non-zero
entries of u. There is a unique “residual” mapping f : M1×Q (∆∗ ) \ {0Q } → M1×Q (∆∗ ) such that
(f, g) is a factorization. This factorization is maximal.
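
The longest-common-prefix factorization can be sketched as follows. Python strings stand in for words over ∆, and `None` is a hypothetical stand-in for the zero element 0 of the string semiring; both encodings are assumptions made for this illustration.

```python
import os

ZERO = None  # hypothetical encoding of the zero element of the string semiring

def lcp_factorization(u):
    """Return (f(u), g(u)): g(u) is the longest common prefix of the non-zero entries of u."""
    words = [x for x in u if x is not ZERO]        # non-empty because u is not 0_Q
    g = os.path.commonprefix(words)                # character-wise common prefix
    f = tuple(ZERO if x is ZERO else x[len(g):] for x in u)
    return f, g

def scale(k, u):
    """Left multiplication k (x) u: concatenate k in front of every non-zero entry."""
    return tuple(ZERO if x is ZERO else k + x for x in u)

u = ("abc", "abd", ZERO)
print(lcp_factorization(u))                        # prints (('c', 'd', None), 'ab')
print(lcp_factorization(scale("x", u))[0])         # prints ('c', 'd', None)
```

The second call shows the defining property of a maximal factorization on one instance: prefixing every entry with the same word changes g but leaves the residual f unchanged.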
² The equivalence between an arbitrary and a subsequential WFA over (Z ∪ {∞}, min, +, ∞, 0) is decidable by
[21, Prop. 5.3] since every subsequential WFA over (Z ∪ {∞}, min, +, ∞, 0) is also a subsequential WFA over
(Z ∪ {−∞}, max, +, −∞, 0).

Next we present a semiring which does not admit a maximal factorization. Let K0 be the set
of all natural numbers which admit a factorization into an even number of primes, e.g., 4 = 2 · 2
and 126 = 2·3·3·7 belong to K0, but 2 and 18 = 2·3·3 do not belong to K0. Let K = K0 ∪ {1, ∞}
and consider the semiring (K, min, ·, ∞, 1) where min is defined by the usual ordering of natural
numbers, · is the usual multiplication of natural numbers and ∞ denotes a new maximal element.
Let Q = {1, 2}. By contradiction, assume that K admits a maximal factorization (f, g) of
dimension Q. Let u = (2 · 3 · 5 · 7, 2 · 3 · 5 · 11). Since 2 · 3 · (5 · 7, 5 · 11) = u = 2 · 5 · (3 · 7, 3 · 11), we
have f((5 · 7, 5 · 11)) = f(u) = f((3 · 7, 3 · 11)). However, g((5 · 7, 5 · 11)) is a common divisor
of 5 · 7 and 5 · 11 in K, i.e., g((5 · 7, 5 · 11)) = 1, and hence, f((5 · 7, 5 · 11)) = (5 · 7, 5 · 11). In
the same way, we can derive f((3 · 7, 3 · 11)) = (3 · 7, 3 · 11) which is a contradiction.
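
The arithmetic behind this counterexample can be checked with a short Python sketch. The helper names are hypothetical, and only the finite elements of K are modelled (the element ∞ is left out).

```python
def prime_factor_count(n):
    """Number of prime factors of n counted with multiplicity (trial division)."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count + (1 if n > 1 else 0)

def in_K(n):
    """Finite elements of K: 1, or a product of an even number of primes."""
    return n == 1 or prime_factor_count(n) % 2 == 0

def common_divisors_in_K(x, y):
    """Elements d of K that divide both x and y."""
    return [d for d in range(1, min(x, y) + 1) if x % d == 0 and y % d == 0 and in_K(d)]

print(in_K(4), in_K(126), in_K(2), in_K(18))        # prints True True False False
print(common_divisors_in_K(5 * 7, 5 * 11))          # prints [1]
print(common_divisors_in_K(3 * 7, 3 * 11))          # prints [1]
print(common_divisors_in_K(2 * 3 * 5 * 7, 2 * 3 * 5 * 11))  # prints [1, 6, 10, 15]
```

The last line shows that u can be split off by 6 or by 10, while neither residual vector admits a further common factor in K, which is the source of the contradiction.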

3.6 A generalization of the twins property


The twins property was introduced by Choffrut in 1977 [7] and studied in various papers,
e.g., [1, 4, 25]. In [25], Mohri proves that the twins property is a sufficient condition for the
termination of his determinization algorithm on WFA over the tropical semiring. We want to
generalize this result to a larger class of semirings.
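Let T = [Q, E, λ, ϱ] be a WFA over some semiring K. Following [25], we say that T has the
twins property if we have for every u, v ∈ Σ∗ and for every p, q ∈ Q

$$I \overset{u}{\leadsto} p \overset{v}{\leadsto} p \neq \emptyset \;\wedge\; I \overset{u}{\leadsto} q \overset{v}{\leadsto} q \neq \emptyset \implies \theta(v)[p, p] = \theta(v)[q, q].$$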

In his proof, Mohri uses several conditions which cannot be applied to arbitrary semirings.
Following [20, 30], some path π = (q0 , a0 , k0 , q1 ) . . . (qn−1 , an−1 , kn−1 , qn ) is called victorious if
we have
σ(π) = θ(a0 . . . an−1 )[q0 , qn ].

Mohri’s conditions in [25] are:

1. Let w ∈ Σ∗ and p, q ∈ Q. If θ(w)[p, q] ≠ 0, then there is a victorious path in $p \overset{w}{\leadsto} q$.

2. Let π be a victorious path. For every 0 ≤ i ≤ j ≤ |π| the path π(i, j) is victorious.

3. The commutativity of the tropical semiring.³

If we want to generalize Mohri’s result to a larger class of semirings, then we have to take care of these conditions. Just assume that there are exactly two paths π, π′ ∈ $p \overset{w}{\leadsto} q$ in condition (1), and σ(π) = k and σ(π′) = l. Then, θ(w)[p, q] = k ⊕ l. To assume the existence of a victorious path, we need either θ(w)[p, q] = σ(π) or θ(w)[p, q] = σ(π′), i.e., we need k ⊕ l ∈ {k, l}. Hence, we restrict ourselves to semirings K satisfying k ⊕ l ∈ {k, l} for every k, l ∈ K.
This property has an important consequence.

Lemma 3.4. Let K be a semiring. The following three conditions are equivalent.

1. For every k, l ∈ K, we have k ⊕ l ∈ {k, l}.

2. There is a linear order relation ≤ on K such that ⊕ is the minimum with respect to ≤.

3. There is a linear order relation ≤ on K such that:

(a) The operation ⊕ is the minimum with respect to ≤.


(b) The ordering ≤ is stable with respect to ⊙, i.e., for every k1, k2, l ∈ K, k1 ≤ k2 implies l ⊙ k1 ≤ l ⊙ k2 and k1 ⊙ l ≤ k2 ⊙ l.
3
For instance, he uses the commutativity of the tropical semiring to obtain σ(π0 ) = σ(π00 ) + θ1 (p0 , u2 , p0 ) in
the last part of the proof of Theorem 11 in [25].

Proof. (3) ⇒ (2) and (2) ⇒ (1) are obvious. We show (1) ⇒ (3). For every k, l ∈ K let

k≤l :⇐⇒ k ⊕ l = k.

By (1), ≤ is reflexive and total, and it is antisymmetric by the commutativity of ⊕.


Let k1 , k2 , k3 ∈ K satisfying k1 ≤ k2 ≤ k3 . Then, we have

k1 ⊕ k3 = (k1 ⊕ k2 ) ⊕ k3 = k1 ⊕ (k2 ⊕ k3 ) = k1 ⊕ k2 = k1 ,

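i.e., k1 ≤ k3. Thus, ≤ is a linear ordering, and it is easy to see that ⊕ is the minimum with respect to ≤. Assertion (3b) follows from the distributivity of ⊙ over ⊕: if k1 ≤ k2, then l ⊙ k1 = l ⊙ (k1 ⊕ k2) = (l ⊙ k1) ⊕ (l ⊙ k2), i.e., l ⊙ k1 ≤ l ⊙ k2, and symmetrically for multiplication from the right.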
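The construction in this proof can be tested mechanically on a finite sample of semiring elements. The sketch below assumes the tropical semiring with min as addition and + as multiplication, restricted to a handful of values; it derives the order from the addition exactly as in the proof and checks totality, antisymmetry, transitivity, and stability. All function names are illustrative.

```python
import itertools
import math

INF = math.inf


def oplus(k, l):
    """Addition of the tropical semiring: the minimum."""
    return min(k, l)


def odot(k, l):
    """Multiplication of the tropical semiring: ordinary addition."""
    return k + l


def leq(k, l):
    """Order derived from the addition as in the proof: k <= l iff k (+) l = k."""
    return oplus(k, l) == k


def check_min_semiring(sample):
    """Check conditions (1) and (3) of Lemma 3.4 on a finite sample."""
    for k, l in itertools.product(sample, repeat=2):
        assert oplus(k, l) in (k, l)           # condition (1)
        assert leq(k, l) or leq(l, k)          # totality
        if leq(k, l) and leq(l, k):
            assert k == l                      # antisymmetry
    for k1, k2, k3 in itertools.product(sample, repeat=3):
        if leq(k1, k2) and leq(k2, k3):
            assert leq(k1, k3)                 # transitivity
        if leq(k1, k2):                        # stability (3b) with l = k3
            assert leq(odot(k3, k1), odot(k3, k2))
            assert leq(odot(k1, k3), odot(k2, k3))
    return True


SAMPLE = [0, 1, 2, 5, 17, INF]
print(check_min_semiring(SAMPLE))
```

A finite check of this kind is of course no proof, but swapping in another candidate pair of operations for `oplus` and `odot` quickly shows whether condition (1) can hold.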

If a semiring K satisfies assertions (1,2,3) in Lemma 3.4, then we call K a min-semiring. In [23] these semirings are called extremal.
The tropical semiring and the semiring (N ∪̇ {∞}, min, ·, ∞, 1) (where · denotes integer multiplication) are min-semirings. Moreover, every ordered monoid $(M, \le, \oplus, 1_M)$ admits an extension to a zero-divisor-free min-semiring $(M \mathbin{\dot\cup} \{\infty\}, \min, \oplus, \infty, 1_M)$ by setting m ≤ ∞ for every m ∈ M and defining min with respect to ≤. Thus, min-semirings are naturally related to ordered algebraic structures which are an important field in mathematics [12, 13, 16].
If K is a min-semiring, then we have k ≤ 0 for every k ∈ K. Every min-semiring is idempotent.
However, there are idempotent semirings which are not min-semirings, e.g., (2M , ∪, ∩, ∅, M ) for
some non-empty set M .
It is well known in the theory of semirings that a semiring K is idempotent iff ⊕ is the infimum over some partial ordering which is stable with respect to ⊙ [13]. Lemma 3.4 is a variant of this result for min-semirings.
Let K be a min-semiring. By induction, one can easily show that for every n ≥ 1 and every $k_1, \dots, k_n \in K$, there is some 1 ≤ i ≤ n such that $\bigoplus_{1 \le j \le n} k_j = k_i$.
Now, consider Mohri’s first condition for a min-semiring K. Since

$$ \theta(w)[p,q] = \bigoplus_{\pi \,\in\, p \overset{w}{\leadsto} q} \sigma(\pi), $$

we can assume that there is some path π ∈ $p \overset{w}{\leadsto} q$ such that σ(π) = θ(w)[p, q] provided that θ(w)[p, q] ≠ 0 and K is a min-semiring.
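To make the existence of a victorious path concrete, the following sketch enumerates all paths with a given label in a small WFA over the tropical semiring and picks one whose weight equals θ(w)[p, q]. The automaton `E` is a made-up example, and the representation of transitions as tuples (p, a, k, q) is an assumption chosen to mirror the notation of the text.

```python
import math

# Hypothetical transitions (p, a, k, q) of a small WFA over the tropical semiring.
E = [(0, "a", 1, 0), (0, "a", 3, 1), (1, "a", 0, 1), (0, "b", 2, 1), (1, "b", 5, 0)]


def paths(p, w, q):
    """Yield every path from p to q with label w as a list of transitions."""
    if not w:
        if p == q:
            yield []
        return
    for t in E:
        if t[0] == p and t[1] == w[0]:
            for rest in paths(t[3], w[1:], q):
                yield [t] + rest


def sigma(path):
    """Weight of a path: the tropical product (ordinary sum) of its weights."""
    return sum(t[2] for t in path)


def theta(w, p, q):
    """theta(w)[p, q]: the tropical sum (minimum) over all paths; inf is the zero."""
    return min((sigma(pi) for pi in paths(p, w, q)), default=math.inf)


def victorious_path(w, p, q):
    """Return some path whose weight equals theta(w)[p, q], or None if there is none."""
    best = None
    for pi in paths(p, w, q):
        if best is None or sigma(pi) < sigma(best):
            best = pi
    return best


pi = victorious_path("aa", 0, 1)
print(pi, sigma(pi), theta("aa", 0, 1))
```

Because the sum over paths is a minimum, at least one path attains it; in a semiring such as (N, +, ·, 0, 1) no single path need carry the total weight.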
Now, we take care of Mohri’s second condition. Unfortunately, this condition is not true in
general, even if K is a min-semiring. We can avoid this condition in our proof of Theorem 3.5,
below. However, due to this problem, we cannot prove Theorem 3.5 by slight adjustments to
Mohri’s proof.

Theorem 3.5. Let K be a commutative min-semiring and let T = [Q, E, λ, ϱ] be a WFA satisfying $\lambda \neq 0_Q$. Let (f, g) be a maximal factorization. If T has the twins property, then the determinization of T by (f, g) is defined.

Proof. Let $\tilde\delta : M_{1\times Q}(K) \times \Sigma \dashrightarrow M_{1\times Q}(K)$ and $\tilde Q$ be defined as in Section 3.3. We prove the theorem by showing that $\tilde Q$ is finite. Our strategy is to construct a finite subset K′ ⊆ K such that for every w ∈ Σ∗ we have $\tilde\delta(f(\lambda), w) \in f(M_{1\times Q}(K'))$ if $\tilde\delta(f(\lambda), w)$ is defined.
Let $w = a_0 \dots a_{|w|-1} \in \Sigma^*$ such that $\tilde\delta(f(\lambda), w)$ is defined. We derive information on λ θ(w). Let $Q' := \{\, p \in Q \mid (\lambda\,\theta(w))[p] \neq 0 \,\}$. We have Q′ ≠ ∅. For every p ∈ Q′, let $q_{0,p} \in Q$ and $\pi_p \in q_{0,p} \overset{w}{\leadsto} p$ be a path such that

$$ \lambda(q_{0,p}) \odot \sigma(\pi_p) = (\lambda\,\theta(w))[p]. \tag{1} $$

Note that $q_{0,p}$ and $\pi_p$ exist, because K is a min-semiring.



For p ∈ Q′, we denote $\pi_p = (q_{p,0}, a_0, k_{p,0}, q_{p,1}) \dots (q_{p,|w|-1}, a_{|w|-1}, k_{p,|w|-1}, q_{p,|w|})$.


Now, let n ≥ 0 and let $0 \le i_1 < i_2 \le i_3 < i_4 \le i_5 < \dots < i_{2n} \le |w|$ such that we have for every 1 ≤ l ≤ n and every p ∈ Q′: $q_{p,i_{2l-1}} = q_{p,i_{2l}}$. By the pigeon hole principle, we can assume

$$ |w| - \sum_{1 \le l \le n} (i_{2l} - i_{2l-1}) \;\le\; |Q|^{|Q'|} \;\le\; |Q|^{|Q|}. \tag{2} $$

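We denote by ≤⊕ the ordering of K given by Lemma 3.4.
Now let p ∈ Q′ and 1 ≤ l ≤ n be arbitrary. We have

$$ \theta(a_{i_{2l-1}} \dots a_{i_{2l}-1})[q_{p,i_{2l-1}}, q_{p,i_{2l}}] \;\le_\oplus\; \sigma(\pi_p(i_{2l-1}, i_{2l})). \tag{3} $$

Since K is a min-semiring, there is a path ν which starts and ends at the same states as $\pi_p(i_{2l-1}, i_{2l})$, has the same label as $\pi_p(i_{2l-1}, i_{2l})$, and satisfies $\theta(a_{i_{2l-1}} \dots a_{i_{2l}-1})[q_{p,i_{2l-1}}, q_{p,i_{2l}}] = \sigma(\nu)$. Let $\pi'_p$ be the path which we obtain by replacing in $\pi_p$ the part $\pi_p(i_{2l-1}, i_{2l})$ by ν.
We have $\sigma(\pi'_p) \le_\oplus \sigma(\pi_p)$ by (3), the choice of ν, and the stability of ≤⊕ under ⊙. In combination with the choice of $\pi_p$, we obtain

$$ \lambda(q_{p,0}) \odot \sigma(\pi'_p) \;\le_\oplus\; \lambda(q_{p,0}) \odot \sigma(\pi_p) = (\lambda\,\theta(w))[p]. \tag{4} $$

Since (λ θ(w))[p] is the minimum of the weights λ(q) ⊙ σ(π) over all q ∈ Q and all paths π from q to p with label w, the converse inequality holds as well. Consequently, we have $\lambda(q_{p,0}) \odot \sigma(\pi'_p) = (\lambda\,\theta(w))[p]$ in (4).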


We perform such a replacement for every 1 ≤ l ≤ n and every p ∈ Q′. To avoid technical overhead, we assume that we were lucky, i.e., we assume that for p ∈ Q′ and 1 ≤ l ≤ n line (3) is an equation:

$$ \theta(a_{i_{2l-1}} \dots a_{i_{2l}-1})[q_{p,i_{2l-1}}, q_{p,i_{2l}}] = \sigma(\pi_p(i_{2l-1}, i_{2l})). \tag{5} $$

Because T has the twins property, we have for arbitrary p, q ∈ Q′ and 1 ≤ l ≤ n the equality $\sigma(\pi_p(i_{2l-1}, i_{2l})) = \sigma(\pi_q(i_{2l-1}, i_{2l}))$. Thus, we can define for some p ∈ Q′

$$ k := \bigodot_{1 \le l \le n} \sigma(\pi_p(i_{2l-1}, i_{2l})). $$

We are now able to derive information on λ θ(w)[p] using (1) and k. Let $\hat\pi_p$ be the path⁴ which we obtain by “erasing” the parts $\pi_p(i_{2l-1}, i_{2l})$ from $\pi_p$ for every 1 ≤ l ≤ n. We define for every p ∈ Q′

$$ k_p := \lambda(q_{p,0}) \odot \sigma(\hat\pi_p). $$

By the commutativity of ⊙ we have

$$ (\lambda\,\theta(w))[p] = k \odot k_p. \tag{6} $$

If we set $k_p := 0$ for p ∈ Q \ Q′, then (6) holds for every p ∈ Q. We define $k' \in M_{1\times Q}(K)$ by setting $k'[p] = k_p$. We can state (6) as λ θ(w) = k ⊙ k′. Let

$$ K' := \{\, \lambda(q) \odot \sigma(\pi) \mid q, p \in Q,\ v \in \Sigma^*,\ |v| \le |Q|^{|Q|},\ \pi \in q \overset{v}{\leadsto} p \,\}. $$


By (2), we have $k_p \in K'$ for every p ∈ Q, and thus $k' \in M_{1\times Q}(K')$. Note that K′ and $M_{1\times Q}(K')$ are finite. By Lemma 3.2, we obtain

$$ \tilde\delta(f(\lambda), w) = \tilde\delta(\lambda, w) = f(\lambda\,\theta(w)) = f(k \odot k') = f(k') \in f(M_{1\times Q}(K')). $$

Thus, $\tilde Q$ is finite.
4
Let ŵ ∈ Σ∗ be the label of $\hat\pi_p$. Please note that we do NOT necessarily have $\lambda(q_{p,0}) \odot \sigma(\hat\pi_p) = \lambda\,\theta(\hat w)[p]$. (There are easy counterexamples.)

Next we show that Theorem 3.5 does not hold for arbitrary semirings: neither the assumption that K is commutative nor the assumption that K is a min-semiring can be dropped.

Example 3.6. Let Σ = {a}. Consider the WFA T1 over the semiring (N, +, ·, 0, 1) defined by Q = {q0, q1}, λ(q0) = λ(q1) = ϱ(q0) = ϱ(q1) = 1 and E = {(q0, a, 1, q0), (q0, a, 1, q1), (q1, a, 1, q1)}.

[Figure: the WFA T1 with states 1 and 2, initial and final weights 1, and three transitions labeled a/1.]

For every n ≥ 0 we have λ · θ(aⁿ) = (1, n+1). It is obvious that every maximal factorization (f, g) satisfies f((1, n+1)) = (1, n+1). Thus, the determinization of T1 by some maximal factorization does not exist, although T1 has the twins property and (N, +, ·, 0, 1) is commutative.
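The vectors λ · θ(aⁿ) of this example can be recomputed directly. The sketch below assumes that states q0 and q1 are indexed 0 and 1 and writes θ(a) as a matrix of transition weights over (N, +, ·, 0, 1); the names `LAMBDA`, `THETA_A`, and `lambda_theta` are illustrative.

```python
def row_times_matrix(row, matrix):
    """Row vector times matrix over the semiring (N, +, *, 0, 1)."""
    columns = len(matrix[0])
    return tuple(sum(row[i] * matrix[i][j] for i in range(len(row)))
                 for j in range(columns))


LAMBDA = (1, 1)              # initial weights of q0 and q1
THETA_A = ((1, 1), (0, 1))   # weights of the transitions labeled a


def lambda_theta(n):
    """Compute lambda * theta(a^n) by n successive multiplications."""
    vector = LAMBDA
    for _ in range(n):
        vector = row_times_matrix(vector, THETA_A)
    return vector


for n in range(5):
    print(n, lambda_theta(n))
```

The second component grows without bound, so infinitely many distinct vectors arise as states of the determinization.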

Example 3.7. Let Σ = {a} and ∆ = {a, b} with a < b. Consider the string semiring (∆∗, min, ·, 0, ε) and the WFA T2 defined by Q = {q0, q1}, λ(q0) = ϱ(q0) = ϱ(q1) = ε, λ(q1) = b with transitions E = {(q0, a, a, q0), (q1, a, a, q1)}.

[Figure: the WFA T2 with states 1 and 2, initial weights ε and b, final weights ε, and two loops labeled a/a.]

For every n ≥ 0 we have λ · θ(aⁿ) = (aⁿ, baⁿ). As above, it is obvious that every maximal factorization (f, g) satisfies f((aⁿ, baⁿ)) = (aⁿ, baⁿ). Thus, the determinization of T2 by some maximal factorization does not exist, although T2 has the twins property and (∆∗, min, ·, 0, ε) is a min-semiring.
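The same computation can be carried out in the string semiring. In the sketch below the zero element is modelled by `None`, and the minimum is taken with respect to the length-then-lexicographic order; that choice of order is an assumption made for illustration, and it plays no role here because θ(a) is diagonal, so no two non-zero words are ever compared.

```python
ZERO = None  # the zero of the string semiring, absorbing for concatenation


def oplus(x, y):
    """Minimum of two words; the order (length, then lexicographic) is an assumption."""
    if x is ZERO:
        return y
    if y is ZERO:
        return x
    return min(x, y, key=lambda s: (len(s), s))


def odot(x, y):
    """Concatenation, with ZERO absorbing."""
    return ZERO if x is ZERO or y is ZERO else x + y


LAMBDA = ("", "b")                      # initial weights of q0 and q1
THETA_A = (("a", ZERO), (ZERO, "a"))    # the two loops labeled a/a


def row_times_matrix(row, matrix):
    """Row vector times matrix over the string semiring."""
    result = []
    for j in range(len(matrix[0])):
        total = ZERO
        for i in range(len(row)):
            total = oplus(total, odot(row[i], matrix[i][j]))
        result.append(total)
    return tuple(result)


def lambda_theta(n):
    """Compute lambda * theta(a^n)."""
    vector = LAMBDA
    for _ in range(n):
        vector = row_times_matrix(vector, THETA_A)
    return vector


print(lambda_theta(3))
```

Since concatenation is not commutative, the common suffix aⁿ cannot be moved in front of the vector, which is the point of the example.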

It is an open question whether one can generalize Theorem 3.5 to idempotent commutative
semirings. If K is an idempotent semiring, then ⊕ is the infimum over some partial ordering,
i.e., we have a slightly weaker condition than in a min-semiring. However, in an idempotent
semiring we can no longer assume the existence of victorious paths which is of crucial importance
in the proof of Theorem 3.5 (e.g., for the existence of ν). On the other hand, we do not have a
counterexample which shows that Theorem 3.5 cannot be generalized to idempotent semirings.

4 Unambiguous and Single-valued WFA


In the sequel, we fix an alphabet Σ and an arbitrary semiring K. We define two classes of weighted finite automata and consider their relationships and properties. A WFA T is called unambiguous if every word w ∈ Σ∗ is the label of at most one successful path in T.
Unambiguous WFA properly extend the computational power of subsequential WFA. However, not every formal power series is computable by an unambiguous WFA. It is known that there are recognizable formal power series which are not unambiguous, see e.g. [24, 20].

Example 4.1. Let Σ = {a, b} and set $f(w) := \min\{|w|_a, |w|_b\}$ for w ∈ Σ∗. It defines a recognizable formal power series f : Σ∗ → 𝕋 which cannot be computed by an unambiguous WFA.

We still need some definitions and a normal form. Let T = [Q, E, λ, ϱ] be a WFA. We call some state q′ ∈ Q accessible if there are p, q ∈ Q and u, v ∈ Σ∗ satisfying λ(p) ≠ 0, ϱ(q) ≠ 0, and $p \overset{u}{\leadsto} q' \overset{v}{\leadsto} q \neq \emptyset$. We call T trim if every state in Q is accessible.

Remark 4.2. Let T = [Q, E, λ, ϱ] be a WFA such that we have |T|(w) ≠ 0 for at least one nonempty word w ∈ Σ+. There is a WFA T′ = [Q′, E′, λ′, ϱ′] such that:

1. We have |T′|(w) = |T|(w) for every nonempty word w ∈ Σ+, and |T′|(ε) = 0.

2. For every p, q ∈ Q′ and every a ∈ Σ, there is at most one k ∈ K with (p, a, k, q) ∈ E′.

3. We have |Q′| ≤ |Q| + 2.

4. There are p′, q′ ∈ Q′ such that

(a) λ′[p′] = ϱ′[q′] = 1.
(b) For every r ∈ Q′ with r ≠ p′, we have λ′[r] = 0.
(c) For every r ∈ Q′ with r ≠ q′, we have ϱ′[r] = 0.
(d) E′ ⊆ (Q′ \ {q′}) × Σ × (K \ {0}) × (Q′ \ {p′}).

5. T′ is trim.

Note that T′ in Remark 4.2 is not necessarily subsequential, even if T is subsequential.

4.1 A cross-section construction


We introduce the following concept. We call a WFA T = [Q, E, λ, ϱ] single-valued if for every w ∈ Σ∗ and any two successful paths $\pi \in i \overset{w}{\leadsto} f$ and $\pi' \in i' \overset{w}{\leadsto} f'$ of T we have

$$ \lambda(i) \odot \sigma(\pi) \odot \varrho(f) = \lambda(i') \odot \sigma(\pi') \odot \varrho(f'). $$

That is to say, for successful paths with identical labels, the weights including initial and final
weights, coincide. In the case of the tropical or a min-semiring it simply means that the weight
of a successful path together with the respective initial and final weight is equal to the value for
the considered label defined by |T |.
This provides an extension of the notion of unambiguous WFA. However, the construction
in this section is used to show that these WFA admit equivalent unambiguous ones. Hence the
power of computability of the two classes of formal power series coincide. To prove Theorem 4.4
we will need the cross section theorem due to Eilenberg [11] (cf. also [4]).

Proposition 4.3 (Eilenberg). Let Σ and ∆ be alphabets. For a morphism α : Σ∗ → ∆∗ and


any recognizable language A ⊆ Σ∗ there exists a recognizable language B ⊆ A such that α maps
bijectively B onto α(A).

Given a single-valued WFA T we use Proposition 4.3 to construct a cross section of the set
of all successful paths in T that contains for every label exactly one of the original successful
paths. However, we have to restrict ourselves to idempotent semirings.

Theorem 4.4. Let K be an idempotent semiring and T = [Q, E, λ, ϱ] a single-valued WFA over K. There exists an equivalent unambiguous WFA for T.

Proof. Firstly, we treat the case where |T|(ε) = 0. Let T = [Q, E, λ, ϱ] be a single-valued WFA. We may assume that T has the normal form of Remark 4.2, in particular 4.(d), and only has accessible states. Let Σ, {i} and {f} be the considered alphabet and the sets of initial and accepting states of T, respectively. We define

$$ R := \{\, k \in K \mid \exists\, p, q \in Q,\ a \in \Sigma:\ (p, a, k, q) \in E \,\}, $$

the set of all weights of transitions in E. Then E ⊆ Q × Σ × R × Q can be seen as an alphabet.
We obtain the morphism α : E∗ → Σ∗ needed for Proposition 4.3 by the natural extension of

$$ \alpha(p, a, k, q) := a, \qquad \alpha(\varepsilon) := \varepsilon $$

to words. Thus, a path in T is simply mapped to its label. Let

$$ S := \{\, (p, a, k, p')(q, a', k', q') \in E^2 \mid p' \neq q \,\} \subseteq E^* $$

be the set of all words over E of length 2 which are not successive. The set of all successful paths in T,

$$ P := \bigl[\, (\{i\} \times \Sigma \times R \times Q)\, E^* \;\cap\; E^* (Q \times \Sigma \times R \times \{f\}) \,\bigr] \setminus E^* S E^*, $$

is a recognizable language over E. Then α(P) contains the labels of successful paths.


Due to Proposition 4.3 there exists a recognizable language B ⊆ P such that α maps B bijectively onto α(P). Now, let $A = [Q_A, E_A, i_A, F_A]$ be a deterministic finite automaton for B over the alphabet E. We define a WFA $T' = [Q_A, E', \lambda', \varrho']$, where $\lambda'(i_A) = 1$, $\lambda'(q) = 0$ for $q \in Q_A \setminus \{i_A\}$, $\varrho'(q) = 1$ for $q \in F_A$, and $\varrho'(q) = 0$ for $q \in Q_A \setminus F_A$. We define E′ by:

$$ E' := \{\, (p, a, k, q) \mid \exists\, z_1, z_2 \in Q \text{ such that } (p, (z_1, a, k, z_2), q) \in E_A \,\}. \tag{7} $$

Since α maps B bijectively onto α(P ) the constructed WFA T 0 is unambiguous. We have to
show that T 0 and T are equivalent.
Let $w = a_1 a_2 \dots a_n$ be an element of α(P). Since T is single-valued and K is an idempotent semiring there exists a path

$$ \pi_T = (z_0, a_1, k_1, z_1)(z_1, a_2, k_2, z_2) \dots (z_{n-1}, a_n, k_n, z_n) \in B $$

which satisfies $\alpha(\pi_T) = w$ and $|T|(w) = \bigodot_{1 \le l \le n} k_l$. There is a path

$$ \pi_A = (q_0, (z_0, a_1, k_1, z_1), q_1)(q_1, (z_1, a_2, k_2, z_2), q_2) \dots (q_{n-1}, (z_{n-1}, a_n, k_n, z_n), q_n) $$

in A with label $\pi_T$. Because of the construction of E′ in (7) we find

$$ \pi_{T'} = (q_0, a_1, k_1, q_1)(q_1, a_2, k_2, q_2) \dots (q_{n-1}, a_n, k_n, q_n) $$

as a path in T′ such that $|T'|(w) = \bigodot_{1 \le l \le n} k_l = |T|(w)$.
Conversely, if $w = a_1 a_2 \dots a_n$ is a label of a successful path in T′, there exists a path

$$ \pi_{T'} = (q_0, a_1, k_1, q_1)(q_1, a_2, k_2, q_2) \dots (q_{n-1}, a_n, k_n, q_n) $$

in T′. The word w is mapped to $\bigodot_{1 \le l \le n} k_l$ by T′. Again following the construction of E′, there are states $z_0, z_1, \dots, z_n$ in Q and a path

$$ \pi_A = (q_0, (z_0, a_1, k_1, z_1), q_1)(q_1, (z_1, a_2, k_2, z_2), q_2) \dots (q_{n-1}, (z_{n-1}, a_n, k_n, z_n), q_n) $$

in A for $\pi_{T'}$. Since A recognizes B ⊆ P, we have $(z_0, a_1, k_1, z_1) \dots (z_{n-1}, a_n, k_n, z_n) \in B$ and this is a successful path in T. It follows that $|T|(w) = \bigodot_{1 \le l \le n} k_l = |T'|(w)$, because T is single-valued and K is idempotent.
If we consider the second case |T|(ε) ≠ 0, we can construct a normal form T1 from T using Remark 4.2, so that |T1|(w) = |T|(w) for every nonempty word w ∈ Σ+ and |T1|(ε) = 0. Following the proof in the first case above, we obtain an equivalent unambiguous WFA T2 = [Q2, E2, λ2, ϱ2] for T1. Now we define a WFA T′ by adding a new state q to T2 and extending λ2 and ϱ2 by λ2(q) := 1 and ϱ2(q) := |T|(ε).

Note that Theorem 4.4 is already known for transducers over the tropical semiring and for
certain string-to-string transducers [29, 15, 19] even in a stronger fashion [31]. However, these
approaches rely on the cancellativity of the multiplication in the semiring while our approach
just requires idempotency of the semiring addition.
Also, note that every min-semiring is idempotent. So especially for the min-semirings, in-
troduced in Section 3.6, one can apply Theorem 4.4.
As already mentioned, the proof of Theorem 4.4 relies on the cross section theorem. An alternative proof utilizes a so-called immersion (a morphism with special conditions) of a (non-weighted) automaton due to Schützenberger and Sakarovitch [27]. We briefly describe the main steps of this proof. For a single-valued WFA S we consider the underlying (non-weighted) automaton A. By [27, Theorem 3], there exist an unambiguous automaton B equivalent to A and an immersion from B into A, mapping edges in B to edges in A and states in B to states in A. This morphism has certain compatibility properties. We want to distribute weights along B. To this end, we associate to each edge in B the weight of its image under the immersion, considered not in A but in the originally given weighted automaton S. We call the resulting WFA S′. Now, very similarly to the part of the first proof showing |T| = |T′|, using also the idempotency of the semiring and properties of the immersion, it is possible to prove the equivalence of the constructed WFA S′ and S.

4.2 Weighted finite automata over the tropical semiring

In this section we consider WFA over the tropical semiring 𝕋. We always consider determinization with respect to the maximal factorization by Mohri, as explained in Section 3.5. If not indicated otherwise, the underlying alphabet and the set of transitions are Σ and E, respectively. For the class of single-valued and trim WFA we show that the determinization is optimal in the sense that for a given single-valued and trim WFA T the determinization is defined iff T is determinizable, i.e., there exists an equivalent subsequential WFA for T. This extends a result of Mohri [25] from unambiguous to single-valued WFA. Furthermore, we show that it is decidable whether a single-valued WFA over 𝕋 computes a subsequential formal power series.
On Σ∗ one can define a metric d by setting

∀u, v ∈ Σ∗ : d(u, v) = |u| + |v| − 2|u ∧ v|.

For two reals r1 , r2 the euclidean metric is denoted by |r1 − r2 |.


A partial function α : Σ∗ ⇢ ℝ₊ has bounded variation if for all k ≥ 0 there is a K ≥ 0 such that for all u, v ∈ Σ∗ with α(u), α(v) ∈ ℝ₊ the following holds:

d(u, v) ≤ k ⇒ |α(u) − α(v)| ≤ K.

Lemma 4.5 ([25]). Every subsequential formal power series has bounded variation.

On the other hand, the formal power series of an unambiguous WFA does not necessarily
have bounded variation, as the following example shows:

Example 4.6. The formal power series computed by the following unambiguous WFA T does not have bounded variation:

[Figure: the WFA T with states 2, 3, 4, 5 and transitions labeled a/1, c/0, a/1, a/0, b/0, a/0.]

In [25, p. 283, Th. 9] Mohri claims a characterization of recognizable power series which
are subsequential by saying a recognizable series is subsequential if and only if it has bounded
variation. Lemma 4.7 shows the incorrectness of his claim.
Lemma 4.7. The function f of Example 4.1 is a recognizable formal power series, has bounded
variation, but it is not subsequential.
Proof. By Example 4.1, it remains to show that f has bounded variation. Let w ∈ Σ∗ be arbitrary. By distinguishing the cases $|w|_a < |w|_b$ and $|w|_a \ge |w|_b$, we obtain f(w) ≤ f(wa) ≤ f(w) + 1 and f(w) ≤ f(wb) ≤ f(w) + 1. By an induction on the length of w′ ∈ Σ∗, we have

f(w) ≤ f(ww′) ≤ f(w) + |w′|, i.e., 0 ≤ f(ww′) − f(w) ≤ |w′|.

Let w, u, v ∈ Σ∗ be arbitrary such that w = wu ∧ wv, and thus, d(wu, wv) = |u| + |v|. We have

|f (wu) − f (wv)| ≤ |f (wu) − f (w)| + |f (wv) − f (w)| ≤ |u| + |v| = d(wu, wv).

Hence, f satisfies the definition of bounded variation for K := k.
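The inequality |f(u) − f(v)| ≤ d(u, v) established in this proof can be confirmed by brute force on short words. The following sketch assumes that u ∧ v denotes the longest common prefix of u and v; the function names are illustrative.

```python
from itertools import product


def f(w):
    """The series of Example 4.1: the minimum of the numbers of a's and b's."""
    return min(w.count("a"), w.count("b"))


def common_prefix_length(u, v):
    """Length of the longest common prefix of u and v."""
    n = 0
    for x, y in zip(u, v):
        if x != y:
            break
        n += 1
    return n


def d(u, v):
    """The prefix metric d(u, v) = |u| + |v| - 2 |u ^ v|."""
    return len(u) + len(v) - 2 * common_prefix_length(u, v)


def words(max_len):
    """All words over {a, b} of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            yield "".join(letters)


def check_bounded_variation(max_len):
    """Check |f(u) - f(v)| <= d(u, v) for all pairs of words up to max_len."""
    all_words = list(words(max_len))
    return all(abs(f(u) - f(v)) <= d(u, v) for u in all_words for v in all_words)


print(check_bounded_variation(6))
```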

On the other hand, if the considered WFA is single-valued, trim and has the property of bounded
variation, then it has the twins property and hence its determinization is defined. Therefore, we
have the following propositions.
Proposition 4.8. Every trim and single-valued WFA computing a formal power series with
bounded variation has the twins property.
Proof. Let T = [Q, E, λ, ϱ] be a trim and single-valued WFA such that |T| has bounded variation and let I, F be the sets of initial and accepting states of T, respectively.
Consider q1, q2 ∈ Q, i1, i2 ∈ I and u, v ∈ Σ∗ such that

$$ i_1 \overset{u}{\leadsto} q_1 \neq \emptyset \;\wedge\; i_2 \overset{u}{\leadsto} q_2 \neq \emptyset \;\wedge\; q_1 \overset{v}{\leadsto} q_1 \neq \emptyset \;\wedge\; q_2 \overset{v}{\leadsto} q_2 \neq \emptyset. $$

Since T is trim and |T| has bounded variation, there are w1, w2 ∈ Σ∗, f1, f2 ∈ F and a K ≥ 0 such that

$$ q_1 \overset{w_1}{\leadsto} f_1 \neq \emptyset \;\wedge\; q_2 \overset{w_2}{\leadsto} f_2 \neq \emptyset \;\wedge\; \Bigl[\, \forall k \ge 0:\ \bigl|\, |T|(u v^k w_1) - |T|(u v^k w_2) \,\bigr| \le K \,\Bigr]. $$

T is single-valued and trim, hence all paths between two states over identical words have the same weight:

$$ \begin{aligned}
|T|(u v^k w_1) &= \lambda(i_1) + \theta(u)[i_1, q_1] + \theta(w_1)[q_1, f_1] + k \cdot \theta(v)[q_1, q_1] + \varrho(f_1) \\
 &=: k \cdot \theta(v)[q_1, q_1] + C_1 \\
|T|(u v^k w_2) &= \lambda(i_2) + \theta(u)[i_2, q_2] + \theta(w_2)[q_2, f_2] + k \cdot \theta(v)[q_2, q_2] + \varrho(f_2) \\
 &=: k \cdot \theta(v)[q_2, q_2] + C_2
\end{aligned} $$

$$ \Longrightarrow\ \forall k \ge 0:\ \bigl|\, C_1 - C_2 + k \bigl( \theta(v)[q_1, q_1] - \theta(v)[q_2, q_2] \bigr) \bigr| \le K $$

$$ \Longrightarrow\ \theta(v)[q_1, q_1] - \theta(v)[q_2, q_2] = 0 $$
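The last implication of this proof can be spelled out as follows. The symbol δ is introduced only for this remark and does not appear in the paper; the estimate is the reverse triangle inequality.

```latex
\[
  \delta := \theta(v)[q_1, q_1] - \theta(v)[q_2, q_2], \qquad
  \bigl| C_1 - C_2 + k\,\delta \bigr| \;\ge\; k\,|\delta| - |C_1 - C_2|
  \quad \text{for every } k \ge 0 .
\]
% If \delta were non-zero, the right-hand side would exceed K as soon as
% k > (K + |C_1 - C_2|) / |\delta|, contradicting the bound by K.
% Hence \delta = 0, i.e., the two loop weights coincide.
```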

Now, for the considered class of formal power series computed by single-valued WFA, we can
show that the properties of bounded variation and subsequentiality coincide.

Proposition 4.9. Let f be a formal power series. The following two conditions are equivalent:

1. f is subsequential.

2. f has bounded variation and is computable by a single-valued WFA.

Proof. Every subsequential (and therefore single-valued) WFA for f computes a formal power
series with bounded variation (Lemma 4.5). Conversely, if f is computable by a single-valued
WFA, by Proposition 4.8 this WFA has the twins property. Applying Theorem 3.5, we get
(2) ⇒ (1).

Theorem 4.10. Let T be a single-valued and trim WFA. The following conditions are equiva-
lent:

1. |T | is subsequential.

2. The determinization of T is defined.

3. T has the twins property.

4. |T | has bounded variation.

Proof. The equivalence follows from Proposition 4.8, Theorems 3.5 and 3.1, and Lemma 4.5.

4.3 Decidable properties


It is decidable in polynomial time whether a transducer has the twins property [1, 3]. We apply
techniques from [1, 3] to show that it is decidable in polynomial time whether a given WFA over
the tropical semiring is single-valued.
Throughout this section, let T = [Q, E, λ, ϱ] be a WFA over the tropical semiring and denote by I resp. F the sets of initial resp. accepting states of T.
We call two states q, q′ ∈ Q co-twins if there exists a word w ∈ Σ∗ such that both $q \overset{w}{\leadsto} F$ and $q' \overset{w}{\leadsto} F$ are non-empty. We denote the set of all pairs of co-twins of T by C.
Now, let ℱ be the least subset of C × ℝ such that

1. for every q, q′ ∈ I, (q, q′) ∈ C, we have (q, q′, λ(q) − λ(q′)) ∈ ℱ and

2. for every (p, p′, k) ∈ ℱ, (q, q′) ∈ C, (p, a, l, q), (p′, a, l′, q′) ∈ E, we have (q, q′, k + l − l′) ∈ ℱ.

Lemma 4.11. With the preceding terminology, the following two assertions are equivalent:

1. T is not single-valued.

2. There is some (q, q′, k) ∈ ℱ such that q, q′ ∈ F and ϱ(q′) − ϱ(q) ≠ k.

Moreover, these conditions are satisfied if

(∗) there are (q, q′, k1) ∈ ℱ and (q, q′, k2) ∈ ℱ such that k1 ≠ k2.

Proof. (1) ⇒ (2) By (1), there is some word w ∈ Σ∗ and there are $q_0, q'_0 \in I$, $q_{|w|}, q'_{|w|} \in F$, and paths $\pi \in q_0 \overset{w}{\leadsto} q_{|w|}$, $\pi' \in q'_0 \overset{w}{\leadsto} q'_{|w|}$ such that

$$ \lambda(q_0) + \theta(\pi) + \varrho(q_{|w|}) \neq \lambda(q'_0) + \theta(\pi') + \varrho(q'_{|w|}), $$

and hence,

$$ \lambda(q_0) + \theta(\pi) - \lambda(q'_0) - \theta(\pi') \neq \varrho(q'_{|w|}) - \varrho(q_{|w|}). \tag{8} $$

For 0 ≤ i < |w|, we denote the i-th transition of π resp. π′ by $(q_i, a_i, k_i, q_{i+1})$ resp. $(q'_i, a_i, k'_i, q'_{i+1})$. For every 0 ≤ i ≤ |w|, we have $(q_i, q'_i) \in C$. Moreover, $(q_0, q'_0, \lambda(q_0) - \lambda(q'_0)) \in \mathcal{F}$, and by an induction on i, we can show that for every 0 ≤ i ≤ |w|,

$$ \bigl( q_i,\ q'_i,\ \lambda(q_0) + \theta(\pi(0, i)) - \lambda(q'_0) - \theta(\pi'(0, i)) \bigr) \in \mathcal{F}. $$

The last statement for i = |w|, $q := q_{|w|}$, and $q' := q'_{|w|}$ together with equation (8) proves (2).
(2) ⇒ (1) By induction, we can show that for every triple (q, q′, k) ∈ ℱ there are states p, p′ ∈ I, some word w ∈ Σ∗, and paths $\pi \in p \overset{w}{\leadsto} q$, $\pi' \in p' \overset{w}{\leadsto} q'$ such that

$$ \lambda(p) + \theta(\pi) - \lambda(p') - \theta(\pi') = k. $$

Let (q, q′, k) be as in (2). By considering w, p, p′, π, and π′ as above, we conclude

$$ \lambda(p) + \theta(\pi) + \varrho(q) - \lambda(p') - \theta(\pi') - \varrho(q') \neq 0, $$

and hence, T is not single-valued.
Finally, assume that (∗) is satisfied and let (q, q′, k1) ∈ ℱ and (q, q′, k2) ∈ ℱ such that k1 ≠ k2. Since (q, q′) ∈ C, there are some word w ∈ Σ∗, states p, p′ ∈ F, and paths $\pi \in q \overset{w}{\leadsto} p$, $\pi' \in q' \overset{w}{\leadsto} p'$. By an induction on these paths, we can show that both (p, p′, k1 + θ(π) − θ(π′)) and (p, p′, k2 + θ(π) − θ(π′)) belong to ℱ. Since k1 ≠ k2, one of these triples proves (2).

Condition (∗) in Lemma 4.11 will be crucial for the complexity of an algorithm which explores the set ℱ to decide whether T is single-valued.

Theorem 4.12. Let T be a WFA over the tropical semiring. It is decidable in polynomial time whether T is single-valued.

Proof. At first, the algorithm computes the set C. This is possible in polynomial time since C is the least subset of Q × Q which contains F × F and is closed as follows: for every (q, q′) ∈ C and transitions (p, a, k, q), (p′, a, l, q′) ∈ E, we have (p, p′) ∈ C.
Then, the algorithm generates the set ℱ. Whenever the algorithm produces a new triple in ℱ, it checks whether condition (2) or (∗) in Lemma 4.11 is satisfied. If so, then T is not single-valued. If the algorithm computes the entire set ℱ, and neither condition (2) nor (∗) in Lemma 4.11 is satisfied, then T is single-valued.
The set ℱ is possibly infinite. However, in every subset of ℱ which consists of more than |C| triples, there are two triples as in condition (∗). Hence, the algorithm generates at most |C| + 1 ≤ |Q|² + 1 triples, i.e., it terminates in polynomial time.
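The procedure described in this proof can be sketched as follows. The representation is an assumption made for illustration: transitions are tuples (p, a, k, q), and the dictionaries `lam` and `rho` list only the initial and accepting states together with their finite weights. The sketch stores at most one difference per pair of co-twins and stops as soon as condition (2) or (∗) of Lemma 4.11 is met; it is a naive implementation and makes no claim to match the efficiency of the techniques in [1, 3].

```python
from collections import deque


def co_twins(E, final):
    """Least set of pairs containing final x final, closed backwards under equal letters."""
    C = {(q, r) for q in final for r in final}
    queue = deque(C)
    while queue:
        q, r = queue.popleft()
        for (p, a, _, q2) in E:
            if q2 != q:
                continue
            for (p2, b, _, r2) in E:
                if r2 == r and a == b and (p, p2) not in C:
                    C.add((p, p2))
                    queue.append((p, p2))
    return C


def is_single_valued(E, lam, rho):
    """Decide single-valuedness of a WFA over the tropical semiring."""
    C = co_twins(E, set(rho))
    diff = {}          # pair of co-twins -> the unique weight difference seen so far
    queue = deque()

    def add(q, r, k):
        """Record the triple (q, r, k); return False if it witnesses a violation."""
        if (q, r) in diff:
            return diff[(q, r)] == k             # otherwise condition (*) holds
        if q in rho and r in rho and rho[r] - rho[q] != k:
            return False                         # condition (2) holds
        diff[(q, r)] = k
        queue.append((q, r))
        return True

    for q in lam:
        for r in lam:
            if (q, r) in C and not add(q, r, lam[q] - lam[r]):
                return False
    while queue:
        p, p2 = queue.popleft()
        k = diff[(p, p2)]
        for (s, a, l, q) in E:
            if s != p:
                continue
            for (s2, b, l2, r) in E:
                if s2 == p2 and a == b and (q, r) in C:
                    if not add(q, r, k + l - l2):
                        return False
    return True


# Two hypothetical automata that differ in a single transition weight.
E_OK = [(0, "a", 1, 0), (0, "a", 1, 1), (1, "a", 1, 1)]
E_BAD = [(0, "a", 1, 0), (0, "a", 2, 1), (1, "a", 1, 1)]
LAM = {0: 0}
RHO = {0: 0, 1: 0}
print(is_single_valued(E_OK, LAM, RHO))
print(is_single_valued(E_BAD, LAM, RHO))
```

Each pair of co-twins enters the work queue at most once, which reflects the bound of |C| + 1 generated triples stated in the proof.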

Acknowledgment
We gratefully acknowledge the work of the unknown referees and the editors which resulted in
improvements of this paper.

References
[1] C. Allauzen and M. Mohri. Efficient algorithms for testing the twins property. Journal of Automata,
Languages and Combinatorics, 8(2):117–144, 2003.
[2] M.-P. Béal and O. Carton. Computing the prefix of an automaton. R.A.I.R.O. - Informatique Théorique et
Applications, 34(6):503–514, 2000.
[3] M.-P. Béal, O. Carton, C. Prieur, and J. Sakarovitch. Squaring transducers. An efficient procedure for
deciding functionality and sequentiality. Theoretical Computer Science, 292(1):45–63, 2003.
[4] J. Berstel. Transductions and Context-Free Languages. B. G. Teubner, Stuttgart, 1979.
[5] J. Berstel and C. Reutenauer. Rational Series and Their Languages, volume 12 of EATCS Monographs on
Theoretical Computer Science. Springer-Verlag, Berlin Heidelberg New York, 1984.
[6] A.L. Buchsbaum, R. Giancarlo, and J.R. Westbrook. On the determinization of weighted finite automata.
SIAM Journal of Computing, pages 1502–1531, 2000.
[7] C. Choffrut. Une caracterisation des fonctions sequentielles et des fonctions sous-sequentielles en tant que
relations rationnelles. Theoretical Computer Science, 5(3):325–337, 1977.
[8] C. Choffrut. Minimizing subsequential transducers: A survey. Theoretical Computer Science, 292(1):131–143,
2003.
[9] K. Culik II and J. Kari. Image compression using weighted finite automata. Computer & Graphics, 17:305–
313, 1993.
[10] M. Droste and P. Gastin. On aperiodic and star-free formal power series in partially commuting variables.
In D. Krob, A.A. Mikhalev, and A.V. Mikhalev, editors, Formal Power Series and Algebraic Combinatorics,
12th Int. Conf., pages 158 – 169. Springer-Verlag, Berlin, 2000.
[11] S. Eilenberg. Automata, Languages, and Machines, Vol. A. Academic Press, New York, 1974.
[12] L. Fuchs. Teilweise geordnete algebraische Strukturen. Vandenhoeck & Ruprecht, Göttingen, 1966.
[13] J. S. Golan. Semirings and Their Applications. Kluwer Academic Publishers, 1999.
[14] U. Hafner. Low Bit-Rate Image and Video Coding with Weighted Finite Automata. PhD thesis, Universität
Würzburg, 1999.
[15] T. Harju, H.C.M. Kleijn, and M. Latteux. Compositional representations of rational functions. R.A.I.R.O.
- Informatique Théorique et Applications, 26:243–255, 1992.
[16] W.C. Holland and J. Martinez, editors. Ordered Algebraic Structures. Kluwer Academic Publishers, 1997.
[17] Z. Jiang, B. Litov, and O. de Vel. Similarity enrichments in image compression through weighted finite
automata. In COCOON’00 Proceedings, volume 1858 of Lecture Notes in Computer Science, pages 447–456.
Springer-Verlag, Berlin, 2000.
[18] F. Katritzke. Refinements of Data Compression using Weighted Finite Automata. PhD thesis, Universität
Siegen, 2001.
[19] R. Klemm and A. Weber. Economy of description for single-valued transducers. Information and Computa-
tion, 118(2):327–340, 1995.
[20] I. Klimann, S. Lombardy, J. Mairesse, and C. Prieur. Deciding unambiguity and sequentiality from a finitely
ambiguous max-plus automaton. Theoretical Computer Science, 327(3):349–373, 2004.
[21] D. Krob. Some consequences of a Fatou property of the tropical semiring. Journal of Pure and Applied
Algebra, 93:231–249, 1994.
[22] W. Kuich. Semirings and formal power series. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal
Languages, Vol. 1, Word, Language, Grammar, pages 609–677. Springer-Verlag, Berlin, 1997.
[23] B. Mahr. Iteration and summability in semirings. Annals of Discrete Mathematics, 19:229–256, 1984.
[24] I. Mäurer. Zur Minimalisierung und Determinisierung von sequentiellen Transducern. Master’s thesis,
Technische Universität Dresden, Institut für Algebra, 2002.
[25] M. Mohri. Finite-state transducers in language and speech processing. Computational Linguistics, 23:269–
311, 1997.
[26] C. Reutenauer. A survey on noncommutative rational series. DIMACS Series in Discrete Mathematics and
Theoretical Computer Science, 24:159–169, 1996.
[27] J. Sakarovitch. A construction on finite automata that has remained hidden. Theoretical Computer Science,
204:205–231, 1998.

[28] A. Salomaa and M. Soittola. Automata-Theoretic Aspects of Formal Power Series. Texts and Monographs
on Computer Science. Springer-Verlag, Berlin Heidelberg New York, 1978.
[29] M. Schützenberger. Sur les relations rationnelles entre monoïdes libres. Theoretical Computer Science, 3:243–259, 1976.
[30] I. Simon. On semigroups of matrices over the tropical semiring. Informatique Théorique et Applications,
28:277–294, 1994.
[31] A. Weber. Decomposing a k-valued transducer into k unambiguous ones. Informatique Théorique et Appli-
cations, 30:379–413, 1996.
