Kirsten Determinization
Abstract
In this paper, we generalize an algorithm and some related results by Mohri [25] for determinization
of weighted finite automata (WFA) over the tropical semiring. We present the
underlying mathematical concepts of his algorithm in a precise way for arbitrary semirings.
We define a class of semirings in which we can show that the twins property is sufficient for
the termination of the algorithm. We also introduce single-valued WFA and give a partial
correction of a claim by Mohri [25] by showing several characterizations of single-valued
WFA, e.g., the formal power series computed by a single-valued WFA is subsequential iff it
has bounded variation. Also, it is decidable in polynomial time whether a given WFA over
the tropical semiring is single-valued.
1 Introduction
Weighted finite automata are of great theoretical and practical interest in computer science.
They play a crucial role in the structure theory of recognizable languages in free monoids and
trace monoids. However, weighted finite automata also have practical applications in speech
recognition and image compression [6, 9, 14, 17, 18, 25]. The behaviour of a weighted finite
automaton, for short WFA, can be described as a formal power series, i.e., a mapping from a
free monoid into some semiring.
Although WFA were already studied in the seventies [4, 7, 11], there are many recent articles
which focus mainly on two streams: WFA over the min–plus (tropical) and max–plus semirings
and string-to-string transducers [2, 3, 8, 20].
In contrast to unweighted automata, there are WFA which do not admit a subsequential
(deterministic) equivalent. However, Mohri developed an algorithm which determinizes WFA
over the tropical semiring [25] and which is implemented within the AT&T FSM Library™. This
algorithm is not perfect, e.g., there are WFA on which Mohri’s algorithm does not terminate
although subsequential equivalents exist. Nevertheless, his algorithm is very successful on
WFA which occur in speech recognition. Mohri proves that in the tropical semiring the twins
property is a sufficient condition for the termination of his algorithm [25].
Mohri develops his ideas for WFA over the tropical semiring. Here we wish to investigate
generalizations of his results to other semirings.
∗ Corresponding author.
† Supported by the research grant KI 822/1–1 of the German Research Community (Deutsche Forschungsgemeinschaft).
‡ Supported by the GK 334 of the German Research Community (Deutsche Forschungsgemeinschaft) and the Freistaat Sachsen.
2 Prerequisites
2.1 Basic Definitions and Notations
A semiring (K, ⊕, ⊗, 0, 1) consists of a set K together with two binary operations ⊕ and ⊗ such
that (K, ⊕, 0) is a commutative monoid, (K, ⊗, 1) is a monoid, ⊗ distributes over ⊕,
and 0 acts as a zero for all elements. Some k ∈ K, k ≠ 0 is called a zero-divisor if there is some
l ∈ K, l ≠ 0 such that k ⊗ l = 0 or l ⊗ k = 0. If K does not have zero-divisors, then K is called
zero-divisor-free. A semiring K is called idempotent if k ⊕ k = k for every k ∈ K.
We call K′ ⊆ K a subsemiring of K if K′ is closed under ⊕ and ⊗ and 0, 1 ∈ K′. For every
K′ ⊆ K, we call the closure of the set K′ ∪ {0, 1} under ⊕ and ⊗ the subsemiring generated by K′
and denote it by ⟨K′⟩. If ⟨K′⟩ is finite for every finite set K′ ⊆ K, then K is called locally finite [10].
Important examples of locally finite semirings are the semirings of the form (K, min, max, 0, 1)
where min and max are defined by some total order on a set K.
By (N, +, ·, 0, 1) we denote the semiring over N = {0, 1, . . . } with addition and multiplication of
natural numbers. A semiring often used is the tropical semiring T = (R₊ ∪ {∞}, min, +, ∞, 0).
We denote by (B, ∨, ∧, false, true) the Boolean semiring. It consists of the set B = {true, false}
with logical disjunction and conjunction.
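The semirings named above can be written down concretely. The following is a minimal sketch, not part of the original development; the names `Semiring`, `NATURAL`, `TROPICAL`, `BOOLEAN` and `is_idempotent_on` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A semiring given by its two operations and its two neutral elements."""
    plus: Callable[[Any, Any], Any]   # the addition of the semiring
    times: Callable[[Any, Any], Any]  # the multiplication of the semiring
    zero: Any                         # neutral for plus, absorbing for times
    one: Any                          # neutral for times


# (N, +, ·, 0, 1)
NATURAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)
# Tropical semiring: min as addition, + as multiplication, zero is infinity.
TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0.0)
# Boolean semiring: disjunction and conjunction.
BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)


def is_idempotent_on(s: Semiring, samples) -> bool:
    """Check k + k = k on finitely many sample elements only."""
    return all(s.plus(k, k) == k for k in samples)
```

On sample values, the tropical and Boolean semirings pass the idempotency check, while (N, +, ·, 0, 1) does not, since 1 + 1 ≠ 1.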
Let ∆ be some finite alphabet with a total order ≤. We extend ≤ to ∆∗ as follows. Firstly,
for u, v ∈ ∆∗, we set u ≤ v if u is shorter than v. Secondly, for a, b ∈ ∆ and u, v, w ∈ ∆∗
satisfying a < b and |u| = |v|, we set wau ≤ wbv. By defining an operation min on the free
monoid ∆∗ by ≤, we obtain a semiring (∆∗ ∪ {0}, min, ·, 0, ε), where 0 denotes a new maximal
element.
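The order on ∆∗ described above compares words first by length and then letter by letter. A small sketch, under the assumption that words are Python strings and that `None` stands for the new maximal element 0 (both are illustrative choices, as are the function names):

```python
def shortlex_key(word: str, alphabet: str = "ab"):
    """Sort key: shorter words first, then letter by letter in alphabet order."""
    return (len(word), [alphabet.index(c) for c in word])


def word_min(u, v, alphabet: str = "ab"):
    """The operation min of the string semiring; None is the maximal element 0."""
    if u is None:
        return v
    if v is None:
        return u
    if shortlex_key(u, alphabet) <= shortlex_key(v, alphabet):
        return u
    return v
```

For instance, `word_min("b", "aa")` returns `"b"` because the shorter word is smaller, and `word_min("ab", "aa")` returns `"aa"`.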
Mappings from a free monoid to some semiring K are called formal power series. For a
semiring K and a finite set Q, we denote by MQ×Q(K) the set of all Q×Q-matrices over K. We
denote by M1×Q(K) and MQ×1(K) the set of all row resp. column matrices. For a row matrix
λ ∈ M1×Q(K) (correspondingly for column matrices), we write λ[q] for the q-th entry of λ. For
a ∈ K, we abbreviate the element (a, . . . , a) ∈ M1×Q(K) by aQ. We define the multiplication of
matrices by the operations of K, as usual. We use ⇢ for partial and → for total mappings. As
usual, we define the sum over the empty set, ⊕ ∅, as the zero 0 of K.
¹ This claim is corrected in the electronic version of [25] on Mohri’s homepage.
2.2 Weighted finite automata
|T|(w) := λ θ(w) ϱ
for every w ∈ Σ∗. We call two WFA T and T′ equivalent if |T| = |T′|, i.e., if |T|(w) = |T′|(w)
for every w ∈ Σ∗. If some formal power series f : Σ∗ → K is computed by a WFA, then we call
f recognizable.
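The product λ θ(w) ϱ can be evaluated letter by letter. The sketch below does this over the tropical semiring, where matrix multiplication uses min and +. The two-state automaton given by `lam`, `theta` and `rho` is a hypothetical example chosen for illustration: its first state counts the letters a and its second state counts the letters b.

```python
import math

INF = math.inf  # the zero of the tropical semiring


def row_times_matrix(row, matrix):
    """Tropical product of a row vector and a square matrix (min of sums)."""
    n = len(matrix[0])
    return [min(row[p] + matrix[p][q] for p in range(len(row))) for q in range(n)]


def behaviour(lam, theta, rho, word):
    """Compute lam * theta(word) * rho over the tropical semiring."""
    row = list(lam)
    for a in word:
        row = row_times_matrix(row, theta[a])
    return min(r + f for r, f in zip(row, rho))


# Hypothetical WFA: state 0 pays 1 per 'a', state 1 pays 1 per 'b'.
lam = [0.0, 0.0]
rho = [0.0, 0.0]
theta = {
    "a": [[1.0, INF], [INF, 0.0]],
    "b": [[0.0, INF], [INF, 1.0]],
}
```

With these assumed matrices, `behaviour(lam, theta, rho, w)` is the minimum of the number of occurrences of a and of b in w.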
Now we give an equivalent method to define the behaviour of a WFA. Again let T =
[Q, E, λ, ϱ] be a WFA. We can regard λ and ϱ as mappings λ, ϱ : Q → K such that for every
q ∈ Q, we have λ(q) = λ[q] and ϱ(q) = ϱ[q].
Let n ≥ 1. A path π of length n is a sequence
π = (q₀, a₀, k₀, q₁)(q₁, a₁, k₁, q₂) . . . (qₙ₋₁, aₙ₋₁, kₙ₋₁, qₙ)
of transitions in E. The word a₀ . . . aₙ₋₁ is called the label of π. We say that π starts at q₀ and
ends at qₙ. We define σ(π) := k₀ ⊗ k₁ ⊗ · · · ⊗ kₙ₋₁, the weight of π. We assume that for every
q ∈ Q there is a path of length 0 which starts and ends at q, is labeled with ε and weighted with
1. For every p, q ∈ Q and every w ∈ Σ∗, we denote by $p \overset{w}{\leadsto} q$ the set of all paths with label w
which start at p and end at q. Then for every w ∈ Σ∗ one can show
\[
|T|(w) \;=\; \bigoplus_{p,q \in Q,\ \pi \in p \overset{w}{\leadsto} q} \lambda(p) \otimes \sigma(\pi) \otimes \varrho(q)
\;=\; \bigoplus_{p,q \in Q} \lambda(p) \otimes \Bigl( \bigoplus_{\pi \in p \overset{w}{\leadsto} q} \sigma(\pi) \Bigr) \otimes \varrho(q) .
\]
For every q ∈ Q, we call λ(q) resp. ϱ(q) the initial weight resp. terminal weight of q. Let
I := { q ∈ Q | λ(q) ≠ 0 } and F := { q ∈ Q | ϱ(q) ≠ 0 }. We call the states in I resp. F the initial
states resp. accepting states of T. Let π = (q₀, a₀, k₀, q₁)(q₁, a₁, k₁, q₂) . . . (qₙ₋₁, aₙ₋₁, kₙ₋₁, qₙ)
be a path. Then π is successful if q₀ ∈ I and qₙ ∈ F. For every 0 ≤ i ≤ j ≤ n, we let
π(i, j) := (qᵢ, aᵢ, kᵢ, qᵢ₊₁) . . . (qⱼ₋₁, aⱼ₋₁, kⱼ₋₁, qⱼ)
be the subpath of π from qᵢ to qⱼ. If π and π′ are paths such that π′ starts at the same state
where π ends, then we denote by ππ′ the concatenation of π and π′. For every p, q, r ∈ Q and
u, v ∈ Σ∗, we write the concatenation of $p \overset{u}{\leadsto} q$ and $q \overset{v}{\leadsto} r$ as $p \overset{u}{\leadsto} q \overset{v}{\leadsto} r$. For subsets P, R ⊆ Q,
we denote by $P \overset{w}{\leadsto} R$ the union of the sets $p \overset{w}{\leadsto} r$ for every p ∈ P, r ∈ R.
A subsequential WFA is a tuple T = [Q, δ, σ, q₀, k₀, ϱ] such that:
• Q is a finite set of states,
• δ : Q × Σ ⇢ Q and σ : Q × Σ ⇢ K are partial mappings such that for every q ∈ Q,
a ∈ Σ: δ(q, a) is defined iff σ(q, a) is defined,
• q₀ ∈ Q, k₀ ∈ K,
• ϱ : Q → K is a mapping.
A subsequential formal power series is a formal power series which is recognized by a subsequen-
tial WFA. It is easy to transform a subsequential WFA T into a WFA computing |T |.
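Reading a word with a subsequential WFA amounts to following the unique run and multiplying the initial weight, the transition weights and the terminal weight. A minimal sketch, assuming that δ and σ are given as Python dictionaries keyed by (state, letter) and defaulting to the tropical semiring; the function name and parameters are illustrative:

```python
import math


def run_subsequential(delta, sigma, q0, k0, rho, word,
                      times=lambda a, b: a + b, zero=math.inf):
    """Weight assigned to `word` by a subsequential WFA.

    delta, sigma: dicts keyed by (state, letter); a missing key means 'undefined'.
    times, zero: multiplication and zero of the semiring (tropical by default).
    """
    state, weight = q0, k0
    for a in word:
        if (state, a) not in delta:
            return zero  # no run on this word
        weight = times(weight, sigma[(state, a)])
        state = delta[(state, a)]
    return times(weight, rho[state])
```

Since there is at most one run per word, no addition of the semiring is needed.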
A formal power series from Σ∗ to B can be considered as a subset of Σ∗ or as a formal
language over Σ. In the same way, recognizable formal power series from Σ∗ to B can be
considered as recognizable languages. We call a subsequential WFA over the Boolean semiring
B a deterministic finite automaton.
3 Determinization
In this section, we deal with several approaches to determinize WFA. We explain a straightfor-
ward idea, Mohri’s algorithm, and a generalization of Mohri’s algorithm to arbitrary semirings.
This generalization utilizes so-called factorizations which depend on the semiring. We intro-
duce the notion of a maximal factorization and show that maximal factorizations are in some
sense optimal in comparison to arbitrary factorizations. This result is of practical importance:
whenever one implements the determinization algorithm, one should prefer a maximal factor-
ization. In Section 3.6, we will define min-semirings and show that for min-semirings the twins
property is sufficient for the existence of a determinization of a WFA by our generalization using
a maximal factorization.
Throughout this section, let K be a semiring and let Σ be an alphabet.
The set of states Q′ is the least subset of M1×Q(K) which contains λ and is closed under
multiplication with the matrices θ(a) for every a ∈ Σ. We use λ from T as the initial state of T′ and let
k₀ := 1. For every letter a ∈ Σ and every u ∈ Q′, we set δ(u, a) := u θ(a), σ(u, a) := 1, and
ϱ′(u) := u ϱ.
The drawback of this idea is that Q′ might be infinite. If K is finite, or more generally, if K is
locally finite, the construction shows that for every WFA there exists a subsequential equivalent.
In the Boolean semiring, every matrix in M1×Q(B) represents a subset of Q. Hence,
in the Boolean semiring, this idea is just the classical determinization of WFA by a
power set construction.
One can slightly modify this idea by leaving δ(u, a) and σ(u, a) undefined if u θ(a) = 0Q.
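For the Boolean semiring, the idea can be sketched as the familiar power set construction, where a row vector over B is represented by the set of states with entry true. The encoding of transitions as triples (p, a, q) and the function name are assumptions made for this illustration:

```python
def determinize_boolean(initial, transitions, alphabet):
    """Power set construction; a state is the frozenset of states with entry 'true'."""
    start = frozenset(initial)
    states, todo, delta = {start}, [start], {}
    while todo:
        u = todo.pop()
        for a in alphabet:
            v = frozenset(q for (p, b, q) in transitions if b == a and p in u)
            if not v:
                continue  # u * theta(a) is the zero vector: leave delta undefined
            delta[(u, a)] = v
            if v not in states:
                states.add(v)
                todo.append(v)
    return states, delta
```

The loop terminates because there are only finitely many subsets of Q, which mirrors the argument for locally finite semirings.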
Let (f, g) be a factorization. For every u ∈ M1×Q(K) \ {0Q}, we have f(u) ≠ 0Q and g(u) ≠ 0.
If for every u ∈ M1×Q(K) \ {0Q} we have f(u) = u and g(u) = 1, then we call (f, g) the trivial
factorization. Every semiring K admits a factorization, namely the trivial factorization.
Let T = [Q, E, λ, ϱ] be a WFA over K, and let (f, g) be a factorization of dimension Q. We
may assume that λ ≠ 0Q. We define:
• δ̃ : M1×Q(K) × Σ ⇢ M1×Q(K),
δ̃(u, a) := f(u θ(a)), for every u ∈ M1×Q(K), a ∈ Σ with u θ(a) ≠ 0Q,
• σ̃ : M1×Q(K) × Σ ⇢ K,
σ̃(u, a) := g(u θ(a)), for every u ∈ M1×Q(K), a ∈ Σ with u θ(a) ≠ 0Q,
• ϱ̃ : M1×Q(K) → K,
ϱ̃(u) := u ϱ, for every u ∈ M1×Q(K).
Let Q̃ be the least subset of M1×Q(K) which contains f(λ) and is closed under δ̃, i.e., for every
u ∈ Q̃ and a ∈ Σ, we have δ̃(u, a) ∈ Q̃ provided that δ̃(u, a) is defined. If Q̃ is finite, then we
call
T̃ := [Q̃, δ̃|Q̃×Σ, σ̃|Q̃×Σ, f(λ), g(λ), ϱ̃|Q̃]
the determinization of T by (f, g). If Q̃ is infinite, then the determinization of T by (f, g) is not
defined. To avoid some inconvenient technical details, we say that the determinization of T by
(f, g) is not defined if λ = 0Q.
Let us mention that Mohri defines his algorithm using the factorization
\[
g(u) := \bigoplus_{q \in Q} u[q] \quad\text{and}\quad f(u) = g(u)^{-1} \otimes u
\]
for every u ∈ M1×Q(K) \ {0Q} (cf. [25, p. 285]). The main problem is that it is left open how
to interpret the power −1 in general semirings. For example, consider the semiring (N, +, ·, 0, 1),
let Q = {q₁, q₂} and u = (4, 10), i.e., g(u) = 14. Regardless of how we interpret 14⁻¹, we cannot
define f(u) in a way that the key property g(u) · f(u) = u, i.e., 14 · f(u) = (4, 10), is satisfied.
If (f, g) is the trivial factorization, then the determinization of T by (f, g) is the WFA which
we obtain by the idea sketched in Section 3.1. If K is locally finite then the determinization of
T with respect to the trivial factorization is defined.
If K is the tropical semiring T, and we set g(u) = min(u) and f (u) = − min(u) + u for
u ∈ M1×Q (T) \ {∞Q }, then the determinization of T by (f, g) yields the same as in Section 3.2.
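For the tropical factorization g(u) = min(u), f(u) = −min(u) + u, the construction can be sketched as a worklist algorithm. This is an illustration under stated assumptions, not the implementation of [25]: σ̃(u, a) is taken to be g(u θ(a)), as used in the proof of Theorem 3.1, finite weights are assumed to be exact (for instance integers), and the parameter `max_states` is a hypothetical safeguard because the set of states may be infinite.

```python
import math

INF = math.inf


def factorize(u):
    """Tropical factorization: g(u) = min(u), f(u) = u - min(u) entrywise."""
    g = min(u)
    return tuple(x - g for x in u), g


def determinize_tropical(lam, theta, rho, max_states=1000):
    """Worklist construction of the states reachable from f(lam).

    lam, rho: sequences of weights; theta: dict letter -> square matrix.
    Assumes lam is not the all-infinity vector.
    """
    start, k0 = factorize(lam)
    delta, sigma, final = {}, {}, {}
    seen, todo = {start}, [start]
    while todo:
        u = todo.pop()
        final[u] = min(x + r for x, r in zip(u, rho))
        for a, m in theta.items():
            v = tuple(min(u[p] + m[p][q] for p in range(len(u)))
                      for q in range(len(u)))
            if all(x == INF for x in v):
                continue  # u * theta(a) is the zero vector: undefined
            fv, gv = factorize(v)
            delta[(u, a)], sigma[(u, a)] = fv, gv
            if fv not in seen:
                if len(seen) >= max_states:
                    raise RuntimeError("state bound exceeded; may not terminate")
                seen.add(fv)
                todo.append(fv)
    return seen, delta, sigma, start, k0, final
```

If two initial states carry a-loops of the same weight, the construction stops after one state. If the loop weights differ (1 and 2, say), the residual vectors (0, 1), (0, 2), (0, 3), . . . are pairwise distinct and the bound is hit, although the computed series n ↦ n is subsequential.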
Theorem 3.1. Let K be an arbitrary semiring, T = [Q, E, λ, ϱ] be a WFA over K and (f, g)
be a factorization of dimension Q. If the determinization of T by (f, g) is defined, then it is
equivalent to T.
Proof. Let T̃ be the determinization of T by (f, g). We denote the initial state of T̃ by ũ, i.e.,
ũ := f(λ). Let w ∈ Σ∗ be arbitrary. We show the following two assertions by an induction on
the length of w ∈ Σ∗.
1. If δ̃(ũ, w) is defined, then λ θ(w) = g(λ) ⊗ σ̃(ũ, w) ⊗ δ̃(ũ, w).
2. If δ̃(ũ, w) is not defined, then λ θ(w) = 0Q.
Case 1: δ̃(ũ, wa) is defined. This implies that σ̃(ũ, wa), δ̃(ũ, w) and σ̃(ũ, w) are defined. We
only have to consider (1), because (2) is obviously true. Applying the definition of δ̃ and σ̃, that
(f, g) is a factorization and the inductive hypothesis, we obtain
g(λ) ⊗ σ̃(ũ, wa) ⊗ δ̃(ũ, wa) = g(λ) ⊗ σ̃(ũ, w) ⊗ σ̃(δ̃(ũ, w), a) ⊗ δ̃(δ̃(ũ, w), a)
= g(λ) ⊗ σ̃(ũ, w) ⊗ g(δ̃(ũ, w) θ(a)) ⊗ f(δ̃(ũ, w) θ(a))
= g(λ) ⊗ σ̃(ũ, w) ⊗ δ̃(ũ, w) θ(a) = λ θ(w) θ(a) = λ θ(wa).
Case 2.2: δ̃(ũ, w) is not defined. By the inductive hypothesis, we have λ θ(w) = 0Q. Thus,
we conclude
λ θ(wa) = λ θ(w) θ(a) = 0Q θ(a) = 0Q.
Now, it is easy to show that T and T̃ are equivalent. Let w ∈ Σ∗ be arbitrary. If δ̃(ũ, w) is not
defined, then |T̃|(w) = 0. By (2), we have λ θ(w) = 0Q, and thus, |T|(w) = 0.
If δ̃(ũ, w) is defined, then by (1) we obtain
|T̃|(w) = g(λ) ⊗ σ̃(ũ, w) ⊗ ϱ̃(δ̃(ũ, w)) = g(λ) ⊗ σ̃(ũ, w) ⊗ δ̃(ũ, w) ϱ = λ θ(w) ϱ = |T|(w).
Lemma 3.2. Let K be a semiring, T = [Q, E, λ, ϱ] be a WFA and (f, g) be a maximal factorization.
Let u ∈ M1×Q(K) and w ∈ Σ⁺ be a non-empty word satisfying u θ(w) ≠ 0Q. Then,
δ̃(u, w) is defined and we have δ̃(u, w) = f(u θ(w)) = δ̃(f(u), w).
Proof. At first, we show by an induction on the length of w that δ̃(u, w) is defined and the first
equation is true. If w is a letter, then this equation is the definition of δ̃. So let w ∈ Σ⁺ and
a ∈ Σ satisfy u θ(wa) ≠ 0Q, i.e., u θ(w) θ(a) ≠ 0Q. By induction, δ̃(u, w) is defined and
the first equation is true. Consequently,
δ̃(u, wa) = δ̃(δ̃(u, w), a) = δ̃(f(u θ(w)), a) = f(f(u θ(w)) θ(a)).
Because (f, g) is maximal, we can multiply the argument of the outermost occurrence of f by
g(u θ(w)):
f(f(u θ(w)) θ(a)) = f(g(u θ(w)) ⊗ f(u θ(w)) θ(a)) = f(u θ(w) θ(a)) = f(u θ(wa)).
We show the second equation. Because u θ(w) = g(u) ⊗ f(u) θ(w) and u θ(w) ≠ 0Q, we
have f(u) θ(w) ≠ 0Q, i.e., δ̃(f(u), w) is defined. By the first equation,
δ̃(u, w) = f(u θ(w)) = f(g(u) ⊗ f(u) θ(w)) = f(f(u) θ(w)) = δ̃(f(u), w).
If we consider Lemma 3.2 for w = ε, we see δ̃(u, ε) = u but f (u θ(ε)) = δ̃(f (u), ε) = f (u).
Hence, both equations in Lemma 3.2 hold for w = ε iff u = f (u).
The following theorem shows that maximal factorizations are optimal in comparison to other
factorizations if the semiring K is zero-divisor free, or more generally, if for every u ∈ M1×Q (K) \
{0Q }, g(u) is not a zero-divisor.
Theorem 3.3. Let T = [Q, E, λ, ϱ] be a WFA over K satisfying λ ≠ 0Q. Let (f, g) be a
maximal factorization of dimension Q and assume that for every u ∈ M1×Q(K) \ {0Q}, g(u) is
not a zero-divisor. Let (f′, g′) be an arbitrary factorization of dimension Q.
1. If the determinization of T by (f′, g′) exists, then the determinization of T by (f, g) exists.
2. Assume that the determinization of T by (f′, g′) exists, and let Q′ be the set of its states.
Let Q̃ be the set of states of the determinization of T by (f, g).
We have Q̃ = f(Q′), and thus, |Q̃| ≤ |Q′|.
Proof. At first, note that for every u ∈ M1×Q(K) \ {0Q} we have g′(u) ⊗ f′(u) = u = g(u) ⊗ f(u)
and, since (f, g) is maximal, f(f′(u)) = f(u) = f(f(u)). Consider the following mappings:
• δ′ : M1×Q(K) × Σ ⇢ M1×Q(K),
δ′(u, a) := f′(u θ(a)), for every u ∈ M1×Q(K), a ∈ Σ satisfying u θ(a) ≠ 0Q, and
Let Q′ be the least subset of M1×Q(K) which contains f′(λ) and is closed under δ′, i.e., for every
u ∈ Q′ and every a ∈ Σ, we have δ′(u, a) ∈ Q′ provided that δ′(u, a) is defined. Let Q̃ be the
least subset of M1×Q(K) which contains f(λ) and is closed under δ̃ in the same sense. To prove
(1) and (2), we show Q̃ = f(Q′). As mentioned above, we have f(f′(λ)) = f(λ). To show
Q̃ = f(Q′), we derive the following two assertions for every u ∈ M1×Q(K) \ {0Q} and every
a ∈ Σ.
a. δ′(u, a) is defined iff δ̃(f(u), a) is defined.
We have u θ(a) = g(u) ⊗ f(u) θ(a). Because g(u) ≠ 0, and g(u) is not a zero-divisor, we
have u θ(a) = 0Q iff f(u) θ(a) = 0Q. This proves (a).
Now, assume that δ′(u, a) and δ̃(f(u), a) are defined, i.e., u θ(a) ≠ 0Q. We show:
= f(f(u) θ(a)) = δ̃(f(u), a).
If both (f, g) and (f′, g′) are maximal factorizations, then f and f′ are mutually inverse “bijections”
between the determinizations of T by (f, g) and by (f′, g′). Hence, all determinizations
of some WFA T by maximal factorizations yield the same subsequential WFA up to renaming
the states.
Note that by Theorem 3.3 the determinization by some maximal factorization is optimal
among determinizations by arbitrary factorizations. It is possible that the determinization of
a WFA T by some maximal factorization (f, g) does not exist, even if there is an equivalent
subsequential WFA. If the determinization of T by some maximal factorization exists, then it is
not necessarily the smallest subsequential WFA equivalent to T. It is quite possible that there
are better approaches to determinize WFA which are entirely different from our generalization of
Mohri’s approach. For example, in the semiring (Z ∪ {∞}, min, +, ∞, 0), where Z denotes the
set of integers, an algorithm can enumerate an infinite list of all subsequential WFA and check
whether there is an equivalent WFA in the list.² This algorithm is far from being practical, but
it terminates iff an equivalent WFA exists.
3.5 Examples of maximal factorizations
numbers, · is the usual multiplication of natural numbers and ∞ denotes a new maximal element.
Let Q = {1, 2}. By contradiction, assume that K admits a maximal factorization (f, g) of
dimension Q. Let u = (2 · 3 · 5 · 7, 2 · 3 · 5 · 11). Since 2 · 3 · (5 · 7, 5 · 11) = u = 2 · 5 · (3 · 7, 3 · 11), we
have f ((5 · 7, 5 · 11)) = f (u) = f ((3 · 7, 3 · 11)). However, g((5 · 7, 5 · 11)) is a common divisor
of 5 · 7 and 5 · 11 in K, i.e., g((5 · 7, 5 · 11)) = 1, and hence, f ((5 · 7, 5 · 11)) = (5 · 7, 5 · 11). In
the same way, we can derive f ((3 · 7, 3 · 11)) = (3 · 7, 3 · 11) which is a contradiction.
In his proof, Mohri uses several conditions which cannot be applied to arbitrary semirings.
Following [20, 30], some path π = (q₀, a₀, k₀, q₁) . . . (qₙ₋₁, aₙ₋₁, kₙ₋₁, qₙ) is called victorious if
we have
σ(π) = θ(a₀ . . . aₙ₋₁)[q₀, qₙ].
2. Let π be a victorious path. For every 0 ≤ i ≤ j ≤ |π| the path π(i, j) is victorious.
If we want to generalize Mohri’s result to a larger class of semirings, then we have to take care
of these conditions. Suppose that there are exactly two paths π, π′ ∈ $p \overset{w}{\leadsto} q$ in condition (1),
and σ(π) = k and σ(π′) = l. Then, θ(w)[p, q] = k ⊕ l. To assume the existence of a victorious
path, we need either θ(w)[p, q] = σ(π) or θ(w)[p, q] = σ(π′), i.e., we need k ⊕ l ∈ {k, l}. Hence,
we restrict ourselves to semirings K satisfying k ⊕ l ∈ {k, l} for every k, l ∈ K.
This property has an important consequence.
Lemma 3.4. Let K be a semiring. The following three conditions are equivalent.
2. There is a linear order relation ≤ on K such that ⊕ is the minimum with respect to ≤.
Proof. (3) ⇒ (2) and (2) ⇒ (1) are obvious. We show (1) ⇒ (3). For every k, l ∈ K let
k ≤ l :⇐⇒ k ⊕ l = k.
k₁ ⊕ k₃ = (k₁ ⊕ k₂) ⊕ k₃ = k₁ ⊕ (k₂ ⊕ k₃) = k₁ ⊕ k₂ = k₁,
i.e., k₁ ≤ k₃. Thus, ≤ is an ordering, and it is easy to see that ⊕ is the minimum with respect
to ≤. Assertion (3b) follows from the distributivity of ⊗ over ⊕.
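The property k ⊕ l ∈ {k, l} and the relation k ≤ l :⇐⇒ k ⊕ l = k can be tested on sample elements. The helper names below are hypothetical, and a check on finitely many samples is of course only a sanity test, not a proof.

```python
def is_min_like(plus, samples) -> bool:
    """Check that k + l is one of k, l for all sampled pairs."""
    return all(plus(k, l) in (k, l) for k in samples for l in samples)


def leq(plus, k, l) -> bool:
    """The relation induced by the addition: k <= l iff k + l = k."""
    return plus(k, l) == k
```

The tropical addition min satisfies the property. Ordinary addition on N does not, and neither does set union, which is idempotent but only induces a partial order.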
we can assume that there is some path π ∈ $p \overset{w}{\leadsto} q$ such that σ(π) = θ(w)[p, q] provided that
θ(w)[p, q] ≠ 0 and K is a min-semiring.
Now, we take care of Mohri’s second condition. Unfortunately, this condition is not true in
general, even if K is a min-semiring. We can avoid this condition in our proof of Theorem 3.5,
below. However, due to this problem, we cannot prove Theorem 3.5 by slight adjustments to
Mohri’s proof.
Theorem 3.5. Let K be a commutative min-semiring and let T = [Q, E, λ, ϱ] be a WFA satisfying
λ ≠ 0Q. Let (f, g) be a maximal factorization. If T has the twins property, then the
determinization of T by (f, g) is defined.
Proof. Let δ̃ : M1×Q(K) × Σ ⇢ M1×Q(K) and Q̃ be defined as in Section 3.3. We prove the
theorem by showing that Q̃ is finite. Our strategy is to construct a finite subset K′ ⊆ K such
that for every w ∈ Σ∗ we have δ̃(f(λ), w) ∈ f(M1×Q(K′)) if δ̃(f(λ), w) is defined.
Let $w = a_0 \dots a_{|w|-1}$ ∈ Σ∗ such that δ̃(f(λ), w) is defined. We derive information on λ θ(w).
Let Q′ := { p ∈ Q | (λ θ(w))[p] ≠ 0 }. We have Q′ ≠ ∅. For every p ∈ Q′, let $q_{0,p}$ ∈ Q and
$\pi_p \in q_{0,p} \overset{w}{\leadsto} p$ be a path such that
Since K is a min-semiring, there is a path ν which starts and ends at the same states as
$\pi_p(i_{2l-1}, i_{2l})$, has the same label as $\pi_p(i_{2l-1}, i_{2l})$ and $\theta(a_{i_{2l-1}} \dots a_{i_{2l}-1})[q_{p,i_{2l-1}}, q_{p,i_{2l}}] = \sigma(\nu)$.
Let $\pi'_p$ be the path which we obtain by replacing in $\pi_p$ the part $\pi_p(i_{2l-1}, i_{2l})$ by ν.
We have $\sigma(\pi'_p) \le_\oplus \sigma(\pi_p)$ by (3), the choice of ν, and the stability of $\le_\oplus$ under ⊗. In
combination with the choice of $\pi_p$, we obtain
We are now able to derive information on λ θ(w)[p] using (1) and k. Let $\hat\pi_p$ be the path⁴ which
we obtain by “erasing” the parts $\pi_p(i_{2l-1}, i_{2l})$ from $\pi_p$ for every 1 ≤ l ≤ n. We define for every
p ∈ Q′
$k_p := \lambda(q_{p,0}) \otimes \sigma(\hat\pi_p)$.
By the commutativity of ⊗ we have
$(\lambda\,\theta(w))[p] = k \otimes k_p$. (6)
If we set $k_p$ := 0 for p ∈ Q \ Q′, then (6) holds for every p ∈ Q. We define k′ ∈ M1×Q(K) by
setting k′[p] = $k_p$. We can state (6) as λ θ(w) = k ⊗ k′. Let
K′ := { λ(q) ⊗ σ(π) | q, p ∈ Q, v ∈ Σ∗, |v| ≤ |Q||Q|, π ∈ $q \overset{v}{\leadsto} p$ }.
By (2), we have $k_p$ ∈ K′ for every p ∈ Q, and thus k′ ∈ M1×Q(K′). Note that K′ and M1×Q(K′)
are finite. By Lemma 3.2, we obtain
Thus, Q̃ is finite.
⁴ Let ŵ ∈ Σ∗ be the label of $\hat\pi_p$. Please note that we do NOT necessarily have $\lambda(q_{p,0}) \otimes \sigma(\hat\pi_p) = \lambda\,\theta(\hat w)[p]$.
(There are easy counterexamples.)
Next we show that we cannot prove Theorem 3.5 for arbitrary semirings: neither the assumption
that K is commutative nor the assumption that K is a min-semiring can be dropped.
Example 3.6. Let Σ = {a}. Consider the WFA T₁ over the semiring (N, +, ·, 0, 1) defined by
Q = {q₀, q₁}, λ(q₀) = λ(q₁) = ϱ(q₀) = ϱ(q₁) = 1 and E = {(q₀, a, 1, q₀), (q₀, a, 1, q₁), (q₁, a, 1, q₁)}.
[Figure: the WFA T₁ with two states, each with initial weight 1 and terminal weight 1, a loop a/1 on each state, and a transition a/1 from the first state to the second.]
For every n ≥ 0 we have λ · θ(aⁿ) = (1, n+1). It is obvious that every maximal factorization (f, g)
satisfies f((1, n+1)) = (1, n+1). Thus, the determinization of T₁ by some maximal factorization
does not exist, although T₁ has the twins property and (N, +, ·, 0, 1) is commutative.
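The vectors λ · θ(aⁿ) of Example 3.6 can be computed directly. In this sketch the matrix `theta_a` encodes the three transitions of T₁ with rows and columns ordered q₀, q₁; the function names are illustrative.

```python
def row_times_matrix_nat(row, m):
    """Product of a row vector and a matrix over (N, +, ·, 0, 1)."""
    return tuple(sum(row[p] * m[p][q] for p in range(len(row)))
                 for q in range(len(m[0])))


# theta(a) for T1: loops on q0 and q1, and one transition from q0 to q1.
theta_a = ((1, 1),
           (0, 1))


def lam_theta_power(n):
    """The vector lambda * theta(a^n), starting from lambda = (1, 1)."""
    row = (1, 1)
    for _ in range(n):
        row = row_times_matrix_nat(row, theta_a)
    return row
```

The vectors are pairwise distinct for different n, which is why a factorization with f((1, n+1)) = (1, n+1) produces infinitely many states.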
Example 3.7. Let Σ = {a} and ∆ = {a, b} with a < b. Consider the string semiring
(∆∗, min, ·, 0, ε) and the WFA T₂ defined by Q = {q₀, q₁}, λ(q₀) = ϱ(q₀) = ϱ(q₁) = ε, λ(q₁) = b
with transitions E = {(q₀, a, a, q₀), (q₁, a, a, q₁)}.
[Figure: the WFA T₂ with two states, initial weights ε and b, terminal weights ε, and a loop a/a on each state.]
For every n ≥ 0 we have λ · θ(aⁿ) = (aⁿ, baⁿ). As above, it is obvious that every maximal
factorization (f, g) satisfies f((aⁿ, baⁿ)) = (aⁿ, baⁿ). Thus, the determinization of T₂ by some
maximal factorization does not exist, although T₂ has the twins property and (∆∗, min, ·, 0, ε) is
a min-semiring.
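The same computation for Example 3.7 can be sketched over the string semiring, with words as Python strings, concatenation as multiplication and `None` standing for the zero (these representation choices and the function names are assumptions for illustration).

```python
from functools import reduce


def s_times(u, v):
    """Multiplication of the string semiring: concatenation; None is the zero."""
    return None if u is None or v is None else u + v


def s_plus(u, v):
    """Addition of the string semiring: minimum by length, then alphabetically."""
    if u is None:
        return v
    if v is None:
        return u
    return min(u, v, key=lambda w: (len(w), w))


def row_times_matrix(row, m):
    return tuple(
        reduce(s_plus, (s_times(row[p], m[p][q]) for p in range(len(row))), None)
        for q in range(len(m[0])))


# theta(a) for T2: a loop with output a on each of the two states.
theta_a = (("a", None),
           (None, "a"))


def lam_theta_power(n):
    """The vector lambda * theta(a^n), starting from lambda = (epsilon, b)."""
    row = ("", "b")
    for _ in range(n):
        row = row_times_matrix(row, theta_a)
    return row
```

Since the letters are appended on the right, the leading b of the second entry never moves, in line with the claim λ · θ(aⁿ) = (aⁿ, baⁿ).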
It is an open question whether one can generalize Theorem 3.5 to idempotent commutative
semirings. If K is an idempotent semiring, then ⊕ is the infimum over some partial ordering,
i.e., we have a slightly weaker condition than in a min-semiring. However, in an idempotent
semiring we can no longer assume the existence of victorious paths which is of crucial importance
in the proof of Theorem 3.5 (e.g., for the existence of ν). On the other hand, we do not have a
counterexample which shows that Theorem 3.5 cannot be generalized to idempotent semirings.
4 Unambiguous and Single-Valued WFA
Example 4.1. Let Σ = {a, b} and set f(w) := min{|w|_a, |w|_b} (w ∈ Σ∗). It defines a recognizable
formal power series.
We still need some definitions and a normal form. Let T = [Q, E, λ, ϱ] be a WFA. We call
some state q′ ∈ Q accessible if there are p, q ∈ Q and u, v ∈ Σ∗ satisfying λ(p) ≠ 0, ϱ(q) ≠ 0,
and $p \overset{u}{\leadsto} q' \overset{v}{\leadsto} q$ ≠ ∅. We call T trim if every state in Q is accessible.
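Accessibility in the above sense can be decided by one forward and one backward graph search. A minimal sketch, assuming that transitions are tuples (p, a, k, q) whose weights k are different from the zero of the semiring and that `lam` and `rho` are dictionaries over all states; the names are illustrative.

```python
def reachable(start, edges):
    """All states reachable from `start` along `edges` (paths of length 0 included)."""
    seen, todo = set(start), list(start)
    while todo:
        s = todo.pop()
        for t in edges.get(s, ()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def accessible_states(transitions, lam, rho, zero):
    """States lying on some path from an initial state to an accepting state."""
    succ, pred = {}, {}
    for (p, _a, _k, q) in transitions:
        succ.setdefault(p, set()).add(q)
        pred.setdefault(q, set()).add(p)
    forward = reachable([p for p, k in lam.items() if k != zero], succ)
    backward = reachable([q for q, k in rho.items() if k != zero], pred)
    return forward & backward
```

A WFA is then trim exactly when the returned set contains every state.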
Remark 4.2. Let T = [Q, E, λ, ϱ] be a WFA such that we have |T|(w) ≠ 0 for at least one
nonempty word w ∈ Σ⁺. There is a WFA T′ = [Q′, E′, λ′, ϱ′] such that:
(a) λ′[p′] = ϱ′[q′] = 1.
(b) For every r ∈ Q with r ≠ p′, we have λ′[r] = 0.
(c) For every r ∈ Q with r ≠ q′, we have ϱ′[r] = 0.
(d) E′ ⊆ (Q′ \ {q′}) × Σ × (K \ {0}) × (Q′ \ {p′}).
5. T 0 is trim.
That is to say, for successful paths with identical labels, the weights including initial and final
weights, coincide. In the case of the tropical or a min-semiring it simply means that the weight
of a successful path together with the respective initial and final weight is equal to the value for
the considered label defined by |T |.
This provides an extension of the notion of unambiguous WFA. However, the construction
in this section is used to show that these WFA admit equivalent unambiguous ones. Hence, the
two classes of WFA compute the same class of formal power series. To prove Theorem 4.4
we will need the cross section theorem due to Eilenberg [11] (cf. also [4]).
Given a single-valued WFA T we use Proposition 4.3 to construct a cross section of the set
of all successful paths in T that contains for every label exactly one of the original successful
paths. However, we have to restrict ourselves to idempotent semirings.
Theorem 4.4. Let K be an idempotent semiring and T = [Q, E, λ, ϱ] a single-valued WFA over
K. There exists an equivalent unambiguous WFA for T.
Proof. Firstly, we treat the case where |T|(ε) = 0. Let T = [Q, E, λ, ϱ] be a single-valued WFA.
We may assume that T has the normal form of Remark 4.2, in particular 4.(d), and only has
accessible states. Let Σ, {i} and {f} be the considered alphabet, initial and accepting states of
T, respectively. We define
R := { k ∈ K | ∃ p, q ∈ Q, a ∈ Σ, (p, a, k, q) ∈ E },
α(p, a, k, q) := a, α(ε) := ε
S := { (p, a, k, p′)(q, a′, k′, q′) ∈ E² | p′ ≠ q } ⊆ E∗
be the set of all words over E with length 2 which are not successive. The set of all successful
paths in T
P := [ ({i} × Σ × R × Q) E∗ ∩ E∗ (Q × Σ × R × {f}) ] \ E∗SE∗,
Since α maps B bijectively onto α(P), the constructed WFA T′ is unambiguous. We have to
show that T′ and T are equivalent.
Let w = a1 a2 . . . an be an element of α(P ). Since T is single-valued and K is an idempotent
semiring there exists a path
Note that Theorem 4.4 is already known for transducers over the tropical semiring and for
certain string-to-string transducers [29, 15, 19] even in a stronger fashion [31]. However, these
approaches rely on the cancellativity of the multiplication in the semiring while our approach
just requires idempotency of the semiring addition.
Also, note that every min-semiring is idempotent. So especially for the min-semirings, in-
troduced in Section 3.6, one can apply Theorem 4.4.
As already mentioned, the proof of Theorem 4.4 relies on the cross section theorem. We
could give an alternative proof for this theorem which utilizes a so-called immersion (morphism
with special conditions) of a (non-weighted) automaton due to Schützenberger and
Sakarovitch [27]. We briefly describe the main steps of this proof. For a single-valued WFA
S we consider the underlying (non-weighted) automaton A. With [27, Theorem 3], for A there
exists an equivalent unambiguous automaton B and an immersion from B into A, mapping edges
in B to edges in A and states in B to states in A. This morphism has certain compatibility
properties. We want to distribute weights along B. For this, to an edge in B we associate the
weight of the image of this edge under the immersion, considered not in A but in the originally
given weighted automaton S. We call the resulting WFA S′. Now, very similar to the part of
the first proof showing |T| = |T′|, using also the idempotency of the semiring and properties of
the immersion, it is possible to prove the equivalence of the constructed WFA S′ and S.
4.2 Weighted finite automata over the tropical semiring
In this section we consider WFA over the tropical semiring T. We always consider determinization
with respect to the maximal factorization by Mohri, as explained in Section 3.5. If not
indicated otherwise, the underlying alphabet and the set of transitions are Σ and E, respectively.
For the class of single-valued and trim WFA we show that the determinization is optimal in the
sense that for a given single-valued and trim WFA T the determinization is defined iff T is
determinizable, i.e., there exists an equivalent subsequential WFA for T. This extends Mohri [25]
from unambiguous to single-valued WFA. Furthermore, we show that it is decidable whether a
single-valued WFA over T computes a subsequential formal power series.
On Σ∗ one can define a metric d by setting
Lemma 4.5 ([25]). Every subsequential formal power series has bounded variation.
On the other hand, the formal power series of an unambiguous WFA does not necessarily
have bounded variation, as the following example shows:
Example 4.6. The formal power series computed by the following unambiguous WFA T does
not have bounded variation.
[Figure: the WFA T, with states labeled 2, 3, 4, 5 and transitions labeled a/0, a/1, b/0, c/0.]
In [25, p. 283, Th. 9] Mohri claims a characterization of recognizable power series which
are subsequential by saying a recognizable series is subsequential if and only if it has bounded
variation. Lemma 4.7 shows the incorrectness of his claim.
Lemma 4.7. The function f of Example 4.1 is a recognizable formal power series, has bounded
variation, but it is not subsequential.
Proof. By Example 4.1, it remains to show that f has bounded variation. Let w ∈ Σ∗ be
arbitrary. By distinguishing the cases |w|_a < |w|_b and |w|_a ≥ |w|_b, we obtain f(w) ≤ f(wa) ≤
f(w) + 1 and f(w) ≤ f(wb) ≤ f(w) + 1. By an induction on the length of w′ ∈ Σ∗, we have
Let w, u, v ∈ Σ∗ be arbitrary such that w = wu ∧ wv, and thus, d(wu, wv) = |u| + |v|. We have
|f(wu) − f(wv)| ≤ |f(wu) − f(w)| + |f(wv) − f(w)| ≤ |u| + |v| = d(wu, wv).
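The final inequality can be checked by brute force on short words. The sketch assumes that d(x, y) = |x| + |y| − 2|x ∧ y|, where x ∧ y is the longest common prefix; this is the reading suggested by d(wu, wv) = |u| + |v| in the proof, and all function names are illustrative.

```python
from itertools import product


def f(w: str) -> int:
    """The series of Example 4.1: minimum of the numbers of a's and b's."""
    return min(w.count("a"), w.count("b"))


def lcp_len(x: str, y: str) -> int:
    """Length of the longest common prefix of x and y."""
    n = 0
    for c, e in zip(x, y):
        if c != e:
            break
        n += 1
    return n


def d(x: str, y: str) -> int:
    """Prefix distance (assumed form of the metric d)."""
    return len(x) + len(y) - 2 * lcp_len(x, y)


def words(max_len: int):
    for n in range(max_len + 1):
        for t in product("ab", repeat=n):
            yield "".join(t)


def max_violation(max_len: int) -> int:
    """Largest value of |f(x) - f(y)| - d(x, y) over all words up to max_len."""
    ws = list(words(max_len))
    return max(abs(f(x) - f(y)) - d(x, y) for x in ws for y in ws)
```

A non-positive result of `max_violation` means that |f(x) − f(y)| ≤ d(x, y) holds for all tested pairs, which is a finite sanity check of the bound and not a replacement for the proof.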
On the other hand, if the considered WFA is single-valued, trim and has the property of bounded
variation, then it has the twins property and hence its determinization is defined. Therefore, we
have the following propositions.
Proposition 4.8. Every trim and single-valued WFA computing a formal power series with
bounded variation has the twins property.
Proof. Let T = [Q, E, λ, ϱ] be a trim and single-valued WFA such that |T| has bounded variation
and let I, F be the sets of initial and accepting states of T, respectively.
Consider q₁, q₂ ∈ Q, i₁, i₂ ∈ I and u, v ∈ Σ∗ such that
\[
i_1 \overset{u}{\leadsto} q_1 \neq \emptyset \;\wedge\; i_2 \overset{u}{\leadsto} q_2 \neq \emptyset \;\wedge\; q_1 \overset{v}{\leadsto} q_1 \neq \emptyset \;\wedge\; q_2 \overset{v}{\leadsto} q_2 \neq \emptyset .
\]
Since T is trim and |T| has bounded variation, there are w₁, w₂ ∈ Σ∗, f₁, f₂ ∈ F and a K ≥ 0 such
that
\[
q_1 \overset{w_1}{\leadsto} f_1 \neq \emptyset \;\wedge\; q_2 \overset{w_2}{\leadsto} f_2 \neq \emptyset \;\wedge\; \Bigl[\, \forall k \ge 0 : \bigl|\, |T|(uv^k w_1) - |T|(uv^k w_2) \,\bigr| \le K \,\Bigr] .
\]
T is single-valued and trim, hence all paths between two states over identical words have the
same weight:
\begin{align*}
|T|(uv^k w_1) &= \lambda(i_1) + \theta(u)[i_1, q_1] + \theta(w_1)[q_1, f_1] + k \cdot \theta(v)[q_1, q_1] + \varrho(f_1) \\
              &=: k \cdot \theta(v)[q_1, q_1] + C_1 \\
|T|(uv^k w_2) &= \lambda(i_2) + \theta(u)[i_2, q_2] + \theta(w_2)[q_2, f_2] + k \cdot \theta(v)[q_2, q_2] + \varrho(f_2) \\
              &=: k \cdot \theta(v)[q_2, q_2] + C_2 \\
\Longrightarrow\quad & \forall k \ge 0 : \bigl|\, C_1 - C_2 + k \bigl( \theta(v)[q_1, q_1] - \theta(v)[q_2, q_2] \bigr) \bigr| \le K \\
\Longrightarrow\quad & \theta(v)[q_1, q_1] - \theta(v)[q_2, q_2] = 0
\end{align*}
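The last implication can be spelled out as follows. The symbol Δ is introduced here only for this remark and abbreviates the difference of the two loop weights; C₁, C₂ and K are as above.

```latex
\[
\Delta := \theta(v)[q_1,q_1] - \theta(v)[q_2,q_2], \qquad
\bigl| C_1 - C_2 + k\,\Delta \bigr| \;\ge\; k\,|\Delta| - |C_1 - C_2| \quad (k \ge 0).
\]
If $\Delta \neq 0$, then every $k > \bigl(K + |C_1 - C_2|\bigr)/|\Delta|$ yields
$\bigl| C_1 - C_2 + k\,\Delta \bigr| > K$, which contradicts the bound for all $k \ge 0$.
Hence $\Delta = 0$.
```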
Now, for the considered class of formal power series computed by single-valued WFA, we can
show that the properties of bounded variation and subsequentiality coincide.
Proposition 4.9. Let f be a formal power series. The following two conditions are equivalent:
1. f is subsequential.
Proof. Every subsequential (and therefore single-valued) WFA for f computes a formal power
series with bounded variation (Lemma 4.5). Conversely, if f is computable by a single-valued
WFA, by Proposition 4.8 this WFA has the twins property. Applying Theorem 3.5, we get
(2) ⇒ (1).
Theorem 4.10. Let T be a single-valued and trim WFA. The following conditions are equiva-
lent:
1. |T | is subsequential.
Proof. The equivalence follows from Proposition 4.8, Theorems 3.5 and 3.1, and Lemma 4.5.
2. for every (p, p′, k) ∈ F, (q, q′) ∈ C, (p, a, l, q), (p′, a, l′, q′) ∈ E, we have (q, q′, k + l − l′) ∈ F.
Lemma 4.11. With the preceding terminology, the following two assertions are equivalent:
1. T is not single-valued.
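Proof. (1) ⇒ (2) By (1), there is some word w ∈ Σ∗ and there are $q_0, q_0' \in I$, $q_{|w|}, q_{|w|}' \in F$, and
paths $\pi \in q_0 \overset{w}{\leadsto} q_{|w|}$, $\pi' \in q_0' \overset{w}{\leadsto} q_{|w|}'$ such that
\[ \lambda(q_0) + \theta(\pi) + \varrho(q_{|w|}) \neq \lambda(q_0') + \theta(\pi') + \varrho(q_{|w|}'), \]
and hence,
\[ \lambda(q_0) + \theta(\pi) - \lambda(q_0') - \theta(\pi') \neq \varrho(q_{|w|}') - \varrho(q_{|w|}). \tag{8} \]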
For 0 ≤ i < |w|, we denote the i-th transition of π resp. π′ by $(q_i, a_i, k_i, q_{i+1})$ resp.
$(q_i', a_i, k_i', q_{i+1}')$.
For every 0 ≤ i ≤ |w|, we have $(q_i, q_i') \in C$. Moreover, $(q_0, q_0', \lambda(q_0) - \lambda(q_0')) \in F$, and by an
induction on i, we can show that for every 0 ≤ i ≤ |w|,
and (p, p′, k2 + θ(π) − θ(π′)) belong to F. Since k1 ≠ k2, one of these triples proves (2).
Condition (∗) in Lemma 4.11 will be crucial for the complexity of an algorithm which explores
the set F to decide whether T is single-valued.
Theorem 4.12. Let T be a WFA over the tropical semiring. It is decidable in polynomial time
whether T is single-valued.
Proof. First, the algorithm computes the set C. This is possible in polynomial time since C
is the least subset of Q × Q which contains F × F (here F denotes the set of accepting states)
and is closed as follows: for every (q, q′) ∈ C and transitions (p, a, k, q), (p′, a, l, q′) ∈ E,
we have (p, p′) ∈ C.
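The closure that defines C can be computed by a standard worklist iteration. The following is a minimal sketch, not the authors' implementation: the function name `co_reachable_pairs` and the encoding of transitions as tuples `(p, a, k, q)` are assumptions made for illustration.

```python
from collections import deque

def co_reachable_pairs(transitions, accepting):
    """Least C within Q x Q containing accepting x accepting and closed
    backwards under pairs of transitions that carry the same label.

    transitions: iterable of (p, a, k, q) tuples (hypothetical encoding)
    accepting:   set of accepting states
    """
    # Index source states by (target, label) for backward steps.
    incoming = {}
    labels = set()
    for p, a, _k, q in transitions:
        incoming.setdefault((q, a), []).append(p)
        labels.add(a)

    C = {(f, g) for f in accepting for g in accepting}
    todo = deque(C)
    while todo:
        q, q2 = todo.popleft()
        for a in labels:
            for p in incoming.get((q, a), []):
                for p2 in incoming.get((q2, a), []):
                    if (p, p2) not in C:
                        C.add((p, p2))
                        todo.append((p, p2))
    return C
```

Each pair enters the worklist at most once, so at most |Q|² pairs are processed, which is in line with the polynomial bound claimed in the proof.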
Then, the algorithm generates the set F. Whenever the algorithm produces a new triple
in F, it checks whether condition (2) or (∗) in Lemma 4.11 is satisfied. If so, then T is not
single-valued. If the algorithm computes the entire set F, and neither condition (2) nor (∗) in
Lemma 4.11 is satisfied, then T is single-valued.
The set F is possibly infinite. However, in every subset of F which consists of more than
|C| triples, there are two triples as in condition (∗). Hence, the algorithm generates at most
|C| + 1 ≤ |Q|² + 1 triples, i.e., it terminates in polynomial time.
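The exploration of the triples can be sketched as follows. This is a hedged illustration under stated assumptions rather than the procedure exactly as the authors formulate it. Reading off the proof of Lemma 4.11, the sketch assumes that the initial triples are (i, i′, λ(i) − λ(i′)) for initial states i, i′ with (i, i′) ∈ C, that condition (2) asks for a triple (p, p′, k) with p, p′ accepting and k ≠ ϱ(p′) − ϱ(p), and that condition (∗) asks for two triples with the same state pair and different third components. The function name, the dictionary encoding of λ and ϱ, and the tuple encoding of transitions are hypothetical; the set C is passed in as an argument.

```python
from collections import deque

def is_single_valued(initial, accepting, transitions, C):
    """initial, accepting: dict state -> weight (finite weights only);
    transitions: list of (p, a, k, q); C: set of state pairs.
    Weights should be exact (e.g. int), since they are compared with ==.
    """
    outgoing = {}
    for p, a, k, q in transitions:
        outgoing.setdefault(p, []).append((a, k, q))

    diff = {}        # one stored weight difference per pair in C
    todo = deque()

    def add(p, p2, k):
        """Record a triple; return False if it witnesses a violation."""
        if (p, p2) not in C:
            return True                      # triples live over C only
        if (p, p2) in diff:
            return diff[(p, p2)] == k        # assumed condition (*)
        if p in accepting and p2 in accepting:
            if k != accepting[p2] - accepting[p]:
                return False                 # assumed condition (2)
        diff[(p, p2)] = k
        todo.append((p, p2, k))
        return True

    for i in initial:
        for i2 in initial:
            if not add(i, i2, initial[i] - initial[i2]):
                return False

    while todo:
        p, p2, k = todo.popleft()
        for a, l, q in outgoing.get(p, []):
            for a2, l2, q2 in outgoing.get(p2, []):
                if a == a2 and not add(q, q2, k + l - l2):
                    return False
    return True
```

Because the sketch stores at most one weight difference per pair of C and stops at the first conflicting triple, it handles at most |C| + 1 distinct triples, mirroring the counting argument above.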
Acknowledgment
We gratefully acknowledge the work of the anonymous referees and the editors, which resulted in
improvements of this paper.
References
[1] C. Allauzen and M. Mohri. Efficient algorithms for testing the twins property. Journal of Automata,
Languages and Combinatorics, 8(2):117–144, 2003.
[2] M.-P. Béal and O. Carton. Computing the prefix of an automaton. R.A.I.R.O. - Informatique Théorique et
Applications, 34(6):503–514, 2000.
[3] M.-P. Béal, O. Carton, C. Prieur, and J. Sakarovitch. Squaring transducers. An efficient procedure for
deciding functionality and sequentiality. Theoretical Computer Science, 292(1):45–63, 2003.
[4] J. Berstel. Transductions and Context-Free Languages. B. G. Teubner, Stuttgart, 1979.
[5] J. Berstel and C. Reutenauer. Rational Series and Their Languages, volume 12 of EATCS Monographs on
Theoretical Computer Science. Springer-Verlag, Berlin Heidelberg New York, 1984.
[6] A.L. Buchsbaum, R. Giancarlo, and J.R. Westbrook. On the determinization of weighted finite automata.
SIAM Journal on Computing, pages 1502–1531, 2000.
[7] C. Choffrut. Une caracterisation des fonctions sequentielles et des fonctions sous-sequentielles en tant que
relations rationnelles. Theoretical Computer Science, 5(3):325–337, 1977.
[8] C. Choffrut. Minimizing subsequential transducers: A survey. Theoretical Computer Science, 292(1):131–143,
2003.
[9] K. Culik II and J. Kari. Image compression using weighted finite automata. Computers & Graphics,
17:305–313, 1993.
[10] M. Droste and P. Gastin. On aperiodic and star-free formal power series in partially commuting variables.
In D. Krob, A.A. Mikhalev, and A.V. Mikhalev, editors, Formal Power Series and Algebraic Combinatorics,
12th Int. Conf., pages 158–169. Springer-Verlag, Berlin, 2000.
[11] S. Eilenberg. Automata, Languages, and Machines, Vol. A. Academic Press, New York, 1974.
[12] L. Fuchs. Teilweise geordnete algebraische Strukturen. Vandenhoeck & Ruprecht, Göttingen, 1966.
[13] J. S. Golan. Semirings and Their Applications. Kluwer Academic Publishers, 1999.
[14] U. Hafner. Low Bit-Rate Image and Video Coding with Weighted Finite Automata. PhD thesis, Universität
Würzburg, 1999.
[15] T. Harju, H.C.M. Kleijn, and M. Latteux. Compositional representations of rational functions. R.A.I.R.O.
- Informatique Théorique et Applications, 26:243–255, 1992.
[16] W.C. Holland and J. Martinez, editors. Ordered Algebraic Structures. Kluwer Academic Publishers, 1997.
[17] Z. Jiang, B. Litow, and O. de Vel. Similarity enrichments in image compression through weighted finite
automata. In COCOON’00 Proceedings, volume 1858 of Lecture Notes in Computer Science, pages 447–456.
Springer-Verlag, Berlin, 2000.
[18] F. Katritzke. Refinements of Data Compression using Weighted Finite Automata. PhD thesis, Universität
Siegen, 2001.
[19] R. Klemm and A. Weber. Economy of description for single-valued transducers. Information and
Computation, 118(2):327–340, 1995.
[20] I. Klimann, S. Lombardy, J. Mairesse, and C. Prieur. Deciding unambiguity and sequentiality from a finitely
ambiguous max-plus automaton. Theoretical Computer Science, 327(3):349–373, 2004.
[21] D. Krob. Some consequences of a Fatou property of the tropical semiring. Journal of Pure and Applied
Algebra, 93:231–249, 1994.
[22] W. Kuich. Semirings and formal power series. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal
Languages, Vol. 1, Word, Language, Grammar, pages 609–677. Springer-Verlag, Berlin, 1997.
[23] B. Mahr. Iteration and summability in semirings. Annals of Discrete Mathematics, 19:229–256, 1984.
[24] I. Mäurer. Zur Minimalisierung und Determinisierung von sequentiellen Transducern. Master’s thesis,
Technische Universität Dresden, Institut für Algebra, 2002.
[25] M. Mohri. Finite-state transducers in language and speech processing. Computational Linguistics, 23:269–
311, 1997.
[26] C. Reutenauer. A survey on noncommutative rational series. DIMACS Series in Discrete Mathematics and
Theoretical Computer Science, 24:159–169, 1996.
[27] J. Sakarovitch. A construction on finite automata that has remained hidden. Theoretical Computer Science,
204:205–231, 1998.
[28] A. Salomaa and M. Soittola. Automata-Theoretic Aspects of Formal Power Series. Texts and Monographs
on Computer Science. Springer-Verlag, Berlin Heidelberg New York, 1978.
[29] M. Schützenberger. Sur les relations rationnelles entre monoïdes libres. Theoretical Computer Science,
3:243–259, 1976.
[30] I. Simon. On semigroups of matrices over the tropical semiring. Informatique Théorique et Applications,
28:277–294, 1994.
[31] A. Weber. Decomposing a k-valued transducer into k unambiguous ones. Informatique Théorique et
Applications, 30:379–413, 1996.