Hanneforth-Tree Automata
Universität Leipzig, Faculty of Mathematics and Computer Science, Institute of Computer Science
PO box 100 920, 04009 Leipzig, Germany
e-mail address: [email protected]
1. Introduction
Weighted tree automata [FV09] have recently found various applications in fields as diverse
as natural language and XML processing [KM09], system verification [Jac11], and
pattern recognition. Most applications require efficient algorithms for basic manipulations
2012 ACM CCS: [Theory of computation]: Formal languages and automata theory — Tree languages;
[Theory of computation]: Formal languages and automata — Automata extensions — Quantitative
automata.
2010 Mathematics Subject Classification: 68Q45, 68Q25.
Key words and phrases: pushing — weighted tree automaton — minimization — equivalence testing.
∗ This is a revised and extended version of [Maletti, Quernheim: Pushing for weighted tree automata.
Proc. 36th Int. Conf. Mathematical Foundations of Computer Science, LNCS 6907, p. 460–471, 2011].
Financially supported by the German Research Foundation (DFG) grant MA / 4959 / 1-1.
2 TH. HANNEFORTH, A. MALETTI, AND DANIEL QUERNHEIM
2. Preliminaries
We write $\mathbb{N}$ for the set of all nonnegative integers and $[u]$ for its subset
$\{i \mid 1 \le i \le u\}$ given $u \in \mathbb{N}$. The $k$-fold Cartesian product of a set $Q$
is written as $Q^k$, and the empty tuple $() \in Q^0$ is often written as $\varepsilon$. Every
finite and nonempty set is also called alphabet, of which the elements are called symbols. A
ranked alphabet $(\Sigma, \mathrm{rk})$ consists of an alphabet $\Sigma$ and a mapping
$\mathrm{rk} \colon \Sigma \to \mathbb{N}$, which assigns a rank to each symbol. If the ranking
'rk' is obvious from the context, then we simply write $\Sigma$ for the ranked alphabet. For each
$k \in \mathbb{N}$, we let $\Sigma_k$ be the set $\{\sigma \in \Sigma \mid \mathrm{rk}(\sigma) = k\}$
of $k$-ary symbols of $\Sigma$. Moreover, we let
$\Sigma(Q) = \{\sigma w \mid \sigma \in \Sigma, w \in Q^{\mathrm{rk}(\sigma)}\}$. The set
$T_\Sigma(Q)$ of all $\Sigma$-trees indexed by $Q$ is inductively defined to be the smallest set $T$
such that $Q \subseteq T$ and $\Sigma(T) \subseteq T$. Instead of $T_\Sigma(\emptyset)$ we simply
write $T_\Sigma$. The size $|t|$ of a tree $t \in T_\Sigma(Q)$ is inductively defined by $|q| = 1$
for every $q \in Q$ and $|\sigma(t_1, \dots, t_k)| = 1 + \sum_{i=1}^{k} |t_i|$ for every
$k \in \mathbb{N}$, $\sigma \in \Sigma_k$, and $t_1, \dots, t_k \in T_\Sigma(Q)$.
To increase readability, we often omit quantifications like "for all $k \in \mathbb{N}$" if they
are obvious from the context.
We reserve the use of a special symbol $\Box$ that is not an element in any considered
alphabet. Its function is to mark a designated position in certain trees called contexts.
Formally, the set $C_\Sigma(Q)$ of all $\Sigma$-contexts indexed by $Q$ is defined as the smallest
set $C$ such that $\Box \in C$ and $\sigma(t_1, \dots, t_{i-1}, c, t_{i+1}, \dots, t_k) \in C$ for
every $\sigma \in \Sigma_k$, $t_1, \dots, t_k \in T_\Sigma(Q)$, $i \in [k]$, and $c \in C$. As
before, we simplify $C_\Sigma(\emptyset)$ to $C_\Sigma$. In simple words, a context is a tree, in
which the special symbol $\Box$ occurs exactly once and at a leaf position. Note that
$C_\Sigma(Q) \cap T_\Sigma(Q) = \emptyset$, but $C_\Sigma(Q) \subseteq T_\Sigma(Q \cup \{\Box\})$,
which allows us to treat contexts like trees. Given $c \in C_\Sigma(Q)$ and
$t \in T_\Sigma(Q \cup \{\Box\})$, the tree $c[t]$ is obtained from $c$ by replacing the unique
occurrence of $\Box$ in $c$ by $t$. In particular, $c[c'] \in C_\Sigma(Q)$ given that
$c, c' \in C_\Sigma(Q)$.
A commutative semiring [HW98, Gol99] is a tuple (S, +, ·, 0, 1) such that (S, +, 0) and
(S, ·, 1) are commutative monoids and s · 0 = 0 and s · (s1 + s2 ) = (s · s1 ) + (s · s2 ) for all
$s, s_1, s_2 \in S$ (i.e., $\cdot$ distributes over $+$). It is a commutative semifield if
$(S \setminus \{0\}, \cdot, 1)$ is a commutative group (i.e., in addition, for every
$s \in S \setminus \{0\}$ there exists $s^{-1} \in S$ such that $s \cdot s^{-1} = 1$). Typical
commutative semifields include
• the Boolean semifield B = ({0, 1}, max, min, 0, 1),
• the field of rational numbers (Q, +, ·, 0, 1), and
• the Viterbi semifield (Q≥0 , max, ·, 0, 1), where Q≥0 = {q ∈ Q | q ≥ 0}.
Given a mapping $f \colon A \to S$, we write $\mathrm{supp}(f)$ for the set
$\{a \in A \mid f(a) \neq 0\}$ of elements that are mapped via $f$ to a non-zero semiring element.
For the rest of the paper, let (S, +, ·, 0, 1) be a commutative semifield.
A weighted tree automaton [BLB83, Boz99, Kui98, BV03, Bor05, FV09] (for short: wta)
is a tuple M = (Q, Σ, µ, F ), in which
• Q is an alphabet of states,
• Σ is a ranked alphabet of input symbols,
• µ : Σ(Q) × Q → S assigns a weight to each transition, and
• F ⊆ Q is a set of final states.
We often write elements of $T_\Sigma(Q) \times Q$ as $t \to q$ instead of $(t, q)$. The size $|M|$
of the wta $M$ is
\[ |M| = \sum_{t \to q \,\in\, \mathrm{supp}(\mu)} (|t| + 1) . \]
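The size measure can be made concrete with a small sketch. The dictionary encoding of µ, the symbol names, and the function name `wta_size` are assumptions made for illustration only; they do not come from the paper.

```python
from fractions import Fraction

# Hypothetical encoding: the transition sigma(q1, ..., qk) -> q is stored under
# the key (sigma, (q1, ..., qk), q) and mapped to its weight.
mu = {
    ("alpha", (), "p"): Fraction(1),
    ("gamma", ("p",), "q"): Fraction(2),
    ("sigma", ("p", "q"), "q"): Fraction(3),
}

def wta_size(mu):
    """Return |M|, the sum of (|t| + 1) over all t -> q in supp(mu)."""
    total = 0
    for (symbol, sources, target), weight in mu.items():
        if weight != 0:                      # only the support of mu counts
            tree_size = 1 + len(sources)     # |sigma(q1..qk)| = 1 + k, as |q_i| = 1
            total += tree_size + 1
    return total

print(wta_size(mu))
```

In this toy automaton the three transitions contribute 2, 3, and 4, respectively.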
for all $p, q \in Q$, $\sigma \in \Sigma_k$, and $t_1, \dots, t_k \in T_\Sigma(Q)$. The wta $M$
recognizes the weighted tree language $M \colon T_\Sigma \to S$ such that
$M(t) = \sum_{q \in F} h_\mu(t \to q)$ for every $t \in T_\Sigma$. Two wta $M$ and $M'$ are
equivalent if their recognized weighted tree languages coincide. The unweighted (finite-state)
tree automaton [GS84, GS97, CDG+07] (for short: fta) corresponding to $M$ is
$\mathrm{unw}(M) = (Q, \Sigma, \mathrm{supp}(\mu), F)$.¹ We note that
$\mathrm{supp}(M) \subseteq L(\mathrm{unw}(M))$, where $L(\mathrm{unw}(M))$ is the tree language
recognized by the fta $\mathrm{unw}(M)$.
The wta $M = (Q, \Sigma, \mu, F)$ is (bottom-up) deterministic (or a dwta) if for every
$t \in \Sigma(Q)$ there exists at most one $q \in Q$ such that $t \to q \in \mathrm{supp}(\mu)$.
In other words, a wta $M$ is deterministic if and only if $\mathrm{unw}(M)$ is bottom-up
deterministic. In a dwta we can (without loss of information) treat $\mu$ and $h_\mu$ as partial
mappings $\mu \colon \Sigma(Q) \dashrightarrow Q \times S$ and
$h_\mu \colon T_\Sigma(Q) \dashrightarrow Q \times S$. We use $\mu^{(1)}$ and $\mu^{(2)}$ as well
as $h_\mu^{(1)}$ and $h_\mu^{(2)}$ for the corresponding projections to the first and second
output component, respectively (e.g., $\mu^{(1)} \colon \Sigma(Q) \dashrightarrow Q$ and
$\mu^{(2)} \colon \Sigma(Q) \dashrightarrow S$). To avoid complicated distinctions, we treat
undefinedness like a value (i.e., it is equal to itself, but different from every other value).
We observe that $\mathrm{supp}(M) = L(\mathrm{unw}(M))$ for each dwta $M$.² Moreover, the
restriction to final states instead of final weights in the definition of a wta does not restrict
the expressive power [Bor05, Lemma 6.1.4], which applies to both wta and dwta.
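The partial mappings can be illustrated by a minimal sketch that evaluates a dwta bottom-up over the rational numbers. The formal definition of $h_\mu$ is not part of this excerpt, so the sketch assumes the usual bottom-up semantics (a state leaf has weight 1, and the weight of a run is the product of its transition weights). The automaton, the tree encoding, and the names `run` and `recognize` are hypothetical.

```python
from fractions import Fraction

# Hypothetical dwta: (symbol, source states) -> (target state, weight).
# Determinism means that every left-hand side occurs at most once.
mu = {
    ("alpha", ()): ("p", Fraction(2)),
    ("gamma", ("p",)): ("f", Fraction(3)),
    ("sigma", ("p", "f")): ("f", Fraction(1, 2)),
}
final = {"f"}

def run(tree):
    """Return the pair (state, weight) reached on tree, or None if undefined.

    A tree is either a state name (str) or a pair (symbol, tuple of subtrees).
    """
    if isinstance(tree, str):                 # a state used as a leaf
        return tree, Fraction(1)
    symbol, children = tree
    states, weight = [], Fraction(1)
    for child in children:
        result = run(child)
        if result is None:                    # undefinedness propagates upwards
            return None
        state, child_weight = result
        states.append(state)
        weight *= child_weight
    entry = mu.get((symbol, tuple(states)))
    if entry is None:
        return None
    target, transition_weight = entry
    return target, weight * transition_weight

def recognize(tree):
    """Weight assigned to tree: the run weight if the run ends in a final state."""
    result = run(tree)
    if result is None or result[0] not in final:
        return Fraction(0)
    return result[1]

a = ("alpha", ())
t = ("sigma", (a, ("gamma", (a,))))
print(run(t), recognize(t), recognize(a))
```

The two components of the pair returned by `run` play the roles of $h_\mu^{(1)}$ and $h_\mu^{(2)}$, and `None` stands for undefinedness.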
An equivalence relation $\equiv$ on a set $A$ is a reflexive, symmetric, and transitive subset
of $A^2$. The equivalence class $[a]_\equiv$ of the element $a \in A$ is
$\{a' \in A \mid a \equiv a'\}$, and we let $(A'/\equiv) = \{[a']_\equiv \mid a' \in A'\}$ for
every $A' \subseteq A$. Whenever $\equiv$ is obvious from the context, we simply omit it. The
equivalence $\equiv$ respects a set $A' \subseteq A$ if $[a] \subseteq A'$ or
$[a] \subseteq A \setminus A'$ for every $a \in A$ (i.e., each equivalence class is either
completely in $A'$ or completely outside $A'$).
Let $M = (Q, \Sigma, \mu, F)$ be a dwta. An equivalence relation $\equiv \subseteq Q^2$ is a
congruence (of $M$) if $\mu^{(1)}(\sigma(q_1, \dots, q_k)) \equiv \mu^{(1)}(\sigma(q'_1, \dots, q'_k))$
for every $\sigma \in \Sigma_k$ and all equivalent states $q_1 \equiv q'_1, \dots, q_k \equiv q'_k$.
Note that this definition of congruence completely disregards the weights, which yields that
$\equiv$ is a congruence for $M$ if and only if $\equiv$ is a congruence for $\mathrm{unw}(M)$.
Two states $q_1, q_2 \in Q$ are weakly equivalent, written as $q_1 \sim_M q_2$, if
$h_\mu^{(1)}(c[q_1]) \in F$ if and only if $h_\mu^{(1)}(c[q_2]) \in F$ for all contexts
$c \in C_\Sigma(Q)$. In other words, weak equivalence coincides with classical equivalence
[GS84, Definition II.6.8] for $\mathrm{unw}(M)$. Consequently, the weak equivalence relation
$\sim_M$ is actually a congruence of $M$ that respects $F$ [GS84, Theorem II.6.10]. Finally, two
states are (strongly) equivalent, written as $q_1 \equiv_M q_2$, if there exists a factor
$s \in S \setminus \{0\}$ such that for all $c \in C_\Sigma(Q)$ we have
\[ h_\mu^{(2)}(c[q_1]) \cdot \chi_F\bigl(h_\mu^{(1)}(c[q_1])\bigr)
   = s \cdot h_\mu^{(2)}(c[q_2]) \cdot \chi_F\bigl(h_\mu^{(1)}(c[q_2])\bigr) , \]
where $\chi_F \colon Q \to \{0, 1\}$ is the characteristic function of $F$; i.e., $\chi_F(q) = 1$
if and only if $q \in F$ for all $q \in Q$. The equivalence relation $\equiv_M$ is called the
Myhill-Nerode equivalence
¹ An fta computes in the same manner as a wta over the Boolean semifield B.
² The statement holds because each commutative semifield is zero-divisor free [Bor03, Lemma 1].
PUSHING FOR WEIGHTED TREE AUTOMATA 5
relation [Mal09, Definition 3]. It is also a congruence that respects F [Mal09, Lemma 4]. If
M is clear from the context, then we just write ≡ instead of ≡M .
3. Signs of life
First, we demonstrate how to efficiently compute signs of life (Definition 3.1), which are
evidence that a final state can be reached. Together with these signs of life we also compute
a pushing weight for each state (Section 4). Our Algorithm 1 is a straightforward extension
of [Mal09, Algorithm 1] that computes on equivalence classes of states (with respect to a
congruence that respects finality) instead of states.³ This change guarantees that equivalent
states receive the same sign of life, which is an essential requirement for the algorithms in
Sections 5 and 6.
Before we start we need to recall the definition of a sign of life [Mal09]. In addition, we
recall the relevant properties that we use in our algorithm. For the rest of this section, let
M = (Q, Σ, µ, F ) be a dwta.
Definition 3.1 (see [Mal09, Section 2]). A context $c \in C_\Sigma(Q)$ is a sign of life for the
state $q \in Q$ if $h_\mu^{(1)}(c[q]) \in F$. Any state that has a sign of life is live; otherwise
it is dead.
Lemma 3.2 (see [Mal09, Lemma 9]). We have $\cong \,\subseteq\, \sim_M$ for every congruence
$\cong$ that respects $F$. In particular, $\equiv_M \,\subseteq\, \sim_M$. Moreover, every sign of
life for $q \in Q$ is also a sign of life for every $q' \in [q]_\cong$.
Proof. It is known that $\sim_M$ is the coarsest congruence that respects $F$
[GS84, Theorem II.6.10].⁴ Consequently, $\cong \,\subseteq\, \sim_M$ and
$\equiv_M \,\subseteq\, \sim_M$ since we already remarked that $\equiv_M$ is also a congruence that
respects $F$. Based on the definition of $\sim_M$ it is trivial to see that all elements of an
equivalence class of $\sim_M$ share the same signs of life [Mal09, Lemma 9]. Since
$[q]_\cong \subseteq [q]_{\sim_M}$ we obtain the desired statement.
Now let us explain Algorithm 1 in detail. Every final state $q \in F$ is trivially live as
evidenced by the trivial sign of life $\Box$. Since the congruence $\cong$ respects $F$, the set
$(F/\cong)$ contains equivalence classes that contain only final states. We set the sign of life
for each class to $\Box$ [see Line 3], and for each involved state $q$ we set its pushing weight
to 1 [see Line 4]. Overall, this initialization takes time $O(|F|)$. Next, we add all those blocks
to the live blocks $L$ and to the blocks $U$ yet to be explored. As long as there are still
unexplored blocks, we select a block $B$ from $U$ and remove it from $U$. Then we consider all
transitions that end in a state that belongs to the block $B$ and check whether they contain a
source state whose equivalence class is not yet present in $L$. For each such source state $q_i$,
we add its equivalence class $[q_i]_\cong$ to both $L$ and $U$. Then we set the sign of life for
this class to the sign of life for $B$ extended by the considered transition [see Line 12].
Finally, we select each state $q$ from $[q_i]_\cong$ and compute a pushing weight by multiplying
the weight of the currently considered transition with $q_i$ replaced by $q$ to the already
computed pushing weight for the target state reached by the modified transition [see Line 13].
Theorem 3.3 ([Mal09, Lemma 10 and Theorem 11]). Algorithm 1 is correct and runs in
time O(|M | + |Q|).
³ Note that our algorithm is not simply the previous algorithm executed on the quotient dwta with
respect to the congruence. The original dwta is used essentially in the computation of the
pushing weights.
⁴ Mind that $\sim_M$ coincides with classical equivalence on $\mathrm{unw}(M)$ and that our notion
of congruence completely disregards the weights.
Alg. 1 ComputeSoL: Compute a sign of life and its weight for each state.
Require: dwta M = (Q, Σ, µ, F) and congruence ≅ ⊆ Q^2 of M that respects F
Ensure: return live state partition (L/≅) and the mappings sol : (L/≅) → C_Σ(Q) and
        λ : L → S \ {0} such that λ(q) = h_µ^(2)(sol([q]_≅)[q]) for every q ∈ L
 1: L ← (F/≅)                                        // final states are trivially live ...
 2: for all B ∈ L do
 3:     sol(B) ← □                                   // ... with the trivial context as sign of life ...
 4:     λ(q) ← 1 for all q ∈ B                       // ... and trivial pushing weight
 5: U ← L                                            // start from the final states
 6: while U ≠ ∅ do
 7:     take B ∈ U and U ← U \ {B}                   // get an unexplored class
 8:     for all σ(q_1, ..., q_k) ∈ Σ(Q) such that µ^(1)(σ(q_1, ..., q_k)) ∈ B do
 9:         for all i ∈ [1, k] such that [q_i]_≅ ∉ L do
10:             c ← σ(q_1, ..., q_{i−1}, □, q_{i+1}, ..., q_k)      // prepare context
11:             L ← L ∪ {[q_i]_≅}; U ← U ∪ {[q_i]_≅}               // add class to L and U
12:             sol([q_i]_≅) ← sol(B)[c]             // add transition to target block's sign of life
13:             λ(q) ← λ(µ^(1)(c[q])) · µ^(2)(c[q]) for all q ∈ [q_i]_≅   // multiply transition weight
14: return (L, sol, λ)
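The listing can be sketched in Python over the rational numbers as follows. The encoding (transitions as a dictionary, equivalence classes as frozensets, contexts as nested tuples with a `BOX` marker) and all function names are assumptions made for illustration. The test data is only a fragment of the running example: it contains the three transitions that are quoted in Examples 3.4 and 4.4, not the complete automaton of Figure 1.

```python
from collections import defaultdict
from fractions import Fraction

BOX = "□"  # marks the hole of a context

def plug(context, tree):
    """Return context[tree], i.e. replace the BOX in context by tree."""
    if context == BOX:
        return tree
    if isinstance(context, str):               # a state leaf stays as it is
        return context
    symbol, children = context
    return (symbol, tuple(plug(child, tree) for child in children))

def compute_sol(mu, final, block_of):
    """mu: (symbol, sources) -> (target, weight); block_of: state -> frozenset."""
    incoming = defaultdict(list)               # transitions grouped by target state
    for (symbol, sources), (target, _) in mu.items():
        incoming[target].append((symbol, sources))
    live = {block_of[q] for q in final}                       # Line 1
    sol = {block: BOX for block in live}                      # Line 3
    lam = {q: Fraction(1) for block in live for q in block}   # Line 4
    unexplored = list(live)                                   # Line 5
    while unexplored:                                         # Line 6
        block = unexplored.pop()                              # Line 7
        for target in block:
            for symbol, sources in incoming[target]:          # Line 8
                for i, q_i in enumerate(sources):             # Line 9
                    cls = block_of[q_i]
                    if cls in live:
                        continue
                    left, right = sources[:i], sources[i + 1:]
                    context = (symbol, left + (BOX,) + right)  # Line 10
                    live.add(cls)                              # Line 11
                    unexplored.append(cls)
                    sol[cls] = plug(sol[block], context)       # Line 12
                    for q in cls:                              # Line 13
                        # defined for every q in cls if block_of is a congruence
                        q_target, weight = mu[(symbol, left + (q,) + right)]
                        lam[q] = lam[q_target] * weight
    return live, sol, lam

# Fragment of the running example (only the transitions quoted in the text).
mu = {
    ("gamma", ("qb",)): ("qf", Fraction(8)),
    ("gamma", ("q2",)): ("qf", Fraction(2)),
    ("sigma", ("qb", "qf")): ("q2", Fraction(4)),
}
top, bottom = frozenset({"q1", "qf"}), frozenset({"q2", "qb"})
block_of = {"q1": top, "qf": top, "q2": bottom, "qb": bottom}
live, sol, lam = compute_sol(mu, {"q1", "qf"}, block_of)
print(sol[bottom], lam)
```

On this fragment the sketch reproduces the sign of life γ(□) and the pushing weights of Example 3.4. Grouping the transitions by target state is one possible way to visit each transition once per explored block; whether the sketch matches the stated running time has not been analysed.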
Proof. We already argued that the initialization runs in time $O(|F|) \subseteq O(|Q|)$. It is
easy to see that $U \subseteq L$ at all times in the main loop [Lines 6–13] of the algorithm.
Consequently, each block can be added at most once to $U$ since it is added at the same time to
$L$ and only blocks not in $L$ can be added to $U$. This yields that the main loop executes at
most $|(Q/\cong)| \le |Q|$ times. The inner loop [Lines 9–13] can execute at most $|M|$ times
since each transition is considered at most once in the middle loop and at most once for each
source state of the transition. The statements in the inner loop all execute in constant time
except for Line 13, which can be executed once for each state $q \in Q$. Overall, we thus obtain
the running time $O(|M| + |Q|)$.
Now let us prove the post-conditions. By Lemma 3.2 we know that signs of life are shared between
elements in an equivalence class of $\cong$. The remaining statements are proved by induction
along the outer main loop. Initially, we set
\[ \lambda(q) = 1 = h_\mu^{(2)}(q) = h_\mu^{(2)}(\Box[q]) \]
by Lines 3–4, which proves the post-condition because $\mathrm{sol}([q]_\cong) = \Box$. In the
main loop, we set $\lambda(q) = \lambda(\mu^{(1)}(c[q])) \cdot \mu^{(2)}(c[q])$ in Line 13. The
equivalence class of $q' = \mu^{(1)}(c[q])$ has already been explored in a previous iteration
because $q \cong q_i$, which by the congruence property yields
$\mu^{(1)}(c[q]) \cong \mu^{(1)}(c[q_i])$ and the latter was in the explored equivalence class $B$,
which in turn yields that the former is in $B$. Consequently, we can employ the induction
hypothesis and obtain $\lambda(q') = h_\mu^{(2)}(\mathrm{sol}(B)[q'])$. In addition,
\begin{align*}
\lambda(q) = \lambda(q') \cdot \mu^{(2)}(c[q])
  &= h_\mu^{(2)}(\mathrm{sol}(B)[q']) \cdot \mu^{(2)}(c[q]) \\
  &= h_\mu^{(2)}(\mathrm{sol}(B)[c[q]]) = h_\mu^{(2)}((\mathrm{sol}(B)[c])[q]) ,
\end{align*}
which proves the post-condition because
$\mathrm{sol}([q]_\cong) = \mathrm{sol}([q_i]_\cong) = \mathrm{sol}(B)[c]$ by Line 12. Clearly,
$\mathrm{sol}([q]_\cong)$ is a sign of life for $q$, which proves that $q$ is live. Finally,
suppose that there
Figure 1: Dwta over the rational numbers before (left) and after (right) pushing.
is a live state $q \in Q$ such that $[q]_\cong \notin L$ (i.e., we assume a live state that is
not classified as such by Algorithm 1). Since it is live, it has a sign of life
$c \in C_\Sigma(Q)$. By induction on $c$ we can prove that, when processing $c[q]$, there exists
a transition that uses a source state $q_i$ such that $[q_i]_\cong \notin L$, whereas the target
state $q'$ is such that $[q']_\cong \in L$.⁵ However, since $[q']$ was explored, the considered
transition was considered in the algorithm, which means that the equivalence class $[q_i]_\cong$
was added to $L$. This contradicts the assumption, which shows that all states that are not
represented in $L$ are indeed dead.
Example 3.4. Our example dwta $N = (Q, \Sigma, \mu, F)$ is depicted left in Figure 1. For any
transition (small circle, the annotation specifies the input symbol and the weight separated
by a colon), the arrow leads to the target state and the source states $q_1, \dots, q_k$ have
been arranged in a counter-clockwise fashion starting from the target arrow. As usual, final
states are doubly circled. The graphical representation of wta is explained in detail in [Bor05].
The coarsest congruence $\cong$ respecting $F = \{q_1, q_f\}$ is represented by the set
$\{\{q_1, q_f\}, \{q_2, q_b\}\}$ of equivalence classes (i.e., partition). We use this congruence
in Algorithm 1. First, the block $F$ of final states is marked as live and added to $U$. It is
assigned the trivial context $\Box$ as sign of life and each final state is assigned the trivial
weight 1. Clearly, we can only select one equivalence class $B = F$ in the main loop. Let us
consider the transition $\gamma(q_b) \to q_f$, whose target state $q_f$ is in $B$. Since
$[q_b]_\cong = \{q_2, q_b\}$ has not yet been marked as live, we add it to both $L$ and $U$. In
addition, we set its sign of life to $\gamma(\Box)$. Finally, we set the pushing weights to
$\lambda(q_b) = \lambda(q_f) \cdot \mu^{(2)}(\gamma(q_b) \to q_f) = 8$ and
$\lambda(q_2) = \lambda(q_f) \cdot \mu^{(2)}(\gamma(q_2) \to q_f) = 2$.
Now all states are live, so the loops will terminate. Consequently, we have computed all
signs of life and the pushing weights
\[ \lambda(q_1) = \lambda(q_f) = 1 , \qquad \lambda(q_2) = 2 , \qquad \text{and} \qquad \lambda(q_b) = 8 . \]
4. Pushing
The Myhill-Nerode congruence requires that there is a unique scaling factor for every
pair $(q, q')$ of equivalent states. Thus, any fixed sign of life $c$ for both $q$ and $q'$ [for
which $\chi_F(h_\mu^{(1)}(c[q])) = 1 = \chi_F(h_\mu^{(1)}(c[q']))$] yields non-zero weights
$h_\mu^{(2)}(c[q])$ and $h_\mu^{(2)}(c[q'])$, which can be used to determine this unique scaling
factor between $q$ and $q'$. In fact, we already computed those weights. Clearly, states that are
not weakly equivalent (and thus might not have the same sign of life after executing Algorithm 1
with $\sim_M$) also cannot be equivalent.
⁵ Such a switch must exist because all the final states are represented in L.
For the remaining pairs of live states, we computed a sign of life $\mathrm{sol}([q]_{\sim_M})$
for the equivalence class $[q]_{\sim_M}$ of $q$ in the previous section. In addition, we computed
pushing weights $\lambda(q)$ and $\lambda(q')$. Now, we will use these weights to normalize the
wta by pushing [Moh97, Eis03, PG09].
Intuitively, pushing cancels the scaling factor for equivalent states, which we will prove in
the next section. In general, it just redistributes weights along the transitions. In weighted
(finite-state) string automata [Sak09], pushing is performed from the final states towards
the initial states [Moh97]. Since we work with bottom-up wta [Bor05] (i.e., our notion of
determinism is bottom-up), this works analogously here by moving weights from the root
towards the leaves. However, we introduce our notion of pushing for arbitrary, not necessarily
deterministic wta.
In this section, let M = (Q, Σ, µ, F ) be an arbitrary wta and λ : Q → S \ {0}
be an arbitrary mapping such that λ(q) = 1 for every q ∈ F .
Definition 4.1 (see [Moh97, page 296]). The pushed wta $\mathrm{push}_\lambda(M)$ is
$(Q, \Sigma, \mu', F)$ such that for every $\sigma \in \Sigma_k$ and $q, q_1, \dots, q_k \in Q$
\[ \mu'(\sigma(q_1, \dots, q_k) \to q)
   = \lambda(q) \cdot \mu(\sigma(q_1, \dots, q_k) \to q) \cdot \prod_{i=1}^{k} \lambda(q_i)^{-1} . \]
The mapping $\lambda$ indicates the pushed weights. It is non-zero everywhere and has to be 1
for final states because our model does not have final weights. In the pushed wta
$\mathrm{push}_\lambda(M)$, the weight of every transition leading to the state $q \in Q$ is
obtained from the weight of the corresponding transition in $M$ by multiplying with the weight
$\lambda(q)$. To compensate, the weight of every transition leaving the state $q$ cancels the
weight $\lambda(q)$ by multiplying with $\lambda(q)^{-1}$. Thus, we expect an equivalent wta after
pushing, which we confirm by showing that $M$ and $\mathrm{push}_\lambda(M)$ are indeed equivalent.
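The definition of the pushed wta can be sketched as follows for a (not necessarily deterministic) wta over the rational numbers. The toy automaton, the weights `lam`, and the helper names are hypothetical. The evaluation function assumes the recursion for $h_\mu$ that is used in the proof of Proposition 4.2, and it enumerates all state tuples, so it is meant for checking small examples only.

```python
from fractions import Fraction
from itertools import product

def push(mu, lam):
    """Return mu' with mu'(s(q1..qk) -> q) = lam(q) * mu(...) * prod_i lam(qi)^-1."""
    pushed = {}
    for (symbol, sources, target), weight in mu.items():
        new_weight = lam[target] * weight
        for q_i in sources:
            new_weight /= lam[q_i]            # lam is non-zero everywhere
        pushed[(symbol, sources, target)] = new_weight
    return pushed

def run_weights(mu, states, tree):
    """Return a dict q -> h_mu(tree -> q); a tree is (symbol, tuple of subtrees)."""
    symbol, children = tree
    below = [run_weights(mu, states, child) for child in children]
    result = {}
    for q in states:
        total = Fraction(0)
        for sources in product(states, repeat=len(children)):
            weight = mu.get((symbol, sources, q), Fraction(0))
            for child_weights, q_i in zip(below, sources):
                weight *= child_weights[q_i]
            total += weight
        result[q] = total
    return result

def recognize(mu, states, final, tree):
    """Sum of the weights of all runs on tree that end in a final state."""
    weights = run_weights(mu, states, tree)
    return sum((weights[q] for q in final), Fraction(0))

# Hypothetical nondeterministic wta; lam is 1 on the final state f.
states, final = ("p", "f"), {"f"}
mu = {
    ("alpha", (), "p"): Fraction(2),
    ("alpha", (), "f"): Fraction(1),
    ("gamma", ("p",), "f"): Fraction(3),
    ("gamma", ("f",), "f"): Fraction(5),
}
lam = {"p": Fraction(4), "f": Fraction(1)}
pushed = push(mu, lam)
tree = ("gamma", (("alpha", ()),))
print(recognize(mu, states, final, tree), recognize(pushed, states, final, tree))
```

Both automata assign the same weight to the sample tree, although individual transition weights differ (for instance, the transition γ(p) → f changes from 3 to 3/4).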
Proposition 4.2 (see [Moh97, Lemma 4]). The wta $M$ and $\mathrm{push}_\lambda(M)$ are equivalent.
Moreover, if $M$ is deterministic, then so is $\mathrm{push}_\lambda(M)$.
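Proof. Let $\mathrm{push}_\lambda(M) = M' = (Q, \Sigma, \mu', F)$. The preservation of determinism
is obvious because $\mathrm{supp}(\mu') \subseteq \mathrm{supp}(\mu)$.⁶ We prove that
$h_{\mu'}(t \to q) = \lambda(q) \cdot h_\mu(t \to q)$ for every $t \in T_\Sigma$ and $q \in Q$ by
induction on $t$. Let $t = \sigma(t_1, \dots, t_k)$ for some $\sigma \in \Sigma_k$ and
$t_1, \dots, t_k \in T_\Sigma$. By the induction hypothesis, we have
$h_{\mu'}(t_i \to q_i) = \lambda(q_i) \cdot h_\mu(t_i \to q_i)$ for every $i \in [1, k]$ and
$q_i \in Q$. Consequently,
\begin{align*}
h_{\mu'}(t \to q)
  &= \sum_{q_1, \dots, q_k \in Q} \mu'(\sigma(q_1, \dots, q_k) \to q) \cdot \prod_{i=1}^{k} h_{\mu'}(t_i \to q_i) \\
  &= \sum_{q_1, \dots, q_k \in Q} \lambda(q) \cdot \mu(\sigma(q_1, \dots, q_k) \to q) \cdot
     \prod_{i=1}^{k} \lambda(q_i)^{-1} \cdot \prod_{i=1}^{k} \bigl(\lambda(q_i) \cdot h_\mu(t_i \to q_i)\bigr) \\
  &= \lambda(q) \cdot h_\mu(t \to q) .
\end{align*}
We complete the proof as follows.
\[ M'(t) = \sum_{q \in F} h_{\mu'}(t \to q) = \sum_{q \in F} \lambda(q) \cdot h_\mu(t \to q)
         = \sum_{q \in F} h_\mu(t \to q) = M(t) \]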
Alg. 2 Overall structure of our minimization algorithm; see [Mal09] for details. Note that
the final merging is performed on the input dwta M. The alphabetic dwta N is only needed
to compute the equivalence ≡_M.
Require: a dwta M with states Q
Ensure: return a minimal, equivalent dwta
 1: ∼_M ← ComputeCoarsestCongruence(M, Q × Q)        // complexity: O(|M| log |Q|)
 2: (L, sol, λ) ← ComputeSoL(M, ∼_M)                 // complexity: O(|M|)
 3: N ← alph(push_λ(M))                              // complexity: O(|M|)
 4: ≡_M ← ComputeCoarsestCongruence(N, ∼_M)          // complexity: O(|M| log |Q|)
 5: return MergeStates(M, ≡_M, λ)                    // complexity: O(|M|)
Theorem 4.3. The wta $\mathrm{push}_\lambda(M)$ is equivalent to $M$ and can be obtained in time $O(|M|)$.
Example 4.4. Let us return to our example dwta $N$ left in Figure 1 and perform pushing.
The pushing weights $\lambda$ are given in Example 3.4. We consider the transition
$\sigma(q_b, q_f) \to q_2$, which has weight 4 in $N$. In $\mathrm{push}_\lambda(N)$ this
transition has the weight
\[ \lambda(q_2) \cdot \mu(\sigma(q_b, q_f) \to q_2) \cdot \lambda(q_b)^{-1} \cdot \lambda(q_f)^{-1}
   = 2 \cdot 4 \cdot 8^{-1} \cdot 1^{-1} = 1 . \]
The dwta $\mathrm{push}_\lambda(N)$ is presented right in Figure 1. With a little effort, we can
confirm that $q_2$ and $q_b$ are equivalent in $\mathrm{push}_\lambda(N)$, whereas $q_1$ and $q_f$
are not.
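The arithmetic of this example can be replayed with exact rational numbers. The sketch below uses only the three transition weights that are quoted in Examples 3.4 and 4.4; the helper name `pushed_weight` is an assumption, and the remaining transitions of Figure 1 are not reproduced.

```python
from fractions import Fraction

# Pushing weights from Example 3.4.
lam = {"q1": Fraction(1), "qf": Fraction(1), "q2": Fraction(2), "qb": Fraction(8)}

def pushed_weight(weight, sources, target):
    """lam(target) * weight * product of lam(q)^-1 over the source states q."""
    result = lam[target] * Fraction(weight)
    for q in sources:
        result /= lam[q]
    return result

sigma = pushed_weight(4, ("qb", "qf"), "q2")   # sigma(qb, qf) -> q2 with weight 4
gamma_b = pushed_weight(8, ("qb",), "qf")      # gamma(qb) -> qf with weight 8
gamma_2 = pushed_weight(2, ("q2",), "qf")      # gamma(q2) -> qf with weight 2
print(sigma, gamma_b, gamma_2)
```

After pushing, the two γ-transitions that leave $q_2$ and $q_b$ carry the same weight, which is in line with the claim that these two states are equivalent with scaling factor 1 in the pushed automaton.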
5. Minimization
Our main application of weight pushing is efficient dwta minimization, which we present
next. The overall structure of our minimization procedure is presented in Algorithm 2. As
mentioned earlier, the coarsest congruence ∼M for a dwta M = (Q, Σ, µ, F ) that respects F
can be obtained by minimization [HMM09] of unw(M ).
Let $M = (Q, \Sigma, \mu, F)$ be a dwta such that $|Q| \le |M|$ and
$\lambda \colon Q \to S \setminus \{0\}$ be the pushing weights computed by Algorithm 1 when run
on $M$ and $\sim_M$.⁷ In addition, we let $\mathrm{push}_\lambda(M) = M' = (Q, \Sigma, \mu', F)$.
The dwta $M'$ has the property that
$(\mu')^{(2)}(\sigma(q_1, \dots, q_k)) = (\mu')^{(2)}(\sigma(q'_1, \dots, q'_k))$ for all
$\sigma \in \Sigma_k$ and states $q_i \equiv_M q'_i$ for every $i \in [1, k]$. We will prove this
property (5.1) in
Lemma 5.2. It is this property, which, in analogy to the string case [Moh97, Eis03], allows
us to compute the equivalence ≡M on an unweighted fta N , in which we treat the transition
weight as part of the input symbol. For example, the algorithm of [HMM09] can then be
used to compute ∼N . Finally, we merge the equivalent states using the information about
the scaling factors contained in the pushing weights λ in the same way as in [Mal09]. Let
us start with the formal definitions.⁸
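The idea of treating the transition weight as part of the input symbol can be sketched as follows. This is an assumed reading of the construction, since the formal definition is not fully reproduced in this excerpt: every transition on σ with weight s becomes a transition on the combined symbol (σ, s) with weight 1. The function name and the sample transitions (two pushed γ-transitions of the running example) are chosen for illustration.

```python
from fractions import Fraction

def alph(mu):
    """Assumed construction: move each transition weight into the input symbol.

    mu maps (symbol, sources) to (target, weight); the result maps
    ((symbol, weight), sources) to (target, 1).
    """
    return {
        ((symbol, weight), sources): (target, 1)
        for (symbol, sources), (target, weight) in mu.items()
    }

# Two transitions of the pushed running example, both with weight 1.
mu_pushed = {
    ("gamma", ("qb",)): ("qf", Fraction(1)),
    ("gamma", ("q2",)): ("qf", Fraction(1)),
}
print(alph(mu_pushed))
```

Because a deterministic left-hand side has at most one weight, the relabelled automaton stays deterministic and can be handed to an unweighted minimization procedure.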
Definition 5.1. Let M = (Q, Σ, µ, F ) be a dwta, and let S ′ = {µ(τ ) | τ ∈ supp(µ)} be
the finite set of non-zero weights that occur as transition weights in M . The alphabetic
dwta alph(M ) for M is (Q, Σ × S ′ , µ′′ , F ), where
⁷ All trim wta (i.e., those wta without useless states) automatically fulfill the restriction |Q| ≤ |M|.
⁸ We avoid a change of the weight structure from our semifield to the Boolean semifield B since the
multiplicative submonoid induced by {0, 1} is isomorphic to the multiplicative monoid of B. Thus, our dwta
with weights in {0, 1} compute in the same manner as a dwta over B or equivalently a deterministic fta.
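both $p_j$ and $p'_j$. Moreover, we have a constant scaling factor between the equivalent states
$q_j$ and $q'_j$, which yields
\begin{align}
\frac{\lambda(q_j)}{\lambda(q'_j)}
  &\overset{(\dagger)}{=} \frac{h_\mu^{(2)}(c[c_j[q_j]])}{h_\mu^{(2)}(c[c_j[q'_j]])}
   \overset{(\ddagger)}{=} \frac{h_\mu^{(2)}(c[p_j]) \cdot \mu^{(2)}(c_j[q_j])}{h_\mu^{(2)}(c[p'_j]) \cdot \mu^{(2)}(c_j[q'_j])} \tag{5.5} \\
\frac{\lambda(p_j)}{\lambda(p'_j)}
  &= \frac{h_\mu^{(2)}(c[p_j])}{h_\mu^{(2)}(c[p'_j])} , \tag{5.6}
\end{align}
where $(\dagger)$ holds because $c[c_j]$ is a sign of life for both $q_j$ and $q'_j$ and
$(\ddagger)$ holds essentially by definition. With these equations, let us inspect the main
equality.
\begin{align*}
& \frac{\lambda(\mu^{(1)}(c_j[q_j])) \cdot \mu^{(2)}(c_j[q_j]) \cdot \prod_{i=1}^{j-1} \lambda(q'_i)^{-1} \cdot \prod_{i=j}^{k} \lambda(q_i)^{-1}}
       {\lambda(\mu^{(1)}(c_j[q'_j])) \cdot \mu^{(2)}(c_j[q'_j]) \cdot \prod_{i=1}^{j} \lambda(q'_i)^{-1} \cdot \prod_{i=j+1}^{k} \lambda(q_i)^{-1}} \\
&\quad = \frac{\lambda(p_j) \cdot \mu^{(2)}(c_j[q_j]) \cdot \lambda(q_j)^{-1}}{\lambda(p'_j) \cdot \mu^{(2)}(c_j[q'_j]) \cdot \lambda(q'_j)^{-1}}
  \overset{(5.6)}{=} \frac{h_\mu^{(2)}(c[p_j]) \cdot \mu^{(2)}(c_j[q_j])}{h_\mu^{(2)}(c[p'_j]) \cdot \mu^{(2)}(c_j[q'_j])} \cdot \frac{\lambda(q'_j)}{\lambda(q_j)}
  \overset{(5.5)}{=} 1
\end{align*}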
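Now we are ready to return to the proof obligation expressed in (5.1). We apply (5.4)
in total $k$ times to obtain the desired statement.
\begin{align*}
(\mu')^{(2)}(\sigma(q_1, \dots, q_k))
  &\overset{(5.2)}{=} \lambda(\mu^{(1)}(\sigma(q_1, \dots, q_k))) \cdot \mu^{(2)}(\sigma(q_1, \dots, q_k)) \cdot \prod_{i=1}^{k} \lambda(q_i)^{-1} \\
  &= \lambda(\mu^{(1)}(c_1[q_1])) \cdot \mu^{(2)}(c_1[q_1]) \cdot \prod_{i=1}^{0} \lambda(q'_i)^{-1} \cdot \prod_{i=1}^{k} \lambda(q_i)^{-1} \\
  &\overset{(5.4)}{=} \lambda(\mu^{(1)}(c_1[q'_1])) \cdot \mu^{(2)}(c_1[q'_1]) \cdot \prod_{i=1}^{1} \lambda(q'_i)^{-1} \cdot \prod_{i=2}^{k} \lambda(q_i)^{-1} \\
  &= \lambda(\mu^{(1)}(c_2[q_2])) \cdot \mu^{(2)}(c_2[q_2]) \cdot \prod_{i=1}^{1} \lambda(q'_i)^{-1} \cdot \prod_{i=2}^{k} \lambda(q_i)^{-1} \\
  &\;\;\vdots \\
  &\overset{(5.4)}{=} \lambda(\mu^{(1)}(c_k[q'_k])) \cdot \mu^{(2)}(c_k[q'_k]) \cdot \prod_{i=1}^{k} \lambda(q'_i)^{-1} \cdot \prod_{i=k+1}^{k} \lambda(q_i)^{-1} \\
  &= \lambda(\mu^{(1)}(\sigma(q'_1, \dots, q'_k))) \cdot \mu^{(2)}(\sigma(q'_1, \dots, q'_k)) \cdot \prod_{i=1}^{k} \lambda(q'_i)^{-1} \\
  &\overset{(5.3)}{=} (\mu')^{(2)}(\sigma(q'_1, \dots, q'_k)) ,
\end{align*}
which completes the proof.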
Theorem 5.3. We have ≡M = ∼N , where N = alph(M ′ ).
Proof. Lemma 5.2 shows that ≡M is a congruence of N that respects F . Since ∼N is
the coarsest congruence of N that respects F by [GS84, Theorem II.6.10], we obtain that
≡M ⊆ ∼N . The converse is simple to prove as states that are weakly equivalent in alph(M ′ )
share exactly the same signs of life with the scaling factor 1. Since the signs of life already
indicate the transition weights, we immediately obtain that such weakly equivalent states
in alph(M ′ ) have corresponding transitions with equal transition weights in M ′ , which proves
that those states are also equivalent in M ′ with the scaling factor 1. The latter statement
can then be used to prove that they are also equivalent in M (with a scaling factor that is
potentially different from 1).
The currently fastest dwta minimization algorithm [Mal09] runs in time O(|M | · |Q|).
Our approach, which relies on pushing and is presented in Algorithm 2, achieves the same
run-time O(|M | · log |Q|) as the fastest minimization algorithms for deterministic fta.
Corollary 5.4 (see Algorithm 2). For every dwta M = (Q, Σ, µ, F ), we can compute an
equivalent minimal dwta in time O(|M | log |Q|).
6. Testing equivalence
In this final section, we want to decide whether two given dwta are equivalent. To this end,
let $M = (Q, \Sigma, \mu, F)$ and $M' = (Q', \Sigma, \mu', F')$ be dwta. The overall approach is
presented in Alg. 3. First, we compute a correspondence $g \colon Q \to Q'$ between states. For
every $q \in Q$, we compute a tree $t \in T_\Sigma$, which is also called access tree for $q$,
such that $h_\mu^{(1)}(t) = q$. If no access tree exists, then $q$ is not reachable and can be
deleted. A dwta, in which all states are reachable, is called accessible. To avoid these details,
let us assume that $M$ and $M'$ are accessible, which can always be achieved in time
$O(|M| + |M'|)$. In this case, we can compute an access tree $a(q) \in T_\Sigma$ for every state
$q \in Q$ in time $O(|M|)$ using standard breadth-first search, in which we unfold each state
(i.e., explore all transitions leading to it) at most once. To keep the representation efficient,
we store the access trees in the format $\Sigma(Q)$, where each state $q \in Q$ refers to its
access tree $a(q)$. To obtain the corresponding state $g(q)$, we compute the state of $Q'$ that is
reached when processing the access tree $a(q)$. Formally, $g(q) = h_{\mu'}^{(1)}(a(q))$ for every
$q \in Q$. This computation can also be achieved in time $O(|M|)$ since we can reuse the results
for the subtrees. Consequently, we have that $h_\mu^{(1)}(a(q)) = q$ and
$h_{\mu'}^{(1)}(a(q)) = g(q)$ for every $q \in Q$. Clearly, the computation of the access trees
$a \colon Q \to T_\Sigma$ and the correspondence $g \colon Q \to Q'$ can be performed in time
$O(|M|)$. Next, we compute the coarsest congruences $\sim_M$ and $\sim_{M'}$ for $M$ and $M'$ that
respect $F$ and $F'$, respectively, and the signs of life for $M$.
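The computation of the access trees and of the correspondence can be sketched as follows. The two toy automata and the function names are hypothetical, and the access trees are found by a naive fixed-point iteration instead of the linear-time breadth-first search mentioned above, so the sketch illustrates the data flow only.

```python
from fractions import Fraction

def access_trees(mu):
    """Map each reachable state q to an element (symbol, sources) of Sigma(Q) such
    that all source states already have access trees and the transition leads to q."""
    access = {}
    changed = True
    while changed:                       # naive fixed point, not linear time
        changed = False
        for (symbol, sources), (target, _) in mu.items():
            if target not in access and all(q in access for q in sources):
                access[target] = (symbol, sources)
                changed = True
    return access                        # insertion order: sources before targets

def correspondence(access, mu_prime):
    """g(q): state of M' reached on the access tree of q (None if undefined)."""
    g = {}
    for q, (symbol, sources) in access.items():
        images = tuple(g[p] for p in sources)      # results for subtrees are reused
        entry = None if None in images else mu_prime.get((symbol, images))
        g[q] = entry[0] if entry is not None else None
    return g

# Hypothetical dwta M and M' over the same ranked alphabet.
mu = {
    ("alpha", ()): ("p", Fraction(1)),
    ("gamma", ("p",)): ("q", Fraction(2)),
    ("sigma", ("p", "q")): ("q", Fraction(3)),
}
mu_prime = {
    ("alpha", ()): ("x", Fraction(4)),
    ("gamma", ("x",)): ("y", Fraction(1, 2)),
    ("sigma", ("x", "y")): ("y", Fraction(3)),
}
access = access_trees(mu)
g = correspondence(access, mu_prime)
print(access, g)
```

The dictionary `access` stores each access tree in the compact format described above: the entry for a state names one transition whose source states refer to their own entries.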
Lemma 6.1. Let $M$ and $M'$ be equivalent. The correspondence $g \colon Q \to Q'$ is compatible
with the congruences $\sim_M$ and $\sim_{M'}$; i.e., $g(q) \sim_{M'} g(p)$ if and only if
$q \sim_M p$ for all $q, p \in Q$. Moreover, for every reachable $q' \in Q'$ there exists
$q \in Q$ such that $g(q) \in [q']_{\sim_{M'}}$. Consequently, $g$ induces a bijection
$g \colon (Q/\sim_M) \to (Q'/\sim_{M'})$ on the equivalence classes.
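Proof. Let $q, p \in Q$, and let $t = a(q)$ and $u = a(p)$ be the corresponding access trees. Then
\begin{align*}
q \sim_M p
  &\iff \{c \in C_\Sigma(Q) \mid h_\mu^{(1)}(c[q]) \in F\} = \{c \in C_\Sigma(Q) \mid h_\mu^{(1)}(c[p]) \in F\} \\
  &\iff \{c \in C_\Sigma \mid h_\mu^{(1)}(c[q]) \in F\} = \{c \in C_\Sigma \mid h_\mu^{(1)}(c[p]) \in F\} && (\star) \\
  &\iff \{c \in C_\Sigma \mid c[t] \in \mathrm{supp}(M)\} = \{c \in C_\Sigma \mid c[u] \in \mathrm{supp}(M)\} && (\dagger) \\
  &\iff \{c \in C_\Sigma \mid c[t] \in \mathrm{supp}(M')\} = \{c \in C_\Sigma \mid c[u] \in \mathrm{supp}(M')\} && (\text{since } M = M') \\
  &\iff \{c \in C_\Sigma \mid h_{\mu'}^{(1)}(c[g(q)]) \in F'\} = \{c \in C_\Sigma \mid h_{\mu'}^{(1)}(c[g(p)]) \in F'\} && (\dagger) \\
  &\iff \{c \in C_\Sigma(Q) \mid h_{\mu'}^{(1)}(c[g(q)]) \in F'\} = \{c \in C_\Sigma(Q) \mid h_{\mu'}^{(1)}(c[g(p)]) \in F'\} && (\star) \\
  &\iff g(q) \sim_{M'} g(p) ,
\end{align*}
where $(\star)$ follows from [Mal08, Lemma 4] and $(\dagger)$ follows from the easy fact that
$h_\mu^{(1)}(c[q]) \in F$ if and only if $c[t] \in \mathrm{supp}(M)$ for all $q \in Q$ and
$t \in T_\Sigma$ such that $h_\mu^{(1)}(t) = q$.
For the second statement, let $q' \in Q'$ be a reachable state, and let $t \in T_\Sigma$ be such
that $h_{\mu'}^{(1)}(t) = q'$. Clearly, we have
\begin{align*}
\{c \in C_\Sigma \mid h_{\mu'}^{(1)}(c[q']) \in F'\}
  &\overset{(\dagger)}{=} \{c \in C_\Sigma \mid c[t] \in \mathrm{supp}(M')\} = \{c \in C_\Sigma \mid c[t] \in \mathrm{supp}(M)\} \\
  &\overset{(\dagger)}{=} \{c \in C_\Sigma \mid h_\mu^{(1)}(c[q]) \in F\} \overset{(\dagger)}{=} \{c \in C_\Sigma \mid c[a(q)] \in \mathrm{supp}(M)\} \\
  &= \{c \in C_\Sigma \mid c[a(q)] \in \mathrm{supp}(M')\} \overset{(\dagger)}{=} \{c \in C_\Sigma \mid h_{\mu'}^{(1)}(c[g(q)]) \in F'\}
\end{align*}
where $q = h_\mu^{(1)}(t)$. Consequently, using $(\star)$ we obtain $q' \sim_{M'} g(q)$.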
We just demonstrated that for equivalent dwta the correspondence $g$ always yields a
bijection $g \colon (Q/\sim_M) \to (Q'/\sim_{M'})$. We can test the compatibility in time
$O(|Q| + |Q'|)$. Next we transfer the signs of life via $g$ to the equivalence classes of
$\sim_{M'}$ and calculate the corresponding pushing weights for all states $q' \in Q'$. Since the
signs of life can contain states of $Q$, we need to rename them using the correspondence $g$, so
we use the function $\mathrm{ren}_g \colon T_\Sigma(Q \cup \{\Box\}) \to T_\Sigma(Q' \cup \{\Box\})$,
which is defined by $\mathrm{ren}_g(\Box) = \Box$, $\mathrm{ren}_g(q) = g(q)$ for all $q \in Q$,
and $\mathrm{ren}_g(\sigma(t_1, \dots, t_k)) = \sigma(\mathrm{ren}_g(t_1), \dots, \mathrm{ren}_g(t_k))$
for all $\sigma \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma(Q \cup \{\Box\})$.
We note that $\mathrm{ren}_g(c) \in C_\Sigma(Q')$ for all $c \in C_\Sigma(Q)$.
Using this approach corresponding equivalence classes receive the same sign of life. We
then minimize $M$ and $M'$ using the method of Section 5 (i.e., we perform pushing followed by
unweighted minimization). Finally, we test the obtained deterministic fta for isomorphism.
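The renaming of signs of life is a plain structural recursion, as the following sketch shows. The tree encoding (state names as strings, inner nodes as pairs, a `BOX` marker for the hole) and the sample correspondence are assumptions for illustration; the sketch further assumes that no state is named like the hole marker.

```python
BOX = "□"  # marks the hole of a context

def rename(tree, g):
    """Apply ren_g: keep the hole, replace every state q by g(q), recurse otherwise."""
    if tree == BOX:
        return BOX
    if isinstance(tree, str):            # a state of Q used as a leaf
        return g[tree]
    symbol, children = tree
    return (symbol, tuple(rename(child, g) for child in children))

g = {"q1": "p1", "q2": "p2"}             # hypothetical correspondence Q -> Q'
context = ("sigma", ("q1", ("gamma", (BOX,)), "q2"))
print(rename(context, g))
```

The hole is preserved and occurs exactly once in the result, so a context over $Q$ is mapped to a context over $Q'$.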
Lemma 6.2. We use the symbols of Algorithm 3. Given a compatible correspondence $g$, the
dwta $M$ and $M'$ are equivalent if and only if the deterministic (unweighted) fta
$\mathrm{alph}(\mathrm{push}_\lambda(M))$ and $\mathrm{alph}(\mathrm{push}_{\lambda'}(M'))$ are
equivalent.
Proof. Clearly, if the deterministic fta $\mathrm{alph}(\mathrm{push}_\lambda(M))$ and
$\mathrm{alph}(\mathrm{push}_{\lambda'}(M'))$ are equivalent, then also $\mathrm{push}_\lambda(M)$
and $\mathrm{push}_{\lambda'}(M')$ are equivalent since the weights are annotated on the symbols
of the former devices. Moreover, since pushing preserves the semantics (see Proposition 4.2),
also the dwta $M$ and $M'$ are equivalent, which concludes one direction.
For the other direction, let $M$ and $M'$ be equivalent. Then also $\mathrm{push}_\lambda(M)$ and
$\mathrm{push}_{\lambda'}(M')$ are equivalent due to Proposition 4.2. An easy adaptation of the
proof (of the equality (5.1) of the transition weights) of Lemma 5.2 can be used to show that the
transition weights of corresponding transitions are equal and hence
$\mathrm{alph}(\mathrm{push}_\lambda(M))$ and $\mathrm{alph}(\mathrm{push}_{\lambda'}(M'))$ are
equivalent.
Lemma 6.2 proves the correctness of Algorithm 3 because the minimal deterministic
fta for a given tree language is unique (up to isomorphism) [GS84, Theorem 2.11.12]. The
run-time of our algorithm should be compared to the previously (asymptotically) fastest
equivalence test for dwta of [DHM11], which runs in time O(|M | · |M ′ |).
Theorem 6.3. We can test equivalence of M and M ′ in time O((|M |+|M ′ |)·log (|Q| + |Q′ |)).
Acknowledgments
The authors gratefully acknowledge the insight and suggestions provided by the reviewers
of the conference version.
References
[Ang87] Dana Angluin. Learning regular sets from queries and counterexamples. Information and
Computation, 75(2):87–106, 1987.
[BLB83] Symeon Bozapalidis and Olympia Louscou-Bozapalidou. The rank of a formal tree power series.
Theoretical Computer Science, 27(1–2):211–215, 1983.
[BMV10] Matthias Büchse, Jonathan May, and Heiko Vogler. Determinization of weighted tree automata
using factorizations. Journal of Automata, Languages and Combinatorics, 15(3–4):229–254,
2010.
[Bor03] Björn Borchardt. The Myhill-Nerode theorem for recognizable tree series. In Proc. 7th Int. Conf.
Developments in Language Theory, volume 2710 of Lecture Notes in Computer Science, pages
146–158. Springer, 2003.
[Bor05] Björn Borchardt. The Theory of Recognizable Tree Series. PhD thesis, TU Dresden, 2005.
[Boz99] Symeon Bozapalidis. Equational elements in additive algebras. Theory of Computing Systems,
32(1):1–33, 1999.
[BR82] Jean Berstel and Christophe Reutenauer. Recognizable formal power series on trees. Theoretical
Computer Science, 18(2):115–148, 1982.
[BV03] Björn Borchardt and Heiko Vogler. Determinization of finite state weighted tree automata.
Journal of Automata, Languages and Combinatorics, 8(3):417–463, 2003.
[CDG+07] Hubert Comon, Max Dauchet, Rémi Gilleron, Florent Jacquemard, Denis Lugiez, Christof
Löding, Sophie Tison, and Marc Tommasi. Tree automata techniques and applications. Available
on: http://tata.gforge.inria.fr/, 2007. version 2008-11-18.
[DGMM11] Manfred Droste, Doreen Götze, Steffen Märcker, and Ingmar Meinecke. Weighted tree automata
over valuation monoids and their characterization by weighted logics. In Bozapalidis Festschrift,
volume 7020 of Lecture Notes in Computer Science, pages 30–55. Springer, 2011.
[DHM11] Frank Drewes, Johanna Högberg, and Andreas Maletti. MAT learners for tree series — an
abstract data type and two realizations. Acta Informatica, 48(3):165–189, 2011.
[DV06] Manfred Droste and Heiko Vogler. Weighted tree automata and weighted logics. Theoretical
Computer Science, 366(3):228–247, 2006.
[Eis03] Jason Eisner. Simpler and more general minimization for weighted finite-state automata. In
Proc. 2003 Human Language Technology Conf., pages 64–71. Association for Computational
Linguistics, 2003.
[FV09] Zoltán Fülöp and Heiko Vogler. Weighted tree automata and tree transducers. In Handbook of
Weighted Automata, EATCS Monographs in Theoretical Computer Science, chapter 9, pages
313–403. Springer, 2009.
[Gol99] Jonathan S. Golan. Semirings and their Applications. Kluwer Academic Publishers, 1999.
[GS84] Ferenc Gécseg and Magnus Steinby. Tree Automata. Akadémiai Kiadó, Budapest, 1984. 2nd
edition available at: https://arxiv.org/abs/1509.06233.
[GS97] Ferenc Gécseg and Magnus Steinby. Tree languages. In Handbook of Formal Languages, volume 3,
chapter 1, pages 1–68. Springer, 1997.
[HMM07] Johanna Högberg, Andreas Maletti, and Jonathan May. Bisimulation minimisation for weighted
tree automata. In Proc. 11th Int. Conf. Developments in Language Theory, volume 4588 of
Lecture Notes in Computer Science, pages 229–241. Springer, 2007.
[HMM09] Johanna Högberg, Andreas Maletti, and Jonathan May. Backward and forward bisimulation
minimization of tree automata. Theoretical Computer Science, 410(37):3539–3552, 2009.
[HW98] Uwe Hebisch and Hanns J. Weinert. Semirings – Algebraic Theory and Applications in Computer
Science. World Scientific, 1998.
[Jac11] Florent Jacquemard. Extended Tree Automata Models for the Verification of Infinite State
Systems. Mémoire d’habilitation à diriger des recherches, Laboratoire Spécification et Vérification,
ENS de Cachan, 2011.
[KM01] Nils Klarlund and Anders Møller. MONA Version 1.4 User Manual. BRICS, Department of
Computer Science, Aarhus University, 2001. Notes Series NS-01-1.
[KM09] Kevin Knight and Jonathan May. Applications of weighted automata in natural language
processing. In Handbook of Weighted Automata, EATCS Monographs in Theoretical Computer
Science, chapter 14, pages 571–596. Springer, 2009.
[Kui98] Werner Kuich. Formal power series over trees. In Proc. 3rd Int. Conf. Developments in Language
Theory, pages 61–101. Aristotle University of Thessaloniki, 1998.
[Mal08] Andreas Maletti. Minimizing deterministic weighted tree automata. In Proc. 2nd Int. Conf.
Language and Automata: Theory and Applications, volume 5196 of Lecture Notes in Computer
Science, pages 357–372. Springer, 2008.
[Mal09] Andreas Maletti. Minimizing deterministic weighted tree automata. Information and
Computation, 207(11):1284–1299, 2009.
[Man08] Eleni G. Mandrali. Weighted tree automata with discounting. Master’s thesis, Aristotle
University of Thessaloniki, 2008.
[MKV10] Jonathan May, Kevin Knight, and Heiko Vogler. Efficient inference through cascades of weighted
tree transducers. In Proc. 48th Annual Meeting of the ACL, pages 1058–1066. Association for
Computational Linguistics, 2010.
[Moh97] Mehryar Mohri. Finite-state transducers in language and speech processing. Computational
Linguistics, 23(2):269–311, 1997.
[PG09] Matt Post and Daniel Gildea. Weight pushing and binarization for fixed-grammar parsing. In
Proc. 11th Int. Workshop on Parsing Technologies, pages 89–98. Association for Computational
Linguistics, 2009.
[Sak09] Jacques Sakarovitch. Rational and recognisable power series. In Handbook of Weighted Automata,
EATCS Monographs in Theoretical Computer Science, chapter 4, pages 105–174. Springer, 2009.
[TW68] James W. Thatcher and Jesse B. Wright. Generalized finite automata theory with an application
to a decision problem of second-order logic. Mathematical Systems Theory, 2(1):57–81, 1968.
[VDH16] Heiko Vogler, Manfred Droste, and Luisa Herrmann. A weighted MSO logic with storage
behaviour and its Büchi-Elgot-Trakhtenbrot theorem. In Proc. 10th Int. Conf. Language and
Automata Theory and Applications, volume 9618 of Lecture Notes in Computer Science, pages
127–139. Springer, 2016.
[WDF+09] Christoph Weidenbach, Dilyana Dimova, Arnaud Fietzke, Rohit Kumar, Martin Suda, and
Patrick Wischnewski. Spass version 3.5. In Proc. 22nd Int. Conf. Automated Deduction, volume
5663 of Lecture Notes in Computer Science, pages 140–145. Springer, 2009.