Algebraic Elimination of epsilon-transitions

We describe here a method for removing the ε-transitions of a weighted automaton. The existence of a solution for this removal depends on the existence of the star of a single matrix which, in turn, is based on the computation of the stars of scalars in the ground semiring. We discuss two aspects of the star problem (by infinite sums and by equations) and give an algorithm that suppresses the ε-transitions while preserving the behaviour. Running time complexities are computed.


Introduction
Automata with multiplicities (or weighted automata) are a versatile class of transition systems which can model classical (boolean), stochastic and transducer automata alike, and can be applied to various purposes such as image compression, speech recognition, formal linguistics (including the automatic treatment of natural languages) and probabilistic modelling. For generalities on automata with multiplicities see [1] and [10]; problems concerning identities and decidability results on these objects can be found in [11], [12] and [13]. A particular type of these automata are the automata with ε-transitions, denoted by k-ε-automata, which result, for example, from the application of Thompson's method to transform a weighted regular expression into a weighted automaton [14]. The aim of this paper is to study the equivalence between k-ε-automata and k-automata. Indeed, we present here an algebraic method which computes, for a weighted automaton with ε-transitions (chosen in a suitable class), an equivalent weighted automaton without ε-transitions which has the same behaviour. Here, the closure of ε-transitions relies on the existence of the star of the transition matrix for ε. Its running time complexity is deduced from that of matrix multiplication in $k^{n\times n}$. In the case of well-known semirings (like the boolean and tropical ones), the closure is computed in $O(n^3)$ [15]. We fit the running time complexity to the case when k is a ring.
The structure of the paper is the following. We first recall (in Section 2) the notions of a semiring and the computation of the star of matrices. After introducing (in Section 3) the notions of a k-automaton and a k-ε-automaton, we present (in Sections 4 and 5) our main result, which is a method of elimination of ε-transitions, and show particular cases of series to which our result can be applied. In Section 6, we give the equivalence between the two types of automata and discuss its validity. A conclusion section ends the paper.

Semirings
In the following, a semiring $(k, \oplus, \otimes, 0_k, 1_k)$ is a set together with two laws and their neutrals. More precisely, $(k, \oplus, 0_k)$ is a commutative monoid with $0_k$ as neutral and $(k, \otimes, 1_k)$ is a monoid with $1_k$ as neutral. The product is distributive with respect to the addition and zero is an annihilator ($0_k \otimes x = x \otimes 0_k = 0_k$) [7]. For example, all rings are semirings, whereas $(\mathbb{N}, +, \times, 0, 1)$, the boolean semiring $\mathbb{B} = (\{0, 1\}, \vee, \wedge, 0, 1)$ and the tropical semiring $\mathbb{T} = (\mathbb{R}_+ \cup \{\infty\}, \min, +, \infty, 0)$ are well-known examples of semirings that are not rings. The star of a scalar is introduced by the following definition: an element $y \in k$ is a right (resp. left) star of $x \in k$ if $x \otimes y \oplus 1_k = y$ (resp. $y \otimes x \oplus 1_k = y$). If $y \in k$ is a left and right star of $x \in k$, we say that y is a star for x and we write $y = x^*$.
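As an illustration, the semiring axioms recalled above can be spot-checked on the three examples. The following is a minimal Python sketch and not part of the original method: the names `Semiring`, `BOOLEAN`, `TROPICAL`, `NATURALS` and `check_axioms` are hypothetical, and the check only tests finitely many sample elements.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Semiring(Generic[T]):
    """A semiring (k, ⊕, ⊗, 0_k, 1_k) given by its two laws and their neutrals."""
    add: Callable[[T, T], T]
    mul: Callable[[T, T], T]
    zero: T
    one: T


# Hypothetical encodings of the examples of the text.
BOOLEAN = Semiring(add=lambda x, y: x or y, mul=lambda x, y: x and y,
                   zero=False, one=True)
TROPICAL = Semiring(add=min, mul=lambda x, y: x + y,
                    zero=float("inf"), one=0.0)
NATURALS = Semiring(add=lambda x, y: x + y, mul=lambda x, y: x * y,
                    zero=0, one=1)


def check_axioms(k, samples):
    """Spot-check neutrality, annihilation and distributivity on sample elements."""
    for x in samples:
        assert k.add(x, k.zero) == x and k.add(k.zero, x) == x
        assert k.mul(x, k.one) == x and k.mul(k.one, x) == x
        assert k.mul(x, k.zero) == k.zero and k.mul(k.zero, x) == k.zero
        for y in samples:
            for z in samples:
                assert k.mul(x, k.add(y, z)) == k.add(k.mul(x, y), k.mul(x, z))
                assert k.mul(k.add(y, z), x) == k.add(k.mul(y, x), k.mul(z, x))
    return True
```

Note that in the tropical encoding the neutral of ⊕ is ∞ and the neutral of ⊗ is the real number 0, exactly as in the tuple (R_+ ∪ {∞}, min, +, ∞, 0).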

Example(s) 1
1. For $k = \mathbb{C}$, any complex number $x \neq 1$ has a unique star, which is $y = (1 - x)^{-1}$. In the case $|x| < 1$, we observe easily that $y = 1 + x + x^2 + \cdots$.

2. […] an infinite number of solutions exist for the right star (which is not a left star if $\alpha \neq 0$).

3. For $k = \mathbb{T}$ (tropical semiring), any number $x > 0$ has a unique star $y = 1_{\mathbb{T}}$ (that is, the real number 0).
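We can observe that if the opposite −x of x exists, then right (resp. left) stars of x are right (resp. left) inverses of $(1 \oplus (-x))$, and conversely. Thus, if they exist, any right star $x^{*_r}$ equals any left star $x^{*_l}$, as $x^{*_l} = x^{*_l}\big((1 \oplus (-x))\,x^{*_r}\big) = \big(x^{*_l}(1 \oplus (-x))\big)\,x^{*_r} = x^{*_r}$. In this case, the star is unique.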
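The examples above can be checked mechanically. The sketch below assumes that a right (resp. left) star of x is a solution y of x ⊗ y ⊕ 1 = y (resp. y ⊗ x ⊕ 1 = y), in line with the matrix equation MY + 1 = Y used for matrices; the helper names `is_right_star`, `is_left_star` and `field_star` are hypothetical, and exact rationals stand in for the field case.

```python
from fractions import Fraction


def is_right_star(add, mul, one, x, y):
    """True when y solves x ⊗ y ⊕ 1 = y."""
    return add(mul(x, y), one) == y


def is_left_star(add, mul, one, x, y):
    """True when y solves y ⊗ x ⊕ 1 = y."""
    return add(mul(y, x), one) == y


def field_star(x):
    """Star in a field, computed as the inverse of 1 - x (undefined for x = 1)."""
    return 1 / (1 - Fraction(x))


def plus(a, b):
    return a + b


def times(a, b):
    return a * b


# Field case: x = 1/3 has the star (1 - 1/3)^(-1) = 3/2.
x = Fraction(1, 3)
y = field_star(x)

# Tropical case: ⊕ is min, ⊗ is +, and the unit 1_T is the real number 0.
tropical_ok = is_right_star(min, plus, 0.0, 2.5, 0.0)
```

In (N, +, ×), the element 1 has no star, since y + 1 = y has no solution; the check `is_right_star(plus, times, 1, 1, y)` therefore fails for every natural number y.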
If n is a positive integer, then the set $k^{n\times n}$ of square matrices with coefficients in k has a natural semiring structure with the usual operations (sum and product).

ii) In [8] and [16], analogous formulas are expressed for the computation of the inverse of matrices when k is a division ring (this can be extended to the case of rings).

iii) The formulas described above are valid for matrices of any size with any block partitioning. Matrices of even size are often, in practice, partitioned into square blocks but, for matrices with odd dimensions, the approach called dynamic peeling is applied. More specifically, let $M \in k^{n\times n}$ be a matrix given by $M = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, where $n \in 2\mathbb{N} + 1$. Dynamic peeling [9] consists of cutting the matrix in the following way: $a_{11}$ is an $(n-1)\times(n-1)$ matrix, $a_{12}$ is an $(n-1)\times 1$ matrix, $a_{21}$ is a $1\times(n-1)$ matrix and $a_{22}$ is a scalar.
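The block approach, including dynamic peeling for odd sizes, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it takes k = Q (exact rationals, scalar star $(1-x)^{-1}$), it uses the block formulas $A_{11} = (a_{11} + a_{12}a_{22}^*a_{21})^*$, $A_{12} = a_{11}^*a_{12}A_{22}$, $A_{21} = a_{22}^*a_{21}A_{11}$, $A_{22} = (a_{22} + a_{21}a_{11}^*a_{12})^*$, and it assumes that every intermediate star exists. It calls the star recursively four times per level, so it does not reflect the operation count analysed in the text. All function names are hypothetical.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B):
    """Product of a (p x q) matrix by a (q x r) matrix."""
    return [[sum((A[i][t] * B[t][j] for t in range(len(B))), Fraction(0))
             for j in range(len(B[0]))] for i in range(len(A))]


def scalar_star(x):
    """Star of a rational x != 1, i.e. the inverse of 1 - x."""
    return Fraction(1) / (1 - x)


def mat_star(M):
    """Right star of M by recursive block decomposition."""
    n = len(M)
    if n == 1:
        return [[scalar_star(M[0][0])]]
    # Even size: two square halves. Odd size: dynamic peeling (a22 is a scalar).
    p = n - 1 if n % 2 else n // 2
    a11 = [row[:p] for row in M[:p]]
    a12 = [row[p:] for row in M[:p]]
    a21 = [row[:p] for row in M[p:]]
    a22 = [row[p:] for row in M[p:]]
    s11, s22 = mat_star(a11), mat_star(a22)
    A11 = mat_star(mat_add(a11, mat_mul(mat_mul(a12, s22), a21)))
    A22 = mat_star(mat_add(a22, mat_mul(mat_mul(a21, s11), a12)))
    A12 = mat_mul(mat_mul(s11, a12), A22)
    A21 = mat_mul(mat_mul(s22, a21), A11)
    return ([ra + rb for ra, rb in zip(A11, A12)]
            + [ra + rb for ra, rb in zip(A21, A22)])


def is_right_star(M, N):
    """Check the defining equation M N + 1 = N."""
    return mat_add(mat_mul(M, N), identity(len(M))) == N
```

A result N can be validated independently of the recursion by checking the defining equation with `is_right_star(M, N)`.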
Theorem 2 Let k be a semiring. The right (resp. left) star of a matrix of size $n \in \mathbb{N}$ can be computed in $O(n^{\omega})$ operations with:
• $\omega \leq 3$ if k is not a ring,
• $\omega \leq \log_2(7)$ if k is a ring,
• $\omega \leq 2.376$ if k is a field.

Proof. For $n = 2^m \in \mathbb{N}$, let $T^{+}_m$, $T^{\times}_m$ and $T^{*}_m$ denote the number of operations ⊕, ⊗ and * in k that the addition, the multiplication and the star of matrices respectively perform with an input of size n. Then, by Theorem 1, $T^{*}_m$ satisfies the recurrence relation (5). For an arbitrary semiring, the usual matrix product needs $O(n^3)$ operations. If k is a ring, using Strassen's algorithm for the matrix multiplication [19], it is known that at most $n^{\log_2(7)}$ operations are necessary. If k is a field, using Coppersmith and Winograd's algorithm [3], it is known that at most $n^{2.376}$ operations are necessary. Suppose that $T^{\times}_{m-1} = 2^{(m-1)\omega}$. The solution of the recurrence relation (5) then has leading term $2^{m\omega}$.

The running time complexity for the computation of the right (resp. left) star of a matrix depends on $T^{+}$, $T^{\times}$ and $T^{*}$, but it also depends on the machine representation of the coefficients. In the case $k = \mathbb{Z}$, for example, the multiplication of two integers is computed in $O(m \log(m) \log(\log(m)))$ using FFT if m bits are necessary [18].

Theorem 3 The space complexity of the right (resp. left) star of a matrix of size $n \in \mathbb{N}$ is $O(n^2 \log(n))$.

Proof. For $n = 2^m \in \mathbb{N}$ and k a semiring, let $E^{*}_m$ denote the space complexity of the star of a matrix with an input of size n. Then $E^{*}_m$ satisfies the recurrence relation (6), whose solution has leading term $m \cdot 4^m = n^2 \log_2(n)$.
The running of the algorithm needs the reservation of memory spaces for the resulting matrix (the star of the input matrix) and for intermediate results stored in temporary locations.
Let $k\langle\langle\Sigma\rangle\rangle$ be the set of noncommutative formal series with Σ as alphabet (i.e. functions on the free monoid $\Sigma^*$ with values in k). It is a semiring equipped with the sum + and the Cauchy product ·. We denote by α(?) and (?)α the left and right external products respectively. The star $(?)^*$ of a formal series is well defined if and only if the star of the constant term exists [10, 1]. The set $\mathrm{RAT}_k(\Sigma)$ is the closure of the alphabet Σ by the sum, the Cauchy product and the star.

Automata with multiplicities
Let Σ be a finite alphabet and k be a semiring. A weighted automaton (or linear representation) of dimension n on Σ with multiplicities in k is a triplet (λ, µ, γ) where:
• $\lambda \in k^{1\times n}$ (the input vector),
• $\mu : \Sigma \to k^{n\times n}$ (the transition function),
• $\gamma \in k^{n\times 1}$ (the output vector).

Such an automaton is usually drawn as a directed valued graph (see Figure 1). A transition $(i, a, j) \in \{1, \ldots, n\} \times \Sigma \times \{1, \ldots, n\}$ connects the state i with the state j. Its weight is $\mu(a)_{ij}$. The weight of the initial (respectively final) state i is $\lambda_i$ (respectively $\gamma_i$). The mapping µ induces a morphism of monoids from $\Sigma^*$ to $k^{n\times n}$.

The behaviour of the weighted automaton A belongs to $k\langle\langle\Sigma\rangle\rangle$. It is defined by: $\mathrm{behaviour}(A) = \sum_{u \in \Sigma^*} (\lambda\mu(u)\gamma)\, u$. More precisely, the weight ⟨behaviour(A), u⟩ of the word u in the formal series behaviour(A) is the weight of u for the k-automaton A (this is in accordance with the scalar product notation $\langle S|u\rangle := S(u)$ for any function $S : \Sigma^* \to k$).

Let u = aba. Then, its weight in A is $\lambda\mu(a)\mu(b)\mu(a)\gamma$.

The set $\mathrm{REC}_k(\Sigma)$ is known to be equal to the set of series which are the behaviour of a k-automaton. We recall the well-known result of Schützenberger [17]: $\mathrm{REC}_k(\Sigma) = \mathrm{RAT}_k(\Sigma)$.

A k-ε-automaton $A_\varepsilon$ is a k-automaton over the alphabet $\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}$ (see Figure 2). We must keep the reader aware that ε is considered here as a new letter, and that the free monoid $\Sigma_\varepsilon^* = (\Sigma \cup \{\varepsilon\})^*$ has its own empty word, distinct from this letter. The transition matrix of ε is denoted $\mu_\varepsilon$.
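The weight of a word, λµ(u)γ, is a plain product of a row vector, a sequence of matrices and a column vector. The sketch below computes it for a small automaton over N; the automaton is a hypothetical example chosen here (it is not the automaton of Figure 1), and the function name `weight` is an assumption.

```python
def weight(lam, mu, gam, word):
    """Weight of `word`: lam * mu(word) * gam, with mu extended as a morphism."""
    row = list(lam)
    for letter in word:
        M = mu[letter]
        # Row vector times matrix: one step along the transitions labelled `letter`.
        row = [sum(row[i] * M[i][j] for i in range(len(row)))
               for j in range(len(M))]
    return sum(r * g for r, g in zip(row, gam))


# Hypothetical 2-state automaton with multiplicities in N.
lam = [1, 0]                      # input vector
mu = {
    "a": [[1, 1], [0, 1]],
    "b": [[1, 0], [0, 2]],
}
gam = [0, 1]                      # output vector

w_aba = weight(lam, mu, gam, "aba")   # lam * mu(a) * mu(b) * mu(a) * gam
```

With these matrices the successive row vectors for u = aba are [1, 1], [1, 2] and [1, 3], so the weight of aba is 3, while the empty word has weight λγ = 0.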

Example(s) 3
In Figure 2, the behaviour of the automaton A ε is behaviour

Algebraic elimination
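Let Φ be the morphism from $\Sigma_\varepsilon^*$ to $\Sigma^*$ induced by $\Phi(a) = a$ for $a \in \Sigma$ and $\Phi(\varepsilon) = 1_{\Sigma^*}$ (the empty word). It is classical that the morphism Φ can be uniquely extended to the polynomials of $k\langle\Sigma_\varepsilon\rangle$ as a morphism of algebras $k\langle\Sigma_\varepsilon\rangle \to k\langle\Sigma\rangle$ by, for P a polynomial,
$$\Phi(P) = \sum_{u \in \Sigma^*} \Big(\sum_{\Phi(v)=u} \langle P|v\rangle\Big)\, u \qquad (7)$$
as, in this case, the inner sum
$$\sum_{\Phi(v)=u} \langle P|v\rangle \qquad (8)$$
is a finite-supported sum and then well defined. But we remark that the set of preimages of a word $u = a_1 a_2 \cdots a_n$ is $\Phi^{-1}(u) = \varepsilon^* a_1 \varepsilon^* a_2 \cdots \varepsilon^* a_n \varepsilon^*$. This shows that, in this case, all preimages are infinite, and we will discuss the convergence of the sum $\sum_{\Phi(v)=u} \langle P|v\rangle$.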
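To make the erasing morphism concrete, here is a small sketch on words and on polynomials. It rests on assumptions made for illustration only: words are Python strings, the letter ε is encoded by the character "e" (so Σ must not contain "e"), a polynomial is a dictionary from words to coefficients, and the names `phi_word`, `phi_poly` and `preimages` are hypothetical.

```python
from collections import defaultdict
from itertools import product

EPS = "e"  # hypothetical stand-in for the letter ε


def phi_word(v):
    """Erase every occurrence of the letter ε in the word v."""
    return v.replace(EPS, "")


def phi_poly(P):
    """Image of a polynomial: coefficients of words with the same image are added."""
    out = defaultdict(int)
    for v, coeff in P.items():
        out[phi_word(v)] += coeff
    return dict(out)


def preimages(u, max_eps):
    """Preimages of u with at most `max_eps` letters ε in each gap (a finite sample)."""
    result = []
    for counts in product(range(max_eps + 1), repeat=len(u) + 1):
        v = EPS * counts[0] + "".join(a + EPS * c for a, c in zip(u, counts[1:]))
        result.append(v)
    return result


P = {"ae": 2, "ea": 3, "b": 1, "e": 5}
image = phi_poly(P)
```

Here the words "ae" and "ea" both map to "a", so their coefficients are added. The function `preimages` only enumerates a bounded part of the preimage set, which is infinite as soon as ε may be repeated without bound.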
In the sequel, we will extend formula (7) in two ways:
1. To the series for which the sum (8) remains with finite support (this set is larger than the polynomials and also includes the behaviours of ε-automata with an acyclic ε-transition matrix). We will call them Φ-finite series (FF series).
2. Having supposed the semiring endowed with a topology (or, at least, an "infinite sums" function), we define the set of series for which the sum (8) converges (this definition covers the behaviour of classical boolean ε-automata). We will call them Φ-convergent series (FC series).
After these extensions, we will prove that the behaviour of the automaton obtained by algebraic elimination is the image by Φ (the erasure of ε) of the behaviour (in k Σ ε ) of the initial automaton.
FF and FC series

FF (Φ-finite) series
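Let $S \in k\langle\langle\Sigma_\varepsilon\rangle\rangle$ be a formal series. We recall that the support of S is given by: $\mathrm{supp}(S) = \{u \in \Sigma_\varepsilon^* : \langle S|u\rangle \neq 0\}$. We will call (FF) the following condition:
(FF) For any $u \in \Sigma^*$, the set $\mathrm{supp}(S) \cap \Phi^{-1}(u)$ is finite.
If the formal series S satisfies (FF), we say that it is Φ-finite.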
• The star S * need not be Φ-finite even if S is Φ-finite.The simplest example is provided by S = ε.
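Next we show that $\Phi : k\langle\Sigma_\varepsilon\rangle \to k\langle\Sigma\rangle$ can be extended to $(k\langle\langle\Sigma_\varepsilon\rangle\rangle)_{\Phi\text{-finite}}$ as a polymorphism.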

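Theorem 5 For any $S, T \in (k\langle\langle\Sigma_\varepsilon\rangle\rangle)_{\Phi\text{-finite}}$, the series S + T and ST are Φ-finite, with Φ(S + T) = Φ(S) + Φ(T) and Φ(ST) = Φ(S)Φ(T). Moreover, if $S^*$ exists and is Φ-finite, then $\Phi(S^*) = \Phi(S)^*$.

Proof. For the sum and the Cauchy product, we obtain the result by the following relations, valid for every $u \in \Sigma^*$:
$$\langle\Phi(S+T)|u\rangle = \sum_{\Phi(v)=u}\langle S+T|v\rangle = \sum_{\Phi(v)=u}\langle S|v\rangle + \sum_{\Phi(v)=u}\langle T|v\rangle = \langle\Phi(S)|u\rangle + \langle\Phi(T)|u\rangle,$$
$$\langle\Phi(ST)|u\rangle = \sum_{\Phi(v)=u}\ \sum_{v_1v_2=v}\langle S|v_1\rangle\langle T|v_2\rangle = \sum_{u_1u_2=u}\Big(\sum_{\Phi(v_1)=u_1}\langle S|v_1\rangle\Big)\Big(\sum_{\Phi(v_2)=u_2}\langle T|v_2\rangle\Big) = \langle\Phi(S)\Phi(T)|u\rangle.$$
Then $\Phi(S^*)$ is a solution of the equation $Y = \varepsilon + \Phi(S)Y$, as $S^* = \varepsilon + SS^*$, and $\Phi(S^*) = \Phi(S)^*$.

A Φ-finite series need not be rational.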

Example(s) 4
The series $S = \sum_{|u|_a = |u|_\varepsilon} u$ in $\mathbb{N}\langle\langle\Sigma_\varepsilon\rangle\rangle$ is Φ-finite, yet not rational.
We recall that a matrix $M \in k^{n\times n}$ is nilpotent if there exists a positive integer $N \geq n$ such that $M^N = 0$.

Proposition 1 Let S be a rational series in $k\langle\langle\Sigma_\varepsilon\rangle\rangle$ with (λ, µ, γ) a linear representation of S.
i) If µ(ε) is nilpotent, then S is Φ-finite.
ii) Conversely, if S is Φ-finite, k is a field and (λ, µ, γ) is of minimal dimension, then µ(ε) is nilpotent.
Proof. i) With the notations of the proposition, suppose that there is an integer N such that $\mu(\varepsilon)^N = 0$ […]
ii) […] are regular (L is a block matrix of n lines of size 1 × n and G is a block matrix of n columns of size n × 1; indeed, L and G are n × n square matrices) [1]. Now, for 1 ≤ i, j ≤ n, the family $(\langle S|u_i \varepsilon^n v_j\rangle)_{n\geq 0}$, as a subfamily of $(\langle S|v\rangle)_{\Phi(v)=\Phi(u_i v_j)}$, must be with finite support. This implies that $(L\mu(\varepsilon^n)G)_{n\geq 0}$ is with finite support. As L and G are invertible, µ(ε) must be nilpotent.
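When µ(ε) is nilpotent, the sum of its powers is finite and gives a star without any notion of convergence. The sketch below illustrates this for integer matrices; it assumes coefficients in Z (an integral domain, where an n × n nilpotent matrix satisfies $M^n = 0$, which need not hold in an arbitrary semiring), and the name `nilpotent_star` is hypothetical.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotent_star(M):
    """Return 1 + M + M^2 + ... when some power M^t (t <= n) vanishes, else None."""
    n = len(M)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    total, power = identity, identity
    for _ in range(n):
        power = mat_mul(power, M)
        if all(x == 0 for row in power for x in row):
            return total          # all higher powers vanish as well
        total = [[t + p for t, p in zip(rt, rp)] for rt, rp in zip(total, power)]
    return None                   # M^n != 0: M is not nilpotent over Z


# A strictly upper triangular matrix, as produced by an acyclic ε-transition graph.
M = [[0, 1, 2],
     [0, 0, 3],
     [0, 0, 0]]
S = nilpotent_star(M)
```

For this matrix $M^3 = 0$, so the star is $1 + M + M^2$, and one can check directly that $MS + 1 = S$.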

FC (Φ-convergent) series
If we want to go further in the extension of Φ (and, in so doing, cover the classical boolean case), we must extend the domain of computability of the sums (8) to (some) countable families. Many approaches exist in the literature [10], mainly by topology, ordered structure or "sum" function.
Here, we adopt the last option with a minimal axiomatization adapted to our goal.
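The semiring k will be supposed endowed with a sum function sum taking some (at most) countable families $(a_i)_{i\in I}$ (called summable) and computing an element of k denoted $\mathrm{sum}_{i\in I}\, a_i$. This function is subjected to the following axioms:
CS1. If $(a_i)_{i\in I}$ is finite, then it is summable and $\mathrm{sum}_{i\in I}\, a_i = \sum_{i\in I} a_i$.
CS2. If $(a_i)_{i\in I}$ and $(b_i)_{i\in I}$ are summable, so is $(a_i + b_i)_{i\in I}$ and $\mathrm{sum}_{i\in I}\,(a_i + b_i) = \mathrm{sum}_{i\in I}\, a_i + \mathrm{sum}_{i\in I}\, b_i$.
CS3. If $(a_i)_{i\in I}$ and $(b_j)_{j\in J}$ are summable, so is $(a_i b_j)_{(i,j)\in I\times J}$ and $\mathrm{sum}_{(i,j)\in I\times J}\, a_i b_j = (\mathrm{sum}_{i\in I}\, a_i)(\mathrm{sum}_{j\in J}\, b_j)$.
CS4. If $I = \bigsqcup_{\lambda\in\Lambda} J_\lambda$ with Λ finite and each $(a_j)_{j\in J_\lambda}$ is summable, then so is $(a_i)_{i\in I}$ and $\mathrm{sum}_{i\in I}\, a_i = \sum_{\lambda\in\Lambda} \mathrm{sum}_{j\in J_\lambda}\, a_j$.
CS5. […]
CS6. If $(a_i)_{i\in I}$ is summable and $\varphi : J \to I$ is one-to-one, then $(a_{\varphi(j)})_{j\in J}$ is summable and […]

Definition 2 A semiring with sum function (as above) which fulfills CS1..6 will be called a CS-semiring.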
If k is a CS-semiring, the semiring of square matrices $k^{n\times n}$ will be equipped with the following sum function: a family $(M^{(i)})_{i\in I}$ of square matrices will be said summable iff it is so componentwise, i.e. the $n^2$ families $(M^{(i)}_{r,s})_{i\in I}$ (for 1 ≤ r, s ≤ n) are summable. In this case, the sum of the family is the matrix L such that, for 1 ≤ r, s ≤ n, $L_{rs} = \mathrm{sum}_{i\in I}\, M^{(i)}_{rs}$ (i.e. the sum is computed componentwise). It can be easily checked that, with this sum function, $k^{n\times n}$ is a CS-semiring.

Remark 4
Let k be a topological semiring (i.e. k is endowed with some Hausdorff topology T such that the two binary operations, sum and product, are continuous mappings k × k → k). We recall that a family $(a_i)_{i\in I}$ is said summable with sum s iff it satisfies the following property, where B(s) is a basis of neighbourhoods of s:
$$(\forall V \in B(s))\ (\exists F \subset I,\ F \text{ finite})\ (\forall F' \text{ finite},\ F \subset F' \subset I)\quad \sum_{i\in F'} a_i \in V.$$
In this case the axioms CS1, CS2, CS4, CS5 and CS6 are automatically satisfied for the preceding (usual) notion of summability.

Example(s) 5
Below are some examples of CS-semirings which are metric semirings (i.e. the notion of summability and the sum function are given as in Remark 4).
1. The fields Q, R, C with their usual metric.
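If the formal series S satisfies (FC), we say that it is Φ-convergent. It is straightforward that a Φ-finite series is Φ-convergent. We have now a theorem similar to theorem (6) for $(k\langle\langle\Sigma_\varepsilon\rangle\rangle)_{\Phi\text{-conv}}$.

Proof. Stability by +, α(?) and (?)α is straightforward using the axioms CS1, CS2 and CS3. Let us give the details of the proof for the Cauchy product: we have to prove that, for every $S, T \in (k\langle\langle\Sigma_\varepsilon\rangle\rangle)_{\Phi\text{-conv}}$ and $u \in \Sigma^*$, the (countable) family
$$(\langle ST|v\rangle)_{\Phi(v)=u} \qquad (20)$$
is summable.
From the definition of the Cauchy product we have the finite sums
$$\langle ST|v\rangle = \sum_{v_1 v_2 = v} \langle S|v_1\rangle\langle T|v_2\rangle$$
and, from CS4, the summability would be a consequence of that of the family
$$(\langle S|v_1\rangle\langle T|v_2\rangle)_{\Phi(v_1 v_2)=u} \qquad (21)$$
(with the same sum). This family can be partitioned in a finite sum of families
$$(\langle S|v_1\rangle\langle T|v_2\rangle)_{\Phi(v_1)=u_1,\ \Phi(v_2)=u_2}, \qquad u_1 u_2 = u,$$
each of which, by CS3, is summable. Thus, by CS6, the family (21) is summable and hence summability of (20) (with the same sum) follows.

Next, we show that $\Phi : (k\langle\langle\Sigma_\varepsilon\rangle\rangle)_{\Phi\text{-conv}} \to k\langle\langle\Sigma\rangle\rangle$ is a polymorphism.

Proof. The proof is similar to that of theorem (5), using the axioms of CS-semirings.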
Remark 5 i) In the sequel, as in the classical case, the summability of $(\mu(\varepsilon)^n)_{n\in\mathbb{N}}$ will play a central role. We will then call closable a square matrix $M \in k^{n\times n}$ such that the family $(M^n)_{n\in\mathbb{N}}$ is summable. Note that, in this case, the sum $\mathrm{sum}_{n\in\mathbb{N}}\, M^n$ is a two-sided (we could say "topological") star of M.
ii) For example, with the boolean semiring endowed with the discrete topology, every $M \in \mathbb{B}^{n\times n}$ is closable (i.e. the sequence $S_N = \sum_{k=0}^{N} M^k$ is stationary).

We have the following theorem, very similar to (1).

Proof. The proof of (i) is similar to that of theorem (1). The first computation of (ii) is similar but, to conclude, we use the property (which holds in R and C) that a family is summable iff it is absolutely summable (because of CS6), and then subfamilies of summable families are summable.
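The stationarity of the partial sums $S_N$ in the boolean case, noted in Remark 5, can be observed directly. The sketch below iterates $S_{N+1} = 1 \vee M S_N$ from $S_0 = 1$, which yields the same partial sums, and stops at the first repetition; once two consecutive partial sums agree, all later ones do. The function names and the returned step count are hypothetical illustration choices.

```python
def bool_mat_mul(A, B):
    n = len(A)
    return [[any(A[i][t] and B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def bool_closure(M):
    """Return (star of M, index N at which the partial sums S_N become stationary)."""
    n = len(M)
    identity = [[i == j for j in range(n)] for i in range(n)]
    S, steps = identity, 0
    while True:
        MS = bool_mat_mul(M, S)
        nxt = [[identity[i][j] or MS[i][j] for j in range(n)] for i in range(n)]
        if nxt == S:              # S_{N+1} = S_N: the sequence is stationary
            return S, steps
        S, steps = nxt, steps + 1


T, F = True, False
chain = [[F, T, F],
         [F, F, T],
         [F, F, F]]               # ε-transitions 1 -> 2 -> 3
star, index = bool_closure(chain)
```

For the chain above the partial sums stabilise at N = 2, and the resulting star is the reflexive and transitive closure of the ε-transition relation.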

Equivalence
We now deal with an algebraic method to eliminate the ε-transitions from a weighted ε-automaton A ε .
The result is a weighted automaton with behaviour Φ(behaviour(A ε )).
Proof. The point i) is a reformulation of proposition (2). Remark that, if $(\mu(\varepsilon)^n)_{n\in\mathbb{N}}$ is summable, its sum is a (two-sided) star of µ(ε) that, for convenience, we will denote $\mu(\varepsilon)^*$.

Let B be the behaviour of $A_\varepsilon$. One has […] and the conclusion follows taking, for all a ∈ Σ, […]

Theorem 2 gives the bounds if the set of coefficients is a semiring (resp. ring, field). In the following, we have an example of a boolean automaton with ε-transition.

In the next example, our algebraic method is applied to a Q-ε-automaton.

Example(s) 7
The linear representation of Figure 5 is: The resulting automaton is presented in Figure 6 and its linear representation is (λ ′ , µ ′ , γ ′ ).
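The elimination step itself can be sketched over Q as follows. Since the explicit formulas are not legible in this version of the text, the sketch assumes one standard choice, $\lambda' = \lambda$, $\mu'(a) = \mu(\varepsilon)^*\mu(a)$ and $\gamma' = \mu(\varepsilon)^*\gamma$, and computes the star as $(1 - \mu(\varepsilon))^{-1}$, which coincides with the sum of the powers only when µ(ε) is closable. The automaton used is a hypothetical example (not the one of Figure 5), the letter ε is encoded by "e", and all function names are assumptions.

```python
from fractions import Fraction


def mat_mul(A, B):
    return [[sum((A[i][t] * B[t][j] for t in range(len(B))), Fraction(0))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse(A):
    """Gauss-Jordan inversion with exact rational arithmetic."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def star(M):
    """Algebraic star over Q: the inverse of 1 - M (assumed to exist)."""
    n = len(M)
    return inverse([[Fraction(int(i == j)) - M[i][j] for j in range(n)]
                    for i in range(n)])


def eliminate(lam, mu, gam, eps="e"):
    """Assumed formulas: lam' = lam, mu'(a) = mu(eps)* mu(a), gam' = mu(eps)* gam."""
    E = star(mu[eps])
    new_mu = {a: mat_mul(E, m) for a, m in mu.items() if a != eps}
    return lam, new_mu, mat_mul(E, gam)


def weight(lam, mu, gam, word):
    row = lam
    for a in word:
        row = mat_mul(row, mu[a])
    return mat_mul(row, gam)[0][0]


half = Fraction(1, 2)
lam = [[Fraction(1), Fraction(0)]]                 # 1 x n input vector
mu = {"e": [[half, 0], [0, 0]], "a": [[0, 1], [0, 0]]}
gam = [[Fraction(0)], [Fraction(1)]]               # n x 1 output vector
lam2, mu2, gam2 = eliminate(lam, mu, gam)
w_a = weight(lam2, mu2, gam2, "a")
```

In this example the preimages $\varepsilon^i a$ of the word a have weights $(1/2)^i$, whose sum is 2, and this is the weight that the ε-free automaton assigns to a.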

Conclusion
Algebraic elimination for ε-automata has been presented. The problem of removing the ε-transitions originates from the generic ε-removal algorithm for weighted automata [15], which uses Floyd-Warshall and generic single-source shortest-distance algorithms. Here, we have the same objective but the methods and algorithms are different. In [15], the principal characteristics of the semirings used by the algorithm, as well as the complexity of the different algorithms used for each step of the elimination, are detailed. The cases of acyclic and non-acyclic automata are analysed differently. Our algorithm works with any semiring (supposing only that µ(ε) is closable) and its complexity is the same for acyclic and non-acyclic automata. This algorithm is even more efficient when the considered semiring is a ring.

Figure 2: An N-ε-automaton.

Example(s) 2 The behaviour of the automaton A of Figure 1 is behaviour(A) = $\sum_{u,v\in\Sigma^*}$ […]

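The (right) star of $M \in k^{n\times n}$ (when it exists) is a solution of the equation $MY + 1_{n\times n} = Y$ (where $1_{n\times n}$ is the identity matrix). Let $M \in k^{n\times n}$ be given by
$$M = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
where $a_{11} \in k^{p\times p}$, $a_{12} \in k^{p\times q}$, $a_{21} \in k^{q\times p}$ and $a_{22} \in k^{q\times q}$ such that p + q = n. Let $N \in k^{n\times n}$ be given by
$$N = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$
with
$$A_{11} = (a_{11} + a_{12}a_{22}^{*}a_{21})^{*} \qquad (1)$$
$$A_{12} = a_{11}^{*}a_{12}A_{22} \qquad (2)$$
$$A_{21} = a_{22}^{*}a_{21}A_{11} \qquad (3)$$
$$A_{22} = (a_{22} + a_{21}a_{11}^{*}a_{12})^{*} \qquad (4)$$
Then
$$MN + 1_{n\times n} = \begin{pmatrix} a_{11}A_{11} + a_{12}A_{21} + 1_{p\times p} & a_{11}A_{12} + a_{12}A_{22} \\ a_{21}A_{11} + a_{22}A_{21} & a_{21}A_{12} + a_{22}A_{22} + 1_{q\times q} \end{pmatrix}$$
as $1_{n\times n} = \begin{pmatrix} 1_{p\times p} & 0_{p\times q} \\ 0_{q\times p} & 1_{q\times q} \end{pmatrix}$, where $0_{p\times q}$ is the zero matrix in $k^{p\times q}$. We verify the relations (1), (2), (3) and (4) by:
$$a_{11}A_{11} + a_{12}A_{21} + 1_{p\times p} = a_{11}A_{11} + a_{12}a_{22}^{*}a_{21}A_{11} + 1_{p\times p} = (a_{11} + a_{12}a_{22}^{*}a_{21})A_{11} + 1_{p\times p} = A_{11}$$
$$a_{11}A_{12} + a_{12}A_{22} = a_{11}a_{11}^{*}a_{12}A_{22} + a_{12}A_{22} = (a_{11}a_{11}^{*} + 1)a_{12}A_{22} = a_{11}^{*}a_{12}A_{22} = A_{12}$$
$$a_{21}A_{11} + a_{22}A_{21} = a_{21}A_{11} + a_{22}a_{22}^{*}a_{21}A_{11} = (1 + a_{22}a_{22}^{*})a_{21}A_{11} = a_{22}^{*}a_{21}A_{11} = A_{21}$$
$$a_{21}A_{12} + a_{22}A_{22} + 1_{q\times q} = a_{21}a_{11}^{*}a_{12}A_{22} + a_{22}A_{22} + 1_{q\times q} = (a_{22} + a_{21}a_{11}^{*}a_{12})A_{22} + 1_{q\times q} = A_{22}$$

Remark 2 i) Similar formulas can be stated in the case of the left star. The matrix N is the left star […]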