Defect Effect of Bi-infinite Words in the Two-element Case

Let X be a two-element set of words over a finite alphabet. If a bi-infinite word possesses two X-factorizations which are not shiftequivalent, then the primitive roots of the words in X are conjugates. Note that this is a strict sharpening of a defect theorem for bi-infinite words stated in [KMP2]. Moreover, we prove that there is at most one bi-infinite word possessing two different X-factorizations and give necessary and sufficient conditions on X for the existence of such a word. Finally, we prove that the family of sets X for which such a word exists is parameterizable.


Introduction
The defect theorem is one of the fundamental results on words, cf. [Lo]. Intuitively, it states that if n words satisfy a nontrivial relation, then these words can be expressed as products of at most n − 1 words. Actually, as discussed in [CK], for example, there is not just one defect theorem but several, depending on the restrictions put on the required n − 1 words.
It is also well known that the nontrivial relation above can be replaced by a weaker condition, namely by a nontrivial one-way infinite relation, cf. [Br] and [HK]. The goal of this note is to look for defect theorems for bi-infinite words. In a strict sense such results do not exist: the set X = {ab, ba} of words satisfies a nontrivial bi-infinite relation, since (ab)^Z = (ba)^Z, but there exists no word ρ such that X ⊆ ρ^+. However, one such result was proved in [KMP2], and we are going to prove another one in a special case; both can be viewed as defect theorems for bi-infinite words.
In terms of factorizations of words the defect theorem can be stated as follows: Let X ⊆ Σ^+ be a finite set of words. If there exists a word w ∈ Σ^+ having two different X-factorizations, then the rank of X is at most card(X) − 1. Here the rank of X can be defined in different ways, cf. again [CK]. For example, it can be defined as the combinatorial rank r_c(X), denoting the smallest number k such that X ⊆ Y^+ with card(Y) = k.

Ján Maňuch
To describe our results let w be a bi-infinite word, i.e., an element of Σ^Z, and X a finite subset of Σ^+. We say that w has an X-factorization if w ∈ X^Z, and that w has two different X-factorizations if it has two X-factorizations that do not match in at least one point of w. The following result was shown in [KMP2]: If a nonperiodic bi-infinite word w has two different X-factorizations, then the combinatorial rank r_c(X) of X is at most card(X) − 1. Moreover, if r_c(X) = card(X), then the number of bi-infinite words with two different X-factorizations is finite.
We are going to prove a strict sharpening of this result for the two-element case: Let card(X) = r_c(X) = 2, so that X is a code. If a bi-infinite word w has two different X-factorizations which are not shiftequivalent, then the primitive roots of the words in X are conjugates. Moreover, there is at most one bi-infinite word possessing two different X-factorizations.
The first part of our result is related to the main result of [lRlR] and, we believe, deducible from the considerations of that paper. However, our proof is self-contained and essentially shorter, and moreover formulated directly to yield a defect-type theorem.
Our paper is organized as follows.
In Section 2 we fix our terminology and present the auxiliary results needed for our proofs. In Section 3 we prove, as our main result, a defect theorem for binary sets X satisfying a nontrivial bi-infinite relation. The proof turns out to be rather involved. In Section 4 we prove the second part of our result, i.e., the uniqueness of the X-ambiguous bi-infinite word in the two-element case. In Section 5 we give a characterization of the two-element sets X which allow an X-ambiguous bi-infinite word. The last section contains conclusions and open problems.
An extended abstract of this paper and of the paper [KMP2] appeared in [KMP1].

Preliminaries
In this section we fix our terminology and recall a few lemmas on combinatorics of words needed for the proofs of our results. For undefined notions we refer to [Lo] or [CK].
Let Σ be a finite alphabet and X a finite subset of Σ^+. The sets of all finite, infinite and bi-infinite words over Σ are denoted by Σ^*, Σ^N and Σ^Z, respectively. Formally, a representation of a bi-infinite word w is a mapping f_w : Z → Σ, usually written as w = … a_{−1} a_0 a_1 …, where a_i = f_w(i). Representations f : Z → Σ and f′ : Z → Σ represent the same bi-infinite word if there exists an integer i_0 such that f(i) = f′(i + i_0) for all integers i. Let f_w be a representation of a bi-infinite word w. We say that w is periodic if there exists a positive integer i_0, called a period, such that f_w(i) = f_w(i_0 + i) for all integers i. Note that a nonperiodic bi-infinite word has infinitely many representations, while a periodic one has exactly π(w) representations, where π(w) is the smallest period of w.
An X-factorization of w is any sequence of words from X yielding w as their product. Formally, let f_w be a fixed representation of w ∈ Σ^Z. An X-factorization of w is a mapping F : Z → X × Z such that for each k ∈ Z, if F(k) = (α, i) and F(k + 1) = (β, j), then a_i a_{i+1} … a_{j−1} = α, i.e., the position i is a starting position of the factor α in w. We say that two X-factorizations F_1 and F_2 of a bi-infinite word are
• disjoint, whenever the starting positions of all factors in F_1 are distinct from the ones in F_2,
• shiftequivalent, if there is a k_0 such that whenever […].
Notice that the above definitions are independent of the choice of a representation of w.
An X-ambiguous bi-infinite word is a bi-infinite word which has two different X-factorizations. Let amb(X) be the set of all X-ambiguous bi-infinite words and let sum(X) be the sum of the lengths of the words in X, i.e., the size of X. Clearly, in both cases of Example 1 the two factorizations are disjoint.
We define the combinatorial rank of X ⊆ Σ^+ by the formula r_c(X) = min{card(Y) | X ⊆ Y^+}. For the sake of completeness we recall that r_c(X) ≤ r_f(X) ≤ card(X), where r_f(X) denotes the free rank (or simply the rank) of X, defined as the cardinality of the base of the smallest free semigroup containing X, cf. [CK].
Example 1 (continued). Clearly, r_c(X) = 2, since X ⊆ {a, b}^+, but for no word ρ does the inclusion X ⊆ ρ^+ hold. On the other hand, since X is a code, we conclude that r_f(X) = 3.

We say that a finite word w = w_1 … w_m has a period n ∈ N, if there is a word u such that w = u^n. The shortest period is called the period of w, denoted by π(w). If w = u^{π(w)}, then u is called the root of w, denoted by ρ(w). A word w is primitive if ρ(w) = w. The mirror image of w, denoted by w^R, is the word w_m … w_1.
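The notions of root and primitivity are easy to experiment with. The following Python sketch is ours and not part of the paper; it assumes nonempty words given as strings and reads ρ(w) as the shortest word u with w = u^k, so that `exponent(w)` plays the role of the exponent in w = ρ(w)^{π(w)}. The function names are hypothetical.

```python
def primitive_root(w: str) -> str:
    """Return the shortest u with w == u * k for some k >= 1 (w nonempty)."""
    n = len(w)
    for d in range(1, n + 1):
        # A candidate root length must divide |w|.
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]
    return w  # only reached for the empty word


def exponent(w: str) -> int:
    """Return k such that w == primitive_root(w) * k (w nonempty)."""
    return len(w) // len(primitive_root(w))


def is_primitive(w: str) -> bool:
    """A word is primitive if it equals its own root."""
    return primitive_root(w) == w


# The set X of Example 1: all three words are primitive with pairwise
# different roots, so no single word rho satisfies X <= rho^+.
X = ["a", "bab", "baab"]
roots = {primitive_root(x) for x in X}
```

For instance, `primitive_root("ababab")` is `"ab"` with exponent 3, while every word of X above is its own root.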
Next we recall a few basic results on words that we shall need in our later considerations; for their proofs the reader is referred to [Lo] or [CK].
Lemma 3. If two words u and v satisfy the relation ut = tv for some u, v, t ∈ Σ^+, i.e., if they are conjugates, then there exist words p and q such that pq is primitive and u = (pq)^i, v = (qp)^i and t ∈ p(qp)^* for some i ≥ 1.
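Lemma 3 can be made concrete by computing the words p and q for a given pair of conjugates. The sketch below is our illustration under the assumption that words are nonempty Python strings; `conjugacy_decomposition` is a hypothetical helper name. It searches the rotations of the primitive root of u, so p may be empty when u = v.

```python
def primitive_root(w: str) -> str:
    """Shortest u with w == u * k; uses the fact that w first reoccurs
    in ww at position |root(w)| (w nonempty)."""
    return w[:(w + w).find(w, 1)]


def conjugacy_decomposition(u: str, v: str):
    """Return (p, q, i) with u == (p+q)*i, v == (q+p)*i and p+q primitive,
    or None if u and v are not conjugates."""
    if not u or len(u) != len(v):
        return None
    r = primitive_root(u)
    i = len(u) // len(r)
    for k in range(len(r)):
        p, q = r[:k], r[k:]
        if (q + p) * i == v:
            return p, q, i
    return None


p, q, i = conjugacy_decomposition("abab", "baba")
# Every t in p(qp)^* then satisfies u t == t v, as in Lemma 3.
witnesses = [p + (q + p) * n for n in range(3)]
```

Here u = abab and v = baba give p = a, q = b, i = 2, and each witness t satisfies ut = tv.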
In Section 4 we shall also need the following result, which was proved in [LyS].
Lemma 4. Consider nonempty words x, y, z satisfying the equation x^m = y^n z^p, where m, n, p ≥ 2. Then all the words x, y, z are powers of a common word.
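Lemma 4 can be checked exhaustively on small instances. The following brute-force sketch is ours; the bounds `max_len` and `max_exp` and all function names are assumptions made for illustration. It enumerates binary words y, z and exponents m, n, p in a small range, solves for x when possible, and records any solution whose three words do not share a primitive root.

```python
from itertools import product


def primitive_root(w: str) -> str:
    """Shortest u with w == u * k (w nonempty)."""
    return w[:(w + w).find(w, 1)]


def words(alphabet: str, max_len: int):
    """All nonempty words over the alphabet up to the given length."""
    for n in range(1, max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def check_lemma4(max_len: int = 3, max_exp: int = 3):
    """Return (number of solutions found, list of counterexamples)."""
    solutions, counterexamples = 0, []
    for y in words("ab", max_len):
        for z in words("ab", max_len):
            for n, p, m in product(range(2, max_exp + 1), repeat=3):
                w = y * n + z * p
                if len(w) % m:
                    continue
                x = w[:len(w) // m]
                if x * m != w:
                    continue  # w is not an m-th power
                solutions += 1
                if not (primitive_root(x) == primitive_root(y) == primitive_root(z)):
                    counterexamples.append((x, y, z, m, n, p))
    return solutions, counterexamples


solutions, counterexamples = check_lemma4()
```

Within these bounds the search finds solutions such as x = aa, y = z = a, and the list of counterexamples stays empty, as the lemma predicts.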
In order to formulate our fifth, and most crucial, lemma we need some terminology, cf. [CK] or [HK].
We associate with a finite set X ⊆ Σ^+ a graph G_X = (V_X, E_X), called the dependency graph of X, as follows: the set V_X of vertices of G_X equals X, and the set E_X of edges of G_X is defined by the condition […]

Then we have
Lemma 5. For each finite set X ⊆ Σ^+, the combinatorial rank of X is at most the number of connected components of G_X.
As we shall see, Lemma 5 is particularly suitable for our subsequent considerations. In that lemma it is crucial that the words in X are nonempty, and this is indeed satisfied in the proof of our Theorem 2.

The Two-element Case
In this section we generalize the following result of [KMP2] in the case of two-element sets.
Theorem 1. Consider a set X = {α_1, …, α_n} ⊆ Σ^+. Let w be a bi-infinite word over Σ and F_1, F_2 two different X-factorizations of w. Then the combinatorial rank of X is at most n − 1, or both the word w and the X-factorizations F_1, F_2 are periodic. Moreover, if the rank of X is n, then the number of periodic bi-infinite words with two different X-factorizations is finite.
A restriction of Theorem 1 to two-element sets yields the following consequence.
Corollary 1. Consider a set X = {α, β} ⊆ Σ^+. Let w be a bi-infinite word over Σ and F_1, F_2 two different X-factorizations of w. Then the words α, β commute, or both the word w and the X-factorizations F_1, F_2 are periodic.
First we recall that in a strict sense we cannot have a defect theorem for bi-infinite words even in this simple case.
Example 2. The set X = {ab, ba} is of combinatorial rank 2, although the word (ab)^Z has two disjoint, and even non-shiftequivalent, X-factorizations.
As the main result of this paper we show, however, that the above example and its natural variants are the only exceptions which may occur. Even in these cases the roots of the words in X are conjugates, i.e., cyclic permutations of one another.
To prove our main result we also need one partial result from [KMP2], which can be stated as follows: […]
In the notation of [KMP2] this lemma (Lemma 6) claims that the situation when w possesses two different minimal t-pairs implies a defect effect. This situation is depicted in Figure 1.
Theorem 2. Consider a set X = {α, β} with α, β ∈ Σ^+. Let w be a bi-infinite word over Σ and F_1, F_2 two different X-factorizations of w containing together both elements of X. Then one of the following possibilities holds:
(i) α and β commute, or
(ii) the roots of α and β are conjugates and […]
Proof. We can assume that α and β do not commute. Then, by Lemma 5, the factorizations F_1, F_2 must be disjoint. Indeed, if the factorizations F_1 and F_2 are not disjoint, then we can take the parts of the factorizations to the right (respectively, to the left) of a place where they meet to obtain an infinite equation […] Since the factorizations are different, at least one of these two equations is nontrivial. Hence, by Lemma 5, the words α and β commute, a contradiction.
Further, by Corollary 1, the factorizations F_1, F_2 are periodic. By Lemma 1 the periods of F_1 and F_2 have the same length and are conjugates. Whenever we find the situation shown in Figure 2, it occurs infinitely many times, since the factorizations are periodic with the same length of the periods. Using Lemma 6 we get that f_1, f_2 are periods of […]
If α or β is not primitive, we can replace them by powers of their roots, ρ(α)^{π(α)} and ρ(β)^{π(β)}, and explore the situation over the slightly different set X′ = {ρ(α), ρ(β)}. If we prove that the claim holds for ρ(α), ρ(β), then it clearly must hold also for α and β and, moreover, in case (iii) we have either π(α) = 1, i.e., α is primitive, or π(β) = 1, i.e., β is primitive. So it is enough to consider only the case when α and β are primitive. Without loss of generality we can also assume that |α| ≤ |β|.
Now, if F_2 does not contain the factor αβ, then it contains only α's or only β's, or there is a point inside F_2 from which to the left there are only β's and to the right only α's. In the last case the factorization F_2 is clearly nonperiodic, a contradiction with Corollary 1. Consider now, for example, the case F_2 = α^Z. If F_1 contains any α, then we have the situation depicted in Figure 3 which, by Lemma 2, contradicts the primitiveness of α. So we have F_1 = β^Z, and, by Lemma 1, α and β must be conjugates, which is case (ii).
From now on we may assume that F_2 contains the factor αβ. In Figure 4 we can see all possibilities how F_1 covers the border between the above occurrences of α and β. We shall analyze all three cases.
Cases 1 and 2. We can analyze the first two cases simultaneously, because when we forget about the relation between the lengths of α and β, they are clearly symmetric. So consider the second of the three possibilities drawn in Figure 4. If the word to the right of the β in the factorization F_1 is also β, then β is not primitive, which is not the case. Hence, we have the situation shown in Figure 5. Now if z_R = α or z_L = α, then v_1 = v_2 and we arrive at the situation depicted in Figure 2 with f_1 = βα and f_2 = αβ, which is case (iii) of our claim. So consider the other case, when z_R = z_L = β. We can continue in this way inductively until the sequences of β's exceed the α's (on both sides at the same time) or we obtain the situation in Figure 2 with […]
The first possibility is shown in Figure 6. Now again if z_R = β, then we have v_1 = v_2, and hence we are again in case (iii). So assume that z_R = α. We have β = v_3 t = t v_4, which by Lemma 3 allows us to write v_3 = (pq)^k, v_4 = (qp)^k, t = p(qp)^n, where pq is primitive and k ≥ 1, n ≥ 0. We can see that α ends with pq and starts with qp. This means that the word pqqp matches the word β = (pq)^{k+n} p around the black point shown in Figure 6. Since the factorizations are disjoint, the black point must lie inside β. There are 5 possibilities where the black point inside β can be. In case (1) the black point matches the end of the first p in β, in case (2) it matches the end of any pq in β, in case (3) it occurs inside the first p of β, in case (4) inside the first q, and, finally, in the last case it occurs in the rest of β, as shown in Figure 7.

The dependency graph of this system is then connected, which implies that the unknowns in Y, and hence also α and β, commute, which is a contradiction.
Similarly, in case (2) we can write […], where l ≥ 1, e_1, e_2 ∈ X^* and the second equation is obtained by taking the parts of F_1, F_2 between the black point and the next occurrence (to the right) of the black point, so e_i is a part of the factorization F_i. We can rewrite these two equations as a system of equations with unknowns Y = {α, p, q}: […] and, by Lemma 5, we have again a contradiction.
In case (3) the qp which follows the black point lies inside the first pqp in β. But this is a contradiction, because then pq cannot be primitive. In case (5) we can use the same argument with the pq which precedes the black point. The situation around the black point in case (4) is shown in Figure 8. It follows, by Lemma 2, that q is not primitive, and that there is an s such that u_1 = s^i, u_2 = s^j and q = s^{i+j} with i, j ≥ 1. Now, as above, we have two equations with unknowns Y = {α, p, s}: […], where w_1, w_2 ∈ Y^* and e_1, e_2 ∈ X^* are parts of the X-factorizations. Note that the second equation deals with the word starting with the first u_2 after the black point and ending in the next occurrence of the black point.
Case 3. Now we shall analyze the third possibility shown in Figure 4. Since β is primitive, there must be α to the right of the β on the upper line. Using the same considerations as in the previous case we come to Figure 9, or we end up in case (iii) with f_1 = βα^n, f_2 = α^n β for some n ≥ 1. In the first case we have […]
There are again two possibilities. Assume first that z_R = α. We know that β starts with v_3 v_1, so, as shown in Figure 10, v_3 v_1 lies inside αα = v_1 v_3 v_1 v_3. This implies that either α is not primitive, or v_3 v_1 matches v_3 v_1 in αα. In the first case we have a contradiction. In the second case it is obvious from Figure 10 that v_2 = v_3, say equal to p, and v_1 = v_4, say equal to q, and moreover that |p| = |v_3| = |v_4| = |q|. So we have pα^{i−1}β = βα^{i−1}q, which means that v = pα^{i−1} = p(qp)^{i−1} is a conjugate of u = α^{i−1}q = q(pq)^{i−1}. We shall show that this again contradicts the primitiveness of α = qp. We have already analyzed this situation. Since the word u is a conjugate of v, a factor of the word uu must be equal to the word v. The word u starts with qp and ends with pq, so the middle point of the word pqqp lies inside the word v = p(qp)^{i−1}. There are again 5 possibilities (see Figure 7). Since |p| = |q|, in cases (1) and (2) we have p = q, so that α = v_1 v_3 = qp = p^2, proving that α is not primitive, a contradiction. In cases (3) and (5) we also have a contradiction with the primitiveness of α, as we already proved. In case (4) we have u_1 = s^i, u_2 = s^j, q = s^{i+j} (see Figure 8), and since |p| = |q| we also have p = u_2 u_1 = s^{i+j} = q, which is again a contradiction.
According to Figure 9 and Figure 11 we have the following system of equations with unknowns […] The dependency graph associated with this system is connected, and hence all unknowns commute; in particular α commutes with β. This completes the proof of the theorem.

Theorem 2 deserves a few comments. The number of different X-factorizations of a bi-infinite word w having an X-factorization differs considerably between cases (i)-(iii). In case (i) there exist non-denumerably many such X-factorizations. In case (ii) there are finitely many different X-factorizations, and if we consider all shiftequivalent X-factorizations as one, then there are exactly two of them. Finally, in case (iii) there are also finitely many different X-factorizations, which are all shiftequivalent. This actually means that in case (iii) no bi-infinite word can be expressed in two different ways as a product of words from X. Hence, indeed, Theorem 2 shows a defect effect of a two-element set for bi-infinite factorizations.
In Theorem 2 we showed that if the words of X do not commute and their roots are not conjugates, then only case (iii) is possible. But if they do not commute and are conjugates, Theorem 2 allows either case (ii) or (iii). Now we shall prove that in this situation only case (ii) is possible. According to the last part of the proof of Theorem 2, we can formulate the following lemma.
Lemma 7. If pq is primitive and p, q are nonempty, then p(qp)^n and q(pq)^n are not conjugates for any n ≥ 1.
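A small exhaustive search illustrates Lemma 7 and shows why the restriction n ≥ 1 matters. The sketch is ours, over the binary alphabet and with hypothetical bounds `max_len` and `max_n`; it is a sanity check on small cases, not a proof.

```python
from itertools import product


def is_primitive(w: str) -> bool:
    """w is primitive iff it reoccurs in ww first at position |w|."""
    return (w + w).find(w, 1) == len(w)


def are_conjugates(u: str, v: str) -> bool:
    """u and v are conjugates iff |u| = |v| and v is a factor of uu."""
    return len(u) == len(v) and v in u + u


def words(alphabet: str, max_len: int):
    """All nonempty words over the alphabet up to the given length."""
    for n in range(1, max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def check_lemma7(max_len: int = 3, max_n: int = 3):
    """Return all (p, q, n) violating Lemma 7 within the given bounds."""
    bad = []
    for p in words("ab", max_len):
        for q in words("ab", max_len):
            if not is_primitive(p + q):
                continue
            for n in range(1, max_n + 1):
                if are_conjugates(p + (q + p) * n, q + (p + q) * n):
                    bad.append((p, q, n))
    return bad


violations = check_lemma7()
```

For n = 0 the statement would fail: p = ab and q = ba are conjugates although pq = abba is primitive.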

This easily yields
Corollary 2. If α and β are different conjugates, then αβ must be primitive.
Proof. Assume the contrary that αβ is not primitive, so we have αβ = t^i, where t is primitive and i ≥ 2. Now if i is even, then immediately α = β, which is a contradiction. For odd i = 2n + 1 we have α = t^n p, β = q t^n, where t = pq. But α and β are conjugates and so, by Lemma 7, we have a contradiction.
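Corollary 2 can likewise be checked on all short binary words. In the sketch below (ours; the bound `max_len` and the function names are assumptions) every word α is paired with each of its rotations β ≠ α, and the product αβ is tested for primitivity.

```python
from itertools import product


def is_primitive(w: str) -> bool:
    """w is primitive iff it reoccurs in ww first at position |w|."""
    return (w + w).find(w, 1) == len(w)


def rotations(w: str):
    """All conjugates (cyclic shifts) of w, as a set."""
    return {w[k:] + w[:k] for k in range(len(w))}


def check_corollary2(max_len: int = 5):
    """Return all pairs of distinct conjugates whose product is not primitive."""
    bad = []
    for n in range(1, max_len + 1):
        for letters in product("ab", repeat=n):
            alpha = "".join(letters)
            for beta in rotations(alpha) - {alpha}:
                if not is_primitive(alpha + beta):
                    bad.append((alpha, beta))
    return bad


violations = check_corollary2()
```

Note that α need not be primitive here: for α = abab and β = baba the product ababbaba is still primitive.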
In fact Corollary 2 is a special case of the claim in [LeS] which states, under the additional assumption that α, β are primitive, that αβ^m is primitive for all natural numbers m. The proof is not difficult, but we need only this special case to prove the next result.
Corollary 3. Consider a set X = {α, β} with α, β ∈ Σ^+. Let w be a bi-infinite word over Σ and F_1, F_2 two different X-factorizations of w containing together both elements of X. If the roots of α, β are noncommuting conjugates, then F_1 ∈ α^Z, F_2 ∈ β^Z, or vice versa.
Proof. Again, as in the proof of Theorem 2, we can assume that α, β are primitive. We have to show that one of the X-factorizations F_1, F_2 consists only of α's and the other only of β's. So assume the contrary that the lower factorization contains both α and β. Without loss of generality we have the situation shown in Figure 12. Since β is primitive we have z = α, by Lemma 2. We can write β = u_2 u_3 = u_3 u_4, and so, by Lemma 3, we have u_2 = (pq)^i, u_4 = (qp)^i, u_3 = p(qp)^n, where pq is primitive. If z_L = α, then, by Lemma 2, αβ is not primitive, which contradicts Corollary 2. Hence assume z_L = β, which implies u_1 = u_3. Similarly u_5 = u_3 and we have p(qp)^n (pq)^i = u_1 u_2 = α = u_4 u_5 = (qp)^i p(qp)^n. By Lemma 5, p and q then commute, which is again a contradiction.

The Uniqueness of the Bi-infinite Word
In [KMP2], cf. Theorem 1, it was proved that if the rank of the set X equals card(X), then the number of X-ambiguous bi-infinite words is finite. In this section we shall prove that in the two-element case, for each set X, there is at most one X-ambiguous bi-infinite word. This holds also in the case when r_c(X) = 1, since then both elements of X are powers of a common word t and the only possible bi-infinite word is t^Z. The situation is also trivial in the case when the roots of the elements of X = {α, β} are conjugates: by Corollary 3 the only possible bi-infinite word is w = α^Z = β^Z. So we need to consider only the case when the roots of α and β are not conjugates.
In this case, by Theorem 2, we know that an X-ambiguous bi-infinite word must be of the form (αβ^n)^Z or (α^n β)^Z. Moreover, since w has two X-factorizations, the word αβ^n or the word α^n β cannot be primitive, by Lemma 2.
As we stated in the previous section, if α and β are conjugates, then the words αβ^n and α^n β are primitive for all n. Now we shall show a similar result for α, β being non-conjugates, i.e., we shall show that at most one word in the set of words {αβ^n ; n ≥ 1} ∪ {α^m β ; m ≥ 1} is not primitive. By this result, also in the last case there is at most one X-ambiguous bi-infinite word. We need two lemmas.
Lemma 8. Let α, β be primitive and not conjugates. Then for any n, m ≥ 0 with n ≠ m, at most one of the words αβ^n and αβ^m is not primitive.
Proof. Assume the contrary that both αβ^n and αβ^m are non-primitive with m < n. For m = 0 the claim is obvious, so we can assume m ≥ 1, and so n ≥ 2. We can write αβ^n = s^i, αβ^m = t^j, and therefore also
s^i = t^j β^{n−m}, (1)
where s, t are primitive and i, j ≥ 2. Now if n − m ≥ 2, then, by Lemma 4, s, t and β are powers of a common word, and so are α and β, which is a contradiction. So we can assume m = n − 1, and thus equation (1) simplifies to s^i = t^j β. Without loss of generality, we can assume that |t| > |β|. We have the situation depicted in Figure 13, where […] Since u_2 α = α u_1, Lemma 3 gives us […], where k ≥ 1, l ≥ 0 and pq is primitive. We may assume p, q ≠ 1. Now, considering the last occurrence of s in Figure 13, we can, by (6), write s = s′β = s′(qp)^k (pq)^k. We also have […] for some r ≠ 1. The first occurrence of qp in s after s′ must match qp in w, otherwise qp is not primitive. But then, since r ≠ 1, the first occurrence of pq in s after s′(qp)^k matches some qp in w, so we have pq = qp, which is again a contradiction with the primitiveness of pq.
As a consequence of Lemmas 8 and 9 we have the following corollary:
Corollary 4. Let α, β be primitive and not conjugates. Then at most one word in the set {αβ^n ; n ≥ 1} ∪ {α^m β ; m ≥ 1} is not primitive.
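Corollary 4 can be explored experimentally. The sketch below is ours and only inspects exponents up to a hypothetical bound `max_exp` and binary words up to length `max_len`; it collects, for each pair of primitive non-conjugate words, the non-primitive members of {αβ^n} ∪ {α^m β} within that range.

```python
from itertools import product


def is_primitive(w: str) -> bool:
    """w is primitive iff it reoccurs in ww first at position |w|."""
    return (w + w).find(w, 1) == len(w)


def are_conjugates(u: str, v: str) -> bool:
    """u and v are conjugates iff |u| = |v| and v is a factor of uu."""
    return len(u) == len(v) and v in u + u


def words(alphabet: str, max_len: int):
    """All nonempty words over the alphabet up to the given length."""
    for n in range(1, max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def nonprimitive_words(alpha: str, beta: str, max_exp: int = 4):
    """Non-primitive words among alpha beta^n and alpha^m beta, 1 <= n, m <= max_exp."""
    candidates = {alpha + beta * n for n in range(1, max_exp + 1)}
    candidates |= {alpha * m + beta for m in range(1, max_exp + 1)}
    return {w for w in candidates if not is_primitive(w)}


def check_corollary4(max_len: int = 4, max_exp: int = 4):
    """Return pairs (alpha, beta) with more than one non-primitive word found."""
    prim = [w for w in words("ab", max_len) if is_primitive(w)]
    return [(a, b) for a in prim for b in prim
            if not are_conjugates(a, b)
            and len(nonprimitive_words(a, b, max_exp)) > 1]


violations = check_corollary4()
example = nonprimitive_words("aaba", "ab")
```

For α = aaba and β = ab the only non-primitive word found is αβ = (aab)^2, so for X = {aaba, ab} the word (aab)^Z is the candidate X-ambiguous word.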
Finally, we can state the result of this section, which is a consequence of Corollary 4 and the considerations at the beginning of this section.
Theorem 3. Consider a set X = {α, β} with α, β ∈ Σ^+. There is at most one X-ambiguous bi-infinite word over Σ.

The Existence of the Bi-infinite Word
We consider again only the two-element case in this section. In the previous section we proved that there is at most one X-ambiguous bi-infinite word. It is natural to ask when such a word exists. It is easy to see that there are sets X for which there is no X-ambiguous bi-infinite word. For example, take a set […] which implies u_1 = u_3 u_2 v = u_3 (u_2 u_3)^{m−1} pq. We obtained exactly solution (10), which completes the proof.
The following lemma gives us the characterization of the solutions of equation (8) and hence also of the sets X allowing an X-ambiguous bi-infinite word in case (iii).
Lemma 12. Assume that α and β do not commute. All solutions of the equation […] where p, q ∈ Σ^+, j ≥ 0 and j < i if n = 1.
The following theorem summarizes the previous results.
Notice that in the last case of Theorem 4 the occurrence of β^{−1} can be eliminated, but we prefer this form for its simplicity. This theorem shows that the family of the two-element sets X such that there exists an X-ambiguous bi-infinite word is parameterizable. Such a characterization does not help us if we want to decide whether there is an X-ambiguous bi-infinite word for a given set X, but we can use it to generate all such sets.

Conclusions and Open Problems
Our Theorem 2 is closely related to the main result of [lRlR], which characterizes when a finite word can have two disjoint X-interpretations for a binary set X. Our result could be concluded, with some effort, from the considerations in that paper. However, our proof is simpler, due to the use of the graph lemma (Lemma 5), and moreover directly formulated to obtain a defect-type theorem.
We pose an open problem asking whether Theorem 2 can be extended to arbitrary sets.

Example 1. Let X = {a, bab, baab}. The word (baa)^Z has two different X-factorizations, namely the ones depicted as: … b a a b a a b … They are clearly shiftequivalent. On the other hand the word w = … bababaabaab … = ^N(ba) b (aab)^N also has two different X-factorizations, which, however, are not shiftequivalent: … a b a b a b a b a a b a a b a a b a a …
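For a periodic word u^Z, the periodic X-factorizations can be found by tiling a circular copy of u^k with words of X. The following sketch is ours and covers only this periodic situation (so it applies to (baa)^Z, not to the nonperiodic word w above); `cyclic_factorizations` and the parameter `k` are hypothetical names. A factorization is represented by its set of cut positions modulo k|u|.

```python
def cyclic_factorizations(u: str, X, k: int = 1):
    """Cut-position sets of all X-factorizations of the circular word u^k."""
    circ = u * k
    length = len(circ)
    doubled = circ + circ  # lets us read factors across the wrap-around
    found = set()

    def extend(start, covered, cuts):
        if covered == length:  # walked exactly once around the circle
            found.add(frozenset(cuts))
            return
        pos = (start + covered) % length
        for x in X:
            if covered + len(x) <= length and doubled[pos:pos + len(x)] == x:
                extend(start, covered + len(x), cuts + [pos])

    for start in range(length):
        extend(start, 0, [])
    return found


# Example 1: (baa)^Z with X = {a, bab, baab}; the factorizations have period 6.
example1 = cyclic_factorizations("baa", ["a", "bab", "baab"], k=2)

# Example 2: (ab)^Z with X = {ab, ba}.
example2 = cyclic_factorizations("ab", ["ab", "ba"])
```

For Example 1 the two cut sets are {0, 4, 5} and {1, 2, 3}: they are disjoint, and one is the other shifted by 3, in line with the factorizations being shiftequivalent. For Example 2 the cut sets {0} and {1} correspond to the factorizations using only ab and only ba.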

Lemma 1 (Fine and Wilf). Let u, v ∈ Σ^+. If the words u^N and v^N have a common prefix of length at least |u| + |v| − gcd(|u|, |v|), then u and v are powers of a common word.
Lemma 2. No primitive word r satisfies a relation rr = srp with s ≠ 1 and p ≠ 1.
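Both lemmas are easy to test on concrete words. The sketch below is ours; the function names are hypothetical, and `first_mismatch` relies on Lemma 1 itself to conclude that u^N = v^N when no mismatch occurs among the first |u| + |v| positions.

```python
from math import gcd


def first_mismatch(u: str, v: str):
    """Length of the longest common prefix of u^N and v^N,
    or None if the two infinite words coincide."""
    for k in range(len(u) + len(v)):
        if u[k % len(u)] != v[k % len(v)]:
            return k
    return None  # by Lemma 1, u and v are powers of a common word


def fine_wilf_bound(u: str, v: str) -> int:
    """The threshold |u| + |v| - gcd(|u|, |v|) of Lemma 1."""
    return len(u) + len(v) - gcd(len(u), len(v))


def inner_occurrences(r: str):
    """Positions 0 < k < |r| at which r occurs inside rr (Lemma 2:
    there are none when r is primitive)."""
    return [k for k in range(1, len(r)) if (r + r)[k:k + len(r)] == r]
```

The bound of Lemma 1 is tight: for u = abaab and v = aba the common prefix has length 6, one less than the threshold 7, and the two words are not powers of a common word. For Lemma 2, the primitive word aab has no inner occurrence in its square, whereas abab occurs in abababab at position 2.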

Fig. 1. An illustration of the situation considered in Lemma 6.

Fig. 5. The second case of 3 cases shown in Figure 4.

Fig. 11. The case z_R = β.
So it remains to consider the case z_R = β. The situation is drawn in Figure 11. Since β is primitive, z must be α. It is obvious that |t| = |v_2| = |v_1|, which implies t = v_1. We know that β ends with v_2 v_4, and hence there is v_2 v_4 at the end of the last upper β in Figure 11, and v_4 at the end of the last lower β. But since |v_2 v_4| = |v_4 t| we have the equation v_2 v_4 = v_4 t = v_4 v_1. According to Figure 9 and Figure 11 we have the following system of equations with unknowns Y = {β, v_1, v_2, v_3, v_4}:

Fig. 12. The situation when α and β are conjugates and the lower factorization contains both α and β.