Applying a uniform marked morphism to a word

We describe the relationship between different parameters of an initial word and of its image under a uniform marked morphism. The functions described include the subword complexity, the frequency of factors, and the recurrence function. The relations obtained for the image of a word can also be used for the image of a factorial language. Using induction, we give a full description of the functions involved for the fixed point of the morphism considered.


Introduction
Different languages and words generated by morphisms play a significant part in formal language theory. Such languages include D0L languages (and infinite words), which are obtained by iterating a morphism; HD0L languages (and words), which are obtained from a D0L language (respectively, a D0L word) by application of another morphism; and others.
To emphasize the importance of such languages, we mention the theory of avoidable patterns. Examples of infinite words on a given alphabet which avoid a pattern are usually constructed as D0L or HD0L words. Only recently has it been proved [6] that there exists a pattern which is avoidable on the binary alphabet but is not avoidable by any binary D0L word. However, it is still unknown whether every k-avoidable pattern is avoidable by some HD0L word on the k-letter alphabet.
We consider the general situation of applying a morphism to a word. Imposing some restrictions on the morphism considered (it must be uniform, marked, and circular on the initial word), we find explicit formulas linking the quantitative characteristics of the initial word and of its image. This approach is also valid for the application of a morphism to a language (see Subsection 7.1) and allows us to give a full description of these characteristics for a D0L word (see Subsection 7.2).
This research can be considered as an extension and generalization of the papers [10, 9] (devoted to the subword complexity) and [8] (devoted to the frequency of factors), where the properties of a D0L word were considered. However, a new approach is used here, based on direct relations between the sets of allowable words of given lengths. This approach can be applied to find many functions of a language; some of them are considered here.

Uniform marked morphisms
We consider two alphabets Σ_1 = {a_1, . . ., a_q} and Σ_2 and a morphism ϕ : Σ_1^* → Σ_2^*; clearly, ϕ can be extended to the set Σ_1^ω of all (right) infinite words on Σ_1. We call the images ϕ(a_i) of letters blocks; the length of a finite word u is denoted by |u|. A morphism is called non-erasing if all its blocks are non-empty, i.e., if |ϕ(a_i)| > 0 for all letters a_i ∈ Σ_1. We suppose ϕ to be non-erasing.
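As an illustration, a morphism over a finite alphabet can be sketched as a mapping from letters to their blocks; this is only a sketch, and the names (`apply_morphism`, `PHI_TM`) are ours, not the paper's.

```python
def apply_morphism(phi, word):
    """Apply a non-erasing morphism letter by letter: phi(u0 u1 ...) = phi(u0) phi(u1) ..."""
    return "".join(phi[c] for c in word)

# The Thue-Morse morphism, a running example below; it is uniform with block length m = 2.
PHI_TM = {"a": "ab", "b": "ba"}
print(apply_morphism(PHI_TM, "aba"))  # abbaab
```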
The (finite or infinite) word we consider is denoted by w; its image is denoted by ϕ(w). Note that since ϕ is non-erasing, ϕ(w) is finite if and only if w is finite.
We say that a word u is allowable in a word v if it occurs in v as a factor: v = s_1 u s_2 for some words s_1 and s_2. The set of words allowable in w is denoted by A, and the set of words allowable in ϕ(w) is denoted by B. Our goal is to describe the relationship between A and B.
A morphism ϕ is called uniform if all its blocks are of the same length: |ϕ(a_i)| = m for all a_i ∈ Σ_1; here m is called the block length of ϕ. If ϕ is uniform, then the length of the image of a word u depends only on the length of u; so, there exists a direct relationship between the lengths of elements of A and of elements of B. That is why we consider uniform morphisms in this paper.
A morphism is called marked if the images of letters all begin with distinct symbols and all end with distinct symbols. If ϕ is marked, then every block, together with its inverse image, is uniquely determined by its first (or last) symbol.
The triple s = (u, j, k), where u = u_0 u_1 . . . u_{n+1} ∈ Σ_1^+ and j and k are integers with 0 ≤ j < |ϕ(u_0)| and 0 ≤ k < |ϕ(u_{n+1})|, is called an interpretation of the word v obtained from ϕ(u) by erasing the first j symbols and the last k symbols. The word u is called the ancestor of the interpretation s and is denoted by a(s).
If in addition u ∈ A, then s is called an interpretation of v on w. Note that a word can admit several interpretations, and even several interpretations on w with the same ancestor. The set of all interpretations of a word v is denoted by I(v), and the set of its interpretations on w is denoted by I_w(v).
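For a uniform morphism, all interpretations of a short word can be found by brute force over candidate ancestors; the following is a hedged sketch (the function name is ours), using the fact that an ancestor length ℓ must satisfy mℓ − j − k = |v| with 0 ≤ j, k < m.

```python
from itertools import product

def interpretations(phi, v):
    """All triples (u, j, k) such that v is obtained from phi(u) by erasing
    the first j and the last k symbols, with 0 <= j, k < m (phi uniform)."""
    m = len(next(iter(phi.values())))
    found = set()
    for L in range(max(1, len(v) // m), len(v) // m + 3):  # candidate ancestor lengths
        for u in map("".join, product(sorted(phi), repeat=L)):
            img = "".join(phi[c] for c in u)
            for j in range(m):
                k = len(img) - j - len(v)
                if 0 <= k < m and img[j:len(img) - k] == v:
                    found.add((u, j, k))
    return found

PHI_TM = {"a": "ab", "b": "ba"}
# the word ab admits two interpretations, as used in the circularity example below:
print(sorted(interpretations(PHI_TM, "ab")))  # [('a', 0, 0), ('bb', 1, 1)]
```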
The notions of an interpretation and its ancestor were introduced by F. Mignosi and P. Séébold in [11].
The notion of a synchronization point was introduced by J. Cassaigne in [4]. Put very simply, a synchronization point of u marks a position in u where a demarcation between blocks necessarily takes place in every occurrence of u in ϕ(w).
The word u ∈ B is called circular on w if it has a synchronization point on w.
Remark 2 If ϕ is uniform, then in a circular word all the demarcations between blocks are determined.
Remark 3 If ϕ is marked, then the circularity of a word u ∈ B means that it admits a unique interpretation on w: since we know a synchronization point, we can reconstruct every block of this interpretation from its first (or last) symbol. The ancestor of this interpretation is called the ancestor of the word u and is denoted by a(u). We call u a descendant of a(u).
The morphism ϕ is called circular on w with synchronization delay L = L(ϕ, w) if every word allowable in ϕ(w) of length at least L is circular. This notion was inspired by the papers of Mignosi and Séébold [11] and Cassaigne [4].
Let ϕ be a uniform marked morphism with block length m, and let ϕ be circular on w with synchronization delay L. In addition to L, we use another parameter of circularity, called the structure ordering number; it is denoted by K = K(ϕ, w) and defined as the least integer satisfying the inequality m(K − 1) + 1 ≥ L. In what follows, we mostly use the synchronization delay through the mediation of the structure ordering number.

Example: circularity of the Thue-Morse morphism
Let us consider the famous Thue-Morse morphism ϕ_TM on the two-letter alphabet Σ = {a, b}: ϕ_TM(a) = ab, ϕ_TM(b) = ba. Clearly, ϕ_TM is a uniform marked morphism. Our goal is to find the set C_TM of words such that ϕ_TM is circular on a word w if and only if w ∈ C_TM.
First, let every power of both symbols occur in w, i.e., let a^n ∈ A and b^n ∈ A for all n. In this case, the word (ab)^n is allowable in ϕ(w) and non-circular for all n. Indeed, it admits two interpretations on w: (a^n, 0, 0) and (b^{n+1}, 1, 1). So, there exist arbitrarily long non-circular words in B, and ϕ is not circular on w; that is, w ∉ C_TM. Conversely, if a^{n+1} is not allowable in w for some n, then w ∈ C_TM; more precisely, ϕ is circular on w with synchronization delay L ≤ 2n + 1.
Indeed, we shall prove that a word u ∈ B of length 2n + 1 is always circular on w. If two equal symbols occur successively in u, then there is a demarcation between blocks between them; so, two successive equal symbols determine a synchronization point, and u is circular.
If there are no successive equal symbols in u, then u = (ab)^n a or u = (ba)^n b. In both cases, u admits two interpretations; the ancestor of one of them is a^{n+1}, and the ancestor of the other is b^{n+1}. Since a^{n+1} is not allowable in w, and u is allowable, u admits exactly one interpretation on w (with ancestor b^{n+1}) and is circular.

Anna Frid
So, a word u ∈ B of length 2n+1 is always circular, and ϕ is circular on w with synchronization delay L ≤ 2n + 1.
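The bound just proved can be checked by brute force for n = 1 on a word in which bb is not allowable (the case symmetric to the one treated above), here a prefix of the Fibonacci word; this is only a hedged sketch, and the helper names are ours.

```python
PHI_TM = {"a": "ab", "b": "ba"}          # block length m = 2

def tm_image(u):
    return "".join(PHI_TM[c] for c in u)

def interpretations_on(w, v):
    """Interpretations (u, j, k) of v whose ancestor u is a factor of w."""
    found = set()
    for L in range(1, len(v) // 2 + 3):
        for i in range(len(w) - L + 1):
            u = w[i:i + L]
            img = tm_image(u)
            for j in range(2):
                k = len(img) - j - len(v)
                if 0 <= k < 2 and img[j:len(img) - k] == v:
                    found.add((u, j, k))
    return found

w = "abaababaabaababaababa"              # Fibonacci prefix: bb never occurs
image = tm_image(w)
# every length-3 factor of the image admits a unique interpretation on w,
# in accordance with the bound L <= 2n + 1 = 3 for n = 1
assert all(len(interpretations_on(w, image[i:i + 3])) == 1
           for i in range(len(image) - 2))
print("all length-3 factors are circular on w")
```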
Analogously, ϕ is circular on w if for some n the word b^{n+1} is not allowable in w. So, we have described the set C_TM: it consists exactly of those words w such that for some n the word a^{n+1} or the word b^{n+1} is not allowable in w.

Basic relations

On the number of occurrences
For words u and v, we denote the number of occurrences of u in v by L_u(v). The following easy lemma is valid for an arbitrary morphism ϕ : Σ_1^* → Σ_2^* and is very useful in the subsequent discussion.

Lemma 1 For all words u ∈ Σ_1^+ and v ∈ Σ_2^+,

L_v(ϕ(u)) = Σ_{s ∈ I(v)} L_{a(s)}(u).   (1)
Proof. Every occurrence of v in ϕ(u) corresponds to an occurrence in u of the ancestor of a certain interpretation of v, and vice versa.
Remark 4 As was noticed in Remark 3, if ϕ is marked and v is circular, then v admits a unique interpretation, whose ancestor can be denoted by a(v). In this case, Equality (1) can be rewritten as

L_v(ϕ(u)) = L_{a(v)}(u).   (2)

Sets of allowable words of given lengths
Henceforward, unless otherwise specified, ϕ is a uniform morphism with block length m, and w is an arbitrary (finite or infinite) word.
In what follows, we obtain some relations linking quantitative characteristics of sets A and B of allowable words.We shall also strengthen these relations by assuming that ϕ is marked and circular on w.
To state the next results, we need to introduce some more notation. First, recall that for a word u = u_0 . . . u_n, u_i ∈ Σ_1, the word Ψ_{jk}(u) is obtained from ϕ(u) by erasing j symbols from the left and k symbols from the right; here we need only the case when we do not erase complete blocks, i.e., when 0 ≤ j, k < m.
For a set M ⊂ Σ_1^* and for j, k ∈ {0, . . ., m − 1}, we define the set Ψ_{jk}(M) naturally: Ψ_{jk}(M) = {Ψ_{jk}(u) | u ∈ M}. Clearly, in the general case #Ψ_{jk}(M) ≤ #M, since the map Ψ_{jk} need not be one-to-one. We also denote by A(n) and B(n) the sets of words of length n from A and B, respectively. Now we can formulate the main theorem. As its corollaries, we shall obtain a variety of relations among the quantitative characteristics of A and B.
Theorem 1 Let N = m(n − 1) + ∆ + 1 for some n > 0 and some ∆ ∈ {0, . . ., m}. Then

B(N) = ⋃_{j+k = m−∆−1} Ψ_{jk}(A(n)) ∪ ⋃_{j+k = 2m−∆−1} Ψ_{jk}(A(n + 1)),   (3)

where in both unions 0 ≤ j, k ≤ m − 1. If in addition ϕ is marked and circular on w with structure ordering number K, and n ≥ K, then all the maps Ψ_{jk} above are one-to-one on the sets A(n) (respectively, A(n + 1)), and all the mentioned sets Ψ_{jk}(A(n + l)), l ∈ {0, 1}, are mutually disjoint.
Proof. Let us consider a word v ∈ B(N). Since it is allowable in ϕ(w), it occurs as a factor in some word ϕ(u) with u ∈ A. We choose u as short as possible; that is, v must contain parts of the first and the last blocks of ϕ(u). In other terms, we choose u ∈ A such that v = Ψ_{jk}(u) for some j, k ∈ {0, . . ., m − 1}. It can easily be seen that the length of u must be equal to n or to n + 1. More exactly, either u ∈ A(n), and then v ∈ Ψ_{jk}(A(n)) with j + k = m − ∆ − 1, or u ∈ A(n + 1), and then v ∈ Ψ_{jk}(A(n + 1)) with j + k = 2m − ∆ − 1. The converse inclusion is obvious, and the equality is proved.

Now let ϕ be marked and circular on w with structure ordering number K, and let n ≥ K. The sets Ψ_{jk}(A(n + l)), l ∈ {0, 1}, mentioned in Equality (3) could intersect only because some of their elements admitted several interpretations; for the same reason, a map Ψ_{jk} could fail to be one-to-one on A(n + l). However, since |v| = N = m(n − 1) + ∆ + 1 ≥ m(K − 1) + 1 ≥ L, every word v ∈ B(N) is circular. Since ϕ is marked, v admits a unique interpretation. Thus, the theorem is proved.
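A hedged numeric check of Equality (3) for ϕ_TM (m = 2) on a prefix of the Fibonacci word, with n = 4 and ∆ = 1, so N = 8 and the admissible pairs are (j, k) = (0, 0) on A(4) and (j, k) = (1, 1) on A(5); the helper names are ours.

```python
def factors(s, n):
    return {s[i:i + n] for i in range(len(s) - n + 1)}

w = "a"
for _ in range(18):                                  # a long Fibonacci prefix
    w = "".join({"a": "ab", "b": "a"}[c] for c in w)
phi = {"a": "ab", "b": "ba"}                         # phi_TM, block length m = 2
image = "".join(phi[c] for c in w)

m, n, delta = 2, 4, 1
N = m * (n - 1) + delta + 1                          # N = 8
psi_00 = {"".join(phi[c] for c in u) for u in factors(w, n)}            # j + k = m - delta - 1 = 0
psi_11 = {"".join(phi[c] for c in u)[1:-1] for u in factors(w, n + 1)}  # j + k = 2m - delta - 1 = 2
assert factors(image, N) == psi_00 | psi_11
# with n >= K the union is disjoint and the maps are one-to-one, so sizes add up:
assert len(psi_00 | psi_11) == (m - delta) * len(factors(w, n)) + delta * len(factors(w, n + 1))
print(len(factors(image, N)))  # 11
```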
Remark 5 Here and in Theorems 2 and 3, some redundancy exists in the range of values of ∆: it would be sufficient to consider only ∆ ∈ {0, . . ., m − 1} or ∆ ∈ {1, . . ., m}. The latter version will be more useful for us, but we do not want to lose the case N = m(K − 1) + 1.
In what follows, we derive some corollaries of Equalities (1) and (2) and Theorem 1 concerning functions of w and ϕ(w).

Subword complexity
In this section, we study the subword complexity functions of w and ϕ(w). The subword complexity f_s(n) of a word s counts the number of distinct words of length n which are allowable in s. This function of infinite words has been studied in numerous papers; see, for instance, the survey [1] by J.-P. Allouche.
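On a long finite prefix, the subword complexity can be computed directly; a sketch (the first values printed for the Thue-Morse word are the classical ones):

```python
def subword_complexity(s, n):
    """Number of distinct factors of length n of the finite word s."""
    return len({s[i:i + n] for i in range(len(s) - n + 1)})

tm = "a"
for _ in range(10):                                   # Thue-Morse prefix of length 2**10
    tm = "".join({"a": "ab", "b": "ba"}[c] for c in tm)
print([subword_complexity(tm, n) for n in range(1, 6)])  # [2, 4, 6, 10, 12]
```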
In our terms, the subword complexity functions of w and ϕ(w) are given by f_w(n) = #A(n) and f_{ϕ(w)}(n) = #B(n). Here we derive a corollary of Theorem 1 which expresses the subword complexity of ϕ(w) in terms of the subword complexity of w.

Theorem 2 Let ϕ be a uniform morphism with block length m, and let N = m(n − 1) + ∆ + 1 for some n > 0 and some ∆ ∈ {0, . . ., m}. Then

f_{ϕ(w)}(N) ≤ (m − ∆) f_w(n) + ∆ f_w(n + 1).

If in addition ϕ is circular on w and marked, and n ≥ K, then equality holds:

f_{ϕ(w)}(N) = (m − ∆) f_w(n) + ∆ f_w(n + 1).   (4)

Proof. Using Equality (3), we obtain that

f_{ϕ(w)}(N) = #B(N) ≤ Σ_{j+k = m−∆−1} #Ψ_{jk}(A(n)) + Σ_{j+k = 2m−∆−1} #Ψ_{jk}(A(n + 1)) ≤ (m − ∆) #A(n) + ∆ #A(n + 1) = (m − ∆) f_w(n) + ∆ f_w(n + 1).

Clearly, if the maps Ψ_{jk} above are one-to-one on the respective sets A(n) and A(n + 1), and the images Ψ_{jk}(A(n + l)), l ∈ {0, 1}, are mutually disjoint, then all the inequalities of the latter derivation become equalities. Thus, it follows from Theorem 1 that if ϕ is marked and circular on w with structure ordering number K, and n ≥ K, then Equality (4) holds.
Remark 6 Note that the subword complexity function of ϕ(w) does not depend on ϕ itself or even on the alphabet Σ_2. The only necessary parameters of ϕ are its block length and the structure ordering number of ϕ on w.

Example: the subword complexity of images of Sturmian words
An infinite word w is called Sturmian if its subword complexity is f_w(n) = n + 1; since f_w(1) = 2, the alphabet of a Sturmian word is binary: Σ_1 = {a, b}. It can easily be seen that the subword complexity function of Sturmian words is the least possible for a non-ultimately-periodic word.
Sturmian words were introduced in the classical paper [12] and have been extensively studied; see, for example, the survey [3].There exists a variety of equivalent definitions of Sturmian words; however, we are interested just in the subword complexity.
A famous example of a Sturmian word is the Fibonacci word w F = abaababaabaababaababa . . . .
Let w be a Sturmian word, and let ϕ be a uniform marked morphism circular on w; let its block length be m and its structure ordering number be K(ϕ, w) = K. It follows from Theorem 2 that for every n ≥ K and every ∆ ∈ {0, . . ., m − 1} the following equality holds:

f_{ϕ(w)}(m(n − 1) + ∆ + 1) = (m − ∆)(n + 1) + ∆(n + 2) = mn + m + ∆;   (5)

in other terms, f_{ϕ(w)}(N) = N + 2m − 1 for every N ≥ m(K − 1) + 1. This equality conforms to Proposition 8 in [5], which states that a morphic image of a Sturmian word is quasi-Sturmian, i.e., has subword complexity f(n) = n + c for some c and all n ≥ n_0.
In particular, let ϕ be the Thue-Morse morphism ϕ_TM (see Example 2.3); its block length is m = 2. Without loss of generality, we may assume that the word bb never occurs in the Sturmian word w (since ab and ba must occur in a non-ultimately-periodic binary word, and f_w(2) = 3). So, it follows from Example 2.3 that the Thue-Morse morphism is circular on a Sturmian word with synchronization delay L ≤ 3 (in fact, L = 3). The structure ordering number is K = 2, and for every n ≥ 3 we have f_{ϕ_TM(w)}(n) = n + 3. In particular, this is the subword complexity of the word ϕ_TM(w_F) = abbaababbaabbaababbaababba . . . .
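A hedged numeric check of f(n) = n + 3 for the Thue-Morse image of the Fibonacci word, computed on a long finite prefix (the helper names are ours):

```python
def complexity(s, n):
    return len({s[i:i + n] for i in range(len(s) - n + 1)})

w = "a"
for _ in range(20):                                    # Fibonacci prefix
    w = "".join({"a": "ab", "b": "a"}[c] for c in w)
image = "".join({"a": "ab", "b": "ba"}[c] for c in w)  # its Thue-Morse image

assert all(complexity(w, n) == n + 1 for n in range(1, 10))      # Sturmian: n + 1
assert all(complexity(image, n) == n + 3 for n in range(3, 12))  # image: n + 3 for n >= 3
print("f_image(n) = n + 3 for n = 3..11")
```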

Frequency of allowable words and frequency tables
In this section, we deal with the frequency of factors in an infinite word s on an alphabet Σ. Let s(n) denote the prefix of s of length n; for every word u ∈ Σ^+, its frequency µ_s(u) in s is defined as the limit

µ_s(u) = lim_{n→∞} L_u(s(n)) / n.

Of course, this definition is valid only if the limit exists. The set of frequencies µ_s determines a probability measure on the Borel subsets of Σ^ω: the frequency µ_s(u) can be considered as the measure of the cylinder set [u] = {s | s = us′, s′ ∈ Σ^ω}.
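The limit above can be approximated by counting occurrences in a long prefix; a sketch (the golden-ratio value printed for the Fibonacci word is classical):

```python
def prefix_frequency(s, u):
    """Empirical frequency L_u(s)/|s| of u in the finite prefix s."""
    count = sum(1 for i in range(len(s) - len(u) + 1) if s.startswith(u, i))
    return count / len(s)

w = "a"
for _ in range(25):                                   # long Fibonacci prefix
    w = "".join({"a": "ab", "b": "a"}[c] for c in w)
# the frequency of a in the Fibonacci word is 1/phi = 0.6180..., phi the golden ratio
print(round(prefix_frequency(w, "a"), 3))             # 0.618
```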

Frequency of a word in ϕ(w)
Our goal is to relate the frequencies of words in ϕ(w) to the frequencies of words in w. Here we assume that w is an infinite word on Σ_1, and ϕ is a uniform morphism with block length m.
Lemma 2 If the frequency µ_w(u) exists for every word u ∈ Σ_1^+, then for every v ∈ Σ_2^+ its frequency in ϕ(w) exists and is equal to

µ_{ϕ(w)}(v) = (1/m) Σ_{s ∈ I_w(v)} µ_w(a(s)).   (6)

Proof. Since the ancestors of interpretations s ∈ I(v) \ I_w(v) do not occur in w, and ϕ(w(n)) = ϕ(w)(mn), it follows from Lemma 1 that for every n > 0

L_v(ϕ(w)(mn)) = Σ_{s ∈ I_w(v)} L_{a(s)}(w(n)).

Dividing this equality by mn and passing to the limit as n → ∞ (the counts change by a bounded amount between prefix lengths mn and m(n + 1), so the full limit exists), we obtain that µ_{ϕ(w)}(v) = (1/m) Σ_{s ∈ I_w(v)} µ_w(a(s)).
A similar formula for a fixed point of a non-erasing morphism was proved in [8].
Using Formula (6), one can compute the frequency of a word in ϕ(w) from the frequencies in w of the ancestors of its interpretations. Note that if ϕ is marked and circular on w, and the length of v exceeds the synchronization delay, then v has the only ancestor a(v) (see Remark 3), and Formula (6) can be rewritten as

µ_{ϕ(w)}(v) = (1/m) µ_w(a(v)).   (7)

In what follows, we use Theorem 1 and Equality (7) to describe the set of frequencies of words of a given length in ϕ(w) in terms of the frequencies of words in w. To do it, we introduce the notion of frequency tables.
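A hedged empirical check of the rewritten formula for ϕ_TM on the Fibonacci word: the length-3 factor v = aba exceeds the synchronization delay is circular with ancestor a(v) = aa (its alternative ancestor bb never occurs), so its frequency in the image should be half the frequency of aa in w; the helper names are ours.

```python
def count(s, u):
    return sum(1 for i in range(len(s) - len(u) + 1) if s.startswith(u, i))

w = "a"
for _ in range(22):                                    # Fibonacci prefix
    w = "".join({"a": "ab", "b": "a"}[c] for c in w)
image = "".join({"a": "ab", "b": "ba"}[c] for c in w)  # phi_TM image, m = 2

lhs = count(image, "aba") / len(image)                 # empirical mu_{phi(w)}(aba)
rhs = count(w, "aa") / len(w) / 2                      # mu_w(a(v)) / m
assert abs(lhs - rhs) < 1e-3
print(round(lhs, 3))
```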

Frequency tables
Let s be an infinite word with subword complexity f(n). A frequency table F on s is a table with columns (k_i; µ_i), i ∈ {1, . . ., t}, where a column (k_i; µ_i) means that exactly k_i allowable words of length n have frequency µ_i in s; here k_1 + · · · + k_t = f(n), and F describes the frequencies of allowable words of length n in s.
It can easily be seen that adding columns of the kind (0; µ) (zero columns) and permuting columns do not change the meaning of a frequency table. We do not distinguish tables which differ by the order of columns or by zero columns. The sum F + F′ of two frequency tables is defined as the table obtained from the junction of their columns by merging columns corresponding to the same frequency: instead of two columns (k_i; µ) and (k′_i; µ) we write the single column (k_i + k′_i; µ).

For a frequency table F and for p ≥ 0, r > 0, we define the table T_p^r(F) as the table obtained from F by replacing each column (k_i; µ_i) with the column (p k_i; µ_i / r).

Theorem 3 Let the frequency µ_w be defined for every word on Σ_1, and let ϕ be a uniform marked morphism circular on w. If n exceeds the structure ordering number of ϕ on w, ∆ ∈ {0, . . ., m}, and N = m(n − 1) + ∆ + 1, then

F_{ϕ(w)}(N) = T_{m−∆}^m(F_w(n)) + T_∆^m(F_w(n + 1)).   (8)

Proof. The statement of the theorem follows immediately from Theorem 1 and Formula (7): each word u′ ∈ A of length n with frequency µ′ in w is an ancestor of m − ∆ words of length N allowable in ϕ(w) with frequency (1/m)µ′, namely, the words Ψ_{i−1, m−i−∆}(u′), where i ∈ {1, . . ., m − ∆}.
Analogously, each word of length n + 1 allowable in w with frequency µ′′ is an ancestor of ∆ words of length N allowable in ϕ(w); each of them has frequency (1/m)µ′′. Since every word of length N allowable in ϕ(w) has an ancestor of length n or n + 1 allowable in w, the theorem is proved.

Example: frequency of factors in the Thue-Morse image of the Fibonacci word
Let us compute the set of frequencies of factors in the word ϕ_TM(w_F) (see Examples 2.3 and 4.1) using the frequencies of factors in the Fibonacci word w_F described by M. Dekking in [7].
In terms of frequency tables, the frequency of factors of length n in w_F described in [7] is given by the table

F_{w_F}(n) = ( (P_{l−1} − r; p_{l−2}) (P_{l−2} + r; p_{l−1}) (r; p_l) )

(with P_l and p_l as in [7]). This table means that P_{l−1} − r of the n + 1 allowable words of length n have frequency p_{l−2}, P_{l−2} + r words have frequency p_{l−1}, and r words have frequency p_l. Note that F_{w_F}(n + 1) looks analogous even if r is maximal (r = P_{l−1} − 1):

Now, let us apply Theorem 3 to find the frequencies of factors in the Thue-Morse image ϕ_TM(w_F) of the Fibonacci word. Let N = 2n − 1 + ∆ for ∆ ∈ {0, 1}; then N = 2P_l + r′ + 1, where r′ = 2r + ∆ ∈ {0, . . ., 2P_{l−1} − 1}. Equality (8) applied to N gives

F_{ϕ_TM(w_F)}(N) = T_{2−∆}^2(F_{w_F}(n)) + T_∆^2(F_{w_F}(n + 1)).

We have thus found the set of frequencies of factors of a given length in the Thue-Morse image of the Fibonacci word.

The recurrence function and its relatives
Let A_s(n) be the set of all words of length n which are allowable in a word s. The recurrence function R_s(n) of s can be defined as the least length satisfying the following condition: all the words of A_s(n) are allowable in every word u ∈ A_s(R_s(n)). In other terms, R_s(n) is the size of the smallest window which contains all the elements of A_s(n) whatever its position in s.
The recurrence function is a classical tool associated with infinite words. Recently, two more functions R′_s(n) and R′′_s(n), slightly different from R_s(n), have been defined. The function R′_s(n), introduced by J.-P. Allouche and M. Bousquet-Mélou in [2], is the length of the shortest prefix of s containing all the words of A_s(n); the function R′′_s(n), introduced by J. Cassaigne in [5], is the length of the shortest allowable word in which every word from A_s(n) is allowable. Clearly, R′′_s(n) ≤ R′_s(n) ≤ R_s(n).

The recurrence function is finite only if s is uniformly recurrent, i.e., if for every word u allowable in s the distance between two successive occurrences of u in s is bounded; if this distance is unbounded for some u of length n, then R_s(n) is defined to be +∞. The other two functions are defined everywhere. Unlike R′_s, the functions R_s and R′′_s depend in fact only on the set of words allowable in s.

Let the word w be infinite. Here we use Equality (2) and Theorem 1 to relate these functions of ϕ(w) to the corresponding functions of w; we assume here that ϕ is a uniform marked morphism circular on w.
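The recurrence function can be sketched directly on a finite prefix (only an approximation of R_s(n), since a prefix sees finitely many windows; the function name is ours):

```python
def recurrence(s, n, max_len=100):
    """Least R such that every length-R factor of the finite word s
    contains every length-n factor of s (a finite-prefix sketch of R_s(n))."""
    factors_n = {s[i:i + n] for i in range(len(s) - n + 1)}
    for R in range(n, max_len):
        if all(all(f in s[i:i + R] for f in factors_n)
               for i in range(len(s) - R + 1)):
            return R
    return None

w = "a"
for _ in range(15):
    w = "".join({"a": "ab", "b": "a"}[c] for c in w)   # Fibonacci prefix
print(recurrence(w, 1), recurrence(w, 2))              # 3 6
```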
Theorem 4 Let w be an infinite word, and let ϕ be a uniform marked morphism circular on w. Let n exceed the structure ordering number of ϕ on w. Then for every ∆ ∈ {1, . . ., m} and for N = m(n − 1) + ∆ + 1,

R_{ϕ(w)}(N) = m R_w(n + 1) − m + ∆.   (9)

Proof. Let R_w(n + 1) be finite. It can easily be seen that the recurrence function R(n) is directly connected with the maximal distance between two adjacent occurrences of a word of length n. The latter function is equal to R(n) − n + 1 and is non-decreasing.
Note that ∆ here is chosen to be larger than 0. Thus, it follows from Theorem 1 that every word of A of length n + 1 has a descendant in B of length N. Since ϕ is uniform, and due to circularity, if the distance between two occurrences of a word in w is l, then the distance between the corresponding occurrences of its descendants in ϕ(w) is ml. In terms of the recurrence function, this means that

R_{ϕ(w)}(N) − N + 1 = m (R_w(n + 1) − (n + 1) + 1).

Substituting into this equality the expression for N, we obtain the statement of the theorem.

Now let R_w(n + 1) be infinite. This means that there exist arbitrarily long words of A in which not all the words of A(n + 1) occur. But due to Theorem 1, their images cannot contain all the words of B(N); thus, R_{ϕ(w)}(N) is also infinite.
Theorem 5 Under the conditions of Theorem 4,

R′_{ϕ(w)}(N) = m R′_w(n + 1) − m + ∆.   (10)

Proof. Let u be the prefix of w of length R′_w(n + 1): u = w(R′_w(n + 1)). By the definition of the function R′_w, all the elements of A(n + 1) (and, consequently, of A(n)) are allowable in u, and the suffix v_1 of u of length n + 1 does not occur earlier in it. However, since the suffix v_0 of u of length n is the prefix of some allowable word of length n + 1, it occurs somewhere else in u.
Since N is larger than the synchronization delay of ϕ on w, every word allowable in ϕ(w) of length N is a descendant of some word of length n or n + 1.Since ∆ ≥ 1, it follows from Equality (3) that every word allowable in w of length n + 1 (and in particular v 1 ) is an ancestor of some word of length N allowable in ϕ(w).
It follows from Theorem 1 applied to u that ϕ(u) contains as factors all the elements of B(N); so, R′_{ϕ(w)}(N) ≤ m R′_w(n + 1). On the other hand, unlike the descendants of v_0, which occur somewhere earlier in ϕ(w), each of the descendants of v_1 (including Ψ_{m−1, m−∆}(v_1)) occurs in ϕ(u) only once. Thus, the last word from B(N) which has a unique occurrence in ϕ(u) is Ψ_{m−1, m−∆}(v_1), and so R′_{ϕ(w)}(N) = m R′_w(n + 1) − m + ∆.

As for the function R′′, to find the relationship between that of w and that of ϕ(w), we need some additional properties of w (or, more generally, of the language A) to be satisfied.
Namely, a language L ⊂ Σ^* is called prolongable if for every word u ∈ L there exist symbols a, b ∈ Σ such that au ∈ L and ub ∈ L.
A word u ∈ A is called (right) special if it can be extended to the right to an element of A by at least two different letters: ua, ub ∈ A for some a, b ∈ Σ_1, a ≠ b. Note that a sequence w is non-ultimately-periodic if and only if for each n there exists a special word of length n.
Theorem 6 Under the conditions of Theorem 4, let the language A be prolongable and let w be non-ultimately-periodic. Then

R′′_{ϕ(w)}(N) = m R′′_w(n + 1) − 2(m − ∆).   (11)

Proof. Let us call a word u ∈ A(R′′_w(n + 1)) trivial if it begins and ends with the same word s ∈ A(n) which does not occur anywhere else in u: u = su′ = u′′s, but u ≠ u_1 s u_2 for any u_1, u_2 ∈ Σ_1^+.

Lemma 3 There exists a non-trivial word in A(R′′_w(n + 1)) containing all the words of A(n + 1).

Proof. Let us choose a word u ∈ A(R′′_w(n + 1)) containing all the words of A(n + 1): by the definition of R′′_w, such a word exists. If it is non-trivial, then the lemma is proved; otherwise, u = su′ = u′′s for some s ∈ A(n). Since s does not occur anywhere in the middle of u, it can be extended to the right only by the first letter of u′ (which we denote by a). Thus, s is not special, and neither is u: the only word of uΣ_1 ∩ A is ua.
Let s = bs′ for some b ∈ Σ_1. Consider the word s′u′a ∈ A(R′′_w(n + 1)). It contains all the words of A(n + 1): all of them except sa are factors of s′u′, and sa is a suffix of s′u′a. If s′u′a is non-trivial, then the lemma is proved; otherwise, it is not special either, and we can repeat the procedure, obtaining from s′u′a a new word by erasing the first letter and appending the admissible letter to the right.
However, if we could repeat this procedure infinitely many times, it would mean the existence of an infinite suffix of w containing no special words of length R′′_w(n + 1). This is possible only if w is ultimately periodic, which contradicts the conditions of the theorem.
Since ∆ > 0, it follows from Theorem 1 that a word cannot contain all the elements of B(N) if its ancestor does not contain all the words of A(n + 1). So, we must look for the shortest words containing all the elements of B(N) among the descendants of words of A(R′′_w(n + 1)) containing all the elements of A(n + 1).
Let a word u_0 ∈ A(R′′_w(n + 1)) contain all the words of A(n + 1) and be non-trivial. By the definition of R′′_w, its prefix and suffix of length n + 1 do not occur anywhere else in it; but since A is a prolongable language, the prefix and the suffix of u_0 of length n occur somewhere else in it, and since u_0 is non-trivial, each of them occurs in the middle of u_0 (even if they are equal). So, to obtain the shortest descendant v_0 of u_0 containing all the words of B(N), we can erase from the left (and from the right) of ϕ(u_0) all the symbols which occur only in the descendants of length N of this prefix and this suffix of length n. These symbols are the first and the last m − ∆ symbols of ϕ(u_0); so, v_0 = Ψ_{m−∆, m−∆}(u_0).
On the other hand, if u_1 ∈ A(R′′_w(n + 1)) contains all the words of A(n + 1) and is trivial, we must be more cautious when erasing letters from the left and from the right of its ϕ-image, and the descendants of u_1 containing all the words of B(N) are longer than v_0. Thus,

R′′_{ϕ(w)}(N) = |v_0| = m R′′_w(n + 1) − 2(m − ∆).

Example: On words with ultimately grouped factors
In [5], J. Cassaigne introduced the notion of words with ultimately grouped factors: a word w is said to have ultimately grouped factors if R′′_w(n) = f_w(n) + n − 1 for all n ≥ n_0(w). He also proved that every quasi-Sturmian word (i.e., a word whose subword complexity is f_w(n) = n + c for some c and for all n ≥ n_1) has ultimately grouped factors.
Let w be a non-ultimately-periodic word with ultimately grouped factors; let the language of factors of w be prolongable, and let a uniform marked morphism ϕ be circular on w. If n exceeds both the structure ordering number of ϕ on w and n_0(w), and N = m(n − 1) + ∆ + 1 for ∆ ∈ {1, . . ., m}, then it follows from Theorem 6 and Formula (4) that

R′′_{ϕ(w)}(N) − (f_{ϕ(w)}(N) + N − 1) = (m − ∆)(f_w(n + 1) − f_w(n) − 1).   (12)

Thus, ϕ(w) has ultimately grouped factors if and only if f_w(n + 1) = f_w(n) + 1 for all large enough n; i.e., if and only if w is a quasi-Sturmian word.
7 Some generalizations

A generalization to factorial languages
It can easily be seen that the sets of allowable words of a given length in fact do not depend on the words w and ϕ(w) themselves, but rather on the sets A and B. The same can be said about the subword complexity, the recurrence function, and its relative R′′. But the initial language A can be defined not only as the set of words allowable in a word w. The only condition it must satisfy is, in fact, closure under taking factors: if a word u is an element of A, then all its factors are also elements of A. In other terms, A must be a factorial language on Σ_1.
For a factorial language A, we define its factorial image B ⊂ Σ_2^* as the least factorial language containing the set ϕ(A) = {ϕ(u) | u ∈ A}. The properties of A and B are similar to those of the sets of allowable words of w and ϕ(w), respectively.
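For a finite factorial language, the factorial image can be sketched as the factor-closure of ϕ(A); the names below are ours.

```python
def factor_closure(words):
    """Least factorial language containing the given finite set of (non-empty) words."""
    return {u[i:j] for u in words
            for i in range(len(u)) for j in range(i + 1, len(u) + 1)}

phi = {"a": "ab", "b": "ba"}                       # phi_TM
A = factor_closure({"aab", "aba"})                 # a small factorial language
B = factor_closure({"".join(phi[c] for c in u) for u in A})
assert "bba" in B                                  # a factor of phi(aba) = abbaab
# B is factorial: it contains every factor of each of its elements
assert all(u[i:j] in B for u in B
           for i in range(len(u)) for j in range(i + 1, len(u) + 1))
print(sorted(v for v in B if len(v) == 2))         # ['aa', 'ab', 'ba', 'bb']
```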
Replacing everywhere in Subsection 2.2 w by A and ϕ(w) by B, we obtain the necessary definitions concerning circularity on a factorial language A (see also Subsection 3.2 in [4]).
The subword complexity function f L (n) of a factorial language L is defined naturally as the number of its distinct elements of length n.
It can easily be seen that after replacing circularity on w by circularity on A, we can extend Theorem 1 to an arbitrary factorial language A. The same can be done with Theorem 2, which is a direct corollary of Theorem 1; in the relations between subword complexity values, f_w should be replaced by f_A and f_{ϕ(w)} by f_B.
As was mentioned in Section 6, the recurrence function R and its relative R′′ depend in fact not on the words w and ϕ(w) themselves, but on the sets of allowable words. So, similar functions can be defined on an arbitrary factorial language L: R_L(n) is the least length such that every element of L of length R_L(n) contains as factors all the elements of L of length n, and R′′_L(n) is the length of the shortest element of L which contains all the words from L of length n. On a factorial language, both these functions may be not everywhere defined; if the length required by the definition does not exist, the corresponding function value is taken to be infinite.
Let A be a factorial language and B its factorial image. After replacing the occurrences of w by A and the occurrences of ϕ(w) by B, Theorem 4 can be extended to a factorial language A and its factorial image B. To extend Theorem 6, we must also require A to be prolongable and to have infinitely many special elements (the latter condition serves instead of the non-periodicity of w).

A generalization to D0L words
Let the alphabets Σ_1 and Σ_2 be equal; i.e., let ϕ be a morphism ϕ : Σ^* → Σ^*. If the block ϕ(a) begins with a for some symbol a ∈ Σ, then ϕ admits a fixed point w which can be obtained as the limit w = lim_{i→∞} ϕ^i(a); if ϕ is non-erasing and ϕ(a) ≠ a, then w is an infinite word satisfying the equality w = ϕ(w). Fixed points of morphisms are also called D0L words. They have been studied in numerous papers and include many famous examples, among them the Fibonacci word (the fixed point of the morphism ϕ_F defined by ϕ_F(a) = ab, ϕ_F(b) = a) and the Thue-Morse word, which is a fixed point of the Thue-Morse morphism ϕ_TM (see Example 2.3).

In [13], the probability measure defined by the set of frequencies of factors in a D0L word has been studied, and a sufficient condition for the existence of such a measure was given.

A fixed point w of a morphism ϕ is called circular if ϕ is circular on w. An easy-to-check criterion of circularity of a fixed point of a uniform marked morphism was proved in [9].

If w is a circular fixed point of a uniform marked morphism ϕ, then the formulas obtained above can be applied to w repeatedly, until the length n considered is less than the structure ordering number of ϕ on w.

As follows from Lemma 3 of [11], a circular fixed point of a uniform morphism cannot be ultimately periodic. On the other hand, it is not difficult to show that the language of factors of a fixed point of a marked morphism is prolongable. That is why we do not need any additional conditions for Formula (12).
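A prefix of the fixed point w = lim ϕ^i(a) can be computed by iteration (a sketch; the loop terminates because ϕ(a) begins with a and is longer than a, so the prefixes stabilize and grow):

```python
def fixed_point_prefix(phi, a, length):
    """Prefix of w = lim phi^i(a); requires phi[a] to begin with a and |phi[a]| >= 2."""
    w = a
    while len(w) < length:
        w = "".join(phi[c] for c in w)
    return w[:length]

print(fixed_point_prefix({"a": "ab", "b": "a"}, "a", 13))   # abaababaabaab (Fibonacci word)
print(fixed_point_prefix({"a": "ab", "b": "ba"}, "a", 8))   # abbabaab (Thue-Morse word)
```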