Antisquares and Critical Exponents

The (bitwise) complement $\overline{x}$ of a binary word $x$ is obtained by changing each $0$ in $x$ to $1$ and vice versa. An $\textit{antisquare}$ is a nonempty word of the form $x\, \overline{x}$. In this paper, we study infinite binary words that do not contain arbitrarily large antisquares. For example, we show that the repetition threshold for the language of infinite binary words containing exactly two distinct antisquares is $(5+\sqrt{5})/2$. We also study repetition thresholds for related classes, where "two" in the previous sentence is replaced by a larger number. We say a binary word is $\textit{good}$ if the only antisquares it contains are $01$ and $10$. We characterize the minimal antisquares, that is, those words that are antisquares but all of whose proper factors are good. We determine the growth rate of the number of good words of length $n$ and determine the repetition threshold between polynomial and exponential growth for the number of good words.


Introduction
Let x be a finite nonempty binary word. We say that x is an antisquare if there exists a word y such that $x = y\,\overline{y}$, where the overline denotes the morphism that maps 0 → 1 and 1 → 0. For example, 011100 is an antisquare. The order of an antisquare $y\,\overline{y}$ is defined to be |y|, where |y| denotes the length of y.
Avoidance of antisquares has been studied previously in combinatorics on words. For example, Mousavi et al. (2016) proved that the infinite Fibonacci word, the fixed point of the morphism 0 → 01, 1 → 0, has exactly four antisquare factors, namely 01, 10, 1001, and 10100101. More generally, all the antisquares in Sturmian words have recently been characterized in Hieronymi et al. (2022). On the other hand, Ng et al. (2019) classified those infinite binary words containing the minimum possible numbers of distinct squares and antisquares.
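The definition of an antisquare and the Fibonacci example above can be checked on a finite prefix. The following is a minimal sketch in Python; the helper names (`complement`, `is_antisquare`, `antisquare_factors`, `fibonacci_prefix`) are illustrative choices and do not come from the paper.

```python
def complement(x: str) -> str:
    """Bitwise complement of a binary word given as a string of '0'/'1'."""
    return x.translate(str.maketrans("01", "10"))


def is_antisquare(x: str) -> bool:
    """True if x is nonempty and x = y + complement(y) for some word y."""
    n = len(x)
    if n == 0 or n % 2:
        return False
    half = n // 2
    return x[half:] == complement(x[:half])


def antisquare_factors(w: str) -> set[str]:
    """All distinct antisquares occurring as factors of the finite word w."""
    found = set()
    for i in range(len(w)):
        for j in range(i + 2, len(w) + 1, 2):  # antisquares have even length
            if is_antisquare(w[i:j]):
                found.add(w[i:j])
    return found


def fibonacci_prefix(length: int) -> str:
    """Prefix of the fixed point of the morphism 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < length:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:length]


if __name__ == "__main__":
    print(is_antisquare("011100"))
    print(sorted(antisquare_factors(fibonacci_prefix(200)), key=len))
```

A finite prefix can only confirm that the four listed antisquares occur and that no others appear up to the prefix length; the statement for the whole infinite word is the cited theorem.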
It is easy to see that no infinite binary word, except the trivial families given by $(0 + \epsilon)1^\omega$ and $(1 + \epsilon)0^\omega$, can contain at most one distinct antisquare. (Here the notation $x^\omega$ refers to the right-infinite word $xxx\cdots$.) However, once we move to two distinct antisquares, the situation is quite different. We have the following:
Proposition 1. There are exponentially many finite binary words of length n having at most two distinct antisquares, and there are uncountably many infinite binary words with the same property.
Proof: It is easy to see that every binary word in $\{1000, 10000\}^*$ contains no antisquares other than 01 and 10, which proves the first claim.
For the second, consider the uncountable set of infinite words $\{1000, 10000\}^\omega$. (Here, by $S^\omega$ for a set S of nonempty finite words, we mean the set of all infinite words arising from concatenations of elements of S.)
Furthermore, it is easy to see that if an infinite binary word, other than $001^\omega$ and its complement, contains exactly two antisquares, then these antisquares must be 01 and 10. Call a binary word good if it contains no antisquare factors, except possibly 01 and 10. This suggests studying the following problem.
Problem 2. Find the repetition threshold for good words.
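Returning to Proposition 1 and the definition of good words, the first claim can be sanity-checked by brute force for a small number of blocks. This is only a finite check under illustrative names (`is_good`, `block_words`), not a replacement for the proof.

```python
from itertools import product


def complement(x: str) -> str:
    """Bitwise complement of a binary word."""
    return x.translate(str.maketrans("01", "10"))


def is_good(w: str) -> bool:
    """True if w contains no antisquare factor other than 01 and 10."""
    n = len(w)
    for half in range(2, n // 2 + 1):          # antisquares of order >= 2
        for i in range(n - 2 * half + 1):
            if w[i + half:i + 2 * half] == complement(w[i:i + half]):
                return False
    return True


def block_words(k: int) -> list[str]:
    """All concatenations of exactly k blocks chosen from {1000, 10000}."""
    return ["".join(blocks) for blocks in product(("1000", "10000"), repeat=k)]


if __name__ == "__main__":
    words = block_words(6)
    # Every such word is good, and the 2**k block choices give distinct words.
    print(all(is_good(w) for w in words), len(set(words)))
```

The number of distinct words doubles with each extra block, which is the source of the exponential growth in the proposition.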
The repetition threshold for a class of (finite or infinite) words is defined as follows. First, we say that a finite word w = w[1..n] has period p ≥ 1 if w[i] = w[i + p] for 1 ≤ i ≤ n − p. The smallest period of a word w is called the period, and we write it as per(w). The exponent of a finite word w, written exp(w), is defined to be |w|/per(w). We say a word (finite or infinite) is α-free if the exponents of its nonempty factors are all < α. We say a word is $\alpha^+$-free if the exponents of its nonempty factors are all ≤ α. The critical exponent of a finite or infinite word x is the supremum, over all nonempty finite factors w of x, of exp(w); it is written ce(x). Finally, the repetition threshold for a language L of infinite words is defined to be the infimum, over all x ∈ L, of ce(x).
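These definitions translate directly into code for finite words. The sketch below uses exact rational arithmetic; the function names are illustrative, and `max_exponent` is a naive cubic-time computation of the critical exponent of a finite word, not an optimized algorithm.

```python
from fractions import Fraction


def period(w: str) -> int:
    """Smallest p >= 1 with w[i] == w[i + p] for all valid i."""
    n = len(w)
    for p in range(1, n + 1):
        if all(w[i] == w[i + p] for i in range(n - p)):
            return p
    raise ValueError("the empty word has no period")


def exponent(w: str) -> Fraction:
    """exp(w) = |w| / per(w)."""
    return Fraction(len(w), period(w))


def max_exponent(w: str) -> Fraction:
    """Critical exponent of a nonempty finite word: max of exp over its factors."""
    n = len(w)
    best = Fraction(1)
    for i in range(n):
        for p in range(1, n - i + 1):
            # Longest factor starting at position i that has p as a period.
            k = 0
            while i + p + k < n and w[i + k] == w[i + p + k]:
                k += 1
            best = max(best, Fraction(p + k, p))
    return best


def thue_morse_prefix(length: int) -> str:
    """Prefix of the fixed point of 0 -> 01, 1 -> 10 starting with 0."""
    w = "0"
    while len(w) < length:
        w = "".join("01" if c == "0" else "10" for c in w)
    return w[:length]


if __name__ == "__main__":
    print(exponent("101110111011101"))          # period 4, length 15
    print(max_exponent(thue_morse_prefix(64)))
```

For an infinite word, a finite prefix only gives a lower bound on ce(x); the supremum over all factors need not be reached on any prefix.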
The critical exponent of a word can be either rational or irrational. If it is rational, then it can either be attained by a particular finite factor, or not attained. For example, the critical exponent of both
• the Thue-Morse word t = 0110100110010110100101100⋯, fixed point of the morphism 0 → 01, 1 → 10; and
• the variant Thue-Morse word vtm = 2102012101202102012021012⋯, fixed point of the morphism 2 → 210, 1 → 20, 0 → 1
is 2, but it is attained in the former case and not attained in the latter. If the critical exponent α is attained, we typically write it as $\alpha^+$.
Dejean (1972) wrote a classic paper on combinatorics on words, where she determined the repetition threshold for the language of all infinite words over {0, 1, 2} (it is $(7/4)^+$) and conjectured the value of the repetition threshold for the languages $\Sigma_k^*$ for k ≥ 4, where $\Sigma_k$ = {0, 1, . . ., k − 1}. Dejean's conjecture was only completely resolved in 2011, in Rao (2011) and Currie and Rampersad (2011), independently.
For variations on and generalizations of repetition threshold, see Ilie et al. (2005); Badkobeh and Crochemore (2011); Fiorenzi et al. (2011); Samsonov and Shur (2012); Mousavi and Shallit (2013). The goal of this paper is to study the repetition threshold for two classes of infinite words:
• $AO_\ell$, the binary words avoiding all antisquares of order ≥ ℓ;
• $AN_n$, the binary words with no more than n antisquares.
It turns out that there is an interesting and subtle hierarchy, depending on the values of ℓ and n.
Our work is very similar in flavor to that of Shallit (2004), which found a similar hierarchy concerning critical exponents and sizes of squares avoided. As we will see, however, the hierarchy for antisquares is significantly more complex.
In this paper, in Sections 2 and 3, we solve Problem 2, and show that the repetition threshold for good words is 2 + α, where $\alpha = (1 + \sqrt{5})/2$ is the golden ratio. Proving that the repetition threshold for a class of infinite words equals some real number β generally consists of two parts. First, one gives an explicit construction of a word avoiding $\beta^+$-powers. This is often carried out by finding an appropriate morphism h whose infinite fixed point x (or an image of x under a second morphism) has the desired property. Second, if β is rational, then one can prove there is no infinite word avoiding β-powers by a breadth-first or depth-first search of the infinite tree of all words. If β is irrational, however, one must generally be more clever.
In Section 4 we determine the repetition threshold for binary words avoiding all antisquares of order ≥ ℓ, and in Section 5 we determine the repetition threshold for binary words with no more than n antisquares. In Section 6 we completely characterize the minimal antisquares; i.e., the binary words that are antisquares but have the property that all proper factors are good. This characterization is then used in Section 7, where we determine the growth rate of the number of good words of length n. In that section we also show that the repetition threshold between polynomial and exponential growth for good words avoiding α-powers is α = 15/4; i.e., there are exponentially many such words avoiding $(15/4)^+$-powers, but only polynomially many that avoid 15/4-powers.

A good infinite word with critical exponent 2 + α
Consider the morphisms φ and g. Let us write $\varphi^\omega(0)$ for the (unique) infinite fixed point of φ that starts with 0. We claim that the infinite word $w = g(\varphi^\omega(0))$ does not have antisquares other than 01 and 10, and has critical exponent 2 + α, where $\alpha = (1 + \sqrt{5})/2$. The infinite word w is Fibonacci-automatic, in the sense of Mousavi et al. (2016), so we can apply the Walnut theorem-prover (Mousavi, 2016) to establish this claim. For more about Walnut, see Shallit (2022).
We start with the Fibonacci automaton for $\varphi^\omega(0)$, as displayed in Figure 1. Let us name the above automaton FF.txt, and store it in the Word Automata Library of Walnut. We can verify the correctness of this automaton as follows. First, we claim that $\varphi^\omega(0) = 0\mathbf{f}$, where $\mathbf{f}$ is the infinite Fibonacci word. To see this, let $f$ denote the morphism that maps 0 → 010, 1 → 01; i.e., the morphism $f$ is the square of the Fibonacci morphism. One can easily verify the identities $\varphi^n(0) = 0 f^n(0) 0^{-1}$ and $\varphi^n(01) = 0 f^n(10) 0^{-1}$ by simultaneous induction, whence follows the claim. We can then use Walnut to verify the correctness of the automaton FF; the verification command returns TRUE.
Now we can use Walnut to create a Fibonacci automaton for w.
Theorem 3. The word w does not contain antisquares other than 01 and 10, and has critical exponent 2 + α.
Proof: We write a Walnut formula asserting that there exists an antisquare of order ≥ 2. This returns FALSE, so there are no antisquares of order ≥ 2.
We now compute the periods that are associated with factors that have exponent ≥ 3.
The predicate above produces the automaton in Figure 3, which shows that these periods are of the form 10010* in Fibonacci representation.
Next, we compute the pairs (n, p) such that w contains a factor of length n + p with period p of the form 10010* and n + p is the longest length of any factor with this period.
reg pows msd_fib "0*10010*";
def maximalreps "?msd_fib Ei (Aj (j<n
The automaton produced by the predicate highestpow is given in Figure 4. The strings accepted by this automaton, omitting the leading [0, 0], are as follows: • These correspond, respectively, to the values where we have used the well-known Fibonacci identities Now the exponent of these finite factors is (n + p)/p, which is for j ≥ 6. These quotients tend to 2 + α from below, and hence the critical exponent is 2 + α.
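Independently of Walnut, Theorem 3 can be sanity-checked on a finite prefix. The definitions of φ and g are not displayed in this text, so the sketch below rests on two loudly stated assumptions: φ: 0 → 001, 1 → 01 (the choice forced by the identities $\varphi^n(0) = 0 f^n(0) 0^{-1}$ and $\varphi^n(01) = 0 f^n(10) 0^{-1}$ with n = 1), and g: 0 → 01, 1 → 11 (suggested by the factorization into blocks 01 and 11 used in Section 3). Treat both as hypothetical reconstructions.

```python
from fractions import Fraction

# ASSUMPTION: these two morphisms are reconstructed, not quoted from the paper.
PHI = {"0": "001", "1": "01"}
G = {"0": "01", "1": "11"}


def apply_morphism(m: dict, w: str) -> str:
    """Image of the word w under the morphism m."""
    return "".join(m[c] for c in w)


def phi_fixed_point_prefix(length: int) -> str:
    """Prefix of the fixed point of PHI starting with 0."""
    w = "0"
    while len(w) < length:
        w = apply_morphism(PHI, w)
    return w[:length]


def complement(x: str) -> str:
    return x.translate(str.maketrans("01", "10"))


def large_antisquares(w: str) -> set:
    """Antisquare factors of w of order >= 2 (everything except 01 and 10)."""
    n = len(w)
    return {
        w[i:i + 2 * h]
        for h in range(2, n // 2 + 1)
        for i in range(n - 2 * h + 1)
        if w[i + h:i + 2 * h] == complement(w[i:i + h])
    }


def max_exponent(w: str) -> Fraction:
    """Largest exponent of a factor of the nonempty finite word w."""
    n = len(w)
    best = Fraction(1)
    for i in range(n):
        for p in range(1, n - i + 1):
            k = 0
            while i + p + k < n and w[i + k] == w[i + p + k]:
                k += 1
            best = max(best, Fraction(p + k, p))
    return best


if __name__ == "__main__":
    w = apply_morphism(G, phi_fixed_point_prefix(100))  # 200 symbols of w
    golden = (1 + 5 ** 0.5) / 2
    print(large_antisquares(w))               # expected to be empty
    print(float(max_exponent(w)), 2 + golden)
```

On a prefix the largest exponent stays strictly below 2 + α, consistent with the quotients approaching the critical exponent from below.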

Optimality of the previous construction
In this section we show that the critical exponent of the word constructed in Section 2 is best possible; i.e., that every infinite good word has critical exponent at least 2 + α. It is somewhat easier to work with bi-infinite words, so we begin with results concerning bi-infinite words and then explain at the end of the section how to obtain the desired result for right-infinite words. If S is a set of nonempty finite words, then by ${}^\omega S^\omega$ we mean the set of bi-infinite words made up of concatenations of the elements of S.
Theorem 4. Every bi-infinite binary word avoiding 4-powers and {11, 000, 10101} has the same set of factors as f.
Proof: Notice that F is closed under bitwise complement and reversal. Let w be a bi-infinite binary word avoiding 4-powers and F. Suppose that w contains 001011.
• Since 1001 ∈ F and 0110 ∈ F, the word w contains 00010111.
This is a contradiction since 1000101110 ∈ F. So w avoids 001011. By considering the complement, the word w also avoids 110100. Now suppose that w contains 110111011.
• Since 1111 is a 4-power, the word w contains 0111011101110.
• Since 0011 ∈ F and 1100 ∈ F, the word w contains 101110111011101.
Notice that F′ is closed under complement and reversal. By symmetry, we now suppose that w contains 11. Notice that 111, 11011, and 1101011 are the only possible factors of w that start with 11, end with two identical letters, and contain two identical letters only as a prefix and a suffix. In particular, the word w avoids 00. Moreover, the blocks of consecutive 1's have length 1 or 3. So $w \in {}^\omega\{01, 11\}^\omega$. Thus w = g(v) for some bi-infinite binary word v.
So v also avoids {11, 000, 10101}. By Theorem 4, the word v has the same set of factors as f. That is, the word w has the same set of factors as g(f).
Theorem 6. Every good bi-infinite binary word has critical exponent at least 2 + α.
Proof: Suppose that w is a good bi-infinite binary word; that is, it contains no antisquares except possibly 01 and 10. Also assume w has critical exponent smaller than 2 + α. Consider the set F from Lemma 5 and notice that F \ {101110111011101, 010001000100010} contains only antisquares. So w avoids 4-powers and F \ {101110111011101, 010001000100010}. Moreover, w avoids $101110111011101 = (1011)^{15/4}$ since 15/4 > 2 + α. By symmetry, w also avoids 010001000100010. So w avoids 4-powers and F. By Lemma 5, w has the same set of factors as either g(f) or $\overline{g(f)}$. So w has critical exponent 2 + α, contradicting our assumption.
Proof: Let w be an infinite good binary word, and let RecFac(w) denote the set of its recurrent factors. That is, the set RecFac(w) consists of the factors of w that appear infinitely often in w. Then for any y ∈ RecFac(w), we see that y has arbitrarily large two-sided extensions in RecFac(w). By a 'two-sided' analogue of König's infinity lemma, there exists a bi-infinite word w′ such that every factor of w′ is an element of RecFac(w). By Theorem 6, the bi-infinite word w′ has critical exponent at least 2 + α, and thus so does the infinite word w.

The class $AO_\ell$
Instead of avoiding all antisquares of order greater than one, we could consider avoiding arbitrarily large antisquares.
Proof: By a result of Karhumäki and Shallit (2004), we know that every infinite binary word avoiding 7/3-powers can be written in the form , where $x_i$ ∈ {ϵ, 0, 1, 00, 11} and µ is the Thue-Morse morphism, defined by µ : 0 → 01, 1 → 10. It follows that every such word must contain arbitrarily large factors of the form $\mu^n(0)$. But every word $\mu^n(0)$ for n ≥ 1 is an antisquare.
On the other hand, we can prove the following result on the class $AO_\ell$ of binary words avoiding antisquares of order ≥ ℓ:
Theorem 9. There exists an infinite $\beta^+$-free binary word containing no antisquare of order ≥ ℓ for the following pairs (ℓ, β): These are all optimal.
Proof: Item (a) was already proved in Section 2. For each of the remaining pairs (ℓ, β), we apply a morphism $\xi_\ell$ to any ternary squarefree infinite word w and check that it has the desired properties. The morphisms are given in Table 1. The columns are ℓ, where the word contains no antisquares of order ≥ ℓ; β, where the word avoids $\beta^+$-powers; s, the size of the uniform morphism; and the morphism.
To verify the $\beta^+$-freeness of $\xi_\ell(w)$ we use (Mol et al., 2020, Lemma 23). In order to do so, we check that $\xi_\ell$ is synchronizing and that $\xi_\ell(u)$ is $\beta^+$-free for every squarefree ternary word u of length t, where t is specified by (Mol et al., 2020, Lemma 23). In order to show that $\xi_\ell(w)$ contains no antisquares of order ≥ ℓ, we find a length m such that if v and its complement are both factors of $\xi_\ell(w)$, then |v| ≤ m. We can then check that $\xi_\ell(w)$ contains no antisquares of order ≥ ℓ by exhaustively checking all factors of length at most 2m. The parameters t and m for each $\xi_\ell$ are given in Table 2.
The optimality of item (a) was already proved in Section 3. The optimality of the remaining items can be established by depth-first search. For each pair (ℓ, β), a longest word containing no antisquares of order ≥ ℓ, but avoiding β-powers (instead of $\beta^+$), is given in Table 3. The columns give ℓ, β, the length L of a longest such word, and a longest such word.
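The depth-first search used for the optimality claims can be sketched as a standard backtracking procedure: extend a word one letter at a time and prune as soon as the new suffix is a β-power or an antisquare of order ≥ ℓ. The function names and the `max_len` safety cap are illustrative assumptions; the paper's own search code and the table entries are not reproduced here, so the usage below only exercises two small cases that can be verified by hand.

```python
from fractions import Fraction


def complement(x: str) -> str:
    return x.translate(str.maketrans("01", "10"))


def suffix_ok(w: str, beta: Fraction, ell: int) -> bool:
    """Check only factors ending at the last letter of w (earlier ones were checked)."""
    n = len(w)
    for p in range(1, n):
        # k = number of positions by which the suffix with period p extends leftwards.
        k = 0
        while n - 1 - k - p >= 0 and w[n - 1 - k] == w[n - 1 - k - p]:
            k += 1
        if Fraction(p + k, p) >= beta:       # suffix of exponent >= beta
            return False
    for m in range(ell, n // 2 + 1):         # suffix antisquare of order m >= ell
        if w[n - m:] == complement(w[n - 2 * m:n - m]):
            return False
    return True


def longest_word(beta: Fraction, ell: int, max_len: int = 200):
    """Return a longest binary word with all exponents < beta and no antisquare
    of order >= ell, or None if the search reaches max_len (inconclusive)."""
    best = ""
    stack = ["0", "1"]
    while stack:
        w = stack.pop()
        if not suffix_ok(w, beta, ell):
            continue
        if len(w) >= max_len:
            return None
        if len(w) > len(best):
            best = w
        stack.extend((w + "0", w + "1"))
    return best


if __name__ == "__main__":
    # Squarefree binary words (no antisquare constraint in effect): longest has length 3.
    print(longest_word(Fraction(2), ell=100, max_len=50))
    # No antisquares at all and no cubes: only 00 and 11 survive at length 2.
    print(longest_word(Fraction(3), ell=1, max_len=50))
```

When an infinite word with the required properties exists, the search never exhausts the tree, which is why the sketch returns `None` once the cap is reached instead of looping forever.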

The class $AN_n$
In this section, we consider the class AN n of binary words containing no more than n distinct antisquares as factors.
Theorem 10. There exists an infinite $\beta^+$-free binary word containing no more than n antisquares for the following pairs (n, β): These are all optimal.
Proof: Item (a) was already proved in Section 2. For the remaining cases, the proof is similar to that of Theorem 9: for each pair (n, β), we apply a morphism $\zeta_n$ to any ternary squarefree infinite word w and check that it has the desired properties. The morphisms are given in Table 4. The columns are n, the largest number of allowed antisquares; β, where the word avoids $\beta^+$-powers; s, the size of the uniform morphism; and the morphism.
Tab. 4: Morphisms generating words in $AN_n$.
To verify the $\beta^+$-freeness of $\zeta_n(w)$ we use (Mol et al., 2020, Lemma 23). In order to do so, we check that $\zeta_n$ is synchronizing and that $\zeta_n(u)$ is $\beta^+$-free for every squarefree ternary word u of length t, where t is specified by (Mol et al., 2020, Lemma 23). In order to show that $\zeta_n(w)$ contains at most n distinct antisquares, we find a length m such that if v and its complement are both factors of $\zeta_n(w)$, then |v| ≤ m. We can then enumerate the antisquares appearing in $\zeta_n(w)$ and check that there are at most n of them. The parameters t and ℓ for each $\zeta_n$ are given in Table 5.
The optimality of item (a) was already proved in Section 3. The optimality of the remaining items can be established by depth-first search. For each pair (n, β), a longest word containing at most n antisquares, but avoiding β-powers (instead of $\beta^+$), is given in Table 6. The columns give n, β, the length L of a longest such word, and a longest such word.

Minimal antisquares
Consider the language L of all finite good words. In this section we determine the minimal antisquares or minimal forbidden factors for L (Mignosi et al., 2002). These are the words w such that w is an antisquare, but w properly contains no antisquare factors, except possibly 01 and 10. This characterization will be useful for counting the length-n words in L.
We start with some basic results about antisquares.
Lemma 12. If x is an antisquare, so is every conjugate of x.
Proof: Write x as $a y\, \overline{a}\, \overline{y}$ for a single letter a and some (possibly empty) word y. Then a cyclic shift by one symbol gives $y\, \overline{a}\, \overline{y}\, a = (y\overline{a})\,\overline{(y\overline{a})}$, which is clearly an antisquare. Repeating this argument |x| times gives the result.
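Lemma 12 is easy to confirm exhaustively for short words. The sketch below checks every conjugate of every antisquare of order at most 6; the helper names are illustrative.

```python
from itertools import product


def complement(x: str) -> str:
    return x.translate(str.maketrans("01", "10"))


def is_antisquare(x: str) -> bool:
    """True if x is nonempty and its second half is the complement of its first."""
    n = len(x)
    return n > 0 and n % 2 == 0 and x[n // 2:] == complement(x[:n // 2])


def conjugates(x: str) -> list[str]:
    """All cyclic shifts of x."""
    return [x[i:] + x[:i] for i in range(len(x))]


def lemma12_holds_up_to(order: int) -> bool:
    """Check that every conjugate of every antisquare of order <= `order` is an antisquare."""
    for m in range(1, order + 1):
        for letters in product("01", repeat=m):
            y = "".join(letters)
            if not all(is_antisquare(c) for c in conjugates(y + complement(y))):
                return False
    return True


if __name__ == "__main__":
    print(conjugates("011100"))
    print(lemma12_holds_up_to(6))
```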
Lemma 13. A word w is a minimal antisquare if and only if all conjugates of w are minimal antisquares.
Proof: Let $w = u\overline{u}$. Let us prove the forward direction first. Assume, contrary to what we want to prove, that w is a minimal antisquare, but some rotation of w contains a shorter antisquare, other than 01 and 10.
Let rs be an antisquare of minimum length with |rs| > 2 that is not a factor of w, but is a factor of some rotation of w. Then we can write w = sxr with r, s, x ≠ ϵ. There are three cases to consider:
|r| ≥ |u|: Let t be defined by $r = t\overline{u}$. Then $w = u\overline{u} = sxt\,\overline{s}\,\overline{x}\,\overline{t}$, where $rs = t\,\overline{s}\,\overline{x}\,\overline{t}\,s$ is an antisquare. If t = ϵ, then $rs = \overline{s}\,\overline{x}\,s$; hence $w = sx\,\overline{s}\,\overline{x}$ contains the antisquare $sx\overline{s}$. This is a contradiction, since we assumed w does not have a shorter antisquare, but $sx\overline{s}$ is an antisquare in w with s, x ≠ ϵ. Therefore t ≠ ϵ.
Since w and rs are both antisquares, they have even length. Therefore |x| is even. Consider the factorization $x = x_1 x_2$ with $|x_1| = |x_2|$. Then $rs = t\,\overline{s}\,\overline{x}\,\overline{t}\,s = t\,\overline{s}\,\overline{x_1}\,\overline{x_2}\,\overline{t}\,s$ is an antisquare, where $|t\,\overline{s}\,\overline{x_1}| = |\overline{x_2}\,\overline{t}\,s|$. Therefore, $t\,\overline{s}\,\overline{x_1}$ must end with $t\overline{s}$ and $\overline{x_2}\,\overline{t}\,s$ must begin with $\overline{t}s$, giving the antisquare $t\,\overline{s}\,\overline{t}\,s$, which is not 01 or 10 (since t, s ≠ ϵ), and is shorter than rs. This contradicts our assumption that rs is the smallest such antisquare.
|s| ≥ |u|: Let t be defined by s = ut. Then $w = u\overline{u} = \overline{t}\,\overline{x}\,\overline{r}\,txr$. Now $rs = r\,\overline{t}\,\overline{x}\,\overline{r}\,t$ is an antisquare, and hence the complement $\overline{rs} = \overline{r}\,txr\,\overline{t}$ is also an antisquare. Let $r' = \overline{r}\,txr$ and $s' = \overline{t}$. So we can write $w = \overline{t}\,\overline{x}\,\overline{r}\,txr = s'\,\overline{x}\,r'$ where r′s′ is an antisquare of the same length as rs, and |r′| > |u|. This reduces the problem to the previous case.
and the remaining cases correspond to certain conjugates of $0^n 1 0 1^n 0 1$ for n ≥ 3, already listed in the statement of the theorem.
It now remains to handle the case of more than 7 runs. By Lemma 14, x has at least 10 runs. This involves a rather tedious examination of cases, based on the following three simple observations:
Proof: If condition (a) is not satisfied, then z consists of isolated occurrences of numbers ≥ 2, separated by blocks of consecutive 1's. We assume this in what follows.
If z both begins and ends with a number ≥ 2, then zz satisfies (a). Thus we may assume that z either begins or ends with 1 (or both).
Suppose z contains the block 1111. Then zz contains the block b1111c for b, c ≥ 1, and hence satisfies (b). Thus we may assume that the maximal 1-blocks in z are of length 1, 2, or 3.
Suppose all the maximal 1-blocks of z are of length 1 or 3. If z begins with 1 and ends with b ≥ 2, then z cannot be of odd length, and similarly if z ends with 1 and begins with b ≥ 2. So z must begin and end with 1.
Proposition 18. If a bi-infinite good word w contains the factor 001011, then $w = {}^\omega 0\, 1\, 0\, 1^\omega$.
Proof: Consider a maximal factor of w of the form $0^k 1 0 1^k$, with 2 ≤ k < ∞. By maximality and by symmetry, we can assume that w contains $0^k 1 0 1^k 0$. This factor is not extendable to the right:
• $0^k 1 0 1^k 0 0$ contains the antisquare 1100 as a suffix.
By Proposition 18, there are not enough good words containing 001011 to contribute to the growth rate of good words.This also holds for good words containing the symmetric factor 110100.Thus, good words and binary words avoiding F have the same growth rate.
To enumerate binary words avoiding F, we instead enumerate the 'Pansiot codes' of these words. If $x = x_1 x_2 \cdots x_n$ is a binary word, then the Pansiot code of x is the binary word $p_1 p_2 \cdots p_{n-1}$ such that for i = 1, . . ., n − 1, we have $x_{i+1} = x_i$ if $p_i = 0$, and $x_{i+1} = \overline{x_i}$ if $p_i = 1$.
For example, the binary word 010 is the Pansiot code for the two binary words 0011 and 1100.
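The encoding and its two-to-one inverse can be written in a few lines. This is a minimal sketch with illustrative function names; decoding needs the first letter of the word, which is why each code corresponds to exactly two binary words.

```python
def pansiot_code(x: str) -> str:
    """p_i = 0 if x_{i+1} == x_i, and 1 if x_{i+1} is the complement of x_i."""
    return "".join("0" if a == b else "1" for a, b in zip(x, x[1:]))


def decode(first_letter: str, code: str) -> str:
    """Rebuild the binary word with the given first letter and Pansiot code."""
    flip = {"0": "1", "1": "0"}
    word = [first_letter]
    for p in code:
        word.append(word[-1] if p == "0" else flip[word[-1]])
    return "".join(word)


if __name__ == "__main__":
    print(pansiot_code("0011"), pansiot_code("1100"))
    print(decode("0", "010"), decode("1", "010"))
```

Complementing a word leaves its code unchanged, so a factor and its complement are forbidden or allowed together in the encoded language.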
The Pansiot codes of binary words avoiding F are the binary words avoiding {010, 101, 11111, 01110}. These words consist of blocks of 0's of length at least 2 and blocks of 1's of length 2 or 4. Consider the number $C_n$ of such words ending with 00. They are obtained from shorter words by adding a suffix 0, 1100, or 111100. From the relation $C_n = C_{n-1} + C_{n-4} + C_{n-6}$, the growth rate is the positive real root of $X^6 = X^5 + X^2 + 1$. Since $X^6 - X^5 - X^2 - 1 = (X + 1)(X^2 - X + 1)(X^3 - X^2 - 1)$, this is the root ψ of $X^3 = X^2 + 1$.
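The factorization and the resulting growth rate can be checked numerically. In the sketch below, the initial values fed to the recurrence are arbitrary positive placeholders (the true initial conditions for $C_n$ are not given in the text); any positive start yields the same limiting ratio.

```python
def poly_mul(a: list, b: list) -> list:
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def real_root_of_cubic(tol: float = 1e-14) -> float:
    """Bisection for the root of X^3 - X^2 - 1 in [1, 2]."""
    f = lambda x: x ** 3 - x ** 2 - 1
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def recurrence_ratio(steps: int = 200) -> float:
    """Ratio of consecutive terms of C_n = C_{n-1} + C_{n-4} + C_{n-6}.
    ASSUMPTION: the six starting values are placeholders, not the paper's."""
    c = [1] * 6
    for _ in range(steps):
        c.append(c[-1] + c[-4] + c[-6])
    return c[-1] / c[-2]


if __name__ == "__main__":
    # (X + 1)(X^2 - X + 1)(X^3 - X^2 - 1), coefficients lowest degree first.
    product = poly_mul(poly_mul([1, 1], [1, -1, 1]), [-1, 0, -1, 1])
    print(product)                      # coefficients of X^6 - X^5 - X^2 - 1
    print(real_root_of_cubic(), recurrence_ratio())
```

The other roots of the degree-6 polynomial have modulus at most 1, so they do not affect the limiting ratio.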
Next we show that the threshold exponent at which the number of good words becomes exponential is 15/4. (For overlap-free words, the threshold is 7/3; see Karhumäki and Shallit (2004).)
Theorem 20. Let w be any squarefree word over the alphabet {0, 1, 2}. Apply the map h to w. The resulting word is good, and has exponent at most 15/4, and it is exactly 15/4 if |w| ≥ 5.

Fig. 4: Automaton for pairs (n, p) associated with highest powers in w.
Tab. 3: Longest words avoiding β-powers and containing no antisquares of order ≥ ℓ.
(a) if r(x) contains two consecutive terms, both ≥ 2, then x contains the shorter antisquare 0011 or 1100;
(b) if r(x) contains six consecutive terms a1bc1d with a ≥ c and b ≤ d, then x contains the shorter antisquare $0^c 1 0^b 1^c 0 1^b$ or its complement;
(c) if r(x) contains an interior occurrence of 2, then x contains the antisquare 0110 or 1001.
Suppose $x = u\overline{u}$. If z = r(u) is of odd length, then zz = r(x). If z = r(u) is of even length, then writing z = ayb with a, b single numbers, we have r(x) = ay(a + b)yb. When we speak of a maximal 1-block in what follows, we mean one that cannot be extended by additional 1's to the left or right. It now suffices to prove the following two lemmas:
Lemma 16. Let $z \in \mathbb{N}^*$, and suppose |z| ≥ 5 is odd. Then zz contains either (a) two consecutive terms that are ≥ 2, or (b) six consecutive terms a1bc1d with a ≥ c and b ≤ d.
Since |z| ≥ 5, we know z has a prefix of the form 1c1d and a suffix of the form a1b1, where a, b, c, d ≥ 1. Hence zz contains the block a1b11c1d. If b ≤ c, then the block a1b11c fulfills condition (b); if b ≥ c, then the block b11c1d fulfills condition (b). Thus there must be a maximal 1-block of length 2 in z. Then zz contains two blocks, one of the form a1b11c and one of the form b11c1d, where a, d ≥ 1 and b, c ≥ 2. If b ≤ c, then the block a1b11c fulfills condition (b); if b ≥ c, then the block b11c1d fulfills condition (b).
Tab. 6: Longest words with at most n antisquares and avoiding β-powers.