Cubefree words with many squares

We construct infinite cubefree binary words containing exponentially many distinct squares of length n. We also show that for every positive integer n, there is a cubefree binary square of length 2n.


Introduction
A square is a non-empty word of the form xx, and a cube is a non-empty word of the form xxx. An overlap is a word of the form axaxa, where a is a letter and x is a word (possibly empty). A word is squarefree (resp. cubefree, overlap-free) if none of its factors are squares (resp. cubes, overlaps). For further background material concerning combinatorics on words we refer the reader to [2].
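The definitions above translate directly into brute-force tests. The following is a minimal sketch for short finite words only; the function names (`has_power`, `is_squarefree`, `is_cubefree`, `is_overlap_free`) are illustrative and do not come from the paper.

```python
def has_power(w: str, reps: int) -> bool:
    """Return True if w has a factor x^reps for some non-empty word x."""
    n = len(w)
    for p in range(1, n // reps + 1):          # p = |x|
        for i in range(n - reps * p + 1):      # i = start of the factor
            # A factor of length reps*p equals x^reps iff it has period p.
            if all(w[j] == w[j + p] for j in range(i, i + (reps - 1) * p)):
                return True
    return False


def is_squarefree(w: str) -> bool:
    return not has_power(w, 2)


def is_cubefree(w: str) -> bool:
    return not has_power(w, 3)


def is_overlap_free(w: str) -> bool:
    """An overlap axaxa with |ax| = p is a factor of length 2p+1 with period p."""
    n = len(w)
    for p in range(1, (n - 1) // 2 + 1):
        for i in range(n - 2 * p):
            if all(w[j] == w[j + p] for j in range(i, i + p + 1)):
                return False
    return True
```

For instance, 01010 is cubefree but is itself an overlap (take a = 0 and x = 1), which is why overlap-freeness is the stronger condition.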
It is well-known that there exist infinite squarefree words over a ternary alphabet and infinite overlap-free words over a binary alphabet. Clearly, any overlap-free word is also cubefree. Any infinite cubefree binary word must contain squares; however, Dekking [8] proved that there exists an infinite cubefree binary word containing no squares xx where the length of x is greater than 3 (see also [13,14]). In this paper we consider instead the existence of infinite cubefree binary words with many distinct squares.
Most known constructions of infinite cubefree words involve the iteration of a morphism. Words constructed in this manner are often referred to as infinite D0L words. Ehrenfeucht and Rozenberg [9,10,11] proved several results concerning the factor complexity of infinite D0L words. They showed that any squarefree or cubefree D0L word has O(n log n) factors of length n. Thus, an infinite cubefree D0L word cannot have many distinct square factors. By contrast, we show here how to construct infinite cubefree binary words containing exponentially many distinct squares of length n.
Other work related to the problems considered here includes [1,6,7].
Let µ denote the Thue-Morse morphism: i.e., the morphism that maps 0 → 01 and 1 → 10. The Thue-Morse word is the infinite word t = 011010011001011010010110 · · · obtained by iteratively applying µ to the word 0. The Thue-Morse word is well-known to be overlap-free, and hence, a fortiori, cubefree [16]. The squares occurring in the Thue-Morse word were characterized by Pansiot [12] and Brlek [4]; let A denote the set of squares appearing in the Thue-Morse word. Shelton and Soni [15] characterized the overlap-free squares (the result is also attributed to Thue by Berstel [3]) as being the conjugates of the words in A. (A conjugate of x is a word y such that x = uv and y = vu for some u, v.) Currie and Rampersad [6] showed that the conjugates of the words in A are also precisely the 7/3-power-free squares. Thus, 7/3-power-free squares of length 2n exist only when n is a power of 2 or 3 times a power of 2. By contrast, we show that there are cubefree binary squares of length 2n for every positive integer n. We use this result to construct infinite cubefree binary words containing exponentially many distinct squares.
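To make the Thue-Morse morphism concrete, here is a small sketch that iterates µ on 0, lists conjugates, and records which half-lengths n admit a square of length 2n inside a finite prefix of t. The helper names and the choice of a prefix of length 512 are assumptions made for illustration only.

```python
MU = {"0": "01", "1": "10"}


def mu(w: str) -> str:
    """Apply the Thue-Morse morphism 0 -> 01, 1 -> 10 letter by letter."""
    return "".join(MU[c] for c in w)


def thue_morse_prefix(k: int) -> str:
    """Return mu^k(0), the prefix of t of length 2**k."""
    w = "0"
    for _ in range(k):
        w = mu(w)
    return w


def conjugates(x: str) -> set:
    """All words vu such that x = uv."""
    return {x[i:] + x[:i] for i in range(len(x))}


def square_half_lengths(w: str, max_n: int) -> set:
    """Return every n <= max_n such that some square xx with |x| = n is a factor of w."""
    found = set()
    for n in range(1, max_n + 1):
        for i in range(len(w) - 2 * n + 1):
            if w[i:i + n] == w[i + n:i + 2 * n]:
                found.add(n)
                break
    return found


prefix = thue_morse_prefix(9)                 # 512 letters of t
print(sorted(square_half_lengths(prefix, 32)))
```

In this prefix the only half-lengths that show up are powers of 2 and 3 times powers of 2, in line with the restriction on square lengths described above.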

Main results
The main results of this paper are the following two theorems.
Theorem 1. Let n be a positive integer. There exists a cubefree binary square of length 2n.

Theorem 2. There exists an infinite cubefree binary word containing exponentially many distinct squares of length n.
We first establish some preliminary results.
Proof. Aberkane and Currie [1, Lemma 4] proved that for every integer m ≥ 6, the Thue-Morse word contains a factor of length m of the form 10y10. Then the Thue-Morse word also contains the factor µ(10y10) = 1001µ(y)1001, which has length 2m. Finally, we observe that 10011001 and 1001101001 are factors of the Thue-Morse word of lengths 8 and 10 respectively.
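The doubling step in this proof can be watched on a finite prefix of t. The sketch below collects factors of the form 10y10, applies µ, and confirms that the image begins and ends with 1001 and is again a factor. The prefix lengths (256 and 1024) and the helper name `factors_10y10` are arbitrary choices for the illustration.

```python
MU = {"0": "01", "1": "10"}


def mu(w: str) -> str:
    return "".join(MU[c] for c in w)


def thue_morse_prefix(k: int) -> str:
    """Return mu^k(0), the prefix of t of length 2**k."""
    w = "0"
    for _ in range(k):
        w = mu(w)
    return w


def factors_10y10(t: str, m: int) -> set:
    """Factors of t of length m that start with 10 and end with 10."""
    return {
        t[i:i + m]
        for i in range(len(t) - m + 1)
        if t[i:i + 2] == "10" and t[i + m - 2:i + m] == "10"
    }


T = thue_morse_prefix(10)                      # 1024 letters of t

if __name__ == "__main__":
    for m in range(6, 13):
        for f in sorted(factors_10y10(T[:256], m)):
            image = mu(f)                      # equals 1001 mu(y) 1001
            print(m, f, image, image in T)
```

Since µ(t) = t, a factor found at position i of t has its image at position 2i, which is why searching a prefix four times as long is enough here.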

Lemma 4. If y is overlap-free and ayb is a cube of period p, then p ≤ |ab|.
Proof. Otherwise deleting a and b removes less than a full period from ayb, leaving an overlap.
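Lemma 4 can also be checked exhaustively for small periods. The sketch below runs over every binary cube zzz with |z| ≤ 4 and every factorization zzz = ayb, and collects the cases where y is overlap-free but p > |ab|. The bound 4 and the function names are assumptions made to keep the search small.

```python
from itertools import product


def is_overlap_free(w: str) -> bool:
    """No factor of length 2p+1 with period p, for any p >= 1."""
    n = len(w)
    for p in range(1, (n - 1) // 2 + 1):
        for i in range(n - 2 * p):
            if all(w[j] == w[j + p] for j in range(i, i + p + 1)):
                return False
    return True


def lemma4_counterexamples(max_p: int) -> list:
    """Triples (a, y, b) with ayb = zzz, |z| = p, y overlap-free, and p > |ab|."""
    bad = []
    for p in range(1, max_p + 1):
        for letters in product("01", repeat=p):
            cube = "".join(letters) * 3
            for i in range(len(cube) + 1):
                for j in range(i, len(cube) + 1):
                    a, y, b = cube[:i], cube[i:j], cube[j:]
                    if is_overlap_free(y) and p > len(a) + len(b):
                        bad.append((a, y, b))
    return bad
```

The search returns an empty list, as the proof predicts: removing fewer than p letters from a cube of period p leaves more than 2p letters with period p, which always contains an overlap.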
Lemma 5. If z is a factor of yyy where |y| = p and |z| ≤ p+1, then there are two occurrences of z in yyy.
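As a sanity check, Lemma 5 can be tested by brute force for short binary words y. The sketch below counts occurrences of every factor z of yyy with |z| ≤ p + 1, allowing occurrences to overlap. The bound |y| ≤ 5 and the helper names are illustrative assumptions.

```python
from itertools import product


def count_occurrences(z: str, w: str) -> int:
    """Number of positions at which z occurs in w (occurrences may overlap)."""
    return sum(1 for i in range(len(w) - len(z) + 1) if w.startswith(z, i))


def lemma5_counterexamples(max_p: int) -> list:
    """Pairs (y, z) where z is a factor of yyy, |z| <= |y| + 1, and z occurs only once."""
    bad = []
    for p in range(1, max_p + 1):
        for letters in product("01", repeat=p):
            y = "".join(letters)
            cube = y * 3
            short_factors = {
                cube[i:i + k]
                for k in range(1, p + 2)
                for i in range(len(cube) - k + 1)
            }
            for z in short_factors:
                if count_occurrences(z, cube) < 2:
                    bad.append((y, z))
    return bad
```

No counterexample turns up in this range, which is consistent with the lemma.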
Proof. Certainly if z is a factor of yy it occurs twice in yyy. If z is a factor of yyy but not of yy, then z must span the central y of yyy and a bit more on both ends, giving z a length of p + 2 or more.
Proof of Theorem 6. Suppose yyy is a cube in x0x0 with |y| = p > 0.
Case 1: Period p ≥ 4. By Lemma 5 and Remark 1, the word 01010 is not a factor of yyy. We have two possibilities:
Case 1a: Cube yyy is a factor of x′100101. This is impossible by Lemma 4, since x′1001 is overlap-free, |01| = 2, and p ≥ 4 > 2.
Case 1b: Cube yyy is a factor of 101001x′′0. This is again impossible by Lemma 4, since 1001x′′ is overlap-free.
Case 2: Period p ≤ 3. If 01010 is a factor of yyy, then one of 001010 and 010100 is a factor. However, neither of these has period 1, 2 or 3; this is impossible. We conclude that 01010 is not a factor of yyy. This gives a similar case breakdown as in Case 1.
Case 2a: Cube yyy is a factor of x′100101.
Case 2ai: Cube yyy is a suffix of x′100101. In this case, p ≤ 2 by Lemma 4, since x′1001 is overlap-free. However, the longest suffix of x′100101 of period 1 or 2 is 0101, which is cubefree.
Case 2aii: Cube yyy is a suffix of x′10010. This forces p = 1, which is impossible.
Case 2b: Cube yyy is a factor of 101001x′′0.
Case 2bii: Cube yyy is a factor of 1001x′′0 = x0. This is impossible by Case 2a.
Theorem 7. Let x be a factor of the Thue-Morse word of the form x = 1001x′′ = x′1001. Then the word x101100x101100 is cubefree.
Proof of Theorem 7. Suppose yyy is a cube in x101100x101100 with |y| = p > 0.
Case 1: Period p ≥ 4. By Lemma 5 and Remark 2, the word 00100 is not a factor of yyy. We have two possibilities:
Case 1a: Cube yyy is a factor of x10110010. The word x10110010 contains 11011 as a factor exactly once. By Lemma 5 and Remark 2, there are two possibilities:
Case 1ai: Cube yyy is contained in x101. In this case, p ≤ 3 by Lemma 4, since x is overlap-free. This is a contradiction.
Case 1aii: Cube yyy is contained in 10110010. This is clearly impossible.
Case 1b: Cube yyy is a factor of 0x101100. Again, the word 0x101100 contains 11011 as a factor exactly once. Therefore, either yyy is contained in 101100 or in 0x101. The first alternative evidently is impossible, while the second is ruled out by Lemma 4.
Case 2: Period p ≤ 3. If 00100 is a factor of yyy, then we must have p = 3, since 00100 does not have period 1 or 2. However, in x101100x101100, the maximal factor of period 3 containing 00100 is 1001001, which is not a cube. We conclude that 00100 is not a factor of yyy. This gives a similar case breakdown to Case 1:
Case 2a: Cube yyy is a factor of x10110010. By Lemma 4 the word x10 must be cubefree. Therefore, yyy must be a suffix of one of these words: None of the w_n ends in a cube of period 1, 2 or 3. (In the case of the words w_4, w_3, the longest suffixes of period 3 have lengths 6 and 5 respectively.) It follows that yyy is not a suffix of any of the w_n, and this case does not occur.
Case 2b: Cube yyy is a factor of 0x101100. Since |yyy| = 3p ≤ 9 ≤ |0x|, yyy is a factor of 0x or of x101100. The first possibility was ruled out in Theorem 6, and the second in Case 2a.
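The two constructions analysed in these proofs can be spot-checked on a finite prefix of the Thue-Morse word. The sketch below takes every factor x of that prefix that begins and ends with 1001, and tests whether x0x0 (the word examined in the proof of Theorem 6) and x101100x101100 (Theorem 7) are cubefree. The prefix length 256, the range 8 ≤ |x| ≤ 20, and all function names are assumptions chosen for the illustration; the lower bound 8 matches the inequality 9 ≤ |0x| used in Case 2b.

```python
def is_cubefree(w: str) -> bool:
    """No factor of length 3p with period p, for any p >= 1."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if all(w[j] == w[j + p] for j in range(i, i + 2 * p)):
                return False
    return True


def thue_morse_prefix(k: int) -> str:
    """Return mu^k(0) for the morphism 0 -> 01, 1 -> 10."""
    w = "0"
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w


def bordered_factors(t: str, length: int) -> set:
    """Factors of t of the given length that start and end with 1001."""
    return {
        t[i:i + length]
        for i in range(len(t) - length + 1)
        if t[i:i + 4] == "1001" and t[i + length - 4:i + length] == "1001"
    }


def failures(t: str, lengths, suffix: str) -> list:
    """Words x for which (x + suffix) repeated twice contains a cube."""
    return [
        x
        for m in lengths
        for x in sorted(bordered_factors(t, m))
        if not is_cubefree((x + suffix) * 2)
    ]


T = thue_morse_prefix(8)                       # 256 letters of t
```

Calling `failures(T, range(8, 21), "0")` and `failures(T, range(8, 21), "101100")` is expected to give empty lists. Note that |x0| = |x| + 1 and |x101100| = |x| + 6, so the two constructions cover half-lengths of both parities.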
Theorems 6 and 7 together establish Theorem 1.
Next we show that the number of cubefree binary squares of length n grows exponentially. Brandenburg [5, Theorem 6] showed that h maps cubefree words to cubefree words. Moreover, since h is uniform and injective, the set of words h(yy) consists of at least 2^{m/2} cubefree squares of length 12m. Asymptotically, we thus have exponentially many cubefree binary squares of length n, as required.
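For small lengths, both the existence statement of Theorem 1 and the growth of the number of cubefree squares can be observed by exhaustive search. The following sketch enumerates all binary words x of length n and keeps those for which xx is cubefree. It is a brute-force illustration, not the morphism-based argument used in the text, and its names are hypothetical.

```python
from itertools import product


def is_cubefree(w: str) -> bool:
    """No factor of length 3p with period p, for any p >= 1."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if all(w[j] == w[j + p] for j in range(i, i + 2 * p)):
                return False
    return True


def cubefree_squares(n: int) -> list:
    """All cubefree binary squares xx with |x| = n, in lexicographic order of x."""
    squares = []
    for letters in product("01", repeat=n):
        x = "".join(letters)
        if is_cubefree(x + x):
            squares.append(x + x)
    return squares


if __name__ == "__main__":
    for n in range(1, 11):
        found = cubefree_squares(n)
        print(n, len(found), found[0])
```

The search is exponential in n, so it is only practical for small n; it prints the number of cubefree squares of length 2n together with one witness for each n.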
We now prove Theorem 2.
Proof of Theorem 2. In the proof of Proposition 8 we showed that there are at least 2^{m/2} cubefree binary squares of length 12m for every positive integer m. Let S therefore be any set of cubefree squares over {0, 1} where S contains at least 2^{m/2} words of length 12m for every positive integer m. Let x = x_1 x_2 · · · be any infinite cubefree binary word over {2, 3}. Construct a word w = x_1 S_1 x_2 S_2 · · · , where the set of the S_i is equal to the set S, so that w is cubefree and contains exponentially many distinct squares of length n. Let g be the morphism of Brandenburg [5, Theorem 6], which maps cubefree words to cubefree words. Thus, g(w) is cubefree and, by the uniformity and injectivity of g, contains exponentially many distinct squares of length n.
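The interleaving step w = x_1 S_1 x_2 S_2 · · · can be illustrated on a finite scale. In the sketch below, x is a Thue-Morse prefix rewritten over {2, 3}, and the S_i are the cubefree binary squares of length at most 8 found by brute force; both choices are assumptions made for the example, and the final application of g is omitted. Any cube in w either lies inside a single S_i or projects to a cube in x, so w should be cubefree.

```python
from itertools import product


def is_cubefree(w: str) -> bool:
    """No factor of length 3p with period p, for any p >= 1."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if all(w[j] == w[j + p] for j in range(i, i + 2 * p)):
                return False
    return True


def thue_morse_prefix(k: int) -> str:
    """Return mu^k(0) for the morphism 0 -> 01, 1 -> 10."""
    w = "0"
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w


def small_cubefree_squares(max_n: int) -> list:
    """Cubefree binary squares xx with 1 <= |x| <= max_n, shortest first."""
    out = []
    for n in range(1, max_n + 1):
        for letters in product("01", repeat=n):
            x = "".join(letters)
            if is_cubefree(x + x):
                out.append(x + x)
    return out


def interleave(x: str, squares: list) -> str:
    """Build x_1 S_1 x_2 S_2 ... for as many terms as both inputs provide."""
    return "".join(letter + s for letter, s in zip(x, squares))


# A cubefree word over {2, 3}: the Thue-Morse prefix with 0 -> 2 and 1 -> 3.
X = thue_morse_prefix(6).translate(str.maketrans("01", "23"))
SQUARES = small_cubefree_squares(4)
W = interleave(X, SQUARES)
```

Because the letters 2 and 3 never occur inside a square, every S_i survives as a factor of w, so the number of distinct squares in w is at least the number of squares inserted.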
Note that Theorem 2 implies the existence of an infinite cubefree binary word with exponential factor complexity, i.e., with exponentially many factors of length n. Similarly, one can easily construct an infinite squarefree word over {0, 1, 2} with exponential factor complexity.
Proposition 9. There exists an infinite squarefree word over {0, 1, 2} with exponential factor complexity.
Proof. Let w be any infinite squarefree word over {0, 1, 2} and let x be any infinite word over {3, 4} with 2^n factors of length n for every positive n. Let y be the word obtained by forming the perfect shuffle of w and x: that is, if w = w_0 w_1 w_2 · · · and x = x_0 x_1 x_2 · · · , then define y = w_0 x_0 w_1 x_1 w_2 x_2 · · · . Clearly, y is a squarefree word with exponential factor complexity. Let f be the morphism
0 → 010201202101210212
1 → 010201202102010212
2 → 010201202120121012
3 → 010201210201021012
4 → 010201210212021012.
Brandenburg [5,Theorem 4] showed that f maps squarefree words to squarefree words. The uniformity and injectivity of f implies that f (y) is a squarefree word with exponential factor complexity, as required.
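The perfect shuffle and the morphism f can be tried out on finite words. In the sketch below, the squarefree ternary word is produced by a simple backtracking search, which is an assumption of this example (any squarefree word over {0, 1, 2} would do); the images of f are copied from the definition above, and the remaining names are illustrative.

```python
def is_squarefree(w: str) -> bool:
    """No factor of the form xx with x non-empty."""
    n = len(w)
    return not any(
        w[i:i + p] == w[i + p:i + 2 * p]
        for p in range(1, n // 2 + 1)
        for i in range(n - 2 * p + 1)
    )


def squarefree_ternary(n: int, w: str = ""):
    """Backtracking search for a squarefree word of length n over {0, 1, 2}."""
    if len(w) == n:
        return w
    for c in "012":
        if is_squarefree(w + c):
            result = squarefree_ternary(n, w + c)
            if result is not None:
                return result
    return None


def perfect_shuffle(w: str, x: str) -> str:
    """Return w_0 x_0 w_1 x_1 ... up to the shorter of the two words."""
    return "".join(a + b for a, b in zip(w, x))


# Images of the morphism f, copied from the definition in the text.
F = {
    "0": "010201202101210212",
    "1": "010201202102010212",
    "2": "010201202120121012",
    "3": "010201210201021012",
    "4": "010201210212021012",
}


def apply_f(y: str) -> str:
    return "".join(F[c] for c in y)


if __name__ == "__main__":
    w = squarefree_ternary(8)
    y = perfect_shuffle(w, "34433434")
    print(y, is_squarefree(y))
    print(is_squarefree(apply_f(y)))
```

The shuffle stays squarefree for every choice of x because a square in y must have a half of even length, and deleting the letters 3 and 4 would then leave a square in w. The last line lets the reader check the squarefreeness of f(y) stated by the cited theorem on a small instance.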