On the Hausdorff measure of regular ω-languages in Cantor space

This paper deals with the calculation of the Hausdorff measure of regular ω-languages, that is, subsets of the Cantor space definable by finite automata. Using methods for decomposing regular ω-languages into disjoint unions of parts of simple structure, we derive two sufficient conditions under which ω-languages with a closure definable by a finite automaton have the same Hausdorff measure as this closure. The first of these conditions is related to the homogeneity of the local behaviour of the Hausdorff dimension of the underlying set, and the other to a certain topological density of the set in its closure.

Regular ω-languages are not only famous because they are definable by finite automata but also because they are the ones definable in Büchi's [Büc62] restricted monadic second order arithmetic (cf. the survey [Tho90] or [PP04]).
Hausdorff dimension and Hausdorff measure for regular ω-languages have been proved to be computable (cf. [Ban89, MW88, Edg08] or [MS94, Sta98a]). The computation of the Hausdorff measure of a regular ω-language uses several properties which do not hold for larger classes of ω-languages (cf. [Sta93, MS94, Sta98b]). These properties show that subsets of the Cantor space definable by finite automata really deserve the name "regular".
For instance, Theorem 21 of [MS94] shows a strong connection of Hausdorff dimension and topological density for regular ω-languages closed in Cantor space, and the measure-category-theorem of [Sta98b] shows that this connection can be extended to arbitrary regular ω-languages.
Our investigations relate the Hausdorff measure of a subset of the Cantor space to the Hausdorff measure of its closure. The result in Section 4.1 shows that under a certain homogeneity condition the measure of a regular ω-language coincides with the measure of its closure. The proof uses the decomposition theorem of [Sta98a] which is based on McNaughton's theorem [McN66] and extends in some sense earlier decompositions of [Arn83], [SW74] and [Wag79]. In our paper the decomposition is directed to a partition of the set of final sets (the table) $T$ of a Muller automaton $A$ accepting a given ω-language $F = L_\omega(A)$ into the ones contributing to the Hausdorff measure and the ones not contributing. Crucial for this partition is the fact that only those final sets in $T$ which are maximal w.r.t. set inclusion can contribute to the Hausdorff measure of the accepted ω-language.
Another result (in Section 4.2) is a sufficient condition under which infinite intersections of regular ω-languages that are topologically large relative to their closure have the same Hausdorff measure as their closure.
Here, for the case of finite measure, we rely on the measure-category theorem derived in [Sta98b, Theorem 4] (see also [VV06, Section 4.4]). The extension to sets of infinite measure requires a closer inspection of regular ω-languages closed in Cantor space as given in Section 3.3.
The paper is organised as follows. After introducing some notation in Section 2, several properties of Hausdorff measure and dimension are listed. Then the third section deals with decompositions of regular ω-languages derived from the accepting automata. This concerns the general decomposition as in [Sta98a], a new decomposition according to non-null Hausdorff measure and the decomposition of closed sets mentioned above. Then in Section 4 we derive the results on the coincidence of the Hausdorff measures of ω-languages of a certain shape and their closures.

Notation
In this section we introduce the notation used throughout the paper. By $\mathbb{N} = \{0, 1, 2, \ldots\}$ we denote the set of natural numbers. Its elements will usually be denoted by letters $i, \ldots, n$. Let $X$ be an alphabet of cardinality $|X| = r \ge 2$. Then $X^*$ is the set of finite words on $X$, including the empty word $e$, and $X^\omega$ is the set of infinite strings (ω-words) over $X$. Subsets of $X^*$ will be referred to as languages and subsets of $X^\omega$ as ω-languages.
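For $w \in X^*$ and $\eta \in X^* \cup X^\omega$ let $w \cdot \eta$ be their concatenation. This concatenation product extends in an obvious way to subsets $W \subseteq X^*$ and $B \subseteq X^* \cup X^\omega$. For a language $W$ let $W^* := \bigcup_{i \in \mathbb{N}} W^i$, and let $W^\omega := \{w_1 \cdots w_i \cdots : w_i \in W \setminus \{e\}\}$ be the set of infinite strings formed by concatenating non-empty words in $W$. Furthermore, $|w|$ is the length of the word $w \in X^*$ and $\mathrm{pref}(B)$ is the set of all finite prefixes of strings in $B \subseteq X^* \cup X^\omega$. We shall abbreviate $w \in \mathrm{pref}(\{\eta\})$ ($\eta \in X^* \cup X^\omega$) by $w \sqsubseteq \eta$.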
As usual, we consider $X^\omega$ as a topological space (Cantor space). The closure (smallest closed set containing $F$) of a subset $F \subseteq X^\omega$, $C(F)$, is described as $C(F) := \{\xi : \mathrm{pref}(\{\xi\}) \subseteq \mathrm{pref}(F)\}$. The open sets in Cantor space are the ω-languages of the form $W \cdot X^\omega$.
We assume the reader to be familiar with the basic facts of the theory of regular languages and finite automata. We postpone the definition of regularity for ω-languages to Section 3. For more details on ω-languages and regular ω-languages see the book [PP04] or the survey papers [Sta97, Tho90].

Hausdorff Dimension and Hausdorff Measure
First, we shall describe briefly the basic formulae needed for the definition of Hausdorff dimension and Hausdorff measure. For more background and motivation see Section 1 of [MS94].
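We recall the definition of the Hausdorff measure and Hausdorff dimension (see [Edg08, Fal90]) of a subset of $X^\omega$. In the setting of languages and ω-languages this can be read as follows (see [Sta93, Sta98a]). For $F \subseteq X^\omega$, $r = |X| \ge 2$ and $0 \le \alpha \le 1$ the equation
$$L_\alpha(F) := \lim_{l \to \infty} \inf\Bigl\{\sum_{w \in W} r^{-\alpha \cdot |w|} : F \subseteq W \cdot X^\omega \wedge \forall w\,(w \in W \to |w| \ge l)\Bigr\}$$
defines the α-dimensional metric outer measure on $X^\omega$. The measure $L_\alpha$ satisfies the following properties (see [Edg08, Fal90, MS94]).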

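The scaling property
$$L_\alpha(w \cdot F) = r^{-\alpha \cdot |w|} \cdot L_\alpha(F)$$
holds for all $w \in X^*$ and $F \subseteq X^\omega$.
Then the Hausdorff dimension of $F$ is defined as
$$\dim F := \sup\{\alpha : \alpha = 0 \vee L_\alpha(F) = \infty\}.$$
It should be mentioned that dim is countably stable and invariant under scaling, that is, for $F_i \subseteq X^\omega$ and $w \in X^*$ we have
$$\dim \bigcup_{i \in \mathbb{N}} F_i = \sup\{\dim F_i : i \in \mathbb{N}\} \quad\text{and}\quad \dim w \cdot F_0 = \dim F_0.$$
In particular, every at most countable subset $E \subseteq X^\omega$ has Hausdorff dimension $\dim E = 0$, and the measure $L_0$ is the counting measure, that is, $L_0(E) = |E|$ if $E$ is finite and $L_0(E) = \infty$ otherwise.
Hausdorff dimension and measure need not be distributed uniformly on a set. In order to describe a certain homogeneity we use the following concept (cf. [MS94, Section 4]). We say that an ω-language $F$ has locally positive α-dimensional measure provided $L_\alpha(F \cap w \cdot X^\omega) > 0$ for all $w \in \mathrm{pref}(F)$. Then the following technical result is true.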
We add a further relation of the Hausdorff dimension and the measure $L_\alpha$ for ω-languages of a special shape.
Proof: The first part is Proposition 6.6 of [Sta93]. Let $w \in \mathrm{pref}(W^\omega)$. Then there is a $v \in X^*$ such that $wv \in W^*$, and, consequently, $wv$ □
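The covering sums behind $L_\alpha$ can be made concrete with a small numerical sketch. It assumes the usual covering definition, in which a family $W \subseteq X^*$ covering $F$ (that is, $F \subseteq W \cdot X^\omega$) contributes the weight $\sum_{w \in W} r^{-\alpha \cdot |w|}$; the helper names `cover_sum` and `all_words` are hypothetical and serve illustration only. Such a sum is merely an upper bound for the infimum over all covers by words of at least the given length.

```python
from itertools import product


def cover_sum(words, r, alpha):
    """Sum of r^(-alpha*|w|) over a finite family of words (assumed weight of a cover)."""
    return sum(r ** (-alpha * len(w)) for w in words)


def all_words(alphabet, length):
    """All words of the given length; together they cover the whole space X^omega."""
    return ["".join(t) for t in product(alphabet, repeat=length)]


if __name__ == "__main__":
    X = "01"  # binary alphabet, so r = 2
    for l in (1, 4, 8):
        W = all_words(X, l)
        # For alpha = 1 the weight stays constant in l,
        # for alpha = 0.5 it grows with l.
        print(l, cover_sum(W, len(X), 1.0), cover_sum(W, len(X), 0.5))
```

For the cover of $X^\omega$ by all words of length $l$ the weight is $r^{l(1-\alpha)}$, which is constant for $\alpha = 1$ and unbounded for $\alpha < 1$.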

Decomposition of Regular ω-languages
As usual we call a language $W \subseteq X^*$ regular if there is a finite (deterministic) automaton $A = (X; S; s_0; \delta)$, where $S$ is the finite set of states, $s_0 \in S$ is the initial state and $\delta : S \times X \to S$ is the transition function (ii), such that $W = \{w : \delta(s_0; w) \in S'\}$ for some fixed set $S' \subseteq S$.
An ω-language $F \subseteq X^\omega$ is called regular provided there are a finite (deterministic) automaton $A = (X; S; s_0; \delta)$ and a table $T \subseteq \{S' : S' \subseteq S\}$ such that for $\xi \in X^\omega$ it holds $\xi \in F$ if and only if $\mathrm{Inf}(A; \xi) \in T$, where $\mathrm{Inf}(A; \xi)$ is the set of all states $s \in S$ through which the automaton $A$ runs infinitely often when reading the input $\xi$. Observe that $S' = \mathrm{Inf}(A; \xi)$ holds for a subset $S' \subseteq S$ if and only if
1. there is a word $u \in X^*$ such that $\delta(s_0; u) \in S'$, and
2. for all $s, s' \in S'$ there are non-empty words $w, v \in X^*$ such that $\delta(s, w) = s'$ and $\delta(s', v) = s$.
Such sets were referred to as essential sets [Wag79] or loops [Sta97, Section 5.1].
Thus, to ease our notation, unless stated otherwise, in the sequel we will assume all automata to be initially connected, that is, $S = \{\delta(s_0; w) : w \in X^*\}$, and the tables $T$ to be contained in the set of loops $\{\mathrm{Inf}(A; \xi) : \xi \in X^\omega\}$.
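For ultimately periodic inputs $\xi = u \cdot v^\omega$ the set $\mathrm{Inf}(A; \xi)$ and the acceptance condition $\mathrm{Inf}(A; \xi) \in T$ can be evaluated directly. The following is a minimal sketch under the assumption that the transition function is given as a dictionary; the names `inf_set`, `accepts`, `DELTA` and the two-state example automaton are hypothetical and not taken from the paper.

```python
def run_states(delta, start, word):
    """States entered after each letter of `word` when starting in `start`."""
    states, s = [], start
    for x in word:
        s = delta[(s, x)]
        states.append(s)
    return states


def inf_set(delta, s0, u, v):
    """Inf(A; u v^omega): the states visited infinitely often on input u v v v ..."""
    if not v:
        raise ValueError("the period v must be non-empty")
    s = s0
    for x in u:
        s = delta[(s, x)]
    seen = {}    # state at the start of a period -> index of that period
    blocks = []  # states visited while reading the successive copies of v
    while s not in seen:
        seen[s] = len(blocks)
        visited = run_states(delta, s, v)
        blocks.append(visited)
        s = visited[-1]
    # From the first repeated period start on, the run cycles through these blocks.
    return frozenset(q for block in blocks[seen[s]:] for q in block)


def accepts(delta, s0, table, u, v):
    """Muller acceptance: the set of infinitely often visited states lies in the table."""
    return inf_set(delta, s0, u, v) in table


# Hypothetical example over X = {0, 1}: the state records the last letter read.
DELTA = {("a", "0"): "a", ("a", "1"): "b", ("b", "0"): "a", ("b", "1"): "b"}
TABLE = {frozenset({"a", "b"})}  # infinitely many 0s and infinitely many 1s

if __name__ == "__main__":
    print(sorted(inf_set(DELTA, "a", "0", "1")))
    print(accepts(DELTA, "a", TABLE, "", "01"))
```

Restricting attention to ultimately periodic words is a design choice of the sketch: a regular ω-language is determined by the ultimately periodic words it contains, and only for such inputs is the run finitely representable.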
We are going to split $F$ into smaller mutually disjoint parts. Let $A = (X; S; s_0; \delta)$ be fixed. We refer to a word $v \in X^*$ as $(s; S')$-loop completing if and only if
1. $v$ is not the empty word,
2. $\delta(s, v) = s$ and $\{\delta(s, v') : v' \sqsubseteq v\} = S'$, and
and we call a word $w \in X^*$ $(s; S')$-loop entering provided
1. $\delta(s_0; w) = s$ and
2. if $w = w' \cdot x$ for some $x \in X$ then $\delta(s_0; w') \notin S'$.

The general case
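Denote by $V_{(s;S')}$ the set of all $(s; S')$-loop completing words and by $W_{(s;S')}$ the set of all $(s; S')$-loop entering words. Both languages are regular and constructible from the finite automaton $A = (X; S; s_0; \delta)$. Moreover, $V_{(s;S')}$ is prefix-free, whereas $W_{(s;S')}$ need not be so. Nevertheless, every $\xi$ with $\mathrm{Inf}(A; \xi) = S'$ has a unique factorisation $\xi = w \cdot v_1 \cdot v_2 \cdots$ with $w \in W_{(s;S')}$ and $v_i \in V_{(s;S')}$. Here the state $s \in S'$ is uniquely determined as the state succeeding the last state $\hat{s} \notin S'$ in the sequence $(\delta(s_0; u))_{u \sqsubseteq \xi}$. Thus we obtain the following (see [Sta98a, Lemma 3]).
Lemma 4 (Decomposition Lemma) Let $A = (X; S; s_0; \delta)$ be a finite automaton, $T \subseteq \{\mathrm{Inf}(A; \xi) : \xi \in X^\omega\}$ be a table and let $F := \{\xi : \mathrm{Inf}(A; \xi) \in T\}$. Then
$$F = \bigcup_{S' \in T} \bigcup_{s \in S'} \bigcup_{w \in W_{(s;S')}} w \cdot V^\omega_{(s;S')} \qquad (3)$$
and the sets $w \cdot V^\omega_{(s;S')}$ are pairwise disjoint.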
(ii) We use the same symbol $\delta$ to denote the usual extension of the function $\delta$ to $S \times X^*$.
As an immediate consequence of the Decomposition Lemma we obtain that every regular ω-language has the form $\bigcup_{i=1}^{n} W_i \cdot V_i^\omega$ where $W_i, V_i$ are regular languages (see [Büc62, PP04, Sta97] or [Tho90]). The converse is also true, that is, if $W \subseteq X^*$ and $F, E \subseteq X^\omega$ are regular then also $W^\omega$, $W \cdot E$ and $E \cup F$ are regular ω-languages. Note, however, that the representation of Eq. (3) is much finer, since it splits a regular ω-language $F = \bigcup_{i=1}^{n} W_i \cdot V_i^\omega$ into pairwise disjoint sets $w \cdot V_i^\omega$, $w \in W_i$, $i \in \{1, \ldots, n\}$, where, additionally, the languages $V_i$ are prefix-free.

Decomposition according to Hausdorff measure
Next we are going to construct, depending on the automaton $A = (X; S; s_0; \delta)$, a subset $F'$ of the set $F$ in Eq. (3) on which the Hausdorff measure $L_\alpha$ is concentrated. To this end we need some properties of the measure $L_\alpha$ for regular ω-languages.
Since regular ω-languages are Borel sets in Cantor space (cf. [Sta97, Tho90]), $L_\alpha$ is not only an outer measure but a measure on the class of regular ω-languages. Thus we have the following (cf. [Fal86]).
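From Eq. (3), Proposition 5 and Proposition 1.3 we obtain a formula for the Hausdorff measure $L_\alpha(F)$ of $F$:
$$L_\alpha(F) = \sum_{S' \in T} \sum_{s \in S'} \sum_{w \in W_{(s;S')}} r^{-\alpha \cdot |w|} \cdot L_\alpha(V^\omega_{(s;S')})$$
The following lemma shows that several of the sets $w \cdot V^\omega_{(s;S')}$ do not contribute to the measure $L_\alpha(F)$ of $F$.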
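Proof: To prove the first assertion it suffices to show $\mathrm{pref}(V^\omega_{(s;S')}) \subseteq \mathrm{pref}(V^\omega_{(s;S'')})$. Let $A_s := (X; S; s; \delta)$. Then $\zeta \in V^\omega_{(s;S')}$ if and only if $\mathrm{Inf}(A_s; \zeta) = S'$ and $\{\delta(s, u) : u \sqsubseteq \zeta\} \subseteq S'$. Consequently, for $v \in V^*_{(s;S')}$ and $\xi \in V^\omega_{(s;S'')}$ we have $v \cdot \xi \in V^\omega_{(s;S'')}$ and thus $\mathrm{pref}(V^\omega_{(s;S')}) \subseteq \mathrm{pref}(V^\omega_{(s;S'')})$. As $V^\omega_{(s;S')}$ and $V^\omega_{(s;S'')}$ are disjoint subsets of $C(V^\omega_{(s;S'')})$ the second assertion follows from the first one and Proposition 7. □
Proposition 8 shows that for an ω-language $F$ accepted by an automaton $A = (X; S; s_0; \delta)$ and a table $T$ only those sets $w \cdot V^\omega_{(s;S')}$ can contribute to the measure $L_\alpha(F)$ for which $S'$ is maximal w.r.t. set inclusion in $T$.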
If $\alpha = \dim F$ and we choose among the maximal sets $S' \in T$ those for which $L_\alpha(w \cdot V^\omega_{(s;S')}) > 0$, we eliminate all sets $w \cdot V^\omega_{(s;S')}$ with $L_\alpha(w \cdot V^\omega_{(s;S')}) = 0$ in Eq. (3) and obtain the following.

The case of closed ω-languages
In [SW74, Wag79] it was observed that the tables $T$ of finite automata $A = (X; S; s_0; \delta)$ accepting regular ω-languages closed in Cantor topology have the following simple structure.
Lemma 10 Let $A = (X; S; s_0; \delta)$ be an initially connected finite automaton and let $T \subseteq \{\mathrm{Inf}(A; \xi) : \xi \in X^\omega\}$ be a table such that the ω-language $F = \{\xi : \mathrm{Inf}(A; \xi) \in T\}$ is closed. Then $T$ satisfies the following properties.
Informally speaking, Condition 1 of Lemma 10 shows that the table $T$ is fully determined by the automaton $A$ and its strongly connected components (SCCs), that is, subsets $S' \in T$ satisfying the condition $\forall s \forall s' (s, s' \in S' \to \exists w \exists v (w \ne e \ne v \wedge \delta(s, w) = s' \wedge \delta(s', v) = s))$. In connection with Proposition 8 one observes that strongly connected components are maximal sets in $\{\mathrm{Inf}(A; \xi) : \xi \in X^\omega\}$.
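The strongly connected components in this sense can be computed from the transition function alone. The sketch below groups states by mutual reachability via non-empty words, as in the condition above; it is a naive illustration with hypothetical names (`reach_nonempty`, `strongly_connected_components`, `DELTA_SCC`), assuming the transition function is a total dictionary. A linear-time method such as Tarjan's algorithm would be the usual choice for large automata.

```python
def reach_nonempty(delta, alphabet, s):
    """States reachable from s by reading some non-empty word."""
    frontier = [delta[(s, x)] for x in alphabet]
    seen = set(frontier)
    while frontier:
        q = frontier.pop()
        for x in alphabet:
            t = delta[(q, x)]
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen


def strongly_connected_components(delta, alphabet, states):
    """Maximal sets of states that are mutually reachable by non-empty words."""
    reach = {s: reach_nonempty(delta, alphabet, s) for s in states}
    components, assigned = [], set()
    for s in states:
        if s in assigned or s not in reach[s]:
            continue  # already classified, or transient (s lies on no cycle)
        comp = frozenset(t for t in reach[s] if s in reach[t])
        components.append(comp)
        assigned |= comp
    return components


# Hypothetical example: 'i' is transient, {'a'} and {'b', 'c'} are the SCCs.
DELTA_SCC = {
    ("i", "0"): "a", ("i", "1"): "b",
    ("a", "0"): "a", ("a", "1"): "b",
    ("b", "0"): "c", ("b", "1"): "c",
    ("c", "0"): "b", ("c", "1"): "c",
}

if __name__ == "__main__":
    for comp in strongly_connected_components(DELTA_SCC, "01", ["i", "a", "b", "c"]):
        print(sorted(comp))
```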
As we shall see in the following theorem, among the strongly connected components $S' \subseteq S^+$ the terminal ones play a special rôle. A similar observation was made in [MS94, Section 3] in connection with the calculation of the Hausdorff measure of closed regular ω-languages. Here we call a strongly connected component $S' \in T$ terminal in $T$ provided $\delta(s, v) \in S'$ or $\delta(s, v) \in S^-$ for $s \in S'$ and arbitrary $v \in X^*$.
Next observe that in view of Lemma 4 we have $F' = \bigcup_{S' \in T'} \bigcup_{s \in S'} W_{(s;S')} \cdot V^\omega_{(s;S')}$. Then the inclusion relations follow from $F' \subseteq F$ and the fact that $F$ is closed.
For the proof of the second assertion it suffices to show w

Results on Hausdorff Measure

Sets of locally positive measure
Theorem 12 If $F \subseteq X^\omega$ is a regular ω-language, $\alpha = \dim F$, $F$ has locally positive α-dimensional
From Theorem 9 we know that $F$ contains a regular ω-language $F' =$ where the sets $W_i, V_i$ are regular, the $V_i$ are prefix-free with $L_\alpha(V_i^\omega) > 0$, and the sets $w$
The following example shows that the additional assertion need not be true for dim □
As an immediate consequence we obtain the following.
Corollary 13 If $F \subseteq X^\omega$ is a regular ω-language, $\alpha := \dim F$, $F$ has locally positive α-dimensional measure and
In case $L_\alpha(F) = \infty$ the measure of the difference $L_\alpha(C(F) \setminus F)$ may be finite or infinite.
Example 2 Let $X = \{0, 1\}$ and
In Theorem 12 the hypothesis that $F$ has locally positive α-dimensional measure is essential. We give an example.
Example 3 Let $X = \{0, 1\}$ and
From Proposition 3 and Theorem 12 we obtain the following relationship for the Hausdorff measure of regular ω-power languages.
Corollary 14 Let $W \subseteq X^*$. If $W^\omega$ is a regular ω-language and $\alpha = \dim W^\omega$ then
Corollary 14 and, consequently, Theorem 12 are not valid for non-regular ω-languages. In [Sta05, Section 3.3] examples of prefix-free non-regular languages $W$ fulfilling various relationships between $L_\alpha(W^\omega)$ and $L_\alpha(C(W^\omega))$ are given.
As a further application of Theorem 12 we derive a result which is in some sense a converse to Proposition 2. It shows that for special closed regular ω-languages $F$ we can find a subset of the form of Eq. (5) having the same measure and the same closure as $F$. Here we use the approach of Theorem 11.
Proposition 15 Let $A = (X; S; s_0; \delta)$ be an initially connected finite automaton and let $T \subseteq \{\mathrm{Inf}(A; \xi) : \xi \in X^\omega\}$ be a table such that the ω-language $F = \{\xi : \mathrm{Inf}(A; \xi) \in T\}$ is closed. Let $T' \subseteq T$ be the set of all strongly connected components terminal in $T$ and $F' = \{\xi : \mathrm{Inf}(A; \xi) \in T'\}$.
If $\dim F = \alpha$ and $F$ has locally positive α-dimensional measure then $F = C(F')$ and $L_\alpha(F) = L_\alpha(F')$.
Proof: The first assertion was already proved in Theorem 11. Then, in view of Theorem 12, it suffices to show that $F'$ has locally positive α-dimensional measure, that is, $L_\alpha(F' \cap w \cdot X^\omega) > 0$ for all $w \in \mathrm{pref}(F) = \mathrm{pref}(F')$.
Let $v \in \mathrm{pref}(F)$ and consider the identity $F' = \bigcup_{S' \in T'} \bigcup_{s \in S'} W_{(s;S')} \cdot V^\omega_{(s;S')}$ derived in the proof of Theorem 11.
Then there are an $S' \in T'$ and an
Thus Theorem 12 proves the assertion.

The measure of sets residual in their closure
This last part shows that an ω-language having a regular closure and which is topologically large in its closure has the same measure as its closure.
Before we proceed to the presentation of the results we have to introduce some necessary prerequisites concerning the topology of the Cantor space.
As usual, a countable intersection of open sets is referred to as a $G_\delta$-set. Moreover, we call a set $F$ nowhere dense in $E$ provided $C(E \setminus C(F)) = C(E)$, that is, if $C(F)$ does not contain a non-empty subset of the form $E \cap w \cdot X^\omega$, and a subset $F$ is referred to as of first Baire category in $E$ if $F$ is a countable union of sets nowhere dense in $E$. If $E$ is closed and $F$ is of first Baire category in $E$ then $E \setminus F$ is referred to as residual in $E$. In particular, $G_\delta$-sets $E$ in Cantor space are residual in $C(E)$.
The following lemma shows a connection between Hausdorff dimension and relative density of regular ω-languages.
Lemma 16 ([Sta98b, Theorem 8]) Let $E \subseteq X^\omega$ be a regular ω-language which is closed in Cantor space, $\alpha = \dim E$ and let $E$ have finite and locally positive α-dimensional measure.
Then a regular ω-language $F \subseteq E$ is of first Baire category in $E$ if and only if $L_\alpha(F) = 0$.
This much preparatory apparatus leads to the following result.
Theorem 17 Let $E \subseteq X^\omega$ be an ω-language which is a countable intersection of regular ω-languages, residual in $C(E)$ and let $C(E)$ be regular.
Theorems 17 and 18 can also be applied to non-regular ω-languages. We give a simple example.
Example 4 Let $E := \bigcap_{w \in X^*} X^* \cdot w \cdot X^\omega$ be the set of all ω-words which contain every word as an infix.
Those ω-words are referred to as disjunctive [JST83] or rich [Sta98b]. $E$ is a $G_\delta$-set in Cantor space, hence residual in $C(E) = X^\omega$.
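The infix condition defining disjunctive ω-words can be illustrated on finite prefixes. The sketch below builds a prefix of one particular candidate, the concatenation of all non-empty words in length-lexicographic order (an assumption of this illustration, not a construction from the paper), and checks that it contains every word up to a given length as an infix; the function names are hypothetical.

```python
from itertools import product


def words_up_to(alphabet, n):
    """All non-empty words of length at most n, in length-lexicographic order."""
    return ["".join(t) for k in range(1, n + 1) for t in product(alphabet, repeat=k)]


def prefix_of_rich_word(alphabet, n):
    """Finite prefix obtained by concatenating all words of length at most n."""
    return "".join(words_up_to(alphabet, n))


def contains_all_infixes(text, alphabet, n):
    """True if every non-empty word of length at most n occurs in `text` as an infix."""
    return all(w in text for w in words_up_to(alphabet, n))


if __name__ == "__main__":
    p = prefix_of_rich_word("01", 3)
    print(p, contains_all_infixes(p, "01", 3))
```

Continuing the concatenation through all lengths yields an ω-word in which every finite word occurs, since each word appears as one of the concatenated blocks.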
We have $\dim C(E) = 1$ and obtain $L_1(C(E)) = L_1(E) = 1$. □
The condition that $E$ be residual in $C(E)$ is really essential as the following example shows.