Surjective cellular automata far from the Garden of Eden

One of the first and most famous results of cellular automata theory, Moore's Garden-of-Eden theorem has been proven to hold if and only if the underlying group possesses the measure-theoretic properties suggested by von Neumann to be the obstacle to the Banach-Tarski paradox. We show that several other results from the literature, already known to characterize surjective cellular automata in dimension d, hold precisely when the Garden-of-Eden theorem does. We focus in particular on the balancedness theorem, which has been proven by Bartholdi to fail on non-amenable groups, and we measure the amount of such failure.


Introduction
Cellular automata (CA) are local descriptions of global dynamics. Given an underlying uniform graph (e.g., the square grid on the plane), a CA is defined by a finite alphabet, a finite neighborhood for the nodes of the graph, and a local function that maps states of a neighborhood into states of a point. By synchronous application of the local function at all nodes, a global function on configurations is defined.
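As a hedged illustration of how a local function induces a global one, the following minimal Python sketch applies a local rule synchronously to a spatially periodic configuration on the integers, stored as a finite cyclic tuple. The names `global_map` and `xor_rule`, the cyclic boundary, and the choice of rule are assumptions made for the example and are not taken from the text.

```python
from typing import Callable, Sequence, Tuple

State = int


def global_map(config: Sequence[State],
               neighborhood: Sequence[int],
               local_rule: Callable[[Tuple[State, ...]], State]) -> Tuple[State, ...]:
    """Apply the local rule at every cell of a cyclic configuration at once."""
    n = len(config)
    return tuple(
        local_rule(tuple(config[(i + d) % n] for d in neighborhood))
        for i in range(n)
    )


def xor_rule(states: Tuple[State, ...]) -> State:
    """Hypothetical local function: XOR of the left and right neighbors."""
    left, _, right = states
    return left ^ right


c = (0, 0, 0, 1, 0, 0, 0, 0)
image = global_map(c, (-1, 0, 1), xor_rule)
print(image)
```

The whole new configuration is computed from the old one before anything is overwritten, which is what "synchronous" means here.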
The study of global properties of CA and their relations with the local description has been a main topic of research since the field was established. Indeed, the Garden-of-Eden theorem by Moore [19] and its converse by Myhill [20], which link surjectivity of the global map of 2D CA to pre-injectivity (a property that may be described as the impossibility of erasing finitely many errors in finite time), also have the distinction of being the first rigorous results of cellular automata theory. Since then, several more properties have been proven to be equivalent to surjectivity for d-dimensional CA. Among them are:
• Balancedness [18]: each pattern of a given shape has the same number of preimages.
• Preservation of Martin-Löf randomness [3]: the image of any algorithmically incompressible configuration is itself algorithmically incompressible.
With the subsequent efforts to extend the definition of CA to the more general situation of Cayley graphs of finitely generated groups, an unexpected phenomenon appeared: the Garden-of-Eden theorem actually depends on properties of the involved groups. This phenomenon dates back to Machì and Mignosi's 1993 paper [15], where counterexamples to both Moore's and Myhill's theorems on the free group on two generators are presented, but the theorems themselves are proven for groups of subexponential growth, a class which includes the Euclidean groups. Comparing the original papers [19] and [20], a key fact emerges, which is crucial for the proofs: in $\mathbb{Z}^d$, the size of a hypercube is the d-th power of the side, but the number of sites on its outer surface is a polynomial of degree d − 1 in the side. In other words, it seems that, to get Moore's or Myhill's theorems for CA on a group G, we need that in G the sphere grows more slowly than the ball.
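The contrast between "sphere" and "ball" can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the sphere of radius r to be the set of elements of word length exactly r, uses the standard generators of $\mathbb{Z}^2$ and of the free group on two generators, and represents free group elements as reduced strings with `A` and `B` standing for the inverses of `a` and `b`. All function names are hypothetical.

```python
def ball_and_sphere_sizes(generators, multiply, identity, radius):
    """Return [(|ball_r|, |sphere_r|) for r = 0..radius] in the Cayley graph."""
    ball = {identity}
    sphere = {identity}
    sizes = [(1, 1)]
    for _ in range(radius):
        # Elements one step beyond the current sphere that were not seen before.
        sphere = {multiply(g, s) for g in sphere for s in generators} - ball
        ball |= sphere
        sizes.append((len(ball), len(sphere)))
    return sizes


# Z^2 with generators (+-1, 0), (0, +-1); the group operation is vector addition.
Z2_GENS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def z2_mul(g, s):
    return (g[0] + s[0], g[1] + s[1])


# Free group on a, b: elements are reduced words, upper case denotes inverses.
F2_GENS = ["a", "A", "b", "B"]


def f2_mul(word, s):
    """Right-multiply a reduced word by a generator, cancelling if needed."""
    if word and word[-1] == s.swapcase():
        return word[:-1]
    return word + s


z2 = ball_and_sphere_sizes(Z2_GENS, z2_mul, (0, 0), 6)
f2 = ball_and_sphere_sizes(F2_GENS, f2_mul, "", 6)
for r, ((bz, sz), (bf, sf)) in enumerate(zip(z2, f2)):
    print(r, round(sz / bz, 3), round(sf / bf, 3))
```

In this experiment the sphere-to-ball ratio shrinks with the radius for $\mathbb{Z}^2$, while for the free group it stays close to 2/3.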
What is actually sufficient for the Garden-of-Eden theorem to hold is a slightly weaker property called amenability, which was formulated by von Neumann in an attempt to explain the Banach-Tarski paradox: the unit ball in the space can be decomposed into finitely many parts, and those parts reassembled so as to form two unit balls! Informally, a group is amenable if, for any given finite shape of the sphere, it is always possible to find a finite ball whose sphere is proportionally as small as wished. It turns out that the Hausdorff phenomenon takes place in the space because the group of rotations of the space has a free subgroup on two generators, which precludes amenability. Ceccherini-Silberstein et al. [7] then proved that Moore's theorem holds for CA on any amenable group, but fails for groups that have a free subgroup on two generators. After about a decade, Bartholdi [1] completed the proof for every non-amenable group, and added preservation of the uniform product measure to the list of properties satisfied by surjective CA on all and only the amenable groups. This can also be related to characterizations by Ornstein and Weiss [22] of groups whose full shifts over distinct alphabets factor onto one another.
In this paper, we extend the range of Bartholdi's theorem by characterizing amenable groups as those where surjective CA have additional properties. We start by considering balancedness, which is the combinatorial variant of preservation of the product measure: thus, amenable groups are precisely those where surjective CA are balanced. We then include several properties studied in topological dynamics: CA with any of these properties are surjective, and we show that the converse implications hold precisely for CA on amenable groups.
Theorem 1 Let G be a group. The following are equivalent.
1. G is amenable.
2. Every surjective CA on G is pre-injective.
3. Every surjective CA on G preserves the uniform product measure.
4. Every surjective CA on G is balanced.
5. Every surjective CA on G is recurrent for the uniform product measure.
6. Every surjective CA on G is nonwandering.
We then show a fact which is remarkable in its own right. Not only does preservation of the uniform product measure by surjective CA characterize amenable groups: it also fails catastrophically for non-amenable ones, in the sense given by the following statement.
Theorem 2 Let G be a non-amenable group. There exist an alphabet Q, a subset U of $Q^G$ such that $\mu_\Pi(U) = 1$, and a surjective cellular automaton A over G with alphabet Q such that $\mu_\Pi(F_A^{-1}(U)) = 0$, where $F_A$ is the global function of A and $\mu_\Pi$ is the uniform product measure on $Q^G$.
To prove Theorem 2, we introduce a definition of normality for configurations, which is modeled on the one for infinite words over a finite alphabet. Such a trick has been successfully applied before on the Euclidean groups $\mathbb{Z}^d$: however, in our more general setting, several properties do not hold, which forces us to add further conditions to ensure that the set of normal configurations (the set U in Theorem 2) has full measure. In turn, the cellular automaton A will be a variant of Bartholdi's counterexample, modified so that it has a spreading state.
Finally, for finitely generated groups with decidable word problem, Martin-Löf randomness can be defined: this definition depends on the measure defined on the Borel σ-algebra, which for our aims will be the product measure. Under these additional hypotheses, we show that the result by Calude et al. [3] about surjective CA preserving Martin-Löf randomness holds precisely for amenable groups.
Theorem 3 Let G be a finitely generated group with decidable word problem. Then G is amenable if and only if for every surjective CA A on G, whenever a configuration c is Martin-Löf random with respect to the product measure $\mu_\Pi$, so is its image $F_A(c)$.
In addition, if G is not amenable, there exists a surjective CA on G such that every Martin-Löf random configuration w.r.t. $\mu_\Pi$ has a nonrandom image and only nonrandom preimages. In particular, the set U in Theorem 2 can be taken as the set of Martin-Löf random configurations w.r.t. $\mu_\Pi$.
The paper is organized as follows. Section 2 provides the background. Section 3 deals with balancedness, and Section 4 with the nonwandering property. Section 5 is devoted to the proof of Theorem 2, and Section 6 to that of Theorem 3.

Background
Given a set X, we denote by PF(X) the set of all finite subsets of X.

Groups
Let G be a group. We call $1_G$, or simply 1, its identity element. Given a set X, the family $\sigma = \{\sigma_g\}_{g \in G}$ of transformations of $X^G = \{c : G \to X\}$, called translations, defined by
$c^g(x) = \sigma_g(c)(x) = c(gx)$ for every $x \in G$ (1)
is a right action of G on $X^G$, that is, $\sigma_{gh} = \sigma_h \circ \sigma_g$ for every g, h ∈ G. This is consistent with defining the product φψ of functions as the composition ψ ∘ φ. Other authors (cf. [6]) define $\sigma_g(c)(x)$ as $c(g^{-1}x)$, so that σ becomes a left action. However, most of the definitions and properties we deal with do not depend on the "side" of the multiplication: we will therefore stick to (1).
A set of generators for G is a subset S ⊆ G such that for each g ∈ G there is a word $w = w_1 \ldots w_n$ on $S \cup S^{-1}$ such that $g = w_1 \cdots w_n$. The minimum length of such a word is called the length of g w.r.t. S, and indicated by $\|g\|_S$, or simply $\|g\|$. G is finitely generated (briefly, f.g.) if S can be chosen finite. A group G is free on a set S if it is isomorphic to the group of reduced words on $S \cup S^{-1}$: a word w is said to be reduced if for every s ∈ S the pairs $ss^{-1}$ and $s^{-1}s$ do not appear in w.
For r ≥ 0, g ∈ G the disk of radius r centered in g is $D_r(g) = \{h \in G \mid \|g^{-1}h\| \le r\}$. The points of $D_r(g)$ can be "reached" from the "origin" $1_G$ by first "walking" up to g, then making up to r steps: this is consistent with the definition of translations by (1), where to determine $c^g(z)$ we first move from 1 to g, then from g to gz. We write $D_r$ for $D_r(1)$. We also put
A group G is residually finite (briefly, r.f.) if for every g ≠ 1 there exists a homomorphism φ : G → H such that H is finite and φ(g) ≠ 1. Equivalently, G is r.f. if the intersection of all its subgroups of finite index is trivial. It follows from the definitions that, if G is r.f. and U ⊆ G is finite, then there exists
Lemma 4 ([10, Lemma 2.3.2]) Let G be a residually finite group and F a finite subset of G not containing $1_G$. There exists a subgroup H of finite index in G, which does not intersect F, and such that the right cosets Hu, u ∈ F, are pairwise disjoint.
we say that c is H-periodic. The family of periodic configurations in $X^G$ is indicated by Per(G, X).
A group G is amenable if it satisfies the following equivalent conditions:
1. There exists a finitely additive probability measure µ on G such that µ(gA) = µ(A) for every g ∈ G, A ⊆ G.
2. For every U ∈ PF(G) and ε > 0 there exists K ∈ PF(G) such that
$|UK \setminus K| < \varepsilon |K|$. (2)
3. There exists a net $\{X_i\}_{i \in I}$ of finite nonempty subsets of G such that, for every U ∈ PF(G),
$\lim_{i \in I} |U X_i \setminus X_i| / |X_i| = 0$. (3)
Similar definitions want µ right-invariant and (2) replaced by |KU \ K| < ε|K| (and similarly for (3)), or µ both left- and right-invariant and the set-theoretic differences in (2) and (3) replaced by symmetric differences (recall that A △ B = (A \ B) ∪ (B \ A)): in fact, all these definitions are equivalent. Also, a group is amenable if and only if all of its finitely generated subgroups are amenable.
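For a concrete feel of the second condition, the following sketch computes the ratio |UK \ K| / |K| in $\mathbb{Z}^2$ when U is the set of the four unit vectors and K is an n × n square. The choice of U and K and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction


def boundary_ratio(U, K, multiply):
    """Return |UK minus K| / |K| for finite subsets U and K of a group."""
    UK = {multiply(u, k) for u in U for k in K}
    return Fraction(len(UK - K), len(K))


def z2_add(g, h):
    """Group operation of Z^2, written additively."""
    return (g[0] + h[0], g[1] + h[1])


def square(n):
    """The n x n square {0, ..., n-1}^2 as a subset of Z^2."""
    return {(x, y) for x in range(n) for y in range(n)}


U = {(1, 0), (-1, 0), (0, 1), (0, -1)}
for n in (10, 100):
    print(n, boundary_ratio(U, square(n), z2_add))
```

Here the ratio equals 4/n, so for any ε > 0 a large enough square satisfies the inequality for this particular U.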
Proposition 6 (Tarski alternative; cf. [6, Theorem 4.9.2]) A group is paradoxical if and only if it is not amenable.
A bounded-propagation two-to-one compressing map over a group G is a map φ : G → G such that, for some finite propagation set S ⊆ G, $\varphi(g)^{-1} g \in S$ and $|\varphi^{-1}(g)| = 2$ for every g ∈ G. In particular, such a map must be surjective, and |S| ≥ 2. By [6, Theorem 4.9.2], a group has a bounded-propagation two-to-one compressing map if and only if it is paradoxical.
Example 7 Let $G = F_2$ be the free group on two generators a, b; for g ∈ G let $w = w(g) = w_1 w_2 \cdots w_m$ be the unique reduced word on $\{a, b, a^{-1}, b^{-1}\}$ that represents g. Define:
Then φ is a bounded-propagation two-to-one compressing map with S = {1, a, b}.

Cellular automata
A cellular automaton (briefly, CA) on a group G is a triple $A = \langle Q, N, f \rangle$ where the alphabet Q is a finite set, the neighborhood N ⊆ G is finite and nonempty, and $f : Q^N \to Q$ is a local function. This, in turn, induces a global function $F_A$ on the space $Q^G$ of configurations, defined by
$F_A(c)(g) = f(c^g|_N)$ for every $g \in G$. (4)
Hedlund's theorem [6, Theorem 1.8.1] states that global functions of CA are exactly those functions from $Q^G$ to itself that commute with translations and are continuous in the prodiscrete topology, i.e., the product topology where Q is considered as a discrete space. A base for this topology is given by the cylinders of the form $C(E, p) = \{c : G \to Q \mid c|_E = p\}$, with E a finite subset of G and p : E → Q a pattern: observe that, for countable groups, this base is countable. Also, the elementary cylinders $C(g, q) = \{c : G \to Q \mid c(g) = q\}$ with g ∈ G and q ∈ Q form a subbase. The set E is called the shape of the pattern p. Through (4) we also consider, for every finite E ⊆ G, a function $f : Q^{EN} \to Q^E$. As a consequence of Hedlund's theorem, CA behave well with respect to periodic configurations.
An occurrence of a pattern p : E → Q in a configuration c is a point g ∈ G such that $c^g|_E = p$: in other words, $c|_{gE}$ is the pattern $p_g : gE \to Q$ defined by $p_g(gz) = p(z)$, which is a copy of p. We indicate as occ(p, c) the set of occurrences of the pattern p in the configuration c.
The following can be seen as a folklore consequence of the compactness of the configuration space.

.1]) A cellular automaton with finite alphabet has a GOE configuration if and only if it has an orphan pattern.
A configuration is rich (or shift-transitive) if it contains occurrences of every pattern.The orphan pattern principle can then be restated as follows: a CA is surjective if and only if it sends rich configurations into rich configurations.
Two configurations are asymptotic if they differ on at most finitely many points; a CA is pre-injective if distinct asymptotic configurations have distinct images.
Proposition 12 (Ceccherini-Silberstein, Machì and Scarabotti [7]) Let G be an amenable group and let A be a CA on G. Then A is surjective if and only if it is pre-injective.

Measures
we say that µ-almost every point satisfies a property P if the set of the points which do not satisfy P is µ-null. We say that F preserves µ if Fµ = µ. If $\Sigma_C$ is the σ-algebra generated by the cylinders, by the Carathéodory extension theorem and the Hahn-Kolmogorov theorem a probability measure on $\Sigma_C$ is completely determined by its value on the cylinders. The measure µ is called the uniform product measure. Observe that $\Sigma_C$ coincides with the Borel σ-algebra generated by the open sets if and only if G is countable. Also observe that CA global functions are both Borel measurable and $\Sigma_C$-measurable.
Proposition 14 (Bartholdi's theorem [1]) Let G be a group. The following are equivalent.
1. G is amenable.
2. Every surjective CA on G is pre-injective.
3. Every surjective CA on G preserves the uniform product measure $\mu_\Pi$.
Let µ be a probability measure over $Q^G$. We say that F :
Proposition 15 (Poincaré recurrence theorem; cf. [12, Theorem 4.1.19]) Let (X, Σ, µ) be a probability space and let F : X → X be a measurable function. If F preserves µ, then F is µ-recurrent.

Balancedness
Let $A = \langle Q, N, f \rangle$ be a CA on $\mathbb{Z}^d$ such that $N = \{-r, \ldots, r\}^d$. According to Maruoka and Kimura [18], A is n-balanced if each pattern on a hypercube of side n has $|Q|^{(n+2r)^d - n^d}$ pre-images. The authors then prove that such A is surjective if and only if it is n-balanced for every n. On the other hand, the majority CA on $\{0, 1\}^{\mathbb{Z}}$ such that f(c(−1), c(0), c(1)) = 0 if and only if at most one of the arguments is 1, is 1-balanced but not 2-balanced, as 0011 and 0101 are easily checked to be the only two preimages of 01, whereas 2-balancedness would require four. It also has the Garden-of-Eden pattern 01001. Such a GOE pattern is also of minimal length: for example, 0100 has the preimage 010100.
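The claims about the majority CA can be checked by brute force. The sketch below enumerates, for a pattern of length n, all words of length n + 2 that the radius-1 majority rule maps onto it; the helper names are hypothetical and the exhaustive search is only meant for short patterns.

```python
from itertools import product


def majority(a, b, c):
    """Local rule: 0 if and only if at most one argument is 1."""
    return 1 if a + b + c >= 2 else 0


def image(word):
    """Image of a finite word; the result is two symbols shorter."""
    return tuple(majority(*word[i:i + 3]) for i in range(len(word) - 2))


def preimages(pattern):
    """All words of length len(pattern) + 2 mapped onto the pattern."""
    target = tuple(pattern)
    return [w for w in product((0, 1), repeat=len(target) + 2)
            if image(w) == target]


def shortest_orphans(max_len):
    """Orphan patterns of the smallest length not exceeding max_len."""
    for n in range(1, max_len + 1):
        orphans = [p for p in product((0, 1), repeat=n) if not preimages(p)]
        if orphans:
            return orphans
    return []


print(len(preimages((0,))), len(preimages((1,))))  # 1-balancedness
print(preimages((0, 1)))                           # only two preimages
print(preimages((0, 1, 0, 0, 1)))                  # empty: an orphan
print(shortest_orphans(5))
```

The search confirms that each one-letter pattern has four preimages, that 01 has exactly the two listed above, and that no pattern of length at most 4 is an orphan.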
The balancedness condition is the same as saying that each pattern on a given shape has the same number of pre-images: to see how, just "patch" arbitrary shapes to "fill" a hypercube. This allows us to extend the definition to CA over arbitrary groups.
Definition 16 Let G be a group and let $A = \langle Q, N, f \rangle$ be a CA on G. A is balanced if for every finite nonempty E ⊆ G, every pattern p : E → Q has the same number of preimages:
$|f^{-1}(p)| = |Q|^{|EN| - |E|}$. (5)
The neighborhood N seems to have a crucial role in Definition 16, which may make the reader suspect it to be ill-posed. However, as we will see in a moment, what looks like a property of the presentation is actually a property of the dynamics: balancedness of A only depends on its global function $F_A$, not on the choice of the neighborhood N or the local function f, provided $F_A$ remains the same.

Proposition 17 A cellular automaton is balanced if and only if it preserves the uniform product measure.
Proof: The argument is similar to the one used in [3] for
As
Since the r.h.s. in (5) is always positive, no pattern is an orphan for a balanced CA. In [7], two CA on the free group on two generators are shown, one being surjective but not pre-injective, the other pre-injective but not surjective: both have an unbalanced local function. Therefore, balancedness in general groups is strictly stronger than surjectivity, and possibly uncorrelated with pre-injectivity. Balancedness allows us to generalize [3, Point 1 of Theorem 4.4] to finitely generated amenable groups.
Lemma 19 (Step 1 in proof of [7, Theorem 3]) Let G be a finitely generated amenable group, q ≥ 2, and n > r > 0. For $L = D_n$ there exist m > 0 and B ⊆ G such that B contains m disjoint copies of L and
$(q^{|L|} - 1)^m \, q^{|B| - m|L|} < q^{|B| - |\partial_r B|}$.
Proposition 20 Let G be a finitely generated amenable group and let $A = \langle Q, D_r, f \rangle$, r > 0, be a CA on G. If c is not rich then $F_A(c)$ is not rich.
Proof: Suppose there is a pattern with support $L = D_n$, n > r, that does not occur in c. Choose m and B according to Lemma 19. By hypothesis, the number of patterns with support B which occur in c is at most $(q^{|L|} - 1)^m q^{|B| - m|L|}$, with q = |Q|; therefore, the number of patterns with support $B \setminus \partial_r B$ which occur in $F_A(c)$ cannot exceed this number either. By Lemma 19, this is strictly less than $q^{|B| - |\partial_r B|}$, which is the total number of patterns with support $B \setminus \partial_r B$: hence, some of those patterns do not occur in $F_A(c)$. ✷
The last statement of this section is a strengthening of a result by Lawton [14] (also stated in [24, Theorem 1.3]) which states that injective CA on residually finite groups are surjective.
Theorem 21 Let G be a residually finite group and $A = \langle Q, N, f \rangle$ an injective CA over G. Then A is balanced.
Proof: Let E be a finite subset of G: it is not restrictive to suppose $1 \in E \cap N$, so that E, N ⊆ EN. Suppose, for the sake of contradiction, that p :
The r.h.s. in (7) is the number of H-periodic configurations that coincide with p on E. Since A is injective and G is r.f., by [14], A is reversible, and by Lemma 8, $F_A$ sends H-periodic configurations into H-periodic configurations. But because of (7) and the pigeonhole principle, there must exist two H-periodic configurations with the same image according to $F_A$, which contradicts injectivity of A. ✷

The nonwandering property
Bartholdi's theorem can be expanded by adding more properties that are satisfied by every surjective CA if and only if the underlying group is amenable.
Remark 23 If A is µ-recurrent for some probability measure µ with full support (i.e., no nonempty open set is µ-null), then A is nonwandering.
Observe that, for the latter to hold, it is not necessary that every open set be measurable: it is sufficient that every open set contains a measurable open set of positive measure, which is the case for $\mu_\Pi$. We say that a state $q_0 \in Q$ is spreading for $A = \langle Q, N, f \rangle$ if for every $u \in Q^N$ such that $u_i = q_0$ for some i ∈ N we have $f(u) = q_0$.
Lemma 24 A nonwandering nontrivial CA has no spreading state.
Proof: Suppose that the nontrivial CA $A = \langle Q, N, f \rangle$ has a spreading state $q_0$. Let $U = C(N \cup \{1_G\}, p)$ where
It follows from the definitions and the orphan pattern principle that if a CA is nonwandering (or $\mu_\Pi$-recurrent), then it is surjective. The next two statements are immediate consequences of the Poincaré recurrence theorem.
Proposition 25 Let G be a group and let A be a CA on G.
We might ask what the role of amenability in Corollary 26 is. The following counterexample shows that surjective CA on paradoxical groups may fail to be nonwandering.
Example 27 Let G be a non-amenable group, φ a bounded-propagation two-to-one compressing map with propagation set S, ≺ a total ordering of S, and $Q = (S \times \{0, 1\} \times S) \sqcup \{q_0\}$, where $q_0 \notin S \times \{0, 1\} \times S$. Let $A = \langle Q, S, f \rangle$ with:
Clearly, such a CA cannot be nonwandering, as it is nontrivial and has the spreading state $q_0$. In particular, it is neither $\mu_\Pi$-recurrent nor balanced.
Proposition 28 The cellular automaton A from Example 27 is surjective.
then i = js for some s ∈ S, and there exists a unique t ∈ S \ {s} such that φ(jt) = j. If $x_j = q_0$, then set $y_i = (s, 0, s)$: otherwise, we can write $x_j = (p, \alpha, q)$. If s ≺ t, then set $y_i = (s, \alpha, p)$; otherwise set $y_i = (s, 1, q)$. This definition has the property that for every i ∈ G,
Let us prove that the configuration y is a preimage of x by the global map of the CA. Let j ∈ G and s, t ∈ S such that s ≺ t, $y_{js} \in \{s\} \times \{0, 1\} \times S$, and $y_{jt} \in \{t\} \times \{0, 1\} \times S$. Then $s = \varphi(js)^{-1} js$ and $t = \varphi(jt)^{-1} jt$, and φ(js) = φ(jt) = j: hence, there exists exactly one such pair (s, t). If $x_j = q_0$, then the definition of y gives $y_{jt} = (t, 0, t)$, and f will apply its third subrule. If $x_j$ is written (p, α, q), then $y_{js} = (s, \alpha, p)$ and $y_{jt} = (t, 1, q)$, and f will apply its second subrule. ✷
Observe that, in the proof of Proposition 28, for every configuration we construct a preimage which does not contain the state $q_0$. This, together with Proposition 20, leads us to the following.
Remark 29 A finitely generated group G is paradoxical if and only if there exists a CA on G which takes a nonrich configuration into a rich one.

Normal configurations
The results from Sections 3 and 4, together with the existing literature, show that Theorem 1 is true. We now move on to Theorem 2 and search for a suitable set $U \subseteq Q^G$ of full measure with a null preimage. To construct such a set, we introduce the concept of normal configuration, according to some parameters: normality shall thus be a quantitative concept, more precise than the nonwandering property which is a qualitative one.
Our definition is based on the one for normal infinite words. Let $U \subset \mathbb{N}$, and denote
The lower density, upper density and density of U are, respectively, the liminf, limsup and limit, when it exists, of P(U|n) when n goes to infinity. Given an infinite word w, an occurrence in w of a finite word u is a position i ≥ 0 such that $w_{[i:i+|u|-1]} = u$. Call occ(u, w) the set of occurrences of u in w. An infinite word w on the alphabet Q is said to be m-normal, $m \in \mathbb{N}$, if for every
The notion of m-normality admits a characterization which will be helpful in the next section.
Theorem 30 (Niven and Zuckerman; cf [21]) Let m ≥ 1 and let Q be a finite set.An infinite word over Q is m-normal if and only if it is 1-normal when considered as a word over Q m .
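To make occurrence sets and their densities tangible, here is a small sketch on a finite prefix of a word. It assumes, as a labeled guess rather than the paper's definition, that the density of a set of positions up to n is the fraction of {0, ..., n − 1} it contains; the function names and the sample word are hypothetical.

```python
from fractions import Fraction


def occurrences(u, w):
    """Positions i such that the finite word u appears in w starting at i."""
    return [i for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u]


def empirical_density(positions, n):
    """Assumed normalization: share of {0, ..., n-1} lying in `positions`."""
    return Fraction(sum(1 for i in positions if i < n), n)


w = "0110100110010110"  # a 16-letter sample prefix
occ = occurrences("01", w)
print(occ, empirical_density(occ, len(w)))
```

On a finite prefix this is only an approximation of the limiting quantities, since normality is a statement about the behavior as n goes to infinity.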
Let now $h : \mathbb{N} \to G$ be an injective function. For U ⊆ G we define the lower density $\mathrm{dens\,inf}_h\, U$, upper density $\mathrm{dens\,sup}_h\, U$, and density (if it exists) $\mathrm{dens}_h\, U$ as those of the preimage $h^{-1}(U)$.
Note that we do not require that h be bijective. The reason for this is that the structure of general groups is usually not as convenient as that of $\mathbb{Z}^d$, and it is not always possible to subdivide a group into "nicely shaped blocks" (such as the hypercubes of $\mathbb{Z}^d$) and see a configuration as a "coarser-grained" configuration on the same group. We will discuss this in further detail later on in this section.
This definition passes a basic "sanity check": normality on larger sets ensures normality on smaller sets.
Proof: Let p : E → Q. Every z ∈ G which is an occurrence of p in c is also an occurrence of exactly one of the $|Q|^{|F \setminus E|}$ patterns p′ : F → Q that extend p; vice versa, if $p'|_E = p$, then each occurrence of p′ is also an occurrence of p. Hence, whatever n is,
As
The converse of Lemma 32 does not hold: being h-E-normal for every proper subset E of F does not imply being h-F-normal.
Our aim is to prove that, at least under certain conditions on h and E, µ Π -almost all configurations are h-E-normal: to do this, we need criteria for h-E-normality.A basic test is provided by

For every
Proof: Clearly, Point 1 implies Points 2 and 3, and Points 2 and 3 together imply Point 1. We then only have to prove that Points 2 and 3 are equivalent: this will be easy once we observe that, for every n > 0,
which expresses the obvious fact that every point is an occurrence of some pattern with support E. Suppose that for some p :
for infinitely many values of n: because of (9), for those values we also have
Therefore, for all such values of n, there must be some p :
: since the n's are infinitely many and the p's are finitely many, there must be at least one p :
The converse implication is proven similarly. ✷
Lemma 34 has an immediate consequence, which will have great importance later.
Lemma 35 Let $A = \langle Q, N, f \rangle$ be a nontrivial CA on G with a spreading state $q_0$ and s, t two distinct elements of N. For every injective function h
Proof: Since $q_0$ is spreading, $\mathrm{occ}(q_0, F_A(c))$ contains occ(p, c) for every {s, t}-pattern p such that $p(s) = q_0$ or $p(t) = q_0$. Given c's h-{s, t}-normality, each of these 2|Q| − 1 patterns has density $1/|Q|^2$: therefore,
is the set of all the configurations $c \in Q^G$ that are not h-E-normal. If each $L_{h,p,k}$ has measure 0, then, as the p's are finitely many and the k's are countably many, so does $L_{h,E}$, and almost all configurations are h-E-normal. This, in the classical case of infinite words over a |Q|-ary alphabet, is achieved for E = {0, . . ., r − 1}, via estimates such as the following.
Proposition 36 (Chernoff bound [8]; cf. [2, Lemma 6.56]) Let $Y_0, \ldots, Y_{n-1}$ be independent nonnegative random variables; let $S_n = Y_0 + \ldots + Y_{n-1}$, and let µ = µ(n) be the average of $S_n$. For every δ ∈ (0, 1),
In particular, if the $Y_i$'s are Bernoulli trials with probability p, and 0 < ε < min(p, 1 − p), then for
The Chernoff bound, together with the Borel-Cantelli lemma, allows one to prove that the set of non-normal infinite words has product measure zero. However, one of the reasons why we can express m-normality of sequences as 1-normality of other sequences is that the interval {0, . . ., m − 1} is a coset of a submonoid of $\mathbb{N}$ isomorphic to $\mathbb{N}$: as any subgroup of finite index of $\mathbb{Z}^d$ is isomorphic to $\mathbb{Z}^d$, it is possible to adapt the classical argument for infinite words so that it works for d-dimensional configurations. But a subgroup of index 2 of the free group on two generators is free on three generators (cf. [16, Theorem 2.10]) and thus is not isomorphic to it; therefore, if we just mimic the classical argument and consider patterns with support a coset of a subgroup, we need in general to change the underlying group! Otherwise, when estimating the number of occurrences of a pattern, we have to deal with non-independent events, and cannot (straightforwardly) apply the Chernoff bound.
This is the key reason we mentioned earlier for our hypothesis that h may be non-surjective. In fact, if the E-shaped neighborhoods of the points of $h(\mathbb{N})$ are pairwise disjoint, then the events of the form "h(i) is an occurrence of p" for p : E → Q are independent, and we can apply the Chernoff bound while keeping the same underlying group.

Lemma 37 Let E be a finite subset of G and let
Proof: As the sets h(i)E, i ≥ 0, are pairwise disjoint, the Boolean random variables $Y_i$ which take value 1 if and only if $c^{h(i)}|_E = p$ are independent and identically distributed according to a Bernoulli distribution of parameter t
decreases exponentially in n for fixed k and p. By the Borel-Cantelli lemma, as this holds for each of the countably many pairs (p, k) with p : E → Q and k ≥ 1, the thesis follows. ✷
Observe that there is no need for the group G to be countable.
Proof of Theorem 2: Choose S, Q, and A as in Example 27; let $h : \mathbb{N} \to G$ be a function such that $h(n)S \cap h(m)S = \emptyset$ for n ≠ m. Let U be the set of h-1-normal configurations: the hypotheses of Lemma 37 are also satisfied for $E = \{1_G\}$, which means that $\mu_\Pi(U) = 1$. By Lemma 35, the images via $F_A$ of h-S-normal configurations are not h-1-normal: thus, the preimages of the elements of U must belong to $L_{h,S}$, which is a $\mu_\Pi$-null set by Lemma 37. ✷
Observe that Theorem 2 holds precisely because on non-amenable groups there are surjective CA which are not balanced.
Proposition 38 Let G be an infinite group and let A = Q, N , f be a CA on G.The following are equivalent.
1.A is balanced.

For every injective function
An arbitrary g ∈ G is an occurrence of p in $F_A(c)$ if and only if it is an occurrence in c of one of the patterns p′ : EN → Q such that f(p′) = p. Consequently, for every $n \in \mathbb{N}$,
If A is balanced and c is h-EN-normal, then the right-hand side of (14) is a sum of
Suppose then that A is not balanced. Then there exist E ∈ PF(G) and p :
0 by Lemma 37, so there exists an h-EN-normal configuration c. For such c, the right-hand side of (14) is a sum of more than

Martin-Löf random configurations
We are now left with the task of proving Theorem 3. To do this, we need to define Martin-Löf randomness for configurations. This requires some hypotheses on the underlying group: we must be able not only to enumerate its elements, but also to do it in a computable way.
Let G be a group, S a set of generators for G, and R a set of words on $S \cup S^{-1}$. We say that $\langle S, R \rangle$ is a presentation of G, and write $G = \langle S, R \rangle$, if G is isomorphic to the quotient $G_S / K_R$, where $G_S$ is the free group on S (consisting of reduced words on $S \cup S^{-1}$) and $K_R$ is the normal subgroup of $G_S$ generated by R, i.e., the intersection of all normal subgroups of $G_S$ that contain the elements identified by the words in R. The word problem (briefly, w.p.) for the group $G = \langle S, R \rangle$ is the set of words on $S \cup S^{-1}$ that represent the identity element of G. Although this set depends on the choice of the presentation, its decidability does not; and although the problem is undecidable in general, even for finitely generated groups, it is decidable for the Euclidean groups $\mathbb{Z}^d$, the free groups, Gromov's hyperbolic groups [11] which generalize free groups, and many more.
An indexing of a countable group G is a bijection $\phi : \mathbb{N} \to G$; we often write $G = \{g_i \mid i \ge 0\}$ to mean $g_i = \phi(i)$. An indexing is admissible if there exists a computable function $m : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ such that $g_i \cdot g_j = g_{m(i,j)}$ for every $i, j \in \mathbb{N}$. In this case, there is also a computable function $\iota : \mathbb{N} \to \mathbb{N}$ such that $g_i^{-1} = g_{\iota(i)}$ for every $i \in \mathbb{N}$.
Proposition 40 (Rabin, 1960; cf. [23, Theorem 4]) A finitely generated group has an admissible indexing if and only if it has decidable word problem.
Proof: Assume $G = \{g_i \mid i \ge 0\}$ is an admissible indexing, i.e., $g_i \cdot g_j = g_{m(i,j)}$ for every $i, j \in \mathbb{N}$ and m is computable. Let S be a set of generators for G, and $u = u_1 \ldots u_\ell$ a word over $S \cup S^{-1}$; say $u_r = g_{i_r}$ for every r ∈ {1, . . ., ℓ}. We can decide whether u and $1_G$ identify the same element of G by inductively computing the sequence $(a_r)$ with $a_1 = i_1$ and $a_r = m(a_{r-1}, i_r)$ for r = 2, . . ., ℓ; $u = 1_G$ if and only if $a_\ell$ is the (unique) index representing $1_G$.
Suppose now that G has decidable word problem. Let S be a finite set of generators for G: define an ordering on $S \cup S^{-1}$. A computable bijection $\phi : \mathbb{N} \to G$ can be obtained by enumerating, in lexicographic order, first $D_0 = \{1_G\}$, then $D_1 \setminus D_0 = S \cup S^{-1}$, then $D_2 \setminus D_1$, and so on. Moreover, the function $m : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ given by $m(i, j) = \phi^{-1}(w)$, where w is a word on $S \cup S^{-1}$ representing $g_i \cdot g_j$, is computable. ✷
Throughout this section, G will be an infinite, finitely generated group with decidable word problem, and $\phi : \mathbb{N} \to G$ an admissible indexing: we write $G = \{g_i\}_{i \ge 0}$ to mean $g_i = \phi(i)$.
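The enumeration by increasing word length used in the proof of Proposition 40 can be sketched as follows. This is a minimal illustration, not the construction of the paper: it assumes an infinite group whose elements are given in a normal form with decidable equality (here $\mathbb{Z}^2$ as integer pairs), and it orders each sphere by sorting normal forms rather than by the lexicographic order on words. The class `Indexing` and its methods are hypothetical names.

```python
class Indexing:
    """Lazy indexing of an infinite group by increasing word length."""

    def __init__(self, generators, multiply, identity):
        self._gens = list(generators)
        self._mul = multiply
        self._elems = [identity]        # the list i -> g_i built so far
        self._index = {identity: 0}     # the inverse map g_i -> i
        self._frontier = [identity]     # the outermost sphere built so far

    def _grow(self):
        """Append the next sphere, in sorted order of normal forms."""
        candidates = {self._mul(g, s) for g in self._frontier for s in self._gens}
        new = sorted(h for h in candidates if h not in self._index)
        for h in new:
            self._index[h] = len(self._elems)
            self._elems.append(h)
        self._frontier = new

    def phi(self, i):
        """The i-th group element."""
        while i >= len(self._elems):
            self._grow()
        return self._elems[i]

    def index(self, g):
        """The index of a group element given in normal form."""
        while g not in self._index:
            self._grow()
        return self._index[g]

    def m(self, i, j):
        """Index of the product g_i * g_j."""
        return self.index(self._mul(self.phi(i), self.phi(j)))


def z2_add(g, h):
    return (g[0] + h[0], g[1] + h[1])


ix = Indexing([(1, 0), (-1, 0), (0, 1), (0, -1)], z2_add, (0, 0))
print(ix.phi(4), ix.m(4, 4), ix.m(1, 4))
```

Both `phi` and `m` terminate on every input because each element lies in some finite disk, which is the point of enumerating sphere by sphere.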
We recall the definition of Martin-Löf randomness for infinite words (cf. [17] and [2, Sections 5.4 and 6.2]). A sequential Martin-Löf test (briefly, M-L test) is a recursively enumerable set $U \subseteq \mathbb{N} \times Q^*$ such that the level sets $U_n = \{x \in Q^* \mid (n, x) \in U\}$ satisfy the following conditions:
3. For every n ≥ 1 and $x, y \in Q^*$, if $x \in U_n$ and $y \in xQ^*$ then $y \in U_n$.
An infinite word w fails a sequential M-L test U if $w \in \bigcap_{n \ge 0} U_n Q^{\mathbb{N}}$; the word w is Martin-Löf random if w does not fail any sequential M-L test. Observe that, according to this definition, if $\eta : \mathbb{N} \to \mathbb{N}$ is a computable bijection, then w is M-L random if and only if w ∘ η is M-L random. It is well known (cf. [17] and [2, Theorem 6.61]) that M-L random words are normal.
Thanks to an approach by Hertling and Weihrauch, it is possible to define Martin-Löf randomness of infinite words in a way that allows one to introduce the concept in the more general context of configurations. The prodiscrete topology and product measure on $Q^{\mathbb{N}}$ are defined similarly as on $Q^G$. Given two sequences $U = \{U_i\}_{i \ge 0}$, $V = \{V_j\}_{j \ge 0}$ of open subsets of $Q^{\mathbb{N}}$, we say that U is V-computable if there is a recursively enumerable set $A \subseteq \mathbb{N}$ such that
$U_i = \bigcup_{\pi(i,j) \in A} V_j$ for every i ≥ 0, (15)
where $\pi(i, j) = (i + j)(i + j + 1)/2 + j$ is the standard primitive recursive bijection from $\mathbb{N} \times \mathbb{N}$ to $\mathbb{N}$. Given an ordering $Q = \{q_0, \ldots, q_{|Q|-1}\}$, let $B_i = w_i Q^{\mathbb{N}}$ where $w_i$ is the i-th element of $Q^*$ in the length-lexicographic order (i.e., $w_0$ is the empty word, $w_1 = q_0$, . . ., $w_{|Q|} = q_{|Q|-1}$, $w_{|Q|+1} = q_0 q_0$, $w_{|Q|+2} = q_0 q_1$, and so on) and let $B'_i = \bigcap_{j \in E(i+1)} B_j$, where n ∈ E(i) if and only if the n-th bit in the binary expansion of i is 1: then B′ is an enumeration of a base of the prodiscrete topology of $Q^{\mathbb{N}}$. Observe that the property "$w \in B'_j$" only depends on a prefix u of w which can be computed from j.
Proposition 41 (Hertling and Weihrauch; cf. [2, Theorem 6.99]) Let $w : \mathbb{N} \to Q$ be an infinite word. The following are equivalent.

For every B
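The encodings introduced before Proposition 41 are all elementary and can be made concrete. The following is a minimal sketch of the pairing $\pi$, the length-lexicographic enumeration $w_i$, the bit sets $E(i)$, and the prefix that decides membership in $B'_i$. The function names are illustrative, and counting bits from the least significant one (bit 0) is an assumption, since the text does not fix the convention.

```python
def pair(i, j):
    """Pairing pi(i, j) = (i + j)(i + j + 1)/2 + j."""
    return (i + j) * (i + j + 1) // 2 + j


def unpair(n):
    """Inverse of pair: find the diagonal s = i + j, then read off j."""
    s = 0
    while (s + 1) * (s + 2) // 2 <= n:
        s += 1
    j = n - s * (s + 1) // 2
    return s - j, j


def word(i, Q):
    """The i-th word w_i of Q* in length-lexicographic order (w_0 is the empty word)."""
    q = len(Q)
    length = 0
    while i >= q ** length:  # skip all words of the shorter lengths
        i -= q ** length
        length += 1
    letters = []
    for _ in range(length):  # write the remaining rank in base |Q|
        i, r = divmod(i, q)
        letters.append(Q[r])
    return "".join(reversed(letters))


def bits(i):
    """E(i): positions of the 1 bits of i (assumption: bit 0 is the least significant)."""
    return {n for n in range(i.bit_length()) if (i >> n) & 1}


def base_prefix(i, Q):
    """Prefix u with B'_i = u Q^N, or None when the intersection B'_i is empty."""
    words = sorted((word(j, Q) for j in bits(i + 1)), key=len)
    longest = words[-1]
    if all(longest.startswith(w) for w in words):
        return longest
    return None
```

For instance, with `Q = "ab"`, `base_prefix(4, Q)` intersects the cylinders of $w_0$ (the empty word) and $w_2$, so a word belongs to $B'_4$ exactly when it starts with `b`.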
We can now define Martin-Löf randomness for configurations, in analogy with the previous formalism. Given an ordering $Q = \{q_0, \ldots, q_{|Q|-1}\}$, we define a computable bijective enumeration B of the elementary cylinders as $B_{|Q|i+j} = C(g_i, q_j)$. To enumerate the cylinders, we define a computable bijection $\Psi : \mathrm{PF}(\mathbb{N}) \to \mathbb{N}$ as $\Psi(E) = \sum_{n \in E} 2^n$ (so that $\Psi(\emptyset) = 0$) and set $B'_i = \bigcap_{j \in \Psi^{-1}(i)} B_j$. Observe that the property "$c \in B'_j$" only depends on the values of c on a finite subset which can be computed from j. If U and V are families of open subsets of $Q^G$, we say that U is V-computable if there exists an r.e. set A such that (15) holds.
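The bijection $\Psi$ and the enumeration of cylinders can be sketched in a few lines. This is only an illustration under stated assumptions: a configuration is represented by a hypothetical function `c` sending the index i to the state $c(g_i)$, and the function names are not taken from the paper.

```python
def psi(E):
    """Psi(E) = sum of 2^n over n in E; the empty set maps to 0."""
    return sum(1 << n for n in E)


def psi_inv(i):
    """The finite set whose code is i."""
    return frozenset(n for n in range(i.bit_length()) if (i >> n) & 1)


def elementary(n, Q):
    """Decode B_n = C(g_i, q_j) with n = |Q| * i + j; returns (i, q_j)."""
    i, j = divmod(n, len(Q))
    return i, Q[j]


def in_cylinder(c, i, Q):
    """Decide whether c lies in B'_i, the intersection of B_n for n in psi_inv(i).

    Only finitely many values of c are inspected, and they are computed from i.
    """
    for n in psi_inv(i):
        index, state = elementary(n, Q)
        if c(index) != state:
            return False
    return True
```

Note that `in_cylinder` returns `True` for `i = 0`, which corresponds to the empty intersection, i.e., the whole space.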
Definition 42 Let G be a f.g. group with decidable word problem; let $\Sigma_C \subseteq Q^G$ be the σ-algebra generated by the cylinders (i.e., as G is countable, the Borel σ-algebra) and let $\mu$ :

As the set of configurations failing a given M-L $\mu_\Pi$-test is $\mu_\Pi$-null, and the family of M-L $\mu_\Pi$-tests is countable, the set of M-L $\mu_\Pi$-random configurations has full measure.
The next statement has been used by Calude et al. ([3]; cf. [2, Section 9.5]) for CA on $\mathbb{Z}^d$ and one specific admissible indexing. We need it in our more general context; the proof, however, is similar.
Lemma 43 Let φ : N → G be an admissible indexing.

The function
Corollary 44 (cf. [2, Theorem 9.10]) Let $\varphi : \mathbb{N} \to G$ be an admissible indexing. Then c :

As a consequence of Corollary 44, the definition of Martin-Löf $\mu_\Pi$-randomness does not depend on the choice of the admissible indexing. In fact, if $\varphi, \psi : \mathbb{N} \to G$ are admissible indexings, then

Given a pattern p, the set of configurations where p has no occurrence is an intersection of a countably infinite, computable family of cylinders $U_i$ having equal product measure $\mu_\Pi(U_i) = m < 1$. It is then straightforward to construct an M-L $\mu_\Pi$-test that every such configuration fails. We thus have:

Remark 45 Every $\mu_\Pi$-random configuration is rich.
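One way to make the "straightforward" construction quantitative is the following sketch, which is an illustration and not the paper's argument. It assumes that the translates of the support of p are chosen pairwise disjoint, so that the events "p does not occur at the i-th translate" are independent under the product measure, each with probability $m = 1 - |Q|^{-|E|}$, where $|E|$ is the size of the support. Level n of a test can then use the first t translates, with t the least integer such that $m^t \le 2^{-n}$. The function name and parameters are hypothetical.

```python
from fractions import Fraction


def translates_needed(alphabet_size, support_size, level):
    """Least t with (1 - |Q|^(-|E|))^t <= 2^(-level), computed exactly."""
    m = 1 - Fraction(1, alphabet_size ** support_size)  # measure of "p is absent here"
    bound = Fraction(1, 2 ** level)
    t, measure = 0, Fraction(1)
    while measure > bound:
        measure *= m  # independence holds because the translates are disjoint
        t += 1
    return t
```

Since t is computable from the level, the resulting family of finite unions of cylinders is uniformly computable, which is what a test requires.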
We can now prove Theorem 3. Let us start with the "only if" direction.
As A is a CA, for every $j \in \mathbb{N}$ there exists $E_j \in \mathrm{PF}(\mathbb{N})$ such that $F_A^{-1}(B'_j) = \bigcup_{k \in E_j} B'_k$; moreover, the function $j \mapsto E_j$ is computable because G has decidable word problem. Then

Proof: Since $\mu_\Pi$-random configurations form a set of measure 1 and contain occurrences of every pattern, the first part is immediate. For the second part, if

Proof of Theorem 3, sufficiency of amenability: Suppose G is an amenable group. Let A be a surjective CA on G with alphabet Q: by Bartholdi's theorem, A preserves µ

To prove the "if" part of Theorem 3 (i), we resort to normal configurations; in doing this, we need a result which is of independent interest. We say that $a \in Q^{\mathbb{N}}$ is M-L random relative to $b \in Q^{\mathbb{N}}$ if it is M-L random when computability is considered according to Turing machines with oracle b.
Proposition 48 (van Lambalgen's theorem [13]; cf. [9, Corollary 6.9.3]) Let a and b be two infinite words over the alphabet Q and let c be the interleaving of a and b, i.e., $c(2n) = a(n)$ and $c(2n+1) = b(n)$ for every $n \ge 0$. The following are equivalent.
1. c is M-L random.
2. a is M-L random, and b is M-L random relative to a.
3. b is M-L random, and a is M-L random relative to b.
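The interleaving in Proposition 48 is a purely mechanical operation. As a minimal sketch on finite prefixes (randomness itself is a property of infinite words and cannot be checked this way), with illustrative function names:

```python
from itertools import chain


def interleave(a, b):
    """Prefix of c with c(2n) = a(n) and c(2n + 1) = b(n)."""
    if len(a) != len(b):
        raise ValueError("prefixes must have the same length")
    return "".join(chain.from_iterable(zip(a, b)))


def deinterleave(c):
    """Recover the even-indexed word a and the odd-indexed word b from c."""
    return c[0::2], c[1::2]
```

When `a == b`, every symbol of the interleaving appears twice in a row, which gives a concrete picture of why the relativized conditions cannot be dropped.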
The necessity of conditions 2 and 3 is clear: if, for example, a = b, then c is not M-L random.
Lemma 49 Let G be an infinite f.g. group with decidable word problem. For every nonempty $E \in \mathrm{PF}(G)$ there exists a computable injective function $h : \mathbb{N} \to G$ satisfying the following properties:

For any alphabet
Proof: Let $G = \{g_i\}_{i \ge 0}$ be an admissible indexing of G. Define an injective function $\iota : \mathbb{N} \to \mathbb{N}$ by putting $\iota(0) = 0$, and letting $\iota(n + 1)$ be the smallest k such that $g_{\iota(0)}E, \ldots, g_{\iota(n)}E, g_kE$ are pairwise disjoint: then $\iota$ is computable. If $E = \{e_0, \ldots, e_{k-1}\}$, then $\tilde{h}(kn + j) = g_{\iota(2n)} \cdot e_j$ and $h(n) = g_{\iota(2n)}$ are injective, computable, and satisfy point 1; in addition, h satisfies point 2. Taking every other value of $\iota$ ensures that the complement of the codomain is infinite: we will need this in the next step.
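The greedy choice of $\iota$ in the proof above is easy to run on a toy case. The following minimal sketch assumes $G = \mathbb{Z}$, written additively, with the indexing $0, 1, -1, 2, -2, \ldots$, so that the translate $g_k E$ becomes $g_k + E$. The names `g`, `iota_prefix`, `h` and `h_tilde` are illustrative; `h_tilde` stands for the map sending $kn + j$ to $g_{\iota(2n)} \cdot e_j$.

```python
def g(i):
    """Hypothetical indexing of Z: g_0 = 0, g_1 = 1, g_2 = -1, g_3 = 2, ..."""
    return (i + 1) // 2 if i % 2 else -(i // 2)


def iota_prefix(E, count):
    """First `count` values of iota: greedily pick pairwise disjoint translates of E."""
    chosen = [0]
    covered = {g(0) + e for e in E}
    while len(chosen) < count:
        k = 0
        while True:  # smallest k whose translate avoids everything covered so far
            translate = {g(k) + e for e in E}
            if covered.isdisjoint(translate):
                break
            k += 1
        chosen.append(k)
        covered |= translate
    return chosen


def h(n, E):
    """h(n) = g_{iota(2n)}: only every other translate is used."""
    return g(iota_prefix(E, 2 * n + 1)[2 * n])


def h_tilde(t, E):
    """Maps t = |E| * n + j to g_{iota(2n)} + e_j, with E given as an ordered list."""
    n, j = divmod(t, len(E))
    return h(n, E) + E[j]
```

With `E = [0, 1]` the greedy values of $\iota$ start 0, 3, 4, 7, 8, and the translates at $\iota(1)$ and $\iota(3)$ are skipped by `h`, which is why the points left uncovered form an infinite set.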
Let now $c : G \to Q$ be an M-L $\mu_\Pi$-random configuration. Then $v(i) = c(g_i)$ is an M-L random infinite word. As the codomain of the function $\tilde{h} : kn + j \mapsto g_{\iota(2n)} \cdot e_j$ is recursive and its complement is infinite, there exists a computable bijection $\pi : \mathbb{N} \to \mathbb{N}$ such that $g_{\pi(2m)} = \tilde{h}(m)$ for every $m \in \mathbb{N}$. Then $v \circ \pi$ is an M-L random infinite word: by van Lambalgen's theorem, $w(i) = (v \circ \pi)(2i)$ is M-L random, thus also k-normal. By Theorem 30, for every $u \in Q^k$,

$$\lim_{n \to \infty} \frac{|\mathrm{occ}(u, w) \cap \{0, k, \ldots, (n-1)k\}|}{n} = \frac{1}{|Q|^k},$$

which, as $w(kn + j) = c(\tilde{h}(kn + j)) = c(h(n) \cdot e_j)$, is the same as saying that c is h-E-normal. ✷

Proof of Theorem 3, necessity of amenability: Let G be a non-amenable f.g. group with decidable word problem. Define S, Q and A as in Example 27. Construct $h : \mathbb{N} \to G$ as in Lemma 49, with $E = S \cup \{1_G\}$. Let $c \in Q^G$: we will show that at most one of c and $F_A(c)$ is M-L $\mu_\Pi$-random. Suppose that c is indeed M-L $\mu_\Pi$-random. Then c is h-E-normal by the choice of h, thus h-S-normal by Lemma 32. By Lemma 35, $F_A(c)$ is not h-1-normal, and so cannot be M-L $\mu_\Pi$-random. ✷

the right-hand side has $|Q|^{|F|-|E|}$ summands, each converging to $|Q|^{-|F|}$ by hypothesis, the left-hand side converges to $|Q|^{-|E|}$. As p is arbitrary, c is E-normal. ✷

Observe that $L_{h,p,k,n}$ is a finite union of cylinders. By definition, $\operatorname{dens\,inf}_h \mathrm{occ}(p, c) < |Q|^{-|E|}$ if and only if there exists $k \ge 1$ such that $c \in L_{h,p,k,n}$ for infinitely many values of n, i.e., if $c \in \limsup_n L_{h,p,k,n} = \bigcap_{n \ge 1} \bigcup_{m \ge n} L_{h,p,k,m}$: this set, which we call $L_{h,p,k}$, belongs to the σ-algebra $\Sigma_C$ generated by the cylinders. Then

(i) The proof in [4] actually presented an error.

Conclusions
We have shown that several characterizations of surjective CA, which were known from [3] to hold on $\mathbb{Z}^d$, also hold in the more general case of amenable groups. In fact, each of them characterizes amenable groups, in the sense that it holds for CA on G if and only if G is amenable. This allows us to draw the graph of implications in Figure 1. In addition, we have determined the extent to which the balancedness theorem fails on non-amenable groups: as in this case there are sets of full measure with null preimages, such failure can rightly be called catastrophic. This is a remarkable result that sheds new light on the links between cellular automata theory and group theory.

There are, however, several questions left open. The most important of these is surely whether Myhill's theorem, too, holds only for CA on amenable groups, i.e., whether pre-injectivity implies surjectivity if and only if the underlying group is amenable. The question is open and presumably very difficult (cf. the discussion in [1]), also because, contrary to the other properties examined (which always imply surjectivity regardless of the properties of the underlying group), pre-injectivity appears to be independent of it, as follows from the counterexamples in [15] and [7]. Another open problem concerns the existence of an injective CA which is not balanced: a negative answer would solve Gottschalk's conjecture that injective CA over arbitrary groups are surjective. Further questions arise from Remark 29, such as which cellular automata are capable of taking a nonrich configuration into a rich one, and whether this is linked to balancedness. More generally, the relationships between all properties linked here to surjectivity are, in many cases, yet to be explored.

Figure 1: A diagram of implications between cellular automata properties. Full lines hold for every group; dotted lines hold for amenable groups; dashed lines hold for residually finite groups; wavy lines hold for finitely generated groups with decidable word problem. Starred implications are proven in the present paper. Implications with a question mark are conjectured. Transitivity and ergodicity have not been discussed here, but we include their implications since they are similarly conjectured equivalent for CA.