Directed figure codes are decidable

Two-dimensional structures of various kinds can be viewed as generalizations of words. Codicity verification and the defect effect, important properties related to word codes, are also studied in this context. Unfortunately, both are lost in the case of two common structures, polyominoes and figures. We consider directed figures defined as labelled polyominoes with designated start and end points, equipped with a catenation operation that uses a merging function to resolve possible conflicts. We prove that in this setting it is decidable whether a given finite set of directed figures is a code, and we give a constructive algorithm. We also clarify the status of the defect effect for directed figures.


Introduction
Codes, i.e., subsets $X$ of a monoid such that every product of the elements decomposes uniquely over $X$, are a common object of study. Standard variable-length word codes (subsets of a free monoid $\Sigma^*$, where $\Sigma$ is an alphabet) are the best-known among them (see e.g. the classic monograph Berstel and Perrin (1985)), with properties that have become folklore. This includes the defect effect and the decidability of codicity. Various authors have extended codes to other structures like trees in Mantaci and Restivo (2001); Karhumäki and Mantaci (1999) and two-dimensional structures of various kinds like polyominoes in Aigrain and Beauquier (1995); Beauquier and Nivat (2003). Whilst the extension to trees preserves the above properties, both are lost in the case of polyominoes. Similar problems arise when considering figures defined as labelled polyominoes. Only a limited version of the defect theorem holds in this case and codicity is decidable for very restricted classes of figures; see a comprehensive study in Harju and Karhumäki (2004) and also Moczurad (2000, 2007).
Many authors, like Costagliola et al. (2003, 2005), study other kinds of two-dimensional structures. This interest is driven by obvious applications of picture models in diverse disciplines. Moreover, pictures can be seen as a further natural extension of words. Thus, in the present paper we are interested in codicity testing and the defect effect for yet another kind of pictures.
We consider directed figures defined as labelled polyominoes with designated start and end points. This setting is similar to one of the models, symbolic pixel pictures, described in Costagliola et al. (2005) and admits a natural definition of catenation. We use the attribute "directed" to emphasize the way figures are catenated; this should not be confused with the meaning of "directed" in e.g. directed polyominoes.
We prove that it is decidable whether a given finite set of directed figures is a code, and we give a constructive algorithm (constructive in the sense that it finds a double factorization of a figure for a non-code). This is a significant change in comparison to previously mentioned picture models. On the other hand, we give several examples to disprove the defect theorem in this case.
Section 2 of the paper defines directed figures and related operations. Then, in Section 3, we give a necessary condition for a set of figures to be a code. We then proceed to the main result of the paper, the decidability of codicity verification, in Section 4. The proof of the main theorem leads to an algorithm, which is presented in Section 5. Finally, Section 6 clarifies the status of the defect effect for directed figures. We conclude with some observations and directions for possible further research.

Preliminaries
Throughout the paper Σ is a finite, non-empty alphabet.
A translation in $\mathbb{Z}^2$ by a vector $u = (u_x, u_y) \in \mathbb{Z}^2$ will be denoted by $\tau_u$. For a set $V \subseteq \mathbb{Z}^2$ and an arbitrary function $f : V \to \Sigma$ it obviously induces translations of $V$ and of $f$.
A set $U \subseteq \mathbb{Z}^2$ is connected if for any $x, y \in U$ there exists a sequence $x = x_1, x_2, \ldots, x_n, x_{n+1} = y$ contained in $U$ such that $\mathrm{dist}(x_i, x_{i+1}) = 1$ for $i \in \{1, \ldots, n\}$.
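The connectivity condition above can be checked mechanically. The following is a minimal sketch, assuming that dist is the Euclidean (equivalently, Manhattan) distance, so that two points are adjacent exactly when they differ by one in a single coordinate; the function name `is_connected` is hypothetical and not part of the paper.

```python
from collections import deque


def is_connected(points):
    """Check connectivity of a finite subset of Z^2.

    Assumption: two points are adjacent iff they are at distance 1,
    i.e. they are horizontal or vertical neighbours.
    """
    points = set(points)
    if not points:
        return True  # convention chosen for this sketch
    start = next(iter(points))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        # Visit the four neighbours at distance 1.
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb in points and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == points
```

For instance, an L-shaped set of three unit squares is connected under this assumption, whereas two squares touching only at a corner are not.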
Thus, we actually consider figures up to a translation.
Example 1 A directed figure and its graphical representation. Each point of the domain, $(x, y)$, is represented by a unit square in $\mathbb{R}^2$ with bottom left corner at $(x, y)$. A circle marks the start point and a diamond marks the end point of the figure. However, since we consider figures up to a translation, we do not mark the coordinates.
The set of directed figures with catenation is a monoid if and only if $m$ is associative. From now on let $m$ be an arbitrary associative merging function. We will use the notation $\Sigma^\diamond_m$ for both the monoid and the set of directed figures itself. Abusing this notation, we will also write $X^\diamond_m$ to denote the set of all figures that can be composed by catenation from figures in $X$.
Note that $\Sigma^\diamond_m$ is never free, since its basis must contain "unit figures" (figures of the type [...] for all $a \in \Sigma$), contradicting the freeness.
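To make the role of the merging function concrete, here is a minimal sketch of directed figures and their catenation. It rests on assumptions, since the formal definition is not reproduced in full here: the second figure is translated so that its start point lands on the end point of the first, labels on overlapping points are resolved by the merging function, and figures are stored with the start point at the origin so that equality means equality up to translation. All names (`Figure`, `make_figure`, `tran`, `catenate`) and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Figure:
    begin: tuple   # always (0, 0) after normalisation
    end: tuple
    labels: tuple  # sorted tuple of ((x, y), letter) pairs


def make_figure(begin, end, labels):
    """Build a figure normalised so that its start point is (0, 0)."""
    bx, by = begin
    shifted = {(x - bx, y - by): a for (x, y), a in labels.items()}
    return Figure((0, 0), (end[0] - bx, end[1] - by),
                  tuple(sorted(shifted.items())))


def tran(f):
    """Translation vector of a figure: end point minus start point."""
    return (f.end[0] - f.begin[0], f.end[1] - f.begin[1])


def catenate(f, g, merge):
    """Catenate f and g (assumed semantics, see the lead-in)."""
    # Translate g so that begin(g) coincides with end(f).
    ux, uy = f.end[0] - g.begin[0], f.end[1] - g.begin[1]
    labels = dict(f.labels)
    for (x, y), a in g.labels:
        p = (x + ux, y + uy)
        # Conflicting labels are resolved by the merging function.
        labels[p] = merge(labels[p], a) if p in labels else a
    return make_figure(f.begin, (g.end[0] + ux, g.end[1] + uy), labels)


# Merging function over the one-letter alphabet {a}.
m = {("a", "a"): "a"}
merge = lambda p, q: m[(p, q)]

x = make_figure((0, 0), (0, 0), {(0, 0): "a"})  # tran(x) = (0, 0)
y = make_figure((0, 0), (1, 0), {(0, 0): "a"})  # tran(y) = (1, 0)
```

Under these assumptions `catenate(x, x, merge) == x`, which illustrates how a figure with zero translation vector can collapse onto itself, while `tran(catenate(y, y, merge))` is `(2, 0)`.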
Example 3 Regardless of the merging function, the following figure can be decomposed as e.g.
Now consider the following set of directed figures: Since $\mathrm{tran}(x) = (0, 0)$, every element of $Y$ has the same domain. There is only a finite number of possible labellings of this domain, which implies that regardless of the merging function and labelling of $x$, there exist $a, b \in \mathbb{N}$, $a \neq b$, such that $x^a = x^b$. Hence $X$ is not a code; cf. Examples 9 and 10. ✷
The following conditions are equivalent ($\cdot$ denotes the usual dot product): [...] there exists a line passing through $(0, 0)$ such that all $v_i$'s lie on one side of it.
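The geometric condition on the vectors $v_i$ can be tested with exact integer arithmetic. The sketch below assumes that "on one side" means strictly inside an open half-plane bounded by a line through the origin; the function name `in_open_half_plane` is hypothetical. It looks for a clockwise-most vector $u$ such that every other vector lies at an angle in $[0, \pi)$ counterclockwise from $u$.

```python
def cross(u, v):
    """z-component of the cross product of two planar vectors."""
    return u[0] * v[1] - u[1] * v[0]


def dot(u, v):
    """Usual dot product of two planar vectors."""
    return u[0] * v[0] + u[1] * v[1]


def in_open_half_plane(vectors):
    """True iff all vectors lie strictly on one side of some line
    through the origin (assumed reading of the condition)."""
    vs = list(vectors)
    if not vs:
        return True
    if any(v == (0, 0) for v in vs):
        return False  # the zero vector lies on every such line
    for u in vs:
        # u qualifies if each v is strictly counterclockwise from u
        # (cross > 0) or points in the same direction as u.
        if all(cross(u, v) > 0 or (cross(u, v) == 0 and dot(u, v) > 0)
               for v in vs):
            return True
    return False
```

The test is quadratic in the number of vectors, which is adequate for the small finite sets considered here; a sort by angle would be faster but needs more care to stay exact.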

Decidability of verification
In this section we prove that testing whether a given set of figures is a code is decidable.We begin with observations that allow us to construct a "bounding area" for figures.We then proceed with properties that imply finiteness of possible configuration sets and, consequently, decidability of the problem in question.
or, by Theorem 1, $X$ is not a code. If $\tau_E$ exists, we can take it to be long enough so that for all $x \in X$, $\mathrm{dom}(x) \cup \{\mathrm{end}(x)\} \subseteq HP(\tau_E, \mathrm{begin}(x))$, where, for given $v, w \in \mathbb{R}^2$, $HP(v, w)$ for all $x \in X$. Without loss of generality we can assume )) (where $R_\varphi$ denotes a rotation by $\varphi$ and $\angle$ is the angle spanned by two vectors) and $\mathrm{begin}(x) = (0, 0)$ for all $x \in X$. Now choose constants $r_S, r_N, r_W > 0$ such that the vectors define a "bounding area" for figures in $X$, i.e., for all $x \in X$, $\mathrm{dom}(x) \cup \{\mathrm{end}(x)\} \subseteq \{HP(\tau, \mathrm{begin}(x))\}$.
where the union in the definition of $CW^+(x)$ is taken over $v \in \mathbb{Z}^2$ lying within the angle spanned by the vectors $-\mathrm{tran}(x_1)$ and $-\mathrm{tran}(x_n)$.
The following properties are now immediately proven:
Proposition 4 For all $x, y \in X^\diamond_m$, $CW^+(x) \subseteq CW^+(xy)$.
We define a configuration as a pair $(x, y)$, with $x, y \in X^\diamond_m$. We say that $(x', y') \in (X^\diamond_m)^2$ is a successor of $(x, y)$ and write $(x, y) \prec (x', y')$ if $x' = xx''$ for some $x'' \in X$ and $y = y'$, or $y' = yy''$ for some $y'' \in X$ and $x = x'$.
By $\prec^*$ we denote the transitive closure of $\prec$. Obviously $X$ is not a code if and only if there exist $x, y \in X$ and $z \in X^\diamond_m$ such that $x \neq y$ and $(x, y) \prec^* (z, z)$. Our goal is either to find a configuration $(x, y)$ such that $x \neq y$ and $(x, y) \prec \ldots \prec (z, z)$ (then $X$ is not a code), or to prove that such a configuration does not exist (then $X$ is a code). A configuration satisfying the above condition will be called an eventually terminating configuration.
Unfortunately, there are potentially infinitely many configurations to check. The following propositions will let us reduce the number of configurations under consideration.
Proposition 5 If $(x, y)$ is eventually terminating and $(x', y') \prec (x, y)$, then $(x', y')$ is eventually terminating.
Proof: Obvious. ✷
Proposition 6 If $(x, y)$ is eventually terminating, then [...]
Proof: See definitions of $CW^+$ and $CE^+$. ✷
Proposition 7 If $(x, y)$ is eventually terminating, then [...]
Proof: See definition of $CE^-$ and Proposition 1. ✷
Notice that we do not need all of the information contained in configurations, just those labellings that can be changed by future catenations. By Proposition 7, instead of $(x, y)$ we can consider a reduced configuration defined as a pair $(\pi_{RC}(x, y), \pi_{RC}(y, x))$ where [...] Hence the number of reduced configurations, up to translation, is finite. This leads us to the main theorem of the paper:
Theorem 2 It is decidable whether a given finite set $X \subseteq \Sigma^\diamond_m$ is a code.

Algorithm
From the proof of Theorem 2 we can obtain an algorithm that verifies whether a given set of directed figures is a code. The following procedure returns true if the given input set is a code; otherwise it returns false. Additionally, if the given set is not a code, the algorithm finds $x_1, \ldots$ [...] Note that this is the reason for processing configurations (and not just reduced configurations) within the main loop of the algorithm.
iii. if $(x, yz)$ may be eventually terminating
8. return true;
Note that configurations are considered up to a translation; in particular, this applies to "$\ldots \in RC$" tests.
Observe that an actual implementation may omit the $RC_{TMP}$ set and use $C \neq \emptyset$ as the loop condition (a new element is added to $C$ if and only if a new element is added to $RC$). This would improve efficiency but would obscure the similarity of the algorithm to the algorithm of Sardinas and Patterson, cf. Example 8. The implementation can also exploit the symmetrical nature of configurations.
The following examples show the algorithm behaviour for a code and a non-code.For the sake of simplicity, we depict elements of the set C only and we omit steps containing obvious assignments.We also omit pairs that can be obtained from another pair by exchanging the elements.
Step 2: The condition clearly fails.
Step 8: Thus X is a code.
Step 2: The condition fails.
Step 7, iteration 2: One of configurations gives wx = yz, hence X is not a code.
The final example illustrates the similarity of the algorithm presented here to the well-known algorithm of Sardinas and Patterson (ASP) when figures resemble words.
Step 2: The condition fails.
Step 7, iteration 1: $C = \{(y, zy)\}$; other pairs fail various conditions: $(yx, z)$ fails the "$\ldots \in RC$" test, $(yy, z)$ and $(yz, z)$ fail the condition of Proposition 7; $(y, zx)$ and $(y, zz)$ fail the $\tau_E$-span condition. Note a small dissimilarity with ASP, which is a consequence of extending configurations from both sides. This could be optimized by always extending only the "leftmost" component with respect to the $\tau_E$ direction.
Observe that the RC set of the algorithm corresponds to the union of sets constructed by ASP in each step.
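For comparison, the classical Sardinas-Patterson test for word codes can be sketched as follows. This is the textbook formulation for words (strings), not the figure algorithm of this paper: starting from $S_1 = X^{-1}X \setminus \{\varepsilon\}$, it iterates $S_{n+1} = X^{-1}S_n \cup S_n^{-1}X$ and reports a non-code as soon as some $S_n$ contains the empty word. The names `left_quotient` and `is_word_code` are hypothetical.

```python
def left_quotient(A, B):
    """Return A^{-1}B = { w : a + w in B for some a in A }."""
    return {b[len(a):] for a in A for b in B if b.startswith(a)}


def is_word_code(X):
    """Sardinas-Patterson test for a finite set of words."""
    X = frozenset(X)
    if "" in X:
        return False  # the empty word never belongs to a code
    current = frozenset(left_quotient(X, X) - {""})
    seen = set()  # sets S_n met so far; their union loosely mirrors RC
    while current not in seen:
        if "" in current:
            return False  # a double factorization exists
        seen.add(current)
        current = frozenset(left_quotient(X, current)
                            | left_quotient(current, X))
    return True  # the sequence S_n became periodic without the empty word
```

Termination follows because every element of every $S_n$ is a suffix of some word in $X$, so only finitely many distinct sets can occur. For example, `is_word_code({"a", "ab", "ba"})` is `False`, since aba = a·ba = ab·a.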

Defect effect
In this section we present several counterexamples to disprove the defect theorem for directed figures. They show that the effect fails even for very simple sets. On the other hand, restricting figures to word-like shapes, with appropriately chosen start and end points, obviously guarantees the defect property. It would be interesting to characterize what restrictions on figures allow the defect effect to hold.

Let us now consider the following examples:
Example 9 Let Σ = {a}, m = {(a, a) → a} and which implies that X is not a code.However, there exists no Y such that |Y | < 2 and X + ⊆ Y + .
Thus, the defect effect does not hold even for two squares.The following example shows that in fact this is even worse: there are singleton non-codes.
Example 10 Let Σ = {a}, m = {(a, a) → a} and which implies that X is not a code.However, there exists no Y such that |Y | < 1 and X + ⊆ Y + .
The defect theorem does not hold even for non-codes that do not allow a word x with tran(x) = (0, 0) to be composed: ⋄ a which implies that X is not a code.However, there exists no Y such that |Y | < 2 and X + ⊆ Y + .

Final remarks
It is interesting to note that the complexity of the codicity verification algorithm depends on the angle spanned by the translation vectors of figures. Bigger angles result in higher complexity, since there are more configurations to check. The obvious upper bound is exponential in the size of the "bounding areas" defined in Section 4; this in turn grows like $\tan(\alpha/2)$, where $\alpha$ is the angle in question. However, at $\alpha = \pi$ the complexity drops radically because of Theorem 1. At the other end of the spectrum, when the angle tends to zero, the figures resemble words and the algorithm becomes similar to the algorithm of Sardinas and Patterson, cf. Example 8. Estimating the complexity is an obvious subject for further research.
An interesting extension of our results would be to consider figures with no merging function, where catenation is a partial operation defined for those figures that do not overlap when catenated. Additional restrictions on figures would have to be imposed (e.g., start point on the inner boundary and end point on the outer boundary of a figure).
Another direction for further study is the behaviour of infinite catenations, like the usual infinite word asymptotics (cf. Foryś (2004)), and the density of codes (cf. Moczurad and Moczurad (2007)).

Definition 1 (Directed figure) Let $D \subseteq \mathbb{Z}^2$ be finite and connected, $b \in D$, $e \in \Gamma(D)$ and $l : D \to \Sigma$. A quadruple $f = (D, b, e, l)$ is called a directed figure (over $\Sigma$) with

Fig. 3: $CW^+(x)$ and $CE^+(x)$ regions; the black dot denotes the end point of $x$.