Object grammars and random generation

This paper presents a new systematic approach for the uniform random generation of combinatorial objects. The method is based on the notion of object grammars, which give recursive descriptions of objects and generalize context-free grammars. The application of particular valuations to these grammars leads to enumeration and random generation of objects according to non-algebraic parameters.


Introduction
An object grammar defines classes of objects by means of terminal objects and certain types of operations applied to the objects. It is most often described with pictures. For instance, the standard decomposition of complete binary trees is an object grammar (Figure 1). The formalism given here for object grammars [6,7] generalizes that of context-free grammars. An important application of these grammars is a systematic approach to the specification of bijections between sets of combinatorial objects (see [7]).
This paper outlines another important application of object grammars: a systematic method for generating combinatorial objects uniformly at random. The work lies within the framework of the recursive method, which generates random objects recursively by endowing a recurrence formula with a probabilistic interpretation. This generation process was first formalized by Nijenhuis and Wilf [11,14,15]. Their approach is general: they base the recursive procedure on an acyclic directed rooted graph with a terminal vertex and numbered edges, a graph which depends on the family of objects. The recursive method has also been formalized by Hickey and Cohen in the special case of context-free languages [10], and by Greene within the framework of labelled formal languages [9].
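As an illustration of the recursive method, the following Python sketch generates complete binary trees uniformly at random from the recurrence b(0) = 1, b(n) = sum of b(i) b(n-1-i). The size parameter (number of internal nodes), the tuple representation of trees and the function names are assumptions made for this example; they are not taken from the paper.

```python
import random

def count_table(n_max):
    """b[n] = number of complete binary trees with n internal nodes."""
    b = [0] * (n_max + 1)
    b[0] = 1  # the tree reduced to a single leaf
    for n in range(1, n_max + 1):
        b[n] = sum(b[i] * b[n - 1 - i] for i in range(n))
    return b

def random_tree(n, b, rng=random):
    """Draw a complete binary tree with n internal nodes uniformly at random.

    A leaf is None; an internal node is a pair (left, right).
    """
    if n == 0:
        return None
    # Probabilistic reading of the recurrence: the left subtree has i
    # internal nodes with probability b[i] * b[n-1-i] / b[n].
    r = rng.randrange(b[n])
    for i in range(n):
        r -= b[i] * b[n - 1 - i]
        if r < 0:
            return (random_tree(i, b, rng), random_tree(n - 1 - i, b, rng))
    raise AssertionError("unreachable: the weights sum to b[n]")

if __name__ == "__main__":
    table = count_table(10)
    print(random_tree(10, table))
```

The counting table is computed once; each call to `random_tree` then only reads it, which is the division of work between enumeration and generation used throughout the recursive method.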
Recently, Flajolet, Zimmermann and Van Cutsem have given a systematic approach for this method, with specifications of structures by grammars involving set, sequence and cycle constructions [8]. The methods they examined make it possible to start from any high-level specification of a decomposable class and automatically compile procedures that solve the corresponding random generation problem. They presented two closely related groups of methods: the sequential algorithms (linear search), which have worst-case time complexity $O(n^2)$ when applied to objects of size $n$, and the boustrophedonic algorithms (bidirectional search), which have worst-case time complexity $O(n \log n)$. The present work continues the research of these authors. It is a systematization of the recursive method based on object grammars, and it extends the field of structures which can be generated using such a method. Another important contribution of this work is to consider the random generation of objects according to several parameters simultaneously, and to consider not only algebraic parameters (i.e. parameters that lead to algebraic generating functions), but also parameters that lead to generating functions satisfying $q$-equations. Other methods have rarely dealt with the latter parameters.
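The difference between the two search strategies can be sketched in Python as follows. Given a list of integer weights and an integer r drawn uniformly below their sum, both functions select index i with probability proportional to weights[i]; only the order in which the indices are scanned differs. The function names are hypothetical, and the sketch shows only the scanning order, not the full algorithms of [8].

```python
def sequential_pick(weights, r):
    """Select an index by scanning 0, 1, 2, ... (linear search)."""
    for i, w in enumerate(weights):
        r -= w
        if r < 0:
            return i
    raise ValueError("r must satisfy 0 <= r < sum(weights)")

def boustrophedonic_order(n):
    """Indices in the order 0, n-1, 1, n-2, ... (alternating from both ends)."""
    lo, hi = 0, n - 1
    order = []
    while lo <= hi:
        order.append(lo)
        if lo != hi:
            order.append(hi)
        lo += 1
        hi -= 1
    return order

def boustrophedonic_pick(weights, r):
    """Select an index by scanning from both ends alternately."""
    for i in boustrophedonic_order(len(weights)):
        r -= weights[i]
        if r < 0:
            return i
    raise ValueError("r must satisfy 0 <= r < sum(weights)")
```

With the bidirectional order, the number of steps needed to reach index i is proportional to the smaller of i and n-1-i, so unbalanced splits are found quickly; this is the intuition usually given for the improved worst-case bound.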
In Section 2, we review the necessary definitions for object grammars. We then introduce in Section 3 the notion of $q$-linear valuations. They formalize the behaviour (on objects defined by an object grammar) of parameters that lead automatically to $q$-equations satisfied by the corresponding generating functions.
Section 4 introduces our systematic random generation method. Given an unambiguous object grammar and a corresponding $q$-linear valuation, it allows the enumeration and uniform generation procedures according to the valuation to be constructed automatically. These procedures use sequential algorithms and have worst-case time complexity $O(kl(k+l))$ when applied to objects of valuation $x^k q^l$, assuming the enumeration tables have been computed once and for all in $O(k^2 l^2)$ time (see [6] for the general case of valuation). If one only considers an algebraic parameter ($x^k$), the complexity is the same as in [8], and the boustrophedonic search can be used. Note that, as in [8], the complexity is measured in number of arithmetic operations: unit cost is assumed for the manipulation of a large integer.
The path taken here is eminently practicable, and the method has been implemented in the Maple language (package named qAlGO). Section 5 gives some results obtained with this program concerning the uniform random generation of convex polyominoes according to the area and of planar trees according to the internal path length. The package CombStruct, written by P. Zimmermann, cannot handle these objects and this type of parameter. The packages qAlGO and CombStruct complement each other. In Section 6, we conclude by discussing some ideas and directions of research.

Object Grammars
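Let $\mathcal{E}$ be a family of sets of objects. An object operation (in $\mathcal{E}$) is a mapping $\phi : E_1 \times \cdots \times E_k \to E$, where $E$ and $E_1, \ldots, E_k$ are sets of $\mathcal{E}$; the integer $k$ is the arity of $\phi$.

Example 2.1. A parallelogram polyomino can be defined as the surface lying between two North-East paths that are disjoint except at their common end points (see Figure 2) [5]. Let $E_{pp}$ be the set of parallelogram polyominoes.

An object grammar is a 4-tuple $G = \langle \mathcal{E}, \mathcal{T}, \mathcal{P}, S \rangle$ where:
$\mathcal{E} = \{E_i\}_{i \in I}$ is a finite family of sets of objects ($I$ is a finite subset of $\mathbb{N}$).
$\mathcal{T}$ is a family of sets of terminal objects.
$\mathcal{P}$ is a set of object operations $\phi$ in $\mathcal{E}$.
$S$ is a fixed set of $\mathcal{E}$ called the axiom of the grammar.
The dimension of an object grammar is the cardinality of $\mathcal{E}$.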
In the following, the terms grammar and operation will often be used for object grammar and object operation respectively.
The construction of an object can be described by its derivation tree: internal nodes are labelled with object operations and leaves with terminal objects. These derivation trees are comparable to the abstract syntax trees of compiler theory. Let $G$ be an object grammar and $E \in \mathcal{E}$ a set of objects. An object $o$ is said to be generated in $G$ by $E$ if there is a derivation tree of $G$ on $E$ (i.e. the codomain of the label of the root is $E$) whose evaluation is $o$. The set of objects generated by $E$ in $G$ is denoted by $\mathcal{O}_G(E)$. If $S$ in $\mathcal{E}$ is chosen as the axiom of $G$, then $\mathcal{O}_G(S)$ is the set of objects generated by $G$.
The parallelogram polyomino of Figure 2 belongs to $\mathcal{O}_{G_2}(E_{pp})$; its derivation tree in $G_2$ is given in Figure 4. The set $\mathcal{O}_{G_2}(E_{pp})$ is the set of parallelogram polyominoes. The set $\mathcal{O}_{G_1}(E_{pp})$ is the set of Ferrers diagrams; it is a proper subset of the set of parallelogram polyominoes.
By analogy with context-free grammars, an object grammar $G$ is unambiguous if every object in $\mathcal{O}_G(S)$ has exactly one derivation tree. Unambiguity is an important property for building bijections.
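Derivation trees, their evaluation and a bounded unambiguity check can be sketched in Python on a small stand-in grammar. The grammar used here (well-parenthesized words with one terminal object, the empty word, and one operation of arity 2) is an assumption chosen for brevity; it is not one of the polyomino grammars of the paper, and all names are hypothetical.

```python
from itertools import product

EPSILON = ("eps",)  # leaf labelled with the terminal object (the empty word)

def phi(u, v):
    """Object operation of arity 2: phi(u, v) = ( u ) v."""
    return "(" + u + ")" + v

def evaluate(tree):
    """Evaluate a derivation tree bottom-up into the object it describes."""
    if tree == EPSILON:
        return ""
    _, left, right = tree
    return phi(evaluate(left), evaluate(right))

def derivation_trees(n):
    """All derivation trees with exactly n internal nodes (uses of phi)."""
    if n == 0:
        return [EPSILON]
    trees = []
    for i in range(n):
        for left, right in product(derivation_trees(i),
                                   derivation_trees(n - 1 - i)):
            trees.append(("phi", left, right))
    return trees

def is_unambiguous_up_to(n_max):
    """Check that distinct derivation trees evaluate to distinct objects."""
    seen = set()
    for n in range(n_max + 1):
        for tree in derivation_trees(n):
            word = evaluate(tree)
            if word in seen:
                return False
            seen.add(word)
    return True
```

Such an exhaustive check only covers trees up to a fixed size; it can expose an ambiguity but does not prove unambiguity in general.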
One can also define several normal forms for object grammars: reduced, 1-2 or complete. The reduced and 1-2 forms extend usual normal forms of context-free grammars: the reduced form and the Chomsky normal form. A grammar is said to be reduced if every set of objects $E$ in $\mathcal{E}$ is accessible from the axiom and $\mathcal{O}_G(E) \neq \emptyset$; it is said to be in 1-2 form if all its operations are of arity 1 or 2. The complete form is specific to object grammars. A grammar is said to be complete if $\mathcal{O}_G(E) = E$ for every set of objects $E$ in $\mathcal{E}$ (in general, $\mathcal{O}_G(E) \subseteq E$). For example, the grammar $G_2$ previously defined is complete while $G_1$ is not. The details of the transformations of object grammars into normal forms are given in [6].
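The two conditions of the reduced form can be tested with standard fixed-point and graph-traversal techniques. The Python sketch below assumes a hypothetical encoding of a grammar as a dictionary mapping each set name to a list of rules, where a rule is either `("term", obj)` or `("op", name, [argument sets])`; this encoding is an illustration only and is not the input syntax of qAlGO.

```python
def productive_sets(grammar):
    """Names of the sets that generate at least one object (fixed point)."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for name, rules in grammar.items():
            if name in productive:
                continue
            for rule in rules:
                if rule[0] == "term" or all(a in productive for a in rule[2]):
                    productive.add(name)
                    changed = True
                    break
    return productive

def accessible_sets(grammar, axiom):
    """Names of the sets reachable from the axiom through the operations."""
    seen = {axiom}
    stack = [axiom]
    while stack:
        name = stack.pop()
        for rule in grammar[name]:
            if rule[0] == "op":
                for arg in rule[2]:
                    if arg not in seen:
                        seen.add(arg)
                        stack.append(arg)
    return seen

def is_reduced(grammar, axiom):
    """True if every set is accessible from the axiom and generates an object."""
    names = set(grammar)
    return (productive_sets(grammar) == names
            and accessible_sets(grammar, axiom) == names)
```

For instance, adding to a reduced grammar a set defined only by an operation on itself makes `is_reduced` return False, since that set generates no object.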

Another Definition
A complete, unambiguous object grammar $G = \langle \mathcal{E}, \mathcal{T}, \mathcal{P}, S \rangle$ can be described as a system of equations involving sets of objects, terminal objects and object operations, or as a system of graphic equations.
The equations describe the decomposition of each set of objects into a disjoint union of terminal objects and images of operations. For example, the grammar $G_2$ defined previously, which generates parallelogram polyominoes, is described by a single equation; a schematic representation of this grammar is given in Figure 5.

Expanded 1-2 form
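The automatic method of random generation presented in this paper is based on the expanded 1-2 form of object grammars. An object grammar $G = \langle \mathcal{E}, \mathcal{T}, \mathcal{P} \rangle$ is said to be in expanded 1-2 form if, for every $E$ in $\mathcal{E}$, the equation that defines it has one of the forms $E = e$ (a terminal object), $E = E_1 + E_2$, $E = \phi(E_1)$ or $E = \phi(E_1, E_2)$.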

Proposition 2.2 Every object grammar has an equivalent expanded 1-2 form.
Proof. To transform an object grammar into expanded 1-2 form, it suffices to change all the sums and the domains of the object operations having arity $> 2$ by adding sets of objects and identity object operations of arity 2. Thus, every equation that is not in expanded 1-2 form is replaced by an equivalent set of equations that are.

□
In the following, we will often use the term 1-2 form for expanded 1-2 form.
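The transformation can be sketched in Python under two assumptions that are not taken verbatim from the paper: the equations of an expanded 1-2 form are taken to have the shapes E = e, E = E1 + E2, E = f(E1) or E = f(E1, E2), and a grammar is encoded as a dictionary mapping each set name to a list of rules `("term", obj)` or `("op", name, [argument sets])`. Fresh set names of the form `_N1`, `_N2`, ... are assumed not to clash with existing names.

```python
def to_expanded_12_form(grammar):
    """Return a dict mapping each set name to exactly one equation:
    ("term", obj), ("sum", A, B), or ("op", name, [A]) / ("op", name, [A, B]).
    """
    result = {}
    counter = [0]

    def fresh():
        counter[0] += 1
        return f"_N{counter[0]}"

    def define(name, rule):
        """Record one equation for `name`, binarizing operations of arity > 2."""
        if rule[0] == "term":
            result[name] = rule
            return
        _, op, args = rule
        args = list(args)
        while len(args) > 2:
            pair = fresh()
            # Identity operation of arity 2: it only groups two arguments,
            # so `op` now receives that pair in place of the two arguments.
            result[pair] = ("op", "id", args[:2])
            args = [pair] + args[2:]
        result[name] = ("op", op, args)

    for name, rules in grammar.items():
        if len(rules) == 1:
            define(name, rules[0])
            continue
        # One fresh set per alternative, then a chain of binary sums.
        parts = []
        for rule in rules:
            part = fresh()
            define(part, rule)
            parts.append(part)
        current = name
        while len(parts) > 2:
            rest = fresh()
            result[current] = ("sum", parts[0], rest)
            current, parts = rest, parts[1:]
        result[current] = ("sum", parts[0], parts[1])
    return result
```

The number of equations added is linear in the total size of the original rules, so the transformation itself is cheap compared with enumeration and generation.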

Enumeration
Let $K$ be a ring and [...]

Theorem 3.1 Let $G = \langle \mathcal{E}, \mathcal{T}, \mathcal{P} \rangle$ be a complete, unambiguous object grammar, and [...]; one can directly obtain the following equation: [...]

Proof. The object grammar $G$ is unambiguous and complete, which gives equation (1). Equation (2) is obvious, since the unions are disjoint.

□
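The objective is to obtain a system of equations for the generating functions of the sets of the grammar. To this end, one has to express the generating function of the image of each object operation in terms of the generating functions of its operands. This depends on the nature of the object valuations considered.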

$q$-linear Object Valuations
[...] respectively, and $\phi$ an object operation with [...] (3)
This is a system of $q$-equations where the unknowns are the generating functions of the sets $E$ in $\mathcal{E}$.

Object Valuations and 1-2 form
The proof of Proposition 2.2 has shown how to reduce object grammars to 1-2 form. The $q$-linear object valuations are preserved by this transformation.

Proposition 3.3 If $\mathcal{V}$ is a set of $q$-linear object valuations associated with an object grammar, it is possible to construct an equivalent set of object valuations associated with its 1-2 form.
Proof. Recall that an equation [...] in the grammar is replaced by the set of equations [...]. Concerning the object valuations, if we have [...]

Enumeration and Random Generation Procedures
Not all object grammars $G$ and possible corresponding sets of functions $\mathcal{V}$ lead to random generation. The pairs $(G, \mathcal{V})$ considered here are well-founded, i.e. each set of objects generates at least one object, and it generates a finite number of objects having the same valuation (this is the definition of an object valuation). An algorithm performing this task is detailed in [6]. It is inspired by the work of Zimmermann [16].

Generation Procedures
With each set of objects $E$ of the grammar is associated a generation procedure with parameters $k$ and $l$, which generates (uniformly at random) an object of $E$ having the valuation $x^k q^l$. More precisely, this procedure constructs the derivation tree of the object in the grammar. It depends on the form of the equation which defines $E$ in the object grammar. If $E$ is reduced to a terminal object, the procedure is trivial. If $E = E_1 + E_2$, the procedure must generate an object belonging either to $E_1$ or to $E_2$. The probability that this object belongs to $E_1$ is equal to the number of objects of $E_1$ of valuation $x^k q^l$ divided by the number of objects of $E$ of valuation $x^k q^l$. [...]

• For each set of objects $E$ in the 1-2 form of $G$, create the enumeration procedure, then compute all the coefficients up to rank $(k, l)$.
• For each set of objects $E$ in the 1-2 form of $G$, create the generation procedure as indicated above.

Proof. A random generation procedure consists in recursively constructing a derivation tree in $G$. This tree is binary because $G$ is in 1-2 form. The size of the derivation tree of an object having the valuation $x^k q^l$ is proportional to $k + l$. The generation of a vertex of the tree has a maximal cost of $O(kl)$ (the loops of the procedure). Thus, the worst-case complexity of generating the derivation tree is $O(kl(k+l))$. □
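The two steps (enumeration tables, then recursive generation) can be sketched in Python on one concrete case. The sketch assumes a grammar for planar trees with a single terminal object (one node) and one operation of arity 2 that grafts a tree t1 as the leftmost subtree of the root of t2, so that the size adds up and ipl(phi(t1, t2)) = ipl(t1) + size(t1) + ipl(t2). This grammar, the nested-list representation of trees and the function names are assumptions for illustration; this is not the code produced by qAlGO.

```python
import random

def enumeration_table(k, l):
    """c[n][m] = number of planar trees with n nodes and internal path length m."""
    c = [[0] * (l + 1) for _ in range(k + 1)]
    if k >= 1:
        c[1][0] = 1  # terminal object: the tree reduced to its root
    for n in range(2, k + 1):
        for m in range(l + 1):
            total = 0
            for i in range(1, n):           # size of the grafted tree t1
                for j in range(m - i + 1):  # internal path length of t1
                    total += c[i][j] * c[n - i][m - i - j]
            c[n][m] = total
    return c

def generate(n, m, c, rng=random):
    """Uniform random planar tree with n nodes and internal path length m.

    A tree is the list of its subtrees, so the single node is [].
    """
    if c[n][m] == 0:
        raise ValueError("no planar tree with these parameters")
    if n == 1:
        return []
    r = rng.randrange(c[n][m])
    # Sequential search over the decompositions phi(t1, t2).
    for i in range(1, n):
        for j in range(m - i + 1):
            r -= c[i][j] * c[n - i][m - i - j]
            if r < 0:
                t1 = generate(i, j, c, rng)
                t2 = generate(n - i, m - i - j, c, rng)
                return [t1] + t2
    raise AssertionError("unreachable: the weights sum to c[n][m]")

def size(tree):
    """Number of nodes of a tree."""
    return 1 + sum(size(child) for child in tree)

def ipl(tree, depth=0):
    """Sum of the depths of all nodes of a tree."""
    return depth + sum(ipl(child, depth + 1) for child in tree)
```

The double loop in `generate` is the source of the $O(kl)$ cost per vertex of the derivation tree mentioned in the proof above; the table is computed once and reused for every draw.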

Algorithm for Uniform random generation
One can now describe a uniform random generation procedure for the objects of an object grammar according to a set of $q$-linear object valuations (Figure 6). The generation procedures obtained give the derivation trees of objects in the 1-2 form of $G$, but not directly in $G$. A simple transformation (of linear cost $O(k + l)$) gives the derivation trees in $G$. This postprocessing does not affect the conclusions of the complexity analysis. Furthermore, at the expense of some programming effort, it can be performed 'on the fly'.

Maple package qAlGO
The program qAlGO (written in Maple) implements the method developed in the previous sections. The package automatically builds the enumeration and generation procedures from an unambiguous object grammar and a set of corresponding $q$-linear valuations (see the annex of [6] for syntax and use). The automatic nature of qAlGO makes it a very useful tool that eases the experimental study of various statistics on combinatorial objects. In the following, we present relevant examples of random generation.

Convex Polyominoes According to the Area
Here is an example of an experimental study using qAlGO. It concerns the random generation, according to the area, of different classes of convex polyominoes: parallelogram polyominoes, convex directed polyominoes and convex polyominoes.
Consider first the example of parallelogram polyominoes. It suffices to give as input to qAlGO an object grammar that generates them and the corresponding object valuations: the grammar in Figure 5 and the valuation defined in Example 3.1. One thus obtains the enumeration and the uniform generation according to the width (in $x$) and the area (in $q$).
For convex polyominoes, we use a much more complicated object grammar (of dimension 9 and with 34 object operations!), but the principle is exactly the same as for parallelogram polyominoes. CalICo offers a software environment for manipulating and visualizing combinatorial objects; it allows graphical interfaces to communicate with computer algebra software such as Maple. Such experiments showed us how thin random convex polyominoes drawn according to the area are. Drawn according to the perimeter, they look thicker. Examples of such polyominoes are given in Figures 8 and 9. We also found that convex polyominoes drawn at random according to the area have either a north-east or a north-west orientation (as in Figure 9), each with probability one half.
To understand the thin look of such random polyominoes according to the area, we computed experimentally the average value of two parameters: the height of a column and the gluing number between two adjacent columns, which is the number of cells by which two adjacent columns are in contact. After generating 1000 convex polyominoes having area 100 and 1000 parallelogram polyominoes having area 200, we obtain the following average values: 2.[...] for the height of the columns, and 1.[...] for the gluing number between two adjacent columns.
Remark. The result for the average height of a column (2.[...]) coincides exactly with that obtained by Bender [1] using asymptotic analysis. The difference of 1 between these two average parameters can be explained simply by noticing that this difference has for limit the quotient of the height by the width of the polyomino, which is 1 by symmetry.

Planar Trees According to the Internal Path Length
A simple unambiguous object grammar for planar trees is given in Figure 10. The internal path length (ipl) of a tree is the sum of the distances of all its nodes from its root. A $q$-linear valuation is associated with this parameter. Setting $q$ to 1, we obtain the linear valuation for the size of the trees (the number of nodes).
Using qAlGO with the above grammar and these two valuations, we are able to generate planar trees at random according to the size (see Figure 11), and also according to the internal path length (see Figure 12). The latter have a remarkable look: they have a very small height.
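To make the role of the $q$-linear valuation concrete, here is a worked derivation under an assumed grammar, which may differ from the one of Figure 10: a planar tree is either a single node, or $\phi(t_1, t_2)$, obtained by grafting $t_1$ as the leftmost subtree of the root of $t_2$. Writing $|t|$ for the number of nodes, every node of $t_1$ moves one level down, which yields a $q$-equation for the generating function $T(x,q)$ and, at $q = 1$, the usual algebraic equation for the size.

```latex
\begin{align*}
T(x,q) &= \sum_{t} x^{|t|}\, q^{\mathrm{ipl}(t)}, \\
|\phi(t_1,t_2)| &= |t_1| + |t_2|, \\
\mathrm{ipl}(\phi(t_1,t_2)) &= \mathrm{ipl}(t_1) + |t_1| + \mathrm{ipl}(t_2), \\
T(x,q) &= x + T(xq,q)\,T(x,q), \\
T(x,1) &= x + T(x,1)^2 .
\end{align*}
```

The substitution $x \mapsto xq$ in the first factor accounts for the extra term $|t_1|$ in the internal path length; it is this shifted argument that makes the equation a $q$-equation rather than an algebraic one.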

Conclusions and Perspectives
The interest of our approach lies in its generality and simplicity. Time complexities are 'computable' and, at the same time, one gains access to the random generation of arbitrarily complex objects according to [...]
Fig. 12: A random planar tree of internal path length 100.