Generating Functions of Stochastic L-Systems and Application to Models of Plant Development

If the interest of stochastic L-systems for plant growth simulation and visualization is broadly acknowledged, their full mathematical potential has not been taken advantage of. In this article, we show how to link stochastic L-systems to multitype branching processes, in order to characterize the probability distributions and moments of the numbers of organs in plant structure. Plant architectural development can be seen as the combination of two subprocesses driving the bud population dynamics, branching and differentiation. By writing the stochastic L-system associated to each subprocess, we get the generating function associated to the whole system by compounding the associated generating functions. The modelling of stochastic branching is classical, but to model differentiation, we introduce a new framework based on multivariate phase-type random vectors.


Introduction
Models of plant development (or organogenesis) describe the dynamic creation of organs (leaves, internodes, flowers/fruits) and how they arrange to form plant structures.When the smallest scale of interest is that of organs (and not cells), discrete models are generally used to simulate plant structural development.The parallel rewriting grammar introduced by Lindenmayer (1968) (called L-system) is particularly adapted to model the evolution of branching patterns and its algorithmic power has been broadly taken advantage of since Smith (1984).The stochastic version of this type of grammar gives interesting results from simulation and graphical points of view by increasing the realistic aspect of geometric plants, cf.Prusinkiewicz and Lindenmayer (1990).However, when the aim of development models is not purely geometric but a basis for the study of plant functioning (biomass production and repartition), there is a need to bypass the full simulation of plant structure and to write properly the dynamic equations of development, in order to compute the growth biophysical processes, cf.De Reffye et al. (2008).A solution based on grammar factorization was proposed by de Reffye et al. (2003) and Cournède et al. (2006).It 1365-8050 c 2008 Discrete Mathematics and Theoretical Computer Science (DMTCS), Nancy, France was also derived in the stochastic case in Kang et al. (2007) and the formalism of stochastic grammars was taken advantage of to compute the generating functions of simple branching plant structures.
The objective of this article is to explore the relationship between stochastic L-Systems and multitype branching processes in order to compute the corresponding probability generating functions and the distribution moments of the number of organs.The formalism is applied to a stochastic model of plant development (GreenLab).The article starts with the presentation of the basic botanical concepts and explains how a stochastic grammar can be derived.The theoretical framework is then introduced and we show how to derive generating functions from stochastic L-systems.It is applied to the two main botanical processes underlying plant development, branching and differentiation sequence of the meristem, which are first studied separately and then composed to form the complete stochastic model.The modelling of stochastic branching is classical but, to model differentiation, we introduce a new framework based on multivariate phase-type random vectors.Finally, the recursive relationship to compute the generating functions of plant structures can be introduced and applied to obtain the moments of the numbers of organs.

Botanical concepts to model plant development
As explained in Barthélémy and Caraglio (2007), organogenesis results from the functioning of undifferentiated cells constituting the apical meristem and located at the tip of axes.When in active phase, this meristem forms buds that will develop into new growth units composed of one or several metamers (also called phytomers).A metamer is a botanical entity chosen as the elementary scale to model plant architectural development in this study.It is composed of an internode bearing organs: axillary buds, leaves, flowers.Depending on species, metamers are set in place rhythmically -plants grow by successive shoots of several metamers -or continuously -when meristems keep on functioning and generate metamers one by one.In both cases, the time between the appearances of new shoots defines the architectural Growth Cycle.A Growth Unit is the set of metamers built by a bud during a growth cycle (only one in the continuous case).These metamers can be of different kinds and ordered according to botanical rules, like acrotony.For example, most temperate trees grow rhythmically, new shoots appearing at spring.If we do not consider polycyclism and neoformation, the architectural growth cycle corresponds to one year.
In this work, we do not consider time scales that are smaller than the architectural growth cycle and we study the development of new growth units as a discrete process.The Chronological Age (CA) of a plant (or of an organ) is defined as the number of growth cycles it has existed for.
Since metamers may bear axillary buds, plant architecture develops into a hierarchical branching system.Barthélémy and Caraglio (2007) underlined that architectural units can be grouped into categories characterized by a particular combination of morphological parameters.Thus, the concept of Physiological Age (PA) was introduced to represent the different types of growth units and axes.For instance, on coffee trees, there are two types: orthotropic trunk and plagiotropic branches.The main trunk's PA is equal to 1 and the oldest PA denoted by P corresponds to the ultimate state of differentiation for an axis, it is usually short, without branches.We need less than 5 PAs to describe the axis typology of most trees.The apical meristem or bud of an axis is thus characterized by the PA of the growth unit that it may produce and a metamer is characterized by its PA i (which is the PA of the growth unit that it belongs to).Moreover, along an axis, the morphological features of the growth unit may evolve with the age of the apical meristem.This process is described as the meristem sequence of differentiation by Barthélémy and Caraglio (2007), and corresponds to a transition to a superior PA of the meristem.
Since the number of potential axillary buds per metamer can be considered as a fixed botanical data, it is straightforward to deduce the whole plant structure from the population of buds.Moreover, we will often identify the meristem with the bud that it forms, for the sake of simplicity.Therefore, in the following, modelling plant development is equivalent to studying the dynamic evolution of the population of buds.It is mainly driven by the two botanical processes described above: branching resulting from the appearance of lateral buds in growth units and the differentiation sequence of meristems resulting in the change of PAs for terminal buds.As a consequence, at growth cycle n, a bud is characterized by 3 indices: its PA φ, the CA k of the axis (which is also the CA of the meristem) and the initial PA of the meristem β.It will be denoted by b k,β n,φ .In the sequel, the two botanical processes will be referred to as branching and differentiation.
Fig. 1: Example of deterministic development for a bud during 3 cycles.A bud of PA 1 gives a metamer of PA 1 bearing a lateral bud of PA 2 and a terminal bud of PA 1.A bud of PA 2 gives a metamer of PA 2 bearing a lateral bud of PA 3 and a terminal bud of PA 2. A bud of PA 3 gives a metamer of PA 3 and a terminal bud of PA 3.After two cycles, the terminal bud of an axis of PA 1 differentiates and its new PA is 2.After one cycle, the terminal bud of an axis of PA 2 differentiates and its new PA is 3.
The productions of buds deeply vary according to their positions, environmental conditions, plant age. . .The deterministic modelling of such phenomena is difficult.For this reason, the development processes are considered as stochastic.For the sake of clarity, we only consider in this study growth units with one metamer.However, the same principles would apply for complex growth units with a random number of metamers as described in Kang et al. (2008).The computations of the general case are given in Loi and Cournède (2008).Experimental observations on plant populations and statistical analyses allow the estimation of the parameters of stochastic organogenesis models, cf.de Reffye (1979).
• P l : bud survival probability.At each growth cycle, a bud may stay alive with probability P l or die with probability (1 − P l ).It may depend on bud's PA and will thus be denoted by P l (i) for 1 ≤ i ≤ P .
• P a : bud activity probability.At each growth cycle, if a bud is alive, it may stay dormant with probability (1 − P a ) or produce a new growth unit with probability P a .P a may also depend on bud's PA: P a (i) for 1 ≤ i ≤ P .
• P b i,j (k): production probabilities.If at a given growth cycle, a bud of PA i is active, P b i,j (k) is the probability that the growth unit it develops into bears k axillary buds of PA j.Botanical constraints usually impose that 0 ≤ k ≤ B max i,j .
• λ i : inverses of average occupation times.During its sequence of differentiation, a meristem stays of PA i for an average period of 1/λ i .
• q i,j : transition probabilities.When a meristem of PA i changes its PA, q i,j is the probability that its new PA equals j.Note that the botanical differentiation sequence imposes that q i,j = 0 if j ≤ i.
In computational models, plants are generally represented as words in a language based on a generative parallel rewriting grammar also called L-system (Lindenmayer (1968), Smith (1984), Prusinkiewicz andLindenmayer (1990), Franc ¸on (1990)).It is particularly adapted to describe the production of buds at discrete time steps.The following section will introduce a probabilistic framework to analyze stochastic models of plant development, based on stochastic L-systems.

A probabilistic framework based on stochastic L-systems
In this section, denotes an alphabet, W the set of all words over V and W + the set of nonempty words over V .Let 1 be the empty word and "." be the concatenation operator.Then, (W, ., 1) is a noncommutative monoid.

Markov kernel and stochastic L-system
We first recall some basic definitions and properties of the Markov chain theory (Stroock (2005)).Throughout the following all sets are finite or countable.

Definition 3.1 (Transition matrix) A transition matrix (or kernel
Definition 3.2 (Markov chain) A Markov chain with state space S is a stochastic process (X n ) n∈N on some probability space (Ω, F, P ), such that P (X n+1 = y | X n = x) = p x,y for all x, y ∈ S. P = (p x,y ) is a Markov kernel on S.
Theorem 3.1 For every tupel (P, µ) of Markov kernel P on S and probability measure µ on S there exists a Markov Chain (X n ) n∈N of kernel P and starting measure µ = P (X 0 ∈ •).
We propose beneath a definition for stochastic 0L-systems on V based on the notion of Markov kernel.Definition 3.3 (Stochastic 0L-system) A stochastic 0L-system is a construct G = ω a , π where: • ω a ∈ W + is called an axiom.It represents the structure initiating the growth.
• π is a transition matrix from V to W .
Remark: Definition 3.3 is equivalent to the one given by Prusinkiewicz and Lindenmayer (1990): the set defines the production rules.It represents the possible evolutions for all letters throughout the L-system.
We can now define a more general class of L-systems called stochastic F0L-system, extending the classical definition of F0L-system (Rozenberg and Salomaa (1980), p. 89) to the stochastic case: Definition 3.4 (Stochastic F0L-system) A stochastic F0L-system is a construct G = A, π where: • A is a finite nonempty subset of V (called the set of axioms of G).
• for every ω a ∈ A, G[ω a ] = ω a , π is a stochastic 0L-system (called component system of G).
Proposition 3.1 Let P = (p x,y ) be the square matrix on W such that, for all and for all y ∈ W : Then, P is a Markov kernel on W .For a stochastic F0L-system A, π , P is called the associated Markov kernel.
Remark: To every component system G[ω a ] = ω a , π we associate a Markov chain (F n [ω a ]) n∈N via the associated kernel P of G and the starting measure µ = δ ωa where δ ωa is the Dirac measure concentrated on ω a .
Beside plant topological structure directly deduced from L-Systems, in order to compute plant functioning, the numbers of organs are crucial variables (see for example De Reffye et al. ( 2008)).To determine them, the order of symbols in words does not play any role, and we can consider the L-systems as commutative.Let R be an equivalence relation on W defined as follows: w 1 Rw 2 ⇔ there exists Π, a permutation on the symbol ranks, such that Π(w 1 ) = w 2 .Let us denote the quotient set W/R by W * .From now on, each word w ∈ W will be assimilated to the ordered representative w * of its equivalence class (i.e.wRw * and w In the following, the transition matrix π is thus considered as a map from V × W * into R.

Multitype branching processes and stochastic L-systems
Stochastic L-systems are closely related to multitype branching processes.Let us recall first the definition of a multitype branching process (Harris (1963), Athreya and Ney (2004)).Let us consider a population with m types of individuals.Assume a type i individual produces children of all types according to a probability distribution {P i (j) : Assume all individuals produce offspring independently of each other and of the past history of the process.Let Z n,i be the number of type i individuals in the n-th generation.Let {ξ n,i having distribution P i (.).Definition 3.5 (multitype Galton-Watson branching process) If the vector ) of population sizes in the n-th generation evolves by the recursive relation then (Z n ) n∈N is a multitype Galton-Watson branching process.
The j-th component of ξ n,i represents the number of type j individuals produced by the k-th type i individual in the n-th generation.The set {P i (.)} i∈{1,••• ,m} is called the offspring distribution.Considering this definition, we have the following theorem: )) n∈N is a multitype Galton-Watson branching process whose offspring distributions {P i (.)} i∈{1,••• ,m} are given by: m} be a set of probability distributions defined as follows: n,i having distribution P i (.).Thus, the q-th component of ξ n,i is the random variable that gives the number of type "v q " symbols in a word generated by a type "v i " symbol throughout G. Since there are we have: and then, N.B.: the converse is true.Giving a multitype Galton-Watson branching process, there is a stochastic L-system whose stochastic production function is entirely determined by the offspring distribution.
Let us now define the generating function associated to a stochastic 0L-system.Let S = (s Definition 3.6 (generating function associated to a stochastic 0L-system) Let G = A, π be a stochastic F0L-system on V = {v 1 , • • • , v m } with letters acting independently.Let G[ω a ] = ω a , π be a component system of G and (F n [ω a ]) n∈N the corresponding Markov chain.For n ∈ N, the generating function By using the classical composition of generating functions (Harris (1963)) for a multitype Galton-Watson branching process, we deduce directly the following theorem: π be a component system of G and and (F n [v]) n∈N the corresponding Markov chain.For all n ∈ N, let ψ n [v] be the generating function associated to 4 Application to stochastic plant development The GreenLab model of plant development is based on the botanical modelling concepts recalled in section 2. It can be described by a stochastic F0L-system G tot = V, π tot (Kang et al. (2007)).Let N ∈ N be the time during which we observe the growth of the plant.A bud b k,β n,φ is symbolized by s k,β φ and a of PA j by m j .V is the union of S={s k,β φ }} and M={m j : j ∈ {1, • • • , P }} where S is the set of buds (nonterminal elements) and M is the set of metamers (terminal elements).A dead bud will be represented by the empty word 1.
The seed can be considered as the initial bud and is thus represented by s 0,1 1 .We consider the component system G tot [s 0,1 1 ] and (F tot n [s 0,1 1 ]) n∈N the corresponding Markov chain.For n ∈ N, F tot n [s 0,1 1 ] is the random variable representing the possible realisations of a plant after n growth cycles.Our objective is to determine the generating function associated to F tot n [s 0,1 1 ] with Theorem 3.3.Thus, we only need to determine the generating function associated to F tot 1 [s 0,1 1 ].However, the development processes underlying GreenLab organogenesis are complex.The dynamics of the population of buds results from the combination of both branching and differentiation (see Section 2) and we do not have an easy access to the distribution of F tot 1 [s 0,1 1 ].Thus, the idea is to break down the whole system (branching + differentiation) into two subsystems and study them separately.This is the aim of Sections 4.1 (for branching) and 4.2 (for differentiation).In both cases, we write the corresponding generating function.Section 4.3 combines the two processes by using the results thus obtained and gives the recursive equations for the generating functions derivating from G tot .The recursive equations for the moments of the numbers of metamers are finally deduced.

Branching
In this section we focus on the GreenLab development model with only branching.The differentiation process is not taken into account.This section gives not only the generating function for the branching process but also a method to build it.The model can be represented by a stochastic F0L-system G br = V, π br .V is defined as in Section 4. For every component system G br [v] of G br , the corresponding Markov chain is denoted (F br n [v]) n∈N .In order to illustrate the branching process, let us take the example of coffee trees, plants with stochastic branching and no differentiation.They have two PAs (i.e.P=2).A metamer of PA 1 can bear a maximum of two lateral buds of PA 2 (i.e.B max 1,2 = 2).A metamer of PA 2 does not bear any lateral bud.At each growth cycle, there are three possible evolutions for a bud: • case 1: with φ ∈ {1, 2}, the bud is still alive but rests, .• case 3: the bud is active.In that case, it produces a new metamer with lateral buds (F br ).A probability of occurrence can be given for each of these possible evolutions.Let us now consider the general case.The aim is to write the generating function associated to G br .Let ψ br 1 [s k,β φ ](S, M ) .
Proof: Using Definition 3.6, we get the generating function by enumerating all the possible evolutions of s k,β φ .
• case 3: the rules are of type with probability of occurrence P l (φ)P a (φ) Therefore, by summing the three cases, we get the equation of Theorem 4.1.2

Differentiation
In this section we focus on the differentiation process, corresponding to the change of meristem's PA.The branching process is not taken into account.We proceed as in Section 4.1 and write the stochastic F0L-System G dif = V, π dif associated to the differentiation process in order to write the corresponding generating function.V is defined as in Section 4. For every component system ) n∈N .To illustrate differentiation, let us consider the example of maize.In standard cultivation conditions, maize is a mono-stem plant, that is to say without ramification.However, we can distinguish two types of metamers along the stem.The first ones are short and can potentially bear tillers.They are followed by longer phytomers after meristem differentiation.Finally, the meristem ends up by flowering, which terminates the differentiation sequence and the stem development, see Figure 3.The two types of metamers are characterized by two different PAs (P = 2).We assume that the terminal bud of an axis of CA k = 0 can not change its PA immediately.At each growth cycle, there are three possible evolutions for a bud through the differentiation process: • case 1: φ with φ ∈ {1, 2}, the bud does not differentiate.The PA of the terminal bud remains the same.
• case 2: 2 , the bud differentiates and is still alive.The PA of the terminal bud changes and is higher (cf Section 2) • case 3: , the bud differentiates and dies.The growth of the axis stops.
A probability of occurrence can be given for each of these evolutions.Let us now consider the general case.In order to write the generating function, we first have to identify the stochastic process underlying the differentiation of the terminal bud.
The aim is to study the evolution of the PA φ of a bud with respect to the CA k of the axis that bears the bud.Let b k,β n,φ be a bud of PA φ at growth cycle n.The axis that bears the bud is of CA k and was initiated by a bud of PA β.Then, we assume φ = φ(k).Moreover, we assume that the occupation time of a bud in the PA j follows an exponential law of parameter λ j where λ j is defined in Section 2. It is in keeping with the botanical theory of meristematic differentiation sequence as a sequence of events inducing plant morphological ageing.Therefore, we can represent the discrete process (φ(k)) k∈N by a finite state space Markov process (φ(t)) t∈R+ .
We choose to model the stochastic behaviour of (φ(t)) t∈R+ with multivariate phase-type random vectors, which are particularly used in reliability theory (Neuts (1975)).An interesting aspect of phase-type distributions is that they can be written in a closed form, and consequently, various quantities of interest can be evaluated with relative ease (Assaf et al. (1984)).Let (X(t)) t∈R+ be a right-continuous Markov process on a finite state space E = {1, • • • , P, ∆}.Assume 1, • • • , P are transient and ∆ is absorbing.
Then, the infinitesimal generator Q of size P + 1 by P + 1 has the form where A is a P × P matrix and e is a vector of size P with all its components set to 1. Let Γ 1 , • • • , Γ n be nonempty stochastically closed subsets of E for X.A subset Γ of E is said to be stochastically closed for X if once X enters Γ, it never leaves it.Let a be the starting measure on E with a(∆) = 0. Let α be the vector of size P defined by α = (a(1), • • • , a(P )).We define Definition 4.1 (multivariate phase-type random vector) If the following hypotheses are true: 1 - called a multivariate phase-type random vector with representation (α, A).
Proof: For all β ∈ {1, • • • , P }, we have to prove the three hypotheses: 1 -it is obvious that Neuts (1975), the absorption into d is certain if and only if A is non singular.This is true because det A = (−1) P P k=1 λ k = 0.

-given the only nonzero component of α
Then, the third hypothesis is true.According to Definition 4.1, (T β β+1 , ..., T β P +1 ) is a multivariate phase-type random vector with representation (α β , A).The second part of the proposition is obvious because Γ P +1 ⊂ Γ P ⊂ ... ⊂ Γ 1 .2 We will now determine the generating function derivating from G dif .The probability distribution of and all the probabilities are explicit functions of the matrix A: Proof: We will proceed as in Section 4.1.For the bud s k,β φ , we know that φ enters Γ φ before k − 1 but not Γ φ+1 .It explains why all the probabilities in Theorem 4.2 have the form P (.|D).There are 3 cases for bud differentiation: • F dif 1 [s k,β φ ] = s k,β φ : the bud does not change its PA.This is the case if φ β does not enter Γ k+1 before k.Therefore, the probability of occurrence is P (T β φ+1 > k + 1|D).
Let m j [s k,β φ , n] be the number of type j metamers in a structure initiated by s k,β φ after n growth cycles and let M n be the matrix of size (N + 1)P 2 by P whose j-th column vector is E m j [s k,β φ , n] k,β,φ .
Let e k be the vector of size k with all its components set to 1.We deduce the fundamental recursion equation for the expectations of the numbers of organs on each type of structures.It is important to note that the first line of the matrix corresponds to those of the whole plant.In the same way, we can give recursion formulas for the variance.The detailed results, as well as examples of simulations and biological discussions, are given in Loi and Cournède (2008).

Conclusion
While the interest of stochastic L-systems for plant growth simulation and visualization is broadly acknowledged, its full mathematical potential to characterize the probability distributions and moments of the numbers of organs in plant structure has not been taken advantage of.The need for an analytic computation of these distributions is crucial in stochastic functional-structural models in order to derive the distribution or the moments of biomass production (cf.Kang et al. (2008) for preliminary results).It has led us to clearly formalize the link between stochastic L-systems and multi-type branching processes, and thus to derive an inductive relationship to compute the associated generating functions.This framework was applied successfully to the GreenLab organogenesis model, by decomposing the development process into two botanical sub-processes, branching and meristem differentiation.For the latter, multivariate phase type random vectors were introduced to describe the stochastic sequence of meristem differentiation.From the inductive relationship on the generating functions of the numbers of organs, we can determine the inductive relationship giving the moments of the distributions.
The next step is the link of stochastic organogenesis and functioning models.It is possible to derive the approximate moments of biomass production thanks to differential statistics.An interesting issue concerns the modelling of the probability distributions governing organogenesis as functions of biomass production or groth rate (cf.Mathieu et al. (2006)).The stochastic processes resulting from these interactions between development and functioning should lead us to improve the proposed formalism.

Fig. 2 :
Fig. 2: (a) Production rules of coffee trees: phytomers and buds of PA 1 are in black and that of PA 2 are in gray.The cross symbol represents bud death.(b) An example of topology obtained after 10 growth cycles.(c) 3D representation.