This section of DMTCS is devoted to publishing original research from several domains covered by Volume B of the Handbook of Theoretical Computer Science (Elsevier Publisher). Our scope is suggested by the following list of keywords: automata theory, automata-theoretic complexity, automatic program verification, combinatorics of words, coding theory, concurrency, data bases, formal languages, functional programming, logic in computer science, logic programming, program specification, rewriting, semantics of programming languages, theorem proving.

Editors: Henning Fernau ; Juhani Eero Urho Karhumäki ; Klaus-Jörn Lange ; Andreas Maletti ; Anca Muscholl ; Daniel Reidenbach ; Howard Straubing ; Val Tannen

In this paper, we facilitate the reasoning about impure programming languages, by annotating terms with “decorations” that describe what computational (side) effect evaluation of a term may involve. In a point-free categorical language, called the “decorated logic”, we formalize the mutable state and the exception effects first separately, exploiting a nice duality between them, and then combined. The combined decorated logic is used as the target language for the denotational semantics of the IMP+Exc imperative programming language, and allows us to prove equivalences between programs written in IMP+Exc. The combined logic is encoded in Coq, and this encoding is used to certify some program equivalence proofs.

We introduce weighted regular tree grammars with storage as a combination of (a) regular tree grammars with storage and (b) weighted tree automata over multioperator monoids. Each weighted regular tree grammar with storage generates a weighted tree language, which is a mapping from the set of trees to the multioperator monoid. We prove that, for multioperator monoids canonically associated to particular strong bi-monoids, the support of the generated weighted tree languages can be generated by (unweighted) regular tree grammars with storage. We characterize the class of all generated weighted tree languages by the composition of three basic concepts. Moreover, we prove results on the elimination of chain rules and of finite storage types, and we characterize weighted regular tree grammars with storage by a new weighted MSO-logic.

We examine inkdots placed on the input string as a way of providing advice to finite automata, and establish the relations between this model and the previously studied models of advised finite automata. The existence of an infinite hierarchy of classes of languages that can be recognized with the help of increasing numbers of inkdots as advice is shown. The effects of different forms of advice on the succinctness of the advised machines are examined. We also study randomly placed inkdots as advice to probabilistic finite automata, and demonstrate the superiority of this model over its deterministic version. Even very slowly growing amounts of space can become a resource of meaningful use if the underlying advised model is extended with access to secondary memory, while it is famously known that such small amounts of space are not useful for unadvised one-way Turing machines.

We discuss cellular automata over arbitrary finitely generated groups. We call a cellular automaton post-surjective if for any pair of asymptotic configurations, every pre-image of one is asymptotic to a pre-image of the other. The well known dual concept is pre-injectivity: a cellular automaton is pre-injective if distinct asymptotic configurations have distinct images. We prove that pre-injective, post-surjective cellular automata are reversible. Moreover, on sofic groups, post-surjectivity alone implies reversibility. We also prove that reversible cellular automata over arbitrary groups are balanced, that is, they preserve the uniform measure on the configuration space.

A finite deterministic (semi)automaton A = (Q, Σ, δ) is k-compressible if there is some word w ∈ Σ⁺ such that the image of its state set Q under the natural action of w is reduced by at least k states. Such a word w, if it exists, is called a k-compressing word for A, and A is said to be k-compressed by w. A word is k-collapsing if it is k-compressing for each k-compressible automaton, and it is k-synchronizing if it is k-compressing for all k-compressible automata with k+1 states. We compute a set W of short words such that each 3-compressible automaton on a two-letter alphabet is 3-compressed by at least one word in W. Then we construct a shortest common superstring of the words in W and, with a further refinement, we obtain a 3-collapsing word of length 53. Moreover, as previously announced, we show that the shortest 3-synchronizing word is not 3-collapsing, illustrating the new bounds 34 ≤ c(2, 3) ≤ 53 for the length c(2, 3) of the shortest 3-collapsing word on a two-letter alphabet.
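
The definition of a k-compressing word can be checked mechanically on a small example. The following is only an illustrative sketch: the 4-state automaton, the names `image_size` and `is_k_compressing`, and the dictionary encoding of δ are assumptions made for this example and are not taken from the paper.

```python
def image_size(delta, states, word):
    """Return the number of states in the image of `states` under `word`."""
    current = set(states)
    for letter in word:
        current = {delta[(q, letter)] for q in current}
    return len(current)


def is_k_compressing(delta, states, word, k):
    """True if `word` shrinks the state set by at least k states."""
    return image_size(delta, states, word) <= len(states) - k


# Hypothetical 4-state automaton over {a, b} (an assumption for illustration):
# 'a' permutes the states cyclically, 'b' sends state 3 to 0 and fixes the rest.
states = [0, 1, 2, 3]
delta = {}
for q in states:
    delta[(q, "a")] = (q + 1) % 4
    delta[(q, "b")] = 0 if q == 3 else q

print(image_size(delta, states, "baaab"))               # image {0, 1}
print(is_k_compressing(delta, states, "baaabaaab", 3))  # image is a single state
```

In this toy automaton the word `baaab` is 2-compressing but not 3-compressing, while `baaabaaab` is 3-compressing.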

Codes with various kinds of decipherability, weaker than the usual unique decipherability, have been studied since multiset decipherability was introduced in the mid-1980s. We consider decipherability of directed figure codes, where directed figures are defined as labelled polyominoes with designated start and end points, equipped with a catenation operation that may use a merging function to resolve possible conflicts. This is one of the possible extensions generalizing words and variable-length codes to planar structures. Here, verification whether a given set is a code is no longer decidable in general. We study the decidability status of figure codes depending on catenation type (with or without a merging function), decipherability kind (unique, multiset, set or numeric) and code geometry (several classes determined by relative positions of start and end points of figures). We give decidability or undecidability proofs in all but two cases that remain open.

A right ideal (left ideal, two-sided ideal) is a non-empty language $L$ over an alphabet $\Sigma$ such that $L=L\Sigma^*$ ($L=\Sigma^*L$, $L=\Sigma^*L\Sigma^*$). Let $k=3$ for right ideals, 4 for left ideals and 5 for two-sided ideals. We show that there exist sequences ($L_n \mid n \ge k $) of right, left, and two-sided regular ideals, where $L_n$ has quotient complexity (state complexity) $n$, such that $L_n$ is most complex in its class under the following measures of complexity: the size of the syntactic semigroup, the quotient complexities of the left quotients of $L_n$, the number of atoms (intersections of complemented and uncomplemented left quotients), the quotient complexities of the atoms, and the quotient complexities of reversal, star, product (concatenation), and all binary boolean operations. In that sense, these ideals are "most complex" languages in their classes, or "universal witnesses" to the complexity of the various operations.

For a language $L$, we consider its cyclic closure, and more generally the language $C^{k}(L)$, which consists of all words obtained by partitioning words from $L$ into $k$ factors and permuting them. We prove that the classes of ET0L and EDT0L languages are closed under the operators $C^k$. This both sharpens and generalises Brandstädt's result that if $L$ is context-free then $C^{k}(L)$ is context-sensitive and not context-free in general for $k \geq 3$. We also show that the cyclic closure of an indexed language is indexed.
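
For a single word, the operator $C^k$ can be enumerated directly. The sketch below assumes that the $k$ factors are allowed to be empty (so that $C^2$ coincides with the cyclic closure); the function name `c_k` is hypothetical and the code illustrates only the definition, not the closure proofs.

```python
from itertools import combinations_with_replacement, permutations


def c_k(word, k):
    """All words obtained by cutting `word` into k (possibly empty)
    consecutive factors and concatenating them in any order."""
    n = len(word)
    result = set()
    # Choose k - 1 cut positions (repetitions allowed, giving empty factors).
    for cuts in combinations_with_replacement(range(n + 1), k - 1):
        bounds = (0, *cuts, n)
        factors = [word[bounds[i]:bounds[i + 1]] for i in range(k)]
        for order in permutations(factors):
            result.add("".join(order))
    return result


print(sorted(c_k("abc", 2)))  # the conjugates of "abc"
print(sorted(c_k("abc", 3)))  # every arrangement of a, b, c
```

Applying `c_k` to each word of a finite language and taking the union gives $C^k(L)$ for that language.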

We consider implicit signatures over finite semigroups determined by sets of pseudonatural numbers. We prove that, under relatively simple hypotheses on a pseudovariety V of semigroups, the finitely generated free algebra for the largest such signature is closed under taking factors within the free pro-V semigroup on the same set of generators. Furthermore, we show that the natural analogue of the Pin-Reutenauer descriptive procedure for the closure of a rational language in the free group with respect to the profinite topology holds for the pseudovariety of all finite semigroups. As an application, we establish that a pseudovariety enjoys this property if and only if it is full.

We consider three problems related to dynamics of one-tape Turing machines: Existence of blocking configurations, surjectivity in the trace, and entropy positiveness. In order to address them, a reversible two-counter machine is simulated by a reversible Turing machine on the right side of its tape. By completing the machine in different ways, we prove that none of the former problems is decidable. In particular, the problems about blocking configurations and entropy are shown to be undecidable for the class of reversible Turing machines.

Promise problems were mainly studied in quantum automata theory. Here we focus on state complexity of classical automata for promise problems. First, it was known that there is a family of unary promise problems solvable by quantum automata by using a single qubit, but the number of states required by corresponding one-way deterministic automata cannot be bounded by a constant. For this family, we show that even two-way nondeterminism does not help to save a single state. By comparing this with the corresponding state complexity of alternating machines, we then get a tight exponential gap between two-way nondeterministic and one-way alternating automata solving unary promise problems. Second, despite the existing quadratic gap between Las Vegas realtime probabilistic automata and one-way deterministic automata for language recognition, we show that, by turning to promise problems, the tight gap becomes exponential. Last, we show that the situation is different for one-way […]

This paper deals with the calculation of the Hausdorff measure of regular ω-languages, that is, subsets of the Cantor space definable by finite automata. Using methods for decomposing regular ω-languages into disjoint unions of parts of simple structure, we derive two sufficient conditions under which ω-languages with a closure definable by a finite automaton have the same Hausdorff measure as this closure. The first of these conditions is related to the homogeneity of the local behaviour of the Hausdorff dimension of the underlying set, and the other to a certain topological density of the set in its closure.

We provide a counterexample to a lemma used in a recent tentative improvement of the Pin-Frankl bound for synchronizing automata. This example naturally leads us to formulate an open question, whose answer could fix the line of the proof, and improve the bound.

First, we close the multi-parameter analysis of a canonical problem concerning short reset words (SYN) initiated by Fernau et al. (2013). Namely, we prove that the problem, parameterized by the number of states, does not admit a polynomial kernel unless the polynomial hierarchy collapses. Second, we consider a related canonical problem concerning synchronizing road colorings (SRCP). Here we give a similar complete multi-parameter analysis. Namely, we show that the problem, parameterized by the number of states, admits a polynomial kernel and we close the previous research of restrictions to particular values of both the alphabet size and the maximum length of a reset word.

An S-adic characterization of minimal subshifts with first difference of complexity 1 ≤ p(n + 1) − p(n) ≤ 2. S. Ferenczi proved that any minimal subshift with first difference of complexity bounded by 2 is S-adic with Card(S) ≤ 3^27. In this paper, we improve this result by giving an S-adic characterization of these subshifts with a set S of 5 morphisms, thereby solving the S-adic conjecture for this particular case.

The implicit signature κ consists of the multiplication and the (ω-1)-power. We describe a procedure to transform each κ-term over a finite alphabet A into a certain canonical form and show that different canonical forms have different interpretations over some finite semigroup. The procedure of construction of the canonical forms, which is inspired by McCammond's normal form algorithm for ω-terms interpreted over the pseudovariety A of all finite aperiodic semigroups, consists in applying elementary changes determined by an elementary set Σ of pseudoidentities. As an application, we deduce that the variety of κ-semigroups generated by the pseudovariety S of all finite semigroups is defined by the set Σ and that the free κ-semigroup generated by the alphabet A in that variety has decidable word problem. Furthermore, we show that each ω-term has a unique ω-term in canonical form with the same value over A. In particular, the canonical forms provide new, simpler, […]

Černý's conjecture states that for every synchronizing automaton with n states there exists a reset word of length not exceeding (n − 1)^2. We prove this conjecture for a class of automata preserving certain properties of intervals of a directed graph. Our result unifies and generalizes some earlier results obtained by other authors.
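
The bound (n − 1)^2 can be observed on small instances by a breadth-first search over subsets of states. The sketch below is a generic brute-force search, not the method of the paper; the function names are hypothetical, and the automaton used is the classical Černý automaton (one letter acting as a cyclic shift, the other merging a single pair of states), for which the shortest reset word is known to have length exactly (n − 1)^2.

```python
from collections import deque


def cerny_automaton(n):
    """Transition table of the Černý automaton with n states over {a, b}."""
    delta = {}
    for q in range(n):
        delta[(q, "a")] = (q + 1) % n             # cyclic shift
        delta[(q, "b")] = 0 if q == n - 1 else q  # merges state n-1 into 0
    return delta


def shortest_reset_word(n_states, alphabet, delta):
    """Breadth-first search in the power-set automaton.
    Returns a shortest reset word, or None if the automaton is not synchronizing."""
    start = frozenset(range(n_states))
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        subset, word = queue.popleft()
        if len(subset) == 1:
            return word
        for letter in alphabet:
            image = frozenset(delta[(q, letter)] for q in subset)
            if image not in seen:
                seen.add(image)
                queue.append((image, word + letter))
    return None


for n in (3, 4, 5):
    word = shortest_reset_word(n, "ab", cerny_automaton(n))
    print(n, len(word), (n - 1) ** 2)
```

The search is exponential in the number of states, so it is suitable only for small examples.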

One of the first and most famous results of cellular automata theory, Moore's Garden-of-Eden theorem, has been proven to hold if and only if the underlying group possesses the measure-theoretic properties suggested by von Neumann to be the obstacle to the Banach-Tarski paradox. We show that several other results from the literature, already known to characterize surjective cellular automata in dimension d, hold precisely when the Garden-of-Eden theorem does. We focus in particular on the balancedness theorem, which has been proven by Bartholdi to fail on non-amenable groups, and we measure the amount of such failure.

Our aim is to construct a finite automaton recognizing the set of words that are at a bounded distance from some word of a given regular language. We define new regular operators, the similarity operators, based on a generalization of the notion of distance, and we introduce the family of regular expressions extended to similarity operators, which we call AREs (Approximate Regular Expressions). We set formulae to compute the Brzozowski derivatives and the Antimirov derivatives of an ARE, which allows us to give a solution to the ARE membership problem and to provide the construction of two recognizers for the language denoted by an ARE. As far as we know, the family of approximate regular expressions is introduced for the first time in this paper. Classical approximate regular expression matching algorithms are approximate matching algorithms on regular expressions. Our approach is rather to process an exact matching on approximate regular expressions.

The paper presents a condition necessarily satisfied by (tiling system) recognizable two-dimensional languages. The new recognizability condition is compared with all the other ones known in the literature (namely three conditions), once they are put in a uniform setting: they are stated as bounds on the growth of some complexity functions defined for two-dimensional languages. The gaps between such functions are analyzed and examples are shown that asymptotically separate them. Finally, the new recognizability condition turns out to be the strongest one, while the remaining ones are its particular cases. The problem of deciding whether a two-dimensional language is recognizable is here related to the one of estimating the minimal size of finite automata recognizing a sequence of (one-dimensional) string languages.

If L is a language, the automaticity function A_L(n) (resp. N_L(n)) of L counts the number of states of a smallest deterministic (resp. non-deterministic) finite automaton that accepts a language that agrees with L on all inputs of length at most n. We provide bounds for the automaticity of the language of primitive words and the language of unbordered words over a k-letter alphabet. We also give a bound for the automaticity of the language of base-b representations of the irreducible polynomials over a finite field. This latter result is analogous to a result of Shallit concerning the base-k representations of the set of prime numbers.

We investigate structural complexity measures on digraphs, in particular the cycle rank. This concept is intimately related to a classical topic in formal language theory, namely the star height of regular languages. We explore this connection, and obtain several new algorithmic insights regarding both cycle rank and star height. Among other results, we show that computing the cycle rank is NP-complete, even for sparse digraphs of maximum outdegree 2. Notwithstanding, we provide both a polynomial-time approximation algorithm and an exponential-time exact algorithm for this problem. The former algorithm yields an O((log n)^(3/2))-approximation in polynomial time, whereas the latter yields the optimum solution, and runs in time and space O*(1.9129^n) on digraphs of maximum outdegree at most two. Regarding the star height problem, we identify a subclass of the regular languages for which we can precisely determine the computational complexity of the star height problem. Namely, the star […]
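
For readers unfamiliar with the cycle rank, the standard recursive definition can be turned into a naive program: an acyclic digraph has rank 0, a strongly connected digraph containing a cycle has rank one plus the minimum rank obtained by deleting a vertex, and in general the rank is the maximum over the strongly connected components. The sketch below follows this definition literally with memoization; it is not the exact algorithm of the paper and makes no claim about matching its running time. The adjacency-dictionary encoding and all names are assumptions.

```python
from functools import lru_cache


def cycle_rank(graph):
    """Cycle rank of a digraph given as {vertex: list of successors}."""
    vertices = frozenset(graph) | frozenset(
        v for successors in graph.values() for v in successors)
    reverse = {}
    for u, successors in graph.items():
        for v in successors:
            reverse.setdefault(v, []).append(u)

    def reach(start, allowed, edges):
        # Vertices reachable from `start` using only vertices in `allowed`.
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in edges.get(u, ()):
                if v in allowed and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    def components(allowed):
        # Strongly connected components of the subgraph induced by `allowed`.
        remaining, comps = set(allowed), []
        while remaining:
            v = next(iter(remaining))
            comp = reach(v, remaining, graph) & reach(v, remaining, reverse)
            comps.append(frozenset(comp))
            remaining -= comp
        return comps

    @lru_cache(maxsize=None)
    def rank(allowed):
        best = 0
        for comp in components(allowed):
            has_edge = any(v in comp for u in comp for v in graph.get(u, ()))
            if has_edge:  # the component contains a cycle
                best = max(best, 1 + min(rank(comp - {v}) for v in comp))
        return best

    return rank(vertices)


print(cycle_rank({0: [1], 1: [2], 2: [0]}))           # a directed 3-cycle
print(cycle_rank({0: [1, 2], 1: [0, 2], 2: [0, 1]}))  # complete digraph on 3 vertices
```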

The join of two varieties is the smallest variety containing both. In finite semigroup theory, the varieties of R-trivial and L-trivial monoids are two of the most prominent classes of finite monoids. Their join is known to be decidable due to a result of Almeida and Azevedo. In this paper, we give a new proof for Almeida and Azevedo's effective characterization of the join of R-trivial and L-trivial monoids. This characterization is a single identity of omega-terms using three variables.

Let $T$ be a monadic second-order class of finite trees, and let $T(x)$ be its (ordinary) generating function, with radius of convergence $\rho$. If $\rho \ge 1$ then $T$ has an explicit specification (without using recursion) in terms of the operations of union, sum, stack, and the multiset operators $n$ and $(\ge n)$. Using this, one has an explicit expression for $T(x)$ in terms of the initial functions $x$ and $x \cdot (1 - x^n)^{-1}$, the operations of addition and multiplication, and the Pólya exponentiation operators $E_n$, $E_{\ge n}$. Let $F$ be a monadic second-order class of finite forests, and let $F(x) = \sum_n f(n) x^n$ be its (ordinary) generating function. Suppose $F$ is closed under extraction of component trees and sums of forests. Using the above-mentioned structure theory for the class $T$ of trees in $F$, Compton's theory of 0-1 laws, and a significantly strengthened version of 2003 results of Bell and Burris on generating functions, we show that $F$ has a monadic second-order 0-1 law iff the radius […]

We study expansions in non-integer negative base −β introduced by Ito and Sadahiro. Using countable automata associated with (−β)-expansions, we characterize the case where the (−β)-shift is a system of finite type. We prove that, if β is a Pisot number, then the (−β)-shift is a sofic system. In that case, addition (and more generally normalization on any alphabet) is realizable by a finite transducer. We then give an on-line algorithm for the conversion from positive base β to negative base −β. When β is a Pisot number, the conversion can be realized by a finite on-line transducer.

We construct infinite cubefree binary words containing exponentially many distinct squares of length n. We also show that for every positive integer n, there is a cubefree binary square of length 2n.
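
The second statement can be explored by brute force for small n. The sketch below only searches exhaustively and does not reproduce the construction of the paper; the function names are hypothetical.

```python
from itertools import product


def is_cubefree(word):
    """True if `word` contains no factor of the form xxx with x non-empty."""
    n = len(word)
    for length in range(1, n // 3 + 1):
        for start in range(n - 3 * length + 1):
            block = word[start:start + length]
            if (word[start + length:start + 2 * length] == block
                    and word[start + 2 * length:start + 3 * length] == block):
                return False
    return True


def find_cubefree_square(half_length):
    """Return the first binary word uu with |u| = half_length that is
    cubefree (in lexicographic order of u), or None if the search fails."""
    for letters in product("01", repeat=half_length):
        u = "".join(letters)
        if is_cubefree(u + u):
            return u + u
    return None


for n in range(1, 7):
    print(n, find_cubefree_square(n))
```

The search takes time exponential in n, so it is meant only as a sanity check on short lengths.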

We characterize the relations which are first-order definable in the model of the group of integers with the constant 1. This allows us to show that given a relation defined by a first-order formula in this model enriched with the usual ordering, it is recursively decidable whether or not it is first-order definable without the ordering.

We simplify the known formula for the asymptotic estimate of the number of deterministic and accessible automata with n states over a k-letter alphabet. The proof relies on the theory of Lagrange inversion applied in the context of generalized binomial series.

Two-dimensional structures of various kinds can be viewed as generalizations of words. Codicity verification and the defect effect, important properties related to word codes, are studied also in this context. Unfortunately, both are lost in the case of two common structures, polyominoes and figures. We consider directed figures defined as labelled polyominoes with designated start and end points, equipped with a catenation operation that uses a merging function to resolve possible conflicts. We prove that in this setting verification whether a given finite set of directed figures is a code is decidable and we give a constructive algorithm. We also clarify the status of the defect effect for directed figures.

Given a word w over a finite alphabet Σ and a finite deterministic automaton A = ⟨Q, Σ, δ⟩, the inequality |δ(Q, w)| ≤ |Q| − k means that under the natural action of the word w the image of the state set Q is reduced by at least k states. The word w is k-collapsing (k-synchronizing) if this inequality holds for any deterministic finite automaton (with k + 1 states) that satisfies such an inequality for at least one word. We prove that for each alphabet Σ there is a 2-collapsing word whose length is (|Σ|^3 + 6|Σ|^2 + 5|Σ|)/2. Then we produce shorter 2-collapsing and 2-synchronizing words over alphabets of 4 and 5 letters.

We consider relational periods where the relation is a compatibility relation on words induced by a relation on letters. We introduce three types of periods, namely global, external and local relational periods, and we compare their properties by proving variants of the theorem of Fine and Wilf for these periods.

We consider subshifts of the full shift of all binary bi-infinite sequences. On the one hand, the topological entropy of any subshift with computably co-enumerable language is a right-computable real number between 0 and 1. We show that, on the other hand, any right-computable real number between 0 and 1, whether computable or not, is the entropy of some subshift whose language is even decidable in polynomial time. In addition, we show that computability of the entropy of a subshift does not imply any kind of computability of the language of the subshift.

Fekete's lemma is a well-known combinatorial result on number sequences: we extend it to functions defined on d-tuples of integers. As an application of the new variant, we show that nonsurjective d-dimensional cellular automata are characterized by loss of arbitrarily much information on finite supports, at a growth rate greater than that of the support's boundary determined by the automaton's neighbourhood index.

In 1973, V. Virkkunen proved that propagating scattered context grammars which use leftmost derivations are as powerful as context-sensitive grammars. This paper presents a significantly simplified proof of this result.

For certain generalized Thue-Morse words t, we compute the critical exponent, i.e., the supremum of the set of rational numbers that are exponents of powers in t, and determine exactly the occurrences of powers realizing it.

We define a morphism based upon a Latin square that generalizes the Thue-Morse morphism. We prove that fixed points of this morphism are overlap-free sequences, generalizing results of Allouche–Shallit and Frid.
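
A small experiment can make the statement concrete. The sketch below assumes that the morphism sends each letter i to row i of the Latin square, which recovers the Thue-Morse morphism 0 → 01, 1 → 10 for the 2 × 2 square; this reading of the abstract is an assumption, and all names are hypothetical. The code computes a prefix of a fixed point and tests it for overlaps, that is, factors of the form axaxa with a a letter and x a possibly empty word.

```python
def latin_square_morphism(square):
    """Map letter i to row i of the square (assumed form of the morphism)."""
    return {i: tuple(row) for i, row in enumerate(square)}


def fixed_point_prefix(morphism, start, length):
    """Prefix of the fixed point obtained by iterating from `start`.
    Requires morphism[start] to begin with `start`."""
    word = [start]
    while len(word) < length:
        word = [c for letter in word for c in morphism[letter]]
    return word[:length]


def has_overlap(word):
    """True if `word` has a factor of length 2p + 1 with period p >= 1."""
    n = len(word)
    for period in range(1, n):
        for start in range(n - 2 * period):
            if all(word[i] == word[i + period]
                   for i in range(start, start + period + 1)):
                return True
    return False


thue_morse = latin_square_morphism([[0, 1], [1, 0]])
prefix = fixed_point_prefix(thue_morse, 0, 64)
print(prefix[:16])
print(has_overlap(prefix))
```

Other squares, for example the cyclic 3 × 3 square `[[0, 1, 2], [1, 2, 0], [2, 0, 1]]`, can be passed to the same functions to experiment with longer alphabets.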