Search | arXiv e-print repository

Lexicographic transductions of finite words

Authors: Emmanuel Filiot, Pierre-Alain Reynier, Nathan Lhote

Abstract: Regular transductions over finite words have linear input-to-output growth. This class of transductions enjoys many characterizations. Recently, regular transductions have been extended by Bojańczyk to polyregular transductions, which have polynomial growth, and are characterized by pebble transducers and MSO interpretations. Another class of interest is that of transductions defined by streaming… ▽ More Regular transductions over finite words have linear input-to-output growth. This class of transductions enjoys many characterizations. Recently, regular transductions have been extended by Bojańczyk to polyregular transductions, which have polynomial growth, and are characterized by pebble transducers and MSO interpretations. Another class of interest is that of transductions defined by streaming string transducers or marble transducers, which have exponential growth and are incomparable with polyregular transductions. In this paper, we consider MSO set interpretations (MSOSI) over finite words which were introduced by Colcombet and Loeding. MSOSI are a natural candidate for the class of "regular transductions with exponential growth", and are rather well-behaved. However MSOSI lack, for now, two desirable properties that regular and polyregular transductions have. The first property is being described by an automaton model, which is closely related to the second property of regularity preserving meaning preserving regular languages under inverse image. We first show that if MSOSI are (effectively) regularity preserving then any automatic $ω$-word has a decidable MSO theory, an almost 20 years old conjecture of Bárány. Our main contribution is the introduction of a class of transductions of exponential growth, which we call lexicographic transductions. We provide three different presentations for this class: 1) as the closure of simple transductions (recognizable transductions) under a single operator called maplex; 2) as a syntactic fragment of MSOSI (but the regular languages are given by automata instead of formulas); 3) we give an automaton based model called nested marble transducers, which generalize both marble transducers and pebble transducers. We show that this class enjoys many nice properties including being regularity preserving. △ Less

Submitted 3 March, 2025; originally announced March 2025.

ACM Class: F.1.1; F.4.3

arXiv:2501.10270 [pdf, other]

The structure of polynomial growth for tree automata/transducers and MSO set queries

Authors: Paul Gallot, Nathan Lhote, Lê Thành Dũng Nguyên

Abstract: Given an $\mathbb{N}$-weighted tree automaton, we give a decision procedure for exponential vs polynomial growth (with respect to the input size) in quadratic time, and an algorithm that computes the exact polynomial degree of growth in cubic time. As a special case, they apply to the growth of the ambiguity of a nondeterministic tree automaton, i.e. the number of distinct accepting runs over a gi… ▽ More Given an $\mathbb{N}$-weighted tree automaton, we give a decision procedure for exponential vs polynomial growth (with respect to the input size) in quadratic time, and an algorithm that computes the exact polynomial degree of growth in cubic time. As a special case, they apply to the growth of the ambiguity of a nondeterministic tree automaton, i.e. the number of distinct accepting runs over a given input. Our time complexities match the recent fine-grained lower bounds for these problems restricted to ambiguity of word automata. We deduce analogous decidability results (ignoring complexity) for the growth of the number of results of set queries in Monadic Second-Order logic (MSO) over ranked trees. In the case of polynomial growth of degree $k$, we also prove a reparameterization theorem for such queries: their results can be mapped to $k$-tuples of input nodes in a finite-to-one and MSO-definable fashion. This property of MSO set queries leads directly to a generalization of the dimension minimization theorem for string-to-string polyregular functions. Our generalization applies to MSO set interpretations from trees, which subsume (as we show) tree-walking tree transducers and invisible pebble tree-to-string transducers. Finally, with a bit more work we obtain the following: * a new, short and conceptual proof that macro tree transducers (MTTs) of linear growth compute only tree-to-tree MSO transductions; * a procedure to decide polynomial size-to-height increase for MTTs and compute the degree. The paper concludes with a survey of a wide range of related work, with over a hundred references. △ Less

Submitted 12 March, 2025; v1 submitted 17 January, 2025; originally announced January 2025.

Comments: 39 pages, meant for TheoretiCS submission. New in v2: precise complexity bounds, additional references in related work, readability improvements

arXiv:2501.07428 [pdf, ps, other]

Well-Quasi-Orderings on Word Languages

Authors: Nathan Lhote, Aliaume Lopez, Lia Schütze

Abstract: The set of finite words over a well-quasi-ordered set is itself well-quasi-ordered. This seminal result by Higman is a cornerstone of the theory of well-quasi-orderings and has found numerous applications in computer science. However, this result is based on a specific choice of ordering on words, the (scattered) subword ordering. In this paper, we describe to what extent other natural orderings (… ▽ More The set of finite words over a well-quasi-ordered set is itself well-quasi-ordered. This seminal result by Higman is a cornerstone of the theory of well-quasi-orderings and has found numerous applications in computer science. However, this result is based on a specific choice of ordering on words, the (scattered) subword ordering. In this paper, we describe to what extent other natural orderings (prefix, suffix, and infix) on words can be used to derive Higman-like theorems. More specifically, we are interested in characterizing languages of words that are well-quasi-ordered under these orderings. We show that a simple characterization is possible for the prefix and suffix orderings, and that under extra regularity assumptions, this also extends to the infix ordering. We furthermore provide decision procedures for a large class of languages, that contains regular and context-free languages. △ Less

Submitted 13 January, 2025; originally announced January 2025.

MSC Class: 68Q45 ACM Class: F.4.3; F.4.2

arXiv:2307.13505 [pdf, other]

Minimizing Cost Register Automata over a Field

Authors: Yahia Idriss Benalioua, Nathan Lhote, Pierre-Alain Reynier

Abstract: Weighted automata (WA) are an extension of finite automata that define functions from words to values in a given semiring. An alternative deterministic model, called Cost Register Automata (CRA), was introduced by Alur et al. It enriches deterministic finite automata with a finite number of registers, which store values, updated at each transition using the operations of the semiring. It is known… ▽ More Weighted automata (WA) are an extension of finite automata that define functions from words to values in a given semiring. An alternative deterministic model, called Cost Register Automata (CRA), was introduced by Alur et al. It enriches deterministic finite automata with a finite number of registers, which store values, updated at each transition using the operations of the semiring. It is known that CRA with register updates defined by linear maps have the same expressiveness as WA. Previous works have studied the register minimization problem: given a function computable by a WA and an integer k, is it possible to realize it using a CRA with at most k registers? In this paper, we solve this problem for CRA over a field with linear register updates, using the notion of linear hull, an algebraic invariant of WA introduced recently by Bell and Smertnig. We then generalise the approach to solve a more challenging problem, that consists in minimizing simultaneously the number of states and that of registers. In addition, we also lift our results to the setting of CRA with affine updates. Last, while the linear hull was recently shown to be computable by Bell and Smertnig, no complexity bounds were given. To fill this gap, we provide two new algorithms to compute invariants of WA. This allows us to show that the register (resp. state-register) minimization problem can be solved in 2-ExpTime (resp. in NExpTime). △ Less

Submitted 28 June, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

arXiv:2110.12395 [pdf, other]

Weighted Automata and Expressions over Pre-Rational Monoids

Authors: Nicolas Baudru, Louis-Marie Dando, Nathan Lhote, Benjamin Monmege, Pierre-Alain Reynier, Jean-Marc Talbot

Abstract: The Kleene theorem establishes a fundamental link between automata and expressions over the free monoid. Numerous generalisations of this result exist in the literature. Lifting this result to a weighted setting has been widely studied. Moreover, different monoids can be considered: for instance, two-way automata, and even tree-walking automata, can be described by expressions using the free inver… ▽ More The Kleene theorem establishes a fundamental link between automata and expressions over the free monoid. Numerous generalisations of this result exist in the literature. Lifting this result to a weighted setting has been widely studied. Moreover, different monoids can be considered: for instance, two-way automata, and even tree-walking automata, can be described by expressions using the free inverse monoid. In the present work, we aim at combining both research directions and consider weighted extensions of automata and expressions over a class of monoids that we call pre-rational, generalising both the free inverse monoid and graded monoids. The presence of idempotent elements in these pre-rational monoids leads in the weighted setting to consider infinite sums. To handle such sums, we will have to restrict ourselves to rationally additive semirings. Our main result is thus a generalisation of the Kleene theorem for pre-rational monoids and rationally additive semirings. As a corollary, we obtain a class of expressions equivalent to weighted two-way automata, as well as one for tree-walking automata. △ Less

Submitted 24 October, 2021; originally announced October 2021.

arXiv:2101.07038 [pdf, other]

doi 10.46298/lmcs-18(3:9)2022

Computability of Data-Word Transductions over Different Data Domains

Authors: Léo Exibard, Emmanuel Filiot, Nathan Lhote, Pierre-Alain Reynier

Abstract: In this paper, we investigate the problem of synthesizing computable functions of infinite words over an infinite alphabet (data $ω$-words). The notion of computability is defined through Turing machines with infinite inputs which can produce the corresponding infinite outputs in the limit. We use non-deterministic transducers equipped with registers, an extension of register automata with outputs… ▽ More In this paper, we investigate the problem of synthesizing computable functions of infinite words over an infinite alphabet (data $ω$-words). The notion of computability is defined through Turing machines with infinite inputs which can produce the corresponding infinite outputs in the limit. We use non-deterministic transducers equipped with registers, an extension of register automata with outputs, to describe specifications. Being non-deterministic, such transducers may not define functions but more generally relations of data $ω$-words. In order to increase the expressive power of these machines, we even allow guessing of arbitrary data values when updating their registers. For functions over data $ω$-words, we identify a sufficient condition (the possibility of determining the next letter to be outputted, which we call next letter problem) under which computability (resp. uniform computability) and continuity (resp. uniform continuity) coincide. We focus on two kinds of data domains: first, the general setting of oligomorphic data, which encompasses any data domain with equality, as well as the setting of rational numbers with linear order; and second, the set of natural numbers equipped with linear order. For both settings, we prove that functionality, i.e. determining whether the relation recognized by the transducer is actually a function, is decidable. We also show that the so-called next letter problem is decidable, yielding equivalence between (uniform) continuity and (uniform) computability. Last, we provide characterizations of (uniform) continuity, which allow us to prove that these notions, and thus also (uniform) computability, are decidable. We even show that all these decision problems are PSpace-complete for $(\mathbb{N},<)$ and for a large class of oligomorphic data domains, including for instance $(\mathbb{Q},<)$. △ Less

Submitted 28 July, 2022; v1 submitted 18 January, 2021; originally announced January 2021.

Journal ref: Logical Methods in Computer Science, Volume 18, Issue 3 (July 29, 2022) lmcs:7104

arXiv:2006.16645

doi 10.1145/3373718.3394804

Pebble Minimization of Polyregular Functions

Authors: Nathan Lhote

Abstract: We show that a polyregular word-to-word function is regular if and only if its output size is at most linear in its input size. Moreover a polyregular function can be realized by: a transducer with two pebbles if and only if its output has quadratic size in its input, a transducer with three pebbles if and only if its output has cubic size in its input, etc. Moreover the characterization is decida… ▽ More We show that a polyregular word-to-word function is regular if and only if its output size is at most linear in its input size. Moreover a polyregular function can be realized by: a transducer with two pebbles if and only if its output has quadratic size in its input, a transducer with three pebbles if and only if its output has cubic size in its input, etc. Moreover the characterization is decidable and, given a polyregular function, one can compute a transducer realizing it with the minimal number of pebbles. We apply the result to mso interpretations from words to words. We show that mso interpretations of dimension k exactly coincide with k-pebble transductions. △ Less

Submitted 15 February, 2023; v1 submitted 30 June, 2020; originally announced June 2020.

Comments: The main result of the article is false. Counterexamples and more can be found here: arXiv:2301.09234

arXiv:1910.06019 [pdf, other]

Equivalence kernels of sequential functions and sequential observation synthesis

Authors: Paulin Fournier, Nathan Lhote

Abstract: We show that one can decide if a rational equivalence relation can be given as the equivalence kernel of a sequential letter-to-letter transduction. This problem comes from the setting of games with imperfect information. In [1, p. 6] the authors propose to model imperfect information by a rational equivalence relation and leave open the problem of deciding if one can synthesize a sequential lette… ▽ More We show that one can decide if a rational equivalence relation can be given as the equivalence kernel of a sequential letter-to-letter transduction. This problem comes from the setting of games with imperfect information. In [1, p. 6] the authors propose to model imperfect information by a rational equivalence relation and leave open the problem of deciding if one can synthesize a sequential letter-to-letter transducer (Mealy machine) which maps equivalent histories to the same sequence of observations. We also show that knowing if an equivalence relation can be given as the equivalence kernel of a sequential transducer is undecidable, even if the relation is given as a letter-to-letter transducer. △ Less

Submitted 14 October, 2019; originally announced October 2019.

arXiv:1908.03890 [pdf, other]

A Robust Class of Linear Recurrence Sequences

Authors: Corentin Barloy, Nathanaël Fijalkow, Nathan Lhote, Filip Mazowiecki

Abstract: We introduce a subclass of linear recurrence sequences which we call poly-rational sequences because they are denoted by rational expressions closed under sum and product. We show that this class is robust by giving several characterisations: polynomially ambiguous weighted automata, copyless cost-register automata, rational formal series, and linear recurrence sequences whose eigenvalues are root… ▽ More We introduce a subclass of linear recurrence sequences which we call poly-rational sequences because they are denoted by rational expressions closed under sum and product. We show that this class is robust by giving several characterisations: polynomially ambiguous weighted automata, copyless cost-register automata, rational formal series, and linear recurrence sequences whose eigenvalues are roots of rational numbers. △ Less

Submitted 11 August, 2019; originally announced August 2019.

arXiv:1906.04199 [pdf, other]

doi 10.46298/lmcs-18(2:23)2022

Synthesis of Computable Regular Functions of Infinite Words

Authors: V. Dave, E. Filiot, S. Krishna, N. Lhote

Abstract: Regular functions from infinite words to infinite words can be equivalently specified by MSO-transducers, streaming $ω$-string transducers as well as deterministic two-way transducers with look-ahead. In their one-way restriction, the latter transducers define the class of rational functions. Even though regular functions are robustly characterised by several finite-state devices, even the subclas… ▽ More Regular functions from infinite words to infinite words can be equivalently specified by MSO-transducers, streaming $ω$-string transducers as well as deterministic two-way transducers with look-ahead. In their one-way restriction, the latter transducers define the class of rational functions. Even though regular functions are robustly characterised by several finite-state devices, even the subclass of rational functions may contain functions which are not computable (by a Turing machine with infinite input). This paper proposes a decision procedure for the following synthesis problem: given a regular function $f$ (equivalently specified by one of the aforementioned transducer model), is $f$ computable and if it is, synthesize a Turing machine computing it. For regular functions, we show that computability is equivalent to continuity, and therefore the problem boils down to deciding continuity. We establish a generic characterisation of continuity for functions preserving regular languages under inverse image (such as regular functions). We exploit this characterisation to show the decidability of continuity (and hence computability) of rational and regular functions. For rational functions, we show that this can be done in $\mathsf{NLogSpace}$ (it was already known to be in $\mathsf{PTime}$ by Prieur). In a similar fashion, we also effectively characterise uniform continuity of regular functions, and relate it to the notion of uniform computability, which offers stronger efficiency guarantees. △ Less

Submitted 18 September, 2024; v1 submitted 15 May, 2019; originally announced June 2019.

Journal ref: Logical Methods in Computer Science, Volume 18, Issue 2 (June 29, 2022) lmcs:7592

arXiv:1905.13190 [pdf, other]

String-to-String Interpretations with Polynomial-Size Output

Authors: Mikołaj Bojańczyk, Sandra Kiefer, Nathan Lhote

Abstract: String-to-string MSO interpretations are like Courcelle's MSO transductions, except that a single output position can be represented using a tuple of input positions instead of just a single input position. In particular, the output length is polynomial in the input length, as opposed to MSO transductions, which have output of linear length. We show that string-to-string MSO interpretations are ex… ▽ More String-to-string MSO interpretations are like Courcelle's MSO transductions, except that a single output position can be represented using a tuple of input positions instead of just a single input position. In particular, the output length is polynomial in the input length, as opposed to MSO transductions, which have output of linear length. We show that string-to-string MSO interpretations are exactly the polyregular functions. The latter class has various characterizations, one of which is that it consists of the string-to-string functions recognized by pebble transducers. Our main result implies the surprising fact that string-to-string MSO interpretations are closed under composition. △ Less

Submitted 30 May, 2019; originally announced May 2019.

Comments: 35 pages, full version of a paper accepted for the 46th International Colloquium on Automata, Languages and Programming (ICALP 2019)

arXiv:1705.03726 [pdf, other]

doi 10.23638/LMCS-15(4:16)2019

Logical and Algebraic Characterizations of Rational Transductions

Authors: Emmanuel Filiot, Olivier Gauwin, Nathan Lhote

Abstract: Rational word languages can be defined by several equivalent means: finite state automata, rational expressions, finite congruences, or monadic second-order (MSO) logic. The robust subclass of aperiodic languages is defined by: counter-free automata, star-free expressions, aperiodic (finite) congruences, or first-order (FO) logic. In particular, their algebraic characterization by aperiodic congru… ▽ More Rational word languages can be defined by several equivalent means: finite state automata, rational expressions, finite congruences, or monadic second-order (MSO) logic. The robust subclass of aperiodic languages is defined by: counter-free automata, star-free expressions, aperiodic (finite) congruences, or first-order (FO) logic. In particular, their algebraic characterization by aperiodic congruences allows to decide whether a regular language is aperiodic. We lift this decidability result to rational transductions, i.e., word-to-word functions defined by finite state transducers. In this context, logical and algebraic characterizations have also been proposed. Our main result is that one can decide if a rational transduction (given as a transducer) is in a given decidable congruence class. We also establish a transfer result from logic-algebra equivalences over languages to equivalences over transductions. As a consequence, it is decidable if a rational transduction is first-order definable, and we show that this problem is PSPACE-complete. △ Less

Submitted 18 December, 2019; v1 submitted 10 May, 2017; originally announced May 2017.

Journal ref: Logical Methods in Computer Science, Volume 15, Issue 4 (December 19, 2019) lmcs:3653

arXiv:1705.00844

Edge arboricity : Do we need equitabilility ?

Authors: Nathan Lhote, Mohammed Senhaji

Abstract: In this paper we study a new variant of graph arboricity, which requires all the forests to have the same number of edges (up to a difference of 1). We prove that the new variant, which we call equitable arboricity, is equivalent to ordinary arboricity. In other words we show that any arborescent decomposition of a graph can be transformed into an equitable one without modifying the number of used… ▽ More In this paper we study a new variant of graph arboricity, which requires all the forests to have the same number of edges (up to a difference of 1). We prove that the new variant, which we call equitable arboricity, is equivalent to ordinary arboricity. In other words we show that any arborescent decomposition of a graph can be transformed into an equitable one without modifying the number of used forests. △ Less

Submitted 3 May, 2017; v1 submitted 2 May, 2017; originally announced May 2017.

Comments: A colleague just informed me that the result is an obvious consequence of the exchange axiom in matroïds

arXiv:1702.07157 [pdf, other]

On Reversible Transducers

Authors: Luc Dartois, Paulin Fournier, Ismaël Jecker, Nathan Lhote

Abstract: Deterministic two-way transducers define the robust class of regular functions which is, among other good properties, closed under composition. However, the best known algorithms for composing two-way transducers cause a double exponential blow-up in the size of the inputs. In this paper, we introduce a class of transducers for which the composition has polynomial complexity. It is the class of re… ▽ More Deterministic two-way transducers define the robust class of regular functions which is, among other good properties, closed under composition. However, the best known algorithms for composing two-way transducers cause a double exponential blow-up in the size of the inputs. In this paper, we introduce a class of transducers for which the composition has polynomial complexity. It is the class of reversible transducers, for which the computation steps can be reversed deterministically. While in the one-way setting this class is not very expressive, we prove that any two-way transducer can be made reversible through a single exponential blow-up. As a consequence, we prove that the composition of two-way transducers can be done with a single exponential blow-up in the number of states. A uniformization of a relation is a function with the same domain and which is included in the original relation. Our main result actually states that we can uniformize any non-deterministic two-way transducer by a reversible transducer with a single exponential blow-up, improving the known result by de Souza which has a quadruple exponential complexity. As a side result, our construction also gives a quadratic transformation from copyless streaming string transducers to two-way transducers, improving the exponential previous bound. △ Less

Submitted 23 February, 2017; originally announced February 2017.

ACM Class: F.4.3

arXiv:1701.03670 [pdf, other]

Logics for Word Transductions with Synthesis

Authors: Luc Dartois, Emmanuel Filiot, Nathan Lhote

Abstract: We introduce a logic, called LT, to express properties of transductions, i.e. binary relations from input to output (finite) words. In LT, the input/output dependencies are modelled via an origin function which associates to any position of the output word, the input position from which it originates. LT is well-suited to express relations (which are not necessarily functional), and can express al… ▽ More We introduce a logic, called LT, to express properties of transductions, i.e. binary relations from input to output (finite) words. In LT, the input/output dependencies are modelled via an origin function which associates to any position of the output word, the input position from which it originates. LT is well-suited to express relations (which are not necessarily functional), and can express all regular functional transductions, i.e. transductions definable for instance by deterministic two-way transducers. Despite its high expressive power, LT has decidable satisfiability and equivalence problems, with tight non-elementary and elementary complexities, depending on specific representation of LT-formulas. Our main contribution is a synthesis result: from any transduction R defined in LT , it is possible to synthesise a regular functional transduction f such that for all input words u in the domain of R, f is defined and (u,f(u)) belongs to R. As a consequence, we obtain that any functional transduction is regular iff it is LT-definable. We also investigate the algorithmic and expressiveness properties of several extensions of LT, and explicit a correspondence between transductions and data words. As a side-result, we obtain a new decidable logic for data words. △ Less

Submitted 30 May, 2018; v1 submitted 13 January, 2017; originally announced January 2017.

Comments: 10 pages + appendix

arXiv:1701.02903 [pdf, ps, other]

On Delay and Regret Determinization of Max-Plus Automata

Authors: Emmanuel Filiot, Ismaël Jecker, Nathan Lhote, Guillermo A. Pérez, Jean-François Raskin

Abstract: Decidability of the determinization problem for weighted automata over the semiring $(\mathbb{Z} \cup {-\infty}, \max, +)$, WA for short, is a long-standing open question. We propose two ways of approaching it by constraining the search space of deterministic WA: k-delay and r-regret. A WA N is k-delay determinizable if there exists a deterministic automaton D that defines the same function as N a… ▽ More Decidability of the determinization problem for weighted automata over the semiring $(\mathbb{Z} \cup {-\infty}, \max, +)$, WA for short, is a long-standing open question. We propose two ways of approaching it by constraining the search space of deterministic WA: k-delay and r-regret. A WA N is k-delay determinizable if there exists a deterministic automaton D that defines the same function as N and for all words α in the language of N, the accepting run of D on α is always at most k-away from a maximal accepting run of N on α. That is, along all prefixes of the same length, the absolute difference between the running sums of weights of the two runs is at most k. A WA N is r-regret determinizable if for all words α in its language, its non-determinism can be resolved on the fly to construct a run of N such that the absolute difference between its value and the value assigned to α by N is at most r. We show that a WA is determinizable if and only if it is k-delay determinizable for some k. Hence deciding the existence of some k is as difficult as the general determinization problem. When k and r are given as input, the k-delay and r-regret determinization problems are shown to be EXPtime-complete. We also show that determining whether a WA is r-regret determinizable for some r is in EXPtime. △ Less

Submitted 3 March, 2017; v1 submitted 11 January, 2017; originally announced January 2017.

arXiv:1506.06497 [pdf, other]

Towards an algebraic characterization of rational word functions

Authors: Nathan Lhote

Abstract: In formal language theory, several different models characterize regular languages, such as finite automata, congruences of finite index, or monadic second-order logic (MSO). Moreover, several fragments of MSO have effective characterizations based on algebraic properties. When we consider transducers instead of automata, such characterizations are much more challenging, because many of the proper… ▽ More In formal language theory, several different models characterize regular languages, such as finite automata, congruences of finite index, or monadic second-order logic (MSO). Moreover, several fragments of MSO have effective characterizations based on algebraic properties. When we consider transducers instead of automata, such characterizations are much more challenging, because many of the properties of regular languages do not generalize to regular word functions. In this paper we consider word functions that are definable by one-way transducers (rational functions). We show that the canonical bimachine of Reutenauer and Schützenberger preserves certain algebraic properties of rational functions, similar to the case of word languages. In particular, we give an effective characterization of functions that can be defined by an aperiodic one-way transducer. △ Less

Submitted 22 June, 2015; originally announced June 2015.

Showing 1–17 of 17 results for author: Lhote, N