-
A Simple Algorithm for Worst-Case Optimal Join and Sampling
Authors:
Florent Capelli,
Oliver Irwin,
Sylvain Salvati
Abstract:
We present an elementary branch and bound algorithm with a simple analysis of why it achieves worstcase optimality for join queries on classes of databases defined respectively by cardinality or acyclic degree constraints. We then show that if one is given a reasonable way for recursively estimating upper bounds on the number of answers of the join queries, our algorithm can be turned into algorit…
▽ More
We present an elementary branch and bound algorithm with a simple analysis of why it achieves worstcase optimality for join queries on classes of databases defined respectively by cardinality or acyclic degree constraints. We then show that if one is given a reasonable way for recursively estimating upper bounds on the number of answers of the join queries, our algorithm can be turned into algorithm for uniformly sampling answers with expected running time $O(UP/OUT)$ where $UP$ is the upper bound, $OUT$ is the actual number of answers and $O(\cdot)$ ignores polylogarithmic factors. Our approach recovers recent results on worstcase optimal join algorithm and sampling in a modular, clean and elementary way.
△ Less
Submitted 21 September, 2024;
originally announced September 2024.
-
Direct Access for Conjunctive Queries with Negations
Authors:
Florent Capelli,
Nofar Carmeli,
Oliver Irwin,
Sylvain Salvati
Abstract:
Given a conjunctive query $Q$ and a database $D$, a direct access to the answers of $Q$ over $D$ is the operation of returning, given an index $k$, the $k$-th answer for some order on its answers. While this problem is #P-hard in general with respect to combined complexity, many conjunctive queries have an underlying structure that allows for a direct access to their answers for some lexicographic…
▽ More
Given a conjunctive query $Q$ and a database $D$, a direct access to the answers of $Q$ over $D$ is the operation of returning, given an index $k$, the $k$-th answer for some order on its answers. While this problem is #P-hard in general with respect to combined complexity, many conjunctive queries have an underlying structure that allows for a direct access to their answers for some lexicographical ordering that takes polylogarithmic time in the size of the database after a polynomial time precomputation. Previous work has precisely characterised the tractable classes and given fine-grained lower bounds on the precomputation time needed depending on the structure of the query.
In this paper, we generalise these tractability results to the case of signed conjunctive queries, that is, conjunctive queries that may contain negative atoms. Our technique is based on a class of circuits that can represent relational data. We first show that this class supports tractable direct access after a polynomial time preprocessing. We then give bounds on the size of the circuit needed to represent the answer set of signed conjunctive queries depending on their structure. Both results combined together allow us to prove the tractability of direct access for a large class of conjunctive queries. On the one hand, we recover the known tractable classes from the literature in the case of positive conjunctive queries. On the other hand, we generalise and unify known tractability results about negative conjunctive queries -- that is, queries having only negated atoms. In particular, we show that the class of $β$-acyclic negative conjunctive queries and the class of bounded nest set width negative conjunctive queries admit tractable direct access.
△ Less
Submitted 11 November, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
Jumping Evaluation of Nested Regular Path Queries
Authors:
Joachim Niehren,
Sylvain Salvati,
Rustam Azimov
Abstract:
Nested regular path queries are used for querying graph databases and RDF triple stores. We propose a new algorithm for evaluating nested regular path queries on a graph from a set of start nodes in combined linear time. We show that this complexity upper bound can be reduced by making it dependent on the size of the query's top-down needed subgraph, a notion that we introduce. For many queries in…
▽ More
Nested regular path queries are used for querying graph databases and RDF triple stores. We propose a new algorithm for evaluating nested regular path queries on a graph from a set of start nodes in combined linear time. We show that this complexity upper bound can be reduced by making it dependent on the size of the query's top-down needed subgraph, a notion that we introduce. For many queries in practice, the top-down needed subgraph is way smaller than the whole graph. Our algorithm is based on a novel compilation schema from nested regular path queries to monadic datalog queries. Its complexity upper bound follows from known properties of top-down datalog evaluation. As an application, we show that our algorithm permits to reformulate in simple terms a variant of a very efficient automata-based algorithm proposed by Maneth and Nguyen that evaluates navigational path queries in datatrees based on indexes and jumping. Moreover, it overcomes some limitations of Maneth and Nguyen's: it is not bound to trees and applies to graphs; it is not limited to forward navigational XPath but can treat any nested regular path query and it can be implemented efficiently without any dedicated techniques, by using any efficient datalog evaluator such as LogicBlox.
△ Less
Submitted 5 August, 2022;
originally announced August 2022.
-
$O_n$ is an $n$-MCFL
Authors:
Kilian Gebhardt,
Frédéric Meunier,
Sylvain Salvati
Abstract:
Commutative properties in formal languages pose problems at the frontier of computer science, computational linguistics and computational group theory. A prominent problem of this kind is the position of the language $O_n$, the language that contains the same number of letters $a_i$ and $\bar a_i$ with $1\leq i\leq n$, in the known classes of formal languages. It has recently been shown that…
▽ More
Commutative properties in formal languages pose problems at the frontier of computer science, computational linguistics and computational group theory. A prominent problem of this kind is the position of the language $O_n$, the language that contains the same number of letters $a_i$ and $\bar a_i$ with $1\leq i\leq n$, in the known classes of formal languages. It has recently been shown that $O_n$ is a Multiple Context-Free Language (MCFL). However the more precise conjecture of Nederhof that $O_n$ is an MCFL of dimension $n$ was left open. We present two proofs of this conjecture, both relying on tools from algebraic topology. On our way, we prove a variant of the necklace splitting theorem.
△ Less
Submitted 22 December, 2020;
originally announced December 2020.
-
Typing weak MSOL properties
Authors:
Sylvain Salvati,
Igor Walukiewicz
Abstract:
We consider lambda-Y-calculus as a non-interpreted functional programming language: the result of the execution of a program is its normal form that can be seen as the tree of calls to built-in operations. Weak monadic second-order logic (wMSOL) is well suited to express properties of such trees. We give a type system for ensuring that the result of the execution of a lambda-Y-program satisfies a…
▽ More
We consider lambda-Y-calculus as a non-interpreted functional programming language: the result of the execution of a program is its normal form that can be seen as the tree of calls to built-in operations. Weak monadic second-order logic (wMSOL) is well suited to express properties of such trees. We give a type system for ensuring that the result of the execution of a lambda-Y-program satisfies a given wMSOL property. In order to prove soundness and completeness of the system we construct a denotational semantics of lambda-Y-calculus that is capable of computing properties expressed in wMSOL.
△ Less
Submitted 22 March, 2017; v1 submitted 9 September, 2016;
originally announced September 2016.
-
The Diagonal Problem for Higher-Order Recursion Schemes is Decidable
Authors:
Lorenzo Clemente,
Paweł Parys,
Sylvain Salvati,
Igor Walukiewicz
Abstract:
A non-deterministic recursion scheme recognizes a language of finite trees. This very expressive model can simulate, among others, higher-order pushdown automata with collapse. We show decidability of the diagonal problem for schemes. This result has several interesting consequences. In particular, it gives an algorithm that computes the downward closure of languages of words recognized by schemes…
▽ More
A non-deterministic recursion scheme recognizes a language of finite trees. This very expressive model can simulate, among others, higher-order pushdown automata with collapse. We show decidability of the diagonal problem for schemes. This result has several interesting consequences. In particular, it gives an algorithm that computes the downward closure of languages of words recognized by schemes. In turn, this has immediate application to separability problems and reachability analysis of concurrent systems.
△ Less
Submitted 2 May, 2016;
originally announced May 2016.
-
Ordered Tree-Pushdown Systems
Authors:
Lorenzo Clemente,
Paweł Parys,
Sylvain Salvati,
Igor Walukiewicz
Abstract:
We define a new class of pushdown systems where the pushdown is a tree instead of a word. We allow a limited form of lookahead on the pushdown conforming to a certain ordering restriction, and we show that the resulting class enjoys a decidable reachability problem. This follows from a preservation of recognizability result for the backward reachability relation of such systems. As an application,…
▽ More
We define a new class of pushdown systems where the pushdown is a tree instead of a word. We allow a limited form of lookahead on the pushdown conforming to a certain ordering restriction, and we show that the resulting class enjoys a decidable reachability problem. This follows from a preservation of recognizability result for the backward reachability relation of such systems. As an application, we show that our simple model can encode several formalisms generalizing pushdown systems, such as ordered multi-pushdown systems, annotated higher-order pushdown systems, the Krivine machine, and ordered annotated multi-pushdown systems. In each case, our procedure yields tight complexity.
△ Less
Submitted 12 October, 2015;
originally announced October 2015.
-
Using models to model-check recursive schemes
Authors:
Sylvain Salvati,
Igor Walukiewicz
Abstract:
We propose a model-based approach to the model checking problem for recursive schemes. Since simply typed lambda calculus with the fixpoint operator, lambda-Y-calculus, is equivalent to schemes, we propose the use of a model of lambda-Y-calculus to discriminate the terms that satisfy a given property. If a model is finite in every type, this gives a decision procedure. We provide a construction o…
▽ More
We propose a model-based approach to the model checking problem for recursive schemes. Since simply typed lambda calculus with the fixpoint operator, lambda-Y-calculus, is equivalent to schemes, we propose the use of a model of lambda-Y-calculus to discriminate the terms that satisfy a given property. If a model is finite in every type, this gives a decision procedure. We provide a construction of such a model for every property expressed by automata with trivial acceptance conditions and divergence testing. Such properties pose already interesting challenges for model construction. Moreover, we argue that having models capturing some class of properties has several other virtues in addition to providing decidability of the model-checking problem. As an illustration, we show a very simple construction transforming a scheme to a scheme reflecting a property captured by a given model.
△ Less
Submitted 10 June, 2015; v1 submitted 14 March, 2015;
originally announced March 2015.