-
Learning Tree Pattern Transformations
Authors:
Daniel Neider,
Leif Sabellek,
Johannes Schmidt,
Fabian Vehlken,
Thomas Zeume
Abstract:
Explaining why and how a tree $t$ structurally differs from another tree $t^\star$ is a question that is encountered throughout computer science, including in understanding tree-structured data such as XML or JSON data. In this article, we explore how to learn explanations for structural differences between pairs of trees from sample data: suppose we are given a set $\{(t_1, t_1^\star),\dots, (t_n, t_n^\star)\}$ of pairs of labelled, ordered trees; is there a small set of rules that explains the structural differences between all pairs $(t_i, t_i^\star)$? This raises two research questions: (i) what is a good notion of "rule" in this context; and (ii) how can sets of rules explaining a data set be learned algorithmically?
We explore these questions from the perspective of database theory by (1) introducing a pattern-based specification language for tree transformations; (2) exploring the computational complexity of variants of the above algorithmic problem, e.g. showing NP-hardness for very restricted variants; and (3) discussing how to solve the problem for data from CS education research using SAT solvers.
Submitted 18 February, 2025; v1 submitted 10 October, 2024;
originally announced October 2024.
-
Remarks on Parikh-recognizable omega-languages
Authors:
Mario Grobler,
Leif Sabellek,
Sebastian Siebertz
Abstract:
Several variants of Parikh automata on infinite words were recently introduced by Guha et al. [FSTTCS, 2022]. We show that one of these variants coincides with blind counter machines as introduced by Fernau and Stiebe [Fundamenta Informaticae, 2008]. Fernau and Stiebe showed that every $\omega$-language recognized by a blind counter machine is of the form $\bigcup_iU_iV_i^\omega$ for Parikh-recognizable languages $U_i, V_i$, but blind counter machines fall short of characterizing this class of $\omega$-languages. They posed as an open problem to find a suitable automata-based characterization. We introduce several additional variants of Parikh automata on infinite words that yield automata characterizations of classes of $\omega$-languages of the form $\bigcup_iU_iV_i^\omega$ for all combinations of languages $U_i, V_i$ being regular or Parikh-recognizable. When both $U_i$ and $V_i$ are regular, this coincides with Büchi's classical theorem. We study the effect of $\varepsilon$-transitions in all variants of Parikh automata and show that almost all of them admit $\varepsilon$-elimination. Finally, we study the classical decision problems with applications to model checking.
Submitted 31 October, 2023; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Parikh Automata on Infinite Words
Authors:
Mario Grobler,
Leif Sabellek,
Sebastian Siebertz
Abstract:
Parikh automata on finite words were first introduced by Klaedtke and Rueß [Automata, Languages and Programming, 2003]. In this paper, we introduce several variants of Parikh automata on infinite words and study their expressiveness. We show that one of our new models is equivalent to synchronous blind counter machines introduced by Fernau and Stiebe [Fundamenta Informaticae, 2008]. All our models admit ε-elimination, which to the best of our knowledge is an open question for blind counter automata. We then study the classical decision problems of the new automata models.
Submitted 21 January, 2023;
originally announced January 2023.
-
Ontology-Mediated Querying on Databases of Bounded Cliquewidth
Authors:
Carsten Lutz,
Leif Sabellek,
Lukas Schulze
Abstract:
We study the evaluation of ontology-mediated queries (OMQs) on databases of bounded cliquewidth from the viewpoint of parameterized complexity theory. As the ontology language, we consider the description logics $\mathcal{ALC}$ and $\mathcal{ALCI}$ as well as the guarded two-variable fragment GF$_2$ of first-order logic. Queries are atomic queries (AQs), conjunctive queries (CQs), and unions of CQs. All studied OMQ problems are fixed-parameter linear (FPL) when the parameter is the size of the OMQ plus the cliquewidth. Our main contribution is a detailed analysis of the dependence of the running time on the parameter, exhibiting several interesting effects.
Submitted 13 September, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
How to Approximate Ontology-Mediated Queries
Authors:
Anneke Haga,
Carsten Lutz,
Leif Sabellek,
Frank Wolter
Abstract:
We introduce and study several notions of approximation for ontology-mediated queries based on the description logics ALC and ALCI. Our approximations are of two kinds: we may (1) replace the ontology with one formulated in a tractable ontology language such as ELI or certain TGDs and (2) replace the database with one from a tractable class such as the class of databases whose treewidth is bounded by a constant. We determine the computational complexity and the relative completeness of the resulting approximations. (Almost) all of them reduce the data complexity from coNP-complete to PTime, in some cases even to fixed-parameter tractable and to linear time. While approximations of kind (1) also reduce the combined complexity, this tends not to be the case for approximations of kind (2). In some cases, the combined complexity even increases.
Submitted 30 June, 2022; v1 submitted 12 July, 2021;
originally announced July 2021.
-
Query Expressibility and Verification in Ontology-Based Data Access
Authors:
Carsten Lutz,
Johannes Marti,
Leif Sabellek
Abstract:
In ontology-based data access, multiple data sources are integrated using an ontology and mappings. In practice, this is often achieved by a bootstrapping process, that is, the ontology and mappings are first designed to support only the most important queries over the sources and then gradually extended to enable additional queries. In this paper, we study two reasoning problems that support such an approach. The expressibility problem asks whether a given source query $q_s$ is expressible as a target query (that is, over the ontology's vocabulary) and the verification problem asks, additionally given a candidate target query $q_t$, whether $q_t$ expresses $q_s$. We consider (U)CQs as source and target queries and GAV mappings, showing that both problems are $\Pi^p_2$-complete in DL-Lite, coNExpTime-complete between EL and ELHI when source queries are rooted, and 2ExpTime-complete for unrestricted source queries.
Submitted 18 November, 2020;
originally announced November 2020.
-
A Complete Classification of the Complexity and Rewritability of Ontology-Mediated Queries based on the Description Logic EL
Authors:
Carsten Lutz,
Leif Sabellek
Abstract:
We provide an ultimately fine-grained analysis of the data complexity and rewritability of ontology-mediated queries (OMQs) based on an EL ontology and a conjunctive query (CQ). Our main results are that every such OMQ is in AC0, NL-complete, or PTime-complete and that containment in NL coincides with rewritability into linear Datalog (whereas containment in AC0 coincides with rewritability into first-order logic). We establish natural characterizations of the three cases in terms of bounded depth and (un)bounded pathwidth, and show that each of the associated meta problems, such as deciding whether a given OMQ is rewritable into linear Datalog, is ExpTime-complete. We also give a way to construct linear Datalog rewritings when they exist and prove that there are no constant Datalog rewritings.
Submitted 29 April, 2019;
originally announced April 2019.