-
Mining Frequent Structures in Conceptual Models
Authors:
Mattia Fumagalli,
Tiago Prince Sales,
Pedro Paulo F. Barcelos,
Giovanni Micale,
Philipp-Lorenz Glaser,
Dominik Bork,
Vadim Zaytsev,
Diego Calvanese,
Giancarlo Guizzardi
Abstract:
The problem of using structured methods to represent knowledge is well-known in conceptual modeling and has been studied for many years. It has been proven that adopting modeling patterns represents an effective structural method. Patterns are, indeed, generalizable recurrent structures that can be exploited as solutions to design problems. They aid in understanding and improving the process of creating models. The undeniable value of using patterns in conceptual modeling was demonstrated in several experimental studies. However, discovering patterns in conceptual models is widely recognized as a highly complex task and a systematic solution to pattern identification is currently lacking. In this paper, we propose a general approach to the problem of discovering frequent structures, as they occur in conceptual modeling languages. As proof of concept, we implement our approach by focusing on two widely-used conceptual modeling languages. This implementation includes an exploratory tool that integrates a frequent subgraph mining algorithm with graph manipulation techniques. The tool processes multiple conceptual models and identifies recurrent structures based on various criteria. We validate the tool using two state-of-the-art curated datasets: one consisting of models encoded in OntoUML and the other in ArchiMate. The primary objective of our approach is to provide a support tool for language engineers. This tool can be used to identify both effective and ineffective modeling practices, enabling the refinement and evolution of conceptual modeling languages. Furthermore, it facilitates the reuse of accumulated expertise, ultimately supporting the creation of higher-quality models in a given language.
Submitted 25 December, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Integrating 3D City Data through Knowledge Graphs
Authors:
Linfang Ding,
Guohui Xiao,
Albulen Pano,
Mattia Fumagalli,
Dongsheng Chen,
Yu Feng,
Diego Calvanese,
Hongchao Fan,
Liqiu Meng
Abstract:
CityGML is a widely adopted standard by the Open Geospatial Consortium (OGC) for representing and exchanging 3D city models. The representation of semantic and topological properties in CityGML makes it possible to query such 3D city data to perform analysis in various applications, e.g., security management and emergency response, energy consumption and estimation, and occupancy measurement. However, the potential of querying CityGML data has not been fully exploited. The official GML/XML encoding of CityGML is only intended as an exchange format but is not suitable for query answering. The most common way of dealing with CityGML data is to store them in the 3DCityDB system as relational tables and then query them with the standard SQL query language. Nevertheless, for end users, it remains a challenging task to formulate queries over 3DCityDB directly for their ad-hoc analytical tasks, because there is a gap between the conceptual semantics of CityGML and the relational schema adopted in 3DCityDB. In fact, the semantics of CityGML itself can be modeled as a suitable ontology. The technology of Knowledge Graphs (KGs), where an ontology is at the core, is a good solution to bridge such a gap. Moreover, embracing KGs makes it easier to integrate with other spatial data sources, e.g., OpenStreetMap and existing (Geo)KGs (e.g., Wikidata, DBPedia, and GeoNames), and to perform queries combining information from multiple data sources. In this work, we describe a CityGML KG framework to populate the concepts in the CityGML ontology using declarative mappings to 3DCityDB, thus exposing the CityGML data therein as a KG. To demonstrate the feasibility of our approach, we use CityGML data from the city of Munich as test data and integrate OpenStreetMap data in the same area.
Submitted 17 October, 2023;
originally announced October 2023.
-
AI-Augmented Business Process Management Systems: A Research Manifesto
Authors:
Marlon Dumas,
Fabiana Fournier,
Lior Limonad,
Andrea Marrella,
Marco Montali,
Jana-Rebecca Rehse,
Rafael Accorsi,
Diego Calvanese,
Giuseppe De Giacomo,
Dirk Fahland,
Avigdor Gal,
Marcello La Rosa,
Hagen Völzer,
Ingo Weber
Abstract:
AI-Augmented Business Process Management Systems (ABPMSs) are an emerging class of process-aware information systems, empowered by trustworthy AI technology. An ABPMS enhances the execution of business processes with the aim of making these processes more adaptable, proactive, explainable, and context-sensitive. This manifesto presents a vision for ABPMSs and discusses research challenges that need to be surmounted to realize this vision. To this end, we define the concept of ABPMS, we outline the lifecycle of processes within an ABPMS, we discuss core characteristics of an ABPMS, and we derive a set of challenges to realize systems with these characteristics.
Submitted 4 November, 2022; v1 submitted 30 January, 2022;
originally announced January 2022.
-
SMT-Based Safety Verification of Data-Aware Processes under Ontologies (Extended Version)
Authors:
Diego Calvanese,
Alessandro Gianola,
Andrea Mazzullo,
Marco Montali
Abstract:
In the context of verification of data-aware processes (DAPs), a formal approach based on satisfiability modulo theories (SMT) has been considered to verify parameterised safety properties of so-called artifact-centric systems. This approach requires a combination of model-theoretic notions and algorithmic techniques based on backward reachability. We introduce here a variant of one of the most investigated models in this spectrum, namely simple artifact systems (SASs), where, instead of managing a database, we operate over a description logic (DL) ontology expressed in (a slight extension of) RDFS. This DL, enjoying suitable model-theoretic properties, allows us to define DL-based SASs to which backward reachability can still be applied, leading to decidability in PSPACE of the corresponding safety problems.
Submitted 27 August, 2021;
originally announced August 2021.
-
INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision]
Authors:
Sihem Amer-Yahia,
Georgia Koutrika,
Frederic Bastian,
Theofilos Belmpas,
Martin Braschler,
Ursin Brunner,
Diego Calvanese,
Maximilian Fabricius,
Orest Gkini,
Catherine Kosten,
Davide Lanti,
Antonis Litke,
Hendrik Lücke-Tieke,
Francesco Alessandro Massucci,
Tarcisio Mendes de Farias,
Alessandro Mosca,
Francesco Multari,
Nikolaos Papadakis,
Dimitris Papadopoulos,
Yogendra Patil,
Aurélien Personnaz,
Guillem Rull,
Ana Sima,
Ellery Smith,
Dimitrios Skoutas,
et al. (3 additional authors not shown)
Abstract:
A full-fledged data exploration system must combine different access modalities with a powerful concept of guiding the user in the exploration process, by being reactive and anticipative both for data discovery and for data linking. Such systems are a real opportunity for our community to cater to users with different domain and data science expertise. We introduce INODE -- an end-to-end data exploration system -- that leverages, on the one hand, Machine Learning and, on the other hand, semantics for the purpose of Data Management (DM). Our vision is to develop a classic unified, comprehensive platform that provides extensive access to open datasets, and we demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics. INODE offers sustainable services in (a) data modeling and linking, (b) integrated query processing using natural language, (c) guidance, and (d) data exploration through visualization, thus facilitating the user in discovering new insights. We demonstrate that our system is uniquely accessible to a wide range of users from larger scientific communities to the public. Finally, we briefly illustrate how this work paves the way for new research opportunities in DM.
Submitted 9 April, 2021;
originally announced April 2021.
-
Mapping Patterns for Virtual Knowledge Graphs
Authors:
Diego Calvanese,
Avigdor Gal,
Davide Lanti,
Marco Montali,
Alessandro Mosca,
Roee Shraga
Abstract:
Virtual Knowledge Graphs (VKG) constitute one of the most promising paradigms for integrating and accessing legacy data sources. A critical bottleneck in the integration process involves the definition, validation, and maintenance of mappings that link data sources to a domain ontology. To support the management of mappings throughout their entire lifecycle, we propose a comprehensive catalog of sophisticated mapping patterns that emerge when linking databases to ontologies. To do so, we build on well-established methodologies and patterns studied in data management, data analysis, and conceptual modeling. These are extended and refined through the analysis of concrete VKG benchmarks and real-world use cases, and considering the inherent impedance mismatch between data sources and ontologies. We validate our catalog on the considered VKG scenarios, showing that it covers the vast majority of patterns present therein.
Submitted 11 August, 2023; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Counting Query Answers over a DL-Lite Knowledge Base (extended version)
Authors:
Diego Calvanese,
Julien Corman,
Davide Lanti,
Simon Razniewski
Abstract:
Counting answers to a query is an operation supported by virtually all database management systems. In this paper we focus on counting answers over a Knowledge Base (KB), which may be viewed as a database enriched with background knowledge about the domain under consideration. In particular, we place our work in the context of Ontology-Mediated Query Answering/Ontology-based Data Access (OMQA/OBDA), where the language used for the ontology is a member of the DL-Lite family and the data is a (usually virtual) set of assertions. We study the data complexity of query answering, for different members of the DL-Lite family that include number restrictions, and for variants of conjunctive queries with counting that differ with respect to their shape (connected, branching, rooted). We improve upon existing results by providing PTIME and coNP lower bounds, and upper bounds in PTIME and LOGSPACE. For the latter case, we define a novel query rewriting technique into first-order logic with counting.
Submitted 17 July, 2020; v1 submitted 12 May, 2020;
originally announced May 2020.
-
On Expansion and Contraction of DL-Lite Knowledge Bases
Authors:
Dmitriy Zheleznyakov,
Evgeny Kharlamov,
Werner Nutt,
Diego Calvanese
Abstract:
Knowledge bases (KBs) are not static entities: new information constantly appears and some of the previous knowledge becomes obsolete. In order to reflect this evolution of knowledge, KBs should be expanded with the new knowledge and contracted from the obsolete one. This problem is well-studied for propositional but much less for first-order KBs. In this work we investigate knowledge expansion and contraction for KBs expressed in DL-Lite, a family of description logics (DLs) that underlie the tractable fragment OWL 2 QL of the Web Ontology Language OWL 2. We start with a novel knowledge evolution framework and natural postulates that evolution should respect, and compare our postulates to the well-established AGM postulates. We then review well-known model and formula-based approaches for expansion and contraction for propositional theories and show how they can be adapted to the case of DL-Lite. In particular, we show intrinsic limitations of model-based approaches: besides the fact that some of them do not respect the postulates we have established, they ignore the structural properties of KBs. This leads to undesired properties of evolution results: evolution of DL-Lite KBs cannot be captured in DL-Lite. Moreover, we show that well-known formula-based approaches are also not appropriate for DL-Lite expansion and contraction: they either have a high complexity of computation, or they produce logical theories that cannot be expressed in DL-Lite. Thus, we propose a novel formula-based approach that respects our principles and for which evolution is expressible in DL-Lite. For this approach we also propose polynomial time deterministic algorithms to compute evolution of DL-Lite KBs when evolution affects only factual data.
Submitted 25 January, 2020;
originally announced January 2020.
-
Combined Covers and Beth Definability (Extended Version)
Authors:
Diego Calvanese,
Silvio Ghilardi,
Alessandro Gianola,
Marco Montali,
Andrey Rivkin
Abstract:
In ESOP 2008, Gulwani and Musuvathi introduced a notion of cover and exploited it to handle infinite-state model checking problems. Motivated by applications to the verification of data-aware processes, we proved in a previous paper that covers are strictly related to model completions, a well-known topic in model theory. In this paper we investigate cover transfer to theory combinations in the disjoint signatures case. We prove that for convex theories, cover algorithms can be transferred to theory combinations under the same hypothesis (equality interpolation property aka strong amalgamation property) needed to transfer quantifier-free interpolation. In the non-convex case, we show by a counterexample that covers may not exist in the combined theories, even in case combined quantifier-free interpolants do exist. However, we exhibit a cover transfer algorithm operating also in the non-convex case for special kinds of theory combinations; these combinations (called `tame combinations') concern multi-sorted theories arising in many model-checking applications (in particular, the ones oriented to verification of data-aware processes).
Submitted 29 June, 2020; v1 submitted 18 November, 2019;
originally announced November 2019.
-
Formal Modeling and SMT-Based Parameterized Verification of Data-Aware BPMN (Extended Version)
Authors:
Diego Calvanese,
Silvio Ghilardi,
Alessandro Gianola,
Marco Montali,
Andrey Rivkin
Abstract:
We propose DAB -- a data-aware extension of BPMN where the process operates over case and persistent data (partitioned into a read-only database called catalog and a read-write database called repository). The model trades off between expressiveness and the possibility of supporting parameterized verification of safety properties on top of it. Specifically, taking inspiration from the literature on verification of artifact systems, we study verification problems where safety properties are checked irrespectively of the content of the read-only catalog, and accepting the potential presence of unboundedly many tuples in the catalog and repository. We tackle such problems using an array-based backward reachability procedure fully implemented in MCMT -- a state-of-the-art array-based SMT model checker. Notably, we prove that the procedure is sound and complete for checking safety of DABs, and single out additional conditions that guarantee its termination and, in turn, show decidability of checking safety.
Submitted 24 June, 2019; v1 submitted 31 May, 2019;
originally announced June 2019.
-
Enriching Ontology-based Data Access with Provenance (Extended Version)
Authors:
Diego Calvanese,
Davide Lanti,
Ana Ozaki,
Rafael Penaloza,
Guohui Xiao
Abstract:
Ontology-based data access (OBDA) is a popular paradigm for querying heterogeneous data sources by connecting them through mappings to an ontology. In OBDA, it is often difficult to reconstruct why a tuple occurs in the answer of a query. We address this challenge by enriching OBDA with provenance semirings, taking inspiration from database theory. In particular, we investigate the problems of (i) deciding whether a provenance annotated OBDA instance entails a provenance annotated conjunctive query, and (ii) computing a polynomial representing the provenance of a query entailed by a provenance annotated OBDA instance. Differently from pure databases, in our case these polynomials may be infinite. To regain finiteness, we consider idempotent semirings, and study the complexity in the case of DL-Lite ontologies. We implement Task (ii) in a state-of-the-art OBDA system and show the practical feasibility of the approach through an extensive evaluation against two popular benchmarks.
Submitted 1 June, 2019;
originally announced June 2019.
-
Formal Modeling and SMT-Based Parameterized Verification of Multi-Case Data-Aware BPMN
Authors:
Diego Calvanese,
Silvio Ghilardi,
Alessandro Gianola,
Marco Montali,
Andrey Rivkin
Abstract:
We propose DAB -- a data-aware extension of the BPMN de-facto standard with the ability of operating over case and persistent data (partitioned into a read-only catalog and a read-write repository), and that balances between expressiveness and the possibility of supporting parameterized verification of safety properties on top of it. In particular, we take inspiration from the literature on verification of artifact systems, and consider verification problems where safety properties are checked irrespectively of the content of the read-only catalog, possibly considering an unbounded number of active cases and tuples in the catalog and repository. Such problems are tackled using fully implemented array-based backward reachability techniques belonging to the well-established tradition of SMT model checking. We also identify relevant classes of DABs for which the backward reachability procedure implemented in the MCMT array-based model checker is sound and complete, and then further strengthen such classes to ensure termination.
Submitted 20 June, 2019; v1 submitted 30 May, 2019;
originally announced May 2019.
-
Modeling and In-Database Management of Relational, Data-Aware Processes (Extended Version)
Authors:
Diego Calvanese,
Marco Montali,
Fabio Patrizi,
Andrey Rivkin
Abstract:
During the last two decades, it has been increasingly acknowledged that the engineering of information systems usually requires a huge effort in integrating master data and business processes. This has led to a plethora of proposals, both from academia and the industry. However, such approaches typically come with ad-hoc abstractions to represent and interact with the data component. This has a twofold disadvantage. On the one hand, they cannot be used to effortlessly enrich an existing relational database with dynamics. On the other hand, they generally do not allow for integrated modelling, verification, and enactment. We attack these two challenges by proposing a declarative approach, fully grounded in SQL, that supports the agile modelling of relational data-aware processes directly on top of relational databases. We show how this approach can be automatically translated into a concrete procedural SQL dialect, executable directly inside any relational database engine. The translation exploits an in-database representation of process states that, in turn, is used to handle, at once, process enactment with or without logging of the executed instances, as well as process verification. The approach has been implemented in a working prototype.
Submitted 8 July, 2019; v1 submitted 18 October, 2018;
originally announced October 2018.
-
Semantic DMN: Formalizing and Reasoning About Decisions in the Presence of Background Knowledge
Authors:
Diego Calvanese,
Marlon Dumas,
Fabrizio Maria Maggi,
Marco Montali
Abstract:
The Decision Model and Notation (DMN) is a recent OMG standard for the elicitation and representation of decision models, and for managing their interconnection with business processes. DMN builds on the notion of decision tables, and their combination into more complex decision requirements graphs (DRGs), which bridge between business process models and decision logic models. DRGs may rely on additional, external business knowledge models, whose functioning is not part of the standard. In this work, we consider one of the most important types of business knowledge, namely background knowledge that conceptually accounts for the structural aspects of the domain of interest, and propose decision knowledge bases (DKBs), which semantically combine DRGs modeled in DMN, and domain knowledge captured by means of first-order logic with datatypes. We provide a logic-based semantics for such an integration, and formalize different DMN reasoning tasks for DKBs. We then consider background knowledge formulated as a description logic ontology with datatypes, and show how the main verification tasks for DMN in this enriched setting can be formalized as standard DL reasoning services, and actually carried out in ExpTime. We discuss the effectiveness of our framework on a case study in maritime security.
Submitted 14 September, 2018; v1 submitted 30 July, 2018;
originally announced July 2018.
-
Verification of Data-Aware Processes via Array-Based Systems (Extended Version)
Authors:
Diego Calvanese,
Silvio Ghilardi,
Alessandro Gianola,
Marco Montali,
Andrey Rivkin
Abstract:
We study verification over a general model of artifact-centric systems, to assess (parameterized) safety properties irrespectively of the initial database instance. We view such artifact systems as array-based systems, which allows us to check safety by adapting backward reachability, establishing for the first time a correspondence with model checking based on Satisfiability-Modulo-Theories (SMT). To do so, we make use of the model-theoretic machinery of model completion, which surprisingly turns out to be an effective tool for verification of relational systems, and represents the main original contribution of this paper. In this way, we pursue a twofold purpose. On the one hand, we reconstruct (restricted to safety) the essence of some important decidability results obtained in the literature for artifact-centric systems, and we devise a genuinely novel class of decidable cases. On the other, we are able to exploit SMT technology in implementations, building on the well-known MCMT model checker for array-based systems, and extending it to make all our foundational results fully operational.
Submitted 27 February, 2019; v1 submitted 29 June, 2018;
originally announced June 2018.
-
Quantifier Elimination for Database Driven Verification
Authors:
Diego Calvanese,
Silvio Ghilardi,
Alessandro Gianola,
Marco Montali,
Andrey Rivkin
Abstract:
Running verification tasks in database driven systems requires solving quantifier elimination problems of a new kind. These quantifier elimination problems are related to the notion of a cover introduced in ESOP 2008 by Gulwani and Musuvathi. In this paper, we show how covers are strictly related to model completions, a well-known topic in model theory. We also investigate the computation of covers within the Superposition Calculus, by adopting a constrained version of the calculus, equipped with appropriate settings and reduction strategies. In addition, we show that cover computations are computationally tractable for the fragment of the language used in applications to database driven verification. This observation is confirmed by analyzing the preliminary results obtained using the MCMT tool on the verification of data-aware process benchmarks. These benchmarks can be found in the last version of the tool distribution.
Submitted 17 June, 2019; v1 submitted 25 June, 2018;
originally announced June 2018.
-
Efficient Handling of SPARQL OPTIONAL for OBDA (Extended Version)
Authors:
Guohui Xiao,
Roman Kontchakov,
Benjamin Cogrel,
Diego Calvanese,
Elena Botoeva
Abstract:
OPTIONAL is a key feature in SPARQL for dealing with missing information. While this operator is used extensively, it is also known for its complexity, which can make efficient evaluation of queries with OPTIONAL challenging. We tackle this problem in the Ontology-Based Data Access (OBDA) setting, where the data is stored in a SQL relational database and exposed as a virtual RDF graph by means of an R2RML mapping. We start with a succinct translation of a SPARQL fragment into SQL. It fully respects bag semantics and three-valued logic and relies on the extensive use of the LEFT JOIN operator and COALESCE function. We then propose optimisation techniques for reducing the size and improving the structure of generated SQL queries. Our optimisations capture interactions between JOIN, LEFT JOIN, COALESCE and integrity constraints such as attribute nullability, uniqueness and foreign key constraints. Finally, we empirically verify effectiveness of our techniques on the BSBM OBDA benchmark.
Submitted 18 June, 2018; v1 submitted 15 June, 2018;
originally announced June 2018.
-
Cost-Driven Ontology-Based Data Access (Extended Version)
Authors:
Davide Lanti,
Guohui Xiao,
Diego Calvanese
Abstract:
In ontology-based data access (OBDA), users are provided with a conceptual view of a (relational) data source that abstracts away details about data storage. This conceptual view is realized through an ontology that is connected to the data source through declarative mappings, and query answering is carried out by translating the user queries over the conceptual view into SQL queries over the data source. Standard translation techniques in OBDA try to transform the user query into a union of conjunctive queries (UCQ), following the heuristic argument that UCQs can be efficiently evaluated by modern relational database engines. In this work, we show that translating to UCQs is not always the best choice, and that, under certain conditions on the interplay between the ontology, the mappings, and the statistics of the data, alternative translations can be evaluated much more efficiently. To find the best translation, we devise a cost model together with a novel cardinality estimation that takes into account all such OBDA components. Our experiments confirm that (i) alternatives to the UCQ translation might produce queries that are orders of magnitude more efficient, and (ii) the cost model we propose is faithful to the actual query evaluation cost, and hence is well suited to select the best translation.
Submitted 2 February, 2018; v1 submitted 21 July, 2017;
originally announced July 2017.
-
Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)
Authors:
Serge Abiteboul,
Marcelo Arenas,
Pablo Barceló,
Meghyn Bienvenu,
Diego Calvanese,
Claire David,
Richard Hull,
Eyke Hüllermeier,
Benny Kimelfeld,
Leonid Libkin,
Wim Martens,
Tova Milo,
Filip Murlak,
Frank Neven,
Magdalena Ortiz,
Thomas Schwentick,
Julia Stoyanovich,
Jianwen Su,
Dan Suciu,
Victor Vianu,
Ke Yi
Abstract:
In April 2016, a community of researchers working in the area of Principles of Data Management (PDM) joined in a workshop at the Dagstuhl Castle in Germany. The workshop was organized jointly by the Executive Committee of the ACM Symposium on Principles of Database Systems (PODS) and the Council of the International Conference on Database Theory (ICDT). The mission of this workshop was to identify and explore some of the most important research directions that have high relevance to society and to Computer Science today, and where the PDM community has the potential to make significant contributions. This report describes the family of research directions that the workshop focused on from three perspectives: potential practical relevance, results already obtained, and research questions that appear surmountable in the short and medium term.
Submitted 31 January, 2017;
originally announced January 2017.
-
Metric Temporal Logic for Ontology-Based Data Access over Log Data
Authors:
Diego Calvanese,
Elem Güzel Kalaycı,
Vladislav Ryzhikov,
Guohui Xiao,
Michael Zakharyaschev
Abstract:
We present a new metric temporal logic HornMTL over dense time and its datalog extension datalogMTL. The use of datalogMTL is demonstrated in the context of ontology-based data access over meteorological data. We show decidability of answering ontology-mediated queries for a practically relevant non-recursive fragment of datalogMTL. Finally, we discuss directions for future work, including potential use cases in analyzing log data of engines and devices.
Submitted 4 January, 2017;
originally announced January 2017.
-
Data Scaling in OBDA Benchmarks: The VIG Approach
Authors:
Davide Lanti,
Guohui Xiao,
Diego Calvanese
Abstract:
In this paper we describe VIG, a data scaler for benchmarks in the context of ontology-based data access (OBDA). Data scaling is a relatively recent approach, proposed in the database community, that allows for quickly scaling up an input data instance to s times its size, while preserving certain application-specific characteristics. The advantage of the approach is that the user is not required to manually input the characteristics of the data to be produced, making it particularly suitable for OBDA benchmarks, where the complexity of database schemas might pose a challenge for manual input (e.g., the NPD benchmark contains 70 tables with some containing more than 60 columns). As opposed to a traditional data scaler, VIG includes domain information provided by the OBDA mappings and the ontology in order to produce data. VIG is currently used in the NPD benchmark, but it is not NPD-specific and can be seeded with any data instance. The distinguishing features of VIG are (1) its simple and clear generation strategy; (2) its efficiency, as each value is generated in constant time, without accesses to the disk or to RAM to retrieve previously generated values; and (3) its generality, as the data is exported in CSV files that can be easily imported by any RDBMS. VIG is a Java implementation licensed under Apache 2.0, and its source code is available on GitHub (https://github.com/ontop/vig) in the form of a Maven project. The code has been maintained for two years by the -ontop- team at the Free University of Bozen-Bolzano.
Submitted 29 July, 2016; v1 submitted 21 July, 2016;
originally announced July 2016.
-
Expressivity and Complexity of MongoDB (Extended Version)
Authors:
Elena Botoeva,
Diego Calvanese,
Benjamin Cogrel,
Guohui Xiao
Abstract:
A significant number of novel database architectures and data models have been proposed during the last decade. While some of these new systems have gained in popularity, they lack a proper formalization, and a precise understanding of the expressivity and the computational properties of the associated query languages. In this paper, we aim at filling this gap, and we do so by considering MongoDB, a widely adopted document database managing complex (tree structured) values represented in a JSON-based data model, equipped with a powerful query mechanism. We provide a formalization of the MongoDB data model, and of a core fragment, called MQuery, of the MongoDB query language. We study the expressivity of MQuery, showing its equivalence with nested relational algebra. We further investigate the computational complexity of significant fragments of it, obtaining several (tight) bounds in combined complexity, which range from LOGSPACE to alternating exponential-time with a polynomial number of alternations. As a consequence, we obtain also a characterization of the combined complexity of nested relational algebra query evaluation.
Submitted 25 April, 2017; v1 submitted 30 March, 2016;
originally announced March 2016.
-
Semantics and Analysis of DMN Decision Tables
Authors:
Diego Calvanese,
Marlon Dumas,
Ülari Laurson,
Fabrizio M. Maggi,
Marco Montali,
Irene Teinemaa
Abstract:
The Decision Model and Notation (DMN) is a standard notation to capture decision logic in business applications in general and business processes in particular. A central construct in DMN is that of a decision table. The increasing use of DMN decision tables to capture critical business knowledge raises the need to support analysis tasks on these tables such as correctness and completeness checking. This paper provides a formal semantics for DMN tables, a formal definition of key analysis tasks and scalable algorithms to tackle two such tasks, i.e., detection of overlapping rules and of missing rules. The algorithms are based on a geometric interpretation of decision tables that can be used to support other analysis tasks by tapping into geometric algorithms. The algorithms have been implemented in an open-source DMN editor and tested on large decision tables derived from a credit lending dataset.
Submitted 24 March, 2016;
originally announced March 2016.
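The geometric reading of decision tables mentioned in this abstract can be illustrated with a small sketch. It assumes each rule constrains every numeric input column to a closed interval, so a rule is an axis-aligned box and two rules overlap exactly when their intervals intersect in every column. The naive pairwise check below is a stand-in chosen for brevity, not the scalable algorithm from the paper, and the rule names and bounds are invented.

```python
from itertools import combinations


def intervals_intersect(a, b):
    """Closed intervals (lo, hi) intersect iff neither ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_rules(rules):
    """Return pairs of rule names whose boxes intersect.

    rules maps a rule name to a list of (lo, hi) intervals, one per input column.
    """
    return [
        (name1, name2)
        for (name1, box1), (name2, box2) in combinations(rules.items(), 2)
        if all(intervals_intersect(i, j) for i, j in zip(box1, box2))
    ]


# Hypothetical two-column table: (age, loan amount).
rules = {
    "r1": [(0, 50), (0, 1000)],
    "r2": [(40, 60), (500, 2000)],
    "r3": [(61, 100), (0, 1000)],
}
print(overlapping_rules(rules))  # r1 and r2 share ages 40..50 and amounts 500..1000
```

The pairwise check is quadratic in the number of rules, which is the cost that a sweep-based geometric algorithm is meant to avoid on large tables.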
-
Beyond OWL 2 QL in OBDA: Rewritings and Approximations (Extended Version)
Authors:
Elena Botoeva,
Diego Calvanese,
Valerio Santarelli,
Domenico Fabio Savo,
Alessandro Solimando,
Guohui Xiao
Abstract:
Ontology-based data access (OBDA) is a novel paradigm facilitating access to relational data, realized by linking data sources to an ontology by means of declarative mappings. DL-Lite_R, which is the logic underpinning the W3C ontology language OWL 2 QL and the current language of choice for OBDA, has been designed with the goal of delegating query answering to the underlying database engine, and thus is restricted in expressive power. E.g., it does not allow one to express disjunctive information, and any form of recursion on the data. The aim of this paper is to overcome these limitations of DL-Lite_R, and extend OBDA to more expressive ontology languages, while still leveraging the underlying relational technology for query answering. We achieve this by relying on two well-known mechanisms, namely conservative rewriting and approximation, but significantly extend their practical impact by bringing into the picture the mapping, an essential component of OBDA. Specifically, we develop techniques to rewrite OBDA specifications with an expressive ontology to "equivalent" ones with a DL-Lite_R ontology, if possible, and to approximate them otherwise. We do so by exploiting the high expressive power of the mapping layer to capture part of the domain semantics of rich ontology languages. We have implemented our techniques in the prototype system OntoProx, making use of the state-of-the-art OBDA system Ontop and the query answering system Clipper, and we have shown their feasibility and effectiveness with experiments on synthetic and real-world data.
Submitted 1 December, 2015; v1 submitted 26 November, 2015;
originally announced November 2015.
-
Fixpoint Node Selection Query Languages for Trees
Authors:
Diego Calvanese,
Giuseppe De Giacomo,
Maurizio Lenzerini,
Moshe Y. Vardi
Abstract:
The study of node selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On one hand, there has been an extensive study of XPath and its various extensions. On the other hand, query languages based on classical logics, such as first-order logic (FO) or Monadic Second-Order Logic (MSO), have been considered. Results in this area typically relate an XPath-based language to a classical logic. What has yet to emerge is an XPath-related language that is as expressive as MSO, and at the same time enjoys the computational properties of XPath, which are linear time query evaluation and exponential time query-containment test. In this paper we propose muXPath, which is the alternation-free fragment of XPath extended with fixpoint operators. Using two-way alternating automata, we show that this language does combine desired expressiveness and computational properties, placing it as an attractive candidate for the definite node-selection query language for trees.
Submitted 14 November, 2018; v1 submitted 29 September, 2015;
originally announced September 2015.
-
Verification of Generalized Inconsistency-Aware Knowledge and Action Bases (Extended Version)
Authors:
Diego Calvanese,
Marco Montali,
Ario Santoso
Abstract:
Knowledge and Action Bases (KABs) have been put forward as a semantically rich representation of a domain, using a DL KB to account for its static aspects, and actions to evolve its extensional part over time, possibly introducing new objects. Recently, KABs have been extended to manage inconsistency, with ad-hoc verification techniques geared towards specific semantics. This work provides a twofold contribution along this line of research. On the one hand, we enrich KABs with a high-level, compact action language inspired by Golog, obtaining so-called Golog-KABs (GKABs). On the other hand, we introduce a parametric execution semantics for GKABs, so as to elegantly accommodate a plethora of inconsistency-aware semantics based on the notion of repair. We then provide several reductions for the verification of sophisticated first-order temporal properties over inconsistency-aware GKABs, and show that it can be addressed using known techniques, developed for standard KABs.
Submitted 4 June, 2015; v1 submitted 30 April, 2015;
originally announced April 2015.
-
Adding Context to Knowledge and Action Bases
Authors:
Diego Calvanese,
İsmail İlkan Ceylan,
Marco Montali,
Ario Santoso
Abstract:
Knowledge and Action Bases (KABs) have been recently proposed as a formal framework to capture the dynamics of systems which manipulate Description Logic (DL) Knowledge Bases (KBs) through action execution. In this work, we enrich the KAB setting with contextual information, making use of different context dimensions. On the one hand, context is determined by the environment using context-changing actions that make use of the current state of the KB and the current context. On the other hand, it affects the set of TBox assertions that are relevant at each time point, and that have to be considered when processing queries posed over the KAB. Here we extend to our enriched setting the results on verification of rich temporal properties expressed in mu-calculus, which had been established for standard KABs. Specifically, we show that under a run-boundedness condition, verification stays decidable.
Submitted 26 December, 2014;
originally announced December 2014.
-
Verification of Relational Multiagent Systems with Data Types (Extended Version)
Authors:
Diego Calvanese,
Giorgio Delzanno,
Marco Montali
Abstract:
We study the extension of relational multiagent systems (RMASs), where agents manipulate full-fledged relational databases, with data types and facets equipped with domain-specific, rigid relations (such as total orders). Specifically, we focus on design-time verification of RMASs against rich first-order temporal properties expressed in a variant of first-order mu-calculus with quantification across states. We build on previous decidability results under the "state-bounded" assumption, i.e., in each single state only a bounded number of data objects is stored in the agent databases, while unboundedly many can be encountered over time. We recast this condition, showing decidability in presence of dense, linear orders, and facets defined on top of them. Our approach is based on the construction of a finite-state, sound and complete abstraction of the original system, in which dense linear orders are reformulated as non-rigid relations working on the active domain of the system only. We also show undecidability when including a data type equipped with the successor relation.
Submitted 17 November, 2014;
originally announced November 2014.
-
Verifiable UML Artifact-Centric Business Process Models (Extended Version)
Authors:
Diego Calvanese,
Marco Montali,
Montserrat Estanol,
Ernest Teniente
Abstract:
Artifact-centric business process models have gained increasing momentum recently due to their ability to combine structural (i.e., data related) with dynamical (i.e., process related) aspects. In particular, two main lines of research have been pursued so far: one tailored to business artifact modeling languages and methodologies, the other focused on the foundations for their formal verification. In this paper, we merge these two lines of research, by showing how recent theoretical decidability results for verification can be fruitfully transferred to a concrete UML-based modeling methodology. In particular, we identify additional steps in the methodology that, in significant cases, guarantee the possibility of verifying the resulting models against rich first-order temporal properties. Notably, our results can be seamlessly transferred to different languages for the specification of the artifact lifecycles.
Submitted 26 August, 2014; v1 submitted 21 August, 2014;
originally announced August 2014.
-
Managing Change in Graph-structured Data Using Description Logics (long version with appendix)
Authors:
Shqiponja Ahmetaj,
Diego Calvanese,
Magdalena Ortiz,
Mantas Simkus
Abstract:
In this paper, we consider the setting of graph-structured data that evolves as a result of operations carried out by users or applications. We study different reasoning problems, which range from ensuring the satisfaction of a given set of integrity constraints after a given sequence of updates, to deciding the (non-)existence of a sequence of actions that would take the data to an (un)desirable state, starting either from a specific data instance or from an incomplete description of it. We consider an action language in which actions are finite sequences of conditional insertions and deletions of nodes and labels, and use Description Logics for describing integrity constraints and (partial) states of the data. We then formalize the above data management problems as a static verification problem and several planning problems. We provide algorithms and tight complexity bounds for the formalized problems, both for an expressive DL and for a variant of DL-Lite.
Submitted 29 May, 2014; v1 submitted 16 April, 2014;
originally announced April 2014.
-
Updating RDFS ABoxes and TBoxes in SPARQL
Authors:
Albin Ahmeti,
Diego Calvanese,
Axel Polleres
Abstract:
Updates in RDF stores have recently been standardised in the SPARQL 1.1 Update specification. However, computing answers entailed by ontologies in triple stores is usually treated as orthogonal to updates. Even the W3C's recent SPARQL 1.1 Update language and SPARQL 1.1 Entailment Regimes specifications explicitly exclude a standard behaviour for how SPARQL endpoints should treat entailment regimes other than simple entailment in the context of updates. In this paper, we take a first step to close this gap. We define a fragment of SPARQL basic graph patterns corresponding to (the RDFS fragment of) DL-Lite and the corresponding SPARQL update language, dealing with updates both of ABox and of TBox statements. We discuss possible semantics along with potential strategies for implementing them. We treat both (i) materialised RDF stores, which store all entailed triples explicitly, and (ii) reduced RDF stores, that is, redundancy-free RDF stores that do not store any RDF triples (corresponding to DL-Lite ABox statements) entailed by others already.
Submitted 27 March, 2014;
originally announced March 2014.
-
Nested Regular Path Queries in Description Logics
Authors:
Meghyn Bienvenu,
Diego Calvanese,
Magdalena Ortiz,
Mantas Simkus
Abstract:
Two-way regular path queries (2RPQs) have received increased attention recently due to their ability to relate pairs of objects by flexibly navigating graph-structured data. They are present in property paths in SPARQL 1.1, the new standard RDF query language, and in the XML query language XPath. In line with XPath, we consider the extension of 2RPQs with nesting, which allows one to require that objects along a path satisfy complex conditions, in turn expressed through (nested) 2RPQs. We study the computational complexity of answering nested 2RPQs and conjunctions thereof (CN2RPQs) in the presence of domain knowledge expressed in description logics (DLs). We establish tight complexity bounds in data and combined complexity for a variety of DLs, ranging from lightweight DLs (DL-Lite, EL) up to highly expressive ones. Interestingly, we are able to show that adding nesting to (C)2RPQs does not affect worst-case data complexity of query answering for any of the considered DLs. However, in the case of lightweight DLs, adding nesting to 2RPQs leads to a surprising jump in combined complexity, from P-complete to Exp-complete.
Submitted 4 March, 2014; v1 submitted 27 February, 2014;
originally announced February 2014.
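As a rough illustration of what a two-way regular path query does, the sketch below evaluates a 2RPQ over a plain edge-labelled graph by a breadth-first search over (node, automaton state) pairs. It is an assumption-laden toy: there is no description logic ontology and no nesting, the query is supplied directly as an NFA rather than a regular expression, and the convention that a label ending in `-` traverses an edge backwards is made up for this example.

```python
from collections import deque


def eval_2rpq(edges, nfa, start_state, final_states, source):
    """Return all nodes reachable from source along a path accepted by the NFA.

    edges: iterable of (u, label, v) triples.
    nfa: dict mapping (state, symbol) to a set of successor states, where the
         symbol "a" follows an a-edge forwards and "a-" follows it backwards.
    """
    # Index both directions of every edge so inverse symbols are a plain lookup.
    step = {}
    for u, label, v in edges:
        step.setdefault((u, label), set()).add(v)
        step.setdefault((v, label + "-"), set()).add(u)

    seen = {(source, start_state)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in final_states:
            answers.add(node)
        for (q, symbol), next_states in nfa.items():
            if q != state:
                continue
            for next_node in step.get((node, symbol), ()):
                for next_state in next_states:
                    if (next_node, next_state) not in seen:
                        seen.add((next_node, next_state))
                        queue.append((next_node, next_state))
    return answers


# Hypothetical family graph and the query parentOf- . parentOf
# ("go up to a parent, then down to a child").
edges = [("ann", "parentOf", "bob"), ("ann", "parentOf", "cat"),
         ("cat", "parentOf", "dan")]
nfa = {(0, "parentOf-"): {1}, (1, "parentOf"): {2}}
print(sorted(eval_2rpq(edges, nfa, 0, {2}, "bob")))
```

The search visits each (node, state) pair at most once, which is the usual product-construction argument for evaluating path queries over a fixed graph; the complexity results in the abstract concern the much harder setting where an ontology is also present.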
-
Reasoning about Explanations for Negative Query Answers in DL-Lite
Authors:
Diego Calvanese,
Magdalena Ortiz,
Mantas Simkus,
Giorgio Stefanoni
Abstract:
In order to meet usability requirements, most logic-based applications provide explanation facilities for reasoning services. This holds also for Description Logics, where research has focused on the explanation of both TBox reasoning and, more recently, query answering. Besides explaining the presence of a tuple in a query answer, it is important to explain also why a given tuple is missing. We address the latter problem for instance and conjunctive query answering over DL-Lite ontologies by adopting abductive reasoning; that is, we look for additions to the ABox that force a given tuple to be in the result. As reasoning tasks we consider existence and recognition of an explanation, and relevance and necessity of a given assertion for an explanation. We characterize the computational complexity of these problems for arbitrary, subset minimal, and cardinality minimal explanations.
Submitted 3 February, 2014;
originally announced February 2014.
-
Description Logic Knowledge and Action Bases
Authors:
Babak Bagheri Hariri,
Diego Calvanese,
Marco Montali,
Giuseppe De Giacomo,
Riccardo De Masellis,
Paolo Felli
Abstract:
Description logic Knowledge and Action Bases (KAB) are a mechanism for providing both a semantically rich representation of the information on the domain of interest in terms of a description logic knowledge base and actions to change such information over time, possibly introducing new objects. We resort to a variant of DL-Lite where the unique name assumption is not enforced and where equality between objects may be asserted and inferred. Actions are specified as sets of conditional effects, where conditions are based on epistemic queries over the knowledge base (TBox and ABox), and effects are expressed in terms of new ABoxes. In this setting, we address verification of temporal properties expressed in a variant of first-order mu-calculus with quantification across states. Notably, we show decidability of verification, under a suitable restriction inspired by the notion of weak acyclicity in data exchange.
Submitted 3 February, 2014;
originally announced February 2014.
-
The DL-Lite Family and Relations
Authors:
Alessandro Artale,
Diego Calvanese,
Roman Kontchakov,
Michael Zakharyaschev
Abstract:
The recently introduced series of description logics under the common moniker DL-Lite has attracted the attention of the description logic and semantic web communities due to the low computational complexity of inference, on the one hand, and the ability to represent conceptual modeling formalisms, on the other. The main aim of this article is to carry out a thorough and systematic investigation of inference in extensions of the original DL-Lite logics along five axes: by (i) adding the Boolean connectives and (ii) number restrictions to concept constructs, (iii) allowing role hierarchies, (iv) allowing role disjointness, symmetry, asymmetry, reflexivity, irreflexivity and transitivity constraints, and (v) adopting or dropping the unique name assumption. We analyze the combined complexity of satisfiability for the resulting logics, as well as the data complexity of instance checking and answering positive existential queries. Our approach is based on embedding DL-Lite logics in suitable fragments of the one-variable first-order logic, which provides useful insights into their properties and, in particular, computational behavior.
Submitted 15 January, 2014;
originally announced January 2014.
-
Shape and Content: Incorporating Domain Knowledge into Shape Analysis
Authors:
Diego Calvanese,
Tomer Kotek,
Mantas Šimkus,
Helmut Veith,
Florian Zuleger
Abstract:
The verification community has studied dynamic data structures primarily in a bottom-up way by analyzing pointers and the shapes induced by them. Recent work in fields such as separation logic has made significant progress in extracting shapes from program source code. Many real world programs however manipulate complex data whose structure and content is most naturally described by formalisms from object oriented programming and databases. In this paper, we look at the verification of programs with dynamic data structures from the perspective of content representation. Our approach is based on description logic, a widely used knowledge representation paradigm which gives a logical underpinning for diverse modeling frameworks such as UML and ER. Technically, we assume that we have separation logic shape invariants obtained from a shape analysis tool, and requirements on the program data in terms of description logic. We show that the two-variable fragment of first order logic with counting and trees (whose decidability was proved at LICS 2013) can be used as a joint framework to embed suitable fragments of description logic and separation logic.
Submitted 9 July, 2014; v1 submitted 23 December, 2013;
originally announced December 2013.
-
Verification of Semantically-Enhanced Artifact Systems (Extended Version)
Authors:
Babak Bagheri Hariri,
Diego Calvanese,
Marco Montali,
Ario Santoso,
Dmitry Solomakhin
Abstract:
Artifact-Centric systems have emerged in the last years as a suitable framework to model business-relevant entities, by combining their static and dynamic aspects. In particular, the Guard-Stage-Milestone (GSM) approach has been recently proposed to model artifacts and their lifecycle in a declarative way. In this paper, we enhance GSM with a Semantic Layer, constituted by a full-fledged OWL 2 QL ontology linked to the artifact information models through mapping specifications. The ontology provides a conceptual view of the domain under study, and allows one to understand the evolution of the artifact system at a higher level of abstraction. In this setting, we present a technique to specify temporal properties expressed over the Semantic Layer, and verify them according to the evolution in the underlying GSM model. This technique has been implemented in a tool that exploits state-of-the-art ontology-based data access technologies to manipulate the temporal properties according to the ontology and the mappings, and that relies on the GSMC model checker for verification.
Submitted 28 August, 2013;
originally announced August 2013.
-
Verification of Inconsistency-Aware Knowledge and Action Bases (Extended Version)
Authors:
Diego Calvanese,
Evgeny Kharlamov,
Marco Montali,
Ario Santoso,
Dmitriy Zheleznyakov
Abstract:
Description Logic Knowledge and Action Bases (KABs) have been recently introduced as a mechanism that provides a semantically rich representation of the information on the domain of interest in terms of a DL KB and a set of actions to change such information over time, possibly introducing new objects. In this setting, decidability of verification of sophisticated temporal properties over KABs, expressed in a variant of first-order mu-calculus, has been shown. However, the established framework treats inconsistency in a simplistic way, by rejecting inconsistent states produced through action execution. We address this problem by showing how inconsistency handling based on the notion of repairs can be integrated into KABs, resorting to inconsistency-tolerant semantics. In this setting, we establish decidability and complexity of verification.
Submitted 23 April, 2013;
originally announced April 2013.
-
Exchanging OWL 2 QL Knowledge Bases
Authors:
Marcelo Arenas,
Elena Botoeva,
Diego Calvanese,
Vladislav Ryzhikov
Abstract:
Knowledge base exchange is an important problem in the area of data exchange and knowledge representation, where one is interested in exchanging information between a source and a target knowledge base connected through a mapping. In this paper, we study this fundamental problem for knowledge bases and mappings expressed in OWL 2 QL, the profile of OWL 2 based on the description logic DL-Lite_R. More specifically, we consider the problem of computing universal solutions, identified as one of the most desirable translations to be materialized, and the problem of computing UCQ-representations, which optimally capture in a target TBox the information that can be extracted from a source TBox and a mapping by means of unions of conjunctive queries. For the former we provide a novel automata-theoretic technique, and complexity results that range from NP to EXPTIME, while for the latter we show NLOGSPACE-completeness.
Submitted 1 July, 2013; v1 submitted 21 April, 2013;
originally announced April 2013.
-
Verification of Relational Data-Centric Dynamic Systems with External Services
Authors:
Babak Bagheri Hariri,
Diego Calvanese,
Giuseppe De Giacomo,
Alin Deutsch,
Marco Montali
Abstract:
Data-centric dynamic systems are systems where both the process controlling the dynamics and the manipulation of data are equally central. In this paper we study verification of (first-order) mu-calculus variants over relational data-centric dynamic systems, where data are represented by a full-fledged relational database, and the process is described in terms of atomic actions that evolve the database. The execution of such actions may involve calls to external services, providing fresh data inserted into the system. As a result, such systems are typically infinite-state. We show that verification is undecidable in general, and we isolate notable cases where decidability is achieved. Specifically, we start by considering service calls that return values deterministically (depending only on passed parameters). We show that in a mu-calculus variant that preserves knowledge of objects that appeared along a run we get decidability under the assumption that the fresh data introduced along a run are bounded, though they might not be bounded in the overall system. In fact we tie such a result to a notion related to weak acyclicity studied in data exchange. Then, we move to nondeterministic services, where the assumption of data-bounded runs would result in a bound on the service calls that can be invoked during the execution and hence would be too restrictive. So we investigate decidability under the assumption that knowledge of objects is preserved only if they are continuously present. We show that if infinitely many values occur in a run but do not accumulate in the same state, then we again get decidability. We give syntactic conditions to avoid this accumulation through the novel notion of "generate-recall acyclicity", which takes into consideration that every service call activation generates new values that cannot be accumulated indefinitely.
Submitted 29 February, 2012;
originally announced March 2012.
-
Unifying Class-Based Representation Formalisms
Authors:
D. Calvanese,
M. Lenzerini,
D. Nardi
Abstract:
The notion of class is ubiquitous in computer science and is central in many formalisms for the representation of structured knowledge used both in knowledge representation and in databases. In this paper we study the basic issues underlying such representation formalisms and single out both their common characteristics and their distinguishing features. Such investigation leads us to propose a unifying framework in which we are able to capture the fundamental aspects of several representation languages used in different contexts. The proposed formalism is expressed in the style of description logics, which have been introduced in knowledge representation as a means to provide a semantically well-founded basis for the structural aspects of knowledge representation systems. The description logic considered in this paper is a subset of first order logic with nice computational characteristics. It is quite expressive and features a novel combination of constructs that has not been studied before. The distinguishing constructs are number restrictions, which generalize existence and functional dependencies, inverse roles, which allow one to refer to the inverse of a relationship, and possibly cyclic assertions, which are necessary for capturing real world domains. We are able to show that it is precisely such combination of constructs that makes our logic powerful enough to model the essential set of features for defining class structures that are common to frame systems, object-oriented database languages, and semantic data models. As a consequence of the established correspondences, several significant extensions of each of the above formalisms become available. The high expressiveness of the logic we propose and the need for capturing the reasoning in different contexts force us to distinguish between unrestricted and finite model reasoning. A notable feature of our proposal is that reasoning in both cases is decidable.
We argue that, by virtue of the high expressive power and of the associated reasoning capabilities on both unrestricted and finite models, our logic provides a common core for class-based representation formalisms.
Submitted 26 May, 2011;
originally announced May 2011.
-
View Synthesis from Schema Mappings
Authors:
Diego Calvanese,
Giuseppe De Giacomo,
Maurizio Lenzerini,
Moshe Y. Vardi
Abstract:
In data management, and in particular in data integration, data exchange, query optimization, and data privacy, the notion of view plays a central role. In several contexts, such as data integration, data mashups, and data warehousing, the need arises of designing views starting from a set of known correspondences between queries over different schemas. In this paper we deal with the issue of automating such a design process. We call this novel problem "view synthesis from schema mappings": given a set of schema mappings, each relating a query over a source schema to a query over a target schema, automatically synthesize for each source a view over the target schema in such a way that for each mapping, the query over the source is a rewriting of the query over the target wrt the synthesized views. We study view synthesis from schema mappings both in the relational setting, where queries and views are (unions of) conjunctive queries, and in the semistructured data setting, where queries and views are (two-way) regular path queries, as well as unions of conjunctions thereof. We provide techniques and complexity upper bounds for each of these cases.
Submitted 4 March, 2010;
originally announced March 2010.
-
Conjunctive Query Containment and Answering under Description Logics Constraints
Authors:
Diego Calvanese,
Giuseppe De Giacomo,
Maurizio Lenzerini
Abstract:
Query containment and query answering are two important computational tasks in databases. While query answering amounts to computing the result of a query over a database, query containment is the problem of checking whether, for every database, the result of one query is a subset of the result of another query.
In this paper, we deal with unions of conjunctive queries, and we address query containment and query answering under Description Logic constraints. Every such constraint is essentially an inclusion dependency between concepts and relations, and the expressive power of these constraints is due to the possibility of using complex expressions, e.g., intersection and difference of relations, special forms of quantification, regular expressions over binary relations, in the specification of the dependencies. These types of constraints capture a great variety of data models, including the relational, the entity-relationship, and the object-oriented model, all extended with various forms of constraints, and also the basic features of the ontology languages used in the context of the Semantic Web.
We present the following results on both query containment and query answering. We provide a method for query containment under Description Logic constraints, thus showing that the problem is decidable, and analyze its computational complexity. We prove that query containment is undecidable in the case where we allow inequalities in the right-hand side query, even for very simple constraints and queries. We show that query answering under Description Logic constraints can be reduced to query containment, and illustrate how such a reduction provides upper bound results with respect to both combined and data complexity.
Submitted 28 July, 2005;
originally announced July 2005.
-
Data complexity of answering conjunctive queries over SHIQ knowledge bases
Authors:
M. Magdalena Ortiz de la Fuente,
Diego Calvanese,
Thomas Eiter,
Enrico Franconi
Abstract:
An algorithm for answering conjunctive queries over SHIQ knowledge bases that is coNP in data complexity is given. The algorithm is based on the tableau algorithm for reasoning with individuals in SHIQ. The blocking conditions of the tableau are weakened in such a way that the set of models the modified algorithm yields suffices to check query entailment. The modified blocking conditions are based on the ones proposed by Levy and Rousset for reasoning with Horn Rules in the description logic ALCNR.
Submitted 22 July, 2005;
originally announced July 2005.