Search | arXiv e-print repository

Mining Frequent Structures in Conceptual Models

Authors: Mattia Fumagalli, Tiago Prince Sales, Pedro Paulo F. Barcelos, Giovanni Micale, Philipp-Lorenz Glaser, Dominik Bork, Vadim Zaytsev, Diego Calvanese, Giancarlo Guizzardi

Abstract: The problem of using structured methods to represent knowledge is well-known in conceptual modeling and has been studied for many years. It has been proven that adopting modeling patterns represents an effective structural method. Patterns are, indeed, generalizable recurrent structures that can be exploited as solutions to design problems. They aid in understanding and improving the process of creating models. The undeniable value of using patterns in conceptual modeling was demonstrated in several experimental studies. However, discovering patterns in conceptual models is widely recognized as a highly complex task and a systematic solution to pattern identification is currently lacking. In this paper, we propose a general approach to the problem of discovering frequent structures, as they occur in conceptual modeling languages. As proof of concept, we implement our approach by focusing on two widely-used conceptual modeling languages. This implementation includes an exploratory tool that integrates a frequent subgraph mining algorithm with graph manipulation techniques. The tool processes multiple conceptual models and identifies recurrent structures based on various criteria. We validate the tool using two state-of-the-art curated datasets: one consisting of models encoded in OntoUML and the other in ArchiMate. The primary objective of our approach is to provide a support tool for language engineers. This tool can be used to identify both effective and ineffective modeling practices, enabling the refinement and evolution of conceptual modeling languages. Furthermore, it facilitates the reuse of accumulated expertise, ultimately supporting the creation of higher-quality models in a given language.

Submitted 25 December, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2310.11555 [pdf, other]

Integrating 3D City Data through Knowledge Graphs

Authors: Linfang Ding, Guohui Xiao, Albulen Pano, Mattia Fumagalli, Dongsheng Chen, Yu Feng, Diego Calvanese, Hongchao Fan, Liqiu Meng

Abstract: CityGML is a widely adopted standard by the Open Geospatial Consortium (OGC) for representing and exchanging 3D city models. The representation of semantic and topological properties in CityGML makes it possible to query such 3D city data to perform analysis in various applications, e.g., security management and emergency response, energy consumption and estimation, and occupancy measurement. However, the potential of querying CityGML data has not been fully exploited. The official GML/XML encoding of CityGML is only intended as an exchange format but is not suitable for query answering. The most common way of dealing with CityGML data is to store them in the 3DCityDB system as relational tables and then query them with the standard SQL query language. Nevertheless, for end users, it remains a challenging task to formulate queries over 3DCityDB directly for their ad-hoc analytical tasks, because there is a gap between the conceptual semantics of CityGML and the relational schema adopted in 3DCityDB. In fact, the semantics of CityGML itself can be modeled as a suitable ontology. The technology of Knowledge Graphs (KGs), where an ontology is at the core, is a good solution to bridge such a gap. Moreover, embracing KGs makes it easier to integrate with other spatial data sources, e.g., OpenStreetMap and existing (Geo)KGs (e.g., Wikidata, DBPedia, and GeoNames), and to perform queries combining information from multiple data sources. In this work, we describe a CityGML KG framework to populate the concepts in the CityGML ontology using declarative mappings to 3DCityDB, thus exposing the CityGML data therein as a KG. To demonstrate the feasibility of our approach, we use CityGML data from the city of Munich as test data and integrate OpenStreetMap data in the same area.

Submitted 17 October, 2023; originally announced October 2023.

arXiv:2201.12855 [pdf, ps, other]

doi 10.1145/3576047

AI-Augmented Business Process Management Systems: A Research Manifesto

Authors: Marlon Dumas, Fabiana Fournier, Lior Limonad, Andrea Marrella, Marco Montali, Jana-Rebecca Rehse, Rafael Accorsi, Diego Calvanese, Giuseppe De Giacomo, Dirk Fahland, Avigdor Gal, Marcello La Rosa, Hagen Völzer, Ingo Weber

Abstract: AI-Augmented Business Process Management Systems (ABPMSs) are an emerging class of process-aware information systems, empowered by trustworthy AI technology. An ABPMS enhances the execution of business processes with the aim of making these processes more adaptable, proactive, explainable, and context-sensitive. This manifesto presents a vision for ABPMSs and discusses research challenges that need to be surmounted to realize this vision. To this end, we define the concept of ABPMS, we outline the lifecycle of processes within an ABPMS, we discuss core characteristics of an ABPMS, and we derive a set of challenges to realize systems with these characteristics.

Submitted 4 November, 2022; v1 submitted 30 January, 2022; originally announced January 2022.

Comments: 19 pages, 1 figure

Journal ref: ACM Transactions on Management Information Systems, 31 January 2023 Volume 14, Issue 1, Article No.: 11, pp 1-19

arXiv:2108.12330 [pdf, ps, other]

SMT-Based Safety Verification of Data-Aware Processes under Ontologies (Extended Version)

Authors: Diego Calvanese, Alessandro Gianola, Andrea Mazzullo, Marco Montali

Abstract: In the context of verification of data-aware processes (DAPs), a formal approach based on satisfiability modulo theories (SMT) has been considered to verify parameterised safety properties of so-called artifact-centric systems. This approach requires a combination of model-theoretic notions and algorithmic techniques based on backward reachability. We introduce here a variant of one of the most investigated models in this spectrum, namely simple artifact systems (SASs), where, instead of managing a database, we operate over a description logic (DL) ontology expressed in (a slight extension of) RDFS. This DL, enjoying suitable model-theoretic properties, allows us to define DL-based SASs to which backward reachability can still be applied, leading to decidability in PSPACE of the corresponding safety problems.

Submitted 27 August, 2021; originally announced August 2021.

arXiv:2104.04194 [pdf, other]

INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision]

Authors: Sihem Amer-Yahia, Georgia Koutrika, Frederic Bastian, Theofilos Belmpas, Martin Braschler, Ursin Brunner, Diego Calvanese, Maximilian Fabricius, Orest Gkini, Catherine Kosten, Davide Lanti, Antonis Litke, Hendrik Lücke-Tieke, Francesco Alessandro Massucci, Tarcisio Mendes de Farias, Alessandro Mosca, Francesco Multari, Nikolaos Papadakis, Dimitris Papadopoulos, Yogendra Patil, Aurélien Personnaz, Guillem Rull, Ana Sima, Ellery Smith, Dimitrios Skoutas, et al. (3 additional authors not shown)

Abstract: A full-fledged data exploration system must combine different access modalities with a powerful concept of guiding the user in the exploration process, by being reactive and anticipative both for data discovery and for data linking. Such systems are a real opportunity for our community to cater to users with different domain and data science expertise. We introduce INODE -- an end-to-end data exploration system -- that leverages, on the one hand, Machine Learning and, on the other hand, semantics for the purpose of Data Management (DM). Our vision is to develop a classic unified, comprehensive platform that provides extensive access to open datasets, and we demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics. INODE offers sustainable services in (a) data modeling and linking, (b) integrated query processing using natural language, (c) guidance, and (d) data exploration through visualization, thus facilitating the user in discovering new insights. We demonstrate that our system is uniquely accessible to a wide range of users from larger scientific communities to the public. Finally, we briefly illustrate how this work paves the way for new research opportunities in DM.

Submitted 9 April, 2021; originally announced April 2021.

Comments: 8 pages, 5 figures

ACM Class: I.2; H.2

arXiv:2012.01917 [pdf, other]

Mapping Patterns for Virtual Knowledge Graphs

Authors: Diego Calvanese, Avigdor Gal, Davide Lanti, Marco Montali, Alessandro Mosca, Roee Shraga

Abstract: Virtual Knowledge Graphs (VKG) constitute one of the most promising paradigms for integrating and accessing legacy data sources. A critical bottleneck in the integration process involves the definition, validation, and maintenance of mappings that link data sources to a domain ontology. To support the management of mappings throughout their entire lifecycle, we propose a comprehensive catalog of sophisticated mapping patterns that emerge when linking databases to ontologies. To do so, we build on well-established methodologies and patterns studied in data management, data analysis, and conceptual modeling. These are extended and refined through the analysis of concrete VKG benchmarks and real-world use cases, and considering the inherent impedance mismatch between data sources and ontologies. We validate our catalog on the considered VKG scenarios, showing that it covers the vast majority of patterns present therein.

Submitted 11 August, 2023; v1 submitted 3 December, 2020; originally announced December 2020.

Comments: 40 pages

arXiv:2005.05886 [pdf, ps, other]

Counting Query Answers over a DL-Lite Knowledge Base (extended version)

Authors: Diego Calvanese, Julien Corman, Davide Lanti, Simon Razniewski

Abstract: Counting answers to a query is an operation supported by virtually all database management systems. In this paper we focus on counting answers over a Knowledge Base (KB), which may be viewed as a database enriched with background knowledge about the domain under consideration. In particular, we place our work in the context of Ontology-Mediated Query Answering/Ontology-based Data Access (OMQA/OBDA), where the language used for the ontology is a member of the DL-Lite family and the data is a (usually virtual) set of assertions. We study the data complexity of query answering, for different members of the DL-Lite family that include number restrictions, and for variants of conjunctive queries with counting that differ with respect to their shape (connected, branching, rooted). We improve upon existing results by providing PTIME and coNP lower bounds, and upper bounds in PTIME and LOGSPACE. For the latter case, we define a novel query rewriting technique into first-order logic with counting.

Submitted 17 July, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

Comments: Extended version of an article published at IJCAI 2020

arXiv:2001.09365 [pdf, other]

On Expansion and Contraction of DL-Lite Knowledge Bases

Authors: Dmitriy Zheleznyakov, Evgeny Kharlamov, Werner Nutt, Diego Calvanese

Abstract: Knowledge bases (KBs) are not static entities: new information constantly appears and some of the previous knowledge becomes obsolete. In order to reflect this evolution of knowledge, KBs should be expanded with the new knowledge and contracted from the obsolete one. This problem is well-studied for propositional but much less for first-order KBs. In this work we investigate knowledge expansion and contraction for KBs expressed in DL-Lite, a family of description logics (DLs) that underlie the tractable fragment OWL 2 QL of the Web Ontology Language OWL 2. We start with a novel knowledge evolution framework and natural postulates that evolution should respect, and compare our postulates to the well-established AGM postulates. We then review well-known model and formula-based approaches for expansion and contraction for propositional theories and show how they can be adapted to the case of DL-Lite. In particular, we show intrinsic limitations of model-based approaches: besides the fact that some of them do not respect the postulates we have established, they ignore the structural properties of KBs. This leads to undesired properties of evolution results: evolution of DL-Lite KBs cannot be captured in DL-Lite. Moreover, we show that well-known formula-based approaches are also not appropriate for DL-Lite expansion and contraction: they either have a high complexity of computation, or they produce logical theories that cannot be expressed in DL-Lite. Thus, we propose a novel formula-based approach that respects our principles and for which evolution is expressible in DL-Lite. For this approach we also propose polynomial time deterministic algorithms to compute evolution of DL-Lite KBs when evolution affects only factual data.

Submitted 25 January, 2020; originally announced January 2020.

arXiv:1911.07774 [pdf, other]

Combined Covers and Beth Definability (Extended Version)

Authors: Diego Calvanese, Silvio Ghilardi, Alessandro Gianola, Marco Montali, Andrey Rivkin

Abstract: In ESOP 2008, Gulwani and Musuvathi introduced a notion of cover and exploited it to handle infinite-state model checking problems. Motivated by applications to the verification of data-aware processes, we proved in a previous paper that covers are strictly related to model completions, a well-known topic in model theory. In this paper we investigate cover transfer to theory combinations in the disjoint signatures case. We prove that for convex theories, cover algorithms can be transferred to theory combinations under the same hypothesis (equality interpolation property aka strong amalgamation property) needed to transfer quantifier-free interpolation. In the non-convex case, we show by a counterexample that covers may not exist in the combined theories, even in case combined quantifier-free interpolants do exist. However, we exhibit a cover transfer algorithm operating also in the non-convex case for special kinds of theory combinations; these combinations (called 'tame combinations') concern multi-sorted theories arising in many model-checking applications (in particular, the ones oriented to verification of data-aware processes).

Submitted 29 June, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1906.07811 [pdf, ps, other]

Formal Modeling and SMT-Based Parameterized Verification of Data-Aware BPMN (Extended Version)

Authors: Diego Calvanese, Silvio Ghilardi, Alessandro Gianola, Marco Montali, Andrey Rivkin

Abstract: We propose DAB -- a data-aware extension of BPMN where the process operates over case and persistent data (partitioned into a read-only database called catalog and a read-write database called repository). The model trades off between expressiveness and the possibility of supporting parameterized verification of safety properties on top of it. Specifically, taking inspiration from the literature on verification of artifact systems, we study verification problems where safety properties are checked irrespectively of the content of the read-only catalog, and accepting the potential presence of unboundedly many tuples in the catalog and repository. We tackle such problems using an array-based backward reachability procedure fully implemented in MCMT -- a state-of-the-art array-based SMT model checker. Notably, we prove that the procedure is sound and complete for checking safety of DABs, and single out additional conditions that guarantee its termination and, in turn, show decidability of checking safety.

Submitted 24 June, 2019; v1 submitted 31 May, 2019; originally announced June 2019.

Comments: long version of a paper accepted at the BPM conference. arXiv admin note: substantial text overlap with arXiv:1905.12991

arXiv:1906.00179 [pdf, ps, other]

Enriching Ontology-based Data Access with Provenance (Extended Version)

Authors: Diego Calvanese, Davide Lanti, Ana Ozaki, Rafael Penaloza, Guohui Xiao

Abstract: Ontology-based data access (OBDA) is a popular paradigm for querying heterogeneous data sources by connecting them through mappings to an ontology. In OBDA, it is often difficult to reconstruct why a tuple occurs in the answer of a query. We address this challenge by enriching OBDA with provenance semirings, taking inspiration from database theory. In particular, we investigate the problems of (i) deciding whether a provenance annotated OBDA instance entails a provenance annotated conjunctive query, and (ii) computing a polynomial representing the provenance of a query entailed by a provenance annotated OBDA instance. Differently from pure databases, in our case these polynomials may be infinite. To regain finiteness, we consider idempotent semirings, and study the complexity in the case of DL-Lite ontologies. We implement Task (ii) in a state-of-the-art OBDA system and show the practical feasibility of the approach through an extensive evaluation against two popular benchmarks.

Submitted 1 June, 2019; originally announced June 2019.
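
Illustrative note on the entry above: the provenance-semiring idea from database theory that the abstract builds on can be sketched for a plain database, with no ontology or mappings involved. The sketch below is not the paper's method; the function name `join_provenance`, the query q(x) :- R(x, y), S(y), and the tuple annotations are all assumptions made for illustration. Each derivation of an answer contributes one monomial (the product of the annotations of the tuples it joins), and alternative derivations are summed.

```python
from collections import Counter


def join_provenance(r, s):
    """Provenance polynomials for q(x) :- R(x, y), S(y) over a plain database.

    r maps R-tuples (x, y) to a provenance variable name,
    s maps S-tuples (y,) to a provenance variable name.
    Each answer x is mapped to a Counter whose keys are monomials
    (sorted tuples of variable names) and whose values are coefficients.
    """
    result = {}
    for (x, y), r_var in r.items():
        for (y2,), s_var in s.items():
            if y == y2:
                # Joint use of two tuples multiplies their annotations.
                monomial = tuple(sorted((r_var, s_var)))
                # Alternative derivations of the same answer are added up.
                result.setdefault(x, Counter())[monomial] += 1
    return result


# Hypothetical annotated instance: answer "a" has two derivations.
R = {("a", "b"): "r1", ("a", "c"): "r2", ("d", "e"): "r3"}
S = {("b",): "s1", ("c",): "s2"}
polynomials = join_provenance(R, S)  # "a" maps to r1*s1 + r2*s2
```

In this finite, ontology-free setting the polynomial is always finite; the abstract's point is that this no longer holds once an ontology is added, which is why idempotent semirings are considered there.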

arXiv:1905.12991 [pdf, ps, other]

Formal Modeling and SMT-Based Parameterized Verification of Multi-Case Data-Aware BPMN

Authors: Diego Calvanese, Silvio Ghilardi, Alessandro Gianola, Marco Montali, Andrey Rivkin

Abstract: We propose DAB -- a data-aware extension of the BPMN de-facto standard with the ability of operating over case and persistent data (partitioned into a read-only catalog and a read-write repository), and that balances between expressiveness and the possibility of supporting parameterized verification of safety properties on top of it. In particular, we take inspiration from the literature on verification of artifact systems, and consider verification problems where safety properties are checked irrespectively of the content of the read-only catalog, possibly considering an unbounded number of active cases and tuples in the catalog and repository. Such problems are tackled using fully implemented array-based backward reachability techniques belonging to the well-established tradition of SMT model checking. We also identify relevant classes of DABs for which the backward reachability procedure implemented in the MCMT array-based model checker is sound and complete, and then further strengthen such classes to ensure termination.

Submitted 20 June, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

Comments: This article builds upon arXiv:1906.07811, extending it in two respects. First, while arXiv:1906.07811 focuses on the verification of DABs considering a single, running case, we consider here the possibility of (unboundedly many) cases running concurrently. Second, we provide full proofs of the technical results, including those from arXiv:1906.07811 and those for this version

arXiv:1810.08062 [pdf, other]

Modeling and In-Database Management of Relational, Data-Aware Processes (Extended Version)

Authors: Diego Calvanese, Marco Montali, Fabio Patrizi, Andrey Rivkin

Abstract: During the last two decades, it has been increasingly acknowledged that the engineering of information systems usually requires a huge effort in integrating master data and business processes. This has led to a plethora of proposals, both from academia and the industry. However, such approaches typically come with ad-hoc abstractions to represent and interact with the data component. This has a twofold disadvantage. On the one hand, they cannot be used to effortlessly enrich an existing relational database with dynamics. On the other hand, they generally do not allow for integrated modelling, verification, and enactment. We attack these two challenges by proposing a declarative approach, fully grounded in SQL, that supports the agile modelling of relational data-aware processes directly on top of relational databases. We show how this approach can be automatically translated into a concrete procedural SQL dialect, executable directly inside any relational database engine. The translation exploits an in-database representation of process states that, in turn, is used to handle, at once, process enactment with or without logging of the executed instances, as well as process verification. The approach has been implemented in a working prototype.

Submitted 8 July, 2019; v1 submitted 18 October, 2018; originally announced October 2018.

arXiv:1807.11615 [pdf, other]

Semantic DMN: Formalizing and Reasoning About Decisions in the Presence of Background Knowledge

Authors: Diego Calvanese, Marlon Dumas, Fabrizio Maria Maggi, Marco Montali

Abstract: The Decision Model and Notation (DMN) is a recent OMG standard for the elicitation and representation of decision models, and for managing their interconnection with business processes. DMN builds on the notion of decision tables, and their combination into more complex decision requirements graphs (DRGs), which bridge between business process models and decision logic models. DRGs may rely on additional, external business knowledge models, whose functioning is not part of the standard. In this work, we consider one of the most important types of business knowledge, namely background knowledge that conceptually accounts for the structural aspects of the domain of interest, and propose decision knowledge bases (DKBs), which semantically combine DRGs modeled in DMN, and domain knowledge captured by means of first-order logic with datatypes. We provide a logic-based semantics for such an integration, and formalize different DMN reasoning tasks for DKBs. We then consider background knowledge formulated as a description logic ontology with datatypes, and show how the main verification tasks for DMN in this enriched setting can be formalized as standard DL reasoning services, and actually carried out in ExpTime. We discuss the effectiveness of our framework on a case study in maritime security.

Submitted 14 September, 2018; v1 submitted 30 July, 2018; originally announced July 2018.

Comments: Under consideration for publication in Theory and Practice of Logic Programming (TPLP)

arXiv:1806.11459 [pdf, other]

Verification of Data-Aware Processes via Array-Based Systems (Extended Version)

Authors: Diego Calvanese, Silvio Ghilardi, Alessandro Gianola, Marco Montali, Andrey Rivkin

Abstract: We study verification over a general model of artifact-centric systems, to assess (parameterized) safety properties irrespectively of the initial database instance. We view such artifact systems as array-based systems, which allows us to check safety by adapting backward reachability, establishing for the first time a correspondence with model checking based on Satisfiability-Modulo-Theories (SMT). To do so, we make use of the model-theoretic machinery of model completion, which surprisingly turns out to be an effective tool for verification of relational systems, and represents the main original contribution of this paper. In this way, we pursue a twofold purpose. On the one hand, we reconstruct (restricted to safety) the essence of some important decidability results obtained in the literature for artifact-centric systems, and we devise a genuinely novel class of decidable cases. On the other, we are able to exploit SMT technology in implementations, building on the well-known MCMT model checker for array-based systems, and extending it to make all our foundational results fully operational.

Submitted 27 February, 2019; v1 submitted 29 June, 2018; originally announced June 2018.

arXiv:1806.09686 [pdf, ps, other]

Quantifier Elimination for Database Driven Verification

Authors: Diego Calvanese, Silvio Ghilardi, Alessandro Gianola, Marco Montali, Andrey Rivkin

Abstract: Running verification tasks in database driven systems requires solving quantifier elimination problems of a new kind. These quantifier elimination problems are related to the notion of a cover introduced in ESOP 2008 by Gulwani and Musuvathi. In this paper, we show how covers are strictly related to model completions, a well-known topic in model theory. We also investigate the computation of covers within the Superposition Calculus, by adopting a constrained version of the calculus, equipped with appropriate settings and reduction strategies. In addition, we show that cover computations are computationally tractable for the fragment of the language used in applications to database driven verification. This observation is confirmed by analyzing the preliminary results obtained using the MCMT tool on the verification of data-aware process benchmarks. These benchmarks can be found in the latest version of the tool distribution.

Submitted 17 June, 2019; v1 submitted 25 June, 2018; originally announced June 2018.

arXiv:1806.05918 [pdf, ps, other]

Efficient Handling of SPARQL OPTIONAL for OBDA (Extended Version)

Authors: Guohui Xiao, Roman Kontchakov, Benjamin Cogrel, Diego Calvanese, Elena Botoeva

Abstract: OPTIONAL is a key feature in SPARQL for dealing with missing information. While this operator is used extensively, it is also known for its complexity, which can make efficient evaluation of queries with OPTIONAL challenging. We tackle this problem in the Ontology-Based Data Access (OBDA) setting, where the data is stored in a SQL relational database and exposed as a virtual RDF graph by means of an R2RML mapping. We start with a succinct translation of a SPARQL fragment into SQL. It fully respects bag semantics and three-valued logic and relies on the extensive use of the LEFT JOIN operator and COALESCE function. We then propose optimisation techniques for reducing the size and improving the structure of generated SQL queries. Our optimisations capture interactions between JOIN, LEFT JOIN, COALESCE and integrity constraints such as attribute nullability, uniqueness and foreign key constraints. Finally, we empirically verify effectiveness of our techniques on the BSBM OBDA benchmark.

Submitted 18 June, 2018; v1 submitted 15 June, 2018; originally announced June 2018.

Comments: technical report for ISWC 2018 paper

arXiv:1707.06974 [pdf, other]

Cost-Driven Ontology-Based Data Access (Extended Version)

Authors: Davide Lanti, Guohui Xiao, Diego Calvanese

Abstract: In ontology-based data access (OBDA), users are provided with a conceptual view of a (relational) data source that abstracts away details about data storage. This conceptual view is realized through an ontology that is connected to the data source through declarative mappings, and query answering is carried out by translating the user queries over the conceptual view into SQL queries over the data source. Standard translation techniques in OBDA try to transform the user query into a union of conjunctive queries (UCQ), following the heuristic argument that UCQs can be efficiently evaluated by modern relational database engines. In this work, we show that translating to UCQs is not always the best choice, and that, under certain conditions on the interplay between the ontology, the mappings, and the statistics of the data, alternative translations can be evaluated much more efficiently. To find the best translation, we devise a cost model together with a novel cardinality estimation that takes into account all such OBDA components. Our experiments confirm that (i) alternatives to the UCQ translation might produce queries that are orders of magnitude more efficient, and (ii) the cost model we propose is faithful to the actual query evaluation cost, and hence is well suited to select the best translation.

Submitted 2 February, 2018; v1 submitted 21 July, 2017; originally announced July 2017.

Comments: Extended version of the ISWC 17 paper "Cost-Driven Ontology-Based Data Access"

arXiv:1701.09007 [pdf, other]

Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)

Authors: Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, Ke Yi

Abstract: In April 2016, a community of researchers working in the area of Principles of Data Management (PDM) joined in a workshop at the Dagstuhl Castle in Germany. The workshop was organized jointly by the Executive Committee of the ACM Symposium on Principles of Database Systems (PODS) and the Council of the International Conference on Database Theory (ICDT). The mission of this workshop was to identify and explore some of the most important research directions that have high relevance to society and to Computer Science today, and where the PDM community has the potential to make significant contributions. This report describes the family of research directions that the workshop focused on from three perspectives: potential practical relevance, results already obtained, and research questions that appear surmountable in the short and medium term.

Submitted 31 January, 2017; originally announced January 2017.

arXiv:1701.00976 [pdf, ps, other]

Metric Temporal Logic for Ontology-Based Data Access over Log Data

Authors: Diego Calvanese, Elem Güzel Kalaycı, Vladislav Ryzhikov, Guohui Xiao, Michael Zakharyaschev

Abstract: We present a new metric temporal logic HornMTL over dense time and its datalog extension datalogMTL. The use of datalogMTL is demonstrated in the context of ontology-based data access over meteorological data. We show decidability of answering ontology-mediated queries for a practically relevant non-recursive fragment of datalogMTL. Finally, we discuss directions for future work, including potential use cases in analyzing log data of engines and devices.

Submitted 4 January, 2017; originally announced January 2017.

ACM Class: I.2.4

Journal ref: In Proceedings of the 2nd International Workshop on Ontologies and Logic Programming for Query Answering (ONTOLP-16), 2016

arXiv:1607.06343 [pdf, ps, other]

Data Scaling in OBDA Benchmarks: The VIG Approach

Authors: Davide Lanti, Guohui Xiao, Diego Calvanese

Abstract: In this paper we describe VIG, a data scaler for benchmarks in the context of ontology-based data access (OBDA). Data scaling is a relatively recent approach, proposed in the database community, that allows for quickly scaling up an input data instance to s times its size, while preserving certain application-specific characteristics. The advantage of the approach is that the user is not required to manually input the characteristics of the data to be produced, making it particularly suitable for OBDA benchmarks, where the complexity of database schemas might pose a challenge for manual input (e.g., the NPD benchmark contains 70 tables with some containing more than 60 columns). As opposed to a traditional data scaler, VIG includes domain information provided by the OBDA mappings and the ontology in order to produce data. VIG is currently used in the NPD benchmark, but it is not NPD-specific and can be seeded with any data instance. The distinguishing features of VIG are (1) its simple and clear generation strategy; (2) its efficiency, as each value is generated in constant time, without accesses to the disk or to RAM to retrieve previously generated values; and (3) its generality, as the data is exported in CSV files that can be easily imported by any RDBMS. VIG is a Java implementation licensed under Apache 2.0, and its source code is available on GitHub (https://github.com/ontop/vig) in the form of a Maven project. The code has been maintained for two years by the -ontop- team at the Free University of Bozen-Bolzano.

Submitted 29 July, 2016; v1 submitted 21 July, 2016; originally announced July 2016.

Comments: Typo 1: ShallowWellbore -> DevelopmentWellbore Typo 2: f(fname) [page 4] -> f(fid)

arXiv:1603.09291 [pdf, other]

Expressivity and Complexity of MongoDB (Extended Version)

Authors: Elena Botoeva, Diego Calvanese, Benjamin Cogrel, Guohui Xiao

Abstract: A significant number of novel database architectures and data models have been proposed during the last decade. While some of these new systems have gained in popularity, they lack a proper formalization, and a precise understanding of the expressivity and the computational properties of the associated query languages. In this paper, we aim at filling this gap, and we do so by considering MongoDB, a widely adopted document database managing complex (tree structured) values represented in a JSON-based data model, equipped with a powerful query mechanism. We provide a formalization of the MongoDB data model, and of a core fragment, called MQuery, of the MongoDB query language. We study the expressivity of MQuery, showing its equivalence with nested relational algebra. We further investigate the computational complexity of significant fragments of it, obtaining several (tight) bounds in combined complexity, which range from LOGSPACE to alternating exponential-time with a polynomial number of alternations. As a consequence, we obtain also a characterization of the combined complexity of nested relational algebra query evaluation.

Submitted 25 April, 2017; v1 submitted 30 March, 2016; originally announced March 2016.

ACM Class: H.2.1; H.2.3

arXiv:1603.07466 [pdf, other]

Semantics and Analysis of DMN Decision Tables

Authors: Diego Calvanese, Marlon Dumas, Ülari Laurson, Fabrizio M. Maggi, Marco Montali, Irene Teinemaa

Abstract: The Decision Model and Notation (DMN) is a standard notation to capture decision logic in business applications in general and business processes in particular. A central construct in DMN is that of a decision table. The increasing use of DMN decision tables to capture critical business knowledge raises the need to support analysis tasks on these tables such as correctness and completeness checking. This paper provides a formal semantics for DMN tables, a formal definition of key analysis tasks and scalable algorithms to tackle two such tasks, i.e., detection of overlapping rules and of missing rules. The algorithms are based on a geometric interpretation of decision tables that can be used to support other analysis tasks by tapping into geometric algorithms. The algorithms have been implemented in an open-source DMN editor and tested on large decision tables derived from a credit lending dataset.

Submitted 24 March, 2016; originally announced March 2016.

Comments: Submitted to the International Conference on Business Process Management (BPM 2016)

ACM Class: D.2.2; D.2.4

arXiv:1511.08412 [pdf, ps, other]

Beyond OWL 2 QL in OBDA: Rewritings and Approximations (Extended Version)

Authors: Elena Botoeva, Diego Calvanese, Valerio Santarelli, Domenico Fabio Savo, Alessandro Solimando, Guohui Xiao

Abstract: Ontology-based data access (OBDA) is a novel paradigm facilitating access to relational data, realized by linking data sources to an ontology by means of declarative mappings. DL-Lite_R, which is the logic underpinning the W3C ontology language OWL 2 QL and the current language of choice for OBDA, has been designed with the goal of delegating query answering to the underlying database engine, and thus is restricted in expressive power. E.g., it does not allow one to express disjunctive information, and any form of recursion on the data. The aim of this paper is to overcome these limitations of DL-Lite_R, and extend OBDA to more expressive ontology languages, while still leveraging the underlying relational technology for query answering. We achieve this by relying on two well-known mechanisms, namely conservative rewriting and approximation, but significantly extend their practical impact by bringing into the picture the mapping, an essential component of OBDA. Specifically, we develop techniques to rewrite OBDA specifications with an expressive ontology to "equivalent" ones with a DL-Lite_R ontology, if possible, and to approximate them otherwise. We do so by exploiting the high expressive power of the mapping layer to capture part of the domain semantics of rich ontology languages. We have implemented our techniques in the prototype system OntoProx, making use of the state-of-the-art OBDA system Ontop and the query answering system Clipper, and we have shown their feasibility and effectiveness with experiments on synthetic and real-world data.

Submitted 1 December, 2015; v1 submitted 26 November, 2015; originally announced November 2015.

Comments: The extended version of the AAAI 2016 paper "Beyond OWL 2 QL in OBDA: Rewritings and Approximations" by Elena Botoeva, Diego Calvanese, Valerio Santarelli, Domenico Fabio Savo, Alessandro Solimando, and Guohui Xiao

arXiv:1509.08979 [pdf, ps, other]

Fixpoint Node Selection Query Languages for Trees

Authors: Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, Moshe Y. Vardi

Abstract: The study of node selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On one hand, there has been an extensive study of XPath and its various extensions. On the other hand, query languages based on classical logics, such as first-order logic (FO) or Monadic Second-Order Logic (MSO), have been considered. Results in this area typically relate an XPath-based language to a classical logic. What has yet to emerge is an XPath-related language that is as expressive as MSO, and at the same time enjoys the computational properties of XPath, which are linear time query evaluation and exponential time query-containment test. In this paper we propose muXPath, which is the alternation-free fragment of XPath extended with fixpoint operators. Using two-way alternating automata, we show that this language does combine desired expressiveness and computational properties, placing it as an attractive candidate for the definite node-selection query language for trees.

Submitted 14 November, 2018; v1 submitted 29 September, 2015; originally announced September 2015.

arXiv:1504.08108 [pdf, other]

Verification of Generalized Inconsistency-Aware Knowledge and Action Bases (Extended Version)

Authors: Diego Calvanese, Marco Montali, Ario Santoso

Abstract: Knowledge and Action Bases (KABs) have been put forward as a semantically rich representation of a domain, using a DL KB to account for its static aspects, and actions to evolve its extensional part over time, possibly introducing new objects. Recently, KABs have been extended to manage inconsistency, with ad-hoc verification techniques geared towards specific semantics. This work provides a twofold contribution along this line of research. On the one hand, we enrich KABs with a high-level, compact action language inspired by Golog, obtaining so-called Golog-KABs (GKABs). On the other hand, we introduce a parametric execution semantics for GKABs, so as to elegantly accommodate a plethora of inconsistency-aware semantics based on the notion of repair. We then provide several reductions for the verification of sophisticated first-order temporal properties over inconsistency-aware GKABs, and show that it can be addressed using known techniques, developed for standard KABs.

Submitted 4 June, 2015; v1 submitted 30 April, 2015; originally announced April 2015.

arXiv:1412.7965 [pdf, other]

Adding Context to Knowledge and Action Bases

Authors: Diego Calvanese, İsmail İlkan Ceylan, Marco Montali, Ario Santoso

Abstract: Knowledge and Action Bases (KABs) have been recently proposed as a formal framework to capture the dynamics of systems which manipulate Description Logic (DL) Knowledge Bases (KBs) through action execution. In this work, we enrich the KAB setting with contextual information, making use of different context dimensions. On the one hand, context is determined by the environment using context-changing actions that make use of the current state of the KB and the current context. On the other hand, it affects the set of TBox assertions that are relevant at each time point, and that have to be considered when processing queries posed over the KAB. Here we extend to our enriched setting the results on verification of rich temporal properties expressed in mu-calculus, which had been established for standard KABs. Specifically, we show that under a run-boundedness condition, verification stays decidable.

Submitted 26 December, 2014; originally announced December 2014.

Comments: ARCOE-Logic 2014 Workshop Notes, pp. 25-36

arXiv:1411.4516 [pdf, ps, other]

Verification of Relational Multiagent Systems with Data Types (Extended Version)

Authors: Diego Calvanese, Giorgio Delzanno, Marco Montali

Abstract: We study the extension of relational multiagent systems (RMASs), where agents manipulate full-fledged relational databases, with data types and facets equipped with domain-specific, rigid relations (such as total orders). Specifically, we focus on design-time verification of RMASs against rich first-order temporal properties expressed in a variant of first-order mu-calculus with quantification across states. We build on previous decidability results under the "state-bounded" assumption, i.e., in each single state only a bounded number of data objects is stored in the agent databases, while unboundedly many can be encountered over time. We recast this condition, showing decidability in presence of dense, linear orders, and facets defined on top of them. Our approach is based on the construction of a finite-state, sound and complete abstraction of the original system, in which dense linear orders are reformulated as non-rigid relations working on the active domain of the system only. We also show undecidability when including a data type equipped with the successor relation.

Submitted 17 November, 2014; originally announced November 2014.

arXiv:1408.5094 [pdf, other]

doi 10.1145/2661829.2662050

Verifiable UML Artifact-Centric Business Process Models (Extended Version)

Authors: Diego Calvanese, Marco Montali, Montserrat Estanol, Ernest Teniente

Abstract: Artifact-centric business process models have gained increasing momentum recently due to their ability to combine structural (i.e., data related) with dynamical (i.e., process related) aspects. In particular, two main lines of research have been pursued so far: one tailored to business artifact modeling languages and methodologies, the other focused on the foundations for their formal verification. In this paper, we merge these two lines of research, by showing how recent theoretical decidability results for verification can be fruitfully transferred to a concrete UML-based modeling methodology. In particular, we identify additional steps in the methodology that, in significant cases, guarantee the possibility of verifying the resulting models against rich first-order temporal properties. Notably, our results can be seamlessly transferred to different languages for the specification of the artifact lifecycles.

Submitted 26 August, 2014; v1 submitted 21 August, 2014; originally announced August 2014.

Comments: Extended version of "Verifiable UML Artifact-Centric Business Process Models" - to appear in the Proceedings of CIKM 2014

arXiv:1404.4274 [pdf, ps, other]

Managing Change in Graph-structured Data Using Description Logics (long version with appendix)

Authors: Shqiponja Ahmetaj, Diego Calvanese, Magdalena Ortiz, Mantas Simkus

Abstract: In this paper, we consider the setting of graph-structured data that evolves as a result of operations carried out by users or applications. We study different reasoning problems, which range from ensuring the satisfaction of a given set of integrity constraints after a given sequence of updates, to deciding the (non-)existence of a sequence of actions that would take the data to an (un)desirable state, starting either from a specific data instance or from an incomplete description of it. We consider an action language in which actions are finite sequences of conditional insertions and deletions of nodes and labels, and use Description Logics for describing integrity constraints and (partial) states of the data. We then formalize the above data management problems as a static verification problem and several planning problems. We provide algorithms and tight complexity bounds for the formalized problems, both for an expressive DL and for a variant of DL-Lite.

Submitted 29 May, 2014; v1 submitted 16 April, 2014; originally announced April 2014.

arXiv:1403.7248 [pdf]

Updating RDFS ABoxes and TBoxes in SPARQL

Authors: Albin Ahmeti, Diego Calvanese, Axel Polleres

Abstract: Updates in RDF stores have recently been standardised in the SPARQL 1.1 Update specification. However, computing answers entailed by ontologies in triple stores is usually treated orthogonal to updates. Even the W3C's recent SPARQL 1.1 Update language and SPARQL 1.1 Entailment Regimes specifications explicitly exclude a standard behaviour how SPARQL endpoints should treat entailment regimes other than simple entailment in the context of updates. In this paper, we take a first step to close this gap. We define a fragment of SPARQL basic graph patterns corresponding to (the RDFS fragment of) DL-Lite and the corresponding SPARQL update language, dealing with updates both of ABox and of TBox statements. We discuss possible semantics along with potential strategies for implementing them. We treat both, (i) materialised RDF stores, which store all entailed triples explicitly, and (ii) reduced RDF Stores, that is, redundancy-free RDF stores that do not store any RDF triples (corresponding to DL-Lite ABox statements) entailed by others already.

Submitted 27 March, 2014; originally announced March 2014.

arXiv:1402.7122 [pdf, other]

Nested Regular Path Queries in Description Logics

Authors: Meghyn Bienvenu, Diego Calvanese, Magdalena Ortiz, Mantas Simkus

Abstract: Two-way regular path queries (2RPQs) have received increased attention recently due to their ability to relate pairs of objects by flexibly navigating graph-structured data. They are present in property paths in SPARQL 1.1, the new standard RDF query language, and in the XML query language XPath. In line with XPath, we consider the extension of 2RPQs with nesting, which allows one to require that objects along a path satisfy complex conditions, in turn expressed through (nested) 2RPQs. We study the computational complexity of answering nested 2RPQs and conjunctions thereof (CN2RPQs) in the presence of domain knowledge expressed in description logics (DLs). We establish tight complexity bounds in data and combined complexity for a variety of DLs, ranging from lightweight DLs (DL-Lite, EL) up to highly expressive ones. Interestingly, we are able to show that adding nesting to (C)2RPQs does not affect worst-case data complexity of query answering for any of the considered DLs. However, in the case of lightweight DLs, adding nesting to 2RPQs leads to a surprising jump in combined complexity, from P-complete to Exp-complete.

Submitted 4 March, 2014; v1 submitted 27 February, 2014; originally announced February 2014.

Comments: added Figure 1

arXiv:1402.0575 [pdf]

doi 10.1613/jair.3870

Reasoning about Explanations for Negative Query Answers in DL-Lite

Authors: Diego Calvanese, Magdalena Ortiz, Mantas Simkus, Giorgio Stefanoni

Abstract: In order to meet usability requirements, most logic-based applications provide explanation facilities for reasoning services. This holds also for Description Logics, where research has focused on the explanation of both TBox reasoning and, more recently, query answering. Besides explaining the presence of a tuple in a query answer, it is important to explain also why a given tuple is missing. We address the latter problem for instance and conjunctive query answering over DL-Lite ontologies by adopting abductive reasoning; that is, we look for additions to the ABox that force a given tuple to be in the result. As reasoning tasks we consider existence and recognition of an explanation, and relevance and necessity of a given assertion for an explanation. We characterize the computational complexity of these problems for arbitrary, subset minimal, and cardinality minimal explanations.

Submitted 3 February, 2014; originally announced February 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 48, pages 635-669, 2013

arXiv:1402.0569 [pdf]

doi 10.1613/jair.3826

Description Logic Knowledge and Action Bases

Authors: Babak Bagheri Hariri, Diego Calvanese, Marco Montali, Giuseppe De Giacomo, Riccardo De Masellis, Paolo Felli

Abstract: Description logic Knowledge and Action Bases (KAB) are a mechanism for providing both a semantically rich representation of the information on the domain of interest in terms of a description logic knowledge base and actions to change such information over time, possibly introducing new objects. We resort to a variant of DL-Lite where the unique name assumption is not enforced and where equality between objects may be asserted and inferred. Actions are specified as sets of conditional effects, where conditions are based on epistemic queries over the knowledge base (TBox and ABox), and effects are expressed in terms of new ABoxes. In this setting, we address verification of temporal properties expressed in a variant of first-order mu-calculus with quantification across states. Notably, we show decidability of verification, under a suitable restriction inspired by the notion of weak acyclicity in data exchange.

Submitted 3 February, 2014; originally announced February 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 46, pages 651-686, 2013

arXiv:1401.3487 [pdf]

doi 10.1613/jair.2820

The DL-Lite Family and Relations

Authors: Alessandro Artale, Diego Calvanese, Roman Kontchakov, Michael Zakharyaschev

Abstract: The recently introduced series of description logics under the common moniker DL-Lite has attracted attention of the description logic and semantic web communities due to the low computational complexity of inference, on the one hand, and the ability to represent conceptual modeling formalisms, on the other. The main aim of this article is to carry out a thorough and systematic investigation of inference in extensions of the original DL-Lite logics along five axes: by (i) adding the Boolean connectives and (ii) number restrictions to concept constructs, (iii) allowing role hierarchies, (iv) allowing role disjointness, symmetry, asymmetry, reflexivity, irreflexivity and transitivity constraints, and (v) adopting or dropping the unique name assumption. We analyze the combined complexity of satisfiability for the resulting logics, as well as the data complexity of instance checking and answering positive existential queries. Our approach is based on embedding DL-Lite logics in suitable fragments of the one-variable first-order logic, which provides useful insights into their properties and, in particular, computational behavior.

Submitted 15 January, 2014; originally announced January 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 36, pages 1-69, 2009

arXiv:1312.6624 [pdf, ps, other]

Shape and Content: Incorporating Domain Knowledge into Shape Analysis

Authors: Diego Calvanese, Tomer Kotek, Mantas Šimkus, Helmut Veith, Florian Zuleger

Abstract: The verification community has studied dynamic data structures primarily in a bottom-up way by analyzing pointers and the shapes induced by them. Recent work in fields such as separation logic has made significant progress in extracting shapes from program source code. Many real-world programs, however, manipulate complex data whose structure and content are most naturally described by formalisms from object-oriented programming and databases. In this paper, we look at the verification of programs with dynamic data structures from the perspective of content representation. Our approach is based on description logic, a widely used knowledge representation paradigm which gives a logical underpinning for diverse modeling frameworks such as UML and ER. Technically, we assume that we have separation logic shape invariants obtained from a shape analysis tool, and requirements on the program data in terms of description logic. We show that the two-variable fragment of first-order logic with counting and trees (whose decidability was proved at LICS 2013) can be used as a joint framework to embed suitable fragments of description logic and separation logic.

Submitted 9 July, 2014; v1 submitted 23 December, 2013; originally announced December 2013.

arXiv:1308.6292 [pdf, other]

Verification of Semantically-Enhanced Artifact Systems (Extended Version)

Authors: Babak Bagheri Hariri, Diego Calvanese, Marco Montali, Ario Santoso, Dmitry Solomakhin

Abstract: Artifact-centric systems have emerged in recent years as a suitable framework to model business-relevant entities, by combining their static and dynamic aspects. In particular, the Guard-Stage-Milestone (GSM) approach has been recently proposed to model artifacts and their lifecycle in a declarative way. In this paper, we enhance GSM with a Semantic Layer, constituted by a full-fledged OWL 2 QL ontology linked to the artifact information models through mapping specifications. The ontology provides a conceptual view of the domain under study, and allows one to understand the evolution of the artifact system at a higher level of abstraction. In this setting, we present a technique to specify temporal properties expressed over the Semantic Layer, and verify them according to the evolution in the underlying GSM model. This technique has been implemented in a tool that exploits state-of-the-art ontology-based data access technologies to manipulate the temporal properties according to the ontology and the mappings, and that relies on the GSMC model checker for verification.

Submitted 28 August, 2013; originally announced August 2013.

arXiv:1304.6442 [pdf, ps, other]

Verification of Inconsistency-Aware Knowledge and Action Bases (Extended Version)

Authors: Diego Calvanese, Evgeny Kharlamov, Marco Montali, Ario Santoso, Dmitriy Zheleznyakov

Abstract: Description Logic Knowledge and Action Bases (KABs) have been recently introduced as a mechanism that provides a semantically rich representation of the information on the domain of interest in terms of a DL KB and a set of actions to change such information over time, possibly introducing new objects. In this setting, decidability of verification of sophisticated temporal properties over KABs, expressed in a variant of first-order mu-calculus, has been shown. However, the established framework treats inconsistency in a simplistic way, by rejecting inconsistent states produced through action execution. We address this problem by showing how inconsistency handling based on the notion of repairs can be integrated into KABs, resorting to inconsistency-tolerant semantics. In this setting, we establish decidability and complexity of verification.

Submitted 23 April, 2013; originally announced April 2013.

arXiv:1304.5810 [pdf, ps, other]

Exchanging OWL 2 QL Knowledge Bases

Authors: Marcelo Arenas, Elena Botoeva, Diego Calvanese, Vladislav Ryzhikov

Abstract: Knowledge base exchange is an important problem in the area of data exchange and knowledge representation, where one is interested in exchanging information between a source and a target knowledge base connected through a mapping. In this paper, we study this fundamental problem for knowledge bases and mappings expressed in OWL 2 QL, the profile of OWL 2 based on the description logic DL-Lite_R. More specifically, we consider the problem of computing universal solutions, identified as one of the most desirable translations to be materialized, and the problem of computing UCQ-representations, which optimally capture in a target TBox the information that can be extracted from a source TBox and a mapping by means of unions of conjunctive queries. For the former we provide a novel automata-theoretic technique, and complexity results that range from NP to EXPTIME, while for the latter we show NLOGSPACE-completeness.

Submitted 1 July, 2013; v1 submitted 21 April, 2013; originally announced April 2013.

arXiv:1203.0024 [pdf, ps, other]

Verification of Relational Data-Centric Dynamic Systems with External Services

Authors: Babak Bagheri Hariri, Diego Calvanese, Giuseppe De Giacomo, Alin Deutsch, Marco Montali

Abstract: Data-centric dynamic systems are systems where both the process controlling the dynamics and the manipulation of data are equally central. In this paper we study verification of (first-order) mu-calculus variants over relational data-centric dynamic systems, where data are represented by a full-fledged relational database, and the process is described in terms of atomic actions that evolve the database. The execution of such actions may involve calls to external services, providing fresh data inserted into the system. As a result, such systems are typically infinite-state. We show that verification is undecidable in general, and we isolate notable cases where decidability is achieved. Specifically, we start by considering service calls that return values deterministically (depending only on passed parameters). We show that in a mu-calculus variant that preserves knowledge of objects that appeared along a run, we get decidability under the assumption that the fresh data introduced along a run are bounded, though they might not be bounded in the overall system. In fact, we tie such a result to a notion related to weak acyclicity studied in data exchange. Then, we move to nondeterministic services, where the assumption of data-bounded runs would result in a bound on the service calls that can be invoked during the execution and hence would be too restrictive. So we investigate decidability under the assumption that knowledge of objects is preserved only if they are continuously present. We show that if infinitely many values occur in a run but do not accumulate in the same state, then we again get decidability. We give syntactic conditions to avoid this accumulation through the novel notion of "generate-recall acyclicity", which takes into consideration that every service call activation generates new values that cannot be accumulated indefinitely.

Submitted 29 February, 2012; originally announced March 2012.

arXiv:1105.5452 [pdf, ps]

doi 10.1613/jair.548

Unifying Class-Based Representation Formalisms

Authors: D. Calvanese, M. Lenzerini, D. Nardi

Abstract: The notion of class is ubiquitous in computer science and is central in many formalisms for the representation of structured knowledge used both in knowledge representation and in databases. In this paper we study the basic issues underlying such representation formalisms and single out both their common characteristics and their distinguishing features. Such investigation leads us to propose a unifying framework in which we are able to capture the fundamental aspects of several representation languages used in different contexts. The proposed formalism is expressed in the style of description logics, which have been introduced in knowledge representation as a means to provide a semantically well-founded basis for the structural aspects of knowledge representation systems. The description logic considered in this paper is a subset of first-order logic with nice computational characteristics. It is quite expressive and features a novel combination of constructs that has not been studied before. The distinguishing constructs are number restrictions, which generalize existence and functional dependencies, inverse roles, which allow one to refer to the inverse of a relationship, and possibly cyclic assertions, which are necessary for capturing real-world domains. We are able to show that it is precisely such combination of constructs that makes our logic powerful enough to model the essential set of features for defining class structures that are common to frame systems, object-oriented database languages, and semantic data models. As a consequence of the established correspondences, several significant extensions of each of the above formalisms become available. The high expressiveness of the logic we propose and the need for capturing the reasoning in different contexts force us to distinguish between unrestricted and finite model reasoning. A notable feature of our proposal is that reasoning in both cases is decidable. We argue that, by virtue of the high expressive power and of the associated reasoning capabilities on both unrestricted and finite models, our logic provides a common core for class-based representation formalisms.

Submitted 26 May, 2011; originally announced May 2011.

Journal ref: Journal Of Artificial Intelligence Research, Volume 11, pages 199-240, 1999

arXiv:1003.1179 [pdf, other]

View Synthesis from Schema Mappings

Authors: Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, Moshe Y. Vardi

Abstract: In data management, and in particular in data integration, data exchange, query optimization, and data privacy, the notion of view plays a central role. In several contexts, such as data integration, data mashups, and data warehousing, the need arises to design views starting from a set of known correspondences between queries over different schemas. In this paper we deal with the issue of automating such a design process. We call this novel problem "view synthesis from schema mappings": given a set of schema mappings, each relating a query over a source schema to a query over a target schema, automatically synthesize for each source a view over the target schema in such a way that for each mapping, the query over the source is a rewriting of the query over the target with respect to the synthesized views. We study view synthesis from schema mappings both in the relational setting, where queries and views are (unions of) conjunctive queries, and in the semistructured data setting, where queries and views are (two-way) regular path queries, as well as unions of conjunctions thereof. We provide techniques and complexity upper bounds for each of these cases.

Submitted 4 March, 2010; originally announced March 2010.

arXiv:cs/0507067 [pdf, ps, other]

Conjunctive Query Containment and Answering under Description Logics Constraints

Authors: Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini

Abstract: Query containment and query answering are two important computational tasks in databases. While query answering amounts to computing the result of a query over a database, query containment is the problem of checking whether, for every database, the result of one query is a subset of the result of another query. In this paper, we deal with unions of conjunctive queries, and we address query containment and query answering under Description Logic constraints. Every such constraint is essentially an inclusion dependency between concepts and relations, and the expressive power of these constraints is due to the possibility of using complex expressions, e.g., intersection and difference of relations, special forms of quantification, regular expressions over binary relations, in the specification of the dependencies. These types of constraints capture a great variety of data models, including the relational, the entity-relationship, and the object-oriented model, all extended with various forms of constraints, and also the basic features of the ontology languages used in the context of the Semantic Web. We present the following results on both query containment and query answering. We provide a method for query containment under Description Logic constraints, thus showing that the problem is decidable, and analyze its computational complexity. We prove that query containment is undecidable in the case where we allow inequalities in the right-hand side query, even for very simple constraints and queries. We show that query answering under Description Logic constraints can be reduced to query containment, and illustrate how such a reduction provides upper bound results with respect to both combined and data complexity.

Submitted 28 July, 2005; originally announced July 2005.

ACM Class: I.2.4; F.4.1

arXiv:cs/0507059 [pdf, ps, other]

Data complexity of answering conjunctive queries over SHIQ knowledge bases

Authors: M. Magdalena Ortiz de la Fuente, Diego Calvanese, Thomas Eiter, Enrico Franconi

Abstract: An algorithm for answering conjunctive queries over SHIQ knowledge bases that is coNP in data complexity is given. The algorithm is based on the tableau algorithm for reasoning with individuals in SHIQ. The blocking conditions of the tableau are weakened in such a way that the set of models the modified algorithm yields suffices to check query entailment. The modified blocking conditions are based on the ones proposed by Levy and Rousset for reasoning with Horn Rules in the description logic ALCNR.

Submitted 22 July, 2005; originally announced July 2005.

Comments: Technical Report, 22 pages

Showing 1–44 of 44 results for author: Calvanese, D