Search | arXiv e-print repository

Reasoning Capabilities and Invariability of Large Language Models

Authors: Alessandro Raganato, Rafael Peñaloza, Marco Viviani, Gabriella Pasi

Abstract: Large Language Models (LLMs) have shown remarkable capabilities in manipulating natural language across multiple applications, but their ability to handle simple reasoning tasks is often questioned. In this work, we aim to provide a comprehensive analysis of LLMs' reasoning competence, specifically focusing on their prompt dependency. In particular, we introduce a new benchmark dataset with a series of simple reasoning questions demanding shallow logical reasoning. Aligned with cognitive psychology standards, the questions are confined to a basic domain revolving around geometric figures, ensuring that responses are independent of any pre-existing intuition about the world and rely solely on deduction. An empirical analysis involving zero-shot and few-shot prompting across 24 LLMs of different sizes reveals that, while LLMs with over 70 billion parameters perform better in the zero-shot setting, there is still a large room for improvement. An additional test with chain-of-thought prompting over 22 LLMs shows that this additional prompt can aid or damage the performance of models, depending on whether the rationale is required before or after the answer.

Submitted 1 May, 2025; originally announced May 2025.

Comments: Accepted for publication in the Proceedings of the 23rd IEEE/WIC International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT 2024)

arXiv:2501.12728 [pdf, other]

A Call for Critically Rethinking and Reforming Data Analysis in Empirical Software Engineering

Authors: Matteo Esposito, Mikel Robredo, Murali Sridharan, Guilherme Horta Travassos, Rafael Peñaloza, Valentina Lenarduzzi

Abstract: Context: Empirical Software Engineering (ESE) drives innovation in SE through qualitative and quantitative studies. However, concerns about the correct application of empirical methodologies have existed since the 2006 Dagstuhl seminar on SE. Objective: To analyze three decades of SE research, identify mistakes in statistical methods, and evaluate experts' ability to detect and address these issues. Methods: We conducted a literature survey of ~27,000 empirical studies, using LLMs to classify statistical methodologies as adequate or inadequate. Additionally, we selected 30 primary studies and held a workshop with 33 ESE experts to assess their ability to identify and resolve statistical issues. Results: Significant statistical issues were found in the primary studies, and experts showed limited ability to detect and correct these methodological problems, raising concerns about the broader ESE community's proficiency in this area. Conclusions: Despite our study's eventual limitations, its results shed light on recurring issues from promoting information copy-and-paste from past authors' works and the continuous publication of inadequate approaches that promote dubious results and jeopardize the spread of the correct statistical strategies among researchers. Besides, it justifies further investigation into empirical rigor in software engineering to expose these recurring issues and establish a framework for reassessing our field's foundation of statistical methodology application.
Therefore, this work calls for critically rethinking and reforming data analysis in empirical software engineering, paving the way for our work soon.

Submitted 22 January, 2025; originally announced January 2025.

arXiv:2409.09485 [pdf, other]

Enumerating Minimal Unsatisfiable Cores of LTLf formulas

Authors: Antonio Ielo, Giuseppe Mazzotta, Rafael Peñaloza, Francesco Ricca

Abstract: Linear Temporal Logic over finite traces ($\text{LTL}_f$) is a widely used formalism with applications in AI, process mining, model checking, and more. The primary reasoning task for $\text{LTL}_f$ is satisfiability checking; yet, the recent focus on explainable AI has increased interest in analyzing inconsistent formulas, making the enumeration of minimal explanations for infeasibility a relevant task also for $\text{LTL}_f$. This paper introduces a novel technique for enumerating minimal unsatisfiable cores (MUCs) of an $\text{LTL}_f$ specification. The main idea is to encode an $\text{LTL}_f$ formula into an Answer Set Programming (ASP) specification, such that the minimal unsatisfiable subsets (MUSes) of the ASP program directly correspond to the MUCs of the original $\text{LTL}_f$ specification. Leveraging recent advancements in ASP solving yields a MUC enumerator achieving good performance in experiments conducted on established benchmarks from the literature.

Submitted 14 September, 2024; originally announced September 2024.

arXiv:2408.08095 [pdf, other]

Evaluating Time-Dependent Methods and Seasonal Effects in Code Technical Debt Prediction

Authors: Mikel Robredo, Nyyti Saarimaki, Davide Taibi, Rafael Penaloza, Valentina Lenarduzzi

Abstract: Code Technical Debt prediction has become a popular research niche in recent software engineering literature. Technical Debt is an important metric in software projects as it measures professionals' effort to clean the code. Therefore, predicting its future behavior becomes a crucial task. However, no well-defined and consistent approach can completely capture the features that impact the evolution of Code Technical Debt. The goal of this study is to evaluate the impact of considering time-dependent techniques as well as seasonal effects in temporal data in the prediction performance within the context of Code Technical Debt. The study adopts existing, yet not extensively adopted, time-dependent prediction techniques and compares their prediction performance to commonly used Machine Learning models. Further, the study strengthens the evaluation of time-dependent methods by extending the analysis to capture the impact of seasonality in Code Technical Debt data. We trained 11 prediction models using the commit history of 31 open-source projects developed with Java. We predicted the future observations of the SQALE index to evaluate their predictive performance. Our study confirms the positive impact of considering time-dependent techniques. The adopted multivariate time series analysis model ARIMAX outperformed the rest of the adopted models. Incorporating seasonal effects led to an enhancement in the predictive performance of the adopted time-dependent techniques. However, the impact of this effect was found to be relatively modest.
The findings of this study corroborate our position in favor of implementing techniques that capture the existing time dependence within historical data of software metrics, specifically in the context of this study, namely, Code Technical Debt. This necessitates the utilization of techniques that can effectively address this evidence.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2407.18169 [pdf, other]

In Search of Metrics to Guide Developer-Based Refactoring Recommendations

Authors: Mikel Robredo, Matteo Esposito, Fabio Palomba, Rafael Peñaloza, Valentina Lenarduzzi

Abstract: Context. Source code refactoring is a well-established approach to improving source code quality without compromising its external behavior. Motivation. The literature described the benefits of refactoring, yet its application in practice is threatened by the high cost of time, resource allocation, and effort required to perform it continuously. Providing refactoring recommendations closer to what developers perceive as relevant may support the broader application of refactoring in practice and drive prioritization efforts. Aim. In this paper, we aim to foster the design of a developer-based refactoring recommender, proposing an empirical study into the metrics that study the developer's willingness to apply refactoring operations. We build upon previous work describing the developer's motivations for refactoring and investigate how product and process metrics may grasp those motivations. Expected Results. We will quantify the value of product and process metrics in grasping developers' motivations to perform refactoring, thus providing a catalog of metrics for developer-based refactoring recommenders to use.

Submitted 25 July, 2024; originally announced July 2024.

arXiv:2407.14280 [pdf, other]

How to Blend Concepts in Diffusion Models

Authors: Lorenzo Olearo, Giorgio Longari, Simone Melzi, Alessandro Raganato, Rafael Peñaloza

Abstract: For the last decade, there has been a push to use multi-dimensional (latent) spaces to represent concepts; and yet how to manipulate these concepts or reason with them remains largely unclear. Some recent methods exploit multiple latent representations and their connection, making this research question even more entangled. Our goal is to understand how operations in the latent space affect the underlying concepts. To that end, we explore the task of concept blending through diffusion models. Diffusion models are based on a connection between a latent representation of textual prompts and a latent space that enables image reconstruction and generation. This task allows us to try different text-based combination strategies, and evaluate easily through a visual analysis. Our conclusion is that concept blending through space manipulation is possible, although the best strategy depends on the context of the blend.

Submitted 22 September, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

arXiv:2311.03114 [pdf, other]

Ignoring Time Dependence in Software Engineering Data. A Mistake

Authors: Mikel Robredo, Nyyti Saarimaki, Rafael Penaloza, Valentina Lenarduzzi

Abstract: Researchers often delve into the connections between different factors derived from the historical data of software projects. For example, scholars have devoted their endeavors to the exploration of associations among these factors. However, a significant portion of these studies has failed to consider the limitations posed by the temporal interdependencies among these variables and the potential risks associated with the use of statistical methods ill-suited for analyzing data with temporal connections. Our goal is to highlight the consequences of neglecting time dependence during data analysis in current research. We pinpoint certain potential problems that arise when disregarding the temporal aspect in the data, and support our argument with both theoretical and real examples.

Submitted 12 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

arXiv:2310.16472 [pdf, ps, other]

Semiring Provenance for Lightweight Description Logics

Authors: Camille Bourgaux, Ana Ozaki, Rafael Peñaloza

Abstract: We investigate semiring provenance--a successful framework originally defined in the relational database setting--for description logics. In this context, the ontology axioms are annotated with elements of a commutative semiring and these annotations are propagated to the ontology consequences in a way that reflects how they are derived. We define a provenance semantics for a language that encompasses several lightweight description logics and show its relationships with semantics that have been defined for ontologies annotated with a specific kind of annotation (such as fuzzy degrees). We show that under some restrictions on the semiring, the semantics satisfies desirable properties (such as extending the semiring provenance defined for databases). We then focus on the well-known why-provenance, for which we study the complexity of problems related to the provenance of an assertion or a conjunctive query answer. Finally, we consider two more restricted cases which correspond to the so-called positive Boolean provenance and lineage in the database setting. For these cases, we exhibit relationships with well-known notions related to explanations in description logics and complete our complexity analysis. As a side contribution, we provide conditions on an $\mathcal{ELHI}_\bot$ ontology that guarantee tractable reasoning.

Submitted 26 March, 2025; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: Paper currently under review. 133 pages

arXiv:2306.02036 [pdf, other]

On the Empirical Evidence of Microservice Logical Coupling. A Registered Report

Authors: Dario Amoroso d'Aragona, Luca Pascarella, Andrea Janes, Valentina Lenarduzzi, Rafael Penaloza, Davide Taibi

Abstract: [Context] Coupling is a widely discussed metric by software engineers while developing complex software systems, often referred to as a crucial factor and symptom of a poor or good design. Nevertheless, measuring the logical coupling among microservices and analyzing the interactions between services is non-trivial because it demands runtime information in the form of log files, which are not always accessible. [Objective and Method] In this work, we propose the design of a study aimed at empirically validating the Microservice Logical Coupling (MLC) metric presented in our previous study. In particular, we plan to empirically study Open Source Systems (OSS) built using a microservice architecture. [Results] The result of this work aims at corroborating the effectiveness and validity of the MLC metric. Thus, we will gather empirical evidence and develop a methodology to analyze and support the claims regarding the MLC metric. Furthermore, we establish its usefulness in evaluating and understanding the logical coupling among microservices.

Submitted 3 June, 2023; originally announced June 2023.

arXiv:2305.00760 [pdf, other]

Breaks and Code Quality: Investigating the Impact of Forgetting on Software Development. A Registered Report

Authors: Dario Amoroso d'Aragona, Luca Pascarella, Andrea Janes, Valentina Lenarduzzi, Rafael Penaloza, Davide Taibi

Abstract: Developers interrupting their participation in a project might slowly forget critical information about the code, such as its intended purpose, structure, the impact of external dependencies, and the approach used for implementation. Forgetting the implementation details can have detrimental effects on software maintenance, comprehension, knowledge sharing, and developer productivity, resulting in bugs and other issues that can negatively influence the software development process. Therefore, it is crucial to ensure that developers have a clear understanding of the codebase and can work efficiently and effectively even after long interruptions. This registered report proposes an empirical study aimed at investigating the impact of the developer's activity breaks duration and different code quality properties. In particular, we aim at understanding if the amount of activity in a project impacts the code quality, and if developers with different activity profiles show different impacts on code quality. The results might be useful to understand if it is beneficial to promote the practice of developing multiple projects in parallel, or if it is more beneficial to reduce the number of projects each developer contributes to.

Submitted 28 August, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2111.11779 [pdf, other]

Answering Fuzzy Queries over Fuzzy DL-Lite Ontologies

Authors: Gabriella Pasi, Rafael Peñaloza

Abstract: A prominent problem in knowledge representation is how to answer queries taking into account also the implicit consequences of an ontology representing domain knowledge. While this problem has been widely studied within the realm of description logic ontologies, it has been surprisingly neglected within the context of vague or imprecise knowledge, particularly from the point of view of mathematical fuzzy logic. In this paper we study the problem of answering conjunctive queries and threshold queries w.r.t. ontologies in fuzzy DL-Lite. Specifically, we show through a rewriting approach that threshold query answering w.r.t. consistent ontologies remains in $AC_0$ in data complexity, but that conjunctive query answering is highly dependent on the selected triangular norm, which has an impact on the underlying semantics. For the idempotent Gödel t-norm, we provide an effective method based on a reduction to the classical case. This paper is under consideration in Theory and Practice of Logic Programming (TPLP).

Submitted 23 November, 2021; originally announced November 2021.

Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)

arXiv:2109.11216 [pdf, other]

Union and Intersection of all Justifications

Authors: Jieying Chen, Yue Ma, Rafael Peñaloza, Hui Yang

Abstract: We present a new algorithm for computing the union and intersection of all justifications for a given ontological consequence without first computing the set of all justifications. Through an empirical evaluation, we show that our approach works well in practice for expressive DLs. In particular, the union of all justifications can be computed much faster than with existing justification-enumeration approaches. We further discuss how to use these results to repair ontologies efficiently.

Submitted 23 September, 2021; originally announced September 2021.

arXiv:2108.12774 [pdf, other]

An Upper Bound for Provenance in ELHr

Authors: Rafael Peñaloza

Abstract: We investigate the entailment problem in ELHr ontologies annotated with provenance information. In more detail, we show that subsumption entailment is in NP if provenance is represented with polynomials from the Trio semiring and in PTime if the semiring is not commutative. The proof is based on the construction of a weighted tree automaton which recognises a language that matches with the corresponding provenance polynomial.

Submitted 29 August, 2021; originally announced August 2021.

Comments: Full version of paper appearing in the Description Logic Workshop 2021

arXiv:2107.03997 [pdf, other]

Probabilistic Trace Alignment

Authors: Giacomo Bergami, Fabrizio Maria Maggi, Marco Montali, Rafael Peñaloza

Abstract: Alignments provide sophisticated diagnostics that pinpoint deviations in a trace with respect to a process model and their severity. However, approaches based on trace alignments use crisp process models as reference and recent probabilistic conformance checking approaches check the degree of conformance of an event log with respect to a stochastic process model instead of finding trace alignments. In this paper, for the first time, we provide a conformance checking approach based on trace alignments using stochastic Workflow nets. Conceptually, this requires handling the two possibly contrasting forces of the cost of the alignment on the one hand and the likelihood of the model trace with respect to which the alignment is computed on the other.

Submitted 8 July, 2021; originally announced July 2021.

arXiv:2009.13407 [pdf, other]

The Probabilistic Description Logic $\mathcal{BALC}$

Authors: Leonard Botha, Thomas Meyer, Rafael Peñaloza

Abstract: Description logics (DLs) are well-known knowledge representation formalisms focused on the representation of terminological knowledge. Due to their first-order semantics, these languages (in their classical form) are not suitable for representing and handling uncertainty. A probabilistic extension of a light-weight DL was recently proposed for dealing with certain knowledge occurring in uncertain contexts. In this paper, we continue that line of research by introducing the Bayesian extension $\mathcal{BALC}$ of the propositionally closed DL $\mathcal{ALC}$. We present a tableau-based procedure for deciding consistency, and adapt it to solve other probabilistic, contextual, and general inferences in this logic. We also show that all these problems remain ExpTime-complete, the same as reasoning in the underlying classical $\mathcal{ALC}$.

Submitted 28 September, 2020; originally announced September 2020.

Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)

arXiv:2007.00571 [pdf, other]

Reasoning with Contextual Knowledge and Influence Diagrams

Authors: Erman Acar, Rafael Peñaloza

Abstract: Influence diagrams (IDs) are well-known formalisms extending Bayesian networks to model decision situations under uncertainty. Although they are convenient as a decision theoretic tool, their knowledge representation ability is limited in capturing other crucial notions such as logical consistency. We complement IDs with the light-weight description logic (DL) EL to overcome such limitations. We consider a setup where DL axioms hold in some contexts, yet the actual context is uncertain. The framework benefits from the convenience of using DL as a domain knowledge representation language and the modelling strength of IDs to deal with decisions over contexts in the presence of contextual uncertainty. We define related reasoning problems and study their computational complexity.

Submitted 1 July, 2020; originally announced July 2020.

arXiv:2003.08298 [pdf, other]

Axiom Pinpointing

Authors: Rafael Peñaloza

Abstract: Axiom pinpointing refers to the task of finding the specific axioms in an ontology which are responsible for a consequence to follow. This task has been studied, under different names, in many research areas, leading to a reformulation and reinvention of techniques. In this work, we present a general overview of axiom pinpointing, providing the basic notions, different approaches for solving it, and some variations and applications which have been considered in the literature. This should serve as a starting point for researchers interested in related problems, with an ample bibliography for delving deeper into the details.

Submitted 18 March, 2020; originally announced March 2020.

arXiv:2001.07541 [pdf, ps, other]

Provenance for the Description Logic ELHr

Authors: Camille Bourgaux, Ana Ozaki, Rafael Peñaloza, Livia Predoiu

Abstract: We address the problem of handling provenance information in ELHr ontologies. We consider a setting recently introduced for ontology-based data access, based on semirings and extending classical data provenance, in which ontology axioms are annotated with provenance tokens. A consequence inherits the provenance of the axioms involved in deriving it, yielding a provenance polynomial as an annotation. We analyse the semantics for the ELHr case and show that the presence of conjunctions poses various difficulties for handling provenance, some of which are mitigated by assuming multiplicative idempotency of the semiring. Under this assumption, we study three problems: ontology completion with provenance, computing the set of relevant axioms for a consequence, and query answering.

Submitted 24 October, 2023; v1 submitted 21 January, 2020; originally announced January 2020.

Comments: This is the long version of an IJCAI 2020 paper (23 pages) - v3 fixes glitches in proof of lemma 27 and in claim 30

MSC Class: 16Y60

arXiv:1906.00179 [pdf, ps, other]

Enriching Ontology-based Data Access with Provenance (Extended Version)

Authors: Diego Calvanese, Davide Lanti, Ana Ozaki, Rafael Penaloza, Guohui Xiao

Abstract: Ontology-based data access (OBDA) is a popular paradigm for querying heterogeneous data sources by connecting them through mappings to an ontology. In OBDA, it is often difficult to reconstruct why a tuple occurs in the answer of a query. We address this challenge by enriching OBDA with provenance semirings, taking inspiration from database theory. In particular, we investigate the problems of (i) deciding whether a provenance annotated OBDA instance entails a provenance annotated conjunctive query, and (ii) computing a polynomial representing the provenance of a query entailed by a provenance annotated OBDA instance. Differently from pure databases, in our case these polynomials may be infinite. To regain finiteness, we consider idempotent semirings, and study the complexity in the case of DL-Lite ontologies. We implement Task (ii) in a state-of-the-art OBDA system and show the practical feasibility of the approach through an extensive evaluation against two popular benchmarks.

Submitted 1 June, 2019; originally announced June 2019.

arXiv:1903.04940 [pdf, other]

Temporal Logics Over Finite Traces with Uncertainty (Technical Report)

Authors: Fabrizio M. Maggi, Marco Montali, Rafael Peñaloza

Abstract: Temporal logics over finite traces have recently seen wide application in a number of areas, from business process modelling, monitoring, and mining to planning and decision making. However, real-life dynamic systems contain a degree of uncertainty which cannot be handled with classical logics. We thus propose a new probabilistic temporal logic over finite traces using superposition semantics, where all possible evolutions are possible, until observed. We study the properties of the logic and provide automata-based mechanisms for deriving probabilistic inferences from its formulas. We then study a fragment of the logic with better computational properties. Notably, formulas in this fragment can be discovered from event log data using off-the-shelf existing declarative process discovery techniques.

Submitted 18 November, 2019; v1 submitted 12 March, 2019; originally announced March 2019.

Comments: Extended version of paper accepted at AAAI 2020

arXiv:1810.01516 [pdf, ps, other]

Cutting Diamonds: Temporal DLs with Probabilistic Distributions over Data

Authors: Alisa Kovtunova, Rafael Peñaloza

Abstract: Recent work has studied a probabilistic extension of the temporal logic LTL that refines the eventuality (or diamond) constructor with a probability distribution on when will this eventuality be satisfied. In this paper, we adapt this notion to a well established temporal extension of DL-Lite, allowing the new probabilistic constructor only in the ABox assertions. We investigate the satisfiability problem of this new temporal DL over equiparametric geometric distributions.

Submitted 2 October, 2018; originally announced October 2018.

Comments: Full version of the paper accepted for 31st International Workshop on Description Logics

arXiv:1808.01877 [pdf, ps, other]

Query Answering for Rough EL Ontologies (Extended Technical Report)

Authors: Rafael Peñaloza, Veronika Thost, Anni-Yasmin Turhan

Abstract: Querying large datasets with incomplete and vague data is still a challenge. Ontology-based query answering extends standard database query answering by background knowledge from an ontology to augment incomplete data. We focus on ontologies written in rough description logics (DLs), which allow to represent vague knowledge by partitioning the domain of discourse into classes of indiscernible elements. In this paper, we extend the combined approach for ontology-based query answering to a variant of the DL EL augmented with rough concept constructors. We show that this extension preserves the good computational properties of classical EL and can be implemented by standard database systems.

Submitted 6 August, 2018; originally announced August 2018.

Comments: Extended version of a paper accepted at KR 2018

arXiv:1808.00248 [pdf, ps, other]

Repairing Description Logic Ontologies by Weakening Axioms

Authors: Franz Baader, Francesco Kriegel, Adrian Nuradiansyah, Rafael Peñaloza

Abstract: The classical approach for repairing a Description Logic ontology O in the sense of removing an unwanted consequence $α$ is to delete a minimal number of axioms from O such that the resulting ontology O' does not have the consequence $α$. However, the complete deletion of axioms may be too rough, in the sense that it may also remove consequences that are actually wanted. To alleviate this problem, we propose a more gentle way of repair in which axioms are not necessarily deleted, but only weakened. On the one hand, we investigate general properties of this gentle repair method. On the other hand, we propose and analyze concrete approaches for weakening axioms expressed in the Description Logic EL.

Submitted 1 August, 2018; originally announced August 2018.

Comments: Extended version of the paper "Making Repairs in Description Logics More Gentle" accepted at KR 2018

Report number: 18-01

arXiv:1805.10250 [pdf, ps, other]

Consequence-Based Axiom Pinpointing

Authors: Ana Ozaki, Rafael Peñaloza

Abstract: Axiom pinpointing refers to the problem of finding the axioms in an ontology that are relevant for understanding a given entailment or consequence. One approach for axiom pinpointing, known as glass-box, is to modify a classical decision procedure for the entailments into a method that computes the solutions for the pinpointing problem. Recently, consequence-based decision procedures have been proposed as a promising alternative for tableaux-based reasoners for standard ontology languages. In this work, we present a general framework to extend consequence-based algorithms with axiom pinpointing.

Submitted 25 May, 2018; originally announced May 2018.

Comments: Technical Report

arXiv:1711.03430 [pdf, ps, other]

Repairing Ontologies via Axiom Weakening

Authors: Nicolas Troquard, Roberto Confalonieri, Pietro Galliani, Rafael Penaloza, Daniele Porello, Oliver Kutz

Abstract: Ontology engineering is a hard and error-prone task, in which small changes may lead to errors, or even produce an inconsistent ontology. As ontologies grow in size, the need for automated methods for repairing inconsistencies while preserving as much of the original knowledge as possible increases. Most previous approaches to this task are based on removing a few axioms from the ontology to regain consistency. We propose a new method based on weakening these axioms to make them less restrictive, employing the use of refinement operators. We introduce the theoretical framework for weakening DL ontologies, propose algorithms to repair ontologies based on the framework, and provide an analysis of the computational complexity. Through an empirical analysis made over real-life ontologies, we show that our approach preserves significantly more of the original knowledge of the ontology than removing axioms.

Submitted 9 November, 2017; originally announced November 2017.

Comments: To appear AAAI 2018

arXiv:1707.08468 [pdf, ps, other]

A Decidable Very Expressive Description Logic for Databases (Extended Version)

Authors: Alessandro Artale, Enrico Franconi, Rafael Peñaloza, Francesco Sportelli

Abstract: We introduce $\mathcal{DLR}^+$, an extension of the n-ary propositionally closed description logic $\mathcal{DLR}$ to deal with attribute-labelled tuples (generalising the positional notation), projections of relations, and global and local objectification of relations, able to express inclusion, functional, key, and external uniqueness dependencies. The logic is equipped with both TBox and ABox axioms. We show how a simple syntactic restriction on the appearance of projections sharing common attributes in a $\mathcal{DLR}^+$ knowledge base makes reasoning in the language decidable with the same computational complexity as $\mathcal{DLR}$. The obtained $\mathcal{DLR}^\pm$ n-ary description logic is able to encode more thoroughly conceptual data models such as EER, UML, and ORM.

Submitted 25 July, 2017; originally announced July 2017.

Comments: 20 pages. Extended version of paper appearing in the International Semantic Web Conference (ISWC 2017). arXiv admin note: text overlap with arXiv:1604.00799

arXiv:1706.03207 [pdf, ps, other]

Towards Statistical Reasoning in Description Logics over Finite Domains (Full Version)

Authors: Rafael Peñaloza, Nico Potyka

Abstract: We present a probabilistic extension of the description logic $\mathcal{ALC}$ for reasoning about statistical knowledge. We consider conditional statements over proportions of the domain and are interested in the probabilistic-logical consequences of these proportions. After introducing some general reasoning problems and analyzing their properties, we present first algorithms and complexity results for reasoning in some fragments of Statistical $\mathcal{ALC}$.

Submitted 10 June, 2017; originally announced June 2017.

Comments: 16 pages. Extended version of "Towards Statistical Reasoning in Description Logics over Finite Domains" published at the 11th International Conference on Scalable Uncertainty Management (SUM 2017)

arXiv:1606.09521 [pdf, ps, other]

Probabilistic Reasoning in the Description Logic ALCP with the Principle of Maximum Entropy (Full Version)

Authors: Rafael Peñaloza, Nico Potyka

Abstract: A central question for knowledge representation is how to encode and handle uncertain knowledge adequately. We introduce the probabilistic description logic ALCP that is designed for representing context-dependent knowledge, where the actual context taking place is uncertain. ALCP allows the expression of logical dependencies on the domain and probabilistic dependencies on the possible contexts. In order to draw probabilistic conclusions, we employ the principle of maximum entropy. We provide reasoning algorithms for this logic, and show that it satisfies several desirable properties of probabilistic logics.

Submitted 30 June, 2016; originally announced June 2016.

Comments: Full version of paper accepted at the Tenth International Conference on Scalable Uncertainty Management (SUM 2016)

arXiv:1509.08761 [pdf, ps, other]

Reasoning in Infinitely Valued G-IALCQ

Authors: Stefan Borgwardt, Rafael Peñaloza

Abstract: Fuzzy Description Logics (FDLs) are logic-based formalisms used to represent and reason with vague or imprecise knowledge. It has been recently shown that reasoning in most FDLs using truth values from the interval [0,1] becomes undecidable in the presence of a negation constructor and general concept inclusion axioms. One exception to this negative result are FDLs whose semantics is based on the infinitely valued Gödel t-norm (G). In this paper, we extend previous decidability results for G-IALC to deal also with qualified number restrictions. Our novel approach is based on a combination of the known crispification technique for finitely valued FDLs and the automata-based procedure originally developed for reasoning in G-IALC. The proposed approach combines the advantages of these two methods, while removing their respective drawbacks.

Submitted 29 September, 2015; originally announced September 2015.

Comments: Workshop on Weighted Logics for Artificial Intelligence, 2015

arXiv:1508.02626 [pdf, other]

Answering Fuzzy Conjunctive Queries over Finitely Valued Fuzzy Ontologies

Authors: Stefan Borgwardt, Theofilos Mailis, Rafael Peñaloza, Anni-Yasmin Turhan

Abstract: Fuzzy Description Logics (DLs) provide a means for representing vague knowledge about an application domain. In this paper, we study fuzzy extensions of conjunctive queries (CQs) over the DL $\mathcal{SROIQ}$ based on finite chains of degrees of truth. To answer such queries, we extend a well-known technique that reduces the fuzzy ontology to a classical one, and use classical DL reasoners as a black box. We improve the complexity of previous reduction techniques for finitely valued fuzzy DLs, which allows us to prove tight complexity results for answering certain kinds of fuzzy CQs. We conclude with an experimental evaluation of a prototype implementation, showing the feasibility of our approach.

Submitted 14 October, 2015; v1 submitted 11 August, 2015; originally announced August 2015.

Comments: submitted to the Journal on Data Semantics, v1: 19 pages, v2: 20 pages, improved evaluation section

arXiv:1507.03920 [pdf, ps, other]

doi 10.1017/S1471068415000241

Fuzzy Answer Set Computation via Satisfiability Modulo Theories

Authors: Mario Alviano, Rafael Penaloza

Abstract: Fuzzy answer set programming (FASP) combines two declarative frameworks, answer set programming and fuzzy logic, in order to model reasoning by default over imprecise information. Several connectives are available to combine different expressions; in particular the Gödel and Łukasiewicz fuzzy connectives are usually considered, due to their properties. Although the Gödel conjunction can be easily eliminated from rule heads, we show through complexity arguments that such a simplification is infeasible in general for all other connectives. The paper analyzes a translation of FASP programs into satisfiability modulo theories (SMT), which in general produces quantified formulas because of the minimality of the semantics. Structural properties of many FASP programs allow to eliminate the quantification, or to sensibly reduce the number of quantified variables. Indeed, integrality constraints can replace recursive rules commonly used to force Boolean interpretations, and completion subformulas can guarantee minimality for acyclic programs with atomic heads. Moreover, head cycle free rules can be replaced by shifted subprograms, whose structure depends on the eliminated head connective, so that ordered completion may replace the minimality check if also Łukasiewicz disjunction in rule bodies is acyclic. The paper also presents and evaluates a prototype system implementing these translations. To appear in Theory and Practice of Logic Programming (TPLP), Proceedings of ICLP 2015.

Submitted 14 July, 2015; originally announced July 2015.

ACM Class: I.2

Journal ref: Theory and Practice of Logic Programming 15 (2015) 588-603

arXiv:1506.08030 [pdf, other]

Dynamic Bayesian Ontology Languages

Authors: İsmail İlkan Ceylan, Rafael Peñaloza

Abstract: Many formalisms combining ontology languages with uncertainty, usually in the form of probabilities, have been studied over the years. Most of these formalisms, however, assume that the probabilistic structure of the knowledge remains static over time. We present a general approach for extending ontology languages to handle time-evolving uncertainty represented by a dynamic Bayesian network. We show how reasoning in the original language and dynamic Bayesian inferences can be exploited for effective reasoning in our framework.

Submitted 26 June, 2015; originally announced June 2015.

Comments: Fifth International Workshop on Statistical Relational AI (StarAI'2015)

Showing 1–32 of 32 results for author: Peñaloza, R