Search | arXiv e-print repository

ACTIVA: Amortized Causal Effect Estimation without Graphs via Transformer-based Variational Autoencoder

Authors: Andreas Sauter, Saber Salehkaleybar, Aske Plaat, Erman Acar

Abstract: Predicting the distribution of outcomes under hypothetical interventions is crucial in domains like healthcare, economics, and policy-making. Current methods often rely on strong assumptions, such as known causal graphs or parametric models, and lack amortization across problem instances, limiting their practicality. We propose a novel transformer-based conditional variational autoencoder architecture, named ACTIVA, that extends causal transformer encoders to predict causal effects as mixtures of Gaussians. Our method requires no causal graph and predicts interventional distributions given only observational data and a queried intervention. By amortizing over many simulated instances, it enables zero-shot generalization to novel datasets without retraining. Experiments demonstrate accurate predictions for synthetic and semi-synthetic data, showcasing the effectiveness of our graph-free, amortized causal inference approach.

Submitted 3 March, 2025; originally announced March 2025.

arXiv:2502.19367 [pdf, other]

dCMF: Learning interpretable evolving patterns from temporal multiway data

Authors: Christos Chatzis, Carla Schenker, Jérémy E. Cohen, Evrim Acar

Abstract: Multiway datasets are commonly analyzed using unsupervised matrix and tensor factorization methods to reveal underlying patterns. Frequently, such datasets include timestamps and could correspond to, for example, health-related measurements of subjects collected over time. The temporal dimension is inherently different from the other dimensions, requiring methods that account for its intrinsic properties. Linear Dynamical Systems (LDS) are specifically designed to capture sequential dependencies in the observed data. In this work, we bridge the gap between tensor factorizations and dynamical modeling by exploring the relationship between LDS, Coupled Matrix Factorizations (CMF) and the PARAFAC2 model. We propose a time-aware coupled factorization model called d(ynamical)CMF that constrains the temporal evolution of the latent factors to adhere to a specific LDS structure. Using synthetic datasets, we compare the performance of dCMF with PARAFAC2 and t(emporal)PARAFAC2, which incorporates temporal smoothness. Our results show that dCMF and PARAFAC2-based approaches perform similarly when capturing smoothly evolving patterns that adhere to the PARAFAC2 structure. However, dCMF outperforms alternatives when the patterns evolve smoothly but deviate from the PARAFAC2 structure. Furthermore, we demonstrate that the proposed dCMF method makes it possible to capture more complex dynamics when additional prior information about the temporal evolution is incorporated.

Submitted 26 February, 2025; originally announced February 2025.

arXiv:2411.04867 [pdf, other]

Think Smart, Act SMARL! Analyzing Probabilistic Logic Shields for Multi-Agent Reinforcement Learning

Authors: Satchit Chatterji, Erman Acar

Abstract: Safe reinforcement learning (RL) is crucial for real-world applications, and multi-agent interactions introduce additional safety challenges. While Probabilistic Logic Shields (PLS) have been a powerful proposal to enforce safety in single-agent RL, their generalizability to multi-agent settings remains unexplored. In this paper, we address this gap by conducting extensive analyses of PLS within decentralized, multi-agent environments, and in doing so, propose Shielded Multi-Agent Reinforcement Learning (SMARL) as a general framework for steering MARL towards norm-compliant outcomes. Our key contributions are: (1) a novel Probabilistic Logic Temporal Difference (PLTD) update for shielded, independent Q-learning, which incorporates probabilistic constraints directly into the value update process; (2) a probabilistic logic policy gradient method for shielded PPO with formal safety guarantees for MARL; and (3) comprehensive evaluation across symmetric and asymmetrically shielded $n$-player game-theoretic benchmarks, demonstrating fewer constraint violations and significantly better cooperation under normative constraints. These results position SMARL as an effective mechanism for equilibrium selection, paving the way toward safer, socially aligned multi-agent systems.

Submitted 14 May, 2025; v1 submitted 7 November, 2024; originally announced November 2024.

Comments: 21 pages, 16 figures, Earlier title: "Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning" (changed for specificity and clarity)

arXiv:2411.00431 [pdf, other]

Integrating Fuzzy Logic into Deep Symbolic Regression

Authors: Wout Gerdes, Erman Acar

Abstract: Credit card fraud detection is a critical concern for financial institutions, intensified by the rise of contactless payment technologies. While deep learning models offer high accuracy, their lack of explainability poses significant challenges in financial settings. This paper explores the integration of fuzzy logic into Deep Symbolic Regression (DSR) to enhance both performance and explainability in fraud detection. We investigate the effectiveness of different fuzzy logic implications, specifically Łukasiewicz, Gödel, and Product, in handling the complexity and uncertainty of fraud detection datasets. Our analysis suggests that the Łukasiewicz implication achieves the highest F1-score and overall accuracy, while the Product implication offers a favorable balance between performance and explainability. Despite having a performance lower than state-of-the-art (SOTA) models due to information loss in data transformation, our approach provides novelty and insights into integrating fuzzy logic into DSR for fraud detection, providing a comprehensive comparison between different implications and methods.

Submitted 1 November, 2024; originally announced November 2024.

Comments: 10 pages, 1 figure, published for XAI FIN 24 https://easychair.org/cfp/xaifin2024

arXiv:2410.21928 [pdf, other]

Differentiable Inductive Logic Programming for Fraud Detection

Authors: Boris Wolfson, Erman Acar

Abstract: Current trends in Machine Learning prefer explainability even when it comes at the cost of performance. Therefore, explainable AI methods are particularly important in the field of Fraud Detection. This work investigates the applicability of Differentiable Inductive Logic Programming (DILP) as an explainable AI approach to Fraud Detection. Although the scalability of DILP is a well-known issue, we show that with some data curation, such as cleaning and adjusting the tabular and numerical data to the expected format of background facts statements, it becomes much more applicable. While in processing it does not provide any significant advantage over more traditional methods such as Decision Trees, or more recent ones like Deep Symbolic Classification, it still gives comparable results. We showcase its limitations and points to improve, as well as potential use cases where it can be much more useful compared to traditional methods, such as recursive rule learning.

Submitted 29 October, 2024; originally announced October 2024.

arXiv:2410.06070 [pdf, other]

Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework

Authors: Angela van Sprang, Erman Acar, Willem Zuidema

Abstract: There has been a recent push of research on Transformer-based models for long-term time series forecasting, even though they are inherently difficult to interpret and explain. While there is a large body of work on interpretability methods for various domains and architectures, the interpretability of Transformer-based forecasting models remains largely unexplored. To address this gap, we develop a framework based on Concept Bottleneck Models to enforce interpretability of time series Transformers. We modify the training objective to encourage a model to develop representations similar to predefined interpretable concepts. In our experiments, we enforce similarity using Centered Kernel Alignment, and the predefined concepts include time features and an interpretable, autoregressive surrogate model (AR). We apply the framework to the Autoformer model, and present an in-depth analysis for a variety of benchmark tasks. We find that the model performance remains mostly unaffected, while the model shows much improved interpretability. Additionally, interpretable concepts become local, which makes the trained model easily intervenable. As a proof of concept, we demonstrate a successful intervention in the scenario of a time shift in the data, which eliminates the need to retrain.

Submitted 8 October, 2024; originally announced October 2024.

arXiv:2409.01274 [pdf, other]

DAVIDE: Depth-Aware Video Deblurring

Authors: German F. Torres, Jussi Kalliola, Soumya Tripathy, Erman Acar, Joni-Kristian Kämäräinen

Abstract: Video deblurring aims at recovering sharp details from a sequence of blurry frames. Despite the proliferation of depth sensors in mobile phones and the potential of depth information to guide deblurring, depth-aware deblurring has received only limited attention. In this work, we introduce the 'Depth-Aware VIdeo DEblurring' (DAVIDE) dataset to study the impact of depth information in video deblurring. The dataset comprises synchronized blurred, sharp, and depth videos. We investigate how the depth information should be injected into the existing deep RGB video deblurring models, and propose a strong baseline for depth-aware video deblurring. Our findings reveal the significance of depth information in video deblurring and provide insights into the use cases where depth cues are beneficial. In addition, our results demonstrate that while the depth improves deblurring performance, this effect diminishes when models are provided with a longer temporal context. Project page: https://germanftv.github.io/DAVIDE.github.io/ .

Submitted 2 September, 2024; originally announced September 2024.

arXiv:2408.00682 [pdf, other]

Learning in Multi-Objective Public Goods Games with Non-Linear Utilities

Authors: Nicole Orzan, Erman Acar, Davide Grossi, Patrick Mannion, Roxana Rădulescu

Abstract: Addressing the question of how to achieve optimal decision-making under risk and uncertainty is crucial for enhancing the capabilities of artificial agents that collaborate with or support humans. In this work, we address this question in the context of Public Goods Games. We study learning in a novel multi-objective version of the Public Goods Game where agents have different risk preferences, by means of multi-objective reinforcement learning. We introduce a parametric non-linear utility function to model risk preferences at the level of individual agents, over the collective and individual reward components of the game. We study the interplay between such preference modelling and environmental uncertainty on the incentive alignment level in the game. We demonstrate how different combinations of individual preferences and environmental uncertainties sustain the emergence of cooperative patterns in non-cooperative environments (i.e., where competitive strategies are dominant), while others sustain competitive patterns in cooperative environments (i.e., where cooperative strategies are dominant).

Submitted 1 August, 2024; originally announced August 2024.

Comments: In press at ECAI 2024

arXiv:2407.02610 [pdf, other]

Towards Federated Learning with On-device Training and Communication in 8-bit Floating Point

Authors: Bokun Wang, Axel Berg, Durmus Alp Emre Acar, Chuteng Zhou

Abstract: Recent work has shown that 8-bit floating point (FP8) can be used for efficiently training neural networks with reduced computational overhead compared to training in FP32/FP16. In this work, we investigate the use of FP8 training in a federated learning context. This brings not only the usual benefits of FP8 which are desirable for on-device training at the edge, but also reduces client-server communication costs due to significant weight compression. We present a novel method for combining FP8 client training while maintaining a global FP32 server model and provide convergence analysis. Experiments with various machine learning models and datasets show that our method consistently yields communication reductions of at least 2.9x across a variety of tasks and models compared to an FP32 baseline.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01356 [pdf, other]

tPARAFAC2: Tracking evolving patterns in (incomplete) temporal data

Authors: Christos Chatzis, Carla Schenker, Max Pfeffer, Evrim Acar

Abstract: Tensor factorizations have been widely used for the task of uncovering patterns in various domains. Often, the input is time-evolving, shifting the goal to tracking the evolution of the underlying patterns instead. To adapt to this more complex setting, existing methods incorporate temporal regularization, but they either have overly constrained structural requirements or lack uniqueness, which is crucial for interpretation. In this paper, in order to capture the underlying evolving patterns, we introduce t(emporal)PARAFAC2, which utilizes temporal smoothness regularization on the evolving factors. Previously, an Alternating Optimization (AO) and Alternating Direction Method of Multipliers (ADMM)-based algorithmic approach has been introduced to fit the PARAFAC2 model to fully observed data. In this paper, we extend this algorithmic framework to the case of partially observed data and use it to fit the tPARAFAC2 model to complete and incomplete datasets with the goal of revealing evolving patterns. Our numerical experiments on simulated datasets demonstrate that tPARAFAC2 can extract the underlying evolving patterns more accurately compared to the state-of-the-art in the presence of high amounts of noise and missing data. Using two real datasets, we also demonstrate the effectiveness of the algorithmic approach in terms of handling missing data and of the tPARAFAC2 model in terms of revealing evolving patterns. The paper provides an extensive comparison of different approaches for handling missing data within the proposed framework, and discusses both the advantages and limitations of the tPARAFAC2 model.

Submitted 5 May, 2025; v1 submitted 1 July, 2024; originally announced July 2024.

Comments: 16 pages, 15 figures

arXiv:2406.12338 [pdf, other]

PARAFAC2-based Coupled Matrix and Tensor Factorizations with Constraints

Authors: Carla Schenker, Xiulin Wang, David Horner, Morten A. Rasmussen, Evrim Acar

Abstract: Data fusion models based on Coupled Matrix and Tensor Factorizations (CMTF) have been effective tools for joint analysis of data from multiple sources. While the vast majority of CMTF models are based on the strictly multilinear CANDECOMP/PARAFAC (CP) tensor model, recently the more flexible PARAFAC2 model has also been integrated into CMTF models. PARAFAC2 tensor models can handle irregular/ragged tensors and have been shown to be especially useful for modelling dynamic data with unaligned or irregular time profiles. However, existing PARAFAC2-based CMTF models have limitations in terms of possible regularizations on the factors and/or types of coupling between datasets. To address these limitations, in this paper we introduce a flexible algorithmic framework that fits PARAFAC2-based CMTF models using Alternating Optimization (AO) and the Alternating Direction Method of Multipliers (ADMM). The proposed framework allows imposing various constraints on all modes and linear couplings to other matrix-, CP- or PARAFAC2-models. Experiments on various simulated datasets and a real dataset demonstrate the utility and versatility of the proposed framework as well as its benefits in terms of accuracy and efficiency in comparison with state-of-the-art methods.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 15 pages, 15 figures, 1 table

arXiv:2405.13092 [pdf, other]

CausalPlayground: Addressing Data-Generation Requirements in Cutting-Edge Causality Research

Authors: Andreas W M Sauter, Erman Acar, Aske Plaat

Abstract: Research on causal effects often relies on synthetic data due to the scarcity of real-world datasets with ground-truth effects. Since current data-generating tools do not always meet all requirements for state-of-the-art research, ad-hoc methods are often employed. This leads to heterogeneity among datasets and delays research progress. We address the shortcomings of current data-generating libraries by introducing CausalPlayground, a Python library that provides a standardized platform for generating, sampling, and sharing structural causal models (SCMs). CausalPlayground offers fine-grained control over SCMs, interventions, and the generation of datasets of SCMs for learning and quantitative research. Furthermore, by integrating with Gymnasium, the standard framework for reinforcement learning (RL) environments, we enable online interaction with the SCMs. Overall, by introducing CausalPlayground we aim to foster more efficient and comparable research in the field. All code and API documentation is available at https://github.com/sa-and/CausalPlayground.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2404.11208 [pdf, other]

CAGE: Causality-Aware Shapley Value for Global Explanations

Authors: Nils Ole Breuer, Andreas Sauter, Majid Mohammadi, Erman Acar

Abstract: As Artificial Intelligence (AI) is having more influence on our everyday lives, it becomes important that AI-based decisions are transparent and explainable. As a consequence, the field of eXplainable AI (or XAI) has become popular in recent years. One way to explain AI models is to elucidate the predictive importance of the input features for the AI model in general, also referred to as global explanations. Inspired by cooperative game theory, Shapley values offer a convenient way for quantifying the feature importance as explanations. However, many methods based on Shapley values are built on the assumption of feature independence and often overlook causal relations of the features which could impact their importance for the ML model. Inspired by studies of explanations at the local level, we propose CAGE (Causally-Aware Shapley Values for Global Explanations). In particular, we introduce a novel sampling procedure for out-coalition features that respects the causal relations of the input features. We derive a practical approach that incorporates causal knowledge into global explanation and offers the possibility to interpret the predictive feature importance considering their causal relation. We evaluate our method on synthetic data and real-world data. The explanations from our approach suggest that they are not only more intuitive but also more faithful compared to previous global explanation methods.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.18245 [pdf, other]

LocalCop: An R package for local likelihood inference for conditional copulas

Authors: Elif F. Acar, Martin Lysy, Alan Kuchinsky

Abstract: Conditional copula models allow the dependence structure between multiple response variables to be modelled as a function of covariates. LocalCop (Acar & Lysy, 2024) is an R/C++ package for computationally efficient semiparametric conditional copula modelling using a local likelihood inference framework developed in Acar, Craiu, & Yao (2011), Acar, Craiu, & Yao (2013) and Acar, Czado, & Lysy (2019).

Submitted 27 March, 2024; originally announced March 2024.

Comments: 6 pages, 2 figures; submitted to the Journal of Open Source Software (JOSS)

MSC Class: 62

arXiv:2403.08436 [pdf, other]

PFStorer: Personalized Face Restoration and Super-Resolution

Authors: Tuomas Varanka, Tapani Toivonen, Soumya Tripathy, Guoying Zhao, Erman Acar

Abstract: Recent developments in face restoration have achieved remarkable results in producing high-quality and lifelike outputs. The stunning results, however, often fail to be faithful with respect to the identity of the person as the models lack necessary context. In this paper, we explore the potential of personalized face restoration with diffusion models. In our approach a restoration model is personalized using a few images of the identity, leading to tailored restoration with respect to the identity while retaining fine-grained details. By using independent trainable blocks for personalization, the rich prior of a base restoration model can be exploited to its fullest. To avoid the model relying on parts of identity left in the conditioning low-quality images, a generative regularizer is employed. With a learnable parameter, the model learns to balance between the details generated based on the input image and the degree of personalization. Moreover, we improve the training pipeline of face restoration models to enable an alignment-free approach. We showcase the robust capabilities of our approach in several real-world scenarios with multiple identities, demonstrating our method's ability to generate fine-grained details with faithful restoration. In the user study we evaluate the perceptual quality and faithfulness of the generated details, with our method being voted best 61% of the time compared to the second best with 25% of the votes.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2402.09495 [pdf, other]

On the Potential of Network-Based Features for Fraud Detection

Authors: Catayoun Azarm, Erman Acar, Mickey van Zeelt

Abstract: Online transaction fraud presents substantial challenges to businesses and consumers, risking significant financial losses. Conventional rule-based systems struggle to keep pace with evolving fraud tactics, leading to high false positive rates and missed detections. Machine learning techniques offer a promising solution by leveraging historical data to identify fraudulent patterns. This article explores using the personalised PageRank (PPR) algorithm to capture the social dynamics of fraud by analysing relationships between financial accounts. The primary objective is to compare the performance of traditional features with the addition of PPR in fraud detection models. Results indicate that integrating PPR enhances the model's predictive power, surpassing the baseline model. Additionally, the PPR feature provides unique and valuable information, evidenced by its high feature importance score. Feature stability analysis confirms consistent feature distributions across training and test datasets.

Submitted 19 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2401.16974 [pdf, other]

CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

Authors: Andreas W. M. Sauter, Nicolò Botteghi, Erman Acar, Aske Plaat

Abstract: Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl's Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there has been a recent push to incorporate interventions into machine learning research. Reinforcement learning provides a convenient framework for such an active approach to learning. This paper presents CORE, a deep reinforcement learning-based approach for causal discovery and intervention planning. CORE learns to sequentially reconstruct causal graphs from data while learning to perform informative interventions. Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures. Furthermore, CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency. All relevant code and supplementary material can be found at https://github.com/sa-and/CORE

Submitted 30 January, 2024; originally announced January 2024.

Comments: To be published in Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024), Auckland, New Zealand, May 6 - 10, 2024, IFAAMAS

ACM Class: I.2.6; I.2.8

arXiv:2401.12646 [pdf, other]

Emergent Cooperation under Uncertain Incentive Alignment

Authors: Nicole Orzan, Erman Acar, Davide Grossi, Roxana Rădulescu

Abstract: Understanding the emergence of cooperation in systems of computational agents is crucial for the development of effective cooperative AI. Interactions among individuals in real-world settings are often sparse and occur within a broad spectrum of incentives, which often are only partially known. In this work, we explore how cooperation can arise among reinforcement learning agents in scenarios characterised by infrequent encounters, and where agents face uncertainty about the alignment of their incentives with those of others. To do so, we train the agents under a wide spectrum of environments ranging from fully competitive, to fully cooperative, to mixed-motives. Under this type of uncertainty we study the effects of mechanisms, such as reputation and intrinsic rewards, that have been proposed in the literature to foster cooperation in mixed-motives environments. Our findings show that uncertainty substantially lowers the agents' ability to engage in cooperative behaviour, when that would be the best course of action. In this scenario, the use of effective reputation mechanisms and intrinsic rewards boosts the agents' capability to act nearly-optimally in cooperative environments, while greatly enhancing cooperation in mixed-motive environments as well.

Submitted 23 January, 2024; originally announced January 2024.

arXiv:2312.00586 [pdf, other]

Explainable Fraud Detection with Deep Symbolic Classification

Authors: Samantha Visbeek, Erman Acar, Floris den Hengst

Abstract: There is a growing demand for explainable, transparent, and data-driven models within the domain of fraud detection. Decisions made by fraud detection models need to be explainable in the event of a customer dispute. Additionally, the decision-making process in the model must be transparent to win the trust of regulators and business stakeholders. At the same time, fraud detection solutions can benefit from data due to the noisy, dynamic nature of fraud and the availability of large historical data sets. Finally, fraud detection is notorious for its class imbalance: there are typically several orders of magnitude more legitimate transactions than fraudulent ones. In this paper, we present Deep Symbolic Classification (DSC), an extension of the Deep Symbolic Regression framework to classification problems. DSC casts classification as a search problem in the space of all analytic functions composed of a vocabulary of variables, constants, and operations and optimizes for an arbitrary evaluation metric directly. The search is guided by a deep neural network trained with reinforcement learning. Because the functions are mathematical expressions that are in closed-form and concise, the model is inherently explainable both at the level of a single classification decision and the model's decision process. Furthermore, the class imbalance problem is successfully addressed by optimizing for metrics that are robust to class imbalance such as the F1 score. This eliminates the need for oversampling and undersampling techniques that plague traditional approaches. Finally, the model allows an explicit balance between prediction accuracy and explainability. An evaluation on the PaySim data set demonstrates competitive predictive performance with state-of-the-art models, while surpassing them in terms of explainability. This establishes DSC as a promising model for fraud detection systems.
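The abstract's point about metrics that are robust to class imbalance can be illustrated with a small sketch. The data below is synthetic and the two "detectors" are hypothetical; none of it comes from the paper or from PaySim. It only shows why accuracy can look excellent on imbalanced data while the F1 score exposes a useless classifier.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all.
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


# Synthetic, illustrative data: 10 fraudulent and 990 legitimate transactions.
y_true = [1] * 10 + [0] * 990

# Hypothetical detector A: labels everything as legitimate.
always_legit = [0] * 1000

# Hypothetical detector B: catches 8 of 10 frauds and raises 4 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 986

acc_trivial = accuracy(y_true, always_legit)
f1_trivial = f1_score(y_true, always_legit)
acc_detector = accuracy(y_true, detector)
f1_detector = f1_score(y_true, detector)

print(f"always-legit: accuracy={acc_trivial:.3f}, F1={f1_trivial:.3f}")
print(f"detector:     accuracy={acc_detector:.3f}, F1={f1_detector:.3f}")
```

The two accuracies differ only in the third decimal place, whereas F1 separates the classifiers clearly, which is the kind of signal a search procedure optimizing F1 directly would receive.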

Submitted 1 December, 2023; originally announced December 2023.

Comments: 12 pages, 3 figures, To be published in the 3rd International Workshop on Explainable AI in Finance of the 4th ACM International Conference on AI in Finance (ICAIF, https://ai-finance.org/)

arXiv:2312.00247 [pdf, ps, other]

Note on approximation of truncated Baskakov operators by Fuzzy numbers

Authors: Ecem Acar, Sevilay Kirci Serenbay, Saleem Yaseen Majeed Al Khalidy

Abstract: In this paper, we firstly introduce nonlinear truncated Baskakov operators on compact intervals and obtain some direct theorems. Also, we give the approximation of fuzzy numbers by truncated nonlinear Baskakov operators.

Submitted 28 July, 2023; originally announced December 2023.

Comments: 14 pages

ACM Class: G.0

arXiv:2308.07126 [pdf, other]

doi 10.1109/MLSP55844.2023.10285943

A Time-aware tensor decomposition for tracking evolving patterns

Authors: Christos Chatzis, Max Pfeffer, Pedro Lind, Evrim Acar

Abstract: Time-evolving data sets can often be arranged as a higher-order tensor with one of the modes being the time mode. While tensor factorizations have been successfully used to capture the underlying patterns in such higher-order data sets, the temporal aspect is often ignored, allowing for the reordering of time points. In recent studies, temporal regularizers are incorporated in the time mode to tackle this issue. Nevertheless, existing approaches still do not allow underlying patterns to change in time (e.g., spatial changes in the brain, contextual changes in topics). In this paper, we propose temporal PARAFAC2 (tPARAFAC2): a PARAFAC2-based tensor factorization method with temporal regularization to extract gradually evolving patterns from temporal data. Through extensive experiments on synthetic data, we demonstrate that tPARAFAC2 can capture the underlying evolving patterns accurately, performing better than PARAFAC2 and coupled matrix factorization with temporal smoothness regularization.

Submitted 15 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: 6 pages, 5 figures

arXiv:2307.13894 [pdf, other]

AI4GCC - Team: Below Sea Level: Critiques and Improvements

Authors: Bram Renting, Phillip Wozny, Robert Loftin, Claudia Wieners, Erman Acar

Abstract: We present a critical analysis of the simulation framework RICE-N, an integrated assessment model (IAM) for evaluating the impacts of climate change on the economy. We identify key issues with RICE-N, including action masking and irrelevant actions, and suggest improvements such as utilizing tariff revenue and penalizing overproduction. We also critically engage with features of IAMs in general, namely overly optimistic damage functions and unrealistic abatement cost functions. Our findings contribute to the ongoing efforts to further develop the RICE-N framework in an effort to improve the simulation, making it more useful as an inspiration for policymakers.

Submitted 4 August, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

Comments: Presented at AI For Global Climate Cooperation Competition, 2023 (arXiv:cs/2307.06951)

Report number: AI4GCC/2023/track3/2

arXiv:2307.13892 [pdf, other]

AI4GCC-Team -- Below Sea Level: Score and Real World Relevance

Authors: Phillip Wozny, Bram Renting, Robert Loftin, Claudia Wieners, Erman Acar

Abstract: As our submission for track three of the AI for Global Climate Cooperation (AI4GCC) competition, we propose a negotiation protocol for use in the RICE-N climate-economic simulation. Our proposal seeks to address the challenges of carbon leakage through methods inspired by the Carbon Border Adjustment Mechanism (CBAM) and Climate Clubs (CC). We demonstrate the effectiveness of our approach by comparing simulated outcomes to representative concentration pathways (RCP) and shared socioeconomic pathways (SSP). Our protocol results in a temperature rise comparable to RCP 3.4/4.5 and SSP 2. Furthermore, we provide an analysis of our protocol's World Trade Organization compliance, administrative and political feasibility, and ethical concerns. We recognize that our proposal risks hurting the least developing countries, and we suggest specific corrective measures to avoid exacerbating existing inequalities, such as technology sharing and wealth redistribution. Future research should improve the RICE-N tariff mechanism and implement actions allowing for the aforementioned corrective measures.

Submitted 4 August, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

Comments: Presented at AI For Global Climate Cooperation Competition, 2023 (arXiv:2307.06951)

Report number: AI4GCC/2023/track2/1

arXiv:2301.01837 [pdf, other]

A Meta-Learning Algorithm for Interrogative Agendas

Authors: Erman Acar, Andrea De Domenico, Krishna Manoorkar, Mattia Panettiere

Abstract: Explainability is a key challenge and a major research theme in AI research for developing intelligent systems that are capable of working with humans more effectively. An obvious choice in developing explainable intelligent systems relies on employing knowledge representation formalisms which are inherently tailored towards expressing human knowledge e.g., interrogative agendas. In the scope of this work, we focus on formal concept analysis (FCA), a standard knowledge representation formalism, to express interrogative agendas, and in particular to categorize objects w.r.t. a given set of features. Several FCA-based algorithms have already been in use for standard machine learning tasks such as classification and outlier detection. These algorithms use a single concept lattice for such a task, meaning that the set of features used for the categorization is fixed. Different sets of features may have different importance in that categorization; we call a set of features an agenda. In many applications a correct or good agenda for categorization is not known beforehand. In this paper, we propose a meta-learning algorithm to construct a good interrogative agenda explaining the data. Such an algorithm is meant to call existing FCA-based classification and outlier detection algorithms iteratively, to increase their accuracy and reduce their sample complexity. The proposed method assigns a measure of importance to different sets of features used in the categorization, hence making the results more explainable.

Submitted 4 January, 2023; originally announced January 2023.

arXiv:2210.13054 [pdf, other]

doi 10.1109/ICASSP49357.2023.10094562

PARAFAC2-based Coupled Matrix and Tensor Factorizations

Authors: Carla Schenker, Xiulin Wang, Evrim Acar

Abstract: Coupled matrix and tensor factorizations (CMTF) have emerged as an effective data fusion tool to jointly analyze data sets in the form of matrices and higher-order tensors. The PARAFAC2 model has been shown to be a promising alternative to the CANDECOMP/PARAFAC (CP) tensor model due to its flexibility and capability to handle irregular/ragged tensors. While fusion models based on a PARAFAC2 model coupled with matrix/tensor decompositions have been recently studied, they are limited in terms of possible regularizations and/or types of coupling between data sets. In this paper, we propose an algorithmic framework for fitting PARAFAC2-based CMTF models with the possibility of imposing various constraints on all modes and linear couplings, using Alternating Optimization (AO) and the Alternating Direction Method of Multipliers (ADMM). Through numerical experiments, we demonstrate that the proposed algorithmic approach accurately recovers the underlying patterns using various constraints and linear couplings.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2209.00322 [pdf, ps, other]

doi 10.1002/widm.1494

Unsupervised EHR-based Phenotyping via Matrix and Tensor Decompositions

Authors: Florian Becker, Age K. Smilde, Evrim Acar

Abstract: Computational phenotyping allows for unsupervised discovery of subgroups of patients as well as corresponding co-occurring medical conditions from electronic health records (EHR). Typically, EHR data contains demographic information, diagnoses and laboratory results. Discovering (novel) phenotypes has the potential to be of prognostic and therapeutic value. Providing medical practitioners with transparent and interpretable results is an important requirement and an essential part for advancing precision medicine. Low-rank data approximation methods such as matrix (e.g., non-negative matrix factorization) and tensor decompositions (e.g., CANDECOMP/PARAFAC) have demonstrated that they can provide such transparent and interpretable insights. Recent developments have adapted low-rank data approximation methods by incorporating different constraints and regularizations that facilitate interpretability further. In addition, they offer solutions for common challenges within EHR data such as high dimensionality, data sparsity and incompleteness. Especially extracting temporal phenotypes from longitudinal EHR has received much attention in recent years. In this paper, we provide a comprehensive review of low-rank approximation-based approaches for computational phenotyping. The existing literature is categorized into temporal vs. static phenotyping approaches based on matrix vs. tensor decompositions. Furthermore, we outline different approaches for the validation of phenotypes, i.e., the assessment of clinical significance.

Submitted 1 September, 2022; originally announced September 2022.

Comments: 28 pages, 5 figures

Journal ref: WIREs Data Mining and Knowledge Discovery, 13(4), e1494, 2023

arXiv:2207.08457 [pdf, other]

A Meta-Reinforcement Learning Algorithm for Causal Discovery

Authors: Andreas Sauter, Erman Acar, Vincent François-Lavet

Abstract: Causal discovery is a major task with the utmost importance for machine learning since causal structures can enable models to go beyond pure correlation-based inference and significantly boost their performance. However, finding causal structures from data poses a significant challenge both in computational effort and accuracy, let alone its impossibility without interventions in general. In this paper, we develop a meta-reinforcement learning algorithm that performs causal discovery by learning to perform interventions such that it can construct an explicit causal graph. Apart from being useful for possible downstream applications, the estimated causal graph also provides an explanation for the data-generating process. In this article, we show that our algorithm estimates a good graph compared to the SOTA approaches, even in environments whose underlying causal structure is previously unseen. Further, we make an ablation study that shows how learning interventions contribute to the overall performance of our approach. We conclude that interventions indeed help boost the performance, efficiently yielding an accurate estimate of the causal structure of a possibly unseen environment.

Submitted 21 February, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: Camera-ready version for CLEAR23

arXiv:2207.03031 [pdf, other]

FedHeN: Federated Learning in Heterogeneous Networks

Authors: Durmus Alp Emre Acar, Venkatesh Saligrama

Abstract: We propose a novel training recipe for federated learning with heterogeneous networks where each device can have different architectures. We introduce training with a side objective to the devices of higher complexities to jointly train different architectures in a federated setting. We empirically show that our approach improves the performance of different architectures and leads to high communication savings compared to the state-of-the-art methods.

Submitted 6 July, 2022; originally announced July 2022.

Comments: Workshop paper to appear at DyNN, ICML 2022

arXiv:2203.05898 [pdf, other]

Hyperbolic Image Segmentation

Authors: Mina GhadimiAtigh, Julian Schoep, Erman Acar, Nanne van Noord, Pascal Mettes

Abstract: For image segmentation, the current standard is to perform pixel-level optimization and inference in Euclidean output embedding spaces through linear hyperplanes. In this work, we show that hyperbolic manifolds provide a valuable alternative for image segmentation and propose a tractable formulation of hierarchical pixel-level classification in hyperbolic space. Hyperbolic Image Segmentation opens up new possibilities and practical benefits for segmentation, such as uncertainty estimation and boundary information for free, zero-label generalization, and increased performance in low-dimensional output embeddings.

Submitted 11 March, 2022; originally announced March 2022.

Comments: accepted to CVPR 2022

arXiv:2111.04263 [pdf, other]

Federated Learning Based on Dynamic Regularization

Authors: Durmus Alp Emre Acar, Yue Zhao, Ramon Matas Navarro, Matthew Mattina, Paul N. Whatmough, Venkatesh Saligrama

Abstract: We propose a novel federated learning method for distributively training neural network models, where the server orchestrates cooperation between a subset of randomly chosen devices in each round. We view the Federated Learning problem primarily from a communication perspective and allow more device level computations to save transmission costs. We point out a fundamental dilemma, in that the minima of the local-device level empirical loss are inconsistent with those of the global empirical loss. Different from recent prior works, that either attempt inexact minimization or utilize devices for parallelizing gradient computation, we propose a dynamic regularizer for each device at each round, so that in the limit the global and device solutions are aligned. We demonstrate both through empirical results on real and synthetic data as well as analytical results that our scheme leads to efficient training, in both convex and non-convex settings, while being fully agnostic to device heterogeneity and robust to a large number of devices, partial participation and unbalanced data.

Submitted 9 November, 2021; v1 submitted 7 November, 2021; originally announced November 2021.

Comments: Slightly extended version of ICLR 2021 Paper

arXiv:2111.01348 [pdf, other]

Faster Algorithms for Learning Convex Functions

Authors: Ali Siahkamari, Durmus Alp Emre Acar, Christopher Liao, Kelly Geyer, Venkatesh Saligrama, Brian Kulis

Abstract: The task of approximating an arbitrary convex function arises in several learning problems such as convex regression, learning with a difference of convex (DC) functions, and learning Bregman or $f$-divergences. In this paper, we develop and analyze an approach for solving a broad range of convex function learning problems that is faster than state-of-the-art approaches. Our approach is based on a 2-block ADMM method where each block can be computed in closed form. For the task of convex Lipschitz regression, we establish that our proposed algorithm converges with iteration complexity of $O(n\sqrt{d}/\epsilon)$ for a dataset $\mathbf{X} \in \mathbb{R}^{n\times d}$ and $\epsilon > 0$. Combined with per-iteration computation complexity, our method converges with the rate $O(n^3 d^{1.5}/\epsilon + n^2 d^{2.5}/\epsilon + n d^3/\epsilon)$. This new rate improves the state of the art rate of $O(n^5 d^2/\epsilon)$ if $d = o(n^4)$. Further we provide similar solvers for DC regression and Bregman divergence learning. Unlike previous approaches, our method is amenable to the use of GPUs. We demonstrate on regression and metric learning experiments that our approach is over 100 times faster than existing approaches on some data sets, and produces results that are comparable to state of the art.
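The rates quoted in this abstract can be cross-checked with a short calculation. The per-iteration cost below is not stated in the abstract; it is inferred here by dividing the quoted overall rate by the quoted iteration complexity, so treat it as an assumption rather than the paper's own statement.

```latex
% Inferred per-iteration cost (assumption: overall rate = iterations x per-iteration cost)
\[
\underbrace{O\!\left(\frac{n\sqrt{d}}{\epsilon}\right)}_{\text{iterations}}
\times
\underbrace{O\!\left(n^{2} d + n d^{2} + d^{2.5}\right)}_{\text{per iteration (inferred)}}
=
O\!\left(\frac{n^{3} d^{1.5} + n^{2} d^{2.5} + n d^{3}}{\epsilon}\right).
\]

% Term-by-term comparison with the previous rate O(n^5 d^2 / \epsilon)
\[
n^{3} d^{1.5} \le n^{5} d^{2} \ \text{always},
\qquad
n^{2} d^{2.5} \le n^{5} d^{2} \iff d \le n^{6},
\qquad
n d^{3} \le n^{5} d^{2} \iff d \le n^{4}.
\]
```

The last term gives the binding condition, which agrees with the abstract's requirement $d = o(n^4)$ for the new rate to improve on $O(n^5 d^2/\epsilon)$.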

Submitted 19 June, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

Comments: 21 pages, 3 figures. Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022. Copyright 2022 by the author(s)

arXiv:2110.01278 [pdf, other]

doi 10.1137/21M1450033

An AO-ADMM approach to constraining PARAFAC2 on all modes

Authors: Marie Roald, Carla Schenker, Vince D. Calhoun, Tülay Adalı, Rasmus Bro, Jeremy E. Cohen, Evrim Acar

Abstract: Analyzing multi-way measurements with variations across one mode of the dataset is a challenge in various fields including data mining, neuroscience and chemometrics. For example, measurements may evolve over time or have unaligned time profiles. The PARAFAC2 model has been successfully used to analyze such data by allowing the underlying factor matrices in one mode (i.e., the evolving mode) to change across slices. The traditional approach to fit a PARAFAC2 model is to use an alternating least squares-based algorithm, which handles the constant cross-product constraint of the PARAFAC2 model by implicitly estimating the evolving factor matrices. This approach makes imposing regularization on these factor matrices challenging. There is currently no algorithm to flexibly impose such regularization with general penalty functions and hard constraints. In order to address this challenge and to avoid the implicit estimation, in this paper, we propose an algorithm for fitting PARAFAC2 based on alternating optimization with the alternating direction method of multipliers (AO-ADMM). With numerical experiments on simulated data, we show that the proposed PARAFAC2 AO-ADMM approach allows for flexible constraints, recovers the underlying patterns accurately, and is computationally efficient compared to the state-of-the-art. We also apply our model to two real-world datasets from neuroscience and chemometrics, and show that constraining the evolving mode improves the interpretability of the extracted patterns.

Submitted 8 July, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

MSC Class: 15A69; 90C26

Journal ref: SIAM J. Math. Data Sci. 4 (2022) 1191-1222

arXiv:2102.02087 [pdf, other]

doi 10.23919/EUSIPCO54536.2021.9615927

PARAFAC2 AO-ADMM: Constraints in all modes

Authors: Marie Roald, Carla Schenker, Jeremy E. Cohen, Evrim Acar

Abstract: The PARAFAC2 model provides a flexible alternative to the popular CANDECOMP/PARAFAC (CP) model for tensor decompositions. Unlike CP, PARAFAC2 allows factor matrices in one mode (i.e., evolving mode) to change across tensor slices, which has proven useful for applications in different domains such as chemometrics, and neuroscience. However, the evolving mode of the PARAFAC2 model is traditionally modelled implicitly, which makes it challenging to regularise it. Currently, the only way to apply regularisation on that mode is with a flexible coupling approach, which finds the solution through regularised least-squares subproblems. In this work, we instead propose an alternating direction method of multipliers (ADMM)-based algorithm for fitting PARAFAC2 and widen the possible regularisation penalties to any proximable function. Our numerical experiments demonstrate that the proposed ADMM-based approach for PARAFAC2 can accurately recover the underlying components from simulated data while being both computationally efficient and flexible in terms of imposing constraints.

Submitted 3 February, 2021; originally announced February 2021.

Comments: 5 pages, 4 figures, submitted to EUSIPCO21

arXiv:2007.09605 [pdf, other]

doi 10.1109/JSTSP.2020.3045848

A Flexible Optimization Framework for Regularized Matrix-Tensor Factorizations with Linear Couplings

Authors: Carla Schenker, Jeremy E. Cohen, Evrim Acar

Abstract: Coupled matrix and tensor factorizations (CMTF) are frequently used to jointly analyze data from multiple sources, also called data fusion. However, different characteristics of datasets stemming from multiple sources pose many challenges in data fusion and require employing various regularizations, constraints, loss functions and different types of coupling structures between datasets. In this paper, we propose a flexible algorithmic framework for coupled matrix and tensor factorizations which utilizes Alternating Optimization (AO) and the Alternating Direction Method of Multipliers (ADMM). The framework facilitates the use of a variety of constraints, loss functions and couplings with linear transformations in a seamless way. Numerical experiments on simulated and real datasets demonstrate that the proposed approach is accurate, and computationally efficient with comparable or better performance than available CMTF methods for Frobenius norm loss, while being more flexible. Using Kullback-Leibler divergence on count data, we demonstrate that the algorithm yields accurate results also for other loss functions.

Submitted 19 July, 2020; originally announced July 2020.

arXiv:2007.00571 [pdf, other]

Reasoning with Contextual Knowledge and Influence Diagrams

Authors: Erman Acar, Rafael Peñaloza

Abstract: Influence diagrams (IDs) are well-known formalisms extending Bayesian networks to model decision situations under uncertainty. Although they are convenient as a decision theoretic tool, their knowledge representation ability is limited in capturing other crucial notions such as logical consistency. We complement IDs with the light-weight description logic (DL) EL to overcome such limitations. We consider a setup where DL axioms hold in some contexts, yet the actual context is uncertain. The framework benefits from the convenience of using DL as a domain knowledge representation language and the modelling strength of IDs to deal with decisions over contexts in the presence of contextual uncertainty. We define related reasoning problems and study their computational complexity.

Submitted 1 July, 2020; originally announced July 2020.

arXiv:2006.03472 [pdf, other]

Analyzing Differentiable Fuzzy Implications

Authors: Emile van Krieken, Erman Acar, Frank van Harmelen

Abstract: Combining symbolic and neural approaches has gained considerable attention in the AI community, as it is often argued that the strengths and weaknesses of these approaches are complementary. One such trend in the literature are weakly supervised learning techniques that employ operators from fuzzy logics. In particular, they use prior background knowledge described in such logics to help the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks (or grounding them), this background knowledge can be added to regular loss functions, hence making reasoning a part of learning. In this paper, we investigate how implications from the fuzzy logic literature behave in a differentiable setting. In such a setting, we analyze the differences between the formal properties of these fuzzy implications. It turns out that various fuzzy implications, including some of the most well-known, are highly unsuitable for use in a differentiable learning setting. A further finding shows a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Furthermore, we introduce a new family of fuzzy implications (called sigmoidal implications) to tackle this phenomenon. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning, and show that sigmoidal implications outperform other choices of fuzzy implications.

Submitted 4 June, 2020; originally announced June 2020.

Comments: 10 pages, 10 figures, accepted to 17th International Conference on Principles of Knowledge Representation and Reasoning (KR 2020). arXiv admin note: substantial text overlap with arXiv:2002.06100

arXiv:2004.06298 [pdf, other]

Budget Learning via Bracketing

Authors: Aditya Gangrade, Durmus Alp Emre Acar, Venkatesh Saligrama

Abstract: Conventional machine learning applications in the mobile/IoT setting transmit data to a cloud-server for predictions. Due to cost considerations (power, latency, monetary), it is desirable to minimise device-to-server transmissions. The budget learning (BL) problem poses the learner's goal as minimising use of the cloud while suffering no discernible loss in accuracy, under the constraint that the methods employed be edge-implementable. We propose a new formulation for the BL problem via the concept of bracketings. Concretely, we propose to sandwich the cloud's prediction, $g,$ via functions $h^-, h^+$ from a `simple' class so that $h^- \le g \le h^+$ nearly always. On an instance $x$, if $h^+(x)=h^-(x)$, we leverage local processing, and bypass the cloud. We explore theoretical aspects of this formulation, providing PAC-style learnability definitions; associating the notion of budget learnability to approximability via brackets; and giving VC-theoretic analyses of their properties. We empirically validate our theory on real-world datasets, demonstrating improved performance over prior gating based methods.

Submitted 14 April, 2020; originally announced April 2020.

Comments: Slightly expanded version of a paper to be presented at AISTATS 2020

arXiv:2003.06492 [pdf, other]

On Sufficient and Necessary Conditions in Bounded CTL: A Forgetting Approach

Authors: Renyan Feng, Erman Acar, Stefan Schlobach, Yisong Wang, Wanwei Liu

Abstract: Computation Tree Logic (CTL) is one of the central formalisms in formal verification. As a specification language, it is used to express a property that the system at hand is expected to satisfy. From both the verification and the system design points of view, some information content of such property might become irrelevant for the system due to various reasons, e.g., it might become obsolete by time, or perhaps infeasible due to practical difficulties. Then, the problem arises on how to subtract such piece of information without altering the relevant system behaviour or violating the existing specifications over a given signature. Moreover, in such a scenario, two crucial notions are informative: the strongest necessary condition (SNC) and the weakest sufficient condition (WSC) of a given property. To address such a scenario in a principled way, we introduce a forgetting-based approach in CTL and show that it can be used to compute SNC and WSC of a property under a given model and over a given signature. We study its theoretical properties and also show that our notion of forgetting satisfies existing essential postulates of knowledge forgetting. Furthermore, we analyse the computational complexity of some basic reasoning tasks for the fragment CTL_AF in particular.

Submitted 3 July, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

arXiv:2002.06100 [pdf, other]

doi 10.1016/j.artint.2021.103602

Analyzing Differentiable Fuzzy Logic Operators

Authors: Emile van Krieken, Erman Acar, Frank van Harmelen

Abstract: The AI community is increasingly putting its attention towards combining symbolic and neural approaches, as it is often argued that the strengths and weaknesses of these approaches are complementary. One recent trend in the literature are weakly supervised learning techniques that employ operators from fuzzy logics. In particular, these use prior background knowledge described in such logics to help the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks, this background knowledge can be added to regular loss functions, hence making reasoning a part of learning. We study, both formally and empirically, how a large collection of logical operators from the fuzzy logic literature behave in a differentiable learning setting. We find that many of these operators, including some of the most well-known, are highly unsuitable in this setting. A further finding concerns the treatment of implication in these fuzzy logics, and shows a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Furthermore, we introduce a new family of fuzzy implications (called sigmoidal implications) to tackle this phenomenon. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning, and compare how different operators behave in practice. We find that, to achieve the largest performance improvement over a supervised baseline, we have to resort to non-standard combinations of logical operators which perform well in learning, but no longer satisfy the usual logical laws.

Submitted 24 August, 2021; v1 submitted 14 February, 2020; originally announced February 2020.

Comments: 47 pages, 18 figures. V2: Added analysis for existential quantification. Improved experiments and writing

arXiv:2002.03169 [pdf, ps, other]

Distance-based Equilibria in Normal-Form Games

Authors: Erman Acar, Reshef Meir

Abstract: We propose a simple uncertainty modification for the agent model in normal-form games; at any given strategy profile, the agent can access only a set of "possible profiles" that are within a certain distance from the actual action profile. We investigate the various instantiations in which the agent chooses her strategy using well-known rationales e.g., considering the worst case, or trying to minimize the regret, to cope with such uncertainty. Any such modification in the behavioral model naturally induces a corresponding notion of equilibrium; a distance-based equilibrium. We characterize the relationships between the various equilibria, and also their connections to well-known existing solution concepts such as Trembling-hand perfection. Furthermore, we deliver existence results, and show that for some class of games, such solution concepts can actually lead to better outcomes.

Submitted 8 February, 2020; originally announced February 2020.

Comments: Author's preprint

arXiv:1911.02926 [pdf, other]

doi 10.1109/ICASSP40776.2020.9053902

Tracing Network Evolution Using the PARAFAC2 Model

Authors: Marie Roald, Suchita Bhinge, Chunying Jia, Vince Calhoun, Tülay Adalı, Evrim Acar

Abstract: Characterizing time-evolving networks is a challenging task, but it is crucial for understanding the dynamic behavior of complex systems such as the brain. For instance, how spatial networks of functional connectivity in the brain evolve during a task is not well-understood. A traditional approach in neuroimaging data analysis is to make simplifications through the assumption of static spatial networks. In this paper, without assuming static networks in time and/or space, we arrange the temporal data as a higher-order tensor and use a tensor factorization model called PARAFAC2 to capture underlying patterns (spatial networks) in time-evolving data and their evolution. Numerical experiments on simulated data demonstrate that PARAFAC2 can successfully reveal the underlying networks and their dynamics. We also show the promising performance of the model in terms of tracing the evolution of task-related functional connectivity in the brain through the analysis of functional magnetic resonance imaging data.

Submitted 23 October, 2019; originally announced November 2019.

Comments: 5 pages, 5 figures, conference

ACM Class: I.5.1

arXiv:1908.04700 [pdf, other]

Semi-Supervised Learning using Differentiable Reasoning

Authors: Emile van Krieken, Erman Acar, Frank van Harmelen

Abstract: We introduce Differentiable Reasoning (DR), a novel semi-supervised learning technique which uses relational background knowledge to benefit from unlabeled data. We apply it to the Semantic Image Interpretation (SII) task and show that background knowledge provides significant improvement. We find that there is a strong but interesting imbalance between the contributions of updates from Modus Ponens (MP) and its logical equivalent Modus Tollens (MT) to the learning process, suggesting that our approach is very sensitive to a phenomenon called the Raven Paradox. We propose a solution to overcome this situation.

Submitted 13 August, 2019; originally announced August 2019.

Journal ref: IFCoLog Journal of Logic and its Applications 6 (2019) 633-653

arXiv:1907.00032 [pdf, other]

doi 10.1016/j.chemolab.2020.104038

Cross-product Penalized Component Analysis (XCAN)

Authors: José Camacho, Evrim Acar, Morten A. Rasmussen, Rasmus Bro

Abstract: Matrix factorization methods are extensively employed to understand complex data. In this paper, we introduce the cross-product penalized component analysis (XCAN), a sparse matrix factorization based on the optimization of a loss function that allows a trade-off between variance maximization and structural preservation. The approach is based on previous developments, notably (i) the Sparse Principal Component Analysis (SPCA) framework based on the LASSO, (ii) extensions of SPCA to constrain both modes of the factorization, like co-clustering or the Penalized Matrix Decomposition (PMD), and (iii) the Group-wise Principal Component Analysis (GPCA) method. The result is a flexible modeling approach that can be used for data exploration in a large variety of problems. We demonstrate its use with applications from different disciplines.

Submitted 28 June, 2019; originally announced July 2019.

Journal ref: Chemometrics and Intelligent Laboratory Systems, 2020, 203: 104038-

arXiv:1811.11735 [pdf]

Biomimetic potassium selective nanopores

Authors: Elif Turker Acar, Steven F. Buchsbaum, Cody Combs, Francesco Fornasiero, Zuzanna S. Siwy

Abstract: Reproducing the exquisite ion selectivity displayed by biological ion channels in artificial nanopore systems has proven to be one of the most challenging tasks undertaken by the nanopore community, yet a successful achievement of this goal offers immense technological potential. Here we show a strategy to design solid-state nanopores that selectively transport potassium ions and show negligible conductance for sodium ions. The nanopores contain walls decorated with 4'-aminobenzo-18-crown-6 ether and ssDNA molecules located at one pore entrance. The ionic selectivity stems from facilitated transport of potassium ions in the pore region containing crown ethers, while the highly charged ssDNA plays the role of a cation filter. Achieving potassium selectivity in solid-state nanopores opens new avenues toward advanced separation processes, more efficient biosensing technologies and novel biomimetic nanopore systems.

Submitted 28 November, 2018; originally announced November 2018.

arXiv:1612.02189 [pdf, other]

doi 10.1109/ISCAS.2017.8050303

Tensor-Based Fusion of EEG and FMRI to Understand Neurological Changes in Schizophrenia

Authors: Evrim Acar, Yuri Levin-Schwartz, Vince D. Calhoun, Tülay Adalı

Abstract: Neuroimaging modalities such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) provide information about neurological functions in complementary spatiotemporal resolutions; therefore, fusion of these modalities is expected to provide better understanding of brain activity. In this paper, we jointly analyze fMRI and multi-channel EEG signals collected during an auditory oddball task with the goal of capturing brain activity patterns that differ between patients with schizophrenia and healthy controls. Rather than selecting a single electrode or matricizing the third-order tensor that can be naturally used to represent multi-channel EEG signals, we preserve the multi-way structure of EEG data and use a coupled matrix and tensor factorization (CMTF) model to jointly analyze fMRI and EEG signals. Our analysis reveals that (i) joint analysis of EEG and fMRI using a CMTF model can capture meaningful temporal and spatial signatures of patterns that behave differently in patients and controls, and (ii) these differences and the interpretability of the associated components increase by including multiple electrodes from frontal, motor and parietal areas, but not necessarily by including all electrodes in the analysis.

Submitted 7 December, 2016; originally announced December 2016.

arXiv:1607.02328 [pdf, other]

Common and Distinct Components in Data Fusion

Authors: Age K. Smilde, Ingrid Mage, Tormod Naes, Thomas Hankemeier, Mirjam A. Lips, Henk A. L. Kiers, Evrim Acar, Rasmus Bro

Abstract: In many areas of science multiple sets of data are collected pertaining to the same system. Examples are food products which are characterized by different sets of variables, bio-processes which are on-line sampled with different instruments, or biological systems of which different genomics measurements are obtained. Data fusion is concerned with analyzing such sets of data simultaneously to arrive at a global view of the system under study. One of the upcoming areas of data fusion is exploring whether the data sets have something in common or not. This gives insight into common and distinct variation in each data set, thereby facilitating understanding the relationships between the data sets. Unfortunately, research on methods to distinguish common and distinct components is fragmented, both in terminology as well as in methods: there is no common ground which hampers comparing methods and understanding their relative merits. This paper provides a unifying framework for this subfield of data fusion by using rigorous arguments from linear algebra. The most frequently used methods for distinguishing common and distinct components are explained in this framework and some practical examples are given of these methods in the areas of (medical) biology and food science.

Submitted 8 July, 2016; originally announced July 2016.

Comments: 50 pages, 12 figures

arXiv:1606.01385 [pdf, other]

Conditional Copula Models for Right-Censored Clustered Event Time Data

Authors: Candida Geerdens, Elif Fidan Acar, Paul Janssen

Abstract: This paper proposes a modelling strategy to infer the impact of a covariate on the dependence structure of right-censored clustered event time data. The joint survival function of the event times is modelled using a parametric conditional copula whose parameter depends on a cluster-level covariate in a functional way. We use a local likelihood approach to estimate the form of the copula parameter and outline a generalized likelihood ratio-type test strategy to formally test its constancy. A bootstrap procedure is employed to obtain an approximate $p$-value for the test. The performance of the proposed estimation and testing methods is evaluated in simulations under different rates of right-censoring and for various parametric copula families, considering both parametrically and nonparametrically estimated margins. We apply the methods to data from the Diabetic Retinopathy Study to assess the impact of disease onset age on the loss of visual acuity.

Submitted 4 June, 2016; originally announced June 2016.

Comments: 23 pages, 5 figures, appendix and supplemental material

arXiv:1501.03933 [pdf, ps, other]

RDF Validation Requirements - Evaluation and Logical Underpinning

Authors: Thomas Bosch, Andreas Nolle, Erman Acar, Kai Eckert

Abstract: There are many case studies for which the formulation of RDF constraints and the validation of RDF data conforming to these constraints is very important. As a part of the collaboration with the W3C and the DCMI working groups on RDF validation, we identified major RDF validation requirements and initiated an RDF validation requirements database which is available to contribute at http://purl.org/net/rdf-validation. The purpose of this database is to collaboratively collect case studies, use cases, requirements, and solutions regarding RDF validation. Although there are multiple constraint languages which can be used to formulate RDF constraints (associated with these requirements), there is no standard way to formulate them. This paper serves to evaluate to which extent each requirement is satisfied by each of these constraint languages. We take reasoning into account as an important pre-validation step and therefore map constraints to DL in order to show that each constraint can be mapped to an ontology describing RDF constraints generically.

Submitted 17 July, 2015; v1 submitted 16 January, 2015; originally announced January 2015.

Comments: arXiv admin note: text overlap with arXiv:1504.04479

arXiv:1409.8083 [pdf, other]

Variational Inference For Probabilistic Latent Tensor Factorization with KL Divergence

Authors: Beyza Ermis, Y. Kenan Yılmaz, A. Taylan Cemgil, Evrim Acar

Abstract: Probabilistic Latent Tensor Factorization (PLTF) is a recently proposed probabilistic framework for modelling multi-way data. Not only the common tensor factorization models but also any arbitrary tensor factorization structure can be realized by the PLTF framework. This paper presents full Bayesian inference via variational Bayes that facilitates more powerful modelling and allows more sophisticated inference on the PLTF framework. We illustrate our approach on model order selection and link prediction.

Submitted 29 September, 2014; originally announced September 2014.

arXiv:1208.6231 [pdf, other]

Link Prediction via Generalized Coupled Tensor Factorisation

Authors: Beyza Ermiş, Evrim Acar, A. Taylan Cemgil

Abstract: This study deals with the missing link prediction problem: the problem of predicting the existence of missing connections between entities of interest. We address link prediction using coupled analysis of relational datasets represented as heterogeneous data, i.e., datasets in the form of matrices and higher-order tensors. We propose to use an approach based on probabilistic interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor Factorisation, which can simultaneously fit a large class of tensor models to higher-order tensors/matrices with common latent factors using different loss functions. Numerical experiments demonstrate that joint analysis of data from multiple sources via coupled factorisation improves the link prediction performance and the selection of right loss function and tensor model is crucial for accurately predicting missing links.

Submitted 30 August, 2012; originally announced August 2012.

Showing 1–50 of 56 results for author: Acar, E