-
Evaluation-Time Policy Switching for Offline Reinforcement Learning
Authors:
Natinael Solomon Neggatu,
Jeremie Houssineau,
Giovanni Montana
Abstract:
Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they tend to over-estimate the behaviour of out-of-distribution actions. Existing offline RL algorithms adapt off-policy algorithms, employing techniques such as constraining the policy or modifying the value function to achieve good performance on individual datasets, but struggle to adapt to different tasks or datasets of different qualities without tuning hyper-parameters. We introduce a policy switching technique that dynamically combines the behaviour of a pure off-policy RL agent, for improving behaviour, and a behavioural cloning (BC) agent, for staying close to the data. We achieve this by using a combination of epistemic uncertainty, quantified by our RL model, and a metric for aleatoric uncertainty extracted from the dataset. We show empirically that our policy switching technique can not only outperform the individual algorithms used in the switching process but also compete with state-of-the-art methods on numerous benchmarks. Our use of epistemic uncertainty for policy switching also allows us to naturally extend our method to the domain of offline-to-online fine-tuning, allowing our model to adapt quickly and safely from online data, either matching or exceeding the performance of current methods that typically require additional modification or hyper-parameter fine-tuning.
Submitted 15 March, 2025;
originally announced March 2025.
-
Investigating Relational State Abstraction in Collaborative MARL
Authors:
Sharlin Utke,
Jeremie Houssineau,
Giovanni Montana
Abstract:
This paper explores the impact of relational state abstraction on sample efficiency and performance in collaborative Multi-Agent Reinforcement Learning. The proposed abstraction is based on spatial relationships in environments where direct communication between agents is not allowed, leveraging the ubiquity of spatial reasoning in real-world multi-agent scenarios. We introduce MARC (Multi-Agent Relational Critic), a simple yet effective critic architecture incorporating spatial relational inductive biases by transforming the state into a spatial graph and processing it through a relational graph neural network. The performance of MARC is evaluated across six collaborative tasks, including a novel environment with heterogeneous agents. We conduct a comprehensive empirical analysis, comparing MARC against state-of-the-art MARL baselines, demonstrating improvements in both sample efficiency and asymptotic performance, as well as its potential for generalization. Our findings suggest that a minimal integration of spatial relational inductive biases as abstraction can yield substantial benefits without requiring complex designs or task-specific engineering. This work provides insights into the potential of relational state abstraction to address sample efficiency, a key challenge in MARL, offering a promising direction for developing more efficient algorithms in spatially complex environments.
Submitted 19 December, 2024;
originally announced December 2024.
-
Improving Active Learning with a Bayesian Representation of Epistemic Uncertainty
Authors:
Jake Thomas,
Jeremie Houssineau
Abstract:
A popular strategy for active learning is to specifically target a reduction in epistemic uncertainty, since aleatoric uncertainty is often considered as being intrinsic to the system of interest and therefore not reducible. Yet, distinguishing these two types of uncertainty remains challenging and there is no single strategy that consistently outperforms the others. We propose to use a particular combination of probability and possibility theories, with the aim of using the latter to specifically represent epistemic uncertainty, and we show how this combination leads to new active learning strategies that have desirable properties. In order to demonstrate the efficiency of these strategies in non-trivial settings, we introduce the notion of a possibilistic Gaussian process (GP) and consider GP-based multiclass and binary classification problems, for which the proposed methods display a strong performance for both simulated and real datasets.
Submitted 11 December, 2024;
originally announced December 2024.
-
Redesigning the ensemble Kalman filter with a dedicated model of epistemic uncertainty
Authors:
Chatchuea Kimchaiwong,
Jeremie Houssineau,
Adam M. Johansen
Abstract:
The problem of incorporating information from observations received serially in time is widespread in the field of uncertainty quantification. Within a probabilistic framework, such problems can be addressed using standard filtering techniques. However, in many real-world problems, some (or all) of the uncertainty is epistemic, arising from a lack of knowledge, and is difficult to model probabilistically. This paper introduces a possibilistic ensemble Kalman filter designed for this setting and characterizes some of its properties. Using possibility theory to describe epistemic uncertainty is appealing from a philosophical perspective, and it is easy to justify certain heuristics often employed in standard ensemble Kalman filters as principled approaches to capturing uncertainty within it. The possibilistic approach motivates a robust mechanism for characterizing uncertainty which shows good performance with small sample sizes, and can outperform standard ensemble Kalman filters at a given sample size, even when dealing with genuinely aleatoric uncertainty.
Submitted 27 November, 2024;
originally announced November 2024.
-
Mitigating Relative Over-Generalization in Multi-Agent Reinforcement Learning
Authors:
Ting Zhu,
Yue Jin,
Jeremie Houssineau,
Giovanni Montana
Abstract:
In decentralized multi-agent reinforcement learning, agents learning in isolation can lead to relative over-generalization (RO), where optimal joint actions are undervalued in favor of suboptimal ones. This hinders effective coordination in cooperative tasks, as agents tend to choose actions that are individually rational but collectively suboptimal. To address this issue, we introduce MaxMax Q-Learning (MMQ), which employs an iterative process of sampling and evaluating potential next states, selecting those with maximal Q-values for learning. This approach refines approximations of ideal state transitions, aligning more closely with the optimal joint policy of collaborating agents. We provide theoretical analysis supporting MMQ's potential and present empirical evaluations across various environments susceptible to RO. Our results demonstrate that MMQ frequently outperforms existing baselines, exhibiting enhanced convergence and sample efficiency.
Submitted 17 November, 2024;
originally announced November 2024.
-
Robust Multi-Sensor Multi-Target Tracking Using Possibility Labeled Multi-Bernoulli Filter
Authors:
Han Cai,
Chenbao Xue,
Jeremie Houssineau,
Zhirun Xue
Abstract:
With the increasing complexity of multiple target tracking scenes, a single sensor may not be able to effectively monitor a large number of targets. Therefore, it is imperative to extend the single-sensor technique to Multi-Sensor Multi-Target Tracking (MSMTT) for enhanced functionality. Typical MSMTT methods presume complete randomness of all uncertain components, and therefore effective solutions such as the random finite set filter and covariance intersection method have been derived to conduct the MSMTT task. However, the presence of epistemic uncertainty, arising from incomplete information, is often disregarded within the context of MSMTT. This paper develops an innovative possibility Labeled Multi-Bernoulli (LMB) Filter based on the labeled Uncertain Finite Set (UFS) theory. The LMB filter inherits the high robustness of the possibility generalized labeled multi-Bernoulli filter with simplified computational complexity. The fusion of LMB UFSs is derived and adapted to develop a robust MSMTT scheme. Simulation results corroborate the superior performance exhibited by the proposed approach in comparison to typical probabilistic methods.
Submitted 4 January, 2024;
originally announced January 2024.
-
Target tracking in the framework of possibility theory: The possibilistic Bernoulli filter
Authors:
Branko Ristic,
Jeremie Houssineau,
Sanjeev Arulampalam
Abstract:
The Bernoulli filter is a Bayes filter for joint detection and tracking of a target in the presence of false and miss detections. This paper presents a mathematical formulation of the Bernoulli filter in the framework of possibility theory, where uncertainty is represented using possibility functions, rather than probability distributions. Possibility functions model the uncertainty in a non-additive manner, and have the capacity to deal with partial (incomplete) problem specification. Thus, the main advantage of the possibilistic Bernoulli filter, derived in this paper, is that it can operate even in the absence of precise measurement and/or dynamic model parameters. This feature of the proposed filter is demonstrated in the context of target tracking using multi-static Doppler shifts as measurements.
Submitted 10 November, 2019;
originally announced November 2019.
-
Robust TMA using the possibility particle filter
Authors:
Branko Ristic,
Jeremie Houssineau,
Sanjeev Arulampalam
Abstract:
The problem is target motion analysis (TMA), where the objective is to estimate the state of a moving target from noise-corrupted bearings-only measurements. The focus is on recursive TMA, traditionally solved using Bayesian filters (e.g. the extended or unscented Kalman filters, particle filters). TMA is a difficult problem and may cause the algorithms to diverge, especially when the measurement noise model is imperfect or mismatched. As a robust alternative to the Bayesian filters for TMA, we propose the recently introduced possibility filter. This filter is implemented in the sequential Monte Carlo framework, and referred to as the possibility particle filter. The paper demonstrates its superior performance against the standard particle filter in the presence of a model mismatch, and equal performance in the case of the exact model match.
Submitted 31 May, 2018;
originally announced June 2018.
-
Fusion of finite set distributions: Pointwise consistency and global cardinality
Authors:
Murat Üney,
Jérémie Houssineau,
Emmanuel Delande,
Simon J. Julier,
Daniel E. Clark
Abstract:
A recent trend in distributed multi-sensor fusion is to use random finite set filters at the sensor nodes and fuse the filtered distributions algorithmically using their exponential mixture densities (EMDs). Fusion algorithms which extend the celebrated covariance intersection and consensus based approaches are such examples. In this article, we analyse the variational principle underlying EMDs and show that the EMDs of finite set distributions do not necessarily lead to consistent fusion of cardinality distributions. Indeed, we demonstrate that these inconsistencies may occur with overwhelming probability in practice, through examples with Bernoulli, Poisson and independent identically distributed (IID) cluster processes. We prove that pointwise consistency of EMDs does not imply consistency in global cardinality and vice versa. Then, we redefine the variational problems underlying fusion and provide iterative solutions thereby establishing a framework that guarantees cardinality consistent fusion.
Submitted 3 December, 2018; v1 submitted 17 February, 2018;
originally announced February 2018.
-
Bayesian data assimilation based on a family of outer measures
Authors:
Jeremie Houssineau,
Daniel E. Clark
Abstract:
A flexible representation of uncertainty that remains within the standard framework of probabilistic measure theory is presented along with a study of its properties. This representation relies on a specific type of outer measure that is based on the measure of a supremum, hence combining additive and highly sub-additive components. It is shown that this type of outer measure enables the introduction of intuitive concepts such as pullback and general data assimilation operations.
Submitted 9 November, 2016;
originally announced November 2016.
-
A unified approach for multi-object triangulation, tracking and camera calibration
Authors:
Jeremie Houssineau,
Daniel Clark,
Spela Ivekovic,
Chee Sing Lee,
Jose Franco
Abstract:
Object triangulation, 3-D object tracking, feature correspondence, and camera calibration are key problems for estimation from camera networks. This paper addresses these problems within a unified Bayesian framework for joint multi-object tracking and sensor registration. Given that using standard filtering approaches for state estimation from cameras is problematic, an alternative parametrisation is exploited, called disparity space. The disparity space-based approach for triangulation and object tracking is shown to be more effective than non-linear versions of the Kalman filter and particle filtering for non-rectified cameras. The approach for feature correspondence is based on the Probability Hypothesis Density (PHD) filter, and hence inherits the ability to update without explicit measurement association, to initiate new targets, and to discriminate between target and clutter. The PHD filtering approach then forms the basis of a camera calibration method from static or moving objects. Results are shown on simulated data.
Submitted 9 October, 2014;
originally announced October 2014.