Search | arXiv e-print repository

Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration

Authors: Andreas Kontogiannis, Konstantinos Papathanasiou, Yi Shen, Giorgos Stamou, Michael M. Zavlanos, George Vouros

Abstract: Learning to cooperate in distributed partially observable environments with no communication abilities poses significant challenges for multi-agent deep reinforcement learning (MARL). This paper addresses key concerns in this domain, focusing on inferring state representations from individual agent observations and leveraging these representations to enhance agents' exploration and collaborative task execution policies. To this end, we propose a novel state modelling framework for cooperative MARL, where agents infer meaningful belief representations of the non-observable state, with respect to optimizing their own policies, while filtering redundant and less informative joint state information. Building upon this framework, we propose the MARL SMPE algorithm. In SMPE, agents enhance their own policy's discriminative abilities under partial observability, explicitly by incorporating their beliefs into the policy network, and implicitly by adopting an adversarial type of exploration policy that encourages agents to discover novel, high-value states while improving the discriminative abilities of others. Experimentally, we show that SMPE outperforms state-of-the-art MARL algorithms in complex fully cooperative tasks from the MPE, LBF, and RWARE benchmarks.

Submitted 12 June, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

Comments: Accepted at ICML 2025

arXiv:2502.04773

An Extended Benchmarking of Multi-Agent Reinforcement Learning Algorithms in Complex Fully Cooperative Tasks

Authors: George Papadopoulos, Andreas Kontogiannis, Foteini Papadopoulou, Chaido Poulianou, Ioannis Koumentis, George Vouros

Abstract: Multi-Agent Reinforcement Learning (MARL) has recently emerged as a significant area of research. However, MARL evaluation often lacks systematic diversity, hindering a comprehensive understanding of algorithms' capabilities. In particular, cooperative MARL algorithms are predominantly evaluated on benchmarks such as SMAC and GRF, which primarily feature team game scenarios without adequately assessing various aspects of agents' capabilities required in fully cooperative real-world tasks such as multi-robot cooperation and warehouse, resource management, search and rescue, and human-AI cooperation. Moreover, MARL algorithms are mainly evaluated on low-dimensional state spaces, and thus their performance on high-dimensional (e.g., image) observations is not well-studied. To fill this gap, this paper highlights the crucial need for expanding systematic evaluation across a wider array of existing benchmarks. To this end, we conduct extensive evaluation and comparisons of well-known MARL algorithms on complex fully cooperative benchmarks, including tasks with images as agents' observations. Interestingly, our analysis shows that many algorithms, hailed as state-of-the-art on SMAC and GRF, may underperform standard MARL baselines on fully cooperative benchmarks. Finally, towards more systematic and better evaluation of cooperative MARL algorithms, we have open-sourced PyMARLzoo+, an extension of the widely used (E)PyMARL libraries, which addresses an open challenge from [TBG++21], facilitating seamless integration and support with all benchmarks of PettingZoo, as well as Overcooked, PressurePlate, Capture Target and Box Pushing.

Submitted 3 July, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

arXiv:2412.11266

Bayesian inference of mean velocity fields and turbulence models from flow MRI

Authors: A. Kontogiannis, P. Nair, M. Loecher, D. B. Ennis, A. Marsden, M. P. Juniper

Abstract: We solve a Bayesian inverse Reynolds-averaged Navier-Stokes (RANS) problem that assimilates mean flow data by jointly reconstructing the mean flow field and learning its unknown RANS parameters. We devise an algorithm that learns the most likely parameters of an algebraic effective viscosity model, and estimates their uncertainties, from mean flow data of a turbulent flow. We conduct a flow MRI experiment to obtain mean flow data of a confined turbulent jet in an idealized medical device known as the FDA (Food and Drug Administration) nozzle. The algorithm successfully reconstructs the mean flow field and learns the most likely turbulence model parameters without overfitting. The methodology accepts any turbulence model, be it algebraic (explicit) or multi-equation (implicit), as long as the model is differentiable, and naturally extends to unsteady turbulent flows.

Submitted 15 December, 2024; originally announced December 2024.

arXiv:2408.02604

doi 10.1017/jfm.2025.92

Learning rheological parameters of non-Newtonian fluids from velocimetry data

Authors: Alexandros Kontogiannis, Richard Hodgkinson, Emily L. Manchester

Abstract: We solve a Bayesian inverse Navier-Stokes (N-S) problem that assimilates velocimetry data in order to jointly reconstruct the flow field and learn the unknown N-S parameters. By incorporating a Carreau shear-thinning viscosity model into the N-S problem, we devise an algorithm that learns the most likely Carreau parameters of a shear-thinning fluid, and estimates their uncertainties, from velocimetry data alone. We then conduct a flow-MRI experiment to obtain velocimetry data of an axisymmetric laminar jet through an idealised medical device (FDA nozzle) for a blood analogue fluid. We show that the algorithm can successfully reconstruct the flow field by learning the most likely Carreau parameters, and that the learned parameters are in very good agreement with rheometry measurements. The algorithm accepts any algebraic effective viscosity model, as long as the model is differentiable, and it can be extended to more complicated non-Newtonian fluids (e.g. Oldroyd-B fluid) if a viscoelastic model is incorporated into the N-S problem.
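
For readers unfamiliar with the Carreau model named in the abstract above, the following is a minimal sketch of the standard Carreau effective viscosity, mu(gamma) = mu_inf + (mu_0 - mu_inf) * (1 + (lam * gamma)^2)^((n - 1) / 2). The function name, argument names, and numerical values are illustrative assumptions; they are not taken from the paper and are not its learned parameters.

```python
import numpy as np

def carreau_viscosity(shear_rate, mu_0, mu_inf, lam, n):
    """Standard Carreau effective viscosity (hypothetical helper, not the paper's code).

    shear_rate : shear rate magnitude(s), in 1/s
    mu_0       : zero-shear-rate viscosity
    mu_inf     : infinite-shear-rate viscosity
    lam        : relaxation time, in s
    n          : power-law index (n < 1 gives shear thinning)
    """
    shear_rate = np.asarray(shear_rate, dtype=float)
    return mu_inf + (mu_0 - mu_inf) * (1.0 + (lam * shear_rate) ** 2) ** ((n - 1.0) / 2.0)

# Placeholder parameter values chosen only to illustrate shear thinning.
shear_rates = np.array([0.0, 1.0, 10.0, 100.0])
mu = carreau_viscosity(shear_rates, mu_0=0.05, mu_inf=0.004, lam=3.0, n=0.4)
print(mu)
```

The expression is smooth in all four parameters, which is consistent with the abstract's requirement that the viscosity model be differentiable.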

Submitted 19 January, 2025; v1 submitted 5 August, 2024; originally announced August 2024.

Journal ref: J. Fluid Mech. 1011 (2025) R3

arXiv:2406.18464

doi 10.1088/1361-6420/ad9cb7

Bayesian inverse Navier-Stokes problems: joint flow field reconstruction and parameter learning

Authors: Alexandros Kontogiannis, Scott V. Elgersma, Andrew J. Sederman, Matthew P. Juniper

Abstract: We formulate and solve a Bayesian inverse Navier-Stokes (N-S) problem that assimilates velocimetry data in order to jointly reconstruct a 3D flow field and learn the unknown N-S parameters, including the boundary position. By hardwiring a generalised N-S problem, and regularising its unknown parameters using Gaussian prior distributions, we learn the most likely parameters in a collapsed search space. The most likely flow field reconstruction is then the N-S solution that corresponds to the learned parameters. We develop the method in the variational setting and use a stabilised Nitsche weak form of the N-S problem that permits the control of all N-S parameters. To regularise the inferred geometry, we use a viscous signed distance field (vSDF) as an auxiliary variable, which is given as the solution of a viscous Eikonal boundary value problem. We devise an algorithm that solves this inverse problem, and numerically implement it using an adjoint-consistent stabilised cut-cell finite element method. We then use this method to reconstruct magnetic resonance velocimetry (flow-MRI) data of a 3D steady laminar flow through a physical model of an aortic arch for two different Reynolds numbers and signal-to-noise ratio (SNR) levels (low/high). We find that the method can accurately i) reconstruct the low SNR data by filtering out the noise/artefacts and recovering flow features that are obscured by noise, and ii) reproduce the high SNR data without overfitting. Although the framework that we develop applies to 3D steady laminar flows in complex geometries, it readily extends to time-dependent laminar and Reynolds-averaged turbulent flows, as well as non-Newtonian (e.g. viscoelastic) fluids.

Submitted 10 December, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

arXiv:2301.03043

XDQN: Inherently Interpretable DQN through Mimicking

Authors: Andreas Kontogiannis, George Vouros

Abstract: Although deep reinforcement learning (DRL) methods have been successfully applied in challenging tasks, their application in real-world operational settings is challenged by methods' limited ability to provide explanations. Among the paradigms for explainability in DRL is the interpretable box design paradigm, where interpretable models substitute inner constituent models of the DRL method, thus making the DRL method "inherently" interpretable. In this paper we explore this paradigm and we propose XDQN, an explainable variation of DQN, which uses an interpretable policy model trained through mimicking. XDQN is challenged in a complex, real-world operational multi-agent problem, where agents are independent learners solving congestion problems. Specifically, XDQN is evaluated in three MARL scenarios, pertaining to the demand-capacity balancing problem of air traffic management. XDQN achieves performance similar to that of DQN, while its abilities to provide global models' interpretations and interpretations of local decisions are demonstrated.

Submitted 8 January, 2023; originally announced January 2023.

arXiv:2207.01466

doi 10.1109/TIP.2022.3228172

Physics-informed compressed sensing for PC-MRI: an inverse Navier-Stokes problem

Authors: Alexandros Kontogiannis, Matthew P. Juniper

Abstract: We formulate a physics-informed compressed sensing (PICS) method for the reconstruction of velocity fields from noisy and sparse phase-contrast magnetic resonance signals. The method solves an inverse Navier-Stokes boundary value problem, which permits us to jointly reconstruct and segment the velocity field, and at the same time infer hidden quantities such as the hydrodynamic pressure and the wall shear stress. Using a Bayesian framework, we regularize the problem by introducing a priori information about the unknown parameters in the form of Gaussian random fields. This prior information is updated using the Navier-Stokes problem, an energy-based segmentation functional, and by requiring that the reconstruction is consistent with the $k$-space signals. We create an algorithm that solves this reconstruction problem, and test it for noisy and sparse $k$-space signals of the flow through a converging nozzle. We find that the method is capable of reconstructing and segmenting the velocity fields from sparsely-sampled (15% $k$-space coverage), low ($\sim 10$) signal-to-noise ratio (SNR) signals, and that the reconstructed velocity field compares well with that derived from fully-sampled (100% $k$-space coverage) high ($>40$) SNR signals of the same flow.

Submitted 30 November, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

Journal ref: IEEE Transactions on Image Processing, 2022

arXiv:2207.01368

doi 10.1017/jfm.2022.503

Joint reconstruction and segmentation of noisy velocity images as an inverse Navier-Stokes problem

Authors: Alexandros Kontogiannis, Scott V. Elgersma, Andrew J. Sederman, Matthew P. Juniper

Abstract: We formulate and solve a generalized inverse Navier-Stokes problem for the joint velocity field reconstruction and boundary segmentation of noisy flow velocity images. To regularize the problem we use a Bayesian framework with Gaussian random fields. This allows us to estimate the uncertainties of the unknowns by approximating their posterior covariance with a quasi-Newton method. We first test the method for synthetic noisy images of 2D flows and observe that the method successfully reconstructs and segments the noisy synthetic images with a signal-to-noise ratio (SNR) of 3. Then we conduct a magnetic resonance velocimetry (MRV) experiment to acquire images of an axisymmetric flow for low ($\simeq 6$) and high ($>30$) SNRs. We show that the method is capable of reconstructing and segmenting the low SNR images, producing noiseless velocity fields and a smooth segmentation, with negligible errors compared with the high SNR images. This amounts to a reduction of the total scanning time by a factor of 27. At the same time, the method provides additional knowledge about the physics of the flow (e.g. pressure), and addresses the shortcomings of MRV (low spatial resolution and partial volume effects) that otherwise hinder the accurate estimation of wall shear stresses. Although the implementation of the method is restricted to 2D steady planar and axisymmetric flows, the formulation applies immediately to 3D steady flows and naturally extends to 3D periodic and unsteady flows.

Submitted 4 July, 2022; originally announced July 2022.

Journal ref: Journal of Fluid Mechanics, 2022

arXiv:2112.07620

Tree-based Focused Web Crawling with Reinforcement Learning

Authors: Andreas Kontogiannis, Dimitrios Kelesis, Vasilis Pollatos, George Giannakopoulos, Georgios Paliouras

Abstract: A focused crawler aims at discovering as many web pages and web sites relevant to a target topic as possible, while avoiding irrelevant ones. Reinforcement Learning (RL) has been a promising direction for optimizing focused crawling, because RL can naturally optimize the long-term profit of discovering relevant web locations within the context of a reward. In this paper, we propose TRES, a novel RL-empowered framework for focused crawling that aims at maximizing both the number of relevant web pages (aka harvest rate) and the number of relevant web sites (domains). We model the focused crawling problem as a novel Markov Decision Process (MDP), which the RL agent aims to solve by determining an optimal crawling strategy. To overcome the computational infeasibility of exhaustively searching for the best action at each time step, we propose Tree-Frontier, a provably efficient tree-based sampling algorithm that adaptively discretizes the large state and action spaces and evaluates only a few representative actions. Experimentally, utilizing online real-world data, we show that TRES significantly outperforms and Pareto-dominates state-of-the-art methods in terms of harvest rate and the number of retrieved relevant domains, while it provably reduces by orders of magnitude the number of URLs needed to be evaluated at each crawling step.

Submitted 17 May, 2025; v1 submitted 11 December, 2021; originally announced December 2021.

arXiv:2107.07863

Simultaneous boundary shape estimation and velocity field de-noising in Magnetic Resonance Velocimetry using Physics-informed Neural Networks

Authors: Ushnish Sengupta, Alexandros Kontogiannis, Matthew P. Juniper

Abstract: Magnetic resonance velocimetry (MRV) is a non-invasive experimental technique widely used in medicine and engineering to measure the velocity field of a fluid. These measurements are dense but have a low signal-to-noise ratio (SNR). The measurements can be de-noised by imposing physical constraints on the flow, which are encapsulated in governing equations for mass and momentum. Previous studies have required the shape of the boundary (for example, a blood vessel) to be known a priori. This, however, requires a set of additional measurements, which can be expensive to obtain. In this paper, we present a physics-informed neural network that instead uses the noisy MRV data alone to simultaneously infer the most likely boundary shape and de-noised velocity field. We achieve this by training an auxiliary neural network that takes the value 1.0 within the inferred domain of the governing PDE and 0.0 outside. This network is used to weight the PDE residual term in the loss function accordingly and implicitly learns the geometry of the system. We test our algorithm by assimilating both synthetic and real MRV measurements for flows that can be well modeled by the Poisson and Stokes equations. We find that we are able to reconstruct very noisy (SNR = 2.5) MRV signals and recover the ground truth with low reconstruction errors of 3.7 - 7.5%. The simplicity and flexibility of our physics-informed neural network approach can readily scale to assimilating MRV data with complex 3D geometries, time-varying 4D data, or unknown parameters in the physical model.
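
The mask-weighting idea in the abstract above can be illustrated with a small sketch. It assumes the auxiliary network's output is available as an array of values in [0, 1] at the collocation points, and that the mask multiplies the squared PDE residual; both are assumptions made for illustration, as are the function name, the argument names, and the `residual_weight` parameter. The abstract does not specify the exact form of the loss.

```python
import numpy as np

def masked_pinn_loss(u_pred, u_meas, pde_residual, mask, residual_weight=1.0):
    """Data misfit plus a mask-weighted physics penalty (hypothetical form).

    u_pred, u_meas : predicted and measured velocities at the data points
    pde_residual   : residual of the governing PDE at the collocation points
    mask           : values in [0, 1]; about 1 inside the inferred domain, about 0 outside
    """
    data_term = np.mean((u_pred - u_meas) ** 2)
    # Points outside the inferred domain (mask near 0) contribute almost nothing,
    # so the physics is enforced only where the mask says fluid is present.
    physics_term = np.mean(mask * pde_residual ** 2)
    return data_term + residual_weight * physics_term

# Toy values: the third point has a large residual but lies outside the domain.
u_pred = np.array([1.0, 2.0, 3.0])
u_meas = np.array([1.0, 2.5, 3.0])
pde_residual = np.array([0.2, 0.0, 4.0])
mask = np.array([1.0, 1.0, 0.0])
loss = masked_pinn_loss(u_pred, u_meas, pde_residual, mask)
print(loss)
```

In this toy case the large residual at the masked-out point does not affect the loss, which is the mechanism by which a learned mask could encode the geometry.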

Submitted 16 July, 2021; originally announced July 2021.

Showing 1–10 of 10 results for author: Kontogiannis, A