-
Multi-Agent Reinforcement Learning with Selective State-Space Models
Authors:
Jemma Daniel,
Ruan de Kock,
Louay Ben Nessir,
Sasha Abramowitz,
Omayma Mahjoub,
Wiem Khlifi,
Claude Formanek,
Arnu Pretorius
Abstract:
The Transformer model has demonstrated success across a wide range of domains, including in Multi-Agent Reinforcement Learning (MARL) where the Multi-Agent Transformer (MAT) has emerged as a leading algorithm in the field. However, a significant drawback of Transformer models is their quadratic computational complexity relative to input size, making them computationally expensive when scaling to larger inputs. This limitation restricts MAT's scalability in environments with many agents. Recently, State-Space Models (SSMs) have gained attention due to their computational efficiency, but their application in MARL remains unexplored. In this work, we investigate the use of Mamba, a recent SSM, in MARL and assess whether it can match the performance of MAT while providing significant improvements in efficiency. We introduce a modified version of MAT that incorporates standard and bi-directional Mamba blocks, as well as a novel "cross-attention" Mamba block. Extensive testing shows that our Multi-Agent Mamba (MAM) matches the performance of MAT across multiple standard multi-agent environments, while offering superior scalability to larger agent scenarios. This is significant for the MARL community, because it indicates that SSMs could replace Transformers without compromising performance, whilst also supporting more effective scaling to higher numbers of agents. Our project page is available at https://sites.google.com/view/multi-agent-mamba .
Submitted 28 October, 2024; v1 submitted 25 October, 2024;
originally announced October 2024.
-
Sable: a Performant, Efficient and Scalable Sequence Model for MARL
Authors:
Omayma Mahjoub,
Sasha Abramowitz,
Ruan de Kock,
Wiem Khlifi,
Simon du Toit,
Jemma Daniel,
Louay Ben Nessir,
Louise Beyers,
Claude Formanek,
Liam Clark,
Arnu Pretorius
Abstract:
As multi-agent reinforcement learning (MARL) progresses towards solving larger and more complex problems, it becomes increasingly important that algorithms exhibit the key properties of (1) strong performance, (2) memory efficiency, and (3) scalability. In this work, we introduce Sable, a performant, memory-efficient, and scalable sequence modeling approach to MARL. Sable works by adapting the retention mechanism in Retentive Networks (Sun et al., 2023) to achieve computationally efficient processing of multi-agent observations with long context memory for temporal reasoning. Through extensive evaluations across six diverse environments, we demonstrate how Sable is able to significantly outperform existing state-of-the-art methods in a large number of diverse tasks (34 out of 45 tested). Furthermore, Sable maintains performance as we scale the number of agents, handling environments with more than a thousand agents while exhibiting a linear increase in memory usage. Finally, we conduct ablation studies to isolate the source of Sable's performance gains and confirm its efficient computational memory usage.
Submitted 26 May, 2025; v1 submitted 2 October, 2024;
originally announced October 2024.
-
Impact of Non-Standard Unicode Characters on Security and Comprehension in Large Language Models
Authors:
Johan S Daniel,
Anand Pal
Abstract:
The advancement of large language models has significantly improved natural language processing. However, challenges such as jailbreaks (prompt injections that cause an LLM to follow instructions contrary to its intended use), hallucinations (generating incorrect or misleading information), and comprehension errors remain prevalent. In this report, we present a comparative analysis of the performance of fifteen distinct models, with each model undergoing a standardized test comprising 38 queries across three key metrics: jailbreaks, hallucinations, and comprehension errors. The models are assessed based on the total occurrences of jailbreaks, hallucinations, and comprehension errors. Our work exposes these models' inherent vulnerabilities and challenges the notion of human-level language comprehension of these models. We have empirically analysed the impact of non-standard Unicode characters on the best-performing LLMs, including GPT-4, Gemini 1.5 Pro, LlaMA-3-70B, and Claude 3 Opus, and on their safeguarding mechanisms. By incorporating alphanumeric symbols from Unicode outside the standard Latin block and variants of characters in other languages, we observed a reduction in the efficacy of guardrails implemented through Reinforcement Learning from Human Feedback (RLHF). Consequently, these models exhibit heightened vulnerability to content policy breaches and prompt leakage. Our study also suggests a need to incorporate non-standard Unicode text in LLM training data to enhance the capabilities of these models.
Submitted 23 May, 2024;
originally announced May 2024.
-
Experimental demonstration of an integrated on-chip p-bit core utilizing stochastic Magnetic Tunnel Junctions and 2D-MoS2 FETs
Authors:
John Daniel,
Zheng Sun,
Xuejian Zhang,
Yuanqiu Tan,
Neil Dilley,
Zhihong Chen,
Joerg Appenzeller
Abstract:
Probabilistic computing is a novel computing scheme that offers a more efficient approach than conventional CMOS-based logic in a variety of applications ranging from optimization to Bayesian inference, and invertible Boolean logic. The probabilistic-bit (or p-bit, the base unit of probabilistic computing) is a naturally fluctuating entity that requires tunable stochasticity; by coupling low-barrier stochastic Magnetic Tunnel Junctions (MTJs) with a transistor circuit, a compact implementation is achieved. In this work, through integrating stochastic MTJs with 2D-MoS$_{2}$ FETs, the first on-chip realization of a key p-bit building block displaying voltage-controllable stochasticity is demonstrated. In addition, supported by circuit simulations, this work provides a careful analysis of the three transistor-one magnetic tunnel junction (3T-1MTJ) p-bit design, evaluating how the characteristics of each component influence the overall p-bit output. This understanding of the interplay between the characteristics of the transistors and the MTJ is vital for the construction of a fully functioning p-bit, making the design rules presented in this article key for future experimental implementations of scaled on-chip p-bit networks.
Submitted 16 October, 2023; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Blind identification of Ambisonic reduced room impulse response
Authors:
Srđan Kitić,
Jérôme Daniel
Abstract:
The recently proposed Generalized Time-domain Velocity Vector (GTVV) is a generalization of the relative room impulse response in the spherical harmonic (aka Ambisonic) domain that allows for blind estimation of early-echo parameters: the directions and relative delays of individual reflections. However, the derived closed-form expression of the GTVV requires a few assumptions to hold, the most important being that the impulse response of the reference signal is a minimum-phase filter. In practice, the reference is obtained by spatial filtering towards the Direction-of-Arrival of the source, and the aforementioned condition is bounded by the performance of the applied beamformer (and thus, by the Ambisonic array order). In the present work, we propose to circumvent this problem by directly modeling the impulse responses constituting the GTVV time series, which not only relaxes the initial assumptions but also extracts the information therein in a more consistent and efficient manner, entering the realm of blind system identification. Experiments using measured room impulse responses confirm the effectiveness of the proposed approach.
Submitted 6 November, 2023; v1 submitted 5 May, 2023;
originally announced May 2023.
-
"Understanding Robustness Lottery": A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches
Authors:
Zhimin Li,
Shusen Liu,
Xin Yu,
Kailkhura Bhavya,
Jie Cao,
Diffenderfer James Daniel,
Peer-Timo Bremer,
Valerio Pascucci
Abstract:
Deep learning approaches have provided state-of-the-art performance in many applications by relying on large and overparameterized neural networks. However, such networks have been shown to be very brittle and are difficult to deploy on resource-limited platforms. Model pruning, i.e., reducing the size of the network, is a widely adopted strategy that can lead to a more robust and compact model. Many heuristics exist for model pruning, but empirical studies show that some heuristics improve performance whereas others can make models more brittle or have other side effects. This work aims to shed light on how different pruning methods alter the network's internal feature representation and the corresponding impact on model performance. To facilitate a comprehensive comparison and characterization of the high-dimensional model feature space, we introduce a visual geometric analysis of feature representations. We decomposed and evaluated a set of critical geometric concepts from the commonly adopted classification loss, and used them to design a visualization system to compare and highlight the impact of pruning on model performance and feature representation. The proposed tool provides an environment for in-depth comparison of pruning methods and a comprehensive understanding of how models respond to common data corruption. By leveraging the proposed visualization, machine learning researchers can reveal the similarities between pruning methods and redundancy in robustness evaluation benchmarks, obtain geometric insights about the differences between pruned models that achieve superior robustness performance, and identify samples that are robust or fragile to model pruning and common data corruption.
Submitted 24 October, 2023; v1 submitted 16 June, 2022;
originally announced June 2022.
-
Echo-enabled Direction-of-Arrival and range estimation of a mobile source in Ambisonic domain
Authors:
Jérôme Daniel,
Srđan Kitić
Abstract:
Range estimation of a far-field sound source in a reverberant environment is known to be a notoriously difficult problem, hence most localization methods are only capable of estimating the source's Direction-of-Arrival (DoA). In an earlier work, we have demonstrated that, under certain restrictive acoustic conditions and given the orientation of a reflecting surface, one can exploit the dominant acoustic reflection to evaluate the DoA and the distance to a static sound source in the Ambisonic domain. In this article, we leverage the recently presented Generalized Time-domain Velocity Vector (GTVV) representation to estimate these quantities for a moving sound source without a priori knowledge of the reflectors' orientations. We show that the trajectories of a moving source and the corresponding reflections are spatially and temporally related, which can be used to infer the absolute delay of the propagating source signal and, therefore, approximate the microphone-to-source distance. Experiments on real sound data confirm the validity of the proposed approach.
Submitted 10 March, 2022;
originally announced March 2022.
-
Post Quantum Cryptography: Techniques, Challenges, Standardization, and Directions for Future Research
Authors:
Ritik Bavdekar,
Eashan Jayant Chopde,
Ashutosh Bhatia,
Kamlesh Tiwari,
Sandeep Joshua Daniel,
Atul
Abstract:
The development of large quantum computers will have dire consequences for cryptography. Most symmetric and asymmetric cryptographic algorithms are vulnerable to quantum algorithms. Grover's search algorithm gives a quadratic speedup for key search in symmetric schemes like AES and 3DES. The security of asymmetric algorithms like RSA, Diffie-Hellman, and ECC is based on the mathematical hardness of prime factorization and the discrete logarithm. The best known classical algorithms for these problems take super-polynomial time. Shor's algorithm can solve both problems in polynomial time. Major breakthroughs in quantum computing will render all the present-day widely used asymmetric cryptosystems insecure. This paper analyzes the vulnerability of the classical cryptosystems in the context of quantum computers, discusses various post-quantum cryptosystem families, discusses the status of the NIST post-quantum cryptography standardization process, and finally provides a couple of future research directions in this field.
Submitted 6 February, 2022;
originally announced February 2022.
-
Generalized Time Domain Velocity Vector
Authors:
Srđan Kitić,
Jérôme Daniel
Abstract:
We introduce and analyze the Generalized Time Domain Velocity Vector (GTVV), an extension of the previously presented acoustic multipath footprint extracted from Ambisonic recordings. GTVV is better adapted to adverse acoustic conditions, and enables efficient parameter estimation of multiple plane wave components in the recorded multichannel mixture. Experiments on simulated data confirm the predicted theoretical advantages of these new spatio-temporal features.
Submitted 19 May, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
-
Unsupervised Learning of Depth Estimation and Visual Odometry for Sparse Light Field Cameras
Authors:
S. Tejaswi Digumarti,
Joseph Daniel,
Ahalya Ravendran,
Donald G. Dansereau
Abstract:
While an exciting diversity of new imaging devices is emerging that could dramatically improve robotic perception, the challenges of calibrating and interpreting these cameras have limited their uptake in the robotics community. In this work we generalise techniques from unsupervised learning to allow a robot to autonomously interpret new kinds of cameras. We consider emerging sparse light field (LF) cameras, which capture a subset of the 4D LF function describing the set of light rays passing through a plane. We introduce a generalised encoding of sparse LFs that allows unsupervised learning of odometry and depth. We demonstrate the proposed approach outperforming monocular and conventional techniques for dealing with 4D imagery, yielding more accurate odometry and depth maps and delivering these with metric scale. We anticipate our technique to generalise to a broad class of LF and sparse LF cameras, and to enable unsupervised recalibration for coping with shifts in camera behaviour over the lifetime of a robot. This work represents a first step toward streamlining the integration of new kinds of imaging devices in robotics applications.
Submitted 21 March, 2021;
originally announced March 2021.
-
Time Domain Velocity Vector for Retracing the Multipath Propagation
Authors:
Jérôme Daniel,
Srđan Kitić
Abstract:
We propose a conceptually and computationally simple form of sound velocity that offers a readable view of the interference between direct and indirect sound waves. Unlike most approaches in the literature, it jointly exploits both active and reactive sound intensity measurements, as typically derived from a first-order Ambisonics recording. This representation has potential both as a valuable tool for directly analyzing sound multipath propagation and as a new spatial feature format for machine learning algorithms in audio and acoustics. As a showcase, we demonstrate that the Direction-of-Arrival and the range of a sound source can be estimated as a development of this approach. To the best of the authors' knowledge, this is the first time that range is estimated from an Ambisonics recording.
Submitted 3 June, 2020;
originally announced June 2020.