Showing 1–43 of 43 results for author: Zimmer, M

  1. arXiv:2507.02726  [pdf, ps, other]

    cs.AI cs.LG

    Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving

    Authors: Matthieu Zimmer, Xiaotong Ji, Rasul Tutunov, Anthony Bordg, Jun Wang, Haitham Bou Ammar

    Abstract: Reasoning remains a challenging task for large language models (LLMs), especially within the logically constrained environment of automated theorem proving (ATP), due to sparse rewards and the vast scale of proofs. These challenges are amplified in benchmarks like PutnamBench, which contains university-level problems requiring complex, multi-step reasoning. To address this, we introduce self-gener…

    Submitted 3 July, 2025; originally announced July 2025.

  2. arXiv:2506.13905  [pdf, ps, other]

    cs.AR

    Spec2RTL-Agent: Automated Hardware Code Generation from Complex Specifications Using LLM Agent Systems

    Authors: Zhongzhi Yu, Mingjie Liu, Michael Zimmer, Yingyan Celine Lin, Yong Liu, Haoxing Ren

    Abstract: Despite recent progress in generating hardware RTL code with LLMs, existing solutions still suffer from a substantial gap between practical application scenarios and the requirements of real-world RTL code development. Prior approaches either focus on overly simplified hardware descriptions or depend on extensive human guidance to process complex specifications, limiting their scalability and auto…

    Submitted 16 June, 2025; originally announced June 2025.

  3. arXiv:2505.23696  [pdf, ps, other]

    cs.LG cs.SC

    Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms

    Authors: Hiroshi Kera, Nico Pelleriti, Yuki Ishihara, Max Zimmer, Sebastian Pokutta

    Abstract: Solving systems of polynomial equations, particularly those with finitely many solutions, is a crucial challenge across many scientific fields. Traditional methods like Gröbner and Border bases are fundamental but suffer from high computational costs, which have motivated recent Deep Learning approaches to improve efficiency, albeit at the expense of output correctness. In this work, we introduce…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 13+19 pages (3+9 figures, 2+7 tables)

  4. arXiv:2505.13289  [pdf, ps, other]

    cs.LG cs.CV

    RECON: Robust symmetry discovery via Explicit Canonical Orientation Normalization

    Authors: Alonso Urbano, David W. Romero, Max Zimmer, Sebastian Pokutta

    Abstract: Real-world data often exhibits unknown or approximate symmetries, yet existing equivariant networks must commit to a fixed transformation group prior to training, e.g., continuous $SO(2)$ rotations. This mismatch degrades performance when the actual data symmetries differ from those in the transformation group. We introduce RECON, a framework to discover each input's intrinsic symmetry distributio…

    Submitted 19 May, 2025; originally announced May 2025.

  5. arXiv:2502.17066  [pdf, other]

    cs.CV cs.LG

    DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications

    Authors: Ibrahim Fayad, Max Zimmer, Martin Schwartz, Philippe Ciais, Fabian Gieseke, Gabriel Belouze, Sarah Brood, Aurelien De Truchis, Alexandre d'Aspremont

    Abstract: Significant efforts have been directed towards adapting self-supervised multimodal learning for Earth observation applications. However, existing methods produce coarse patch-sized embeddings, limiting their effectiveness and integration with other modalities like LiDAR. To close this gap, we present DUNIA, an approach to learn pixel-sized embeddings through cross-modal alignment between images an…

    Submitted 24 February, 2025; originally announced February 2025.

    Comments: 26 pages, 8 figures

  6. arXiv:2502.15051  [pdf, ps, other]

    cs.LG

    Approximating Latent Manifolds in Neural Networks via Vanishing Ideals

    Authors: Nico Pelleriti, Max Zimmer, Elias Wirth, Sebastian Pokutta

    Abstract: Deep neural networks have reshaped modern machine learning by learning powerful latent representations that often align with the manifold hypothesis: high-dimensional data lie on lower-dimensional manifolds. In this paper, we establish a connection between manifold learning and computational algebra by demonstrating how vanishing ideals can characterize the latent manifolds of deep networks. To th…

    Submitted 6 June, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

    Comments: ICML25 camera-ready, 28 pages (9 main body, rest appendix and references), 12 figures, 3 tables, 3 algorithms

  7. arXiv:2502.01208  [pdf, ps, other]

    cs.LG cs.CL

    On Almost Surely Safe Alignment of Large Language Models at Inference-Time

    Authors: Xiaotong Ji, Shyam Sundhar Ramesh, Matthieu Zimmer, Ilija Bogunovic, Jun Wang, Haitham Bou Ammar

    Abstract: We introduce a novel inference-time alignment approach for LLMs that aims to generate safe responses almost surely, i.e., with probability approaching one. Our approach models the generation of safe responses as a constrained Markov Decision Process (MDP) within the LLM's latent space. We augment a safety state that tracks the evolution of safety constraints and dynamically penalize unsafe generat…

    Submitted 20 June, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  8. arXiv:2501.19328  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    Capturing Temporal Dynamics in Large-Scale Canopy Tree Height Estimation

    Authors: Jan Pauls, Max Zimmer, Berkant Turan, Sassan Saatchi, Philippe Ciais, Sebastian Pokutta, Fabian Gieseke

    Abstract: With the rise in global greenhouse gas emissions, accurate large-scale tree canopy height maps are essential for understanding forest structure, estimating above-ground biomass, and monitoring ecological disruptions. To this end, we present a novel approach to generate large-scale, high-resolution canopy height maps over time. Our model accurately predicts canopy height over multiple years given S…

    Submitted 12 June, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

    Comments: ICML Camera-Ready, 9 pages main paper, 8 pages references and appendix, 9 figures, 8 tables

  9. arXiv:2501.18527  [pdf, ps, other]

    cs.LG math.CO

    Neural Discovery in Mathematics: Do Machines Dream of Colored Planes?

    Authors: Konrad Mundinger, Max Zimmer, Aldo Kiem, Christoph Spiegel, Sebastian Pokutta

    Abstract: We demonstrate how neural networks can drive mathematical discovery through a case study of the Hadwiger-Nelson problem, a long-standing open problem at the intersection of discrete geometry and extremal combinatorics that is concerned with coloring the plane while avoiding monochromatic unit-distance pairs. Using neural networks as approximators, we reformulate this mixed discrete-continuous geom…

    Submitted 5 June, 2025; v1 submitted 30 January, 2025; originally announced January 2025.

    Comments: 9 pages main paper, 11 pages references and appendix, 17 figures, 1 table

    Journal ref: Proc. 42nd ICML, PMLR 267, 2025

  10. arXiv:2410.23432  [pdf, ps, other]

    cs.CY cs.SI

    Web Scraping for Research: Legal, Ethical, Institutional, and Scientific Considerations

    Authors: Megan A. Brown, Andrew Gruen, Gabe Maldoff, Solomon Messing, Zeve Sanderson, Michael Zimmer

    Abstract: Scientists across disciplines often use data from the internet to conduct research, generating valuable insights about human behavior. However, as generative AI relying on massive text corpora becomes increasingly valuable, platforms have greatly restricted access to data through official channels. As a result, researchers will likely engage in more web scraping to collect data, introducing new ch…

    Submitted 19 December, 2024; v1 submitted 30 October, 2024; originally announced October 2024.

  11. arXiv:2410.03804  [pdf, other]

    cs.CL cs.AI cs.LG

    Mixture of Attentions For Speculative Decoding

    Authors: Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras, Haitham Bou Ammar, Jun Wang

    Abstract: The growth in the number of parameters of Large Language Models (LLMs) has led to a significant surge in computational requirements, making them challenging and costly to deploy. Speculative decoding (SD) leverages smaller models to efficiently propose future tokens, which are then verified by the LLM in parallel. Small models that utilise activations from the LLM currently achieve the fastest dec…

    Submitted 3 April, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

    Comments: Accepted at International Conference on Learning Representations (ICLR 2025)

  12. arXiv:2406.19741  [pdf, other]

    cs.RO cs.AI

    ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

    Authors: Christopher E. Mower, Yuhui Wan, Hongzhan Yu, Antoine Grosnit, Jonas Gonzalez-Billandon, Matthieu Zimmer, Jinlong Wang, Xinyu Zhang, Yao Zhao, Anbang Zhai, Puze Liu, Daniel Palenicek, Davide Tateo, Cesar Cadena, Marco Hutter, Jan Peters, Guangjian Tian, Yuzheng Zhuang, Kun Shao, Xingyue Quan, Jianye Hao, Jun Wang, Haitham Bou-Ammar

    Abstract: We present a framework for intuitive robot programming by non-experts, leveraging natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface. Key features of the framework include: integration of ROS with an AI agent connect…

    Submitted 12 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: This document contains 26 pages and 13 figures

  13. arXiv:2406.01076  [pdf, other]

    cs.CV cs.AI cs.LG

    Estimating Canopy Height at Scale

    Authors: Jan Pauls, Max Zimmer, Una M. Kelly, Martin Schwartz, Sassan Saatchi, Philippe Ciais, Sebastian Pokutta, Martin Brandt, Fabian Gieseke

    Abstract: We propose a framework for global-scale canopy height estimation based on satellite data. Our model leverages advanced data preprocessing techniques, resorts to a novel loss function designed to counter geolocation inaccuracies inherent in the ground-truth height measurements, and employs data from the Shuttle Radar Topography Mission to effectively filter out erroneous labels in mountainous regio…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: ICML Camera-Ready, 17 pages, 14 figures, 7 tables

  14. arXiv:2404.02896  [pdf, other]

    cs.LG

    Comment on "Machine learning conservation laws from differential equations"

    Authors: Michael F. Zimmer

    Abstract: The paper [1] by Liu, Madhavan, and Tegmark sought to use machine learning methods to elicit known conservation laws for several systems. However, in their example of a damped 1D harmonic oscillator they made seven serious errors, causing both their method and result to be incorrect. In this Comment, those errors are reviewed.

    Submitted 31 January, 2025; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 3 pages, 1 figure; comment on https://doi.org/10.1103/PhysRevE.106.045307. This update now includes: Error#7, Fig1, Eqn5, a new paragraph in Afterword

  15. arXiv:2403.19418  [pdf, other]

    cs.LG nlin.CD

    Constants of Motion for Conserved and Non-conserved Dynamics

    Authors: Michael F. Zimmer

    Abstract: This paper begins with a dynamical model that was obtained by applying a machine learning technique (FJet) to time-series data; this dynamical model is then analyzed with Lie symmetry techniques to obtain constants of motion. This analysis is performed on both the conserved and non-conserved cases of the 1D and 2D harmonic oscillators. For the 1D oscillator, constants are found in the cases where…

    Submitted 31 January, 2025; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: 13 pages, 4 figures. (corrected typos, removed Fig4, renamed phi-bar to phi)

  16. arXiv:2403.12764  [pdf, other]

    cs.LG math.NA

    Neural Parameter Regression for Explicit Representations of PDE Solution Operators

    Authors: Konrad Mundinger, Max Zimmer, Sebastian Pokutta

    Abstract: We introduce Neural Parameter Regression (NPR), a novel framework specifically developed for learning solution operators in Partial Differential Equations (PDEs). Tailored for operator learning, this approach surpasses traditional DeepONets (Lu et al., 2021) by employing Physics-Informed Neural Network (PINN, Raissi et al., 2019) techniques to regress Neural Network (NN) parameters. By parametrizi…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: ICLR24 Workshop AI4Differential Equations In Science, 15 pages, 4 figures, 2 tables, 1 algorithm

  17. arXiv:2402.12265  [pdf, other]

    cs.LG cs.AI cs.DC

    On the Byzantine-Resilience of Distillation-Based Federated Learning

    Authors: Christophe Roux, Max Zimmer, Sebastian Pokutta

    Abstract: Federated Learning (FL) algorithms using Knowledge Distillation (KD) have received increasing attention due to their favorable properties with respect to privacy, non-i.i.d. data and communication cost. These methods depart from transmitting model parameters and instead communicate information about a learning task by sharing predictions on a public dataset. In this work, we study the performance…

    Submitted 17 March, 2025; v1 submitted 19 February, 2024; originally announced February 2024.

  18. arXiv:2402.06570  [pdf, other]

    cs.LG cs.RO

    Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control

    Authors: Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson

    Abstract: Learning a universal policy across different robot morphologies can significantly improve learning efficiency and enable zero-shot generalization to unseen morphologies. However, learning a highly performant universal policy requires sophisticated architectures like transformers (TF) that have larger memory and computational cost than simpler multi-layer perceptrons (MLP). To achieve both good per…

    Submitted 3 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  19. arXiv:2312.15230  [pdf, other]

    cs.LG cs.AI

    PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs

    Authors: Max Zimmer, Megi Andoni, Christoph Spiegel, Sebastian Pokutta

    Abstract: Neural Networks can be effectively compressed through pruning, significantly reducing storage and compute demands while maintaining predictive performance. Simple yet effective methods like magnitude pruning remove less important parameters and typically require a costly retraining procedure to restore performance. However, with the rise of LLMs, full retraining has become infeasible due to memory…

    Submitted 5 February, 2025; v1 submitted 23 December, 2023; originally announced December 2023.

    Comments: 32 pages, 7 figures, 24 tables

  20. arXiv:2312.14878  [pdf, other]

    cs.AI cs.LG

    Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

    Authors: Filippos Christianos, Georgios Papoudakis, Matthieu Zimmer, Thomas Coste, Zhihao Wu, Jingxuan Chen, Khyati Khandelwal, James Doran, Xidong Feng, Jiacheng Liu, Zheng Xiong, Yicheng Luo, Jianye Hao, Kun Shao, Haitham Bou-Ammar, Jun Wang

    Abstract: A key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL). However, constructing a standalone RL policy that maps perception to action directly encounters severe problems, chief among them being its lack of generality across multiple tasks and the need for a large amount of training data. The leading cause is that it cannot effectively integrate prior information…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: paper and appendix, 27 pages

  21. arXiv:2310.13669  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL

    Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis

    Authors: Philip John Gorinski, Matthieu Zimmer, Gerasimos Lampouras, Derrick Goh Xin Deik, Ignacio Iacobacci

    Abstract: The advent of large pre-trained language models in the domain of Code Synthesis has shown remarkable performance on various benchmarks, treating the problem of Code Generation in a fashion similar to Natural Language Generation, trained with a Language Modelling (LM) objective. In addition, the property of programming language code being precisely evaluable with respect to its semantics -- through…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 9 pages + 4 pages appendix; 4 Figures, 4 Tables, 1 Algorithm; Accepted to Findings of EMNLP 2023

  22. arXiv:2306.16788  [pdf, other]

    cs.LG cs.AI

    Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging

    Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta

    Abstract: Neural networks can be significantly compressed by pruning, yielding sparse models with reduced storage and computational demands while preserving predictive performance. Model soups (Wortsman et al., 2022) enhance generalization and out-of-distribution (OOD) performance by averaging the parameters of multiple models into a single one, without increasing inference time. However, achieving both spa…

    Submitted 23 March, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: ICLR24 Camera Ready, 9 pages, 5 pages references, 16 pages appendix

  23. arXiv:2305.15930  [pdf, other]

    cs.LG

    End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes

    Authors: Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Haitham Bou Ammar

    Abstract: Meta-Bayesian optimisation (meta-BO) aims to improve the sample efficiency of Bayesian optimisation by leveraging data from related tasks. While previous methods successfully meta-learn either a surrogate model or an acquisition function independently, joint training of both components remains an open challenge. This paper proposes the first end-to-end differentiable meta-BO framework that general…

    Submitted 22 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

  24. arXiv:2206.00759  [pdf, other]

    cs.LG cs.AI

    Interpretability Guarantees with Merlin-Arthur Classifiers

    Authors: Stephan Wäldchen, Kartikey Sharma, Berkant Turan, Max Zimmer, Sebastian Pokutta

    Abstract: We propose an interactive multi-agent classifier that provides provable interpretability guarantees even for complex agents such as neural networks. These guarantees consist of lower bounds on the mutual information between selected features and the classification decision. Our results are inspired by the Merlin-Arthur protocol from Interactive Proof Systems and express these bounds in terms of me…

    Submitted 22 March, 2024; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: AISTATS24 Camera-Ready Version, 34 pages total (9 pages main part, 3 pages references, 22 pages appendix), 17 figures, 3 tables

    MSC Class: 68T01; 91A06 ACM Class: I.2.0

  25. arXiv:2205.13902  [pdf, other]

    cs.LG

    Sample-Efficient Optimisation with Probabilistic Transformer Surrogates

    Authors: Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Rasul Tutunov, Jun Wang, Haitham Bou Ammar

    Abstract: Faced with problems of increasing complexity, recent research in Bayesian Optimisation (BO) has focused on adapting deep probabilistic models as flexible alternatives to Gaussian Processes (GPs). In a similar vein, this paper investigates the feasibility of employing state-of-the-art probabilistic transformers in BO. Upon further investigation, we observe two drawbacks stemming from their training…

    Submitted 30 May, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

  26. arXiv:2205.11921  [pdf, other]

    cs.LG math.OC

    Compression-aware Training of Neural Networks using Frank-Wolfe

    Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta

    Abstract: Many existing Neural Network pruning approaches rely on either retraining or inducing a strong bias in order to converge to a sparse solution throughout training. A third paradigm, 'compression-aware' training, aims to obtain state-of-the-art dense models that are robust to a wide range of compression ratios using a single dense training run while also avoiding retraining. We propose a framework c…

    Submitted 14 February, 2024; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: 8 pages, 5 pages references, 14 pages appendix, 8 figures, and 11 tables

  27. arXiv:2112.13418  [pdf, other]

    cs.LG cs.AI

    Neuro-Symbolic Hierarchical Rule Induction

    Authors: Claire Glanois, Xuening Feng, Zhaohui Jiang, Paul Weng, Matthieu Zimmer, Dong Li, Wulong Liu

    Abstract: We propose an efficient interpretable neuro-symbolic model to solve Inductive Logic Programming (ILP) problems. In this model, which is built from a set of meta-rules organised in a hierarchical structure, first-order rules are invented by learning embeddings to match facts and body predicates of a meta-rule. To instantiate it, we specifically design an expressive set of generic meta-rules, and de…

    Submitted 26 December, 2021; originally announced December 2021.

    Comments: 10 pages, figures and references

    ACM Class: I.2.6; I.2.3

  28. arXiv:2112.13112  [pdf, other]

    cs.LG cs.AI

    A Survey on Interpretable Reinforcement Learning

    Authors: Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu

    Abstract: Although deep reinforcement learning has become a promising machine learning approach for sequential decision-making problems, it is still not mature enough for high-stake domains such as autonomous driving or medical applications. In such contexts, a learned policy needs for instance to be interpretable, so that it can be inspected before any deployment (e.g., for safety and verifiability reasons…

    Submitted 24 February, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

    ACM Class: I.2.6

  29. arXiv:2111.13499  [pdf, other]

    cs.DB cs.SI

    Bitemporal Property Graphs to Organize Evolving Systems

    Authors: Christopher Rost, Philip Fritzsche, Lucas Schons, Maximilian Zimmer, Dieter Gawlick, Erhard Rahm

    Abstract: This work is a summarized view on the results of a one-year cooperation between Oracle Corp. and the University of Leipzig. The goal was to research the organization of relationships within multi-dimensional time-series data, such as sensor data from the IoT area. We showed in this project that temporal property graphs with some extensions are a prime candidate for this organizational task that co…

    Submitted 26 November, 2021; originally announced November 2021.

    Comments: 21 pages

  30. arXiv:2111.00843  [pdf, other]

    cs.LG

    How I Learned to Stop Worrying and Love Retraining

    Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta

    Abstract: Many Neural Network Pruning approaches consist of several iterative training and pruning steps, seemingly losing a significant amount of their performance after pruning and then recovering it in the subsequent retraining phase. Recent works of Renda et al. (2020) and Le & Hua (2021) demonstrate the significance of the learning rate schedule during the retraining phase and propose specific heuristi…

    Submitted 12 March, 2023; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: ICLR2023 camera-ready version, 9 pages main text, 34 pages appendix, 2 tables, 3 figures in main text

  31. arXiv:2110.06917  [pdf, other]

    cs.LG

    Extracting Dynamical Models from Data

    Authors: Michael F. Zimmer

    Abstract: The problem of determining the underlying dynamics of a system when only given data of its state over time has challenged scientists for decades. In this paper, the approach of using machine learning to model the updates of the phase space variables is introduced; this is done as a function of the phase space variables. (More generally, the modeling is done over functions of the jet space.) This a…

    Submitted 31 January, 2024; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: 19 pages, 18 figures

  32. arXiv:2105.11439  [pdf, other]

    cs.LG cs.RO math.OC

    2nd-order Updates with 1st-order Complexity

    Authors: Michael F. Zimmer

    Abstract: It has long been a goal to efficiently compute and use second order information on a function ($f$) to assist in numerical approximations. Here it is shown how, using only basic physics and a numerical approximation, such information can be accurately obtained at a cost of ${\cal O}(N)$ complexity, where $N$ is the dimensionality of the parameter space of $f$. In this paper, an algorithm ({\em VA-…

    Submitted 27 May, 2021; v1 submitted 24 May, 2021; originally announced May 2021.

    Comments: 12 pages, 3 figures, conference preprint

  33. arXiv:2102.11529  [pdf, other]

    cs.AI

    Differentiable Logic Machines

    Authors: Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu

    Abstract: The integration of reasoning, learning, and decision-making is key to build more general artificial intelligence systems. As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a first-order logic pro…

    Submitted 5 July, 2023; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: Transactions on Machine Learning Research (TMLR)

  34. arXiv:2012.09421  [pdf, other]

    cs.LG cs.AI cs.MA

    Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning

    Authors: Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng

    Abstract: We consider the problem of learning fair policies in (deep) cooperative multi-agent reinforcement learning (MARL). We formalize it in a principled way as the problem of optimizing a welfare function that explicitly encodes two important aspects of fairness: efficiency and equity. As a solution method, we propose a novel neural network architecture, which is composed of two sub-networks specificall…

    Submitted 22 June, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: International Conference on Machine Learning

  35. Hyperparameter Auto-tuning in Self-Supervised Robotic Learning

    Authors: Jiancong Huang, Juan Rojas, Matthieu Zimmer, Hongmin Wu, Yisheng Guan, Paul Weng

    Abstract: Policy optimization in reinforcement learning requires the selection of numerous hyperparameters across different environments. Fixing them incorrectly may negatively impact optimization performance leading notably to insufficient or redundant learning. Insufficient learning (due to convergence to local optima) results in under-performing policies whilst redundant learning wastes time and resource…

    Submitted 24 March, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: 8 pages, 6 figures, Published in IEEE Robotics and Automation Letters; Presented at The 2021 International Conference on Robotics and Automation (ICRA 2021); Presented at Deep RL Workshop, NeurIPS 2020

    Journal ref: IEEE Robotics and Automation Letters, Volume:6, Issue:2, P. 3537-3544, April 2021

  36. arXiv:2010.07873  [pdf, other]

    cs.LG

    Neograd: Near-Ideal Gradient Descent

    Authors: Michael F. Zimmer

    Abstract: The purpose of this paper is to improve upon existing variants of gradient descent by solving two problems: (1) removing (or reducing) the plateau that occurs while minimizing the cost function, (2) continually adjusting the learning rate to an "ideal" value. The approach taken is to approximately solve for the learning rate as a function of a trust metric. When this technique is hybridized with m…

    Submitted 2 August, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

    Comments: 23 pages, 13 figures; preprint

  37. arXiv:2010.07243  [pdf, other]

    cs.LG math.OC

    Deep Neural Network Training with Frank-Wolfe

    Authors: Sebastian Pokutta, Christoph Spiegel, Max Zimmer

    Abstract: This paper studies the empirical efficacy and benefits of using projection-free first-order methods in the form of Conditional Gradients, a.k.a. Frank-Wolfe methods, for training Neural Networks with constrained parameters. We draw comparisons both to current state-of-the-art stochastic Gradient Descent methods as well as across different variants of stochastic Conditional Gradients. In particular…

    Submitted 21 October, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

    Comments: fixed coding error in figure 1 and extended abstract; 13 pages, Abstract 11 pages, 9 figures, 6 tables

  38. arXiv:2008.07773  [pdf, other]

    cs.AI cs.LG

    Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards

    Authors: Umer Siddique, Paul Weng, Matthieu Zimmer

    Abstract: As the operations of autonomous systems generally affect simultaneously several users, it is crucial that their designs account for fairness considerations. In contrast to standard (deep) reinforcement learning (RL), we investigate the problem of learning a policy that treats its users equitably. In this paper, we formulate this novel RL problem, in which an objective function, which encodes a not…

    Submitted 18 August, 2020; originally announced August 2020.

  39. arXiv:1910.09959  [pdf, other]

    cs.AI cs.RO

    Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation

    Authors: Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Juan Rojas, Paul Weng

    Abstract: Deep reinforcement learning (DRL) is a promising approach for adaptive robot control, but its current application to robotics is currently hindered by high sample requirements. We propose two novel data augmentation techniques for DRL in order to reuse more efficiently observed data. The first one called Kaleidoscope Experience Replay exploits reflectional symmetries, while the second called Goal-…

    Submitted 15 November, 2019; v1 submitted 18 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019 Workshop on Robot Learning: Control and Interaction in the Real World (accepted after double-blind peer review). arXiv admin note: substantial text overlap with arXiv:1909.10707

  40. arXiv:1909.10707  [pdf, other]

    cs.RO cs.AI cs.LG

    Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning

    Authors: Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Yisheng Guan, Juan Rojas, Paul Weng

    Abstract: Deep Reinforcement Learning (RL) is a promising approach for adaptive robot control, but its current application to robotics is currently hindered by high sample requirements. To alleviate this issue, we propose to exploit the symmetries present in robotic tasks. Intuitively, symmetries from observed trajectories define transformations that leave the space of feasible RL trajectories invariant and…

    Submitted 4 July, 2020; v1 submitted 24 September, 2019; originally announced September 2019.

    Comments: 8 pages, 11 figures, additional 3 pages for appendix. IEEE Robotics and Automation Letters (RAL), 2020. Also in: Intelligent Robots and Systems (IROS)

    Journal ref: IEEE Robotics and Automation Letters, Volume: 5, Issue: 4, p. 6615-6622, Oct. 2020

  41. Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains

    Authors: Matthieu Zimmer, Paul Weng

    Abstract: In the context of learning deterministic policies in continuous domains, we revisit an approach, which was first proposed in Continuous Actor Critic Learning Automaton (CACLA) and later extended in Neural Fitted Actor Critic (NFAC). This approach is based on a policy update different from that of deterministic policy gradient (DPG). Previous work has observed its excellent performance empirically,…

    Submitted 24 June, 2019; v1 submitted 9 June, 2019; originally announced June 2019.

    Comments: International Joint Conferences on Artificial Intelligence

  42. arXiv:1711.01436  [pdf, other]

    q-bio.QM cs.AI cs.NE

    Searching for Biophysically Realistic Parameters for Dynamic Neuron Models by Genetic Algorithms from Calcium Imaging Recording

    Authors: Magdalena Fuchs, Manuel Zimmer, Radu Grosu, Ramin M. Hasani

    Abstract: Individual Neurons in the nervous systems exploit various dynamics. To capture these dynamics for single neurons, we tune the parameters of an electrophysiological model of nerve cells, to fit experimental data obtained by calcium imaging. A search for the biophysical parameters of this model is performed by means of a genetic algorithm, where the model neuron is exposed to a predefined input curr…

    Submitted 4 November, 2017; originally announced November 2017.

  43. arXiv:1705.07250  [pdf, other]

    cs.LG

    Speedup from a different parametrization within the Neural Network algorithm

    Authors: Michael F. Zimmer

    Abstract: A different parametrization of the hyperplanes is used in the neural network algorithm. As demonstrated on several autoencoder examples it significantly outperforms the usual parametrization, reaching lower training error values with only a fraction of the number of epochs. It's argued that it makes it easier to understand and initialize the parameters.

    Submitted 2 June, 2017; v1 submitted 19 May, 2017; originally announced May 2017.

    Comments: 8 pages

    ACM Class: K.3.2