Showing 1–11 of 11 results for author: Moulin, A

Searching in archive cs.
  1. arXiv:2506.10899  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Demystifying Spectral Feature Learning for Instrumental Variable Regression

    Authors: Dimitri Meunier, Antoine Moulin, Jakub Wornbard, Vladimir R. Kostic, Arthur Gretton

    Abstract: We address the problem of causal effect estimation in the presence of hidden confounders, using nonparametric instrumental variable (IV) regression. A leading strategy employs spectral features - that is, learned features spanning the top eigensubspaces of the operator linking treatments to instruments. We derive a generalization error bound for a two-stage least squares estimator based on spectra…

    Submitted 12 June, 2025; originally announced June 2025.

  2. arXiv:2506.01722  [pdf, ps, other]

    cs.LG stat.ML

    When Lower-Order Terms Dominate: Adaptive Expert Algorithms for Heavy-Tailed Losses

    Authors: Antoine Moulin, Emmanuel Esposito, Dirk van der Hoeven

    Abstract: We consider the problem setting of prediction with expert advice with possibly heavy-tailed losses, i.e., the only assumption on the losses is an upper bound on their second moments, denoted by $θ$. We develop adaptive algorithms that do not require any prior knowledge about the range or the second moment of the losses. Existing adaptive algorithms have what is typically considered a lower-order t…

    Submitted 2 June, 2025; originally announced June 2025.

  3. arXiv:2505.19946  [pdf, ps, other]

    cs.LG

    Inverse Q-Learning Done Right: Offline Imitation Learning in $Q^π$-Realizable MDPs

    Authors: Antoine Moulin, Gergely Neu, Luca Viano

    Abstract: We study the problem of offline imitation learning in Markov decision processes (MDPs), where the goal is to learn a well-performing policy given a dataset of state-action pairs generated by an expert policy. Complementing a recent line of work on this topic that assumes the expert belongs to a tractable class of known policies, we approach this problem from a new angle and leverage a different ty…

    Submitted 2 June, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

  4. arXiv:2502.13900  [pdf, other]

    cs.LG

    Optimistically Optimistic Exploration for Provably Efficient Infinite-Horizon Reinforcement and Imitation Learning

    Authors: Antoine Moulin, Gergely Neu, Luca Viano

    Abstract: We study the problem of reinforcement learning in infinite-horizon discounted linear Markov decision processes (MDPs), and propose the first computationally efficient algorithm achieving near-optimal regret guarantees in this setting. Our main idea is to combine two classic techniques for optimistic exploration: additive exploration bonuses applied to the reward function, and artificial transition…

    Submitted 19 February, 2025; originally announced February 2025.

  5. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  6. arXiv:2407.10448  [pdf, other]

    cs.LG stat.ML

    Spectral Representation for Causal Estimation with Hidden Confounders

    Authors: Haotian Sun, Antoine Moulin, Tongzheng Ren, Arthur Gretton, Bo Dai

    Abstract: We address the problem of causal effect estimation where hidden confounders are present, with a focus on two settings: instrumental variable regression with additional observed confounders, and proxy causal learning. Our approach uses a singular value decomposition of a conditional expectation operator, followed by a saddle-point optimization problem, which, in the context of IV regression, can be…

    Submitted 10 March, 2025; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: Haotian Sun, Antoine Moulin, and Tongzheng Ren contributed equally

  7. arXiv:2302.14004  [pdf, other]

    cs.LG stat.ML

    Optimistic Planning by Regularized Dynamic Programming

    Authors: Antoine Moulin, Gergely Neu

    Abstract: We propose a new method for optimistic planning in infinite-horizon discounted Markov decision processes based on the idea of adding regularization to the updates of an otherwise standard approximate value iteration procedure. This technique allows us to avoid contraction and monotonicity arguments typically required by existing analyses of approximate dynamic programming methods, and in particula…

    Submitted 14 June, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  8. arXiv:2210.09897  [pdf, other]

    q-fin.TR cs.GT cs.LG q-fin.CP

    Learning to simulate realistic limit order book markets from data as a World Agent

    Authors: Andrea Coletta, Aymeric Moulin, Svitlana Vyetrenko, Tucker Balch

    Abstract: Multi-agent market simulators usually require careful calibration to emulate real markets, which includes the number and the type of agents. Poorly calibrated simulators can lead to misleading conclusions, potentially causing severe loss when employed by investment banks, hedge funds, and traders to study and evaluate trading strategies. In this paper, we propose a world model simulator that accur…

    Submitted 26 September, 2022; originally announced October 2022.

  9. arXiv:2202.00941  [pdf, other]

    cs.MA cs.AI econ.GN q-fin.MF

    CTMSTOU driven markets: simulated environment for regime-awareness in trading policies

    Authors: Selim Amrouni, Aymeric Moulin, Tucker Balch

    Abstract: Market regimes is a popular topic in quantitative finance even though there is little consensus on the details of how they should be defined. They arise as a feature both in financial market prediction problems and financial market task performing problems. In this work we use discrete event time multi-agent market simulation to freely experiment in a reproducible and understandable environment…

    Submitted 3 February, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: fix typo in title

  10. arXiv:2110.14771  [pdf, other]

    cs.MA cs.AI q-fin.TR

    ABIDES-Gym: Gym Environments for Multi-Agent Discrete Event Simulation and Application to Financial Markets

    Authors: Selim Amrouni, Aymeric Moulin, Jared Vann, Svitlana Vyetrenko, Tucker Balch, Manuela Veloso

    Abstract: Model-free Reinforcement Learning (RL) requires the ability to sample trajectories by taking actions in the original problem environment or a simulated version of it. Breakthroughs in the field of RL have been largely facilitated by the development of dedicated open source simulators with easy to use frameworks such as OpenAI Gym and its Atari environments. In this paper we propose to use the Open…

    Submitted 27 October, 2021; originally announced October 2021.

  11. arXiv:2110.13287  [pdf, other]

    cs.AI cs.CE cs.LG cs.MA q-fin.TR

    Towards Realistic Market Simulations: a Generative Adversarial Networks Approach

    Authors: Andrea Coletta, Matteo Prata, Michele Conti, Emanuele Mercanti, Novella Bartolini, Aymeric Moulin, Svitlana Vyetrenko, Tucker Balch

    Abstract: Simulated environments are increasingly used by trading firms and investment banks to evaluate trading strategies before approaching real markets. Backtesting, a widely used approach, consists of simulating experimental strategies while replaying historical market scenarios. Unfortunately, this approach does not capture the market response to the experimental agents' actions. In contrast, multi-ag…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: 8 pages, 9 figures, ICAIF'21 - 2nd ACM International Conference on AI in Finance

    MSC Class: I.2; I.6

    ACM Class: I.2; I.6