Showing 1–11 of 11 results for author: Poiani, R

  1. arXiv:2505.22475  [pdf, ps, other]

    cs.LG

    Non-Asymptotic Analysis of (Sticky) Track-and-Stop

    Authors: Riccardo Poiani, Martino Bernasconi, Andrea Celli

    Abstract: In pure exploration problems, a statistician sequentially collects information to answer a question about some stochastic and unknown environment. The probability of returning a wrong answer should not exceed a maximum risk parameter $\delta$ and good algorithms make as few queries to the environment as possible. The Track-and-Stop algorithm is a pioneering method to solve these problems. Specifically,…

    Submitted 28 May, 2025; originally announced May 2025.

  2. arXiv:2505.22473  [pdf, ps, other]

    cs.LG

    Pure Exploration with Infinite Answers

    Authors: Riccardo Poiani, Martino Bernasconi, Andrea Celli

    Abstract: We study pure exploration problems where the set of correct answers is possibly infinite, e.g., the regression of any continuous function of the means of the bandit. We derive an instance-dependent lower bound for these problems. By analyzing it, we discuss why existing methods (i.e., Sticky Track-and-Stop) for finite answer problems fail at being asymptotically optimal in this more general settin…

    Submitted 28 May, 2025; originally announced May 2025.

  3. arXiv:2411.01898  [pdf, other]

    cs.LG cs.AI

    Best-Arm Identification in Unimodal Bandits

    Authors: Riccardo Poiani, Marc Jourdan, Emilie Kaufmann, Rémy Degenne

    Abstract: We study the fixed-confidence best-arm identification problem in unimodal bandits, in which the means of the arms increase with the index of the arm up to their maximum, then decrease. We derive two lower bounds on the stopping time of any algorithm. The instance-dependent lower bound suggests that due to the unimodal structure, only three arms contribute to the leading confidence-dependent cost.…

    Submitted 26 May, 2025; v1 submitted 4 November, 2024; originally announced November 2024.

  4. arXiv:2410.13463  [pdf, other]

    cs.LG

    Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach

    Authors: Riccardo Poiani, Nicole Nobili, Alberto Maria Metelli, Marcello Restelli

    Abstract: Policy evaluation via Monte Carlo (MC) simulation is at the core of many MC Reinforcement Learning (RL) algorithms (e.g., policy gradient methods). In this context, the designer of the learning system specifies an interaction budget that the agent usually spends by collecting trajectories of fixed length within a simulator. However, is this data collection strategy the best option? To answer this…

    Submitted 17 October, 2024; originally announced October 2024.

  5. arXiv:2406.03033  [pdf, other]

    cs.LG stat.ML

    Optimal Multi-Fidelity Best-Arm Identification

    Authors: Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

    Abstract: In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost. Several methods have been proposed for tackling this problem, but their optimalit…

    Submitted 26 May, 2025; v1 submitted 5 June, 2024; originally announced June 2024.

  6. arXiv:2401.03857  [pdf, other]

    cs.LG cs.AI

    Inverse Reinforcement Learning with Sub-optimal Experts

    Authors: Riccardo Poiani, Gabriele Curti, Alberto Maria Metelli, Marcello Restelli

    Abstract: Inverse Reinforcement Learning (IRL) techniques deal with the problem of deducing a reward function that explains the behavior of an expert agent who is assumed to act optimally in an underlying unknown task. In several problems of interest, however, it is possible to observe the behavior of multiple experts with different degrees of optimality (e.g., racing drivers whose skills range from amateur…

    Submitted 8 January, 2024; originally announced January 2024.

  7. arXiv:2308.15552  [pdf, ps, other]

    cs.LG stat.ML

    Pure Exploration under Mediators' Feedback

    Authors: Riccardo Poiani, Alberto Maria Metelli, Marcello Restelli

    Abstract: Stochastic multi-armed bandits are a sequential-decision-making framework, where, at each interaction step, the learner selects an arm and observes a stochastic reward. Within the context of best-arm identification (BAI) problems, the goal of the agent lies in finding the optimal arm, i.e., the one with highest expected reward, as accurately and efficiently as possible. Nevertheless, the sequentia…

    Submitted 12 January, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

  8. arXiv:2305.04361  [pdf, ps, other]

    cs.LG cs.AI

    Truncating Trajectories in Monte Carlo Reinforcement Learning

    Authors: Riccardo Poiani, Alberto Maria Metelli, Marcello Restelli

    Abstract: In Reinforcement Learning (RL), an agent acts in an unknown environment to maximize the expected cumulative discounted sum of an external reward signal, i.e., the expected return. In practice, in many tasks of interest, such as policy optimization, the agent usually spends its interaction budget by collecting episodes of fixed length within a simulator (i.e., Monte Carlo simulation). However, give…

    Submitted 7 May, 2023; originally announced May 2023.

  9. arXiv:2207.12509  [pdf, ps, other]

    cs.LG cs.AI cs.MA

    Optimizing Empty Container Repositioning and Fleet Deployment via Configurable Semi-POMDPs

    Authors: Riccardo Poiani, Ciprian Stirbu, Alberto Maria Metelli, Marcello Restelli

    Abstract: With the continuous growth of the global economy and markets, resource imbalance has risen to be one of the central issues in real logistic scenarios. In marine transportation, this trade imbalance leads to Empty Container Repositioning (ECR) problems. Once the freight has been delivered from an exporting country to an importing one, the laden will turn into empty containers that need to be reposi…

    Submitted 25 July, 2022; originally announced July 2022.

  10. arXiv:2105.08834  [pdf, other]

    cs.LG

    Meta-Reinforcement Learning by Tracking Task Non-stationarity

    Authors: Riccardo Poiani, Andrea Tirinzoni, Marcello Restelli

    Abstract: Many real-world domains are subject to a structured non-stationarity which affects the agent's goals and the environmental dynamics. Meta-reinforcement learning (RL) has been shown successful for training agents that quickly adapt to related tasks. However, most of the existing meta-RL algorithms for non-stationary domains either make strong assumptions on the task generation process or require sa…

    Submitted 18 May, 2021; originally announced May 2021.

    Comments: To appear at IJCAI 2021

  11. arXiv:2007.00722  [pdf, other]

    cs.LG stat.ML

    Sequential Transfer in Reinforcement Learning with a Generative Model

    Authors: Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

    Abstract: We are interested in how to design reinforcement learning agents that provably reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones. The availability of solutions to related problems poses a fundamental trade-off: whether to seek policies that are expected to achieve high (yet sub-optimal) performance in the new task immediately or whether to se…

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: ICML 2020