-
Distributionally Robust Multi-Agent Reinforcement Learning for Dynamic Chute Mapping
Authors:
Guangyi Liu,
Suzan Iloglu,
Michael Caldara,
Joseph W. Durham,
Michael M. Zavlanos
Abstract:
In Amazon robotic warehouses, the destination-to-chute mapping problem is crucial for efficient package sorting. Often, however, this problem is complicated by uncertain and dynamic package induction rates, which can lead to increased package recirculation. To tackle this challenge, we introduce a Distributionally Robust Multi-Agent Reinforcement Learning (DRMARL) framework that learns a destination-to-chute mapping policy that is resilient to adversarial variations in induction rates. Specifically, DRMARL relies on group distributionally robust optimization (DRO) to learn a policy that performs well not only on average but also on each individual subpopulation of induction rates within the group that capture, for example, different seasonality or operation modes of the system. This approach is then combined with a novel contextual bandit-based predictor of the worst-case induction distribution for each state-action pair, significantly reducing the cost of exploration and thereby increasing the learning efficiency and scalability of our framework. Extensive simulations demonstrate that DRMARL achieves robust chute mapping in the presence of varying induction distributions, reducing package recirculation by an average of 80% in the simulation scenario.
Submitted 12 March, 2025;
originally announced March 2025.
-
Lifelong Multi-Agent Path Finding in Large-Scale Warehouses
Authors:
Jiaoyang Li,
Andrew Tinka,
Scott Kiesel,
Joseph W. Durham,
T. K. Satish Kumar,
Sven Koenig
Abstract:
Multi-Agent Path Finding (MAPF) is the problem of moving a team of agents to their goal locations without collisions. In this paper, we study the lifelong variant of MAPF, where agents are constantly engaged with new goal locations, such as in large-scale automated warehouses. We propose a new framework, Rolling-Horizon Collision Resolution (RHCR), for solving lifelong MAPF by decomposing the problem into a sequence of Windowed MAPF instances, where a Windowed MAPF solver resolves collisions among the paths of the agents only within a bounded time horizon and ignores collisions beyond it. RHCR is particularly well suited to generating pliable plans that adapt to continually arriving new goal locations. We empirically evaluate RHCR with a variety of MAPF solvers and show that it can produce high-quality solutions for up to 1,000 agents (= 38.9% of the empty cells on the map) for simulated warehouse instances, significantly outperforming existing work.
Submitted 12 March, 2021; v1 submitted 15 May, 2020;
originally announced May 2020.
-
The ACRV Picking Benchmark (APB): A Robotic Shelf Picking Benchmark to Foster Reproducible Research
Authors:
Jürgen Leitner,
Adam W. Tow,
Jake E. Dean,
Niko Suenderhauf,
Joseph W. Durham,
Matthew Cooper,
Markus Eich,
Christopher Lehnert,
Ruben Mangels,
Christopher McCool,
Peter Kujala,
Lachlan Nicholson,
Trung Pham,
James Sergeant,
Liao Wu,
Fangyi Zhang,
Ben Upcroft,
Peter Corke
Abstract:
Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress. They make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of contestants, and the test conditions are very difficult to replicate after the main event. We present a new physical benchmark challenge for robotic picking: the ACRV Picking Benchmark (APB). Designed to be reproducible, it consists of a set of 42 common objects, a widely available shelf, and exact guidelines for object arrangement using stencils. A well-defined evaluation protocol enables the comparison of complete robotic systems, including perception and manipulation, instead of sub-systems only. Our paper also describes and reports results achieved by an open baseline system based on a Baxter robot.
Submitted 14 December, 2016; v1 submitted 16 September, 2016;
originally announced September 2016.
-
Discrete Partitioning and Coverage Control for Gossiping Robots
Authors:
Joseph W. Durham,
Ruggero Carli,
Paolo Frasca,
Francesco Bullo
Abstract:
We propose distributed algorithms to automatically deploy a team of mobile robots to partition and provide coverage of a non-convex environment. To handle arbitrary non-convex environments, we represent them as graphs. Our partitioning and coverage algorithm requires only short-range, unreliable pairwise "gossip" communication. The algorithm has two components: (1) a motion protocol to ensure that neighboring robots communicate at least sporadically, and (2) a pairwise partitioning rule to update territory ownership when two robots communicate. By studying an appropriate dynamical system on the space of partitions of the graph vertices, we prove that territory ownership converges to a pairwise-optimal partition in finite time. This new equilibrium set represents improved performance over common Lloyd-type algorithms. Additionally, we detail how our algorithm scales well for large teams in large environments and how the computation can run in anytime with limited resources. Finally, we report on large-scale simulations in complex environments and hardware experiments using the Player/Stage robot control system.
Submitted 26 September, 2011; v1 submitted 8 November, 2010;
originally announced November 2010.
-
Pairwise Optimal Discrete Coverage Control for Gossiping Robots
Authors:
Joseph W. Durham,
Ruggero Carli,
Francesco Bullo
Abstract:
We propose distributed algorithms to automatically deploy a group of robotic agents and provide coverage of a discretized environment represented by a graph. The classic Lloyd approach to coverage optimization involves separate centering and partitioning steps and converges to the set of centroidal Voronoi partitions. In this work we present a novel graph coverage algorithm which achieves better performance without this separation while requiring only pairwise "gossip" communication between agents. Our new algorithm provably converges to an element of the set of pairwise-optimal partitions, a subset of the set of centroidal Voronoi partitions. We illustrate that this new equilibrium set represents a significant performance improvement through numerical comparisons to existing Lloyd-type methods. Finally, we discuss ways to efficiently do the necessary computations.
Submitted 29 August, 2010;
originally announced August 2010.