-
Fast Primal-Dual Update against Local Weight Update in Linear Assignment Problem and Its Application
Authors:
Kohei Morita,
Shinya Shiroshita,
Yutaro Yamaguchi,
Yu Yokoi
Abstract:
We consider a dynamic situation in the weighted bipartite matching problem: edge weights in the input graph are repeatedly updated and we are asked to maintain an optimal matching at any moment. A trivial approach is to compute an optimal matching from scratch each time an update occurs. In this paper, we show that if each update occurs locally around a single vertex, then a single execution of Dijkstra's algorithm is sufficient to preserve optimality with the aid of a dual solution. As an application of our result, we provide a faster implementation of the envy-cycle procedure for finding an envy-free allocation of indivisible items. Our algorithm runs in $\mathrm{O}(mn^2)$ time, while the known bound of the original one is $\mathrm{O}(mn^3)$, where $n$ and $m$ denote the numbers of agents and items, respectively.
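As a point of reference, the "trivial approach" mentioned in the abstract (recomputing an optimal matching from scratch after each update) can be sketched as follows. This is only an illustrative baseline, not the paper's Dijkstra-based primal-dual update. It assumes a square weight matrix and a minimization objective, and the function names `optimal_matching` and `update_vertex` are hypothetical. The brute-force search is exponential in the number of vertices and is meant for tiny examples only.

```python
from itertools import permutations


def optimal_matching(weights):
    """Brute-force minimum-weight perfect matching (assumption: minimization).

    weights[i][j] is the weight of the edge between left vertex i and
    right vertex j. Returns (total weight, assignment), where
    assignment[i] is the right vertex matched to left vertex i.
    """
    n = len(weights)
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):
        cost = sum(weights[i][perm[i]] for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, list(best_perm)


def update_vertex(weights, i, new_row):
    """Local update: replace every edge weight incident to left vertex i."""
    updated = [row[:] for row in weights]  # copy so the input is untouched
    updated[i] = list(new_row)
    return updated


# Weights change only around left vertex 0, then the matching is
# recomputed from scratch (the baseline the paper improves upon).
weights = [[4, 1, 3], [2, 0, 5], [3, 2, 2]]
before = optimal_matching(weights)
after = optimal_matching(update_vertex(weights, 0, [1, 9, 9]))
```

The paper's contribution is to replace the full recomputation in the last step with a single run of Dijkstra's algorithm, using a maintained dual solution.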
Submitted 30 July, 2023; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Learning General Inventory Management Policy for Large Supply Chain Network
Authors:
Soh Kumabe,
Shinya Shiroshita,
Takanori Hayashi,
Shirou Maruyama
Abstract:
Inventory management in warehouses directly affects profits made by manufacturers. In particular, large manufacturers produce a very large variety of products that are handled by a very large number of retailers. In such a case, the computational complexity of classical inventory management algorithms is inordinately large. In recent years, learning-based approaches have become popular for addressing such problems. However, previous studies have not handled systems where both the number of products and the number of retailers are large. This study proposes a reinforcement learning-based warehouse inventory management algorithm that can be used for supply chain systems where both the number of products and the number of retailers are large. To solve the computational problem of handling large systems, we provide a means of approximate simulation of the system in the training phase. Our experiments on both real and artificial data demonstrate that our algorithm with approximated simulation can successfully handle large supply chain networks.
Submitted 28 April, 2022;
originally announced April 2022.
-
The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors
Authors:
William H. Guss,
Mario Ynocente Castro,
Sam Devlin,
Brandon Houghton,
Noboru Sean Kuno,
Crissman Loomis,
Stephanie Milani,
Sharada Mohanty,
Keisuke Nakata,
Ruslan Salakhutdinov,
John Schulman,
Shinya Shiroshita,
Nicholay Topin,
Avinash Ummadisingu,
Oriol Vinyals
Abstract:
Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development. Resolution of these limitations requires new, sample-efficient methods. To facilitate research in this direction, we propose this second iteration of the MineRL Competition. The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations to drastically reduce the number of samples needed to solve complex, hierarchical, and sparse environments. To that end, participants compete under a limited environment sample-complexity budget to develop systems which solve the MineRL ObtainDiamond task in Minecraft, a sequential decision-making environment requiring long-term planning, hierarchical control, and efficient exploration methods. The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment with different game textures and shaders. At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform, where they are trained from scratch on a hold-out dataset-environment pair for a total of 4 days on a pre-specified hardware platform. In this follow-up iteration to the NeurIPS 2019 MineRL Competition, we implement new features to expand the scale and reach of the competition. In response to the feedback of the previous participants, we introduce a second minor track focusing on solutions without access to environment interactions of any kind except during test time. Further, we aim to prompt domain-agnostic submissions by implementing several novel competition mechanics, including action-space randomization and desemantization of observations and actions.
Submitted 26 January, 2021;
originally announced January 2021.
-
Discovering Avoidable Planner Failures of Autonomous Vehicles using Counterfactual Analysis in Behaviorally Diverse Simulation
Authors:
Daisuke Nishiyama,
Mario Ynocente Castro,
Shirou Maruyama,
Shinya Shiroshita,
Karim Hamzaoui,
Yi Ouyang,
Guy Rosman,
Jonathan DeCastro,
Kuan-Hui Lee,
Adrien Gaidon
Abstract:
Automated vehicles require exhaustive testing in simulation to detect as many safety-critical failures as possible before deployment on public roads. In this work, we focus on the core decision-making component of autonomous robots: their planning algorithm. We introduce a planner testing framework that leverages recent progress in simulating behaviorally diverse traffic participants. Using large-scale search, we generate, detect, and characterize dynamic scenarios leading to collisions. In particular, we propose methods to distinguish between unavoidable and avoidable accidents, focusing especially on automatically finding planner-specific defects that must be corrected before deployment. Through experiments in complex multi-agent intersection scenarios, we show that our method can indeed find a wide range of critical planner failures.
Submitted 24 November, 2020;
originally announced November 2020.
-
Behaviorally Diverse Traffic Simulation via Reinforcement Learning
Authors:
Shinya Shiroshita,
Shirou Maruyama,
Daisuke Nishiyama,
Mario Ynocente Castro,
Karim Hamzaoui,
Guy Rosman,
Jonathan DeCastro,
Kuan-Hui Lee,
Adrien Gaidon
Abstract:
Traffic simulators are important tools in autonomous driving development. While continuous progress has been made to provide developers more options for modeling various traffic participants, tuning these models to increase their behavioral diversity while maintaining quality is often very challenging. This paper introduces an easily tunable policy generation algorithm for autonomous driving agents. The proposed algorithm balances diversity and driving skills by leveraging the representation and exploration abilities of deep reinforcement learning via a distinct policy set selector. Moreover, we present an algorithm utilizing intrinsic rewards to widen behavioral differences during training. To provide quantitative assessments, we develop two trajectory-based evaluation metrics which measure the differences among policies and behavioral coverage. We experimentally show the effectiveness of our methods on several challenging intersection scenes.
Submitted 11 November, 2020;
originally announced November 2020.