Bottom-Up Meta-Policy Search

Melo, Luckeciano C.; Maximo, Marcos R. O. A.; da Cunha, Adilson Marques

arXiv:1910.10232 (cs)

[Submitted on 22 Oct 2019 (v1), last revised 9 Dec 2019 (this version, v2)]

Abstract: Despite recent progress in agents that learn through interaction, several challenges remain in terms of sample efficiency and generalization to behaviors unseen during training. To mitigate these problems, we propose and apply a first-order Meta-Learning algorithm called Bottom-Up Meta-Policy Search (BUMPS), which works with a two-phase optimization procedure: first, in a meta-training phase, it distills a few expert policies to create a meta-policy capable of generalizing knowledge to tasks unseen during training; second, it applies a fast adaptation strategy named Policy Filtering, which evaluates a few policies sampled from the meta-policy distribution and selects the one that best solves the task. We conducted all experiments in the RoboCup 3D Soccer Simulation domain, in the context of kick motion learning. We show that, given our experimental setup, BUMPS works in scenarios where simple multi-task Reinforcement Learning does not. Finally, we performed experiments to evaluate each component of the algorithm.
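
The adaptation step named in the abstract (sample a few policies from the meta-policy distribution, evaluate each, keep the best) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the meta-policy distribution is parameterized or how a policy is scored, so the sketch assumes a diagonal Gaussian over policy parameter vectors and a caller-supplied evaluate function. The names policy_filtering, toy_return, mean, std, and num_candidates are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def policy_filtering(
    mean: Sequence[float],
    std: Sequence[float],
    evaluate: Callable[[List[float]], float],
    num_candidates: int = 5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Sample a few candidate parameter vectors and keep the best-scoring one.

    Assumption: the meta-policy distribution is a diagonal Gaussian over
    policy parameters, described by `mean` and `std`.
    """
    rng = random.Random(seed)
    best_params: List[float] = []
    best_return = float("-inf")
    for _ in range(num_candidates):
        # Draw one candidate policy from the assumed Gaussian.
        candidate = [rng.gauss(m, s) for m, s in zip(mean, std)]
        # Score the candidate on the target task (for example, one rollout).
        candidate_return = evaluate(candidate)
        if candidate_return > best_return:
            best_params, best_return = candidate, candidate_return
    return best_params, best_return


def toy_return(params: List[float]) -> float:
    """Hypothetical stand-in for a task rollout; the return peaks at `target`."""
    target = [1.0, -2.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


if __name__ == "__main__":
    params, ret = policy_filtering([0.0, 0.0], [1.0, 1.0], toy_return)
    print(params, ret)
```

In this sketch adaptation costs exactly num_candidates evaluations and needs no gradient step on the new task, which is one reading of why the abstract calls the strategy fast; how candidates are scored in the kick-learning experiments is not stated in this listing.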

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as:	arXiv:1910.10232 [cs.LG]
	(or arXiv:1910.10232v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1910.10232

Submission history

From: Luckeciano Melo
[v1] Tue, 22 Oct 2019 21:12:54 UTC (566 KB)
[v2] Mon, 9 Dec 2019 11:41:39 UTC (2,702 KB)
