Showing 1–5 of 5 results for author: Choe, J S B

Searching in archive cs.
  1. arXiv:2505.07516  [pdf, ps, other]

    cs.RO

    Average-Reward Maximum Entropy Reinforcement Learning for Global Policy in Double Pendulum Tasks

    Authors: Jean Seong Bjorn Choe, Bumkyu Choi, Jong-kook Kim

    Abstract: This report presents our reinforcement learning-based approach for the swing-up and stabilisation tasks of the acrobot and pendubot, tailored specifically to the updated guidelines of the 3rd AI Olympics at ICRA 2025. Building upon our previously developed Average-Reward Entropy Advantage Policy Optimization (AR-EAPO) algorithm, we refined our solution to effectively address the new competition sc…

    Submitted 12 May, 2025; originally announced May 2025.

  2. arXiv:2503.15290  [pdf, other]

    cs.RO

    Reinforcement Learning for Robust Athletic Intelligence: Lessons from the 2nd 'AI Olympics with RealAIGym' Competition

    Authors: Felix Wiebe, Niccolò Turcato, Alberto Dalla Libera, Jean Seong Bjorn Choe, Bumkyu Choi, Tim Lukas Faust, Habib Maraqten, Erfan Aghadavoodi, Marco Cali, Alberto Sinigaglia, Giulio Giacomuzzo, Diego Romeres, Jong-kook Kim, Gian Antonio Susto, Shubham Vyas, Dennis Mronga, Boris Belousov, Jan Peters, Frank Kirchner, Shivesh Kumar

    Abstract: In the field of robotics, many different approaches, ranging from classical planning over optimal control to reinforcement learning (RL), are developed and borrowed from other fields to achieve reliable control in diverse tasks. In order to get a clear understanding of their individual strengths and weaknesses and their applicability in real-world robotic scenarios, it is important to benchmark and co…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 8 pages, 7 figures

  3. arXiv:2409.08938  [pdf, other]

    cs.RO cs.LG

    Average-Reward Maximum Entropy Reinforcement Learning for Underactuated Double Pendulum Tasks

    Authors: Jean Seong Bjorn Choe, Bumkyu Choi, Jong-kook Kim

    Abstract: This report presents a solution for the swing-up and stabilisation tasks of the acrobot and the pendubot, developed for the AI Olympics competition at IROS 2024. Our approach employs the Average-Reward Entropy Advantage Policy Optimization (AR-EAPO), a model-free reinforcement learning (RL) algorithm that combines average-reward RL and maximum entropy RL. Results demonstrate that our controller ac…

    Submitted 13 September, 2024; originally announced September 2024.

  4. arXiv:2407.18143  [pdf, other]

    cs.LG cs.AI

    Maximum Entropy On-Policy Actor-Critic via Entropy Advantage Estimation

    Authors: Jean Seong Bjorn Choe, Jong-Kook Kim

    Abstract: Entropy Regularisation is a widely adopted technique that enhances policy optimisation performance and stability. A notable form of entropy regularisation is augmenting the objective with an entropy term, thereby simultaneously optimising the expected return and the entropy. This framework, known as maximum entropy reinforcement learning (MaxEnt RL), has shown theoretical and empirical successes.…

    Submitted 25 July, 2024; originally announced July 2024.

  5. arXiv:2403.13866  [pdf, other]

    cs.LG cs.AI

    The Bid Picture: Auction-Inspired Multi-player Generative Adversarial Networks Training

    Authors: Joo Yong Shim, Jean Seong Bjorn Choe, Jong-Kook Kim

    Abstract: This article proposes auction-inspired multi-player generative adversarial networks training, which mitigates the mode collapse problem of GANs. Mode collapse occurs when an over-fitted generator generates a limited range of samples, often concentrating on a small subset of the data distribution. Despite the restricted diversity of generated samples, the discriminator can still be deceived into di…

    Submitted 20 March, 2024; originally announced March 2024.