doi 10.1109/QCE57702.2023.10181

Quantum Natural Policy Gradients: Towards Sample-Efficient Reinforcement Learning

Authors: Nico Meyer, Daniel D. Scherer, Axel Plinge, Christopher Mutschler, Michael J. Hartmann

Abstract: Reinforcement learning is a growing field in AI with a lot of potential. Intelligent behavior is learned automatically through trial and error in interaction with the environment. However, this learning process is often costly. Using variational quantum circuits as function approximators potentially can reduce this cost. In order to implement this, we propose the quantum natural policy gradient (QNPG) algorithm -- a second-order gradient-based routine that takes advantage of an efficient approximation of the quantum Fisher information matrix. We experimentally demonstrate that QNPG outperforms first-order based training on Contextual Bandits environments regarding convergence speed and stability and moreover reduces the sample complexity. Furthermore, we provide evidence for the practical feasibility of our approach by training on a 12-qubit hardware device.

Submitted 9 August, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

Comments: Accepted to the 1st International Workshop on Quantum Machine Learning: From Foundations to Applications (QML@QCE 2023), Bellevue, Washington, USA. 6 pages, 4 figures, 1 table

arXiv:2212.06663 [pdf, other]

Quantum Policy Gradient Algorithm with Optimized Action Decoding

Authors: Nico Meyer, Daniel D. Scherer, Axel Plinge, Christopher Mutschler, Michael J. Hartmann

Abstract: Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose a specific action decoding procedure for a quantum policy gradient approach. We introduce a novel quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.

Submitted 22 May, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

Comments: Accepted to the 40th International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii, USA. 22 pages, 10 figures, 3 tables

Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:24592-24613, 2023

arXiv:1709.01560 [pdf, other]

doi 10.1109/LRA.2017.2654542

Ergodic Exploration using Binary Sensing for Non-Parametric Shape Estimation

Authors: Ian Abraham, Ahalya Prabhakar, Mitra J. Z. Hartmann, Todd D. Murphey

Abstract: Current methods to estimate object shape---using either vision or touch---generally depend on high-resolution sensing. Here, we exploit ergodic exploration to demonstrate successful shape estimation when using a low-resolution binary contact sensor. The measurement model is posed as a collision-based tactile measurement, and classification methods are used to discriminate between shape boundary regions in the search space. Posterior likelihood estimates of the measurement model help the system actively seek out regions where the binary sensor is most likely to return informative measurements. Results show successful shape estimation of various objects as well as the ability to identify multiple objects in an environment. Interestingly, it is shown that ergodic exploration utilizes non-contact motion to gather significant information about shape. The algorithm is extended in three dimensions in simulation and we present two dimensional experimental results using the Rethink Baxter robot.

Submitted 5 September, 2017; originally announced September 2017.

Comments: 8 pages

Journal ref: IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 827-834, 2017

Showing 1–3 of 3 results for author: Hartmann, M J