Search | arXiv e-print repository

Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning

Authors: Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

Abstract: Markov decision processes (MDPs) are used to model a wide variety of applications, ranging from game playing to robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision process. However, a large class of problems does not fit straightforwardly into this framework: Non-cumulative Markov decision processes (NCMDPs), where instead of the expected sum of rewards, the expected value of an arbitrary function of the rewards is maximized. Example functions include the maximum of the rewards or their mean divided by their standard deviation. In this work, we introduce a general mapping of NCMDPs to standard MDPs. This allows all techniques developed to find optimal policies for MDPs, such as reinforcement learning or dynamic programming, to be directly applied to the larger class of NCMDPs. Focusing on reinforcement learning, we show applications in a diverse set of tasks, including classical control, portfolio optimization in finance, and discrete optimization problems. Given our approach, we can improve both final performance and training time compared to relying on standard MDPs.
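To make the idea of mapping a non-cumulative objective onto a standard sum of rewards concrete, here is a minimal sketch for the maximum-of-rewards case only. It is an illustration under stated assumptions (undiscounted, finite episodes; the function name `max_objective_rewards` is hypothetical) and is not taken from the paper itself. The running maximum plays the role of extra state, and each step is rewarded by how much it raises that maximum, so the modified rewards sum to the episode maximum.

```python
from typing import Iterable, List, Optional


def max_objective_rewards(rewards: Iterable[float]) -> List[float]:
    """Turn per-step rewards into modified rewards whose plain sum
    equals max(rewards).

    Assumption: undiscounted episode. The running maximum is the extra
    information an agent would need to carry in its (augmented) state.
    """
    running_max: Optional[float] = None
    modified: List[float] = []
    for r in rewards:
        if running_max is None:
            # First step: the maximum so far is the reward itself.
            modified.append(r)
            running_max = r
        else:
            # Later steps: reward only the increase of the running maximum.
            new_max = max(running_max, r)
            modified.append(new_max - running_max)
            running_max = new_max
    return modified


if __name__ == "__main__":
    episode = [1.0, 3.0, 2.0, 5.0, 4.0]
    shaped = max_objective_rewards(episode)
    # The cumulative objective on the modified rewards matches the
    # non-cumulative objective on the original rewards.
    print(shaped, sum(shaped), max(episode))
```

Any standard MDP solver that maximizes the sum of the modified rewards is then, in this sketch, maximizing the episode maximum of the original rewards.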

Submitted 23 May, 2025; v1 submitted 22 May, 2024; originally announced May 2024.

ACM Class: I.2.8; I.2.6

arXiv:2310.10498

Fast quantum control of cavities using an improved protocol without coherent errors

Authors: Jonas Landgraf, Christa Flühmann, Thomas Fösel, Florian Marquardt, Robert J. Schoelkopf

Abstract: The selective number-dependent arbitrary phase (SNAP) gates form a powerful class of quantum gates, imparting arbitrarily chosen phases to the Fock states of a cavity. However, for short pulses, coherent errors limit the performance. Here we demonstrate in theory and experiment that such errors can be completely suppressed, provided that the pulse times exceed a specific limit. The resulting shorter gate times also reduce incoherent errors. Our approach needs only a small number of frequency components, the resulting pulses can be interpreted easily, and it is compatible with fault-tolerant schemes.
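For orientation, the action described in this abstract (an arbitrarily chosen phase applied to each Fock state of the cavity) is commonly written as a diagonal operator in the Fock basis. The symbols below are notation assumed here for illustration, with $\theta_n$ the phase applied to Fock state $\lvert n \rangle$; they do not appear in the listing.

```latex
S(\vec{\theta}) = \sum_{n=0}^{\infty} e^{i\theta_n} \, \lvert n \rangle\langle n \rvert
```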

Submitted 28 November, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: 22 pages, 3 figures in the main text, 1 figure in the Appendix

arXiv:2210.16715

doi 10.1038/s41467-023-42901-3

Realizing a deep reinforcement learning agent discovering real-time feedback control strategies for a quantum system

Authors: Kevin Reuer, Jonas Landgraf, Thomas Fösel, James O'Sullivan, Liberto Beltrán, Abdulkadir Akin, Graham J. Norris, Ants Remm, Michael Kerschbaum, Jean-Claude Besse, Florian Marquardt, Andreas Wallraff, Christopher Eichler

Abstract: To realize the full potential of quantum technologies, finding good strategies to control quantum information processing devices in real time becomes increasingly important. Usually these strategies require a precise understanding of the device itself, which is generally not available. Model-free reinforcement learning circumvents this need by discovering control strategies from scratch without relying on an accurate description of the quantum system. Furthermore, important tasks like state preparation, gate teleportation and error correction need feedback at time scales much shorter than the coherence time, which for superconducting circuits is in the microsecond range. Developing and training a deep reinforcement learning agent able to operate in this real-time feedback regime has been an open challenge. Here, we have implemented such an agent in the form of a latency-optimized deep neural network on a field-programmable gate array (FPGA). We demonstrate its use to efficiently initialize a superconducting qubit into a target state. To train the agent, we use model-free reinforcement learning that is based solely on measurement data. We study the agent's performance for strong and weak measurements, and for three-level readout, and compare with simple strategies based on thresholding. This demonstration motivates further research towards adoption of reinforcement learning for real-time feedback control of quantum devices and more generally any physical system requiring learnable low-latency feedback control.

Submitted 29 October, 2022; originally announced October 2022.

Comments: 14 pages, 10 figures

Journal ref: Nat Commun 14, 7138 (2023)

arXiv:2208.03836

doi 10.1103/PhysRevA.107.010101

Artificial Intelligence and Machine Learning for Quantum Technologies

Authors: Mario Krenn, Jonas Landgraf, Thomas Foesel, Florian Marquardt

Abstract: In recent years, the dramatic progress in machine learning has begun to impact many areas of science and technology significantly. In the present perspective article, we explore how quantum technologies are benefiting from this revolution. We showcase in illustrative examples how scientists in the past few years have started to use machine learning and more broadly methods of artificial intelligence to analyze quantum measurements, estimate the parameters of quantum devices, discover new quantum experimental setups, protocols, and feedback strategies, and generally improve aspects of quantum computing, quantum communication, and quantum simulation. We highlight open challenges and future possibilities and conclude with some speculative visions for the next decade.

Submitted 7 August, 2022; originally announced August 2022.

Comments: 23 pages, 8 figures; comments welcome!

Journal ref: Phys. Rev. A 107(1), 010101 (2023)

arXiv:2105.00352

doi 10.22331/q-2022-05-17-714

Deep Learning of Quantum Many-Body Dynamics via Random Driving

Authors: Naeimeh Mohseni, Thomas Fösel, Lingzhen Guo, Carlos Navarrete-Benlloch, Florian Marquardt

Abstract: Neural networks have emerged as a powerful way to approach many practical problems in quantum physics. In this work, we illustrate the power of deep learning to predict the dynamics of a quantum many-body system, where the training is based purely on monitoring expectation values of observables under random driving. The trained recurrent network is able to produce accurate predictions for driving trajectories entirely different than those observed during training. As a proof of principle, here we train the network on numerical data generated from spin models, showing that it can learn the dynamics of observables of interest without needing information about the full quantum state. This allows our approach to be applied eventually to actual experimental data generated from a quantum many-body system that might be open, noisy, or disordered, without any need for a detailed understanding of the system. This scheme provides considerable speedup for rapid explorations and pulse optimization. Remarkably, we show the network is able to extrapolate the dynamics to times longer than those it has been trained on, as well as to the infinite-system-size limit.

Submitted 16 November, 2022; v1 submitted 1 May, 2021; originally announced May 2021.

Journal ref: Quantum 6, 714 (2022)

arXiv:2103.07585

Quantum circuit optimization with deep reinforcement learning

Authors: Thomas Fösel, Murphy Yuezhen Niu, Florian Marquardt, Li Li

Abstract: A central aspect for operating future quantum computers is quantum circuit optimization, i.e., the search for efficient realizations of quantum algorithms given the device capabilities. In recent years, powerful approaches have been developed which focus on optimizing the high-level circuit structure. However, these approaches do not consider and thus cannot optimize for the hardware details of the quantum architecture, which is especially important for near-term devices. To address this point, we present an approach to quantum circuit optimization based on reinforcement learning. We demonstrate how an agent, realized by a deep convolutional neural network, can autonomously learn generic strategies to optimize arbitrary circuits on a specific architecture, where the optimization target can be chosen freely by the user. We demonstrate the feasibility of this approach by training agents on 12-qubit random circuits, where we find on average a depth reduction by 27% and a gate count reduction by 15%. We examine the extrapolation to larger circuits than used for training, and envision how this approach can be utilized for near-term quantum devices.

Submitted 12 March, 2021; originally announced March 2021.

Comments: 10 pages, 5 figures; keywords: quantum computing, quantum circuit optimization, machine learning, reinforcement learning, deep reinforcement learning

arXiv:2004.14256

Efficient cavity control with SNAP gates

Authors: Thomas Fösel, Stefan Krastanov, Florian Marquardt, Liang Jiang

Abstract: Microwave cavities coupled to superconducting qubits have been demonstrated to be a promising platform for quantum information processing. A major challenge in this setup is to realize universal control over the cavity. A promising approach is the use of selective number-dependent arbitrary phase (SNAP) gates combined with cavity displacements. It has been proven that this is a universal gate set, but a central question remained open so far: how can a given target operation be realized efficiently with a sequence of these operations? In this work, we present a practical scheme to address this problem. It involves a hierarchical strategy to insert new gates into a sequence, followed by a co-optimization of the control parameters, which generates short high-fidelity sequences. For a broad range of experimentally relevant applications, we find that they can be implemented with 3 to 4 SNAP gates, compared to up to 50 with previously known techniques.

Submitted 29 April, 2020; originally announced April 2020.

Comments: 9 pages, 4 figures

arXiv:1802.05267

doi 10.1103/PhysRevX.8.031084

Reinforcement Learning with Neural Networks for Quantum Feedback

Authors: Thomas Fösel, Petru Tighineanu, Talitha Weiss, Florian Marquardt

Abstract: Machine learning with artificial neural networks is revolutionizing science. The most advanced challenges require discovering answers autonomously. This is the domain of reinforcement learning, where control strategies are improved according to a reward function. The power of neural-network-based reinforcement learning has been highlighted by spectacular recent successes, such as playing Go, but its benefits for physics are yet to be demonstrated. Here, we show how a network-based "agent" can discover complete quantum-error-correction strategies, protecting a collection of qubits against noise. These strategies require feedback adapted to measurement outcomes. Finding them from scratch, without human guidance, tailored to different hardware resources, is a formidable challenge due to the combinatorially large search space. To solve this, we develop two ideas: two-stage learning with teacher/student networks and a reward quantifying the capability to recover the quantum information stored in a multi-qubit system. Beyond its immediate impact on quantum computation, our work more generally demonstrates the promise of neural-network-based reinforcement learning in physics.

Submitted 31 August, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

Comments: 7 pages maintext + methods + supplementary, 6 maintext figures; for related lectures, see: http://machine-learning-for-physicists.org

Journal ref: Phys. Rev. X 8, 031084 (2018)

Showing 1–8 of 8 results for author: Foesel, T