High-Fidelity Data-Driven Dynamics Model for Reinforcement Learning-based Magnetic Control in HL-3 Tokamak

Wu, Niannian; Yang, Zongyu; Li, Rongpeng; Wei, Ning; Chen, Yihang; Dong, Qianyun; Li, Jiyuan; Zheng, Guohui; Gong, Xinwen; Gao, Feng; Li, Bo; Xu, Min; Zhao, Zhifeng; Zhong, Wulyu

arXiv:2409.09238v1 (physics)

[Submitted on 14 Sep 2024 (this version), latest version 15 Aug 2025 (v2)]

Abstract: Controlling tokamaks, a prominent technology in nuclear fusion, is essential because of their potential to provide a virtually unlimited source of clean energy. Reinforcement learning (RL) promises improved flexibility in managing the intricate, non-linear dynamics of the plasma confined in a tokamak. However, RL typically requires substantial interaction with a simulator capable of accurately evolving the high-dimensional plasma state. In contrast to first-principles-based simulators, whose intensive computations make RL training slow, we devise an effective method for building a fully data-driven simulator by mitigating the compounding-error issue that arises from its autoregressive nature. With high accuracy and appealing extrapolation capability, this high-fidelity dynamics model enables the rapid training of a capable RL agent that directly generates engineering-reasonable magnetic coil commands, aiming at desired long-term targets for the plasma current and the last closed flux surface. Together with EFITNN, a surrogate magnetic equilibrium reconstruction model, the RL agent successfully maintains a $100$-ms, $1$ kHz trajectory control with accurate waveform tracking on the HL-3 tokamak. Furthermore, it demonstrates the feasibility of zero-shot adaptation to changed triangularity targets, confirming the robustness of the developed data-driven dynamics model. Our work underscores the advantage of fully data-driven dynamics models in yielding RL-based trajectory control policies at a sufficiently fast pace, an anticipated engineering requirement in daily discharge practices for the upcoming ITER device.
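
The compounding-error issue mentioned in the abstract can be illustrated with a minimal sketch. This is a toy scalar system, not the HL-3 dynamics model or the authors' mitigation method; the functions `true_step`, `learned_step`, and `rollout` and all coefficients are hypothetical and chosen only for illustration. The sketch compares the error of a slightly inaccurate one-step model when it always sees the true state against its error when its own predictions are fed back autoregressively.

```python
def rollout(step, x0, n_steps):
    """Autoregressive rollout: each prediction is fed back as the next input."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(step(xs[-1]))
    return xs


def true_step(x):
    # Hypothetical "ground truth" dynamics (assumption, not a plasma model).
    return 0.99 * x + 0.1


def learned_step(x):
    # Hypothetical learned model with a small one-step coefficient error.
    return 1.00 * x + 0.1


n_steps = 100
true_traj = rollout(true_step, 1.0, n_steps)

# One-step errors: the model is always given the TRUE current state.
one_step_err = [abs(learned_step(x) - true_step(x)) for x in true_traj[:-1]]

# Rollout errors: the model is given its OWN previous prediction.
model_traj = rollout(learned_step, 1.0, n_steps)
rollout_err = [abs(m - t) for m, t in zip(model_traj, true_traj)]

print(f"largest one-step error: {max(one_step_err):.4f}")
print(f"final rollout error:    {rollout_err[-1]:.4f}")
```

In this toy setting the per-step error stays small, while the error at the end of the rollout is far larger, because each step starts from an already drifted state. A simulator used for RL training is queried in exactly this autoregressive way, which is why the abstract singles the issue out.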

Subjects:	Plasma Physics (physics.plasm-ph)
Cite as:	arXiv:2409.09238 [physics.plasm-ph]
	(or arXiv:2409.09238v1 [physics.plasm-ph] for this version)
	https://doi.org/10.48550/arXiv.2409.09238

Submission history

From: Niannian Wu
[v1] Sat, 14 Sep 2024 00:17:04 UTC (10,869 KB)
[v2] Fri, 15 Aug 2025 03:40:14 UTC (21,194 KB)
