High-Fidelity Data-Driven Dynamics Model for Reinforcement Learning-based Control in HL-3 Tokamak

Wu, Niannian; Yang, Zongyu; Li, Rongpeng; Wei, Ning; Chen, Yihang; Dong, Qianyun; Li, Jiyuan; Zheng, Guohui; Gong, Xinwen; Gao, Feng; Li, Bo; Xu, Min; Zhao, Zhifeng; Zhong, Wulyu

arXiv:2409.09238 (physics)

[Submitted on 14 Sep 2024 (v1), last revised 15 Aug 2025 (this version, v2)]

Abstract: The success of reinforcement learning (RL)-based control in tokamaks, an emerging technique for controlled nuclear fusion with improved flexibility, typically requires substantial interaction with a simulator capable of accurately evolving the high-dimensional plasma state. In contrast to first-principles-based simulators, whose intensive computations make RL training sluggish, we devise an effective method to acquire a fully data-driven simulator by mitigating the compounding-error issue that arises from its underlying autoregressive nature. With high accuracy and appealing extrapolation capability, this high-fidelity dynamics model subsequently enables the rapid training of a qualified RL agent that directly generates engineering-reasonable actuator commands, aiming at the desired long-term targets of plasma configuration. Together with EFITNN, a neural-network-based surrogate model for the Equilibrium Fitting code, the RL agent successfully maintains a 400 ms, 1 kHz trajectory control with accurate waveform tracking of plasma current and last closed flux surface on the HL-3 tokamak. Furthermore, it demonstrates the feasibility of zero-shot adaptation to changed triangularity targets, confirming the robustness of the developed data-driven dynamics model. Our work underscores the advantage of fully data-driven dynamics models in yielding RL-based trajectory control policies at a sufficiently fast pace, an anticipated engineering requirement in daily discharge practices for the upcoming ITER device.
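
The compounding-error issue mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' model or method: `true_step`, `learned_step`, and the coefficients are hypothetical stand-ins for a scalar system, chosen only to show how a small one-step error grows when a model is rolled out autoregressively, that is, when it consumes its own predictions.

```python
def true_step(state, action):
    """Toy 'plant': a stable scalar linear system (hypothetical, not HL-3 physics)."""
    return 0.95 * state + 0.10 * action


def learned_step(state, action):
    """Toy learned model with a small error in one coefficient (assumed for illustration)."""
    return 0.96 * state + 0.10 * action


def rollout(step_fn, initial_state, actions):
    """Autoregressive rollout: each predicted state is fed back as the next input."""
    states = [initial_state]
    for action in actions:
        states.append(step_fn(states[-1], action))
    return states


# 400 steps, loosely mirroring a 400 ms trajectory sampled at 1 kHz.
actions = [1.0] * 400
truth = rollout(true_step, 0.0, actions)
predicted = rollout(learned_step, 0.0, actions)

# One-step error: the model is restarted from the true state at every step.
one_step_errors = [
    abs(learned_step(s, a) - true_step(s, a)) for s, a in zip(truth[:-1], actions)
]
# Rollout error: the model is driven by its own earlier predictions.
rollout_errors = [abs(p - t) for p, t in zip(predicted, truth)]

print("largest one-step error:", max(one_step_errors))
print("final rollout error:   ", rollout_errors[-1])
```

In this toy setting the final rollout error is more than ten times the largest one-step error, which is why a model that looks accurate step by step can still drift over a long autoregressive horizon.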

Comments:	Accepted for publication in Communications Physics
Subjects:	Plasma Physics (physics.plasm-ph)
Cite as:	arXiv:2409.09238 [physics.plasm-ph]
	(or arXiv:2409.09238v2 [physics.plasm-ph] for this version)
	https://doi.org/10.48550/arXiv.2409.09238

Submission history

From: Niannian Wu
[v1] Sat, 14 Sep 2024 00:17:04 UTC (10,869 KB)
[v2] Fri, 15 Aug 2025 03:40:14 UTC (21,194 KB)
