Multi-Agent Meta-Offline Reinforcement Learning for Timely UAV Path Planning and Data Collection

Eldeeb, Eslam; Alves, Hirley

Abstract: Multi-agent reinforcement learning (MARL) has been widely adopted in high-performance computing and complex data-driven decision-making in the wireless domain. However, conventional MARL schemes face two main obstacles in real-world scenarios. First, most MARL algorithms are online, which can be unsafe and impractical. Second, MARL algorithms are environment-specific, meaning that changes in network configuration require model retraining. This letter proposes a novel meta-offline MARL algorithm that combines conservative Q-learning (CQL) and model-agnostic meta-learning (MAML). CQL enables offline training by leveraging pre-collected datasets, while MAML ensures scalability and adaptability to dynamic network configurations and objectives. We propose two algorithm variants: independent training (M-I-MARL) and centralized training with decentralized execution (M-CTDE-MARL). Simulation results show that the proposed algorithm outperforms conventional schemes, especially the CTDE approach, which achieves 50% faster convergence in dynamic scenarios than the benchmarks. The proposed framework enhances scalability, robustness, and adaptability in wireless communication systems by optimizing UAV trajectories and scheduling policies.
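As a rough illustration of the offline component mentioned in the abstract, the sketch below computes a CQL-style loss for a single agent with a tabular Q-function and discrete actions. It is not the authors' implementation: the function names (`cql_loss`, `logsumexp`), the tabular setting, and the hyperparameters `gamma` and `alpha` are assumptions made for illustration, and the MAML outer loop and the multi-agent (independent or CTDE) structure are omitted.

```python
import numpy as np

def logsumexp(x, axis=-1):
    # Numerically stable log(sum(exp(x))) along the given axis.
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def cql_loss(q, q_target, batch, gamma=0.99, alpha=1.0):
    """CQL-style loss on a batch of logged transitions (hypothetical helper).

    q, q_target: arrays of shape (num_states, num_actions).
    batch: tuple (s, a, r, s2, done) of 1-D arrays drawn from an offline dataset.
    """
    s, a, r, s2, done = batch
    q_sa = q[s, a]
    # Standard one-step TD target using the target table.
    td_target = r + gamma * (1.0 - done) * q_target[s2].max(axis=1)
    td_loss = 0.5 * np.mean((q_sa - td_target) ** 2)
    # Conservative term: push down Q over all actions (soft maximum),
    # push up Q on the actions actually present in the dataset.
    conservative = np.mean(logsumexp(q[s], axis=1) - q_sa)
    return td_loss + alpha * conservative

# Toy usage: 2 states, 3 actions, all-zero Q-tables, two logged transitions.
q = np.zeros((2, 3))
batch = (np.array([0, 1]), np.array([0, 2]),
         np.array([1.0, 0.0]), np.array([1, 0]), np.array([0.0, 1.0]))
loss = cql_loss(q, q.copy(), batch)
```

The conservative term is always non-negative, since the log-sum-exp over actions is at least the value of any single action, so it penalizes Q-values on actions that the dataset does not support.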

Subjects:	Multiagent Systems (cs.MA)
Cite as:	arXiv:2501.16098 [cs.MA]
	(or arXiv:2501.16098v1 [cs.MA] for this version)
	https://doi.org/10.48550/arXiv.2501.16098

