Designing Deep Learning Frameworks for LLMs: Challenges, Expectations, and Opportunities

Mu, Yanzhou; Wang, Rong; Zhai, Juan; Fang, Chunrong; Chen, Xiang; Wu, Jiacong; Guo, An; Shen, Jiawei; Li, Bingzhuo; Chen, Zhenyu

Abstract: Large language models (LLMs) drive significant advancements in real-world industry applications. LLMs rely on DL frameworks for efficient model construction, distributed execution, and optimized deployment. Their large parameter scale and long execution cycles place extreme demands on DL frameworks in terms of scalability, stability, and efficiency. Consequently, poor usability, limited functionality, and subtle bugs in DL frameworks may hinder development efficiency and cause severe failures or wasted resources. However, a fundamental question remains underinvestigated: what challenges do DL frameworks face in supporting LLMs? To answer it, we conduct a large-scale analysis of issue reports from three major DL frameworks (MindSpore, PyTorch, TensorFlow) and eight associated LLM toolkits (e.g., Megatron). We construct a taxonomy of LLM-centric bugs, requirements, and user questions and enrich it through interviews with 11 LLM users and eight DL framework developers, uncovering key technical challenges and misalignments between user needs and developer priorities. Our contributions are threefold: (1) we develop a comprehensive taxonomy comprising four question themes (nine sub-themes), four requirement themes (15 sub-themes), and ten bug themes (45 sub-themes); (2) we assess the perceived importance and priority of these challenges based on practitioner insights; and (3) we identify five key findings across LLM development and propose five actionable recommendations to improve the reliability, usability, and testability of DL frameworks. Our results highlight critical limitations in current DL frameworks and offer concrete guidance for advancing their support for the next generation of LLM construction and applications.

Comments:	12 pages, 2 figures
Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2506.13114 [cs.SE]
	(or arXiv:2506.13114v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2506.13114
