CodeContests+: High-Quality Test Case Generation for Competitive Programming

Wang, Zihan; Liu, Siyao; Sun, Yang; Li, Hongyan; Shen, Kai

arXiv:2506.05817 (cs)

[Submitted on 6 Jun 2025]

Abstract: Competitive programming, due to its high reasoning difficulty and precise correctness feedback, has become a key task for both training and evaluating the reasoning capabilities of large language models (LLMs). However, while a large amount of public problem data, such as problem statements and solutions, is available, the test cases for these problems are often difficult to obtain. Test case generation is therefore a necessary step in building large-scale datasets, and the quality of the test cases directly determines the accuracy of the evaluation. In this paper, we introduce an LLM-based agent system that creates high-quality test cases for competitive programming problems. We apply this system to the CodeContests dataset and propose a new version with improved test cases, named CodeContests+. We evaluated the quality of the test cases in CodeContests+. First, we used 1.72 million submissions with pass/fail labels to examine the accuracy of these test cases in evaluation. The results indicated that CodeContests+ achieves significantly higher accuracy than CodeContests, with a notably higher True Positive Rate (TPR). Subsequently, our experiments in LLM Reinforcement Learning (RL) further confirmed that improvements in test case quality yield considerable advantages for RL.
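The evaluation described in the abstract compares each submission's pass/fail label with the verdict produced by the generated test cases. The sketch below shows one way such rates could be computed. It assumes that a "positive" is a submission labeled as passing, so TPR is the fraction of labeled-correct submissions that the tests also accept. The `Submission` type, its field names, and the `evaluation_rates` function are hypothetical, and the sample data is made up for illustration; none of this is taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Submission:
    # Hypothetical record: both field names are assumptions for illustration.
    label_passed: bool  # ground-truth pass/fail label of the submission
    tests_passed: bool  # verdict when run against the generated test cases


def evaluation_rates(subs: Iterable[Submission]) -> dict:
    """Return TPR, TNR, and accuracy of the test verdicts against the labels."""
    tp = fn = tn = fp = 0
    for s in subs:
        if s.label_passed and s.tests_passed:
            tp += 1  # correct submission accepted
        elif s.label_passed:
            fn += 1  # correct submission wrongly rejected
        elif s.tests_passed:
            fp += 1  # incorrect submission wrongly accepted
        else:
            tn += 1  # incorrect submission rejected
    total = tp + fn + tn + fp
    return {
        # Rates are NaN when their denominator is zero.
        "tpr": tp / (tp + fn) if tp + fn else float("nan"),
        "tnr": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Made-up sample: three labeled-correct and two labeled-incorrect submissions.
sample = [
    Submission(True, True),
    Submission(True, False),
    Submission(True, True),
    Submission(False, False),
    Submission(False, True),
]
result = evaluation_rates(sample)
print(result)
```

Under this reading, a low TPR means the tests reject submissions that are labeled correct (for example, because a generated input violates the problem constraints), while a low True Negative Rate means the tests are too weak to catch incorrect submissions.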

Comments:	28 pages, 7 figures
Subjects:	Software Engineering (cs.SE); Computation and Language (cs.CL)
Cite as:	arXiv:2506.05817 [cs.SE]
	(or arXiv:2506.05817v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2506.05817

Submission history

From: Zihan Wang
[v1] Fri, 6 Jun 2025 07:29:01 UTC (1,997 KB)