Skip to main content

Showing 1–1 of 1 results for author: Yixing, L

.
  1. arXiv:2410.15164  [pdf, other

    cs.AI

    SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

    Authors: Jingxuan Chen, Derek Yuen, Bin Xie, Yuhao Yang, Gongwei Chen, Zhihao Wu, Li Yixing, Xurui Zhou, Weiwen Liu, Shuai Wang, Kaiwen Zhou, Rui Shao, Liqiang Nie, Yasheng Wang, Jianye Hao, Jun Wang, Kun Shao

    Abstract: Smartphone agents are increasingly important for helping users control devices efficiently, with (Multimodal) Large Language Model (MLLM)-based approaches emerging as key contenders. Fairly comparing these agents is essential but challenging, requiring a varied task scope, the integration of agents with different implementations, and a generalisable evaluation pipeline to assess their strengths an… ▽ More

    Submitted 31 March, 2025; v1 submitted 19 October, 2024; originally announced October 2024.

    Comments: ICLR 2025 Spotlight