Showing 1–1 of 1 results for author: Zhang, V X J

  1. arXiv:2501.01645  [pdf, other]

    cs.CV cs.AI

    HLV-1K: A Large-scale Hour-Long Video Benchmark for Time-Specific Long Video Understanding

    Authors: Heqing Zou, Tianze Luo, Guiyang Xie, Victor Xiao Jie Zhang, Fengmao Lv, Guangcong Wang, Junyang Chen, Zhuochen Wang, Hansheng Zhang, Huaijian Zhang

    Abstract: Multimodal large language models have become a popular topic in deep visual understanding due to many promising real-world applications. However, hour-long video understanding, spanning over one hour and containing tens of thousands of visual frames, remains under-explored because of 1) challenging long-term video analyses, 2) inefficient large-model approaches, and 3) lack of large-scale benchmar…

    Submitted 13 May, 2025; v1 submitted 3 January, 2025; originally announced January 2025.

    Comments: Accepted to ICME 2025