Showing 1–1 of 1 results for author: Dongfang, Z

  1. arXiv:2505.11907 [pdf, ps, other]

    cs.CV

    Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning?

    Authors: Zihao Dongfang, Xu Zheng, Ziqiao Weng, Yuanhuiyi Lyu, Danda Pani Paudel, Luc Van Gool, Kailun Yang, Xuming Hu

    Abstract: The 180x360 omnidirectional field of view captured by 360-degree cameras enables their use in a wide range of applications such as embodied AI and virtual reality. Although recent advances in multimodal large language models (MLLMs) have shown promise in visual-spatial reasoning, most studies focus on standard pinhole-view images, leaving omnidirectional perception largely unexplored. In this pape…

    Submitted 17 May, 2025; originally announced May 2025.