Showing 1–3 of 3 results for author: Ng, X Y

  1. arXiv:2410.21311  [pdf, other]

    cs.CV cs.AI

    MMDocBench: Benchmarking Large Vision-Language Models for Fine-Grained Visual Document Understanding

    Authors: Fengbin Zhu, Ziyang Liu, Xiang Yao Ng, Haohui Wu, Wenjie Wang, Fuli Feng, Chao Wang, Huanbo Luan, Tat Seng Chua

    Abstract: Large Vision-Language Models (LVLMs) have achieved remarkable performance in many vision-language tasks, yet their capabilities in fine-grained visual understanding remain insufficiently evaluated. Existing benchmarks either contain limited fine-grained evaluation samples that are mixed with other data, or are confined to object-level assessments in natural images. To holistically assess LVLMs' fi…

    Submitted 25 October, 2024; originally announced October 2024.

    Comments: Under review

  2. arXiv:2206.02679  [pdf, other]

    cs.RO cs.AI

    Real2Sim or Sim2Real: Robotics Visual Insertion using Deep Reinforcement Learning and Real2Sim Policy Adaptation

    Authors: Yiwen Chen, Xue Li, Sheng Guo, Xian Yao Ng, Marcelo Ang

    Abstract: Reinforcement learning has shown a wide usage in robotics tasks, such as insertion and grasping. However, without a practical sim2real strategy, the policy trained in simulation could fail on the real task. There are also wide researches in the sim2real strategies, but most of those methods rely on heavy image rendering, domain randomization training, or tuning. In this work, we solve the insertio…

    Submitted 6 June, 2022; originally announced June 2022.

  3. arXiv:2205.05963  [pdf, other]

    cs.RO cs.AI cs.CV

    Economical Precise Manipulation and Auto Eye-Hand Coordination with Binocular Visual Reinforcement Learning

    Authors: Yiwen Chen, Sheng Guo, Zedong Zhang, Lei Zhou, Xian Yao Ng, Marcelo H. Ang Jr

    Abstract: Precision robotic manipulation tasks (insertion, screwing, precisely pick, precisely place) are required in many scenarios. Previous methods achieved good performance on such manipulation tasks. However, such methods typically require tedious calibration or expensive sensors. 3D/RGB-D cameras and torque/force sensors add to the cost of the robotic application and may not always be economical. In t…

    Submitted 15 September, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: 12 pages, 16 figures