Showing 1–1 of 1 results for author: Cao, K Q N

  1. arXiv:2505.21954

    cs.CV cs.AI

    UniTalk: Towards Universal Active Speaker Detection in Real World Scenarios

    Authors: Le Thien Phuc Nguyen, Zhuoran Yu, Khoa Quang Nhat Cao, Yuwei Guo, Tu Ho Manh Pham, Tuan Tai Nguyen, Toan Ngo Duc Vo, Lucas Poon, Soochahn Lee, Yong Jae Lee

    Abstract: We present UniTalk, a novel dataset specifically designed for the task of active speaker detection, emphasizing challenging scenarios to enhance model generalization. Unlike previously established benchmarks such as AVA, which predominantly features old movies and thus exhibits significant domain gaps, UniTalk focuses explicitly on diverse and difficult real-world conditions. These include underre…

    Submitted 28 May, 2025; originally announced May 2025.