Yuanbo Hou is qualified to endorse.

Audio-visual scene classification via contrastive event-object alignment and semantic-based fusion

Yuanbo Hou is registered as an author of this paper and can endorse for cs.CL, cs.HC, cs.MM, cs.RO, cs.SD, eess.AS, eess.IV, and eess.SP.

Bo Kang and Dick Botteldooren are not registered as owners of this paper.