Showing 1–5 of 5 results for author: Somayazulu, A

  1. arXiv:2504.05451  [pdf, other]

    cs.CV

    Learning Activity View-invariance Under Extreme Viewpoint Changes via Curriculum Knowledge Distillation

    Authors: Arjun Somayazulu, Efi Mavroudi, Changan Chen, Lorenzo Torresani, Kristen Grauman

    Abstract: Traditional methods for view-invariant learning from video rely on controlled multi-view settings with minimal scene clutter. However, they struggle with in-the-wild videos that exhibit extreme viewpoint differences and share little visual content. We introduce a method for learning rich video representations in the presence of such severe view-occlusions. We first define a geometry-based metric t…

    Submitted 7 April, 2025; originally announced April 2025.

  2. arXiv:2404.16216  [pdf, other]

    cs.CV cs.RO cs.SD eess.AS

    ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling

    Authors: Arjun Somayazulu, Sagnik Majumder, Changan Chen, Kristen Grauman

    Abstract: An environment acoustic model represents how sound is transformed by the physical characteristics of an indoor environment, for any given source/receiver location. Traditional methods for constructing acoustic models involve expensive and time-consuming collection of large quantities of acoustic data at dense spatial locations in the space, or rely on privileged knowledge of scene geometry to inte…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Project page: https://vision.cs.utexas.edu/projects/active_rir/

  3. arXiv:2311.18259  [pdf, other]

    cs.CV cs.AI

    Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

    Authors: Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, et al. (76 additional authors not shown)

    Abstract: We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures from…

    Submitted 25 September, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Expanded manuscript (compared to arXiv v1 from Nov 2023 and CVPR 2024 paper from June 2024) for more comprehensive dataset and benchmark presentation, plus new results on v2 data release

  4. arXiv:2307.15064  [pdf, other]

    cs.MM cs.CV cs.SD eess.AS

    Self-Supervised Visual Acoustic Matching

    Authors: Arjun Somayazulu, Changan Chen, Kristen Grauman

    Abstract: Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a target acoustic environment. Existing methods assume access to paired training data, where the audio is observed in both source and target environments, but this limits the diversity of training data or requires the use of simulated data or heuristics to create paired samples. We propose a self-supervised ap…

    Submitted 23 November, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: Project page: https://vision.cs.utexas.edu/projects/ss_vam/ . Accepted at NeurIPS 2023

  5. arXiv:2211.05047  [pdf, other]

    eess.AS cs.AI cs.SD

    A Comparative Study of Data Augmentation Techniques for Deep Learning Based Emotion Recognition

    Authors: Ravi Shankar, Abdouh Harouna Kenfack, Arjun Somayazulu, Archana Venkataraman

    Abstract: Automated emotion recognition in speech is a long-standing problem. While early work on emotion recognition relied on hand-crafted features and simple classifiers, the field has now embraced end-to-end feature learning and classification using deep neural networks. In parallel to these models, researchers have proposed several data augmentation techniques to increase the size and variability of ex…

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: Under Submission