Showing 1–4 of 4 results for author: Fedorishin, D

  1. Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos

    Authors: Dennis Fedorishin, Lie Lu, Srirangaraj Setlur, Venu Govindaraju

    Abstract: A "match cut" is a common video editing technique where a pair of shots that have a similar composition transition fluidly from one to another. Although match cuts are often visual, certain match cuts involve the fluid transition of audio, where sounds from different sources merge into one indistinguishable transition between two shots. In this paper, we explore the ability to automatically find a…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Accepted to ICASSP 2024

  2. arXiv:2403.11037  [pdf, other]

    eess.AS cs.SD

    Fine-Grained Engine Fault Sound Event Detection Using Multimodal Signals

    Authors: Dennis Fedorishin, Livio Forte III, Philip Schneider, Srirangaraj Setlur, Venu Govindaraju

    Abstract: Sound event detection (SED) is an active area of audio research that aims to detect the temporal occurrence of sounds. In this paper, we apply SED to engine fault detection by introducing a multimodal SED framework that detects fine-grained engine faults of automobile engines using audio and accelerometer-recorded vibration. We first introduce the problem of engine fault SED on a dataset collected…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Accepted to ICASSP 2024

  3. arXiv:2307.10237  [pdf, other]

    cs.CV cs.AI cs.LG

    CoNAN: Conditional Neural Aggregation Network For Unconstrained Face Feature Fusion

    Authors: Bhavin Jawade, Deen Dayal Mohan, Dennis Fedorishin, Srirangaraj Setlur, Venu Govindaraju

    Abstract: Face recognition from image sets acquired under unregulated and uncontrolled settings, such as at large distances, low resolutions, varying viewpoints, illumination, pose, and atmospheric conditions, is challenging. Face feature aggregation, which involves aggregating a set of N feature representations present in a template into a single global representation, plays a pivotal role in such recognit…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: Paper accepted at IJCB 2023

  4. arXiv:2211.03019  [pdf, other]

    cs.CV

    Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization

    Authors: Dennis Fedorishin, Deen Dayal Mohan, Bhavin Jawade, Srirangaraj Setlur, Venu Govindaraju

    Abstract: Learning to localize the sound source in videos without explicit annotations is a novel area of audio-visual research. Existing work in this area focuses on creating attention maps to capture the correlation between the two modalities to localize the source of the sound. In a video, oftentimes, the objects exhibiting movement are the ones generating the sound. In this work, we capture this charact…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: Accepted to WACV 2023