Showing 1–4 of 4 results for author: Vineet, V

  1. arXiv:2412.15922

    cs.LG cs.SD eess.AS

    RiTTA: Modeling Event Relations in Text-to-Audio Generation

    Authors: Yuhang He, Yash Jain, Xubo Liu, Andrew Markham, Vibhav Vineet

    Abstract: Despite significant advancements in Text-to-Audio (TTA) generation models achieving high-fidelity audio with fine-grained context understanding, they struggle to model the relations between audio events described in the input text. However, previous TTA methods have not systematically explored audio event relation modeling, nor have they proposed frameworks to enhance this capability. In this work…

    Submitted 4 January, 2025; v1 submitted 20 December, 2024; originally announced December 2024.

    Comments: Project Site: https://yuhanghe01.github.io/RiTTA-Proj/. Code: https://github.com/yuhanghe01/RiTTA

  2. arXiv:2207.01398

    cs.CV eess.IV

    Large-scale Robustness Analysis of Video Action Recognition Models

    Authors: Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh Rawat

    Abstract: We have seen great progress in video action recognition in recent years. There are several models based on convolutional neural networks (CNNs) and some recent transformer-based approaches which provide top performance on existing benchmarks. In this work, we perform a large-scale robustness analysis of these existing models for video action recognition. We focus on robustness against real-world d…

    Submitted 7 April, 2023; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted at the 2023 Conference on Computer Vision and Pattern Recognition (CVPR)

  3. arXiv:2010.10691

    cs.SD cs.LG eess.AS

    Prediction of Object Geometry from Acoustic Scattering Using Convolutional Neural Networks

    Authors: Ziqi Fan, Vibhav Vineet, Chenshen Lu, T. W. Wu, Kyla McMullen

    Abstract: Acoustic scattering is strongly influenced by the boundary geometry of objects over which sound scatters. The present work proposes a method to infer object geometry from scattering features by training convolutional neural networks. The training data is generated from a fast numerical solver developed on CUDA. The complete set of simulations is sampled to generate multiple datasets containing differe…

    Submitted 10 February, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted by ICASSP 2021

  4. arXiv:1911.01802

    eess.AS cs.LG cs.SD eess.IV eess.SP

    Fast acoustic scattering using convolutional neural networks

    Authors: Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi

    Abstract: Diffracted scattering and occlusion are important acoustic effects in interactive auralization and noise control applications, typically requiring expensive numerical simulation. We propose training a convolutional neural network to map from a convex scatterer's cross-section to a 2D slice of the resulting spatial loudness distribution. We show that employing a full-resolution residual network for…

    Submitted 15 February, 2020; v1 submitted 30 October, 2019; originally announced November 2019.

    Comments: Accepted by ICASSP 2020