Showing 1–10 of 10 results for author: Kumar, V B

Searching in archive cs.
  1. arXiv:2506.12103

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  2. arXiv:2503.01872

    cs.LG cs.AI cs.CV

    FairGen: Controlling Sensitive Attributes for Fair Generations in Diffusion Models via Adaptive Latent Guidance

    Authors: Mintong Kang, Vinayshekhar Bannihatti Kumar, Shamik Roy, Abhishek Kumar, Sopan Khosla, Balakrishnan Murali Narayanaswamy, Rashmi Gangadharaiah

    Abstract: Text-to-image diffusion models often exhibit biases toward specific demographic groups, such as generating more males than females when prompted to generate images of engineers, raising ethical concerns and limiting their adoption. In this paper, we tackle the challenge of mitigating generation bias towards any target attribute value (e.g., "male" for "gender") in diffusion models while preserving…

    Submitted 25 February, 2025; originally announced March 2025.

    Comments: Under submission

  3. arXiv:2406.08641

    cs.SD cs.CL eess.AS

    ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

    Authors: Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, Shinji Watanabe

    Abstract: ML-SUPERB evaluates self-supervised learning (SSL) models on the tasks of language identification and automatic speech recognition (ASR). This benchmark treats the models as feature extractors and uses a single shallow downstream model, which can be fine-tuned for a downstream task. However, real-world use cases may require different configurations. This paper presents ML-SUPERB 2.0, which is a ne…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  4. arXiv:2303.10160

    eess.AS cs.LG cs.SD

    Visual Information Matters for ASR Error Correction

    Authors: Vanya Bannihatti Kumar, Shanbo Cheng, Ningxin Peng, Yuchen Zhang

    Abstract: Aiming to improve the Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques have been widely developed due to their efficiency in using parallel text data. Previous works mainly focus on using text and/or speech data, which hinders the performance gain when not only text and speech information, but other modalities, such as visual information…

    Submitted 26 May, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023

  5. arXiv:2212.09573

    cs.CL

    Privacy Adhering Machine Un-learning in NLP

    Authors: Vinayshekhar Bannihatti Kumar, Rashmi Gangadharaiah, Dan Roth

    Abstract: Regulations introduced by the General Data Protection Regulation (GDPR) in the EU or the California Consumer Privacy Act (CCPA) in the US have included provisions on the right to be forgotten that mandate industry applications to remove data related to an individual from their systems. In several real world industry applications that use Machine Learning to build models on user data, such mandat…

    Submitted 19 December, 2022; originally announced December 2022.

  6. arXiv:2210.17035

    cs.CL cs.AI

    Evaluation of large-scale synthetic data for Grammar Error Correction

    Authors: Vanya Bannihatti Kumar

    Abstract: Grammar Error Correction (GEC) mainly relies on the availability of a large amount of high-quality synthetic parallel data of grammatically correct and erroneous sentence pairs. The quality of the synthetic data is evaluated on how well the GEC system performs when pre-trained using it. But this does not provide much insight into what are the necessary factors which define the quality of these dat…

    Submitted 30 October, 2022; originally announced October 2022.

  7. arXiv:2210.03264

    cs.CL

    Unsupervised Neural Stylistic Text Generation using Transfer learning and Adapters

    Authors: Vinayshekhar Bannihatti Kumar, Rashmi Gangadharaiah, Dan Roth

    Abstract: Research has shown that personality is a key driver to improve engagement and user experience in conversational systems. Conversational agents should also maintain a consistent persona to have an engaging conversation with a user. However, text generation datasets are often crowd sourced and thereby have an averaging effect where the style of the generation model is an average style of all the cro…

    Submitted 6 October, 2022; originally announced October 2022.

  8. arXiv:1907.10136

    cs.CL

    Dr.Quad at MEDIQA 2019: Towards Textual Inference and Question Entailment using contextualized representations

    Authors: Vinayshekhar Bannihatti Kumar, Ashwin Srinivasan, Aditi Chaudhary, James Route, Teruko Mitamura, Eric Nyberg

    Abstract: This paper presents the submissions by Team Dr.Quad to the ACL-BioNLP 2019 shared task on Textual Inference and Question Entailment in the Medical Domain. Our system is based on the prior work Liu et al. (2019) which uses a multi-task objective function for textual entailment. In this work, we explore different strategies for generalizing state-of-the-art language understanding models to the speci…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: Accepted in ACL challenge MediQA as part of the BioNLP workshop

  9. arXiv:1907.08259

    cs.LG cs.CL stat.ML

    WriterForcing: Generating more interesting story endings

    Authors: Prakhar Gupta, Vinayshekhar Bannihatti Kumar, Mukul Bhutani, Alan W Black

    Abstract: We study the problem of generating interesting endings for stories. Neural generative models have shown promising results for various text generation problems. Sequence to Sequence (Seq2Seq) models are typically trained to generate a single output sequence for a given input sequence. However, in the context of a story, multiple endings are possible. Seq2Seq models tend to ignore the context and ge…

    Submitted 18 July, 2019; originally announced July 2019.

    Comments: Accepted in ACL workshop on Storytelling 2019

  10. arXiv:1508.06823

    cs.DC

    Framework for Application Mapping over Packet-Switched Network of FPGAs: Case Studies

    Authors: Vinay B. Y. Kumar, Pinalkumar Engineer, Mandar Datar, Yatish Turakhia, Saurabh Agarwal, Sanket Diwale, Sachin B. Patkar

    Abstract: The algorithm-to-hardware High-level synthesis (HLS) tools today are purported to produce hardware comparable in quality to handcrafted designs, particularly with user-directive-driven or domain-specific HLS. However, HLS tools are not readily equipped for when an application/algorithm needs to scale. We present a (work-in-progress) semi-automated framework to map applications over a packet-switc…

    Submitted 27 August, 2015; originally announced August 2015.

    Comments: Presented at Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320)

    Report number: FSP/2015/05