Showing 1–26 of 26 results for author: Sunder, V

  1. arXiv:2505.08699  [pdf, other]

    eess.AS

    Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities

    Authors: George Saon, Avihu Dekel, Alexander Brooks, Tohru Nagano, Abraham Daniels, Aharon Satt, Ashish Mittal, Brian Kingsbury, David Haws, Edmilson Morais, Gakuto Kurata, Hagai Aronowitz, Ibrahim Ibrahim, Jeff Kuo, Kate Soule, Luis Lastras, Masayuki Suzuki, Ron Hoory, Samuel Thomas, Sashi Novitasari, Takashi Fukuda, Vishal Sunder, Xiaodong Cui, Zvi Kons

    Abstract: Granite-speech LLMs are compact and efficient speech language models specifically designed for English ASR and automatic speech translation (AST). The models were trained by modality aligning the 2B and 8B parameter variants of granite-3.3-instruct to speech on publicly available open-source corpora containing audio inputs and text targets consisting of either human transcripts for ASR or automati…

    Submitted 13 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

    Comments: 7 pages, 9 figures

  2. arXiv:2501.09104  [pdf, other]

    cs.SD cs.AI eess.AS

    A Non-autoregressive Model for Joint STT and TTS

    Authors: Vishal Sunder, Brian Kingsbury, George Saon, Samuel Thomas, Slava Shechtman, Hagai Aronowitz, Eric Fosler-Lussier, Luis Lastras

    Abstract: In this paper, we take a step towards jointly modeling automatic speech recognition (STT) and speech synthesis (TTS) in a fully non-autoregressive way. We develop a novel multimodal framework capable of handling the speech and text modalities as input either individually or together. The proposed model can also be trained with unpaired speech or text data owing to its multimodal nature. We further…

    Submitted 20 January, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

    Comments: 5 pages, 3 figures, 3 tables

  3. arXiv:2501.01936  [pdf, other]

    cs.LG

    Improving Transducer-Based Spoken Language Understanding with Self-Conditioned CTC and Knowledge Transfer

    Authors: Vishal Sunder, Eric Fosler-Lussier

    Abstract: In this paper, we propose to improve end-to-end (E2E) spoken language understanding (SLU) in an RNN transducer model (RNN-T) by incorporating a joint self-conditioned CTC automatic speech recognition (ASR) objective. Our proposed model is akin to an E2E differentiable cascaded model which performs ASR and SLU sequentially and we ensure that the SLU task is conditioned on the ASR task by having CTC se…

    Submitted 3 January, 2025; originally announced January 2025.

    Comments: 8 pages, 4 figures

  4. arXiv:2310.11486  [pdf, other]

    eess.AS cs.AI cs.LG

    End-to-End real time tracking of children's reading with pointer network

    Authors: Vishal Sunder, Beulah Karrolla, Eric Fosler-Lussier

    Abstract: In this work, we explore how a real time reading tracker can be built efficiently for children's voices. While previously proposed reading trackers focused on ASR-based cascaded approaches, we propose a fully end-to-end model making it less prone to lags in voice tracking. We employ a pointer network that directly learns to predict positions in the ground truth text conditioned on the streaming sp…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: 5 pages, 3 figures

  5. arXiv:2204.05188  [pdf, other]

    cs.CL cs.SD eess.AS

    Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems

    Authors: Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury

    Abstract: Recent advances in End-to-End (E2E) Spoken Language Understanding (SLU) have been primarily due to effective pretraining of speech representations. One such pretraining paradigm is the distillation of semantic knowledge from state-of-the-art text-based models like BERT to speech encoder neural networks. This work is a step towards doing the same in a much more efficient and fine-grained manner whe…

    Submitted 1 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: 5 pages, 2 figures

  6. arXiv:2204.05183  [pdf, other]

    cs.CL cs.SD eess.AS

    Building an ASR Error Robust Spoken Virtual Patient System in a Highly Class-Imbalanced Scenario Without Speech Data

    Authors: Vishal Sunder, Prashant Serai, Eric Fosler-Lussier

    Abstract: A Virtual Patient (VP) is a powerful tool for training medical students to take patient histories, where responding to a diverse set of spoken questions is essential to simulate natural conversations with a student. The performance of such a Spoken Language Understanding system (SLU) can be adversely affected by both the presence of Automatic Speech Recognition (ASR) errors in the test data and a…

    Submitted 1 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: 5 pages, 3 figures

  7. arXiv:2204.05169  [pdf, other]

    cs.CL cs.AI

    Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding

    Authors: Vishal Sunder, Samuel Thomas, Hong-Kwang J. Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier

    Abstract: Dialog history plays an important role in spoken language understanding (SLU) performance in a dialog system. For end-to-end (E2E) SLU, previous work has used dialog history in text form, which makes the model dependent on a cascaded automatic speech recognizer (ASR). This rescinds the benefits of an E2E system which is intended to be compact and robust to ASR errors. In this paper, we propose a h…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: 5 pages, 1 figure

  8. arXiv:2103.12258  [pdf, other]

    cs.CL cs.LG

    Hallucination of speech recognition errors with sequence to sequence learning

    Authors: Prashant Serai, Vishal Sunder, Eric Fosler-Lussier

    Abstract: Automatic Speech Recognition (ASR) is an imperfect process that results in certain mismatches in ASR output text when compared to plain written text or transcriptions. When plain text data is to be used to train systems for spoken language understanding or ASR, a proven strategy to reduce said mismatch and prevent degradations is to hallucinate what the ASR outputs would be given a gold transcrip…

    Submitted 31 March, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

    Comments: Submitted to IEEE/ACM Transactions on Audio Speech and Language Processing

  9. arXiv:2010.15090  [pdf, other]

    cs.CL cs.LG

    Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation

    Authors: Vishal Sunder, Eric Fosler-Lussier

    Abstract: Utterance classification performance in low-resource dialogue systems is constrained by an inevitably high degree of data imbalance in class labels. We present a new end-to-end pairwise learning framework that is designed specifically to tackle this phenomenon by inducing a few-shot classification capability in the utterance representations and augmenting data through an interpolation of utterance…

    Submitted 28 October, 2020; originally announced October 2020.

    Comments: 5 pages, 4 figures, 3 tables

  10. arXiv:1912.07228  [pdf, ps, other]

    math.OA

    Planar algebras, quantum information theory and subfactors

    Authors: Vijay Kodiyalam, Sruthymurali, V. S. Sunder

    Abstract: We define generalised notions of biunitary elements in planar algebras and show that objects arising in quantum information theory such as Hadamard matrices, quantum latin squares and unitary error bases are all given by biunitary elements in the spin planar algebra. We show that there are natural subfactor planar algebras associated with biunitary elements.

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: 18 pages, 25 figures

    MSC Class: 46L37; 81P45; 81P68

  11. arXiv:1906.02427  [pdf, other]

    cs.AI cs.LG cs.LO

    One-shot Information Extraction from Document Images using Neuro-Deductive Program Synthesis

    Authors: Vishal Sunder, Ashwin Srinivasan, Lovekesh Vig, Gautam Shroff, Rohit Rahul

    Abstract: Our interest in this paper is in meeting a rapidly growing industrial demand for information extraction from images of documents such as invoices, bills, receipts etc. In practice users are able to provide a very small number of example images labeled with the information that needs to be extracted. We adopt a novel two-level neuro-deductive approach where (a) we use pre-trained deep neural netwo…

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: 11 pages, appears in the 13th International Workshop on Neural-Symbolic Learning and Reasoning at IJCAI 2019

  12. arXiv:1901.11191  [pdf, ps, other]

    math.OA math.GT

    On a presentation of the spin planar algebra

    Authors: Vijay Kodiyalam, Sohan Lal Saini, Sruthymurali, V. S. Sunder

    Abstract: We define a certain abstract planar algebra by generators and relations, study various aspects of its structure, and then identify it with Jones' spin planar algebra.

    Submitted 30 January, 2019; originally announced January 2019.

    Comments: 11 pages, 12 figures

    MSC Class: 46L37

  13. arXiv:1809.07066  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Prosocial or Selfish? Agents with different behaviors for Contract Negotiation using Reinforcement Learning

    Authors: Vishal Sunder, Lovekesh Vig, Arnab Chatterjee, Gautam Shroff

    Abstract: We present an effective technique for training deep learning agents capable of negotiating on a set of clauses in a contract agreement using a simple communication protocol. We use Multi Agent Reinforcement Learning to train both agents simultaneously as they negotiate with each other in the training environment. We also model selfish and prosocial behavior to varying degrees in these agents. Empi…

    Submitted 19 September, 2018; originally announced September 2018.

    Comments: Proceedings of the 11th International Workshop on Automated Negotiations (held in conjunction with IJCAI 2018)

  14. arXiv:1804.01000  [pdf, other]

    cs.CL cs.AI

    CIKM AnalytiCup 2017 Lazada Product Title Quality Challenge An Ensemble of Deep and Shallow Learning to predict the Quality of Product Titles

    Authors: Karamjit Singh, Vishal Sunder

    Abstract: We present an approach where two different models (Deep and Shallow) are trained separately on the data and a weighted average of the outputs is taken as the final result. For the Deep approach, we use different combinations of models like Convolution Neural Network, pretrained word2vec embeddings and LSTMs to get representations which are then used to train a Deep Neural Network. For Clarity pred…

    Submitted 1 April, 2018; originally announced April 2018.

  15. On a tensor-analogue of the Schur product

    Authors: K. Sumesh, V. S. Sunder

    Abstract: We consider the tensorial Schur product $R \circ^\otimes S = [r_{ij} \otimes s_{ij}]$ for $R \in M_n(\mathcal{A}), S\in M_n(\mathcal{B}),$ with $\mathcal{A}, \mathcal{B}$ unital $C^*$-algebras, verify that such a `tensorial Schur product' of positive operators is again positive, and then use this fact to prove (an apparently marginally more general version of) the classical result of Choi that a l…

    Submitted 14 October, 2015; v1 submitted 16 September, 2015; originally announced September 2015.

    Comments: Corrected typos, The final publication (with marginal changes) is available at Springer via http://dx.doi.org/[10.1007/s11117-015-0377-x]

  16. arXiv:1410.7188  [pdf, ps, other]

    quant-ph math-ph math.OA math.QA

    The Functional Analysis of Quantum Information Theory

    Authors: Ved Prakash Gupta, Prabha Mandayam, V. S. Sunder

    Abstract: This book is a compilation of notes from a two-week international workshop on "The Functional Analysis of Quantum Information Theory" that was held at the Institute of Mathematical Sciences during 26/12/2011-06/01/2012. The workshop was devoted to the mathematical framework of quantized functional analysis (QFA), and aimed at illustrating its applications to problems in quantum communication.…

    Submitted 28 April, 2015; v1 submitted 27 October, 2014; originally announced October 2014.

    Comments: v3; 123 pages; 4 chapters; To appear in Springer Lecture Notes in Physics; Video recordings of the lectures can be found here: https://www.youtube.com/playlist?list=PLD3E479AB374A718F&spfreload=10

  17. arXiv:1211.2576  [pdf, ps, other]

    math.OA

    Extendable endomorphisms on factors

    Authors: Panchugopal Bikram, Masaki Izumi, R. Srinivasan, V. S. Sunder

    Abstract: We begin this note with a von Neumann algebraic version of the elementary but extremely useful fact about being able to extend inner-product preserving maps from a total set of the domain Hilbert space to an isometry defined on the entire domain. This leads us to the notion of when `good' endomorphisms of a factorial probability space $(M,φ)$ (which we call equi-modular) admit a natural extension…

    Submitted 11 October, 2013; v1 submitted 12 November, 2012; originally announced November 2012.

    Comments: 26 pages. New co-author (Izumi) added in view of his contributions

    MSC Class: 46L55

  18. arXiv:1210.7581  [pdf, ps, other]

    math.OA math.PR

    Continuous minimax theorems

    Authors: Madhushree Basu, V. S. Sunder

    Abstract: In classical matrix theory, there exist useful extremal characterizations of eigenvalues and their sums for Hermitian matrices (due to Ky Fan, Courant-Fischer-Weyl and Wielandt) and some consequences such as the majorization assertion in Lidskii's theorem. In this paper, we extend these results to the context of self adjoint elements of finite von Neumann algebras, and their distribution and quant…

    Submitted 11 November, 2013; v1 submitted 29 October, 2012; originally announced October 2012.

    MSC Class: 46L10; 60B11; 34L15

  19. arXiv:1102.4663  [pdf, ps, other]

    math.QA math.OA

    Hilbert von Neumann modules

    Authors: Panchugopal Bikram, Kunal Mukherjee, R. Srinivasan, V. S. Sunder

    Abstract: We introduce a way of regarding Hilbert von Neumann modules as spaces of operators between Hilbert spaces, not unlike [Skei], but in an apparently much simpler manner and involving far less machinery. We verify that our definition is equivalent to that of [Skei], by verifying the `Riesz lemma' or what is called `self-duality' in [Skei]. An advantage with our approach is that we can totally side-ste…

    Submitted 23 February, 2011; originally announced February 2011.

    Comments: 20 pages

    MSC Class: 46L10

  20. arXiv:1102.4413  [pdf, ps, other]

    math.OA math.FA

    From graphs to free products

    Authors: Madhushree Basu, Vijay Kodiyalam, V. S. Sunder

    Abstract: We investigate a construction which associates a finite von Neumann algebra $M(Γ,μ)$ to a finite weighted graph $(Γ,μ)$. Pleasantly, but not surprisingly, the von Neumann algebra associated to a `flower with $n$ petals' is the group von Neumann algebra of the free group on $n$ generators. In general, the algebra $M(Γ,μ)$ is a free product, with amalgamation over a finite-dimensional abelian sub…

    Submitted 22 February, 2011; originally announced February 2011.

    Comments: 14 pages, 1 figure

    MSC Class: 46L54

  21. arXiv:0911.2047  [pdf, ps, other]

    math.OA math.FA

    On the Guionnet-Jones-Shlyakhtenko construction for graphs

    Authors: Vijay Kodiyalam, V. S. Sunder

    Abstract: Using an analogue of the Guionnet-Jones-Shlyakhtenko construction for graphs we show that their construction applied to any subfactor planar algebra of finite depth yields an inclusion of interpolated free group factors with finite parameter, thereby giving another proof of their universality for finite depth planar algebras.

    Submitted 24 March, 2010; v1 submitted 11 November, 2009; originally announced November 2009.

    Comments: 42 pages, 12 figures. v2 has updated references and minor changes. v3 corrects some typos.

    MSC Class: 46L37; 46L54

  22. arXiv:0901.3180  [pdf, ps, other]

    math.OA math.FA

    Guionnet-Jones-Shlyakhtenko subfactors associated to finite-dimensional Kac algebras

    Authors: Vijay Kodiyalam, V. S. Sunder

    Abstract: We analyse the Guionnet-Jones-Shlyakhtenko construction for the planar algebra associated to a finite-dimensional Kac algebra and identify the factors that arise as finite interpolated free group factors.

    Submitted 10 March, 2009; v1 submitted 20 January, 2009; originally announced January 2009.

    Comments: 18 pages, 21 figures, corrected typos

  23. arXiv:0807.3704  [pdf, ps, other]

    math.OA

    From subfactor planar algebras to subfactors

    Authors: Vijay Kodiyalam, V. S. Sunder

    Abstract: We present a purely planar algebraic proof of the main result of a paper of Guionnet-Jones-Shlyakhtenko which constructs an extremal subfactor from a subfactor planar algebra whose standard invariant is given by that planar algebra.

    Submitted 23 July, 2008; originally announced July 2008.

    Comments: 22 pages, 25 figures

    MSC Class: 46L37

  24. arXiv:math/0509302  [pdf, ps, other]

    math.QA math.OA

    Planar algebras and Kuperberg's 3-manifold invariant

    Authors: Vijay Kodiyalam, V. S. Sunder

    Abstract: We recapture Kuperberg's numerical invariant of 3-manifolds associated to a semisimple and cosemisimple Hopf algebra through a `planar algebra construction'. A result of possibly independent interest, used during the proof, which relates duality in planar graphs and Hopf algebras, is the subject of a final section.

    Submitted 14 September, 2005; originally announced September 2005.

    Comments: 19 pages, 9 figures

    MSC Class: 57M27;16W30

  25. Subfactors and 1+1-dimensional TQFTs

    Authors: Vijay Kodiyalam, Vishwambhar Pati, V. S. Sunder

    Abstract: We construct a certain `cobordism category' ${\cal D}$ whose morphisms are suitably decorated cobordism classes between similarly decorated closed oriented 1-manifolds, and show that there is essentially a bijection between (1+1-dimensional) unitary topological quantum field theories (TQFTs) defined on ${\cal D}$, on the one hand, and Jones' subfactor planar algebras, on the other.

    Submitted 4 July, 2005; originally announced July 2005.

    Comments: 57 pages, 9 figures

    MSC Class: 46L37

    Journal ref: Int.J.Math.18:69-112,2007

  26. arXiv:math/0506153  [pdf, ps, other]

    math.QA

    The planar algebra of a semisimple and cosemisimple Hopf algebra

    Authors: Vijay Kodiyalam, V. S. Sunder

    Abstract: To a semisimple and cosemisimple Hopf algebra over an algebraically closed field, we associate a planar algebra defined by generators and relations and show that it is a connected, irreducible, spherical, non-degenerate planar algebra with non-zero modulus and of depth two. This association is shown to yield a bijection between (the isomorphism classes, on both sides, of) such objects.

    Submitted 20 June, 2005; v1 submitted 9 June, 2005; originally announced June 2005.

    Comments: 16 pages, 20 figures; content added

    MSC Class: 16W30; 46L37