Showing 1–7 of 7 results for author: Haws, D

Searching in archive cs.
  1. arXiv:2309.11210  [pdf, other]

    eess.AS cs.CL cs.SD

    Speak While You Think: Streaming Speech Synthesis During Text Generation

    Authors: Avihu Dekel, Slava Shechtman, Raul Fernandez, David Haws, Zvi Kons, Ron Hoory

    Abstract: Large Language Models (LLMs) demonstrate impressive capabilities, yet interaction with these models is mostly facilitated through text. Using Text-To-Speech to synthesize LLM outputs typically results in notable latency, which is impractical for fluent voice conversations. We propose LLM2Speech, an architecture to synthesize speech while text is being generated by an LLM which yields significant l…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Under review for ICASSP 2024

  2. arXiv:2208.01818  [pdf, other]

    cs.SD cs.CL eess.AS

    VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

    Authors: Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury

    Abstract: Beam search, which is the dominant ASR decoding algorithm for end-to-end models, generates tree-structured hypotheses. However, recent studies have shown that decoding with hypothesis merging can achieve a more efficient search with comparable or better performance. But, the full context in recurrent networks is not compatible with hypothesis merging. We propose to use vector-quantized long short-…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: Interspeech 2022 accepted paper

  3. arXiv:2207.12262  [pdf, other]

    eess.AS cs.SD

    Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis

    Authors: Raul Fernandez, David Haws, Guy Lorberbom, Slava Shechtman, Alexander Sorin

    Abstract: Sequence-to-Sequence Text-to-Speech architectures that directly generate low level acoustic features from phonetic sequences are known to produce natural and expressive speech when provided with adequate amounts of training data. Such systems can learn and transfer desired speaking styles from one seen speaker to another (in multi-style multi-speaker settings), which is highly desirable for creati…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted for presentation at Interspeech 2022

  4. arXiv:2108.10803  [pdf, ps, other]

    cs.CL cs.AI cs.SD eess.AS

    Reducing Exposure Bias in Training Recurrent Neural Network Transducers

    Authors: Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltan Tuske

    Abstract: When recurrent neural network transducers (RNNTs) are trained using the typical maximum likelihood criterion, the prediction network is trained only on ground truth label sequences. This leads to a mismatch during inference, known as exposure bias, when the model must deal with label sequences containing errors. In this paper we investigate approaches to reducing exposure bias in training to impro…

    Submitted 24 August, 2021; originally announced August 2021.

    Comments: accepted to Interspeech 2021

  5. arXiv:1310.1659  [pdf, ps, other]

    cs.LG cs.CE

    MINT: Mutual Information based Transductive Feature Selection for Genetic Trait Prediction

    Authors: Dan He, Irina Rish, David Haws, Simon Teyssedre, Zivan Karaman, Laxmi Parida

    Abstract: Whole genome prediction of complex phenotypic traits using high-density genotyping arrays has attracted a great deal of attention, as it is relevant to the fields of plant and animal breeding and genetic epidemiology. As the number of genotypes is generally much bigger than the number of samples, predictive models suffer from the curse-of-dimensionality. The curse-of-dimensionality problem not onl…

    Submitted 6 October, 2013; originally announced October 2013.

  6. arXiv:1310.1649  [pdf, other]

    cs.DS

    QuickLexSort: An efficient algorithm for lexicographically sorting nested restrictions of a database

    Authors: David Haws

    Abstract: Lexicographical sorting is a fundamental problem with applications to contingency tables, databases, Bayesian networks, and more. A standard method to lexicographically sort general data is to iteratively use a stable sort -- a sort which preserves existing orders. Here we present a new method of lexicographical sorting called QuickLexSort. Whereas a stable sort based lexicographical sorting algor…

    Submitted 6 October, 2013; originally announced October 2013.

    Comments: 17 pages, 1 figure

    MSC Class: 68Q25; 68P10; 62H17

    ACM Class: F.2.2; G.3

  7. arXiv:0911.0645  [pdf, ps, other]

    q-bio.PE cs.LG q-bio.QM

    Bayes estimators for phylogenetic reconstruction

    Authors: Peter Huggins, Wenbin Li, David Haws, Thomas Friedrich, Jinze Liu, Ruriko Yoshida

    Abstract: Tree reconstruction methods are often judged by their accuracy, measured by how close they get to the true tree. Yet most reconstruction methods like ML do not explicitly maximize this accuracy. To address this problem, we propose a Bayesian solution. Given tree samples, we propose finding the tree estimate which is closest on average to the samples. This "median" tree is known as the Bayes es…

    Submitted 21 November, 2009; v1 submitted 3 November, 2009; originally announced November 2009.

    Comments: 31 pages, 4 figures, and 3 tables
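
Note on result 6: the abstract describes the standard baseline for lexicographic sorting as iterating a stable sort. The Python sketch below illustrates that baseline only; it is not QuickLexSort, whose details are cut off in the truncated abstract. The function name `lex_sort_by_stable_passes` and the sample rows are hypothetical, chosen for illustration.

```python
from operator import itemgetter


def lex_sort_by_stable_passes(rows, key_order):
    """Sort rows lexicographically by the columns in key_order.

    key_order lists column indices from most to least significant.
    Illustrative baseline only (assumed helper, not from the paper).
    """
    result = list(rows)
    # Sort on the least significant column first. Each later pass is
    # stable, so ties keep the order set by the earlier passes.
    for col in reversed(key_order):
        result.sort(key=itemgetter(col))  # list.sort is guaranteed stable
    return result


rows = [(2, "b", 1), (1, "b", 0), (2, "a", 1), (1, "a", 5)]
print(lex_sort_by_stable_passes(rows, [0, 1]))
```

Each pass costs a full sort of the data, so sorting on k columns takes k passes over all rows.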