Showing 1–2 of 2 results for author: Meghawat, A

Search v0.5.6 released 2020-02-24

arXiv:2106.11292 [pdf, other]

cs.CL cs.LG

doi 10.21437/Interspeech.2021-1767

A Discriminative Entity-Aware Language Model for Virtual Assistants

Authors: Mandana Saebi, Ernest Pusateri, Aaksha Meghawat, Christophe Van Gysel

Abstract: High-quality automatic speech recognition (ASR) is essential for virtual assistants (VAs) to work well. However, ASR often performs poorly on VA requests containing named entities. In this work, we start from the observation that many ASR errors on named entities are inconsistent with real-world knowledge. We extend previous discriminative n-gram language modeling approaches to incorporate real-world knowledge from a Knowledge Graph (KG), using features that capture entity type-entity and entity-entity relationships. We apply our model through an efficient lattice rescoring process, achieving relative sentence error rate reductions of more than 25% on some synthesized test sets covering less popular entities, with minimal degradation on a uniformly sampled VA test set.

Submitted 21 June, 2021; originally announced June 2021.

Comments: To appear in Interspeech 2021

arXiv:1609.05244 [pdf, other]

cs.CL cs.IR

Select-Additive Learning: Improving Generalization in Multimodal Sentiment Analysis

Authors: Haohan Wang, Aaksha Meghawat, Louis-Philippe Morency, Eric P. Xing

Abstract: Multimodal sentiment analysis is drawing an increasing amount of attention these days. It enables mining of opinions in video reviews which are now available aplenty on online platforms. However, multimodal sentiment analysis has only a few high-quality data sets annotated for training machine learning algorithms. These limited resources restrict the generalizability of models, where, for example, the unique characteristics of a few speakers (e.g., wearing glasses) may become a confounding factor for the sentiment classification task. In this paper, we propose a Select-Additive Learning (SAL) procedure that improves the generalizability of trained neural networks for multimodal sentiment analysis. In our experiments, we show that our SAL approach improves prediction accuracy significantly in all three modalities (verbal, acoustic, visual), as well as in their fusion. Our results show that SAL, even when trained on one dataset, achieves good generalization across two new test datasets.

Submitted 12 April, 2017; v1 submitted 16 September, 2016; originally announced September 2016.

Comments: Supplementary files at: http://www.cs.cmu.edu/~haohanw/document/sal_supp.pdf
