
Showing 1–4 of 4 results for author: Morency, L

Searching in archive eess.
  1. arXiv:2305.13583

    cs.CL cs.MM eess.AS eess.IV

    Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition

    Authors: Yaoting Wang, Yuanchao Li, Paul Pu Liang, Louis-Philippe Morency, Peter Bell, Catherine Lai

    Abstract: Fusing multiple modalities has proven effective for multimodal information processing. However, the incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition. In this study, we first analyze how the salient affective information in one modality can be affected by the other, and demonstrate that inter-modal incongruity exists latently in crossmodal att…

    Submitted 12 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: *First two authors contributed equally

  2. arXiv:2101.08919

    eess.AS cs.CR cs.SD

    Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks

    Authors: Peter Wu, Paul Pu Liang, Jiatong Shi, Ruslan Salakhutdinov, Shinji Watanabe, Louis-Philippe Morency

    Abstract: As users increasingly rely on cloud-based computing services, it is important to ensure that uploaded speech data remains private. Existing solutions rely either on server-side methods or focus on hiding speaker identity. While these approaches reduce certain security concerns, they do not give users client-side control over whether their biometric information is sent to the server. In this paper,…

    Submitted 22 October, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

  3. arXiv:1911.09783

    cs.LG cs.SD eess.AS stat.ML

    WildMix Dataset and Spectro-Temporal Transformer Model for Monoaural Audio Source Separation

    Authors: Amir Zadeh, Tianjun Ma, Soujanya Poria, Louis-Philippe Morency

    Abstract: Monoaural audio source separation is a challenging research area in machine learning. In this area, a mixture containing multiple audio sources is given, and a model is expected to disentangle the mixture into isolated atomic sources. In this paper, we first introduce a challenging new dataset for monoaural source separation called WildMix. WildMix is designed with the goal of extending the bounda…

    Submitted 21 November, 2019; originally announced November 2019.

  4. arXiv:1906.02125

    cs.CL cs.AI cs.LG cs.SD eess.AS stat.ML

    Strong and Simple Baselines for Multimodal Utterance Embeddings

    Authors: Paul Pu Liang, Yao Chong Lim, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Louis-Philippe Morency

    Abstract: Human language is a rich multimodal signal consisting of spoken words, facial expressions, body gestures, and vocal intonations. Learning representations for these spoken utterances is a complex research problem due to the presence of multiple heterogeneous sources of information. Recent advances in multimodal learning have followed the general trend of building more complex models that utilize va…

    Submitted 28 February, 2020; v1 submitted 14 May, 2019; originally announced June 2019.

    Comments: NAACL 2019 oral presentation