Showing 1–3 of 3 results for author: Priyasad, D

  1. arXiv:2009.10991  [pdf, other]

    eess.AS cs.HC cs.LG stat.ML

    Attention Driven Fusion for Multi-Modal Emotion Recognition

    Authors: Darshana Priyasad, Tharindu Fernando, Simon Denman, Clinton Fookes, Sridha Sridharan

    Abstract: Deep learning has emerged as a powerful alternative to hand-crafted methods for emotion recognition on combined acoustic and text modalities. Baseline systems model emotion information in text and acoustic modes independently using Deep Convolutional Neural Networks (DCNN) and Recurrent Neural Networks (RNN), followed by applying attention, fusion, and classification. In this paper, we present a d…

    Submitted 10 October, 2020; v1 submitted 23 September, 2020; originally announced September 2020.

    Comments: An updated version of the ICASSP 2020 paper

  2. arXiv:2007.08076  [pdf, other]

    cs.LG cs.CV stat.ML

    Memory based fusion for multi-modal deep learning

    Authors: Darshana Priyasad, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes

    Abstract: The use of multi-modal data for deep machine learning has shown promise when compared to uni-modal approaches, with fusion of multi-modal features resulting in improved performance in several applications. However, most state-of-the-art methods use naive fusion which processes feature streams independently, ignoring possible long-term dependencies within the data during fusion. In this paper, we pr…

    Submitted 23 October, 2020; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: Pre-print submitted to Information Fusion

  3. arXiv:2004.01546  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Temporarily-Aware Context Modelling using Generative Adversarial Networks for Speech Activity Detection

    Authors: Tharindu Fernando, Sridha Sridharan, Mitchell McLaren, Darshana Priyasad, Simon Denman, Clinton Fookes

    Abstract: This paper presents a novel framework for Speech Activity Detection (SAD). Inspired by the recent success of multi-task learning approaches in the speech processing domain, we propose a novel joint learning framework for SAD. We utilise generative adversarial networks to automatically learn a loss function for joint prediction of the frame-wise speech/non-speech classifications together with the…

    Submitted 1 April, 2020; originally announced April 2020.

    Journal ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020