Showing 1–7 of 7 results for author: Turcan, E

Searching in archive cs.
  1. arXiv:2407.12196

    cs.CL

    MASIVE: Open-Ended Affective State Identification in English and Spanish

    Authors: Nicholas Deas, Elsbeth Turcan, Iván Pérez Mejía, Kathleen McKeown

    Abstract: In the field of emotion analysis, much NLP research focuses on identifying a limited number of discrete emotion categories, often applied across languages. These basic sets, however, are rarely designed with textual data in mind, and culture, language, and dialect can influence how particular emotions are interpreted. In this work, we broaden our scope to a practically unbounded set of aff…

    Submitted 12 November, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: EMNLP 2024

  2. arXiv:2406.05190

    cs.LG cs.AI cs.CL

    Evaluating the Effectiveness of Data Augmentation for Emotion Classification in Low-Resource Settings

    Authors: Aashish Arora, Elsbeth Turcan

    Abstract: Data augmentation has the potential to improve the performance of machine learning models by increasing the amount of training data available. In this study, we evaluated the effectiveness of different data augmentation techniques for a multi-label emotion classification task using a low-resource dataset. Our results showed that Back Translation outperformed autoencoder-based approaches and that g…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: The first author contributed significantly

  3. arXiv:2305.14291

    cs.CL

    Evaluation of African American Language Bias in Natural Language Generation

    Authors: Nicholas Deas, Jessi Grieser, Shana Kleiner, Desmond Patton, Elsbeth Turcan, Kathleen McKeown

    Abstract: We evaluate how well LLMs understand African American Language (AAL) in comparison to their performance on White Mainstream English (WME), the encouraged "standard" form of English taught in American classrooms. We measure LLM performance using automatic metrics and human judgments for two tasks: a counterpart generation task, where a model generates AAL (or WME) given WME (or AAL), and a masked s…

    Submitted 12 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 Camera-Ready

  4. arXiv:2106.09790

    cs.CL cs.AI

    Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction

    Authors: Elsbeth Turcan, Shuai Wang, Rishita Anubhai, Kasturi Bhattacharjee, Yaser Al-Onaizan, Smaranda Muresan

    Abstract: Detecting what emotions are expressed in text is a well-studied problem in natural language processing. However, research on finer grained emotion analysis such as what causes an emotion is still in its infancy. We present solutions that tackle both emotion recognition and emotion cause detection in a joint fashion. Considering that common-sense knowledge plays an important role in understanding i…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: 15 pages, 6 figures. Findings of ACL 2021

  5. arXiv:2104.07868

    cs.CL

    Segmenting Subtitles for Correcting ASR Segmentation Errors

    Authors: David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuščáková, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown

    Abstract: Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation. In this work, we propose a model for correcting the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks.…

    Submitted 15 April, 2021; originally announced April 2021.

  6. arXiv:2010.09693

    cs.CL

    Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines

    Authors: David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, Kathleen McKeown

    Abstract: In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation. ASR output segmentation is crucial, as ASR systems segment the input audio using purely acoustic information and are not guaranteed to output sentence-like segments. Since most MT systems expect sentences as input, feeding in longer unsegmented passages can lead to sub-op…

    Submitted 19 October, 2020; originally announced October 2020.

    Journal ref: CLSST@LREC 2020 68-73

  7. arXiv:1911.00133

    cs.CL

    Dreaddit: A Reddit Dataset for Stress Analysis in Social Media

    Authors: Elsbeth Turcan, Kathleen McKeown

    Abstract: Stress is a nigh-universal human experience, particularly in the online world. While stress can be a motivator, too much stress is associated with many negative health outcomes, making its identification useful across a range of domains. However, existing computational research typically only studies stress in domains such as speech, or in short genres such as Twitter. We present Dreaddit, a new t…

    Submitted 31 October, 2019; originally announced November 2019.

    Comments: To appear in the proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis