Showing 1–4 of 4 results for author: Wassie, A K

  1. arXiv:2505.02518

    cs.CL cs.SD eess.AS

    Bemba Speech Translation: Exploring a Low-Resource African Language

    Authors: Muhammad Hazim Al Farouq, Aman Kassahun Wassie, Yasmin Moslem

    Abstract: This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2025), low-resource languages track, namely for Bemba-to-English speech translation. We built cascaded speech translation systems based on Whisper and NLLB-200, and employed data augmentation techniques, such as back-translation. We investigate the effect of using synthetic data and dis…

    Submitted 2 June, 2025; v1 submitted 5 May, 2025; originally announced May 2025.

    Comments: IWSLT 2025

    Journal ref: Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)

  2. arXiv:2412.05862

    cs.CL

    Domain-Specific Translation with Open-Source Large Language Models: Resource-Oriented Analysis

    Authors: Aman Kassahun Wassie, Mahdi Molaei, Yasmin Moslem

    Abstract: In this work, we compare the domain-specific translation performance of open-source autoregressive decoder-only large language models (LLMs) with task-oriented machine translation (MT) models. Our experiments focus on the medical domain and cover four language directions with varied resource availability: English-to-French, English-to-Portuguese, English-to-Swahili, and Swahili-to-English. Despite…

    Submitted 30 May, 2025; v1 submitted 8 December, 2024; originally announced December 2024.

  3. arXiv:2402.08015

    cs.CL

    Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets

    Authors: Israel Abebe Azime, Atnafu Lambebo Tonja, Tadesse Destaw Belay, Mitiku Yohannes Fuge, Aman Kassahun Wassie, Eyasu Shiferaw Jada, Yonas Chanie, Walelign Tewabe Sewunetie, Seid Muhie Yimam

    Abstract: Large language models (LLMs) have received a lot of attention in natural language processing (NLP) research because of their exceptional performance in understanding and generating human languages. However, low-resource languages are left behind due to the unavailability of resources. In this work, we focus on enhancing the LLaMA-2-Amharic model by integrating task-specific and generative datasets…

    Submitted 29 April, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  4. arXiv:2311.14530

    cs.CL

    Machine Translation for Ge'ez Language

    Authors: Aman Kassahun Wassie

    Abstract: Machine translation (MT) for low-resource languages such as Ge'ez, an ancient language that is no longer the native language of any community, faces challenges such as out-of-vocabulary words, domain mismatches, and lack of sufficient labeled training data. In this work, we explore various methods to improve Ge'ez MT, including transfer-learning from related languages, optimizing shared vocabulary…

    Submitted 15 April, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: 8 pages, 1 figure