Showing 1–2 of 2 results for author: Sheshkal, S A

  1. arXiv:2411.13209  [pdf, other]

    cs.SD cs.AI cs.HC eess.AS

    Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis

    Authors: Pegah Salehi, Sajad Amouei Sheshkal, Vajira Thambawita, Sushant Gautam, Saeed S. Sabet, Dag Johansen, Michael A. Riegler, Pål Halvorsen

    Abstract: This paper examines the integration of real-time talking-head generation for interviewer training, focusing on overcoming challenges in Audio Feature Extraction (AFE), which often introduces latency and limits responsiveness in real-time applications. To address these issues, we propose and implement a fully integrated system that replaces conventional AFE models with OpenAI's Whisper, leveraging…

    Submitted 20 November, 2024; originally announced November 2024.

    Comments: 16 pages, 6 figures, 3 tables. Submitted to the MDPI journal Big Data and Cognitive Computing

    MSC Class: 68T45; 68T07; 68T01

  2. SinGAN-Seg: Synthetic training data generation for medical image segmentation

    Authors: Vajira Thambawita, Pegah Salehi, Sajad Amouei Sheshkal, Steven A. Hicks, Hugo L. Hammer, Sravanthi Parasa, Thomas de Lange, Pål Halvorsen, Michael A. Riegler

    Abstract: Analyzing medical data to find abnormalities is a time-consuming and costly task, particularly for rare abnormalities, requiring tremendous efforts from medical experts. Artificial intelligence has become a popular tool for the automatic processing of medical data, acting as a supportive tool for doctors. However, the machine learning models used to build these tools are highly dependent on the da…

    Submitted 25 April, 2022; v1 submitted 29 June, 2021; originally announced July 2021.