
Showing 1–14 of 14 results for author: Shirani, S

Searching in archive eess.
  1. arXiv:2502.05757  [pdf]

    cs.SD eess.AS eess.SP

    Large Language Model-based Nonnegative Matrix Factorization For Cardiorespiratory Sound Separation

    Authors: Yasaman Torabi, Shahram Shirani, James P. Reilly

    Abstract: This study represents the first integration of large language models (LLMs) with non-negative matrix factorization (NMF), marking a novel advancement in the source separation field. The LLM is employed in two unique ways: enhancing the separation results by providing detailed insights for disease prediction and operating in a feedback loop to optimize a fundamental frequency penalty added to the N…

    Submitted 8 February, 2025; originally announced February 2025.

  2. arXiv:2411.14663  [pdf]

    eess.IV cs.CV

    BrightVAE: Luminosity Enhancement in Underexposed Endoscopic Images

    Authors: Farzaneh Koohestani, Zahra Nabizadeh, Nader Karimi, Shahram Shirani, Shadrokh Samavi

    Abstract: The enhancement of image luminosity is especially critical in endoscopic images. Underexposed endoscopic images often suffer from reduced contrast and uneven brightness, significantly impacting diagnostic accuracy and treatment planning. Internal body imaging is challenging due to uneven lighting and shadowy regions. Enhancing such images is essential since precise image interpretation is crucial…

    Submitted 21 November, 2024; originally announced November 2024.

    Comments: 18 pages, 6 figures

  3. arXiv:2410.03280  [pdf]

    eess.AS cs.AI cs.LG eess.SP

    Manikin-Recorded Cardiopulmonary Sounds Dataset Using Digital Stethoscope

    Authors: Yasaman Torabi, Shahram Shirani, James P. Reilly

    Abstract: Heart and lung sounds are crucial for healthcare monitoring. Recent improvements in stethoscope technology have made it possible to capture patient sounds with enhanced precision. In this dataset, we used a digital stethoscope to capture both heart and lung sounds, including individual and mixed recordings. To our knowledge, this is the first dataset to offer both separate and mixed cardiorespirat…

    Submitted 4 October, 2024; originally announced October 2024.

  4. arXiv:2406.12432  [pdf]

    eess.SP cs.AI cs.LG eess.AS

    MEMS and ECM Sensor Technologies for Cardiorespiratory Sound Monitoring - A Comprehensive Review

    Authors: Yasaman Torabi, Shahram Shirani, James P. Reilly, Gail M Gauvreau

    Abstract: This paper presents a comprehensive review of cardiorespiratory auscultation sensing devices (i.e., stethoscopes), which is useful for understanding the theoretical aspects and practical design notes. In this paper, we first introduce the acoustic properties of the heart and lungs, as well as a brief history of stethoscope evolution. Then, we discuss the basic concept of electret condenser microph…

    Submitted 14 February, 2025; v1 submitted 18 June, 2024; originally announced June 2024.

    Journal ref: Sensors, Vol. 24, Issue 21, Page 7036, 2024

  5. arXiv:2406.01321  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    Sequence-to-Sequence Multi-Modal Speech In-Painting

    Authors: Mahsa Kadkhodaei Elyaderani, Shahram Shirani

    Abstract: Speech in-painting is the task of regenerating missing audio contents using reliable context information. Despite various recent studies in multi-modal perception of audio in-painting, there is still a need for an effective infusion of visual and auditory information in speech in-painting. In this paper, we introduce a novel sequence-to-sequence model that leverages the visual information to in-pa…

    Submitted 3 June, 2024; originally announced June 2024.

  6. arXiv:2406.00901  [pdf, other]

    cs.MM cs.AI cs.LG cs.SD eess.AS

    Robust Multi-Modal Speech In-Painting: A Sequence-to-Sequence Approach

    Authors: Mahsa Kadkhodaei Elyaderani, Shahram Shirani

    Abstract: The process of reconstructing missing parts of speech audio from context is called speech in-painting. Human perception of speech is inherently multi-modal, involving both audio and visual (AV) cues. In this paper, we introduce and study a sequence-to-sequence (seq2seq) speech in-painting model that incorporates AV features. Our approach extends AV speech in-painting techniques to scenarios where…

    Submitted 2 June, 2024; originally announced June 2024.

  7. arXiv:2404.09029  [pdf]

    cs.MM cs.IT cs.LG eess.IV

    A Parametric Rate-Distortion Model for Video Transcoding

    Authors: Maedeh Jamali, Nader Karimi, Shadrokh Samavi, Shahram Shirani

    Abstract: Over the past two decades, the surge in video streaming applications has been fueled by the increasing accessibility of the internet and the growing demand for network video. As users with varying internet speeds and devices seek high-quality video, transcoding becomes essential for service providers. In this paper, we introduce a parametric rate-distortion (R-D) transcoding model. Our model excel…

    Submitted 13 April, 2024; originally announced April 2024.

  8. arXiv:2306.07383  [pdf]

    cs.CV eess.IV

    Supervised Deep Learning for Content-Aware Image Retargeting with Fourier Convolutions

    Authors: MohammadHossein Givkashi, MohammadReza Naderi, Nader Karimi, Shahram Shirani, Shadrokh Samavi

    Abstract: Image retargeting aims to alter the size of the image with attention to the contents. One of the main obstacles to training deep learning models for image retargeting is the need for a vast labeled dataset. Labeled datasets are unavailable for training deep learning models in the image retargeting tasks. As a result, we present a new supervised approach for training deep learning models. We use th…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 18 pages, 5 figures

  9. arXiv:2305.01889  [pdf]

    eess.SP

    A New Non-Negative Matrix Factorization Approach for Blind Source Separation of Cardiovascular and Respiratory Sound Based on the Periodicity of Heart and Lung Function

    Authors: Yasaman Torabi, Shahram Shirani, James P. Reilly

    Abstract: Auscultation provides a rich diversity of information to diagnose cardiovascular and respiratory diseases. However, sound auscultation is challenging due to noise. In this study, a modified version of the affine non-negative matrix factorization (NMF) approach is proposed to blindly separate lung and heart sounds recorded by a digital stethoscope. This method applies a novel NMF algorithm, which e…

    Submitted 25 February, 2025; v1 submitted 3 May, 2023; originally announced May 2023.

  10. arXiv:2303.06736  [pdf]

    eess.IV cs.CV

    Endoscopy Classification Model Using Swin Transformer and Saliency Map

    Authors: Zahra Sobhaninia, Nasrin Abharian, Nader Karimi, Shahram Shirani, Shadrokh Samavi

    Abstract: Endoscopy is a valuable tool for the early diagnosis of colon cancer. However, it requires the expertise of endoscopists and is a time-consuming process. In this work, we propose a new multi-label classification method, which considers two aspects of learning approaches (local and global views) for endoscopic image classification. The model consists of a Swin transformer branch and a modified VGG1…

    Submitted 12 March, 2023; originally announced March 2023.

    Comments: 5 pages, 3 figures

  11. arXiv:2206.09977  [pdf, ps, other]

    cs.LG cs.AI eess.SY math.DS math.OC

    Analysis of Thompson Sampling for Controlling Unknown Linear Diffusion Processes

    Authors: Mohamad Kazem Shirani Faradonbeh, Sadegh Shirani, Mohsen Bayati

    Abstract: Linear diffusion processes serve as canonical continuous-time models for dynamic decision-making under uncertainty. These systems evolve according to drift matrices that specify the instantaneous rates of change in the expected system state, while also experiencing continuous random disturbances modeled by Brownian noise. For instance, in medical applications such as artificial pancreas systems, t…

    Submitted 7 June, 2025; v1 submitted 20 June, 2022; originally announced June 2022.

  12. arXiv:2109.05614  [pdf]

    cs.CV eess.IV

    MSGDD-cGAN: Multi-Scale Gradients Dual Discriminator Conditional Generative Adversarial Network

    Authors: Mohammadreza Naderi, Zahra Nabizadeh, Nader Karimi, Shahram Shirani, Shadrokh Samavi

    Abstract: Conditional Generative Adversarial Networks (cGANs) have been used in many image processing tasks. However, they still have serious problems maintaining the balance between conditioning the output on the input and creating the output with the desired distribution based on the corresponding ground truth. The traditional cGANs, similar to most conventional GANs, suffer from vanishing gradients, whic…

    Submitted 12 September, 2021; originally announced September 2021.

    Comments: 8 pages, 6 figures

  13. arXiv:2009.00982  [pdf]

    eess.IV cs.CV

    Classification of Diabetic Retinopathy Using Unlabeled Data and Knowledge Distillation

    Authors: Sajjad Abbasi, Mohsen Hajabdollahi, Pejman Khadivi, Nader Karimi, Roshanak Roshandel, Shahram Shirani, Shadrokh Samavi

    Abstract: Knowledge distillation allows transferring knowledge from a pre-trained model to another. However, it suffers from limitations, and constraints related to the two models need to be architecturally similar. Knowledge distillation addresses some of the shortcomings associated with transfer learning by generalizing a complex model to a lighter model. However, some parts of the knowledge may not be di…

    Submitted 1 September, 2020; originally announced September 2020.

    Comments: 21 pages, 6 figures, 7 tables. arXiv admin note: substantial text overlap with arXiv:2002.03321

  14. arXiv:2004.08690  [pdf]

    eess.IV cs.CV cs.LG

    A fast semi-automatic method for classification and counting the number and types of blood cells in an image

    Authors: Hamed Sadeghi, Shahram Shirani, David W. Capson

    Abstract: A novel and fast semi-automatic method for segmentation, locating and counting blood cells in an image is proposed. In this method, thresholding is used to separate the nucleus from the other parts. We also use Hough transform for circles to locate the center of white cells. Locating and counting of red cells is performed using template matching. We make use of finding local maxima, labeling and m…

    Submitted 18 April, 2020; originally announced April 2020.