Showing 1–11 of 11 results for author: Sikder, S

  1. arXiv:2505.10885 [pdf, other]

    cs.SD cs.AI eess.AS

    BanglaFake: Constructing and Evaluating a Specialized Bengali Deepfake Audio Dataset

    Authors: Istiaq Ahmed Fahad, Kamruzzaman Asif, Sifat Sikder

    Abstract: Deepfake audio detection is challenging for low-resource languages like Bengali due to limited datasets and subtle acoustic features. To address this, we introduce BanglaFake, a Bengali Deepfake Audio Dataset with 12,260 real and 13,260 deepfake utterances. Synthetic speech is generated using state-of-the-art Text-to-Speech (TTS) models, ensuring high naturalness and quality. We evaluate the dataset through b…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: 5 pages

  2. arXiv:2504.07117 [pdf, other]

    q-bio.TO cs.AI

    RP-SAM2: Refining Point Prompts for Stable Surgical Instrument Segmentation

    Authors: Nuren Zhaksylyk, Ibrahim Almakky, Jay Paranjape, S. Swaroop Vedula, Shameema Sikder, Vishal M. Patel, Mohammad Yaqub

    Abstract: Accurate surgical instrument segmentation is essential in cataract surgery for tasks such as skill assessment and workflow optimization. However, limited annotated data makes it difficult to develop fully automatic models. Prompt-based methods like SAM2 offer flexibility yet remain highly sensitive to the point prompt placement, often leading to inconsistent segmentations. We address this issue by…

    Submitted 25 March, 2025; originally announced April 2025.

  3. arXiv:2410.24181 [pdf, other]

    cs.CV

    Federated Black-Box Adaptation for Semantic Segmentation

    Authors: Jay N. Paranjape, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

    Abstract: Federated Learning (FL) is a form of distributed learning that allows multiple institutions or clients to collaboratively learn a global model to solve a task. This allows the model to utilize the information from every institute while preserving data privacy. However, recent studies show that the promise of protecting the privacy of data is not upheld by existing methods and that it is possible t…

    Submitted 31 October, 2024; originally announced October 2024.

    Comments: Accepted at NeurIPS 2024

  4. arXiv:2408.06447 [pdf, other]

    cs.CV

    S-SAM: SVD-based Fine-Tuning of Segment Anything Model for Medical Image Segmentation

    Authors: Jay N. Paranjape, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

    Abstract: Medical image segmentation has been traditionally approached by training or fine-tuning the entire model to cater to any new modality or dataset. However, this approach often requires tuning a large number of parameters during training. With the introduction of the Segment Anything Model (SAM) for prompted segmentation of natural images, many efforts have been made towards adapting it efficiently…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: Accepted in MICCAI 2024

  5. arXiv:2405.10913 [pdf, other]

    cs.CV

    Blackbox Adaptation for Medical Image Segmentation

    Authors: Jay N. Paranjape, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

    Abstract: In recent years, various large foundation models have been proposed for image segmentation. These models are often trained on large amounts of data corresponding to general computer vision tasks. Hence, these models do not perform well on medical data. There have been some attempts in the literature to perform parameter-efficient finetuning of such foundation models for medical image segmentation.…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted early at MICCAI 2024

  6. arXiv:2309.00848 [pdf, other]

    cs.CV cs.LG

    Bengali Document Layout Analysis -- A YOLOV8 Based Ensembling Approach

    Authors: Nazmus Sakib Ahmed, Saad Sakib Noor, Ashraful Islam Shanto Sikder, Abhijit Paul

    Abstract: This paper focuses on enhancing Bengali Document Layout Analysis (DLA) using the YOLOv8 model and innovative post-processing techniques. We tackle challenges unique to the complex Bengali script by employing data augmentation for model robustness. After meticulous validation set evaluation, we fine-tune our approach on the complete dataset, leading to a two-stage prediction strategy for accurate e…

    Submitted 16 September, 2024; v1 submitted 2 September, 2023; originally announced September 2023.

  7. arXiv:2308.15822 [pdf]

    eess.IV cs.CV

    AMDNet23: A combined deep Contour-based Convolutional Neural Network and Long Short Term Memory system to diagnose Age-related Macular Degeneration

    Authors: Md. Aiyub Ali, Md. Shakhawat Hossain, Md. Kawar Hossain, Subhadra Soumi Sikder, Sharun Akter Khushbu, Mirajul Islam

    Abstract: In light of the expanding population, an automated framework for disease detection can assist doctors in the diagnosis of ocular diseases, yield accurate, stable, and rapid outcomes, and improve the success rate of early detection. The work first enhances the quality of fundus images by employing an adaptive contrast enhancement algorithm (CLAHE) and Gamma correction. In the preproc…

    Submitted 30 August, 2023; originally announced August 2023.

    Report number: ISWA-D-23-00333

  8. arXiv:2308.04035 [pdf, other]

    cs.CV

    Cross-Dataset Adaptation for Instrument Classification in Cataract Surgery Videos

    Authors: Jay N. Paranjape, Shameema Sikder, Vishal M. Patel, S. Swaroop Vedula

    Abstract: Surgical tool presence detection is an important part of the intra-operative and post-operative analysis of a surgery. However, state-of-the-art models that perform this task well on a particular dataset perform poorly when tested on another dataset. This occurs due to a significant domain shift between the datasets resulting from the use of different tools, sensors, data resolution, etc. In thi…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: MICCAI 2023

  9. arXiv:2308.03726 [pdf, other]

    cs.CV

    AdaptiveSAM: Towards Efficient Tuning of SAM for Surgical Scene Segmentation

    Authors: Jay N. Paranjape, Nithin Gopalakrishnan Nair, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

    Abstract: Segmentation is a fundamental problem in surgical scene analysis using artificial intelligence. However, the inherent data scarcity in this domain makes it challenging to adapt traditional segmentation techniques for this task. To tackle this issue, current research employs pretrained models and finetunes them on the given data. Even so, these require training deep networks with millions of parame…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 10 pages, 6 figures, 5 tables

  10. arXiv:2307.11081 [pdf, other]

    cs.CV cs.LG

    GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos

    Authors: Nisarg A. Shah, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

    Abstract: Automated surgical step recognition is an important task that can significantly improve patient safety and decision-making during surgeries. Existing state-of-the-art methods for surgical step recognition either rely on separate, multi-stage modeling of spatial and temporal information or operate on short-range temporal resolution when learned jointly. However, the benefits of joint modeling of sp…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted to MICCAI 2023 (Early Accept)

  11. arXiv:2205.06416 [pdf, other]

    cs.CV

    Video-based assessment of intraoperative surgical skill

    Authors: Sanchit Hira, Digvijay Singh, Tae Soo Kim, Shobhit Gupta, Gregory Hager, Shameema Sikder, S. Swaroop Vedula

    Abstract: Purpose: The objective of this investigation is to provide a comprehensive analysis of state-of-the-art methods for video-based assessment of surgical skill in the operating room. Methods: Using a data set of 99 videos of capsulorhexis, a critical step in cataract surgery, we evaluate feature-based methods previously developed for surgical skill assessment mostly under benchtop settings. In additi…

    Submitted 12 May, 2022; originally announced May 2022.