Showing 1–24 of 24 results for author: Chanda, S

  1. arXiv:2411.18007

    cs.CV

    AI-Driven Smartphone Solution for Digitizing Rapid Diagnostic Test Kits and Enhancing Accessibility for the Visually Impaired

    Authors: R. B. Dastagir, J. T. Jami, S. Chanda, F. Hafiz, M. Rahman, K. Dey, M. M. Rahman, M. Qureshi, M. M. Chowdhury

    Abstract: Rapid diagnostic tests are crucial for timely disease detection and management, yet accurate interpretation of test results remains challenging. In this study, we propose a novel approach to enhance the accuracy and reliability of rapid diagnostic test result interpretation by integrating artificial intelligence (AI) algorithms, including convolutional neural networks (CNN), within a smartphone-ba…

    Submitted 26 November, 2024; originally announced November 2024.

  2. Static and Dynamic Synthesis of Bengali and Devanagari Signatures

    Authors: Miguel A. Ferrer, Sukalpa Chanda, Moises Diaz, Chayan Kr. Banerjee, Anirban Majumdar, Cristina Carmona-Duarte, Parikshit Acharya, Umapada Pal

    Abstract: Developing an automatic signature verification system is challenging and demands a large number of training samples. This is why synthetic handwriting generation is an emerging topic in document image analysis. Some handwriting synthesizers use the motor equivalence model, the well-established hypothesis from neuroscience, which analyses how a human being accomplishes movement. Specifically, a mot…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted version. Published in IEEE Transactions on Cybernetics [ISSN 2168-2267], v. 48(10), p. 2896-2907

    Journal ref: IEEE Transactions on Cybernetics, v. 48(10), p. 2896-2907, 2018

  3. arXiv:2312.08010

    cs.CV cs.LG

    EZ-CLIP: Efficient Zeroshot Video Action Recognition

    Authors: Shahzad Ahmad, Sukalpa Chanda, Yogesh S Rawat

    Abstract: Recent advancements in large-scale pre-training of visual-language models on paired image-text data have demonstrated impressive generalization capabilities for zero-shot tasks. Building on this success, efforts have been made to adapt these image-based visual-language models, such as CLIP, for videos extending their zero-shot capabilities to the video domain. While these adaptations have shown pr…

    Submitted 19 January, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

  4. arXiv:2311.11662

    cs.CV

    Enhanced Spatio-Temporal Context for Temporally Consistent Robust 3D Human Motion Recovery from Monocular Videos

    Authors: Sushovan Chanda, Amogh Tiwari, Lokender Tiwari, Brojeshwar Bhowmick, Avinash Sharma, Hrishav Barua

    Abstract: Recovering temporally consistent 3D human body pose, shape and motion from a monocular video is a challenging task due to (self-)occlusions, poor lighting conditions, complex articulated body poses, depth ambiguity, and limited availability of annotated data. Further, doing a simple per-frame estimation is insufficient as it leads to jittery and implausible results. In this paper, we propose a nove…

    Submitted 20 November, 2023; originally announced November 2023.

  5. arXiv:2301.10172

    cs.CL cs.LG

    MTTN: Multi-Pair Text to Text Narratives for Prompt Generation

    Authors: Archan Ghosh, Debgandhar Ghosh, Madhurima Maji, Suchinta Chanda, Kalporup Goswami

    Abstract: The increased interest in diffusion models has opened up opportunities for advancements in generative text modeling. These models can produce impressive images when given a well-crafted prompt, but creating a powerful or meaningful prompt can be hit-or-miss. To address this, we have created a large-scale dataset that is derived and synthesized from real prompts and indexed with popular image-text…

    Submitted 29 January, 2023; v1 submitted 21 January, 2023; originally announced January 2023.

  6. Mechanics of geodesics in Information geometry and Black Hole Thermodynamics

    Authors: Sumanto Chanda, Tatsuaki Wada

    Abstract: In this article we shall discuss the theory of geodesics in information geometry, and an application in astrophysics. We will study how gradient flows in information geometry describe geodesics, explore the related mechanics by introducing a constraint, and apply our theory to Gaussian model and black hole thermodynamics. Thus, we demonstrate how deformation of gradient flows leads to more general…

    Submitted 6 December, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: 24 pages. Corrections made. New section and 2 references added. Please comment

  7. arXiv:2210.09922

    cs.LG

    Few-Shot Learning of Compact Models via Task-Specific Meta Distillation

    Authors: Yong Wu, Shekhor Chanda, Mehrdad Hosseinzadeh, Zhi Liu, Yang Wang

    Abstract: We consider a new problem of few-shot learning of compact models. Meta-learning is a popular approach for few-shot learning. Previous work in meta-learning typically assumes that the model architecture during meta-training is the same as the model architecture used for final deployment. In this paper, we challenge this basic assumption. For final deployment, we often need the model to be small. Bu…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: This paper has been accepted by WACV'2023

  8. arXiv:2202.10401

    cs.CV

    Vision-Language Pre-Training with Triple Contrastive Learning

    Authors: Jinyu Yang, Jiali Duan, Son Tran, Yi Xu, Sampath Chanda, Liqun Chen, Belinda Zeng, Trishul Chilimbi, Junzhou Huang

    Abstract: Vision-language representation learning largely benefits from image-text alignment through contrastive losses (e.g., InfoNCE loss). The success of this alignment strategy is attributed to its capability in maximizing the mutual information (MI) between an image and its matched text. However, simply performing cross-modal alignment (CMA) ignores data potential within each modality, which may result…

    Submitted 28 March, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: CVPR 2022; code: https://github.com/uta-smile/TCL

  9. arXiv:2111.10618

    eess.IV cs.CV

    PAANet: Progressive Alternating Attention for Automatic Medical Image Segmentation

    Authors: Abhishek Srivastava, Sukalpa Chanda, Debesh Jha, Michael A. Riegler, Pål Halvorsen, Dag Johansen, Umapada Pal

    Abstract: Medical image segmentation can provide detailed information for clinical analysis which can be useful for scenarios where the detailed location of a finding is important. Knowing the location of disease can play a vital role in treatment and decision-making. Convolutional neural network (CNN) based encoder-decoder techniques have advanced the performance of automated medical image segmentation sys…

    Submitted 20 November, 2021; originally announced November 2021.

  10. arXiv:2111.10614

    eess.IV cs.CV

    GMSRF-Net: An improved generalizability with global multi-scale residual fusion network for polyp segmentation

    Authors: Abhishek Srivastava, Sukalpa Chanda, Debesh Jha, Umapada Pal, Sharib Ali

    Abstract: Colonoscopy is a gold standard procedure but is highly operator-dependent. Efforts have been made to automate the detection and segmentation of polyps, a precancerous precursor, to effectively minimize missed rate. Widely used computer-aided polyp segmentation systems actuated by encoder-decoder have achieved high performance in terms of accuracy. However, polyp segmentation datasets collected fro…

    Submitted 20 November, 2021; originally announced November 2021.

  11. arXiv:2111.10605

    cs.CV

    Exploiting Multi-Scale Fusion, Spatial Attention and Patch Interaction Techniques for Text-Independent Writer Identification

    Authors: Abhishek Srivastava, Sukalpa Chanda, Umapada Pal

    Abstract: Text-independent writer identification is a challenging problem that differentiates between different handwriting styles to decide the author of the handwritten text. Earlier writer identification relied on handcrafted features to reveal pieces of differences between writers. With the advent of convolutional neural networks, deep learning-based methods have evolved. In this paper, three…

    Submitted 20 November, 2021; originally announced November 2021.

    Comments: 14 pages, 4 figures

  12. arXiv:2111.10591

    cs.CV

    AGA-GAN: Attribute Guided Attention Generative Adversarial Network with U-Net for Face Hallucination

    Authors: Abhishek Srivastava, Sukalpa Chanda, Umapada Pal

    Abstract: The performance of facial super-resolution methods relies on their ability to recover facial structures and salient features effectively. Even though the convolutional neural network and generative adversarial network-based methods deliver impressive performances on face hallucination tasks, the ability to use attributes associated with the low-resolution images to improve performance is unsatisfa…

    Submitted 20 November, 2021; originally announced November 2021.

    Comments: 27 pages, 9 Figures

  13. arXiv:2108.09335

    cs.CV cs.LG

    LoOp: Looking for Optimal Hard Negative Embeddings for Deep Metric Learning

    Authors: Bhavya Vasudeva, Puneesh Deora, Saumik Bhattacharya, Umapada Pal, Sukalpa Chanda

    Abstract: Deep metric learning has been effectively used to learn distance metrics for different visual tasks like image retrieval, clustering, etc. In order to aid the training process, existing methods either use a hard mining strategy to extract the most informative samples or seek to generate hard synthetics using an additional network. Such approaches face different challenges and can lead to biased em…

    Submitted 20 August, 2021; originally announced August 2021.

    Comments: 17 pages, 9 figures, 5 tables. Accepted at The IEEE/CVF International Conference on Computer Vision (ICCV) 2021

  14. arXiv:2107.10756   

    cs.CV cs.LG

    Semantic Text-to-Face GAN -ST^2FG

    Authors: Manan Oza, Sukalpa Chanda, David Doermann

    Abstract: Faces generated using generative adversarial networks (GANs) have reached unprecedented realism. These faces, also known as "Deep Fakes", appear as realistic photographs with very little pixel-level distortions. While some work has enabled the training of models that lead to the generation of specific properties of the subject, generating a facial image based on a natural language description has…

    Submitted 13 December, 2023; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Experiments need to be redone

  15. arXiv:2105.15093

    cs.CV

    Pho(SC)-CTC -- A Hybrid Approach Towards Zero-shot Word Image Recognition

    Authors: Ravi Bhatt, Anuj Rai, Narayanan C. Krishnan, Sukalpa Chanda

    Abstract: Annotating words in a historical document image archive for word image recognition purpose demands time and skilled human resource (like historians, paleographers). In a real-life scenario, obtaining sample images for all possible words is also not feasible. However, Zero-shot learning methods could aptly be used to recognize unseen/out-of-lexicon words in such historical document images. Based on…

    Submitted 21 December, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: Accepted (International Journal on Document Analysis and Recognition). This paper is the extension of the paper titled "Pho(SC)Net: An Approach Towards Zero-shot Word Image Recognition in Historical Documents" published in ICDAR 2021

  16. arXiv:2105.09909

    cs.CV cs.AI cs.NE

    PLSM: A Parallelized Liquid State Machine for Unintentional Action Detection

    Authors: Dipayan Das, Saumik Bhattacharya, Umapada Pal, Sukalpa Chanda

    Abstract: Reservoir Computing (RC) offers a viable option to deploy AI algorithms on low-end embedded system platforms. Liquid State Machine (LSM) is a bio-inspired RC model that mimics the cortical microcircuits and uses spiking neural networks (SNN) that can be directly realized on neuromorphic hardware. In this paper, we present a novel Parallelized LSM (PLSM) architecture that incorporates spatio-tempor…

    Submitted 6 May, 2021; originally announced May 2021.

  17. arXiv:2105.07451

    eess.IV cs.CV

    MSRF-Net: A Multi-Scale Residual Fusion Network for Biomedical Image Segmentation

    Authors: Abhishek Srivastava, Debesh Jha, Sukalpa Chanda, Umapada Pal, Håvard D. Johansen, Dag Johansen, Michael A. Riegler, Sharib Ali, Pål Halvorsen

    Abstract: Methods based on convolutional neural networks have improved the performance of biomedical image segmentation. However, most of these methods cannot efficiently segment objects of variable sizes and train on small and biased datasets, which are common for biomedical use cases. While methods exist that incorporate multi-scale fusion approaches to address the challenges arising with variable sizes,…

    Submitted 30 January, 2022; v1 submitted 16 May, 2021; originally announced May 2021.

    Journal ref: IEEE Journal of Biomedical and Health Informatics, 2022

  18. arXiv:2105.05170

    cs.SE cs.MA

    Mandating Code Disclosure is Unnecessary -- Strict Model Verification Does Not Require Accessing Original Computer Code

    Authors: Sasanka Sekhar Chanda

    Abstract: Mandating public availability of computer code underlying computational simulation modeling research ends up doing a disservice to the cause of model verification when inconsistencies between the specifications in the publication text and specifications in the computer code go unchallenged. Conversely, a model is verified when an independent researcher undertakes the set of mental processing tasks…

    Submitted 11 May, 2021; originally announced May 2021.

    Comments: 12 pages, 2 Figures

  19. arXiv:2105.04311

    cs.NE nlin.AO

    Overcoming Complexity Catastrophe: An Algorithm for Beneficial Far-Reaching Adaptation under High Complexity

    Authors: Sasanka Sekhar Chanda, Sai Yayavaram

    Abstract: In his seminal work with NK algorithms, Kauffman noted that fitness outcomes from algorithms navigating an NK landscape show a sharp decline at high complexity arising from pervasive interdependence among problem dimensions. This phenomenon - where complexity effects dominate (Darwinian) adaptation efforts - is called complexity catastrophe. We present an algorithm - incremental change taking turn…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: 10 pages, 5 Figures

  20. arXiv:2104.12620

    cs.AI physics.atm-clus q-bio.PE

    An Algorithm to Effect Prompt Termination of Myopic Local Search on Kauffman-s NK Landscape

    Authors: Sasanka Sekhar Chanda

    Abstract: In Kauffman's NK model, myopic local search involves flipping one randomly-chosen bit of an N-bit decision string in every time step and accepting the new configuration if that has higher fitness. One issue is that this algorithm consumes the full extent of computational resources allocated - given by the number of alternative configurations inspected - even though search is expected to terminate…

    Submitted 11 May, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

    Comments: 13 Pages, 10 Figures, 1 Table

  21. arXiv:2101.06770

    cs.CV

    Improving Apparel Detection with Category Grouping and Multi-grained Branches

    Authors: Qing Tian, Sampath Chanda, K C Amit Kumar, Douglas Gray

    Abstract: Training an accurate object detector is expensive and time-consuming. One main reason lies in the laborious labeling process, i.e., annotating category and bounding box information for all instances in every image. In this paper, we examine ways to improve performance of deep object detectors without extra labeling. We first explore to group existing categories of high visual and semantic similari…

    Submitted 17 January, 2021; originally announced January 2021.

  22. arXiv:2008.04073

    cs.CY cs.AI

    AI Failures: A Review of Underlying Issues

    Authors: Debarag Narayan Banerjee, Sasanka Sekhar Chanda

    Abstract: Instances of Artificial Intelligence (AI) systems failing to deliver consistent, satisfactory performance are legion. We investigate why AI failures occur. We address only a narrow subset of the broader field of AI Safety. We focus on AI failures on account of flaws in conceptualization, design and deployment. Other AI Safety issues like trade-offs between privacy and security or convenience, bad…

    Submitted 18 July, 2020; originally announced August 2020.

    Comments: 8 pages

  23. arXiv:2006.08333

    cs.AI cs.MA q-bio.PE

    An Algorithm to find Superior Fitness on NK Landscapes under High Complexity: Muddling Through

    Authors: Sasanka Sekhar Chanda, Sai Yayavaram

    Abstract: Under high complexity - given by pervasive interdependence between constituent elements of a decision in an NK landscape - our algorithm obtains fitness superior to that reported in extant research. We distribute the decision elements comprising a decision into clusters. When a change in value of a decision element is considered, a forward move is made if the aggregate fitness of the cluster membe…

    Submitted 7 September, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

    Comments: 6 pages and 5 figures

    MSC Class: Keywords. algorithm; complexity; fitness; interdependence; muddling through; NK model; policy making; public administration

  24. arXiv:1907.02244

    cs.CV eess.IV

    Searching for Apparel Products from Images in the Wild

    Authors: Son Tran, Ming Du, Sampath Chanda, R. Manmatha, Cj Taylor

    Abstract: In this age of social media, people often look at what others are wearing. In particular, Instagram and Twitter influencers often provide images of themselves wearing different outfits and their followers are often inspired to buy similar clothes. We propose a system to automatically find the closest visually similar clothes in the online Catalog (street-to-shop searching). The problem is challengi…

    Submitted 7 April, 2022; v1 submitted 4 July, 2019; originally announced July 2019.

    Comments: KDD2019, AI for Fashion Workshop