Showing 1–27 of 27 results for author: Shamsolmoali, P

  1. arXiv:2504.14294  [pdf, other]

    cs.CV

    From Missing Pieces to Masterpieces: Image Completion with Context-Adaptive Diffusion

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Michael Felsberg, Dacheng Tao, Xuelong Li

    Abstract: Image completion is a challenging task, particularly when ensuring that generated content seamlessly integrates with existing parts of an image. While recent diffusion models have shown promise, they often struggle with maintaining coherence between known and unknown (missing) regions. This issue arises from the lack of explicit spatial and semantic alignment during the diffusion process, resultin…

    Submitted 19 April, 2025; originally announced April 2025.

    Comments: Accepted in TPAMI

  2. arXiv:2504.12197  [pdf, other]

    cs.CV

    Beyond Patches: Mining Interpretable Part-Prototypes for Explainable AI

    Authors: Mahdi Alehdaghi, Rajarshi Bhattacharya, Pourya Shamsolmoali, Rafael M. O. Cruz, Maguelonne Heritier, Eric Granger

    Abstract: Deep learning has provided considerable advancements for multimedia systems, yet the interpretability of deep models remains a challenge. State-of-the-art post-hoc explainability methods, such as GradCAM, provide visual interpretation based on heatmaps but lack conceptual clarity. Prototype-based approaches, like ProtoPNet and PIPNet, offer a more structured explanation but rely on fixed patches,…

    Submitted 16 April, 2025; originally announced April 2025.

  3. arXiv:2504.09482  [pdf, other]

    cs.CL cs.AI cs.ET

    HalluShift: Measuring Distribution Shifts towards Hallucination Detection in LLMs

    Authors: Sharanya Dasgupta, Sujoy Nath, Arkaprabha Basu, Pourya Shamsolmoali, Swagatam Das

    Abstract: Large Language Models (LLMs) have recently garnered widespread attention due to their adeptness at generating innovative responses to the given prompts across a multitude of domains. However, LLMs often suffer from the inherent limitation of hallucinations and generate incorrect information while maintaining well-structured and coherent responses. In this work, we hypothesize that hallucinations s…

    Submitted 13 April, 2025; originally announced April 2025.

  4. arXiv:2503.04107  [pdf, other]

    cs.CV

    Fractional Correspondence Framework in Detection Transformer

    Authors: Masoumeh Zareapoor, Pourya Shamsolmoali, Huiyu Zhou, Yue Lu, Salvador García

    Abstract: The Detection Transformer (DETR), by incorporating the Hungarian algorithm, has significantly simplified the matching process in object detection tasks. This algorithm facilitates optimal one-to-one matching of predicted bounding boxes to ground-truth annotations during training. While effective, this strict matching process does not inherently account for the varying densities and distributions o…

    Submitted 6 March, 2025; originally announced March 2025.

    Journal ref: ACMMM-2024

  5. arXiv:2501.13307  [pdf, other]

    cs.CV

    From Cross-Modal to Mixed-Modal Visible-Infrared Re-Identification

    Authors: Mahdi Alehdaghi, Rajarshi Bhattacharya, Pourya Shamsolmoali, Rafael M. O. Cruz, Eric Granger

    Abstract: Visible-infrared person re-identification (VI-ReID) aims to match individuals across different camera modalities, a critical task in modern surveillance systems. While current VI-ReID methods focus on cross-modality matching, real-world applications often involve mixed galleries containing both V and I images, where state-of-the-art methods show significant performance limitations due to large dom…

    Submitted 22 January, 2025; originally announced January 2025.

  6. arXiv:2410.09306  [pdf, ps, other]

    cs.CV

    TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning

    Authors: Tsiry Mayet, Pourya Shamsolmoali, Simon Bernard, Eric Granger, Romain Hérault, Clement Chatelain

    Abstract: Diffusion models have emerged as highly effective techniques for inpainting, however, they remain constrained by slow sampling rates. While recent advances have enhanced generation quality, they have also increased sampling time, thereby limiting scalability in real-world applications. We investigate the generative sampling process of diffusion-based inpainting models and observe that these models…

    Submitted 23 June, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

  7. arXiv:2404.19113  [pdf, other]

    cs.CV cs.LG

    Source-Free Domain Adaptation of Weakly-Supervised Object Localization Models for Histology

    Authors: Alexis Guichemerre, Soufiane Belharbi, Tsiry Mayet, Shakeeb Murtaza, Pourya Shamsolmoali, Luke McCaffrey, Eric Granger

    Abstract: Given the emergence of deep learning, digital pathology has gained popularity for cancer diagnosis based on histology images. Deep weakly supervised object localization (WSOL) models can be trained to classify histology images according to cancer grade and identify regions of interest (ROIs) for interpretation, using inexpensive global image-class annotations. A WSOL model initially trained on som…

    Submitted 12 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 16 pages, 21 figures, 5 tables, CVPRw 2024

  8. arXiv:2403.10782  [pdf, other]

    cs.CV

    Bidirectional Multi-Step Domain Generalization for Visible-Infrared Person Re-Identification

    Authors: Mahdi Alehdaghi, Pourya Shamsolmoali, Rafael M. O. Cruz, Eric Granger

    Abstract: A key challenge in visible-infrared person re-identification (V-I ReID) is training a backbone model capable of effectively addressing the significant discrepancies across modalities. State-of-the-art methods that generate a single intermediate bridging domain are often less effective, as this generated domain may not adequately capture sufficient common discriminant information. This paper introd…

    Submitted 10 February, 2025; v1 submitted 15 March, 2024; originally announced March 2024.

  9. arXiv:2401.03540  [pdf, other]

    cs.CV

    SeTformer is What You Need for Vision and Language

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Michael Felsberg

    Abstract: The dot product self-attention (DPSA) is a fundamental component of transformers. However, scaling them to long sequences, like documents or high-resolution images, becomes prohibitively expensive due to quadratic time and memory complexities arising from the softmax operation. Kernel methods are employed to simplify computations by approximating softmax but often lead to performance drops compare…

    Submitted 7 January, 2024; originally announced January 2024.

  10. arXiv:2310.18676  [pdf, other]

    cs.CV

    Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation

    Authors: Pourya Shamsolmoali, Jocelyn Chanussot, Huiyu Zhou, Yue Lu

    Abstract: Efficient object detection methods have recently received great attention in remote sensing. Although deep convolutional networks often have excellent detection accuracy, their deployment on resource-limited edge devices is difficult. Knowledge distillation (KD) is a strategy for addressing this issue since it makes models lightweight while maintaining accuracy. However, existing KD methods for ob…

    Submitted 28 October, 2023; originally announced October 2023.

  11. arXiv:2310.07440  [pdf, other]

    cs.CV

    Distance Weighted Trans Network for Image Completion

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Xuelong Li, Yue Lu

    Abstract: The challenge of image generation has been effectively modeled as a problem of structure priors or transformation. However, existing models have unsatisfactory performance in understanding the global input image structures because of particular inherent features (for example, local inductive prior). Recent studies have shown that self-attention is an efficient modeling technique for image completi…

    Submitted 25 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

  12. arXiv:2310.04099  [pdf, other]

    cs.CV

    ClusVPR: Efficient Visual Place Recognition with Clustering-based Weighted Transformer

    Authors: Yifan Xu, Pourya Shamsolmoali, Jie Yang

    Abstract: Visual place recognition (VPR) is a highly challenging task that has a wide range of applications, including robot navigation and self-driving vehicles. VPR is particularly difficult due to the presence of duplicate regions and the lack of attention to small objects in complex scenes, resulting in recognition deviations. In this paper, we present ClusVPR, a novel approach that tackles the specific…

    Submitted 12 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  13. arXiv:2307.12517  [pdf, other]

    cs.CV

    Entropy Transformer Networks: A Learning Approach via Tangent Bundle Data Manifold

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor

    Abstract: This paper focuses on an accurate and fast interpolation approach for image transformation employed in the design of CNN architectures. Standard Spatial Transformer Networks (STNs) use bilinear or linear interpolation as their interpolation, with unrealistic assumptions about the underlying data distributions, which leads to poor performance under scale variations. Moreover, STNs do not preserve t…

    Submitted 24 July, 2023; originally announced July 2023.

  14. arXiv:2307.03240  [pdf, other]

    cs.CV

    Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification

    Authors: Mahdi Alehdaghi, Arthur Josi, Pourya Shamsolmoali, Rafael M. O. Cruz, Eric Granger

    Abstract: Visible-infrared person re-identification seeks to retrieve images of the same individual captured over a distributed network of RGB and IR sensors. Several V-I ReID approaches directly integrate both V and I modalities to discriminate persons within a shared representation space. However, given the significant gap in data distributions between V and I modalities, cross-modal V-I ReID remains chal…

    Submitted 10 February, 2025; v1 submitted 6 July, 2023; originally announced July 2023.

  15. arXiv:2305.00379  [pdf, other]

    cs.CV

    Image Completion via Dual-path Cooperative Filtering

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger

    Abstract: Given the recent advances with image-generating algorithms, deep image completion methods have made significant progress. However, state-of-the-art methods typically provide poor cross-scene generalization, and generated masked areas often contain blurry artifacts. Predictive filtering is a method for restoring images, which predicts the most effective kernels based on the input scene. Motivated by th…

    Submitted 29 April, 2023; originally announced May 2023.

  16. arXiv:2304.00948  [pdf, other]

    cs.CV

    VTAE: Variational Transformer Autoencoder with Manifolds Learning

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Dacheng Tao, Xuelong Li

    Abstract: Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables and these models use a nonlinear function (generator) to map latent samples into the data space. On the other hand, the nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor…

    Submitted 3 April, 2023; originally announced April 2023.

  17. arXiv:2209.00232  [pdf, other]

    cs.CV

    Hybrid Gromov-Wasserstein Embedding for Capsule Learning

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Swagatam Das, Eric Granger, Salvador Garcia

    Abstract: Capsule networks (CapsNets) aim to parse images into a hierarchy of objects, parts, and their relations using a two-step process involving part-whole transformation and hierarchical component routing. However, this hierarchical relationship modeling is computationally expensive, which has limited the wider use of CapsNet despite its potential advantages. The current state of CapsNet models primari…

    Submitted 24 October, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

  18. arXiv:2205.10272  [pdf, other]

    cs.CV

    Salient Skin Lesion Segmentation via Dilated Scale-Wise Feature Fusion Network

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Huiyu Zhou

    Abstract: Skin lesion detection in dermoscopic images is essential in the accurate and early diagnosis of skin cancer by a computerized apparatus. Current skin lesion segmentation approaches show poor performance in challenging circumstances such as indistinct lesion boundaries, low contrast between the lesion and the surrounding area, or heterogeneous background that causes over/under segmentation of the s…

    Submitted 25 July, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

  19. arXiv:2205.05927  [pdf, other]

    cs.CV

    Enhanced Single-shot Detector for Small Object Detection in Remote Sensing Images

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Jocelyn Chanussot, Jie Yang

    Abstract: Small-object detection is a challenging problem. In the last few years, convolutional neural network methods have achieved considerable progress. However, current detectors struggle with effective feature extraction for small-scale objects. To address this challenge, we propose the image pyramid single-shot detector (IPSSD). In IPSSD, a single-shot detector is adopted combined with an image…

    Submitted 12 May, 2022; originally announced May 2022.

    Journal ref: 42 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2022

  20. Multi-patch Feature Pyramid Network for Weakly Supervised Object Detection in Optical Remote Sensing Images

    Authors: Pourya Shamsolmoali, Jocelyn Chanussot, Masoumeh Zareapoor, Huiyu Zhou, Jie Yang

    Abstract: Object detection is a challenging task in remote sensing because objects only occupy a few pixels in the images, and the models are required to simultaneously learn object locations and detection. Even though established approaches perform well for objects of regular size, they perform poorly when analyzing small ones or get stuck in local minima (e.g. false object parts…

    Submitted 18 August, 2021; originally announced August 2021.

  21. Rotation Equivariant Feature Image Pyramid Network for Object Detection in Optical Remote Sensing Imagery

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Jocelyn Chanussot, Huiyu Zhou, Jie Yang

    Abstract: Detection of objects is extremely important in various aerial vision-based applications. Over the last few years, the methods based on convolution neural networks have made substantial progress. However, because of the large variety of object scales, densities, and arbitrary orientations, the current detectors struggle with the extraction of semantically strong features for small-scale objects by…

    Submitted 5 September, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

  22. arXiv:2012.13736  [pdf, other]

    cs.CV eess.IV

    Image Synthesis with Adversarial Networks: a Comprehensive Survey and Case Studies

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Huiyu Zhou, Ruili Wang, M. Emre Celebi, Jie Yang

    Abstract: Generative Adversarial Networks (GANs) have been extremely successful in various application domains such as computer vision, medicine, and natural language processing. Moreover, transforming an object or person to a desired shape has become a well-studied research topic in GANs. GANs are powerful models for learning complex distributions to synthesize semantically meaningful samples. However, there is…

    Submitted 26 December, 2020; originally announced December 2020.

  23. Road Segmentation for Remote Sensing Images using Adversarial Spatial Pyramid Networks

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Ruili Wang, Jie Yang

    Abstract: Road extraction in remote sensing images is of great importance for a wide range of applications. Because of the complex background, and high density, most of the existing methods fail to accurately extract a road network that appears correct and complete. Moreover, they suffer from either insufficient training data or high costs of manual annotation. To address these problems, we introduce a new…

    Submitted 10 August, 2020; originally announced August 2020.

  24. arXiv:2008.03071  [pdf, other]

    cs.CV cs.LG

    Oversampling Adversarial Network for Class-Imbalanced Fault Diagnosis

    Authors: Masoumeh Zareapoor, Pourya Shamsolmoali, Jie Yang

    Abstract: The collected data from industrial machines are often imbalanced, which poses a negative effect on learning algorithms. However, this problem becomes more challenging for a mixed type of data or while there is overlapping between classes. Class-imbalance problem requires a robust learning system which can timely predict and classify the data. We propose a new adversarial network for simultaneous c…

    Submitted 7 August, 2020; originally announced August 2020.

  25. arXiv:2004.02182  [pdf, other]

    cs.LG stat.ML

    Imbalanced Data Learning by Minority Class Augmentation using Capsule Adversarial Networks

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Linlin Shen, Abdul Hamid Sadka, Jie Yang

    Abstract: The fact that image datasets are often imbalanced poses an intense challenge for deep learning techniques. In this paper, we propose a method to restore the balance in imbalanced images, by coalescing two concurrent methods, generative adversarial networks (GANs) and capsule network. In our model, generative and discriminative networks play a novel competitive game, in which the generator generate…

    Submitted 8 April, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

  26. arXiv:2003.08002  [pdf]

    cs.CV

    AMIL: Adversarial Multi Instance Learning for Human Pose Estimation

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Jie Yang

    Abstract: Human pose estimation has an important impact on a wide range of applications from human-computer interface to surveillance and content-based video retrieval. For human pose estimation, joint obstructions and overlapping upon human bodies result in departed pose estimation. To address these problems, by integrating priors of the structure of human bodies, we present a novel structure-aware network…

    Submitted 17 March, 2020; originally announced March 2020.

  27. arXiv:2003.07784  [pdf]

    eess.IV cs.CV

    A novel Deep Structure U-Net for Sea-Land Segmentation in Remote Sensing Images

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Ruili Wang, Huiyu Zhou, Jie Yang

    Abstract: Sea-land segmentation is an important process for many key applications in remote sensing. Proper operative sea-land segmentation for remote sensing images remains a challenging issue due to complex and diverse transition between sea and lands. Although several Convolutional Neural Networks (CNNs) have been developed for sea-land segmentation, the performance of these CNNs is far from the expected…

    Submitted 17 March, 2020; originally announced March 2020.

    Comments: 14 pages, 14 figures