CFMW: Cross-modality Fusion Mamba for Robust Object Detection under Adverse Weather

Li, Haoyuan; Hu, Qi; Zhou, Binjia; Yao, You; Lin, Jiacheng; Yang, Kailun; Chen, Peng

Computer Science > Computer Vision and Pattern Recognition

arXiv:2404.16302 (cs)

[Submitted on 25 Apr 2024 (v1), last revised 8 Jul 2025 (this version, v2)]

Abstract: Visible-infrared image pairs provide complementary information, enhancing the reliability and robustness of object detection applications in real-world scenarios. However, most existing methods face challenges in maintaining robustness under complex weather conditions, which limits their applicability. Meanwhile, the reliance on attention mechanisms in modality fusion introduces significant computational complexity and storage overhead, particularly when dealing with high-resolution images. To address these challenges, we propose the Cross-modality Fusion Mamba with Weather-removal (CFMW) to augment stability and cost-effectiveness under adverse weather conditions. Leveraging the proposed Perturbation-Adaptive Diffusion Model (PADM) and Cross-modality Fusion Mamba (CFM) modules, CFMW is able to reconstruct visual features affected by adverse weather, enriching the representation of image details. With efficient architecture design, CFMW is 3 times faster than Transformer-style fusion (e.g., CFT). To bridge the gap in relevant datasets, we construct a new Severe Weather Visible-Infrared (SWVI) dataset, encompassing diverse adverse weather scenarios such as rain, haze, and snow. The dataset contains 64,281 paired visible-infrared images, providing a valuable resource for future research. Extensive experiments on public datasets (i.e., M3FD and LLVIP) and the newly constructed SWVI dataset conclusively demonstrate that CFMW achieves state-of-the-art detection performance. Both the dataset and source code will be made publicly available at this https URL.

Comments:	Accepted to IEEE Transactions on Circuits and Systems for Video Technology (TCSVT). The dataset and source code will be made publicly available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Robotics (cs.RO); Image and Video Processing (eess.IV)
Cite as:	arXiv:2404.16302 [cs.CV]
	(or arXiv:2404.16302v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.16302

Submission history

From: Kailun Yang
[v1] Thu, 25 Apr 2024 02:54:11 UTC (42,156 KB)
[v2] Tue, 8 Jul 2025 14:46:42 UTC (8,487 KB)
