Computer Vision and Pattern Recognition

Authors and titles for recent submissions

  • Fri, 16 May 2025
  • Thu, 15 May 2025
  • Wed, 14 May 2025
  • Tue, 13 May 2025
  • Mon, 12 May 2025

Total of 538 entries

Fri, 16 May 2025 (showing first 50 of 74 entries)

[1] arXiv:2505.10566 [pdf, html, other]
Title: 3D-Fixup: Advancing Photo Editing with 3D Priors
Yen-Chi Cheng, Krishna Kumar Singh, Jae Shin Yoon, Alex Schwing, Liangyan Gui, Matheus Gadelha, Paul Guerrero, Nanxuan Zhao
Comments: SIGGRAPH 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[2] arXiv:2505.10565 [pdf, html, other]
Title: Depth Anything with Any Prior
Zehan Wang, Siyu Chen, Lihe Yang, Jialei Wang, Ziang Zhang, Hengshuang Zhao, Zhou Zhao
Comments: Home page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2505.10562 [pdf, html, other]
Title: End-to-End Vision Tokenizer Tuning
Wenxuan Wang, Fan Zhang, Yufeng Cui, Haiwen Diao, Zhuoyan Luo, Huchuan Lu, Jing Liu, Xinlong Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2505.10557 [pdf, html, other]
Title: MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
Ke Wang, Junting Pan, Linda Wei, Aojun Zhou, Weikang Shi, Zimu Lu, Han Xiao, Yunqiao Yang, Houxing Ren, Mingjie Zhan, Hongsheng Li
Comments: Accepted to ACL 2025 Findings
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[5] arXiv:2505.10551 [pdf, other]
Title: Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data
Yiwen Liu, Jessica Bader, Jae Myung Kim
Comments: CVPRW 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[6] arXiv:2505.10541 [pdf, html, other]
Title: Exploring Implicit Visual Misunderstandings in Multimodal Large Language Models through Attention Analysis
Pengfei Wang, Guohai Xu, Weinong Wang, Junjie Yang, Jie Lou, Yunhua Xue
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2505.10533 [pdf, html, other]
Title: Enhancing Multi-Image Question Answering via Submodular Subset Selection
Aaryan Sharma, Shivansh Gupta, Samar Agarwal, Vishak Prasad C., Ganesh Ramakrishnan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[8] arXiv:2505.10497 [pdf, html, other]
Title: MorphGuard: Morph Specific Margin Loss for Enhancing Robustness to Face Morphing Attacks
Iurii Medvedev, Nuno Goncalves
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[9] arXiv:2505.10496 [pdf, html, other]
Title: CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs
Raman Dutt, Pedro Sanchez, Yongchen Yao, Steven McDonagh, Sotirios A. Tsaftaris, Timothy Hospedales
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2505.10483 [pdf, html, other]
Title: UniEval: Unified Holistic Evaluation for Unified Multimodal Understanding and Generation
Yi Li, Haonan Wang, Qixiang Zhang, Boyu Xiao, Chenchang Hu, Hualiang Wang, Xiaomeng Li
Comments: UniEval is the first evaluation framework designed for unified multimodal models, including a holistic benchmark UniBench and the UniScore metric
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[11] arXiv:2505.10481 [pdf, html, other]
Title: Logos as a Well-Tempered Pre-train for Sign Language Recognition
Ilya Ovodov, Petr Surovtsev, Karina Kvanchiani, Alexander Kapitanov, Alexander Nagaev
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12] arXiv:2505.10473 [pdf, html, other]
Title: Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting
Fengdi Zhang, Hongkun Cao, Ruqi Huang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13] arXiv:2505.10453 [pdf, html, other]
Title: Vision language models have difficulty recognizing virtual objects
Tyler Tran, Sangeet Khemlani, J.G. Trafton
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[14] arXiv:2505.10420 [pdf, html, other]
Title: Learned Lightweight Smartphone ISP with Unpaired Data
Andrei Arhire, Radu Timofte
Comments: Accepted at CVPRW 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15] arXiv:2505.10352 [pdf, html, other]
Title: SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and $\mathcal{O}(T)$ Complexity
Shihao Zou, Qingfeng Li, Wei Ji, Jingjing Li, Yongkui Yang, Guoqi Li, Chao Dong
Comments: Accepted by ICML 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[16] arXiv:2505.10351 [pdf, html, other]
Title: A Unified and Scalable Membership Inference Method for Visual Self-supervised Encoder via Part-aware Capability
Jie Zhu, Jirong Zha, Ding Li, Leye Wang
Comments: An extension of our ACM CCS2024 conference paper (arXiv:2404.02462). We show the impacts of scaling from both data and model aspects on membership inference for self-supervised visual encoders
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2505.10294 [pdf, html, other]
Title: MIPHEI-ViT: Multiplex Immunofluorescence Prediction from H&E Images using ViT Foundation Models
Guillaume Balezo, Roger Trullo, Albert Pla Planas, Etienne Decenciere, Thomas Walter
Subjects: Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)
[18] arXiv:2505.10292 [pdf, html, other]
Title: StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation
Daniel A. P. Oliveira, David Martins de Matos
Comments: 31 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[19] arXiv:2505.10289 [pdf, html, other]
Title: MSCI: Addressing CLIP's Inherent Limitations for Compositional Zero-Shot Learning
Yue Wang, Shuai Xu, Xuelin Zhu, Yicong Li
Comments: 9 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2505.10281 [pdf, html, other]
Title: MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting
Mengqiu Xu, Kaixin Chen, Heng Guo, Yixiang Huang, Ming Wu, Zhenwei Shi, Chuang Zhang, Jun Guo
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21] arXiv:2505.10267 [pdf, html, other]
Title: HandReader: Advanced Techniques for Efficient Fingerspelling Recognition
Pavel Korotaev, Petr Surovtsev, Alexander Kapitanov, Karina Kvanchiani, Aleksandr Nagaev
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[22] arXiv:2505.10258 [pdf, html, other]
Title: Inferring Driving Maps by Deep Learning-based Trail Map Extraction
Michael Hubbertz, Pascal Colling, Qi Han, Tobias Meisen
Comments: This paper was accepted at the CVPR WAD 2025 Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[23] arXiv:2505.10257 [pdf, html, other]
Title: Sage Deer: A Super-Aligned Driving Generalist Is Your Copilot
Hao Lu, Jiaqi Tang, Jiyao Wang, Yunfan LU, Xu Cao, Qingyong Hu, Yin Wang, Yuting Zhang, Tianxin Xie, Yunpeng Zhang, Yong Chen, Jiayu.Gao, Bin Huang, Dengbo He, Shuiguang Deng, Hao Chen, Ying-Cong Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[24] arXiv:2505.10250 [pdf, html, other]
Title: ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization
Wenhao Shen, Wanqi Yin, Xiaofeng Yang, Cheng Chen, Chaoyue Song, Zhongang Cai, Lei Yang, Hao Wang, Guosheng Lin
Comments: Accepted by ICML 2025. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2505.10238 [pdf, html, other]
Title: MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
Yanbo Ding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2505.10231 [pdf, html, other]
Title: On the Interplay of Human-AI Alignment, Fairness, and Performance Trade-offs in Medical Imaging
Haozhe Luo, Ziyu Zhou, Zixin Shu, Aurélie Pahud de Mortanges, Robert Berke, Mauricio Reyes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[27] arXiv:2505.10223 [pdf, other]
Title: Data-Agnostic Augmentations for Unknown Variations: Out-of-Distribution Generalisation in MRI Segmentation
Puru Vaish, Felix Meister, Tobias Heimann, Christoph Brune, Jelmer M. Wolterink
Comments: Accepted at MIDL 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[28] arXiv:2505.10205 [pdf, html, other]
Title: VolE: A Point-cloud Framework for Food 3D Reconstruction and Volume Estimation
Umair Haroon, Ahmad AlMughrabi, Thanasis Zoumpekas, Ricardo Marques, Petia Radeva
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2505.10169 [pdf, html, other]
Title: Modeling Saliency Dataset Bias
Matthias Kümmerer, Harneet Khanuja, Matthias Bethge
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[30] arXiv:2505.10152 [pdf, html, other]
Title: Multi-Source Collaborative Style Augmentation and Domain-Invariant Learning for Federated Domain Generalization
Yikang Wei
Comments: IJCAI 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2505.10124 [pdf, html, other]
Title: IMITATE: Image Registration with Context for unknown time frame recovery
Ziad Kheil, Lucas Robinet, Laurent Risser, Soleakhena Ken
Comments: IEEE ISBI 2025
Journal-ref: 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI), Houston, TX, USA, 2025, pp. 01-05
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[32] arXiv:2505.10118 [pdf, html, other]
Title: Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering
Yangfu Li, Hongjian Zhan, Tianyi Chen, Qi Liu, Yue Lu
Comments: 31 pages, 9 figures, conference
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[33] arXiv:2505.10088 [pdf, html, other]
Title: MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models
Yuncheng Guo, Xiaodong Gu
Comments: Due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract appearing here is slightly shorter than that in the PDF file
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2505.10072 [pdf, html, other]
Title: ToonifyGB: StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars
Rui-Yang Ju, Sheng-Yen Huang, Yi-Ping Hung
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2505.10055 [pdf, html, other]
Title: PsOCR: Benchmarking Large Multimodal Models for Optical Character Recognition in Low-resource Pashto Language
Ijazul Haq, Yingjie Zhang, Irfan Ali Khan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[36] arXiv:2505.10049 [pdf, html, other]
Title: Advances in Radiance Field for Dynamic Scene: From Neural Field to Gaussian Field
Jinlong Fan, Xuepu Zeng, Jing Zhang, Mingming Gong, Yuxiang Yang, Dacheng Tao
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37] arXiv:2505.10046 [pdf, html, other]
Title: Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
Bingda Tang, Boyang Zheng, Xichen Pan, Sayak Paul, Saining Xie
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2505.10030 [pdf, html, other]
Title: DeepSeqCoco: A Robust Mobile Friendly Deep Learning Model for Detection of Diseases in Cocos nucifera
Miit Daga, Dhriti Parikh, Swarna Priya Ramu
Comments: This paper is accepted for publication in IEEE Access journal and is currently pending revisions before publication
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[39] arXiv:2505.10027 [pdf, other]
Title: ORL-LDM: Offline Reinforcement Learning Guided Latent Diffusion Model Super-Resolution Reconstruction
Shijie Lyu
Comments: Accepted by the 4th International Conference on Computing Innovation and Applied Physics (CONF-CIAP 2025), and will be published in EAI Community Research Series-CORE or Theoretical and Natural Science (TNS)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[40] arXiv:2505.10016 [pdf, other]
Title: Application of YOLOv8 in monocular downward multiple Car Target detection
Shijie Lyu
Comments: Accepted by the 5th International Conference on Signal Processing and Machine Learning (CONF-SPML 2025), to appear in Applied and Computational Engineering
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[41] arXiv:2505.09998 [pdf, html, other]
Title: From Air to Wear: Personalized 3D Digital Fashion with AR/VR Immersive 3D Sketching
Ying Zang, Yuanqi Hu, Xinyu Chen, Yuxia Xu, Suhui Wang, Chunan Yu, Lanyun Zhu, Deyi Ji, Xin Xu, Tianrun Chen
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42] arXiv:2505.09997 [pdf, other]
Title: Descriptive Image-Text Matching with Graded Contextual Similarity
Jinhyun Jang, Jiyeong Lee, Kwanghoon Sohn
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2505.09990 [pdf, html, other]
Title: PointArena: Probing Multimodal Grounding Through Language-Guided Pointing
Long Cheng, Jiafei Duan, Yi Ru Wang, Haoquan Fang, Boyang Li, Yushan Huang, Elvis Wang, Ainaz Eftekhar, Jason Lee, Wentao Yuan, Rose Hendrix, Noah A. Smith, Fei Xia, Dieter Fox, Ranjay Krishna
Comments: 10 pages. Dataset and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[44] arXiv:2505.09986 [pdf, html, other]
Title: High Quality Underwater Image Compression with Adaptive Correction and Codebook-based Augmentation
Yimin Zhou, Yichong Xia, Sicheng Pan, Bin Chen, Baoyi An, Haoqian Wang, Zhi Wang, Yaowei Wang, Zikun Zhou
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[45] arXiv:2505.09971 [pdf, html, other]
Title: APCoTTA: Continual Test-Time Adaptation for Semantic Segmentation of Airborne LiDAR Point Clouds
Yuan Gao, Shaobo Xia, Sheng Nie, Cheng Wang, Xiaohuan Xi, Bisheng Yang
Comments: 18 pages, 12 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2505.09967 [pdf, html, other]
Title: TKFNet: Learning Texture Key Factor Driven Feature for Facial Expression Recognition
Liqian Deng
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2505.09965 [pdf, html, other]
Title: MambaControl: Anatomy Graph-Enhanced Mamba ControlNet with Fourier Refinement for Diffusion-Based Disease Trajectory Prediction
Hao Yang, Tao Tan, Shuai Tan, Weiqin Yang, Kunyan Cai, Calvin Chen, Yue Sun
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2505.09943 [pdf, html, other]
Title: CSPENet: Contour-Aware and Saliency Priors Embedding Network for Infrared Small Target Detection
Jiakun Deng, Kexuan Li, Xingye Cui, Jiaxuan Li, Chang Long, Tian Pu, Zhenming Peng
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[49] arXiv:2505.09939 [pdf, html, other]
Title: Non-Registration Change Detection: A Novel Change Detection Task and Benchmark Dataset
Zhe Shan, Lei Zhou, Liu Mao, Shaofan Chen, Chuanqiu Ren, Xia Xie
Comments: Accepted to IGARSS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[50] arXiv:2505.09935 [pdf, html, other]
Title: VRU-CIPI: Crossing Intention Prediction at Intersections for Improving Vulnerable Road Users Safety
Ahmed S. Abdelrahman, Mohamed Abdel-Aty, Quoc Dai Tran
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)