Computer Vision and Pattern Recognition

Authors and titles for May 2025

Total of 3183 entries : 1-50 51-100 101-150 151-200 201-250 251-300 ... 3151-3183
[101] arXiv:2505.01431 [pdf, html, other]
Title: ZS-VCOS: Zero-Shot Video Camouflaged Object Segmentation By Optical Flow and Open Vocabulary Object Detection
Wenqi Guo, Mohamed Shehata, Shan Du
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2505.01481 [pdf, html, other]
Title: VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li, Xiyang Wu, Guangyao Shi, Yubin Qin, Hongyang Du, Fuxiao Liu, Tianyi Zhou, Dinesh Manocha, Jordan Lee Boyd-Graber
Journal-ref: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[103] arXiv:2505.01490 [pdf, html, other]
Title: WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation
Daoan Zhang, Che Jiang, Ruoshi Xu, Biaoxiang Chen, Zijian Jin, Yutian Lu, Jianguo Zhang, Liang Yong, Jiebo Luo, Shengda Luo
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104] arXiv:2505.01530 [pdf, other]
Title: Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer
Muhammad Tayyab Khan, Zane Yong, Lequn Chen, Jun Ming Tan, Wenhe Feng, Seung Ki Moon
Comments: This manuscript has been accepted for publication at IEEE International Conference on Industrial Engineering and Engineering Management (IEEM)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[105] arXiv:2505.01548 [pdf, html, other]
Title: Learning Flow-Guided Registration for RGB-Event Semantic Segmentation
Zhen Yao, Xiaowen Ying, Zhiyu Zhu, Mooi Choo Chuah
Comments: 20 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[106] arXiv:2505.01558 [pdf, html, other]
Title: A Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation via Synergistic Pseudo-Labeling and Generative Learning
Anan Yaghmour, Melba M. Crawford, Saurabh Prasad
Comments: Accepted in the 2025 CVPR Workshop on Foundation and Large Vision Models in Remote Sensing, to appear in CVPR 2025 Workshop Proceedings
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[107] arXiv:2505.01571 [pdf, html, other]
Title: PainFormer: a Vision Foundation Model for Automatic Pain Assessment
Stefanos Gkikas, Raul Fernandez Rojas, Manolis Tsiknakis
Journal-ref: IEEE Transactions on Affective Computing; 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[108] arXiv:2505.01578 [pdf, html, other]
Title: Grounding Task Assistance with Multimodal Cues from a Single Demonstration
Gabriel Sarch, Balasaravanan Thoravi Kumaravel, Sahithya Ravi, Vibhav Vineet, Andrew D. Wilson
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109] arXiv:2505.01583 [pdf, html, other]
Title: TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
Jen-Hao Cheng, Vivian Wang, Huayu Wang, Huapeng Zhou, Yi-Hao Peng, Hou-I Liu, Hsiang-Wei Huang, Kuang-Ming Chen, Cheng-Yen Yang, Wenhao Chai, Yi-Ling Chen, Vibhav Vineet, Qin Cai, Jenq-Neng Hwang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[110] arXiv:2505.01615 [pdf, html, other]
Title: Multimodal and Multiview Deep Fusion for Autonomous Marine Navigation
Dimitrios Dagdilelis, Panagiotis Grigoriadis, Roberto Galeazzi
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[111] arXiv:2505.01650 [pdf, html, other]
Title: Toward Onboard AI-Enabled Solutions to Space Object Detection for Space Sustainability
Wenxuan Zhang, Peng Hu
Comments: This paper has been accepted at the 18th International Conference on Space Operations (SpaceOps 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[112] arXiv:2505.01656 [pdf, html, other]
Title: A Novel WaveInst-based Network for Tree Trunk Structure Extraction and Pattern Analysis in Forest Inventory
Chenyang Fan, Xujie Zhu, Taige Luo, Sheng Xu, Zhulin Chen, Hongxin Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[113] arXiv:2505.01664 [pdf, html, other]
Title: Soft-Masked Semi-Dual Optimal Transport for Partial Domain Adaptation
Yi-Ming Zhai, Chuan-Xian Ren, Hong Yan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[114] arXiv:2505.01680 [pdf, html, other]
Title: Automated ARAT Scoring Using Multimodal Video Analysis, Multi-View Fusion, and Hierarchical Bayesian Models: A Clinician Study
Tamim Ahmed, Thanassis Rikakis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Probability (math.PR)
[115] arXiv:2505.01694 [pdf, html, other]
Title: Topology-Aware CLIP Few-Shot Learning
Dazhi Huang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[116] arXiv:2505.01699 [pdf, html, other]
Title: Component-Based Fairness in Face Attribute Classification with Bayesian Network-informed Meta Learning
Yifan Liu, Ruichen Yao, Yaokun Liu, Ruohan Zong, Zelin Li, Yang Zhang, Dong Wang
Comments: Accepted by ACM FAccT 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[117] arXiv:2505.01711 [pdf, html, other]
Title: Knowledge-Augmented Language Models Interpreting Structured Chest X-Ray Findings
Alexander Davis, Rafael Souza, Jia-Hao Lim
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2505.01713 [pdf, html, other]
Title: Vision and Intention Boost Large Language Model in Long-Term Action Anticipation
Congqi Cao, Lanshu Hu, Yating Yu, Yanning Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[119] arXiv:2505.01726 [pdf, html, other]
Title: Probabilistic Interactive 3D Segmentation with Hierarchical Neural Processes
Jie Liu, Pan Zhou, Zehao Xiao, Jiayi Shen, Wenzhe Yin, Jan-Jakob Sonke, Efstratios Gavves
Comments: ICML 2025 Proceedings
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2505.01729 [pdf, html, other]
Title: PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth
Bu Jin, Weize Li, Baihan Yang, Zhenxin Zhu, Junpeng Jiang, Huan-ang Gao, Haiyang Sun, Kun Zhan, Hengtong Hu, Xueyang Zhang, Peng Jia, Hao Zhao
Comments: Accepted at IEEE/RSJ IROS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2505.01737 [pdf, other]
Title: Learning Multi-frame and Monocular Prior for Estimating Geometry in Dynamic Scenes
Seong Hyeon Park, Jinwoo Shin
Comments: This paper was supported by RLWRLD
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2505.01743 [pdf, html, other]
Title: An LLM-Empowered Low-Resolution Vision System for On-Device Human Behavior Understanding
Siyang Jiang, Bufang Yang, Lilin Xu, Mu Yuan, Yeerzhati Abudunuer, Kaiwei Liu, Liekang Zeng, Hongkai Chen, Zhenyu Yan, Xiaofan Jiang, Guoliang Xing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[123] arXiv:2505.01746 [pdf, html, other]
Title: Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
Xingqun Qi, Yatian Wang, Hengyuan Zhang, Jiahao Pan, Wei Xue, Shanghang Zhang, Wenhan Luo, Qifeng Liu, Yike Guo
Comments: Accepted as ICLR 2025 (Spotlight)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2505.01766 [pdf, html, other]
Title: Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement
Long Bai, Boyi Ma, Ruohan Wang, Guankun Wang, Beilei Cui, Zhongliang Jiang, Mobarakol Islam, Zhe Min, Jiewen Lai, Nassir Navab, Hongliang Ren
Comments: Accepted by Information Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[125] arXiv:2505.01790 [pdf, html, other]
Title: Enhancing the Learning Experience: Using Vision-Language Models to Generate Questions for Educational Videos
Markos Stamatakis, Joshua Berger, Christian Wartena, Ralph Ewerth, Anett Hoppe
Comments: 12 pages (excluding references), 8 tables, 1 equation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)
[126] arXiv:2505.01799 [pdf, html, other]
Title: AquaGS: Fast Underwater Scene Reconstruction with SfM-Free Gaussian Splatting
Junhao Shi, Jisheng Xu, Jianping He, Zhiliang Lin
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2505.01802 [pdf, html, other]
Title: Efficient 3D Full-Body Motion Generation from Sparse Tracking Inputs with Temporal Windows
Georgios Fotios Angelis, Savas Ozkan, Sinan Mutlu, Paul Wisbey, Anastasios Drosou, Mete Ozay
Comments: Accepted to CVPRW2025 - 4D Vision Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2505.01805 [pdf, html, other]
Title: Not Every Tree Is a Forest: Benchmarking Forest Types from Satellite Remote Sensing
Yuchang Jiang, Maxim Neumann
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[129] arXiv:2505.01809 [pdf, html, other]
Title: 3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment
Xiaoqi Li, Jiaming Liu, Nuowei Han, Liang Heng, Yandong Guo, Hao Dong, Yang Liu
Comments: ICRA 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2505.01823 [pdf, html, other]
Title: PhytoSynth: Leveraging Multi-modal Generative Models for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach
Nitin Rai, Arnold W. Schumann, Nathan Boyd
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Emerging Technologies (cs.ET)
[131] arXiv:2505.01837 [pdf, html, other]
Title: CVVNet: A Cross-Vertical-View Network for Gait Recognition
Xiangru Li, Wei Song, Yingda Huang, Wei Meng, Le Chang, Hongyang Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2505.01838 [pdf, html, other]
Title: MVHumanNet++: A Large-scale Dataset of Multi-view Daily Dressing Human Captures with Richer Annotations for 3D Human Digitization
Chenghong Li, Hongjie Liao, Yihao Zhi, Xihe Yang, Zhengwentai Sun, Jiahao Chang, Shuguang Cui, Xiaoguang Han
Comments: project page: this https URL. arXiv admin note: substantial text overlap with arXiv:2312.02963
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2505.01851 [pdf, html, other]
Title: Mitigating Group-Level Fairness Disparities in Federated Visual Language Models
Chaomeng Chen, Zitong Yu, Junhao Dong, Sen Su, Linlin Shen, Shutao Xia, Xiaochun Cao
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2505.01857 [pdf, html, other]
Title: DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion
Haoteng Li, Zhao Yang, Zezhong Qian, Gongpeng Zhao, Yuqi Huang, Jun Yu, Huazheng Zhou, Longjun Liu
Comments: 8 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2505.01869 [pdf, html, other]
Title: Visual enhancement and 3D representation for underwater scenes: a review
Guoxi Huang, Haoran Wang, Brett Seymour, Evan Kovacs, John Ellerbrock, Dave Blackham, Nantheera Anantrasirichai
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136] arXiv:2505.01881 [pdf, html, other]
Title: PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications
Trisanth Srinivasan, Santosh Patapati
Comments: Accepted at IEEE/CVF Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2025 (CVPRW)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Robotics (cs.RO)
[137] arXiv:2505.01882 [pdf, other]
Title: CMAWRNet: Multiple Adverse Weather Removal via a Unified Quaternion Neural Architecture
Vladimir Frants, Sos Agaian, Karen Panetta, Peter Huang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[138] arXiv:2505.01888 [pdf, html, other]
Title: Rethinking Score Distilling Sampling for 3D Editing and Generation
Xingyu Miao, Haoran Duan, Yang Long, Jungong Han
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[139] arXiv:2505.01928 [pdf, html, other]
Title: GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting
Anushka Agarwal, Muhammad Yusuf Hassan, Talha Chafekar
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2505.01934 [pdf, html, other]
Title: GauS-SLAM: Dense RGB-D SLAM with Gaussian Surfels
Yongxin Su, Lin Chen, Kaiting Zhang, Zhongliang Zhao, Chenfeng Hou, Ziping Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2505.01938 [pdf, html, other]
Title: HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder
Qi Yang, Le Yang, Geert Van Der Auwera, Zhu Li
Comments: Accepted by ICML2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[142] arXiv:2505.01950 [pdf, html, other]
Title: Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing, Xianxun Zhu, Wei Zhou, Qika Lin, Hang Yang, Yuqing Wang
Comments: arXiv admin note: text overlap with arXiv:2412.04220 by other authors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[143] arXiv:2505.01958 [pdf, html, other]
Title: A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models
Liqiang Jing, Guiming Hardy Chen, Ehsan Aghazadeh, Xin Eric Wang, Xinya Du
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[144] arXiv:2505.01969 [pdf, html, other]
Title: MC3D-AD: A Unified Geometry-aware Reconstruction Model for Multi-category 3D Anomaly Detection
Jiayi Cheng, Can Gao, Jie Zhou, Jiajun Wen, Tao Dai, Jinbao Wang
Comments: 7 pages of main text, 3 pages of appendix, accepted to IJCAI 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[145] arXiv:2505.01973 [pdf, other]
Title: Visual Dominance and Emerging Multimodal Approaches in Distracted Driving Detection: A Review of Machine Learning Techniques
Anthony Dontoh, Stephanie Ivey, Logan Sirbaugh, Andrews Danyo, Armstrong Aboah
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2505.01984 [pdf, html, other]
Title: Lifelong Whole Slide Image Analysis: Online Vision-Language Adaptation and Past-to-Present Gradient Distillation
Doanh C. Bui, Hoai Luan Pham, Vu Trung Duong Le, Tuan Hai Vu, Van Duy Tran, Khang Nguyen, Yasuhiko Nakashima
Journal-ref: Bui, D. C., Pham, H. L., Le, V. T. D., Vu, T. H., Tran, V. D., Nguyen, K., & Nakashima, Y. (2025). Lifelong Whole Slide Image Analysis: Online Vision-Language Adaptation and Past-to-Present Gradient Distillation. IEEE Access
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2505.01986 [pdf, html, other]
Title: Drug classification based on X-ray spectroscopy combined with machine learning
Yongming Li, Peng Wang, Bangdong Han
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2505.02005 [pdf, html, other]
Title: Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields
Zhenxing Mi, Ping Yin, Xue Xiao, Dan Xu
Comments: Accepted by TPAMI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2505.02007 [pdf, html, other]
Title: Efficient Noise Calculation in Deep Learning-based MRI Reconstructions
Onat Dalmaz, Arjun D. Desai, Reinhard Heckel, Tolga Çukur, Akshay S. Chaudhari, Brian A. Hargreaves
Comments: Accepted ICML 2025. Supplementary material included
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150] arXiv:2505.02013 [pdf, html, other]
Title: MLLM-Enhanced Face Forgery Detection: A Vision-Language Fusion Solution
Siran Peng, Zipei Wang, Li Gao, Xiangyu Zhu, Tianshuo Zhang, Ajian Liu, Haoyuan Zhang, Zhen Lei
Subjects: Computer Vision and Pattern Recognition (cs.CV)