Computer Vision and Pattern Recognition

Authors and titles for May 2025

Total of 3183 entries : 1-50 51-100 101-150 151-200 201-250 251-300 ... 3151-3183

[101] arXiv:2505.01431 [pdf, html, other]: Title: ZS-VCOS: Zero-Shot Video Camouflaged Object Segmentation By Optical Flow and Open Vocabulary Object Detection

Wenqi Guo, Mohamed Shehata, Shan Du

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2505.01481 [pdf, html, other]: Title: VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding

Zongxia Li, Xiyang Wu, Guangyao Shi, Yubin Qin, Hongyang Du, Fuxiao Liu, Tianyi Zhou, Dinesh Manocha, Jordan Lee Boyd-Graber

Journal-ref: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[103] arXiv:2505.01490 [pdf, html, other]: Title: WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation

Daoan Zhang, Che Jiang, Ruoshi Xu, Biaoxiang Chen, Zijian Jin, Yutian Lu, Jianguo Zhang, Liang Yong, Jiebo Luo, Shengda Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104] arXiv:2505.01530 [pdf, other]: Title: Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer

Muhammad Tayyab Khan, Zane Yong, Lequn Chen, Jun Ming Tan, Wenhe Feng, Seung Ki Moon

Comments: This manuscript has been accepted for publication at IEEE International Conference on Industrial Engineering and Engineering Management (IEEM)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[105] arXiv:2505.01548 [pdf, html, other]: Title: Learning Flow-Guided Registration for RGB-Event Semantic Segmentation

Zhen Yao, Xiaowen Ying, Zhiyu Zhu, Mooi Choo Chuah

Comments: 20 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[106] arXiv:2505.01558 [pdf, html, other]: Title: A Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation via Synergistic Pseudo-Labeling and Generative Learning

Anan Yaghmour, Melba M. Crawford, Saurabh Prasad

Comments: Accepted in the 2025 CVPR Workshop on Foundation and Large Vision Models in Remote Sensing, to appear in CVPR 2025 Workshop Proceedings

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[107] arXiv:2505.01571 [pdf, html, other]: Title: PainFormer: a Vision Foundation Model for Automatic Pain Assessment

Stefanos Gkikas, Raul Fernandez Rojas, Manolis Tsiknakis

Journal-ref: IEEE Transactions on Affective Computing; 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[108] arXiv:2505.01578 [pdf, html, other]: Title: Grounding Task Assistance with Multimodal Cues from a Single Demonstration

Gabriel Sarch, Balasaravanan Thoravi Kumaravel, Sahithya Ravi, Vibhav Vineet, Andrew D. Wilson

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109] arXiv:2505.01583 [pdf, html, other]: Title: TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action

Jen-Hao Cheng, Vivian Wang, Huayu Wang, Huapeng Zhou, Yi-Hao Peng, Hou-I Liu, Hsiang-Wei Huang, Kuang-Ming Chen, Cheng-Yen Yang, Wenhao Chai, Yi-Ling Chen, Vibhav Vineet, Qin Cai, Jenq-Neng Hwang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[110] arXiv:2505.01615 [pdf, html, other]: Title: Multimodal and Multiview Deep Fusion for Autonomous Marine Navigation

Dimitrios Dagdilelis, Panagiotis Grigoriadis, Roberto Galeazzi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[111] arXiv:2505.01650 [pdf, html, other]: Title: Toward Onboard AI-Enabled Solutions to Space Object Detection for Space Sustainability

Wenxuan Zhang, Peng Hu

Comments: This paper has been accepted at the 18th International Conference on Space Operations (SpaceOps 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[112] arXiv:2505.01656 [pdf, html, other]: Title: A Novel WaveInst-based Network for Tree Trunk Structure Extraction and Pattern Analysis in Forest Inventory

Chenyang Fan, Xujie Zhu, Taige Luo, Sheng Xu, Zhulin Chen, Hongxin Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[113] arXiv:2505.01664 [pdf, html, other]: Title: Soft-Masked Semi-Dual Optimal Transport for Partial Domain Adaptation

Yi-Ming Zhai, Chuan-Xian Ren, Hong Yan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[114] arXiv:2505.01680 [pdf, html, other]: Title: Automated ARAT Scoring Using Multimodal Video Analysis, Multi-View Fusion, and Hierarchical Bayesian Models: A Clinician Study

Tamim Ahmed, Thanassis Rikakis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Probability (math.PR)
[115] arXiv:2505.01694 [pdf, html, other]: Title: Topology-Aware CLIP Few-Shot Learning

Dazhi Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[116] arXiv:2505.01699 [pdf, html, other]: Title: Component-Based Fairness in Face Attribute Classification with Bayesian Network-informed Meta Learning

Yifan Liu, Ruichen Yao, Yaokun Liu, Ruohan Zong, Zelin Li, Yang Zhang, Dong Wang

Comments: Accepted by ACM FAccT 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[117] arXiv:2505.01711 [pdf, html, other]: Title: Knowledge-Augmented Language Models Interpreting Structured Chest X-Ray Findings

Alexander Davis, Rafael Souza, Jia-Hao Lim

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2505.01713 [pdf, html, other]: Title: Vision and Intention Boost Large Language Model in Long-Term Action Anticipation

Congqi Cao, Lanshu Hu, Yating Yu, Yanning Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[119] arXiv:2505.01726 [pdf, html, other]: Title: Probabilistic Interactive 3D Segmentation with Hierarchical Neural Processes

Jie Liu, Pan Zhou, Zehao Xiao, Jiayi Shen, Wenzhe Yin, Jan-Jakob Sonke, Efstratios Gavves

Comments: ICML 2025 Proceedings

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2505.01729 [pdf, html, other]: Title: PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth

Bu Jin, Weize Li, Baihan Yang, Zhenxin Zhu, Junpeng Jiang, Huan-ang Gao, Haiyang Sun, Kun Zhan, Hengtong Hu, Xueyang Zhang, Peng Jia, Hao Zhao

Comments: Accepted at IEEE/RSJ IROS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2505.01737 [pdf, other]: Title: Learning Multi-frame and Monocular Prior for Estimating Geometry in Dynamic Scenes

Seong Hyeon Park, Jinwoo Shin

Comments: This paper was supported by RLWRLD

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2505.01743 [pdf, html, other]: Title: An LLM-Empowered Low-Resolution Vision System for On-Device Human Behavior Understanding

Siyang Jiang, Bufang Yang, Lilin Xu, Mu Yuan, Yeerzhati Abudunuer, Kaiwei Liu, Liekang Zeng, Hongkai Chen, Zhenyu Yan, Xiaofan Jiang, Guoliang Xing

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[123] arXiv:2505.01746 [pdf, html, other]: Title: Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

Xingqun Qi, Yatian Wang, Hengyuan Zhang, Jiahao Pan, Wei Xue, Shanghang Zhang, Wenhan Luo, Qifeng Liu, Yike Guo

Comments: Accepted as ICLR 2025 (Spotlight)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2505.01766 [pdf, html, other]: Title: Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement

Long Bai, Boyi Ma, Ruohan Wang, Guankun Wang, Beilei Cui, Zhongliang Jiang, Mobarakol Islam, Zhe Min, Jiewen Lai, Nassir Navab, Hongliang Ren

Comments: Accepted by Information Fusion

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[125] arXiv:2505.01790 [pdf, html, other]: Title: Enhancing the Learning Experience: Using Vision-Language Models to Generate Questions for Educational Videos

Markos Stamatakis, Joshua Berger, Christian Wartena, Ralph Ewerth, Anett Hoppe

Comments: 12 pages (excluding references), 8 tables, 1 equation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)
[126] arXiv:2505.01799 [pdf, html, other]: Title: AquaGS: Fast Underwater Scene Reconstruction with SfM-Free Gaussian Splatting

Junhao Shi, Jisheng Xu, Jianping He, Zhiliang Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2505.01802 [pdf, html, other]: Title: Efficient 3D Full-Body Motion Generation from Sparse Tracking Inputs with Temporal Windows

Georgios Fotios Angelis, Savas Ozkan, Sinan Mutlu, Paul Wisbey, Anastasios Drosou, Mete Ozay

Comments: Accepted to CVPRW2025 - 4D Vision Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2505.01805 [pdf, html, other]: Title: Not Every Tree Is a Forest: Benchmarking Forest Types from Satellite Remote Sensing

Yuchang Jiang, Maxim Neumann

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[129] arXiv:2505.01809 [pdf, html, other]: Title: 3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment

Xiaoqi Li, Jiaming Liu, Nuowei Han, Liang Heng, Yandong Guo, Hao Dong, Yang Liu

Comments: ICRA 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2505.01823 [pdf, html, other]: Title: PhytoSynth: Leveraging Multi-modal Generative Models for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach

Nitin Rai, Arnold W. Schumann, Nathan Boyd

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Emerging Technologies (cs.ET)
[131] arXiv:2505.01837 [pdf, html, other]: Title: CVVNet: A Cross-Vertical-View Network for Gait Recognition

Xiangru Li, Wei Song, Yingda Huang, Wei Meng, Le Chang, Hongyang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2505.01838 [pdf, html, other]: Title: MVHumanNet++: A Large-scale Dataset of Multi-view Daily Dressing Human Captures with Richer Annotations for 3D Human Digitization

Chenghong Li, Hongjie Liao, Yihao Zhi, Xihe Yang, Zhengwentai Sun, Jiahao Chang, Shuguang Cui, Xiaoguang Han

Comments: project page: this https URL. arXiv admin note: substantial text overlap with arXiv:2312.02963

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2505.01851 [pdf, html, other]: Title: Mitigating Group-Level Fairness Disparities in Federated Visual Language Models

Chaomeng Chen, Zitong Yu, Junhao Dong, Sen Su, Linlin Shen, Shutao Xia, Xiaochun Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2505.01857 [pdf, html, other]: Title: DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion

Haoteng Li, Zhao Yang, Zezhong Qian, Gongpeng Zhao, Yuqi Huang, Jun Yu, Huazheng Zhou, Longjun Liu

Comments: 8 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2505.01869 [pdf, html, other]: Title: Visual enhancement and 3D representation for underwater scenes: a review

Guoxi Huang, Haoran Wang, Brett Seymour, Evan Kovacs, John Ellerbrock, Dave Blackham, Nantheera Anantrasirichai

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136] arXiv:2505.01881 [pdf, html, other]: Title: PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications

Trisanth Srinivasan, Santosh Patapati

Comments: Accepted at IEEE/CVF Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2025 (CVPRW)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Robotics (cs.RO)
[137] arXiv:2505.01882 [pdf, other]: Title: CMAWRNet: Multiple Adverse Weather Removal via a Unified Quaternion Neural Architecture

Vladimir Frants, Sos Agaian, Karen Panetta, Peter Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[138] arXiv:2505.01888 [pdf, html, other]: Title: Rethinking Score Distilling Sampling for 3D Editing and Generation

Xingyu Miao, Haoran Duan, Yang Long, Jungong Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[139] arXiv:2505.01928 [pdf, html, other]: Title: GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting

Anushka Agarwal, Muhammad Yusuf Hassan, Talha Chafekar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2505.01934 [pdf, html, other]: Title: GauS-SLAM: Dense RGB-D SLAM with Gaussian Surfels

Yongxin Su, Lin Chen, Kaiting Zhang, Zhongliang Zhao, Chenfeng Hou, Ziping Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2505.01938 [pdf, html, other]: Title: HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder

Qi Yang, Le Yang, Geert Van Der Auwera, Zhu Li

Comments: Accepted by ICML2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[142] arXiv:2505.01950 [pdf, html, other]: Title: Segment Any RGB-Thermal Model with Language-aided Distillation

Dong Xing, Xianxun Zhu, Wei Zhou, Qika Lin, Hang Yang, Yuqing Wang

Comments: arXiv admin note: text overlap with arXiv:2412.04220 by other authors

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[143] arXiv:2505.01958 [pdf, html, other]: Title: A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models

Liqiang Jing, Guiming Hardy Chen, Ehsan Aghazadeh, Xin Eric Wang, Xinya Du

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[144] arXiv:2505.01969 [pdf, html, other]: Title: MC3D-AD: A Unified Geometry-aware Reconstruction Model for Multi-category 3D Anomaly Detection

Jiayi Cheng, Can Gao, Jie Zhou, Jiajun Wen, Tao Dai, Jinbao Wang

Comments: 7 pages of main text, 3 pages of appendix, accepted to IJCAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[145] arXiv:2505.01973 [pdf, other]: Title: Visual Dominance and Emerging Multimodal Approaches in Distracted Driving Detection: A Review of Machine Learning Techniques

Anthony Dontoh, Stephanie Ivey, Logan Sirbaugh, Andrews Danyo, Armstrong Aboah

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2505.01984 [pdf, html, other]: Title: Lifelong Whole Slide Image Analysis: Online Vision-Language Adaptation and Past-to-Present Gradient Distillation

Doanh C. Bui, Hoai Luan Pham, Vu Trung Duong Le, Tuan Hai Vu, Van Duy Tran, Khang Nguyen, Yasuhiko Nakashima

Journal-ref: Bui, D. C., Pham, H. L., Le, V. T. D., Vu, T. H., Tran, V. D., Nguyen, K., & Nakashima, Y. (2025). Lifelong Whole Slide Image Analysis: Online Vision-Language Adaptation and Past-to-Present Gradient Distillation. IEEE Access

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2505.01986 [pdf, html, other]: Title: Drug classification based on X-ray spectroscopy combined with machine learning

Yongming Li, Peng Wang, Bangdong Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2505.02005 [pdf, html, other]: Title: Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields

Zhenxing Mi, Ping Yin, Xue Xiao, Dan Xu

Comments: Accepted by TPAMI

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2505.02007 [pdf, html, other]: Title: Efficient Noise Calculation in Deep Learning-based MRI Reconstructions

Onat Dalmaz, Arjun D. Desai, Reinhard Heckel, Tolga Çukur, Akshay S. Chaudhari, Brian A. Hargreaves

Comments: Accepted ICML 2025. Supplementary material included

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150] arXiv:2505.02013 [pdf, html, other]: Title: MLLM-Enhanced Face Forgery Detection: A Vision-Language Fusion Solution

Siran Peng, Zipei Wang, Li Gao, Xiangyu Zhu, Tianshuo Zhang, Ajian Liu, Haoyuan Zhang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
