Computer Vision and Pattern Recognition

Authors and titles for February 2025

Total of 2199 entries

[1] arXiv:2502.00051 [pdf, other]: Title: A two-stage dual-task learning strategy for early prediction of pathological complete response to neoadjuvant chemotherapy for breast cancer using dynamic contrast-enhanced magnetic resonance images

Bowen Jing (1), Jing Wang (1) ((1) Department of Radiation Oncology, University of Texas Southwestern Medical Center)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[2] arXiv:2502.00074 [pdf, html, other]: Title: SpikingRTNH: Spiking Neural Network for 4D Radar Object Detection

Dong-Hee Paek, Seung-Hyun Kong

Comments: arxiv preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
[3] arXiv:2502.00076 [pdf, html, other]: Title: Influence of color correction on pathology detection in Capsule Endoscopy

Bidossessi Emmanuel Agossou, Marius Pedersen, Kiran Raja, Anuja Vats, Pål Anders Floor

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[4] arXiv:2502.00083 [pdf, html, other]: Title: CerraData-4MM: A multimodal benchmark dataset on Cerrado for land use and land cover classification

Mateus de Souza Miranda, Ronny Hänsch, Valdivino Alexandre de Santiago Júnior, Thales Sehn Körting, Erison Carlos dos Santos Monteiro

Comments: 9 pages, 13 Figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[5] arXiv:2502.00094 [pdf, html, other]: Title: AIN: The Arabic INclusive Large Multimodal Model

Ahmed Heakl, Sara Ghaboura, Omkar Thawkar, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan

Comments: 20 pages, 16 figures, ACL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[6] arXiv:2502.00129 [pdf, html, other]: Title: ProtoSnap: Prototype Alignment for Cuneiform Signs

Rachel Mikulinsky, Morris Alper, Shai Gordin, Enrique Jiménez, Yoram Cohen, Hadar Averbuch-Elor

Comments: Accepted to ICLR 2025. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[7] arXiv:2502.00133 [pdf, html, other]: Title: Exploring Transfer Learning for Deep Learning Polyp Detection in Colonoscopy Images Using YOLOv8

Fabian Vazquez, Jose Angel Nuñez, Xiaoyan Fu, Pengfei Gu, Bin Fu

Comments: 10 pages, 3 figures, 6 tables, SPIE conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[8] arXiv:2502.00156 [pdf, html, other]: Title: ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition

Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah

Comments: Accepted to ICLR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[9] arXiv:2502.00173 [pdf, html, other]: Title: Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation

Rohan Chacko, Nicolai Haeni, Eldar Khaliullin, Lin Sun, Douglas Lee

Comments: Accepted to WACV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2502.00196 [pdf, html, other]: Title: DermaSynth: Rich Synthetic Image-Text Pairs Using Open Access Dermatology Datasets

Abdurrahim Yilmaz, Furkan Yuceyalcin, Ece Gokyayla, Donghee Choi, Ozan Erdem, Ali Anil Demircali, Rahmetullah Varol, Ufuk Gorkem Kirabali, Gulsum Gencoglan, Joram M. Posma, Burak Temelkuran

Comments: 12 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[11] arXiv:2502.00205 [pdf, html, other]: Title: EcoWeedNet: A Lightweight and Automated Weed Detection Method for Sustainable Next-Generation Agricultural Consumer Electronics

Omar H. Khater, Abdul Jabbar Siddiqui, M. Shamim Hossain, Aiman El-Maleh

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[12] arXiv:2502.00232 [pdf, html, other]: Title: A Hybrid Random Forest and CNN Framework for Tile-Wise Oil-Water Classification in Hyperspectral Images

Mehdi Nickzamir, Seyed Mohammad Sheikh Ahamdi Gandab

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[13] arXiv:2502.00250 [pdf, html, other]: Title: Transformer-Based Vector Font Classification Using Different Font Formats: TrueType versus PostScript

Takumu Fujioka (1), Gouhei Tanaka (1 and 2) ((1) Nagoya Institute of Technology, (2) The University of Tokyo)

Comments: 8 pages, 8 figures, 4 tables, Submitted to IJCNN 2025. Code available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14] arXiv:2502.00262 [pdf, html, other]: Title: INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation

Dianwei Chen, Zifan Zhang, Yuchen Liu, Xianfeng Terry Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15] arXiv:2502.00266 [pdf, html, other]: Title: MCM: Multi-layer Concept Map for Efficient Concept Learning from Masked Images

Yuwei Sun, Lu Mi, Ippei Fujisawa, Ryota Kanai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[16] arXiv:2502.00307 [pdf, html, other]: Title: A Diffusion Model Translator for Efficient Image-to-Image Translation

Mengfei Xia, Yu Zhou, Ran Yi, Yong-Jin Liu, Wenping Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2502.00315 [pdf, html, other]: Title: MonoDINO-DETR: Depth-Enhanced Monocular 3D Object Detection Using a Vision Foundation Model

Jihyeok Kim, Seongwoo Moon, Sungwon Nah, David Hyunchul Shim

Comments: 8 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2502.00333 [pdf, html, other]: Title: BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution

Kai Liu, Kaicheng Yang, Zheng Chen, Zhiteng Li, Yong Guo, Wenbo Li, Linghe Kong, Yulun Zhang

Comments: 10 pages, 5 figures. The code and models will be available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2502.00342 [pdf, html, other]: Title: Embodied Intelligence for 3D Understanding: A Survey on 3D Scene Question Answering

Zechuan Li, Hongshan Yu, Yihao Ding, Yan Li, Yong He, Naveed Akhtar

Comments: This is a submitted version of a paper accepted by Information Fusion

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2502.00360 [pdf, html, other]: Title: Shape from Semantics: 3D Shape Generation from Multi-View Semantics

Liangchen Li, Caoliwen Wang, Yuqi Zhou, Bailin Deng, Juyong Zhang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[21] arXiv:2502.00372 [pdf, html, other]: Title: NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning

Zhixi Cai, Fucai Ke, Simindokht Jahangard, Maria Garcia de la Banda, Reza Haffari, Peter J. Stuckey, Hamid Rezatofighi

Comments: ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2502.00375 [pdf, html, other]: Title: Scalable Framework for Classifying AI-Generated Content Across Modalities

Anh-Kiet Duong, Petra Gomez-Krämer

Comments: Defactify4 @ AAAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2502.00379 [pdf, html, other]: Title: Latent Action Learning Requires Supervision in the Presence of Distractors

Alexander Nikulin, Ilya Zisman, Denis Tarasov, Nikita Lyubaykin, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov

Comments: ICML 2025, Poster, Project Page: this https URL, Source code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[24] arXiv:2502.00382 [pdf, html, other]: Title: Masked Generative Nested Transformers with Decode Time Scaling

Sahil Goyal, Debapriya Tula, Gagan Jain, Pradeep Shenoy, Prateek Jain, Sujoy Paul

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[25] arXiv:2502.00386 [pdf, html, other]: Title: Efficient Adaptive Label Refinement for Label Noise Learning

Wenzhen Zhang, Debo Cheng, Guangquan Lu, Bo Zhou, Jiaye Li, Shichao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2502.00392 [pdf, html, other]: Title: RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes

Zhichao Sun, Yepeng Liu, Huachao Zhu, Yuliang Gu, Yuda Zou, Zelong Liu, Gui-Song Xia, Bo Du, Yongchao Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2502.00397 [pdf, html, other]: Title: Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action Cues

Rohit Girmaji, Siddharth Jain, Bhav Beri, Sarthak Bansal, Vineet Gandhi

Comments: Accepted at 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2502.00402 [pdf, html, other]: Title: Enhancing Highway Safety: Accident Detection on the A9 Test Stretch Using Roadside Sensors

Walter Zimmer, Ross Greer, Xingcheng Zhou, Rui Song, Marc Pavel, Daniel Lehmberg, Ahmed Ghita, Akshay Gopalkrishnan, Mohan Trivedi, Alois Knoll

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2502.00404 [pdf, html, other]: Title: Exploring Linear Attention Alternative for Single Image Super-Resolution

Rongchang Lu, Changyu Li, Donghang Li, Guojing Zhang, Jianqiang Huang, Xilai Li

Comments: This paper has been published at the IEEE International Joint Conference on Neural Networks 2025 as the final camera-ready version

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[30] arXiv:2502.00412 [pdf, html, other]: Title: TROI: Cross-Subject Pretraining with Sparse Voxel Selection for Enhanced fMRI Visual Decoding

Ziyu Wang, Tengyu Pan, Zhenyu Li, Ji Wu, Xiuxing Li, Jianyong Wang

Comments: ICASSP 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2502.00418 [pdf, html, other]: Title: Parameter Efficient Fine-Tuning of Segment Anything Model for Biomedical Imaging

Carolin Teuber, Anwai Archit, Constantin Pape

Comments: Published in MIDL 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2502.00425 [pdf, html, other]: Title: MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization

JiangYong Yu, Sifan Zhou, Dawei Yang, Shuo Wang, Shuoyu Li, Xing Hu, Chen Xu, Zukang Xu, Changyong Shu, Zhihang Yuan

Comments: Accepted by ACM MM 2025. First PTQ solution for Multimodal large language models applicable to 5 mainstream MLLMs

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[33] arXiv:2502.00426 [pdf, html, other]: Title: TEST-V: TEst-time Support-set Tuning for Zero-shot Video Classification

Rui Yan, Jin Wang, Hongyu Qu, Xiaoyu Du, Dong Zhang, Jinhui Tang, Tieniu Tan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2502.00433 [pdf, html, other]: Title: CAT Pruning: Cluster-Aware Token Pruning For Text-to-Image Diffusion Models

Xinle Cheng, Zhuoming Chen, Zhihao Jia

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2502.00435 [pdf, html, other]: Title: SatMamba: Development of Foundation Models for Remote Sensing Imagery Using State Space Models

Chuc Man Duc, Hiromichi Fukui

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2502.00462 [pdf, html, other]: Title: MambaGlue: Fast and Robust Local Feature Matching With Mamba

Kihwan Ryoo, Hyungtae Lim, Hyun Myung

Comments: Proc. IEEE Int'l Conf. Robotics and Automation (ICRA) 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[37] arXiv:2502.00464 [pdf, other]: Title: Evaluation of End-to-End Continuous Spanish Lipreading in Different Data Conditions

David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Comments: Accepted in the "Language Resources and Evaluation" journal, Springer Nature

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2502.00474 [pdf, other]: Title: A framework for river connectivity classification using temporal image processing and attention based neural networks

Timothy James Becker, Derin Gezgin, Jun Yi He Wu, Mary Becker

Comments: 15 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[39] arXiv:2502.00500 [pdf, html, other]: Title: Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation

Yang Cao, Zhao Song, Chiwun Yang

Comments: 39 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[40] arXiv:2502.00528 [pdf, html, other]: Title: Vision-Language Modeling in PET/CT for Visual Grounding of Positive Findings

Zachary Huemann, Samuel Church, Joshua D. Warner, Daniel Tran, Xin Tie, Alan B McMillan, Junjie Hu, Steve Y. Cho, Meghan Lubner, Tyler J. Bradshaw

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[41] arXiv:2502.00535 [pdf, html, other]: Title: Work-Efficient Parallel Non-Maximum Suppression Kernels

David Oro, Carles Fernández, Xavier Martorell, Javier Hernando

Comments: Code: this https URL

Journal-ref: The Computer Journal, Volume 65, Issue 4, April 2022, Pages 773-787

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[42] arXiv:2502.00536 [pdf, html, other]: Title: CAD: Confidence-Aware Adaptive Displacement for Semi-Supervised Medical Image Segmentation

Wenbo Xiao, Zhihao Xu, Guiping Liang, Yangjun Deng, Yi Xiao

Comments: 9 pages, 3 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[43] arXiv:2502.00547 [pdf, html, other]: Title: Milmer: a Framework for Multiple Instance Learning based Multimodal Emotion Recognition

Zaitian Wang, Jian He, Yu Liang, Xiyuan Hu, Tianhao Peng, Kaixin Wang, Jiakai Wang, Chenlong Zhang, Weili Zhang, Shuang Niu, Xiaoyang Xie

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
[44] arXiv:2502.00563 [pdf, html, other]: Title: Complex Wavelet Mutual Information Loss: A Multi-Scale Loss Function for Semantic Segmentation

Renhao Lu

Comments: Accepted at ICML 2025. This version corresponds to the official camera-ready submission

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[45] arXiv:2502.00568 [pdf, html, other]: Title: Generating crossmodal gene expression from cancer histopathology improves multimodal AI predictions

Samiran Dey, Christopher R.S. Banerji, Partha Basuchowdhuri, Sanjoy K. Saha, Deepak Parashar, Tapabrata Chakraborti

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[46] arXiv:2502.00571 [pdf, html, other]: Title: Contrastive Forward-Forward: A Training Algorithm of Vision Transformer

Hossein Aghagolzadeh, Mehdi Ezoji

Comments: 22 pages, 8 figures, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[47] arXiv:2502.00594 [pdf, html, other]: Title: Fast Vision Mamba: Pooling Spatial Dimensions for Accelerated Processing

Saarthak Kapse, Robin Betz, Srinivasan Sivanandan

Comments: 20 pages, 15 figures, this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[48] arXiv:2502.00618 [pdf, html, other]: Title: DesCLIP: Robust Continual Learning via General Attribute Descriptions for VLM-Based Visual Recognition

Chiyuan He, Zihuan Qiu, Fanman Meng, Linfeng Xu, Qingbo Wu, Hongliang Li

Comments: Accepted by IEEE Transactions on Multimedia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[49] arXiv:2502.00630 [pdf, html, other]: Title: Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation

Bin Xie, Hao Tang, Dawen Cai, Yan Yan, Gady Agam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2502.00631 [pdf, html, other]: Title: MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction

Xuyin Qi, Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, Lei Zhang, Hao Zhang, Wenbing Lv, Guangzhen Yao, Renda Han, Kangsheng Wang, Mingyuan Li, Hongtao Mao, Yu Li, Zhibin Liao, Yang Zhao, Minh-Son To

Comments: Accepted to IJCNN 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51] arXiv:2502.00639 [pdf, html, other]: Title: Zeroth-order Informed Fine-Tuning for Diffusion Model: A Recursive Likelihood Ratio Optimizer

Tao Ren, Zishi Zhang, Zehao Li, Jingyang Jiang, Shentao Qin, Guanghao Li, Yan Li, Yi Zheng, Xinping Li, Min Zhan, Yijie Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Machine Learning (stat.ML)
[52] arXiv:2502.00654 [pdf, html, other]: Title: EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis

Junuk Cha, Seongro Yoon, Valeriya Strizhkova, Francois Bremond, Seungryul Baek

Comments: 22 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[53] arXiv:2502.00662 [pdf, html, other]: Title: Mitigating the Modality Gap: Few-Shot Out-of-Distribution Detection with Multi-modal Prototypes and Image Bias Estimation

Yimu Wang, Evelien Riddell, Adrian Chow, Sean Sedwards, Krzysztof Czarnecki

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[54] arXiv:2502.00663 [pdf, html, other]: Title: Enhanced Convolutional Neural Networks for Improved Image Classification

Xiaoran Yang, Shuhan Yu, Wenxi Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[55] arXiv:2502.00665 [pdf, html, other]: Title: Cross-Modal Synergies: Unveiling the Potential of Motion-Aware Fusion Networks in Handling Dynamic and Static ReID Scenarios

Fuxi Ling, Hongye Liu, Guoqiang Huang, Jing Li, Hong Wu, Zhihao Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[56] arXiv:2502.00688 [pdf, html, other]: Title: High-Order Matching for One-Step Shortcut Diffusion Models

Bo Chen, Chengyue Gong, Xiaoyu Li, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Mingda Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[57] arXiv:2502.00695 [pdf, html, other]: Title: TMI-CLNet: Triple-Modal Interaction Network for Chronic Liver Disease Prognosis From Imaging, Clinical, and Radiomic Data Fusion

Linglong Wu, Xuhao Shan, Ruiquan Ge, Ruoyu Liang, Chi Zhang, Yonghong Li, Ahmed Elazab, Huoling Luo, Yunbi Liu, Changmiao Wang

Comments: 6 pages, 3 figures, accepted by IEEE ISBI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[58] arXiv:2502.00700 [pdf, html, other]: Title: S2CFormer: Revisiting the RD-Latency Trade-off in Transformer-based Learned Image Compression

Yunuo Chen, Qian Li, Bing He, Donghui Feng, Ronghua Wu, Qi Wang, Li Song, Guo Lu, Wenjun Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[59] arXiv:2502.00708 [pdf, html, other]: Title: PhiP-G: Physics-Guided Text-to-3D Compositional Scene Generation

Qixuan Li, Chao Wang, Zongjin He, Yan Peng

Comments: 13 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[60] arXiv:2502.00711 [pdf, html, other]: Title: VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework

Chao Wang, Chunbai Zhang, Yongxiao Tian, Yang Zhou, Yan Peng

Comments: 14 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[61] arXiv:2502.00717 [pdf, html, other]: Title: MINT: Mitigating Hallucinations in Large Vision-Language Models via Token Reduction

Chao Wang, Jianming Yang, Yang Zhou

Comments: 8 pages, 5 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62] arXiv:2502.00719 [pdf, html, other]: Title: Vision and Language Reference Prompt into SAM for Few-shot Segmentation

Kosuke Sakurai, Ryotaro Shimizu, Masayuki Goto

Comments: 8 pages, 2 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63] arXiv:2502.00730 [pdf, html, other]: Title: Spatio-Temporal Progressive Attention Model for EEG Classification in Rapid Serial Visual Presentation Task

Yang Li, Wei Liu, Tianzhi Feng, Fu Li, Chennan Wu, Boxun Fu, Zhifu Zhao, Xiaotian Wang, Guangming Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64] arXiv:2502.00783 [pdf, other]: Title: A method for estimating forest carbon storage distribution density via artificial intelligence generated content model

Zhenyu Yu, Jinnian Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[65] arXiv:2502.00784 [pdf, other]: Title: Estimating forest carbon stocks from high-resolution remote sensing imagery by reducing domain shift with style transfer

Zhenyu Yu, Jinnian Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[66] arXiv:2502.00796 [pdf, html, other]: Title: Task-Specific Adaptation with Restricted Model Access

Matan Levy, Rami Ben-Ari, Dvir Samuel, Nir Darshan, Dani Lischinski

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67] arXiv:2502.00800 [pdf, html, other]: Title: Adversarial Semantic Augmentation for Training Generative Adversarial Networks under Limited Data

Mengping Yang, Zhe Wang, Ziqiu Chi, Dongdong Li, Wenli Du

Comments: This work was completed in 2022 and submitted to an IEEE journal for potential publication

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[68] arXiv:2502.00801 [pdf, html, other]: Title: Environment-Driven Online LiDAR-Camera Extrinsic Calibration

Zhiwei Huang, Jiaqi Li, Ping Zhong, Rui Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[69] arXiv:2502.00833 [pdf, other]: Title: Cross multiscale vision transformer for deep fake detection

Akhshan P, Taneti Sanjay, Chandrakala S

Comments: This version of the manuscript contains errors in wording and explanation, which may cause confusion in interpreting the methodology and results. The authors are preparing a revised version with corrected and clearer descriptions

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2502.00843 [pdf, html, other]: Title: VLM-Assisted Continual learning for Visual Question Answering in Self-Driving

Yuxin Lin, Mengshi Qi, Liang Liu, Huadong Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71] arXiv:2502.00848 [pdf, html, other]: Title: RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning

Yuanhuiyi Lyu, Xu Zheng, Lutao Jiang, Yibo Yan, Xin Zou, Huiyu Zhou, Linfeng Zhang, Xuming Hu

Comments: Accepted to ICML 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72] arXiv:2502.00869 [pdf, html, other]: Title: STAF: Sinusoidal Trainable Activation Functions for Implicit Neural Representation

Alireza Morsali, MohammadJavad Vaez, Mohammadhossein Soltani, Amirhossein Kazerouni, Babak Taati, Morteza Mohammad-Noori

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73] arXiv:2502.00896 [pdf, html, other]: Title: LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, Dimitris N. Metaxas

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74] arXiv:2502.00939 [pdf, html, other]: Title: Fruit Fly Classification (Diptera: Tephritidae) in Images, Applying Transfer Learning

Erick Andrew Bustamante Flores, Harley Vera Olivera, Ivan Cesar Medrano Valencia, Carlos Fernando Montoya Cubas

Comments: 15 pages and 19 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[75] arXiv:2502.00954 [pdf, html, other]: Title: Hypo3D: Exploring Hypothetical Reasoning in 3D

Ye Mao, Weixun Luo, Junpeng Jing, Anlan Qiu, Krystian Mikolajczyk

Comments: 24 pages, 15 figures, 15 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[76] arXiv:2502.00960 [pdf, html, other]: Title: SAM-guided Pseudo Label Enhancement for Multi-modal 3D Semantic Segmentation

Mingyu Yang, Jitong Lu, Hun-Seok Kim

Comments: ICRA 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77] arXiv:2502.00965 [pdf, html, other]: Title: CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling

Xinze Wang, Chen Chen, Yinfei Yang, Hong-You Chen, Bowen Zhang, Aditya Pal, Xiangxin Zhu, Xianzhi Du

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[78] arXiv:2502.00968 [pdf, html, other]: Title: CoDe: Blockwise Control for Denoising Diffusion Models

Anuj Singh, Sayak Mukherjee, Ahmad Beirami, Hadi Jamali-Rad

Journal-ref: Transactions on Machine Learning Research, 2025. ISSN: 2835-8856

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[79] arXiv:2502.00972 [pdf, html, other]: Title: Pushing the Boundaries of State Space Models for Image and Video Generation

Yicong Hong, Long Mai, Yuan Yao, Feng Liu

Comments: 21 pages, paper under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[80] arXiv:2502.00992 [pdf, html, other]: Title: FCBoost-Net: A Generative Network for Synthesizing Multiple Collocated Outfits via Fashion Compatibility Boosting

Dongliang Zhou, Haijun Zhang, Jianghong Ma, Jicong Fan, Zhao Zhang

Comments: This paper has been accepted for presentation at ACM Multimedia 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[81] arXiv:2502.01000 [pdf, html, other]: Title: Adapting Foundation Models for Few-Shot Medical Image Segmentation: Actively and Sequentially

Jingyun Yang, Guoqing Zhang, Jingge Wang, Yang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[82] arXiv:2502.01002 [pdf, other]: Title: Multi-Resolution SAR and Optical Remote Sensing Image Registration Methods: A Review, Datasets, and Future Perspectives

Wenfei Zhang, Ruipeng Zhao, Yongxiang Yao, Yi Wan, Peihao Wu, Jiayuan Li, Yansheng Li, Yongjun Zhang

Comments: 48 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83] arXiv:2502.01004 [pdf, html, other]: Title: ZeroBP: Learning Position-Aware Correspondence for Zero-shot 6D Pose Estimation in Bin-Picking

Jianqiu Chen, Zikun Zhou, Xin Li, Ye Zheng, Tianpeng Bao, Zhenyu He

Comments: ICRA 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[84] arXiv:2502.01023 [pdf, other]: Title: Vessel segmentation for X-separation

Taechang Kim, Sooyeon Ji, Kyeongseon Min, Minjun Kim, Jonghyo Youn, Chungseok Oh, Jiye Kim, Jongho Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[85] arXiv:2502.01045 [pdf, html, other]: Title: WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction

Zilong Wang, Zhiyang Dou, Yuan Liu, Cheng Lin, Xiao Dong, Yunhui Guo, Chenxu Zhang, Xin Li, Wenping Wang, Xiaohu Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[86] arXiv:2502.01048 [pdf, other]: Title: Sparks of Explainability: Recent Advancements in Explaining Large Vision Models

Thomas Fel

Comments: Doctoral thesis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[87] arXiv:2502.01051 [pdf, html, other]: Title: Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization

Tao Zhang, Cheng Da, Kun Ding, Huan Yang, Kun Jin, Yan Li, Tingting Gao, Di Zhang, Shiming Xiang, Chunhong Pan

Comments: 25 pages, 26 tables, 15 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[88] arXiv:2502.01056 [pdf, html, other]: Title: Mitigating Hallucinations in Large Vision-Language Models with Internal Fact-based Contrastive Decoding

Chao Wang, Xuancheng Zhou, Weiwei Fu, Yang Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[89] arXiv:2502.01061 [pdf, html, other]: Title: OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Gaojie Lin, Jianwen Jiang, Jiaqi Yang, Zerong Zheng, Chao Liang

Comments: ICCV 2025, Homepage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[90] arXiv:2502.01080 [pdf, html, other]: Title: BC-GAN: A Generative Adversarial Network for Synthesizing a Batch of Collocated Clothing

Dongliang Zhou, Haijun Zhang, Jianghong Ma, Jianyang Shi

Comments: This paper was accepted by IEEE TCSVT

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[91] arXiv:2502.01081 [pdf, html, other]: Title: The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

Vernon Y.H. Toh, Yew Ken Chia, Deepanway Ghosal, Soujanya Poria

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[92] arXiv:2502.01098 [pdf, html, other]: Title: SatFlow: Generative model based framework for producing High Resolution Gap Free Remote Sensing Imagery

Bharath Irigireddy, Varaprasad Bandaru

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[93] arXiv:2502.01101 [pdf, html, other]: Title: VidSketch: Hand-drawn Sketch-Driven Video Generation with Diffusion Control

Lifan Jiang, Shuang Chen, Boxi Wu, Xiaotong Guan, Jiahui Zhang

Comments: 17 pages, 15 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[94] arXiv:2502.01105 [pdf, html, other]: Title: LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer

Yiren Song, Danze Chen, Mike Zheng Shou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95] arXiv:2502.01157 [pdf, other]: Title: Radiant Foam: Real-Time Differentiable Ray Tracing

Shrisudhan Govindarajan, Daniel Rebain, Kwang Moo Yi, Andrea Tagliasacchi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96] arXiv:2502.01181 [pdf, html, other]: Title: BVINet: Unlocking Blind Video Inpainting with Zero Annotations

Zhiliang Wu, Kerui Chen, Kun Li, Hehe Fan, Yi Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97] arXiv:2502.01183 [pdf, html, other]: Title: Enhancing Environmental Robustness in Few-shot Learning via Conditional Representation Learning

Qianyu Guo, Jingrong Wu, Tianxing Wu, Haofen Wang, Weifeng Ge, Wenqiang Zhang

Comments: 15 pages, 8 figures, Accepted by IEEE Transactions on Image Processing

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[98] arXiv:2502.01186 [pdf, html, other]: Title: A High-Accuracy SSIM-based Scoring System for Coin Die Link Identification

Patrice Labedan, Nicolas Drougard, Alexandre Berezin, Guowei Sun, Francis Dieulafait

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99] arXiv:2502.01191 [pdf, html, other]: Title: Towards Robust and Reliable Concept Representations: Reliability-Enhanced Concept Embedding Model

Yuxuan Cai, Xiyu Wang, Satoshi Tsutsui, Winnie Pang, Bihan Wen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100] arXiv:2502.01199 [pdf, html, other]: Title: Nearly Lossless Adaptive Bit Switching

Haiduo Huang, Zhenhua Liu, Tian Xia, Wenzhe Zhao, Pengju Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
