Computer Vision and Pattern Recognition

Authors and titles for June 2025

Total of 3130 entries

[2901] arXiv:2506.16827 (cross-list from cs.GR) [pdf, html, other]: Title: Beyond Blur: A Fluid Perspective on Generative Diffusion Models

Grzegorz Gruszczynski, Jakub Meixner, Michal Jan Wlodarczyk, Przemyslaw Musialski

Comments: ICCV 2025 main conference, 8-page paper, 20-page appendix, 24 figures, supplementary pseudocode in appendix, this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2902] arXiv:2506.16890 (cross-list from cs.LG) [pdf, html, other]: Title: From Lab to Factory: Pitfalls and Guidelines for Self-/Unsupervised Defect Detection on Low-Quality Industrial Images

Sebastian Hönel, Jonas Nordqvist

Comments: 18 pages, 7 figures, 1 table. Camera-ready version for the 2025 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD '25)

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
[2903] arXiv:2506.16898 (cross-list from cs.AI) [pdf, html, other]: Title: AI's Blind Spots: Geographic Knowledge and Diversity Deficit in Generated Urban Scenario

Ciro Beneduce, Massimiliano Luca, Bruno Lepri

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[2904] arXiv:2506.16934 (cross-list from eess.IV) [pdf, other]: Title: PET Tracer Separation Using Conditional Diffusion Transformer with Multi-latent Space Learning

Bin Huang, Feihong Xu, Xinchong Shi, Shan Huang, Binxuan Li, Fei Li, Qiegen Liu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2905] arXiv:2506.17110 (cross-list from cs.RO) [pdf, html, other]: Title: Monocular One-Shot Metric-Depth Alignment for RGB-Based Robot Grasping

Teng Guo, Baichuan Huang, Jingjin Yu

Comments: Accepted to IROS 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2906] arXiv:2506.17133 (cross-list from eess.IV) [pdf, html, other]: Title: Robust Training with Data Augmentation for Medical Imaging Classification

Josué Martínez-Martínez, Olivia Brown, Mostafa Karami, Sheida Nabavi

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2907] arXiv:2506.17140 (cross-list from eess.IV) [pdf, html, other]: Title: MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification

David Jacob Drexlin, Jonas Dippel, Julius Hense, Niklas Prenißl, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2908] arXiv:2506.17165 (cross-list from eess.IV) [pdf, html, other]: Title: Proportional Sensitivity in Generative Adversarial Network (GAN)-Augmented Brain Tumor Classification Using Convolutional Neural Network

Mahin Montasir Afif, Abdullah Al Noman, K. M. Tahsin Kabir, Md. Mortuza Ahmmed, Md. Mostafizur Rahman, Mufti Mahmud, Md. Ashraful Babu

Comments: This paper has been submitted to the 18th International Conference on Brain Informatics (BI'25), Italy

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2909] arXiv:2506.17198 (cross-list from cs.RO) [pdf, html, other]: Title: Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation

Jianglong Ye, Keyi Wang, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, Xiaolong Wang

Comments: Accepted to RSS 2025. Project page: this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2910] arXiv:2506.17206 (cross-list from cs.GR) [pdf, html, other]: Title: DreamCube: 3D Panorama Generation via Multi-plane Synchronization

Yukun Huang, Yanning Zhou, Jianan Wang, Kaiyi Huang, Xihui Liu

Comments: Project page: this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2911] arXiv:2506.17232 (cross-list from cs.LG) [pdf, html, other]: Title: PCaM: A Progressive Focus Attention-Based Information Fusion Method for Improving Vision Transformer Domain Adaptation

Zelin Zang, Fei Wang, Liangyu Li, Jinlin Wu, Chunshui Zhao, Zhen Lei, Baigui Sun

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2912] arXiv:2506.17307 (cross-list from cs.LG) [pdf, html, other]: Title: Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation

Zhixiang Chi, Li Gu, Huan Liu, Ziqiang Wang, Yanan Wu, Yang Wang, Konstantinos N Plataniotis

Comments: ICLR 2025, this https URL

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2913] arXiv:2506.17320 (cross-list from cs.CY) [pdf, other]: Title: MAARTA: Multi-Agentic Adaptive Radiology Teaching Assistant

Akash Awasthi, Brandon V. Chang, Anh M. Vu, Ngan Le, Rishi Agrawal, Zhigang Deng, Carol Wu, Hien Van Nguyen

Comments: Accepted to MICCAI 2025 (Main Conference)

Subjects: Computers and Society (cs.CY); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2914] arXiv:2506.17324 (cross-list from cs.LG) [pdf, other]: Title: Origins of Creativity in Attention-Based Diffusion Models

Emma Finn, T. Anderson Keller, Manos Theodosis, Demba E. Ba

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2915] arXiv:2506.17337 (cross-list from eess.IV) [pdf, html, other]: Title: Can Common VLMs Rival Medical VLMs? Evaluation and Strategic Insights

Yuan Zhong, Ruinan Jin, Xiaoxiao Li, Qi Dou

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2916] arXiv:2506.17361 (cross-list from eess.IV) [pdf, html, other]: Title: Efficient Feedback Gate Network for Hyperspectral Image Super-Resolution

Xufei Wang, Mingjian Zhang, Fei Ge, Jinchen Zhu, Wen Sha, Jifen Ren, Zhimeng Hou, Shouguo Zheng, Ling Zheng, Shizhuang Weng

Comments: 20 pages, 17 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2917] arXiv:2506.17364 (cross-list from cs.CY) [pdf, html, other]: Title: AI-based Multimodal Biometrics for Detecting Smartphone Distractions: Application to Online Learning

Alvaro Becerra, Roberto Daza, Ruth Cobos, Aythami Morales, Mutlu Cukurova, Julian Fierrez

Comments: Accepted in EC-TEL25: 20th European Conference on Technology Enhanced Learning, Newcastle and Durham, UK, 15-19 September 2025

Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[2918] arXiv:2506.17372 (cross-list from cs.CY) [pdf, html, other]: Title: Multimodal Political Bias Identification and Neutralization

Cedric Bernard, Xavier Pleimling, Amun Kharel, Chase Vickery

Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2919] arXiv:2506.17378 (cross-list from cs.RO) [pdf, html, other]: Title: A workflow for generating synthetic LiDAR datasets in simulation environments

Abhishek Phadke, Shakib Mahmud Dipto, Pratip Rana

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2920] arXiv:2506.17412 (cross-list from eess.IV) [pdf, html, other]: Title: VMRA-MaR: An Asymmetry-Aware Temporal Framework for Longitudinal Breast Cancer Risk Prediction

Zijun Sun, Solveig Thrun, Michael Kampffmeyer

Comments: MICCAI 2025, Provisional Accept

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2921] arXiv:2506.17425 (cross-list from eess.IV) [pdf, html, other]: Title: Trans${^2}$-CBCT: A Dual-Transformer Framework for Sparse-View CBCT Reconstruction

Minmin Yang, Huantao Ren, Senem Velipasalar

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2922] arXiv:2506.17462 (cross-list from cs.RO) [pdf, other]: Title: General-Purpose Robotic Navigation via LVLM-Orchestrated Perception, Reasoning, and Acting

Bernard Lange, Anil Yildiz, Mansur Arief, Shehryar Khattak, Mykel Kochenderfer, Georgios Georgakis

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2923] arXiv:2506.17501 (cross-list from eess.IV) [pdf, html, other]: Title: DSA-NRP: No-Reflow Prediction from Angiographic Perfusion Dynamics in Stroke EVT

Shreeram Athreya, Carlos Olivares, Ameera Ismail, Kambiz Nael, William Speier, Corey Arnold

Comments: 12 pages, 4 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2924] arXiv:2506.17516 (cross-list from cs.RO) [pdf, html, other]: Title: EASE: Embodied Active Event Perception via Self-Supervised Energy Minimization

Zhou Chen, Sanjoy Kundu, Harsimran S. Baweja, Sathyanarayanan N. Aakur

Comments: Accepted to IEEE Robotics and Automation Letters, 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2925] arXiv:2506.17540 (cross-list from eess.IV) [pdf, html, other]: Title: MTSIC: Multi-stage Transformer-based GAN for Spectral Infrared Image Colorization

Tingting Liu, Yuan Liu, Jinhui Tang, Liyin Yuan, Chengyu Liu, Chunlai Li, Xiubao Sui, Qian Chen

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2926] arXiv:2506.17552 (cross-list from cs.LG) [pdf, other]: Title: DRIMV_TSK: An Interpretable Surgical Evaluation Model for Incomplete Multi-View Rectal Cancer Data

Wei Zhang, Zi Wang, Hanwen Zhou, Zhaohong Deng, Weiping Ding, Yuxi Ge, Te Zhang, Yuanpeng Zhang, Kup-Sze Choi, Shitong Wang, Shudong Hu

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2927] arXiv:2506.17623 (cross-list from cs.MM) [pdf, html, other]: Title: Can Generated Images Serve as a Viable Modality for Text-Centric Multimodal Learning?

Yuesheng Huang, Peng Zhang, Riliang Liu, Jiaqi Liang

Comments: 4 figures, 7 tables

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[2928] arXiv:2506.17636 (cross-list from cs.GR) [pdf, html, other]: Title: 3D Gaussian Splatting for Fine-Detailed Surface Reconstruction in Large-Scale Scene

Shihan Chen, Zhaojin Li, Zeyu Chen, Qingsong Yan, Gaoyang Shen, Ran Duan

Comments: IROS 2025

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[2929] arXiv:2506.17747 (cross-list from physics.geo-ph) [pdf, html, other]: Title: Pix2Geomodel: A Next-Generation Reservoir Geomodeling with Property-to-Property Translation

Abdulrahman Al-Fakih, Ardiansyah Koeshidayatullah, Nabil A. Saraih, Tapan Mukerji, Rayan Kanfar, Abdulmohsen Alali, SanLinn I. Kaka

Comments: 34 pages, 13 figures

Subjects: Geophysics (physics.geo-ph); Computational Engineering, Finance, and Science (cs.CE); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[2930] arXiv:2506.17770 (cross-list from cs.GR) [pdf, html, other]: Title: Collaborative Texture Filtering

Tomas Akenine-Möller, Pontus Ebelin, Matt Pharr, Bartlomiej Wronski

Comments: Accepted to ACM/EG Symposium on High Performance Graphics (HPG), 2025

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[2931] arXiv:2506.17872 (cross-list from cs.LG) [pdf, html, other]: Title: Decoding Federated Learning: The FedNAM+ Conformal Revolution

Sree Bhargavi Balija, Amitash Nanda, Debashis Sahoo

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2932] arXiv:2506.17874 (cross-list from stat.ML) [pdf, html, other]: Title: DRO-Augment Framework: Robustness by Synergizing Wasserstein Distributionally Robust Optimization and Data Augmentation

Jiaming Hu, Debarghya Mukherjee, Ioannis Ch. Paschalidis

Comments: 26 pages, 3 figures

Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2933] arXiv:2506.17879 (cross-list from eess.IV) [pdf, html, other]: Title: StainPIDR: A Pathological Image Decoupling and Reconstruction Method for Stain Normalization Based on Color Vector Quantization and Structure Restaining

Zheng Chen

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2934] arXiv:2506.17954 (cross-list from eess.IV) [pdf, other]: Title: Mobile Image Analysis Application for Mantoux Skin Test

Liong Gele, Tan Chye Cheah

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2935] arXiv:2506.17966 (cross-list from cs.IR) [pdf, html, other]: Title: LLM-Enhanced Multimodal Fusion for Cross-Domain Sequential Recommendation

Wangyu Wu, Zhenhong Chen, Xianglin Qiu, Siqi Song, Xiaowei Huang, Fei Ma, Jimin Xiao

Comments: arXiv admin note: substantial text overlap with arXiv:2504.15085

Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[2936] arXiv:2506.17967 (cross-list from cs.LG) [pdf, html, other]: Title: Adapting Vision-Language Models for Evaluating World Models

Mariya Hendriksen, Tabish Rashid, David Bignell, Raluca Georgescu, Abdelhak Lemkhenter, Katja Hofmann, Sam Devlin, Sarah Parisot

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2937] arXiv:2506.17968 (cross-list from cs.LG) [pdf, html, other]: Title: h-calibration: Rethinking Classifier Recalibration with Probabilistic Error-Bounded Objective

Wenjian Huang, Guiping Cao, Jiahao Xia, Jingkun Chen, Hao Wang, Jianguo Zhang

Journal-ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Probability (math.PR); Machine Learning (stat.ML)
[2938] arXiv:2506.17983 (cross-list from eess.IV) [pdf, html, other]: Title: LVPNet: A Latent-variable-based Prediction-driven End-to-end Framework for Lossless Compression of Medical Images

Chenyue Song, Chen Hui, Qing Lin, Wei Zhang, Siqiao Li, Haiqi Zhu, Zhixuan Li, Shengping Zhang, Shaohui Liu, Feng Jiang, Xiang Li

Comments: Accepted to MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2939] arXiv:2506.18017 (cross-list from cs.GR) [pdf, html, other]: Title: Auto-Regressive Surface Cutting

Yang Li, Victor Cheung, Xinhai Liu, Yuguang Chen, Zhongjin Luo, Biwen Lei, Haohan Weng, Zibo Zhao, Jingwei Huang, Zhuo Chen, Chunchao Guo

Comments: Tech. report. this https URL

Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2940] arXiv:2506.18069 (cross-list from cs.DL) [pdf, html, other]: Title: Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages

Klaudia Ropel, Krzysztof Kutt, Luiz do Valle Miranda, Grzegorz J. Nalepa

Comments: 10 pages, 8 figures; submitted to TPDL 2025; change in v2: updated e-mail address

Subjects: Digital Libraries (cs.DL); Computer Vision and Pattern Recognition (cs.CV)
[2941] arXiv:2506.18072 (cross-list from eess.IV) [pdf, html, other]: Title: Multimodal Medical Image Binding via Shared Text Embeddings

Yunhao Liu, Suyang Xi, Shiqi Liu, Hong Ding, Chicheng Jin, Chong Zhong, Junjun He, Catherine C. Liu, Yiqing Shen

Comments: 10 pages, 3 figures

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2942] arXiv:2506.18088 (cross-list from cs.RO) [pdf, html, other]: Title: RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation

Tianxing Chen, Zanxin Chen, Baijun Chen, Zijian Cai, Yibin Liu, Zixuan Li, Qiwei Liang, Xianliang Lin, Yiheng Ge, Zhenyu Gu, Weiliang Deng, Yubin Guo, Tian Nian, Xuanbing Xie, Qiangyu Chen, Kailun Su, Tianling Xu, Guodong Liu, Mengkang Hu, Huan-ang Gao, Kaixuan Wang, Zhixuan Liang, Yusen Qin, Xiaokang Yang, Ping Luo, Yao Mu

Comments: Project Page: this https URL, Code: this https URL, Doc: this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[2943] arXiv:2506.18158 (cross-list from cs.AI) [pdf, html, other]: Title: Chain-of-Memory: Enhancing GUI Agents for Cross-Application Navigation

Xinzge Gao, Chuanrui Hu, Bin Chen, Teng Li

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2944] arXiv:2506.18162 (cross-list from cs.LG) [pdf, html, other]: Title: Pitfalls of Conformal Predictions for Medical Image Classification

Hendrik Mehrtens, Tabea Bucher, Titus J. Brinker

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2945] arXiv:2506.18172 (cross-list from eess.IV) [pdf, html, other]: Title: STACT-Time: Spatio-Temporal Cross Attention for Cine Thyroid Ultrasound Time Series Classification

Irsyad Adam, Tengyue Zhang, Shrayes Raman, Zhuyu Qiu, Brandon Taraku, Hexiang Feng, Sile Wang, Ashwath Radhachandran, Shreeram Athreya, Vedrana Ivezic, Peipei Ping, Corey Arnold, William Speier

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2946] arXiv:2506.18201 (cross-list from cs.CL) [pdf, html, other]: Title: Deciphering Emotions in Children Storybooks: A Comparative Analysis of Multimodal LLMs in Educational Applications

Bushra Asseri, Estabraq Abdelaziz, Maha Al Mogren, Tayef Alhefdhi, Areej Al-Wabil

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[2947] arXiv:2506.18251 (cross-list from cs.GR) [pdf, html, other]: Title: Morse: Dual-Sampling for Lossless Acceleration of Diffusion Models

Chao Li, Jiawei Fan, Anbang Yao

Comments: Fixed a prompt typo in Figure 18 of the Appendix. This work is accepted to ICML 2025. The project page: this https URL

Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2948] arXiv:2506.18270 (cross-list from eess.IV) [pdf, other]: Title: Adaptive Mask-guided K-space Diffusion for Accelerated MRI Reconstruction

Qinrong Cai, Yu Guan, Zhibo Chen, Dong Liang, Qiuyun Fan, Qiegen Liu

Comments: 10 pages, 9 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2949] arXiv:2506.18323 (cross-list from eess.IV) [pdf, html, other]: Title: A Multi-Scale Spatial Attention-Based Zero-Shot Learning Framework for Low-Light Image Enhancement

Muhammad Azeem Aslam, Hassan Khalid, Nisar Ahmed

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2950] arXiv:2506.18335 (cross-list from eess.IV) [pdf, html, other]: Title: Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention

Saad Wazir, Daeyoung Kim

Comments: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 30861-30871

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2951] arXiv:2506.18371 (cross-list from eess.IV) [pdf, html, other]: Title: Transforming H&E images into IHC: A Variance-Penalized GAN for Precision Oncology

Sara Rehmat, Hafeez Ur Rehman

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2952] arXiv:2506.18378 (cross-list from eess.IV) [pdf, html, other]: Title: Taming Vision-Language Models for Medical Image Analysis: A Comprehensive Review

Haoneng Lin, Cheng Xu, Jing Qin

Comments: 34 pages

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2953] arXiv:2506.18407 (cross-list from cs.GR) [pdf, html, other]: Title: What You Think Is What You Get: Bridge User Intent and Transfer Function Design through Multimodal Large Language Models

Yiyao Wang, Bo Pan, Ke Wang, Han Liu, Jinyuan Mao, Yuxin Liu, Minfeng Zhu, Bo Zhang, Weifeng Chen, Xiuqi Huang, Wei Chen

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[2954] arXiv:2506.18443 (cross-list from cs.RO) [pdf, html, other]: Title: Radar and Event Camera Fusion for Agile Robot Ego-Motion Estimation

Yang Lyu, Zhenghao Zou, Yanfeng Li, Chunhui Zhao, Quan Pan

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2955] arXiv:2506.18474 (cross-list from eess.IV) [pdf, html, other]: Title: A Deep Convolutional Neural Network-Based Novel Class Balancing for Imbalance Data Segmentation

Atifa Kalsoom, M.A. Iftikhar, Amjad Ali, Zubair Shah, Shidin Balakrishnan, Hazrat Ali

Comments: This is a preprint of the paper submitted to the journal Scientific Reports

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2956] arXiv:2506.18484 (cross-list from eess.IV) [pdf, html, other]: Title: GANs vs. Diffusion Models for virtual staining with the HER2match dataset

Pascal Klöckner, José Teixeira, Diana Montezuma, Jaime S. Cardoso, Hugo M. Horlings, Sara P. Oliveira

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2957] arXiv:2506.18512 (cross-list from eess.IV) [pdf, html, other]: Title: MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis

Yuting Zhang, Kaishen Yuan, Hao Lu, Yutao Yue, Jintai Chen, Kaishun Wu

Subjects: Image and Video Processing (eess.IV); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[2958] arXiv:2506.18598 (cross-list from cs.LG) [pdf, html, other]: Title: No Training Wheels: Steering Vectors for Bias Correction at Inference Time

Aviral Gupta, Armaan Sethi, Ameesh Sethi

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[2959] arXiv:2506.18601 (cross-list from cs.GR) [pdf, html, other]: Title: BulletGen: Improving 4D Reconstruction with Bullet-Time Generation

Denys Rozumnyi, Jonathon Luiten, Numair Khan, Johannes Schönberger, Peter Kontschieder

Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2960] arXiv:2506.18671 (cross-list from cs.SD) [pdf, html, other]: Title: TCDiff++: An End-to-end Trajectory-Controllable Diffusion Model for Harmonious Music-Driven Group Choreography

Yuqin Dai, Wanlu Zhu, Ronghui Li, Xiu Li, Zhenyu Zhang, Jun Li, Jian Yang

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Audio and Speech Processing (eess.AS)
[2961] arXiv:2506.18680 (cross-list from cs.GR) [pdf, html, other]: Title: DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling

Anindita Ghosh, Bing Zhou, Rishabh Dabral, Jian Wang, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek, Chuan Guo

Comments: 11 pages, 7 figures, 2 tables, accepted to the ACM SIGGRAPH 2025 conference track

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[2962] arXiv:2506.18720 (cross-list from eess.IV) [pdf, html, other]: Title: Temporal Neural Cellular Automata: Application to modeling of contrast enhancement in breast MRI

Daniel M. Lang, Richard Osuala, Veronika Spieker, Karim Lekadir, Rickmer Braren, Julia A. Schnabel

Comments: MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2963] arXiv:2506.18725 (cross-list from cs.RO) [pdf, html, other]: Title: TopoRec: Point Cloud Recognition Using Topological Data Analysis

Anirban Ghosh, Iliya Kulbaka, Ian Dahlin, Ayan Dutta

Subjects: Robotics (cs.RO); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV)
[2964] arXiv:2506.18810 (cross-list from cs.AI) [pdf, html, other]: Title: ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation

Siao Tang, Xinyin Ma, Gongfan Fang, Xinchao Wang

Comments: Code is available at this https URL

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[2965] arXiv:2506.18842 (cross-list from cs.DB) [pdf, html, other]: Title: LIGHTHOUSE: Fast and precise distance to shoreline calculations from anywhere on earth

Patrick Beukema, Henry Herzog, Yawen Zhang, Hunter Pitelka, Favyen Bastani

Comments: 8 pages, 7 figures, 1 table, ICML 2025 ML4RS

Subjects: Databases (cs.DB); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2966] arXiv:2506.18844 (cross-list from cs.RO) [pdf, other]: Title: Reproducible Evaluation of Camera Auto-Exposure Methods in the Field: Platform, Benchmark and Lessons Learned

Olivier Gamache, Jean-Michel Fortin, Matěj Boxan, François Pomerleau, Philippe Giguère

Comments: 19 pages, 11 figures, pre-print version of the accepted paper for IEEE Transactions on Field Robotics (T-FR)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2967] arXiv:2506.18885 (cross-list from cs.RO) [pdf, html, other]: Title: GRAND-SLAM: Local Optimization for Globally Consistent Large-Scale Multi-Agent Gaussian SLAM

Annika Thomas, Aneesa Sonawalla, Alex Rose, Jonathan P. How

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2968] arXiv:2506.18919 (cross-list from cs.CL) [pdf, html, other]: Title: MemeMind: A Large-Scale Multimodal Dataset with Chain-of-Thought Reasoning for Harmful Meme Detection

Hexiang Gu, Qifan Yu, Saihui Hou, Zhiqin Fang, Huijia Wu, Zhaofeng He

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2969] arXiv:2506.19051 (cross-list from eess.IV) [pdf, html, other]: Title: NIC-RobustBench: A Comprehensive Open-Source Toolkit for Neural Image Compression and Robustness Analysis

Georgii Bychkov, Khaled Abud, Egor Kovalev, Alexander Gushchin, Dmitriy Vatolin, Anastasia Antsiferova

Comments: arXiv admin note: text overlap with arXiv:2411.11795

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[2970] arXiv:2506.19055 (cross-list from eess.IV) [pdf, html, other]: Title: Xray2Xray: World Model from Chest X-rays with Volumetric Context

Zefan Yang, Xinrui Song, Xuanang Xu, Yongyi Shi, Ge Wang, Mannudeep K. Kalra, Pingkun Yan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2971] arXiv:2506.19106 (cross-list from eess.IV) [pdf, html, other]: Title: Staining normalization in histopathology: Method benchmarking using multicenter dataset

Umair Khan, Jouni Härkönen, Marjukka Friman, Leena Latonen, Teijo Kuopio, Pekka Ruusuvuori

Comments: 18 pages, 9 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)
[2972] arXiv:2506.19139 (cross-list from cs.GR) [pdf, html, other]: Title: SOF: Sorted Opacity Fields for Fast Unbounded Surface Reconstruction

Lukas Radl, Felix Windisch, Thomas Deixelberger, Jozef Hladky, Michael Steiner, Dieter Schmalstieg, Markus Steinberger

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[2973] arXiv:2506.19167 (cross-list from eess.IV) [pdf, other]: Title: A Deep Learning Based Method for Fast Registration of Cardiac Magnetic Resonance Images

Benjamin Graham

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2974] arXiv:2506.19222 (cross-list from eess.IV) [pdf, html, other]: Title: Deformable Medical Image Registration with Effective Anatomical Structure Representation and Divide-and-Conquer Network

Xinke Ma, Yongsheng Pan, Qingjie Zeng, Mengkang Lu, Bolysbek Murat Yerzhanuly, Bazargul Matkerim, Yong Xia

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2975] arXiv:2506.19234 (cross-list from eess.IV) [pdf, html, other]: Title: Quantitative Benchmarking of Anomaly Detection Methods in Digital Pathology

Can Cui, Xindong Zheng, Ruining Deng, Quan Liu, Tianyuan Yao, Keith T Wilson, Lori A Coburn, Bennett A Landman, Haichun Yang, Yaohong Wang, Yuankai Huo

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2976] arXiv:2506.19266 (cross-list from q-bio.NC) [pdf, other]: Title: Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans

Jiahao Huang, Ruifeng Li, Wenwen Yu, Anan Li, Xiangning Li, Mingchao Yan, Lei Xie, Qingrun Zeng, Xueyan Jia, Shuxin Wang, Ronghui Ju, Feng Chen, Qingming Luo, Hui Gong, Andrew Zalesky, Xiaoquan Yang, Yuanjing Feng, Zheng Wang

Comments: 34 pages, 6 figures

Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[2977] arXiv:2506.19297 (cross-list from eess.IV) [pdf, html, other]: Title: Explicit Residual-Based Scalable Image Coding for Humans and Machines

Yui Tatsumi, Ziyue Zeng, Hiroshi Watanabe

Comments: Accepted to IEEE 27th International Workshop on Multimedia Signal Processing (MMSP 2025)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2978] arXiv:2506.19360 (cross-list from cs.CR) [pdf, html, other]: Title: SoK: Can Synthetic Images Replace Real Data? A Survey of Utility and Privacy of Synthetic Image Generation

Yunsung Chung, Yunbei Zhang, Nassir Marrouche, Jihun Hamm

Comments: Accepted at the 34th USENIX Security Symposium (USENIX Security '25). 21 pages, plus a 6-page appendix

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[2979] arXiv:2506.19363 (cross-list from eess.IV) [pdf, html, other]: Title: Reconsidering Explicit Longitudinal Mammography Alignment for Enhanced Breast Cancer Risk Prediction

Solveig Thrun, Stine Hansen, Zijun Sun, Nele Blum, Suaiba A. Salahuddin, Kristoffer Wickstrøm, Elisabeth Wetzer, Robert Jenssen, Maik Stille, Michael Kampffmeyer

Comments: MICCAI 2025, early accepted

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2980] arXiv:2506.19387 (cross-list from eess.IV) [pdf, other]: Title: NAADA: A Noise-Aware Attention Denoising Autoencoder for Dental Panoramic Radiographs

Khuram Naveed, Bruna Neves de Freitas, Ruben Pauwels

Comments: 10 pages, 8 figures

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2981] arXiv:2506.19415 (cross-list from cs.GR) [pdf, html, other]: Title: Virtual Memory for 3D Gaussian Splatting

Jonathan Haberl, Philipp Fleck, Clemens Arth

Comments: Based on Jonathan Haberl's 2024 master's thesis; submitted to TVCG in Feb. 2025

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[2982] arXiv:2506.19455 (cross-list from eess.IV) [pdf, html, other]: Title: Angio-Diff: Learning a Self-Supervised Adversarial Diffusion Model for Angiographic Geometry Generation

Zhifeng Wang, Renjiao Yi, Xin Wen, Chenyang Zhu, Kai Xu, Kunlun He

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2983] arXiv:2506.19464 (cross-list from eess.IV) [pdf, html, other]: Title: Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks

Ankita Raj, Harsh Swaika, Deepankar Varma, Chetan Arora

Comments: Accepted to MICCAI 2024

Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[2984] arXiv:2506.19491 (cross-list from cs.ET) [pdf, html, other]: Title: Experimental Assessment of Neural 3D Reconstruction for Small UAV-based Applications

Genís Castillo Gómez-Raya, Álmos Veres-Vitályos, Filip Lemic, Pablo Royo, Mario Montagud, Sergi Fernández, Sergi Abadal, Xavier Costa-Pérez

Comments: 6 pages, 7 figures, 2 tables, accepted at IEEE International Symposium on Personal, Indoor and Mobile Radio Communications 2025

Subjects: Emerging Technologies (cs.ET); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI); Image and Video Processing (eess.IV)
[2985] arXiv:2506.19558 (cross-list from cs.LG) [pdf, html, other]: Title: ConCM: Consistency-Driven Calibration and Matching for Few-Shot Class-Incremental Learning

QinZhe Wang, Zixuan Chen, Keke Huang, Xiu Su, Chunhua Yang, Chang Xu

Comments: 9 pages, 5 figures (excluding the appendix)

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2986] arXiv:2506.19579 (cross-list from cs.RO) [pdf, html, other]: Title: Fake or Real, Can Robots Tell? Evaluating Embodied Vision-Language Models on Real and 3D-Printed Objects

Federico Tavella, Kathryn Mearns, Angelo Cangelosi

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2987] arXiv:2506.19590 (cross-list from eess.IV) [pdf, html, other]: Title: Learning from Anatomy: Supervised Anatomical Pretraining (SAP) for Improved Metastatic Bone Disease Segmentation in Whole-Body MRI

Joris Wuts, Jakub Ceranka, Nicolas Michoux, Frédéric Lecouvet, Jef Vandemeulebroucke

Comments: This preprint is currently under review at *Computers in Biology and Medicine* (Elsevier). This version has not been peer-reviewed

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2988] arXiv:2506.19600 (cross-list from eess.IV) [pdf, html, other]: Title: Filling of incomplete sinograms from sparse PET detector configurations using a residual U-Net

Klara Leffler, Luigi Tommaso Luppino, Samuel Kuttner, Karin Söderkvist, Jan Axelsson

Comments: 15 pages, 9 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[2989] arXiv:2506.19687 (cross-list from eess.IV) [pdf, html, other]: Title: ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation

Ahmad Mustafa, Reza Rastegar, Ghassan AlRegib

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2990] arXiv:2506.19708 (cross-list from cs.GR) [pdf, html, other]: Title: Uncovering Conceptual Blindspots in Generative Image Models Using Sparse Autoencoders

Matyas Bohacek, Thomas Fel, Maneesh Agrawala, Ekdeep Singh Lubana

Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2991] arXiv:2506.19741 (cross-list from cs.LG) [pdf, html, other]: Title: Noise Consistency Training: A Native Approach for One-Step Generator in Learning Additional Controls

Yihong Luo, Shuchen Xue, Tianyang Hu, Jing Tang

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[2992] arXiv:2506.19742 (cross-list from eess.IV) [pdf, html, other]: Title: NeRF-based CBCT Reconstruction needs Normalization and Initialization

Zhuowei Xu, Han Li, Dai Sun, Zhicheng Li, Yujia Li, Qingpeng Kong, Zhiwei Cheng, Nassir Navab, S. Kevin Zhou

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2993] arXiv:2506.19797 (cross-list from eess.IV) [pdf, html, other]: Title: Systematic Review of Pituitary Gland and Pituitary Adenoma Automatic Segmentation Techniques in Magnetic Resonance Imaging

Mubaraq Yakubu, Navodini Wijethilake, Jonathan Shapey, Andrew King, Alexander Hammers

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2994] arXiv:2506.19807 (cross-list from cs.AI) [pdf, other]: Title: KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

Baochang Ren, Shuofei Qiao, Wenhao Yu, Huajun Chen, Ningyu Zhang

Comments: Work in progress

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
[2995] arXiv:2506.19816 (cross-list from cs.RO) [pdf, html, other]: Title: CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation

Hao Li, Shuai Yang, Yilun Chen, Yang Tian, Xiaoda Yang, Xinyi Chen, Hanqing Wang, Tai Wang, Feng Zhao, Dahua Lin, Jiangmiao Pang

Comments: 36 pages, 21 figures

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2996] arXiv:2506.19827 (cross-list from cs.RO) [pdf, html, other]: Title: Look to Locate: Vision-Based Multisensory Navigation with 3-D Digital Maps for GNSS-Challenged Environments

Ola Elmaghraby, Eslam Mounier, Paulo Ricardo Marques de Araujo, Aboelmagd Noureldin

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2997] arXiv:2506.19847 (cross-list from cs.LG) [pdf, html, other]: Title: Orthogonal Finetuning Made Scalable

Zeju Qiu, Weiyang Liu, Adrian Weller, Bernhard Schölkopf

Comments: Technical report (17 pages, 7 figures, project page: this https URL)

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[2998] arXiv:2506.19860 (cross-list from eess.SP) [pdf, html, other]: Title: A Multi-Modal Spatial Risk Framework for EV Charging Infrastructure Using Remote Sensing

Oktay Karakuş, Padraig Corcoran

Comments: 11 pages, 4 figures, 2 tables

Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[2999] arXiv:2506.19935 (cross-list from cs.LG) [pdf, html, other]: Title: Any-Order GPT as Masked Diffusion Model: Decoupling Formulation and Architecture

Shuchen Xue, Tianyu Xie, Tianyang Hu, Zijin Feng, Jiacheng Sun, Kenji Kawaguchi, Zhenguo Li, Zhi-Ming Ma

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[3000] arXiv:2506.19975 (cross-list from eess.IV) [pdf, html, other]: Title: VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration

Hang Zhang, Yuxi Zhang, Jiazheng Wang, Xiang Chen, Renjiu Hu, Xin Tian, Gaolei Li, Min Liu

Comments: Accepted for publication at MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[3001] arXiv:2506.20045 (cross-list from cs.RO) [pdf, html, other]: Title: Consensus-Driven Uncertainty for Robotic Grasping based on RGB Perception

Eric C. Joyce, Qianwen Zhao, Nathaniel Burgdorfer, Long Wang, Philippos Mordohai

Comments: Accepted to IROS 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3002] arXiv:2506.20100 (cross-list from cs.LG) [pdf, html, other]: Title: MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations

Vardhan Dongre, Chi Gui, Shubham Garg, Hooshang Nayyeri, Gokhan Tur, Dilek Hakkani-Tür, Vikram S. Adve

Comments: 66 pages, 32 figures, 23 tables

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3003] arXiv:2506.20200 (cross-list from eess.IV) [pdf, html, other]: Title: MS-IQA: A Multi-Scale Feature Fusion Network for PET/CT Image Quality Assessment

Siqiao Li, Chen Hui, Wei Zhang, Rui Liang, Chenyue Song, Feng Jiang, Haiqi Zhu, Zhixuan Li, Hong Huang, Xiang Li

Comments: Accepted to MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3004] arXiv:2506.20245 (cross-list from cs.LG) [pdf, html, other]: Title: FedBKD: Distilled Federated Learning to Embrace Generalization and Personalization on Non-IID Data

Yushan Zhao, Jinyuan He, Donglai Chen, Weijie Luo, Chong Xie, Ri Zhang, Yonghong Chen, Yan Xu

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3005] arXiv:2506.20267 (cross-list from cs.GR) [pdf, html, other]: Title: X-SiT: Inherently Interpretable Surface Vision Transformers for Dementia Diagnosis

Fabian Bongratz, Tom Nuno Wolf, Jaume Gual Ramon, Christian Wachinger

Comments: MICCAI 2025

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3006] arXiv:2506.20282 (cross-list from eess.IV) [pdf, html, other]: Title: Opportunistic Osteoporosis Diagnosis via Texture-Preserving Self-Supervision, Mixture of Experts and Multi-Task Integration

Jiaxing Huang, Heng Guo, Le Lu, Fan Yang, Minfeng Xu, Ge Yang, Wei Luo

Comments: Accepted by MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3007] arXiv:2506.20303 (cross-list from eess.IV) [pdf, other]: Title: FundaQ-8: A Clinically-Inspired Scoring Framework for Automated Fundus Image Quality Assessment

Lee Qi Zun, Oscar Wong Jin Hao, Nor Anita Binti Che Omar, Zalifa Zakiah Binti Asnir, Mohamad Sabri bin Sinal Zainal, Goh Man Fye

Subjects: Image and Video Processing (eess.IV); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3008] arXiv:2506.20305 (cross-list from cs.LG) [pdf, html, other]: Title: Learning Moderately Input-Sensitive Functions: A Case Study in QR Code Decoding

Kazuki Yoda, Kazuhiko Kawamoto, Hiroshi Kera

Comments: 17 pages, 13 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3009] arXiv:2506.20333 (cross-list from eess.IV) [pdf, html, other]: Title: EAGLE: An Efficient Global Attention Lesion Segmentation Model for Hepatic Echinococcosis

Jiayan Chen, Kai Li, Yulu Zhao, Jianqiang Huang, Zhan Wang

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3010] arXiv:2506.20355 (cross-list from quant-ph) [pdf, html, other]: Title: Practical insights on the effect of different encodings, ansätze and measurements in quantum and hybrid convolutional neural networks

Jesús Lozano-Cruz, Albert Nieto-Morales, Oriol Balló-Gimbernat, Adan Garriga, Antón Rodríguez-Otero, Alejandro Borrallo-Rentero

Comments: 20 pages, 22 figures

Subjects: Quantum Physics (quant-ph); Computer Vision and Pattern Recognition (cs.CV)
[3011] arXiv:2506.20367 (cross-list from cs.GR) [pdf, html, other]: Title: DreamAnywhere: Object-Centric Panoramic 3D Scene Generation

Edoardo Alberto Dominici, Jozef Hladky, Floor Verhoeven, Lukas Radl, Thomas Deixelberger, Stefan Ainetter, Philipp Drescher, Stefan Hauswiesner, Arno Coomans, Giacomo Nazzaro, Konstantinos Vardis, Markus Steinberger

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3012] arXiv:2506.20407 (cross-list from eess.IV) [pdf, html, other]: Title: Fusing Radiomic Features with Deep Representations for Gestational Age Estimation in Fetal Ultrasound Images

Fangyijie Wang, Yuan Liang, Sourav Bhattacharjee, Abey Campbell, Kathleen M. Curran, Guénolé Silvestre

Comments: Accepted at MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3013] arXiv:2506.20430 (cross-list from cs.CL) [pdf, html, other]: Title: An Agentic System for Rare Disease Diagnosis with Traceable Reasoning

Weike Zhao, Chaoyi Wu, Yanjie Fan, Xiaoman Zhang, Pengcheng Qiu, Yuze Sun, Xiao Zhou, Yanfeng Wang, Xin Sun, Ya Zhang, Yongguo Yu, Kun Sun, Weidi Xie

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[3014] arXiv:2506.20566 (cross-list from cs.RO) [pdf, html, other]: Title: HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction

Zhonghao Shi, Enyu Zhao, Nathaniel Dennler, Jingzhen Wang, Xinyang Xu, Kaleen Shrestha, Mengxue Fu, Daniel Seita, Maja Matarić

Comments: Accepted to the 19th International Symposium on Experimental Robotics (ISER 2025)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3015] arXiv:2506.20614 (cross-list from eess.IV) [pdf, html, other]: Title: Weighted Mean Frequencies: a handcraft Fourier feature for 4D Flow MRI segmentation

Simon Perrin, Sébastien Levilly, Huajun Sun, Harold Mouchère, Jean-Michel Serfaty

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3016] arXiv:2506.20652 (cross-list from cs.GR) [pdf, html, other]: Title: EditP23: 3D Editing via Propagation of Image Prompts to Multi-View

Roi Bar-On, Dana Cohen-Bar, Daniel Cohen-Or

Comments: Code, supplementary videos, interactive 3D visualizations, and additional results are available at this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3017] arXiv:2506.20683 (cross-list from eess.IV) [pdf, html, other]: Title: Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG

Alexander Selivanov, Philip Müller, Özgün Turgut, Nil Stolt-Ansó, Daniel Rückert

Comments: accepted to MICCAI 2025 (Springer LNCS)

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[3018] arXiv:2506.20689 (cross-list from eess.IV) [pdf, other]: Title: U-R-VEDA: Integrating UNET, Residual Links, Edge and Dual Attention, and Vision Transformer for Accurate Semantic Segmentation of CMRs

Racheal Mukisa, Arvind K. Bansal

Comments: 15 pages, 3 figures

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3019] arXiv:2506.20703 (cross-list from cs.GR) [pdf, html, other]: Title: Generative Blocks World: Moving Things Around in Pictures

Vaibhav Vavilala, Seemandhar Jain, Rahul Vasanth, D.A. Forsyth, Anand Bhattad

Comments: 23 pages, 16 figures, 2 tables

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3020] arXiv:2506.20812 (cross-list from cs.RO) [pdf, html, other]: Title: Model-Based Real-Time Pose and Sag Estimation of Overhead Power Lines Using LiDAR for Drone Inspection

Alexandre Girard, Steven A. Parkison, Philippe Hamelin

Comments: Submitted to IEEE CASE 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3021] arXiv:2506.20816 (cross-list from cs.LG) [pdf, html, other]: Title: Universal and Efficient Detection of Adversarial Data through Nonuniform Impact on Network Layers

Furkan Mumcu, Yasin Yilmaz

Comments: arXiv admin note: substantial text overlap with arXiv:2410.17442

Journal-ref: Transactions on Machine Learning Research, June 2025

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3022] arXiv:2506.20875 (cross-list from cs.GR) [pdf, html, other]: Title: 3DGH: 3D Head Generation with Composable Hair and Face

Chengan He, Junxuan Li, Tobias Kirschstein, Artem Sevastopolsky, Shunsuke Saito, Qingyang Tan, Javier Romero, Chen Cao, Holly Rushmeier, Giljoo Nam

Comments: Accepted to SIGGRAPH 2025. Project page: this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3023] arXiv:2506.20897 (cross-list from eess.IV) [pdf, html, other]: Title: Development of MR spectral analysis method robust against static magnetic field inhomogeneity

Shuki Maruyama, Hidenori Takeshima

Comments: 11 pages, 6 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3024] arXiv:2506.20946 (cross-list from cs.GR) [pdf, html, other]: Title: Consistent Zero-shot 3D Texture Synthesis Using Geometry-aware Diffusion and Temporal Video Models

Donggoo Kang, Jangyeong Kim, Dasol Jeong, Junyoung Choi, Jeonga Wi, Hyunmin Lee, Joonho Gwon, Joonki Paik

Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3025] arXiv:2506.20969 (cross-list from cs.RO) [pdf, html, other]: Title: ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation

Shruti Bansal, Wenshan Wang, Yifei Liu, Parv Maheshwari

Comments: Accepted at Thermal Infrared in Robotics (TIRO) Workshop, ICRA 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3026] arXiv:2506.20990 (cross-list from cs.LG) [pdf, html, other]: Title: SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes

Yifan Yang, Zhen Zhang, Rupak Vignesh Swaminathan, Jing Liu, Nathan Susanj, Zheng Zhang

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3027] arXiv:2506.21037 (cross-list from cs.LG) [pdf, html, other]: Title: RL-Selector: Reinforcement Learning-Guided Data Selection via Redundancy Assessment

Suorong Yang, Peijia Li, Furao Shen, Jian Zhao

Comments: ICCV 2025

Journal-ref: ICCV 2025

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3028] arXiv:2506.21041 (cross-list from cs.RO) [pdf, html, other]: Title: SEAL: Vision-Language Model-Based Safe End-to-End Cooperative Autonomous Driving with Adaptive Long-Tail Modeling

Junwei You, Pei Li, Zhuoyu Jiang, Zilin Huang, Rui Gan, Haotian Shi, Bin Ran

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3029] arXiv:2506.21144 (cross-list from cs.LG) [pdf, html, other]: Title: Personalized Federated Learning via Dual-Prompt Optimization and Cross Fusion

Yuguang Zhang, Kuangpu Guo, Zhihe Lu, Yunbo Wang, Jian Liang

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3030] arXiv:2506.21171 (cross-list from eess.IV) [pdf, other]: Title: Uncover Treasures in DCT: Advancing JPEG Quality Enhancement by Exploiting Latent Correlations

Jing Yang, Qunliang Xing, Mai Xu, Minglang Qiao

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3031] arXiv:2506.21245 (cross-list from eess.IV) [pdf, html, other]: Title: GANet-Seg: Adversarial Learning for Brain Tumor Segmentation with Hybrid Generative Models

Qifei Cui, Xinyu Lu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3032] arXiv:2506.21272 (cross-list from cs.GR) [pdf, html, other]: Title: FairyGen: Storied Cartoon Video from a Single Child-Drawn Character

Jiayi Zheng, Xiaodong Cun

Comments: Project Page: this https URL ; Code: this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[3033] arXiv:2506.21319 (cross-list from cs.HC) [pdf, html, other]: Title: SimVecVis: A Dataset for Enhancing MLLMs in Visualization Understanding

Can Liu, Chunlin Da, Xiaoxiao Long, Yuxiao Yang, Yu Zhang, Yong Wang

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[3034] arXiv:2506.21331 (cross-list from cs.DL) [pdf, html, other]: Title: Automatic Reviewers Assignment to a Research Paper Based on Allied References and Publications Weight

Tamim Al Mahmud, B M Mainul Hossain, Dilshad Ara

Comments: IEEE Conference Proceedings (5 Pages)

Journal-ref: 2018 4th International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 2018, pp. 1-5

Subjects: Digital Libraries (cs.DL); Computer Vision and Pattern Recognition (cs.CV)
[3035] arXiv:2506.21448 (cross-list from eess.AS) [pdf, html, other]: Title: ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Huadai Liu, Jialei Wang, Kaicheng Luo, Wen Wang, Qian Chen, Zhou Zhao, Wei Xue

Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[3036] arXiv:2506.21458 (cross-list from cs.AI) [pdf, other]: Title: Spatial Mental Modeling from Limited Views

Baiqiao Yin, Qineng Wang, Pingyue Zhang, Jianshu Zhang, Kangrui Wang, Zihan Wang, Jieyu Zhang, Keshigeyan Chandrasegaran, Han Liu, Ranjay Krishna, Saining Xie, Manling Li, Jiajun Wu, Li Fei-Fei

Comments: Preprint version

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3037] arXiv:2506.21499 (cross-list from eess.IV) [pdf, html, other]: Title: Lightweight Physics-Informed Zero-Shot Ultrasound Plane Wave Denoising

Hojat Asgariandehkordi, Mostafa Sharifzadeh, Hassan Rivaz

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3038] arXiv:2506.21535 (cross-list from eess.IV) [pdf, html, other]: Title: Exploring the Design Space of 3D MLLMs for CT Report Generation

Mohammed Baharoon, Jun Ma, Congyu Fang, Augustin Toma, Bo Wang

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3039] arXiv:2506.21537 (cross-list from quant-ph) [pdf, html, other]: Title: ResQ: A Novel Framework to Implement Residual Neural Networks on Analog Rydberg Atom Quantum Computers

Nicholas S. DiBrita, Jason Han, Tirthak Patel

Comments: ResQ will appear in the Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2025

Subjects: Quantum Physics (quant-ph); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET)
[3040] arXiv:2506.21586 (cross-list from cs.CL) [pdf, html, other]: Title: Can Vision Language Models Understand Mimed Actions?

Hyundong Cho, Spencer Lin, Tejas Srinivasan, Michael Saxon, Deuksin Kwon, Natali T. Chavez, Jonathan May

Comments: ACL 2025 Findings

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3041] arXiv:2506.21592 (cross-list from cs.CL) [pdf, html, other]: Title: SignBart -- New approach with the skeleton sequence for Isolated Sign language Recognition

Tinh Nguyen, Minh Khue Phan Tran

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3042] arXiv:2506.21601 (cross-list from cs.IR) [pdf, html, other]: Title: Hierarchical Patch Compression for ColPali: Efficient Multi-Vector Document Retrieval with Dynamic Pruning and Quantization

Duong Bach

Comments: 9 pages

Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[3043] arXiv:2506.21604 (cross-list from cs.IR) [pdf, html, other]: Title: Evaluating VisualRAG: Quantifying Cross-Modal Performance in Enterprise Document Understanding

Varun Mannam, Fang Wang, Xin Chen

Comments: Conference: KDD conference workshop: this https URL

Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[3044] arXiv:2506.21629 (cross-list from cs.GR) [pdf, html, other]: Title: ICP-3DGS: SfM-free 3D Gaussian Splatting for Large-scale Unbounded Scenes

Chenhao Zhang, Yezhi Shen, Fengqing Zhu

Comments: 6 pages, Source code is available at this https URL. To appear at ICIP 2025

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3045] arXiv:2506.21630 (cross-list from cs.RO) [pdf, html, other]: Title: TOMD: A Trail-based Off-road Multimodal Dataset for Traversable Pathway Segmentation under Challenging Illumination Conditions

Yixin Sun, Li Li, Wenke E, Amir Atapour-Abarghouei, Toby P. Breckon

Comments: 8 pages, 9 figures, 2025 IJCNN

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3046] arXiv:2506.21635 (cross-list from cs.RO) [pdf, html, other]: Title: AeroLite-MDNet: Lightweight Multi-task Deviation Detection Network for UAV Landing

Haiping Yang, Huaxing Liu, Wei Wu, Zuohui Chen, Ning Wu

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3047] arXiv:2506.21655 (cross-list from cs.LG) [pdf, html, other]: Title: APO: Enhancing Reasoning Ability of MLLMs via Asymmetric Policy Optimization

Minjie Hong, Zirun Guo, Yan Xia, Zehan Wang, Ziang Zhang, Tao Jin, Zhou Zhao

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3048] arXiv:2506.21680 (cross-list from eess.IV) [pdf, html, other]: Title: PhotonSplat: 3D Scene Reconstruction and Colorization from SPAD Sensors

Sai Sri Teja, Sreevidya Chintalapati, Vinayak Gupta, Mukund Varma T, Haejoon Lee, Aswin Sankaranarayanan, Kaushik Mitra

Comments: Accepted at the International Conference on Computational Photography (ICCP) 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3049] arXiv:2506.21714 (cross-list from cs.LG) [pdf, html, other]: Title: ODE$_t$(ODE$_l$): Shortcutting the Time and Length in Diffusion and Flow Models for Faster Sampling

Denis Gudovskiy, Wenzhao Zheng, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer

Comments: Preprint. Github page: this http URL

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3050] arXiv:2506.21732 (cross-list from cs.RO) [pdf, html, other]: Title: Experimental investigation of pose informed reinforcement learning for skid-steered visual navigation

Ameya Salvi, Venkat Krovi

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Systems and Control (eess.SY)
[3051] arXiv:2506.21748 (cross-list from physics.optics) [pdf, html, other]: Title: Inverse Design of Diffractive Metasurfaces Using Diffusion Models

Liav Hen, Erez Yosef, Dan Raviv, Raja Giryes, Jacob Scheuer

Subjects: Optics (physics.optics); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3052] arXiv:2506.21765 (cross-list from eess.IV) [pdf, html, other]: Title: TUS-REC2024: A Challenge to Reconstruct 3D Freehand Ultrasound Without External Tracker

Qi Li, Shaheer U. Saeed, Yuliang Huang, Mingyuan Luo, Zhongnuo Yan, Jiongquan Chen, Xin Yang, Dong Ni, Nektarios Winter, Phuc Nguyen, Lucas Steinberger, Caelan Haney, Yuan Zhao, Mingjie Jiang, Bowen Ren, SiYeoul Lee, Seonho Kim, MinKyung Seo, MinWoo Kim, Yimeng Dou, Zhiwei Zhang, Yin Li, Tomy Varghese, Dean C. Barratt, Matthew J. Clarkson, Tom Vercauteren, Yipeng Hu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3053] arXiv:2506.21812 (cross-list from cs.CL) [pdf, html, other]: Title: Towards Transparent AI: A Survey on Explainable Large Language Models

Avash Palikhe, Zhenyu Yu, Zichong Wang, Wenbin Zhang

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3054] arXiv:2506.21860 (cross-list from cs.RO) [pdf, html, other]: Title: Embodied Domain Adaptation for Object Detection

Xiangyu Shi, Yanyuan Qiao, Lingqiao Liu, Feras Dayoub

Comments: Accepted by IROS 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3055] arXiv:2506.21876 (cross-list from cs.CL) [pdf, html, other]: Title: Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation

Qiyue Gao, Xinyu Pi, Kevin Liu, Junrong Chen, Ruolan Yang, Xinqi Huang, Xinyu Fang, Lu Sun, Gautham Kishore, Bo Ai, Stone Tao, Mengyang Liu, Jiaxi Yang, Chao-Jung Lai, Chuanyang Jin, Jiannan Xiang, Benhao Huang, Zeming Chen, David Danks, Hao Su, Tianmin Shu, Ziqiao Ma, Lianhui Qin, Zhiting Hu

Comments: ACL 2025 (Findings)

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3056] arXiv:2506.21880 (cross-list from eess.IV) [pdf, html, other]: Title: Physical Degradation Model-Guided Interferometric Hyperspectral Reconstruction with Unfolding Transformer

Yuansheng Li, Yunhao Zou, Linwei Chen, Ying Fu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3057] arXiv:2506.21884 (cross-list from eess.IV) [pdf, html, other]: Title: UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields

Fabian Perez, Sara Rojas, Carlos Hinojosa, Hoover Rueda-Chacón, Bernard Ghanem

Comments: Paper accepted at ICCV 2025 main conference

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP)
[3058] arXiv:2506.21934 (cross-list from cs.IR) [pdf, html, other]: Title: CAL-RAG: Retrieval-Augmented Multi-Agent Generation for Content-Aware Layout Design

Najmeh Forouzandehmehr, Reza Yousefi Maragheh, Sriram Kollipara, Kai Zhao, Topojoy Biswas, Evren Korpeoglu, Kannan Achan

Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[3059] arXiv:2506.21976 (cross-list from cs.LG) [pdf, html, other]: Title: SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model

Shuhan Tan, John Lambert, Hong Jeon, Sakshum Kulshrestha, Yijing Bai, Jing Luo, Dragomir Anguelov, Mingxing Tan, Chiyu Max Jiang

Comments: Accepted to CVPR 2025

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA); Robotics (cs.RO)
[3060] arXiv:2506.21977 (cross-list from eess.IV) [pdf, other]: Title: StableCodec: Taming One-Step Diffusion for Extreme Image Compression

Tianyu Zhang, Xin Luo, Li Li, Dong Liu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3061] arXiv:2506.22012 (cross-list from eess.IV) [pdf, html, other]: Title: Noise-Inspired Diffusion Model for Generalizable Low-Dose CT Reconstruction

Qi Gao, Zhihao Chen, Dong Zeng, Junping Zhang, Jianhua Ma, Hongming Shan

Comments: Accepted for publication in Medical Image Analysis, 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3062] arXiv:2506.22041 (cross-list from eess.IV) [pdf, html, other]: Title: Towards Scalable and Robust White Matter Lesion Localization via Multimodal Deep Learning

Julia Machnio, Sebastian Nørgaard Llambias, Mads Nielsen, Mostafa Mehdipour Ghazi

Comments: 2nd Sorbonne-Heidelberg Workshop on AI in medicine: Machine Learning for multi-modal data

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3063] arXiv:2506.22116 (cross-list from cs.RO) [pdf, html, other]: Title: Evaluating Pointing Gestures for Target Selection in Human-Robot Collaboration

Noora Sassali, Roel Pieters

Comments: Accepted by the 2025 34th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). Preprint

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3064] arXiv:2506.22156 (cross-list from cs.AR) [pdf, html, other]: Title: Hardware acceleration for ultra-fast Neural Network training on FPGA for MRF map reconstruction

Mattia Ricchi, Fabrizio Alfonsi, Camilla Marella, Marco Barbieri, Alessandra Retico, Leonardo Brizi, Alessandro Gabrielli, Claudia Testa

Comments: 8 pages, 2 figures, to be published in conference proceedings of SDPS 2024: 2024 International Conference of the Society for Design and Process Science on Advances and Challenges of Applying AI/GenAI in Design and Process Science

Subjects: Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Instrumentation and Detectors (physics.ins-det)
[3065] arXiv:2506.22176 (cross-list from cs.RO) [pdf, html, other]: Title: KnotDLO: Toward Interpretable Knot Tying

Holly Dinkel, Raghavendra Navaratna, Jingyi Xiang, Brian Coltin, Trey Smith, Timothy Bretl

Comments: 4 pages, 5 figures, presented at the Workshop on 3D Visual Representations for Manipulation at the 2023 IEEE International Conference on Robotics and Automation in Yokohama, Japan. Video presentation [this https URL]. Poster [this https URL]. 3DVRM Workshop [this https URL]

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3066] arXiv:2506.22222 (cross-list from eess.IV) [pdf, html, other]: Title: Advanced Deep Learning Techniques for Automated Segmentation of Type B Aortic Dissections

Hao Xu, Ruth Lim, Brian E. Chapman

Comments: 9 pages, 5 figures, 3 tables

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3067] arXiv:2506.22226 (cross-list from eess.IV) [pdf, html, other]: Title: Cardiovascular disease classification using radiomics and geometric features from cardiac CT

Ajay Mittal, Raghav Mehta, Omar Todd, Philipp Seeböck, Georg Langs, Ben Glocker

Comments: Under Review at STACOM 2025 with MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3068] arXiv:2506.22280 (cross-list from eess.IV) [pdf, html, other]: Title: DIGS: Dynamic CBCT Reconstruction using Deformation-Informed 4D Gaussian Splatting and a Low-Rank Free-Form Deformation Model

Yuliang Huang, Imraj Singh, Thomas Joyce, Kris Thielemans, Jamie R. McClelland

Comments: Accepted by MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3069] arXiv:2506.22304 (cross-list from cs.LG) [pdf, html, other]: Title: Unfolding Generative Flows with Koopman Operators: Fast and Interpretable Sampling

Erkan Turan, Aristotelis Siozopoulos, Maks Ovsjanikov

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3070] arXiv:2506.22340 (cross-list from quant-ph) [pdf, html, other]: Title: QuKAN: A Quantum Circuit Born Machine approach to Quantum Kolmogorov Arnold Networks

Yannick Werner, Akash Malemath, Mengxi Liu, Vitor Fortes Rey, Nikolaos Palaiodimopoulos, Paul Lukowicz, Maximilian Kiefer-Emmanouilidis

Subjects: Quantum Physics (quant-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3071] arXiv:2506.22397 (cross-list from eess.IV) [pdf, other]: Title: Dehazing Light Microscopy Images with Guided Conditional Flow Matching: finding a sweet spot between fidelity and realism

Anirban Ray, Ashesh, Florian Jug

Comments: 4 figures, 10 pages + refs, 40 pages total (including supplement), 24 supplementary figures

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3072] arXiv:2506.22426 (cross-list from eess.IV) [pdf, html, other]: Title: Single-shot HDR using conventional image sensor shutter functions and optical randomization

Xiang Dai, Kyrollos Yanny, Kristina Monakhova, Nicholas Antipa

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Signal Processing (eess.SP); Optics (physics.optics)
[3073] arXiv:2506.22467 (cross-list from eess.SP) [pdf, other]: Title: SegmentAnyMuscle: A universal muscle segmentation model across different locations in MRI

Roy Colglazier, Jisoo Lee, Haoyu Dong, Hanxue Gu, Yaqian Chen, Joseph Cao, Zafer Yildiz, Zhonghao Liu, Nicholas Konz, Jichen Yang, Jikai Zhang, Yuwen Chen, Lin Li, Adrian Camarena, Maciej A. Mazurowski

Comments: 24 pages, 6 figures

Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[3074] arXiv:2506.22482 (cross-list from cs.NI) [pdf, other]: Title: Wireless Home Automation Using Social Networking Websites

Divya Alok Gupta, Dwith Chenna, B. Aditya Vighnesh Ramakanth

Comments: 20th Annual International Conference on Advanced Computing and Communications (ADCOM) 2014

Subjects: Networking and Internet Architecture (cs.NI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3075] arXiv:2506.22494 (cross-list from cs.RO) [pdf, html, other]: Title: DriveBLIP2: Attention-Guided Explanation Generation for Complex Driving Scenarios

Shihong Ling, Yue Wan, Xiaowei Jia, Na Du

Comments: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025. 7 pages, 3 figures

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3076] arXiv:2506.22532 (cross-list from eess.IV) [pdf, other]: Title: High Resolution Isotropic 3D Cine imaging with Automated Segmentation using Concatenated 2D Real-time Imaging and Deep Learning

Mark Wrobel (1), Michele Pascale (1), Tina Yao (1), Ruaraidh Campbell (1), Elena Milano (2), Michael Quail (1 and 2), Jennifer Steeden (1), Vivek Muthurangu (1) ((1) UCL Centre for Translational Cardiovascular Imaging, University College London, (2) Great Ormond Street Hospital)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3077] arXiv:2506.22568 (cross-list from math.OC) [pdf, html, other]: Title: Maximum Dispersion, Maximum Concentration: Enhancing the Quality of MOP Solutions

Gladston Moreira, Ivan Meneghini, Elizabeth Wanner

Comments: 11 pages

Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV)
[3078] arXiv:2506.22580 (cross-list from eess.IV) [pdf, html, other]: Title: FedCLAM: Client Adaptive Momentum with Foreground Intensity Matching for Federated Medical Image Segmentation

Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni

Comments: 10 pages, 2 figures, Accepted at MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3079] arXiv:2506.22593 (cross-list from cs.RO) [pdf, html, other]: Title: Pixels-to-Graph: Real-time Integration of Building Information Models and Scene Graphs for Semantic-Geometric Human-Robot Understanding

Antonello Longo, Chanyoung Chung, Matteo Palieri, Sung-Kyun Kim, Ali Agha, Cataldo Guaragnella, Shehryar Khattak

Comments: Paper accepted to 2025 IEEE International Conference on Automation Science and Engineering (CASE)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3080] arXiv:2506.22706 (cross-list from cs.CR) [pdf, other]: Title: General Autonomous Cybersecurity Defense: Learning Robust Policies for Dynamic Topologies and Diverse Attackers

Arun Ramamurthy, Neil Dhir

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[3081] arXiv:2506.22790 (cross-list from eess.IV) [pdf, html, other]: Title: ICME 2025 Generalizable HDR and SDR Video Quality Measurement Grand Challenge

Yixu Chen, Bowen Chen, Hai Wei, Alan C. Bovik, Baojun Li, Wei Sun, Linhan Cao, Kang Fu, Dandan Zhu, Jun Jia, Menghan Hu, Xiongkuo Min, Guangtao Zhai, Dounia Hammou, Fei Yin, Rafal Mantiuk, Amritha Premkumar, Prajit T Rajendran, Vignesh V Menon

Comments: ICME 2025 Grand Challenges

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[3082] arXiv:2506.22799 (cross-list from cs.GR) [pdf, html, other]: Title: VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding

Minchao Jiang, Shunyu Jia, Jiaming Gu, Xiaoyuan Lu, Guangming Zhu, Anqi Dong, Liang Zhang

Comments: Accepted to ICCV 2025

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3083] arXiv:2506.22802 (cross-list from cs.LG) [pdf, html, other]: Title: Riemannian-Geometric Fingerprints of Generative Models

Hae Jin Song, Laurent Itti

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3084] arXiv:2506.22826 (cross-list from math.OC) [pdf, html, other]: Title: Denoising Multi-Color QR Codes and Stiefel-Valued Data by Relaxed Regularizations

Robert Beinert, Jonas Bresch

Comments: 9 pages, 2 figures, 3 algorithms

Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[3085] arXiv:2506.22882 (cross-list from eess.IV) [pdf, html, other]: Title: CA-Diff: Collaborative Anatomy Diffusion for Brain Tissue Segmentation

Qilong Xing, Zikai Song, Yuteng Ye, Yuke Chen, Youjia Zhang, Na Feng, Junqing Yu, Wei Yang

Comments: ICME 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3086] arXiv:2506.22952 (cross-list from eess.IV) [pdf, html, other]: Title: Hierarchical Characterization of Brain Dynamics via State Space-based Vector Quantization

Yanwu Yang, Thomas Wolfers

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
[3087] arXiv:2506.22973 (cross-list from cs.GR) [pdf, html, other]: Title: Confident Splatting: Confidence-Based Compression of 3D Gaussian Splatting via Learnable Beta Distributions

AmirHossein Naghi Razlighi, Elaheh Badali Golezani, Shohreh Kasaei

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3088] arXiv:2506.22992 (cross-list from cs.AI) [pdf, html, other]: Title: MARBLE: A Hard Benchmark for Multimodal Spatial Reasoning and Planning

Yulun Jiang, Yekun Chai, Maria Brbić, Michael Moor

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3089] arXiv:2506.23016 (cross-list from cs.HC) [pdf, html, other]: Title: Deep Learning in Mild Cognitive Impairment Diagnosis using Eye Movements and Image Content in Visual Memory Tasks

Tomás Silva Santos Rocha, Anastasiia Mikhailova, Moreno I. Coco, José Santos-Victor

Comments: 13 pages, 5 figures

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[3090] arXiv:2506.23041 (cross-list from cs.LG) [pdf, html, other]: Title: ReMem: Mutual Information-Aware Fine-tuning of Pretrained Vision Transformers for Effective Knowledge Distillation

Chengyu Dong, Huan Gui, Noveen Sachdeva, Long Jin, Ke Yin, Jingbo Shang, Lichan Hong, Ed H. Chi, Zhe Zhao

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3091] arXiv:2506.23046 (cross-list from cs.CL) [pdf, html, other]: Title: SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions

Xianzhe Fan, Xuhui Zhou, Chuanyang Jin, Kolby Nottingham, Hao Zhu, Maarten Sap

Comments: 23 pages, 6 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[3092] arXiv:2506.23102 (cross-list from eess.IV) [pdf, html, other]: Title: MedRegion-CT: Region-Focused Multimodal LLM for Comprehensive 3D CT Report Generation

Sunggu Kyung, Jinyoung Seo, Hyunseok Lim, Dongyeong Kim, Hyungbin Park, Jimin Sung, Jihyun Kim, Wooyoung Jo, Yoojin Nam, Namkug Kim

Comments: 14 pages, 5 figures, submitted to ICCV 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3093] arXiv:2506.23121 (cross-list from eess.IV) [pdf, html, other]: Title: CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation

Xinlei Yu, Changmiao Wang, Hui Jin, Ahmed Elazab, Gangyong Jia, Xiang Wan, Changqing Zou, Ruiquan Ge

Comments: Accepted by ACM MM 2025

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3094] arXiv:2506.23145 (cross-list from cs.LG) [pdf, html, other]: Title: Forget-MI: Machine Unlearning for Forgetting Multimodal Information in Healthcare Settings

Shahad Hardan, Darya Taratynova, Abdelmajid Essofi, Karthik Nandakumar, Mohammad Yaqub

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3095] arXiv:2506.23147 (cross-list from cs.LG) [pdf, html, other]: Title: maneuverRecognition -- A Python package for Timeseries Classification in the domain of Vehicle Telematics

Jonathan Schuster, Fabian Transchel

Comments: 6 pages, 2 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3096] arXiv:2506.23184 (cross-list from eess.IV) [pdf, html, other]: Title: Score-based Diffusion Model for Unpaired Virtual Histology Staining

Anran Liu, Xiaofei Wang, Jing Cai, Chao Li

Comments: 11 pages, 3 figures

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3097] arXiv:2506.23208 (cross-list from eess.IV) [pdf, html, other]: Title: Multi-Source COVID-19 Detection via Variance Risk Extrapolation

Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3098] arXiv:2506.23221 (cross-list from cs.LG) [pdf, html, other]: Title: Single Image Inpainting and Super-Resolution with Simultaneous Uncertainty Guarantees by Universal Reproducing Kernels

Bálint Horváth, Balázs Csanád Csáji

Comments: 23 pages, 8 figures, 6 tables

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3099] arXiv:2506.23259 (cross-list from eess.IV) [pdf, html, other]: Title: Improving Myocardial Infarction Detection via Synthetic ECG Pretraining

Lachin Naghashyar

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3100] arXiv:2506.23298 (cross-list from eess.IV) [pdf, html, other]: Title: Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification

Xing Shen, Justin Szeto, Mingyang Li, Hengguan Huang, Tal Arbel

Comments: Preprint version. The peer-reviewed version of this paper has been accepted to MICCAI 2025 main conference

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3101] arXiv:2506.23305 (cross-list from eess.IV) [pdf, html, other]: Title: BPD-Neo: An MRI Dataset for Lung-Trachea Segmentation with Clinical Data for Neonatal Bronchopulmonary Dysplasia

Rachit Saluja, Arzu Kovanlikaya, Candace Chien, Lauren Kathryn Blatt, Jeffrey M. Perlman, Stefan Worgall, Mert R. Sabuncu, Jonathan P. Dyke

Comments: Adding link to Zenodo repo for dataset

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3102] arXiv:2506.23309 (cross-list from eess.IV) [pdf, html, other]: Title: SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting

Yiming Huang, Long Bai, Beilei Cui, Kun Yuan, Guankun Wang, Mobarak I. Hoque, Nicolas Padoy, Nassir Navab, Hongliang Ren

Comments: MICCAI 2025. Project Page: this https URL

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3103] arXiv:2506.23316 (cross-list from cs.RO) [pdf, html, other]: Title: InfGen: Scenario Generation as Next Token Group Prediction

Zhenghao Peng, Yuxin Liu, Bolei Zhou

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3104] arXiv:2506.23334 (cross-list from eess.IV) [pdf, html, other]: Title: Federated Breast Cancer Detection Enhanced by Synthetic Ultrasound Image Augmentation

Hongyi Pan, Ziliang Hong, Gorkem Durak, Ziyue Xu, Ulas Bagci

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3105] arXiv:2506.23466 (cross-list from eess.IV) [pdf, other]: Title: FD-DiT: Frequency Domain-Directed Diffusion Transformer for Low-Dose CT Reconstruction

Qiqing Liu, Guoquan Wei, Zekun Zhou, Yiyang Wen, Liu Shi, Qiegen Liu

Comments: 11 pages, 11 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[3106] arXiv:2506.23471 (cross-list from cs.IR) [pdf, html, other]: Title: KiseKloset: Comprehensive System For Outfit Retrieval, Recommendation, And Try-On

Thanh-Tung Phan-Nguyen, Khoi-Nguyen Nguyen-Ngoc, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[3107] arXiv:2506.23484 (cross-list from cs.MM) [pdf, html, other]: Title: TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity

Yuzhuo Chen, Zehua Ma, Han Fang, Weiming Zhang, Nenghai Yu

Comments: Camera-ready version for ICCV 2025. Adds GitHub link; acknowledgments; appendix. Abstract and Figure 1 updated for clarity

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[3108] arXiv:2506.23490 (cross-list from eess.IV) [pdf, html, other]: Title: UltraTwin: Towards Cardiac Anatomical Twin Generation from Multi-view 2D Ultrasound

Junxuan Yu, Yaofei Duan, Yuhao Huang, Yu Wang, Rongbo Ling, Weihao Luo, Ang Zhang, Jingxian Xu, Qiongying Ni, Yongsong Zhou, Binghan Li, Haoran Dou, Liping Liu, Yanfen Chu, Feng Geng, Zhe Sheng, Zhifeng Ding, Dingxin Zhang, Rui Huang, Yuhang Zhang, Xiaowei Xu, Tao Tan, Dong Ni, Zhongshan Gou, Xin Yang

Comments: Accepted by MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3109] arXiv:2506.23492 (cross-list from cs.LG) [pdf, html, other]: Title: Sample Margin-Aware Recalibration of Temperature Scaling

Haolan Guo, Linwei Tao, Haoyang Luo, Minjing Dong, Chang Xu

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3110] arXiv:2506.23506 (cross-list from eess.IV) [pdf, other]: Title: Artificial Intelligence-assisted Pixel-level Lung (APL) Scoring for Fast and Accurate Quantification in Ultra-short Echo-time MRI

Bowen Xin, Rohan Hickey, Tamara Blake, Jin Jin, Claire E Wainwright, Thomas Benkert, Alto Stemmer, Peter Sly, David Coman, Jason Dowling

Comments: Oral presentation at ISMRM 2025

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[3111] arXiv:2506.23516 (cross-list from cs.LG) [pdf, html, other]: Title: FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization

Seung-Wook Kim, Seongyeol Kim, Jiah Kim, Seowon Ji, Se-Ho Lee

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3112] arXiv:2506.23537 (cross-list from eess.IV) [pdf, html, other]: Title: AFUNet: Cross-Iterative Alignment-Fusion Synergy for HDR Reconstruction via Deep Unfolding Paradigm

Xinyue Li, Zhangkai Ni, Wenhan Yang

Comments: Accepted to International Conference on Computer Vision (ICCV) 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3113] arXiv:2506.23563 (cross-list from cs.AI) [pdf, html, other]: Title: MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI

Huanjin Yao, Jiaxing Huang, Yawen Qiu, Michael K. Chen, Wenzheng Liu, Wei Zhang, Wenjie Zeng, Xikun Zhang, Jingyi Zhang, Yuxin Song, Wenhao Wu, Dacheng Tao

Comments: Technical report

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3114] arXiv:2506.23584 (cross-list from eess.IV) [pdf, html, other]: Title: A Clinically-Grounded Two-Stage Framework for Renal CT Report Generation

Renjie Liang, Zhengkang Fan, Jinqian Pan, Chenkun Sun, Russell Terry, Jie Xu

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3115] arXiv:2506.23664 (cross-list from eess.IV) [pdf, html, other]: Title: Diffusion Model-based Data Augmentation Method for Fetal Head Ultrasound Segmentation

Fangyijie Wang, Kevin Whelan, Félix Balado, Kathleen M. Curran, Guénolé Silvestre

Comments: Accepted at Irish Machine Vision and Image Processing Conference (IMVIP) 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3116] arXiv:2506.23700 (cross-list from eess.IV) [pdf, html, other]: Title: MedSAM-CA: A CNN-Augmented ViT with Attention-Enhanced Multi-Scale Fusion for Medical Image Segmentation

Peiting Tian, Xi Chen, Haixia Bi, Fan Li

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3117] arXiv:2506.23701 (cross-list from eess.IV) [pdf, html, other]: Title: MDPG: Multi-domain Diffusion Prior Guidance for MRI Reconstruction

Lingtong Zhang, Mengdie Song, Xiaohan Hao, Huayu Mai, Bensheng Qiu

Comments: Accepted by MICCAI 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3118] arXiv:2506.23717 (cross-list from cs.NE) [pdf, html, other]: Title: Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation

Xingting Yao, Qinghao Hu, Fei Zhou, Tielong Liu, Gang Li, Peisong Wang, Jian Cheng

Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3119] arXiv:2506.23721 (cross-list from eess.IV) [pdf, html, other]: Title: Deep Learning-Based Semantic Segmentation for Real-Time Kidney Imaging and Measurements with Augmented Reality-Assisted Ultrasound

Gijs Luijten, Roberto Maria Scardigno, Lisle Faray de Paiva, Peter Hoyer, Jens Kleesiek, Domenico Buongiorno, Vitoantonio Bevilacqua, Jan Egger

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[3120] arXiv:2506.23731 (cross-list from cs.LG) [pdf, html, other]: Title: Radioactive Watermarks in Diffusion and Autoregressive Image Generative Models

Michel Meintz, Jan Dubiński, Franziska Boenisch, Adam Dziedzic

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3121] arXiv:2506.23759 (cross-list from eess.IV) [pdf, html, other]: Title: Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos

Zheng Fang, Xiaoming Qi, Chun-Mei Feng, Jialun Pei, Weixin Si, Yueming Jin

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3122] arXiv:2506.23824 (cross-list from cs.LG) [pdf, html, other]: Title: Supercm: Revisiting Clustering for Semi-Supervised Learning

Durgesh Singh, Ahcene Boubekki, Robert Jenssen, Michael C. Kampffmeyer

Journal-ref: 10.1109/ICASSP49357.2023.10095856

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3123] arXiv:2506.23957 (cross-list from cs.GR) [pdf, html, other]: Title: GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering

Zinuo You, Stamatios Georgoulis, Anpei Chen, Siyu Tang, Dengxin Dai

Comments: SIGGRAPH 2025. Project website: this https URL. Version 2, updated discussion

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3124] arXiv:2506.24000 (cross-list from cs.LG) [pdf, html, other]: Title: The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models

Lijun Sheng, Jian Liang, Ran He, Zilei Wang, Tieniu Tan

Comments: GitHub link: this https URL

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3125] arXiv:2506.24003 (cross-list from eess.IV) [pdf, html, other]: Title: ShapeKit

Junqi Liu, Dongli He, Wenxuan Li, Ningyu Wang, Alan L. Yuille, Zongwei Zhou

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3126] arXiv:2506.24016 (cross-list from cs.CL) [pdf, html, other]: Title: EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations

Hyunjong Kim, Sangyeop Kim, Jongheon Jeong, Yeongjae Cho, Sungzoon Cho

Comments: Accepted at ACL 2025 Findings

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3127] arXiv:2506.24034 (cross-list from physics.med-ph) [pdf, html, other]: Title: Supervised Diffusion-Model-Based PET Image Reconstruction

George Webber, Alexander Hammers, Andrew P King, Andrew J Reader

Comments: 12 pages, 6 figures. Submitted to MICCAI 2025, not peer-reviewed

Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[3128] arXiv:2506.24074 (cross-list from eess.IV) [pdf, html, other]: Title: C3VDv2 -- Colonoscopy 3D video dataset with enhanced realism

Mayank V. Golhar, Lucas Sebastian Galeano Fretes, Loren Ayers, Venkata S. Akshintala, Taylor L. Bobrow, Nicholas J. Durr

Comments: 19 pages, 7 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3129] arXiv:2506.24108 (cross-list from cs.GR) [pdf, html, other]: Title: Navigating with Annealing Guidance Scale in Diffusion Space

Shai Yehezkel, Omer Dahary, Andrey Voynov, Daniel Cohen-Or

Comments: Project page: this https URL

Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3130] arXiv:2506.24124 (cross-list from cs.LG) [pdf, html, other]: Title: Teaching Time Series to See and Speak: Forecasting with Aligned Visual and Textual Perspectives

Sixun Dong, Wei Fan, Teresa Wu, Yanjie Fu

Comments: Code: this https URL

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
