Computer Vision and Pattern Recognition

Authors and titles for June 2025

Total of 3130 entries
[2901] arXiv:2506.16827 (cross-list from cs.GR) [pdf, html, other]
Title: Beyond Blur: A Fluid Perspective on Generative Diffusion Models
Grzegorz Gruszczynski, Jakub Meixner, Michal Jan Wlodarczyk, Przemyslaw Musialski
Comments: ICCV 2025 main conference, 8 pages paper, 20 pages appendix, 24 figures, supplementary pseudocode in appendix, this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2902] arXiv:2506.16890 (cross-list from cs.LG) [pdf, html, other]
Title: From Lab to Factory: Pitfalls and Guidelines for Self-/Unsupervised Defect Detection on Low-Quality Industrial Images
Sebastian Hönel, Jonas Nordqvist
Comments: 18 pages, 7 figures, 1 table. Camera-ready version for the 2025 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD '25)
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
[2903] arXiv:2506.16898 (cross-list from cs.AI) [pdf, html, other]
Title: AI's Blind Spots: Geographic Knowledge and Diversity Deficit in Generated Urban Scenario
Ciro Beneduce, Massimiliano Luca, Bruno Lepri
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[2904] arXiv:2506.16934 (cross-list from eess.IV) [pdf, other]
Title: PET Tracer Separation Using Conditional Diffusion Transformer with Multi-latent Space Learning
Bin Huang, Feihong Xu, Xinchong Shi, Shan Huang, Binxuan Li, Fei Li, Qiegen Liu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2905] arXiv:2506.17110 (cross-list from cs.RO) [pdf, html, other]
Title: Monocular One-Shot Metric-Depth Alignment for RGB-Based Robot Grasping
Teng Guo, Baichuan Huang, Jingjin Yu
Comments: Accepted to IROS 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2906] arXiv:2506.17133 (cross-list from eess.IV) [pdf, html, other]
Title: Robust Training with Data Augmentation for Medical Imaging Classification
Josué Martínez-Martínez, Olivia Brown, Mostafa Karami, Sheida Nabavi
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2907] arXiv:2506.17140 (cross-list from eess.IV) [pdf, html, other]
Title: MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification
David Jacob Drexlin, Jonas Dippel, Julius Hense, Niklas Prenißl, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2908] arXiv:2506.17165 (cross-list from eess.IV) [pdf, html, other]
Title: Proportional Sensitivity in Generative Adversarial Network (GAN)-Augmented Brain Tumor Classification Using Convolutional Neural Network
Mahin Montasir Afif, Abdullah Al Noman, K. M. Tahsin Kabir, Md. Mortuza Ahmmed, Md. Mostafizur Rahman, Mufti Mahmud, Md. Ashraful Babu
Comments: This paper has been submitted to the 18th International Conference on Brain Informatics (BI'25), Italy
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2909] arXiv:2506.17198 (cross-list from cs.RO) [pdf, html, other]
Title: Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation
Jianglong Ye, Keyi Wang, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, Xiaolong Wang
Comments: Accepted to RSS 2025. Project page: this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2910] arXiv:2506.17206 (cross-list from cs.GR) [pdf, html, other]
Title: DreamCube: 3D Panorama Generation via Multi-plane Synchronization
Yukun Huang, Yanning Zhou, Jianan Wang, Kaiyi Huang, Xihui Liu
Comments: Project page: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2911] arXiv:2506.17232 (cross-list from cs.LG) [pdf, html, other]
Title: PCaM: A Progressive Focus Attention-Based Information Fusion Method for Improving Vision Transformer Domain Adaptation
Zelin Zang, Fei Wang, Liangyu Li, Jinlin Wu, Chunshui Zhao, Zhen Lei, Baigui Sun
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2912] arXiv:2506.17307 (cross-list from cs.LG) [pdf, html, other]
Title: Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation
Zhixiang Chi, Li Gu, Huan Liu, Ziqiang Wang, Yanan Wu, Yang Wang, Konstantinos N Plataniotis
Comments: ICLR 2025, this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2913] arXiv:2506.17320 (cross-list from cs.CY) [pdf, other]
Title: MAARTA: Multi-Agentic Adaptive Radiology Teaching Assistant
Akash Awasthi, Brandon V. Chang, Anh M. Vu, Ngan Le, Rishi Agrawal, Zhigang Deng, Carol Wu, Hien Van Nguyen
Comments: Accepted to MICCAI 2025 (Main Conference)
Subjects: Computers and Society (cs.CY); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2914] arXiv:2506.17324 (cross-list from cs.LG) [pdf, other]
Title: Origins of Creativity in Attention-Based Diffusion Models
Emma Finn, T. Anderson Keller, Manos Theodosis, Demba E. Ba
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2915] arXiv:2506.17337 (cross-list from eess.IV) [pdf, html, other]
Title: Can Common VLMs Rival Medical VLMs? Evaluation and Strategic Insights
Yuan Zhong, Ruinan Jin, Xiaoxiao Li, Qi Dou
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2916] arXiv:2506.17361 (cross-list from eess.IV) [pdf, html, other]
Title: Efficient Feedback Gate Network for Hyperspectral Image Super-Resolution
Xufei Wang, Mingjian Zhang, Fei Ge, Jinchen Zhu, Wen Sha, Jifen Ren, Zhimeng Hou, Shouguo Zheng, ling Zheng, Shizhuang Weng
Comments: 20 pages, 17 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2917] arXiv:2506.17364 (cross-list from cs.CY) [pdf, html, other]
Title: AI-based Multimodal Biometrics for Detecting Smartphone Distractions: Application to Online Learning
Alvaro Becerra, Roberto Daza, Ruth Cobos, Aythami Morales, Mutlu Cukurova, Julian Fierrez
Comments: Accepted in EC-TEL25: 20th European Conference on Technology Enhanced Learning, Newcastle and Durham, UK, 15-19 September 2025
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[2918] arXiv:2506.17372 (cross-list from cs.CY) [pdf, html, other]
Title: Multimodal Political Bias Identification and Neutralization
Cedric Bernard, Xavier Pleimling, Amun Kharel, Chase Vickery
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2919] arXiv:2506.17378 (cross-list from cs.RO) [pdf, html, other]
Title: A workflow for generating synthetic LiDAR datasets in simulation environments
Abhishek Phadke, Shakib Mahmud Dipto, Pratip Rana
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2920] arXiv:2506.17412 (cross-list from eess.IV) [pdf, html, other]
Title: VMRA-MaR: An Asymmetry-Aware Temporal Framework for Longitudinal Breast Cancer Risk Prediction
Zijun Sun, Solveig Thrun, Michael Kampffmeyer
Comments: MICCAI 2025, Provisional Accept
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2921] arXiv:2506.17425 (cross-list from eess.IV) [pdf, html, other]
Title: Trans${^2}$-CBCT: A Dual-Transformer Framework for Sparse-View CBCT Reconstruction
Minmin Yang, Huantao Ren, Senem Velipasalar
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2922] arXiv:2506.17462 (cross-list from cs.RO) [pdf, other]
Title: General-Purpose Robotic Navigation via LVLM-Orchestrated Perception, Reasoning, and Acting
Bernard Lange, Anil Yildiz, Mansur Arief, Shehryar Khattak, Mykel Kochenderfer, Georgios Georgakis
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2923] arXiv:2506.17501 (cross-list from eess.IV) [pdf, html, other]
Title: DSA-NRP: No-Reflow Prediction from Angiographic Perfusion Dynamics in Stroke EVT
Shreeram Athreya, Carlos Olivares, Ameera Ismail, Kambiz Nael, William Speier, Corey Arnold
Comments: 12 pages, 4 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2924] arXiv:2506.17516 (cross-list from cs.RO) [pdf, html, other]
Title: EASE: Embodied Active Event Perception via Self-Supervised Energy Minimization
Zhou Chen, Sanjoy Kundu, Harsimran S. Baweja, Sathyanarayanan N. Aakur
Comments: Accepted to IEEE Robotics and Automation Letters, 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2925] arXiv:2506.17540 (cross-list from eess.IV) [pdf, html, other]
Title: MTSIC: Multi-stage Transformer-based GAN for Spectral Infrared Image Colorization
Tingting Liu, Yuan Liu, Jinhui Tang, Liyin Yuan, Chengyu Liu, Chunlai Li, Xiubao Sui, Qian Chen
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2926] arXiv:2506.17552 (cross-list from cs.LG) [pdf, other]
Title: DRIMV_TSK: An Interpretable Surgical Evaluation Model for Incomplete Multi-View Rectal Cancer Data
Wei Zhang, Zi Wang, Hanwen Zhou, Zhaohong Deng, Weiping Ding, Yuxi Ge, Te Zhang, Yuanpeng Zhang, Kup-Sze Choi, Shitong Wang, Shudong Hu
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2927] arXiv:2506.17623 (cross-list from cs.MM) [pdf, html, other]
Title: Can Generated Images Serve as a Viable Modality for Text-Centric Multimodal Learning?
Yuesheng Huang, Peng Zhang, Riliang Liu, Jiaqi Liang
Comments: 4 figures, 7 tables
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[2928] arXiv:2506.17636 (cross-list from cs.GR) [pdf, html, other]
Title: 3D Gaussian Splatting for Fine-Detailed Surface Reconstruction in Large-Scale Scene
Shihan Chen, Zhaojin Li, Zeyu Chen, Qingsong Yan, Gaoyang Shen, Ran Duan
Comments: IROS 2025
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[2929] arXiv:2506.17747 (cross-list from physics.geo-ph) [pdf, html, other]
Title: Pix2Geomodel: A Next-Generation Reservoir Geomodeling with Property-to-Property Translation
Abdulrahman Al-Fakih, Ardiansyah Koeshidayatullah, Nabil A. Saraih, Tapan Mukerji, Rayan Kanfar, Abdulmohsen Alali, SanLinn I. Kaka
Comments: 34 pages, 13 figures
Subjects: Geophysics (physics.geo-ph); Computational Engineering, Finance, and Science (cs.CE); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[2930] arXiv:2506.17770 (cross-list from cs.GR) [pdf, html, other]
Title: Collaborative Texture Filtering
Tomas Akenine-Möller, Pontus Ebelin, Matt Pharr, Bartlomiej Wronski
Comments: Accepted to ACM/EG Symposium on High Performance Graphics (HPG), 2025
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[2931] arXiv:2506.17872 (cross-list from cs.LG) [pdf, html, other]
Title: Decoding Federated Learning: The FedNAM+ Conformal Revolution
Sree Bhargavi Balija, Amitash Nanda, Debashis Sahoo
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2932] arXiv:2506.17874 (cross-list from stat.ML) [pdf, html, other]
Title: DRO-Augment Framework: Robustness by Synergizing Wasserstein Distributionally Robust Optimization and Data Augmentation
Jiaming Hu, Debarghya Mukherjee, Ioannis Ch. Paschalidis
Comments: 26 pages, 3 figures
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2933] arXiv:2506.17879 (cross-list from eess.IV) [pdf, html, other]
Title: StainPIDR: A Pathological Image Decoupling and Reconstruction Method for Stain Normalization Based on Color Vector Quantization and Structure Restaining
Zheng Chen
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2934] arXiv:2506.17954 (cross-list from eess.IV) [pdf, other]
Title: Mobile Image Analysis Application for Mantoux Skin Test
Liong Gele, Tan Chye Cheah
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2935] arXiv:2506.17966 (cross-list from cs.IR) [pdf, html, other]
Title: LLM-Enhanced Multimodal Fusion for Cross-Domain Sequential Recommendation
Wangyu Wu, Zhenhong Chen, Xianglin Qiu, Siqi Song, Xiaowei Huang, Fei Ma, Jimin Xiao
Comments: arXiv admin note: substantial text overlap with arXiv:2504.15085
Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[2936] arXiv:2506.17967 (cross-list from cs.LG) [pdf, html, other]
Title: Adapting Vision-Language Models for Evaluating World Models
Mariya Hendriksen, Tabish Rashid, David Bignell, Raluca Georgescu, Abdelhak Lemkhenter, Katja Hofmann, Sam Devlin, Sarah Parisot
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2937] arXiv:2506.17968 (cross-list from cs.LG) [pdf, html, other]
Title: h-calibration: Rethinking Classifier Recalibration with Probabilistic Error-Bounded Objective
Wenjian Huang, Guiping Cao, Jiahao Xia, Jingkun Chen, Hao Wang, Jianguo Zhang
Journal-ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Probability (math.PR); Machine Learning (stat.ML)
[2938] arXiv:2506.17983 (cross-list from eess.IV) [pdf, html, other]
Title: LVPNet: A Latent-variable-based Prediction-driven End-to-end Framework for Lossless Compression of Medical Images
Chenyue Song, Chen Hui, Qing Lin, Wei Zhang, Siqiao Li, Haiqi Zhu, Zhixuan Li, Shengping Zhang, Shaohui Liu, Feng Jiang, Xiang Li
Comments: Accepted to MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2939] arXiv:2506.18017 (cross-list from cs.GR) [pdf, html, other]
Title: Auto-Regressive Surface Cutting
Yang Li, Victor Cheung, Xinhai Liu, Yuguang Chen, Zhongjin Luo, Biwen Lei, Haohan Weng, Zibo Zhao, Jingwei Huang, Zhuo Chen, Chunchao Guo
Comments: Tech. report. this https URL
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2940] arXiv:2506.18069 (cross-list from cs.DL) [pdf, html, other]
Title: Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages
Klaudia Ropel, Krzysztof Kutt, Luiz do Valle Miranda, Grzegorz J. Nalepa
Comments: 10 pages, 8 figures; submitted to TPDL 2025; change in v2: updated e-mail address
Subjects: Digital Libraries (cs.DL); Computer Vision and Pattern Recognition (cs.CV)
[2941] arXiv:2506.18072 (cross-list from eess.IV) [pdf, html, other]
Title: Multimodal Medical Image Binding via Shared Text Embeddings
Yunhao Liu, Suyang Xi, Shiqi Liu, Hong Ding, Chicheng Jin, Chong Zhong, Junjun He, Catherine C. Liu, Yiqing Shen
Comments: 10 pages, 3 figures
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2942] arXiv:2506.18088 (cross-list from cs.RO) [pdf, html, other]
Title: RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
Tianxing Chen, Zanxin Chen, Baijun Chen, Zijian Cai, Yibin Liu, Zixuan Li, Qiwei Liang, Xianliang Lin, Yiheng Ge, Zhenyu Gu, Weiliang Deng, Yubin Guo, Tian Nian, Xuanbing Xie, Qiangyu Chen, Kailun Su, Tianling Xu, Guodong Liu, Mengkang Hu, Huan-ang Gao, Kaixuan Wang, Zhixuan Liang, Yusen Qin, Xiaokang Yang, Ping Luo, Yao Mu
Comments: Project Page: this https URL, Code: this https URL, Doc: this https URL
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[2943] arXiv:2506.18158 (cross-list from cs.AI) [pdf, html, other]
Title: Chain-of-Memory: Enhancing GUI Agents for Cross-Application Navigation
Xinzge Gao, Chuanrui Hu, Bin Chen, Teng Li
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2944] arXiv:2506.18162 (cross-list from cs.LG) [pdf, html, other]
Title: Pitfalls of Conformal Predictions for Medical Image Classification
Hendrik Mehrtens, Tabea Bucher, Titus J. Brinker
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2945] arXiv:2506.18172 (cross-list from eess.IV) [pdf, html, other]
Title: STACT-Time: Spatio-Temporal Cross Attention for Cine Thyroid Ultrasound Time Series Classification
Irsyad Adam, Tengyue Zhang, Shrayes Raman, Zhuyu Qiu, Brandon Taraku, Hexiang Feng, Sile Wang, Ashwath Radhachandran, Shreeram Athreya, Vedrana Ivezic, Peipei Ping, Corey Arnold, William Speier
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2946] arXiv:2506.18201 (cross-list from cs.CL) [pdf, html, other]
Title: Deciphering Emotions in Children Storybooks: A Comparative Analysis of Multimodal LLMs in Educational Applications
Bushra Asseri, Estabraq Abdelaziz, Maha Al Mogren, Tayef Alhefdhi, Areej Al-Wabil
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[2947] arXiv:2506.18251 (cross-list from cs.GR) [pdf, html, other]
Title: Morse: Dual-Sampling for Lossless Acceleration of Diffusion Models
Chao Li, Jiawei Fan, Anbang Yao
Comments: Fixed a prompt typo in Figure 18 of the Appendix. This work is accepted to ICML 2025. The project page: this https URL
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2948] arXiv:2506.18270 (cross-list from eess.IV) [pdf, other]
Title: Adaptive Mask-guided K-space Diffusion for Accelerated MRI Reconstruction
Qinrong Cai, Yu Guan, Zhibo Chen, Dong Liang, Qiuyun Fan, Qiegen Liu
Comments: 10 pages, 9 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2949] arXiv:2506.18323 (cross-list from eess.IV) [pdf, html, other]
Title: A Multi-Scale Spatial Attention-Based Zero-Shot Learning Framework for Low-Light Image Enhancement
Muhammad Azeem Aslam, Hassan Khalid, Nisar Ahmed
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2950] arXiv:2506.18335 (cross-list from eess.IV) [pdf, html, other]
Title: Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention
Saad Wazir, Daeyoung Kim
Comments: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 30861-30871
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2951] arXiv:2506.18371 (cross-list from eess.IV) [pdf, html, other]
Title: Transforming H&E images into IHC: A Variance-Penalized GAN for Precision Oncology
Sara Rehmat, Hafeez Ur Rehman
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2952] arXiv:2506.18378 (cross-list from eess.IV) [pdf, html, other]
Title: Taming Vision-Language Models for Medical Image Analysis: A Comprehensive Review
Haoneng Lin, Cheng Xu, Jing Qin
Comments: 34 pages
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2953] arXiv:2506.18407 (cross-list from cs.GR) [pdf, html, other]
Title: What You Think Is What You Get: Bridge User Intent and Transfer Function Design through Multimodal Large Language Models
Yiyao Wang, Bo Pan, Ke Wang, Han Liu, Jinyuan Mao, Yuxin Liu, Minfeng Zhu, Bo Zhang, Weifeng Chen, Xiuqi Huang, Wei Chen
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[2954] arXiv:2506.18443 (cross-list from cs.RO) [pdf, html, other]
Title: Radar and Event Camera Fusion for Agile Robot Ego-Motion Estimation
Yang Lyu, Zhenghao Zou, Yanfeng Li, Chunhui Zhao, Quan Pan
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2955] arXiv:2506.18474 (cross-list from eess.IV) [pdf, html, other]
Title: A Deep Convolutional Neural Network-Based Novel Class Balancing for Imbalance Data Segmentation
Atifa Kalsoom, M.A. Iftikhar, Amjad Ali, Zubair Shah, Shidin Balakrishnan, Hazrat Ali
Comments: This is a preprint of the paper submitted to the journal Scientific Reports
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2956] arXiv:2506.18484 (cross-list from eess.IV) [pdf, html, other]
Title: GANs vs. Diffusion Models for virtual staining with the HER2match dataset
Pascal Klöckner, José Teixeira, Diana Montezuma, Jaime S. Cardoso, Hugo M. Horlings, Sara P. Oliveira
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2957] arXiv:2506.18512 (cross-list from eess.IV) [pdf, html, other]
Title: MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis
Yuting Zhang, Kaishen Yuan, Hao Lu, Yutao Yue, Jintai Chen, Kaishun Wu
Subjects: Image and Video Processing (eess.IV); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[2958] arXiv:2506.18598 (cross-list from cs.LG) [pdf, html, other]
Title: No Training Wheels: Steering Vectors for Bias Correction at Inference Time
Aviral Gupta, Armaan Sethi, Ameesh Sethi
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[2959] arXiv:2506.18601 (cross-list from cs.GR) [pdf, html, other]
Title: BulletGen: Improving 4D Reconstruction with Bullet-Time Generation
Denys Rozumnyi, Jonathon Luiten, Numair Khan, Johannes Schönberger, Peter Kontschieder
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2960] arXiv:2506.18671 (cross-list from cs.SD) [pdf, html, other]
Title: TCDiff++: An End-to-end Trajectory-Controllable Diffusion Model for Harmonious Music-Driven Group Choreography
Yuqin Dai, Wanlu Zhu, Ronghui Li, Xiu Li, Zhenyu Zhang, Jun Li, Jian Yang
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Audio and Speech Processing (eess.AS)
[2961] arXiv:2506.18680 (cross-list from cs.GR) [pdf, html, other]
Title: DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling
Anindita Ghosh, Bing Zhou, Rishabh Dabral, Jian Wang, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek, Chuan Guo
Comments: 11 pages, 7 figures, 2 tables, accepted in ACM Siggraph 2025 conference track
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[2962] arXiv:2506.18720 (cross-list from eess.IV) [pdf, html, other]
Title: Temporal Neural Cellular Automata: Application to modeling of contrast enhancement in breast MRI
Daniel M. Lang, Richard Osuala, Veronika Spieker, Karim Lekadir, Rickmer Braren, Julia A. Schnabel
Comments: MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2963] arXiv:2506.18725 (cross-list from cs.RO) [pdf, html, other]
Title: TopoRec: Point Cloud Recognition Using Topological Data Analysis
Anirban Ghosh, Iliya Kulbaka, Ian Dahlin, Ayan Dutta
Subjects: Robotics (cs.RO); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV)
[2964] arXiv:2506.18810 (cross-list from cs.AI) [pdf, html, other]
Title: ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation
Siao Tang, Xinyin Ma, Gongfan Fang, Xinchao Wang
Comments: Codes are available at this https URL
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[2965] arXiv:2506.18842 (cross-list from cs.DB) [pdf, html, other]
Title: LIGHTHOUSE: Fast and precise distance to shoreline calculations from anywhere on earth
Patrick Beukema, Henry Herzog, Yawen Zhang, Hunter Pitelka, Favyen Bastani
Comments: 8 pages, 7 figures, 1 table, ICML 2025 ML4RS
Subjects: Databases (cs.DB); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2966] arXiv:2506.18844 (cross-list from cs.RO) [pdf, other]
Title: Reproducible Evaluation of Camera Auto-Exposure Methods in the Field: Platform, Benchmark and Lessons Learned
Olivier Gamache, Jean-Michel Fortin, Matěj Boxan, François Pomerleau, Philippe Giguère
Comments: 19 pages, 11 figures, pre-print version of the accepted paper for IEEE Transactions on Field Robotics (T-FR)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2967] arXiv:2506.18885 (cross-list from cs.RO) [pdf, html, other]
Title: GRAND-SLAM: Local Optimization for Globally Consistent Large-Scale Multi-Agent Gaussian SLAM
Annika Thomas, Aneesa Sonawalla, Alex Rose, Jonathan P. How
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2968] arXiv:2506.18919 (cross-list from cs.CL) [pdf, html, other]
Title: MemeMind: A Large-Scale Multimodal Dataset with Chain-of-Thought Reasoning for Harmful Meme Detection
Hexiang Gu, Qifan Yu, Saihui Hou, Zhiqin Fang, Huijia Wu, Zhaofeng He
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2969] arXiv:2506.19051 (cross-list from eess.IV) [pdf, html, other]
Title: NIC-RobustBench: A Comprehensive Open-Source Toolkit for Neural Image Compression and Robustness Analysis
Georgii Bychkov, Khaled Abud, Egor Kovalev, Alexander Gushchin, Dmitriy Vatolin, Anastasia Antsiferova
Comments: arXiv admin note: text overlap with arXiv:2411.11795
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[2970] arXiv:2506.19055 (cross-list from eess.IV) [pdf, html, other]
Title: Xray2Xray: World Model from Chest X-rays with Volumetric Context
Zefan Yang, Xinrui Song, Xuanang Xu, Yongyi Shi, Ge Wang, Mannudeep K. Kalra, Pingkun Yan
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2971] arXiv:2506.19106 (cross-list from eess.IV) [pdf, html, other]
Title: Staining normalization in histopathology: Method benchmarking using multicenter dataset
Umair Khan, Jouni Härkönen, Marjukka Friman, Leena Latonen, Teijo Kuopio, Pekka Ruusuvuori
Comments: 18 pages, 9 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)
[2972] arXiv:2506.19139 (cross-list from cs.GR) [pdf, html, other]
Title: SOF: Sorted Opacity Fields for Fast Unbounded Surface Reconstruction
Lukas Radl, Felix Windisch, Thomas Deixelberger, Jozef Hladky, Michael Steiner, Dieter Schmalstieg, Markus Steinberger
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[2973] arXiv:2506.19167 (cross-list from eess.IV) [pdf, other]
Title: A Deep Learning Based Method for Fast Registration of Cardiac Magnetic Resonance Images
Benjamin Graham
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2974] arXiv:2506.19222 (cross-list from eess.IV) [pdf, html, other]
Title: Deformable Medical Image Registration with Effective Anatomical Structure Representation and Divide-and-Conquer Network
Xinke Ma, Yongsheng Pan, Qingjie Zeng, Mengkang Lu, Bolysbek Murat Yerzhanuly, Bazargul Matkerim, Yong Xia
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2975] arXiv:2506.19234 (cross-list from eess.IV) [pdf, html, other]
Title: Quantitative Benchmarking of Anomaly Detection Methods in Digital Pathology
Can Cui, Xindong Zheng, Ruining Deng, Quan Liu, Tianyuan Yao, Keith T Wilson, Lori A Coburn, Bennett A Landman, Haichun Yang, Yaohong Wang, Yuankai Huo
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2976] arXiv:2506.19266 (cross-list from q-bio.NC) [pdf, other]
Title: Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans
Jiahao Huang, Ruifeng Li, Wenwen Yu, Anan Li, Xiangning Li, Mingchao Yan, Lei Xie, Qingrun Zeng, Xueyan Jia, Shuxin Wang, Ronghui Ju, Feng Chen, Qingming Luo, Hui Gong, Andrew Zalesky, Xiaoquan Yang, Yuanjing Feng, Zheng Wang
Comments: 34 pages, 6 figures
Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[2977] arXiv:2506.19297 (cross-list from eess.IV) [pdf, html, other]
Title: Explicit Residual-Based Scalable Image Coding for Humans and Machines
Yui Tatsumi, Ziyue Zeng, Hiroshi Watanabe
Comments: Accepted to IEEE 27th International Workshop on Multimedia Signal Processing (MMSP 2025)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2978] arXiv:2506.19360 (cross-list from cs.CR) [pdf, html, other]
Title: SoK: Can Synthetic Images Replace Real Data? A Survey of Utility and Privacy of Synthetic Image Generation
Yunsung Chung, Yunbei Zhang, Nassir Marrouche, Jihun Hamm
Comments: Accepted at the 34th USENIX Security Symposium (USENIX Security '25). 21 pages, plus a 6-page appendix
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[2979] arXiv:2506.19363 (cross-list from eess.IV) [pdf, html, other]
Title: Reconsidering Explicit Longitudinal Mammography Alignment for Enhanced Breast Cancer Risk Prediction
Solveig Thrun, Stine Hansen, Zijun Sun, Nele Blum, Suaiba A. Salahuddin, Kristoffer Wickstrøm, Elisabeth Wetzer, Robert Jenssen, Maik Stille, Michael Kampffmeyer
Comments: MICCAI 2025, early accepted
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2980] arXiv:2506.19387 (cross-list from eess.IV) [pdf, other]
Title: NAADA: A Noise-Aware Attention Denoising Autoencoder for Dental Panoramic Radiographs
Khuram Naveed, Bruna Neves de Freitas, Ruben Pauwels
Comments: 10 pages, 8 figures
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2981] arXiv:2506.19415 (cross-list from cs.GR) [pdf, html, other]
Title: Virtual Memory for 3D Gaussian Splatting
Jonathan Haberl, Philipp Fleck, Clemens Arth
Comments: Based on Jonathan Haberl's 2024 master's thesis; submitted to TVCG in Feb. 2025
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[2982] arXiv:2506.19455 (cross-list from eess.IV) [pdf, html, other]
Title: Angio-Diff: Learning a Self-Supervised Adversarial Diffusion Model for Angiographic Geometry Generation
Zhifeng Wang, Renjiao Yi, Xin Wen, Chenyang Zhu, Kai Xu, Kunlun He
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2983] arXiv:2506.19464 (cross-list from eess.IV) [pdf, html, other]
Title: Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks
Ankita Raj, Harsh Swaika, Deepankar Varma, Chetan Arora
Comments: Accepted to MICCAI 2024
Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[2984] arXiv:2506.19491 (cross-list from cs.ET) [pdf, html, other]
Title: Experimental Assessment of Neural 3D Reconstruction for Small UAV-based Applications
Genís Castillo Gómez-Raya, Álmos Veres-Vitályos, Filip Lemic, Pablo Royo, Mario Montagud, Sergi Fernández, Sergi Abadal, Xavier Costa-Pérez
Comments: 6 pages, 7 figures, 2 tables, accepted at IEEE International Symposium on Personal, Indoor and Mobile Radio Communications 2025
Subjects: Emerging Technologies (cs.ET); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI); Image and Video Processing (eess.IV)
[2985] arXiv:2506.19558 (cross-list from cs.LG) [pdf, html, other]
Title: ConCM: Consistency-Driven Calibration and Matching for Few-Shot Class-Incremental Learning
QinZhe Wang, Zixuan Chen, Keke Huang, Xiu Su, Chunhua Yang, Chang Xu
Comments: 9 pages, 5 figures (excluding the appendix)
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[2986] arXiv:2506.19579 (cross-list from cs.RO) [pdf, html, other]
Title: Fake or Real, Can Robots Tell? Evaluating Embodied Vision-Language Models on Real and 3D-Printed Objects
Federico Tavella, Kathryn Mearns, Angelo Cangelosi
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[2987] arXiv:2506.19590 (cross-list from eess.IV) [pdf, html, other]
Title: Learning from Anatomy: Supervised Anatomical Pretraining (SAP) for Improved Metastatic Bone Disease Segmentation in Whole-Body MRI
Joris Wuts, Jakub Ceranka, Nicolas Michoux, Frédéric Lecouvet, Jef Vandemeulebroucke
Comments: This preprint is currently under review at *Computers in Biology and Medicine* (Elsevier). This version has not been peer-reviewed
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2988] arXiv:2506.19600 (cross-list from eess.IV) [pdf, html, other]
Title: Filling of incomplete sinograms from sparse PET detector configurations using a residual U-Net
Klara Leffler, Luigi Tommaso Luppino, Samuel Kuttner, Karin Söderkvist, Jan Axelsson
Comments: 15 pages, 9 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[2989] arXiv:2506.19687 (cross-list from eess.IV) [pdf, html, other]
Title: ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation
Ahmad Mustafa, Reza Rastegar, Ghassan AlRegib
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2990] arXiv:2506.19708 (cross-list from cs.GR) [pdf, html, other]
Title: Uncovering Conceptual Blindspots in Generative Image Models Using Sparse Autoencoders
Matyas Bohacek, Thomas Fel, Maneesh Agrawala, Ekdeep Singh Lubana
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2991] arXiv:2506.19741 (cross-list from cs.LG) [pdf, html, other]
Title: Noise Consistency Training: A Native Approach for One-Step Generator in Learning Additional Controls
Yihong Luo, Shuchen Xue, Tianyang Hu, Jing Tang
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[2992] arXiv:2506.19742 (cross-list from eess.IV) [pdf, html, other]
Title: NeRF-based CBCT Reconstruction needs Normalization and Initialization
Zhuowei Xu, Han Li, Dai Sun, Zhicheng Li, Yujia Li, Qingpeng Kong, Zhiwei Cheng, Nassir Navab, S. Kevin Zhou
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[2993] arXiv:2506.19797 (cross-list from eess.IV) [pdf, html, other]
Title: Systematic Review of Pituitary Gland and Pituitary Adenoma Automatic Segmentation Techniques in Magnetic Resonance Imaging
Mubaraq Yakubu, Navodini Wijethilake, Jonathan Shapey, Andrew King, Alexander Hammers
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[2994] arXiv:2506.19807 (cross-list from cs.AI) [pdf, other]
Title: KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
Baochang Ren, Shuofei Qiao, Wenhao Yu, Huajun Chen, Ningyu Zhang
Comments: Work in progress
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
[2995] arXiv:2506.19816 (cross-list from cs.RO) [pdf, html, other]
Title: CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation
Hao Li, Shuai Yang, Yilun Chen, Yang Tian, Xiaoda Yang, Xinyi Chen, Hanqing Wang, Tai Wang, Feng Zhao, Dahua Lin, Jiangmiao Pang
Comments: 36 pages, 21 figures
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2996] arXiv:2506.19827 (cross-list from cs.RO) [pdf, html, other]
Title: Look to Locate: Vision-Based Multisensory Navigation with 3-D Digital Maps for GNSS-Challenged Environments
Ola Elmaghraby, Eslam Mounier, Paulo Ricardo Marques de Araujo, Aboelmagd Noureldin
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[2997] arXiv:2506.19847 (cross-list from cs.LG) [pdf, html, other]
Title: Orthogonal Finetuning Made Scalable
Zeju Qiu, Weiyang Liu, Adrian Weller, Bernhard Schölkopf
Comments: Technical report (17 pages, 7 figures, project page: this https URL)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[2998] arXiv:2506.19860 (cross-list from eess.SP) [pdf, html, other]
Title: A Multi-Modal Spatial Risk Framework for EV Charging Infrastructure Using Remote Sensing
Oktay Karakuş, Padraig Corcoran
Comments: 11 pages, 4 figures, 2 tables
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[2999] arXiv:2506.19935 (cross-list from cs.LG) [pdf, html, other]
Title: Any-Order GPT as Masked Diffusion Model: Decoupling Formulation and Architecture
Shuchen Xue, Tianyu Xie, Tianyang Hu, Zijin Feng, Jiacheng Sun, Kenji Kawaguchi, Zhenguo Li, Zhi-Ming Ma
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[3000] arXiv:2506.19975 (cross-list from eess.IV) [pdf, html, other]
Title: VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration
Hang Zhang, Yuxi Zhang, Jiazheng Wang, Xiang Chen, Renjiu Hu, Xin Tian, Gaolei Li, Min Liu
Comments: Accepted for publication at MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[3001] arXiv:2506.20045 (cross-list from cs.RO) [pdf, html, other]
Title: Consensus-Driven Uncertainty for Robotic Grasping based on RGB Perception
Eric C. Joyce, Qianwen Zhao, Nathaniel Burgdorfer, Long Wang, Philippos Mordohai
Comments: Accepted to IROS 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3002] arXiv:2506.20100 (cross-list from cs.LG) [pdf, html, other]
Title: MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations
Vardhan Dongre, Chi Gui, Shubham Garg, Hooshang Nayyeri, Gokhan Tur, Dilek Hakkani-Tür, Vikram S. Adve
Comments: 66 pages, 32 figures, 23 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3003] arXiv:2506.20200 (cross-list from eess.IV) [pdf, html, other]
Title: MS-IQA: A Multi-Scale Feature Fusion Network for PET/CT Image Quality Assessment
Siqiao Li, Chen Hui, Wei Zhang, Rui Liang, Chenyue Song, Feng Jiang, Haiqi Zhu, Zhixuan Li, Hong Huang, Xiang Li
Comments: Accepted to MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3004] arXiv:2506.20245 (cross-list from cs.LG) [pdf, html, other]
Title: FedBKD: Distilled Federated Learning to Embrace Generalization and Personalization on Non-IID Data
Yushan Zhao, Jinyuan He, Donglai Chen, Weijie Luo, Chong Xie, Ri Zhang, Yonghong Chen, Yan Xu
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3005] arXiv:2506.20267 (cross-list from cs.GR) [pdf, html, other]
Title: X-SiT: Inherently Interpretable Surface Vision Transformers for Dementia Diagnosis
Fabian Bongratz, Tom Nuno Wolf, Jaume Gual Ramon, Christian Wachinger
Comments: MICCAI 2025
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3006] arXiv:2506.20282 (cross-list from eess.IV) [pdf, html, other]
Title: Opportunistic Osteoporosis Diagnosis via Texture-Preserving Self-Supervision, Mixture of Experts and Multi-Task Integration
Jiaxing Huang, Heng Guo, Le Lu, Fan Yang, Minfeng Xu, Ge Yang, Wei Luo
Comments: Accepted by MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3007] arXiv:2506.20303 (cross-list from eess.IV) [pdf, other]
Title: FundaQ-8: A Clinically-Inspired Scoring Framework for Automated Fundus Image Quality Assessment
Lee Qi Zun, Oscar Wong Jin Hao, Nor Anita Binti Che Omar, Zalifa Zakiah Binti Asnir, Mohamad Sabri bin Sinal Zainal, Goh Man Fye
Subjects: Image and Video Processing (eess.IV); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3008] arXiv:2506.20305 (cross-list from cs.LG) [pdf, html, other]
Title: Learning Moderately Input-Sensitive Functions: A Case Study in QR Code Decoding
Kazuki Yoda, Kazuhiko Kawamoto, Hiroshi Kera
Comments: 17 pages, 13 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3009] arXiv:2506.20333 (cross-list from eess.IV) [pdf, html, other]
Title: EAGLE: An Efficient Global Attention Lesion Segmentation Model for Hepatic Echinococcosis
Jiayan Chen, Kai Li, Yulu Zhao, Jianqiang Huang, Zhan Wang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3010] arXiv:2506.20355 (cross-list from quant-ph) [pdf, html, other]
Title: Practical insights on the effect of different encodings, ansätze and measurements in quantum and hybrid convolutional neural networks
Jesús Lozano-Cruz, Albert Nieto-Morales, Oriol Balló-Gimbernat, Adan Garriga, Antón Rodríguez-Otero, Alejandro Borrallo-Rentero
Comments: 20 pages, 22 figures
Subjects: Quantum Physics (quant-ph); Computer Vision and Pattern Recognition (cs.CV)
[3011] arXiv:2506.20367 (cross-list from cs.GR) [pdf, html, other]
Title: DreamAnywhere: Object-Centric Panoramic 3D Scene Generation
Edoardo Alberto Dominici, Jozef Hladky, Floor Verhoeven, Lukas Radl, Thomas Deixelberger, Stefan Ainetter, Philipp Drescher, Stefan Hauswiesner, Arno Coomans, Giacomo Nazzaro, Konstantinos Vardis, Markus Steinberger
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3012] arXiv:2506.20407 (cross-list from eess.IV) [pdf, html, other]
Title: Fusing Radiomic Features with Deep Representations for Gestational Age Estimation in Fetal Ultrasound Images
Fangyijie Wang, Yuan Liang, Sourav Bhattacharjee, Abey Campbell, Kathleen M. Curran, Guénolé Silvestre
Comments: Accepted at MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3013] arXiv:2506.20430 (cross-list from cs.CL) [pdf, html, other]
Title: An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
Weike Zhao, Chaoyi Wu, Yanjie Fan, Xiaoman Zhang, Pengcheng Qiu, Yuze Sun, Xiao Zhou, Yanfeng Wang, Xin Sun, Ya Zhang, Yongguo Yu, Kun Sun, Weidi Xie
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[3014] arXiv:2506.20566 (cross-list from cs.RO) [pdf, html, other]
Title: HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction
Zhonghao Shi, Enyu Zhao, Nathaniel Dennler, Jingzhen Wang, Xinyang Xu, Kaleen Shrestha, Mengxue Fu, Daniel Seita, Maja Matarić
Comments: Accepted to the 19th International Symposium on Experimental Robotics (ISER 2025)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3015] arXiv:2506.20614 (cross-list from eess.IV) [pdf, html, other]
Title: Weighted Mean Frequencies: a handcraft Fourier feature for 4D Flow MRI segmentation
Simon Perrin, Sébastien Levilly, Huajun Sun, Harold Mouchère, Jean-Michel Serfaty
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3016] arXiv:2506.20652 (cross-list from cs.GR) [pdf, html, other]
Title: EditP23: 3D Editing via Propagation of Image Prompts to Multi-View
Roi Bar-On, Dana Cohen-Bar, Daniel Cohen-Or
Comments: Code, supplementary videos, interactive 3D visualizations, and additional results are available at this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3017] arXiv:2506.20683 (cross-list from eess.IV) [pdf, html, other]
Title: Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG
Alexander Selivanov, Philip Müller, Özgün Turgut, Nil Stolt-Ansó, Daniel Rückert
Comments: Accepted to MICCAI 2025 (Springer LNCS)
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[3018] arXiv:2506.20689 (cross-list from eess.IV) [pdf, other]
Title: U-R-VEDA: Integrating UNET, Residual Links, Edge and Dual Attention, and Vision Transformer for Accurate Semantic Segmentation of CMRs
Racheal Mukisa, Arvind K. Bansal
Comments: 15 pages, 3 figures
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3019] arXiv:2506.20703 (cross-list from cs.GR) [pdf, html, other]
Title: Generative Blocks World: Moving Things Around in Pictures
Vaibhav Vavilala, Seemandhar Jain, Rahul Vasanth, D.A. Forsyth, Anand Bhattad
Comments: 23 pages, 16 figures, 2 tables
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3020] arXiv:2506.20812 (cross-list from cs.RO) [pdf, html, other]
Title: Model-Based Real-Time Pose and Sag Estimation of Overhead Power Lines Using LiDAR for Drone Inspection
Alexandre Girard, Steven A. Parkison, Philippe Hamelin
Comments: Submitted to IEEE CASE 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3021] arXiv:2506.20816 (cross-list from cs.LG) [pdf, html, other]
Title: Universal and Efficient Detection of Adversarial Data through Nonuniform Impact on Network Layers
Furkan Mumcu, Yasin Yilmaz
Comments: arXiv admin note: substantial text overlap with arXiv:2410.17442
Journal-ref: Transactions on Machine Learning Research, June 2025
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3022] arXiv:2506.20875 (cross-list from cs.GR) [pdf, html, other]
Title: 3DGH: 3D Head Generation with Composable Hair and Face
Chengan He, Junxuan Li, Tobias Kirschstein, Artem Sevastopolsky, Shunsuke Saito, Qingyang Tan, Javier Romero, Chen Cao, Holly Rushmeier, Giljoo Nam
Comments: Accepted to SIGGRAPH 2025. Project page: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3023] arXiv:2506.20897 (cross-list from eess.IV) [pdf, html, other]
Title: Development of MR spectral analysis method robust against static magnetic field inhomogeneity
Shuki Maruyama, Hidenori Takeshima
Comments: 11 pages, 6 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3024] arXiv:2506.20946 (cross-list from cs.GR) [pdf, html, other]
Title: Consistent Zero-shot 3D Texture Synthesis Using Geometry-aware Diffusion and Temporal Video Models
Donggoo Kang, Jangyeong Kim, Dasol Jeong, Junyoung Choi, Jeonga Wi, Hyunmin Lee, Joonho Gwon, Joonki Paik
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3025] arXiv:2506.20969 (cross-list from cs.RO) [pdf, html, other]
Title: ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation
Shruti Bansal, Wenshan Wang, Yifei Liu, Parv Maheshwari
Comments: Accepted at Thermal Infrared in Robotics (TIRO) Workshop, ICRA 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3026] arXiv:2506.20990 (cross-list from cs.LG) [pdf, html, other]
Title: SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes
Yifan Yang, Zhen Zhang, Rupak Vignesh Swaminathan, Jing Liu, Nathan Susanj, Zheng Zhang
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3027] arXiv:2506.21037 (cross-list from cs.LG) [pdf, html, other]
Title: RL-Selector: Reinforcement Learning-Guided Data Selection via Redundancy Assessment
Suorong Yang, Peijia Li, Furao Shen, Jian Zhao
Comments: ICCV 2025
Journal-ref: ICCV 2025
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3028] arXiv:2506.21041 (cross-list from cs.RO) [pdf, html, other]
Title: SEAL: Vision-Language Model-Based Safe End-to-End Cooperative Autonomous Driving with Adaptive Long-Tail Modeling
Junwei You, Pei Li, Zhuoyu Jiang, Zilin Huang, Rui Gan, Haotian Shi, Bin Ran
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3029] arXiv:2506.21144 (cross-list from cs.LG) [pdf, html, other]
Title: Personalized Federated Learning via Dual-Prompt Optimization and Cross Fusion
Yuguang Zhang, Kuangpu Guo, Zhihe Lu, Yunbo Wang, Jian Liang
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3030] arXiv:2506.21171 (cross-list from eess.IV) [pdf, other]
Title: Uncover Treasures in DCT: Advancing JPEG Quality Enhancement by Exploiting Latent Correlations
Jing Yang, Qunliang Xing, Mai Xu, Minglang Qiao
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3031] arXiv:2506.21245 (cross-list from eess.IV) [pdf, html, other]
Title: GANet-Seg: Adversarial Learning for Brain Tumor Segmentation with Hybrid Generative Models
Qifei Cui, Xinyu Lu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3032] arXiv:2506.21272 (cross-list from cs.GR) [pdf, html, other]
Title: FairyGen: Storied Cartoon Video from a Single Child-Drawn Character
Jiayi Zheng, Xiaodong Cun
Comments: Project page: this https URL; Code: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[3033] arXiv:2506.21319 (cross-list from cs.HC) [pdf, html, other]
Title: SimVecVis: A Dataset for Enhancing MLLMs in Visualization Understanding
Can Liu, Chunlin Da, Xiaoxiao Long, Yuxiao Yang, Yu Zhang, Yong Wang
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[3034] arXiv:2506.21331 (cross-list from cs.DL) [pdf, html, other]
Title: Automatic Reviewers Assignment to a Research Paper Based on Allied References and Publications Weight
Tamim Al Mahmud, B M Mainul Hossain, Dilshad Ara
Comments: IEEE Conference Proceedings (5 Pages)
Journal-ref: 2018 4th International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 2018, pp. 1-5
Subjects: Digital Libraries (cs.DL); Computer Vision and Pattern Recognition (cs.CV)
[3035] arXiv:2506.21448 (cross-list from eess.AS) [pdf, html, other]
Title: ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing
Huadai Liu, Jialei Wang, Kaicheng Luo, Wen Wang, Qian Chen, Zhou Zhao, Wei Xue
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[3036] arXiv:2506.21458 (cross-list from cs.AI) [pdf, other]
Title: Spatial Mental Modeling from Limited Views
Baiqiao Yin, Qineng Wang, Pingyue Zhang, Jianshu Zhang, Kangrui Wang, Zihan Wang, Jieyu Zhang, Keshigeyan Chandrasegaran, Han Liu, Ranjay Krishna, Saining Xie, Manling Li, Jiajun Wu, Li Fei-Fei
Comments: Preprint version
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3037] arXiv:2506.21499 (cross-list from eess.IV) [pdf, html, other]
Title: Lightweight Physics-Informed Zero-Shot Ultrasound Plane Wave Denoising
Hojat Asgariandehkordi, Mostafa Sharifzadeh, Hassan Rivaz
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3038] arXiv:2506.21535 (cross-list from eess.IV) [pdf, html, other]
Title: Exploring the Design Space of 3D MLLMs for CT Report Generation
Mohammed Baharoon, Jun Ma, Congyu Fang, Augustin Toma, Bo Wang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3039] arXiv:2506.21537 (cross-list from quant-ph) [pdf, html, other]
Title: ResQ: A Novel Framework to Implement Residual Neural Networks on Analog Rydberg Atom Quantum Computers
Nicholas S. DiBrita, Jason Han, Tirthak Patel
Comments: ResQ will appear in the Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2025
Subjects: Quantum Physics (quant-ph); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET)
[3040] arXiv:2506.21586 (cross-list from cs.CL) [pdf, html, other]
Title: Can Vision Language Models Understand Mimed Actions?
Hyundong Cho, Spencer Lin, Tejas Srinivasan, Michael Saxon, Deuksin Kwon, Natali T. Chavez, Jonathan May
Comments: ACL 2025 Findings
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3041] arXiv:2506.21592 (cross-list from cs.CL) [pdf, html, other]
Title: SignBart -- New approach with the skeleton sequence for Isolated Sign language Recognition
Tinh Nguyen, Minh Khue Phan Tran
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3042] arXiv:2506.21601 (cross-list from cs.IR) [pdf, html, other]
Title: Hierarchical Patch Compression for ColPali: Efficient Multi-Vector Document Retrieval with Dynamic Pruning and Quantization
Duong Bach
Comments: 9 pages
Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[3043] arXiv:2506.21604 (cross-list from cs.IR) [pdf, html, other]
Title: Evaluating VisualRAG: Quantifying Cross-Modal Performance in Enterprise Document Understanding
Varun Mannam, Fang Wang, Xin Chen
Comments: Conference: KDD conference workshop: this https URL
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[3044] arXiv:2506.21629 (cross-list from cs.GR) [pdf, html, other]
Title: ICP-3DGS: SfM-free 3D Gaussian Splatting for Large-scale Unbounded Scenes
Chenhao Zhang, Yezhi Shen, Fengqing Zhu
Comments: 6 pages, Source code is available at this https URL. To appear at ICIP 2025
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3045] arXiv:2506.21630 (cross-list from cs.RO) [pdf, html, other]
Title: TOMD: A Trail-based Off-road Multimodal Dataset for Traversable Pathway Segmentation under Challenging Illumination Conditions
Yixin Sun, Li Li, Wenke E, Amir Atapour-Abarghouei, Toby P. Breckon
Comments: 8 pages, 9 figures, 2025 IJCNN
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3046] arXiv:2506.21635 (cross-list from cs.RO) [pdf, html, other]
Title: AeroLite-MDNet: Lightweight Multi-task Deviation Detection Network for UAV Landing
Haiping Yang, Huaxing Liu, Wei Wu, Zuohui Chen, Ning Wu
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3047] arXiv:2506.21655 (cross-list from cs.LG) [pdf, html, other]
Title: APO: Enhancing Reasoning Ability of MLLMs via Asymmetric Policy Optimization
Minjie Hong, Zirun Guo, Yan Xia, Zehan Wang, Ziang Zhang, Tao Jin, Zhou Zhao
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3048] arXiv:2506.21680 (cross-list from eess.IV) [pdf, html, other]
Title: PhotonSplat: 3D Scene Reconstruction and Colorization from SPAD Sensors
Sai Sri Teja, Sreevidya Chintalapati, Vinayak Gupta, Mukund Varma T, Haejoon Lee, Aswin Sankaranarayanan, Kaushik Mitra
Comments: Accepted at the International Conference on Computational Photography (ICCP) 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3049] arXiv:2506.21714 (cross-list from cs.LG) [pdf, html, other]
Title: ODE$_t$(ODE$_l$): Shortcutting the Time and Length in Diffusion and Flow Models for Faster Sampling
Denis Gudovskiy, Wenzhao Zheng, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer
Comments: Preprint. GitHub page: this http URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3050] arXiv:2506.21732 (cross-list from cs.RO) [pdf, html, other]
Title: Experimental investigation of pose informed reinforcement learning for skid-steered visual navigation
Ameya Salvi, Venkat Krovi
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Systems and Control (eess.SY)
[3051] arXiv:2506.21748 (cross-list from physics.optics) [pdf, html, other]
Title: Inverse Design of Diffractive Metasurfaces Using Diffusion Models
Liav Hen, Erez Yosef, Dan Raviv, Raja Giryes, Jacob Scheuer
Subjects: Optics (physics.optics); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3052] arXiv:2506.21765 (cross-list from eess.IV) [pdf, html, other]
Title: TUS-REC2024: A Challenge to Reconstruct 3D Freehand Ultrasound Without External Tracker
Qi Li, Shaheer U. Saeed, Yuliang Huang, Mingyuan Luo, Zhongnuo Yan, Jiongquan Chen, Xin Yang, Dong Ni, Nektarios Winter, Phuc Nguyen, Lucas Steinberger, Caelan Haney, Yuan Zhao, Mingjie Jiang, Bowen Ren, SiYeoul Lee, Seonho Kim, MinKyung Seo, MinWoo Kim, Yimeng Dou, Zhiwei Zhang, Yin Li, Tomy Varghese, Dean C. Barratt, Matthew J. Clarkson, Tom Vercauteren, Yipeng Hu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3053] arXiv:2506.21812 (cross-list from cs.CL) [pdf, html, other]
Title: Towards Transparent AI: A Survey on Explainable Large Language Models
Avash Palikhe, Zhenyu Yu, Zichong Wang, Wenbin Zhang
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3054] arXiv:2506.21860 (cross-list from cs.RO) [pdf, html, other]
Title: Embodied Domain Adaptation for Object Detection
Xiangyu Shi, Yanyuan Qiao, Lingqiao Liu, Feras Dayoub
Comments: Accepted by IROS 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3055] arXiv:2506.21876 (cross-list from cs.CL) [pdf, html, other]
Title: Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation
Qiyue Gao, Xinyu Pi, Kevin Liu, Junrong Chen, Ruolan Yang, Xinqi Huang, Xinyu Fang, Lu Sun, Gautham Kishore, Bo Ai, Stone Tao, Mengyang Liu, Jiaxi Yang, Chao-Jung Lai, Chuanyang Jin, Jiannan Xiang, Benhao Huang, Zeming Chen, David Danks, Hao Su, Tianmin Shu, Ziqiao Ma, Lianhui Qin, Zhiting Hu
Comments: ACL 2025 (Findings)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3056] arXiv:2506.21880 (cross-list from eess.IV) [pdf, html, other]
Title: Physical Degradation Model-Guided Interferometric Hyperspectral Reconstruction with Unfolding Transformer
Yuansheng Li, Yunhao Zou, Linwei Chen, Ying Fu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3057] arXiv:2506.21884 (cross-list from eess.IV) [pdf, html, other]
Title: UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields
Fabian Perez, Sara Rojas, Carlos Hinojosa, Hoover Rueda-Chacón, Bernard Ghanem
Comments: Paper accepted at ICCV 2025 main conference
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP)
[3058] arXiv:2506.21934 (cross-list from cs.IR) [pdf, html, other]
Title: CAL-RAG: Retrieval-Augmented Multi-Agent Generation for Content-Aware Layout Design
Najmeh Forouzandehmehr, Reza Yousefi Maragheh, Sriram Kollipara, Kai Zhao, Topojoy Biswas, Evren Korpeoglu, Kannan Achan
Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[3059] arXiv:2506.21976 (cross-list from cs.LG) [pdf, html, other]
Title: SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model
Shuhan Tan, John Lambert, Hong Jeon, Sakshum Kulshrestha, Yijing Bai, Jing Luo, Dragomir Anguelov, Mingxing Tan, Chiyu Max Jiang
Comments: Accepted to CVPR 2025
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA); Robotics (cs.RO)
[3060] arXiv:2506.21977 (cross-list from eess.IV) [pdf, other]
Title: StableCodec: Taming One-Step Diffusion for Extreme Image Compression
Tianyu Zhang, Xin Luo, Li Li, Dong Liu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3061] arXiv:2506.22012 (cross-list from eess.IV) [pdf, html, other]
Title: Noise-Inspired Diffusion Model for Generalizable Low-Dose CT Reconstruction
Qi Gao, Zhihao Chen, Dong Zeng, Junping Zhang, Jianhua Ma, Hongming Shan
Comments: Accepted for publication in Medical Image Analysis, 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3062] arXiv:2506.22041 (cross-list from eess.IV) [pdf, html, other]
Title: Towards Scalable and Robust White Matter Lesion Localization via Multimodal Deep Learning
Julia Machnio, Sebastian Nørgaard Llambias, Mads Nielsen, Mostafa Mehdipour Ghazi
Comments: 2nd Sorbonne-Heidelberg Workshop on AI in medicine: Machine Learning for multi-modal data
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3063] arXiv:2506.22116 (cross-list from cs.RO) [pdf, html, other]
Title: Evaluating Pointing Gestures for Target Selection in Human-Robot Collaboration
Noora Sassali, Roel Pieters
Comments: Accepted by the 2025 34th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). Preprint
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3064] arXiv:2506.22156 (cross-list from cs.AR) [pdf, html, other]
Title: Hardware acceleration for ultra-fast Neural Network training on FPGA for MRF map reconstruction
Mattia Ricchi, Fabrizio Alfonsi, Camilla Marella, Marco Barbieri, Alessandra Retico, Leonardo Brizi, Alessandro Gabrielli, Claudia Testa
Comments: 8 pages, 2 figures, to be published in conference proceedings of SDPS 2024: 2024 International Conference of the Society for Design and Process Science on Advances and Challenges of Applying AI/GenAI in Design and Process Science
Subjects: Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Instrumentation and Detectors (physics.ins-det)
[3065] arXiv:2506.22176 (cross-list from cs.RO) [pdf, html, other]
Title: KnotDLO: Toward Interpretable Knot Tying
Holly Dinkel, Raghavendra Navaratna, Jingyi Xiang, Brian Coltin, Trey Smith, Timothy Bretl
Comments: 4 pages, 5 figures, presented at the Workshop on 3D Visual Representations for Manipulation at the 2023 IEEE International Conference on Robotics and Automation in Yokohama, Japan. Video presentation [this https URL]. Poster [this https URL] 3DVRM Workshop [this https URL]
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3066] arXiv:2506.22222 (cross-list from eess.IV) [pdf, html, other]
Title: Advanced Deep Learning Techniques for Automated Segmentation of Type B Aortic Dissections
Hao Xu, Ruth Lim, Brian E. Chapman
Comments: 9 pages, 5 figures, 3 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3067] arXiv:2506.22226 (cross-list from eess.IV) [pdf, html, other]
Title: Cardiovascular disease classification using radiomics and geometric features from cardiac CT
Ajay Mittal, Raghav Mehta, Omar Todd, Philipp Seeböck, Georg Langs, Ben Glocker
Comments: Under Review at STACOM 2025 with MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3068] arXiv:2506.22280 (cross-list from eess.IV) [pdf, html, other]
Title: DIGS: Dynamic CBCT Reconstruction using Deformation-Informed 4D Gaussian Splatting and a Low-Rank Free-Form Deformation Model
Yuliang Huang, Imraj Singh, Thomas Joyce, Kris Thielemans, Jamie R. McClelland
Comments: Accepted by MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3069] arXiv:2506.22304 (cross-list from cs.LG) [pdf, html, other]
Title: Unfolding Generative Flows with Koopman Operators: Fast and Interpretable Sampling
Erkan Turan, Aristotelis Siozopoulos, Maks Ovsjanikov
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3070] arXiv:2506.22340 (cross-list from quant-ph) [pdf, html, other]
Title: QuKAN: A Quantum Circuit Born Machine approach to Quantum Kolmogorov Arnold Networks
Yannick Werner, Akash Malemath, Mengxi Liu, Vitor Fortes Rey, Nikolaos Palaiodimopoulos, Paul Lukowicz, Maximilian Kiefer-Emmanouilidis
Subjects: Quantum Physics (quant-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3071] arXiv:2506.22397 (cross-list from eess.IV) [pdf, other]
Title: Dehazing Light Microscopy Images with Guided Conditional Flow Matching: finding a sweet spot between fidelity and realism
Anirban Ray, Ashesh, Florian Jug
Comments: 4 figures, 10 pages + refs, 40 pages total (including supplement), 24 supplementary figures
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3072] arXiv:2506.22426 (cross-list from eess.IV) [pdf, html, other]
Title: Single-shot HDR using conventional image sensor shutter functions and optical randomization
Xiang Dai, Kyrollos Yanny, Kristina Monakhova, Nicholas Antipa
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Signal Processing (eess.SP); Optics (physics.optics)
[3073] arXiv:2506.22467 (cross-list from eess.SP) [pdf, other]
Title: SegmentAnyMuscle: A universal muscle segmentation model across different locations in MRI
Roy Colglazier, Jisoo Lee, Haoyu Dong, Hanxue Gu, Yaqian Chen, Joseph Cao, Zafer Yildiz, Zhonghao Liu, Nicholas Konz, Jichen Yang, Jikai Zhang, Yuwen Chen, Lin Li, Adrian Camarena, Maciej A. Mazurowski
Comments: 24 pages, 6 figures
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[3074] arXiv:2506.22482 (cross-list from cs.NI) [pdf, other]
Title: Wireless Home Automation Using Social Networking Websites
Divya Alok Gupta, Dwith Chenna, B. Aditya Vighnesh Ramakanth
Comments: 20th Annual International Conference on Advanced Computing and Communications (ADCOM) 2014
Subjects: Networking and Internet Architecture (cs.NI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3075] arXiv:2506.22494 (cross-list from cs.RO) [pdf, html, other]
Title: DriveBLIP2: Attention-Guided Explanation Generation for Complex Driving Scenarios
Shihong Ling, Yue Wan, Xiaowei Jia, Na Du
Comments: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025. 7 pages, 3 figures
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3076] arXiv:2506.22532 (cross-list from eess.IV) [pdf, other]
Title: High Resolution Isotropic 3D Cine imaging with Automated Segmentation using Concatenated 2D Real-time Imaging and Deep Learning
Mark Wrobel (1), Michele Pascale (1), Tina Yao (1), Ruaraidh Campbell (1), Elena Milano (2), Michael Quail (1 and 2), Jennifer Steeden (1), Vivek Muthurangu (1) ((1) UCL Centre for Translational Cardiovascular Imaging, University College London, (2) Great Ormond Street Hospital)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3077] arXiv:2506.22568 (cross-list from math.OC) [pdf, html, other]
Title: Maximum Dispersion, Maximum Concentration: Enhancing the Quality of MOP Solutions
Gladston Moreira, Ivan Meneghini, Elizabeth Wanner
Comments: 11 pages
Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV)
[3078] arXiv:2506.22580 (cross-list from eess.IV) [pdf, html, other]
Title: FedCLAM: Client Adaptive Momentum with Foreground Intensity Matching for Federated Medical Image Segmentation
Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni
Comments: 10 pages, 2 figures, Accepted at MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3079] arXiv:2506.22593 (cross-list from cs.RO) [pdf, html, other]
Title: Pixels-to-Graph: Real-time Integration of Building Information Models and Scene Graphs for Semantic-Geometric Human-Robot Understanding
Antonello Longo, Chanyoung Chung, Matteo Palieri, Sung-Kyun Kim, Ali Agha, Cataldo Guaragnella, Shehryar Khattak
Comments: Paper accepted to 2025 IEEE International Conference on Automation Science and Engineering (CASE)
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3080] arXiv:2506.22706 (cross-list from cs.CR) [pdf, other]
Title: General Autonomous Cybersecurity Defense: Learning Robust Policies for Dynamic Topologies and Diverse Attackers
Arun Ramamurthy, Neil Dhir
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[3081] arXiv:2506.22790 (cross-list from eess.IV) [pdf, html, other]
Title: ICME 2025 Generalizable HDR and SDR Video Quality Measurement Grand Challenge
Yixu Chen, Bowen Chen, Hai Wei, Alan C. Bovik, Baojun Li, Wei Sun, Linhan Cao, Kang Fu, Dandan Zhu, Jun Jia, Menghan Hu, Xiongkuo Min, Guangtao Zhai, Dounia Hammou, Fei Yin, Rafal Mantiuk, Amritha Premkumar, Prajit T Rajendran, Vignesh V Menon
Comments: ICME 2025 Grand Challenges
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[3082] arXiv:2506.22799 (cross-list from cs.GR) [pdf, html, other]
Title: VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding
Minchao Jiang, Shunyu Jia, Jiaming Gu, Xiaoyuan Lu, Guangming Zhu, Anqi Dong, Liang Zhang
Comments: Accepted to ICCV 2025
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3083] arXiv:2506.22802 (cross-list from cs.LG) [pdf, html, other]
Title: Riemannian-Geometric Fingerprints of Generative Models
Hae Jin Song, Laurent Itti
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3084] arXiv:2506.22826 (cross-list from math.OC) [pdf, html, other]
Title: Denoising Multi-Color QR Codes and Stiefel-Valued Data by Relaxed Regularizations
Robert Beinert, Jonas Bresch
Comments: 9 pages, 2 figures, 3 algorithms
Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[3085] arXiv:2506.22882 (cross-list from eess.IV) [pdf, html, other]
Title: CA-Diff: Collaborative Anatomy Diffusion for Brain Tissue Segmentation
Qilong Xing, Zikai Song, Yuteng Ye, Yuke Chen, Youjia Zhang, Na Feng, Junqing Yu, Wei Yang
Comments: ICME 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3086] arXiv:2506.22952 (cross-list from eess.IV) [pdf, html, other]
Title: Hierarchical Characterization of Brain Dynamics via State Space-based Vector Quantization
Yanwu Yang, Thomas Wolfers
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
[3087] arXiv:2506.22973 (cross-list from cs.GR) [pdf, html, other]
Title: Confident Splatting: Confidence-Based Compression of 3D Gaussian Splatting via Learnable Beta Distributions
AmirHossein Naghi Razlighi, Elaheh Badali Golezani, Shohreh Kasaei
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3088] arXiv:2506.22992 (cross-list from cs.AI) [pdf, html, other]
Title: MARBLE: A Hard Benchmark for Multimodal Spatial Reasoning and Planning
Yulun Jiang, Yekun Chai, Maria Brbić, Michael Moor
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3089] arXiv:2506.23016 (cross-list from cs.HC) [pdf, html, other]
Title: Deep Learning in Mild Cognitive Impairment Diagnosis using Eye Movements and Image Content in Visual Memory Tasks
Tomás Silva Santos Rocha, Anastasiia Mikhailova, Moreno I. Coco, José Santos-Victor
Comments: 13 pages, 5 figures
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[3090] arXiv:2506.23041 (cross-list from cs.LG) [pdf, html, other]
Title: ReMem: Mutual Information-Aware Fine-tuning of Pretrained Vision Transformers for Effective Knowledge Distillation
Chengyu Dong, Huan Gui, Noveen Sachdeva, Long Jin, Ke Yin, Jingbo Shang, Lichan Hong, Ed H. Chi, Zhe Zhao
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3091] arXiv:2506.23046 (cross-list from cs.CL) [pdf, html, other]
Title: SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
Xianzhe Fan, Xuhui Zhou, Chuanyang Jin, Kolby Nottingham, Hao Zhu, Maarten Sap
Comments: 23 pages, 6 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[3092] arXiv:2506.23102 (cross-list from eess.IV) [pdf, html, other]
Title: MedRegion-CT: Region-Focused Multimodal LLM for Comprehensive 3D CT Report Generation
Sunggu Kyung, Jinyoung Seo, Hyunseok Lim, Dongyeong Kim, Hyungbin Park, Jimin Sung, Jihyun Kim, Wooyoung Jo, Yoojin Nam, Namkug Kim
Comments: 14 pages, 5 figures, submitted to ICCV 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3093] arXiv:2506.23121 (cross-list from eess.IV) [pdf, html, other]
Title: CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation
Xinlei Yu, Changmiao Wang, Hui Jin, Ahmed Elazab, Gangyong Jia, Xiang Wan, Changqing Zou, Ruiquan Ge
Comments: Accepted by ACM MM 2025
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3094] arXiv:2506.23145 (cross-list from cs.LG) [pdf, html, other]
Title: Forget-MI: Machine Unlearning for Forgetting Multimodal Information in Healthcare Settings
Shahad Hardan, Darya Taratynova, Abdelmajid Essofi, Karthik Nandakumar, Mohammad Yaqub
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[3095] arXiv:2506.23147 (cross-list from cs.LG) [pdf, html, other]
Title: maneuverRecognition -- A Python package for Timeseries Classification in the domain of Vehicle Telematics
Jonathan Schuster, Fabian Transchel
Comments: 6 pages, 2 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3096] arXiv:2506.23184 (cross-list from eess.IV) [pdf, html, other]
Title: Score-based Diffusion Model for Unpaired Virtual Histology Staining
Anran Liu, Xiaofei Wang, Jing Cai, Chao Li
Comments: 11 pages, 3 figures
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3097] arXiv:2506.23208 (cross-list from eess.IV) [pdf, html, other]
Title: Multi-Source COVID-19 Detection via Variance Risk Extrapolation
Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3098] arXiv:2506.23221 (cross-list from cs.LG) [pdf, html, other]
Title: Single Image Inpainting and Super-Resolution with Simultaneous Uncertainty Guarantees by Universal Reproducing Kernels
Bálint Horváth, Balázs Csanád Csáji
Comments: 23 pages, 8 figures, 6 tables
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3099] arXiv:2506.23259 (cross-list from eess.IV) [pdf, html, other]
Title: Improving Myocardial Infarction Detection via Synthetic ECG Pretraining
Lachin Naghashyar
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3100] arXiv:2506.23298 (cross-list from eess.IV) [pdf, html, other]
Title: Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification
Xing Shen, Justin Szeto, Mingyang Li, Hengguan Huang, Tal Arbel
Comments: Preprint version. The peer-reviewed version of this paper has been accepted to MICCAI 2025 main conference
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3101] arXiv:2506.23305 (cross-list from eess.IV) [pdf, html, other]
Title: BPD-Neo: An MRI Dataset for Lung-Trachea Segmentation with Clinical Data for Neonatal Bronchopulmonary Dysplasia
Rachit Saluja, Arzu Kovanlikaya, Candace Chien, Lauren Kathryn Blatt, Jeffrey M. Perlman, Stefan Worgall, Mert R. Sabuncu, Jonathan P. Dyke
Comments: Adding link to Zenodo repo for dataset
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3102] arXiv:2506.23309 (cross-list from eess.IV) [pdf, html, other]
Title: SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting
Yiming Huang, Long Bai, Beilei Cui, Kun Yuan, Guankun Wang, Mobarak I. Hoque, Nicolas Padoy, Nassir Navab, Hongliang Ren
Comments: MICCAI 2025. Project Page: this https URL
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3103] arXiv:2506.23316 (cross-list from cs.RO) [pdf, html, other]
Title: InfGen: Scenario Generation as Next Token Group Prediction
Zhenghao Peng, Yuxin Liu, Bolei Zhou
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[3104] arXiv:2506.23334 (cross-list from eess.IV) [pdf, html, other]
Title: Federated Breast Cancer Detection Enhanced by Synthetic Ultrasound Image Augmentation
Hongyi Pan, Ziliang Hong, Gorkem Durak, Ziyue Xu, Ulas Bagci
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3105] arXiv:2506.23466 (cross-list from eess.IV) [pdf, other]
Title: FD-DiT: Frequency Domain-Directed Diffusion Transformer for Low-Dose CT Reconstruction
Qiqing Liu, Guoquan Wei, Zekun Zhou, Yiyang Wen, Liu Shi, Qiegen Liu
Comments: 11 pages, 11 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[3106] arXiv:2506.23471 (cross-list from cs.IR) [pdf, html, other]
Title: KiseKloset: Comprehensive System For Outfit Retrieval, Recommendation, And Try-On
Thanh-Tung Phan-Nguyen, Khoi-Nguyen Nguyen-Ngoc, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le
Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV)
[3107] arXiv:2506.23484 (cross-list from cs.MM) [pdf, html, other]
Title: TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity
Yuzhuo Chen, Zehua Ma, Han Fang, Weiming Zhang, Nenghai Yu
Comments: Camera-ready version for ICCV 2025. Adds GitHub link; acknowledgments; appendix. Abstract and Figure 1 updated for clarity
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[3108] arXiv:2506.23490 (cross-list from eess.IV) [pdf, html, other]
Title: UltraTwin: Towards Cardiac Anatomical Twin Generation from Multi-view 2D Ultrasound
Junxuan Yu, Yaofei Duan, Yuhao Huang, Yu Wang, Rongbo Ling, Weihao Luo, Ang Zhang, Jingxian Xu, Qiongying Ni, Yongsong Zhou, Binghan Li, Haoran Dou, Liping Liu, Yanfen Chu, Feng Geng, Zhe Sheng, Zhifeng Ding, Dingxin Zhang, Rui Huang, Yuhang Zhang, Xiaowei Xu, Tao Tan, Dong Ni, Zhongshan Gou, Xin Yang
Comments: Accepted by MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3109] arXiv:2506.23492 (cross-list from cs.LG) [pdf, html, other]
Title: Sample Margin-Aware Recalibration of Temperature Scaling
Haolan Guo, Linwei Tao, Haoyang Luo, Minjing Dong, Chang Xu
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3110] arXiv:2506.23506 (cross-list from eess.IV) [pdf, other]
Title: Artificial Intelligence-assisted Pixel-level Lung (APL) Scoring for Fast and Accurate Quantification in Ultra-short Echo-time MRI
Bowen Xin, Rohan Hickey, Tamara Blake, Jin Jin, Claire E Wainwright, Thomas Benkert, Alto Stemmer, Peter Sly, David Coman, Jason Dowling
Comments: Oral presentation at ISMRM 2025
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[3111] arXiv:2506.23516 (cross-list from cs.LG) [pdf, html, other]
Title: FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization
Seung-Wook Kim, Seongyeol Kim, Jiah Kim, Seowon Ji, Se-Ho Lee
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3112] arXiv:2506.23537 (cross-list from eess.IV) [pdf, html, other]
Title: AFUNet: Cross-Iterative Alignment-Fusion Synergy for HDR Reconstruction via Deep Unfolding Paradigm
Xinyue Li, Zhangkai Ni, Wenhan Yang
Comments: Accepted to International Conference on Computer Vision (ICCV) 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3113] arXiv:2506.23563 (cross-list from cs.AI) [pdf, html, other]
Title: MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI
Huanjin Yao, Jiaxing Huang, Yawen Qiu, Michael K. Chen, Wenzheng Liu, Wei Zhang, Wenjie Zeng, Xikun Zhang, Jingyi Zhang, Yuxin Song, Wenhao Wu, Dacheng Tao
Comments: Technical report
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[3114] arXiv:2506.23584 (cross-list from eess.IV) [pdf, html, other]
Title: A Clinically-Grounded Two-Stage Framework for Renal CT Report Generation
Renjie Liang, Zhengkang Fan, Jinqian Pan, Chenkun Sun, Russell Terry, Jie Xu
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3115] arXiv:2506.23664 (cross-list from eess.IV) [pdf, html, other]
Title: Diffusion Model-based Data Augmentation Method for Fetal Head Ultrasound Segmentation
Fangyijie Wang, Kevin Whelan, Félix Balado, Kathleen M. Curran, Guénolé Silvestre
Comments: Accepted at Irish Machine Vision and Image Processing Conference (IMVIP) 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3116] arXiv:2506.23700 (cross-list from eess.IV) [pdf, html, other]
Title: MedSAM-CA: A CNN-Augmented ViT with Attention-Enhanced Multi-Scale Fusion for Medical Image Segmentation
Peiting Tian, Xi Chen, Haixia Bi, Fan Li
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3117] arXiv:2506.23701 (cross-list from eess.IV) [pdf, html, other]
Title: MDPG: Multi-domain Diffusion Prior Guidance for MRI Reconstruction
Lingtong Zhang, Mengdie Song, Xiaohan Hao, Huayu Mai, Bensheng Qiu
Comments: Accepted by MICCAI 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3118] arXiv:2506.23717 (cross-list from cs.NE) [pdf, html, other]
Title: Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation
Xingting Yao, Qinghao Hu, Fei Zhou, Tielong Liu, Gang Li, Peisong Wang, Jian Cheng
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3119] arXiv:2506.23721 (cross-list from eess.IV) [pdf, html, other]
Title: Deep Learning-Based Semantic Segmentation for Real-Time Kidney Imaging and Measurements with Augmented Reality-Assisted Ultrasound
Gijs Luijten, Roberto Maria Scardigno, Lisle Faray de Paiva, Peter Hoyer, Jens Kleesiek, Domenico Buongiorno, Vitoantonio Bevilacqua, Jan Egger
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[3120] arXiv:2506.23731 (cross-list from cs.LG) [pdf, html, other]
Title: Radioactive Watermarks in Diffusion and Autoregressive Image Generative Models
Michel Meintz, Jan Dubiński, Franziska Boenisch, Adam Dziedzic
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3121] arXiv:2506.23759 (cross-list from eess.IV) [pdf, html, other]
Title: Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos
Zheng Fang, Xiaoming Qi, Chun-Mei Feng, Jialun Pei, Weixin Si, Yueming Jin
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3122] arXiv:2506.23824 (cross-list from cs.LG) [pdf, html, other]
Title: Supercm: Revisiting Clustering for Semi-Supervised Learning
Durgesh Singh, Ahcene Boubekki, Robert Jenssen, Michael C. Kampffmeyer
Journal-ref: 10.1109/ICASSP49357.2023.10095856
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3123] arXiv:2506.23957 (cross-list from cs.GR) [pdf, html, other]
Title: GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering
Zinuo You, Stamatios Georgoulis, Anpei Chen, Siyu Tang, Dengxin Dai
Comments: SIGGRAPH 2025. Project website: this https URL. Version 2, updated discussion
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[3124] arXiv:2506.24000 (cross-list from cs.LG) [pdf, html, other]
Title: The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models
Lijun Sheng, Jian Liang, Ran He, Zilei Wang, Tieniu Tan
Comments: Github link: this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[3125] arXiv:2506.24003 (cross-list from eess.IV) [pdf, html, other]
Title: ShapeKit
Junqi Liu, Dongli He, Wenxuan Li, Ningyu Wang, Alan L. Yuille, Zongwei Zhou
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3126] arXiv:2506.24016 (cross-list from cs.CL) [pdf, html, other]
Title: EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations
Hyunjong Kim, Sangyeop Kim, Jongheon Jeong, Yeongjae Cho, Sungzoon Cho
Comments: Accepted at ACL 2025 Findings
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[3127] arXiv:2506.24034 (cross-list from physics.med-ph) [pdf, html, other]
Title: Supervised Diffusion-Model-Based PET Image Reconstruction
George Webber, Alexander Hammers, Andrew P King, Andrew J Reader
Comments: 12 pages, 6 figures. Submitted to MICCAI 2025, not peer-reviewed
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[3128] arXiv:2506.24074 (cross-list from eess.IV) [pdf, html, other]
Title: C3VDv2 -- Colonoscopy 3D video dataset with enhanced realism
Mayank V. Golhar, Lucas Sebastian Galeano Fretes, Loren Ayers, Venkata S. Akshintala, Taylor L. Bobrow, Nicholas J. Durr
Comments: 19 pages, 7 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[3129] arXiv:2506.24108 (cross-list from cs.GR) [pdf, html, other]
Title: Navigating with Annealing Guidance Scale in Diffusion Space
Shai Yehezkel, Omer Dahary, Andrey Voynov, Daniel Cohen-Or
Comments: Project page: this https URL
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3130] arXiv:2506.24124 (cross-list from cs.LG) [pdf, html, other]
Title: Teaching Time Series to See and Speak: Forecasting with Aligned Visual and Textual Perspectives
Sixun Dong, Wei Fan, Teresa Wu, Yanjie Fu
Comments: Code: this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)