Showing 1–22 of 22 results for author: Bian, H

Searching in archive cs.
  1. arXiv:2505.24173  [pdf, ps, other]

    cs.CV

    DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?

    Authors: Tianhong Zhou, Yin Xu, Yingtao Zhu, Chuxi Xiao, Haiyang Bian, Lei Wei, Xuegong Zhang

    Abstract: Vision-language models (VLMs) exhibit strong zero-shot generalization on natural images and show early promise in interpretable medical image analysis. However, existing benchmarks do not systematically evaluate whether these models truly reason like human clinicians or merely imitate superficial patterns. To address this gap, we propose DrVD-Bench, the first multimodal benchmark for clinical visu…

    Submitted 29 May, 2025; originally announced May 2025.

  2. arXiv:2504.12711  [pdf, other]

    cs.CV cs.AI eess.IV

    NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

    Authors: Xin Li, Yeying Jin, Xin Jin, Zongwei Wu, Bingchen Li, Yufei Wang, Wenhan Yang, Yu Li, Zhibo Chen, Bihan Wen, Robby T. Tan, Radu Timofte, Qiyu Rong, Hongyuan Jing, Mengmeng Zhang, Jinglong Li, Xiangyu Lu, Yi Ren, Yuting Liu, Meng Zhang, Xiang Chen, Qiyuan Guan, Jiangxin Dong, Jinshan Pan, Conglin Gou, et al. (112 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images. This challenge received a wide range of impressive solutions, which are developed and evaluated using our collected real-world Raindrop Clarity dataset. Unlike existing deraining datasets, our Raindrop Clarity dataset is more diverse and challenging in degradation types and contents, which includ…

    Submitted 19 April, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

    Comments: Challenge Report of CVPR NTIRE 2025; 26 pages; Methods from 32 teams

  3. arXiv:2503.15015  [pdf, other]

    cs.CR

    OFL: Opportunistic Federated Learning for Resource-Heterogeneous and Privacy-Aware Devices

    Authors: Yunlong Mao, Mingyang Niu, Ziqin Dang, Chengxi Li, Hanning Xia, Yuejuan Zhu, Haoyu Bian, Yuan Zhang, Jingyu Hua, Sheng Zhong

    Abstract: Efficient and secure federated learning (FL) is a critical challenge for resource-limited devices, especially mobile devices. Existing secure FL solutions commonly incur significant overhead, leading to a contradiction between efficiency and security. As a result, these two concerns are typically addressed separately. This paper proposes Opportunistic Federated Learning (OFL), a novel FL framework…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 14 pages, 13 figures

  4. arXiv:2410.19279  [pdf, other]

    eess.SP cs.AI

    UbiHR: Resource-efficient Long-range Heart Rate Sensing on Ubiquitous Devices

    Authors: Haoyu Bian, Bin Guo, Sicong Liu, Yasan Ding, Shanshan Gao, Zhiwen Yu

    Abstract: Ubiquitous on-device heart rate sensing is vital for high-stress individuals and chronic patients. Non-contact sensing, compared to contact-based tools, allows for natural user monitoring, potentially enabling more accurate and holistic data collection. However, in open and uncontrolled mobile environments, user movement and lighting introduce. Existing methods, such as curve-based or short-range…

    Submitted 24 October, 2024; originally announced October 2024.

  5. arXiv:2410.18084  [pdf, other]

    cs.CV cs.RO

    DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes

    Authors: Hengwei Bian, Lingdong Kong, Haozhe Xie, Liang Pan, Yu Qiao, Ziwei Liu

    Abstract: Urban scene generation has been developing rapidly recently. However, existing methods primarily focus on generating static and single-frame scenes, overlooking the inherently dynamic nature of real-world driving environments. In this work, we introduce DynamicCity, a novel 4D occupancy generation framework capable of generating large-scale, high-quality dynamic 4D scenes with semantics. DynamicCi…

    Submitted 2 March, 2025; v1 submitted 23 October, 2024; originally announced October 2024.

    Comments: ICLR 2025 Spotlight; 35 pages, 18 figures, 15 tables; Project Page at https://dynamic-city.github.io/

  6. arXiv:2409.01388  [pdf, other]

    cs.DB

    Serverless Query Processing with Flexible Performance SLAs and Prices

    Authors: Haoqiong Bian, Dongyang Geng, Yunpeng Chai, Anastasia Ailamaki

    Abstract: Serverless query processing has become increasingly popular due to its auto-scaling, high elasticity, and pay-as-you-go pricing. It allows cloud data warehouse (or lakehouse) users to focus on data analysis without the burden of managing systems and resources. Accordingly, in serverless query services, users become more concerned about cost-efficiency under acceptable performance than performance…

    Submitted 23 December, 2024; v1 submitted 2 September, 2024; originally announced September 2024.

    Comments: 9 pages, 7 figures

  7. arXiv:2408.00624  [pdf, other]

    eess.AS cs.CL cs.CV

    SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data

    Authors: Yichen Lu, Jiaqi Song, Xuankai Chang, Hengwei Bian, Soumi Maiti, Shinji Watanabe

    Abstract: In this work, we present SynesLM, a unified model which can perform three multimodal language understanding tasks: audio-visual automatic speech recognition (AV-ASR) and visual-aided speech/machine translation (VST/VMT). Unlike previous research that focused on lip motion as visual cues for speech signals, our work explores more general visual information within entire frames, such as objects and a…

    Submitted 1 August, 2024; originally announced August 2024.

  8. arXiv:2407.06833  [pdf, other]

    q-bio.QM cs.CV eess.IV

    Training-free CryoET Tomogram Segmentation

    Authors: Yizhou Zhao, Hengwei Bian, Michael Mu, Mostofa R. Uddin, Zhenyang Li, Xiang Li, Tianyang Wang, Min Xu

    Abstract: Cryogenic Electron Tomography (CryoET) is a useful imaging technology in structural biology that is hindered by its need for manual annotations, especially in particle picking. Recent works have endeavored to remedy this issue with few-shot learning or contrastive learning techniques. However, supervised training is still inevitable for them. We instead choose to leverage the power of existing 2D…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution will be published in MICCAI 2024

  9. arXiv:2405.19784  [pdf]

    cs.DB cs.AI cs.DC cs.HC cs.LG

    PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices

    Authors: Haoqiong Bian, Dongyang Geng, Haoyang Li, Yunpeng Chai, Anastasia Ailamaki

    Abstract: Serverless query processing has become increasingly popular due to its advantages, including automated resource management, high elasticity, and pay-as-you-go pricing. For users who are not system experts, serverless query processing greatly reduces the cost of owning a data analytic system. However, it is still a significant challenge for non-expert users to transform their complex and evolving d…

    Submitted 23 December, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 4 pages, 4 figures

  10. DAWN: Domain-Adaptive Weakly Supervised Nuclei Segmentation via Cross-Task Interactions

    Authors: Ye Zhang, Yifeng Wang, Zijie Fang, Hao Bian, Linghan Cai, Ziyue Wang, Yongbing Zhang

    Abstract: Weakly supervised segmentation methods have gained significant attention due to their ability to reduce the reliance on costly pixel-level annotations during model training. However, the current weakly supervised nuclei segmentation approaches typically follow a two-stage pseudo-label generation and network training process. The performance of the nuclei segmentation heavily relies on the quality…

    Submitted 3 February, 2025; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: 15 pages, 11 figures, 12 tables

  11. arXiv:2404.14248  [pdf, other]

    cs.CV

    NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results

    Authors: Xiaoning Liu, Zongwei Wu, Ao Li, Florin-Alexandru Vasluianu, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Zhi Jin, Hongjun Wu, Chenxi Wang, Haitao Ling, Yuanhao Cai, Hao Bian, Yuxin Zheng, Jing Lin, Alan Yuille, Ben Shao, Jin Guo, Tianli Liu, Mohao Wu, Yixu Feng, Shuo Hou, Haotian Lin, et al. (87 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlig…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: NTIRE 2024 Challenge Report

  12. arXiv:2402.04756  [pdf, other]

    cs.CV

    Boundary-aware Contrastive Learning for Semi-supervised Nuclei Instance Segmentation

    Authors: Ye Zhang, Ziyue Wang, Yifeng Wang, Hao Bian, Linghan Cai, Hengrui Li, Lingbo Zhang, Yongbing Zhang

    Abstract: Semi-supervised segmentation methods have demonstrated promising results in natural scenarios, providing a solution to reduce dependency on manual annotation. However, these methods face significant challenges when directly applied to pathological images due to the subtle color differences between nuclei and tissues, as well as the significant morphological variations among nuclei. Consequently, t…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 12 pages, 3 figures, 6 tables

  13. arXiv:2311.16464  [pdf, other]

    cs.CV cs.AI

    Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

    Authors: Yicheng Xiao, Zhuoyan Luo, Yong Liu, Yue Ma, Hengwei Bian, Yatai Ji, Yujiu Yang, Xiu Li

    Abstract: Video Moment Retrieval (MR) and Highlight Detection (HD) have attracted significant attention due to the growing demand for video analysis. Recent approaches treat MR and HD as similar video grounding problems and address them together with transformer-based architecture. However, we observe that the emphasis of MR and HD differs, with one necessitating the perception of local relationships and th…

    Submitted 27 November, 2023; originally announced November 2023.

  14. arXiv:2309.10519  [pdf, other]

    cs.CV

    Spatial-Assistant Encoder-Decoder Network for Real Time Semantic Segmentation

    Authors: Yalun Wang, Shidong Chen, Huicong Bian, Weixiao Li, Qin Lu

    Abstract: Semantic segmentation is an essential technology for self-driving cars to comprehend their surroundings. Currently, real-time semantic segmentation networks commonly employ either encoder-decoder architecture or two-pathway architecture. Generally speaking, encoder-decoder models tend to be quicker, whereas two-pathway models exhibit higher accuracy. To leverage both strengths, we present the Spati…

    Submitted 19 September, 2023; originally announced September 2023.

  15. arXiv:2306.17373  [pdf, other]

    cs.CV cs.AI

    HVTSurv: Hierarchical Vision Transformer for Patient-Level Survival Prediction from Whole Slide Image

    Authors: Zhuchen Shao, Yang Chen, Hao Bian, Jian Zhang, Guojun Liu, Yongbing Zhang

    Abstract: Survival prediction based on whole slide images (WSIs) is a challenging task for patient-level multiple instance learning (MIL). Due to the vast amount of data for a patient (one or multiple gigapixel WSIs) and the irregularly shaped property of WSI, it is difficult to fully explore spatial, contextual, and hierarchical interaction in the patient-level bag. Many studies adopt random sampling pre-…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: accepted by AAAI 2023

  16. arXiv:2303.06705  [pdf, other]

    cs.CV

    Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement

    Authors: Yuanhao Cai, Hao Bian, Jing Lin, Haoqian Wang, Radu Timofte, Yulun Zhang

    Abstract: When enhancing low-light images, many deep learning algorithms are based on the Retinex theory. However, the Retinex model does not consider the corruptions hidden in the dark or introduced by the light-up process. Besides, these methods usually require a tedious multi-stage training pipeline and rely on convolutional neural networks, showing limitations in capturing long-range dependencies. In th…

    Submitted 26 October, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: ICCV 2023; The first Transformer-based method for low-light image enhancement

  17. arXiv:2301.03251  [pdf, other]

    quant-ph cs.LG

    VQNet 2.0: A New Generation Machine Learning Framework that Unifies Classical and Quantum

    Authors: Huanyu Bian, Zhilong Jia, Menghan Dou, Yuan Fang, Lei Li, Yiming Zhao, Hanchao Wang, Zhaohui Zhou, Wei Wang, Wenyu Zhu, Ye Li, Yang Yang, Weiming Zhang, Nenghai Yu, Zhaoyun Chen, Guoping Guo

    Abstract: With the rapid development of classical and quantum machine learning, a large number of machine learning frameworks have been proposed. However, existing machine learning frameworks usually only focus on classical or quantum, rather than both. Therefore, based on VQNet 1.0, we further propose VQNet 2.0, a new generation of unified classical and quantum machine learning framework that supports hybr…

    Submitted 9 January, 2023; originally announced January 2023.

  18. arXiv:2212.08568  [pdf, other]

    cs.CV cs.LG

    Biomedical image analysis competitions: The state of current participation practice

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, et al. (331 additional authors not shown)

    Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis,…

    Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  19. arXiv:2208.12657  [pdf, other]

    cs.CV

    Multi tasks RetinaNet for mitosis detection

    Authors: Chen Yang, Wang Ziyue, Fang Zijie, Bian Hao, Zhang Yongbing

    Abstract: The count of mitotic cells is a key feature in tumor diagnosis. However, due to the variability of mitotic cell morphology, it is a highly challenging task to detect mitotic cells in tumor tissues. At the same time, although advanced deep learning methods have achieved great success in cell detection, the performance is often unsatisfactory when tested on data from another domain (i.e. the different…

    Submitted 26 August, 2022; originally announced August 2022.

  20. arXiv:2206.12798  [pdf, other]

    cs.CV

    Multiple Instance Learning with Mixed Supervision in Gleason Grading

    Authors: Hao Bian, Zhuchen Shao, Yang Chen, Yifeng Wang, Haoqian Wang, Jian Zhang, Yongbing Zhang

    Abstract: With the development of computational pathology, deep learning methods for Gleason grading through whole slide images (WSIs) have excellent prospects. Since the size of WSIs is extremely large, the image label usually contains only a slide-level label or limited pixel-level labels. The current mainstream approach adopts multi-instance learning to predict Gleason grades. However, some methods only co…

    Submitted 26 June, 2022; originally announced June 2022.

    Comments: Accepted by MICCAI 2022

  21. arXiv:2106.00908  [pdf, other]

    cs.CV

    TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

    Authors: Zhuchen Shao, Hao Bian, Yang Chen, Yifeng Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang

    Abstract: Multiple instance learning (MIL) is a powerful tool to solve the weakly supervised classification in whole slide image (WSI) based pathology diagnosis. However, the current MIL methods are usually based on the independent and identically distributed hypothesis, and thus neglect the correlation among different instances. To address this problem, we proposed a new framework, called correlated MIL, and provid…

    Submitted 31 October, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

  22. arXiv:2009.08829  [pdf, other]

    eess.IV cs.CV

    Residual Spatial Attention Network for Retinal Vessel Segmentation

    Authors: Changlu Guo, Márton Szemenyei, Yugen Yi, Wei Zhou, Haodong Bian

    Abstract: Reliable segmentation of retinal vessels can be employed as a way of monitoring and diagnosing certain diseases, such as diabetes and hypertension, as they affect the retinal vascular structure. In this work, we propose the Residual Spatial Attention Network (RSAN) for retinal vessel segmentation. RSAN employs a modified residual block structure that integrates DropBlock, which can not only be uti…

    Submitted 18 September, 2020; originally announced September 2020.

    Comments: ICONIP 2020