-
Generalizable Cervical Cancer Screening via Large-scale Pretraining and Test-Time Adaptation
Authors:
Hao Jiang,
Cheng Jin,
Huangjing Lin,
Yanning Zhou,
Xi Wang,
Jiabo Ma,
Li Ding,
Jun Hou,
Runsheng Liu,
Zhizhong Chai,
Luyang Luo,
Huijuan Shi,
Yinling Qian,
Qiong Wang,
Changzhong Li,
Anjia Han,
Ronald Cheong Kin Chan,
Hao Chen
Abstract:
Cervical cancer is a leading malignancy in female reproductive system. While AI-assisted cytology offers a cost-effective and non-invasive screening solution, current systems struggle with generalizability in complex clinical scenarios. To address this issue, we introduced Smart-CCS, a generalizable Cervical Cancer Screening paradigm based on pretraining and adaptation to create robust and general…
▽ More
Cervical cancer is a leading malignancy in female reproductive system. While AI-assisted cytology offers a cost-effective and non-invasive screening solution, current systems struggle with generalizability in complex clinical scenarios. To address this issue, we introduced Smart-CCS, a generalizable Cervical Cancer Screening paradigm based on pretraining and adaptation to create robust and generalizable screening systems. To develop and validate Smart-CCS, we first curated a large-scale, multi-center dataset named CCS-127K, which comprises a total of 127,471 cervical cytology whole-slide images collected from 48 medical centers. By leveraging large-scale self-supervised pretraining, our CCS models are equipped with strong generalization capability, potentially generalizing across diverse scenarios. Then, we incorporated test-time adaptation to specifically optimize the trained CCS model for complex clinical settings, which adapts and refines predictions, improving real-world applicability. We conducted large-scale system evaluation among various cohorts. In retrospective cohorts, Smart-CCS achieved an overall area under the curve (AUC) value of 0.965 and sensitivity of 0.913 for cancer screening on 11 internal test datasets. In external testing, system performance maintained high at 0.950 AUC across 6 independent test datasets. In prospective cohorts, our Smart-CCS achieved AUCs of 0.947, 0.924, and 0.986 in three prospective centers, respectively. Moreover, the system demonstrated superior sensitivity in diagnosing cervical cancer, confirming the accuracy of our cancer screening results by using histology findings for validation. Interpretability analysis with cell and slide predictions further indicated that the system's decision-making aligns with clinical practice. Smart-CCS represents a significant advancement in cancer screening across diverse clinical contexts.
△ Less
Submitted 12 February, 2025;
originally announced February 2025.
-
Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation
Authors:
Jiabo Ma,
Zhengrui Guo,
Fengtao Zhou,
Yihui Wang,
Yingxue Xu,
Jinbang Li,
Fang Yan,
Yu Cai,
Zhengjie Zhu,
Cheng Jin,
Yi Lin,
Xinrui Jiang,
Chenglong Zhao,
Danyi Li,
Anjia Han,
Zhenhui Li,
Ronald Cheong Kin Chan,
Jiguang Wang,
Peng Fei,
Kwang-Ting Cheng,
Shaoting Zhang,
Li Liang,
Hao Chen
Abstract:
Foundation models pretrained on large-scale datasets are revolutionizing the field of computational pathology (CPath). The generalization ability of foundation models is crucial for the success in various downstream clinical tasks. However, current foundation models have only been evaluated on a limited type and number of tasks, leaving their generalization ability and overall performance unclear.…
▽ More
Foundation models pretrained on large-scale datasets are revolutionizing the field of computational pathology (CPath). The generalization ability of foundation models is crucial for the success in various downstream clinical tasks. However, current foundation models have only been evaluated on a limited type and number of tasks, leaving their generalization ability and overall performance unclear. To address this gap, we established a most comprehensive benchmark to evaluate the performance of off-the-shelf foundation models across six distinct clinical task types, encompassing a total of 72 specific tasks, including slide-level classification, survival prediction, ROI-tissue classification, ROI retrieval, visual question answering, and report generation. Our findings reveal that existing foundation models excel at certain task types but struggle to effectively handle the full breadth of clinical tasks. To improve the generalization of pathology foundation models, we propose a unified knowledge distillation framework consisting of both expert and self-knowledge distillation, where the former allows the model to learn from the knowledge of multiple expert models, while the latter leverages self-distillation to enable image representation learning via local-global alignment. Based on this framework, we curated a dataset of 96,000 whole slide images (WSIs) and developed a Generalizable Pathology Foundation Model (GPFM). This advanced model was trained on a substantial dataset comprising 190 million images extracted from approximately 72,000 publicly available slides, encompassing 34 major tissue types. Evaluated on the established benchmark, GPFM achieves an impressive average rank of 1.6, with 42 tasks ranked 1st, while the second-best model, UNI, attains an average rank of 3.7, with only 6 tasks ranked 1st.
△ Less
Submitted 14 April, 2025; v1 submitted 25 July, 2024;
originally announced July 2024.
-
Deep Learning for Computational Cytology: A Survey
Authors:
Hao Jiang,
Yanning Zhou,
Yi Lin,
Ronald CK Chan,
Jiang Liu,
Hao Chen
Abstract:
Computational cytology is a critical, rapid-developing, yet challenging topic in the field of medical image computing which analyzes the digitized cytology image by computer-aided technologies for cancer screening. Recently, an increasing number of deep learning (DL) algorithms have made significant progress in medical image analysis, leading to the boosting publications of cytological studies. To…
▽ More
Computational cytology is a critical, rapid-developing, yet challenging topic in the field of medical image computing which analyzes the digitized cytology image by computer-aided technologies for cancer screening. Recently, an increasing number of deep learning (DL) algorithms have made significant progress in medical image analysis, leading to the boosting publications of cytological studies. To investigate the advanced methods and comprehensive applications, we survey more than 120 publications of DL-based cytology image analysis in this article. We first introduce various deep learning methods, including fully supervised, weakly supervised, unsupervised, and transfer learning. Then, we systematically summarize the public datasets, evaluation metrics, versatile cytology image analysis applications including classification, detection, segmentation, and other related tasks. Finally, we discuss current challenges and potential research directions of computational cytology.
△ Less
Submitted 16 February, 2022; v1 submitted 10 February, 2022;
originally announced February 2022.