Automated interpretation of congenital heart disease from multi-view echocardiograms

Wang, Jing; Liu, Xiaofeng; Wang, Fangyun; Zheng, Lin; Gao, Fengqiao; Zhang, Hanwen; Zhang, Xin; Xie, Wanqing; Wang, Binbin

doi:10.1016/j.media.2020.101942

arXiv:2311.18788 (eess)

[Submitted on 30 Nov 2023]

Abstract: Congenital heart disease (CHD) is the most common birth defect and the leading cause of neonatal death in China. Clinical diagnosis can be based on selected 2D key-frames from five views. Because multi-view data are scarce, most methods rely on single-view analysis, which is insufficient. This study proposes a practical end-to-end framework for automatically analyzing multi-view echocardiograms. We collect five-view echocardiogram video records of 1308 subjects (including normal controls, ventricular septal defect (VSD) patients, and atrial septal defect (ASD) patients) with both disease labels and standard-view key-frame labels. Depthwise separable convolution-based multi-channel networks are adopted to substantially reduce the number of network parameters. We also address the class imbalance problem by augmenting the positive training samples. Our 2D key-frame model distinguishes CHD from negative samples with an accuracy of 95.4%, and classifies samples as negative, VSD, or ASD with an accuracy of 92.3%. To further reduce the work of key-frame selection in real-world deployment, we propose an adaptive soft attention scheme that operates directly on the raw video data. Four kinds of neural aggregation methods are systematically investigated to fuse the information from an arbitrary number of frames in a video. Moreover, with a view detection module, the system can work without view records. Our video-based model achieves an accuracy of 93.9% (binary classification) and 92.1% (3-class classification) on a collected 2D video testing set, without requiring key-frame selection or view annotation at test time. A detailed ablation study and an interpretability analysis are provided.
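Two ideas named in the abstract can be illustrated with a small sketch: why depthwise separable convolutions reduce the parameter count, and how softmax-weighted (soft attention) pooling can fuse an arbitrary number of frame features into one vector. This is not the authors' architecture or any of their four aggregation methods. The function names, layer sizes, and the fixed `query` vector are assumptions made only for illustration.

```python
import math
from typing import List, Sequence


def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def soft_attention_pool(
    frames: Sequence[Sequence[float]], query: Sequence[float]
) -> List[float]:
    """Fuse any number of frame feature vectors into a single vector.

    Each frame is scored by its dot product with `query` (a hypothetical
    stand-in for a learned attention vector), the scores are normalised
    with a softmax, and the frames are averaged using those weights.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in frames]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


# Illustrative layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)

# Two toy 2-D frame features; a zero query gives uniform attention weights,
# so the pooled vector is the plain mean of the frames.
pooled = soft_attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

Because the weights always sum to one, the pooled vector has the same dimension whether a video contributes two frames or two hundred, which is the property needed to handle videos of varying length.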

Comments:	Published in Medical Image Analysis
Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Medical Physics (physics.med-ph)
Cite as:	arXiv:2311.18788 [eess.IV]
	(or arXiv:2311.18788v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2311.18788
Journal reference:	Medical Image Analysis (Volume 69, April 2021, 101942)
Related DOI:	https://doi.org/10.1016/j.media.2020.101942

Submission history

From: Xiaofeng Liu
[v1] Thu, 30 Nov 2023 18:37:21 UTC (2,426 KB)
