Topic Modeling for Classification of Clinical Reports

Kayi, Efsun Sarioglu; Yadav, Kabir; Chamberlain, James M.; Choi, Hyeong-Ah

Abstract:Electronic health records (EHRs) contain important clinical information about patients. Efficient and effective use of this information could supplement or even replace manual chart review as a means of studying and improving the quality and safety of healthcare delivery. However, some of these clinical data are in the form of free text and require pre-processing before use in automated systems. A common free text data source is radiology reports, typically dictated by radiologists to explain their interpretations. We sought to demonstrate machine learning classification of computed tomography (CT) imaging reports into binary outcomes, i.e. positive and negative for fracture, using regular text classification and classifiers based on topic modeling. Topic modeling provides interpretable themes (topic distributions) in reports, a representation that is more compact than the commonly used bag-of-words representation and can be processed faster than raw text in subsequent automated processes. We demonstrate new classifiers based on this topic modeling representation of the reports. Aggregate topic classifier (ATC) and confidence-based topic classifier (CTC) use a single topic that is determined from the training dataset based on different measures to classify the reports on the test dataset. Alternatively, similarity-based topic classifier (STC) measures the similarity between the reports' topic distributions to determine the predicted class. Our proposed topic modeling-based classifier systems are shown to be competitive with existing text classification techniques and provides an efficient and interpretable representation.

Comments:	18 pages
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1706.06177 [cs.CL]
	(or arXiv:1706.06177v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1706.06177

Computer Science > Computation and Language

Title:Topic Modeling for Classification of Clinical Reports

Submission history

Access Paper:

References & Citations

1 blog link

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators