A Dataset for Medical Instructional Video Classification and Question Answering

Gupta, Deepak; Attal, Kush; Demner-Fushman, Dina

arXiv:2201.12888 (cs)

[Submitted on 30 Jan 2022]

Abstract: This paper introduces a new challenge and datasets to foster research toward designing systems that can understand medical videos and provide visual answers to natural language questions. We believe medical videos may provide the best possible answers to many first aid, medical emergency, and medical education questions. To this end, we created the MedVidCL and MedVidQA datasets and introduced the tasks of Medical Video Classification (MVC) and Medical Visual Answer Localization (MVAL), two tasks that focus on cross-modal (medical language and medical video) understanding. The proposed tasks and datasets have the potential to support the development of sophisticated downstream applications that can benefit the public and medical practitioners. Our datasets consist of 6,117 annotated videos for the MVC task and 3,010 annotated questions with answer timestamps from 899 videos for the MVAL task. These datasets have been verified and corrected by medical informatics experts. We have also benchmarked each task with the created MedVidCL and MedVidQA datasets and proposed multimodal learning methods that set competitive baselines for future research.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Cite as:	arXiv:2201.12888 [cs.CV]
	(or arXiv:2201.12888v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2201.12888

Submission history

From: Deepak Gupta
[v1] Sun, 30 Jan 2022 18:06:31 UTC (4,654 KB)
