-
TEMSET-24K: Densely Annotated Dataset for Indexing Multipart Endoscopic Videos using Surgical Timeline Segmentation
Authors:
Muhammad Bilal,
Mahmood Alam,
Deepa Bapu,
Stephan Korsgen,
Neeraj Lal,
Simon Bach,
Amir M Hajivanand,
Muhammed Ali,
Kamran Soomro,
Iqbal Qasim,
Paweł Capik,
Aslam Khan,
Zaheer Khan,
Hunaid Vohra,
Massimo Caputo,
Andrew Beggs,
Adnan Qayyum,
Junaid Qadir,
Shazad Ashraf
Abstract:
Indexing endoscopic surgical videos is vital in surgical data science, forming the basis for systematic retrospective analysis and clinical performance evaluation. Despite its significance, current video analytics rely on manual indexing, a time-consuming process. Advances in computer vision, particularly deep learning, offer automation potential, yet progress is limited by the lack of publicly available, densely annotated surgical datasets. To address this, we present TEMSET-24K, an open-source dataset comprising 24,306 trans-anal endoscopic microsurgery (TEMS) video micro-clips. Each clip is meticulously annotated by clinical experts using a novel hierarchical labeling taxonomy encompassing phase, task, and action triplets, capturing intricate surgical workflows. To validate this dataset, we benchmarked deep learning models, including transformer-based architectures. Our in silico evaluation demonstrates high accuracy (up to 0.99) and F1 scores (up to 0.99) for key phases like Setup and Suturing. The STALNet model, tested with ConvNeXt, ViT, and SWIN V2 encoders, consistently segmented well-represented phases. TEMSET-24K provides a critical benchmark, propelling state-of-the-art solutions in surgical data science.
Submitted 10 February, 2025;
originally announced February 2025.
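The TEMSET-24K abstract above reports per-phase accuracy and F1 scores over clips labeled by phase, task, and action. As a minimal sketch of how such a per-phase F1 could be computed, assuming each clip carries one (phase, task, action) label: the `ClipLabel` type, the `per_phase_f1` helper, and the task and action strings are hypothetical illustrations, not the dataset's actual schema or the authors' evaluation code. Only the phase names Setup and Suturing come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipLabel:
    """Hypothetical hierarchical label for one micro-clip (assumed layout)."""
    phase: str
    task: str
    action: str


def per_phase_f1(true_labels, pred_labels):
    """Return {phase: F1} by comparing only the phase level of each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t.phase == p.phase:
            tp[t.phase] += 1
        else:
            fp[p.phase] += 1  # predicted phase was wrong
            fn[t.phase] += 1  # true phase was missed
    phases = {l.phase for l in true_labels} | {l.phase for l in pred_labels}
    scores = {}
    for phase in phases:
        denom = 2 * tp[phase] + fp[phase] + fn[phase]
        scores[phase] = 2 * tp[phase] / denom if denom else 0.0
    return scores


# Toy data: task and action values are made up for illustration.
truth = [
    ClipLabel("Setup", "task-a", "action-x"),
    ClipLabel("Setup", "task-a", "action-y"),
    ClipLabel("Suturing", "task-b", "action-z"),
    ClipLabel("Suturing", "task-b", "action-z"),
]
preds = [
    ClipLabel("Setup", "task-a", "action-x"),
    ClipLabel("Suturing", "task-b", "action-z"),
    ClipLabel("Suturing", "task-b", "action-z"),
    ClipLabel("Suturing", "task-b", "action-z"),
]
scores = per_phase_f1(truth, preds)
```

The same counting could be repeated at the task or action level by comparing a different field of the label.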
-
Online Localization and Prediction of Actions and Interactions
Authors:
Khurram Soomro,
Haroon Idrees,
Mubarak Shah
Abstract:
This paper proposes a person-centric and online approach to the challenging problem of localization and prediction of actions and interactions in videos. Typically, localization or recognition is performed in an offline manner where all the frames in the video are processed together. This prevents timely localization and prediction of actions and interactions - an important consideration for many tasks including surveillance and human-machine interaction.
In our approach, we estimate human poses at each frame and train discriminative appearance models using the superpixels inside the pose bounding boxes. Since the pose estimation per frame is inherently noisy, the conditional probability of pose hypotheses at the current time-step (frame) is computed using pose estimations in the current frame and their consistency with poses in the previous frames. Next, both the superpixel and pose-based foreground likelihoods are used to infer the location of actors at each time-step through a Conditional Random Field. The issue of visual drift is handled by updating the appearance models, and refining poses using motion smoothness on joint locations, in an online manner. For online prediction of action (interaction) confidences, we propose an approach based on Structural SVM that operates on short video segments, and is trained with the objective that the confidence of an action or interaction increases as time progresses. Lastly, we quantify the performance of both detection and prediction together, and analyze how the prediction accuracy varies as a function of the observed duration of the action (interaction) at different levels of detection performance. Our experiments on several datasets suggest that despite using only a few frames to localize actions (interactions) at each time instant, we are able to obtain results competitive with state-of-the-art offline methods.
Submitted 4 December, 2016;
originally announced December 2016.
-
Providing Traceability for Neuroimaging Analyses
Authors:
R. McClatchey,
A. Branson,
A. Anjum,
P. Bloodsworth,
I. Habib,
K. Munir,
J. Shamdasani,
K. Soomro,
the neuGRID Consortium
Abstract:
With the increasingly digital nature of biomedical data and as the complexity of analyses in medical research increases, the need for accurate information capture, traceability and accessibility has become crucial to medical researchers in the pursuance of their research goals. Grid- or Cloud-based technologies, often based on so-called Service Oriented Architectures (SOA), are increasingly being seen as viable solutions for managing distributed data and algorithms in the biomedical domain. For neuroscientific analyses, especially those centred on complex image analysis, traceability of processes and datasets is essential, but up to now this has not been captured in a manner that facilitates collaborative study. Over the past decade, we have been working with mammographers, paediatricians and neuroscientists in three generations of projects to provide the data management and provenance services now required for 21st-century medical research. This paper outlines the findings of a requirements study and a resulting system architecture for the production of services to support neuroscientific studies of biomarkers for Alzheimer's disease. The paper proposes a software infrastructure and services that provide the foundation for such support. It introduces the use of the CRISTAL software to provide provenance management as one of a number of services delivered on a SOA, deployed to manage neuroimaging projects that have been studying biomarkers for Alzheimer's disease.
Submitted 24 February, 2014;
originally announced February 2014.
-
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
Authors:
Khurram Soomro,
Amir Roshan Zamir,
Mubarak Shah
Abstract:
We introduce UCF101, which is currently the largest dataset of human actions. It consists of 101 action classes, over 13k clips and 27 hours of video data. The database consists of realistic user-uploaded videos containing camera motion and cluttered background. Additionally, we provide baseline action recognition results on this new dataset using a standard bag-of-words approach, with an overall performance of 44.5%. To the best of our knowledge, UCF101 is currently the most challenging dataset of actions due to its large number of classes, large number of clips and the unconstrained nature of those clips.
Submitted 3 December, 2012;
originally announced December 2012.
-
PhantomOS: A Next Generation Grid Operating System
Authors:
Irfan Habib,
Kamran Soomro,
Ashiq Anjum,
Richard McClatchey,
Arshad Ali,
Peter Bloodsworth
Abstract:
Grid computing has made substantial advances in the past decade; these are primarily due to the adoption of standardized Grid middleware. However, Grid computing has not yet become pervasive because of some barriers that we believe have been caused by the adoption of middleware-centric approaches. These barriers include: scant support for major types of applications such as interactive applications; lack of flexible, autonomic and scalable Grid architectures; lack of plug-and-play Grid computing; and, most importantly, no straightforward way to set up and administer Grids. PhantomOS is a project which aims to address many of these barriers. Its goal is the creation of a user-friendly pervasive Grid computing platform that facilitates the rapid deployment and easy maintenance of Grids whilst providing support for major types of applications on Grids of almost any topology. In this paper we present the detailed system architecture and an overview of its implementation.
Submitted 5 July, 2007;
originally announced July 2007.
-
From Grid Middleware to a Grid Operating System
Authors:
Arshad Ali,
Richard McClatchey,
Ashiq Anjum,
Irfan Habib,
Kamran Soomro,
Mohammed Asif,
Ali Adil,
Athar Mohsin
Abstract:
Grid computing has made substantial advances during the last decade. Grid middleware such as Globus has contributed greatly in making this possible. There are, however, significant barriers to the adoption of Grid computing in other fields, most notably day-to-day user computing environments. We will demonstrate in this paper that this is primarily due to the limitations of the existing Grid middleware, which does not take into account the needs of everyday scientific and business users. In this paper we will formally advocate a Grid Operating System and propose an architecture to migrate Grid computing into a Grid operating system, which we believe would help remove most of the technical barriers to the adoption of Grid computing and make it relevant to the day-to-day user. We believe this proposed transition to a Grid operating system will drive more pervasive Grid computing research and application development and deployment in the future.
Submitted 8 August, 2006;
originally announced August 2006.