-
M-SCAN: A Multistage Framework for Lumbar Spinal Canal Stenosis Grading Using Multi-View Cross Attention
Authors:
Arnesh Batra,
Arush Gumber,
Anushk Kumar
Abstract:
The increasing prevalence of lumbar spinal canal stenosis has resulted in a surge in MRI (Magnetic Resonance Imaging) examinations, leading to labor-intensive interpretation and significant inter-reader variability, even among expert radiologists. This paper introduces a novel and efficient deep-learning framework that fully automates the grading of lumbar spinal canal stenosis. We demonstrate state-of-the-art performance in grading spinal canal stenosis on a dataset of 1,975 unique studies, each containing three distinct types of 3D cross-sectional spine images: Axial T2, Sagittal T1, and Sagittal T2/STIR. Employing a distinctive training strategy, our proposed multistage approach effectively integrates sagittal and axial images. This strategy employs a multi-view model with a sequence-based architecture, optimizing feature extraction and cross-view alignment to achieve an AUROC (Area Under the Receiver Operating Characteristic Curve) of 0.971 in spinal canal stenosis grading, surpassing other state-of-the-art methods.
Submitted 3 March, 2025;
originally announced March 2025.
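The M-SCAN abstract reports its headline result as an AUROC. As a minimal sketch of how that metric can be computed, the Python below implements the pairwise (Mann-Whitney) definition and a one-vs-rest macro average for multi-class grading. The function names, the toy data, and the choice of macro averaging are assumptions made for illustration; the abstract does not state how the authors aggregate AUROC across grades.

```python
from typing import Sequence


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auroc(labels: Sequence[int],
                    probs: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of one-vs-rest AUROCs (an assumed aggregation)."""
    num_classes = len(probs[0])
    per_class = []
    for c in range(num_classes):
        one_vs_rest = [1 if y == c else 0 for y in labels]
        class_scores = [p[c] for p in probs]
        per_class.append(binary_auroc(one_vs_rest, class_scores))
    return sum(per_class) / num_classes


# Hypothetical toy data, not taken from the paper.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(binary_auroc(toy_labels, toy_scores))  # 0.75
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based implementation would be preferable on a dataset of realistic size.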
-
ASGIR: Audio Spectrogram Transformer Guided Classification And Information Retrieval For Birds
Authors:
Yashwardhan Chaudhuri,
Paridhi Mundra,
Arnesh Batra,
Orchid Chetia Phukan,
Arun Balaji Buduru
Abstract:
Recognition and interpretation of bird vocalizations are pivotal in ornithological research and ecological conservation efforts due to their significance in understanding avian behaviour, performing habitat assessment and judging ecological health. This paper presents an audio spectrogram-guided classification framework called ASGIR for improved bird sound recognition and information retrieval. Our work is accompanied by a simple-to-use, two-step information retrieval system that uses geographical location and bird sounds to localize and retrieve relevant bird information by scraping Wikipedia page information of recognized birds. ASGIR achieves strong performance on a random subset of 51 classes of bird sounds from European countries in the Xeno-Canto dataset, with a median of 100% on the F1, Precision and Sensitivity metrics. Our code is available at https://github.com/MainSample1234/AS-GIR.
Submitted 10 July, 2024;
originally announced July 2024.
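The ASGIR abstract summarizes performance as a median of F1, Precision and Sensitivity. The sketch below shows one way such a figure could be computed, assuming the median is taken over per-class scores; that reading, the function names, and the toy labels are assumptions, not details confirmed by the abstract.

```python
from statistics import median
from typing import Dict, Hashable, Sequence, Tuple


def per_class_metrics(y_true: Sequence[Hashable],
                      y_pred: Sequence[Hashable]) -> Dict[Hashable, Tuple[float, float, float]]:
    """Return {class: (precision, sensitivity, f1)} using one-vs-rest counts."""
    results = {}
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results[c] = (precision, sensitivity, f1)
    return results


def median_metrics(y_true, y_pred) -> Tuple[float, float, float]:
    """Median of each metric across classes (assumed aggregation)."""
    per_class = per_class_metrics(y_true, y_pred).values()
    precisions, sensitivities, f1s = zip(*per_class)
    return median(precisions), median(sensitivities), median(f1s)


# Hypothetical three-class example with one misclassified clip.
toy_true = ["a", "a", "b", "b", "c", "c"]
toy_pred = ["a", "a", "b", "b", "c", "a"]
print(median_metrics(toy_true, toy_pred))
```

In this toy case the median precision and median sensitivity are both 1.0 even though one clip is misclassified, which illustrates that a median summarizes the typical class rather than every class.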
-
Flatness-based motion planning for a non-uniform moving cantilever Euler-Bernoulli beam with a tip-mass
Authors:
Soham Chatterjee,
Aman Batra,
Vivek Natarajan
Abstract:
Consider a non-uniform Euler-Bernoulli beam with a tip-mass at one end and a cantilever joint at the other end. The cantilever joint is not fixed and can itself be moved along an axis perpendicular to the beam. The position of the cantilever joint is the control input to the beam. The dynamics of the beam are governed by a coupled PDE-ODE model with boundary input. On a natural state-space, there exists a unique state trajectory for this beam model for every initial state and each smooth control input that is compatible with the initial state. In this paper, we study the motion planning problem of transferring the beam from an initial state to a final state over a prescribed time interval. We address this problem by extending the generating functions approach to flatness-based control, originally proposed in the literature for motion planning of parabolic PDEs, to the beam model. We prove that such a transfer is possible if the initial and final states belong to a certain set, which also contains steady-states of the beam. We illustrate our theoretical results using simulations and experiments.
Submitted 23 July, 2024;
originally announced July 2024.
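For orientation, a coupled PDE-ODE model of the kind the beam abstract describes can be sketched as follows. This is a textbook-style form written under stated assumptions, not the paper's exact model: $w(x,t)$ is the transverse displacement, $\rho(x)$ the mass per unit length, $EI(x)$ the flexural rigidity, $m$ a point tip-mass with no rotary inertia, $L$ the beam length, and $u(t)$ the joint position. All of these symbols are introduced here for illustration.

```latex
\begin{align}
  \rho(x)\, w_{tt}(x,t) + \bigl(EI(x)\, w_{xx}(x,t)\bigr)_{xx} &= 0,
      & x &\in (0, L), \\
  w(0,t) &= u(t), & w_x(0,t) &= 0, \\
  EI(L)\, w_{xx}(L,t) &= 0, &
  \bigl(EI\, w_{xx}\bigr)_x(L,t) &= m\, w_{tt}(L,t).
\end{align}
% Generic flatness-style ansatz (an assumption, with y(t) a smooth
% flat output and g_k(x) spatial generating functions):
\begin{equation}
  w(x,t) = \sum_{k \ge 0} g_k(x)\, y^{(2k)}(t).
\end{equation}
% Substituting into the PDE and matching derivatives of y gives
\begin{equation}
  \bigl(EI\, g_0''\bigr)'' = 0, \qquad
  \bigl(EI\, g_k''\bigr)'' = -\rho\, g_{k-1} \quad (k \ge 1).
\end{equation}
```

Under this ansatz the input follows from the first boundary condition as $u(t) = \sum_{k \ge 0} g_k(0)\, y^{(2k)}(t)$, so planning a motion amounts to choosing a suitable $y$; the convergence conditions on the series are the subject of the paper and are not reproduced here.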
-
Reviewing FID and SID Metrics on Generative Adversarial Networks
Authors:
Ricardo de Deijn,
Aishwarya Batra,
Brandon Koch,
Naseef Mansoor,
Hema Makkena
Abstract:
The growth of generative adversarial network (GAN) models has expanded image-processing capabilities and provides numerous industries with the technology to produce realistic image transformations. However, because the field is recently established, new evaluation metrics can further this research. Previous research has shown the Fréchet Inception Distance (FID) to be an effective metric when testing these image-to-image GANs in real-world applications. Signed Inception Distance (SID), a metric introduced in 2023, expands on FID by allowing unsigned distances. This paper uses public datasets that consist of façades, cityscapes, and maps within Pix2Pix and CycleGAN models. After training, these models are evaluated on both inception distance metrics, which measure the generative performance of the trained models. Our findings indicate that SID is an efficient and effective metric that complements, or even exceeds, FID for evaluating image-to-image GANs.
Submitted 5 February, 2024;
originally announced February 2024.
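The FID mentioned in this abstract is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated images. A minimal NumPy sketch of that standard formula follows. The random arrays stand in for Inception features, which are assumed to be precomputed, and the helper names are hypothetical. SID is not sketched because the abstract does not define it.

```python
import numpy as np


def gaussian_stats(features: np.ndarray):
    """Mean vector and covariance matrix of an (n_samples, n_dims) array."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2) -> float:
    """||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)).

    The trace of the matrix square root is taken as the sum of square roots
    of the eigenvalues of S1 @ S2, which are real and non-negative for
    positive semi-definite covariances (small numerical noise is clipped).
    """
    diff = mu1 - mu2
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2)
                 - 2.0 * trace_sqrt)


# Stand-in features: a real implementation would use Inception activations.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))
fake_feats = real_feats + 3.0  # same covariance, every mean shifted by 3

fid_same = frechet_distance(*gaussian_stats(real_feats),
                            *gaussian_stats(real_feats))
fid_shifted = frechet_distance(*gaussian_stats(real_feats),
                               *gaussian_stats(fake_feats))
print(fid_same, fid_shifted)
```

Because the shifted set has the same covariance, the covariance term cancels and the distance reduces to the squared mean shift, 4 dimensions times 3 squared, up to floating-point error.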