-
SafeTy Reasoning Elicitation Alignment for Multi-Turn Dialogues
Authors:
Martin Kuo,
Jianyi Zhang,
Aolin Ding,
Louis DiValentin,
Amin Hass,
Benjamin F Morris,
Isaac Jacobson,
Randolph Linderman,
James Kiessling,
Nicolas Ramos,
Bhavna Gopal,
Maziyar Baran Pouyan,
Changwei Liu,
Hai Li,
Yiran Chen
Abstract:
Malicious attackers can exploit large language models (LLMs) by engaging them in multi-turn dialogues to achieve harmful objectives, posing significant safety risks to society. To address this challenge, we propose a novel defense mechanism: SafeTy Reasoning Elicitation Alignment for Multi-Turn Dialogues (STREAM). STREAM defends LLMs against multi-turn attacks while preserving their functional capabilities. Our approach involves constructing a human-annotated dataset, the Safety Reasoning Multi-turn Dialogues dataset, which is used to fine-tune a plug-and-play safety reasoning moderator. This model is designed to identify malicious intent hidden within multi-turn conversations and alert the target LLM of potential risks. We evaluate STREAM across multiple LLMs against prevalent multi-turn attack strategies. Experimental results demonstrate that our method significantly outperforms existing defense techniques, reducing the Attack Success Rate (ASR) by 51.2%, all while maintaining comparable LLM capability.
Submitted 31 May, 2025;
originally announced June 2025.
-
Spatio-temporal pulse propagation during highly-resolved onset of Rayleigh-Taylor and Kelvin-Helmholtz Rayleigh-Taylor instabilities
Authors:
Bhavna Joshi,
Aditi Sengupta,
Yassin Ajanif,
Lucas Lestandi
Abstract:
The present study explores the onset of the Rayleigh-Taylor instability (RTI) and Kelvin-Helmholtz Rayleigh-Taylor instability (KHRTI) with highly-resolved direct numerical simulations of two setups which consider air at different temperatures (or densities) and/or velocities in two halves of three-dimensional cuboidal domains. The compressible Navier-Stokes equations are solved using a novel parallel algorithm which does not involve overlapping points at sub-domain boundaries. The pressure disturbance field is compared during the onset of RTI and KHRTI, and the corresponding convection- and advection-dominated mechanisms are highlighted by instantaneous features, spectra, and proper orthogonal decomposition. The relative contributions of pressure, kinetic energy and rotational energy to the overall energy budget are explored for both instabilities, revealing the acoustic trigger to be the incipient mechanism for both RTI and KHRTI. The nonlinear, spatio-temporal nature of the instability is further explored by application of a transport equation for enstrophy of compressible flows. This provides insights into the similarities and differences between the onset mechanisms of RTI and KHRTI, serving as a benchmark data set for shear and buoyancy-driven instabilities across diverse applications in geophysics, nuclear energy and atmospheric fluid dynamics.
Submitted 12 May, 2025;
originally announced May 2025.
-
Subgroup Performance of a Commercial Digital Breast Tomosynthesis Model for Breast Cancer Detection
Authors:
Beatrice Brown-Mulry,
Rohan Satya Isaac,
Sang Hyup Lee,
Ambika Seth,
KyungJee Min,
Theo Dapamede,
Frank Li,
Aawez Mansuri,
MinJae Woo,
Christian Allison Fauria-Robinson,
Bhavna Paryani,
Judy Wawira Gichoya,
Hari Trivedi
Abstract:
While research has established the potential of AI models for mammography to improve breast cancer screening outcomes, there have not been any detailed subgroup evaluations performed to assess the strengths and weaknesses of commercial models for digital breast tomosynthesis (DBT) imaging. This study presents a granular evaluation of the Lunit INSIGHT DBT model on a large retrospective cohort of 163,449 screening mammography exams from the Emory Breast Imaging Dataset (EMBED). Model performance was evaluated in a binary context with various negative exam types (162,081 exams) compared against screen detected cancers (1,368 exams) as the positive class. The analysis was stratified across demographic, imaging, and pathologic subgroups to identify potential disparities. The model achieved an overall AUC of 0.91 (95% CI: 0.90-0.92) with a precision of 0.08 (95% CI: 0.08-0.08), and a recall of 0.73 (95% CI: 0.71-0.76). Performance was found to be robust across demographics, but cases with non-invasive cancers (AUC: 0.85, 95% CI: 0.83-0.87), calcifications (AUC: 0.80, 95% CI: 0.78-0.82), and dense breast tissue (AUC: 0.90, 95% CI: 0.88-0.91) were associated with significantly lower performance compared to other groups. These results highlight the need for detailed evaluation of model characteristics and vigilance in considering adoption of new tools for clinical deployment.
Submitted 17 March, 2025;
originally announced March 2025.
-
Qubit-Based Framework for Quantum Machine Learning: Bridging Classical Data and Quantum Algorithms
Authors:
Bhavna Bose,
Saurav Verma
Abstract:
This paper dives into the exciting and rapidly growing field of quantum computing, explaining its core ideas, current progress, and how it could revolutionize the way we solve complex problems. It starts by breaking down the basics, like qubits, quantum circuits, and how principles like superposition and entanglement make quantum computers fundamentally different from, and for certain tasks far more powerful than, the classical computers we use today. We also explore how quantum computing deals with complex problems and why it is uniquely suited for challenges classical systems struggle to handle. A big part of this paper focuses on Quantum Machine Learning (QML), where the strengths of quantum computing meet the world of artificial intelligence. By processing massive datasets and optimizing intricate algorithms, quantum systems offer new possibilities for machine learning. We highlight different approaches to combining quantum and classical computing, showing how they can work together to produce faster and more accurate results. Additionally, we explore the tools and platforms available, such as TensorFlow Quantum, Qiskit and PennyLane, that are helping researchers and developers bring these theories to life. Of course, quantum computing has its hurdles. Challenges like scaling up hardware, correcting errors, and keeping qubits stable are significant roadblocks. Yet, with rapid advancements in cloud-based platforms and innovative technologies, the potential of quantum computing feels closer than ever. This paper aims to offer readers a clear and comprehensive introduction to quantum computing, its role in machine learning, and the immense possibilities it holds for the future of technology.
Submitted 17 February, 2025;
originally announced February 2025.
-
Proton and neutron electromagnetic form factors using $N_f$=2+1+1 twisted-mass fermions with physical values of the quark masses
Authors:
Constantia Alexandrou,
Simone Bacchio,
Giannis Koutsou,
Bhavna Prasad,
Gregoris Spanoudes
Abstract:
We compute the electromagnetic form factors of the proton and neutron using lattice QCD with $N_f = 2 + 1 + 1$ twisted mass clover-improved fermions and quark masses tuned to their physical values. Three ensembles with lattice spacings of $a$=0.080 fm, 0.068 fm, and 0.057 fm, and approximately the same physical volume allow us to obtain the continuum limit directly at the physical pion mass. Several values of the source-sink time separation ranging from 0.5 fm to 1.5 fm are used, enabling a thorough analysis of excited state effects via multi-state fits. The disconnected contributions are analyzed using high statistics for the two-point functions combined with low-mode deflation and hierarchical probing for the fermion loop estimation. We study the momentum dependence of the form factors using the z-expansion and dipole Ansaetze, thereby enabling the extraction of the electric and magnetic radii, as well as the magnetic moments in the continuum limit, for which we provide preliminary results.
Submitted 16 February, 2025;
originally announced February 2025.
-
Hamming Attention Distillation: Binarizing Keys and Queries for Efficient Long-Context Transformers
Authors:
Mark Horton,
Tergel Molom-Ochir,
Peter Liu,
Bhavna Gopal,
Chiyue Wei,
Cong Guo,
Brady Taylor,
Deliang Fan,
Shan X. Wang,
Hai Li,
Yiran Chen
Abstract:
Pre-trained transformer models with extended context windows are notoriously expensive to run at scale, often limiting real-world deployment due to their high computational and memory requirements. In this paper, we introduce Hamming Attention Distillation (HAD), a novel framework that binarizes keys and queries in the attention mechanism to achieve significant efficiency gains. By converting keys and queries into {-1, +1} vectors and replacing dot-product operations with efficient Hamming distance computations, our method drastically reduces computational overhead. Additionally, we incorporate attention matrix sparsification to prune low-impact activations, which further reduces the cost of processing long-context sequences.
Despite these aggressive compression strategies, our distilled approach preserves a high degree of representational power, leading to substantially improved accuracy compared to prior transformer binarization methods. We evaluate HAD on a range of tasks and models, including the GLUE benchmark, ImageNet, and QuALITY, demonstrating state-of-the-art performance among binarized Transformers while drastically reducing the computational costs of long-context inference.
We implement HAD in custom hardware simulations, demonstrating superior performance characteristics compared to a custom hardware implementation of standard attention. HAD achieves just $\mathbf{1.78}\%$ performance losses on GLUE compared to $9.08\%$ in state-of-the-art binarization work, and $\mathbf{2.5}\%$ performance losses on ImageNet compared to $12.14\%$, all while targeting custom hardware with a $\mathbf{79}\%$ area reduction and $\mathbf{87}\%$ power reduction compared to its standard attention counterpart.
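As a small illustration of why binarized keys and queries allow dot products to be replaced by Hamming distances: for two vectors in {-1, +1}^d, matching positions contribute +1 and differing positions contribute -1, so q·k = d - 2·H(q, k), where H is the Hamming distance. The sketch below only checks this identity with NumPy; the function names are hypothetical and it is not the HAD implementation, which the abstract does not detail.

```python
import numpy as np


def hamming_distance(a, b):
    """Count the positions where two {-1, +1} vectors differ."""
    return int(np.sum(a != b))


def dot_via_hamming(q, k):
    """Recover the dot product of two {-1, +1} vectors from their Hamming distance.

    Matching positions contribute +1 and differing positions contribute -1,
    so q . k = (d - H) - H = d - 2 * H.
    """
    d = q.shape[0]
    return d - 2 * hamming_distance(q, k)


# Illustrative check on random binarized vectors (hypothetical dimension d = 64).
rng = np.random.default_rng(0)
q = rng.choice([-1, 1], size=64)
k = rng.choice([-1, 1], size=64)
assert dot_via_hamming(q, k) == int(q @ k)
```

On hardware, the Hamming distance of bit-packed vectors can typically be computed with an XOR followed by a population count, which is one plausible source of the efficiency gains the abstract describes.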
Submitted 3 February, 2025;
originally announced February 2025.
-
SAFER: Sharpness Aware layer-selective Finetuning for Enhanced Robustness in vision transformers
Authors:
Bhavna Gopal,
Huanrui Yang,
Mark Horton,
Yiran Chen
Abstract:
Vision transformers (ViTs) have become essential backbones in advanced computer vision applications and multi-modal foundation models. Despite their strengths, ViTs remain vulnerable to adversarial perturbations, comparable to or even exceeding the vulnerability of convolutional neural networks (CNNs). Furthermore, the large parameter count and complex architecture of ViTs make them particularly prone to adversarial overfitting, often compromising both clean and adversarial accuracy.
This paper mitigates adversarial overfitting in ViTs through a novel, layer-selective fine-tuning approach: SAFER. Instead of optimizing the entire model, we identify and selectively fine-tune a small subset of layers most susceptible to overfitting, applying sharpness-aware minimization to these layers while freezing the rest of the model. Our method consistently enhances both clean and adversarial accuracy over baseline approaches. Typical improvements are around 5%, with some cases achieving gains as high as 20% across various ViT architectures and datasets.
Submitted 2 January, 2025;
originally announced January 2025.
-
Impact of Scalar NSI on Spatial and Temporal Correlations in Neutrino Oscillations
Authors:
Bhavna Yadav,
Ashutosh Kumar Alok
Abstract:
Neutrino oscillation experiments are gradually approaching an era of precision, where subleading effects can also be tested. One such subleading effect is Non-Standard Interactions (NSI), which can play a crucial role in neutrino oscillations. Various works have typically discussed vector NSI in the context of quantum correlations. Recently, there have been improvements in the bounds on scalar NSI as well. In light of these developments, we aim to examine the impact of scalar NSI on quantum correlation measures. To analyze this impact, we are considering the strongest measure of quantum correlation, i.e., non-locality. Our study will encompass both spatial and temporal non-locality measures.
Submitted 26 November, 2024;
originally announced November 2024.
-
Steering in Neutrino Oscillations with Non-Standard Interaction
Authors:
Lekhashri Konwar,
Bhavna Yadav
Abstract:
In this study, we analyze the influence of Non-Standard Interaction (NSI) on steering in three-flavor neutrino oscillations, with a focus on the NO$ν$A and DUNE experimental setups. DUNE, having a longer baseline, exhibits a more pronounced deviation towards NSI in steering compared to NO$ν$A. Within the energy range where DUNE's maximum flux appears, the steering value for DUNE shows a $21\%$ deviation from the Standard Model (SM) to NSI for normal ordering (NO), while for inverted ordering (IO), the steering value increases by approximately $15\%$ relative to the SM. We conduct a comparative analysis of nonlocality, steering, and entanglement. Additionally, we express steering in terms of three-flavor neutrino oscillation probabilities and explore the relationship between steering inequality and concurrence.
Submitted 21 November, 2024;
originally announced November 2024.
-
KModels: Unlocking AI for Business Applications
Authors:
Roy Abitbol,
Eyal Cohen,
Muhammad Kanaan,
Bhavna Agrawal,
Yingjie Li,
Anuradha Bhamidipaty,
Erez Bilgory
Abstract:
As artificial intelligence (AI) continues to rapidly advance, there is a growing demand to integrate AI capabilities into existing business applications. However, a significant gap exists between the rapid progress in AI and how slowly AI is being embedded into business environments. Deploying well-performing lab models into production settings, especially in on-premise environments, often entails specialized expertise and imposes a heavy burden of model management, creating significant barriers to implementing AI models in real-world applications.
KModels leverages proven libraries and platforms (Kubeflow Pipelines, KServe) to streamline AI adoption by supporting both AI developers and consumers. It allows model developers to focus solely on model development and share models as transportable units (Templates), abstracting away complex production deployment concerns. KModels enables AI consumers to eliminate the need for a dedicated data scientist, as the templates encapsulate most data science considerations while providing business-oriented control.
This paper presents the architecture of KModels and the key decisions that shape it. We outline KModels' main components as well as its interfaces. Furthermore, we explain how KModels is highly suited for on-premise deployment but can also be used in cloud environments.
The efficacy of KModels is demonstrated through the successful deployment of three AI models within an existing Work Order Management system. These models operate in a client's data center and are trained on local data, without data scientist intervention. One model improved the accuracy of Failure Code specification for work orders from 46% to 83%, showcasing the substantial benefit of accessible and localized AI solutions.
Submitted 8 September, 2024;
originally announced September 2024.
-
Criticality Leveraged Adversarial Training (CLAT) for Boosted Performance via Parameter Efficiency
Authors:
Bhavna Gopal,
Huanrui Yang,
Jingyang Zhang,
Mark Horton,
Yiran Chen
Abstract:
Adversarial training enhances neural network robustness but suffers from a tendency to overfit and increased generalization errors on clean data. This work introduces CLAT, an innovative approach that mitigates adversarial overfitting by introducing parameter efficiency into the adversarial training process, improving both clean accuracy and adversarial robustness. Instead of tuning the entire model, CLAT identifies and fine-tunes robustness-critical layers - those predominantly learning non-robust features - while freezing the remaining model to enhance robustness. It employs dynamic critical layer selection to adapt to changes in layer criticality throughout the fine-tuning process. Empirically, CLAT can be applied on top of existing adversarial training methods, significantly reduces the number of trainable parameters by approximately 95%, and achieves more than a 2% improvement in adversarial robustness compared to baseline methods.
Submitted 23 December, 2024; v1 submitted 19 August, 2024;
originally announced August 2024.
-
Pulse propagation in the quiescent environment during direct numerical simulation of Rayleigh-Taylor instability: Solution by Bromwich contour integral method
Authors:
Tapan K. Sengupta,
Bhavna Joshi,
Prasannabalaji Sundaram
Abstract:
In "Three-dimensional direct numerical simulation (DNS) of Rayleigh-Taylor instability (RTI) triggered by acoustic excitation" (Sengupta et al., 34, 054108 (2022)), the receptivity of RTI to pressure pulses has been established. It has also been shown that at the onset of RTI these pulses are one-dimensional and that the dissipation of the pressure pulses is governed by a dissipative wave equation. The propagation of these infrasonic to ultrasonic pressure pulses has been studied theoretically and numerically by a high-fidelity numerical procedure in the physical plane. The numerical results are consistent with the theoretical analysis and the DNS of RTI noted above. The properties of pulse propagation in a quiescent dissipative ambience have been theoretically obtained from the linearized compressible Navier-Stokes equation, without Stokes' hypothesis. This analysis is extended here for a special class of excitation, with combinations of wavenumbers and circular frequencies for which the phase shift in an imposed time period is an integral multiple of $π$, and the signal amplification is by a real factor. Here, the governing partial differential equation (PDE) for the free-field propagation of pulses is solved by the Bromwich contour integral method in the spectral plane. This method, for an input Gaussian pulse excited at a fixed frequency, is the so-called signal problem. Responses for phase shifts that are integral multiples of $π$ can reinforce each other due to the phase coherence. It is shown that these combinations occur at a fixed wavenumber, with higher frequencies attenuated more in such a sequence.
Submitted 7 June, 2024;
originally announced June 2024.
-
NSI effects on tripartite entanglement in neutrino oscillations
Authors:
Lekhashri Konwar,
Bhavna Yadav
Abstract:
In this study, we investigate the impact of new physics on different measures of tripartite entanglement within the context of three-flavor neutrino oscillations. These measures encompass concurrence, entanglement of formation, and negativity. We analyze the influence of new physics on these measures across a range of experimental setups involving both reactors and accelerators. Reactor experiments under consideration include Daya Bay, JUNO, and KamLAND setups, while accelerator experiments encompass T2K, MINOS, and DUNE. Our analysis reveals that accelerator experiments demonstrate greater sensitivity to NSI, with the most pronounced impact observed in the DUNE experiment. Negativity, while a weaker metric compared to EOF and concurrence, exhibits maximal sensitivity to NSI effects, particularly evident when neutrinos possess moderate to high energies. Conversely, reactor experiments demonstrate less sensitivity to NSI, with concurrence and EOF displaying more prominent effects.
Submitted 18 March, 2025; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Violation of LGtI inequalities in the light of NO$ν$A and T2K anomaly
Authors:
Lekhashri Konwar,
Juhi Vardani,
Bhavna Yadav
Abstract:
The recent anomaly observed in NO$ν$A and T2K experiments in standard three-flavor neutrino oscillation could potentially signal physics extending beyond the standard model (SM). For the NSI parameters that can accommodate this anomaly, we explore the violation of Leggett-Garg type inequalities (LGtI) within the context of three-flavor neutrino oscillations. Our analysis focuses on LGtI violations in scenarios involving complex NSI with $ε_{eμ}$ or $ε_{eτ}$ coupling in long baseline accelerator experiments for normal and inverted mass ordering. LGtI violation is significantly enhanced in normal ordering (NO) for the $ε_{eτ}$ scenario for the T2K, NO$ν$A, and DUNE experimental set-ups. We find that for inverted ordering (IO), in the DUNE experimental set-up above $8.5$ GeV, the LGtI violation can be an indication of the $ε_{eτ}$ new physics scenario.
Submitted 30 March, 2025; v1 submitted 5 January, 2024;
originally announced January 2024.
-
Bifurcation sequence of two-dimensional Taylor-Green vortex via vortex interactions: Evolution of energy spectrum
Authors:
Tapan K. Sengupta,
Ankan Sarkar,
Bhavna Joshi,
Prasannabalaji Sundaram,
Vajjala K. Suman
Abstract:
The vorticity dynamics of the two-dimensional (2D) Taylor-Green vortex (TGV) problem is investigated in its multi-cellular configuration by solving the incompressible Navier-Stokes equation for long time intervals using a pseudo-spectral method. This helps follow the vorticity dynamics of periodic free shear layer flows by using an extremely accurate algorithm to explain vortex interactions that lead to vortex stripping (forward cascade), merger, and reconnection (inverse cascade) during various stages of evolution of periodic arrangements of a large number of TGV vortical cells. This latter aspect has been adopted so as not to be affected by the periodicity constraints of a single periodic cell and the various imposed symmetries that attenuate disturbance growth. The analytic solution of the TGV provides the initial condition, and the spatially accurate Fourier spectral method enables one to track the first instability of the initial doubly periodic vortices. Despite a plethora of studies in the literature following the primary instability to relate it with transition to turbulence and the subsequent decay of turbulence, the bifurcation sequence for the periodic TGV is rarely studied, and that is one of the main aims of the present research. Instead of restricting one's attention to a single periodic TGV cell, here it is purposely reported for multiple cells of the TGV in both directions, without invoking any asymmetries extraneously. For such an ensemble, one can study various vortical interactions giving rise to atypical energy spectra, a topic that has also seldom been addressed, to distinguish between successive instabilities that can, upon a conjecture, lead to transition and subsequent relaminarization, versus the bifurcation sequences leading from one equilibrium state to subsequent ones. The present study shows the dominance of the latter for 2D TGV at post-critical Reynolds number.
Submitted 29 November, 2023;
originally announced November 2023.
-
Perturbation Field in The Presence of Uniform Mean Flow: Doppler Effect for Flows and Acoustics
Authors:
Tapan K. Sengupta,
Aditi Sengupta,
Bhavna Joshi,
Prasannabalaji Sundaram
Abstract:
Having developed the perturbation equation for a dissipative quiescent medium for planar propagation using the linearized compressible Navier-Stokes equation without the Stokes' hypothesis \cite{arxiv2023}, here the same is extended where a uniform mean flow is present in the ambiance to explore the propagation properties for the Doppler effect.
Submitted 8 August, 2023;
originally announced August 2023.
-
LISSNAS: Locality-based Iterative Search Space Shrinkage for Neural Architecture Search
Authors:
Bhavna Gopal,
Arjun Sridhar,
Tunhou Zhang,
Yiran Chen
Abstract:
Search spaces hallmark the advancement of Neural Architecture Search (NAS). Large and complex search spaces with versatile building operators and structures provide more opportunities to brew promising architectures, yet pose severe challenges on efficient exploration and exploitation. Subsequently, several search space shrinkage methods optimize by selecting a single sub-region that contains some well-performing networks. Small performance and efficiency gains are observed with these methods but such techniques leave room for significantly improved search performance and are ineffective at retaining architectural diversity. We propose LISSNAS, an automated algorithm that shrinks a large space into a diverse, small search space with SOTA search performance. Our approach leverages locality, the relationship between structural and performance similarity, to efficiently extract many pockets of well-performing networks. We showcase our method on an array of search spaces spanning various sizes and datasets. We accentuate the effectiveness of our shrunk spaces when used in one-shot search by achieving the best Top-1 accuracy in two different search spaces. Our method achieves a SOTA Top-1 accuracy of 77.6\% in ImageNet under mobile constraints, best-in-class Kendall-Tau, architectural diversity, and search space size.
Submitted 6 July, 2023;
originally announced July 2023.
-
Equation for Aeroacoustics in a Quiescent Environment
Authors:
Tapan K. Sengupta,
Aditi Sengupta,
Bhavna Joshi
Abstract:
The perturbation equation for aeroacoustics has been derived in a dissipative medium from the linearized compressible Navier-Stokes equation without any assumption, by expressing the same in the spectral plane, as in "Continuum perturbation field in quiescent ambience: Common foundation of flows and acoustics" (Sengupta et al., Phys. Fluids 35, 056111 (2023)). The governing partial differential equation (PDE) for the free-field propagation of the disturbances in the spectral plane provides the dispersion relation between wavenumber and circular frequency in the dissipative medium, as characterized by a nondimensional diffusion number. Here, the implications of the dispersion relation of the perturbation field in the quiescent medium are probed for different orders of magnitude of the generalized kinematic viscosity, across large ranges of the wavenumber and the circular frequency. The adopted global spectral analysis helps not only to classify the PDE into parabolic and hyperbolic types, but also to explain the existence of a critical wavenumber depending on space-time scales.
Submitted 4 July, 2023;
originally announced July 2023.
-
Transverse spectral Instabilities in rotation-modified Kadomtsev-Petviashvili equation and related models
Authors:
Bhavna,
Ashish Kumar Pandey,
Anastassiya Semenova
Abstract:
The rotation-modified Kadomtsev-Petviashvili equation, which is also known as the Kadomtsev-Petviashvili-Ostrovsky equation, describes the gradual wave field diffusion in the direction transverse to the direction of propagation of the wave in a rotating frame of reference. This equation is a generalization of the Ostrovsky equation that additionally has weak transverse effects. We investigate transverse instability and stability of small periodic traveling waves of the Ostrovsky equation with respect to either periodic or square-integrable perturbations in the direction of wave propagation and periodic perturbations in the transverse direction of motion in the rotation-modified Kadomtsev-Petviashvili equation. We also study transverse stability or instability in the generalized rotation-modified KP equation by taking a general dispersion term and quadratic and cubic nonlinearities. As a consequence, we obtain transverse stability or instability in two-dimensional generalizations of the RMBO equation, Ostrovsky-Gardner equation, Ostrovsky-fKdV equation, Ostrovsky-mKdV equation, Ostrovsky-ILW equation, Ostrovsky-Whitham equation, etc.
Submitted 9 December, 2024; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Modulational Instability in the Ostrovsky Equation and Related Models
Authors:
Bhavna,
Mathew A. Johnson,
Ashish Kumar Pandey
Abstract:
We study the modulational instability of small-amplitude periodic traveling wave solutions in a dispersion-generalized Ostrovsky equation. Specifically, we investigate the invertibility of the associated linearized operator in the vicinity of the origin and derive a modulational instability index that depends on the dispersion and nonlinearity. For the classical Ostrovsky equation, we recover the well-known Lighthill condition for modulational instability of small-amplitude periodic traveling waves, and further provide a rigorous connection of the Lighthill condition to the spectral instability of the underlying wave. Our results and methodologies further apply to a wide class of Ostrovsky-type models that incorporate various dispersive effects. As such, we present new results illuminating the effects of rotation on various full-dispersion models arising in the study of weakly nonlinear surface water waves.
Submitted 23 September, 2024; v1 submitted 14 May, 2023;
originally announced May 2023.
-
Evolution of Perturbation in Quiescent Medium
Authors:
Tapan K. Sengupta,
Shivam K. Jha,
Aditi Sengupta,
Bhavna Joshi,
Prasannabalaji Sundaram
Abstract:
Here, the perturbation equation for a dissipative medium is derived from first principles from the linearized compressible Navier-Stokes equation without Stokes's hypothesis. The dispersion relations of this generic governing equation are obtained for one- and three-dimensional perturbations, which exhibit both the dispersive and dissipative nature of the perturbations traveling in a dissipative medium, depending upon the length scale. We specifically provide a theoretical cut-off wavenumber above which the perturbation equation exhibits a diffusive and dissipative nature. Such behavior has not been reported before, to the authors' knowledge.
Submitted 27 March, 2023;
originally announced March 2023.
-
Firenze: Model Evaluation Using Weak Signals
Authors:
Bhavna Soman,
Ali Torkamani,
Michael J. Morais,
Jeffrey Bickford,
Baris Coskun
Abstract:
Data labels in the security field are frequently noisy, limited, or biased towards a subset of the population. As a result, commonplace evaluation methods such as accuracy, precision and recall metrics, or analysis of performance curves computed from labeled datasets do not provide sufficient confidence in the real-world performance of a machine learning (ML) model. This has slowed the adoption of machine learning in the field. In the industry today, we rely on domain expertise and lengthy manual evaluation to build this confidence before shipping a new model for security applications. In this paper, we introduce Firenze, a novel framework for comparative evaluation of ML models' performance using domain expertise, encoded into scalable functions called markers. We show that markers computed and combined over select subsets of samples called regions of interest can provide a robust estimate of the models' real-world performance. Critically, we use statistical hypothesis testing to ensure that observed differences, and therefore the conclusions emerging from our framework, are more prominent than those observable from noise alone. Using simulations and two real-world datasets for malware and domain-name-service reputation detection, we illustrate our approach's effectiveness, limitations, and insights. Taken together, we propose Firenze as a resource for fast, interpretable, and collaborative model development and evaluation by mixed teams of researchers, domain experts, and business owners.
Submitted 2 July, 2022;
originally announced July 2022.
-
Transverse Spectral Instabilities in Konopelchenko-Dubrovsky Equation
Authors:
Bhavna,
Ashish Kumar Pandey,
Sudhir Singh
Abstract:
We study the transverse spectral stability of the one-dimensional small-amplitude periodic traveling wave solutions of the (2+1)-dimensional Konopelchenko-Dubrovsky (KD) equation. We show that these waves are transversely unstable with respect to two-dimensional perturbations that are periodic in both directions with long wavelength in the transverse direction. We also show that these waves are transversely stable with respect to perturbations which are either mean-zero periodic or square-integrable in the direction of the propagation of the wave and periodic in the transverse direction with finite or short wavelength. We discuss the implications of these results for special cases of the KD equation - namely, KP-II and mKP-II equations.
Submitted 1 April, 2022; v1 submitted 23 March, 2022;
originally announced March 2022.
-
Long-Term Missing Value Imputation for Time Series Data Using Deep Neural Networks
Authors:
Jangho Park,
Juliane Muller,
Bhavna Arora,
Boris Faybishenko,
Gilberto Pastorello,
Charuleka Varadharajan,
Reetik Sahu,
Deborah Agarwal
Abstract:
We present an approach that uses a deep learning model, in particular, a MultiLayer Perceptron (MLP), for estimating the missing values of a variable in multivariate time series data. We focus on filling a long continuous gap (e.g., multiple months of missing daily observations) rather than on individual randomly missing observations. Our proposed gap filling algorithm uses an automated method for determining the optimal MLP model architecture, thus allowing for optimal prediction performance for the given time series. We tested our approach by filling gaps of various lengths (three months to three years) in three environmental datasets with different time series characteristics, namely daily groundwater levels, daily soil moisture, and hourly Net Ecosystem Exchange. We compared the accuracy of the gap-filled values obtained with our approach to the widely-used R-based time series gap filling methods ImputeTS and mtsdi. The results indicate that using an MLP for filling a large gap leads to better results, especially when the data behave nonlinearly. Thus, our approach enables the use of datasets that have a large gap in one variable, which is common in many long-term environmental monitoring observations.
Submitted 24 February, 2022;
originally announced February 2022.
-
Semantic Annotation and Querying Framework based on Semi-structured Ayurvedic Text
Authors:
Hrishikesh Terdalkar,
Arnab Bhattacharya,
Madhulika Dubey,
Ramamurthy S,
Bhavna Naneria Singh
Abstract:
Knowledge bases (KB) are an important resource in a number of natural language processing (NLP) and information retrieval (IR) tasks, such as semantic search and automated question-answering. They are also useful for researchers trying to gain information from a text. Unfortunately, the state of the art in Sanskrit NLP does not yet allow automated construction of knowledge bases, because the required tools and methods are either unavailable or insufficiently accurate. Thus, in this work, we describe our efforts on manual annotation of Sanskrit text for the purpose of knowledge graph (KG) creation. We choose the chapter Dhanyavarga from Bhavaprakashanighantu of the Ayurvedic text Bhavaprakasha for annotation. The constructed knowledge graph contains 410 entities and 764 relationships. Since Bhavaprakashanighantu is a technical glossary text that describes various properties of different substances, we develop an elaborate ontology to capture the semantics of the entity and relationship types present in the text. To query the knowledge graph, we design 31 query templates that cover most of the common question patterns. For both manual annotation and querying, we customize the Sangrahaka framework previously developed by us. The entire system including the dataset is available from https://sanskrit.iitk.ac.in/ayurveda/ . We hope that the knowledge graph that we have created through manual annotation and subsequent curation will help in the development and testing of NLP tools in the future, as well as in the study of the Bhavaprakashanighantu text.
Submitted 31 January, 2022;
originally announced February 2022.
-
Can NSI affect non-local correlations in neutrino oscillations?
Authors:
Bhavna Yadav,
Trisha Sarkar,
Khushboo Dixit,
Ashutosh Kumar Alok
Abstract:
Non-local correlations in entangled systems are usually captured by measures such as Bell's inequality violation. It was recently shown that in neutrino systems, a measure of non-local advantage of quantum coherence (NAQC) can be considered a stronger measure of non-local correlations than Bell's inequality violation. In this work, we analyze the effects of non-standard interaction (NSI) on these measures in the context of two-flavour neutrino oscillations for the DUNE, MINOS, T2K, KamLAND, JUNO and Daya Bay experimental set-ups. We find that even in the presence of NSI, Bell's inequality violation occurs in the entire energy range, whereas the NAQC violation is observed only in some specific energy range, justifying the more elementary feature of NAQC. Further, we find that NSI can enhance the violation of NAQC and Bell's inequality parameter in the higher energy range of a given experimental set-up, these enhancements being maximal for the KamLAND experiment. However, the possible enhancement in the violation of the Bell's inequality parameter over the standard model prediction can be up to 11%, whereas for NAQC it is 7%. Thus, although NAQC is a comparatively stronger witness of nonclassicality, it shows lesser sensitivity to NSI effects in comparison to the Bell's inequality parameter.
Submitted 18 May, 2022; v1 submitted 14 January, 2022;
originally announced January 2022.
-
Fake News Detection Tools and Methods -- A Review
Authors:
Sakshini Hangloo,
Bhavna Arora
Abstract:
In the past decade, social network platforms and micro-blogging sites such as Facebook, Twitter, Instagram, and Weibo have become an integral part of our day-to-day activities and are widely used all over the world by billions of users to share their views and circulate information in the form of messages, pictures, and videos. These are even used by government agencies to spread important information through their verified Facebook accounts and official Twitter handles, as they can reach a huge population within a limited time window. However, many deceptive activities like propaganda and rumor can mislead users on a daily basis. In these COVID times, fake news and rumors are very prevalent and are shared in huge numbers, which has created chaos in this tough time. Hence, the need for fake news detection in the present scenario is inevitable. In this paper, we survey the recent literature about different approaches to detect fake news over the Internet. In particular, we first discuss fake news and the various terms related to it that have been considered in the literature. Secondly, we highlight the various publicly available datasets and online tools that can debunk fake news in real time. Thirdly, we describe fake news detection methods based on two broader areas, i.e., its content and the social context. Finally, we provide a comparison of various techniques that are used to debunk fake news.
Submitted 21 November, 2021;
originally announced December 2021.
-
Transverse spectral instability in generalized Kadomtsev-Petviashvili equation
Authors:
Bhavna,
Atul Kumar,
Ashish Kumar Pandey
Abstract:
We study transverse stability and instability of one-dimensional small-amplitude periodic traveling waves of a generalized Kadomtsev-Petviashvili equation with respect to two-dimensional perturbations, which are either periodic or square-integrable in the direction of the propagation of the underlying one-dimensional wave and periodic in the transverse direction. We obtain transverse instability results in KP-fKdV, KP-ILW, and KP-Whitham equations. Moreover, assuming the spectral stability of the one-dimensional wave with respect to one-dimensional square-integrable periodic perturbations, we obtain transverse stability results in the aforementioned equations.
Submitted 31 March, 2022; v1 submitted 1 September, 2021;
originally announced September 2021.
-
High-frequency instabilities of the Ostrovsky equation
Authors:
Bhavna,
Atul Kumar,
Ashish Kumar Pandey
Abstract:
We study spectral stability of small-amplitude periodic traveling waves of the Ostrovsky equation. We prove that these waves exhibit spectral instabilities arising from a collision of a pair of non-zero eigenvalues on the imaginary axis when subjected to square-integrable perturbations on the whole real line. We also list all such collisions between pairs of eigenvalues on the imaginary axis and perform a Krein signature analysis.
Submitted 5 July, 2021;
originally announced July 2021.
-
Uncertainty guided semi-supervised segmentation of retinal layers in OCT images
Authors:
Suman Sedai,
Bhavna Antony,
Ravneet Rai,
Katie Jones,
Hiroshi Ishikawa,
Joel Schuman,
Wollstein Gadi,
Rahil Garnavi
Abstract:
Deep convolutional neural networks have shown outstanding performance in medical image segmentation tasks. The usual problem when training supervised deep learning methods is the lack of labeled data, which is time-consuming and costly to obtain. In this paper, we propose a novel uncertainty-guided semi-supervised learning method based on a student-teacher approach for training the segmentation network using limited labeled samples and a large number of unlabeled images. First, a teacher segmentation model is trained from the labeled samples using Bayesian deep learning. The trained model is used to generate soft segmentation labels and uncertainty maps for the unlabeled set. The student model is then updated using the softly segmented samples and the corresponding pixel-wise confidence of the segmentation quality, estimated from the uncertainty of the teacher model, using a newly designed loss function. Experimental results on a retinal layer segmentation task show that the proposed method improves the segmentation performance in comparison to the fully supervised approach and is on par with the expert annotator. The proposed semi-supervised segmentation framework is a key contribution and is applicable to biomedical image segmentation across various imaging modalities where access to annotated medical images is challenging.
Submitted 2 March, 2021;
originally announced March 2021.
-
A System for Automated Open-Source Threat Intelligence Gathering and Management
Authors:
Peng Gao,
Xiaoyuan Liu,
Edward Choi,
Bhavna Soman,
Chinmaya Mishra,
Kate Farris,
Dawn Song
Abstract:
To remain aware of the fast-evolving cyber threat landscape, open-source Cyber Threat Intelligence (OSCTI) has received growing attention from the community. Commonly, knowledge about threats is presented in a vast number of OSCTI reports. Despite the pressing need for high-quality OSCTI, existing OSCTI gathering and management platforms have primarily focused on isolated, low-level Indicators of Compromise. On the other hand, higher-level concepts (e.g., adversary tactics, techniques, and procedures) and their relationships have been overlooked, even though they contain essential knowledge about threat behaviors that is critical to uncovering the complete threat scenario. To bridge the gap, we propose SecurityKG, a system for automated OSCTI gathering and management. SecurityKG collects OSCTI reports from various sources, uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors, and constructs a security knowledge graph. SecurityKG also provides a UI that supports various types of interactivity to facilitate knowledge graph exploration.
Submitted 26 February, 2021; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Filtering cohomology of ordinary and Lagrangian Grassmannians
Authors:
The 2020 Polymath Jr. REU "q-binomials, the Grassmannian group":
Huda Ahmed,
Rasiel Chishti,
Yu-Cheng Chiu,
Galen Dorpalen-Barry,
Jeremy Ellis,
David Fang,
Michael Feigen,
Jonathan Feigert,
Mabel González,
Dylan Harker,
Jiaye Wei,
Bhavna Joshi,
Gandhar Kulkarni,
Kapil Lad,
Zhen Liu,
Ma Mingyang,
Lance Myers,
Arjun Nigam,
Tudor Popescu,
Victor Reiner,
Zijian Rong,
Eunice Sukarto
, et al. (9 additional authors not shown)
Abstract:
This paper studies, for a positive integer $m$, the subalgebra of the cohomology ring of the complex Grassmannians generated by the elements of degree at most $m$. We build in two ways upon a conjecture for the Hilbert series of this subalgebra due to Reiner and Tudose. The first reinterprets it in terms of the operation of $k$-conjugation, suggesting two conjectural bases for the subalgebras that would imply their conjecture. The second introduces an analogous conjecture for the cohomology of Lagrangian Grassmannians.
Submitted 12 September, 2021; v1 submitted 5 November, 2020;
originally announced November 2020.
-
Dueling Deep Q-Network for Unsupervised Inter-frame Eye Movement Correction in Optical Coherence Tomography Volumes
Authors:
Yasmeen M. George,
Suman Sedai,
Bhavna J. Antony,
Hiroshi Ishikawa,
Gadi Wollstein,
Joel S. Schuman,
Rahil Garnavi
Abstract:
In optical coherence tomography (OCT) volumes of the retina, the sequential acquisition of the individual slices makes this modality prone to motion artifacts, misalignments between adjacent slices being the most noticeable. Any distortion in OCT volumes can bias structural analysis and influence the outcome of longitudinal studies. On the other hand, the presence of speckle noise, which is characteristic of this imaging modality, leads to inaccuracies when traditional registration techniques are employed. Also, the lack of a well-defined ground truth makes supervised deep-learning techniques ill-suited to tackle the problem. In this paper, we tackle these issues by using deep reinforcement learning to correct inter-frame movements in an unsupervised manner. Specifically, we use a dueling deep Q-network to train an artificial agent to find the optimal policy, i.e. a sequence of actions, that best improves the alignment by maximizing the sum of reward signals. Instead of relying on the ground truth of transformation parameters to guide the rewarding system, for the first time, we use a combination of intensity-based image similarity metrics. Further, to avoid agent bias towards speckle noise, we ensure the agent can see retinal layers as part of the interacting environment. For quantitative evaluation, we simulate the eye movement artifacts by applying 2D rigid transformations on individual B-scans. The proposed model achieves an average of 0.985 and 0.914 for normalized mutual information and correlation coefficient, respectively. We also compare our model with the elastix intensity-based medical image registration approach, where significant improvement is achieved by our model for both noisy and denoised volumes.
Submitted 3 July, 2020;
originally announced July 2020.
-
Surrogate Optimization of Deep Neural Networks for Groundwater Predictions
Authors:
Juliane Mueller,
Jangho Park,
Reetik Sahu,
Charuleka Varadharajan,
Bhavna Arora,
Boris Faybishenko,
Deborah Agarwal
Abstract:
Sustainable management of groundwater resources under changing climatic conditions requires the application of reliable and accurate predictions of groundwater levels. Mechanistic multi-scale, multi-physics simulation models are often too hard to use for this purpose, especially for groundwater managers who do not have access to the complex compute resources and data. Therefore, we analyzed the applicability and performance of four modern deep learning computational models for predictions of groundwater levels. We compare three methods for optimizing the models' hyperparameters, including two surrogate model-based algorithms and a random sampling method. The models were tested using predictions of the groundwater level in Butte County, California, USA, taking into account the temporal variability of streamflow, precipitation, and ambient temperature. Our numerical study shows that the optimization of the hyperparameters can lead to reasonably accurate performance of all models (root mean squared errors of groundwater predictions of 2 meters or less), but the "simplest" network, namely a multilayer perceptron (MLP), performs overall better for learning and predicting groundwater data than the more advanced long short-term memory or convolutional neural networks in terms of prediction accuracy and time-to-solution, making the MLP a suitable candidate for groundwater prediction.
Submitted 3 February, 2020; v1 submitted 28 August, 2019;
originally announced August 2019.
-
Inference of visual field test performance from OCT volumes using deep learning
Authors:
Stefan Maetschke,
Bhavna Antony,
Hiroshi Ishikawa,
Gadi Wollstein,
Joel Schuman,
Rahil Garnavi
Abstract:
Visual field tests (VFT) are pivotal for glaucoma diagnosis and conducted regularly to monitor disease progression. Here we address the question of to what degree aggregate VFT measurements such as Visual Field Index (VFI) and Mean Deviation (MD) can be inferred from Optical Coherence Tomography (OCT) scans of the Optic Nerve Head (ONH) or the macula. Accurate inference of VFT measurements from OCT could reduce examination time and cost. We propose a novel 3D Convolutional Neural Network (CNN) for this task and compare its accuracy with classical machine learning (ML) algorithms trained on common, segmentation-based OCT features employed for glaucoma diagnostics. Peak accuracies were achieved on ONH scans when inferring VFI, with a Pearson Correlation (PC) of 0.88$\pm$0.035 for the CNN and a significantly lower (p $<$ 0.01) PC of 0.74$\pm$0.090 for the best-performing classical ML algorithm, a Random Forest regressor. Estimation of MD was equally accurate, with a PC of 0.88$\pm$0.023 on ONH scans for the CNN.
Submitted 10 October, 2019; v1 submitted 4 August, 2019;
originally announced August 2019.
-
Joint Segmentation and Uncertainty Visualization of Retinal Layers in Optical Coherence Tomography Images using Bayesian Deep Learning
Authors:
Suman Sedai,
Bhavna Antony,
Dwarikanath Mahapatra,
Rahil Garnavi
Abstract:
Optical coherence tomography (OCT) is commonly used to analyze retinal layers for assessment of ocular diseases. In this paper, we propose a method for retinal layer segmentation and quantification of uncertainty based on Bayesian deep learning. Our method not only performs end-to-end segmentation of retinal layers, but also gives a pixel-wise uncertainty measure of the segmentation output. The generated uncertainty map can be used to identify erroneously segmented image regions, which is useful in downstream analysis. We have validated our method on a dataset of 1487 images obtained from 15 subjects (OCT volumes) and compared it against state-of-the-art segmentation algorithms that do not take uncertainty into account. The proposed uncertainty-based segmentation method results in comparable or improved performance and, most importantly, is more robust against noise.
Submitted 12 September, 2018;
originally announced September 2018.
-
A feature agnostic approach for glaucoma detection in OCT volumes
Authors:
Stefan Maetschke,
Bhavna Antony,
Hiroshi Ishikawa,
Gadi Wollstein,
Joel S. Schuman,
Rahil Garnavi
Abstract:
Optical coherence tomography (OCT) based measurements of retinal layer thickness, such as the retinal nerve fibre layer (RNFL) and the ganglion cell with inner plexiform layer (GCIPL), are commonly used for the diagnosis and monitoring of glaucoma. Previously, machine learning techniques have utilized segmentation-based imaging features such as the peripapillary RNFL thickness and the cup-to-disc ratio. Here, we propose a deep learning technique that classifies eyes as healthy or glaucomatous directly from raw, unsegmented OCT volumes of the optic nerve head (ONH) using a 3D Convolutional Neural Network (CNN). We compared the accuracy of this technique with various feature-based machine learning algorithms and demonstrated the superiority of the proposed deep learning based method.
Logistic regression was found to be the best performing classical machine learning technique with an AUC of 0.89. In direct comparison, the deep learning approach achieved a substantially higher AUC of 0.94 with the additional advantage of providing insight into which regions of an OCT volume are important for glaucoma detection.
Computing Class Activation Maps (CAM), we found that the CNN identified neuroretinal rim and optic disc cupping as well as the lamina cribrosa (LC) and its surrounding areas as the regions significantly associated with the glaucoma classification. These regions anatomically correspond to the well established and commonly used clinical markers for glaucoma diagnosis such as increased cup volume, cup diameter, and neuroretinal rim thinning at the superior and inferior segments.
Submitted 23 October, 2019; v1 submitted 12 July, 2018;
originally announced July 2018.
-
The formation of IRIS diagnostics VIII. IRIS observations in the C II 133.5 nm multiplet
Authors:
Bhavna Rathore,
Tiago M. D. Pereira,
Mats Carlsson,
Bart De Pontieu
Abstract:
The C II 133.5 nm multiplet has been observed by NASA's Interface Region Imaging Spectrograph (IRIS) at unprecedented spatial resolution. The aims of this work are to characterize these new observations of the C II lines, place them in context with previous work, and identify any additional value the C II lines bring when compared with other spectral lines. We make use of wide, long exposure IRIS rasters covering the quiet Sun and an active region. Line properties such as velocity shift and width are extracted from individual spectra and analyzed. The lines have a variety of shapes (mostly single-peak or double-peak), and are strongest in active regions and weaker in the quiet Sun. The ratio between the 133.4 nm and 133.5 nm components is always less than 1.8, indicating that their radiation is optically thick in all locations. Maps of the C II line widths are a powerful new diagnostic of chromospheric structures, and their line shifts are a robust velocity diagnostic. Compared with earlier quiet Sun observations, we find similar absolute intensities and mean line widths, but smaller red shifts; this difference can perhaps be attributed to differences in spectral resolution and spatial coverage. The C II intensity maps are somewhat similar to those of transition region lines, but also share some features with chromospheric maps such as those from the Mg II k line, indicating that they are formed between the upper chromosphere and transition region. C II intensity, width, and velocity maps can therefore be used to gather additional information about the upper chromosphere.
Submitted 29 October, 2015; v1 submitted 16 October, 2015;
originally announced October 2015.
-
The formation of IRIS diagnostics VI. The Diagnostic Potential of the C II Lines at 133.5 nm in the Solar Atmosphere
Authors:
Bhavna Rathore,
Mats Carlsson,
Jorrit Leenaarts,
Bart De Pontieu
Abstract:
We use 3D radiation magnetohydrodynamic models to investigate how the thermodynamic quantities in the simulation are encoded in observable quantities, thus exploring the diagnostic potential of the 133.5 nm lines. We find that the line core intensity is correlated with the temperature at the formation height, but the correlation is rather weak, especially when the lines are strong. The line core Doppler shift is a good measure of the line-of-sight velocity at the formation height. The line width depends on both the width of the absorption profile (thermal and non-thermal width) and an opacity broadening factor of 1.2-4 due to the optically thick line formation, with a larger broadening for double peak profiles. The 133.5 nm lines can be formed both higher and lower than the core of the Mg II k line depending on the amount of plasma in the 14-50 kK temperature range. More plasma in this temperature range gives a higher 133.5 nm formation height relative to the Mg II k line core. The synthetic line profiles have been compared with IRIS observations. The derived parameters from the simulated line profiles cover the parameter range seen in observations, but on average the synthetic profiles are too narrow. We interpret this discrepancy as a combination of a lack of plasma at chromospheric temperatures in the simulation box and too small non-thermal velocities. The large differences in the distribution of properties between the synthetic profiles and the observed ones show that the 133.5 nm lines are powerful diagnostics of the upper chromosphere and lower transition region.
Submitted 19 August, 2015; v1 submitted 18 August, 2015;
originally announced August 2015.
-
The formation of IRIS diagnostics V. A quintessential model atom of C II and general formation properties of the C II lines at 133.5 nm
Authors:
Bhavna Rathore,
Mats Carlsson
Abstract:
The 133.5 nm lines are important observables for the NASA/SMEX mission Interface Region Imaging Spectrograph (IRIS). To make 3D non-LTE radiative transfer computationally feasible, it is crucial to have a model atom with as few levels as possible while retaining the main physical processes. We here develop such a model atom and study the general formation properties of the C II lines. We find that a nine-level model atom of C I-C III with the transitions treated assuming complete frequency redistribution (CRD) suffices to describe the 133.5 nm lines. 3D scattering effects are important for the intensity in the core of the line. The lines are formed in the optically thick regime. The core intensity is formed in layers where the temperature is about 10 kK at the base of the transition region. The lines are 1.2-4 times wider than the atomic absorption profile due to the formation in the optically thick regime. The smaller opacity broadening happens for single peak intensity profiles where the chromospheric temperature is low with a steep source function increase into the transition region; the larger broadening happens when there is a temperature increase from the photosphere to the low chromosphere leading to a local source function maximum and a double peak intensity profile with a central reversal. Assuming optically thin formation with the standard coronal approximation leads to several errors: neglecting photoionization severely underestimates the amount of C II at temperatures below 16 kK, erroneously shifts the formation from 10 kK to 25 kK, and leads to too low intensities.
Submitted 19 August, 2015; v1 submitted 18 August, 2015;
originally announced August 2015.