-
TRAIL: Trace Reasoning and Agentic Issue Localization
Authors:
Darshan Deshpande,
Varun Gangal,
Hersh Mehta,
Jitin Krishnan,
Anand Kannappan,
Rebecca Qian
Abstract:
The increasing adoption of agentic workflows across diverse domains brings a critical need to scalably and systematically evaluate the complex traces these systems generate. Current evaluation methods depend on manual, domain-specific human analysis of lengthy workflow traces - an approach that does not scale with the growing complexity and volume of agentic outputs. Error analysis in these settings is further complicated by the interplay of external tool outputs and language model reasoning, making it more challenging than traditional software debugging. In this work, we (1) articulate the need for robust and dynamic evaluation methods for agentic workflow traces, (2) introduce a formal taxonomy of error types encountered in agentic systems, and (3) present a set of 148 large human-annotated traces (TRAIL) constructed using this taxonomy and grounded in established agentic benchmarks. To ensure ecological validity, we curate traces from both single and multi-agent systems, focusing on real-world applications such as software engineering and open-world information retrieval. Our evaluations reveal that modern long context LLMs perform poorly at trace debugging, with the best Gemini-2.5-pro model scoring a mere 11% on TRAIL. Our dataset and code are made publicly available to support and accelerate future research in scalable evaluation for agentic workflows.
Submitted 19 May, 2025; v1 submitted 13 May, 2025;
originally announced May 2025.
-
Simultaneous cooling of qubits via a quantum absorption refrigerator and beyond
Authors:
Jithin G. Krishnan,
Chandrima B. Pushpan,
Amit Kumar Pal
Abstract:
We design a quantum thermal device that can simultaneously and dynamically cool multiple target qubits. Using a setup with three bosonic heat baths, we propose engineering the interaction Hamiltonian using operators on different subspaces of the full Hilbert space of the system labelled by different magnetizations. We demonstrate, using the local as well as global quantum master equations, that a set of target qubits can be cooled simultaneously using these interaction Hamiltonians, while equal cooling of all target qubits is possible only when the local quantum master equation is used. However, the amount of cooling obtained from different magnetization subspaces, as quantified by a distance-based measure of qubit-local steady-state temperatures, may vary. We also investigate cooling of a set of target qubits when the interaction Hamiltonian has different magnetization components, and when the design of the quantum thermal device involves two heat baths instead of three. Further, we demonstrate, using the local quantum master equation, that while cooling the target qubits, the designed device operates only as a quantum absorption refrigerator. In contrast, use of the global quantum master equation indicates cooling of the target qubits even when the device works outside the operation regime of a quantum absorption refrigerator. We also extend the design to a star network of qubits interacting with each other via the Heisenberg interaction, kept in contact with either three or two heat baths, and discuss cooling of a set of target qubits using this device.
Submitted 21 October, 2024;
originally announced October 2024.
-
A Little Aggression Goes a Long Way
Authors:
Jyothi Krishnan,
Neeldhara Misra,
Saraswati Girish Nanoti
Abstract:
Aggression is a two-player game of troop placement and attack played on a map (modeled as a graph). Players take turns deploying troops on a territory (a vertex on the graph) until they run out. Once all troops are placed, players take turns attacking enemy territories. A territory can be attacked if it has $k$ troops and there are more than $k$ enemy troops on adjacent territories. At the end of the game, the player who controls the most territories wins. In the case of a tie, the player with more surviving troops wins. The first player to exhaust their troops in the placement phase leads the attack phase.
We study the complexity of the game when the graph is given along with an assignment of troops and the sequence of attacks planned by the second player. Even in this restrained setting, we show that the problem of determining an optimal sequence of first player moves is NP-complete. We then analyze the game when the input graph is a matching or a cycle.
Submitted 9 June, 2024;
originally announced June 2024.
-
An analytic approach for understanding mechanisms driving breakthrough infections
Authors:
Amanda Brucker,
Jillian H Hurst,
Emily C O'Brien,
Deverick Anderson,
Michael E Yarrington,
Jay Krishnan,
Benjamin A Goldstein
Abstract:
Real world data is an increasingly utilized resource for post-market monitoring of vaccines and provides insight into real world effectiveness. However, outside of the setting of a clinical trial, heterogeneous mechanisms may drive observed breakthrough infection rates among vaccinated individuals; for instance, waning vaccine-induced immunity as time passes and the emergence of a new strain against which the vaccine has reduced protection. Analyses of infection incidence rates are typically predicated on a presumed mechanism in their choice of an "analytic time zero" after which infection rates are modeled. In this work, we propose an explicit test for driving mechanism situated in a standard Cox proportional hazards framework. We explore the test's performance in simulation studies and in an illustrative application to real world data. We additionally introduce subgroup differences in infection incidence and evaluate the impact of time zero misspecification on bias and coverage of model estimates. In this study we observe strong power and controlled type I error of the test to detect the correct infection-driving mechanism under various settings. Similar to previous studies, we find mitigated bias and greater coverage of estimates when the analytic time zero is correctly specified or accounted for.
Submitted 30 January, 2024;
originally announced January 2024.
-
Entanglement in XYZ model on a spin-star system: Anisotropy vs. field-induced dynamics
Authors:
Jithin G. Krishnan,
Harikrishnan K. J.,
Amit Kumar Pal
Abstract:
We consider a star-network of $n=n_0+n_p$ spin-$\frac{1}{2}$ particles, where the interactions between the $n_0$ central spins and the $n_p$ peripheral spins are of the XYZ type. In the limit $n_0/n_p\ll 1$, we show that for odd $n$, the ground state is doubly degenerate, while for even $n$, the energy gap becomes negligible when $n$ is large, inducing an \emph{effective} double degeneracy. In the same limit, we show that for vanishing $xy$-anisotropy $γ$, bipartite entanglement on the peripheral spins computed using either a partial trace-based or a measurement-based approach exhibits a logarithmic growth with $n_p$, where the sizes of the partitions are typically $\sim n_p/2$. This feature disappears for $γ\neq 0$, which we refer to as the \emph{anisotropy effect}. Interestingly, when the system is taken out of equilibrium by the introduction of a magnetic field of constant strength on all spins, the time-averaged bipartite entanglement on the periphery in the long-time limit exhibits a logarithmic growth with $n_p$ irrespective of the value of $γ$. We further study the $n_0/n_p\gg 1$ and $n_0/n_p\rightarrow 1$ limits of the model, and show that the behaviour of bipartite peripheral entanglement is qualitatively different from that of the $n_0/n_p\ll 1$ limit.
Submitted 29 July, 2023;
originally announced July 2023.
-
Spike-by-Spike Frequency Analysis of Amperometry Traces Provides Statistical Validation of Observations in the Time Domain
Authors:
Jeyashree Krishnan,
Zeyu Lian,
Pieter E. Oomen,
Xiulan He,
Soodabeh Majdi,
Andreas Schuppert,
Andrew Ewing
Abstract:
Amperometry is a commonly used electrochemical method for studying the process of exocytosis in real-time. Given the high precision of recording that amperometry procedures offer, the volume of data generated can span over several hundreds of megabytes to a few gigabytes and therefore necessitates systematic and reproducible methods for analysis. Though the spike characteristics of amperometry traces in the time domain hold information about the dynamics of exocytosis, these biochemical signals are, more often than not, characterized by time-varying signal properties. Such signals with time-variant properties may occur at different frequencies and therefore analyzing them in the frequency domain may provide statistical validation for observations already established in the time domain. This necessitates the use of time-variant, frequency-selective signal processing methods as well, which can adeptly quantify the dominant or mean frequencies in the signal. The Fast Fourier Transform (FFT) is a well-established computational tool that is commonly used to find the frequency components of a signal buried in noise. In this work, we outline a method for spike-based frequency analysis of amperometry traces using FFT that also provides statistical validation of observations on spike characteristics in the time domain. We demonstrate the method by utilizing simulated signals and by subsequently testing it on diverse amperometry datasets generated from different experiments with various chemical stimulations. To our knowledge, this is the first fully automated open-source tool available dedicated to the analysis of spikes extracted from amperometry signals in the frequency domain.
Submitted 6 February, 2023;
originally announced February 2023.
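The abstract above describes applying the FFT to individual spikes to find their dominant frequencies. The following is only a minimal sketch of that general idea, not the authors' open-source tool: the function names `fft` and `dominant_frequency`, the synthetic 5 Hz test signal, and the 64 Hz sampling rate are all assumptions made for illustration, and the signal length is assumed to be a power of two.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of even-indexed samples
    odd = fft(samples[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sampling_rate_hz):
    """Return the frequency (Hz) of the largest non-DC bin below Nyquist."""
    spectrum = fft(samples)
    n = len(samples)
    peak_bin = max(range(1, n // 2), key=lambda k: abs(spectrum[k]))
    return peak_bin * sampling_rate_hz / n


# Hypothetical stand-in for one extracted spike: a 5 Hz sinusoid sampled at 64 Hz.
spike = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
print(dominant_frequency(spike, 64.0))
```

In practice a library FFT would normally replace the hand-written transform; the pure-Python version is used here only to keep the sketch self-contained.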
-
Tree-Based Learning on Amperometric Time Series Data Demonstrates High Accuracy for Classification
Authors:
Jeyashree Krishnan,
Zeyu Lian,
Pieter E. Oomen,
Xiulan He,
Soodabeh Majdi,
Andreas Schuppert,
Andrew Ewing
Abstract:
Elucidating exocytosis processes provides insights into cellular neurotransmission mechanisms, and may have potential in neurodegenerative disease research. Amperometry is an established electrochemical method for the detection of neurotransmitters released from and stored inside cells. An important aspect of the amperometry method is the sub-millisecond temporal resolution of the current recordings, which leads to several hundreds of gigabytes of high-quality data. In this study, we present a universal method for the classification of diverse amperometric datasets using data-driven approaches in computational science. We demonstrate a very high prediction accuracy (greater than or equal to 95%). This includes an end-to-end systematic machine learning workflow for amperometric time series datasets consisting of pre-processing; feature extraction; model identification; training and testing; followed by feature importance evaluation - all implemented. We tested the method on heterogeneous amperometric time series datasets generated using different experimental approaches, chemical stimulations, electrode types, and varying recording times. We identified a certain overarching set of common features across these datasets which enables accurate predictions. Further, we showed that information relevant for the classification of amperometric traces is neither in the spiky segments alone, nor can it be retrieved from just the temporal structure of spikes. In fact, the transients between spikes and the trace baselines carry essential information for a successful classification, thereby strongly demonstrating that an effective feature representation of amperometric time series requires the full time series. To our knowledge, this is one of the first studies that propose a scheme for machine learning, and in particular, supervised learning on full amperometry time series data.
Submitted 6 February, 2023;
originally announced February 2023.
-
Representation Deficiency in Masked Language Modeling
Authors:
Yu Meng,
Jitin Krishnan,
Sinong Wang,
Qifan Wang,
Yuning Mao,
Han Fang,
Marjan Ghazvininejad,
Jiawei Han,
Luke Zettlemoyer
Abstract:
Masked Language Modeling (MLM) has been one of the most prominent approaches for pretraining bidirectional text encoders due to its simplicity and effectiveness. One notable concern about MLM is that the special $\texttt{[MASK]}$ symbol causes a discrepancy between pretraining data and downstream data as it is present only in pretraining but not in fine-tuning. In this work, we offer a new perspective on the consequence of such a discrepancy: We demonstrate empirically and theoretically that MLM pretraining allocates some model dimensions exclusively for representing $\texttt{[MASK]}$ tokens, resulting in a representation deficiency for real tokens and limiting the pretrained model's expressiveness when it is adapted to downstream data without $\texttt{[MASK]}$ tokens. Motivated by the identified issue, we propose MAE-LM, which pretrains the Masked Autoencoder architecture with MLM where $\texttt{[MASK]}$ tokens are excluded from the encoder. Empirically, we show that MAE-LM improves the utilization of model dimensions for real token representations, and MAE-LM consistently outperforms MLM-pretrained models across different pretraining settings and model sizes when fine-tuned on the GLUE and SQuAD benchmarks.
Submitted 16 March, 2024; v1 submitted 3 February, 2023;
originally announced February 2023.
-
Using natural language processing and structured medical data to phenotype patients hospitalized due to COVID-19
Authors:
Feier Chang,
Jay Krishnan,
Jillian H Hurst,
Michael E Yarrington,
Deverick J Anderson,
Emily C O'Brien,
Benjamin A Goldstein
Abstract:
To identify patients who are hospitalized because of COVID-19 as opposed to those who were admitted for other indications, we compared the performance of different computable phenotype definitions for COVID-19 hospitalizations that use different types of data from the electronic health records (EHR), including structured EHR data elements, provider notes, or a combination of both data types. We conducted a retrospective data analysis utilizing chart review-based validation. Participants were 586 hospitalized individuals who tested positive for SARS-CoV-2 during January 2022. We used natural language processing to incorporate data from provider notes and LASSO regression and Random Forests to fit classification algorithms that incorporated structured EHR data elements, provider notes, or a combination of structured data and provider notes. Results: Based on a chart review, 38% of 586 patients were determined to be hospitalized for reasons other than COVID-19 despite having tested positive for SARS-CoV-2. A classification algorithm that used provider notes had significantly better discrimination than one that used structured EHR data elements (AUROC: 0.894 vs 0.841, p < 0.001), and performed similarly to a model that combined provider notes with structured data elements (AUROC: 0.894 vs 0.893). Assessments of hospital outcome metrics significantly differed based on whether the population included all hospitalized patients who tested positive for SARS-CoV-2 versus those who were determined to have been hospitalized due to COVID-19. This work demonstrates the utility of natural language processing approaches to derive information related to patient hospitalizations in cases where there may be multiple conditions that could serve as the primary indication for hospitalization.
Submitted 2 February, 2023;
originally announced February 2023.
-
Controlling gain with loss: Bounds on localizable entanglement in multi-qubit systems
Authors:
Jithin G. Krishnan,
Harikrishnan K. J.,
Amit Kumar Pal
Abstract:
We investigate the relation between the amount of entanglement localized on a chosen subsystem of a multi-qubit system via local measurements on the rest of the system, and the bipartite entanglement that is lost during this measurement process. We study a number of paradigmatic pure states, including the generalized GHZ, the generalized W, Dicke, and the generalized Dicke states. For the generalized GHZ and W states, we analytically derive bounds on localizable entanglement in terms of the entanglement present in the system prior to the measurement. Also, for the Dicke and the generalized Dicke states, we demonstrate that with increasing system size, localizable entanglement tends to be equal to the bipartite entanglement present in the system over a specific partition before measurement. We extend the investigation numerically in the case of arbitrary multi-qubit pure states. We also analytically determine the modification of these results, including the proposed bounds, in situations where these pure states are subjected to single-qubit phase-flip noise on all qubits. Additionally, we study one-dimensional paradigmatic quantum spin models, namely the transverse-field XY model and the XXZ model in an external field, and numerically demonstrate a quadratic dependence of the localized entanglement on the lost entanglement. We show that this relation is robust even in the presence of disorder in the strength of the external field.
Submitted 15 June, 2022;
originally announced June 2022.
-
Random Feature Approximation for Online Nonlinear Graph Topology Identification
Authors:
Rohan Money,
Joshin Krishnan,
Baltasar Beferull-Lozano
Abstract:
Online topology estimation of graph-connected time series is challenging, especially since the causal dependencies in many real-world networks are nonlinear. In this paper, we propose a kernel-based algorithm for graph topology estimation. The algorithm uses a Fourier-based random feature approximation to tackle the curse of dimensionality associated with the kernel representations. Exploiting the fact that real-world networks often exhibit sparse topologies, we propose a group lasso based optimization framework, which is solved using an iterative composite objective mirror descent method, yielding an online algorithm with fixed computational complexity per iteration. The experiments conducted on real and synthetic data show that the proposed method outperforms its competitors.
Submitted 19 October, 2021;
originally announced October 2021.
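The Fourier-based random feature approximation mentioned in the abstract above can be illustrated with the standard random Fourier feature construction for a Gaussian kernel. This is a generic sketch and not the authors' algorithm: the abstract does not state which kernel is used, so the Gaussian kernel, the `bandwidth` parameter, the helper names, and the example vectors are all assumptions.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def make_random_features(dim, num_features, bandwidth, rng):
    """Build a map z(x) such that z(x) . z(y) approximates the Gaussian kernel."""
    # Frequencies are drawn from the kernel's spectral density N(0, 1 / bandwidth^2).
    frequencies = [
        [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
        for _ in range(num_features)
    ]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(frequencies, phases)
        ]

    return feature_map


rng = random.Random(0)
feature_map = make_random_features(dim=3, num_features=4000, bandwidth=1.0, rng=rng)
x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
approximate = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
exact = gaussian_kernel(x, y, bandwidth=1.0)
print(approximate, exact)
```

The point of the construction is that the feature dimension (`num_features`) is fixed in advance, so the cost of each update does not grow with the number of samples seen, which is what makes it attractive for an online setting.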
-
Cross-Lingual Text Classification of Transliterated Hindi and Malayalam
Authors:
Jitin Krishnan,
Antonios Anastasopoulos,
Hemant Purohit,
Huzefa Rangwala
Abstract:
Transliteration is very common on social media, but transliterated text is not adequately handled by modern neural models for various NLP tasks. In this work, we combine data augmentation approaches with a Teacher-Student training scheme to address this issue in a cross-lingual transfer setting for fine-tuning state-of-the-art pre-trained multilingual language models such as mBERT and XLM-R. We evaluate our method on transliterated Hindi and Malayalam, also introducing new datasets for benchmarking on real-world scenarios: one on sentiment classification in transliterated Malayalam, and another on crisis tweet classification in transliterated Hindi and Malayalam (related to the 2013 North India and 2018 Kerala floods). Our method yielded an average improvement of +5.6% on mBERT and +4.7% on XLM-R in F1 scores over their strong baselines.
Submitted 31 August, 2021;
originally announced August 2021.
-
Online Non-linear Topology Identification from Graph-connected Time Series
Authors:
Rohan Money,
Joshin Krishnan,
Baltasar Beferull-Lozano
Abstract:
Estimating the unknown causal dependencies among graph-connected time series plays an important role in many applications, such as sensor network analysis, signal processing over cyber-physical systems, and finance engineering. Inference of such causal dependencies, often known as topology identification, is not well studied for non-linear non-stationary systems, and most of the existing methods are batch-based and are not capable of handling streaming sensor signals. In this paper, we propose an online kernel-based algorithm for topology estimation of non-linear vector autoregressive time series by solving a sparse online optimization framework using the composite objective mirror descent method. Experiments conducted on real and synthetic data sets show that the proposed algorithm outperforms the state-of-the-art methods for topology estimation.
Submitted 31 March, 2021;
originally announced April 2021.
-
Multilingual Code-Switching for Zero-Shot Cross-Lingual Intent Prediction and Slot Filling
Authors:
Jitin Krishnan,
Antonios Anastasopoulos,
Hemant Purohit,
Huzefa Rangwala
Abstract:
Predicting user intent and detecting the corresponding slots from text are two key problems in Natural Language Understanding (NLU). In the context of zero-shot learning, this task is typically approached by either using representations from pre-trained multilingual transformers such as mBERT, or by machine translating the source data into the known target language and then fine-tuning. Our work focuses on a particular scenario where the target language is unknown during training. To this end, we propose a novel method to augment the monolingual source data using multilingual code-switching via random translations to enhance a transformer's language neutrality when fine-tuning it for a downstream task. This method also helps discover novel insights on how code-switching with different language families around the world impacts the performance on the target language. Experiments on the benchmark dataset of MultiATIS++ yielded an average improvement of +4.2% in accuracy for the intent task and +1.8% in F1 for the slot task using our method over the state-of-the-art across 8 different languages. Furthermore, we present an application of our method for crisis informatics using a new human-annotated tweet dataset of slot filling in English and Haitian Creole, collected during the Haiti earthquake disaster.
Submitted 16 March, 2021; v1 submitted 13 March, 2021;
originally announced March 2021.
-
A Long-Range Ising Model of a Barabási-Albert Network
Authors:
Jeyashree Krishnan,
Reza Torabi,
Edoardo Di Napoli,
Carsten Honerkamp,
Andreas Schuppert
Abstract:
Networks that have power-law connectivity, commonly referred to as the scale-free networks, are an important class of complex networks. A heterogeneous mean-field approximation has been previously proposed for the Ising model of the Barabási-Albert model of scale-free networks with classical spins on the nodes wherein it was shown that the critical temperature for such a system scales logarithmically with network size. For finite sizes, there is no criticality for such a system and hence no true phase transition in terms of singular behavior. Further, in the thermodynamic limit, the mean-field prediction of an infinite critical temperature for the system may exclude any true phase transition even then. Nevertheless, with an eye on potential applications of the model on biological systems that are generally finite, one may still try to find approximations that describe the relevant observables quantitatively. Here we present an alternative, approximate formulation for the description of the Ising model of a Barabási-Albert Network. Using the classical definition of magnetization, we show that Ising models on a network can be well-approximated by a long-range interacting homogeneous Ising model wherein each node of the network couples to all other spins with a strength determined by the mean degree of the Barabási-Albert Network. In such an effective long-range Ising model of a Barabási-Albert Network, the critical temperature is directly proportional to the number of preferentially attached links added to grow the network. The proposed model describes the magnetization of the majority of the sites with average or smaller than average degree better compared to the heterogeneous mean-field approximation. The long-range Ising model is the only homogeneous description of Barabási-Albert networks that we know of.
Submitted 7 May, 2020;
originally announced May 2020.
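The abstract above states that, in the effective long-range model, the critical temperature is directly proportional to the number of links attached per new node. A rough way to see this kind of scaling is the textbook mean-field (Curie-Weiss) self-consistency equation $m = \tanh(J_{\text{eff}}\, m / T)$. The sketch below is not the paper's formulation: it assumes units with $k_B = 1$, assumes the effective coupling is $J$ times the large-network mean degree of a Barabási-Albert graph (twice the number of links per new node), and uses hypothetical function names.

```python
import math


def critical_temperature(links_per_new_node, coupling=1.0):
    """Curie-Weiss critical temperature under the assumed effective coupling."""
    mean_degree = 2 * links_per_new_node  # large-network mean degree of a BA graph
    return coupling * mean_degree


def magnetization(temperature, links_per_new_node, coupling=1.0, iterations=2000):
    """Solve m = tanh(J_eff * m / T) by fixed-point iteration from m = 1."""
    effective_coupling = coupling * 2 * links_per_new_node
    m = 1.0
    for _ in range(iterations):
        m = math.tanh(effective_coupling * m / temperature)
    return m


t_c = critical_temperature(links_per_new_node=3)
print(t_c)
print(magnetization(0.5 * t_c, links_per_new_node=3))  # ordered phase
print(magnetization(2.0 * t_c, links_per_new_node=3))  # disordered phase
```

Under these assumptions, doubling the number of links per new node doubles the temperature at which the magnetization vanishes, which matches the proportionality described in the abstract only qualitatively.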
-
Common-Knowledge Concept Recognition for SEVA
Authors:
Jitin Krishnan,
Patrick Coronado,
Hemant Purohit,
Huzefa Rangwala
Abstract:
We build a common-knowledge concept recognition system for a Systems Engineer's Virtual Assistant (SEVA) which can be used for downstream tasks such as relation extraction, knowledge graph construction, and question-answering. The problem is formulated as a token classification task similar to named entity extraction. With the help of a domain expert and text processing methods, we construct a dataset annotated at the word-level by carefully defining a labelling scheme to train a sequence model to recognize systems engineering concepts. We use a pre-trained language model and fine-tune it with the labeled dataset of concepts. In addition, we also create some essential datasets for information such as abbreviations and definitions from the systems engineering domain. Finally, we construct a simple knowledge graph using these extracted concepts along with some hyponym relations.
Submitted 25 March, 2020;
originally announced March 2020.
-
Unsupervised and Interpretable Domain Adaptation to Rapidly Filter Tweets for Emergency Services
Authors:
Jitin Krishnan,
Hemant Purohit,
Huzefa Rangwala
Abstract:
During the onset of a disaster event, filtering relevant information from the social web data is challenging due to its sparse availability and practical limitations in labeling datasets of an ongoing crisis. In this paper, we hypothesize that unsupervised domain adaptation through multi-task learning can be a useful framework to leverage data from past crisis events for training efficient information filtering models during the sudden onset of a new crisis. We present a novel method to classify relevant tweets during an ongoing crisis without seeing any new examples, using the publicly available dataset of TREC incident streams. Specifically, we construct a customized multi-task architecture with a multi-domain discriminator for crisis analytics: multi-task domain adversarial attention network. This model consists of dedicated attention layers for each task to provide model interpretability, which is critical for real-world applications. As deep networks struggle with sparse datasets, we show that this can be improved by sharing a base layer for multi-task learning and domain adversarial training. Evaluation of domain adaptation for crisis events is performed by choosing a target event as the test set and training on the rest. Our results show that the multi-task model outperformed its single task counterpart. For the qualitative evaluation of interpretability, we show that the attention layer can be used as a guide to explain the model predictions and empower emergency services for exploring accountability of the model, by showcasing the words in a tweet that are deemed important in the classification process. Finally, we show a practical implication of our work by providing a use-case for the COVID-19 pandemic.
Submitted 20 October, 2020; v1 submitted 4 March, 2020;
originally announced March 2020.
-
Diversity-Based Generalization for Unsupervised Text Classification under Domain Shift
Authors:
Jitin Krishnan,
Hemant Purohit,
Huzefa Rangwala
Abstract:
Domain adaptation approaches seek to learn from a source domain and generalize it to an unseen target domain. At present, the state-of-the-art unsupervised domain adaptation approaches for subjective text classification problems leverage unlabeled target data along with labeled source data. In this paper, we propose a novel method for domain adaptation of single-task text classification problems based on a simple but effective idea of diversity-based generalization that does not require unlabeled target data but still matches the state-of-the-art in performance. Diversity plays the role of promoting the model to better generalize and be indiscriminate towards domain shift by forcing the model not to rely on the same features for prediction. We apply this concept on the most explainable component of neural networks, the attention layer. To generate sufficient diversity, we create a multi-head attention model and infuse a diversity constraint between the attention heads such that each head will learn differently. We further expand upon our model by tri-training and designing a procedure with an additional diversity constraint between the attention heads of the tri-trained classifiers. Extensive evaluation using the standard benchmark dataset of Amazon reviews and a newly constructed dataset of Crisis events shows that our fully unsupervised method matches the competing baselines that use unlabeled target data. Our results demonstrate that machine learning architectures that ensure sufficient diversity can generalize better, encouraging future research to design ubiquitously usable learning models without using unlabeled target data.
Submitted 20 October, 2020; v1 submitted 25 February, 2020;
originally announced February 2020.
-
A Modified Ising Model of Barabási-Albert Network with Gene-type Spins
Authors:
Jeyashree Krishnan,
Reza Torabi,
Edoardo Di Napoli,
Andreas Schuppert
Abstract:
The central question of systems biology is to understand how individual components of a biological system such as genes or proteins cooperate in emerging phenotypes resulting in the evolution of diseases. As living cells are open systems in quasi-steady state type equilibrium in continuous exchange with their environment, computational techniques that have been successfully applied in statistical thermodynamics to describe phase transitions may provide new insights to emerging behavior of biological systems. Here we will systematically evaluate the translation of computational techniques from solid-state physics to network models that closely resemble biological networks and develop specific translational rules to tackle problems unique to living systems. Hence we will focus on logic models exhibiting only two states in each network node. Motivated by the apparent asymmetry between biological states where an entity exhibits Boolean states, i.e., is active or inactive, we present an adaptation of the symmetric Ising model towards an asymmetric one fitting to living systems, here referred to as the modified Ising model with gene-type spins. We analyze phase transitions by Monte Carlo simulations and propose a mean-field solution of the modified Ising model on a network type that closely resembles real-world networks, the Barabási-Albert model of scale-free networks. We show that asymmetric Ising models show similarities to symmetric Ising models with external field, undergo a discontinuous phase transition of the first order, and exhibit hysteresis. The simulation setup presented here can be directly used for any biological network connectivity dataset and is also applicable for other networks that exhibit similar states of activity. This is a general statistical method to deal with non-linear large scale models arising in the context of biological systems and is scalable to any network size.
Submitted 19 August, 2019;
originally announced August 2019.
-
SURE-fuse WFF: A Multi-resolution Windowed Fourier Analysis for Interferometric Phase Denoising
Authors:
Joshin P. Krishnan,
Mário A. T. Figueiredo,
José M. Bioucas-Dias
Abstract:
Interferometric phase (InPhase) imaging is an important part of many present-day coherent imaging technologies. Often in such imaging techniques, the acquired images, known as interferograms, suffer from two major degradations: 1) phase wrapping caused by the fact that the sensing mechanism can only measure sinusoidal $2π$-periodic functions of the actual phase, and 2) noise introduced by the acquisition process or the system. This work focusses on InPhase denoising which is a fundamental restoration step to many posterior applications of InPhase, namely to phase unwrapping. The presence of sharp fringes that arise from phase wrapping makes InPhase denoising a hard inverse problem. Motivated by the fact that the InPhase images are often locally sparse in Fourier domain, we propose a multi-resolution windowed Fourier filtering (WFF) analysis that fuses WFF estimates with different resolutions, thus overcoming the WFF fixed resolution limitation. The proposed fusion relies on an unbiased estimate of the mean square error derived using Stein's lemma adapted to complex-valued signals. This estimate, known as SURE, is minimized using an optimization framework to obtain the fusion weights. Strong experimental evidence, using synthetic and real (InSAR & MRI) data, is provided that the developed algorithm, termed SURE-fuse WFF, outperforms the best hand-tuned fixed resolution WFF as well as other state-of-the-art InPhase denoising algorithms.
Submitted 26 February, 2019; v1 submitted 9 November, 2018;
originally announced November 2018.
-
Patch-based Interferometric Phase Estimation via Mixture of Gaussian Density Modelling & Non-local Averaging in the Complex Domain
Authors:
Joshin P. Krishnan,
José M. Bioucas-Dias
Abstract:
This paper addresses interferometric phase (InPhase) image denoising, i.e., the denoising of phase modulo-$2π$ images from sinusoidal $2π$-periodic and noisy observations. The wrapping discontinuities present in the InPhase images, which are to be preserved carefully, make InPhase denoising a challenging inverse problem. We propose a novel two-step algorithm to tackle this problem by exploiting the non-local self-similarity of the InPhase images. In the first step, the patches of the phase images are modelled using Mixture of Gaussian (MoG) densities in the complex domain. An Expectation Maximization (EM) algorithm is formulated to learn the parameters of the MoG from the noisy data. The learned MoG is used as a prior for estimating the InPhase images from the noisy images using Minimum Mean Square Error (MMSE) estimation. In the second step, an additional exploitation of non-local self-similarity is done by performing a type of non-local mean filtering. Experiments conducted on simulated and real (MRI and InSAR) datasets show results which are competitive with the state-of-the-art techniques.
Submitted 24 October, 2018;
originally announced October 2018.
-
Dictionary Learning Phase Retrieval from Noisy Diffraction Patterns
Authors:
Joshin P. Krishnan,
José M. Bioucas-Dias,
Vladimir Katkovnik
Abstract:
This paper proposes a novel algorithm for image phase retrieval, i.e., for recovering complex-valued images from the amplitudes of noisy linear combinations (often the Fourier transform) of the sought complex images. The algorithm is developed using the alternating projection framework and is aimed to obtain high performance for heavily noisy (Poissonian or Gaussian) observations. The estimation of the target images is reformulated as a sparse regression, often termed sparse coding, in the complex domain. This is accomplished by learning a complex domain dictionary from the data it represents via matrix factorization with sparsity constraints on the code (i.e., the regression coefficients). Our algorithm, termed dictionary learning phase retrieval (DLPR), jointly learns the referred to dictionary and reconstructs the unknown target image. The effectiveness of DLPR is illustrated through experiments conducted on complex images, simulated and real, where it shows noticeable advantages over the state-of-the-art competitors.
Submitted 18 October, 2018;
originally announced October 2018.
-
Perfect spike detection via time reversal
Authors:
Jeyashree Krishnan,
PierGianLuca Porta Mana,
Moritz Helias,
Markus Diesmann,
Edoardo Di Napoli
Abstract:
Spiking neuronal networks are usually simulated with three main simulation schemes: the classical time-driven and event-driven schemes, and the more recent hybrid scheme. All three schemes evolve the state of a neuron through a series of checkpoints: equally spaced in the first scheme and determined neuron-wise by spike events in the latter two. The time-driven and the hybrid scheme determine whether the membrane potential of a neuron crosses a threshold at the end of the time interval between consecutive checkpoints. Threshold crossing can, however, occur within the interval even if this test is negative. Spikes can therefore be missed. The present work derives, implements, and benchmarks a method for perfect retrospective spike detection. This method can be applied to neuron models with affine or linear subthreshold dynamics. The idea behind the method is to propagate the threshold with a time-inverted dynamics, testing whether the threshold crosses the neuron state to be evolved, rather than vice versa. Algebraically this translates into a set of inequalities necessary and sufficient for threshold crossing. This test is slower than the imperfect one, but faster than alternative perfect tests based on bisection or root-finding methods. Comparison confirms earlier results that the imperfect test rarely misses spikes (less than a fraction $1/10^8$ of missed spikes) in biologically relevant settings. This study offers an alternative geometric point of view on neuronal dynamics.
Submitted 18 June, 2017;
originally announced June 2017.
-
Pulse Bifurcations and Instabilities in an Excitable Medium: Computations in Finite Ring Domains
Authors:
M. Or-Guil,
J. Krishnan,
I. G. Kevrekidis,
M. Bar
Abstract:
We investigate the instabilities and bifurcations of traveling pulses in a model excitable medium; in particular we discuss three different scenarios for the loss of stability resp. the disappearance of stable pulses. In numerical simulations beyond the instabilities we observe replication of pulses (backfiring) resulting in complex periodic or spatiotemporally chaotic dynamics as well as modulated traveling pulses. We approximate the linear stability of traveling pulses through computations in a finite albeit large domain with periodic boundary conditions. The critical eigenmodes at the onset of the instabilities are related to the resulting spatiotemporal dynamics and act upon the back of the pulses. The first scenario has been analyzed earlier for high excitability resp. low excitation threshold: it involves the collision of a stable pulse branch with an unstable pulse branch in a so-called T-point.
Submitted 21 June, 2001;
originally announced June 2001.