-
SpeechVerse: A Large-scale Generalizable Audio Language Model
Authors:
Nilaksh Das,
Saket Dingliwal,
Srikanth Ronanki,
Rohit Paturi,
Zhaocheng Huang,
Prashant Mathur,
Jie Yuan,
Dhanush Bekal,
Xing Niu,
Sai Muralidhar Jayanthi,
Xilai Li,
Karel Mundnich,
Monica Sunkara,
Sravan Bodapati,
Sundararajan Srinivasan,
Kyu J Han,
Katrin Kirchhoff
Abstract:
Large language models (LLMs) have shown incredible proficiency in performing tasks that require semantic understanding of natural language instructions. Recently, many works have further expanded this capability to perceive multimodal audio and text inputs, but their capabilities are often limited to specific fine-tuned tasks such as automatic speech recognition and translation. We therefore develop SpeechVerse, a robust multi-task training and curriculum learning framework that combines pre-trained speech and text foundation models via a small set of learnable parameters, while keeping the pre-trained models frozen during training. The models are instruction finetuned using continuous latent representations extracted from the speech foundation model to achieve optimal zero-shot performance on a diverse range of speech processing tasks using natural language instructions. We perform extensive benchmarking that includes comparing our model performance against traditional baselines across several datasets and tasks. Furthermore, we evaluate the model's capability for generalized instruction following by testing on out-of-domain datasets, novel prompts, and unseen tasks. Our empirical experiments reveal that our multi-task SpeechVerse model is even superior to conventional task-specific baselines on 9 out of the 11 tasks.
Submitted 24 March, 2025; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Exploring Explainability in Video Action Recognition
Authors:
Avinab Saha,
Shashank Gupta,
Sravan Kumar Ankireddy,
Karl Chahine,
Joydeep Ghosh
Abstract:
Image Classification and Video Action Recognition are perhaps the two most foundational tasks in computer vision. Consequently, explaining the inner workings of trained deep neural networks is of prime importance. While numerous efforts focus on explaining the decisions of trained deep neural networks in image classification, exploration in the domain of its temporal version, video action recognition, has been scant. In this work, we take a deeper look at this problem. We begin by revisiting Grad-CAM, one of the popular feature attribution methods for Image Classification, and its extension to Video Action Recognition tasks and examine the method's limitations. To address these, we introduce Video-TCAV, by building on TCAV for Image Classification tasks, which aims to quantify the importance of specific concepts in the decision-making process of Video Action Recognition models. As the scalable generation of concepts is still an open problem, we propose a machine-assisted approach to generate spatial and spatiotemporal concepts relevant to Video Action Recognition for testing Video-TCAV. We then establish the importance of temporally-varying concepts by demonstrating the superiority of dynamic spatiotemporal concepts over trivial spatial concepts. In conclusion, we introduce a framework for investigating hypotheses in action recognition and quantitatively testing them, thus advancing research in the explainability of deep neural networks used in video action recognition.
Submitted 13 April, 2024;
originally announced April 2024.
-
Attribution Regularization for Multimodal Paradigms
Authors:
Sahiti Yerramilli,
Jayant Sravan Tamarapalli,
Jonathan Francis,
Eric Nyberg
Abstract:
Multimodal machine learning has gained significant attention in recent years due to its potential for integrating information from multiple modalities to enhance learning and decision-making processes. However, it is commonly observed that unimodal models outperform multimodal models, despite the latter having access to richer information. Additionally, the influence of a single modality often dominates the decision-making process, resulting in suboptimal performance. This research project aims to address these challenges by proposing a novel regularization term that encourages multimodal models to effectively utilize information from all modalities when making decisions. The focus of this project lies in the video-audio domain, although the proposed regularization technique holds promise for broader applications in embodied AI research, where multiple modalities are involved. By leveraging this regularization term, the proposed approach aims to mitigate the issue of unimodal dominance and improve the performance of multimodal machine learning systems. Through extensive experimentation and evaluation, the effectiveness and generalizability of the proposed technique will be assessed. The findings of this research project have the potential to significantly contribute to the advancement of multimodal machine learning and facilitate its application in various domains, including multimedia analysis, human-computer interaction, and embodied AI research.
Submitted 2 April, 2024;
originally announced April 2024.
-
Semantic Augmentation in Images using Language
Authors:
Sahiti Yerramilli,
Jayant Sravan Tamarapalli,
Tanmay Girish Kulkarni,
Jonathan Francis,
Eric Nyberg
Abstract:
Deep Learning models are incredibly data-hungry and require very large labeled datasets for supervised learning. As a consequence, these models often suffer from overfitting, limiting their ability to generalize to real-world examples. Recent advancements in diffusion models have enabled the generation of photorealistic images based on textual inputs. Leveraging the substantial datasets used to train these diffusion models, we propose a technique to utilize generated images to augment existing datasets. This paper explores various strategies for effective data augmentation to improve the out-of-domain generalization capabilities of deep learning models.
Submitted 2 April, 2024;
originally announced April 2024.
-
Modeling RIS from Electromagnetic Principles to Communication Systems--Part II: System-Level Simulation, Ray Tracing, and Measurement
Authors:
Le Hao,
Sravan K. R. Vuyyuru,
Sergei A. Tretyakov,
Artan Salihu,
Markus Rupp,
Risto Valkonen
Abstract:
In this paper, we systematically study the electromagnetic (EM) and communication aspects of an RIS through EM simulations, system-level and ray-tracing simulations, and finally measurements. We simulate a nearly perfect, lossless RIS, and a realistic lossy anomalous reflector (AR) in different ray tracers and analyze the large-scale fading of simple RIS-assisted links. We also compare the results with continuous and quantized unit cell reflection phases with one to four-bit resolutions. Finally, we perform over-the-air communication link measurements in an indoor setting with a manufactured sample of a wide-angle AR. The EM, system-level, and ray-tracing simulation results show good agreement with the measurement results. It is proved that the introduced macroscopic model of RIS from the EM aspects is consistent with our proposed communication models, both for an ideal RIS and a realistic AR.
Submitted 19 March, 2024;
originally announced March 2024.
-
Modeling RIS from Electromagnetic Principles to Communication Systems--Part I: Synthesis and Characterization of a Scalable Anomalous Reflector
Authors:
Sravan K. R. Vuyyuru,
Le Hao,
Markus Rupp,
Sergei A. Tretyakov,
Risto Valkonen
Abstract:
This work aims to build connections between the electromagnetic and communication aspects of Reconfigurable Intelligent Surfaces (RIS) by proposing a methodology to combine outputs from electromagnetic RIS design into an RIS-tailored system-level simulator and a ray tracer. In this first part of the contribution, a periodic anomalous reflector is designed using an algebraic array antenna scattering synthesis technique that enables electromagnetically accurate modeling of scattering surfaces with both static and reconfigurable scattering characteristics. The multi-mode periodic structure, capable of scattering into several anomalous angles through manipulation of reactive loads, is then cropped into finite-sized arrays, and the quantization effects of the load reactances on the array scattering are analyzed. An experimental anomalous reflector is demonstrated with a comparison between simulated and measured scattering performance. In the second part, the simulated receiving and transmitting scattering patterns of the anomalous reflector are utilized to build an electromagnetically consistent path loss model of an RIS into a system-level simulator. Large-scale fading is analyzed in simple scenarios of RIS-assisted wireless networks to verify the communication model, and an indoor scenario measurement using the manufactured anomalous reflector sample supports the simulation analysis. After verifying the connections between electromagnetic and communication aspects through simulations and measurements, the proposed communication model can be used for a broad range of RIS designs to perform large-scale system-level and ray-tracing simulations in realistic scenarios.
Submitted 19 March, 2024;
originally announced March 2024.
-
LightCode: Light Analytical and Neural Codes for Channels with Feedback
Authors:
Sravan Kumar Ankireddy,
Krishna Narayanan,
Hyeji Kim
Abstract:
The design of reliable and efficient codes for channels with feedback remains a longstanding challenge in communication theory. While significant improvements have been achieved by leveraging deep learning techniques, neural codes often suffer from high computational costs, a lack of interpretability, and limited practicality in resource-constrained settings. We focus on designing low-complexity coding schemes that are interpretable and more suitable for communication systems. We advance both analytical and neural codes. First, we demonstrate that PowerBlast, an analytical coding scheme inspired by Schalkwijk-Kailath (SK) and Gallager-Nakiboğlu (GN) schemes, achieves notable reliability improvements over both SK and GN schemes, outperforming neural codes in high signal-to-noise ratio (SNR) regions. Next, to enhance reliability in low-SNR regions, we propose LightCode, a lightweight neural code that achieves state-of-the-art reliability while using a fraction of the memory and compute of existing deep-learning-based codes. Finally, we systematically analyze the learned codes, establishing connections between LightCode and PowerBlast, identifying components crucial for performance, and providing interpretation aided by linear regression analysis.
Submitted 16 November, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
Unitary quantum gravitational physics and the CMB parity asymmetry
Authors:
Enrique Gaztañaga,
K. Sravan Kumar
Abstract:
The proposal of Direct-Sum Quantum Field Theory (DQFT) offers a new perspective for quantum fields by combining parity and time reversal operations, blurring the distinction between quantum past and future while preserving causality. This approach provides a unitary QFT in curved spacetime, resolving the information-loss paradox. When applied to inflationary quantum fluctuations, DQFT predicts variations in CMB measurements, explaining longstanding anomalies as a result of parity asymmetry. The data strongly supports the unitary treatment of quantum fluctuations, with a probability over 650 times greater than the standard prediction. This significant discrepancy underscores the validity and importance of the DQFT approach in understanding the intricate relationship between gravity and quantum mechanics.
Submitted 10 May, 2025; v1 submitted 4 March, 2024;
originally announced March 2024.
-
DeepPolar: Inventing Nonlinear Large-Kernel Polar Codes via Deep Learning
Authors:
S Ashwin Hebbar,
Sravan Kumar Ankireddy,
Hyeji Kim,
Sewoong Oh,
Pramod Viswanath
Abstract:
Progress in designing channel codes has been driven by human ingenuity and, fittingly, has been sporadic. Polar codes, developed on the foundation of Arikan's polarization kernel, represent the latest breakthrough in coding theory and have emerged as the state-of-the-art error-correction code for short-to-medium block length regimes. In an effort to automate the invention of good channel codes, especially in this regime, we explore a novel, non-linear generalization of Polar codes, which we call DeepPolar codes. DeepPolar codes extend the conventional Polar coding framework by utilizing a larger kernel size and parameterizing these kernels and matched decoders through neural networks. Our results demonstrate that these data-driven codes effectively leverage the benefits of a larger kernel size, resulting in enhanced reliability when compared to both existing neural codes and conventional Polar codes.
Submitted 4 June, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Automated detection of motion artifacts in brain MR images using deep learning and explainable artificial intelligence
Authors:
Marina Manso Jimeno,
Keerthi Sravan Ravi,
Maggie Fung,
John Thomas Vaughan, Jr.,
Sairam Geethanath
Abstract:
Quality assessment, including inspecting the images for artifacts, is a critical step during MRI data acquisition to ensure data quality and downstream analysis or interpretation success. This study demonstrates a deep learning model to detect rigid motion in T1-weighted brain images. We leveraged a 2D CNN for three-class classification and tested it on publicly available retrospective and prospective datasets. Grad-CAM heatmaps enabled the identification of failure modes and provided an interpretation of the model's results. The model achieved average precision and recall metrics of 85% and 80% on six motion-simulated retrospective datasets. Additionally, the model's classifications on the prospective dataset showed a strong inverse correlation (-0.84) compared to average edge strength, an image quality metric indicative of motion. This model is part of the ArtifactID tool, aimed at inline automatic detection of Gibbs ringing, wrap-around, and motion artifacts. This tool automates part of the time-consuming QA process and augments expertise on-site, particularly relevant in low-resource settings where local MR knowledge is scarce.
Submitted 13 February, 2024;
originally announced February 2024.
-
A Novel Approach to Regularising 1NN classifier for Improved Generalization
Authors:
Aditya Challa,
Sravan Danda,
Laurent Najman
Abstract:
In this paper, we propose a class of non-parametric classifiers that learn arbitrary boundaries and generalize well.
Our approach is based on a novel way to regularize 1NN classifiers using a greedy approach. We refer to this class of classifiers as Watershed Classifiers. 1NN classifiers are known to trivially over-fit and have very large VC dimension, and hence do not generalize well. We show that watershed classifiers can find arbitrary boundaries on any dense enough dataset and, at the same time, have very small VC dimension; hence a watershed classifier leads to good generalization.
Traditional approaches to regularize 1NN classifiers are to consider $K$ nearest neighbours. Neighbourhood component analysis (NCA) proposes a way to learn representations consistent with ($n-1$) nearest neighbour classifier, where $n$ denotes the size of the dataset. In this article, we propose a loss function which can learn representations consistent with watershed classifiers, and show that it outperforms the NCA baseline.
Submitted 13 February, 2024;
originally announced February 2024.
-
Nested Construction of Polar Codes via Transformers
Authors:
Sravan Kumar Ankireddy,
S Ashwin Hebbar,
Heping Wan,
Joonyoung Cho,
Charlie Zhang
Abstract:
Tailoring polar code construction for decoding algorithms beyond successive cancellation has remained a topic of significant interest in the field. However, despite the inherent nested structure of polar codes, the use of sequence models in polar code construction is understudied. In this work, we propose using a sequence modeling framework to iteratively construct a polar code for any given length and rate under various channel conditions. Simulations show that polar codes designed via sequential modeling using transformers outperform both 5G-NR sequence and Density Evolution based approaches for both AWGN and Rayleigh fading channels.
Submitted 30 January, 2024;
originally announced January 2024.
-
Finding origins of CMB anomalies in the inflationary quantum fluctuations
Authors:
Enrique Gaztañaga,
K. Sravan Kumar
Abstract:
In this paper, we present compelling evidence for parity asymmetry (a discrete symmetry that is separate from isotropy) in the Cosmic Microwave Background (CMB) map, measured through two-point temperature correlations. This parity-asymmetric CMB challenges our understanding of the quantum physics of the early Universe rather than LCDM ($Λ$ Cold-Dark-Matter). We commence by conducting a comprehensive analysis of the Planck CMB, focusing on the distribution of power in low-multipoles and temperature anticorrelations at parity conjugate points in position space. We find tension with the near scale-invariant power-law power spectrum of Standard Inflation (SI), with p-values of the order $\mathcal{O}\left( 10^{-4}-10^{-3} \right)$. Alternatively, we explore the framework of direct-sum inflation (DSI), where a quantum fluctuation arises as a direct-sum of two components evolving forward and backward in time at parity conjugate points in physical space. We find that DSI is consistent with data on parity asymmetry, the absence of power at $θ>60^{\circ}$, and power suppression at low-even-multipoles, which are major data anomalies in the SI. Furthermore, we discover that the parameters characterizing the hemispherical power asymmetry anomaly become statistically insignificant when the large SI quadrupole amplitude is reduced to align with the data. DSI explains this low quadrupole with a p-value of $3.5\%$, 39 times higher than SI. Combining statistics from parameters measuring parity and low-$\ell$ angular power spectrum, we find that DSI is 50-650 times more probable than SI. In summary, our investigation suggests that CMB temperature fluctuations exhibit homogeneity and isotropy but are parity-asymmetric, consistent with the predictions of DSI. This observation provides tantalizing evidence for the quantum mechanical nature of gravity.
Submitted 6 June, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
CUTTANA: Scalable Graph Partitioning for Faster Distributed Graph Databases and Analytics
Authors:
Milad Rezaei Hajidehi,
Sraavan Sridhar,
Margo Seltzer
Abstract:
Graph partitioning plays a pivotal role in various distributed graph processing applications, including graph analytics, graph neural network training, and distributed graph databases. Graphs that require distributed settings are often too large to fit in the main memory of a single machine. This challenge renders traditional in-memory graph partitioners infeasible, leading to the emergence of streaming solutions. Streaming partitioners produce lower-quality partitions because they work from partial information and must make premature decisions before they have a complete view of a vertex's neighborhood. We introduce CUTTANA, a streaming graph partitioner that partitions massive graphs (Web/Twitter scale) with superior quality compared to existing streaming solutions. CUTTANA uses a novel buffering technique that prevents the premature assignment of vertices to partitions and a scalable coarsening and refinement technique that enables a complete graph view, improving the intermediate assignment made by a streaming partitioner. We implemented a parallel version for CUTTANA that offers nearly the same partitioning latency as existing streaming partitioners.
Our experimental analysis shows that CUTTANA consistently yields better partitioning quality than existing state-of-the-art streaming vertex partitioners in terms of both edge-cut and communication volume metrics. We also evaluate the workload latencies that result from using CUTTANA and other partitioners in distributed graph analytics and databases. CUTTANA outperforms the other methods in most scenarios (algorithms, datasets). In analytics applications, CUTTANA improves runtime performance by up to 59% compared to various streaming partitioners (HDRF, Fennel, Ginger, HeiStream). In graph database tasks, CUTTANA results in higher query throughput by up to 23%, without hurting tail latency.
Submitted 9 December, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Efficient Synthesis of Passively Loaded Finite Arrays for Tunable Anomalous Reflection
Authors:
Sravan K. R. Vuyyuru,
Risto Valkonen,
Sergei A. Tretyakov,
Do-Hoon Kwon
Abstract:
A design methodology for planar loaded antenna arrays is proposed to synthesize a perfect anomalous reflection into an arbitrary direction by optimizing the scattering characteristics of passively loaded array antennas. It is based on efficient and accurate prediction of the induced current distribution and the associated scattering for any given set of load impedances. For a fixed array of finite dimensions, the deflection angles can be continuously adjusted with proper tuning of each load. We study and develop anomalous reflectors as semi-finite (finite $\times$ infinite) and finite planar rectangular arrays comprising printed patches with a subwavelength spacing. Anomalous reflection into an arbitrary desired angle using purely reactive loads is numerically and experimentally validated. Owing to the algebraic nature of load optimization, the design methodology may be applied to the synthesis of large-scale reflectors of practical significance.
Submitted 7 December, 2023;
originally announced December 2023.
-
JPPF: Multi-task Fusion for Consistent Panoptic-Part Segmentation
Authors:
Shishir Muralidhara,
Sravan Kumar Jagadeesh,
René Schuster,
Didier Stricker
Abstract:
Part-aware panoptic segmentation is a problem of computer vision that aims to provide a semantic understanding of the scene at multiple levels of granularity. More precisely, semantic areas, object instances, and semantic parts are predicted simultaneously. In this paper, we present our Joint Panoptic Part Fusion (JPPF) that combines the three individual segmentations effectively to obtain a panoptic-part segmentation. Two aspects are of utmost importance for this: First, a unified model for the three problems is desired that allows for mutually improved and consistent representation learning. Second, balancing the combination so that it gives equal importance to all individual results during fusion. Our proposed JPPF is parameter-free and dynamically balances its input. The method is evaluated and compared on the Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets in terms of PartPQ and Part-Whole Quality (PWQ). In extensive experiments, we verify the importance of our fair fusion, highlight its most significant impact for areas that can be further segmented into parts, and demonstrate the generalization capabilities of our design without fine-tuning on 5 additional datasets.
Submitted 30 November, 2023;
originally announced November 2023.
-
Multi-teacher Distillation for Multilingual Spelling Correction
Authors:
Jingfen Zhang,
Xuan Guo,
Sravan Bodapati,
Christopher Potts
Abstract:
Accurate spelling correction is a critical step in modern search interfaces, especially in an era of mobile devices and speech-to-text interfaces. For services that are deployed around the world, this poses a significant challenge for multilingual NLP: spelling errors need to be caught and corrected in all languages, and even in queries that use multiple languages. In this paper, we tackle this challenge using multi-teacher distillation. On our approach, a monolingual teacher model is trained for each language/locale, and these individual models are distilled into a single multilingual student model intended to serve all languages/locales. In experiments using open-source data as well as user data from a worldwide search service, we show that this leads to highly effective spelling correction models that can meet the tight latency requirements of deployed services.
Submitted 19 November, 2023;
originally announced November 2023.
-
Retrieve and Copy: Scaling ASR Personalization to Large Catalogs
Authors:
Sai Muralidhar Jayanthi,
Devang Kulshreshtha,
Saket Dingliwal,
Srikanth Ronanki,
Sravan Bodapati
Abstract:
Personalization of automatic speech recognition (ASR) models is a widely studied topic because of its many practical applications. Most recently, attention-based contextual biasing techniques are used to improve the recognition of rare words and domain specific entities. However, due to performance constraints, the biasing is often limited to a few thousand entities, restricting real-world usability. To address this, we first propose a "Retrieve and Copy" mechanism to improve latency while retaining the accuracy even when scaled to a large catalog. We also propose a training strategy to overcome the degradation in recall at such scale due to an increased number of confusing entities. Overall, our approach achieves up to 6% more Word Error Rate reduction (WERR) and 3.6% absolute improvement in F1 when compared to a strong baseline. Our method also allows for large catalog sizes of up to 20K without significantly affecting WER and F1-scores, while achieving at least 20% inference speedup per acoustic frame.
Submitted 14 November, 2023;
originally announced November 2023.
-
Revisiting primordial black holes formation from preheating instabilities: the case of Starobinsky inflation
Authors:
Daniel del-Corral,
Paolo Gondolo,
K. Sravan Kumar,
João Marto
Abstract:
In recent years, the formation of primordial black holes (PBHs) in early-universe inflationary cosmology has garnered significant attention. One plausible scenario for PBH formation arises during the preheating stage following inflation. Notably, this scenario does not necessitate any ad-hoc fine-tuning of the scalar field potential. This paper focuses on the growth of primordial density perturbations and the consequent possibility of PBH formation in the preheating stage of the Starobinsky model for inflation. The typical mechanism for PBH formation during preheating is based on the collapse of primordial fluctuations that become super-horizon during inflation (type I) and re-enter the particle horizon in the different phases of cosmic expansion. In this work, we show that there exists a certain range of modes that remain sub-horizon (never exiting the horizon) during inflation (type II modes) but evolve identically to type I modes if they fall into the instability band, leading to large density perturbations above the threshold that can potentially also contribute to PBH formation. We detail the conditions determining the possible collapse of type I and/or type II modes whose wavelengths are larger than the Jeans length we derive from the effective sound speed of scalar field fluctuations. Since the preheating stage is an 'inflaton' (approximately) matter-dominated phase, we follow the framework of the critical collapse of fluctuations and compute the mass fraction using the well-known Press-Schechter and the Khlopov-Polnarev formalisms, and compare the two. Finally, we comment on the implications of our study for the investigations concerned with primordial accretion and consequent PBH contribution to the dark matter.
Submitted 5 February, 2025; v1 submitted 5 November, 2023;
originally announced November 2023.
-
Generalized zero-shot audio-to-intent classification
Authors:
Veera Raghavendra Elluru,
Devang Kulshreshtha,
Rohit Paturi,
Sravan Bodapati,
Srikanth Ronanki
Abstract:
Spoken language understanding systems using audio-only data are gaining popularity, yet their ability to handle unseen intents remains limited. In this study, we propose a generalized zero-shot audio-to-intent classification framework with only a few sample text sentences per intent. To achieve this, we first train a supervised audio-to-intent classifier by making use of a self-supervised pre-trained model. We then leverage a neural audio synthesizer to create audio embeddings for sample text utterances and perform generalized zero-shot classification on unseen intents using cosine similarity. We also propose a multimodal training strategy that incorporates lexical information into the audio representation to improve zero-shot performance. Our multimodal training approach improves the accuracy of zero-shot intent classification on unseen intents by 2.75% and 18.2% for the SLURP and internal goal-oriented dialog datasets, respectively, compared to audio-only training.
Submitted 4 November, 2023;
originally announced November 2023.
-
Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach
Authors:
Heasung Kim,
Sravan Kumar Ankireddy
Abstract:
In this work, we consider the problem of network parameter optimization for rate maximization. We frame this as a joint optimization problem of power control, beamforming, and interference cancellation. We consider the setting where multiple Base Stations (BSs) communicate with multiple user equipment (UEs). Because of the exponential computational complexity of brute force search, we instead solve this nonconvex optimization problem using deep reinforcement learning (RL) techniques. The behavior of modern communication systems is notoriously difficult to model exactly. This limits the use of RL-based algorithms, because the agent needs to interact with the environment to explore and learn efficiently. Further, it is ill-advised to deploy the algorithm in the real world for exploration and learning because of the high cost of failure. In contrast to the previous RL-based solutions proposed, such as deep-Q network (DQN) based control, we suggest an offline model-based approach. We specifically consider discrete batch-constrained deep Q-learning (BCQ) and show that performance similar to DQN can be achieved with only a fraction of the data without exploring. This maximizes sample efficiency and minimizes risk in deploying a new algorithm to commercial networks. We provide the entire project resource, including code and data, at the following link: https://github.com/Heasung-Kim/safe-rl-deployment-for-5g.
Submitted 11 November, 2023; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Exemplar-Free Continual Transformer with Convolutions
Authors:
Anurag Roy,
Vinay Kumar Verma,
Sravan Voonna,
Kripabandhu Ghosh,
Saptarshi Ghosh,
Abir Das
Abstract:
Continual Learning (CL) involves training a machine learning model in a sequential manner to learn new information while retaining previously learned tasks without the presence of previous training data. Although there has been significant interest in CL, most recent CL approaches in computer vision have focused on convolutional architectures only. However, with the recent success of vision transformers, there is a need to explore their potential for CL. Although there have been some recent CL approaches for vision transformers, they either store training instances of previous tasks or require a task identifier during test time, which can be limiting. This paper proposes a new exemplar-free approach for class/task incremental learning called ConTraCon, which does not require task-id to be explicitly present during inference and avoids the need for storing previous training instances. The proposed approach leverages the transformer architecture and involves re-weighting the key, query, and value weights of the multi-head self-attention layers of a transformer trained on a similar task. The re-weighting is done using convolution, which enables the approach to maintain low parameter requirements per task. Additionally, an image augmentation-based entropic task identification approach is used to predict tasks without requiring task-ids during inference. Experiments on four benchmark datasets demonstrate that the proposed approach outperforms several competitive approaches while requiring fewer parameters.
Submitted 22 August, 2023;
originally announced August 2023.
-
Collaborative Wideband Spectrum Sensing and Scheduling for Networked UAVs in UTM Systems
Authors:
Sravan Reddy Chintareddy,
Keenan Roach,
Kenny Cheung,
Morteza Hashemi
Abstract:
In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as the secondary users to opportunistically utilize detected spectrum holes. To this end, we propose a multi-class classification problem for wideband spectrum sensing to detect vacant spectrum spots based on collected I/Q samples. To enhance the accuracy of the spectrum sensing module, the outputs from the multi-class classification by each individual UAV are fused at a server in the unmanned aircraft system traffic management (UTM) ecosystem. In the spectrum scheduling phase, we leverage reinforcement learning (RL) solutions to dynamically allocate the detected spectrum holes to the secondary users (i.e., UAVs). To evaluate the proposed methods, we establish a comprehensive simulation framework that generates a near-realistic synthetic dataset using the MATLAB LTE toolbox by incorporating base-station (BS) locations in a chosen area of interest, performing ray-tracing, and emulating the primary users' channel usage in terms of I/Q samples. This evaluation methodology provides a flexible framework to generate large spectrum datasets that could be used for developing ML/AI-based spectrum management solutions for aerial devices.
Submitted 9 August, 2023;
originally announced August 2023.
-
MAEA: Multimodal Attribution for Embodied AI
Authors:
Vidhi Jain,
Jayant Sravan Tamarapalli,
Sahiti Yerramilli,
Yonatan Bisk
Abstract:
Understanding multimodal perception for embodied AI is an open question because such inputs may contain highly complementary as well as redundant information for the task. A relevant direction for multimodal policies is understanding the global trends of each modality at the fusion layer. To this end, we disentangle the attributions for visual, language, and previous action inputs across different policies trained on the ALFRED dataset. Attribution analysis can be utilized to rank and group the failure scenarios, investigate modeling and dataset biases, and critically analyze multimodal EAI policies for robustness and user trust before deployment. We present MAEA, a framework to compute global attributions per modality of any differentiable policy. In addition, we show how attributions enable lower-level behavior analysis in EAI policies for language and visual attributions.
Submitted 25 July, 2023;
originally announced July 2023.
-
Towards a unitary formulation of quantum field theory in curved space-time: the case of Schwarzschild black hole
Authors:
K. Sravan Kumar,
João Marto
Abstract:
We argue that the origin of unitarity violation and information loss paradox in our understanding of black holes (BH) lies in the standard way of doing quantum field theory in curved space-time (QFTCS), which is heavily biased on intuition borrowed from classical General Relativity. In this paper, with the quantum first approach, we formulate a so-called direct-sum QFT (DQFT) in BH space-time based on a novel formulation of discrete space-time transformations in gravity that potentially restores unitarity. By invoking the quantum effects associated with the gravitational backreaction, we show that the Hawking quanta emerging outside of the Schwarzschild radius ($r_S=2GM$) cannot be independent of the quanta that continue to be inside $r_S$. This enables the information to be carried by Hawking quanta, but in the BH DQFT formalism, we do not get any firewalls. Furthermore, DQFT leads to the BH evaporation involving only pure states. This means the quantum mechanical effects at the BH horizon produce two components of a maximally entangled pure state in geometric superselection sector Hilbert spaces. This construction enables pure states to evolve into pure states, restoring unitarity and observer complementarity. Finally, we discuss how our framework leaves important clues for formulating a scattering matrix and probing the nature of quantum gravity.
Submitted 12 December, 2024; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Machine-directed gravitational-wave counterpart discovery
Authors:
Niharika Sravan,
Matthew J. Graham,
Michael W. Coughlin,
Tomas Ahumada,
Shreya Anand
Abstract:
Joint observations in electromagnetic and gravitational waves shed light on the physics of objects and surrounding environments with extreme gravity that are otherwise unreachable via siloed observations in each messenger. However, such detections remain challenging due to the rapid and faint nature of counterparts. Protocols for discovery and inference still rely on human experts manually inspecting survey alert streams and intuiting optimal usage of limited follow-up resources. Strategizing an optimal follow-up program requires adaptive sequential decision-making given evolving light curve data that (i) maximizes a global objective despite incomplete information and (ii) is robust to stochasticity introduced by detectors/observing conditions. Reinforcement learning (RL) approaches allow agents to implicitly learn the physics/detector dynamics and the behavior policy that maximize a designated objective through experience. To demonstrate the utility of such an approach for the kilonova follow-up problem, we train a toy RL agent for the goal of maximizing follow-up photometry for the true kilonova among several contaminant transient light curves. In a simulated environment where the agent learns online, it achieves 3x higher accuracy compared to a random strategy. However, it is surpassed by human agents by up to a factor of 2. This is likely because our hypothesis function (Q that is linear in state-action features) is an insufficient representation of the optimal behavior policy. More complex agents could perform at par or surpass human experts. Agents like these could pave the way for machine-directed software infrastructure to efficiently respond to next generation detectors, for conducting science inference and optimally planning expensive follow-up observations, scalably and with demonstrable performance guarantees.
Submitted 24 September, 2024; v1 submitted 18 July, 2023;
originally announced July 2023.
-
Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages
Authors:
Devang Kulshreshtha,
Saket Dingliwal,
Brady Houston,
Sravan Bodapati
Abstract:
Connectionist Temporal Classification (CTC) models are popular for their balance between speed and performance for Automatic Speech Recognition (ASR). However, these CTC models still struggle in other areas, such as personalization towards custom words. A recent approach explores Contextual Adapters, wherein an attention-based biasing model for CTC is used to improve the recognition of custom entities. While this approach works well with enough data, we showcase that it isn't an effective strategy for low-resource languages. In this work, we propose a supervision loss for smoother training of the Contextual Adapters. Further, we explore a multilingual strategy to improve performance with limited training data. Our method achieves 48% F1 improvement in retrieving unseen custom entities for a low-resource language. Interestingly, as a by-product of training the Contextual Adapters, we see a 5-11% Word Error Rate (WER) reduction in the performance of the base CTC model as well.
Submitted 3 July, 2023;
originally announced July 2023.
-
Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters
Authors:
Anshu Bhatia,
Sanchit Sinha,
Saket Dingliwal,
Karthik Gopalakrishnan,
Sravan Bodapati,
Katrin Kirchhoff
Abstract:
Speech representations learned in a self-supervised fashion from massive unlabeled speech corpora have been adapted successfully toward several downstream tasks. However, such representations may be skewed toward canonical data characteristics of such corpora and perform poorly on atypical, non-native accented speaker populations. With the state-of-the-art HuBERT model as a baseline, we propose and investigate self-supervised adaptation of speech representations to such populations in a parameter-efficient way via training accent-specific residual adapters. We experiment with 4 accents and choose automatic speech recognition (ASR) as the downstream task of interest. We obtain strong word error rate reductions (WERR) over HuBERT-large for all 4 accents, with a mean WERR of 22.7% with accent-specific adapters and a mean WERR of 25.1% if the entire encoder is accent-adapted. While our experiments utilize HuBERT and ASR as the downstream task, our proposed approach is both model and task-agnostic.
Submitted 1 July, 2023;
originally announced July 2023.
-
Updated observing scenarios and multi-messenger implications for the International Gravitational-wave Network's O4 and O5
Authors:
R. Weizmann Kiendrebeogo,
Amanda M. Farah,
Emily M. Foley,
Abigail Gray,
Nina Kunert,
Anna Puecher,
Andrew Toivonen,
R. Oliver VandenBerg,
Shreya Anand,
Tomás Ahumada,
Viraj Karambelkar,
Michael W. Coughlin,
Tim Dietrich,
S. Zacharie Kam,
Peter T. H. Pang,
Leo P. Singer,
Niharika Sravan
Abstract:
Advanced LIGO and Virgo's third observing run brought another binary neutron star merger (BNS) and the first neutron-star black hole mergers. While no confirmed kilonovae were identified in conjunction with any of these events, continued improvements of analyses surrounding GW170817 allow us to project constraints on the Hubble Constant ($H_0$), the Galactic enrichment from $r$-process nucleosynthesis, and ultra-dense matter possible from forthcoming events. Here, we describe the expected constraints based on the latest expected event rates from the international gravitational-wave network (IGWN) and analyses of GW170817. We show the expected detection rate of gravitational waves and their counterparts, as well as how sensitive potential constraints are to the observed numbers of counterparts. We intend this analysis as support for the community when creating scientifically driven electromagnetic follow-up proposals. During the next observing run O4, we predict an annual detection rate of electromagnetic counterparts from BNS of $0.43^{+0.58}_{-0.26}$ ($1.97^{+2.68}_{-1.2}$) for the Zwicky Transient Facility (Rubin Observatory).
Submitted 12 December, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR
Authors:
Goeric Huybrechts,
Srikanth Ronanki,
Xilai Li,
Hadis Nosrati,
Sravan Bodapati,
Katrin Kirchhoff
Abstract:
Conformer-based end-to-end models have become ubiquitous these days and are commonly used in both streaming and non-streaming automatic speech recognition (ASR). Techniques like dual-mode and dynamic chunk training helped unify streaming and non-streaming systems. However, there remains a performance gap between streaming with a full and limited past context. To address this issue, we propose the integration of a novel dynamic contextual carry-over mechanism in a state-of-the-art (SOTA) unified ASR system. Our proposed dynamic context Conformer (DCTX-Conformer) utilizes a non-overlapping contextual carry-over mechanism that takes into account both the left context of a chunk and one or more preceding context embeddings. We outperform the SOTA by a relative 25.0% word error rate, with a negligible latency impact due to the additional context embeddings.
Submitted 1 March, 2024; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Applications of Deep Learning to physics workflows
Authors:
Manan Agarwal,
Jay Alameda,
Jeroen Audenaert,
Will Benoit,
Damon Beveridge,
Meghna Bhattacharya,
Chayan Chatterjee,
Deep Chatterjee,
Andy Chen,
Muhammed Saleem Cholayil,
Chia-Jui Chou,
Sunil Choudhary,
Michael Coughlin,
Maximilian Dax,
Aman Desai,
Andrea Di Luca,
Javier Mauricio Duarte,
Steven Farrell,
Yongbin Feng,
Pooyan Goodarzi,
Ekaterina Govorkova,
Matthew Graham,
Jonathan Guiang,
Alec Gunny,
Weichangfeng Guo
, et al. (43 additional authors not shown)
Abstract:
Modern large-scale physics experiments create datasets with sizes and streaming rates that can exceed those from industry leaders such as Google Cloud and Netflix. Fully processing these datasets requires both sufficient compute power and efficient workflows. Recent advances in Machine Learning (ML) and Artificial Intelligence (AI) can either improve or replace existing domain-specific algorithms to increase workflow efficiency. Not only can these algorithms improve the physics performance of current algorithms, but they can often be executed more quickly, especially when run on coprocessors such as GPUs or FPGAs. In the winter of 2023, MIT hosted the Accelerating Physics with ML at MIT workshop, which brought together researchers from gravitational-wave physics, multi-messenger astrophysics, and particle physics to discuss and share current efforts to integrate ML tools into their workflows. The following white paper highlights examples of algorithms and computing frameworks discussed during this workshop and summarizes the expected computing needs for the immediate future of the involved fields.
Submitted 13 June, 2023;
originally announced June 2023.
-
Probing pre-supernova mass loss in double-peaked Type Ibc supernovae from the Zwicky Transient Facility
Authors:
Kaustav K. Das,
Mansi M. Kasliwal,
Jesper Sollerman,
Christoffer Fremling,
I. Irani,
Shing-Chi Leung,
Sheng Yang,
Samantha Wu,
Jim Fuller,
Shreya Anand,
Igor Andreoni,
C. Barbarino,
Thomas G. Brink,
Kishalay De,
Alison Dugas,
Steven L. Groom,
George Helou,
K-Ryan Hinds,
Anna Y. Q. Ho,
Viraj Karambelkar,
S. R. Kulkarni,
Daniel A. Perley,
Josiah Purdum,
Nicolas Regnault,
Steve Schulze
, et al. (12 additional authors not shown)
Abstract:
Eruptive mass loss of massive stars prior to supernova (SN) explosion is key to understanding their evolution and end fate. An observational signature of pre-SN mass loss is the detection of an early, short-lived peak prior to the radioactive-powered peak in the lightcurve of the SN. This is usually attributed to the SN shock passing through an extended envelope or circumstellar medium (CSM). Such an early peak is common for double-peaked Type IIb SNe with an extended Hydrogen envelope but is uncommon for normal Type Ibc SNe with very compact progenitors. In this paper, we systematically study a sample of 14 double-peaked Type Ibc SNe out of 475 Type Ibc SNe detected by the Zwicky Transient Facility. The rate of these events is ~ 3-9 % of Type Ibc SNe. A strong correlation is seen between the peak brightness of the first and the second peak. We perform a holistic analysis of this sample's photometric and spectroscopic properties. We find that six SNe have ejecta mass less than 1.5 Msun. Based on the nebular spectra and lightcurve properties, we estimate that the progenitor masses for these are less than ~ 12 Msun. The rest have an ejecta mass > 2.4 Msun and a higher progenitor mass. This sample suggests that the SNe with low progenitor masses undergo late-time binary mass transfer. Meanwhile, the SNe with higher progenitor masses are consistent with wave-driven mass loss or pulsation-pair instability-driven mass loss simulations.
Submitted 7 August, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Cosmology in nonlocal gravity
Authors:
Alexey S. Koshelev,
K. Sravan Kumar,
Alexei A. Starobinsky
Abstract:
In this chapter we review the recent developments of realizing $R^2$-like inflation in the framework of a most general UV nonlocal extension of Einstein's general theory of relativity (GR). It is a well-motivated robust approach towards quantum gravity. In the past decades, nonlocal gravitational theories which are quadratic in curvature have been understood to be ghost-free and super-renormalizable around maximally symmetric spacetimes. However, in the context of early Universe cosmology we show that one must go beyond the quadratic curvature nonlocal gravity in order to achieve a consistent ghost-free framework of Universe evolution from quasi de Sitter to Minkowski spacetime. In this regard, we discuss a construction of a most general nonlocal gravity action that leads to $R^2$-like inflation and discuss the corresponding observational predictions for the scalar and tensor spectral tilts, tensor-to-scalar ratio, and the primordial non-Gaussianities. We present an analysis of how the nonlocal inflationary cosmology goes beyond the established notions of effective field theories of inflation. Finally, we comment on some open questions and prospects of higher curvature nonlocal gravity on its way of achieving the UV completion.
Submitted 29 May, 2023;
originally announced May 2023.
-
Task-aware Distributed Source Coding under Dynamic Bandwidth
Authors:
Po-han Li,
Sravan Kumar Ankireddy,
Ruihan Zhao,
Hossein Nourkhiz Mahjoub,
Ehsan Moradi-Pari,
Ufuk Topcu,
Sandeep Chinchali,
Hyeji Kim
Abstract:
Efficient compression of correlated data is essential to minimize communication overload in multi-sensor networks. In such networks, each sensor independently compresses the data and transmits them to a central node due to limited communication bandwidth. A decoder at the central node decompresses and passes the data to a pre-trained machine learning-based task to generate the final output. Thus, it is important to compress the features that are relevant to the task. Additionally, the final performance depends heavily on the total available bandwidth. In practice, it is common to encounter varying availability in bandwidth, and higher bandwidth results in better performance of the task. We design a novel distributed compression framework composed of independent encoders and a joint decoder, which we call neural distributed principal component analysis (NDPCA). NDPCA flexibly compresses data from multiple sources to any available bandwidth with a single model, reducing computing and storage overhead. NDPCA achieves this by learning low-rank task representations and efficiently distributing bandwidth among sensors, thus providing a graceful trade-off between performance and bandwidth. Experiments show that NDPCA improves the success rate of multi-view robotic arm manipulation by 9% and the accuracy of object detection tasks on satellite imagery by 14% compared to an autoencoder with uniform bandwidth allocation.
Submitted 2 December, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
Authors:
Jinglun Cai,
Monica Sunkara,
Xilai Li,
Anshu Bhatia,
Xiao Pan,
Sravan Bodapati
Abstract:
Masked Language Models (MLMs) have proven to be effective for second-pass rescoring in Automatic Speech Recognition (ASR) systems. In this work, we propose Masked Audio Text Encoder (MATE), a multi-modal masked language model rescorer which incorporates acoustic representations into the input space of MLM. We adopt contrastive learning for effectively aligning the modalities by learning shared representations. We show that using a multi-modal rescorer is beneficial for domain generalization of the ASR system when target domain data is unavailable. MATE reduces word error rate (WER) by 4%-16% on in-domain, and 3%-7% on out-of-domain datasets, over the text-only baseline. Additionally, with a very limited amount of training data (0.8 hours), MATE achieves a WER reduction of 8%-23% over the first-pass baseline.
Submitted 24 May, 2023; v1 submitted 11 May, 2023;
originally announced May 2023.
-
A robust explanation of CMB anomalies with a new formulation of inflationary quantum fluctuations
Authors:
K. Sravan Kumar,
João Marto
Abstract:
The presence of CMB Hemispherical Asymmetry (HPA) challenges the current understanding of inflationary cosmology which does not generically predict the parity violation in the primordial correlations. In this paper, we shall review the recently proposed resolution to this based on a new formulation of quantizing inflationary fluctuations by focusing on the discrete spacetime transformations in a gravitational context. The predictive power of this formulation is that one can generate a scale dependent HPA in the context of single field inflation for all the primordial modes including scalar and tensor fluctuations without introducing any additional parameters. This result can be seen as an indication of spontaneous breaking of $\mathcal{C}\mathcal{P}\mathcal{T}$ symmetry in an expanding Universe; if confirmed by future observations, it would be a great leap in the subject of quantum field theory in curved spacetime.
Submitted 10 May, 2023;
originally announced May 2023.
-
Towards a unitary formulation of quantum field theory in curved spacetime: the case of de Sitter spacetime
Authors:
K. Sravan Kumar,
João Marto
Abstract:
Before we ask what the quantum gravity theory is, it is a legitimate quest to formulate a robust quantum field theory in curved spacetime (QFTCS). Several conceptual problems, especially unitarity loss (pure states evolving into mixed states), have raised concerns over several decades. In this paper, acknowledging the fact that time is a parameter in quantum theory, which is different from its status in the context of General Relativity (GR), we start with a "quantum first approach" and propose a new formulation for QFTCS based on the discrete spacetime transformations which offer a way to achieve unitarity. We rewrite the QFT in Minkowski spacetime with a direct-sum Fock space structure based on the discrete spacetime transformations and geometric superselection rules. Applying this framework to QFTCS, in the context of de Sitter (dS) spacetime, we elucidate how this approach to quantization complies with unitarity and the observer complementarity principle. We then comment on understanding the scattering of states in de Sitter spacetime. Furthermore, we discuss briefly the implications of our QFTCS approach to future research in quantum gravity.
Submitted 29 December, 2024; v1 submitted 10 May, 2023;
originally announced May 2023.
-
SENSEI: Search for Millicharged Particles produced in the NuMI Beam
Authors:
Liron Barak,
Itay M. Bloch,
Ana M. Botti,
Mariano Cababie,
Gustavo Cancelo,
Luke Chaplinsky,
Michael Crisler,
Alex Drlica-Wagner,
Rouven Essig,
Juan Estrada,
Erez Etzion,
Guillermo Fernandez Moroni,
Roni Harnik,
Stephen E. Holland,
Yaron Korn,
Zhen Liu,
Sravan Munagavalasa,
Aviv Orly,
Santiago E. Perez,
Ryan Plestid,
Dario Rodrigues,
Nathan A. Saffold,
Silvia Scorza,
Aman Singal,
Miguel Sofo Haro
, et al. (6 additional authors not shown)
Abstract:
Millicharged particles appear in several extensions of the Standard Model, but have not yet been detected. These hypothetical particles could be produced by an intense proton beam striking a fixed target. We use data collected in 2020 by the SENSEI experiment in the MINOS cavern at the Fermi National Accelerator Laboratory to search for ultra-relativistic millicharged particles produced in collisions of protons in the NuMI beam with a fixed graphite target. The absence of any ionization events with 3 to 6 electrons in the SENSEI data allows us to place world-leading constraints on millicharged particles for masses between 30 MeV and 380 MeV. This work also demonstrates the potential of utilizing low-threshold detectors to investigate new particles in beam-dump experiments, and motivates a future experiment designed specifically for this purpose.
Submitted 24 May, 2023; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation
Authors:
Nilaksh Das,
Monica Sunkara,
Sravan Bodapati,
Jinglun Cai,
Devang Kulshreshtha,
Jeff Farris,
Katrin Kirchhoff
Abstract:
End-to-end ASR models trained on large amounts of data tend to be implicitly biased towards language semantics of the training data. Internal language model estimation (ILME) has been proposed to mitigate this bias for autoregressive models such as attention-based encoder-decoder and RNN-T. Typically, ILME is performed by modularizing the acoustic and language components of the model architecture, and eliminating the acoustic input to perform log-linear interpolation with the text-only posterior. However, for CTC-based ASR, it is not as straightforward to decouple the model into such acoustic and language components, as CTC log-posteriors are computed in a non-autoregressive manner. In this work, we propose a novel ILME technique for CTC-based ASR models. Our method iteratively masks the audio timesteps to estimate a pseudo log-likelihood of the internal LM by accumulating log-posteriors for only the masked timesteps. Extensive evaluation across multiple out-of-domain datasets reveals that the proposed approach improves WER by up to 9.8% and OOV F1-score by up to 24.6% relative to Shallow Fusion, when only text data from target domain is available. In the case of zero-shot domain adaptation, with no access to any target domain data, we demonstrate that removing the source domain bias with ILME can still outperform Shallow Fusion to improve WER by up to 9.3% relative.
Submitted 5 May, 2023;
originally announced May 2023.
-
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
Authors:
Xilai Li,
Goeric Huybrechts,
Srikanth Ronanki,
Jeff Farris,
Sravan Bodapati
Abstract:
Recently, there has been an increasing interest in unifying streaming and non-streaming speech recognition models to reduce development, training and deployment cost. The best-known approaches rely on either window-based or dynamic chunk-based attention strategy and causal convolutions to minimize the degradation due to streaming. However, the performance gap still remains relatively large between non-streaming and a full-contextual model trained independently. To address this, we propose a dynamic chunk-based convolution replacing the causal convolution in a hybrid Connectionist Temporal Classification (CTC)-Attention Conformer architecture. Additionally, we demonstrate further improvements through initialization of weights from a full-contextual model and parallelization of the convolution and self-attention modules. We evaluate our models on the open-source Voxpopuli, LibriSpeech and in-house conversational datasets. Overall, our proposed model reduces the degradation of the streaming mode over the non-streaming full-contextual model from 41.7% and 45.7% to 16.7% and 26.2% on the LibriSpeech test-clean and test-other datasets respectively, while improving by a relative 15.5% WER over the previous state-of-the-art unified model.
Submitted 25 April, 2023; v1 submitted 18 April, 2023;
originally announced April 2023.
-
Searching for millicharged particles with 1 kg of Skipper-CCDs using the NuMI beam at Fermilab
Authors:
Santiago Perez,
Dario Rodrigues,
Juan Estrada,
Roni Harnik,
Zhen Liu,
Brenda A. Cervantes-Vergara,
Juan Carlos D'Olivo,
Ryan D. Plestid,
Javier Tiffenberg,
Tien-Tien Yu,
Alexis Aguilar-Arevalo,
Fabricio Alcalde-Bessia,
Nicolas Avalos,
Oscar Baez,
Daniel Baxter,
Xavier Bertou,
Carla Bonifazi,
Ana Botti,
Gustavo Cancelo,
Nuria Castelló-Mor,
Alvaro E. Chavarria,
Claudio R. Chavez,
Fernando Chierchie,
Juan Manuel De Egea,
Cyrus Dreyer
, et al. (35 additional authors not shown)
Abstract:
Oscura is a planned light-dark matter search experiment using Skipper-CCDs with a total active mass of 10 kg. As part of the detector development, the collaboration plans to build the Oscura Integration Test (OIT), an engineering test with 10% of the total mass. Here we discuss the early science opportunities with the OIT to search for millicharged particles (mCPs) using the NuMI beam at Fermilab. mCPs would be produced at low energies through photon-mediated processes from decays of scalar, pseudoscalar, and vector mesons, or direct Drell-Yan productions. Estimates show that the OIT would be a world-leading probe for mCPs in the MeV mass range.
Submitted 2 December, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
Skipper-CCD Sensors for the Oscura Experiment: Requirements and Preliminary Tests
Authors:
Brenda A. Cervantes-Vergara,
Santiago Perez,
Juan Estrada,
Ana Botti,
Claudio R. Chavez,
Fernando Chierchie,
Nathan Saffold,
Alexis Aguilar-Arevalo,
Fabricio Alcalde-Bessia,
Nicolás Avalos,
Oscar Baez,
Daniel Baxter,
Xavier Bertou,
Carla Bonifazi,
Gustavo Cancelo,
Nuria Castelló-Mor,
Alvaro E. Chavarria,
Juan Manuel De Egea,
Juan Carlos D'Olivo,
Cyrus Dreyer,
Alex Drlica-Wagner,
Rouven Essig,
Ezequiel Estrada,
Erez Etzion,
Paul Grylls
, et al. (30 additional authors not shown)
Abstract:
Oscura is a proposed multi-kg skipper-CCD experiment designed for a dark matter (DM) direct detection search that will reach unprecedented sensitivity to sub-GeV DM-electron interactions with its 10 kg detector array. Oscura is planning to operate at SNOLAB with 2070 m overburden, and aims to reach a background goal of less than one event in each electron bin in the 2-10 electron ionization-signal region for the full 30 kg-year exposure, with a radiation background rate of 0.01 dru. In order to achieve this goal, Oscura must address each potential source of background events, including instrumental backgrounds. In this work, we discuss the main instrumental background sources and the strategy to control them, establishing a set of constraints on the sensors' performance parameters. We present results from the tests of the first fabricated Oscura prototype sensors, evaluate their performance in the context of the established constraints and estimate the Oscura instrumental background based on these results.
Submitted 11 April, 2024; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Growth-rate measurement with type-Ia supernovae using ZTF survey simulations
Authors:
Bastien Carreres,
Julian E. Bautista,
Fabrice Feinstein,
Dominique Fouchez,
Benjamin Racine,
Mathew Smith,
Mellissa Amenouche,
Marie Aubert,
Suhail Dhawan,
Madeleine Ginolin,
Ariel Goobar,
Philippe Gris,
Leander Lacroix,
Eric Nuss,
Nicolas Regnault,
Mickael Rigault,
Estelle Robert,
Philippe Rosnet,
Kelian Sommer,
Richard Dekany,
Steven L. Groom,
Niharika Sravan,
Frank J. Masci,
Josiah Purdum
Abstract:
Measurements of the growth rate of structures at $z < 0.1$ with peculiar velocity surveys have the potential of testing the validity of general relativity on cosmic scales. In this work, we present growth-rate measurements from realistic simulated sets of type-Ia supernovae (SNe Ia) from the Zwicky Transient Facility (ZTF). We describe our simulation methodology, the light-curve fitting and peculiar velocity estimation. Using the maximum likelihood method, we derive constraints on $fσ_8$ using only ZTF SN Ia peculiar velocities. We carefully tested the method and we quantified biases due to selection effects (photometric detection, spectroscopic follow-up for typing) on several independent realizations. We simulated the equivalent of 6 years of ZTF data, and considering an unbiased spectroscopically typed sample at $z < 0.06$, we obtained unbiased estimates of $fσ_8$ with an average uncertainty of 19%. We also investigated the information gain in applying bias correction methods. Our results validate our framework which can be used on real ZTF data.
Submitted 22 June, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Compressed Error HARQ: Feedback Communication on Noise-Asymmetric Channels
Authors:
Sravan Kumar Ankireddy,
S. Ashwin Hebbar,
Yihan Jiang,
Hyeji Kim,
Pramod Viswanath
Abstract:
In modern communication systems with feedback, there are increasingly more scenarios where the transmitter has much less power than the receiver (e.g., medical implant devices), which we refer to as noise-asymmetric channels. For such channels, the feedback link is of higher quality than the forward link. However, feedback schemes for cellular communications, such as hybrid ARQ, do not fully utilize the high-quality feedback link. To this end, we introduce Compressed Error Hybrid ARQ, a generalization of hybrid ARQ tailored for noise-asymmetric channels; the receiver sends its estimated message to the transmitter, and the transmitter harmoniously switches between hybrid ARQ and compressed error retransmission. We show that our proposed method significantly improves reliability, latency, and spectral efficiency compared to the conventional hybrid ARQ in various practical scenarios where the transmitter is resource-constrained.
Submitted 20 February, 2023;
originally announced February 2023.
-
Electrochemically induced switching from antiferromagnetic spin-chain to frustrated spin-glass state in maple-leaf lattice Na2Mn3O7
Authors:
Corson Chao,
Shivani Srivastava,
Ming Lei,
Bachu Sravan Kumar,
Varun Kamboj,
Hari Ramachandran,
Zhelong Jiang,
Anirudh Adavi,
Kevin H. Stone,
Mark Asta,
Lilia S. Xie,
Iwnetim I. Abate
Abstract:
We report the electrochemical tuning of magnetic properties in the Na2Mn3O7 maple-leaf lattice (MLL) through ion deintercalation, revealing a switch from the 1D antiferromagnetic (AFM) spin-chain behavior of the S=3/2 MLL structure to frustrated spin-glass behavior. By utilizing Na deintercalation, we stabilize ferromagnetic (FM) short-range interactions within the original short-range AFM system, creating magnetic frustration within the system beyond that induced by the geometrically frustrated MLL structure, leading to a spin-glass state. Magnetic and structural analyses, combined with density functional theory (DFT) calculations, demonstrate the near-degeneracy between AFM and FM configurations in Na2Mn3O7, suggesting that the altered lattice distortions and disorder introduced via deintercalation are responsible for the frustrated magnetism. Our findings provide a novel platform for studying low-dimensional magnetism, spin-glass behavior, and potential applications in spintronics and computing technologies. This study represents the first observation of an induced spin-glass state in MLL materials and is a rare example of an electrochemically induced spin-glass state, highlighting the critical role of ion intercalation in tuning magnetic interactions.
Submitted 21 January, 2025; v1 submitted 24 January, 2023;
originally announced January 2023.
-
HST Proper Motion Measurements of Supernova Remnant N132D: Center of Expansion and Age
Authors:
John Banovetz,
Dan Milisavljevic,
Niharika Sravan,
Kathryn E. Weil,
Bhagya Subrayan,
Robert A. Fesen,
Daniel J. Patnaude,
Paul P. Plucinsky,
Charles J. Law,
William P. Blair,
Jon A. Morse
Abstract:
We present proper motion measurements of oxygen-rich ejecta of the LMC supernova remnant N132D using two epochs of Hubble Space Telescope Advanced Camera for Surveys data spanning 16 years. The proper motions of 120 individual knots of oxygen-rich gas were measured and used to calculate a center of expansion (CoE) of $α$=05:25:01.71 and $δ$=-69:38:41.64 (J2000) with a 1-$σ$ uncertainty of 2.90 arcseconds. This new CoE measurement is 9.2 and 10.8 arcseconds from two previous CoE estimates based on the geometry of the optically emitting ejecta. We also derive an explosion age of 2770 $\pm$ 500 yr, which is consistent with recent age estimates of $\approx 2500$ yr made from 3D ejecta reconstructions. We verify our estimates of the CoE and age using a new automated procedure that detected and tracked the proper motions of 137 knots, with 73 knots that overlap with the visually identified knots. We find the proper motions of ejecta are still ballistic, despite the remnant's age, and are consistent with the notion that the ejecta are expanding into an ISM cavity. Evidence for explosion asymmetry from the parent supernova is also observed. Using the visually measured proper motion measurements and corresponding center of expansion and age, we compare N132D to other supernova remnants with proper motion ejecta studies.
Submitted 5 January, 2023;
originally announced January 2023.
-
Developing and deploying deep learning models in brain MRI: a review
Authors:
Kunal Aggarwal,
Marina Manso Jimeno,
Keerthi Sravan Ravi,
Gilberto Gonzalez,
Sairam Geethanath
Abstract:
Magnetic Resonance Imaging (MRI) of the brain has benefited from deep learning (DL) to alleviate the burden on radiologists and MR technologists, and improve throughput. The easy accessibility of DL tools has resulted in the rapid increase of DL models and subsequent peer-reviewed publications. However, the rate of deployment in clinical settings is low. Therefore, this review attempts to bring together the ideas from data collection to deployment into the clinic, building on the guidelines and principles that accreditation agencies have espoused. We introduce the need for and the role of DL to deliver accessible MRI. This is followed by a brief review of DL examples in the context of neuropathologies. Based on these studies and others, we collate the prerequisites to develop and deploy DL models for brain MRI. We then delve into the guiding principles for following good machine learning practices in the context of neuroimaging with a focus on explainability. A checklist based on the FDA's good machine learning practices is provided as a summary of these guidelines. Finally, we review the current challenges and future opportunities in DL for brain MRI.
Submitted 3 January, 2023;
originally announced January 2023.
-
Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale
Authors:
Hritik Bansal,
Karthik Gopalakrishnan,
Saket Dingliwal,
Sravan Bodapati,
Katrin Kirchhoff,
Dan Roth
Abstract:
Language models have been shown to perform better with an increase in scale on a wide variety of tasks via the in-context learning paradigm. In this paper, we investigate the hypothesis that the ability of a large language model to perform a task via in-context learning is not uniformly spread across all of its underlying components. Using a 66 billion parameter language model (OPT-66B) across a diverse set of 14 downstream tasks, we find this is indeed the case: $\sim$70% of attention heads and $\sim$20% of feed forward networks can be removed with minimal decline in task performance. We find substantial overlap in the set of attention heads (un)important for in-context learning across tasks and number of in-context examples. We also address our hypothesis through a task-agnostic lens, finding that a small set of attention heads in OPT-66B score highly on their ability to perform primitive induction operations associated with in-context learning, namely, prefix matching and copying. These induction heads overlap with task-specific important heads, reinforcing arguments by Olsson et al. (arXiv:2209.11895) regarding induction head generality to more sophisticated behaviors associated with in-context learning. Overall, our study provides several insights that indicate large language models may be under-trained for in-context learning and opens up questions on how to pre-train language models to more effectively perform in-context learning.
Submitted 16 August, 2023; v1 submitted 18 December, 2022;
originally announced December 2022.
-
Multi-task Fusion for Efficient Panoptic-Part Segmentation
Authors:
Sravan Kumar Jagadeesh,
René Schuster,
Didier Stricker
Abstract:
In this paper, we introduce a novel network that generates semantic, instance, and part segmentation using a shared encoder and effectively fuses them to achieve panoptic-part segmentation. Unifying these three segmentation problems allows for mutually improved and consistent representation learning. To fuse the predictions of all three heads efficiently, we introduce a parameter-free joint fusion module that dynamically balances the logits and fuses them to create panoptic-part segmentation. Our method is evaluated on the Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets. For CPP, the PartPQ of our proposed model with joint fusion surpasses the previous state-of-the-art by 1.6 and 4.7 percentage points for all areas and segments with parts, respectively. On PPP, our joint fusion outperforms a model using the previous top-down merging strategy by 3.3 percentage points in PartPQ and 10.5 percentage points in PartPQ for partitionable classes.
Submitted 19 December, 2022; v1 submitted 15 December, 2022;
originally announced December 2022.
-
The prevalence and influence of circumstellar material around hydrogen-rich supernova progenitors
Authors:
Rachel J. Bruch,
Avishay Gal-Yam,
Ofer Yaron,
Ping Chen,
Nora L. Strotjohann,
Ido Irani,
Erez Zimmerman,
Steve Schulze,
Yi Yang,
Young-Lo Kim,
Mattia Bulla,
Jesper Sollerman,
Mickael Rigault,
Eran Ofek,
Maayane Soumagnac,
Frank J. Masci,
Christoffer Fremling,
Daniel Perley,
Jakob Nordin,
S. Bradley Cenko,
Anna Y. Q. Ho,
S. Adams,
Igor Adreoni,
Eric C. Bellm,
Nadia Blagorodnova
, et al. (22 additional authors not shown)
Abstract:
Narrow transient emission lines (flash-ionization features) in early supernova (SN) spectra trace the presence of circumstellar material (CSM) around the massive progenitor stars of core-collapse SNe. The lines disappear within days after the SN explosion, suggesting that this material is spatially confined, and originates from enhanced mass loss shortly (months to a few years) prior to explosion. We performed a systematic survey of H-rich (Type II) SNe discovered within less than two days from explosion during the first phase of the Zwicky Transient Facility (ZTF) survey (2018-2020), finding thirty events for which a first spectrum was obtained within $< 2$ days from explosion. The measured fraction of events showing flash ionisation features ($>36\%$ at $95\%$ confidence level) confirms that elevated mass loss in massive stars prior to SN explosion is common. We find that SNe II showing flash ionisation features are not significantly brighter, nor bluer, nor more slowly rising than those without. This implies that CSM interaction does not contribute significantly to their early continuum emission, and that the CSM is likely optically thin. We measured the persistence duration of flash ionisation emission and find that most SNe show flash features for $\approx 5 $ days. Rarer events, with persistence timescales $>10$ days, are brighter and rise longer, suggesting these may be intermediate between regular SNe II and strongly-interacting SNe IIn.
Submitted 13 December, 2022; v1 submitted 6 December, 2022;
originally announced December 2022.