Interplay of prompt and non-prompt photons in photon-triggered jet observables

Authors: Chathuranga Sirimanna, Yasuki Tachibana, Abhijit Majumder, Aaron Angerami, Ritu Arora, Steffen Bass, Yi Chen, Ritoban Datta, Lipei Du, Raymond Ehlers, Hannah Elfner, Rainer J. Fries, Charles Gale, Yayun He, Barbara Jacak, Peter Jacobs, Sangyong Jeon, Yi Ji, Florian Jonas, Lauren Kasper, Michael Kordell, Amit Kumar, Raghav Kunnawalkam-Elayavalli, Joseph Latessa, Yen-Jie Lee, et al. (27 additional authors not shown)

Abstract: Prompt photons are important yet challenging to observe in relativistic heavy-ion collisions, as they are produced in the early stages and traverse almost the entire QGP medium without interaction. Experimental analyses typically employ isolation cuts, in the hope of identifying prompt photons. Most theoretical studies consider only events with actual prompt photons, assuming no contribution from isolated non-prompt photons to reduce computational cost. For the first time, we present a study that compares simulation results generated using inclusive (bremsstrahlung) and prompt-photon events with multiple experimental observables for both $p-p$ and $Pb-Pb$ collisions at $5.02$ TeV. Simulations are carried out using the multi-stage JETSCAPE framework tuned to describe the quenching of jets and hadrons. Isolated non-prompt photons are generated in hard photon bremsstrahlung, where the photon is radiated at a sufficient angle to the jet. Several photon-triggered jet and jet substructure observables show significant contributions from inclusive photons, yielding an improvement in comparison with experimental data. Novel photon-triggered jet substructure observables are also expected to show new structures, yet to be detected in experiment. This effort examines the significance of isolated non-prompt photons using parameters tuned for a simultaneous description of the leading hadron and jet spectrum, and thus provides an independent verification of the multi-stage evolution framework.

Submitted 1 July, 2025; originally announced July 2025.

Comments: 5 pages, 4 figures

arXiv:2506.16344 [pdf, ps, other]

Effects of hadronic reinteraction on jet fragmentation from small to large systems

Authors: Hendrik Roch, Aaron Angerami, Ritu Arora, Steffen Bass, Yi Chen, Ritoban Datta, Lipei Du, Raymond Ehlers, Hannah Elfner, Rainer J. Fries, Charles Gale, Yayun He, Barbara Jacak, Peter Jacobs, Sangyong Jeon, Yi Ji, Florian Jonas, Lauren Kasper, Michael Kordell II, Amit Kumar, Raghav Kunnawalkam-Elayavalli, Joseph Latessa, Yen-Jie Lee, Roy Lemmon, Matt Luzum, et al. (27 additional authors not shown)

Abstract: We investigate the impact of the hadronic phase on jet quenching in nuclear collider experiments, an open question in heavy-ion physics. Previous studies in a simplified setup suggest that hadronic interactions could have significant effects, but a systematic analysis is needed. Using the X-SCAPE event generator with the SMASH afterburner, we study the role of hadronic rescattering on jet fragmentation hadrons. Applying this framework to $e^++e^-$ collisions, we demonstrate that even in small systems with limited particle production, hadronic interactions lead to measurable modifications in final-state hadronic and jet observables by comparing scenarios with and without afterburner rescattering.

Submitted 19 June, 2025; originally announced June 2025.

Comments: 6 pages, 3 figures, conference proceedings for Hard Probes 2024

arXiv:2506.15990 [pdf, ps, other]

Extraction of jet-medium interaction details through jet substructure for inclusive and gamma-tagged jets

Authors: Y. Tachibana, C. Sirimanna, A. Majumder, A. Angerami, R. Arora, S. A. Bass, Y. Chen, R. Datta, L. Du, R. Ehlers, H. Elfner, R. J. Fries, C. Gale, Y. He, B. V. Jacak, P. M. Jacobs, S. Jeon, Y. Ji, F. Jonas, L. Kasper, M. Kordell II, A. Kumar, R. Kunnawalkam-Elayavalli, J. Latessa, Y.-J. Lee, et al. (27 additional authors not shown)

Abstract: We present a comprehensive study of jet substructure modifications in high-energy heavy-ion collisions using both inclusive jets and $γ$-tagged jets, based on a multi-stage jet evolution model within the Monte Carlo framework JETSCAPE. To investigate hard parton splittings inside jets, we focus on Soft Drop observables. Our results for the groomed splitting radius and groomed jet mass distributions of inclusive jets show a slight narrowing compared to proton-proton baselines. We demonstrate that this apparent narrowing is primarily a selection bias from energy loss, rather than a direct modification of the splitting structure, by analyzing $γ$-tagged jets, where such bias is eliminated or significantly reduced. We also show that quark jets exhibit genuine modifications in their splitting structure, which is not seen in gluon jets. These effects are clearly visible in the substructure of $γ$-tagged jets, which are dominated by quark jets, but are not apparent for inclusive jets. This demonstrates that $γ$-tagged jets offer a powerful probe of medium-induced modifications to the hard splitting structure of jets.

Submitted 18 June, 2025; originally announced June 2025.

Comments: 5 pages, 3 figures, Proceedings of the 12th International Conference on Hard and Electromagnetic Probes of High-Energy Nuclear Collisions (Hard Probes 2024), September 22-27, 2024, Nagasaki, Japan

arXiv:2505.18149 [pdf, ps, other]

First Finish Search: Efficient Test-Time Scaling in Large Language Models

Authors: Aradhye Agarwal, Ayan Sengupta, Tanmoy Chakraborty

Abstract: Test-time scaling (TTS), which involves dynamic allocation of compute during inference, offers a promising way to improve reasoning in large language models. While existing TTS methods work well, they often rely on long decoding paths or require a large number of samples to be generated, increasing the token usage and inference latency. We observe the surprising fact that for reasoning tasks, shorter traces are much more likely to be correct than longer ones. Motivated by this, we introduce First Finish Search (FFS), a training-free parallel decoding strategy that launches $n$ independent samples and returns as soon as any one completes. We evaluate FFS alongside simple decoding, beam search, majority voting, and budget forcing on four reasoning models (DeepSeek-R1, R1-Distill-Qwen-32B, QwQ-32B and Phi-4-Reasoning-Plus) and across four datasets (AIME24, AIME25-I, AIME25-II and GPQA Diamond). With DeepSeek-R1, FFS achieves $82.23\%$ accuracy on the AIME datasets, a $15\%$ improvement over DeepSeek-R1's standalone accuracy, nearly matching OpenAI's o4-mini performance. Our theoretical analysis explains why stopping at the shortest trace is likely to yield a correct answer and identifies the conditions under which early stopping may be suboptimal. The elegance and simplicity of FFS demonstrate that straightforward TTS strategies can perform remarkably well, revealing the untapped potential of simple approaches at inference time.

Submitted 23 May, 2025; originally announced May 2025.
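
The "launch $n$ samples, return the first to finish" idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: `sample_trace` is a hypothetical stand-in for a real decoding call, and it simply sleeps for a time proportional to a made-up trace length so that shorter traces finish sooner.

```python
import asyncio
from typing import Dict, List


async def sample_trace(sample_id: int, num_tokens: int,
                       seconds_per_token: float = 0.002) -> Dict[str, int]:
    """Hypothetical stand-in for one decoding run (assumption, not a real API).

    Sleeps in proportion to the trace length, so shorter traces complete first.
    """
    await asyncio.sleep(num_tokens * seconds_per_token)
    return {"sample_id": sample_id, "num_tokens": num_tokens}


async def first_finish_search(trace_lengths: List[int]) -> Dict[str, int]:
    """Launch one task per sample and return the first one that completes."""
    tasks = [asyncio.create_task(sample_trace(i, n))
             for i, n in enumerate(trace_lengths)]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Stop the remaining samples so they consume no further compute.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # If several tasks finished in the same scheduling step, keep the shortest.
    return min((t.result() for t in done), key=lambda r: r["num_tokens"])


def run_ffs(trace_lengths: List[int]) -> Dict[str, int]:
    """Synchronous convenience wrapper around first_finish_search."""
    return asyncio.run(first_finish_search(trace_lengths))
```

Calling `run_ffs([40, 5, 25])` returns the record for the second sample, since its simulated trace is the shortest. In a real setting the cancellation step is what saves tokens and latency relative to waiting for all $n$ samples.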

arXiv:2505.15442 [pdf, ps, other]

On the Generalization vs Fidelity Paradox in Knowledge Distillation

Authors: Suhas Kamasetty Ramesh, Ayan Sengupta, Tanmoy Chakraborty

Abstract: Knowledge distillation (KD) is a key technique for compressing large language models into smaller ones while preserving performance. Despite the recent traction of KD research, its effectiveness for smaller language models (LMs) and the mechanisms driving knowledge transfer remain underexplored. In this work, we present the first large-scale empirical and statistical analysis of KD across models ranging from 0.5B to 7B parameters on 14 complex reasoning tasks in a zero-shot setting. Our findings reveal that KD can improve the average performance of smaller models by up to $10\%$, with a peak task-specific gain of $22\%$, while providing only marginal benefits ($\sim 1.3\%$) for larger models. Surprisingly, teacher performance has a minimal impact on student outcomes, while teacher task expertise impacts KD effectiveness. A correlation study indicates that smaller LMs benefit more from KD, whereas larger LMs show diminished gains. Additionally, we uncover a misalignment between improvements in student performance and reasoning fidelity, suggesting that while KD enhances accuracy, it does not always maintain the structured decision-making processes of the teacher. Our ablation study further highlights the importance of teacher signals and logit smoothing in influencing students' performance after distillation. Overall, our study offers a comprehensive empirical and statistical assessment of KD, highlighting both its benefits and trade-offs when distilling knowledge from larger to smaller LMs.

Submitted 21 May, 2025; originally announced May 2025.

arXiv:2505.14928 [pdf, other]

Deformation of Jets Induced by Ambient Medium Flow

Authors: Arjun Sengupta, Rainer J. Fries

Abstract: The evolution of jet showers in high energy nuclear collisions is influenced in various ways by the presence of a surrounding medium. The interaction of jet constituents with the medium can happen during the partonic stage of the jet, during hadronization, and even during its hadronic stage. We demonstrate how flow of the ambient medium in a direction transverse to the jet can introduce both dipole and quadrupole deformations. We propose to analyze the $n=1$ and $n=2$ harmonic deformations of soft and semi-hard hadrons or subjets in a jet with respect to the jet core using the method of $q$-vectors. We discuss simulations which show how the transverse shapes and their preferred angles evolve when the ambient environment of jets changes from the vacuum to a parton medium without flow and finally to a medium with various rates of transverse flow. Our study includes the effects of both flow during the development of the parton shower and hadronization. The existence of dipole deformations, and the correlation of the angles of dipole and quadrupole deformations could constitute promising experimental signals for the presence and size of ambient transverse flow.

Submitted 20 May, 2025; originally announced May 2025.

Comments: Contribution to Hard Probes 2024; 5 pages, 4 figures; submitted to EPJ Web of Conferences
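
For readers unfamiliar with $q$-vectors, a minimal sketch of a harmonic analysis of this kind is given below. It assumes the common weighted form $q_n = \sum_i w_i e^{i n \varphi_i} / \sum_i w_i$, with $\varphi_i$ the azimuthal angle of constituent $i$ around the jet core and $w_i$ a weight such as transverse momentum. The abstract does not state the paper's exact definition, so the normalization, the choice of weights, and the function names here are assumptions.

```python
import cmath
from typing import Sequence, Tuple


def q_vector(phis: Sequence[float], weights: Sequence[float], n: int) -> complex:
    """Weighted n-th harmonic q-vector (assumed form, not taken from the paper).

    phis: azimuthal angles of constituents relative to the jet core, in radians.
    weights: per-constituent weights, e.g. transverse momentum.
    """
    total = sum(weights)
    return sum(w * cmath.exp(1j * n * phi) for phi, w in zip(phis, weights)) / total


def deformation(phis: Sequence[float], weights: Sequence[float],
                n: int) -> Tuple[float, float]:
    """Return (magnitude, preferred angle) of the n-th harmonic deformation."""
    q = q_vector(phis, weights, n)
    return abs(q), cmath.phase(q) / n
```

With two equally weighted constituents placed back to back, the $n=1$ (dipole) magnitude vanishes while the $n=2$ (quadrupole) magnitude is close to 1, which is the kind of distinction between dipole and quadrupole shapes that the abstract refers to.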

arXiv:2505.07919 [pdf, other]

Revolutionising Bacterial Genomics: Graph-Based Strategies for Improved Variant Identification

Authors: Fathima Nuzla Ismail, Abira Sengupta

Abstract: A significant advancement in bioinformatics is using genome graph techniques to improve variation discovery across organisms. Traditional approaches, such as bwa mem, rely on linear reference genomes for genomic analyses but may introduce biases when applied to highly diverse bacterial genomes of the same species. Pangenome graphs provide an alternative paradigm for evaluating structural and minor variations within a graphical framework, including insertions, deletions, and single nucleotide polymorphisms. Pangenome graphs enhance the detection and interpretation of complex genetic variants by representing the full genetic diversity of a species. In this study, we present a robust and reliable bioinformatics pipeline utilising the PanGenome Graph Builder (PGGB) and the Variation Graph toolbox (vg giraffe) to align whole-genome sequencing data, call variants against a graph reference, and construct pangenomes from assembled genomes. Our results demonstrate that leveraging pangenome graphs over a single linear reference genome significantly improves mapping rates and variant calling accuracy for simulated and actual bacterial pathogen datasets.

Submitted 12 May, 2025; originally announced May 2025.

arXiv:2505.07433 [pdf, other]

Spatio-temporal pulse propagation during highly-resolved onset of Rayleigh-Taylor and Kelvin-Helmholtz Rayleigh-Taylor instabilities

Authors: Bhavna Joshi, Aditi Sengupta, Yassin Ajanif, Lucas Lestandi

Abstract: The present study explores the onset of the Rayleigh-Taylor instability (RTI) and Kelvin-Helmholtz Rayleigh-Taylor instability (KHRTI) with highly-resolved direct numerical simulations of two setups which consider air at different temperatures (or densities) and/or velocities in two halves of three-dimensional cuboidal domains. The compressible Navier-Stokes equations are solved using a novel parallel algorithm which does not involve overlapping points at sub-domain boundaries. The pressure disturbance field is compared during onset of RTI and KHRTI and corresponding convection- and advection-dominated mechanisms are highlighted by instantaneous features, spectra, and proper orthogonal decomposition. The relative contributions of pressure, kinetic energy and rotational energy to the overall energy budget are explored for both instabilities, revealing an acoustic trigger to be the incipient mechanism for both RTI and KHRTI. The nonlinear, spatio-temporal nature of the instability is further explored by application of a transport equation for enstrophy of compressible flows. This provides insights into the similarities and differences between the onset mechanisms of RTI and KHRTI, serving as a benchmark data set for shear and buoyancy-driven instabilities across diverse applications in geophysics, nuclear energy and atmospheric fluid dynamics.

Submitted 12 May, 2025; originally announced May 2025.

arXiv:2505.04717 [pdf, other]

Big Data Architecture for Large Organizations

Authors: Fathima Nuzla Ismail, Abira Sengupta, Shanika Amarasoma

Abstract: The exponential growth of big data has transformed how large organisations leverage information to drive innovation, optimise processes, and maintain competitive advantages. However, managing and extracting insights from vast, heterogeneous data sources requires a scalable, secure, and well-integrated big data architecture. This paper proposes a comprehensive big data framework that aligns with organisational objectives while ensuring flexibility, scalability, and governance. The architecture encompasses multiple layers, including data ingestion, transformation, storage, analytics, machine learning, and security, incorporating emerging technologies such as Generative AI (GenAI) and low-code machine learning. Cloud-based implementations across Google Cloud, AWS, and Microsoft Azure are analysed, highlighting their tools and capabilities. Additionally, this study explores advancements in big data architecture, including AI-driven automation, data mesh, and Data Ocean paradigms. By establishing a structured, adaptable framework, this research provides a foundational blueprint for large organisations to harness big data as a strategic asset effectively.

Submitted 7 May, 2025; originally announced May 2025.

arXiv:2505.01635 [pdf]

Dendritic Computing with Multi-Gate Ferroelectric Field-Effect Transistors

Authors: A N M Nafiul Islam, Xuezhong Niu, Jiahui Duan, Shubham Kumar, Kai Ni, Abhronil Sengupta

Abstract: Although inspired by neuronal systems in the brain, artificial neural networks generally employ point-neurons, which offer far less computational complexity than their biological counterparts. Neurons have dendritic arbors that connect to different sets of synapses and offer local non-linear accumulation - playing a pivotal role in processing and learning. Inspired by this, we propose a novel neuron design based on a multi-gate ferroelectric field-effect transistor that mimics dendrites. It leverages ferroelectric nonlinearity for local computations within dendritic branches, while utilizing the transistor action to generate the final neuronal output. The branched architecture paves the way for utilizing smaller crossbar arrays in hardware integration, leading to greater efficiency. Using an experimentally calibrated device-circuit-algorithm co-simulation framework, we demonstrate that networks incorporating our dendritic neurons achieve superior performance in comparison to much larger networks without dendrites ($\sim$17$\times$ fewer trainable weight parameters). These findings suggest that dendritic hardware can significantly improve the computational efficiency and learning capacity of neuromorphic systems optimized for edge applications.

Submitted 2 May, 2025; originally announced May 2025.

arXiv:2505.00985 [pdf, other]

Position: Enough of Scaling LLMs! Lets Focus on Downscaling

Authors: Yash Goel, Ayan Sengupta, Tanmoy Chakraborty

Abstract: We challenge the dominant focus on neural scaling laws and advocate for a paradigm shift toward downscaling in the development of large language models (LLMs). While scaling laws have provided critical insights into performance improvements through increasing model and dataset size, we emphasize the significant limitations of this approach, particularly in terms of computational inefficiency, environmental impact, and deployment constraints. To address these challenges, we propose a holistic framework for downscaling LLMs that seeks to maintain performance while drastically reducing resource demands. This paper outlines practical strategies for transitioning away from traditional scaling paradigms, advocating for a more sustainable, efficient, and accessible approach to LLM development.

Submitted 25 May, 2025; v1 submitted 2 May, 2025; originally announced May 2025.

arXiv:2504.19967 [pdf]

Enhancing short-term traffic prediction by integrating trends and fluctuations with attention mechanism

Authors: Adway Das, Agnimitra Sengupta, S. Ilgin Guler

Abstract: Traffic flow prediction is a critical component of intelligent transportation systems, yet accurately forecasting traffic remains challenging due to the interaction between long-term trends and short-term fluctuations. Standard deep learning models often struggle with these challenges because their architectures inherently smooth over fine-grained fluctuations while focusing on general trends. This limitation arises from low-pass filtering effects, gate biases favoring stability, and memory update mechanisms that prioritize long-term information retention. To address these shortcomings, this study introduces a hybrid deep learning framework that integrates both long-term trend and short-term fluctuation information using two input features processed in parallel, designed to capture complementary aspects of traffic flow dynamics. Further, our approach leverages attention mechanisms, specifically Bahdanau attention, to selectively focus on critical time steps within traffic data, enhancing the model's ability to predict congestion and other transient phenomena. Experimental results demonstrate that features learned from both branches are complementary, significantly improving the goodness-of-fit statistics across multiple prediction horizons compared to a baseline model. Notably, the attention mechanism enhances short-term forecast accuracy by directly targeting immediate fluctuations, though challenges remain in fully integrating long-term trends. This framework can contribute to more effective congestion mitigation and urban mobility planning by advancing the robustness and precision of traffic prediction models.

Submitted 28 April, 2025; originally announced April 2025.

arXiv:2504.18996 [pdf, ps, other]

Generalised tree modules: Hom-sets and indecomposability

Authors: Annoy Sengupta, Amit Kuber

Abstract: For a zero-relation algebra over a field $\mathcal K$, Crawley-Boevey introduced the concept of a tree module and provided a combinatorial description of a basis for the space of homomorphisms between two tree modules--the basis elements are called graph maps. The indecomposability of tree modules is essentially due to Gabriel. We relax a condition in the definition of a tree module to define generalised tree modules and when $\mathrm{char}(\mathcal K)\neq2$, under a certain condition, provide a combinatorial description of a finite generating set for the space of homomorphisms between two such modules--we call the generators generalised graph maps. As an application, we provide a sufficient condition for the (in)decomposability of certain generalised tree modules. We also show that all indecomposable modules over a Dynkin quiver of type $\mathbf D$ are isomorphic to generalised tree modules--this result also follows from a theorem of Ringel which states that all exceptional modules over the path algebra $\mathcal KQ$ of a finite quiver $Q$ are generalised tree modules.

Submitted 22 May, 2025; v1 submitted 26 April, 2025; originally announced April 2025.

Comments: 23 pages, added references to the literature on tree modules by Ringel, Kinser and Katter-Mahrt

MSC Class: 16G20

arXiv:2504.14729 [pdf, other]

Rank Bounds and PIT for $Σ^3 ΠΣΠ^d$ circuits via a non-linear Edelstein-Kelly theorem

Authors: Abhibhav Garg, Rafael Oliveira, Akash Kumar Sengupta

Abstract: We prove a non-linear Edelstein-Kelly theorem for polynomials of constant degree, fully settling a stronger form of Conjecture 30 in Gupta (2014), and generalizing the main result of Peleg and Shpilka (STOC 2021) from quadratic polynomials to polynomials of any constant degree. As a consequence of our result, we obtain constant rank bounds for depth-4 circuits with top fanin 3 and constant bottom fanin (denoted $Σ^{3}ΠΣΠ^{d}$ circuits) which compute the zero polynomial. This settles a stronger form of Conjecture 1 in Gupta (2014) when $k=3$, for any constant degree bound; additionally this also makes progress on Conjecture 28 in Beecken, Mittmann, and Saxena (Information \& Computation, 2013). Our rank bounds, when combined with Theorem 2 in Beecken, Mittmann, and Saxena (Information \& Computation, 2013) yield the first deterministic, polynomial time PIT algorithm for $Σ^{3}ΠΣΠ^{d}$ circuits.

Submitted 25 April, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

Comments: 43 pages. Added reference to concurrent work

arXiv:2504.14466 [pdf, other]

A Bio-inspired Asymmetric Double-Gate Ferroelectric FET for Emulating Astrocyte and Dendrite Dynamics in Neuromorphic Systems

Authors: Zhouhang Jiang, A N M Nafiul Islam, Zhuangyu Han, Zijian Zhao, Franz Müller, Jiahui Duan, Halid Mulaosmanovic, Stefan Dünkel, Sven Beyer, Sourav Dutta, Vijaykrishnan Narayanan, Thomas Kämpfe, Suma George Cardwell, Frances Chance, Abhronil Sengupta, Kai Ni

Abstract: Neuromorphic systems seek to replicate the functionalities of biological neural networks to attain significant improvements in performance and efficiency of AI computing platforms. However, these systems have generally remained limited to emulation of simple neurons and synapses, and have ignored higher order functionalities enabled by other components of the brain like astrocytes and dendrites. In this work, drawing inspiration from biology, we introduce a compact Double-Gate Ferroelectric Field Effect Transistor (DG-FeFET) cell that can emulate the dynamics of both astrocytes and dendrites within neuromorphic architectures. We demonstrate that with a ferroelectric top gate for synaptic weight programming as in conventional synapses and a non-ferroelectric back gate, the DG-FeFET realizes a synapse with a dynamic gain modulation mechanism. This can be leveraged as an analog for a compact astrocyte-tripartite synapse, as well as enabling dendrite-like gain modulation operations. By employing a fully-depleted silicon-on-insulator (FDSOI) FeFET as our double-gate device, we validate the linear control of the synaptic weight via the back gate terminal (i.e., the gate underneath the buried oxide (BOX) layer) through comprehensive theoretical and experimental studies. We showcase the promise such a tripartite synaptic device holds for numerous important neuromorphic applications, including autonomous self-repair of faulty neuromorphic hardware mediated by astrocytic functionality. Coordinate transformations based on dragonfly prey-interception circuitry models are also demonstrated based on dendritic function emulation by the device. This work paves the way forward for developing truly "brain-like" neuromorphic hardware that goes beyond the current dogma focusing only on neurons and synapses.

Submitted 19 April, 2025; originally announced April 2025.

Comments: 37 pages, 6 figures, 2 tables

arXiv:2504.11560 [pdf]

Enhancing Deterministic Freezing Level Predictions in the Northern Sierra Nevada Through Deep Neural Networks

Authors: Vesta Afzali Gorooh, Agniv Sengupta, Shawn Roj, Rachel Weihs, Brian Kawzenuk, Luca Delle Monache, F. Martin Ralph

Abstract: Accurate prediction of the freezing level (FZL) is essential for hydrometeorological forecasting systems and precipitation phase estimation, and it influences runoff generation and reservoir management decisions. In this study, we develop a deep learning-based postprocessing framework using the Unet convolutional neural network (CNN) architecture to refine the FZL forecasts from the West Weather Research and Forecasting (West-WRF) model. The proposed framework leverages reforecast data from West-WRF and FZL estimates from the California Nevada River Forecast Center (CNRFC) to train a deterministic Unet model over the Yuba-Feather watershed, a hydrologically critical basin in northern California. We introduce two variants of our model, Unet-Log and Unet-GMM, which utilize the logarithm of the hyperbolic cosine of the error and Gaussian Mixture Model loss functions, respectively, to enhance FZL forecast accuracy beyond an RMSE-based benchmark. Results indicate that the Unet-based postprocessing framework significantly improves FZL forecast skill across diverse atmospheric conditions and complex topography. Compared to the raw West-WRF output, our model achieves reductions in RMSE of up to 25% and increases the forecast-observation correlation by about 10% over the Yuba-Feather watershed. Furthermore, it effectively captures the spatiotemporal variability of the FZL across different elevations, mitigating systematic biases inherent in the West-WRF model.
This novel deep learning-based postprocessing approach demonstrates a promising pathway for integrating machine learning into hydrometeorological forecasting and decision support within the Forecast Informed Reservoir Operations (FIRO) framework.
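The log-cosh loss named in this abstract can be illustrated with a minimal sketch. The function names `log_cosh` and `log_cosh_loss` are hypothetical, and the averaging over forecast/observation pairs is an assumption; the abstract does not say how the authors reduce or weight the loss.

```python
import math


def log_cosh(error: float) -> float:
    """Numerically stable log(cosh(error)) for a single residual."""
    a = abs(error)
    # Identity: log(cosh(a)) = a + log(1 + exp(-2a)) - log(2).
    # This avoids overflow in cosh(a) when |error| is large.
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_cosh_loss(forecast, observed):
    """Mean log-cosh loss over paired values (mean reduction is an assumption)."""
    if len(forecast) != len(observed) or len(forecast) == 0:
        raise ValueError("forecast and observed must be non-empty and equal in length")
    residuals = (f - o for f, o in zip(forecast, observed))
    return sum(log_cosh(r) for r in residuals) / len(forecast)
```

For small residuals the loss behaves like half the squared error, and for large residuals it grows like the absolute error minus log 2, which makes it less sensitive to outliers than a squared-error loss.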

Submitted 15 April, 2025; originally announced April 2025.

arXiv:2504.08897 [pdf, other]

Toward Spiking Neural Network Local Learning Modules Resistant to Adversarial Attacks

Authors: Jiaqi Lin, Abhronil Sengupta

Abstract: Recent research has shown the vulnerability of Spiking Neural Networks (SNNs) under adversarial examples that are nearly indistinguishable from clean data in the context of frame-based and event-based information. The majority of these studies are constrained in generating adversarial examples using Backpropagation Through Time (BPTT), a gradient-based method which lacks biological plausibility. In contrast, local learning methods, which relax many of BPTT's constraints, remain under-explored in the context of adversarial attacks. To address this problem, we examine adversarial robustness in SNNs through the framework of four types of training algorithms. We provide an in-depth analysis of the ineffectiveness of gradient-based adversarial attacks to generate adversarial instances in this scenario. To overcome these limitations, we introduce a hybrid adversarial attack paradigm that leverages the transferability of adversarial instances. The proposed hybrid approach demonstrates superior performance, outperforming existing adversarial attack methods. Furthermore, the generalizability of the method is assessed under multi-step adversarial attacks, adversarial attacks in black-box FGSM scenarios, and within the non-spiking domain.

Submitted 11 April, 2025; originally announced April 2025.

arXiv:2504.06774 [pdf, other]

Hybrid machine learning models based on physical patterns to accelerate CFD simulations: a short guide on autoregressive models

Authors: Arindam Sengupta, Rodrigo Abadía-Heredia, Ashton Hetherington, José Miguel Pérez, Soledad Le Clainche

Abstract: Accurate modeling of the complex dynamics of fluid flows is a fundamental challenge in computational physics and engineering. This study presents an innovative integration of High-Order Singular Value Decomposition (HOSVD) with Long Short-Term Memory (LSTM) architectures to address the complexities of reduced-order modeling (ROM) in fluid dynamics. HOSVD improves the dimensionality reduction process by preserving multidimensional structures, surpassing the limitations of Singular Value Decomposition (SVD). The methodology is tested across numerical and experimental data sets, including two- and three-dimensional (2D and 3D) cylinder wake flows, spanning both laminar and turbulent regimes. The emphasis is also on exploring how the depth and complexity of LSTM architectures contribute to improving predictive performance. Simpler architectures with a single dense layer effectively capture the periodic dynamics, demonstrating the network's ability to model non-linearities and chaotic dynamics. The addition of extra layers provides higher accuracy at minimal computational cost. These additional layers enable the network to expand its representational capacity, improving the prediction accuracy and reliability. The results demonstrate that HOSVD outperforms SVD in all tested scenarios, as evidenced by using different error metrics. Efficient mode truncation by HOSVD-based models enables the capture of complex temporal patterns, offering reliable predictions even in challenging, noise-influenced data sets.
The findings underscore the adaptability and robustness of HOSVD-LSTM architectures, offering a scalable framework for modeling fluid dynamics.

Submitted 9 April, 2025; originally announced April 2025.

arXiv:2504.04342 [pdf, other]

Compression Laws for Large Language Models

Authors: Ayan Sengupta, Siddhant Chaudhary, Tanmoy Chakraborty

Abstract: We introduce compression laws for large language models (LLMs). While recent scaling laws have sought to understand how LLMs scale with respect to model size, pre-training data, and computational resources, we focus on understanding how model compression affects the performance of a pre-trained LLM on downstream tasks. We empirically examine the effects of structured model compression on LLMs through over $1000$ experiments across eight models with sizes ranging from $0.5B$ to $14B$ parameters. Our findings indicate that the test cross-entropy loss increases quadratically with the compression ratio, whereas performance on downstream tasks declines only linearly. Our study emphasizes the importance of recovery fine-tuning in enhancing generation loss, showing that the test loss of compressed LLMs can improve by up to 55% with recovery fine-tuning. At higher compression ratios (up to 90%), compressed LLMs demonstrate a speed increase of 60% during inference compared to their uncompressed counterparts, compensating for the performance degradation at this level. However, for smaller models ($\le 7B$), the computational gains are limited, peaking at just 35%. We conclude that model compression can be highly beneficial for larger models, especially when a smaller model within the same computational budget is not available. These insights provide practical guidelines for utilizing model compression techniques for adopting LLMs in real-life applications in resource-constrained settings.

Submitted 5 April, 2025; originally announced April 2025.

Comments: 16 pages, 11 figures, 6 tables

arXiv:2503.23693 [pdf, other]

Enhanced signal of momentum broadening in hard splittings for $γ$-tagged jets in a multistage approach

Authors: Y. Tachibana, C. Sirimanna, A. Majumder, A. Angerami, R. Arora, S. A. Bass, Y. Chen, R. Datta, L. Du, R. Ehlers, H. Elfner, R. J. Fries, C. Gale, Y. He, B. V. Jacak, P. M. Jacobs, S. Jeon, Y. Ji, F. Jonas, L. Kasper, M. Kordell II, A. Kumar, R. Kunnawalkam-Elayavalli, J. Latessa, Y.-J. Lee, et al. (27 additional authors not shown)

Abstract: We investigate medium-induced modifications to jet substructure observables that characterize hard splitting patterns in central Pb-Pb collisions at the top energy of the Large Hadron Collider (LHC). Using a multistage Monte Carlo simulation of in-medium jet shower evolution, we explore flavor-dependent medium effects through simulations of inclusive and $γ$-tagged jets. The results show that quark jets undergo a non-monotonic modification compared to gluon jets in observables such as the Pb-Pb to $p$-$p$ ratio of the Soft Drop prong angle $r_g$, the relative prong transverse momentum $k_{T,g}$ and the groomed mass $m_g$ distributions. Due to this non-monotonic modification, $γ$-tagged jets, enriched in quark jets, provide surprisingly clear signals of medium-induced structural modifications, distinct from effects dominated by selection bias. This work highlights the potential of hard substructures in $γ$-tagged jets as powerful tools for probing the jet-medium interactions in high-energy heavy-ion collisions. All simulations for $γ$-tagged jet analyses carried out in this paper used triggered events containing at least one hard photon, which highlights the utility of these observables for future Bayesian analysis.

Submitted 30 March, 2025; originally announced March 2025.

Comments: 26 pages, 31 figures

arXiv:2503.23685 [pdf, other]

An In-Situ Spatial-Temporal Sequence Detector for Neuromorphic Vision Sensor Empowered by High Density Vertical NAND Storage

Authors: Zijian Zhao, Varun Darshana Parekh, Po-Kai Hsu, Yixin Qin, Yiming Song, A N M Nafiul Islam, Ningyuan Cao, Siddharth Joshi, Thomas Kämpfe, Moonyoung Jung, Kwangyou Seo, Kwangsoo Kim, Wanki Kim, Daewon Ha, Sourav Dutta, Abhronil Sengupta, Xiao Gong, Shimeng Yu, Vijaykrishnan Narayanan, Kai Ni

Abstract: Neuromorphic vision sensors require efficient real-time pattern recognition, yet conventional architectures struggle with energy and latency constraints. Here, we present a novel in-situ spatiotemporal sequence detector that leverages vertical NAND storage to achieve massively parallel pattern detection. By encoding each cell with two single-transistor-based multi-level cell (MLC) memory elements, such as ferroelectric field-effect transistors (FeFETs), and mapping a pixel's temporal sequence onto consecutive word lines (WLs), we enable direct temporal pattern detection within NAND strings. Each NAND string serves as a dedicated reference for a single pixel, while different blocks store patterns for distinct pixels, allowing large-scale spatial-temporal pattern recognition via simple direct bit-line (BL) sensing, a well-established operation in vertical NAND storage. We experimentally validate our approach at both the cell and array levels, demonstrating that the vertical NAND-based detector achieves more than six orders of magnitude improvement in energy efficiency and more than three orders of magnitude reduction in latency compared to conventional CPU-based methods. These findings establish vertical NAND storage as a scalable and energy-efficient solution for next-generation neuromorphic vision processing.

Submitted 30 March, 2025; originally announced March 2025.

Comments: 26 pages, 7 figures

arXiv:2503.14491 [pdf, other]

Partial Quantum Shadow Tomography for Structured Operators and its Experimental Demonstration using NMR

Authors: Aniket Sengupta, Arijit Chatterjee, G. J. Sreejith, T. S. Mahesh

Abstract: Quantum shadow tomography based on the classical shadow representation provides an efficient way to estimate properties of an unknown quantum state without performing a full quantum state tomography. In scenarios where estimating the expectation values for only certain classes of observables is required, obtaining information about the entire density matrix is unnecessary. We propose a partial quantum shadow tomography protocol, which allows estimation of a subset of density matrix elements contributing to the expectation values of certain classes of structured observables. This method utilizes tomographically incomplete subsets of single qubit Pauli basis measurements to perform partial shadow tomography, making it experimentally more efficient. We demonstrate the advantage over unitary $k$-designs such as Clifford, full Pauli basis, and methods utilizing mutually unbiased bases by numerically analyzing the protocol for structured density matrices and observables. We experimentally demonstrate the partial shadow estimation scheme for a wide class of two-qubit states (pure, entangled, and mixed) in the nuclear magnetic resonance (NMR) platform, which relies on ensemble-based measurements. The full density matrix experimentally reconstructed by combining different partial estimators produces fidelities exceeding 97%.

Submitted 24 March, 2025; v1 submitted 18 March, 2025; originally announced March 2025.

Comments: 15 pages, 5 figures

arXiv:2502.19263 [pdf, other]

doi 10.1145/3708359.3712082

ArtInsight: Enabling AI-Powered Artwork Engagement for Mixed Visual-Ability Families

Authors: Arnavi Chheda-Kothary, Ritesh Kanchi, Chris Sanders, Kevin Xiao, Aditya Sengupta, Melanie Kneitmix, Jacob O. Wobbrock, Jon E. Froehlich

Abstract: We introduce ArtInsight, a novel AI-powered system to facilitate deeper engagement with child-created artwork in mixed visual-ability families. ArtInsight leverages large language models (LLMs) to craft a respectful and thorough initial description of a child's artwork, and provides: creative AI-generated descriptions for a vivid overview, audio recording to capture the child's own description of their artwork, and a set of AI-generated questions to facilitate discussion between blind or low-vision (BLV) family members and their children. Alongside ArtInsight, we also contribute a new rubric to score AI-generated descriptions of child-created artwork and an assessment of state-of-the-art LLMs. We evaluated ArtInsight with five groups of BLV family members and their children, and as a case study with one BLV child therapist. Our findings highlight a preference for ArtInsight's longer, artistically-tailored descriptions over those generated by existing BLV AI tools. Participants highlighted the creative description and audio recording components as most beneficial, with the former helping "bring a picture to life" and the latter centering the child's narrative to generate context-aware AI responses. Our findings reveal different ways that AI can be used to support art engagement, including before, during, and after interaction with the child artist, as well as expectations that BLV adults and their sighted children have about AI-powered tools.

Submitted 10 March, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

Comments: 21 pages, 30th International Conference on Intelligent User Interfaces (IUI 2025)

Journal ref: 30th International Conference on Intelligent User Interfaces 2025

arXiv:2502.13239 [pdf, other]

Towards Robustness Across Cosmological Simulation Models TNG, SIMBA, ASTRID, and EAGLE

Authors: Yongseok Jo, Shy Genel, Anirvan Sengupta, Benjamin Wandelt, Rachel Somerville, Francisco Villaescusa-Navarro

Abstract: The rapid advancement of large-scale cosmological simulations has opened new avenues for cosmological and astrophysical research. However, the increasing diversity among cosmological simulation models presents a challenge to the robustness. In this work, we develop the Model-Insensitive ESTimator (MIEST), a machine that can robustly estimate the cosmological parameters, $Ω_m$ and $σ_8$, from neutral hydrogen maps of simulation models in the CAMELS project: TNG, SIMBA, ASTRID, and EAGLE. An estimator is considered robust if it possesses a consistent predictive power across all simulations, including those used during the training phase. We train our machine using multiple simulation models and ensure that it only extracts common features between the models while disregarding the model-specific features. This allows us to develop a novel model that is capable of accurately estimating parameters across a range of simulation models, without being biased towards any particular model. Upon the investigation of the latent space, a set of summary statistics, we find that the implementation of robustness leads to the blending of latent variables across different models, demonstrating the removal of model-specific features. In comparison to a standard machine lacking robustness, the average performance of MIEST on the unseen simulations during the training phase has been improved by $\sim17$% for $Ω_m$ and $\sim 38$% for $σ_8$.
By using a machine learning approach that can extract robust, yet physical features, we hope to improve our understanding of galaxy formation and evolution in a (subgrid) model-insensitive manner, and ultimately, gain insight into the underlying physical processes responsible for robustness. This is a Learning the Universe publication.

Submitted 18 February, 2025; originally announced February 2025.

Comments: This is a Learning the Universe publication. 26 pages, 11 figures

arXiv:2502.12051 [pdf, other]

How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines

Authors: Ayan Sengupta, Yash Goel, Tanmoy Chakraborty

Abstract: Neural scaling laws have revolutionized the design and optimization of large-scale AI models by revealing predictable relationships between model size, dataset volume, and computational resources. Early research established power-law relationships in model performance, leading to compute-optimal scaling strategies. However, recent studies highlighted their limitations across architectures, modalities, and deployment contexts. Sparse models, mixture-of-experts, retrieval-augmented learning, and multimodal models often deviate from traditional scaling patterns. Moreover, scaling behaviors vary across domains such as vision, reinforcement learning, and fine-tuning, underscoring the need for more nuanced approaches. In this survey, we synthesize insights from over 50 studies, examining the theoretical foundations, empirical findings, and practical implications of scaling laws. We also explore key challenges, including data efficiency, inference scaling, and architecture-specific constraints, advocating for adaptive scaling strategies tailored to real-world applications. We suggest that while scaling laws provide a useful guide, they do not always generalize across all architectures and training strategies.

Submitted 26 May, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

Comments: 21 pages, 11 tables, 4 figures

arXiv:2502.05164 [pdf, ps, other]

In-context denoising with one-layer transformers: connections between attention and associative memory retrieval

Authors: Matthew Smart, Alberto Bietti, Anirvan M. Sengupta

Abstract: We introduce in-context denoising, a task that refines the connection between attention-based architectures and dense associative memory (DAM) networks, also known as modern Hopfield networks. Using a Bayesian framework, we show theoretically and empirically that certain restricted denoising problems can be solved optimally even by a single-layer transformer. We demonstrate that a trained attention layer processes each denoising prompt by performing a single gradient descent update on a context-aware DAM energy landscape, where context tokens serve as associative memories and the query token acts as an initial state. This one-step update yields better solutions than exact retrieval of either a context token or a spurious local minimum, providing a concrete example of DAM networks extending beyond the standard retrieval paradigm. Overall, this work solidifies the link between associative memory and attention mechanisms first identified by Ramsauer et al., and demonstrates the relevance of associative memory models in the study of in-context learning.
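The one-step picture described in this abstract can be sketched with the standard log-sum-exp energy of modern Hopfield networks, $E(q) = \tfrac{1}{2}\lVert q\rVert^2 - \tfrac{1}{\beta}\log\sum_i \exp(\beta\, x_i \cdot q)$. For this energy, a single gradient descent step of size 1 from the query $q$ lands exactly on the softmax-attention readout over the context tokens $x_i$. The energy form, the step size, the inverse temperature `beta`, and all function names below are assumptions made for illustration; the paper's trained layer and its exact energy may differ.

```python
import math


def softmax(scores):
    """Softmax with max-subtraction for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_update(query, context, beta=1.0):
    """Softmax-attention readout: sum_i softmax(beta * x_i . q)_i * x_i."""
    scores = [beta * sum(q * c for q, c in zip(query, token)) for token in context]
    weights = softmax(scores)
    return [sum(w * token[d] for w, token in zip(weights, context))
            for d in range(len(query))]


def energy_gradient(query, context, beta=1.0):
    """Gradient of the assumed energy E(q): q minus the attention readout."""
    attended = attention_update(query, context, beta)
    return [q - a for q, a in zip(query, attended)]


def one_gradient_step(query, context, beta=1.0, step=1.0):
    """One gradient descent step on E(q), starting from the query token."""
    grad = energy_gradient(query, context, beta)
    return [q - step * g for q, g in zip(query, grad)]
```

With `step=1.0` the result of `one_gradient_step` coincides with `attention_update`, which is the sense in which an attention layer can be read as one descent step with context tokens as stored memories and the query as the initial state.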

Submitted 6 June, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

Comments: Accepted to ICML 2025

arXiv:2501.16482 [pdf, other]

Hybrid Hadronization -- A Study of In-Medium Hadronization of Jets

Authors: A. Sengupta, R. J. Fries, M. Kordell II, B. Kim, A. Angerami, R. Arora, S. A. Bass, Y. Chen, R. Datta, L. Du, R. Ehlers, H. Elfner, C. Gale, Y. He, B. V. Jacak, P. M. Jacobs, S. Jeon, Y. Ji, F. Jonas, L. Kasper, A. Kumar, R. Kunnawalkam-Elayavalli, J. Latessa, Y.-J. Lee, R. Lemmon, et al. (28 additional authors not shown)

Abstract: QCD jets are considered important probes for quark gluon plasma created in collisions of nuclei at high energies. Their parton showers are significantly altered if they develop inside of a deconfined medium. Hadronization of jets is also thought to be affected by the presence of quarks and gluons. We present a systematic study of the effects of a thermal bath of partons on the hadronization of parton showers. We use the JETSCAPE framework to create parton showers both in vacuum and in a brick of quark gluon plasma. The brick setup allows important parameters, like the size of the plasma as well as the collective flow of partons, to be varied systematically. We hadronize the parton showers using Hybrid Hadronization, which permits shower partons to form strings with thermal partons, or to recombine directly with thermal partons as well as with each other. We find a sizeable amount of interaction of shower partons with thermal partons during hadronization, indicating a natural continuation of the interaction of jet and medium during this stage. The observed effects grow with the size of the medium. Collective flow easily transfers from the thermal partons onto the emerging jet hadrons. We also see a significant change in hadron chemistry as expected in the presence of quark recombination processes.

Submitted 27 January, 2025; originally announced January 2025.

Comments: 12 pages, 6 figures

arXiv:2501.15848 [pdf, other]

The Signals of the Doomsday

Authors: Amartya Sengupta, Dejan Stojkovic, De-Chang Dai

Abstract: The measured standard model parameters indicate that we might live in a false Higgs vacuum, possibly with a very long lifetime. However, small black holes can serve as catalysers and significantly speed up the phase transition. In fact, bubbles of true vacuum might already exist in our universe. We calculate the spectrum of Higgs particles produced by such a bubble, and use event generators to study their decay and subsequent evolution of the decay products to obtain the spectrum of emitted photons and neutrinos as a long-range signature. If the propagation of the bubble walls slows down due to interaction with the surrounding matter and plasma, these signals can reach us before the bubble wall hits us, thus representing the signals of the doomsday.

Submitted 10 March, 2025; v1 submitted 27 January, 2025; originally announced January 2025.

Comments: References added, Clarifying comments added, 12 Pages, 5 figures

arXiv:2501.15296 [pdf, other]

You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning

Authors: Ayan Sengupta, Siddhant Chaudhary, Tanmoy Chakraborty

Abstract: The ever-increasing size of large language models (LLMs) presents significant challenges for deployment due to their heavy computational and memory requirements. Current model pruning techniques attempt to alleviate these issues by relying heavily on external calibration datasets to determine which parameters to prune or compress, thus limiting their flexibility and scalability across different compression ratios. Moreover, these methods often cause severe performance degradation, particularly in downstream tasks, when subjected to higher compression rates. In this paper, we propose PruneNet, a novel model compression method that addresses these limitations by reformulating model pruning as a policy learning process. PruneNet decouples the pruning process from the model architecture, eliminating the need for calibration datasets. It learns a stochastic pruning policy to assess parameter importance solely based on intrinsic model properties while preserving the spectral structure to minimize information loss. PruneNet can compress the LLaMA-2-7B model in just 15 minutes, achieving over 80% retention of its zero-shot performance with a 30% compression ratio, outperforming existing methods that retain only 75% performance. Furthermore, on complex multitask language understanding tasks, PruneNet demonstrates its robustness by preserving up to 80% performance of the original model, proving itself a superior alternative to conventional structured compression techniques.

Submitted 28 February, 2025; v1 submitted 25 January, 2025; originally announced January 2025.

arXiv:2501.01495 [pdf, other]

doi 10.3847/1538-4357/adb3a0

Search for continuous gravitational waves from known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, et al. (1794 additional authors not shown)

Abstract: Continuous gravitational wave (CW) emission from neutron stars carries information about their internal structure and equation of state, and it can provide tests of General Relativity. We present a search for CWs from a set of 45 known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run, known as O4a. We conducted a targeted search for each pulsar using three independent analysis methods considering the single-harmonic and the dual-harmonic emission models. We find no evidence of a CW signal in O4a data for either model and set upper limits on the signal amplitude and on the ellipticity, which quantifies the asymmetry in the neutron star mass distribution. For the single-harmonic emission model, 29 targets have the upper limit on the amplitude below the theoretical spin-down limit. The lowest upper limit on the amplitude is $6.4\!\times\!10^{-27}$ for the young energetic pulsar J0537-6910, while the lowest constraint on the ellipticity is $8.8\!\times\!10^{-9}$ for the bright nearby millisecond pulsar J0437-4715. Additionally, for a subset of 16 targets we performed a narrowband search that is more robust regarding the emission model, with no evidence of a signal. We also found no evidence of non-standard polarizations as predicted by the Brans-Dicke theory.

Submitted 2 January, 2025; originally announced January 2025.

Comments: main paper: 12 pages, 6 figures, 4 tables

Report number: LIGO-P2400315

Journal ref: Astrophys.J. 983 (2025) 2, 99

arXiv:2412.19738 [pdf, other]

Hard Photon Triggered Jets in $p$-$p$ and $A$-$A$ Collisions

Authors: C. Sirimanna, Y. Tachibana, A. Majumder, A. Angerami, R. Arora, S. A. Bass, Y. Chen, R. Datta, L. Du, R. Ehlers, H. Elfner, R. J. Fries, C. Gale, Y. He, B. V. Jacak, P. M. Jacobs, S. Jeon, Y. Ji, F. Jonas, L. Kasper, M. Kordell II, A. Kumar, R. Kunnawalkam-Elayavalli, J. Latessa, Y.-J. Lee, et al. (27 additional authors not shown)

Abstract: An investigation of high transverse momentum (high-$p_T$) photon triggered jets in proton-proton ($p$-$p$) and ion-ion ($A$-$A$) collisions at $\sqrt{s_{NN}} = 0.2$ and $5.02~\mathrm{TeV}$ is carried out, using the multistage description of in-medium jet evolution. Monte Carlo simulations of hard scattering and energy loss in heavy-ion collisions are performed using parameters tuned in a previous study of the nuclear modification factor ($R_{AA}$) for inclusive jets and high-$p_T$ hadrons. We obtain a good reproduction of the experimental data for photon triggered jet $R_{AA}$, as measured by the ATLAS detector, the distribution of the ratio of jet to photon $p_T$ ($X_{Jγ}$), measured by both CMS and ATLAS, and the photon-jet azimuthal correlation as measured by CMS. We obtain a moderate description of the photon triggered jet $I_{AA}$, as measured by STAR. A noticeable improvement in the comparison is observed when one goes beyond prompt photons and includes bremsstrahlung and decay photons, revealing their significance in certain kinematic regions, particularly at $X_{Jγ} > 1$. Moreover, azimuthal angle correlations demonstrate a notable impact of non-prompt photons on the distribution, emphasizing their role in accurately describing experimental results. This work highlights the success of the multistage model of jet modification to straightforwardly predict (this set of) photon triggered jet observables.
This comparison, along with the role played by non-prompt photons, has important consequences for the inclusion of such observables in a future Bayesian analysis.

Submitted 27 December, 2024; originally announced December 2024.

Comments: 16 pages, 10 figures

arXiv:2412.13012 [pdf, other]

doi 10.1140/epjp/s13360-024-05947-w

Deep Learning Based Superconductivity: Prediction and Experimental Tests

Authors: Daniel Kaplan, Adam Zhang, Joanna Blawat, Rongying Jin, Robert J. Cava, Viktor Oudovenko, Gabriel Kotliar, Anirvan M. Sengupta, Weiwei Xie

Abstract: The discovery of novel superconducting materials is a longstanding challenge in materials science, with a wealth of potential for applications in energy, transportation, and computing. Recent advances in artificial intelligence (AI) have enabled expediting the search for new materials by efficiently utilizing vast materials databases. In this study, we developed an approach based on deep learning (DL) to predict new superconducting materials. We have synthesized a compound derived from our DL network and confirmed its superconducting properties in agreement with our prediction. Our approach is also compared to previous work based on random forests (RFs). In particular, RFs require knowledge of the chemical properties of the compound, while our neural net inputs depend solely on the chemical composition. With the help of hints from our network, we discover a new ternary compound $\textrm{Mo}_{20}\textrm{Re}_{6}\textrm{Si}_{4}$, which becomes superconducting below 5.4 K. We further discuss the existing limitations and challenges associated with using AI for such predictions, along with potential future research directions.

Submitted 17 December, 2024; originally announced December 2024.

Comments: 14 pages + 2 appendices + references. EPJ submission

Journal ref: Eur. Phys. J. Plus (2025) 140:58

arXiv:2412.12019 [pdf, other]

Learning interactions between Rydberg atoms

Authors: Olivier Simard, Anna Dawid, Joseph Tindall, Michel Ferrero, Anirvan M. Sengupta, Antoine Georges

Abstract: Quantum simulators have the potential to solve quantum many-body problems that are beyond the reach of classical computers, especially when they feature long-range entanglement. To fulfill their prospects, quantum simulators must be fully controllable, allowing for precise tuning of the microscopic physical parameters that define their implementation. We consider Rydberg-atom arrays, a promising platform for quantum simulations. Experimental control of such arrays is limited by the imprecision in the optical tweezer positions when assembling the array, hence introducing uncertainties in the simulated Hamiltonian. In this work, we introduce a scalable approach to Hamiltonian learning using graph neural networks (GNNs). We employ the Density Matrix Renormalization Group (DMRG) to generate ground-state snapshots of the transverse field Ising model realized by the array, for many realizations of the Hamiltonian parameters. Correlation functions reconstructed from these snapshots serve as input data to carry out the training. We demonstrate that our GNN model has a remarkable capacity to extrapolate beyond its training domain, both regarding the size and the shape of the system, yielding an accurate determination of the Hamiltonian parameters with a minimal set of measurements. We prove a theorem establishing a bijective correspondence between the correlation functions and the interaction parameters in the Hamiltonian, which provides a theoretical foundation for our learning algorithm.
Our work could open the road to feedback control of the positions of the optical tweezers, hence providing a decisive improvement of analog quantum simulators.

Submitted 16 December, 2024; originally announced December 2024.

Comments: 19 pages, 11 figures

arXiv:2411.17635 [pdf, other]

Variational Dual Solutions of Chern-Simons Theory

Authors: Amit Acharya, Janusz Ginster, Ambar N. Sengupta

Abstract: A scheme for generating weakly lower semi-continuous action functionals corresponding to the Euler-Lagrange equations of Chern-Simons theory is described. Coercivity is deduced for such a functional in appropriate function spaces to prove the existence of a minimizer, which constitutes a solution to the Euler-Lagrange equations of Chern-Simons theory in a relaxed sense. A geometric analysis is also made, especially for the gauge group SU(2), relating connection forms on the bundle to corresponding forms in the dual scheme.

Submitted 26 November, 2024; originally announced November 2024.

arXiv:2411.14063 [pdf, other]

doi 10.1103/PhysRevD.111.083041

Probing dark matter halo profiles with multi-band observations of gravitational waves

Authors: Divya Tahelyani, Arpan Bhattacharyya, Anand S. Sengupta

Abstract: In this paper, we evaluate the potential of multiband gravitational wave observations from a deci-Hz space-based detector and third-generation ground-based gravitational wave detectors to constrain the properties of dark matter spikes around intermediate-mass ratio inspirals. The presence of dark matter influences the orbital evolution of the secondary compact object through dynamic friction, which leads to a phase shift in the gravitational waveform compared to the vacuum case. Our analysis shows that the proposed Indian space-based detector GWSat, operating in the deciHz frequency band, provides the most stringent constraints on the dark matter spike parameters, as IMRIs spend a significant portion of their inspiral phase within its sensitivity range. While third-generation ground-based detectors such as the Einstein Telescope and Cosmic Explorer offer additional constraints, their contribution is somewhat limited, particularly for higher-mass systems where the signal duration in their frequency bands is shorter. However, for systems with detector-frame total masses $M_z < 400 \rm M_{\odot}$, Cosmic Explorer and Einstein Telescope could improve the estimation of the chirp mass, symmetric mass ratio, luminosity distance, and dark matter spike power-law index by more than $15\%$. Nonetheless, their impact on the constraint of spike density is minimal. These results highlight the crucial role of deciHz space-based detectors in probing dark matter interactions with gravitational wave sources.

Submitted 3 April, 2025; v1 submitted 21 November, 2024; originally announced November 2024.

Comments: 15 pages, 5 figures, substantial improvement, added analysis for the dynamic spike profile

Journal ref: Phys. Rev. D 111, 083041 (2025)

arXiv:2411.12266 [pdf, other]

Comparing design and off-design aerodynamic performance of a natural laminar airfoil

Authors: Aditi Sengupta, Abhijeet Guha

Abstract: Natural laminar flow airfoils are essential technologies designed to reduce drag and significantly enhance aerodynamic performance. A notable example is the SHM1 airfoil, created to meet the requirements of the small-business Honda jet. This airfoil has undergone extensive testing across various operational conditions, including low-speed wind tunnel tests and flight tests across a range of Reynolds numbers and free-stream Mach numbers, as detailed in "Natural-laminar-flow airfoil development for a lightweight business jet" by Fujino et al., J. Aircraft, 40(4), 2003. Additionally, investigations into drag-divergence behavior have been conducted using a transonic wind tunnel, with subsequent studies focusing on transonic shock boundary layer interactions through both experimental and numerical approaches. This study employs a series of numerical simulations to analyze the flow physics and aerodynamic performance across different free-stream Mach numbers in the subsonic and transonic regimes. This is achieved by examining computed instantaneous numerical Schlieren for various design conditions (such as low speed, climb, and cruise) and off-design scenarios (including transonic shock emergence, drag-divergence, and shock-induced separation). The dominant time scales, the time-averaged load distributions and boundary layer parameters are compared to provide a comprehensive overview of the SHM1's aerodynamics, establishing benchmark results for optimization of various flow separation and shock control techniques.

Submitted 19 November, 2024; originally announced November 2024.

arXiv:2411.12242 [pdf, other]

Effect of Gaussian wake amplitude on wake-induced transition for a T106A low pressure turbine cascade

Authors: Aditi Sengupta

Abstract: The wake-induced transition on the suction surface of a T106A low-pressure turbine (LPT) blade is investigated through a series of implicit large eddy simulations, solving the two-dimensional (2D) compressible Navier-Stokes equations (NSE). The impact of the incoming Gaussian wake amplitude on the blade's profile loss and associated boundary layer parameters is examined, revealing a 50\% reduction in skin friction drag at the highest amplitude. The results indicate that increasing wake amplitude leads to delayed separation and earlier reattachment, resulting in reduced separated flow. The vorticity and enstrophy dynamics during the transition process under varying wake amplitudes reveal characteristic features of wake-induced transition, such as puffs, streaks, and turbulent spots. The periodic passing of wakes induces intermittent "calmed regions", which suppress flow separation and improve profile loss at low Reynolds numbers (Re), typically found in LPTs. The energy budget, accounting for both translational and rotational energy via the turbulent kinetic energy (TKE) and compressible enstrophy transport equation (CETE), respectively, shows trends with increasing wake amplitude. The relative contribution to TKE production and the roles of baroclinicity, compressibility, and viscous terms are explained.

Submitted 19 November, 2024; originally announced November 2024.

Comments: 31 pages, 16 figures

arXiv:2411.04358 [pdf, other]

Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation

Authors: Ayan Sengupta, Vaibhav Seth, Arinjay Pathak, Natraj Raman, Sriram Gopalakrishnan, Tanmoy Chakraborty

Abstract: Large Language Models (LLMs) are highly resource-intensive to fine-tune due to their enormous size. While low-rank adaptation is a prominent parameter-efficient fine-tuning approach, it suffers from sensitivity to hyperparameter choices, leading to instability in model performance on fine-tuning downstream tasks. This paper highlights the importance of effective parameterization in low-rank fine-tuning to reduce estimator variance and enhance the stability of final model outputs. We propose MonteCLoRA, an efficient fine-tuning technique, employing Monte Carlo estimation to learn an unbiased posterior estimation of low-rank parameters with low expected variance, which stabilizes fine-tuned LLMs with only O(1) additional parameters. MonteCLoRA shows significant improvements in accuracy and robustness, achieving up to 3.8% higher accuracy and 8.6% greater robustness than existing efficient fine-tuning methods on natural language understanding tasks with pre-trained RoBERTa-base. Furthermore, in generative tasks with pre-trained LLaMA-1-7B, MonteCLoRA demonstrates robust zero-shot performance with 50% lower variance than the contemporary efficient fine-tuning methods. The theoretical and empirical results presented in the paper underscore how parameterization and hyperpriors balance exploration-exploitation in the low-rank parametric space, thereby leading to more optimal and robust parameter estimation during efficient fine-tuning.

Submitted 8 November, 2024; v1 submitted 6 November, 2024; originally announced November 2024.

Comments: 48 pages, 10 figures, 10 tables, Code: https://github.com/LCS2-IIITD/MonteCLoRA

arXiv:2410.17791 [pdf, other]

Tipping points in fitness landscape of heterogeneous populations

Authors: Sumana Bhattacharyya, Uttam Singh, Anupam Sengupta

Abstract: Predicting fitness of biologically-active populations, communities or systems in fluctuating environments is a long-standing challenge. Phenotypic plasticity and bet-hedging strategy, two key evolutionary traits living systems harness to optimize fitness in dynamic environments, have been widely reported, yet how interplays therein could mediate fitness landscapes of heterogeneous populations remains unknown. Leveraging the financial asset pricing model, here we provide a dynamical framework for fitness of heterogeneous populations, underpinned by the interrelations between sub-populations exhibiting phenotypic plasticity and bet-hedging. Our framework, independent of the definition of fitness, employs a nonlinear difference equation to present fitness dynamics, and captures the emergence of tipping points, marking the onset of critical state transitions which lead to catastrophic shifts. This study identifies limits on the selective advantage conferred by bet-hedging through reduction in the temporal variance of fitness, with far-reaching ramifications on our current understanding of hedging-mediated fitness enhancement of a population. The lower bound of the effective fitness variance is set by a maximum number of bet-hedgers, beyond which the fitness landscape approaches critical transition, as confirmed by critical slowing down in the vicinity of tipping points. We estimate the scaling law for the critical slowing down numerically and derive the characteristic recovery time for heterogeneous populations.
Taken together, our work provides a generic theoretical framework to quantify fitness dynamics and predict critical transitions in heterogeneous populations. The results can be extended further to model fitness landscapes of natural and synthetic multi-species consortia exposed to environmental fluctuations mimicking climatic shifts and immunopathological settings.

Submitted 23 October, 2024; originally announced October 2024.

Comments: 34 pages, 7 figures

arXiv:2410.16565 [pdf, other]

doi 10.3847/1538-4357/adc681

Search for gravitational waves emitted from SN 2023ixf

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, A. Allocca , et al. (1758 additional authors not shown)

Abstract: We present the results of a search for gravitational-wave transients associated with core-collapse supernova SN 2023ixf, which was observed in the galaxy Messier 101 via optical emission on 2023 May 19th, during the LIGO-Virgo-KAGRA 15th Engineering Run. We define a five-day on-source window during which an accompanying gravitational-wave signal may have occurred. No gravitational waves have been identified in data when at least two gravitational-wave observatories were operating, which covered $\sim 14\%$ of this five-day window. We report the search detection efficiency for various possible gravitational-wave emission models. Considering the distance to M101 (6.7 Mpc), we derive constraints on the gravitational-wave emission mechanism of core-collapse supernovae across a broad frequency spectrum, ranging from 50 Hz to 2 kHz where we assume the gravitational-wave emission occurred when coincident data are available in the on-source window. Considering an ellipsoid model for a rotating proto-neutron star, our search is sensitive to gravitational-wave energy $1 \times 10^{-4} M_{\odot} c^2$ and luminosity $2.6 \times 10^{-4} M_{\odot} c^2/s$ for a source emitting at 82 Hz. These constraints are around an order of magnitude more stringent than those obtained so far with gravitational-wave data. The constraint on the ellipticity of the proto-neutron star that is formed is as low as 1.08, at frequencies above 1200 Hz, surpassing past results.

Submitted 11 March, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

Comments: Main paper: 6 pages, 4 figures and 1 table. Total with appendices: 20 pages, 4 figures, and 1 table

Report number: LIGO-P2400125

Journal ref: ApJ 985 183 (2025)

arXiv:2410.09151 [pdf, other]

doi 10.3847/1538-4357/ad8de0

A search using GEO600 for gravitational waves coincident with fast radio bursts from SGR 1935+2154

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné , et al. (1758 additional authors not shown)

Abstract: The magnetar SGR 1935+2154 is the only known Galactic source of fast radio bursts (FRBs). FRBs from SGR 1935+2154 were first detected by CHIME/FRB and STARE2 in 2020 April, after the conclusion of the LIGO, Virgo, and KAGRA Collaborations' O3 observing run. Here we analyze four periods of gravitational wave (GW) data from the GEO600 detector coincident with four periods of FRB activity detected by CHIME/FRB, as well as X-ray glitches and X-ray bursts detected by NICER and NuSTAR close to the time of one of the FRBs. We do not detect any significant GW emission from any of the events. Instead, using a short-duration GW search (for bursts $\leq$ 1 s) we derive 50\% (90\%) upper limits of $10^{48}$ ($10^{49}$) erg for GWs at 300 Hz and $10^{49}$ ($10^{50}$) erg at 2 kHz, and constrain the GW-to-radio energy ratio to $\leq 10^{14} - 10^{16}$. We also derive upper limits from a long-duration search for bursts with durations between 1 and 10 s. These represent the strictest upper limits on concurrent GW emission from FRBs.

Submitted 21 May, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

Comments: 15 pages of text including references, 4 figures, 5 tables

Report number: LIGO-P2400192

Journal ref: ApJ 977 255 (2024)

arXiv:2409.14288 [pdf, other]

doi 10.1103/PhysRevD.111.042009

Accelerated parameter estimation of supermassive black hole binaries in LISA using a meshfree approximation

Authors: Abhishek Sharma, Anand S. Sengupta, Suvodip Mukherjee

Abstract: The Laser Interferometer Space Antenna (LISA) will be capable of detecting gravitational waves (GWs) in the milli-Hertz band. Among various sources, LISA will detect the coalescence of supermassive black hole binaries (SMBHBs). Accurate and rapid inference of parameters for such sources will be important for potential electromagnetic follow-up efforts. Rapid Bayesian inference with LISA includes additional complexities as compared to current generation terrestrial detectors in terms of time and frequency dependent antenna response functions. In this work, we extend a recently developed, computationally efficient technique that uses meshfree interpolation methods to accelerate Bayesian reconstruction of compact binaries. Originally developed for second-generation terrestrial detectors, this technique is now adapted for LISA parameter estimation. Using the full inspiral, merger, and ringdown waveform (PhenomD) and assuming a rigid adiabatic antenna response function, we show faithful inference of SMBHB parameters from GW signals embedded in stationary, Gaussian instrumental noise. We discuss the computational cost and performance of the meshfree approximation method in estimating the GW source parameters.

Submitted 22 February, 2025; v1 submitted 21 September, 2024; originally announced September 2024.

Comments: 18 pages, 6 figures

Journal ref: Physical Review D (Vol. 111, pages 042009, numpages 16), 2025

arXiv:2409.11605 [pdf]

Harnessing AI data-driven global weather models for climate attribution: An analysis of the 2017 Oroville Dam extreme atmospheric river

Authors: Jorge Baño-Medina, Agniv Sengupta, Allison Michaelis, Luca Delle Monache, Julie Kalansky, Duncan Watson-Parris

Abstract: AI data-driven models (Graphcast, Pangu Weather, Fourcastnet, and SFNO) are explored for storyline-based climate attribution due to their short inference times, which can accelerate the number of events studied, and provide real time attributions when public attention is heightened. The analysis is framed on the extreme atmospheric river episode of February 2017 that contributed to the Oroville dam spillway incident in Northern California. Past and future simulations are generated by perturbing the initial conditions with the pre-industrial and the late-21st century temperature climate change signals, respectively. The simulations are compared to results from a dynamical model which represents plausible pseudo-realities under both climate environments. Overall, the AI models show promising results, projecting a 5-6% increase in the integrated water vapor over the Oroville dam in the present day compared to the pre-industrial, in agreement with the dynamical model. Different geopotential-moisture-temperature dependencies are unveiled for each of the AI models tested, providing valuable information for understanding the physicality of the attribution response. However, the AI models tend to simulate weaker attribution values than the pseudo-reality imagined by the dynamical model, suggesting some reduced extrapolation skill, especially for the late-21st century regime.
Large ensembles generated with an AI model (>500 members) produced statistically significant present-day to pre-industrial attribution results, unlike the >20-member ensemble from the dynamical model. This analysis highlights the potential of AI models to conduct attribution analysis, while emphasizing future lines of work on explainable artificial intelligence to gain confidence in these tools, which can enable reliable attribution studies in real-time.

Submitted 17 September, 2024; originally announced September 2024.

Comments: This work has been submitted to Artificial Intelligence for the Earth Systems

arXiv:2408.14470 [pdf, ps, other]

Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models

Authors: Aradhye Agarwal, Suhas K Ramesh, Ayan Sengupta, Tanmoy Chakraborty

Abstract: Fine-tuning large language models (LLMs) on downstream tasks requires substantial computational resources. Selective PEFT, a class of parameter-efficient fine-tuning (PEFT) methodologies, aims to mitigate these computational challenges by selectively fine-tuning only a small fraction of the model parameters. Although parameter-efficient, these techniques often fail to match the performance of fully fine-tuned models, primarily due to inherent biases introduced during parameter selection. Traditional selective PEFT techniques use a fixed set of parameters selected using different importance heuristics, failing to capture parameter importance dynamically and often leading to suboptimal performance. We introduce $\text{ID}^3$, a novel selective PEFT method that calculates parameter importance continually, and dynamically unmasks parameters by balancing exploration and exploitation in parameter selection. Our empirical study on 16 tasks spanning natural language understanding, mathematical reasoning and summarization demonstrates the effectiveness of our method compared to fixed-masking selective PEFT techniques. We analytically show that $\text{ID}^3$ reduces the number of gradient updates by a factor of two, enhancing computational efficiency. Since $\text{ID}^3$ is robust to random initialization of neurons and operates directly on the optimization process, it is highly flexible and can be integrated with existing additive and reparametrization-based PEFT techniques such as adapters and LoRA respectively.

Submitted 23 June, 2025; v1 submitted 26 August, 2024; originally announced August 2024.

Comments: 15 pages, 7 tables, 9 figures

arXiv:2408.08247 [pdf, other]

Bayesian Inference analysis of jet quenching using inclusive jet and hadron suppression measurements

Authors: R. Ehlers, Y. Chen, J. Mulligan, Y. Ji, A. Kumar, S. Mak, P. M. Jacobs, A. Majumder, A. Angerami, R. Arora, S. A. Bass, R. Datta, L. Du, H. Elfner, R. J. Fries, C. Gale, Y. He, B. V. Jacak, S. Jeon, F. Jonas, L. Kasper, M. Kordell II, R. Kunnawalkam-Elayavalli, J. Latessa, Y. -J. Lee , et al. (28 additional authors not shown)

Abstract: The JETSCAPE Collaboration reports a new determination of the jet transport parameter $\hat{q}$ in the Quark-Gluon Plasma (QGP) using Bayesian Inference, incorporating all available inclusive hadron and jet yield suppression data measured in heavy-ion collisions at RHIC and the LHC. This multi-observable analysis extends the previously published JETSCAPE Bayesian Inference determination of $\hat{q}$, which was based solely on a selection of inclusive hadron suppression data. JETSCAPE is a modular framework incorporating detailed dynamical models of QGP formation and evolution, and jet propagation and interaction in the QGP. Virtuality-dependent partonic energy loss in the QGP is modeled as a thermalized weakly-coupled plasma, with parameters determined from Bayesian calibration using soft-sector observables. This Bayesian calibration of $\hat{q}$ utilizes Active Learning, a machine-learning approach, for efficient exploitation of computing resources. The experimental data included in this analysis span a broad range in collision energy and centrality, and in transverse momentum. In order to explore the systematic dependence of the extracted parameter posterior distributions, several different calibrations are reported, based on combined jet and hadron data; on jet or hadron data separately; and on restricted kinematic or centrality ranges of the jet and hadron data. Tension is observed in comparison of these variations, providing new insights into the physics of jet transport in the QGP and its theoretical formulation.

Submitted 28 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

Comments: 20 pages, 10 figures, 2 tables, submitted to PRC; updated acknowledgements

arXiv:2407.20979 [pdf, other]

doi 10.3390/microorganisms13030637

Substrate stiffness modulates bacterial adhesion and diversity of adherent phenotypes across growth stages

Authors: René Riedel, Garima Rani, Anupam Sengupta

Abstract: Surface adhesion and stiffness of underlying substrates mediate geometry, mechanics and self-organization of bacterial colonies. Recent studies have qualitatively indicated that stiffness may impact bacterial attachment, yet the variation of cell-to-surface adhesion with substrate stiffness remains to be quantified. Here, by developing a cell-level Force Distance Spectroscopy (FDS) technique based on Atomic Force Microscopy (AFM), we simultaneously quantify the cell-surface adhesion alongside stiffness of the underlying substrates to reveal stiffness-dependent adhesion in the phototrophic bacterium Chromatium okenii. As stiffness of the soft substrate, modelled via a low-melting-point (LMP) agarose pad, was varied between 20 kPa and 120 kPa by changing agarose concentrations, we observe a progressive increase of the mean adhesion force by over an order of magnitude, from 0.21 (+/-0.10) nN to 2.42 (+/-1.16) nN. In contrast, passive polystyrene (PS) microparticles of comparable dimensions showed no perceptible change in their surface adhesion. Furthermore, for Escherichia coli, the cell-surface adhesion varied between 0.29 (+/-0.17) nN and 0.39 (+/-0.20) nN, showing a weak dependence on the substrate stiffness, thus suggesting that the stiffness-modulated adhesion is a species-specific trait. Finally, by quantifying the adhesion of C. okenii populations across growth stages, we report an emergent co-existence of weakly and strongly adherent sub-populations, demonstrating a diversification of adherent phenotypes over time.
Taken together, these findings suggest that bacteria, depending on the species and their physiological stage, actively modulate cell-to-surface adhesion in response to substrate stiffness, and leverage it as a functional trait to modulate initial attachment and colonization on soft substrates during early stages of biofilm development.

Submitted 30 July, 2024; originally announced July 2024.

Comments: 37 pages, 10 figures

Journal ref: Microorganisms 2025

arXiv:2407.17443 [pdf, other]

A soft-hard framework with exact four momentum conservation for small systems

Authors: I. Soudi, W. Zhao, A. Majumder, C. Shen, J. H. Putschke, B. Boudreaux, A. Angerami, R. Arora, S. A. Bass, Y. Chen, R. Datta, L. Du, R. Ehlers, H. Elfner, R. J. Fries, C. Gale, Y. He, B. V. Jacak, P. M. Jacobs, S. Jeon, Y. Ji, L. Kasper, M. Kelsey, M. Kordell II, A. Kumar , et al. (28 additional authors not shown)

Abstract: A new framework, called x-scape, for the combined study of both hard and soft transverse momentum sectors in high energy proton-proton ($p$-$p$) and proton-nucleus ($p$-$A$) collisions is set up. A dynamical initial state is set up using the 3d-Glauber model with transverse locations of hotspots within each incoming nucleon. A hard scattering that emanates from two colliding hotspots is carried out using the Pythia generator. Initial state radiation from the incoming hard partons is carried out in a new module called I-matter, which includes the longitudinal location of initial splits. The energy-momentum of both the initial hard partons and their associated beam remnants is removed from the hot spots, depleting the energy-momentum available for the formation of the bulk medium. Outgoing showers are simulated using the matter generator, and results are presented for both cases, allowing for and not allowing for energy loss. First comparisons between this hard-soft model and single inclusive hadron and jet data from $p$-$p$ and minimum bias $p$-$Pb$ collisions are presented. Single hadron spectra in $p$-$p$ are used to carry out a limited (in number of parameters) Bayesian calibration of the model. Fair comparisons with data are indicative of the utility of this new framework. Theoretical studies of the correlation between jet $p_T$ and event activity at mid and forward rapidity are carried out.

Submitted 24 July, 2024; originally announced July 2024.

Comments: 18 pages, 15 figures

arXiv:2407.16758 [pdf, ps, other]

doi 10.1007/JHEP10(2024)021

Tadpole conjecture in non-geometric backgrounds

Authors: Katrin Becker, Nathan Brady, Mariana Graña, Miguel Morros, Anindya Sengupta, Qi You

Abstract: Calabi-Yau compactifications have typically a large number of complex structure and/or Kähler moduli that have to be stabilised in phenomenologically-relevant vacua. The former can in principle be done by fluxes in type IIB solutions. However, the tadpole conjecture proposes that the number of stabilised moduli can at most grow linearly with the tadpole charge of the fluxes required for stabilisation. We scrutinise this conjecture in the $2^6$ Gepner model: a non-geometric background mirror dual to a rigid Calabi-Yau manifold, in the deep interior of moduli space. By constructing an extensive set of supersymmetric Minkowski flux solutions, we spectacularly confirm the linear growth, while achieving a slightly higher ratio of stabilised moduli to flux charge than the conjectured upper bound. As a byproduct, we obtain for the first time a set of solutions within the tadpole bound where all complex structure moduli are massive. Since the $2^6$ model has no Kähler moduli, these show that the massless Minkowski conjecture does not hold beyond supergravity.

Submitted 23 May, 2025; v1 submitted 23 July, 2024; originally announced July 2024.

Comments: 33 pages, 2 figures. Bibliography contains a GitHub link to the accompanying codes and dataset. v2: We corrected a crucial factor of 2 in comparing to the tadpole conjecture (Ref. [13]), which uses a different convention for the flux charge

Journal ref: J. High Energ. Phys. 2024, 21 (2024)

arXiv:2407.16756 [pdf, other]

doi 10.1007/JHEP10(2024)095

Fully stabilized Minkowski vacua in the $2^6$ Landau-Ginzburg model

Authors: Muthusamy Rajaguru, Anindya Sengupta, Timm Wrase

Abstract: We study moduli stabilization via fluxes in the $2^6$ Landau-Ginzburg model. Fluxes not only give masses to scalar fields but can also induce higher order couplings that stabilize massless fields. We investigate this for several different flux choices in the $2^6$ model and find two examples that are inconsistent with the Refined Tadpole Conjecture. We also present, to our knowledge, the first 4d $\mathcal{N}=1$ Minkowski solution in string theory without any flat direction.

Submitted 26 May, 2025; v1 submitted 23 July, 2024; originally announced July 2024.

Comments: 33 pages, 1 figure; v2: corrected factor of 2 with respect to tadpole conjecture

arXiv:2407.13617 [pdf, other]

Fast Scrambling at the Boundary

Authors: Ancel Larzul, Anirvan M. Sengupta, Antoine Georges, Marco Schirò

Abstract: Many-body systems which saturate the quantum bound on chaos are attracting interest across a wide range of fields. Notable examples include the Sachdev-Ye-Kitaev model and its variations, all characterised by some form of randomness and all-to-all couplings. Here we study many-body quantum chaos in a quantum impurity model showing Non-Fermi-Liquid physics, the overscreened multichannel $SU(N)$ Kondo model. We compute exactly the low-temperature behavior of the out-of-time-order correlator in the limit of large $N$ and large number of channels $K$, at fixed ratio $γ=K/N$. Due to strong correlations at the impurity site, the spin fractionalizes into auxiliary fermions and bosons. We show that all the degrees of freedom of our theory acquire a Lyapunov exponent which is linear in temperature as $T\rightarrow 0$, with a prefactor that depends on $γ$. Remarkably, for $N=K$ the impurity spin displays maximal chaos, while bosons and fermions only get up to half of the maximal Lyapunov exponent. Our results highlight two new features: a non-disordered model which is maximally chaotic due to strong correlations at its boundary, and a fractionalization of quantum chaos.

Submitted 18 July, 2024; originally announced July 2024.

Comments: 16 pages, 7 figures

Showing 1–50 of 503 results for author: Sengupta, A