-
QuArch: A Benchmark for Evaluating LLM Reasoning in Computer Architecture
Authors:
Shvetank Prakash,
Andrew Cheng,
Arya Tschand,
Mark Mazumder,
Varun Gohil,
Jeffrey Ma,
Jason Yik,
Zishen Wan,
Jessica Quaye,
Elisavet Lydia Alvanaki,
Avinash Kumar,
Chandrashis Mazumdar,
Tuhin Khare,
Alexander Ingare,
Ikechukwu Uchendu,
Radhika Ghosal,
Abhishek Tyagi,
Chenyu Wang,
Andrea Mattia Garavagno,
Sarah Gu,
Alice Guo,
Grace Hur,
Luca Carloni,
Tushar Krishna,
Ankita Nayak
, et al. (2 additional authors not shown)
Abstract:
The field of computer architecture, which bridges high-level software abstractions and low-level hardware implementations, remains absent from current large language model (LLM) evaluations. To this end, we present QuArch (pronounced 'quark'), the first benchmark designed to facilitate the development and evaluation of LLM knowledge and reasoning capabilities specifically in computer architecture. QuArch provides a comprehensive collection of 2,671 expert-validated question-answer (QA) pairs covering various aspects of computer architecture, including processor design, memory systems, and interconnection networks. Our evaluation reveals that while frontier models possess domain-specific knowledge, they struggle with skills that require higher-order thinking in computer architecture. Frontier model accuracies vary widely (from 34% to 72%) on these advanced questions, highlighting persistent gaps in architectural reasoning across analysis, design, and implementation QAs. By holistically assessing fundamental skills, QuArch provides a foundation for building and measuring LLM capabilities that can accelerate innovation in computing systems. With over 140 contributors from 40 institutions, this benchmark represents a community effort to set the standard for architectural reasoning in LLM evaluation.
Submitted 24 October, 2025;
originally announced October 2025.
-
Sublogarithmic Distillation in all Prime Dimensions using Punctured Reed-Muller Codes
Authors:
Tanay Saha,
Shiroman Prakash
Abstract:
Magic state distillation is a leading but costly approach to fault-tolerant quantum computation, and it is important to explore all possible ways of minimizing its overhead cost. The number of ancillae required to produce a magic state within a target error rate $ε$ is $O(\log^γ (ε^{-1}))$ where $γ$ is known as the yield parameter. Hastings and Haah derived a family of distillation protocols with sublogarithmic overhead (i.e., $γ< 1$) based on punctured Reed-Muller codes. Building on work by Campbell \textit{et al.} and Krishna-Tillich, which suggests that qudits of dimension $p>2$ can significantly reduce overhead, we generalize their construction to qudits of arbitrary prime dimension $p$. We find that, in an analytically tractable puncturing scheme, the number of qudits required to achieve sublogarithmic overhead decreases drastically as $p$ increases, and the asymptotic yield parameter approaches $\frac{1}{\ln p}$ as $p \to \infty$. We also perform a small computational search for optimal puncture locations, which results in several interesting triorthogonal codes, including a $[[519,106,5]]_5$ code with $γ=0.99$.
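To make the scaling in the abstract concrete, the short sketch below evaluates the overhead expression $\log^γ(ε^{-1})$ (ignoring the constant hidden by the $O(\cdot)$) and the limiting expression $\frac{1}{\ln p}$ quoted for the yield parameter. The function names are hypothetical, and evaluating $\frac{1}{\ln p}$ at small primes only illustrates the limiting formula; it does not reproduce the paper's finite-size codes.

```python
import math

def overhead_scaling(eps: float, gamma: float) -> float:
    """Leading-order ancilla scaling log^gamma(1/eps), constants dropped."""
    return math.log(1.0 / eps) ** gamma

def limiting_yield_parameter(p: int) -> float:
    """The limiting expression 1/ln(p) quoted in the abstract (an asymptotic form)."""
    return 1.0 / math.log(p)

if __name__ == "__main__":
    for gamma in (1.0, 0.99, 0.5):
        print(f"gamma={gamma}: scaling at eps=1e-10 is {overhead_scaling(1e-10, gamma):.2f}")
    for p in (3, 5, 7, 11):
        print(f"p={p}: 1/ln(p) = {limiting_yield_parameter(p):.3f}")
```

Since $\frac{1}{\ln p} < 1$ for every prime $p \geq 3$, the limiting expression is consistent with sublogarithmic overhead, and it shrinks slowly as $p$ grows.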
Submitted 12 October, 2025;
originally announced October 2025.
-
Spectral nature of Sco X-1 observed using the X-ray SPECtroscopy and Timing (XSPECT) payload on-board XPoSat
Authors:
V. P. Shyam Prakash,
Vivek K. Agrawal,
Rwitika Chatterjee,
Radhakrishna Vatedka,
Koushal Vadodariya,
A. M. Vinodkumar
Abstract:
Scorpius X-1 is the brightest and first discovered X-ray source in the sky. Studying this source in the low-energy band has been challenging in the past due to its high brightness. However, with the X-ray SPECtroscopy and Timing (XSPECT) payload on-board India's first X-ray Polarimetry Satellite (XPoSat), we have the capability to study the source despite its very high brightness, thanks to the fast (1 ms) readout of the instrument. We investigate the evolution of the spectral and timing properties of Sco X-1 across the horizontal, normal, and flaring branches, as observed with XSPECT. We examine changes in the spectral parameters as a function of position on the color-color diagram (CCD). Spectral studies indicate that the soft X-ray emission can be modeled using a multicolor disk component, with the inner disk temperature ranging from 0.6 to 0.8 keV. The hard component is described by a Comptonized continuum using either the nthComp or Comptb model with electron temperatures from 2.4 to 4.7 keV and optical depth between 5 and 14. Additionally, we observe the presence of an iron K-alpha line at 6.6 keV and an iron K-beta line at 7.6 keV. Both spectral models suggest a steep rise in Comptonization flux as well as disk flux in the flaring branch. Increases in the neutron star blackbody temperature and the inner disk temperature are also observed during flaring. The Z-track is driven by changes in the optical depth of the corona, the Comptonization flux, the disk flux, and the inner disk temperature. No quasi-periodic oscillations are detected in any branch, suggesting their association with the high-energy spectrum.
Submitted 3 October, 2025;
originally announced October 2025.
-
Judging by Appearances? Auditing and Intervening Vision-Language Models for Bail Prediction
Authors:
Sagnik Basu,
Shubham Prakash,
Ashish Maruti Barge,
Siddharth D Jaiswal,
Abhisek Dash,
Saptarshi Ghosh,
Animesh Mukherjee
Abstract:
Large language models (LLMs) have been extensively used for legal judgment prediction tasks based on case reports and crime history. However, with a surge in the availability of large vision language models (VLMs), legal judgment prediction systems can now be made to leverage the images of the criminals in addition to the textual case reports/crime history. Applications built in this way could lead to inadvertent consequences and be used with malicious intent. In this work, we run an audit to investigate the efficiency of standalone VLMs in the bail decision prediction task. We observe that the performance is poor across multiple intersectional groups and models \textit{wrongly deny bail to deserving individuals with very high confidence}. We design different intervention algorithms by first including legal precedents through a RAG pipeline and then fine-tuning the VLMs using innovative schemes. We demonstrate that these interventions substantially improve the performance of bail prediction. Our work paves the way for the design of smarter interventions on VLMs in the future, before they can be deployed for real-world legal judgment prediction.
Submitted 30 September, 2025;
originally announced October 2025.
-
AQUA: Attention via QUery mAgnitudes for Memory and Compute Efficient Inference in LLMs
Authors:
Santhosh G S,
Saurav Prakash,
Balaraman Ravindran
Abstract:
The quadratic complexity of the attention mechanism remains a fundamental barrier to scaling Large Language Models (LLMs) to longer contexts, creating a critical bottleneck in both computation and memory. To address this, we introduce AQUA (Attention via QUery mAgnitudes), a novel and versatile approximation strategy that significantly reduces the cost of attention with a graceful performance trade-off. Our method operates in two phases: an efficient offline step where we compute a universal, language-agnostic projection matrix via SVD on a calibration dataset, and an online inference step where we project query and key vectors and dynamically select a sparse subset of dimensions based on the query's magnitude. We provide a formal theoretical analysis of AQUA, establishing the break-even point at which it becomes more computationally efficient than standard attention. Our empirical evaluations on state-of-the-art models like Llama-3.1-8B demonstrate that a 25% reduction in the attention dot-product computation can be achieved with a statistically insignificant impact on performance across a wide range of benchmarks. We further showcase the versatility of AQUA by demonstrating its ability to synergistically accelerate existing token eviction methods like H2O and to directly reduce KV-cache memory size. By offering a controllable knob to balance efficiency and accuracy, AQUA provides a practical and powerful tool for making large-scale LLM inference more accessible and sustainable.
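The two phases described in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names `fit_projection` and `approx_scores`, the `keep` parameter, and the use of the right singular vectors of a calibration matrix as the projection are all hypothetical choices made for the example.

```python
import numpy as np

def fit_projection(calib: np.ndarray) -> np.ndarray:
    """Offline step (assumed form): orthogonal basis from the SVD of
    calibration vectors of shape (n_samples, d), with n_samples >= d."""
    _, _, vt = np.linalg.svd(calib, full_matrices=False)
    return vt.T  # shape (d, d), columns are right singular vectors

def approx_scores(q: np.ndarray, keys: np.ndarray,
                  proj: np.ndarray, keep: int) -> np.ndarray:
    """Online step (assumed form): project the query and keys, keep only the
    `keep` dimensions where the projected query has largest magnitude, and
    compute the dot products on that subset."""
    q_p = q @ proj          # projected query, shape (d,)
    k_p = keys @ proj       # projected keys, shape (n_keys, d)
    idx = np.argsort(-np.abs(q_p))[:keep]
    return k_p[:, idx] @ q_p[idx]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    proj = fit_projection(rng.normal(size=(64, d)))
    q, keys = rng.normal(size=d), rng.normal(size=(5, d))
    exact = keys @ q
    approx = approx_scores(q, keys, proj, keep=6)  # 25% fewer dimensions
    print("max abs error:", np.max(np.abs(exact - approx)))
```

Because the projection in this sketch is orthogonal, keeping all dimensions reproduces the exact dot products; lowering `keep` is the knob that trades accuracy for fewer multiply-adds.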
Submitted 14 September, 2025;
originally announced September 2025.
-
Lifetime-Aware Design of Item-Level Intelligence
Authors:
Shvetank Prakash,
Andrew Cheng,
Olof Kindgren,
Ashiq Ahamed,
Graham Knight,
Jed Kufel,
Francisco Rodriguez,
Arya Tschand,
David Kong,
Mariam Elgamal,
Jerry Huang,
Emma Chen,
Gage Hills,
Richard Price,
Emre Ozer,
Vijay Janapa Reddi
Abstract:
We present FlexiFlow, a lifetime-aware design framework for item-level intelligence (ILI) where computation is integrated directly into disposable products like food packaging and medical patches. Our framework leverages natively flexible electronics which offer significantly lower costs than silicon but are limited to kHz speeds and several thousands of gates. Our insight is that unlike traditional computing with more uniform deployment patterns, ILI applications exhibit 1000X variation in operational lifetime, fundamentally changing optimal architectural design decisions when considering trillion-item deployment scales. To enable holistic design and optimization, we model the trade-offs between embodied carbon footprint and operational carbon footprint based on application-specific lifetimes. The framework includes: (1) FlexiBench, a workload suite targeting sustainability applications from spoilage detection to health monitoring; (2) FlexiBits, area-optimized RISC-V cores with 1/4/8-bit datapaths achieving 2.65X to 3.50X better energy efficiency per workload execution; and (3) a carbon-aware model that selects optimal architectures based on deployment characteristics. We show that lifetime-aware microarchitectural design can reduce carbon footprint by 1.62X, while algorithmic decisions can reduce carbon footprint by 14.5X. We validate our approach through the first tape-out using a PDK for flexible electronics with fully open-source tools, achieving 30.9kHz operation. FlexiFlow enables exploration of computing at the Extreme Edge where conventional design methodologies must be reevaluated to account for new constraints and considerations.
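The trade-off between embodied and operational carbon can be illustrated with a toy model. The sketch below assumes a simple linear form (total = embodied + operational rate × lifetime); the design names, the numbers, and the functions `total_carbon` and `pick_design` are hypothetical and are not taken from the FlexiFlow framework.

```python
def total_carbon(embodied: float, operational_per_day: float,
                 lifetime_days: float) -> float:
    """Toy model: one-time embodied cost plus a per-day operational cost."""
    return embodied + operational_per_day * lifetime_days

def pick_design(designs: dict, lifetime_days: float) -> str:
    """Return the design name with the lowest total carbon for this lifetime."""
    return min(designs,
               key=lambda name: total_carbon(*designs[name], lifetime_days))

# Hypothetical (embodied, operational-per-day) values in arbitrary units:
# a narrow datapath is assumed cheaper to make but costlier to run.
DESIGNS = {
    "narrow_datapath": (1.0, 0.10),
    "wide_datapath": (3.0, 0.01),
}

if __name__ == "__main__":
    for days in (7, 365):
        print(days, "days ->", pick_design(DESIGNS, days))
```

With these made-up numbers the crossover falls at roughly 22 days, so a week-long deployment favors the low-embodied design while a year-long one favors the low-operational design, which is the kind of lifetime dependence the abstract describes.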
Submitted 9 September, 2025;
originally announced September 2025.
-
Investigation on Structural, Optical, Thermal, and Magnetic Properties of Bismuth Ferrite Nanoparticles Synthesized at Lower Annealing Temperature
Authors:
Naresh Prajapati,
G. Surya Prakash,
Manoj Kumar,
Himanshu Pandey
Abstract:
Due to its multiferroic properties and narrow optical bandgap, bismuth ferrite has been widely explored for spintronics, photovoltaics, and photocatalysis applications. Bismuth ferrite can be synthesized in various forms, such as bulk, thin films, and nanostructures, using various synthesis techniques. It is challenging to synthesize the pure BiFeO3 phase due to the volatile nature of bismuth and the very narrow temperature range for forming this phase. This work therefore aims to synthesize the pure BiFeO3 phase at lower annealing temperatures using an efficient sol-gel method. We have chosen annealing temperatures from 450 to 650 °C, and a detailed analysis of structural and optical properties is performed here. X-ray diffraction is used to confirm the crystalline nature of the material. Single-phase Rietveld analysis of XRD patterns is carried out to study the effect of annealing temperature on structural parameters. All the samples crystallized in the pure rhombohedral BiFeO3 phase with R3c space group symmetry, except those annealed at the higher temperatures of 600 °C and 650 °C. Strain and dislocation densities decreased with increasing annealing temperature. From the UV-visible analysis, a strong response is observed below 600 nm in the visible region, and the band gap estimated from the absorption behaviour is in the range of 2.26 - 2.60 eV for these bismuth ferrite nanoparticles. Fourier transform infrared analysis confirmed the existence of metal-oxygen bonds in the bismuth ferrite nanoparticles. Thermal analysis performed using differential scanning calorimetry showed that these nanoparticles are thermally stable. Vibrating sample magnetometry showed that the bismuth ferrite nanoparticles are weakly magnetic.
Submitted 9 September, 2025;
originally announced September 2025.
-
Extreme magnetic field-boosted superconductivity in a high-temperature superconductor
Authors:
Km Rubi,
King Yau Yip,
Elizabeth Krenkel,
Nurul Fitriyah,
Xing Gao,
Saurav Prakash,
S. Lin Er Chow,
Tsz Fung Poon,
Mun K. Chan,
David Graf,
A. Ariando,
Neil Harrison
Abstract:
Magnetic fields typically suppress superconductivity through Pauli and orbital limiting effects. However, there are rare instances of magnetic-field-induced superconductivity, as seen in Chevrel phase compounds [1], organic conductors [2], uranium-based heavy-fermion systems [3, 4], and moiré graphene [5], though these materials possess inherently low superconducting transition temperatures (Tc). Here, we demonstrate high field-stabilized superconductivity in a class of materials with a significantly higher Tc (up to 40 K): the infinite-layer nickelates [6]. Both low-field and high-field superconducting states can be plausibly explained by a compensation mechanism akin to the Jaccarino-Peter effect. These findings demonstrate the possibility of achieving substantially enhanced upper critical fields in high-temperature superconductors.
Submitted 22 August, 2025;
originally announced August 2025.
-
Federated Nonlinear System Identification
Authors:
Omkar Tupe,
Max Hartman,
Lav R. Varshney,
Saurav Prakash
Abstract:
We consider federated learning of linearly-parameterized nonlinear systems. We establish theoretical guarantees on the effectiveness of federated nonlinear system identification compared to centralized approaches, demonstrating that the convergence rate improves as the number of clients increases. Although the convergence rates in the linear and nonlinear cases differ only by a constant, this constant depends on the feature map $φ$, which can be carefully chosen in the nonlinear setting to increase excitation and improve performance. We experimentally validate our theory in physical settings where client devices are driven by i.i.d. control inputs and control policies exhibiting i.i.d. random perturbations, ensuring non-active exploration. Experiments use trajectories from nonlinear dynamical systems characterized by real-analytic feature functions, including polynomial and trigonometric components, representative of physical systems including pendulum and quadrotor dynamics. We analyze the convergence behavior of the proposed method under varying noise levels and data distributions. Results show that federated learning consistently improves convergence of any individual client as the number of participating clients increases.
Submitted 24 August, 2025; v1 submitted 20 August, 2025;
originally announced August 2025.
-
Stranski-Krastanov Growth of Disordered ScNx Thin Films on MgO(100): Influence of Defect Densities on Electronic Structure and Transport Properties
Authors:
Susmita Chowdhury,
Rachana Gupta,
Najnin Bano,
Yogesh Kumar,
Shashi Prakash,
Dinesh Kumar Shukla,
Vasant G. Sathe,
Mukul Gupta
Abstract:
We report a nascent real-time Stranski-Krastanov growth of reactively sputtered ScNx thin films on MgO(100). The epitaxial growth was limited to 5 nm at a substrate temperature (Ts) of 25 °C, while the self-sustaining epitaxial nature along the [100] azimuth was retained up to 25 nm in the Ts = 250 and 500 °C samples due to enhanced adatom mobility. At Ts = 700 °C, the film showed a half-order in-situ RHEED pattern, with forbidden (hkl) planes indicating an N-deficient hcp Sc-N phase. The presence of defects, i.e., N vacancies and O interstitials, leads to disorder in the ScNx system, with a weak localization effect and the appearance of Raman relaxed first-order transverse and longitudinal optical phonon modes, and further leads to a metal-like Seebeck coefficient. More grain boundaries at Ts = 25 °C and higher N out-diffusion at Ts = 700 °C pave the way for incorporation of more oxygen interstitials in these samples.
Submitted 7 August, 2025;
originally announced August 2025.
-
Zak-OTFS over CP-OFDM
Authors:
Saif Khan Mohammed,
Saurabh Prakash,
Muhammad Ubadah,
Imran Ali Khan,
Ronny Hadani,
Shlomo Rakib,
Shachar Kons,
Yoav Hebron,
Ananthanarayanan Chockalingam,
Robert Calderbank
Abstract:
Zak-Orthogonal Time Frequency Space (Zak-OTFS) modulation has been shown to achieve significantly better performance compared to the standardized Cyclic-Prefix Orthogonal Frequency Division Multiplexing (CP-OFDM), in high delay/Doppler spread scenarios envisaged in next generation communication systems. Zak-OTFS carriers are quasi-periodic pulses in the delay-Doppler (DD) domain, characterized by two parameters: (i) the pulse period along the delay axis ("delay period") (the Doppler period is related to the delay period), and (ii) the pulse shaping filter. An important practical challenge is enabling support for Zak-OTFS modulation in existing CP-OFDM based modems. In this paper, we show that Zak-OTFS modulation with pulse shaping constrained to sinc filtering (filter bandwidth equal to the communication bandwidth $B$) followed by time-windowing with a rectangular window of duration $(T + T_{cp})$ ($T$ is the symbol duration and $T_{cp}$ is the CP duration), can be implemented as a low-complexity precoder over standard CP-OFDM. We also show that the Zak-OTFS de-modulator with matched filtering constrained to sinc filtering (filter bandwidth $B$) followed by rectangular time windowing over duration $T$ can be implemented as a low-complexity post-processing of the CP-OFDM de-modulator output. This proposed "Zak-OTFS over CP-OFDM" architecture enables us to harness the benefits of Zak-OTFS in existing network infrastructure. We also show that the proposed Zak-OTFS over CP-OFDM is a family of modulations, with CP-OFDM being a special case when the delay period takes its minimum possible value equal to the inverse bandwidth, i.e., Zak-OTFS over CP-OFDM with minimum delay period.
Submitted 12 August, 2025; v1 submitted 5 August, 2025;
originally announced August 2025.
-
Multi-Attention Stacked Ensemble for Lung Cancer Detection in CT Scans
Authors:
Uzzal Saha,
Surya Prakash
Abstract:
In this work, we address the challenge of binary lung nodule classification (benign vs malignant) using CT images by proposing a multi-level attention stacked ensemble of deep neural networks. Three pretrained backbones -- EfficientNet V2 S, MobileViT XXS, and DenseNet201 -- are each adapted with a custom classification head tailored to 96 x 96 pixel inputs. A two-stage attention mechanism learns both model-wise and class-wise importance scores from concatenated logits, and a lightweight meta-learner refines the final prediction. To mitigate class imbalance and improve generalization, we employ dynamic focal loss with empirically calculated class weights, MixUp augmentation during training, and test-time augmentation at inference. Experiments on the LIDC-IDRI dataset demonstrate exceptional performance, achieving 98.09 accuracy and 0.9961 AUC, representing a 35 percent reduction in error rate compared to state-of-the-art methods. The model exhibits balanced performance across sensitivity (98.73) and specificity (98.96), with particularly strong results on challenging cases where radiologist disagreement was high. Statistical significance testing confirms the robustness of these improvements across multiple experimental runs. Our approach can serve as a robust, automated aid for radiologists in lung cancer screening.
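Two of the training ingredients named in this abstract, focal loss and MixUp, have well-known textbook forms that can be sketched in a few lines. The code below shows only those standard forms for a single sample; it is not the paper's "dynamic" focal loss or its class-weighting scheme, and the function names and default values are assumptions for illustration.

```python
import math

def focal_loss(p_true: float, alpha: float = 1.0, gamma: float = 2.0) -> float:
    """Standard focal loss for one sample, given the predicted probability
    of the true class: -alpha * (1 - p)^gamma * log(p)."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

def mixup(x_a, x_b, y_a, y_b, lam: float):
    """MixUp: convex combination of two inputs and their one-hot labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y

if __name__ == "__main__":
    # An easy example (p = 0.9) is down-weighted far more than a hard one (p = 0.3).
    for p in (0.9, 0.3):
        print(p, focal_loss(p), focal_loss(p, gamma=0.0))
    print(mixup([0.0, 2.0], [2.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam=0.5))
```

With `gamma=0` and `alpha=1` the focal loss reduces to ordinary cross-entropy, which makes the down-weighting of confidently classified samples easy to see by comparison.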
Submitted 27 July, 2025;
originally announced July 2025.
-
Persistent paramagnons in high-temperature infinite-layer nickelate superconductors
Authors:
Yujie Yan,
Ying Chan,
Xunyang Hong,
S. Lin Er Chow,
Zhaoyang Luo,
Yuehong Li,
Tianren Wang,
Yuetong Wu,
Izabela Biało,
Nurul Fitriyah,
Saurav Prakash,
Xing Gao,
King Yau Yip,
Qiang Gao,
Xiaolin Ren,
Jaewon Choi,
Ganesha Channagowdra,
Jun Okamoto,
Xingjiang Zhou,
Zhihai Zhu,
Liang Si,
Mirian Garcia-Fernandez,
Ke-Jin Zhou,
Hsiao-Yu Huang,
Di-Jing Huang
, et al. (3 additional authors not shown)
Abstract:
The recent discovery of high-temperature superconductivity in hole-doped SmNiO$_2$, exhibiting the record-high transition temperature $T_c$ among infinite-layer (IL) nickelates, has opened a new avenue for exploring design principles of superconductivity. Experimentally determining the electronic structure and magnetic interactions in this new system is crucial to elucidating the mechanism behind the enhanced superconductivity. Here, we report a Ni $L$-edge resonant inelastic x-ray scattering (RIXS) study of superconducting Sm-based IL nickelate thin films Sm$_{1-x-y-z}$Eu$_x$Ca$_y$Sr$_z$NiO$_2$ (SECS). Dispersive paramagnonic excitations are observed in both optimally and overdoped SECS samples, supporting a spin-fluctuation-mediated pairing scenario. However, despite the two-fold enhancement of $T_c$ in the Sm-based nickelates compared to their Pr-based counterparts, the effective exchange coupling strength is reduced by approximately $20\%$. This behavior contrasts with hole-doped cuprates, where magnetic interactions correlate positively with $T_c$, highlighting essential differences in their superconducting mechanisms.
Submitted 24 July, 2025;
originally announced July 2025.
-
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Authors:
Gheorghe Comanici,
Eric Bieber,
Mike Schaekermann,
Ice Pasupat,
Noveen Sachdeva,
Inderjit Dhillon,
Marcel Blistein,
Ori Ram,
Dan Zhang,
Evan Rosen,
Luke Marris,
Sam Petulla,
Colin Gaffney,
Asaf Aharoni,
Nathan Lintz,
Tiago Cardal Pais,
Henrik Jacobsson,
Idan Szpektor,
Nan-Jiang Jiang,
Krishna Haridasan,
Ahmed Omran,
Nikunj Saunshi,
Dara Bahri,
Gaurav Mishra,
Eric Chu
, et al. (3410 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.
Submitted 16 October, 2025; v1 submitted 7 July, 2025;
originally announced July 2025.
-
Artificial Intelligence and Machine Learning in the Development of Vaccines and Immunotherapeutics Yesterday, Today, and Tomorrow
Authors:
Elhoucine Elfatimi,
Yassir Lekbach,
Swayam Prakash,
Lbachir BenMohamed
Abstract:
In the past, the development of vaccines and immunotherapeutics relied heavily on trial-and-error experimentation and extensive in vivo testing, often requiring years of pre-clinical and clinical trials. Today, artificial intelligence (AI) and deep learning (DL) are actively transforming vaccine and immunotherapeutic design, by (i) offering predictive frameworks that support rapid, data-driven decision-making; (ii) increasingly being implemented as time- and resource-efficient strategies that integrate computational models, systems vaccinology, and multi-omics data to better phenotype, differentiate, and classify patient diseases and cancers; predict patients' immune responses; and identify the factors contributing to optimal vaccine and immunotherapeutic protective efficacy; (iii) refining the selection of B- and T-cell antigen/epitope targets to enhance efficacy and durability of immune protection; and (iv) enabling a deeper understanding of immune regulation, immune evasion, immune checkpoints, and regulatory pathways. The future of AI and DL points toward (i) replacing animal preclinical testing of drugs, vaccines, and immunotherapeutics with computational-based models, as recently proposed by the United States FDA; and (ii) enabling real-time in vivo modeling for immunobridging and prediction of protection in clinical trials. This may result in a fast and transformative shift for the development of personal vaccines and immunotherapeutics against infectious pathogens and cancers.
Submitted 13 June, 2025;
originally announced June 2025.
-
The Amazon Nova Family of Models: Technical Report and Model Card
Authors:
Amazon AGI,
Aaron Langford,
Aayush Shah,
Abhanshu Gupta,
Abhimanyu Bhatter,
Abhinav Goyal,
Abhinav Mathur,
Abhinav Mohanty,
Abhishek Kumar,
Abhishek Sethi,
Abi Komma,
Abner Pena,
Achin Jain,
Adam Kunysz,
Adam Opyrchal,
Adarsh Singh,
Aditya Rawal,
Adok Achar Budihal Prasad,
Adrià de Gispert,
Agnika Kumar,
Aishwarya Aryamane,
Ajay Nair,
Akilan M,
Akshaya Iyengar,
Akshaya Vishnu Kudlu Shanbhogue
, et al. (761 additional authors not shown)
Abstract:
We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.
Submitted 17 March, 2025;
originally announced June 2025.
-
SUSY meets SMEFT: Complete one-loop matching of the general MSSM
Authors:
Sabine Kraml,
Andre Lessa,
Suraj Prakash,
Felix Wilsch
Abstract:
We present the complete one-loop matching of the Minimal Supersymmetric Standard Model (MSSM) onto the Standard Model Effective Field Theory (SMEFT), considering the most general case for the MSSM with conserved $R$-parity, which has 124 free parameters. The matching is performed with the Matchete package, which integrates out all superpartners at once with non-degenerate masses, while also retaining the most general flavor structure. Our results include all correlations among the different SMEFT Wilson coefficients that are governed by supersymmetry and thus provide a basis for future systematic and global studies of the MSSM parameter space employing EFT methods. A detailed discussion is provided on the treatment of the Higgs sector and electroweak symmetry breaking, along with the reduction of redundant operators in the EFT Lagrangian to the Warsaw basis. Furthermore, we validate against existing results in the literature and present a minimal phenomenological example. Extensive auxiliary material, including the code utilized for the matching, is available on GitHub.
Submitted 5 June, 2025;
originally announced June 2025.
-
EFT analysis of New Physics at COHERENT with Dirac neutrinos
Authors:
Víctor Bresó-Pla,
Sergio Cruz-Alzaga,
Martín González-Alonso,
Suraj Prakash
Abstract:
We study the sensitivity of COHERENT-like experiments to non-standard contributions within the so-called $ν$WEFT framework. The latter is the most general low-energy effective field theory that includes not only the light SM fields but also additional right-handed Dirac neutrinos. Our analysis includes for the first time flavor-general New Physics effects in neutrino production (pion and muon decays) and neutrino detection (through Coherent Elastic Neutrino-Nucleus Scattering). Despite the generality, the results can be written in compact form and are easy to implement in existing or future analyses using effective nuclear charges. We use current COHERENT data to set constraints on the corresponding effective operators, and we estimate the sensitivity of future measurements.
Submitted 2 May, 2025;
originally announced May 2025.
-
Enhancing User Sequence Modeling through Barlow Twins-based Self-Supervised Learning
Authors:
Yuhan Liu,
Lin Ning,
Neo Wu,
Karan Singhal,
Philip Andrew Mansfield,
Devora Berlowitz,
Sushant Prakash,
Bradley Green
Abstract:
User sequence modeling is crucial for modern large-scale recommendation systems, as it enables the extraction of informative representations of users and items from their historical interactions. These user representations are widely used for a variety of downstream tasks to enhance users' online experience. A key challenge for learning these representations is the lack of labeled training data. While self-supervised learning (SSL) methods have emerged as a promising solution for learning representations from unlabeled data, many existing approaches rely on extensive negative sampling, which can be computationally expensive and may not always be feasible in real-world scenarios. In this work, we propose an adaptation of Barlow Twins, a state-of-the-art SSL method, to user sequence modeling by incorporating suitable augmentation methods. Our approach aims to mitigate the need for large negative sample batches, enabling effective representation learning with smaller batch sizes and limited labeled data. We evaluate our method on the MovieLens-1M, MovieLens-20M, and Yelp datasets, demonstrating that our method consistently outperforms the widely-used dual encoder model across three downstream tasks, achieving an 8%-20% improvement in accuracy. Our findings underscore the effectiveness of our approach in extracting valuable sequence-level information for user modeling, particularly in scenarios where labeled data is scarce and negative examples are limited.
Submitted 1 May, 2025;
originally announced May 2025.
-
Geometric Formulation of Unified Force-Impedance Control on SE(3) for Robotic Manipulators
Authors:
Joohwan Seo,
Nikhil Potu Surya Prakash,
Soomi Lee,
Arvind Kruthiventy,
Megan Teng,
Jongeun Choi,
Roberto Horowitz
Abstract:
In this paper, we present an impedance control framework on the SE(3) manifold, which enables force tracking while guaranteeing passivity. Building upon the unified force-impedance control (UFIC) and our previous work on geometric impedance control (GIC), we develop the geometric unified force-impedance control (GUFIC) to account for the SE(3) manifold structure in the controller formulation using a differential geometric perspective. As in the case of the UFIC, the GUFIC utilizes energy tank augmentation for both force-tracking and impedance control to guarantee the manipulator's passivity relative to external forces. This ensures that the end effector maintains safe contact interaction with uncertain environments and tracks a desired interaction force. Moreover, we resolve a non-causal implementation problem in the UFIC formulation by introducing velocity and force fields. Due to its formulation on SE(3), the proposed GUFIC inherits the desirable SE(3) invariance and equivariance properties of the GIC, which helps increase sample efficiency in machine learning applications where a learning algorithm is incorporated into the control law. The proposed control law is validated in a simulation environment under scenarios requiring tracking an SE(3) trajectory, incorporating both position and orientation, while exerting a force on a surface. The code is available at https://github.com/Joohwan-Seo/GUFIC_mujoco.
Submitted 15 July, 2025; v1 submitted 23 April, 2025;
originally announced April 2025.
-
A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning
Authors:
Shivesh Prakash,
Hans-Arno Jacobsen,
Viki Kumar Prasad
Abstract:
We introduce MHNpath, a machine learning-driven retrosynthetic tool designed for computer-aided synthesis planning. Leveraging modern Hopfield networks and novel comparative metrics, MHNpath efficiently prioritizes reaction templates, improving the scalability and accuracy of retrosynthetic predictions. The tool incorporates a tunable scoring system that allows users to prioritize pathways based on cost, reaction temperature, and toxicity, thereby facilitating the design of greener and cost-effective reaction routes. We demonstrate its effectiveness through case studies involving complex molecules from ChemByDesign, showcasing its ability to predict novel synthetic and enzymatic pathways. Furthermore, we benchmark MHNpath against existing frameworks, replicating experimentally validated "gold-standard" pathways from PaRoutes. Our case studies reveal that the tool can generate shorter, cheaper, moderate-temperature routes employing green solvents, as exemplified by compounds such as dronabinol, arformoterol, and lupinine.
Submitted 3 April, 2025; v1 submitted 2 April, 2025;
originally announced April 2025.
-
Neutrino Theory in the Precision Era
Authors:
Asmaa Abada,
Gabriela Barenboim,
Toni Bertólez-Martínez,
Sandipan Bhattacherjee,
Sara Bolognesi,
Patrick D. Bolton,
Nilay Bostan,
Gustavo C. Branco,
Sabya Sachi Chatterjee,
Adriano Cherchiglia,
Marco Chianese,
B. A. Couto e Silva,
Peter B. Denton,
Stephen Dolan,
Marco Drewes,
Ilham El Atmani,
Miguel Escudero,
Ivan Esteban,
Manuel Ettengruber,
Enrique Fernández-Martínez,
Julien Froustey,
Raj Gandhi,
Julia Gehrlein,
Srubabati Goswami,
André de Gouvêa
, et al. (54 additional authors not shown)
Abstract:
This document summarises discussions on future directions in theoretical neutrino physics, which are the outcome of a neutrino theory workshop held at CERN in February 2025. The starting point is the realisation that neutrino physics offers unique opportunities to address some of the most fundamental questions in physics. This motivates a vigorous experimental programme which the theory community fully supports. A strong effort in theoretical neutrino physics is paramount to optimally take advantage of upcoming neutrino experiments and to explore the synergies with other areas of particle, astroparticle, and nuclear physics, as well as cosmology. Progress on the theory side has the potential to significantly boost the physics reach of experiments, as well as go well beyond their original scope. Strong collaboration between theory and experiment is essential in the precision era. To foster such collaboration, we propose to establish a CERN Neutrino Physics Centre. Taking inspiration from the highly successful LHC Physics Center at Fermilab, the CERN Neutrino Physics Centre would be the European hub of the neutrino community, covering experimental and theoretical activities.
Submitted 27 March, 2025;
originally announced April 2025.
-
Exact Fluctuating Hydrodynamics of the Scaled Light-Heavy Model
Authors:
Shilpa Prakash,
Mustansir Barma,
Kabir Ramola
Abstract:
We study the exact fluctuating hydrodynamics of the scaled Light-Heavy model (sLH), in which two species of particles (light and heavy) interact with a fluctuating surface. This model is similar in definition to the unscaled Light-Heavy model (uLH), except it uses rates scaled with the system size. The consequence, it turns out, is a phase diagram that differs from that of the unscaled model. We derive the fluctuating hydrodynamics for this model using an action formalism involving the construction of path integrals for the probability of different states that give the complete macroscopic picture starting from the microscopic one. This is then used to obtain the two-point steady-state (static) correlation functions between fluctuations in the two density fields in the homogeneous phase. We show that these theoretical results match well with microscopic simulations away from the critical line. We derive an exponentially decaying form for the two-point steady-state correlation function with a correlation length that diverges as the critical line is approached. Finally, we also compute the dynamic correlations in the homogeneous phase and use them to determine the relaxation dynamics as well as the dynamic exponents of the system.
Submitted 20 March, 2025;
originally announced March 2025.
-
DevOps Automation Pipeline Deployment with IaC (Infrastructure as Code)
Authors:
Adarsh Saxena,
Sudhakar Singh,
Shiv Prakash,
Tiansheng Yang,
Rajkumar Singh Rathore
Abstract:
A DevOps pipeline is a set of automated tasks, processes, or jobs that allows the Development and Operations teams to collaborate on building and deploying software or services. DevOps as a culture includes better collaboration between different teams within an organization and the removal of silos between them. This paper aims to streamline the software development and deployment process followed in most current DevOps deployments, namely Continuous Integration and Continuous Delivery (CI/CD) pipelines. Centered on the software development life cycle (SDLC), it also addresses the currently ambiguous definition of DevOps to clarify its implementation in practice, along with a sample CI/CD pipeline deployment. A further objective of the paper is to demonstrate the implementation strategy of DevOps Infrastructure as Code (IaC) and Pipeline as Code, and to remove ambiguity in the definition of the DevOps Infrastructure as Code methodology.
Submitted 20 March, 2025;
originally announced March 2025.
-
Blind Augmentation: Calibration-free Camera Distortion Model Estimation for Real-time Mixed-reality Consistency
Authors:
Siddhant Prakash,
David R. Walton,
Rafael K. dos Anjos,
Anthony Steed,
Tobias Ritschel
Abstract:
Real camera footage is subject to noise, motion blur (MB) and depth of field (DoF). In some applications these might be considered distortions to be removed, but in others it is important to model them because it would be ineffective, or interfere with an aesthetic choice, to simply remove them. In augmented reality applications where virtual content is composed into a live video feed, we can model noise, MB and DoF to make the virtual content visually consistent with the video. Existing methods for this typically suffer from two main limitations. First, they require a camera calibration step to relate a known calibration target to the specific camera's response. Second, existing work requires methods that can be (differentiably) tuned to the calibration, such as slow and specialized neural networks. We propose a method which estimates parameters for noise, MB and DoF instantly, which allows using off-the-shelf real-time simulation methods from, e.g., a game engine in compositing augmented content. Our main idea is to unlock both features by showing how to use modern computer vision methods that can remove noise, MB and DoF from the video stream, essentially providing self-calibration. This allows auto-tuning of any black-box real-time noise+MB+DoF method to deliver fast and high-fidelity augmentation consistency.
Submitted 3 March, 2025;
originally announced March 2025.
-
Muon-Decay Parameters from COHERENT
Authors:
Víctor Bresó-Pla,
Sergio Cruz-Alzaga,
Martín González-Alonso,
Suraj Prakash
Abstract:
We demonstrate that measurements of Coherent Elastic Neutrino-Nucleus Scattering (CE$ν$NS) at spallation sources are valuable probes of muon-decay physics. Using COHERENT data we derive the first direct constraint on the Michel parameters governing the $\barν_μ$ energy distribution. We also discuss future sensitivities, the implications for the Lorentz structure of the interactions mediating muon decay and the application to other neutrino-production mechanisms like pion decay.
Submitted 4 October, 2025; v1 submitted 25 February, 2025;
originally announced February 2025.
-
Design and Implementation of Flutter based Multi-platform Docker Controller App
Authors:
Adarsh Saxena,
Sudhakar Singh,
Shiv Prakash,
Nand Lal Yadav,
Tiansheng Yang,
Rajkumar Singh Rathore,
Shreya Singh
Abstract:
This paper focuses on developing a Flutter application for controlling Docker resources remotely. The application provides a user-friendly interface for executing various Docker-related commands on the server where the Docker engine is installed. The application uses the SSH protocol to establish a secure connection with the server and execute the commands. Further, an alternative approach is also explored, which involves connecting the application with the Docker engine using HTTP. This proposed Docker controller application provides a significant advantage for managing Docker resources remotely, which is highly beneficial in DevOps fields. It provides a user-friendly interface to manage containers, making it easy to create, start, stop, restart, and remove containers. It abstracts away the complexities of working with Docker commands, allowing users to interact with containers more intuitively. It can be used to manage a number of Docker engines from one place, making it easy to control and monitor all the Docker resources. Its performance, security, and scalability are evaluated using various testing techniques, and the results are found to be satisfactory. Further improvements may include enhancing the application's features, optimizing the performance, and exploring other possible approaches for establishing the connection between the application and the Docker engine.
Submitted 17 February, 2025;
originally announced February 2025.
-
First spectro-polarimetric study of the neutron star low-mass X-ray binary GX 9+1
Authors:
V. P. Shyam Prakash,
Vivek K. Agrawal,
A. M. Vinodkumar
Abstract:
We present the first spectro-polarimetric study of the bright atoll source GX 9+1, using the simultaneous Imaging X-ray Polarimetry Explorer (IXPE), and Neutron star Interior Composition Explorer (NICER) observations. The source was observed to remain in the soft state, with no changes in state throughout the observation period. The source does not show significant polarization in the 2-8 keV energy range. However, a significant polarization (3.3 sigma) was detected in the 2-3 keV range, with a polarization degree of 3.3 +/- 0.8% and a polarization angle of 11 +/- 7 deg. We used the simultaneous energy spectra from NICER (0.6 - 11 keV) and IXPE (2-8 keV) to study the spectral properties of the source during observations. The observed spectrum of the source can be well described by a combination of Comptonized blackbody emission from the neutron star surface (compbb model in XSPEC) and thermal Comptonized component with seed photons from the accretion disc. The spectral properties of GX 9+1 during the observation are consistent with those of other bright atoll-sources in the soft state. However, the high polarization degree observed in the low-energy band does not align with previous IXPE observations of other atoll-sources. This observed polarization in the source is attributed to the strong polarization of the Comptonized blackbody component. We discuss the results from the spectro-polarimetric studies in the context of various accretion disc and coronal geometries of the source.
Submitted 5 February, 2025; v1 submitted 4 February, 2025;
originally announced February 2025.
-
A Novel Precoder for Peak-to-Average Power Ratio Reduction in OTFS Systems
Authors:
Saurabh Prakash,
Venkatesh Khammammetti,
Saif Khan Mohammed
Abstract:
We consider the issue of high peak-to-average-power ratio (PAPR) of Orthogonal time frequency space (OTFS) modulated signals. This paper proposes a novel low-complexity iterative PAPR reduction method which achieves a PAPR reduction of roughly 5 dB when compared to an OTFS modulated signal without any PAPR compensation. Simulations reveal that the PAPR achieved by the proposed method is significantly better than that achieved by other state-of-the-art methods. Simulations also reveal that the error rate performance of OTFS based systems with the proposed PAPR reduction is similar to that achieved with the other state-of-the-art methods.
Submitted 18 January, 2025;
originally announced January 2025.
-
Invariant Theory and Magic State Distillation
Authors:
Amolak Ratan Kalra,
Shiroman Prakash
Abstract:
We show that the performance of a linear self-orthogonal $GF(4)$ code for magic state distillation of Bravyi and Kitaev's $|T\rangle$-state is characterized by its simple weight enumerator. We compute weight enumerators of all such codes with fewer than 20 qubits and find none whose threshold exceeds that of the 5-qubit code. Using constraints on weight enumerators from invariant theory and linear programming, we establish bounds on the exponent characterizing noise suppression of a $|T\rangle$-state distillation protocol. We also obtain new non-negativity constraints on such weight enumerators by demanding consistency of the associated magic state distillation routine. These constraints yield new bounds on the distances of classical Hermitian self-dual and maximal self-orthogonal linear $GF(4)$ codes, notably proving the nonexistence of such codes with parameters $[12m, 6m, 4m+2]_{GF(4)}$.
Submitted 22 January, 2025; v1 submitted 17 January, 2025;
originally announced January 2025.
-
QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture
Authors:
Shvetank Prakash,
Andrew Cheng,
Jason Yik,
Arya Tschand,
Radhika Ghosal,
Ikechukwu Uchendu,
Jessica Quaye,
Jeffrey Ma,
Shreyas Grampurohit,
Sofia Giannuzzi,
Arnav Balyan,
Fin Amin,
Aadya Pipersenia,
Yash Choudhary,
Ankita Nayak,
Amir Yazdanbakhsh,
Vijay Janapa Reddi
Abstract:
We introduce QuArch, a dataset of 1500 human-validated question-answer pairs designed to evaluate and enhance language models' understanding of computer architecture. The dataset covers areas including processor design, memory systems, and performance optimization. Our analysis highlights a significant performance gap: the best closed-source model achieves 84% accuracy, while the top small open-source model reaches 72%. We observe notable struggles in memory systems, interconnection networks, and benchmarking. Fine-tuning with QuArch improves small model accuracy by up to 8%, establishing a foundation for advancing AI-driven computer architecture research. The dataset and leaderboard are at https://harvard-edge.github.io/QuArch/.
Submitted 6 January, 2025; v1 submitted 3 January, 2025;
originally announced January 2025.
-
Room Temperature Strong Orbital Moments in Perpendicularly Magnetized Magnetic Insulator
Authors:
Ganesh Ji Omar,
Pierluigi Gargiani,
Manuel Valvidares,
Zhi Shiuh Lim,
Saurav Prakash,
T. S. Suraj,
Abhijit Ghosh,
Sze Ter Lim,
James Lourembam,
Ariando Ariando
Abstract:
The balance between the orbital and spin magnetic moments in a magnetic system is at the heart of many intriguing phenomena. Here, we show experimental evidence of a large orbital moment, which competes with its spin counterpart in a ferrimagnetic insulator thulium iron garnet, Tm3Fe5O12. Leveraging element-specific X-ray magnetic circular dichroism (XMCD), we establish that the dominant contribution to the orbital moment originates from 4f orbitals of Tm. Besides the large Tm orbital moment, intriguingly, our results also reveal a smaller but evident non-zero XMCD signal in the O K edge, suggesting additional spin-orbit coupling and exchange interactions with the nearest neighbour Fe atoms. The unquenched orbital moment is primarily responsible for a significant reduction in g-factor, typically 2 in transition metals, as determined independently using ferromagnetic resonance spectroscopy. Our findings reveal a non-linear reduction in the g-factor from 1.7 at 300 K to 1.56 at 200 K in Tm3Fe5O12 thin films. These results provide critical insights into the role of the f orbitals in long-range magnetic order and stimulate further exploration in orbitronics.
Submitted 11 December, 2024;
originally announced December 2024.
-
ColonNet: A Hybrid Of DenseNet121 And U-NET Model For Detection And Segmentation Of GI Bleeding
Authors:
Ayushman Singh,
Sharad Prakash,
Aniket Das,
Nidhi Kushwaha
Abstract:
This study presents an integrated deep learning model for automatic detection and classification of gastrointestinal bleeding in the frames extracted from Wireless Capsule Endoscopy (WCE) videos. The dataset has been released as part of Auto-WCBleedGen Challenge Version V2 hosted by the MISAHUB team. Our model attained the highest performance among 75 teams that took part in this competition. It aims to efficiently utilize CNN-based models, i.e., DenseNet and UNet, to detect and segment bleeding and non-bleeding areas in the real-world complex dataset. The model achieves an overall accuracy of 80%, which would surely help a skilled doctor to carry out further diagnostics.
Submitted 6 December, 2024;
originally announced December 2024.
-
Fault-Tolerant Implementation of the Deutsch-Jozsa Algorithm
Authors:
Divyanshu Singh,
Shiroman Prakash
Abstract:
We show that one can implement the Deutsch-Jozsa algorithm, one of the first and simplest quantum algorithms, in a fault-tolerant manner using the smallest quantum error-detecting code -- the $[[4,2,2]]$ code -- without any ancillae. We implemented the algorithm on a trapped-ion quantum computer with and without fault-tolerant encoding and compared the results. With approximately $99 \%$ confidence, we found that the fault-tolerant implementation provided a noise reduction for all oracles. Averaged across all oracles, the reduction in error rate was nearly $90 \%$.
△ Less
Submitted 10 September, 2025; v1 submitted 6 December, 2024;
originally announced December 2024.
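As a point of reference for the entry above, the ideal (noiseless) Deutsch-Jozsa decision rule can be sketched classically. This is only an illustration of the textbook algorithm, not the paper's fault-tolerant $[[4,2,2]]$ encoding or its trapped-ion implementation; the function name `deutsch_jozsa` and the choice of a phase oracle are assumptions made for this sketch.

```python
from itertools import product
from math import isclose


def deutsch_jozsa(f, n):
    """Classify f: {0,1}^n -> {0,1} as 'constant' or 'balanced'.

    Simulates the ideal circuit H^n, phase oracle, H^n. The amplitude of
    |0...0> at the end equals (1 / 2^n) * sum_x (-1)^f(x), so it is +-1 for
    a constant f and 0 for a balanced f. The input is assumed to satisfy
    the promise (constant or balanced); other functions are not handled.
    """
    amplitude = sum((-1) ** f(bits) for bits in product((0, 1), repeat=n)) / 2 ** n
    prob_all_zero = amplitude ** 2
    return "constant" if isclose(prob_all_zero, 1.0) else "balanced"


# Hypothetical example oracles on 3 input bits.
print(deutsch_jozsa(lambda bits: 1, 3))              # constant function
print(deutsch_jozsa(lambda bits: sum(bits) % 2, 3))  # parity is balanced
```

A quantum device answers with a single oracle query, whereas this classical sketch enumerates all $2^n$ inputs to compute the same amplitude.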
-
Comparative Study of MAC Protocols for Wireless Mesh Network
Authors:
Ankita Singh,
Shiv Prakash,
Sudhakar Singh
Abstract:
Wireless networking is encouraged by the constant enhancement of sensors' ability and wireless communication. To provide service quality support for multimedia viz. audio and video streams, the IEEE 802.11e MAC (Media Access Control) improves basic 802.11 MAC. IEEE 802.11 standard series such as IEEE 802.11a, b, g, n, p, and ac have been promoted and specified in the current communications and connection development. Each standard has functionality that matches the kind of applications for which the standard is intended. IEEE 802.11ac has better performance with fewer interferences and achieves gigabits per second capacity transfer rates. This paper discusses the comparative examination of the IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, IEEE 802.11n, IEEE 802.11p, and IEEE 802.11ac standards which increase accuracy and performance pertaining to the IEEE 802.11 standard. In this paper, we investigate the design requirements for numerous simultaneous peer-to-peer connections. Further, this study offers a systematic review and analysis of the MAC layer in WMN (Wireless Mesh Network) and also highlights their open research issues and challenges. Finally, this paper discusses various potential directions for future research in this area with an emphasis on their strengths and limitations.
Submitted 8 November, 2024;
originally announced November 2024.
-
Continuous Sign Language Recognition System using Deep Learning with MediaPipe Holistic
Authors:
Sharvani Srivastava,
Sudhakar Singh,
Pooja,
Shiv Prakash
Abstract:
Sign languages are the language of hearing-impaired people who use visuals like the hand, facial, and body movements for communication. There are different signs and gestures representing alphabets, words, and phrases. Nowadays approximately 300 sign languages are being practiced worldwide such as American Sign Language (ASL), Chinese Sign Language (CSL), Indian Sign Language (ISL), and many more. Sign languages are dependent on the vocal language of a place. Unlike vocal or spoken languages, there are no helping words in sign language like is, am, are, was, were, will, be, etc. As only a limited population is well-versed in sign language, this lack of familiarity of sign language hinders hearing-impaired people from communicating freely and easily with everyone. This issue can be addressed by a sign language recognition (SLR) system which has the capability to translate the sign language into vocal language. In this paper, a continuous SLR system is proposed using a deep learning model employing Long Short-Term Memory (LSTM), trained and tested on an ISL primary dataset. This dataset is created using MediaPipe Holistic pipeline for tracking face, hand, and body movements and collecting landmarks. The system recognizes the signs and gestures in real-time with 88.23% accuracy.
Submitted 7 November, 2024;
originally announced November 2024.
-
CapsuleNet: A Deep Learning Model To Classify GI Diseases Using EfficientNet-b7
Authors:
Aniket Das,
Ayushman Singh,
Nishant,
Sharad Prakash
Abstract:
Gastrointestinal (GI) diseases represent a significant global health concern, with Capsule Endoscopy (CE) offering a non-invasive method for diagnosis by capturing a large number of GI tract images. However, the sheer volume of video frames necessitates automated analysis to reduce the workload on doctors and increase the diagnostic accuracy. In this paper, we present CapsuleNet, a deep learning model developed for the Capsule Vision 2024 Challenge, aimed at classifying 10 distinct GI abnormalities. Using a highly imbalanced dataset, we implemented various data augmentation strategies, reducing the data imbalance to a manageable level. Our model leverages a pretrained EfficientNet-b7 backbone, tuned with additional layers for classification and optimized with PReLU activation functions. The model demonstrated superior performance on validation data, achieving a micro accuracy of 84.5% and outperforming the VGG16 baseline across most classes. Despite these advances, challenges remain in classifying certain abnormalities, such as Erythema. Our findings suggest that CNN-based models like CapsuleNet can provide an efficient solution for GI tract disease classification, particularly when inference time is a critical factor.
Submitted 24 October, 2024;
originally announced October 2024.
-
DENOASR: Debiasing ASRs through Selective Denoising
Authors:
Anand Kumar Rai,
Siddharth D Jaiswal,
Shubham Prakash,
Bendi Pragnya Sree,
Animesh Mukherjee
Abstract:
Automatic Speech Recognition (ASR) systems have been examined and shown to exhibit biases toward particular groups of individuals, influenced by factors such as demographic traits, accents, and speech styles. Noise can disproportionately impact speakers with certain accents, dialects, or speaking styles, leading to biased error rates. In this work, we introduce a novel framework DENOASR, which is a selective denoising technique to reduce the disparity in the word error rates between the two gender groups, male and female. We find that a combination of two popular speech denoising techniques, viz. DEMUCS and LE, can be effectively used to mitigate ASR disparity without compromising their overall performance. Experiments using two state-of-the-art open-source ASRs - OpenAI WHISPER and NVIDIA NEMO - on multiple benchmark datasets, including TIE, VOX-POPULI, TEDLIUM, and FLEURS, show that there is a promising reduction in the average word error rate gap across the two gender groups. For a given dataset, the denoising is selectively applied on speech samples having speech intelligibility below a certain threshold, estimated using a small validation sample, thus ameliorating the need for large-scale human-written ground-truth transcripts. Our findings suggest that selective denoising can be an elegant approach to mitigate biases in present-day ASR systems.
Submitted 22 October, 2024;
originally announced October 2024.
-
Ionization of Rydberg atoms embedded in Ultracold Plasma due to electron-atom interaction
Authors:
Satyam Prakash,
Ashok S Vudayagiri
Abstract:
When ultracold plasma is generated by photoionization of laser-cooled atoms, some atoms reach only up to Rydberg states. These in turn interact with the free electrons of the plasma and get ionized further. We study the electron-Rydberg atom interaction using a potential scattering technique in the quantum mechanical domain and analytically compute the associated cross sections for cesium atoms. We find close agreement with the experimental data on ionization of Rydberg atoms reported in Phys. Rev. A 71, 013416 (2005). The experiments showed a rapid increase in ionization above a specific Rydberg state. Our theory supports this and also indicates that it is due to the relation between the scattering length and the radius of the orbit.
Submitted 12 October, 2024;
originally announced October 2024.
-
Efficiently Identifying Low-Quality Language Subsets in Multilingual Datasets: A Case Study on a Large-Scale Multilingual Audio Dataset
Authors:
Farhan Samir,
Emily P. Ahn,
Shreya Prakash,
Márton Soskuthy,
Vered Shwartz,
Jian Zhu
Abstract:
Curating datasets that span multiple languages is challenging. To make the collection more scalable, researchers often incorporate one or more imperfect classifiers in the process, like language identification models. These models, however, are prone to failure, resulting in some language subsets being unreliable for downstream tasks. We introduce a statistical test, the Preference Proportion Test, for identifying such unreliable subsets. By annotating only 20 samples for a language subset, we're able to identify systematic transcription errors for 10 language subsets in a recent large multilingual transcribed audio dataset, X-IPAPack (Zhu et al., 2024). We find that filtering this low-quality data out when training models for the downstream task of phonetic transcription brings substantial benefits, most notably a 25.7% relative improvement on transcribing recordings in out-of-distribution languages. Our method lays a path forward for systematic and reliable multilingual dataset auditing.
Submitted 5 October, 2024;
originally announced October 2024.
-
Efficient Training of Transformers for Molecule Property Prediction on Small-scale Datasets
Authors:
Shivesh Prakash
Abstract:
The blood-brain barrier (BBB) serves as a protective barrier that separates the brain from the circulatory system, regulating the passage of substances into the central nervous system. Assessing the BBB permeability of potential drugs is crucial for effective drug targeting. However, traditional experimental methods for measuring BBB permeability are challenging and impractical for large-scale screening. Consequently, there is a need to develop computational approaches to predict BBB permeability. This paper proposes a GPS Transformer architecture augmented with Self Attention, designed to perform well in the low-data regime. The proposed approach achieves state-of-the-art performance on the BBB permeability prediction task using the BBBP dataset, surpassing existing models. With a ROC-AUC of 78.8%, the approach improves on the previous state of the art by 5.5%. We demonstrate that standard Self Attention coupled with the GPS Transformer performs better than other variants of attention coupled with the GPS Transformer.
Submitted 7 September, 2024;
originally announced September 2024.
-
Whittle Index Learning Algorithms for Restless Bandits with Constant Stepsizes
Authors:
Vishesh Mittal,
Rahul Meshram,
Surya Prakash
Abstract:
We study the Whittle index learning algorithm for restless multi-armed bandits. We consider an index learning algorithm with Q-learning. We first present the Q-learning algorithm with exploration policies (epsilon-greedy, softmax, and epsilon-softmax) and constant stepsizes. We extend the study of Q-learning to index learning for a single-armed restless bandit. The index learning algorithm is a two-timescale variant of stochastic approximation: on the slower timescale we update the index learning scheme, and on the faster timescale we update Q-learning assuming a fixed index value. The Q-learning updates are asynchronous. We study a two-timescale stochastic approximation algorithm with constant stepsizes and provide an analysis of it for index learning. Further, we present a study of index learning with deep Q-network (DQN) learning and with linear function approximation using a state-aggregation method. We illustrate the performance of our algorithms using numerical examples. We show that index learning with Q-learning, DQN, and function approximation learns the Whittle index.
Submitted 6 September, 2024;
originally announced September 2024.
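The faster-timescale component mentioned in the entry above, Q-learning with a constant stepsize and epsilon-greedy exploration, can be sketched as follows. This is a generic tabular illustration under assumptions: the discounted-reward target, the default values of `alpha` and `gamma`, and the function names are hypothetical choices, and the slower-timescale Whittle index update from the paper is not shown.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a uniformly random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One asynchronous Q-learning step with a constant stepsize alpha.

    Only the visited (state, action) entry of the table q is changed.
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical two-state, two-action table (action 0 = passive, 1 = active).
q = [[0.0, 0.0], [1.0, 2.0]]
rng = random.Random(0)
action = epsilon_greedy(q[0], epsilon=0.1, rng=rng)
q_update(q, state=0, action=action, reward=1.0, next_state=1)
```

With a constant stepsize the iterates keep fluctuating around a limit rather than settling exactly, which is why such schemes are usually analyzed in terms of convergence to a neighborhood.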
-
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
Authors:
Jiaxing Wu,
Lin Ning,
Luyang Liu,
Harrison Lee,
Neo Wu,
Chao Wang,
Sushant Prakash,
Shawn O'Banion,
Bradley Green,
Jun Xie
Abstract:
LLM-powered personalization agent systems employ Large Language Models (LLMs) to predict users' behavior from their past activities. However, their effectiveness often hinges on the ability to leverage extensive, long user historical data, which is difficult because of the inherent noise and length of such data. Existing pretrained LLMs may generate summaries that are concise but lack the necessary context for downstream tasks, hindering their utility in personalization systems. To address these challenges, we introduce Reinforcement Learning from Prediction Feedback (RLPF). RLPF fine-tunes LLMs to generate concise, human-readable user summaries that are optimized for downstream task performance. By maximizing the usefulness of the generated summaries, RLPF effectively distills extensive user history data while preserving essential information for downstream tasks. Our empirical evaluation demonstrates significant improvements in both extrinsic downstream task utility and intrinsic summary quality, surpassing baseline methods by up to 22% on downstream task performance and achieving an up to 84.59% win rate on Factuality, Abstractiveness, and Readability. RLPF also achieves a remarkable 74% reduction in context length while improving performance on 16 out of 19 unseen tasks and/or datasets, showcasing its generalizability. This approach offers a promising solution for enhancing LLM personalization by effectively transforming long, noisy user histories into informative and human-readable representations.
Submitted 16 January, 2025; v1 submitted 6 September, 2024;
originally announced September 2024.
-
A Prototype Model of Zero-Trust Architecture Blockchain with EigenTrust-Based Practical Byzantine Fault Tolerance Protocol to Manage Decentralized Clinical Trials
Authors:
Ashok Kumar Peepliwall,
Hari Mohan Pandey,
Surya Prakash,
Anand A Mahajan,
Sudhinder Singh Chowhan,
Vinesh Kumar,
Rahul Sharma
Abstract:
The COVID-19 pandemic necessitated the emergence of decentralized Clinical Trials (DCTs) to retain patients, accelerate trials, improve data accessibility, enable virtual care, and facilitate seamless communication through integrated systems. However, integrating systems in DCTs exposes clinical data to potential security threats, making them susceptible to theft at any stage, to a high risk of protocol deviations, and to monitoring issues. To mitigate these challenges, blockchain technology serves as a secure framework, acting as a decentralized ledger and creating an immutable environment by establishing a zero-trust architecture, where data are deemed untrusted until verified. In combination with Internet of Things (IoT)-enabled wearable devices, blockchain secures the transfer of clinical trial data on private blockchains during DCT automation and operations. This paper proposes a prototype model of the Zero-Trust Architecture Blockchain (z-TAB) to integrate patient-generated clinical trial data during DCT operation management. The EigenTrust-based Practical Byzantine Fault Tolerance (T-PBFT) algorithm has been incorporated as a consensus protocol, leveraging Hyperledger Fabric. Furthermore, the IoT has been integrated to streamline data processing among stakeholders within the blockchain platforms. A rigorous evaluation has been carried out to assess the quality of the system.
Submitted 29 August, 2024;
originally announced August 2024.
-
A Search for High-Threshold Qutrit Magic State Distillation Routines
Authors:
Shiroman Prakash,
Rishabh Singhal
Abstract:
Determining the best attainable threshold for qudit magic state distillation is directly related to the question of whether or not contextuality is sufficient for universal quantum computation. We carry out a search for high-threshold magic state distillation routines for a highly-symmetric qutrit magic state known as the strange state. Our search covers a large class of $[[n,1]]_3$ qutrit stabilizer codes with up to 23 qutrits, and is facilitated by a theorem that relates the distillation performance of a qudit stabilizer code to its weight-enumerators. We could not find any code with $n<23$ qutrits that distills the strange state with better than linear noise suppression, other than the 11-qutrit Golay code. However, for $n=23$, we find over 600 CSS codes that can distill the qutrit strange state with cubic noise suppression. While none of these codes surpass the threshold of the 11-qutrit Golay code, their existence suggests that, for large codes, the ability to distill the qutrit strange state is somewhat generic.
Submitted 1 August, 2024;
originally announced August 2024.
-
On rigidity of conformal submersions
Authors:
Atreyee Bhattacharya,
Sayoojya Prakash
Abstract:
Conformal submersions are generalizations of Riemannian submersions where the differential of the submersion map is a conformal isometry when restricted to the horizontal distribution. Riemannian submersions have been instrumental in the construction of new Riemannian metrics from existing ones. Therefore, being generalizations of Riemannian submersions, conformal submersions are potential tools for constructing new examples of (special) Riemannian manifolds. A conformal submersion is said to be rigid if it reduces to a Riemannian submersion up to homothety. In this paper, we study curvature conditions that force conformal submersions to be rigid. Moreover, under suitable circumstances, using these rigidity criteria for conformal submersions, we conclude the rigidity of certain quasi-Einstein metrics.
Submitted 26 July, 2024; v1 submitted 22 July, 2024;
originally announced July 2024.
-
Embracing Federated Learning: Enabling Weak Client Participation via Partial Model Training
Authors:
Sunwoo Lee,
Tuo Zhang,
Saurav Prakash,
Yue Niu,
Salman Avestimehr
Abstract:
In Federated Learning (FL), clients may have weak devices that cannot train the full model or even hold it in their memory space. To implement large-scale FL applications, it is thus crucial to develop a distributed learning method that enables the participation of such weak clients. We propose EmbracingFL, a general FL framework that allows all available clients to join the distributed training regardless of their system resource capacity. The framework is built upon a novel partial model training method in which each client trains as many consecutive output-side layers as its system resources allow. Our study demonstrates that EmbracingFL encourages each layer to have similar data representations across clients, improving FL efficiency. The proposed partial model training method guarantees convergence to a neighborhood of stationary points for non-convex and smooth problems. We evaluate the efficacy of EmbracingFL under a variety of settings with a mixed number of strong, moderate (~40% memory), and weak (~15% memory) clients, datasets (CIFAR-10, FEMNIST, and IMDB), and models (ResNet20, CNN, and LSTM). Our empirical study shows that EmbracingFL consistently achieves high accuracy, as if all clients were strong, outperforming the state-of-the-art width reduction methods (i.e., HeteroFL and FjORD).
Submitted 21 June, 2024;
originally announced June 2024.
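The selection rule described in the entry above, where each client trains as many consecutive output-side layers as its resources allow, might be sketched like this. The per-layer cost list, the single scalar `budget`, and the function name `trainable_layers` are hypothetical simplifications for illustration; the paper's actual resource model and training procedure are not reproduced here.

```python
def trainable_layers(layer_costs, budget):
    """Return indices of the longest output-side run of layers that fits the budget.

    layer_costs[i] is an assumed resource cost of training layer i, with
    index 0 at the input side and the last index at the output side.
    """
    total = 0
    start = len(layer_costs)
    # Walk from the output layer toward the input, stopping at the first
    # layer that would exceed the client's budget.
    for i in range(len(layer_costs) - 1, -1, -1):
        if total + layer_costs[i] > budget:
            break
        total += layer_costs[i]
        start = i
    return list(range(start, len(layer_costs)))


# Hypothetical four-layer model with decreasing costs toward the output.
costs = [4, 3, 2, 1]
print(trainable_layers(costs, budget=10))  # strong client
print(trainable_layers(costs, budget=3))   # weak client
```

Under this sketch a strong client receives every layer index, while a weak client receives only a suffix of the model.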
-
Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft
Authors:
Ian Vyse,
Rishit Dagli,
Dav Vrat Chadha,
John P. Ma,
Hector Chen,
Isha Ruparelia,
Prithvi Seran,
Matthew Xie,
Eesa Aamer,
Aidan Armstrong,
Naveen Black,
Ben Borstein,
Kevin Caldwell,
Orrin Dahanaggamaarachchi,
Joe Dai,
Abeer Fatima,
Stephanie Lu,
Maxime Michet,
Anoushka Paul,
Carrie Ann Po,
Shivesh Prakash,
Noa Prosser,
Riddhiman Roy,
Mirai Shinjo,
Iliya Shofman
, et al. (4 additional authors not shown)
Abstract:
Satellite remote sensing missions have gained popularity over the past fifteen years due to their ability to cover large swaths of land at regular intervals, making them ideal for monitoring environmental trends. The FINCH mission, a 3U+ CubeSat equipped with a hyperspectral camera, aims to monitor crop residue cover in agricultural fields. Although hyperspectral imaging captures both spectral and spatial information, it is prone to various types of noise, including random noise, stripe noise, and dead pixels. Effective denoising of these images is crucial for downstream scientific tasks. Traditional methods, including hand-crafted techniques encoding strong priors, learned 2D image denoising methods applied across different hyperspectral bands, or diffusion generative models applied independently on bands, often struggle with varying noise strengths across spectral bands, leading to significant spectral distortion. This paper presents a novel approach to hyperspectral image denoising using latent diffusion models that integrate spatial and spectral information. We particularly do so by building a 3D diffusion model and presenting a 3-stage training approach on real and synthetically crafted datasets. The proposed method preserves image structure while reducing noise. Evaluations on both popular hyperspectral denoising datasets and synthetically crafted datasets for the FINCH mission demonstrate the effectiveness of this approach.
Submitted 15 June, 2024;
originally announced June 2024.
-
A Diagnostic Tool for Functional Causal Discovery
Authors:
Shreya Prakash,
Fan Xia,
Elena Erosheva
Abstract:
Causal discovery methods aim to determine the causal direction between variables using observational data. Functional causal discovery methods, such as those based on the Linear Non-Gaussian Acyclic Model (LiNGAM), rely on structural and distributional assumptions to infer the causal direction. However, approaches for assessing causal discovery methods' performance as a function of sample size or the impact of assumption violations, inevitable in real-world scenarios, are lacking. To address this need, we propose the Causal Direction Detection Rate (CDDR) diagnostic, which evaluates whether and to what extent the interaction between assumption violations and sample size affects the ability to identify the hypothesized causal direction. Given a bivariate dataset of size N on a pair of variables, X and Y, the CDDR diagnostic is the plotted comparison of the probability of each causal discovery outcome (e.g., X causes Y, Y causes X, or inconclusive) as a function of sample size less than N. We fully develop the CDDR diagnostic in a bivariate case and demonstrate its use for two methods, LiNGAM and our new test-based causal discovery approach. We find the CDDR diagnostic for the test-based approach to be more informative since it uses a richer set of causal discovery outcomes. Under certain assumptions, we prove that the probability estimates of detecting each possible causal discovery outcome are consistent and asymptotically normal. Through simulations, we study the CDDR diagnostic's behavior when linearity and non-Gaussianity assumptions are violated. Additionally, we illustrate the CDDR diagnostic on four real datasets, including three for which the causal direction is known.
Submitted 25 September, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Authors:
Rishit Dagli,
Shivesh Prakash,
Robert Wu,
Houman Khosravani
Abstract:
Generating combined visual and auditory sensory experiences is critical for the consumption of immersive content. Recent advances in neural generative models have enabled the creation of high-resolution content across multiple modalities such as images, text, speech, and videos. Despite these successes, there remains a significant gap in the generation of high-quality spatial audio that complements generated visual content. Furthermore, current audio generation models excel in either generating natural audio or speech or music but fall short in integrating spatial audio cues necessary for immersive experiences. In this work, we introduce SEE-2-SOUND, a zero-shot approach that decomposes the task into (1) identifying visual regions of interest; (2) locating these elements in 3D space; (3) generating mono-audio for each; and (4) integrating them into spatial audio. Using our framework, we demonstrate compelling results for generating spatial audio for high-quality videos, images, and dynamic images from the internet, as well as media generated by learned approaches.
Submitted 7 July, 2025; v1 submitted 6 June, 2024;
originally announced June 2024.