Hearing Health in Home Healthcare: Leveraging LLMs for Illness Scoring and ALMs for Vocal Biomarker Extraction

Authors: Yu-Wen Chen, William Ho, Sasha M. Vergez, Grace Flaherty, Pallavi Gupta, Zhihong Zhang, Maryam Zolnoori, Margaret V. McDonald, Maxim Topaz, Zoran Kostic, Julia Hirschberg

Abstract: The growing demand for home healthcare calls for tools that can support care delivery. In this study, we explore automatic health assessment from voice using real-world home care visit data, leveraging the diverse patient information it contains. First, we utilize Large Language Models (LLMs) to integrate Subjective, Objective, Assessment, and Plan (SOAP) notes derived from unstructured audio transcripts and structured vital signs into a holistic illness score that reflects a patient's overall health. This compact representation facilitates cross-visit health status comparisons and downstream analysis. Next, we design a multi-stage preprocessing pipeline to extract short speech segments from target speakers in home care recordings for acoustic analysis. We then employ an Audio Language Model (ALM) to produce plain-language descriptions of vocal biomarkers and examine their association with individuals' health status. Our experimental results benchmark both commercial and open-source LLMs in estimating illness scores, demonstrating their alignment with actual clinical outcomes, and revealing that SOAP notes are substantially more informative than vital signs. Building on the illness scores, we provide the first evidence that ALMs can identify health-related acoustic patterns from home care recordings and present them in a human-readable form. Together, these findings highlight the potential of LLMs and ALMs to harness heterogeneous in-home visit data for better patient monitoring and care.

Submitted 20 October, 2025; originally announced October 2025.

Comments: The Second Workshop on GenAI for Health at NeurIPS 2025

arXiv:2510.14401 [pdf, ps, other]

The Role of Social Learning and Collective Norm Formation in Fostering Cooperation in LLM Multi-Agent Systems

Authors: Prateek Gupta, Qiankun Zhong, Hiromu Yakura, Thomas Eisenmann, Iyad Rahwan

Abstract: A growing body of multi-agent studies with Large Language Models (LLMs) explores how norms and cooperation emerge in mixed-motive scenarios, where pursuing individual gain can undermine the collective good. While prior work has explored these dynamics in both richly contextualized simulations and simplified game-theoretic environments, most LLM systems featuring common-pool resource (CPR) games provide agents with explicit reward functions directly tied to their actions. In contrast, human cooperation often emerges without full visibility into payoffs and population, relying instead on heuristics, communication, and punishment. We introduce a CPR simulation framework that removes explicit reward signals and embeds cultural-evolutionary mechanisms: social learning (adopting strategies and beliefs from successful peers) and norm-based punishment, grounded in Ostrom's principles of resource governance. Agents also individually learn from the consequences of harvesting, monitoring, and punishing via environmental feedback, enabling norms to emerge endogenously. We establish the validity of our simulation by reproducing key findings from existing studies on human behavior. Building on this, we examine norm evolution across a $2\times2$ grid of environmental and social initialisations (resource-rich vs. resource-scarce; altruistic vs. selfish) and benchmark how agentic societies comprised of different LLMs perform under these conditions. Our results reveal systematic model differences in sustaining cooperation and norm formation, positioning the framework as a rigorous testbed for studying emergent norms in mixed-motive LLM societies. Such analysis can inform the design of AI systems deployed in social and organizational contexts, where alignment with cooperative norms is critical for stability, fairness, and effective governance of AI-mediated environments.

Submitted 16 October, 2025; originally announced October 2025.

arXiv:2510.13844 [pdf]

doi 10.3390/galaxies13050115

Evolution of Size, Mass, and Density of Galaxies Since Cosmic Dawn

Authors: Rajendra P. Gupta

Abstract: The formation and evolution of galaxies and other astrophysical objects have become of great interest, especially since the launch of the James Webb Space Telescope in 2021. The mass, size, and density of objects in the early universe appear to be drastically different from those predicted by the standard cosmology - the $Λ$CDM model. This work shows that the mass-size-density evolution is not surprising when we use the CCC+TL cosmology, which is based on the concepts of covarying coupling constants in an expanding universe and the tired light effect contributing to the observed redshift. This model is consistent with supernovae Pantheon+ data, the angular size of the cosmic dawn galaxies, BAO, CMB sound horizon, galaxy formation time scales, time dilation, galaxy rotation curves, etc., and does not have the coincidence problem. The effective radii $r_e$ of the objects are larger in the new model by $r_e \propto (1+z)^{0.93}$. Thus, the object size evolution in different studies, estimated as $r_e \propto (1+z)^s$ with $s=-1.0 \pm {0.3}$, is modified to $r_e \propto (1+z)^{s+0.93}$, the dynamical mass by $(1+z)^{0.93}$, and number density by $(1+z)^{-2.80}$. The luminosity modification increases slowly with $z$ to 1.8 at $z=20$. Thus, the stellar mass increase is modest, and the luminosity and stellar density decrease are mainly due to the larger object size in the new model. Since the aging of the universe is stretched in the new model, its temporal evolution is much slower (e.g., at $z=10$, the age is about a dex longer); stars, black holes, and galaxies do not have to form at unrealistic rates.
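
As a rough worked illustration of the size scaling quoted in this abstract (a sketch that takes the central value $s=-1.0$ and treats the exponent 0.93 as exact, which is an assumption made here for arithmetic only):

$$ r_e \propto (1+z)^{s+0.93} = (1+z)^{-1.0+0.93} = (1+z)^{-0.07} $$

At $z=10$ this gives $11^{-0.07} \approx 0.85$, compared with $11^{-1.0} \approx 0.09$ for the unmodified scaling, so under these assumptions the inferred sizes evolve only weakly with redshift.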

Submitted 11 October, 2025; originally announced October 2025.

Comments: 22 pages, 6 figures

Journal ref: Galaxies 13, 115 (2025)

arXiv:2510.13835 [pdf, ps, other]

ConDABench: Interactive Evaluation of Language Models for Data Analysis

Authors: Avik Dutta, Priyanshu Gupta, Hosein Hasanbeig, Rahul Pratap Singh, Harshit Nigam, Sumit Gulwani, Arjun Radhakrishna, Gustavo Soares, Ashish Tiwari

Abstract: Real-world data analysis tasks often come with under-specified goals and unclean data. User interaction is necessary to understand and disambiguate a user's intent, and hence, essential to solving these complex tasks. Existing benchmarks for evaluating LLMs on data analysis tasks do not capture these complexities or provide first-class support for interactivity. We introduce ConDABench, a framework for generating conversational data analysis (ConDA) benchmarks and evaluating external tools on the generated benchmarks. ConDABench consists of (a) a multi-agent workflow for generating realistic benchmarks from articles describing insights gained from public datasets, (b) 1,420 ConDA problems generated using this workflow, and (c) an evaluation harness that, for the first time, makes it possible to systematically evaluate conversational data analysis tools on the generated ConDA problems. Evaluation of state-of-the-art LLMs on the benchmarks reveals that while the new generation of models are better at solving more instances, they are not necessarily better at solving tasks that require sustained, long-form engagement. ConDABench is an avenue for model builders to measure progress towards truly collaborative models that can complete complex interactive tasks.

Submitted 10 October, 2025; originally announced October 2025.

arXiv:2510.13050 [pdf, ps, other]

An Operational Deep Learning System for Satellite-Based High-Resolution Global Nowcasting

Authors: Shreya Agrawal, Mohammed Alewi Hassen, Emmanuel Asiedu Brempong, Boris Babenko, Fred Zyda, Olivia Graham, Di Li, Samier Merchant, Santiago Hincapie Potes, Tyler Russell, Danny Cheresnick, Aditya Prakash Kakkirala, Stephan Rasp, Avinatan Hassidim, Yossi Matias, Nal Kalchbrenner, Pramod Gupta, Jason Hickey, Aaron Bell

Abstract: Precipitation nowcasting, which predicts rainfall up to a few hours ahead, is a critical tool for vulnerable communities in the Global South frequently exposed to intense, rapidly developing storms. Timely forecasts provide a crucial window to protect lives and livelihoods. Traditional numerical weather prediction (NWP) methods suffer from high latency, low spatial and temporal resolution, and significant gaps in accuracy across the world. Recent machine learning-based nowcasting methods, common in the Global North, cannot be extended to the Global South due to extremely sparse radar coverage. We present Global MetNet, an operational global machine learning nowcasting model. It leverages the Global Precipitation Mission's CORRA dataset, geostationary satellite data, and global NWP data to predict precipitation for the next 12 hours. The model operates at a high resolution of approximately 0.05° (~5km) spatially and 15 minutes temporally. Global MetNet significantly outperforms industry-standard hourly forecasts and achieves significantly higher skill, making forecasts useful over a much larger area of the world than previously available. Our model demonstrates better skill in data-sparse regions than even the best high-resolution NWP models achieve in the US. Validated using ground radar and satellite data, it shows significant improvements across key metrics like the critical success index and fractions skill score for all precipitation rates and lead times. Crucially, our model generates forecasts in under a minute, making it readily deployable for real-time applications. It is already deployed for millions of users on Google Search. This work represents a key step in reducing global disparities in forecast quality and integrating sparse, high-resolution satellite observations into weather forecasting.
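
The critical success index mentioned in this abstract is a standard verification score: hits divided by the sum of hits, misses, and false alarms at a chosen rain-rate threshold. The sketch below shows that textbook definition only; the function name, the sample values, and the 1.0 mm/h threshold are illustrative assumptions, not the evaluation code or data used for Global MetNet.

```python
import math


def critical_success_index(forecast, observed, threshold):
    """Textbook CSI = hits / (hits + misses + false alarms).

    forecast, observed: equal-length sequences of precipitation rates.
    threshold: rate at or above which a cell counts as a rain event.
    """
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event = f >= threshold
        o_event = o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    denom = hits + misses + false_alarms
    # CSI is undefined when no event was forecast or observed.
    return hits / denom if denom else math.nan


# Hypothetical rain rates (mm/h) for five grid cells.
forecast = [0.0, 1.2, 3.0, 0.4, 2.5]
observed = [0.0, 2.0, 0.1, 1.5, 4.0]
csi = critical_success_index(forecast, observed, threshold=1.0)
print(csi)  # 0.5 (2 hits, 1 miss, 1 false alarm)
```

Correct negatives (cells where neither field exceeds the threshold) do not enter the score, which is why CSI is commonly used for rare events such as heavy rain.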

Submitted 14 October, 2025; originally announced October 2025.

arXiv:2510.11232 [pdf, ps, other]

LightPneumoNet: Lightweight Pneumonia Classifier

Authors: Neilansh Chauhan, Piyush Kumar Gupta, Faraz Doja

Abstract: Effective pneumonia diagnosis is often challenged by the difficulty of deploying large, computationally expensive deep learning models in resource-limited settings. This study introduces LightPneumoNet, an efficient, lightweight convolutional neural network (CNN) built from scratch to provide an accessible and accurate diagnostic solution for pneumonia detection from chest X-rays. Our model was trained on a public dataset of 5,856 chest X-ray images. Preprocessing included image resizing to 224x224, grayscale conversion, and pixel normalization, with data augmentation (rotation, zoom, shear) to prevent overfitting. The custom architecture features four blocks of stacked convolutional layers and contains only 388,082 trainable parameters, resulting in a minimal 1.48 MB memory footprint. On the independent test set, our model delivered exceptional performance, achieving an overall accuracy of 0.942, precision of 0.92, and an F1-Score of 0.96. Critically, it obtained a sensitivity (recall) of 0.99, demonstrating a near-perfect ability to identify true pneumonia cases and minimize clinically significant false negatives. Notably, LightPneumoNet achieves this high recall on the same dataset where existing approaches typically require significantly heavier architectures or fail to reach comparable sensitivity levels. The model's efficiency enables deployment on low-cost hardware, making advanced computer-aided diagnosis accessible in underserved clinics and serving as a reliable second-opinion tool to improve patient outcomes.
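
The footprint and F1 figures in this abstract can be sanity-checked with a little arithmetic. The sketch below assumes 32-bit (4-byte) weights and binary megabytes (MiB); neither is stated in the abstract. The helper names are illustrative.

```python
def model_size_mib(num_params: int, bytes_per_param: int = 4) -> float:
    """Parameter storage in MiB, assuming float32 weights by default."""
    return num_params * bytes_per_param / 1024**2


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


size = model_size_mib(388_082)    # parameter count quoted in the abstract
f1 = f1_score(0.92, 0.99)         # rounded precision and recall from the abstract
print(f"{size:.2f} MiB, F1 = {f1:.3f}")
```

Under these assumptions the parameter count reproduces the stated 1.48 MB. The F1 computed from the two-decimal precision and recall comes out slightly below the reported 0.96, a gap that could plausibly come from rounding of the inputs or from the averaging scheme, neither of which the abstract specifies.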

Submitted 13 October, 2025; originally announced October 2025.

Comments: 13 pages (including references), 5 figures

arXiv:2510.10215 [pdf, ps, other]

Bounds of Validity for Bifurcations of Equilibria in a Class of Networked Dynamical Systems

Authors: Pranav Gupta, Ravi Banavar, Anastasia Bizyaeva

Abstract: Local bifurcation analysis plays a central role in understanding qualitative transitions in networked nonlinear dynamical systems, including dynamic neural network and opinion dynamics models. In this article we establish explicit bounds of validity for the classification of bifurcation diagrams in two classes of continuous-time networked dynamical systems, analogous in structure to the Hopfield and the Firing Rate dynamic neural network models. Our approach leverages recent advances in computing the bounds for the validity of Lyapunov-Schmidt reduction, a reduction method widely employed in nonlinear systems analysis. Using these bounds we rigorously characterize neighborhoods around bifurcation points where predictions from reduced-order models remain reliable. We further demonstrate how these bounds can be applied to an illustrative family of nonlinear opinion dynamics on k-regular graphs, which emerges as a special case of the general framework. These results provide new analytical tools for quantifying the robustness of bifurcation phenomena in dynamics over networked systems and highlight the interplay between network structure and nonlinear dynamical behavior.

Submitted 11 October, 2025; originally announced October 2025.

Comments: This manuscript has been submitted to the 2026 American Control Conference taking place in New Orleans, Louisiana, in May 2026

arXiv:2510.09611 [pdf, ps, other]

Discrete non-abelian X-ray transforms

Authors: Pranav Gupta, Roman Novikov

Abstract: We define a discrete version of the non-abelian X-ray transform, going back in particular to Manakov, Zakharov (1981) and Strichartz (1982). We extend to this transform non-overdetermined reconstruction results obtained for the abelian case in the recent article by Novikov, Sharma (2025). In addition, we establish relations with the continuous non-abelian X-ray transform. In this respect, our results include an explicit and exact non-overdetermined layer-stripping reconstruction procedure for piecewise constant matrix-valued functions from their continuous non-abelian X-ray transform. To our knowledge, this result is new even for the classical X-ray transform.

Submitted 4 September, 2025; originally announced October 2025.

MSC Class: 44A12; 65R10; 65N21; 65N22

arXiv:2510.06239 [pdf, ps, other]

OpenStaxQA: A multilingual dataset based on open-source college textbooks

Authors: Pranav Gupta

Abstract: We present OpenStaxQA, an evaluation benchmark specific to college-level educational applications based on 43 open-source college textbooks in English, Spanish, and Polish, available under a permissive Creative Commons license. We finetune and evaluate large language models (LLMs) with approximately 7 billion parameters on this dataset using quantized low-rank adapters (QLoRA). Additionally, we perform a zero-shot evaluation on the AI2 Reasoning Challenge dev dataset to check whether OpenStaxQA can lead to improved performance on other tasks. We also discuss broader impacts relevant to datasets such as OpenStaxQA.

Submitted 3 October, 2025; originally announced October 2025.

arXiv:2510.02372 [pdf, ps, other]

DDVV conjecture for Riemannian maps from quaternionic space forms

Authors: Kirti Gupta, Punam Gupta

Abstract: In this paper, we investigate the DDVV-type inequality for Riemannian maps from quaternionic space forms to Riemannian manifolds. We also discuss the equality case of the derived inequality with application.

Submitted 29 September, 2025; originally announced October 2025.

MSC Class: 2020: 53C15; 53C26; 53C55

arXiv:2510.01234 [pdf, ps, other]

LLMRank: Understanding LLM Strengths for Model Routing

Authors: Shubham Agrawal, Prasang Gupta

Abstract: The rapid growth of large language models (LLMs) with diverse capabilities, latency and computational costs presents a critical deployment challenge: selecting the most suitable model for each prompt to optimize the trade-off between performance and efficiency. We introduce LLMRank, a prompt-aware routing framework that leverages rich, human-readable features extracted from prompts, including task type, reasoning patterns, complexity indicators, syntactic cues, and signals from a lightweight proxy solver. Unlike prior one-shot routers that rely solely on latent embeddings, LLMRank predicts per-model utility using a neural ranking model trained on RouterBench, comprising 36,497 prompts spanning 11 benchmarks and 11 state-of-the-art LLMs, from small efficient models to large frontier systems. Our approach achieves up to 89.2% of oracle utility, while providing interpretable feature attributions that explain routing decisions. Extensive studies demonstrate the importance of multifaceted feature extraction and the hybrid ranking objective, highlighting the potential of feature-driven routing for efficient and transparent LLM deployment.

Submitted 23 September, 2025; originally announced October 2025.

Comments: 13 pages, 1 figure

arXiv:2509.24099 [pdf, ps, other]

Unified Multi-Modal Interactive & Reactive 3D Motion Generation via Rectified Flow

Authors: Prerit Gupta, Shourya Verma, Ananth Grama, Aniket Bera

Abstract: Generating realistic, context-aware two-person motion conditioned on diverse modalities remains a central challenge in computer graphics, animation, and human-computer interaction. We introduce DualFlow, a unified and efficient framework for multi-modal two-person motion generation. DualFlow conditions 3D motion synthesis on diverse inputs, including text, music, and prior motion sequences. Leveraging rectified flow, it achieves deterministic straight-line sampling paths between noise and data, reducing inference time and mitigating error accumulation common in diffusion-based models. To enhance semantic grounding, DualFlow employs a Retrieval-Augmented Generation (RAG) module that retrieves motion exemplars using music features and LLM-based text decompositions of spatial relations, body movements, and rhythmic patterns. We use a contrastive objective that further strengthens alignment with conditioning signals and introduce a synchronization loss that improves inter-person coordination. Extensive evaluations across text-to-motion, music-to-motion, and multi-modal interactive benchmarks show consistent gains in motion quality, responsiveness, and efficiency. DualFlow produces temporally coherent and rhythmically synchronized motions, setting a new state of the art in multi-modal human motion generation.

Submitted 13 October, 2025; v1 submitted 28 September, 2025; originally announced September 2025.

Comments: Under review at ICLR 2026

arXiv:2509.20205 [pdf, ps, other]

Fulcrum: Optimizing Concurrent DNN Training and Inferencing on Edge Accelerators

Authors: Prashanthi S. K., Saisamarth Taluri, Pranav Gupta, Amartya Ranjan Saikia, Kunal Kumar Sahoo, Atharva Vinay Joshi, Lakshya Karwa, Kedar Dhule, Yogesh Simmhan

Abstract: The proliferation of GPU accelerated edge devices like Nvidia Jetsons and the rise in privacy concerns are placing an emphasis on concurrent DNN training and inferencing on edge devices. Inference and training have different computing and QoS goals. But edge accelerators like Jetson do not support native GPU sharing and expose 1000s of power modes. This requires careful time-sharing of concurrent workloads to meet power--performance goals, while limiting costly profiling. In this paper, we design an intelligent time-slicing approach for concurrent DNN training and inferencing on Jetsons. We formulate an optimization problem to interleave training and inferencing minibatches, and decide the device power mode and inference minibatch size, while maximizing the training throughput and staying within latency and power budgets, with modest profiling costs. We propose GMD, an efficient multi-dimensional gradient descent search which profiles just $15$ power modes; and ALS, an Active Learning technique which identifies reusable Pareto-optimal power modes, but profiles $50$--$150$ power modes. We evaluate these within our Fulcrum scheduler for $273,000+$ configurations across $15$ DNN workloads. We also evaluate our strategies on dynamic arrival inference and concurrent inferences. ALS and GMD outperform simpler and more complex baselines with larger-scale profiling. Their solutions satisfy the latency and power budget for $>97\%$ of our runs, and on average are within $7\%$ of the optimal throughput.

Submitted 24 September, 2025; originally announced September 2025.

arXiv:2509.20189 [pdf, ps, other]

Pagoda: An Energy and Time Roofline Study for DNN Workloads on Edge Accelerators

Authors: Prashanthi S. K., Kunal Kumar Sahoo, Amartya Ranjan Saikia, Pranav Gupta, Atharva Vinay Joshi, Priyanshu Pansari, Yogesh Simmhan

Abstract: Edge accelerators such as Nvidia Jetsons are becoming an integral part of the computing continuum, and are often used for DNN inferencing and training. Nvidia Jetson edge devices have $2000$+ CUDA cores within a $70$W power envelope and offer $1000$s of power modes to customize CPU, GPU and memory frequencies. Their widely varying power--performance trade-offs can be exploited for energy and power-constrained deployments. While data-driven methods to predict the power and latency of DNN workloads for edge devices exist, there is a lack of principled study to understand why edge accelerators and their power modes perform the way they do. We develop a time roofline and a novel energy roofline model for the Jetson Orin AGX for diverse power modes, and couple it with an analytical model of the compute (FLOP) and memory access (bytes) for DNN inference workloads to analyze them from first principles. These reveal unique, sometimes counter-intuitive, insights into the power and performance behavior of DNN workloads on edge accelerators, e.g., the default power mode MAXN is not the most energy efficient and time efficiency implies energy efficiency for all power modes. We also extend our analytical roofline models to DNN training. Finally, we apply these methods to tune the power mode (and hence the roofline) of the edge device to optimize the latency and energy for DNN inference, with up to $15\%$ lower energy and minimal degradation in inference time.
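
For readers unfamiliar with roofline analysis, the classic time roofline bounds performance by the smaller of peak compute and memory bandwidth times arithmetic intensity (FLOP per byte). The sketch below shows only that generic model; the peak and bandwidth numbers are hypothetical placeholders, not Jetson Orin AGX specifications, and the paper's energy roofline is not reproduced here.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float,
                     arithmetic_intensity: float) -> float:
    """Classic roofline: min(peak compute, bandwidth * FLOP-per-byte)."""
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def roofline_time(flops: float, bytes_moved: float,
                  peak_flops: float, mem_bandwidth: float) -> float:
    """Lower bound on runtime: the slower of the compute and memory phases."""
    return max(flops / peak_flops, bytes_moved / mem_bandwidth)


# Hypothetical device in some power mode (placeholder values).
PEAK = 5e12        # FLOP/s
BANDWIDTH = 2e11   # bytes/s
ridge_point = PEAK / BANDWIDTH  # intensity at which the two limits meet

for intensity in (10, 50):
    perf = attainable_flops(PEAK, BANDWIDTH, intensity)
    bound = "memory-bound" if intensity < ridge_point else "compute-bound"
    print(f"intensity={intensity} FLOP/B -> {perf:.1e} FLOP/s ({bound})")
```

Changing the power mode changes `PEAK` and `BANDWIDTH`, and therefore moves the roofline itself, which is the sense in which the abstract speaks of tuning the power mode "and hence the roofline".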

Submitted 24 September, 2025; originally announced September 2025.

arXiv:2509.11896 [pdf]

doi 10.3390/galaxies13050108

Testing CCC+TL Cosmology with Galaxy Rotation Curves

Authors: Rajendra P. Gupta

Abstract: This paper aims to explore whether astrophysical observations, primarily galaxy rotation curves, result from covarying coupling constants (CCC) rather than from dark matter. We have shown in earlier papers that cosmological observations, such as supernovae type 1a (Pantheon+), the small size of galaxies at cosmic dawn, baryon acoustic oscillations (BAO), the sound horizon in the cosmic microwave background (CMB), and the time dilation effect, can be easily accounted for without requiring dark energy and dark matter when coupling constants are permitted to evolve in an expanding Universe, as predicted by Dirac, and the redshift is considered jointly due to the Universe's expansion and Zwicky's tired light (TL) effect. Here, we show that the CCC parameter α is responsible for generating the illusion of dark matter and dark energy, which we call α-matter and α-energy, and is influenced by the baryonic matter density distribution. While cosmologically α is a constant determined for the homogeneous and isotropic Universe, e.g., by fitting Pantheon+ data, it can vary locally due to the extreme anisotropy of the matter distribution. Thus, in high baryonic density regions, one expects α-matter and α-energy densities to be relatively low and vice versa. We present its application to a few galaxy rotation curves from the SPARC database and find the results promising.

Submitted 15 September, 2025; originally announced September 2025.

Comments: 10 pages, 12 figures

Journal ref: Galaxies 13 (2025) 108

arXiv:2509.11095 [pdf]

GCN-TULHOR: Trajectory-User Linking Leveraging GCNs and Higher-Order Spatial Representations

Authors: Khoa Tran, Pranav Gupta, Manos Papagelis

Abstract: Trajectory-user linking (TUL) aims to associate anonymized trajectories with the users who generated them, which is crucial for personalized recommendations, privacy-preserving analytics, and secure location-based services. Existing methods struggle with sparse data, incomplete routes, and limited modeling of complex spatial dependencies, often relying on low-level check-in data or ignoring spatial patterns. In this paper, we introduce GCN-TULHOR, a method that transforms raw location data into higher-order mobility flow representations using hexagonal tessellation, reducing data sparsity and capturing richer spatial semantics, and integrates Graph Convolutional Networks (GCNs). Our approach converts both sparse check-in and continuous GPS trajectory data into unified higher-order flow representations, mitigating sparsity while capturing deeper semantic information. The GCN layer explicitly models complex spatial relationships and non-local dependencies without requiring side information such as timestamps or points of interest. Experiments on six real-world datasets show consistent improvements over classical baselines, RNN- and Transformer-based models, and the TULHOR method in accuracy, precision, recall, and F1-score. GCN-TULHOR achieves 1-8% relative gains in accuracy and F1. Sensitivity analysis identifies an optimal setup with a single GCN layer and 512-dimensional embeddings. The integration of GCNs enhances spatial learning and improves generalizability across mobility data. This work highlights the value of combining graph-based spatial learning with sequential modeling, offering a robust and scalable solution for TUL with applications in recommendations, urban planning, and security.

Submitted 14 September, 2025; originally announced September 2025.

arXiv:2509.08349 [pdf, ps, other]

Volcanic Satellites Tidally Venting Na, K, SO2 in Optical & Infrared Light

Authors: Apurva V. Oza, Andrea Gebek, Moritz Meyer zu Westram, Armen Tokadjian, Anthony L. Piro, Renyu Hu, Athira Unni, Raghav Chari, Aaron Bello-Arufe, Carl A. Schmidt, Amy J. Louca, Yamila Miguel, Raissa Estrela, Jeehyun Yang, Mario Damiano, Yasuhiro Hasegawa, Luis Welbanks, Diana Powell, Rishabh Garg, Pulkit Gupta, Yuk L. Yung, Rosaly M. C. Lopes

Abstract: Recent infrared spectroscopy from the James Webb Space Telescope (JWST) has spurred analyses of common volcanic gases such as carbon dioxide (CO2) and sulfur dioxide (SO2), alongside alkali metals sodium (Na I) and potassium (K I) surrounding the hot Saturn WASP-39 b. We report more than an order of magnitude of variability in the density of neutral Na, K, and SO2 between ground-based measurements and JWST, at distinct epochs, hinting at exogenic physical processes similar to those sourcing Io's extended atmosphere and torus. Simulations of tidally heated volcanic satellites sputtering gas into a cloud or toroid orbiting the planet are able to reproduce the probed line-of-sight column density variations. The estimated SO2 flux is consistent with tidal gravitation predictions, with a Na/SO2 ratio far smaller than Io's. Although stable satellite orbits at this system are known to be < 15.3 hours, several high-resolution alkali Doppler shift observations are required to constrain a putative orbit. Due to the Roche limit interior to the planetary photosphere at ~ 8 hours, atmosphere-exosphere interactions are expected to be especially important at this system.

Submitted 10 September, 2025; originally announced September 2025.

Comments: 10 pages, 5 figures, accepted for publication in the Monthly Notices of the Royal Astronomical Society

arXiv:2509.06189 [pdf]

Harnessing the polar vortex motion in oxide heterostructures

Authors: Pushpendra Gupta, Mohit Tanwani, Qi Xu, Guanshihan Du, Peiran Tong, Yongjun Wu, Zijian Hong, He Tian, Ramamoorthy Ramesh, Sujit Das

Abstract: Polar topology, an analogue of the magnetic topology, serves as a large playground for exotic physical phenomena with a wide range of multifunctional applications. Polar vortices and skyrmions are representative polar topologies that have been predicted to significantly enhance the functionality and information density of nanoelectronic devices due to their ultrasmall dimensions. Despite these advantages, the practical realization of polar topologies in devices is impeded by the intrinsic challenges associated with their controlled motion and manipulation. Therefore, harnessing vortex manipulation (such as motion, on-demand creation, annihilation, and shape transformation) is essential for practical device integration. However, vortex motion is often challenged by intrinsic physical limitations in collective lattice distortions and strong pinning effects from the surrounding environment, and it has remained elusive. In this study, we present real-time observation of vortex motion in PbTiO3/SrTiO3 heterostructures, achieved through the application of localized pulsed electric fields and trailing bias fields from a conductive tip. Notably, the vortices exhibit reversible motion in response to the field direction. Furthermore, by precisely manoeuvring the conductive Atomic-Force-Microscopy tip along specific trajectories, we achieved controlled vortex reshaping, with reconfigured vortices showing remarkable stability over extended periods. The underlying physical mechanism is further pinpointed by phase-field simulations, which revealed that the motion of the vortex boundary is controlled through the switching of the zigzag patterns of the vortex core. This study highlights the feasibility of harnessing vortex dynamics through external stimuli, advancing the fundamental physical understanding and prospects for next-generation polar vortex-based nanoelectronic devices.

Submitted 7 September, 2025; originally announced September 2025.

arXiv:2509.05884 [pdf, ps, other]

Introduction to Number Theoretic Transform

Authors: Banhirup Sengupta, Peenal Gupta, Souvik Sengupta

Abstract: The Number Theoretic Transform (NTT) can be regarded as a variant of the Discrete Fourier Transform. NTT has been a powerful mathematical tool in developing Post-Quantum Cryptography and Homomorphic Encryption. The Fourier Transform essentially decomposes a signal into its frequencies, which are traditionally sine or cosine waves. NTT works over groups or finite fields rather than on a continuous signal, and polynomials work as the analog of sine waves in the case of NTT. Fast Fourier Transform (FFT) style NTT, or fast NTT, has proven useful in lattice-based cryptography due to its ability to reduce the complexity of polynomial multiplication from quadratic to quasilinear. We introduce the concepts of cyclic and negacyclic convolutions along with NTT and its inverse and their fast versions.

Submitted 6 September, 2025; originally announced September 2025.
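
The negacyclic convolution mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the toy parameters (modulus p = 17, length n = 4, and psi = 9 as a primitive 2n-th root of unity mod p) and all function names are assumptions chosen for illustration. The sketch multiplies two polynomials modulo x^n + 1 by pre-scaling with powers of psi, applying a recursive radix-2 NTT, multiplying pointwise, and inverting.

```python
# Toy parameters (assumptions for illustration, not from the paper).
P = 17          # prime modulus with 2n | (P - 1)
N = 4           # transform length, a power of two
PSI = 9         # primitive 2N-th root of unity mod P (9**4 % 17 == 16 == -1)
OMEGA = PSI * PSI % P  # primitive N-th root of unity mod P


def ntt(a, omega, p):
    """Recursive radix-2 (Cooley-Tukey style) NTT of a power-of-two-length list."""
    n = len(a)
    if n == 1:
        return list(a)
    omega_sq = omega * omega % p
    even = ntt(a[0::2], omega_sq, p)
    odd = ntt(a[1::2], omega_sq, p)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % p
        out[k] = (even[k] + t) % p
        out[k + n // 2] = (even[k] - t) % p
        w = w * omega % p
    return out


def intt(a, omega, p):
    """Inverse NTT: forward transform with omega^-1, then scale by n^-1."""
    n = len(a)
    omega_inv = pow(omega, p - 2, p)  # modular inverse via Fermat's little theorem
    n_inv = pow(n, p - 2, p)
    return [x * n_inv % p for x in ntt(a, omega_inv, p)]


def negacyclic_ntt(a, b, p=P, psi=PSI):
    """Multiply a and b modulo (x^n + 1, p) using the NTT."""
    n = len(a)
    omega = psi * psi % p
    psi_inv = pow(psi, p - 2, p)
    # Pre-scale coefficient i by psi^i so a cyclic convolution becomes negacyclic.
    a_s = [a[i] * pow(psi, i, p) % p for i in range(n)]
    b_s = [b[i] * pow(psi, i, p) % p for i in range(n)]
    fa, fb = ntt(a_s, omega, p), ntt(b_s, omega, p)
    c_s = intt([x * y % p for x, y in zip(fa, fb)], omega, p)
    # Undo the scaling with psi^-i.
    return [c_s[i] * pow(psi_inv, i, p) % p for i in range(n)]


def negacyclic_schoolbook(a, b, p=P):
    """Quadratic-time reference: wrap-around terms pick up a minus sign."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % p
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % p
    return c
```

For example, `negacyclic_ntt([0, 1, 0, 0], [0, 0, 0, 1])` multiplies x by x^3; since x^4 = -1 modulo x^4 + 1, the result is the constant -1, which is 16 modulo 17. Real lattice-based schemes use much larger n and p; the structure stays the same.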

arXiv:2508.20587 [pdf, ps, other]

SemSR: Semantics aware robust Session-based Recommendations

Authors: Jyoti Narwariya, Priyanka Gupta, Muskan Gupta, Jyotsana Khatri, Lovekesh Vig

Abstract: Session-based recommendation (SR) models aim to recommend items to anonymous users based on their behavior during the current session. While various SR models in the literature utilize item sequences to predict the next item, they often fail to leverage semantic information from item titles or descriptions, impeding session intent identification and interpretability. Recent research has explored Large Language Models (LLMs) as promising approaches to enhance session-based recommendations, with both prompt-based and fine-tuning-based methods being widely investigated. However, prompt-based methods struggle to identify optimal prompts that elicit correct reasoning and lack task-specific feedback at test time, resulting in sub-optimal recommendations. Fine-tuning methods incorporate domain-specific knowledge but incur significant computational costs for implementation and maintenance. In this paper, we present multiple approaches to utilize LLMs for session-based recommendation: (i) in-context LLMs as recommendation agents, (ii) LLM-generated representations for semantic initialization of deep learning SR models, and (iii) integration of LLMs with data-driven SR models. Through comprehensive experiments on two real-world publicly available datasets, we demonstrate that LLM-based methods excel at coarse-level retrieval (high recall values), while traditional data-driven techniques perform well at fine-grained ranking (high Mean Reciprocal Rank values). Furthermore, the integration of LLMs with data-driven SR models significantly outperforms both standalone LLM approaches and data-driven deep learning models, as well as baseline SR models, in terms of both Recall and MRR metrics.

Submitted 28 August, 2025; originally announced August 2025.

Comments: Accepted at EARL workshop @RecSys'25, Prague, Czech Republic
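
The two metrics named in the abstract above, Recall and Mean Reciprocal Rank (MRR), measure different things: recall at a cutoff K only asks whether the true next item appears in the top K, while MRR rewards placing it near the top. The following is a minimal sketch of the standard definitions, not the paper's evaluation code; the function names and the toy sessions are hypothetical.

```python
def recall_at_k(ranked_lists, targets, k):
    """Fraction of sessions whose target item appears in the top-k list."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)


def mrr_at_k(ranked_lists, targets, k):
    """Mean of 1/rank of the target within the top-k list (0 if it is absent)."""
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        top = ranked[:k]
        if t in top:
            total += 1.0 / (top.index(t) + 1)
    return total / len(targets)


# Hypothetical toy data: three sessions, each with a ranked recommendation list.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
targets = ["a", "a", "d"]
recall_2 = recall_at_k(ranked, targets, 2)
mrr_2 = mrr_at_k(ranked, targets, 2)
```

In the toy data, the target is found in the top 2 for two of three sessions, at ranks 1 and 2, so recall is 2/3 while MRR is (1 + 1/2 + 0) / 3 = 0.5. A model can therefore score high on recall yet low on MRR if it retrieves the right items but ranks them poorly.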

arXiv:2508.18283 [pdf, ps, other]

Technology-assisted Personalized Yoga for Better Health -- Challenges and Outlook

Authors: Vivek Kumar, Himanshu Sahu, Hari Prabhat Gupta, Biplav Srivastava

Abstract: Yoga is a discipline of physical postures, breathing techniques, and meditative practices rooted in ancient Indian traditions, now embraced worldwide for promoting overall well-being and inner balance. The practices form a large set of items, our term for executable actions like physical poses or breath exercises, that can be offered for a person's well-being. However, to get the benefits of Yoga tailored to a person's unique needs, a person needs to (a) discover their subset from the large and seemingly complex set with inter-dependencies, (b) continue to follow them with interest adjusted to their changing abilities and near-term objectives, and (c) as appropriate, adapt to alternative items based on changing environment and the person's health conditions. In this vision paper, we describe the challenges for the Yoga personalization problem. Next, we sketch a preliminary approach and use the experience to provide an outlook on solving the challenging problem using existing and novel techniques from a multidisciplinary computing perspective. To the best of our knowledge, this is the first paper that comprehensively examines decision support issues around Yoga personalization, from pose sensing to recommendation of corrections for a complete regimen, and illustrates with a case study of Surya Namaskar -- a set of 12 choreographed poses.

Submitted 15 August, 2025; originally announced August 2025.

Comments: 10 Pages, 11 figures, 2 tables

arXiv:2508.18083 [pdf, ps, other]

GWTC-4.0: Population Properties of Merging Compact Binaries

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, S. Ahmadzadeh, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi , et al. (1783 additional authors not shown)

Abstract: We detail the population properties of merging compact objects using 158 mergers from the cumulative Gravitational-Wave Transient Catalog 4.0, which includes three types of binary mergers: binary neutron star, neutron star--black hole binary, and binary black hole mergers. We resolve multiple over- and under-densities in the black hole mass distribution: features persist at primary masses of $10\,M_\odot$ and $35\,M_\odot$ with a possible third feature at $\sim 20\,M_\odot$. These are departures from an otherwise power-law-like continuum that steepens above $35\,M_\odot$. Binary black holes with primary masses near $10\,M_\odot$ are more likely to have less massive secondaries, with a mass ratio distribution peaking at $q = 0.74^{+0.13}_{-0.13}$, potentially a signature of stable mass transfer during binary evolution. Black hole spins are inferred to be non-extremal, with 90% of black holes having $\chi < 0.57$, and preferentially aligned with binary orbits, implying many merging binaries form in isolation. However, we find a significant fraction, 0.24-0.42, of binaries have negative effective inspiral spins, suggesting many could be formed dynamically in gas-free environments. We find evidence for correlation between effective inspiral spin and mass ratio, though it is unclear if this is driven by variation in the mode of the distribution or the width. (Abridged)

Submitted 17 September, 2025; v1 submitted 25 August, 2025; originally announced August 2025.

Comments: As part of the Astrophysical Journal Letters Focus Issue on the Gravitational Wave Transient Catalog

Report number: LIGO-P2400004

arXiv:2508.18081 [pdf, ps, other]

GWTC-4.0: Methods for Identifying and Characterizing Gravitational-wave Transients

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, S. Ahmadzadeh, L. Aiello, A. Ain, P. Ajith, S. Akcay, T. Akutsu, S. Albanesi, R. A. Alfaidi , et al. (1787 additional authors not shown)

Abstract: The Gravitational-Wave Transient Catalog (GWTC) is a collection of candidate gravitational-wave transient signals identified and characterized by the LIGO-Virgo-KAGRA Collaboration. Producing the contents of the GWTC from detector data requires complex analysis methods. These comprise techniques to model the signal; identify the transients in the data; evaluate the quality of the data and mitigate possible instrumental issues; infer the parameters of each transient; compare the data with the waveform models for compact binary coalescences; and handle the large amount of results associated with all these different analyses. In this paper, we describe the methods employed to produce the catalog's fourth release, GWTC-4.0, focusing on the analysis of the first part of the fourth observing run of Advanced LIGO, Advanced Virgo and KAGRA.

Submitted 25 August, 2025; originally announced August 2025.

Comments: As part of the Astrophysical Journal Letters Focus Issue on the Gravitational Wave Transient Catalog

Report number: LIGO-P2400300

arXiv:2508.18080 [pdf, ps, other]

GWTC-4.0: An Introduction to Version 4.0 of the Gravitational-Wave Transient Catalog

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, S. Ahmadzadeh, L. Aiello, A. Ain, P. Ajith, S. Akcay, T. Akutsu, S. Albanesi, R. A. Alfaidi , et al. (1786 additional authors not shown)

Abstract: The Gravitational-Wave Transient Catalog (GWTC) is a collection of short-duration (transient) gravitational wave signals identified by the LIGO-Virgo-KAGRA Collaboration in gravitational-wave data produced by the eponymous detectors. The catalog provides information about the identified candidates, such as the arrival time and amplitude of the signal and properties of the signal's source as inferred from the observational data. GWTC is the data release of this dataset and version 4.0 extends the catalog to include observations made during the first part of the fourth LIGO-Virgo-KAGRA observing run up until 2024 January 31. This paper marks an introduction to a collection of articles related to this version of the catalog, GWTC-4.0. The collection of articles accompanying the catalog provides documentation of the methods used to analyze the data, summaries of the catalog of events, observational measurements drawn from the population, and detailed discussions of selected candidates.

Submitted 23 September, 2025; v1 submitted 25 August, 2025; originally announced August 2025.

Comments: As part of the Astrophysical Journal Letters Focus Issue on the Gravitational Wave Transient Catalog. Update following peer review

Report number: LIGO-P2400293

arXiv:2508.17083 [pdf, ps, other]

Learning ON Large Datasets Using Bit-String Trees

Authors: Prashant Gupta

Abstract: This thesis develops computational methods in similarity-preserving hashing, classification, and cancer genomics. Standard space partitioning-based hashing relies on Binary Search Trees (BSTs), but their exponential growth and sparsity hinder efficiency. To overcome this, we introduce Compressed BST of Inverted hash tables (ComBI), which enables fast approximate nearest-neighbor search with reduced memory. On datasets of up to one billion samples, ComBI achieves 0.90 precision with 4X-296X speed-ups over Multi-Index Hashing, and also outperforms Cellfishing.jl on single-cell RNA-seq searches with 2X-13X gains. Building on hashing structures, we propose Guided Random Forest (GRAF), a tree-based ensemble classifier that integrates global and local partitioning, bridging decision trees and boosting while reducing generalization error. Across 115 datasets, GRAF delivers competitive or superior accuracy, and its unsupervised variant (uGRAF) supports guided hashing and importance sampling. We show that GRAF and ComBI can be used to estimate per-sample classifiability, which enables scalable prediction of cancer patient survival. To address challenges in interpreting mutations, we introduce Continuous Representation of Codon Switches (CRCS), a deep learning framework that embeds genetic changes into numerical vectors. CRCS allows identification of somatic mutations without matched normals, discovery of driver genes, and scoring of tumor mutations, with survival prediction validated in bladder, liver, and brain cancers. Together, these methods provide efficient, scalable, and interpretable tools for large-scale data analysis and biomedical applications.

Submitted 23 August, 2025; originally announced August 2025.

Comments: PhD thesis

arXiv:2508.16911 [pdf, ps, other]

MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation

Authors: Prerit Gupta, Jason Alexander Fotso-Puepi, Zhengyuan Li, Jay Mehta, Aniket Bera

Abstract: We introduce Multimodal DuetDance (MDD), a diverse multimodal benchmark dataset designed for text-controlled and music-conditioned 3D duet dance motion generation. Our dataset comprises 620 minutes of high-quality motion capture data performed by professional dancers, synchronized with music, and detailed with over 10K fine-grained natural language descriptions. The annotations capture a rich movement vocabulary, detailing spatial relationships, body movements, and rhythm, making MDD the first dataset to seamlessly integrate human motions, music, and text for duet dance generation. We introduce two novel tasks supported by our dataset: (1) Text-to-Duet, where given music and a textual prompt, both the leader and follower dance motions are generated; and (2) Text-to-Dance Accompaniment, where given music, a textual prompt, and the leader's motion, the follower's motion is generated in a cohesive, text-aligned manner. We include baseline evaluations on both tasks to support future research.

Submitted 23 August, 2025; originally announced August 2025.

Comments: Accepted at ICCV 2025. Project page: https://gprerit96.github.io/mdd-page

arXiv:2508.12045 [pdf]

Large Language Models Enable Design of Personalized Nudges across Cultures

Authors: Vladimir Maksimenko, Qingyao Xin, Prateek Gupta, Bin Zhang, Prateek Bansal

Abstract: Nudge strategies are effective tools for influencing behaviour, but their impact depends on individual preferences. Strategies that work for some individuals may be counterproductive for others. We hypothesize that large language models (LLMs) can facilitate the design of individual-specific nudges without the need for costly and time-intensive behavioural data collection and modelling. To test this, we use LLMs to design personalized decoy-based nudges tailored to individual profiles and cultural contexts, aimed at encouraging air travellers to voluntarily offset CO$_2$ emissions from flights. We evaluate their effectiveness through a large-scale survey experiment ($n=3495$) conducted across five countries. Results show that LLM-informed personalized nudges are more effective than uniform settings, raising offsetting rates by 3-7% in Germany, Singapore, and the US, though not in China or India. Our study highlights the potential of LLMs as a low-cost testbed for piloting nudge strategies. At the same time, cultural heterogeneity constrains their generalizability, underscoring the need for combining LLM-based simulations with targeted empirical validation.

Submitted 16 October, 2025; v1 submitted 16 August, 2025; originally announced August 2025.

arXiv:2508.00978 [pdf, ps, other]

Mapping the Distant and Metal-Poor Milky Way with SDSS-V

Authors: Vedant Chandra, Phillip A. Cargile, Alexander P. Ji, Charlie Conroy, Hans-Walter Rix, Emily Cunningham, Bruno Dias, Chervin Laporte, William Cerny, Guilherme Limberg, Avrajit Bandyopadhyay, Ana Bonaca, Andrew R. Casey, John Donor, Jose G. Fernandez-Trincado, Peter M. Frinchaboy, Pramod Gupta, Keith Hawkins, Jennifer A. Johnson, Juna A. Kollmeier, Madeline Lucey, Ilija Medan, Szabolcs Meszaros, Sean Morrison, Jose Sanchez-Gallego , et al. (6 additional authors not shown)

Abstract: The fifth-generation Sloan Digital Sky Survey (SDSS-V) is conducting the first all-sky low-resolution spectroscopic survey of the Milky Way's stellar halo. We describe the stellar parameter pipeline for the SDSS-V halo survey, which simultaneously models spectra, broadband photometry, and parallaxes to derive stellar parameters, metallicities, alpha abundances, and distances. The resulting BOSS-MINESweeper catalog is validated across a wide range of stellar parameters and metallicities using star clusters and a comparison to high-resolution spectroscopic surveys. We demonstrate several scientific capabilities of this dataset: identifying the most chemically peculiar stars in our Galaxy, discovering and mapping distant halo substructures, and measuring the all-sky dynamics of the Milky Way on the largest scales. The BOSS-MINESweeper catalog for SDSS DR19 is publicly available and will be updated for future data releases.

Submitted 1 August, 2025; originally announced August 2025.

Comments: 31 pages, 19 figures; Submitted to AAS Journals

arXiv:2507.20019 [pdf]

Anomaly Detection in Human Language via Meta-Learning: A Few-Shot Approach

Authors: Saurav Singla, Aarav Singla, Advik Gupta, Parnika Gupta

Abstract: We propose a meta-learning framework for detecting anomalies in human language across diverse domains with limited labeled data. Anomalies in language, ranging from spam and fake news to hate speech, pose a major challenge due to their sparsity and variability. We treat anomaly detection as a few-shot binary classification problem and leverage meta-learning to train models that generalize across tasks. Using datasets from domains such as SMS spam, COVID-19 fake news, and hate speech, we evaluate model generalization on unseen tasks with minimal labeled anomalies. Our method combines episodic training with prototypical networks and domain resampling to adapt quickly to new anomaly detection tasks. Empirical results show that our method outperforms strong baselines in F1 and AUC scores. We also release the code and benchmarks to facilitate further research in few-shot text anomaly detection.

Submitted 26 July, 2025; originally announced July 2025.

Comments: 15 pages. PyTorch code for few-shot anomaly detection using meta-learning is available upon request or can be shared via GitHub

arXiv:2507.19819 [pdf, ps, other]

ChipletPart: Cost-Aware Partitioning for 2.5D Systems

Authors: Alexander Graening, Puneet Gupta, Andrew B. Kahng, Bodhisatta Pramanik, Zhiang Wang

Abstract: Industry adoption of chiplets has been increasing as a cost-effective option for making larger high-performance systems. Consequently, partitioning large systems into chiplets is increasingly important. In this work, we introduce ChipletPart, a cost-driven 2.5D system partitioner that addresses the unique constraints of chiplet systems, including complex objective functions, limited reach of inter-chiplet I/O transceivers, and the assignment of heterogeneous manufacturing technologies to different chiplets. ChipletPart integrates a sophisticated chiplet cost model with its underlying genetic algorithm-based technology assignment and partitioning methodology, along with a simulated annealing-based chiplet floorplanner. Our results show that: (i) ChipletPart reduces chiplet cost by up to 58% (20% geometric mean) compared to state-of-the-art min-cut partitioners, which often yield floorplan-infeasible solutions; (ii) ChipletPart generates partitions with up to 47% (6% geometric mean) lower cost as compared to the prior work Floorplet; and (iii) for the testcases we study, heterogeneous integration reduces cost by up to 43% (15% geometric mean) compared to homogeneous implementations. Additionally, we explore Bayesian optimization (BO) for finding low-cost and floorplan-feasible chiplet solutions with technology assignments. On some testcases, our BO framework achieves better system cost (up to 5.3% improvement) with higher runtime overhead (up to 4x) compared to our GA framework. We also present case studies that show how changes in packaging and inter-chiplet signaling technologies can affect partitioning solutions. Finally, we make ChipletPart, the underlying chiplet cost model, and a chiplet testcase generator available as open-source tools for the community.

Submitted 8 August, 2025; v1 submitted 26 July, 2025; originally announced July 2025.

Comments: 14 pages, 13 figures

arXiv:2507.18827 [pdf, ps, other]

CueBuddy: helping non-native English speakers navigate English-centric STEM education

Authors: Pranav Gupta

Abstract: Students across the world in STEM classes, especially in the Global South, fall behind their peers who are more fluent in English, despite being on par with them in terms of scientific prerequisites. While many of them are able to follow everyday English with ease, key terms in English stay challenging. In most cases, such students have had most of their course prerequisites in a lower resource language. Live speech translation to lower resource languages is a promising area of research; however, models for speech translation can be too expensive on a large scale and often struggle with technical content. In this paper, we describe CueBuddy, which aims to remediate these issues by providing real-time "lexical cues" through technical keyword spotting along with real-time multilingual glossary lookup to help students stay up to speed with complex English jargon without disrupting their concentration on the lecture. We also describe the limitations and future extensions of our approach.

Submitted 24 July, 2025; originally announced July 2025.

arXiv:2507.16095 [pdf, ps, other]

Improving Personalized Image Generation through Social Context Feedback

Authors: Parul Gupta, Abhinav Dhall, Thanh-Toan Do

Abstract: Personalized image generation, where reference images of one or more subjects are used to generate their image according to a scene description, has gathered significant interest in the community. However, such generated images suffer from three major limitations -- complex activities, such as $<$man, pushing, motorcycle$>$, are generated improperly, with incorrect human poses; reference human identities are not preserved; and generated human gaze patterns are unnatural/inconsistent with the scene description. In this work, we propose to overcome these shortcomings through feedback-based fine-tuning of existing personalized generation methods, wherein state-of-the-art detectors of pose, human-object-interaction, human facial recognition and human gaze-point estimation are used to refine the diffusion model. We also propose timestep-based inculcation of different feedback modules, depending upon whether the signal is low-level (such as human pose), or high-level (such as gaze point). The images generated in this manner show an improvement in the generated interactions, facial identities and image quality over three benchmark datasets.

Submitted 21 July, 2025; originally announced July 2025.

arXiv:2507.13533 [pdf, ps, other]

Increasing the Expressiveness of a Gradual Verifier

Authors: Priyam Gupta

Abstract: Static verification provides strong correctness guarantees for code; however, fully specifying programs for static verification is a complex, burdensome process for users. Gradual verification was introduced to make this process easier by supporting the verification of partially specified programs. The only currently working gradual verifier, Gradual C0, successfully verifies heap manipulating programs, but lacks expressiveness in its specification language. This paper describes the design and implementation of an extension to Gradual C0 that supports unfolding expressions, which allow more intuitive specifications of recursive heap data structures.

Submitted 17 July, 2025; originally announced July 2025.

Comments: Presented at the 52nd ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2025) Student Research Competition

arXiv:2507.11940 [pdf, ps, other]

IANN-MPPI: Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral Approach for Autonomous Driving

Authors: Kanghyun Ryu, Minjun Sung, Piyush Gupta, Jovin D'sa, Faizan M. Tariq, David Isele, Sangjae Bae

Abstract: Motion planning for autonomous vehicles (AVs) in dense traffic is challenging, often leading to overly conservative behavior and unmet planning objectives. This challenge stems from the AVs' limited ability to anticipate and respond to the interactive behavior of surrounding agents. Traditional decoupled prediction and planning pipelines rely on non-interactive predictions that overlook the fact that agents often adapt their behavior in response to the AV's actions. To address this, we propose Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral (IANN-MPPI) control, which enables interactive trajectory planning by predicting how surrounding agents may react to each control sequence sampled by MPPI. To improve performance in structured lane environments, we introduce a spline-based prior for the MPPI sampling distribution, enabling efficient lane-changing behavior. We evaluate IANN-MPPI in a dense traffic merging scenario, demonstrating its ability to perform efficient merging maneuvers. Our project website is available at https://sites.google.com/berkeley.edu/iann-mppi

Submitted 16 July, 2025; originally announced July 2025.

Comments: To be published in The IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025

arXiv:2507.11545 [pdf, ps, other]

The AI Shadow War: SaaS vs. Edge Computing Architectures

Authors: Rhea Pritham Marpu, Kevin J McNamara, Preeti Gupta

Abstract: The very DNA of AI architecture presents conflicting paths: centralized cloud-based models (Software-as-a-Service) versus decentralized edge AI (local processing on consumer devices). This paper analyzes the competitive battleground across computational capability, energy efficiency, and data privacy. Recent breakthroughs show edge AI challenging cloud systems on performance, leveraging innovations like test-time training and mixture-of-experts architectures. Crucially, edge AI boasts a 10,000x efficiency advantage: modern ARM processors consume merely 100 microwatts for inference versus 1 watt for equivalent cloud processing. Beyond efficiency, edge AI secures data sovereignty by keeping processing local, dismantling single points of failure in centralized architectures. This democratizes access through affordable hardware, enables offline functionality, and reduces environmental impact by eliminating data transmission costs. The edge AI market projects explosive growth from $9 billion in 2025 to $49.6 billion by 2030 (38.5% CAGR), fueled by privacy demands and real-time analytics. Critical applications including personalized education, healthcare monitoring, autonomous transport, and smart infrastructure rely on edge AI's ultra-low latency (5-10ms versus 100-500ms for cloud). The convergence of architectural innovation with fundamental physics confirms edge AI's distributed approach aligns with efficient information processing, signaling the inevitable emergence of hybrid edge-cloud ecosystems.

Submitted 8 July, 2025; originally announced July 2025.

arXiv:2507.10869 [pdf, ps, other]

Focus on Texture: Rethinking Pre-training in Masked Autoencoders for Medical Image Classification

Authors: Chetan Madan, Aarjav Satia, Soumen Basu, Pankaj Gupta, Usha Dutta, Chetan Arora

Abstract: Masked Autoencoders (MAEs) have emerged as a dominant strategy for self-supervised representation learning in natural images, where models are pre-trained to reconstruct masked patches with a pixel-wise mean squared error (MSE) between original and reconstructed RGB values as the loss. We observe that MSE encourages blurred image reconstruction, but still works for natural images as it preserves dominant edges. However, in medical imaging, where texture cues are more important for classification of a visual abnormality, the strategy fails. Taking inspiration from the Gray Level Co-occurrence Matrix (GLCM) feature in Radiomics studies, we propose a novel MAE-based pre-training framework, GLCM-MAE, using a reconstruction loss based on matching GLCM. GLCM captures intensity and spatial relationships in an image, hence the proposed loss helps preserve morphological features. Further, we propose a novel formulation to convert matching GLCM matrices into a differentiable loss function. We demonstrate that unsupervised pre-training on medical images with the proposed GLCM loss improves representations for downstream tasks. GLCM-MAE outperforms the current state-of-the-art across four tasks: gallbladder cancer detection from ultrasound images by 2.1%, breast cancer detection from ultrasound by 3.1%, pneumonia detection from x-rays by 0.5%, and COVID detection from CT by 0.6%. Source code and pre-trained models are available at: https://github.com/ChetanMadan/GLCM-MAE.

Submitted 14 July, 2025; originally announced July 2025.

Comments: To appear at MICCAI 2025

arXiv:2507.10365 [pdf, ps, other]

A ruled residue theorem for function fields of hyperelliptic curves

Authors: Parul Gupta, Sumit Chandra Mishra

Abstract: We study residually transcendental extensions of a valuation $v$ on a field $E$ to function fields of hyperelliptic curves over $E$. We show that $v$ has at most finitely many extensions to the function field of a hyperelliptic curve over $E$, for which the residue field extension is transcendental but not ruled, assuming that the residue characteristic of $v$ is either zero or greater than the degree of the hyperelliptic curve.

Submitted 14 July, 2025; originally announced July 2025.

MSC Class: 12F20; 12J10; 12J20; 14H05; 16H05

arXiv:2507.10066 [pdf, ps, other]

LayLens: Improving Deepfake Understanding through Simplified Explanations

Authors: Abhijeet Narang, Parul Gupta, Liuyijia Su, Abhinav Dhall

Abstract: This demonstration paper presents $\mathbf{LayLens}$, a tool aimed to make deepfake understanding easier for users of all educational backgrounds. While prior works often rely on outputs containing technical jargon, LayLens bridges the gap between model reasoning and human understanding through a three-stage pipeline: (1) explainable deepfake detection using a state-of-the-art forgery localization model, (2) natural language simplification of technical explanations using a vision-language model, and (3) visual reconstruction of a plausible original image via guided image editing. The interface presents both technical and layperson-friendly explanations in addition to a side-by-side comparison of the uploaded and reconstructed images. A user study with 15 participants shows that simplified explanations significantly improve clarity and reduce cognitive load, with most users expressing increased confidence in identifying deepfakes. LayLens offers a step toward transparent, trustworthy, and user-centric deepfake forensics.

Submitted 12 August, 2025; v1 submitted 14 July, 2025; originally announced July 2025.

Comments: Accepted to ACM ICMI 2025 Demos

arXiv:2507.09157 [pdf, ps, other]

PU-Lie: Lightweight Deception Detection in Imbalanced Diplomatic Dialogues via Positive-Unlabeled Learning

Authors: Bhavinkumar Vinodbhai Kuwar, Bikrant Bikram Pratap Maurya, Priyanshu Gupta, Nitin Choudhury

Abstract: Detecting deception in strategic dialogues is a complex and high-stakes task due to the subtlety of language and extreme class imbalance between deceptive and truthful communications. In this work, we revisit deception detection in the Diplomacy dataset, where less than 5% of messages are labeled deceptive. We introduce a lightweight yet effective model combining frozen BERT embeddings, interpretable linguistic and game-specific features, and a Positive-Unlabeled (PU) learning objective. Unlike traditional binary classifiers, PU-Lie is tailored for situations where only a small portion of deceptive messages are labeled, and the majority are unlabeled. Our model achieves a new best macro F1 of 0.60 while reducing trainable parameters by over 650x. Through comprehensive evaluations and ablation studies across seven models, we demonstrate the value of PU learning, linguistic interpretability, and speaker-aware representations. Notably, we emphasize that in this problem setting, accurately detecting deception is more critical than identifying truthful messages. This priority guides our choice of PU learning, which explicitly models the rare but vital deceptive class.

Submitted 12 July, 2025; originally announced July 2025.

arXiv:2507.07093 [pdf, ps, other]

The Nineteenth Data Release of the Sloan Digital Sky Survey

Authors: SDSS Collaboration, Gautham Adamane Pallathadka, Mojgan Aghakhanloo, James Aird, Andrés Almeida, Singh Amrita, Friedrich Anders, Scott F. Anderson, Stefan Arseneau, Consuelo González Avila, Shir Aviram, Catarina Aydar, Carles Badenes, Jorge K. Barrera-Ballesteros, Franz E. Bauer, Aida Behmard, Michelle Berg, F. Besser, Christian Moni Bidin, Dmitry Bizyaev, Guillermo Blanc, Michael R. Blanton, Jo Bovy, William Nielsen Brandt, Joel R. Brownstein , et al. (187 additional authors not shown)

Abstract: Mapping the local and distant Universe is key to our understanding of it. For decades, the Sloan Digital Sky Survey (SDSS) has made a concerted effort to map millions of celestial objects to constrain the physical processes that govern our Universe. The most recent and fifth generation of SDSS (SDSS-V) is organized into three scientific "mappers": Milky Way Mapper (MWM), which aims to chart the various components of the Milky Way and constrain its formation and assembly; Black Hole Mapper (BHM), which focuses on understanding supermassive black holes in distant galaxies across the Universe; and Local Volume Mapper (LVM), which uses integral field spectroscopy to map the ionized interstellar medium in the local group. This paper describes and outlines the scope and content for the nineteenth data release (DR19) of SDSS and the most substantial to date in SDSS-V. DR19 is the first to contain data from all three mappers. Additionally, we describe nine value added catalogs (VACs) that enhance the science that can be conducted with the SDSS-V data. Finally, we discuss how to access SDSS DR19 and provide illustrative examples and tutorials.

Submitted 9 July, 2025; originally announced July 2025.

Comments: Submitted to AASJournals. 56 Pages, 9 Tables, 11 Figures

arXiv:2507.06989 [pdf, ps, other]

Sloan Digital Sky Survey-V: Pioneering Panoptic Spectroscopy

Authors: Juna A. Kollmeier, Hans-Walter Rix, Conny Aerts, James Aird, Pablo Vera Alfaro, Andrés Almeida, Scott F. Anderson, Óscar Jiménez Arranz, Stefan M. Arseneau, Roberto Assef, Shir Aviram, Catarina Aydar, Carles Badenes, Avrajit Bandyopadhyay, Kat Barger, Robert H. Barkhouser, Franz E. Bauer, Chad Bender, Felipe Besser, Binod Bhattarai, Pavaman Bilgi, Jonathan Bird, Dmitry Bizyaev, Guillermo A. Blanc, Michael R. Blanton , et al. (195 additional authors not shown)

Abstract: The Sloan Digital Sky Survey-V (SDSS-V) is pioneering panoptic spectroscopy: it is the first all-sky, multi-epoch, optical-to-infrared spectroscopic survey. SDSS-V is mapping the sky with multi-object spectroscopy (MOS) at telescopes in both hemispheres (the 2.5-m Sloan Foundation Telescope at Apache Point Observatory and the 100-inch du Pont Telescope at Las Campanas Observatory), where 500 zonal robotic fiber positioners feed light from a wide-field focal plane to an optical (R$\sim 2000$, 500 fibers) and a near-infrared (R$\sim 22,000$, 300 fibers) spectrograph. In addition to these MOS capabilities, the survey is pioneering ultra wide-field ($\sim$ 4000~deg$^2$) integral field spectroscopy enabled by a new dedicated facility (LVM-I) at Las Campanas Observatory, where an integral field spectrograph (IFS) with 1801 lenslet-coupled fibers arranged in a 0.5 degree diameter hexagon feeds multiple R$\sim$4000 optical spectrographs that cover 3600-9800 angstroms. SDSS-V's hardware and multi-year survey strategy are designed to decode the chemo-dynamical history of the Milky Way Galaxy and tackle fundamental open issues in stellar physics in its Milky Way Mapper program, trace the growth physics of supermassive black holes in its Black Hole Mapper program, and understand the self-regulation mechanisms and the chemical enrichment of galactic ecosystems at the energy-injection scale in its Local Volume Mapper program. The survey is well-timed to multiply the scientific output from major all-sky space missions.
The SDSS-V MOS programs began robotic operations in 2021; IFS observations began in 2023 with the completion of the LVM-I facility. SDSS-V builds upon decades of heritage of SDSS's pioneering advances in data analysis, collaboration spirit, infrastructure, and product deliverables in astronomy.

Submitted 9 July, 2025; originally announced July 2025.

Comments: 76 pages, submitted to the Astronomical Journal

arXiv:2507.06261 [pdf, ps, other]

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Authors: Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, Krishna Haridasan, Ahmed Omran, Nikunj Saunshi, Dara Bahri, Gaurav Mishra, Eric Chu , et al. (3410 additional authors not shown)

Abstract: In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.

Submitted 16 October, 2025; v1 submitted 7 July, 2025; originally announced July 2025.

Comments: 72 pages, 17 figures

arXiv:2507.06259 [pdf, ps, other]

Curvature inequalities for anti-invariant submersion from quaternionic space forms

Authors: Kirti Gupta, Punam Gupta, R. K. Gangele

Abstract: This paper focuses on deriving several curvature inequalities involving the Ricci and scalar curvatures of the horizontal and vertical distributions in anti-invariant Riemannian submersions from quaternionic space forms onto Riemannian manifolds. In addition, a Ricci curvature inequality for anti-invariant Riemannian submersions is established. The equality cases for all derived inequalities are also examined.

Submitted 7 July, 2025; originally announced July 2025.

Comments: arXiv admin note: text overlap with arXiv:2506.15099

MSC Class: 53C12; 53C15; 53C26; 53C55

arXiv:2507.04029 [pdf, ps, other]

Active Scalar Mixing by Homogeneous Isotropic Turbulence

Authors: Joaquim P. Jossy, Pratyush S. Awasthi, Prateek Gupta

Abstract: We study the mixing of active scalars by homogeneous isotropic incompressible stochastic velocity fields. We consider both Navier-Stokes generated turbulent fields as well as artificially generated homogeneous isotropic stochastic fields. We use Fourier pseudospectral direct numerical simulations to study the mixing dynamics of two non-reacting species of different density ratios. We use the Atwood number to create a denser mixture and a lighter mixture. We show that in the absence of stirring, a denser mixture homogenizes faster than the lighter mixture. The direction of the density gradient causes the interface across which the molecular diffusion occurs to expand outward for the denser mixture and inward for the lighter mixture. The stirring process, which enhances the diffusion process, increases the rate of homogenization in both mixing methods under study. We define a new mixing metric for studying the mixing evolution of active scalars, which indicates that a denser inhomogeneity in a lighter mixture spreads faster but homogenizes slower. For low Mach number turbulence, there is a negligible coupling between the density gradients and the velocity field responsible for stirring. The post-stirring behavior of active scalars is found to be similar to passive scalars, where the scalar energy spectra decay exponentially and exhibit self-similarity. The turbulence fields generated by solving the Navier-Stokes equation homogenize both the mixtures faster than the synthetic cases.
We show that matching the kinetic energy spectra and inertial subrange scaling of a synthetically generated stochastic field with that of a Navier-Stokes generated field is not enough to study mixing dynamics.

Submitted 5 July, 2025; originally announced July 2025.

arXiv:2507.00056 [pdf, ps, other]

Explicit Constructions of Astheno-Kähler Manifolds

Authors: Punam Gupta, Nidhi Yadav

Abstract: We investigate the conditions under which astheno-Kähler structures can be identified on the product of two compact trans-Sasakian manifolds of dimensions greater than 2.

Submitted 26 June, 2025; originally announced July 2025.

MSC Class: 53C55; 53C15; 53C25

arXiv:2506.21931 [pdf, ps, other]

ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation

Authors: Reza Yousefi Maragheh, Pratheek Vadla, Priyank Gupta, Kai Zhao, Aysenur Inan, Kehui Yao, Jianpeng Xu, Praveen Kanumala, Jason Cho, Sushant Kumar

Abstract: Retrieval-Augmented Generation (RAG) has shown promise in enhancing recommendation systems by incorporating external context into large language model prompts. However, existing RAG-based approaches often rely on static retrieval heuristics and fail to capture nuanced user preferences in dynamic recommendation scenarios. In this work, we introduce ARAG, an Agentic Retrieval-Augmented Generation framework for Personalized Recommendation, which integrates a multi-agent collaboration mechanism into the RAG pipeline. To better understand the long-term and session behavior of the user, ARAG leverages four specialized LLM-based agents: a User Understanding Agent that summarizes user preferences from long-term and session contexts, a Natural Language Inference (NLI) Agent that evaluates semantic alignment between candidate items retrieved by RAG and inferred intent, a Context Summary Agent that summarizes the findings of the NLI Agent, and an Item Ranker Agent that generates a ranked list of recommendations based on contextual fit. We evaluate ARAG across three datasets. Experimental results demonstrate that ARAG significantly outperforms standard RAG and recency-based baselines, achieving up to 42.1% improvement in NDCG@5 and 35.5% in Hit@5. We also conduct an ablation study to analyse the effect of different components of ARAG. Our findings highlight the effectiveness of integrating agentic reasoning into retrieval-augmented recommendation and provide new directions for LLM-based personalization.

Submitted 11 August, 2025; v1 submitted 27 June, 2025; originally announced June 2025.

ACM Class: I.2.11; I.2.7; H.3.3

arXiv:2506.19543 [pdf, ps, other]

Long-term atomistic finite-temperature substitutional diffusion

Authors: Shashank Saxena, Prateek Gupta, Dennis M. Kochmann

Abstract: Simulating long-term mass diffusion kinetics with atomic precision is important to predict chemical and mechanical properties of alloys over time scales of engineering interest in applications, including (but not limited to) alloy heat treatment, corrosion resistance, and hydrogen embrittlement. We present a new strategy to bridge from the time scale of atomic vibrations to that of vacancy-mediated atomic hops by a combination of statistical mechanics-based Gaussian phase packets (GPP) relaxation and a nudged elastic band (NEB)-facilitated harmonic transition state theory (H-TST) time update. We validate the approach by simulating bulk self-diffusion in copper and the segregation of vacancies and magnesium to a stacking fault and a symmetric tilt grain boundary in aluminum, modeled with an embedded atom method (EAM) potential. The method correctly predicts the kinetics in bulk copper and equilibrium impurity concentrations in aluminum, in agreement with the Langmuir-McLean solution in the dilute limit. Notably, this technique can reach realistic diffusion time scales of days, weeks, and even years in a computational time of hours, demonstrating its capability to study the long-term chemo-thermo-mechanically coupled behavior of atomic ensembles.

Submitted 24 June, 2025; originally announced June 2025.

arXiv:2506.15099 [pdf, ps, other]

On H-Conformal Semi-invariant Submersion

Authors: Punam Gupta, Kirti Gupta

Abstract: We explore h-conformal semi-invariant submersions and almost h-conformal semi-invariant submersions originating from quaternionic Kähler manifolds to Riemannian manifolds. Our investigation focuses on the geometric characteristics of these submersions, including the integrability of distributions and the geometry of foliations. Additionally, we establish the necessary and sufficient conditions for such submersions to be totally geodesic. We also examine the equivalent conditions for the total manifold of the submersion to be a twisted product manifold. Finally, we present a series of examples illustrating quaternionic Kähler manifolds and h-conformal semi-invariant submersions from quaternionic Kähler manifolds to Riemannian manifolds.

Submitted 17 June, 2025; originally announced June 2025.

MSC Class: 53C12; 53C15; 53C26; 53C55

arXiv:2506.13823 [pdf, ps, other]

KHIFC-user friendly program for studying Heavy ion fusion barrier characteristics

Authors: H. C. Manjunatha, P. S. Damodara Gupta, N. Sowmya, K. N. Sridhar

Abstract: We have developed an application for studying heavy ion fusion barrier characteristics such as fusion barrier heights ($V_B$), positions ($R_B$) and curvature of the inverted parabola ($\hbar\omega$). We call this application KHIFC (Kolar Heavy Ion fusion barrier Characteristics). This software application is hosted at "https://systematics-of-heavy-ion-fusion.vercel.app/". KHIFC produces fusion cross-sections from the simple input of projectile, target and center of mass energy. The values produced by KHIFC are validated against experiments. Efficient tools like KHIFC are essential for researchers in nuclear physics, particularly when dealing with complex systems such as actinide and superheavy nuclei. By providing quick calculations and insights, it can significantly aid in experiment planning and theoretical investigations.

Submitted 15 June, 2025; originally announced June 2025.

arXiv:2506.12678 [pdf, ps, other]

Adapting by Analogy: OOD Generalization of Visuomotor Policies via Functional Correspondence

Authors: Pranay Gupta, Henny Admoni, Andrea Bajcsy

Abstract: End-to-end visuomotor policies trained using behavior cloning have shown a remarkable ability to generate complex, multi-modal low-level robot behaviors. However, at deployment time, these policies still struggle to act reliably when faced with out-of-distribution (OOD) visuals induced by objects, backgrounds, or environment changes. Prior works in interactive imitation learning solicit corrective expert demonstrations under the OOD conditions -- but this can be costly and inefficient. We observe that task success under OOD conditions does not always warrant novel robot behaviors. In-distribution (ID) behaviors can directly be transferred to OOD conditions that share functional similarities with ID conditions. For example, behaviors trained to interact with in-distribution (ID) pens can apply to interacting with a visually-OOD pencil. The key challenge lies in disambiguating which ID observations functionally correspond to the OOD observation for the task at hand. We propose that an expert can provide this OOD-to-ID functional correspondence. Thus, instead of collecting new demonstrations and re-training at every OOD encounter, our method: (1) detects the need for feedback by first checking if current observations are OOD and then identifying whether the most similar training observations show divergent behaviors, (2) solicits functional correspondence feedback to disambiguate between those behaviors, and (3) intervenes on the OOD observations with the functionally corresponding ID observations to perform deployment-time generalization.
We validate our method across diverse real-world robotic manipulation tasks with a Franka Panda robotic manipulator. Our results show that test-time functional correspondences can improve the generalization of a vision-based diffusion policy to OOD objects and environment conditions with low feedback.

Submitted 14 June, 2025; originally announced June 2025.

Comments: 15 pages, 11 figures

Showing 1–50 of 751 results for author: Gupta, P