Search | arXiv e-print repository

An Explainable AI Model for Binary LJ Fluids

Authors: Israrul H Hashmi, Rahul Karmakar, Marripelli Maniteja, Kumar Ayush, Tarak K. Patra

Abstract: Lennard-Jones (LJ) fluids serve as an important theoretical framework for understanding molecular interactions. Binary LJ fluids, where two distinct species of particles interact based on the LJ potential, exhibit rich phase behavior and provide valuable insights into complex fluid mixtures. Here we report the construction and utility of an artificial intelligence (AI) model for binary LJ fluids, focusing on its effectiveness in predicting radial distribution functions (RDFs) across a range of conditions. The RDFs of a binary mixture with varying compositions and temperatures are collected from molecular dynamics (MD) simulations to establish and validate the AI model. In this AI pipeline, RDFs are discretized in order to reduce the output dimension of the model. This, in turn, improves the efficacy and reduces the complexity of an AI RDF model. The model is shown to predict RDFs for many unknown mixtures very accurately, especially outside the training temperature range. Our analysis suggests that the particle size ratio has a higher order impact on the microstructure of a binary mixture. We also highlight the areas where the fidelity of the AI model is low when encountering new regimes with different underlying physics.

Submitted 24 February, 2025; originally announced February 2025.

arXiv:2502.16865 [pdf, other]

Multimodal Search in Chemical Documents and Reactions

Authors: Ayush Kumar Shah, Abhisek Dey, Leo Luo, Bryan Amador, Patrick Philippy, Ming Zhong, Siru Ouyang, David Mark Friday, David Bianchi, Nick Jackson, Richard Zanibbi, Jiawei Han

Abstract: We present a multimodal search tool that facilitates retrieval of chemical reactions, molecular structures, and associated text from scientific literature. Queries may combine molecular diagrams, textual descriptions, and reaction data, allowing users to connect different representations of chemical information. To support this, the indexing process includes chemical diagram extraction and parsing, extraction of reaction data from text in tabular form, and cross-modal linking of diagrams and their mentions in text. We describe the system's architecture, key functionalities, and retrieval process, along with expert assessments of the system. This demo highlights the workflow and technical components of the search system.

Submitted 24 February, 2025; originally announced February 2025.

Comments: 4 pages, 2 figures, SIGIR 2025 Demonstration Submission

arXiv:2502.16182 [pdf, other]

IPO: Your Language Model is Secretly a Preference Classifier

Authors: Shivank Garg, Ayush Singh, Shweta Singh, Paras Chopra

Abstract: Reinforcement learning from human feedback (RLHF) has emerged as the primary method for aligning large language models (LLMs) with human preferences. While it enables LLMs to achieve human-level alignment, it often incurs significant computational and financial costs due to its reliance on training external reward models or human-labeled preferences. In this work, we propose Implicit Preference Optimization (IPO), an alternative approach that leverages generative LLMs as preference classifiers, thereby reducing the dependence on external human feedback or reward models to obtain preferences. We conduct a comprehensive evaluation on the preference classification ability of LLMs using RewardBench, assessing models across different sizes, architectures, and training levels to validate our hypothesis. Furthermore, we investigate the self-improvement capabilities of LLMs by generating multiple responses for a given instruction and employing the model itself as a preference classifier for Direct Preference Optimization (DPO)-based training. Our findings demonstrate that models trained through IPO achieve performance comparable to those utilizing state-of-the-art reward models for obtaining preferences.

Submitted 20 March, 2025; v1 submitted 22 February, 2025; originally announced February 2025.

arXiv:2502.15708 [pdf, other]

MAML: Towards a Faster Web in Developing Regions

Authors: Ayush Pandey, Matteo Varvello, Syed Ishtiaque Ahmed, Shurui Zhou, Lakshmi Subramanian, Yasir Zaki

Abstract: The web experience in developing regions remains subpar, primarily due to the growing complexity of modern webpages and insufficient optimization by content providers. Users in these regions typically rely on low-end devices and limited bandwidth, which results in a poor user experience as they download and parse webpages bloated with excessive third-party CSS and JavaScript (JS). To address these challenges, we introduce the Mobile Application Markup Language (MAML), a flat layout-based web specification language that reduces computational and data transmission demands, while replacing the excessive bloat from JS with a new scripting language centered on essential (and popular) web functionalities. Last but not least, MAML is backward compatible as it can be transpiled to minimal HTML/JavaScript/CSS and thus work with legacy browsers. We benchmark MAML in terms of page load times and sizes, using a translator which can automatically port any webpage to MAML. When compared to the popular Google AMP, across 100 testing webpages, MAML offers webpage speedups by tens of seconds under challenging network conditions thanks to its significant size reductions. Next, we run a competition involving 25 university students porting 50 of the above webpages to MAML using a web-based editor we developed. This experiment verifies that, with little developer effort, MAML is quite effective in maintaining the visual and functional correctness of the originating webpages.

Submitted 20 January, 2025; originally announced February 2025.

Comments: 8 pages, 4 figures

arXiv:2502.15392 [pdf, other]

Chitrarth: Bridging Vision and Language for a Billion People

Authors: Shaharukh Khan, Ayush Tarun, Abhinav Ravi, Ali Faraz, Akshat Patidar, Praveen Kumar Pokala, Anagha Bhangare, Raja Kolla, Chandra Khatri, Shubham Agarwal

Abstract: Recent multimodal foundation models are primarily trained on English or high resource European language data, which hinders their applicability to other medium and low-resource languages. To address this limitation, we introduce Chitrarth (Chitra: Image; Artha: Meaning), an inclusive Vision-Language Model (VLM), specifically targeting the rich linguistic diversity and visual reasoning across 10 prominent Indian languages. Our model effectively integrates a state-of-the-art (SOTA) multilingual Large Language Model (LLM) with a vision module, primarily trained on multilingual image-text data. Furthermore, we also introduce BharatBench, a comprehensive framework for evaluating VLMs across various Indian languages, ultimately contributing to more diverse and effective AI systems. Our model achieves SOTA results for benchmarks across low resource languages while retaining its efficiency in English. Through our research, we aim to set new benchmarks in multilingual-multimodal capabilities, offering substantial improvements over existing models and establishing a foundation to facilitate future advancements in this arena.

Submitted 21 February, 2025; originally announced February 2025.

arXiv:2502.12263 [pdf, other]

Improved constraints on the Faraday rotation towards eight fast radio bursts using dense grids of polarized radio galaxies

Authors: Ayush Pandhi, Bryan M. Gaensler, Ziggy Pleunis, Sebastian Hutschenreuter, Casey Law, Ryan Mckinven, Shane P. O'Sullivan, Emily B. Petroff, Tessa Vernstrom

Abstract: We present 2-4 GHz observations of polarized radio galaxies towards eight fast radio bursts (FRBs), producing grids of Faraday rotation measure (RM) sources with sky densities of 9-28 polarized sources per square degree. Using a Bayesian interpolation framework, we constrain Galactic RM fluctuations below ~ 1 degree squared angular scales around the FRB positions. Despite the positions of all eight FRBs being far from the Galactic plane, we constrain previously unresolved small-scale Galactic RM structures around six of the eight FRBs. In two of these fields, we find potential changes in the sign of the Galactic RM that are not captured by previous, sparsely sampled RM grid observations. Our Galactic RM estimate towards the FRBs differs by a few rad m^-2 up to ~ 40 rad m^-2 from the all-sky Galactic RM map of Hutschenreuter et al. (2022). Extrapolating our results to the known population of polarized FRB sources, we may be incorrectly interpreting the host galaxy RM for ~ 30% of the FRB source population with current RM grid observations. Measuring small-scale Galactic RM variations is crucial for identifying FRBs in low density and weakly magnetized environments, which in turn could serve as potent probes of cosmic magnetism.
This framework of reconstructing continuous Galactic RM structure from RM grid observations can be readily applied to FRBs that fall in the sky coverage of upcoming large-sky radio polarization surveys of radio galaxies, such as the Very Large Array Sky Survey (VLASS) and the Polarization Sky Survey of the Universe's Magnetism (POSSUM).

Submitted 17 February, 2025; originally announced February 2025.

Comments: 25 pages, 8 figures, accepted to ApJ

arXiv:2502.11572 [pdf, other]

Improving Rare-Word Recognition of Whisper in Zero-Shot Settings

Authors: Yash Jogi, Vaibhav Aggarwal, Shabari S Nair, Yash Verma, Aayush Kubba

Abstract: Whisper, despite being trained on 680K hours of web-scaled audio data, faces difficulty in recognising rare words like domain-specific terms, with a solution being contextual biasing through prompting. To improve upon this method, in this paper, we propose a supervised learning strategy to fine-tune Whisper for contextual biasing instruction. We demonstrate that by using only 670 hours of Common Voice English set for fine-tuning, our model generalises to 11 diverse open-source English datasets, achieving a 45.6% improvement in recognition of rare words and 60.8% improvement in recognition of words unseen during fine-tuning over the baseline method. Surprisingly, our model's contextual biasing ability generalises even to languages unseen during fine-tuning.

Submitted 18 February, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

Comments: Accepted at IEEE SLT 2024

arXiv:2502.11287 [pdf, other]

MC-BEVRO: Multi-Camera Bird Eye View Road Occupancy Detection for Traffic Monitoring

Authors: Arpitsinh Vaghela, Duo Lu, Aayush Atul Verma, Bharatesh Chakravarthi, Hua Wei, Yezhou Yang

Abstract: Single camera 3D perception for traffic monitoring faces significant challenges due to occlusion and limited field of view. Moreover, fusing information from multiple cameras at the image feature level is difficult because of different view angles. Further, the necessity for practical implementation and compatibility with existing traffic infrastructure compounds these challenges. To address these issues, this paper introduces a novel Bird's-Eye-View road occupancy detection framework that leverages multiple roadside cameras to overcome the aforementioned limitations. To facilitate the framework's development and evaluation, a synthetic dataset featuring diverse scenes and varying camera configurations is generated using the CARLA simulator. A late fusion and three early fusion methods were implemented within the proposed framework, with performance further enhanced by integrating backgrounds. Extensive evaluations were conducted to analyze the impact of multi-camera inputs and varying BEV occupancy map sizes on model performance. Additionally, a real-world data collection pipeline was developed to assess the model's ability to generalize to real-world environments. The sim-to-real capabilities of the model were evaluated using zero-shot and few-shot fine-tuning, demonstrating its potential for practical application. This research aims to advance perception systems in traffic monitoring, contributing to improved traffic management, operational efficiency, and road safety.

Submitted 16 February, 2025; originally announced February 2025.

arXiv:2502.11217 [pdf, other]

A Catalog of Local Universe Fast Radio Bursts from CHIME/FRB and the KKO Outrigger

Authors: The CHIME/FRB Collaboration: Mandana Amiri, Daniel Amouyal, Bridget C. Andersen, Shion Andrew, Kevin Bandura, Mohit Bhardwaj, P. J. Boyle, Charanjot Brar, Alyssa Cassity, Shami Chatterjee, Alice P. Curtin, Matt Dobbs, Fengqiu Adam Dong, Yuxin Dong, Gwendolyn M. Eadie, Tarraneh Eftekhari, Wen-fai Fong, Emmanuel Fonseca, B. M. Gaensler, Mark Halpern, Jason W. T. Hessels, Hans Hopkins, Adaeze L. Ibik, et al. (41 additional authors not shown)

Abstract: We present the first catalog of fast radio burst (FRB) host galaxies from CHIME/FRB Outriggers, selected uniformly in the radio and the optical by localizing 81 new bursts to 2'' x ~60'' accuracy using CHIME and the KKO Outrigger, located 66 km from CHIME. Of the 81 localized bursts, we use the Probabilistic Association of Transients to their Hosts (PATH) algorithm to securely identify 21 new FRB host galaxies, and compile spectroscopic redshifts for 19 systems, 15 of which are newly obtained via spectroscopic observations. The most nearby source is FRB 20231229A, at a distance of 90 Mpc. One burst in our sample is from a previously reported repeating source in a galaxy merger (FRB 20190303A). Three new FRB host galaxies (FRBs 20230203A, 20230703A, and 20231206A) are found towards X-ray and optically selected galaxy clusters, potentially doubling the sample of known galaxy cluster FRBs. A search for radio counterparts reveals that FRB 20231128A is associated with a luminous persistent radio source (PRS) candidate with high significance ($P_{cc} \sim 10^{-2}$). If its compactness is confirmed, it would be the nearest known compact PRS at $z = 0.1079$. Our catalog significantly increases the statistics of the Macquart relation at low redshifts ($z < 0.2$). In the near future, the completed CHIME/FRB Outriggers array will produce hundreds of FRBs localized with very long baseline interferometry (VLBI). This will significantly expand the known sample and pave the way for future telescopes relying on VLBI for FRB localization.

Submitted 24 March, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

Comments: 27 pages, 10 figures

arXiv:2502.10113 [pdf, other]

Strain-Induced Optical and Molecular Transformations in PET Films for Organic Electronic Applications

Authors: Mahya Ghorab, Ayush K. Ranga, Patrice Donfack, Arnulf Materny, Veit Wagner, Mojtaba Joodaki

Abstract: Poly(ethylene terephthalate) (PET) films are widely used in flexible electronics and optoelectronics, where their mechanical durability and optical performance under strain are essential for device reliability. This study investigates the impact of applied mechanical strain on the optical and molecular properties of PET at room temperature, using UV-Vis absorption and Raman spectroscopy. The work explores how varying strain levels, from 0% (unstretched) to 30%, affect the transparency, vibrational modes, and molecular reorganization within PET films. UV-Vis absorbance measurements reveal that strain induces significant changes in the light transmission properties of PET, particularly in the visible range, and increases absorption in the UVA and visible region by up to 100%. Raman spectra indicate that strain levels higher than 5% lead to irreversible shifts of vibrational lines, accompanied by an increase of their full width at half maximum (FWHM), suggesting molecular reorientation and crystallinity changes. The phonon mode coupled with C-O stretching [O-CH2] shows the strongest response to applied mechanical stress. This study provides a comprehensive understanding of strain-induced optical and structural alterations in PET, with implications for improving the mechanical and optical performance of PET-based devices in strain-sensitive applications, such as organic solar cells (OSCs), organic light-emitting diodes (OLEDs), and flexible sensors.

Submitted 14 February, 2025; originally announced February 2025.

arXiv:2502.09787 [pdf, other]

TableTalk: Scaffolding Spreadsheet Development with a Language Agent

Authors: Jenny T. Liang, Aayush Kumar, Yasharth Bajpai, Sumit Gulwani, Vu Le, Chris Parnin, Arjun Radhakrishna, Ashish Tiwari, Emerson Murphy-Hill, Guastavo Soares

Abstract: Despite its ubiquity in the workforce, spreadsheet programming remains challenging as programmers need both spreadsheet-specific knowledge (e.g., APIs to write formulas) and problem-solving skills to create complex spreadsheets. Large language models (LLMs) can help automate aspects of this process, and recent advances in planning and reasoning have enabled language agents, which dynamically plan, use tools, and take iterative actions to complete complex tasks. These agents observe, plan, and act, making them well-suited to scaffold spreadsheet programming by following expert processes. We present TableTalk, a language agent that helps programmers build spreadsheets conversationally. Its design reifies three design principles -- scaffolding, flexibility, and incrementality -- which we derived from two studies of seven programmers and 62 Excel templates. TableTalk structures spreadsheet development by generating step-by-step plans and suggesting three next steps users can choose from. It also integrates tools that enable incremental spreadsheet construction. A user study with 20 programmers shows that TableTalk produces spreadsheets 2.3 times more likely to be preferred over a baseline agent, while reducing cognitive load and time spent reasoning about spreadsheet actions by 12.6%. TableTalk's approach has implications for human-agent collaboration. This includes providing persistent direct manipulation interfaces for stopping or undoing agent actions, while ensuring that such interfaces for accepting actions can be deactivated.

Submitted 13 February, 2025; originally announced February 2025.

arXiv:2502.06693 [pdf, ps, other]

Recent Advances, Applications and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2024 Symposium

Authors: Amin Adibi, Xu Cao, Zongliang Ji, Jivat Neet Kaur, Winston Chen, Elizabeth Healey, Brighton Nuwagira, Wenqian Ye, Geoffrey Woollard, Maxwell A Xu, Hejie Cui, Johnny Xi, Trenton Chang, Vasiliki Bikia, Nicole Zhang, Ayush Noori, Yuan Xia, Md. Belal Hossain, Hanna A. Frank, Alina Peluso, Yuan Pu, Shannon Zejiang Shen, John Wu, Adibvafa Fallahpour, Sazan Mahbub , et al. (17 additional authors not shown)

Abstract: The fourth Machine Learning for Health (ML4H) symposium was held in person on December 15th and 16th, 2024, in the traditional, ancestral, and unceded territories of the Musqueam, Squamish, and Tsleil-Waututh Nations in Vancouver, British Columbia, Canada. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. The organization of the research roundtables at the conference involved 13 senior and 27 junior chairs across 13 tables. Each roundtable session included an invited senior chair (with substantial experience in the field), junior chairs (responsible for facilitating the discussion), and attendees from diverse backgrounds with an interest in the session's topic.

Submitted 10 February, 2025; originally announced February 2025.

arXiv:2502.06016 [pdf, other]

Dissecting the massive pristine, neutral gas reservoir of a remarkably bright galaxy at z = 14.179

Authors: Kasper E. Heintz, Clara Pollock, Joris Witstok, Stefano Carniani, Kevin N. Hainline, Francesco D'Eugenio, Chamilla Terp, Aayush Saxena, Darach Watson

Abstract: At cosmic dawn, the first stars and galaxies are believed to form from and be deeply embedded in clouds of dense, pristine gas. Here we present a study of the JWST/NIRSpec data of the most distant, spectroscopically confirmed galaxy observed to date, JADES-GS-z14-0 (GS-z14 for short), at $z=14.179$, combined with recent far-infrared measurements of the [OIII]-$88μ$m and [CII]-$158μ$m line transitions and underlying dust-continuum emission. Based on the observed prominent damped Lyman-$α$ (DLA) absorption profile, we determine a substantial neutral atomic hydrogen (HI) column density, $\log (N_{\rm HI} / {\rm cm^{-2}}) = 22.27^{+0.08}_{-0.09}$, consistent with previous estimates though seemingly at odds with the dynamical and gas mass of the galaxy. Using various independent but complementary approaches, considering the implied neutral gas mass from the DLA measurement, the star-formation rate surface density, and the metal abundance, we demonstrate that the total gas mass of GS-z14 is of the order $\log (M_{\rm gas} / M_\odot) = 9.8\pm 0.3$. This implies a substantial gas mass fraction, $f_{\rm gas} \gtrsim 0.9$, and that the bulk of the interstellar medium (ISM) is in the form of HI. We show that the derived gas mass is fully consistent with the non-detection of [CII]-$158μ$m, assuming an appropriate scaling to the neutral gas.
The low dust-to-gas ratio, $A_V/N_{\rm HI} = (1.3\pm 0.6)\times 10^{-23}$\,mag\,cm$^2$, derived in the line-of-sight through the DLA further indicates that the absorbing gas is more pristine than the central, star-forming regions probed by the [OIII]-$88μ$m emission. These results highlight the implications for far-infrared line-detection searches attainable with ALMA and demonstrate that the bright, relatively massive galaxy GS-z14 at $z=14.179$ is deeply embedded in a substantial, pristine HI gas reservoir dominating its baryonic matter content.

Submitted 9 February, 2025; originally announced February 2025.

Comments: Submitted. Comments welcome!

arXiv:2502.05923 [pdf, other]

ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification

Authors: Yashwanth M., Vaibhav Singh, Ayush Maheshwari, Amrith Krishna, Ganesh Ramakrishnan

Abstract: We propose ARISE, a framework that iteratively induces rules and generates synthetic data for text classification. We combine synthetic data generation and automatic rule induction, via bootstrapping, to iteratively filter the generated rules and data. We induce rules via inductive generalisation of syntactic n-grams, enabling us to capture a complementary source of supervision. These rules alone lead to performance gains in both in-context learning (ICL) and fine-tuning (FT) settings. Similarly, use of augmented data from ARISE alone improves the performance for a model, outperforming configurations that rely on complex methods like contrastive learning. Further, our extensive experiments on various datasets covering three full-shot, eight few-shot and seven multilingual variant settings demonstrate that the rules and data we generate lead to performance improvements across these diverse domains and languages.

Submitted 9 February, 2025; originally announced February 2025.

Comments: Accepted to Findings of NAACL 2025

arXiv:2502.05826 [pdf, other]

MindCraft: Revolutionizing Education through AI-Powered Personalized Learning and Mentorship for Rural India

Authors: Arihant Bardia, Aayush Agrawal

Abstract: MindCraft is a modern platform designed to revolutionize education in rural India by leveraging Artificial Intelligence (AI) to create personalized learning experiences, provide mentorship, and foster resource-sharing. In a country where access to quality education is deeply influenced by geography and socioeconomic status, rural students often face significant barriers in their educational journeys. MindCraft aims to bridge this gap by utilizing AI to create tailored learning paths, connect students with mentors, and enable a collaborative network of educational resources that transcends both physical and digital divides. This paper explores the challenges faced by rural students, the transformative potential of AI, and how MindCraft offers a scalable, sustainable solution for an equitable education system. By focusing on inclusivity, personalized learning, and mentorship, MindCraft seeks to empower rural students, equipping them with the skills, knowledge, and opportunities needed to thrive in an increasingly digital world. Ultimately, MindCraft envisions a future in which technology not only bridges educational gaps but also becomes the driving force for a more inclusive and empowered society.

Submitted 9 February, 2025; originally announced February 2025.

arXiv:2502.04260 [pdf, other]

Realistic Image-to-Image Machine Unlearning via Decoupling and Knowledge Retention

Authors: Ayush K. Varshney, Vicenç Torra

Abstract: Machine Unlearning allows participants to remove their data from a trained machine learning model in order to preserve their privacy and security. However, the machine unlearning literature for generative models is rather limited. The literature for image-to-image generative models (I2I models) treats minimizing the distance between Gaussian noise and the output of the I2I model for forget samples as machine unlearning. However, we argue that a machine learning model performs fairly well on unseen data, i.e., a retrained model will be able to catch generic patterns in the data and hence will not generate an output which is equivalent to Gaussian noise. In this paper, we consider that the model after unlearning should treat forget samples as out-of-distribution (OOD) data, i.e., the unlearned model should no longer recognize or encode the specific patterns found in the forget samples. To achieve this, we propose a framework which decouples the model parameters with gradient ascent, ensuring that forget samples are OOD for the unlearned model with a theoretical guarantee. We also provide an $(ε, δ)$-unlearning guarantee for model updates with gradient ascent. The unlearned model is further fine-tuned on the remaining samples to maintain its performance. We also propose an attack model to ensure that the unlearned model has effectively removed the influence of forget samples. Extensive empirical evaluation on two large-scale datasets, ImageNet-1K and Places365, highlights the superiority of our approach.
To show comparable performance with a retrained model, we also show the comparison of a simple AutoEncoder on various baselines on the CIFAR-10 dataset.

Submitted 6 February, 2025; originally announced February 2025.

arXiv:2502.02927 [pdf, other]

Bayesian estimation of Unit-Weibull distribution based on dual generalized order statistics with application to the Cotton Production Data

Authors: Qazi J. Azhad, Abdul Nasir Khan, Bhagwati Devi, Jahangir Sabbir Khan, Ayush Tripathi

Abstract: The Unit Weibull distribution with parameters $α$ and $β$ is studied in the context of dual generalized order statistics. For the analysis, Bayes estimators based on symmetric and asymmetric loss functions are obtained. The methods utilized for Bayesian estimation are approximation and simulation tools such as Lindley, Tierney-Kadane and Markov chain Monte Carlo methods. The authors have considered the squared error loss function as symmetric and the LINEX and general entropy loss functions as asymmetric loss functions. After presenting the mathematical results, a simulation study is conducted to exhibit the performances of the various derived estimators. Because dual generalized order statistics unify models based on distinct ordered random variables, such as order statistics and record values, the results are flexible. In continuation of this, the cotton production data of USA is analyzed for both submodels of ordered random variables: order statistics and record values.

Submitted 5 February, 2025; originally announced February 2025.

Comments: 19 Pages, 1 figure, 12 tables, preprint

ACM Class: G.3

arXiv:2502.01846 [pdf, other]

UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping

Authors: Aashish Rai, Dilin Wang, Mihir Jain, Nikolaos Sarafianos, Kefan Chen, Srinath Sridhar, Aayush Prakash

Abstract: 3D Gaussian Splatting (3DGS) has demonstrated superior quality in modeling 3D objects and scenes. However, generating 3DGS remains challenging due to their discrete, unstructured, and permutation-invariant nature. In this work, we present a simple yet effective method to overcome these challenges. We utilize spherical mapping to transform 3DGS into a structured 2D representation, termed UVGS. UVGS can be viewed as multi-channel images, with feature dimensions as a concatenation of Gaussian attributes such as position, scale, color, opacity, and rotation. We further find that these heterogeneous features can be compressed into a lower-dimensional (e.g., 3-channel) shared feature space using a carefully designed multi-branch network. The compressed UVGS can be treated as typical RGB images. Remarkably, we discover that typical VAEs trained with latent diffusion models can directly generalize to this new representation without additional training. Our novel representation makes it effortless to leverage foundational 2D models, such as diffusion models, to directly model 3DGS. Additionally, one can simply increase the 2D UV resolution to accommodate more Gaussians, making UVGS a scalable solution compared to typical 3D backbones. This approach immediately unlocks various novel generation applications of 3DGS by inherently utilizing the already developed superior 2D generation capabilities. In our experiments, we demonstrate various unconditional, conditional generation, and inpainting applications of 3DGS based on diffusion models, which were previously non-trivial.

Submitted 20 March, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

Comments: https://aashishrai3799.github.io/uvgs

arXiv:2502.00921 [pdf, ps, other]

Blink of an eye: a simple theory for feature localization in generative models

Authors: Marvin Li, Aayush Karan, Sitan Chen

Abstract: Large language models can exhibit unexpected behavior in the blink of an eye. In a recent computer use demo, a language model switched from coding to Googling pictures of Yellowstone, and these sudden shifts in behavior have also been observed in reasoning patterns and jailbreaks. This phenomenon is not unique to autoregressive models: in diffusion models, key features of the final output are decided in narrow "critical windows" of the generation process. In this work we develop a simple, unifying theory to explain this phenomenon using the formalism of stochastic localization samplers. We show that it emerges generically as the generation process localizes to a sub-population of the distribution it models. While critical windows have been studied at length in diffusion models, existing theory heavily relies on strong distributional assumptions and the particulars of Gaussian diffusion. In contrast to existing work our theory (1) applies to autoregressive and diffusion models; (2) makes no distributional assumptions; (3) quantitatively improves previous bounds even when specialized to diffusions; and (4) requires basic tools and no stochastic calculus or statistical-physics-based machinery. We also identify an intriguing connection to the all-or-nothing phenomenon from statistical inference. Finally, we validate our predictions empirically for LLMs and find that critical windows often coincide with failures in problem solving for various math and reasoning benchmarks.

Submitted 5 June, 2025; v1 submitted 2 February, 2025; originally announced February 2025.

arXiv:2502.00292 [pdf, other]

doi 10.1140/epjc/s10052-025-14238-2

Reconstructing FHDE with Scalar and Gauge Fields

Authors: Ayush Bidlan, Paulo Moniz, Oem Trivedi

Abstract: We revisit the Fractional Holographic Dark Energy (FHDE) model to reconstruct it by means of dynamic candidates such as ($i$) Quintessence, ($ii$) K-essence, ($iii$) Dilaton, ($iv$) Yang-Mills condensate, ($v$) DBI-essence, and ($vi$) Tachyonic fields in a flat Friedmann-Robertson-Walker (FRW) Universe. In particular, the dark-energy possibilities ($i$)-($vi$) are formulated through suitable field descriptions. Being concrete, we establish a comprehensive correspondence between FHDE and suitable scalar and gauge field frameworks that co-substantiate our investigation and subsequent discussion. In more detail, we methodically compute the corresponding Equation of State (EoS) parameters and field (kinetic and potential) features for the fractional parameter ($α$) range, viz. $1<α\leq2$. Conclusively, our results show that the modifications brought by the fractional features satisfactorily enable late-time cosmic acceleration, together with avoiding quantum instabilities by preventing the EoS from entering the phantom divide i.e., $ω(z)\rightarrow-\infty$, which is a common issue in standard scalar field models without fractional dynamics (e.g., K-essence field). Our findings further indicate that fractional calculus attributes can be significant in addressing the challenges of dark-energy models by offering a robust framework to prospect late-time acceleration and properly fitting observational constraints. Notably, we find that as the fractional features start to dominate, the EoS parameter of all the effective field configurations asymptotically approaches a $Λ$CDM behaviour in the far-future limit $z\rightarrow-1$. In summary, the recent perspective introduced by FHDE \citep{Trivedi:2024inb} can indeed be cast as a promising aspirant through the use of prominent field frameworks.

Submitted 31 January, 2025; originally announced February 2025.

Comments: 27 pages & 17 figures. Comments are very welcome!

Report number: 520, (2025)

Journal ref: Eur. Phys. J. C 85, 520 (2025)

arXiv:2501.17871 [pdf, other]

On the challenges of detecting MCI using EEG in the wild

Authors: Aayush Mishra, David Joffe, Sankara Surendra Telidevara, David S Oakley, Anqi Liu

Abstract: Recent studies have shown promising results in the detection of Mild Cognitive Impairment (MCI) using easily accessible Electroencephalogram (EEG) data, which would help administer early and effective treatment for dementia patients. However, the reliability and practicality of such systems remain unclear. In this work, we investigate the potential limitations and challenges in developing a robust MCI detection method using two contrasting datasets: 1) CAUEEG, collected and annotated by expert neurologists in controlled settings and 2) GENEEG, a new dataset collected and annotated in general practice clinics, a setting where routine MCI diagnoses are typically made. We find that training on small datasets, as is done by most previous works, tends to produce high variance models that make overconfident predictions and are unreliable in practice. Additionally, distribution shifts between datasets make cross-domain generalization challenging. Finally, we show that MCI detection using EEG may suffer from fundamental limitations because of the overlapping nature of feature distributions with control groups. We call for more effort in high-quality data collection in actionable settings (like general practice clinics) to make progress towards this salient goal of non-invasive MCI detection.

Submitted 15 January, 2025; originally announced January 2025.

Comments: 10 pages

arXiv:2501.17675 [pdf, other]

Glimmers in the Cosmic Dawn. II. A variability census of supermassive black holes across the Universe

Authors: Vieri Cammelli, Jonathan C. Tan, Alice R. Young, Matthew J. Hayes, Jasbir Singh, Richard S. Ellis, Aayush Saxena, Nicolas Laporte, Pierluigi Monaco, Benjamin W. Keller

Abstract: Understanding the origin and evolution of supermassive black holes (SMBH) stands as one of the most important challenges in astrophysics and cosmology, with little current theoretical consensus. Improved observational constraints on the cosmological evolution of SMBH demographics are needed. Here we report results of a search via photometric variability for SMBHs appearing as active galactic nuclei (AGN) in the cosmological volume defined by the Hubble Ultra Deep Field (HUDF). This work includes particular focus on a new observation carried out in 2023 with the Hubble Space Telescope (HST) using the WFC3/IR/F140W, which is compared directly to equivalent data taken 11 years earlier in 2012. Two earlier pairs of observations from 2009 to 2012 with WFC3/IR/F105W and WFC3/IR/F160W are also analysed. We identify 443, 149, and 78 AGN candidates as nuclear sources that exhibit photometric variability at a level of 2, 2.5 and 3 $σ$ in at least one filter. This sample includes 29, 14, and 9 AGN at redshifts $z>6$, when the Universe was $\lesssim900$ Myr old. After variability and luminosity function (down to $M_{\rm UV}=-17\:$mag) completeness corrections, we estimate the co-moving number density of SMBHs, $n_{\rm SMBH}(z)$. At $z = 6 - 9$, $n_{\rm SMBH}\gtrsim 10^{-2}\:{\rm cMpc^{-3}}$. At low-$z$ our observations are sensitive to AGN fainter than $M_{\rm UV}=-17 \:$mag, and we estimate $n_{\rm SMBH}\gtrsim 6\times 10^{-2}\:{\rm cMpc^{-3}}$. We discuss how these results place strong constraints on a variety of SMBH seeding theories.

Submitted 29 January, 2025; originally announced January 2025.

Comments: Submitted to ApJ. Comments welcome

arXiv:2501.15793 [pdf, other]

Advancing Portfolio Optimization: Adaptive Minimum-Variance Portfolios and Minimum Risk Rate Frameworks

Authors: Ayush Jha, Abootaleb Shirvani, Ali Jaffri, Svetlozar T. Rachev, Frank J. Fabozzi

Abstract: This study presents the Adaptive Minimum-Variance Portfolio (AMVP) framework and the Adaptive Minimum-Risk Rate (AMRR) metric, innovative tools designed to optimize portfolios dynamically in volatile and nonstationary financial markets. Unlike traditional minimum-variance approaches, the AMVP framework incorporates real-time adaptability through advanced econometric models, including ARFIMA-FIGARCH processes and non-Gaussian innovations. Empirical applications on cryptocurrency and equity markets demonstrate the proposed framework's superior performance in risk reduction and portfolio stability, particularly during periods of structural market breaks and heightened volatility. The findings highlight the practical implications of using the AMVP and AMRR methodologies to address modern investment challenges, offering actionable insights for portfolio managers navigating uncertain and rapidly changing market conditions.

Submitted 27 January, 2025; originally announced January 2025.

arXiv:2501.15739 [pdf, other]

doi 10.3847/1538-4357/adaec0

Automatic Machine Learning Framework to Study Morphological Parameters of AGN Host Galaxies within $z < 1.4$ in the Hyper Supreme-Cam Wide Survey

Authors: Chuan Tian, C. Megan Urry, Aritra Ghosh, Daisuke Nagai, Tonima T. Ananna, Meredith C. Powell, Connor Auge, Aayush Mishra, David B. Sanders, Nico Cappelluti, Kevin Schawinski

Abstract: We present a composite machine learning framework to estimate posterior probability distributions of bulge-to-total light ratio, half-light radius, and flux for Active Galactic Nucleus (AGN) host galaxies within $z<1.4$ and $m<23$ in the Hyper Supreme-Cam Wide survey. We divide the data into five redshift bins: low ($0<z<0.25$), mid ($0.25<z<0.5$), high ($0.5<z<0.9$), extra ($0.9<z<1.1$) and extreme ($1.1<z<1.4$), and train our models independently in each bin. We use PSFGAN to decompose the AGN point source light from its host galaxy, and invoke the Galaxy Morphology Posterior Estimation Network (GaMPEN) to estimate morphological parameters of the recovered host galaxy. We first trained our models on simulated data, and then fine-tuned our algorithm via transfer learning using labeled real data. To create training labels for transfer learning, we used GALFIT to fit $\sim 20,000$ real HSC galaxies in each redshift bin. We comprehensively examined that the predicted values from our final models agree well with the GALFIT values for the vast majority of cases. Our PSFGAN + GaMPEN framework runs at least three orders of magnitude faster than traditional light-profile fitting methods, and can be easily retrained for other morphological parameters or on other datasets with diverse ranges of resolutions, seeing conditions, and signal-to-noise ratios, making it an ideal tool for analyzing AGN host galaxies from large surveys coming soon from the Rubin-LSST, Euclid, and Roman telescopes.

Submitted 26 January, 2025; originally announced January 2025.

Comments: Accepted for publication in The Astrophysical Journal. 31 Pages. 20 Figures

arXiv:2501.15666 [pdf, other]

MimicGait: A Model Agnostic approach for Occluded Gait Recognition using Correlational Knowledge Distillation

Authors: Ayush Gupta, Rama Chellappa

Abstract: Gait recognition is an important biometric technique over large distances. State-of-the-art gait recognition systems perform very well in controlled environments at close range. Recently, there has been an increased interest in gait recognition in the wild prompted by the collection of outdoor, more challenging datasets containing variations in terms of illumination, pitch angles, and distances. An important problem in these environments is that of occlusion, where the subject is partially blocked from camera view. While important, this problem has received little attention. Thus, we propose MimicGait, a model-agnostic approach for gait recognition in the presence of occlusions. We train the network using a multi-instance correlational distillation loss to capture both inter-sequence and intra-sequence correlations in the occluded gait patterns of a subject, utilizing an auxiliary Visibility Estimation Network to guide the training of the proposed mimic network. We demonstrate the effectiveness of our approach on challenging real-world datasets like GREW, Gait3D and BRIAR. We release the code in https://github.com/Ayush-00/mimicgait.

Submitted 26 January, 2025; originally announced January 2025.

Comments: Accepted to WACV 2025 as Poster

arXiv:2501.13941 [pdf, other]

GaussMark: A Practical Approach for Structural Watermarking of Language Models

Authors: Adam Block, Ayush Sekhari, Alexander Rakhlin

Abstract: Recent advances in Large Language Models (LLMs) have led to significant improvements in natural language processing tasks, but their ability to generate human-quality text raises significant ethical and operational concerns in settings where it is important to recognize whether or not a given text was generated by a human. Thus, recent work has focused on developing techniques for watermarking LLM-generated text, i.e., introducing an almost imperceptible signal that allows a provider equipped with a secret key to determine if given text was generated by their model. Current watermarking techniques are often not practical due to concerns with generation latency, detection time, degradation in text quality, or robustness. Many of these drawbacks come from the focus on token-level watermarking, which ignores the inherent structure of text. In this work, we introduce a new scheme, GaussMark, that is simple and efficient to implement, has formal statistical guarantees on its efficacy, comes at no cost in generation latency, and embeds the watermark into the weights of the model itself, providing a structural watermark. Our approach is based on Gaussian independence testing and is motivated by recent empirical observations that minor additive corruptions to LLM weights can result in models of identical (or even improved) quality. We show that by adding a small amount of Gaussian noise to the weights of a given LLM, we can watermark the model in a way that is statistically detectable by a provider who retains the secret key. We provide formal statistical bounds on the validity and power of our procedure. Through an extensive suite of experiments, we demonstrate that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations and can be instantiated with essentially no loss in model quality.

Submitted 17 January, 2025; originally announced January 2025.

arXiv:2501.13901 [pdf, other]

Optimizing Portfolios with Pakistan-Exposed ETFs: Risk and Performance Insight

Authors: Ali Jaffri, Abootaleb Shirvani, Ayush Jha, Svetlozar T. Rachev, Frank J. Fabozzi

Abstract: This study examines the investment landscape of Pakistan as an emerging and frontier market, focusing on implications for international investors, particularly those in the United States, through exchange-traded funds (ETFs) with exposure to Pakistan. The analysis encompasses 30 ETFs with varying degrees of exposure to Pakistan, covering the period from January 1, 2016, to February 2024. This research highlights the potential benefits and risks associated with investing in these ETFs, emphasizing the importance of thorough risk assessments and portfolio performance comparisons. By providing descriptive statistics and performance metrics based on historical optimization, this paper aims to equip investors with the necessary insights to make informed decisions when optimizing their portfolios with Pakistan-exposed ETFs. The second part of the paper introduces and assesses dynamic optimization methodologies. This section is designed to explore the adaptability and performance metrics of dynamic optimization techniques in comparison with conventional historical optimization methods. By integrating dynamic optimization into the investigation, this research aims to offer insights into the efficacy of these contrasting methodologies in the context of Pakistan-exposed ETFs. The findings underscore the significance of Pakistan's market dynamics within the broader context of emerging markets, offering a pathway for diversification and potential growth in investment strategies.

Submitted 23 January, 2025; originally announced January 2025.

arXiv:2501.13890 [pdf, ps, other]

Federated Granger Causality Learning for Interdependent Clients with State Space Representation

Authors: Ayush Mohanty, Nazal Mohamed, Paritosh Ramanan, Nagi Gebraeel

Abstract: Advanced sensors and IoT devices have improved the monitoring and control of complex industrial enterprises. They have also created an interdependent fabric of geographically distributed process operations (clients) across these enterprises. Granger causality is an effective approach to detect and quantify interdependencies by examining how one client's state affects others over time. Understanding these interdependencies captures how localized events, such as faults and disruptions, can propagate throughout the system, possibly causing widespread operational impacts. However, the large volume and complexity of industrial data pose challenges in modeling these interdependencies. This paper develops a federated approach to learning Granger causality. We utilize a linear state space system framework that leverages low-dimensional state estimates to analyze interdependencies. This addresses bandwidth limitations and the computational burden commonly associated with centralized data processing. We propose augmenting the client models with the Granger causality information learned by the server through a Machine Learning (ML) function. We examine the co-dependence between the augmented client and server models and reformulate the framework as a standalone ML algorithm providing conditions for its sublinear and linear convergence rates. We also study the convergence of the framework to a centralized oracle model. Moreover, we include a differential privacy analysis to ensure data security while preserving causal insights. Using synthetic data, we conduct comprehensive experiments to demonstrate the robustness of our approach to perturbations in causality, the scalability to the size of communication, number of clients, and the dimensions of raw data. We also evaluate the performance on two real-world industrial control system datasets by reporting the volume of data saved by decentralization.

Submitted 29 May, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

Comments: Published as a conference paper at International Conference on Learning Representations (ICLR) 2025

arXiv:2501.13687 [pdf, other]

Question Answering on Patient Medical Records with Private Fine-Tuned LLMs

Authors: Sara Kothari, Ayush Gupta

Abstract: Healthcare systems continuously generate vast amounts of electronic health records (EHRs), commonly stored in the Fast Healthcare Interoperability Resources (FHIR) standard. Despite the wealth of information in these records, their complexity and volume make it difficult for users to retrieve and interpret crucial health insights. Recent advances in Large Language Models (LLMs) offer a solution, enabling semantic question answering (QA) over medical data, allowing users to interact with their health records more effectively. However, ensuring privacy and compliance requires edge and private deployments of LLMs. This paper proposes a novel approach to semantic QA over EHRs by first identifying the most relevant FHIR resources for a user query (Task1) and subsequently answering the query based on these resources (Task2). We explore the performance of privately hosted, fine-tuned LLMs, evaluating them against benchmark models such as GPT-4 and GPT-4o. Our results demonstrate that fine-tuned LLMs, while 250x smaller in size, outperform GPT-4 family models by 0.55% in F1 score on Task1 and 42% on Meteor Task in Task2. Additionally, we examine advanced aspects of LLM usage, including sequential fine-tuning, model self-evaluation (narcissistic evaluation), and the impact of training data size on performance. The models and datasets are available here: https://huggingface.co/genloop

Submitted 23 January, 2025; originally announced January 2025.

arXiv:2501.13683 [pdf, other]

Unlearning Clients, Features and Samples in Vertical Federated Learning

Authors: Ayush K. Varshney, Konstantinos Vandikas, Vicenç Torra

Abstract: Federated Learning (FL) has emerged as a prominent distributed learning paradigm. Within the scope of privacy preservation, information privacy regulations such as GDPR entitle users to request the removal (or unlearning) of their contribution from a service that is hosting the model. For this purpose, a server hosting an ML model must be able to unlearn certain information in cases such as copyright infringement or security issues that can make the model vulnerable or impact the performance of a service based on that model. While most unlearning approaches in FL focus on Horizontal FL (HFL), where clients share the feature space and the global model, Vertical FL (VFL) has received less attention from the research community. VFL involves clients (passive parties) sharing the sample space among them while not having access to the labels. In this paper, we explore unlearning in VFL from three perspectives: unlearning clients, unlearning features, and unlearning samples. To unlearn clients and features we introduce VFU-KD which is based on knowledge distillation (KD) while to unlearn samples, VFU-GA is introduced which is based on gradient ascent. To provide evidence of approximate unlearning, we utilize Membership Inference Attack (MIA) to audit the effectiveness of our unlearning approach. Our experiments across six tabular datasets and two image datasets demonstrate that VFU-KD and VFU-GA achieve performance comparable to or better than both retraining from scratch and the benchmark R2S method in many cases, with improvements of $(0-2\%)$. In the remaining cases, utility scores remain comparable, with a modest utility loss ranging from $1-5\%$. Unlike existing methods, VFU-KD and VFU-GA require no communication between active and passive parties during unlearning. However, they do require the active party to store the previously communicated embeddings.

Submitted 23 January, 2025; originally announced January 2025.

Comments: Paper accepted for publication in PETS 2025, Issue II

arXiv:2501.13483 [pdf, other]

Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data

Authors: Aayush Mishra, Daniel Habermann, Marvin Schmitt, Stefan T. Radev, Paul-Christian Bürkner

Abstract: Amortized Bayesian inference (ABI) with neural networks can solve probabilistic inverse problems orders of magnitude faster than classical methods. However, ABI is not yet sufficiently robust for widespread and safe application. When performing inference on observations outside the scope of the simulated training data, posterior approximations are likely to become highly biased, which cannot be corrected by additional simulations due to the bad pre-asymptotic behavior of current neural posterior estimators. In this paper, we propose a semi-supervised approach that enables training not only on labeled simulated data generated from the model, but also on \textit{unlabeled} data originating from any source, including real data. To achieve this, we leverage Bayesian self-consistency properties that can be transformed into strictly proper losses that do not require knowledge of ground-truth parameters. We test our approach on several real-world case studies, including applications to high-dimensional time-series and image data. Our results show that semi-supervised learning with unlabeled data drastically improves the robustness of ABI in the out-of-simulation regime. Notably, inference remains accurate even when evaluated on observations far away from the labeled and unlabeled data seen during training.

Submitted 15 May, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

arXiv:2501.13283 [pdf, other]

STM Image Analysis using Autoencoders

Authors: Peter Binev, Joshua Moorehead, Ayush Parambath, Luke Parrella, Rori Pumphrey, Miruna Savu

Abstract: This study explores the application of Convolutional Autoencoders (CAEs) for analyzing and reconstructing Scanning Tunneling Microscopy (STM) images of various crystalline lattice structures. We developed two distinct CAE architectures to process simulated STM images of simple cubic, body-centered cubic (BCC), face-centered cubic (FCC), and hexagonal lattices. Our models were trained on $17\times17$ pixel patches extracted from $256\times256$ simulated STM images, incorporating realistic noise characteristics. We evaluated the models' performance using Mean Squared Error (MSE) and Structural Similarity (SSIM) index, and analyzed the learned latent space representations. The results demonstrate the potential of deep learning techniques in STM image analysis, while also highlighting challenges in latent space interpretability and full image reconstruction. This work lays the foundation for future advancements in automated analysis of atomic-scale imaging data, with potential applications in materials science and nanotechnology.

Submitted 22 January, 2025; originally announced January 2025.

Comments: 18 pages

MSC Class: 65D40; 68T07 ACM Class: G.1.10

arXiv:2501.11935 [pdf, other]

To Google or To ChatGPT? A Comparison of CS2 Students' Information Gathering Approaches and Outcomes

Authors: Aayush Kumar, Daniel Prol, Amin Alipour, Sruti Srinivasa Ragavan

Abstract: LLMs such as ChatGPT have been widely adopted by students in higher education as tools for learning programming and related concepts. However, it remains unclear how effective students are and what strategies students use while learning with LLMs. Since the majority of students' experiences in online self-learning have come through using search engines such as Google, evaluating AI tools in this context can help us address these gaps. In this mixed methods research, we conducted an exploratory within-subjects study to understand how CS2 students learn programming concepts using both LLMs as well as traditional online methods such as educational websites and videos to examine how students approach learning within and across both scenarios. We discovered that students found it easier to learn a more difficult concept using traditional methods than using ChatGPT. We also found that students ask fewer follow-ups and use more keyword-based queries for search engines while their prompts to LLMs tend to explicitly ask for information.

Submitted 22 March, 2025; v1 submitted 21 January, 2025; originally announced January 2025.

arXiv:2501.08288 [pdf, other]

Avoiding subtraction and division of stochastic signals using normalizing flows: NFdeconvolve

Authors: Pedro Pessoa, Max Schweiger, Lance W. Q. Xu, Tristan Manha, Ayush Saurabh, Julian Antolin Camarena, Steve Pressé

Abstract: Across the scientific realm, we find ourselves subtracting or dividing stochastic signals. For instance, consider a stochastic realization, $x$, generated from the addition or multiplication of two stochastic signals $a$ and $b$, namely $x=a+b$ or $x = ab$. For the $x=a+b$ example, $a$ can be fluorescence background and $b$ the signal of interest whose statistics are to be learned from the measured $x$. Similarly, when writing $x=ab$, $a$ can be thought of as the illumination intensity and $b$ the density of fluorescent molecules of interest. Yet dividing or subtracting stochastic signals amplifies noise, and we ask instead whether, using the statistics of $a$ and the measurement of $x$ as input, we can recover the statistics of $b$. Here, we show how normalizing flows can generate an approximation of the probability distribution over $b$, thereby avoiding subtraction or division altogether. This method is implemented in our software package, NFdeconvolve, available on GitHub with a tutorial linked in the main text.

Submitted 14 January, 2025; originally announced January 2025.

arXiv:2501.07339 [pdf, other]

Evaluating Pre-Trained Models for Multi-Language Vulnerability Patching

Authors: Zanis Ali Khan, Aayush Garg, Yuejun Guo, Qiang Tang

Abstract: Software vulnerabilities pose critical security risks, demanding prompt and effective mitigation strategies. While advancements in Automated Program Repair (APR) have primarily targeted general software bugs, the domain of vulnerability patching, which is a security-critical subset of APR, remains underexplored. This paper investigates the potential of pre-trained language models, CodeBERT and CodeT5, for automated vulnerability patching across diverse datasets and five programming languages. We evaluate these models on their accuracy, computational efficiency, and how the length of vulnerable code patches impacts performance. Our findings reveal promising accuracy levels, particularly for CodeT5 on datasets with complex vulnerability patterns, while CodeBERT demonstrates strengths in handling fragmented or context-limited datasets. CodeT5 further showcases superior efficiency, making it well-suited for large-scale applications. However, both models face challenges in maintaining performance as patch length increases, highlighting the complexity of addressing extended in program repair specifically aimed at fixing vulnerabilities. This study benchmarks model performance, highlights key limitations, and offers insights to improve automated vulnerability patching for practical security applications.

Submitted 13 January, 2025; originally announced January 2025.

arXiv:2501.05656 [pdf, other]

Evidential Deep Learning for Uncertainty Quantification and Out-of-Distribution Detection in Jet Identification using Deep Neural Networks

Authors: Ayush Khot, Xiwei Wang, Avik Roy, Volodymyr Kindratenko, Mark S. Neubauer

Abstract: Current methods commonly used for uncertainty quantification (UQ) in deep learning (DL) models utilize Bayesian methods which are computationally expensive and time-consuming. In this paper, we provide a detailed study of UQ based on evidential deep learning (EDL) for deep neural network models designed to identify jets in high energy proton-proton collisions at the Large Hadron Collider and explore its utility in anomaly detection. EDL is a DL approach that treats learning as an evidence acquisition process designed to provide confidence (or epistemic uncertainty) about test data. Using publicly available datasets for jet classification benchmarking, we explore hyperparameter optimizations for EDL applied to the challenge of UQ for jet identification. We also investigate how the uncertainty is distributed for each jet class, how this method can be implemented for the detection of anomalies, how the uncertainty compares with Bayesian ensemble methods, and how the uncertainty maps onto latent spaces for the models. Our studies uncover some pitfalls of EDL applied to anomaly detection and a more effective way to quantify uncertainty from EDL as compared with the foundational EDL setup. These studies illustrate a methodological approach to interpreting EDL in jet classification models, providing new insights on how EDL quantifies uncertainty and detects out-of-distribution data which may lead to improved EDL methods for DL models applied to classification tasks.

Submitted 9 January, 2025; originally announced January 2025.

Comments: 38 pages (including references) with 17 figures and 3 tables. Repository: https://github.com/FAIR4HEP/PFIN4UQAD . Submitted to Machine Learning: Science and Technology
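
The abstract above describes EDL as treating learning as evidence acquisition that yields a confidence estimate. As a minimal sketch only, assuming the widely used Dirichlet-based formulation for classification (the abstract does not state which exact setup the paper uses), per-class evidence can be turned into class probabilities and a single uncertainty value as follows. The function name and the example evidence values are hypothetical.

```python
def edl_uncertainty(evidence):
    """Map non-negative per-class evidence to (probabilities, uncertainty).

    Assumed formulation: Dirichlet parameters alpha_k = evidence_k + 1,
    Dirichlet strength S = sum(alpha), expected probability alpha_k / S,
    and uncertainty mass u = K / S for K classes.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    probabilities = [a / strength for a in alpha]
    uncertainty = num_classes / strength
    return probabilities, uncertainty


# Hypothetical evidence vectors for a three-class problem.
no_evidence = edl_uncertainty([0.0, 0.0, 0.0])      # uncertainty is maximal (1.0)
strong_evidence = edl_uncertainty([8.0, 0.0, 0.0])  # uncertainty drops to 3/11
```

With no evidence the uncertainty equals 1 and the probabilities are uniform; as evidence accumulates for one class, the uncertainty shrinks toward 0.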

arXiv:2501.04578 [pdf, other]

Analysis of Climatic Trends and Variability in Indian Topography

Authors: Ayush Prusty, Akshita Gupta, Vivek Ashok Bohara

Abstract: Climatic change is one of the serious concerns nowadays. The impacts of climate change are global in scope and unprecedented in scale. Moreover, a small perturbation in climatic changes affects not only the pristine ecosystem but also the socioeconomic sectors. Specifically, the effect of climatic changes is related to frequent casualties. This makes it essential to delve deeper into analyzing the socio-climatic trends and variability. This work provides a comprehensive analysis of India's climatic trends, emphasizing regional variations and specifically delving into the unique climate of Delhi. Specifically, this research unveils the temporal and spatial variations in temperature patterns by amalgamating extensive datasets encompassing India's diverse landscapes. The study uses advanced statistical tools and methodologies to scrutinize temperature's annual and seasonal variability. The insights drawn from this rigorous analysis may offer invaluable contributions to regional planning strategies, adaptive measures, and informed decision-making amidst the complex impacts of climate change. By bridging the gap between broader climatic trends and localized impacts, this research aims to facilitate more effective measures to mitigate and adapt to the multifaceted challenges of climate change, ensuring more nuanced and tailored approaches. We utilized the Mann-Kendall test and Theil-Sen's slope estimator to analyze the trends and variability of the climatic conditions over the decades. The results demonstrate that temperature variations have increased by over 0.58°C on average over the last decade. Moreover, over the last decade the variability of Indian states shows that Lakshadweep faced the highest change (0.87°C), highlighting coastal vulnerability, while Tripura observed the least change of 0.07°C.

Submitted 8 January, 2025; originally announced January 2025.
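
The abstract above names the Mann-Kendall test and Theil-Sen's slope estimator. The following is a minimal sketch of both in plain Python, not the authors' implementation. It assumes a series without tied values (no tie correction in the variance) and uses the normal approximation for the p-value; the function names are illustrative.

```python
from itertools import combinations
from math import erf, sqrt
from statistics import median


def theil_sen_slope(times, values):
    """Theil-Sen estimator: median of the slopes over all pairs of points."""
    slopes = [
        (values[j] - values[i]) / (times[j] - times[i])
        for i, j in combinations(range(len(times)), 2)
        if times[j] != times[i]
    ]
    return median(slopes)


def mann_kendall(values):
    """Mann-Kendall trend test; returns (S, Z, two-sided p-value).

    Assumption: no tie correction is applied to the variance of S.
    """
    n = len(values)
    s = 0
    for i, j in combinations(range(n), 2):
        diff = values[j] - values[i]
        s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    normal_cdf = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    p_value = 2.0 * (1.0 - normal_cdf)
    return s, z, p_value
```

A positive S with a small p-value indicates a monotonic upward trend, and the Theil-Sen slope gives its magnitude per time step while staying robust to outliers.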

arXiv:2501.03133 [pdf, other]

Cloudy-Maraston: Integrating nebular continuum and line emission with the Maraston stellar population synthesis models

Authors: Sophie L. Newman, Christopher C. Lovell, Claudia Maraston, Mauro Giavalisco, William J. Roper, Aayush Saxena, Aswin P. Vijayan, Stephen M. Wilkins

Abstract: The James Webb Space Telescope has ushered in an era of abundant high-redshift observations of young stellar populations characterized by strong emission lines, motivating us to integrate nebular emission into the new Maraston stellar population model which incorporates the latest Geneva stellar evolutionary tracks for massive stars with rotation. We use the photoionization code Cloudy to obtain the emergent nebular continuum and line emission for a range of modelling parameters, then compare our results to observations on various emission line diagnostic diagrams. We carry out a detailed comparison with several other models in the literature assuming different input physics, including modified prescriptions for stellar evolution and the inclusion of binary stars, and find close agreement in the H$\rm β$, H$\rm α$, [N II]$λ6583$, and [S II]$λ6731$ luminosities between the models. However, we find significant differences in lines with high ionization energies, such as He II$λ$1640 and [O III]$λ5007$, due to large variations in the hard ionizing photon production rates. The models differ by a maximum of $\hat{Q}_{\rm [O III]λ5007} = \rm 6 \times 10^9 \; s^{-1} \, M_{\odot}^{-1}$, where these differences are mostly caused by the assumed stellar rotation and effective temperatures for the Wolf Rayet phase. Interestingly, rotation and uncorrected effective temperatures in our single star population models alone generate [O III] ionizing photon production rates higher than models including binary stars with ages between 1 to 8 Myr. These differences highlight the dependence of derived properties from SED fitting on the assumed model, as well as the sensitivity of predictions from cosmological simulations.

Submitted 6 January, 2025; originally announced January 2025.

Comments: 20 pages, 19 figures, submitted to MNRAS. Comments are welcome

arXiv:2501.03102 [pdf]

doi 10.2514/6.2025-2187

Enhancing Multirotor Drone Efficiency: Exploring Minimum Energy Consumption Rate of Forward Flight under Varying Payload

Authors: Ayush Patnaik, Nicolas Michel, Xinfan Lin

Abstract: The multirotor unmanned aerial vehicle is a prevailing type of aircraft with wide real-world applications. Energy efficiency is a critical aspect of its performance, determining the range and duration of the missions that can be performed. In this study, we show both analytically and numerically that the optimum of a key energy efficiency index in forward flight, namely energy per meter traveled per unit mass, is a constant under different vehicle mass (including payload). Note that this relationship is only true under the optimal forward velocity that minimizes the energy consumption (under different mass), but not under arbitrary velocity. The study is based on a previously developed model capturing the first-principle energy dynamics of the multirotor, and a key step is to prove that the pitch angle under optimal velocity is a constant. By employing both analytical derivation and validation studies, the research provides critical insights into the optimization of multirotor energy efficiency, and facilitates the development of flight control strategies to extend mission duration and range.

Submitted 6 January, 2025; originally announced January 2025.

Comments: https://arc.aiaa.org/doi/10.2514/6.2025-2187

Journal ref: AIAA 2025-2187

arXiv:2412.20455 [pdf, other]

Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection

Authors: Ayush Ghadiya, Purbayan Kar, Vishal Chudasama, Pankaj Wasnik

Abstract: Recently, weakly supervised video anomaly detection (WS-VAD) has emerged as a contemporary research direction to identify anomaly events like violence and nudity in videos using only video-level labels. However, this task has substantial challenges, including addressing imbalanced modality information and consistently distinguishing between normal and abnormal features. In this paper, we address these challenges and propose a multi-modal WS-VAD framework to accurately detect anomalies such as violence and nudity. Within the proposed framework, we introduce a new fusion mechanism known as the Cross-modal Fusion Adapter (CFA), which dynamically selects and enhances highly relevant audio-visual features in relation to the visual modality. Additionally, we introduce a Hyperbolic Lorentzian Graph Attention (HLGAtt) to effectively capture the hierarchical relationships between normal and abnormal representations, thereby enhancing feature separation accuracy. Through extensive experiments, we demonstrate that the proposed model achieves state-of-the-art results on benchmark datasets of violence and nudity detection.

Submitted 29 December, 2024; originally announced December 2024.

Comments: Accepted to CVPR'24 MULA Workshop

arXiv:2412.18163 [pdf]

Survey of Pseudonymization, Abstractive Summarization & Spell Checker for Hindi and Marathi

Authors: Rasika Ransing, Mohammed Amaan Dhamaskar, Ayush Rajpurohit, Amey Dhoke, Sanket Dalvi

Abstract: India's vast linguistic diversity presents unique challenges and opportunities for technological advancement, especially in the realm of Natural Language Processing (NLP). While there has been significant progress in NLP applications for widely spoken languages, the regional languages of India, such as Marathi and Hindi, remain underserved. Research in the field of NLP for Indian regional languages is at a formative stage and holds immense significance. The paper aims to build a platform which enables the user to use various features like text anonymization, abstractive text summarization and spell checking in the English, Hindi and Marathi languages. The aim of these tools is to serve enterprise and consumer clients who predominantly use Indian regional languages.

Submitted 23 December, 2024; originally announced December 2024.

arXiv:2412.15349 [pdf, other]

Adaptive Urban Planning: A Hybrid Framework for Balanced City Development

Authors: Pratham Singla, Ayush Singh, Adesh Gupta, Shivank Garg

Abstract: Urban planning faces a critical challenge in balancing city-wide infrastructure needs with localized demographic preferences, particularly in rapidly developing regions. Although existing approaches typically focus on top-down optimization or bottom-up community planning, only some frameworks successfully integrate both perspectives. Our methodology employs a two-tier approach: First, a deterministic solver optimizes basic infrastructure requirements in the city region. Second, four specialized planning agents, each representing distinct sub-regions, propose demographic-specific modifications to a master planner. The master planner then evaluates and integrates these suggestions to ensure cohesive urban development. We validate our framework using a newly created dataset comprising detailed region and sub-region maps from three developing cities in India, focusing on areas undergoing rapid urbanization. The results demonstrate that this hybrid approach enables more nuanced urban development while maintaining overall city functionality.

Submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.14048 [pdf, other]

Evidential Deep Learning for Probabilistic Modelling of Extreme Storm Events

Authors: Ayush Khot, Xihaier Luo, Ai Kagawa, Shinjae Yoo

Abstract: Uncertainty quantification (UQ) methods play an important role in reducing errors in weather forecasting. Conventional approaches in UQ for weather forecasting rely on generating an ensemble of forecasts from physics-based simulations to estimate the uncertainty. However, it is computationally expensive to generate many forecasts to predict real-time extreme weather events. Evidential Deep Learning (EDL) is an uncertainty-aware deep learning approach designed to provide confidence about its predictions using only one forecast. It treats learning as an evidence acquisition process where more evidence is interpreted as increased predictive confidence. We apply EDL to storm forecasting using real-world weather datasets and compare its performance with traditional methods. Our findings indicate that EDL not only reduces computational overhead but also enhances predictive uncertainty. This method opens up novel opportunities in research areas such as climate risk assessment, where quantifying the uncertainty about future climate is crucial.

Submitted 18 December, 2024; originally announced December 2024.

Comments: 14 pages, 10 figures

arXiv:2412.12867 [pdf, other]

Adding TESS to CRÉME. Light curves and masses of 300+ eclipsing binaries

Authors: Krzysztof G. Hełminiak, Ayush Moharana, Tilak B. Pawar, Ganesh Pawar

Abstract: The Comprehensive Research with Echelles on the Most interesting Eclipsing binaries (CRÉME) project aimed to collect high-resolution spectra of about 380 detached eclipsing binaries (DEBs), which mostly do not have literature RV data. From this vast observational material we were able to estimate masses of components of 325 double-lined systems. Since the launch of the TESS mission we have been collecting 2-min cadence photometry for the CRÉME targets through successful GI proposals. As of Sector 85, we obtained data for $>$330 of them. We are thus now in the process of comprehensively analyzing our targets. This paper presents the recent status of the CRÉME project and its space photometry counterpart, and describes several sub-projects within CRÉME that focus on specific classes of targets.

Submitted 17 December, 2024; originally announced December 2024.

Comments: 6 pages, 1 figure, 1 table. Submitted to Contributions of the Astronomical Observatory Skalnate Pleso as proceedings for "Binary and Multiple Stars in the Era of Big Sky Surveys"

arXiv:2412.10106 [pdf, other]

A Cascaded Dilated Convolution Approach for Mpox Lesion Classification

Authors: Ayush Deshmukh

Abstract: The global outbreak of the Mpox virus, classified as a Public Health Emergency of International Concern (PHEIC) by the World Health Organization, presents significant diagnostic challenges due to its visual similarity to other skin lesion diseases. Traditional diagnostic methods for Mpox, which rely on clinical symptoms and laboratory tests, are slow and labor intensive. Deep learning-based approaches for skin lesion classification offer a promising alternative. However, developing a model that balances efficiency with accuracy is crucial to ensure reliable and timely diagnosis without compromising performance. This study introduces the Cascaded Atrous Group Attention (CAGA) framework to address these challenges, combining the Cascaded Atrous Attention module and the Cascaded Group Attention mechanism. The Cascaded Atrous Attention module utilizes dilated convolutions and cascades the outputs to enhance multi-scale representation. This is integrated into the Cascaded Group Attention mechanism, which reduces redundancy in Multi-Head Self-Attention. By integrating the Cascaded Atrous Group Attention module with EfficientViT-L1 as the backbone architecture, this approach achieves state-of-the-art performance, reaching an accuracy of 98% on the Mpox Close Skin Image (MCSI) dataset while reducing model parameters by 37.5% compared to the original EfficientViT-L1. The model's robustness is demonstrated through extensive validation on two additional benchmark datasets, where it consistently outperforms existing approaches.

Submitted 13 January, 2025; v1 submitted 13 December, 2024; originally announced December 2024.

Comments: 8 pages, 4 figures, Submitted to Medical Imaging with Deep Learning

arXiv:2412.07979 [pdf, other]

AmCLR: Unified Augmented Learning for Cross-Modal Representations

Authors: Ajay Jagannath, Aayush Upadhyay, Anant Mehta

Abstract: Contrastive learning has emerged as a pivotal framework for representation learning, underpinning advances in both unimodal and bimodal applications like SimCLR and CLIP. To address fundamental limitations like large batch size dependency and bimodality, methods such as SogCLR leverage stochastic optimization for the global contrastive objective. Inspired by SogCLR's efficiency and adaptability, we introduce AmCLR and xAmCLR objective functions tailored for bimodal vision-language models to further enhance the robustness of contrastive learning. AmCLR integrates diverse augmentations, including text paraphrasing and image transformations, to reinforce the alignment of contrastive representations, keeping batch size limited to a few hundred samples unlike CLIP which needs batch size of 32,768 to produce reasonable results. xAmCLR further extends this paradigm by incorporating intra-modal alignments between original and augmented modalities for richer feature learning. These advancements yield a more resilient and generalizable contrastive learning process, aimed at overcoming bottlenecks in scaling and augmentative diversity. Since we have built our framework on the existing SogCLR, we are able to demonstrate improved representation quality with fewer computational resources, establishing a foundation for scalable and robust multi-modal learning.

Submitted 10 December, 2024; originally announced December 2024.

Comments: 16 pages, 2 figures

arXiv:2412.06956 [pdf]

Microcontroller-Driven MPPT System for Enhanced Photovoltaic Efficiency: An Experimental Approach in Nepal

Authors: Diwakar Khadka, Satish Adhikari, Atit Pokharel, Sandeep Marasinee, Aayush Pathak

Abstract: Solar energy utilization in places like Nepal is often obstructed by unpredicted environmental factors and existing technological barriers. The challenges encountered often result in fluctuating energy outputs, hindering the transition to greener energy solutions. To tackle these issues, this study introduces a custom-designed Maximum Power Point Tracking (MPPT) controller, seamlessly incorporated into a microcontroller-based battery charging system. This approach seeks to enhance the efficiency of photovoltaic (PV) systems, aligning with the global shift towards renewables. The research's primary objective is to enhance PV module power yield employing MPPT techniques, thereby reducing dependency on non-renewable energy sources. Key goals include real-time MPP tracking for optimal power extraction from PV modules and the integration of a real-time monitoring mechanism for PV and battery states. Leveraging a coordinated interplay of sensors measuring temperature, voltage, and current, vital metrics are fed to the microcontroller. This, in turn, generates a precise Pulse Width Modulation (PWM) signal, fine-tuning the voltage regulation of the buck-boost converter Metal Oxide Semiconductor Field Effect Transistor (MOSFET) for optimal operation. The adopted approach emphasizes monitoring environmental metrics, overseeing power outputs, and generating PWM signals to adeptly manage the buck-boost converter MOSFET voltage. Concurrently, data is transmitted hourly to a cloud platform, facilitating real-time monitoring capabilities showcasing the IoT application. As a result of these integrations, an efficiency improvement of approximately 37.28% was observed. In essence, this research underscores the profound impact of merging advanced technologies within the renewable energy sector, offering a robust blueprint for enhancing energy stability and productivity.

Submitted 9 December, 2024; originally announced December 2024.

Comments: Experimental analysis

arXiv:2412.05183 [pdf, other]

Privacy Drift: Evolving Privacy Concerns in Incremental Learning

Authors: Sayyed Farid Ahamed, Soumya Banerjee, Sandip Roy, Aayush Kapoor, Marc Vucovich, Kevin Choi, Abdul Rahman, Edward Bowen, Sachin Shetty

Abstract: In the evolving landscape of machine learning (ML), Federated Learning (FL) presents a paradigm shift towards decentralized model training while preserving user data privacy. This paper introduces the concept of "privacy drift", an innovative framework that parallels the well-known phenomenon of concept drift. While concept drift addresses the variability in model accuracy over time due to changes in the data, privacy drift encapsulates the variation in the leakage of private information as models undergo incremental training. By defining and examining privacy drift, this study aims to unveil the nuanced relationship between the evolution of model performance and the integrity of data privacy. Through rigorous experimentation, we investigate the dynamics of privacy drift in FL systems, focusing on how model updates and data distribution shifts influence the susceptibility of models to privacy attacks, such as membership inference attacks (MIA). Our results highlight a complex interplay between model accuracy and privacy safeguards, revealing that enhancements in model performance can lead to increased privacy risks. We provide empirical evidence from experiments on customized datasets derived from CIFAR-100 (Canadian Institute for Advanced Research, 100 classes), showcasing the impact of data and concept drift on privacy. This work lays the groundwork for future research on privacy-aware machine learning, aiming to achieve a delicate balance between model accuracy and data privacy in decentralized environments.

Submitted 6 December, 2024; originally announced December 2024.

Comments: 6 pages, 7 figures, Accepted in IEEE ICNC 25

arXiv:2412.05031 [pdf, other]

Fully independent response in disordered solids

Authors: Mengjie Zu, Aayush Desai, Carl P. Goodrich

Abstract: Unlike in crystals, it is difficult to trace emergent material properties of amorphous solids to their underlying structure. Nevertheless, one can tune features of a disordered spring network, ranging from bulk elastic constants to specific allosteric responses, through highly precise alterations of the structure. This has been understood through the notion of independent bond-level response -- the observation that in many cases, different springs have different effects on different properties. While this idea has motivated inverse design in numerous contexts, it has not been formalized and quantified in a general context that not just informs but enables and predicts inverse design. Here, we show how to quantify independent response by linearizing the simultaneous change in multiple emergent features, and introduce the much stronger notion of fully independent response. Remarkably, we find that the mechanical properties of disordered solids are always fully independent across a wide array of scenarios, regardless of the target features, tunable parameters, and details of particle-particle interactions. Furthermore, our formulation quantifies the susceptibility of feature changes to parameter changes, which we find to be correlated with the maximum linear tunability. These results formalize our understanding of a key fundamental difference between ordered and disordered solids while also creating a practical tool to both understand and perform inverse design.

Submitted 2 May, 2025; v1 submitted 6 December, 2024; originally announced December 2024.

arXiv:2412.04569 [pdf, other]

Towards Performance-Aware Allocation for Accelerated Machine Learning on GPU-SSD Systems

Authors: Ayush Gundawar, Euijun Chung, Hyesoon Kim

Abstract: The exponential growth of data-intensive machine learning workloads has exposed significant limitations in conventional GPU-accelerated systems, especially when processing datasets exceeding GPU DRAM capacity. We propose MQMS, an augmented in-storage GPU architecture and simulator that is aware of internal SSD states and operations, enabling intelligent scheduling and address allocation to overcome performance bottlenecks caused by CPU-mediated data access patterns. MQMS introduces dynamic address allocation to maximize internal parallelism and fine-grained address mapping to efficiently handle small I/O requests without incurring read-modify-write overheads. Through extensive evaluations on workloads ranging from large language model inference to classical machine learning algorithms, MQMS demonstrates orders-of-magnitude improvements in I/O request throughput, device response time, and simulation end time compared to existing simulators.

Submitted 8 December, 2024; v1 submitted 5 December, 2024; originally announced December 2024.

Showing 101–150 of 1,098 results for author: Aayush