
Showing 101–150 of 907 results for author: Prateek

  1. arXiv:2409.12924  [pdf, other]

    eess.SP cs.AI cs.CL cs.LG cs.SD eess.AS

    Wavelet GPT: Wavelet Inspired Large Language Models

    Authors: Prateek Verma

    Abstract: Large Language Models (LLMs) have ushered in a new wave of artificial intelligence advancements impacting every scientific field and discipline. We live in a world where most of the data around us, e.g., text, audio, and music, has a multi-scale structure. This paper infuses LLMs with a traditional signal processing idea, namely wavelets, during pre-training to take advantage of the structure. Wit…

    Submitted 9 February, 2025; v1 submitted 3 September, 2024; originally announced September 2024.

    Comments: 12 pages, 4 figures

  2. arXiv:2409.10870  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Adaptive Large Language Models By Layerwise Attention Shortcuts

    Authors: Prateek Verma, Mert Pilanci

    Abstract: Transformer architectures are the backbone of the modern AI revolution. However, they are based on simply stacking the same blocks in dozens of layers and processing information sequentially from one block to another. In this paper, we propose to challenge this and introduce adaptive computations for LLM-like setups, which allow the final layer to attend to all of the intermediate layers as it dee…

    Submitted 16 September, 2024; originally announced September 2024.

    Comments: 6 pages, 3 figures

  3. arXiv:2409.06152  [pdf, other]

    quant-ph cs.NI

    Comparing One- and Two-way Quantum Repeater Architectures

    Authors: Prateek Mantri, Kenneth Goodenough, Don Towsley

    Abstract: Quantum repeaters are an essential building block for realizing long-distance quantum communications. However, due to the fragile nature of quantum information, these repeaters suffer from loss and operational errors. Prior works have classified repeaters into three broad categories based on their use of probabilistic or near-deterministic methods to mitigate these errors. Besides differences in c…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: 25 pages, 7 figures

  4. arXiv:2409.01754  [pdf, other]

    cs.CY cs.AI cs.CL cs.HC

    Empirical evidence of Large Language Model's influence on human spoken communication

    Authors: Hiromu Yakura, Ezequiel Lopez-Lopez, Levin Brinkmann, Ignacio Serna, Prateek Gupta, Iyad Rahwan

    Abstract: Artificial Intelligence (AI) agents now interact with billions of humans in natural language, thanks to advances in Large Language Models (LLMs) like ChatGPT. This raises the question of whether AI has the potential to shape a fundamental aspect of human culture: the way we speak. Recent analyses revealed that scientific publications already exhibit evidence of AI-specific language. But this evide…

    Submitted 3 September, 2024; originally announced September 2024.

  5. arXiv:2409.01735  [pdf, ps, other]

    stat.ME stat.AP

    Multi-objective Bayesian optimization for Likelihood-Free inference in sequential sampling models of decision making

    Authors: David Chen, Xinwei Li, Eui-Jin Kim, Prateek Bansal, David Nott

    Abstract: Statistical models are often defined by a generative process for simulating synthetic data, but this can lead to intractable likelihoods. Likelihood free inference (LFI) methods enable Bayesian inference to be performed in this case. Extending a popular approach to simulation-efficient LFI for single-source data, we propose Multi-objective Bayesian Optimization for Likelihood Free Inference (MOBOL…

    Submitted 4 June, 2025; v1 submitted 3 September, 2024; originally announced September 2024.

    Comments: 39 pages, 16 figures

  6. arXiv:2408.15218  [pdf, other]

    eess.IV cs.CV

    Histo-Diffusion: A Diffusion Super-Resolution Method for Digital Pathology with Comprehensive Quality Assessment

    Authors: Xuan Xu, Saarthak Kapse, Prateek Prasanna

    Abstract: Digital pathology has advanced significantly over the last decade, with Whole Slide Images (WSIs) encompassing vast amounts of data essential for accurate disease diagnosis. High-resolution WSIs are essential for precise diagnosis but technical limitations in scanning equipment and variability in slide preparation can hinder obtaining these images. Super-resolution techniques can enhance low-resolu…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: We have submitted our paper to Medical Image Analysis and are currently awaiting feedback

  7. arXiv:2408.07357  [pdf, other]

    physics.flu-dyn

    Mixing of active scalars due to random shock waves in two dimensions

    Authors: Joaquim P. Jossy, Prateek Gupta

    Abstract: In this work, we investigate the mixing of active scalars in two dimensions by the stirring action of stochastically generated shock waves. We use direct numerical simulations (DNS) of the interaction of shock waves with two non-reacting species to analyse the mixing dynamics for different Atwood numbers (At). Unlike passive scalars, the presence of density gradients in active scalars makes the sp…

    Submitted 14 August, 2024; originally announced August 2024.

  8. arXiv:2408.07057  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

    Authors: Prateek Yadav, Colin Raffel, Mohammed Muqeeth, Lucas Caccia, Haokun Liu, Tianlong Chen, Mohit Bansal, Leshem Choshen, Alessandro Sordoni

    Abstract: The availability of performant pre-trained models has led to a proliferation of fine-tuned expert models that are specialized to a particular domain or task. Model MoErging methods aim to recycle expert models to create an aggregate system with improved performance or generalization. A key component of MoErging methods is the creation of a router that decides which expert model(s) to use for a par…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 26 pages

  9. arXiv:2408.06130  [pdf, other]

    cs.DC

    FaasMeter: Energy-First Serverless Computing

    Authors: Abdul Rehman, Alexander Fuerst, Prateek Sharma

    Abstract: Functions as a Service has emerged as a popular abstraction for a wide range of cloud applications and an important cloud workload. We present the design and implementation of FaasMeter, a FaaS control plane which provides energy monitoring, accounting, control, and pricing as first-class operations. The highly diverse and dynamic workloads of FaaS create additional complexity to measuring and con…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 11 figures, 15 pages

  10. arXiv:2408.04870  [pdf, other]

    cs.CR cs.AI

    ConfusedPilot: Confused Deputy Risks in RAG-based LLMs

    Authors: Ayush RoyChowdhury, Mulong Luo, Prateek Sahu, Sarbartha Banerjee, Mohit Tiwari

    Abstract: Retrieval augmented generation (RAG) is a process where a large language model (LLM) retrieves useful information from a database and then generates the responses. It is becoming popular in enterprise settings for daily business operations. For example, Copilot for Microsoft 365 has accumulated millions of businesses. However, the security implications of adopting such RAG-based systems are unclea…

    Submitted 23 October, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

  11. arXiv:2407.21072  [pdf, other]

    cs.AI cs.CL

    Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks

    Authors: Marco AF Pimentel, Clément Christophe, Tathagata Raha, Prateek Munjal, Praveen K Kanithi, Shadab Khan

    Abstract: As large language models (LLMs) continue to evolve, the need for robust and standardized evaluation benchmarks becomes paramount. Evaluating the performance of these models is a complex challenge that requires careful consideration of various linguistic tasks, model architectures, and benchmarking methodologies. In recent years, various frameworks have emerged as noteworthy contributions to the fi…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: 15 pages, 3 figures

  12. arXiv:2407.19985  [pdf, other]

    cs.CV cs.AI cs.LG

    Mixture of Nested Experts: Adaptive Processing of Visual Tokens

    Authors: Gagan Jain, Nidhi Hegde, Aditya Kusupati, Arsha Nagrani, Shyamal Buch, Prateek Jain, Anurag Arnab, Sujoy Paul

    Abstract: The visual medium (images and videos) naturally contains a large amount of information redundancy, thereby providing a great opportunity for leveraging efficiency in processing. While Vision Transformer (ViT) based models scale effectively to large data regimes, they fail to capitalize on this inherent redundancy, leading to higher computational costs. Mixture of Experts (MoE) networks demonstrate…

    Submitted 30 July, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

  13. arXiv:2407.12753  [pdf, other]

    cs.CV cs.AI cs.LG

    LookupViT: Compressing visual information to a limited number of tokens

    Authors: Rajat Koner, Gagan Jain, Prateek Jain, Volker Tresp, Sujoy Paul

    Abstract: Vision Transformers (ViT) have emerged as the de-facto choice for numerous industry grade vision solutions. But their inference cost can be prohibitive for many settings, as they compute self-attention in each layer which suffers from quadratic computational complexity in the number of tokens. On the other hand, spatial information in images and spatio-temporal information in videos is usually spa…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  14. arXiv:2407.02712  [pdf, other]

    eess.SP eess.IV

    Parametric Modeling and Estimation of Photon Registrations for 3D Imaging

    Authors: Weijian Zhang, Hashan K. Weerasooriya, Prateek Chennuri, Stanley H. Chan

    Abstract: In single-photon light detection and ranging (SP-LiDAR) systems, the histogram distortion due to hardware dead time fundamentally limits the precision of depth estimation. To compensate for the dead time effects, the photon registration distribution is typically modeled based on the Markov chain self-excitation process. However, this is a discrete process and it is computationally expensive, thus…

    Submitted 2 July, 2024; originally announced July 2024.

  15. arXiv:2406.16797  [pdf, other]

    cs.CL cs.AI

    Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs

    Authors: Ashwinee Panda, Berivan Isik, Xiangyu Qi, Sanmi Koyejo, Tsachy Weissman, Prateek Mittal

    Abstract: Existing methods for adapting large language models (LLMs) to new tasks are not suited to multi-task adaptation because they modify all the model weights -- causing destructive interference between tasks. The resulting effects, such as catastrophic forgetting of earlier tasks, make it challenging to obtain good performance on multiple tasks at the same time. To mitigate this, we propose Lottery Ti…

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  16. arXiv:2406.15877  [pdf, other]

    cs.SE cs.AI cs.CL

    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    Authors: Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, et al. (8 additional authors not shown)

    Abstract: Task automation has been greatly empowered by the recent advances in Large Language Models (LLMs) via Python code, where the tasks range from software engineering development to general-purpose reasoning. While current benchmarks have shown that LLMs can solve tasks using programs like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks o…

    Submitted 1 April, 2025; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted at ICLR 2025 (Oral), built with love by the BigCode community :)

  17. arXiv:2406.14861  [pdf, other]

    eess.SY cs.ET

    Resilience of the Electric Grid through Trustable IoT-Coordinated Assets (Extended version)

    Authors: Vineet J. Nair, Venkatesh Venkataramanan, Priyank Srivastava, Partha S. Sarker, Anurag Srivastava, Laurentiu D. Marinovici, Jun Zha, Christopher Irwin, Prateek Mittal, John Williams, Jayant Kumar, H. Vincent Poor, Anuradha M. Annaswamy

    Abstract: The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) including renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. Howev…

    Submitted 30 January, 2025; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted to the Proceedings of the National Academy of Sciences (PNAS) 2025. Extended version with supplementary information included

  18. arXiv:2406.14598  [pdf, other]

    cs.AI

    SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

    Authors: Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, Prateek Mittal

    Abstract: Evaluating aligned large language models' (LLMs) ability to recognize and reject unsafe user requests is crucial for safe, policy-compliant deployments. Existing evaluation efforts, however, face three limitations that we address with SORRY-Bench, our proposed benchmark. First, existing methods often use coarse-grained taxonomies of unsafe topics, and are over-representing some fine-grained topics…

    Submitted 1 March, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: Paper accepted to ICLR 2025

  19. arXiv:2406.11011  [pdf, ps, other]

    cs.LG cs.CL stat.ML

    Data Shapley in One Training Run

    Authors: Jiachen T. Wang, Prateek Mittal, Dawn Song, Ruoxi Jia

    Abstract: Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts. However, existing approaches require re-training models on different data subsets, which is computationally intensive, foreclosing their application to large-scale models. Furthermore, they produce the same attribution score for any models produced by running the learning algorithm, m…

    Submitted 7 June, 2025; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: ICLR 2025 Outstanding Paper Runner-Up

  20. arXiv:2406.10254  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Towards Signal Processing In Large Language Models

    Authors: Prateek Verma, Mert Pilanci

    Abstract: This paper introduces the idea of applying signal processing inside a Large Language Model (LLM). With the recent explosion of generative AI, our work can help bridge two fields together, namely the field of signal processing and large language models. We draw parallels between classical Fourier-Transforms and Fourier Transform-like learnable time-frequency representations for every intermediate a…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 3 figures

  21. arXiv:2406.08554  [pdf, other]

    physics.chem-ph cond-mat.stat-mech quant-ph

    Quantum Hardware-Enabled Molecular Dynamics via Transfer Learning

    Authors: Abid Khan, Prateek Vaish, Yaoqi Pang, Nikhil Kowshik, Michael S. Chen, Clay H. Batton, Grant M. Rotskoff, J. Wayne Mullinax, Bryan K. Clark, Brenda M. Rubenstein, Norm M. Tubman

    Abstract: The ability to perform ab initio molecular dynamics simulations using potential energies calculated on quantum computers would allow virtually exact dynamics for chemical and biochemical systems, with substantial impacts on the fields of catalysis and biophysics. However, noisy hardware, the costs of computing gradients, and the number of qubits required to simulate large systems present major cha…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 1- pages, 12 figures

  22. arXiv:2406.05946  [pdf, other]

    cs.CR cs.AI

    Safety Alignment Should Be Made More Than Just a Few Tokens Deep

    Authors: Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu, Xiao Ma, Subhrajit Roy, Ahmad Beirami, Prateek Mittal, Peter Henderson

    Abstract: The safety alignment of current Large Language Models (LLMs) is vulnerable. Relatively simple attacks, or even benign fine-tuning, can jailbreak aligned models. We argue that many of these vulnerabilities are related to a shared underlying issue: safety alignment can take shortcuts, wherein the alignment adapts a model's generative distribution primarily over only its very first few output tokens.…

    Submitted 9 June, 2024; originally announced June 2024.

  23. arXiv:2405.20365  [pdf, other]

    astro-ph.HE astro-ph.CO hep-ph

    Breaking into the window of primordial black hole dark matter with x-ray microlensing

    Authors: Manish Tamta, Nirmal Raj, Prateek Sharma

    Abstract: Primordial black holes (PBHs) in the mass range $10^{-16}-10^{-11}~M_\odot$ may constitute all the dark matter. We show that gravitational microlensing of bright x-ray pulsars provides the most robust and immediately implementable opportunity to uncover PBH dark matter in this mass window. As proofs of concept, we show that the currently operational NICER telescope can probe this window near…

    Submitted 28 February, 2025; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 10 pages revtex4 + references, 4 figures, 1 table; v2 clarifies limit-setting procedure, adds notes on future missions, matches PRD

  24. arXiv:2405.19524  [pdf, other]

    cs.CR cs.AI

    AI Risk Management Should Incorporate Both Safety and Security

    Authors: Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal

    Abstract: The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this pape…

    Submitted 29 May, 2024; originally announced May 2024.

  25. arXiv:2405.15556  [pdf, other]

    cs.LG cs.CL cs.CR

    Certifiably Robust RAG against Retrieval Corruption

    Authors: Chong Xiang, Tong Wu, Zexuan Zhong, David Wagner, Danqi Chen, Prateek Mittal

    Abstract: Retrieval-augmented generation (RAG) has been shown vulnerable to retrieval corruption attacks: an attacker can inject malicious passages into retrieval results to induce inaccurate responses. In this paper, we propose RobustRAG as the first defense framework against retrieval corruption attacks. The key insight of RobustRAG is an isolate-then-aggregate strategy: we get LLM responses from each pas…

    Submitted 24 May, 2024; originally announced May 2024.

  26. arXiv:2405.15068  [pdf]

    physics.bio-ph cond-mat.soft

    Rapid Sensing of Heat Stress using Machine Learning of Micrographs of Red Blood Cells Dispersed in Liquid Crystals

    Authors: Prateek Verma, Elizabeth Adeogun, Elizabeth S. Greene, Sami Dridi, Ukash Nakarmi, Karthik Nayani

    Abstract: An imbalance between bodily heat production and heat dissipation leads to heat stress in organisms. In addition to diminished animal well-being, heat stress is detrimental to the poultry industry as poultry entails fast growth and high yield, resulting in greater metabolic activity and higher body heat production. When stressed, cells overexpress heat shock proteins (such as HSP70, a well-establis…

    Submitted 23 May, 2024; originally announced May 2024.

  27. arXiv:2405.01616  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Generative Active Learning for the Search of Small-molecule Protein Binders

    Authors: Maksym Korablyov, Cheng-Hao Liu, Moksh Jain, Almer M. van der Sloot, Eric Jolicoeur, Edward Ruediger, Andrei Cristian Nica, Emmanuel Bengio, Kostiantyn Lapchevskyi, Daniel St-Cyr, Doris Alexandra Schuetz, Victor Ion Butoi, Jarrid Rector-Brooks, Simon Blackburn, Leo Feng, Hadi Nekoei, SaiKrishna Gottipati, Priyesh Vijayan, Prateek Gupta, Ladislav Rampášek, Sasikanth Avancha, Pierre-Luc Bacon, William L. Hamilton, Brooks Paige, Sanchit Misra, et al. (9 additional authors not shown)

    Abstract: Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecu…

    Submitted 2 May, 2024; originally announced May 2024.

  28. arXiv:2405.01349  [pdf, other]

    cs.LG cs.CR

    Position: Towards Resilience Against Adversarial Examples

    Authors: Sihui Dai, Chong Xiang, Tong Wu, Prateek Mittal

    Abstract: Current research on defending against adversarial examples focuses primarily on achieving robustness against a single attack type such as $\ell_2$ or $\ell_{\infty}$-bounded attacks. However, the space of possible perturbations is much larger than considered by many existing defenses and is difficult to mathematically model, so the attacker can easily bypass the defense by using a type of attack t…

    Submitted 8 October, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  29. arXiv:2405.00876  [pdf, other]

    cs.CV cs.AI cs.LG

    Beyond Human Vision: The Role of Large Vision Language Models in Microscope Image Analysis

    Authors: Prateek Verma, Minh-Hao Van, Xintao Wu

    Abstract: Vision language models (VLMs) have recently emerged and gained the spotlight for their ability to comprehend the dual modality of image and textual data. VLMs such as LLaVA, ChatGPT-4, and Gemini have recently shown impressive performance on tasks such as natural image captioning, visual question answering (VQA), and spatial reasoning. Additionally, a universal segmentation model by Meta AI, Segme…

    Submitted 1 May, 2024; originally announced May 2024.

  30. arXiv:2405.00174  [pdf, other]

    astro-ph.SR physics.space-ph

    Using sunRunner3D to interpret the global structure of the heliosphere from in situ measurements

    Authors: José Juan González-Avilés, Pete Riley, Michal Ben-Nun, Prateek Mayank, Bhargav Vaidya

    Abstract: Understanding the large-scale three-dimensional structure of the inner heliosphere, while important in its own right, is crucial for space weather applications, such as forecasting the time of arrival and propagation of coronal mass ejections (CMEs). This study uses sunRunner3D (SR3D), a 3-D magnetohydrodynamic (MHD) model, to simulate solar wind (SW) streams and generate background states. SR3D emp…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 31 pages, 9 figures, 3 tables, accepted for publication in the Journal of Space Weather and Space Climate

    Report number: 12

    Journal ref: J. Space Weather Space Clim. Volume 14, 2024

  31. arXiv:2404.14779  [pdf, other]

    cs.CL

    Med42 -- Evaluating Fine-Tuning Strategies for Medical LLMs: Full-Parameter vs. Parameter-Efficient Approaches

    Authors: Clément Christophe, Praveen K Kanithi, Prateek Munjal, Tathagata Raha, Nasir Hayat, Ronnie Rajan, Ahmed Al-Mahrooqi, Avani Gupta, Muhammad Umar Salman, Gurpreet Gosal, Bhargav Kanakiya, Charles Chen, Natalia Vassilieva, Boulbaba Ben Amor, Marco AF Pimentel, Shadab Khan

    Abstract: This study presents a comprehensive analysis and comparison of two predominant fine-tuning methodologies - full-parameter fine-tuning and parameter-efficient tuning - within the context of medical Large Language Models (LLMs). We developed and refined a series of LLMs, based on the Llama-2 architecture, specifically designed to enhance medical knowledge retrieval, reasoning, and question-answering…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Published at AAAI 2024 Spring Symposium - Clinical Foundation Models

  32. arXiv:2404.13740  [pdf, other]

    cond-mat.soft

    Chemical interactions in active droplets

    Authors: Prateek Dwivedi, Sobiya Ashraf, Pawan Kumar, Dipin Pillai, Rahul Mangal

    Abstract: Interactions among biologically active agents are facilitated by their self-generated chemical and hydrodynamic fields. In order to elucidate the pair-wise interactions between such micro-organisms, we employ active droplets as a model system, capable of self-generating chemical and hydrodynamic fields. We demonstrate that the solute Péclet number ($Pe$), characterizing the relative strength of its…

    Submitted 8 February, 2025; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 11 pages, 6 figures

  33. arXiv:2404.09635  [pdf, other]

    physics.flu-dyn

    Pair statistics of oblate spheroids settling in a turbulent flow

    Authors: Prateek Anand, Samriddhi Sankar Ray

    Abstract: We perform direct numerical simulations of sub-Kolmogorov, inertial spheroids settling under gravity in homogeneous, isotropic turbulence and find that small-scale clustering, measured via the correlation dimension, depends sensitively on their aspect ratios. In particular, such particles are shown to cluster more as their anisotropy increases. Further, the approach rate for pairs of spheroids are…

    Submitted 9 April, 2025; v1 submitted 15 April, 2024; originally announced April 2024.

    Journal ref: J. Fluid Mech. 1009 (2025) A69

  34. Ram pressure stripping in clusters: Gravity can bind the ISM but not the CGM

    Authors: Ritali Ghosh, Alankar Dutta, Prateek Sharma

    Abstract: We explore the survival of a galaxy's circumgalactic medium (CGM) as it experiences ram pressure stripping (RPS) moving through the intracluster medium (ICM). For a satellite galaxy, the CGM is often assumed to be entirely stripped/evaporated, an assumption that may not always be justified. We carry out 3D-hydrodynamic simulations of the interstellar and circumgalactic media (ISM+CGM) of a galaxy…

    Submitted 11 August, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 23 pages, 15 figures; journal-accepted version; a short video description of the paper is here: https://youtu.be/yssXeoE6JV4?si=etoZuLLT1btLzo6_

    Journal ref: MNRAS, vol 531, 3445-3467 (2024)

  35. arXiv:2404.00399  [pdf, other]

    cs.CL cs.AI cs.LG

    Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code

    Authors: Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, et al. (20 additional authors not shown)

    Abstract: Pretrained language models are an integral part of AI applications, but their high computational cost for training limits accessibility. Initiatives such as Bloom and StarCoder aim to democratize access to pretrained models for collaborative community development. Despite these efforts, such models encounter challenges such as limited multilingual capabilities, risks of catastrophic forgetting dur…

    Submitted 26 December, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Preprint

  36. arXiv:2404.00222  [pdf, ps, other]

    math.CO math.AC

    Positivity preservers over finite fields

    Authors: Dominique Guillot, Himanshu Gupta, Prateek Kumar Vishwakarma, Chi Hoi Yip

    Abstract: We resolve an algebraic version of Schoenberg's celebrated theorem [Duke Math.J., 1942] characterizing entrywise matrix transforms that preserve positive definiteness. Compared to the classical real and complex settings, we consider matrices with entries in a finite field and obtain a complete characterization of such preservers for matrices of a fixed dimension. When the dimension of the matrices…

    Submitted 18 October, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

    Comments: 32 pages, LaTeX; this version contains simplified proofs. Section 6 is completely new

    MSC Class: 15B48 (primary); 15B33; 05C25; 05C50; 11T06 (secondary)

  37. arXiv:2403.20327  [pdf, other]

    cs.CL cs.AI

    Gecko: Versatile Text Embeddings Distilled from Large Language Models

    Authors: Jinhyuk Lee, Zhuyun Dai, Xiaoqi Ren, Blair Chen, Daniel Cer, Jeremy R. Cole, Kai Hui, Michael Boratko, Rajvi Kapadia, Wen Ding, Yi Luan, Sai Meher Karthik Duddu, Gustavo Hernandez Abrego, Weiqiang Shi, Nithi Gupta, Aditya Kusupati, Prateek Jain, Siddhartha Reddy Jonnalagadda, Ming-Wei Chang, Iftekhar Naim

    Abstract: We present Gecko, a compact and versatile text embedding model. Gecko achieves strong retrieval performance by leveraging a key idea: distilling knowledge from large language models (LLMs) into a retriever. Our two-step distillation process begins with generating diverse, synthetic paired data using an LLM. Next, we further refine the data quality by retrieving a set of candidate passages for each…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 18 pages

  38. Zutu: A Platform for Localization and Navigation of Swarm Robots Using Virtual Grids

    Authors: Prateek, Pawan Wadhwani, Reshesh Kumar Pathak, Mayur Bhosale, A Helen Victoria

    Abstract: Swarm robots, which are inspired from the way insects behave collectively in order to achieve a common goal, have become a major part of research with applications involving search and rescue, area exploration, surveillance etc. In this paper, we present a swarm of robots that do not require individual extrinsic sensors to sense the environment but instead use a single central camera to locate and…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted at 7th International Conference on Robotics and Automation Engineering, ICRAE 2022, Singapore, November 18 - November 20, 2022

  39. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1112 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 16 December, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  40. arXiv:2403.04890  [pdf, other]

    cs.CL

    Few shot chain-of-thought driven reasoning to prompt LLMs for open ended medical question answering

    Authors: Saeel Sandeep Nachane, Ojas Gramopadhye, Prateek Chanda, Ganesh Ramakrishnan, Kshitij Sharad Jadhav, Yatin Nandwani, Dinesh Raghu, Sachindra Joshi

    Abstract: In this paper, we propose a modified version of the MedQA-USMLE dataset, named MEDQA-OPEN, which contains open-ended medical questions without options to mimic clinical scenarios, along with clinician-approved reasoned answers. Additionally, we implement a prompt driven by Chain of Thought (CoT) reasoning, CLINICR, to mirror the prospective process of incremental reasoning, reaching a correct resp…

    Submitted 15 October, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: The paper is accepted in EMNLP 2024 Findings

  41. arXiv:2403.00871  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Teach LLMs to Phish: Stealing Private Information from Language Models

    Authors: Ashwinee Panda, Christopher A. Choquette-Choo, Zhengming Zhang, Yaoqing Yang, Prateek Mittal

    Abstract: When large language models are trained on private data, it can be a significant privacy risk for them to memorize and regurgitate sensitive information. In this work, we propose a new practical data extraction attack that we call "neural phishing". This attack enables an adversary to target and extract sensitive or personally identifiable information (PII), e.g., credit card numbers, from a model…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: ICLR 2024

  42. arXiv:2402.14162  [pdf, other]

    cs.CV cs.AI

    On Large Visual Language Models for Medical Imaging Analysis: An Empirical Study

    Authors: Minh-Hao Van, Prateek Verma, Xintao Wu

    Abstract: Recently, large language models (LLMs) have taken the spotlight in natural language processing. Further, integrating LLMs with vision enables the users to explore emergent abilities with multimodal data. Visual language models (VLMs), such as LLaVA, Flamingo, or CLIP, have demonstrated impressive performance on various visio-linguistic tasks. Consequently, there are enormous applications of large…

    Submitted 21 February, 2024; originally announced February 2024.

  43. Finite-temperature grain boundary properties from quasistatic atomistics

    Authors: Miguel Spínola, Shashank Saxena, Prateek Gupta, Brandon Runnels, Dennis M. Kochmann

    Abstract: Grain boundary (GB) properties greatly influence the mechanical, electrical, and thermal response of polycrystalline materials. Most computational studies of GB properties at finite temperatures use molecular dynamics (MD), which is computationally expensive, limited in the range of accessible timescales, and requires cumbersome techniques like thermodynamic integration to estimate free energies.…

    Submitted 19 February, 2024; originally announced February 2024.

    Journal ref: Computational Materials Science (2024)

  44. arXiv:2402.09889  [pdf, other]

    physics.flu-dyn

    Dissipation of nonlinear acoustic waves in thermoviscous pores

    Authors: Krishna Sahithi, Prateek Gupta

    Abstract: We derive a nonlinear acoustic wave propagation model for analysing the thermoviscous dissipation in narrow pores with wavy walls. As the nonlinear waves propagate in the thermoviscous pores, the wave-steepening effect competes with the bulk dissipation, as well as the thermoviscous heat transfer and shear from the pore walls. Consequently, the length scale of the wave is modified. We use the char…

    Submitted 15 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  45. arXiv:2402.09360  [pdf, other]

    cs.LG cs.AI

    HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference

    Authors: Yashas Samaga B L, Varun Yerram, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli

    Abstract: Autoregressive decoding with generative Large Language Models (LLMs) on accelerators (GPUs/TPUs) is often memory-bound where most of the time is spent on transferring model parameters from high bandwidth memory (HBM) to cache. On the other hand, recent works show that LLMs can maintain quality with significant sparsity/redundancy in the feedforward (FFN) layers by appropriately training the model…

    Submitted 14 February, 2024; originally announced February 2024.

  46. arXiv:2402.08644  [pdf, other]

    cs.AI cs.CL

    Tandem Transformers for Inference Efficient LLMs

    Authors: Aishwarya P S, Pranav Ajit Nair, Yashas Samaga, Toby Boyd, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli

    Abstract: The autoregressive nature of conventional large language models (LLMs) inherently limits inference speed, as tokens are generated sequentially. While speculative and parallel decoding techniques attempt to mitigate this, they face limitations: either relying on less accurate smaller models for generation or failing to fully leverage the base LLM's representations. We introduce a novel architectu…

    Submitted 20 October, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  47. arXiv:2402.06086  [pdf, other]

    cs.DC cs.AI cs.DS

    Rhizomes and Diffusions for Processing Highly Skewed Graphs on Fine-Grain Message-Driven Systems

    Authors: Bibrak Qamar Chandio, Prateek Srivastava, Maciej Brodowicz, Martin Swany, Thomas Sterling

    Abstract: The paper provides a unified co-design of 1) a programming and execution model that allows spawning tasks from within the vertex data at runtime, 2) language constructs for actions that send work to where the data resides, combining parallel expressiveness of local control objects (LCOs) to implement asynchronous graph processing primitives, and 3) an innovative vertex-centric data-struct…

    Submitted 7 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.02576

    ACM Class: C.1.4; C.3; C.4; D.1.3

  48. arXiv:2402.05162  [pdf, other]

    cs.LG cs.AI cs.CL

    Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

    Authors: Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson

    Abstract: Large language models (LLMs) show inherent brittleness in their safety mechanisms, as evidenced by their susceptibility to jailbreaking and even non-malicious fine-tuning. This study explores this brittleness of safety alignment by leveraging pruning and low-rank modifications. We develop methods to identify critical regions that are vital for safety guardrails, and that are disentangled from util…

    Submitted 24 October, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 22 pages, 9 figures. Project page is available at https://boyiwei.com/alignment-attribution/

  49. Localizing uniformly moving single-frequency sources using an inverse 2.5D approach

    Authors: Christian H. Kasess, Wolfgang Kreuzer, Prateek Soni, Holger Waubke

    Abstract: Localizing linearly moving sound sources using microphone arrays is challenging as the transient nature of the signal leads to relatively short observation periods. Commonly, a moving focus is used and most methods operate at least partially in the time domain. In contrast, this manuscript presents an inverse source localization algorithm for uniformly moving single-frequency sources that acts ent…

    Submitted 20 August, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 35 pages, 17 figures

    Journal ref: J. Sound Vib., 593 (2024), 118653

  50. arXiv:2401.15605  [pdf, other]

    cs.HC cs.CY

    AI as a Medical Ally: Evaluating ChatGPT's Usage and Impact in Indian Healthcare

    Authors: Aryaman Raina, Prateek Mishra, Harshit Goyal, Dhruv Kumar

    Abstract: This study investigates the integration and impact of Large Language Models (LLMs), like ChatGPT, in India's healthcare sector. Our research employs a dual approach, engaging both general users and medical professionals through surveys and interviews respectively. Our findings reveal that healthcare professionals value ChatGPT in medical education and preliminary clinical settings, but exercise ca…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: Under review