Showing 151–200 of 906 results for author: Prateek

  1. arXiv:2401.12509  [pdf]

    cs.SI cs.LG

    Digital cloning of online social networks for language-sensitive agent-based modeling of misinformation spread

    Authors: Prateek Puri, Gabriel Hassler, Anton Shenk, Sai Katragadda

    Abstract: We develop a simulation framework for studying misinformation spread within online social networks that blends agent-based modeling and natural language processing techniques. While many other agent-based simulations exist in this space, questions over their fidelity and generalization to existing networks in part hinders their ability to provide actionable insights. To partially address these con…

    Submitted 23 January, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

  2. arXiv:2401.11464  [pdf]

    eess.IV cs.CV cs.LG

    Task-specific regularization loss towards model calibration for reliable lung cancer detection

    Authors: Mehar Prateek Kalra, Mansi Singhal, Rohan Raju Dhanakashirur

    Abstract: Lung cancer is one of the significant causes of cancer-related deaths globally. Early detection and treatment improve the chances of survival. Traditionally CT scans have been used to extract the most significant lung infection information and diagnose cancer. This process is carried out manually by an expert radiologist. The imbalance in the radiologists-to-population ratio in a country like Indi…

    Submitted 21 January, 2024; originally announced January 2024.

  3. arXiv:2401.11154  [pdf, other]

    cond-mat.soft

    Motility and pair-wise interactions of chemically active droplets in 1-D confinement

    Authors: Pawan Kumar, Prateek Dwivedi, Sobiya Ashraf, Dipin Pillai, Rahul Mangal

    Abstract: Self-propelled droplets serve as ideal model systems to delve deeper into understanding of the motion of biological micro-swimmers by simulating their motility. Biological microorganisms are renowned for showcasing a diverse array of dynamic swimming behaviors when confronted with physical constraints. This study aims to elucidate the impact of physical constraints on swimming characteristics of b…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: 13 pages, 9 figures

  4. arXiv:2401.11103  [pdf, other]

    cs.DS cs.LG stat.ML

    Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

    Authors: Jiachen T. Wang, Prateek Mittal, Ruoxi Jia

    Abstract: This work aims to address an open problem in data valuation literature concerning the efficient computation of Data Shapley for weighted $K$ nearest neighbor algorithm (WKNN-Shapley). By considering the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley into a counting problem and introduce a quadratic-time algorithm, presenting…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: AISTATS 2024 Oral

  5. arXiv:2401.09856  [pdf, other]

    cs.NI

    EDAF: An End-to-End Delay Analytics Framework for 5G-and-Beyond Networks

    Authors: Samie Mostafavi, Marius Tillner, Gourav Prateek Sharma, James Gross

    Abstract: Supporting applications in emerging domains like cyber-physical systems and human-in-the-loop scenarios typically requires adherence to strict end-to-end delay guarantees. Contributions of many tandem processes unfolding layer by layer within the wireless network result in violations of delay constraints, thereby severely degrading application performance. Meeting the application's stringent requi…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Submitted to the 11th International Workshop on Computer and Networking Experimental Research using Testbeds (CNERT 2024)

  6. arXiv:2401.04343  [pdf, other]

    cs.LG cs.CL cs.CR

    Private Fine-tuning of Large Language Models with Zeroth-order Optimization

    Authors: Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, Prateek Mittal

    Abstract: Differentially private stochastic gradient descent (DP-SGD) allows models to be trained in a privacy-preserving manner, but has proven difficult to scale to the era of foundation models. We introduce DP-ZO, a private fine-tuning framework for large language models by privatizing zeroth order optimization methods. A key insight into the design of our method is that the direction of the gradient in…

    Submitted 30 January, 2025; v1 submitted 8 January, 2024; originally announced January 2024.

  7. arXiv:2401.02412  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    LLM Augmented LLMs: Expanding Capabilities through Composition

    Authors: Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar

    Abstract: Foundational models with billions of parameters which have been trained on large corpora of data have demonstrated non-trivial skills in a variety of domains. However, due to their monolithic structure, it is challenging and expensive to augment them or impart new skills. On the other hand, due to their adaptation abilities, several new instances of these models are being trained towards new domai…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: 17 pages, 2 figures, 8 tables

  8. arXiv:2401.00446  [pdf, other]

    astro-ph.GA physics.flu-dyn

    Dissipation of AGN jets in a clumpy interstellar medium

    Authors: Riju Dutta, Prateek Sharma, Kartick C. Sarkar, James M. Stone

    Abstract: Accreting supermassive black holes (SMBHs) frequently power jets that interact with the interstellar/circumgalactic medium (ISM/CGM), regulating star-formation in the galaxy. Highly supersonic jets launched by active galactic nuclei (AGN) power a cocoon that confines them and shocks the ambient medium. We build upon the models of narrow conical jets interacting with a smooth ambient medium, to inc…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: 23 pages, 12 figures, 3 tables; to be submitted; comments are welcome; accompanying video: http://youtu.be/DUpSwMMrGfk

  9. arXiv:2312.15010  [pdf, other]

    cs.CV

    SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

    Authors: Saarthak Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, Maria Vakalopoulou, Joel Saltz, Dimitris Samaras, Rajarsi R. Gupta, Prateek Prasanna

    Abstract: Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides. Traditionally, MIL interpretability is limited to identifying salient regions deemed pertinent for downstream tasks, offering little insight to the end-user (pathologist) regarding the rationale behind these selectio…

    Submitted 18 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  10. arXiv:2312.14461  [pdf, other]

    cs.CR cs.AI cs.LG

    Attacking Byzantine Robust Aggregation in High Dimensions

    Authors: Sarthak Choudhary, Aashish Kolluri, Prateek Saxena

    Abstract: Training modern neural networks or models typically requires averaging over a sample of high-dimensional vectors. Poisoning attacks can skew or bias the average vectors used to train the model, forcing the model to learn specific patterns or avoid learning anything useful. Byzantine robust aggregation is a principled algorithmic defense against such biasing. Robust aggregators can bound the maximu…

    Submitted 15 December, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  11. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1326 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 9 May, 2025; v1 submitted 18 December, 2023; originally announced December 2023.

  12. arXiv:2312.07330  [pdf, other]

    cs.CV

    Learned representation-guided diffusion models for large-image generation

    Authors: Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le, Saarthak Kapse, Prateek Prasanna, Joel Saltz, Dimitris Samaras

    Abstract: To synthesize high-fidelity samples, diffusion models typically require auxiliary data to guide the generation process. However, it is impractical to procure the painstaking patch-level annotation effort required in specialized domains like histopathology and satellite imagery; it is often performed by domain experts and involves hundreds of millions of patches. Modern-day self-supervised learning…

    Submitted 28 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  13. Between the cosmic-ray `knee' and the `ankle': Contribution from star clusters

    Authors: Sourav Bhadra, Satyendra Thoudam, Biman B Nath, Prateek Sharma

    Abstract: We show that massive young star clusters may be possible candidates that can accelerate Galactic cosmic rays (CRs) in the range of $10^7\hbox{--}10^9$ GeV (between the `knee' and `ankle'). Various plausible scenarios such as acceleration at the wind termination shock (WTS), supernova shocks inside these young star clusters, etc. have been proposed, since it is difficult to accelerate particles up t…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 18 pages, 6 figures, accepted for publication in ApJ

  14. arXiv:2312.06749  [pdf, other]

    hep-ph astro-ph.CO

    Electroweak Phase Transition with a Double Well Done Doubly Well

    Authors: Prateek Agrawal, Simone Blasi, Alberto Mariotti, Michael Nee

    Abstract: We revisit the electroweak phase transition in the scalar singlet extension of the standard model with a $\mathbb{Z}_2$ symmetry. In significant parts of the parameter space the phase transition occurs in two steps - including canonical benchmarks used in experimental projections for gravitational waves. Domain walls produced in the first step of the transition seed the final step to the electrowe…

    Submitted 29 February, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: 24 pages, 8 figures, Journal Version

    Report number: DESY-23-208

  15. arXiv:2311.18281  [pdf, other]

    eess.IV cs.CV

    Utilizing Radiomic Feature Analysis For Automated MRI Keypoint Detection: Enhancing Graph Applications

    Authors: Sahar Almahfouz Nasser, Shashwat Pathak, Keshav Singhal, Mohit Meena, Nihar Gupte, Ananya Chinmaya, Prateek Garg, Amit Sethi

    Abstract: Graph neural networks (GNNs) present a promising alternative to CNNs and transformers in certain image processing applications due to their parameter-efficiency in modeling spatial relationships. Currently, a major area of research involves the converting non-graph input data for GNN-based models, notably in scenarios where the data originates from images. One approach involves converting images i…

    Submitted 30 November, 2023; originally announced November 2023.

  16. Adaptive friends-of-friends algorithm for identifying gravitationally bound cosmological structures

    Authors: Prateek Gupta, Surajit Paul

    Abstract: The Universe at the present epoch is found to be a network of matter over-dense and under-dense regions. To date, this picture of the Universe is best revealed through cosmological large-volume simulations and large-scale galaxy redshift surveys, in which, the most important step is the appropriate identification of structures. So far, these structures are identified using various group finding co…

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: 28 pages, 13 figures, published in the Physical Review D

    Journal ref: Vol. 108, Issue 10, Page 103509, Year 2023, Phys. Rev. D

  17. arXiv:2311.14744  [pdf]

    physics.chem-ph cs.AI cs.LG

    Coarse-Grained Configurational Polymer Fingerprints for Property Prediction using Machine Learning

    Authors: Ishan Kumar, Prateek K Jha

    Abstract: In this work, we present a method to generate a configurational level fingerprint for polymers using the Bead-Spring-Model. Unlike some of the previous fingerprinting approaches that employ monomer-level information where atomistic descriptors are computed using quantum chemistry calculations, this approach incorporates configurational information from a coarse-grained model of a long polymer chai…

    Submitted 20 November, 2023; originally announced November 2023.

  18. arXiv:2311.13171  [pdf, other]

    cs.LG cs.AI cs.CL

    ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization

    Authors: Prateek Yadav, Leshem Choshen, Colin Raffel, Mohit Bansal

    Abstract: Parameter-efficient fine-tuning (PEFT) techniques make it possible to efficiently adapt a language model to create "expert" models that specialize to new tasks or domains. Recent techniques in model merging and compositional generalization leverage these expert models by dynamically composing modules to improve zero/few-shot generalization. Despite the efficiency of PEFT methods, the size of exper…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 25 Pages, 6 Figures, 16 Tables

  19. arXiv:2311.13168  [pdf, other]

    cs.CV

    3D Face Style Transfer with a Hybrid Solution of NeRF and Mesh Rasterization

    Authors: Jianwei Feng, Prateek Singhal

    Abstract: Style transfer for human face has been widely researched in recent years. Majority of the existing approaches work in 2D image domain and have 3D inconsistency issue when applied on different viewpoints of the same face. In this paper, we tackle the problem of 3D face style transfer which aims at generating stylized novel views of a 3D human face with multi-view consistency. We propose to use a ne…

    Submitted 22 November, 2023; originally announced November 2023.

    Journal ref: WACV 2024

  20. arXiv:2311.07449  [pdf, other]

    cs.CV

    Semantically Grounded QFormer for Efficient Vision Language Understanding

    Authors: Moulik Choraria, Xinbo Wu, Sourya Basu, Nitesh Sekhar, Yue Wu, Xu Zhang, Prateek Singhal, Lav R. Varshney

    Abstract: General purpose Vision Language Models (VLMs) have received tremendous interest in recent years, owing to their ability to learn rich vision-language correlations as well as their broad zero-shot competencies. One immensely popular line of work utilizes frozen unimodal models, by bridging vision representations to language using a trainable module called the QFormer. However, this method relies he…

    Submitted 16 December, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Preprint Under Review

  21. arXiv:2311.03376  [pdf, other]

    cs.IR cs.LG stat.ML

    Blocked Collaborative Bandits: Online Collaborative Filtering with Per-Item Budget Constraints

    Authors: Soumyabrata Pal, Arun Sai Suggala, Karthikeyan Shanmugam, Prateek Jain

    Abstract: We consider the problem of \emph{blocked} collaborative bandits where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. Our goal is to design algorithms that maximize the cumulative reward accrued by all the users over time, under the \em…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: 44 pages, To Appear in NeurIPS 2023

  22. ExPECA: An Experimental Platform for Trustworthy Edge Computing Applications

    Authors: Samie Mostafavi, Vishnu Narayanan Moothedath, Stefan Rönngren, Neelabhro Roy, Gourav Prateek Sharma, Sangwon Seo, Manuel Olguín Muñoz, James Gross

    Abstract: This paper presents ExPECA, an edge computing and wireless communication research testbed designed to tackle two pressing challenges: comprehensive end-to-end experimentation and high levels of experimental reproducibility. Leveraging OpenStack-based Chameleon Infrastructure (CHI) framework for its proven flexibility and ease of operation, ExPECA is located in a unique, isolated underground facili…

    Submitted 2 November, 2023; originally announced November 2023.

  23. arXiv:2310.19332  [pdf, other]

    astro-ph.SR

    Solar Flare Prediction and Feature Selection using Light Gradient Boosting Machine Algorithm

    Authors: Vysakh P. A., Prateek Mayank

    Abstract: Solar flares are among the most severe space weather phenomena, and they have the capacity to generate radiation storms and radio disruptions on Earth. The accurate prediction of solar flare events remains a significant challenge, requiring continuous monitoring and identification of specific features that can aid in forecasting this phenomenon, particularly for different classes of solar flares.…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted for publication in Solar Physics journal

  24. arXiv:2310.18219  [pdf, other]

    astro-ph.SR astro-ph.EP physics.space-ph

    SWASTi-CME: A physics-based model to study CME evolution and its interaction with Solar Wind

    Authors: Prateek Mayank, Bhargav Vaidya, Wageesh Mishra, D. Chakrabarty

    Abstract: Coronal mass ejections (CMEs) are primary drivers of space weather and studying their evolution in the inner heliosphere is vital to prepare for a timely response. Solar wind streams, acting as background, influence their propagation in the heliosphere and associated geomagnetic storm activity. This study introduces SWASTi-CME, a newly developed MHD-based CME model integrated into the Space Weathe…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted for publication in ApJS

  25. arXiv:2310.16033  [pdf, other]

    cs.CV cs.CL

    Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs

    Authors: Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara, Filip Ilievski

    Abstract: Multimodal Large Language Models (MLLMs) have recently achieved promising zero-shot accuracy on visual question answering (VQA) -- a fundamental task affecting various downstream applications and domains. Given the great potential for the broad use of these models, it is important to investigate their limitations in dealing with different image and question properties. In this work, we investigate…

    Submitted 12 February, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: 20 pages, 12 figures, 7 tables

  26. arXiv:2310.13076  [pdf, other]

    cs.CV cs.CR

    PatchCURE: Improving Certifiable Robustness, Model Utility, and Computation Efficiency of Adversarial Patch Defenses

    Authors: Chong Xiang, Tong Wu, Sihui Dai, Jonathan Petit, Suman Jana, Prateek Mittal

    Abstract: State-of-the-art defenses against adversarial patch attacks can now achieve strong certifiable robustness with a marginal drop in model utility. However, this impressive performance typically comes at the cost of 10-100x more inference-time computation compared to undefended models -- the research community has witnessed an intense three-way trade-off between certifiable robustness, model utility,…

    Submitted 2 April, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: USENIX Security 2024. (extended) technical report

  27. arXiv:2310.12916  [pdf, ps, other]

    math.CO

    Plücker inequalities for weakly separated coordinates in totally nonnegative Grassmannian

    Authors: Daniel Soskin, Prateek Kumar Vishwakarma

    Abstract: We show that the partial sums of the long Plücker relations for pairs of weakly separated Plücker coordinates oscillate around $0$ on the totally nonnegative part of the Grassmannian. Our result generalizes the classical oscillating inequalities by Gantmacher--Krein (1941) and recent results on totally nonnegative matrix inequalities by Fallat--Vishwakarma (2023). In fact we obtain a characterizat…

    Submitted 24 January, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: Updated the main theorem (and its proof) to a more general setting. Minor changes to the exposition. 21 pages, 20 figures

    MSC Class: Primary 15A15; 15B48; 15A15; secondary 15A45; 20C08

  28. arXiv:2310.10636  [pdf, other]

    cs.LG

    Dual-Encoders for Extreme Multi-Label Classification

    Authors: Nilesh Gupta, Devvrit Khatri, Ankit S Rawat, Srinadh Bhojanapalli, Prateek Jain, Inderjit Dhillon

    Abstract: Dual-encoder (DE) models are widely used in retrieval tasks, most commonly studied on open QA benchmarks that are often characterized by multi-class and limited training data. In contrast, their performance in multi-label and data-rich retrieval settings like extreme multi-label classification (XMC), remains under-explored. Current empirical evidence indicates that DE models fall significantly sho…

    Submitted 17 March, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 27 pages, 8 figures

    Journal ref: ICLR 2024 camera-ready publication

  29. arXiv:2310.10294  [pdf, other]

    cs.CL cs.AI

    Key-phrase boosted unsupervised summary generation for FinTech organization

    Authors: Aadit Deshpande, Shreya Goyal, Prateek Nagwanshi, Avinash Tripathy

    Abstract: With the recent advances in social media, the use of NLP techniques in social media data analysis has become an emerging research direction. Business organizations can particularly benefit from such an analysis of social media discourse, providing an external perspective on consumer behavior. Some of the NLP applications such as intent detection, sentiment classification, text summarization can he…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 8 pages, 4 figures

  30. arXiv:2310.08891  [pdf, other]

    cs.LG cs.IR

    EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval

    Authors: Ramnath Kumar, Anshul Mittal, Nilesh Gupta, Aditya Kusupati, Inderjit Dhillon, Prateek Jain

    Abstract: Dense embedding-based retrieval is widely used for semantic search and ranking. However, conventional two-stage approaches, involving contrastive embedding learning followed by approximate nearest neighbor search (ANNS), can suffer from misalignment between these stages. This mismatch degrades retrieval performance. We propose End-to-end Hierarchical Indexing (EHI), a novel method that directly ad…

    Submitted 13 October, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  31. arXiv:2310.07931  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    D2 Pruning: Message Passing for Balancing Diversity and Difficulty in Data Pruning

    Authors: Adyasha Maharana, Prateek Yadav, Mohit Bansal

    Abstract: Analytical theories suggest that higher-quality data can lead to lower test errors in models trained on a fixed data budget. Moreover, a model can be trained on a lower compute budget without compromising performance if a dataset can be stripped of its redundancies. Coreset selection (or data pruning) seeks to select a subset of the training data so as to maximize the performance of models trained…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 17 pages (Our code is available at https://github.com/adymaharana/d2pruning)

  32. arXiv:2310.07727  [pdf, other]

    cs.CV eess.IV

    Deep Learning based Systems for Crater Detection: A Review

    Authors: Atal Tewari, K Prateek, Amrita Singh, Nitin Khanna

    Abstract: Craters are one of the most prominent features on planetary surfaces, used in applications such as age estimation, hazard detection, and spacecraft navigation. Crater detection is a challenging problem due to various aspects, including complex crater characteristics such as varying sizes and shapes, data resolution, and planetary data types. Similar to other computer vision tasks, deep learning-ba…

    Submitted 28 September, 2023; originally announced October 2023.

  33. arXiv:2310.07707  [pdf, other]

    cs.LG cs.CL cs.CV

    MatFormer: Nested Transformer for Elastic Inference

    Authors: Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, Kaifeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, Prateek Jain

    Abstract: Foundation models are applied in a broad spectrum of settings with different inference constraints, from massive multi-accelerator clusters to resource-constrained standalone mobile devices. However, the substantial costs associated with training these models often limit the number of unique model sizes that can be offered. Consequently, practitioners are compelled to select a model that may not b…

    Submitted 14 December, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 30 pages, 11 figures, first three authors contributed equally. NeurIPS, 2024

  34. arXiv:2310.07514  [pdf]

    stat.AP

    Causal inference for disruption management in urban metro networks

    Authors: Nan Zhang, Daniel Horcher, Prateek Bansal, Daniel J. Graham

    Abstract: Urban metro systems can provide highly efficient and effective movements of vast passenger volumes in cities, but they are often affected by disruptions, causing delays, crowding, and ultimately a decline in passenger satisfaction and patronage. To manage and mitigate such adverse consequences, metro operators could benefit greatly from a quantitative understanding of the causal impact of disrupti…

    Submitted 11 October, 2023; originally announced October 2023.

  35. arXiv:2310.03717  [pdf, other]

    astro-ph.GA astro-ph.IM

    Beyond radial profiles: Using log-normal distributions to model the multiphase circumgalactic medium

    Authors: Alankar Dutta, Mukesh Singh Bisht, Prateek Sharma, Ritali Ghosh, Manami Roy, Biman B. Nath

    Abstract: Recent observations and simulations reveal that the circumgalactic medium (CGM) surrounding galaxies is multiphase, with the gas temperatures spanning a wide range at most radii, $\sim 10^4\ {\rm K}$ to the virial temperature ($\sim 10^6$ K for Milky Way). Traditional CGM models using simple density profiles are inadequate at reproducing observations that indicate a broad temperature range. Altern…

    Submitted 9 April, 2024; v1 submitted 26 September, 2023; originally announced October 2023.

    Comments: 23 pages, 15 figures, 4 tables; submitted to MNRAS

  36. arXiv:2310.03693  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!

    Authors: Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson

    Abstract: Optimizing large language models (LLMs) for downstream use cases often involves the customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama models and OpenAI's APIs for fine-tuning GPT-3.5 Turbo on custom datasets also encourage this practice. But, what are the safety costs associated with such custom fine-tuning? We note that while existing safety alignment inf…

    Submitted 5 October, 2023; originally announced October 2023.

  37. arXiv:2310.02166  [pdf, other]

    cs.CL

    Large Language Models Meet Knowledge Graphs to Answer Factoid Questions

    Authors: Mikhail Salnikov, Hai Le, Prateek Rajput, Irina Nikishina, Pavel Braslavski, Valentin Malykh, Alexander Panchenko

    Abstract: Recently, it has been shown that the incorporation of structured knowledge into Large Language Models significantly improves the results for a variety of NLP tasks. In this paper, we propose a method for exploring pre-trained Text-to-Text Language Models enriched with additional information from Knowledge Graphs for answering factoid questions. More specifically, we propose an algorithm for subgra…

    Submitted 3 October, 2023; originally announced October 2023.

  38. arXiv:2310.01334  [pdf, other]

    cs.LG cs.AI cs.CL

    Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy

    Authors: Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen

    Abstract: Sparsely activated Mixture-of-Experts (SMoE) has shown promise to scale up the learning capacity of neural networks, however, they have issues like (a) High Memory Usage, due to duplication of the network layers into multiple copies as experts; and (b) Redundancy in Experts, as common learning-based routing policies suffer from representational collapse. Therefore, vanilla SMoE models are memory i…

    Submitted 14 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: This paper is accepted in ICLR 2024

  39. arXiv:2309.14393  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models

    Authors: Ahmad Faiz, Sotaro Kaneda, Ruhan Wang, Rita Osi, Prateek Sharma, Fan Chen, Lei Jiang

    Abstract: The carbon footprint associated with large language models (LLMs) is a significant concern, encompassing emissions from their training, inference, experimentation, and storage processes, including operational and embodied carbon emissions. An essential aspect is accurately estimating the carbon impact of emerging LLMs even before their training, which heavily relies on GPU usage. Existing studies…

    Submitted 19 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: 15 pages, 8 figures

    Journal ref: published in ICLR2024

  40. Nonparametric mixed logit model with market-level parameters estimated from market share data

    Authors: Xiyuan Ren, Joseph Y. J. Chow, Prateek Bansal

    Abstract: We propose a nonparametric mixed logit model that is estimated using market-level choice share data. The model treats each market as an agent and represents taste heterogeneity through market-specific parameters by solving a multiagent inverse utility maximization problem, addressing the limitations of existing market-level choice models with parametric estimation. A simulation study is conducted…

    Submitted 19 April, 2025; v1 submitted 22 September, 2023; originally announced September 2023.

    Journal ref: Transportation Research Part B 196 (2025) 103220

  41. arXiv:2309.10023  [pdf, other]

    hep-ph physics.atom-ph

    Searching for axion forces with spin precession in atoms and molecules

    Authors: Prateek Agrawal, Nicholas R. Hutzler, David E. Kaplan, Surjeet Rajendran, Mario Reig

    Abstract: We propose to use atoms and molecules as quantum sensors of axion-mediated monopole-dipole forces. We show that electron spin precession experiments using atomic and molecular beams are well-suited for axion searches thanks to the presence of co-magnetometer states and single-shot temporal resolution. Experimental strategies to detect axion gradients from localised sources and the earth are presen…

    Submitted 29 August, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 9 pages, 2 figures. Comments welcome. V2: matches published version. Appendix on axion co-magnetometry added

    Journal ref: J. High Energy Phys. 2024, 133 (2024)

  42. arXiv:2309.09212  [pdf, other]

    cs.RO

    RobotPerf: An Open-Source, Vendor-Agnostic, Benchmarking Suite for Evaluating Robotics Computing System Performance

    Authors: Víctor Mayoral-Vilches, Jason Jabbour, Yu-Shun Hsiao, Zishen Wan, Martiño Crespo-Álvarez, Matthew Stewart, Juan Manuel Reina-Muñoz, Prateek Nagras, Gaurav Vikhe, Mohammad Bakhshalipour, Martin Pinzger, Stefan Rass, Smruti Panigrahi, Giulio Corradi, Niladri Roy, Phillip B. Gibbons, Sabrina M. Neuman, Brian Plancher, Vijay Janapa Reddi

    Abstract: We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which measures performance by eliminating upper layers and re…

    Submitted 29 January, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

  43. arXiv:2309.08751  [pdf, ps, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    Diverse Audio Embeddings -- Bringing Features Back Outperforms CLAP!

    Authors: Prateek Verma

    Abstract: With the advent of modern AI architectures, a shift has happened towards end-to-end architectures. This pivot has led to neural architectures being trained without domain-specific biases/knowledge, optimized according to the task. We in this paper, learn audio embeddings via diverse feature representations, in this case, domain-specific. For the case of audio classification over hundreds of catego…

    Submitted 6 May, 2025; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: 6 pages, 1 figure, 2 tables

  44. arXiv:2309.07330  [pdf, other]

    cs.CV

    Automated Assessment of Critical View of Safety in Laparoscopic Cholecystectomy

    Authors: Yunfan Li, Himanshu Gupta, Haibin Ling, IV Ramakrishnan, Prateek Prasanna, Georgios Georgakis, Aaron Sasson

    Abstract: Cholecystectomy (gallbladder removal) is one of the most common procedures in the US, with more than 1.2M procedures annually. Compared with classical open cholecystectomy, laparoscopic cholecystectomy (LC) is associated with significantly shorter recovery period, and hence is the preferred method. However, LC is also associated with an increase in bile duct injuries (BDIs), resulting in significa…

    Submitted 13 September, 2023; originally announced September 2023.

  45. arXiv:2309.06439  [pdf, other]

    cs.CV

    Attention De-sparsification Matters: Inducing Diversity in Digital Pathology Representation Learning

    Authors: Saarthak Kapse, Srijan Das, Jingwei Zhang, Rajarsi R. Gupta, Joel Saltz, Dimitris Samaras, Prateek Prasanna

    Abstract: We propose DiRL, a Diversity-inducing Representation Learning technique for histopathology imaging. Self-supervised learning techniques, such as contrastive and non-contrastive approaches, have been shown to learn rich and effective representations of digitized tissue samples with limited pathologist supervision. Our analysis of vanilla SSL-pretrained models' attention distribution reveals an insi…

    Submitted 12 September, 2023; originally announced September 2023.

  46. arXiv:2309.06349  [pdf, other]

    stat.ML cs.LG eess.SY math.OC math.ST

    Generalized Regret Analysis of Thompson Sampling using Fractional Posteriors

    Authors: Prateek Jaiswal, Debdeep Pati, Anirban Bhattacharya, Bani K. Mallick

    Abstract: Thompson sampling (TS) is one of the most popular and earliest algorithms to solve stochastic multi-armed bandit problems. We consider a variant of TS, named $α$-TS, where we use a fractional or $α$-posterior ($α\in(0,1)$) instead of the standard posterior distribution. To compute an $α$-posterior, the likelihood in the definition of the standard posterior is tempered with a factor $α$. For $α$-TS…

    Submitted 12 September, 2023; originally announced September 2023.

  47. arXiv:2309.05000  [pdf, other]

    astro-ph.GA

    Multiphase Neutral Interstellar Medium: Analyzing Simulation with H I 21cm Observational Data Analysis Techniques

    Authors: Soumyadeep Bhattacharjee, Nirupam Roy, Prateek Sharma, Amit Seta, Christoph Federrath

    Abstract: Several different methods are regularly used to infer the properties of the neutral interstellar medium (ISM) using atomic hydrogen (H I) 21cm absorption and emission spectra. In this work, we study various techniques used for inferring ISM gas phase properties, namely the correlation between brightness temperature and optical depth $(T_B(v)$, $τ(v))$ at each channel velocity ($v$), and decomposit…

    Submitted 23 November, 2023; v1 submitted 10 September, 2023; originally announced September 2023.

    Comments: 22 pages (including appendixes), 16 figures, 3 tables, Accepted for publication in MNRAS

  48. arXiv:2309.03934  [pdf, other]

    hep-th hep-ph

    The Monodromic Axion-Photon Coupling

    Authors: Prateek Agrawal, Arthur Platschorre

    Abstract: We consider the general form of the axion coupling to photons in the axion-Maxwell theory. On general grounds this coupling takes the form of a monodromic function of the axion, which we call $g(a)$, multiplying the Chern-Pontryagin density $F \widetilde{F}$ of the photon. We show that the non-linearity of $g(a)$ is a spurion for the shift symmetry of the axion. In this context, when…

    Submitted 10 October, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: 20 pages, 1 figure; v2: typos corrected, references added

  49. arXiv:2309.00748  [pdf, other]

    cs.CV cs.LG

    PathLDM: Text conditioned Latent Diffusion Model for Histopathology

    Authors: Srikar Yellapragada, Alexandros Graikos, Prateek Prasanna, Tahsin Kurc, Joel Saltz, Dimitris Samaras

    Abstract: To achieve high-quality results, diffusion models must be trained on large datasets. This can be notably prohibitive for models in specialized domains, such as computational pathology. Conditioning on labeled data is known to help in data-efficient model training. Therefore, histopathology reports, which are rich in valuable clinical information, are an ideal choice as guidance for a histopatholog…

    Submitted 30 November, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: WACV 2024 publication

  50. arXiv:2308.15709  [pdf, other]

    cs.LG cs.CR cs.GT stat.ML

    Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

    Authors: Jiachen T. Wang, Yuqing Zhu, Yu-Xiang Wang, Ruoxi Jia, Prateek Mittal

    Abstract: Data valuation aims to quantify the usefulness of individual data sources in training machine learning (ML) models, and is a critical aspect of data-centric ML research. However, data valuation faces significant yet frequently overlooked privacy challenges despite its importance. This paper studies these challenges with a focus on KNN-Shapley, one of the most practical data valuation methods nowad…

    Submitted 25 November, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: NeurIPS 2023 Spotlight