Showing 1–50 of 427 results for author: Tushar

  1. arXiv:2506.21080  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception

    Authors: Sanjoy Chowdhury, Subrata Biswas, Sayan Nag, Tushar Nagarajan, Calvin Murdock, Ishwarya Ananthabhotla, Yijun Qian, Vamsi Krishna Ithapu, Dinesh Manocha, Ruohan Gao

    Abstract: Modern perception models, particularly those designed for multisensory egocentric tasks, have achieved remarkable performance but often come with substantial computational costs. These high demands pose challenges for real-world deployment, especially in resource-constrained environments. In this paper, we introduce EgoAdapt, a framework that adaptively performs cross-modal distillation and policy…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Accepted at ICCV 2025

  2. arXiv:2506.18289  [pdf, ps, other]

    cs.SE cs.AI

    Tu(r)ning AI Green: Exploring Energy Efficiency Cascading with Orthogonal Optimizations

    Authors: Saurabhsingh Rajput, Mootez Saad, Tushar Sharma

    Abstract: AI's exponential growth intensifies computational demands and energy challenges. While practitioners employ various optimization techniques, which we refer to as "knobs" in this paper, to tune model efficiency, these are typically afterthoughts and reactive ad-hoc changes applied in isolation without understanding their combinatorial effects on energy efficiency. This paper emphasizes treating ener…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: In review

  3. arXiv:2506.13109  [pdf, ps, other]

    cs.CL cs.AI

    Leveraging In-Context Learning for Language Model Agents

    Authors: Shivanshu Gupta, Sameer Singh, Ashish Sabharwal, Tushar Khot, Ben Bogin

    Abstract: In-context learning (ICL) with dynamically selected demonstrations combines the flexibility of prompting large language models (LLMs) with the ability to leverage training data to improve performance. While ICL has been highly successful for prediction and generation tasks, leveraging it for agentic tasks that require sequential decision making is challenging -- one must think not only about how t…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 16 pages, 12 figures

  4. arXiv:2506.12404  [pdf, ps, other]

    cs.LG cs.AI

    EXGnet: a single-lead explainable-AI guided multiresolution network with train-only quantitative features for trustworthy ECG arrhythmia classification

    Authors: Tushar Talukder Showrav, Soyabul Islam Lincoln, Md. Kamrul Hasan

    Abstract: Background: Deep learning has significantly advanced ECG arrhythmia classification, enabling high accuracy in detecting various cardiac conditions. The use of single-lead ECG systems is crucial for portable devices, as they offer convenience and accessibility for continuous monitoring in diverse settings. However, the interpretability and reliability of deep learning models in clinical application…

    Submitted 14 June, 2025; originally announced June 2025.

    Comments: 21 pages, 3 figures

  5. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  6. In-Sensor Motion Recognition with Memristive System and Light Sensing Surfaces

    Authors: Hritom Das, Imran Fahad, SNB Tushar, Sk Hasibul Alam, Graham Buchanan, Danny Scott, Garrett S. Rose, Sai Swaminathan

    Abstract: In this paper, we introduce a novel device architecture that merges memristive devices with light-sensing surfaces, for energy-efficient motion recognition at the edge. Our light-sensing surface captures motion data through in-sensor computation. This data is then processed using a memristive system equipped with a HfO2-based synaptic device, coupled with a winner-take-all (WTA) circuit, tailored…

    Submitted 7 June, 2025; originally announced June 2025.

    Comments: The paper was published in the 2024 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

  7. arXiv:2506.03259  [pdf]

    cs.CL

    Evaluating Large Language Models for Zero-Shot Disease Labeling in CT Radiology Reports Across Organ Systems

    Authors: Michael E. Garcia-Alcoser, Mobina GhojoghNejad, Fakrul Islam Tushar, David Kim, Kyle J. Lafata, Geoffrey D. Rubin, Joseph Y. Lo

    Abstract: Purpose: This study aims to evaluate the effectiveness of large language models (LLMs) in automating disease annotation of CT radiology reports. We compare a rule-based algorithm (RBA), RadBERT, and three lightweight open-weight LLMs for multi-disease labeling of chest, abdomen, and pelvis (CAP) CT reports. Materials and Methods: This retrospective study analyzed 40,833 CT reports from 29,540 pa…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: 23 pages, 10 figures, to be submitted in Radiology: Artificial Intelligence

    ACM Class: I.2.7

  8. arXiv:2506.02945  [pdf, ps, other]

    cs.CL cs.LG

    Quantitative LLM Judges

    Authors: Aishwarya Sahoo, Jeevana Kruthi Karnuthala, Tushar Parmanand Budhwani, Pranchal Agarwal, Sankaran Vaidyanathan, Alexa Siu, Franck Dernoncourt, Jennifer Healey, Nedim Lipka, Ryan Rossi, Uttaran Bhattacharya, Branislav Kveton

    Abstract: LLM-as-a-judge is a framework in which a large language model (LLM) automatically evaluates the output of another LLM. We propose quantitative LLM judges, which align evaluation scores of existing LLM judges to human scores in a given domain using regression models. The models are trained to improve the score of the original judge by using the judge's textual evaluation and score. We present four…

    Submitted 3 June, 2025; originally announced June 2025.

  9. arXiv:2506.01173  [pdf, ps, other]

    cs.DB cs.LG

    SIFBench: An Extensive Benchmark for Fatigue Analysis

    Authors: Tushar Gautam, Robert M. Kirby, Jacob Hochhalter, Shandian Zhe

    Abstract: Fatigue-induced crack growth is a leading cause of structural failure across critical industries such as aerospace, civil engineering, automotive, and energy. Accurate prediction of stress intensity factors (SIFs) -- the key parameters governing crack propagation in linear elastic fracture mechanics -- is essential for assessing fatigue life and ensuring structural integrity. While machine learnin…

    Submitted 9 June, 2025; v1 submitted 1 June, 2025; originally announced June 2025.

  10. arXiv:2505.24638  [pdf, ps, other]

    cs.CV cs.AI

    Cloud Optical Thickness Retrievals Using Angle Invariant Attention Based Deep Learning Models

    Authors: Zahid Hassan Tushar, Adeleke Ademakinwa, Jianwu Wang, Zhibo Zhang, Sanjay Purushotham

    Abstract: Cloud Optical Thickness (COT) is a critical cloud property influencing Earth's climate, weather, and radiation budget. Satellite radiance measurements enable global COT retrieval, but challenges like 3D cloud effects, viewing angles, and atmospheric interference must be addressed to ensure accurate estimation. Traditionally, the Independent Pixel Approximation (IPA) method, which treats individual…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: 6 pages, 7 figures, to be published in 2025 IEEE International Conference on Image Processing (ICIP)

  11. arXiv:2505.24034  [pdf, ps, other]

    cs.LG cs.AI

    LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Training

    Authors: Bo Wu, Sid Wang, Yunhao Tang, Jia Ding, Eryk Helenowski, Liang Tan, Tengyu Xu, Tushar Gowda, Zhengxing Chen, Chen Zhu, Xiaocheng Tang, Yundi Qian, Beibei Zhu, Rui Hou

    Abstract: Reinforcement Learning (RL) has become the most effective post-training approach for improving the capabilities of Large Language Models (LLMs). In practice, because of the high demands on latency and memory, it is particularly challenging to develop an efficient RL framework that reliably manages policy models with hundreds to thousands of billions of parameters. In this paper, we present Llama…

    Submitted 1 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

  12. arXiv:2505.19481  [pdf, ps, other]

    cs.LG cs.AI cs.DC cs.MA

    Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

    Authors: Hao Kang, Qingru Zhang, Han Cai, Weiyuan Xu, Tushar Krishna, Yilun Du, Tsachy Weissman

    Abstract: Large language models (LLMs) have shown remarkable performance across diverse reasoning and generation tasks, and are increasingly deployed as agents in dynamic environments such as code generation and recommendation systems. However, many real-world applications, such as high-frequency trading and real-time competitive gaming, require decisions under strict latency constraints, where faster respo…

    Submitted 26 May, 2025; originally announced May 2025.

  13. arXiv:2505.15020  [pdf, ps, other]

    cs.DC

    COSMIC: Enabling Full-Stack Co-Design and Optimization of Distributed Machine Learning Systems

    Authors: Aditi Raju, Jared Ni, William Won, Changhai Man, Srivatsan Krishnan, Srinivas Sridharan, Amir Yazdanbakhsh, Tushar Krishna, Vijay Janapa Reddi

    Abstract: Large-scale machine learning models necessitate distributed systems, posing significant design challenges due to the large parameter space across distinct design stacks. Existing studies often focus on optimizing individual system aspects in isolation. This work challenges this limitation and introduces COSMIC, a full-stack distributed machine learning systems environment enabling end-to-end simul…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 11 pages (excluding references), 10 figures, 6 tables

  14. arXiv:2505.10495  [pdf, ps, other]

    cs.LG cs.CL

    RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs

    Authors: Vibha Belavadi, Tushar Vatsa, Dewang Sultania, Suhas Suresha, Ishita Verma, Cheng Chen, Tracy Holloway King, Michael Friedrich

    Abstract: This paper addresses fine-tuning Large Language Models (LLMs) for function calling tasks when real user interaction data is unavailable. In digital content creation tools, where users express their needs through natural language queries that must be mapped to API calls, the lack of real-world task-specific data and privacy constraints for training on it necessitate synthetic data generation. Exist…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing

    Journal ref: https://aclanthology.org/2025.knowledgenlp-1.10/ KnowledgeNLP 2025

  15. arXiv:2505.09831  [pdf, ps, other]

    eess.IV cs.CV

    ImplicitStainer: Data-Efficient Medical Image Translation for Virtual Antibody-based Tissue Staining Using Local Implicit Functions

    Authors: Tushar Kataria, Beatrice Knudsen, Shireen Y. Elhabian

    Abstract: Hematoxylin and eosin (H&E) staining is a gold standard for microscopic diagnosis in pathology. However, H&E staining does not capture all the diagnostic information that may be needed. To obtain additional molecular information, immunohistochemical (IHC) stains highlight proteins that mark specific cell types, such as CD3 for T-cells or CK8/18 for epithelial cells. While IHC stains are vital for…

    Submitted 14 May, 2025; originally announced May 2025.

  16. arXiv:2505.09829  [pdf, ps, other]

    cs.CV

    BoundarySeg: An Embarrassingly Simple Method To Boost Medical Image Segmentation Performance for Low Data Regimes

    Authors: Tushar Kataria, Shireen Y. Elhabian

    Abstract: Obtaining large-scale medical data, annotated or unannotated, is challenging due to stringent privacy regulations and data protection policies. In addition, annotating medical images requires that domain experts manually delineate anatomical structures, making the process both time-consuming and costly. As a result, semi-supervised methods have gained popularity for reducing annotation costs. Howe…

    Submitted 14 May, 2025; originally announced May 2025.

  17. TS-Detector: Detecting Feature Toggle Usage Patterns

    Authors: Tajmilur Rahman, Mengzhe Fei, Tushar Sharma, Chanchal Roy

    Abstract: Feature toggles enable developers to control feature states, allowing the features to be released to a limited group of users while preserving overall software functionality. The absence of comprehensive best practices for feature toggle usage often results in improper implementation, causing code quality issues. Although certain feature toggle usage patterns are prone to toggle smells, there is n…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: 33rd ACM International Conference on the Foundations of Software Engineering, June 23--28, 2025, Trondheim, Norway

  18. arXiv:2505.03771  [pdf, other]

    cs.AR cs.MA

    OneDSE: A Unified Microprocessor Metric Prediction and Design Space Exploration Framework

    Authors: Ritik Raj, Akshat Ramachandran, Jeff Nye, Shashank Nemawarkar, Tushar Krishna

    Abstract: With the diminishing returns of Moore's Law scaling and as power constraints become more impactful, processor designs rely on architectural innovation to achieve differentiating performance. Innovation complexity has increased the design space of modern high-performance processors. This work offers an efficient and novel design space exploration (DSE) solution to these challenges of modern CPU desig…

    Submitted 29 April, 2025; originally announced May 2025.

  19. arXiv:2505.00041  [pdf, other]

    cs.AR

    MCMComm: Hardware-Software Co-Optimization for End-to-End Communication in Multi-Chip-Modules

    Authors: Ritik Raj, Shengjie Lin, William Won, Tushar Krishna

    Abstract: Increasing AI computing demands and slowing transistor scaling have led to the advent of Multi-Chip-Module (MCMs) based accelerators. MCMs enable cost-effective scalability, higher yield, and modular reuse by partitioning large chips into smaller chiplets. However, MCMs come at an increased communication cost, which requires critical analysis and optimization. This paper makes three main contribut…

    Submitted 2 May, 2025; v1 submitted 29 April, 2025; originally announced May 2025.

  20. arXiv:2504.20854  [pdf, other]

    cs.NI cs.AI cs.DC eess.SY

    Towards Easy and Realistic Network Infrastructure Testing for Large-scale Machine Learning

    Authors: Jinsun Yoo, ChonLam Lao, Lianjie Cao, Bob Lantz, Minlan Yu, Tushar Krishna, Puneet Sharma

    Abstract: This paper lays the foundation for Genie, a testing framework that captures the impact of real hardware network behavior on ML workload performance, without requiring expensive GPUs. Genie uses CPU-initiated traffic over a hardware testbed to emulate GPU to GPU communication, and adapts the ASTRA-sim simulator to model interaction between the network and the ML workload.

    Submitted 29 April, 2025; originally announced April 2025.

    Comments: Presented as a poster in NSDI 25

  21. arXiv:2504.19323  [pdf, other]

    cs.AR cs.AI cs.LG cs.PF

    NSFlow: An End-to-End FPGA Framework with Scalable Dataflow Architecture for Neuro-Symbolic AI

    Authors: Hanchen Yang, Zishen Wan, Ritik Raj, Joongun Park, Ziwei Li, Ananda Samajdar, Arijit Raychowdhury, Tushar Krishna

    Abstract: Neuro-Symbolic AI (NSAI) is an emerging paradigm that integrates neural networks with symbolic reasoning to enhance the transparency, reasoning capabilities, and data efficiency of AI systems. Recent NSAI systems have gained traction due to their exceptional performance in reasoning tasks and human-AI collaborative scenarios. Despite these algorithmic advancements, executing NSAI tasks on existing…

    Submitted 29 April, 2025; v1 submitted 27 April, 2025; originally announced April 2025.

    Comments: 2025 IEEE/ACM Design Automation Conference (DAC)

  22. arXiv:2504.18945  [pdf, other]

    cs.RO

    Generative AI in Embodied Systems: System-Level Analysis of Performance, Efficiency and Scalability

    Authors: Zishen Wan, Jiayi Qian, Yuhang Du, Jason Jabbour, Yilun Du, Yang Katie Zhao, Arijit Raychowdhury, Tushar Krishna, Vijay Janapa Reddi

    Abstract: Embodied systems, where generative autonomous agents engage with the physical world through integrated perception, cognition, action, and advanced reasoning powered by large language models (LLMs), hold immense potential for addressing complex, long-horizon, multi-objective tasks in real-world environments. However, deploying these systems remains challenging due to prolonged runtime latency, limi…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: 2025 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

  23. arXiv:2504.15377  [pdf, other]

    cs.PF cs.AR

    SCALE-Sim v3: A modular cycle-accurate systolic accelerator simulator for end-to-end system analysis

    Authors: Ritik Raj, Sarbartha Banerjee, Nikhil Chandra, Zishen Wan, Jianming Tong, Ananda Samajdar, Tushar Krishna

    Abstract: The rapid advancements in AI, scientific computing, and high-performance computing (HPC) have driven the need for versatile and efficient hardware accelerators. Existing tools like SCALE-Sim v2 provide valuable cycle-accurate simulations for systolic-array-based architectures but fall short in supporting key modern features such as sparsity, multi-core scalability, and comprehensive memory analysi…

    Submitted 8 May, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

  24. arXiv:2504.14365  [pdf, other]

    cs.LG cs.AI cs.AR

    Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator

    Authors: Akshat Ramachandran, Souvik Kundu, Arnab Raha, Shamik Kundu, Deepak K. Mathaikutty, Tushar Krishna

    Abstract: Large language model (LLM) pruning with fixed N:M structured sparsity significantly limits the expressivity of the sparse model, yielding sub-optimal performance. In contrast, supporting multiple N:M patterns to provide sparse representational freedom introduces costly overhead in hardware. To address these challenges for LLMs, we first present a flexible layer-wise outlier-density-aware N:M spars…

    Submitted 19 April, 2025; originally announced April 2025.

  25. arXiv:2504.13180  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

    Authors: Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi, Triantafyllos Afouras, Tushar Nagarajan, Muhammad Maaz, Yale Song, Tengyu Ma, Shuming Hu, Suyog Jain, Miguel Martin, Huiyu Wang, Hanoona Rasheed, Peize Sun, Po-Yao Huang, Daniel Bolya, Nikhila Ravi, Shashank Jain, Tammy Stark, Shane Moon, Babak Damavandi, Vivian Lee, Andrew Westbury, Salman Khan, Philipp Krähenbühl, et al. (4 additional authors not shown)

    Abstract: Vision-language models are integral to computer vision research, yet many high-performing models remain closed-source, obscuring their data, design and training recipe. The research community has responded by using distillation from black-box models to label training data, achieving strong benchmark results, at the cost of measurable scientific progress. However, without knowing the details of the…

    Submitted 19 June, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

    Comments: Technical Report

  26. arXiv:2504.09775  [pdf, other]

    cs.AR cs.AI cs.DC cs.LG

    Understanding and Optimizing Multi-Stage AI Inference Pipelines

    Authors: Abhimanyu Rajeshkumar Bambhaniya, Hanjiang Wu, Suvinay Subramanian, Sudarshan Srinivasan, Souvik Kundu, Amir Yazdanbakhsh, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna

    Abstract: The rapid evolution of Large Language Models (LLMs) has driven the need for increasingly sophisticated inference pipelines and hardware platforms. Modern LLM serving extends beyond traditional prefill-decode workflows, incorporating multi-stage processes such as Retrieval Augmented Generation (RAG), key-value (KV) cache retrieval, dynamic model routing, and multi step reasoning. These stages exhib…

    Submitted 20 April, 2025; v1 submitted 13 April, 2025; originally announced April 2025.

    Comments: Inference System Design for Multi-Stage AI Inference Pipelines. 13 Pages, 15 Figures, 3 Tables

  27. arXiv:2504.03133  [pdf, other]

    cs.CV

    Joint Retrieval of Cloud properties using Attention-based Deep Learning Models

    Authors: Zahid Hassan Tushar, Adeleke Ademakinwa, Jianwu Wang, Zhibo Zhang, Sanjay Purushotham

    Abstract: Accurate cloud property retrieval is vital for understanding cloud behavior and its impact on climate, including applications in weather forecasting, climate modeling, and estimating Earth's radiation balance. The Independent Pixel Approximation (IPA), a widely used physics-based approach, simplifies radiative transfer calculations by assuming each pixel is independent of its neighbors. While comp…

    Submitted 9 April, 2025; v1 submitted 3 April, 2025; originally announced April 2025.

    Comments: 6 Pages, 4 figures, to be published in 2025 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2025)

  28. arXiv:2504.02559  [pdf, other]

    cs.CL

    Leveraging LLM For Synchronizing Information Across Multilingual Tables

    Authors: Siddharth Khincha, Tushar Kataria, Ankita Anand, Dan Roth, Vivek Gupta

    Abstract: The vast amount of online information today poses challenges for non-English speakers, as much of it is concentrated in high-resource languages such as English and French. Wikipedia reflects this imbalance, with content in low-resource languages frequently outdated or incomplete. Recent research has sought to improve cross-language synchronization of Wikipedia tables using rule-based methods. Thes…

    Submitted 4 April, 2025; v1 submitted 3 April, 2025; originally announced April 2025.

    Comments: 17 Pages, 11 Tables, 2 Figures

  29. arXiv:2504.01879  [pdf, other]

    cs.CL cs.CV cs.IR

    TransientTables: Evaluating LLMs' Reasoning on Temporally Evolving Semi-structured Tables

    Authors: Abhilash Shankarampeta, Harsh Mahajan, Tushar Kataria, Dan Roth, Vivek Gupta

    Abstract: Humans continuously make new discoveries, and understanding temporal sequence of events leading to these breakthroughs is essential for advancing science and society. This ability to reason over time allows us to identify future steps and understand the effects of financial and political decisions on our lives. However, large language models (LLMs) are typically trained on static datasets, limitin…

    Submitted 2 April, 2025; originally announced April 2025.

    Comments: 19 Pages. 21 Tables, 1 figure

  30. arXiv:2503.24310  [pdf, other]

    cs.CL cs.AI

    BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models

    Authors: Alok Abhishek, Lisa Erickson, Tushar Bandopadhyay

    Abstract: In this research, we introduce BEATS, a novel framework for evaluating Bias, Ethics, Fairness, and Factuality in Large Language Models (LLMs). Building upon the BEATS framework, we present a bias benchmark for LLMs that measures performance across 29 distinct metrics. These metrics span a broad range of characteristics, including demographic, cognitive, and social biases, as well as measures of eth…

    Submitted 31 March, 2025; originally announced March 2025.

    Comments: 32 pages, 33 figures, preprint version

    MSC Class: 68T01 (Primary); 68T50 (Secondary)

    ACM Class: I.2.0; I.2.7

  31. arXiv:2503.19786  [pdf, other]

    cs.CL cs.AI

    Gemma 3 Technical Report

    Authors: Gemma Team, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, Tatiana Matejovicova, Alexandre Ramé, Morgane Rivière, Louis Rouillard, Thomas Mesnard, Geoffrey Cideron, Jean-bastien Grill, Sabela Ramos, Edouard Yvinec, Michelle Casbon, Etienne Pot, Ivo Penchev, Gaël Liu, Francesco Visin, Kathleen Kenealy, Lucas Beyer, Xiaohai Zhai, Anton Tsitsulin, et al. (191 additional authors not shown)

    Abstract: We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, a wider coverage of languages and longer context - at least 128K tokens. We also change the architecture of the model to reduce the KV-cache memory that tends to explode with long context. This is achie…

    Submitted 25 March, 2025; originally announced March 2025.

  32. arXiv:2503.15282  [pdf, other]

    cs.SE

    SENAI: Towards Software Engineering Native Generative Artificial Intelligence

    Authors: Mootez Saad, José Antonio Hernández López, Boqi Chen, Neil Ernst, Dániel Varró, Tushar Sharma

    Abstract: Large Language Models have significantly advanced the field of code generation, demonstrating the ability to produce functionally correct code snippets. However, advancements in generative AI for code overlook foundational Software Engineering (SE) principles such as modularity, and single responsibility, and concepts such as cohesion and coupling which are critical for creating maintainable, scal…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 5 pages, 1 figure

  33. arXiv:2503.12855  [pdf, other]

    cs.CV cs.AI cs.CL

    VITED: Video Temporal Evidence Distillation

    Authors: Yujie Lu, Yale Song, William Wang, Lorenzo Torresani, Tushar Nagarajan

    Abstract: We investigate complex video question answering via chain-of-evidence reasoning -- identifying sequences of temporal spans from multiple relevant parts of the video, together with visual evidence within them. Existing models struggle with multi-step reasoning as they uniformly sample a fixed number of frames, which can miss critical evidence distributed nonuniformly throughout the video. Moreover,…

    Submitted 17 March, 2025; originally announced March 2025.

    Journal ref: CVPR 2025

  34. arXiv:2503.11953  [pdf, other]

    cs.CV

    SPOC: Spatially-Progressing Object State Change Segmentation in Video

    Authors: Priyanka Mandikal, Tushar Nagarajan, Alex Stoken, Zihui Xue, Kristen Grauman

    Abstract: Object state changes in video reveal critical information about human and agent activity. However, existing methods are limited to temporal localization of when the object is in its initial state (e.g., the unchopped avocado) versus when it has completed a state change (e.g., the chopped avocado), which limits applicability for any task requiring detailed information about the progress of the acti…

    Submitted 14 March, 2025; originally announced March 2025.

  35. arXiv:2503.10959  [pdf, other]

    cs.CV cs.AI

    OuroMamba: A Data-Free Quantization Framework for Vision Mamba Models

    Authors: Akshat Ramachandran, Mingyu Lee, Huan Xu, Souvik Kundu, Tushar Krishna

    Abstract: We present OuroMamba, the first data-free post-training quantization (DFQ) method for vision Mamba-based models (VMMs). We identify two key challenges in enabling DFQ for VMMs, (1) VMM's recurrent state transitions restrict capturing of long-range interactions and lead to semantically weak synthetic data, (2) VMM activations exhibit dynamic outlier variations across time-steps, rendering existin…

    Submitted 13 March, 2025; originally announced March 2025.

  36. arXiv:2503.09590  [pdf, other]

    cs.CV

    BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

    Authors: Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang, Gedas Bertasius, Lorenzo Torresani

    Abstract: Video Question Answering (VQA) in long videos poses the key challenge of extracting relevant information and modeling long-range dependencies from many redundant frames. The self-attention mechanism provides a general solution for sequence modeling, but it has a prohibitive cost when applied to a massive number of spatiotemporal tokens in long videos. Most prior methods rely on compression strateg…

    Submitted 13 March, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

    Comments: Accepted by CVPR 2025

  37. arXiv:2503.04748  [pdf]

    cs.CY

    Large Language Models in Healthcare

    Authors: Mohammed Al-Garadi, Tushar Mungle, Abdulaziz Ahmed, Abeed Sarker, Zhuqi Miao, Michael E. Matheny

    Abstract: Large language models (LLMs) hold promise for transforming healthcare, from streamlining administrative and clinical workflows to enriching patient engagement and advancing clinical decision-making. However, their successful integration requires rigorous development, adaptation, and evaluation strategies tailored to clinical needs. In this Review, we highlight recent advancements, explore emerging…

    Submitted 2 April, 2025; v1 submitted 6 February, 2025; originally announced March 2025.

  38. arXiv:2503.03656  [pdf, other]

    cs.SE cs.LG

    Robust Learning of Diverse Code Edits

    Authors: Tushar Aggarwal, Swayam Singh, Abhijeet Awasthi, Aditya Kanade, Nagarajan Natarajan

    Abstract: Software engineering activities frequently involve edits to existing code. However, contemporary code language models (LMs) lack the ability to handle diverse types of code-edit requirements. In this work, we attempt to overcome this shortcoming through (1) a novel synthetic data generation pipeline and (2) a robust model adaptation algorithm. Starting with seed code examples and diverse editing c…

    Submitted 10 May, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

    Comments: To appear in ICML 2025 as 'NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits'

  39. arXiv:2503.01162  [pdf, other]

    cs.AR

    CogSys: Efficient and Scalable Neurosymbolic Cognition System via Algorithm-Hardware Co-Design

    Authors: Zishen Wan, Hanchen Yang, Ritik Raj, Che-Kai Liu, Ananda Samajdar, Arijit Raychowdhury, Tushar Krishna

    Abstract: Neurosymbolic AI is an emerging compositional paradigm that fuses neural learning with symbolic reasoning to enhance the transparency, interpretability, and trustworthiness of AI. It also exhibits higher data efficiency making it promising for edge deployments. Despite the algorithmic promises and demonstrations, unfortunately executing neurosymbolic workloads on current hardware (CPU/GPU/TPU) is…

    Submitted 17 March, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

    Comments: 2025 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 15 pages, 19 figures, 10 tables

  40. arXiv:2502.21187  [pdf, other]

    cs.LG

    SYN-LUNGS: Towards Simulating Lung Nodules with Anatomy-Informed Digital Twins for AI Training

    Authors: Fakrul Islam Tushar, Lavsen Dahal, Cindy McCabe, Fong Chi Ho, Paul Segars, Ehsan Abadi, Kyle J. Lafata, Ehsan Samei, Joseph Y. Lo

    Abstract: AI models for lung cancer screening are limited by data scarcity, impacting generalizability and clinical applicability. Generative models address this issue but are constrained by training data variability. We introduce SYN-LUNGS, a framework for generating high-quality 3D CT images with detailed annotations. SYN-LUNGS integrates XCAT3 phantoms for digital twin generation, X-Lesions for nodule si…

    Submitted 6 May, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

    Comments: 6 figures, 12 pages

  41. arXiv:2502.17955  [pdf, other]

    cs.CL cs.AI

    Language Models' Factuality Depends on the Language of Inquiry

    Authors: Tushar Aggarwal, Kumar Tanmay, Ayush Agrawal, Kumar Ayush, Hamid Palangi, Paul Pu Liang

    Abstract: Multilingual language models (LMs) are expected to recall factual knowledge consistently across languages, yet they often fail to transfer knowledge between languages even when they possess the correct information in one of the languages. For example, we find that an LM may correctly identify Rashed Al Shashai as being from Saudi Arabia when asked in Arabic, but consistently fails to do so when as…

    Submitted 25 February, 2025; originally announced February 2025.

  42. arXiv:2502.15147  [pdf, other]

    cs.CL

    Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision

    Authors: Zhouhang Xie, Tushar Khot, Bhavana Dalvi Mishra, Harshit Surana, Julian McAuley, Peter Clark, Bodhisattwa Prasad Majumder

    Abstract: Instruction-following LLMs have recently allowed systems to discover hidden concepts from a collection of unstructured documents based on a natural language description of the purpose of the discovery (i.e., goal). Still, the quality of the discovered concepts remains mixed, as it depends heavily on LLM's reasoning ability and drops when the data is noisy or beyond LLM's knowledge. We present Inst…

    Submitted 27 April, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

    Comments: NAACL 2025

  43. arXiv:2502.13113  [pdf, other]

    cs.DC cs.AR

    HARP: A Taxonomy for Heterogeneous and Hierarchical Processors for Mixed-reuse Workloads

    Authors: Raveesh Garg, Michael Pellauer, Tushar Krishna

    Abstract: Artificial intelligence (AI) application domains consist of a mix of tensor operations with high and low arithmetic intensities (aka reuse). Hierarchical (i.e. compute along multiple levels of memory hierarchy) and heterogeneous (multiple different sub-accelerators) accelerators are emerging as a popular way to process mixed reuse workloads, and workloads which consist of tensor operators with div…

    Submitted 18 February, 2025; originally announced February 2025.

  44. arXiv:2502.05078  [pdf, other]

    cs.AI cs.CL

    Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures

    Authors: Tushar Pandey, Ara Ghukasyan, Oktay Goktas, Santosh Kumar Radha

    Abstract: Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, yet their performance is highly dependent on the prompting strategy and model scale. While reinforcement learning and fine-tuning have been deployed to boost reasoning, these approaches incur substantial computational and data overhead. In this work, we introduce Adaptive Graph of Thoughts (AGoT), a dynamic, graph-ba…

    Submitted 7 February, 2025; originally announced February 2025.

  45. arXiv:2501.09954  [pdf, other]

    cs.LG cs.AI cs.AR

    AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations

    Authors: Jamin Seo, Akshat Ramachandran, Yu-Chuan Chuang, Anirudh Itagi, Tushar Krishna

    Abstract: Design space exploration (DSE) plays a crucial role in enabling custom hardware architectures, particularly for emerging applications like AI, where optimized and specialized designs are essential. With the growing complexity of deep neural networks (DNNs) and the introduction of advanced foundational models (FMs), the design space for DNN accelerators is expanding at an exponential rate. Addition…

    Submitted 16 January, 2025; originally announced January 2025.

    Comments: Accepted to DATE 2025

  46. arXiv:2501.08484  [pdf, other]

    cs.OS cs.PF

    CORD: Co-design of Resource Allocation and Deadline Decomposition with Generative Profiling

    Authors: Robert Gifford, Abby Eisenklam, Georgiy A. Bondar, Yifan Cai, Tushar Sial, Linh Thi Xuan Phan, Abhishek Halder

    Abstract: As multicore hardware is becoming increasingly common in real-time systems, traditional scheduling techniques that assume a single worst-case execution time for a task are no longer adequate, since they ignore the impact of shared resources on execution time. When tasks execute concurrently on different cores, their execution times often vary substantially with their allocated budgets of shared re…

    Submitted 14 January, 2025; originally announced January 2025.

  47. arXiv:2501.07047  [pdf, other]

    cs.CR cs.AR cs.CL cs.PL

    Leveraging ASIC AI Chips for Homomorphic Encryption

    Authors: Jianming Tong, Tianhao Huang, Leo de Castro, Anirudh Itagi, Jingtian Dang, Anupam Golder, Asra Ali, Jevin Jiang, Arvind, G. Edward Suh, Tushar Krishna

    Abstract: Cloud-based services are making the outsourcing of sensitive client data increasingly common. Although homomorphic encryption (HE) offers strong privacy guarantee, it requires substantially more resources than computing on plaintext, often leading to unacceptably large latencies in getting the results. HE accelerators have emerged to mitigate this latency issue, but with the high cost of ASICs. In…

    Submitted 28 March, 2025; v1 submitted 12 January, 2025; originally announced January 2025.

    Comments: 16 pages, 11 figures, 4 algorithms, 9 tables. Enabling Google TPUs for privacy-preserving AI inference

  48. A General Framework for Error-controlled Unstructured Scientific Data Compression

    Authors: Qian Gong, Zhe Wang, Viktor Reshniak, Xin Liang, Jieyang Chen, Qing Liu, Tushar M. Athawale, Yi Ju, Anand Rangarajan, Sanjay Ranka, Norbert Podhorszki, Rick Archibald, Scott Klasky

    Abstract: Data compression plays a key role in reducing storage and I/O costs. Traditional lossy methods primarily target data on rectilinear grids and cannot leverage the spatial coherence in unstructured mesh data, leading to suboptimal compression ratios. We present a multi-component, error-bounded compression framework designed to enhance the compression of floating-point unstructured mesh data, which i…

    Submitted 12 January, 2025; originally announced January 2025.

    Comments: 10 pages, 9 figures. 2024 IEEE 20th International Conference on e-Science (e-Science). IEEE, 2024

  49. arXiv:2501.06497  [pdf, other]

    cs.CL cs.AI

    PASS: Presentation Automation for Slide Generation and Speech

    Authors: Tushar Aggarwal, Aarohi Bhand

    Abstract: In today's fast-paced world, effective presentations have become an essential tool for communication in both online and offline meetings. The crafting of a compelling presentation requires significant time and effort, from gathering key insights to designing slides that convey information clearly and concisely. However, despite the wealth of resources available, people often find themselves manual…

    Submitted 15 January, 2025; v1 submitted 11 January, 2025; originally announced January 2025.

  50. arXiv:2501.06043  [pdf, other]

    cs.AR

    Axon: A novel systolic array architecture for improved run time and energy efficient GeMM and Conv operation with on-chip im2col

    Authors: Md Mizanur Rahaman Nayan, Ritik Raj, Gouse Basha Shaik, Tushar Krishna, Azad J Naeemi

    Abstract: General matrix multiplication (GeMM) is a core operation in virtually all AI applications. Systolic array (SA) based architectures have shown great promise as GeMM hardware accelerators thanks to their speed and energy efficiency. Unfortunately, SAs incur a linear delay in filling the operands, due to unidirectional propagation via pipeline latches. In this work, we propose a novel in-array data o…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: Accepted for Design Automation and Test in Europe (DATE), 2025. This is preprint of the accepted version