Showing 1–12 of 12 results for author: Ganesh, B

Searching in archive cs.
  1. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  2. arXiv:2506.04708  [pdf, other]

    cs.CL

    Accelerated Test-Time Scaling with Model-Free Speculative Sampling

    Authors: Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati

    Abstract: Language models have demonstrated remarkable capabilities in reasoning tasks through test-time scaling techniques like best-of-N sampling and tree search. However, these approaches often demand substantial computational resources, creating a critical trade-off between performance and efficiency. We introduce STAND (STochastic Adaptive N-gram Drafting), a novel model-free speculative decoding appro…

    Submitted 5 June, 2025; originally announced June 2025.

  3. arXiv:2504.18572  [pdf]

    cs.AI cs.CL

    BELL: Benchmarking the Explainability of Large Language Models

    Authors: Syed Quiser Ahmed, Bharathi Vokkaliga Ganesh, Jagadish Babu P, Karthick Selvaraj, ReddySiva Naga Parvathi Devi, Sravya Kappala

    Abstract: Large Language Models have demonstrated remarkable capabilities in natural language processing, yet their decision-making processes often lack transparency. This opaqueness raises significant concerns regarding trust, bias, and model performance. To address these issues, understanding and evaluating the interpretability of LLMs is crucial. This paper introduces a standardised benchmarking techniqu…

    Submitted 22 April, 2025; originally announced April 2025.

  4. arXiv:2503.04992  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Wanda++: Pruning Large Language Models via Regional Gradients

    Authors: Yifan Yang, Kai Zhen, Bhavana Ganesh, Aram Galstyan, Goeric Huybrechts, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Athanasios Mouchtaris, Sravan Babu Bodapati, Nathan Susanj, Zheng Zhang, Jack FitzGerald, Abhishek Kumar

    Abstract: Large Language Models (LLMs) pruning seeks to remove unimportant weights for inference speedup with minimal accuracy impact. However, existing methods often suffer from accuracy degradation without full-model sparsity-aware fine-tuning. This paper presents Wanda++, a novel pruning framework that outperforms the state-of-the-art methods by utilizing decoder-block-level regional gradients.…

    Submitted 1 June, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

    Comments: Paper accepted at ACL 2025 Findings

  5. arXiv:2412.11384  [pdf]

    cs.CR

    A Comprehensive Review of Adversarial Attacks on Machine Learning

    Authors: Syed Quiser Ahmed, Bharathi Vokkaliga Ganesh, Sathyanarayana Sampath Kumar, Prakhar Mishra, Ravi Anand, Bhanuteja Akurathi

    Abstract: This research provides a comprehensive overview of adversarial attacks on AI and ML models, exploring various attack types, techniques, and their potential harms. We also delve into the business implications, mitigation strategies, and future research directions. To gain practical insights, we employ the Adversarial Robustness Toolbox (ART) [1] library to simulate these attacks on real-world use c…

    Submitted 15 December, 2024; originally announced December 2024.

  6. arXiv:2410.20252  [pdf, other]

    cs.CV cs.AI

    Adaptive Video Understanding Agent: Enhancing efficiency with dynamic frame sampling and feedback-driven reasoning

    Authors: Sullam Jeoung, Goeric Huybrechts, Bhavana Ganesh, Aram Galstyan, Sravan Bodapati

    Abstract: Understanding long-form video content presents significant challenges due to its temporal complexity and the substantial computational resources required. In this work, we propose an agent-based approach to enhance both the efficiency and effectiveness of long-form video understanding by utilizing large language models (LLMs) and their tool-harnessing ability. A key aspect of our method is query-a…

    Submitted 26 October, 2024; originally announced October 2024.

  7. arXiv:2410.09362  [pdf, other]

    cs.LG cs.AI

    SeRA: Self-Reviewing and Alignment of Large Language Models using Implicit Reward Margins

    Authors: Jongwoo Ko, Saket Dingliwal, Bhavana Ganesh, Sailik Sengupta, Sravan Bodapati, Aram Galstyan

    Abstract: Direct alignment algorithms (DAAs), such as direct preference optimization (DPO), have become popular alternatives for Reinforcement Learning from Human Feedback (RLHF) due to their simplicity, efficiency, and stability. However, the preferences used in DAAs are usually collected before the alignment training begins and remain unchanged (off-policy). This can lead to two problems where the policy…

    Submitted 12 October, 2024; originally announced October 2024.

  8. arXiv:2211.04780  [pdf, other]

    cs.LG cs.CR cs.CV

    On the Robustness of Explanations of Deep Neural Network Models: A Survey

    Authors: Amlan Jyoti, Karthik Balaji Ganesh, Manoj Gayala, Nandita Lakshmi Tunuguntla, Sandesh Kamath, Vineeth N Balasubramanian

    Abstract: Explainability has been widely stated as a cornerstone of the responsible and trustworthy use of machine learning models. With the ubiquitous use of Deep Neural Network (DNN) models expanding to risk-sensitive and safety-critical domains, many methods have been proposed to explain the decisions of these models. Recent years have also seen concerted efforts that have shown how such explanations can…

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: Under Review ACM Computing Surveys "Special Issue on Trustworthy AI"

  9. arXiv:2201.11674  [pdf, other]

    cs.CV cs.LG

    Vision Checklist: Towards Testable Error Analysis of Image Models to Help System Designers Interrogate Model Capabilities

    Authors: Xin Du, Benedicte Legastelois, Bhargavi Ganesh, Ajitha Rajan, Hana Chockler, Vaishak Belle, Stuart Anderson, Subramanian Ramamoorthy

    Abstract: Using large pre-trained models for image recognition tasks is becoming increasingly common owing to the well acknowledged success of recent models like vision transformers and other CNN-based models like VGG and Resnet. The high accuracy of these models on benchmark tasks has translated into their practical use across many domains including safety-critical applications like autonomous driving and…

    Submitted 31 January, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: 17 pages, 18 figures

    MSC Class: 62R07 ACM Class: I.4.0

  10. arXiv:2010.14950  [pdf]

    cs.SI

    Predicting Engagement with the Internet Research Agency's Facebook and Instagram Campaigns around the 2016 U.S. Presidential Election

    Authors: Dimitra Liotsiou, Bharath Ganesh, Philip N. Howard

    Abstract: The Russian Internet Research Agency's (IRA) online interference campaign in the 2016 U.S. presidential election represents a turning point in the trajectory of democratic elections in the digital age. What can we learn about how the IRA engages U.S. audiences, ahead of the 2020 U.S. presidential election? We provide the first in-depth analysis of the relationships between IRA content characterist…

    Submitted 28 October, 2020; originally announced October 2020.

  11. arXiv:2001.11461  [pdf]

    cs.SI stat.AP

    Echo Chambers Exist! (But They're Full of Opposing Views)

    Authors: Jonathan Bright, Nahema Marchal, Bharath Ganesh, Stevan Rudinac

    Abstract: The theory of echo chambers, which suggests that online political discussions take place in conditions of ideological homogeneity, has recently gained popularity as an explanation for patterns of political polarization and radicalization observed in many democratic countries. However, while micro-level experimental work has shown evidence that individuals may gravitate towards information that sup…

    Submitted 30 January, 2020; originally announced January 2020.

  12. arXiv:1710.07087  [pdf]

    cs.SI

    Does Campaigning on Social Media Make a Difference? Evidence from candidate use of Twitter during the 2015 and 2017 UK Elections

    Authors: Jonathan Bright, Scott A Hale, Bharath Ganesh, Andrew Bulovsky, Helen Margetts, Phil Howard

    Abstract: Social media are now a routine part of political campaigns all over the world. However, studies of the impact of campaigning on social platforms have thus far been limited to cross-sectional datasets from one election period which are vulnerable to unobserved variable bias. Hence empirical evidence on the effectiveness of political social media activity is thin. We address this deficit by analysing…

    Submitted 27 July, 2018; v1 submitted 19 October, 2017; originally announced October 2017.