Showing 1–8 of 8 results for author: Dhawan, N

  1. arXiv:2507.06261  [pdf, ps, other]

    cs.CL cs.AI

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Authors: Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, Krishna Haridasan, Ahmed Omran, Nikunj Saunshi, Dara Bahri, Gaurav Mishra, Eric Chu, et al. (3278 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal unde…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: 72 pages, 17 figures

  2. arXiv:2407.07018  [pdf, other]

    cs.LG cs.CL stat.ME

    End-To-End Causal Effect Estimation from Unstructured Natural Language Data

    Authors: Nikita Dhawan, Leonardo Cotta, Karen Ullrich, Rahul G. Krishnan, Chris J. Maddison

    Abstract: Knowing the effect of an intervention is critical for human decision-making, but current approaches for causal effect estimation rely on manual data collection and structuring, regardless of the causal assumptions. This increases both the cost and time-to-completion for studies. We show how large, diverse observational text data can be mined with large language models (LLMs) to produce inexpensive…

    Submitted 28 October, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: NeurIPS 2024

  3. arXiv:2311.10291  [pdf, other]

    cs.LG

    Leveraging Function Space Aggregation for Federated Learning at Scale

    Authors: Nikita Dhawan, Nicole Mitchell, Zachary Charles, Zachary Garrett, Gintare Karolina Dziugaite

    Abstract: The federated learning paradigm has motivated the development of methods for aggregating multiple client updates into a global server model, without sharing client data. Many federated learning algorithms, including the canonical Federated Averaging (FedAvg), take a direct (possibly weighted) average of the client parameter updates, motivated by results in distributed optimization. In this work, w…

    Submitted 16 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: 23 pages, 10 figures. Transactions on Machine Learning Research, 2024

  4. arXiv:2302.03519  [pdf, other]

    cs.LG cs.AI stat.ML

    Efficient Parametric Approximations of Neural Network Function Space Distance

    Authors: Nikita Dhawan, Sicong Huang, Juhan Bae, Roger Grosse

    Abstract: It is often useful to compactly summarize important properties of model parameters and training data so that they can be used later without storing and/or iterating over the entire dataset. As a specific case, we consider estimating the Function Space Distance (FSD) over a training set, i.e. the average discrepancy between the outputs of two neural networks. We propose a Linearized Activation Func…

    Submitted 28 May, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: 18 pages, 5 figures, ICML 2023

  5. arXiv:2209.09024  [pdf, other]

    cs.LG cs.AI cs.CR

    Dataset Inference for Self-Supervised Models

    Authors: Adam Dziedzic, Haonan Duan, Muhammad Ahmad Kaleem, Nikita Dhawan, Jonas Guan, Yannis Cattan, Franziska Boenisch, Nicolas Papernot

    Abstract: Self-supervised models are increasingly prevalent in machine learning (ML) since they reduce the need for expensively labeled data. Because of their versatility in downstream applications, they are increasingly used as a service exposed via public APIs. At the same time, these encoder models are particularly vulnerable to model stealing attacks due to the high dimensionality of vector representati…

    Submitted 13 January, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: Accepted at NeurIPS 2022; Updated experiment details

  6. arXiv:2205.07890  [pdf, other]

    cs.LG cs.AI cs.CR

    On the Difficulty of Defending Self-Supervised Learning against Model Extraction

    Authors: Adam Dziedzic, Nikita Dhawan, Muhammad Ahmad Kaleem, Jonas Guan, Nicolas Papernot

    Abstract: Self-Supervised Learning (SSL) is an increasingly popular ML paradigm that trains models to transform complex inputs into representations without relying on explicit labels. These representations encode similarity structures that enable efficient learning of multiple downstream tasks. Recently, ML-as-a-Service providers have commenced offering trained SSL models over inference APIs, which transfor…

    Submitted 29 June, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

    Comments: Accepted at ICML 2022

  7. arXiv:2007.02931  [pdf, other]

    cs.LG stat.ML

    Adaptive Risk Minimization: Learning to Adapt to Domain Shift

    Authors: Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn

    Abstract: A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution. However, this assumption is violated in almost all practical applications: machine learning systems are regularly tested under distribution shift, due to changing temporal correlations, atypical end users, or other factors. In this work, we consider the p…

    Submitted 1 December, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: NeurIPS 2021; Project website: https://sites.google.com/view/adaptive-risk-minimization ; Code: https://github.com/henrikmarklund/arm

  8. arXiv:1912.04443  [pdf, other]

    cs.RO cs.CV cs.LG

    AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos

    Authors: Laura Smith, Nikita Dhawan, Marvin Zhang, Pieter Abbeel, Sergey Levine

    Abstract: Robotic reinforcement learning (RL) holds the promise of enabling robots to learn complex behaviors through experience. However, realizing this promise for long-horizon tasks in the real world requires mechanisms to reduce human burden in terms of defining the task and scaffolding the learning process. In this paper, we study how these challenges can be alleviated with an automated robotic learnin…

    Submitted 21 June, 2020; v1 submitted 9 December, 2019; originally announced December 2019.

    Comments: Robotics: Science and Systems (RSS) 2020 camera ready submission. Project website: https://sites.google.com/view/rss20avid