
Showing 1–16 of 16 results for author: Dupuy, C

  1. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  2. arXiv:2504.03174  [pdf, other]

    cs.CL

    Multi-lingual Multi-turn Automated Red Teaming for LLMs

    Authors: Abhishek Singhania, Christophe Dupuy, Shivam Mangale, Amani Namboori

    Abstract: Large Language Models (LLMs) have improved dramatically in the past few years, increasing their adoption and the scope of their capabilities over time. A significant amount of work is dedicated to "model alignment", i.e., preventing LLMs from generating unsafe responses when deployed into customer-facing applications. One popular method to evaluate safety risks is red-teaming, where agents…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: Accepted at TrustNLP@NAACL 2025

  3. arXiv:2411.18676  [pdf, other]

    cs.RO cs.AI cs.LG

    Embodied Red Teaming for Auditing Robotic Foundation Models

    Authors: Sathwik Karnik, Zhang-Wei Hong, Nishant Abhangi, Yen-Chen Lin, Tsun-Hsuan Wang, Christophe Dupuy, Rahul Gupta, Pulkit Agrawal

    Abstract: Language-conditioned robot models have the potential to enable robots to perform a wide range of tasks based on natural language instructions. However, assessing their safety and effectiveness remains challenging because it is difficult to test all the different ways a single task can be phrased. Current benchmarks have two key limitations: they rely on a limited set of human-generated instruction…

    Submitted 10 February, 2025; v1 submitted 27 November, 2024; originally announced November 2024.

  4. arXiv:2310.15054  [pdf, other]

    cs.LG

    Coordinated Replay Sample Selection for Continual Federated Learning

    Authors: Jack Good, Jimit Majmudar, Christophe Dupuy, Jixuan Wang, Charith Peris, Clement Chung, Richard Zemel, Rahul Gupta

    Abstract: Continual Federated Learning (CFL) combines Federated Learning (FL), the decentralized learning of a central model on a number of client devices that may not communicate their data, and Continual Learning (CL), the learning of a model from a continual stream of data without keeping the entire history. In CL, the main challenge is forgetting what was learned from past data. While replay-ba…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 7 pages, 6 figures, accepted to EMNLP (industry track)

  5. arXiv:2308.04265  [pdf, other]

    cs.AI

    FLIRT: Feedback Loop In-context Red Teaming

    Authors: Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta

    Abstract: Warning: this paper contains content that may be inappropriate or offensive. As generative models become available for public use in various applications, testing and analyzing vulnerabilities of these models has become a priority. In this work, we propose an automatic red teaming framework that evaluates a given black-box model and exposes its vulnerabilities against unsafe and inappropriate cont…

    Submitted 7 November, 2024; v1 submitted 8 August, 2023; originally announced August 2023.

    Comments: EMNLP 2024

  6. arXiv:2305.11759  [pdf, other]

    cs.CL cs.AI

    Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning

    Authors: Mustafa Safa Ozdayi, Charith Peris, Jack FitzGerald, Christophe Dupuy, Jimit Majmudar, Haidar Khan, Rahil Parikh, Rahul Gupta

    Abstract: Large Language Models (LLMs) are known to memorize significant portions of their training data. Parts of this memorized content have been shown to be extractable by simply querying the model, which poses a privacy risk. We present a novel approach which uses prompt-tuning to control the extraction rates of memorized content in LLMs. We present two prompt training strategies to increase and decreas…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: 5 pages, 3 Figures, ACL 2023

  7. arXiv:2205.13621  [pdf, other]

    cs.CL cs.LG

    Differentially Private Decoding in Large Language Models

    Authors: Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, Richard Zemel

    Abstract: Recent large-scale natural language processing (NLP) systems use a pre-trained Large Language Model (LLM) on massive and diverse corpora as a headstart. In practice, the pre-trained model is adapted to a wide array of tasks via fine-tuning on task-specific datasets. LLMs, while effective, have been shown to memorize instances of training data thereby potentially revealing private information proce…

    Submitted 8 September, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

  8. arXiv:2203.13920  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Canary Extraction in Natural Language Understanding Models

    Authors: Rahil Parikh, Christophe Dupuy, Rahul Gupta

    Abstract: Natural Language Understanding (NLU) models can be trained on sensitive information such as phone numbers, zip-codes etc. Recent literature has focused on Model Inversion Attacks (ModIvA) that can extract training data from model parameters. In this work, we present a version of such an attack by extracting canaries inserted in NLU training data. In the attack, an adversary with open-box access to…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022, Main Conference

  9. arXiv:2202.03925  [pdf, other]

    cs.LG

    Learnings from Federated Learning in the Real world

    Authors: Christophe Dupuy, Tanya G. Roosta, Leo Long, Clement Chung, Rahul Gupta, Salman Avestimehr

    Abstract: Federated Learning (FL) applied to real world data may suffer from several idiosyncrasies. One such idiosyncrasy is the data distribution across devices. Data across devices could be distributed such that there are some "heavy devices" with large amounts of data while there are many "light users" with only a handful of data points. There also exists heterogeneity of data across devices. In this st…

    Submitted 8 February, 2022; originally announced February 2022.

  10. arXiv:2107.14586  [pdf, ps, other]

    cs.CL cs.CR cs.LG

    An Efficient DP-SGD Mechanism for Large Scale NLP Models

    Authors: Christophe Dupuy, Radhika Arava, Rahul Gupta, Anna Rumshisky

    Abstract: Recent advances in deep learning have drastically improved performance on many Natural Language Understanding (NLU) tasks. However, the data used to train NLU models may contain private information such as addresses or phone numbers, particularly when drawn from human subjects. It is desirable that underlying models do not expose private information contained in the training data. Differentially P…

    Submitted 2 March, 2022; v1 submitted 14 July, 2021; originally announced July 2021.

  11. arXiv:2104.08815  [pdf, other]

    cs.CL cs.AI cs.LG

    FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks

    Authors: Bill Yuchen Lin, Chaoyang He, Zihang Zeng, Hulin Wang, Yufen Huang, Christophe Dupuy, Rahul Gupta, Mahdi Soltanolkotabi, Xiang Ren, Salman Avestimehr

    Abstract: Increasing concerns and regulations about data privacy and sparsity necessitate the study of privacy-preserving, decentralized learning methods for natural language processing (NLP) tasks. Federated learning (FL) provides promising approaches for a large number of clients (e.g., personal devices or organizations) to collaboratively learn a shared global model to benefit all clients while allowing…

    Submitted 6 May, 2022; v1 submitted 18 April, 2021; originally announced April 2021.

    Comments: Accepted to NAACL 2022 Findings. Github: https://github.com/FedML-AI/FedNLP

  12. arXiv:2102.01502  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    ADePT: Auto-encoder based Differentially Private Text Transformation

    Authors: Satyapriya Krishna, Rahul Gupta, Christophe Dupuy

    Abstract: Privacy is an important concern when building statistical models on data containing personal information. Differential privacy offers a strong definition of privacy and can be used to solve several privacy concerns (Dwork et al., 2014). Multiple solutions have been proposed for the differentially-private transformation of datasets containing sensitive information. However, such transformation algo…

    Submitted 29 January, 2021; originally announced February 2021.

    Journal ref: The 16th conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021

  13. arXiv:2004.04060  [pdf, other]

    cs.CL

    Self-Attention Gazetteer Embeddings for Named-Entity Recognition

    Authors: Stanislav Peshterliev, Christophe Dupuy, Imre Kiss

    Abstract: Recent attempts to ingest external knowledge into neural models for named-entity recognition (NER) have exhibited mixed results. In this work, we present GazSelfAttn, a novel gazetteer embedding approach that uses self-attention and match span encoding to build enhanced gazetteer embeddings. In addition, we demonstrate how to build gazetteer resources from the open source Wikidata knowledge base.…

    Submitted 18 April, 2020; v1 submitted 8 April, 2020; originally announced April 2020.

    Comments: Preprint

  14. arXiv:1610.05925  [pdf, other]

    stat.ML cs.LG

    Learning Determinantal Point Processes in Sublinear Time

    Authors: Christophe Dupuy, Francis Bach

    Abstract: We propose a new class of determinantal point processes (DPPs) which can be manipulated for inference and parameter learning in potentially sublinear time in the number of items. This class, based on a specific low-rank factorization of the marginal kernel, is particularly suited to a subclass of continuous DPPs and DPPs defined on exponentially many items. We apply this new class to modelling tex…

    Submitted 19 October, 2016; originally announced October 2016.

    Comments: Under review for AISTATS 2017

  15. arXiv:1610.01417  [pdf, other]

    stat.ML cs.LG

    Decentralized Topic Modelling with Latent Dirichlet Allocation

    Authors: Igor Colin, Christophe Dupuy

    Abstract: Privacy preserving networks can be modelled as decentralized networks (e.g., sensors, connected objects, smartphones), where communication between nodes of the network is not controlled by an all-knowing, central node. For this type of network, the main issue is to gather/learn global information on the network (e.g., by optimizing a global cost function) while keeping the (sensitive) information…

    Submitted 5 October, 2016; originally announced October 2016.

  16. arXiv:1603.02644  [pdf, other]

    cs.LG stat.ML

    Online but Accurate Inference for Latent Variable Models with Local Gibbs Sampling

    Authors: Christophe Dupuy, Francis Bach

    Abstract: We study parameter inference in large-scale latent variable models. We first propose a unified treatment of online inference for latent variable models from a non-canonical exponential family, and draw explicit links between several previously proposed frequentist or Bayesian methods. We then propose a novel inference method for the frequentist estimation of parameters, that adapts MCMC methods t…

    Submitted 31 January, 2018; v1 submitted 8 March, 2016; originally announced March 2016.

    Journal ref: Journal of Machine Learning Research, 2017, 18, pp. 1-45