
Showing 1–21 of 21 results for author: Dognin, P

  1. arXiv:2502.15975  [pdf, other]

    cs.CL cs.AI

    Sparsity May Be All You Need: Sparse Random Parameter Adaptation

    Authors: Jesus Rios, Pierre Dognin, Ronny Luss, Karthikeyan N. Ramamurthy

    Abstract: Full fine-tuning of large language models for alignment and task adaptation has become prohibitively expensive as models have grown in size. Parameter-Efficient Fine-Tuning (PEFT) methods aim at significantly reducing the computational and memory resources needed for fine-tuning these models by only training on a small number of parameters instead of all model parameters. Currently, the most popul…

    Submitted 21 May, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

  2. arXiv:2412.07724  [pdf, other]

    cs.CL

    Granite Guardian

    Authors: Inkit Padhi, Manish Nagireddy, Giandomenico Cornacchia, Subhajit Chaudhury, Tejaswini Pedapati, Pierre Dognin, Keerthiram Murugesan, Erik Miehling, Martín Santillán Cooper, Kieran Fraser, Giulio Zizzo, Muhammad Zaid Hameed, Mark Purcell, Michael Desmond, Qian Pan, Zahra Ashktorab, Inge Vejsbjerg, Elizabeth M. Daly, Michael Hind, Werner Geyer, Ambrish Rawat, Kush R. Varshney, Prasanna Sattigeri

    Abstract: We introduce the Granite Guardian models, a suite of safeguards designed to provide risk detection for prompts and responses, enabling safe and responsible use in combination with any large language model (LLM). These models offer comprehensive coverage across multiple risk dimensions, including social bias, profanity, violence, sexual content, unethical behavior, jailbreaking, and hallucination-r…

    Submitted 16 December, 2024; v1 submitted 10 December, 2024; originally announced December 2024.

  3. arXiv:2411.12405  [pdf, other]

    cs.CL cs.AI cs.HC

    Evaluating the Prompt Steerability of Large Language Models

    Authors: Erik Miehling, Michael Desmond, Karthikeyan Natesan Ramamurthy, Elizabeth M. Daly, Pierre Dognin, Jesus Rios, Djallel Bouneffouf, Miao Liu

    Abstract: Building pluralistic AI requires designing models that are able to be shaped to represent a wide range of value systems and cultures. Achieving this requires first being able to evaluate the degree to which a given model is capable of reflecting various personas. To this end, we propose a benchmark for evaluating the steerability of model personas as a function of prompting. Our design is based on…

    Submitted 15 February, 2025; v1 submitted 19 November, 2024; originally announced November 2024.

    Comments: Short version appeared at the Pluralistic Alignment workshop at NeurIPS 2024; extended version appeared at NAACL 2025

  4. arXiv:2409.05907  [pdf, other]

    cs.LG cs.AI cs.CL

    Programming Refusal with Conditional Activation Steering

    Authors: Bruce W. Lee, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Erik Miehling, Pierre Dognin, Manish Nagireddy, Amit Dhurandhar

    Abstract: LLMs have shown remarkable capabilities, but precisely controlling their response behavior remains challenging. Existing activation steering methods alter LLM behavior indiscriminately, limiting their practical applicability in settings where selective responses are essential, such as content moderation or domain-specific assistants. In this paper, we propose Conditional Activation Steering (CAST)…

    Submitted 17 February, 2025; v1 submitted 6 September, 2024; originally announced September 2024.

    Comments: ICLR 2025, Spotlight

  5. arXiv:2408.10392  [pdf, other]

    cs.CL cs.LG

    Value Alignment from Unstructured Text

    Authors: Inkit Padhi, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Manish Nagireddy, Pierre Dognin, Kush R. Varshney

    Abstract: Aligning large language models (LLMs) to value systems has emerged as a significant area of research within the fields of AI and NLP. Currently, this alignment process relies on the availability of high-quality supervised and preference data, which can be both time-consuming and expensive to curate or annotate. In this paper, we introduce a systematic end-to-end methodology for aligning LLMs to th…

    Submitted 19 August, 2024; originally announced August 2024.

  6. arXiv:2403.12805  [pdf, other]

    cs.AI cs.CL

    Contextual Moral Value Alignment Through Context-Based Aggregation

    Authors: Pierre Dognin, Jesus Rios, Ronny Luss, Inkit Padhi, Matthew D Riemer, Miao Liu, Prasanna Sattigeri, Manish Nagireddy, Kush R. Varshney, Djallel Bouneffouf

    Abstract: Developing value-aligned AI agents is a complex undertaking and an ongoing challenge in the field of AI. Specifically within the domain of Large Language Models (LLMs), the capability to consolidate multiple independently trained dialogue agents, each aligned with a distinct moral value, into a unified system that can adapt to and be aligned with multiple moral values is of paramount importance. I…

    Submitted 19 March, 2024; originally announced March 2024.

  7. arXiv:2403.09704  [pdf, other]

    cs.CL cs.AI cs.LG

    Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

    Authors: Swapnaja Achintalwar, Ioana Baldini, Djallel Bouneffouf, Joan Byamugisha, Maria Chang, Pierre Dognin, Eitan Farchi, Ndivhuwo Makondo, Aleksandra Mojsilovic, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Inkit Padhi, Orna Raz, Jesus Rios, Prasanna Sattigeri, Moninder Singh, Siphiwe Thwala, Rosario A. Uceda-Sosa, Kush R. Varshney

    Abstract: The alignment of large language models is usually done by model providers to add or control behaviors that are common or universally understood across use cases and contexts. In contrast, in this article, we present an approach and architecture that empowers application developers to tune a model to their particular values, social norms, laws and other regulations, and orchestrate between potentia…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures

  8. arXiv:2403.06009  [pdf, other]

    cs.LG

    Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

    Authors: Swapnaja Achintalwar, Adriana Alvarado Garcia, Ateret Anaby-Tavor, Ioana Baldini, Sara E. Berger, Bishwaranjan Bhattacharjee, Djallel Bouneffouf, Subhajit Chaudhury, Pin-Yu Chen, Lamogha Chiazor, Elizabeth M. Daly, Kirushikesh DB, Rogério Abreu de Paula, Pierre Dognin, Eitan Farchi, Soumya Ghosh, Michael Hind, Raya Horesh, George Kour, Ja Young Lee, Nishtha Madaan, Sameep Mehta, Erik Miehling, Keerthiram Murugesan, Manish Nagireddy, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we presen…

    Submitted 19 August, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  9. arXiv:2304.10819  [pdf, other]

    cs.LG cs.AI stat.ML

    Auditing and Generating Synthetic Data with Controllable Trust Trade-offs

    Authors: Brian Belgodere, Pierre Dognin, Adam Ivankay, Igor Melnyk, Youssef Mroueh, Aleksandra Mojsilovic, Jiri Navratil, Apoorva Nitsure, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Radhika Vedpathak, Richard A. Young

    Abstract: Real-world data often exhibits bias, imbalance, and privacy risks. Synthetic datasets have emerged to address these issues. This paradigm relies on generative AI models to generate unbiased, privacy-preserving data while maintaining fidelity to the original data. However, assessing the trustworthiness of synthetic datasets and models is a critical challenge. We introduce a holistic auditing framew…

    Submitted 9 June, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: submitted

  10. arXiv:2212.06803  [pdf, other]

    cs.LG cs.CY stat.ML

    Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting

    Authors: Prasanna Sattigeri, Soumya Ghosh, Inkit Padhi, Pierre Dognin, Kush R. Varshney

    Abstract: In consequential decision-making applications, mitigating unwanted biases in machine learning models that yield systematic disadvantage to members of groups delineated by sensitive attributes such as race and gender is one key intervention to strive for equity. Focusing on demographic parity and equality of opportunity, in this paper we propose an algorithm that improves the fairness of a pre-trai…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted at NeurIPS 2022

  11. arXiv:2211.10511  [pdf, other]

    cs.CL cs.LG

    Knowledge Graph Generation From Text

    Authors: Igor Melnyk, Pierre Dognin, Payel Das

    Abstract: In this work we propose a novel end-to-end multi-stage Knowledge Graph (KG) generation system from textual inputs, separating the overall process into two stages. The graph nodes are generated first using pretrained language model, followed by a simple edge construction head, enabling efficient KG extraction from the text. For each stage we consider several architectural choices that can be used d…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Findings of EMNLP 2022

  12. arXiv:2208.06665  [pdf, other]

    cs.LG

    Cloud-Based Real-Time Molecular Screening Platform with MolFormer

    Authors: Brian Belgodere, Vijil Chenthamarakshan, Payel Das, Pierre Dognin, Toby Kurien, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young

    Abstract: With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid speed. Here, we present a cloud-based real-time platform that allows users to virtually screen molecules of interest. For this purpose, molecular embeddings inferred from a recently proposed large chemical language model, named MolFormer, are leveraged. The pla…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: Paper accepted at ECML PKDD 2022 demo track

  13. arXiv:2108.12472  [pdf, other]

    cs.CL cs.LG

    ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models

    Authors: Pierre L. Dognin, Inkit Padhi, Igor Melnyk, Payel Das

    Abstract: Automatic construction of relevant Knowledge Bases (KBs) from text, and generation of semantically meaningful text from KBs are both long-standing goals in Machine Learning. In this paper, we present ReGen, a bidirectional generation of text and graph leveraging Reinforcement Learning (RL) to improve performance. Graph linearization enables us to re-frame both tasks as a sequence to sequence gener…

    Submitted 27 August, 2021; originally announced August 2021.

    Comments: Accepted to appear in the main conference of EMNLP 2021

  14. arXiv:2012.11696  [pdf, other]

    cs.CV cs.LG

    Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young, Brian Belgodere

    Abstract: Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated dataset like MS-COCO. Often work in this field is motivated by the promise of deployment of captioning systems in practical applications. However, the scarcity of data and contexts in many competition datasets renders the utility of systems trained on the…

    Submitted 18 June, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: In submission to JAIR

  15. arXiv:2012.11691  [pdf, other]

    cs.CV cs.LG

    Alleviating Noisy Data in Image Captioning with Cooperative Distillation

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff

    Abstract: Image captioning systems have made substantial progress, largely due to the availability of curated datasets like Microsoft COCO or Vizwiz that have accurate descriptions of their corresponding images. Unfortunately, scarce availability of such cleanly labeled data results in trained algorithms producing captions that can be terse and idiosyncratically specific to details in the image. We propose…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: CVPR 2020 VizWiz Challenge

  16. arXiv:2011.01843  [pdf, other]

    cs.LG cs.AI

    Tabular Transformers for Modeling Multivariate Time Series

    Authors: Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman

    Abstract: Tabular datasets are ubiquitous in data science applications. Given their importance, it seems natural to apply state-of-the-art deep learning algorithms in order to fully unlock their potential. Here we propose neural network models that represent tabular time series that can optionally leverage their hierarchical structure. This results in two architectures for tabular time series: one for learn…

    Submitted 11 February, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: Accepted to ICASSP, 2021; https://github.com/IBM/TabFormer

  17. arXiv:2010.14660  [pdf, other]

    cs.CL cs.LG

    DualTKB: A Dual Learning Bridge between Text and Knowledge Base

    Authors: Pierre L. Dognin, Igor Melnyk, Inkit Padhi, Cicero Nogueira dos Santos, Payel Das

    Abstract: In this work, we present a dual learning approach for unsupervised text to path and path to text transfers in Commonsense Knowledge Bases (KBs). We investigate the impact of weak supervision by creating a weakly supervised dataset and show that even a slight amount of supervision can significantly improve the model performance and enable better-quality transfers. We examine different model archite…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Equal Contributions of Authors Pierre L. Dognin, Igor Melnyk, and Inkit Padhi. Accepted at EMNLP'20

  18. arXiv:2005.03588  [pdf, other]

    cs.CL cs.LG

    Learning Implicit Text Generation via Feature Matching

    Authors: Inkit Padhi, Pierre Dognin, Ke Bai, Cicero Nogueira dos Santos, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das

    Abstract: Generative feature matching network (GFMN) is an approach for training implicit generative models for images by performing moment matching on features from pre-trained neural networks. In this paper, we present new GFMN formulations that are effective for sequential data. Our experimental results show the effectiveness of the proposed method, SeqGFMN, for three distinct generation tasks in English…

    Submitted 8 May, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  19. arXiv:1904.02762  [pdf, other]

    cs.CV cs.LG

    Learning Implicit Generative Models by Matching Perceptual Features

    Authors: Cicero Nogueira dos Santos, Youssef Mroueh, Inkit Padhi, Pierre Dognin

    Abstract: Perceptual features (PFs) have been used with great success in tasks such as transfer learning, style transfer, and super-resolution. However, the efficacy of PFs as key source of information for learning generative models is not well studied. We investigate here the use of PFs in the context of learning implicit generative models through moment matching (MM). More specifically, we propose a new e…

    Submitted 4 April, 2019; originally announced April 2019.

    Comments: 16 pages

    Journal ref: ICCV 2019

  20. arXiv:1902.04999  [pdf, other]

    cs.LG stat.ML

    Wasserstein Barycenter Model Ensembling

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Jerret Ross, Cicero Dos Santos, Tom Sercu

    Abstract: In this paper we propose to perform model ensembling in a multiclass or a multilabel learning setting using Wasserstein (W.) barycenters. Optimal transport metrics, such as the Wasserstein distance, allow incorporating semantic side information such as word embeddings. Using W. barycenters to find the consensus between models allows us to balance confidence and semantics in finding the agreement b…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: ICLR 2019

  21. arXiv:1805.00063  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Adversarial Semantic Alignment for Improved Image Captions

    Authors: Pierre L. Dognin, Igor Melnyk, Youssef Mroueh, Jarret Ross, Tom Sercu

    Abstract: In this paper we study image captioning as a conditional GAN training, proposing both a context-aware LSTM captioner and co-attentive discriminator, which enforces semantic alignment between images and captions. We empirically focus on the viability of two training methods: Self-critical Sequence Training (SCST) and Gumbel Straight-Through (ST) and demonstrate that SCST shows more stable gradient…

    Submitted 6 June, 2019; v1 submitted 30 April, 2018; originally announced May 2018.

    Comments: Authors Equal Contribution, CVPR 2019