Showing 1–11 of 11 results for author: Hambardzumyan, K

  1. arXiv:2507.02554  [pdf, ps, other]

    cs.AI cs.LG

    AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench

    Authors: Edan Toledo, Karen Hambardzumyan, Martin Josifoski, Rishi Hazra, Nicolas Baldwin, Alexis Audran-Reiss, Michael Kuchnik, Despoina Magka, Minqi Jiang, Alisia Maria Lupidi, Andrei Lupu, Roberta Raileanu, Kelvin Niu, Tatiana Shavrina, Jean-Christophe Gagnon-Audet, Michael Shvartsman, Shagun Sodhani, Alexander H. Miller, Abhishek Charnalia, Derek Dunfield, Carole-Jean Wu, Pontus Stenetorp, Nicola Cancedda, Jakob Nicolaus Foerster, Yoram Bachrach

    Abstract: AI research agents are demonstrating great potential to accelerate scientific progress by automating the design, implementation, and training of machine learning models. We focus on methods for improving agents' performance on MLE-bench, a challenging benchmark where agents compete in Kaggle competitions to solve real-world machine learning problems. We formalize AI research agents as search polic…

    Submitted 3 July, 2025; originally announced July 2025.

    Comments: Code: https://github.com/facebookresearch/aira-dojo

  2. arXiv:2506.22419  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

    Authors: Bingchen Zhao, Despoina Magka, Minqi Jiang, Xian Li, Roberta Raileanu, Tatiana Shavrina, Jean-Christophe Gagnon-Audet, Kelvin Niu, Shagun Sodhani, Michael Shvartsman, Andrei Lupu, Alisia Lupidi, Edan Toledo, Karen Hambardzumyan, Martin Josifoski, Thomas Foster, Lucia Cipolina-Kun, Abhishek Charnalia, Derek Dunfield, Alexander H. Miller, Oisin Mac Aodha, Jakob Foerster, Yoram Bachrach

    Abstract: Rapid advancements in large language models (LLMs) have the potential to assist in scientific progress. A critical capability toward this endeavor is the ability to reproduce existing work. To evaluate the ability of AI agents to reproduce results in an active research area, we introduce the Automated LLM Speedrunning Benchmark, leveraging the research community contributions on the NanoGPT speedr…

    Submitted 30 June, 2025; v1 submitted 27 June, 2025; originally announced June 2025.

  3. arXiv:2501.12275  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG cs.MA

    With Great Backbones Comes Great Adversarial Transferability

    Authors: Erik Arakelyan, Karen Hambardzumyan, Davit Papikyan, Pasquale Minervini, Albert Gordo, Isabelle Augenstein, Aram H. Markosyan

    Abstract: Advances in self-supervised learning (SSL) for machine vision have improved representation robustness and model performance, giving rise to pre-trained backbones like ResNet and ViT models tuned with SSL methods such as SimCLR. Due to the computational and data demands of pre-training, the utilization of such backbones becomes a strenuous necessity. However, employing these ba…

    Submitted 21 January, 2025; originally announced January 2025.

  4. arXiv:2409.20089  [pdf, other]

    cs.LG cs.CL cs.CR

    Robust LLM safeguarding via refusal feature adversarial training

    Authors: Lei Yu, Virginie Do, Karen Hambardzumyan, Nicola Cancedda

    Abstract: Large language models (LLMs) are vulnerable to adversarial attacks that can elicit harmful responses. Defending against such attacks remains challenging due to the opacity of jailbreaking mechanisms and the high computational cost of training LLMs robustly. We demonstrate that adversarial attacks share a universal mechanism for circumventing LLM safeguards that works by ablating a dimension in the…

    Submitted 20 March, 2025; v1 submitted 30 September, 2024; originally announced September 2024.

  5. arXiv:2404.07004  [pdf, other]

    cs.CL

    LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models

    Authors: Igor Tufanov, Karen Hambardzumyan, Javier Ferrando, Elena Voita

    Abstract: We present the LM Transparency Tool (LM-TT), an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models. Differently from previously existing tools that focus on isolated parts of the decision-making process, our framework is designed to make the entire prediction process transparent, and allows tracing back model behavior from the top-layer represe…

    Submitted 10 April, 2024; originally announced April 2024.

  6. arXiv:2301.03728  [pdf, other]

    cs.CL cs.AI cs.LG

    Scaling Laws for Generative Mixed-Modal Language Models

    Authors: Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer

    Abstract: Generative language models define distributions over sequences of tokens that can represent essentially any combination of data modalities (e.g., any permutation of image tokens from VQ-VAEs, speech tokens from HuBERT, BPE tokens for language or code, and so on). To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modaliti…

    Submitted 9 January, 2023; originally announced January 2023.

  7. arXiv:2211.16349  [pdf, other]

    cs.LG q-bio.BM

    BARTSmiles: Generative Masked Language Models for Molecular Representations

    Authors: Gayane Chilingaryan, Hovhannes Tamoyan, Ani Tevosyan, Nelly Babayan, Lusine Khondkaryan, Karen Hambardzumyan, Zaven Navoyan, Hrant Khachatrian, Armen Aghajanyan

    Abstract: We discover a robust self-supervised strategy tailored towards molecular representations for generative masked language models through a series of tailored, in-depth ablations. Using this pre-training strategy, we train BARTSmiles, a BART-like model with an order of magnitude more compute than previous self-supervised molecular representations. In-depth evaluations show that BARTSmiles consistentl…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: 27 pages (including appendix)

  8. arXiv:2101.00121  [pdf, other]

    cs.CL

    WARP: Word-level Adversarial ReProgramming

    Authors: Karen Hambardzumyan, Hrant Khachatrian, Jonathan May

    Abstract: Transfer learning from pretrained language models recently became the dominant approach for solving many NLP tasks. A common approach to transfer learning for multiple tasks that maximize parameter sharing trains one or more task-specific layers on top of the language model. In this paper, we present an alternative approach based on adversarial reprogramming, which extends earlier work on automati…

    Submitted 2 June, 2021; v1 submitted 31 December, 2020; originally announced January 2021.

    Comments: Accepted ACL 2021 Long Paper

  9. arXiv:1809.03211  [pdf, other]

    cs.CL

    Towards JointUD: Part-of-speech Tagging and Lemmatization using Recurrent Neural Networks

    Authors: Gor Arakelyan, Karen Hambardzumyan, Hrant Khachatrian

    Abstract: This paper describes our submission to CoNLL 2018 UD Shared Task. We have extended an LSTM-based neural network designed for sequence tagging to additionally generate character-level sequences. The network was jointly trained to produce lemmas, part-of-speech tags and morphological features. Sentence segmentation, tokenization and dependency parsing were handled by UDPipe 1.2 baseline. The results…

    Submitted 10 September, 2018; originally announced September 2018.

    Comments: System description paper of our system for the CoNLL 2018 shared task on Universal Dependency parsing

  10. arXiv:1802.03198  [pdf, other]

    cs.CL

    Natural Language Inference over Interaction Space: ICLR 2018 Reproducibility Report

    Authors: Martin Mirakyan, Karen Hambardzumyan, Hrant Khachatrian

    Abstract: We have tried to reproduce the results of the paper "Natural Language Inference over Interaction Space" submitted to ICLR 2018 conference as part of the ICLR 2018 Reproducibility Challenge. Initially, we were not aware that the code was available, so we started to implement the network from scratch. We have evaluated our version of the model on Stanford NLI dataset and reached 86.38% accuracy on t…

    Submitted 9 February, 2018; originally announced February 2018.

    Comments: as part of ICLR 2018 Reproducibility Challenge

  11. arXiv:1610.00768  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

    Authors: Nicolas Papernot, Fartash Faghri, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Alexey Kurakin, Cihang Xie, Yash Sharma, Tom Brown, Aurko Roy, Alexander Matyasko, Vahid Behzadan, Karen Hambardzumyan, Zhishuai Zhang, Yi-Lin Juang, Zhi Li, Ryan Sheatsley, Abhibhav Garg, Jonathan Uesato, Willi Gierke, Yinpeng Dong, David Berthelot, Paul Hendricks, Jonas Rauber, Rujun Long, et al. (1 additional author not shown)

    Abstract: CleverHans is a software library that provides standardized reference implementations of adversarial example construction techniques and adversarial training. The library may be used to develop more robust machine learning models and to provide standardized benchmarks of models' performance in the adversarial setting. Benchmarks constructed without a standardized implementation of adversarial exam…

    Submitted 27 June, 2018; v1 submitted 3 October, 2016; originally announced October 2016.

    Comments: Technical report for https://github.com/tensorflow/cleverhans