Showing 1–3 of 3 results for author: Rugina, I

  1. arXiv:2401.10862  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning

    Authors: Adib Hasan, Ileana Rugina, Alex Wang

    Abstract: This paper investigates the impact of model compression on the way Large Language Models (LLMs) process prompts, particularly concerning jailbreak resistance. We show that moderate WANDA pruning can enhance resistance to jailbreaking attacks without fine-tuning, while maintaining performance on standard benchmarks. To systematically evaluate this safety enhancement, we introduce a dataset of 225 h…

    Submitted 31 October, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP

  2. arXiv:2112.11929  [pdf, other]

    cs.CV cs.LG

    Meta-Learning and Self-Supervised Pretraining for Real World Image Translation

    Authors: Ileana Rugina, Rumen Dangovski, Mark Veillette, Pooya Khorrami, Brian Cheung, Olga Simek, Marin Soljačić

    Abstract: Recent advances in deep learning, in particular enabled by hardware advances and big data, have provided impressive results across a wide range of computational problems such as computer vision, natural language, or reinforcement learning. Many of these improvements are however constrained to problems with large-scale curated data-sets which require a lot of human labor to gather. Additionally, th…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: 10 pages, 8 figures, 2 tables

  3. arXiv:2012.02030  [pdf, other]

    cs.CL

    Data-Informed Global Sparseness in Attention Mechanisms for Deep Neural Networks

    Authors: Ileana Rugina, Rumen Dangovski, Li Jing, Preslav Nakov, Marin Soljačić

    Abstract: Attention mechanisms play a crucial role in the neural revolution of Natural Language Processing (NLP). With the growth of attention-based models, several pruning techniques have been developed to identify and exploit sparseness, making these models more efficient. Most efforts focus on hard-coding attention patterns or pruning attention weights based on training data. We propose Attention Pruning…

    Submitted 17 May, 2024; v1 submitted 20 November, 2020; originally announced December 2020.

    Comments: Presented at LREC-COLING 2024: 12 pages, 4 figures, 11 tables