Showing 1–9 of 9 results for author: Chrysostomou, G

  1. arXiv:2502.18424

    cs.CL

    Compressing Language Models for Specialized Domains

    Authors: Miles Williams, George Chrysostomou, Vitor Jeronymo, Nikolaos Aletras

    Abstract: Compression techniques such as pruning and quantization offer a solution for more efficient deployment of language models (LMs), albeit with small performance drops in benchmark performance. However, general-purpose LM compression methods can negatively affect performance in specialized domains (e.g. biomedical or legal). Recent work has sought to address this, yet requires computationally expensi…

    Submitted 25 February, 2025; originally announced February 2025.

    Comments: Work in progress

  2. arXiv:2410.17170

    cs.CL

    Self-calibration for Language Model Quantization and Pruning

    Authors: Miles Williams, George Chrysostomou, Nikolaos Aletras

    Abstract: Quantization and pruning are fundamental approaches for model compression, enabling efficient inference for language models. In a post-training setting, state-of-the-art quantization and pruning methods require calibration data, a small set of unlabeled examples. Conventionally, this is randomly sampled web text, aiming to reflect the model training data. However, this poses two key problems: (1)…

    Submitted 26 February, 2025; v1 submitted 22 October, 2024; originally announced October 2024.

    Comments: NAACL 2025

  3. Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization

    Authors: George Chrysostomou, Zhixue Zhao, Miles Williams, Nikolaos Aletras

    Abstract: Despite the remarkable performance of generative large language models (LLMs) on abstractive summarization, they face two significant challenges: their considerable size and tendency to hallucinate. Hallucinations are concerning because they erode reliability and raise safety issues. Pruning is a technique that reduces model size by removing redundant weights, enabling more efficient sparse infere…

    Submitted 24 October, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: TACL 2024 (Presented at EMNLP 2024)

    Journal ref: Transactions of the Association for Computational Linguistics (2024) 12: 1163-1181

  4. arXiv:2210.09197

    cs.CL cs.AI cs.LG

    On the Impact of Temporal Concept Drift on Model Explanations

    Authors: Zhixue Zhao, George Chrysostomou, Kalina Bontcheva, Nikolaos Aletras

    Abstract: Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e. synchronous settings). While model performance often deteriorates due to temporal variation (i.e. temporal concept drift), it is currently unknown how explanation faithfulness is impacted when the time span of the target…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP Findings 2022

  5. arXiv:2203.00056

    cs.CL

    An Empirical Study on Explanations in Out-of-Domain Settings

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either via identifying the most important tokens in the input (i.e. post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e. select-then-predict models). Currently, these approac…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: ACL2022 Pre-print

  6. arXiv:2109.01819

    cs.CL cs.AI cs.LG

    Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

    Authors: Atsuki Yamaguchi, George Chrysostomou, Katerina Margatina, Nikolaos Aletras

    Abstract: Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations. MLM trains a model to predict a random sample of input tokens that have been replaced by a [MASK] placeholder in a multi-class setting over the entire vocabulary. When pretraining, it is common to use alongside MLM other auxiliary objectives on t…

    Submitted 4 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

  7. arXiv:2108.13759

    cs.CL

    Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Pretrained transformer-based models such as BERT have demonstrated state-of-the-art predictive performance when adapted into a range of natural language processing tasks. An open problem is how to improve the faithfulness of explanations (rationales) for the predictions of these models. In this paper, we hypothesize that salient information extracted a priori from the training data can complement…

    Submitted 31 August, 2021; originally announced August 2021.

    Comments: EMNLP 2021 Pre-print

  8. arXiv:2105.02657

    cs.CL

    Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Neural network architectures in natural language processing often use attention mechanisms to produce probability distributions over input token representations. Attention has empirically been demonstrated to improve performance in various tasks, while its weights have been extensively used as explanations for model predictions. Recent studies (Jain and Wallace, 2019; Serrano and Smith, 2019; Wieg…

    Submitted 7 May, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: NLP Interpretability ; Accepted at ACL2021

  9. arXiv:2104.08219

    cs.CL cs.AI

    Flexible Instance-Specific Rationalization of NLP Models

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Recent research on model interpretability in natural language processing extensively uses feature scoring methods for identifying which parts of the input are the most important for a model to make a prediction (i.e. explanation or rationale). However, previous research has shown that there is no clear best scoring method across various text classification tasks while practitioners typically have…

    Submitted 6 December, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: NLP Interpretability ; Accepted at AAAI2022