Showing 1–7 of 7 results for author: Nikolaus, M

  1. arXiv:2505.05970  [pdf, ps, other]

    cs.CL

    Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models

    Authors: Lennart Stöpler, Rufat Asadli, Mitja Nikolaus, Ryan Cotterell, Alex Warstadt

    Abstract: We propose a method for training language models in an interactive setting inspired by child language acquisition. In our setting, a speaker attempts to communicate some information to a listener in a single-turn dialogue and receives a reward if communicative success is achieved. Unlike earlier related work using image–caption data for interactive reference games, we operationalize communicative…

    Submitted 9 May, 2025; originally announced May 2025.

  2. arXiv:2403.14208  [pdf, other]

    cs.CL

    Automatic Annotation of Grammaticality in Child-Caregiver Conversations

    Authors: Mitja Nikolaus, Abhishek Agrawal, Petros Kaklamanis, Alex Warstadt, Abdellah Fourtassi

    Abstract: The acquisition of grammar has been a central question to adjudicate between theories of language acquisition. In order to conduct faster, more reproducible, and larger-scale corpus studies on grammaticality in child-caregiver conversations, tools for automatic annotation can offer an effective alternative to tedious manual annotation. We propose a coding scheme for context-dependent grammaticalit…

    Submitted 21 March, 2024; originally announced March 2024.

    Journal ref: LREC-Coling 2024, May 2024, Turin, Italy

  3. arXiv:2403.11771  [pdf, other]

    cs.CV cs.CL

    Modality-Agnostic fMRI Decoding of Vision and Language

    Authors: Mitja Nikolaus, Milad Mozafari, Nicholas Asher, Leila Reddy, Rufin VanRullen

    Abstract: Previous studies have shown that it is possible to map brain activation data of subjects viewing images onto the feature representation space of not only vision models (modality-specific decoding) but also language models (cross-modal decoding). In this work, we introduce and use a new large-scale fMRI dataset (~8,500 trials per subject) of people watching both images and text descriptions of such…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: To appear at ICLR 2024 workshop on Representational Alignment (Re-Align)

  4. arXiv:2210.12079  [pdf, other]

    cs.CL cs.CV

    Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies?

    Authors: Mitja Nikolaus, Emmanuelle Salin, Stephane Ayache, Abdellah Fourtassi, Benoit Favre

    Abstract: Recent advances in vision-and-language modeling have seen the development of Transformer architectures that achieve remarkable performance on multimodal reasoning tasks. Yet, the exact capabilities of these black-box models are still poorly understood. While much of previous work has focused on studying their ability to learn meaning at the word-level, their ability to track syntactic dependencies…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: To appear at EMNLP 2022

  5. arXiv:2202.12917  [pdf, other]

    cs.CL cs.AI eess.AS eess.IV

    Learning English with Peppa Pig

    Authors: Mitja Nikolaus, Afra Alishahi, Grzegorz Chrupała

    Abstract: Recent computational models of the acquisition of spoken language via grounding in perception exploit associations between the spoken and visual modalities and learn to represent speech and visual data in a joint vector space. A major unresolved issue from the point of ecological validity is the training data, typically consisting of images or videos paired with spoken descriptions of what is depi…

    Submitted 27 May, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

    Comments: Accepted to TACL

  6. arXiv:1909.04402  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Compositional Generalization in Image Captioning

    Authors: Mitja Nikolaus, Mostafa Abdou, Matthew Lamm, Rahul Aralikatte, Desmond Elliott

    Abstract: Image captioning models are usually evaluated on their ability to describe a held-out set of images, not on their ability to generalize to unseen concepts. We study the problem of compositional generalization, which measures how well a model composes unseen combinations of concepts when describing images. State-of-the-art image captioning models show poor generalization performance on this task. W…

    Submitted 16 September, 2019; v1 submitted 10 September, 2019; originally announced September 2019.

    Comments: To appear at CoNLL 2019, EMNLP

    Journal ref: Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), pp. 87–98, ACL, 2019

  7. arXiv:1906.01634  [pdf, other]

    cs.CL cs.AI cs.LG

    On the Realization of Compositionality in Neural Networks

    Authors: Joris Baan, Jana Leible, Mitja Nikolaus, David Rau, Dennis Ulmer, Tim Baumgärtner, Dieuwke Hupkes, Elia Bruni

    Abstract: We present a detailed comparison of two types of sequence to sequence models trained to conduct a compositional task. The models are architecturally identical at inference time, but differ in the way that they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has s…

    Submitted 6 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: To appear at BlackboxNLP 2019, ACL