Showing 1–5 of 5 results for author: Frank, S L

  1. arXiv:2203.06937

    cs.CL

    Modelling word learning and recognition using visually grounded speech

    Authors: Danny Merkx, Sebastiaan Scholten, Stefan L. Frank, Mirjam Ernestus, Odette Scharenborg

    Abstract: Background: Computational models of speech recognition often assume that the set of target words is already given. This implies that these models do not learn to recognise speech from scratch without prior knowledge and explicit supervision. Visually grounded speech models learn to recognise speech without prior knowledge by exploiting statistical dependencies between spoken and visual input. Whil…

    Submitted 14 March, 2022; originally announced March 2022.

  2. arXiv:2202.10292

    cs.CL cs.CV cs.LG

    Seeing the advantage: visually grounding word embeddings to better capture human semantic knowledge

    Authors: Danny Merkx, Stefan L. Frank, Mirjam Ernestus

    Abstract: Distributional semantic models capture word-level meaning that is useful in many natural language processing tasks and have even been shown to capture cognitive aspects of word meaning. The majority of these models are purely text based, even though the human sensory experience is much richer. In this paper we create visually grounded word embeddings by combining English text and images and compar…

    Submitted 21 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL) 2022

  3. Semantic sentence similarity: size does not always matter

    Authors: Danny Merkx, Stefan L. Frank, Mirjam Ernestus

    Abstract: This study addresses the question whether visually grounded speech recognition (VGS) models learn to capture sentence semantics without access to any prior linguistic knowledge. We produce synthetic and natural spoken versions of a well known semantic textual similarity database and show that our VGS model produces embeddings that correlate well with human semantic similarity judgements. Our resul…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: This paper has been accepted at Interspeech 2021 where it will be presented and appear in the conference proceedings in September 2021

    Journal ref: Proc. Interspeech 2021

  4. Human Sentence Processing: Recurrence or Attention?

    Authors: Danny Merkx, Stefan L. Frank

    Abstract: Recurrent neural networks (RNNs) have long been an architecture of interest for computational models of human sentence processing. The recently introduced Transformer architecture outperforms RNNs on many natural language processing tasks but little is known about its ability to model human language processing. We compare Transformer- and RNN-based language models' ability to account for measures…

    Submitted 4 May, 2021; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: This paper will appear in the proceedings of CMCL 2021 to be held June 10th

    Journal ref: Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL) 2021

  5. Language learning using Speech to Image retrieval

    Authors: Danny Merkx, Stefan L. Frank, Mirjam Ernestus

    Abstract: Humans learn language by interaction with their environment and listening to other humans. It should also be possible for computational models to learn language directly from speech but so far most approaches require text. We improve on existing neural network approaches to create visually grounded embeddings for spoken utterances. Using a combination of a multi-layer GRU, importance sampling, cyc…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: Submitted to InterSpeech 2019

    Journal ref: Proc. Interspeech 2019