Showing 1–6 of 6 results for author: Pouw, C

  1. arXiv:2506.00981  [pdf, ps, other]

    cs.CL cs.AI cs.SD eess.AS

    What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training

    Authors: Marianne de Heer Kloots, Hosein Mohebbi, Charlotte Pouw, Gaofei Shen, Willem Zuidema, Martijn Bentum

    Abstract: How language-specific are speech representations learned by self-supervised models? Existing work has shown that a range of linguistic features can be successfully decoded from end-to-end models trained only on speech recordings. However, it's less clear to what extent pre-training on specific languages improves language-specific linguistic information. Here we test the encoding of Dutch phonetic…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted to Interspeech 2025. For model, code, and materials, see https://github.com/mdhk/SSL-NL-eval

    Journal ref: Proc. INTERSPEECH 2025

  2. arXiv:2505.22236  [pdf, other]

    cs.CL

    A Linguistically Motivated Analysis of Intonational Phrasing in Text-to-Speech Systems: Revealing Gaps in Syntactic Sensitivity

    Authors: Charlotte Pouw, Afra Alishahi, Willem Zuidema

    Abstract: We analyze the syntactic sensitivity of Text-to-Speech (TTS) systems using methods inspired by psycholinguistic research. Specifically, we focus on the generation of intonational phrase boundaries, which can often be predicted by identifying syntactic boundaries within a sentence. We find that TTS systems struggle to accurately generate intonational phrase boundaries in sentences where syntactic b…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Accepted to CoNLL 2025

  3. arXiv:2406.15265  [pdf, other]

    cs.CL

    Perception of Phonological Assimilation by Neural Speech Recognition Models

    Authors: Charlotte Pouw, Marianne de Heer Kloots, Afra Alishahi, Willem Zuidema

    Abstract: Human listeners effortlessly compensate for phonological changes during speech perception, often unconsciously inferring the intended sounds. For example, listeners infer the underlying /n/ when hearing an utterance such as "clea[m] pan", where [m] arises from place assimilation to the following labial [p]. This article explores how the neural speech recognition model Wav2Vec2 perceives assimilate…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted for publication in Computational Linguistics (Special Issue on Language Learning, Representation, and Processing in Humans and Machines)

  4. arXiv:2310.11282  [pdf, other]

    cs.CL

    ChapGTP, ILLC's Attempt at Raising a BabyLM: Improving Data Efficiency by Automatic Task Formation

    Authors: Jaap Jumelet, Michael Hanna, Marianne de Heer Kloots, Anna Langedijk, Charlotte Pouw, Oskar van der Wal

    Abstract: We present the submission of the ILLC at the University of Amsterdam to the BabyLM challenge (Warstadt et al., 2023), in the strict-small track. Our final model, ChapGTP, is a masked language model that was trained for 200 epochs, aided by a novel data augmentation technique called Automatic Task Formation. We discuss in detail the performance of this model on the three evaluation suites: BLiMP, (…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Part of the BabyLM challenge at CoNLL

  5. arXiv:2302.12695  [pdf, other]

    cs.CL cs.LG

    Cross-Lingual Transfer of Cognitive Processing Complexity

    Authors: Charlotte Pouw, Nora Hollenstein, Lisa Beinborn

    Abstract: When humans read a text, their eye movements are influenced by the structural complexity of the input sentences. This cognitive phenomenon holds across languages and recent studies indicate that multilingual language models utilize structural similarities between languages to facilitate cross-lingual transfer. We use sentence-level eye-tracking patterns as a cognitive indicator for structural comp…

    Submitted 27 February, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: Accepted at Findings of EACL 2023

    ACM Class: I.2.7

  6. arXiv:2108.11719  [pdf, other]

    physics.soc-ph cs.CV

    Benchmarking high-fidelity pedestrian tracking systems for research, real-time monitoring and crowd control

    Authors: Caspar A. S. Pouw, Joris Willems, Frank van Schadewijk, Jasmin Thurau, Federico Toschi, Alessandro Corbetta

    Abstract: High-fidelity pedestrian tracking in real-life conditions has been an important tool in fundamental crowd dynamics research allowing to quantify statistics of relevant observables including walking velocities, mutual distances and body orientations. As this technology advances, it is becoming increasingly useful also in society. In fact, continued urbanization is overwhelming existing pedestrian i…

    Submitted 26 August, 2021; originally announced August 2021.

    Journal ref: Collective Dynamics. v. 6 p. 1-22, 2022