Showing 1–4 of 4 results for author: Hofmann, M J

  1. arXiv:2404.00165

    cs.CL cs.LG

    Individual Text Corpora Predict Openness, Interests, Knowledge and Level of Education

    Authors: Markus J. Hofmann, Markus T. Jansen, Christoph Wigbels, Benny Briesemeister, Arthur M. Jacobs

    Abstract: Here we examine whether the personality dimension of openness to experience can be predicted from the individual google search history. By web scraping, individual text corpora (ICs) were generated from 214 participants with a mean number of 5 million word tokens. We trained word2vec models and used the similarities of each IC to label words, which were derived from a lexical approach of personali…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: Proceedings of the 8th workshop on Cognitive Aspects of the Lexicon (CogALex-VIII), LREC/Coling 2024

  2. Language Models Explain Word Reading Times Better Than Empirical Predictability

    Authors: Markus J. Hofmann, Steffen Remus, Chris Biemann, Ralph Radach, Lars Kuchinke

    Abstract: Though there is a strong consensus that word length and frequency are the most important single-word features determining visual-orthographic access to the mental lexicon, there is less agreement as how to best capture syntactic and semantic factors. The traditional approach in cognitive reading research assumes that word predictability from sentence context is best captured by cloze completion pr…

    Submitted 2 February, 2022; originally announced February 2022.

    Journal ref: Frontiers in Artificial Intelligence, 4(730570), 1-20 (2022)

  3. arXiv:2010.10176

    cs.CL cs.IR

    Individual corpora predict fast memory retrieval during reading

    Authors: Markus J. Hofmann, Lara Müller, Andre Rölke, Ralph Radach, Chris Biemann

    Abstract: The corpus, from which a predictive language model is trained, can be considered the experience of a semantic system. We recorded everyday reading of two participants for two months on a tablet, generating individual corpus samples of 300/500K tokens. Then we trained word2vec models from individual corpora and a 70 million-sentence newspaper corpus to obtain individual and norm-based long-term mem…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Proceedings of the 6th workshop on Cognitive Aspects of the Lexicon (CogALex-VI), Barcelona, Spain, December 12, 2020; accepted manuscript; 11 pages, 2 figures, 4 Tables

  4. arXiv:1912.10164

    cs.CL

    Decomposing predictability: Semantic feature overlap between words and the dynamics of reading for meaning

    Authors: Markus J. Hofmann, Mareike A. Kleemann, Andre Roelke, Christian Vorstius, Ralph Radach

    Abstract: The present study uses a computational approach to examine the role of semantic constraints in normal reading. This methodology avoids confounds inherent in conventional measures of predictability, allowing for theoretically deeper accounts of semantic processing. We start from a definition of associations between words based on the significant log likelihood that two words co-occur frequently tog…

    Submitted 6 December, 2019; originally announced December 2019.

    Comments: Journal submission