-
Generating Computational Cognitive Models using Large Language Models
Authors:
Milena Rmus,
Akshay K. Jagadish,
Marvin Mathony,
Tobias Ludwig,
Eric Schulz
Abstract:
Computational cognitive models, which formalize theories of cognition, enable researchers to quantify cognitive processes and arbitrate between competing theories by fitting models to behavioral data. Traditionally, these models are handcrafted, which requires significant domain knowledge, coding expertise, and time investment. However, recent advances in machine learning offer solutions to these challenges. In particular, Large Language Models (LLMs) have demonstrated remarkable capabilities for in-context pattern recognition, leveraging knowledge from diverse domains to solve complex problems, and generating executable code that can be used to facilitate the generation of cognitive models. Building on this potential, we introduce a pipeline for Guided generation of Computational Cognitive Models (GeCCo). Given task instructions, participant data, and a template function, GeCCo prompts an LLM to propose candidate models, fits proposals to held-out data, and iteratively refines them based on feedback constructed from their predictive performance. We benchmark this approach across four different cognitive domains -- decision making, learning, planning, and memory -- using three open-source LLMs, spanning different model sizes, capacities, and families. On four human behavioral data sets, the LLM generated models that consistently matched or outperformed the best domain-specific models from the cognitive science literature. Taken together, our results suggest that LLMs can generate cognitive models with conceptually plausible theories that rival -- or even surpass -- the best models from the literature across diverse task domains.
Submitted 17 May, 2025; v1 submitted 2 February, 2025;
originally announced February 2025.
-
Rendezfood: A Design Case Study of a Conversational Location-based Approach in Restaurants
Authors:
Philip Weber,
Kevin Krings,
Lukas Schröder,
Lea Katharina Michel,
Thomas Ludwig
Abstract:
The restaurant industry is currently facing a challenging socio-economic situation caused by the rise of delivery services, inflation, and typically low margins. Often, technological opportunities for process optimization or customer retention are not fully utilized. In our design case study, we investigate which technologies are already being used to improve the customer experience in restaurants and explore a novel approach to this issue. We designed, implemented, and evaluated a platform with customers and restaurateurs to increase visibility and emotional connection to nearby restaurants through their dishes. Some of our key findings include the enormous potential of combining location-based systems and conversational agents, but also the difficulties in creating content for such platforms. We contribute to the field of Human-Food Interaction by (1) identifying promising design spaces as well as customer and restaurateur requirements for technology in this domain, (2) presenting an innovative design case study to improve the user experience, and (3) exploring the broader implications of our design case study findings for approaching a real-world metaverse.
Submitted 7 January, 2025;
originally announced January 2025.
-
Centaur: a foundation model of human cognition
Authors:
Marcel Binz,
Elif Akata,
Matthias Bethge,
Franziska Brändle,
Fred Callaway,
Julian Coda-Forno,
Peter Dayan,
Can Demircan,
Maria K. Eckstein,
Noémi Éltető,
Thomas L. Griffiths,
Susanne Haridi,
Akshay K. Jagadish,
Li Ji-An,
Alexander Kipnis,
Sreejan Kumar,
Tobias Ludwig,
Marvin Mathony,
Marcelo Mattar,
Alireza Modirshanechi,
Surabhi S. Nath,
Joshua C. Peterson,
Milena Rmus,
Evan M. Russek,
Tankred Saanum
, et al. (15 additional authors not shown)
Abstract:
Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety. A first step in this direction is to create a model that can predict human behavior in a wide range of settings. Here we introduce Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. We derived Centaur by finetuning a state-of-the-art language model on a novel, large-scale data set called Psych-101. Psych-101 reaches an unprecedented scale, covering trial-by-trial data from over 60,000 participants performing over 10,000,000 choices in 160 experiments. Centaur not only captures the behavior of held-out participants better than existing cognitive models, but also generalizes to new cover stories, structural task modifications, and entirely new domains. Furthermore, we find that the model's internal representations become more aligned with human neural activity after finetuning. Taken together, our results demonstrate that it is possible to discover computational models that capture human behavior across a wide range of domains. We believe that such models provide tremendous potential for guiding the development of cognitive theories and present a case study to demonstrate this.
Submitted 28 April, 2025; v1 submitted 26 October, 2024;
originally announced October 2024.
-
Latent Diffusion Model for Generating Ensembles of Climate Simulations
Authors:
Johannes Meuer,
Maximilian Witte,
Tobias Sebastian Finn,
Claudia Timmreck,
Thomas Ludwig,
Christopher Kadow
Abstract:
Obtaining accurate estimates of uncertainty in climate scenarios often requires generating large ensembles of high-resolution climate simulations, a computationally expensive and memory-intensive process. To address this challenge, we train a novel generative deep learning approach on extensive sets of climate simulations. The model consists of two components: a variational autoencoder for dimensionality reduction and a denoising diffusion probabilistic model that generates multiple ensemble members. We validate our model on the Max Planck Institute Grand Ensemble and show that it achieves good agreement with the original ensemble in terms of variability. By leveraging the latent space representation, our model can rapidly generate large ensembles on the fly with minimal memory requirements, which can significantly improve the efficiency of uncertainty quantification in climate simulations.
Submitted 4 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
If consciousness is dynamically relevant, artificial intelligence isn't conscious
Authors:
Johannes Kleiner,
Tim Ludwig
Abstract:
We demonstrate that if consciousness is relevant for the temporal evolution of a system's states--that is, if it is dynamically relevant--then AI systems cannot be conscious. That is because AI systems run on CPUs, GPUs, TPUs or other processors which have been designed and verified to adhere to computational dynamics that systematically preclude or suppress deviations. The design and verification preclude or suppress, in particular, potential consciousness-related dynamical effects, so that if consciousness is dynamically relevant, AI systems cannot be conscious.
Submitted 9 November, 2023; v1 submitted 11 April, 2023;
originally announced April 2023.
-
Cross-Media Usage of Social Big Data for Emergency Services and Volunteer Communities: Approaches, Development and Challenges of Multi-Platform Social Media Services
Authors:
Marc-André Kaufhold,
Christian Reuter,
Thomas Ludwig
Abstract:
The use of social media is ubiquitous and well established in everyday life, and increasingly also before, during, or after emergencies. The produced data is spread across several types of social media and can be used by different actors, such as emergency services or volunteer communities. There are already systems available that support the process of gathering, analysing and distributing information through social media. However, depending on the goal of analysis, the analysis methods and available systems are limited by technical or business-oriented restrictions. This paper presents the design of a cross-platform Social Media API, which was integrated and evaluated within multiple emergency scenarios. Based on the lessons learned, we outline the core challenges from the practical development and theoretical findings, focusing on (1) cross-platform gathering and data management, (2) trustability and information quality, (3) tailorability and adjustable data operations, and (4) queries, performance, and technical development.
Submitted 17 July, 2019;
originally announced July 2019.
-
Mistral Supercomputer Job History Analysis
Authors:
Michał Zasadziński,
Victor Muntés-Mulero,
Marc Solé,
Thomas Ludwig
Abstract:
In this technical report, we present insights and results from an analysis of operational data from the petascale supercomputer Mistral, which was ranked the 42nd most powerful in the world as of January 2018. Data sources include hardware monitoring data, job scheduler history, topology, and hardware information. We explore job state sequences, spatial distribution, and electric power patterns.
Submitted 23 January, 2018;
originally announced January 2018.