Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks

Authors: Mathis Pink, Vy A. Vo, Qinyuan Wu, Jianing Mu, Javier S. Turek, Uri Hasson, Kenneth A. Norman, Sebastian Michelmann, Alexander Huth, Mariya Toneva

Abstract: Current LLM benchmarks focus on evaluating models' memory of facts and semantic relations, primarily assessing semantic aspects of long-term memory. However, in humans, long-term memory also includes episodic memory, which links memories to their contexts, such as the time and place they occurred. The ability to contextualize memories is crucial for many cognitive tasks and everyday functions. This form of memory has not been evaluated in LLMs with existing benchmarks. To address the gap in evaluating memory in LLMs, we introduce Sequence Order Recall Tasks (SORT), which we adapt from tasks used to study episodic memory in cognitive psychology. SORT requires LLMs to recall the correct order of text segments, and provides a general framework that is both easily extendable and does not require any additional annotations. We present an initial evaluation dataset, Book-SORT, comprising 36k pairs of segments extracted from 9 books recently added to the public domain. Based on a human experiment with 155 participants, we show that humans can recall sequence order based on long-term memory of a book. We find that models can perform the task with high accuracy when relevant text is given in-context during the SORT evaluation. However, when presented with the book text only during training, LLMs' performance on SORT falls short. By allowing to evaluate more aspects of memory, we believe that SORT will aid in the emerging development of memory-augmented models.

Submitted 10 October, 2024; originally announced October 2024.

arXiv:2403.11207 [pdf, other]

MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data

Authors: Paul S. Scotti, Mihir Tripathy, Cesar Kadir Torrico Villanueva, Reese Kneeland, Tong Chen, Ashutosh Narang, Charan Santhirasegaran, Jonathan Xu, Thomas Naselaris, Kenneth A. Norman, Tanishq Mathew Abraham

Abstract: Reconstructions of visual perception from brain activity have improved tremendously, but the practical utility of such methods has been limited. This is because such models are trained independently per subject where each subject requires dozens of hours of expensive fMRI training data to attain high-quality results. The present work showcases high-quality reconstructions using only 1 hour of fMRI training data. We pretrain our model across 7 subjects and then fine-tune on minimal data from a new subject. Our novel functional alignment procedure linearly maps all brain data to a shared-subject latent space, followed by a shared non-linear mapping to CLIP image space. We then map from CLIP space to pixel space by fine-tuning Stable Diffusion XL to accept CLIP latents as inputs instead of text. This approach improves out-of-subject generalization with limited training data and also attains state-of-the-art image retrieval and reconstruction metrics compared to single-subject approaches. MindEye2 demonstrates how accurate reconstructions of perception are possible from a single visit to the MRI facility. All code is available on GitHub.

Submitted 15 June, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

Comments: In Forty-first International Conference on Machine Learning, 2024. Code at https://github.com/MedARC-AI/MindEyeV2. Published as a conference paper at ICML 2024

arXiv:2312.08519 [pdf]

Reconciling Shared versus Context-Specific Information in a Neural Network Model of Latent Causes

Authors: Qihong Lu, Tan T. Nguyen, Qiong Zhang, Uri Hasson, Thomas L. Griffiths, Jeffrey M. Zacks, Samuel J. Gershman, Kenneth A. Norman

Abstract: It has been proposed that, when processing a stream of events, humans divide their experiences in terms of inferred latent causes (LCs) to support context-dependent learning. However, when shared structure is present across contexts, it is still unclear how the "splitting" of LCs and learning of shared structure can be simultaneously achieved. Here, we present the Latent Cause Network (LCNet), a neural network model of LC inference. Through learning, it naturally stores structure that is shared across tasks in the network weights. Additionally, it represents context-specific structure using a context module, controlled by a Bayesian nonparametric inference algorithm, which assigns a unique context vector for each inferred LC. Across three simulations, we found that LCNet could 1) extract shared structure across LCs in a function learning task while avoiding catastrophic interference, 2) capture human data on curriculum effects in schema learning, and 3) infer the underlying event structure when processing naturalistic videos of daily events. Overall, these results demonstrate a computationally feasible approach to reconciling shared structure and context-specific structure in a model of LCs that is scalable from laboratory experiment settings to naturalistic settings.

Submitted 6 June, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

arXiv:2305.18274 [pdf, other]

Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors

Authors: Paul S. Scotti, Atmadeep Banerjee, Jimmie Goode, Stepan Shabalin, Alex Nguyen, Ethan Cohen, Aidan J. Dempster, Nathalie Verlinde, Elad Yundler, David Weisberg, Kenneth A. Norman, Tanishq Mathew Abraham

Abstract: We present MindEye, a novel fMRI-to-image approach to retrieve and reconstruct viewed images from brain activity. Our model comprises two parallel submodules that are specialized for retrieval (using contrastive learning) and reconstruction (using a diffusion prior). MindEye can map fMRI brain activity to any high dimensional multimodal latent space, like CLIP image space, enabling image reconstruction using generative models that accept embeddings from this latent space. We comprehensively compare our approach with other existing methods, using both qualitative side-by-side comparisons and quantitative evaluations, and show that MindEye achieves state-of-the-art performance in both reconstruction and retrieval tasks. In particular, MindEye can retrieve the exact original image even among highly similar candidates indicating that its brain embeddings retain fine-grained image-specific information. This allows us to accurately retrieve images even from large-scale databases like LAION-5B. We demonstrate through ablations that MindEye's performance improvements over previous methods result from specialized submodules for retrieval and reconstruction, improved training techniques, and training models with orders of magnitude more parameters. Furthermore, we show that MindEye can better preserve low-level image features in the reconstructions by using img2img, with outputs from a separate autoencoder. All code is available on GitHub.

Submitted 7 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: Project Page at https://medarc.ai/mindeye. Code at https://github.com/MedARC-AI/fMRI-reconstruction-NSD/. Published as a conference paper at NeurIPS 2023

arXiv:2301.10297 [pdf, other]

Large language models can segment narrative events similarly to humans

Authors: Sebastian Michelmann, Manoj Kumar, Kenneth A. Norman, Mariya Toneva

Abstract: Humans perceive discrete events such as "restaurant visits" and "train rides" in their continuous experience. One important prerequisite for studying human event perception is the ability of researchers to quantify when one event ends and another begins. Typically, this information is derived by aggregating behavioral annotations from several observers. Here we present an alternative computational approach where event boundaries are derived using a large language model, GPT-3, instead of using human annotations. We demonstrate that GPT-3 can segment continuous narrative text into events. GPT-3-annotated events are significantly correlated with human event annotations. Furthermore, these GPT-derived annotations achieve a good approximation of the "consensus" solution (obtained by averaging across human annotations); the boundaries identified by GPT-3 are closer to the consensus, on average, than boundaries identified by individual human annotators. This finding suggests that GPT-3 provides a feasible solution for automated event annotations, and it demonstrates a further parallel between human cognition and prediction in large language models. In the future, GPT-3 may thereby help to elucidate the principles underlying human event perception.

Submitted 24 January, 2023; originally announced January 2023.

arXiv:1902.09006 [pdf, other]

doi: 10.7717/peerj.11046

Learning to Perform Role-Filler Binding with Schematic Knowledge

Authors: Catherine Chen, Qihong Lu, Andre Beukers, Christopher Baldassano, Kenneth A. Norman

Abstract: Through specific experiences, humans learn relationships underlying the structure of events in the world. Schema theory suggests that we organize this information in mental frameworks called "schemata," which represent our knowledge of the structure of the world. Generalizing knowledge of structural relationships to new situations requires role-filler binding, the ability to associate specific "fillers" with abstract "roles." For instance, when we hear the sentence "Alice ordered a tea from Bob," the role-filler bindings "Alice:customer," "tea:drink," and "Bob:barista" allow us to understand and make inferences about the sentence. We can perform these bindings for arbitrary fillers -- we understand this sentence even if we have never heard the names "Alice," "tea," or "Bob" before. In this work, we define a model as capable of performing role-filler binding if it can recall arbitrary fillers corresponding to a specified role, even when these pairings violate correlations seen during training. Previous work found that models can learn this ability when explicitly told what the roles and fillers are, or when given fillers seen during training. We show that networks with external memory can learn these relationships with fillers not seen during training and without explicitly labeled role-filler bindings, and show that analyses inspired by neural decoding can provide a means of understanding what the networks have learned.

Submitted 12 May, 2020; v1 submitted 24 February, 2019; originally announced February 2019.

Journal ref: PeerJ 9:e11046 (2021)

arXiv:1811.11684 [pdf, other]

Shared Representational Geometry Across Neural Networks

Authors: Qihong Lu, Po-Hsuan Chen, Jonathan W. Pillow, Peter J. Ramadge, Kenneth A. Norman, Uri Hasson

Abstract: Different neural networks trained on the same dataset often learn similar input-output mappings with very different weights. Is there some correspondence between these neural network solutions? For linear networks, it has been shown that different instances of the same network architecture encode the same representational similarity matrix, and their neural activity patterns are connected by orthogonal transformations. However, it is unclear if this holds for non-linear networks. Using a shared response model, we show that different neural networks encode the same input examples as different orthogonal transformations of an underlying shared representation. We test this claim using both standard convolutional neural networks and residual networks on CIFAR10 and CIFAR100.

Submitted 16 March, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

Comments: Integration of Deep Learning Theories workshop, NeurIPS 2018

arXiv:1809.04166 [pdf, other]

Leabra7: a Python package for modeling recurrent, biologically-realistic neural networks

Authors: C. Daniel Greenidge, Noam Miller, Kenneth A. Norman

Abstract: Emergent is a software package that uses the AdEx neural dynamics model and LEABRA learning algorithm to simulate and train arbitrary recurrent neural network architectures in a biologically-realistic manner. We present Leabra7, a complementary Python library that implements these same algorithms. Leabra7 is developed and distributed using modern software development principles, and integrates tightly with Python's scientific stack. We demonstrate recurrent Leabra7 networks using traditional pattern-association tasks and a standard machine learning task, classifying the IRIS dataset.

Submitted 19 September, 2018; v1 submitted 11 September, 2018; originally announced September 2018.

Comments: Fix minor typos

arXiv:1608.04647 [pdf, other]

doi: 10.1109/BigData.2016.7840719

Enabling Factor Analysis on Thousand-Subject Neuroimaging Datasets

Authors: Michael J. Anderson, Mihai Capotă, Javier S. Turek, Xia Zhu, Theodore L. Willke, Yida Wang, Po-Hsuan Chen, Jeremy R. Manning, Peter J. Ramadge, Kenneth A. Norman

Abstract: The scale of functional magnetic resonance image data is rapidly increasing as large multi-subject datasets are becoming widely available and high-resolution scanners are adopted. The inherent low-dimensionality of the information in this data has led neuroscientists to consider factor analysis methods to extract and analyze the underlying brain activity. In this work, we consider two recent multi-subject factor analysis methods: the Shared Response Model and Hierarchical Topographic Factor Analysis. We perform analytical, algorithmic, and code optimization to enable multi-node parallel implementations to scale. Single-node improvements result in 99x and 1812x speedups on these two methods, and enables the processing of larger datasets. Our distributed implementations show strong scaling of 3.3x and 5.5x respectively with 20 nodes on real datasets. We also demonstrate weak scaling on a synthetic dataset with 1024 subjects, on up to 1024 nodes and 32,768 cores.

Submitted 17 August, 2016; v1 submitted 16 August, 2016; originally announced August 2016.

MSC Class: 68W15; ACM Class: I.2

Showing 1–9 of 9 results for author: Norman, K A