Search | arXiv e-print repository

NSD-Imagery: A benchmark dataset for extending fMRI vision decoding methods to mental imagery

Authors: Reese Kneeland, Paul S. Scotti, Ghislain St-Yves, Jesse Breedlove, Kendrick Kay, Thomas Naselaris

Abstract: We release NSD-Imagery, a benchmark dataset of human fMRI activity paired with mental images, to complement the existing Natural Scenes Dataset (NSD), a large-scale dataset of fMRI activity paired with seen images that enabled unprecedented improvements in fMRI-to-image reconstruction efforts. Recent models trained on NSD have been evaluated only on seen image reconstruction. Using NSD-Imagery, it… ▽ More We release NSD-Imagery, a benchmark dataset of human fMRI activity paired with mental images, to complement the existing Natural Scenes Dataset (NSD), a large-scale dataset of fMRI activity paired with seen images that enabled unprecedented improvements in fMRI-to-image reconstruction efforts. Recent models trained on NSD have been evaluated only on seen image reconstruction. Using NSD-Imagery, it is possible to assess how well these models perform on mental image reconstruction. This is a challenging generalization requirement because mental images are encoded in human brain activity with relatively lower signal-to-noise and spatial resolution; however, generalization from seen to mental imagery is critical for real-world applications in medical domains and brain-computer interfaces, where the desired information is always internally generated. We provide benchmarks for a suite of recent NSD-trained open-source visual decoding models (MindEye1, MindEye2, Brain Diffuser, iCNN, Takagi et al.) on NSD-Imagery, and show that the performance of decoding methods on mental images is largely decoupled from performance on vision reconstruction. We further demonstrate that architectural choices significantly impact cross-decoding performance: models employing simple linear decoding architectures and multimodal feature decoding generalize better to mental imagery, while complex architectures tend to overfit visual training data. Our findings indicate that mental imagery datasets are critical for the development of practical applications, and establish NSD-Imagery as a useful resource for better aligning visual decoding methods with this goal. △ Less

Submitted 7 June, 2025; originally announced June 2025.

Comments: Published at CVPR 2025

arXiv:2503.06286 [pdf]

A 7T fMRI dataset of synthetic images for out-of-distribution modeling of vision

Authors: Alessandro T. Gifford, Radoslaw M. Cichy, Thomas Naselaris, Kendrick Kay

Abstract: Large-scale visual neural datasets such as the Natural Scenes Dataset (NSD) are boosting NeuroAI research by enabling computational models of the brain with performances beyond what was possible just a decade ago. However, these datasets lack out-of-distribution (OOD) components, which are crucial for the development of more robust models. Here, we address this limitation by releasing NSD-syntheti… ▽ More Large-scale visual neural datasets such as the Natural Scenes Dataset (NSD) are boosting NeuroAI research by enabling computational models of the brain with performances beyond what was possible just a decade ago. However, these datasets lack out-of-distribution (OOD) components, which are crucial for the development of more robust models. Here, we address this limitation by releasing NSD-synthetic, a dataset consisting of 7T fMRI responses from the eight NSD subjects for 284 carefully controlled synthetic images. We show that NSD-synthetic's fMRI responses reliably encode stimulus-related information and are OOD with respect to NSD. Furthermore, OOD generalization tests on NSD-synthetic reveal differences between models of the brain that are not detected with NSD - specifically, self-supervised deep neural networks better explain neural responses than their task-supervised counterparts. These results showcase how NSD-synthetic enables OOD generalization tests that facilitate the development of more robust models of visual processing, and the formulation of more accurate theories of human vision. △ Less

Submitted 8 March, 2025; originally announced March 2025.

arXiv:2301.03198 [pdf]

The Algonauts Project 2023 Challenge: How the Human Brain Makes Sense of Natural Scenes

Authors: A. T. Gifford, B. Lahner, S. Saba-Sadiya, M. G. Vilas, A. Lascelles, A. Oliva, K. Kay, G. Roig, R. M. Cichy

Abstract: The sciences of biological and artificial intelligence are ever more intertwined. Neural computational principles inspire new intelligent machines, which are in turn used to advance theoretical understanding of the brain. To promote further exchange of ideas and collaboration between biological and artificial intelligence researchers, we introduce the 2023 installment of the Algonauts Project chal… ▽ More The sciences of biological and artificial intelligence are ever more intertwined. Neural computational principles inspire new intelligent machines, which are in turn used to advance theoretical understanding of the brain. To promote further exchange of ideas and collaboration between biological and artificial intelligence researchers, we introduce the 2023 installment of the Algonauts Project challenge: How the Human Brain Makes Sense of Natural Scenes (http://algonauts.csail.mit.edu). This installment prompts the fields of artificial and biological intelligence to come together towards building computational models of the visual brain using the largest and richest dataset of fMRI responses to visual scenes, the Natural Scenes Dataset (NSD). NSD provides high-quality fMRI responses to ~73,000 different naturalistic colored scenes, making it the ideal candidate for data-driven model building approaches promoted by the 2023 challenge. The challenge is open to all and makes results directly comparable and transparent through a public leaderboard automatically updated after each submission, thus allowing for rapid model development. We believe that the 2023 installment will spark symbiotic collaborations between biological and artificial intelligence scientists, leading to a deeper understanding of the brain through cutting-edge computational models and to novel ways of engineering artificial intelligent agents through inductive biases from biological systems. △ Less

Submitted 11 July, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

Comments: 5 pages, 2 figures

arXiv:2209.11737 [pdf]

Visual representations in the human brain are aligned with large language models

Authors: Adrien Doerig, Tim C Kietzmann, Emily Allen, Yihan Wu, Thomas Naselaris, Kendrick Kay, Ian Charest

Abstract: The human brain extracts complex information from visual inputs, including objects, their spatial and semantic interrelations, and their interactions with the environment. However, a quantitative approach for studying this information remains elusive. Here, we test whether the contextual information encoded in large language models (LLMs) is beneficial for modelling the complex visual information… ▽ More The human brain extracts complex information from visual inputs, including objects, their spatial and semantic interrelations, and their interactions with the environment. However, a quantitative approach for studying this information remains elusive. Here, we test whether the contextual information encoded in large language models (LLMs) is beneficial for modelling the complex visual information extracted by the brain from natural scenes. We show that LLM embeddings of scene captions successfully characterise brain activity evoked by viewing the natural scenes. This mapping captures selectivities of different brain areas, and is sufficiently robust that accurate scene captions can be reconstructed from brain activity. Using carefully controlled model comparisons, we then proceed to show that the accuracy with which LLM representations match brain representations derives from the ability of LLMs to integrate complex information contained in scene captions beyond that conveyed by individual words. Finally, we train deep neural network models to transform image inputs into LLM representations. Remarkably, these networks learn representations that are better aligned with brain representations than a large number of state-of-the-art alternative models, despite being trained on orders-of-magnitude less data. Overall, our results suggest that LLM embeddings of scene captions provide a representational format that accounts for complex information extracted by the brain from visual inputs. △ Less

Submitted 6 July, 2024; v1 submitted 23 September, 2022; originally announced September 2022.

arXiv:2201.09351 [pdf]

doi 10.1371/journal.pone.0270895

The risk of bias in denoising methods

Authors: Kendrick Kay

Abstract: Experimental datasets are growing rapidly in size, scope, and detail, but the value of these datasets is limited by unwanted measurement noise. It is therefore tempting to apply analysis techniques that attempt to reduce noise and enhance signals of interest. In this paper, we draw attention to the possibility that denoising methods may introduce bias and lead to incorrect scientific inferences. T… ▽ More Experimental datasets are growing rapidly in size, scope, and detail, but the value of these datasets is limited by unwanted measurement noise. It is therefore tempting to apply analysis techniques that attempt to reduce noise and enhance signals of interest. In this paper, we draw attention to the possibility that denoising methods may introduce bias and lead to incorrect scientific inferences. To present our case, we first review the basic statistical concepts of bias and variance. Denoising techniques typically reduce variance observed across repeated measurements, but this can come at the expense of introducing bias to the average expected outcome. We then conduct three simple simulations that provide concrete examples of how bias may manifest in everyday situations. These simulations reveal several findings that may be surprising and counterintuitive: (i) different methods can be equally effective at reducing variance but some incur bias while others do not, (ii) identifying methods that better recover ground truth does not guarantee the absence of bias, (iii) bias can arise even if one has specific knowledge of properties of the signal of interest. We suggest that researchers should consider and possibly quantify bias before deploying denoising methods on important research data. △ Less

Submitted 25 January, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

Comments: 19 pages, 4 figures (v2 update to fix author information)

arXiv:2105.07140 [pdf, other]

NeuroGen: activation optimized image synthesis for discovery neuroscience

Authors: Zijin Gu, Keith W. Jamison, Meenakshi Khosla, Emily J. Allen, Yihan Wu, Thomas Naselaris, Kendrick Kay, Mert R. Sabuncu, Amy Kuceyeski

Abstract: Functional MRI (fMRI) is a powerful technique that has allowed us to characterize visual cortex responses to stimuli, yet such experiments are by nature constructed based on a priori hypotheses, limited to the set of images presented to the individual while they are in the scanner, are subject to noise in the observed brain responses, and may vary widely across individuals. In this work, we propos… ▽ More Functional MRI (fMRI) is a powerful technique that has allowed us to characterize visual cortex responses to stimuli, yet such experiments are by nature constructed based on a priori hypotheses, limited to the set of images presented to the individual while they are in the scanner, are subject to noise in the observed brain responses, and may vary widely across individuals. In this work, we propose a novel computational strategy, which we call NeuroGen, to overcome these limitations and develop a powerful tool for human vision neuroscience discovery. NeuroGen combines an fMRI-trained neural encoding model of human vision with a deep generative network to synthesize images predicted to achieve a target pattern of macro-scale brain activation. We demonstrate that the reduction of noise that the encoding model provides, coupled with the generative network's ability to produce images of high fidelity, results in a robust discovery architecture for visual neuroscience. By using only a small number of synthetic images created by NeuroGen, we demonstrate that we can detect and amplify differences in regional and individual human brain response patterns to visual stimuli. We then verify that these discoveries are reflected in the several thousand observed image responses measured with fMRI. We further demonstrate that NeuroGen can create synthetic images predicted to achieve regional response patterns not achievable by the best-matching natural images. The NeuroGen framework extends the utility of brain encoding models and opens up a new avenue for exploring, and possibly precisely controlling, the human visual system. △ Less

Submitted 15 May, 2021; originally announced May 2021.

arXiv:2104.13714 [pdf]

The Algonauts Project 2021 Challenge: How the Human Brain Makes Sense of a World in Motion

Authors: R. M. Cichy, K. Dwivedi, B. Lahner, A. Lascelles, P. Iamshchinina, M. Graumann, A. Andonian, N. A. R. Murty, K. Kay, G. Roig, A. Oliva

Abstract: The sciences of natural and artificial intelligence are fundamentally connected. Brain-inspired human-engineered AI are now the standard for predicting human brain responses during vision, and conversely, the brain continues to inspire invention in AI. To promote even deeper connections between these fields, we here release the 2021 edition of the Algonauts Project Challenge: How the Human Brain M… ▽ More The sciences of natural and artificial intelligence are fundamentally connected. Brain-inspired human-engineered AI are now the standard for predicting human brain responses during vision, and conversely, the brain continues to inspire invention in AI. To promote even deeper connections between these fields, we here release the 2021 edition of the Algonauts Project Challenge: How the Human Brain Makes Sense of a World in Motion (http://algonauts.csail.mit.edu/). We provide whole-brain fMRI responses recorded while 10 human participants viewed a rich set of over 1,000 short video clips depicting everyday events. The goal of the challenge is to accurately predict brain responses to these video clips. The format of our challenge ensures rapid development, makes results directly comparable and transparent, and is open to all. In this way it facilitates interdisciplinary collaboration towards a common goal of understanding visual intelligence. The 2021 Algonauts Project is conducted in collaboration with the Cognitive Computational Neuroscience (CCN) conference. △ Less

Submitted 28 April, 2021; originally announced April 2021.

Comments: 5 pages, 2 figures

arXiv:2002.03211 [pdf, other]

Appreciating the variety of goals in computational neuroscience

Authors: Konrad P. Kording, Gunnar Blohm, Paul Schrater, Kendrick Kay

Abstract: Within computational neuroscience, informal interactions with modelers often reveal wildly divergent goals. In this opinion piece, we explicitly address the diversity of goals that motivate and ultimately influence modeling efforts. We argue that a wide range of goals can be meaningfully taken to be of highest importance. A simple informal survey conducted on the Internet confirmed the diversity o… ▽ More Within computational neuroscience, informal interactions with modelers often reveal wildly divergent goals. In this opinion piece, we explicitly address the diversity of goals that motivate and ultimately influence modeling efforts. We argue that a wide range of goals can be meaningfully taken to be of highest importance. A simple informal survey conducted on the Internet confirmed the diversity of goals in the community. However, different priorities or preferences of individual researchers can lead to divergent model evaluation criteria. We propose that many disagreements in evaluating the merit of computational research stem from differences in goals and not from the mechanics of constructing, describing, and validating models. We suggest that authors state explicitly their goals when proposing models so that others can judge the quality of the research with respect to its stated goals. △ Less

Submitted 8 February, 2020; originally announced February 2020.

Comments: Accepted for publication in Neurons, Behavior, Data Analysis, and Theory

arXiv:1608.06146 [pdf, other]

doi 10.1371/journal.pcbi.1005400

The Role of the Hes1 Crosstalk Hub in Notch-Wnt Interactions of the Intestinal Crypt

Authors: Sophie K. Kay, Heather A. Harrington, Sarah Shepherd, Keith Brennan, Trevor Dale, James M. Osborne, David J. Gavaghan, Helen M. Byrne

Abstract: The Notch pathway plays a vital role in determining whether cells in the intestinal epithelium adopt a secretory or an absorptive phenotype. Cell fate specification is coordinated via Notch's interaction with the canonical Wnt pathway. Here, we propose a new mathematical model of the Notch and Wnt pathways, in which the Hes1 promoter acts as a hub for pathway crosstalk. Computational simulations o… ▽ More The Notch pathway plays a vital role in determining whether cells in the intestinal epithelium adopt a secretory or an absorptive phenotype. Cell fate specification is coordinated via Notch's interaction with the canonical Wnt pathway. Here, we propose a new mathematical model of the Notch and Wnt pathways, in which the Hes1 promoter acts as a hub for pathway crosstalk. Computational simulations of the model can assist in understanding how healthy intestinal tissue is maintained, and predict the likely consequences of biochemical knockouts upon cell fate selection processes. Chemical reaction network theory (CRNT) is a powerful, generalised framework which assesses the capacity of our model for monostability or multistability, by analysing properties of the underlying network structure without recourse to specific parameter values or functional forms for reaction rates. CRNT highlights the role of beta-catenin in stabilising the Notch pathway and damping oscillations, demonstrating that Wnt-mediated actions on the Hes1 promoter can induce dynamical transitions in the Notch system, from multistability to monostability. Time-dependent model simulations of cell pairs reveal the stabilising influence of Wnt upon the Notch pathway, in which beta-catenin- and Dsh-mediated action on the Hes1 promoter are key in shaping the subcellular dynamics. Where Notch-mediated transcription of Hes1 dominates, there is Notch oscillation and maintenance of fate flexibility; Wnt-mediated transcription of Hes1 favours bistability akin to cell fate selection. Cells could therefore regulate the proportion of Wnt- and Notch-mediated control of the Hes1 promoter to coordinate the timing of cell fate selection as they migrate through the intestinal epithelium and are subject to reduced Wnt stimuli. △ Less

Submitted 22 August, 2016; originally announced August 2016.

arXiv:1411.0721 [pdf]

doi 10.1371/journal.pone.0123272

Evaluating the accuracy of diffusion MRI models in white matter

Authors: Ariel Rokem, Jason D. Yeatman, Franco Pestilli, Kendrick N. Kay, Aviv Mezer, Stefan van der Walt, Brian A. Wandell

Abstract: Models of diffusion MRI within a voxel are useful for making inferences about the properties of the tissue and inferring fiber orientation distribution used by tractography algorithms. A useful model must fit the data accurately. However, evaluations of model-accuracy of some of the models that are commonly used in analyzing human white matter have not been published before. Here, we evaluate mode… ▽ More Models of diffusion MRI within a voxel are useful for making inferences about the properties of the tissue and inferring fiber orientation distribution used by tractography algorithms. A useful model must fit the data accurately. However, evaluations of model-accuracy of some of the models that are commonly used in analyzing human white matter have not been published before. Here, we evaluate model-accuracy of the two main classes of diffusion MRI models. The diffusion tensor model (DTM) summarizes diffusion as a 3-dimensional Gaussian distribution. Sparse fascicle models (SFM) summarize the signal as a linear sum of signals originating from a collection of fascicles oriented in different directions. We use cross-validation to assess model-accuracy at different gradient amplitudes (b-values) throughout the white matter. Specifically, we fit each model to all the white matter voxels in one data set and then use the model to predict a second, independent data set. This is the first evaluation of model-accuracy of these models. In most of the white matter the DTM predicts the data more accurately than test-retest reliability; SFM model-accuracy is higher than test-retest reliability and also higher than the DTM, particularly for measurements with (a) a b-value above 1000 in locations containing fiber crossings, and (b) in the regions of the brain surrounding the optic radiations. The SFM also has better parameter-validity: it more accurately estimates the fiber orientation distribution function (fODF) in each voxel, which is useful for fiber tracking. △ Less

Submitted 14 March, 2015; v1 submitted 3 November, 2014; originally announced November 2014.

Showing 1–10 of 10 results for author: Kay, K