RTify: Aligning Deep Neural Networks with Human Behavioral Decisions

Authors: Yu-Ang Cheng, Ivan Felipe Rodriguez, Sixuan Chen, Kohitij Kar, Takeo Watanabe, Thomas Serre

Abstract: Current neural network models of primate vision focus on replicating overall levels of behavioral accuracy, often neglecting perceptual decisions' rich, dynamic nature. Here, we introduce a novel computational framework to model the dynamics of human behavioral choices by learning to align the temporal dynamics of a recurrent neural network (RNN) to human reaction times (RTs). We describe an approximation that allows us to constrain the number of time steps an RNN takes to solve a task with human RTs. The approach is extensively evaluated against various psychophysics experiments. We also show that the approximation can be used to optimize an "ideal-observer" RNN model to achieve an optimal tradeoff between speed and accuracy without human data. The resulting model is found to account well for human RT data. Finally, we use the approximation to train a deep learning implementation of the popular Wong-Wang decision-making model. The model is integrated with a convolutional neural network (CNN) model of visual processing and evaluated using both artificial and natural image stimuli. Overall, we present a novel framework that helps align current vision models with human behavior, bringing us closer to an integrated model of human vision.

Submitted 26 December, 2024; v1 submitted 5 November, 2024; originally announced November 2024.

Comments: Published at NeurIPS 2024

arXiv:2401.03727 [pdf]

Low-cost, portable, easy-to-use kiosks to facilitate home-cage testing of non-human primates during vision-based behavioral tasks

Authors: Hamidreza Ramezanpour, Christopher Giverin, Kohitij Kar

Abstract: Non-human primates (NHPs), especially rhesus macaques, have played a significant role in our current understanding of the neural computations underlying human vision. Apart from the established homologies in the visual brain areas between these two species, and our extended abilities to probe detailed neural mechanisms in monkeys at multiple scales, one major factor that makes NHPs an extremely appealing animal model of human vision is their ability to perform human-like visual behavior. Traditionally, such behavioral studies have been conducted in controlled laboratory settings. Such in-lab studies offer the experimenter tight control over many experimental variables, such as overall luminance, eye movements (via eye tracking), and auditory interference. However, there are several constraints related to such experiments. These include: 1) limited total experimental time, 2) the requirement of dedicated human experimenters for the NHPs, 3) the requirement of additional lab space for the experiments, 4) the frequent need for NHPs to undergo invasive surgeries for a head-post implant, and 5) the additional time and training required for chairing and head restraint of monkeys. To overcome these limitations, many laboratories are now adopting home-cage behavioral training and testing of NHPs. Home-cage behavioral testing enables the administering of many vision-based behavioral tasks simultaneously across multiple monkeys with much reduced human personnel requirements and no NHP head restraint, and it provides NHPs access to the experiments without specific time constraints.
To enable more open-source development of this technology, here we provide the details of operating and building a portable, easy-to-use kiosk for conducting home-cage vision-based behavioral tasks in NHPs.

Submitted 8 January, 2024; originally announced January 2024.

Comments: Another earlier version available at https://osf.io/preprints/osf/csdzv

arXiv:2401.03376 [pdf]

How to optimize neuroscience data utilization and experiment design for advancing brain models of visual and linguistic cognition?

Authors: Greta Tuckute, Dawn Finzi, Eshed Margalit, Joel Zylberberg, SueYeon Chung, Alona Fyshe, Evelina Fedorenko, Nikolaus Kriegeskorte, Jacob Yates, Kalanit Grill-Spector, Kohitij Kar

Abstract: In recent years, neuroscience has made significant progress in building large-scale artificial neural network (ANN) models of brain activity and behavior. However, there is no consensus on the most efficient ways to collect data and design experiments to develop the next generation of models. This article explores the controversial opinions that have emerged on this topic in the domain of vision and language. Specifically, we address two critical points. First, we weigh the pros and cons of using qualitative insights from empirical results versus raw experimental data to train models. Second, we consider model-free (intuition-based) versus model-based approaches for data collection, specifically experimental design and stimulus selection, for optimal model development. Finally, we consider the challenges of developing a synergistic approach to experimental design and model building, including encouraging data and model sharing and the implications of iterative additions to existing models. The goal of the paper is to discuss decision points and propose directions for both experimenters and model developers in the quest to understand the brain.

Submitted 28 December, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

arXiv:2312.05956 [pdf, other]

The Quest for an Integrated Set of Neural Mechanisms Underlying Object Recognition in Primates

Authors: Kohitij Kar, James J DiCarlo

Abstract: Visual object recognition -- the behavioral ability to rapidly and accurately categorize many visually encountered objects -- is core to primate cognition. This behavioral capability is algorithmically impressive because of the myriad identity-preserving viewpoints and scenes that dramatically change the visual image produced by the same object. Until recently, the brain mechanisms that support that capability were deeply mysterious. However, over the last decade, this scientific mystery has been illuminated by the discovery and development of brain-inspired, image-computable, artificial neural network (ANN) systems that rival primates in this behavioral feat. Apart from fundamentally changing the landscape of artificial intelligence (AI), modified versions of these ANN systems are the current leading scientific hypotheses of an integrated set of mechanisms in the primate ventral visual stream that support object recognition. What separates brain-mapped versions of these systems from prior conceptual models is that they are Sensory-computable, Mechanistic, Anatomically Referenced, and Testable (SMART). Here, we review and provide perspective on the brain mechanisms that the currently leading SMART models address. We review the empirical brain and behavioral alignment successes and failures of those current models. Given ongoing advances in neurobehavioral measurements and AI, we discuss the next frontiers for even more accurate mechanistic understanding. And we outline the likely applications of that SMART-model-based understanding.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2206.03951 [pdf]

doi 10.1038/s42256-022-00592-3

Interpretability of artificial neural network models in artificial Intelligence vs. neuroscience

Authors: Kohitij Kar, Simon Kornblith, Evelina Fedorenko

Abstract: Computationally explicit hypotheses of brain function derived from machine learning (ML)-based models have recently revolutionized neuroscience. Despite the unprecedented ability of these artificial neural networks (ANNs) to capture responses in biological neural networks (brains), and our full access to all internal model components (unlike the brain), ANNs are often referred to as black-boxes with limited interpretability. Interpretability, however, is a multi-faceted construct that is used differently across fields. In particular, interpretability, or explainability, efforts in Artificial Intelligence (AI) focus on understanding how different model components contribute to its output (i.e., decision making). In contrast, the neuroscientific interpretability of ANNs requires explicit alignment between model components and neuroscientific constructs (e.g., different brain areas or phenomena, like recurrence or top-down feedback). Given the widespread calls to improve the interpretability of AI systems, we here highlight these different notions of interpretability and argue that the neuroscientific interpretability of ANNs can be pursued in parallel with, but independently from, the ongoing efforts in AI. Certain ML techniques (e.g., deep dream) can be leveraged in both fields, to ask what stimulus optimally activates the specific model features (feature visualization by optimization), or how different features contribute to the model's output (feature attribution).
However, without appropriate brain alignment, certain features will remain uninterpretable to neuroscientists.

Submitted 7 June, 2022; originally announced June 2022.

Comments: 3 pages

Journal ref: Nat Mach Intell 4, 1065-1067 (2022)

arXiv:1909.06161 [pdf, other]

Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs

Authors: Jonas Kubilius, Martin Schrimpf, Kohitij Kar, Ha Hong, Najib J. Majaj, Rishi Rajalingham, Elias B. Issa, Pouya Bashivan, Jonathan Prescott-Roy, Kailyn Schmidt, Aran Nayebi, Daniel Bear, Daniel L. K. Yamins, James J. DiCarlo

Abstract: Deep convolutional artificial neural networks (ANNs) are the leading class of candidate models of the mechanisms of visual processing in the primate ventral stream. While initially inspired by brain anatomy, over the past years, these ANNs have evolved from a simple eight-layer architecture in AlexNet to extremely deep and branching architectures, demonstrating increasingly better object categorization performance, yet bringing into question how brain-like they still are. In particular, typical deep models from the machine learning community are often hard to map onto the brain's anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. Here we demonstrate that better anatomical alignment to the brain and high performance on machine learning as well as neuroscience measures do not have to be in contradiction. We developed CORnet-S, a shallow ANN with four anatomically mapped areas and recurrent connectivity, guided by Brain-Score, a new large-scale composite of neural and behavioral benchmarks for quantifying the functional fidelity of models of the primate ventral visual stream. Despite being significantly shallower than most models, CORnet-S is the top model on Brain-Score and outperforms similarly compact models on ImageNet. Moreover, our extensive analyses of CORnet-S circuitry variants reveal that recurrence is the main predictive factor of both Brain-Score and ImageNet top-1 performance.
Finally, we report that the temporal evolution of the CORnet-S "IT" neural population resembles the actual monkey IT population dynamics. Taken together, these results establish CORnet-S, a compact, recurrent ANN, as the current best model of the primate ventral visual stream.

Submitted 28 October, 2019; v1 submitted 13 September, 2019; originally announced September 2019.

Comments: NeurIPS 2019 (Oral). Code available at https://github.com/dicarlolab/neurips2019

arXiv:1807.00053 [pdf, other]

Task-Driven Convolutional Recurrent Models of the Visual System

Authors: Aran Nayebi, Daniel Bear, Jonas Kubilius, Kohitij Kar, Surya Ganguli, David Sussillo, James J. DiCarlo, Daniel L. K. Yamins

Abstract: Feed-forward convolutional neural networks (CNNs) are currently state-of-the-art for object classification tasks such as ImageNet. Further, they are quantitatively accurate models of temporally-averaged responses of neurons in the primate brain's visual system. However, biological visual systems have two ubiquitous architectural features not shared with typical CNNs: local recurrence within cortical areas, and long-range feedback from downstream areas to upstream areas. Here we explored the role of recurrence in improving classification performance. We found that standard forms of recurrence (vanilla RNNs and LSTMs) do not perform well within deep CNNs on the ImageNet task. In contrast, novel cells that incorporated two structural features, bypassing and gating, were able to boost task accuracy substantially. We extended these design principles in an automated search over thousands of model architectures, which identified novel local recurrent cells and long-range feedback connections useful for object recognition. Moreover, these task-optimized ConvRNNs matched the dynamics of neural activity in the primate visual system better than feedforward networks, suggesting a role for the brain's recurrent connections in performing difficult visual behaviors.

Submitted 26 October, 2018; v1 submitted 20 June, 2018; originally announced July 2018.

Comments: NIPS 2018 Camera Ready Version, 16 pages including supplementary information, 6 figures

Showing 1–7 of 7 results for author: Kar, K