-
Crafting Interpretable Embeddings by Asking LLMs Questions
Authors:
Vinamra Benara,
Chandan Singh,
John X. Morris,
Richard Antonello,
Ion Stoica,
Alexander G. Huth,
Jianfeng Gao
Abstract:
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks. However, their opaqueness and proliferation into scientific domains such as neuroscience have created a growing need for interpretability. Here, we ask whether we can obtain interpretable embeddings through LLM prompting. We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM. Training QA-Emb reduces to selecting a set of underlying questions rather than learning model weights.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli. QA-Emb significantly outperforms an established interpretable baseline, and does so while requiring very few questions. This paves the way towards building flexible feature spaces that can concretize and evaluate our understanding of semantic brain representations. We additionally find that QA-Emb can be effectively approximated with an efficient model, and we explore broader applications in simple NLP tasks.
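As a rough illustration of the idea that each embedding feature is the answer to a yes/no question, the following minimal sketch builds a binary feature matrix from a list of questions. It is not the authors' implementation: `qa_embed`, `toy_answer`, and the two example questions are hypothetical names introduced here, and a keyword lookup stands in for the LLM call purely so the example runs without a model.

```python
from typing import Callable, List

# Hypothetical questions with toy keyword lists. In QA-Emb the answers
# would come from prompting an LLM; this lookup is only a stand-in.
TOY_KEYWORDS = {
    "Does the input mention a number?": ("one", "two", "three"),
    "Does the input mention a location?": ("kitchen", "park"),
}


def toy_answer(question: str, text: str) -> bool:
    """Stand-in for an LLM yes/no answer (assumption: keyword match)."""
    words = text.lower().split()
    return any(keyword in words for keyword in TOY_KEYWORDS[question])


def qa_embed(
    texts: List[str],
    questions: List[str],
    answer_fn: Callable[[str, str], bool],
) -> List[List[int]]:
    """Return one row per text and one 0/1 feature per question."""
    return [
        [1 if answer_fn(question, text) else 0 for question in questions]
        for text in texts
    ]


questions = list(TOY_KEYWORDS)
texts = ["Two dogs ran in the park", "She smiled"]
embeddings = qa_embed(texts, questions, toy_answer)
```

Because every column corresponds to a readable question, a linear model fitted on such a matrix can be inspected question by question, and changing the feature space amounts to changing the question list rather than retraining weights.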
Submitted 26 May, 2024;
originally announced May 2024.
-
Explaining black box text modules in natural language with language models
Authors:
Chandan Singh,
Aliyah R. Hsu,
Richard Antonello,
Shailee Jain,
Alexander G. Huth,
Bin Yu,
Jianfeng Gao
Abstract:
Large language models (LLMs) have demonstrated remarkable prediction performance for a growing array of tasks. However, their rapid proliferation and increasing opaqueness have created a growing need for interpretability. Here, we ask whether we can automatically obtain natural language explanations for black box text modules. A "text module" is any function that maps text to a scalar continuous value, such as a submodule within an LLM or a fitted model of a brain region. "Black box" indicates that we only have access to the module's inputs/outputs.
We introduce Summarize and Score (SASC), a method that takes in a text module and returns a natural language explanation of the module's selectivity along with a score for how reliable the explanation is. We study SASC in 3 contexts. First, we evaluate SASC on synthetic modules and find that it often recovers ground truth explanations. Second, we use SASC to explain modules found within a pre-trained BERT model, enabling inspection of the model's internals. Finally, we show that SASC can generate explanations for the response of individual fMRI voxels to language stimuli, with potential applications to fine-grained brain mapping. All code for using SASC and reproducing results is made available on Github.
Submitted 15 November, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
PrAGMATiC: a Probabilistic and Generative Model of Areas Tiling the Cortex
Authors:
Alexander G. Huth,
Thomas L. Griffiths,
Frederic E. Theunissen,
Jack L. Gallant
Abstract:
Much of the human cortex seems to be organized into topographic cortical maps. Yet few quantitative methods exist for characterizing these maps. To address this issue we developed a modeling framework that can reveal group-level cortical maps based on neuroimaging data. PrAGMATiC, a probabilistic and generative model of areas tiling the cortex, is a hierarchical Bayesian generative model of cortical maps. This model assumes that the cortical map in each individual subject is a sample from a single underlying probability distribution. Learning the parameters of this distribution reveals the properties of a cortical map that are common across a group of subjects while avoiding the potentially lossy step of co-registering each subject into a group anatomical space. In this report we give a mathematical description of PrAGMATiC, describe approximations that make it practical to use, show preliminary results from its application to a real dataset, and describe a number of possible future extensions.
Submitted 14 April, 2015;
originally announced April 2015.
-
Correlated percolation models of structured habitat in ecology
Authors:
G. Huth,
A. Lesne,
F. Munoz,
E. Pitard
Abstract:
Percolation offers acknowledged models of random media when the relevant medium characteristics can be described as a binary feature. However, when considering habitat modeling in ecology, a natural constraint comes from nearest-neighbor correlations between the suitable/unsuitable states of the spatial units forming the habitat. Such constraints are also relevant in the physics of aggregation where underlying processes may lead to a form of correlated percolation. However, in ecology, the processes leading to habitat correlations are in general not known or very complex. As proposed by Hiebeler [Ecology 81, 1629 (2000)], these correlations can be captured in a lattice model by an observable aggregation parameter $q$, supplementing the density $p$ of suitable sites. We investigate this model as an instance of correlated percolation. We analyze the phase diagram of the percolation transition and compute the cluster size distribution, the pair-connectedness function $C(r)$ and the correlation function $g(r)$. We find that while $g(r)$ displays a power-law decrease associated with long-range correlations in a wide domain of parameter values, critical properties are compatible with the universality class of uncorrelated percolation. We contrast the correlation structures obtained respectively for the correlated percolation model and for the Ising model, and show that the diversity of habitat configurations generated by the Hiebeler model is richer than that of the archetypal Ising model. We also find that emergent structural properties are peculiar to the implemented algorithm, which calls into question the notion of a well-defined model of aggregated habitat. We conclude that the choice of model and algorithm has strong consequences for the insights ecological studies can gain from such models of species habitat.
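To make the notion of a cluster size distribution concrete, here is a minimal sketch for the simpler uncorrelated case only: each site of a square lattice is suitable independently with probability $p$, and clusters are found by breadth-first search. It does not implement the Hiebeler algorithm or the aggregation parameter $q$. The function names, the 4-neighbor connectivity, and the free (non-periodic) boundaries are all assumptions made for illustration.

```python
import random
from collections import deque


def random_lattice(n, p, seed=0):
    """n x n lattice; each site is suitable independently with probability p."""
    rng = random.Random(seed)
    return [[rng.random() < p for _ in range(n)] for _ in range(n)]


def cluster_sizes(grid):
    """Sizes of 4-connected clusters of suitable sites, largest first."""
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    seen = [[False] * n_cols for _ in range(n_rows)]
    sizes = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not grid[r][c] or seen[r][c]:
                continue
            # Breadth-first search over one cluster, starting at (r, c).
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                i, j = queue.popleft()
                size += 1
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    a, b = i + di, j + dj
                    inside = 0 <= a < n_rows and 0 <= b < n_cols
                    if inside and grid[a][b] and not seen[a][b]:
                        seen[a][b] = True
                        queue.append((a, b))
            sizes.append(size)
    return sorted(sizes, reverse=True)


# Example: cluster sizes on a 20 x 20 lattice at density p = 0.5.
sizes = cluster_sizes(random_lattice(20, 0.5, seed=1))
```

A correlated model would replace `random_lattice` with a generator that also controls nearest-neighbor correlations; the cluster-finding step would stay the same.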
Submitted 1 October, 2014;
originally announced October 2014.