-
ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs
Authors:
Landon Butler,
Abhineet Agarwal,
Justin Singh Kang,
Yigit Efe Erginbas,
Bin Yu,
Kannan Ramchandran
Abstract:
Large Language Models (LLMs) have achieved remarkable performance by capturing complex interactions between input features. To identify these interactions, most existing approaches require enumerating all possible combinations of features up to a given order, causing them to scale poorly with the number of inputs $n$. Recently, Kang et al. (2025) proposed SPEX, an information-theoretic approach that uses interaction sparsity to scale to $n \approx 10^3$ features. SPEX greatly improves upon prior methods but requires tens of thousands of model inferences, which can be prohibitive for large models. In this paper, we observe that LLM feature interactions are often hierarchical -- higher-order interactions are accompanied by their lower-order subsets -- which enables more efficient discovery. To exploit this hierarchy, we propose ProxySPEX, an interaction attribution algorithm that first fits gradient boosted trees to masked LLM outputs and then extracts the important interactions. Experiments across four challenging high-dimensional datasets show that ProxySPEX more faithfully reconstructs LLM outputs by 20% over marginal attribution approaches while using $10\times$ fewer inferences than SPEX. By accounting for interactions, ProxySPEX identifies features that influence model output over 20% more than those selected by marginal approaches. Further, we apply ProxySPEX to two interpretability tasks: data attribution, where we identify interactions among CIFAR-10 training samples that influence test predictions, and mechanistic interpretability, where we uncover interactions between attention heads, both within and across layers, on a question-answering task. ProxySPEX identifies interactions that enable more aggressive pruning of heads than marginal approaches.
Submitted 23 May, 2025;
originally announced May 2025.
-
SPEX: Scaling Feature Interaction Explanations for LLMs
Authors:
Justin Singh Kang,
Landon Butler,
Abhineet Agarwal,
Yigit Efe Erginbas,
Ramtin Pedarsani,
Kannan Ramchandran,
Bin Yu
Abstract:
Large language models (LLMs) have revolutionized machine learning due to their ability to capture complex interactions between input features. Popular post-hoc explanation methods like SHAP provide marginal feature attributions, while their extensions to interaction importances only scale to small input lengths ($\approx 20$). We propose Spectral Explainer (SPEX), a model-agnostic interaction attribution algorithm that efficiently scales to large input lengths ($\approx 1000$). SPEX exploits underlying natural sparsity among interactions -- common in real-world data -- and applies a sparse Fourier transform using a channel decoding algorithm to efficiently identify important interactions. We perform experiments across three difficult long-context datasets that require LLMs to utilize interactions between inputs to complete the task. For large inputs, SPEX outperforms marginal attribution methods by up to 20% in terms of faithfully reconstructing LLM outputs. Further, SPEX successfully identifies key features and interactions that strongly influence model output. For one of our datasets, HotpotQA, SPEX provides interactions that align with human annotations. Finally, we use our model-agnostic approach to generate explanations to demonstrate abstract reasoning in closed-source LLMs (GPT-4o mini) and compositional reasoning in vision-language models.
Submitted 19 February, 2025;
originally announced February 2025.
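As a rough illustration of what an "interaction" means in the spectral sense used in the SPEX abstract, the sketch below computes the Boolean Fourier coefficients of a toy masked-input function by brute force. It is not the SPEX algorithm, which relies on a sparse Fourier transform precisely to avoid this $2^n$ enumeration. The function `toy_model` and its weights are hypothetical stand-ins for a masked model output.

```python
from itertools import product

def toy_model(mask):
    # Hypothetical stand-in for a model evaluated on a masked input.
    # mask[i] == 1 keeps feature i; mask[i] == 0 masks it out.
    x0, x1, x2 = mask
    return 1.0 * x0 + 2.0 * x1 * x2

def fourier_coefficients(f, n):
    """Brute force: F(S) = 2^-n * sum over masks m of f(m) * (-1)^<S, m>."""
    masks = list(product((0, 1), repeat=n))
    coeffs = {}
    for s in masks:
        total = 0.0
        for m in masks:
            parity = sum(si * mi for si, mi in zip(s, m)) % 2
            total += (-1 if parity else 1) * f(m)
        coeffs[s] = total / len(masks)
    return coeffs

coeffs = fourier_coefficients(toy_model, 3)
# Keep only the non-zero coefficients: the "sparse" part of the spectrum.
sparse = {s: c for s, c in coeffs.items() if abs(c) > 1e-12}
```

In this toy case only five of the eight coefficients are non-zero, and the only non-zero coefficient of order two sits on features 1 and 2, which is the pair that interacts in `toy_model`.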
-
Estimating global article processing charges paid to six publishers for open access between 2019 and 2023
Authors:
Stefanie Haustein,
Eric Schares,
Juan Pablo Alperin,
Madelaine Hare,
Leigh-Ann Butler,
Nina Schönfelder
Abstract:
This study presents estimates of the global expenditure on article processing charges (APCs) paid to six publishers for open access between 2019 and 2023. APCs are fees charged for publishing in some fully open access journals (gold) and in subscription journals to make individual articles open access (hybrid). There is currently no way to systematically track institutional, national or global expenses for open access publishing due to a lack of transparency in APC prices, which articles they are paid for, or who pays them. We therefore curated and used an open dataset of annual APC list prices from Elsevier, Frontiers, MDPI, PLOS, Springer Nature, and Wiley in combination with the number of open access articles from these publishers indexed by OpenAlex to estimate that, globally, a total of \$8.349 billion (\$8.968 billion in 2023 US dollars) was spent on APCs between 2019 and 2023. We estimate that in 2023 MDPI (\$681.6 million), Elsevier (\$582.8 million) and Springer Nature (\$546.6 million) generated the most revenue with APCs. After adjusting for inflation, we also show that annual spending almost tripled from \$910.3 million in 2019 to \$2.538 billion in 2023, that hybrid fees exceed gold fees, and that the median APCs paid are higher than the median listed fees for both gold and hybrid. Our approach addresses major limitations in previous efforts to estimate APCs paid and offers much needed insight into an otherwise opaque aspect of the business of scholarly publishing. We call upon publishers to be more transparent about OA fees.
Submitted 23 July, 2024;
originally announced July 2024.
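The estimate described in the preceding abstract combines list prices with article counts. A minimal sketch of that arithmetic follows, assuming the simplest possible form (list price times article count, summed per year). The journal names, prices, and counts are invented for illustration and are not taken from the dataset; only the two spending figures used for the growth factor come from the abstract.

```python
# Hypothetical records: (journal, year, APC list price in USD, OA article count).
# These values are invented for illustration only.
records = [
    ("Journal A", 2022, 2000.0, 150),
    ("Journal A", 2023, 2200.0, 180),
    ("Journal B", 2023, 3000.0, 40),
]

def estimate_spend_by_year(rows):
    """Sum (list price * article count) for each year."""
    totals = {}
    for _journal, year, price_usd, n_articles in rows:
        totals[year] = totals.get(year, 0.0) + price_usd * n_articles
    return totals

totals = estimate_spend_by_year(records)

# Growth factor implied by the inflation-adjusted figures quoted in the
# abstract (millions of 2023 US dollars): 2019 versus 2023.
growth = 2538.0 / 910.3
```

The growth factor comes out a little under 2.8, consistent with the abstract's "almost tripled".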
-
An open dataset of article processing charges from six large scholarly publishers (2019-2023)
Authors:
Leigh-Ann Butler,
Madelaine Hare,
Nina Schönfelder,
Eric Schares,
Juan Pablo Alperin,
Stefanie Haustein
Abstract:
This paper introduces a dataset of article processing charges (APCs) produced from the price lists of six large scholarly publishers - Elsevier, Frontiers, PLOS, MDPI, Springer Nature and Wiley - between 2019 and 2023. APC price lists were downloaded from publisher websites each year as well as via Wayback Machine snapshots to retrieve fees per journal per year. The dataset includes journal metadata, APC collection method, and annual APC price list information in several currencies (USD, EUR, GBP, CHF, JPY, CAD) for 8,712 unique journals and 36,618 journal-year combinations. The dataset was generated to allow for more precise analysis of APCs and can support library collection development and scientometric analysis estimating APCs paid in gold and hybrid OA journals.
Submitted 12 June, 2024;
originally announced June 2024.
-
Learning to Understand: Identifying Interactions via the Möbius Transform
Authors:
Justin S. Kang,
Yigit E. Erginbas,
Landon Butler,
Ramtin Pedarsani,
Kannan Ramchandran
Abstract:
One of the key challenges in machine learning is to find interpretable representations of learned functions. The Möbius transform is essential for this purpose, as its coefficients correspond to unique importance scores for sets of input variables. This transform is closely related to widely used game-theoretic notions of importance like the Shapley and Banzhaf values, but it also captures crucial higher-order interactions. Although computing the Möbius transform of a function with $n$ inputs involves $2^n$ coefficients, it becomes tractable when the function is sparse and of low degree, as we show is the case for many real-world functions. Under these conditions, the complexity of the transform computation is significantly reduced. When there are $K$ non-zero coefficients, our algorithm recovers the Möbius transform in $O(Kn)$ samples and $O(Kn^2)$ time asymptotically under certain assumptions, the first non-adaptive algorithm to do so. We also uncover a surprising connection between group testing and the Möbius transform. For functions where all interactions involve at most $t$ inputs, we use group testing results to compute the Möbius transform with $O(Kt\log n)$ sample complexity and $O(K\mathrm{poly}(n))$ time. A robust version of this algorithm withstands noise and maintains this complexity. This marks the first noise-tolerant algorithm for the Möbius transform with query complexity sub-linear in $n$. In several examples, we observe that representations generated via sparse Möbius transform are up to twice as faithful to the original function, as compared to Shapley and Banzhaf values, while using the same number of terms.
Submitted 15 June, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
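To make the Möbius transform in the preceding abstract concrete, here is a brute-force sketch using the standard definition $F(S) = \sum_{T \subseteq S} (-1)^{|S|-|T|} f(T)$. It enumerates all $2^n$ subsets, so it is only the naive baseline and not the sparse algorithm the paper proposes. The set function `toy_function` is a hypothetical example chosen to be sparse and of low degree.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def mobius_transform(f, n):
    """F(S) = sum over T subset of S of (-1)^(|S| - |T|) * f(T)."""
    return {
        s: sum((-1) ** (len(s) - len(t)) * f(t) for t in subsets(s))
        for s in subsets(range(n))
    }

def toy_function(kept):
    # Hypothetical set function: input 0 contributes on its own,
    # inputs 1 and 2 contribute only when both are present.
    value = 0.0
    if 0 in kept:
        value += 1.0
    if 1 in kept and 2 in kept:
        value += 2.0
    return value

coeffs = mobius_transform(toy_function, 3)
nonzero = {s: c for s, c in coeffs.items() if c != 0}
```

Here `nonzero` holds just two entries, one for the singleton {0} and one for the pair {1, 2}, so this toy function has $K = 2$ non-zero coefficients and interactions of size at most $t = 2$.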
-
Non Commutative Convolutional Signal Models in Neural Networks: Stability to Small Deformations
Authors:
Alejandro Parada-Mayorga,
Landon Butler,
Alejandro Ribeiro
Abstract:
In this paper we discuss the results recently published in [1] about algebraic signal models (ASMs) based on non commutative algebras and their use in convolutional neural networks. Relying on the general tools from algebraic signal processing (ASP), we study the filtering and stability properties of non commutative convolutional filters. We show how non commutative filters can be stable to small perturbations on the space of operators. We also show that although the spectral components of the Fourier representation in a non commutative signal model are associated to spaces of dimension larger than one, there is a trade-off between stability and selectivity similar to that observed for commutative models. Our results have direct implications for group neural networks, multigraph neural networks and quaternion neural networks, among other non commutative architectures. We conclude by corroborating these results through numerical experiments.
Submitted 5 October, 2023;
originally announced October 2023.
-
Understanding Spoken Language Development of Children with ASD Using Pre-trained Speech Embeddings
Authors:
Anfeng Xu,
Rajat Hebbar,
Rimita Lahiri,
Tiantian Feng,
Lindsay Butler,
Lue Shen,
Helen Tager-Flusberg,
Shrikanth Narayanan
Abstract:
Speech processing techniques are useful for analyzing speech and language development in children with Autism Spectrum Disorder (ASD), whose acquisition of these skills is often varied and delayed. Early identification and intervention are crucial, but traditional assessment methodologies such as caregiver reports are not adequate for the requisite behavioral phenotyping. Natural Language Sample (NLS) analysis has gained attention as a promising complement. Researchers have developed benchmarks for spoken language capabilities in children with ASD, obtainable through the analysis of NLS. This paper proposes applications of speech processing technologies in support of automated assessment of children's spoken language development by classification between child and adult speech and between speech and nonverbal vocalization in NLS, with respective F1 macro scores of 82.6% and 67.8%, underscoring the potential for accurate and scalable tools for ASD research and clinical use.
Submitted 31 May, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19
Authors:
Davide Pigoli,
Kieran Baker,
Jobie Budd,
Lorraine Butler,
Harry Coppock,
Sabrina Egglestone,
Steven G. Gilmour,
Chris Holmes,
David Hurley,
Radka Jersakova,
Ivan Kiskin,
Vasiliki Koutra,
Jonathon Mellor,
George Nicholson,
Joe Packham,
Selina Patel,
Richard Payne,
Stephen J. Roberts,
Björn W. Schuller,
Ana Tendero-Cañadas,
Tracey Thornley,
Alexander Titcomb
Abstract:
Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status based on vocal audio signals, for example cough recordings. However, existing studies have limitations in terms of data collection and of the assessment of the performances of the proposed predictive models. This paper rigorously assesses state-of-the-art machine learning techniques used to predict COVID-19 infection status based on vocal audio signals, using a dataset collected by the UK Health Security Agency. This dataset includes acoustic recordings and extensive study participant meta-data. We provide guidelines on testing the performance of methods to classify COVID-19 infection status based on acoustic features and we discuss how these can be extended more generally to the development and assessment of predictive methods based on public health datasets.
Submitted 27 February, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers
Authors:
Harry Coppock,
George Nicholson,
Ivan Kiskin,
Vasiliki Koutra,
Kieran Baker,
Jobie Budd,
Richard Payne,
Emma Karoune,
David Hurley,
Alexander Titcomb,
Sabrina Egglestone,
Ana Tendero Cañadas,
Lorraine Butler,
Radka Jersakova,
Jonathon Mellor,
Selina Patel,
Tracey Thornley,
Peter Diggle,
Sylvia Richardson,
Josef Packham,
Björn W. Schuller,
Davide Pigoli,
Steven Gilmour,
Stephen Roberts,
Chris Holmes
Abstract:
Recent work has reported that AI classifiers trained on audio recordings can accurately predict severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection status. Here, we undertake a large-scale study of audio-based deep learning classifiers, as part of the UK government's pandemic response. We collect and analyse a dataset of audio recordings from 67,842 individuals with linked metadata, including reverse transcription polymerase chain reaction (PCR) test outcomes, of whom 23,514 tested positive for SARS-CoV-2. Subjects were recruited via the UK government's National Health Service Test-and-Trace programme and the REal-time Assessment of Community Transmission (REACT) randomised surveillance survey. In an unadjusted analysis of our dataset, AI classifiers predict SARS-CoV-2 infection status with high accuracy (Receiver Operating Characteristic Area Under the Curve (ROC-AUC) 0.846 [0.838, 0.854]), consistent with the findings of previous studies. However, after matching on measured confounders, such as age, gender, and self-reported symptoms, our classifiers' performance is much weaker (ROC-AUC 0.619 [0.594, 0.644]). Upon quantifying the utility of audio-based classifiers in practical settings, we find them to be outperformed by simple predictive scores based on user-reported symptoms.
Submitted 2 March, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
A large-scale and PCR-referenced vocal audio dataset for COVID-19
Authors:
Jobie Budd,
Kieran Baker,
Emma Karoune,
Harry Coppock,
Selina Patel,
Ana Tendero Cañadas,
Alexander Titcomb,
Richard Payne,
David Hurley,
Sabrina Egglestone,
Lorraine Butler,
Jonathon Mellor,
George Nicholson,
Ivan Kiskin,
Vasiliki Koutra,
Radka Jersakova,
Rachel A. McKendry,
Peter Diggle,
Sylvia Richardson,
Björn W. Schuller,
Steven Gilmour,
Davide Pigoli,
Stephen Roberts,
Josef Packham,
Tracey Thornley
, et al. (1 additional authors not shown)
Abstract:
The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up to help beat coronavirus' digital survey alongside demographic, self-reported symptom and respiratory condition data, and linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,794 of 72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms were reported by 45.62% of participants. This dataset has additional potential uses for bioacoustics research, with 11.30% of participants reporting asthma, and 27.20% with linked influenza PCR test results.
Submitted 3 November, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
Learning with Multigraph Convolutional Filters
Authors:
Landon Butler,
Alejandro Parada-Mayorga,
Alejandro Ribeiro
Abstract:
In this paper, we introduce a convolutional architecture to perform learning when information is supported on multigraphs. Exploiting algebraic signal processing (ASP), we propose a convolutional signal processing model on multigraphs (MSP). Then, we introduce multigraph convolutional neural networks (MGNNs) as stacked and layered structures where information is processed according to an MSP model. We also develop a procedure for tractable computation of filter coefficients in the MGNN and a low-cost method to reduce the dimensionality of the information transferred between layers. We conclude by comparing the performance of MGNNs against other learning architectures on an optimal resource allocation task for multi-channel communication systems.
Submitted 28 October, 2022;
originally announced October 2022.
-
Convolutional Learning on Multigraphs
Authors:
Landon Butler,
Alejandro Parada-Mayorga,
Alejandro Ribeiro
Abstract:
Graph convolutional learning has led to many exciting discoveries in diverse areas. However, in some applications, traditional graphs are insufficient to capture the structure and intricacies of the data. In such scenarios, multigraphs arise naturally as discrete structures in which complex dynamics can be embedded. In this paper, we develop convolutional information processing on multigraphs and introduce convolutional multigraph neural networks (MGNNs). To capture the complex dynamics of information diffusion within and across each of the multigraph's classes of edges, we formalize a convolutional signal processing model, defining the notions of signals, filtering, and frequency representations on multigraphs. Leveraging this model, we develop a multigraph learning architecture, including a sampling procedure to reduce computational complexity. The introduced architecture is applied towards optimal wireless resource allocation and a hate speech localization task, offering improved performance over traditional graph neural networks.
Submitted 8 February, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
-
Convolutional Filtering and Neural Networks with Non Commutative Algebras
Authors:
Alejandro Parada-Mayorga,
Landon Butler,
Alejandro Ribeiro
Abstract:
In this paper we introduce and study the algebraic generalization of non commutative convolutional neural networks. We leverage the theory of algebraic signal processing to model convolutional non commutative architectures, and we derive concrete stability bounds that extend those obtained in the literature for commutative convolutional neural networks. We show that non commutative convolutional architectures can be stable to deformations on the space of operators. We develop the spectral representation of non commutative signal models to show that non commutative filters process Fourier components independently of each other. In particular we prove that although the spectral decompositions of signals in non commutative models are associated to eigenspaces of dimension larger than one, there exists a trade-off between stability and selectivity, which is controlled by matrix polynomial functions in spaces of matrices of low dimension. This trade-off shows that when the filters in the algebra are restricted to be stable, there is a loss in discriminability that is compensated for in the network by the pointwise nonlinearities. The results derived in this paper have direct applications and implications in non commutative convolutional architectures such as group neural networks, multigraph neural networks, and quaternion neural networks, for which we provide a set of numerical experiments showing their behavior when perturbations are present.
Submitted 6 July, 2023; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Learning Connectivity for Data Distribution in Robot Teams
Authors:
Ekaterina Tolstaya,
Landon Butler,
Daniel Mox,
James Paulos,
Vijay Kumar,
Alejandro Ribeiro
Abstract:
Many algorithms for control of multi-robot teams operate under the assumption that low-latency, global state information necessary to coordinate agent actions can readily be disseminated among the team. However, in harsh environments with no existing communication infrastructure, robots must form ad-hoc networks, forcing the team to operate in a distributed fashion. To overcome this challenge, we propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNNs). Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot. To do this, agents glean information about the topology of the network from packet transmissions and feed it to a GNN running locally, which instructs the agent when and where to transmit the latest state information. We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions. Our approach performs favorably compared to industry-standard methods for data distribution such as random flooding and round robin. We also show that the trained policies generalize to larger teams of both static and mobile agents.
Submitted 30 July, 2021; v1 submitted 8 March, 2021;
originally announced March 2021.
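The preceding abstract uses the average Age of Information (AoI) as its training signal. A minimal sketch of that quantity is given below, assuming the usual definition of AoI as the time elapsed since the newest held update was generated, and assuming a plain mean over every (receiver, source) pair. The data layout, the function name, and the example timestamps are hypothetical; the abstract does not specify how the paper aggregates or signs the reward.

```python
def average_age_of_information(now, last_update_times):
    """Mean of (now - timestamp) over every (receiver, source) pair.

    last_update_times[i][j] is the generation time of the newest state of
    robot j that robot i currently holds (an assumed data layout).
    """
    ages = [now - ts for row in last_update_times for ts in row]
    return sum(ages) / len(ages)

# Hypothetical 3-robot team observed at time step 10.
# Diagonal entries equal `now` because each robot knows its own state.
timestamps = [
    [10, 7, 4],
    [8, 10, 6],
    [5, 9, 10],
]
avg_aoi = average_age_of_information(10, timestamps)
```

A lower `avg_aoi` means the team's shared picture of the global state is fresher, which is the property a data distribution policy would aim to improve.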
-
A Tale of Two Cities: Software Developers Working from Home During the COVID-19 Pandemic
Authors:
Denae Ford,
Margaret-Anne Storey,
Thomas Zimmermann,
Christian Bird,
Sonia Jaffe,
Chandra Maddila,
Jenna L. Butler,
Brian Houck,
Nachiappan Nagappan
Abstract:
The COVID-19 pandemic has shaken the world to its core and has provoked an overnight exodus of developers that normally worked in an office setting to working from home. The magnitude of this shift and the factors that have accompanied this new unplanned work setting go beyond what the software engineering community has previously understood to be remote work. To find out how developers and their productivity were affected, we distributed two surveys weeks apart (with a combined total of 3,634 responses that answered all required questions) to understand the presence and prevalence of the benefits, challenges, and opportunities to improve this special circumstance of remote work. From our thematic qualitative analysis and statistical quantitative analysis, we find that there is a dichotomy of developer experiences influenced by many different factors (that for some are a benefit, while for others a challenge). For example, being close to family members was a benefit for some, but for others, having family members share their working space and interrupt their focus was a challenge. Our surveys led to powerful narratives from respondents and revealed the scale at which these experiences exist to provide insights as to how the future of (pandemic) remote work can evolve.
Submitted 10 September, 2021; v1 submitted 25 August, 2020;
originally announced August 2020.