
Identifying Feedforward and Feedback Controllable Subspaces of Neural Population Dynamics

Authors: Ankit Kumar, Loren M. Frank, Kristofer E. Bouchard

Abstract: There is overwhelming evidence that cognition, perception, and action rely on feedback control. However, if and how neural population dynamics are amenable to different control strategies is poorly understood, in large part because machine learning methods to directly assess controllability in neural population dynamics are lacking. To address this gap, we developed a novel dimensionality reduction method, Feedback Controllability Components Analysis (FCCA), that identifies subspaces of linear dynamical systems that are most feedback controllable based on a new measure of feedback controllability. We further show that PCA identifies subspaces of linear dynamical systems that maximize a measure of feedforward controllability. As such, FCCA and PCA are data-driven methods to identify subspaces of neural population data (approximated as linear dynamical systems) that are most feedback and feedforward controllable respectively, and are thus natural contrasts for hypothesis testing. We developed new theory that proves that non-normality of underlying dynamics determines the divergence between FCCA and PCA solutions, and confirmed this in numerical simulations. Applying FCCA to diverse neural population recordings, we find that feedback controllable dynamics are geometrically distinct from PCA subspaces and are better predictors of animal behavior. Our methods provide a novel approach towards analyzing neural population dynamics from a control theoretic perspective, and indicate that feedback controllable subspaces are important for behavior.

Submitted 11 August, 2024; originally announced August 2024.

arXiv:2406.00063 [pdf]

Methods for Linking Data to Online Resources and Ontologies with Applications to Neurophysiology

Authors: Matthew Avaylon, Ryan Ly, Andrew Tritt, Benjamin Dichter, Kristofer E. Bouchard, Christopher J. Mungall, Oliver Ruebel

Abstract: Across many domains, large swaths of digital assets are being stored across distributed data repositories, e.g., the DANDI Archive [8]. The distribution and diversity of these repositories impede researchers from formally defining terminology within experiments, integrating information across datasets, and easily querying, reusing, and analyzing data that follow the FAIR principles [15]. As such, it has become increasingly important to have a standardized method to attach contextual metadata to datasets. Neuroscience is an exemplary use case of this issue due to the complex multimodal nature of experiments. Here, we present the HDMF External Resources Data (HERD) standard and related tools, enabling researchers to annotate new and existing datasets by mapping external references to the data without requiring modification of the original dataset. We integrated HERD closely with Neurodata Without Borders (NWB) [2], a widely used data standard for sharing and storing neurophysiology data. By integrating with NWB, our tools provide neuroscientists with the capability to more easily create and manage neurophysiology data in compliance with controlled sets of terms, enhancing rigor and accuracy of data and facilitating data reuse.

Submitted 30 May, 2024; originally announced June 2024.

arXiv:2404.03044 [pdf]

The Artificial Intelligence Ontology: LLM-assisted construction of AI concept hierarchies

Authors: Marcin P. Joachimiak, Mark A. Miller, J. Harry Caufield, Ryan Ly, Nomi L. Harris, Andrew Tritt, Christopher J. Mungall, Kristofer E. Bouchard

Abstract: The Artificial Intelligence Ontology (AIO) is a systematization of artificial intelligence (AI) concepts, methodologies, and their interrelations. Developed via manual curation, with the additional assistance of large language models (LLMs), AIO aims to address the rapidly evolving landscape of AI by providing a comprehensive framework that encompasses both technical and ethical aspects of AI technologies. The primary audience for AIO includes AI researchers, developers, and educators seeking standardized terminology and concepts within the AI domain. The ontology is structured around six top-level branches: Networks, Layers, Functions, LLMs, Preprocessing, and Bias, each designed to support the modular composition of AI methods and facilitate a deeper understanding of deep learning architectures and ethical considerations in AI. AIO's development utilized the Ontology Development Kit (ODK) for its creation and maintenance, with its content being dynamically updated through AI-driven curation support. This approach not only ensures the ontology's relevance amidst the fast-paced advancements in AI but also significantly enhances its utility for researchers, developers, and educators by simplifying the integration of new AI concepts and methodologies. The ontology's utility is demonstrated through the annotation of AI methods data in a catalog of AI research publications and the integration into the BioPortal ontology resource, highlighting its potential for cross-disciplinary research.
The AIO ontology is open source and is available on GitHub (https://github.com/berkeleybop/artificial-intelligence-ontology) and BioPortal (https://bioportal.bioontology.org/ontologies/AIO).

Submitted 3 April, 2024; originally announced April 2024.

arXiv:2403.19724 [pdf]

Towards Reverse-Engineering the Brain: Brain-Derived Neuromorphic Computing Approach with Photonic, Electronic, and Ionic Dynamicity in 3D integrated circuits

Authors: S. J. Ben Yoo, Luis El-Srouji, Suman Datta, Shimeng Yu, Jean Anne Incorvia, Alberto Salleo, Volker Sorger, Juejun Hu, Lionel C Kimerling, Kristofer Bouchard, Joy Geng, Rishidev Chaudhuri, Charan Ranganath, Randall O'Reilly

Abstract: The human brain has immense learning capabilities at extreme energy efficiencies and scale that no artificial system has been able to match. For decades, reverse engineering the brain has been one of the top priorities of science and technology research. Despite numerous efforts, conventional electronics-based methods have failed to match the scalability, energy efficiency, and self-supervised learning capabilities of the human brain. On the other hand, very recent progress in the development of new generations of photonic and electronic memristive materials, device technologies, and 3D electronic-photonic integrated circuits (3D EPIC) promises to realize new brain-derived neuromorphic systems with comparable connectivity, density, energy-efficiency, and scalability. When combined with bio-realistic learning algorithms and architectures, it may be possible to realize an 'artificial brain' prototype with general self-learning capabilities. This paper argues the possibility of reverse-engineering the brain through architecting a prototype of a brain-derived neuromorphic computing system consisting of artificial electronic, ionic, photonic materials, devices, and circuits with dynamicity resembling the bio-plausible molecular, neuro/synaptic, neuro-circuit, and multi-structural hierarchical macro-circuits of the brain based on well-tested computational models.
We further argue the importance of bio-plausible local learning algorithms applicable to the neuromorphic computing system that capture the flexible and adaptive unsupervised and self-supervised learning mechanisms central to human intelligence. Most importantly, we emphasize that the unique capabilities in brain-derived neuromorphic computing prototype systems will enable us to understand links between specific neuronal and network-level properties with system-level functioning and behavior.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 15 pages, 12 figures

arXiv:2310.17780 [pdf, other]

AutoCT: Automated CT registration, segmentation, and quantification

Authors: Zhe Bai, Abdelilah Essiari, Talita Perciano, Kristofer E. Bouchard

Abstract: The processing and analysis of computed tomography (CT) imaging is important for both basic scientific development and clinical applications. In AutoCT, we provide a comprehensive pipeline that integrates an end-to-end automatic preprocessing, registration, segmentation, and quantitative analysis of 3D CT scans. The engineered pipeline enables atlas-based CT segmentation and quantification leveraging diffeomorphic transformations through efficient forward and inverse mappings. The extracted localized features from the deformation field allow for downstream statistical learning that may facilitate medical diagnostics. On a lightweight and portable software platform, AutoCT provides a new toolkit for the CT imaging community to underpin the deployment of artificial intelligence-driven applications.

Submitted 26 October, 2023; originally announced October 2023.

arXiv:2210.09085 [pdf]

Perspectives for self-driving labs in synthetic biology

Authors: Hector Garcia Martin, Tijana Radivojevic, Jeremy Zucker, Kristofer Bouchard, Jess Sustarich, Sean Peisert, Dan Arnold, Nathan Hillson, Gyorgy Babnigg, Jose Manuel Marti, Christopher J. Mungall, Gregg T. Beckham, Lucas Waldburger, James Carothers, ShivShankar Sundaram, Deb Agarwal, Blake A. Simmons, Tyler Backman, Deepanwita Banerjee, Deepti Tanjore, Lavanya Ramakrishnan, Anup Singh

Abstract: Self-driving labs (SDLs) combine fully automated experiments with artificial intelligence (AI) that decides the next set of experiments. Taken to their ultimate expression, SDLs could usher in a new paradigm of scientific research, where the world is probed, interpreted, and explained by machines for human benefit. While there are functioning SDLs in the fields of chemistry and materials science, we contend that synthetic biology provides a unique opportunity since the genome provides a single target for affecting the incredibly wide repertoire of biological cell behavior. However, the level of investment required for the creation of biological SDLs is only warranted if directed towards solving difficult and enabling biological questions. Here, we discuss challenges and opportunities in creating SDLs for synthetic biology.

Submitted 1 November, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

Comments: 17 pages, 3 figures. Submitted for publication in Current Opinion in Biotechnology. Updated figure 3 in this version

arXiv:2210.08973 [pdf, ps, other]

doi 10.1038/s41597-023-02298-6

FAIR for AI: An interdisciplinary and international community building perspective

Authors: E. A. Huerta, Ben Blaiszik, L. Catherine Brinson, Kristofer E. Bouchard, Daniel Diaz, Caterina Doglioni, Javier M. Duarte, Murali Emani, Ian Foster, Geoffrey Fox, Philip Harris, Lukas Heinrich, Shantenu Jha, Daniel S. Katz, Volodymyr Kindratenko, Christine R. Kirkpatrick, Kati Lassila-Perini, Ravi K. Madduri, Mark S. Neubauer, Fotis E. Psomopoulos, Avik Roy, Oliver Rübel, Zhizhen Zhao, Ruike Zhu

Abstract: A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles was proposed in 2016 as a prerequisite for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.

Submitted 1 August, 2023; v1 submitted 30 September, 2022; originally announced October 2022.

Comments: 10 pages, comments welcome!; v2: 12 pages, accepted to Scientific Data

ACM Class: I.2.0; E.0

Journal ref: Scientific Data 10, 487 (2023)

arXiv:2203.02051 [pdf, other]

Compressed Predictive Information Coding

Authors: Rui Meng, Tianyi Luo, Kristofer Bouchard

Abstract: Unsupervised learning plays an important role in many fields, such as artificial intelligence, machine learning, and neuroscience. Compared to static data, methods for extracting low-dimensional structure for dynamic data are lagging. We developed a novel information-theoretic framework, Compressed Predictive Information Coding (CPIC), to extract useful representations from dynamic data. CPIC selectively projects the past (input) into a linear subspace that is predictive about the compressed data projected from the future (output). The key insight of our framework is to learn representations by minimizing the compression complexity and maximizing the predictive information in latent space. We derive variational bounds of the CPIC loss which induces the latent space to capture information that is maximally predictive. Our variational bounds are tractable by leveraging bounds of mutual information. We find that introducing stochasticity in the encoder robustly contributes to better representation. Furthermore, variational approaches perform better in mutual information estimation compared with estimates under a Gaussian assumption. We demonstrate that CPIC is able to recover the latent space of noisy dynamical systems with low signal-to-noise ratios, and extracts features predictive of exogenous variables in neuroscience data.

Submitted 3 March, 2022; originally announced March 2022.

arXiv:2111.13786 [pdf, other]

Learning from learning machines: a new generation of AI technology to meet the needs of science

Authors: Luca Pion-Tonachini, Kristofer Bouchard, Hector Garcia Martin, Sean Peisert, W. Bradley Holtz, Anil Aswani, Dipankar Dwivedi, Haruko Wainwright, Ghanshyam Pilania, Benjamin Nachman, Babetta L. Marrone, Nicola Falco, Prabhat, Daniel Arnold, Alejandro Wolf-Yadlin, Sarah Powers, Sharlee Climer, Quinn Jackson, Ty Carlson, Michael Sohn, Petrus Zwart, Neeraj Kumar, Amy Justice, Claire Tomlin, Daniel Jacobson, et al. (11 additional authors not shown)

Abstract: We outline emerging opportunities and challenges to enhance the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create tension between identifying patterns in data versus discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and data-driven AI learning machines, then we expect that these AI models can transform hypothesis generation, scientific discovery, and the scientific process itself.

Submitted 26 November, 2021; originally announced November 2021.

arXiv:2106.13379 [pdf, other]

Bayesian Inference in High-Dimensional Time-Series with the Orthogonal Stochastic Linear Mixing Model

Authors: Rui Meng, Kristofer Bouchard

Abstract: Many modern time-series datasets contain large numbers of output response variables sampled for prolonged periods of time. For example, in neuroscience, the activities of 100s to 1000s of neurons are recorded during behaviors and in response to sensory stimuli. Multi-output Gaussian process models leverage the nonparametric nature of Gaussian processes to capture structure across multiple outputs. However, this class of models typically assumes that the correlations between the output response variables are invariant in the input space. Stochastic linear mixing models (SLMM) assume the mixture coefficients depend on input, making them more flexible and effective to capture complex output dependence. However, currently, the inference for SLMMs is intractable for large datasets, making them inapplicable to several modern time-series problems. In this paper, we propose a new regression framework, the orthogonal stochastic linear mixing model (OSLMM) that introduces an orthogonal constraint amongst the mixing coefficients. This constraint reduces the computational burden of inference while retaining the capability to handle complex output dependence. We provide Markov chain Monte Carlo inference procedures for both SLMM and OSLMM and demonstrate superior model scalability and reduced prediction error of OSLMM compared with state-of-the-art methods on several real-world applications.
In neurophysiology recordings, we use the inferred latent functions for compact visualization of population responses to auditory stimuli, and demonstrate superior results compared to a competing method (GPFA). Together, these results demonstrate that OSLMM will be useful for the analysis of diverse, large-scale time-series datasets.

Submitted 12 March, 2022; v1 submitted 24 June, 2021; originally announced June 2021.

arXiv:2106.00719 [pdf, other]

Stochastic Collapsed Variational Inference for Structured Gaussian Process Regression Network

Authors: Rui Meng, Herbie Lee, Kristofer Bouchard

Abstract: This paper presents an efficient variational inference framework for deriving a family of structured Gaussian process regression network (SGPRN) models. The key idea is to incorporate auxiliary inducing variables in latent functions and jointly treat both the distributions of the inducing variables and hyper-parameters as variational parameters. Then we propose structured variable distributions and marginalize latent variables, which enables the decomposability of a tractable variational lower bound and leads to stochastic optimization. Our inference approach is able to model data in which outputs do not share a common input set with a computational complexity independent of the size of the inputs and outputs and thus easily handle datasets with missing values. We illustrate the performance of our method on synthetic data and real datasets and show that our model generally provides better imputation results on missing data than the state-of-the-art. We also provide a visualization approach for time-varying correlation across outputs in electrocorticography data and those estimates provide insight to understand the neural population dynamics.

Submitted 17 November, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

arXiv:2103.12802 [pdf, other]

Numerical Characterization of Support Recovery in Sparse Regression with Correlated Design

Authors: Ankit Kumar, Sharmodeep Bhattacharyya, Kristofer Bouchard

Abstract: Sparse regression is frequently employed in diverse scientific settings as a feature selection method. A pervasive aspect of scientific data that hampers both feature selection and estimation is the presence of strong correlations between predictive features. These fundamental issues are often not appreciated by practitioners, and jeopardize conclusions drawn from estimated models. On the other hand, theoretical results on sparsity-inducing regularized regression such as the Lasso have largely addressed conditions for selection consistency via asymptotics, and disregard the problem of model selection, whereby regularization parameters are chosen. In this numerical study, we address these issues through exhaustive characterization of the performance of several regression estimators, coupled with a range of model selection strategies. These estimators and selection criteria were examined across correlated regression problems with varying degrees of signal to noise, distribution of the non-zero model coefficients, and model sparsity. Our results reveal a fundamental tradeoff between false positive and false negative control in all regression estimators and model selection criteria examined. Additionally, we are able to numerically explore a transition point modulated by the signal-to-noise ratio and spectral properties of the design covariance matrix at which the selection accuracy of all considered algorithms degrades.
Overall, we find that SCAD coupled with BIC or empirical Bayes model selection performs the best feature selection across the regression problems considered.

Submitted 23 March, 2021; originally announced March 2021.

arXiv:2003.10397 [pdf, other]

Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses

Authors: Charles G. Frye, James Simon, Neha S. Wadia, Andrew Ligeralde, Michael R. DeWeese, Kristofer E. Bouchard

Abstract: Despite the fact that the loss functions of deep neural networks are highly non-convex, gradient-based optimization algorithms converge to approximately the same performance from many random initial points. One thread of work has focused on explaining this phenomenon by characterizing the local curvature near critical points of the loss function, where the gradients are near zero, and demonstrating that neural network losses enjoy a no-bad-local-minima property and an abundance of saddle points. We report here that the methods used to find these putative critical points suffer from a bad local minima problem of their own: they often converge to or pass through regions where the gradient norm has a stationary point. We call these gradient-flat regions, since they arise when the gradient is approximately in the kernel of the Hessian, such that the loss is locally approximately linear, or flat, in the direction of the gradient. We describe how the presence of these regions necessitates care in both interpreting past results that claimed to find critical points of neural network losses and in designing second-order methods for optimizing neural networks.
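
The link the abstract draws between stationary points of the gradient norm and the kernel of the Hessian can be written out in a few lines. This is only a sketch of the standard calculus behind that sentence, not the paper's own derivation; the symbols $L$, $\theta$, $g$, $H$, and $\varepsilon$ are notation assumed here for illustration.

```latex
% Assumed notation: L(\theta) is the loss, g its gradient, H its Hessian.
\[
g(\theta) = \nabla_\theta L(\theta), \qquad
H(\theta) = \nabla_\theta^2 L(\theta), \qquad
\nabla_\theta \Big( \tfrac{1}{2} \lVert g(\theta) \rVert^2 \Big) = H(\theta)\, g(\theta).
\]
% The squared gradient norm is stationary exactly when H g = 0:
\[
H(\theta)\, g(\theta) = 0
\iff
\underbrace{g(\theta) = 0}_{\text{critical point}}
\quad \text{or} \quad
\underbrace{g(\theta) \neq 0,\ g(\theta) \in \ker H(\theta)}_{\text{gradient-flat point}}.
\]
% Along the gradient direction the quadratic term then vanishes:
\[
L(\theta + \varepsilon g)
= L(\theta) + \varepsilon \lVert g \rVert^2
+ \tfrac{\varepsilon^2}{2}\, g^\top H g + O(\varepsilon^3)
= L(\theta) + \varepsilon \lVert g \rVert^2 + O(\varepsilon^3)
\quad \text{when } H g = 0.
\]
```

Under these assumptions, a method that searches for critical points by minimizing the gradient norm can stop at a point of the second kind, where the loss is locally linear in the gradient direction even though the gradient itself is not zero.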

Submitted 23 March, 2020; originally announced March 2020.

Comments: 18 pages, 5 figures

arXiv:1908.11464 [pdf, other]

Sparse and Low-bias Estimation of High Dimensional Vector Autoregressive Models

Authors: Trevor D. Ruiz, Sharmodeep Bhattacharyya, Mahesh Balasubramanian, Kristofer E. Bouchard

Abstract: Vector autoregressive (VAR) models are widely used for causal discovery and forecasting in multivariate time series analysis. In the high-dimensional setting, which is increasingly common in fields such as neuroscience and econometrics, model parameters are inferred by L1-regularized maximum likelihood (RML). A well-known feature of RML inference is that in general the technique produces a trade-off between sparsity and bias that depends on the choice of the regularization hyperparameter. In the context of multivariate time series analysis, sparse estimates are favorable for causal discovery and low-bias estimates are favorable for forecasting. However, owing to a paucity of research on hyperparameter selection methods, practitioners must rely on ad-hoc methods such as cross-validation (or manual tuning). The particular balance that such approaches achieve between the two goals -- causal discovery and forecasting -- is poorly understood. Our paper investigates this behavior and proposes a method (UoI-VAR) that achieves a better balance between sparsity and bias when the underlying causal influences are in fact sparse. We demonstrate through simulation that RML with a hyperparameter selected by cross-validation tends to overfit, producing relatively dense estimates. We further demonstrate that UoI-VAR much more effectively approximates the correct sparsity pattern with only a minor compromise in model fit, particularly so for larger data dimensions, and that the estimates produced by UoI-VAR exhibit less bias.
We conclude that our method achieves improved performance especially well-suited to applications involving simultaneous causal discovery and forecasting in high-dimensional settings.

Submitted 12 March, 2025; v1 submitted 29 August, 2019; originally announced August 2019.

MSC Class: 62M10; 91B84; 62F40; 62F30

Journal ref: Proceedings of the 2nd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 120 (2020) pp. 55-64

arXiv:1905.13308 [pdf, other]

Hangul Fonts Dataset: a Hierarchical and Compositional Dataset for Investigating Learned Representations

Authors: Jesse A. Livezey, Ahyeon Hwang, Jacob Yeung, Kristofer E. Bouchard

Abstract: Hierarchy and compositionality are common latent properties in many natural and scientific datasets. Determining when a deep network's hidden activations represent hierarchy and compositionality is important both for understanding deep representation learning and for applying deep networks in domains where interpretability is crucial. However, current benchmark machine learning datasets either have little hierarchical or compositional structure, or the structure is not known. This gap impedes precise analysis of a network's representations and thus hinders development of new methods that can learn such properties. To address this gap, we developed a new benchmark dataset with known hierarchical and compositional structure. The Hangul Fonts Dataset (HFD) is comprised of 35 fonts from the Korean writing system (Hangul), each with 11,172 blocks (syllables) composed from the product of initial consonant, medial vowel, and final consonant glyphs. All blocks can be grouped into a few geometric types which induces a hierarchy across blocks. In addition, each block is composed of individual glyphs with rotations, translations, scalings, and naturalistic style variation across fonts. We find that both shallow and deep unsupervised methods only show modest evidence of hierarchy and compositionality in their representations of the HFD compared to supervised deep networks. Supervised deep network representations contain structure related to the geometrical hierarchy of the characters, but the compositional structure of the data is not evident.
Thus, HFD enables the identification of shortcomings in existing methods, a critical first step toward developing new machine learning algorithms to extract hierarchical and compositional structure in the context of naturalistic variability.

Submitted 9 June, 2021; v1 submitted 23 May, 2019; originally announced May 2019.

arXiv:1905.09944 [pdf, other]

Unsupervised Discovery of Temporal Structure in Noisy Data with Dynamical Components Analysis

Authors: David G. Clark, Jesse A. Livezey, Kristofer E. Bouchard

Abstract: Linear dimensionality reduction methods are commonly used to extract low-dimensional structure from high-dimensional data. However, popular methods disregard temporal structure, rendering them prone to extracting noise rather than meaningful dynamics when applied to time series data. At the same time, many successful unsupervised learning methods for temporal, sequential and spatial data extract features which are predictive of their surrounding context. Combining these approaches, we introduce Dynamical Components Analysis (DCA), a linear dimensionality reduction method which discovers a subspace of high-dimensional time series data with maximal predictive information, defined as the mutual information between the past and future. We test DCA on synthetic examples and demonstrate its superior ability to extract dynamical structure compared to commonly used linear methods. We also apply DCA to several real-world datasets, showing that the dimensions extracted by DCA are more useful than those extracted by other methods for predicting future states and decoding auxiliary variables. Overall, DCA robustly extracts dynamical structure in noisy, high-dimensional data while retaining the computational efficiency and geometric interpretability of linear dimensionality reduction methods.

Submitted 27 October, 2019; v1 submitted 23 May, 2019; originally announced May 2019.

Comments: 22 pages, 10 figures; updated appendix with additional analyses

Journal ref: NeurIPS 14267-14278 (2019)

arXiv:1901.10603 [pdf, ps, other]

Numerically Recovering the Critical Points of a Deep Linear Autoencoder

Authors: Charles G. Frye, Neha S. Wadia, Michael R. DeWeese, Kristofer E. Bouchard

Abstract: Numerically locating the critical points of non-convex surfaces is a long-standing problem central to many fields. Recently, the loss surfaces of deep neural networks have been explored to gain insight into outstanding questions in optimization, generalization, and network architecture design. However, the degree to which recently-proposed methods for numerically recovering critical points actually do so has not been thoroughly evaluated. In this paper, we examine this issue in a case for which the ground truth is known: the deep linear autoencoder. We investigate two sub-problems associated with numerical critical point identification: first, because of large parameter counts, it is infeasible to find all of the critical points for contemporary neural networks, necessitating sampling approaches whose characteristics are poorly understood; second, the numerical tolerance for accurately identifying a critical point is unknown, and conservative tolerances are difficult to satisfy. We first identify connections between recently-proposed methods and well-understood methods in other fields, including chemical physics, economics, and algebraic geometry. We find that several methods work well at recovering certain information about loss surfaces, but fail to take an unbiased sample of critical points. Furthermore, numerical tolerance must be very strict to ensure that numerically-identified critical points have similar properties to true analytical critical points. We also identify a recently-published Newton method for optimization that outperforms previous methods as a critical point-finding algorithm. We expect our results will guide future attempts to numerically study critical points in large nonlinear neural networks.

Submitted 29 January, 2019; originally announced January 2019.

arXiv:1808.06992 [pdf, other]

Optimizing the Union of Intersections LASSO ($UoI_{LASSO}$) and Vector Autoregressive ($UoI_{VAR}$) Algorithms for Improved Statistical Estimation at Scale

Authors: Mahesh Balasubramanian, Trevor Ruiz, Brandon Cook, Sharmodeep Bhattacharyya, Prabhat, Aviral Shrivastava, Kristofer Bouchard

Abstract: The analysis of scientific data of increasing size and complexity requires statistical machine learning methods that are both interpretable and predictive. Union of Intersections (UoI), a recently developed framework, is a two-step approach that separates model selection and model estimation. A linear regression algorithm based on UoI, $UoI_{LASSO}$, simultaneously achieves low false-positive and low false-negative feature selection as well as low bias and low variance estimates. Together, these qualities make the results both predictive and interpretable. In this paper, we optimize the $UoI_{LASSO}$ algorithm for single-node execution on NERSC's Cori Knights Landing, a Xeon Phi based supercomputer. We then scale $UoI_{LASSO}$ to execute on 68 to 278,528 cores on a range of dataset sizes, demonstrating the weak and strong scaling of the implementation. We also implement a variant of $UoI_{LASSO}$, $UoI_{VAR}$, for vector autoregressive models, to analyze high dimensional time-series data. We perform single node optimization and multi-node scaling experiments for $UoI_{VAR}$ to demonstrate the effectiveness of the algorithm for weak and strong scaling. Our implementations enable us to estimate the largest VAR model (1000 nodes) we are aware of, and apply it to large neurophysiology data (192 nodes).

Submitted 21 August, 2018; originally announced August 2018.

Comments: 10 pages, 10 figures

arXiv:1806.00534 [pdf, other]

Provably convergent acceleration in factored gradient descent with applications in matrix sensing

Authors: Tayo Ajayi, David Mildebrath, Anastasios Kyrillidis, Shashanka Ubaru, Georgios Kollias, Kristofer Bouchard

Abstract: We present theoretical results on the convergence of \emph{non-convex} accelerated gradient descent in matrix factorization models with $\ell_2$-norm loss. The purpose of this work is to study the effects of acceleration in non-convex settings, where provable convergence with acceleration should not be considered a \emph{de facto} property. The technique is applied to matrix sensing problems, for the estimation of a rank $r$ optimal solution $X^\star \in \mathbb{R}^{n \times n}$. Our contributions can be summarized as follows. $i)$ We show that acceleration in factored gradient descent converges at a linear rate; this fact is novel for non-convex matrix factorization settings, under common assumptions. $ii)$ Our proof technique requires the acceleration parameter to be carefully selected, based on the properties of the problem, such as the condition number of $X^\star$ and the condition number of the objective function. $iii)$ Currently, our proof leads to the same dependence on the condition number(s) in the contraction parameter, similar to recent results on non-accelerated algorithms. $iv)$ Acceleration is observed in practice, both in synthetic examples and in two real applications: neuronal multi-unit activities recovery from single electrode recordings, and quantum state tomography on quantum computing simulators.

Submitted 21 September, 2019; v1 submitted 1 June, 2018; originally announced June 2018.

Comments: 23 pages

arXiv:1805.08889 [pdf, other]

Spiking Linear Dynamical Systems on Neuromorphic Hardware for Low-Power Brain-Machine Interfaces

Authors: David G. Clark, Jesse A. Livezey, Edward F. Chang, Kristofer E. Bouchard

Abstract: Neuromorphic architectures achieve low-power operation by using many simple spiking neurons in lieu of traditional hardware. Here, we develop methods for precise linear computations in spiking neural networks and use these methods to map the evolution of a linear dynamical system (LDS) onto an existing neuromorphic chip: IBM's TrueNorth. We analytically characterize, and numerically validate, the discrepancy between the spiking LDS state sequence and that of its non-spiking counterpart. These analytical results shed light on the multiway tradeoff between time, space, energy, and accuracy in neuromorphic computation. To demonstrate the utility of our work, we implemented a neuromorphic Kalman filter (KF) and used it for offline decoding of human vocal pitch from neural data. The neuromorphic KF could be used for low-power filtering in domains beyond neuroscience, such as navigation or robotics.

Submitted 5 June, 2018; v1 submitted 22 May, 2018; originally announced May 2018.

Comments: 23 pages, 8 figures; added reference, removed typo in Fig. 2

arXiv:1803.09807 [pdf, other]

doi 10.1371/journal.pcbi.1007091

Deep learning as a tool for neural data analysis: speech classification and cross-frequency coupling in human sensorimotor cortex

Authors: Jesse A. Livezey, Kristofer E. Bouchard, Edward F. Chang

Abstract: A fundamental challenge in neuroscience is to understand what structure in the world is represented in spatially distributed patterns of neural activity from multiple single-trial measurements. This is often accomplished by learning simple, linear transformations between neural features and features of the sensory stimuli or motor task. While successful in some early sensory processing areas, linear mappings are unlikely to be ideal tools for elucidating nonlinear, hierarchical representations of higher-order brain areas during complex tasks, such as the production of speech by humans. Here, we apply deep networks to predict produced speech syllables from cortical surface electric potentials recorded from human sensorimotor cortex. We found that deep networks had higher decoding prediction accuracy compared to baseline models, and also exhibited greater improvements in accuracy with increasing dataset size. We further demonstrate that deep networks' confusions revealed hierarchical latent structure in the neural data, which recapitulated the underlying articulatory nature of speech motor control. Finally, we used deep networks to compare task-relevant information in different neural frequency bands, and found that the high-gamma band contains the vast majority of information relevant for the speech prediction task, with little-to-no additional contribution from lower frequencies. Together, these results demonstrate the utility of deep networks as a data analysis tool for neuroscience.

Submitted 26 March, 2018; originally announced March 2018.

Comments: 23 pages, 9 figures

arXiv:1705.07585 [pdf, other]

Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction

Authors: Kristofer E. Bouchard, Alejandro F. Bujan, Farbod Roosta-Khorasani, Shashanka Ubaru, Prabhat, Antoine M. Snijders, Jian-Hua Mao, Edward F. Chang, Michael W. Mahoney, Sharmodeep Bhattacharyya

Abstract: The increasing size and complexity of scientific data could dramatically enhance discovery and prediction for basic scientific applications. Realizing this potential, however, requires novel statistical analysis methods that are both interpretable and predictive. We introduce Union of Intersections (UoI), a flexible, modular, and scalable framework for enhanced model selection and estimation. Methods based on UoI perform model selection and model estimation through intersection and union operations, respectively. We show that UoI-based methods achieve low-variance and nearly unbiased estimation of a small number of interpretable features, while maintaining high-quality prediction accuracy. We perform extensive numerical investigation to evaluate a UoI algorithm ($UoI_{Lasso}$) on synthetic and real data. In doing so, we demonstrate the extraction of interpretable functional networks from human electrophysiology recordings as well as accurate prediction of phenotypes from genotype-phenotype data with reduced features. We also show (with the $UoI_{L1Logistic}$ and $UoI_{CUR}$ variants of the basic framework) improved prediction parsimony for classification and matrix factorization on several benchmark biomedical data sets. These results suggest that methods based on the UoI framework could improve interpretation and prediction in data-driven discovery across scientific fields.

Submitted 2 November, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

Comments: 42 pages; a conference version is in NIPS 2017

arXiv:1505.03511 [pdf]

Bootstrapped Adaptive Threshold Selection for Statistical Model Selection and Estimation

Authors: Kristofer E. Bouchard

Abstract: A central goal of neuroscience is to understand how activity in the nervous system is related to features of the external world, or to features of the nervous system itself. A common approach is to model neural responses as a weighted combination of external features, or vice versa. The structure of the model weights can provide insight into neural representations. Often, neural input-output relationships are sparse, with only a few inputs contributing to the output. In part to account for such sparsity, structured regularizers are incorporated into model fitting optimization. However, by imposing priors, structured regularizers can make it difficult to interpret learned model parameters. Here, we investigate a simple, minimally structured model estimation method for accurate, unbiased estimation of sparse models based on Bootstrapped Adaptive Threshold Selection followed by ordinary least-squares refitting (BoATS). Through extensive numerical investigations, we show that this method often performs favorably compared to L1 and L2 regularizers. In particular, for a variety of model distributions and noise levels, BoATS more accurately recovers the parameters of sparse models, leading to more parsimonious explanations of outputs. Finally, we apply this method to the task of decoding human speech production from ECoG recordings.

Submitted 13 May, 2015; originally announced May 2015.

Comments: 10 Pages, 6 Figures

arXiv:1505.00041 [pdf, ps, other]

Modeling neural activity at the ensemble level

Authors: Joaquin Rapela, Mark Kostuk, Peter F. Rowat, Tim Mullen, Edward F. Chang, Kristofer Bouchard

Abstract: Here we demonstrate that the activity of neural ensembles can be quantitatively modeled. We first show that an ensemble dynamical model (EDM) accurately approximates the distribution of voltages and average firing rate per neuron of a population of simulated integrate-and-fire neurons. EDMs are high-dimensional nonlinear dynamical models. To facilitate the estimation of their parameters we present a dimensionality reduction method and study its performance with simulated data. We then introduce and evaluate a maximum-likelihood method to estimate connectivity parameters in networks of EDMs. Finally, we show that this model and these methods accurately approximate the high-gamma power evoked by pure tones in the auditory cortex of rodents. Overall, this article demonstrates that quantitatively modeling brain activity at the ensemble level is indeed possible, and opens the way to understanding the computations performed by neural ensembles, which could revolutionize our understanding of brain function.

Submitted 3 September, 2015; v1 submitted 30 April, 2015; originally announced May 2015.

arXiv:1501.00527 [pdf]

doi 10.1371/journal.pcbi.1004471

An adapting auditory-motor feedback loop can contribute to generating vocal repetition

Authors: Jason Wittenbach, Kristofer E. Bouchard, Michael S. Brainard, Dezhe Z. Jin

Abstract: Consecutive repetition of actions is common in behavioral sequences. Although integration of sensory feedback with internal motor programs is important for sequence generation, if and how feedback contributes to repetitive actions is poorly understood. Here we study how auditory feedback contributes to generating repetitive syllable sequences in songbirds. We propose that auditory signals provide positive feedback to ongoing motor commands, but this influence decays as feedback weakens from response adaptation during syllable repetitions. Computational models show that this mechanism explains repeat distributions observed in Bengalese finch song. We experimentally confirmed two predictions of this mechanism in Bengalese finches: removal of auditory feedback by deafening reduces syllable repetitions; and neural responses to auditory playback of repeated syllable sequences gradually adapt in sensory-motor nucleus HVC. Together, our results implicate a positive auditory-feedback loop with adaptation in generating repetitive vocalizations, and suggest sensory adaptation is important for feedback control of motor sequences.

Submitted 2 January, 2015; originally announced January 2015.

Showing 1–25 of 25 results for author: Bouchard, K