-
The unbearable lightness of Restricted Boltzmann Machines: Theoretical Insights and Biological Applications
Authors: Giovanni di Sarra, Barbara Bravi, Yasser Roudi
Abstract:
Restricted Boltzmann Machines (RBMs) are simple yet powerful neural networks. They can be used for learning structure in data and serve as a building block of more complex neural architectures. At the same time, their simplicity makes them easy to use and amenable to theoretical analysis, and yields interpretable models in applications. Here, we review the role that activation functions, which describe the input-output relationship of single neurons in an RBM, play in the functionality of these models. We discuss recent theoretical results on the benefits and limitations of different activation functions. We also review applications to biological data analysis, ranging from neural data analysis, where RBM units are mostly taken to be binary with sigmoid activation functions, to protein data analysis and immunology, where non-binary units and non-sigmoid activation functions have recently been shown to yield important insights into the data. Finally, we discuss open problems whose resolution could shed light on broader issues in neural network research.
Submitted 8 January, 2025; originally announced January 2025.
-
Critical scaling in hidden state inference for linear Langevin dynamics
Authors: Barbara Bravi, Peter Sollich
Abstract:
We consider the problem of inferring the dynamics of unknown (i.e. hidden) nodes from a set of observed trajectories and study analytically the average prediction error and the typical relaxation time of correlations between errors. We focus on a stochastic linear dynamics of continuous degrees of freedom interacting via random Gaussian couplings in the infinite network size limit. The expected error on the hidden time courses can be found as the equal-time hidden-to-hidden covariance of the probability distribution conditioned on observations. In the stationary regime, we analyze the phase diagram in the space of relevant parameters, namely the ratio between the numbers of observed and hidden nodes, the degree of symmetry of the interactions and the amplitudes of the hidden-to-hidden and hidden-to-observed couplings relative to the decay constant of the internal hidden dynamics. In particular, we identify critical regions in parameter space where the relaxation time and the inference error diverge, and determine the corresponding scaling behaviour.
Submitted 25 April, 2017; v1 submitted 2 December, 2016; originally announced December 2016.
-
Statistical physics approaches to subnetwork dynamics in biochemical systems
Authors: Barbara Bravi, Peter Sollich
Abstract:
We apply a Gaussian variational approximation to model reduction in large biochemical networks of unary and binary reactions. We focus on a small subset of variables (subnetwork) of interest, e.g. because they are accessible experimentally, embedded in a larger network (bulk). The key goal is to write dynamical equations reduced to the subnetwork but still retaining the effects of the bulk. As a result, the subnetwork-reduced dynamics contains a memory term and an extrinsic noise term with non-trivial temporal correlations. We first derive expressions for this memory and noise in the linearized (Gaussian) dynamics and then use a perturbative power expansion to obtain first order nonlinear corrections. For the case of vanishing intrinsic noise, our description is explicitly shown to be equivalent to projection methods up to quadratic terms, but it is applicable also in the presence of stochastic fluctuations in the original dynamics. An example from the Epidermal Growth Factor Receptor (EGFR) signalling pathway is provided to probe the increased prediction accuracy and computational efficiency of our method.
Submitted 5 June, 2017; v1 submitted 28 November, 2016; originally announced November 2016.
-
Inferring hidden states in Langevin dynamics on large networks: Average case performance
Authors: Barbara Bravi, Manfred Opper, Peter Sollich
Abstract:
We present average performance results for dynamical inference problems in large networks, where a set of nodes is hidden while the time trajectories of the others are observed. Examples of this scenario can occur in signal transduction and gene regulation networks. We focus on the linear stochastic dynamics of continuous variables interacting via random Gaussian couplings of generic symmetry. We analyze the inference error, given by the variance of the posterior distribution over hidden paths, in the thermodynamic limit and as a function of the system parameters and the ratio α between the number of hidden and observed nodes. By applying Kalman filter recursions we find that the posterior dynamics is governed by an "effective" drift that incorporates the effect of the observations. We present two approaches for characterizing the posterior variance that allow us to tackle, respectively, equilibrium and nonequilibrium dynamics. The first appeals to Random Matrix Theory and reveals average spectral properties of the inference error and typical posterior relaxation times; the second is based on dynamical functionals and yields the inference error as the solution of an algebraic equation.
Submitted 8 January, 2017; v1 submitted 6 July, 2016; originally announced July 2016.
-
Inference for dynamics of continuous variables: the Extended Plefka Expansion with hidden nodes
Authors: Barbara Bravi, Peter Sollich
Abstract:
We consider the problem of a subnetwork of observed nodes embedded into a larger bulk of unknown (i.e. hidden) nodes, where the aim is to infer these hidden states given information about the subnetwork dynamics. The biochemical networks underlying many cellular and metabolic processes are important realizations of such a scenario as typically one is interested in reconstructing the time evolution of unobserved chemical concentrations starting from the experimentally more accessible ones. We present an application to this problem of a novel dynamical mean field approximation, the Extended Plefka Expansion, which is based on a path integral description of the stochastic dynamics. As a paradigmatic model we study the stochastic linear dynamics of continuous degrees of freedom interacting via random Gaussian couplings. The resulting joint distribution is known to be Gaussian and this allows us to fully characterize the posterior statistics of the hidden nodes. In particular the equal-time hidden-to-hidden variance -- conditioned on observations -- gives the expected error at each node when the hidden time courses are predicted based on the observations. We assess the accuracy of the Extended Plefka Expansion in predicting these single node variances as well as error correlations over time, focussing on the role of the system size and the number of observed nodes.
Submitted 24 March, 2017; v1 submitted 17 March, 2016; originally announced March 2016.
-
Extended Plefka Expansion for Stochastic Dynamics
Authors: Barbara Bravi, Peter Sollich, Manfred Opper
Abstract:
We propose an extension of the Plefka expansion, which is well known for the dynamics of discrete spins, to stochastic differential equations with continuous degrees of freedom and exhibiting generic nonlinearities. The scenario is sufficiently general to allow application to e.g. biochemical networks involved in metabolism and regulation. The main feature of our approach is to constrain in the Plefka expansion not just first moments akin to magnetizations, but also second moments, specifically two-time correlations and responses for each degree of freedom. The end result is an effective equation of motion for each single degree of freedom, where couplings to other variables appear as a self-coupling to the past (i.e. memory term) and a coloured noise. This constitutes a new mean field approximation that should become exact in the thermodynamic limit of a large network, for suitably long-ranged couplings. For the analytically tractable case of linear dynamics we establish this exactness explicitly by appeal to spectral methods of Random Matrix Theory, for Gaussian couplings with arbitrary degree of symmetry.
Submitted 17 March, 2016; v1 submitted 23 September, 2015; originally announced September 2015.