Search | arXiv e-print repository

arXiv:2407.20292 [pdf]

From pixels to planning: scale-free active inference

Authors: Karl Friston, Conor Heins, Tim Verbelen, Lancelot Da Costa, Tommaso Salvatori, Dimitrije Markovic, Alexander Tschantz, Magnus Koudahl, Christopher Buckley, Thomas Parr

Abstract: This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalisi… ▽ More This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalising generative models (RGM) can be regarded as discrete homologues of deep convolutional neural networks or continuous state-space models in generalised coordinates of motion. By construction, these scale-invariant models can be used to learn compositionality over space and time, furnishing models of paths or orbits; i.e., events of increasing temporal depth and itinerancy. This technical note illustrates the automatic discovery, learning and deployment of RGMs using a series of applications. We start with image classification and then consider the compression and generation of movies and music. Finally, we apply the same variational principles to the learning of Atari-like games. △ Less

Submitted 27 July, 2024; originally announced July 2024.

Comments: 64 pages, 28 figures

MSC Class: 92 ACM Class: F.1.1

arXiv:2312.07547 [pdf, other]

Active Inference and Intentional Behaviour

Authors: Karl J. Friston, Tommaso Salvatori, Takuya Isomura, Alexander Tschantz, Alex Kiefer, Tim Verbelen, Magnus Koudahl, Aswin Paul, Thomas Parr, Adeel Razi, Brett Kagan, Christopher L. Buckley, Maxwell J. D. Ramstead

Abstract: Recent advances in theoretical biology suggest that basal cognition and sentient behaviour are emergent properties of in vitro cell cultures and neuronal networks, respectively. Such neuronal networks spontaneously learn structured behaviours in the absence of reward or reinforcement. In this paper, we characterise this kind of self-organisation through the lens of the free energy principle, i.e.,… ▽ More Recent advances in theoretical biology suggest that basal cognition and sentient behaviour are emergent properties of in vitro cell cultures and neuronal networks, respectively. Such neuronal networks spontaneously learn structured behaviours in the absence of reward or reinforcement. In this paper, we characterise this kind of self-organisation through the lens of the free energy principle, i.e., as self-evidencing. We do this by first discussing the definitions of reactive and sentient behaviour in the setting of active inference, which describes the behaviour of agents that model the consequences of their actions. We then introduce a formal account of intentional behaviour, that describes agents as driven by a preferred endpoint or goal in latent state-spaces. We then investigate these forms of (reactive, sentient, and intentional) behaviour using simulations. First, we simulate the aforementioned in vitro experiments, in which neuronal cultures spontaneously learn to play Pong, by implementing nested, free energy minimising processes. The simulations are then used to deconstruct the ensuing predictive behaviour, leading to the distinction between merely reactive, sentient, and intentional behaviour, with the latter formalised in terms of inductive planning. This distinction is further studied using simple machine learning benchmarks (navigation in a grid world and the Tower of Hanoi problem), that show how quickly and efficiently adaptive behaviour emerges under an inductive form of active inference. △ Less

Submitted 16 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

Comments: 33 pages, 9 figures

arXiv:2209.02567 [pdf, other]

Capsule Networks as Generative Models

Authors: Alex B. Kiefer, Beren Millidge, Alexander Tschantz, Christopher L. Buckley

Abstract: Capsule networks are a neural network architecture specialized for visual scene recognition. Features and pose information are extracted from a scene and then dynamically routed through a hierarchy of vector-valued nodes called 'capsules' to create an implicit scene graph, with the ultimate aim of learning vision directly as inverse graphics. Despite these intuitions, however, capsule networks are… ▽ More Capsule networks are a neural network architecture specialized for visual scene recognition. Features and pose information are extracted from a scene and then dynamically routed through a hierarchy of vector-valued nodes called 'capsules' to create an implicit scene graph, with the ultimate aim of learning vision directly as inverse graphics. Despite these intuitions, however, capsule networks are not formulated as explicit probabilistic generative models; moreover, the routing algorithms typically used are ad-hoc and primarily motivated by algorithmic intuition. In this paper, we derive an alternative capsule routing algorithm utilizing iterative inference under sparsity constraints. We then introduce an explicit probabilistic generative model for capsule networks based on the self-attention operation in transformer networks and show how it is related to a variant of predictive coding networks using Von-Mises-Fisher (VMF) circular Gaussian distributions. △ Less

Submitted 6 October, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

Comments: Accepted at the 3rd International Workshop on Active Inference, 19th Sept 2022, Grenoble; This version: added reference, corrected typographical error; final submitted version

arXiv:2204.02169 [pdf, other]

Hybrid Predictive Coding: Inferring, Fast and Slow

Authors: Alexander Tschantz, Beren Millidge, Anil K Seth, Christopher L Buckley

Abstract: Predictive coding is an influential model of cortical neural activity. It proposes that perceptual beliefs are furnished by sequentially minimising "prediction errors" - the differences between predicted and observed data. Implicit in this proposal is the idea that perception requires multiple cycles of neural activity. This is at odds with evidence that several aspects of visual perception - incl… ▽ More Predictive coding is an influential model of cortical neural activity. It proposes that perceptual beliefs are furnished by sequentially minimising "prediction errors" - the differences between predicted and observed data. Implicit in this proposal is the idea that perception requires multiple cycles of neural activity. This is at odds with evidence that several aspects of visual perception - including complex forms of object recognition - arise from an initial "feedforward sweep" that occurs on fast timescales which preclude substantial recurrent activity. Here, we propose that the feedforward sweep can be understood as performing amortized inference and recurrent processing can be understood as performing iterative inference. We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner by describing both in terms of a dual optimization of a single objective function. We show that the resulting scheme can be implemented in a biologically plausible neural architecture that approximates Bayesian inference utilising local Hebbian update rules. We demonstrate that our hybrid predictive coding model combines the benefits of both amortized and iterative inference -- obtaining rapid and computationally cheap perceptual inference for familiar data while maintaining the context-sensitivity, precision, and sample efficiency of iterative inference schemes. Moreover, we show how our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense. Hybrid predictive coding offers a new perspective on the functional relevance of the feedforward and recurrent activity observed during visual perception and offers novel insights into distinct aspects of visual phenomenology. △ Less

Submitted 6 April, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: 05/04/22 initial upload. 06/04/22 added acknowledgements section

arXiv:2201.06900 [pdf]

A continuity of Markov blanket interpretations under the Free Energy Principle

Authors: Anil Seth, Tomasz Korbak, Alexander Tschantz

Abstract: Bruineberg and colleagues helpfully distinguish between instrumental and ontological interpretations of Markov blankets, exposing the dangers of using the former to make claims about the latter. However, proposing a sharp distinction neglects the value of recognising a continuum spanning from instrumental to ontological. This value extends to the related distinction between being and having a mode… ▽ More Bruineberg and colleagues helpfully distinguish between instrumental and ontological interpretations of Markov blankets, exposing the dangers of using the former to make claims about the latter. However, proposing a sharp distinction neglects the value of recognising a continuum spanning from instrumental to ontological. This value extends to the related distinction between being and having a model. △ Less

Submitted 18 January, 2022; originally announced January 2022.

Comments: 4 pages, 0 figures, invited commentary

arXiv:2201.03904 [pdf, other]

doi 10.21105/joss.04098

pymdp: A Python library for active inference in discrete state spaces

Authors: Conor Heins, Beren Millidge, Daphne Demekas, Brennan Klein, Karl Friston, Iain Couzin, Alexander Tschantz

Abstract: Active inference is an account of cognition and behavior in complex systems which brings together action, perception, and learning under the theoretical mantle of Bayesian inference. Active inference has seen growing applications in academic research, especially in fields that seek to model human or animal behavior. While in recent years, some of the code arising from the active inference literatu… ▽ More Active inference is an account of cognition and behavior in complex systems which brings together action, perception, and learning under the theoretical mantle of Bayesian inference. Active inference has seen growing applications in academic research, especially in fields that seek to model human or animal behavior. While in recent years, some of the code arising from the active inference literature has been written in open source languages like Python and Julia, to-date, the most popular software for simulating active inference agents is the DEM toolbox of SPM, a MATLAB library originally developed for the statistical analysis and modelling of neuroimaging data. Increasing interest in active inference, manifested both in terms of sheer number as well as diversifying applications across scientific disciplines, has thus created a need for generic, widely-available, and user-friendly code for simulating active inference in open-source scientific computing languages like Python. The Python package we present here, pymdp (see https://github.com/infer-actively/pymdp), represents a significant step in this direction: namely, we provide the first open-source package for simulating active inference with partially-observable Markov Decision Processes or POMDPs. We review the package's structure and explain its advantages like modular design and customizability, while providing in-text code blocks along the way to demonstrate how it can be used to build and run active inference processes with ease. We developed pymdp to increase the accessibility and exposure of the active inference framework to researchers, engineers, and developers with diverse disciplinary backgrounds. In the spirit of open-source software, we also hope that it spurs new innovation, development, and collaboration in the growing active inference community. △ Less

Submitted 4 May, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

Journal ref: Journal of Open Source Software, 7(73), 4098 (2022)

arXiv:2105.11203 [pdf, other]

doi 10.1016/j.plrev.2021.11.001

How particular is the physics of the free energy principle?

Authors: Miguel Aguilera, Beren Millidge, Alexander Tschantz, Christopher L. Buckley

Abstract: The free energy principle (FEP) states that any dynamical system can be interpreted as performing Bayesian inference upon its surrounding environment. In this work, we examine in depth the assumptions required to derive the FEP in the simplest possible set of systems -- weakly-coupled non-equilibrium linear stochastic systems. Specifically, we explore (i) how general the requirements imposed on th… ▽ More The free energy principle (FEP) states that any dynamical system can be interpreted as performing Bayesian inference upon its surrounding environment. In this work, we examine in depth the assumptions required to derive the FEP in the simplest possible set of systems -- weakly-coupled non-equilibrium linear stochastic systems. Specifically, we explore (i) how general the requirements imposed on the statistical structure of a system are and (ii) how informative the FEP is about the behaviour of such systems. We discover that two requirements of the FEP -- the Markov blanket condition (i.e. a statistical boundary precluding direct coupling between internal and external states) and stringent restrictions on its solenoidal flows (i.e. tendencies driving a system out of equilibrium) -- are only valid for a very narrow space of parameters. Suitable systems require an absence of perception-action asymmetries that is highly unusual for living systems interacting with an environment. More importantly, we observe that a mathematically central step in the argument, connecting the behaviour of a system to variational inference, relies on an implicit equivalence between the dynamics of the average states of a system with the average of the dynamics of those states. This equivalence does not hold in general even for linear systems, since it requires an effective decoupling from the system's history of interactions. These observations are critical for evaluating the generality and applicability of the FEP and indicate the existence of significant problems of the theory in its current form. These issues make the FEP, as it stands, not straightforwardly applicable to the simple linear systems studied here and suggest that more development is needed before the theory could be applied to the kind of complex systems that describe living and cognitive processes. △ Less

Submitted 19 May, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

Journal ref: Physics of Life Reviews. Volume 40, March 2022, Pages 24-50

arXiv:2010.01047 [pdf, other]

Relaxing the Constraints on Predictive Coding Models

Authors: Beren Millidge, Alexander Tschantz, Anil Seth, Christopher L Buckley

Abstract: Predictive coding is an influential theory of cortical function which posits that the principal computation the brain performs, which underlies both perception and learning, is the minimization of prediction errors. While motivated by high-level notions of variational inference, detailed neurophysiological models of cortical microcircuits which can implements its computations have been developed.… ▽ More Predictive coding is an influential theory of cortical function which posits that the principal computation the brain performs, which underlies both perception and learning, is the minimization of prediction errors. While motivated by high-level notions of variational inference, detailed neurophysiological models of cortical microcircuits which can implements its computations have been developed. Moreover, under certain conditions, predictive coding has been shown to approximate the backpropagation of error algorithm, and thus provides a relatively biologically plausible credit-assignment mechanism for training deep networks. However, standard implementations of the algorithm still involve potentially neurally implausible features such as identical forward and backward weights, backward nonlinear derivatives, and 1-1 error unit connectivity. In this paper, we show that these features are not integral to the algorithm and can be removed either directly or through learning additional sets of parameters with Hebbian update rules without noticeable harm to learning performance. Our work thus relaxes current constraints on potential microcircuit designs and hopefully opens up new regions of the design-space for neuromorphic implementations of predictive coding. △ Less

Submitted 10 October, 2020; v1 submitted 2 October, 2020; originally announced October 2020.

Comments: 02/10/20 initial upload; 10/10/20 minor fixes

arXiv:2009.05359 [pdf, other]

Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain

Authors: Beren Millidge, Alexander Tschantz, Anil K Seth, Christopher L Buckley

Abstract: The backpropagation of error algorithm (backprop) has been instrumental in the recent success of deep learning. However, a key question remains as to whether backprop can be formulated in a manner suitable for implementation in neural circuitry. The primary challenge is to ensure that any candidate formulation uses only local information, rather than relying on global signals as in standard backpr… ▽ More The backpropagation of error algorithm (backprop) has been instrumental in the recent success of deep learning. However, a key question remains as to whether backprop can be formulated in a manner suitable for implementation in neural circuitry. The primary challenge is to ensure that any candidate formulation uses only local information, rather than relying on global signals as in standard backprop. Recently several algorithms for approximating backprop using only local signals have been proposed. However, these algorithms typically impose other requirements which challenge biological plausibility: for example, requiring complex and precise connectivity schemes, or multiple sequential backwards phases with information being stored across phases. Here, we propose a novel algorithm, Activation Relaxation (AR), which is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system. Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, utilises only a single parallel backwards relaxation phase, and can operate on arbitrary computation graphs. We illustrate these properties by training deep neural networks on visual classification tasks, and describe simplifications to the algorithm which remove further obstacles to neurobiological implementation (for example, the weight-transport problem, and the use of nonlinear derivatives), while preserving performance. △ Less

Submitted 10 October, 2020; v1 submitted 11 September, 2020; originally announced September 2020.

Comments: initial upload; revised version (updated abstract, related work) 28-09-20; 05/10/20: revised for ICLR submission; 10/10/20: minor revisions

arXiv:2008.03238 [pdf]

Neural and phenotypic representation under the free-energy principle

Authors: Maxwell J. D. Ramstead, Casper Hesp, Alec Tschantz, Ryan Smith, Axel Constant, Karl Friston

Abstract: The aim of this paper is to leverage the free-energy principle and its corollary process theory, active inference, to develop a generic, generalizable model of the representational capacities of living creatures; that is, a theory of phenotypic representation. Given their ubiquity, we are concerned with distributed forms of representation (e.g., population codes), whereby patterns of ensemble acti… ▽ More The aim of this paper is to leverage the free-energy principle and its corollary process theory, active inference, to develop a generic, generalizable model of the representational capacities of living creatures; that is, a theory of phenotypic representation. Given their ubiquity, we are concerned with distributed forms of representation (e.g., population codes), whereby patterns of ensemble activity in living tissue come to represent the causes of sensory input or data. The active inference framework rests on the Markov blanket formalism, which allows us to partition systems of interest, such as biological systems, into internal states, external states, and the blanket (active and sensory) states that render internal and external states conditionally independent of each other. In this framework, the representational capacity of living creatures emerges as a consequence of their Markovian structure and nonequilibrium dynamics, which together entail a dual-aspect information geometry. This entails a modest representational capacity: internal states have an intrinsic information geometry that describes their trajectory over time in state space, as well as an extrinsic information geometry that allows internal states to encode (the parameters of) probabilistic beliefs about (fictive) external states. Building on this, we describe here how, in an automatic and emergent manner, information about stimuli can come to be encoded by groups of neurons bound by a Markov blanket; what is known as the neuronal packet hypothesis. As a concrete demonstration of this type of emergent representation, we present numerical simulations showing that self-organizing ensembles of active inference agents sharing the right kind of probabilistic generative model are able to encode recoverable information about a stimulus array. △ Less

Submitted 1 December, 2020; v1 submitted 7 August, 2020; originally announced August 2020.

Comments: 30 pages; 6 figures

Showing 1–10 of 10 results for author: Tschantz, A