-
Complex behavior from intrinsic motivation to occupy action-state path space
Authors:
Jorge Ramírez-Ruiz,
Dmytro Grytskyy,
Chiara Mastrogiuseppe,
Yamen Habib,
Rubén Moreno-Bote
Abstract:
Most theories of behavior posit that agents tend to maximize some form of reward or utility. However, animals very often move with curiosity and seem to be motivated in a reward-free manner. Here we abandon the idea of reward maximization, and propose that the goal of behavior is maximizing occupancy of future paths of actions and states. According to this maximum occupancy principle, rewards are the means to occupy path space, not the goal per se; goal-directedness simply emerges as rational ways of searching for resources so that movement, understood amply, never ends. We find that action-state path entropy is the only measure consistent with additivity and other intuitive properties of expected future action-state path occupancy. We provide analytical expressions that relate the optimal policy and state-value function, and prove convergence of our value iteration algorithm. Using discrete and continuous state tasks, including a high-dimensional controller, we show that complex behaviors such as 'dancing', hide-and-seek and a basic form of altruistic behavior naturally result from the intrinsic motivation to occupy path space. All in all, we present a theory of behavior that generates both variability and goal-directedness in the absence of reward maximization.
Submitted 24 February, 2024; v1 submitted 20 May, 2022;
originally announced May 2022.
-
A learning rule balancing energy consumption and information maximization in a feed-forward neuronal network
Authors:
Dmytro Grytskyy,
Renaud B. Jolivet
Abstract:
Information measures are often used to assess the efficacy of neural networks, and learning rules can be derived through optimization procedures on such measures. In biological neural networks, computation is restricted by the amount of available resources. Considering energy restrictions, it is thus reasonable to balance information processing efficacy with energy consumption. Here, we studied networks of non-linear Hawkes neurons and assessed the information flow through these networks using mutual information. We then applied gradient descent for a combination of mutual information and energetic costs to obtain a learning rule. Through this procedure, we obtained a rule containing a sliding threshold, similar to the Bienenstock-Cooper-Munro rule. The rule contains terms local in time and in space plus one global variable common to the whole network. The rule thus belongs to so-called three-factor rules and the global variable could be related to a number of biological processes. In neural networks using this learning rule, frequent inputs get mapped onto low energy orbits of the network while rare inputs aren't learned.
Submitted 11 March, 2021;
originally announced March 2021.
-
A reaction diffusion-like formalism for plastic neural networks reveals dissipative solitons at criticality
Authors:
Dmytro Grytskyy,
Markus Diesmann,
Moritz Helias
Abstract:
Self-organized structures in networks with spike-timing dependent plasticity (STDP) are likely to play a central role for information processing in the brain. In the present study we derive a reaction-diffusion-like formalism for plastic feed-forward networks of nonlinear rate neurons with a correlation sensitive learning rule inspired by and being qualitatively similar to STDP. After obtaining equations that describe the change of the spatial shape of the signal from layer to layer, we derive a criterion for the non-linearity necessary to obtain stable dynamics for arbitrary input. We classify the possible scenarios of signal evolution and find that close to the transition to the unstable regime meta-stable solutions appear. The form of these dissipative solitons is determined analytically and the evolution and interaction of several such coexistent objects is investigated.
Submitted 7 December, 2015; v1 submitted 31 August, 2015;
originally announced August 2015.
-
A unified view on weakly correlated recurrent networks
Authors:
Dmytro Grytskyy,
Tom Tetzlaff,
Markus Diesmann,
Moritz Helias
Abstract:
The diversity of neuron models used in contemporary theoretical neuroscience to investigate specific properties of covariances raises the question how these models relate to each other. In particular it is hard to distinguish between generic properties and peculiarities due to the abstracted model. Here we present a unified view on pairwise covariances in recurrent networks in the irregular regime. We consider the binary neuron model, the leaky integrate-and-fire model, and the Hawkes process. We show that linear approximation maps each of these models to either of two classes of linear rate models, including the Ornstein-Uhlenbeck process as a special case. The classes differ in the location of additive noise in the rate dynamics, which is on the output side for spiking models and on the input side for the binary model. Both classes allow closed form solutions for the covariance. For output noise it separates into an echo term and a term due to correlated input. The unified framework enables us to transfer results between models. For example, we generalize the binary model and the Hawkes process to the presence of conduction delays and simplify derivations for established results. Our approach is applicable to general network structures and suitable for population averages. The derived averages are exact for fixed out-degree network architectures and approximate for fixed in-degree. We demonstrate how taking into account fluctuations in the linearization procedure increases the accuracy of the effective theory and we explain the class dependent differences between covariances in the time and the frequency domain. Finally we show that the oscillatory instability emerging in networks of integrate-and-fire models with delayed inhibitory feedback is a model-invariant feature: the same structure of poles in the complex frequency plane determines the population power spectra.
Submitted 13 September, 2013; v1 submitted 30 April, 2013;
originally announced April 2013.