-
Reinforcement Learning with thermal fluctuations at the nano-scale
Authors:
Francesco Boccardo,
Olivier Pierre-Louis
Abstract:
Reinforcement Learning offers a framework to learn to choose actions in order to achieve some goal. However, at the nano-scale, thermal fluctuations hamper the learning process. We analyze this regime using the general framework of Markov Decision Processes, which applies to a wide variety of problems from nano-navigation to nano-machine actuation. We show that at the nano-scale, while optimal actions should bring an improvement proportional to the small ratio of the applied force times a length-scale over the temperature, the learned improvement is smaller and proportional to the square of this small ratio. Consequently, the efficiency of learning, which compares the learning improvement to the theoretical optimal improvement, drops to zero. Nevertheless, these limitations can be circumvented by using actions learned at a lower temperature. These results are illustrated with simulations of the control of the shape of small particle clusters.
Submitted 15 December, 2023; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Temperature transitions and degeneracy in the control of small clusters with a macroscopic field
Authors:
Francesco Boccardo,
Olivier Pierre-Louis
Abstract:
We present a numerical investigation of the control of few-particle fluctuating clusters with a macroscopic field. Our goal is to reach a given target cluster shape in minimum time. This question is formulated as a first passage problem in the space of cluster configurations. We find the optimal policy to set the macroscopic field as a function of the observed shape using dynamic programming. Our results show that the optimal policy is non-unique, and its degeneracy is mainly related to symmetries shared by the initial shape, the force and the target shape. The total fraction of shapes for which the optimal choice of the force is non-unique vanishes as the cluster size increases. Furthermore, the optimal policy exhibits a discrete set of transitions when the temperature is varied. Each transition leads to a discontinuity in the derivative of the time to reach the target with respect to temperature. As the size of the cluster increases, the change in the policy due to temperature transitions grows like the total number of configurations and a continuum limit emerges.
Submitted 28 June, 2022;
originally announced June 2022.
-
Equilibrium return times of small fluctuating clusters and vacancies
Authors:
Francesco Boccardo,
Younes Benamara,
Olivier Pierre-Louis
Abstract:
The expected return time of a fluctuating two-dimensional cluster or vacancy to a given configuration is studied in thermodynamic equilibrium. We define a family of bond-breaking models that preserve the number of particles. This family includes edge diffusion and surface diffusion inside vacancies in the limit of fast particle diffusion and slow attachment-detachment kinetics. Within the framework of these bond-breaking models, the expected return time is found to depend on the energies of the configurations and on the energies of the excited states formed by removing a single particle from the cluster. High and low temperature regimes are studied. We clarify the conditions under which the return time is a non-monotonic function of temperature: a minimum is found when the energy obtained by the average over the excited states of the configuration weighted by their attachment probabilities is lower than the energy averaged over all states. In addition, we show that the optimal temperature at which the return time is minimum is shifted to a higher temperature as compared to the temperature at which the equilibrium probability is maximum. This shift is influenced by the average curvature of the cluster edge, and is therefore larger for vacancies.
Submitted 27 May, 2022;
originally announced May 2022.
-
Controlling the shape of small clusters with and without macroscopic fields
Authors:
Francesco Boccardo,
Olivier Pierre-Louis
Abstract:
Despite major advances in the understanding of the formation and dynamics of nano-clusters in the past decades, theoretical bases for the control of their shape are still lacking. We investigate strategies for driving fluctuating few-particle clusters to an arbitrary target shape in minimum time with or without an external field. This question is recast into a first passage problem, solved numerically, and discussed within a high temperature expansion. Without field, large-enough low-energy target shapes exhibit an optimal temperature at which they are reached in minimum time. We then compute the optimal way to set an external field to minimize the time to reach the target, leading to a gain of time that grows when increasing cluster size or decreasing temperature. This gain can shift the optimal temperature or even create one. Our results could apply to clusters of atoms at equilibrium, and colloidal or nanoparticle clusters under thermo- or electrophoresis.
Submitted 18 May, 2022;
originally announced May 2022.