Search | arXiv e-print repository

arXiv:2005.14605 [pdf, ps, other]

doi 10.1038/s41598-021-90144-3

CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing

Authors: Oleksandr Borysenko, Maksym Byshkin

Abstract: Deep learning applications require global optimization of non-convex objective functions, which have multiple local minima. The same problem is often found in physical simulations and may be resolved by the methods of Langevin dynamics with Simulated Annealing, which is a well-established approach for minimization of many-particle potentials. This analogy provides useful insights for non-convex stochastic optimization in machine learning. Here we find that integration of the discretized Langevin equation gives a coordinate updating rule equivalent to the famous Momentum optimization algorithm. As a main result, we show that a gradual decrease of the momentum coefficient from the initial value close to unity until zero is equivalent to application of Simulated Annealing or slow cooling, in physical terms. Making use of this novel approach, we propose CoolMomentum -- a new stochastic optimization method. Applying CoolMomentum to optimization of ResNet-20 on the CIFAR-10 dataset and EfficientNet-B0 on ImageNet, we demonstrate that it is able to achieve high accuracies.

Submitted 21 May, 2021; v1 submitted 29 May, 2020; originally announced May 2020.

Comments: 9 pages, 2 figures

Journal ref: Borysenko, O., Byshkin, M. CoolMomentum: a method for stochastic optimization by Langevin dynamics with simulated annealing. Sci Rep 11, 10705 (2021)
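
The momentum-cooling idea described in the CoolMomentum abstract above can be illustrated with a small sketch. This is not the authors' implementation: the toy objective, the function names `grad` and `cooled_momentum_descent`, the linear cooling schedule, and all default values are assumptions chosen for illustration. The only element taken from the abstract is that the momentum coefficient starts close to unity and is gradually lowered to zero.

```python
import random


def grad(x):
    # Gradient of a toy non-convex objective f(x) = x**4 - 3*x**2 + x,
    # which has two local minima separated by a barrier (illustrative choice).
    return 4 * x**3 - 6 * x + 1


def cooled_momentum_descent(x0, steps=2000, lr=0.01, rho_start=0.99,
                            noise=0.0, seed=0):
    """Momentum updates with a momentum coefficient cooled from rho_start to 0.

    The linear schedule below is a hypothetical stand-in; the schedule used
    in the paper is not given in the abstract.
    """
    rng = random.Random(seed)
    x, v = float(x0), 0.0
    for t in range(steps):
        # Assumed schedule: rho decreases linearly from rho_start to exactly 0.
        rho = rho_start * (1.0 - t / (steps - 1))
        # Optional Gaussian noise mimics a stochastic gradient estimate.
        g = grad(x) + noise * rng.gauss(0.0, 1.0)
        v = rho * v - lr * g   # velocity update (classical momentum)
        x = x + v              # coordinate update
    return x


if __name__ == "__main__":
    x_final = cooled_momentum_descent(2.0)
    print("final x:", x_final, "gradient there:", grad(x_final))
```

Early on, with the coefficient near 1, the iterate behaves like a weakly damped particle and can cross the barrier between minima; as the coefficient approaches 0 the update reduces to plain gradient descent and the iterate settles into whichever basin it occupies.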

arXiv:2003.05824 [pdf, other]

doi 10.1063/5.0007445

Hybrid Particle-Field Molecular Dynamics Under Constant Pressure

Authors: Sigbjørn Løland Bore, Hima Bindu Kolli, Antonio De Nicola, Maksym Byshkin, Toshihiro Kawakatsu, Giuseppe Milano, Michele Cascella

Abstract: Hybrid particle-field methods are computationally efficient approaches for modelling soft matter systems. So far applications of these methodologies have been limited to constant volume conditions. Here, we reformulate particle-field interactions to represent systems coupled to constant external pressure. First, we show that the commonly used particle-field energy functional can be modified to model and parameterize the isotropic contributions to the pressure tensor without interfering with the microscopic forces on the particles. Second, we employ a square gradient particle-field interaction term to model non-isotropic contributions to the pressure tensor, such as in surface tension phenomena. This formulation is implemented within the hybrid particle-field molecular dynamics approach and is tested on a series of model systems. Simulations of a homogeneous water box demonstrate that it is possible to parameterize the equation of state to reproduce any target density for a given external pressure. Moreover, the same parameterization is transferable to systems of similar coarse-grained mapping resolution. Finally, we evaluate the feasibility of the proposed approach on coarse-grained models of phospholipids, finding that the term between water and the lipid hydrocarbon tails is alone sufficient to reproduce the experimental area per lipid in constant-pressure simulations, and to produce a qualitatively correct lateral pressure profile.

Submitted 12 March, 2020; originally announced March 2020.

Comments: 24 pages, 7 figures

Journal ref: J. Chem. Phys. 152, 184908 (2020)

arXiv:1901.00533 [pdf, other]

A Simple Algorithm for Scalable Monte Carlo Inference

Authors: Alexander Borisenko, Maksym Byshkin, Alessandro Lomi

Abstract: The methods of statistical physics are widely used for modelling complex networks. Building on the recently proposed Equilibrium Expectation approach, we derive a simple and efficient algorithm for maximum likelihood estimation (MLE) of parameters of exponential family distributions - a family of statistical models, that includes Ising model, Markov Random Field and Exponential Random Graph models. Computational experiments and analysis of empirical data demonstrate that the algorithm increases by orders of magnitude the size of network data amenable to Monte Carlo based inference. We report results suggesting that the applicability of the algorithm may readily be extended to the analysis of large samples of dependent observations commonly found in biology, sociology, astrophysics, and ecology.

Submitted 11 February, 2020; v1 submitted 2 January, 2019; originally announced January 2019.

Comments: 15 pages + supplementary information

arXiv:1802.10311 [pdf]

doi 10.1038/s41598-018-29725-8

Fast Maximum Likelihood estimation via Equilibrium Expectation for Large Network Data

Authors: Maksym Byshkin, Alex Stivala, Antonietta Mira, Garry Robins, Alessandro Lomi

Abstract: A major line of contemporary research on complex networks is based on the development of statistical models that specify the local motifs associated with macro-structural properties observed in actual networks. This statistical approach becomes increasingly problematic as network size increases. In the context of current research on efficient estimation of models for large network data sets, we propose a fast algorithm for maximum likelihood estimation (MLE) that affords a significant increase in the size of networks amenable to direct empirical analysis. The algorithm we propose in this paper relies on properties of Markov chains at equilibrium, and for this reason it is called equilibrium expectation (EE). We demonstrate the performance of the EE algorithm in the context of exponential random graph models (ERGMs), a family of statistical models commonly used in empirical research based on network data observed at a single period in time. Thus far, the lack of efficient computational strategies has limited the empirical scope of ERGMs to relatively small networks with a few thousand nodes. The approach we propose allows a dramatic increase in the size of networks that may be analyzed using ERGMs. This is illustrated in an analysis of several biological networks and one social network with 104,103 nodes.

Submitted 1 August, 2018; v1 submitted 28 February, 2018; originally announced February 2018.

Comments: Final version

Journal ref: Scientific Reports 8, 11509 (2018), https://www.nature.com/articles/s41598-018-29725-8

Showing 1–4 of 4 results for author: Byshkin, M