-
Machine learning-guided construction of an analytic kinetic energy functional for orbital free density functional theory
Authors:
Sergei Manzhos,
Johann Luder,
Pavlo Golub,
Manabu Ihara
Abstract:
Machine learning (ML) of kinetic energy functionals (KEF) for orbital-free density functional theory (OF-DFT) holds the promise of addressing an important bottleneck in large-scale ab initio materials modeling where sufficiently accurate analytic KEFs are lacking. However, ML models are not as easily handled as analytic expressions; they need to be provided in the form of algorithms and associated data. Here, we bridge the two approaches and construct an analytic expression for a KEF guided by interpretative machine learning of crystal cell-averaged kinetic energy densities (τ) of several hundred materials. A previously published dataset including multiple phases of 433 unary, binary, and ternary compounds containing Li, Al, Mg, Si, As, Ga, Sb, Na, Sn, P, and In was used for training, including data at the equilibrium geometry as well as strained structures. A hybrid Gaussian process regression - neural network (GPR-NN) method was used to understand the type of functional dependence of τ on the features which contained cell-averaged terms of the 4th order gradient expansion and the product of the electron density and Kohn-Sham effective potential. Based on this analysis, an analytic model is constructed that can reproduce Kohn-Sham DFT energy-volume curves with sufficient accuracy (pronounced minima that are sufficiently close to the minima of the Kohn-Sham DFT-based curves and with sufficiently close curvatures) to enable structure optimizations and elastic response calculations.
Submitted 8 April, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
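The abstract above judges a model by how close the minimum and curvature of its energy-volume curve are to the Kohn-Sham reference. A minimal sketch of such a comparison is given below. It assumes a simple quadratic fit near the minimum (real workflows often fit an equation of state instead), and the function names and the idea of reporting a relative shift and a curvature ratio are illustrative choices, not taken from the paper.

```python
import numpy as np

def ev_minimum_and_curvature(volumes, energies):
    """Fit E(V) ~ a*V**2 + b*V + c and return (V0, E0, d2E/dV2).

    Assumption: the sampled volumes bracket the minimum closely enough
    that a parabola is a reasonable local model.
    """
    a, b, c = np.polyfit(volumes, energies, 2)
    if a <= 0:
        raise ValueError("no minimum: fitted curvature is not positive")
    v0 = -b / (2.0 * a)            # location of the minimum
    e0 = c - b**2 / (4.0 * a)      # energy at the minimum
    return v0, e0, 2.0 * a         # second derivative of the parabola

def compare_curves(volumes, e_ref, e_model):
    """Relative shift of the minimum and ratio of curvatures (hypothetical metrics)."""
    v_ref, _, k_ref = ev_minimum_and_curvature(volumes, e_ref)
    v_mod, _, k_mod = ev_minimum_and_curvature(volumes, e_model)
    return {"dV0_rel": (v_mod - v_ref) / v_ref,
            "curvature_ratio": k_mod / k_ref}
```

Since the bulk modulus at equilibrium is V0 times the second derivative of E with respect to V, the two returned quantities together indicate how well elastic response would be reproduced.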
-
Neural network with optimal neuron activation functions based on additive Gaussian process regression
Authors:
Sergei Manzhos,
Manabu Ihara
Abstract:
Feed-forward neural networks (NN) are a staple machine learning method widely used in many areas of science and technology. While even a single-hidden layer NN is a universal approximator, its expressive power is limited by the use of simple neuron activation functions (such as sigmoid functions) that are typically the same for all neurons. More flexible neuron activation functions would allow using fewer neurons and layers and thereby save computational cost and improve expressive power. We show that additive Gaussian process regression (GPR) can be used to construct optimal neuron activation functions that are individual to each neuron. An approach is also introduced that avoids non-linear fitting of neural network parameters. The resulting method combines the advantage of robustness of a linear regression with the higher expressive power of a NN. We demonstrate the approach by fitting the potential energy surfaces of the water molecule and formaldehyde. Without requiring any non-linear optimization, the additive GPR based approach outperforms a conventional NN in the high accuracy regime, where a conventional NN suffers more from overfitting.
Submitted 19 January, 2023; v1 submitted 13 January, 2023;
originally announced January 2023.
-
The loss of the property of locality of the kernel in high-dimensional Gaussian process regression on the example of the fitting of molecular potential energy surfaces
Authors:
Sergei Manzhos,
Manabu Ihara
Abstract:
Kernel-based methods, including Gaussian process regression (GPR) and, more generally, kernel ridge regression (KRR), have been finding increasing use in computational chemistry, including the fitting of potential energy surfaces and density functionals in high-dimensional feature spaces. Kernels of the Matérn family, such as Gaussian-like kernels (basis functions), are often used, which allows them to be interpreted as covariance functions and GPR to be formulated as an estimator of the mean of a Gaussian distribution. The notion of locality of the kernel is critical for this interpretation. It is also critical to the formulation of multi-zeta type basis functions widely used in computational chemistry. We show, using the example of fitting molecular potential energy surfaces of increasing dimensionality, the practical disappearance of the property of locality of a Gaussian-like kernel in high dimensionality. We also formulate a multi-zeta approach to the kernel and show that it significantly improves the quality of regression in low dimensionality but loses any advantage in high dimensionality, which is attributed to the loss of the property of locality.
Submitted 20 November, 2022;
originally announced November 2022.
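The loss of kernel locality described above can be illustrated with a small numerical experiment. The sketch below is not the authors' analysis: it uses random points in a unit hypercube rather than molecular geometries, and the length scale (median pairwise distance) is an assumed choice. It reports the ratio of the smallest to the largest Gaussian kernel value over all point pairs; a ratio close to 0 means the kernel clearly separates near from far neighbors, while a ratio approaching 1 means it no longer does.

```python
import numpy as np

def far_to_near_kernel_ratio(dim, n_points=200, seed=0):
    """Ratio k(farthest pair) / k(nearest pair) for a Gaussian kernel.

    Assumptions: points are uniform in [0, 1]^dim and the squared length
    scale is set to the median squared pairwise distance.
    """
    rng = np.random.default_rng(seed)
    X = rng.random((n_points, dim))
    sq = np.sum(X**2, axis=1)
    # Squared Euclidean distances; clip tiny negatives from round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    d2 = d2[np.triu_indices(n_points, k=1)]   # distinct pairs only
    length2 = np.median(d2)
    k = np.exp(-0.5 * d2 / length2)
    return k.min() / k.max()

for dim in (2, 10, 50):
    print(dim, far_to_near_kernel_ratio(dim))
```

Because pairwise distances concentrate around their mean as the dimensionality grows, the ratio increases with `dim`, which is one simple way to see why a Gaussian-like kernel stops acting as a local basis function.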
-
On the optimization of hyperparameters in Gaussian process regression with the help of low-order high-dimensional model representation
Authors:
Sergei Manzhos,
Manabu Ihara
Abstract:
When the data are sparse, optimization of hyperparameters of the kernel in Gaussian process regression by the commonly used maximum likelihood estimation (MLE) criterion often leads to overfitting. We show that choosing hyperparameters (in this case, kernel length parameter and regularization parameter) based on a criterion of the completeness of the basis in the corresponding linear regression problem is superior to MLE. We show that this is facilitated by the use of high-dimensional model representation (HDMR) whereby a low-order HDMR representation can provide reliable reference functions and large synthetic test data sets needed for basis parameter optimization even when the original data are few.
Submitted 5 January, 2022; v1 submitted 30 November, 2021;
originally announced December 2021.
-
Random Sampling High Dimensional Model Representation Gaussian Process Regression (RS-HDMR-GPR) for representing multidimensional functions with machine-learned lower-dimensional terms allowing insight with a general method
Authors:
Owen Ren,
Mohamed Ali Boussaidi,
Dmitry Voytsekhovsky,
Manabu Ihara,
Sergei Manzhos
Abstract:
We present a Python implementation for RS-HDMR-GPR (Random Sampling High Dimensional Model Representation Gaussian Process Regression). The method builds representations of multivariate functions with lower-dimensional terms, either as an expansion over orders of coupling or using terms of only a given dimensionality. This facilitates, in particular, recovering functional dependence from sparse data. The code also allows for imputation of missing values of the variables and for a significant pruning of the useful number of HDMR terms. The code can also be used for estimating relative importance of different combinations of input variables, thereby adding an element of insight to a general machine learning method. The capabilities of this regression tool are demonstrated on test cases involving synthetic analytic functions, the potential energy surface of the water molecule, kinetic energy densities of materials (crystalline magnesium, aluminum, and silicon), and financial market data.
Submitted 16 November, 2021; v1 submitted 23 November, 2020;
originally announced December 2020.
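The idea of representing a multivariate function with lower-dimensional terms can be sketched for the simplest case, a first-order (purely additive) model. The code below is an illustration only and is not the RS-HDMR-GPR package: it assumes an additive kernel built as a sum of one-dimensional Gaussian kernels, a fixed length scale and noise term, and a synthetic target in which the third input is inert. All function names are hypothetical.

```python
import numpy as np

def rbf_1d(a, b, length_scale):
    """One-dimensional Gaussian kernel matrix between vectors a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def fit_additive_gpr(X, y, length_scale=0.5, noise=1e-6):
    """Solve (K + noise*I) alpha = y with K = sum of per-variable kernels."""
    n, d = X.shape
    K = sum(rbf_1d(X[:, i], X[:, i], length_scale) for i in range(d))
    return np.linalg.solve(K + noise * np.eye(n), y)

def component_functions(X_train, alpha, X_new, length_scale=0.5):
    """Evaluate each one-dimensional term f_i(x_i); columns sum to the prediction."""
    cols = [rbf_1d(X_new[:, i], X_train[:, i], length_scale) @ alpha
            for i in range(X_train.shape[1])]
    return np.stack(cols, axis=1)

# Synthetic additive target (assumption): depends on x0 and x1, not on x2.
def target(X):
    return np.sin(2.0 * X[:, 0]) + X[:, 1] ** 2

rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, (200, 3))
X_test = rng.uniform(-1.0, 1.0, (100, 3))
alpha = fit_additive_gpr(X_train, target(X_train))
parts = component_functions(X_train, alpha, X_test)
rmse = np.sqrt(np.mean((parts.sum(axis=1) - target(X_test)) ** 2))
importance = parts.var(axis=0)   # variance of each term as a rough importance measure
print(rmse, importance)
```

The variance of each component function gives a crude indication of which input matters most, in the spirit of the relative-importance estimates mentioned in the abstract. Higher-order coupling terms would need kernels over pairs or larger subsets of variables.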