Search | arXiv e-print repository

arXiv:2505.00625 [pdf, other]

SA-GAT-SR: Self-Adaptable Graph Attention Networks with Symbolic Regression for high-fidelity material property prediction

Authors: Junchi Liu, Ying Tang, Sergei Tretiak, Wenhui Duan, Liujiang Zhou

Abstract: Recent advances in machine learning have demonstrated the enormous utility of deep learning approaches, particularly Graph Neural Networks (GNNs), for materials science. These methods have emerged as powerful tools for high-throughput prediction of material properties, offering a compelling enhancement and alternative to traditional first-principles calculations. While the community has predominantly focused on developing increasingly complex and universal models to enhance predictive accuracy, such approaches often lack physical interpretability and insights into materials behavior. Here, we introduce a novel computational paradigm, Self-Adaptable Graph Attention Networks integrated with Symbolic Regression (SA-GAT-SR), that synergistically combines the predictive capability of GNNs with the interpretative power of symbolic regression. Our framework employs a self-adaptable encoding algorithm that automatically identifies and adjusts attention weights so as to screen critical features from an expansive 180-dimensional feature space while maintaining O(n) computational scaling. The integrated SR module subsequently distills these features into compact analytical expressions that explicitly reveal quantum-mechanically meaningful relationships, achieving a 23-fold acceleration compared to conventional SR implementations that rely heavily on features derived from first-principles calculations as input. This work suggests a new framework in computational materials science, bridging the gap between predictive accuracy and physical interpretability and offering valuable physical insights into material behavior.

Submitted 22 May, 2025; v1 submitted 1 May, 2025; originally announced May 2025.

arXiv:2503.15294 [pdf, ps, other]

Borsuk-Ulam and Replicable Learning of Large-Margin Halfspaces

Authors: Ari Blondal, Hamed Hatami, Pooya Hatami, Chavdar Lalov, Sivan Tretiak

Abstract: Recent remarkable advances in learning theory have established that, for total concept classes, list replicability, global stability, differentially private (DP) learnability, and shared-randomness replicability all coincide with the finiteness of Littlestone dimension. Does this equivalence extend to partial concept classes? We answer this question by proving that the list replicability number of $d$-dimensional $γ$-margin half-spaces satisfies \[ \frac{d}{2}+1 \le \mathrm{LR}(H^d_γ) \le d, \] which grows with dimension. Consequently, for partial classes, list replicability and global stability do not necessarily follow from bounded Littlestone dimension, pure DP-learnability, or shared-randomness replicability. Applying our main theorem, we resolve several open problems:
$\bullet$ Every disambiguation of infinite-dimensional large-margin half-spaces to a total concept class has unbounded Littlestone dimension, answering an open question of Alon et al. (FOCS '21).
$\bullet$ The maximum list-replicability number of any finite set of points and homogeneous half-spaces in $d$-dimensional Euclidean space is $d$, resolving a problem of Chase et al. (FOCS '23).
$\bullet$ Every disambiguation of the Gap Hamming Distance problem in the large gap regime has unbounded public-coin randomized communication complexity. This answers an open question of Fang et al. (STOC '25).
$\bullet$ There exists a partial concept class with Littlestone dimension $1$ such that all its disambiguations have infinite Littlestone dimension. This answers a question of Cheung et al. (ICALP '23).
Our lower bound follows from a topological argument based on the local Borsuk-Ulam theorem of Chase, Chornomaz, Moran, and Yehudayoff (STOC '24). For the upper bound, we construct a list-replicable learning rule using the generalization properties of SVMs.

Submitted 27 April, 2025; v1 submitted 19 March, 2025; originally announced March 2025.

Comments: Added Corollary 1.9 that answers a question of [CHHH23]

arXiv:2110.12096 [pdf]

doi 10.1038/s41598-022-21163-x

Molecular Dynamics on Quantum Annealers

Authors: Igor Gayday, Dmitri Babikov, Alexander Teplukhin, Brian K. Kendrick, Susan M. Mniszewski, Yu Zhang, Sergei Tretiak, Pavel A. Dub

Abstract: In this work we demonstrate a practical prospect of using quantum annealers for simulation of molecular dynamics. A methodology developed for this goal, dubbed Quantum Differential Equations (QDE), is applied to propagate classical trajectories for the vibration of the hydrogen molecule in several regimes: nearly harmonic, highly anharmonic, and dissociative motion. The results obtained using the D-Wave 2000Q quantum annealer are all consistent and quickly converge to the analytical reference solution. Several alternative strategies for such calculations are explored and it was found that the most accurate results and the best efficiency are obtained by combining the quantum annealer with classical post-processing (greedy algorithm). Importantly, the QDE framework developed here is entirely general and can be applied to solve any system of first-order ordinary nonlinear differential equations using a quantum annealer.

Submitted 22 October, 2021; originally announced October 2021.

Journal ref: Sci Rep 12, 16824 (2022)

arXiv:2011.14268 [pdf, other]

Downfolding the Molecular Hamiltonian Matrix using Quantum Community Detection

Authors: Susan M. Mniszewski, Pavel A. Dub, Sergei Tretiak, Petr M. Anisimov, Yu Zhang, Christian F. A. Negre

Abstract: Calculating the ground state energy of a molecule efficiently is of great interest in quantum chemistry. The exact numerical solution of the electronic Schrödinger equation remains infeasible for most molecules, requiring approximate methods at best. In this paper we introduce the use of Quantum Community Detection performed using the D-Wave quantum annealer to reduce the molecular Hamiltonian matrix without chemical knowledge. Given a molecule represented by a matrix of Slater determinants, the connectivity between Slater determinants is viewed as a graph adjacency matrix for determining multiple communities based on modularity maximization. The resulting lowest energy cluster of Slater determinants is used to calculate the ground state energy within chemical accuracy. The details of this method are described along with a demonstration of its performance across multiple molecules of interest and a bond dissociation example. This approach is general and can be used as part of electronic structure calculations to reduce the computation required.

Submitted 28 November, 2020; originally announced November 2020.

Comments: Main manuscript: 22 pages, 6 figures and Supplementary information: 15 pages, 3 figures

Report number: LA-UR-20-26971
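
The abstract above describes partitioning a graph by modularity maximization. As a rough illustration only, the sketch below evaluates the standard modularity score Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(c_i, c_j) for a given partition. It is plain classical Python, not the authors' annealer-based method; the function name `modularity` and the toy six-node graph are assumptions made for this example.

```python
def modularity(adjacency, labels):
    """Newman modularity of a partition of an undirected graph.

    adjacency: symmetric list-of-lists of non-negative edge weights.
    labels:    labels[i] is the community assigned to node i.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]   # k_i
    two_m = sum(degrees)                        # 2m, twice the total edge weight
    if two_m == 0:
        raise ValueError("graph has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected at random.
                q += adjacency[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for a, b in edges:
    A[a][b] = A[b][a] = 1

split_q = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
single_q = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
```

A maximizer would search over `labels` for the highest score. Here the two-triangle split scores higher than the single-community assignment, which scores zero by construction.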

arXiv:2003.04934 [pdf, other]

doi 10.1038/s41467-021-21376-0

Automated discovery of a robust interatomic potential for aluminum

Authors: Justin S. Smith, Benjamin Nebgen, Nithin Mathew, Jie Chen, Nicholas Lubbers, Leonid Burakovsky, Sergei Tretiak, Hai Ah Nam, Timothy Germann, Saryu Fensin, Kipton Barros

Abstract: Accuracy of molecular dynamics simulations depends crucially on the interatomic potential used to generate forces. The gold standard would be first-principles quantum mechanics (QM) calculations, but these become prohibitively expensive at large simulation scales. Machine learning (ML) based potentials aim for faithful emulation of QM at drastically reduced computational cost. The accuracy and robustness of an ML potential are primarily limited by the quality and diversity of the training dataset. Using the principles of active learning (AL), we present a highly automated approach to dataset construction. The strategy is to use the ML potential under development to sample new atomic configurations and, whenever a configuration is reached for which the ML uncertainty is sufficiently large, collect new QM data. Here, we seek to push the limits of automation, removing as much expert knowledge from the AL process as possible. All sampling is performed using MD simulations starting from an initially disordered configuration, and undergoing non-equilibrium dynamics as driven by time-varying applied temperatures. We demonstrate this approach by building an ML potential for aluminum (ANI-Al). After many AL iterations, ANI-Al teaches itself to predict properties like the radial distribution function in melt, liquid-solid coexistence curve, and crystal properties such as defect energies and barriers. To demonstrate transferability, we perform a 1.3M atom shock simulation, and show that ANI-Al predictions agree very well with DFT calculations on local atomic environments sampled from the nonequilibrium dynamics. Interestingly, the configurations appearing in shock appear to have been well sampled in the AL training dataset, in a way that we illustrate visually.

Submitted 24 August, 2020; v1 submitted 10 March, 2020; originally announced March 2020.
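
The active-learning strategy in the abstract above (sample with the current model, and collect reference data whenever the model's uncertainty is large) can be sketched in a few lines. This is a toy under stated assumptions, not the ANI-Al workflow: the one-dimensional "configuration" `x`, the `reference_energy` stand-in for a QM calculation, the nearest-neighbour distance used as an uncertainty proxy, and the random-walk sampler are all hypothetical choices made for illustration.

```python
import math
import random


def reference_energy(x):
    """Stand-in for an expensive QM calculation (hypothetical toy function)."""
    return math.sin(x)


def uncertainty(x, dataset):
    """Toy uncertainty proxy: distance to the nearest labelled configuration."""
    return min(abs(x - xi) for xi, _ in dataset)


def active_learning(steps=200, threshold=0.3, seed=0):
    """Random-walk sampling that labels a point only when uncertainty is high."""
    rng = random.Random(seed)
    x = 0.0
    dataset = [(x, reference_energy(x))]  # start from one labelled configuration
    visited = [x]
    for _ in range(steps):
        x += rng.gauss(0.0, 0.2)          # toy sampling step (stands in for MD)
        visited.append(x)
        if uncertainty(x, dataset) > threshold:
            # Model is unsure here: pay for a reference label and keep it.
            dataset.append((x, reference_energy(x)))
    return dataset, visited


dataset, visited = active_learning()
```

In this sketch only a fraction of the visited points are labelled, yet every visited point ends up within `threshold` of some labelled one. A real workflow would retrain the potential after each batch of new labels and would drive the sampling with a temperature schedule, both of which are omitted here.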

Showing 1–5 of 5 results for author: Tretiak, S