-
Structure-Based Experimental Datasets for Benchmarking Protein Simulation Force Fields
Authors:
Chapin E. Cavender,
David A. Case,
Julian C.-H. Chen,
Lillian T. Chong,
Daniel A. Keedy,
Kresten Lindorff-Larsen,
David L. Mobley,
O. H. Samuli Ollila,
Chris Oostenbrink,
Paul Robustelli,
Vincent A. Voelz,
Michael E. Wall,
David C. Wych,
Michael K. Gilson
Abstract:
This review article provides an overview of structurally oriented experimental datasets that can be used to benchmark protein force fields, focusing on data generated by nuclear magnetic resonance (NMR) spectroscopy and room temperature (RT) protein crystallography. We discuss what the observables are, what they tell us about structure and dynamics, what makes them useful for assessing force field accuracy, and how they can be connected to molecular dynamics simulations carried out using the force field one wishes to benchmark. We also touch on statistical issues that arise when comparing simulations with experiment. We hope this article will be particularly useful to computational researchers and trainees who develop, benchmark, or use protein force fields for molecular simulations.
Submitted 25 March, 2025; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Folding@home: achievements from over twenty years of citizen science herald the exascale era
Authors:
Vincent A. Voelz,
Vijay S. Pande,
Gregory R. Bowman
Abstract:
Simulations of biomolecules have enormous potential to inform our understanding of biology but require extremely demanding calculations. For over twenty years, the Folding@home distributed computing project has pioneered a massively parallel approach to biomolecular simulation, harnessing the resources of citizen scientists across the globe. Here, we summarize the scientific and technical advances this perspective has enabled. As the project's name implies, the early years of Folding@home focused on driving advances in our understanding of protein folding by developing statistical methods for capturing long-timescale processes and facilitating insight into complex dynamical processes. Success laid a foundation for broadening the scope of Folding@home to address other functionally relevant conformational changes, such as receptor signaling, enzyme dynamics, and ligand binding. Continued algorithmic advances, hardware developments such as GPU-based computing, and the growing scale of Folding@home have enabled the project to focus on new areas where massively parallel sampling can be impactful. While previous work sought to expand toward larger proteins with slower conformational changes, new work focuses on large-scale comparative studies of different protein sequences and chemical compounds to better understand biology and inform the development of small molecule drugs. Progress on these fronts enabled the community to pivot quickly in response to the COVID-19 pandemic, expanding to become the world's first exascale computer and deploying this massive resource to provide insight into the inner workings of the SARS-CoV-2 virus and aid the development of new antivirals. This success provides a glimpse of what's to come as exascale supercomputers come online, and Folding@home continues its work.
Submitted 15 March, 2023;
originally announced March 2023.
-
Adaptive Markov State Model estimation using short reseeding trajectories
Authors:
Hongbin Wan,
Vincent A. Voelz
Abstract:
In the last decade, advances in molecular dynamics (MD) and Markov State Model (MSM) methodologies have made possible accurate and efficient estimation of kinetic rates and reactive pathways for complex biomolecular dynamics occurring on slow timescales. A promising approach to enhanced sampling of MSMs is to use so-called "adaptive" methods, in which new MD trajectories are "seeded" preferentially from previously identified states. Here, we investigate the performance of various MSM estimators applied to reseeding trajectory data, for both a simple 1D free energy landscape, and for mini-protein folding MSMs of WW domain and NTL9(1-39). Our results reveal the practical challenges of reseeding simulations, and suggest a simple way to reweight seeding trajectory data to better estimate both thermodynamic and kinetic quantities.
Submitted 11 December, 2019;
originally announced December 2019.
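For readers unfamiliar with MSM estimation from many short trajectories, the following is a minimal sketch of the simplest estimator: count transitions at a fixed lag time, then row-normalize. It is not the authors' code and not one of the specific estimators compared in the paper; the function names (`count_matrix`, `row_normalized_msm`, `stationary_distribution`) and the toy trajectories are hypothetical, chosen only for illustration.

```python
import numpy as np

def count_matrix(trajs, n_states, lag=1):
    """Count i -> j transitions at the given lag over a set of discrete trajectories."""
    C = np.zeros((n_states, n_states))
    for traj in trajs:
        for i, j in zip(traj[:-lag], traj[lag:]):
            C[i, j] += 1
    return C

def row_normalized_msm(C):
    """Row-normalize counts into transition probabilities (rows with no counts stay zero)."""
    rows = C.sum(axis=1, keepdims=True)
    return np.divide(C, rows, out=np.zeros_like(C), where=rows > 0)

def stationary_distribution(T):
    """Stationary distribution: left eigenvector of T with the largest eigenvalue."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmax(evals.real)
    pi = np.abs(evecs[:, idx].real)
    return pi / pi.sum()

# Hypothetical toy data: three short trajectories over two states,
# each started ("seeded") from a chosen state.
trajs = [[0, 0, 1], [1, 0, 0], [1, 1, 0]]
C = count_matrix(trajs, n_states=2)
T = row_normalized_msm(C)
pi = stationary_distribution(T)
```

Because each row of `T` is conditioned on the starting state, this estimator does not depend directly on how many trajectories were seeded from each state, although poorly sampled rows remain noisy. Estimators that additionally enforce detailed balance use the seeding distribution differently and are not shown here.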
-
A maximum-caliber approach to predicting perturbed folding kinetics due to mutations
Authors:
Vincent A. Voelz,
Guangfeng Zhou,
Hongbin Wan
Abstract:
We present a maximum-caliber method for inferring transition rates of a Markov State Model (MSM) with perturbed equilibrium populations, given estimates of state populations and rates for an unperturbed MSM. It is similar in spirit to previous approaches but given the inclusion of prior information it is more robust and simple to implement. We examine its performance in simple biased diffusion models of kinetics, and then apply the method to predicting changes in folding rates for several highly non-trivial protein folding systems for which non-native interactions play a significant role, including (1) tryptophan variants of GB1 hairpin, (2) salt-bridge mutations of Fs peptide helix, and (3) MSMs built from ultra-long folding trajectories of FiP35 and GTT variants of WW domain. In all cases, the method correctly predicts changes in folding rates, suggesting the wide applicability of maximum-caliber approaches to efficiently predict how mutations perturb protein conformational dynamics.
Submitted 25 May, 2016;
originally announced May 2016.