Navigating protein landscapes with a machine-learned transferable coarse-grained model
Authors:
Nicholas E. Charron,
Felix Musil,
Andrea Guljas,
Yaoyi Chen,
Klara Bonneau,
Aldo S. Pasos-Trejo,
Jacopo Venturin,
Daria Gusew,
Iryna Zaporozhets,
Andreas Krämer,
Clark Templeton,
Atharva Kelkar,
Aleksander E. P. Durumeric,
Simon Olsson,
Adrià Pérez,
Maciej Majewski,
Brooke E. Husic,
Ankit Patel,
Gianni De Fabritiis,
Frank Noé,
Cecilia Clementi
Abstract:
The most popular and universally predictive protein simulation models employ all-atom molecular dynamics (MD), but they come at extreme computational cost. The development of a universal, computationally efficient coarse-grained (CG) model with similar prediction performance has been a long-standing challenge. By combining recent deep learning methods with a large and diverse training set of all-a…
▽ More
The most popular and universally predictive protein simulation models employ all-atom molecular dynamics (MD), but they come at extreme computational cost. The development of a universal, computationally efficient coarse-grained (CG) model with similar prediction performance has been a long-standing challenge. By combining recent deep learning methods with a large and diverse training set of all-atom protein simulations, we here develop a bottom-up CG force field with chemical transferability, which can be used for extrapolative molecular dynamics on new sequences not used during model parametrization. We demonstrate that the model successfully predicts folded structures, intermediates, metastable folded and unfolded basins, and the fluctuations of intrinsically disordered proteins while it is several orders of magnitude faster than an all-atom model. This showcases the feasibility of a universal and computationally efficient machine-learned CG model for proteins.
△ Less
Submitted 27 October, 2023;
originally announced October 2023.
Bertrand-DR: Improving Text-to-SQL using a Discriminative Re-ranker
Authors:
Amol Kelkar,
Rohan Relan,
Vaishali Bhardwaj,
Saurabh Vaichal,
Chandra Khatri,
Peter Relan
Abstract:
To access data stored in relational databases, users need to understand the database schema and write a query using a query language such as SQL. To simplify this task, text-to-SQL models attempt to translate a user's natural language question to corresponding SQL query. Recently, several generative text-to-SQL models have been developed. We propose a novel discriminative re-ranker to improve the…
▽ More
To access data stored in relational databases, users need to understand the database schema and write a query using a query language such as SQL. To simplify this task, text-to-SQL models attempt to translate a user's natural language question to corresponding SQL query. Recently, several generative text-to-SQL models have been developed. We propose a novel discriminative re-ranker to improve the performance of generative text-to-SQL models by extracting the best SQL query from the beam output predicted by the text-to-SQL generator, resulting in improved performance in the cases where the best query was in the candidate list, but not at the top of the list. We build the re-ranker as a schema agnostic BERT fine-tuned classifier. We analyze relative strengths of the text-to-SQL and re-ranker models across different query hardness levels, and suggest how to combine the two models for optimal performance. We demonstrate the effectiveness of the re-ranker by applying it to two state-of-the-art text-to-SQL models, and achieve top 4 score on the Spider leaderboard at the time of writing this article.
△ Less
Submitted 3 November, 2020; v1 submitted 2 February, 2020;
originally announced February 2020.