Exploring evolution-aware & -free protein language models as protein function predictors
Authors:
Mingyang Hu,
Fajie Yuan,
Kevin K. Yang,
Fusong Ju,
Jin Su,
Hui Wang,
Fei Yang,
Qiuyang Ding
Abstract:
Large-scale Protein Language Models (PLMs) have improved performance in protein prediction tasks, ranging from 3D structure prediction to various function predictions. In particular, AlphaFold, a ground-breaking AI system, could potentially reshape structural biology. However, the utility of the PLM module in AlphaFold, Evoformer, has not been explored beyond structure prediction. In this paper, we investigate the representation ability of three popular PLMs: ESM-1b (single sequence), MSA-Transformer (multiple sequence alignment) and Evoformer (structural), with a special focus on Evoformer. Specifically, we aim to answer the following key questions: (i) Does the Evoformer trained as part of AlphaFold produce representations amenable to predicting protein function? (ii) If yes, can Evoformer replace ESM-1b and MSA-Transformer? (iii) How much do these PLMs rely on evolution-related protein data? In this regard, are they complementary to each other? We compare these models empirically and report new insights and conclusions. All code and datasets for reproducibility are available at https://github.com/elttaes/Revisiting-PLMs.
Submitted 16 October, 2022; v1 submitted 13 June, 2022;
originally announced June 2022.
A Dynamic Programming Implemented 2x2 non-cooperative Game Theory Model for ESS Analysis
Authors:
Chen Shi,
Fang Yuan
Abstract:
Game theory has been frequently applied in biological research since the 1970s. While the key idea of game theory is the Nash equilibrium, calculating it requires first determining the payoff matrix. In this paper we present a dynamic-programming method to compute the payoff matrix of a 2x2 non-cooperative finite resource allocation game. We assume that one population contains two types of individuals, aggressive and non-aggressive, and that each individual has an equal, finite resource. The strength of an individual is described by a function of resource consumption in discrete development stages. Each individual undergoes logistic growth, so we divide the development into three stages: initialization, quasilinear growth, and termination. We first discuss the theoretical framework for using dynamic programming to calculate the payoff matrix, then give three numerical examples representing three different types of aggressive individuals and calculate the payoff matrix for each. Based on the numerical payoff matrices, we further investigate the evolutionarily stable strategies (ESS) of the games.
Submitted 27 April, 2009;
originally announced April 2009.
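The last step described in the abstract above, reading evolutionarily stable strategies off a numerical 2x2 payoff matrix, can be illustrated with a minimal sketch. It assumes a symmetric game and applies the standard Maynard Smith ESS conditions; it is not the authors' dynamic-programming method, and the function name `ess_2x2` and the payoff values in the usage note are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def ess_2x2(a, b, c, d):
    """List the ESSs of a symmetric 2x2 game (illustrative sketch).

    Payoffs are to the focal individual (assumed layout):
        a = E(A vs A), b = E(A vs B), c = E(B vs A), d = E(B vs B)
    Each result is (kind, p), where p is the probability of playing A.
    """
    result = []
    # Pure A is an ESS if it is a strict best reply to itself, or ties
    # against B while doing better than B does against B.
    if a > c or (a == c and b > d):
        result.append(("pure", Fraction(1)))
    # Same condition with the roles of A and B swapped.
    if d > b or (d == b and c > a):
        result.append(("pure", Fraction(0)))
    # Neither pure strategy resists invasion: the stable state is the mix
    # that equalises the payoffs of A and B.
    if a < c and d < b:
        p = Fraction(b - d) / Fraction((b - d) + (c - a))
        result.append(("mixed", p))
    return result
```

For example, with hypothetical Hawk-Dove style payoffs (A aggressive, B non-aggressive) `ess_2x2(-1, 2, 0, 1)` reports a single mixed ESS in which A is played half of the time, whereas `ess_2x2(3, 0, 5, 1)` reports pure B as the only ESS.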