-
AutoLoop: a novel autoregressive deep learning method for protein loop prediction with high accuracy
Authors:
Tianyue Wang,
Xujun Zhang,
Langcheng Wang,
Odin Zhang,
Jike Wang,
Ercheng Wang,
Jialu Wu,
Renling Hu,
Jingxuan Ge,
Shimeng Li,
Qun Su,
Jiajun Yu,
Chang-Yu Hsieh,
Tingjun Hou,
Yu Kang
Abstract:
Protein structure prediction is a critical and longstanding challenge in biology, garnering widespread interest due to its significance in understanding biological processes. A particular area of focus is the prediction of missing loops in proteins, which are vital in determining protein function and activity. To address this challenge, we propose AutoLoop, a novel computational model designed to automatically generate accurate loop backbone conformations that closely resemble their natural structures. AutoLoop employs a bidirectional training approach while merging atom- and residue-level embedding, thus improving robustness and precision. We compared AutoLoop with twelve established methods, including FREAD, NGK, AlphaFold2, and AlphaFold3. AutoLoop consistently outperforms other methods, achieving a median RMSD of 1.12 Angstrom and a 2-Angstrom success rate of 73.23% on the CASP15 dataset, while maintaining strong performance on the HOMSTRAD dataset. It demonstrates the best performance across nearly all loop lengths and secondary structural types. Beyond accuracy, AutoLoop is computationally efficient, requiring only 0.10 s per generation. A post-processing module for side-chain packing and energy minimization further improves results slightly, confirming the reliability of the predicted backbone. A case study also highlights AutoLoop's potential for precise predictions based on dominant loop conformations. These advances hold promise for protein engineering and drug discovery.
Submitted 5 May, 2025;
originally announced May 2025.
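The two accuracy figures quoted in the AutoLoop abstract, median RMSD and the 2-Angstrom success rate, can be illustrated with a minimal sketch. The function names `rmsd` and `summarize` are hypothetical, the coordinates are assumed to be already superimposed in a common frame, and counting a loop as a success when its RMSD is strictly below the cutoff is an assumption; the abstract does not state the exact protocol.

```python
import math
from statistics import median


def rmsd(pred, ref):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes both structures are already expressed in the same coordinate frame.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def summarize(rmsds, cutoff=2.0):
    """Return (median RMSD, percentage of loops with RMSD below the cutoff)."""
    hits = sum(1 for value in rmsds if value < cutoff)  # strict "<" is an assumption
    return median(rmsds), 100.0 * hits / len(rmsds)
```

For example, `summarize([0.5, 1.0, 3.0, 1.5])` gives a median of 1.25 and a success rate of 75.0, since three of the four illustrative values fall below 2.0.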
-
SparseFocus: Learning-based One-shot Autofocus for Microscopy with Sparse Content
Authors:
Yongping Zhai,
Xiaoxi Fu,
Qiang Su,
Jia Hu,
Yake Zhang,
Yunfeng Zhou,
Chaofan Zhang,
Xiao Li,
Wenxin Wang,
Dongdong Wu,
Shen Yan
Abstract:
Autofocus is necessary for high-throughput and real-time scanning in microscopic imaging. Traditional methods rely on complex hardware or iterative hill-climbing algorithms. Recent learning-based approaches have demonstrated remarkable efficacy in a one-shot setting, avoiding hardware modifications or iterative mechanical lens adjustments. However, in this paper, we highlight a significant challenge: the richness of image content can strongly affect autofocus performance. When the image content is sparse, previous autofocus methods, whether traditional hill-climbing or learning-based, tend to fail. To tackle this, we propose a content-importance-based solution, named SparseFocus, featuring a novel two-stage pipeline. The first stage measures the importance of regions within the image, while the second stage calculates the defocus distance from selected important regions. To validate our approach and benefit the research community, we collect a large-scale dataset comprising millions of labelled defocused images, encompassing dense, sparse, and extremely sparse scenarios. Experimental results show that SparseFocus surpasses existing methods, effectively handling all levels of content sparsity. Moreover, we integrate SparseFocus into our Whole Slide Imaging (WSI) system, which performs well in real-world applications. The code and dataset will be made available upon the publication of this paper.
Submitted 10 February, 2025;
originally announced February 2025.
-
Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures
Authors:
Ce Liu,
Jun Wang,
Zhiqiang Cai,
Yingxu Wang,
Huizhen Kuang,
Kaihui Cheng,
Liwei Zhang,
Qingkun Su,
Yining Tang,
Fenglei Cao,
Limei Han,
Siyu Zhu,
Yuan Qi
Abstract:
Despite significant progress in static protein structure collection and prediction, the dynamic behavior of proteins, one of their most vital characteristics, has been largely overlooked in prior research. This oversight can be attributed to the limited availability, diversity, and heterogeneity of dynamic protein datasets. To address this gap, we propose to enhance existing prestigious static 3D protein structural databases, such as the Protein Data Bank (PDB), by integrating dynamic data and additional physical properties. Specifically, we introduce a large-scale dataset, Dynamic PDB, encompassing approximately 12.6K proteins, each subjected to all-atom molecular dynamics (MD) simulations lasting 1 microsecond to capture conformational changes. Furthermore, we provide a comprehensive suite of physical properties, including atomic velocities and forces, potential and kinetic energies of proteins, and the temperature of the simulation environment, recorded at 1 picosecond intervals throughout the simulations. For benchmarking purposes, we evaluate state-of-the-art methods on the proposed dataset for the task of trajectory prediction. To demonstrate the value of integrating richer physical properties in the study of protein dynamics and related model design, we base our approach on the SE(3) diffusion model and incorporate these physical properties into the trajectory prediction process. Preliminary results indicate that this straightforward extension of the SE(3) model yields improved accuracy, as measured by MAE and RMSD, when the proposed physical properties are taken into consideration. See https://fudan-generative-vision.github.io/dynamicPDB/.
Submitted 18 September, 2024; v1 submitted 22 August, 2024;
originally announced August 2024.
-
Token-Mol 1.0: Tokenized drug design with large language model
Authors:
Jike Wang,
Rui Qin,
Mingyang Wang,
Meijing Fang,
Yangyang Zhang,
Yuchen Zhu,
Qun Su,
Qiaolin Gou,
Chao Shen,
Odin Zhang,
Zhenxing Wu,
Dejun Jiang,
Xujun Zhang,
Huifeng Zhao,
Xiaozhe Wan,
Zhourui Wu,
Liwei Liu,
Yu Kang,
Chang-Yu Hsieh,
Tingjun Hou
Abstract:
Significant interest has recently arisen in leveraging sequence-based large language models (LLMs) for drug design. However, most current applications of LLMs in drug discovery lack the ability to comprehend three-dimensional (3D) structures, thereby limiting their effectiveness in tasks that explicitly involve molecular conformations. In this study, we introduce Token-Mol, a token-only 3D drug design model. This model encodes all molecular information, including 2D and 3D structures, as well as molecular property data, into tokens, which transforms classification and regression tasks in drug discovery into probabilistic prediction problems, thereby enabling learning through a unified paradigm. Token-Mol is built on the transformer decoder architecture and trained using random causal masking techniques. Additionally, we propose the Gaussian cross-entropy (GCE) loss function to overcome the challenges in regression tasks, significantly enhancing the capacity of LLMs to learn continuous numerical values. Through a combination of fine-tuning and reinforcement learning (RL), Token-Mol achieves performance comparable to or surpassing existing task-specific methods across various downstream tasks, including pocket-based molecular generation, conformation generation, and molecular property prediction. Compared to existing molecular pre-trained models, Token-Mol exhibits superior proficiency in handling a wider range of downstream tasks essential for drug design. Notably, our approach improves regression task accuracy by approximately 30% compared to similar token-only methods. Token-Mol overcomes the precision limitations of token-only models and has the potential to integrate seamlessly with general models such as ChatGPT, paving the way for the development of a universal artificial intelligence drug design model that facilitates rapid and high-quality drug design by experts.
Submitted 19 August, 2024; v1 submitted 10 July, 2024;
originally announced July 2024.
-
Delete: Deep Lead Optimization Enveloped in Protein Pocket through Unified Deleting Strategies and a Structure-aware Network
Authors:
Haotian Zhang,
Huifeng Zhao,
Xujun Zhang,
Qun Su,
Hongyan Du,
Chao Shen,
Zhe Wang,
Dan Li,
Peichen Pan,
Guangyong Chen,
Yu Kang,
Chang-Yu Hsieh,
Tingjun Hou
Abstract:
Drug discovery is a highly complicated process, and it is infeasible to fully commit it to the recently developed molecular generation methods. Deep learning-based lead optimization takes expert knowledge as a starting point, learning from numerous historical cases about how to modify the structure for better drug-forming properties. However, compared with the more established de novo generation schemes, lead optimization is still an area that requires further exploration. Previously developed models are often limited to addressing only one or a few subtasks of lead optimization, and most of them can only generate the two-dimensional structures of molecules while disregarding the vital protein-ligand interactions based on the three-dimensional binding poses. To address these challenges, we present a novel tool for lead optimization, named Delete (Deep lead optimization enveloped in protein pocket). Our model can handle all subtasks of lead optimization involving fragment growing, linking, and replacement through a unified deleting (masking) strategy, and is aware of the intricate pocket-ligand interactions through the geometric design of networks. Statistical evaluations and case studies conducted on individual subtasks demonstrate that Delete has a significant ability to produce molecules with superior binding affinities to protein targets and reasonable drug-likeness from given fragments or atoms. This feature may assist medicinal chemists in developing not only me-too/me-better products from existing drugs but also hit-to-lead for first-in-class drugs in a highly efficient manner.
Submitted 4 August, 2023;
originally announced August 2023.
-
Strategy evolution on dynamic networks
Authors:
Qi Su,
Alex McAvoy,
Joshua B. Plotkin
Abstract:
Models of strategy evolution on static networks help us understand how population structure can promote the spread of traits like cooperation. One key mechanism is the formation of altruistic spatial clusters, where neighbors of a cooperative individual are likely to reciprocate, which protects prosocial traits from exploitation. But most real-world interactions are ephemeral and subject to exogenous restructuring, so that social networks change over time. Strategic behavior on dynamic networks is difficult to study, and much less is known about the resulting evolutionary dynamics. Here, we provide an analytical treatment of cooperation on dynamic networks, allowing for arbitrary spatial and temporal heterogeneity. We show that transitions among a large class of network structures can favor the spread of cooperation, even if each individual social network would inhibit cooperation when static. Furthermore, we show that spatial heterogeneity tends to inhibit cooperation, whereas temporal heterogeneity tends to promote it. Dynamic networks can have profound effects on the evolution of prosocial traits, even when individuals have no agency over network structures.
Submitted 5 September, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
The arrow of evolution when the offspring variance is large
Authors:
Guocheng Wang,
Qi Su,
Long Wang,
Joshua B. Plotkin
Abstract:
The concept of fitness is central to evolution, but it quantifies only the expected number of offspring an individual will produce. The actual number of offspring is also subject to noise, arising from environmental or demographic stochasticity. In nature, individuals who are more fecund tend to have greater variance in their offspring number -- sometimes far greater than the Poisson variance assumed in classical models of population genetics. Here, we develop a model for the evolution of two types reproducing in a population of non-constant size. The frequency-dependent fitness of each type is determined by pairwise interactions in a prisoner's dilemma game, but the offspring number is subject to an exogenously controlled variance that may depend upon the mean. Whereas defectors are preferred by natural selection in classical well-mixed populations, since they always have greater fitness than cooperators, we show that large offspring variance can reverse the direction of evolution and favor cooperation. Reproductive over-dispersion produces qualitatively new dynamics for other types of social interactions, as well, which cannot arise in populations with a fixed size or Poisson offspring variance.
Submitted 6 September, 2022;
originally announced September 2022.
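The classical baseline mentioned in this abstract, that defectors always have greater fitness than cooperators in a well-mixed population, can be checked with a short sketch. It assumes the donation-game form of the prisoner's dilemma (a cooperator pays a cost `c` to give a partner a benefit `b`), a large population, and a cooperator fraction `x`; these parameter names and the function name are illustrative assumptions and are not taken from the paper, whose model additionally includes offspring variance.

```python
def donation_game_payoffs(x, b, c):
    """Expected payoffs (cooperator, defector) in a well-mixed population.

    x: fraction of cooperators among interaction partners (0 <= x <= 1)
    b: benefit received from a cooperating partner
    c: cost paid by a cooperator
    Assumes the donation-game parameterization of the prisoner's dilemma.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be a fraction between 0 and 1")
    payoff_cooperator = b * x - c  # receives b from cooperators, always pays c
    payoff_defector = b * x        # receives b from cooperators, pays nothing
    return payoff_cooperator, payoff_defector
```

Under these assumptions the defector's advantage is exactly `c` for every value of `x`, which is the mean-fitness ordering that, according to the abstract, large offspring variance can overturn.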
-
Evolution of cooperation with joint liability
Authors:
Guocheng Wang,
Qi Su,
Long Wang
Abstract:
"Personal responsibility", one of the basic principles of modern law, requires individuals to be responsible for their own actions. However, personal responsibility is far from the only norm ruling human interactions, especially in social and economic activities. In many collective communities such as among enterprise colleagues and family members, one's personal interests are often bound to others' -- once one member breaks the rule, a group of people has to bear the punishment or sanction. Such a mechanism is termed "joint liability". Although many real-world cases have demonstrated that joint liability helps to maintain collective collaboration, a deep and systematic theoretical analysis of how and when joint liability promotes cooperation is lacking. Here we use evolutionary game theory to model an interacting system with joint liability, where one member's loss of credit could damage the reputation of the whole group. We provide the analytical condition to predict when cooperation evolves in the presence of joint liability, which is verified by simulations. We also analytically prove that joint liability can greatly promote cooperation. Our work stresses that joint liability is of great significance in promoting the current economic propensity.
Submitted 14 September, 2021; v1 submitted 12 September, 2021;
originally announced September 2021.
-
Payoff Control in Repeated Games
Authors:
Renfei Tan,
Qi Su,
Bin Wu,
Long Wang
Abstract:
Evolutionary game theory is a powerful mathematical framework to study how intelligent individuals adjust their strategies in collective interactions. It has been widely believed that it is impossible to unilaterally control players' payoffs in games, since payoffs are jointly determined by all players. Recently, however, a class of so-called zero-determinant strategies was revealed, which enables a player to exert unilateral payoff control over her partners in two-action repeated games with a constant continuation probability. The existing methods, however, lead to the curse of dimensionality when the complexity of games increases. In this paper, we propose a new mathematical framework to study ruling strategies (with which a player unilaterally enforces a linear relation on players' payoffs) in repeated games with an arbitrary number of actions or players, and arbitrary continuation probability. We establish an existence theorem of ruling strategies and develop an algorithm to find them. In particular, we prove that a strict Markov ruling strategy exists only if either the repeated game proceeds for an infinite number of rounds, or every round is repeated with the same probability. The proposed mathematical framework also enables the search for collaborative ruling strategies for an alliance to control outsiders. Our method provides novel theoretical insights into payoff control in complex repeated games, which overcomes the curse of dimensionality.
Submitted 28 August, 2021;
originally announced August 2021.
-
Evolution of cooperation with asymmetric social interactions
Authors:
Qi Su,
Joshua B. Plotkin
Abstract:
How cooperation emerges in human societies is both an evolutionary enigma, and a practical problem with tangible implications for societal health. Population structure has long been recognized as a catalyst for cooperation because local interactions enable reciprocity. Analysis of this phenomenon typically assumes bi-directional social interactions, even though real-world interactions are often uni-directional. Uni-directional interactions -- where one individual has the opportunity to contribute altruistically to another, but not conversely -- arise in real-world populations as the result of organizational hierarchies, social stratification, popularity effects, and endogenous mechanisms of network growth. Here we expand the theory of cooperation in structured populations to account for both uni- and bi-directional social interactions. Even though directed interactions remove the opportunity for reciprocity, we find that cooperation can nonetheless be favored in directed social networks and that cooperation is provably maximized for networks with an intermediate proportion of directed interactions, as observed in many empirical settings. We also identify two simple structural motifs that allow efficient modification of interaction directionality to promote cooperation by orders of magnitude. We discuss how our results relate to the concepts of generalized and indirect reciprocity.
Submitted 20 May, 2021; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Evolution of prosocial behavior in multilayer populations
Authors:
Qi Su,
Alex McAvoy,
Yoichiro Mori,
Joshua B. Plotkin
Abstract:
Human societies include diverse social relationships. Friends, family, business colleagues, and online contacts can all contribute to one's social life. Individuals may behave differently in different domains, but success in one domain may engender success in another. Here, we study this problem using multilayer networks to model multiple domains of social interactions, in which individuals experience different environments and may express different behaviors. We provide a mathematical analysis and find that coupling between layers tends to promote prosocial behavior. Even if prosociality is disfavored in each layer alone, multilayer coupling can promote its proliferation in all layers simultaneously. We apply this analysis to six real-world multilayer networks, ranging from the socio-emotional and professional relationships in a Zambian community, to the online and offline relationships within an academic University. We discuss the implications of our results, which suggest that small modifications to interactions in one domain may catalyze prosociality in a different domain.
Submitted 25 October, 2021; v1 submitted 3 October, 2020;
originally announced October 2020.
-
Evolutionary dynamics with game transitions
Authors:
Qi Su,
Alex McAvoy,
Long Wang,
Martin A. Nowak
Abstract:
The environment has a strong influence on a population's evolutionary dynamics. Driven by both intrinsic and external factors, the environment is subject to continual change in nature. To capture an ever-changing environment, we consider a model of evolutionary dynamics with game transitions, where individuals' behaviors together with the games they play in one time step influence the games to be played in the next time step. Within this model, we study the evolution of cooperation in structured populations and find a simple rule: weak selection favors cooperation over defection if the ratio of the benefit provided by an altruistic behavior, $b$, to the corresponding cost, $c$, exceeds $k-k'$, where $k$ is the average number of neighbors of an individual and $k'$ captures the effects of the game transitions. Even if cooperation cannot be favored in each individual game, allowing for a transition to a relatively valuable game after mutual cooperation and to a less valuable game after defection can result in a favorable outcome for cooperation. In particular, small variations in the different games being played can promote cooperation markedly. Our results suggest that simple game transitions can serve as a mechanism for supporting prosocial behaviors in highly-connected populations.
Submitted 27 November, 2019; v1 submitted 24 May, 2019;
originally announced May 2019.
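The simple rule stated in this abstract, that weak selection favors cooperation when $b/c$ exceeds $k-k'$, can be written as a one-line check. The symbols `b`, `c`, `k`, and `k_prime` follow the abstract; the function name and the sample numbers are hypothetical, and `k_prime` is taken as an input because the abstract does not say how it is computed from the game transitions.

```python
def favors_cooperation(b, c, k, k_prime):
    """Return True if b/c > k - k', the condition quoted in the abstract.

    b: benefit provided by an altruistic act
    c: cost of that act (must be positive)
    k: average number of neighbors of an individual
    k_prime: term capturing the effect of game transitions (supplied by the caller)
    """
    if c <= 0:
        raise ValueError("cost c must be positive")
    return b / c > k - k_prime
```

With the illustrative values `b=3`, `c=1`, `k=4`, the condition fails when `k_prime=0` (the threshold is 4) and holds when `k_prime=2` (the threshold drops to 2), which mirrors the abstract's point that game transitions can lower the barrier to cooperation.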
-
Evolutionary multiplayer games on graphs with edge diversity
Authors:
Qi Su,
Lei Zhou,
Long Wang
Abstract:
Evolutionary game dynamics in structured populations has been extensively explored in past decades. However, most previous studies assume that payoffs of individuals are fully determined by the strategic behaviors of interacting parties, and that social ties between them only serve as the indicator of the existence of interactions. This assumption neglects important information carried by inter-personal social ties such as genetic similarity, geographic proximity, and social closeness, which may crucially affect the outcome of interactions. To model these situations, we present a framework of evolutionary multiplayer games on graphs with edge diversity, where different types of edges describe diverse social ties. Strategic behaviors together with social ties determine the resulting payoffs of interactants. Under weak selection, we provide a general formula to predict the success of one behavior over the other. We apply this formula to various examples which cannot be dealt with using previous models, including the division of labor and relationship- or edge-dependent games. We find that labor division facilitates collective cooperation by decomposing a many-player game into several games of smaller sizes. The evolutionary process based on relationship-dependent games can be approximated by interactions under a transformed and unified game. Our work stresses the importance of social ties and provides effective methods to reduce the computational complexity in analyzing the evolution of realistic systems.
Submitted 17 October, 2018;
originally announced October 2018.
-
Evolution of Cooperation on Temporal Networks
Authors:
Aming Li,
Lei Zhou,
Qi Su,
Sean P. Cornelius,
Yang-Yu Liu,
Long Wang
Abstract:
The structure of social networks is a key determinant in fostering cooperation and other altruistic behavior among naturally selfish individuals. However, most real social interactions are temporal, being both finite in duration and spread out over time. This raises the question of whether stable cooperation can form despite an intrinsically fragmented social fabric. Here we develop a framework to study the evolution of cooperation on temporal networks in the setting of the classic Prisoner's Dilemma. By analyzing both real and synthetic datasets, we find that temporal networks generally facilitate the evolution of cooperation compared to their static counterparts. More interestingly, we find that intrinsic human interaction patterns such as bursty behavior impede the evolution of cooperation. Finally, we introduce a measure to quantify the temporality present in networks and demonstrate that there is an intermediate level of temporality that boosts cooperation most. Our results open a new avenue for investigating the evolution of cooperation in more realistic structured populations.
Submitted 24 September, 2016;
originally announced September 2016.