-
Deep Learning of Proteins with Local and Global Regions of Disorder
Authors:
Oufan Zhang,
Zi Hao Liu,
Julie D Forman-Kay,
Teresa Head-Gordon
Abstract:
Although machine learning has transformed protein structure prediction of folded protein ground states with remarkable accuracy, intrinsically disordered proteins and regions (IDPs/IDRs) are defined by diverse and dynamical structural ensembles that are predicted with low confidence by algorithms such as AlphaFold. We present a new machine learning method, IDPForge (Intrinsically Disordered Protein, FOlded and disordered Region GEnerator), that exploits a transformer protein language diffusion model to create all-atom IDP ensembles and IDR disordered ensembles that maintain the folded domains. IDPForge does not require sequence-specific training, back transformations from coarse-grained representations, or ensemble reweighting, as in general the created IDP/IDR conformational ensembles show good agreement with solution experimental data, and options for biasing with experimental restraints are provided if desired. We envision that IDPForge with these diverse capabilities will facilitate integrative and structural studies for proteins that contain intrinsic disorder.
Submitted 29 March, 2025; v1 submitted 16 February, 2025;
originally announced February 2025.
-
Biological Insights from Integrative Modeling of Intrinsically Disordered Protein Systems
Authors:
Zi Hao Liu,
Maria Tsanai,
Oufan Zhang,
Teresa Head-Gordon,
Julie Forman-Kay
Abstract:
Intrinsically disordered proteins and regions are increasingly appreciated for their abundance in the proteome and the many functional roles they play in the cell. In this short review, we describe a variety of approaches used to obtain biological insight from the structural ensembles of disordered proteins, regions, and complexes and the integrative biology challenges that arise from combining diverse experiments and computational models. Importantly, we highlight findings regarding structural and dynamic characterization of disordered regions involved in binding and phase separation, as well as drug targeting of disordered regions, using a broad framework of integrative modeling approaches.
Submitted 27 December, 2024;
originally announced December 2024.
-
Computational Methods to Investigate Intrinsically Disordered Proteins and their Complexes
Authors:
Zi Hao Liu,
Maria Tsanai,
Oufan Zhang,
Julie Forman-Kay,
Teresa Head-Gordon
Abstract:
In 1999 Wright and Dyson highlighted the fact that large sections of the proteome of all organisms are comprised of protein sequences that lack globular folded structures under physiological conditions. Since then the biophysics community has made significant strides in unraveling the intricate structural and dynamic characteristics of intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs). Unlike crystallographic beamlines and their role in streamlining acquisition of structures for folded proteins, an integrated experimental and computational approach aimed at IDPs/IDRs has emerged. In this Perspective we aim to provide a robust overview of current computational tools for IDPs and IDRs, and most recently their complexes and phase separated states, including statistical models, physics-based approaches, and machine learning methods that permit structural ensemble generation and validation against many solution experimental data types.
Submitted 3 September, 2024;
originally announced September 2024.
-
Electrostatics of Salt-Dependent Reentrant Phase Behaviors Highlights Diverse Roles of ATP in Biomolecular Condensates
Authors:
Yi-Hsuan Lin,
Tae Hun Kim,
Suman Das,
Tanmoy Pal,
Jonas Wessén,
Atul Kaushik Rangadurai,
Lewis E. Kay,
Julie D. Forman-Kay,
Hue Sun Chan
Abstract:
Liquid-liquid phase separation (LLPS) involving intrinsically disordered protein regions (IDRs) is a major physical mechanism for biological membraneless compartmentalization. The multifaceted electrostatic effects in these biomolecular condensates are exemplified here by experimental and theoretical investigations of the different salt- and ATP-dependent LLPSs of an IDR of messenger RNA-regulating protein Caprin1 and its phosphorylated variant pY-Caprin1, exhibiting, e.g., reentrant behaviors in some instances but not others. Experimental data are rationalized by physical modeling using analytical theory, molecular dynamics, and polymer field-theoretic simulations, indicating that interchain ion bridges enhance LLPS of polyelectrolytes such as Caprin1 and the high valency of ATP-magnesium is a significant factor for its colocalization with the condensed phases, as similar trends are observed for other IDRs. The electrostatic nature of these features complements ATP's involvement in $π$-related interactions and as an amphiphilic hydrotrope, underscoring a general role of biomolecular condensates in modulating ion concentrations and its functional ramifications.
Submitted 31 December, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
Configurational Entropy of Folded Proteins and its Importance for Intrinsically Disordered Proteins
Authors:
Meili Liu,
Akshaya K. Das,
James Lincoff,
Sukanya Sasmal,
Sara Y. Cheng,
Robert Vernon,
Julie Forman-Kay,
Teresa Head-Gordon
Abstract:
Many pairwise additive force fields are in active use for intrinsically disordered proteins (IDPs) and regions (IDRs), some of which modify energetic terms to improve description of IDPs/IDRs, but are largely in disagreement with solution experiments for the disordered states. We have evaluated representative pairwise and many-body protein and water force fields against experimental data on representative IDPs and IDRs, a peptide that undergoes a disorder-to-order transition, and for seven globular proteins ranging in size from 130-266 amino acids. We find that force fields with the largest statistical fluctuations consistent with the radius of gyration and universal Lindemann values for folded states simultaneously better describe IDPs and IDRs and disorder to order transitions. Hence the crux of what a force field should exhibit to well describe IDRs/IDPs is not just the balance between protein and water energetics, but the balance between energetic effects and configurational entropy of folded states of globular proteins.
Submitted 11 November, 2020; v1 submitted 12 July, 2020;
originally announced July 2020.
-
Comparative Roles of Charge, $π$ and Hydrophobic Interactions in Sequence-Dependent Phase Separation of Intrinsically Disordered Proteins
Authors:
Suman Das,
Yi-Hsuan Lin,
Robert M. Vernon,
Julie D. Forman-Kay,
Hue Sun Chan
Abstract:
Endeavoring toward a transferable, predictive coarse-grained explicit-chain model for biomolecular condensates underlain by liquid-liquid phase separation (LLPS), we conducted multiple-chain simulations of the N-terminal intrinsically disordered region (IDR) of DEAD-box helicase Ddx4, as a test case, to assess the roles of electrostatic, hydrophobic, cation-$π$, and aromatic interactions in amino acid sequence-dependent LLPS. We evaluated 3 residue-residue interaction schemes with a shared electrostatic potential. Neither a common hydrophobicity scheme nor one augmented with arginine/lysine-aromatic cation-$π$ interactions consistently accounted for the experimental LLPS data on the wildtype, a charge-scrambled, an FtoA, and an RtoK mutant of Ddx4 IDR. In contrast, interactions based on contact statistics among folded globular protein structures reproduce the overall experimental trend, including that the RtoK mutant has a much diminished LLPS propensity. Consistency between simulation and LLPS experiment was also found for RtoK mutants of P-granule protein LAF-1, underscoring that, to a degree, the important LLPS-driving $π$-related interactions are embodied in classical statistical potentials. Further elucidation will be necessary, however, especially of phenylalanine's role in condensate assembly because experiments on FtoA and YtoF mutants suggest that LLPS-driving phenylalanine interactions are significantly weaker than those posited by common statistical potentials. Protein-protein electrostatic interactions are modulated by relative permittivity, which depends on protein concentration. Analytical theory suggests that this dependence entails enhanced inter-protein interactions in the condensed phase but more favorable protein-solvent interactions in the dilute phase. The opposing trends lead to a modest overall impact on LLPS.
Submitted 6 October, 2020; v1 submitted 13 May, 2020;
originally announced May 2020.
-
Extended Experimental Inferential Structure Determination Method for Evaluating the Structural Ensembles of Disordered Protein States
Authors:
James Lincoff,
Mickael Krzeminski,
Mojtaba Haghighatlari,
João M. C. Teixeira,
Gregory-Neal W. Gomes,
Claudiu C. Gradinaru,
Julie D. Forman-Kay,
Teresa Head-Gordon
Abstract:
Characterization of proteins with intrinsic or unfolded state disorder comprises a new frontier in structural biology, requiring the characterization of diverse and dynamic structural ensembles. We introduce a comprehensive Bayesian framework, the Extended Experimental Inferential Structure Determination (X-EISD) method, that calculates the maximum log-likelihood of a protein structural ensemble by accounting for the uncertainties of a wide range of experimental data and back-calculation models from structures, including NMR chemical shifts, J-couplings, Nuclear Overhauser Effects, paramagnetic relaxation enhancements, residual dipolar couplings, and hydrodynamic radii, single molecule fluorescence Förster resonance energy transfer efficiencies and small angle X-ray scattering intensity curves. We apply X-EISD to the drkN SH3 unfolded state domain and show that certain experimental data types are more influential than others for both eliminating structural ensemble models, while also finding equally probable disordered ensembles that have alternative structural properties that will stimulate further experiments to discriminate between them.
Submitted 29 December, 2019;
originally announced December 2019.
-
Charge Pattern Matching as a "Fuzzy" Mode of Molecular Recognition for the Functional Phase Separations of Intrinsically Disordered Proteins
Authors:
Yi-Hsuan Lin,
Jacob P. Brady,
Julie D. Forman-Kay,
Hue Sun Chan
Abstract:
Biologically functional liquid-liquid phase separation of intrinsically disordered proteins (IDPs) is driven by interactions encoded by their amino acid sequences. Little is currently known about the molecular recognition mechanisms for distributing different IDP sequences into various cellular membraneless compartments. Pertinent physics was addressed recently by applying random-phase-approximation (RPA) polymer theory to electrostatics, which is a major energetic component governing IDP phase properties. RPA accounts for charge patterns and thus has advantages over Flory-Huggins and Overbeek-Voorn mean-field theories. To make progress toward deciphering the phase behaviors of multiple IDP sequences, the RPA formulation for one IDP species plus solvent is hereby extended to treat polyampholyte solutions containing two IDP species. The new formulation generally allows for binary coexistence of two phases, each containing a different set of volume fractions $(φ_1,φ_2)$ for the two different IDP sequences. The asymmetry between the two predicted coexisting phases with regard to their $φ_1/φ_2$ ratios for the two sequences increases with increasing mismatch between their charge patterns. This finding points to a multivalent, stochastic, "fuzzy" mode of molecular recognition that helps populate various IDP sequences differentially into separate phase compartments. An intuitive illustration of this trend is provided by Flory-Huggins models, whereby a hypothetical case of ternary coexistence is also explored. Augmentations of the present RPA theory with a relative permittivity $ε_{\rm r}(φ)$ that depends on IDP volume fraction $φ=φ_1+φ_2$ lead to higher propensities to phase separate, in line with the case with one IDP species we studied previously. ...
Submitted 19 October, 2017; v1 submitted 27 July, 2017;
originally announced July 2017.
-
Random-phase-approximation theory for sequence-dependent, biologically functional liquid-liquid phase separation of intrinsically disordered proteins
Authors:
Yi-Hsuan Lin,
Jianhui Song,
Julie D. Forman-Kay,
Hue Sun Chan
Abstract:
Intrinsically disordered proteins (IDPs) are typically low in nonpolar/hydrophobic but relatively high in polar, charged, and aromatic amino acid compositions. Some IDPs undergo liquid-liquid phase separation in the aqueous milieu of the living cell. The resulting phase with enhanced IDP concentration can function as a major component of membraneless organelles that, by creating their own IDP-rich microenvironments, stimulate critical biological functions. IDP phase behaviors are governed by their amino acid sequences. To make progress in understanding this sequence-phase relationship, we report further advances in a recently introduced application of random-phase-approximation (RPA) heteropolymer theory to account for sequence-specific electrostatics in IDP phase separation. Here we examine computed variations in phase behavior with respect to block length and charge density of model polyampholytes of alternating equal-length charge blocks to gain insight into trends observed in IDP phase separation. As a real-life example, the theory is applied to rationalize/predict binodal and spinodal phase behaviors of the 236-residue N-terminal disordered region of RNA helicase Ddx4 and its charge-scrambled mutant for which experimental data are available. Fundamental differences are noted between the phase diagrams predicted by RPA and those predicted by mean-field Flory-Huggins and Overbeek-Voorn/Debye-Hückel theories. In the RPA context, a physically plausible dependence of relative permittivity on protein concentration can produce a cooperative effect in favor of IDP-IDP attraction and thus a significant increased tendency to phase separate. Ramifications of these findings for future development of IDP phase separation theory are discussed.
Submitted 26 September, 2016;
originally announced September 2016.
-
Sequence-specific polyampholyte phase separation in membraneless organelles
Authors:
Yi-Hsuan Lin,
Julie D. Forman-Kay,
Hue Sun Chan
Abstract:
Liquid-liquid phase separation of charge/aromatic-enriched intrinsically disordered proteins (IDPs) is critical in the biological function of membraneless organelles. Much of the physics of this recent discovery remains to be elucidated. Here we present a theory in the random phase approximation to account for electrostatic effects in polyampholyte phase separations, yielding predictions consistent with recent experiments on the IDP Ddx4. The theory is applicable to any charge pattern and thus provides a general analytical framework for studying sequence dependence of IDP phase separation.
Submitted 21 September, 2016; v1 submitted 29 May, 2016;
originally announced May 2016.