-
Interpreting and Steering Protein Language Models through Sparse Autoencoders
Authors:
Edith Natalia Villegas Garcia,
Alessio Ansuini
Abstract:
The rapid advancements in transformer-based language models have revolutionized natural language processing, yet understanding the internal mechanisms of these models remains a significant challenge. This paper explores the application of sparse autoencoders (SAE) to interpret the internal representations of protein language models, specifically focusing on the ESM-2 8M parameter model. By performing a statistical analysis on each latent component's relevance to distinct protein annotations, we identify potential interpretations linked to various protein characteristics, including transmembrane regions, binding sites, and specialized motifs.
We then leverage these insights to guide sequence generation, shortlisting the relevant latent components that can steer the model towards desired targets such as zinc finger domains. This work contributes to the emerging field of mechanistic interpretability in biological sequence models, offering new perspectives on model steering for sequence design.
Submitted 13 February, 2025;
originally announced February 2025.
-
What comes next? response times are affected by mispredictions in a stochastic game
Authors:
Paulo Roberto Cabral-Passos,
Antonio Galves,
Jesus Enrique Garcia,
Claudia Domingues Vargas
Abstract:
Acting as a goalkeeper in a video-game, a participant is asked to predict the successive choices of the penalty taker. The sequence of choices of the penalty taker is generated by a stochastic chain with memory of variable length. It has been conjectured that the probability distribution of the response times is a function of the specific sequence of past choices governing the algorithm used by the penalty taker to make his choice at each step. We found empirical evidence that besides this dependence, the distribution of the response times depends also on the success or failure of the previous prediction made by the participant. Moreover, we found statistical evidence that this dependence propagates up to two steps forward after the prediction failure.
Submitted 18 September, 2023;
originally announced September 2023.
-
Competitive exclusion and Hebbian couplings in random generalised Lotka-Volterra systems
Authors:
Enrique Rozas Garcia,
Mark J. Crumpton,
Tobias Galla
Abstract:
We study communities emerging from generalised random Lotka--Volterra dynamics with a large number of species with interactions determined by the degree of niche overlap. Each species is endowed with a number of traits, and competition between pairs of species increases with their similarity in trait space. This leads to a model with random Hopfield-like interactions. We use tools from the theory of disordered systems, notably dynamic mean field theory, to characterise the statistics of the resulting communities at stable fixed points and determine analytically when stability breaks down. Two distinct types of transition are identified in this way, both marked by diverging abundances, but differing in the behaviour of the integrated response function. At fixed points only a fraction of the initial pool of species survives. We numerically study the eigenvalue spectra of the interaction matrix between extant species. We find evidence that the two types of dynamical transition are, respectively, associated with the bulk spectrum or an outlier eigenvalue crossing into the right half of the complex plane.
Submitted 29 November, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Common patterns between dengue cases, climate, and local environmental variables in Costa Rica: A Wavelet Approach
Authors:
Yury E. Garcia,
Shu-Wei Chou-Chen,
Luis A. Barboza,
Maria L. Daza-Torres,
J. Cricelio Montesinos-Lopez,
Paola Vasquez,
Juan G. Calvo,
Miriam Nuno,
Fabio Sanchez
Abstract:
Throughout history, prevention and control of dengue transmission have challenged public health authorities worldwide. In recent decades, the interaction of multiple factors, such as environmental and climate variability, has driven increases in the incidence and geographical spread of the virus. In Costa Rica, a country characterized by multiple microclimates separated by short distances, dengue has been endemic since its introduction in 1993. Understanding the role of climatic and environmental factors in the seasonal and inter-annual variability of disease spread is essential to develop effective surveillance and control efforts. In this study, we conducted a wavelet time series analysis of weekly climate, local environmental variables, and dengue cases (2001-2019) from 32 cantons in Costa Rica to identify significant periods (e.g., annual, biannual) in which climate and environmental variables co-varied with dengue cases. Wavelet coherence analysis was used to characterize seasonality, multi-year outbreaks, and relative delays between the time series. Results show that dengue outbreaks occurring every 3 years in cantons located in the country's Central, North, and South Pacific regions were highly coherent with the Oceanic Niño 3.4 and the Tropical North Atlantic Index (TNA). Dengue cases were in phase with El Niño 3.4 and TNA, with El Niño 3.4 ahead of dengue cases by roughly nine months and TNA ahead by less than three months. Annual dengue outbreaks were coherent with local environmental variables (NDWI, EVI, Evapotranspiration, and Precipitation) in most cantons except those located in the Central, South Pacific, and South Caribbean regions of the country. The local environmental variables were in phase with dengue cases and were ahead by around three months.
Submitted 3 January, 2023;
originally announced January 2023.
-
Projecting the Impact of Covid-19 Variants and Vaccination Strategies in Disease Transmission using a Multilayer Network Model in Costa Rica
Authors:
Yury E. García,
Gustavo Mery,
Paola Vásquez,
Juan G. Calvo,
Luis A. Barboza,
Tania Rivas,
Fabio Sanchez
Abstract:
For countries starting to receive steady supplies of vaccines against SARS-CoV-2, the course of Covid-19 for the following months will be determined by the emergence of new variants and successful roll-out of vaccination campaigns. To anticipate this scenario, we used a multilayer network model developed to forecast the transmission dynamics of Covid-19 in Costa Rica, and to estimate the impact of the introduction of the Delta variant in the country, under two plausible vaccination scenarios: one sustaining Costa Rica's July 2021 vaccination pace of 30,000 doses per day with high acceptance from the population, and another with the vaccination pace declining to 13,000 doses per day with lower acceptance. Results suggest that the introduction and gradual dominance of the Delta variant would increase Covid-19 hospitalizations and ICU admissions by $35\%$ and $33.25\%$, respectively, from August 2021 to December 2021, depending on vaccine administration and acceptance. In the presence of the Delta variant, new Covid-19 hospitalizations and ICU admissions would experience an average increase of $24.26\%$ and $27.19\%$ respectively in the same period if the vaccination pace drops. Our results can help decision-makers better prepare for the COVID-19 pandemic in the months to come.
Submitted 23 September, 2021; v1 submitted 7 September, 2021;
originally announced September 2021.
-
Wavelet Analysis of Dengue Incidence and its Correlation with Weather and Vegetation Variables in Costa Rica
Authors:
Yury E. García,
Luis A. Barboza,
Fabio Sanchez,
Paola Vásquez,
Juan G. Calvo
Abstract:
Dengue represents a serious public health problem in tropical and subtropical regions worldwide. The number of dengue cases and its geographical expansion have increased in recent decades, driven mostly by social and environmental factors. In Costa Rica, it has been endemic since it was first introduced in 1993. In this article, wavelet analyses (wavelet power spectrum and wavelet coherence) were performed to detect and quantify dengue periodicity and describe patterns of synchrony between dengue incidence and climatic and environmental factors: Normalized Difference Water Index, Enhanced Vegetation Index, Normalized Difference Vegetation Index, Tropical North Atlantic indices, Land Surface Temperature, and El Niño Southern Oscillation indices in 32 different cantons, using dengue surveillance from 2000 to 2019. Results showed that the dominant dengue cycles have periods of 1, 2, and 3 years. The wavelet coherence analysis showed that the vegetation indices are correlated with dengue incidence in places located in the central and Northern Pacific of the country in the period of 1 year. Climatic variables such as El Niño 3, 3.4, and 4 showed a strong correlation with dengue incidence in the period of 3 years, and the Tropical North Atlantic is correlated with dengue incidence in the period of 1 year. Land Surface Temperature showed a strong correlation with dengue time series in the 32 cantons.
Submitted 1 July, 2021;
originally announced July 2021.
-
The SARS-CoV-2 Spike Protein is vulnerable to moderate electric fields
Authors:
Claudia R. Arbeitman,
Pablo Rojas,
Pedro Ojeda-May,
Martin E. Garcia
Abstract:
Most of the ongoing projects aimed at the development of specific therapies and vaccines against COVID-19 use the SARS-CoV-2 spike (S) protein as the main target [1-3]. The binding of the spike protein with the ACE2 receptor (ACE2) of the host cell constitutes the first and key step for virus entry. During this process, the receptor binding domain (RBD) of the S protein plays an essential role, since it contains the receptor binding motif (RBM), responsible for the docking to the receptor. So far, mostly biochemical methods are being tested in order to prevent binding of the virus to ACE2 [4]. Here we show, with the help of atomistic simulations, that external electric fields of easily achievable and moderate strengths can dramatically destabilise the S protein, inducing long-lasting structural damage. One striking field-induced conformational change occurs at the level of the recognition loop L3 of the RBD where two parallel beta sheets, believed to be responsible for a high affinity to ACE2 [5], undergo a change into an unstructured coil, which exhibits almost no binding possibilities to the ACE2 receptor (Figure 1a). Remarkably, while the structural flexibility of S allows the virus to improve its probability of entering the cell, it is also the origin of the surprising vulnerability of S upon application of electric fields of strengths at least two orders of magnitude smaller than those required for damaging most proteins. Our findings suggest the existence of a clean physical method to weaken the SARS-CoV-2 virus without further biochemical processing. Moreover, the effect could be used for infection prevention purposes and also to develop technologies for in-vitro structural manipulation of S. Since the method is largely unspecific, it can be suitable for application to mutations in S, to other proteins of SARS-CoV-2 and in general to membrane proteins of other virus types.
Submitted 23 March, 2021;
originally announced March 2021.
-
A Multilayer Network Model implementation for Covid-19
Authors:
Juan G. Calvo,
Fabio Sanchez,
Luis A. Barboza,
Yury E. García,
Paola Vásquez
Abstract:
We present a numerical implementation for a multilayer network used to model the transmission of Covid-19 or other diseases with a similar transmission mechanism. The model incorporates different contact types between individuals (household, social contacts, and strangers), which allows flexibility compared to standard SIR type models. The algorithm described in this paper is a simplification of the model used to give public health authorities an additional tool for the decision-making process in Costa Rica, by simulating extensive possible scenarios and projections.
Submitted 16 March, 2021;
originally announced March 2021.
-
Estimating COVID-19 cases and outbreaks on-stream through phone-calls
Authors:
Ezequiel Alvarez,
Daniela Obando,
Sebastian Crespo,
Enio Garcia,
Nicolas Kreplak,
Franco Marsico
Abstract:
One of the main problems in controlling COVID-19 epidemic spread is the delay in confirming cases. Having information on changes in the epidemic evolution or on the rise of outbreaks before lab confirmation is crucial in decision making for Public Health policies. We present an algorithm to estimate on-stream the number of COVID-19 cases using the data from telephone calls to a COVID-line. By modeling the calls as background (proportional to population) plus signal (proportional to infected), we fit the calls in the Province of Buenos Aires (Argentina) with coefficient of determination $R^2 > 0.85$. This result allows us to estimate the number of cases given the number of calls from a specific district, days before the lab results are available. We validate the algorithm with real data. We show how to use the algorithm to track the epidemic on-stream, and present the Early Outbreak Alarm to detect outbreaks in advance of lab results. One key point in the developed algorithm is a detailed tracking of the uncertainties in the estimations, since the alarm uses the significance of the observables as a main indicator to detect an anomaly. We present the details of the explicit example in Villa Azul (Quilmes), where this tool proved crucial to control an outbreak in time. The presented tools were designed with urgency using the data available at the time of development, and therefore have limitations, which we describe and discuss. We consider possible improvements to the tools, many of which are currently under development.
Submitted 10 October, 2020;
originally announced October 2020.
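The "background plus signal" model in the abstract above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit with no intercept of calls ≈ a·population + b·infected across districts; the function names, the fitting procedure, and the toy numbers are illustrative assumptions, not the authors' implementation, and the uncertainty tracking the abstract emphasizes is omitted.

```python
def fit_calls(population, infected, calls):
    """Least-squares fit of calls ~ a*population + b*infected (no intercept).

    Hypothetical helper: solves the 2x2 normal equations directly.
    """
    spp = sum(p * p for p in population)
    sii = sum(i * i for i in infected)
    spi = sum(p * i for p, i in zip(population, infected))
    spc = sum(p * c for p, c in zip(population, calls))
    sic = sum(i * c for i, c in zip(infected, calls))
    det = spp * sii - spi * spi
    if det == 0:
        raise ValueError("population and infected are collinear")
    a = (spc * sii - sic * spi) / det  # background rate per inhabitant
    b = (sic * spp - spc * spi) / det  # signal rate per infected person
    return a, b


def r_squared(population, infected, calls, a, b):
    """Coefficient of determination of the fitted model."""
    mean_c = sum(calls) / len(calls)
    ss_res = sum((c - (a * p + b * i)) ** 2
                 for p, i, c in zip(population, infected, calls))
    ss_tot = sum((c - mean_c) ** 2 for c in calls)
    return 1.0 - ss_res / ss_tot


def estimate_infected(n_calls, population, a, b):
    """Invert the model: infer infected from calls in one district."""
    return (n_calls - a * population) / b


# Toy data generated from a = 0.001, b = 0.5 (made-up numbers).
population = [1000, 2000, 1500, 3000]
infected = [10, 5, 20, 8]
calls = [0.001 * p + 0.5 * i for p, i in zip(population, infected)]

a, b = fit_calls(population, infected, calls)
r2 = r_squared(population, infected, calls, a, b)
```

Once `a` and `b` are fitted, `estimate_infected` turns a district's call count into a case estimate before lab results arrive, which is the use the abstract describes.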
-
Visual analytics of COVID-19 dissemination in São Paulo state, Brazil
Authors:
Wilson E. Marcílio-Jr,
Danilo M. Eler,
Rogério E. Garcia,
Ronaldo C. M. Correia,
Rafael M. B. Rodrigues
Abstract:
Visual analytics techniques are useful tools to support decision-making and cope with increasing data, which is particularly important when monitoring natural or artificial phenomena. When monitoring disease progression, visual analytics approaches help decision-makers understand or even prevent dissemination paths. In this paper, we propose a new visual analytics tool for monitoring COVID-19 dissemination. We use k-nearest neighbors of cities to mimic neighboring cities and analyze COVID-19 dissemination based on the comparison of a city under consideration and its neighborhood. Moreover, such analysis is performed based on periods, which facilitates the assessment of isolation policies. We validate our tool by analyzing the progression of COVID-19 in neighboring cities of São Paulo state, Brazil.
Submitted 8 March, 2021; v1 submitted 24 June, 2020;
originally announced July 2020.
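The neighborhood comparison in the abstract above can be sketched as follows. The sketch assumes that neighbors are chosen by Euclidean distance between planar city coordinates and that a city is compared with the mean case count of its k nearest neighbors; the abstract specifies neither the distance nor the comparison statistic, so both, along with all names and numbers, are illustrative assumptions.

```python
import math


def k_nearest(cities, name, k):
    """Return the k cities closest to `name` (Euclidean distance).

    `cities` maps a city name to hypothetical (x, y) coordinates.
    """
    x0, y0 = cities[name]
    by_distance = sorted(
        (math.hypot(x - x0, y - y0), other)
        for other, (x, y) in cities.items()
        if other != name
    )
    return [other for _, other in by_distance[:k]]


def excess_over_neighborhood(cases, cities, name, k):
    """Cases in `name` minus the mean cases of its k nearest neighbors."""
    neighbors = k_nearest(cities, name, k)
    neighborhood_mean = sum(cases[n] for n in neighbors) / len(neighbors)
    return cases[name] - neighborhood_mean


# Made-up coordinates and case counts for four cities.
cities = {"A": (0, 0), "B": (1, 0), "C": (0, 2), "D": (5, 5)}
cases = {"A": 10, "B": 4, "C": 6, "D": 100}

neighbors_of_a = k_nearest(cities, "A", 2)
excess_a = excess_over_neighborhood(cases, cities, "A", 2)
```

Running the same comparison separately for each time period would give the period-based view that the abstract links to assessing isolation policies.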
-
Effective energy density determines the dynamics of suspensions of active and passive matter
Authors:
Ryan Krafnic,
Angel E. Garcia
Abstract:
The unique properties of suspensions containing both active (self-propelling) and passive matter, arising from the nonequilibrium nature of these systems, have been widely studied (e.g., enhanced diffusion, phase separation, and directed motion). Despite this, our understanding of the specific roles played by the relevant parameters of the constituent particles remains incomplete. For instance, to what extent are the velocity and density of swimmers qualitatively distinguishable when it comes to the resultant properties of the suspension as a whole, and when are they merely two different realizations of the same thing? Through the use of numerical simulations, containing both steric and hydrodynamic interactions, we investigate a new parameter, the effective energy density, and its ability to uniquely describe the dynamics and properties of a hybrid system of active and passive particles, including the rate of pair formation and the energy distribution amongst different constituent elements. This parameter depends on both the density and the swimming velocity of the active elements, unifying them into a single variable that surpasses the descriptive ability of either alone.
Submitted 21 October, 2019;
originally announced October 2019.
-
ABCD Neurocognitive Prediction Challenge 2019: Predicting individual residual fluid intelligence scores from cortical grey matter morphology
Authors:
Neil P. Oxtoby,
Fabio S. Ferreira,
Agoston Mihalik,
Tong Wu,
Mikael Brudfors,
Hongxiang Lin,
Anita Rau,
Stefano B. Blumberg,
Maria Robu,
Cemre Zor,
Maira Tariq,
Maria Del Mar Estarellas Garcia,
Baris Kanber,
Daniil I. Nikitichev,
Janaina Mourao-Miranda
Abstract:
We predicted residual fluid intelligence scores from T1-weighted MRI data available as part of the ABCD NP Challenge 2019, using morphological similarity of grey-matter regions across the cortex. Individual structural covariance networks (SCN) were abstracted into graph-theory metrics averaged over nodes across the brain and in data-driven communities/modules. Metrics included degree, path length, clustering coefficient, centrality, rich club coefficient, and small-worldness. These features derived from the training set were used to build various regression models for predicting residual fluid intelligence scores, with performance evaluated both using cross-validation within the training set and using the held-out validation set. Our predictions on the test set were generated with a support vector regression model trained on the training set. We found minimal improvement over predicting a zero residual fluid intelligence score across the sample population, implying that structural covariance networks calculated from T1-weighted MR imaging data provide little information about residual fluid intelligence.
Submitted 26 May, 2019;
originally announced May 2019.
-
ABCD Neurocognitive Prediction Challenge 2019: Predicting individual fluid intelligence scores from structural MRI using probabilistic segmentation and kernel ridge regression
Authors:
Agoston Mihalik,
Mikael Brudfors,
Maria Robu,
Fabio S. Ferreira,
Hongxiang Lin,
Anita Rau,
Tong Wu,
Stefano B. Blumberg,
Baris Kanber,
Maira Tariq,
Maria Del Mar Estarellas Garcia,
Cemre Zor,
Daniil I. Nikitichev,
Janaina Mourao-Miranda,
Neil P. Oxtoby
Abstract:
We applied several regression and deep learning methods to predict fluid intelligence scores from T1-weighted MRI scans as part of the ABCD Neurocognitive Prediction Challenge (ABCD-NP-Challenge) 2019. We used voxel intensities and probabilistic tissue-type labels derived from these as features to train the models. The best predictive performance (lowest mean-squared error) came from Kernel Ridge Regression (KRR; $λ=10$), which produced a mean-squared error of 69.7204 on the validation set and 92.1298 on the test set. This placed our group in the fifth position on the validation leader board and first place on the final (test) leader board.
Submitted 26 May, 2019;
originally announced May 2019.
-
Inference for stochastic kinetic models from multiple data sources for joint estimation of infection dynamics from aggregate reports and virological data
Authors:
Yury E. García,
Oksana A. Chkrebtii,
Marcos A. Capistrán,
Daniel E. Noyola
Abstract:
Influenza and respiratory syncytial virus (RSV) are the leading etiological agents of seasonal acute respiratory infections (ARI) around the world. Medical doctors typically base the diagnosis of ARI on patients' symptoms alone and do not always conduct virological tests necessary to identify individual viruses, which limits the ability to study the interaction between multiple pathogens and make public health recommendations. We consider a stochastic kinetic model (SKM) for two interacting ARI pathogens circulating in a large population and an empirically motivated background process for infections with other pathogens causing similar symptoms. An extended marginal sampling approach based on the Linear Noise Approximation to the SKM integrates multiple data sources and additional model components. We infer the parameters defining the pathogens' dynamics and interaction within a Bayesian hierarchical model and explore the posterior trajectories of infections for each illness based on aggregate infection reports from six epidemic seasons collected by the state health department, and a subset of virological tests from a sentinel program at a general hospital in San Luis Potosí, México. We interpret the results based on real and simulated data and make recommendations for future data collection strategies. Supplementary materials and software are provided online.
Submitted 28 March, 2019; v1 submitted 24 March, 2019;
originally announced March 2019.
-
Early pathogen replacement in a model of Influenza and Respiratory Syncytial Virus with partial vaccination. A computational study
Authors:
Yury E. García,
Marcos A. Capistrán
Abstract:
In this paper, we carry out a computational study using the spectral decomposition of the fluctuations of a two-pathogen epidemic model around its deterministic attractor, i.e., steady state or limit cycle, to examine the role of partial vaccination and between-host pathogen interaction on early pathogen replacement during seasonal epidemics of influenza and respiratory syncytial virus.
Submitted 30 October, 2017;
originally announced October 2017.
-
Selecting fast folding proteins by their rate of convergence
Authors:
Dmitry K. Gridnev,
Pedro Ojeda-May,
Martin E. Garcia
Abstract:
We propose a general method for predicting potentially good folders from a given number of amino acid sequences. Our approach is based on the calculation of the rate of convergence of each amino acid chain towards the native structure using only the very initial parts of the dynamical trajectories. It does not require any preliminary knowledge of the native state and can be applied to different kinds of models, including atomistic descriptions. We tested the method within both the lattice and off-lattice model frameworks and obtained several previously unknown good folders. The unbiased algorithm also allows the optimal folding temperature to be determined and requires at least 3-4 orders of magnitude fewer time steps than those needed to compute folding times.
Submitted 6 February, 2013; v1 submitted 2 April, 2012;
originally announced April 2012.
-
Arginine-rich peptides destabilize the plasma membrane, consistent with a pore formation translocation mechanism of cell penetrating peptides
Authors:
H. D. Herce,
A. E. Garcia,
J. Litt,
R. S. Kane,
P. Martin,
N. Enrique,
A. Rebolledo,
V. Milesi
Abstract:
Recent molecular dynamics simulations (Herce and Garcia, PNAS, 104: 20805 (2007)) have suggested that the arginine-rich HIV Tat peptides might be able to translocate by destabilizing and inducing transient pores in phospholipid bilayers. In this pathway for peptide translocation, arginine residues play a fundamental role not only in the binding of the peptide to the surface of the membrane but also in the destabilization and nucleation of transient pores across the bilayer, despite being charged and highly hydrophilic. Here we present a molecular dynamics simulation of a peptide composed of nine arginines (Arg-9) that shows that this peptide follows the same translocation pathway previously found for the Tat peptide. We test this hypothesis experimentally by measuring ionic currents across phospholipid bilayers and cell membranes through the pores induced by Arg-9 peptides. We find that Arg-9 peptides, in the presence of an electrostatic potential gradient, induce ionic currents across planar phospholipid bilayers, as well as in cultured osteosarcoma cells and human smooth muscle cells freshly isolated from the umbilical artery. Our results suggest that the mechanism of action of Arg-9 peptide involves the creation of transient pores in lipid bilayers and cell membranes.
Submitted 9 October, 2009;
originally announced October 2009.
-
Folding is Not Required for Bilayer Insertion: Replica Exchange Simulations of an α-Helical Peptide with an Explicit Lipid Bilayer
Authors:
Hugh Nymeyer,
Thomas B. Woolf,
Angel E. Garcia
Abstract:
We implement the replica exchange molecular dynamics algorithm to study the interactions of a model peptide (WALP-16) with an explicitly represented DPPC membrane bilayer. We observe the spontaneous, unbiased insertion of WALP-16 into the DPPC bilayer and its folding into an α-helix with a trans-bilayer orientation. We observe that the insertion of the peptide into the DPPC bilayer precedes secondary structure formation. Although the peptide has some propensity to form a partially helical structure in the interfacial region of the DPPC/water system, this state is not a productive intermediate but rather an off-pathway trap for WALP-16 insertion. Equilibrium simulations show that the observed insertion/folding pathway mirrors the potential of mean force (PMF). Calculation of the enthalpic and entropic contributions to this PMF shows that the surface-bound conformation of WALP-16 is significantly lower in energy than other conformations, and that the insertion of WALP-16 into the bilayer without regular secondary structure is enthalpically unfavorable by 5-10 kcal/mol/residue. The observed insertion/folding pathway disagrees with the dominant conceptual model, which is that a surface-bound helix is an obligatory intermediate for the insertion of α-helical peptides into lipid bilayers. In our simulations, the observed insertion/folding pathway is favored because of a large (> 100 kcal/mol) increase in system entropy that occurs when the unstructured WALP-16 peptide enters the lipid bilayer interior. The insertion/folding pathway that is lowest in free energy depends sensitively on the near cancellation of large enthalpic and entropic terms. This suggests that intrinsic membrane peptides may have a diversity of insertion/folding behaviors depending on the exact system of peptide and lipid under consideration.
Submitted 29 October, 2004;
originally announced November 2004.
-
Protein folding mediated by solvation: water expelling and formation of the hydrophobic core occurs after the structure collapse
Authors:
Margaret S. Cheung,
Angel E. Garcia,
Jose N. Onuchic
Abstract:
The interplay between the search for the native structure and desolvation in protein folding has been explored using a minimalist model. The results support a folding mechanism where most of the structural formation of the protein is achieved before water is expelled from the hydrophobic core. This view integrates water expulsion effects into the funnel energy landscape theory of protein folding. Comparisons to experimental results are shown for the SH3 protein. After the folding transition, a near-native intermediate with a partially solvated hydrophobic core is found. This transition is followed by a final step that cooperatively squeezes out water molecules from the partially hydrated protein core.
Submitted 31 March, 2002;
originally announced April 2002.
-
Hydrophobic Effects on a Molecular Scale
Authors:
G. Hummer,
S. Garde,
A. E. García,
M. E. Paulaitis,
L. R. Pratt
Abstract:
A theoretical approach is developed to quantify hydrophobic hydration and interactions on a molecular scale, with the goal of gaining insight into the molecular origins of hydrophobic effects. The model is based on the fundamental relation between the probability for cavity formation in bulk water resulting from molecular-scale density fluctuations, and the hydration free energy of the simplest hydrophobic solute, a hard particle. This probability is estimated using an information theory (IT) approach, incorporating experimentally available properties of bulk water: the density and radial distribution function. The IT approach reproduces the simplest hydrophobic effects: hydration of spherical nonpolar solutes, the potential of mean force between methane molecules, and solvent contributions to the torsional equilibrium of butane. Applications of this approach to study temperature and pressure effects provide new insights into the thermodynamics and kinetics of protein folding. The IT model relates the hydrophobic-entropy convergence observed in protein unfolding experiments to the macroscopic isothermal compressibility of water. A novel explanation for pressure denaturation of proteins follows from an analysis of the pressure stability of hydrophobic aggregates, suggesting that water penetrates the hydrophobic core of proteins at high pressures. This resolves a long-standing puzzle: whether pressure denaturation contradicts the hydrophobic-core model of protein stability. Finally, issues of "dewetting" of molecularly large nonpolar solutes are discussed in the context of a recently developed perturbation theory approach.
Submitted 18 September, 1998; v1 submitted 1 July, 1998;
originally announced July 1998.