-
Entropic contribution to phenotype fitness
Authors:
Pablo Catalán,
Juan Antonio García-Martín,
Jacobo Aguirre,
José A. Cuesta,
Susanna Manrubia
Abstract:
All possible phenotypes are not equally accessible to evolving populations. In fact, only phenotypes of large size, i.e. those resulting from many different genotypes, are found in populations of sequences, presumably because they are easier to discover and maintain. Genotypes that map to these phenotypes usually form mostly connected genotype networks that percolate the space of sequences, thus guaranteeing access to a large set of alternative phenotypes. Within a given environment, where specific phenotypic traits become relevant for adaptation, the replicative ability of a phenotype and its overall fitness (in competition experiments with alternative phenotypes) can be estimated. Two primary questions arise: how do phenotype size, reproductive capability and topology of the genotype network affect the fitness of a phenotype? And, assuming that evolution is only able to access large phenotypes, what is the range of unattainable fitness values? In order to address these questions, we quantify the adaptive advantage of phenotypes of varying size and spectral radius in a two-peak landscape. We derive analytical relationships between the three variables (size, topology, and replicative ability) which are then tested through analysis of genotype-phenotype maps and simulations of population dynamics on such maps. Finally, we analytically show that the fraction of attainable phenotypes decreases with the length of the genotype, though its absolute number increases. The fact that most phenotypes are not visible to evolution very likely forbids the attainment of the highest peak in the landscape. Nevertheless, our results indicate that the relative fitness loss due to this limited accessibility is largely inconsequential for adaptation.
Submitted 11 April, 2023;
originally announced April 2023.
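As a rough illustration of one quantity named in the abstract above: the spectral radius of a genotype network is assumed here to mean the largest eigenvalue of the network's adjacency matrix (the standard definition, though the abstract does not spell it out). The sketch below estimates it by power iteration on toy networks. The function name and the example graphs are hypothetical and are not taken from the authors' code.

```python
def spectral_radius(adjacency, iterations=500):
    """Estimate the largest adjacency eigenvalue of a connected, undirected graph.

    Power iteration is applied to (A + I) rather than A, because a bipartite
    network (a hypercube of sequences, for instance) has eigenvalues +r and -r
    of equal magnitude, which makes plain power iteration oscillate.
    """
    n = len(adjacency)
    vec = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        # One multiplication by (A + I).
        new = [
            vec[i] + sum(adjacency[i][j] * vec[j] for j in range(n))
            for i in range(n)
        ]
        norm = max(new)                # max-norm keeps the vector bounded
        vec = [x / norm for x in new]
        estimate = norm - 1.0          # undo the +I shift
    return estimate


# Toy network: three genotypes in a line (0 - 1 - 2), edges = single mutations.
path3 = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Toy network: four genotypes that are all mutual one-mutation neighbours.
complete4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

print(spectral_radius(path3))      # close to sqrt(2)
print(spectral_radius(complete4))  # close to 3
```

The two toy networks show that the more densely connected network has the larger spectral radius, which is why this quantity is commonly used as a summary of network topology.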
-
From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics
Authors:
Susanna Manrubia,
José A. Cuesta,
Jacobo Aguirre,
Sebastian E. Ahnert,
Lee Altenberg,
Alejandro V. Cano,
Pablo Catalán,
Ramon Diaz-Uriarte,
Santiago F. Elena,
Juan Antonio García-Martín,
Paulien Hogeweg,
Bhavin S. Khatri,
Joachim Krug,
Ard A. Louis,
Nora S. Martin,
Joshua L. Payne,
Matthew J. Tarnowski,
Marcel Weiß
Abstract:
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, cancer progression models or synthetic biology approaches being notable examples. This work takes a critical and constructive look at our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Submitted 17 March, 2021; v1 submitted 2 February, 2020;
originally announced February 2020.
-
Statistical theory of phenotype abundance distributions: a test through exact enumeration of genotype spaces
Authors:
Juan Antonio García-Martín,
Pablo Catalán,
Susanna Manrubia,
José A. Cuesta
Abstract:
The evolutionary dynamics of molecular populations are strongly dependent on the structure of genotype spaces. The map between genotype and phenotype determines how easily genotype spaces can be navigated and the accessibility of evolutionary innovations. In particular, the size of neutral networks corresponding to specific phenotypes and its statistical counterpart, the distribution of phenotype abundance, have been studied through multiple computationally tractable genotype-phenotype maps. In this work, we test a theory that predicts the abundance of a phenotype and the corresponding asymptotic distribution (given the compositional variability of its genotypes) through the exact enumeration of several GP maps. Our theory predicts phenotype abundance with high accuracy, and our results show that, in navigable genotype spaces, characterised by the presence of large neutral networks, phenotype abundance converges to a log-normal distribution.
Submitted 25 July, 2018; v1 submitted 11 June, 2018;
originally announced June 2018.
-
RNAiFold2T: Constraint Programming design of thermo-IRES switches
Authors:
Juan Antonio Garcia-Martin,
Ivan Dotu,
Javier Fernandez-Chamorro,
Gloria Lozano,
Jorge Ramajo,
Encarnacion Martinez-Salas,
Peter Clote
Abstract:
Motivation: RNA thermometers (RNATs) are cis-regulatory elements that change secondary structure upon temperature shift. Often involved in the regulation of heat shock, cold shock and virulence genes, RNATs constitute an interesting potential resource in synthetic biology, where engineered RNATs could prove to be useful tools in biosensors and conditional gene regulation. Results: Solving the 2-temperature inverse folding problem is critical for RNAT engineering. Here we introduce RNAiFold2T, the first Constraint Programming (CP) and Large Neighborhood Search (LNS) algorithms to solve this problem. Benchmarking tests of RNAiFold2T against existing inverse folding programs (adaptive walk and genetic algorithm) show that our software generates two orders of magnitude more solutions, thus allowing ample exploration of the space of solutions. Subsequently, solutions can be prioritized by computing various measures, including probability of target structure in the ensemble, melting temperature, etc. Using this strategy, we rationally designed two thermosensor internal ribosome entry site (thermo-IRES) elements, whose normalized cap-independent translation efficiency is approximately 50% greater at 42°C than 30°C, when tested in reticulocyte lysates. Translation efficiency is lower than that of the wild-type IRES element, which on the other hand is fully resistant to temperature shift-up. This appears to be the first purely computational design of functional RNA thermoswitches, and certainly the first purely computational design of functional thermo-IRES elements. Availability: RNAiFold2T is publicly available as part of the new release RNAiFold3.0 at https://github.com/clotelab/RNAiFold and http://bioinformatics.bc.edu/clotelab/RNAiFold, the latter of which also hosts a web server. The software is written in C++ and uses the OR-Tools CP search engine.
Submitted 14 May, 2016;
originally announced May 2016.
-
RNA thermodynamic structural entropy
Authors:
Juan Antonio Garcia-Martin,
Peter Clote
Abstract:
Conformational entropy for atomic-level, three-dimensional biomolecules is known experimentally to play an important role in protein-ligand discrimination, yet reliable computation of entropy remains a difficult problem. Here we describe the first two accurate and efficient algorithms to compute the conformational entropy for RNA secondary structures, with respect to the Turner energy model, where free energy parameters are determined from UV absorption experiments. An algorithm to compute the derivational entropy for RNA secondary structures had previously been introduced, using stochastic context free grammars (SCFGs). However, the numerical value of derivational entropy depends heavily on the chosen context free grammar and on the training set used to estimate rule probabilities. Using data from the Rfam database, we determine that both of our thermodynamic methods, which agree in numerical value, are substantially faster than the SCFG method. Thermodynamic structural entropy is much smaller than derivational entropy, and the correlation between length-normalized thermodynamic entropy and derivational entropy is moderately weak to poor. In applications, we plot the structural entropy as a function of temperature for known thermoswitches, determine that the correlation between hammerhead ribozyme cleavage activity and total free energy is improved by including an additional free energy term arising from conformational entropy, and plot the structural entropy of windows of the HIV-1 genome. Our software RNAentropy can compute structural entropy for any user-specified temperature, and supports both the Turner'99 and Turner'04 energy parameters. It follows that RNAentropy is state-of-the-art software to compute RNA secondary structure conformational entropy. The software is available at http://bioinformatics.bc.edu/clotelab/RNAentropy.
Submitted 22 August, 2015;
originally announced August 2015.
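To make the quantity in the abstract above concrete, the sketch below computes the Boltzmann (Shannon) entropy of a small, explicitly enumerated ensemble of structures. It is not the RNAentropy algorithm: the abstract describes efficient methods that avoid enumeration, whereas this toy version assumes every structure and its free energy are already listed. The energies are made-up numbers, and the function and constant names are hypothetical.

```python
import math

R_KCAL = 0.001987204  # gas constant in kcal/(mol*K)


def structural_entropy(energies_kcal, temperature_k=310.15):
    """Return (entropy, mean_energy, ensemble_free_energy) for a listed ensemble.

    entropy is S = -R * sum(p * ln p) in kcal/(mol*K), where p is the
    Boltzmann probability of each structure at the given temperature.
    """
    rt = R_KCAL * temperature_k
    e_min = min(energies_kcal)
    # Shift by the minimum energy so the exponentials cannot overflow.
    weights = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    z_shifted = sum(weights)
    probs = [w / z_shifted for w in weights]

    entropy = -R_KCAL * sum(p * math.log(p) for p in probs if p > 0.0)
    mean_energy = sum(p * e for p, e in zip(probs, energies_kcal))
    # G = -RT ln Z, with Z = exp(-e_min / RT) * z_shifted.
    free_energy = e_min - rt * math.log(z_shifted)
    return entropy, mean_energy, free_energy


# Hypothetical free energies (kcal/mol) of four alternative structures.
toy_energies = [0.0, -1.2, -3.4, -2.0]
s, mean_e, g = structural_entropy(toy_energies)
print(s, mean_e, g)
```

A useful consistency check on any such computation is the thermodynamic identity T·S = ⟨E⟩ − G, which the returned values satisfy up to rounding.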
-
RNAiFold 2.0: A web server and software to design custom and Rfam-based RNA molecules
Authors:
Juan Antonio Garcia-Martin,
Ivan Dotu,
Peter Clote
Abstract:
Several algorithms for RNA inverse folding have been used to design synthetic riboswitches, ribozymes and thermoswitches, whose activity has been experimentally validated. The RNAiFold software is unique among approaches for inverse folding in that (exhaustive) constraint programming is used instead of heuristic methods. For that reason, RNAiFold can generate all sequences that fold into the target structure, or determine that there is no solution. RNAiFold 2.0 is a complete overhaul of RNAiFold 1.0, rewritten from the now defunct COMET language to C++. The new code properly extends the capabilities of its predecessor by providing a user-friendly pipeline to design synthetic constructs having the functionality of given Rfam families. In addition, the new software supports amino acid constraints, even for proteins translated in different reading frames from overlapping coding sequences; moreover, structure compatibility/incompatibility constraints have been expanded. With these features, RNAiFold 2.0 allows the user to design single RNA molecules as well as hybridization complexes of two RNA molecules.
The web server, source code and Linux binaries are publicly accessible at http://bioinformatics.bc.edu/clotelab/RNAiFold2.0
Submitted 15 May, 2015;
originally announced May 2015.
-
Complete RNA inverse folding: computational design of functional hammerhead ribozymes
Authors:
Ivan Dotu,
Juan Antonio Garcia-Martin,
Betty L. Slinger,
Vinodh Mechery,
Michelle M. Meyer,
Peter Clote
Abstract:
Nanotechnology and synthetic biology currently constitute one of the most innovative, interdisciplinary fields of research, poised to radically transform society in the 21st century. This paper concerns the synthetic design of ribonucleic acid molecules, using our recent algorithm, RNAiFold, which can determine all RNA sequences whose minimum free energy secondary structure is a user-specified target structure. Using RNAiFold, we design ten cis-cleaving hammerhead ribozymes, all of which are shown to be functional by a cleavage assay. We additionally use RNAiFold to design a functional cis-cleaving hammerhead as a modular unit of a synthetic larger RNA. Analysis of kinetics on this small set of hammerheads suggests that cleavage rate of computationally designed ribozymes may be correlated with positional entropy, ensemble defect, structural flexibility/rigidity and related measures. Artificial ribozymes have been designed in the past either manually or by SELEX (Systematic Evolution of Ligands by Exponential Enrichment); however, this appears to be the first purely computational design and experimental validation of novel functional ribozymes. RNAiFold is available at http://bioinformatics.bc.edu/clotelab/RNAiFold/.
Submitted 9 August, 2014;
originally announced August 2014.