-
Spontaneous domain formation in disordered copolymers as a mechanism for chromosome structuring
Authors:
Matteo Negri,
Marco Gherardi,
Guido Tiana,
Marco Cosentino Lagomarsino
Abstract:
Motivated by the problem of domain formation in chromosomes, we studied a copolymer model where only a subset of the monomers feel attractive interactions. These monomers are displaced randomly from a regularly-spaced pattern, thus introducing some quenched disorder in the system. Previous work has shown that in the case of regularly-spaced interacting monomers this chain can fold into structures characterized by multiple distinct domains of consecutive segments. In each domain, attractive interactions are balanced by the entropy cost of forming loops. We show by advanced replica-exchange simulations that adding disorder in the position of the interacting monomers further stabilizes these domains. The model suggests that the partitioning of the chain into well-defined domains of consecutive monomers is a spontaneous property of heteropolymers. In the case of chromosomes, evolution could have acted on the spacing of interacting monomers to modulate in a simple way the underlying domains for functional reasons.
Submitted 12 February, 2018;
originally announced February 2018.
-
Statistics of shared components in complex component systems
Authors:
Andrea Mazzolini,
Marco Gherardi,
Michele Caselle,
Marco Cosentino Lagomarsino,
Matteo Osella
Abstract:
Many complex systems are modular. Such systems can be represented as "component systems", i.e., sets of elementary components, such as LEGO bricks in LEGO sets. The bricks found in a LEGO set reflect a target architecture, which can be built following a set-specific list of instructions. In other component systems, instead, the underlying functional design and constraints are not obvious a priori, and their detection is often a challenge of both scientific and practical importance, requiring a clear understanding of component statistics. Importantly, some quantitative invariants appear to be common to many component systems, most notably a common broad distribution of component abundances, which often resembles the well-known Zipf's law. Such "laws" affect in a general and non-trivial way the component statistics, potentially hindering the identification of system-specific functional constraints or generative processes. Here, we specifically focus on the statistics of shared components, i.e., the distribution of the number of components shared by different system-realizations, such as the common bricks found in different LEGO sets. To account for the effects of component heterogeneity, we consider a simple null model, which builds system-realizations by random draws from a universe of possible components. Under general assumptions on abundance heterogeneity, we provide analytical estimates of component occurrence, which quantify exhaustively the statistics of shared components. Surprisingly, this simple null model can positively explain important features of empirical component-occurrence distributions obtained from data on bacterial genomes, LEGO sets, and book chapters. Specific architectural features and functional constraints can be detected from occurrence patterns as deviations from these null predictions, as we show for the illustrative case of the "core" genome in bacteria.
Submitted 23 April, 2018; v1 submitted 26 July, 2017;
originally announced July 2017.
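The null model mentioned in the abstract above (system-realizations built by random draws from a universe of components with heterogeneous abundances) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Zipf-like weights `1 / rank**exponent`, drawing with replacement, and the function names `zipf_weights`, `draw_realization`, `occurrence_distribution`, and `simulate`.

```python
import random
from collections import Counter


def zipf_weights(universe_size, exponent=1.0):
    # Assumed abundance law: weight of the component with rank r is 1 / r**exponent.
    return [1.0 / (rank ** exponent) for rank in range(1, universe_size + 1)]


def draw_realization(weights, size, rng):
    # One realization: `size` weighted draws with replacement from the universe,
    # reduced to the set of distinct components it contains.
    return set(rng.choices(range(len(weights)), weights=weights, k=size))


def occurrence_distribution(realizations):
    # Occurrence of a component = number of realizations that contain it.
    occurrences = Counter()
    for realization in realizations:
        occurrences.update(realization)
    # Map: occurrence value -> number of components with that occurrence.
    return Counter(occurrences.values())


def simulate(universe_size=200, n_realizations=50, size=100, exponent=1.0, seed=0):
    # Hypothetical parameter values, chosen only to keep the example small.
    rng = random.Random(seed)
    weights = zipf_weights(universe_size, exponent)
    realizations = [draw_realization(weights, size, rng) for _ in range(n_realizations)]
    return occurrence_distribution(realizations)


if __name__ == "__main__":
    for occurrence, count in sorted(simulate().items()):
        print(occurrence, count)
```

Comparing an empirical occurrence histogram with the one produced by such a simulation is one simple way to look for deviations from the null expectation, in the spirit of the abstract.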
-
Family-specific scaling laws in bacterial genomes
Authors:
Eleonora de Lazzari,
Jacopo Grilli,
Sergei Maslov,
Marco Cosentino Lagomarsino
Abstract:
Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes.
Submitted 28 March, 2017;
originally announced March 2017.
-
Cell-to-cell variability and robustness in S-phase duration from genome replication kinetics
Authors:
Qing Zhang,
Federico Bassetti,
Marco Gherardi,
Marco Cosentino Lagomarsino
Abstract:
Genome replication, a key process for a cell, relies on stochastic initiation by replication origins, causing a variability of replication timing from cell to cell. While stochastic models of eukaryotic replication are widely available, the link between the key parameters and overall replication timing has not been addressed systematically. We use a combined analytical and computational approach to calculate how positions and strength of many origins lead to a given cell-to-cell variability of total duration of the replication of a large region, a chromosome or the entire genome. Specifically, the total replication timing can be framed as an extreme-value problem, since it is due to the last region that replicates in each cell. Our calculations identify two regimes based on the spread between characteristic completion times of all inter-origin regions of a genome. For widely different completion times, timing is set by the single specific region that is typically the last to replicate in all cells. Conversely, when the completion times of all regions are comparable, an extreme-value estimate shows that the cell-to-cell variability of genome replication timing has universal properties. Comparison with available data shows that the replication program of three yeast species falls in this extreme-value regime.
Submitted 24 May, 2017; v1 submitted 27 January, 2017;
originally announced January 2017.
-
Stochastic timing in gene expression for simple regulatory strategies
Authors:
Alma Dal Co,
Marco Cosentino Lagomarsino,
Michele Caselle,
Matteo Osella
Abstract:
Timing is essential for many cellular processes, from cellular responses to external stimuli to the cell cycle and circadian clocks. Many of these processes are based on gene expression. For example, an activated gene may be required to reach in a precise time a threshold level of expression that triggers a specific downstream process. However, gene expression is subject to stochastic fluctuations, naturally inducing an uncertainty in this threshold-crossing time with potential consequences on biological functions and phenotypes. Here, we consider such "timing fluctuations", and we ask how they can be controlled. Our analytical estimates and simulations show that, for an induced gene, timing variability is minimal if the threshold level of expression is approximately half of the steady-state level. Timing fluctuations can be reduced by increasing the transcription rate, while they are insensitive to the translation rate. In the presence of self-regulatory strategies, we show that self-repression reduces timing noise for threshold levels that have to be reached quickly, while self-activation is optimal at long times. These results lay a framework for understanding stochasticity of endogenous systems such as the cell cycle, as well as for the design of synthetic trigger circuits.
Submitted 23 February, 2017; v1 submitted 29 July, 2016;
originally announced July 2016.
-
Relevant parameters in models of cell division control
Authors:
Jacopo Grilli,
Matteo Osella,
Andrew S. Kennard,
Marco Cosentino Lagomarsino
Abstract:
A recent burst of dynamic single-cell growth-division data makes it possible to characterize the stochastic dynamics of cell division control in bacteria. Different modeling frameworks were used to infer specific mechanisms from such data, but the links between frameworks are poorly explored, with relevant consequences for how well any particular mechanism can be supported by the data. Here, we describe a simple and generic framework in which two common formalisms can be used interchangeably: (i) a continuous-time division process described by a hazard function and (ii) a discrete-time equation describing cell size across generations (where the unit of time is a cell cycle). In our framework, this second process is a discrete-time Langevin equation with a simple physical analogue. By perturbative expansion around the mean initial size (or inter-division time), we show explicitly how this framework describes a wide range of division control mechanisms, including combinations of time and size control, as well as the constant added size mechanism recently found to capture several aspects of the cell division behavior of different bacteria. As we show by analytical estimates and numerical simulation, the available data are characterized with great precision by the first-order approximation of this expansion. Hence, a single dimensionless parameter defines the strength and the action of the division control. However, this parameter may emerge from several mechanisms, which are distinguished only by higher-order terms in our perturbative expansion. An analytical estimate of the sample size needed to distinguish between second-order effects shows that this is larger than what is available in the current datasets. These results provide a unified framework for future studies and clarify the relevant parameters at play in the control of cell division.
Submitted 29 June, 2016;
originally announced June 2016.
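The abstract above describes cell size across generations as a discrete-time Langevin equation whose first-order form is governed by a single dimensionless parameter. A minimal sketch of such a process follows, under assumptions that are not stated in the abstract: the variable `q` is taken to be the deviation of the logarithmic initial size from its mean, the update is the linear form `q_next = (1 - control_strength) * q + noise`, and the noise is Gaussian. The names `simulate_lineage`, `stationary_variance`, `control_strength`, and `noise_sd` are hypothetical.

```python
import random


def simulate_lineage(control_strength, noise_sd, n_generations, q0=0.0, seed=0):
    # Assumed first-order (linear) discrete-time Langevin update:
    #   q_{n+1} = (1 - control_strength) * q_n + xi_n,  xi_n ~ Normal(0, noise_sd)
    # where q_n is the size deviation at the start of generation n.
    rng = random.Random(seed)
    q = q0
    trajectory = [q]
    for _ in range(n_generations):
        q = (1.0 - control_strength) * q + rng.gauss(0.0, noise_sd)
        trajectory.append(q)
    return trajectory


def stationary_variance(control_strength, noise_sd):
    # Stationary variance of the linear update above (an AR(1) process);
    # finite only when 0 < control_strength < 2.
    a = 1.0 - control_strength
    return noise_sd ** 2 / (1.0 - a * a)


if __name__ == "__main__":
    lineage = simulate_lineage(control_strength=0.5, noise_sd=0.1, n_generations=10)
    print(lineage)
    print(stationary_variance(0.5, 0.1))
```

In this sketch a larger `control_strength` means that a size fluctuation is corrected in fewer generations, while a value of zero leaves fluctuations uncorrected so that they accumulate like a random walk.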
-
Individuality and universality in the growth-division laws of single E. coli cells
Authors:
Andrew S. Kennard,
Matteo Osella,
Avelino Javer,
Jacopo Grilli,
Philippe Nghe,
Sander Tans,
Pietro Cicuta,
Marco Cosentino Lagomarsino
Abstract:
The mean size of exponentially dividing E. coli cells cultured in different nutrient conditions is known to depend on the mean growth rate only. However, the joint fluctuations relating cell size, doubling time and individual growth rate are only starting to be characterized. Recent studies in bacteria (i) revealed the near constancy of the size extension in a single cell cycle (adder mechanism), and (ii) reported a universal trend where the spread in both size and doubling times is a linear function of the population means of these variables. Here, we combine experiments and theory and use scaling concepts to elucidate the constraints posed by the second observation on the division control mechanism and on the joint fluctuations of sizes and doubling times. We found that scaling relations based on the means both collapse size and doubling-time distributions across different conditions, and explain how the shape of their joint fluctuations deviates from the means. Our data on these joint fluctuations highlight the importance of cell individuality: single cells do not follow the dependence observed for the means between size and either growth rate or inverse doubling time. Our calculations show that these results emerge from a broad class of division control mechanisms (including the adder mechanism as a particular case) requiring a certain scaling form of the so-called "division hazard rate function", which defines the probability rate of dividing as a function of measurable parameters. This gives a rationale for the universal body-size distributions observed in microbial ecosystems across many microbial species, presumably dividing with multiple mechanisms. Additionally, our experiments show a crossover between fast and slow growth in the relation between individual-cell growth rate and division time, which can be understood in terms of different regimes of genome replication control.
Submitted 26 December, 2015; v1 submitted 16 November, 2014;
originally announced November 2014.
-
Soft bounds on diffusion produce skewed distributions and Gompertz growth
Authors:
Salvatore Mandrà,
Marco Cosentino Lagomarsino,
Marco Gherardi
Abstract:
Constraints can affect dramatically the behavior of diffusion processes. Recently, we analyzed a natural and a technological system and reported that they perform diffusion-like discrete steps displaying a peculiar constraint, whereby the increments of the diffusing variable are subject to configuration-dependent bounds. This work explores theoretically some of the revealing landmarks of such phenomenology, termed "soft bound". At long times, the system reaches a steady state irreversibly (i.e., violating detailed balance), characterized by a skewed "shoulder" in the density distribution, and by a net local probability flux, which has entropic origin. The largest point in the support of the distribution follows a saturating dynamics, expressed by the Gompertz law, in line with empirical observations. Finally, we propose a generic allometric scaling for the origin of soft bounds. These findings shed light on the impact on a system of such "scaling" constraint and on its possible generating mechanisms.
Submitted 19 September, 2014;
originally announced September 2014.
-
Combined collapse by bridging and self-adhesion in a prototypical polymer model inspired by the bacterial nucleoid
Authors:
Vittore F. Scolari,
Marco Cosentino Lagomarsino
Abstract:
Recent experimental results suggest that the E. coli chromosome feels a self-attracting interaction of osmotic origin, and is condensed in foci by bridging interactions. Motivated by these findings, we explore a generic modeling framework combining solely these two ingredients, in order to characterize their joint effects. Specifically, we study a simple polymer physics computational model with weak ubiquitous short-ranged self-attraction and stronger sparse bridging interactions. Combining theoretical arguments and simulations, we study the general phenomenology of polymer collapse induced by these dual contributions, in the case of regularly-spaced bridging. Our results distinguish a regime of classical Flory-like coil-globule collapse, dictated by the interplay of excluded volume and attractive energy, and a switch-like collapse where bridging interactions compete with entropy loss terms from the looped arms of a star-like rosette. Additionally, we show that bridging can induce stable compartmentalized domains. In these configurations, different "cores" of bridging proteins are kept separated by star-like polymer loops in an entropically favorable multi-domain configuration, with a mechanism that parallels micellar polysoaps. Such compartmentalized domains are stable, and do not need any intra-specific interactions driving their segregation. Domains can also be stable in the presence of uniform attraction, as long as the uniform collapse is above its theta point.
Submitted 7 December, 2014; v1 submitted 11 August, 2014;
originally announced August 2014.
-
Hard and soft bounds in the evolution of Ubuntu packages. A lesson for species body masses?
Authors:
Marco Gherardi,
Salvatore Mandrà,
Bruno Bassetti,
Marco Cosentino Lagomarsino
Abstract:
Open-source software is a complex system; its development depends on the self-coordinated action of a large number of agents. This study follows the size of the building blocks, called "packages", of the Ubuntu Linux operating system over its entire history. The analysis reveals a multiplicative diffusion process, constrained by size-dependent bounds, driving the dynamics of the package-size distribution. A formalization of this into a quantitative model is able to match the data without relying on any adjustable parameters, and generates definite predictions. Finally, we formulate the hypothesis that a similar non-stationary mechanism could be shaping the distribution of mammal body sizes.
Submitted 28 February, 2013;
originally announced March 2013.
-
Speed of evolution in large asexual populations with diminishing returns
Authors:
Maria Rita Fumagalli,
Matteo Osella,
Philippe Thomen,
Francois Heslot,
Marco Cosentino Lagomarsino
Abstract:
The adaptive evolution of large asexual populations is generally characterized by competition between clones carrying different beneficial mutations. This interference phenomenon slows down the adaptation speed and makes the theoretical description of the dynamics more complex with respect to the successional occurrence and fixation of beneficial mutations typical of small populations. A simplified modeling framework considering multiple beneficial mutations with equal and constant fitness advantage captures some of the essential features of the actual complex dynamics, and some key predictions from this model are verified in laboratory evolution experiments. However, in these experiments the relative advantage of a beneficial mutation is generally dependent on the genetic background. In particular, the general pattern is that, as mutations in different loci accumulate, the relative advantage of new mutations decreases, a trend often referred to as "diminishing return" epistasis.
In this paper, we propose a phenomenological model that generalizes the fixed-advantage framework to include in a simple way this feature. To evaluate the quantitative consequences of diminishing returns on the evolutionary dynamics, we approach the model analytically as well as with direct simulations. Finally, we show how the model parameters can be matched with data from evolutionary experiments in order to infer the mean effect of epistasis and derive order-of-magnitude estimates of the rate of beneficial mutations. Applying this procedure to two experimental data sets gives values of the beneficial mutation rate within the range of previous measurements.
Submitted 19 December, 2012; v1 submitted 20 October, 2012;
originally announced October 2012.
-
Growth-rate-dependent dynamics of a bacterial genetic oscillator
Authors:
Matteo Osella,
Marco Cosentino Lagomarsino
Abstract:
Gene networks exhibiting oscillatory dynamics are widespread in biology. The minimal regulatory designs giving rise to oscillations have been implemented synthetically and studied by mathematical modeling. However, most of the available analyses generally neglect the coupling of regulatory circuits with the cellular "chassis" in which the circuits are embedded. For example, the intracellular macromolecular composition of fast-growing bacteria changes with growth rate. As a consequence, important parameters of gene expression, such as ribosome concentration or cell volume, are growth-rate dependent, ultimately coupling the dynamics of genetic circuits with cell physiology. This work addresses the effects of growth rate on the dynamics of a paradigmatic example of genetic oscillator, the repressilator. Making use of empirical growth-rate dependences of parameters in bacteria, we show that the repressilator dynamics can switch between oscillations and convergence to a fixed point depending on the cellular state of growth, and thus on the nutrients it is fed. The physical support of the circuit (type of plasmid or gene positions on the chromosome) also plays an important role in determining the oscillation stability and the growth-rate dependence of period and amplitude. This analysis has potential application in the field of synthetic biology, and suggests that the coupling between endogenous genetic oscillators and cell physiology can have substantial consequences for their functionality.
Submitted 30 January, 2013; v1 submitted 3 October, 2012;
originally announced October 2012.
-
Gene silencing and large-scale domain structure of the E. coli genome
Authors:
Mina Zarei,
Bianca Sclavi,
Marco Cosentino Lagomarsino
Abstract:
The H-NS chromosome-organizing protein in E. coli can stabilize genomic DNA loops, and form oligomeric structures connected to repression of gene expression. Motivated by the link between chromosome organization, protein binding and gene expression, we analyzed publicly available genomic data sets of various origins, from genome-wide protein binding profiles to evolutionary information, exploring the connections between chromosomal organization, gene silencing, pseudo-gene localization and horizontal gene transfer. We report the existence of transcriptionally silent contiguous areas corresponding to large regions of H-NS protein binding along the genome; their position indicates a possible relationship with the known large-scale features of chromosome organization.
Submitted 28 September, 2012;
originally announced September 2012.
-
Influence of homology and node-age on the growth of protein-protein interaction networks
Authors:
Arianna Bottinelli,
Bruno Bassetti,
Marco Cosentino Lagomarsino,
Marco Gherardi
Abstract:
Proteins participating in a protein-protein interaction network can be grouped into homology classes following their common ancestry. Proteins added to the network correspond to genes added to the classes, so that the dynamics of the two objects are intrinsically linked. Here, we first introduce a statistical model describing the joint growth of the network and the partitioning of nodes into classes, which is studied through a combined mean-field and simulation approach. We then employ this unified framework to address the specific issue of the age dependence of protein interactions, through the definition of three different node wiring/divergence schemes. Comparison with empirical data indicates that an age-dependent divergence move is necessary in order to reproduce the basic topological observables together with the age correlation between interacting nodes visible in empirical data. We also discuss the possibility of nontrivial joint partition/topology observables.
Submitted 16 October, 2012; v1 submitted 13 June, 2012;
originally announced June 2012.
-
Physical descriptions of the bacterial nucleoid at large scales, and their biological implications
Authors:
Vincenzo G. Benza,
Bruno Bassetti,
Kevin D. Dorfman,
Vittore F. Scolari,
Krystyna Bromek,
Pietro Cicuta,
Marco Cosentino Lagomarsino
Abstract:
Recent experimental and theoretical approaches have attempted to quantify the physical organization (compaction and geometry) of the bacterial chromosome with its complement of proteins (the nucleoid). The genomic DNA exists in a complex and dynamic protein-rich state, which is highly organised at various length scales. This has implications for modulating (when not enabling) the core biological processes of replication, transcription, and segregation. We overview the progress in this area, driven in the last few years by new scientific ideas and new interdisciplinary experimental techniques, ranging from high space- and time-resolution microscopy to high-throughput genomics employing sequencing to map different aspects of the nucleoid-related interactome. The aim of this review is to present the wide spectrum of experimental and theoretical findings coherently, from a physics viewpoint. We also discuss some attempts at interpretation that unify different results, highlighting the role that statistical and soft condensed matter physics, and in particular classic and more modern tools from the theory of polymers, play in describing this system of fundamental biological importance, and pointing to possible directions for future investigation.
Submitted 29 May, 2012; v1 submitted 16 April, 2012;
originally announced April 2012.
-
DnaA and the timing of chromosome replication in Escherichia coli as a function of growth rate
Authors:
Matthew A. A. Grant,
Chiara Saggioro,
Ulisse Ferrari,
Bruno Bassetti,
Bianca Sclavi,
Marco Cosentino Lagomarsino
Abstract:
Background: In Escherichia coli, overlapping rounds of DNA replication allow the bacteria to double in faster times than the time required to copy the genome. The precise timing of initiation of DNA replication is determined by a regulatory circuit that depends on the binding of a critical number of ATP-bound DnaA proteins at the origin of replication. The synthesis of DnaA in the cell is controlled by a growth-rate dependent, negatively autoregulated gene found near the origin of replication. Both the regulatory and initiation activity of DnaA depend on its nucleotide bound state and its availability.
Results: In order to investigate the contributions of the different regulatory processes to the timing of initiation of DNA replication at varying growth rates, we formulate a minimal quantitative model of the initiator circuit that includes the key ingredients known to regulate the activity of the DnaA protein. This model describes the average-cell oscillations in DnaA-ATP/DNA during the cell cycle, for varying growth rates. We evaluate the conditions under which this ratio attains the same threshold value at the time of initiation, independently of the growth rate.
Conclusions: We find that a quantitative description of replication initiation by DnaA must rely on the dependency of the basic parameters on growth rate, in order to account for the timing of initiation of DNA replication at different cell doubling times. We isolate two main possible scenarios for this. One possibility is that the basal rate of regulatory inactivation by ATP hydrolysis must vary with growth rate. Alternatively, some parameters defining promoter activity need to be a function of the growth rate. In either case, the basal rate of gene expression needs to increase with the growth rate, in accordance with the known characteristics of the dnaA promoter.
Submitted 4 January, 2012;
originally announced January 2012.
-
Joint scaling laws in functional and evolutionary categories in prokaryotic genomes
Authors:
Jacopo Grilli,
Bruno Bassetti,
Sergei Maslov,
Marco Cosentino Lagomarsino
Abstract:
We propose and study a class-expansion/innovation/loss model of genome evolution taking into account biological roles of genes and their constituent domains. In our model, the numbers of genes in different functional categories are coupled to each other. For example, an increase in the number of metabolic enzymes in a genome is usually accompanied by addition of new transcription factors regulating these enzymes. Such coupling can be thought of as a proportional "recipe" for genome composition of the type "a spoonful of sugar for each egg yolk". The model jointly reproduces two known empirical laws: the distribution of family sizes and the nonlinear scaling of the number of genes in certain functional categories (e.g. transcription factors) with genome size. In addition, it allows us to derive a novel relation between the exponents characterising these two scaling laws, establishing a direct quantitative connection between evolutionary and functional categories. It predicts that functional categories that grow faster than linearly with genome size are characterised by flatter-than-average family size distributions. This relation is confirmed by our bioinformatics analysis of prokaryotic genomes. This proves that the joint quantitative trends of functional and evolutionary classes can be understood in terms of evolutionary growth with proportional recipes.
Submitted 9 August, 2011; v1 submitted 30 January, 2011;
originally announced January 2011.
-
Gene clusters reflecting macrodomain structure respond to nucleoid perturbations
Authors:
Vittore F. Scolari,
Bruno Bassetti,
Bianca Sclavi,
Marco Cosentino Lagomarsino
Abstract:
Focusing on the DNA-bridging nucleoid proteins Fis and H-NS, and integrating several independent experimental and bioinformatic data sources, we investigate the links between chromosomal spatial organization and global transcriptional regulation. By means of a novel multi-scale spatial aggregation analysis, we uncover the existence of contiguous clusters of nucleoid-perturbation sensitive genes along the genome, whose expression is affected by a combination of topological DNA state and nucleoid-shaping protein occupancy. The clusters correlate well with the macrodomain structure of the genome. The most significant of them lie symmetrically at the edges of the ter macrodomain and involve all of the flagellar and chemotaxis machinery, in addition to key regulators of biofilm formation, suggesting that the regulation of the physical state of the chromosome by the nucleoid proteins plays an important role in coordinating the transcriptional response leading to the switch between a motile and a biofilm lifestyle.
Submitted 4 November, 2010;
originally announced November 2010.
-
Mean-field methods in evolutionary duplication-innovation-loss models for the genome-level repertoire of protein domains
Authors:
A. Angelini,
A. Amato,
G. Bianconi,
B. Bassetti,
M. Cosentino Lagomarsino
Abstract:
We present a combined mean-field and simulation approach to different models describing the dynamics of classes formed by elements that can appear, disappear or copy themselves. These models, related to a paradigmatic duplication-innovation model known as the Chinese Restaurant Process, are devised to reproduce the scaling behavior observed in the genome-wide repertoire of protein domains of all known species. In view of these data, we discuss the qualitative and quantitative differences of the alternative model formulations, focusing in particular on the roles of element loss and of the specificity of empirical domain classes.
Submitted 22 January, 2010; v1 submitted 22 October, 2009;
originally announced October 2009.
-
Functional models for large-scale gene regulation networks: realism and fiction
Authors:
M. Cosentino Lagomarsino,
B. Bassetti,
G. Castellani,
D. Remondini
Abstract:
High-throughput experiments are shedding light on the topology of large regulatory networks and at the same time their functional states, namely the states of activation of the nodes (for example transcript or protein levels) in different conditions, times, environments. We now possess a certain amount of information about these two levels of description, stored in libraries, databases and ontologies. A current challenge is to bridge the gap between topology and function, i.e. developing quantitative models aimed at characterizing the expression patterns of large sets of genes. However, approaches that work well for small networks become impossible to master at large scales, mainly because parameters proliferate. In this review we discuss the state of the art of large-scale functional network models, addressing the issue of what can be considered as realistic and what the main limitations may be. We also show some directions for future work, trying to set the goals that future models should try to achieve. Finally, we will emphasize the possible benefits in the understanding of biological mechanisms underlying complex multifactorial diseases, and in the development of novel strategies for the description and the treatment of such pathologies.
Submitted 12 February, 2009;
originally announced February 2009.
-
Identity and divergence of protein domain architectures after the Yeast Whole Genome Duplication event
Authors:
D. Fusco,
L. Grassi,
A. L. Sellerio,
D. Corà,
B. Bassetti,
M. Caselle,
M. Cosentino Lagomarsino
Abstract:
Analyzing the properties of duplicate genes during evolution is useful to understand the development of new cell functions. The yeast S. cerevisiae is a useful testing ground for this problem, because its duplicated genes with different evolutionary birth and destiny are well distinguishable. In particular, there is clear evidence for the occurrence of a Whole Genome Duplication (WGD) event in S. cerevisiae, and the genes derived from this event (WGD paralogs) are known. We studied WGD and non-WGD duplicates by two parallel analyses based on structural protein domains and on the Gene Ontology annotation scheme, respectively. The results show that while a large number of "duplicable" structural domains is shared in local and global duplications, WGD and non-WGD paralogs tend to have different functions. The reason for this is the existence of WGD and non-WGD specific domains with largely different functions. In agreement with the recent findings of Wapinski and collaborators (Nature 449, 2007), WGD paralogs often perform "core" cell functions, such as translation and DNA replication, while local duplications associate with "peripheral" functions such as response to stress. Our results also support the fact that domain architectures are a reliable tool to detect homology, as the domains of duplicates are largely invariant with date and nature of the duplication, while their sequences and also their functions might migrate.
Submitted 20 November, 2008;
originally announced November 2008.
-
Universal Features in the Genome-level Evolution of Protein Domains
Authors:
M. Cosentino Lagomarsino,
A. L. Sellerio,
P. D. Heijning,
B. Bassetti
Abstract:
Protein domains are found on genomes with notable statistical distributions, which bear a high degree of similarity. Previous work has shown how these distributions can be accounted for by simple models, where the main ingredients are probabilities of duplication, innovation, and loss of domains. However, no one so far has addressed the issue that these distributions follow definite trends depending on protein-coding genome size only. We present a stochastic duplication/innovation model, falling in the class of so-called Chinese Restaurant Processes, able to explain this feature of the data. Using only two universal parameters, related to a minimal number of domains and to the relative weight of innovation to duplication, the model reproduces two important aspects: (a) the populations of domain classes (the sets, related to homology classes, containing realizations of the same domain in different proteins) follow common power-laws whose cutoff is dictated by genome size, and (b) the number of domain families is universal and markedly sublinear in genome size. An important ingredient of the model is that the innovation probability decreases with genome size. We propose that this can be interpreted as a global constraint given by the cost of expanding an increasingly complex interactome.
Submitted 11 July, 2008;
originally announced July 2008.
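The duplication/innovation dynamics described in the abstract above can be illustrated with a short simulation. The sketch below uses the standard two-parameter Chinese Restaurant Process, which is an assumption here: the abstract does not give the exact parameterization used in the paper, and the function name `chinese_restaurant_process` and the parameters `theta` and `alpha` are illustrative choices. In this standard form the innovation probability falls as the number of domains grows, in line with the ingredient the abstract highlights.

```python
import random


def chinese_restaurant_process(n_domains, theta, alpha, rng):
    """Grow domain-class sizes one domain at a time (standard two-parameter CRP).

    With n domains already placed in k classes, the next domain founds a new
    class (innovation) with probability (theta + k * alpha) / (n + theta);
    otherwise it joins existing class i (duplication) with probability
    (n_i - alpha) / (n + theta). Assumes theta > 0 and 0 <= alpha < 1.
    """
    sizes = []  # sizes[i] = number of domains currently in class i
    for n in range(n_domains):
        k = len(sizes)
        innovation_weight = theta + k * alpha
        x = rng.random() * (n + theta)
        if k == 0 or x < innovation_weight:
            sizes.append(1)  # innovation: a new class appears
            continue
        x -= innovation_weight
        for i, size in enumerate(sizes):
            x -= size - alpha
            if x < 0:
                sizes[i] += 1  # duplication within class i
                break
        else:
            sizes[-1] += 1  # guard against floating-point round-off
    return sizes


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 10000):
        classes = chinese_restaurant_process(n, theta=5.0, alpha=0.3, rng=rng)
        print(n, len(classes), max(classes))
```

Running the script for increasing `n_domains` lets one compare how the number of classes and the largest class size grow with the total number of domains.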
-
A comparative evolutionary study of transcription networks
Authors:
A. L. Sellerio,
B. Bassetti,
H. Isambert,
M. Cosentino Lagomarsino
Abstract:
We present a comparative analysis of large-scale topological and evolutionary properties of transcription networks in three species, the two distant bacteria E. coli and B. subtilis, and the yeast S. cerevisiae. The study focuses on the global aspects of feedback and hierarchy in transcriptional regulatory pathways. While confirming that gene duplication has a significant impact on the shaping of all the analyzed transcription networks, our results point to distinct trends between the bacteria, where time constraints in the transcription of downstream genes might be important in shaping the hierarchical structure of the network, and yeast, which seems able to sustain a higher wiring complexity, which includes more feedback, an intricate hierarchy, and the combinatorial use of heterodimers made of duplicate transcription factors.
Submitted 15 May, 2008;
originally announced May 2008.
-
DIA-MCIS. An Importance Sampling Network Randomizer for Network Motif Discovery and Other Topological Observables in Transcription Networks
Authors:
D. Fusco,
B. Bassetti,
P. Jona,
M. Cosentino Lagomarsino
Abstract:
Transcription networks and other directed networks can be characterized by topological observables such as subgraph occurrence (network motifs). In order to perform this kind of analysis, it is necessary to be able to generate suitable randomized network ensembles. Typically, one considers null networks with the same degree sequences as the original ones. The commonly used algorithms sometimes have long convergence times and sampling problems. We present here an alternative, based on a variant of the importance sampling Monte Carlo developed by Chen et al. [1].
Submitted 1 June, 2007;
originally announced June 2007.
-
Hierarchy and Feedback in the Evolution of the E. coli Transcription Network
Authors:
M. Cosentino Lagomarsino,
P. Jona,
B. Bassetti,
H. Isambert
Abstract:
The E. coli transcription network has an essentially feedforward structure, with, however, abundant feedback at the level of self-regulations. Here, we investigate how these properties emerged during evolution. An assessment of the role of gene duplication based on protein domain architecture shows that (i) transcriptional autoregulators have mostly arisen through duplication, while (ii) the expected feedback loops stemming from their initial cross-regulation are strongly selected against. This requires a divergent coevolution of the transcription factor DNA-binding sites and their respective DNA cis-regulatory regions. Moreover, we find that the network tends to grow by expansion of the existing hierarchical layers of computation, rather than by addition of new layers. We also argue that rewiring of regulatory links due to mutation/selection of novel transcription factor/DNA binding interactions appears not to significantly affect the network global hierarchy, and that horizontally transferred genes are mainly added at the bottom, as new target nodes. These findings highlight the important evolutionary roles of both duplication and selective deletion of crosstalks between autoregulators in the emergence of the hierarchical transcription network of E. coli.
Submitted 24 January, 2007;
originally announced January 2007.
-
Randomization and Feedback Properties of Directed Graphs Inspired by Gene Networks
Authors:
M. Cosentino Lagomarsino,
B. Bassetti,
P. Jona
Abstract:
Having in mind the large-scale analysis of gene regulatory networks, we review a graph decimation algorithm, called "leaf-removal", which can be used to evaluate the feedback in a random graph ensemble. In doing this, we consider the possibility of analyzing networks where the diagonal of the adjacency matrix is structured, that is, has a fixed number of nonzero entries. We test these ideas on a network model with fixed degree, using both numerical and analytical calculations. Our results are the following. First, the leaf-removal behavior for large system size makes it possible to distinguish between different regimes of feedback. We show their relations and the connection with the onset of complexity in the graph. Second, the influence of the diagonal structure on this behavior can be relevant.
Submitted 27 June, 2006;
originally announced June 2006.
-
Random Networks Tossing Biased Coins
Authors:
F. Bassetti,
M. Cosentino Lagomarsino,
B. Bassetti,
P. Jona
Abstract:
In statistical mechanical investigations of complex networks, it is useful to employ random graph ensembles as null models, to compare with experimental realizations. Motivated by transcription networks, we present here a simple way to generate an ensemble of random directed graphs with, asymptotically, scale-free outdegree and compact indegree. Entries in each row of the adjacency matrix are set to be zero or one according to the toss of a biased coin, with a chosen probability distribution for the biases. This defines a quick and simple algorithm, which yields good results already for graphs of size n ~ 100. Perhaps more importantly, many of the relevant observables are accessible analytically, improving upon previous estimates for similar graphs.
Submitted 3 April, 2007; v1 submitted 2 April, 2006;
originally announced April 2006.
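The row-by-row coin-toss construction described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `biased_coin_graph` and `power_law_bias`, the clipped Pareto form of the bias distribution, and its `exponent` and `p_min` values are hypothetical choices, and the convention that entry (i, j) = 1 means an edge from node i to node j is also assumed.

```python
import random


def biased_coin_graph(n, draw_bias, rng):
    """Return an n x n 0/1 adjacency matrix built row by row.

    Each row draws its own bias p = draw_bias(rng); every entry of that row
    is then an independent coin toss that comes up 1 with probability p.
    """
    adjacency = []
    for _ in range(n):
        p = draw_bias(rng)
        row = [1 if rng.random() < p else 0 for _ in range(n)]
        adjacency.append(row)
    return adjacency


def power_law_bias(rng, exponent=2.0, p_min=0.01):
    """Hypothetical bias distribution: Pareto draw (density ~ p**-exponent), clipped to at most 1."""
    u = rng.random()
    p = p_min * (1.0 - u) ** (-1.0 / (exponent - 1.0))
    return min(p, 1.0)


def out_degrees(adjacency):
    """Row sums: number of targets of each node under the assumed convention."""
    return [sum(row) for row in adjacency]


def in_degrees(adjacency):
    """Column sums: number of regulators of each node under the assumed convention."""
    return [sum(column) for column in zip(*adjacency)]


if __name__ == "__main__":
    rng = random.Random(0)
    graph = biased_coin_graph(100, power_law_bias, rng)
    print("max outdegree:", max(out_degrees(graph)))
    print("max indegree:", max(in_degrees(graph)))
```

Because all entries in a row share one bias, the row sums inherit the spread of the bias distribution, while each column sum collects one independent toss per row and so stays much more concentrated.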
-
Continuum Description of the Cytoskeleton: Ring Formation in the Cell Cortex
Authors:
Alexander Zumdieck,
Marco Cosentino Lagomarsino,
Catalin Tanase,
Karsten Kruse,
Bela Mulder,
Marileen Dogterom,
Frank Jülicher
Abstract:
Motivated by the formation of ring-like filament structures in the cortex of plant and animal cells, we study the dynamics of a two-dimensional layer of cytoskeletal filaments and motor proteins near a surface by a general continuum theory. As a result of active processes, dynamic patterns of filament orientation and density emerge via instabilities. We show that self-organization phenomena can lead to the formation of stationary and oscillating rings. We present state diagrams which reveal a rich scenario of asymptotic behaviors and discuss the role of boundary conditions.
Submitted 27 October, 2005;
originally announced October 2005.
-
The large-scale logico-chemical structure of a transcriptional regulation network
Authors:
M. Cosentino Lagomarsino,
P. Jona,
B. Bassetti
Abstract:
Identity, response to external stimuli, and spatial architecture of a living system are central topics of molecular biology. Presently, they are largely seen as a result of the interplay between a gene repertoire and the regulatory machinery of the cell. At the transcriptional level, the cis-regulatory regions establish sets of interdependencies between transcription factors and genes, including other transcription factors. These "transcription networks" are too large to be approached globally with a detailed dynamical model. In this paper, we describe an approach to this problem that focuses solely on the compatibility between gene expression patterns and signal integration functions, discussing calculations carried out on the simplest, Boolean, realization of the model, and a first application to experimental data sets.
Submitted 15 February, 2005;
originally announced February 2005.
-
The Logic Backbone of a Transcription Network
Authors:
M. Cosentino Lagomarsino,
P. Jona,
B. Bassetti
Abstract:
A great part of the effort in the study of coarse-grained models of transcription networks is directed to the analysis of their dynamical features. In this letter, we consider the equilibrium properties of such systems, showing that the logic backbone underlying all dynamic descriptions has the structure of a computational optimization problem. It involves variables, which correspond to gene expression levels, and constraints, which describe the effect of cis-regulatory signal integration functions. In the simple paradigmatic case of Boolean variables and signal integration functions, we derive and discuss phase diagrams. Notably, the model exhibits a connectivity transition between a regime of simple, but uncertain, gene control and a regime of complex combinatorial control.
Submitted 2 December, 2005; v1 submitted 10 December, 2004;
originally announced December 2004.