-
A generative, predictive model for menstrual cycle lengths that accounts for potential self-tracking artifacts in mobile health data
Authors:
Kathy Li,
Iñigo Urteaga,
Amanda Shea,
Virginia J. Vitzthum,
Chris H. Wiggins,
Noémie Elhadad
Abstract:
Mobile health (mHealth) apps such as menstrual trackers provide a rich source of self-tracked health observations that can be leveraged for health-relevant research. However, such data streams have questionable reliability since they hinge on user adherence to the app. Therefore, it is crucial for researchers to separate true behavior from self-tracking artifacts. By taking a machine learning approach to modeling self-tracked cycle lengths, we can both make more informed predictions and learn the underlying structure of the observed data. In this work, we propose and evaluate a hierarchical, generative model for predicting next cycle length based on previously-tracked cycle lengths that accounts explicitly for the possibility of users skipping tracking their period. Our model offers several advantages: 1) accounting explicitly for self-tracking artifacts yields better prediction accuracy as likelihood of skipping increases; 2) because it is a generative model, predictions can be updated online as a given cycle evolves, and we can gain interpretable insight into how these predictions change over time; and 3) its hierarchical nature enables modeling of an individual's cycle length history while incorporating population-level information. Our experiments using mHealth cycle length data encompassing over 186,000 menstruators with over 2 million natural menstrual cycles show that our method yields state-of-the-art performance against neural network-based and summary statistic-based baselines, while providing insights on disentangling menstrual patterns from self-tracking artifacts. This work can benefit users, mHealth app developers, and researchers in better understanding cycle patterns and user adherence.
Submitted 16 March, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
Characterizing physiological and symptomatic variation in menstrual cycles using self-tracked mobile health data
Authors:
Kathy Li,
Iñigo Urteaga,
Chris H. Wiggins,
Anna Druet,
Amanda Shea,
Virginia J. Vitzthum,
Noémie Elhadad
Abstract:
The menstrual cycle is a key indicator of overall health for women of reproductive age. Previously, menstruation was primarily studied through survey results; however, as menstrual tracking mobile apps become more widely adopted, they provide an increasingly large, content-rich source of menstrual health experiences and behaviors over time. By exploring a database of user-tracked observations from the Clue app by BioWink of over 378,000 users and 4.9 million natural cycles, we show that self-reported menstrual tracker data can reveal statistically significant relationships between per-person cycle length variability and self-reported qualitative symptoms. A concern for self-tracked data is that they reflect not only physiological behaviors, but also the engagement dynamics of app users. To mitigate such potential artifacts, we develop a procedure to exclude cycles lacking user engagement, thereby allowing us to better distinguish true menstrual patterns from tracking anomalies. We uncover that women located at different ends of the menstrual variability spectrum, based on the consistency of their cycle length statistics, exhibit statistically significant differences in their cycle characteristics and symptom tracking patterns. We also find that cycle and period length statistics are stationary over the app usage timeline across the variability spectrum. The symptoms that we identify as showing statistically significant association with timing data can be useful to clinicians and users for predicting cycle variability from symptoms or as potential health indicators for conditions like endometriosis. Our findings showcase the potential of longitudinal, high-resolution self-tracked data to improve understanding of menstruation and women's health as a whole.
Submitted 14 May, 2020; v1 submitted 24 September, 2019;
originally announced September 2019.
-
Noise expands the response range of the Bacillus subtilis competence circuit
Authors:
Andrew Mugler,
Mark Kittisopikul,
Luke Hayden,
Jintao Liu,
Chris H. Wiggins,
Gurol M. Suel,
Aleksandra M. Walczak
Abstract:
Gene regulatory circuits must contend with intrinsic noise that arises due to finite numbers of proteins. While some circuits act to reduce this noise, others appear to exploit it. A striking example is the competence circuit in Bacillus subtilis, which exhibits much larger noise in the duration of its competence events than a synthetically constructed analog that performs the same function. Here, using stochastic modeling and fluorescence microscopy, we show that this larger noise allows cells to exit terminal phenotypic states, which expands the range of stress levels to which cells are responsive and leads to phenotypic heterogeneity at the population level. This is an important example of how noise confers a functional benefit in a genetic decision-making circuit.
Submitted 30 August, 2015;
originally announced August 2015.
-
Multiple Lac-mediated loops revealed by Bayesian statistics and tethered particle motion
Authors:
Stephanie Johnson,
Jan-Willem van de Meent,
Rob Phillips,
Chris H. Wiggins,
Martin Lindén
Abstract:
The bacterial transcription factor LacI loops DNA by binding to two separate locations on the DNA simultaneously. Despite being one of the best-studied model systems for transcriptional regulation, the number and conformations of loop structures accessible to LacI remain unclear, though the importance of multiple co-existing loops has been implicated in interactions between LacI and other cellular regulators of gene expression. To probe this issue, we have developed a new analysis method for tethered particle motion, a versatile and commonly-used in vitro single-molecule technique. Our method, vbTPM, performs variational Bayesian inference in hidden Markov models. It learns the number of distinct states (i.e., DNA-protein conformations) directly from tethered particle motion data with better resolution than existing methods, while easily correcting for common experimental artifacts. Studying short (roughly 100 bp) LacI-mediated loops, we provide evidence for three distinct loop structures, more than previously reported in single-molecule studies. Moreover, our results confirm that changes in LacI conformation and DNA binding topology both contribute to the repertoire of LacI-mediated loops formed in vitro, and provide qualitatively new input for models of looping and transcriptional regulation. We expect vbTPM to be broadly useful for probing complex protein-nucleic acid interactions.
Submitted 18 June, 2014; v1 submitted 4 February, 2014;
originally announced February 2014.
-
Hierarchically-coupled hidden Markov models for learning kinetic rates from single-molecule data
Authors:
Jan-Willem van de Meent,
Jonathan E. Bronson,
Frank Wood,
Ruben L. Gonzalez Jr.,
Chris H. Wiggins
Abstract:
We address the problem of analyzing sets of noisy time-varying signals that all report on the same process but confound straightforward analyses due to complex inter-signal heterogeneities and measurement artifacts. In particular we consider single-molecule experiments which indirectly measure the distinct steps in a biomolecular process via observations of noisy time-dependent signals such as a fluorescence intensity or bead position. Straightforward hidden Markov model (HMM) analyses attempt to characterize such processes in terms of a set of conformational states, the transitions that can occur between these states, and the associated rates at which those transitions occur; but require ad-hoc post-processing steps to combine multiple signals. Here we develop a hierarchically coupled HMM that allows experimentalists to deal with inter-signal variability in a principled and automatic way. Our approach is a generalized expectation maximization hyperparameter point estimation procedure with variational Bayes at the level of individual time series that learns a single interpretable representation of the overall data generating process.
Submitted 15 May, 2013;
originally announced May 2013.
-
Time-dependent information transmission in a model regulatory circuit
Authors:
Francesca Mancini,
Chris H. Wiggins,
Matteo Marsili,
Aleksandra M. Walczak
Abstract:
Many biological regulatory systems process signals out of steady state and respond with a physiological delay. A simple model of regulation which respects these features shows how the ability of a delayed output to transmit information is limited: at short times by the timescale of the dynamic input, at long times by that of the dynamic output. We find that topologies of maximally informative networks correspond to commonly occurring biological circuits linked to stress response and that circuits functioning out of steady state may exploit absorbing states to transmit information optimally.
Submitted 12 August, 2013; v1 submitted 19 February, 2013;
originally announced February 2013.
-
Identifying Hosts of Families of Viruses: A Machine Learning Approach
Authors:
Anil Raj,
Michael Dewar,
Gustavo Palacios,
Raul Rabadan,
Chris H. Wiggins
Abstract:
Identifying viral pathogens and characterizing their transmission is essential to developing effective public health measures in response to a pandemic. Phylogenetics, though currently the most popular tool used to characterize the likely host of a virus, can be ambiguous when studying species very distant to known species and when there is very little reliable sequence information available in the early stages of the pandemic. Motivated by an existing framework for representing biological sequence information, we learn sparse, tree-structured models, built from decision rules based on subsequences, to predict viral hosts from protein sequence data using popular discriminative machine learning tools. Furthermore, the predictive motifs robustly selected by the learning algorithm are found to show strong host-specificity and occur in highly conserved regions of the viral proteome.
Submitted 29 May, 2011;
originally announced May 2011.
-
A statistical method for revealing form-function relations in biological networks
Authors:
Andrew Mugler,
Boris Grinshpun,
Riley Franks,
Chris H. Wiggins
Abstract:
Over the past decade, a number of researchers in systems biology have sought to relate the function of biological systems to their network-level descriptions -- lists of the most important players and the pairwise interactions between them. Both for large networks (in which statistical analysis is often framed in terms of the abundance of repeated small subgraphs) and for small networks which can be analyzed in greater detail (or even synthesized in vivo and subjected to experiment), revealing the relationship between the topology of small subgraphs and their biological function has been a central goal. We here seek to pose this revelation as a statistical task, illustrated using a particular setup which has been constructed experimentally and for which parameterized models of transcriptional regulation have been studied extensively. The question "how does function follow form" is here mathematized by identifying which topological attributes correlate with the diverse possible information-processing tasks which a transcriptional regulatory network can realize. The resulting method reveals one form-function relationship which had earlier been predicted based on analytic results, and reveals a second for which we can provide an analytic interpretation. Resulting source code is distributed via http://formfunction.sourceforge.net.
Submitted 30 November, 2010;
originally announced December 2010.
-
Graphical models for inferring single molecule dynamics
Authors:
Jonathan E. Bronson,
Jake M. Hofman,
Jingyi Fei,
Ruben L. Gonzalez Jr.,
Chris H. Wiggins
Abstract:
Background: The recent explosion of experimental techniques in single molecule biophysics has generated a variety of novel time series data requiring equally novel computational tools for analysis and inference. This article describes in general terms how graphical modeling may be used to learn from biophysical time series data using the variational Bayesian expectation maximization algorithm (VBEM). The discussion is illustrated by the example of single-molecule fluorescence resonance energy transfer (smFRET) versus time data, where the smFRET time series is modeled as a hidden Markov model (HMM) with Gaussian observables. A detailed description of smFRET is provided as well. Results: The VBEM algorithm returns the model's evidence and an approximating posterior parameter distribution given the data. The former provides a metric for model selection via maximum evidence (ME), and the latter a description of the model's parameters learned from the data. ME/VBEM provide several advantages over the more commonly used approach of maximum likelihood (ML) optimized by the expectation maximization (EM) algorithm, the most important being a natural form of model selection and a well-posed (non-divergent) optimization problem. Conclusions: The results demonstrate the utility of graphical modeling for inference of dynamic processes in single molecule biophysics.
Submitted 4 September, 2010;
originally announced September 2010.
-
Analytic methods for modeling stochastic regulatory networks
Authors:
Aleksandra M. Walczak,
Andrew Mugler,
Chris H. Wiggins
Abstract:
The past decade has seen a revived interest in the unavoidable or intrinsic noise in biochemical and genetic networks arising from the finite copy number of the participating species. That is, rather than modeling regulatory networks in terms of the deterministic dynamics of concentrations, we model the dynamics of the probability of a given copy number of the reactants in single cells. Most of the modeling activity of the last decade has centered on stochastic simulation of individual realizations, i.e., Monte-Carlo methods for generating stochastic time series. Here we review the mathematical description in terms of probability distributions, introducing the relevant derivations and illustrating several cases for which analytic progress can be made either instead of or before turning to numerical computation.
Submitted 15 May, 2010;
originally announced May 2010.
-
Telling time with an intrinsically noisy clock
Authors:
Andrew Mugler,
Aleksandra M. Walczak,
Chris H. Wiggins
Abstract:
Intracellular transmission of information via chemical and transcriptional networks is thwarted by a physical limitation: the finite copy number of the constituent chemical species introduces unavoidable intrinsic noise. Here we provide a method for solving for the complete probabilistic description of intrinsically noisy oscillatory driving. We derive and numerically verify a number of simple scaling laws. Unlike in the case of measuring a static quantity, response to an oscillatory driving can exhibit a resonant frequency which maximizes information transmission. Further, we show that the optimal regulatory design is dependent on the biophysical constraints (i.e., the allowed copy number and response time). The resulting phase diagram illustrates under what conditions threshold regulation outperforms linear regulation.
Submitted 12 February, 2010;
originally announced February 2010.
-
Allosteric collaboration between elongation factor G and the ribosomal L1 stalk directs tRNA movements during translation
Authors:
Jingyi Fei,
Jonathan E. Bronson,
Jake M. Hofman,
Rathi L. Srinivas,
Chris H. Wiggins,
Ruben L. Gonzalez, Jr
Abstract:
Determining the mechanism by which transfer RNAs (tRNAs) rapidly and precisely transit through the ribosomal A, P and E sites during translation remains a major goal in the study of protein synthesis. Here, we report the real-time dynamics of the L1 stalk, a structural element of the large ribosomal subunit that is implicated in directing tRNA movements during translation. Within pre-translocation ribosomal complexes, the L1 stalk exists in a dynamic equilibrium between open and closed conformations. Binding of elongation factor G (EF-G) shifts this equilibrium towards the closed conformation through one of at least two distinct kinetic mechanisms, where the identity of the P-site tRNA dictates the kinetic route that is taken. Within post-translocation complexes, L1 stalk dynamics are dependent on the presence and identity of the E-site tRNA. Collectively, our data demonstrate that EF-G and the L1 stalk allosterically collaborate to direct tRNA translocation from the P to the E sites, and suggest a model for the release of E-site tRNA.
Submitted 2 September, 2009;
originally announced September 2009.
-
Spectral solutions to stochastic models of gene expression with bursts and regulation
Authors:
Andrew Mugler,
Aleksandra M. Walczak,
Chris H. Wiggins
Abstract:
Signal-processing molecules inside cells are often present at low copy number, which necessitates probabilistic models to account for intrinsic noise. Probability distributions have traditionally been found using simulation-based approaches which then require estimating the distributions from many samples. Here we present in detail an alternative method for directly calculating a probability distribution by expanding in the natural eigenfunctions of the governing equation, which is linear. We apply the resulting spectral method to three general models of stochastic gene expression: a single gene with multiple expression states (often used as a model of bursting in the limit of two states), a gene regulatory cascade, and a combined model of bursting and regulation. In all cases we find either analytic results or numerical prescriptions that greatly outperform simulations in efficiency and accuracy. In the last case, we show that bimodal response in the limit of slow switching is not only possible but optimal in terms of information transmission.
Submitted 20 July, 2009;
originally announced July 2009.
-
Learning Rates and States from Biophysical Time Series: A Bayesian Approach to Model Selection and Single-Molecule FRET Data
Authors:
Jonathan E. Bronson,
Jingyi Fei,
Jake M. Hofman,
Ruben L. Gonzalez, Jr.,
Chris H. Wiggins
Abstract:
Time series data provided by single-molecule Förster resonance energy transfer (smFRET) experiments offer the opportunity to infer not only model parameters describing molecular complexes, e.g. rate constants, but also information about the model itself, e.g. the number of conformational states. Resolving whether or how many of such states exist requires a careful approach to the problem of model selection, here meaning discriminating among models with differing numbers of states. The most straightforward approach to model selection generalizes the common idea of maximum likelihood (selecting the most likely parameter values) to maximum evidence: selecting the most likely model. In either case, such inference presents a tremendous computational challenge, which we here address by exploiting an approximation technique termed variational Bayes. We demonstrate how this technique can be applied to temporal data such as smFRET time series; show superior statistical consistency relative to the maximum likelihood approach; and illustrate how model selection in such probabilistic or generative modeling can facilitate analysis of closely related temporal data currently prevalent in biophysics. Source code used in this analysis, including a graphical user interface, is available open source via http://vbFRET.sourceforge.net
Submitted 20 July, 2009;
originally announced July 2009.
-
A stochastic spectral analysis of transcriptional regulatory cascades
Authors:
Aleksandra M. Walczak,
Andrew Mugler,
Chris H. Wiggins
Abstract:
The past decade has seen great advances in our understanding of the role of noise in gene regulation and the physical limits to signaling in biological networks. Here we introduce the spectral method for computation of the joint probability distribution over all species in a biological network. The spectral method exploits the natural eigenfunctions of the master equation of birth-death processes to solve for the joint distribution of modules within the network, which then inform each other and facilitate calculation of the entire joint distribution. We illustrate the method on a ubiquitous case in nature: linear regulatory cascades. The efficiency of the method makes possible numerical optimization of the input and regulatory parameters, revealing design properties of, e.g., the most informative cascades. We find, for threshold regulation, that a cascade of strong regulations converts a unimodal input to a bimodal output, that multimodal inputs are no more informative than bimodal inputs, and that a chain of up-regulations outperforms a chain of down-regulations. We anticipate that this numerical approach may be useful for modeling noise in a variety of small network topologies in biology.
Submitted 25 November, 2008;
originally announced November 2008.
-
Quantifying evolvability in small biological networks
Authors:
Andrew Mugler,
Etay Ziv,
Ilya Nemenman,
Chris H. Wiggins
Abstract:
We introduce a quantitative measure of the capacity of a small biological network to evolve. We apply our measure to a stochastic description of the experimental setup of Guet et al. (Science 296:1466, 2002), treating chemical inducers as functional inputs to biochemical networks and the expression of a reporter gene as the functional output. We take an information-theoretic approach, allowing the system to set parameters that optimize signal processing ability, thus enumerating each network's highest-fidelity functions. We find that all networks studied are highly evolvable by our measure, meaning that change in function has little dependence on change in parameters. Moreover, we find that each network's functions are connected by paths in the parameter space along which information is not significantly lowered, meaning a network may continuously change its functionality without losing it along the way. This property further underscores the evolvability of the networks.
Submitted 17 November, 2008;
originally announced November 2008.
-
Serially-regulated biological networks fully realize a constrained set of functions
Authors:
Andrew Mugler,
Etay Ziv,
Ilya Nemenman,
Chris H. Wiggins
Abstract:
We show that biological networks with serial regulation (each node regulated by at most one other node) are constrained to "direct functionality", in which the sign of the effect of an environmental input on a target species depends only on the direct path from the input to the target, even when there is a feedback loop allowing for multiple interaction pathways. Using a stochastic model for a set of small transcriptional regulatory networks that have been studied experimentally, we further find that all networks can achieve all functions permitted by this constraint under reasonable settings of biochemical parameters. This underscores the functional versatility of the networks.
Submitted 13 May, 2008;
originally announced May 2008.
-
Motif Discovery through Predictive Modeling of Gene Regulation
Authors:
Manuel Middendorf,
Anshul Kundaje,
Mihir Shah,
Yoav Freund,
Chris H. Wiggins,
Christina Leslie
Abstract:
We present MEDUSA, an integrative method for learning motif models of transcription factor binding sites by incorporating promoter sequence and gene expression data. We use a modern large-margin machine learning approach, based on boosting, to enable feature selection from the high-dimensional search space of candidate binding sequences while avoiding overfitting. At each iteration of the algorithm, MEDUSA builds a motif model whose presence in the promoter region of a gene, coupled with activity of a regulator in an experiment, is predictive of differential expression. In this way, we learn motifs that are functional and predictive of regulatory response rather than motifs that are simply overrepresented in promoter sequences. Moreover, MEDUSA produces a model of the transcriptional control logic that can predict the expression of any gene in the organism, given the sequence of the promoter region of the target gene and the expression state of a set of known or putative transcription factors and signaling molecules. Each motif model is either a $k$-length sequence, a dimer, or a PSSM that is built by agglomerative probabilistic clustering of sequences with similar boosting loss. By applying MEDUSA to a set of environmental stress response expression data in yeast, we learn motifs whose ability to predict differential expression of target genes outperforms motifs from the TRANSFAC dataset and from a previously published candidate set of PSSMs. We also show that MEDUSA retrieves many experimentally confirmed binding sites associated with environmental stress response from the literature.
Submitted 14 January, 2007;
originally announced January 2007.
-
Optimal signal processing in small stochastic biochemical networks
Authors:
Etay Ziv,
Ilya Nemenman,
Chris H. Wiggins
Abstract:
We quantify the influence of the topology of a transcriptional regulatory network on its ability to process environmental signals. By posing the problem in terms of information theory, we may do this without specifying the function performed by the network. Specifically, we study the maximum mutual information between the input (chemical) signal and the output (genetic) response attainable by the network in the context of an analytic model of particle number fluctuations. We perform this analysis for all biochemical circuits, including various feedback loops, that can be built out of 3 chemical species, each under the control of one regulator. We find that a generic network, constrained to low molecule numbers and reasonable response times, can transduce more information than a simple binary switch and, in fact, manages to achieve close to the optimal information transmission fidelity. These high-information solutions are robust to tenfold changes in most of the networks' biochemical parameters; moreover they are easier to achieve in networks containing cycles with an odd number of negative regulators (overall negative feedback) due to their decreased molecular noise (a result which we derive analytically). Finally, we demonstrate that a single circuit can support multiple high-information solutions. These findings suggest a potential resolution of the "cross-talk" dilemma as well as the previously unexplained observation that transcription factors which undergo proteolysis are more likely to be auto-repressive.
Submitted 21 December, 2006;
originally announced December 2006.
-
Dynamics of Semiflexible Polymers in a Flow Field
Authors:
Tobias Munk,
Oskar Hallatschek,
Chris H. Wiggins,
Erwin Frey
Abstract:
We present a novel method to investigate the dynamics of a single semiflexible polymer, subject to anisotropic friction in a viscous fluid. In contrast to previous approaches, we do not rely on a discrete bead-rod model, but introduce a suitable normal mode decomposition of a continuous space curve. By means of a perturbation expansion for stiff filaments we derive a closed set of coupled Langevin equations in mode space for the nonlinear dynamics in two dimensions, taking into account exactly the local constraint of inextensibility. The stochastic differential equations obtained this way are solved numerically, with parameters adjusted to describe the motion of actin filaments. As an example, we show results for the tumbling motion in shear flow.
Submitted 18 October, 2006; v1 submitted 26 April, 2006;
originally announced April 2006.
-
Flexive and Propulsive Dynamics of Elastica at Low Reynolds Numbers
Authors:
Chris H. Wiggins,
Raymond E. Goldstein
Abstract:
A stiff one-armed swimmer in glycerine goes nowhere, but if its arm is elastic, exerting a restorative torque proportional to local curvature, the swimmer can go on its way. Considering this happy consequence and the principles of elasticity, we study a hyperdiffusion equation for the shape of the elastica in viscous flow, find solutions for impulsive or oscillatory forcing, and elucidate relevant aspects of propulsion. These results have application in a variety of physical and biological contexts, from dynamic biopolymer bending experiments to instabilities of elastic filaments.
Submitted 31 July, 1997;
originally announced July 1997.