-
Model selection and parameter inference in phylogenetics using Nested Sampling
Authors:
Patricio Maturana,
Brendon J. Brewer,
Steffen Klaere,
Remco Bouckaert
Abstract:
Bayesian inference methods rely on numerical algorithms for both model selection and parameter inference. In general, these algorithms require a high computational effort to yield reliable estimates. One of the major challenges in phylogenetics is the estimation of the marginal likelihood. This quantity is commonly used for comparing different evolutionary models, but its calculation, even for simple models, incurs high computational cost. Another interesting challenge relates to the estimation of the posterior distribution. Often, long Markov chains are required to get sufficient samples to carry out parameter inference, especially for tree distributions. In general, these problems are addressed separately by using different procedures. Nested sampling (NS) is a Bayesian computation algorithm which provides the means to estimate marginal likelihoods together with their uncertainties, and to sample from the posterior distribution at no extra cost. The methods currently used in phylogenetics for marginal likelihood estimation lack practicality due to their dependence on many tuning parameters and the inability of most implementations to provide a direct way to calculate the uncertainties associated with the estimates. To address these issues, we introduce NS to phylogenetics. Its performance is assessed under different scenarios and compared to established methods. We conclude that NS is a competitive and attractive algorithm for phylogenetic inference. An implementation is available as a package for BEAST 2 under the LGPL licence, accessible at https://github.com/BEAST2-Dev/nested-sampling.
Submitted 10 April, 2018; v1 submitted 16 March, 2017;
originally announced March 2017.
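To make the nested sampling idea in the abstract above concrete, here is a minimal sketch on a one-dimensional toy problem. It is not the BEAST 2 package and does not involve trees: the uniform prior on [0, 1], the Gaussian likelihood, the function names, and the rejection step used to draw a replacement point are all assumptions chosen for illustration (a phylogenetic implementation would typically use MCMC moves for that step). The sketch shows how a single run yields an estimate of the log marginal likelihood, an uncertainty of roughly sqrt(H / N), and weighted posterior samples.

```python
import math
import random


def log_likelihood(theta, mu=0.5, sigma=0.1):
    # Toy Gaussian log-likelihood; mu and sigma are illustrative assumptions.
    return -0.5 * ((theta - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_add(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = (a, b) if a > b else (b, a)
    return hi + math.log1p(math.exp(lo - hi))


def nested_sampling(n_live=100, n_iter=800, seed=1):
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]       # draws from the U(0, 1) prior
    live_logl = [log_likelihood(t) for t in live]
    log_z = -math.inf
    dead = []                                          # (theta, logL, log weight)
    log_shrink = math.log1p(-math.exp(-1.0 / n_live))  # log(1 - exp(-1/N))
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_star = live_logl[worst]
        # Expected prior mass X_i = exp(-i / N); weight = L* (X_{i-1} - X_i).
        log_w = logl_star - (i - 1) / n_live + log_shrink
        log_z = log_add(log_z, log_w)
        dead.append((live[worst], logl_star, log_w))
        # Replace the worst point by a prior draw with strictly higher likelihood.
        while True:
            theta = rng.random()
            logl = log_likelihood(theta)
            if logl > logl_star:
                break
        live[worst], live_logl[worst] = theta, logl
    # The remaining live points share the leftover prior mass equally.
    log_w_live = -n_iter / n_live - math.log(n_live)
    for theta, logl in zip(live, live_logl):
        log_z = log_add(log_z, logl + log_w_live)
        dead.append((theta, logl, logl + log_w_live))
    # Information H and the usual sqrt(H / N) uncertainty on log Z.
    h = sum(math.exp(lw - log_z) * ll for _, ll, lw in dead) - log_z
    err = math.sqrt(max(h, 0.0) / n_live)
    return log_z, err, dead


log_z, err, samples = nested_sampling()
post_mean = sum(math.exp(lw - log_z) * t for t, _, lw in samples)
print(f"log Z = {log_z:.3f} +/- {err:.3f}, posterior mean = {post_mean:.3f}")
```

For this toy likelihood the true marginal likelihood is very close to 1, so the estimated log Z should be near 0 within the reported uncertainty. The same weighted samples give the posterior mean, which is the sense in which posterior samples come at no extra cost.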
-
Extinction in a branching process: Why some of the fittest strategies cannot guarantee survival
Authors:
Sterling Sawaya,
Steffen Klaere
Abstract:
The fitness of a biological strategy is typically measured by its expected reproductive rate, the first moment of its offspring distribution. However, strategies with high expected rates can also have high probabilities of extinction. A similar situation is found in gambling and investment, where strategies with a high expected payoff can also have a high risk of ruin. We take inspiration from the gambler's ruin problem to examine how extinction is related to population growth. Using moment theory we demonstrate how higher moments can impact the probability of extinction. We discuss how moments can be used to find bounds on the extinction probability, focusing on s-convex ordering of random variables, a method developed in actuarial science. This approach generates "best case" and "worst case" scenarios to provide upper and lower bounds on the probability of extinction. Our results demonstrate that even the most fit strategies can have high probabilities of extinction.
Submitted 15 May, 2013; v1 submitted 10 September, 2012;
originally announced September 2012.
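The claim in the abstract above, that a strategy with a higher expected reproductive rate can still face a high extinction probability, can be illustrated with a small sketch. It uses the standard Galton-Watson result that the extinction probability is the smallest non-negative fixed point of the offspring probability generating function, found here by iterating from 0. The two offspring distributions and all function names are hypothetical choices for illustration; this is not the s-convex bounding method of the paper.

```python
def pgf(offspring, s):
    # Probability generating function f(s) = sum_k p_k * s**k.
    return sum(p * s ** k for k, p in offspring.items())


def mean_offspring(offspring):
    # First moment of the offspring distribution.
    return sum(k * p for k, p in offspring.items())


def extinction_probability(offspring, tol=1e-12, max_iter=10_000):
    # Iterate q <- f(q) from q = 0; the limit is the smallest fixed point in [0, 1].
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(offspring, q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


# Hypothetical strategies, given as {number of offspring: probability}.
steady = {1: 0.5, 2: 0.5}     # never leaves zero offspring
risky = {0: 0.75, 8: 0.25}    # usually fails, occasionally has many offspring

for name, dist in (("steady", steady), ("risky", risky)):
    print(name, "mean:", mean_offspring(dist),
          "extinction probability:", round(extinction_probability(dist), 4))
```

The risky strategy has the larger mean (2.0 versus 1.5), yet a lineage started by one individual goes extinct with probability close to 0.79, while the steady strategy never goes extinct.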
-
On the group theoretical background of assigning stepwise mutations onto phylogenies
Authors:
Mareike Fischer,
Steffen Klaere,
Minh Anh Thi Nguyen,
Arndt von Haeseler
Abstract:
In a recent paper, Klaere et al. modeled the impact of substitutions on arbitrary branches of a phylogenetic tree on an alignment site by the so-called One Step Mutation (OSM) matrix. By utilizing the concept of the OSM matrix for the four-state nucleotide alphabet, Nguyen et al. presented an efficient procedure to compute the minimal number of substitutions needed to translate one alignment site into another. The present paper delivers a proof for this computation. Moreover, we provide several mathematical insights into the generalization of the OSM matrix to multistate alphabets. The construction of the OSM matrix is only possible if the matrices representing the substitution types acting on the character states and the identity matrix form a commutative group with respect to matrix multiplication. We illustrate a means to establish such a group for the twenty-state amino acid alphabet and critically discuss its biological usefulness.
Submitted 18 October, 2011;
originally announced October 2011.
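The group condition stated in the abstract above can be checked mechanically for a small case. The sketch below assumes the familiar Kimura three-substitution-type labelling for nucleotides (one transition class and two transversion classes), which the abstract itself does not spell out. Each substitution type is written as a permutation of the states A, G, C, T, which corresponds to a permutation matrix, and composition of permutations stands in for matrix multiplication. All names are hypothetical.

```python
from itertools import product

# State order assumed for illustration: A, G, C, T.
IDENTITY = (0, 1, 2, 3)
TRANSITION = (1, 0, 3, 2)       # A<->G, C<->T
TRANSVERSION_1 = (2, 3, 0, 1)   # A<->C, G<->T
TRANSVERSION_2 = (3, 2, 1, 0)   # A<->T, G<->C
SUBSTITUTION_TYPES = [IDENTITY, TRANSITION, TRANSVERSION_1, TRANSVERSION_2]


def compose(p, q):
    # Apply q first, then p (the analogue of multiplying permutation matrices).
    return tuple(p[q[i]] for i in range(len(q)))


def is_commutative_group(elements):
    # Associativity is inherited from function composition, so it is not tested.
    elems = set(elements)
    identity = tuple(range(len(elements[0])))
    pairs = list(product(elements, repeat=2))
    closed = all(compose(p, q) in elems for p, q in pairs)
    commutative = all(compose(p, q) == compose(q, p) for p, q in pairs)
    has_inverses = all(any(compose(p, q) == identity for q in elements) for p in elements)
    return identity in elems and closed and commutative and has_inverses


print(is_commutative_group(SUBSTITUTION_TYPES))
# A set that swaps only A and G alongside one transversion class is not closed.
print(is_commutative_group([IDENTITY, (1, 0, 2, 3), TRANSVERSION_1]))
```

Under these assumptions the four nucleotide substitution types form the Klein four-group, whereas an arbitrary collection of state swaps generally fails the closure test.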
-
An algebraic analysis of the two state Markov model on tripod trees
Authors:
Steffen Klaere,
Volkmar Liebscher
Abstract:
Methods of phylogenetic inference use more and more complex models to generate trees from data. However, even simple models and their implications are not fully understood.
Here, we investigate the two-state Markov model on a tripod tree, inferring conditions under which a given set of observations gives rise to such a model. This type of investigation has been undertaken before by several scientists from different fields of research.
In contrast to other work we fully analyse the model, presenting conditions under which one can infer a model from the observation or at least get support for the tree-shaped interdependence of the leaves considered.
We also present all conditions under which the results can be extended from tripod trees to quartet trees, a step necessary to reconstruct at least a topology. Apart from finding conditions under which such an extension works we discuss example cases for which such an extension does not work.
Submitted 2 December, 2011; v1 submitted 30 November, 2010;
originally announced December 2010.
-
The link between segregation and phylogenetic diversity
Authors:
David Bryant,
Steffen Klaere
Abstract:
We derive an invertible transform linking two widely used measures of species diversity: phylogenetic diversity and the expected proportions of segregating (non-constant) sites. We assume a bi-allelic, symmetric, finite site model of substitution. Like the Hadamard transform of Hendy and Penny, the transform can be expressed completely independent of the underlying phylogeny. Our results bridge work on diversity from two quite distinct scientific communities.
Submitted 29 August, 2010; v1 submitted 26 August, 2010;
originally announced August 2010.