Fast phylogeny reconstruction through learning of ancestral sequences
Authors:
Radu Mihaescu,
Cameron Hill,
Satish Rao
Abstract:
Given natural limitations on the length of DNA sequences, designing phylogenetic reconstruction methods which are reliable under limited information is a crucial endeavor. There have been two approaches to this problem: reconstructing partial but reliable information about the tree (\cite{Mo07, DMR08,DHJ06,GMS08}), and reaching "deeper" in the tree through reconstruction of ancestral sequences. In the latter category, \cite{DMR06} settled an important conjecture of M. Steel, showing that, under the CFN model of evolution, all trees on $n$ leaves with edge lengths bounded by the Ising model phase transition can be recovered with high probability from genomes of length $O(\log n)$ with a polynomial time algorithm. Their methods had a running time of $O(n^{10})$.
Here we enhance our methods from \cite{DHJ06} with the learning of ancestral sequences and provide an algorithm for reconstructing a sub-forest of the tree which is reliable given available data, without requiring a priori known bounds on the edge lengths of the tree. Our methods are based on an intuitive minimum spanning tree approach and run in $O(n^3)$ time. For the case of full reconstruction of trees with edges under the phase transition, we maintain the same sequence length requirements as \cite{DMR06}, despite the considerably faster running time.
Submitted 8 December, 2008;
originally announced December 2008.
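To make the "minimum spanning tree approach" mentioned in the abstract more concrete, here is a minimal sketch of the generic ingredients only: the standard CFN distance estimate $-\frac{1}{2}\ln(1-2p)$ from the fraction $p$ of differing sites, followed by Prim's algorithm over the leaf sequences. This is not the authors' algorithm (it learns no ancestral sequences and builds no sub-forest); the function names `cfn_distance` and `mst_edges` and the toy sequences are assumptions made for illustration.

```python
import math


def cfn_distance(x, y):
    """Standard CFN distance estimate between two equal-length binary strings."""
    # p is the observed fraction of sites at which the two sequences differ.
    p = sum(a != b for a, b in zip(x, y)) / len(x)
    if p >= 0.5:
        # The estimator is undefined at saturation; treat the pair as infinitely far.
        return math.inf
    return -0.5 * math.log(1.0 - 2.0 * p)


def mst_edges(seqs):
    """Prim's algorithm on the complete graph weighted by cfn_distance."""
    n = len(seqs)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge leaving the current tree.
        _, i, j = min(
            (cfn_distance(seqs[i], seqs[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Hypothetical toy data: three short binary "genomes".
toy = ["0000000000", "0000000001", "0000000111"]
toy_tree = mst_edges(toy)
```

Recomputing every candidate edge in each round keeps the sketch short; a real implementation would cache the distance matrix.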
Why neighbor-joining works
Authors:
Radu Mihaescu,
Dan Levy,
Lior Pachter
Abstract:
We show that the neighbor-joining algorithm is a robust quartet method for constructing trees from distances. This leads to a new performance guarantee that contains Atteson's optimal radius bound as a special case and explains many cases where neighbor-joining is successful even when Atteson's criterion is not satisfied. We also provide a proof for Atteson's conjecture on the optimal edge radius of the neighbor-joining algorithm. The strong performance guarantees we provide also hold for the quadratic time fast neighbor-joining algorithm, thus providing a theoretical basis for inferring very large phylogenies with neighbor-joining.
Submitted 17 June, 2007; v1 submitted 10 February, 2006;
originally announced February 2006.
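For readers unfamiliar with neighbor-joining, the pair-selection step it repeats can be sketched as follows. The sketch uses the textbook Q-criterion $Q(i,j) = (n-2)\,d(i,j) - \sum_k d(i,k) - \sum_k d(j,k)$ and returns the pair minimizing it; it does not reproduce the quartet-based analysis or the fast neighbor-joining variant from the abstract. The function name `nj_pick_pair` and the example matrix are assumptions chosen for illustration.

```python
def nj_pick_pair(d):
    """Return ((i, j), q) for the pair minimizing the neighbor-joining Q-criterion.

    d is a symmetric distance matrix given as a list of lists with a zero diagonal.
    """
    n = len(d)
    row_sums = [sum(row) for row in d]
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:  # strict comparison: ties keep the first pair found
                best_pair, best_q = (i, j), q
    return best_pair, best_q


# Hypothetical additive metric for the quartet tree ((0,1),(2,3)),
# with every edge (four leaf edges and one internal edge) of length 1.
quartet = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
pair, q_value = nj_pick_pair(quartet)
```

On this matrix the two true cherries (0,1) and (2,3) tie at the minimum, and the strict comparison keeps the first one encountered.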