Showing 1–10 of 10 results for author: Willis, A

  1. arXiv:2402.05231

    stat.ME stat.AP

    Estimating Fold Changes from Partially Observed Outcomes with Applications in Microbial Metagenomics

    Authors: David S Clausen, Sarah Teichman, Amy D Willis

    Abstract: We consider the problem of estimating fold-changes in the expected value of a multivariate outcome observed with unknown sample-specific and category-specific perturbations. This challenge arises in high-throughput sequencing studies of the abundance of microbial taxa because microbes are systematically over- and under-detected relative to their true abundances. Our model admits a partially identi…

    Submitted 14 March, 2025; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: v2 includes clarified exposition, additional examples, expanded simulation study, and supporting theory; Dr Teichman contributed substantially to v2 and is now recognised as a coauthor

  2. arXiv:2204.12733

    stat.ME

    Modeling complex measurement error in microbiome experiments to estimate relative abundances and detection effects

    Authors: David S Clausen, Amy D Willis

    Abstract: Accurate estimates of microbial species abundances are needed to advance our understanding of the role that microbiomes play in human and environmental health. However, artificially constructed microbiomes demonstrate that intuitive estimators of microbial relative abundances are biased. To address this, we propose a semiparametric method to estimate relative abundances, species detection effects,…

    Submitted 14 March, 2025; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: v2 includes detailed identifiability results, a complete proof of weak convergence, additional simulation results, and clarified exposition

  3. arXiv:1904.00117

    q-bio.QM stat.AP

    Estimation of cell lineage trees by maximum-likelihood phylogenetics

    Authors: Jean Feng, William S DeWitt III, Aaron McKenna, Noah Simon, Amy Willis, Frederick A Matsen IV

    Abstract: CRISPR technology has enabled large-scale cell lineage tracing for complex multicellular organisms by mutating synthetic genomic barcodes during organismal development. However, these sophisticated biological tools currently use ad-hoc and outmoded computational methods to reconstruct the cell lineage tree from the mutated barcodes. Because these methods are agnostic to the biological mechanism, t…

    Submitted 29 March, 2019; originally announced April 2019.

  4. arXiv:1902.02776

    stat.ME

    Modeling microbial abundances and dysbiosis with beta-binomial regression

    Authors: Bryan D. Martin, Daniela Witten, Amy D. Willis

    Abstract: Using a sample from a population to estimate the proportion of the population with a certain category label is a broadly important problem. In the context of microbiome studies, this problem arises when researchers wish to use a sample from a population of microbes to estimate the population proportion of a particular taxon, known as the taxon's relative abundance. In this paper, we propose a beta…

    Submitted 7 February, 2019; originally announced February 2019.

  5. arXiv:1611.03456

    stat.ME q-bio.PE

    Uncertainty in phylogenetic tree estimates

    Authors: Amy D. Willis, Rayna C. Bell

    Abstract: Estimating phylogenetic trees is an important problem in evolutionary biology, environmental policy and medicine. Although trees are estimated, their uncertainties are discarded by mathematicians working in tree space. Here we explicitly model the multivariate uncertainty of tree estimates. We consider both the cases where uncertainty information arises extrinsically (through covariate information…

    Submitted 12 October, 2017; v1 submitted 10 November, 2016; originally announced November 2016.

    Comments: Final version accepted to Journal of Computational and Graphical Statistics

  6. arXiv:1607.08288

    stat.ME q-bio.PE

    Confidence sets for phylogenetic trees

    Authors: Amy Willis

    Abstract: Inferring evolutionary histories (phylogenetic trees) has important applications in biology, criminology and public health. However, phylogenetic trees are complex mathematical objects that reside in a non-Euclidean space, which complicates their analysis. While our mathematical, algorithmic, and probabilistic understanding of phylogenies in their metric space is mature, rigorous inferential infra…

    Submitted 12 October, 2017; v1 submitted 27 July, 2016; originally announced July 2016.

    Comments: Final version accepted to the Journal of the American Statistical Association

    MSC Class: 62F03; 62C99; 62H86

  7. arXiv:1605.02082

    stat.AP stat.ME

    Improved detection of changes in species richness in high-diversity microbial communities

    Authors: Amy Willis, John Bunge, Thea Whitman

    Abstract: High throughput sequencing (HTS) continues to expand our understanding of microbial communities, despite insufficient sequencing depths to detect all rare taxa. These low abundance taxa are not accounted for in existing methods for detecting changes in species richness. We address this with a new hierarchical model that permits rigorous testing for both heterogeneity and biodiversity changes, and…

    Submitted 9 April, 2016; originally announced May 2016.

    Comments: arXiv admin note: text overlap with arXiv:1506.05710

  8. arXiv:1604.02598

    stat.ME

    Species richness estimation with high diversity but spurious singletons

    Authors: Amy Willis

    Abstract: The presence of uncommon taxa in high-throughput sequenced ecological samples poses challenges to the microbial ecologist, bioinformatician and statistician. It is rarely certain whether these taxa are truly present in the sample or the result of sequencing errors. Unfortunately, alpha-diversity quantification relies on accurate frequency counts, which can rarely be guaranteed. We present a species…

    Submitted 9 April, 2016; originally announced April 2016.

  9. arXiv:1506.05710

    stat.ME q-bio.PE

    Inference for changes in biodiversity

    Authors: Amy Willis, John Bunge, Thea Whitman

    Abstract: We wish to formally test for changes in the taxonomic diversity of a community, especially in the presence of high latent diversity. Drawing on the meta-analysis literature, we construct a model for diversity that accounts for covariate effects as well as sampling variability. This permits inference for changes in richness with covariates and also a test for homogeneity. We argue that we can use t…

    Submitted 18 June, 2015; originally announced June 2015.

    Comments: 23 pages, 4 figures

  10. arXiv:1408.3333

    stat.ME stat.AP

    Estimating Diversity via Frequency Ratios

    Authors: A. Willis, J. Bunge

    Abstract: We wish to estimate the total number of classes in a population based on sample counts, especially in the presence of high latent diversity. Drawing on probability theory that characterizes distributions on the integers by ratios of consecutive probabilities, we construct a nonlinear regression model for the ratios of consecutive frequency counts. This allows us to predict the unobserved count and…

    Submitted 9 December, 2014; v1 submitted 14 August, 2014; originally announced August 2014.

    Comments: 17 pages, 1 figure, 4 tables