Showing 1–18 of 18 results for author: Nelson, J

Searching in archive stat.
  1. arXiv:2504.11740  [pdf, other]

    stat.ME

    A cautionary note for plasmode simulation studies in the setting of causal inference

    Authors: Pamela A Shaw, Susan Gruber, Brian D. Williamson, Rishi Desai, Susan M. Shortreed, Chloe Krakauer, Jennifer C. Nelson, Mark J. van der Laan

    Abstract: Plasmode simulation has become an important tool for evaluating the operating characteristics of different statistical methods in complex settings, such as pharmacoepidemiological studies of treatment effectiveness using electronic health records (EHR) data. These studies provide insight into how estimator performance is impacted by challenges including rare events, small sample size, etc., that c…

    Submitted 15 April, 2025; originally announced April 2025.

    Comments: 55 pages, 6 tables, 2 figures, 8 supplemental tables, 4 supplemental figures

  2. arXiv:2412.15012  [pdf, other]

    stat.ME

    Assessing treatment effects in observational data with missing confounders: A comparative study of practical doubly-robust and traditional missing data methods

    Authors: Brian D. Williamson, Chloe Krakauer, Eric Johnson, Susan Gruber, Bryan E. Shepherd, Mark J. van der Laan, Thomas Lumley, Hana Lee, Jose J. Hernandez Munoz, Fengyu Zhao, Sarah K. Dutcher, Rishi Desai, Gregory E. Simon, Susan M. Shortreed, Jennifer C. Nelson, Pamela A. Shaw

    Abstract: In pharmacoepidemiology, safety and effectiveness are frequently evaluated using readily available administrative and electronic health records data. In these settings, detailed confounder data are often not available in all data sources and therefore missing on a subset of individuals. Multiple imputation (MI) and inverse-probability weighting (IPW) are go-to analytical methods to handle missing…

    Submitted 19 December, 2024; originally announced December 2024.

    Comments: 142 pages (27 main, 115 supplemental); 6 figures, 2 tables

  3. arXiv:2310.12871  [pdf, other]

    stat.AP cs.CY

    The origins of unpredictability in life trajectory prediction tasks

    Authors: Ian Lundberg, Rachel Brown-Weinstock, Susan Clampet-Lundquist, Sarah Pachman, Timothy J. Nelson, Vicki Yang, Kathryn Edin, Matthew J. Salganik

    Abstract: Why are life trajectories difficult to predict? We investigated this question through in-depth qualitative interviews with 40 families sampled from a multi-decade longitudinal study. Our sampling and interviewing process were informed by the earlier efforts of hundreds of researchers to predict life outcomes for participants in this study. The qualitative evidence we uncovered in these interviews…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 54 pages, 8 figures

    ACM Class: J.4

  4. arXiv:2306.04444  [pdf, other]

    cs.LG cs.CR stat.ML

    Fast Optimal Locally Private Mean Estimation via Random Projections

    Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar

    Abstract: We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity…

    Submitted 26 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Added the correct github link

  5. Analysis of the 24-Hour Activity Cycle: An illustration examining the association with cognitive function in the Adult Changes in Thought (ACT) Study

    Authors: Yinxiang Wu, Dori E. Rosenberg, Mikael Anne Greenwood-Hickman, Susan M. McCurry, Cecile Proust-Lima, Jennifer C. Nelson, Paul K. Crane, Andrea Z. LaCroix, Eric B. Larson, Pamela A. Shaw

    Abstract: The 24-hour activity cycle (24HAC) is a new paradigm for studying activity behaviors in relation to health outcomes. This approach captures the interrelatedness of the daily time spent in physical activity (PA), sedentary behavior (SB), and sleep. We illustrate and compare the use of three popular approaches, namely isotemporal substitution model (ISM), compositional data analysis (CoDA), and late…

    Submitted 19 January, 2023; originally announced January 2023.

    Comments: 51 pages, 11 tables, 8 figures

  6. arXiv:2207.10406  [pdf, other]

    stat.AP cond-mat.mtrl-sci cond-mat.soft

    Advice on describing Bayesian analysis of neutron and X-ray reflectometry

    Authors: Andrew R. McCluskey, Andrew J. Caruana, Christy J. Kinane, Alexander J. Armstrong, Tom Arnold, Joshaniel F. K. Cooper, David L. Cortie, Arwel V. Hughes, Jean-François Moulin, Andrew R. J. Nelson, Wojciech Potrzebowski, Vladimir Starostin

    Abstract: Driven by the availability of modern software and hardware, Bayesian analysis is becoming more popular in neutron and X-ray reflectometry analysis. The understandability and replicability of these analyses may be harmed by inconsistencies in how the probability distributions central to Bayesian methods are represented in the literature. Herein, we provide advice on how to report the results of Bay…

    Submitted 22 January, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

  7. arXiv:2205.07362  [pdf, ps, other]

    cs.LG math.RT stat.ML

    What is an equivariant neural network?

    Authors: Lek-Heng Lim, Bradley J. Nelson

    Abstract: We explain equivariant neural networks, a notion underlying breakthroughs in machine learning from deep convolutional neural networks for computer vision to AlphaFold 2 for protein structure prediction, without assuming knowledge of equivariance or neural networks. The basic mathematical ideas are simple but are often obscured by engineering complications that come with practical realizations. We…

    Submitted 16 November, 2022; v1 submitted 15 May, 2022; originally announced May 2022.

    Comments: 8 pages, 3 figures

    ACM Class: I.2.6

  8. arXiv:2203.01599  [pdf, ps, other]

    cs.LG cs.CG cs.DS stat.ML

    Uniform Approximations for Randomized Hadamard Transforms with Applications

    Authors: Yeshwanth Cherapanamjeri, Jelani Nelson

    Abstract: Randomized Hadamard Transforms (RHTs) have emerged as a computationally efficient alternative to the use of dense unstructured random matrices across a range of domains in computer science and machine learning. For several applications such as dimensionality reduction and compressed sensing, the theoretical guarantees for methods based on RHTs are comparable to approaches using dense random matric…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: STOC 2022

  9. arXiv:2110.08691  [pdf, other]

    cs.DS cs.CG cs.LG stat.ML

    Terminal Embeddings in Sublinear Time

    Authors: Yeshwanth Cherapanamjeri, Jelani Nelson

    Abstract: Recently (Elkin, Filtser, Neiman 2017) introduced the concept of a {\it terminal embedding} from one metric space $(X,d_X)$ to another $(Y,d_Y)$ with a set of designated terminals $T\subset X$. Such an embedding $f$ is said to have distortion $ρ\ge 1$ if $ρ$ is the smallest value such that there exists a constant $C>0$ satisfying \begin{equation*} \forall x\in T\ \forall q\in X,\ C d_X(x, q) \…

    Submitted 13 March, 2024; v1 submitted 16 October, 2021; originally announced October 2021.

    Journal ref: TheoretiCS, Volume 3 (March 14, 2024) theoretics:9167

  10. arXiv:1911.07688  [pdf, other]

    astro-ph.IM stat.CO

    emcee v3: A Python ensemble sampling toolkit for affine-invariant MCMC

    Authors: Daniel Foreman-Mackey, Will M. Farr, Manodeep Sinha, Anne M. Archibald, David W. Hogg, Jeremy S. Sanders, Joe Zuntz, Peter K. G. Williams, Andrew R. J. Nelson, Miguel de Val-Borro, Tobias Erhardt, Ilya Pashchenko, Oriol Abril Pla

    Abstract: emcee is a Python library implementing a class of affine-invariant ensemble samplers for Markov chain Monte Carlo (MCMC). This package has been widely applied to probabilistic modeling problems in astrophysics where it was originally published, with some applications in other fields. When it was first released in 2012, the interface implemented in emcee was fundamentally different from the MCMC li…

    Submitted 18 November, 2019; originally announced November 2019.

    Comments: Published in the Journal for Open Source Software

    Journal ref: Journal of Open Source Software, 4(43), 1864 (2019)

  11. arXiv:1909.12518  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Margin-Based Generalization Lower Bounds for Boosted Classifiers

    Authors: Allan Grønlund, Lior Kamma, Kasper Green Larsen, Alexander Mathiasen, Jelani Nelson

    Abstract: Boosting is one of the most successful ideas in machine learning. The most well-accepted explanations for the low generalization error of boosting algorithms such as AdaBoost stem from margin theory. The study of margins in the context of boosting algorithms was initiated by Schapire, Freund, Bartlett and Lee (1998) and has inspired numerous boosting algorithms and generalization bounds. To date,…

    Submitted 7 May, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

  12. arXiv:1905.12200  [pdf, other]

    cs.LG math.AT stat.ML

    A Topology Layer for Machine Learning

    Authors: Rickard Brüel-Gabrielsson, Bradley J. Nelson, Anjan Dwaraknath, Primoz Skraba, Leonidas J. Guibas, Gunnar Carlsson

    Abstract: Topology applied to real world data using persistent homology has started to find applications within machine learning, including deep learning. We present a differentiable topology layer that computes persistent homology based on level set filtrations and edge-based filtrations. We present three novel applications: the topological layer can (i) regularize data reconstruction or the weights of mac…

    Submitted 24 April, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

  13. arXiv:1903.06701  [pdf, other]

    cs.DC cs.LG cs.NI stat.ML

    Scaling Distributed Machine Learning with In-Network Aggregation

    Authors: Amedeo Sapio, Marco Canini, Chen-Yu Ho, Jacob Nelson, Panos Kalnis, Changhoon Kim, Arvind Krishnamurthy, Masoud Moshref, Dan R. K. Ports, Peter Richtárik

    Abstract: Training machine learning models in parallel is an increasingly important workload. We accelerate distributed parallel training by designing a communication primitive that uses a programmable switch dataplane to execute a key step of the training process. Our approach, SwitchML, reduces the volume of exchanged data by aggregating the model updates from multiple workers in the network. We co-design…

    Submitted 30 September, 2020; v1 submitted 22 February, 2019; originally announced March 2019.

  14. arXiv:1810.09250  [pdf, ps, other]

    cs.DS math.FA stat.ML

    Optimal terminal dimensionality reduction in Euclidean space

    Authors: Shyam Narayanan, Jelani Nelson

    Abstract: Let $\varepsilon\in(0,1)$ and $X\subset\mathbb R^d$ be arbitrary with $|X|$ having size $n>1$. The Johnson-Lindenstrauss lemma states there exists $f:X\rightarrow\mathbb R^m$ with $m = O(\varepsilon^{-2}\log n)$ such that $$ \forall x\in X\ \forall y\in X, \|x-y\|_2 \le \|f(x)-f(y)\|_2 \le (1+\varepsilon)\|x-y\|_2 . $$ We show that a strictly stronger version of this statement holds, answering one…

    Submitted 22 October, 2018; originally announced October 2018.

  15. arXiv:1808.04906  [pdf, ps, other]

    stat.ME

    Multiply Robust Causal Inference with Double Negative Control Adjustment for Categorical Unmeasured Confounding

    Authors: Xu Shi, Wang Miao, Jennifer C. Nelson, Eric J. Tchetgen Tchetgen

    Abstract: Unmeasured confounding is a threat to causal inference in observational studies. In recent years, use of negative controls to mitigate unmeasured confounding has gained increasing recognition and popularity. Negative controls have a longstanding tradition in laboratory sciences and epidemiology to rule out non-causal explanations, although they have been used primarily for bias detection. Recently…

    Submitted 4 September, 2019; v1 submitted 14 August, 2018; originally announced August 2018.

  16. arXiv:1512.06171  [pdf, other]

    stat.ME stat.CO stat.ML

    Regularized Estimation of Piecewise Constant Gaussian Graphical Models: The Group-Fused Graphical Lasso

    Authors: Alexander J. Gibberd, James D. B. Nelson

    Abstract: The time-evolving precision matrix of a piecewise-constant Gaussian graphical model encodes the dynamic conditional dependency structure of a multivariate time-series. Traditionally, graphical models are estimated under the assumption that data is drawn identically from a generating distribution. Introducing sparsity and sparse-difference inducing priors we relax these assumptions and propose a no…

    Submitted 31 October, 2017; v1 submitted 18 December, 2015; originally announced December 2015.

    Comments: 32 pages, 9 figures

    Journal ref: Journal of Computational and Graphical Statistics, 2017, Volume 26, Number 3, pp 623--634

  17. arXiv:1507.02268  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Optimal approximate matrix product in terms of stable rank

    Authors: Michael B. Cohen, Jelani Nelson, David P. Woodruff

    Abstract: We prove, using the subspace embedding guarantee in a black box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having $m = O(\tilde{r}/\varepsilon^2)$ rows. Here $\tilde{r}$ is the maximum stable rank, i.e. squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative impro…

    Submitted 2 March, 2016; v1 submitted 8 July, 2015; originally announced July 2015.

    Comments: v3: minor edits; v2: fixed one step in proof of Theorem 9 which was wrong by a constant factor (see the new Lemma 5 and its use; final theorem unaffected)

  18. arXiv:1311.2542  [pdf, ps, other]

    cs.DS cs.CG cs.IT math.PR stat.ML

    Toward a unified theory of sparse dimensionality reduction in Euclidean space

    Authors: Jean Bourgain, Sjoerd Dirksen, Jelani Nelson

    Abstract: Let $Φ\in\mathbb{R}^{m\times n}$ be a sparse Johnson-Lindenstrauss transform [KN14] with $s$ non-zeroes per column. For a subset $T$ of the unit sphere, $\varepsilon\in(0,1/2)$ given, we study settings for $m,s$ required to ensure $$ \mathop{\mathbb{E}}_Φ\sup_{x\in T} \left|\|Φx\|_2^2 - 1 \right| < \varepsilon , $$ i.e. so that $Φ$ preserves the norm of every $x\in T$ simultaneously and multiplica…

    Submitted 25 August, 2015; v1 submitted 11 November, 2013; originally announced November 2013.

    Journal ref: Geometric and Functional Analysis 25 (2015), no. 4, 1009-1088