Showing 1–28 of 28 results for author: Genovese, C R

  1. Expanding the scope of statistical computing: Training statisticians to be software engineers

    Authors: Alex Reinhart, Christopher R. Genovese

    Abstract: Traditionally, statistical computing courses have taught the syntax of a particular programming language or specific statistical computation methods. Since the publication of Nolan and Temple Lang (2010), we have seen a greater emphasis on data wrangling, reproducible research, and visualization. This shift better prepares students for careers working with complex datasets and producing analyses f…

    Submitted 28 October, 2020; v1 submitted 30 December, 2019; originally announced December 2019.

    Comments: 22 pages

    Journal ref: Journal of Statistics and Data Science Education (2021), 29:sup1, S7-S15

  2. arXiv:1601.07872  [pdf, ps, other]

    math.ST stat.ME

    Nonparametric Clustering of Functional Data Using Pseudo-Densities

    Authors: Mattia Ciollaro, Christopher R. Genovese, Daren Wang

    Abstract: We study nonparametric clustering of smooth random curves on the basis of the L2 gradient flow associated to a pseudo-density functional and we show that the clustering is well-defined both at the population and at the sample level. We provide an algorithm to mark significant local modes, which are associated to informative sample clusters, and we derive its consistency properties. Our theory is d…

    Submitted 28 January, 2016; originally announced January 2016.

    Journal ref: Electron. J. Statist., Volume 10, Number 2 (2016), 2922-2972

  3. arXiv:1509.06443  [pdf, other]

    astro-ph.CO stat.AP

    Cosmic Web Reconstruction through Density Ridges: Catalogue

    Authors: Yen-Chi Chen, Shirley Ho, Jon Brinkmann, Peter E. Freeman, Christopher R. Genovese, Donald P. Schneider, Larry Wasserman

    Abstract: We construct a catalogue for filaments using a novel approach called SCMS (subspace constrained mean shift; Ozertem & Erdogmus 2011; Chen et al. 2015). SCMS is a gradient-based method that detects filaments through density ridges (smooth curves tracing high-density regions). A great advantage of SCMS is its uncertainty measure, which allows an evaluation of the errors for the detected filaments. T…

    Submitted 21 September, 2015; originally announced September 2015.

    Comments: 14 pages, 12 figures, 4 tables

  4. arXiv:1509.06376  [pdf, other]

    astro-ph.GA astro-ph.CO stat.AP

    Detecting Effects of Filaments on Galaxy Properties in the Sloan Digital Sky Survey III

    Authors: Yen-Chi Chen, Shirley Ho, Rachel Mandelbaum, Neta A. Bahcall, Joel R. Brownstein, Peter E. Freeman, Christopher R. Genovese, Donald P. Schneider, Larry Wasserman

    Abstract: We study the effects of filaments on galaxy properties in the Sloan Digital Sky Survey (SDSS) Data Release 12 using filaments from the `Cosmic Web Reconstruction' catalogue (Chen et al. 2016), a publicly available filament catalogue for SDSS. Since filaments are tracers of medium-to-high density regions, we expect that galaxy properties associated with the environment are dependent on the distance…

    Submitted 12 January, 2017; v1 submitted 21 September, 2015; originally announced September 2015.

    Comments: To appear in MNRAS

  5. arXiv:1508.04149  [pdf, other]

    astro-ph.CO stat.AP

    Investigating Galaxy-Filament Alignments in Hydrodynamic Simulations using Density Ridges

    Authors: Yen-Chi Chen, Shirley Ho, Ananth Tenneti, Rachel Mandelbaum, Rupert Croft, Tiziana DiMatteo, Peter E. Freeman, Christopher R. Genovese, Larry Wasserman

    Abstract: In this paper, we study the filamentary structures and the galaxy alignment along filaments at redshift $z=0.06$ in the MassiveBlack-II simulation, a state-of-the-art, high-resolution hydrodynamical cosmological simulation which includes stellar and AGN feedback in a volume of (100 Mpc$/h$)$^3$. The filaments are constructed using the subspace constrained mean shift (SCMS; Ozertem & Erdogmus (2011…

    Submitted 17 August, 2015; originally announced August 2015.

    Comments: 11 pages, 10 figures

  6. arXiv:1506.08826  [pdf, other]

    math.ST stat.ME stat.ML

    Statistical Inference using the Morse-Smale Complex

    Authors: Yen-Chi Chen, Christopher R. Genovese, Larry Wasserman

    Abstract: The Morse-Smale complex of a function $f$ decomposes the sample space into cells where $f$ is increasing or decreasing. When applied to nonparametric density estimation and regression, it provides a way to represent, visualize, and compare multivariate functions. In this paper, we present some statistical results on estimating Morse-Smale complexes. This allows us to derive new results for two exi…

    Submitted 3 April, 2017; v1 submitted 29 June, 2015; originally announced June 2015.

    Comments: 45 pages, 13 figures. Accepted to Electronic Journal of Statistics

    MSC Class: 62G20 (Primary); 62G05; 62G08 (Secondary)

  7. arXiv:1506.02278  [pdf, other]

    stat.ME stat.ML

    Optimal Ridge Detection using Coverage Risk

    Authors: Yen-Chi Chen, Christopher R. Genovese, Shirley Ho, Larry Wasserman

    Abstract: We introduce the concept of coverage risk as an error measure for density ridge estimation. The coverage risk generalizes the mean integrated square error to set estimation. We propose two risk estimators for the coverage risk and we show that we can select tuning parameters by minimizing the estimated risk. We study the rate of convergence for coverage risk and prove consistency of the risk estim…

    Submitted 7 June, 2015; originally announced June 2015.

    Comments: 16 pages, 4 figures

  8. arXiv:1504.05438  [pdf, other]

    stat.ME math.ST

    Density Level Sets: Asymptotics, Inference, and Visualization

    Authors: Yen-Chi Chen, Christopher R. Genovese, Larry Wasserman

    Abstract: We derive asymptotic theory for the plug-in estimate for density level sets under Hausdorff loss. Based on the asymptotic theory, we propose two bootstrap confidence regions for level sets. The confidence regions can be used to perform tests for anomaly detection and clustering. We also introduce a technique to visualize high dimensional density level sets by combining mode clustering and multidime…

    Submitted 5 September, 2016; v1 submitted 21 April, 2015; originally announced April 2015.

    Comments: Accepted to JASA-T&M. 40 pages, 11 figures

    MSC Class: Primary: 62G20; Secondary 62G10; 62G15

  9. arXiv:1501.05303  [pdf, other]

    astro-ph.CO stat.AP

    Cosmic Web Reconstruction through Density Ridges: Method and Algorithm

    Authors: Yen-Chi Chen, Shirley Ho, Peter E. Freeman, Christopher R. Genovese, Larry Wasserman

    Abstract: The detection and characterization of filamentary structures in the cosmic web allows cosmologists to constrain parameters that dictate the evolution of the Universe. While many filament estimators have been proposed, they generally lack estimates of uncertainty, reducing their inferential power. In this paper, we demonstrate how one may apply the Subspace Constrained Mean Shift (SCMS) algorithm…

    Submitted 27 August, 2015; v1 submitted 21 January, 2015; originally announced January 2015.

    Comments: To appear in MNRAS. 18 pages, 19 figures, 1 table

  10. arXiv:1412.1716  [pdf, ps, other]

    stat.ME math.ST stat.ML

    Nonparametric modal regression

    Authors: Yen-Chi Chen, Christopher R. Genovese, Ryan J. Tibshirani, Larry Wasserman

    Abstract: Modal regression estimates the local modes of the distribution of $Y$ given $X=x$, instead of the mean, as in the usual regression sense, and can hence reveal important structure missed by usual regression methods. We study a simple nonparametric method for modal regression, based on a kernel density estimate (KDE) of the joint distribution of $Y$ and $X$. We derive asymptotic error bounds for thi…

    Submitted 30 March, 2016; v1 submitted 4 December, 2014; originally announced December 2014.

    Comments: Published at http://dx.doi.org/10.1214/15-AOS1373 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1373

    Journal ref: Annals of Statistics 2016, Vol. 44, No. 2, 489-514

  11. arXiv:1406.5663  [pdf, ps, other]

    stat.ME math.ST

    Asymptotic theory for density ridges

    Authors: Yen-Chi Chen, Christopher R. Genovese, Larry Wasserman

    Abstract: The large sample theory of estimators for density modes is well understood. In this paper we consider density ridges, which are a higher-dimensional extension of modes. Modes correspond to zero-dimensional, local high-density regions in point clouds. Density ridges correspond to $s$-dimensional, local high-density regions in point clouds. We establish three main results. First we show that under a…

    Submitted 13 October, 2015; v1 submitted 21 June, 2014; originally announced June 2014.

    Comments: Published at http://dx.doi.org/10.1214/15-AOS1329 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1329

    Journal ref: Annals of Statistics 2015, Vol. 43, No. 5, 1896-1928

  12. arXiv:1406.1803  [pdf, other]

    stat.ME cs.CG

    Generalized Mode and Ridge Estimation

    Authors: Yen-Chi Chen, Christopher R. Genovese, Larry Wasserman

    Abstract: The generalized density is a product of a density function and a weight function. For example, the average local brightness of an astronomical image is the probability of finding a galaxy times the mean brightness of the galaxy. We propose a method for studying the geometric structure of generalized densities. In particular, we show how to find the modes and ridges of a generalized density functio…

    Submitted 6 June, 2014; originally announced June 2014.

  13. arXiv:1406.1780  [pdf, other]

    stat.ME stat.ML

    A Comprehensive Approach to Mode Clustering

    Authors: Yen-Chi Chen, Christopher R. Genovese, Larry Wasserman

    Abstract: Mode clustering is a nonparametric method for clustering that defines clusters using the basins of attraction of a density estimator's modes. We provide several enhancements to mode clustering: (i) a soft variant of cluster assignment, (ii) a measure of connectivity between clusters, (iii) a technique for choosing the bandwidth, (iv) a method for denoising small clusters, and (v) an approach to vi…

    Submitted 22 December, 2015; v1 submitted 6 June, 2014; originally announced June 2014.

    Comments: 34 pages, 17 figures. Accepted to the Electronic Journal of Statistics. The original title is "Enhanced Mode Clustering"

    MSC Class: 62H30 (Primary); 62G07; 62G99 (Secondary)

  14. arXiv:1401.1867  [pdf, other]

    astro-ph.IM astro-ph.CO stat.AP

    Nonparametric 3D map of the IGM using the Lyman-alpha forest

    Authors: Jessi Cisewski, Rupert A. C. Croft, Peter E. Freeman, Christopher R. Genovese, Nishikanta Khandai, Melih Ozbek, Larry Wasserman

    Abstract: Visualizing the high-redshift Universe is difficult due to the dearth of available data; however, the Lyman-alpha forest provides a means to map the intergalactic medium at redshifts not accessible to large galaxy surveys. Large-scale structure surveys, such as the Baryon Oscillation Spectroscopic Survey (BOSS), have collected quasar (QSO) spectra that enable the reconstruction of HI density fluct…

    Submitted 8 January, 2014; originally announced January 2014.

  15. arXiv:1312.2098  [pdf, other]

    stat.ME cs.CG

    Uncertainty Measures and Limiting Distributions for Filament Estimation

    Authors: Yen-Chi Chen, Christopher R. Genovese, Larry Wasserman

    Abstract: A filament is a high density, connected region in a point cloud. There are several methods for estimating filaments but these methods do not provide any measure of uncertainty. We give a definition for the uncertainty of estimated filaments and we study statistical properties of the estimated filaments. We show how to estimate the uncertainty measures and we construct confidence sets based on a bo…

    Submitted 7 December, 2013; originally announced December 2013.

    Comments: Submitted to 30th Annual Symposium on Computational Geometry (SoCG2014)

  16. arXiv:1212.5156  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Nonparametric ridge estimation

    Authors: Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman

    Abstract: We study the problem of estimating the ridges of a density function. Ridge estimation is an extension of mode finding and is useful for understanding the structure of a density. It can also be used to find hidden structure in point cloud data. We show that, under mild regularity conditions, the ridges of the kernel density estimator consistently estimate the ridges of the true density. When the da…

    Submitted 28 August, 2014; v1 submitted 20 December, 2012; originally announced December 2012.

    Comments: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1218

    Journal ref: Annals of Statistics, Vol. 42, No. 4, 1511-1545 (2014)

  17. arXiv:1207.0538  [pdf, ps, other]

    stat.ME

    Efficient Estimators for Sequential and Resolution-Limited Inverse Problems

    Authors: Darren Homrighausen, Christopher R. Genovese

    Abstract: A common problem in the sciences is that a signal of interest is observed only indirectly, through smooth functionals of the signal whose values are then obscured by noise. In such inverse problems, the functionals dampen or entirely eliminate some of the signal's interesting features. This makes it difficult or even impossible to fully reconstruct the signal, even without noise. In this paper, we…

    Submitted 2 July, 2012; originally announced July 2012.

  18. Regularization Techniques for PSF-Matching Kernels. I. Choice of Kernel Basis

    Authors: A. C. Becker, D. Homrighausen, A. J. Connolly, C. R. Genovese, R. Owen, S. J. Bickerton, R. H. Lupton

    Abstract: We review current methods for building PSF-matching kernels for the purposes of image subtraction or coaddition. Such methods use a linear decomposition of the kernel on a series of basis functions. The correct choice of these basis functions is fundamental to the efficiency and effectiveness of the matching - the chosen bases should represent the underlying signal using a reasonably small number…

    Submitted 13 February, 2012; originally announced February 2012.

    Comments: Submitted to MNRAS; 5 figures

  19. arXiv:1109.4540  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Manifold estimation and singular deconvolution under Hausdorff loss

    Authors: Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman

    Abstract: We find lower and upper bounds for the risk of estimating a manifold in Hausdorff distance under several models. We also show that there are close connections between manifold estimation and the problem of deconvolving a singular measure.

    Submitted 5 June, 2012; v1 submitted 21 September, 2011; originally announced September 2011.

    Comments: Published at http://dx.doi.org/10.1214/12-AOS994 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS994

    Journal ref: Annals of Statistics 2012, Vol. 40, No. 2, 941-963

  20. Discussion of: Brownian distance covariance

    Authors: Christopher R. Genovese

    Abstract: Discussion on "Brownian distance covariance" by Gábor J. Székely and Maria L. Rizzo [arXiv:1010.0297]

    Submitted 5 October, 2010; originally announced October 2010.

    Comments: Published at http://dx.doi.org/10.1214/09-AOAS312G in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS312G

    Journal ref: Annals of Applied Statistics 2009, Vol. 3, No. 4, 1299-1302

  21. arXiv:1003.5536  [pdf, other]

    math.ST astro-ph.IM

    The Geometry of Nonparametric Filament Estimation

    Authors: Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman

    Abstract: We consider the problem of estimating filamentary structure from planar point process data. We make some connections with computational geometry and we develop nonparametric methods for estimating the filaments. We show that, under weak conditions, the filaments have a simple geometric representation as the medial axis of the data distribution's support. Our methods convert an estimator of the sup…

    Submitted 12 December, 2010; v1 submitted 25 March, 2010; originally announced March 2010.

    Comments: substantial revision

  22. arXiv:0910.5449  [pdf, ps, other]

    stat.AP astro-ph.IM stat.ME

    Straight to the Source: Detecting Aggregate Objects in Astronomical Images with Proper Error Control

    Authors: David A. Friedenberg, Christopher R. Genovese

    Abstract: The next generation of telescopes will acquire terabytes of image data on a nightly basis. Collectively, these large images will contain billions of interesting objects, which astronomers call sources. The astronomers' task is to construct a catalog detailing the coordinates and other properties of the sources. The source catalog is the primary data product for most telescopes and is an importan…

    Submitted 28 October, 2009; originally announced October 2009.

  23. Revealing components of the galaxy population through nonparametric techniques

    Authors: Steven P. Bamford, Alex L. Rojas, Robert C. Nichol, Christopher J. Miller, Larry Wasserman, Christopher R. Genovese, Peter E. Freeman

    Abstract: The distributions of galaxy properties vary with environment, and are often multimodal, suggesting that the galaxy population may be a combination of multiple components. The behaviour of these components versus environment holds details about the processes of galaxy development. To release this information we apply a novel, nonparametric statistical technique, identifying four components presen…

    Submitted 16 September, 2008; originally announced September 2008.

    Comments: 12 pages, 10 figures, accepted for publication in MNRAS

  24. On the path density of a gradient field

    Authors: Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman

    Abstract: We consider the problem of reliably finding filaments in point clouds. Realistic data sets often have numerous filaments of various sizes and shapes. Statistical techniques exist for finding one (or a few) filaments but these methods do not handle noisy data sets with many filaments. Other methods can be found in the astronomy literature but they do not have rigorous statistical guarantees. We p…

    Submitted 11 September, 2009; v1 submitted 27 May, 2008; originally announced May 2008.

    Comments: Published at http://dx.doi.org/10.1214/08-AOS671 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS671

    MSC Class: 62G99; 62G07; 62G20 (Primary)

    Journal ref: Annals of Statistics 2009, Vol. 37, No. 6A, 3236-3271

  25. arXiv:math/0701513  [pdf, ps, other]

    math.ST

    Adaptive Confidence Bands

    Authors: Christopher R. Genovese, Larry Wasserman

    Abstract: We show that there do not exist adaptive confidence bands for curve estimation except under very restrictive assumptions. We propose instead to construct adaptive bands that cover a surrogate function f^\star which is close to, but simpler than, f. The surrogate captures the significant features in f. We establish lower bounds on the width for any confidence band for f^\star and construct a proc…

    Submitted 18 January, 2007; originally announced January 2007.

  26. Examining the Effect of the Map-Making Algorithm on Observed Power Asymmetry in WMAP Data

    Authors: P. E. Freeman, C. R. Genovese, C. J. Miller, R. C. Nichol, L. Wasserman

    Abstract: We analyze first-year data of WMAP to determine the significance of asymmetry in summed power between arbitrarily defined opposite hemispheres, using maps that we create ourselves with software developed independently of the WMAP team. We find that over the multipole range l=[2,64], the significance of asymmetry is ~ 10^-4, a value insensitive to both frequency and power spectrum. We determine t…

    Submitted 13 October, 2005; originally announced October 2005.

    Comments: 45 pages, 16 figures (21 figure files), high-resolution versions of Figures 1-3 at http://www.stat.cmu.edu/~pfreeman, accepted for publication in ApJ

    Journal ref: Astrophys.J.638:1-19,2006

  27. Confidence sets for nonparametric wavelet regression

    Authors: Christopher R. Genovese, Larry Wasserman

    Abstract: We construct nonparametric confidence sets for regression functions using wavelets that are uniform over Besov balls. We consider both thresholding and modulation estimators for the wavelet coefficients. The confidence set is obtained by showing that a pivot process, constructed from the loss function, converges uniformly to a mean zero Gaussian process. Inverting this pivot yields a confidence…

    Submitted 30 May, 2005; originally announced May 2005.

    Comments: Published at http://dx.doi.org/10.1214/009053605000000011 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS009

    MSC Class: 62G15 (Primary) 62G99; 62M99; 62E20 (Secondary)

    Journal ref: Annals of Statistics 2005, Vol. 33, No. 2, 698-729

  28. Nonparametric Inference for the Cosmic Microwave Background

    Authors: Christopher R. Genovese, Christopher J. Miller, Robert C. Nichol, Mihir Arjunwadkar, Larry Wasserman

    Abstract: The Cosmic Microwave Background (CMB), which permeates the entire Universe, is the radiation left over from just 380,000 years after the Big Bang. On very large scales, the CMB radiation field is smooth and isotropic, but the existence of structure in the Universe - stars, galaxies, clusters of galaxies - suggests that the field should fluctuate on smaller scales. Recent observations, from the C…

    Submitted 6 October, 2004; originally announced October 2004.

    Comments: Invited review for "Statistical Science". Accepted for publication in February 2004 journal