Showing 1–44 of 44 results for author: Padilla, O H M

  1. arXiv:2505.09075  [pdf, ps, other

    stat.ML cs.LG

    Risk Bounds For Distributional Regression

    Authors: Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Sabyasachi Chatterjee

    Abstract: This work examines risk bounds for nonparametric distributional regression estimators. For convex-constrained distributional regression, general upper bounds are established for the continuous ranked probability score (CRPS) and the worst-case mean squared error (MSE) across the domain. These theoretical results are applied to isotonic and trend filtering distributional regression, yielding conver… ▽ More

    Submitted 13 May, 2025; originally announced May 2025.

  2. arXiv:2504.15879  [pdf, other

    stat.ME

    Multivariate Poisson intensity estimation via low-rank tensor decomposition

    Authors: Haotian Xu, Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Daren Wang

    Abstract: In this work, we introduce new matrix- and tensor-based methodologies for estimating multivariate intensity functions of spatial point processes. By modeling intensity functions as infinite-rank tensors within function spaces, we develop new algorithms to reveal optimal bias-variance trade-off for infinite-rank tensor estimation. Our methods dramatically enhance estimation accuracy while simultane… ▽ More

    Submitted 22 April, 2025; originally announced April 2025.

  3. arXiv:2412.20355  [pdf, other

    stat.ML cs.LG

    Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks

    Authors: Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Yik Lun Kei, Zhi Zhang, Yanzhen Chen

    Abstract: This paper addresses the problems of conditional variance estimation and confidence interval construction in nonparametric regression using dense networks with the Rectified Linear Unit (ReLU) activation function. We present a residual-based framework for conditional variance estimation, deriving nonasymptotic bounds for variance estimation under both heteroscedastic and homoscedastic settings. We… ▽ More

    Submitted 31 December, 2024; v1 submitted 29 December, 2024; originally announced December 2024.

  4. arXiv:2412.02986  [pdf, other

    stat.ME

    Bayesian Transfer Learning for Enhanced Estimation and Inference

    Authors: Daoyuan Lai, Oscar Hernan Madrid Padilla, Tian Gu

    Abstract: Transfer learning enhances model performance in a target population with limited samples by leveraging knowledge from related studies. While many works focus on improving predictive performance, challenges of statistical inference persist. Bayesian approaches naturally offer uncertainty quantification for parameter estimates, yet existing Bayesian transfer learning methods are typically limited to… ▽ More

    Submitted 3 December, 2024; originally announced December 2024.

    Comments: 40 pages, 4 figures, 1 Table

  5. arXiv:2411.09961  [pdf, other

    stat.ML cs.LG math.ST

    Dense ReLU Neural Networks for Temporal-spatial Model

    Authors: Zhi Zhang, Carlos Misael Madrid Padilla, Xiaokai Luo, Daren Wang, Oscar Hernan Madrid Padilla

    Abstract: In this paper, we focus on fully connected deep neural networks utilizing the Rectified Linear Unit (ReLU) activation function for nonparametric estimation. We derive non-asymptotic bounds that lead to convergence rates, addressing both temporal and spatial dependence in the observed measurements. By accounting for dependencies across time and space, our models better reflect the complexities of r… ▽ More

    Submitted 22 January, 2025; v1 submitted 15 November, 2024; originally announced November 2024.

  6. arXiv:2406.06014  [pdf, other

    math.ST cs.SI stat.ME stat.ML

    Network two-sample test for block models

    Authors: Chung Kyong Nguen, Oscar Hernan Madrid Padilla, Arash A. Amini

    Abstract: We consider the two-sample testing problem for networks, where the goal is to determine whether two sets of networks originated from the same stochastic model. Assuming no vertex correspondence and allowing for different numbers of nodes, we address a fundamental network testing problem that goes beyond simple adjacency matrix comparisons. We adopt the stochastic block model (SBM) for network dist… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  7. arXiv:2405.13970  [pdf, other

    stat.ME

    Conformal uncertainty quantification using kernel depth measures in separable Hilbert spaces

    Authors: Marcos Matabuena, Rahul Ghosal, Pavlo Mozharovskyi, Oscar Hernan Madrid Padilla, Jukka-Pekka Onnela

    Abstract: Depth measures have gained popularity in the statistical literature for defining level sets in complex data structures like multivariate data, functional data, and graphs. Despite their versatility, integrating depth measures into regression modeling for establishing prediction regions remains underexplored. To address this gap, we propose a novel method utilizing a model-free uncertainty quantifi… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  8. arXiv:2404.04719  [pdf, other

    stat.ME

    Change Point Detection in Dynamic Graphs with Decoder-only Latent Space Model

    Authors: Yik Lun Kei, Jialiang Li, Hangjian Li, Yanzhen Chen, Oscar Hernan Madrid Padilla

    Abstract: This manuscript studies the unsupervised change point detection problem in time series of graphs using a decoder-only latent space model. The proposed framework consists of learnable prior distributions for low-dimensional graph representations and of a decoder that bridges the observed graphs and latent representations. The prior distributions of the latent spaces are learned from the observed da… ▽ More

    Submitted 17 April, 2025; v1 submitted 6 April, 2024; originally announced April 2024.

  9. arXiv:2402.01635  [pdf, other

    stat.ME cs.LG stat.CO stat.ML

    kNN Algorithm for Conditional Mean and Variance Estimation with Automated Uncertainty Quantification and Variable Selection

    Authors: Marcos Matabuena, Juan C. Vidal, Oscar Hernan Madrid Padilla, Jukka-Pekka Onnela

    Abstract: In this paper, we introduce a kNN-based regression method that synergizes the scalability and adaptability of traditional non-parametric kNN models with a novel variable selection technique. This method focuses on accurately estimating the conditional mean and variance of random response variables, thereby effectively characterizing conditional distributions across diverse scenarios. Our approach i… ▽ More

    Submitted 2 February, 2024; originally announced February 2024.

  10. arXiv:2308.16172  [pdf, other

    stat.ME stat.ML

    Temporal-spatial model via Trend Filtering

    Authors: Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Daren Wang

    Abstract: This research focuses on the estimation of a non-parametric regression function designed for data with simultaneous time and space dependencies. In such a context, we study the Trend Filtering, a nonparametric estimator introduced by \cite{mammen1997locally} and \cite{rudin1992nonlinear}. For univariate settings, the signals we consider are assumed to have a kth weak derivative with bounded total… ▽ More

    Submitted 12 September, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

  11. arXiv:2306.15286  [pdf, other

    stat.ME

    Multilayer random dot product graphs: Estimation and online change point detection

    Authors: Fan Wang, Wanshan Li, Oscar Hernan Madrid Padilla, Yi Yu, Alessandro Rinaldo

    Abstract: We study the multilayer random dot product graph (MRDPG) model, an extension of the random dot product graph to multilayer networks. To estimate the edge probabilities, we deploy a tensor-based methodology and demonstrate its superiority over existing approaches. Moving to dynamic MRDPGs, we formulate and analyse an online change point detection framework. At every time point, we observe a realiza… ▽ More

    Submitted 10 June, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

  12. arXiv:2303.17642  [pdf, other

    stat.ME

    Change Point Detection on A Separable Model for Dynamic Networks

    Authors: Yik Lun Kei, Hangjian Li, Yanzhen Chen, Oscar Hernan Madrid Padilla

    Abstract: This paper studies the unsupervised change point detection problem in time series of networks using the Separable Temporal Exponential-family Random Graph Model (STERGM). Inherently, dynamic network patterns can be complex due to dyadic and temporal dependence, and change point detection can identify the discrepancies in the underlying data generating processes to facilitate downstream analysis.… ▽ More

    Submitted 2 March, 2025; v1 submitted 30 March, 2023; originally announced March 2023.

  13. arXiv:2301.11491  [pdf, other

    math.ST stat.ME

    Change point detection and inference in multivariable nonparametric models under mixing conditions

    Authors: Carlos Misael Madrid Padilla, Haotian Xu, Daren Wang, Oscar Hernan Madrid Padilla, Yi Yu

    Abstract: This paper studies multivariate nonparametric change point localization and inference problems. The data consists of a multivariate time series with potentially short range dependence. The distribution of this data is assumed to be piecewise constant with densities in a Hölder class. The change points, or times at which the distribution changes, are unknown. We derive the limiting distributions of… ▽ More

    Submitted 26 January, 2023; originally announced January 2023.

  14. arXiv:2211.14097  [pdf, other

    stat.ME

    Bayesian variance change point detection with credible sets

    Authors: Lorenzo Cappello, Oscar Hernan Madrid Padilla

    Abstract: This paper introduces a novel Bayesian approach to detect changes in the variance of a Gaussian sequence model, focusing on quantifying the uncertainty in the change point locations and providing a scalable algorithm for inference. Such a measure of uncertainty is necessary when change point methods are deployed in sensitive applications, for example, when one is interested in determining whether… ▽ More

    Submitted 2 March, 2025; v1 submitted 25 November, 2022; originally announced November 2022.

  15. arXiv:2208.03675  [pdf, other

    stat.ME math.ST stat.ML

    Kernel Biclustering algorithm in Hilbert Spaces

    Authors: Marcos Matabuena, J. C Vidal, Oscar Hernan Madrid Padilla, Dino Sejdinovic

    Abstract: Biclustering algorithms partition data and covariates simultaneously, providing new insights in several domains, such as analyzing gene expression to discover new biological functions. This paper develops a new model-free biclustering algorithm in abstract spaces using the notions of energy distance (ED) and the maximum mean discrepancy (MMD) -- two distances between probability distributions capa… ▽ More

    Submitted 7 August, 2022; originally announced August 2022.

  16. arXiv:2207.12638  [pdf, other

    math.ST cs.LG stat.ML

    Variance estimation in graphs with the fused lasso

    Authors: Oscar Hernan Madrid Padilla

    Abstract: We study the problem of variance estimation in general graph-structured problems. First, we develop a linear time estimator for the homoscedastic case that can consistently estimate the variance in general graphs. We show that our estimator attains minimax rates for the chain and 2D grid graphs when the mean signal has total variation with canonical scaling. Furthermore, we provide general upper b… ▽ More

    Submitted 18 February, 2024; v1 submitted 25 July, 2022; originally announced July 2022.

  17. arXiv:2206.09092  [pdf, other

    stat.ME math.ST

    Dynamic and heterogeneous treatment effects with abrupt changes

    Authors: Oscar Hernan Madrid Padilla, Yi Yu

    Abstract: From personalised medicine to targeted advertising, it is an inherent task to provide a sequence of decisions with historical covariates and outcome data. This requires understanding of both the dynamics and heterogeneity of treatment effects. In this paper, we are concerned with detecting abrupt changes in the treatment effects in terms of the conditional average treatment effect (CATE) in a sequ… ▽ More

    Submitted 17 June, 2022; originally announced June 2022.

  18. arXiv:2205.13651  [pdf, other

    stat.ME

    A Partially Separable Model for Dynamic Valued Networks

    Authors: Yik Lun Kei, Yanzhen Chen, Oscar Hernan Madrid Padilla

    Abstract: The Exponential-family Random Graph Model (ERGM) is a powerful model to fit networks with complex structures. However, for dynamic valued networks whose observations are matrices of counts that evolve over time, the development of the ERGM framework is still in its infancy. To facilitate the modeling of dyad value increment and decrement, a Partially Separable Temporal ERGM is proposed for dynamic… ▽ More

    Submitted 16 June, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

  19. arXiv:2205.09094  [pdf, other

    stat.ME

    High confidence inference on the probability an individual benefits from treatment using experimental or observational data with known propensity scores

    Authors: Gabriel Ruiz, Oscar Hernan Madrid Padilla

    Abstract: We seek to understand the probability an individual benefits from treatment (PIBT), an inestimable quantity that must be bounded in practice. Given the innate uncertainty in the population-level bounds on PIBT, we seek to better understand the margin of error for their estimation in order to discern whether the estimated bounds on PIBT are tight or wide due to random chance or not. Toward this goa… ▽ More

    Submitted 2 April, 2024; v1 submitted 18 May, 2022; originally announced May 2022.

    Comments: 9 pages, 3 figures

  20. arXiv:2202.01748  [pdf, other

    stat.ME cs.LG stat.ML

    Sequentially learning the topological ordering of causal directed acyclic graphs with likelihood ratio scores

    Authors: Gabriel Ruiz, Oscar Hernan Madrid Padilla, Qing Zhou

    Abstract: Causal discovery, the learning of causality in a data mining scenario, has been of strong scientific and theoretical interest as a starting point to identify "what causes what?" Contingent on assumptions and a proper learning algorithm, it is sometimes possible to identify and accurately estimate a causal directed acyclic graph (DAG), as opposed to a Markov equivalence class of graphs that gives a… ▽ More

    Submitted 19 May, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

  21. arXiv:2110.14298  [pdf, other

    math.ST stat.ML

    Denoising and change point localisation in piecewise-constant high-dimensional regression coefficients

    Authors: Fan Wang, Oscar Hernan Madrid Padilla, Yi Yu, Alessandro Rinaldo

    Abstract: We study the theoretical properties of the fused lasso procedure originally proposed by \cite{tibshirani2005sparsity} in the context of a linear regression model in which the regression coefficients are totally ordered and assumed to be sparse and piecewise constant. Despite its popularity, to the best of our knowledge, estimation error bounds in high-dimensional settings have only been obtained fo… ▽ More

    Submitted 18 February, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

  22. arXiv:2110.08665  [pdf, other

    stat.ME math.ST

    Quantile Regression by Dyadic CART

    Authors: Oscar Hernan Madrid Padilla, Sabyasachi Chatterjee

    Abstract: In this paper we propose and study a version of the Dyadic Classification and Regression Trees (DCART) estimator from Donoho (1997) for (fixed design) quantile regression in general dimensions. We refer to this proposed estimator as the QDCART estimator. Just like the mean regression version, we show that a) a fast dynamic programming based algorithm with computational complexity $O(N \log N)$ exi… ▽ More

    Submitted 16 October, 2021; originally announced October 2021.

  23. arXiv:2110.02401  [pdf, other

    stat.ME

    2D score based estimation of heterogeneous treatment effects

    Authors: Steven Siwei Ye, Yanzhen Chen, Oscar Hernan Madrid Padilla

    Abstract: Statisticians show growing interest in estimating and analyzing heterogeneity in causal effects in observational studies. However, there usually exists a trade-off between accuracy and interpretability for developing a desirable estimator for treatment effects, especially in the case when there are a large number of features in estimation. To make efforts to address the issue, we propose a score-b… ▽ More

    Submitted 23 June, 2023; v1 submitted 5 October, 2021; originally announced October 2021.

  24. arXiv:2110.00901  [pdf, other

    stat.ME

    A causal fused lasso for interpretable heterogeneous treatment effects estimation

    Authors: Oscar Hernan Madrid Padilla, Yanzhen Chen, Gabriel Ruiz, Carlos Misael Madrid Padilla

    Abstract: We propose a novel method for estimating heterogeneous treatment effects based on the fused lasso. By first ordering samples based on the propensity or prognostic score, we match units from the treatment and control groups. We then run the fused lasso to obtain piecewise constant treatment effects with respect to the ordering defined by the score. Similar to the existing methods based on discretiz… ▽ More

    Submitted 14 April, 2025; v1 submitted 2 October, 2021; originally announced October 2021.

  25. arXiv:2106.13685  [pdf, other

    stat.ME stat.ML

    Feature Grouping and Sparse Principal Component Analysis with Truncated Regularization

    Authors: Haiyan Jiang, Shanshan Qin, Oscar Hernan Madrid Padilla

    Abstract: In this paper, we consider a new variant for principal component analysis (PCA), aiming to capture the grouping and/or sparse structures of factor loadings simultaneously. To achieve these goals, we employ a non-convex truncated regularization with naturally adjustable sparsity and grouping effects, and propose the Feature Grouping and Sparse Principal Component Analysis (FGSPCA). The proposed FGS… ▽ More

    Submitted 13 September, 2022; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: 19 pages, 3 figures

  26. arXiv:2106.10383  [pdf, other

    stat.ME

    Scalable Bayesian change point detection with spike and slab priors

    Authors: Lorenzo Cappello, Oscar Hernan Madrid Padilla, Julia A. Palacios

    Abstract: We study the use of spike and slab priors for consistent estimation of the number of change points and their locations. Leveraging recent results in the variable selection literature, we show that an estimator based on spike and slab priors achieves optimal localization rate in the multiple offline change point detection problem. Based on this estimator, we propose a Bayesian change point detectio… ▽ More

    Submitted 18 June, 2021; originally announced June 2021.

  27. arXiv:2105.13504  [pdf, other

    math.ST cs.LG stat.ML

    Lattice partition recovery with dyadic CART

    Authors: Oscar Hernan Madrid Padilla, Yi Yu, Alessandro Rinaldo

    Abstract: We study piece-wise constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice. Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, de-noising and estimation have been studied extensively in the statistical and signal processing literature. In this paper we consider instead the problem of partition recovery, i.e.~of e… ▽ More

    Submitted 27 October, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

  28. arXiv:2012.01758  [pdf, other

    stat.ME math.ST

    Non-parametric Quantile Regression via the K-NN Fused Lasso

    Authors: Steven Siwei Ye, Oscar Hernan Madrid Padilla

    Abstract: Quantile regression is a statistical method for estimating conditional quantiles of a response variable. In addition, for mean estimation, it is well known that quantile regression is more robust to outliers than $l_2$-based methods. By using the fused lasso penalty over a $K$-nearest neighbors graph, we propose an adaptive quantile estimator in a non-parametric setup. We show that the estimator a… ▽ More

    Submitted 17 August, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Journal ref: Journal of Machine Learning Research, Vol. 22, No. 111, 1-38, 2021

  29. arXiv:2010.08236  [pdf, other

    math.ST stat.ML

    Quantile regression with deep ReLU Networks: Estimators and minimax rates

    Authors: Oscar Hernan Madrid Padilla, Wesley Tansey, Yanzhen Chen

    Abstract: Quantile regression is the task of estimating a specified percentile response, such as the median, from a collection of known covariates. We study quantile regression with rectified linear unit (ReLU) neural networks as the chosen model class. We derive an upper bound on the expected mean squared error of a ReLU network used to estimate any quantile conditional on a set of covariates. This upper b… ▽ More

    Submitted 17 December, 2020; v1 submitted 16 October, 2020; originally announced October 2020.

  30. arXiv:1912.04160  [pdf, other

    stat.ME

    Energy distance and kernel mean embeddings for two-sample survival testing

    Authors: Marcos Matabuena, Oscar Hernan Madrid Padilla

    Abstract: We study the comparison problem of distribution equality between two random samples under a right censoring scheme. To address this problem, we design a series of tests based on energy distance and kernel mean embeddings. We calibrate our tests using permutation methods and prove that they are consistent against all fixed continuous alternatives. To evaluate our proposed tests, we simulate surviva… ▽ More

    Submitted 9 December, 2019; originally announced December 2019.

  31. arXiv:1912.02151  [pdf, other

    econ.EM math.ST stat.ME

    High Dimensional Latent Panel Quantile Regression with an Application to Asset Pricing

    Authors: Alexandre Belloni, Mingli Chen, Oscar Hernan Madrid Padilla, Zixuan Wang

    Abstract: We propose a generalization of the linear panel quantile regression model to accommodate both \textit{sparse} and \textit{dense} parts: sparse means while the number of covariates available is large, potentially only a much smaller number of them have a nonzero impact on each conditional quantile of the response variable; while the dense part is represented by a low-rank matrix that can be approxima… ▽ More

    Submitted 23 August, 2022; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: forthcoming at the Annals of Statistics

  32. arXiv:1911.07494  [pdf, other

    stat.ME math.ST

    Change point localization in dependent dynamic nonparametric random dot product graphs

    Authors: Oscar Hernan Madrid Padilla, Yi Yu, Carey E. Priebe

    Abstract: In this paper, we study the offline change point localization problem in a sequence of dependent nonparametric random dot product graphs. To be specific, assume that at every time point, a network is generated from a nonparametric random dot product graph model \citep[see e.g.][]{athreya2017statistical}, where the latent positions are generated from unknown underlying distributions. The underlying… ▽ More

    Submitted 15 September, 2022; v1 submitted 18 November, 2019; originally announced November 2019.

  33. arXiv:1910.13289  [pdf, other

    math.ST stat.ME

    Optimal nonparametric multivariate change point detection and localization

    Authors: Oscar Hernan Madrid Padilla, Yi Yu, Daren Wang, Alessandro Rinaldo

    Abstract: We study the multivariate nonparametric change point detection problem, where the data are a sequence of independent $p$-dimensional random vectors whose distributions are piecewise-constant with Lipschitz densities changing at unknown times, called change points. We quantify the size of the distributional change at any change point with the supremum norm of the difference between the correspondin… ▽ More

    Submitted 25 June, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

  34. arXiv:1905.10848  [pdf, ps, other

    stat.ML cs.LG

    Learning Gaussian DAGs from Network Data

    Authors: Hangjian Li, Oscar Hernan Madrid Padilla, Qing Zhou

    Abstract: Structural learning of directed acyclic graphs (DAGs) or Bayesian networks has been studied extensively under the assumption that data are independent. We propose a new Gaussian DAG model for dependent data which assumes the observations are correlated according to an undirected network. Under this model, we develop a method to estimate the DAG structure given a topological ordering of the nodes.… ▽ More

    Submitted 28 July, 2021; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: 14 pages, 5 figures

  35. arXiv:1905.10019  [pdf, other

    stat.ME math.ST

    Optimal nonparametric change point detection and localization

    Authors: Oscar Hernan Madrid Padilla, Yi Yu, Daren Wang, Alessandro Rinaldo

    Abstract: We study change point detection and localization for univariate data in fully nonparametric settings in which, at each time point, we acquire an i.i.d. sample from an unknown distribution. We quantify the magnitude of the distributional changes at the change points using the Kolmogorov--Smirnov distance. We allow all the relevant parameters -- the minimal spacing between two consecutive change poi… ▽ More

    Submitted 23 May, 2019; originally announced May 2019.

    MSC Class: Change point detection; Minimax optimality

  36. Optimal post-selection inference for sparse signals: a nonparametric empirical-Bayes approach

    Authors: Spencer Woody, Oscar Hernan Madrid Padilla, James G. Scott

    Abstract: Many recently developed Bayesian methods have focused on sparse signal detection. However, much less work has been done addressing the natural follow-up question: how to make valid inferences for the magnitude of those signals after selection. Ordinary Bayesian credible intervals suffer from selection bias, owing to the fact that the target of inference is chosen adaptively. Existing Bayesian appr… ▽ More

    Submitted 13 November, 2020; v1 submitted 25 October, 2018; originally announced October 2018.

  37. arXiv:1807.11641  [pdf, other

    stat.ME

    Adaptive Non-Parametric Regression With the $K$-NN Fused Lasso

    Authors: Oscar Hernan Madrid Padilla, James Sharpnack, Yanzhen Chen, Daniela M. Witten

    Abstract: The fused lasso, also known as total-variation denoising, is a locally-adaptive function estimator over a regular grid of design points. In this paper, we extend the fused lasso to settings in which the points do not occur on a regular grid, leading to an approach for non-parametric regression. This approach, which we call the $K$-nearest neighbors ($K$-NN) fused lasso, involves (i) computing the… ▽ More

    Submitted 8 July, 2019; v1 submitted 30 July, 2018; originally announced July 2018.

  38. arXiv:1805.07042  [pdf, other

    stat.ME

    Graphon estimation via nearest neighbor algorithm and 2D fused lasso denoising

    Authors: Oscar Hernan Madrid Padilla

    Abstract: We propose a class of methods for graphon estimation based on exploiting connections with nonparametric regression. The idea is to construct an ordering of the nodes in the network, similar in spirit to Chan and Airoldi (2014). However, rather than only considering orderings based on the empirical degree as in Chan and Airoldi (2014), we use the nearest neighbor algorithm which is an approximating… ▽ More

    Submitted 18 June, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

  39. arXiv:1612.07867  [pdf, other

    stat.ME

    Sequential nonparametric tests for a change in distribution: an application to detecting radiological anomalies

    Authors: Oscar Hernan Madrid Padilla, Alex Athey, Alex Reinhart, James G. Scott

    Abstract: We propose a sequential nonparametric test for detecting a change in distribution, based on windowed Kolmogorov--Smirnov statistics. The approach is simple, robust, highly computationally efficient, easy to calibrate, and requires no parametric assumptions about the underlying null and alternative distributions. We show that both the false-alarm rate and the power of our procedure are amenable to… ▽ More

    Submitted 22 December, 2016; originally announced December 2016.

  40. arXiv:1511.06750  [pdf, other

    stat.ME

    A deconvolution path for mixtures

    Authors: Oscar Hernan Madrid Padilla, Nicholas G. Polson, James G. Scott

    Abstract: We propose a class of estimators for deconvolution in mixture models based on a simple two-step "bin-and-smooth" procedure applied to histogram counts. The method is both statistically and computationally efficient: by exploiting recent advances in convex optimization, we are able to provide a full deconvolution path that shows the estimate for the mixing distribution across a range of plausible d… ▽ More

    Submitted 25 May, 2017; v1 submitted 20 November, 2015; originally announced November 2015.

    Journal ref: Electronic Journal of Statistics Volume 12, Number 1 (2018), 1717-1751

  41. arXiv:1509.04348  [pdf, other

    stat.ME

    Nonparametric density estimation by histogram trend filtering

    Authors: Oscar Hernan Madrid Padilla, James G. Scott

    Abstract: We propose a novel approach for density estimation called histogram trend filtering. Our estimator arises from looking at a surrogate Poisson model for counts of observations in a partition of the support of the data. We begin by showing consistency for a variational estimator for this density estimation problem. We then study a discrete estimator that can be efficiently found via convex optimizatio… ▽ More

    Submitted 6 February, 2016; v1 submitted 14 September, 2015; originally announced September 2015.

  42. arXiv:1505.05117  [pdf, other

    stat.ML

    Vector-Space Markov Random Fields via Exponential Families

    Authors: Wesley Tansey, Oscar Hernan Madrid Padilla, Arun Sai Suggala, Pradeep Ravikumar

    Abstract: We present Vector-Space Markov Random Fields (VS-MRFs), a novel class of undirected graphical models where each variable can belong to an arbitrary vector space. VS-MRFs generalize a recent line of work on scalar-valued, uni-parameter exponential family and mixed graphical models, thereby greatly broadening the class of exponential families available (e.g., allowing multinomial and Dirichlet distr… ▽ More

    Submitted 19 May, 2015; originally announced May 2015.

    Comments: See https://github.com/tansey/vsmrfs for code

  43. arXiv:1502.06930  [pdf, ps, other

    stat.ME stat.CO stat.ML

    Tensor decomposition with generalized lasso penalties

    Authors: Oscar Hernan Madrid Padilla, James G. Scott

    Abstract: We present an approach for penalized tensor decomposition (PTD) that estimates smoothly varying latent factors in multi-way data. This generalizes existing work on sparse tensor decomposition and penalized matrix decompositions, in a manner parallel to the generalized lasso for regression and smoothing problems. Our approach presents many nontrivial challenges at the intersection of modeling and c… ▽ More

    Submitted 12 May, 2016; v1 submitted 24 February, 2015; originally announced February 2015.

  44. arXiv:1404.3331  [pdf, other

    stat.ME stat.ML

    Priors for Random Count Matrices Derived from a Family of Negative Binomial Processes

    Authors: Mingyuan Zhou, Oscar Hernan Madrid Padilla, James G. Scott

    Abstract: We define a family of probability distributions for random count matrices with a potentially unbounded number of rows and columns. The three distributions we consider are derived from the gamma-Poisson, gamma-negative binomial, and beta-negative binomial processes. Because the models lead to closed-form Gibbs sampling update equations, they are natural candidates for nonparametric Bayesian priors… ▽ More

    Submitted 13 July, 2015; v1 submitted 12 April, 2014; originally announced April 2014.

    Comments: To appear in Journal of the American Statistical Association (Theory and Methods). 31 pages + 11 page supplement, 5 figures