Showing 1–21 of 21 results for author: Kosorok, M R

Searching in archive math.
  1. arXiv:2504.16780  [pdf, other]

    math.ST stat.ME

    Linear Regression Using Hilbert-Space-Valued Covariates with Unknown Reproducing Kernel

    Authors: Xinyi Li, Margaret Hoch, Michael R. Kosorok

    Abstract: We present a new method of linear regression based on principal components using Hilbert-space-valued covariates with unknown reproducing kernels. We develop a computationally efficient approach to estimation and derive asymptotic theory for the regression parameter estimates under mild assumptions. We demonstrate the approach in simulation studies as well as in data analysis using two-dimensional…

    Submitted 23 April, 2025; originally announced April 2025.

  2. arXiv:2408.16381  [pdf, other]

    stat.ME math.ST

    Uncertainty quantification for intervals

    Authors: Carlos García Meixide, Michael R. Kosorok, Marcos Matabuena

    Abstract: Data following an interval structure are increasingly prevalent in many scientific applications. In medicine, clinical events are often monitored between two clinical visits, making the exact time of the event unknown and generating outcomes with a range format. As interest in automating healthcare decisions grows, uncertainty quantification via predictive regions becomes essential for developing…

    Submitted 30 March, 2025; v1 submitted 29 August, 2024; originally announced August 2024.

  3. arXiv:2206.06140  [pdf, other]

    math.ST

    Inference for change-plane regression

    Authors: Chaeryon Kang, Hunyong Cho, Rui Song, Moulinath Banerjee, Eric B. Laber, Michael R. Kosorok

    Abstract: A key challenge in analyzing the behavior of change-plane estimators is that the objective function has multiple minimizers. Two estimators are proposed to deal with this non-uniqueness. For each estimator, an n-rate of convergence is established, and the limiting distribution is derived. Based on these results, we provide a parametric bootstrap procedure for inference. The validity of our theoret…

    Submitted 13 January, 2024; v1 submitted 13 June, 2022; originally announced June 2022.

  4. arXiv:2204.12319  [pdf, other]

    math.ST stat.ME stat.ML

    Discussion of Multiscale Fisher's Independence Test for Multivariate Dependence

    Authors: Duyeol Lee, Helal El-Zaatari, Michael R. Kosorok, Xinyi Li, Kai Zhang

    Abstract: The multiscale Fisher's independence test (MULTIFIT hereafter) proposed by Gorsky & Ma (2022) is a novel method to test independence between two random vectors. By its design, this test is particularly useful in detecting local dependence. Moreover, by adopting a resampling-free approach, it can easily accommodate massive sample sizes. Another benefit of the proposed method is its ability to inter…

    Submitted 26 April, 2022; originally announced April 2022.

  5. arXiv:2007.09811  [pdf, ps, other]

    stat.ME math.ST stat.AP stat.ML

    Kernel Assisted Learning for Personalized Dose Finding

    Authors: Liangyu Zhu, Wenbin Lu, Michael R. Kosorok, Rui Song

    Abstract: An individualized dose rule recommends a dose level within a continuous safe dose range based on patient-level information such as physical conditions, genetic factors and medication histories. Traditionally, the personalized dose-finding process requires repeated clinical visits by the patient and frequent adjustments of the dosage. Thus the patient is constantly exposed to the risk of underdosing a…

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: Accepted for KDD 2020

  6. arXiv:1912.03662  [pdf, other]

    math.ST stat.CO stat.ME stat.ML

    The Binary Expansion Randomized Ensemble Test (BERET)

    Authors: Duyeol Lee, Kai Zhang, Michael R. Kosorok

    Abstract: Recently, the binary expansion testing framework was introduced to test the independence of two continuous random variables by utilizing symmetry statistics that are complete sufficient statistics for dependence. We develop a new test based on an ensemble approach that uses the sum of squared symmetry statistics and distance correlation. Simulation studies suggest that this method improves the pow…

    Submitted 7 January, 2021; v1 submitted 8 December, 2019; originally announced December 2019.

  7. arXiv:1904.02624  [pdf, ps, other]

    math.ST

    Efficient estimation of accelerated lifetime models under length-biased sampling

    Authors: Pourab Roy, Jason P. Fine, Michael R. Kosorok

    Abstract: In prevalent cohort studies where subjects are recruited at a cross-section, the time to an event may be subject to length-biased sampling, with the observed data being either the forward recurrence time, or the backward recurrence time, or their sum. In the regression setting, it has been shown that the accelerated failure time model for the underlying event time is invariant under these observed…

    Submitted 4 April, 2019; originally announced April 2019.

    Comments: 15 pages

    MSC Class: 62G08; 62G86

  8. arXiv:1409.0727  [pdf, other]

    math.ST

    Asymptotics for change-point models under varying degrees of mis-specification

    Authors: Rui Song, Moulinath Banerjee, Michael R. Kosorok

    Abstract: Change-point models are widely used by statisticians to model drastic changes in the pattern of observed data. Least squares/maximum likelihood based estimation of change-points leads to curious asymptotic phenomena. When the change-point model is correctly specified, such estimates generally converge at a fast rate ($n$) and are asymptotically described by minimizers of a jump process. Under comple…

    Submitted 18 October, 2015; v1 submitted 2 September, 2014; originally announced September 2014.

  9. Q-learning with censored data

    Authors: Yair Goldberg, Michael R. Kosorok

    Abstract: We develop methodology for a multistage decision problem with a flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages. We provide finite sample bounds on the generalization error of the policy learned by the algorithm, and show that when the opt…

    Submitted 30 May, 2012; originally announced May 2012.

    Comments: Published at http://dx.doi.org/10.1214/12-AOS968 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS968

    Journal ref: Annals of Statistics 2012, Vol. 40, No. 1, 529-560

  10. Likelihood based inference for current status data on a grid: A boundary phenomenon and an adaptive inference procedure

    Authors: Runlong Tang, Moulinath Banerjee, Michael R. Kosorok

    Abstract: In this paper, we study the nonparametric maximum likelihood estimator for an event time distribution function at a point in the current status model with observation times supported on a grid of potentially unknown sparsity and with multiple subjects sharing the same observation time. This is of interest since observation time ties occur frequently with current status data. The grid resolution is…

    Submitted 28 May, 2012; originally announced May 2012.

    Comments: Published at http://dx.doi.org/10.1214/11-AOS942 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS942

    Journal ref: Annals of Statistics 2012, Vol. 40, No. 1, 45-72

  11. arXiv:1202.5130  [pdf, other]

    stat.ML math.ST

    Support Vector Regression for Right Censored Data

    Authors: Yair Goldberg, Michael R. Kosorok

    Abstract: We develop a unified approach for classification and regression support vector machines for data subject to right censoring. We provide finite sample bounds on the generalization error of the algorithm, prove risk consistency for a wide class of probability measures, and study the associated learning rates. We apply the general methodology to estimation of the (truncated) mean, median, quantiles,…

    Submitted 12 January, 2013; v1 submitted 23 February, 2012; originally announced February 2012.

    Comments: In this version, we strengthened the theoretical results and corrected a few mistakes

  12. Simultaneous critical values for $t$-tests in very high dimensions

    Authors: Hongyuan Cao, Michael R. Kosorok

    Abstract: This article considers the problem of multiple hypothesis testing using $t$-tests. The observed data are assumed to be independently generated conditional on an underlying and unknown two-state hidden model. We propose an asymptotically valid data-driven procedure to find critical values for rejection regions controlling the $k$-familywise error rate ($k$-FWER), false discovery rate (FDR) and the…

    Submitted 21 February, 2011; v1 submitted 10 February, 2011; originally announced February 2011.

    Comments: Published at http://dx.doi.org/10.3150/10-BEJ272 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Report number: IMS-BEJ-BEJ272

    Journal ref: Bernoulli 2011, Vol. 17, No. 1, 347-394

  13. On asymptotically optimal tests under loss of identifiability in semiparametric models

    Authors: Rui Song, Michael R. Kosorok, Jason P. Fine

    Abstract: We consider tests of hypotheses when the parameters are not identifiable under the null in semiparametric models, where regularity conditions for profile likelihood theory fail. Exponential average tests based on integrated profile likelihood are constructed and shown to be asymptotically optimal under a weighted average power criterion with respect to a prior on the nonidentifiable aspect of th…

    Submitted 24 August, 2009; originally announced August 2009.

    Comments: Published at http://dx.doi.org/10.1214/08-AOS643 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS643

    MSC Class: 62A01; 62G10 (Primary) 62G20; 62C99 (Secondary)

    Journal ref: Annals of Statistics 2009, Vol. 37, No. 5A, 2409-2444

  14. Bootstrapping the Grenander estimator

    Authors: Michael R. Kosorok

    Abstract: The goal of this paper is to study the bootstrap for the Grenander estimator. The first result is a proof of the inconsistency of the nonparametric bootstrap for the Grenander estimator at a given point. The second result is the development and verification of a bootstrap for the $L_1$ confidence band for the Grenander estimator. As part of this work, kernel estimators are studied as alternative…

    Submitted 16 May, 2008; originally announced May 2008.

    Comments: Published at http://dx.doi.org/10.1214/193940307000000202 in the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-COLL1-IMSCOLL121

    MSC Class: 62G09; 62G07 (Primary) 60F05; 60G15 (Secondary)

    Journal ref: IMS Collections 2008, Vol. 1, 282-292

  15. arXiv:math/0701540  [pdf, ps, other]

    math.ST

    The penalized profile sampler

    Authors: Guang Cheng, Michael R. Kosorok

    Abstract: The penalized profile sampler for semiparametric inference is an extension of the profile sampler method (Lee, Kosorok and Fine, 2005) obtained by profiling a penalized log-likelihood. The idea is to base inference on the posterior distribution obtained by multiplying a profiled penalized log-likelihood by a prior for the parametric component, where the profiling and penalization are applied to…

    Submitted 19 January, 2007; originally announced January 2007.

    Comments: 26 pages

    MSC Class: Primary 62G20; 62F25; secondary 62F15; 62F12

  16. General frequentist properties of the posterior profile distribution

    Authors: Guang Cheng, Michael R. Kosorok

    Abstract: In this paper, inference for the parametric component of a semiparametric model based on sampling from the posterior profile distribution is thoroughly investigated from the frequentist viewpoint. The higher-order validity of the profile sampler obtained in Cheng and Kosorok [Ann. Statist. 36 (2008)] is extended to semiparametric models in which the infinite dimensional nuisance parameter may no…

    Submitted 19 August, 2008; v1 submitted 7 December, 2006; originally announced December 2006.

    Comments: Published at http://dx.doi.org/10.1214/07-AOS536 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS536

    MSC Class: 62G20; 62F25 (Primary) 62F15; 62F12 (Secondary)

    Journal ref: Annals of Statistics 2008, Vol. 36, No. 4, 1819-1853

  17. Higher order semiparametric frequentist inference with the profile sampler

    Authors: Guang Cheng, Michael R. Kosorok

    Abstract: We consider higher order frequentist inference for the parametric component of a semiparametric model based on sampling from the posterior profile distribution. The first order validity of this procedure established by Lee, Kosorok and Fine in [J. American Statist. Assoc. 100 (2005) 960--969] is extended to second-order validity in the setting where the infinite-dimensional nuisance parameter ac…

    Submitted 19 August, 2008; v1 submitted 4 May, 2006; originally announced May 2006.

    Comments: Published at http://dx.doi.org/10.1214/07-AOS523 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS523

    MSC Class: 62G20; 62F25 (Primary) 62F15; 62F12 (Secondary)

    Journal ref: Annals of Statistics 2008, Vol. 36, No. 4, 1786-1818

  18. arXiv:math/0604043  [pdf, ps, other]

    math.ST

    Further details on inference under right censoring for transformation models with a change-point based on a covariate threshold

    Authors: Michael R. Kosorok, Rui Song

    Abstract: We consider linear transformation models applied to right censored survival data with a change-point based on a covariate threshold. We establish consistency and weak convergence of the nonparametric maximum likelihood estimators. The change-point parameter is shown to be $n$-consistent, while the remaining parameters are shown to have the expected root-$n$ consistency. We show that the procedur…

    Submitted 3 April, 2006; originally announced April 2006.

    Comments: University of Wisconsin-Madison Department of Biostatistics and Medical Informatics Technical Report

    MSC Class: 62N01; 62F05; 62G20; 62G10

  19. Penalized log-likelihood estimation for partly linear transformation models with current status data

    Authors: Shuangge Ma, Michael R. Kosorok

    Abstract: We consider partly linear transformation models applied to current status data. The unknown quantities are the transformation function, a linear regression parameter and a nonparametric regression effect. It is shown that the penalized MLE for the regression parameter is asymptotically normal and efficient and converges at the parametric rate, although the penalized MLE for the transformation fu…

    Submitted 11 February, 2006; originally announced February 2006.

    Comments: Published at http://dx.doi.org/10.1214/009053605000000444 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS0059

    MSC Class: 62G08; 60F05 (Primary) 62G20; 62B10 (Secondary)

    Journal ref: Annals of Statistics 2005, Vol. 33, No. 5, 2256-2290

  20. arXiv:math/0508219  [pdf, ps, other]

    math.ST

    Marginal asymptotics for the "large p, small n" paradigm: with applications to microarray data

    Authors: Michael R. Kosorok, Shuangge Ma

    Abstract: The "large p, small n" paradigm arises in microarray studies, where expression levels of thousands of genes are monitored for a small number of subjects. There has been an increasing demand for study of asymptotics for the various statistical models and methodologies using genomic data. In this article, we focus on one-sample and two-sample microarray experiments, where the goal is to identify s…

    Submitted 12 August, 2005; originally announced August 2005.

    Comments: 39 pages, 2 tables, 1 figure

    Report number: U.W. Madison Department of Biostatistics/Medical Informatics TR188

  21. Robust Inference for Univariate Proportional Hazards Frailty Regression Models

    Authors: Michael R. Kosorok, Bee Leng Lee, Jason P. Fine

    Abstract: We consider a class of semiparametric regression models which are one-parameter extensions of the Cox [J. Roy. Statist. Soc. Ser. B 34 (1972) 187-220] model for right-censored univariate failure times. These models assume that the hazard given the covariates and a random frailty unique to each individual has the proportional hazards form multiplied by the frailty. The frailty is assumed to hav…

    Submitted 5 October, 2004; originally announced October 2004.

    Comments: Published by the Institute of Mathematical Statistics (http://www.imstat.org) in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/009053604000000535

    Report number: IMS-AOS-AOS182

    MSC Class: 62N01; 60F05 (Primary) 62B10; 62F40 (Secondary)

    Journal ref: Annals of Statistics 2004, Vol. 32, No. 4, 1448-1491