Search | arXiv e-print repository

arXiv:1904.10575

A penalized likelihood approach for efficiently estimating a partially linear additive transformation model with current status data

Authors: Yan Liu, Minggen Lu, Christopher S. McMahan

Abstract: Current status data are commonly encountered in medical and epidemiological studies in which the failure time for study units is the outcome variable of interest. Data of this form are characterized by the fact that the failure time is not directly observed but rather is known relative to an observation time; i.e., the failure times are either left- or right-censored. Due to its structure, the analysis of such data can be challenging. To circumvent these challenges and to provide for a flexible modeling construct which can be used to analyze current status data, herein, a partially linear additive transformation model is proposed. In the formulation of this model, constrained $B$-splines are employed to model the monotone transformation function and nonlinear covariate effects. To provide for more efficient estimates, a penalization technique is used to regularize the estimation of all unknown functions. An easy to implement hybrid algorithm is developed for model fitting and a simple estimator of the large-sample variance-covariance matrix is proposed. It is shown theoretically that the proposed estimators of the finite-dimensional regression coefficients are root-$n$ consistent, asymptotically normal, and achieve the semi-parametric information bound while the estimators of the nonparametric components attain the optimal rate of convergence. The finite-sample performance of the proposed methodology is evaluated through extensive numerical studies and is further demonstrated through the analysis of uterine leiomyomata data.

Submitted 23 April, 2019; originally announced April 2019.

arXiv:1804.00096

A proportional hazards model for interval-censored data subject to instantaneous failures

Authors: Prabhashi W. Withana Gamage, Monica Chaudari, Christopher S. McMahan, Michael R. Kosorok

Abstract: The proportional hazards (PH) model is arguably one of the most popular models used to analyze time to event data arising from clinical trials and longitudinal studies, among many others. In many such studies, the event time of interest is not directly observed but is known relative to periodic examination times; i.e., practitioners observe either current status or interval-censored data. The analysis of data of this structure is often fraught with many difficulties. Further exacerbating this issue, in some such studies the observed data also consists of instantaneous failures; i.e., the event times for several study units coincide exactly with the time at which the study begins. In light of these difficulties, this work focuses on developing a mixture model, under the PH assumptions, which can be used to analyze interval-censored data subject to instantaneous failures. To allow for modeling flexibility, two methods of estimating the unknown cumulative baseline hazard function are proposed; a fully parametric and a monotone spline representation are considered. Through a novel data augmentation procedure involving latent Poisson random variables, an expectation-maximization (EM) algorithm was developed to complete model fitting. The resulting EM algorithm is easy to implement and is computationally efficient. Moreover, through extensive simulation studies the proposed approach is shown to provide both reliable estimation and inference.

Submitted 3 April, 2018; v1 submitted 30 March, 2018; originally announced April 2018.

arXiv:1710.10351

doi 10.1080/01621459.2021.2014854

Bayesian Spatial Binary Regression for Label Fusion in Structural Neuroimaging

Authors: D. Andrew Brown, Christopher S. McMahan, Russell T. Shinohara, Kristin A. Linn

Abstract: Alzheimer's disease is a neurodegenerative condition that accelerates cognitive decline relative to normal aging. It is of critical scientific importance to gain a better understanding of early disease mechanisms in the brain to facilitate effective, targeted therapies. The volume of the hippocampus is often used in diagnosis and monitoring of the disease. Measuring this volume via neuroimaging is difficult since each hippocampus must either be manually identified or automatically delineated, a task referred to as segmentation. Automatic hippocampal segmentation often involves mapping a previously manually segmented image to a new brain image and propagating the labels to obtain an estimate of where each hippocampus is located in the new image. A more recent approach to this problem is to propagate labels from multiple manually segmented atlases and combine the results using a process known as label fusion. To date, most label fusion algorithms employ voting procedures with voting weights assigned directly or estimated via optimization. We propose using a fully Bayesian spatial regression model for label fusion that facilitates direct incorporation of covariate information while making accessible the entire posterior distribution. Our results suggest that incorporating tissue classification (e.g., gray matter) into the label fusion procedure can greatly improve segmentation when relatively homogeneous, healthy brains are used as atlases for diseased brains. The fully Bayesian approach also produces meaningful uncertainty measures about hippocampal volumes, information which can be leveraged to detect significant, scientifically meaningful differences between healthy and diseased populations, improving the potential for early detection and tracking of the disease.

Submitted 14 January, 2022; v1 submitted 27 October, 2017; originally announced October 2017.

Comments: To appear in Journal of the American Statistical Association, 24 pages, 10 figures

arXiv:1702.05518

doi 10.1080/00031305.2019.1595144

Sampling strategies for fast updating of Gaussian Markov random fields

Authors: D. Andrew Brown, Christopher S. McMahan, Stella Watson Self

Abstract: Gaussian Markov random fields (GMRFs) are popular for modeling dependence in large areal datasets due to their ease of interpretation and computational convenience afforded by the sparse precision matrices needed for random variable generation. Typically in Bayesian computation, GMRFs are updated jointly in a block Gibbs sampler or componentwise in a single-site sampler via the full conditional distributions. The former approach can speed convergence by updating correlated variables all at once, while the latter avoids solving large matrices. We consider a sampling approach in which the underlying graph can be cut so that conditionally independent sites are updated simultaneously. This algorithm allows a practitioner to parallelize updates of subsets of locations or to take advantage of 'vectorized' calculations in a high-level language such as R. Through both simulated and real data, we demonstrate computational savings that can be achieved versus both single-site and block updating, regardless of whether the data are on a regular or an irregular lattice. The approach provides a good compromise between statistical and computational efficiency and is accessible to statisticians without expertise in numerical analysis or advanced computing.

Submitted 4 February, 2019; v1 submitted 17 February, 2017; originally announced February 2017.

Comments: Revised introduction and expanded numerical examples to include Rcpp and parallel implementation. Supplementary material available from the authors. 38 pages, 8 figures

Journal ref: The American Statistician, 2019

Showing 1–4 of 4 results for author: McMahan, C S