Showing 1–40 of 40 results for author: Kim, J K

  1. arXiv:2412.12405 [pdf, ps, other]

    stat.ME

    Generalized entropy calibration for analyzing voluntary survey data

    Authors: Yonghyun Kwon, Jae Kwang Kim, Yumou Qiu

    Abstract: Statistical analysis of voluntary survey data is an important area of research in survey sampling. We consider a unified approach to voluntary survey data analysis under the assumption that the sampling mechanism is ignorable. Generalized entropy calibration is introduced as a unified tool for calibration weighting to control the selection bias. We first establish the relationship between the gene…

    Submitted 1 June, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

  2. arXiv:2404.01076 [pdf, other]

    stat.ME

    Debiased calibration estimation using generalized entropy in survey sampling

    Authors: Yonghyun Kwon, Jae Kwang Kim, Yumou Qiu

    Abstract: Incorporating the auxiliary information into the survey estimation is a fundamental problem in survey sampling. Calibration weighting is a popular tool for incorporating the auxiliary information. The calibration weighting method of Deville and Särndal (1992) uses a distance measure between the design weights and the final weights to solve the optimization problem with calibration constraints. Thi…

    Submitted 11 May, 2025; v1 submitted 1 April, 2024; originally announced April 2024.

  3. arXiv:2401.07625 [pdf, ps, other]

    stat.ME

    Statistics in Survey Sampling

    Authors: Jae Kwang Kim

    Abstract: Survey sampling theory and methods are introduced. Sampling designs and estimation methods are carefully discussed as a textbook for survey sampling. Topics include Horvitz-Thompson estimation, simple random sampling, stratified sampling, cluster sampling, ratio estimation, regression estimation, variance estimation, two-phase sampling, and nonresponse adjustment methods.

    Submitted 1 December, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

  4. arXiv:2307.11651 [pdf, other]

    stat.ME

    Multiple bias-calibration for adjusting selection bias of non-probability samples using data integration

    Authors: Zhonglei Wang, Shu Yang, Jae Kwang Kim

    Abstract: Valid statistical inference is challenging when the sample is subject to unknown selection bias. Data integration can be used to correct for selection bias when we have a parallel probability sample from the same population with some common measurements. How to model and estimate the selection probability or the propensity score (PS) of a non-probability sample using an independent probability sam…

    Submitted 21 July, 2023; originally announced July 2023.

  5. arXiv:2306.15173 [pdf, other]

    stat.ME

    Robust propensity score weighting estimation under missing at random

    Authors: Hengfang Wang, Jae Kwang Kim, Jeongseop Han, Youngjo Lee

    Abstract: Missing data is frequently encountered in many areas of statistics. Propensity score weighting is a popular method for handling missing data. The propensity score method employs a response propensity model, but correct specification of the statistical model can be challenging in the presence of missing data. Doubly robust estimation is attractive, as the consistency of the estimator is guaranteed…

    Submitted 27 March, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

  6. arXiv:2211.02998 [pdf, ps, other]

    stat.ME math.ST

    An empirical likelihood approach to reduce selection bias in voluntary samples

    Authors: Jae Kwang Kim, Kosuke Morikawa

    Abstract: We address the weighting problem in voluntary samples under a nonignorable sample selection model. Under the assumption that the sample selection model is correctly specified, we can compute a consistent estimator of the model parameter and construct the propensity score estimator of the population mean. We use the empirical likelihood method to construct the final weights for voluntary samples by…

    Submitted 11 May, 2023; v1 submitted 5 November, 2022; originally announced November 2022.

  7. arXiv:2208.07535 [pdf, other]

    stat.ME

    Semiparametric imputation using latent sparse conditional Gaussian mixtures for multivariate mixed outcomes

    Authors: Shonosuke Sugasawa, Jae Kwang Kim, Kosuke Morikawa

    Abstract: This paper proposes a flexible Bayesian approach to multiple imputation using conditional Gaussian mixtures. We introduce novel shrinkage priors for covariate-dependent mixing proportions in the mixture models to automatically select the suitable number of components used in the imputation step. We develop an efficient sampling algorithm for posterior computation and multiple imputation via Markov…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 29 pages, 5 figures

  8. arXiv:2208.06039 [pdf, other]

    stat.ME math.ST

    Semiparametric adaptive estimation under informative sampling

    Authors: Kosuke Morikawa, Yoshikazu Terada, Jae Kwang Kim

    Abstract: In survey sampling, survey data do not necessarily represent the target population, and the samples are often biased. However, information on the survey weights aids in the elimination of selection bias. The Horvitz-Thompson estimator is a well-known unbiased, consistent, and asymptotically normal estimator; however, it is not efficient. Thus, this study derives the semiparametric efficiency bound…

    Submitted 3 April, 2024; v1 submitted 11 August, 2022; originally announced August 2022.

    Comments: 21 pages, 2 figures, and 2 tables

  9. arXiv:2207.09891 [pdf, other]

    stat.ME

    Maximum Likelihood Imputation

    Authors: Jeongseop Han, Youngjo Lee, Jae Kwang Kim

    Abstract: Maximum likelihood (ML) estimation is widely used in statistics. The h-likelihood has been proposed as an extension of Fisher's likelihood to statistical models including unobserved latent variables of recent interest. Its advantage is that the joint maximization gives ML estimators (MLEs) of both fixed and random parameters with their standard error estimates. However, the current h-likelihood ap…

    Submitted 20 July, 2022; originally announced July 2022.

  10. Soft calibration for selection bias problems under mixed-effects models

    Authors: Chenyin Gao, Shu Yang, Jae Kwang Kim

    Abstract: Calibration weighting has been widely used to correct selection biases in non-probability sampling, missing data, and causal inference. The main idea is to calibrate the biased sample to the benchmark by adjusting the subject weights. However, hard calibration can produce enormous weights when an exact calibration is enforced on a large set of extraneous covariates. This article proposes a soft ca…

    Submitted 22 February, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in Biometrika

  11. arXiv:2204.09193 [pdf, other]

    stat.ME

    Functional Calibration under Non-Probability Survey Sampling

    Authors: Zhonglei Wang, Xiaojun Mao, Jae Kwang Kim

    Abstract: Non-probability sampling is prevailing in survey sampling, but ignoring its selection bias leads to erroneous inferences. We offer a unified nonparametric calibration method to estimate the sampling weights for a non-probability sample by calibrating functions of auxiliary variables in a reproducing kernel Hilbert space. The consistency and the limiting distribution of the proposed estimator are e…

    Submitted 19 April, 2022; originally announced April 2022.

  12. arXiv:2202.11276 [pdf, other]

    stat.ME stat.AP

    Nearest neighbor ratio imputation with incomplete multi-nomial outcome in survey sampling

    Authors: Chenyin Gao, Katherine Jenny Thompson, Shu Yang, Jae Kwang Kim

    Abstract: Nonresponse is a common problem in survey sampling. Appropriate treatment can be challenging, especially when dealing with detailed breakdowns of totals. Often, the nearest neighbor imputation method is used to handle such incomplete multinomial data. In this article, we investigate the nearest neighbor ratio imputation estimator, in which auxiliary variables are used to identify the closest donor…

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: Accepted for publication in JRSS(A)

  13. arXiv:2107.07371 [pdf, other]

    stat.ME stat.ML

    Statistical inference using Regularized M-estimation in the reproducing kernel Hilbert space for handling missing data

    Authors: Hengfang Wang, Jae Kwang Kim

    Abstract: Imputation and propensity score weighting are two popular techniques for handling missing data. We address these problems using the regularized M-estimation techniques in the reproducing kernel Hilbert space. Specifically, we first use the kernel ridge regression to develop imputation for handling item nonresponse. While this nonparametric approach is potentially promising for imputation, its stat…

    Submitted 15 July, 2021; originally announced July 2021.

    Comments: arXiv admin note: text overlap with arXiv:2102.00058

  14. arXiv:2107.06448 [pdf, other]

    stat.ME

    Survey data integration for regression analysis using model calibration

    Authors: Zhonglei Wang, Hang J. Kim, Jae Kwang Kim

    Abstract: We consider regression analysis in the context of data integration. To combine partial information from external sources, we employ the idea of model calibration which introduces a "working" reduced model based on the observed covariates. The working reduced model is not necessarily correctly specified but can be a useful device to incorporate the partial information from the external data. The ac…

    Submitted 11 October, 2022; v1 submitted 13 July, 2021; originally announced July 2021.

  15. arXiv:2104.13469 [pdf, other]

    stat.ME

    Information projection approach to propensity score estimation for handling selection bias under missing at random

    Authors: Hengfang Wang, Jae Kwang Kim

    Abstract: Propensity score weighting is widely used to improve the representativeness and correct the selection bias in the voluntary sample. The propensity score is often developed using a model for the sampling probability, which can be subject to model misspecification. In this paper, we consider an alternative approach of estimating the inverse of the propensity scores using the density ratio function s…

    Submitted 19 July, 2022; v1 submitted 27 April, 2021; originally announced April 2021.

  16. arXiv:2011.05988 [pdf, other]

    math.ST cs.IT cs.LG stat.CO stat.ME

    Maximum sampled conditional likelihood for informative subsampling

    Authors: HaiYing Wang, Jae Kwang Kim

    Abstract: Subsampling is a computationally effective approach to extract information from massive data sets when computing resources are limited. After a subsample is taken from the full data, most available methods use an inverse probability weighted (IPW) objective function to estimate the model parameters. The IPW estimator does not fully utilize the information in the selected subsample. In this paper,…

    Submitted 9 October, 2022; v1 submitted 11 November, 2020; originally announced November 2020.

  17. arXiv:2001.03259 [pdf, ps, other]

    stat.ME

    Statistical Data Integration in Survey Sampling: A Review

    Authors: Shu Yang, Jae Kwang Kim

    Abstract: Finite population inference is a central goal in survey sampling. Probability sampling is the main statistical approach to finite population inference. Challenges arise due to high cost and increasing non-response rates. Data integration provides a timely solution by leveraging multiple data sources to provide more robust and efficient inference than using any single data source alone. The techniq…

    Submitted 9 January, 2020; originally announced January 2020.

    Comments: Submitted to Japanese Journal of Statistics and Data Science

  18. arXiv:1909.06534 [pdf, other]

    stat.ME

    Semiparametric Imputation Using Conditional Gaussian Mixture Models under Item Nonresponse

    Authors: Danhyang Lee, Jae Kwang Kim

    Abstract: Imputation is a popular technique for handling item nonresponse in survey sampling. Parametric imputation is based on a parametric model for imputation and is less robust against the failure of the imputation model. Nonparametric imputation is fully robust but is not applicable when the dimension of covariates is large due to the curse of dimensionality. Semiparametric imputation is another robust…

    Submitted 19 September, 2019; v1 submitted 14 September, 2019; originally announced September 2019.

  19. arXiv:1906.04398 [pdf, other]

    stat.ME

    An Approximate Bayesian Approach to Model-assisted Survey Estimation with Many Auxiliary Variables

    Authors: Shonosuke Sugasawa, Jae Kwang Kim

    Abstract: Model-assisted estimation with complex survey data is an important practical problem in survey sampling. When there are many auxiliary variables, selecting significant variables associated with the study variable would be necessary to achieve efficient estimation of population parameters of interest. In this paper, we formulate a regularized regression estimator in the framework of Bayesian infere…

    Submitted 31 March, 2020; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: 37 pages

  20. arXiv:1903.05212 [pdf, ps, other]

    stat.ME

    Doubly Robust Inference when Combining Probability and Non-probability Samples with High-dimensional Data

    Authors: Shu Yang, Jae Kwang Kim, Rui Song

    Abstract: Non-probability samples become increasingly popular in survey statistics but may suffer from selection biases that limit the generalizability of results to the target population. We consider integrating a non-probability sample with a probability sample which provides high-dimensional representative covariate information of the target population. We propose a two-step approach for variable selecti…

    Submitted 23 August, 2019; v1 submitted 12 March, 2019; originally announced March 2019.

  21. arXiv:1903.03630 [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Imputation estimators for unnormalized models with missing data

    Authors: Masatoshi Uehara, Takeru Matsuda, Jae Kwang Kim

    Abstract: Several statistical models are given in the form of unnormalized densities, and calculation of the normalization constant is intractable. We propose estimation methods for such unnormalized models with missing data. The key concept is to combine imputation techniques with estimators for unnormalized models including noise contrastive estimation and score matching. In addition, we derive asymptotic…

    Submitted 8 June, 2020; v1 submitted 8 March, 2019; originally announced March 2019.

    Comments: To appear (AISTATS 2020)

  22. arXiv:1901.01645 [pdf, ps, other]

    math.ST stat.ME

    Bootstrap inference for the finite population total under complex sampling designs

    Authors: Zhonglei Wang, Jae Kwang Kim, Liuhua Peng

    Abstract: Bootstrap is a useful tool for making statistical inference, but it may provide erroneous results under complex survey sampling. Most studies about bootstrap-based inference are developed under simple random sampling and stratified random sampling. In this paper, we propose a unified bootstrap method applicable to some complex sampling designs, including Poisson sampling and probability-proportion…

    Submitted 6 January, 2019; originally announced January 2019.

  23. arXiv:1812.10694 [pdf, ps, other]

    stat.ME

    Combining Non-probability and Probability Survey Samples Through Mass Imputation

    Authors: Jae Kwang Kim, Seho Park, Yilin Chen, Changbao Wu

    Abstract: This paper presents theoretical results on combining non-probability and probability survey samples through mass imputation, an approach originally proposed by Rivers (2007) as sample matching without rigorous theoretical justification. Under suitable regularity conditions, we establish the consistency of the mass imputation estimator and derive its asymptotic variance formula. Variance estimators…

    Submitted 21 November, 2020; v1 submitted 27 December, 2018; originally announced December 2018.

    Comments: Submitted to Journal of the Royal Statistical Society: Series A

  24. arXiv:1811.11950 [pdf, other]

    stat.ME

    Accounting for model uncertainty in multiple imputation under complex sampling

    Authors: Gyuhyeong Goh, Jae Kwang Kim

    Abstract: Multiple imputation provides an effective way to handle missing data. When several possible models are under consideration for the data, the multiple imputation is typically performed under a single-best model selected from the candidate models. This single model selection approach ignores the uncertainty associated with the model selection and so leads to underestimation of the variance of multip…

    Submitted 28 November, 2018; originally announced November 2018.

    Comments: 23 pages, 1 Table

  25. arXiv:1810.12519 [pdf, ps, other]

    stat.ME

    Semiparametric response model with nonignorable nonresponse

    Authors: Masatoshi Uehara, Jae Kwang Kim

    Abstract: How to deal with nonignorable response is often a challenging problem encountered in statistical analysis with missing data. Parametric model assumption for the response mechanism is often made and there is no way to validate the model assumption with missing data. We consider a semiparametric response model that relaxes the parametric model assumption in the response mechanism. Two types of effic…

    Submitted 30 October, 2018; originally announced October 2018.

  26. arXiv:1809.05976 [pdf, ps, other]

    stat.ME

    Semiparametric fractional imputation using Gaussian mixture models for handling multivariate missing data

    Authors: Hejian Sang, Jae Kwang Kim

    Abstract: Item nonresponse is frequently encountered in practice. Ignoring missing data can lose efficiency and lead to misleading inference. Fractional imputation is a frequentist approach of imputation for handling missing data. However, the parametric fractional imputation of Kim (2011) may be subject to bias under model misspecification. In this paper, we propose a novel semiparametric fra…

    Submitted 16 September, 2018; originally announced September 2018.

    Comments: 23 pages, 2 figures

  27. arXiv:1807.10873 [pdf, other]

    stat.ME

    Bayesian Sparse Propensity Score Estimation for Unit Nonresponse

    Authors: Hejian Sang, Gyuhyeong Goh, Jae Kwang Kim

    Abstract: Nonresponse weighting adjustment using propensity score is a popular method for handling unit nonresponse. However, including all available auxiliary variables into the propensity model can lead to inefficient and inconsistent estimation, especially with high-dimensional covariates. In this paper, a new Bayesian method using the Spike-and-Slab prior is proposed for sparse propensity score estimati…

    Submitted 27 July, 2018; originally announced July 2018.

    Comments: 38 pages, 3 tables

  28. arXiv:1807.02817 [pdf, ps, other]

    stat.ME

    Integration of survey data and big observational data for finite population inference using mass imputation

    Authors: Shu Yang, Jae Kwang Kim

    Abstract: Multiple data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we consider an imputation approach to combining a probability sample with big observational data. Unlike the usual imputation for missing data analysis, we create imputed values for the whole elements in the probability sample. Such mass…

    Submitted 8 July, 2018; originally announced July 2018.

  29. Sampling techniques for big data analysis in finite population inference

    Authors: Jae Kwang Kim, Zhonglei Wang

    Abstract: In analyzing big data for finite population inference, it is critical to adjust for the selection bias in the big data. In this paper, we propose two methods of reducing the selection bias associated with the big data sample. The first method uses a version of inverse sampling by incorporating auxiliary information from external sources, and the second one borrows the idea of data integration by c…

    Submitted 29 January, 2018; originally announced January 2018.

    Comments: 24 pages, 3 tables

  30. arXiv:1707.00974 [pdf, ps, other]

    stat.ME

    Nearest neighbor imputation for general parameter estimation in survey sampling

    Authors: Shu Yang, Jae Kwang Kim

    Abstract: Nearest neighbor imputation is popular for handling item nonresponse in survey sampling. In this article, we study the asymptotic properties of the nearest neighbor imputation estimator for general population parameters, including population means, proportions and quantiles. For variance estimation, the conventional bootstrap inference for matching estimators with fixed number of matches has been…

    Submitted 30 June, 2017; originally announced July 2017.

    Comments: 25 pages. arXiv admin note: substantial text overlap with arXiv:1703.10256

  31. arXiv:1703.10256 [pdf, ps, other]

    stat.ME

    Predictive mean matching imputation in survey sampling

    Authors: Shu Yang, Jae Kwang Kim

    Abstract: Predictive mean matching imputation is popular for handling item nonresponse in survey sampling. In this article, we study the asymptotic properties of the predictive mean matching estimator of the population mean. For variance estimation, the conventional bootstrap inference for matching estimators with fixed matches has been shown to be invalid due to the nonsmoothness nature of the matching est…

    Submitted 12 January, 2018; v1 submitted 29 March, 2017; originally announced March 2017.

    Comments: 20 pages, 0 figures, 1 table

  32. arXiv:1702.03453 [pdf, other]

    stat.ME

    An approximate Bayesian inference on propensity score estimation under unit nonresponse

    Authors: Hejian Sang, Jae Kwang Kim

    Abstract: Nonresponse weighting adjustment using the response propensity score is a popular tool for handling unit nonresponse. Statistical inference after the nonresponse weighting adjustment is complicated because the effect of estimating the propensity model parameter needs to be incorporated. In this paper, we propose an approximate Bayesian approach to handle unit nonresponse with parametric model assu…

    Submitted 11 February, 2017; originally announced February 2017.

    Comments: 38 pages

  33. arXiv:1612.09207 [pdf, other]

    stat.ME

    Semiparametric Optimal Estimation With Nonignorable Nonresponse Data

    Authors: Kosuke Morikawa, Jae Kwang Kim

    Abstract: When the response mechanism is believed to be not missing at random (NMAR), a valid analysis requires stronger assumptions on the response mechanism than standard statistical methods would otherwise require. Semiparametric estimators have been developed under the model assumptions on the response mechanism. In this paper, a new statistical test is proposed to guarantee model identifiability withou…

    Submitted 7 May, 2020; v1 submitted 29 December, 2016; originally announced December 2016.

    MSC Class: 62F35; 62G20; 62G10

  34. A note on multiple imputation for method of moments estimation

    Authors: Shu Yang, Jae Kwang Kim

    Abstract: Multiple imputation is a popular imputation method for general purpose estimation. Rubin (1987) provided an easily applicable formula for the variance estimation of multiple imputation. However, the validity of the multiple imputation inference requires the congeniality condition of Meng (1994), which is not necessarily satisfied for method of moments estimation. This paper presents the asymptotic b…

    Submitted 27 August, 2015; originally announced August 2015.

    Comments: 8 pages, 0 figures

    Journal ref: Biometrika (2016)

  35. Fractional Imputation in Survey Sampling: A Comparative Review

    Authors: Shu Yang, Jae Kwang Kim

    Abstract: Fractional imputation (FI) is a relatively new method of imputation for handling item nonresponse in survey sampling. In FI, several imputed values with their fractional weights are created for each missing item. Each fractional weight represents the conditional probability of the imputed value given the observed data, and the parameters in the conditional probabilities are often computed by an it…

    Submitted 27 August, 2015; originally announced August 2015.

    Comments: 26 pages, 2 figures

    Journal ref: Statistical Science (2016)

  36. arXiv:1411.2305 [pdf, other]

    cs.DC cs.LG stat.ML

    Model-Parallel Inference for Big Topic Models

    Authors: Xun Zheng, Jin Kyu Kim, Qirong Ho, Eric P. Xing

    Abstract: In real world industrial applications of topic modeling, the ability to capture gigantic conceptual space by learning an ultra-high dimensional topical representation, i.e., the so-called "big model", is becoming the next desideratum after enthusiasms on "big data", especially for fine-grained downstream tasks such as online advertising, where good performances are usually achieved by regression-b…

    Submitted 9 November, 2014; originally announced November 2014.

  37. arXiv:1406.4580 [pdf, other]

    stat.ML cs.DC cs.LG

    Primitives for Dynamic Big Model Parallelism

    Authors: Seunghak Lee, Jin Kyu Kim, Xun Zheng, Qirong Ho, Garth A. Gibson, Eric P. Xing

    Abstract: When training large machine learning models with many variables or parameters, a single machine is often inadequate since the model may be too large to fit in memory, while training can take a long time even with stochastic updates. A natural recourse is to turn to distributed cluster computing, in order to harness additional memory and processors. However, naive, unstructured parallelization of M…

    Submitted 17 June, 2014; originally announced June 2014.

  38. arXiv:1312.7651 [pdf, other]

    stat.ML cs.LG eess.SY

    Petuum: A New Platform for Distributed Machine Learning on Big Data

    Authors: Eric P. Xing, Qirong Ho, Wei Dai, Jin Kyu Kim, Jinliang Wei, Seunghak Lee, Xun Zheng, Pengtao Xie, Abhimanu Kumar, Yaoliang Yu

    Abstract: What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization strategies employ fine-grained operations and scheduling beyond the classic bulk-synchronous processing paradigm popularized by MapReduce, or even specialized gr…

    Submitted 14 May, 2015; v1 submitted 30 December, 2013; originally announced December 2013.

    Comments: 15 pages, 10 figures, final version in KDD 2015 under the same title

  39. arXiv:1312.5766 [pdf, other]

    stat.ML cs.LG

    Structure-Aware Dynamic Scheduler for Parallel Machine Learning

    Authors: Seunghak Lee, Jin Kyu Kim, Qirong Ho, Garth A. Gibson, Eric P. Xing

    Abstract: Training large machine learning (ML) models with many variables or parameters can take a long time if one employs sequential procedures even with stochastic updates. A natural solution is to turn to distributed computing on a cluster; however, naive, unstructured parallelization of ML algorithms does not usually lead to a proportional speedup and can even result in divergence, because dependencies…

    Submitted 30 December, 2013; v1 submitted 19 December, 2013; originally announced December 2013.

  40. Variance estimation for nearest neighbor imputation for US Census long form data

    Authors: Jae Kwang Kim, Wayne A. Fuller, William R. Bell

    Abstract: Variance estimation for estimators of state, county, and school district quantities derived from the Census 2000 long form is discussed. The variance estimator must account for (1) uncertainty due to imputation, and (2) raking to census population controls. An imputation procedure that imputes more than one value for each missing item using donors that are neighbors is described and the procedure…

    Submitted 4 August, 2011; originally announced August 2011.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOAS419

    Report number: IMS-AOAS-AOAS419

    Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 2A, 824-842