Showing 1–19 of 19 results for author: Gong, R

Searching in archive stat.
  1. arXiv:2504.15246  [pdf, other]

    cs.CR stat.OT

    A Refreshment Stirred, Not Shaken (III): Can Swapping Be Differentially Private?

    Authors: James Bailie, Ruobin Gong, Xiao-Li Meng

    Abstract: The quest for a precise and contextually grounded answer to the question in the present paper's title resulted in this stirred-not-shaken triptych, a phrase that reflects our desire to deepen the theoretical basis, broaden the practical applicability, and reduce the misperception of differential privacy (DP), all without shaking its core foundations. Indeed, given the existence of m…

    Submitted 21 April, 2025; originally announced April 2025.

    Comments: 27 pages, 1 figure

  2. arXiv:2501.08449  [pdf, other]

    cs.CR cs.CY cs.DS stat.ME

    A Refreshment Stirred, Not Shaken (II): Invariant-Preserving Deployments of Differential Privacy for the US Decennial Census

    Authors: James Bailie, Ruobin Gong, Xiao-Li Meng

    Abstract: Through the lens of the system of differential privacy specifications developed in Part I of a trio of articles, this second paper examines two statistical disclosure control (SDC) methods for the United States Decennial Census: the Permutation Swapping Algorithm (PSA), which is similar to the 2010 Census's disclosure avoidance system (DAS), and the TopDown Algorithm (TDA), which was used in the 2…

    Submitted 14 January, 2025; originally announced January 2025.

    Comments: 48 pages, 2 figures

  3. arXiv:2412.14503  [pdf, other]

    stat.CO

    dapper: Data Augmentation for Private Posterior Estimation in R

    Authors: Kevin Eng, Jordan A. Awan, Nianqiao Phyllis Ju, Vinayak A. Rao, Ruobin Gong

    Abstract: This paper serves as a reference and introduction to using the R package dapper. dapper encodes a sampling framework which allows exact Markov chain Monte Carlo simulation of parameters and latent variables in a statistical model given privatized data. The goal of this package is to fill an urgent need by providing applied researchers with a flexible tool to perform valid Bayesian inference on dat…

    Submitted 18 December, 2024; originally announced December 2024.

  4. arXiv:2402.07066  [pdf, other]

    cs.CR cs.LG stat.ME

    Differentially Private Range Queries with Correlated Input Perturbation

    Authors: Prathamesh Dharangutte, Jie Gao, Ruobin Gong, Guanyang Wang

    Abstract: This work proposes a class of differentially private mechanisms for linear queries, in particular range queries, that leverages correlated input perturbation to simultaneously achieve unbiasedness, consistency, statistical transparency, and control over utility requirements in terms of accuracy targets expressed either in certain query margins or as implied by the hierarchical database structure.…

    Submitted 6 November, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

  5. arXiv:2212.00936  [pdf, other]

    cs.CR stat.AP

    Integer Subspace Differential Privacy

    Authors: Prathamesh Dharangutte, Jie Gao, Ruobin Gong, Fang-Yi Yu

    Abstract: We propose new differential privacy solutions for when external invariants and integer constraints are simultaneously enforced on the data product. These requirements arise in real world applications of private data curation, including the public release of the 2020 U.S. Decennial Census. They pose a great challenge to the production of provably private data products with adequate st…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted to AAAI 2023

  6. arXiv:2206.00710  [pdf, other]

    stat.ME stat.CO

    Data Augmentation MCMC for Bayesian Inference from Privatized Data

    Authors: Nianqiao Ju, Jordan A. Awan, Ruobin Gong, Vinayak A. Rao

    Abstract: Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typical…

    Submitted 7 December, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: 17 pages, 3 figures, 2 tables. NeurIPS 2022

  7. arXiv:2204.05313  [pdf, other]

    stat.OT

    Six Statistical Senses

    Authors: Radu V. Craiu, Ruobin Gong, Xiao-Li Meng

    Abstract: This article proposes a set of categories, each one representing a particular distillation of important statistical ideas. Each category is labeled a "sense" because we think of these as essential in helping every statistical mind connect in constructive and insightful ways with statistical theory, methodologies, and computation, toward the ultimate goal of building statistical phronesis. The illu…

    Submitted 18 September, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    MSC Class: 62A01; 62-02

  8. arXiv:2108.11527  [pdf, other]

    cs.CR stat.AP

    Subspace Differential Privacy

    Authors: Jie Gao, Ruobin Gong, Fang-Yi Yu

    Abstract: Many data applications have certain invariant constraints due to practical needs. Data curators who employ differential privacy need to respect such constraints on the sanitized data product as a primary utility requirement. Invariants challenge the formulation, implementation, and interpretation of privacy guarantees. We propose subspace differential privacy, to honestly characterize the depend…

    Submitted 29 April, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: 25 pages, 3 figures; Published in AAAI'22

  9. arXiv:2103.05856  [pdf, other]

    stat.ME

    Bayesian Poisson Mortality Projections with Incomplete Data

    Authors: Rui Gong, Xiaoqian Sun, Leping Liu, Yu-Bo Wang

    Abstract: The missing data problem pervasively exists in statistical applications. Even as simple as the count data in mortality projections, it may not be available for certain age-and-year groups due to the budget limitations or difficulties in tracing research units, resulting in the follow-up estimation and prediction inaccuracies. To circumvent this data-driven challenge, we extend the Poisson log-norm…

    Submitted 9 March, 2021; originally announced March 2021.

  10. arXiv:2101.00565  [pdf, other]

    econ.EM q-fin.ST stat.ME

    Estimation of Tempered Stable Lévy Models of Infinite Variation

    Authors: José E. Figueroa-López, Ruoting Gong, Yuchen Han

    Abstract: We propose a new method for the estimation of a semiparametric tempered stable Lévy model. The estimation procedure combines iteratively an approximate semiparametric method of moment estimator, Truncated Realized Quadratic Variations (TRQV), and a newly found small-time high-order approximation for the optimal threshold of the TRQV of tempered stable processes. The method is tested via simulation…

    Submitted 24 February, 2022; v1 submitted 3 January, 2021; originally announced January 2021.

    Comments: 33 pages

  11. Transparent Privacy is Principled Privacy

    Authors: Ruobin Gong

    Abstract: In a technical treatment, this article establishes the necessity of transparent privacy for drawing unbiased statistical inference for a wide range of scientific questions. Transparency is a distinct feature enjoyed by differential privacy: the probabilistic mechanism with which the data are privatized can be made public without sabotaging the privacy guarantee. Uncertainty due to transparent priv…

    Submitted 18 September, 2022; v1 submitted 15 June, 2020; originally announced June 2020.

    MSC Class: 68P27; 62F15; 62D10

    ACM Class: G.3

    Journal ref: Harvard Data Science Review, Special Issue 2, 2022

  12. arXiv:2003.07577  [pdf, other]

    cs.LG stat.ML

    Efficient Bitwidth Search for Practical Mixed Precision Neural Network

    Authors: Yuhang Li, Wei Wang, Haoli Bai, Ruihao Gong, Xin Dong, Fengwei Yu

    Abstract: Network quantization has rapidly become one of the most widely used methods to compress and accelerate deep neural networks. Recent efforts propose to quantize weights and activations from different layers with different precision to improve the overall performance. However, it is challenging to find the optimal bitwidth (i.e., precision) for weights and activations of each layer efficiently. Mean…

    Submitted 17 March, 2020; originally announced March 2020.

    Comments: 21 pages, 7 figures

  13. arXiv:2001.08336  [pdf, other]

    math.ST stat.ME

    Geometric Conditions for the Discrepant Posterior Phenomenon and Connections to Simpson's Paradox

    Authors: Yang Chen, Ruobin Gong, Min-ge Xie

    Abstract: The discrepant posterior phenomenon (DPP) is a counter-intuitive phenomenon that can frequently occur in a Bayesian analysis of multivariate parameters. It refers to the phenomenon that a parameter estimate based on a posterior is more extreme than both of those inferred based on either the prior or the likelihood alone. Inferential claims that exhibit DPP defy the common intuition that the poster…

    Submitted 12 January, 2022; v1 submitted 22 January, 2020; originally announced January 2020.

  14. arXiv:1910.11953  [pdf, other]

    stat.CO

    A Gibbs sampler for a class of random convex polytopes

    Authors: Pierre E. Jacob, Ruobin Gong, Paul T. Edlefsen, Arthur P. Dempster

    Abstract: We present a Gibbs sampler for the Dempster-Shafer (DS) approach to statistical inference for Categorical distributions. The DS framework extends the Bayesian approach, allows in particular the use of partial prior information, and yields three-valued uncertainty assessments representing probabilities "for", "against", and "don't know" about formal assertions of interest. The proposed algorithm ta…

    Submitted 21 January, 2021; v1 submitted 25 October, 2019; originally announced October 2019.

    Comments: 23 pages including the references and appendices

  15. arXiv:1909.12237  [pdf, other]

    stat.CO math.ST

    Exact Inference with Approximate Computation for Differentially Private Data via Perturbations

    Authors: Ruobin Gong

    Abstract: This paper discusses how two classes of approximate computation algorithms can be adapted, in a modular fashion, to achieve exact statistical inference from differentially private data products. Considered are approximate Bayesian computation for Bayesian inference, and Monte Carlo Expectation-Maximization for likelihood inference. Up to Monte Carlo error, inference from these algorithms is exact…

    Submitted 26 September, 2022; v1 submitted 26 September, 2019; originally announced September 2019.

  16. arXiv:1908.00876  [pdf, other]

    eess.IV cs.LG q-bio.NC stat.ML

    MarmoNet: a pipeline for automated projection mapping of the common marmoset brain from whole-brain serial two-photon tomography

    Authors: Henrik Skibbe, Akiya Watakabe, Ken Nakae, Carlos Enrique Gutierrez, Hiromichi Tsukada, Junichi Hata, Takashi Kawase, Rui Gong, Alexander Woodward, Kenji Doya, Hideyuki Okano, Tetsuo Yamamori, Shin Ishii

    Abstract: Understanding the connectivity in the brain is an important prerequisite for understanding how the brain processes information. In the Brain/MINDS project, a connectivity study on marmoset brains uses two-photon microscopy fluorescence images of axonal projections to collect the neuron connectivity from defined brain regions at the mesoscopic scale. The processing of the images requires the detect…

    Submitted 2 August, 2019; originally announced August 2019.

  17. arXiv:1905.05935  [pdf, other]

    stat.ME

    Simultaneous Inference Under the Vacuous Orientation Assumption

    Authors: Ruobin Gong

    Abstract: I propose a novel approach to simultaneous inference that alleviates the need to specify a correlational structure among marginal errors. The vacuous orientation assumption retains what the normal i.i.d. assumption implies about the distribution of error configuration, but relaxes the implication that the error orientation is isotropic. When a large number of highly dependent hypotheses are tested…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: 10 pages, 3 figures, ISIPTA 2019

    Journal ref: PMLR 103:225-234, 2019

  18. arXiv:1806.07506  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    A Simple Fusion of Deep and Shallow Learning for Acoustic Scene Classification

    Authors: Eduardo Fonseca, Rong Gong, Xavier Serra

    Abstract: In the past, Acoustic Scene Classification systems have been based on hand crafting audio features that are input to a classifier. Nowadays, the common trend is to adopt data driven techniques, e.g., deep learning, where audio representations are learned from data. In this paper, we propose a system that consists of a simple fusion of two methods of the aforementioned types: a deep learning approa…

    Submitted 27 June, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

    Comments: accepted to SMC 2018; updated Figure 7, results unchanged

  19. arXiv:1712.08946  [pdf, other]

    math.ST stat.ME

    Judicious Judgment Meets Unsettling Updating: Dilation, Sure Loss, and Simpson's Paradox

    Authors: Ruobin Gong, Xiao-Li Meng

    Abstract: Statistical learning using imprecise probabilities is gaining more attention because it presents an alternative strategy for reducing irreplicable findings by freeing the user from the task of making up unwarranted high-resolution assumptions. However, model updating as a mathematical operation is inherently exact, hence updating imprecise models requires the user's judgment in choosing among comp…

    Submitted 24 December, 2017; originally announced December 2017.

    Comments: 32 pages, 3 figures