Showing 1–11 of 11 results for author: Sexton, W

  1. arXiv:2505.03072  [pdf, ps, other]

    cs.CR cs.CY

    SafeTab-H: Disclosure Avoidance for the 2020 Census Detailed Demographic and Housing Characteristics File B (Detailed DHC-B)

    Authors: William Sexton, Skye Berghel, Bayard Carlson, Sam Haney, Luke Hartman, Michael Hay, Ashwin Machanavajjhala, Gerome Miklau, Amritha Pai, Simran Rajpal, David Pujol, Ruchit Shrestha, Daniel Simmons-Marengo

    Abstract: This article describes SafeTab-H, a disclosure avoidance algorithm applied to the release of the U.S. Census Bureau's Detailed Demographic and Housing Characteristics File B (Detailed DHC-B) as part of the 2020 Census. The tabulations contain household statistics about household type and tenure iterated by the householder's detailed race, ethnicity, or American Indian and Alaska Native tribe and v…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: 27 pages, 0 figures. arXiv admin note: substantial text overlap with arXiv:2505.01472

  2. arXiv:2505.01472  [pdf, other]

    cs.CR cs.CY

    SafeTab-P: Disclosure Avoidance for the 2020 Census Detailed Demographic and Housing Characteristics File A (Detailed DHC-A)

    Authors: Sam Haney, Skye Berghel, Bayard Carlson, Ryan Cumings-Menon, Luke Hartman, Michael Hay, Ashwin Machanavajjhala, Gerome Miklau, Amritha Pai, Simran Rajpal, David Pujol, William Sexton, Ruchit Shrestha, Daniel Simmons-Marengo

    Abstract: This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the Detailed Demographic and Housing Characteristics File A (Detailed DHC-A) of the 2020 Census. The tabulations contain statistics (counts) of demographic characteristics of the entire population of the United States, crossed with detailed races and ethnicities at varying levels of geography. The…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: 30 pages, 2 figures

  3. arXiv:2505.01254  [pdf, other]

    cs.CR cs.CY

    PHSafe: Disclosure Avoidance for the 2020 Census Supplemental Demographic and Housing Characteristics File (S-DHC)

    Authors: William Sexton, Skye Berghel, Bayard Carlson, Sam Haney, Luke Hartman, Michael Hay, Ashwin Machanavajjhala, Gerome Miklau, Amritha Pai, Simran Rajpal, David Pujol, Ruchit Shrestha, Daniel Simmons-Marengo

    Abstract: This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the 2020 Census Supplemental Demographic and Housing Characteristics File (S-DHC). The tabulations contain statistics of counts of U.S. persons living in certain types of households, including averages. The article describes the PHSafe algorithm, which is based on adding noise drawn from a discret…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: 26 pages, 1 figure

  4. arXiv:2409.18118  [pdf, other]

    cs.CR stat.ME

    Slowly Scaling Per-Record Differential Privacy

    Authors: Brian Finley, Anthony M Caruso, Justin C Doty, Ashwin Machanavajjhala, Mikaela R Meyer, David Pujol, William Sexton, Zachary Terner

    Abstract: We develop formal privacy mechanisms for releasing statistics from data with many outlying values, such as income data. These mechanisms ensure that a per-record differential privacy guarantee degrades slowly in the protected records' influence on the statistics being released. Formal privacy mechanisms generally add randomness, or "noise," to published statistics. If a noisy statistic's distrib…

    Submitted 2 May, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: This version fixes a mistaken variance formula in the first column of Table 3 and updates Figure 1 to use this variance formula

  5. Privately Answering Queries on Skewed Data via Per Record Differential Privacy

    Authors: Jeremy Seeman, William Sexton, David Pujol, Ashwin Machanavajjhala

    Abstract: We consider the problem of the private release of statistics (like aggregate payrolls) where it is critical to preserve the contribution made by a small number of outlying large entities. We propose a privacy formalism, per-record zero concentrated differential privacy (PzCDP), where the privacy loss associated with each record is a public function of that record's value. Unlike other formalisms w…

    Submitted 18 December, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: 14 pages, 5 figures

  6. arXiv:2212.04133  [pdf, other]

    cs.CR

    Tumult Analytics: a robust, easy-to-use, scalable, and expressive framework for differential privacy

    Authors: Skye Berghel, Philip Bohannon, Damien Desfontaines, Charles Estes, Sam Haney, Luke Hartman, Michael Hay, Ashwin Machanavajjhala, Tom Magerlein, Gerome Miklau, Amritha Pai, William Sexton, Ruchit Shrestha

    Abstract: In this short paper, we outline the design of Tumult Analytics, a Python framework for differential privacy used at institutions such as the U.S. Census Bureau, the Wikimedia Foundation, or the Internal Revenue Service.

    Submitted 8 December, 2022; originally announced December 2022.

  7. arXiv:2209.03310  [pdf, other]

    cs.CR stat.ME

    Bayesian and Frequentist Semantics for Common Variations of Differential Privacy: Applications to the 2020 Census

    Authors: Daniel Kifer, John M. Abowd, Robert Ashmead, Ryan Cumings-Menon, Philip Leclerc, Ashwin Machanavajjhala, William Sexton, Pavel Zhuravlev

    Abstract: The purpose of this paper is to guide interpretation of the semantic privacy guarantees for some of the major variations of differential privacy, which include pure, approximate, Rényi, zero-concentrated, and $f$ differential privacy. We interpret privacy-loss accounting parameters, frequentist semantics, and Bayesian semantics (including new results). The driving application is the interpretation…

    Submitted 7 September, 2022; originally announced September 2022.

  8. arXiv:2204.08986  [pdf, other]

    cs.CR econ.EM stat.AP

    The 2020 Census Disclosure Avoidance System TopDown Algorithm

    Authors: John M. Abowd, Robert Ashmead, Ryan Cumings-Menon, Simson Garfinkel, Micah Heineck, Christine Heiss, Robert Johns, Daniel Kifer, Philip Leclerc, Ashwin Machanavajjhala, Brett Moran, William Sexton, Matthew Spence, Pavel Zhuravlev

    Abstract: The Census TopDown Algorithm (TDA) is a disclosure avoidance system using differential privacy for privacy-loss accounting. The algorithm ingests the final, edited version of the 2020 Census data and the final tabulation geographic definitions. The algorithm then creates noisy versions of key queries on the data, referred to as measurements, using zero-Concentrated Differential Privacy. Another ke…

    Submitted 19 April, 2022; originally announced April 2022.

  9. arXiv:2110.13239  [pdf, ps, other]

    cs.CR

    An Uncertainty Principle is a Price of Privacy-Preserving Microdata

    Authors: John Abowd, Robert Ashmead, Ryan Cumings-Menon, Simson Garfinkel, Daniel Kifer, Philip Leclerc, William Sexton, Ashley Simpson, Christine Task, Pavel Zhuravlev

    Abstract: Privacy-protected microdata are often the desired output of a differentially private algorithm since microdata is familiar and convenient for downstream users. However, there is a statistical price for this kind of convenience. We show that an uncertainty principle governs the trade-off between accuracy for a population of interest ("sum query") vs. accuracy for its component sub-populations ("poi…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: Preprint of NeurIPS 2021 paper

  10. arXiv:2107.10659  [pdf, ps, other]

    cs.CR cs.DB stat.AP

    Differentially Private Algorithms for 2020 Census Detailed DHC Race & Ethnicity

    Authors: Sam Haney, William Sexton, Ashwin Machanavajjhala, Michael Hay, Gerome Miklau

    Abstract: This article describes proposed differentially private (DP) algorithms that the US Census Bureau is considering to release the Detailed Demographic and Housing Characteristics (DHC) Race & Ethnicity tabulations as part of the 2020 Census. The tabulations contain statistics (counts) of demographic and housing characteristics of the entire population of the US crossed with detailed races and tribe…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: Presented at Theory and Practice of Differential Privacy Workshop (TPDP) 2021

  11. arXiv:1906.09353  [pdf, ps, other]

    econ.TH cs.CR cs.DB

    Suboptimal Provision of Privacy and Statistical Accuracy When They are Public Goods

    Authors: John M. Abowd, Ian M. Schmutte, William Sexton, Lars Vilhuber

    Abstract: With vast databases at their disposal, private tech companies can compete with public statistical agencies to provide population statistics. However, private companies face different incentives to provide high-quality statistics and to protect the privacy of the people whose data are used. When both privacy protection and statistical accuracy are public goods, private providers tend to produce at…

    Submitted 21 June, 2019; originally announced June 2019.