-
Toward a Principled Framework for Disclosure Avoidance
Authors:
Michael B Hawes,
Evan M Brassell,
Anthony Caruso,
Ryan Cumings-Menon,
Jason Devine,
Cassandra Dorius,
David Evans,
Kenneth Haase,
Michele C Hedrick,
Alexandra Krause,
Philip Leclerc,
James Livsey,
Rolando A Rodriguez,
Luke T Rogers,
Matthew Spence,
Victoria Velkoff,
Michael Walsh,
James Whitehorne,
Sallie Ann Keller
Abstract:
Responsible disclosure limitation is an iterative exercise in risk assessment and mitigation. From time to time, as disclosure risks grow and evolve and as data users' needs change, agencies must consider redesigning the disclosure avoidance system(s) they use. Discussions about candidate systems often conflate inherent features of those systems with implementation decisions independent of those systems. For example, a system's ability to calibrate the strength of protection to suit the underlying disclosure risk of the data (e.g., by varying suppression thresholds), is a worthwhile feature regardless of the independent decision about how much protection is actually necessary. Having a principled discussion of candidate disclosure avoidance systems requires a framework for distinguishing these inherent features of the systems from the implementation decisions that need to be made independent of the system selected. For statistical agencies, this framework must also reflect the applied nature of these systems, acknowledging that candidate systems need to be adaptable to requirements stemming from the legal, scientific, resource, and stakeholder environments within which they would be operating. This paper proposes such a framework. No approach will be perfectly adaptable to every potential system requirement. Because the selection of some methodologies over others may constrain the resulting systems' efficiency and flexibility to adapt to particular statistical product specifications, data user needs, or disclosure risks, agencies may approach these choices in an iterative fashion, adapting system requirements, product specifications, and implementation parameters as necessary to ensure the resulting quality of the statistical product.
Submitted 22 August, 2025; v1 submitted 10 February, 2025;
originally announced February 2025.
-
A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census
Authors:
John M. Abowd,
Tamara Adams,
Robert Ashmead,
David Darais,
Sourya Dey,
Simson L. Garfinkel,
Nathan Goldschlag,
Michael B. Hawes,
Daniel Kifer,
Philip Leclerc,
Ethan Lew,
Scott Moore,
Rolando A. Rodríguez,
Ramy N. Tadros,
Lars Vilhuber
Abstract:
We show that individual, confidential microdata records from the 2010 U.S. Census of Population and Housing can be accurately reconstructed from the published tabular summaries. Ninety-seven million person records (every resident in 70% of all census blocks) are exactly reconstructed with provable certainty using only public information. We further show that a hypothetical attacker using our methods can reidentify with 95% accuracy population unique individuals who are perfectly reconstructed and not in the modal race and ethnicity category in their census block (3.4 million persons)--a result that is only possible because their confidential records were used in the published tabulations. Finally, we show that the methods used for the 2020 Census, based on a differential privacy framework, provide better protection against this type of attack, with better published data accuracy, than feasible alternatives.
Submitted 28 July, 2025; v1 submitted 18 December, 2023;
originally announced December 2023.
-
An In-Depth Examination of Requirements for Disclosure Risk Assessment
Authors:
Ron S. Jarmin,
John M. Abowd,
Robert Ashmead,
Ryan Cumings-Menon,
Nathan Goldschlag,
Michael B. Hawes,
Sallie Ann Keller,
Daniel Kifer,
Philip Leclerc,
Jerome P. Reiter,
Rolando A. Rodríguez,
Ian Schmutte,
Victoria A. Velkoff,
Pavel Zhuravlev
Abstract:
The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near-term, the counterfactual approach appears best-suited for privacy-utility analysis.
Submitted 13 October, 2023;
originally announced October 2023.
-
$21^{st}$ Century Statistical Disclosure Limitation: Motivations and Challenges
Authors:
John M Abowd,
Michael B Hawes
Abstract:
This chapter examines the motivations and imperatives for modernizing how statistical agencies approach statistical disclosure limitation for official data product releases. It discusses the implications for agencies' broader data governance and decision-making, and it identifies challenges that agencies will likely face along the way. In conclusion, the chapter proposes some principles and best practices that we believe can help guide agencies in navigating the transformation of their confidentiality programs.
Submitted 1 March, 2023;
originally announced March 2023.
-
Confidentiality Protection in the 2020 US Census of Population and Housing
Authors:
John M Abowd,
Michael B Hawes
Abstract:
In an era where external data and computational capabilities far exceed statistical agencies' own resources and capabilities, they face the renewed challenge of protecting the confidentiality of underlying microdata when publishing statistics in very granular form and ensuring that these granular data are used for statistical purposes only. Conventional statistical disclosure limitation methods are too fragile to address this new challenge. This article discusses the deployment of a differential privacy framework for the 2020 US Census that was customized to protect confidentiality, particularly the most detailed geographic and demographic categories, and deliver controlled accuracy across the full geographic hierarchy.
Submitted 27 December, 2022; v1 submitted 7 June, 2022;
originally announced June 2022.
-
Bayesian Compressive Sensing Approaches for Direction of Arrival Estimation with Mutual Coupling Effects
Authors:
Matthew Hawes,
Lyudmila Mihaylova,
François Septier,
Simon Godsill
Abstract:
The problem of estimating the dynamic direction of arrival of far field signals impinging on a uniform linear array, with mutual coupling effects, is addressed. This work proposes two novel approaches able to provide accurate solutions, including at the endfire regions of the array. Firstly, a Bayesian compressive sensing Kalman filter is developed, which accounts for the predicted estimated signals rather than using the traditional sparse prior. The posterior probability density function of the received source signals and the expression for the related marginal likelihood function are derived theoretically. Next, a Gibbs sampling based approach with indicator variables in the sparsity prior is developed. This allows sparsity to be explicitly enforced in different ways, including when an angle is too far from the previous estimate. The proposed approaches are validated and evaluated over different test scenarios and compared to the traditional relevance vector machine based method. An improved accuracy in terms of average root mean square error values is achieved (up to 73.39% for the modified relevance vector machine based approach and 86.36% for the Gibbs sampling based approach). The proposed approaches prove to be particularly useful for direction of arrival estimation when the angle of arrival moves into the endfire region of the array.
Submitted 13 February, 2017;
originally announced February 2017.
-
Location and Orientation Optimisation for Spatially Stretched Tripole Arrays Based on Compressive Sensing
Authors:
Matthew Hawes,
Lyudmila Mihaylova,
Wei Liu
Abstract:
The design of sparse spatially stretched tripole arrays is an important but also challenging task and this paper proposes for the very first time efficient solutions to this problem. Unlike for the design of traditional sparse antenna arrays, the developed approaches optimise both the dipole locations and orientations. The novelty of the paper consists in formulating these optimisation problems into a form that can be solved by the proposed compressive sensing and Bayesian compressive sensing based approaches. The performance of the developed approaches is validated and it is shown that accurate approximation of a reference response can be achieved with a 67% reduction in the number of dipoles required as compared to an equivalent uniform spatially stretched tripole array, leading to a significant reduction in the cost associated with the resulting arrays.
Submitted 1 February, 2017;
originally announced February 2017.
-
Compressive Sensing Based Design of Sparse Tripole Arrays
Authors:
Matthew Hawes,
Wei Liu,
Lyudmila Mihaylova
Abstract:
This paper considers the problem of designing sparse linear tripole arrays. In such arrays at each antenna location there are three orthogonal dipoles, allowing full measurement of both the horizontal and vertical components of the received waveform. We formulate this problem from the viewpoint of Compressive Sensing (CS). However, unlike for isotropic array elements (single antenna), we now have three complex valued weight coefficients associated with each potential location (due to the three dipoles), which have to be simultaneously minimised. If this is not done, we may only set the weight coefficients of individual dipoles to be zero valued, rather than complete tripoles, meaning some dipoles may remain at each location. Therefore, the contributions of this paper are to formulate the design of sparse tripole arrays as an optimisation problem, and then we obtain a solution based on the minimisation of a modified l1 norm or a series of iteratively solved reweighted minimisations, which ensure a truly sparse solution. Design examples are provided to verify the effectiveness of the proposed methods and show that a good approximation of a reference pattern can be achieved using fewer tripoles than a Uniform Linear Array (ULA) of equivalent length.
Submitted 29 March, 2016;
originally announced March 2016.
-
A Bayesian Compressed Sensing Kalman Filter for Direction of Arrival Estimation
Authors:
Matthew Hawes,
Lyudmila Mihaylova,
Francois Septier,
Simon Godsill
Abstract:
In this paper, we look to address the problem of estimating the dynamic direction of arrival (DOA) of a narrowband signal impinging on a sensor array from the far field. The initial estimate is made using a Bayesian compressive sensing (BCS) framework and then tracked using a Bayesian compressed sensing Kalman filter (BCSKF). The BCS framework splits the angular region into N potential DOAs and enforces a belief that only a few of the DOAs will have a non-zero valued signal present. A BCSKF can then be used to track the change in the DOA using the same framework. There can be an issue when the DOA approaches the endfire of the array. In this angular region current methods can struggle to accurately estimate and track changes in the DOAs. To tackle this problem, we propose changing the traditional sparse belief associated with BCS to a belief that the estimated signals will match the predicted signals given a known DOA change. This is done by modelling the difference between the expected sparse received signals and the estimated sparse received signals as a Gaussian distribution. Example test scenarios are provided and comparisons made with the traditional BCS based estimation method. They show that an improvement in estimation accuracy is possible without a significant increase in computational complexity.
Submitted 21 September, 2015;
originally announced September 2015.
-
A Compressive Sensing Based Approach to Sparse Wideband Array Design
Authors:
Matthew B. Hawes,
Wei Liu
Abstract:
Sparse wideband sensor array design for sensor location optimisation is highly nonlinear and it is traditionally solved by genetic algorithms, simulated annealing or other similar optimization methods. However, this is an extremely time-consuming process and more efficient solutions are needed. In this work, this problem is studied from the viewpoint of compressive sensing and a formulation based on a modified $l_1$ norm is derived. As there are multiple coefficients associated with each sensor, the key is to make sure that these coefficients are simultaneously minimized in order to discard the corresponding sensor locations. Design examples are provided to verify the effectiveness of the proposed methods.
Submitted 19 March, 2014;
originally announced March 2014.
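Note on the entry above: the abstract refers to a "modified $l_1$ norm" that minimizes all coefficients of a sensor simultaneously. One common way to write such a group-sparse design problem is sketched below. The symbols are assumptions for illustration and are not taken from the paper: $\mathbf{w}_m$ collects the coefficients attached to candidate sensor location $m$, $\mathbf{w}$ stacks all $\mathbf{w}_m$, $\mathbf{p}_r$ is the sampled reference response, $\mathbf{S}$ is the matrix of sampled steering vectors, and $\alpha$ is the permitted approximation error.

$$
\min_{\mathbf{w}} \; \sum_{m=0}^{M-1} \lVert \mathbf{w}_m \rVert_2
\quad \text{subject to} \quad
\lVert \mathbf{p}_r - \mathbf{S}\,\mathbf{w} \rVert_2 \le \alpha
$$

Under this reading, the inner $l_2$ norm ties the coefficients of one location together, and the outer sum acts like an $l_1$ penalty across locations. The coefficients of a location therefore tend to reach zero as a group, and that location can then be discarded. The formulation actually used in the paper may differ in detail.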