Showing 1–32 of 32 results for author: Slavkovic, A

  1. arXiv:2504.12520  [pdf, ps, other]

    math.ST cs.CY

    Interpreting Network Differential Privacy

    Authors: Jonathan Hehir, Xiaoyue Niu, Aleksandra Slavkovic

    Abstract: How do we interpret the differential privacy (DP) guarantee for network data? We take a deep dive into a popular form of network DP ($\varepsilon$--edge DP) to find that many of its common interpretations are flawed. Drawing on prior work for privacy with correlated data, we interpret DP through the lens of adversarial hypothesis testing and demonstrate a gap between the pairs of hypotheses actual…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 19 pages

  2. arXiv:2409.08301  [pdf, other]

    cs.CR cs.CV cs.LG math.FA math.ST

    Gaussian Differentially Private Human Faces Under a Face Radial Curve Representation

    Authors: Carlos Soto, Matthew Reimherr, Aleksandra Slavkovic, Mark Shriver

    Abstract: In this paper we consider the problem of releasing a Gaussian Differentially Private (GDP) 3D human face. The human face is a complex structure with many features and inherently tied to one's identity. Protecting this data, in a formally private way, is important yet challenging given the dimensionality of the problem. We extend approximate DP techniques for functional data to the GDP framework. W…

    Submitted 15 April, 2025; v1 submitted 10 September, 2024; originally announced September 2024.

    Comments: 19 pages, 10 figures

  3. arXiv:2309.14581  [pdf]

    stat.AP cs.CR econ.EM

    Assessing Utility of Differential Privacy for RCTs

    Authors: Soumya Mukherjee, Aratrika Mustafi, Aleksandra Slavković, Lars Vilhuber

    Abstract: Randomized control trials, RCTs, have become a powerful tool for assessing the impact of interventions and policies in many contexts. They are considered the gold-standard for inference in the biomedical fields and in many social sciences. Researchers have published an increasing number of studies that rely on RCTs for at least part of the inference, and these studies typically include the respons…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Submitted

  4. arXiv:2309.02416  [pdf, other]

    stat.AP

    Differentially Private Synthetic Heavy-tailed Data

    Authors: Tran Tran, Matthew Reimherr, Aleksandra Slavković

    Abstract: The U.S. Census Longitudinal Business Database (LBD) product contains employment and payroll information of all U.S. establishments and firms dating back to 1976 and is an invaluable resource for economic research. However, the sensitive information in LBD requires confidentiality measures that the U.S. Census in part addressed by releasing a synthetic version (SynLBD) of the data to protect firms…

    Submitted 14 October, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: 26 pages, LaTeX; corrected typos, added references, and clarified unclear wording

  5. arXiv:2209.12667  [pdf, other]

    stat.ML cs.LG math.DG math.ST

    Shape And Structure Preserving Differential Privacy

    Authors: Carlos Soto, Karthik Bharath, Matthew Reimherr, Aleksandra Slavkovic

    Abstract: It is common for data structures such as images and shapes of 2D objects to be represented as points on a manifold. The utility of a mechanism to produce sanitized differentially private estimates from such data is intimately linked to how compatible it is with the underlying structure and geometry of the space. In particular, as recently shown, utility of the Laplace mechanism on a positively cur…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: 15 pages (including supplementary material and references), 3 figures (including supplementary material), to be published in NeurIPS 2022

  6. arXiv:2205.08047  [pdf, other]

    stat.ML cs.LG cs.SI math.ST

    Perfect Spectral Clustering with Discrete Covariates

    Authors: Jonathan Hehir, Xiaoyue Niu, Aleksandra Slavkovic

    Abstract: Among community detection methods, spectral clustering enjoys two desirable properties: computational efficiency and theoretical guarantees of consistency. Most studies of spectral clustering consider only the edges of a network as input to the algorithm. Here we consider the problem of performing community detection in the presence of discrete node covariates, where network structure is determine…

    Submitted 16 May, 2022; originally announced May 2022.

    Comments: 23 pages, 1 figure

  7. arXiv:2205.03336  [pdf, other]

    cs.CR stat.ME

    Statistical Data Privacy: A Song of Privacy and Utility

    Authors: Aleksandra Slavkovic, Jeremy Seeman

    Abstract: To quantify trade-offs between increasing demand for open data sharing and concerns about sensitive information disclosure, statistical data privacy (SDP) methodology analyzes data release mechanisms which sanitize outputs based on confidential data. Two dominant frameworks exist: statistical disclosure control (SDC) and, more recently, differential privacy (DP). Despite framing differences, both SD…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Submitted to Annual Review of Statistics and Its Application, March 2023 Volume

  8. arXiv:2204.01132  [pdf, other]

    cs.CR stat.CO

    Exact Privacy Guarantees for Markov Chain Implementations of the Exponential Mechanism with Artificial Atoms

    Authors: Jeremy Seeman, Matthew Reimherr, Aleksandra Slavkovic

    Abstract: Implementations of the exponential mechanism in differential privacy often require sampling from intractable distributions. When approximate procedures like Markov chain Monte Carlo (MCMC) are used, the end result incurs costs to both privacy and accuracy. Existing work has examined these effects asymptotically, but implementable finite sample results are needed in practice so that users can speci…

    Submitted 3 April, 2022; originally announced April 2022.

    Comments: 16 pages, 3 figures

    Journal ref: Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  9. arXiv:2204.01102  [pdf, other]

    cs.CR stat.ME

    Formal Privacy for Partially Private Data

    Authors: Jeremy Seeman, Matthew Reimherr, Aleksandra Slavkovic

    Abstract: Differential privacy (DP) quantifies privacy loss by analyzing noise injected into output statistics. For non-trivial statistics, this noise is necessary to ensure finite privacy loss. However, data curators frequently release collections of statistics where some use DP mechanisms and others are released as-is, i.e., without additional randomized noise. Consequently, DP alone cannot characterize t…

    Submitted 14 December, 2022; v1 submitted 3 April, 2022; originally announced April 2022.

    Comments: 34 pages, 4 figures; submitted to JMLR

  10. arXiv:2201.10545  [pdf, other]

    stat.ME

    A Latent Class Modeling Approach for Generating Synthetic Data and Making Posterior Inferences from Differentially Private Counts

    Authors: Michelle Pistner Nixon, Andrés F. Barrientos, Jerome P. Reiter, Aleksandra Slavković

    Abstract: Several algorithms exist for creating differentially private counts from contingency tables, such as two-way or three-way marginal counts. The resulting noisy counts generally do not correspond to a coherent contingency table, so that some post-processing step is needed if one wants the released counts to correspond to a coherent contingency table. We present a latent class modeling approach for p…

    Submitted 25 January, 2022; originally announced January 2022.

  11. arXiv:2108.08266  [pdf, ps, other]

    cs.CR math.ST stat.AP stat.ME

    Perturbed M-Estimation: A Further Investigation of Robust Statistics for Differential Privacy

    Authors: Aleksandra Slavkovic, Roberto Molinari

    Abstract: Differential Privacy (DP) provides an elegant mathematical framework for defining a provable disclosure risk in the presence of arbitrary adversaries; it guarantees that whether an individual is in a database or not, the results of a DP procedure should be similar in terms of their probability distribution. While DP mechanisms are provably effective in protecting privacy, they often negatively imp…

    Submitted 5 August, 2021; originally announced August 2021.

  12. arXiv:2105.12615  [pdf, other]

    math.ST cs.CR cs.SI

    Consistent Spectral Clustering of Network Block Models under Local Differential Privacy

    Authors: Jonathan Hehir, Aleksandra Slavkovic, Xiaoyue Niu

    Abstract: The stochastic block model (SBM) and degree-corrected block model (DCBM) are network models often selected as the fundamental setting in which to analyze the theoretical properties of community detection methods. We consider the problem of spectral clustering of SBM and DCBM networks under a local form of edge differential privacy. Using a randomized response privacy mechanism called the edge-flip…

    Submitted 20 September, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: 32 pages, 7 figures

    Journal ref: Journal of Privacy and Confidentiality 12 (2), 2022

  13. arXiv:2003.12816  [pdf, other]

    stat.AP stat.ME

    Privacy for Spatial Point Process Data

    Authors: Adam Walder, Ephraim M. Hanks, Aleksandra Slavković

    Abstract: In this work we develop methods for privatizing spatial location data, such as spatial locations of individual disease cases. We propose two novel Bayesian methods for generating synthetic location data based on log-Gaussian Cox processes (LGCPs). We show that conditional predictive ordinate (CPO) estimates can easily be obtained for point process data. We construct a novel risk metric that utiliz…

    Submitted 28 April, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

  14. arXiv:1904.00459  [pdf, other]

    math.ST cs.CR

    Differentially Private Inference for Binomial Data

    Authors: Jordan Awan, Aleksandra Slavkovic

    Abstract: We derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite sample performance. We show that in general, DP hypothesis tests can be written in terms of linear constraints, and for exchangeable data can always be expressed as a function of the empirical distribution. Using this str…

    Submitted 31 March, 2019; originally announced April 2019.

    Comments: 25 pages before references; 39 pages total. 8 figures. arXiv admin note: text overlap with arXiv:1805.09236

  15. arXiv:1901.10864  [pdf, other]

    cs.CR cs.LG stat.ML

    Benefits and Pitfalls of the Exponential Mechanism with Applications to Hilbert Spaces and Functional PCA

    Authors: Jordan Awan, Ana Kenney, Matthew Reimherr, Aleksandra Slavković

    Abstract: The exponential mechanism is a fundamental tool of Differential Privacy (DP) due to its strong privacy guarantees and flexibility. We study its extension to settings with summaries based on infinite dimensional outputs such as with functional data analysis, shape analysis, and nonparametric statistics. We show that one can design the mechanism with respect to a specific base measure over the outpu…

    Submitted 30 January, 2019; originally announced January 2019.

    Comments: 13 pages, 5 images, 2 tables

    MSC Class: 46E22; 46S50; 60G15; 62H25

  16. arXiv:1805.09392  [pdf, other]

    stat.ME

    pMSE Mechanism: Differentially Private Synthetic Data with Maximal Distributional Similarity

    Authors: Joshua Snoke, Aleksandra Slavković

    Abstract: We propose a method for the release of differentially private synthetic datasets. In many contexts, data contain sensitive values which cannot be released in their original form in order to protect individuals' privacy. Synthetic data is a protection method that releases alternative values in place of the original ones, and differential privacy (DP) is a formal guarantee for quantifying the privac…

    Submitted 23 May, 2018; originally announced May 2018.

    Comments: 16 pages, 4 figures

  17. arXiv:1805.09236  [pdf, other]

    math.ST

    Differentially Private Uniformly Most Powerful Tests for Binomial Data

    Authors: Jordan Awan, Aleksandra Slavkovic

    Abstract: We derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite sample performance. We show that in general, DP hypothesis tests for exchangeable data can always be expressed as a function of the empirical distribution. Using this structure, we prove a `Neyman-Pearson lemma' for binom…

    Submitted 23 May, 2018; originally announced May 2018.

    Comments: 15 pages, 2 figures

  18. arXiv:1801.09236  [pdf, other]

    stat.ME

    Structure and Sensitivity in Differential Privacy: Comparing K-Norm Mechanisms

    Authors: Jordan Awan, Aleksandra Slavkovic

    Abstract: Differential privacy (DP) provides a framework for provable privacy protection against arbitrary adversaries, while allowing the release of summary statistics and synthetic data. We address the problem of releasing a noisy real-valued statistic vector $T$, a function of sensitive data under DP, via the class of $K$-norm mechanisms with the goal of minimizing the noise added to achieve privacy. Fi…

    Submitted 31 October, 2024; v1 submitted 28 January, 2018; originally announced January 2018.

    Comments: 40 pages, 6 figures, 1 table

    MSC Class: 62J05; 62J07; 62J12; 68W20

  19. arXiv:1711.06660  [pdf, other]

    math.ST

    Formal Privacy for Functional Data with Gaussian Perturbations

    Authors: Ardalan Mirshani, Matthew Reimherr, Aleksandra Slavkovic

    Abstract: Motivated by the rapid rise in statistical tools in Functional Data Analysis, we consider the Gaussian mechanism for achieving differential privacy with parameter estimates taking values in a, potentially infinite-dimensional, separable Banach space. Using classic results from probability theory, we show how densities over function spaces can be utilized to achieve the desired differential privacy…

    Submitted 27 January, 2019; v1 submitted 17 November, 2017; originally announced November 2017.

    MSC Class: 62; 68

  20. arXiv:1710.06933  [pdf, other]

    stat.ME

    Providing Accurate Models across Private Partitioned Data: Secure Maximum Likelihood Estimation

    Authors: Joshua Snoke, Timothy R. Brick, Aleksandra Slavkovic, Michael D. Hunter

    Abstract: This paper focuses on the privacy paradigm of providing access to researchers to remotely carry out analyses on sensitive data stored behind firewalls. We address the situation where the analysis demands data from multiple physically separate databases which cannot be combined. Motivating this problem are analyses using multiple data sources that currently are only possible through extension work…

    Submitted 18 October, 2017; originally announced October 2017.

  21. arXiv:1607.04204  [pdf, other]

    stat.ME

    Differentially Private Model Selection with Penalized and Constrained Likelihood

    Authors: Jing Lei, Anne-Sophie Charest, Aleksandra Slavkovic, Adam Smith, Stephen Fienberg

    Abstract: In statistical disclosure control, the goal of data analysis is twofold: The released information must provide accurate and useful statistics about the underlying population of interest, while minimizing the potential for an individual record to be identified. In recent years, the notion of differential privacy has received much attention in theoretical computer science, machine learning, and stat…

    Submitted 14 July, 2016; originally announced July 2016.

  22. arXiv:1604.06651  [pdf, other]

    stat.AP

    General and specific utility measures for synthetic data

    Authors: Joshua Snoke, Gillian Raab, Beata Nowok, Chris Dibben, Aleksandra Slavkovic

    Abstract: Data holders can produce synthetic versions of datasets when concerns about potential disclosure restrict the availability of the original records. This paper is concerned with methods to judge whether such synthetic data have a distribution that is comparable to that of the original data, what we will term general utility. We consider how general utility compares with specific utility, the simila…

    Submitted 18 June, 2017; v1 submitted 22 April, 2016; originally announced April 2016.

  23. arXiv:1511.07896  [pdf, other]

    stat.ML cs.CR cs.LG

    Private Posterior distributions from Variational approximations

    Authors: Vishesh Karwa, Dan Kifer, Aleksandra B. Slavković

    Abstract: Privacy preserving mechanisms such as differential privacy inject additional randomness in the form of noise in the data, beyond the sampling mechanism. Ignoring this additional noise can lead to inaccurate and invalid inferences. In this paper, we incorporate the privacy mechanism explicitly into the likelihood function by treating the original data as missing, with an end goal of estimating post…

    Submitted 24 November, 2015; originally announced November 2015.

  24. arXiv:1511.02930  [pdf, other]

    stat.CO cs.CR cs.SI stat.AP

    Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models

    Authors: Vishesh Karwa, Pavel N. Krivitsky, Aleksandra B. Slavković

    Abstract: Motivated by a real-life problem of sharing social network data that contain sensitive personal information, we propose a novel approach to release and analyze synthetic graphs in order to protect privacy of individual relationships captured by the social network while maintaining the validity of statistical results. A case study using a version of the Enron e-mail corpus dataset demonstrates the…

    Submitted 23 September, 2016; v1 submitted 9 November, 2015; originally announced November 2015.

    Comments: Updated, 39 pages

  25. arXiv:1409.4696  [pdf, other]

    stat.OT cs.CR stat.ME

    Differentially Private Exponential Random Graphs

    Authors: Vishesh Karwa, Aleksandra B. Slavković, Pavel Krivitsky

    Abstract: We propose methods to release and analyze synthetic graphs in order to protect privacy of individual relationships captured by the social network. Proposed techniques aim at fitting and estimating a wide class of exponential random graph models (ERGMs) in a differentially private manner, and thus offer rigorous privacy guarantees. More specifically, we use the randomized response mechanism to rele…

    Submitted 15 April, 2015; v1 submitted 16 September, 2014; originally announced September 2014.

    Comments: minor edits

  26. Scalable Privacy-Preserving Data Sharing Methodology for Genome-Wide Association Studies

    Authors: Fei Yu, Stephen E. Fienberg, Aleksandra Slavković, Caroline Uhler

    Abstract: The protection of privacy of individual-level information in genome-wide association study (GWAS) databases has been a major concern of researchers following the publication of "an attack" on GWAS data by Homer et al. (2008). Traditional statistical methods for confidentiality and privacy protection of statistical databases do not scale well to deal with GWAS data, especially in terms of guarantees…

    Submitted 21 January, 2014; originally announced January 2014.

    Comments: 28 pages, 2 figures, source code available upon request

  27. arXiv:1401.2134  [pdf, other]

    cs.DL astro-ph.IM cs.CY

    10 Simple Rules for the Care and Feeding of Scientific Data

    Authors: Alyssa Goodman, Alberto Pepe, Alexander W. Blocker, Christine L. Borgman, Kyle Cranmer, Mercè Crosas, Rosanne Di Stefano, Yolanda Gil, Paul Groth, Margaret Hedstrom, David W. Hogg, Vinay Kashyap, Ashish Mahabal, Aneta Siemiginowska, Aleksandra Slavkovic

    Abstract: This article offers a short guide to the steps scientists can take to ensure that their data and associated analyses continue to be of value and to be recognized. In just the past few years, hundreds of scholarly papers and reports have been written on questions of data sharing, data provenance, research reproducibility, licensing, attribution, privacy, and more, but our goal here is not to review…

    Submitted 9 January, 2014; originally announced January 2014.

    Comments: Accepted in PLOS Computational Biology. This paper was written collaboratively, on the web, in the open, using Authorea. The living version of this article, which includes sources and history, is available at http://www.authorea.com/3410/

  28. arXiv:1401.1397  [pdf, ps, other]

    math.ST

    Fibers of multi-way contingency tables given conditionals: relation to marginals, cell bounds and Markov bases

    Authors: Aleksandra B. Slavković, Xiaotian Zhu, Sonja Petrović

    Abstract: A reference set, or a fiber, of a contingency table is the space of all realizations of the table under a given set of constraints such as marginal totals. Understanding the geometry of this space is a key problem in algebraic statistics, important for conducting exact conditional inference, calculating cell bounds, imputing missing cell values, and assessing the risk of disclosure of sensitive in…

    Submitted 7 January, 2014; originally announced January 2014.

  29. arXiv:1205.4697  [pdf, ps, other]

    stat.ME cs.DS

    Inference using noisy degrees: Differentially private $β$-model and synthetic graphs

    Authors: Vishesh Karwa, Aleksandra Slavković

    Abstract: The $β$-model of random graphs is an exponential family model with the degree sequence as a sufficient statistic. In this paper, we contribute three key results. First, we characterize conditions that lead to a quadratic time algorithm to check for the existence of MLE of the $β$-model, and show that the MLE never exists for the degree partition $β$-model. Second, motivated by privacy problems wit…

    Submitted 12 January, 2016; v1 submitted 21 May, 2012; originally announced May 2012.

    Comments: Published at http://dx.doi.org/10.1214/15-AOS1358 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1358

    Journal ref: Annals of Statistics 2016, Vol. 44, No. 1, 87-112

  30. arXiv:1205.0739  [pdf, other]

    stat.ME cs.CR

    Privacy-Preserving Data Sharing for Genome-Wide Association Studies

    Authors: Caroline Uhler, Aleksandra B. Slavkovic, Stephen E. Fienberg

    Abstract: Traditional statistical methods for confidentiality protection of statistical databases do not scale well to deal with GWAS (genome-wide association studies) databases especially in terms of guarantees regarding protection from linkage to external information. The more recent concept of differential privacy, introduced by the cryptographic community, is an approach which provides a rigorous defini…

    Submitted 3 May, 2012; originally announced May 2012.

    MSC Class: 62F03; 68P25; 92D20

  31. Causal inference in transportation safety studies: Comparison of potential outcomes and causal diagrams

    Authors: Vishesh Karwa, Aleksandra B. Slavković, Eric T. Donnell

    Abstract: The research questions that motivate transportation safety studies are causal in nature. Safety researchers typically use observational data to answer such questions, but often without appropriate causal inference methodology. The field of causal inference presents several modeling frameworks for probing empirical data to assess causal relations. This paper focuses on exploring the applicability o…

    Submitted 25 July, 2011; originally announced July 2011.

    Comments: Published at http://dx.doi.org/10.1214/10-AOAS440 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS440

    Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 2B, 1428-1455

  32. arXiv:math/0405046  [pdf, ps, other]

    math.AG math.CO math.ST

    The space of compatible full conditionals is a unimodular toric variety

    Authors: Aleksandra B Slavkovic, Seth Sullivant

    Abstract: The set of all m-tuples of compatible full conditional distributions on discrete random variables is an algebraic set whose defining ideal is a unimodular toric ideal. We identify the defining polynomials of these ideals with closed walks on a bipartite graph. Our algebraic characterization provides a natural generalization of the requirement that compatible conditionals have identical odds rati…

    Submitted 3 May, 2004; originally announced May 2004.

    Comments: 15 pages