Showing 1–18 of 18 results for author: Aliakbarpour, M

  1. arXiv:2507.02842  [pdf, ps, other]

    cs.DS cs.LG

    On the Structure of Replicable Hypothesis Testers

    Authors: Anders Aamand, Maryam Aliakbarpour, Justin Y. Chen, Shyam Narayanan, Sandeep Silwal

    Abstract: A hypothesis testing algorithm is replicable if, when run on two different samples from the same distribution, it produces the same output with high probability. This notion, defined by Impagliazzo, Lei, Pitassi, and Sorrell [STOC'22], can increase trust in testing procedures and is deeply related to algorithmic stability, generalization, and privacy. We build general tools to prove lower and up…

    Submitted 3 July, 2025; originally announced July 2025.

    Comments: Abstract abridged to meet arxiv requirements

  2. arXiv:2506.01162  [pdf, ps, other]

    cs.DS cs.CR cs.LG stat.ML

    Nearly-Linear Time Private Hypothesis Selection with the Optimal Approximation Factor

    Authors: Maryam Aliakbarpour, Zhan Shi, Ria Stevens, Vincent X. Wang

    Abstract: Estimating the density of a distribution from its samples is a fundamental problem in statistics. Hypothesis selection addresses the setting where, in addition to a sample set, we are given $n$ candidate distributions -- referred to as hypotheses -- and the goal is to determine which one best describes the underlying data distribution. This problem is known to be solvable very efficiently, requiri…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 33 pages

  3. arXiv:2503.14709  [pdf, ps, other]

    cs.LG cs.CR cs.DS

    Better Private Distribution Testing by Leveraging Unverified Auxiliary Data

    Authors: Maryam Aliakbarpour, Arnav Burudgunte, Clément Canonne, Ronitt Rubinfeld

    Abstract: We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorit…

    Submitted 18 March, 2025; originally announced March 2025.

  4. arXiv:2412.12374  [pdf, ps, other]

    cs.LG cs.CR

    Privacy in Metalearning and Multitask Learning: Modeling and Separations

    Authors: Maryam Aliakbarpour, Konstantina Bairaktari, Adam Smith, Marika Swanberg, Jonathan Ullman

    Abstract: Model personalization allows a set of individuals, each facing a different learning task, to train models that are more accurate for each person than those they could develop individually. The goals of personalization are captured in a variety of formal frameworks, such as multitask learning and metalearning. Combining data for model personalization poses risks for privacy because the output of an…

    Submitted 16 December, 2024; originally announced December 2024.

  5. arXiv:2412.00974  [pdf, other]

    cs.LG cs.DS stat.ML

    Optimal Algorithms for Augmented Testing of Discrete Distributions

    Authors: Maryam Aliakbarpour, Piotr Indyk, Ronitt Rubinfeld, Sandeep Silwal

    Abstract: We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing, identity testing (goodness of fit), and closeness testing (equivalence or two-sample testing). We explore these problems in a setting where a predicted data distribut…

    Submitted 1 December, 2024; originally announced December 2024.

    Comments: To appear in NeurIPS 24

  6. arXiv:2410.18404  [pdf, other]

    cs.LG cs.CR stat.ML

    Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy

    Authors: Maryam Aliakbarpour, Syomantak Chaudhuri, Thomas A. Courtade, Alireza Fallah, Michael I. Jordan

    Abstract: Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties. However, LDP applies uniform protection to all data features, including less sensitive ones, which degrades performance of downstream tasks. To overcome this limitation, we propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific p…

    Submitted 23 October, 2024; originally announced October 2024.

  7. arXiv:2312.13978  [pdf, other]

    cs.LG cs.DS

    Metalearning with Very Few Samples Per Task

    Authors: Maryam Aliakbarpour, Konstantina Bairaktari, Gavin Brown, Adam Smith, Nathan Srebro, Jonathan Ullman

    Abstract: Metalearning and multitask learning are two frameworks for solving a group of related learning tasks more efficiently than we could hope to solve each of the individual tasks on their own. In multitask learning, we are given a fixed set of related learning tasks and need to output one accurate model per task, whereas in metalearning we are given tasks that are drawn i.i.d. from a metadistribution…

    Submitted 1 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

  8. arXiv:2305.13440  [pdf, ps, other]

    cs.DS cs.LG

    Differentially Private Medians and Interior Points for Non-Pathological Data

    Authors: Maryam Aliakbarpour, Rose Silver, Thomas Steinke, Jonathan Ullman

    Abstract: We construct differentially private estimators with low sample complexity that estimate the median of an arbitrary distribution over $\mathbb{R}$ satisfying very mild moment conditions. Our result stands in contrast to the surprising negative result of Bun et al. (FOCS 2015) that showed there is no differentially private estimator with any finite sample complexity that returns any non-trivial appr…

    Submitted 22 May, 2023; originally announced May 2023.

  9. arXiv:2205.09804  [pdf, ps, other]

    cs.DS cs.IT cs.LG

    Estimation of Entropy in Constant Space with Improved Sample Complexity

    Authors: Maryam Aliakbarpour, Andrew McGregor, Jelani Nelson, Erik Waingarten

    Abstract: Recent work of Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution $\mathcal D$ over an alphabet of size $k$ up to $\pmε$ additive error by streaming over $(k/ε^3) \cdot \text{polylog}(1/ε)$ i.i.d. samples and using only $O(1)$ words of memory. In this work, we give a new constant memory scheme that reduces the sample complexity to $(k/ε^2)\cdot \text{polylog}(1/ε)$.…

    Submitted 19 May, 2022; originally announced May 2022.

  10. arXiv:2102.01258  [pdf, other]

    cs.IT cs.LG stat.ML

    Local Differential Privacy Is Equivalent to Contraction of $E_γ$-Divergence

    Authors: Shahab Asoodeh, Maryam Aliakbarpour, Flavio P. Calmon

    Abstract: We investigate the local differential privacy (LDP) guarantees of a randomized privacy mechanism via its contraction properties. We first show that LDP constraints can be equivalently cast in terms of the contraction coefficient of the $E_γ$-divergence. We then use this equivalent formula to express LDP guarantees of privacy mechanisms in terms of contraction coefficients of arbitrary $f$-divergen…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: text overlap with arXiv:2012.11035

  11. arXiv:2010.02888  [pdf, other]

    cs.LG math.ST stat.ML

    Testing Tail Weight of a Distribution Via Hazard Rate

    Authors: Maryam Aliakbarpour, Amartya Shankha Biswas, Kavya Ravichandran, Ronitt Rubinfeld

    Abstract: Understanding the shape of a distribution of data is of interest to people in a great variety of fields, as it may affect the types of algorithms used for that data. We study one such problem in the framework of distribution property testing, characterizing the number of samples required to distinguish whether a distribution has a certain property or is far from having that property. In particu…

    Submitted 4 December, 2022; v1 submitted 6 October, 2020; originally announced October 2020.

  12. arXiv:2008.03891  [pdf, other]

    cs.DB

    Rapid Approximate Aggregation with Distribution-Sensitive Interval Guarantees

    Authors: Stephen Macke, Maryam Aliakbarpour, Ilias Diakonikolas, Aditya Parameswaran, Ronitt Rubinfeld

    Abstract: Aggregating data is fundamental to data analytics, data exploration, and OLAP. Approximate query processing (AQP) techniques are often used to accelerate computation of aggregates using samples, for which confidence intervals (CIs) are widely used to quantify the associated error. CIs used in practice fall into two categories: techniques that are tight but not correct, i.e., they yield tight inter…

    Submitted 10 August, 2020; originally announced August 2020.

  13. arXiv:2008.03650  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Testing Determinantal Point Processes

    Authors: Khashayar Gatmiry, Maryam Aliakbarpour, Stefanie Jegelka

    Abstract: Determinantal point processes (DPPs) are popular probabilistic models of diversity. In this paper, we investigate DPPs from a new perspective: property testing of distributions. Given sample access to an unknown distribution $q$ over the subsets of a ground set, we aim to distinguish whether $q$ is a DPP distribution, or $ε$-far from all DPP distributions in $\ell_1$-distance. In this work, we pro…

    Submitted 9 August, 2020; originally announced August 2020.

  14. arXiv:1911.07324  [pdf, ps, other]

    cs.DS cs.DM cs.LG stat.ML

    Testing Properties of Multiple Distributions with Few Samples

    Authors: Maryam Aliakbarpour, Sandeep Silwal

    Abstract: We propose a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. Given samples from $s$ distributions, $p_1, p_2, \ldots, p_s$, we design testers for the following problems: (1) Uniformity Testing: Testing whether all the $p_i$'s are uniform or $ε$-far from being uniform in $\ell_1$-distance (2) Identity Testing:…

    Submitted 17 November, 2019; originally announced November 2019.

    Comments: ITCS 2020

  15. arXiv:1907.03190  [pdf, ps, other]

    math.ST cs.DS stat.ML

    Testing Mixtures of Discrete Distributions

    Authors: Maryam Aliakbarpour, Ravi Kumar, Ronitt Rubinfeld

    Abstract: There has been significant study on the sample complexity of testing properties of distributions over large domains. For many properties, it is known that the sample complexity can be substantially smaller than the domain size. For example, over a domain of size $n$, distinguishing the uniform distribution from distributions that are far from uniform in $\ell_1$-distance uses only $O(\sqrt{n})$ sa…

    Submitted 6 July, 2019; originally announced July 2019.

    Comments: Appeared in COLT 2019

  16. arXiv:1907.03182  [pdf, ps, other]

    cs.DS math.ST stat.ML

    Towards Testing Monotonicity of Distributions Over General Posets

    Authors: Maryam Aliakbarpour, Themis Gouleakis, John Peebles, Ronitt Rubinfeld, Anak Yodpinyanee

    Abstract: In this work, we consider the sample complexity required for testing the monotonicity of distributions over partial orders. A distribution $p$ over a poset is monotone if, for any pair of domain elements $x$ and $y$ such that $x \preceq y$, $p(x) \leq p(y)$. To understand the sample complexity of this problem, we introduce a new property called bigness over a finite domain, where the distribution…

    Submitted 6 July, 2019; originally announced July 2019.

    Comments: Appeared in COLT 2019

  17. arXiv:1707.05497  [pdf, other]

    cs.LG cs.DS cs.IT stat.ML

    Differentially Private Identity and Closeness Testing of Discrete Distributions

    Authors: Maryam Aliakbarpour, Ilias Diakonikolas, Ronitt Rubinfeld

    Abstract: We investigate the problems of identity and closeness testing over a discrete population from random samples. Our goal is to develop efficient testers while guaranteeing Differential Privacy to the individuals of the population. We describe an approach that yields sample-efficient differentially private testers for these problems. Our theoretical results show that there exist private identity and…

    Submitted 18 July, 2017; originally announced July 2017.

    Comments: Submitted, May 2017

  18. arXiv:1601.04233  [pdf, other]

    cs.DS

    Sublinear-Time Algorithms for Counting Star Subgraphs with Applications to Join Selectivity Estimation

    Authors: Maryam Aliakbarpour, Amartya Shankha Biswas, Themistoklis Gouleakis, John Peebles, Ronitt Rubinfeld, Anak Yodpinyanee

    Abstract: We study the problem of estimating the value of sums of the form $S_p \triangleq \sum \binom{x_i}{p}$ when one has the ability to sample $x_i \geq 0$ with probability proportional to its magnitude. When $p=2$, this problem is equivalent to estimating the selectivity of a self-join query in database systems when one can sample rows randomly. We also study the special case when $\{x_i\}$ is the degr…

    Submitted 16 January, 2016; originally announced January 2016.

    Comments: 21 pages