Showing 1–4 of 4 results for author: Reshef, Y A

  1. arXiv:1505.02214

    stat.ME cs.IT cs.LG q-bio.QM stat.ML

    An Empirical Study of Leading Measures of Dependence

    Authors: David N. Reshef, Yakir A. Reshef, Pardis C. Sabeti, Michael M. Mitzenmacher

    Abstract: In exploratory data analysis, we are often interested in identifying promising pairwise associations for further analysis while filtering out weaker, less interesting ones. This can be accomplished by computing a measure of dependence on all variable pairs and examining the highest-scoring pairs, provided the measure of dependence used assigns similar scores to equally noisy relationships of diffe…

    Submitted 12 May, 2015; v1 submitted 8 May, 2015; originally announced May 2015.

    Comments: David N. Reshef and Yakir A. Reshef are co-first authors, Pardis C. Sabeti and Michael M. Mitzenmacher are co-last authors

    Journal ref: Ann.Appl.Stat. 12 (2018) 123-155

  2. arXiv:1505.02213

    stat.ME cs.IT cs.LG q-bio.QM stat.ML

    Measuring dependence powerfully and equitably

    Authors: Yakir A. Reshef, David N. Reshef, Hilary K. Finucane, Pardis C. Sabeti, Michael M. Mitzenmacher

    Abstract: Given a high-dimensional data set we often wish to find the strongest relationships within it. A common strategy is to evaluate a measure of dependence on every variable pair and retain the highest-scoring pairs for follow-up. This strategy works well if the statistic used is equitable [Reshef et al. 2015a], i.e., if, for some measure of noise, it assigns similar scores to equally noisy relationsh…

    Submitted 30 August, 2021; v1 submitted 8 May, 2015; originally announced May 2015.

    Comments: YAR and DNR are co-first authors, PCS and MMM are co-last authors. This paper, together with arXiv:1505.02212, subsumes arXiv:1408.4908. v3 includes new analyses and exposition. v4 is identical to v3 except for this comment. An error was found in the argument showing the consistency of the MIC estimator; see arXiv:2107.03836 for discussion and a corrected argument

    Journal ref: J.Mach.Learn.Res. 17 (2016), 1-63

  3. arXiv:1505.02212

    math.ST cs.LG q-bio.QM stat.ME stat.ML

    Equitability, interval estimation, and statistical power

    Authors: Yakir A. Reshef, David N. Reshef, Pardis C. Sabeti, Michael M. Mitzenmacher

    Abstract: For analysis of a high-dimensional dataset, a common approach is to test a null hypothesis of statistical independence on all variable pairs using a non-parametric measure of dependence. However, because this approach attempts to identify any non-trivial relationship no matter how weak, it often identifies too many relationships to be useful. What is needed is a way of identifying a smaller set of…

    Submitted 12 May, 2015; v1 submitted 8 May, 2015; originally announced May 2015.

    Comments: Yakir A. Reshef and David N. Reshef are co-first authors, Pardis C. Sabeti and Michael M. Mitzenmacher are co-last authors. This paper, together with arXiv:1505.02213, subsumes arXiv:1408.4908

  4. arXiv:1408.4908

    stat.ME cs.IT math.ST q-bio.QM stat.ML

    Theoretical Foundations of Equitability and the Maximal Information Coefficient

    Authors: Yakir A. Reshef, David N. Reshef, Pardis C. Sabeti, Michael Mitzenmacher

    Abstract: The maximal information coefficient (MIC) is a tool for finding the strongest pairwise relationships in a data set with many variables (Reshef et al., 2011). MIC is useful because it gives similar scores to equally noisy relationships of different types. This property, called equitability, is important for analyzing high-dimensional data sets. Here we formalize the theory behind both equit…

    Submitted 12 May, 2015; v1 submitted 21 August, 2014; originally announced August 2014.

    Comments: 46 pages, 3 figures, 2 tables. This paper has been subsumed by arXiv:1505.02213 and arXiv:1505.02212. Please cite those papers instead