-
Computing the Bergsma Dassios sign-covariance
Authors:
Yair Heller,
Ruth Heller
Abstract:
Bergsma and Dassios (2014) introduced an independence measure which is zero if and only if two random variables are independent. This measure can be naively calculated in $O(n^4)$. Weihs et al. (2015) showed that it can be calculated in $O(n^2 \log n)$. In this note we will show that using the methods described in Heller et al. (2016), the measure can easily be calculated in only $O(n^2)$.
Submitted 27 May, 2016;
originally announced May 2016.
-
Multivariate tests of association based on univariate tests
Authors:
Ruth Heller,
Yair Heller
Abstract:
For testing two random vectors for independence, we consider testing whether the distance of one vector from a center point is independent from the distance of the other vector from a center point by a univariate test. In this paper we provide conditions under which it is enough to have a consistent univariate test of independence on the distances to guarantee that the power to detect dependence between the random vectors increases to one, as the sample size increases. These conditions turn out to be minimal. If the univariate test is distribution-free, the multivariate test will also be distribution-free. If we consider multiple center points and aggregate the center-specific univariate tests, the power may be further improved, and the resulting multivariate test may be distribution-free for specific aggregation methods (if the univariate test is distribution-free). We show that several multivariate tests recently proposed in the literature can be viewed as instances of this general approach.
Submitted 10 March, 2016;
originally announced March 2016.
-
Consistent distribution-free $K$-sample and independence tests for univariate random variables
Authors:
Ruth Heller,
Yair Heller,
Shachar Kaufman,
Barak Brill,
Malka Gorfine
Abstract:
A popular approach for testing if two univariate random variables are statistically independent consists of partitioning the sample space into bins, and evaluating a test statistic on the binned data. The partition size matters, and the optimal partition size is data dependent. While for detecting simple relationships coarse partitions may be best, for detecting complex relationships a great gain in power can be achieved by considering finer partitions. We suggest novel consistent distribution-free tests that are based on summation or maximization aggregation of scores over all partitions of a fixed size. We show that our test statistics based on summation can serve as good estimators of the mutual information. Moreover, we suggest regularized tests that aggregate over all partition sizes, and prove those are consistent too. We provide polynomial-time algorithms, which are critical for computing the suggested test statistics efficiently. We show that the power of the regularized tests is excellent compared to existing tests, and almost as powerful as the tests based on the optimal (yet unknown in practice) partition size, in simulations as well as on a real data example.
Submitted 18 June, 2015; v1 submitted 24 October, 2014;
originally announced October 2014.
-
Consistent distribution-free tests of association between univariate random variables
Authors:
Ruth Heller,
Yair Heller,
Shachar Kaufman,
Malka Gorfine
Abstract:
We consider the problem of testing whether pairs of univariate random variables are associated. Few tests of independence exist that are consistent against all dependent alternatives and are distribution free. We propose novel tests that are consistent, distribution free, and have excellent power properties. The tests have simple form, and are surprisingly computationally efficient thanks to accompanying innovative algorithms we develop. Moreover, we show that one of the test statistics is a consistent estimator of the mutual information. We demonstrate the good power properties in simulations, and apply the tests to a microarray study where many pairs of genes are examined simultaneously for co-dependence.
Submitted 8 December, 2014; v1 submitted 7 August, 2013;
originally announced August 2013.
-
A consistent multivariate test of association based on ranks of distances
Authors:
Ruth Heller,
Yair Heller,
Malka Gorfine
Abstract:
We are concerned with the detection of associations between random vectors of any dimension. Few tests of independence exist that are consistent against all dependent alternatives. We propose a powerful test that is applicable in all dimensions and is consistent against all alternatives. The test has a simple form and is easy to implement. We demonstrate its good power properties in simulations and on examples.
Submitted 31 May, 2012; v1 submitted 17 January, 2012;
originally announced January 2012.