-
Quadratic Form based Multiple Contrast Tests for Comparison of Group Means
Authors:
Paavo Sattler,
Markus Pauly,
Merle Munko
Abstract:
Comparing mean vectors across different groups is a cornerstone of multivariate statistics, with quadratic forms commonly serving as test statistics. However, when the overall hypothesis is rejected, identifying specific vector components or determining the groups among which differences exist requires additional investigations. Conversely, employing multiple contrast tests (MCTs) allows conclusions about which components or groups contribute to these differences. However, this comes with a trade-off, as MCTs lose some benefits inherent to quadratic forms. In this paper, we combine both approaches to obtain a quadratic-form-based multiple contrast test that leverages the advantages of both. To understand its theoretical properties, we investigate its asymptotic distribution in a semiparametric model. We thereby focus on two common quadratic forms, the Wald-type statistic and the ANOVA-type statistic, although our findings are applicable to any quadratic form.
Furthermore, we employ Monte-Carlo and resampling techniques to enhance the test's performance in small sample scenarios. Through an extensive simulation study, we assess the performance of our proposed tests against existing alternatives, highlighting their advantages.
Submitted 3 June, 2025; v1 submitted 15 November, 2024;
originally announced November 2024.
-
Early and Late Buzzards: Comparing Different Approaches for Quantile-based Multiple Testing in Heavy-Tailed Wildlife Research Data
Authors:
Marléne Baumeister,
Merle Munko,
Kai-Philipp Gladow,
Marc Ditzhaus,
Nayden Chakarov,
Markus Pauly
Abstract:
In medical, ecological and psychological research, there is a need for methods to handle multiple testing, for example, to consider group comparisons with more than two groups. Typical approaches that deal with multiple testing are mean- or variance-based, which can be less effective in the context of heavy-tailed and skewed data. Here, the median is the preferred measure of location and the interquartile range (IQR) is an adequate alternative to the variance. Therefore, it may be fruitful to formulate research questions of interest in terms of the median or the IQR. For this reason, we compare different inference approaches for two-sided and non-inferiority hypotheses formulated in terms of medians or IQRs in an extensive simulation study. We consider multiple contrast testing procedures combined with a bootstrap method as well as testing procedures with Bonferroni correction. As an example of a multiple testing problem based on heavy-tailed data, we analyse an ecological trait variation in early and late breeding in a medium-sized bird of prey.
Submitted 28 April, 2025; v1 submitted 23 September, 2024;
originally announced September 2024.
-
Multiple tests for restricted mean time lost with competing risks data
Authors:
Merle Munko,
Dennis Dobler,
Marc Ditzhaus
Abstract:
Easy-to-interpret effect estimands are highly desirable in survival analysis. In the competing risks framework, one good candidate is the restricted mean time lost (RMTL). It is defined as the area under the cumulative incidence function up to a prespecified time point and, thus, it summarizes the cumulative incidence function into a meaningful estimand. While existing RMTL-based tests are limited to two-sample comparisons and mostly to two event types, we aim to develop general contrast tests for factorial designs and an arbitrary number of event types based on a Wald-type test statistic. Furthermore, we avoid the often-made, rather restrictive continuity assumption on the event time distribution. This allows for ties in the data, which often occur in practical applications, e.g., when event times are measured in whole days. In addition, we develop more reliable tests for RMTL comparisons that are based on a permutation approach to improve the small sample performance. In a second step, multiple tests for RMTL comparisons are developed to test several null hypotheses simultaneously. Here, we incorporate the asymptotically exact dependence structure between the local test statistics to gain more power. The small sample performance of the proposed testing procedures is analyzed in simulations and finally illustrated by analyzing a real data example about leukemia patients who underwent bone marrow transplantation.
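As a brief illustration of the definition in the abstract above, the RMTL can be written as an integral. The symbols are assumptions introduced here and are not taken from the paper: $F_k$ denotes the cumulative incidence function of event type $k$, $S$ the overall survival function, $K$ the number of event types, and $\tau$ the prespecified time point.

```latex
\mathrm{RMTL}_k(\tau) = \int_0^{\tau} F_k(t)\,\mathrm{d}t ,
\qquad
\int_0^{\tau} S(t)\,\mathrm{d}t + \sum_{k=1}^{K} \mathrm{RMTL}_k(\tau) = \tau .
```

The second identity follows from $S(t) = 1 - \sum_{k=1}^{K} F_k(t)$: the time lost to each event type and the restricted mean survival time together add up to $\tau$.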
Submitted 12 September, 2024;
originally announced September 2024.
-
Conditional Delta-Method for Resampling Empirical Processes in Multiple Sample Problems
Authors:
Merle Munko,
Dennis Dobler
Abstract:
The functional delta-method has a wide range of applications in statistics. Applications to functionals of empirical processes yield various limit results for classical statistics. To improve the finite sample properties of statistical inference procedures that are based on the limit results, resampling procedures such as random permutation and bootstrap methods are a popular solution. In order to analyze the behaviour of the functionals of the resampling empirical processes, corresponding conditional functional delta-methods are desirable. While conditional functional delta-methods for some special cases already exist, there is a lack of more general conditional functional delta-methods for resampling procedures for empirical processes, such as the permutation and pooled bootstrap method. This gap is addressed in the present paper. To this end, a general multiple sample problem is considered. The flexible application of the developed conditional delta-method is shown in various relevant examples.
Submitted 20 August, 2024;
originally announced August 2024.
-
Multiple Comparison Procedures for Simultaneous Inference in Functional MANOVA
Authors:
Merle Munko,
Marc Ditzhaus,
Markus Pauly,
Łukasz Smaga
Abstract:
Functional data analysis is becoming increasingly popular to study data from real-valued random functions. Nevertheless, there is a lack of multiple testing procedures for such data. These are particularly important in factorial designs to compare different groups or to infer factor effects. We propose a new class of testing procedures for arbitrary linear hypotheses in general factorial designs with functional data. Our methods allow global as well as multiple inference of both univariate and multivariate mean functions without assuming particular error distributions or homoscedasticity. That is, we allow for different structures of the covariance functions between groups. To this end, we use point-wise quadratic-form-type test functions that take potential heteroscedasticity into account. Taking the supremum over each test function, we define a class of local test statistics. We analyse their (joint) asymptotic behaviour and propose a resampling approach to approximate the limit distributions. The resulting global and multiple testing procedures are asymptotically valid under weak conditions and applicable in general functional MANOVA settings. We evaluate their small-sample performance in extensive simulations and finally illustrate their applicability by analysing a multivariate functional air pollution data set.
Submitted 3 June, 2024;
originally announced June 2024.
-
RMST-based multiple contrast tests in general factorial designs
Authors:
Merle Munko,
Marc Ditzhaus,
Dennis Dobler,
Jon Genuneit
Abstract:
Several methods in survival analysis are based on the proportional hazards assumption. However, this assumption is very restrictive and often not justifiable in practice. Therefore, effect estimands that do not rely on the proportional hazards assumption are highly desirable in practical applications. One popular example of this is the restricted mean survival time (RMST). It is defined as the area under the survival curve up to a prespecified time point and, thus, summarizes the survival curve into a meaningful estimand. For two-sample comparisons based on the RMST, previous research found an inflated type I error of the asymptotic test for small samples and, therefore, a two-sample permutation test has already been developed. The first goal of the present paper is to further extend the permutation test to general factorial designs and general contrast hypotheses by considering a Wald-type test statistic and its asymptotic behavior. Additionally, a groupwise bootstrap approach is considered. Moreover, when a global test detects a significant difference by comparing the RMSTs of more than two groups, it is of interest which specific RMST differences cause the result. However, global tests do not provide this information. Therefore, multiple tests for the RMST are developed in a second step to test several null hypotheses simultaneously. Here, the asymptotically exact dependence structure between the local test statistics is incorporated to gain more power. Finally, the small sample performance of the proposed global and multiple testing procedures is analyzed in simulations and illustrated in a real data example.
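The RMST definition in the abstract above (area under the survival curve up to a prespecified time point) can be made concrete with a minimal sketch. It assumes the survival curve is already available as a right-continuous step function, for example from a Kaplan-Meier fit. The function name `rmst_from_step_curve` and its arguments are hypothetical and are not taken from the paper or from any package.

```python
def rmst_from_step_curve(times, surv, tau):
    """Area under a right-continuous step survival curve on [0, tau].

    times: strictly increasing jump times (all > 0)
    surv:  value of the survival curve from each jump time onward
    tau:   prespecified restriction time (> 0)
    The curve is assumed to equal 1 on [0, times[0]).
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(times, surv):
        if t >= tau:
            break
        # Rectangle between the previous jump and this one.
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final rectangle from the last jump before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical step curve: drops to 0.8 at t=1, 0.5 at t=2, 0.2 at t=3.
example_rmst = rmst_from_step_curve([1.0, 2.0, 3.0], [0.8, 0.5, 0.2], tau=2.5)
```

For this hypothetical curve, the area up to 2.5 is the sum of three rectangles, 1·1 + 0.8·1 + 0.5·0.5. The sketch covers only the point estimate and none of the testing procedures described in the abstract.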
Submitted 19 March, 2024; v1 submitted 16 August, 2023;
originally announced August 2023.
-
General multiple tests for functional data
Authors:
Merle Munko,
Marc Ditzhaus,
Markus Pauly,
Łukasz Smaga,
Jin-Ting Zhang
Abstract:
While there exist several inferential methods for analyzing functional data in factorial designs, there is a lack of statistical tests that (i) are valid in general designs, (ii) are valid under non-restrictive assumptions on the data generating process and (iii) allow for coherent post-hoc analyses. In particular, most existing methods assume Gaussianity or equal covariance functions across groups (homoscedasticity) and are only applicable for specific study designs that do not allow for evaluation of interactions. Moreover, all available strategies are only designed for testing global hypotheses and do not directly allow a more in-depth analysis of multiple local hypotheses. To address the first two problems (i)-(ii), we propose flexible integral-type test statistics that are applicable in general factorial designs under minimal assumptions on the data generating process. In particular, we neither postulate homoscedasticity nor Gaussianity. To approximate the statistics' null distribution, we adopt a resampling approach and validate it methodologically. Finally, we use our flexible testing framework to (iii) infer several local null hypotheses simultaneously. To allow for powerful data analysis, we thereby take the complex dependencies of the different local test statistics into account. In extensive simulations we confirm that the new methods are flexibly applicable. Two illustrative data analyses complete our study. The new testing procedures are implemented in the R package multiFANOVA, which will be available on CRAN soon.
Submitted 27 June, 2023;
originally announced June 2023.
-
A Discontinuity Adjustment for Subdistribution Function Confidence Bands Applied to Right-Censored Competing Risks Data (with Erratum)
Authors:
Dennis Dobler,
Merle Munko
Abstract:
The wild bootstrap is the resampling method of choice in survival analytic applications. Theoretical justifications rely on the assumption of existing intensity functions, which is equivalent to an exclusion of ties among the event times. However, such ties are omnipresent in practical studies. It turns out that the wild bootstrap should only be applied in a modified manner that corrects for altered limit variances and emerging dependencies. This again ensures the asymptotic exactness of inferential procedures. An analogous necessity is the use of the Greenwood-type variance estimator for Nelson-Aalen estimators, which is particularly preferred in tied data regimes. All theoretical arguments are transferred to bootstrapping Aalen-Johansen estimators for cumulative incidence functions in competing risks. An extensive simulation study as well as an application to real competing risks data of male intensive care unit patients suffering from pneumonia illustrate the practicability of the proposed technique.
Submitted 10 September, 2024; v1 submitted 3 February, 2017;
originally announced February 2017.