-
Minimum Copula Divergence for Robust Estimation
Authors:
Shinto Eguchi,
Shogo Kato
Abstract:
This paper introduces a robust estimation framework based solely on the copula function. We begin by introducing a family of divergence measures tailored for copulas, including the \(α\)-, \(β\)-, and \(γ\)-copula divergences, which quantify the discrepancy between a parametric copula model and an empirical copula derived from data independently of marginal specifications. Using these divergence measures, we propose the minimum copula divergence estimator (MCDE), an estimation method that minimizes the divergence between the model and the empirical copula. The framework proves particularly effective in addressing model misspecifications and analyzing heavy-tailed data, where traditional methods such as the maximum likelihood estimator (MLE) may fail. Theoretical results show that common copula families, including Archimedean and elliptical copulas, satisfy conditions ensuring the boundedness of divergence-based estimators, thereby guaranteeing the robustness of MCDE, especially in the presence of extreme observations. Numerical examples further underscore MCDE's ability to adapt to varying dependence structures, ensuring its utility in real-world scenarios.
Submitted 23 February, 2025;
originally announced February 2025.
-
Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution
Authors:
Michail Tsagris,
Panagiotis Papastamoulis,
Shogo Kato
Abstract:
In 2020, two novel distributions for the analysis of directional data were introduced: the spherical Cauchy distribution and the Poisson kernel-based distribution. This paper provides a detailed exploration of both distributions within various analytical frameworks. To enhance the practical utility of these distributions, alternative parametrizations are presented that offer advantages in numerical stability and parameter estimation, such as implementation of the Newton-Raphson algorithm for parameter estimation, while facilitating a more efficient and simpler approach in the regression framework. Additionally, a two-sample location test based on the log-likelihood ratio test is introduced. This test is designed to assess whether the location parameters of two populations can be assumed equal. The maximum likelihood discriminant analysis framework is developed for classification purposes, and finally, the problem of clustering directional data is addressed by fitting finite mixtures of spherical Cauchy or Poisson kernel-based distributions. Empirical validation is conducted through comprehensive simulation studies and real data applications, wherein the performance of the spherical Cauchy and Poisson kernel-based distributions is systematically compared.
Submitted 10 November, 2024; v1 submitted 5 September, 2024;
originally announced September 2024.
-
Measuring and testing tail equivalence
Authors:
Takaaki Koike,
Shogo Kato,
Toshinao Yoshiba
Abstract:
We call two copulas tail equivalent if their first-order approximations in the tail coincide. As a special case, a copula is called tail symmetric if it is tail equivalent to the associated survival copula. We propose a novel measure and statistical test for tail equivalence. The proposed measure takes the value of zero if and only if the two copulas share the same pair of tail order and tail order parameter. Moreover, taking the nature of these tail quantities into account, we design the proposed measure so that it takes a large value when tail orders are different, and a small value when tail order parameters are non-identical. We derive asymptotic properties of the proposed measure, and then propose a novel statistical test for tail equivalence. Performance of the proposed test is demonstrated in a series of simulation studies and empirical analyses of financial stock returns in the periods of the world financial crisis and the COVID-19 recession. Our empirical analysis reveals non-identical tail behaviors in different pairs of stocks, different parts of tails, and the two recession periods.
Submitted 19 July, 2024;
originally announced July 2024.
-
A versatile trivariate wrapped Cauchy copula with applications to toroidal and cylindrical data
Authors:
Shogo Kato,
Christophe Ley,
Sophia Loizidou,
Kanti V. Mardia
Abstract:
In this paper, we propose a new flexible distribution for data on the three-dimensional torus which we call a trivariate wrapped Cauchy copula. Our trivariate copula has several attractive properties. It has a simple form of density and is unimodal. Its parameters are interpretable, allow an adjustable degree of dependence between every pair of variables, and can be easily estimated. The conditional distributions of the model are well-studied bivariate wrapped Cauchy distributions. Furthermore, the distribution can be easily simulated. Parameter estimation via maximum likelihood for the distribution is given and we highlight the simple implementation procedure to obtain these estimates. We compare our model to its competitors for analysing trivariate data and provide some evidence of its advantages. Another interesting feature of this model is that it can be extended to a cylindrical copula; we describe this new cylindrical copula and give its properties. We illustrate our trivariate wrapped Cauchy copula on data from protein bioinformatics of conformational angles, and our cylindrical copula using climate data from a buoy in the Adriatic Sea. The paper is motivated by these real trivariate datasets, but we indicate how the model can be extended to multivariate copulas.
Submitted 11 October, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Traffic Count Data Analysis Using Mixtures of Kato-Jones Distributions
Authors:
Kota Nagasaki,
Shogo Kato,
Wataru Nakanishi,
M. C. Jones
Abstract:
We discuss the modelling of traffic count data that show the variation of traffic volume within a day. For the modelling, we apply mixtures of Kato-Jones distributions in which each component is unimodal and affords a wide range of skewness and kurtosis. We consider two methods for parameter estimation, namely, a modified method of moments and the maximum likelihood method. These methods were seen to be useful for fitting the proposed mixtures to our data. As a result, the variation in traffic volume was classified into the morning and evening traffic whose distributions have different shapes, particularly different degrees of skewness and kurtosis.
Submitted 9 July, 2024; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Copula-based measures of asymmetry between the lower and upper tail probabilities
Authors:
Shogo Kato,
Toshinao Yoshiba,
Shinto Eguchi
Abstract:
We propose a copula-based measure of asymmetry between the lower and upper tail probabilities of bivariate distributions. The proposed measure has a simple form and possesses some desirable properties as a measure of asymmetry. The limit of the proposed measure as the index goes to the boundary of its domain can be expressed in a simple form under certain conditions on copulas. A sample analogue of the proposed measure for a sample from a copula is presented and its weak convergence to a Gaussian process is shown. Another sample analogue of the presented measure, which is based on a sample from a distribution on $\mathbb{R}^2$, is given. Simple methods for interval estimation and nonparametric testing based on the two sample analogues are presented. As an example, the presented measure is applied to daily returns of S&P500 and Nikkei225.
Submitted 3 August, 2020;
originally announced August 2020.
-
Flexible random-effects distribution models for meta-analysis
Authors:
Hisashi Noma,
Kengo Nagashima,
Shogo Kato,
Satoshi Teramukai,
Toshi A. Furukawa
Abstract:
In meta-analysis, random-effects models are standard tools to address between-study heterogeneity in evidence synthesis analyses. For the random-effects distribution models, the normal distribution model has been adopted in most systematic reviews due to its computational and conceptual simplicity. However, the restrictive model assumption might have serious influences on the overall conclusions in practice. In this article, we first provide two examples of real-world evidence that clearly show that the normal distribution assumption is unsuitable. To address the model restriction problem, we propose alternative flexible random-effects models that can flexibly regulate skewness, kurtosis and tailweight: skew normal distribution, skew t-distribution, asymmetric Subbotin distribution, Jones-Faddy distribution, and sinh-arcsinh distribution. We also developed an R package, flexmeta, that can easily perform these methods. Using the flexible random-effects distribution models, the results of the two meta-analyses were markedly altered, potentially influencing the overall conclusions of these systematic reviews. The flexible methods and computational tools can provide more precise evidence, and these methods would be recommended at least as sensitivity analysis tools to assess the influence of the normal distribution assumption of the random-effects model.
Submitted 1 August, 2020; v1 submitted 10 March, 2020;
originally announced March 2020.
-
Inequality Constrained Multilevel Models
Authors:
Bernet S. Kato,
Carel F. W. Peeters
Abstract:
Multilevel or hierarchical data structures can occur in many areas of research, including economics, psychology, sociology, agriculture, medicine, and public health. Over the last 25 years, there has been increasing interest in developing suitable techniques for the statistical analysis of multilevel data, and this has resulted in a broad class of models known under the generic name of multilevel models. Generally, multilevel models are useful for exploring how relationships vary across higher-level units taking into account the within and between cluster variations. Research scientists often have substantive theories in mind when evaluating data with statistical models. Substantive theories often involve inequality constraints among the parameters to translate a theory into a model. This chapter shows how the inequality constrained multilevel linear model can be given a Bayesian formulation, how the model parameters can be estimated using a so-called augmented Gibbs sampler, and how posterior probabilities can be computed to assist the researcher in model selection.
Submitted 4 January, 2018;
originally announced January 2018.
-
Spatial Clustering of Curves with Functional Covariates: A Bayesian Partitioning Model with Application to Spectra Radiance in Climate Study
Authors:
Zhen Zhang,
Chae Young Lim,
Tapabrata Maiti,
Seiji Kato
Abstract:
In climate change studies, the infrared spectral signatures of climate change have recently been conceptually adopted, and widely applied to identifying and attributing atmospheric composition change. We propose a Bayesian hierarchical model for spatial clustering of high-dimensional functional data based on the effects of functional covariates and local features. We couple the functional mixed-effects model with a generalized spatial partitioning method for: (1) producing spatially contiguous clusters for the high-dimensional spatio-functional data; (2) improving the computational efficiency via parallel computing over subregions or multi-level partitions; and (3) capturing the near-boundary ambiguity and uncertainty for data-driven partitions. We propose a generalized partitioning method which puts fewer constraints on the shape of spatial clusters. Dimension reduction in the parameter space is also achieved via Bayesian wavelets to alleviate the increasing model complexity introduced by clusters. The model captures well the regional effects of the atmospheric and cloud properties on the spectral radiance measurements. The results highlight the importance of exploiting spatially contiguous partitions for identifying regional effects and small-scale variability.
Submitted 19 March, 2016;
originally announced April 2016.
-
A flexible family of distributions on the cylinder
Authors:
Shonosuke Sugasawa,
Kunio Shimizu,
Shogo Kato
Abstract:
We propose a flexible family of distributions on the cylinder, generalized $t$-distributions, obtained as a conditional distribution of a trivariate $t$-distribution. The new distribution can be unimodal or bimodal, symmetric or asymmetric, depending on the parameter values, and fits cylindrical data flexibly. The circular marginal of this distribution is a generalized $t$-distribution on the circle. Some other properties are also investigated. The proposed distribution is applied to real cylindrical data.
Submitted 17 July, 2015; v1 submitted 26 January, 2015;
originally announced January 2015.
-
Robust estimation of location and concentration parameters for the von Mises-Fisher distribution
Authors:
Shogo Kato,
Shinto Eguchi
Abstract:
Robust estimation of location and concentration parameters for the von Mises-Fisher distribution is discussed. A key reparametrisation is achieved by expressing the two parameters as one vector on the Euclidean space. With this representation, we first show that the maximum likelihood estimator for the von Mises-Fisher distribution is not robust in some situations. Then we propose two families of robust estimators which can be derived as minimisers of two density power divergences. The presented families enable us to estimate both location and concentration parameters simultaneously. Some properties of the estimators are explored. Simple iterative algorithms are suggested to find the estimates numerically. A comparison with the existing robust estimators is given, as well as a discussion of the differences and similarities between the two proposed estimators. A simulation study is made to evaluate the finite-sample performance of the estimators. We consider a sea star dataset and discuss the selection of the tuning parameters and outlier detection.
Submitted 31 January, 2012;
originally announced January 2012.