Search | arXiv e-print repository

Bernoulli amputation

Authors: Marius Hofert, James Jackson, Niels Hagenbuch

Abstract: An approach to amputation, the process of introducing missing values to a complete dataset, is presented. It allows to construct missingness indicators in a flexible and principled way via copulas and Bernoulli margins and to incorporate dependence in missingness patterns. Besides more classical missingness models such as missing completely at random, missing at random, and missing not at random, the approach is able to model structured missingness such as block missingness and, via mixtures, monotone missingness, which are patterns of missing data frequently found in real-life datasets. Properties such as joint missingness probabilities or missingness correlation are derived mathematically. The approach is demonstrated with mathematical examples and empirical illustrations in terms of a well-known dataset.

Submitted 26 July, 2024; originally announced July 2024.

MSC Class: 62D10; 62H99; 65C60
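
One way to read the construction described in the abstract above is sketched here: draw a vector from a copula and set each missingness indicator to one when its component falls below the target marginal missingness probability. The bivariate Gaussian copula, the function name `amputate`, and the parameters `probs` and `rho` are assumptions made for illustration and are not taken from the paper.

```python
import random
from statistics import NormalDist

def amputate(rows, probs, rho, seed=0):
    """Return a copy of two-column rows with some entries set to None.

    Column j is missing with marginal probability probs[j] (Bernoulli
    margin); a bivariate Gaussian copula with correlation rho (an assumed
    choice) couples the two missingness indicators.
    """
    rng = random.Random(seed)
    phi = NormalDist().cdf
    out = []
    for row in rows:
        # Correlated standard normals, mapped to uniforms on the copula scale.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + (1.0 - rho**2) ** 0.5 * rng.gauss(0.0, 1.0)
        u = (phi(z1), phi(z2))
        # Indicator I_j = 1{U_j <= p_j}; a missing entry is stored as None.
        out.append([None if u[j] <= probs[j] else row[j] for j in range(2)])
    return out
```

With `rho` close to 1 the two columns tend to be missing together, while `rho = 0` gives independent indicators, the usual missing-completely-at-random setup.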

arXiv:2306.10663 [pdf, other]

Index-mixed copulas

Authors: Klaus Herrmann, Marius Hofert, Nahid Sadr

Abstract: The class of index-mixed copulas is introduced and its properties are investigated. Index-mixed copulas are constructed from given base copulas and a random index vector, and show a rather remarkable degree of analytical tractability. The analytical form of the copula and, if it exists, its density are derived. As the construction is based on a stochastic representation, sampling algorithms can be given. Properties investigated include bivariate and trivariate margins, mixtures of index-mixed copulas, symmetries such as radial symmetry and exchangeability, tail dependence, measures of concordance such as Blomqvist's beta, Spearman's rho or Kendall's tau and concordance orderings. Examples and illustrations are provided, and applications to the distribution of sums of dependent random variables as well as the stress testing of general dependence structures are given. A particularly interesting feature of index-mixed copulas is that they allow one to provide a revealing interpretation of the well-known family of Eyraud-Farlie-Gumbel-Morgenstern (EFGM) copulas. Through the lens of index-mixing, one can explain why EFGM copulas can only model a limited range of concordance and are tail independent, for example. Index-mixed copulas do not suffer from such restrictions while remaining analytically tractable.

Submitted 8 August, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

MSC Class: 60E05; 60G09; 62E15; 62H99; 62H86; 62H20

arXiv:2202.03406 [pdf, other]

Dependence model assessment and selection with DecoupleNets

Authors: Marius Hofert, Avinash Prasad, Mu Zhu

Abstract: Neural networks are suggested for learning a map from $d$-dimensional samples with any underlying dependence structure to multivariate uniformity in $d'$ dimensions. This map, termed DecoupleNet, is used for dependence model assessment and selection. If the data-generating dependence model was known, and if it was among the few analytically tractable ones, one such transformation for $d'=d$ is Rosenblatt's transform. DecoupleNets have multiple advantages. For example, they only require an available sample and are applicable to $d'<d$, in particular $d'=2$. This allows for simpler model assessment and selection, both numerically and, because $d'=2$, especially graphically. A graphical assessment method has the advantage of being able to identify why, or in which region of the domain, a candidate model does not provide an adequate fit, thus leading to model selection in particular regions of interest or improved model building strategies in such regions. Through simulation studies with data from various copulas, the feasibility and validity of this novel DecoupleNet approach is demonstrated. Applications to real world data illustrate its usefulness for model assessment and selection.

Submitted 5 October, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

MSC Class: 62H99; 65C60; 60E05; 62M45; 00A72; 65C10; 62M10

arXiv:2112.03377 [pdf, other]

RafterNet: Probabilistic predictions in multi-response regression

Authors: Marius Hofert, Avinash Prasad, Mu Zhu

Abstract: A fully nonparametric approach for making probabilistic predictions in multi-response regression problems is introduced. Random forests are used as marginal models for each response variable and, as novel contribution of the present work, the dependence between the multiple response variables is modeled by a generative neural network. This combined modeling approach of random forests, corresponding empirical marginal residual distributions and a generative neural network is referred to as RafterNet. Multiple datasets serve as examples to demonstrate the flexibility of the approach and its impact for making probabilistic forecasts.

Submitted 11 October, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

arXiv:2111.07542 [pdf, other]

Single-Index Importance Sampling with Stratification

Authors: Erik Hintz, Marius Hofert, Christiane Lemieux, Yoshihiro Taniguchi

Abstract: In many stochastic problems, the output of interest depends on an input random vector mainly through a single random variable (or index) via an appropriate univariate transformation of the input. We exploit this feature by proposing an importance sampling method that makes rare events more likely by changing the distribution of the chosen index. Further variance reduction is guaranteed by combining this single-index importance sampling approach with stratified sampling. The dimension-reduction effect of single-index importance sampling also enhances the effectiveness of quasi-Monte Carlo methods. The proposed method applies to a wide range of financial or risk management problems. We demonstrate its efficiency for estimating large loss probabilities of a credit portfolio under a normal and t-copula model and show that our method outperforms the current standard for these problems.

Submitted 15 November, 2021; originally announced November 2021.

arXiv:2110.03397 [pdf, other]

Smooth bootstrapping of copula functionals

Authors: Maximilian Coblenz, Oliver Grothe, Klaus Herrmann, Marius Hofert

Abstract: The smooth bootstrap for estimating copula functionals in small samples is investigated. It can be used both to gauge the distribution of the estimator in question and to augment the data. Issues arising from kernel density and distribution estimation in the copula domain are addressed, such as how to avoid the bounded domain, which bandwidth matrix to choose, and how the smoothing can be carried out. Furthermore, we investigate how the smooth bootstrap impacts the underlying dependence structure or the functionals in question and under which conditions it does not. We provide specific examples and simulations that highlight advantages and caveats of the approach.

Submitted 25 March, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

MSC Class: 62H99; 65C60

arXiv:2012.08036 [pdf, other]

Applications of multivariate quasi-random sampling with neural networks

Authors: Marius Hofert, Avinash Prasad, Mu Zhu

Abstract: Generative moment matching networks (GMMNs) are suggested for modeling the cross-sectional dependence between stochastic processes. The stochastic processes considered are geometric Brownian motions and ARMA-GARCH models. Geometric Brownian motions lead to an application of pricing American basket call options under dependence and ARMA-GARCH models lead to an application of simulating predictive distributions. In both types of applications the benefit of using GMMNs in comparison to parametric dependence models is highlighted and the fact that GMMNs can produce dependent quasi-random samples with no additional effort is exploited to obtain variance reduction.

Submitted 27 August, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

Comments: 17 pages, 5 figures

arXiv:2006.10107 [pdf, other]

Right-truncated Archimedean and related copulas

Authors: Marius Hofert

Abstract: The copulas of random vectors with standard uniform univariate margins truncated from the right are considered and a general formula for such right-truncated conditional copulas is derived. This formula is analytical for copulas that can be inverted analytically as functions of each single argument. This is the case, for example, for Archimedean and related copulas. The resulting right-truncated Archimedean copulas are not only analytically tractable but can also be characterized as tilted Archimedean copulas. This finding allows one, for example, to more easily derive analytical properties such as the coefficients of tail dependence or sampling procedures of right-truncated Archimedean copulas. As another result, one can easily obtain a limiting Clayton copula for a general vector of truncation points converging to zero; this is an important property for (re)insurance and a fact already known in the special case of equal truncation points, but harder to prove without aforementioned characterization. Furthermore, right-truncated Archimax copulas with logistic stable tail dependence functions are characterized as tilted outer power Archimedean copulas and an analytical form of right-truncated nested Archimedean copulas is also derived.

Submitted 17 June, 2020; originally announced June 2020.

MSC Class: 60E05; 62E15; 62H99

arXiv:2003.13301 [pdf, other]

Outer power transformations of hierarchical Archimedean copulas: Construction, sampling and estimation

Authors: Jan Górecki, Marius Hofert, Ostap Okhrin

Abstract: A large number of commonly used parametric Archimedean copula (AC) families are restricted to a single parameter, connected to a concordance measure such as Kendall's tau. This often leads to poor statistical fits, particularly in the joint tails, and can sometimes even limit the ability to model concordance or tail dependence mathematically. This work suggests outer power (OP) transformations of Archimedean generators to overcome these limitations. The copulas generated by OP-transformed generators can, for example, allow one to capture both a given concordance measure and a tail dependence coefficient simultaneously. For exchangeable OP-transformed ACs, a formula for computing tail dependence coefficients is obtained, as well as two feasible OP AC estimators are proposed and their properties studied by simulation. For hierarchical extensions of OP-transformed ACs, a new construction principle, efficient sampling and parameter estimation are addressed. By simulation, convergence rate and standard errors of the proposed estimator are studied. Excellent tail fitting capabilities of OP-transformed hierarchical AC models are demonstrated in a risk management application. The results show that the OP transformation is able to improve the statistical fit of exchangeable ACs, particularly of those that cannot capture upper tail dependence or strong concordance, as well as the statistical fit of hierarchical ACs, especially in terms of tail dependence and higher dimensions. Given how comparably simple it is to include OP transformations into existing exchangeable and hierarchical AC models, this transformation provides an attractive trade-off between computational effort and statistical improvement.

Submitted 31 March, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: 38 pages, 17 figures

MSC Class: 62H99

arXiv:2003.08009 [pdf, other]

Random number generators produce collisions: Why, how many and more

Authors: Marius Hofert

Abstract: It seems surprising that when applying widely used random number generators to generate one million random numbers on modern architectures, one obtains, on average, about 116 collisions. This article explains why, how to mathematically compute such a number, why they often cannot be obtained in a straightforward way, how to numerically compute them in a robust way and, among other things, what would need to be changed to bring this number below 1. The probability of at least one collision is also briefly addressed, which, as it turns out, again needs a careful numerical treatment. Overall, the article provides an introduction to the representation of floating-point numbers on a computer and corresponding implications in statistics and simulation. All computations are carried out in R and are reproducible with the code included in this article.

Submitted 8 June, 2020; v1 submitted 17 March, 2020; originally announced March 2020.

MSC Class: 65C60
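
The figure of about 116 collisions quoted in the abstract above can be checked with a short calculation. The sketch assumes that the generator returns one of m = 2^32 equally likely values per draw, which the abstract does not state. The function name `expected_collisions` is hypothetical, and Python is used here rather than the article's R. The expected number of collisions among n draws is n minus the expected number of distinct values, m(1 - (1 - 1/m)^n).

```python
import math

def expected_collisions(n, m):
    """Expected number of draws that repeat an earlier value.

    n draws are made uniformly from m equally likely values (an assumed
    model); the result is n minus the expected number of distinct values.
    """
    # log1p/expm1 avoid the cancellation in 1 - (1 - 1/m)**n for large m.
    expected_distinct = -m * math.expm1(n * math.log1p(-1.0 / m))
    return n - expected_distinct

print(expected_collisions(10**6, 2**32))
```

Under these assumptions the result lies between 116 and 117, in line with the leading-order approximation n^2/(2m). Evaluating (1 - 1/m)^n naively in floating point loses accuracy, which is why the sketch goes through `log1p` and `expm1`.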

arXiv:2002.10645 [pdf, other]

Multivariate time-series modeling with generative neural networks

Authors: Marius Hofert, Avinash Prasad, Mu Zhu

Abstract: Generative moment matching networks (GMMNs) are introduced as dependence models for the joint innovation distribution of multivariate time series (MTS). Following the popular copula-GARCH approach for modeling dependent MTS data, a framework based on a GMMN-GARCH approach is presented. First, ARMA-GARCH models are utilized to capture the serial dependence within each univariate marginal time series. Second, if the number of marginal time series is large, principal component analysis (PCA) is used as a dimension-reduction step. Last, the remaining cross-sectional dependence is modeled via a GMMN, the main contribution of this work. GMMNs are highly flexible and easy to simulate from, which is a major advantage over the copula-GARCH approach. Applications involving yield curve modeling and the analysis of foreign exchange-rate returns demonstrate the utility of the GMMN-GARCH approach, especially in terms of producing better empirical predictive distributions and making better probabilistic forecasts.

Submitted 1 October, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

MSC Class: 62H99; 65C60; 60E05; 00A72; 65C10; 62M10

arXiv:1911.03017 [pdf, other]

Normal variance mixtures: Distribution, density and parameter estimation

Authors: Erik Hintz, Marius Hofert, Christiane Lemieux

Abstract: Normal variance mixtures are a class of multivariate distributions that generalize the multivariate normal by randomizing (or mixing) the covariance matrix via multiplication by a non-negative random variable W. The multivariate t distribution is an example of such mixture, where W has an inverse-gamma distribution. Algorithms to compute the joint distribution function and perform parameter estimation for the multivariate normal and t (with integer degrees of freedom) can be found in the literature and are implemented in, e.g., the R package mvtnorm. In this paper, efficient algorithms to perform these tasks in the general case of a normal variance mixture are proposed. In addition to the above two tasks, the evaluation of the joint (logarithmic) density function of a general normal variance mixture is tackled as well, as it is needed for parameter estimation and does not always exist in closed form in this more general setup. For the evaluation of the joint distribution function, the proposed algorithms apply randomized quasi-Monte Carlo (RQMC) methods in a way that improves upon existing methods proposed for the multivariate normal and t distributions. An adaptive RQMC algorithm that similarly exploits the superior convergence properties of RQMC methods is presented for the task of evaluating the joint log-density function. This allows the parameter estimation task to be accomplished via an EM-like algorithm where all weights and log-densities are numerically estimated. It is demonstrated through numerical examples that the suggested algorithms are quite fast; even for high dimensions around 1000 the distribution function can be estimated with moderate accuracy using only a few seconds of run time. Even log-densities around -100 can be estimated accurately and quickly. An implementation of all algorithms presented in this work is available in the R package nvmix (version >= 0.0.4).

Submitted 15 June, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

MSC Class: 62H99; 65C60
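
The mixing construction described in the abstract above can be sketched for the bivariate t case as X = sqrt(W) A Z, where Z is standard normal, A is the Cholesky factor of the scale matrix and W is inverse-gamma distributed with shape and rate both equal to nu/2. The function name `rmvt2`, the zero location vector and the 2x2 scale matrix with off-diagonal entry `rho` are assumptions made for illustration. This is plain pseudo-random sampling in Python, not the RQMC algorithms of the paper or the interface of the nvmix package.

```python
import math
import random

def rmvt2(n, nu, rho, seed=0):
    """Draw n bivariate t vectors with nu degrees of freedom.

    Each draw is sqrt(W) * A * Z with Z standard normal, A the Cholesky
    factor of [[1, rho], [rho, 1]] and W ~ inverse-gamma(nu/2, nu/2).
    """
    rng = random.Random(seed)
    a21, a22 = rho, math.sqrt(1.0 - rho * rho)
    sample = []
    for _ in range(n):
        # gammavariate takes (shape, scale); rate nu/2 means scale 2/nu.
        w = 1.0 / rng.gammavariate(nu / 2.0, 2.0 / nu)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        s = math.sqrt(w)
        sample.append((s * z1, s * (a21 * z1 + a22 * z2)))
    return sample
```

Replacing the inverse-gamma draw by another non-negative mixing variable W gives other members of the normal variance mixture class; W fixed at 1 recovers the multivariate normal.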

arXiv:1811.00683 [pdf, other]

Quasi-random sampling for multivariate distributions via generative neural networks

Authors: Marius Hofert, Avinash Prasad, Mu Zhu

Abstract: Generative moment matching networks (GMMNs) are introduced for generating quasi-random samples from multivariate models with any underlying copula in order to compute estimates under variance reduction. So far, quasi-random sampling for multivariate distributions required a careful design, exploiting specific properties (such as conditional distributions) of the implied parametric copula or the underlying quasi-Monte Carlo (QMC) point set, and was only tractable for a small number of models. Utilizing GMMNs allows one to construct quasi-random samples for a much larger variety of multivariate distributions without such restrictions, including empirical ones from real data with dependence structures not well captured by parametric copulas. Once trained on pseudo-random samples from a parametric model or on real data, these neural networks only require a multivariate standard uniform randomized QMC point set as input and are thus fast in estimating expectations of interest under dependence with variance reduction. Numerical examples are considered to demonstrate the approach, including applications inspired by risk management practice. All results are reproducible with the demos GMMN_QMC_paper, GMMN_QMC_data and GMMN_QMC_timings as part of the R package gnn.

Submitted 2 April, 2020; v1 submitted 1 November, 2018; originally announced November 2018.

MSC Class: 62H99; 65C60; 60E05; 00A72; 65C10

arXiv:1801.03596 [pdf, other]

A framework for measuring dependence between random vectors

Authors: Marius Hofert, Wayne Oldford, Avinash Prasad, Mu Zhu

Abstract: A framework for quantifying dependence between random vectors is introduced. With the notion of a collapsing function, random vectors are summarized by single random variables, called collapsed random variables in the framework. Using this framework, a general graphical assessment of independence between groups of random variables for arbitrary collapsing functions is provided. Measures of association computed from the collapsed random variables are then used to measure the dependence between random vectors. To this end, suitable collapsing functions are presented. Furthermore, the notion of a collapsed distribution function and collapsed copula are introduced and investigated for certain collapsing functions. This investigation yields a multivariate extension of the Kendall distribution and its corresponding Kendall copula for which some properties and examples are provided. In addition, non-parametric estimators for the collapsed measures of dependence are provided along with their corresponding asymptotic properties. Finally, data applications to bioinformatics and finance are presented.

Submitted 10 January, 2018; originally announced January 2018.

arXiv:1611.09225 [pdf, other]

On structure, family and parameter estimation of hierarchical Archimedean copulas

Authors: Jan Górecki, Marius Hofert, Martin Holeňa

Abstract: Research on structure determination and parameter estimation of hierarchical Archimedean copulas (HACs) has so far mostly focused on the case in which all appearing Archimedean copulas belong to the same Archimedean family. The present work addresses this issue and proposes a new approach for estimating HACs that involve different Archimedean families. It is based on employing goodness-of-fit test statistics directly into HAC estimation. The approach is summarized in a simple algorithm, its theoretical justification is given and its applicability is illustrated by several experiments, which include estimation of HACs involving up to five different Archimedean families.

Submitted 18 November, 2016; originally announced November 2016.

Comments: 63 pages, one attachment in attachment.pdf

MSC Class: 62H99

arXiv:1609.09429 [pdf, other]

Visualizing Dependence in High-Dimensional Data: An Application to S&P 500 Constituent Data

Authors: Marius Hofert, Wayne Oldford

Abstract: The notion of a zenpath and a zenplot is introduced to search and detect dependence in high-dimensional data for model building and statistical inference. By using any measure of dependence between two random variables (such as correlation, Spearman's rho, Kendall's tau, tail dependence etc.), a zenpath can construct paths through pairs of variables in different ways, which can then be laid out and displayed by a zenplot. The approach is illustrated by investigating tail dependence and model fit in constituent data of the S&P 500 during the financial crisis of 2007-2008. The corresponding Global Industry Classification Standard (GICS) sector information is also addressed. Zenpaths and zenplots are useful tools for exploring dependence in high-dimensional data, for example, from the realm of finance, insurance and quantitative risk management. All presented algorithms are implemented using the R package zenplots and all examples and graphics in the paper can be reproduced using the accompanying demo SP500.

Submitted 5 April, 2017; v1 submitted 29 September, 2016; originally announced September 2016.

Comments: The figures had to be massively reduced in size in order for the paper to fulfill the 10M limit

MSC Class: 62-09; 62H99; 65C60

arXiv:1508.03483 [pdf, other]

Quasi-random numbers for copula models

Authors: Mathieu Cambou, Marius Hofert, Christiane Lemieux

Abstract: The present work addresses the question how sampling algorithms for commonly applied copula models can be adapted to account for quasi-random numbers. Besides sampling methods such as the conditional distribution method (based on a one-to-one transformation), it is also shown that typically faster sampling methods (based on stochastic representations) can be used to improve upon classical Monte Carlo methods when pseudo-random number generators are replaced by quasi-random number generators. This opens the door to quasi-random numbers for models well beyond independent margins or the multivariate normal distribution. Detailed examples (in the context of finance and insurance), illustrations and simulations are given and software has been developed and provided in the R packages copula and qrng.

Submitted 12 March, 2016; v1 submitted 14 August, 2015; originally announced August 2015.

MSC Class: 62H99; 65C60

arXiv:1403.4291 [pdf, other]

An importance sampling approach for copula models in insurance

Authors: Philipp Arbenz, Mathieu Cambou, Marius Hofert

Abstract: An importance sampling approach for sampling copula models is introduced. We propose two algorithms that improve Monte Carlo estimators when the functional of interest depends mainly on the behaviour of the underlying random vector when at least one of the components is large. Such problems often arise from dependence models in finance and insurance. The importance sampling framework we propose is general and can be easily implemented for all classes of copula models from which sampling is feasible. We show how the proposal distribution of the two algorithms can be optimized to reduce the sampling error. In a case study inspired by a typical multivariate insurance application, we obtain variance reduction factors between 10 and 30 in comparison to standard Monte Carlo estimators.

Submitted 7 April, 2015; v1 submitted 17 March, 2014; originally announced March 2014.

Comments: 24 pages, 6 figures

MSC Class: 97M30; 65C05; 65C60; 62G32

arXiv:1309.4402 [pdf, other]

Parallel and other simulations in R made easy: An end-to-end study

Authors: Marius Hofert, Martin Mächler

Abstract: It is shown how to set up, conduct, and analyze large simulation studies with the new R package simsalapar = simulations simplified and launched parallel. A simulation study typically starts with determining a collection of input variables and their values on which the study depends, such as sample sizes, dimensions, types and degrees of dependence, estimation methods, etc. Computations are desired for all combinations of these variables. If conducting these computations sequentially is too time-consuming, parallel computing can be applied over all combinations of select variables. The final result object of a simulation study is typically an array. From this array, summary statistics can be derived and presented in terms of (flat contingency or LaTeX) tables or visualized in terms of (matrix-like) figures. The R package simsalapar provides several tools to achieve the above tasks. Warnings and errors are dealt with correctly, various seeding methods are available, and run time is measured. Furthermore, tools for analyzing the results via tables or graphics are provided. In contrast to rather minimal examples typically found in R packages or vignettes, an end-to-end, not-so-minimal simulation problem from the realm of quantitative risk management is given. The concepts presented and solutions provided by simsalapar may be of interest to students, researchers, and practitioners as a how-to for conducting realistic, large-scale simulation studies in R. Also, the development of the package revealed useful improvements to R itself, which are available in R 3.0.0.

Submitted 17 September, 2013; originally announced September 2013.

Comments: The first 19 pages = user manual; Rest is a description of the implementation and technical details
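
The workflow described in the abstract above (a grid of input variables, one computation per combination, parallel evaluation, an array-like result, then summary statistics) can be sketched generically. The following is a minimal illustration in Python using only the standard library; it is not the simsalapar API, and all names (`SAMPLE_SIZES`, `DIMENSIONS`, `run_one`, `run_study`) as well as the toy computation are assumptions made for this example. A thread pool is used for simplicity; a process pool would be the more natural choice for CPU-bound work.

```python
import itertools
import random
import statistics
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input variables of a toy study (names and values are illustrative only).
SAMPLE_SIZES = [50, 200]
DIMENSIONS = [2, 5]
N_REPLICATIONS = 4


def run_one(n, d, rep):
    """Toy computation for one combination: average absolute column mean
    of an n-by-d sample of standard normal draws."""
    rng = random.Random(f"{n}-{d}-{rep}")  # one reproducible seed per combination
    col_means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n)) for _ in range(d)
    ]
    return statistics.fmean(abs(m) for m in col_means)


def run_study():
    """Evaluate run_one on the full grid and return a nested list
    indexed as result[i_sample_size][i_dimension][replication]."""
    grid = list(itertools.product(SAMPLE_SIZES, DIMENSIONS, range(N_REPLICATIONS)))
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the order of the grid
        values = list(pool.map(lambda combo: run_one(*combo), grid))
    it = iter(values)
    return [
        [[next(it) for _ in range(N_REPLICATIONS)] for _ in DIMENSIONS]
        for _ in SAMPLE_SIZES
    ]


result = run_study()
# Summary statistic: average over replications for each (sample size, dimension) cell.
summary = [[statistics.fmean(reps) for reps in row] for row in result]
```

Seeding each combination separately keeps the result independent of the order in which the workers finish, which is one simple way to make a parallel study reproducible.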

arXiv:1207.1708 [pdf, other]

Estimators for Archimedean copulas in high dimensions

Authors: Marius Hofert, Martin Maechler, Alexander J. McNeil

Abstract: The performance of known and new parametric estimators for Archimedean copulas is investigated, with special focus on large dimensions and numerical difficulties. In particular, method-of-moments-like estimators based on pairwise Kendall's tau, a multivariate extension of Blomqvist's beta, minimum distance estimators, the maximum-likelihood estimator, a simulated maximum-likelihood estimator, and a maximum-likelihood estimator based on the copula diagonal are studied. Their performance is compared in a large-scale simulation study both under known and unknown margins (pseudo-observations), in small and high dimensions, under small and large dependencies, various different Archimedean families and sample sizes. High dimensions up to one hundred are considered for the first time and computational problems arising from such large dimensions are addressed in detail. All methods are implemented in the open source R package copula and can thus be easily accessed and studied.

Submitted 2 November, 2012; v1 submitted 6 July, 2012; originally announced July 2012.

MSC Class: 62H12; 62F10; 62H99; 62H20; 65C60
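
As a small illustration of the idea behind a method-of-moments-like estimator based on pairwise Kendall's tau, the sketch below averages the sample Kendall's tau over all pairs of columns and inverts the known Clayton relation tau = theta / (theta + 2). This is a minimal sketch in Python and not the implementation in the R package copula; the choice of the Clayton family, the function names, the sample size, and the seed are assumptions for this example. The data are simulated with the standard gamma frailty (Marshall-Olkin) representation of the Clayton copula.

```python
import itertools
import random


def kendall_tau(x, y):
    """Sample Kendall's tau (tau-a) for two equally long sequences without ties."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 if concordant, -1 if discordant
    return 2.0 * s / (n * (n - 1))


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) (Clayton family, 0 <= tau < 1)."""
    return 2.0 * tau / (1.0 - tau)


def estimate_clayton_theta(rows):
    """Average Kendall's tau over all column pairs, then invert."""
    cols = list(zip(*rows))
    taus = [kendall_tau(a, b) for a, b in itertools.combinations(cols, 2)]
    return clayton_theta_from_tau(sum(taus) / len(taus))


def sample_clayton(n, d, theta, rng):
    """Draw n rows from a d-dimensional Clayton copula via the frailty
    representation U_j = (1 + E_j / V)^(-1/theta), V ~ Gamma(1/theta, 1),
    E_j ~ Exp(1) independent."""
    rows = []
    for _ in range(n):
        v = rng.gammavariate(1.0 / theta, 1.0)
        rows.append(
            [(1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta) for _ in range(d)]
        )
    return rows


rng = random.Random(2012)  # fixed seed so the example is reproducible
data = sample_clayton(n=400, d=3, theta=2.0, rng=rng)
theta_hat = estimate_clayton_theta(data)  # should be in the vicinity of 2.0
```

The quadratic-time loop in `kendall_tau` is adequate for a toy example but would become a bottleneck in the high-dimensional, large-sample settings the abstract refers to.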

arXiv:1204.2410 [pdf, other]

Densities of nested Archimedean copulas

Authors: Marius Hofert, David Pham

Abstract: Nested Archimedean copulas recently gained interest since they generalize the well-known class of Archimedean copulas to allow for partial asymmetry. Sampling algorithms and strategies have been well investigated for nested Archimedean copulas. However, for likelihood-based inference it is important to have the density. The present work fills this gap. A general formula for the derivatives of the nodes and inner generators appearing in nested Archimedean copulas is developed. This leads to a tractable formula for the density of nested Archimedean copulas in arbitrary dimensions if the number of nesting levels is not too large. Various examples including famous Archimedean families and transformations of such are given. Furthermore, a numerically efficient way to evaluate the log-density is presented.

Submitted 28 October, 2012; v1 submitted 11 April, 2012; originally announced April 2012.

Comments: 25 pages

MSC Class: 62H99; 65C60; 62H12; 62F10

Showing 1–21 of 21 results for author: Hofert, M