-
Negative correlations in Ising models of credit risk
Authors:
Chiara Emonti,
Roberto Fontana
Abstract:
We analyze a subclass of Ising models in the context of credit risk, focusing on Dandelion models when the correlations $ρ$ between the central node and each non-central node are negative. We establish the possible range of values for $ρ$ and derive an explicit formula linking the correlation between any pair of non-central nodes to $ρ$. The paper concludes with a simulation study.
Submitted 28 February, 2025;
originally announced February 2025.
-
A topology-based algorithm for the isomorphism check of 2-level Orthogonal Arrays
Authors:
Roberto Fontana,
Marco Guerra
Abstract:
We introduce a construction and an algorithm, both based on Topological Data Analysis (TDA), to tackle the problem of the isomorphism check of Orthogonal Arrays (OAs). Specifically, we associate to any binary OA a persistence diagram, one of the main tools in TDA, and explore how the Wasserstein distance between persistence diagrams can be used to inform whether two designs are isomorphic.
Submitted 30 September, 2024;
originally announced September 2024.
-
Multi-way contingency tables with uniform margins
Authors:
Elisa Perrone,
Roberto Fontana,
Fabio Rapallo
Abstract:
We study the problem of transforming a multi-way contingency table into an equivalent table with uniform margins and the same dependence structure. Such a problem relates to recent developments in copula modeling for discrete random vectors. Here, we focus on three-way binary tables and show that, even in such a simple case, the situation is quite different from that of two-way tables. Many more constraints are needed to ensure a unique solution to the problem. Therefore, the uniqueness of the transformed table is subject to arbitrary choices of the practitioner. We illustrate the theory through some examples, and conclude with a discussion on the topic and future research directions.
Submitted 5 April, 2024;
originally announced April 2024.
-
Robustness against data loss with Algebraic Statistics
Authors:
Roberto Fontana,
Fabio Rapallo
Abstract:
The paper describes an algorithm that, given an initial design $\mathcal{F}_n$ of size $n$ and a linear model with $p$ parameters, provides a sequence $\mathcal{F}_n \supset \ldots \supset \mathcal{F}_{n-k} \supset \ldots \supset \mathcal{F}_p$ of nested \emph{robust} designs. The sequence is obtained by the removal, one by one, of the runs of $\mathcal{F}_n$ till a $p$-run \emph{saturated} design $\mathcal{F}_p$ is obtained. The potential impact of the algorithm on real applications is high. The initial fraction $\mathcal{F}_n$ can be of any type and the output sequence can be used to organize the experimental activity. The experiments can start with the runs corresponding to $\mathcal{F}_p$ and continue adding one run after the other (from $\mathcal{F}_{n-k}$ to $\mathcal{F}_{n-k+1}$) till the initial design $\mathcal{F}_n$ is obtained. In this way, if for some unexpected reasons the experimental activity must be stopped before the end when only $n-k$ runs are completed, the corresponding $\mathcal{F}_{n-k}$ has a high value of robustness for $k \in \{1, \ldots, n-p\}$. The algorithm uses the circuit basis, a special representation of the kernel of a matrix with integer entries. The effectiveness of the algorithm is demonstrated through the use of simulations.
Submitted 21 June, 2022;
originally announced June 2022.
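The notion of robustness to run removal used in the abstract above can be illustrated with a brute-force sketch. It assumes that robustness for $k$ lost runs is measured as the proportion of sub-designs of size $n-k$ whose model matrix still has full column rank $p$; this definition, the function names `rank` and `robustness`, and the toy matrices are assumptions made for illustration. The sketch does not use the circuit basis and is not the authors' algorithm.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][col] / m[rk][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def robustness(model_matrix, k):
    """Proportion of (n - k)-run sub-designs that keep all p parameters estimable
    (assumed definition, evaluated by exhaustive enumeration)."""
    n, p = len(model_matrix), len(model_matrix[0])
    subsets = list(combinations(range(n), n - k))
    ok = sum(rank([model_matrix[i] for i in s]) == p for s in subsets)
    return Fraction(ok, len(subsets))


# Main-effects model (intercept, x1, x2) on the full 2^2 design.
X_full = [[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]]
# Same model on a design with one repeated run.
X_dup = [[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, -1]]

print(robustness(X_full, 1))  # 1
print(robustness(X_dup, 1))   # 1/2
```

Exhaustive enumeration is only practical for very small designs, which is one reason a structured representation such as the circuit basis is attractive.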
-
Circuits for robust designs
Authors:
Roberto Fontana,
Fabio Rapallo,
Henry P. Wynn
Abstract:
This paper continues the application of circuit theory to experimental design started by the first two authors. The theory gives a very special and detailed representation of the kernel of the design model matrix. This representation turns out to be an appropriate way to study the optimality criterion referred to as robustness: the sensitivity of the design to the removal of design points. Many examples are given, from classical combinatorial designs to two-level factorial designs including interactions. The complexity of the circuit representations is useful because of the large range of options they offer, but conversely requires the use of dedicated software. Suggestions for speed improvement are made.
Submitted 21 June, 2021;
originally announced June 2021.
-
On the aberrations of mixed level Orthogonal Arrays with removed runs
Authors:
Roberto Fontana,
Fabio Rapallo
Abstract:
Given an Orthogonal Array, we analyze the aberrations of the sub-fractions obtained by deleting some of its points. We provide formulae to compute the Generalized Word-Length Pattern of any sub-fraction. In the case of the deletion of a single point, we provide a simple methodology to find the best sub-fractions according to the Generalized Minimum Aberration criterion. We also study the effect of the deletion of 1, 2 or 3 points on some examples. The methodology does not put any restriction on the number of levels of each factor. It follows that any mixed level Orthogonal Array can be considered.
Submitted 11 September, 2018;
originally announced September 2018.
-
Unions of Orthogonal Arrays and their aberrations via Hilbert bases
Authors:
Roberto Fontana,
Fabio Rapallo
Abstract:
We generate all the Orthogonal Arrays (OAs) of a given size n and strength t as the union of a collection of OAs which belong to an inclusion-minimal set of OAs. We derive a formula for computing the (Generalized) Word Length Pattern of a union of OAs that makes use of their polynomial counting functions. In this way the best OAs according to the Generalized Minimum Aberration criterion can be found by simply exploring a relatively small set of counting functions. The classes of OAs with 5 binary factors, strength 2, and sizes 16 and 20 are fully described.
Submitted 2 January, 2018;
originally announced January 2018.
-
Markov Chain Monte Carlo sampling for conditional tests: A link between permutation tests and algebraic statistics
Authors:
Roberto Fontana,
Francesca Romana Crucinio
Abstract:
We consider conditional tests for non-negative discrete exponential families. We develop two Markov Chain Monte Carlo (MCMC) algorithms which allow us to sample from the conditional space and to perform approximate tests. The first algorithm is based on the MCMC sampling described by Sturmfels. The second is a more efficient algorithm that exploits the optimal partition of the conditional space into orbits of permutations. We thus establish a link between standard permutation sampling and algebraic-statistics-based sampling. Through a simulation study we compare the exact cumulative distribution function (cdf) with the approximate cdfs obtained with the two MCMC samplings and the standard permutation sampling. We conclude that the MCMC sampling which exploits the partition of the conditional space into orbits of permutations gives an estimated cdf, under $H_0$, which is more reliable and converges to the exact cdf in the fewest steps. This sampling technique can also be used to build an approximation of the exact cdf when its exact computation is computationally infeasible.
Submitted 26 July, 2017;
originally announced July 2017.
-
Simulations on the combinatorial structure of D-optimal designs
Authors:
Roberto Fontana,
Fabio Rapallo
Abstract:
In this work we present the results of several simulations on main-effect factorial designs. The goal of such simulations is to investigate the connections between the $D$-optimality of a design and its geometrical structure. By means of a combinatorial object, namely the circuit basis of the design matrix, we show that it is possible to define a simple index that exhibits strong connections with the $D$-optimality.
Submitted 15 April, 2016;
originally announced April 2016.
-
Graphical models for studying museum networks: the Abbonamento Musei Torino Piemonte
Authors:
Cristina Coscia,
Roberto Fontana,
Patrizia Semeraro
Abstract:
Probabilistic graphical models are a powerful tool to represent real-world phenomena and to learn network structures starting from data. This paper applies graphical models in a new framework to study association rules driven by consumer choices in a network of museums. The network consists of the museums participating in the program of Abbonamento Musei Torino Piemonte, which is a yearly subscription managed by the Associazione Torino Città Capitale Europea. Consumers are card-holders, who are allowed entry to all the museums in the network for one year. We employ graphical models to highlight associations among the museums driven by card-holder visiting behaviour. We use both undirected graphs to investigate the strength of the network and directed graphs to highlight asymmetry in the association rules.
Submitted 10 February, 2016;
originally announced February 2016.
-
Aberration in qualitative multilevel designs
Authors:
Roberto Fontana,
Fabio Rapallo,
Maria-Piera Rogantin
Abstract:
Generalized Word Length Pattern (GWLP) is an important and widely used tool for comparing fractional factorial designs. We consider qualitative factors, and we code their levels using the roots of unity. We write the GWLP of a fraction ${\mathcal F}$ using the polynomial indicator function, whose coefficients encode many properties of the fraction. We show that the coefficient of a simple or interaction term can be written using the counts of its levels. This apparently simple remark leads to major consequences, including a convolution formula for the counts. We also show that the mean aberration of a term over the permutation of its levels provides a connection with the variance of the level counts. Moreover, using mean aberrations for symmetric $s^m$ designs with $s$ prime, we derive a new formula for computing the GWLP of ${\mathcal F}$. It is computationally easy, does not use complex numbers and also provides a clear way to interpret the GWLP. As case studies, we consider non-isomorphic orthogonal arrays that have the same GWLP. The different distributions of the mean aberrations suggest that they could be used as a further tool to discriminate between fractions.
Submitted 19 September, 2015;
originally announced September 2015.
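As a small illustration of the GWLP discussed in the abstract above, the sketch below handles only the two-level case, where the roots of unity reduce to $-1$ and $+1$. It uses the standard J-characteristic expression $A_j = \sum_{|u|=j} (J_u/n)^2$, with $J_u$ the sum over runs of the product of the columns in $u$. It is not the new formula derived in the paper; the function name `gwlp_two_level` and the example design are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def gwlp_two_level(design):
    """Return [A_1, ..., A_m] for a two-level design coded with -1/+1."""
    n = len(design)
    m = len(design[0])
    pattern = []
    for j in range(1, m + 1):
        a_j = Fraction(0)
        for cols in combinations(range(m), j):
            # J-characteristic of the term: sum over runs of the product
            # of the selected columns.
            j_u = sum(prod(run[c] for c in cols) for run in design)
            a_j += Fraction(j_u, n) ** 2
        pattern.append(a_j)
    return pattern


# Regular half fraction of the 2^3 design defined by x1 * x2 * x3 = +1.
half_fraction = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
print([str(a) for a in gwlp_two_level(half_fraction)])  # ['0', '0', '1']
```

The single word of length 3 reflects the defining relation of the half fraction, while the full $2^3$ factorial gives an all-zero pattern.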
-
Generalized Minimum Aberration mixed-level orthogonal arrays: A general approach based on sequential integer quadratically constrained quadratic programming
Authors:
Roberto Fontana
Abstract:
Orthogonal Fractional Factorial Designs and in particular Orthogonal Arrays are frequently used in many fields of application, including medicine, engineering and agriculture. In this paper we present a methodology and an algorithm to find an orthogonal array, of given size and strength, that satisfies the generalized minimum aberration criterion. The methodology is based on the joint use of polynomial counting functions, complex coding of levels and algorithms for quadratic optimization and puts no restriction on the number of levels of each factor.
Submitted 14 January, 2015;
originally announced January 2015.
-
$D$-optimal saturated designs: a simulation study
Authors:
Roberto Fontana,
Fabio Rapallo,
Maria Piera Rogantin
Abstract:
In this work we focus on saturated $D$-optimal designs. Using recent results, we identify $D$-optimal designs with the solutions of an optimization problem with linear constraints. We introduce new objective functions based on the geometric structure of the design and we compare them with the classical $D$-efficiency criterion. We perform a simulation study. In all the test cases we observe that designs with high values of $D$-efficiency also have high values of the new objective functions.
Submitted 4 January, 2014;
originally announced January 2014.
-
Random Latin squares and Sudoku designs generation
Authors:
Roberto Fontana
Abstract:
Uniform random generation of Latin squares is a classical problem. In this paper we prove that both Latin squares and Sudoku designs are maximum cliques of properly defined graphs. We have developed a simple algorithm for uniform random sampling of Latin squares and Sudoku designs. It makes use of recent tools for graph analysis. The corresponding SAS code is annexed.
Submitted 16 May, 2013;
originally announced May 2013.
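The clique view of Latin squares mentioned in the abstract above can be sketched as follows. The graph used here is an assumption, since the abstract only speaks of "properly defined graphs": vertices are (row, column, symbol) triples, and two triples are adjacent when they agree in at most one coordinate. Under that assumption a Latin square of order $n$ is exactly a clique of size $n^2$. The code only checks and counts such cliques for tiny $n$; it does not implement uniform random sampling and is unrelated to the annexed SAS code.

```python
from itertools import combinations, product


def compatible(a, b):
    """Two (row, column, symbol) triples can coexist in a Latin square
    iff they agree in at most one coordinate."""
    return sum(x == y for x, y in zip(a, b)) <= 1


def is_clique(triples):
    """True if every pair of triples is adjacent in the assumed graph."""
    return all(compatible(a, b) for a, b in combinations(triples, 2))


def count_latin_squares(n):
    """Count cliques of size n*n by choosing one triple per cell."""
    cells = list(product(range(n), repeat=2))

    def extend(clique, k):
        if k == len(cells):
            return 1
        r, c = cells[k]
        total = 0
        for s in range(n):
            v = (r, c, s)
            if all(compatible(v, u) for u in clique):
                total += extend(clique + [v], k + 1)
        return total

    return extend([], 0)


# The cyclic square with entry (r + c) mod 3 is a clique of size 9.
cyclic = [(r, c, (r + c) % 3) for r in range(3) for c in range(3)]
print(is_clique(cyclic))       # True
print(count_latin_squares(3))  # 12
```

Two triples that share a cell are never adjacent, so a clique holds at most one triple per cell. This is why filling the cells one at a time enumerates every clique of size $n^2$.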
-
A Characterization of Saturated Designs for Factorial Experiments
Authors:
Roberto Fontana,
Fabio Rapallo,
Maria-Piera Rogantin
Abstract:
In this paper we study saturated fractions of factorial designs from the perspective of Algebraic Statistics. We define a criterion to check whether a fraction is saturated or not with respect to a given model. The proposed criterion is based purely on combinatorial objects. Our technique is particularly useful when several fractions are needed. We also show how to generate random saturated fractions with given projections, by applying the theory of Markov bases for contingency tables.
Submitted 30 April, 2013;
originally announced April 2013.
-
Random generation of optimal saturated designs
Authors:
Roberto Fontana
Abstract:
Efficient algorithms for searching for optimal saturated designs are widely available. They maximize a given efficiency measure (such as D-optimality) and provide an optimum design. Nevertheless, they do not guarantee a \emph{global} optimal design. Indeed, they start from an initial random design and find a local optimal design. If the initial design is changed the optimum found will, in general, be different. A natural question arises. Should we stop at the design found or should we run the algorithm again in search of a better design? This paper uses very recent methods and software for discovery probability to support the decision to continue or stop the sampling. A software tool written in SAS has been developed.
Submitted 28 March, 2013; v1 submitted 26 March, 2013;
originally announced March 2013.
-
Generation of Fractional Factorial Designs
Authors:
Roberto Fontana,
Giovanni Pistone
Abstract:
The joint use of counting functions, Hilbert basis and Markov basis makes it possible to define a procedure to generate all the fractions that satisfy a given set of constraints in terms of orthogonality. The general case of mixed level designs, without restrictions on the number of levels of each factor (such as primes or powers of primes), is studied. This new methodology has been tested on some significant classes of fractional factorial designs, including mixed level orthogonal arrays.
Submitted 17 June, 2009;
originally announced June 2009.
-
2-level fractional factorial designs which are the union of non trivial regular designs
Authors:
Roberto Fontana,
Giovanni Pistone
Abstract:
Every fraction is a union of points, which are trivial regular fractions. To characterize non trivial decomposition, we derive a condition for the inclusion of a regular fraction as follows. Let $F = \sum_α b_α X^α$ be the indicator polynomial of a generic fraction, see Fontana et al, JSPI 2000, 149-172. Regular fractions are characterized by $R = \frac 1l \sum_{α \in \mathcal L} e_α X^α$, where $α \mapsto e_α$ is a group homomorphism from $\mathcal L \subset \mathbb Z_2^d$ into $\{-1,+1\}$. The regular $R$ is a subset of the fraction $F$ if $FR = R$, which in turn is equivalent to $\sum_t F(t)R(t) = \sum_t R(t)$. If $\mathcal H = \{α_1, \ldots, α_k\}$ is a generating set of $\mathcal L$, and $R = \frac1{2^k}(1 + e_1 X^{α_1}) \cdots (1 + e_k X^{α_k})$, $e_j = \pm 1$, $j = 1, \ldots, k$, the inclusion condition in terms of the $b_α$'s is \begin{equation} b_0 + e_1 b_{α_1} + \cdots + e_1 \cdots e_k b_{α_1 + \cdots + α_k} = 1. \tag{*} \end{equation} The last part of the paper will discuss some examples to investigate the practical applicability of the previous condition (*).
This paper is an offspring of the Alcotra 158 EU research contract on the planning of sequential designs for sample surveys in tourism statistics.
Submitted 31 October, 2007;
originally announced October 2007.
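The inclusion condition (*) from the abstract above can be checked numerically on a toy case. The sketch assumes the usual $-1/+1$ coding, takes $b_α = 2^{-d} \sum_{t \in F} X^α(t)$ as the coefficients of the indicator polynomial, and sums $e \cdot b_α$ over all sums of generators with the matching product of signs. The function names and the example fractions are hypothetical, and the generators are assumed to be independent.

```python
from fractions import Fraction
from itertools import product
from math import prod


def monomial(alpha, t):
    """Value of X^alpha at the point t (coordinates coded -1/+1)."""
    return prod(x for a, x in zip(alpha, t) if a)


def b_coeff(fraction, alpha, d):
    """Coefficient b_alpha of the indicator polynomial of the fraction."""
    return Fraction(sum(monomial(alpha, t) for t in fraction), 2 ** d)


def contains_regular(fraction, generators, signs, d):
    """Check (*): the signed sum of b_alpha over all sums of generators equals 1."""
    total = Fraction(0)
    for picks in product([0, 1], repeat=len(generators)):
        alpha = [0] * d
        e = 1
        for pick, g, s in zip(picks, generators, signs):
            if pick:
                alpha = [(a + b) % 2 for a, b in zip(alpha, g)]
                e *= s
        total += e * b_coeff(fraction, alpha, d)
    return total == 1


# Regular fraction x1 * x2 * x3 = +1 in d = 3 factors (one generator, e_1 = +1).
regular = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
bigger = regular + [(-1, -1, -1)]        # contains the regular fraction
broken = regular[:3] + [(-1, -1, -1)]    # misses one of its points

print(contains_regular(bigger, [(1, 1, 1)], [1], 3))  # True
print(contains_regular(broken, [(1, 1, 1)], [1], 3))  # False
```

In the first case $b_0 = 5/8$ and $b_{(1,1,1)} = 3/8$, which sum to 1; in the second the sum is $4/8 + 2/8 = 6/8$, so the regular fraction is not contained.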