-
Understanding Adversarial Training with Energy-based Models
Authors:
Mujtaba Hussain Mirza,
Maria Rosaria Briglia,
Filippo Bartolucci,
Senad Beadini,
Giuseppe Lisanti,
Iacopo Masi
Abstract:
We aim to use the Energy-based Model (EBM) framework to better understand adversarial training (AT) in classifiers, and additionally to analyze the intrinsic generative capabilities of robust classifiers. By viewing standard classifiers through an energy lens, we begin by analyzing how the energies of adversarial examples, generated by various attacks, differ from those of the natural samples. The central focus of our work is to understand the critical phenomena of Catastrophic Overfitting (CO) and Robust Overfitting (RO) in AT from an energy perspective. We analyze the impact of existing AT approaches on the energy of samples during training and observe that the behavior of the "delta energy" -- the change in energy between an original sample and its adversarial counterpart -- diverges significantly when CO or RO occurs. After a thorough analysis of these energy dynamics and their relationship with overfitting, we propose a novel regularizer, the Delta Energy Regularizer (DER), designed to smooth the energy landscape during training. We demonstrate that DER is effective in mitigating both CO and RO across multiple benchmarks. We further show that robust classifiers, when used as generative models, have limits in handling the trade-off between image quality and variability. We propose an improved technique based on a local class-wise principal component analysis (PCA) and energy-based guidance for better class-specific initialization and adaptive stopping, enhancing sample diversity and generation quality. Considering that we do not explicitly train for generative modeling, we achieve a competitive Inception Score (IS) and Fréchet inception distance (FID) compared to hybrid discriminative-generative models.
Submitted 28 May, 2025;
originally announced May 2025.
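As a rough illustration of the "delta energy" quantity mentioned in the abstract above, the sketch below assumes the common convention of reading a classifier's logits f(x) as an energy E(x) = -log sum_k exp(f_k(x)). That convention, the sign of the difference, and the logit values are assumptions made for illustration; the abstract does not state the paper's exact definition.

```python
import math


def energy(logits):
    """Assumed energy of a sample: E(x) = -log(sum_k exp(f_k(x))).

    Uses the max-shift trick so large logits do not overflow.
    """
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))


def delta_energy(logits_natural, logits_adversarial):
    """Assumed sign convention: energy of the adversarial sample minus
    the energy of the natural sample."""
    return energy(logits_adversarial) - energy(logits_natural)


# Hypothetical logits for one natural image and its adversarial counterpart.
natural = [4.0, 0.5, -1.0]
adversarial = [1.0, 1.2, 0.8]
print(delta_energy(natural, adversarial))
```

With these made-up logits the adversarial sample has the higher energy, so the printed difference is positive. Real attacks may move the energy in either direction.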
-
Constructing Simultaneous Confidence Bands for Errors-in-variables Curves with Application to the Lorenz Curve
Authors:
Ziqing Dong,
Francesco Bartolucci,
Satoshi Kuriki,
Antonietta Mira
Abstract:
Errors-in-variables curves are curves where errors exist not only in the independent variable but also in the dependent variable. We address the challenge of constructing simultaneous confidence bands (SCBs) for such curves. Our method finds application in the Lorenz curve, which represents the concentration of income or wealth. Unlike ordinary regression curves, the Lorenz curve incorporates errors in its explanatory variable and requires a fundamentally different treatment. To the best of our knowledge, the development of SCBs for such curves has not been explored in previous research. Using the Lorenz curve as a case study, this paper proposes a novel approach to address this challenge.
Submitted 28 January, 2025;
originally announced January 2025.
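For readers unfamiliar with the Lorenz curve mentioned in the abstract above, the following minimal sketch computes the standard empirical Lorenz curve from a list of incomes. It does not construct the simultaneous confidence bands the paper proposes. The function name and the sample data are hypothetical.

```python
def lorenz_curve(incomes):
    """Return the empirical Lorenz curve as a list of (p, L) points.

    p is the cumulative share of the population (poorest first) and
    L is the cumulative share of total income held by that group.
    """
    values = sorted(incomes)
    if not values or values[0] < 0:
        raise ValueError("incomes must be a non-empty list of non-negative numbers")
    total = sum(values)
    if total == 0:
        raise ValueError("total income must be positive")

    n = len(values)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(values, start=1):
        running += v
        points.append((i / n, running / total))
    return points


# Hypothetical sample: one person holds three quarters of the income.
for p, share in lorenz_curve([3, 1]):
    print(p, share)
```

Both axes here are estimated from the same sample, which is the errors-in-variables feature the abstract refers to.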
-
A Lipschitz spaces view of infinitely wide shallow neural networks
Authors:
Francesca Bartolucci,
Marcello Carioni,
José A. Iglesias,
Yury Korolev,
Emanuele Naldi,
Stefano Vigogna
Abstract:
We revisit the mean field parametrization of shallow neural networks, using signed measures on unbounded parameter spaces and duality pairings that take into account the regularity and growth of activation functions. This setting directly leads to the use of unbalanced Kantorovich-Rubinstein norms defined by duality with Lipschitz functions, and of spaces of measures dual to those of continuous functions with controlled growth. These make transparent the need for total variation and moment bounds or penalization to obtain existence of minimizers of variational formulations, under which we prove a compactness result in strong Kantorovich-Rubinstein norm, and in the absence of which we show several examples demonstrating undesirable behavior. Further, the Kantorovich-Rubinstein setting enables us to combine the advantages of a completely linear parametrization and ensuing reproducing kernel Banach space framework with optimal transport insights. We showcase this synergy with representer theorems and uniform large data limits for empirical risk minimization, and in proposed formulations for distillation and fusion applications.
Submitted 18 October, 2024;
originally announced October 2024.
-
Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense
Authors:
Filippo Bartolucci,
Iacopo Masi,
Giuseppe Lisanti
Abstract:
Image manipulation detection and localization have received considerable attention from the research community given the blooming of Generative Models (GMs). Detection methods that follow a passive approach may overfit to specific GMs, limiting their application in real-world scenarios, due to the growing diversity of generative models. Recently, approaches based on a proactive framework have shown the possibility of dealing with this limitation. However, these methods suffer from two main limitations, which raise concerns about potential vulnerabilities: i) the manipulation detector is not robust to noise and hence can be easily fooled; ii) the fact that they rely on fixed perturbations for image protection offers a predictable exploit for malicious attackers, enabling them to reverse-engineer and evade detection. To overcome these issues we propose PADL, a new solution able to generate image-specific perturbations using a symmetric scheme of encoding and decoding based on cross-attention, which drastically reduces the possibility of reverse engineering, even when evaluated with an adaptive attack [31]. Additionally, PADL is able to pinpoint manipulated areas, facilitating the identification of specific regions that have undergone alterations, and has more generalization power than prior art on held-out generative models. Indeed, despite being trained only on an attribute manipulation GAN model [15], our method generalizes to a range of unseen models with diverse architectural designs, such as StarGANv2, BlendGAN, DiffAE, StableDiffusion and StableDiffusionXL. Additionally, we introduce a novel evaluation protocol, which offers a fair evaluation of localisation performance as a function of detection accuracy and better captures real-world scenarios.
Submitted 26 September, 2024;
originally announced September 2024.
-
Neural reproducing kernel Banach spaces and representer theorems for deep networks
Authors:
Francesca Bartolucci,
Ernesto De Vito,
Lorenzo Rosasco,
Stefano Vigogna
Abstract:
Studying the function spaces defined by neural networks helps to understand the corresponding learning models and their inductive bias. While in some limits neural networks correspond to function spaces that are reproducing kernel Hilbert spaces, these regimes do not capture the properties of the networks used in practice. In contrast, in this paper we show that deep neural networks define suitable reproducing kernel Banach spaces.
These spaces are equipped with norms that enforce a form of sparsity, enabling them to adapt to potential latent structures within the input data and their representations. In particular, leveraging the theory of reproducing kernel Banach spaces, combined with variational results, we derive representer theorems that justify the finite architectures commonly employed in applications. Our study extends analogous results for shallow networks and can be seen as a step towards considering more practically plausible neural architectures.
Submitted 13 March, 2024;
originally announced March 2024.
-
Uncovering the limits of uniqueness in sampled Gabor phase retrieval: A dense set of counterexamples in $L^2(\mathbb{R})$
Authors:
Rima Alaifari,
Francesca Bartolucci,
Matthias Wellershoff
Abstract:
Sampled Gabor phase retrieval - the problem of recovering a square-integrable signal from the magnitude of its Gabor transform sampled on a lattice - is a fundamental problem in signal processing, with important applications in areas such as imaging and audio processing. Recently, a classification of square-integrable signals which are not phase retrievable from Gabor measurements on parallel lines has been presented. This classification was used to exhibit a family of counterexamples to uniqueness in sampled Gabor phase retrieval. Here, we show that the set of counterexamples to uniqueness in sampled Gabor phase retrieval is dense in $L^2(\mathbb{R})$, but is not equal to the whole of $L^2(\mathbb{R})$ in general. Overall, our work contributes to a better understanding of the fundamental limits of sampled Gabor phase retrieval.
Submitted 8 July, 2023;
originally announced July 2023.
-
Representation Equivalent Neural Operators: a Framework for Alias-free Operator Learning
Authors:
Francesca Bartolucci,
Emmanuel de Bézenac,
Bogdan Raonić,
Roberto Molinaro,
Siddhartha Mishra,
Rima Alaifari
Abstract:
Recently, operator learning, or learning mappings between infinite-dimensional function spaces, has garnered significant attention, notably in relation to learning partial differential equations from data. Conceptually clear when outlined on paper, neural operators necessitate discretization in the transition to computer implementations. This step can compromise their integrity, often causing them to deviate from the underlying operators. This research offers a fresh take on neural operators with a framework, Representation equivalent Neural Operators (ReNO), designed to address these issues. At its core is the concept of operator aliasing, which measures inconsistency between neural operators and their discrete representations. We explore this for widely-used operator learning techniques. Our findings detail how aliasing introduces errors when handling different discretizations and grids and leads to a loss of crucial continuous structures. More generally, this framework not only sheds light on existing challenges but, given its constructive and broad nature, also potentially offers tools for developing new neural operators.
Submitted 2 November, 2023; v1 submitted 31 May, 2023;
originally announced May 2023.
-
Unique wavelet sign retrieval from samples without bandlimiting
Authors:
Rima Alaifari,
Francesca Bartolucci,
Matthias Wellershoff
Abstract:
We study the problem of recovering a signal from magnitudes of its wavelet frame coefficients when the analyzing wavelet is real-valued. We show that every real-valued signal can be uniquely recovered, up to global sign, from its multi-wavelet frame coefficients \[ \{\lvert \mathcal{W}_{φ_i} f(α^{m}βn,α^{m}) \rvert: i\in\{1,2,3\}, m,n\in\mathbb{Z}\} \] for every $α>1,β>0$ with $β\ln(α)\leq 4π/(1+4p)$, $p>0$, when the three wavelets $φ_i$ are suitable linear combinations of the Poisson wavelet $P_p$ of order $p$ and its Hilbert transform $\mathscr{H}P_p$. For complex-valued signals we find that this is not possible for any choice of the parameters $α>1,β>0$, and for any window. In contrast to the existing literature on wavelet sign retrieval, our uniqueness results do not require any bandlimiting constraints or other a priori knowledge on the real-valued signals to guarantee their unique recovery from the absolute values of their wavelet coefficients.
Submitted 1 July, 2024; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Convolutional Neural Operators for robust and accurate learning of PDEs
Authors:
Bogdan Raonić,
Roberto Molinaro,
Tim De Ryck,
Tobias Rohner,
Francesca Bartolucci,
Rima Alaifari,
Siddhartha Mishra,
Emmanuel de Bézenac
Abstract:
Although very successfully used in conventional machine learning, convolution based neural network architectures -- believed to be inconsistent in function space -- have been largely ignored in the context of learning solution operators of PDEs. Here, we present novel adaptations for convolutional neural networks to demonstrate that they are indeed able to process functions as inputs and outputs. The resulting architecture, termed as convolutional neural operators (CNOs), is designed specifically to preserve its underlying continuous nature, even when implemented in a discretized form on a computer. We prove a universality theorem to show that CNOs can approximate operators arising in PDEs to desired accuracy. CNOs are tested on a novel suite of benchmarks, encompassing a diverse set of PDEs with possibly multi-scale solutions and are observed to significantly outperform baselines, paving the way for an alternative framework for robust and accurate operator learning. Our code is publicly available at https://github.com/bogdanraonic3/ConvolutionalNeuralOperator
Submitted 1 December, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
On the connection between uniqueness from samples and stability in Gabor phase retrieval
Authors:
Rima Alaifari,
Francesca Bartolucci,
Stefan Steinerberger,
Matthias Wellershoff
Abstract:
Gabor phase retrieval is the problem of reconstructing a signal from only the magnitudes of its Gabor transform. Previous findings suggest a possible link between unique solvability of the discrete problem (recovery from measurements on a lattice) and stability of the continuous problem (recovery from measurements on an open subset of $\mathbb{R}^2$). In this paper, we close this gap by proving that such a link cannot be made. More precisely, we establish the existence of functions which break uniqueness from samples without affecting stability of the continuous problem. Furthermore, we prove the novel result that counterexamples to unique recovery from samples are dense in $L^2(\mathbb{R})$. Finally, we develop an intuitive argument on the connection between directions of instability in phase retrieval and certain Laplacian eigenfunctions associated to small eigenvalues.
Submitted 29 November, 2023; v1 submitted 14 May, 2022;
originally announced May 2022.
-
Understanding neural networks with reproducing kernel Banach spaces
Authors:
Francesca Bartolucci,
Ernesto De Vito,
Lorenzo Rosasco,
Stefano Vigogna
Abstract:
Characterizing the function spaces corresponding to neural networks can provide a way to understand their properties. In this paper we discuss how the theory of reproducing kernel Banach spaces can be used to tackle this challenge. In particular, we prove a representer theorem for a wide class of reproducing kernel Banach spaces that admit a suitable integral representation and include one hidden layer neural networks of possibly infinite width. Further, we show that, for a suitable class of ReLU activation functions, the norm in the corresponding reproducing kernel Banach space can be characterized in terms of the inverse Radon transform of a bounded real measure, with norm given by the total variation norm of the measure. Our analysis simplifies and extends recent results in [34,29,30].
Submitted 26 October, 2021; v1 submitted 20 September, 2021;
originally announced September 2021.
-
Unitarization of the Horocyclic Radon Transform on Symmetric Spaces
Authors:
Francesca Bartolucci,
Filippo De Mari,
Matteo Monti
Abstract:
We consider the Radon transform for a dual pair $(X,Ξ)$, where $X=G/K$ is a noncompact symmetric space and $Ξ$ is the space of horocycles of $X$. We address the unitarization problem that was considered (and solved in some cases) by Helgason, namely the determination of a pseudo-differential operator such that the pre-composition with the Radon transform extends to a unitary operator $\mathcal{Q}\colon L^2(X)\to L_\flat^2(Ξ)$, where $L_\flat^2(Ξ)$ is a closed subspace of $L^2(Ξ)$ which accounts for the Weyl symmetries. Furthermore, we show that the unitary extension intertwines the quasi-regular representations of $G$ on $L^2(X)$ and $L_\flat^2(Ξ)$.
Submitted 9 August, 2021;
originally announced August 2021.
-
Maximum likelihood estimation of hidden Markov models for continuous longitudinal data with missing responses and dropout
Authors:
Silvia Pandolfi,
Francesco Bartolucci,
Fulvia Pennoni
Abstract:
We propose an inferential approach for maximum likelihood estimation of the hidden Markov models for continuous responses. We extend to the case of longitudinal observations the finite mixture model of multivariate Gaussian distributions with Missing At Random (MAR) outcomes, also accounting for possible dropout. The resulting hidden Markov model accounts for different types of missing pattern: (i) partially missing outcomes at a given time occasion; (ii) completely missing outcomes at a given time occasion (intermittent pattern); (iii) dropout before the end of the period of observation (monotone pattern). The MAR assumption is formulated to deal with the first two types of missingness, while to account for informative dropout we assume an extra absorbing state. Maximum likelihood estimation of the model parameters is based on an extended Expectation-Maximization algorithm relying on suitable recursions. The proposal is illustrated by a Monte Carlo simulation study and an application based on historical data on primary biliary cholangitis.
Submitted 30 June, 2021;
originally announced June 2021.
-
Estimating the size of a closed population by modeling latent and observed heterogeneity
Authors:
Antonio Forcina,
Francesco Bartolucci
Abstract:
The paper describes a new class of capture-recapture models for closed populations when individual covariates are available. The novelty consists in combining a latent class model for the distribution of the capture history, where the class weights and the conditional distributions given the latent may depend on covariates, with a model for the marginal distribution of the available covariates as in \cite{Liu2017}. In addition, any general form of serial dependence is allowed when modeling capture histories conditionally on the latent and covariates. A Fisher-scoring algorithm for maximum likelihood estimation is proposed, and the Implicit Function Theorem is used to show that the mapping between the marginal distribution of the observed covariates and the probabilities of being never captured is one-to-one. Asymptotic results are outlined, and a procedure for constructing likelihood based confidence intervals for the population size is presented. Two examples based on real data are used to illustrate the proposed approach.
Submitted 5 November, 2021; v1 submitted 7 June, 2021;
originally announced June 2021.
-
Phase retrieval of bandlimited functions for the wavelet transform
Authors:
Rima Alaifari,
Francesca Bartolucci,
Matthias Wellershoff
Abstract:
We study the recovery of square-integrable signals from the absolute values of their wavelet transforms, also called wavelet phase retrieval. We present a new uniqueness result for wavelet phase retrieval. To be precise, we show that any wavelet with finitely many vanishing moments allows for the unique recovery of real-valued bandlimited signals up to global sign. Additionally, we present the first uniqueness result for sampled wavelet phase retrieval in which the underlying wavelets are allowed to be complex-valued and we present a uniqueness result for phase retrieval from sampled Cauchy wavelet transform measurements.
Submitted 5 November, 2021; v1 submitted 10 September, 2020;
originally announced September 2020.
-
Continuity properties of the shearlet transform and the shearlet synthesis operator on the Lizorkin type spaces
Authors:
Francesca Bartolucci,
Stevan Pilipović,
Nenad Teofanov
Abstract:
We develop a distributional framework for the shearlet transform $\mathcal{S}_ψ\colon\mathcal{S}_0(\mathbb{R}^2)\to\mathcal{S}(\mathbb{S})$ and the shearlet synthesis operator $\mathcal{S}^t_ψ\colon\mathcal{S}(\mathbb{S})\to\mathcal{S}_0(\mathbb{R}^2)$, where $\mathcal{S}_0(\mathbb{R}^2)$ is the Lizorkin test function space and $\mathcal{S}(\mathbb{S})$ is the space of highly localized test functions on the standard shearlet group $\mathbb{S}$. These spaces and their duals $\mathcal{S}_0^\prime (\mathbb R^2),\, \mathcal{S}^\prime (\mathbb{S})$ are called Lizorkin type spaces of test functions and distributions. We analyze the continuity properties of these transforms when the admissible vector $ψ$ belongs to $\mathcal{S}_0(\mathbb{R}^2)$. Then, we define the shearlet transform and the shearlet synthesis operator of Lizorkin type distributions as transpose mappings of the shearlet synthesis operator and the shearlet transform, respectively. They yield continuous mappings from $\mathcal{S}_0^\prime (\mathbb R^2)$ to $\mathcal{S}^\prime (\mathbb{S})$ and from $\mathcal{S}^\prime (\mathbb S)$ to $\mathcal{S}_0^\prime (\mathbb{R}^2)$. Furthermore, we show the consistency of our definition with the shearlet transform defined by direct evaluation of a distribution on the shearlets. The same can be done for the shearlet synthesis operator. Finally, we give a reconstruction formula for Lizorkin type distributions, from which follows that the action of such generalized functions can be written as an absolutely convergent integral over the standard shearlet group.
Submitted 6 May, 2020;
originally announced May 2020.
-
The Shearlet Transform and Lizorkin Spaces
Authors:
Francesca Bartolucci,
Stevan Pilipović,
Nenad Teofanov
Abstract:
We prove a continuity result for the shearlet transform when restricted to the space of smooth and rapidly decreasing functions with all vanishing moments. We define the dual shearlet transform, called here the shearlet synthesis operator, and we prove its continuity on the space of smooth and rapidly decreasing functions over $\mathbb{R}^2\times\mathbb{R}\times\mathbb{R}^\times$. Then, we use these continuity results to extend the shearlet transform to the space of Lizorkin distributions, and we prove its consistency with the classical definition for test functions.
Submitted 7 May, 2020; v1 submitted 14 March, 2020;
originally announced March 2020.
-
Unitarization of the Horocyclic Radon Transform on Homogeneous Trees
Authors:
Francesca Bartolucci,
Filippo De Mari,
Matteo Monti
Abstract:
Following previous work in the continuous setup, we construct the unitarization of the horocyclic Radon transform on a homogeneous tree X and we show that it intertwines the quasi regular representations of the group of isometries of X on the tree itself and on the space of horocycles.
Submitted 4 August, 2021; v1 submitted 16 February, 2020;
originally announced February 2020.
-
Radon Transform: Dual Pairs and Irreducible Representations
Authors:
Giovanni S. Alberti,
Francesca Bartolucci,
Filippo De Mari,
Ernesto De Vito
Abstract:
We illustrate the general point of view developed in [SIAM J. Math. Anal., 51(6), 4356-4381] that can be described as a variation of Helgason's theory of dual $G$-homogeneous pairs $(X,Ξ)$ and which allows us to prove intertwining properties and inversion formulae of many existing Radon transforms. Here we analyze in detail one of the important aspects in the theory of dual pairs, namely the injectivity of the map label-to-manifold $ξ\to\hatξ$ and we prove that it is a necessary condition for the irreducibility of the quasi-regular representation of $G$ on $L^2(Ξ)$. We further explain how the theory in [SIAM J. Math. Anal., 51(6), 4356-4381] applies to the classical Radon and X-ray transforms in $\mathbb R^3$.
Submitted 4 February, 2020;
originally announced February 2020.
-
Cone-Adapted Shearlets and Radon Transforms
Authors:
Francesca Bartolucci,
Filippo De Mari,
Ernesto De Vito
Abstract:
We show that the cone-adapted shearlet coefficients can be computed by means of the limited angle horizontal and vertical (affine) Radon transforms and the one-dimensional wavelet transform. This yields formulas that open new perspectives for the inversion of the Radon transform.
Submitted 16 October, 2019;
originally announced October 2019.
-
Unitarization and Inversion Formulae for the Radon Transform between Dual Pairs
Authors:
Giovanni S. Alberti,
Francesca Bartolucci,
Filippo De Mari,
Ernesto De Vito
Abstract:
We consider the Radon transform associated to dual pairs $(X,Ξ)$ in the sense of Helgason, with $X=G/K$ and $Ξ=G/H$, where $G=\mathbb{R}^d\rtimes K$, $K$ is a closed subgroup of ${\rm GL}(d,\mathbb{R})$ and $H$ is a closed subgroup of $G$. Under some technical assumptions, we prove that if the quasi regular representations of $G$ acting on $L^2(X)$ and $L^2(Ξ)$ are irreducible, then the Radon transform admits a unitarization intertwining the two representations. If, in addition, the representations are square integrable, we provide an inversion formula for the Radon transform based on the voice transform associated to these representations.
Submitted 30 October, 2018;
originally announced October 2018.
-
Marginal models with individual-specific effects for the analysis of longitudinal bipartite networks
Authors:
Francesco Bartolucci,
Antonietta Mira,
Stefano Peluso
Abstract:
A new modeling framework for bipartite social networks arising from a sequence of partially time-ordered relational events is proposed. We directly model the joint distribution of the binary variables indicating if each single actor is involved or not in an event. The adopted parametrization is based on first- and second-order effects, formulated as in marginal models for categorical data and free higher order effects. In particular, second-order effects are log-odds ratios with meaningful interpretation from the social perspective in terms of tendency to cooperate, in contrast to first-order effects interpreted in terms of tendency of each single actor to participate in an event. These effects are parametrized on the basis of the event times, so that suitable latent trajectories of individual behaviors may be represented. Inference is based on a composite likelihood function, maximized by an algorithm with numerical complexity proportional to the square of the number of units in the network. A classification composite likelihood is used to cluster the actors, simplifying the interpretation of the data structure. The proposed approach is illustrated on a dataset of scientific articles published in four top statistical journals from 2003 to 2012.
Submitted 20 October, 2018;
originally announced October 2018.
-
Radon transform intertwines shearlets and wavelets
Authors:
Francesca Bartolucci,
Filippo De Mari,
Ernesto De Vito
Abstract:
We prove that the unitary affine Radon transform intertwines the quasi-regular representation of a class of semidirect products, built by shearlet dilation groups and translations, and the tensor product of a standard wavelet representation with a wavelet-like representation. This yields a formula for shearlet coefficients that involves only integral transforms applied to the affine Radon transform of the signal, thereby opening new perspectives in the inversion of the Radon transform.
Submitted 28 March, 2017;
originally announced March 2017.
-
Evaluation of student proficiency through a multidimensional finite mixture IRT model
Authors:
Silvia Bacci,
Francesco Bartolucci,
Leonardo Grilli,
Carla Rampichini
Abstract:
In certain academic systems, a student can enroll for an exam immediately after the end of the teaching period or can postpone it to any later examination session, so that the grade is missing until the exam is attempted. We propose an approach for the evaluation in itinere of a student's proficiency accounting also for non-attempted exams. The approach is based on considering each exam as an item, so that responding to the item amounts to attempting the exam, and on an Item Response Theory model that includes two latent variables corresponding to the student's ability and the propensity to attempt the exam. In this way, we explicitly account for non-ignorable missing observations as the indicators of item response also contribute to measure the ability. The two latent variables are assumed to have a discrete distribution defining latent classes of students that are homogeneous in terms of ability and priority assigned to exams. The model, which also allows for individual covariates in its structural part, is fitted by the Expectation-Maximization algorithm. The approach is illustrated through the analysis of data about the first-year exams of freshmen of the School of Economics at the University of Florence (Italy).
Submitted 21 September, 2016;
originally announced September 2016.
-
Composite likelihood inference in a discrete latent variable model for two-way "clustering-by-segmentation" problems
Authors:
Francesco Bartolucci,
Francesca Chiaromonte,
Prabhani Kuruppumullage Don,
Bruce George Lindsay
Abstract:
We consider a discrete latent variable model for two-way data arrays, which allows one to simultaneously produce clusters along one of the data dimensions (e.g. exchangeable observational units or features) and contiguous groups, or segments, along the other (e.g. consecutively ordered times or locations). The model relies on a hidden Markov structure but, given its complexity, cannot be estimated by full maximum likelihood. We therefore introduce composite likelihood methodology based on considering different subsets of the data. The proposed approach is illustrated by simulation, and with an application to genomic data.
Submitted 27 June, 2015;
originally announced June 2015.
-
LMest: an R package for latent Markov models for categorical longitudinal data
Authors:
Francesco Bartolucci,
Alessio Farcomeni,
Silvia Pandolfi,
Fulvia Pennoni
Abstract:
Latent Markov (LM) models represent an important class of models for the analysis of longitudinal data (Bartolucci et al., 2013), especially when response variables are categorical. These models have a great potential of application for the analysis of social, medical, and behavioral data as well as in other disciplines. We propose the R package LMest, which is tailored to deal with these types of models. In particular, we consider a general framework for extended LM models by including individual covariates and by formulating a mixed approach to take into account additional dependence structures in the data. Such extensions lead to a very flexible class of models, which allows us to fit different types of longitudinal data. Model parameters are estimated through the expectation-maximization algorithm, based on the forward-backward recursions, which is implemented in the main functions of the package. The package also allows us to perform local and global decoding and to obtain standard errors for the parameter estimates. We illustrate its use and the most important features on the basis of examples involving applications in health and criminology.
Submitted 19 January, 2015;
originally announced January 2015.
-
A multidimensional latent class IRT model for non-ignorable missing responses
Authors:
Silvia Bacci,
Francesco Bartolucci
Abstract:
We propose a structural equation model, which reduces to a multidimensional latent class item response theory model, for the analysis of binary item responses with non-ignorable missingness. The missingness mechanism is driven by two sets of latent variables: one describing the propensity to respond and the other referring to the abilities measured by the test items. These latent variables are assumed to have a discrete distribution, so as to reduce the number of parametric assumptions regarding the latent structure of the model. Individual covariates may also be included through a multinomial logistic parametrization of the probabilities of each support point of the distribution of the latent variables. Given the discrete nature of this distribution, the proposed model is efficiently estimated by the Expectation-Maximization algorithm. A simulation study is performed to evaluate the finite sample properties of the parameter estimates. Moreover, an application to data coming from a Students' Entry Test for the admission to some university courses is illustrated.
Submitted 17 October, 2014;
originally announced October 2014.
-
A multilevel finite mixture item response model to cluster examinees and schools
Authors:
Michela Gnaldi,
Silvia Bacci,
Francesco Bartolucci
Abstract:
Within the educational context, a key goal is to assess students' acquired skills and to cluster students according to their ability level. In this regard, a relevant element to be accounted for is the possible effect of the school students come from. For this aim, we provide a methodological tool which takes into account the multilevel structure of the data (i.e., students in schools) in a suitable way. This approach allows us to cluster both students and schools into homogeneous classes of ability and effectiveness, and to assess the effect of certain student and school characteristics on the probability of belonging to such classes. The approach relies on an extended class of multidimensional latent class IRT models characterized by: (i) latent traits defined at student level and at school level, (ii) latent traits represented through random vectors with a discrete distribution, (iii) the inclusion of covariates at student level and at school level, and (iv) a two-parameter logistic parametrization for the conditional probability of a correct response given the ability. The approach is applied for the analysis of data collected by two national tests administered in Italy to middle school students in June 2009: the INVALSI Italian Test and Mathematics Test. Results allow us to study the relationships between observed characteristics and latent trait standing within each latent class at the different levels of the hierarchy. They show that examinee and school expected observed scores, at a given latent trait level, are dependent on both unobserved (latent class) group membership and observed first and second level covariates.
Submitted 11 August, 2014;
originally announced August 2014.
-
Item selection by Latent Class-based methods
Authors:
Francesco Bartolucci,
Giorgio E. Montanari,
Silvia Pandolfi
Abstract:
The evaluation of nursing homes is usually based on the administration of questionnaires made of a large number of polytomous items. In such a context, the Latent Class (LC) model represents a useful tool for clustering subjects in homogeneous groups corresponding to different degrees of impairment of the health conditions. It is known that the performance of model-based clustering and the accuracy of the choice of the number of latent classes may be affected by the presence of irrelevant or noise variables. In this paper, we show the application of an item selection algorithm to real data collected within a project, named ULISSE, on the quality-of-life of elderly patients hosted in Italian nursing homes. This algorithm, which is closely related to that proposed by Dean and Raftery in 2010, is aimed at finding the subset of items which provides the best clustering according to the Bayesian Information Criterion. At the same time, it allows us to select the optimal number of latent classes. Given the complexity of the ULISSE study, we perform a validation of the results by means of a sensitivity analysis to different specifications of the initial subset of items and of a resampling procedure.
Submitted 15 July, 2014;
originally announced July 2014.
-
Three-step estimation of latent Markov models with covariates
Authors:
Francesco Bartolucci,
Giorgio E. Montanari,
Silvia Pandolfi
Abstract:
We propose a modified version of the three-step estimation method for the latent class model with covariates, which may be used to estimate latent Markov models for longitudinal data. The three-step estimation approach we propose is based on a preliminary clustering of sample units on the basis of the time-specific responses only. This approach represents a useful estimation tool when a large number of response variables are observed at each time occasion. In such a context, full maximum likelihood estimation, which is typically based on the Expectation-Maximization algorithm, may have some drawbacks, essentially due to the presence of many local maxima of the model likelihood. Moreover, the EM algorithm may be particularly slow to converge, and may become unstable with complex LM models. We prove the consistency of the proposed three-step estimator when the number of response variables tends to infinity. We also show the results of a simulation study aimed at evaluating the performance of the proposed alternative approach with respect to the full likelihood method. We finally illustrate an application to a real dataset on the health status of elderly people hosted in Italian nursing homes.
Submitted 5 February, 2014;
originally announced February 2014.
-
A discrete time event-history approach to informative drop-out in multivariate latent Markov models with covariates
Authors:
Francesco Bartolucci,
Alessio Farcomeni
Abstract:
Latent Markov (LM) models represent an important tool of analysis of longitudinal data when response variables are affected by time-varying unobserved heterogeneity, which is accounted for by a hidden Markov chain. In order to avoid bias when using a model of this type in the presence of informative drop-out, we propose an event-history (EH) extension of the LM approach that may be used with multivariate longitudinal data, in which one or more outcomes of a different nature are observed at each time occasion. The EH component of the resulting model is referred to the interval-censored drop-out, and bias in LM modeling is avoided by correlated random effects, included in the different model components, which follow a common Markov chain. In order to perform maximum likelihood estimation of the proposed model by the Expectation-Maximization algorithm, we extend the usual backward-forward recursions of Baum and Welch. The algorithm has the same complexity as the one adopted in cases of non-informative drop-out. Standard errors for the parameter estimates are derived by using Oakes' identity. We illustrate the proposed approach through an application based on data coming from a medical study about primary biliary cirrhosis in which there are two outcomes of interest, the first of which is continuous and the second is binary.
Submitted 7 June, 2013;
originally announced June 2013.
-
Joint Assessment of the Differential Item Functioning and Latent Trait Dimensionality of Students' National Tests
Authors:
Michela Gnaldi,
Francesco Bartolucci,
Silvia Bacci
Abstract:
Within the educational context, students' assessment tests are routinely validated through Item Response Theory (IRT) models which assume unidimensionality and absence of Differential Item Functioning (DIF). In this paper, we investigate if such assumptions hold for two national tests administered in Italy to middle school students in June 2009: the Italian Test and the Mathematics Test. To this aim, we rely on an extended class of multidimensional latent class IRT models characterised by: (i) a two-parameter logistic parameterisation for the conditional probability of a correct response, (ii) latent traits represented through a random vector with a discrete distribution, and (iii) the inclusion of (uniform) DIF to account for students' gender and geographical area. A classification of the items into unidimensional groups is also proposed and represented by a dendrogram, which is obtained from a hierarchical clustering algorithm. The results provide evidence for DIF effects for both Tests. Besides, the assumption of unidimensionality is strongly rejected for the Italian Test, whereas it is reasonable for the Mathematics Test.
Submitted 3 December, 2012;
originally announced December 2012.
-
A causal analysis of mother's education on birth inequalities
Authors:
Silvia Bacci,
Francesco Bartolucci,
Luca Pieroni
Abstract:
We propose a causal analysis of the mother's educational level on the health status of the newborn, in terms of gestational weeks and weight. The analysis is based on a finite mixture structural equation model, the parameters of which have a causal interpretation. The model is applied to a dataset of almost ten thousand deliveries collected in an Italian region. The analysis confirms that standard regression overestimates the impact of education on child health. With respect to the current economic literature, our findings indicate that only high education has positive consequences on child health, implying that policy efforts in education should have benefits for welfare.
Submitted 3 December, 2012;
originally announced December 2012.
-
A multidimensional latent class Rasch model for the assessment of the Health-related Quality of Life
Authors:
Silvia Bacci,
Francesco Bartolucci
Abstract:
The work describes a multidimensional latent class Rasch model and its application to data about the measurement of some aspects of Health-related Quality of Life and Anxiety and Depression in oncological patients.
Submitted 12 November, 2012;
originally announced November 2012.
-
Causal inference in paired two-arm experimental studies under non-compliance with application to prognosis of myocardial infarction
Authors:
F. Bartolucci,
A. Farcomeni
Abstract:
Motivated by a study about prompt coronary angiography in myocardial infarction, we propose a method to estimate the causal effect of a treatment in two-arm experimental studies with possible non-compliance in both treatment and control arms. The method is based on a causal model for repeated binary outcomes (before and after the treatment), which includes individual covariates and latent variables for the unobserved heterogeneity between subjects. Moreover, given the type of non-compliance, the model assumes the existence of three subpopulations of subjects: compliers, never-takers, and always-takers. The model is estimated by a two-step estimator: at the first step the probability that a subject belongs to one of the three subpopulations is estimated on the basis of the available covariates; at the second step the causal effects are estimated through a conditional logistic method, the implementation of which depends on the results from the first step. Standard errors for this estimator are computed on the basis of a sandwich formula. The application shows that prompt coronary angiography in patients with myocardial infarction may significantly decrease the risk of other events within the next two years, with a log-odds of about -2. Given that non-compliance is significant for patients being given the treatment because of high risk conditions, classical estimators fail to detect, or at least underestimate, this effect.
Submitted 24 October, 2012;
originally announced October 2012.
-
MultiLCIRT: An R package for multidimensional latent class item response models
Authors:
Francesco Bartolucci,
Silvia Bacci,
Michela Gnaldi
Abstract:
We illustrate a class of Item Response Theory (IRT) models for binary and ordinal polytomous items and we describe an R package for dealing with these models, which is named MultiLCIRT. The models at issue extend traditional IRT models allowing for (i) multidimensionality and (ii) discreteness of latent traits. This class of models also allows for different parameterizations for the conditional distribution of the response variables given the latent traits, depending on both the type of link function and the constraints imposed on the discriminating and the difficulty item parameters. We illustrate how the proposed class of models may be estimated by the maximum likelihood approach via an Expectation-Maximization algorithm, which is implemented in the MultiLCIRT package, and we discuss in detail issues related to model selection. In order to illustrate this package, we analyze two datasets: one concerning binary items and referred to the measurement of ability in mathematics and the other one coming from the administration of ordinal polytomous items for the assessment of anxiety and depression. In the first application, we illustrate how to aggregate items into homogeneous groups through a model-based hierarchical clustering procedure, which is implemented in the proposed package. In the second application, we describe the steps to select a specific model having the best fit in our class of IRT models.
Submitted 18 October, 2012;
originally announced October 2012.
-
Nested hidden Markov chains for modeling dynamic unobserved heterogeneity in multilevel longitudinal data
Authors:
F. Bartolucci,
M. Lupparelli
Abstract:
In the context of multilevel longitudinal data, where sample units are collected in clusters, an important aspect that should be accounted for is the unobserved heterogeneity between sample units and between clusters. For this aim we propose an approach based on nested hidden (latent) Markov chains, which are associated to every sample unit and to every cluster. The approach allows us to account for the mentioned forms of unobserved heterogeneity in a dynamic fashion; it also allows us to account for the correlation which may arise between the responses provided by the units belonging to the same cluster. Given the complexity in computing the manifest distribution of these response variables, we make inference on the proposed model through a composite likelihood function based on all the possible pairs of subjects within every cluster. The proposed approach is illustrated through an application to a dataset concerning a sample of Italian workers in which a binary response variable for the worker receiving an illness benefit was repeatedly observed.
Submitted 9 August, 2012;
originally announced August 2012.
-
Mixtures of equispaced normal distributions and their use for testing symmetry in univariate data
Authors:
Silvia Bacci,
Francesco Bartolucci
Abstract:
Given a random sample of observations, mixtures of normal densities are often used to estimate the unknown continuous distribution from which the data come. Here we propose the use of this semiparametric framework for testing symmetry about an unknown value. More precisely, we show how the null hypothesis of symmetry may be formulated in terms of a normal mixture model, with weights about the centre of symmetry constrained to be equal to one another. The resulting model is nested in a more general unconstrained one, with the same number of mixture components and free weights. Therefore, after having maximised the constrained and unconstrained log-likelihoods by means of a suitable algorithm, such as the Expectation-Maximisation, symmetry is tested against skewness through a likelihood ratio statistic. The performance of the proposed mixture-based test is illustrated through a Monte Carlo simulation study, where we compare two versions of the test, based on different criteria to select the number of mixture components, with the traditional one based on the third standardised moment. An illustrative example is also given that focuses on real data.
Submitted 20 April, 2012;
originally announced April 2012.
-
Bayesian inference through encompassing priors and importance sampling for a class of marginal models for categorical data
Authors:
Francesco Bartolucci,
Luisa Scaccia,
Alessio Farcomeni
Abstract:
We develop a Bayesian approach for selecting the model which is the most supported by the data within a class of marginal models for categorical variables formulated through equality and/or inequality constraints on generalised logits (local, global, continuation or reverse continuation), generalised log-odds ratios and similar higher-order interactions. For each constrained model, the prior distribution of the model parameters is formulated following the encompassing prior approach. Then, model selection is performed by using Bayes factors which are estimated by an importance sampling method. The approach is illustrated through three applications involving some datasets, which also include explanatory variables. In connection with one of these examples, a sensitivity analysis to the prior specification is also considered.
Submitted 18 February, 2012;
originally announced February 2012.
-
Decomposition of the h-index
Authors:
Francesco Bartolucci
Abstract:
I introduce a decomposition of the h-index, which is nowadays the leading criterion to assess the relevance of a scientist in his/her research field. According to the proposed decomposition, the h-index is the product of two indicators, the first of which measures the impact of the scientist on the research community and the second may be seen as a measure of concentration of the citations in correspondence of a reduced number of papers. The decomposition is illustrated by an application based on data concerning a group of top level economists.
Submitted 9 March, 2012; v1 submitted 30 January, 2012;
originally announced January 2012.
-
A note on the application of the Oakes' identity to obtain the observed information matrix of hidden Markov models
Authors:
F. Bartolucci,
A. Farcomeni,
F. Pennoni
Abstract:
We derive the observed information matrix of hidden Markov models by applying Oakes' (1999) identity. The method only requires the first derivative of the forward-backward recursions of Baum and Welch (1970), instead of the second derivative of the forward recursion, which is required within the approach of Lystig and Hughes (2002). The method is illustrated by an example based on the analysis of a longitudinal dataset which is well known in sociology.
Submitted 28 January, 2012;
originally announced January 2012.
-
A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses
Authors:
Silvia Bacci,
Francesco Bartolucci,
Michela Gnaldi
Abstract:
We propose a class of Item Response Theory models for items with ordinal polytomous responses, which extends an existing class of multidimensional models for dichotomously-scored items measuring more than one latent trait. In the proposed approach, the random vector used to represent the latent traits is assumed to have a discrete distribution with support points corresponding to different latent classes in the population. We also allow for different parameterizations for the conditional distribution of the response variables given the latent traits - such as those adopted in the Graded Response model, in the Partial Credit model, and in the Rating Scale model - depending on both the type of link function and the constraints imposed on the item parameters. For the proposed models we outline how to perform maximum likelihood estimation via the Expectation-Maximization algorithm. Moreover, we suggest a strategy for model selection which is based on a series of steps consisting of selecting specific features, such as the number of latent dimensions, the number of latent classes, and the specific parametrization. In order to illustrate the proposed approach, we analyze data deriving from a study on anxiety and depression as perceived by oncological patients.
Submitted 23 January, 2012;
originally announced January 2012.
-
An alternative to the Baum-Welch recursions for hidden Markov models
Authors:
Francesco Bartolucci
Abstract:
We develop a recursion for hidden Markov models of any order h, which allows us to obtain the posterior distribution of the latent state at every occasion, given the previous h states and the observed data. With respect to the well-known Baum-Welch recursions, the proposed recursion has the advantage of being more direct to use and, in particular, of not requiring dummy renormalizations to avoid numerical problems. We also show how this recursion may be expressed in matrix notation, so as to allow for an efficient implementation, and how it may be used to obtain the manifest distribution of the observed data and for parameter estimation within the Expectation-Maximization algorithm. The approach is illustrated by an application to financial data which is focused on the study of the dynamics of the volatility level of log-returns.
Submitted 31 December, 2011;
originally announced January 2012.
-
Mixture latent autoregressive models for longitudinal data
Authors:
Francesco Bartolucci,
Silvia Bacci,
Fulvia Pennoni
Abstract:
Many relevant statistical and econometric models for the analysis of longitudinal data include a latent process to account for the unobserved heterogeneity between subjects in a dynamic fashion. Such a process may be continuous (typically an AR(1)) or discrete (typically a Markov chain). In this paper, we propose a model for longitudinal data which is based on a mixture of AR(1) processes with different means and correlation coefficients, but with equal variances. This model belongs to the class of models based on a continuous latent process, and thus has a natural interpretation in many contexts of application, but it is more flexible than other models in this class, reaching a goodness-of-fit similar to that of a discrete latent process model, with a reduced number of parameters. We show how to perform maximum likelihood estimation of the proposed model by the joint use of an Expectation-Maximisation algorithm and a Newton-Raphson algorithm, implemented by means of recursions developed in the hidden Markov literature. We also introduce a simple method to obtain standard errors for the parameter estimates and a criterion to choose the number of mixture components. The proposed approach is illustrated by an application to a longitudinal dataset, coming from the Health and Retirement Study, about self-evaluation of the health status by a sample of subjects. In this application, the response variable is ordinal, and both time-constant and time-varying individual covariates are available.
Submitted 6 August, 2011;
originally announced August 2011.
-
Bayesian inference for a class of latent Markov models for categorical longitudinal data
Authors:
Francesco Bartolucci,
Silvia Pandolfi
Abstract:
We propose a Bayesian inference approach for a class of latent Markov models. These models are widely used for the analysis of longitudinal categorical data, when the interest is in studying the evolution of an individual unobservable characteristic. We consider, in particular, the basic latent Markov model, which does not account for individual covariates, and its version that includes such covariates in the measurement model. The proposed inferential approach is based on a system of priors formulated on a transformation of the initial and transition probabilities of the latent Markov chain. This system of priors is equivalent to one based on Dirichlet distributions. In order to draw samples from the joint posterior distribution of the parameters and the number of latent states, we implement a reversible jump algorithm which alternates moves of Metropolis-Hastings type with moves of split/combine and birth/death types. The proposed approach is illustrated through two applications based on longitudinal datasets.
Submitted 4 January, 2011; v1 submitted 2 January, 2011;
originally announced January 2011.
-
An investigation of the discriminant power and dimensionality of items used for assessing health condition of elderly people
Authors:
Francesco Bartolucci,
Giorgio d'Agostino,
Giorgio E. Montanari
Abstract:
With reference to the questionnaire adopted within the Italian project "Ulisse" to assess health condition of elderly people, we investigate two important issues: discriminant power and actual number of dimensions measured by the items composing the questionnaire. The adopted statistical approach is based on the joint use of the latent class model and a multidimensional item response theory model based on the 2PL parametrization. The latter allows us to account for the different discriminant power of these items. The analysis is based on the data collected on a sample of 1699 elderly people hosted in 37 nursing homes in Italy. This analysis shows that the selected items indeed measure a different number of dimensions of the health status and that they considerably differ in terms of discriminant power (effectiveness in measuring the actual health status). Implications for the assessment of the performance of nursing homes from a policy-maker perspective are discussed.
Submitted 19 August, 2010;
originally announced August 2010.
-
A generalized Multiple-try Metropolis version of the Reversible Jump algorithm
Authors:
S. Pandolfi,
F. Bartolucci,
N. Friel
Abstract:
The Reversible Jump algorithm is one of the most widely used Markov chain Monte Carlo algorithms for Bayesian estimation and model selection. A generalized multiple-try version of this algorithm is proposed. The algorithm is based on drawing several proposals at each step and randomly choosing one of them on the basis of weights (selection probabilities) that may be arbitrarily chosen. Among the possible choices, a method is employed which is based on selection probabilities depending on a quadratic approximation of the posterior distribution. Moreover, the implementation of the proposed algorithm for challenging model selection problems, in which the quadratic approximation is not feasible, is considered. The resulting algorithm leads to a gain in efficiency with respect to the Reversible Jump algorithm, and also in terms of computational effort. The performance of this approach is illustrated for real examples involving a logistic regression model and a latent class model.
Submitted 11 October, 2013; v1 submitted 3 June, 2010;
originally announced June 2010.
-
An overview of latent Markov models for longitudinal categorical data
Authors:
F. Bartolucci,
A. Farcomeni,
F. Pennoni
Abstract:
We provide a comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data. The main assumption behind these models is that the response variables are conditionally independent given a latent process which follows a first-order Markov chain. We first illustrate the basic LM model in which the conditional distribution of each response variable given the corresponding latent variable and the initial and transition probabilities of the latent process are unconstrained. For this model we also illustrate in detail maximum likelihood estimation through the Expectation-Maximization algorithm, which may be efficiently implemented by recursions known in the hidden Markov literature. We then illustrate several constrained versions of the basic LM model, which make the model more parsimonious and allow us to include and test hypotheses of interest. These constraints may be put on the conditional distribution of the response variables given the latent process (measurement model) or on the distribution of the latent process (latent model). We also deal with extensions of the LM model to include individual covariates and to handle multilevel data. Covariates may affect the measurement or the latent model; we discuss the implications of these two different approaches according to the context of application. Finally, we outline methods for obtaining standard errors for the parameter estimates, for selecting the number of states and for path prediction. Models and related inference are illustrated by the description of relevant socio-economic applications available in the literature.
Submitted 14 March, 2010;
originally announced March 2010.
-
Assessment of school performance through a multilevel latent Markov Rasch model
Authors:
Francesco Bartolucci,
Fulvia Pennoni,
Giorgio Vittadini
Abstract:
An extension of the latent Markov Rasch model is described for the analysis of binary longitudinal data with covariates when subjects are collected in clusters, e.g. students clustered in classes. For each subject, the latent process is used to represent the characteristic of interest (e.g. ability) conditional on the effect of the cluster to which he/she belongs. The latter effect is modeled by a discrete latent variable associated with each cluster. For the maximum likelihood estimation of the model parameters we outline an EM algorithm. We show how the proposed model may be used for assessing the development of cognitive Math achievement. This approach is applied to the analysis of a dataset collected in the Lombardy Region (Italy) and based on test scores over three years of middle-school students attending public and private schools.
Submitted 28 September, 2009;
originally announced September 2009.
-
Latent Markov model for longitudinal binary data: An application to the performance evaluation of nursing homes
Authors:
Francesco Bartolucci,
Monia Lupparelli,
Giorgio E. Montanari
Abstract:
Performance evaluation of nursing homes is usually accomplished by the repeated administration of questionnaires aimed at measuring the health status of the patients during their period of residence in the nursing home. We illustrate how a latent Markov model with covariates may effectively be used for the analysis of data collected in this way. This model relies on a not directly observable Markov process, whose states represent different levels of the health status. For the maximum likelihood estimation of the model we apply an EM algorithm implemented by means of certain recursions taken from the literature on hidden Markov chains. Of particular interest is the estimation of the effect of each nursing home on the probability of transition between the latent states. We show how the estimates of these effects may be used to construct a set of scores which allows us to rank these facilities in terms of their efficacy in taking care of the health conditions of their patients. The method is used within an application based on data concerning a set of nursing homes located in the Region of Umbria, Italy, which were followed for the period 2003–2005.
Submitted 17 August, 2009;
originally announced August 2009.