Search | arXiv e-print repository

Understanding Adversarial Training with Energy-based Models

Authors: Mujtaba Hussain Mirza, Maria Rosaria Briglia, Filippo Bartolucci, Senad Beadini, Giuseppe Lisanti, Iacopo Masi

Abstract: We aim to use the Energy-based Model (EBM) framework to better understand adversarial training (AT) in classifiers, and additionally to analyze the intrinsic generative capabilities of robust classifiers. By viewing standard classifiers through an energy lens, we begin by analyzing how the energies of adversarial examples, generated by various attacks, differ from those of the natural samples. The central focus of our work is to understand the critical phenomena of Catastrophic Overfitting (CO) and Robust Overfitting (RO) in AT from an energy perspective. We analyze the impact of existing AT approaches on the energy of samples during training and observe that the behavior of the "delta energy" (the change in energy between the original sample and its adversarial counterpart) diverges significantly when CO or RO occurs. After a thorough analysis of these energy dynamics and their relationship with overfitting, we propose a novel regularizer, the Delta Energy Regularizer (DER), designed to smooth the energy landscape during training. We demonstrate that DER is effective in mitigating both CO and RO across multiple benchmarks. We further show that robust classifiers, when used as generative models, have limits in handling the trade-off between image quality and variability. We propose an improved technique based on a local class-wise principal component analysis (PCA) and energy-based guidance for better class-specific initialization and adaptive stopping, enhancing sample diversity and generation quality. Considering that we do not explicitly train for generative modeling, we achieve a competitive Inception Score (IS) and Fréchet inception distance (FID) compared to hybrid discriminative-generative models.
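
The "energy lens" mentioned in this abstract can be illustrated with a small sketch. It assumes the common classifier-as-EBM convention in which the energy of an input is the negative log-sum-exp of its logits; the abstract does not state the formula, so that convention, the sign chosen for the delta energy (adversarial minus natural), and the example logits are all assumptions made for illustration, not the authors' implementation.

```python
import math

def energy(logits):
    """Energy under the assumed convention E(x) = -logsumexp(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))

def delta_energy(logits_natural, logits_adversarial):
    """Assumed sign convention: energy of the adversarial sample minus energy of the natural one."""
    return energy(logits_adversarial) - energy(logits_natural)

# Hypothetical logits for one natural input and an adversarially perturbed copy.
natural = [4.0, 0.5, -1.0]
adversarial = [1.0, 1.2, 0.8]
print(delta_energy(natural, adversarial))
```

Tracking this quantity over training epochs is the kind of diagnostic the abstract describes; how the regularizer uses it is not specified there and is not sketched here.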

Submitted 28 May, 2025; originally announced May 2025.

Comments: Under review for TPAMI

arXiv:2501.17264 [pdf, other]

Constructing Simultaneous Confidence Bands for Errors-in-variables Curves with Application to the Lorenz Curve

Authors: Ziqing Dong, Francesco Bartolucci, Satoshi Kuriki, Antonietta Mira

Abstract: Errors-in-variables curves are curves where errors exist not only in the independent variable but also in the dependent variable. We address the challenge of constructing simultaneous confidence bands (SCBs) for such curves. Our method finds application in the Lorenz curve, which represents the concentration of income or wealth. Unlike ordinary regression curves, the Lorenz curve incorporates errors in its explanatory variable and requires a fundamentally different treatment. To the best of our knowledge, the development of SCBs for such curves has not been explored in previous research. Using the Lorenz curve as a case study, this paper proposes a novel approach to address this challenge.
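
For readers unfamiliar with the Lorenz curve, a minimal sketch of its empirical version may help: sort the incomes, then pair the cumulative population share with the cumulative income share. The function name and the sample data are hypothetical, and the sketch covers only the point estimate of the curve, not the simultaneous confidence bands the paper constructs.

```python
def lorenz_curve(incomes):
    """Return points (population share, income share) of the empirical Lorenz curve."""
    values = sorted(incomes)          # poorest first
    total = sum(values)
    n = len(values)
    points = [(0.0, 0.0)]             # the curve starts at the origin
    running = 0.0
    for k, v in enumerate(values, start=1):
        running += v
        points.append((k / n, running / total))
    return points

# Hypothetical incomes for four individuals.
sample = [40, 10, 30, 20]
for p, share in lorenz_curve(sample):
    print(f"{p:.2f} {share:.2f}")
```

With perfectly equal incomes the points fall on the diagonal; the further the curve sags below it, the more concentrated the income.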

Submitted 28 January, 2025; originally announced January 2025.

Comments: 18 pages, 6 figures

arXiv:2410.14591 [pdf, ps, other]

A Lipschitz spaces view of infinitely wide shallow neural networks

Authors: Francesca Bartolucci, Marcello Carioni, José A. Iglesias, Yury Korolev, Emanuele Naldi, Stefano Vigogna

Abstract: We revisit the mean field parametrization of shallow neural networks, using signed measures on unbounded parameter spaces and duality pairings that take into account the regularity and growth of activation functions. This setting directly leads to the use of unbalanced Kantorovich-Rubinstein norms defined by duality with Lipschitz functions, and of spaces of measures dual to those of continuous functions with controlled growth. These allow us to make transparent the need for total variation and moment bounds or penalization to obtain existence of minimizers of variational formulations, under which we prove a compactness result in strong Kantorovich-Rubinstein norm, and in the absence of which we show several examples demonstrating undesirable behavior. Further, the Kantorovich-Rubinstein setting enables us to combine the advantages of a completely linear parametrization and ensuing reproducing kernel Banach space framework with optimal transport insights. We showcase this synergy with representer theorems and uniform large data limits for empirical risk minimization, and in proposed formulations for distillation and fusion applications.

Submitted 18 October, 2024; originally announced October 2024.

Comments: 39 pages, 1 table

MSC Class: 68T07; 46E27; 46B20

arXiv:2409.17941 [pdf, other]

Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense

Authors: Filippo Bartolucci, Iacopo Masi, Giuseppe Lisanti

Abstract: Image manipulation detection and localization have received considerable attention from the research community given the blooming of Generative Models (GMs). Detection methods that follow a passive approach may overfit to specific GMs, limiting their application in real-world scenarios, due to the growing diversity of generative models. Recently, approaches based on a proactive framework have shown the possibility of dealing with this limitation. However, these methods suffer from two main limitations, which raise concerns about potential vulnerabilities: i) the manipulation detector is not robust to noise and hence can be easily fooled; ii) the fact that they rely on fixed perturbations for image protection offers a predictable exploit for malicious attackers, enabling them to reverse-engineer and evade detection. To overcome these issues we propose PADL, a new solution able to generate image-specific perturbations using a symmetric scheme of encoding and decoding based on cross-attention, which drastically reduces the possibility of reverse engineering, even when evaluated with adaptive attack [31]. Additionally, PADL is able to pinpoint manipulated areas, facilitating the identification of specific regions that have undergone alterations, and has more generalization power than prior art on held-out generative models. Indeed, although trained only on an attribute manipulation GAN model [15], our method generalizes to a range of unseen models with diverse architectural designs, such as StarGANv2, BlendGAN, DiffAE, StableDiffusion and StableDiffusionXL. Additionally, we introduce a novel evaluation protocol, which offers a fair evaluation of localisation performance as a function of detection accuracy and better captures real-world scenarios.

Submitted 26 September, 2024; originally announced September 2024.

arXiv:2403.08750 [pdf, ps, other]

Neural reproducing kernel Banach spaces and representer theorems for deep networks

Authors: Francesca Bartolucci, Ernesto De Vito, Lorenzo Rosasco, Stefano Vigogna

Abstract: Studying the function spaces defined by neural networks helps to understand the corresponding learning models and their inductive bias. While in some limits neural networks correspond to function spaces that are reproducing kernel Hilbert spaces, these regimes do not capture the properties of the networks used in practice. In contrast, in this paper we show that deep neural networks define suitable reproducing kernel Banach spaces. These spaces are equipped with norms that enforce a form of sparsity, enabling them to adapt to potential latent structures within the input data and their representations. In particular, leveraging the theory of reproducing kernel Banach spaces, combined with variational results, we derive representer theorems that justify the finite architectures commonly employed in applications. Our study extends analogous results for shallow networks and can be seen as a step towards considering more practically plausible neural architectures.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2307.03940 [pdf, other]

doi 10.1109/SampTA59647.2023.10301382

Uncovering the limits of uniqueness in sampled Gabor phase retrieval: A dense set of counterexamples in $L^2(\mathbb{R})$

Authors: Rima Alaifari, Francesca Bartolucci, Matthias Wellershoff

Abstract: Sampled Gabor phase retrieval - the problem of recovering a square-integrable signal from the magnitude of its Gabor transform sampled on a lattice - is a fundamental problem in signal processing, with important applications in areas such as imaging and audio processing. Recently, a classification of square-integrable signals which are not phase retrievable from Gabor measurements on parallel lines has been presented. This classification was used to exhibit a family of counterexamples to uniqueness in sampled Gabor phase retrieval. Here, we show that the set of counterexamples to uniqueness in sampled Gabor phase retrieval is dense in $L^2(\mathbb{R})$, but is not equal to the whole of $L^2(\mathbb{R})$ in general. Overall, our work contributes to a better understanding of the fundamental limits of sampled Gabor phase retrieval.

Submitted 8 July, 2023; originally announced July 2023.

Comments: 5 pages, 2 figures

MSC Class: 42C15; 94A12

arXiv:2305.19913 [pdf, other]

Representation Equivalent Neural Operators: a Framework for Alias-free Operator Learning

Authors: Francesca Bartolucci, Emmanuel de Bézenac, Bogdan Raonić, Roberto Molinaro, Siddhartha Mishra, Rima Alaifari

Abstract: Recently, operator learning, or learning mappings between infinite-dimensional function spaces, has garnered significant attention, notably in relation to learning partial differential equations from data. Conceptually clear when outlined on paper, neural operators necessitate discretization in the transition to computer implementations. This step can compromise their integrity, often causing them to deviate from the underlying operators. This research offers a fresh take on neural operators with a framework, Representation equivalent Neural Operators (ReNO), designed to address these issues. At its core is the concept of operator aliasing, which measures inconsistency between neural operators and their discrete representations. We explore this for widely-used operator learning techniques. Our findings detail how aliasing introduces errors when handling different discretizations and grids and leads to loss of crucial continuous structures. More generally, this framework not only sheds light on existing challenges but, given its constructive and broad nature, also potentially offers tools for developing new neural operators.
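
The term "aliasing" echoes a classical effect in signal processing, which the following sketch illustrates: two different continuous signals become indistinguishable once sampled on a grid that is too coarse. This is only an analogy under stated assumptions; the abstract does not define operator aliasing formally, and the function name, frequencies, and sampling rate below are hypothetical choices.

```python
import math

def sample_cosine(freq_hz, rate_hz, n):
    """Sample cos(2*pi*freq*t) at n points spaced 1/rate apart."""
    return [math.cos(2.0 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

# A 9 Hz and a 1 Hz cosine, both sampled at 10 Hz (below the Nyquist rate for 9 Hz).
high = sample_cosine(9.0, 10.0, 10)
low = sample_cosine(1.0, 10.0, 10)

# On this grid the two signals coincide up to rounding error.
max_gap = max(abs(a - b) for a, b in zip(high, low))
print(max_gap)
```

A model that only ever sees such samples cannot tell the two functions apart, which gives some intuition for why a discretized operator may deviate from the continuous one it is meant to represent.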

Submitted 2 November, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

Comments: 28 pages

arXiv:2302.08129 [pdf, other]

Unique wavelet sign retrieval from samples without bandlimiting

Authors: Rima Alaifari, Francesca Bartolucci, Matthias Wellershoff

Abstract: We study the problem of recovering a signal from magnitudes of its wavelet frame coefficients when the analyzing wavelet is real-valued. We show that every real-valued signal can be uniquely recovered, up to global sign, from its multi-wavelet frame coefficients \[ \{\lvert \mathcal{W}_{φ_i} f(α^{m}βn,α^{m}) \rvert: i\in\{1,2,3\}, m,n\in\mathbb{Z}\} \] for every $α>1,β>0$ with $β\ln(α)\leq 4π/(1+4p)$, $p>0$, when the three wavelets $φ_i$ are suitable linear combinations of the Poisson wavelet $P_p$ of order $p$ and its Hilbert transform $\mathscr{H}P_p$. For complex-valued signals we find that this is not possible for any choice of the parameters $α>1,β>0$, and for any window. In contrast to the existing literature on wavelet sign retrieval, our uniqueness results do not require any bandlimiting constraints or other a priori knowledge on the real-valued signals to guarantee their unique recovery from the absolute values of their wavelet coefficients.

Submitted 1 July, 2024; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: 14 pages, 2 figures

arXiv:2302.01178 [pdf, other]

Convolutional Neural Operators for robust and accurate learning of PDEs

Authors: Bogdan Raonić, Roberto Molinaro, Tim De Ryck, Tobias Rohner, Francesca Bartolucci, Rima Alaifari, Siddhartha Mishra, Emmanuel de Bézenac

Abstract: Although very successfully used in conventional machine learning, convolution-based neural network architectures -- believed to be inconsistent in function space -- have been largely ignored in the context of learning solution operators of PDEs. Here, we present novel adaptations for convolutional neural networks to demonstrate that they are indeed able to process functions as inputs and outputs. The resulting architecture, termed convolutional neural operators (CNOs), is designed specifically to preserve its underlying continuous nature, even when implemented in a discretized form on a computer. We prove a universality theorem to show that CNOs can approximate operators arising in PDEs to desired accuracy. CNOs are tested on a novel suite of benchmarks, encompassing a diverse set of PDEs with possibly multi-scale solutions and are observed to significantly outperform baselines, paving the way for an alternative framework for robust and accurate operator learning. Our code is publicly available at https://github.com/bogdanraonic3/ConvolutionalNeuralOperator

Submitted 1 December, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2205.07013 [pdf, other]

doi 10.1007/s43670-023-00079-1

On the connection between uniqueness from samples and stability in Gabor phase retrieval

Authors: Rima Alaifari, Francesca Bartolucci, Stefan Steinerberger, Matthias Wellershoff

Abstract: Gabor phase retrieval is the problem of reconstructing a signal from only the magnitudes of its Gabor transform. Previous findings suggest a possible link between unique solvability of the discrete problem (recovery from measurements on a lattice) and stability of the continuous problem (recovery from measurements on an open subset of $\mathbb{R}^2$). In this paper, we close this gap by proving that such a link cannot be made. More precisely, we establish the existence of functions which break uniqueness from samples without affecting stability of the continuous problem. Furthermore, we prove the novel result that counterexamples to unique recovery from samples are dense in $L^2(\mathbb{R})$. Finally, we develop an intuitive argument on the connection between directions of instability in phase retrieval and certain Laplacian eigenfunctions associated to small eigenvalues.
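
To make the measurement model concrete, the sketch below approximates the Gabor transform (a short-time Fourier transform with a Gaussian window) by a Riemann sum and checks that multiplying a signal by a unimodular constant leaves all magnitudes unchanged, which is why recovery can only ever be expected up to a global phase. The window normalization, the grid, and the test signal are assumptions chosen for illustration and are not taken from the paper.

```python
import cmath
import math

def gabor_magnitudes(signal, times, shifts, freqs):
    """|V f(x, w)| on a grid, with V f(x, w) approximated by a Riemann sum."""
    dt = times[1] - times[0]
    out = []
    for x in shifts:
        for w in freqs:
            acc = 0j
            for f_t, t in zip(signal, times):
                window = math.exp(-math.pi * (t - x) ** 2)   # Gaussian window centred at x
                acc += f_t * window * cmath.exp(-2j * math.pi * w * t)
            out.append(abs(acc * dt))
    return out

times = [-4.0 + 0.05 * k for k in range(161)]
f = [math.exp(-math.pi * t * t) * (1.0 + t) for t in times]   # hypothetical test signal
g = [cmath.exp(0.7j) * v for v in f]                          # same signal, global phase 0.7

grid = [-1.0, 0.0, 1.0]
mags_f = gabor_magnitudes(f, times, grid, grid)
mags_g = gabor_magnitudes(g, times, grid, grid)
gap = max(abs(a - b) for a, b in zip(mags_f, mags_g))
print(gap)
```

The counterexamples studied in the paper are of a different nature: signals that are not related by a global phase yet still share the same sampled magnitudes.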

Submitted 29 November, 2023; v1 submitted 14 May, 2022; originally announced May 2022.

Comments: 37 pages, 6 figures; minor changes to improve readability and correction of typos

MSC Class: 42C15; 94A12

arXiv:2109.09710 [pdf, ps, other]

Understanding neural networks with reproducing kernel Banach spaces

Authors: Francesca Bartolucci, Ernesto De Vito, Lorenzo Rosasco, Stefano Vigogna

Abstract: Characterizing the function spaces corresponding to neural networks can provide a way to understand their properties. In this paper we discuss how the theory of reproducing kernel Banach spaces can be used to tackle this challenge. In particular, we prove a representer theorem for a wide class of reproducing kernel Banach spaces that admit a suitable integral representation and include one-hidden-layer neural networks of possibly infinite width. Further, we show that, for a suitable class of ReLU activation functions, the norm in the corresponding reproducing kernel Banach space can be characterized in terms of the inverse Radon transform of a bounded real measure, with norm given by the total variation norm of the measure. Our analysis simplifies and extends recent results in [34,29,30].

Submitted 26 October, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

arXiv:2108.04338 [pdf, ps, other]

Unitarization of the Horocyclic Radon Transform on Symmetric Spaces

Authors: Francesca Bartolucci, Filippo De Mari, Matteo Monti

Abstract: We consider the Radon transform for a dual pair $(X,Ξ)$, where $X=G/K$ is a noncompact symmetric space and $Ξ$ is the space of horocycles of $X$. We address the unitarization problem that was considered (and solved in some cases) by Helgason, namely the determination of a pseudo-differential operator such that the pre-composition with the Radon transform extends to a unitary operator $\mathcal{Q}\colon L^2(X)\to L_\flat^2(Ξ)$, where $L_\flat^2(Ξ)$ is a closed subspace of $L^2(Ξ)$ which accounts for the Weyl symmetries. Furthermore, we show that the unitary extension intertwines the quasi-regular representations of $G$ on $L^2(X)$ and $L_\flat^2(Ξ)$.

Submitted 9 August, 2021; originally announced August 2021.

arXiv:2106.15948 [pdf, other]

Maximum likelihood estimation of hidden Markov models for continuous longitudinal data with missing responses and dropout

Authors: Silvia Pandolfi, Francesco Bartolucci, Fulvia Pennoni

Abstract: We propose an inferential approach for maximum likelihood estimation of hidden Markov models for continuous responses. We extend to the case of longitudinal observations the finite mixture model of multivariate Gaussian distributions with Missing At Random (MAR) outcomes, also accounting for possible dropout. The resulting hidden Markov model accounts for different types of missing patterns: (i) partially missing outcomes at a given time occasion; (ii) completely missing outcomes at a given time occasion (intermittent pattern); (iii) dropout before the end of the period of observation (monotone pattern). The MAR assumption is formulated to deal with the first two types of missingness, while to account for informative dropout we assume an extra absorbing state. Maximum likelihood estimation of the model parameters is based on an extended Expectation-Maximization algorithm relying on suitable recursions. The proposal is illustrated by a Monte Carlo simulation study and an application based on historical data on primary biliary cholangitis.
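
The role of recursions with missing data can be sketched with the standard forward recursion for a hidden Markov model. The simplification below assumes univariate Gaussian emissions and handles only completely missing responses (the intermittent pattern), where under MAR the emission term is integrated out and contributes a factor of one. It is not the authors' extended algorithm, which also covers partially missing multivariate outcomes and dropout; all names and parameter values are hypothetical.

```python
import math

def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def forward_loglik(obs, init, trans, means, sds):
    """Log-likelihood of a univariate Gaussian HMM; None marks a missing response."""
    k = len(init)

    def emission(y):
        if y is None:
            return [1.0] * k          # missing response: emission density integrates to one
        return [normal_pdf(y, means[j], sds[j]) for j in range(k)]

    loglik = 0.0
    alpha = None
    for t, y in enumerate(obs):
        e = emission(y)
        if t == 0:
            alpha = [init[j] * e[j] for j in range(k)]
        else:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(k)) * e[j]
                     for j in range(k)]
        scale = sum(alpha)            # rescale at each step to avoid underflow
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

# Hypothetical two-state model and a short series with one missing occasion.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
print(forward_loglik([0.1, None, 2.9], init, trans, means=[0.0, 3.0], sds=[1.0, 1.0]))
```

In an EM algorithm this forward pass would be paired with a backward pass to obtain the posterior state probabilities used in the M-step.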

Submitted 30 June, 2021; originally announced June 2021.

arXiv:2106.03811 [pdf, ps, other]

Estimating the size of a closed population by modeling latent and observed heterogeneity

Authors: Antonio Forcina, Francesco Bartolucci

Abstract: The paper describes a new class of capture-recapture models for closed populations when individual covariates are available. The novelty consists in combining a latent class model for the distribution of the capture history, where the class weights and the conditional distributions given the latent may depend on covariates, with a model for the marginal distribution of the available covariates as in \cite{Liu2017}. In addition, any general form of serial dependence is allowed when modeling capture histories conditionally on the latent and covariates. A Fisher-scoring algorithm for maximum likelihood estimation is proposed, and the Implicit Function Theorem is used to show that the mapping between the marginal distribution of the observed covariates and the probabilities of being never captured is one-to-one. Asymptotic results are outlined, and a procedure for constructing likelihood based confidence intervals for the population size is presented. Two examples based on real data are used to illustrate the proposed approach.

Submitted 5 November, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

arXiv:2009.05029 [pdf, ps, other]

doi 10.1016/j.acha.2023.01.002

Phase retrieval of bandlimited functions for the wavelet transform

Authors: Rima Alaifari, Francesca Bartolucci, Matthias Wellershoff

Abstract: We study the recovery of square-integrable signals from the absolute values of their wavelet transforms, also called wavelet phase retrieval. We present a new uniqueness result for wavelet phase retrieval. To be precise, we show that any wavelet with finitely many vanishing moments allows for the unique recovery of real-valued bandlimited signals up to global sign. Additionally, we present the first uniqueness result for sampled wavelet phase retrieval in which the underlying wavelets are allowed to be complex-valued and we present a uniqueness result for phase retrieval from sampled Cauchy wavelet transform measurements.

Submitted 5 November, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 18 pages

arXiv:2005.03505 [pdf, ps, other]

Continuity properties of the shearlet transform and the shearlet synthesis operator on the Lizorkin type spaces

Authors: Francesca Bartolucci, Stevan Pilipović, Nenad Teofanov

Abstract: We develop a distributional framework for the shearlet transform $\mathcal{S}_ψ\colon\mathcal{S}_0(\mathbb{R}^2)\to\mathcal{S}(\mathbb{S})$ and the shearlet synthesis operator $\mathcal{S}^t_ψ\colon\mathcal{S}(\mathbb{S})\to\mathcal{S}_0(\mathbb{R}^2)$, where $\mathcal{S}_0(\mathbb{R}^2)$ is the Lizorkin test function space and $\mathcal{S}(\mathbb{S})$ is the space of highly localized test functions on the standard shearlet group $\mathbb{S}$. These spaces and their duals $\mathcal{S}_0^\prime (\mathbb R^2),\, \mathcal{S}^\prime (\mathbb{S})$ are called Lizorkin type spaces of test functions and distributions. We analyze the continuity properties of these transforms when the admissible vector $ψ$ belongs to $\mathcal{S}_0(\mathbb{R}^2)$. Then, we define the shearlet transform and the shearlet synthesis operator of Lizorkin type distributions as transpose mappings of the shearlet synthesis operator and the shearlet transform, respectively. They yield continuous mappings from $\mathcal{S}_0^\prime (\mathbb R^2)$ to $\mathcal{S}^\prime (\mathbb{S})$ and from $\mathcal{S}^\prime (\mathbb S)$ to $\mathcal{S}_0^\prime (\mathbb{R}^2)$. Furthermore, we show the consistency of our definition with the shearlet transform defined by direct evaluation of a distribution on the shearlets. The same can be done for the shearlet synthesis operator. Finally, we give a reconstruction formula for Lizorkin type distributions, from which it follows that the action of such generalized functions can be written as an absolutely convergent integral over the standard shearlet group.

Submitted 6 May, 2020; originally announced May 2020.

Comments: 22 pages. arXiv admin note: text overlap with arXiv:2003.06642

arXiv:2003.06642 [pdf, ps, other]

The Shearlet Transform and Lizorkin Spaces

Authors: Francesca Bartolucci, Stevan Pilipović, Nenad Teofanov

Abstract: We prove a continuity result for the shearlet transform when restricted to the space of smooth and rapidly decreasing functions with all vanishing moments. We define the dual shearlet transform, called here the shearlet synthesis operator, and we prove its continuity on the space of smooth and rapidly decreasing functions over $\mathbb{R}^2\times\mathbb{R}\times\mathbb{R}^\times$. Then, we use these continuity results to extend the shearlet transform to the space of Lizorkin distributions, and we prove its consistency with the classical definition for test functions.

Submitted 7 May, 2020; v1 submitted 14 March, 2020; originally announced March 2020.

Comments: 17

arXiv:2002.06696 [pdf, ps, other]

Unitarization of the Horocyclic Radon Transform on Homogeneous Trees

Authors: Francesca Bartolucci, Filippo De Mari, Matteo Monti

Abstract: Following previous work in the continuous setup, we construct the unitarization of the horocyclic Radon transform on a homogeneous tree X and we show that it intertwines the quasi regular representations of the group of isometries of X on the tree itself and on the space of horocycles.

Submitted 4 August, 2021; v1 submitted 16 February, 2020; originally announced February 2020.

Comments: 19 pages, 3 figures

arXiv:2002.01165 [pdf, ps, other]

Radon Transform: Dual Pairs and Irreducible Representations

Authors: Giovanni S. Alberti, Francesca Bartolucci, Filippo De Mari, Ernesto De Vito

Abstract: We illustrate the general point of view developed in [SIAM J. Math. Anal., 51(6), 4356-4381] that can be described as a variation of Helgason's theory of dual $G$-homogeneous pairs $(X,Ξ)$ and which allows us to prove intertwining properties and inversion formulae of many existing Radon transforms. Here we analyze in detail one of the important aspects in the theory of dual pairs, namely the injectivity of the map label-to-manifold $ξ\to\hatξ$ and we prove that it is a necessary condition for the irreducibility of the quasi-regular representation of $G$ on $L^2(Ξ)$. We further explain how the theory in [SIAM J. Math. Anal., 51(6), 4356-4381] applies to the classical Radon and X-ray transforms in $\mathbb R^3$.

Submitted 4 February, 2020; originally announced February 2020.

Comments: 27 pages

arXiv:1910.10219 [pdf, other]

Cone-Adapted Shearlets and Radon Transforms

Authors: Francesca Bartolucci, Filippo De Mari, Ernesto De Vito

Abstract: We show that the cone-adapted shearlet coefficients can be computed by means of the limited angle horizontal and vertical (affine) Radon transforms and the one-dimensional wavelet transform. This yields formulas that open new perspectives for the inversion of the Radon transform.

Submitted 16 October, 2019; originally announced October 2019.

Comments: 19 pages, 3 figures

arXiv:1810.12809 [pdf, ps, other]

Unitarization and Inversion Formulae for the Radon Transform between Dual Pairs

Authors: Giovanni S. Alberti, Francesca Bartolucci, Filippo De Mari, Ernesto De Vito

Abstract: We consider the Radon transform associated to dual pairs $(X,Ξ)$ in the sense of Helgason, with $X=G/K$ and $Ξ=G/H$, where $G=\mathbb{R}^d\rtimes K$, $K$ is a closed subgroup of ${\rm GL}(d,\mathbb{R})$ and $H$ is a closed subgroup of $G$. Under some technical assumptions, we prove that if the quasi regular representations of $G$ acting on $L^2(X)$ and $L^2(Ξ)$ are irreducible, then the Radon transform admits a unitarization intertwining the two representations. If, in addition, the representations are square integrable, we provide an inversion formula for the Radon transform based on the voice transform associated to these representations.

Submitted 30 October, 2018; originally announced October 2018.

Comments: 24 pages

arXiv:1810.08778 [pdf, other]

Marginal models with individual-specific effects for the analysis of longitudinal bipartite networks

Authors: Francesco Bartolucci, Antonietta Mira, Stefano Peluso

Abstract: A new modeling framework for bipartite social networks arising from a sequence of partially time-ordered relational events is proposed. We directly model the joint distribution of the binary variables indicating if each single actor is involved or not in an event. The adopted parametrization is based on first- and second-order effects, formulated as in marginal models for categorical data and free higher order effects. In particular, second-order effects are log-odds ratios with meaningful interpretation from the social perspective in terms of tendency to cooperate, in contrast to first-order effects interpreted in terms of tendency of each single actor to participate in an event. These effects are parametrized on the basis of the event times, so that suitable latent trajectories of individual behaviors may be represented. Inference is based on a composite likelihood function, maximized by an algorithm with numerical complexity proportional to the square of the number of units in the network. A classification composite likelihood is used to cluster the actors, simplifying the interpretation of the data structure. The proposed approach is illustrated on a dataset of scientific articles published in four top statistical journals from 2003 to 2012.

Submitted 20 October, 2018; originally announced October 2018.

arXiv:1703.09578 [pdf, ps, other]

Radon transform intertwines shearlets and wavelets

Authors: Francesca Bartolucci, Filippo De Mari, Ernesto De Vito

Abstract: We prove that the unitary affine Radon transform intertwines the quasi-regular representation of a class of semidirect products, built by shearlet dilation groups and translations, and the tensor product of a standard wavelet representation with a wavelet-like representation. This yields a formula for shearlet coefficients that involves only integral transforms applied to the affine Radon transform of the signal, thereby opening new perspectives in the inversion of the Radon transform.

Submitted 28 March, 2017; originally announced March 2017.

Comments: 26 pages, 1 figure

arXiv:1609.06465 [pdf, ps, other]

Evaluation of student proficiency through a multidimensional finite mixture IRT model

Authors: Silvia Bacci, Francesco Bartolucci, Leonardo Grilli, Carla Rampichini

Abstract: In certain academic systems, a student can enroll for an exam immediately after the end of the teaching period or can postpone it to any later examination session, so that the grade is missing until the exam is attempted. We propose an approach for the evaluation in itinere of a student's proficiency accounting also for non-attempted exams. The approach is based on considering each exam as an item, so that responding to the item amounts to attempting the exam, and on an Item Response Theory model that includes two latent variables corresponding to the student's ability and the propensity to attempt the exam. In this way, we explicitly account for non-ignorable missing observations as the indicators of item response also contribute to measure the ability. The two latent variables are assumed to have a discrete distribution defining latent classes of students that are homogeneous in terms of ability and priority assigned to exams. The model, which also allows for individual covariates in its structural part, is fitted by the Expectation-Maximization algorithm. The approach is illustrated through the analysis of data about the first-year exams of freshmen of the School of Economics at the University of Florence (Italy).

Submitted 21 September, 2016; originally announced September 2016.

Comments: 24 pages, 9 tables, 1 figure

arXiv:1506.08278 [pdf, other]

Composite likelihood inference in a discrete latent variable model for two-way "clustering-by-segmentation" problems

Authors: Francesco Bartolucci, Francesca Chiaromonte, Prabhani Kuruppumullage Don, Bruce George Lindsay

Abstract: We consider a discrete latent variable model for two-way data arrays, which allows one to simultaneously produce clusters along one of the data dimensions (e.g. exchangeable observational units or features) and contiguous groups, or segments, along the other (e.g. consecutively ordered times or locations). The model relies on a hidden Markov structure but, given its complexity, cannot be estimated by full maximum likelihood. We therefore introduce composite likelihood methodology based on considering different subsets of the data. The proposed approach is illustrated by simulation, and with an application to genomic data.

Submitted 27 June, 2015; originally announced June 2015.

arXiv:1501.04448 [pdf, ps, other]

LMest: an R package for latent Markov models for categorical longitudinal data

Authors: Francesco Bartolucci, Alessio Farcomeni, Silvia Pandolfi, Fulvia Pennoni

Abstract: Latent Markov (LM) models represent an important class of models for the analysis of longitudinal data (Bartolucci et al., 2013), especially when response variables are categorical. These models have a great potential of application for the analysis of social, medical, and behavioral data as well as in other disciplines. We propose the R package LMest, which is tailored to deal with these types of model. In particular, we consider a general framework for extended LM models by including individual covariates and by formulating a mixed approach to take into account additional dependence structures in the data. Such extensions lead to a very flexible class of models, which allows us to fit different types of longitudinal data. Model parameters are estimated through the expectation-maximization algorithm, based on the forward-backward recursions, which is implemented in the main functions of the package. The package also allows us to perform local and global decoding and to obtain standard errors for the parameter estimates. We illustrate its use and the most important features on the basis of examples involving applications in health and criminology.

Submitted 19 January, 2015; originally announced January 2015.

arXiv:1410.4856 [pdf, other]

A multidimensional latent class IRT model for non-ignorable missing responses

Authors: Silvia Bacci, Francesco Bartolucci

Abstract: We propose a structural equation model, which reduces to a multidimensional latent class item response theory model, for the analysis of binary item responses with non-ignorable missingness. The missingness mechanism is driven by two sets of latent variables: one describing the propensity to respond and the other referred to the abilities measured by the test items. These latent variables are assumed to have a discrete distribution, so as to reduce the number of parametric assumptions regarding the latent structure of the model. Individual covariates may also be included through a multinomial logistic parametrization of the probabilities of each support point of the distribution of the latent variables. Given the discrete nature of this distribution, the proposed model is efficiently estimated by the Expectation-Maximization algorithm. A simulation study is performed to evaluate the finite sample properties of the parameter estimates. Moreover, an application is illustrated to data coming from a Students' Entry Test for the admission to some university courses.

Submitted 17 October, 2014; originally announced October 2014.

arXiv:1408.2319 [pdf, ps, other]

A multilevel finite mixture item response model to cluster examinees and schools

Authors: Michela Gnaldi, Silvia Bacci, Francesco Bartolucci

Abstract: Within the educational context, a key goal is to assess students' acquired skills and to cluster students according to their ability level. In this regard, a relevant element to be accounted for is the possible effect of the school students come from. For this aim, we provide a methodological tool which takes into account the multilevel structure of the data (i.e., students in schools) in a suitable way. This approach allows us to cluster both students and schools into homogeneous classes of ability and effectiveness, and to assess the effect of certain students and school characteristics on the probability to belong to such classes. The approach relies on an extended class of multidimensional latent class IRT models characterized by: (i) latent traits defined at student level and at school level, (ii) latent traits represented through random vectors with a discrete distribution, (iii) the inclusion of covariates at student level and at school level, and (iv) a two-parameter logistic parametrization for the conditional probability of a correct response given the ability. The approach is applied for the analysis of data collected by two national tests administered in Italy to middle school students in June 2009: the INVALSI Italian Test and Mathematics Test. Results allow us to study the relationships between observed characteristics and latent trait standing within each latent class at the different levels of the hierarchy. They show that examinees and school expected observed scores, at a given latent trait level, are dependent on both unobserved (latent class) group membership and observed first and second level covariates.

Submitted 11 August, 2014; originally announced August 2014.

Comments: 17 pages, original article. arXiv admin note: text overlap with arXiv:1212.0378

arXiv:1407.3912 [pdf, ps, other]

Item selection by Latent Class-based methods

Authors: Francesco Bartolucci, Giorgio E. Montanari, Silvia Pandolfi

Abstract: The evaluation of nursing homes is usually based on the administration of questionnaires made of a large number of polytomous items. In such a context, the Latent Class (LC) model represents a useful tool for clustering subjects in homogeneous groups corresponding to different degrees of impairment of the health conditions. It is known that the performance of model-based clustering and the accuracy of the choice of the number of latent classes may be affected by the presence of irrelevant or noise variables. In this paper, we show the application of an item selection algorithm to real data collected within a project, named ULISSE, on the quality-of-life of elderly patients hosted in Italian nursing homes. This algorithm, which is closely related to that proposed by Dean and Raftery in 2010, is aimed at finding the subset of items which provides the best clustering according to the Bayesian Information Criterion. At the same time, it allows us to select the optimal number of latent classes. Given the complexity of the ULISSE study, we perform a validation of the results by means of a sensitivity analysis to different specifications of the initial subset of items and of a resampling procedure.

Submitted 15 July, 2014; originally announced July 2014.

arXiv:1402.1033 [pdf, ps, other]

Three-step estimation of latent Markov models with covariates

Authors: Francesco Bartolucci, Giorgio E. Montanari, Silvia Pandolfi

Abstract: We propose a modified version of the three-step estimation method for the latent class model with covariates, which may be used to estimate latent Markov models for longitudinal data. The three-step estimation approach we propose is based on a preliminary clustering of sample units on the basis of the time specific responses only. This approach represents a useful estimation tool when a large number of response variables are observed at each time occasion. In such a context, full maximum likelihood estimation, which is typically based on the Expectation-Maximization algorithm, may have some drawbacks, essentially due to the presence of many local maxima of the model likelihood. Moreover, the EM algorithm may be particularly slow to converge, and may become unstable with complex LM models. We prove the consistency of the proposed three-step estimator when the number of response variables tends to infinity. We also show the results of a simulation study aimed at evaluating the performance of the proposed alternative approach with respect to the full likelihood method. We finally illustrate an application to a real dataset on the health status of elderly people hosted in Italian nursing homes.

Submitted 5 February, 2014; originally announced February 2014.

arXiv:1306.1678 [pdf, ps, other]

A discrete time event-history approach to informative drop-out in multivariate latent Markov models with covariates

Authors: Francesco Bartolucci, Alessio Farcomeni

Abstract: Latent Markov (LM) models represent an important tool of analysis of longitudinal data when response variables are affected by time-varying unobserved heterogeneity, which is accounted for by a hidden Markov chain. In order to avoid bias when using a model of this type in the presence of informative drop-out, we propose an event-history (EH) extension of the LM approach that may be used with multivariate longitudinal data, in which one or more outcomes of a different nature are observed at each time occasion. The EH component of the resulting model is referred to the interval-censored drop-out, and bias in LM modeling is avoided by correlated random effects, included in the different model components, which follow a common Markov chain. In order to perform maximum likelihood estimation of the proposed model by the Expectation-Maximization algorithm, we extend the usual backward-forward recursions of Baum and Welch. The algorithm has the same complexity as the one adopted in cases of non-informative drop-out. Standard errors for the parameter estimates are derived by using the Oakes' identity. We illustrate the proposed approach through an application based on data coming from a medical study about primary biliary cirrhosis in which there are two outcomes of interest, the first of which is continuous and the second is binary.

Submitted 7 June, 2013; originally announced June 2013.

arXiv:1212.0378 [pdf, other]

Joint Assessment of the Differential Item Functioning and Latent Trait Dimensionality of Students' National Tests

Authors: Michela Gnaldi, Francesco Bartolucci, Silvia Bacci

Abstract: Within the educational context, students' assessment tests are routinely validated through Item Response Theory (IRT) models which assume unidimensionality and absence of Differential Item Functioning (DIF). In this paper, we investigate if such assumptions hold for two national tests administered in Italy to middle school students in June 2009: the Italian Test and the Mathematics Test. To this aim, we rely on an extended class of multidimensional latent class IRT models characterised by: (i) a two-parameter logistic parameterisation for the conditional probability of a correct response, (ii) latent traits represented through a random vector with a discrete distribution, and (iii) the inclusion of (uniform) DIF to account for students' gender and geographical area. A classification of the items into unidimensional groups is also proposed and represented by a dendrogram, which is obtained from a hierarchical clustering algorithm. The results provide evidence for DIF effects for both Tests. Besides, the assumption of unidimensionality is strongly rejected for the Italian Test, whereas it is reasonable for the Mathematics Test.

Submitted 3 December, 2012; originally announced December 2012.

Comments: 30 pages, 3 figures, 11 tables

arXiv:1212.0372 [pdf, other]

A causal analysis of mother's education on birth inequalities

Authors: Silvia Bacci, Francesco Bartolucci, Luca Pieroni

Abstract: We propose a causal analysis of the mother's educational level on the health status of the newborn, in terms of gestational weeks and weight. The analysis is based on a finite mixture structural equation model, the parameters of which have a causal interpretation. The model is applied to a dataset of almost ten thousand deliveries collected in an Italian region. The analysis confirms that standard regression overestimates the impact of education on the child health. With respect to the current economic literature, our findings indicate that only high education has positive consequences on child health, implying that policy efforts in education should have benefits for welfare.

Submitted 3 December, 2012; originally announced December 2012.

Comments: 37 pages, 10 tables, 4 figures

arXiv:1211.2635 [pdf, ps, other]

A multidimensional latent class Rasch model for the assessment of the Health-related Quality of Life

Authors: Silvia Bacci, Francesco Bartolucci

Abstract: The work describes a multidimensional latent class Rasch model and its application to data about the measurement of some aspects of Health-related Quality of Life and Anxiety and Depression in oncological patients.

Submitted 12 November, 2012; originally announced November 2012.

Comments: In press as contributed chapter on Christensen K.B., Kreiner S., Mesbah M. (eds.), Rasch related models and methods for health science, Wiley-ISTE

arXiv:1210.6678 [pdf, ps, other]

Causal inference in paired two-arm experimental studies under non-compliance with application to prognosis of myocardial infarction

Authors: F. Bartolucci, A. Farcomeni

Abstract: Motivated by a study about prompt coronary angiography in myocardial infarction, we propose a method to estimate the causal effect of a treatment in two-arm experimental studies with possible non-compliance in both treatment and control arms. The method is based on a causal model for repeated binary outcomes (before and after the treatment), which includes individual covariates and latent variables for the unobserved heterogeneity between subjects. Moreover, given the type of non-compliance, the model assumes the existence of three subpopulations of subjects: compliers, never-takers, and always-takers. The model is estimated by a two-step estimator: at the first step the probability that a subject belongs to one of the three subpopulations is estimated on the basis of the available covariates; at the second step the causal effects are estimated through a conditional logistic method, the implementation of which depends on the results from the first step. Standard errors for this estimator are computed on the basis of a sandwich formula. The application shows that prompt coronary angiography in patients with myocardial infarction may significantly decrease the risk of other events within the next two years, with a log-odds of about -2. Given that non-compliance is significant for patients being given the treatment because of high risk conditions, classical estimators fail to detect, or at least underestimate, this effect.

Submitted 24 October, 2012; originally announced October 2012.

arXiv:1210.5267 [pdf, other]

MultiLCIRT: An R package for multidimensional latent class item response models

Authors: Francesco Bartolucci, Silvia Bacci, Michela Gnaldi

Abstract: We illustrate a class of Item Response Theory (IRT) models for binary and ordinal polytomous items and we describe an R package for dealing with these models, which is named MultiLCIRT. The models at issue extend traditional IRT models allowing for (i) multidimensionality and (ii) discreteness of latent traits. This class of models also allows for different parameterizations for the conditional distribution of the response variables given the latent traits, depending on both the type of link function and the constraints imposed on the discriminating and the difficulty item parameters. We illustrate how the proposed class of models may be estimated by the maximum likelihood approach via an Expectation-Maximization algorithm, which is implemented in the MultiLCIRT package, and we discuss in detail issues related to model selection. In order to illustrate this package, we analyze two datasets: one concerning binary items and referred to the measurement of ability in mathematics and the other one coming from the administration of ordinal polytomous items for the assessment of anxiety and depression. In the first application, we illustrate how to aggregate items into homogeneous groups through a model-based hierarchical clustering procedure, which is implemented in the proposed package. In the second application, we describe the steps to select a specific model having the best fit in our class of IRT models.

Submitted 18 October, 2012; originally announced October 2012.

Comments: 36 pages, 1 figure, 4 tables. arXiv admin note: substantial text overlap with arXiv:1201.4667

arXiv:1208.1864 [pdf, ps, other]

Nested hidden Markov chains for modeling dynamic unobserved heterogeneity in multilevel longitudinal data

Authors: F. Bartolucci, M. Lupparelli

Abstract: In the context of multilevel longitudinal data, where sample units are collected in clusters, an important aspect that should be accounted for is the unobserved heterogeneity between sample units and between clusters. For this aim we propose an approach based on nested hidden (latent) Markov chains, which are associated to every sample unit and to every cluster. The approach allows us to account for the mentioned forms of unobserved heterogeneity in a dynamic fashion; it also allows us to account for the correlation which may arise between the responses provided by the units belonging to the same cluster. Given the complexity in computing the manifest distribution of these response variables, we make inference on the proposed model through a composite likelihood function based on all the possible pairs of subjects within every cluster. The proposed approach is illustrated through an application to a dataset concerning a sample of Italian workers in which a binary response variable for the worker receiving an illness benefit was repeatedly observed.

Submitted 9 August, 2012; originally announced August 2012.

arXiv:1204.4544 [pdf, other]

Mixtures of equispaced normal distributions and their use for testing symmetry in univariate data

Authors: Silvia Bacci, Francesco Bartolucci

Abstract: Given a random sample of observations, mixtures of normal densities are often used to estimate the unknown continuous distribution from which the data come. Here we propose the use of this semiparametric framework for testing symmetry about an unknown value. More precisely, we show how the null hypothesis of symmetry may be formulated in terms of a normal mixture model, with weights about the centre of symmetry constrained to be equal to one another. The resulting model is nested in a more general unconstrained one, with the same number of mixture components and free weights. Therefore, after having maximised the constrained and unconstrained log-likelihoods by means of a suitable algorithm, such as the Expectation-Maximisation, symmetry is tested against skewness through a likelihood ratio statistic. The performance of the proposed mixture-based test is illustrated through a Monte Carlo simulation study, where we compare two versions of the test, based on different criteria to select the number of mixture components, with the traditional one based on the third standardised moment. An illustrative example is also given that focuses on real data.

Submitted 20 April, 2012; originally announced April 2012.

Comments: 16 pages, 3 figures, 7 tables

arXiv:1202.4074 [pdf, ps, other]

Bayesian inference through encompassing priors and importance sampling for a class of marginal models for categorical data

Authors: Francesco Bartolucci, Luisa Scaccia, Alessio Farcomeni

Abstract: We develop a Bayesian approach for selecting the model which is the most supported by the data within a class of marginal models for categorical variables formulated through equality and/or inequality constraints on generalised logits (local, global, continuation or reverse continuation), generalised log-odds ratios and similar higher-order interactions. For each constrained model, the prior distribution of the model parameters is formulated following the encompassing prior approach. Then, model selection is performed by using Bayes factors which are estimated by an importance sampling method. The approach is illustrated through three applications involving some datasets, which also include explanatory variables. In connection with one of these examples, a sensitivity analysis to the prior specification is also considered.

Submitted 18 February, 2012; originally announced February 2012.

arXiv:1201.6179

Decomposition of the h-index

Authors: Francesco Bartolucci

Abstract: I introduce a decomposition of the h-index, which is nowadays the leading criterion to assess the relevance of a scientist in his/her research field. According to the proposed decomposition, the h-index is the product of two indicators, the first of which measures the impact of the scientist on the research community and the second may be seen as a measure of concentration of the citations in correspondence of a reduced number of papers. The decomposition is illustrated by an application based on data concerning a group of top level economists.

Submitted 9 March, 2012; v1 submitted 30 January, 2012; originally announced January 2012.

Comments: Transformed into a letter to the Editor of the "Journal of the American Society for Information"

arXiv:1201.5990 [pdf, ps, other]

A note on the application of the Oakes' identity to obtain the observed information matrix of hidden Markov models

Authors: F. Bartolucci, A. Farcomeni, F. Pennoni

Abstract: We derive the observed information matrix of hidden Markov models by applying the identity of Oakes (1999). The method only requires the first derivative of the forward-backward recursions of Baum and Welch (1970), instead of the second derivative of the forward recursion, which is required within the approach of Lystig and Hughes (2002). The method is illustrated by an example based on the analysis of a longitudinal dataset which is well known in sociology.

Submitted 28 January, 2012; originally announced January 2012.

arXiv:1201.4667 [pdf, ps, other]

A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses

Authors: Silvia Bacci, Francesco Bartolucci, Michela Gnaldi

Abstract: We propose a class of Item Response Theory models for items with ordinal polytomous responses, which extends an existing class of multidimensional models for dichotomously-scored items measuring more than one latent trait. In the proposed approach, the random vector used to represent the latent traits is assumed to have a discrete distribution with support points corresponding to different latent classes in the population. We also allow for different parameterizations for the conditional distribution of the response variables given the latent traits - such as those adopted in the Graded Response model, in the Partial Credit model, and in the Rating Scale model - depending on both the type of link function and the constraints imposed on the item parameters. For the proposed models we outline how to perform maximum likelihood estimation via the Expectation-Maximization algorithm. Moreover, we suggest a strategy for model selection which is based on a series of steps consisting of selecting specific features, such as the number of latent dimensions, the number of latent classes, and the specific parametrization. In order to illustrate the proposed approach, we analyze data deriving from a study on anxiety and depression as perceived by oncological patients.

Submitted 23 January, 2012; originally announced January 2012.

Comments: 25 pages; 10 tables

arXiv:1201.0277 [pdf, ps, other]

An alternative to the Baum-Welch recursions for hidden Markov models

Authors: Francesco Bartolucci

Abstract: We develop a recursion for hidden Markov models of any order h, which allows us to obtain the posterior distribution of the latent state at every occasion, given the previous h states and the observed data. With respect to the well-known Baum-Welch recursions, the proposed recursion has the advantage of being more direct to use and, in particular, of not requiring dummy renormalizations to avoid numerical problems. We also show how this recursion may be expressed in matrix notation, so as to allow for an efficient implementation, and how it may be used to obtain the manifest distribution of the observed data and for parameter estimation within the Expectation-Maximization algorithm. The approach is illustrated by an application to financial data which is focused on the study of the dynamics of the volatility level of log-returns.

Submitted 31 December, 2011; originally announced January 2012.

arXiv:1108.1498 [pdf, ps, other]

Mixture latent autoregressive models for longitudinal data

Authors: Francesco Bartolucci, Silvia Bacci, Fulvia Pennoni

Abstract: Many relevant statistical and econometric models for the analysis of longitudinal data include a latent process to account for the unobserved heterogeneity between subjects in a dynamic fashion. Such a process may be continuous (typically an AR(1)) or discrete (typically a Markov chain). In this paper, we propose a model for longitudinal data which is based on a mixture of AR(1) processes with different means and correlation coefficients, but with equal variances. This model belongs to the class of models based on a continuous latent process, and then it has a natural interpretation in many contexts of application, but it is more flexible than other models in this class, reaching a goodness-of-fit similar to that of a discrete latent process model, with a reduced number of parameters. We show how to perform maximum likelihood estimation of the proposed model by the joint use of an Expectation-Maximisation algorithm and a Newton-Raphson algorithm, implemented by means of recursions developed in the hidden Markov literature. We also introduce a simple method to obtain standard errors for the parameter estimates and a criterion to choose the number of mixture components. The proposed approach is illustrated by an application to a longitudinal dataset, coming from the Health and Retirement Study, about self-evaluation of the health status by a sample of subjects. In this application, the response variable is ordinal and time-constant and time-varying individual covariates are available.

Submitted 6 August, 2011; originally announced August 2011.

Comments: Submitted

arXiv:1101.0391 [pdf, ps, other]

Bayesian inference for a class of latent Markov models for categorical longitudinal data

Authors: Francesco Bartolucci, Silvia Pandolfi

Abstract: We propose a Bayesian inference approach for a class of latent Markov models. These models are widely used for the analysis of longitudinal categorical data, when the interest is in studying the evolution of an individual unobservable characteristic. We consider, in particular, the basic latent Markov, which does not account for individual covariates, and its version that includes such covariates in the measurement model. The proposed inferential approach is based on a system of priors formulated on a transformation of the initial and transition probabilities of the latent Markov chain. This system of priors is equivalent to one based on Dirichlet distributions. In order to draw samples from the joint posterior distribution of the parameters and the number of latent states, we implement a reversible jump algorithm which alternates moves of Metropolis-Hastings type with moves of split/combine and birth/death types. The proposed approach is illustrated through two applications based on longitudinal datasets.

Submitted 4 January, 2011; v1 submitted 2 January, 2011; originally announced January 2011.

arXiv:1008.3268 [pdf, other]

An investigation of the discriminant power and dimensionality of items used for assessing health condition of elderly people

Authors: Francesco Bartolucci, Giorgio d'Agostino, Giorgio E. Montanari

Abstract: With reference to the questionnaire adopted within the Italian project "Ulisse" to assess health condition of elderly people, we investigate two important issues: discriminant power and actual number of dimensions measured by the items composing the questionnaire. The adopted statistical approach is based on the joint use of the latent class model and a multidimensional item response theory model based on the 2PL parametrization. The latter allows us to account for the different discriminant power of these items. The analysis is based on the data collected on a sample of 1699 elderly people hosted in 37 nursing homes in Italy. This analysis shows that the selected items indeed measure a different number of dimensions of the health status and that they considerably differ in terms of discriminant power (effectiveness in measuring the actual health status). Implications for the assessment of the performance of nursing homes from a policy-maker perspective are discussed.

Submitted 19 August, 2010; originally announced August 2010.

arXiv:1006.0621 [pdf, ps, other]

A generalized Multiple-try Metropolis version of the Reversible Jump algorithm

Authors: S. Pandolfi, F. Bartolucci, N. Friel

Abstract: The Reversible Jump algorithm is one of the most widely used Markov chain Monte Carlo algorithms for Bayesian estimation and model selection. A generalized multiple-try version of this algorithm is proposed. The algorithm is based on drawing several proposals at each step and randomly choosing one of them on the basis of weights (selection probabilities) that may be arbitrarily chosen. Among the possible choices, a method is employed which is based on selection probabilities depending on a quadratic approximation of the posterior distribution. Moreover, the implementation of the proposed algorithm for challenging model selection problems, in which the quadratic approximation is not feasible, is considered. The resulting algorithm leads to a gain in efficiency with respect to the Reversible Jump algorithm, and also in terms of computational effort. The performance of this approach is illustrated for real examples involving a logistic regression model and a latent class model.

Submitted 11 October, 2013; v1 submitted 3 June, 2010; originally announced June 2010.

arXiv:1003.2804 [pdf, ps, other]

An overview of latent Markov models for longitudinal categorical data

Authors: F. Bartolucci, A. Farcomeni, F. Pennoni

Abstract: We provide a comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data. The main assumption behind these models is that the response variables are conditionally independent given a latent process which follows a first-order Markov chain. We first illustrate the basic LM model in which the conditional distribution of each response variable given the corresponding latent variable and the initial and transition probabilities of the latent process are unconstrained. For this model we also illustrate in detail maximum likelihood estimation through the Expectation-Maximization algorithm, which may be efficiently implemented by recursions known in the hidden Markov literature. We then illustrate several constrained versions of the basic LM model, which make the model more parsimonious and allow us to include and test hypotheses of interest. These constraints may be put on the conditional distribution of the response variables given the latent process (measurement model) or on the distribution of the latent process (latent model). We also deal with extensions of the LM model for the inclusion of individual covariates and to multilevel data. Covariates may affect the measurement or the latent model; we discuss the implications of these two different approaches according to the context of application. Finally, we outline methods for obtaining standard errors for the parameter estimates, for selecting the number of states and for path prediction. Models and related inference are illustrated by the description of relevant socio-economic applications available in the literature.

Submitted 14 March, 2010; originally announced March 2010.

arXiv:0909.4961 [pdf, ps, other]

doi 10.3102/1076998610381396

Assessment of school performance through a multilevel latent Markov Rasch model

Authors: Francesco Bartolucci, Fulvia Pennoni, Giorgio Vittadini

Abstract: An extension of the latent Markov Rasch model is described for the analysis of binary longitudinal data with covariates when subjects are collected in clusters, e.g. students clustered in classes. For each subject, the latent process is used to represent the characteristic of interest (e.g. ability) conditional on the effect of the cluster to which he/she belongs. The latter effect is modeled by a discrete latent variable associated with each cluster. For the maximum likelihood estimation of the model parameters we outline an EM algorithm. We show how the proposed model may be used for assessing the development of cognitive Math achievement. This approach is applied to the analysis of a dataset collected in the Lombardy Region (Italy) and based on test scores over three years of middle-school students attending public and private schools.

Submitted 28 September, 2009; originally announced September 2009.

arXiv:0908.2300 [pdf, ps, other]

doi 10.1214/08-AOAS230

Latent Markov model for longitudinal binary data: An application to the performance evaluation of nursing homes

Authors: Francesco Bartolucci, Monia Lupparelli, Giorgio E. Montanari

Abstract: Performance evaluation of nursing homes is usually accomplished by the repeated administration of questionnaires aimed at measuring the health status of the patients during their period of residence in the nursing home. We illustrate how a latent Markov model with covariates may effectively be used for the analysis of data collected in this way. This model relies on a not directly observable Markov process, whose states represent different levels of the health status. For the maximum likelihood estimation of the model we apply an EM algorithm implemented by means of certain recursions taken from the literature on hidden Markov chains. Of particular interest is the estimation of the effect of each nursing home on the probability of transition between the latent states. We show how the estimates of these effects may be used to construct a set of scores which allows us to rank these facilities in terms of their efficacy in taking care of the health conditions of their patients. The method is used within an application based on data concerning a set of nursing homes located in the Region of Umbria, Italy, which were followed for the period 2003--2005.

Submitted 17 August, 2009; originally announced August 2009.

Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AOAS230

Report number: IMS-AOAS-AOAS230

Journal ref: Annals of Applied Statistics 2009, Vol. 3, No. 2, 611-636

Showing 1–50 of 54 results for author: Bartolucci, F