-
On relative universality, regression operator, and conditional independence
Authors:
Bing Li,
Ben Jones,
Andreas Artemiou
Abstract:
The notion of relative universality with respect to a σ-field was introduced to establish the unbiasedness and Fisher consistency of an estimator in nonlinear sufficient dimension reduction. However, there is a gap in the proof of this result in the existing literature. The existing definition of relative universality seems to be too strong for the proof to be valid. In this note we modify the definition of relative universality using the concept of ε-measurability, and rigorously establish the mentioned unbiasedness and Fisher consistency. The significance of this result extends beyond its original context of sufficient dimension reduction, because relative universality allows us to use the regression operator to fully characterize conditional independence, a crucially important statistical relation that sits at the core of many areas and methodologies in statistics and machine learning, such as dimension reduction, graphical models, probability embedding, causal inference, and Bayesian estimation.
Submitted 15 April, 2025;
originally announced April 2025.
-
The R package psvmSDR: A Unified Algorithm for Sufficient Dimension Reduction via Principal Machines
Authors:
Jungmin Shin,
Seung Jun Shin,
Andreas Artemiou
Abstract:
Sufficient dimension reduction (SDR), which seeks a lower-dimensional subspace of the predictors that contains the regression or classification information, has been popular in the machine learning community. In this work, we present a new R software package, psvmSDR, that implements a new class of SDR estimators, which we call the principal machine (PM), generalized from the principal support vector machine (PSVM). The package covers both linear and nonlinear SDR and provides a function applicable to real-time update scenarios. The package implements the descent algorithm for the PMs to efficiently compute the SDR estimators in various situations. This easy-to-use package will be an attractive alternative to the dr R package, which implements classical SDR methods.
Submitted 4 September, 2024; v1 submitted 2 September, 2024;
originally announced September 2024.
-
A transportable hyperspectral imaging setup based on fast, high-density spectral scanning for in situ quantitative biochemical mapping of fresh tissue biopsies
Authors:
Luca Giannoni,
Marta Marradi,
Kevin Scibilia,
Ivan Ezhov,
Camilla Bonaudo,
Angelos Artemiou,
Anam Toaha,
Frederic Lange,
Charly Caredda,
Bruno Montcel,
Alessandro Della Puppa,
Ilias Tachtsidis,
Daniel Ruckert,
Francesco Saverio Pavone
Abstract:
Histopathological examination of surgical biopsies, such as in glioma and glioblastoma resection, is hindered in current clinical practice by the long time required for laboratory analysis and pathological screening, which typically take several days or even weeks to complete. We propose here a transportable, high-density, spectral-scanning-based hyperspectral imaging setup, named HyperProbe1, that can provide in situ, fast biochemical analysis and mapping of fresh surgical tissue samples, right after excision, and without the need for fixing or staining. HyperProbe1 is based on spectral scanning via supercontinuum laser illumination filtered with acousto-optic tuneable filters. Such methodology allows the user to select any number (up to 79) and type of wavelength bands in the visible and near-infrared range between 510 and 900 nm, and to reconstruct 3D hypercubes composed of high-resolution, widefield images of the surgical samples, where each pixel is associated with a complete spectrum. The system is applied to 11 fresh surgical biopsies of glioma from routine patients, including different grades of tumour classification. Quantitative analysis of the composition of the tissue is performed via fast spectral unmixing to reconstruct maps of major biomarkers. We also provide a preliminary attempt to infer tumour classification based on differences of composition in the samples, suggesting the possibility of using lipid content and differential cytochrome-c-oxidase concentrations to distinguish between lower and higher grade gliomas. A proof of concept of the performance of HyperProbe1 for quantitative, biochemical mapping of surgical biopsies is demonstrated, paving the way for improving current post-surgical, histopathological practice via non-destructive, in situ streamlined screening of fresh tissue samples in a matter of minutes after excision.
Submitted 31 May, 2024;
originally announced May 2024.
-
Structure-preserving non-linear PCA for matrices
Authors:
Joni Virta,
Andreas Artemiou
Abstract:
We propose MNPCA, a novel non-linear generalization of (2D)$^2$PCA, a classical linear method for the simultaneous dimension reduction of both rows and columns of a set of matrix-valued data. MNPCA is based on optimizing over separate non-linear mappings on the left and right singular spaces of the observations, essentially amounting to the decoupling of the two sides of the matrices. We develop a comprehensive theoretical framework for MNPCA by viewing it as an eigenproblem in reproducing kernel Hilbert spaces. We study the resulting estimators on both population and sample levels, deriving their convergence rates and formulating a coordinate representation to allow the method to be used in practice. Simulations and a real data example demonstrate MNPCA's good performance over its competitors.
Submitted 10 October, 2023;
originally announced October 2023.
-
Poisson PCA for matrix count data
Authors:
Joni Virta,
Andreas Artemiou
Abstract:
We develop a dimension reduction framework for data consisting of matrices of counts. Our model is based on assuming the existence of a small number of independent normal latent variables that drive the dependency structure of the observed data, and can be seen as the exact discrete analogue of a contaminated low-rank matrix normal model. We derive estimators for the model parameters and establish their root-$n$ consistency. An extension of a recent proposal from the literature is used to estimate the latent dimension of the model. Additionally, a sparsity-accommodating variant of the model is considered. The method is shown to surpass both its vectorization-based competitors and matrix methods assuming the continuity of the data distribution in analysing simulated data and real abundance data.
Submitted 27 October, 2021;
originally announced October 2021.
-
Principal support vector machines for linear and nonlinear sufficient dimension reduction
Authors:
Bing Li,
Andreas Artemiou,
Lexin Li
Abstract:
We introduce a principal support vector machine (PSVM) approach that can be used for both linear and nonlinear sufficient dimension reduction. The basic idea is to divide the response variables into slices and use a modified form of support vector machine to find the optimal hyperplanes that separate them. These optimal hyperplanes are then aligned by the principal components of their normal vectors. It is proved that the aligned normal vectors provide an unbiased, $\sqrt{n}$-consistent, and asymptotically normal estimator of the sufficient dimension reduction space. The method is then generalized to nonlinear sufficient dimension reduction using the reproducing kernel Hilbert space. In that context, the aligned normal vectors become functions and it is proved that they are unbiased in the sense that they are functions of the true nonlinear sufficient predictors. We compare PSVM with other sufficient dimension reduction methods by simulation and in real data analysis, and through both comparisons firmly establish its practical advantages.
Submitted 13 March, 2012;
originally announced March 2012.
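The linear PSVM idea summarized in the abstract above (slice the response, fit a separating hyperplane per cutoff, then take principal components of the normal vectors) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation and not the psvmSDR package's API: the function name `linear_psvm_sketch`, the objective form (a covariance-weighted penalty plus an averaged hinge loss), the plain subgradient descent solver, and all default parameter values are assumptions chosen for brevity.

```python
import numpy as np


def linear_psvm_sketch(X, y, n_slices=5, lam=1.0, n_iter=500, step=0.1):
    """Hypothetical sketch of linear PSVM.

    Returns eigenvalues and eigenvectors (descending order) of the matrix
    built from the fitted hyperplane normals, one normal per cutoff.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)          # centre the predictors
    sigma = Xc.T @ Xc / n            # sample covariance of X

    # Cutoffs between slices: interior quantiles of the response.
    cutoffs = np.quantile(y, np.arange(1, n_slices) / n_slices)

    M = np.zeros((p, p))
    for q in cutoffs:
        label = np.where(y > q, 1.0, -1.0)   # two-class split at this cutoff
        psi, t = np.zeros(p), 0.0
        for k in range(n_iter):
            margin = label * (Xc @ psi - t)
            active = margin < 1.0            # points with nonzero hinge loss
            # Subgradient of: psi' Sigma psi + lam * mean(hinge loss).
            g_psi = 2.0 * sigma @ psi - lam * (Xc[active].T @ label[active]) / n
            g_t = lam * label[active].sum() / n
            lr = step / np.sqrt(k + 1.0)     # assumed decaying step size
            psi -= lr * g_psi
            t -= lr * g_t
        M += np.outer(psi, psi)              # accumulate normal vectors

    # Principal components of the normals: leading eigenvectors of M.
    evals, evecs = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]
```

Under these assumptions, the first d columns of the returned eigenvector matrix would serve as the estimated basis of the dimension reduction space; a quadratic programming solver could replace the subgradient loop without changing the overall structure.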