Sharp-SSL: Selective high-dimensional axis-aligned random projections for semi-supervised learning

Authors: Tengyao Wang, Edgar Dobriban, Milana Gataric, Richard J. Samworth

Abstract: We propose a new method for high-dimensional semi-supervised learning problems based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data. Our primary goal is to identify important variables for distinguishing between the classes; existing low-dimensional methods can then be applied for final class assignment. Motivated by a generalized Rayleigh quotient, we score projections according to the traces of the estimated whitened between-class covariance matrices on the projected data. This enables us to assign an importance weight to each variable for a given projection, and to select our signal variables by aggregating these weights over high-scoring projections. Our theory shows that the resulting Sharp-SSL algorithm is able to recover the signal coordinates with high probability when we aggregate over sufficiently many random projections and when the base procedure estimates the whitened between-class covariance matrix sufficiently well. The Gaussian EM algorithm is a natural choice as a base procedure, and we provide a new analysis of its performance in semi-supervised settings that controls the parameter estimation error in terms of the proportion of labeled data in the sample. Numerical results on both simulated data and a real colon tumor dataset support the excellent empirical performance of the method.

Submitted 18 April, 2023; originally announced April 2023.

Comments: 49 pages, 4 figures

MSC Class: 62H30

arXiv:2002.08724 [pdf, ps, other]

High-resolution signal recovery via generalized sampling and functional principal component analysis

Authors: Milana Gataric

Abstract: In this paper, we introduce a computational framework for recovering a high-resolution approximation of an unknown function from its low-resolution indirect measurements as well as high-resolution training observations by merging the frameworks of generalized sampling and functional principal component analysis. In particular, we increase the signal resolution via a data driven approach, which models the function of interest as a realization of a random field and leverages a training set of observations generated via the same underlying random process. We study the performance of the resulting estimation procedure and show that high-resolution recovery is indeed possible provided appropriate low-rank and angle conditions hold and provided the training set is sufficiently large relative to the desired resolution. Moreover, we show that the size of the training set can be reduced by leveraging sparse representations of the functional principal components. Furthermore, the effectiveness of the proposed reconstruction procedure is illustrated by various numerical examples.

Submitted 14 October, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

Comments: Accepted for publication in Springer's Advances in Computational Mathematics

arXiv:1804.10636 [pdf, other]

doi 10.1109/TMI.2018.2875875

Reconstruction of optical vector-fields with applications in endoscopic imaging

Authors: Milana Gataric, George S. D. Gordon, Francesco Renna, Alberto Gil C. P. Ramos, Maria P. Alcolea, Sarah E. Bohndiek

Abstract: We introduce a framework for the reconstruction of the amplitude, phase and polarisation of an optical vector-field using calibration measurements acquired by an imaging device with an unknown linear transformation. By incorporating effective regularisation terms, this new approach is able to recover an optical vector-field with respect to an arbitrary representation system, which may be different from the one used in calibration. In particular, it enables the recovery of an optical vector-field with respect to a Fourier basis, which is shown to yield indicative features of increased scattering associated with tissue abnormalities. We demonstrate the effectiveness of our approach using synthetic holographic images as well as biological tissue samples in an experimental setting where measurements of an optical vector-field are acquired by a fibre endoscope, and observe that indeed the recovered Fourier coefficients are useful in distinguishing healthy tissues from lesions in early stages of oesophageal cancer.

Submitted 18 July, 2018; v1 submitted 27 April, 2018; originally announced April 2018.

arXiv:1712.05630 [pdf, other]

Sparse principal component analysis via axis-aligned random projections

Authors: Milana Gataric, Tengyao Wang, Richard J. Samworth

Abstract: We introduce a new method for sparse principal component analysis, based on the aggregation of eigenvector information from carefully-selected axis-aligned random projections of the sample covariance matrix. Unlike most alternative approaches, our algorithm is non-iterative, so is not vulnerable to a bad choice of initialisation. We provide theoretical guarantees under which our principal subspace estimator can attain the minimax optimal rate of convergence in polynomial time. In addition, our theory provides a more refined understanding of the statistical and computational trade-off in the problem of sparse principal component estimation, revealing a subtle interplay between the effective sample size and the number of random projections that are required to achieve the minimax optimal rate. Numerical studies provide further insight into the procedure and confirm its highly competitive finite-sample performance.

Submitted 6 May, 2019; v1 submitted 15 December, 2017; originally announced December 2017.

Comments: 32 pages

MSC Class: 62H25

arXiv:1606.07698 [pdf, other]

Computing reconstructions from nonuniform Fourier samples: Universality of stability barriers and stable sampling rates

Authors: Ben Adcock, Milana Gataric, José Luis Romero

Abstract: We study the problem of recovering an unknown compactly-supported multivariate function from samples of its Fourier transform that are acquired nonuniformly, i.e. not necessarily on a uniform Cartesian grid. Reconstruction problems of this kind arise in various imaging applications, where Fourier samples are taken along radial lines or spirals for example. Specifically, we consider finite-dimensional reconstructions, where a limited number of samples is available, and investigate the rate of convergence of such approximate solutions and their numerical stability. We show that the proportion of Fourier samples that allow for stable approximations of a given numerical accuracy is independent of the specific sampling geometry and is therefore universal for different sampling scenarios. This allows us to relate both sufficient and necessary conditions for different sampling setups and to exploit several results that were previously available only for very specific sampling geometries. The results are obtained by developing: (i) a transference argument for different measures of the concentration of the Fourier transform and Fourier samples; (ii) frame bounds valid up to the critical sampling density, which depend explicitly on the sampling set and the spectrum. As an application, we identify sufficient and necessary conditions for stable and accurate reconstruction of algebraic polynomials or wavelet coefficients from nonuniform Fourier data.

Submitted 8 May, 2017; v1 submitted 24 June, 2016; originally announced June 2016.

Comments: 24 pages

MSC Class: 94A20; 94A12; 94A08; 42B05; 42C15; 65T40

Journal ref: Applied and Computational Harmonic Analysis, 46(2): 226-249, 2019

arXiv:1505.05308 [pdf, other]

A practical guide to the recovery of wavelet coefficients from Fourier measurements

Authors: Milana Gataric, Clarice Poon

Abstract: In a series of recent papers (Adcock, Hansen and Poon, 2013, Appl. Comput. Harm. Anal. 45(5):3132-3167), (Adcock, Gataric and Hansen, 2014, SIAM J. Imaging Sci. 7(3):1690-1723) and (Adcock, Hansen, Kutyniok and Ma, 2015, SIAM J. Math. Anal. 47(2):1196-1233), it was shown that one can optimally recover the wavelet coefficients of an unknown compactly supported function from pointwise evaluations of its Fourier transform via the method of generalized sampling. While these papers focused on the optimality of generalized sampling in terms of its stability and error bounds, the current paper explains how this optimal method can be implemented to yield a computationally efficient algorithm. In particular, we show that generalized sampling has a computational complexity of $\mathcal{O}(M(N)\log N)$ when recovering the first $N$ boundary-corrected wavelet coefficients of an unknown compactly supported function from $M(N)$ Fourier samples. Therefore, due to the linear correspondences between the number of samples $M$ and number of coefficients $N$ shown previously, generalized sampling offers a computationally optimal way of recovering wavelet coefficients from Fourier data.

Submitted 6 March, 2016; v1 submitted 20 May, 2015; originally announced May 2015.

arXiv:1411.0300 [pdf, other]

Density theorems for nonuniform sampling of bandlimited functions using derivatives or bunched measurements

Authors: Ben Adcock, Milana Gataric, Anders C. Hansen

Abstract: We provide a sufficient density condition for a set of nonuniform samples to give rise to a set of sampling for multivariate bandlimited functions when the measurements consist of pointwise evaluations of a function and its first $k$ derivatives. Along with explicit estimates of corresponding frame bounds, we derive the explicit density bound and show that, as $k$ increases, it grows linearly in $k+1$ with the constant of proportionality $1/\mathrm{e}$. Seeking larger gap conditions, we also prove a multivariate perturbation result for nonuniform samples that are sufficiently close to sets of sampling, e.g. to uniform samples taken at $k+1$ times the Nyquist rate. Additionally, in the univariate setting, we consider a related problem of so-called nonuniform bunched sampling, where in each sampling interval $s+1$ bunched measurements of a function are taken and the sampling intervals are permitted to be of different length. We derive an explicit density condition which grows linearly in $s+1$ for large $s$, with the constant of proportionality depending on the width of the bunches. The width of the bunches is allowed to be arbitrarily small, and moreover, for sufficiently narrow bunches and sufficiently large $s$, we obtain the same result as in the case of univariate sampling with $s$ derivatives.

Submitted 9 September, 2016; v1 submitted 2 November, 2014; originally announced November 2014.

arXiv:1410.0088 [pdf, ps, other]

Recovering piecewise smooth functions from nonuniform Fourier measurements

Authors: Ben Adcock, Milana Gataric, Anders C. Hansen

Abstract: In this paper, we consider the problem of reconstructing piecewise smooth functions to high accuracy from nonuniform samples of their Fourier transform. We use the framework of nonuniform generalized sampling (NUGS) to do this, and to ensure high accuracy we employ reconstruction spaces consisting of splines or (piecewise) polynomials. We analyze the relation between the dimension of the reconstruction space and the bandwidth of the nonuniform samples, and show that it is linear for splines and piecewise polynomials of fixed degree, and quadratic for piecewise polynomials of varying degree.

Submitted 30 September, 2014; originally announced October 2014.

arXiv:1405.3111 [pdf, other]

Weighted frames of exponentials and stable recovery of multidimensional functions from nonuniform Fourier samples

Authors: Ben Adcock, Milana Gataric, Anders C. Hansen

Abstract: In this paper, we consider the problem of recovering a compactly supported multivariate function from a collection of pointwise samples of its Fourier transform taken nonuniformly. We do this by using the concept of weighted Fourier frames. A seminal result of Beurling shows that sample points give rise to a classical Fourier frame provided they are relatively separated and of sufficient density. However, this result does not allow for arbitrary clustering of sample points, as is often the case in practice. Whilst keeping the density condition sharp and dimension independent, our first result removes the separation condition and shows that density alone suffices. However, this result does not lead to estimates for the frame bounds. A known result of Groechenig provides explicit estimates, but only subject to a density condition that deteriorates linearly with dimension. In our second result we improve these bounds by reducing the dimension dependence. In particular, we provide explicit frame bounds which are dimensionless for functions having compact support contained in a sphere. Next, we demonstrate how our two main results give new insight into a reconstruction algorithm---based on the existing generalized sampling framework---that allows for stable and quasi-optimal reconstruction in any particular basis from a finite collection of samples. Finally, we construct sufficiently dense sampling schemes that are often used in practice---jittered, radial and spiral sampling schemes---and provide several examples illustrating the effectiveness of our approach when tested on these schemes.

Submitted 6 September, 2015; v1 submitted 13 May, 2014; originally announced May 2014.

arXiv:1310.7820 [pdf, ps, other]

On stable reconstructions from nonuniform Fourier measurements

Authors: Ben Adcock, Milana Gataric, Anders C. Hansen

Abstract: We consider the problem of recovering a compactly-supported function from a finite collection of pointwise samples of its Fourier transform taken nonuniformly. First, we show that under suitable conditions on the sampling frequencies - specifically, their density and bandwidth - it is possible to recover any such function $f$ in a stable and accurate manner in any given finite-dimensional subspace; in particular, one which is well suited for approximating $f$. In practice, this is carried out using so-called nonuniform generalized sampling (NUGS). Second, we consider approximation spaces in one dimension consisting of compactly supported wavelets. We prove that a linear scaling of the dimension of the space with the sampling bandwidth is both necessary and sufficient for stable and accurate recovery. Thus wavelets are, up to constant factors, optimal spaces for reconstruction.

Submitted 7 April, 2014; v1 submitted 29 October, 2013; originally announced October 2013.

Showing 1–10 of 10 results for author: Gataric, M