-
Stein Discrepancy for Unsupervised Domain Adaptation
Authors:
Anneke von Seeger,
Dongmian Zou,
Gilad Lerman
Abstract:
Unsupervised domain adaptation (UDA) leverages information from a labeled source dataset to improve accuracy on a related but unlabeled target dataset. A common approach to UDA is aligning representations from the source and target domains by minimizing the distance between their data distributions. Previous methods have employed distances such as Wasserstein distance and maximum mean discrepancy. However, these approaches are less effective when the target data is significantly scarcer than the source data. Stein discrepancy is an asymmetric distance between distributions that relies on one distribution only through its score function. In this paper, we propose a novel UDA method that uses Stein discrepancy to measure the distance between source and target domains. We develop a learning framework using both non-kernelized and kernelized Stein discrepancy. Theoretically, we derive an upper bound for the generalization error. Numerical experiments show that our method outperforms existing methods using other domain discrepancy measures when only small amounts of target data are available.
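The asymmetry mentioned in the abstract can be illustrated with a minimal sketch of a kernelized Stein discrepancy (KSD) estimate in one dimension. This is not the paper's UDA training procedure; it is the standard V-statistic KSD estimator with an RBF kernel, and the names `ksd_vstat`, `score`, and the bandwidth `h` are assumptions made for illustration. One distribution enters only through its score function, the other only through samples.

```python
import numpy as np

def ksd_vstat(samples, score, h=1.0):
    """V-statistic estimate of the squared KSD in 1D with an RBF kernel.

    samples: 1D array drawn from the distribution q
    score:   function returning d/dx log p(x) for the reference density p
    h:       kernel bandwidth (illustrative default)
    """
    x = np.asarray(samples, dtype=float)
    d = x[:, None] - x[None, :]               # pairwise differences x_i - x_j
    k = np.exp(-d**2 / (2 * h**2))            # RBF kernel matrix
    s = score(x)
    sx, sy = s[:, None], s[None, :]
    dkx = -d / h**2 * k                       # derivative of k in its first argument
    dky = d / h**2 * k                        # derivative of k in its second argument
    dkxy = (1 / h**2 - d**2 / h**4) * k       # mixed second derivative
    u = sx * sy * k + sx * dky + sy * dkx + dkxy   # Stein kernel u_p(x_i, x_j)
    return u.mean()

rng = np.random.default_rng(0)
score = lambda x: -x                          # score of the standard normal N(0, 1)
ksd_match = ksd_vstat(rng.normal(0, 1, 300), score)   # samples match the score
ksd_shift = ksd_vstat(rng.normal(2, 1, 300), score)   # samples are shifted away
```

Only samples from one side are needed, which is the property the abstract relies on when data from one domain is scarce.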
Submitted 21 February, 2025; v1 submitted 5 February, 2025;
originally announced February 2025.
-
Theoretical Guarantees for the Subspace-Constrained Tyler's Estimator
Authors:
Gilad Lerman,
Feng Yu,
Teng Zhang
Abstract:
This work analyzes the subspace-constrained Tyler's estimator (STE) designed for recovering a low-dimensional subspace within a dataset that may be highly corrupted with outliers. It assumes a weak inlier-outlier model and allows the fraction of inliers to be smaller than a fraction that leads to computational hardness of the robust subspace recovery problem. It shows that in this setting, if the initialization of STE, which is an iterative algorithm, satisfies a certain condition, then STE can effectively recover the underlying subspace. It further shows that under the generalized haystack model, STE initialized by Tyler's M-estimator (TME) can recover the subspace when the fraction of inliers is too small for TME to handle.
Submitted 12 April, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
Improved Convergence Rates of Windowed Anderson Acceleration for Symmetric Fixed-Point Iterations
Authors:
Casey Garner,
Gilad Lerman,
Teng Zhang
Abstract:
This paper studies the commonly utilized windowed Anderson acceleration (AA) algorithm for fixed-point methods, $x^{(k+1)}=q(x^{(k)})$. It provides the first proof that when the operator $q$ is linear and symmetric the windowed AA, which uses a sliding window of prior iterates, improves the root-linear convergence factor over the fixed-point iterations. When $q$ is nonlinear, yet has a symmetric Jacobian at a fixed point, a slightly modified AA algorithm is proved to have an analogous root-linear convergence factor improvement over fixed-point iterations. Simulations verify our observations. Furthermore, experiments with different data models demonstrate AA is significantly superior to the standard fixed-point methods for Tyler's M-estimation.
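As a rough illustration of the sliding-window idea, the following is a minimal sketch of the standard windowed AA scheme for $x^{(k+1)}=q(x^{(k)})$, applied to a linear symmetric operator. It is not the slightly modified algorithm analyzed in the paper; the function name `anderson`, the window size, the tolerance, and the test operator `A`, `b` are all assumptions chosen for illustration.

```python
import numpy as np

def anderson(q, x0, window=3, iters=50, tol=1e-12):
    """Windowed Anderson acceleration for the fixed-point problem x = q(x).

    Keeps at most `window` differences of past iterates and residuals.
    Returns the final iterate and the number of iterations used.
    """
    x = np.asarray(x0, dtype=float)
    G_hist, F_hist = [], []            # q(x_k) and residuals f_k = q(x_k) - x_k
    for k in range(iters):
        gx = q(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            return x, k
        G_hist.append(gx)
        F_hist.append(f)
        G_hist = G_hist[-(window + 1):]    # sliding window of prior iterates
        F_hist = F_hist[-(window + 1):]
        if len(F_hist) == 1:
            x = gx                         # plain fixed-point step to start
        else:
            dF = np.column_stack([F_hist[i + 1] - F_hist[i]
                                  for i in range(len(F_hist) - 1)])
            dG = np.column_stack([G_hist[i + 1] - G_hist[i]
                                  for i in range(len(G_hist) - 1)])
            # Least-squares combination of residual differences
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - dG @ gamma
    return x, iters

# Hypothetical linear, symmetric test operator q(x) = A x + b
A = np.diag([0.9, 0.5, 0.1])
b = np.ones(3)
q = lambda x: A @ x + b
x_star = np.linalg.solve(np.eye(3) - A, b)   # exact fixed point
x_aa, k_aa = anderson(q, np.zeros(3))
```

For comparison, the plain iteration on this operator contracts by a factor of 0.9 per step, so it needs many more iterations to reach the same tolerance.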
Submitted 8 March, 2024; v1 submitted 4 November, 2023;
originally announced November 2023.
-
Robust Group Synchronization via Quadratic Programming
Authors:
Yunpeng Shi,
Cole Wyeth,
Gilad Lerman
Abstract:
We propose a novel quadratic programming formulation for estimating the corruption levels in group synchronization, and use these estimates to solve this problem. Our objective function exploits the cycle consistency of the group and we thus refer to our method as detection and estimation of structural consistency (DESC). This general framework can be extended to other algebraic and geometric structures. Our formulation has the following advantages: it can tolerate corruption as high as the information-theoretic bound, it does not require a good initialization for the estimates of group elements, it has a simple interpretation, and under some mild conditions the global minimum of our objective function exactly recovers the corruption levels. We demonstrate the competitive accuracy of our approach on both synthetic and real data experiments of rotation averaging.
Submitted 17 June, 2022;
originally announced June 2022.
-
An Unpooling Layer for Graph Generation
Authors:
Yinglong Guo,
Dongmian Zou,
Gilad Lerman
Abstract:
We propose a novel and trainable graph unpooling layer for effective graph generation. Given a graph with features, the unpooling layer enlarges this graph and learns its desired new structure and features. Since this unpooling layer is trainable, it can be applied to graph generation either in the decoder of a variational autoencoder or in the generator of a generative adversarial network (GAN). We prove that the unpooled graph remains connected and any connected graph can be sequentially unpooled from a 3-node graph. We apply the unpooling layer within the GAN generator. Since the most studied instance of graph generation is molecular generation, we test our ideas in this context. Using the QM9 and ZINC datasets, we demonstrate the improvement obtained by using the unpooling layer instead of an adjacency-matrix-based approach.
Submitted 5 March, 2023; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Fast, Accurate and Memory-Efficient Partial Permutation Synchronization
Authors:
Shaohan Li,
Yunpeng Shi,
Gilad Lerman
Abstract:
Previous partial permutation synchronization (PPS) algorithms, which are commonly used for multi-object matching, often involve computation-intensive and memory-demanding matrix operations. These operations become intractable for large scale structure-from-motion datasets. For pure permutation synchronization, the recent Cycle-Edge Message Passing (CEMP) framework suggests a memory-efficient and fast solution. Here we overcome the restriction of CEMP to compact groups and propose an improved algorithm, CEMP-Partial, for estimating the corruption levels of the observed partial permutations. It allows us to subsequently implement a nonconvex weighted projected power method without the need of spectral initialization. The resulting new PPS algorithm, MatchFAME (Fast, Accurate and Memory-Efficient Matching), only involves sparse matrix operations, and thus enjoys lower time and space complexities in comparison to previous PPS algorithms. We prove that under adversarial corruption, though without additive noise and with certain assumptions, CEMP-Partial is able to exactly classify corrupted and clean partial permutations. We demonstrate the state-of-the-art accuracy, speed and memory efficiency of our method on both synthetic and real datasets.
Submitted 31 March, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Ensemble Riemannian Data Assimilation over the Wasserstein Space
Authors:
Sagar K. Tamang,
Ardeshir Ebtehaj,
Peter J. Van Leeuwen,
Dongmian Zou,
Gilad Lerman
Abstract:
In this paper, we present an ensemble data assimilation paradigm over a Riemannian manifold equipped with the Wasserstein metric. Unlike the Eulerian penalization of error in the Euclidean space, the Wasserstein metric can capture translation and difference between the shapes of square-integrable probability distributions of the background state and observations, which makes it possible to formally penalize geophysical biases in state-space with non-Gaussian distributions. The new approach is applied to dissipative and chaotic evolutionary dynamics and its potential advantages and limitations are highlighted compared to the classic variational and filtering data assimilation approaches under systematic and random errors.
Submitted 24 March, 2021; v1 submitted 7 September, 2020;
originally announced September 2020.
-
Message Passing Least Squares Framework and its Application to Rotation Synchronization
Authors:
Yunpeng Shi,
Gilad Lerman
Abstract:
We propose an efficient algorithm for solving group synchronization under high levels of corruption and noise, while we focus on rotation synchronization. We first describe our recent theoretically guaranteed message passing algorithm that estimates the corruption levels of the measured group ratios. We then propose a novel reweighted least squares method to estimate the group elements, where the weights are initialized and iteratively updated using the estimated corruption levels. We demonstrate the superior performance of our algorithm over state-of-the-art methods for rotation synchronization using both synthetic and real data.
Submitted 14 August, 2020; v1 submitted 27 July, 2020;
originally announced July 2020.
-
Robust Multi-object Matching via Iterative Reweighting of the Graph Connection Laplacian
Authors:
Yunpeng Shi,
Shaohan Li,
Gilad Lerman
Abstract:
We propose an efficient and robust iterative solution to the multi-object matching problem. We first clarify serious limitations of current methods as well as the inappropriateness of the standard iteratively reweighted least squares procedure. In view of these limitations, we suggest a novel and more reliable iterative reweighting strategy that incorporates information from higher-order neighborhoods by exploiting the graph connection Laplacian. We demonstrate the superior performance of our procedure over state-of-the-art methods using both synthetic and real datasets.
Submitted 24 October, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Novelty Detection via Robust Variational Autoencoding
Authors:
Chieh-Hsin Lai,
Dongmian Zou,
Gilad Lerman
Abstract:
We propose a new method for novelty detection that can tolerate high corruption of the training points, whereas previous works assumed either no or very low corruption. Our method trains a robust variational autoencoder (VAE), which aims to generate a model for the uncorrupted training points. To gain robustness to high corruption, we incorporate the following four changes to the common VAE: 1. Extracting crucial features of the latent code by a carefully designed dimension reduction component for distributions; 2. Modeling the latent distribution as a mixture of Gaussian low-rank inliers and full-rank outliers, where the testing only uses the inlier model; 3. Applying the Wasserstein-1 metric for regularization, instead of the Kullback-Leibler (KL) divergence; and 4. Using a robust error for reconstruction. We establish both robustness to outliers and suitability to low-rank modeling of the Wasserstein metric as opposed to the KL divergence. We illustrate state-of-the-art results on standard benchmarks.
Submitted 1 March, 2023; v1 submitted 9 June, 2020;
originally announced June 2020.
-
Regularized Variational Data Assimilation for Bias Treatment using the Wasserstein Metric
Authors:
Sagar K. Tamang,
Ardeshir Ebtehaj,
Dongmian Zou,
Gilad Lerman
Abstract:
This paper presents a new variational data assimilation (VDA) approach for the formal treatment of bias in both model outputs and observations. This approach relies on the Wasserstein metric stemming from the theory of optimal mass transport to penalize the distance between the probability histograms of the analysis state and an a priori reference dataset, which is likely to be more uncertain but less biased than both model and observations. Unlike previous bias-aware VDA approaches, the new Wasserstein metric VDA (WM-VDA) dynamically treats systematic biases of unknown magnitude and sign in both model and observations through assimilation of the reference data in the probability domain and can fully recover the probability histogram of the analysis state. The performance of WM-VDA is compared with the classic three-dimensional VDA (3D-Var) scheme on first-order linear dynamics and the chaotic Lorenz attractor. Under positive systematic biases in both model and observations, we consistently demonstrate a significant reduction in the forecast bias and unbiased root mean squared error.
Submitted 4 March, 2020;
originally announced March 2020.
-
Robust Group Synchronization via Cycle-Edge Message Passing
Authors:
Gilad Lerman,
Yunpeng Shi
Abstract:
We propose a general framework for solving the group synchronization problem, where we focus on the setting of adversarial or uniform corruption and sufficiently small noise. Specifically, we apply a novel message passing procedure that uses cycle consistency information in order to estimate the corruption levels of group ratios and consequently solve the synchronization problem in our setting. We first explain why the group cycle consistency information is essential for effectively solving group synchronization problems. We then establish exact recovery and linear convergence guarantees for the proposed message passing procedure under a deterministic setting with adversarial corruption. These guarantees hold as long as the ratio of corrupted cycles per edge is bounded by a reasonable constant. We also establish the stability of the proposed procedure to sub-Gaussian noise. We further establish exact recovery with high probability under a common uniform corruption model.
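To make the role of cycle consistency concrete, here is a small sketch for planar rotations (angles in SO(2)). It is a simplification and not the paper's message passing procedure: it computes only a uniform average of 3-cycle inconsistencies per edge, without the iterative reweighting. The helper names `wrap` and `avg_cycle_inconsistency`, the graph size, and the corruption of 1.0 radian are assumptions for illustration.

```python
import numpy as np
from itertools import combinations

def wrap(a):
    """Map an angle (or array of angles) to the interval [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

n = 6
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, n)          # ground-truth group elements
R = wrap(theta[:, None] - theta[None, :])      # clean group ratios theta_i - theta_j

# Corrupt a single edge (0, 1) by 1.0 radian, keeping R antisymmetric
R[0, 1] = wrap(R[0, 1] + 1.0)
R[1, 0] = -R[0, 1]

def avg_cycle_inconsistency(R, i, j):
    """Average deviation from the identity over all 3-cycles through edge (i, j)."""
    n = R.shape[0]
    vals = [abs(wrap(R[i, j] + R[j, k] + R[k, i]))
            for k in range(n) if k not in (i, j)]
    return float(np.mean(vals))

scores = {(i, j): avg_cycle_inconsistency(R, i, j)
          for i, j in combinations(range(n), 2)}
worst = max(scores, key=scores.get)            # edge with the largest score
```

Every 3-cycle through the corrupted edge is inconsistent, while cycles made only of clean edges compose to the identity, so the corrupted edge receives the largest score in this noiseless toy setting.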
Submitted 27 July, 2021; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Robust Subspace Recovery with Adversarial Outliers
Authors:
Tyler Maunu,
Gilad Lerman
Abstract:
We study the problem of robust subspace recovery (RSR) in the presence of adversarial outliers. That is, we seek a subspace that contains a large portion of a dataset when some fraction of the data points are arbitrarily corrupted. We first examine a theoretical estimator that is intractable to calculate and use it to derive information-theoretic bounds of exact recovery. We then propose two tractable estimators: a variant of RANSAC and a simple relaxation of the theoretical estimator. The two estimators are fast to compute and achieve state-of-the-art theoretical performance in a noiseless RSR setting with adversarial outliers. The former estimator achieves better theoretical guarantees in the noiseless case, while the latter estimator is robust to small noise, and its guarantees significantly improve with non-adversarial models of outliers. We give a complete comparison of guarantees for the adversarial RSR problem, as well as a short discussion on the estimation of affine subspaces.
Submitted 5 April, 2019;
originally announced April 2019.
-
Robust Subspace Recovery Layer for Unsupervised Anomaly Detection
Authors:
Chieh-Hsin Lai,
Dongmian Zou,
Gilad Lerman
Abstract:
We propose a neural network for unsupervised anomaly detection with a novel robust subspace recovery layer (RSR layer). This layer seeks to extract the underlying subspace from a latent representation of the given data and removes outliers that lie away from this subspace. It is used within an autoencoder. The encoder maps the data into a latent space, from which the RSR layer extracts the subspace. The decoder then smoothly maps back the underlying subspace to a "manifold" close to the original inliers. Inliers and outliers are distinguished according to the distances between the original and mapped positions (small for inliers and large for outliers). Extensive numerical experiments with both image and document datasets demonstrate state-of-the-art precision and recall.
Submitted 24 December, 2019; v1 submitted 30 March, 2019;
originally announced April 2019.
-
Encoding Robust Representation for Graph Generation
Authors:
Dongmian Zou,
Gilad Lerman
Abstract:
Generative networks have made it possible to generate meaningful signals such as images and texts from simple noise. Recently, generative methods based on GAN and VAE were developed for graphs and graph signals. However, the mathematical properties of these methods are unclear, and training good generative models is difficult. This work proposes a graph generation model that uses a recent adaptation of Mallat's scattering transform to graphs. The proposed model is naturally composed of an encoder and a decoder. The encoder is a Gaussianized graph scattering transform, which is robust to signal and graph manipulation. The decoder is a simple fully connected network that is adapted to specific tasks, such as link prediction, signal generation on graphs and full graph and signal generation. The training of our proposed system is efficient since it is only applied to the decoder and the hardware requirements are moderate. Numerical results demonstrate state-of-the-art performance of the proposed system for both link prediction and graph and signal generation.
Submitted 15 January, 2019; v1 submitted 28 September, 2018;
originally announced September 2018.
-
An Overview of Robust Subspace Recovery
Authors:
Gilad Lerman,
Tyler Maunu
Abstract:
This paper will serve as an introduction to the body of work on robust subspace recovery. Robust subspace recovery involves finding an underlying low-dimensional subspace in a dataset that is possibly corrupted with outliers. While this problem is easy to state, it has been difficult to develop optimal algorithms due to its underlying nonconvexity. This work emphasizes advantages and disadvantages of proposed approaches and unsolved problems in the area.
Submitted 5 July, 2018; v1 submitted 2 March, 2018;
originally announced March 2018.
-
A Well-Tempered Landscape for Non-convex Robust Subspace Recovery
Authors:
Tyler Maunu,
Teng Zhang,
Gilad Lerman
Abstract:
We present a mathematical analysis of a non-convex energy landscape for robust subspace recovery. We prove that an underlying subspace is the only stationary point and local minimizer in a specified neighborhood under a deterministic condition on a dataset. If the deterministic condition is satisfied, we further show that a geodesic gradient descent method over the Grassmannian manifold can exactly recover the underlying subspace when the method is properly initialized. Proper initialization by principal component analysis is guaranteed with a simple deterministic condition. Under slightly stronger assumptions, the gradient descent method with a piecewise constant step-size scheme achieves linear convergence. The practicality of the deterministic condition is demonstrated on some statistical models of data, and the method achieves almost state-of-the-art recovery guarantees on the Haystack Model for different regimes of sample size and ambient dimension. In particular, when the ambient dimension is fixed and the sample size is large enough, we show that our gradient method can exactly recover the underlying subspace for any fixed fraction of outliers (less than 1).
Submitted 28 February, 2019; v1 submitted 12 June, 2017;
originally announced June 2017.
-
Fast Landmark Subspace Clustering
Authors:
Xu Wang,
Gilad Lerman
Abstract:
Kernel methods obtain superb performance in terms of accuracy for various machine learning tasks since they can effectively extract nonlinear relations. However, their time complexity can be rather large especially for clustering tasks. In this paper we define a general class of kernels that can be easily approximated by randomization. These kernels appear in various applications, in particular, traditional spectral clustering, landmark-based spectral clustering and landmark-based subspace clustering. We show that for $n$ data points from $K$ clusters with $D$ landmarks, the randomization procedure results in an algorithm of complexity $O(KnD)$. Furthermore, we bound the error between the original clustering scheme and its randomization. To illustrate the power of this framework, we propose a new fast landmark subspace (FLS) clustering algorithm. Experiments over synthetic and real datasets demonstrate the superior performance of FLS in accelerating subspace clustering with marginal sacrifice of accuracy.
Submitted 28 October, 2015;
originally announced October 2015.
-
Nonparametric Bayesian Regression on Manifolds via Brownian Motion
Authors:
Xu Wang,
Gilad Lerman
Abstract:
This paper proposes a novel framework for manifold-valued regression and establishes its consistency as well as its contraction rate. It assumes a predictor with values in the interval $[0,1]$ and response with values in a compact Riemannian manifold $M$. This setting is useful for applications such as modeling dynamic scenes or shape deformations, where the visual scene or the deformed objects can be modeled by a manifold. The proposed framework is nonparametric and uses the heat kernel (and its associated Brownian motion) on manifolds as an averaging procedure. It directly generalizes the use of the Gaussian kernel (as a natural model of additive noise) in vector-valued regression problems. In order to avoid explicit dependence on estimates of the heat kernel, we follow a Bayesian setting, where Brownian motion on $M$ induces a prior distribution on the space of continuous functions $C([0,1], M)$. For the case of discretized Brownian motion, we establish the consistency of the posterior distribution in terms of the $L_{q}$ distances for any $1 \leq q < \infty$. Most importantly, we establish contraction rate of order $O(n^{-1/4+ε})$ for any fixed $ε>0$, where $n$ is the number of observations. For the continuous Brownian motion we establish weak consistency.
Submitted 23 July, 2015;
originally announced July 2015.
-
Riemannian Multi-Manifold Modeling
Authors:
Xu Wang,
Konstantinos Slavakis,
Gilad Lerman
Abstract:
This paper advocates a novel framework for segmenting a dataset in a Riemannian manifold $M$ into clusters lying around low-dimensional submanifolds of $M$. Important examples of $M$, for which the proposed clustering algorithm is computationally efficient, are the sphere, the set of positive definite matrices, and the Grassmannian. The clustering problem with these examples of $M$ is already useful for numerous application domains such as action identification in video sequences, dynamic texture clustering, brain fiber segmentation in medical imaging, and clustering of deformed images. The proposed clustering algorithm constructs a data-affinity matrix by thoroughly exploiting the intrinsic geometry and then applies spectral clustering. The intrinsic local geometry is encoded by local sparse coding and more importantly by directional information of local tangent spaces and geodesics. Theoretical guarantees are established for a simplified variant of the algorithm even when the clusters intersect. To avoid complication, these guarantees assume that the underlying submanifolds are geodesic. Extensive validation on synthetic and real data demonstrates the resiliency of the proposed method against deviations from the theoretical model as well as its superior performance over state-of-the-art techniques.
Submitted 30 September, 2014;
originally announced October 2014.
-
Fast, Robust and Non-convex Subspace Recovery
Authors:
Gilad Lerman,
Tyler Maunu
Abstract:
This work presents a fast and non-convex algorithm for robust subspace recovery. The data sets considered include inliers drawn around a low-dimensional subspace of a higher dimensional ambient space, and a possibly large portion of outliers that do not lie near this subspace. The proposed algorithm, which we refer to as Fast Median Subspace (FMS), is designed to robustly determine the underlying subspace of such data sets, while having lower computational complexity than existing methods. We prove convergence of the FMS iterates to a stationary point. Further, under a special model of data, FMS converges to a point which is near the global minimum with overwhelming probability. Under this model, we show that the iteration complexity is globally bounded and locally $r$-linear. The latter theorem holds for any fixed fraction of outliers (less than 1) and any fixed positive distance between the limit point and the global minimum. Numerical experiments on synthetic and real data demonstrate its competitive speed and accuracy.
Submitted 9 June, 2016; v1 submitted 24 June, 2014;
originally announced June 2014.
-
Spectral Clustering Based on Local PCA
Authors:
Ery Arias-Castro,
Gilad Lerman,
Teng Zhang
Abstract:
We propose a spectral clustering method based on local principal components analysis (PCA). After performing local PCA in selected neighborhoods, the algorithm builds a nearest neighbor graph weighted according to a discrepancy between the principal subspaces in the neighborhoods, and then applies spectral clustering. As opposed to standard spectral methods based solely on pairwise distances between points, our algorithm is able to resolve intersections. We establish theoretical guarantees for simpler variants within a prototypical mathematical framework for multi-manifold clustering, and evaluate our algorithm on various simulated data sets.
Submitted 9 January, 2013;
originally announced January 2013.
-
Robust computation of linear models by convex relaxation
Authors:
Gilad Lerman,
Michael McCoy,
Joel A. Tropp,
Teng Zhang
Abstract:
Consider a dataset of vector-valued observations that consists of noisy inliers, which are explained well by a low-dimensional subspace, along with some number of outliers. This work describes a convex optimization problem, called REAPER, that can reliably fit a low-dimensional model to this type of data. This approach parameterizes linear subspaces using orthogonal projectors, and it uses a relaxation of the set of orthogonal projectors to reach the convex formulation. The paper provides an efficient algorithm for solving the REAPER problem, and it documents numerical experiments which confirm that REAPER can dependably find linear structure in synthetic and natural data. In addition, when the inliers lie near a low-dimensional subspace, there is a rigorous theory that describes when REAPER can approximate this subspace.
Submitted 11 August, 2014; v1 submitted 17 February, 2012;
originally announced February 2012.
-
A Novel M-Estimator for Robust PCA
Authors:
Teng Zhang,
Gilad Lerman
Abstract:
We study the basic problem of robust subspace recovery. That is, we assume a data set in which some of the points are sampled around a fixed subspace and the rest are spread throughout the whole ambient space, and we aim to recover the fixed underlying subspace. We first estimate a "robust inverse sample covariance" by solving a convex minimization procedure; we then recover the subspace by the bottom eigenvectors of this matrix (their number corresponds to the number of eigenvalues close to 0). We guarantee exact subspace recovery under some conditions on the underlying data. Furthermore, we propose a fast iterative algorithm, which linearly converges to the matrix minimizing the convex problem. We also quantify the effect of noise and regularization and discuss many other practical and theoretical issues for improving the subspace recovery in various settings. When replacing the sum of terms in the convex energy function (that we minimize) with the sum of squares of terms, we obtain that the new minimizer is a scaled version of the inverse sample covariance (when it exists). We thus interpret our minimizer and its subspace (spanned by its bottom eigenvectors) as robust versions of the empirical inverse covariance and the PCA subspace respectively. We compare our method with many other algorithms for robust PCA on synthetic and real data sets and demonstrate state-of-the-art speed and accuracy.
Submitted 23 June, 2014; v1 submitted 20 December, 2011;
originally announced December 2011.
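The remark in the abstract above about replacing the sum of terms with a sum of squares can be checked with a short calculation. The sketch below assumes (the abstract does not state this) that the energy is $F(Q)=\sum_i \|Qx_i\|$, minimized over symmetric matrices $Q$ with $\operatorname{tr}(Q)=1$, and that $S=\sum_i x_i x_i^T$ is invertible; the symbols $Q$, $S$ and $\lambda$ are introduced here for illustration only.

$$
\sum_i \|Qx_i\|^2 = \operatorname{tr}(QSQ), \qquad
\nabla_Q\big[\operatorname{tr}(QSQ) - \lambda\,(\operatorname{tr}(Q)-1)\big] = QS + SQ - \lambda I = 0 .
$$

$$
Q = \tfrac{\lambda}{2}\,S^{-1} \ \text{ solves } \ QS + SQ = \lambda I, \qquad
\operatorname{tr}(Q)=1 \ \Longrightarrow \ Q = \frac{S^{-1}}{\operatorname{tr}(S^{-1})} .
$$

Under these assumptions the squared energy is minimized by the inverse sample covariance rescaled to unit trace, which is consistent with the abstract's reading of the unsquared minimizer as a robust counterpart of that matrix.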
-
Robust recovery of multiple subspaces by geometric l_p minimization
Authors:
Gilad Lerman,
Teng Zhang
Abstract:
We assume i.i.d. data sampled from a mixture distribution with K components along fixed d-dimensional linear subspaces and an additional outlier component. For p>0, we study the simultaneous recovery of the K fixed subspaces by minimizing the l_p-averaged distances of the sampled data points from any K subspaces. Under some conditions, we show that if $0<p\leq1$, then all underlying subspaces can be precisely recovered by l_p minimization with overwhelming probability. On the other hand, if K>1 and p>1, then the underlying subspaces cannot be recovered or even nearly recovered by l_p minimization. The results of this paper partially explain the successes and failures of the basic approach of l_p energy minimization for modeling data by multiple subspaces.
Submitted 1 February, 2012; v1 submitted 19 April, 2011;
originally announced April 2011.
-
lp-Recovery of the Most Significant Subspace among Multiple Subspaces with Outliers
Authors:
Gilad Lerman,
Teng Zhang
Abstract:
We assume data sampled from a mixture of d-dimensional linear subspaces with spherically symmetric distributions within each subspace and an additional outlier component with spherically symmetric distribution within the ambient space (for simplicity we may assume that all distributions are uniform on their corresponding unit spheres). We also assume mixture weights for the different components. We say that one of the underlying subspaces of the model is most significant if its mixture weight is higher than the sum of the mixture weights of all other subspaces. We study the recovery of the most significant subspace by minimizing the lp-averaged distances of data points from d-dimensional subspaces, where p>0. Unlike other lp minimization problems, this minimization is non-convex for all p>0 and thus requires different methods for its analysis. We show that if 0<p<=1, then for any fraction of outliers the most significant subspace can be recovered by lp minimization with overwhelming probability (which depends on the generating distribution and its parameters). We show that when adding small noise around the underlying subspaces the most significant subspace can be nearly recovered by lp minimization for any 0<p<=1 with an error proportional to the noise level. On the other hand, if p>1 and there is more than one underlying subspace, then with overwhelming probability the most significant subspace cannot be recovered or nearly recovered. This last result does not require spherically symmetric outliers.
Submitted 13 January, 2014; v1 submitted 18 December, 2010;
originally announced December 2010.
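The contrast between p<=1 and p>1 described in the abstract above can be illustrated on a toy planar example. The following is a minimal sketch, not the paper's setting: it uses a small deterministic point set instead of the spherically symmetric random model, restricts to lines through the origin in the plane, and finds the minimizer by brute-force search over integer angles. The names `lp_energy` and `best_angle` and the data are hypothetical.

```python
import math

def lp_energy(points, theta_deg, p):
    """Sum of p-th powers of the distances from 2-D points to the
    line through the origin that makes angle theta_deg with the x-axis."""
    t = math.radians(theta_deg)
    s, c = math.sin(t), math.cos(t)
    # Distance from (x, y) to that line is |y*cos(t) - x*sin(t)|.
    return sum(abs(y * c - x * s) ** p for x, y in points)

def best_angle(points, p):
    """Brute-force minimizer of the l_p energy over integer angles in [-90, 90)."""
    return min(range(-90, 90), key=lambda deg: lp_energy(points, deg, p))

# "Most significant" line: the x-axis, carrying 8 of the 11 points.
major = [(float(x), 0.0) for x in (-4, -3, -2, -1, 1, 2, 3, 4)]
# A second, less significant line through the origin with direction (1, 2).
minor = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
points = major + minor

angle_p1 = best_angle(points, 1.0)  # l_1 energy
angle_p2 = best_angle(points, 2.0)  # l_2 energy (ordinary least squares / PCA)
print(angle_p1, angle_p2)
```

In this toy data the l_1 minimizer lands exactly on the x-axis, while the l_2 minimizer is pulled to roughly 36 degrees, between the two lines. This only illustrates the qualitative behaviour; the probabilistic statements in the abstract require the paper's assumptions.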
-
Hybrid Linear Modeling via Local Best-fit Flats
Authors:
Teng Zhang,
Arthur Szlam,
Yi Wang,
Gilad Lerman
Abstract:
We present a simple and fast geometric method for modeling data by a union of affine subspaces. The method begins by forming a collection of local best-fit affine subspaces, i.e., subspaces approximating the data in local neighborhoods. The correct sizes of the local neighborhoods are determined automatically by Jones' $\beta_2$ numbers (we prove under certain geometric conditions that our method finds the optimal local neighborhoods). The collection of subspaces is further processed by a greedy selection procedure or a spectral method to generate the final model. We discuss applications to tracking-based motion segmentation and clustering of faces under different illumination conditions. We give extensive experimental evidence demonstrating the state-of-the-art accuracy and speed of the suggested algorithms on these problems and also on synthetic hybrid linear data as well as the MNIST handwritten digits data; and we demonstrate how to use our algorithms for fast determination of the number of affine subspaces.
Submitted 1 May, 2012; v1 submitted 17 October, 2010;
originally announced October 2010.
-
Probabilistic Recovery of Multiple Subspaces in Point Clouds by Geometric lp Minimization
Authors:
Gilad Lerman,
Teng Zhang
Abstract:
We assume data independently sampled from a mixture distribution on the unit ball of the D-dimensional Euclidean space with K+1 components: the first component is a uniform distribution on that ball representing outliers and the other K components are uniform distributions along K d-dimensional linear subspaces restricted to that ball. We study both the simultaneous recovery of all K underlying subspaces and the recovery of the best l0 subspace (i.e., with largest number of points) by minimizing the lp-averaged distances of data points from d-dimensional subspaces of the D-dimensional space. Unlike other lp minimization problems, this minimization is non-convex for all p>0 and thus requires different methods for its analysis. We show that if 0<p<=1, then both all underlying subspaces and the best l0 subspace can be precisely recovered by lp minimization with overwhelming probability. This result extends to additive homoscedastic uniform noise around the subspaces (i.e., uniform distribution in a strip around them) and near recovery with an error proportional to the noise level. On the other hand, if K>1 and p>1, then we show that neither all underlying subspaces nor the best l0 subspace can be recovered or even nearly recovered. Further relaxations are also discussed. We use the results of this paper for partially justifying recent effective algorithms for modeling data by mixtures of multiple subspaces as well as for discussing the effect of using variants of lp minimizations in RANSAC-type strategies for single subspace recovery.
Submitted 19 April, 2012; v1 submitted 9 February, 2010;
originally announced February 2010.
-
Spectral clustering based on local linear approximations
Authors:
Ery Arias-Castro,
Guangliang Chen,
Gilad Lerman
Abstract:
In the context of clustering, we assume a generative model where each cluster is the result of sampling points in the neighborhood of an embedded smooth surface; the sample may be contaminated with outliers, which are modeled as points sampled in space away from the clusters. We consider a prototype for a higher-order spectral clustering method based on the residual from a local linear approximation. We obtain theoretical guarantees for this algorithm and show that, in terms of both separation and robustness to outliers, it outperforms the standard spectral clustering algorithm (based on pairwise distances) of Ng, Jordan and Weiss (NIPS '01). The optimal choice for some of the tuning parameters depends on the dimension and thickness of the clusters. We provide estimators that come close enough for our theoretical purposes. We also discuss the cases of clusters of mixed dimensions and of clusters that are generated from smoother surfaces. In our experiments, this algorithm is shown to outperform pairwise spectral clustering on both simulated and real data.
Submitted 28 November, 2011; v1 submitted 8 January, 2010;
originally announced January 2010.
-
Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling
Authors:
Guangliang Chen,
Gilad Lerman
Abstract:
The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem, and provides careful analysis to justify it. The TSCC algorithm is practically a combination of Govindu's multi-way spectral clustering framework (CVPR 2005) and Ng et al.'s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).
Submitted 14 January, 2009; v1 submitted 20 October, 2008;
originally announced October 2008.