-
Quantum Annealing for Robust Principal Component Analysis
Authors:
Ian Tomeo,
Panos P. Markopoulos,
Andreas Savakis
Abstract:
Principal component analysis is commonly used for dimensionality reduction, feature extraction, denoising, and visualization. The most commonly used principal component analysis method is based on optimization of the L2-norm; however, the L2-norm is known to exaggerate the contribution of errors and outliers. When optimizing over the L1-norm, the components generated are known to exhibit robustness or resistance to outliers in the data. The L1-norm components can be found by solving a binary optimization problem. Previously, L1-BF has been used to solve the binary optimization for multiple components simultaneously. In this paper, we propose QAPCA, a new method for finding principal components using quantum annealing hardware, which optimizes over the robust L1-norm. The conditions required for convergence of the annealing problem are discussed. The potential speedup when using quantum annealing is demonstrated through complexity analysis and experimental results. To showcase performance against classical principal component analysis techniques, experiments on synthetic Gaussian data, a fault detection scenario, and breast cancer diagnostic data are studied. We find that the reconstruction error when using QAPCA is comparable to that when using L1-BF.
Submitted 25 January, 2025; v1 submitted 11 January, 2025;
originally announced January 2025.
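The abstract above states that the L1-norm components can be obtained from a binary optimization problem. As a rough illustration for a single component, the sketch below writes the maximization of $\|\mathbf X \mathbf b\|_2^2$ over $\mathbf b \in \{\pm 1\}^N$ as a QUBO, the form that annealers typically accept. This is a generic Ising-to-QUBO mapping, not necessarily the formulation used in QAPCA; the names `l1_pca_qubo` and `brute_force_minimum` are hypothetical, and the brute-force search is only a classical stand-in for annealing hardware.

```python
import itertools
import numpy as np

def l1_pca_qubo(X):
    """Return (Q, offset) with z @ Q @ z + offset == -||X (2z - 1)||_2^2 for binary z."""
    G = X.T @ X                                   # N x N Gram matrix of the samples
    # Substituting b = 2z - 1 and using z_i^2 = z_i moves the linear term onto the diagonal.
    Q = -4.0 * G + 4.0 * np.diag(G.sum(axis=1))
    offset = -G.sum()
    return Q, offset

def brute_force_minimum(Q):
    """Stand-in for an annealer (assumption): enumerate all binary vectors, small N only."""
    N = Q.shape[0]
    best_z, best_e = None, np.inf
    for bits in itertools.product((0.0, 1.0), repeat=N):
        z = np.array(bits)
        e = z @ Q @ z
        if e < best_e:
            best_z, best_e = z, e
    return best_z, best_e

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 6))                   # D = 3 dimensions, N = 6 samples
Q, offset = l1_pca_qubo(X)
z, energy = brute_force_minimum(Q)
b = 2.0 * z - 1.0                                 # back to antipodal +/-1 variables
q = X @ b / np.linalg.norm(X @ b)                 # L1 principal component from the optimal b
```

Minimizing the QUBO energy is the same as maximizing $\|\mathbf X \mathbf b\|_2$, and the component is recovered by normalizing $\mathbf X \mathbf b$.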
-
Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning
Authors:
Manish Sharma,
Jamison Heard,
Eli Saber,
Panos P. Markopoulos
Abstract:
While Convolutional Neural Networks (CNNs) excel at learning complex latent-space representations, their over-parameterization can lead to overfitting and reduced performance, particularly with limited data. This, alongside their high computational and memory demands, limits the applicability of CNNs for edge deployment. Low-rank matrix approximation has emerged as a promising approach to reduce CNN parameters, but its application presents challenges including rank selection and performance loss. To address these issues, we propose an efficient training method for CNN compression via dynamic parameter rank pruning. Our approach integrates efficient matrix factorization and novel regularization techniques, forming a robust framework for dynamic rank reduction and model compression. We use Singular Value Decomposition (SVD) to model low-rank convolutional filters and dense weight matrices and we achieve model compression by training the SVD factors with back-propagation in an end-to-end way. We evaluate our method on an array of modern CNNs, including ResNet-18, ResNet-20, and ResNet-32, and datasets like CIFAR-10, CIFAR-100, and ImageNet (2012), showcasing its applicability in computer vision. Our experiments show that the proposed method can yield substantial storage savings while maintaining or even enhancing classification performance.
Submitted 15 January, 2024;
originally announced January 2024.
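To make the storage argument in the abstract above concrete, the following sketch factors a dense weight matrix with a truncated SVD and compares parameter counts. It only shows a static, one-shot factorization under assumed sizes (64 x 128, rank 8); the paper's method trains the SVD factors with back-propagation and prunes rank dynamically, which is not reproduced here. The helper name `truncate_rank` is hypothetical.

```python
import numpy as np

def truncate_rank(W, r):
    """Return factors A (m x r) and B (r x n) of the best rank-r approximation of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]          # fold the kept singular values into the left factor
    B = Vt[:r, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))    # placeholder for a dense layer's weights
A, B = truncate_rank(W, r=8)

dense_params = W.size                 # 64 * 128
factored_params = A.size + B.size     # 8 * (64 + 128)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

A random Gaussian matrix has no low-rank structure, so `rel_err` is large here; the storage saving only comes cheaply when the trained weights are close to low rank.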
-
Robust Singular Values based on L1-norm PCA
Authors:
Duc Le,
Panos P. Markopoulos
Abstract:
Singular-Value Decomposition (SVD) is a ubiquitous data analysis method in engineering, science, and statistics. Singular-value estimation, in particular, is of critical importance in an array of engineering applications, such as channel estimation in communication systems, electromyography signal analysis, and image compression, to name just a few. Conventional SVD of a data matrix coincides with standard Principal-Component Analysis (PCA). The L2-norm (sum of squared values) formulation of PCA promotes peripheral data points and, thus, makes PCA sensitive to outliers. Naturally, SVD inherits this outlier sensitivity. In this work, we present a novel robust non-parametric method for SVD and singular-value estimation based on an L1-norm (sum of absolute values) formulation, which we name L1-cSVD. Accordingly, the proposed method demonstrates sturdy resistance against outliers and can facilitate more reliable data analysis and processing in a wide range of engineering applications.
Submitted 21 October, 2022;
originally announced October 2022.
-
Incremental Task Learning with Incremental Rank Updates
Authors:
Rakib Hyder,
Ken Shao,
Boyu Hou,
Panos Markopoulos,
Ashley Prater-Bennette,
M. Salman Asif
Abstract:
Incremental Task Learning (ITL) is a category of continual learning that seeks to train a single network for multiple tasks (one after another), where training data for each task is only available during the training of that task. Neural networks tend to forget older tasks when they are trained for newer tasks; this property is often known as catastrophic forgetting. To address this issue, ITL methods use episodic memory, parameter regularization, masking and pruning, or extensible network structures. In this paper, we propose a new incremental task learning framework based on low-rank factorization. In particular, we represent the network weights for each layer as a linear combination of several rank-1 matrices. To update the network for a new task, we learn a rank-1 (or low-rank) matrix and add that to the weights of every layer. We also introduce an additional selector vector that assigns different weights to the low-rank matrices learned for the previous tasks. We show that our approach performs better than the current state-of-the-art methods in terms of accuracy and forgetting. Our method also offers better memory efficiency compared to episodic memory- and mask-based approaches. Our code will be available at https://github.com/CSIPlab/task-increment-rank-update.git
Submitted 19 July, 2022;
originally announced July 2022.
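The weight representation described in the abstract above (a layer's weights as a selector-weighted sum of rank-1 matrices, with one more rank-1 term added per task) can be sketched as follows. This is only an illustration of the algebra: the factors are random placeholders rather than learned, the selector values are made up, and the name `compose_weights` is hypothetical. How the factors are trained or frozen is not specified in the abstract and is not modeled here.

```python
import numpy as np

def compose_weights(U, V, s):
    """W = sum_k s[k] * outer(U[:, k], V[:, k]), i.e. U @ diag(s) @ V.T."""
    return (U * s) @ V.T

rng = np.random.default_rng(0)
m, n = 5, 4                                   # assumed layer shape

# Task 1: a single rank-1 factor pair.
U = rng.standard_normal((m, 1))
V = rng.standard_normal((n, 1))
s_task1 = np.array([1.0])
W1 = compose_weights(U, V, s_task1)

# Task 2: append one new rank-1 factor pair; the task's selector re-weights the old one.
U = np.hstack([U, rng.standard_normal((m, 1))])
V = np.hstack([V, rng.standard_normal((n, 1))])
s_task2 = np.array([0.5, 1.0])
W2 = compose_weights(U, V, s_task2)
```

Per task, only the new factor pair and a short selector vector are stored, so the earlier task's weights can still be rebuilt from the first column of `U` and `V` with `s_task1`.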
-
Minimum Mean-Squared-Error Autocorrelation Processing in Coprime Arrays
Authors:
Dimitris G. Chachlakis,
Tongdi Zhou,
Fauzia Ahmad,
Panos P. Markopoulos
Abstract:
Coprime arrays enable Direction-of-Arrival (DoA) estimation of an increased number of sources. To that end, the receiver estimates the autocorrelation matrix of a larger virtual uniform linear array (coarray), by applying selection or averaging to the physical array's autocorrelation estimates, followed by spatial-smoothing. Both selection and averaging have been designed under no optimality criterion and attain arbitrary (suboptimal) Mean-Squared-Error (MSE) estimation performance. In this work, we design a novel coprime array receiver that estimates the coarray autocorrelations with Minimum-MSE (MMSE), for any probability distribution of the source DoAs. Our extensive numerical evaluation illustrates that the proposed MMSE approach returns superior autocorrelation estimates which, in turn, enable higher DoA estimation performance compared to standard counterparts.
Submitted 21 October, 2020;
originally announced October 2020.
-
L1-norm Tucker Tensor Decomposition
Authors:
Dimitris G. Chachlakis,
Ashley Prater-Bennette,
Panos P. Markopoulos
Abstract:
Tucker decomposition is a common method for the analysis of multi-way/tensor data. Standard Tucker has been shown to be sensitive to heavy corruptions, due to its L2-norm-based formulation, which places squared emphasis on peripheral entries. In this work, we explore L1-Tucker, an L1-norm-based reformulation of standard Tucker decomposition. After formulating the problem, we present two algorithms for its solution, namely L1-norm Higher-Order Singular Value Decomposition (L1-HOSVD) and L1-norm Higher-Order Orthogonal Iterations (L1-HOOI). The presented algorithms are accompanied by complexity and convergence analysis. Our numerical studies on tensor reconstruction and classification corroborate that L1-Tucker, implemented by means of the proposed methods, attains similar performance to standard Tucker when the processed data are corruption-free, while it exhibits sturdy resistance against heavily corrupted entries.
Submitted 12 April, 2019;
originally announced April 2019.
-
The Exact Solution to Rank-1 L1-norm TUCKER2 Decomposition
Authors:
Panos P. Markopoulos,
Dimitris G. Chachlakis,
Evangelos E. Papalexakis
Abstract:
We study rank-1 L1-norm-based TUCKER2 (L1-TUCKER2) decomposition of 3-way tensors, treated as a collection of $N$ $D \times M$ matrices that are to be jointly decomposed. Our contributions are as follows. i) We prove that the problem is equivalent to combinatorial optimization over $N$ antipodal-binary variables. ii) We derive the first two algorithms in the literature for its exact solution. The first algorithm has cost exponential in $N$; the second one has cost polynomial in $N$ (under a mild assumption). Our algorithms are accompanied by formal complexity analysis. iii) We conduct numerical studies to compare the performance of exact L1-TUCKER2 (proposed) with standard HOSVD, HOOI, GLRAM, PCA, L1-PCA, and TPCA-L1. Our studies show that L1-TUCKER2 outperforms (in tensor approximation) all the above counterparts when the processed data are outlier-corrupted.
Submitted 30 October, 2017;
originally announced October 2017.
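The equivalence to a search over $N$ antipodal-binary variables mentioned in the abstract above can be illustrated with a small exhaustive sketch. It assumes the rank-1 objective is $\max_{\|\mathbf u\|=\|\mathbf v\|=1} \sum_n |\mathbf u^\top \mathbf X_n \mathbf v|$; since $|a| = \max_{b=\pm 1} b\,a$, this equals $\max_{\mathbf b \in \{\pm 1\}^N} \sigma_{\max}(\sum_n b_n \mathbf X_n)$. The code is an illustration of that identity with exponential cost in $N$, not the authors' algorithm, and the function name is hypothetical.

```python
import itertools
import numpy as np

def rank1_l1_tucker2_exhaustive(Xs):
    """Xs has shape (N, D, M). Returns unit vectors u, v and the optimal metric value."""
    N = Xs.shape[0]
    best_val, best_u, best_v = -1.0, None, None
    # b and -b give the same singular value, so the first sign is fixed to +1.
    for tail in itertools.product((-1.0, 1.0), repeat=N - 1):
        b = np.array((1.0,) + tail)
        S = np.tensordot(b, Xs, axes=1)           # sum_n b[n] * Xs[n], shape (D, M)
        U, s, Vt = np.linalg.svd(S, full_matrices=False)
        if s[0] > best_val:
            best_val, best_u, best_v = s[0], U[:, 0], Vt[0]
    return best_u, best_v, best_val

rng = np.random.default_rng(0)
Xs = rng.standard_normal((5, 3, 4))               # N = 5 slabs of size D x M = 3 x 4
u, v, value = rank1_l1_tucker2_exhaustive(Xs)
metric = sum(abs(u @ Xn @ v) for Xn in Xs)        # L1 objective at the returned pair
```

At the optimum, `metric` coincides with the top singular value `value`, which is what makes the binary search equivalent to the original continuous problem.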
-
L1-norm Principal-Component Analysis of Complex Data
Authors:
Nicholas Tsagkarakis,
Panos P. Markopoulos,
Dimitris A. Pados
Abstract:
L1-norm Principal-Component Analysis (L1-PCA) of real-valued data has attracted significant research interest over the past decade. However, L1-PCA of complex-valued data remains to date unexplored despite the many possible applications (e.g., in communication systems). In this work, we establish theoretical and algorithmic foundations of L1-PCA of complex-valued data matrices. Specifically, we first show that, in contrast to the real-valued case for which an optimal polynomial-cost algorithm was recently reported by Markopoulos et al., complex L1-PCA is formally NP-hard in the number of data points. Then, casting complex L1-PCA as a unimodular optimization problem, we present the first two suboptimal algorithms in the literature for its solution. Our experimental studies illustrate the sturdy resistance of complex L1-PCA against faulty measurements/outliers in the processed data.
Submitted 3 August, 2017;
originally announced August 2017.
-
Efficient L1-Norm Principal-Component Analysis via Bit Flipping
Authors:
Panos P. Markopoulos,
Sandipan Kundu,
Shubham Chamadia,
Dimitris A. Pados
Abstract:
It was shown recently that the $K$ L1-norm principal components (L1-PCs) of a real-valued data matrix $\mathbf X \in \mathbb R^{D \times N}$ ($N$ data samples of $D$ dimensions) can be exactly calculated with cost $\mathcal{O}(2^{NK})$ or, when advantageous, $\mathcal{O}(N^{dK - K + 1})$ where $d=\mathrm{rank}(\mathbf X)$, $K<d$ [1],[2]. In applications where $\mathbf X$ is large (e.g., "big" data of large $N$ and/or "heavy" data of large $d$), these costs are prohibitive. In this work, we present a novel suboptimal algorithm for the calculation of the $K < d$ L1-PCs of $\mathbf X$ of cost $\mathcal O(ND \mathrm{min} \{ N,D\} + N^2(K^4 + dK^2) + dNK^3)$, which is comparable to that of standard (L2-norm) PC analysis. Our theoretical and experimental studies show that the proposed algorithm calculates the exact optimal L1-PCs with high frequency and achieves higher value in the L1-PC optimization metric than any known alternative algorithm of comparable computational cost. The superiority of the calculated L1-PCs over standard L2-PCs (singular vectors) in characterizing potentially faulty data/measurements is demonstrated with experiments on data dimensionality reduction and disease diagnosis from genomic data.
Submitted 6 October, 2016;
originally announced October 2016.
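The bit-flipping idea named in the title above can be sketched for a single component ($K=1$) as a greedy search over $\mathbf b \in \{\pm 1\}^N$ that repeatedly flips the one sign giving the largest increase in $\|\mathbf X \mathbf b\|_2$. This is a simplified illustration, not the published L1-BF algorithm (which handles $K$ components jointly); the initialization from the leading singular vector, the stopping tolerance, and the name `l1_pc_bit_flip` are assumptions.

```python
import numpy as np

def l1_pc_bit_flip(X, max_iter=10_000, tol=1e-12):
    """Greedy single-bit-flip search for one L1 principal component of X (D x N)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    # Assumed initialization: signs of the projections onto the L2 principal component.
    b = np.where(X.T @ U[:, 0] >= 0, 1.0, -1.0)
    col_sq = np.sum(X * X, axis=0)                # squared norm of each sample
    for _ in range(max_iter):
        v = X @ b
        # Change in ||X b||^2 if bit n is flipped: 4 * (||x_n||^2 - b_n * x_n^T v).
        gains = 4.0 * (col_sq - b * (X.T @ v))
        n = int(np.argmax(gains))
        if gains[n] <= tol:                       # no single flip improves the metric
            break
        b[n] = -b[n]
    v = X @ b
    return v / np.linalg.norm(v), b

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 12))                  # D = 4, N = 12
q, b = l1_pc_bit_flip(X)
l1_metric = np.abs(X.T @ q).sum()
```

Each accepted flip strictly increases $\|\mathbf X \mathbf b\|_2$, so the loop terminates; the result is a local optimum with respect to single flips and may differ from the exact L1-PC.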
-
Optimal Algorithms for $L_1$-subspace Signal Processing
Authors:
Panos P. Markopoulos,
George N. Karystinos,
Dimitris A. Pados
Abstract:
We describe ways to define and calculate $L_1$-norm signal subspaces which are less sensitive to outlying data than $L_2$-calculated subspaces. We start with the computation of the $L_1$ maximum-projection principal component of a data matrix containing $N$ signal samples of dimension $D$. We show that while the general problem is formally NP-hard in asymptotically large $N$, $D$, the case of engineering interest of fixed dimension $D$ and asymptotically large sample size $N$ is not. In particular, for the case where the sample size is less than the fixed dimension ($N<D$), we present in explicit form an optimal algorithm of computational cost $2^N$. For the case $N \geq D$, we present an optimal algorithm of complexity $\mathcal O(N^D)$. We generalize to multiple $L_1$-max-projection components and present an explicit optimal $L_1$ subspace calculation algorithm of complexity $\mathcal O(N^{DK-K+1})$ where $K$ is the desired number of $L_1$ principal components (subspace rank). We conclude with illustrations of $L_1$-subspace signal processing in the fields of data dimensionality reduction, direction-of-arrival estimation, and image conditioning/restoration.
Submitted 27 May, 2014;
originally announced May 2014.
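The $2^N$-cost computation of the $L_1$ maximum-projection component mentioned in the abstract above can be sketched as an exhaustive search. It relies on the known identity $\max_{\|\mathbf q\|_2=1} \|\mathbf X^\top \mathbf q\|_1 = \max_{\mathbf b \in \{\pm 1\}^N} \|\mathbf X \mathbf b\|_2$, with the component recovered as $\mathbf q = \mathbf X \mathbf b / \|\mathbf X \mathbf b\|_2$. The function name and test sizes are hypothetical, and this is a plain enumeration rather than the paper's explicit algorithm.

```python
import itertools
import numpy as np

def l1_pc_exhaustive(X):
    """Exact L1 principal component of X (D x N) by enumerating sign vectors."""
    D, N = X.shape
    best_val, best_b = -1.0, None
    # b and -b give the same norm, so fixing b[0] = +1 halves the search.
    for tail in itertools.product((-1.0, 1.0), repeat=N - 1):
        b = np.array((1.0,) + tail)
        val = np.linalg.norm(X @ b)
        if val > best_val:
            best_val, best_b = val, b
    q = X @ best_b / best_val
    return q, best_b

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))                   # N = 4 samples, D = 5 (the N < D case)
q, b = l1_pc_exhaustive(X)
l1_metric = np.abs(X.T @ q).sum()
```

The enumeration is practical only for small $N$; for $N \geq D$ the abstract reports an optimal algorithm of complexity $\mathcal O(N^D)$, which is not shown here.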