-
On the consistent reasoning paradox of intelligence and optimal trust in AI: The power of 'I don't know'
Authors:
Alexander Bastounis,
Paolo Campodonico,
Mihaela van der Schaar,
Ben Adcock,
Anders C. Hansen
Abstract:
We introduce the Consistent Reasoning Paradox (CRP). Consistent reasoning, which lies at the core of human intelligence, is the ability to handle tasks that are equivalent, yet described by different sentences ('Tell me the time!' and 'What is the time?'). The CRP asserts that consistent reasoning implies fallibility -- in particular, human-like intelligence in AI necessarily comes with human-like fallibility. Specifically, it states that there are problems, e.g. in basic arithmetic, where any AI that always answers and strives to mimic human intelligence by reasoning consistently will hallucinate (produce wrong, yet plausible answers) infinitely often. The paradox is that there exists a non-consistently reasoning AI (which therefore cannot be on the level of human intelligence) that will be correct on the same set of problems. The CRP also shows that detecting these hallucinations, even in a probabilistic sense, is strictly harder than solving the original problems, and that there are problems that an AI may answer correctly, but it cannot provide a correct logical explanation for how it arrived at the answer. Therefore, the CRP implies that any trustworthy AI (i.e., an AI that never answers incorrectly) that also reasons consistently must be able to say 'I don't know'. Moreover, this can only be done by implicitly computing a new concept that we introduce, termed the 'I don't know' function -- something currently lacking in modern AI. In view of these insights, the CRP also provides a glimpse into the behaviour of Artificial General Intelligence (AGI). An AGI cannot be 'almost sure', nor can it always explain itself, and therefore to be trustworthy it must be able to say 'I don't know'.
Submitted 5 August, 2024;
originally announced August 2024.
-
Do stable neural networks exist for classification problems? -- A new view on stability in AI
Authors:
Z. N. D. Liu,
A. C. Hansen
Abstract:
In deep learning (DL) the instability phenomenon is widespread and well documented, most commonly using the classical measure of stability, the Lipschitz constant. While a small Lipschitz constant is traditionally viewed as guaranteeing stability, it does not capture the instability phenomenon in DL for classification well. The reason is that a classification function -- which is the target function to be approximated -- is necessarily discontinuous, thus having an 'infinite' Lipschitz constant. As a result, the classical approach will deem every classification function unstable, yet basic classification functions a la 'is there a cat in the image?' will typically be locally very 'flat' -- and thus locally stable -- except at the decision boundary. The lack of an appropriate measure of stability hinders a rigorous theory for stability in DL, and consequently, there are no proper approximation theoretic results that can guarantee the existence of stable networks for classification functions. In this paper we introduce a novel stability measure $\mathscr{S}(f)$, for any classification function $f$, appropriate to study the stability of discontinuous functions and their approximations. We further prove two approximation theorems: First, for any $ε > 0$ and any classification function $f$ on a compact set, there is a neural network (NN) $ψ$ such that $ψ - f \neq 0$ only on a set of measure $< ε$; moreover, $\mathscr{S}(ψ) \geq \mathscr{S}(f) - ε$ (as accurate and stable as $f$ up to $ε$). Second, for any classification function $f$ and $ε > 0$, there exists a NN $ψ$ such that $ψ = f$ on the set of points that are at least $ε$ away from the decision boundary.
Submitted 15 January, 2024;
originally announced January 2024.
-
When can you trust feature selection? -- II: On the effects of random data on condition in statistics and optimisation
Authors:
Alexander Bastounis,
Felipe Cucker,
Anders C. Hansen
Abstract:
In Part I, we defined a LASSO condition number and developed an algorithm -- for computing support sets (feature selection) of the LASSO minimisation problem -- that runs in polynomial time in the number of variables and the logarithm of the condition number. The algorithm is trustworthy in the sense that if the condition number is infinite, the algorithm will run forever and never produce an incorrect output. In this Part II article, we demonstrate how finite precision algorithms (for example algorithms running floating point arithmetic) will fail on open sets when the condition number is large -- but still finite. This augments Part I's result: If an algorithm takes inputs from an open set that includes at least one point with an infinite condition number, it fails to compute the correct support set for all inputs within that set. Hence, for any finite precision algorithm working on open sets for the LASSO problem with random inputs, our LASSO condition number -- as a random variable -- will estimate the probability of success/failure of the algorithm. We show that a finite precision version of our algorithm works on traditional Gaussian data for LASSO with high probability. The algorithm is trustworthy, specifically, in the random cases where the algorithm fails, it will not produce an output. Finally, we demonstrate classical random ensembles for which the condition number will be large with high probability, and hence where any finite precision algorithm on open sets will fail. We show numerically how commercial software fails on these cases.
Submitted 18 December, 2023;
originally announced December 2023.
-
When can you trust feature selection? -- I: A condition-based analysis of LASSO and generalised hardness of approximation
Authors:
Alexander Bastounis,
Felipe Cucker,
Anders C. Hansen
Abstract:
The arrival of AI techniques in computations, with the potential for hallucinations and non-robustness, has made trustworthiness of algorithms a focal point. However, the trustworthiness of many classical approaches is not well understood. This is the case for feature selection, a classical problem in the sciences, statistics, machine learning etc. Here, the LASSO optimisation problem is standard. Despite its widespread use, it has not been established when the output of algorithms attempting to compute support sets of minimisers of LASSO in order to do feature selection can be trusted. In this paper we establish that no (randomised) algorithm that works on all inputs can determine the correct support sets (with probability $> 1/2$) of minimisers of LASSO when reading approximate input, regardless of precision and computing power. However, we define a LASSO condition number and design an efficient algorithm for computing these support sets provided the input data is well-posed (has finite condition number) in time polynomial in the dimensions and logarithm of the condition number. For ill-posed inputs the algorithm runs forever; hence, it will never produce a wrong answer. Furthermore, the algorithm computes an upper bound for the condition number when this is finite. Finally, for any algorithm defined on an open set containing a point with infinite condition number, there is an input for which the algorithm will either run forever or produce a wrong answer. Our impossibility results stem from generalised hardness of approximation -- within the Solvability Complexity Index (SCI) hierarchy framework -- that generalises the classical phenomenon of hardness of approximation.
Submitted 18 December, 2023;
originally announced December 2023.
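As a rough illustration of why support sets can be sensitive to approximate input, the sketch below uses a toy special case that is not taken from the paper: the design matrix is assumed to be the identity, and the objective is assumed to be 0.5*||x - y||_2^2 + lam*||x||_1, whose minimiser is given by soft-thresholding. The function names are hypothetical, and this is neither the paper's condition number nor its algorithm.

```python
def soft_threshold(v, lam):
    """Shrink v towards zero by lam (closed-form LASSO solution per coordinate)."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def lasso_support_identity(y, lam):
    """Support of the minimiser of 0.5*||x - y||_2^2 + lam*||x||_1.

    Assumption: the design matrix is the identity, so the problem
    decouples and each coordinate is soft-thresholded independently.
    """
    x = [soft_threshold(v, lam) for v in y]
    return {i for i, v in enumerate(x) if v != 0.0}


lam = 1.0
print(lasso_support_identity([0.5, 2.0, -3.0], lam))

# An input sitting exactly at the threshold: arbitrarily small
# perturbations of the data change the support set.
eps = 1e-9
print(lasso_support_identity([lam + eps], lam))
print(lasso_support_identity([lam - eps], lam))
```

In this toy setting, an algorithm that only sees the data up to some tolerance cannot tell which side of the threshold the input lies on, which is the flavour of ill-posedness the abstract describes.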
-
Stationary measures of continuous time Markov chains with applications to stochastic reaction networks
Authors:
Mads Chr Hansen,
Carsten Wiuf,
Chuang Xu
Abstract:
We study continuous-time Markov chains on the non-negative integers under mild regularity conditions (in particular, the set of jump vectors is finite and both forward and backward jumps are possible). Based on the so-called flux balance equation, we derive an iterative formula for calculating stationary measures. Specifically, a stationary measure $π(x)$ evaluated at $x \in \mathbb{N}_0$ is represented as a linear combination of a few generating terms, similarly to the characterization of a stationary measure of a birth-death process, where there is only one generating term, $π(0)$. The coefficients of the linear combination are recursively determined in terms of the transition rates of the Markov chain. For the class of Markov chains we consider, there is always at least one stationary measure (up to a scaling constant). We give various results pertaining to uniqueness and non-uniqueness of stationary measures, and show that the dimension of the linear space of signed invariant measures is at most the number of generating terms. A minimization problem is constructed in order to compute stationary measures numerically. Moreover, a heuristic linear approximation scheme is suggested for the same purpose by first approximating the generating terms. The correctness of the linear approximation scheme is justified in some special cases. Furthermore, a decomposition of the state space into different types of states (open and closed irreducible classes, and trapping, escaping and neutral states) is presented. Applications to stochastic reaction networks are well illustrated.
Submitted 22 November, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
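The birth-death case mentioned in the abstract above, where $π(0)$ is the only generating term, can be sketched with the classical recursion $π(x+1) = π(x)\,λ(x)/μ(x+1)$. This is the textbook birth-death formula, not the paper's iterative formula for general jump vectors; the function name and the constant rates in the example are assumptions chosen for illustration.

```python
from fractions import Fraction


def birth_death_stationary(birth, death, n_states, pi0=Fraction(1)):
    """Unnormalised stationary measure pi(0), ..., pi(n_states - 1).

    birth(x): rate of the jump x -> x + 1
    death(x): rate of the jump x -> x - 1 (only used for x >= 1)
    Uses flux balance across the edge {x, x + 1}:
        pi(x) * birth(x) == pi(x + 1) * death(x + 1)
    """
    pi = [Fraction(pi0)]
    for x in range(n_states - 1):
        pi.append(pi[-1] * Fraction(birth(x)) / Fraction(death(x + 1)))
    return pi


# Hypothetical example: constant birth rate 1 and death rate 2.
pi = birth_death_stationary(lambda x: 1, lambda x: 2, 6)
print(pi)
```

Every value is a multiple of `pi0`, which is what "one generating term" means in this simplest setting; exact rational arithmetic is used so the flux balance can be checked without rounding.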
-
On the existence of optimal multi-valued decoders and their accuracy bounds for undersampled inverse problems
Authors:
Nina Maria Gottschling,
Paolo Campodonico,
Vegard Antun,
Anders C. Hansen
Abstract:
Undersampled inverse problems occur everywhere in the sciences including medical imaging, radar, astronomy etc., yielding underdetermined linear or non-linear reconstruction problems. There are now a myriad of techniques to design decoders that can tackle such problems, ranging from optimization based approaches, such as compressed sensing, to deep learning (DL), and variants in between the two techniques. The variety of methods begs for a unifying approach to determine the existence of optimal decoders and fundamental accuracy bounds, in order to facilitate a theoretical and empirical understanding of the performance of existing and future methods. Such a theory must allow for both single-valued and multi-valued decoders, as underdetermined inverse problems typically have multiple solutions. Indeed, multi-valued decoders arise due to non-uniqueness of minimizers in optimisation problems, such as in compressed sensing, and for DL based decoders in generative adversarial models, such as diffusion models and ensemble models. In this work we provide a framework for assessing the lowest possible reconstruction accuracy in terms of worst- and average-case errors. The universal bounds only depend on the measurement model $F$, the model class $\mathcal{M}_1 \subseteq \mathcal{X}$ and the noise model $\mathcal{E}$. For linear $F$ these bounds depend on its kernel, and in the non-linear case the concept of kernel is generalized for undersampled settings. Additionally, we provide multi-valued variational solutions that obtain the lowest possible reconstruction error.
Submitted 28 November, 2023;
originally announced November 2023.
-
Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks
Authors:
Johan S. Wind,
Vegard Antun,
Anders C. Hansen
Abstract:
Understanding the implicit regularization imposed by neural network architectures and gradient based optimization methods is a key challenge in deep learning and AI. In this work we provide sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks (DLNs) in the over-parameterized regression setting and, potentially surprisingly, link this to the phenomenon of phase transitions in generalized hardness of approximation (GHA). GHA generalizes the phenomenon of hardness of approximation from computer science to, among others, continuous and robust optimization. It is well-known that the $\ell^1$-norm of the gradient flow of DLNs with tiny initialization converges to the objective function of basis pursuit. We improve upon these results by showing that the gradient flow of DLNs with tiny initialization approximates minimizers of the basis pursuit optimization problem (as opposed to just the objective function), and we obtain new and sharp convergence bounds w.r.t. the initialization size. Non-sharpness of our results would imply that the GHA phenomenon would not occur for the basis pursuit optimization problem -- which is a contradiction -- thus implying sharpness. Moreover, we characterize which $\ell^1$ minimizer of the basis pursuit problem is chosen by the gradient flow whenever the minimizer is not unique. Interestingly, this depends on the depth of the DLN.
Submitted 13 July, 2023;
originally announced July 2023.
-
CUQIpy: II. Computational uncertainty quantification for PDE-based inverse problems in Python
Authors:
Amal M A Alghamdi,
Nicolai A B Riis,
Babak M Afkham,
Felipe Uribe,
Silja L Christensen,
Per Christian Hansen,
Jakob S Jørgensen
Abstract:
Inverse problems, particularly those governed by Partial Differential Equations (PDEs), are prevalent in various scientific and engineering applications, and uncertainty quantification (UQ) of solutions to these problems is essential for informed decision-making. This second part of a two-paper series builds upon the foundation set by the first part, which introduced CUQIpy, a Python software package for computational UQ in inverse problems using a Bayesian framework. In this paper, we extend CUQIpy's capabilities to solve PDE-based Bayesian inverse problems through a general framework that allows the integration of PDEs in CUQIpy, whether expressed natively or using third-party libraries such as FEniCS. CUQIpy offers concise syntax that closely matches mathematical expressions, streamlining the modeling process and enhancing the user experience. The versatility and applicability of CUQIpy to PDE-based Bayesian inverse problems are demonstrated on examples covering parabolic, elliptic and hyperbolic PDEs. This includes problems involving the heat and Poisson equations and application case studies in electrical impedance tomography and photo-acoustic tomography, showcasing the software's efficiency, consistency, and intuitive interface. This comprehensive approach to UQ in PDE-based inverse problems provides accessibility for non-experts and advanced features for experts.
Submitted 21 March, 2024; v1 submitted 26 May, 2023;
originally announced May 2023.
-
CUQIpy: I. Computational uncertainty quantification for inverse problems in Python
Authors:
Nicolai A B Riis,
Amal M A Alghamdi,
Felipe Uribe,
Silja L Christensen,
Babak M Afkham,
Per Christian Hansen,
Jakob S Jørgensen
Abstract:
This paper introduces CUQIpy, a versatile open-source Python package for computational uncertainty quantification (UQ) in inverse problems, presented as Part I of a two-part series. CUQIpy employs a Bayesian framework, integrating prior knowledge with observed data to produce posterior probability distributions that characterize the uncertainty in computed solutions to inverse problems. The package offers a high-level modeling framework with concise syntax, allowing users to easily specify their inverse problems, prior information, and statistical assumptions. CUQIpy supports a range of efficient sampling strategies and is designed to handle large-scale problems. Notably, the automatic sampler selection feature analyzes the problem structure and chooses a suitable sampler without user intervention, streamlining the process. With a selection of probability distributions, test problems, computational methods, and visualization tools, CUQIpy serves as a powerful, flexible, and adaptable tool for UQ in a wide selection of inverse problems. Part II of the series focuses on the use of CUQIpy for UQ in inverse problems with partial differential equations (PDEs).
Submitted 21 March, 2024; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Inferring Object Boundaries and their Roughness with Uncertainty Quantification
Authors:
Babak Maboudi Afkham,
Nicolai André Brogaard Riis,
Yiqiu Dong,
Per Christian Hansen
Abstract:
This work describes a Bayesian framework for reconstructing the boundaries that represent targeted features in an image, as well as the regularity (i.e., roughness vs. smoothness) of these boundaries. This regularity often carries crucial information in many inverse problem applications, e.g., for identifying malignant tissues in medical imaging. We represent the boundary as a radial function and characterize the regularity of this function by means of its fractional differentiability. We propose a hierarchical Bayesian formulation which simultaneously estimates the function and its regularity, and in addition we quantify the uncertainties in the estimates. Numerical results suggest that the proposed method is a reliable approach for estimating and characterizing object boundaries in imaging applications, as illustrated with examples from X-ray CT and image inpainting. We also show that our method is robust under various noise types, noise levels, and incomplete data.
Submitted 25 January, 2024; v1 submitted 8 May, 2023;
originally announced May 2023.
-
On inequalities involving counts of the prime factors of an odd perfect number
Authors:
Graeme Clayton,
Cody S. Hansen
Abstract:
Let $N$ be an odd perfect number. Let $ω(N)$ be the number of distinct prime factors of $N$ and let $Ω(N)$ be the total number (counting multiplicity) of prime factors of $N$. We prove that $\frac{99}{37}ω(N) - \frac{187}{37} \leq Ω(N)$ and that if $3\nmid N$, then $\frac{51}{19}ω(N)-\frac{46}{19} \leq Ω(N)$.
Submitted 21 March, 2023;
originally announced March 2023.
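For readers unfamiliar with the two counting functions in the abstract above, the sketch below computes $ω(n)$ and $Ω(n)$ by trial division and evaluates the right-hand side of the general bound $\frac{99}{37}ω(N) - \frac{187}{37} \leq Ω(N)$. The function names are hypothetical. No odd perfect number is known, so the inequality cannot be checked on a real instance; the sample integers are ordinary numbers used only to show the counting.

```python
from fractions import Fraction


def prime_factorisation(n):
    """Return {prime: exponent} for an integer n >= 2, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def small_omega(n):
    """omega(n): number of distinct prime factors."""
    return len(prime_factorisation(n))


def big_omega(n):
    """Omega(n): number of prime factors counted with multiplicity."""
    return sum(prime_factorisation(n).values())


def general_lower_bound(omega):
    """(99/37) * omega - 187/37, the bound stated in the abstract."""
    return Fraction(99, 37) * omega - Fraction(187, 37)


# 1575 = 3^2 * 5^2 * 7 is odd but not perfect; it only illustrates the counts.
print(small_omega(1575), big_omega(1575))
print(general_lower_bound(10))
```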
-
Alexander Ostrowski's "On Dirichlet Series and Algebraic Differential Equations"
Authors:
Erik Christian Hansen,
Yonathan Stone,
Jesse Wolfson
Abstract:
This is an English translation of Ostrowski's article "Über Dirichletsche Reihen und algebraische Differentialgleichungen" published in German in Math. Zeit. vol. 8, 1920, pp. 241-298. In this article, Ostrowski proves a conjecture of Hilbert that the two variable function $ζ(x,s)=\sum_{n\ge 1} \frac{x^n}{n^s}$ cannot be written as a composition of analytic functions of one variable and algebraic functions of any number of variables.
Submitted 3 November, 2022;
originally announced November 2022.
-
Generalised hardness of approximation and the SCI hierarchy -- On determining the boundaries of training algorithms in AI
Authors:
Luca Eva Gazdag,
Anders C. Hansen
Abstract:
Hardness of approximation (HA) -- the phenomenon that, assuming P $\neq$ NP, one can easily compute an $ε$-approximation to the solution of a discrete computational problem for $ε > ε_0 > 0$, but for $ε < ε_0$ it suddenly becomes intractable -- is a core phenomenon in the foundations of computations that has transformed computer science. In this paper we study the newly discovered phenomenon in the foundations of computational mathematics: generalised hardness of approximation (GHA) -- which in spirit is close to classical HA in computer science. However, GHA is in many cases independent of the P vs. NP question. Thus, it requires a new mathematical framework that we initiate in this paper. We demonstrate the hitherto undiscovered phenomenon that GHA happens when using AI techniques in order to train optimal neural networks (NNs). In particular, for any non-zero underdetermined linear problem the following phase transition may occur: One can prove the existence of optimal NNs for solving the problem but they can only be computed to a certain accuracy $ε_0 > 0$. Below the approximation threshold $ε_0$ -- not only does it become intractable to compute the NN -- it becomes impossible regardless of computing power, and no randomised algorithm can solve the problem with probability better than 1/2. In other cases, despite the existence of a stable optimal NN, any attempt to compute it below the approximation threshold $ε_0$ will yield an unstable NN. Our results use and extend the current mathematical framework of the Solvability Complexity Index (SCI) hierarchy and facilitate a program for detecting the GHA phenomenon throughout computational mathematics and AI.
Submitted 3 January, 2023; v1 submitted 14 September, 2022;
originally announced September 2022.
-
Prime factors of $Φ_3(x)$ of the same form
Authors:
Cody S. Hansen,
Pace P. Nielsen
Abstract:
We parameterize solutions to the equality $Φ_3(x)=Φ_3(a_1)Φ_3(a_2)\cdotsΦ_3(a_n)$ when each $Φ_3(a_i)$ is prime. Our focus is on the special cases when $n=2,3,4$, as this analysis simplifies and extends bounds on the total number of prime factors of an odd perfect number.
Submitted 19 April, 2022;
originally announced April 2022.
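To make the equation in the abstract above concrete for $n=2$, the following brute-force sketch searches small $x$ for solutions of $Φ_3(x)=Φ_3(a)Φ_3(b)$ with both factors prime, using $Φ_3(x)=x^2+x+1$. This is plain enumeration, not the parameterization developed in the paper; the function names and the search limit are assumptions.

```python
def phi3(x):
    """Third cyclotomic polynomial evaluated at x."""
    return x * x + x + 1


def is_prime(n):
    """Primality test by trial division (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def two_prime_solutions(limit):
    """Triples (x, a, b), a <= b, with phi3(x) == phi3(a) * phi3(b),
    both factors prime, and 1 <= x <= limit."""
    # phi3 is strictly increasing on positive integers, so each prime
    # value corresponds to exactly one argument.
    prime_values = {phi3(a): a for a in range(1, limit + 1) if is_prime(phi3(a))}
    solutions = []
    for x in range(1, limit + 1):
        target = phi3(x)
        for p, a in prime_values.items():
            if target % p == 0:
                b = prime_values.get(target // p)
                if b is not None and a <= b:
                    solutions.append((x, a, b))
    return solutions


print(two_prime_solutions(50))
```

One family of solutions follows from the factorisation $x^4+x^2+1=(x^2+x+1)(x^2-x+1)$, which gives $Φ_3(x^2)=Φ_3(x)Φ_3(x-1)$; for example $Φ_3(4)=21=Φ_3(1)Φ_3(2)=3\cdot 7$.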
-
The extended Smale's 9th problem -- On computational barriers and paradoxes in estimation, regularisation, computer-assisted proofs and learning
Authors:
Alexander Bastounis,
Anders C Hansen,
Verner Vlačić
Abstract:
Linear and semidefinite programming (LP, SDP), regularisation through basis pursuit (BP) and Lasso have seen great success in mathematics, statistics, data science, computer-assisted proofs and learning. The success of LP is traditionally attributed to the fact that it is "in P" for rational inputs. On the other hand, in his list of problems for the 21st century S. Smale calls for "[Computational] models which process approximate inputs and which permit round-off computations". Indeed, since e.g. the exponential function does not have an exact representation or floating-point arithmetic approximates every rational number that is not in base-2, inexact input is a daily encounter. The model allowing inaccurate input of arbitrary precision, which we call the extended model, leads to extended versions of fundamental problems such as: "Are LP and other aforementioned problems in P?" The same question can be asked for an extended version of Smale's 9th problem on the list of mathematical problems for the 21st century: "Is there a polynomial time algorithm over the real numbers which decides the feasibility of the linear system of inequalities, and if so, outputs a feasible candidate?" One can thus pose this problem in the extended model. Similarly, the optimisation problems BP, SDP and Lasso, where the task is to output a solution to a specified precision, can likewise be posed in the extended model, also considering randomised algorithms. We will collectively refer to these problems as the extended Smale's 9th problem, which we settle in both the negative and the positive yielding two surprises: (1) In mathematics, sparse regularisation, statistics, and learning, one successfully computes with non-computable functions. (2) In order to mathematically characterise this phenomenon, one needs an intricate complexity theory for, seemingly paradoxically, non-computable functions.
Submitted 2 August, 2022; v1 submitted 29 October, 2021;
originally announced October 2021.
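The remark in the abstract above that floating-point arithmetic only approximates most rational inputs can be checked directly. The sketch below compares a decimal literal, read as an exact rational, with the value the binary double actually stores; the helper name is hypothetical, and the example says nothing about the paper's extended model beyond this basic observation.

```python
from fractions import Fraction


def stored_exactly(text):
    """True if the decimal string `text` is represented exactly as a binary double."""
    exact = Fraction(text)          # the rational number as written
    stored = Fraction(float(text))  # the exact value of the nearest double
    return exact == stored


for text in ("0.5", "0.25", "0.1", "0.3"):
    print(text, "stored exactly:", stored_exactly(text))
```

Values such as 0.5 whose denominator is a power of two survive unchanged, while 0.1 is silently replaced by a nearby dyadic rational, so an algorithm reading it never sees the intended input.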
-
GMRES Methods for Tomographic Reconstruction with an Unmatched Back Projector
Authors:
Per Christian Hansen,
Ken Hayami,
Keiichi Morikuni
Abstract:
Unmatched pairs of forward and back projectors are common in X-ray CT computations for large-scale problems; they are caused by the need for fast algorithms that best utilize the computer hardware, and it is an interesting and challenging task to develop fast and easy-to-use algorithms for these cases. Our approach is to use preconditioned GMRES, in the form of the AB- and BA-GMRES algorithms, to handle the unmatched normal equations associated with an unmatched pair. These algorithms are simple to implement, they rely only on computations with the available forward and back projectors, and they do not require the tuning of any algorithm parameters. We show that these algorithms are equivalent to well-known LSQR and LSMR algorithms in the case of a matched projector. Our numerical experiments demonstrate that AB- and BA-GMRES exhibit a desired semi-convergence behavior that is comparable with LSQR/LSMR and that standard stopping rules work well. Hence, AB- and BA-GMRES are suited for large-scale CT reconstruction problems with noisy data and unmatched projector pairs.
Submitted 7 January, 2022; v1 submitted 4 October, 2021;
originally announced October 2021.
-
The mathematics of adversarial attacks in AI -- Why deep learning is unstable despite the existence of stable neural networks
Authors:
Alexander Bastounis,
Anders C Hansen,
Verner Vlačić
Abstract:
The unprecedented success of deep learning (DL) makes it unchallenged when it comes to classification problems. However, it is well established that the current DL methodology produces universally unstable neural networks (NNs). The instability problem has caused an enormous research effort -- with a vast literature on so-called adversarial attacks -- yet there has been no solution to the problem. Our paper addresses why there has been no solution to the problem, as we prove the following mathematical paradox: any training procedure based on training neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate) -- despite the provable existence of both accurate and stable neural networks for the same classification problems. The key is that the stable and accurate neural networks must have variable dimensions depending on the input; in particular, variable dimensions are a necessary condition for stability.
Our result points towards the paradox that accurate and stable neural networks exist; however, modern algorithms do not compute them. This yields the question: if the existence of neural networks with desirable properties can be proven, can one also find algorithms that compute them? There are cases in mathematics where provable existence implies computability, but will this be the case for neural networks? The contrary is true, as we demonstrate how neural networks can provably exist as approximate minimisers to standard optimisation problems with standard cost functions; however, no randomised algorithm can compute them with probability better than 1/2.
Submitted 26 March, 2025; v1 submitted 13 September, 2021;
originally announced September 2021.
-
Inference for Dependent Data with Learned Clusters
Authors:
Jianfei Cao,
Christian Hansen,
Damian Kozbur,
Lucciano Villacorta
Abstract:
This paper presents and analyzes an approach to cluster-based inference for dependent data. The primary setting considered here is with spatially indexed data in which the dependence structure of observed random variables is characterized by a known, observed dissimilarity measure over spatial indices. Observations are partitioned into clusters with the use of an unsupervised clustering algorithm applied to the dissimilarity measure. Once the partition into clusters is learned, a cluster-based inference procedure is applied to a statistical hypothesis testing procedure. The procedure proposed in the paper allows the number of clusters to depend on the data, which gives researchers a principled method for choosing an appropriate clustering level. The paper gives conditions under which the proposed procedure asymptotically attains correct size. A simulation study shows that the proposed procedure attains near nominal size in finite samples in a variety of statistical testing problems with dependent data.
Submitted 14 November, 2022; v1 submitted 30 July, 2021;
originally announced July 2021.
-
Uncertainty Quantification of Inclusion Boundaries in the Context of X-ray Tomography
Authors:
Babak Maboudi Afkham,
Yiqiu Dong,
Per Christian Hansen
Abstract:
In this work, we describe a Bayesian framework for reconstructing the boundaries of piecewise smooth regions in the X-ray computed tomography (CT) problem in an infinite-dimensional setting. In addition to the reconstruction, we are also able to quantify the uncertainty of the predicted boundaries. Our approach is goal oriented, meaning that we directly detect the discontinuities from the data, instead of reconstructing the entire image. This drastically reduces the dimension of the problem, which makes the application of Markov Chain Monte Carlo (MCMC) methods feasible. We show that our method provides an excellent platform for challenging X-ray CT scenarios (e.g., in case of noisy data, limited angle, or sparse angle imaging). We investigate the performance and accuracy of our method on synthetic data as well as on real-world data. The numerical results indicate that our method accurately detects boundaries of piecewise smooth regions and quantifies the uncertainty in the prediction.
Submitted 19 December, 2022; v1 submitted 14 July, 2021;
originally announced July 2021.
-
Inference for Low-Rank Models
Authors:
Victor Chernozhukov,
Christian Hansen,
Yuan Liao,
Yinchu Zhu
Abstract:
This paper studies inference in linear models with a high-dimensional parameter matrix that can be well-approximated by a ``spiked low-rank matrix.'' A spiked low-rank matrix has rank that grows slowly compared to its dimensions and nonzero singular values that diverge to infinity. We show that this framework covers a broad class of models of latent-variables which can accommodate matrix completion problems, factor models, varying coefficient models, and heterogeneous treatment effects. For inference, we apply a procedure that relies on an initial nuclear-norm penalized estimation step followed by two ordinary least squares regressions. We consider the framework of estimating incoherent eigenvectors and use a rotation argument to argue that the eigenspace estimation is asymptotically unbiased. Using this framework we show that our procedure provides asymptotically normal inference and achieves the semiparametric efficiency bound. We illustrate our framework by providing low-level conditions for its application in a treatment effects context where treatment assignment might be strongly dependent.
Submitted 2 January, 2023; v1 submitted 6 July, 2021;
originally announced July 2021.
-
Stopping Rules for Algebraic Iterative Reconstruction Methods in Computed Tomography
Authors:
Per Christian Hansen,
Jakob Sauer Jørgensen,
Peter Winkel Rasmussen
Abstract:
Algebraic models for the reconstruction problem in X-ray computed tomography (CT) provide a flexible framework that applies to many measurement geometries. For large-scale problems we need to use iterative solvers, and we need stopping rules for these methods that terminate the iterations when we have computed a satisfactory reconstruction that balances the reconstruction error and the influence of noise from the measurements. Many such stopping rules have been developed in the inverse problems communities, but they have not received much attention in the CT world. The goal of this paper is to describe and illustrate four stopping rules that are relevant for CT reconstructions.
Submitted 18 June, 2021;
originally announced June 2021.
-
Can stable and accurate neural networks be computed? -- On the barriers of deep learning and Smale's 18th problem
Authors:
Matthew J. Colbrook,
Vegard Antun,
Anders C. Hansen
Abstract:
Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities; however, there does not exist any algorithm, even randomised, that can train (or compute) such an NN. For any positive integers $K > 2$ and $L$, there are cases where simultaneously: (a) no randomised training algorithm can compute an NN correct to $K$ digits with probability greater than $1/2$, (b) there exists a deterministic training algorithm that computes an NN with $K-1$ correct digits, but any such (even randomised) algorithm needs arbitrarily many training data, (c) there exists a deterministic training algorithm that computes an NN with $K-2$ correct digits using no more than $L$ training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce Fast Iterative REstarted NETworks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only $\mathcal{O}(|\log(\epsilon)|)$ layers are needed for an $\epsilon$-accurate solution to the inverse problem.
Submitted 15 April, 2021; v1 submitted 20 January, 2021;
originally announced January 2021.
-
On the infinite-dimensional QR algorithm
Authors:
Matthew J. Colbrook,
Anders C. Hansen
Abstract:
Spectral computations of infinite-dimensional operators are notoriously difficult, yet ubiquitous in the sciences. Indeed, despite more than half a century of research, it is still unknown which classes of operators allow for computation of spectra and eigenvectors with convergence rates and error control. Recent progress in classifying the difficulty of spectral problems into complexity hierarchies has revealed that the most difficult spectral problems are so hard that one needs three limits in the computation, and no convergence rates nor error control is possible. This begs the question: which classes of operators allow for computations with convergence rates and error control? In this paper we address this basic question, and the algorithm used is an infinite-dimensional version of the QR algorithm. Indeed, we generalise the QR algorithm to infinite-dimensional operators. We prove that not only is the algorithm executable on a finite machine, but one can also recover the extremal parts of the spectrum and corresponding eigenvectors, with convergence rates and error control. This allows for new classification results in the hierarchy of computational problems that existing algorithms have not been able to capture. The algorithm and convergence theorems are demonstrated on a wealth of examples with comparisons to standard approaches (that are notorious for providing false solutions). We also find that in some cases the IQR algorithm performs better than predicted by theory and make conjectures for future study.
Submitted 16 November, 2020;
originally announced November 2020.
-
The asymptotic tails of limit distributions of continuous time Markov chains
Authors:
Chuang Xu,
Mads Christian Hansen,
Carsten Wiuf
Abstract:
This paper investigates tail asymptotics of stationary distributions and quasi-stationary distributions (QSDs) of continuous-time Markov chains on subsets of the non-negative integers. Based on the so-called flux-balance equation, we establish identities for stationary measures and QSDs, which we use to derive tail asymptotics. In particular, for continuous-time Markov chains with asymptotic power law transition rates, tail asymptotics for stationary distributions and QSDs are classified into three types using three easily computable parameters: (i) super-exponential distributions, (ii) exponential-tailed distributions, and (iii) sub-exponential distributions. Our approach to establishing tail asymptotics of stationary distributions is different from the classical semimartingale approach, and we impose neither ergodicity nor moment bound conditions. In particular, the results also hold for explosive Markov chains, for which multiple stationary distributions may exist. Furthermore, our results on tail asymptotics of QSDs seem new. We apply our results to biochemical reaction networks, a general single-cell stochastic gene expression model, an extended class of branching processes, and stochastic population processes with bursty reproduction, none of which are birth-death processes. The approach together with the identities easily extends to discrete time Markov chains.
Submitted 14 August, 2023; v1 submitted 22 July, 2020;
originally announced July 2020.
-
Full classification of dynamics for one-dimensional continuous time Markov chains with polynomial transition rates
Authors:
Chuang Xu,
Mads Christian Hansen,
Carsten Wiuf
Abstract:
This paper provides full classification of dynamics for continuous time Markov chains (CTMCs) on the non-negative integers with polynomial transition rate functions. Such stochastic processes are abundant in applications, in particular in biology. More precisely, for CTMCs of bounded jumps, we provide necessary and sufficient conditions in terms of calculable parameters for explosivity, recurrence vs transience, certain absorption, positive recurrence vs null recurrence, and implosivity. Simple sufficient conditions for exponential ergodicity of stationary distributions and quasi-stationary distributions as well as existence and non-existence of moments of hitting times are also obtained. Similar simple sufficient conditions for the aforementioned dynamics together with their opposite dynamics are established for CTMCs with unbounded jumps. The results generalize respective criteria for birth-death processes by Karlin and McGregor in the 1960s. Finally, we apply our results to stochastic reaction networks, an extended class of branching processes, a general bursty single-cell stochastic gene expression model, and population processes, none of which are birth-death processes. The approach is based on a mixture of Lyapunov-Foster type results, semimartingale approach, as well as estimates of stationary measures.
Submitted 30 November, 2021; v1 submitted 17 June, 2020;
originally announced June 2020.
-
Structural classification of continuous time Markov chains with applications
Authors:
Chuang Xu,
Mads Christian Hansen,
Carsten Wiuf
Abstract:
This paper is motivated by examples from stochastic reaction network theory. The $Q$-matrix of a stochastic reaction network can be derived from the reaction graph, an edge-labelled directed graph encoding the jump vectors of an associated continuous time Markov chain on the invariant space $\mathbb{N}^d_0$. An open question is how to decompose the space $\mathbb{N}^d_0$ into neutral, trapping, and escaping states, and open and closed communicating classes, and whether this can be done from the reaction graph alone. Such general continuous time Markov chains can be understood as natural generalizations of birth-death processes, incorporating multiple different birth and death mechanisms. We characterize the structure of $\mathbb{N}^d_0$ imposed by a general $Q$-matrix generating continuous time Markov chains with values in $\mathbb{N}^d_0$, in terms of the set of jump vectors and their corresponding transition rate functions. Thus the setting is not limited to stochastic reaction networks. Furthermore, we define structural equivalence of two $Q$-matrices, and provide sufficient conditions for structural equivalence. Examples are abundant in applications. We apply the results to stochastic reaction networks, a Lotka-Volterra model in ecology, the EnvZ-OmpR system in systems biology, and a class of extended branching processes, none of which are birth-death processes.
Submitted 22 December, 2021; v1 submitted 17 June, 2020;
originally announced June 2020.
-
Dynamics of continuous time Markov chains with applications
Authors:
Chuang Xu,
Mads Christian Hansen,
Carsten Wiuf
Abstract:
This paper contributes an in-depth study of properties of continuous time Markov chains (CTMCs) on non-negative integer lattices $\mathbb{N}_0^d$, with particular interest in one-dimensional CTMCs with polynomial transition rates. Such stochastic processes are abundant in applications, in particular in biology. We characterize the structure of the state space of general CTMCs on $\mathbb{N}_0^d$ in terms of the set of jump vectors and their corresponding transition rate functions. For CTMCs on $\mathbb{N}_0$ with polynomial transition rate functions, we provide threshold criteria in terms of easily computable parameters for various dynamical properties such as explosivity, recurrence, transience, certain absorption, positive/null recurrence, implosivity, and existence and non-existence of moments of hitting times. In particular, simple sufficient conditions for exponential ergodicity of stationary distributions and quasi-stationary distributions are obtained, and the few gap cases are well-illustrated by examples. Subtle differences in conditions for different dynamical properties are revealed in terms of examples. Finally, we apply our results to stochastic reaction networks, an extended class of branching processes, a general bursty single-cell stochastic gene expression model, and population processes which are not birth-death processes.
Submitted 19 June, 2020; v1 submitted 27 September, 2019;
originally announced September 2019.
-
Non-uniform recovery guarantees for binary measurements and infinite-dimensional compressed sensing
Authors:
Laura Thesing,
Anders Christian Hansen
Abstract:
Due to the many applications in Magnetic Resonance Imaging (MRI), Nuclear Magnetic Resonance (NMR), radio interferometry, helium atom scattering etc., the theory of compressed sensing with Fourier transform measurements has reached a mature level. However, for binary measurements via the Walsh transform, the theory has been almost non-existent, despite the large number of applications such as fluorescence microscopy, single pixel cameras, lensless cameras, compressive holography, laser-based failure-analysis etc. Binary measurements are a mainstay in signal and image processing and can be modelled by the Walsh transform and Walsh series that are binary cousins of the respective Fourier counterparts. We help bridge the theoretical gap by providing non-uniform recovery guarantees for infinite-dimensional compressed sensing with Walsh samples and wavelet reconstruction. The theoretical results demonstrate that compressed sensing with Walsh samples, as long as the sampling strategy is highly structured and follows the structured sparsity of the signal, is as effective as in the Fourier case. However, there is a fundamental difference in the asymptotic results when the smoothness and vanishing moments of the wavelet increase. In the Fourier case, this changes the optimal sampling patterns, whereas this is not the case in the Walsh setting.
Submitted 30 March, 2021; v1 submitted 3 September, 2019;
originally announced September 2019.
-
The foundations of spectral computations via the Solvability Complexity Index hierarchy
Authors:
Matthew J. Colbrook,
Anders C. Hansen
Abstract:
The problem of computing spectra of operators is arguably one of the most investigated areas of computational mathematics. However, the problem of computing spectra of general bounded infinite matrices has only recently been solved. We establish some of the foundations of computational spectral theory through the Solvability Complexity Index (SCI) hierarchy, an approach closely related to Smale's program on the foundations of computational mathematics and McMullen's results on polynomial root finding with rational maps. Infinite-dimensional problems yield an intricate infinite classification theory, determining which spectral problems can be solved and with what types of algorithms. We provide answers to many longstanding open questions on the existence of algorithms. For example, we show that spectra can be computed, with error control, from point sampling operator coefficients for large classes of partial differential operators on unbounded domains. Further results include: computing spectra of (possibly unbounded) operators on graphs and separable Hilbert spaces with error control; determining if the spectrum intersects a compact set; the computational spectral gap problem and computing spectral classifications at the bottom of the spectrum; and computing discrete spectra, multiplicities, eigenspaces and determining if the discrete spectrum is non-empty. Moreover, the positive results with error control can be used in computer-assisted proofs. In contrast, the negative results preclude computer-assisted proofs for classes of operators as a whole. Our proofs are constructive, yielding a library of new algorithms and techniques that handle problems that before were out of reach. We demonstrate these algorithms on challenging problems, giving concrete examples of the failure of traditional approaches (e.g., "spectral pollution") compared to the introduced techniques.
Submitted 17 September, 2022; v1 submitted 26 August, 2019;
originally announced August 2019.
-
On the stable sampling rate for binary measurements and wavelet reconstruction
Authors:
Anders Christian Hansen,
Laura Thesing
Abstract:
This paper is concerned with the problem of reconstructing an infinite-dimensional signal from a limited number of linear measurements. In particular, we show that for binary measurements (modelled with Walsh functions and Hadamard matrices) and wavelet reconstruction the stable sampling rate is linear. This implies that binary measurements are as efficient as Fourier samples when using wavelets as the reconstruction space. Powerful techniques for reconstructions include generalized sampling and its compressed versions, as well as recent methods based on data assimilation. Common to these methods is that the reconstruction quality depends highly on the subspace angle between the sampling and the reconstruction space, which is dictated by the stable sampling rate. As a result of the theory provided in this paper, these methods can now easily use binary measurements and wavelet reconstruction bases.
Submitted 29 July, 2019;
originally announced August 2019.
-
A twin error gauge for Kaczmarz's iterations
Authors:
Bart S. van Lith,
Per Christian Hansen,
Michiel E. Hochstenbach
Abstract:
We propose two new algebraic reconstruction techniques based on Kaczmarz's method that produce a regularized solution to noisy tomography problems. Tomography problems exhibit semi-convergence when iterative methods are employed, and the aim is therefore to stop near the semi-convergence point. Our approach is based on an error gauge that is constructed by pairing standard down-sweep Kaczmarz's method with its up-sweep version; we stop the iterations when this error gauge is minimal. The reconstructions of the new methods differ from standard Kaczmarz iterates in that our final result is the average of the stopped up- and down-sweeps. Even when Kaczmarz's method is supplied with an oracle that provides the exact error -- and is thereby able to stop at the best possible iterate -- our methods have a lower two-norm error in the vast majority of our test cases. In terms of computational cost, our methods are a little cheaper than standard Kaczmarz equipped with a statistical stopping rule.
Submitted 28 January, 2021; v1 submitted 18 June, 2019;
originally announced June 2019.
-
Fixing Nonconvergence of Algebraic Iterative Reconstruction with an Unmatched Backprojector
Authors:
Yiqiu Dong,
Per Christian Hansen,
Michiel E. Hochstenbach,
Nicolai Andre Brogaard Riis
Abstract:
We consider algebraic iterative reconstruction methods with applications in image reconstruction. In particular, we are concerned with methods based on an unmatched projector/backprojector pair; i.e., the backprojector is not the exact adjoint or transpose of the forward projector. Such situations are common in large-scale computed tomography, and we consider the common situation where the method does not converge due to the nonsymmetry of the iteration matrix. We propose a modified algorithm that incorporates a small shift parameter, and we give the conditions that guarantee convergence of this method to a fixed point of a slightly perturbed problem. We also give perturbation bounds for this fixed point. Moreover, we discuss how to use Krylov subspace methods to efficiently estimate the leftmost eigenvalue of a certain matrix to select a proper shift parameter. The modified algorithm is illustrated with test problems from computed tomography.
Submitted 13 February, 2019; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Inference for Heterogeneous Effects using Low-Rank Estimation of Factor Slopes
Authors:
Victor Chernozhukov,
Christian Hansen,
Yuan Liao,
Yinchu Zhu
Abstract:
We study a panel data model with general heterogeneous effects where slopes are allowed to vary across both individuals and over time. The key dimension reduction assumption we employ is that the heterogeneous slopes can be expressed as having a factor structure so that the high-dimensional slope matrix is low-rank and can thus be estimated using low-rank regularized regression. We provide a simple multi-step estimation procedure for the heterogeneous effects. The procedure makes use of sample-splitting and orthogonalization to accommodate inference following the use of penalized low-rank estimation. We formally verify that the resulting estimator is asymptotically normal allowing simple construction of inferential statements for the individual-time-specific effects and for cross-sectional averages of these effects. We illustrate the proposed method in simulation experiments and by estimating the effect of the minimum wage on employment.
Submitted 4 September, 2019; v1 submitted 19 December, 2018;
originally announced December 2018.
-
Existence of a Unique Quasi-stationary Distribution for Stochastic Reaction Networks
Authors:
Mads Christian Hansen,
Carsten Wiuf
Abstract:
In the setting of stochastic dynamical systems that eventually go extinct, the quasi-stationary distributions are useful to understand the long-term behavior of a system before evanescence. For a broad class of applicable continuous-time Markov processes on countably infinite state spaces, known as reaction networks, we introduce the inferred notion of absorbing and endorsed sets, and obtain sufficient conditions for the existence and uniqueness of a quasi-stationary distribution within each such endorsed set. In particular, we obtain sufficient conditions for the existence of a globally attracting quasi-stationary distribution in the space of probability measures on the set of endorsed states. Furthermore, under these conditions, the convergence from any initial distribution to the quasi-stationary distribution is exponential in the total variation norm.
Submitted 21 August, 2018;
originally announced August 2018.
-
High-Dimensional Econometrics and Regularized GMM
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Denis Chetverikov,
Christian Hansen,
Kengo Kato
Abstract:
This chapter presents key concepts and theoretical results for analyzing estimation and inference in high-dimensional models. High-dimensional models are characterized by having a number of unknown parameters that is not vanishingly small relative to the sample size. We first present results in a framework where estimators of parameters of interest may be represented directly as approximate means. Within this context, we review fundamental results including high-dimensional central limit theorems, bootstrap approximation of high-dimensional limit distributions, and moderate deviation theory. We also review key concepts underlying inference when many parameters are of interest such as multiple testing with family-wise error rate or false discovery rate control. We then turn to a general high-dimensional minimum distance framework with a special focus on generalized method of moments problems where we present results for estimation and inference about model parameters. The presented results cover a wide array of econometric applications, and we discuss several leading special cases including high-dimensional linear regression and linear instrumental variables models to illustrate the general results.
Submitted 10 June, 2018; v1 submitted 5 June, 2018;
originally announced June 2018.
-
IR Tools: A MATLAB Package of Iterative Regularization Methods and Large-Scale Test Problems
Authors:
Silvia Gazzola,
Per Christian Hansen,
James G. Nagy
Abstract:
This paper describes a new MATLAB software package of iterative regularization methods and test problems for large-scale linear inverse problems. The software package, called IR Tools, serves two related purposes: we provide implementations of a range of iterative solvers, including several recently proposed methods that are not available elsewhere, and we provide a set of large-scale test problems in the form of discretizations of 2D linear inverse problems. The solvers include iterative regularization methods where the regularization is due to the semi-convergence of the iterations, Tikhonov-type formulations where the regularization is explicitly formulated in the form of a regularization term, and methods that can impose bound constraints on the computed solutions. All the iterative methods are implemented in a very flexible fashion that allows the problem's coefficient matrix to be available as a (sparse) matrix, a function handle, or an object. The most basic call to all of the various iterative methods requires only this matrix and the right hand side vector; if the method uses any special stopping criteria, regularization parameters, etc., then default values are set automatically by the code. Moreover, through the use of an optional input structure, the user can also have full control of any of the algorithm parameters. The test problems represent realistic large-scale problems found in image reconstruction and several other applications. Numerical examples illustrate the various algorithms and test problems available in this package.
Submitted 1 July, 2018; v1 submitted 15 December, 2017;
originally announced December 2017.
-
Targeted Undersmoothing
Authors:
Christian Hansen,
Damian Kozbur,
Sanjog Misra
Abstract:
This paper proposes a post-model selection inference procedure, called targeted undersmoothing, designed to construct uniformly valid confidence sets for a broad class of functionals of sparse high-dimensional statistical models. These include dense functionals, which may potentially depend on all elements of an unknown high-dimensional parameter. The proposed confidence sets are based on an initially selected model and two additionally selected models, an upper model and a lower model, which enlarge the initially selected model. We illustrate application of the procedure in two empirical examples. The first example considers estimation of heterogeneous treatment effects using data from the Job Training Partnership Act of 1982, and the second example looks at estimating profitability from a mailing strategy based on estimated heterogeneous treatment effects in a direct mail marketing campaign. We also provide evidence on the finite sample performance of the proposed targeted undersmoothing procedure through a series of simulation experiments.
Submitted 7 June, 2018; v1 submitted 22 June, 2017;
originally announced June 2017.
-
Complexity Issues in Computing Spectra, Pseudospectra and Resolvents
Authors:
Anders C. Hansen,
Olavi Nevanlinna
Abstract:
We display methods that allow for computations of spectra, pseudospectra and resolvents of linear operators on Hilbert spaces and also elements in unital Banach algebras. The paper considers two different approaches, namely, pseudospectral techniques and polynomial numerical hull theory. The former is used for Hilbert space operators whereas the latter can handle the general case of elements in a Banach algebra. This approach leads to multicentric holomorphic calculus. We also discuss some new types of pseudospectra and the recently defined Solvability Complexity Index.
Submitted 22 October, 2016;
originally announced October 2016.
-
Computing Spectra -- On the Solvability Complexity Index Hierarchy and Towers of Algorithms
Authors:
Jonathan Ben-Artzi,
Matthew J. Colbrook,
Anders C. Hansen,
Olavi Nevanlinna,
Markus Seidel
Abstract:
This paper establishes some of the fundamental barriers in the theory of computations and finally settles the long-standing computational spectral problem. That is to determine the existence of algorithms that can compute spectra $\mathrm{sp}(A)$ of classes of bounded operators $A = \{a_{ij}\}_{i,j \in \mathbb{N}} \in \mathcal{B}(l^2(\mathbb{N}))$, given the matrix elements $\{a_{ij}\}_{i,j \in \mathbb{N}}$, that are sharp in the sense that they achieve the boundary of what a digital computer can achieve. Similarly, for a Schrödinger operator $H = -Δ+V$, determine the existence of algorithms that can compute the spectrum $\mathrm{sp}(H)$ given point samples of the potential function $V$. In order to solve these problems, we establish the Solvability Complexity Index (SCI) hierarchy and provide a collection of new algorithms that allow for problems that were previously out of reach. The SCI is the smallest number of limits needed in the computation, yielding a classification hierarchy for all types of problems in computational mathematics that determines the boundaries of what computers can achieve in scientific computing. In addition, the SCI hierarchy provides classifications of computational problems that can be used in computer-assisted proofs. The SCI hierarchy captures many key computational issues in the history of mathematics including the insolvability of the quintic, Smale's problem on the existence of an iterative generally convergent algorithm for polynomial root finding, the computational spectral problem, inverse problems, optimisation etc.
Submitted 15 June, 2020; v1 submitted 13 August, 2015;
originally announced August 2015.
-
A Tensor-Based Dictionary Learning Approach to Tomographic Image Reconstruction
Authors:
Sara Soltani,
Misha E. Kilmer,
Per Christian Hansen
Abstract:
We consider tomographic reconstruction using priors in the form of a dictionary learned from training images. The reconstruction has two stages: first we construct a tensor dictionary prior from our training data, and then we pose the reconstruction problem in terms of recovering the expansion coefficients in that dictionary. Our approach differs from past approaches in that a) we use a third-order tensor representation for our images and b) we recast the reconstruction problem using the tensor formulation. The dictionary learning problem is presented as a non-negative tensor factorization problem with sparsity constraints. The reconstruction problem is formulated in a convex optimization framework by looking for a solution with a sparse representation in the tensor dictionary. Numerical results show that our tensor formulation leads to very sparse representations of both the training images and the reconstructions due to the ability of representing repeated features compactly in the dictionary.
Submitted 8 June, 2015;
originally announced June 2015.
-
Noise Robustness of a Combined Phase Retrieval and Reconstruction Method for Phase-Contrast Tomography
Authors:
Rasmus Dalgas Kongskov,
Jakob Sauer Jørgensen,
Henning Friis Poulsen,
Per Christian Hansen
Abstract:
Classical reconstruction methods for phase-contrast tomography consist of two stages: phase retrieval and tomographic reconstruction. A novel algebraic method combining the two was suggested by Kostenko et al. (Opt. Express, 21, 12185, 2013), and preliminary results demonstrating improved reconstruction compared to a two-stage method were given. Using simulated free-space propagation experiments with a single sample-detector distance, we thoroughly compare the novel method with the two-stage method to address limitations of the preliminary results. We demonstrate that the novel method is substantially more robust towards noise; our simulations point to a possible reduction in counting times by an order of magnitude.
Submitted 7 September, 2015; v1 submitted 11 May, 2015;
originally announced May 2015.
-
Tomographic Image Reconstruction using Training images
Authors:
Sara Soltani,
Martin S. Andersen,
Per Christian Hansen
Abstract:
We describe and examine an algorithm for tomographic image reconstruction where prior knowledge about the solution is available in the form of training images. We first construct a nonnegative dictionary based on prototype elements from the training images; this problem is formulated as a regularized non-negative matrix factorization. Incorporating the dictionary as a prior in a convex reconstruction problem, we then find an approximate solution with a sparse representation in the dictionary. The dictionary is applied to non-overlapping patches of the image, which reduces the computational complexity compared to other algorithms. Computational experiments clarify the choice and interplay of the model parameters and the regularization parameters, and we show that in few-projection low-dose settings our algorithm is competitive with total variation regularization and tends to include more texture and more correct edges.
Submitted 17 August, 2015; v1 submitted 6 March, 2015;
originally announced March 2015.
-
Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach
Authors:
Victor Chernozhukov,
Christian Hansen,
Martin Spindler
Abstract:
Here we present an expository, general analysis of valid post-selection or post-regularization inference about a low-dimensional target parameter, $α$, in the presence of a very high-dimensional nuisance parameter, $η$, which is estimated using modern selection or regularization methods. Our analysis relies on high-level, easy-to-interpret conditions that allow one to clearly see the structures needed for achieving valid post-regularization inference. Simple, readily verifiable sufficient conditions are provided for a class of affine-quadratic models. We focus our discussion on estimation and inference procedures based on using the empirical analog of theoretical equations $$M(α, η)=0$$ which identify $α$. Within this structure, we show that setting up such equations in a manner such that the orthogonality/immunization condition $$\partial_ηM(α, η) = 0$$ at the true parameter values is satisfied, coupled with plausible conditions on the smoothness of $M$ and the quality of the estimator $\hat η$, guarantees that inference for the main parameter $α$ based on testing or point estimation methods discussed below will be regular despite selection or regularization biases occurring in estimation of $η$. In particular, the estimator of $α$ will often be uniformly consistent at the root-$n$ rate and uniformly asymptotically normal even though estimators $\hat η$ will generally not be asymptotically linear and regular. The uniformity holds over large classes of models that do not impose highly implausible "beta-min" conditions. We also show that inference can be carried out by inverting tests formed from Neyman's $C(α)$ (orthogonal score) statistics.
Submitted 18 August, 2015; v1 submitted 14 January, 2015;
originally announced January 2015.
-
On the absence of the RIP in real-world applications of compressed sensing and the RIP in levels
Authors:
Alexander Bastounis,
Anders C. Hansen
Abstract:
The purpose of this paper is twofold. The first is to point out that the Restricted Isometry Property (RIP) does not hold in many applications where compressed sensing is successfully used. This includes fields like Magnetic Resonance Imaging (MRI), Computerized Tomography, Electron Microscopy, Radio Interferometry and Fluorescence Microscopy. We demonstrate that for natural compressed sensing matrices involving a level based reconstruction basis (e.g. wavelets), the number of measurements required to recover all $s$-sparse signals for reasonable $s$ is excessive. In particular, uniform recovery of all $s$-sparse signals is quite unrealistic. This realisation shows that the RIP is insufficient for explaining the success of compressed sensing in various practical applications. The second purpose of the paper is to introduce a new framework based on a generalised RIP-like definition that fits the applications where compressed sensing is used. We show that the shortcomings that show that uniform recovery is unreasonable no longer apply if we instead ask for structured recovery that is uniform only within each of the levels. To examine this phenomenon, a new tool, termed the 'Restricted Isometry Property in Levels' is described and analysed. Furthermore, we show that with certain conditions on the Restricted Isometry Property in Levels, a form of uniform recovery within each level is possible. Finally, we conclude the paper by providing examples that demonstrate the optimality of the results obtained.
Submitted 16 October, 2015; v1 submitted 17 November, 2014;
originally announced November 2014.
-
Density theorems for nonuniform sampling of bandlimited functions using derivatives or bunched measurements
Authors:
Ben Adcock,
Milana Gataric,
Anders C. Hansen
Abstract:
We provide a sufficient density condition for a set of nonuniform samples to give rise to a set of sampling for multivariate bandlimited functions when the measurements consist of pointwise evaluations of a function and its first $k$ derivatives. Along with explicit estimates of corresponding frame bounds, we derive the explicit density bound and show that, as $k$ increases, it grows linearly in $k+1$ with the constant of proportionality $1/\mathrm{e}$. Seeking larger gap conditions, we also prove a multivariate perturbation result for nonuniform samples that are sufficiently close to sets of sampling, e.g. to uniform samples taken at $k+1$ times the Nyquist rate.
Additionally, in the univariate setting, we consider a related problem of so-called nonuniform bunched sampling, where in each sampling interval $s+1$ bunched measurements of a function are taken and the sampling intervals are permitted to be of different length. We derive an explicit density condition which grows linearly in $s+1$ for large $s$, with the constant of proportionality depending on the width of the bunches. The width of the bunches is allowed to be arbitrarily small, and moreover, for sufficiently narrow bunches and sufficiently large $s$, we obtain the same result as in the case of univariate sampling with $s$ derivatives.
Submitted 9 September, 2016; v1 submitted 2 November, 2014;
originally announced November 2014.
-
Recovering piecewise smooth functions from nonuniform Fourier measurements
Authors:
Ben Adcock,
Milana Gataric,
Anders C. Hansen
Abstract:
In this paper, we consider the problem of reconstructing piecewise smooth functions to high accuracy from nonuniform samples of their Fourier transform. We use the framework of nonuniform generalized sampling (NUGS) to do this, and to ensure high accuracy we employ reconstruction spaces consisting of splines or (piecewise) polynomials. We analyze the relation between the dimension of the reconstruction space and the bandwidth of the nonuniform samples, and show that it is linear for splines and piecewise polynomials of fixed degree, and quadratic for piecewise polynomials of varying degree.
Submitted 30 September, 2014;
originally announced October 2014.
-
"Plug-and-Play" Edge-Preserving Regularization
Authors:
Donghui Chen,
Misha E. Kilmer,
Per Christian Hansen
Abstract:
In many inverse problems it is essential to use regularization methods that preserve edges in the reconstructions, and many reconstruction models have been developed for this task, such as the Total Variation (TV) approach. The associated algorithms are complex and require a good knowledge of large-scale optimization algorithms, and they involve certain tolerances that the user must choose. We present a simpler approach that relies only on standard computational building blocks in matrix computations, such as orthogonal transformations, preconditioned iterative solvers, Kronecker products, and the discrete cosine transform -- hence the term "plug-and-play." We do not attempt to improve on TV reconstructions, but rather provide an easy-to-use approach to computing reconstructions with similar properties.
Submitted 4 June, 2014;
originally announced June 2014.
-
Weighted frames of exponentials and stable recovery of multidimensional functions from nonuniform Fourier samples
Authors:
Ben Adcock,
Milana Gataric,
Anders C. Hansen
Abstract:
In this paper, we consider the problem of recovering a compactly supported multivariate function from a collection of pointwise samples of its Fourier transform taken nonuniformly. We do this by using the concept of weighted Fourier frames. A seminal result of Beurling shows that sample points give rise to a classical Fourier frame provided they are relatively separated and of sufficient density. However, this result does not allow for arbitrary clustering of sample points, as is often the case in practice. Whilst keeping the density condition sharp and dimension independent, our first result removes the separation condition and shows that density alone suffices. However, this result does not lead to estimates for the frame bounds. A known result of Groechenig provides explicit estimates, but only subject to a density condition that deteriorates linearly with dimension. In our second result we improve these bounds by reducing the dimension dependence. In particular, we provide explicit frame bounds which are dimensionless for functions having compact support contained in a sphere. Next, we demonstrate how our two main results give new insight into a reconstruction algorithm---based on the existing generalized sampling framework---that allows for stable and quasi-optimal reconstruction in any particular basis from a finite collection of samples. Finally, we construct sufficiently dense sampling schemes that are often used in practice---jittered, radial and spiral sampling schemes---and provide several examples illustrating the effectiveness of our approach when tested on these schemes.
Submitted 6 September, 2015; v1 submitted 13 May, 2014;
originally announced May 2014.
-
A note on compressed sensing of structured sparse wavelet coefficients from subsampled Fourier measurements
Authors:
Ben Adcock,
Anders C. Hansen,
Bogdan Roman
Abstract:
This note complements the paper "The quest for optimal sampling: Computationally efficient, structure-exploiting measurements for compressed sensing" [2]. Its purpose is to present a proof of a result stated therein concerning the recovery via compressed sensing of a signal that has structured sparsity in a Haar wavelet basis when sampled using a multilevel-subsampled discrete Fourier transform. In doing so, it provides a simple exposition of the proof in the case of Haar wavelets and discrete Fourier samples of a more general result recently provided in the paper "Breaking the coherence barrier: A new theory for compressed sensing" [1].
Submitted 14 June, 2014; v1 submitted 25 March, 2014;
originally announced March 2014.
-
The quest for optimal sampling: Computationally efficient, structure-exploiting measurements for compressed sensing
Authors:
Ben Adcock,
Anders C. Hansen,
Bogdan Roman
Abstract:
An intriguing phenomenon in many instances of compressed sensing is that the reconstruction quality is governed not just by the overall sparsity of the signal, but also by its structure. This paper is about understanding this phenomenon, and demonstrating how it can be fruitfully exploited by the design of suitable sampling strategies in order to outperform more standard compressed sensing techniques based on random matrices.
Submitted 27 March, 2014; v1 submitted 25 March, 2014;
originally announced March 2014.