-
Efficient Implementation of Third-order Tensor Methods with Adaptive Regularization for Unconstrained Optimization
Authors:
Coralia Cartis,
Raphael Hauser,
Yang Liu,
Karl Welzel,
Wenqi Zhu
Abstract:
High-order tensor methods that employ local Taylor models of degree $p$ within adaptive regularization frameworks (AR$p$) have recently received significant attention, due to their optimal global and local rates of convergence for both convex and nonconvex optimization problems. However, their numerical performance for general unconstrained optimization problems remains insufficiently explored, which we address by showcasing the numerical performance of standard second- and third-order variants ($p=2,3$) and proposing novel techniques for key algorithmic aspects when $p\geq3$ to improve numerical efficiency. To improve the adaptive choice of the regularization parameter, we extend the interpolation-based updating strategy introduced in (Gould, Porcelli, and Toint, 2012) for $p=2$ to $p\geq3$. We identify fundamental differences between the local minima of regularized subproblems for $p=2$ and $p\geq3$ and their effect on performance. Then, for $p\geq3$, we introduce a novel pre-rejection technique that rejects poor subproblem minimizers (referred to as `transient') before any function evaluation, reducing cost and selecting useful (`persistent') ones. Numerical studies confirm efficiency improvements in our modified AR$3$ algorithm. We also assess the effect of different subproblem termination conditions and the choice of the initial regularization parameter on overall performance. Finally, we benchmark our best-performing AR$3$ variants, along with those in (Birgin et al., 2020), against second-order ones (AR$2$). Encouraging results on standard test problems confirm that AR$3$ variants can outperform AR$2$ in terms of objective evaluations, derivative evaluations, and subproblem solves. We provide an efficient, extensive, and modular MATLAB software package including various AR$2$ and AR$3$ variants, allowing ease of use and experimentation for interested users.
Submitted 28 February, 2025; v1 submitted 31 December, 2024;
originally announced January 2025.
-
Approximating Higher-Order Derivative Tensors Using Secant Updates
Authors:
Karl Welzel,
Raphael A. Hauser
Abstract:
Quasi-Newton methods employ an update rule that gradually improves the Hessian approximation using the already available gradient evaluations. We propose higher-order secant updates which generalize this idea to higher-order derivatives, approximating for example third derivatives (which are tensors) from given Hessian evaluations. Our generalization is based on the observation that quasi-Newton updates are least-change updates satisfying the secant equation, with different methods using different norms to measure the size of the change. We present a full characterization for least-change updates in weighted Frobenius norms (satisfying an analogue of the secant equation) for derivatives of arbitrary order. Moreover, we establish convergence of the approximations to the true derivative under standard assumptions and explore the quality of the generated approximations in numerical experiments.
Submitted 15 August, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Binary Matrix Factorisation and Completion via Integer Programming
Authors:
Reka A. Kovacs,
Oktay Gunluk,
Raphael A. Hauser
Abstract:
Binary matrix factorisation is an essential tool for identifying discrete patterns in binary data. In this paper we consider the rank-k binary matrix factorisation problem (k-BMF) under Boolean arithmetic: we are given an n x m binary matrix X with possibly missing entries and need to find two binary matrices A and B of dimension n x k and k x m respectively, which minimise the distance between X and the Boolean product of A and B in the squared Frobenius distance. We present a compact and two exponential size integer programs (IPs) for k-BMF and show that the compact IP has a weak LP relaxation, while the exponential size IPs have a stronger equivalent LP relaxation. We introduce a new objective function, which differs from the traditional squared Frobenius objective in attributing a weight to zero entries of the input matrix that is proportional to the number of times the zero is erroneously covered in a rank-k factorisation. For one of the exponential size IPs we describe a computational approach based on column generation. Experimental results on synthetic and real world datasets suggest that our integer programming approach is competitive against available methods for k-BMF and provides accurate low-error factorisations.
Submitted 3 August, 2021; v1 submitted 25 June, 2021;
originally announced June 2021.
-
A seven-point algorithm for piecewise smooth univariate minimization
Authors:
Jonathan Grant-Peters,
Raphael Hauser
Abstract:
In this paper, we construct an algorithm for minimising piecewise smooth functions for which derivative information is not available. The algorithm constructs a pair of quadratic functions, one on each side of the point with smallest known function value, and selects the intersection of these quadratics as the next test point. This algorithm relies on the quadratic function underestimating the true function within a specific range, which is accomplished using an adjustment term that is modified as the algorithm progresses.
Submitted 11 December, 2020;
originally announced December 2020.
-
Binary Matrix Factorisation via Column Generation
Authors:
Reka A. Kovacs,
Oktay Gunluk,
Raphael A. Hauser
Abstract:
Identifying discrete patterns in binary data is an important dimensionality reduction tool in machine learning and data mining. In this paper, we consider the problem of low-rank binary matrix factorisation (BMF) under Boolean arithmetic. Due to the hardness of this problem, most previous attempts rely on heuristic techniques. We formulate the problem as a mixed integer linear program and use a large scale optimisation technique of column generation to solve it without the need of heuristic pattern mining. Our approach focuses on accuracy and on the provision of optimality guarantees. Experimental results on real world datasets demonstrate that our proposed method is effective at producing highly accurate factorisations and improves on the previously available best known results for 15 out of 24 problem instances.
Submitted 3 August, 2021; v1 submitted 9 November, 2020;
originally announced November 2020.
-
PCA by Optimisation of Symmetric Functions has no Spurious Local Optima
Authors:
Raphael A. Hauser,
Armin Eftekhari
Abstract:
Principal Component Analysis (PCA) finds the best linear representation of data, and is an indispensable tool in many learning and inference tasks. Classically, principal components of a dataset are interpreted as the directions that preserve most of its "energy", an interpretation that is theoretically underpinned by the celebrated Eckart-Young-Mirsky Theorem.
This paper introduces many other ways of performing PCA, with various geometric interpretations, and proves that the corresponding family of non-convex programs have no spurious local optima, while possessing only strict saddle points. These programs therefore loosely behave like convex problems and can be efficiently solved to global optimality, for example, with certain variants of the stochastic gradient descent.
Beyond providing new geometric interpretations and enhancing our theoretical understanding of PCA, our findings might pave the way for entirely new approaches to structured dimensionality reduction, such as sparse PCA and nonnegative matrix factorisation. More specifically, we study an unconstrained formulation of PCA using determinant optimisation that might provide an elegant alternative to the deflating scheme commonly used in sparse PCA.
Submitted 21 December, 2019; v1 submitted 18 May, 2018;
originally announced May 2018.
-
PCA by Determinant Optimization has no Spurious Local Optima
Authors:
Raphael A. Hauser,
Armin Eftekhari,
Heinrich F. Matzinger
Abstract:
Principal component analysis (PCA) is an indispensable tool in many learning tasks that finds the best linear representation for data. Classically, principal components of a dataset are interpreted as the directions that preserve most of its "energy", an interpretation that is theoretically underpinned by the celebrated Eckart-Young-Mirsky Theorem. There are yet other ways of interpreting PCA that are rarely exploited in practice, largely because it is not known how to reliably solve the corresponding non-convex optimisation programs. In this paper, we consider one such interpretation of principal components as the directions that preserve most of the "volume" of the dataset. Our main contribution is a theorem that shows that the corresponding non-convex program has no spurious local optima. We apply a number of solvers for empirical confirmation.
Submitted 11 March, 2018;
originally announced March 2018.
-
Quantifying the Estimation Error of Principal Components
Authors:
Raphael Hauser,
Raul Kangro,
Jüri Lember,
Heinrich Matzinger
Abstract:
Principal component analysis is an important pattern recognition and dimensionality reduction tool in many applications. Principal components are computed as eigenvectors of a maximum likelihood covariance $\widehat{\Sigma}$ that approximates a population covariance $\Sigma$, and these eigenvectors are often used to extract structural information about the variables (or attributes) of the studied population. Since PCA is based on the eigendecomposition of the proxy covariance $\widehat{\Sigma}$ rather than the ground-truth $\Sigma$, it is important to understand the approximation error in each individual eigenvector as a function of the number of available samples. The recent results of Koltchinskii and Lounici yield such bounds. In the present paper we sharpen these bounds and show that eigenvectors can often be reconstructed to a required accuracy from a sample of strictly smaller size order.
Submitted 27 October, 2017;
originally announced October 2017.
-
Strong Duality of Linear Optimisation Problems over Measure Spaces
Authors:
Raphael Hauser,
Sergey Shahverdyan
Abstract:
In this work we present two particular cases of the general duality result for linear optimisation problems over signed measures with infinitely many constraints in the form of integrals of functions with respect to the decision variables (the measure in question) for which strong duality holds. In the first case the optimisation problems are over measures with $L^p$ density functions with $1 < p < \infty$. In the second case we consider a semi-infinite optimisation problem where finitely many constraints are given in the form of bounds on integrals. The latter case is of particular importance in practice, where the model can be applied in robust risk management and model-free option pricing.
Submitted 17 January, 2015;
originally announced January 2015.
-
The impact of startup costs and the grid operator on the power price equilibrium
Authors:
Miha Troha,
Raphael Hauser
Abstract:
In this paper we propose a quadratic programming model that can be used for calculating the term structure of electricity prices while explicitly modeling startup costs of power plants. In contrast to other approaches presented in the literature, we incorporate the startup costs in a mathematically rigorous manner without relying on ad hoc heuristics. Moreover, we propose a tractable approach for estimating the startup costs of power plants based on their historical production. Through numerical simulations applied to the entire UK power grid, we demonstrate that the inclusion of startup costs is necessary for the modeling of electricity prices in realistic power systems. Numerical results show that startup costs make electricity prices very spiky. In the second part of the paper, we extend the initial model by including the grid operator who is responsible for managing the grid. Numerical simulations demonstrate that robust decision making of the grid operator can significantly decrease the number and severity of spikes in the electricity price and improve the reliability of the power grid.
Submitted 10 December, 2014; v1 submitted 29 November, 2014;
originally announced December 2014.
-
An Upper Bound on the Convergence Rate of a Second Functional in Optimal Sequence Alignment
Authors:
Raphael Hauser,
Heinrich Matzinger,
Ionel Popescu
Abstract:
Consider finite sequences $X_{[1,n]}=X_1\dots X_n$ and $Y_{[1,n]}=Y_1\dots Y_n$ of length $n$, consisting of i.i.d.\ samples of random letters from a finite alphabet, and let $S$ and $T$ be chosen i.i.d.\ randomly from the unit ball in the space of symmetric scoring functions over this alphabet augmented by a gap symbol. We prove a probabilistic upper bound of linear order in $n^{0.75}$ for the deviation of the score relative to $T$ of optimal alignments with gaps of $X_{[1,n]}$ and $Y_{[1,n]}$ relative to $S$. It remains an open problem to prove a lower bound. Our result contributes to the understanding of the microstructure of optimal alignments relative to one given scoring function, extending a theory begun by the first two authors.
Submitted 26 September, 2014;
originally announced September 2014.
-
Calculation of a power price equilibrium
Authors:
Miha Troha,
Raphael Hauser
Abstract:
In this paper we propose a tractable quadratic programming formulation for calculating the equilibrium term structure of electricity prices. We rely on a theoretical model described in [21], but extend it so that it reflects actually traded electricity contracts, transaction costs and liquidity considerations. Our numerical simulations examine the properties of the term structure and its dependence on various parameters of the model. The proposed quadratic programming formulation is applied to calculate the equilibrium term structure of electricity prices in the UK power grid consisting of a few hundred power plants. The impact of ramp up and ramp down constraints is also studied.
Submitted 23 September, 2014;
originally announced September 2014.
-
The existence and uniqueness of a power price equilibrium
Authors:
Miha Troha,
Raphael Hauser
Abstract:
We propose a term structure power price model that, in contrast to widely accepted no-arbitrage based approaches, accounts for the non-storable nature of power. It belongs to a class of equilibrium game theoretic models with players divided into producers and consumers. The consumers' goal is to maximize a mean-variance utility function subject to satisfying an inelastic demand of their own clients (e.g. households, businesses etc.) to whom they sell the power. The producers, who own a portfolio of power plants each defined by a running fuel (e.g. gas, coal, oil...) and physical characteristics (e.g. efficiency, capacity, ramp up/down times...), similarly seek to maximize a mean-variance utility function consisting of power, fuel, and emission prices subject to production constraints. Our goal is to determine the term structure of the power price at which production matches consumption. In this paper we show that in such a setting the equilibrium price exists and also discuss the conditions for its uniqueness.
Submitted 11 August, 2014;
originally announced August 2014.
-
Regression techniques for Portfolio Optimisation using MOSEK
Authors:
Thomas Schmelzer,
Raphael Hauser,
Erling Andersen,
Joachim Dahl
Abstract:
Regression is widely used by practitioners across many disciplines. We reformulate the underlying optimisation problem as a second-order conic program providing the flexibility often needed in applications. Using examples from portfolio management and quantitative trading we solve regression problems with and without constraints. Several Python code fragments are given. The code and data are available online at http://www.github.com/tschm/MosekRegression.
Submitted 12 October, 2013;
originally announced October 2013.
-
The S-Procedure via Dual Cone Calculus
Authors:
Raphael Hauser
Abstract:
Given a quadratic function $h$ that satisfies a Slater condition, Yakubovich's S-Procedure (or S-Lemma) gives a characterization of all other quadratic functions that are copositive with $h$ in a form that is amenable to numerical computations. In this paper we present a deep-rooted connection between the S-Procedure and the dual cone calculus formula $(K_1\cap K_2)^*= K_1^*+K_2^*$, which holds for closed convex cones in $\R^2$. To establish the link with the S-Procedure, we generalize the dual cone calculus formula to a situation where $K_1$ is nonclosed, nonconvex and nonconic but exhibits sufficient mathematical resemblance to a closed convex cone. As a result, we obtain a new proof of the S-Lemma and an extension to Hilbert space kernels.
Submitted 10 May, 2013;
originally announced May 2013.
-
Relative Robust Portfolio Optimization
Authors:
Raphael Hauser,
Vijay Krishnamurthy,
Reha Tütüncü
Abstract:
Considering mean-variance portfolio problems with uncertain model parameters, we contrast the classical absolute robust optimization approach with the relative robust approach based on a maximum regret function. Although the latter problems are NP-hard in general, we show that tractable inner and outer approximations exist in several cases that are of central interest in asset management.
Submitted 10 May, 2013; v1 submitted 1 May, 2013;
originally announced May 2013.
-
Letter Change Bias and Local Uniqueness in Optimal Sequence Alignments
Authors:
Raphael Hauser,
Heinrich Matzinger
Abstract:
Considering two optimally aligned random sequences, we investigate the effect on the alignment score caused by changing a random letter in one of the two sequences. Using this idea in conjunction with large deviations theory, we show that in alignments with a low proportion of gaps the optimal alignment is locally unique in most places with high probability. This has implications in the design of recently pioneered alignment methods that use the local uniqueness as a homology indicator.
Submitted 24 April, 2013;
originally announced April 2013.
-
Distribution of Aligned Letter Pairs in Optimal Alignments of Random Sequences
Authors:
Raphael Hauser,
Heinrich Matzinger
Abstract:
Considering the optimal alignment of two i.i.d. random sequences of length $n$, we show that when the scoring function is chosen randomly, almost surely the empirical distribution of aligned letter pairs in all optimal alignments converges to a unique limiting distribution as $n$ tends to infinity. This result is interesting because it helps to understand the microscopic path structure of a special type of last passage percolation problem with correlated weights, an area of long-standing open problems. Characterizing the microscopic path structure furthermore yields a robust alternative to optimal alignment scores for testing the relatedness of genetic sequences.
Submitted 23 November, 2012;
originally announced November 2012.
-
A Monte Carlo Approach to the Fluctuation Problem in Optimal Alignments of Random Strings
Authors:
Saba Amsalu,
Raphael Hauser,
Heinrich Matzinger
Abstract:
The problem of determining the correct order of fluctuation of the optimal alignment score of two random strings of length $n$ has been open for several decades. It is known that the biased expected effect of a random letter-change on the optimal score implies an order of fluctuation linear in $\sqrt{n}$. However, in many situations where such a biased effect is observed empirically, it has been impossible to prove analytically. The main result of this paper shows that when the rescaled-limit of the optimal alignment score increases in a certain direction, then the biased effect exists. On the basis of this result one can quantify a confidence level for the existence of such a biased effect and hence of an order $\sqrt{n}$ fluctuation based on simulation of optimal alignment scores. This is an important step forward, as the correct order of fluctuation was previously known only for certain special distributions. To illustrate the usefulness of our new methodology, we apply it to optimal alignments of strings written in the DNA-alphabet. As scoring function, we use the BLASTZ default-substitution matrix together with a realistic gap penalty. BLASTZ is one of the most widely used sequence alignment methodologies in bioinformatics. For this DNA-setting, we show that with a high level of confidence, the fluctuation of the optimal alignment score is of order $\Theta(\sqrt{n})$. An important special case of optimal alignment score is the Longest Common Subsequence (LCS) of random strings. For binary sequences with equiprobable symbols, the question of the fluctuation of the LCS remains open. The symmetry in that case does not allow for our method. On the other hand, in real-life DNA sequences, it is not the case that all letters occur with the same frequency. Thus, for many real-life situations, our method allows one to determine the order of the fluctuation up to a high confidence level.
Submitted 23 November, 2012;
originally announced November 2012.
-
Adversarial Smoothed Analysis
Authors:
Felipe Cucker,
Raphael Hauser,
Martin Lotz
Abstract:
The purpose of this note is to extend the results on uniform smoothed analysis of condition numbers from \cite{BuCuLo:07} to the case where the perturbation follows a radially symmetric probability distribution. In particular, we will show that the bounds derived in \cite{BuCuLo:07} still hold in the case of distributions whose density has a singularity at the center of the perturbation, which we call {\em adversarial}.
Submitted 20 March, 2009;
originally announced March 2009.
-
Kalman Filtering with Equality and Inequality State Constraints
Authors:
Nachi Gupta,
Raphael Hauser
Abstract:
Both constrained and unconstrained optimization problems regularly appear in recursive tracking problems engineers currently address -- however, constraints are rarely exploited for these applications. We define the Kalman Filter and discuss two different approaches to incorporating constraints. Each of these approaches is first applied to equality constraints and then extended to inequality constraints. We discuss methods for dealing with nonlinear constraints and for constraining the state prediction. Finally, some experiments are provided to indicate the usefulness of such methods.
Submitted 18 September, 2007;
originally announced September 2007.
-
On Tail Decay and Moment Estimates of a Condition Number for Random Linear Conic Systems
Authors:
Dennis Cheung,
Felipe Cucker,
Raphael Hauser
Abstract:
In this paper we study the distribution tails and the moments of a condition number which arises in the study of homogeneous systems of linear inequalities. We consider the case where this system is defined by a Gaussian random matrix and characterise the exact decay rates of the distribution tails, improve the existing moment estimates, and prove various limit theorems for large scale systems. Our results are of complexity theoretic interest, because interior-point methods and relaxation methods for the solution of systems of linear inequalities have running times that are bounded in terms of the logarithm and the square of the condition number respectively.
Submitted 17 September, 2003;
originally announced September 2003.
-
Self-scaled barriers for irreducible symmetric cones
Authors:
Raphael A Hauser,
Yongdo Lim
Abstract:
Self-scaled barrier functions are fundamental objects in the theory of interior-point methods for linear optimization over symmetric cones, of which linear and semidefinite programming are special cases. We classify all self-scaled barriers over irreducible symmetric cones and show that these functions are merely homothetic transformations of the universal barrier function. Together with a decomposition theorem for self-scaled barriers this concludes the algebraic classification theory of these functions. After introducing the reader to the concepts relevant to the problem and tracing the history of the subject, we start by deriving our result from first principles in the important special case of semidefinite programming. We then generalise these arguments to irreducible symmetric cones by invoking results from the theory of Euclidean Jordan algebras.
Submitted 2 April, 2001;
originally announced April 2001.
-
Self-scaled barrier functions on symmetric cones and their classification
Authors:
Raphael Hauser,
Osman Guler
Abstract:
Self-scaled barrier functions on self-scaled cones were introduced through a set of axioms in 1994 by Y.E. Nesterov and M.J. Todd as a tool for the construction of long-step interior point algorithms. This paper provides a firm foundation for these objects by exhibiting their symmetry properties, their intimate ties with the symmetry groups of their domains of definition, and subsequently their decomposition into irreducible parts and algebraic classification theory. In the first part we recall the characterisation of the family of self-scaled cones as the set of symmetric cones and develop a primal-dual symmetric viewpoint on self-scaled barriers, results that were first discovered by the second author. We then show in a short, simple proof that any pointed, convex cone decomposes into a direct sum of irreducible components in a unique way, a result which can also be of independent interest. We then show that any self-scaled barrier function decomposes in an essentially unique way into a direct sum of self-scaled barriers defined on the irreducible components of the underlying symmetric cone. Finally, we present a complete algebraic classification of self-scaled barrier functions using the correspondence between symmetric cones and Euclidean Jordan algebras.
Submitted 28 March, 2001;
originally announced March 2001.