-
TaMPERing with Large Language Models: A Field Guide for using Generative AI in Public Administration Research
Authors:
Michael Overton,
Barrie Robison,
Lucas Sheneman
Abstract:
The integration of Large Language Models (LLMs) into social science research presents transformative opportunities for advancing scientific inquiry, particularly in public administration (PA). However, the absence of standardized methodologies for using LLMs poses significant challenges for ensuring transparency, reproducibility, and replicability. This manuscript introduces the TaMPER framework, a structured methodology organized around five critical decision points: Task, Model, Prompt, Evaluation, and Reporting. The TaMPER framework provides scholars with a systematic approach to leveraging LLMs effectively while addressing key challenges such as model variability, prompt design, evaluation protocols, and transparent reporting practices.
Submitted 30 March, 2025;
originally announced April 2025.
-
Disc breaking through forced eccentricity growth
Authors:
Madeline Overton,
Rebecca G. Martin,
Stephen H. Lubow,
Stephen Lepp
Abstract:
Motivated by misaligned discs observed in eccentric orbit Be/X-ray binaries, we examine the evolution of a retrograde disc around one component of an eccentric binary with hydrodynamic simulations, $n$-body simulations and linear theory. Forced eccentricity growth from the eccentric orbit binary causes the initially circular disc to undergo eccentricity oscillations. A retrograde disc becomes more radially extended, more highly eccentric and undergoes more rapid apsidal precession compared to a prograde disc. We find that a retrograde disc can be subject to disc breaking where the disc forms two rings with different eccentricities and longitudes of periastron while remaining coplanar. This could have implications for the lightcurves and the X-ray outbursts observed in Be/X-ray binaries.
Submitted 28 March, 2025;
originally announced March 2025.
-
Complementarity, Augmentation, or Substitutivity? The Impact of Generative Artificial Intelligence on the U.S. Federal Workforce
Authors:
William G. Resh,
Yi Ming,
Xinyao Xia,
Michael Overton,
Gul Nisa Gürbüz,
Brandon De Breuhl
Abstract:
This study investigates the near-future impacts of generative artificial intelligence (AI) technologies on occupational competencies across the U.S. federal workforce. We develop a multi-stage Retrieval-Augmented Generation system to leverage large language models for predictive AI modeling that projects shifts in required competencies and to identify vulnerable occupations on a knowledge-by-skill-by-ability basis across the federal government workforce. This study highlights policy recommendations essential for workforce planning in the era of AI. We integrate several sources of detailed data on occupational requirements across the federal government from both centralized and decentralized human resource sources, including from the U.S. Office of Personnel Management (OPM) and various federal agencies. While our preliminary findings suggest some significant shifts in required competencies and potential vulnerability of certain roles to AI-driven changes, we provide nuanced insights that support arguments against abrupt or generic approaches to strategic human capital planning around the development of generative AI. The study aims to inform strategic workforce planning and policy development within federal agencies and demonstrates how this approach can be replicated across other large employment institutions and labor markets.
Submitted 11 March, 2025;
originally announced March 2025.
-
Formation of Be star decretion discs through boundary layer effects
Authors:
Rebecca G. Martin,
Stephen H. Lubow,
David Vallet,
Madeline Overton,
Stephen Lepp,
Zhaohuan Zhu
Abstract:
Be stars are rapidly rotating, with angular frequency around $0.7-0.8$ of their Keplerian break up frequency, as a result of significant accretion during the earlier stellar evolution of a companion star. Material from the equator of the Be star is ejected and forms a decretion disc, although the mechanism for the disc formation has remained elusive. We find one-dimensional steady state decretion disc solutions that smoothly transition from a rapidly rotating star that is in hydrostatic balance. Boundary layer effects in a geometrically thick disc which connects to a rotationally flattened star enable the formation of a decretion disc at stellar spin rates below the break up rate. For a disc with an aspect ratio $H/R\approx 0.1$ at the inner edge, the torque from the disc on the star slows the stellar spin to the observed range and mass ejection continues at a rate consistent with observed decretion rates. The critical rotation rate to which the star slows down decreases as the disc aspect ratio increases. More generally, steady state accretion and decretion disc solutions can be found for all stellar spin rates. The outcome for a particular system depends upon the balance between the decretion rate and any external infall accretion rate.
Submitted 3 March, 2025;
originally announced March 2025.
-
The Rise of Nova V1674 Herculis
Authors:
Robert M. Quimby,
Brian D. Metzger,
Ken J. Shen,
Allen W. Shafter,
Hank Corbett,
Madeline Overton
Abstract:
Observational constraints on classical novae are heavily biased to phases near optical peak and later because of the simple fact that novae are not typically discovered until they become bright. The earliest phases of brightening, coming before discovery, are typically missed, but this is changing with the proliferation of wide-field optical monitoring systems including ZTF, ASAS-SN, and Evryscope. Here, we report on unprecedented observations of the fast nova V1674 Her beginning >10 mag below its optical peak and including high-cadence (2 min.) observations that chart a rise of ~8 mag in just 5 hours. Two clear breaks are identified as the light curve transitions first from rising slowly to rising rapidly, followed by a transition to an even faster, nearly linear rate of increasing flux with time. The depths of the observations allow us to place tight constraints on the size of the photosphere under the assumption of blackbody emission from a white dwarf emitting at its Eddington luminosity. We find that the white dwarf was unlikely to have overflowed its Roche lobe prior to the launch of a fast wind, which poses a challenge for explaining the Fermi $\gamma$-ray detections as the interaction of a fast wind with a slow torus of gas stripped from the inflated white dwarf envelope by the companion. High-cadence observations of novae from Evryscope and the planned Argus Array can record the diversity of rising nova light curves and help resolve how the interplay between thermonuclear fusion, binary interaction, and shocks powers their earliest light.
Submitted 21 October, 2024;
originally announced October 2024.
-
Substantial precision enhancements via adaptive symmetry-informed Bayesian metrology
Authors:
Matt Overton,
Jesús Rubio,
Nathan Cooper,
Daniele Baldolini,
David Johnson,
Janet Anders,
Lucia Hackermüller
Abstract:
Precision measurements are essential for addressing major challenges, ranging from gravitational wave detection to healthcare diagnostics. While quantum sensing experiments are constantly improving to deliver greater precision, the in-depth optimisation of measurement procedures beyond phase estimation has been overlooked. Here we present a systematic strategy for parameter estimation that can be applied across a wide range of experimental platforms operating in the low data limit. Its strength is the inclusion of experimental control parameters, which allows optimisation within a Bayesian procedure and adaptive repetition. We provide general expressions for the optimal estimator and error for any parameter amenable to symmetry-informed strategies. We demonstrate the power of this strategy by applying it to atom number estimation in a quantum technology experiment. Our protocol results in a five-fold reduction in the fractional variance of the atom number estimate compared to a standard, unoptimized protocol. Equivalently, it achieves the target precision with a third of the data points previously required. The enhanced device performance and accelerated data collection achieved with this Bayesian optimised strategy will be essential for applications in quantum computing, communication, metrology, and the wider quantum technology sector.
Submitted 2 May, 2025; v1 submitted 14 October, 2024;
originally announced October 2024.
-
Retrograde discs around one component of a binary are unstable to tilting
Authors:
Madeline Overton,
Rebecca G. Martin,
Stephen H. Lubow,
Stephen Lepp
Abstract:
With hydrodynamic simulations we show that a coplanar disc around one component of a binary can be unstable to global tilting when the disc orbits in a retrograde direction relative to the binary. The disc experiences the largest inclination growth relative to the binary orbit in the outermost radii of the disc, closest to the companion. This tilt instability also occurs for test particles. A retrograde disc is much larger than a prograde disc since it is not tidally truncated and instead spreads outwards to the orbit of the companion. The coplanar retrograde disc remains circular while a coplanar prograde disc can become eccentric. We suggest that the inclination instability is due to a disc resonance caused by the interaction of the tilt with the tidal field of the binary. This model is applicable to Be/X-ray binaries in which the Be star disc may be retrograde relative to the binary orbit if there was a sufficiently strong kick from the supernova that formed the neutron star companion. The accretion on to the neutron star and the resulting X-ray outbursts are weaker in the retrograde case compared to the prograde case.
Submitted 17 November, 2023;
originally announced November 2023.
-
An Experimental Comparison of Methods for Computing the Numerical Radius
Authors:
Tim Mitchell,
Michael L. Overton
Abstract:
We make an experimental comparison of methods for computing the numerical radius of an $n\times n$ complex matrix, based on two well-known characterizations, the first a nonconvex optimization problem in one real variable and the second a convex optimization problem in $n^{2}+1$ real variables. We make comparisons with respect to both accuracy and computation time using publicly available software.
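As a rough illustration of a one-real-variable characterization of the numerical radius, the sketch below assumes the standard identity $r(A) = \max_\theta \lambda_{\max}\big((e^{i\theta}A + e^{-i\theta}A^*)/2\big)$ and evaluates it by a plain grid search for $2\times 2$ matrices. The function names, the grid size, and the restriction to $2\times 2$ matrices are assumptions made for brevity; this is not the software compared in the paper.

```python
import cmath
import math


def lambda_max_hermitian_2x2(a, b, d):
    """Largest eigenvalue of the Hermitian matrix [[a, b], [conj(b), d]] (a, d real)."""
    return 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)


def numerical_radius_2x2(A, samples=3600):
    """Grid-search estimate of max over theta of lambda_max(H(theta)),
    where H(theta) = (w*A + conj(w)*A^*) / 2 and w = exp(i*theta)."""
    best = 0.0
    for k in range(samples):
        w = cmath.exp(1j * 2.0 * math.pi * k / samples)
        a = (w * A[0][0]).real                      # H[0][0]
        d = (w * A[1][1]).real                      # H[1][1]
        b = 0.5 * (w * A[0][1] + (w * A[1][0]).conjugate())  # H[0][1]
        best = max(best, lambda_max_hermitian_2x2(a, b, d))
    return best


# A 2x2 Jordan block: its field of values is a disk of radius 1/2.
r_jordan = numerical_radius_2x2([[0, 1], [0, 0]])
# A normal matrix: the numerical radius equals the spectral radius.
r_normal = numerical_radius_2x2([[1, 0], [0, -1]])
```

A grid search only gives a lower bound that improves with the number of samples; the objective in $\theta$ is nonconvex, which is why the choice of method matters for accuracy and cost.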
Submitted 6 October, 2023;
originally announced October 2023.
-
On the Choice of Sign Defining Householder Transformations
Authors:
Michael L. Overton,
Pinze Yu
Abstract:
It is well known that, when defining Householder transformations, the correct choice of sign in the standard formula is important to avoid cancellation and hence numerical instability. In this note we point out that when the "wrong" choice of sign is used, the extent of the resulting instability depends in a somewhat subtle way on the data leading to cancellation.
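The cancellation mentioned above can be seen in a small sketch, assuming the textbook construction $v = x + s\,\|x\|\,e_1$ with $H = I - 2vv^T/(v^Tv)$, where $s = \operatorname{sign}(x_1)$ is the usual safe choice and $s = -\operatorname{sign}(x_1)$ is the "wrong" one. The function name and the test vector are illustrative assumptions and do not reproduce the analysis in the note.

```python
import math


def householder_apply(x, stable=True):
    """Build v = x + s*||x||*e1 and return H x, where H = I - 2 v v^T / (v^T v).

    stable=True uses s = sign(x[0]); stable=False uses the opposite sign,
    so that v[0] = x[0] - sign(x[0])*||x|| is computed by subtraction.
    """
    norm = math.sqrt(sum(t * t for t in x))
    sign = 1.0 if x[0] >= 0 else -1.0
    s = sign if stable else -sign
    v = list(x)
    v[0] = x[0] + s * norm          # cancellation happens here when s = -sign
    vx = sum(a * b for a, b in zip(v, x))
    vv = sum(a * a for a in v)
    coef = 2.0 * vx / vv
    return [xi - coef * vi for xi, vi in zip(x, v)]


# x is nearly parallel to e1, the case in which cancellation is severe.
x = [1.0, 1e-9]

# In exact arithmetic every entry of H x below the first is zero.
err_stable = max(abs(t) for t in householder_apply(x, stable=True)[1:])
err_wrong = max(abs(t) for t in householder_apply(x, stable=False)[1:])
```

With this input the safe sign annihilates the second entry, while the naive wrong-sign formula leaves a residual comparable to the entry it was meant to eliminate.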
Submitted 7 October, 2023; v1 submitted 7 August, 2023;
originally announced September 2023.
-
Multi-fidelity robust controller design with gradient sampling
Authors:
Steffen W. R. Werner,
Michael L. Overton,
Benjamin Peherstorfer
Abstract:
Robust controllers that stabilize dynamical systems even under disturbances and noise are often formulated as solutions of nonsmooth, nonconvex optimization problems. While methods such as gradient sampling can handle the nonconvexity and nonsmoothness, the costs of evaluating the objective function may be substantial, making robust control challenging for dynamical systems with high-dimensional state spaces. In this work, we introduce multi-fidelity variants of gradient sampling that leverage low-cost, low-fidelity models with low-dimensional state spaces for speeding up the optimization process while nonetheless providing convergence guarantees for a high-fidelity model of the system of interest, which is primarily accessed in the last phase of the optimization process. Our first multi-fidelity method initiates gradient sampling on higher fidelity models with starting points obtained from cheaper, lower fidelity models. Our second multi-fidelity method relies on ensembles of gradients that are computed from low- and high-fidelity models. Numerical experiments with controlling the cooling of a steel rail profile and laminar flow in a cylinder wake demonstrate that our new multi-fidelity gradient sampling methods achieve up to two orders of magnitude speedup compared to the single-fidelity gradient sampling method that relies on the high-fidelity model alone.
Submitted 5 December, 2022; v1 submitted 30 May, 2022;
originally announced May 2022.
-
On Properties of Univariate Max Functions at Local Maximizers
Authors:
Tim Mitchell,
Michael L. Overton
Abstract:
More than three decades ago, Boyd and Balakrishnan established a regularity result for the two-norm of a transfer function at maximizers. Their result extends easily to the statement that the maximum eigenvalue of a univariate real analytic Hermitian matrix family is twice continuously differentiable, with Lipschitz second derivative, at all local maximizers, a property that is useful in several applications that we describe. We also investigate whether this smoothness property extends to max functions more generally. We show that the pointwise maximum of a finite set of $q$-times continuously differentiable univariate functions must have zero derivative at a maximizer for $q=1$, but arbitrarily close to the maximizer, the derivative may not be defined, even when $q=3$ and the maximizer is isolated.
Submitted 17 August, 2021;
originally announced August 2021.
-
Local Minimizers of the Crouzeix Ratio: A Nonsmooth Optimization Case Study
Authors:
Michael L. Overton
Abstract:
Given a square matrix $A$ and a polynomial $p$, the Crouzeix ratio is the norm of the polynomial on the field of values of $A$ divided by the 2-norm of the matrix $p(A)$. Crouzeix's conjecture states that the globally minimal value of the Crouzeix ratio is 0.5, regardless of the matrix order and polynomial degree, and it is known that 1 is a frequently occurring locally minimal value. Making use of a heavy-tailed distribution to initialize our optimization computations, we demonstrate for the first time that the Crouzeix ratio has many other locally minimal values between 0.5 and 1. Besides showing that the same function values are repeatedly obtained for many different starting points, we also verify that an approximate nonsmooth stationarity condition holds at computed candidate local minimizers. We also find that the same locally minimal values are often obtained both when optimizing over real matrices and polynomials, and over complex matrices and polynomials. We argue that minimization of the Crouzeix ratio makes a very interesting nonsmooth optimization case study, illustrating among other things how effective the BFGS method is for nonsmooth, nonconvex optimization. Our method for verifying approximate nonsmooth stationarity is based on what may be a novel approach to finding approximate subgradients of max functions on an interval. Our extensive computations strongly support Crouzeix's conjecture: in all cases, we find that the smallest locally minimal value is 0.5.
Submitted 28 May, 2021;
originally announced May 2021.
-
Finding the strongest stable weightless column with a follower load and relocatable concentrated masses
Authors:
Oleg N. Kirillov,
Michael L. Overton
Abstract:
We consider the problem of optimal placement of concentrated masses along a massless elastic column that is clamped at one end and loaded by a nonconservative follower force at the free end. The goal is to find the largest possible interval such that the variation in the loading parameter within this interval preserves stability of the structure. The stability constraint is nonconvex and nonsmooth, making the optimization problem quite challenging. We give a detailed analytical treatment for the case of two masses, arguing that the optimal parameter configuration approaches the flutter and divergence boundaries of the stability region simultaneously. Furthermore, we conjecture that this property holds for any number of masses, which in turn suggests a simple formula for the maximal load interval for $n$ masses. This conjecture is strongly supported by extensive computational results, obtained using the recently developed open-source software package GRANSO (GRadient-based Algorithm for Non-Smooth Optimization) to maximize the load interval subject to an appropriate formulation of the nonsmooth stability constraint. We hope that our work will provide a foundation for new approaches to classical long-standing problems of stability optimization for nonconservative elastic systems arising in civil and mechanical engineering.
Submitted 3 February, 2021; v1 submitted 21 August, 2020;
originally announced August 2020.
-
Behavior of Limited Memory BFGS when Applied to Nonsmooth Functions and their Nesterov Smoothings
Authors:
Azam Asl,
Michael L. Overton
Abstract:
The motivation to study the behavior of limited-memory BFGS (L-BFGS) on nonsmooth optimization problems is based on two empirical observations: the widespread success of L-BFGS in solving large-scale smooth optimization problems, and the effectiveness of the full BFGS method in solving small to medium-sized nonsmooth optimization problems, based on using a gradient, not a subgradient, oracle paradigm. We first summarize our theoretical results on the behavior of the scaled L-BFGS method with one update applied to a simple convex nonsmooth function that is unbounded below, stating conditions under which the method converges to a non-optimal point regardless of the starting point. We then turn to empirically investigating whether the same phenomenon holds more generally, focusing on a difficult problem of Nesterov, as well as eigenvalue optimization problems arising in semidefinite programming applications. We find that when applied to a nonsmooth function directly, L-BFGS, especially its scaled variant, often breaks down with a poor approximation to an optimal solution, in sharp contrast to full BFGS. Unscaled L-BFGS is less prone to breakdown but conducts far more function evaluations per iteration than scaled L-BFGS does, and thus it is slow. Nonetheless, it is often the case that both variants obtain better results than the provably convergent, but slow, subgradient method. On the other hand, when applied to Nesterov's smooth approximation of a nonsmooth function, scaled L-BFGS is generally much more efficient than unscaled L-BFGS, often obtaining good results even when the problem is quite ill-conditioned. Summarizing, for large-scale nonsmooth optimization problems for which full BFGS and other methods for nonsmooth optimization are not practical, it is often better to apply L-BFGS to a smooth approximation of a nonsmooth problem than to apply it directly to the nonsmooth problem.
Submitted 18 June, 2020;
originally announced June 2020.
-
H-infinity Strong Stabilization via HIFOO, a Package for Fixed-Order Controller Design
Authors:
Suat Gumussoy,
Marc Millstone,
Michael L. Overton
Abstract:
We report on our experience with strong stabilization using HIFOO, a toolbox for H-infinity fixed-order controller design. We applied HIFOO to 21 fixed-order stable H-infinity controller design problems in the literature, comparing the results with those published for other methods. The results show that HIFOO often achieves good H-infinity performance with low-order stable controllers, unlike other methods in the literature.
Submitted 4 March, 2020;
originally announced March 2020.
-
Fixed-Order H-infinity Controller Design via HIFOO, a Specialized Nonsmooth Optimization Package
Authors:
Suat Gumussoy,
Michael L. Overton
Abstract:
We report on our experience with fixed-order H-infinity controller design using the HIFOO toolbox. We applied HIFOO to various benchmark fixed (or reduced) order H-infinity controller design problems in the literature, comparing the results with those published for other methods. The results show that HIFOO can be used as an effective alternative to existing methods for fixed-order H-infinity controller design.
Submitted 4 March, 2020;
originally announced March 2020.
-
Romu: Fast Nonlinear Pseudo-Random Number Generators Providing High Quality
Authors:
Mark A. Overton
Abstract:
We introduce the Romu family of pseudo-random number generators (PRNGs) which combines the nonlinear operation of rotation with the linear operations of multiplication and (optionally) addition. Compared to conventional linear-only PRNGs, this mixture of linear and nonlinear operations achieves a greater degree of randomness using the same number of arithmetic operations. Or equivalently, it achieves the same randomness with fewer operations, resulting in higher speed. The statistical properties of these generators are strong, as they pass BigCrush and PractRand -- the most stringent test suites available. In addition, Romu generators take maximum advantage of instruction-level parallelism in modern superscalar processors, giving them an output latency of zero clock-cycles when inlined, thus adding no delay to an application. Scaled-down versions of these generators can be created and tested, enabling one to estimate the maximum number of values the full-size generators can supply before their randomness declines, ensuring the success of large jobs. Such capacity-estimates are rare for conventional PRNGs. A linear PRNG has a single cycle of states of known length comprising almost all possible states. However, a Romu generator computes pseudo-random permutations of those states, creating multiple cycles with pseudo-random lengths which cannot be determined by theory. But the ease of creating state-sizes of 128 or more bits allows (1) short cycles to be constrained to vanishingly low probabilities, and (2) thousands of parallel streams to be created having infinitesimal probabilities of overlap.
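To make the rotate-multiply idea concrete, here is a minimal sketch of a two-word generator that combines a 64-bit multiplication, a subtraction, and a rotation. The class name, the multiplier, and the rotation count are assumptions modelled on the general shape described above; they should be checked against the paper before any real use, and the sketch makes no claim about statistical quality.

```python
MASK64 = (1 << 64) - 1

# Assumed constants for illustration only; verify against the paper.
MULTIPLIER = 15241094284759029579
ROTATION = 27


def rotl64(value, count):
    """Rotate a 64-bit value left by `count` bits."""
    return ((value << count) | (value >> (64 - count))) & MASK64


class RotateMultiplyGenerator:
    """Toy two-word generator: multiply (linear) plus rotate (nonlinear)."""

    def __init__(self, x, y):
        self.x = x & MASK64
        self.y = y & MASK64
        if self.x == 0 and self.y == 0:
            # The all-zero state maps to itself, so it must be excluded.
            raise ValueError("seed must not be all zeros")

    def next(self):
        previous_x = self.x
        self.x = (MULTIPLIER * self.y) & MASK64
        self.y = rotl64((self.y - previous_x) & MASK64, ROTATION)
        return previous_x


generator = RotateMultiplyGenerator(0x1234, 0x5678)
sample = [generator.next() for _ in range(5)]
```

The output is the previous value of one state word, so the multiplication and the rotation do not depend on each other within a step; that independence is the kind of structure that lets a superscalar processor overlap the work.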
Submitted 26 February, 2020;
originally announced February 2020.
-
First-order Perturbation Theory for Eigenvalues and Eigenvectors
Authors:
Anne Greenbaum,
Ren-cang Li,
Michael L. Overton
Abstract:
We present first-order perturbation analysis of a simple eigenvalue and the corresponding right and left eigenvectors of a general square matrix, not assumed to be Hermitian or normal. The eigenvalue result is well known to a broad scientific community. The treatment of eigenvectors is more complicated, with a perturbation theory that is not so well known outside a community of specialists. We give two different proofs of the main eigenvector perturbation theorem. The first, a block-diagonalization technique inspired by the numerical linear algebra research community and based on the implicit function theorem, has apparently not appeared in the literature in this form. The second, based on complex function theory and on eigenprojectors, as is standard in analytic perturbation theory, is a simplified version of well-known results in the literature. The second derivation uses a convenient normalization of the right and left eigenvectors defined in terms of the associated eigenprojector, but although this dates back to the 1950s, it is rarely discussed in the literature. We then show how the eigenvector perturbation theory is easily extended to handle other normalizations that are often used in practice. We also explain how to verify the perturbation results computationally. We conclude with some remarks about difficulties introduced by multiple eigenvalues and give references to work on perturbation of invariant subspaces corresponding to multiple or clustered eigenvalues. Throughout the paper we give extensive bibliographic commentary and references for further reading.
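The well-known eigenvalue result can be checked numerically in a few lines. The sketch below assumes the classical first-order formula $\lambda'(0) = y^{*}Ex / (y^{*}x)$ for $A(t) = A + tE$, with right eigenvector $x$ and left eigenvector $y$ of a simple eigenvalue, and compares it with a central finite difference on a real $2\times 2$ example. The matrices `A` and `E`, the step size, and the helper names are arbitrary choices for illustration, not taken from the paper.

```python
import math


def largest_eig(M):
    """Largest eigenvalue of a real 2x2 matrix with real, distinct eigenvalues."""
    (a, b), (c, d) = M
    return 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + b * c)


A = [[2.0, 1.0], [1.0, 3.0]]      # base matrix (illustrative)
E = [[0.3, -0.2], [0.5, 0.1]]     # perturbation direction, not symmetric

lam = largest_eig(A)
x = [A[0][1], lam - A[0][0]]      # right eigenvector: A x = lam x
y = [A[1][0], lam - A[0][0]]      # left eigenvector:  y^T A = lam y^T

# First-order prediction: derivative of lam along E is (y^T E x) / (y^T x).
y_E_x = sum(y[i] * E[i][j] * x[j] for i in range(2) for j in range(2))
y_x = y[0] * x[0] + y[1] * x[1]
predicted = y_E_x / y_x


def shifted(t):
    """Return A + t*E."""
    return [[A[i][j] + t * E[i][j] for j in range(2)] for i in range(2)]


# Central finite difference of the eigenvalue as a computational check.
h = 1e-6
finite_difference = (largest_eig(shifted(h)) - largest_eig(shifted(-h))) / (2 * h)
```

The prediction is independent of how $x$ and $y$ are scaled, which is why the eigenvalue result is simpler than the eigenvector result, where the normalization matters.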
Submitted 3 June, 2019; v1 submitted 2 March, 2019;
originally announced March 2019.
-
Partial smoothness of the numerical radius at matrices whose fields of values are disks
Authors:
Adrian S. Lewis,
Michael L. Overton
Abstract:
Solutions to optimization problems involving the numerical radius often belong to a special class: the set of matrices having field of values a disk centered at the origin. After illustrating this phenomenon with some examples, we illuminate it by studying matrices around which this set of "disk matrices" is a manifold with respect to which the numerical radius is partly smooth. We then apply our results to matrices whose nonzeros consist of a single superdiagonal, such as Jordan blocks and the Crabb matrix related to a well-known conjecture of Crouzeix. Finally, we consider arbitrary complex three-by-three matrices; even in this case, the details are surprisingly intricate. One of our results is that in this real vector space with dimension 18, the set of disk matrices is a semi-algebraic manifold with dimension 12.
Submitted 31 December, 2018;
originally announced January 2019.
-
Analysis of Limited-Memory BFGS on a Class of Nonsmooth Convex Functions
Authors:
Azam Asl,
Michael L. Overton
Abstract:
The limited memory BFGS (L-BFGS) method is widely used for large-scale unconstrained optimization, but its behavior on nonsmooth problems has received little attention. L-BFGS can be used with or without "scaling"; the use of scaling is normally recommended. A simple special case, when just one BFGS update is stored and used at every iteration, is sometimes also known as memoryless BFGS. We analyze memoryless BFGS with scaling, using any Armijo-Wolfe line search, on the function $f(x) = a|x^{(1)}| + \sum_{i=2}^{n} x^{(i)}$, initiated at any point $x_0$ with $x_0^{(1)} \neq 0$. We show that if $a \ge 2\sqrt{n-1}$, the absolute value of the normalized search direction generated by this method converges to a constant vector, and if, in addition, $a$ is larger than a quantity that depends on the Armijo parameter, then the iterates converge to a non-optimal point $\bar x$ with $\bar x^{(1)}=0$, although $f$ is unbounded below. As we showed in previous work, the gradient method with any Armijo-Wolfe line search also fails on the same function if $a \ge \sqrt{n-1}$ and $a$ is larger than another quantity depending on the Armijo parameter, but scaled memoryless BFGS fails under a weaker condition relating $a$ to the Armijo parameter than that implying failure of the gradient method. Furthermore, in sharp contrast to the gradient method, if a specific standard Armijo-Wolfe bracketing line search is used, scaled memoryless BFGS fails when $a \ge 2\sqrt{n-1}$ regardless of the Armijo parameter. Finally, numerical experiments indicate that similar results hold for scaled L-BFGS with any fixed number of updates.
Submitted 30 May, 2019; v1 submitted 29 September, 2018;
originally announced October 2018.
-
Gradient Sampling Methods for Nonsmooth Optimization
Authors:
James V. Burke,
Frank E. Curtis,
Adrian S. Lewis,
Michael L. Overton,
Lucas E. A. Simões
Abstract:
This paper reviews the gradient sampling methodology for solving nonsmooth, nonconvex optimization problems. An intuitively straightforward gradient sampling algorithm is stated and its convergence properties are summarized. Throughout this discussion, we emphasize the simplicity of gradient sampling as an extension of the steepest descent method for minimizing smooth objectives. We then provide overviews of various enhancements that have been proposed to improve practical performance, as well as of several extensions that have been made in the literature, such as to solve constrained problems. The paper also includes clarification of certain technical aspects of the analysis of gradient sampling algorithms, most notably related to the assumptions one needs to make about the set of points at which the objective is continuously differentiable. Finally, we discuss possible future research directions.
Submitted 29 April, 2018;
originally announced April 2018.
-
Low-Order Control Design using a Reduced-Order Model with a Stability Constraint on the Full-Order Model
Authors:
Peter Benner,
Tim Mitchell,
Michael L. Overton
Abstract:
We consider low-order controller design for large-scale linear time-invariant dynamical systems with inputs and outputs. Model order reduction is a popular technique, but controllers designed for reduced-order models may result in unstable closed-loop plants when applied to the full-order system. We introduce a new method to design a fixed-order controller by minimizing the $L_\infty$ norm of a reduced-order closed-loop transfer matrix function subject to stability constraints on the closed-loop systems for both the reduced-order and the full-order models. Since the optimization objective and the constraints are all nonsmooth and nonconvex, we use a sequential quadratic programming method based on quasi-Newton updating that is intended for this problem class, available in the open-source software package GRANSO. Using a publicly available test set, the controllers obtained by the new method are compared with those computed by the HIFOO (H-Infinity Fixed-Order Optimization) toolbox applied to a reduced-order model alone, which frequently fail to stabilize the closed-loop system for the associated full-order model.
Submitted 17 March, 2018;
originally announced March 2018.
-
Analysis of the Gradient Method with an Armijo-Wolfe Line Search on a Class of Nonsmooth Convex Functions
Authors:
Azam Asl,
Michael L. Overton
Abstract:
It has long been known that the gradient (steepest descent) method may fail on nonsmooth problems, but the examples that have appeared in the literature are either devised specifically to defeat a gradient or subgradient method with an exact line search or are unstable with respect to perturbation of the initial point. We give an analysis of the gradient method with steplengths satisfying the Armijo and Wolfe inexact line search conditions on the nonsmooth convex function $f(x) = a|x^{(1)}| + \sum_{i=2}^{n} x^{(i)}$. We show that if $a$ is sufficiently large, satisfying a condition that depends only on the Armijo parameter, then, when the method is initiated at any point $x_0 \in \mathbb{R}^n$ with $x^{(1)}_0 \neq 0$, the iterates converge to a point $\bar x$ with $\bar x^{(1)}=0$, although $f$ is unbounded below. We also give conditions under which the function values $f(x_k)\to-\infty$, using a specific Armijo-Wolfe bracketing line search. Our experimental results demonstrate that our analysis is reasonably tight.
Submitted 20 September, 2018; v1 submitted 22 November, 2017;
originally announced November 2017.
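
The failure described in the abstract above can be reproduced with a small experiment. The sketch below runs the gradient method on $f(x) = a|x^{(1)}| + \sum_{i=2}^{n} x^{(i)}$ with a simple doubling/bisection bracketing line search that enforces the Armijo and (weak) Wolfe conditions. It is not the authors' code: the line search details, the parameter values ($a = 10$, $n = 3$, Armijo parameter `c1 = 0.1`, Wolfe parameter `c2 = 0.5`), the starting point, and the iteration count are assumptions chosen only for illustration.

```python
def f(x, a):
    """f(x) = a*|x1| + x2 + ... + xn."""
    return a * abs(x[0]) + sum(x[1:])


def grad(x, a):
    """Gradient of f where x1 != 0 (the sign convention at 0 is arbitrary)."""
    s = 1.0 if x[0] >= 0 else -1.0
    return [a * s] + [1.0] * (len(x) - 1)


def armijo_wolfe(x, d, a, c1, c2, max_steps=200):
    """Bracketing line search: double until bracketed, then bisect."""
    f0 = f(x, a)
    g0 = sum(gi * di for gi, di in zip(grad(x, a), d))  # directional derivative
    lo, hi, t = 0.0, float("inf"), 1.0
    for _ in range(max_steps):
        xt = [xi + t * di for xi, di in zip(x, d)]
        if f(xt, a) > f0 + c1 * t * g0:
            hi = t  # Armijo (sufficient decrease) fails: step too long
        elif sum(gi * di for gi, di in zip(grad(xt, a), d)) < c2 * g0:
            lo = t  # Wolfe (curvature) fails: step too short
        else:
            return t
        t = 0.5 * (lo + hi) if hi < float("inf") else 2.0 * t
    raise RuntimeError("line search did not terminate")


def gradient_method(x0, a, c1=0.1, c2=0.5, iters=20):
    """Steepest descent with the line search above."""
    x = list(x0)
    for _ in range(iters):
        d = [-g for g in grad(x, a)]
        t = armijo_wolfe(x, d, a, c1, c2)
        x = [xi + t * di for xi, di in zip(x, d)]
    return x


x_final = gradient_method([1.0, 0.0, 0.0], a=10.0)
print(x_final, f(x_final, 10.0))
```

In this run the first coordinate shrinks geometrically toward zero while the remaining coordinates stop decreasing, so the function values level off even though $f$ is unbounded below.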
-
Approximating the Real Structured Stability Radius with Frobenius Norm Bounded Perturbations
Authors:
Nicola Guglielmi,
Mert Gurbuzbalaban,
Tim Mitchell,
Michael Overton
Abstract:
We propose a fast method to approximate the real stability radius of a linear dynamical system with output feedback, where the perturbations are restricted to be real valued and bounded with respect to the Frobenius norm. Our work builds on a number of scalable algorithms that have been proposed in recent years, ranging from methods that approximate the complex or real pseudospectral abscissa and radius of large sparse matrices (and generalizations of these methods for pseudospectra to spectral value sets) to algorithms for approximating the complex stability radius (the reciprocal of the $H_\infty$ norm). Although our algorithm is guaranteed to find only upper bounds to the real stability radius, it seems quite effective in practice. As far as we know, this is the first algorithm that addresses the Frobenius-norm version of this problem. Because the cost mainly consists of computing the eigenvalue with maximal real part for continuous-time systems (or modulus for discrete-time systems) of a sequence of matrices, our algorithm remains very efficient for large-scale systems provided that the system matrices are sparse.
Submitted 8 February, 2017;
originally announced February 2017.
-
Multiobjective Robust Control with HIFOO 2.0
Authors:
Suat Gumussoy,
Didier Henrion,
Marc Millstone,
Michael L. Overton
Abstract:
Multiobjective control design is known to be a difficult problem both in theory and practice. Our approach is to search for locally optimal solutions of a nonsmooth optimization problem that is built to incorporate minimization objectives and constraints for multiple plants. We report on the success of this approach using our public-domain Matlab toolbox HIFOO 2.0, comparing our results with benchmarks in the literature.
Submitted 25 May, 2009; v1 submitted 20 May, 2009;
originally announced May 2009.
-
Maximizing the Closed Loop Asymptotic Decay Rate for the Two-Mass-Spring Control Problem
Authors:
Didier Henrion,
Michael L. Overton
Abstract:
We consider the following problem: find a fixed-order linear controller that maximizes the closed-loop asymptotic decay rate for the classical two-mass-spring system. This can be formulated as the problem of minimizing the abscissa (maximum of the real parts of the roots) of a polynomial whose coefficients depend linearly on the controller parameters. We show that the only order for which there is a non-trivial solution is 2. In this case, we derive a controller that we prove locally maximizes the asymptotic decay rate, using recently developed techniques from nonsmooth analysis.
Submitted 29 March, 2006;
originally announced March 2006.