Search | arXiv e-print repository

TaMPERing with Large Language Models: A Field Guide for using Generative AI in Public Administration Research

Authors: Michael Overton, Barrie Robison, Lucas Sheneman

Abstract: The integration of Large Language Models (LLMs) into social science research presents transformative opportunities for advancing scientific inquiry, particularly in public administration (PA). However, the absence of standardized methodologies for using LLMs poses significant challenges for ensuring transparency, reproducibility, and replicability. This manuscript introduces the TaMPER framework, a structured methodology organized around five critical decision points: Task, Model, Prompt, Evaluation, and Reporting. The TaMPER framework provides scholars with a systematic approach to leveraging LLMs effectively while addressing key challenges such as model variability, prompt design, evaluation protocols, and transparent reporting practices.

Submitted 30 March, 2025; originally announced April 2025.

Comments: 23 Pages, 8 Tables

arXiv:2503.22586 [pdf, other]

doi 10.1093/mnrasl/slaf029

Disc breaking through forced eccentricity growth

Authors: Madeline Overton, Rebecca G. Martin, Stephen H. Lubow, Stephen Lepp

Abstract: Motivated by misaligned discs observed in eccentric orbit Be/X-ray binaries, we examine the evolution of a retrograde disc around one component of an eccentric binary with hydrodynamic simulations, $n$-body simulations and linear theory. Forced eccentricity growth from the eccentric orbit binary causes the initially circular disk to undergo eccentricity oscillations. A retrograde disc becomes more radially extended, more highly eccentric and undergoes more rapid apsidal precession compared to a prograde disc. We find that a retrograde disc can be subject to disc breaking where the disc forms two rings with different eccentricities and longitude of periastrons while remaining coplanar. This could have implications for the lightcurves and the X-ray outbursts observed in Be/X-ray binaries.

Submitted 28 March, 2025; originally announced March 2025.

Comments: 6 pages, 3 figures, accepted for publication in MNRAS

arXiv:2503.09637 [pdf]

Complementarity, Augmentation, or Substitutivity? The Impact of Generative Artificial Intelligence on the U.S. Federal Workforce

Authors: William G. Resh, Yi Ming, Xinyao Xia, Michael Overton, Gul Nisa Gürbüz, Brandon De Breuhl

Abstract: This study investigates the near-future impacts of generative artificial intelligence (AI) technologies on occupational competencies across the U.S. federal workforce. We develop a multi-stage Retrieval-Augmented Generation system to leverage large language models for predictive AI modeling that projects shifts in required competencies and to identify vulnerable occupations on a knowledge-by-skill-by-ability basis across the federal government workforce. This study highlights policy recommendations essential for workforce planning in the era of AI. We integrate several sources of detailed data on occupational requirements across the federal government from both centralized and decentralized human resource sources, including from the U.S. Office of Personnel Management (OPM) and various federal agencies. While our preliminary findings suggest some significant shifts in required competencies and potential vulnerability of certain roles to AI-driven changes, we provide nuanced insights that support arguments against abrupt or generic approaches to strategic human capital planning around the development of generative AI. The study aims to inform strategic workforce planning and policy development within federal agencies and demonstrates how this approach can be replicated across other large employment institutions and labor markets.

Submitted 11 March, 2025; originally announced March 2025.

Comments: 53 pages, 9 figures, 3 tables

ACM Class: I.2.7; I.2.11; I.2.1; I.2.3; I.7

arXiv:2503.02014 [pdf, other]

Formation of Be star decretion discs through boundary layer effects

Authors: Rebecca G. Martin, Stephen H. Lubow, David Vallet, Madeline Overton, Stephen Lepp, Zhaohuan Zhu

Abstract: Be stars are rapidly rotating, with angular frequency around $0.7-0.8$ of their Keplerian break up frequency, as a result of significant accretion during the earlier stellar evolution of a companion star. Material from the equator of the Be star is ejected and forms a decretion disc, although the mechanism for the disc formation has remained elusive. We find one-dimensional steady state decretion disc solutions that smoothly transition from a rapidly rotating star that is in hydrostatic balance. Boundary layer effects in a geometrically thick disc which connects to a rotationally flattened star enable the formation of a decretion disc at stellar spin rates below the break up rate. For a disc with an aspect ratio $H/R\approx 0.1$ at the inner edge, the torque from the disc on the star slows the stellar spin to the observed range and mass ejection continues at a rate consistent with observed decretion rates. The critical rotation rate, to which the star slows down, decreases as the disc aspect ratio increases. More generally, steady state accretion and decretion disc solutions can be found for all stellar spin rates. The outcome for a particular system depends upon the balance between the decretion rate and any external infall accretion rate.

Submitted 3 March, 2025; originally announced March 2025.

Comments: Accepted for publication in MNRAS

arXiv:2410.16460 [pdf, other]

doi 10.3847/1538-4357/ad887f

The Rise of Nova V1674 Herculis

Authors: Robert M. Quimby, Brian D. Metzger, Ken J. Shen, Allen W. Shafter, Hank Corbett, Madeline Overton

Abstract: Observational constraints on classical novae are heavily biased to phases near optical peak and later because of the simple fact that novae are not typically discovered until they become bright. The earliest phases of brightening, coming before discovery, are typically missed, but this is changing with the proliferation of wide-field optical monitoring systems including ZTF, ASAS-SN, and Evryscope. Here, we report on unprecedented observations of the fast nova V1674 Her beginning >10 mag below its optical peak and including high-cadence (2 min.) observations that chart a rise of ~8 mag in just 5 hours. Two clear breaks are identified as the light curve transitions first from rising slowly to rising rapidly, followed by a transition to an even faster, nearly linear rate of increasing flux with time. The depths of the observations allow us to place tight constraints on the size of the photosphere under the assumption of blackbody emission from a white dwarf emitting at its Eddington luminosity. We find that the white dwarf was unlikely to have overflowed its Roche lobe prior to the launch of a fast wind, which poses a challenge for explaining the Fermi $γ$-ray detections as the interaction of a fast wind with a slow-torus of gas stripped from the inflated white dwarf envelope by the companion. High-cadence observations of novae from Evryscope and the planned Argus Array can record the diversity of rising nova light curves and help resolve how the interplay between thermonuclear fusion, binary interaction, and shocks power their earliest light.

Submitted 21 October, 2024; originally announced October 2024.

Comments: 20 pages, 6 figures, 2 tables, accepted for publication in ApJ

arXiv:2410.10615 [pdf, other]

Substantial precision enhancements via adaptive symmetry-informed Bayesian metrology

Authors: Matt Overton, Jesús Rubio, Nathan Cooper, Daniele Baldolini, David Johnson, Janet Anders, Lucia Hackermüller

Abstract: Precision measurements are essential for addressing major challenges, ranging from gravitational wave detection to healthcare diagnostics. While quantum sensing experiments are constantly improving to deliver greater precision, the in-depth optimisation of measurement procedures beyond phase estimation has been overlooked. Here we present a systematic strategy for parameter estimation that can be applied across a wide range of experimental platforms operating in the low data limit. Its strength is the inclusion of experimental control parameters, which allows optimisation within a Bayesian procedure and adaptive repetition. We provide general expressions for the optimal estimator and error for any parameter amenable to symmetry-informed strategies. We demonstrate the power of this strategy by applying it to atom number estimation in a quantum technology experiment. Our protocol results in a five-fold reduction in the fractional variance of the atom number estimate compared to a standard, unoptimized protocol. Equivalently, it achieves the target precision with a third of the data points previously required. The enhanced device performance and accelerated data collection achieved with this Bayesian optimised strategy will be essential for applications in quantum computing, communication, metrology, and the wider quantum technology sector.

Submitted 2 May, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

Comments: 11 pages, 5 figures, 2 tables. Revised version with a more explicit theoretical framework and clearer practical applicability

arXiv:2311.10864 [pdf, other]

doi 10.1093/mnrasl/slad172

Retrograde discs around one component of a binary are unstable to tilting

Authors: Madeline Overton, Rebecca G. Martin, Stephen H. Lubow, Stephen Lepp

Abstract: With hydrodynamic simulations we show that a coplanar disc around one component of a binary can be unstable to global tilting when the disc orbits in a retrograde direction relative to the binary. The disc experiences the largest inclination growth relative to the binary orbit in the outermost radii of the disc, closest to the companion. This tilt instability also occurs for test particles. A retrograde disc is much larger than a prograde disc since it is not tidally truncated and instead spreads outwards to the orbit of the companion. The coplanar retrograde disc remains circular while a coplanar prograde disc can become eccentric. We suggest that the inclination instability is due to a disc resonance caused by the interaction of the tilt with the tidal field of the binary. This model is applicable to Be/X-ray binaries in which the Be star disc may be retrograde relative to the binary orbit if there was a sufficiently strong kick from the supernova that formed the neutron star companion. The accretion on to the neutron star and the resulting X-ray outbursts are weaker in the retrograde case compared to the prograde case.

Submitted 17 November, 2023; originally announced November 2023.

Comments: 6 pages, 3 figures, accepted for publication in MNRAS

arXiv:2310.04646 [pdf, ps, other]

An Experimental Comparison of Methods for Computing the Numerical Radius

Authors: Tim Mitchell, Michael L. Overton

Abstract: We make an experimental comparison of methods for computing the numerical radius of an $n\times n$ complex matrix, based on two well-known characterizations, the first a nonconvex optimization problem in one real variable and the second a convex optimization problem in $n^{2}+1$ real variables. We make comparisons with respect to both accuracy and computation time using publicly available software.

Submitted 6 October, 2023; originally announced October 2023.

MSC Class: 15A60; 90C22
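
For orientation, the one-variable characterization mentioned in the abstract is usually written as below. This is the standard textbook form, not text copied from the paper; the symbols $W(A)$, $r(A)$ and $H(\theta)$ are notation chosen here for illustration.

```latex
\[
W(A) = \{\, v^{*} A v : v \in \mathbb{C}^{n},\ \|v\|_2 = 1 \,\}, \qquad
r(A) = \max_{z \in W(A)} |z| ,
\]
\[
r(A) = \max_{\theta \in [0, 2\pi)} \lambda_{\max}\bigl(H(\theta)\bigr), \qquad
H(\theta) = \tfrac{1}{2}\bigl(e^{i\theta} A + e^{-i\theta} A^{*}\bigr).
\]
```

Each $H(\theta)$ is Hermitian, so evaluating the objective is a Hermitian eigenvalue computation, while the outer maximization over $\theta$ is nonconvex. The second characterization in the abstract is a semidefinite program and is not reproduced here.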

arXiv:2309.02443 [pdf, ps, other]

On the Choice of Sign Defining Householder Transformations

Authors: Michael L. Overton, Pinze Yu

Abstract: It is well known that, when defining Householder transformations, the correct choice of sign in the standard formula is important to avoid cancellation and hence numerical instability. In this note we point out that when the "wrong" choice of sign is used, the extent of the resulting instability depends in a somewhat subtle way on the data leading to cancellation.

Submitted 7 October, 2023; v1 submitted 7 August, 2023; originally announced September 2023.

MSC Class: 65F05
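
The sign choice discussed in this abstract can be illustrated with a minimal sketch of the standard Householder construction. The code is not taken from the note; the function names and the `stable` flag are assumptions introduced for this example, and it only shows where the subtraction that can cancel occurs.

```python
import math


def householder_vector(x, stable=True):
    """Return (v, alpha) such that H = I - 2 v v^T / (v^T v) maps x to alpha * e1.

    With stable=True, alpha = -sign(x[0]) * ||x||, so v[0] = x[0] - alpha
    adds two numbers of the same sign and cannot cancel.
    With stable=False, alpha = +sign(x[0]) * ||x||, the "wrong" sign:
    v[0] is then a difference of nearly equal numbers when x is close
    to a multiple of e1.
    """
    norm_x = math.sqrt(sum(t * t for t in x))
    sign = 1.0 if x[0] >= 0 else -1.0
    alpha = -sign * norm_x if stable else sign * norm_x
    v = list(x)
    v[0] = x[0] - alpha
    return v, alpha


def apply_householder(v, x):
    """Apply H = I - 2 v v^T / (v^T v) to the vector x."""
    vtv = sum(t * t for t in v)
    vtx = sum(a * b for a, b in zip(v, x))
    scale = 2.0 * vtx / vtv
    return [xi - scale * vi for xi, vi in zip(x, v)]


if __name__ == "__main__":
    x = [3.0, 4.0]
    v, alpha = householder_vector(x)
    print(apply_householder(v, x), alpha)
```

For `x = [3.0, 4.0]` the stable choice gives `alpha = -5.0` and maps `x` to `[-5.0, 0.0]`. Calling `householder_vector(x, stable=False)` on vectors that are nearly parallel to the first coordinate axis is a simple way to experiment with the cancellation the note analyzes.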

arXiv:2205.15050 [pdf, other]

doi 10.1137/22M1500137

Multi-fidelity robust controller design with gradient sampling

Authors: Steffen W. R. Werner, Michael L. Overton, Benjamin Peherstorfer

Abstract: Robust controllers that stabilize dynamical systems even under disturbances and noise are often formulated as solutions of nonsmooth, nonconvex optimization problems. While methods such as gradient sampling can handle the nonconvexity and nonsmoothness, the costs of evaluating the objective function may be substantial, making robust control challenging for dynamical systems with high-dimensional state spaces. In this work, we introduce multi-fidelity variants of gradient sampling that leverage low-cost, low-fidelity models with low-dimensional state spaces for speeding up the optimization process while nonetheless providing convergence guarantees for a high-fidelity model of the system of interest, which is primarily accessed in the last phase of the optimization process. Our first multi-fidelity method initiates gradient sampling on higher fidelity models with starting points obtained from cheaper, lower fidelity models. Our second multi-fidelity method relies on ensembles of gradients that are computed from low- and high-fidelity models. Numerical experiments with controlling the cooling of a steel rail profile and laminar flow in a cylinder wake demonstrate that our new multi-fidelity gradient sampling methods achieve up to two orders of magnitude speedup compared to the single-fidelity gradient sampling method that relies on the high-fidelity model alone.

Submitted 5 December, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

Comments: 28 pages, 4 figures

MSC Class: 37N35; 37N40; 65K10; 90C30; 90C59

Journal ref: SIAM J. Sci. Comput., 45(2):A933-A957, 2023

arXiv:2108.07754 [pdf, ps, other]

On Properties of Univariate Max Functions at Local Maximizers

Authors: Tim Mitchell, Michael L. Overton

Abstract: More than three decades ago, Boyd and Balakrishnan established a regularity result for the two-norm of a transfer function at maximizers. Their result extends easily to the statement that the maximum eigenvalue of a univariate real analytic Hermitian matrix family is twice continuously differentiable, with Lipschitz second derivative, at all local maximizers, a property that is useful in several applications that we describe. We also investigate whether this smoothness property extends to max functions more generally. We show that the pointwise maximum of a finite set of $q$-times continuously differentiable univariate functions must have zero derivative at a maximizer for $q=1$, but arbitrarily close to the maximizer, the derivative may not be defined, even when $q=3$ and the maximizer is isolated.

Submitted 17 August, 2021; originally announced August 2021.

Comments: Initial preprint

MSC Class: 49J52; 65F99

arXiv:2105.14176 [pdf, ps, other]

Local Minimizers of the Crouzeix Ratio: A Nonsmooth Optimization Case Study

Authors: Michael L. Overton

Abstract: Given a square matrix $A$ and a polynomial $p$, the Crouzeix ratio is the norm of the polynomial on the field of values of $A$ divided by the 2-norm of the matrix $p(A)$. Crouzeix's conjecture states that the globally minimal value of the Crouzeix ratio is 0.5, regardless of the matrix order and polynomial degree, and it is known that 1 is a frequently occurring locally minimal value. Making use of a heavy-tailed distribution to initialize our optimization computations, we demonstrate for the first time that the Crouzeix ratio has many other locally minimal values between 0.5 and 1. Besides showing that the same function values are repeatedly obtained for many different starting points, we also verify that an approximate nonsmooth stationarity condition holds at computed candidate local minimizers. We also find that the same locally minimal values are often obtained both when optimizing over real matrices and polynomials, and over complex matrices and polynomials. We argue that minimization of the Crouzeix ratio makes a very interesting nonsmooth optimization case study, illustrating among other things how effective the BFGS method is for nonsmooth, nonconvex optimization. Our method for verifying approximate nonsmooth stationarity is based on what may be a novel approach to finding approximate subgradients of max functions on an interval. Our extensive computations strongly support Crouzeix's conjecture: in all cases, we find that the smallest locally minimal value is 0.5.

Submitted 28 May, 2021; originally announced May 2021.

Comments: 19 pages, 6 figures

MSC Class: 15A60; 49J52; 65F15
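
Written out, the ratio defined in words in this abstract takes the following form. The symbols $f$, $W(A)$ and $\|p\|_{W(A)}$ are notation assumed here for illustration and may differ from the paper's.

```latex
\[
f(p, A) = \frac{\|p\|_{W(A)}}{\|p(A)\|_2}, \qquad
\|p\|_{W(A)} = \max_{z \in W(A)} |p(z)|, \qquad
W(A) = \{\, v^{*} A v : \|v\|_2 = 1 \,\}.
\]
\[
\text{Crouzeix's conjecture:}\quad f(p, A) \ge \tfrac{1}{2}
\quad\Longleftrightarrow\quad
\|p(A)\|_2 \le 2\, \|p\|_{W(A)} .
\]
```

In this form, the abstract's statement that the globally minimal value is 0.5 is the claim that the constant 2 in the second inequality cannot be improved and is never exceeded.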

arXiv:2008.09551 [pdf, ps, other]

doi 10.1093/qjmam/hbab005

Finding the strongest stable weightless column with a follower load and relocatable concentrated masses

Authors: Oleg N. Kirillov, Michael L. Overton

Abstract: We consider the problem of optimal placement of concentrated masses along a massless elastic column that is clamped at one end and loaded by a nonconservative follower force at the free end. The goal is to find the largest possible interval such that the variation in the loading parameter within this interval preserves stability of the structure. The stability constraint is nonconvex and nonsmooth, making the optimization problem quite challenging. We give a detailed analytical treatment for the case of two masses, arguing that the optimal parameter configuration approaches the flutter and divergence boundaries of the stability region simultaneously. Furthermore, we conjecture that this property holds for any number of masses, which in turn suggests a simple formula for the maximal load interval for $n$ masses. This conjecture is strongly supported by extensive computational results, obtained using the recently developed open-source software package GRANSO (GRadient-based Algorithm for Non-Smooth Optimization) to maximize the load interval subject to an appropriate formulation of the nonsmooth stability constraint. We hope that our work will provide a foundation for new approaches to classical long-standing problems of stability optimization for nonconservative elastic systems arising in civil and mechanical engineering.

Submitted 3 February, 2021; v1 submitted 21 August, 2020; originally announced August 2020.

Journal ref: The Quarterly Journal of Mechanics and Applied Mathematics, 2021, 74(2): 223-250

arXiv:2006.11336 [pdf, other]

Behavior of Limited Memory BFGS when Applied to Nonsmooth Functions and their Nesterov Smoothings

Authors: Azam Asl, Michael L. Overton

Abstract: The motivation to study the behavior of limited-memory BFGS (L-BFGS) on nonsmooth optimization problems is based on two empirical observations: the widespread success of L-BFGS in solving large-scale smooth optimization problems, and the effectiveness of the full BFGS method in solving small to medium-sized nonsmooth optimization problems, based on using a gradient, not a subgradient, oracle paradigm. We first summarize our theoretical results on the behavior of the scaled L-BFGS method with one update applied to a simple convex nonsmooth function that is unbounded below, stating conditions under which the method converges to a non-optimal point regardless of the starting point. We then turn to empirically investigating whether the same phenomenon holds more generally, focusing on a difficult problem of Nesterov, as well as eigenvalue optimization problems arising in semidefinite programming applications. We find that when applied to a nonsmooth function directly, L-BFGS, especially its scaled variant, often breaks down with a poor approximation to an optimal solution, in sharp contrast to full BFGS. Unscaled L-BFGS is less prone to breakdown but conducts far more function evaluations per iteration than scaled L-BFGS does, and thus it is slow. Nonetheless, it is often the case that both variants obtain better results than the provably convergent, but slow, subgradient method.
On the other hand, when applied to Nesterov's smooth approximation of a nonsmooth function, scaled L-BFGS is generally much more efficient than unscaled L-BFGS, often obtaining good results even when the problem is quite ill-conditioned. Summarizing, for large-scale nonsmooth optimization problems for which full BFGS and other methods for nonsmooth optimization are not practical, it is often better to apply L-BFGS to a smooth approximation of a nonsmooth problem than to apply it directly to the nonsmooth problem.

Submitted 18 June, 2020; originally announced June 2020.
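
The paper's test problems are not reproduced here. As a minimal illustration of what a Nesterov smoothing looks like, the sketch below smooths the absolute value: the construction $\max_{|u|\le 1}(ux - \mu u^2/2)$ yields the Huber function. The function names and the parameter `mu` are assumptions chosen for this example.

```python
def smoothed_abs(x, mu):
    """Nesterov smoothing of |x| with parameter mu > 0 (the Huber function).

    Equals max over |u| <= 1 of (u * x - mu * u**2 / 2).
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    if abs(x) <= mu:
        # The unconstrained maximizer u = x / mu is feasible.
        return x * x / (2.0 * mu)
    # The maximizer sits on the boundary, u = sign(x).
    return abs(x) - mu / 2.0


def smoothed_abs_grad(x, mu):
    """Derivative of smoothed_abs; it is continuous with Lipschitz constant 1 / mu."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    if abs(x) <= mu:
        return x / mu
    return 1.0 if x > 0 else -1.0


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, smoothed_abs(x, 1.0), smoothed_abs_grad(x, 1.0))
```

The smoothed function lies within `mu / 2` of `|x|` everywhere, and its gradient has Lipschitz constant `1 / mu`, so a smaller `mu` gives a closer but more ill-conditioned approximation. This is the trade-off behind the abstract's remark about ill-conditioned smoothed problems.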

arXiv:2003.03853 [pdf, ps, other]

doi 10.1109/CDC.2008.4738750

H-infinity Strong Stabilization via HIFOO, a Package for Fixed-Order Controller Design

Authors: Suat Gumussoy, Marc Millstone, Michael L. Overton

Abstract: We report on our experience with strong stabilization using HIFOO, a toolbox for H-infinity fixed-order controller design. We applied HIFOO to 21 fixed-order stable H-infinity controller design problems in the literature, comparing the results with those published for other methods. The results show that HIFOO often achieves good H-infinity performance with low-order stable controllers, unlike other methods in the literature.

Submitted 4 March, 2020; originally announced March 2020.

Comments: arXiv admin note: text overlap with arXiv:2003.02295

Journal ref: 2008 47th IEEE Conference on Decision and Control, Cancun, 2008, pp. 4135-4140

arXiv:2003.02295 [pdf, ps, other]

doi 10.1109/ACC.2008.4586909

Fixed-Order H-infinity Controller Design via HIFOO, a Specialized Nonsmooth Optimization Package

Authors: Suat Gumussoy, Michael L. Overton

Abstract: We report on our experience with fixed-order H-infinity controller design using the HIFOO toolbox. We applied HIFOO to various benchmark fixed (or reduced) order H-infinity controller design problems in the literature, comparing the results with those published for other methods. The results show that HIFOO can be used as an effective alternative to existing methods for fixed-order H-infinity controller design.

Submitted 4 March, 2020; originally announced March 2020.

Journal ref: 2008 American Control Conference, Seattle, WA, 2008, pp. 2750-2754

arXiv:2002.11331 [pdf, other]

Romu: Fast Nonlinear Pseudo-Random Number Generators Providing High Quality

Authors: Mark A. Overton

Abstract: We introduce the Romu family of pseudo-random number generators (PRNGs) which combines the nonlinear operation of rotation with the linear operations of multiplication and (optionally) addition. Compared to conventional linear-only PRNGs, this mixture of linear and nonlinear operations achieves a greater degree of randomness using the same number of arithmetic operations. Or equivalently, it achieves the same randomness with fewer operations, resulting in higher speed. The statistical properties of these generators are strong, as they pass BigCrush and PractRand -- the most stringent test suites available. In addition, Romu generators take maximum advantage of instruction-level parallelism in modern superscalar processors, giving them an output latency of zero clock-cycles when inlined, thus adding no delay to an application. Scaled-down versions of these generators can be created and tested, enabling one to estimate the maximum number of values the full-size generators can supply before their randomness declines, ensuring the success of large jobs. Such capacity-estimates are rare for conventional PRNGs. A linear PRNG has a single cycle of states of known length comprising almost all possible states. However, a Romu generator computes pseudo-random permutations of those states, creating multiple cycles with pseudo-random lengths which cannot be determined by theory.
But the ease of creating state-sizes of 128 or more bits allows (1) short cycles to be constrained to vanishingly low probabilities, and (2) thousands of parallel streams to be created having infinitesimal probabilities of overlap.

Submitted 26 February, 2020; originally announced February 2020.
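
To make the rotate-and-multiply idea concrete, here is a toy two-word generator in that general style. It is a sketch only: the class name, the state update, the multiplier and the rotation count are placeholders chosen for illustration, not the constants or update rules of any Romu variant, and nothing here has been run through BigCrush or PractRand.

```python
MASK64 = (1 << 64) - 1

# Placeholder constants for illustration only (assumptions, not Romu's).
MULTIPLIER = 0x9E3779B97F4A7C15  # an odd 64-bit constant
ROTATION = 27


def rotl64(value, count):
    """Rotate a 64-bit value left by count bits (0 < count < 64)."""
    return ((value << count) | (value >> (64 - count))) & MASK64


class ToyRotateMultiply:
    """Toy generator mixing multiplication, subtraction and rotation."""

    def __init__(self, x_seed, y_seed):
        self.x = x_seed & MASK64
        self.y = y_seed & MASK64
        if self.x == 0 and self.y == 0:
            # The all-zero state maps to itself under this update.
            raise ValueError("seed must not be all zero")

    def next(self):
        """Return the next 64-bit output and advance the state."""
        previous_x = self.x
        # Linear step: multiply modulo 2**64.
        self.x = (MULTIPLIER * self.y) & MASK64
        # Nonlinear step: subtract, then rotate.
        self.y = rotl64((self.y - previous_x) & MASK64, ROTATION)
        return previous_x


if __name__ == "__main__":
    generator = ToyRotateMultiply(1, 2)
    print([hex(generator.next()) for _ in range(4)])
```

The two state updates do not depend on each other's new values, which is the kind of structure that lets a superscalar processor overlap them. For the actual generators and their tested parameters, refer to the paper itself.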

arXiv:1903.00785 [pdf, ps, other]

First-order Perturbation Theory for Eigenvalues and Eigenvectors

Authors: Anne Greenbaum, Ren-cang Li, Michael L. Overton

Abstract: We present first-order perturbation analysis of a simple eigenvalue and the corresponding right and left eigenvectors of a general square matrix, not assumed to be Hermitian or normal. The eigenvalue result is well known to a broad scientific community. The treatment of eigenvectors is more complicated, with a perturbation theory that is not so well known outside a community of specialists. We give two different proofs of the main eigenvector perturbation theorem. The first, a block-diagonalization technique inspired by the numerical linear algebra research community and based on the implicit function theorem, has apparently not appeared in the literature in this form. The second, based on complex function theory and on eigenprojectors, as is standard in analytic perturbation theory, is a simplified version of well-known results in the literature. The second derivation uses a convenient normalization of the right and left eigenvectors defined in terms of the associated eigenprojector, but although this dates back to the 1950s, it is rarely discussed in the literature. We then show how the eigenvector perturbation theory is easily extended to handle other normalizations that are often used in practice. We also explain how to verify the perturbation results computationally. We conclude with some remarks about difficulties introduced by multiple eigenvalues and give references to work on perturbation of invariant subspaces corresponding to multiple or clustered eigenvalues.
Throughout the paper we give extensive bibliographic commentary and references for further reading.

Submitted 3 June, 2019; v1 submitted 2 March, 2019; originally announced March 2019.

MSC Class: 47A55; 65F15

arXiv:1901.00050

Partial smoothness of the numerical radius at matrices whose fields of values are disks

Authors: Adrian S. Lewis, Michael L. Overton

Abstract: Solutions to optimization problems involving the numerical radius often belong to a special class: the set of matrices having field of values a disk centered at the origin. After illustrating this phenomenon with some examples, we illuminate it by studying matrices around which this set of "disk matrices" is a manifold with respect to which the numerical radius is partly smooth. We then apply our results to matrices whose nonzeros consist of a single superdiagonal, such as Jordan blocks and the Crabb matrix related to a well-known conjecture of Crouzeix. Finally, we consider arbitrary complex three-by-three matrices; even in this case, the details are surprisingly intricate. One of our results is that in this real vector space with dimension 18, the set of disk matrices is a semi-algebraic manifold with dimension 12.

Submitted 31 December, 2018; originally announced January 2019.

arXiv:1810.00292

Analysis of Limited-Memory BFGS on a Class of Nonsmooth Convex Functions

Authors: Azam Asl, Michael L. Overton

Abstract: The limited memory BFGS (L-BFGS) method is widely used for large-scale unconstrained optimization, but its behavior on nonsmooth problems has received little attention. L-BFGS can be used with or without "scaling"; the use of scaling is normally recommended. A simple special case, when just one BFGS update is stored and used at every iteration, is sometimes also known as memoryless BFGS. We analyze memoryless BFGS with scaling, using any Armijo-Wolfe line search, on the function $f(x) = a|x^{(1)}| + \sum_{i=2}^{n} x^{(i)}$, initiated at any point $x_0$ with $x_0^{(1)}\not = 0$. We show that if $a\ge 2\sqrt{n-1}$, the absolute value of the normalized search direction generated by this method converges to a constant vector, and if, in addition, $a$ is larger than a quantity that depends on the Armijo parameter, then the iterates converge to a non-optimal point $\bar x$ with $\bar x^{(1)}=0$, although $f$ is unbounded below. As we showed in previous work, the gradient method with any Armijo-Wolfe line search also fails on the same function if $a\geq \sqrt{n-1}$ and $a$ is larger than another quantity depending on the Armijo parameter, but scaled memoryless BFGS fails under a weaker condition relating $a$ to the Armijo parameter than that implying failure of the gradient method. Furthermore, in sharp contrast to the gradient method, if a specific standard Armijo-Wolfe bracketing line search is used, scaled memoryless BFGS fails when $a\ge 2\sqrt{n-1}$ regardless of the Armijo parameter. Finally, numerical experiments indicate that similar results hold for scaled L-BFGS with any fixed number of updates.

Submitted 30 May, 2019; v1 submitted 29 September, 2018; originally announced October 2018.

arXiv:1804.11003

Gradient Sampling Methods for Nonsmooth Optimization

Authors: James V. Burke, Frank E. Curtis, Adrian S. Lewis, Michael L. Overton, Lucas E. A. Simões

Abstract: This paper reviews the gradient sampling methodology for solving nonsmooth, nonconvex optimization problems. An intuitively straightforward gradient sampling algorithm is stated and its convergence properties are summarized. Throughout this discussion, we emphasize the simplicity of gradient sampling as an extension of the steepest descent method for minimizing smooth objectives. We then provide overviews of various enhancements that have been proposed to improve practical performance, as well as of several extensions that have been made in the literature, such as to solve constrained problems. The paper also includes clarification of certain technical aspects of the analysis of gradient sampling algorithms, most notably related to the assumptions one needs to make about the set of points at which the objective is continuously differentiable. Finally, we discuss possible future research directions.

Submitted 29 April, 2018; originally announced April 2018.

Comments: Submitted to: Special Methods for Nonsmooth Optimization (Springer, 2018), edited by A. Bagirov, M. Gaudioso, N. Karmitsa and M. Mäkelä

arXiv:1803.06549

Low-Order Control Design using a Reduced-Order Model with a Stability Constraint on the Full-Order Model

Authors: Peter Benner, Tim Mitchell, Michael L. Overton

Abstract: We consider low-order controller design for large-scale linear time-invariant dynamical systems with inputs and outputs. Model order reduction is a popular technique, but controllers designed for reduced-order models may result in unstable closed-loop plants when applied to the full-order system. We introduce a new method to design a fixed-order controller by minimizing the $L_\infty$ norm of a reduced-order closed-loop transfer matrix function subject to stability constraints on the closed-loop systems for both the reduced-order and the full-order models. Since the optimization objective and the constraints are all nonsmooth and nonconvex we use a sequential quadratic programming method based on quasi-Newton updating that is intended for this problem class, available in the open-source software package GRANSO. Using a publicly available test set, the controllers obtained by the new method are compared with those computed by the HIFOO (H-Infinity Fixed-Order Optimization) toolbox applied to a reduced-order model alone, which frequently fail to stabilize the closed-loop system for the associated full-order model.

Submitted 17 March, 2018; originally announced March 2018.

arXiv:1711.08517

Analysis of the Gradient Method with an Armijo-Wolfe Line Search on a Class of Nonsmooth Convex Functions

Authors: Azam Asl, Michael L. Overton

Abstract: It has long been known that the gradient (steepest descent) method may fail on nonsmooth problems, but the examples that have appeared in the literature are either devised specifically to defeat a gradient or subgradient method with an exact line search or are unstable with respect to perturbation of the initial point. We give an analysis of the gradient method with steplengths satisfying the Armijo and Wolfe inexact line search conditions on the nonsmooth convex function $f(x) = a|x^{(1)}| + \sum_{i=2}^{n} x^{(i)}$. We show that if $a$ is sufficiently large, satisfying a condition that depends only on the Armijo parameter, then, when the method is initiated at any point $x_0 \in \mathbb{R}^n$ with $x^{(1)}_0\not = 0$, the iterates converge to a point $\bar x$ with $\bar x^{(1)}=0$, although $f$ is unbounded below. We also give conditions under which the iterates $f(x_k)\to-\infty$, using a specific Armijo-Wolfe bracketing line search. Our experimental results demonstrate that our analysis is reasonably tight.

Submitted 20 September, 2018; v1 submitted 22 November, 2017; originally announced November 2017.

arXiv:1702.02486

Approximating the Real Structured Stability Radius with Frobenius Norm Bounded Perturbations

Authors: Nicola Guglielmi, Mert Gurbuzbalaban, Tim Mitchell, Michael Overton

Abstract: We propose a fast method to approximate the real stability radius of a linear dynamical system with output feedback, where the perturbations are restricted to be real valued and bounded with respect to the Frobenius norm. Our work builds on a number of scalable algorithms that have been proposed in recent years, ranging from methods that approximate the complex or real pseudospectral abscissa and radius of large sparse matrices (and generalizations of these methods for pseudospectra to spectral value sets) to algorithms for approximating the complex stability radius (the reciprocal of the $H_\infty$ norm). Although our algorithm is guaranteed to find only upper bounds to the real stability radius, it seems quite effective in practice. As far as we know, this is the first algorithm that addresses the Frobenius-norm version of this problem. Because the cost mainly consists of computing the eigenvalue with maximal real part for continuous-time systems (or modulus for discrete-time systems) of a sequence of matrices, our algorithm remains very efficient for large-scale systems provided that the system matrices are sparse.

Submitted 8 February, 2017; originally announced February 2017.

arXiv:0905.3229

Multiobjective Robust Control with HIFOO 2.0

Authors: Suat Gumussoy, Didier Henrion, Marc Millstone, Michael L. Overton

Abstract: Multiobjective control design is known to be a difficult problem both in theory and practice. Our approach is to search for locally optimal solutions of a nonsmooth optimization problem that is built to incorporate minimization objectives and constraints for multiple plants. We report on the success of this approach using our public-domain Matlab toolbox HIFOO 2.0, comparing our results with benchmarks in the literature.

Submitted 25 May, 2009; v1 submitted 20 May, 2009; originally announced May 2009.

arXiv:math/0603681

Maximizing the Closed Loop Asymptotic Decay Rate for the Two-Mass-Spring Control Problem

Authors: Didier Henrion, Michael L. Overton

Abstract: We consider the following problem: find a fixed-order linear controller that maximizes the closed-loop asymptotic decay rate for the classical two-mass-spring system. This can be formulated as the problem of minimizing the abscissa (maximum of the real parts of the roots) of a polynomial whose coefficients depend linearly on the controller parameters. We show that the only order for which there is a non-trivial solution is 2. In this case, we derive a controller that we prove locally maximizes the asymptotic decay rate, using recently developed techniques from nonsmooth analysis.

Submitted 29 March, 2006; originally announced March 2006.

MSC Class: 93D15 93C05 49J52

Showing 1–26 of 26 results for author: Overton, M