-
Efficient single-cycle pulse compression of an ytterbium fiber laser at 10 MHz repetition rate
Authors:
F. Köttig,
D. Schade,
J. R. Koehler,
P. St. J. Russell,
F. Tani
Abstract:
Over the past years, ultrafast lasers with average powers in the 100 W range have become a mature technology, with a multitude of applications in science and technology. Nonlinear temporal compression of these lasers to few- or even single-cycle duration is often essential, yet still hard to achieve, in particular at high repetition rates. Here we report a two-stage system for compressing pulses from a 1030 nm ytterbium fiber laser to single-cycle durations with 5 μJ output pulse energy at 9.6 MHz repetition rate. In the first stage, the laser pulses are compressed from 340 to 25 fs by spectral broadening in a krypton-filled single-ring photonic crystal fiber (SR-PCF), subsequent phase compensation being achieved with chirped mirrors. In the second stage, the pulses are further compressed to single-cycle duration by soliton-effect self-compression in a neon-filled SR-PCF. We estimate a pulse duration of ~3.4 fs at the fiber output by numerically back-propagating the measured pulses. Finally, we directly measured a pulse duration of 3.8 fs (1.25 optical cycles) after compensating (using chirped mirrors) the dispersion introduced by the optical elements after the fiber, more than 50% of the total pulse energy being in the main peak. The system can produce compressed pulses with peak powers >0.6 GW and a total transmission exceeding 70%.
Submitted 23 January, 2020;
originally announced January 2020.
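The figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only a plausibility check: it assumes a Gaussian intensity profile for the main peak (the abstract does not state the pulse shape) and takes the "more than 50%" energy fraction at its lower bound of 0.5. All variable and function names are illustrative.

```python
import math

# Values taken from the abstract.
PULSE_ENERGY_J = 5e-6        # 5 uJ output pulse energy
REP_RATE_HZ = 9.6e6          # 9.6 MHz repetition rate
FWHM_S = 3.8e-15             # 3.8 fs measured pulse duration
MAIN_PEAK_FRACTION = 0.5     # lower bound: "more than 50%" of the energy

def average_power(energy_j, rep_rate_hz):
    """Average power is pulse energy times repetition rate."""
    return energy_j * rep_rate_hz

def gaussian_peak_power(energy_j, fwhm_s):
    """Peak power of a Gaussian pulse (assumed shape) with the given energy and FWHM.

    For a Gaussian intensity profile, P_peak = 2*sqrt(ln2/pi) * E / FWHM,
    i.e. roughly 0.94 * E / FWHM.
    """
    return 2.0 * math.sqrt(math.log(2.0) / math.pi) * energy_j / fwhm_s

avg_w = average_power(PULSE_ENERGY_J, REP_RATE_HZ)
peak_w = gaussian_peak_power(MAIN_PEAK_FRACTION * PULSE_ENERGY_J, FWHM_S)

print(f"average output power: {avg_w:.1f} W")
print(f"peak power (Gaussian assumption): {peak_w / 1e9:.2f} GW")
```

Under these assumptions the estimate lands slightly above 0.6 GW, which agrees with the peak power stated in the abstract.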
-
Technical Design Report for the PANDA Endcap Disc DIRC
Authors:
Panda Collaboration,
F. Davi,
W. Erni,
B. Krusche,
M. Steinacher,
N. Walford,
H. Liu,
Z. Liu,
B. Liu,
X. Shen,
C. Wang,
J. Zhao,
M. Albrecht,
T. Erlen,
F. Feldbauer,
M. Fink,
V. Freudenreich,
M. Fritsch,
F. H. Heinsius,
T. Held,
T. Holtmann,
I. Keshk,
H. Koch,
B. Kopf,
M. Kuhlmann
, et al. (441 additional authors not shown)
Abstract:
PANDA (anti-Proton ANnihilation at DArmstadt) is planned to be one of the four main experiments at the future international accelerator complex FAIR (Facility for Antiproton and Ion Research) in Darmstadt, Germany. It is going to address fundamental questions of hadron physics and quantum chromodynamics using cooled antiproton beams with a high intensity and momenta between 1.5 and 15 GeV/c. PANDA is designed to reach a maximum luminosity of 2×10^32 cm^-2 s^-1. Most of the physics programs require excellent particle identification (PID). The PID of hadronic states at the forward endcap of the target spectrometer will be done by a fast and compact Cherenkov detector that uses the detection of internally reflected Cherenkov light (DIRC) principle. It is designed to cover the polar angle range from 5° to 22° and to separate charged pions and kaons with a separation power of up to 3 standard deviations (s.d.) for particle momenta up to 4 GeV/c in order to cover the important particle phase space. This document describes the technical design and the expected performance of the novel PANDA Disc DIRC detector, a type that has not been used in any other high energy physics (HEP) experiment before. The performance has been studied with Monte-Carlo simulations and various beam tests at DESY and CERN. The final design meets all PANDA requirements and guarantees sufficient safety margins.
Submitted 29 December, 2019;
originally announced December 2019.
-
Safe and Fast Tracking on a Robot Manipulator: Robust MPC and Neural Network Control
Authors:
Julian Nubert,
Johannes Köhler,
Vincent Berenz,
Frank Allgöwer,
Sebastian Trimpe
Abstract:
Fast feedback control and safety guarantees are essential in modern robotics. We present an approach that achieves both by combining novel robust model predictive control (MPC) with function approximation via (deep) neural networks (NNs). The result is a new approach for complex tasks with nonlinear, uncertain, and constrained dynamics as are common in robotics. Specifically, we leverage recent results in MPC research to propose a new robust setpoint tracking MPC algorithm, which achieves reliable and safe tracking of a dynamic setpoint while guaranteeing stability and constraint satisfaction. The presented robust MPC scheme constitutes a one-layer approach that unifies the often separated planning and control layers, by directly computing the control command based on a reference and possibly obstacle positions. As a separate contribution, we show how the computation time of the MPC can be drastically reduced by approximating the MPC law with a NN controller. The NN is trained and validated from offline samples of the MPC, yielding statistical guarantees, and used in lieu thereof at run time. Our experiments on a state-of-the-art robot manipulator are the first to show that both the proposed robust and approximate MPC schemes scale to real-world robotic systems.
Submitted 2 March, 2020; v1 submitted 21 December, 2019;
originally announced December 2019.
-
Robust Economic Model Predictive Control without Terminal Conditions
Authors:
Lukas Schwenkel,
Johannes Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
In this paper, a novel tube-based economic Model Predictive Control (MPC) scheme for uncertain systems that uses neither terminal costs nor terminal constraints is investigated. We show that the results from the undisturbed case can be extended to systems with bounded disturbances by using similar turnpike arguments and a properly modified stage cost. We prove robust guarantees on the closed-loop performance, convergence, and stability under suitable dissipativity and controllability conditions and discuss them in a numerical example.
Submitted 27 July, 2020; v1 submitted 27 November, 2019;
originally announced November 2019.
-
A Sub-sampled Tensor Method for Non-convex Optimization
Authors:
Aurelien Lucchi,
Jonas Kohler
Abstract:
We present a stochastic optimization method that uses a fourth-order regularized model to find local minima of smooth and potentially non-convex objective functions with a finite-sum structure. This algorithm uses sub-sampled derivatives instead of exact quantities. The proposed approach is shown to find an $(ε_1,ε_2,ε_3)$-third-order critical point in at most $\mathcal{O}\left(\max\left(ε_1^{-4/3}, ε_2^{-2}, ε_3^{-4}\right)\right)$ iterations, thereby matching the rate of deterministic approaches. In order to prove this result, we derive a novel tensor concentration inequality for sums of tensors of any order that makes explicit use of the finite-sum structure of the objective function.
Submitted 15 July, 2023; v1 submitted 23 November, 2019;
originally announced November 2019.
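To get a feel for the iteration bound quoted in the abstract above, the following sketch evaluates the three terms inside the maximum and reports which one dominates for given tolerances. It ignores the constants hidden in the big-O notation, so the numbers indicate scaling only; the function name `dominant_iteration_term` is hypothetical and not taken from the paper.

```python
def dominant_iteration_term(eps1, eps2, eps3):
    """Return the name and value of the largest term in
    max(eps1^(-4/3), eps2^(-2), eps3^(-4)), ignoring big-O constants."""
    terms = {
        "eps1^(-4/3)": eps1 ** (-4.0 / 3.0),  # first-order tolerance
        "eps2^(-2)": eps2 ** -2.0,            # second-order tolerance
        "eps3^(-4)": eps3 ** -4.0,            # third-order tolerance
    }
    name = max(terms, key=terms.get)
    return name, terms[name]

# With equal tolerances, the third-order condition is the bottleneck.
name, value = dominant_iteration_term(0.01, 0.01, 0.01)
print(name, f"{value:.3g}")
```

With all three tolerances equal and below one, the third-order term grows fastest, so the third-order criticality requirement determines the bound.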
-
A nonlinear tracking model predictive control scheme for dynamic target signals
Authors:
Johannes Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
We present a nonlinear model predictive control (MPC) scheme for tracking of dynamic target signals. The scheme combines stabilization and dynamic trajectory planning in one layer, thus ensuring constraint satisfaction irrespective of changes in the dynamic target signal. For periodic target signals we ensure exponential stability of the optimal reachable periodic trajectory using suitable terminal ingredients and a convexity condition for the underlying periodic optimal control problem. Furthermore, we introduce an online optimization of the terminal set size to automate the trade-off between fast convergence and operation close to the constraints. In addition, we show how stabilization and dynamic trajectory planning can be formulated as partially decoupled optimization problems, which reduces the computational demand while ensuring recursive feasibility and convergence. The main tool to enable the proposed design is a novel reference generic offline computation that provides suitable terminal ingredients for tracking of dynamic reference trajectories. The practicality of this approach is demonstrated on benchmark examples, which show superior performance compared to state-of-the-art approaches.
Submitted 20 October, 2020; v1 submitted 8 November, 2019;
originally announced November 2019.
-
Stability and performance in transient average constrained economic MPC without terminal constraints
Authors:
Mario Rosenfelder,
Johannes Köhler,
Frank Allgöwer
Abstract:
In this paper, we investigate system theoretic properties of transient average constrained economic model predictive control (MPC) without terminal constraints. We show that the optimal open-loop solution passes by the optimal steady-state for consecutive time instants. Using this turnpike property and suitable controllability conditions, we provide closed-loop performance bounds. Furthermore, stability is proved by combining the rotated value function with an input-to-state (ISS) Lyapunov function of an extended state related to the transient average constraints. The results are illustrated with a numerical example.
Submitted 20 October, 2020; v1 submitted 8 November, 2019;
originally announced November 2019.
-
A robust adaptive model predictive control framework for nonlinear uncertain systems
Authors:
Johannes Köhler,
Peter Kötting,
Raffaele Soloperto,
Frank Allgöwer,
Matthias A. Müller
Abstract:
In this paper, we present a tube-based framework for robust adaptive model predictive control (RAMPC) for nonlinear systems subject to parametric uncertainty and additive disturbances. Set-membership estimation is used to provide accurate bounds on the parametric uncertainty, which are employed for the construction of the tube in a robust MPC scheme. The resulting RAMPC framework ensures robust recursive feasibility and robust constraint satisfaction, while allowing for less conservative operation compared to robust MPC schemes without model/parameter adaptation. Furthermore, by using an additional mean-squared point estimate in the objective function the framework ensures finite-gain $\mathcal{L}_2$ stability w.r.t. additive disturbances. As a first contribution we derive suitable monotonicity and non-increasing properties on general parameter estimation algorithms and tube/set based RAMPC schemes that ensure robust recursive feasibility and robust constraint satisfaction under recursive model updates. Then, as the main contribution of this paper, we provide similar conditions for a tube based formulation that is parametrized using an incremental Lyapunov function, a scalar contraction rate and a function bounding the uncertainty. With this result, we can provide simple constructive designs for different RAMPC schemes with varying computational complexity and conservatism. As a corollary, we can demonstrate that state of the art formulations for nonlinear RAMPC are a special case of the proposed framework. We provide a numerical example that demonstrates the flexibility of the proposed framework and showcase improvements compared to state of the art approaches.
Submitted 20 October, 2020; v1 submitted 7 November, 2019;
originally announced November 2019.
-
A computationally efficient robust model predictive control framework for uncertain nonlinear systems -- extended version
Authors:
Johannes Köhler,
Raffaele Soloperto,
Matthias A. Müller,
Frank Allgöwer
Abstract:
In this paper, we present a nonlinear robust model predictive control (MPC) framework for general (state and input dependent) disturbances. This approach uses an online constructed tube in order to tighten the nominal (state and input) constraints. To facilitate an efficient online implementation, the shape of the tube is based on an offline computed incremental Lyapunov function with a corresponding (nonlinear) incrementally stabilizing feedback. Crucially, the online optimization only implicitly includes these nonlinear functions in terms of scalar bounds, which enables an efficient implementation. Furthermore, to account for an efficient evaluation of the worst case disturbance, a simple function is constructed offline that upper bounds the possible disturbance realizations in a neighbourhood of a given point of the open-loop trajectory. The resulting MPC scheme ensures robust constraint satisfaction and practical asymptotic stability with a moderate increase in the online computational demand compared to a nominal MPC. We demonstrate the applicability of the proposed framework in comparison to state of the art robust MPC approaches with a nonlinear benchmark example. This paper is an extended version of [1], and contains further details and additionally considers: continuous-time systems (App. A), more general nonlinear constraints (App. B) and special cases (Sec. IV).
Submitted 4 June, 2020; v1 submitted 26 October, 2019;
originally announced October 2019.
-
Data-Driven Tracking MPC for Changing Setpoints
Authors:
Julian Berberich,
Johannes Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
We propose a data-driven tracking model predictive control (MPC) scheme to control unknown discrete-time linear time-invariant systems. The scheme uses a purely data-driven system parametrization to predict future trajectories based on behavioral systems theory. The control objective is tracking of a given input-output setpoint. We prove that this setpoint is exponentially stable for the closed loop of the proposed MPC, if it is reachable by the system dynamics and constraints. For an unreachable setpoint, our scheme guarantees closed-loop exponential stability of the optimal reachable equilibrium. Moreover, in case the system dynamics are known, the presented results extend the existing results for model-based setpoint tracking to the case where the stage cost is only positive semidefinite in the state. The effectiveness of the proposed approach is illustrated by means of a practical example.
Submitted 16 April, 2021; v1 submitted 21 October, 2019;
originally announced October 2019.
-
DP-MAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning
Authors:
Frederik Harder,
Jonas Köhler,
Max Welling,
Mijung Park
Abstract:
Developing a differentially private deep learning algorithm is challenging, due to the difficulty in analyzing the sensitivity of objective functions that are typically used to train deep neural networks. Many existing methods resort to the stochastic gradient descent algorithm and apply a pre-defined sensitivity to the gradients for privatizing weights. However, their slow convergence typically yields a high cumulative privacy loss. Here, we take a different route by employing the method of auxiliary coordinates, which allows us to independently update the weights per layer by optimizing a per-layer objective function. This objective function can be well approximated by a low-order Taylor's expansion, in which sensitivity analysis becomes tractable. We perturb the coefficients of the expansion for privacy, which we optimize using more advanced optimization routines than SGD for faster convergence. We empirically show that our algorithm provides a decent trained model quality under a modest privacy budget.
Submitted 15 October, 2019;
originally announced October 2019.
-
Equivariant Flows: sampling configurations for multi-body systems with symmetric energies
Authors:
Jonas Köhler,
Leon Klein,
Frank Noé
Abstract:
Flows are exact-likelihood generative neural networks that transform samples from a simple prior distribution to the samples of the probability distribution of interest. Boltzmann Generators (BG) combine flows and statistical mechanics to sample equilibrium states of strongly interacting many-body systems such as proteins with 1000 atoms. In order to scale and generalize these results, it is essential that the natural symmetries of the probability density - in physics defined by the invariances of the energy function - are built into the flow. Here we develop theoretical tools for constructing such equivariant flows and demonstrate that a BG that is equivariant with respect to rotations and particle permutations can generalize to sampling nontrivially new configurations where a nonequivariant BG cannot.
Submitted 1 October, 2019;
originally announced October 2019.
-
A nonlinear model predictive control framework using reference generic terminal ingredients -- extended version
Authors:
Johannes Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
In this paper, we present a quasi infinite horizon nonlinear model predictive control (MPC) scheme for tracking of generic reference trajectories. This scheme is applicable to nonlinear systems, which are locally incrementally stabilizable. For such systems, we provide a reference generic offline procedure to compute an incrementally stabilizing feedback with a continuously parameterized quadratic quasi infinite horizon terminal cost. As a result we get a nonlinear reference tracking MPC scheme with a valid terminal cost for general reachable reference trajectories without increasing the online computational complexity. As a corollary, the terminal cost can also be used to design nonlinear MPC schemes that reliably operate under online changing conditions, including unreachable reference signals. The practicality of this approach is demonstrated with a benchmark example.
This paper is an extended version of the accepted paper [1], and contains additional details regarding robust trajectory tracking (App. B), continuous-time dynamics (App. C), output tracking stage costs (App. D) and the connection to incremental system properties (App. A).
Submitted 11 March, 2020; v1 submitted 27 September, 2019;
originally announced September 2019.
-
Linear robust adaptive model predictive control: Computational complexity and conservatism -- extended version
Authors:
Johannes Köhler,
Elisa Andina,
Raffaele Soloperto,
Matthias A. Müller,
Frank Allgöwer
Abstract:
In this paper, we present a robust adaptive model predictive control (MPC) scheme for linear systems subject to parametric uncertainty and additive disturbances. The proposed approach provides a computationally efficient formulation with theoretical guarantees (constraint satisfaction and stability), while allowing for reduced conservatism and improved performance due to online parameter adaptation. A moving window parameter set identification is used to compute a fixed complexity parameter set based on past data. Robust constraint satisfaction is achieved by using a computationally efficient tube based robust MPC method. The predicted cost function is based on a least mean squares point estimate, which ensures finite-gain $\mathcal{L}_2$ stability of the closed loop. The overall algorithm has a fixed (user specified) computational complexity. We illustrate the applicability of the approach and the trade-off between conservatism and computational complexity using a numerical example.
This paper is an extended version of [1], and contains additional details regarding the theoretical proof of Theorem 1, the numerical example, and the offline computations in Appendices A and B.
Submitted 11 March, 2020; v1 submitted 4 September, 2019;
originally announced September 2019.
-
Two-Staged Acoustic Modeling Adaption for Robust Speech Recognition by the Example of German Oral History Interviews
Authors:
Michael Gref,
Christoph Schmidt,
Sven Behnke,
Joachim Köhler
Abstract:
In automatic speech recognition, often little training data is available for specific challenging tasks, but training of state-of-the-art automatic speech recognition systems requires large amounts of annotated speech. To address this issue, we propose a two-staged approach to acoustic modeling that combines noise and reverberation data augmentation with transfer learning to robustly address challenges such as difficult acoustic recording conditions, spontaneous speech, and speech of elderly people. We evaluate our approach using the example of German oral history interviews, where a relative average reduction of the word error rate by 19.3% is achieved.
Submitted 19 August, 2019;
originally announced August 2019.
-
Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks
Authors:
Jörg Wagner,
Jan Mathias Köhler,
Tobias Gindele,
Leon Hetzel,
Jakob Thaddäus Wiedemer,
Sven Behnke
Abstract:
To verify and validate networks, it is essential to gain insight into their decisions, limitations as well as possible shortcomings of training data. In this work, we propose a post-hoc, optimization based visual explanation method, which highlights the evidence in the input image for a specific prediction. Our approach is based on a novel technique to defend against adversarial evidence (i.e. faulty evidence due to artefacts) by filtering gradients during optimization. The defense does not depend on human-tuned parameters. It enables explanations which are both fine-grained and preserve the characteristics of images, such as edges and colors. The explanations are interpretable, suited for visualizing detailed evidence and can be tested as they are valid model inputs. We qualitatively and quantitatively evaluate our approach on a multitude of models and datasets.
Submitted 7 August, 2019;
originally announced August 2019.
-
The Role of Memory in Stochastic Optimization
Authors:
Antonio Orvieto,
Jonas Kohler,
Aurelien Lucchi
Abstract:
The choice of how to retain information about past gradients dramatically affects the convergence properties of state-of-the-art stochastic optimization methods, such as Heavy-ball, Nesterov's momentum, RMSprop and Adam. Building on this observation, we use stochastic differential equations (SDEs) to explicitly study the role of memory in gradient-based algorithms. We first derive a general continuous-time model that can incorporate arbitrary types of memory, for both deterministic and stochastic settings. We provide convergence guarantees for this SDE for weakly-quasi-convex and quadratically growing functions. We then demonstrate how to discretize this SDE to get a flexible discrete-time algorithm that can implement a broad spectrum of memories ranging from short- to long-term. Not only does this algorithm increase the degrees of freedom in algorithmic choice for practitioners but it also comes with better stability properties than classical momentum in the convex stochastic setting. In particular, no iterate averaging is needed for convergence. Interestingly, our analysis also provides a novel interpretation of Nesterov's momentum as stable gradient amplification and highlights a possible reason for its unstable behavior in the (convex) stochastic setting. Furthermore, we discuss the use of long term memory for second-moment estimation in adaptive methods, such as Adam and RMSprop. Finally, we provide an extensive experimental study of the effect of different types of memory in both convex and nonconvex settings.
Submitted 11 March, 2020; v1 submitted 2 July, 2019;
originally announced July 2019.
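As a point of reference for the "memory" discussed in the abstract above, the sketch below implements the textbook Heavy-ball update, in which the previous step is reused with weight `beta`. This is the classical method named in the abstract, not the paper's SDE-derived algorithm; the test function, step size, and momentum value are illustrative assumptions.

```python
def heavy_ball(grad, x0, lr=0.1, beta=0.9, steps=500):
    """Classical Heavy-ball iteration on a scalar problem:
    x_{k+1} = x_k - lr * grad(x_k) + beta * (x_k - x_{k-1}).
    The beta term carries a geometrically decaying memory of past gradients.
    """
    x_prev = x0
    x = x0
    for _ in range(steps):
        x_next = x - lr * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Illustrative objective f(x) = 0.5 * x**2, whose gradient is x and minimizer is 0.
x_final = heavy_ball(lambda x: x, x0=5.0)
print(f"final iterate: {x_final:.3e}")
```

Setting `beta=0` removes the memory and recovers plain gradient descent, which makes the effect of the momentum term easy to compare on the same objective.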
-
Uncertainty Based Detection and Relabeling of Noisy Image Labels
Authors:
Jan M. Köhler,
Maximilian Autenrieth,
William H. Beluch
Abstract:
Deep neural networks (DNNs) are powerful tools in computer vision tasks. However, in many realistic scenarios label noise is prevalent in the training images, and overfitting to these noisy labels can significantly harm the generalization performance of DNNs. We propose a novel technique to identify data with noisy labels based on the different distributions of the predictive uncertainties from a DNN over the clean and noisy data. Additionally, the behavior of the uncertainty over the course of training helps to identify the network weights which can best be used to relabel the noisy labels. Data with noisy labels can therefore be cleaned in an iterative process. Our proposed method can be easily implemented, and shows promising performance on the task of noisy label detection on CIFAR-10 and CIFAR-100.
Submitted 29 May, 2019;
originally announced June 2019.
-
Data-Driven Model Predictive Control with Stability and Robustness Guarantees
Authors:
Julian Berberich,
Johannes Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
We propose a robust data-driven model predictive control (MPC) scheme to control linear time-invariant (LTI) systems. The scheme uses an implicit model description based on behavioral systems theory and past measured trajectories. In particular, it does not require any prior identification step, but only an initially measured input-output trajectory as well as an upper bound on the order of the unknown system. First, we prove exponential stability of a nominal data-driven MPC scheme with terminal equality constraints in the case of no measurement noise. For bounded additive output measurement noise, we propose a robust modification of the scheme, including a slack variable with regularization in the cost. We prove that the application of this robust MPC scheme in a multi-step fashion leads to practical exponential stability of the closed loop w.r.t. the noise level. The presented results provide the first (theoretical) analysis of closed-loop properties, resulting from a simple, purely data-driven MPC scheme.
Submitted 16 April, 2021; v1 submitted 11 June, 2019;
originally announced June 2019.
-
Simultaneous retrodiction of multi-mode optomechanical systems using matched filters
Authors:
Jonathan Kohler,
Justin A. Gerber,
Emma Deist,
Dan M. Stamper-Kurn
Abstract:
Generation and manipulation of many-body entangled states is of considerable interest, for applications in quantum simulation or sensing, for example. Measurement and verification of the resulting many-body state presents a formidable challenge, however, which can be simplified by multiplexed readout using shared measurement resources. In this work, we analyze and demonstrate state retrodiction for a system of optomechanical oscillators coupled to a single-mode optical cavity. Coupling to the shared cavity field facilitates simultaneous optical measurement of the oscillators' transient dynamics at distinct frequencies. Optimal estimators for the oscillators' initial state can be defined as a set of linear matched filters, derived from a detailed model for the detected homodyne signal. We find that the optimal state estimate for optomechanical retrodiction is obtained from high-cooperativity measurements, reaching estimate sensitivity at the Standard Quantum Limit (SQL). Simultaneous estimation of the state of multiple oscillators places additional limits on the estimate precision, due to the diffusive noise each oscillator adds to the optomechanical signal. However, we show that the sensitivity of simultaneous multi-mode state retrodiction reaches the SQL for sufficiently well-resolved oscillators. Finally, an experimental demonstration of two-mode retrodiction is presented, which requires further accounting for technical fluctuations of the oscillator frequency.
Submitted 1 February, 2020; v1 submitted 8 June, 2019;
originally announced June 2019.
-
Adaptive norms for deep learning with regularized Newton methods
Authors:
Jonas Kohler,
Leonard Adolphs,
Aurelien Lucchi
Abstract:
We investigate the use of regularized Newton methods with adaptive norms for optimizing neural networks. This approach can be seen as a second-order counterpart of adaptive gradient methods, which we here show to be interpretable as first-order trust region methods with ellipsoidal constraints. In particular, we prove that the preconditioning matrix used in RMSProp and Adam satisfies the necessary conditions for provable convergence of second-order trust region methods with standard worst-case complexities on general non-convex objectives. Furthermore, we run experiments across different neural architectures and datasets to find that the ellipsoidal constraints consistently outperform their spherical counterpart both in terms of number of backpropagations and asymptotic loss value. Finally, we find comparable performance to state-of-the-art first-order methods in terms of backpropagations, but further advances in hardware are needed to render Newton methods competitive in terms of computational time.
Submitted 28 September, 2020; v1 submitted 22 May, 2019;
originally announced May 2019.
-
Low albedos of hot to ultra-hot Jupiters in the optical to near-infrared transition regime
Authors:
M. Mallonn,
J. Köhler,
X. Alexoudi,
C. von Essen,
T. Granzer,
K. Poppenhaeger,
K. G. Strassmeier
Abstract:
The depth of a secondary eclipse contains information on both the thermally emitted light component of a hot Jupiter and the reflected light component. If the dayside atmosphere of the planet is assumed to be isothermal, it is possible to disentangle the two. In this work, we analyze 11 eclipse light curves of the hot Jupiter HAT-P-32b obtained at 0.89 $μ$m in the z' band. We obtain a null detection for the eclipse depth with state-of-the-art precision, -0.01 ± 0.10 ppt. We confirm previous studies showing that a non-inverted atmosphere model is in disagreement with the measured emission spectrum of HAT-P-32b. We derive an upper limit on the reflected light component, and thus on the planetary geometric albedo $A_g$. The 97.5%-confidence upper limit is $A_g$ < 0.2. This is the first albedo constraint for HAT-P-32b, and the first z' band albedo value for any exoplanet. It disfavors the influence of large-sized silicate condensates on the planetary day side. We also inferred z' band geometric albedo limits from published eclipse measurements for the ultra-hot Jupiters WASP-12b, WASP-19b, WASP-103b, and WASP-121b, applying the same method. These values consistently point to a low reflectivity in the optical to near-infrared transition regime for hot to ultra-hot Jupiters.
Submitted 27 March, 2019; v1 submitted 21 February, 2019;
originally announced February 2019.
-
Boltzmann Generators -- Sampling Equilibrium States of Many-Body Systems with Deep Learning
Authors:
Frank Noé,
Simon Olsson,
Jonas Köhler,
Hao Wu
Abstract:
Computing equilibrium states in condensed-matter many-body systems, such as solvated proteins, is a long-standing challenge. Lacking methods for generating statistically independent equilibrium samples in "one shot", vast computational effort is invested in simulating these systems in small steps, e.g., using Molecular Dynamics. Combining deep learning and statistical mechanics, we here develop Boltzmann Generators, which are shown to generate unbiased one-shot equilibrium samples of representative condensed matter systems and proteins. Boltzmann Generators use neural networks to learn a coordinate transformation of the complex configurational equilibrium distribution to a distribution that can be easily sampled. Accurate computation of free energy differences and discovery of new configurations are demonstrated, providing a statistical mechanics tool that can avoid rare events during sampling without prior knowledge of reaction coordinates.
Submitted 12 July, 2019; v1 submitted 4 December, 2018;
originally announced December 2018.
-
Interband electron pairing for superconductivity from the breakdown of the Born-Oppenheimer approximation
Authors:
Myung-Hwan Whangbo,
Shuiquan Deng,
Jürgen Köhler,
Arndt Simon
Abstract:
The origin of interband electron pairing responsible for enhancing superconductivity and the factors controlling its strength were examined. We show that the interband electron pairing is a natural consequence of breaking down the Born-Oppenheimer approximation during the electron-phonon interactions. Its strength is determined by the pair-state excitations around the Fermi surfaces that take place to form a superconducting state. Fermi surfaces favorable for the pairing were found and its implications were discussed.
Submitted 9 October, 2018;
originally announced October 2018.
-
Long-lived refractive index changes induced by femtosecond ionization in gas-filled single-ring photonic crystal fibers
Authors:
Johannes R. Koehler,
Felix Köttig,
Barbara M. Trabold,
Francesco Tani,
Philip St. J. Russell
Abstract:
We investigate refractive index changes caused by femtosecond photoionization in a gas-filled hollow-core photonic crystal fiber. Using spatially-resolved interferometric side-probing, we find that these changes live for tens of microseconds after the photoionization event - eight orders of magnitude longer than the pulse duration. Oscillations in the megahertz frequency range are simultaneously observed, caused by mechanical vibrations of the thin-walled capillaries surrounding the hollow core. These two non-local effects can affect the propagation of a second pulse that arrives within their lifetime, which works out to repetition rates of tens of kilohertz. Filling the fiber with an atomically lighter gas significantly reduces ionization, lessening the strength of the refractive index changes. The results will be important for understanding the dynamics of gas-based fiber systems operating at high intensities and high repetition rates, when temporally non-local interactions between successive laser pulses become relevant.
Submitted 20 September, 2018;
originally announced September 2018.
-
Bias Correction For Paid Search In Media Mix Modeling
Authors:
Aiyou Chen,
David Chan,
Mike Perry,
Yuxue Jin,
Yunting Sun,
Yueqing Wang,
Jim Koehler
Abstract:
Evaluating the return on ad spend (ROAS), the causal effect of advertising on sales, is critical to advertisers for understanding the performance of their existing marketing strategy as well as how to improve and optimize it. Media Mix Modeling (MMM) has been used as a convenient analytical tool to address the problem using observational data. However, it is well recognized that MMM suffers from various fundamental challenges: data collection, model specification and selection bias due to ad targeting, among others \citep{chan2017,wolfe2016}.
In this paper, we study the challenge associated with measuring the impact of search ads in MMM, namely the selection bias due to ad targeting. Using causal diagrams of the search ad environment, we derive a statistically principled method for bias correction based on the back-door criterion \citep{pearl2013causality}. We use case studies to show that the method provides promising results by comparison with results from randomized experiments. We also report a more complex case study where the advertiser had spent on more than a dozen media channels but results from a randomized experiment are not available. Both our theory and empirical studies suggest that in some common, practical scenarios, one may be able to obtain an approximately unbiased estimate of search ad ROAS.
Submitted 9 July, 2018;
originally announced July 2018.
-
The streaming rollout of deep networks - towards fully model-parallel execution
Authors:
Volker Fischer,
Jan Köhler,
Thomas Pfeil
Abstract:
Deep neural networks, and in particular recurrent networks, are promising candidates to control autonomous agents that interact in real-time with the physical world. However, this requires a seamless integration of temporal features into the network's architecture. For the training of and inference with recurrent neural networks, they are usually rolled out over time, and different rollouts exist. Conventionally during inference, the layers of a network are computed in a sequential manner resulting in sparse temporal integration of information and long response times. In this study, we present a theoretical framework to describe rollouts, the level of model-parallelization they induce, and demonstrate differences in solving specific tasks. We prove that certain rollouts, also for networks with only skip and no recurrent connections, enable earlier and more frequent responses, and show empirically that these early responses have better performance. The streaming rollout maximizes these properties and enables a fully parallel execution of the network reducing runtime on massively parallel devices. Finally, we provide an open-source toolbox to design, train, evaluate, and interact with streaming rollouts.
Submitted 2 November, 2018; v1 submitted 13 June, 2018;
originally announced June 2018.
-
Learning an Approximate Model Predictive Controller with Guarantees
Authors:
Michael Hertneck,
Johannes Köhler,
Sebastian Trimpe,
Frank Allgöwer
Abstract:
A supervised learning framework is proposed to approximate a model predictive controller (MPC) with reduced computational complexity and guarantees on stability and constraint satisfaction. The framework can be used for a wide class of nonlinear systems. Any standard supervised learning technique (e.g. neural networks) can be employed to approximate the MPC from samples. In order to obtain closed-loop guarantees for the learned MPC, a robust MPC design is combined with statistical learning bounds. The MPC design ensures robustness to inaccurate inputs within given bounds, and Hoeffding's Inequality is used to validate that the learned MPC satisfies these bounds with high confidence. The result is a closed-loop statistical guarantee on stability and constraint satisfaction for the learned MPC. The proposed learning-based MPC framework is illustrated on a nonlinear benchmark problem, for which we learn a neural network controller with guarantees.
Submitted 11 June, 2018;
originally announced June 2018.
-
Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization
Authors:
Jonas Kohler,
Hadi Daneshmand,
Aurelien Lucchi,
Ming Zhou,
Klaus Neymeyr,
Thomas Hofmann
Abstract:
Normalization techniques such as Batch Normalization have been applied successfully for training deep neural networks. Yet, despite its apparent empirical benefits, the reasons behind the success of Batch Normalization are mostly hypothetical. We here aim to provide a more thorough theoretical understanding from a classical optimization perspective. Our main contribution towards this goal is the identification of various problem instances in the realm of machine learning where Batch Normalization can provably accelerate optimization. We argue that this acceleration is due to the fact that Batch Normalization splits the optimization task into optimizing length and direction of the parameters separately. This allows gradient-based methods to leverage a favourable global structure in the loss landscape that we prove to exist in Learning Halfspace problems and neural network training with Gaussian inputs. We thereby turn Batch Normalization from an effective practical heuristic into a provably converging algorithm for these settings. Furthermore, we substantiate our analysis with empirical evidence that suggests the validity of our theoretical results in a broader context.
Submitted 6 October, 2018; v1 submitted 27 May, 2018;
originally announced May 2018.
-
Double-slit photoelectron interference in strong-field ionization of the neon dimer
Authors:
Maksim Kunitski,
Nicolas Eicke,
Pia Huber,
Jonas Köhler,
Stefan Zeller,
Jörg Voigtsberger,
Nikolai Schlott,
Kevin Henrichs,
Hendrik Sann,
Florian Trinter,
Lothar Ph. H. Schmidt,
Anton Kalinin,
Markus Schöffler,
Till Jahnke,
Manfred Lein,
Reinhard Dörner
Abstract:
Wave-particle duality is an inherent peculiarity of the quantum world. The double-slit experiment has been frequently used for understanding different aspects of this fundamental concept. The occurrence of interference rests on the lack of which-way information and on the absence of decoherence mechanisms, which could scramble the wave fronts. In this letter, we report on the observation of two-center interference in the molecular frame photoelectron momentum distribution upon ionization of the neon dimer by a strong laser field. Postselection of ions, which were measured in coincidence with electrons, allowed choosing the symmetry of the continuum electronic wave function, leading to observation of both gerade and ungerade types of interference.
Submitted 20 March, 2018;
originally announced March 2018.
-
Escaping Saddles with Stochastic Gradients
Authors:
Hadi Daneshmand,
Jonas Kohler,
Aurelien Lucchi,
Thomas Hofmann
Abstract:
We analyze the variance of stochastic gradients along negative curvature directions in certain non-convex machine learning models and show that stochastic gradients exhibit a strong component along these directions. Furthermore, we show that - contrary to the case of isotropic noise - this variance is proportional to the magnitude of the corresponding eigenvalues and not decreasing in the dimensionality. Based upon this observation we propose a new assumption under which we show that the injection of explicit, isotropic noise usually applied to make gradient descent escape saddle points can successfully be replaced by a simple SGD step. Additionally - and under the same condition - we derive the first convergence rate for plain SGD to a second-order stationary point in a number of iterations that is independent of the problem dimension.
Submitted 16 September, 2018; v1 submitted 15 March, 2018;
originally announced March 2018.
-
Spherical CNNs
Authors:
Taco S. Cohen,
Mario Geiger,
Jonas Koehler,
Max Welling
Abstract:
Convolutional Neural Networks (CNNs) have become the method of choice for learning problems involving 2D planar images. However, a number of problems of recent interest have created a demand for models that can analyze spherical images. Examples include omnidirectional vision for drones, robots, and autonomous cars, molecular regression problems, and global weather and climate modelling. A naive application of convolutional networks to a planar projection of the spherical signal is destined to fail, because the space-varying distortions introduced by such a projection will make translational weight sharing ineffective.
In this paper we introduce the building blocks for constructing spherical CNNs. We propose a definition for the spherical cross-correlation that is both expressive and rotation-equivariant. The spherical correlation satisfies a generalized Fourier theorem, which allows us to compute it efficiently using a generalized (non-commutative) Fast Fourier Transform (FFT) algorithm. We demonstrate the computational efficiency, numerical accuracy, and effectiveness of spherical CNNs applied to 3D model recognition and atomization energy regression.
Submitted 25 February, 2018; v1 submitted 30 January, 2018;
originally announced January 2018.
-
Interplay of cascaded Raman- and Brillouin-like scattering in nanostructured optical waveguides
Authors:
R. E. Noskov,
J. R. Koehler,
A. A. Sukhorukov
Abstract:
We formulate a generic concept of engineering optical modes and mechanical resonances in a pair of optically-coupled light-guiding membranes for achieving cascaded light scattering to multiple Stokes and anti-Stokes orders. By utilizing the light pressure exerted on the webs and their induced flexural vibrations, featuring a flat phonon dispersion curve with a non-zero cut-off frequency, we show how to realize exact phase-matching between multiple successive optical side-bands. We predict continuous-wave generation of frequency combs for fundamental and high-order optical modes mediated via backward- and forward-propagating phonons, accompanied by periodic reversal of the energy flow between mechanical and optical modes without using any kind of cavity. These results reveal new possibilities for tailoring light-sound interactions through simultaneous Raman-like intramodal and Brillouin-like intermodal scattering processes.
Submitted 11 December, 2017;
originally announced December 2017.
-
Convolutional Networks for Spherical Signals
Authors:
Taco Cohen,
Mario Geiger,
Jonas Köhler,
Max Welling
Abstract:
The success of convolutional networks in learning problems involving planar signals such as images is due to their ability to exploit the translation symmetry of the data distribution through weight sharing. Many areas of science and engineering deal with signals with other symmetries, such as rotation-invariant data on the sphere. Examples include climate and weather science, astrophysics, and chemistry. In this paper we present spherical convolutional networks. These networks use convolutions on the sphere and rotation group, which results in rotational weight sharing and rotation equivariance. Using a synthetic spherical MNIST dataset, we show that spherical convolutional networks are very effective at dealing with rotationally invariant classification problems.
Submitted 15 September, 2017; v1 submitted 14 September, 2017;
originally announced September 2017.
-
Negative-mass instability of the spin and motion of an atomic gas driven by optical cavity backaction
Authors:
Jonathan Kohler,
Justin A. Gerber,
Emma Dowd,
Dan M. Stamper-Kurn
Abstract:
We realize a spin-orbit interaction between the collective spin precession and center-of-mass motion of a trapped ultracold atomic gas, mediated by spin- and position-dependent dispersive coupling to a driven optical cavity. The collective spin, precessing near its highest-energy state in an applied magnetic field, can be approximated as a negative-mass harmonic oscillator. When the Larmor precession and mechanical motion are nearly resonant, cavity mediated coupling leads to a negative-mass instability, driving exponential growth of a correlated mode of the hybrid system. We observe this growth imprinted on modulations of the cavity field and estimate the full covariance of the resulting two-mode state by observing its transient decay during subsequent free evolution.
Submitted 2 January, 2018; v1 submitted 13 September, 2017;
originally announced September 2017.
-
Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images
Authors:
Waleed M. Gondal,
Jan M. Köhler,
René Grzeszick,
Gernot A. Fink,
Michael Hirsch
Abstract:
Convolutional neural networks (CNNs) show impressive performance for image classification and detection, extending heavily to the medical image domain. Nevertheless, medical experts are sceptical of these predictions, as the nonlinear multilayer structure resulting in a classification outcome is not directly graspable. Recently, approaches have been shown which help the user to understand the discriminative regions within an image which are decisive for the CNN to conclude to a certain class. Although these approaches could help to build trust in the CNN's predictions, they have rarely been shown to work with medical image data, which often poses a challenge as the decision for a class relies on different lesion areas scattered around the entire image. Using the DiaretDB1 dataset, we show that on retina images different lesion areas fundamental for diabetic retinopathy are detected on an image level with high accuracy, comparable to or exceeding supervised methods. On lesion level, we achieve few false positives with high sensitivity, even though the network is solely trained on image-level labels which do not include information about existing lesions. Classifying between diseased and healthy images, we achieve an AUC of 0.954 on the DiaretDB1.
Submitted 29 June, 2017;
originally announced June 2017.
-
Coherent control of flexural vibrations in dual-nanoweb fibers using phase-modulated two-frequency light
Authors:
Johannes R. Koehler,
Roman E. Noskov,
Andrey A. Sukhorukov,
David Novoa,
Philip St. J. Russell
Abstract:
Coherent control of the resonant response in spatially extended optomechanical structures is complicated by the fact that the optical drive is affected by the back-action from the generated phonons. Here we report a new approach to coherent control based on stimulated Raman-like scattering, in which the optical pressure can remain unaffected by the induced vibrations even in the regime of strong optomechanical interactions. We demonstrate experimentally coherent control of flexural vibrations simultaneously along the whole length of a dual-nanoweb fiber, by imprinting steps in the relative phase between the components of a two-frequency pump signal, the beat frequency being chosen to match a flexural resonance. Furthermore, sequential switching of the relative phase at time intervals shorter than the lifetime of the vibrations reduces their amplitude to a constant value that is fully adjustable by tuning the phase-modulation depth and switching rate. The results may trigger new developments in silicon photonics, since such coherent control uniquely decouples the amplitude of optomechanical oscillations from power-dependent thermal effects and nonlinear optical loss.
Submitted 18 December, 2017; v1 submitted 22 June, 2017;
originally announced June 2017.
-
Towards a Knowledge Graph based Speech Interface
Authors:
Ashwini Jaya Kumar,
Sören Auer,
Christoph Schmidt,
Joachim Köhler
Abstract:
Applications which use human speech as an input require a speech interface with high recognition accuracy. The words or phrases in the recognised text are annotated with a machine-understandable meaning and linked to knowledge graphs for further processing by the target application. These semantic annotations of recognised words can be represented as subject-predicate-object triples which collectively form a graph often referred to as a knowledge graph. This type of knowledge representation facilitates the use of speech interfaces with any spoken input application: since the information is represented in logical, semantic form, retrieval and storage can be performed using any web-standard query language. In this work, we develop a methodology for linking speech input to knowledge graphs and study the impact of recognition errors in the overall process. We show that for a corpus with lower WER, the annotation and linking of entities to the DBpedia knowledge graph is considerable. DBpedia Spotlight, a tool to interlink text documents with linked open data, is used to link the speech recognition output to the DBpedia knowledge graph. Such a knowledge-based speech recognition interface is useful for applications such as question answering or spoken dialog systems.
Submitted 23 May, 2017;
originally announced May 2017.
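The recognition-error measure named in the abstract above, WER, is conventionally defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: simple whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

A misrecognized entity name counts as one substitution, which is one way a higher WER can reduce the number of words that can be linked to knowledge-graph entities.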
-
A generalized approach to model the spectra and radiation dose rate of solar particle events on the surface of Mars
Authors:
Jingnan Guo,
Cary Zeitlin,
Robert F. Wimmer-Schweingruber,
Thoren McDole,
Patrick Kuehl,
Jan C. Appel,
Daniel Matthiae,
Johannes Krauss,
Jan Koehler
Abstract:
For future human missions to Mars, it is important to study the surface radiation environment during extreme and elevated conditions. In the long term, it is mainly Galactic Cosmic Rays (GCRs) modulated by solar activity that contribute to the radiation on the surface of Mars, but intense solar energetic particle (SEP) events may induce acute health effects. Such events may enhance the radiation level significantly and should be detected as quickly as possible to prevent severe damage to humans and equipment. However, the energetic particle environment on the Martian surface is significantly different from that in deep space due to the influence of the Martian atmosphere. Depending on the intensity and shape of the original solar particle spectra as well as particle types, the surface spectra may induce entirely different radiation effects. In order to give immediate and accurate alerts while avoiding unnecessary ones, it is important to model and thoroughly understand the atmospheric effect on the incoming SEPs, including both protons and helium ions. In this paper, we have developed a generalized approach to quickly model the surface response to any given incoming proton/helium ion spectrum and have applied it to a set of historical large solar events, thus providing insights into the possible variety of surface radiation environments that may be induced during SEP events. Based on a statistical study of more than 30 significant solar events, we have obtained an empirical model for estimating the surface dose rate directly from the intensity of a power-law SEP spectrum.
Submitted 12 December, 2017; v1 submitted 9 May, 2017;
originally announced May 2017.
-
Sub-sampled Cubic Regularization for Non-convex Optimization
Authors:
Jonas Moritz Kohler,
Aurelien Lucchi
Abstract:
We consider the minimization of non-convex functions that typically arise in machine learning. Specifically, we focus our attention on a variant of trust region methods known as cubic regularization. This approach is particularly attractive because it escapes strict saddle points and it provides stronger convergence guarantees than first- and second-order as well as classical trust region methods. However, it suffers from a high computational complexity that makes it impractical for large-scale learning. Here, we propose a novel method that uses sub-sampling to lower this computational cost. By the use of concentration inequalities we provide a sampling scheme that gives sufficiently accurate gradient and Hessian approximations to retain the strong global and local convergence guarantees of cubically regularized methods. To the best of our knowledge this is the first work that gives global convergence guarantees for a sub-sampled variant of cubic regularization on non-convex functions. Furthermore, we provide experimental results supporting our theory.
Submitted 1 July, 2017; v1 submitted 16 May, 2017;
originally announced May 2017.
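To make the idea in the abstract above concrete, the sketch below forms sub-sampled gradient and Hessian estimates for a finite-sum objective and evaluates the standard cubic-regularized model m(s) = g·s + ½ s·B·s + (σ/3)‖s‖³. It is an illustration only: the function names, the uniform sampling without replacement, and the user-supplied callables `grad_i` and `hess_i` are assumptions, and the paper's actual sampling scheme and sample-size rules (derived from concentration inequalities) are not reproduced here.

```python
import numpy as np


def subsampled_grad_hess(w, grad_i, hess_i, n, n_grad, n_hess, rng):
    """Average per-sample gradients and Hessians over random index subsets.

    grad_i(w, i) and hess_i(w, i) are hypothetical callables returning the
    gradient vector and Hessian matrix of the i-th term of a finite sum.
    """
    # Assumption: independent uniform subsets, drawn without replacement.
    g_idx = rng.choice(n, size=n_grad, replace=False)
    h_idx = rng.choice(n, size=n_hess, replace=False)
    g = np.mean([grad_i(w, i) for i in g_idx], axis=0)
    B = np.mean([hess_i(w, i) for i in h_idx], axis=0)
    return g, B


def cubic_model(s, g, B, sigma):
    """Cubic-regularized local model of the objective change for step s."""
    return g @ s + 0.5 * s @ B @ s + (sigma / 3.0) * np.linalg.norm(s) ** 3
```

In a full method, a step `s` would be obtained by approximately minimizing `cubic_model` at each iteration; that subproblem solver is deliberately left out of this sketch.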
-
The road map toward room temperature superconductivity: manipulating different pairing channels in systems composed of multiple electronic components
Authors:
Annette Bussmann-Holder,
Jurgen Kohler,
Arndt Simon,
Myung-Hwan Whangbo,
Antonio Bianconi,
Andrea Perali
Abstract:
While it is known that the amplification of the superconducting critical temperature Tc is possible in a system of multiple electronic components in comparison with a single component system, many different road maps for room temperature superconductivity have been proposed for a variety of multicomponent scenarios. Here we focus on the scenario where the first electronic component is assumed to have a vanishing Fermi velocity corresponding to a case of the intermediate polaronic regime, and the second electronic component is in the weak coupling regime with standard high Fermi velocity using a mean field theory for multiband superconductivity. This roadmap is motivated by compelling experimental evidence for one component in the proximity of a Lifshitz transition in cuprates, diborides and iron based superconductors. By keeping a constant and small exchange interaction between the two electron fluids, we search for the optimum coupling strength in the electronic polaronic component which gives the largest amplification of the superconducting critical temperature in comparison with the case of a single electronic component.
Submitted 6 July, 2017; v1 submitted 2 April, 2017;
originally announced April 2017.
-
The Solar Orbiter Mission: an Energetic Particle Perspective
Authors:
R. Gómez-Herrero,
J. Rodríguez-Pacheco,
R. F. Wimmer-Schweingruber,
G. M. Mason,
S. Sánchez-Prieto,
C. Martín,
M. Prieto,
G. C. Ho,
F. Espinosa Lara,
I. Cernuda,
J. J. Blanco,
A. Russu,
O. Rodríguez Polo,
S. R. Kulkarni,
C. Terasa,
L. Panitzsch,
S. I. Böttcher,
S. Boden,
B. Heber,
J. Steinhagen,
J. Tammen,
J. Köhler,
C. Drews,
R. Elftmann,
A. Ravanbakhsh
, et al. (5 additional authors not shown)
Abstract:
Solar Orbiter is a joint ESA-NASA mission planned for launch in October 2018. The science payload includes remote-sensing and in-situ instrumentation designed with the primary goal of understanding how the Sun creates and controls the heliosphere. The spacecraft will follow an elliptical orbit around the Sun, with perihelion as close as 0.28 AU. During the late orbit phase the orbital plane will reach inclinations above 30 degrees, allowing direct observations of the solar polar regions. The Energetic Particle Detector (EPD) is an instrument suite consisting of several sensors measuring electrons, protons and ions over a broad energy interval (2 keV to 15 MeV for electrons, 3 keV to 100 MeV for protons and a few tens of keV/nuc to 450 MeV/nuc for ions), providing composition, spectra, timing and anisotropy information. We present an overview of Solar Orbiter from the energetic particle perspective, summarizing the capabilities of EPD and the opportunities that these new observations will provide for understanding how energetic particles are accelerated during solar eruptions and how they propagate through the heliosphere.
Submitted 15 January, 2017;
originally announced January 2017.
-
Condensed-matter equation of states covering a wide region of pressure studied experimentally
Authors:
Elijah E. Gordon,
Juergen Koehler,
Myung-Hwan Whangbo
Abstract:
The relationships among the pressure P, volume V, and temperature T of solid-state materials are described by their equations of state (EOSs), which are often derived from the consideration of the finite-strain energy or the interatomic potential [1-3]. These EOSs typically contain three parameters, which are determined by fitting experimental P-V-T data. In the empirical approach to EOSs, one either refines such fitting parameters or improves the mathematical functions [3-5] to better simulate the experimental data. Despite over seven decades of studies on EOSs, none has been found to be accurate for all types of solids over the whole temperature and pressure ranges studied experimentally [3,6,7]. Here we show that the simple empirical EOS, $P = \alpha_1 (PV) + \alpha_2 (PV)^2 + \alpha_3 (PV)^3$, in which the pressure P is indirectly related to the volume V through a cubic polynomial of the energy term PV with three fitting parameters $\alpha_1$ - $\alpha_3$, provides accurate descriptions for the P-vs-V data of condensed matter in a wide region of pressure studied experimentally, even in the presence of phase transitions.
Submitted 16 December, 2016;
originally announced December 2016.
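Because the empirical EOS quoted in the abstract above is linear in its three coefficients, they can in principle be estimated from measured (P, V) pairs by ordinary linear least squares on the powers of the product PV. The sketch below illustrates this under stated assumptions: the function name `fit_pv_cubic_eos` is hypothetical, units are left to the caller, and the paper's own fitting procedure is not described in the abstract and may differ.

```python
import numpy as np


def fit_pv_cubic_eos(P, V):
    """Least-squares estimate of (alpha1, alpha2, alpha3) in
    P = alpha1*(PV) + alpha2*(PV)**2 + alpha3*(PV)**3."""
    P = np.asarray(P, dtype=float)
    V = np.asarray(V, dtype=float)
    x = P * V  # the "energy term" PV
    # Design matrix with one column per power of PV (no constant term).
    A = np.column_stack([x, x**2, x**3])
    alphas, *_ = np.linalg.lstsq(A, P, rcond=None)
    return alphas
```

For real data, the columns of the design matrix can differ by many orders of magnitude, so rescaling P and V before fitting may be advisable.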
-
Multigap superconductivity at extremely high temperature: a model for the case of pressurized H2S
Authors:
A. Bussmann-Holder,
J. Kohler,
A. Simon,
M. Whangbo,
A. Bianconi
Abstract:
It is known that in pressurized H2S the complex electronic structure in the energy range of 200 meV near the chemical potential can be separated into two electronic components, the first characterized by steep bands with a high Fermi velocity and the second by flat bands with a vanishing Fermi velocity. The phonon modes interacting with electrons at the Fermi energy can also be separated into two components: hard modes with high energy around 150 meV and soft modes with energies around 60 meV. Therefore we discuss here a multiband scenario in the standard BCS approximation where the effective BCS coupling coefficient is in the range 0.1-0.32. We consider a first (second) BCS condensate in the strong (weak) coupling regime 0.32 (0.15). We discuss different scenarios segregated in different portions of the material. The results show the phenomenology of unconventional superconducting phases in this two-gap superconductivity scenario, where there are two electronic components in two Fermi surface spots, the pairing is mediated by either a soft or a hard phonon branch, and the inter-band exchange term, even if small, plays a key role in the emergence of high temperature superconductivity in pressurized sulfur hydride.
Submitted 3 December, 2016;
originally announced December 2016.
-
Transparent EuTiO3 films: a novel two-dimensional magneto-optical device for light modulation
Authors:
Annette Bussmann-Holder,
Krystian Roleder,
Benjamin Stuhlhofer,
Gennady Logvenov,
Iwona Lazar,
Andrzej Soszyński,
Janusz Koperski,
Arndt Simon,
Jürgen Köhler
Abstract:
The magneto-optical activity of high-quality transparent thin films of insulating EuTiO3 (ETO) deposited on a thin SrTiO3 (STO) substrate, both being non-magnetic materials, is demonstrated to be a versatile tool for light modulation. The operating temperature is close to room temperature and allows for a range of device engineering options. By using small magnetic fields, the birefringence of the samples can be switched off and on. Similarly, rotation of the sample in the field can modify its birefringence Δn. In addition, Δn can be increased by a factor of 4 in very modest fields while simultaneously raising the operating temperature by almost 100 K.
Submitted 26 September, 2016;
originally announced September 2016.
-
Cavity-assisted measurement and coherent control of collective atomic spin oscillators
Authors:
Jonathan Kohler,
Nicolas Spethmann,
Sydney Schreppler,
Dan M. Stamper-Kurn
Abstract:
We demonstrate continuous measurement and coherent control of the collective spin of an atomic ensemble undergoing Larmor precession in a high-finesse optical cavity. The coupling of the precessing spin to the cavity field yields phenomena similar to those observed in cavity optomechanics, including cavity amplification, damping, and optical spring shifts. These effects arise from autonomous optical feedback onto the atomic spin dynamics, conditioned by the cavity spectrum. We use this feedback to stabilize the spin in either its high- or low-energy state, where, in equilibrium with measurement back-action heating, it achieves a steady-state temperature, indicated by an asymmetry between the Stokes and anti-Stokes scattering rates. For sufficiently large Larmor frequency, such feedback stabilizes the spin ensemble in a nearly pure quantum state, in spite of continuous measurement by the cavity field.
Submitted 10 January, 2017; v1 submitted 26 July, 2016;
originally announced July 2016.
-
A model-based approach to the spatial and spectral calibration of NIRSpec onboard JWST
Authors:
Bernhard Dorner,
Giovanna Giardino,
Pierre Ferruit,
Catarina Alves de Oliveira,
Stephan M. Birkmann,
Torsten Böker,
Guido De Marchi,
Xavier Gnata,
Jess Köhler,
Marco Sirianni,
Peter Jakobsen
Abstract:
Context: The NIRSpec instrument for the James Webb Space Telescope (JWST) can be operated in multiobject (MOS), long-slit, and integral field (IFU) mode with spectral resolutions from 100 to 2700. Its MOS mode uses about a quarter of a million individually addressable minislits for object selection, covering a field of view of $\sim$9 $\mathrm{arcmin}^2$. Aims: The pipeline used to extract wavelength-calibrated spectra from NIRSpec detector images relies heavily on a model of NIRSpec optical geometry. We demonstrate how dedicated calibration data from a small subset of NIRSpec modes and apertures can be used to optimize this parametric model to the necessary levels of fidelity. Methods: Following an iterative procedure, the initial fiducial values of the model parameters are manually adjusted and then automatically optimized, so that the model-predicted locations of the images and spectral lines from the fixed slits, the IFU, and a small subset of the MOS apertures match their measured locations in the main optical planes of the instrument. Results: The NIRSpec parametric model is able to reproduce the spatial and spectral position of the input spectra with high fidelity. The intrinsic accuracy (1-sigma, RMS) of the model, as measured from the extracted calibration spectra, is better than 1/10 of a pixel along the spatial direction and better than 1/20 of a resolution element in the spectral direction for all of the grating-based spectral modes. This is fully consistent with the corresponding allocation in the spatial and spectral calibration budgets of NIRSpec.
Submitted 17 June, 2016;
originally announced June 2016.
-
High temperature superconductivity in sulfur hydride under ultrahigh pressure: A complex superconducting phase beyond conventional BCS
Authors:
Annette Bussmann-Holder,
Jurgen Kohler,
M. -H. Whangbo,
Antonio Bianconi,
Arndt Simon
Abstract:
The recently reported superconductivity under high pressure at the record transition temperature of Tc = 203 K in sulfur hydride has been identified as conventional in view of the observation of an isotope effect upon deuteration. Here it is demonstrated that conventional theories of superconductivity in the sense of BCS or Eliashberg formalisms can account neither for the observed values of Tc nor for the pressure dependence of the isotope coefficient. The only way out of the dilemma is a multi-band approach to superconductivity, where even a small interband coupling suffices to achieve the high values of Tc together with the anomalous pressure-dependent isotope effect. In addition, it is shown that anharmonicity of the hydrogen bonds vanishes under pressure whereas anharmonic phonon modes related to sulfur are still active.
Submitted 21 May, 2016; v1 submitted 4 May, 2016;
originally announced May 2016.
-
Spin orientations of the spin-half Ir4+ ions in Sr3NiIrO6, Sr2IrO4 and Na2IrO3: Density functional, perturbation theory and Madelung potential analyses
Authors:
Elijah E. Gordon,
Hongjun Xiang,
Jürgen Köhler,
Myung-Hwan Whangbo
Abstract:
The spins of the low-spin Ir4+ (S = 1/2, d5) ions at the octahedral sites of the oxides Sr3NiIrO6, Sr2IrO4 and Na2IrO3 exhibit preferred orientations with respect to their IrO6 octahedra. We evaluated the magnetic anisotropies of these S = 1/2 ions on the basis of DFT calculations including spin-orbit coupling (SOC), and probed their origin by performing perturbation theory analyses with SOC as perturbation within the LS coupling scheme. The observed spin orientations of Sr3NiIrO6 and Sr2IrO4 are correctly predicted by DFT calculations, and are accounted for by the perturbation theory analysis. As for the spin orientation of Na2IrO3, neither experimental studies nor DFT calculations have been unequivocal. Our analysis reveals that the Ir4+ spin orientation of Na2IrO3 should have nonzero components along the c- and a-axis directions. The spin orientations determined by DFT calculations are sensitive to the accuracy of the crystal structures employed, which is explained by perturbation theory analyses when interactions between adjacent Ir4+ ions are taken into consideration. There are indications that the 5d electrons of Na2IrO3 are less strongly localized than those of Sr3NiIrO6 and Sr2IrO4. This implication was confirmed by showing that the Madelung potentials of the Ir4+ ions are less negative in Na2IrO3 than in Sr3NiIrO6 and Sr2IrO4. Most transition-metal S = 1/2 ions do have magnetic anisotropies because the SOC induces interactions among their crystal-field split d-states, and the associated mixing of the states modifies only the orbital parts of the states. This finding cannot be mimicked by a spin Hamiltonian because this model Hamiltonian lacks the orbital degree of freedom, thereby leading to the spin-half syndrome. The spin-orbital entanglement for the 5d spin-half ions Ir4+ is not as strong as has been assumed lately.
Submitted 12 March, 2016; v1 submitted 1 March, 2016;
originally announced March 2016.
-
Structure and Composition of the 200 K-Superconducting Phase of H2S under Ultrahigh Pressure: The Perovskite (SH-)(H3S+)
Authors:
Elijah E. Gordon,
Ke Xu,
Hongjun Xiang,
Annette Bussmann-Holder,
Reinhard K. Kremer,
Arndt Simon,
Jürgen Köhler,
Myung-Hwan Whangbo
Abstract:
H2S is converted under ultrahigh pressure (> 110 GPa) to a metallic phase that becomes superconducting with a record Tc of 200 K. It has been proposed that the superconducting phase is body-centered cubic H3S (Im3m, a = 3.089 Å) resulting from a decomposition reaction 3H2S → 2H3S + S. The analogy of H2S and H2O leads us to a very different conclusion. The well-known dissociation of water into H3O+ and OH- increases by orders of magnitude under pressure. An equivalent behavior of H2S is anticipated under pressure with the dissociation 2H2S → H3S+ + SH-, forming a perovskite structure (SH-)(H3S+), which consists of corner-sharing SH6 octahedra with SH- at each A-site (i.e., the center of each S8 cube). Our DFT calculations show that the perovskite (SH-)(H3S+) is thermodynamically more stable than the Im3m structure of H3S, and suggest that the A-site H atoms are most likely fluxional even at Tc.
Submitted 7 April, 2016; v1 submitted 8 February, 2016;
originally announced February 2016.