-
New spectral imaging biomarkers for sepsis and mortality in intensive care
Authors:
Silvia Seidlitz,
Katharina Hölzl,
Ayca von Garrel,
Jan Sellner,
Stephan Katzenschlager,
Tobias Hölle,
Dania Fischer,
Maik von der Forst,
Felix C. F. Schmitt,
Markus A. Weigand,
Lena Maier-Hein,
Maximilian Dietrich
Abstract:
With sepsis remaining a leading cause of mortality, early identification of septic patients and those at high risk of death is a challenge of high socioeconomic importance. The driving hypothesis of this study was that hyperspectral imaging (HSI) could provide novel biomarkers for sepsis diagnosis and treatment management due to its potential to monitor microcirculatory alterations. We conducted a…
▽ More
With sepsis remaining a leading cause of mortality, early identification of septic patients and those at high risk of death is a challenge of high socioeconomic importance. The driving hypothesis of this study was that hyperspectral imaging (HSI) could provide novel biomarkers for sepsis diagnosis and treatment management due to its potential to monitor microcirculatory alterations. We conducted a comprehensive study involving HSI data of the palm and fingers from more than 480 patients on the day of their intensive care unit (ICU) admission. The findings demonstrate that HSI measurements can predict sepsis with an area under the receiver operating characteristic curve (AUROC) of 0.80 (95 % confidence interval (CI) [0.76; 0.84]) and mortality with an AUROC of 0.72 (95 % CI [0.65; 0.79]). The predictive performance improves substantially when additional clinical data is incorporated, leading to an AUROC of up to 0.94 (95 % CI [0.92; 0.96]) for sepsis and 0.84 (95 % CI [0.78; 0.89]) for mortality. We conclude that HSI presents novel imaging biomarkers for the rapid, non-invasive prediction of sepsis and mortality, suggesting its potential as an important modality for guiding diagnosis and treatment.
△ Less
Submitted 19 August, 2024;
originally announced August 2024.
-
Learning Better Representations From Less Data For Propositional Satisfiability
Authors:
Mohamed Ghanem,
Frederik Schmitt,
Julian Siber,
Bernd Finkbeiner
Abstract:
Training neural networks on NP-complete problems typically demands very large amounts of training data and often needs to be coupled with computationally expensive symbolic verifiers to ensure output correctness. In this paper, we present NeuRes, a neuro-symbolic approach to address both challenges for propositional satisfiability, being the quintessential NP-complete problem. By combining certifi…
▽ More
Training neural networks on NP-complete problems typically demands very large amounts of training data and often needs to be coupled with computationally expensive symbolic verifiers to ensure output correctness. In this paper, we present NeuRes, a neuro-symbolic approach to address both challenges for propositional satisfiability, being the quintessential NP-complete problem. By combining certificate-driven training and expert iteration, our model learns better representations than models trained for classification only, with a much higher data efficiency -- requiring orders of magnitude less training data. NeuRes employs propositional resolution as a proof system to generate proofs of unsatisfiability and to accelerate the process of finding satisfying truth assignments, exploring both possibilities in parallel. To realize this, we propose an attention-based architecture that autoregressively selects pairs of clauses from a dynamic formula embedding to derive new clauses. Furthermore, we employ expert iteration whereby model-generated proofs progressively replace longer teacher proofs as the new ground truth. This enables our model to reduce a dataset of proofs generated by an advanced solver by ~32% after training on it with no extra guidance. This shows that NeuRes is not limited by the optimality of the teacher algorithm owing to its self-improving workflow. We show that our model achieves far better performance than NeuroSAT in terms of both correctly classified and proven instances.
△ Less
Submitted 10 November, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
NeuroSynt: A Neuro-symbolic Portfolio Solver for Reactive Synthesis
Authors:
Matthias Cosler,
Christopher Hahn,
Ayham Omar,
Frederik Schmitt
Abstract:
We introduce NeuroSynt, a neuro-symbolic portfolio solver framework for reactive synthesis. At the core of the solver lies a seamless integration of neural and symbolic approaches to solving the reactive synthesis problem. To ensure soundness, the neural engine is coupled with model checkers verifying the predictions of the underlying neural models. The open-source implementation of NeuroSynt prov…
▽ More
We introduce NeuroSynt, a neuro-symbolic portfolio solver framework for reactive synthesis. At the core of the solver lies a seamless integration of neural and symbolic approaches to solving the reactive synthesis problem. To ensure soundness, the neural engine is coupled with model checkers verifying the predictions of the underlying neural models. The open-source implementation of NeuroSynt provides an integration framework for reactive synthesis in which new neural and state-of-the-art symbolic approaches can be seamlessly integrated. Extensive experiments demonstrate its efficacy in handling challenging specifications, enhancing the state-of-the-art reactive synthesis solvers, with NeuroSynt contributing novel solves in the current SYNTCOMP benchmarks.
△ Less
Submitted 29 January, 2024; v1 submitted 22 January, 2024;
originally announced January 2024.
-
nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models
Authors:
Matthias Cosler,
Christopher Hahn,
Daniel Mendoza,
Frederik Schmitt,
Caroline Trippel
Abstract:
A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logi…
▽ More
A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logics) from unstructured natural language. In particular, we introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language: we utilize LLMs to map subformulas of the formalization back to the corresponding natural language fragments of the input. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization. The framework is agnostic to specific application domains and can be extended to similar specification languages and new neural models. We perform a user study to obtain a challenging dataset, which we use to run experiments on the quality of translations. We provide an open-source implementation, including a web-based frontend.
△ Less
Submitted 8 March, 2023;
originally announced March 2023.
-
Iterative Circuit Repair Against Formal Specifications
Authors:
Matthias Cosler,
Frederik Schmitt,
Christopher Hahn,
Bernd Finkbeiner
Abstract:
We present a deep learning approach for repairing sequential circuits against formal specifications given in linear-time temporal logic (LTL). Given a defective circuit and its formal specification, we train Transformer models to output circuits that satisfy the corresponding specification. We propose a separated hierarchical Transformer for multimodal representation learning of the formal specifi…
▽ More
We present a deep learning approach for repairing sequential circuits against formal specifications given in linear-time temporal logic (LTL). Given a defective circuit and its formal specification, we train Transformer models to output circuits that satisfy the corresponding specification. We propose a separated hierarchical Transformer for multimodal representation learning of the formal specification and the circuit. We introduce a data generation algorithm that enables generalization to more complex specifications and out-of-distribution datasets. In addition, our proposed repair mechanism significantly improves the automated synthesis of circuits from LTL specifications with Transformers. It improves the state-of-the-art by $6.8$ percentage points on held-out instances and $11.8$ percentage points on an out-of-distribution dataset from the annual reactive synthesis competition.
△ Less
Submitted 2 March, 2023;
originally announced March 2023.
-
Formal Specifications from Natural Language
Authors:
Christopher Hahn,
Frederik Schmitt,
Julia J. Tillman,
Niklas Metzger,
Julian Siber,
Bernd Finkbeiner
Abstract:
We study the generalization abilities of language models when translating natural language into formal specifications with complex semantics. In particular, we fine-tune language models on three datasets consisting of English sentences and their corresponding formal representation: 1) regular expressions (regex), frequently used in programming and search; 2) First-order logic (FOL), commonly used…
▽ More
We study the generalization abilities of language models when translating natural language into formal specifications with complex semantics. In particular, we fine-tune language models on three datasets consisting of English sentences and their corresponding formal representation: 1) regular expressions (regex), frequently used in programming and search; 2) First-order logic (FOL), commonly used in software verification and theorem proving; and 3) linear-time temporal logic (LTL), which forms the basis for industrial hardware specification languages. Our experiments show that, in these diverse domains, the language models maintain their generalization capabilities from pre-trained knowledge of natural language to generalize, e.g., to new variable names or operator descriptions. Additionally, they achieve competitive performance, and even outperform the state-of-the-art for translating into regular expressions, with the benefits of being easy to access, efficient to fine-tune, and without a particular need for domain-specific reasoning.
△ Less
Submitted 19 October, 2022; v1 submitted 4 June, 2022;
originally announced June 2022.
-
Attention Flows for General Transformers
Authors:
Niklas Metzger,
Christopher Hahn,
Julian Siber,
Frederik Schmitt,
Bernd Finkbeiner
Abstract:
In this paper, we study the computation of how much an input token in a Transformer model influences its prediction. We formalize a method to construct a flow network out of the attention values of encoder-only Transformer models and extend it to general Transformer architectures including an auto-regressive decoder. We show that running a maxflow algorithm on the flow network construction yields…
▽ More
In this paper, we study the computation of how much an input token in a Transformer model influences its prediction. We formalize a method to construct a flow network out of the attention values of encoder-only Transformer models and extend it to general Transformer architectures including an auto-regressive decoder. We show that running a maxflow algorithm on the flow network construction yields Shapley values, which determine the impact of a player in cooperative game theory. By interpreting the input tokens in the flow network as players, we can compute their influence on the total attention flow leading to the decoder's decision. Additionally, we provide a library that computes and visualizes the attention flow of arbitrary Transformer models. We show the usefulness of our implementation on various models trained on natural language processing and reasoning tasks.
△ Less
Submitted 30 May, 2022;
originally announced May 2022.
-
Hierarchies of Planning and Reinforcement Learning for Robot Navigation
Authors:
Jan Wöhlke,
Felix Schmitt,
Herke van Hoof
Abstract:
Solving robotic navigation tasks via reinforcement learning (RL) is challenging due to their sparse reward and long decision horizon nature. However, in many navigation tasks, high-level (HL) task representations, like a rough floor plan, are available. Previous work has demonstrated efficient learning by hierarchal approaches consisting of path planning in the HL representation and using sub-goal…
▽ More
Solving robotic navigation tasks via reinforcement learning (RL) is challenging due to their sparse reward and long decision horizon nature. However, in many navigation tasks, high-level (HL) task representations, like a rough floor plan, are available. Previous work has demonstrated efficient learning by hierarchal approaches consisting of path planning in the HL representation and using sub-goals derived from the plan to guide the RL policy in the source task. However, these approaches usually neglect the complex dynamics and sub-optimal sub-goal-reaching capabilities of the robot during planning. This work overcomes these limitations by proposing a novel hierarchical framework that utilizes a trainable planning policy for the HL representation. Thereby robot capabilities and environment conditions can be learned utilizing collected rollout data. We specifically introduce a planning policy based on value iteration with a learned transition model (VI-RL). In simulated robotic navigation tasks, VI-RL results in consistent strong improvement over vanilla RL, is on par with vanilla hierarchal RL on single layouts but more broadly applicable to multiple layouts, and is on par with trainable HL path planning baselines except for a parking task with difficult non-holonomic dynamics where it shows marked improvements.
△ Less
Submitted 5 November, 2021; v1 submitted 23 September, 2021;
originally announced September 2021.
-
Neural Circuit Synthesis from Specification Patterns
Authors:
Frederik Schmitt,
Christopher Hahn,
Markus N. Rabe,
Bernd Finkbeiner
Abstract:
We train hierarchical Transformers on the task of synthesizing hardware circuits directly out of high-level logical specifications in linear-time temporal logic (LTL). The LTL synthesis problem is a well-known algorithmic challenge with a long history and an annual competition is organized to track the improvement of algorithms and tooling over time. New approaches using machine learning might ope…
▽ More
We train hierarchical Transformers on the task of synthesizing hardware circuits directly out of high-level logical specifications in linear-time temporal logic (LTL). The LTL synthesis problem is a well-known algorithmic challenge with a long history and an annual competition is organized to track the improvement of algorithms and tooling over time. New approaches using machine learning might open a lot of possibilities in this area, but suffer from the lack of sufficient amounts of training data. In this paper, we consider a method to generate large amounts of additional training data, i.e., pairs of specifications and circuits implementing them. We ensure that this synthetic data is sufficiently close to human-written specifications by mining common patterns from the specifications used in the synthesis competitions. We show that hierarchical Transformers trained on this synthetic data solve a significant portion of problems from the synthesis competitions, and even out-of-distribution examples from a recent case study.
△ Less
Submitted 25 July, 2021;
originally announced July 2021.
-
Reward (Mis)design for Autonomous Driving
Authors:
W. Bradley Knox,
Alessandro Allievi,
Holger Banzhaf,
Felix Schmitt,
Peter Stone
Abstract:
This article considers the problem of diagnosing certain common errors in reward design. Its insights are also applicable to the design of cost functions and performance metrics more generally. To diagnose common errors, we develop 8 simple sanity checks for identifying flaws in reward functions. These sanity checks are applied to reward functions from past work on reinforcement learning (RL) for…
▽ More
This article considers the problem of diagnosing certain common errors in reward design. Its insights are also applicable to the design of cost functions and performance metrics more generally. To diagnose common errors, we develop 8 simple sanity checks for identifying flaws in reward functions. These sanity checks are applied to reward functions from past work on reinforcement learning (RL) for autonomous driving (AD), revealing near-universal flaws in reward design for AD that might also exist pervasively across reward design for other tasks. Lastly, we explore promising directions that may aid the design of reward functions for AD in subsequent research, following a process of inquiry that can be adapted to other domains.
△ Less
Submitted 11 March, 2022; v1 submitted 28 April, 2021;
originally announced April 2021.
-
Teaching Temporal Logics to Neural Networks
Authors:
Christopher Hahn,
Frederik Schmitt,
Jens U. Kreber,
Markus N. Rabe,
Bernd Finkbeiner
Abstract:
We study two fundamental questions in neuro-symbolic computing: can deep learning tackle challenging problems in logics end-to-end, and can neural networks learn the semantics of logics. In this work we focus on linear-time temporal logic (LTL), as it is widely used in verification. We train a Transformer on the problem to directly predict a solution, i.e. a trace, to a given LTL formula. The trai…
▽ More
We study two fundamental questions in neuro-symbolic computing: can deep learning tackle challenging problems in logics end-to-end, and can neural networks learn the semantics of logics. In this work we focus on linear-time temporal logic (LTL), as it is widely used in verification. We train a Transformer on the problem to directly predict a solution, i.e. a trace, to a given LTL formula. The training data is generated with classical solvers, which, however, only provide one of many possible solutions to each formula. We demonstrate that it is sufficient to train on those particular solutions to formulas, and that Transformers can predict solutions even to formulas from benchmarks from the literature on which the classical solver timed out. Transformers also generalize to the semantics of the logics: while they often deviate from the solutions found by the classical solvers, they still predict correct solutions to most formulas.
△ Less
Submitted 18 February, 2021; v1 submitted 6 March, 2020;
originally announced March 2020.
-
On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective
Authors:
Axel Huebl,
Rene Widera,
Felix Schmitt,
Alexander Matthes,
Norbert Podhorszki,
Jong Youl Choi,
Scott Klasky,
Michael Bussmann
Abstract:
We implement and benchmark parallel I/O methods for the fully-manycore driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction. Consequently, we propose, implement and verify multi-threa…
▽ More
We implement and benchmark parallel I/O methods for the fully-manycore driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction. Consequently, we propose, implement and verify multi-threaded data-transformations for the I/O library ADIOS as a feasible way to trade underutilized host-side compute potential on heterogeneous systems for reduced I/O latency.
△ Less
Submitted 1 June, 2017;
originally announced June 2017.
-
Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics
Authors:
Michael Herman,
Tobias Gindele,
Jörg Wagner,
Felix Schmitt,
Wolfram Burgard
Abstract:
Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent's behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL…
▽ More
Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent's behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL approaches assume that if the transition model is unknown, additional samples from the system's dynamics are accessible, or the observed behavior provides enough samples of the system's dynamics to solve the inverse problem accurately. These assumptions are often not satisfied. To overcome this, we present a gradient-based IRL approach that simultaneously estimates the system's dynamics. By solving the combined optimization problem, our approach takes into account the bias of the demonstrations, which stems from the generating policy. The evaluation on a synthetic MDP and a transfer learning task shows improvements regarding the sample efficiency as well as the accuracy of the estimated reward functions and transition models.
△ Less
Submitted 13 April, 2016;
originally announced April 2016.
-
Software Design Principles of a DFS Tower A-CWP Prototype
Authors:
Felix Schmitt,
Ralf Heidger,
Stephen Straub,
Benjamin Weiß
Abstract:
SESAR is supposed to boost the development of new operational procedures together with the supporting systems in order to modernize the pan-European air traffic management (ATM). One consequence of this development is that more and more information is presented to - and has to be processed by - air traffic control officers (ATCOs). Thus, there is a strong need for a software design concept that fo…
▽ More
SESAR is supposed to boost the development of new operational procedures together with the supporting systems in order to modernize the pan-European air traffic management (ATM). One consequence of this development is that more and more information is presented to - and has to be processed by - air traffic control officers (ATCOs). Thus, there is a strong need for a software design concept that fosters the development of an advanced (tower) controller working position (A-CWP) that comprehensively integrates the still counting amount of information while reducing the data management workload of ATCOs. We report on our first hands-on experiences obtained during the development of an A-CWP prototype that was used in two SESAR validation sessions.
△ Less
Submitted 24 April, 2013;
originally announced April 2013.