-
Trajectory First: A Curriculum for Discovering Diverse Policies
Authors:
Cornelius V. Braun,
Sayantan Auddy,
Marc Toussaint
Abstract:
Being able to solve a task in diverse ways makes agents more robust to task variations and less prone to local optima. In this context, constrained diversity optimization has emerged as a powerful reinforcement learning (RL) framework to train a diverse set of agents in parallel. However, existing constrained-diversity RL methods often under-explore in complex tasks such as robotic manipulation, leading to a lack of policy diversity. To improve diversity optimization in RL, we therefore propose a curriculum that first explores at the trajectory level before learning step-based policies. In our empirical evaluation, we provide novel insights into the shortcomings of skill-based diversity optimization and demonstrate that our curriculum improves the diversity of the learned skills.
Submitted 2 June, 2025;
originally announced June 2025.
-
The Hidden Structure -- Improving Legal Document Understanding Through Explicit Text Formatting
Authors:
Christian Braun,
Alexander Lilienbeck,
Daniel Mentjukov
Abstract:
Legal contracts possess an inherent, semantically vital structure (e.g., sections, clauses) that is crucial for human comprehension but whose impact on LLM processing remains under-explored. This paper investigates the effects of explicit input text structure and prompt engineering on the performance of GPT-4o and GPT-4.1 on a legal question-answering task using an excerpt of the CUAD. We compare model exact-match accuracy across various input formats: well-structured plain-text (human-generated from CUAD), plain-text cleaned of line breaks, extracted plain-text from Azure OCR, plain-text extracted by GPT-4o Vision, and extracted (and interpreted) Markdown (MD) from GPT-4o Vision. To indicate the potential impact of prompt engineering, we assess the effect of shifting task instructions to the system prompt and explicitly informing the model about the structured nature of the input. Our findings reveal that GPT-4o demonstrates considerable robustness to variations in input structure, but falls short in overall performance. Conversely, GPT-4.1's performance is markedly sensitive; poorly structured inputs yield suboptimal results (though identical to those of GPT-4o), while well-structured formats (original CUAD text, GPT-4o Vision text and GPT-4o MD) improve exact-match accuracy by ~20 percentage points. Optimizing the system prompt to include task details and an advisory about structured input further elevates GPT-4.1's accuracy by an additional ~10-13 percentage points, with Markdown ultimately achieving the highest performance under these conditions (79% overall exact-match accuracy). This research empirically demonstrates that while newer models exhibit greater resilience, careful input structuring and strategic prompt design remain critical for optimizing the performance of LLMs, and can significantly affect outcomes in high-stakes legal applications.
Submitted 19 May, 2025;
originally announced May 2025.
-
Meta-Optimization and Program Search using Language Models for Task and Motion Planning
Authors:
Denis Shcherba,
Eckart Cobo-Briesewitz,
Cornelius V. Braun,
Marc Toussaint
Abstract:
Intelligent interaction with the real world requires robotic agents to jointly reason over high-level plans and low-level controls. Task and motion planning (TAMP) addresses this by combining symbolic planning and continuous trajectory generation. Recently, foundation model approaches to TAMP have presented impressive results, including fast planning times and the execution of natural language instructions. Yet, the optimal interface between high-level planning and low-level motion generation remains an open question: prior approaches are limited by either too much abstraction (e.g., chaining simplified skill primitives) or a lack thereof (e.g., direct joint angle prediction). Our method introduces a novel technique employing a form of meta-optimization to address these issues by: (i) using program search over trajectory optimization problems as an interface between a foundation model and robot control, and (ii) leveraging a zero-order method to optimize numerical parameters in the foundation model output. Results on challenging object manipulation and drawing tasks confirm that our proposed method improves over prior TAMP approaches.
Submitted 6 May, 2025;
originally announced May 2025.
-
On the Suitability of pre-trained foundational LLMs for Analysis in German Legal Education
Authors:
Lorenz Wendlinger,
Christian Braun,
Abdullah Al Zubaer,
Simon Alexander Nonn,
Sarah Großkopf,
Christofer Fellicious,
Michael Granitzer
Abstract:
We show that current open-source foundational LLMs possess instruction capability and German legal background knowledge that is sufficient for some legal analysis in an educational context. However, model capability breaks down in very specific tasks, such as the classification of "Gutachtenstil" appraisal style components, or with complex contexts, such as complete legal opinions. Even with extended context and effective prompting strategies, they cannot match the Bag-of-Words baseline. To combat this, we introduce a Retrieval Augmented Generation based prompt example selection method that substantially improves predictions in high data availability scenarios. We further evaluate the performance of pre-trained LLMs on two standard tasks for argument mining and automated essay scoring and find it to be more adequate. Throughout, pre-trained LLMs improve upon the baseline in scenarios with little or no labeled data, with Chain-of-Thought prompting further helping in the zero-shot case.
Submitted 20 December, 2024;
originally announced December 2024.
-
Stein Variational Evolution Strategies
Authors:
Cornelius V. Braun,
Robert T. Lange,
Marc Toussaint
Abstract:
Stein Variational Gradient Descent (SVGD) is a highly efficient method to sample from an unnormalized probability distribution. However, the SVGD update relies on gradients of the log-density, which may not always be available. Existing gradient-free versions of SVGD make use of simple Monte Carlo approximations or gradients from surrogate distributions, both with limitations. To improve gradient-free Stein variational inference, we combine SVGD steps with evolution strategy (ES) updates. Our results demonstrate that the resulting algorithm generates high-quality samples from unnormalized target densities without requiring gradient information. Compared to prior gradient-free SVGD methods, we find that the integration of the ES update in SVGD significantly improves the performance on multiple challenging benchmark problems.
Submitted 5 June, 2025; v1 submitted 14 October, 2024;
originally announced October 2024.
-
NLP Sampling: Combining MCMC and NLP Methods for Diverse Constrained Sampling
Authors:
Marc Toussaint,
Cornelius V. Braun,
Joaquim Ortiz-Haro
Abstract:
Generating diverse samples under hard constraints is a core challenge in many areas. With this work we aim to provide an integrative view and framework to combine methods from the fields of MCMC, constrained optimization, as well as robotics, and gain insights into their strengths from empirical evaluations. We propose NLP Sampling as a general problem formulation, propose a family of restarting two-phase methods as a framework to integrate methods from across the fields, and evaluate them on analytical and robotic manipulation planning problems. Complementary to this, we provide several conceptual discussions, e.g. on the role of Lagrange parameters, global sampling, and the idea of a Diffused NLP and a corresponding model-based denoising sampler.
Submitted 3 July, 2024;
originally announced July 2024.
-
Implications of the AI Act for Non-Discrimination Law and Algorithmic Fairness
Authors:
Luca Deck,
Jan-Laurin Müller,
Conradin Braun,
Domenique Zipperling,
Niklas Kühl
Abstract:
The topic of fairness in AI, as debated in the FATE (Fairness, Accountability, Transparency, and Ethics in AI) communities, has sparked meaningful discussions in the past years. However, from a legal perspective, particularly from the perspective of European Union law, many open questions remain. Whereas algorithmic fairness aims to mitigate structural inequalities at design-level, European non-discrimination law is tailored to individual cases of discrimination after an AI model has been deployed. The AI Act might present a tremendous step towards bridging these two approaches by shifting non-discrimination responsibilities into the design stage of AI models. Based on an integrative reading of the AI Act, we comment on legal as well as technical enforcement problems and propose practical implications on bias detection and bias correction in order to specify and comply with specific technical requirements.
Submitted 26 June, 2024; v1 submitted 29 March, 2024;
originally announced March 2024.
-
Overview of Publicly Available Degradation Data Sets for Tasks within Prognostics and Health Management
Authors:
Fabian Mauthe,
Christopher Braun,
Julian Raible,
Peter Zeiler,
Marco F. Huber
Abstract:
Central to the efficacy of prognostics and health management methods is the acquisition and analysis of degradation data, which encapsulates the evolving health condition of engineering systems over time. Degradation data serves as a rich source of information, offering invaluable insights into the underlying degradation processes, failure modes, and performance trends of engineering systems. This paper provides an overview of publicly available degradation data sets.
Submitted 28 January, 2025; v1 submitted 20 March, 2024;
originally announced March 2024.
-
RoboGrind: Intuitive and Interactive Surface Treatment with Industrial Robots
Authors:
Benjamin Alt,
Florian Stöckl,
Silvan Müller,
Christopher Braun,
Julian Raible,
Saad Alhasan,
Oliver Rettig,
Lukas Ringle,
Darko Katic,
Rainer Jäkel,
Michael Beetz,
Marcus Strand,
Marco F. Huber
Abstract:
Surface treatment tasks such as grinding, sanding or polishing are a vital step of the value chain in many industries, but are notoriously challenging to automate. We present RoboGrind, an integrated system for the intuitive, interactive automation of surface treatment tasks with industrial robots. It combines a sophisticated 3D perception pipeline for surface scanning and automatic defect identification, an interactive voice-controlled wizard system for the AI-assisted bootstrapping and parameterization of robot programs, and an automatic planning and execution pipeline for force-controlled robotic surface treatment. RoboGrind is evaluated both under laboratory and real-world conditions in the context of refabricating fiberglass wind turbine blades.
Submitted 27 February, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Forecasting Intraday Power Output by a Set of PV Systems using Recurrent Neural Networks and Physical Covariates
Authors:
Pierrick Bruneau,
David Fiorelli,
Christian Braun,
Daniel Koster
Abstract:
Accurate intraday forecasts of the power output by PhotoVoltaic (PV) systems are critical to improve the operation of energy distribution grids. We describe a neural autoregressive model that aims to perform such intraday forecasts. We build upon a physical, deterministic PV performance model, the output of which is used as covariates in the context of the neural model. In addition, our application data relates to a geographically distributed set of PV systems. We address all PV sites with a single neural model, which embeds the information about the PV site in specific covariates. We use a scale-free approach which relies on the explicit modeling of seasonal effects. Our proposal repurposes a model initially used in the retail sector and discloses a novel truncated Gaussian output distribution. An ablation study and a comparison to alternative architectures from the literature show that the components in the best performing proposed model variant work synergistically to reach a skill score of 15.72% with respect to the physical model, used as a baseline.
Submitted 28 August, 2024; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Evaluation of automated airway morphological quantification for assessing fibrosing lung disease
Authors:
Ashkan Pakzad,
Wing Keung Cheung,
Kin Quan,
Nesrin Mogulkoc,
Coline H. M. Van Moorsel,
Brian J. Bartholmai,
Hendrik W. Van Es,
Alper Ezircan,
Frouke Van Beek,
Marcel Veltkamp,
Ronald Karwoski,
Tobias Peikert,
Ryan D. Clay,
Finbar Foley,
Cassandra Braun,
Recep Savas,
Carole Sudre,
Tom Doel,
Daniel C. Alexander,
Peter Wijeratne,
David Hawkes,
Yipeng Hu,
John R Hurst,
Joseph Jacob
Abstract:
Abnormal airway dilatation, termed traction bronchiectasis, is a typical feature of idiopathic pulmonary fibrosis (IPF). Volumetric computed tomography (CT) imaging captures the loss of normal airway tapering in IPF. We postulated that automated quantification of airway abnormalities could provide estimates of IPF disease extent and severity. We propose AirQuant, an automated computational pipeline that systematically parcellates the airway tree into its lobes and generational branches from a deep learning based airway segmentation, deriving airway structural measures from chest CT. Importantly, AirQuant prevents the occurrence of spurious airway branches by thick wave propagation and removes loops in the airway-tree by graph search, overcoming limitations of existing airway skeletonisation algorithms. Tapering between airway segments (intertapering) and airway tortuosity computed by AirQuant were compared between 14 healthy participants and 14 IPF patients. Airway intertapering was significantly reduced in IPF patients, and airway tortuosity was significantly increased when compared to healthy controls. Differences were most marked in the lower lobes, conforming to the typical distribution of IPF-related damage. AirQuant is an open-source pipeline that avoids limitations of existing airway quantification algorithms and has clinical interpretability. Automated airway measurements may have potential as novel imaging biomarkers of IPF severity and disease extent.
Submitted 19 November, 2021;
originally announced November 2021.
-
RHH-LGP: Receding Horizon And Heuristics-Based Logic-Geometric Programming For Task And Motion Planning
Authors:
Cornelius V. Braun,
Joaquim Ortiz-Haro,
Marc Toussaint,
Ozgur S. Oguz
Abstract:
Sequential decision-making and motion planning for robotic manipulation induce combinatorial complexity. For long-horizon tasks, especially when the environment comprises many objects that can be interacted with, planning efficiency becomes even more important. To plan such long-horizon tasks, we present the RHH-LGP algorithm for combined task and motion planning (TAMP). First, we propose a TAMP approach (based on Logic-Geometric Programming) that effectively uses geometry-based heuristics for solving long-horizon manipulation tasks. The efficiency of this planner is then further improved by a receding horizon formulation, resulting in RHH-LGP. We demonstrate the robustness and effectiveness of our approach on a diverse range of long-horizon tasks that require reasoning about interactions with a large number of objects. Using our framework, we can solve tasks that require multiple robots, including a mobile robot and snake-like walking robots, to form novel heterogeneous kinematic structures autonomously. By combining geometry-based heuristics with iterative planning, our approach brings an order-of-magnitude reduction of planning time in all investigated problems.
Submitted 6 March, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
Should We Worry About Interference in Emerging Dense NGSO Satellite Constellations?
Authors:
Christophe Braun,
Andra M. Voicu,
Ljiljana Simić,
Petri Mähönen
Abstract:
Many satellite operators are planning to deploy NGSO systems for broadband communication services in the Ku-, Ka-, and V-band, some of which have already launched. Consequently, new challenges are expected for inter-system satellite coexistence due to the increased interference level and the complexity of the interactions resulting from the heterogeneity of the constellations. This is especially relevant for the Ku-band, where the NGSO systems are most diverse and existing GSO systems, which often support critical services, must be protected from interference. It is thus imperative to evaluate the impact of mutual inter-system interference, the efficiency of the basic interference mitigation techniques, and whether regulatory intervention is needed for the new systems. We conduct an extensive study of inter-satellite coexistence in the Ku-band, where we consider all recently proposed NGSO and some selected GSO systems. Our throughput degradation results suggest that existing spectrum regulation may be insufficient to ensure GSO protection from NGSO interference, especially due to the high transmit power of the LEO Kepler satellites. This also results in strong interference towards other NGSO systems, where traditional interference mitigation techniques like look-aside may perform poorly. Specifically, look-aside can be beneficial for large constellations, but detrimental for small constellations. Furthermore, we confirm that band-splitting among satellite operators significantly degrades throughput, also for the Ku-band. Our results overall show that the complexity of the inter-satellite interactions for new NGSO systems is too high to be managed via simple interference mitigation techniques. This means that more sophisticated engineering solutions, and potentially even more strict regulatory requirements, will be needed to ensure coexistence in emerging, dense NGSO deployments.
Submitted 11 September, 2019;
originally announced September 2019.
-
An Information Extraction Core System for Real World German Text Processing
Authors:
G. Neumann,
R. Backofen,
J. Baur,
M. Becker,
C. Braun
Abstract:
This paper describes SMES, an information extraction core system for real world German text processing. The basic design criterion of the system is to provide a set of basic, powerful, robust, and efficient natural language components and generic linguistic knowledge sources which can easily be customized for processing different tasks in a flexible manner.
Submitted 18 June, 1997;
originally announced June 1997.