-
Text2Model: Generating dynamic chemical reactor models using large language models (LLMs)
Authors:
Sophia Rupprecht,
Yassine Hounat,
Monisha Kumar,
Giacomo Lastrucci,
Artur M. Schweidtmann
Abstract:
As large language models have shown remarkable capabilities in conversing via natural language, the question arises as to how LLMs could potentially assist chemical engineers in research and industry with domain-specific tasks. We generate dynamic chemical reactor models in Modelica code format from textual descriptions as user input. We fine-tune Llama 3.1 8B Instruct on synthetically generated M…
▽ More
As large language models have shown remarkable capabilities in conversing via natural language, the question arises as to how LLMs could potentially assist chemical engineers in research and industry with domain-specific tasks. We generate dynamic chemical reactor models in Modelica code format from textual descriptions as user input. We fine-tune Llama 3.1 8B Instruct on synthetically generated Modelica code for different reactor scenarios. We compare the performance of our fine-tuned model to the baseline Llama 3.1 8B Instruct model and GPT4o. We manually assess the models' predictions regarding the syntactic and semantic accuracy of the generated dynamic models. We find that considerable improvements are achieved by the fine-tuned model with respect to both the semantic and the syntactic accuracy of the Modelica models. However, the fine-tuned model lacks a satisfactory ability to generalize to unseen scenarios compared to GPT4o.
△ Less
Submitted 21 March, 2025;
originally announced March 2025.
-
Talking like Piping and Instrumentation Diagrams (P&IDs)
Authors:
Achmad Anggawirya Alimin,
Dominik P. Goldstein,
Lukas Schulze Balhorn,
Artur M. Schweidtmann
Abstract:
We propose a methodology that allows communication with Piping and Instrumentation Diagrams (P&IDs) using natural language. In particular, we represent P&IDs through the DEXPI data model as labeled property graphs and integrate them with Large Language Models (LLMs). The approach consists of three main parts: 1) P&IDs are cast into a graph representation from the DEXPI format using our pyDEXPI Pyt…
▽ More
We propose a methodology that allows communication with Piping and Instrumentation Diagrams (P&IDs) using natural language. In particular, we represent P&IDs through the DEXPI data model as labeled property graphs and integrate them with Large Language Models (LLMs). The approach consists of three main parts: 1) P&IDs are cast into a graph representation from the DEXPI format using our pyDEXPI Python package. 2) A tool for generating P&ID knowledge graphs from pyDEXPI. 3) Integration of the P&ID knowledge graph to LLMs using graph-based retrieval augmented generation (graph-RAG). This approach allows users to communicate with P&IDs using natural language. It extends LLM's ability to retrieve contextual data from P&IDs and mitigate hallucinations. Leveraging the LLM's large corpus, the model is also able to interpret process information in PIDs, which could help engineers in their daily tasks. In the future, this work will also open up opportunities in the context of other generative Artificial Intelligence (genAI) solutions on P&IDs, and AI-assisted HAZOP studies.
△ Less
Submitted 26 February, 2025;
originally announced February 2025.
-
Rule-based autocorrection of Piping and Instrumentation Diagrams (P&IDs) on graphs
Authors:
Lukas Schulze Balhorn,
Niels Seijsener,
Kevin Dao,
Minji Kim,
Dominik P. Goldstein,
Ge H. M. Driessen,
Artur M. Schweidtmann
Abstract:
A piping and instrumentation diagram (P&ID) is a central reference document in chemical process engineering. Currently, chemical engineers manually review P&IDs through visual inspection to find and rectify errors. However, engineering projects can involve hundreds to thousands of P&ID pages, creating a significant revision workload. This study proposes a rule-based method to support engineers wit…
▽ More
A piping and instrumentation diagram (P&ID) is a central reference document in chemical process engineering. Currently, chemical engineers manually review P&IDs through visual inspection to find and rectify errors. However, engineering projects can involve hundreds to thousands of P&ID pages, creating a significant revision workload. This study proposes a rule-based method to support engineers with error detection and correction in P&IDs. The method is based on a graph representation of P&IDs, enabling automated error detection and correction, i.e., autocorrection, through rule graphs. We use our pyDEXPI Python package to generate P&ID graphs from DEXPI-standard P&IDs. In this study, we developed 33 rules based on chemical engineering knowledge and heuristics, with five selected rules demonstrated as examples. A case study on an illustrative P&ID validates the reliability and effectiveness of the rule-based autocorrection method in revising P&IDs.
△ Less
Submitted 18 February, 2025;
originally announced February 2025.
-
Transferring Graph Neural Networks for Soft Sensor Modeling using Process Topologies
Authors:
Maximilian F. Theisen,
Gabrie M. H. Meesters,
Artur M. Schweidtmann
Abstract:
Data-driven soft sensors help in process operations by providing real-time estimates of otherwise hard- to-measure process quantities, e.g., viscosities or product concentrations. Currently, soft sensors need to be developed individually per plant. Using transfer learning, machine learning-based soft sensors could be reused and fine-tuned across plants and applications. However, transferring data-…
▽ More
Data-driven soft sensors help in process operations by providing real-time estimates of otherwise hard- to-measure process quantities, e.g., viscosities or product concentrations. Currently, soft sensors need to be developed individually per plant. Using transfer learning, machine learning-based soft sensors could be reused and fine-tuned across plants and applications. However, transferring data-driven soft sensor models is in practice often not possible, because the fixed input structure of standard soft sensor models prohibits transfer if, e.g., the sensor information is not identical in all plants. We propose a topology-aware graph neural network approach for transfer learning of soft sensor models across multiple plants. In our method, plants are modeled as graphs: Unit operations are nodes, streams are edges, and sensors are embedded as attributes. Our approach brings two advantages for transfer learning: First, we not only include sensor data but also crucial information on the plant topology. Second, the graph neural network algorithm is flexible with respect to its sensor inputs. This allows us to model data from different plants with different sensor networks. We test the transfer learning capabilities of our modeling approach on ammonia synthesis loops with different process topologies. We build a soft sensor predicting the ammonia concentration in the product. After training on data from one process, we successfully transfer our soft sensor model to a previously unseen process with a different topology. Our approach promises to extend the data-driven soft sensors to cases to leverage data from multiple plants.
△ Less
Submitted 5 February, 2025;
originally announced February 2025.
-
ENFORCE: Nonlinear Constrained Learning with Adaptive-depth Neural Projection
Authors:
Giacomo Lastrucci,
Artur M. Schweidtmann
Abstract:
Ensuring neural networks adhere to domain-specific constraints is crucial for addressing safety and ethical concerns while also enhancing inference accuracy. Despite the nonlinear nature of most real-world tasks, existing methods are predominantly limited to affine or convex constraints. We introduce ENFORCE, a neural network architecture that uses an adaptive projection module (AdaNP) to enforce…
▽ More
Ensuring neural networks adhere to domain-specific constraints is crucial for addressing safety and ethical concerns while also enhancing inference accuracy. Despite the nonlinear nature of most real-world tasks, existing methods are predominantly limited to affine or convex constraints. We introduce ENFORCE, a neural network architecture that uses an adaptive projection module (AdaNP) to enforce nonlinear equality constraints in the predictions. We prove that our projection mapping is 1-Lipschitz, making it well-suited for stable training. We evaluate ENFORCE on an illustrative regression task and for learning solutions to high-dimensional optimization problems in an unsupervised setting. The predictions of our new architecture satisfy $N_C$ equality constraints that are nonlinear in both the inputs and outputs of the neural network, while maintaining scalability with a tractable computational complexity of $\mathcal{O}(N_C^3)$ at training and inference time.
△ Less
Submitted 15 May, 2025; v1 submitted 10 February, 2025;
originally announced February 2025.
-
Picard-KKT-hPINN: Enforcing Nonlinear Enthalpy Balances for Physically Consistent Neural Networks
Authors:
Giacomo Lastrucci,
Tanuj Karia,
Zoë Gromotka,
Artur M. Schweidtmann
Abstract:
Neural networks are widely used as surrogate models but they do not guarantee physically consistent predictions thereby preventing adoption in various applications. We propose a method that can enforce NNs to satisfy physical laws that are nonlinear in nature such as enthalpy balances. Our approach, inspired by Picard successive approximations method, aims to enforce multiplicatively separable con…
▽ More
Neural networks are widely used as surrogate models but they do not guarantee physically consistent predictions thereby preventing adoption in various applications. We propose a method that can enforce NNs to satisfy physical laws that are nonlinear in nature such as enthalpy balances. Our approach, inspired by Picard successive approximations method, aims to enforce multiplicatively separable constraints by sequentially freezing and projecting a set of the participating variables. We demonstrate our PicardKKThPINN for surrogate modeling of a catalytic packed bed reactor for methanol synthesis. Our results show that the method efficiently enforces nonlinear enthalpy and linear atomic balances at machine-level precision. Additionally, we show that enforcing conservation laws can improve accuracy in data-scarce conditions compared to vanilla multilayer perceptron.
△ Less
Submitted 29 January, 2025;
originally announced January 2025.
-
Graph-to-SFILES: Control structure prediction from process topologies using generative artificial intelligence
Authors:
Lukas Schulze Balhorn,
Kevin Degens,
Artur M. Schweidtmann
Abstract:
Control structure design is an important but tedious step in P&ID development. Generative artificial intelligence (AI) promises to reduce P&ID development time by supporting engineers. Previous research on generative AI in chemical process design mainly represented processes by sequences. However, graphs offer a promising alternative because of their permutation invariance. We propose the Graph-to…
▽ More
Control structure design is an important but tedious step in P&ID development. Generative artificial intelligence (AI) promises to reduce P&ID development time by supporting engineers. Previous research on generative AI in chemical process design mainly represented processes by sequences. However, graphs offer a promising alternative because of their permutation invariance. We propose the Graph-to-SFILES model, a generative AI method to predict control structures from flowsheet topologies. The Graph-to-SFILES model takes the flowsheet topology as a graph input and returns a control-extended flowsheet as a sequence in the SFILES 2.0 notation. We compare four different graph encoder architectures, one of them being a graph neural network (GNN) proposed in this work. The Graph-to-SFILES model achieves a top-5 accuracy of 73.2% when trained on 10,000 flowsheet topologies. In addition, the proposed GNN performs best among the encoder architectures. Compared to a purely sequence-based approach, the Graph-to-SFILES model improves the top-5 accuracy for a relatively small training dataset of 1,000 flowsheets from 0.9% to 28.4%. However, the sequence-based approach performs better on a large-scale dataset of 100,000 flowsheets. These results highlight the potential of graph-based AI models to accelerate P&ID development in small-data regimes but their effectiveness on industry relevant case studies still needs to be investigated.
△ Less
Submitted 30 November, 2024;
originally announced December 2024.
-
Bayesian Uncertainty Estimation by Hamiltonian Monte Carlo: Applications to Cardiac MRI Segmentation
Authors:
Yidong Zhao,
Joao Tourais,
Iain Pierce,
Christian Nitsche,
Thomas A. Treibel,
Sebastian Weingärtner,
Artur M. Schweidtmann,
Qian Tao
Abstract:
Deep learning (DL)-based methods have achieved state-of-the-art performance for many medical image segmentation tasks. Nevertheless, recent studies show that deep neural networks (DNNs) can be miscalibrated and overconfident, leading to "silent failures" that are risky for clinical applications. Bayesian DL provides an intuitive approach to DL failure detection, based on posterior probability esti…
▽ More
Deep learning (DL)-based methods have achieved state-of-the-art performance for many medical image segmentation tasks. Nevertheless, recent studies show that deep neural networks (DNNs) can be miscalibrated and overconfident, leading to "silent failures" that are risky for clinical applications. Bayesian DL provides an intuitive approach to DL failure detection, based on posterior probability estimation. However, the posterior is intractable for large medical image segmentation DNNs. To tackle this challenge, we propose a Bayesian learning framework using Hamiltonian Monte Carlo (HMC), tempered by cold posterior (CP) to accommodate medical data augmentation, named HMC-CP. For HMC computation, we further propose a cyclical annealing strategy, capturing both local and global geometries of the posterior distribution, enabling highly efficient Bayesian DNN training with the same computational budget as training a single DNN. The resulting Bayesian DNN outputs an ensemble segmentation along with the segmentation uncertainty. We evaluate the proposed HMC-CP extensively on cardiac magnetic resonance image (MRI) segmentation, using in-domain steady-state free precession (SSFP) cine images as well as out-of-domain datasets of quantitative T1 and T2 mapping. Our results show that the proposed method improves both segmentation accuracy and uncertainty estimation for in- and out-of-domain data, compared with well-established baseline methods such as Monte Carlo Dropout and Deep Ensembles. Additionally, we establish a conceptual link between HMC and the commonly known stochastic gradient descent (SGD) and provide general insight into the uncertainty of DL. This uncertainty is implicitly encoded in the training dynamics but often overlooked. With reliable uncertainty estimation, our method provides a promising direction toward trustworthy DL in clinical applications.
△ Less
Submitted 27 June, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
MachineLearnAthon: An Action-Oriented Machine Learning Didactic Concept
Authors:
Michal Tkáč,
Jakub Sieber,
Lara Kuhlmann,
Matthias Brueggenolte,
Alexandru Rinciog,
Michael Henke,
Artur M. Schweidtmann,
Qinghe Gao,
Maximilian F. Theisen,
Radwa El Shawi
Abstract:
Machine Learning (ML) techniques are encountered nowadays across disciplines, from social sciences, through natural sciences to engineering. The broad application of ML and the accelerated pace of its evolution lead to an increasing need for dedicated teaching concepts aimed at making the application of this technology more reliable and responsible. However, teaching ML is a daunting task. Aside f…
▽ More
Machine Learning (ML) techniques are encountered nowadays across disciplines, from social sciences, through natural sciences to engineering. The broad application of ML and the accelerated pace of its evolution lead to an increasing need for dedicated teaching concepts aimed at making the application of this technology more reliable and responsible. However, teaching ML is a daunting task. Aside from the methodological complexity of ML algorithms, both with respect to theory and implementation, the interdisciplinary and empirical nature of the field need to be taken into consideration. This paper introduces the MachineLearnAthon format, an innovative didactic concept designed to be inclusive for students of different disciplines with heterogeneous levels of mathematics, programming and domain expertise. At the heart of the concept lie ML challenges, which make use of industrial data sets to solve real-world problems. These cover the entire ML pipeline, promoting data literacy and practical skills, from data preparation, through deployment, to evaluation.
△ Less
Submitted 29 January, 2024;
originally announced January 2024.
-
Toward autocorrection of chemical process flowsheets using large language models
Authors:
Lukas Schulze Balhorn,
Marc Caballero,
Artur M. Schweidtmann
Abstract:
The process engineering domain widely uses Process Flow Diagrams (PFDs) and Process and Instrumentation Diagrams (P&IDs) to represent process flows and equipment configurations. However, the P&IDs and PFDs, hereafter called flowsheets, can contain errors causing safety hazards, inefficient operation, and unnecessary expenses. Correcting and verifying flowsheets is a tedious, manual process. We pro…
▽ More
The process engineering domain widely uses Process Flow Diagrams (PFDs) and Process and Instrumentation Diagrams (P&IDs) to represent process flows and equipment configurations. However, the P&IDs and PFDs, hereafter called flowsheets, can contain errors causing safety hazards, inefficient operation, and unnecessary expenses. Correcting and verifying flowsheets is a tedious, manual process. We propose a novel generative AI methodology for automatically identifying errors in flowsheets and suggesting corrections to the user, i.e., autocorrecting flowsheets. Inspired by the breakthrough of Large Language Models (LLMs) for grammatical autocorrection of human language, we investigate LLMs for the autocorrection of flowsheets. The input to the model is a potentially erroneous flowsheet and the output of the model are suggestions for a corrected flowsheet. We train our autocorrection model on a synthetic dataset in a supervised manner. The model achieves a top-1 accuracy of 80% and a top-5 accuracy of 84% on an independent test dataset of synthetically generated flowsheets. The results suggest that the model can learn to autocorrect the synthetic flowsheets. We envision that flowsheet autocorrection will become a useful tool for chemical engineers.
△ Less
Submitted 5 December, 2023;
originally announced December 2023.
-
Mixed-Integer Optimisation of Graph Neural Networks for Computer-Aided Molecular Design
Authors:
Tom McDonald,
Calvin Tsay,
Artur M. Schweidtmann,
Neil Yorke-Smith
Abstract:
ReLU neural networks have been modelled as constraints in mixed integer linear programming (MILP), enabling surrogate-based optimisation in various domains and efficient solution of machine learning certification problems. However, previous works are mostly limited to MLPs. Graph neural networks (GNNs) can learn from non-euclidean data structures such as molecular structures efficiently and are th…
▽ More
ReLU neural networks have been modelled as constraints in mixed integer linear programming (MILP), enabling surrogate-based optimisation in various domains and efficient solution of machine learning certification problems. However, previous works are mostly limited to MLPs. Graph neural networks (GNNs) can learn from non-euclidean data structures such as molecular structures efficiently and are thus highly relevant to computer-aided molecular design (CAMD). We propose a bilinear formulation for ReLU Graph Convolutional Neural Networks and a MILP formulation for ReLU GraphSAGE models. These formulations enable solving optimisation problems with trained GNNs embedded to global optimality. We apply our optimization approach to an illustrative CAMD case study where the formulations of the trained GNNs are used to design molecules with optimal boiling points.
△ Less
Submitted 2 December, 2023;
originally announced December 2023.
-
What does ChatGPT know about natural science and engineering?
Authors:
Lukas Schulze Balhorn,
Jana M. Weber,
Stefan Buijsman,
Julian R. Hildebrandt,
Martina Ziefle,
Artur M. Schweidtmann
Abstract:
ChatGPT is a powerful language model from OpenAI that is arguably able to comprehend and generate text. ChatGPT is expected to have a large impact on society, research, and education. An essential step to understand ChatGPT's expected impact is to study its domain-specific answering capabilities. Here, we perform a systematic empirical assessment of its abilities to answer questions across the nat…
▽ More
ChatGPT is a powerful language model from OpenAI that is arguably able to comprehend and generate text. ChatGPT is expected to have a large impact on society, research, and education. An essential step to understand ChatGPT's expected impact is to study its domain-specific answering capabilities. Here, we perform a systematic empirical assessment of its abilities to answer questions across the natural science and engineering domains. We collected 594 questions from 198 faculty members across 5 faculties at Delft University of Technology. After collecting the answers from ChatGPT, the participants assessed the quality of the answers using a systematic scheme. Our results show that the answers from ChatGPT are on average perceived as ``mostly correct''. Two major trends are that the rating of the ChatGPT answers significantly decreases (i) as the complexity level of the question increases and (ii) as we evaluate skills beyond scientific knowledge, e.g., critical attitude.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Data-driven Product-Process Optimization of N-isopropylacrylamide Microgel Flow-Synthesis
Authors:
Luise F. Kaven,
Artur M. Schweidtmann,
Jan Keil,
Jana Israel,
Nadja Wolter,
Alexander Mitsos
Abstract:
Microgels are cross-linked, colloidal polymer networks with great potential for stimuli-response release in drug-delivery applications, as their size in the nanometer range allows them to pass human cell boundaries. For applications with specified requirements regarding size, producing tailored microgels in a continuous flow reactor is advantageous because the microgel properties can be controlled…
▽ More
Microgels are cross-linked, colloidal polymer networks with great potential for stimuli-response release in drug-delivery applications, as their size in the nanometer range allows them to pass human cell boundaries. For applications with specified requirements regarding size, producing tailored microgels in a continuous flow reactor is advantageous because the microgel properties can be controlled tightly. However, no fully-specified mechanistic models are available for continuous microgel synthesis, as the physical properties of the included components are only studied partly. To address this gap and accelerate tailor-made microgel development, we propose a data-driven optimization in a hardware-in-the-loop approach to efficiently synthesize microgels with defined sizes. We optimize the synthesis regarding conflicting objectives (maximum production efficiency, minimum energy consumption, and the desired microgel radius) by applying Bayesian optimization via the solver ``Thompson sampling efficient multi-objective optimization'' (TS-EMO). We validate the optimization using the deterministic global solver ``McCormick-based Algorithm for mixed-integer Nonlinear Global Optimization'' (MAiNGO) and verify three computed Pareto optimal solutions via experiments. The proposed framework can be applied to other desired microgel properties and reactor setups and has the potential of efficient development by minimizing number of experiments and modelling effort needed.
△ Less
Submitted 31 August, 2023;
originally announced August 2023.
-
Deep reinforcement learning for process design: Review and perspective
Authors:
Qinghe Gao,
Artur M. Schweidtmann
Abstract:
The transformation towards renewable energy and feedstock supply in the chemical industry requires new conceptual process design approaches. Recently, breakthroughs in artificial intelligence offer opportunities to accelerate this transition. Specifically, deep reinforcement learning, a subclass of machine learning, has shown the potential to solve complex decision-making problems and aid sustaina…
▽ More
The transformation towards renewable energy and feedstock supply in the chemical industry requires new conceptual process design approaches. Recently, breakthroughs in artificial intelligence offer opportunities to accelerate this transition. Specifically, deep reinforcement learning, a subclass of machine learning, has shown the potential to solve complex decision-making problems and aid sustainable process design. We survey state-of-the-art research in reinforcement learning for process design through three major elements: (i) information representation, (ii) agent architecture, and (iii) environment and reward. Moreover, we discuss perspectives on underlying challenges and promising future works to unfold the full potential of reinforcement learning for process design in chemical engineering.
△ Less
Submitted 15 August, 2023;
originally announced August 2023.
-
Data augmentation for machine learning of chemical process flowsheets
Authors:
Lukas Schulze Balhorn,
Edwin Hirtreiter,
Lynn Luderer,
Artur M. Schweidtmann
Abstract:
Artificial intelligence has great potential for accelerating the design and engineering of chemical processes. Recently, we have shown that transformer-based language models can learn to auto-complete chemical process flowsheets using the SFILES 2.0 string notation. Also, we showed that language translation models can be used to translate Process Flow Diagrams (PFDs) into Process and Instrumentati…
▽ More
Artificial intelligence has great potential for accelerating the design and engineering of chemical processes. Recently, we have shown that transformer-based language models can learn to auto-complete chemical process flowsheets using the SFILES 2.0 string notation. Also, we showed that language translation models can be used to translate Process Flow Diagrams (PFDs) into Process and Instrumentation Diagrams (P&IDs). However, artificial intelligence methods require big data and flowsheet data is currently limited. To mitigate this challenge of limited data, we propose a new data augmentation methodology for flowsheet data that is represented in the SFILES 2.0 notation. We show that the proposed data augmentation improves the performance of artificial intelligence-based process design models. In our case study flowsheet data augmentation improved the prediction uncertainty of the flowsheet autocompletion model by 14.7%. In the future, our flowsheet data augmentation can be used for other machine learning algorithms on chemical process flowsheets that are based on SFILES notation.
△ Less
Submitted 7 February, 2023;
originally announced February 2023.
-
Transfer learning for process design with reinforcement learning
Authors:
Qinghe Gao,
Haoyu Yang,
Shachi M. Shanbhag,
Artur M. Schweidtmann
Abstract:
Process design is a creative task that is currently performed manually by engineers. Artificial intelligence provides new potential to facilitate process design. Specifically, reinforcement learning (RL) has shown some success in automating process design by integrating data-driven models that learn to build process flowsheets with process simulation in an iterative design process. However, one ma…
▽ More
Process design is a creative task that is currently performed manually by engineers. Artificial intelligence provides new potential to facilitate process design. Specifically, reinforcement learning (RL) has shown some success in automating process design by integrating data-driven models that learn to build process flowsheets with process simulation in an iterative design process. However, one major challenge in the learning process is that the RL agent demands numerous process simulations in rigorous process simulators, thereby requiring long simulation times and expensive computational power. Therefore, typically short-cut simulation methods are employed to accelerate the learning process. Short-cut methods can, however, lead to inaccurate results. We thus propose to utilize transfer learning for process design with RL in combination with rigorous simulation methods. Transfer learning is an established approach from machine learning that stores knowledge gained while solving one problem and reuses this information on a different target domain. We integrate transfer learning in our RL framework for process design and apply it to an illustrative case study comprising equilibrium reactions, azeotropic separation, and recycles, our method can design economically feasible flowsheets with stable interaction with DWSIM. Our results show that transfer learning enables RL to economically design feasible flowsheets with DWSIM, resulting in a flowsheet with an 8% higher revenue. And the learning time can be reduced by a factor of 2.
△ Less
Submitted 7 February, 2023;
originally announced February 2023.
-
Efficient Bayesian Uncertainty Estimation for nnU-Net
Authors:
Yidong Zhao,
Changchun Yang,
Artur Schweidtmann,
Qian Tao
Abstract:
The self-configuring nnU-Net has achieved leading performance in a large range of medical image segmentation challenges. It is widely considered as the model of choice and a strong baseline for medical image segmentation. However, despite its extraordinary performance, nnU-Net does not supply a measure of uncertainty to indicate its possible failure. This can be problematic for large-scale image s…
▽ More
The self-configuring nnU-Net has achieved leading performance in a large range of medical image segmentation challenges. It is widely considered as the model of choice and a strong baseline for medical image segmentation. However, despite its extraordinary performance, nnU-Net does not supply a measure of uncertainty to indicate its possible failure. This can be problematic for large-scale image segmentation applications, where data are heterogeneous and nnU-Net may fail without notice. In this work, we introduce a novel method to estimate nnU-Net uncertainty for medical image segmentation. We propose a highly effective scheme for posterior sampling of weight space for Bayesian uncertainty estimation. Different from previous baseline methods such as Monte Carlo Dropout and mean-field Bayesian Neural Networks, our proposed method does not require a variational architecture and keeps the original nnU-Net architecture intact, thereby preserving its excellent performance and ease of use. Additionally, we boost the segmentation performance over the original nnU-Net via marginalizing multi-modal posterior models. We applied our method on the public ACDC and M&M datasets of cardiac MRI and demonstrated improved uncertainty estimation over a range of baseline methods. The proposed method further strengthens nnU-Net for medical image segmentation in terms of both segmentation accuracy and quality control.
△ Less
Submitted 1 May, 2024; v1 submitted 12 December, 2022;
originally announced December 2022.
-
Towards automatic generation of Piping and Instrumentation Diagrams (P&IDs) with Artificial Intelligence
Authors:
Edwin Hirtreiter,
Lukas Schulze Balhorn,
Artur M. Schweidtmann
Abstract:
Developing Piping and Instrumentation Diagrams (P&IDs) is a crucial step during the development of chemical processes. Currently, this is a tedious, manual, and time-consuming task. We propose a novel, completely data-driven method for the prediction of control structures. Our methodology is inspired by end-to-end transformer-based human language translation models. We cast the control structure p…
▽ More
Developing Piping and Instrumentation Diagrams (P&IDs) is a crucial step during the development of chemical processes. Currently, this is a tedious, manual, and time-consuming task. We propose a novel, completely data-driven method for the prediction of control structures. Our methodology is inspired by end-to-end transformer-based human language translation models. We cast the control structure prediction as a translation task where Process Flow Diagrams (PFDs) are translated to P&IDs. To use established transformer-based language translation models, we represent the P&IDs and PFDs as strings using our recently proposed SFILES 2.0 notation. Model training is performed in a transfer learning approach. Firstly, we pre-train our model using generated P&IDs to learn the grammatical structure of the process diagrams. Thereafter, the model is fine-tuned leveraging transfer learning on real P&IDs. The model achieved a top-5 accuracy of 74.8% on 10,000 generated P&IDs and 89.2% on 100,000 generated P&IDs. These promising results show great potential for AI-assisted process engineering. The tests on a dataset of 312 real P&IDs indicate the need of a larger P&IDs dataset for industry applications.
△ Less
Submitted 26 October, 2022;
originally announced November 2022.
-
Graph neural networks for the prediction of molecular structure-property relationships
Authors:
Jan G. Rittig,
Qinghe Gao,
Manuel Dahmen,
Alexander Mitsos,
Artur M. Schweidtmann
Abstract:
Molecular property prediction is of crucial importance in many disciplines such as drug discovery, molecular biology, or material and process design. The frequently employed quantitative structure-property/activity relationships (QSPRs/QSARs) characterize molecules by descriptors which are then mapped to the properties of interest via a linear or nonlinear model. In contrast, graph neural networks…
▽ More
Molecular property prediction is of crucial importance in many disciplines such as drug discovery, molecular biology, or material and process design. The frequently employed quantitative structure-property/activity relationships (QSPRs/QSARs) characterize molecules by descriptors which are then mapped to the properties of interest via a linear or nonlinear model. In contrast, graph neural networks, a novel machine learning method, directly work on the molecular graph, i.e., a graph representation where atoms correspond to nodes and bonds correspond to edges. GNNs allow to learn properties in an end-to-end fashion, thereby avoiding the need for informative descriptors as in QSPRs/QSARs. GNNs have been shown to achieve state-of-the-art prediction performance on various property predictions tasks and represent an active field of research. We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
△ Less
Submitted 25 July, 2022;
originally announced August 2022.
-
Learning from flowsheets: A generative transformer model for autocompletion of flowsheets
Authors:
Gabriel Vogel,
Lukas Schulze Balhorn,
Artur M. Schweidtmann
Abstract:
We propose a novel method enabling autocompletion of chemical flowsheets. This idea is inspired by the autocompletion of text. We represent flowsheets as strings using the text-based SFILES 2.0 notation and learn the grammatical structure of the SFILES 2.0 language and common patterns in flowsheets using a transformer-based language model. We pre-train our model on synthetically generated flowshee…
▽ More
We propose a novel method enabling autocompletion of chemical flowsheets. This idea is inspired by the autocompletion of text. We represent flowsheets as strings using the text-based SFILES 2.0 notation and learn the grammatical structure of the SFILES 2.0 language and common patterns in flowsheets using a transformer-based language model. We pre-train our model on synthetically generated flowsheets to learn the flowsheet language grammar. Then, we fine-tune our model in a transfer learning step on real flowsheet topologies. Finally, we use the trained model for causal language modeling to autocomplete flowsheets. Eventually, the proposed method can provide chemical engineers with recommendations during interactive flowsheet synthesis. The results demonstrate a high potential of this approach for future AI-assisted process synthesis.
△ Less
Submitted 1 August, 2022;
originally announced August 2022.
-
SFILES 2.0: An extended text-based flowsheet representation
Authors:
Gabriel Vogel,
Lukas Schulze Balhorn,
Edwin Hirtreiter,
Artur M. Schweidtmann
Abstract:
SFILES is a text-based notation for chemical process flowsheets. It was originally proposed by d'Anterroches (2006) who was inspired by the text-based SMILES notation for molecules. The text-based format has several advantages compared to flowsheet images regarding the storage format, computational accessibility, and eventually for data analysis and processing. However, the original SFILES version…
▽ More
SFILES is a text-based notation for chemical process flowsheets. It was originally proposed by d'Anterroches (2006) who was inspired by the text-based SMILES notation for molecules. The text-based format has several advantages compared to flowsheet images regarding the storage format, computational accessibility, and eventually for data analysis and processing. However, the original SFILES version cannot describe essential flowsheet configurations unambiguously, such as the distinction between top and bottom products. Neither is it capable of describing the control structure required for the safe and reliable operation of chemical processes. Also, there is no publicly available software for decoding or encoding chemical process topologies to SFILES. We propose the SFILES 2.0 with a complete description of the extended notation and naming conventions. Additionally, we provide open-source software for the automated conversion between flowsheet graphs and SFILES 2.0 strings. This way, we hope to encourage researchers and engineers to publish their flowsheet topologies as SFILES 2.0 strings. The ultimate goal is to set the standards for creating a FAIR database of chemical process flowsheets, which would be of great value for future data analysis and processing.
△ Less
Submitted 25 July, 2022;
originally announced August 2022.
-
Physical Pooling Functions in Graph Neural Networks for Molecular Property Prediction
Authors:
Artur M. Schweidtmann,
Jan G. Rittig,
Jana M. Weber,
Martin Grohe,
Manuel Dahmen,
Kai Leonhard,
Alexander Mitsos
Abstract:
Graph neural networks (GNNs) are emerging in chemical engineering for the end-to-end learning of physicochemical properties based on molecular graphs. A key element of GNNs is the pooling function which combines atom feature vectors into molecular fingerprints. Most previous works use a standard pooling function to predict a variety of properties. However, unsuitable pooling functions can lead to…
▽ More
Graph neural networks (GNNs) are emerging in chemical engineering for the end-to-end learning of physicochemical properties based on molecular graphs. A key element of GNNs is the pooling function which combines atom feature vectors into molecular fingerprints. Most previous works use a standard pooling function to predict a variety of properties. However, unsuitable pooling functions can lead to unphysical GNNs that poorly generalize. We compare and select meaningful GNN pooling methods based on physical knowledge about the learned properties. The impact of physical pooling functions is demonstrated with molecular properties calculated from quantum mechanical computations. We also compare our results to the recent set2set pooling approach. We recommend using sum pooling for the prediction of properties that depend on molecular size and compare pooling functions for properties that are molecular size-independent. Overall, we show that the use of physical pooling functions significantly enhances generalization.
△ Less
Submitted 27 July, 2022;
originally announced July 2022.
-
Flowsheet synthesis through hierarchical reinforcement learning and graph neural networks
Authors:
Laura Stops,
Roel Leenhouts,
Qinghe Gao,
Artur M. Schweidtmann
Abstract:
Process synthesis experiences a disruptive transformation accelerated by digitization and artificial intelligence. We propose a reinforcement learning algorithm for chemical process design based on a state-of-the-art actor-critic logic. Our proposed algorithm represents chemical processes as graphs and uses graph convolutional neural networks to learn from process graphs. In particular, the graph…
▽ More
Process synthesis experiences a disruptive transformation accelerated by digitization and artificial intelligence. We propose a reinforcement learning algorithm for chemical process design based on a state-of-the-art actor-critic logic. Our proposed algorithm represents chemical processes as graphs and uses graph convolutional neural networks to learn from process graphs. In particular, the graph neural networks are implemented within the agent architecture to process the states and make decisions. Moreover, we implement a hierarchical and hybrid decision-making process to generate flowsheets, where unit operations are placed iteratively as discrete decisions and corresponding design variables are selected as continuous decisions. We demonstrate the potential of our method to design economically viable flowsheets in an illustrative case study comprising equilibrium reactions, azeotropic separation, and recycles. The results show quick learning in discrete, continuous, and hybrid action spaces. Due to the flexible architecture of the proposed reinforcement learning agent, the method is predestined to include large action-state spaces and an interface to process simulators in future research.
△ Less
Submitted 25 July, 2022;
originally announced July 2022.
-
Graph Neural Networks for Temperature-Dependent Activity Coefficient Prediction of Solutes in Ionic Liquids
Authors:
Jan G. Rittig,
Karim Ben Hicham,
Artur M. Schweidtmann,
Manuel Dahmen,
Alexander Mitsos
Abstract:
Ionic liquids (ILs) are important solvents for sustainable processes and predicting activity coefficients (ACs) of solutes in ILs is needed. Recently, matrix completion methods (MCMs), transformers, and graph neural networks (GNNs) have shown high accuracy in predicting ACs of binary mixtures, superior to well-established models, e.g., COSMO-RS and UNIFAC. GNNs are particularly promising here as t…
▽ More
Ionic liquids (ILs) are important solvents for sustainable processes and predicting activity coefficients (ACs) of solutes in ILs is needed. Recently, matrix completion methods (MCMs), transformers, and graph neural networks (GNNs) have shown high accuracy in predicting ACs of binary mixtures, superior to well-established models, e.g., COSMO-RS and UNIFAC. GNNs are particularly promising here as they learn a molecular graph-to-property relationship without pretraining, typically required for transformers, and are, unlike MCMs, applicable to molecules not included in training. For ILs, however, GNN applications are currently missing. Herein, we present a GNN to predict temperature-dependent infinite dilution ACs of solutes in ILs. We train the GNN on a database including more than 40,000 AC values and compare it to a state-of-the-art MCM. The GNN and MCM achieve similar high prediction performance, with the GNN additionally enabling high-quality predictions for ACs of solutions that contain ILs and solutes not considered during training.
△ Less
Submitted 23 June, 2022;
originally announced June 2022.
-
Graph Machine Learning for Design of High-Octane Fuels
Authors:
Jan G. Rittig,
Martin Ritzert,
Artur M. Schweidtmann,
Stefanie Winkler,
Jana M. Weber,
Philipp Morsch,
K. Alexander Heufer,
Martin Grohe,
Alexander Mitsos,
Manuel Dahmen
Abstract:
Fuels with high-knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the fiel…
▽ More
Fuels with high-knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.
△ Less
Submitted 14 October, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Global Optimization of Gaussian processes
Authors:
Artur M. Schweidtmann,
Dominik Bongartz,
Daniel Grothe,
Tim Kerkenhoff,
Xiaopeng Lin,
Jaromil Najman,
Alexander Mitsos
Abstract:
Gaussian processes~(Kriging) are interpolating data-driven models that are frequently applied in various disciplines. Often, Gaussian processes are trained on datasets and are subsequently embedded as surrogate models in optimization problems. These optimization problems are nonconvex and global optimization is desired. However, previous literature observed computational burdens limiting determini…
▽ More
Gaussian processes~(Kriging) are interpolating data-driven models that are frequently applied in various disciplines. Often, Gaussian processes are trained on datasets and are subsequently embedded as surrogate models in optimization problems. These optimization problems are nonconvex and global optimization is desired. However, previous literature observed computational burdens limiting deterministic global optimization to Gaussian processes trained on few data points. We propose a reduced-space formulation for deterministic global optimization with trained Gaussian processes embedded. For optimization, the branch-and-bound solver branches only on the degrees of freedom and McCormick relaxations are propagated through explicit Gaussian process models. The approach also leads to significantly smaller and computationally cheaper subproblems for lower and upper bounding. To further accelerate convergence, we derive envelopes of common covariance functions for GPs and tight relaxations of acquisition functions used in Bayesian optimization including expected improvement, probability of improvement, and lower confidence bound. In total, we reduce computational time by orders of magnitude compared to state-of-the-art methods, thus overcoming previous computational burdens. We demonstrate the performance and scaling of the proposed method and apply it to Bayesian optimization with global optimization of the acquisition function and chance-constrained programming. The Gaussian process models, acquisition functions, and training scripts are available open-source within the "MeLOn - Machine Learning Models for Optimization" toolbox~(https://git.rwth-aachen.de/avt.svt/public/MeLOn).
△ Less
Submitted 21 May, 2020;
originally announced May 2020.