-
Building Trustworthy AI: Transparent AI Systems via Large Language Models, Ontologies, and Logical Reasoning (TranspNet)
Authors:
Fadi Al Machot,
Martin Thomas Horsch,
Habib Ullah
Abstract:
Growing concerns over the lack of transparency in AI, particularly in high-stakes fields like healthcare and finance, drive the need for explainable and trustworthy systems. While Large Language Models (LLMs) perform exceptionally well in generating accurate outputs, their "black box" nature poses significant challenges to transparency and trust. To address this, the paper proposes the TranspNet pipeline, which integrates symbolic AI with LLMs. By leveraging domain expert knowledge, retrieval-augmented generation (RAG), and formal reasoning frameworks like Answer Set Programming (ASP), TranspNet enhances LLM outputs with structured reasoning and verification. This approach strives to help AI systems deliver results that are as accurate, explainable, and trustworthy as possible, aligning with regulatory expectations for transparency and accountability. TranspNet provides a solution for developing AI systems that are reliable and interpretable, making it suitable for real-world applications where trust is critical.
Submitted 18 December, 2024; v1 submitted 13 November, 2024;
originally announced November 2024.
-
Symbolic-AI-Fusion Deep Learning (SAIF-DL): Encoding Knowledge into Training with Answer Set Programming Loss Penalties by a Novel Loss Function Approach
Authors:
Fadi Al Machot,
Martin Thomas Horsch,
Habib Ullah
Abstract:
This paper presents a hybrid methodology that enhances the training process of deep learning (DL) models by embedding domain expert knowledge using ontologies and answer set programming (ASP). By integrating these symbolic AI methods, we encode domain-specific constraints, rules, and logical reasoning directly into the model's learning process, thereby improving both performance and trustworthiness. The proposed approach is flexible and applicable to both regression and classification tasks, demonstrating generalizability across various fields such as healthcare, autonomous systems, engineering, and battery manufacturing applications. Unlike other state-of-the-art methods, the strength of our approach lies in its scalability across different domains. The design allows for the automation of the loss function by simply updating the ASP rules, making the system highly scalable and user-friendly. This facilitates seamless adaptation to new domains without significant redesign, offering a practical solution for integrating expert knowledge into DL models in industrial settings such as battery manufacturing.
Submitted 18 December, 2024; v1 submitted 13 November, 2024;
originally announced November 2024.
-
Semantic interoperability based on the European Materials and Modelling Ontology and its ontological paradigm: Mereosemiotics
Authors:
Martin Thomas Horsch,
Silvia Chiacchiera,
Björn Schembera,
Michael A. Seaton,
Ilian T. Todorov
Abstract:
The European Materials and Modelling Ontology (EMMO) has recently been advanced in the computational molecular engineering and multiscale modelling communities as a top-level ontology, aiming to support semantic interoperability and data integration solutions, e.g., for research data infrastructures. The present work explores how top-level ontologies that are based on the same paradigm - the same set of fundamental postulates - as the EMMO can be applied to models of physical systems and their use in computational engineering practice. This paradigm, which combines mereology (in its extension as mereotopology) and semiotics (following Peirce's approach), is here referred to as mereosemiotics. Multiple conceivable ways of implementing mereosemiotics are compared, and the design space consisting of the possible types of top-level ontologies following this paradigm is characterized.
Submitted 11 February, 2021; v1 submitted 22 March, 2020;
originally announced March 2020.
-
Reliable and interoperable computational molecular engineering: 2. Semantic interoperability based on the European Materials and Modelling Ontology
Authors:
Martin Thomas Horsch,
Silvia Chiacchiera,
Youness Bami,
Georg J. Schmitz,
Gabriele Mogni,
Gerhard Goldbeck,
Emanuele Ghedini
Abstract:
The European Materials and Modelling Ontology (EMMO) is a top-level ontology designed by the European Materials Modelling Council to facilitate semantic interoperability between platforms, models, and tools in computational molecular engineering, integrated computational materials engineering, and related applications of materials modelling and characterization. Additionally, domain ontologies exist based on data technology developments from specific platforms. The present work discusses the ongoing work on establishing a European Virtual Marketplace Framework, into which diverse platforms can be integrated. It addresses common challenges that arise when marketplace-level domain ontologies are combined with a top-level ontology like the EMMO by ontology alignment.
Submitted 13 January, 2020;
originally announced January 2020.
-
Ontologies for the Virtual Materials Marketplace
Authors:
Martin Thomas Horsch,
Silvia Chiacchiera,
Michael A. Seaton,
Ilian T. Todorov,
Karel Šindelka,
Martin Lísal,
Barbara Andreon,
Esteban Bayro Kaiser,
Gabriele Mogni,
Gerhard Goldbeck,
Ralf Kunze,
Georg Summer,
Andreas Fiseni,
Hauke Brüning,
Peter Schiffels,
Welchy Leite Cavalcanti
Abstract:
The Virtual Materials Marketplace (VIMMP) project, which develops an open platform for providing and accessing services related to materials modelling, is presented with a focus on its ontology development and data technology aspects. Within VIMMP, a system of marketplace-level ontologies is developed to characterize services, models, and interactions between users; the European Materials and Modelling Ontology (EMMO) is employed as a top-level ontology. The ontologies are used to annotate data that are stored in the ZONTAL Space component of VIMMP and to support the ingest and retrieval of data and metadata at the VIMMP marketplace frontend.
Submitted 5 February, 2020; v1 submitted 3 December, 2019;
originally announced December 2019.
-
Semantic interoperability and characterization of data provenance in computational molecular engineering
Authors:
M. T. Horsch,
C. Niethammer,
G. Boccardo,
P. Carbone,
S. Chiacchiera,
M. Chiricotto,
J. D. Elliott,
V. Lobaskin,
P. Neumann,
P. Schiffels,
M. A. Seaton,
I. T. Todorov,
J. Vrabec,
W. L. Cavalcanti
Abstract:
By introducing a common representational system for metadata that describe the employed simulation workflows, diverse sources of data and platforms in computational molecular engineering, such as workflow management systems, can become interoperable at the semantic level. To achieve semantic interoperability, the present work introduces two ontologies that provide a formal specification of the entities occurring in a simulation workflow and the relations between them: The software ontology VISO is developed to represent software packages and their features, and OSMO, an ontology for simulation, modelling, and optimization, is introduced on the basis of MODA, a previously developed semi-intuitive graph notation for workflows in materials modelling. As a proof of concept, OSMO is employed to describe a use case of the TaLPas workflow management system, a scheduler and workflow optimizer for particle-based simulations.
Submitted 15 November, 2019; v1 submitted 29 July, 2019;
originally announced August 2019.
-
Update-tolerant and Revocable Password Backup (Extended Version)
Authors:
Moritz Horsch,
Johannes Braun,
Dominique Metz,
Johannes Buchmann
Abstract:
It is practically impossible for users to memorize a large portfolio of strong and individual passwords for their online accounts. A solution is to generate passwords randomly and store them. Yet, storing passwords instead of memorizing them bears the risk of loss, e.g., in situations where the device on which the passwords are stored is damaged, lost, or stolen. This makes the creation of backups of the passwords indispensable. However, placing such backups at secure locations to protect them as well from loss and unauthorized access and keeping them up-to-date at the same time is an unsolved problem in practice.
We present PASCO, a backup solution for passwords that solves this challenge. PASCO backups need not be updated, even when the user's password portfolio is changed. PASCO backups can be revoked without having physical access to them. This prevents password leakage, even when a user loses control over a backup. Additionally, we show how to extend PASCO to enable a fully controllable emergency access. It allows a user to give someone else access to his passwords in urgent situations. We also present a security evaluation and an implementation of PASCO.
Submitted 26 April, 2017; v1 submitted 10 April, 2017;
originally announced April 2017.
-
Molecular simulation of the surface tension of real fluids
Authors:
Stephan Werth,
Martin Horsch,
Hans Hasse
Abstract:
Molecular models of real fluids are validated by comparing the vapor-liquid surface tension from molecular dynamics (MD) simulation to correlations of experimental data. The considered molecular models consist of up to 28 interaction sites, including Lennard-Jones sites, point charges, dipoles and quadrupoles. They represent 38 real fluids, such as ethylene oxide, sulfur dioxide, phosgene, benzene, ammonia, formaldehyde, methanol and water, and were adjusted to reproduce the saturated liquid density, vapor pressure and enthalpy of vaporization. The models were not adjusted to interfacial properties, however, so that the present MD simulations are a test of model predictions. It is found that all of the considered models overestimate the surface tension. In most cases, however, the relative deviation between the simulation results and correlations to experimental data is smaller than 20 %. This observation corroborates the outcome of our previous studies on the surface tension of 2CLJQ and 2CLJD fluids where an overestimation of the order of 10 to 20 % was found.
Submitted 15 August, 2016;
originally announced August 2016.
-
ms2: A molecular simulation tool for thermodynamic properties, new version release
Authors:
Colin W. Glass,
Steffen Reiser,
Gábor Rutkai,
Stephan Deublein,
Andreas Köster,
Gabriela Guevara Carrión,
Amer Wafai,
Martin Horsch,
Martin F. Bernreuther,
Thorsten Windmann,
Hans Hasse,
Jadran Vrabec
Abstract:
A new version release (2.0) of the molecular simulation tool ms2 [S. Deublein et al., Comput. Phys. Commun. 182 (2011) 2350] is presented. Version 2.0 of ms2 features a hybrid parallelization based on MPI and OpenMP for molecular dynamics simulation to achieve higher scalability. Furthermore, the formalism by Lustig [R. Lustig, Mol. Phys. 110 (2012) 3041] is implemented, allowing for a systematic sampling of Massieu potential derivatives in a single simulation run. Moreover, the Green-Kubo formalism is extended for the sampling of the electric conductivity and the residence time. To remove the restriction of the preceding version to electro-neutral molecules, Ewald summation is implemented to consider ionic long range interactions. Finally, the sampling of the radial distribution function is added.
Submitted 25 July, 2015;
originally announced July 2015.
-
PALPAS - PAsswordLess PAssword Synchronization
Authors:
Moritz Horsch,
Andreas Hülsing,
Johannes Buchmann
Abstract:
Tools that synchronize passwords over several user devices typically store the encrypted passwords in a central online database. For encryption, a low-entropy, password-based key is used. Such a database may be subject to unauthorized access which can lead to the disclosure of all passwords by an offline brute-force attack. In this paper, we present PALPAS, a secure and user-friendly tool that synchronizes passwords between user devices without storing information about them centrally. The idea of PALPAS is to generate a password from a high entropy secret shared by all devices and a random salt value for each service. Only the salt values are stored on a server but not the secret. The salt enables the user devices to generate the same password but is statistically independent of the password. In order for PALPAS to generate passwords according to different password policies, we also present a mechanism that automatically retrieves and processes the password requirements of services. PALPAS users need to only memorize a single password and the setup of PALPAS on a further device demands only a one-time transfer of few static data.
Submitted 15 June, 2015;
originally announced June 2015.
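The derivation idea described in the PALPAS abstract (one high-entropy secret shared by all devices, plus a random per-service salt that is the only value stored on the server) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the use of HMAC-SHA256, the alphanumeric `ALPHABET`, the 16-character default length, and the function name `derive_password`.

```python
import hashlib
import hmac
import secrets
import string

# Hypothetical password policy: letters and digits only (62 symbols).
ALPHABET = string.ascii_letters + string.digits


def derive_password(shared_secret: bytes, salt: bytes, length: int = 16) -> str:
    """Deterministically map (secret, salt) to a password string.

    Assumption: HMAC-SHA256 stands in for whatever keyed function the
    real tool uses; the abstract does not name one.
    """
    # Largest multiple of the alphabet size that fits in one byte (248 for 62).
    limit = 256 - (256 % len(ALPHABET))
    chars = []
    counter = 0
    while len(chars) < length:
        # Each counter value yields a fresh 32-byte pseudorandom block.
        block = hmac.new(
            shared_secret, salt + counter.to_bytes(4, "big"), hashlib.sha256
        ).digest()
        for byte in block:
            # Rejection sampling: skip bytes that would bias the modulo step.
            if byte < limit:
                chars.append(ALPHABET[byte % len(ALPHABET)])
                if len(chars) == length:
                    break
        counter += 1
    return "".join(chars)


# The secret lives only on the user's devices; the salt may be stored remotely.
device_secret = secrets.token_bytes(32)
service_salt = secrets.token_bytes(16)
password = derive_password(device_secret, service_salt)
```

Because the function is deterministic, any device holding the same secret and fetching the same salt reproduces the same password, while the salt alone reveals nothing useful without the secret.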
-
Molecular modelling and simulation of the surface tension of real quadrupolar fluids
Authors:
Stephan Werth,
Katrin Stöbener,
Peter Klein,
Karl-Heinz Küfer,
Martin Horsch,
Hans Hasse
Abstract:
Molecular modelling and simulation of the surface tension of fluids with force fields is discussed. 29 real fluids are studied, including nitrogen, oxygen, carbon dioxide, carbon monoxide, fluorine, chlorine, bromine, iodine, ethane, ethylene, acetylene, propyne, propylene, propadiene, carbon disulfide, sulfur hexafluoride, and many refrigerants. The fluids are represented by two-centre Lennard-Jones plus point quadrupole models from the literature. These models were adjusted only to experimental data of the vapour pressure and saturated liquid density so that the results for the surface tension are predictions. The deviations between the predictions and experimental data for the surface tension are of the order of 20 percent. The surface tension is usually overestimated by the models. For further improvements, data on the surface tension can be included in the model development. A suitable strategy for this is multi-criteria optimization based on Pareto sets. This is demonstrated using the model for carbon dioxide as an example.
Submitted 21 August, 2014;
originally announced August 2014.
-
ls1 mardyn: The massively parallel molecular dynamics code for large systems
Authors:
Christoph Niethammer,
Stefan Becker,
Martin Bernreuther,
Martin Buchholz,
Wolfgang Eckhardt,
Alexander Heinecke,
Stephan Werth,
Hans-Joachim Bungartz,
Colin W. Glass,
Hans Hasse,
Jadran Vrabec,
Martin Horsch
Abstract:
The molecular dynamics simulation code ls1 mardyn is presented. It is a highly scalable code, optimized for massively parallel execution on supercomputing architectures, and currently holds the world record for the largest molecular simulation with over four trillion particles. It enables the application of pair potentials to length and time scales which were previously out of scope for molecular dynamics simulation. With an efficient dynamic load balancing scheme, it delivers high scalability even for challenging heterogeneous configurations. Presently, multi-center rigid potential models based on Lennard-Jones sites, point charges and higher-order polarities are supported. Due to its modular design, ls1 mardyn can be extended to new physical models, methods, and algorithms, allowing future users to tailor it to suit their respective needs. Possible applications include scenarios with complex geometries, e.g. for fluids at interfaces, as well as non-equilibrium molecular dynamics simulation of heat and mass transfer.
Submitted 20 August, 2014;
originally announced August 2014.
-
Computational molecular engineering as an emerging technology in process engineering
Authors:
Martin Horsch,
Christoph Niethammer,
Jadran Vrabec,
Hans Hasse
Abstract:
The present level of development of molecular force field methods is assessed from the point of view of simulation-based engineering, outlining the immediate perspective for further development and highlighting the newly emerging discipline of Computational Molecular Engineering (CME) which makes basic research in soft matter physics fruitful for industrial applications. Within the coming decade, major breakthroughs can be reached if a research focus is placed on processes at interfaces, combining aspects where an increase in the accessible length and time scales due to massively parallel high-performance computing will lead to particularly significant improvements.
Submitted 21 May, 2013;
originally announced May 2013.
-
Molecular modelling and simulation of electrolyte solutions, biomolecules, and wetting of component surfaces
Authors:
Martin Horsch,
Stefan Becker,
Juan Manuel Castillo,
Stephan Deublein,
Agnes Fröscher,
Steffen Reiser,
Stephan Werth,
Jadran Vrabec,
Hans Hasse
Abstract:
Massively-parallel molecular dynamics simulation is applied to systems containing electrolytes, vapour-liquid interfaces, and biomolecules in contact with water-oil interfaces. Novel molecular models of alkali halide salts are presented and employed for the simulation of electrolytes in aqueous solution. The enzymatically catalysed hydroxylation of oleic acid is investigated by molecular dynamics simulation taking the internal degrees of freedom of the macromolecules into account. Thereby, Ewald summation methods are used to compute the long range electrostatic interactions. In systems with a phase boundary, the dispersive interaction, which is modelled by the Lennard-Jones potential here, has a more significant long range contribution than in homogeneous systems. This effect is accounted for by implementing the Janecek cutoff correction scheme. On this basis, the HPC infrastructure at the Steinbuch Centre for Computing was accessed and efficiently used, yielding new insights on the molecular systems under consideration.
Submitted 17 May, 2013;
originally announced May 2013.
-
A Dynamic Approach to Probabilistic Inference
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
In this paper we present a framework for dynamically constructing Bayesian networks. We introduce the notion of a background knowledge base of schemata, which is a collection of parameterized conditional probability statements. These schemata explicitly separate the general knowledge of properties an individual may have from the specific knowledge of particular individuals that may have these properties. Knowledge of individuals can be combined with this background knowledge to create Bayesian networks, which can then be used in any propagation scheme. We discuss the theory and assumptions necessary for the implementation of dynamic Bayesian networks, and indicate where our approach may be useful.
Submitted 27 March, 2013;
originally announced April 2013.
-
Flexible Policy Construction by Information Refinement
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
We report on work towards flexible algorithms for solving decision problems represented as influence diagrams. An algorithm is given to construct a tree structure for each decision node in an influence diagram. Each tree represents a decision function and is constructed incrementally. The improvements to the tree converge to the optimal decision function (neglecting computational costs) and the asymptotic behaviour is only a constant factor worse than dynamic programming techniques, counting the number of Bayesian network queries. Empirical results show how expected utility increases with the size of the tree and the number of Bayesian net calculations.
Submitted 13 February, 2013;
originally announced February 2013.
-
An Anytime Algorithm for Decision Making under Uncertainty
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
We present an anytime algorithm which computes policies for decision problems represented as multi-stage influence diagrams. Our algorithm constructs policies incrementally, starting from a policy which makes no use of the available information. The incremental process constructs policies which include more of the information available to the decision maker at each step. While the process converges to the optimal policy, our approach is designed for situations in which computing the optimal policy is infeasible. We provide examples of the process on several large decision problems, showing that, for these examples, the process constructs valuable (but sub-optimal) policies before the optimal policy would be available by traditional methods.
Submitted 30 January, 2013;
originally announced January 2013.
-
Estimating the Value of Computation in Flexible Information Refinement
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
We outline a method to estimate the value of computation for a flexible algorithm using empirical data. To determine a reasonable trade-off between cost and value, we build an empirical model of the value obtained through computation, and apply this model to estimate the value of computation for quite different problems. In particular, we investigate this trade-off for the problem of constructing policies for decision problems represented as influence diagrams. We show how two features of our anytime algorithm provide reasonable estimates of the value of computation in this domain.
Submitted 23 January, 2013;
originally announced January 2013.
-
Probabilistic Arc Consistency: A Connection between Constraint Reasoning and Probabilistic Reasoning
Authors:
Michael C. Horsch,
Bill Havens
Abstract:
We document a connection between constraint reasoning and probabilistic reasoning. We present an algorithm, called probabilistic arc consistency, which is both a generalization of a well known algorithm for arc consistency used in constraint reasoning, and a specialization of the belief updating algorithm for singly-connected networks. Our algorithm is exact for singly-connected constraint problems, but can work well as an approximation for arbitrary problems. We briefly discuss some empirical results, and related methods.
Submitted 16 January, 2013;
originally announced January 2013.