Probabilistic transfer learning methodology to expedite high fidelity simulation of reactive flows
Authors:
Bruno S. Soriano,
Ki Sung Jung,
Tarek Echekki,
Jacqueline H. Chen,
Mohammad Khalil
Abstract:
Reduced-order models based on the transport of a lower-dimensional manifold representation of the thermochemical state, such as Principal Component (PC) transport and Machine Learning (ML) techniques, have been developed to reduce the computational cost of Direct Numerical Simulations (DNS) of reactive flows. Both PC transport and ML normally require abundant data to reach sufficient predictive accuracy, and such data might not be available because of the prohibitive cost of DNS or experimental data acquisition. To alleviate this difficulty, similar data from an existing dataset or domain (the source domain) can be used to train ML models, potentially resulting in adequate predictions in the domain of interest (the target domain). This study presents a novel probabilistic transfer learning (TL) framework to increase trust in ML models' predictions of the thermochemical state on a lower-dimensional manifold in a sparse-data setting. The framework uses Bayesian neural networks and autoencoders to reduce the dimensionality of the state space and to diffuse knowledge from the source domain to the target domain. The new framework is applied to one-dimensional freely-propagating flame solutions under different data sparsity scenarios. The results reveal that there is an optimal amount of knowledge to be transferred, which depends on the amount of data available in the target domain and on the similarity between the domains. TL can reduce the reconstruction error by one order of magnitude for cases with large sparsity. The new framework required one-tenth as much target-domain data to reproduce the error obtained in the abundant-data scenario. Furthermore, comparisons with a state-of-the-art deterministic TL strategy show that the probabilistic method can require as little as one quarter of the data to achieve the same reconstruction error.
Submitted 17 May, 2024;
originally announced May 2024.
Client applications and Server Side docker for management of RNASeq and/or VariantSeq workflows and pipelines of the GPRO Suite
Authors:
Ahmed Hafez,
Beatriz Soriano,
Aya A. Elsayed,
Ricardo Futami,
Raquel Ceprian,
Ricardo Ramos-Ruiz,
Genis Martinez,
Francisco J. Roig,
Miguel A. Torres-Font,
Fernando Naya-Català,
Josep Alvar Calduch-Giner,
Lucia Trilla-Fuertes,
Angelo Gamez-Pozo,
Vicente Arnau,
Jose M. Sempere,
Jaume Perez-Sánchez,
Toni Gabaldón,
Carlos Llorens
Abstract:
The GPRO suite is an ongoing bioinformatics project for -omic data analyses. As part of the continued growth of this project, we introduce a client-side and server-side solution for comparative transcriptomics and analysis of variants. The client side consists of two Java applications called "RNASeq" and "VariantSeq" to manage workflows for RNA-seq and Variant-seq analysis, respectively, based on the most common command line interface tools for each topic. Both applications are coupled with a Linux server infrastructure (named GPRO Server Side) that hosts all dependencies of each application (scripts, databases, and command line interface tools). Implementation of the server side requires a Linux operating system, PHP, SQL, Python, bash scripting, and third-party software. The GPRO Server Side can be deployed via a Docker container that can be installed on the user's PC under any operating system or on remote servers as a cloud solution. The two applications are available as desktop and cloud applications and provide two execution modes: a Step-by-Step mode, in which each step of a workflow is executed independently, and a Pipeline mode, in which all steps are run sequentially. The two applications also feature an experimental support system called GENIE that consists of a virtual chatbot/assistant and a pipeline jobs panel coupled with an expert system. The chatbot can troubleshoot issues with the usage of each tool, the pipeline jobs panel provides information about the status of each task executed in the GPRO Server Side, and the expert system provides the user with recommendations for identifying or fixing failed analyses. The two applications and the GPRO Server Side combine the user-friendliness and security of client software with the efficiency of front-end & back-end solutions to manage command line interface software for RNA-seq and Variant-seq analysis via interface environments.
Submitted 19 November, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.