Showing 1–6 of 6 results for author: Van Deursen, R

  1. arXiv:2407.20786  [pdf]

    cs.LG cs.AI

    Be aware of overfitting by hyperparameter optimization!

    Authors: Igor V. Tetko, Ruud van Deursen, Guillaume Godin

    Abstract: Hyperparameter optimization is very frequently employed in machine learning. However, an optimization of a large space of parameters could result in overfitting of models. In recent studies on solubility prediction the authors collected seven thermodynamic and kinetic solubility datasets from different data sources. They used state-of-the-art graph-based methods and compared models developed for e…

    Submitted 24 November, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

    Comments: 19 pages, 5 Tables

  2. arXiv:2010.01027  [pdf]

    q-bio.QM cs.LG

    Beyond Chemical 1D knowledge using Transformers

    Authors: Ruud van Deursen, Igor V. Tetko, Guillaume Godin

    Abstract: In the present paper we evaluated the efficiency of the recent Transformer-CNN models to predict target properties based on the augmented stereochemical SMILES. We selected a well-known Cliff activity dataset as well as a Dipole moment dataset and compared the effect of three representations for R/S stereochemistry in SMILES. The considered representations were SMILES without stereochemistry (noChiSMI…

    Submitted 7 October, 2020; v1 submitted 2 October, 2020; originally announced October 2020.

  3. State-of-the-Art Augmented NLP Transformer models for direct and single-step retrosynthesis

    Authors: Igor V. Tetko, Pavel Karpov, Ruud Van Deursen, Guillaume Godin

    Abstract: We investigated the effect of different training scenarios on predicting the (retro)synthesis of chemical compounds using a text-like representation of chemical reactions (SMILES) and Natural Language Processing neural network Transformer architecture. We showed that data augmentation, which is a powerful method used in image processing, eliminated the effect of data memorization by neural network…

    Submitted 22 September, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

  4. arXiv:1910.13124  [pdf, other]

    cs.LG stat.ML

    Multitask Learning On Graph Neural Networks Applied To Molecular Property Predictions

    Authors: Fabio Capela, Vincent Nouchi, Ruud Van Deursen, Igor V. Tetko, Guillaume Godin

    Abstract: Prediction of molecular properties, including physico-chemical properties, is a challenging task in chemistry. Herein we present a new state-of-the-art multitask prediction method based on existing graph neural network models. We have used different architectures for our models and the results clearly demonstrate that multitask learning can improve model performance. Additionally, a significant re…

    Submitted 30 October, 2019; v1 submitted 29 October, 2019; originally announced October 2019.

  5. arXiv:1909.11472  [pdf]

    cs.LG stat.ML

    Deep Generative Model for Sparse Graphs using Text-Based Learning with Augmentation in Generative Examination Networks

    Authors: Ruud van Deursen, Guillaume Godin

    Abstract: Graphs and networks are a key research tool for a variety of science fields, most notably chemistry, biology, engineering and social sciences. Modeling and generation of graphs with efficient sampling is a key challenge for graphs. In particular, the non-uniqueness, high dimensionality of the vertices and local dependencies of the edges may render the task challenging. We apply our recently introd…

    Submitted 24 September, 2019; originally announced September 2019.

  6. arXiv:1909.04825  [pdf]

    cs.LG stat.ML

    GEN: Highly Efficient SMILES Explorer Using Autodidactic Generative Examination Networks

    Authors: Ruud van Deursen, Peter Ertl, Igor V. Tetko, Guillaume Godin

    Abstract: Recurrent neural networks have been widely used to generate millions of de novo molecules in a known chemical space. These deep generative models are typically set up with LSTM or GRU units and trained with canonical SMILES. In this study, we introduce a new robust architecture, Generative Examination Networks (GEN), based on bidirectional RNNs with concatenated sub-models to learn and generate molec…

    Submitted 10 September, 2019; originally announced September 2019.