
Showing 1–7 of 7 results for author: Arjona-Medina, J

Searching in archive cs.
  1. arXiv:2410.08024

    cs.LG cs.AI

    Pretraining Graph Transformers with Atom-in-a-Molecule Quantum Properties for Improved ADMET Modeling

    Authors: Alessio Fallani, Ramil Nugmanov, Jose Arjona-Medina, Jörg Kurt Wegner, Alexandre Tkatchenko, Kostiantyn Chernichenko

    Abstract: We evaluate the impact of pretraining Graph Transformer architectures on atom-level quantum-mechanical features for the modeling of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of drug-like compounds. We compare this pretraining strategy with two others: one based on molecular quantum properties (specifically the HOMO-LUMO gap) and one using a self-supervised at…

    Submitted 10 October, 2024; originally announced October 2024.

  2. arXiv:2405.14837

    cs.LG physics.chem-ph quant-ph

    Analysis of Atom-level pretraining with Quantum Mechanics (QM) data for Graph Neural Networks Molecular property models

    Authors: Jose Arjona-Medina, Ramil Nugmanov

    Abstract: Despite the rapid and significant advancements in deep learning for Quantitative Structure-Activity Relationship (QSAR) models, the challenge of learning robust molecular representations that effectively generalize in real-world scenarios to novel compounds remains an elusive and unresolved task. This study examines how atom-level pretraining with quantum mechanics (QM) data can mitigate violation…

    Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 6 pages + 11 Supplement Materials

  3. arXiv:2210.13545

    cs.LG cs.AI

    MEET: A Monte Carlo Exploration-Exploitation Trade-off for Buffer Sampling

    Authors: Julius Ott, Lorenzo Servadei, Jose Arjona-Medina, Enrico Rinaldi, Gianfranco Mauro, Daniela Sánchez Lopera, Michael Stephan, Thomas Stadelmayer, Avik Santra, Robert Wille

    Abstract: Data selection is essential for any data-based optimization technique, such as Reinforcement Learning. State-of-the-art sampling strategies for the experience replay buffer improve the performance of the Reinforcement Learning agent. However, they do not incorporate uncertainty in the Q-Value estimation. Consequently, they cannot adapt the sampling strategies, including exploration and exploitatio…

    Submitted 17 April, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted at ICASSP 2023

    Report number: RIKEN-iTHEMS-Report-23

  4. arXiv:2012.01399

    cs.LG cs.AI math.OC

    Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER

    Authors: Markus Holzleitner, Lukas Gruber, José Arjona-Medina, Johannes Brandstetter, Sepp Hochreiter

    Abstract: We prove under commonly used assumptions the convergence of actor-critic reinforcement learning algorithms, which simultaneously learn a policy function, the actor, and a value function, the critic. Both functions can be deep neural networks of arbitrary complexity. Our framework allows showing convergence of the well known Proximal Policy Optimization (PPO) and of the recently introduced RUDDER.…

    Submitted 2 December, 2020; originally announced December 2020.

    Comments: 20 pages

  5. arXiv:2009.14108

    cs.LG cs.AI stat.ML

    Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

    Authors: Vihang P. Patil, Markus Hofmarcher, Marius-Constantin Dinu, Matthias Dorfer, Patrick M. Blies, Johannes Brandstetter, Jose A. Arjona-Medina, Sepp Hochreiter

    Abstract: Reinforcement learning algorithms require many samples when solving complex hierarchical tasks with sparse and delayed rewards. For such complex tasks, the recently proposed RUDDER uses reward redistribution to leverage steps in the Q-function that are associated with accomplishing sub-tasks. However, often only few episodes with high rewards are available as demonstrations since current explorati…

    Submitted 28 June, 2022; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: Github: https://github.com/ml-jku/align-rudder, YouTube: https://youtu.be/HO-_8ZUl-UY

  6. Explaining and Interpreting LSTMs

    Authors: Leila Arras, Jose A. Arjona-Medina, Michael Widrich, Grégoire Montavon, Michael Gillhofer, Klaus-Robert Müller, Sepp Hochreiter, Wojciech Samek

    Abstract: While neural networks have acted as a strong unifying force in the design of modern AI systems, the neural network architectures themselves remain highly heterogeneous due to the variety of tasks to be solved. In this chapter, we explore how to adapt the Layer-wise Relevance Propagation (LRP) technique used for explaining the predictions of feed-forward networks to the LSTM architecture used for s…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: 28 pages, 7 figures, book chapter, In: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, LNCS volume 11700, Springer 2019. arXiv admin note: text overlap with arXiv:1806.07857

  7. arXiv:1806.07857

    cs.LG cs.AI math.OC stat.ML

    RUDDER: Return Decomposition for Delayed Rewards

    Authors: Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, Sepp Hochreiter

    Abstract: We propose RUDDER, a novel reinforcement learning approach for delayed rewards in finite Markov decision processes (MDPs). In MDPs the Q-values are equal to the expected immediate reward plus the expected future rewards. The latter are related to bias problems in temporal difference (TD) learning and to high variance problems in Monte Carlo (MC) learning. Both problems are even more severe when re…

    Submitted 10 September, 2019; v1 submitted 20 June, 2018; originally announced June 2018.

    Comments: 9 Pages plus appendix. For videos https://goo.gl/EQerZV