Showing 1–15 of 15 results for author: Zubarev, D

  1. arXiv:2407.20267  [pdf, other]

    cs.LG cs.AI physics.chem-ph

    A Large Encoder-Decoder Family of Foundation Models For Chemical Language

    Authors: Eduardo Soares, Victor Shirasuna, Emilio Vital Brazil, Renato Cerqueira, Dmitry Zubarev, Kristin Schmidt

    Abstract: Large-scale pre-training methodologies for chemical language models represent a breakthrough in cheminformatics. These methods excel in tasks such as property prediction and molecule generation by learning contextualized representations of input tokens through self-supervised learning on large unlabeled corpora. Typically, this involves pre-training on unlabeled data followed by fine-tuning on spe…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 14 pages, 3 figures, 14 tables

  2. arXiv:2312.09733  [pdf, other]

    quant-ph cond-mat.mtrl-sci

    Quantum-centric Supercomputing for Materials Science: A Perspective on Challenges and Future Directions

    Authors: Yuri Alexeev, Maximilian Amsler, Paul Baity, Marco Antonio Barroca, Sanzio Bassini, Torey Battelle, Daan Camps, David Casanova, Young Jai Choi, Frederic T. Chong, Charles Chung, Chris Codella, Antonio D. Corcoles, James Cruise, Alberto Di Meglio, Jonathan Dubois, Ivan Duran, Thomas Eckl, Sophia Economou, Stephan Eidenbenz, Bruce Elmegreen, Clyde Fare, Ismael Faro, Cristina Sanz Fernández, Rodrigo Neumann Barros Ferreira, et al. (102 additional authors not shown)

    Abstract: Computational models are an essential tool for the design, characterization, and discovery of novel materials. Hard computational tasks in materials science stretch the limits of existing high-performance supercomputing centers, consuming much of their simulation, analysis, and data resources. Quantum computing, on the other hand, is an emerging technology with the potential to accelerate many of…

    Submitted 19 September, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 65 pages, 15 figures; comments welcome

    Journal ref: Future Generation Computer Systems, Volume 160, November 2024, Pages 666-710

  3. arXiv:2307.03811  [pdf]

    cond-mat.mtrl-sci cond-mat.dis-nn cs.LG

    Formulation Graphs for Mapping Structure-Composition of Battery Electrolytes to Device Performance

    Authors: Vidushi Sharma, Maxwell Giammona, Dmitry Zubarev, Andy Tek, Khanh Nugyuen, Linda Sundberg, Daniele Congiu, Young-Hye La

    Abstract: Advanced computational methods are being actively sought for addressing the challenges associated with discovery and development of new combinatorial materials such as formulations. A widely adopted approach involves domain-informed high-throughput screening of individual components that can be combined into a formulation. This manages to accelerate the discovery of new compounds for a target appli…

    Submitted 28 September, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: 35 pages, 10 figures

  4. arXiv:2306.14919  [pdf, other]

    physics.chem-ph cs.LG q-bio.QM

    Beyond Chemical Language: A Multimodal Approach to Enhance Molecular Property Prediction

    Authors: Eduardo Soares, Emilio Vital Brazil, Karen Fiorela Aquino Gutierrez, Renato Cerqueira, Dan Sanders, Kristin Schmidt, Dmitry Zubarev

    Abstract: We present a novel multimodal language model approach for predicting molecular properties by combining chemical language representation with physicochemical features. Our approach, MULTIMODAL-MOLFORMER, utilizes a causal multistage feature selection method that identifies physicochemical features based on their direct causal effect on a specific target property. These causal features are then inte…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 14 pages, 6 figures, 5 tables. Submitted to NeurIPS 2023, under review

    ACM Class: J.2; I.2.1

  5. arXiv:2304.05389  [pdf]

    cs.AI cs.HC

    Human-AI Co-Creation Approach to Find Forever Chemicals Replacements

    Authors: Juliana Jansen Ferreira, Vinícius Segura, Joana G. R. Souza, Gabriel D. J. Barbosa, João Gallas, Renato Cerqueira, Dmitry Zubarev

    Abstract: Generative models are a powerful tool in AI for material discovery. We are designing a software framework that supports a human-AI co-creation process to accelerate finding replacements for the "forever chemicals": chemicals that enable our modern lives but are harmful to the environment and human health. Our approach combines AI capabilities with the domain-specific tacit knowledge of sub…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: 5 pages, Generative AI and HCI (GenAICHI) Workshop at CHI 23 (ACM CHI Conference on Human Factors in Computing Systems)

  6. arXiv:2301.08750  [pdf, other]

    cs.LG cs.AI cs.IT

    Domain-agnostic and Multi-level Evaluation of Generative Models

    Authors: Girmaw Abebe Tadesse, Jannis Born, Celia Cintas, William Ogallo, Dmitry Zubarev, Matteo Manica, Komminist Weldemariam

    Abstract: While the capabilities of generative models heavily improved in different domains (images, text, graphs, molecules, etc.), their evaluation metrics largely remain based on simplified quantities or manual inspection with limited practicality. To this end, we propose a framework for Multi-level Performance Evaluation of Generative mOdels (MPEGO), which could be employed across different domains. MPE…

    Submitted 20 January, 2023; originally announced January 2023.

  7. arXiv:2211.04257  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Toward Human-AI Co-creation to Accelerate Material Discovery

    Authors: Dmitry Zubarev, Carlos Raoni Mendes, Emilio Vital Brazil, Renato Cerqueira, Kristin Schmidt, Vinicius Segura, Juliana Jansen Ferreira, Dan Sanders

    Abstract: There is an increasing need in our society to achieve faster advances in science to tackle urgent problems, such as climate change, environmental hazards, sustainable energy systems, and pandemics, among others. In certain domains like chemistry, scientific discovery carries the extra burden of assessing risks of the proposed novel solutions before moving to the experimental stage. Despite several re…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: 9 pages, 5 figures, NeurIPS 2022 WS: AI4Science

  8. arXiv:2208.00063  [pdf, other]

    cs.LG stat.AP

    Topology-Driven Generative Completion of Lacunae in Molecular Data

    Authors: Dmitry Yu. Zubarev, Petar Ristoski

    Abstract: We introduce an approach to the targeted completion of lacunae in molecular data sets which is driven by topological data analysis, such as the Mapper algorithm. Lacunae are filled in using scaffold-constrained generative models trained with different scoring functions. The approach enables addition of links and vertices to the skeletonized representations of the data, such as the Mapper graph, and falls…

    Submitted 29 July, 2022; originally announced August 2022.

    Comments: 10 pages, talk presented at APS March Meeting 2021

  9. arXiv:2112.01625  [pdf, other]

    cs.LG physics.chem-ph

    Sample-Efficient Generation of Novel Photo-acid Generator Molecules using a Deep Generative Model

    Authors: Samuel C. Hoffman, Vijil Chenthamarakshan, Dmitry Yu. Zubarev, Daniel P. Sanders, Payel Das

    Abstract: Photo-acid generators (PAGs) are compounds that release acids ($H^+$ ions) when exposed to light. These compounds are critical components of the photolithography processes that are used in the manufacture of semiconductor logic and memory chips. The exponential increase in the demand for semiconductors has highlighted the need for discovering novel photo-acid generators. While de novo molecule des…

    Submitted 2 December, 2021; originally announced December 2021.

  10. arXiv:2108.03044  [pdf]

    cs.CE

    Molecule Generation Experience: An Open Platform of Material Design for Public Users

    Authors: Seiji Takeda, Toshiyuki Hama, Hsiang-Han Hsu, Akihiro Kishimoto, Makoto Kogoh, Takumi Hongo, Kumiko Fujieda, Hideaki Nakashika, Dmitry Zubarev, Daniel P. Sanders, Jed W. Pitera, Junta Fuchiwaki, Daiju Nakano

    Abstract: Artificial Intelligence (AI)-driven material design has been attracting great attention as a groundbreaking technology across a wide spectrum of industries. Molecular design is particularly important owing to its broad application domains and boundless creativity attributed to progress in generative models. The recent maturity of molecular generative models has stimulated expectations for pract…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: 10 pages, 6 figures

  11. arXiv:2004.11521  [pdf]

    cs.CE physics.data-an

    Molecular Inverse-Design Platform for Material Industries

    Authors: Seiji Takeda, Toshiyuki Hama, Hsiang-Han Hsu, Victoria A. Piunova, Dmitry Zubarev, Daniel P. Sanders, Jed W. Pitera, Makoto Kogoh, Takumi Hongo, Yenwei Cheng, Wolf Bocanett, Hideaki Nakashika, Akihiro Fujita, Yuta Tsuchiya, Katsuhiko Hino, Kentaro Yano, Shuichi Hirose, Hiroki Toda, Yasumitsu Orii, Daiju Nakano

    Abstract: The discovery of new materials has been the essential force which brings a discontinuous improvement to industrial products' performance. However, the extra-vast combinatorial design space of material structures exceeds human experts' capability to explore all, thereby hampering material development. In this paper, we present a material industry-oriented web platform of an AI-driven molecular inve…

    Submitted 16 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: 9 pages, 7 figures, Accepted to KDD 2020

  12. arXiv:2001.09038  [pdf]

    cs.CE

    AI-driven Inverse Design System for Organic Molecules

    Authors: Seiji Takeda, Toshiyuki Hama, Hsiang-Han Hsu, Toshiyuki Yamane, Koji Masuda, Victoria A. Piunova, Dmitry Zubarev, Jed Pitera, Daniel P. Sanders, Daiju Nakano

    Abstract: Designing novel materials that possess desired properties is a central need across many manufacturing industries. Driven by that industrial need, a variety of algorithms and tools have been developed that combine AI (machine learning and analytics) with domain knowledge in physics, chemistry, and materials science. AI-driven materials design can be divided mainly into two stages; the first one is th…

    Submitted 20 January, 2020; originally announced January 2020.

  13. arXiv:1807.09754  [pdf]

    cs.IR cs.AI

    Data Infrastructure and Approaches for Ontology-Based Drug Repurposing

    Authors: Stephen Boyer, Thomas Griffin, Sarath Swaminathan, Kenneth L. Clarkson, Dmitry Zubarev

    Abstract: We report the development of a data infrastructure for drug repurposing that takes advantage of two currently available chemical ontologies. The data infrastructure includes a database of compound-target associations augmented with molecular ontological labels. It also contains two computational tools for prediction of new associations. We describe two drug-repurposing systems: one, Nascent Ontologic…

    Submitted 12 July, 2018; originally announced July 2018.

    Comments: 17 pages

  14. Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method

    Authors: James Oreluk, Zhenyuan Liu, Arun Hegde, Wenyu Li, Andrew Packard, Michael Frenklach, Dmitry Zubarev

    Abstract: We report an evaluation of a semi-empirical quantum chemical method PM7 from the perspective of uncertainty quantification. Specifically, we apply Bound-to-Bound Data Collaboration, an uncertainty quantification framework, to characterize a) variability of PM7 model parameter values consistent with the uncertainty in the training data, and b) uncertainty propagation from the training data to the m…

    Submitted 16 June, 2018; v1 submitted 12 June, 2018; originally announced June 2018.

    Journal ref: Scientific Reports 8, Article number: 13248 (2018)

  15. arXiv:1507.06160  [pdf, other]

    q-bio.MN nlin.AO physics.chem-ph

    Sustainability of Transient Kinetic Regimes and Origins of Death

    Authors: Dmitry Yu. Zubarev, Leonardo A. Pachón

    Abstract: It is generally recognized that a distinguishing feature of life is its peculiar capability to avoid equilibration. The origin of this capability and its evolution along the timeline of abiogenesis is not yet understood. We propose to study an analog of this phenomenon that could emerge in non-biological systems. To this end, we introduce the concept of sustainability of transient kinetic regimes.…

    Submitted 11 January, 2016; v1 submitted 22 July, 2015; originally announced July 2015.

    Comments: 11 pages, 5 figures

    Journal ref: Sci. Rep. 6, 20562 (2016)