It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design

Guo, Jeff; Schwaller, Philippe

Abstract: Constrained synthesizability is an unaddressed challenge in generative molecular design. The challenge is to design molecules that satisfy multi-parameter optimization objectives while also being synthesizable and enforcing the presence of specific commercial building blocks in the synthesis. This is practically important for molecule re-purposing, sustainability, and efficiency. In this work, we propose a novel reward function called TANimoto Group Overlap (TANGO), which uses chemistry principles to transform a sparse reward function into a dense and learnable one, a property that is crucial for reinforcement learning. TANGO can augment general-purpose molecular generative models so that, using reinforcement learning, they directly optimize for constrained synthesizability while simultaneously optimizing for other properties relevant to drug discovery. Our framework is general and addresses starting-material, intermediate, and divergent synthesis constraints. Contrary to most existing works in the field, we show that incentivizing a general-purpose model (one without any inductive biases) is a productive approach to navigating challenging optimization scenarios. We demonstrate this by showing that the trained models explicitly learn a desirable distribution. Our framework is the first generative approach to tackle constrained synthesizability.
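The sparse-versus-dense distinction in the abstract can be illustrated with a toy sketch. This is not the TANGO definition, which the abstract does not spell out. It assumes, purely for illustration, that each building block is represented by a set of hypothetical feature labels and that a dense reward can be formed from the best Tanimoto similarity between any block in a proposed route and the enforced block. The function names `sparse_reward` and `dense_reward` and all feature labels are invented for this example.

```python
from typing import FrozenSet, Iterable


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def sparse_reward(route: Iterable[FrozenSet[str]], enforced: FrozenSet[str]) -> float:
    """1.0 only if the enforced block appears exactly in the route, else 0.0."""
    return 1.0 if any(block == enforced for block in route) else 0.0


def dense_reward(route: Iterable[FrozenSet[str]], enforced: FrozenSet[str]) -> float:
    """Best similarity between any route block and the enforced block (assumed form)."""
    return max((tanimoto(block, enforced) for block in route), default=0.0)


# Hypothetical feature labels standing in for real chemical descriptors.
enforced = frozenset({"aryl", "amide", "fluoro", "piperidine"})
route_far = [frozenset({"aryl", "ester"}), frozenset({"alkyl", "amine"})]
route_near = [frozenset({"aryl", "amide", "fluoro"}), frozenset({"alkyl", "amine"})]
route_hit = [enforced, frozenset({"alkyl", "amine"})]

for name, route in [("far", route_far), ("near", route_near), ("hit", route_hit)]:
    print(name, sparse_reward(route, enforced), dense_reward(route, enforced))
```

Under these assumptions, the sparse reward cannot distinguish `route_far` from `route_near` because both score zero, whereas the dense reward ranks `route_near` higher, giving a reinforcement learning agent a gradient-like signal toward the enforced block.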

Subjects:	Biomolecules (q-bio.BM); Machine Learning (cs.LG)
Cite as:	arXiv:2410.11527 [q-bio.BM]
	(or arXiv:2410.11527v1 [q-bio.BM] for this version)
	https://doi.org/10.48550/arXiv.2410.11527

