Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Gottipati, Sai Krishna; Sattarov, Boris; Niu, Sufeng; Pathak, Yashaswi; Wei, Haoran; Liu, Shengchao; Thomas, Karam M. J.; Blackburn, Simon; Coley, Connor W.; Tang, Jian; Chandar, Sarath; Bengio, Yoshua

arXiv:2004.12485 (cs)

[Submitted on 26 Apr 2020 (v1), last revised 20 May 2020 (this version, v2)]

Abstract: Over the last decade, there has been significant progress in the field of machine learning for de novo drug design, particularly in deep generative models. However, current generative approaches face a significant challenge: they neither ensure that the proposed molecular structures can be feasibly synthesized nor provide synthesis routes for the proposed small molecules, which seriously limits their practical applicability. In this work, we propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design, Policy Gradient for Forward Synthesis (PGFS), that addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo drug design system. In this setup, the agent learns to navigate the immense synthetically accessible chemical space by subjecting commercially available small molecule building blocks to valid chemical reactions at every time step of the iterative virtual multi-step synthesis process. The proposed environment for drug discovery provides a highly challenging test bed for RL algorithms owing to the large state space and the high-dimensional continuous action space with hierarchical actions. PGFS achieves state-of-the-art performance in generating structures with high QED and penalized clogP. Moreover, we validate PGFS in an in silico proof of concept associated with three HIV targets. Finally, we describe how the end-to-end training conceptualized in this study represents an important paradigm for radically expanding the synthesizable chemical space and automating the drug discovery process.
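To make the forward-synthesis loop in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes one plausible reading of "hierarchical actions": the agent first picks a reaction, then emits a continuous vector that is mapped to the nearest building block valid for that reaction. All names (`Reaction`, `BUILDING_BLOCKS`, `nearest_valid_block`, `step`, `rollout`) are hypothetical, molecules are plain strings, and the 2-D feature vectors are made up for illustration.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Vector = Tuple[float, float]


@dataclass(frozen=True)
class Reaction:
    """Toy reaction: a name plus a rule for which building blocks it accepts."""
    name: str
    accepts: Callable[[str], bool]


# Hypothetical catalogue of purchasable building blocks with made-up 2-D features.
BUILDING_BLOCKS: Dict[str, Vector] = {
    "A1": (0.0, 0.0),
    "A2": (1.0, 0.0),
    "B1": (0.0, 1.0),
    "B2": (1.0, 1.0),
}

# Hypothetical reactions; each is only valid with a subset of the catalogue.
REACTIONS: List[Reaction] = [
    Reaction("couple_A", lambda block: block.startswith("A")),
    Reaction("couple_B", lambda block: block.startswith("B")),
]


def nearest_valid_block(reaction: Reaction, proto_action: Vector) -> str:
    """Map a continuous action to the closest building block the reaction accepts."""
    valid = [b for b in BUILDING_BLOCKS if reaction.accepts(b)]
    return min(valid, key=lambda b: math.dist(BUILDING_BLOCKS[b], proto_action))


def step(state: str, reaction_index: int, proto_action: Vector) -> str:
    """One virtual synthesis step: the toy 'product' is the state joined to the block."""
    reaction = REACTIONS[reaction_index]
    block = nearest_valid_block(reaction, proto_action)
    return f"{state}-{block}"


def rollout(start: str,
            policy: Callable[[str], Tuple[int, Vector]],
            n_steps: int) -> List[str]:
    """Apply the policy for n_steps and return every intermediate along the route."""
    route = [start]
    for _ in range(n_steps):
        reaction_index, proto_action = policy(route[-1])
        route.append(step(route[-1], reaction_index, proto_action))
    return route
```

Because every product in this sketch is built only from catalogue entries through allowed reactions, the returned route doubles as a synthesis plan, which is the property the abstract emphasizes. A real system would replace the string join with reaction templates applied by a cheminformatics toolkit and the fixed policy with a learned one.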

Comments:	Added the statistics of top-100 compounds. Used logP metric with scaled components. Added values of the initial reactants to the box plots. Some values in tables are recalculated due to the inconsistent environments on different machines; corresponding benchmarks were rerun with the requirements on GitHub. No significant changes in the results. Corrected figures in the Appendix.
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2004.12485 [cs.LG]
	(or arXiv:2004.12485v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2004.12485

Submission history

From: Vijaya Sai Krishna Gottipati
[v1] Sun, 26 Apr 2020 21:40:03 UTC (9,467 KB)
[v2] Wed, 20 May 2020 03:28:15 UTC (6,212 KB)
