
De Novo Drug Molecule Design by Multi-Objective Reinforcement Learning for Polypharmacology

Xuhan Liu1, Kai Ye2, Herman W. T. van Vlijmen1,3, Adriaan P. IJzerman1, Gerard J. P. van Westen1

1Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
2Omics and Omics informatics, Xi’an Jiaotong University, 28 Xianning W Rd, Xi’an, China
3Janssen Pharmaceutica NV, Beerse, Belgium
Over the last five years deep learning has progressed tremendously in both image recognition and natural language processing [1], and it is now increasingly applied to other data-rich fields. In drug discovery, recurrent neural networks (RNNs) have been shown to be an effective method for generating novel chemical structures in the form of SMILES [2, 3]. Our group previously proposed a method named DrugEx, which integrates an exploration strategy into RNN-based reinforcement learning to improve the diversity of the generated molecules [4].
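As a minimal illustration of this generative step (a sketch under assumed PyTorch conventions, not the DrugEx implementation; the model interface, vocabulary object, and token names are hypothetical), an RNN language model can sample a SMILES string one token at a time:

    import torch

    def sample_smiles(model, vocab, max_len=100):
        # Sample one SMILES string from an RNN that predicts the next token.
        token = torch.tensor([vocab["GO"]])        # start-of-sequence token (assumed name)
        hidden = None                              # RNN hidden state
        chars = []
        for _ in range(max_len):
            logits, hidden = model(token, hidden)  # next-token logits, shape (1, vocab size)
            probs = torch.softmax(logits, dim=-1)
            token = torch.multinomial(probs, num_samples=1).view(1)
            if token.item() == vocab["EOS"]:       # end-of-sequence token (assumed name)
                break
            chars.append(vocab.to_char[token.item()])  # index -> SMILES character (assumed helper)
        return "".join(chars)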

Most current deep learning-based methods focus on a single target to generate drug-like active molecules. In reality, however, drug molecules often interact with more than one target, and unintended drug-target interactions can cause adverse effects [5]. Here, we extend our DrugEx model to multi-objective optimization in order to generate synthesizable drug molecules with selectivity toward multiple targets (e.g. the adenosine receptors A1, A2A, A2B, and A3).

In this model, two deep neural networks (DNNs) interact with each other within a reinforcement learning framework. We apply an RNN as the agent and a multi-task fully connected DNN as the environment. Ligands annotated in bioactivity assays on the adenosine receptors were collected from ChEMBL [6]. The environment was then trained to predict, for each protein target, the probability that a generated molecule is active. The agent was first pre-trained for molecular library generation and subsequently trained under the guidance of a reward, defined as the weighted sum of the per-target prediction scores, to generate the desired molecules. During the reinforcement learning loop, increasingly many desired molecules appear until the algorithm converges.
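The following sketch shows how such a weighted-sum reward and policy update could look in code (an assumed PyTorch-style REINFORCE step; the agent.sample and predictor interfaces are hypothetical, and the published DrugEx training loop may differ in its details):

    import torch

    def weighted_reward(scores, weights):
        # Reward = sum over targets of w_i * s_i, where s_i is the predicted
        # probability of activity against target i.
        return (weights * scores).sum(dim=-1)

    def reinforce_step(agent, predictor, optimizer, weights, batch_size=64):
        smiles, log_probs = agent.sample(batch_size)  # SMILES + log-likelihoods (assumed interface)
        scores = predictor(smiles)                    # per-target scores, shape (batch, n_targets)
        rewards = weighted_reward(scores, weights)    # shape (batch,)
        # REINFORCE: maximize expected reward, i.e. minimize -E[R * log pi]
        loss = -(rewards.detach() * log_probs).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return rewards.mean().item()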

In our proof of concept, the model generated compounds with diverse predicted selectivity profiles toward the multiple targets. Hence, our model can generate molecules with potentially high efficacy and reduced toxicity caused by off-target effects.

References
1. LeCun, Y., Y. Bengio, and G. Hinton, Deep learning. Nature, 2015. 521(7553): p. 436-44.
2. Sanchez-Lengeling, B., et al., Optimizing distributions over molecular space. An Objective-Reinforced Generative Adversarial Network for Inverse-design Chemistry (ORGANIC). ChemRxiv, 2017.
3. Olivecrona, M., et al., Molecular de-novo design through deep reinforcement learning. J Cheminform, 2017. 9(1): p. 48.
4. Liu, X., et al., An Exploration Strategy Improves the Diversity of de novo Ligands Using Deep Reinforcement Learning – A Case for the Adenosine A2A Receptor. ChemRxiv, 2018. DOI: 10.26434/chemrxiv.7436789.v2.
5. Anighoro, A., J. Bajorath, and G. Rastelli, Polypharmacology: challenges and opportunities in drug discovery. J Med Chem, 2014. 57(19): p. 7874-87.
6. Gaulton, A., et al., ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res, 2012. 40(Database issue): p. D1100-7.
