AlphaFold Meets Drug Design: A Novel Method for de novo Drug Discovery
Andrius Bernatavicius1,2, Anthe Janssen2,3 and Gerard van Westen2
1) Leiden Academic Centre of Drug Research, Leiden University, The Netherlands
2) Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands
3) Leiden Institute of Chemistry, Leiden University, The Netherlands

Here we present a novel approach for de novo drug discovery that uses the latent space of AlphaFold [1] to condition a generative model on a particular target protein. We have previously shown that adding protein information improves the performance of QSAR models, an approach we refer to as proteochemometrics (PCM) [2, 3]. Using AlphaFold protein embeddings within a generative model for small molecules allows the model to capture the structural relationships between targets. This opens up new possibilities, such as interpolation within the chemical space of known highly active compounds and extrapolation on the target side. We show that a single model can generate potentially active, novel, and diverse molecules for a broad range of target proteins from the ChEMBL [4] database, including targets with limited ligand bioactivity data. Additionally, the trained model and the extracted protein embeddings are available as a PyMOL [5] plugin, making the model more accessible and easier to use.
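As an illustration of what conditioning a molecule generator on a protein embedding can look like, the following is a minimal sketch and not the model described above. It assumes that per-residue embedding vectors are averaged into one fixed-size protein vector and that this vector is appended to every input token embedding; both choices, and the function names `mean_pool` and `condition_tokens`, are hypothetical and used only for illustration.

```python
from typing import List, Sequence


def mean_pool(residue_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue vectors into one fixed-size protein vector.

    Assumption: mean pooling is used here only as a simple way to obtain
    a fixed-size vector from a variable-length protein.
    """
    if not residue_embeddings:
        raise ValueError("at least one residue embedding is required")
    dim = len(residue_embeddings[0])
    if any(len(vec) != dim for vec in residue_embeddings):
        raise ValueError("all residue embeddings must have the same size")
    n = len(residue_embeddings)
    return [sum(vec[i] for vec in residue_embeddings) / n for i in range(dim)]


def condition_tokens(
    token_embeddings: Sequence[Sequence[float]],
    protein_vector: Sequence[float],
) -> List[List[float]]:
    """Append the protein vector to every token embedding.

    Assumption: concatenation is one common conditioning scheme; the
    abstract does not state which scheme the authors use.
    """
    return [list(tok) + list(protein_vector) for tok in token_embeddings]


# Toy data: two residues with 2-dimensional embeddings,
# and two molecule tokens with 1-dimensional embeddings.
protein_vector = mean_pool([[1.0, 3.0], [3.0, 5.0]])
conditioned = condition_tokens([[0.1], [0.2]], protein_vector)
```

Each conditioned token now carries the same target information, so a downstream sequence model could in principle produce different molecules for different targets from the same starting tokens.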
References
[2] Gerard J. P. van Westen et al. “Proteochemometric modeling as a tool to design selective compounds and for extrapolating to novel targets”. In: Med. Chem. Commun. 2.1 (2011), pp. 16–30. doi: 10.1039/C0MD00165A. url: http://dx.doi.org/10.1039/C0MD00165A.
[3] Eelke B. Lenselink et al. “Beyond the hype: deep neural networks out-perform established methods using a ChEMBL bioactivity benchmark set”. In: Journal of Cheminformatics 9.1 (Aug. 2017), p. 45. issn: 1758-2946. doi: 10.1186/s13321-017-0232-0. url: https://doi.org/10.1186/s13321-017-0232-0.
[4] A. Gaulton et al. “ChEMBL: a large-scale bioactivity database for drug discovery”. In: Nucleic Acids Res. 40.D1 (Jan. 2012), pp. D1100–D1107.
[5] Schrödinger, LLC and Warren DeLano. PyMOL. Version 2.4.0. May 20, 2020. url: http://www.pymol.org/pymol.