Gian Marco Ghiandoni Abstract

Reaction Class Recommendation in De novo Drug Design

Gian Marco Ghiandoni1, Beining Chen2, Dimitar Hristozov3, Michael J. Bodkin3, Valerie J. Gillet1

1Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, United Kingdom
2Chemistry Department, University of Sheffield, Dainton Building, Brook Hill, Sheffield, S3 7HF, United Kingdom
3Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, United Kingdom

De novo design is a branch of chemoinformatics concerned with the rational design of tailored molecular structures characterised by desired pharmacodynamic and pharmacokinetic properties. Scoring, construction, and search-based methods are the main components that have been developed to explore efficiently the drug-like chemical space, which has been estimated at 10^60 chemical structures that meet Lipinski’s Rule-of-Five (RO5).[1][2]

The introduction of knowledge-based construction techniques, which combine starting materials and reagents according to a set of rules extracted from collections of known reactions, has reduced the chemical space to a smaller number of potentially accessible structures. For example, reaction vectors, which are representations that encode the topological changes occurring in chemical transformations, have been implemented in a structure generation algorithm that outputs compound structures together with synthetic references in order to promote their accessibility.[3] In more recent work, reaction vectors have been classified by reaction type so that reaction classes can be applied to further support medicinal chemists and to limit the chemical space to regions of interest.[4]
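The idea of a reaction vector as an encoding of what a transformation removes and adds can be illustrated with a toy sketch. It assumes that a molecule is described by a plain multiset of bond-type labels (for example "C-OH"); these labels, the function names, and the amide-formation example are hypothetical stand-ins chosen for illustration and are not the descriptors or the algorithm of the published method.[3] The sketch works on feature counts only and does not rebuild a molecular structure.

```python
from collections import Counter


def reaction_vector(reactant_features, product_features):
    """Signed difference of feature counts: positive = gained, negative = lost."""
    reactant = Counter(reactant_features)
    product = Counter(product_features)
    keys = set(reactant) | set(product)
    return {k: product[k] - reactant[k] for k in keys if product[k] != reactant[k]}


def is_applicable(vector, molecule_features):
    """A vector applies only if the molecule holds every feature the reaction removes."""
    counts = Counter(molecule_features)
    return all(counts[k] >= -delta for k, delta in vector.items() if delta < 0)


def apply_vector(vector, molecule_features):
    """Add the vector to the feature counts of a new starting material."""
    if not is_applicable(vector, molecule_features):
        raise ValueError("starting material lacks features removed by this reaction")
    counts = Counter(molecule_features)
    for feature, delta in vector.items():
        counts[feature] += delta
    return +counts  # unary plus drops features whose count fell to zero


# Hypothetical known reaction: acetic acid + methylamine -> N-methylacetamide.
known_reactants = ["C-C", "C=O", "C-OH", "C-N", "N-H", "N-H"]
known_product = ["C-C", "C=O", "C-N", "C-N", "N-H"]
amide_vector = reaction_vector(known_reactants, known_product)

# New starting materials: propanoic acid + ethylamine (same toy encoding).
new_reactants = ["C-C", "C-C", "C=O", "C-OH", "C-C", "C-N", "N-H", "N-H"]
new_product = apply_vector(amide_vector, new_reactants)
print(amide_vector)
print(new_product)
```

In this toy encoding the applicability check plays the role of the rule: a vector is only applied to starting materials that contain everything the known reaction consumed.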

Herein, we build on the reaction classification methods to further exploit this knowledge for de novo design. A series of multi-label machine learning models is trained to recommend lists of applicable reaction classes based on the features of a given starting material. These lists can be used to drive the automated generation of product libraries supported by the past experience of medicinal chemists. The best performing model is then validated in a de novo design experiment to highlight the benefits of using recommendation models. We conclude that recommendation models lead to higher synthetic accessibility of the products and to a more efficient use of computational resources.
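As a rough illustration of multi-label recommendation, the sketch below ranks reaction classes for a query starting material by similarity-weighted votes from its nearest neighbours in a small training set. The abstract does not state which model types or molecular features were used, so the nearest-neighbour approach, the Tanimoto similarity on feature sets, the feature labels, the reaction class names, and the parameters k and threshold are all assumptions made for this example.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_classes(query, training_set, k=3, threshold=0.5):
    """Return (class, score) pairs voted for by the k most similar training molecules.

    Each neighbour votes for all of its reaction classes (multi-label), weighted
    by its similarity to the query; scores are normalised to the range 0..1.
    """
    ranked = sorted(training_set, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = defaultdict(float)
    total = 0.0
    for features, classes in ranked[:k]:
        similarity = tanimoto(query, features)
        total += similarity
        for reaction_class in classes:
            votes[reaction_class] += similarity
    if total == 0.0:
        return []  # no neighbour shares any feature with the query
    scored = [(c, v / total) for c, v in votes.items() if v / total >= threshold]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical training data: (features of a starting material, classes it was used in).
TRAINING = [
    (frozenset({"carboxylic_acid", "aryl"}), {"amide_coupling", "esterification"}),
    (frozenset({"carboxylic_acid", "alkyl"}), {"amide_coupling"}),
    (frozenset({"aryl_halide", "aryl"}), {"suzuki_coupling", "buchwald_hartwig"}),
    (frozenset({"primary_amine", "alkyl"}), {"amide_coupling", "reductive_amination"}),
]

query = frozenset({"carboxylic_acid", "aryl", "alkyl"})
recommended = recommend_classes(query, TRAINING, k=2, threshold=0.4)
print(recommended)
```

A structure generator could then apply only the reactions belonging to the recommended classes, which is one way such a list might reduce the number of transformations attempted per starting material.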

[1] Hartenfeller, M. & Schneider, G. (2011). Wiley Interdiscip. Rev. Comput. Mol. Sci., 1, 742-759.
[2] Walters, W. P. (2019). J. Med. Chem., 62 (3), 1116-1124.
[3] Patel, H., Bodkin, M. J., Chen, B. & Gillet, V. J. (2009). J. Chem. Inf. Model., 49, 1163-1184.
[4] Ghiandoni, G., Chen, B., Bodkin, M. & Gillet, V. J. (2018). Poster Exhibition at UK QSAR, Cardiff (UK), 11th-12th April.