Virtual Toxicity Panel Screens to aid the Medicinal Chemist
Ed J Griffen1, Alexander G. Dossetter1, Andrew Leach2,1, Shane Montague1, Lauren Reid3, Jessica Stacey4, Arkadii Lin5,6
1MedChemica Ltd
2Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, Merseyside, United Kingdom
3Bioinformatics Institute (A*STAR), 30 Biopolis Street, Matrix, Singapore 138671
4Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom
5Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 rue Blaise Pascal, Strasbourg, 67000, France
6Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorferstrasse 65, 88397 Biberach an der Riss, Germany
Helping chemists make data-driven decisions is one of the key goals of modelling within drug and agrochemical research. Unforeseen toxicity via secondary pharmacology is a significant risk and, when encountered late in a discovery project’s life, creates major issues and may even terminate the project. Early awareness of such toxicity risks enables chemists to prioritize screening resources or, when in vivo toxicity is observed, to focus on the most probable biological targets. Although significant progress has been made in reducing safety-related attrition by controlling physicochemical properties, this approach appears to be reaching its limit.[1] Matched Molecular Pair Analysis (MMPA)[2] is a highly successful method for presenting aggregated data to chemists to inform their decision making. A critical lesson from that area is that, to influence chemists, it is essential to supply chemical structural data in addition to numeric guidance.[3] Here we demonstrate how exploiting the properties of random forest and kNN models with refined descriptors can provide both toxicity alerts and structural features in a format designed to help the chemist. Both positive and negative features of molecules can be identified, directing the chemist towards areas of a molecule to alter and those to leave untouched. Critically, these models are highly auditable, in contrast to other machine learning architectures. This enables detailed analysis of the underlying drivers in a model and of the areas where a model may be secure or fragile. Chemists can understand the precision of a model’s predictions and its applicability domain, and can interrogate the underlying supporting data in a manner similar to matched molecular pair analysis. Biological target SAR similarity can also be extracted to identify other biological targets where the molecules may also be active.
The development of virtual panel screens for sets of critical in vitro toxicity targets in the cardiac and CNS areas is demonstrated.[4]
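The auditable nearest-neighbour idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, compound labels, the choice of Tanimoto similarity over sets of structural features, and the `domain_cutoff` value are all assumptions made for illustration. The point is that a kNN alert can return the supporting compounds and a simple applicability-domain flag alongside the prediction itself.

```python
from typing import NamedTuple


class Neighbour(NamedTuple):
    name: str
    similarity: float
    toxic: bool


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_alert(query, training, k=3, domain_cutoff=0.3):
    """Return (toxic fraction among the k nearest neighbours,
    in-domain flag, the neighbours themselves).

    `domain_cutoff` is a hypothetical threshold: if even the closest
    training compound is less similar than this, the query is treated
    as outside the model's applicability domain.
    """
    if not training:
        raise ValueError("training set is empty")
    scored = [
        Neighbour(name, tanimoto(query, feats), toxic)
        for name, feats, toxic in training
    ]
    # Most similar compounds first; ties keep their original order.
    neighbours = sorted(scored, key=lambda n: n.similarity, reverse=True)[:k]
    toxic_fraction = sum(n.toxic for n in neighbours) / len(neighbours)
    in_domain = neighbours[0].similarity >= domain_cutoff
    return toxic_fraction, in_domain, neighbours


# Hypothetical training data: (compound name, structural features, toxic?)
TRAINING = [
    ("cmpd_A", frozenset({"basic_amine", "aryl_ring", "lipophilic_tail"}), True),
    ("cmpd_B", frozenset({"basic_amine", "aryl_ring", "ether"}), True),
    ("cmpd_C", frozenset({"carboxylic_acid", "aryl_ring"}), False),
    ("cmpd_D", frozenset({"carboxylic_acid", "amide"}), False),
]

query = frozenset({"basic_amine", "aryl_ring", "amide"})
toxic_fraction, in_domain, neighbours = knn_alert(query, TRAINING, k=3)

# The neighbours are the audit trail: the chemist can inspect exactly
# which compounds, and which shared features, drive the alert.
for n in neighbours:
    print(f"{n.name}: similarity={n.similarity:.2f}, toxic={n.toxic}")
print(f"toxic fraction={toxic_fraction:.2f}, in domain={in_domain}")
```

In this toy case the two most similar compounds share the basic amine and aryl ring with the query and are both labelled toxic, so those features would be candidates to alter, while the returned similarities show how well the prediction is supported by the training data.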
References:
[1] Waring MJ, Arrowsmith J, Leach AR, Leeson PD, Mandrell S, Owen RM, et al. An analysis of the attrition of drug candidates from four major pharmaceutical companies. Nature Reviews Drug Discovery 2015;14:475–86. doi:10.1038/nrd4609.