Maximilian Beckers Abstract

Leveraging Large-scale in silico ADMET Predictions to Estimate Small Molecule Developability

Maximilian Beckers, Noé Sturm, Nikolas Fechner and Nikolaus Stiefl
Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland


Scoring and prioritizing compounds and chemical series with in silico methods is an important task in drug discovery projects. The aim is to prioritize the right series out of a large chemical space in early phases and to select the best compounds for synthesis during optimization. Several approaches to this task have been developed over the past decades. Subsets of chemical space can, for example, be scored and selected using Lipinski’s rule of five [1], the QED score [2] or more recent techniques based on deep learning. Moreover, predictive models for ADMET properties can be used to select compounds with desirable properties, which are subsequently scored using multi-parameter optimization (MPO) functions. However, the quality of these predictions is often insufficient, and most compounds that comply with these criteria still tend not to be drug-like. Thus, better methods are needed for selecting the compounds and chemical series with the highest chances of delivering a development candidate (DC).
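As an illustration of the rule-based and MPO-style scoring mentioned above, the following is a minimal sketch in Python. It assumes that descriptor values and ADMET predictions have already been computed elsewhere. The rule-of-five thresholds follow Lipinski et al. [1]; the `Descriptors` class, the linear desirability ramp and the property ranges passed to `mpo_score` are hypothetical choices made for this example, not the MPO functions used in any particular project.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float       # molecular weight in Da
    logp: float             # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria [1] are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def desirability(value: float, low: float, high: float) -> float:
    """Linear ramp: 1.0 at or below `low`, 0.0 at or above `high`."""
    if value <= low:
        return 1.0
    if value >= high:
        return 0.0
    return (high - value) / (high - low)


def mpo_score(predictions: dict, ranges: dict) -> float:
    """Average the desirabilities of all properties listed in `ranges`.

    `ranges` maps a property name to an assumed (low, high) pair;
    lower predicted values are treated as more desirable.
    """
    scores = [
        desirability(predictions[name], low, high)
        for name, (low, high) in ranges.items()
    ]
    return sum(scores) / len(scores)
```

A compound would typically be kept if `rule_of_five_violations` returns at most one and then ranked by `mpo_score`. The limitation described above still applies: the score is only as reliable as the predictions fed into it.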

Following up on our previous reconstruction of chemical series in the in-house compound archives at Novartis [3], we present a novel deep learning approach that leverages ~100 ADMET predictions to assess the potential of a compound to deliver a relevant drug candidate [4]. The neural network was trained using the outputs of in-house predictive models and of the recently presented MELLODDY [5] models as input features, and the terminal development stage reached by each compound as the predicted variable. We evaluate the predictive performance on recent in-house data not seen during training and on a challenging public dataset assembled from ChEMBL that resembles the structure of proprietary compound archives. The resulting score, which we term the bPK score, substantially outperforms previous approaches and shows strong discriminative performance on datasets where previous approaches show none. We further apply methods from explainable AI to investigate what the model has learned, discuss the application to recently completed drug discovery projects, and present first case studies on the application to in silico generated virtual compounds.
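The general idea of mapping a vector of ADMET predictions to a single score and then measuring how well that score separates two classes of compounds can be sketched as follows. This is not the bPK model: the abstract does not specify the architecture, so the one-hidden-layer network, its weights, and the use of ROC AUC as the measure of discriminative performance are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_score(features, w_hidden, b_hidden, w_out, b_out) -> float:
    """Forward pass of a one-hidden-layer network with ReLU units.

    `features` stands in for a vector of ADMET predictions; the weights
    are hypothetical and would normally come from training.
    """
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return sigmoid(logit)


def roc_auc(scores, labels) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 for compounds that
    reached the stage of interest and 0 otherwise (assumed encoding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to no discrimination between the two classes, while 1.0 means every positive compound is scored above every negative one.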

[1] C. A. Lipinski et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev. 1997, 23, 3–25
[2] G. Bickerton et al. Quantifying the chemical beauty of drugs. Nat Chem. 2012, 4, 90–98
[3] M. Beckers et al. 25 Years of Small-Molecule Optimization at Novartis: A Retrospective Analysis of Chemical Series Evolution. J. Chem. Inf. Model. 2022, 62, 6002–6021
[4] M. Beckers et al. manuscript in preparation, 2023
[5] W. Heyndrickx et al. MELLODDY: cross pharma federated learning at unprecedented scale unlocks benefits in QSAR without compromising proprietary information. ChemRxiv. 2022