
Integrating Heterogeneous Assay Data for ML-based ADME Prediction

Moritz Walter, Dr Lina Humbeck and Dr Miha Skalic

Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397 Biberach an der Riss, Germany


ADME (absorption, distribution, metabolism, excretion) properties are key parameters for estimating whether a drug candidate exhibits the desired pharmacokinetic (PK) profile. In recent years, machine learning (ML) models that predict ADME properties from chemical structure have attracted growing attention as a replacement for experimental studies in the early prioritization of compounds. Pharmaceutical companies possess large ADME datasets that can, in principle, be used for ML modelling. One challenge lies in the heterogeneity of these data. For instance, an assay may be run both in-house and externally under slightly different conditions, or the protocol of an in-house assay may have changed over time. This raises the question of how such heterogeneous data should be treated to obtain the best-performing models. In addition, different endpoints may be related, so data from a data-rich assay (e.g., microsomal stability) might help a model predict a related assay with fewer experimental data points (e.g., hepatocyte stability).

In this study, multi-task (MT) modelling and model stacking (i.e., using the prediction for an auxiliary assay/endpoint as an input feature of the model for the target assay) were applied to integrate our in-house ADME data, and the resulting models were compared to single-task (ST) baselines. We also sought to understand under which circumstances different assays should be integrated and which modelling technique is best suited to a particular situation.
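The stacking idea described above can be illustrated with a minimal sketch. It assumes scikit-learn Random Forest regressors and generic numeric feature matrices (for example, fingerprint bits); the function names `fit_stacked_rf` and `predict_stacked`, the hyperparameters, and the data layout are hypothetical and are not taken from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def fit_stacked_rf(X_aux, y_aux, X_tgt, y_tgt, seed=0):
    """Fit an auxiliary-assay model, then a target-assay model that
    receives the auxiliary prediction as one extra input feature."""
    aux_model = RandomForestRegressor(n_estimators=100, random_state=seed)
    aux_model.fit(X_aux, y_aux)

    # Assumption: auxiliary predictions for the target-assay compounds are
    # taken directly from the fitted model. If these compounds are also in
    # the auxiliary training set, the predictions are in-sample, and
    # out-of-fold predictions may be preferable in practice.
    aux_pred = aux_model.predict(X_tgt)
    X_stacked = np.column_stack([X_tgt, aux_pred])

    tgt_model = RandomForestRegressor(n_estimators=100, random_state=seed)
    tgt_model.fit(X_stacked, y_tgt)
    return aux_model, tgt_model


def predict_stacked(aux_model, tgt_model, X, aux_measured=None):
    """Predict the target assay. If measured auxiliary values are supplied,
    they replace the auxiliary model's predictions as the extra feature."""
    if aux_measured is None:
        aux_feature = aux_model.predict(X)
    else:
        aux_feature = np.asarray(aux_measured, dtype=float)
    return tgt_model.predict(np.column_stack([X, aux_feature]))
```

Calling `predict_stacked` with `aux_measured=None` corresponds to the case where only structures are known for the test compounds; passing measured auxiliary values corresponds to the case where experimental auxiliary labels are available at prediction time.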

Preliminary results show that both model stacking (using Random Forest models) and MT graph-convolutional neural networks (Chemprop) can yield marked improvements over ST models (see Table 1). With a realistic temporal split, Chemprop achieved the highest score for predicting hepatocyte stability when trained jointly with the microsomal stability assay. When experimental microsomal stability values were additionally provided to the models for the test set, model stacking and Chemprop performed approximately equally well.
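As an illustration of the evaluation setup, the sketch below shows one way a temporal split and an R2 score could be computed. It assumes that each record carries a measurement date, that the split is made at a single cutoff date, and that R2 denotes the coefficient of determination; none of these details are specified in this abstract, and the names `temporal_split` and `r2_score` are hypothetical.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Split records into train (measured before the cutoff) and
    test (measured on or after the cutoff)."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical records: compound identifier, measurement date, assay value.
records = [
    {"id": "cpd-1", "date": date(2019, 3, 1), "value": 1.2},
    {"id": "cpd-2", "date": date(2020, 6, 15), "value": 0.8},
    {"id": "cpd-3", "date": date(2021, 1, 10), "value": 1.9},
    {"id": "cpd-4", "date": date(2022, 9, 5), "value": 1.4},
]
train, test = temporal_split(records, cutoff=date(2021, 1, 1))
```

Unlike a random split, a split of this kind evaluates the model on compounds measured later than any training compound, which is closer to how a model is used prospectively.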

Table 1. R² scores for predicting hepatocyte stability, either as a single task (ST) or integrated with microsomal stability. For the integrated models, scores are given without and with experimental auxiliary (microsomal stability) labels for the test set. The best score in each setting is marked with an asterisk (*). RF: Random Forest.

Model          Without auxiliary test labels   With auxiliary test labels
ST (RF)        0.343                           –
Stacked RF     0.389                           0.525*
MT-Chemprop    0.426*                          0.522