PKSmart: An Open-Source Computational Model to Predict in vivo Pharmacokinetics of Small Molecules
Srijit Seal1, Ola Spjuth2 and Andreas Bender1
1 Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, CB2 1EW, Cambridge, United Kingdom
2 Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124, Uppsala, Sweden
Early assessment of the pharmacokinetic (PK) properties of compounds is crucial, for example in Design-Make-Test-Analyze (DMTA) cycles. This work provides an open-source computational model, PKSmart, which requires only chemical structure as input. PKSmart predicts human PK parameters such as steady-state volume of distribution (VDss), total body clearance (CL), half-life (t½), fraction unbound in plasma (fu) and mean residence time (MRT). First, we trained models on 372 compounds using molecular structural fingerprints and physicochemical properties to predict animal PK parameters [VDss, CL, fu] for rats, dogs, and monkeys. Next, we trained Random Forest models using repeated nested cross-validation on 1,283 unique compounds, with Morgan fingerprints, Mordred descriptors and predicted animal PK parameters (obtained using the models above) as features, to predict human PK parameters [VDss, CL, t½, fu and MRT]. When validated on external test sets, models combining Morgan fingerprints, Mordred descriptors and predicted animal PK parameters predicted 42.5% of compounds for VDss, 72.1% of compounds for CL, 34.9% of compounds for fu, and 34.2% of compounds for t½ within a 2-fold error of the observed values, a standard metric for evaluating PK models in industry. We also estimated a fold error (and a range of predictions) for each compound based on its structural similarity to the training data, which revealed an increase in error as molecular similarity decreased. Users can access the web-hosted application PKSmart (pk-predictor.serve.scilifelab.se), with all code downloadable for local use. To the best of our knowledge, this is the first public in vivo PK model that can predict human and animal PK parameters using chemical structure alone as input.
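The 2-fold error criterion used to report the results above can be illustrated with a minimal sketch. It assumes the commonly used definition of fold error as the larger of predicted/observed and observed/predicted; the function names and the example values are hypothetical and are not taken from the PKSmart code or data.

```python
def fold_error(predicted, observed):
    """Fold error of one prediction: max(pred/obs, obs/pred).

    Assumes both values are strictly positive, as is the case for
    PK parameters such as VDss, CL, fu and t1/2.
    """
    if predicted <= 0 or observed <= 0:
        raise ValueError("predicted and observed values must be positive")
    return max(predicted / observed, observed / predicted)


def percent_within_fold(predicted, observed, fold=2.0):
    """Percentage of compounds whose fold error is at most `fold`."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [fold_error(p, o) for p, o in zip(predicted, observed)]
    within = sum(1 for e in errors if e <= fold)
    return 100.0 * within / len(errors)


# Hypothetical observed and predicted values for four compounds.
observed = [1.0, 10.0, 0.5, 4.0]
predicted = [1.5, 25.0, 0.3, 4.0]

# Fold errors are 1.5, 2.5, about 1.67 and 1.0, so three of four pass.
print(percent_within_fold(predicted, observed))  # 75.0
```

Under this definition a prediction counts as within 2-fold when it lies between half and twice the observed value, so over- and under-prediction are penalised symmetrically.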