AI with Life: Machine Learning for Antibiotic Screening
McKerlie1*, C. Kern2, B.J. Howlin1, M. Sacchi1 and D. Gems2
1School of Chemistry and Chemical Engineering, FEPS, University of Surrey, Guildford, Surrey, GU2 7XH, UK
2Institute of Healthy Ageing, and Research Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
Caenorhabditis elegans is a microscopic roundworm commonly used to study mechanisms of ageing and interventions that extend lifespan. The DrugAge database [1] was developed to record lifespan extension by pharmacological intervention in various laboratory models, including C. elegans. We have developed an AI model based on Random Forests that predicts the features of the drugs that most enhance lifespan [2]; this is an improvement on other similar approaches, which typically use C. elegans and databases like DrugAge to identify potential novel lifespan-extending compounds. However, lifespan in C. elegans is limited not only by age-related disease but also by a life-shortening pharyngeal infection that affects around 40 percent of the population under standard culture conditions, in which C. elegans is cultured on E. coli [3]. Moreover, C. elegans is highly impermeable to drugs.
Is it possible that a large proportion of the compounds returned by computational approaches like this are artefacts of confounding variables such as drug uptake and effects on bacterial pathogenicity? Here we explore this possibility and show, using Lipinski’s rule of 5, that drug uptake plays only a minor role in the detection of potential life-extending compounds. In contrast, drug properties associated with suppression of infection confound the detection of potential “anti-ageing” interventions, because such drugs extend lifespan by suppressing infection. Our approach is to use machine learning models to highlight and remove possible antibiotics from the DrugAge database. A random forest classifier built on 1024-bit ECFPs (extended-connectivity fingerprints) was trained on a dataset constructed by a group at MIT [4], yielded a ROC AUC of 0.811 on the test set, and was then deployed on the DrugAge database. Notably, of the 304 molecules identified as life-extending drugs, 22 (7.2%) were antibiotics.
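As an illustration of the rule-of-5 check mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the four descriptors (molecular weight, logP, hydrogen-bond donors and acceptors) have already been computed by a cheminformatics toolkit. The names `Descriptors`, `lipinski_violations` and `passes_rule_of_five` are hypothetical, the two example compounds carry made-up values, and the tolerance of one violation is a common convention rather than something stated in this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int      # number of hydrogen-bond donors
    h_bond_acceptors: int   # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: tolerate at most one violation."""
    return lipinski_violations(d) <= max_violations


# Illustrative, made-up descriptor values (not taken from DrugAge).
small = Descriptors(mol_weight=180.2, log_p=1.2,
                    h_bond_donors=1, h_bond_acceptors=4)
large = Descriptors(mol_weight=1449.3, log_p=-3.1,
                    h_bond_donors=19, h_bond_acceptors=26)

for name, desc in [("small", small), ("large", large)]:
    print(name, lipinski_violations(desc), passes_rule_of_five(desc))
```

Comparing the fraction of rule-of-5 passes among lifespan-extending and non-extending compounds would be one way to gauge how strongly uptake-related properties influence the screen.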
1. Barardo, D., et al. (2017) “The DrugAge database of ageing-related drugs.” Aging Cell, 16, 594-597.
2. Kapsiani, S., Howlin, B.J. (2021) “Random forest classification for predicting lifespan-extending chemical compounds.” Scientific Reports, 11, 13812.
3. Zhao, Y., Gilliat, A.F., Ziehm, M., Turmaine, M., Wang, H., Ezcurra, M., Yang, C., Phillips, G., McBay, D., Zhang, W.B., Partridge, L., Pincus, Z., Gems, D. (2017) “Two forms of death in ageing Caenorhabditis elegans.” Nature Communications, 8, 15418.
4. Stokes, J.M., Yang, K., Swanson, K., Jin, W., Cubillos-Ruiz, A., Donghia, N.M., MacNair, C.R., French, S., Carfrae, L.A., Bloom-Ackermann, Z., Tran, V.M., Chiappino-Pepe, A., Badran, A.H., Andrews, I.W., Chory, E.J., Church, G.M., Brown, E.D., Jaakkola, T.S., Barzilay, R., Collins, J.J. (2020) “A deep learning approach to antibiotic discovery.” Cell, 180, 688-702.