Chemical Biology Resources at EMBL-EBI
Eloy Felix, Barbara Zdrazil, Ricardo Arcila, James Blackshaw, Nicolas Bosc, Fiona Hunter, Emma Manners, Maria Paula Magarinos, David Mendez Lopez, Juan F. Mosquera, Tevfik Kiziloren, Marleen de Veij and Andrew R. Leach
European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI)
ChEMBL is a database of bioactive molecules and associated experimentally determined activities sourced and curated from peer-reviewed scientific literature, selected patents, and a variety of directly deposited datasets. ChEMBL also offers comprehensive annotations of drugs and compounds in clinical development, enabling researchers to monitor the progression of compound and target properties throughout the drug discovery process.
SureChEMBL is a complementary database of compounds derived from the full text, images, and attachments of patent documents. The data is extracted from the patent literature daily by an automated text- and image-mining pipeline.
UniChem is a simple, large-scale, non-redundant database of pointers between chemical structure identifiers. Its purpose is to optimise the efficiency with which structure-based hyperlinks are built and maintained between chemistry databases.
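To illustrate the idea of "pointers between chemical structure identifiers", the following is a minimal sketch of a non-redundant cross-reference index. It is not UniChem's implementation or API: the class `CrossReferenceIndex`, its methods, and the source labels are hypothetical names chosen for illustration, and the sketch assumes each unique structure is keyed by a standard InChIKey. The aspirin identifiers are included only as example data.

```python
from collections import defaultdict


class CrossReferenceIndex:
    """Toy index (hypothetical): one entry per unique structure key,
    pointing to the identifiers that different sources use for it."""

    def __init__(self):
        # structure key -> set of (source, source_id) pairs
        self._by_structure = defaultdict(set)
        # (source, source_id) -> structure key, for reverse lookup
        self._by_source_id = {}

    def add(self, structure_key, source, source_id):
        """Register that `source` calls this structure `source_id`."""
        self._by_structure[structure_key].add((source, source_id))
        self._by_source_id[(source, source_id)] = structure_key

    def translate(self, source, source_id, target_source):
        """Return the identifiers in `target_source` that share a
        structure with (`source`, `source_id`); empty list if unknown."""
        key = self._by_source_id.get((source, source_id))
        if key is None:
            return []
        return sorted(
            sid for src, sid in self._by_structure[key] if src == target_source
        )


# Example data: aspirin, keyed by its standard InChIKey (assumed key scheme).
ASPIRIN = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

index = CrossReferenceIndex()
index.add(ASPIRIN, "chembl", "CHEMBL25")
index.add(ASPIRIN, "chebi", "CHEBI:15365")
index.add(ASPIRIN, "drugbank", "DB00945")

# Build a link from a ChEMBL record to the matching ChEBI record.
print(index.translate("chembl", "CHEMBL25", "chebi"))
```

Because each structure is stored once and every source identifier points to it, a hyperlink between any two databases can be resolved with two dictionary lookups, without maintaining a separate mapping table for every pair of sources.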
Our resources enable users to address important practical scientific questions in areas such as chemical biology, drug discovery, bioinformatics, and cheminformatics. Common applications include identifying tool compounds for therapeutic targets, conducting novelty evaluations, creating chemogenomic sets for phenotypic screening, and identifying the potential targets and off-targets of active compounds. ChEMBL is also widely used in data science and in the development and application of AI/ML methods. These databases are used by a variety of organisations, including academia, not-for-profit and charitable institutes, biotech companies, and large global organisations.
In this talk, I will showcase recent developments in ChEMBL, SureChEMBL, and UniChem and outline some of our plans for the future. These include the widespread adoption of open-source tools across the projects, the open-sourcing of some of our key components, and the introduction of new functionality such as a REST API for SureChEMBL and the UniChem similarity search service.