Traversing Enormous Regions of Chemical Space with the GPU

Paul Hawkins1, A. Geoffrey Skillman1

1OpenEye Scientific

The sheer size of the universe of molecules (variously estimated at between 10^40 and 10^60) is daunting. Until recently, the number of molecules in a particular library or collection dictated the methods that could be used to search it; only very rapid graph-based methods were fast enough to search libraries of hundreds of millions of molecules in a reasonable time and with reasonable computational resources. However, the wide availability of inexpensive GPUs (Graphics Processing Units) opens up new approaches to rapid searching of chemical spaces that were previously intractably large. In this presentation we will focus on shape similarity searching with FastROCS, the GPU version of the widely used lead discovery tool ROCS. We will briefly present how porting to the GPU enabled us to accelerate shape searching by more than three orders of magnitude while maintaining identical virtual screening performance. The unprecedented speed of the current version of FastROCS (more than 20 million molecules per GPU per minute) enables searching of enormous libraries in a matter of minutes. Searching libraries of this size not only covers much larger areas of chemical space in a short time but also finds better-quality, more diverse hits. We will present recent data on searching customized libraries of 10^6 to 10^10 members, generated with specific library chemistry, showing how increasing the size of the space searched increases the similarity between high-scoring hits and the query, the number of scaffold classes discovered, and the probability of finding active molecules in those different scaffolds. These observations have strong implications for the design of chemical libraries. We will also show how FastROCS is part of OpenEye’s drive to democratise molecular design. FastROCS can be made available as a web service, running either on local hardware or in the cloud. Both variants are easily accessed through a browser, enabling non-experts to search large compound libraries from their desktops and obtain useful results in real time. We will illustrate this capability by searching parts of the Enamine REAL library (1).
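
As a rough illustration of what the quoted throughput implies for libraries of 10^6 to 10^10 members, the following back-of-the-envelope sketch estimates search time and GPU count. It is not part of FastROCS: the function names are hypothetical, the rate is the lower bound quoted above (20 million molecules per GPU per minute), and perfect linear scaling across GPUs is an assumption made only for this estimate.

```python
import math

# Lower bound on throughput quoted in the abstract (molecules per GPU per minute).
RATE_PER_GPU_PER_MIN = 20_000_000


def search_minutes(library_size: int, n_gpus: int = 1,
                   rate: int = RATE_PER_GPU_PER_MIN) -> float:
    """Estimated wall-clock minutes to search a library.

    Assumes (for illustration only) that throughput scales linearly
    with the number of GPUs.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return library_size / (rate * n_gpus)


def gpus_needed(library_size: int, target_minutes: float,
                rate: int = RATE_PER_GPU_PER_MIN) -> int:
    """Smallest GPU count that meets a target search time under the same assumption."""
    if target_minutes <= 0:
        raise ValueError("target_minutes must be positive")
    return max(1, math.ceil(library_size / (rate * target_minutes)))


if __name__ == "__main__":
    for exponent in (6, 8, 10):
        size = 10 ** exponent
        print(f"10^{exponent} molecules: "
              f"{search_minutes(size):.2f} min on 1 GPU, "
              f"{gpus_needed(size, 10)} GPU(s) for a 10-minute search")
```

Under these assumptions, a library of 10^10 members takes about 500 GPU-minutes at the quoted lower-bound rate, so a search completing in roughly ten minutes would need on the order of 50 GPUs; actual figures depend on hardware and on rates above the quoted minimum.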
1. https://enamine.net/index.php?option=com_content&task=view&id=254