Identification of Promiscuous Scaffolds for Fragment-based Design
Ann-Kathrin Prinz and Oliver Koch
Institute of Pharmaceutical and Medicinal Chemistry, University of Münster,
A large amount of bioactivity data and protein structures is available, enabling a variety of data mining and knowledge-based discovery approaches.[1] The analysis of bioactivity data rests on the similar property principle: similar molecules are likely to show similar properties.[2] Molecular similarity is usually assessed on complete molecular structures. However, shared structural elements or scaffolds can also indicate similar ligand binding. Here, we present a scaffold-based analysis of bioactivity data that focuses on identifying promiscuous scaffolds or fragments suitable as starting points for fragment-based molecular design. Promiscuous scaffolds are common ligand moieties that bind directly to specific, recurring binding subpockets found in different, unrelated proteins. Such scaffolds appear useful for fragment libraries designed to exploit these frequently occurring subpockets.
As a starting point, all small molecules in the bioactivity database ChEMBL[3] were processed and standardized, and duplicates were removed. A scaffold network was then created using rdScaffoldNetwork[4]. Next, a Python-based workflow was developed to analyze the scaffold decoration. First, the scaffolds were filtered by molecular descriptors to focus on fragments, and the target data of all filtered scaffolds were retrieved from ChEMBL[3]. The sequence similarity of the corresponding ligand targets was then calculated using NCBI BLAST+[5]. To identify promiscuous scaffolds, all scaffolds belonging to ligands that bind to different, unrelated targets were checked for their availability in purchasable compound libraries. Purchasable scaffolds were defined as promiscuous scaffolds and included in a fragment library that can serve as a starting point for fragment-based molecular design.
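The final selection step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the 30% identity cutoff, and the rule that a single unrelated target pair is enough to flag a scaffold are all assumptions made for illustration, since the abstract states neither a threshold nor the exact criterion. Pairwise identities are assumed to have been computed beforehand (for example with BLAST+) and stored in a lookup table.

```python
from itertools import combinations

# Hypothetical cutoff in percent sequence identity; the abstract gives no value.
UNRELATED_MAX_IDENTITY = 30.0


def are_unrelated(target_a, target_b, identity, cutoff=UNRELATED_MAX_IDENTITY):
    """Treat two targets as unrelated if their pairwise identity is below cutoff.

    Pairs missing from the table (assumed: no alignment hit) count as 0% identity.
    """
    return identity.get(frozenset((target_a, target_b)), 0.0) < cutoff


def promiscuous_candidates(scaffold_targets, identity, cutoff=UNRELATED_MAX_IDENTITY):
    """Return scaffolds whose ligands bind at least one pair of unrelated targets."""
    hits = []
    for scaffold, targets in scaffold_targets.items():
        pairs = combinations(sorted(targets), 2)
        if any(are_unrelated(a, b, identity, cutoff) for a, b in pairs):
            hits.append(scaffold)
    return sorted(hits)


def purchasable_only(candidates, vendor_scaffolds):
    """Keep only candidates that also occur in a purchasable compound library."""
    return [s for s in candidates if s in vendor_scaffolds]


# Toy data: scaffold SMILES mapped to made-up target identifiers.
scaffold_targets = {
    "c1ccc2[nH]ccc2c1": {"T1", "T3"},  # indole
    "c1ccncc1": {"T1", "T2"},          # pyridine
    "c1ccccc1": {"T2"},                # benzene, single target only
}
# Made-up pairwise sequence identities in percent.
identity = {
    frozenset(("T1", "T2")): 85.0,
    frozenset(("T1", "T3")): 12.0,
}
vendor_scaffolds = {"c1ccc2[nH]ccc2c1", "c1ccncc1"}

candidates = promiscuous_candidates(scaffold_targets, identity)
fragment_library = purchasable_only(candidates, vendor_scaffolds)
print(fragment_library)
```

In this toy example only the indole scaffold is selected: pyridine binds two closely related targets, and benzene has a single target, so neither meets the assumed criterion.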
[1] Humbeck L., Koch O., What Can We Learn from Bioactivity Data? Chemoinformatics Tools and Applications in Chemical Biology Research, ACS chemical biology 2017, 12, 23.
[2] Wilkins C. L., Randić M., A graph theoretical approach to structure-property and structure-activity correlations, Theoret. Chim. Acta 1980, 58, 45.
[3] Gaulton A., Bellis L. J., Bento A. P., Chambers J., Davies M., Hersey A., Light Y., McGlinchey S., Michalovich D., Al-Lazikani B. et al., ChEMBL: a large-scale bioactivity
database for drug discovery, Nucleic acids research 2012, 40, D1100-7.
[4] Kruger F., Stiefl N., Landrum G. A., rdScaffoldNetwork: The Scaffold Network Implementation in RDKit, Journal of chemical information and modeling 2020, 60, 3331.
[5] Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L., BLAST+: architecture and applications, BMC bioinformatics 2009, 10, 421.