Abstract Details

Poster 17: inSARa: Enabling Intuitive and Interactive (Large-Scale) SAR Analysis by Reduced Graphs and Hierarchical MCS-based Network Navigation

Sabrina Wollenhaupt¹, Knut Baumann¹
¹Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstr. 55, D-38106 Braunschweig, Germany

The analysis of Structure-Activity Relationships (SAR) of small molecules is a fundamental task in drug discovery. Knowledge of these relationships is valuable to the medicinal chemist, for instance in lead optimization or de novo design.

A recent approach to SAR analysis and visualization is based on fingerprint similarity (e.g. Network-like Similarity Graphs [1]). One drawback of this concept is that it is difficult to identify the molecular features underlying the presumed chemical similarity. As a result, medicinal chemists often struggle to interpret the SAR.
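The interpretability problem can be illustrated with a minimal sketch of a fingerprint-based similarity network. It assumes the Tanimoto coefficient as the similarity measure and uses made-up toy fingerprints given as sets of on-bit indices; the names `tanimoto`, `similarity_edges` and the threshold value are hypothetical and not taken from [1].

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similarity_edges(fps, threshold):
    """Return (name1, name2, similarity) for every pair at or above the threshold."""
    edges = []
    for m, n in combinations(sorted(fps), 2):
        sim = tanimoto(fps[m], fps[n])
        if sim >= threshold:
            edges.append((m, n, sim))
    return edges

# Hypothetical toy fingerprints; the bit indices carry no chemical meaning here.
fps = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {7, 8, 9}}
edges = similarity_edges(fps, threshold=0.5)
for m, n, sim in edges:
    print(m, n, round(sim, 2))
```

Each edge carries only a number. Which structural features the two molecules share is not visible from the network itself.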

To tackle this crucial and challenging issue, the more intuitive inSARa (intuitive networks for Structure-Activity-Relationships analysis) approach was developed. The method takes advantage of the synergy that results from combining the reduced graph (RG) and the maximum common substructure (MCS) concepts [2]. RGs not only enable large-scale SAR analysis but also offer a higher degree of abstraction, since they are a conceptual representation of physicochemical properties and pharmacophoric features. Owing to their hierarchical, MCS-based structure, inSARa networks are straightforward to interpret.
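As a rough illustration of the RG-MCS idea, the sketch below finds the largest connected common subgraph of two tiny reduced graphs by brute force. It is not the inSARa implementation: the node types ("Ar", "L", "D", "A"), the graph encoding and the function names are assumptions made for this example, and it matches induced subgraphs for brevity, whereas [2] uses a maximum common edge substructure algorithm.

```python
from itertools import combinations, permutations

def is_connected(nodes, edges):
    """True if the given nodes form a single connected component under edges."""
    nodes = set(nodes)
    if not nodes:
        return False
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        stack.extend(m for m in nodes if frozenset((n, m)) in edges)
    return seen == nodes

def rg_mcs(g1, g2):
    """Brute-force connected maximum common induced subgraph of two tiny graphs.

    Each graph is (labels, edges): labels maps node id -> node type and
    edges is a set of frozenset({u, v}). Returns a dict mapping g1 nodes
    to g2 nodes. Only practical for a handful of nodes.
    """
    (lab1, e1), (lab2, e2) = g1, g2
    for k in range(min(len(lab1), len(lab2)), 0, -1):
        for sub1 in combinations(sorted(lab1), k):
            if not is_connected(sub1, e1):
                continue
            for sub2 in permutations(sorted(lab2), k):
                if any(lab1[a] != lab2[b] for a, b in zip(sub1, sub2)):
                    continue  # node types must agree
                pairs = list(zip(sub1, sub2))
                if all((frozenset((a1, a2)) in e1) == (frozenset((b1, b2)) in e2)
                       for (a1, b1), (a2, b2) in combinations(pairs, 2)):
                    return dict(pairs)
    return {}

# Hypothetical reduced graphs: aromatic ring - linker - donor / acceptor.
g1 = ({0: "Ar", 1: "L", 2: "D"}, {frozenset((0, 1)), frozenset((1, 2))})
g2 = ({0: "Ar", 1: "L", 2: "A"}, {frozenset((0, 1)), frozenset((1, 2))})
mcs = rg_mcs(g1, g2)
shared_types = [g1[0][n] for n in sorted(mcs)]
print(shared_types)
```

In a network built this way, the shared RG substructure itself serves as the explanation of why two molecules are linked, which is the interpretability gain the paragraph above describes.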

inSARa networks were shown to be valuable for several essential tasks in SAR analysis, such as the identification of bioisosteric exchanges, activity cliffs or common pharmacophoric features. Even though no bioactivity information is used for network generation, inSARa can also be used for compound classification and bioactivity prediction.

Benchmark studies of fingerprints and inSARa networks indicated that both concepts provide complementary information for SAR analysis. Hence, the combination of both methods was also investigated. It turned out that this hybrid approach not only helps to reveal new similarity relationships but also enhances the interpretability by reducing the complexity of inSARa networks. In addition to the gain in interpretability, bioactivity prediction accuracy also increased in some cases.

To further enhance the benefits of inSARa networks, a large-scale analysis of the ZINC database [3] was carried out to identify unspecific RG-MCSs, i.e. those that result from comparing two randomly chosen, unrelated molecules. How this information can be used to detect unspecific information during network generation, and thus to obtain more target-specific networks, is being investigated.

Additionally, several ways of incorporating activity information into network generation, and their effects on the resulting supervised inSARa networks, are analyzed.

[1] Wawer M, Peltason L, Weskamp N, Teckentrup A, Bajorath J: Structure-Activity Relationship Anatomy by Network-like Similarity Graphs and Local Structure-Activity Relationship Indices. J. Med. Chem. 2008, 51 (19): 6075-6084.
[2] Gardiner E J, Gillet V J, Willett P, Cosgrove D A: Representing Clusters Using a Maximum Common Edge Substructure Algorithm Applied to Reduced Graphs and Molecular Graphs. J. Chem. Inf. Model. 2007, 47 (2): 354-366.
[3] Irwin J J, Sterling T, Mysinger M M, Bolstad E S, Coleman R G: ZINC: A Free Tool to Discover Chemistry for Biology. J. Chem. Inf. Model. 2012, 52 (7): 1757-1768.
