Evgeni Grazhdankin Abstract

Homology modeling with probabilistic restraint graphs

Evgeni Grazhdankin1, Karen Culotta1, Cédric Gageat1, Henri Xhaard1

1Faculty of Pharmacy, Division of Pharmaceutical Chemistry and Technology, University of Helsinki

Pairwise inter-atomic distance constraints can be used to determine the three-dimensional conformation of a molecule (Hendrickson, 1995). Indeed, spatial restraints are employed in comparative (homology) modeling programs such as MODELLER (Šali and Blundell, 1993). Given sufficiently many satisfied spatial restraints, the molecular structure is well defined.

The pairwise distance restraints are readily embedded into graph data structures with nodes corresponding to the atoms and edges encoding the restraints. Many molecular conformations can be simultaneously encoded into one multigraph, where more than one edge can exist between any two given nodes.
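The multigraph encoding described above can be sketched as follows. This is a minimal illustration in modern Python 3 using only the standard library, not the data structure of the software itself; the class names `Restraint` and `RestraintMultigraph`, the atom labels, the distances and the conformation tags are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Restraint:
    """One distance restraint between two atoms (hypothetical record layout)."""
    atom_a: str
    atom_b: str
    distance: float      # target distance in angstroms (illustrative values)
    conformation: str    # label of the conformation the restraint came from


class RestraintMultigraph:
    """Undirected multigraph: atoms are nodes, restraints are parallel edges."""

    def __init__(self):
        # Maps an unordered atom pair to every restraint recorded for it.
        self._edges = defaultdict(list)

    @staticmethod
    def _key(a, b):
        # Sort the pair so that (a, b) and (b, a) address the same edge list.
        return tuple(sorted((a, b)))

    def add(self, restraint):
        key = self._key(restraint.atom_a, restraint.atom_b)
        self._edges[key].append(restraint)

    def between(self, a, b):
        """Return all restraints between two atoms, in insertion order."""
        return list(self._edges.get(self._key(a, b), []))

    def nodes(self):
        return {atom for pair in self._edges for atom in pair}


# Two conformations contribute different distances for the same atom pair.
graph = RestraintMultigraph()
graph.add(Restraint("ASP113:OD1", "SER203:OG", 2.9, "active"))
graph.add(Restraint("SER203:OG", "ASP113:OD1", 6.4, "inactive"))
graph.add(Restraint("ASP113:OD1", "TYR316:OH", 3.1, "active"))
```

Here `graph.between("ASP113:OD1", "SER203:OG")` returns both parallel edges, one per conformation, which is the property that distinguishes a multigraph from a simple graph.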

By associating each edge (i.e. a distance restraint) with an occurrence probability and sampling from an approximation of the joint distribution, we expect to draw samples that define biologically relevant conformations. This approach should allow us to bypass the time-consuming stepwise progression needed to move from one conformation to another, as seen in, for example, molecular dynamics simulations. The related problem of selecting representative instances from uncertain graphs has been explored in the literature (Parchas et al., 2014).
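As a rough illustration of sampling restraint sets from edge probabilities, the sketch below treats every edge as independent. Independence is the crudest possible approximation of the joint distribution and is assumed here only for simplicity; the abstract does not specify which approximation the software uses. The function name `sample_restraints`, the atom labels and all numeric values are hypothetical.

```python
import random

# Candidate restraints as (atom_a, atom_b, distance, occurrence probability).
# All values are made up for illustration.
candidate_restraints = [
    ("A", "B", 2.9, 0.7),
    ("A", "B", 6.4, 0.3),
    ("A", "C", 3.1, 0.9),
    ("B", "C", 4.8, 0.5),
]


def sample_restraints(candidates, rng):
    """Draw one restraint set, keeping each edge independently with its probability.

    Assumption: edges are independent, so dependencies between restraints
    are ignored entirely.
    """
    return [(a, b, d) for a, b, d, p in candidates if rng.random() < p]


# A seeded generator makes the draws reproducible.
rng = random.Random(0)
samples = [sample_restraints(candidate_restraints, rng) for _ in range(1000)]
```

Each element of `samples` is one candidate restraint set that could be handed to a model-building step. Capturing how one restraint conditions another would require replacing the independent draws with a model of those dependencies.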

Hurdles arise in the initial placement (between which atoms) and parameterisation (to which distance) of the restraints. Furthermore, inter-conditioning of the restraints, that is, determining how one restraint depends on another, is computationally challenging and forms a rich source of combinatorial problems.

In the current work, we developed software to optimize and expand upon MODELLER (Šali and Blundell, 1993) distance restraints through a feedback loop of model creation and restraint evaluation, using probabilistic graphs, heuristic optimisation schemes and an array of quantitative model quality measures. We focus on membrane-bound proteins, with an emphasis on G protein-coupled receptors (GPCRs). The method is evaluated on its ability to reconstruct GPCR structures.

The software is written in object-oriented Python (v2.7), following good programming practices to support further development. We store the restraints in a relational PostgreSQL database, which facilitates large-scale data analysis and the use of machine learning methods. Restraint graphs are visualized as interactive three-dimensional representations through PyMOL scripts.
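To make the relational storage idea concrete, the sketch below keeps restraints in a single table and queries the likely ones. It uses the standard-library `sqlite3` module with an in-memory database purely as a stand-in so that the example runs anywhere; the actual software uses PostgreSQL, and the table name, column names and values shown here are assumptions, not the real schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per distance restraint (one graph edge).
conn.execute(
    """
    CREATE TABLE restraint (
        id          INTEGER PRIMARY KEY,
        atom_a      TEXT NOT NULL,
        atom_b      TEXT NOT NULL,
        distance    REAL NOT NULL,
        probability REAL NOT NULL CHECK (probability BETWEEN 0 AND 1)
    )
    """
)

# Illustrative rows; parallel edges between the same atoms are allowed.
rows = [
    ("A", "B", 2.9, 0.7),
    ("A", "B", 6.4, 0.3),
    ("A", "C", 3.1, 0.9),
]
conn.executemany(
    "INSERT INTO restraint (atom_a, atom_b, distance, probability) "
    "VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()

# Select restraints at or above a probability threshold, most probable first.
likely = conn.execute(
    "SELECT atom_a, atom_b, distance FROM restraint "
    "WHERE probability >= ? ORDER BY probability DESC",
    (0.5,),
).fetchall()
```

Storing one edge per row keeps the multigraph structure intact, because nothing prevents several rows from sharing the same atom pair.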

We observe expanded sampling of the conformational space and an increase in the size of the hydrogen bond networks compared with standard MODELLER models. These results stem from enlarging the pool of good distance restraints with custom long- and short-range restraints and from an explicit representation of hydrogen bonds in the restraint graph.

Hendrickson B: The Molecule Problem: Exploiting Structure in Global Optimization. SIAM Journal on Optimization, 5(4):835–857, 1995.

Šali A, Blundell TL: Comparative Protein Modelling by Satisfaction of Spatial Restraints. Journal of Molecular Biology, 234(3):779–815, 1993.

Parchas P, Gullo F, Papadias D, Bonchi F: The pursuit of a good possible world. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. New York, New York, USA: ACM Press, 967–978, 2014.