Abstract Details

Prediction of Tautomers and Protonation States in Protein-Ligand Binding Sites

Stefan Bietz1, Sascha Urbaczek1, 2, Matthias Rarey1
1Center for Bioinformatics (ZBH), Bundesstr. 43, 20146 Hamburg, Germany
2BioSolveIT GmbH, An der Ziegelei 75, 53757 Sankt Augustin, Germany
Three-dimensional structures of protein-ligand complexes are an essential and commonly used basis in many chemoinformatics fields such as 3D-QSAR, virtual screening, and de novo design. Many analysis and prediction approaches rely on atomically detailed structure models to investigate macromolecular properties, enzymatic mechanisms, or molecular interactions. Unfortunately, X-ray crystallography, the experimental technique most frequently used to resolve macromolecular structures, leaves uncertainties in the determination of hydrogen positions and the discrimination of similar chemical elements. Therefore, numerous methods have been developed that try to compensate for these shortcomings by predicting the most important missing details. Common sets of optimization parameters comprise rotatable hydrogen atoms, tautomers and protonation states of proteinogenic amino acids, side-chain flips, and water orientations [1-3]. While these parameters cover the unresolved structural properties of proteins quite well, various relevant degrees of freedom of ligand molecules remain unconsidered. In particular, neglecting ligand tautomers and protonation states may result in suboptimal hydrogen-bond networks at the protein-ligand interface, because different states can enable fundamentally different interaction patterns. Based on our hydrogen placement algorithm Protoss [4], we developed a novel approach that incorporates alternative tautomers and protonation states for ligand molecules. With this extension, Protoss is, to our knowledge, the first method that automatically covers the whole spectrum of hydrogen variability in protein-ligand complexes.

In general, the diversity of chemical space makes it difficult to create a generally applicable and computationally efficient model of tautomerism and protonation changes. Our approach combines a consistent state enumeration algorithm with a heuristic, pattern-based scoring function to generate chemically reasonable tautomers and protonation states. To keep the state space tractable, the molecules are partitioned into tautomerizable substructures that are treated separately during the remaining process. Furthermore, the method ensures that only states with similar energetic stability scores are retained. This procedure is justified by the assumption that only small stability differences can be compensated by favorable interactions. The alternative states and their stability scores are subsequently incorporated into the interaction scoring and optimization procedure of Protoss. Protoss then applies a dynamic programming scheme to calculate an optimal solution for the hydrogen bonding network of the whole protein-ligand complex, including the most probable tautomer and protonation state of the ligand.
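The pruning and selection steps described above can be illustrated with a minimal Python sketch. It is not the Protoss implementation: the `State` class, the `window` threshold, the state names, and all score values are hypothetical, and a brute-force enumeration stands in for the dynamic programming scheme. The sketch only shows how a stability window can discard unlikely states before interaction scores decide among the remaining ones.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class State:
    """One tautomer/protonation state of a substructure (hypothetical model)."""
    name: str
    stability: float  # assumed heuristic score; higher means more stable


def prune_states(states, window):
    """Keep only states whose stability lies within `window` of the best state."""
    best = max(s.stability for s in states)
    return [s for s in states if best - s.stability <= window]


def best_combination(substructures, interaction, window=1.0):
    """Pick one state per substructure, maximizing stability + interaction.

    Brute-force enumeration is used here for clarity; it replaces the
    dynamic programming scheme mentioned in the text.
    """
    pruned = [prune_states(states, window) for states in substructures]
    best_combo, best_score = None, float("-inf")
    for combo in product(*pruned):
        score = sum(s.stability + interaction(s) for s in combo)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


# Illustrative, made-up data: two independently treated substructures.
substructures = [
    [State("keto", 0.0), State("enol", -0.8), State("rare", -5.0)],
    [State("neutral amine", 0.0), State("protonated amine", -0.5)],
]
# Made-up interaction scores with the binding site (higher is better).
interaction_scores = {
    "keto": 0.2,
    "enol": 1.5,
    "rare": 9.0,
    "neutral amine": 0.0,
    "protonated amine": 1.0,
}

combo, score = best_combination(
    substructures, lambda s: interaction_scores[s.name], window=1.0
)
chosen = [s.name for s in combo]
print(chosen, round(score, 2))
```

In this toy data set the "rare" state would win on interaction score alone, but it is removed beforehand because its stability penalty exceeds the window. This mirrors the assumption that only small stability differences can be compensated by favorable interactions.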

We ran Protoss on the Astex diverse set [5] and compared the predicted ligand states with those present in the data set, which were either adopted from the primary literature or prepared by visual inspection. Furthermore, we consulted a data set that was originally used to investigate tautomer preferences in PDB complexes [6]. Apart from a few explicable exceptions, we found good agreement for both data sets. A comparison with Protonate3D [1] and WHAT IF [2] also demonstrated how considering ligand tautomers and protonation states can lead to a better prediction of the hydrogen bonding network in protein-ligand binding sites. Finally, we were able to show that the runtime of Protoss is, in general, only slightly affected by the incorporation of the additional degrees of freedom.

1. Labute, P., Protonate3D: Assignment of ionization states and hydrogen coordinates to macromolecular structures. Proteins: Structure, Function, and Bioinformatics, 2009. 75(1): p. 187-205.
2. Hooft, R.W.W., Sander, C., and Vriend, G., Positioning hydrogen atoms by optimizing hydrogen-bond networks in protein structures. Proteins: Structure, Function, and Bioinformatics, 1998. 26(4): p. 363-376.
3. Bayden, A.S., et al., Web application for studying the free energy of binding and protonation states of protein-ligand complexes based on HINT. Journal of computer-aided molecular design, 2009. 23(9): p. 621-632.
4. Lippert, T. and Rarey, M., Fast automated placement of polar hydrogen atoms in protein-ligand complexes. Journal of cheminformatics, 2009. 1(1): p. 1-12.
5. Hartshorn, M.J., et al., Diverse, high-quality test set for the validation of protein-ligand docking performance. Journal of medicinal chemistry, 2007. 50(4): p. 726-741.
6. Milletti, F. and Vulpetti, A., Tautomer preference in PDB complexes and its impact on structure-based drug discovery. Journal of chemical information and modeling, 2010. 50(6): p. 1062-1074.
