Developing a Robust Method for Automated Assessment of Binding Affinity via Free Energy Perturbation
Maximilian Kuhn1,2, Paolo Tosco1, Antonia Mey2, Mark Mackey1, Julien Michel2
1Cresset
2University of Edinburgh
There is growing interest in FEP (Free Energy Perturbation) calculations in the drug discovery community.[1,2] FEP calculations are performed to predict the relative binding affinity changes (ddG) within a congeneric ligand series. Such calculations consist of non-physical (“alchemical”) transformations, in which a molecule (A) is gradually converted into a structurally related molecule (B) through a number of discrete steps, the so-called lambda windows. The ligand simulated in each window can be thought of as an alchemical (i.e., hybrid) molecule consisting of a 1-lambda fraction of A and a lambda fraction of B. The free energy difference between the end states of the transformation can be estimated by a variety of methods, such as the Multistate Bennett Acceptance Ratio (MBAR); the difference between the values obtained for the protein-bound and the solvated ligand corresponds to the binding affinity difference between the two molecules. Extending this approach to a network of congeneric molecules yields the relative binding affinities across the corresponding ligand series.
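The bookkeeping behind a single perturbation can be illustrated with a minimal Python sketch. It is not taken from any of the packages discussed here: the function names are hypothetical, the lambda windows are assumed to be evenly spaced, and the hybrid energy uses plain linear mixing, whereas production codes typically rely on soft-core potentials. The last function applies the usual thermodynamic cycle, in which the relative binding free energy is the difference between the A-to-B free energy change in the bound state and in solvent.

```python
def lambda_schedule(n_windows):
    """Evenly spaced lambda values from 0 (pure A) to 1 (pure B).

    Even spacing is an assumption made for illustration only.
    """
    if n_windows < 2:
        raise ValueError("need at least two windows (the two end states)")
    return [i / (n_windows - 1) for i in range(n_windows)]


def hybrid_energy(u_a, u_b, lam):
    """Energy of the hybrid molecule under simple linear mixing:
    a (1 - lambda) fraction of A plus a lambda fraction of B."""
    return (1.0 - lam) * u_a + lam * u_b


def relative_binding_free_energy(dg_bound, dg_free):
    """ddG of binding for A -> B from the two legs of the cycle (kcal/mol).

    dg_bound: free energy change A -> B with the ligand bound to the protein
    dg_free:  free energy change A -> B with the ligand in solvent
    """
    return dg_bound - dg_free


# Illustrative numbers only, not results from any simulation.
windows = lambda_schedule(5)
ddg = relative_binding_free_energy(dg_bound=-3.2, dg_free=-2.1)
print(windows, round(ddg, 2))
```

A negative ddG in this convention means that B is predicted to bind more tightly than A.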
Recently, alchemical free energy calculations have been applied to predict ligand binding affinities of large data sets, with deviations from experimental values as low as 0.8 to 1.1 kcal/mol.[2] However, apart from intrinsic sources of error such as force field inaccuracies and insufficient phase space sampling, the lack of automation is still a major obstacle to routine application of these methods. For instance, the initial generation of the perturbation network requires comparison of the ligand structures and subsequent connection of the molecules in the network graph based on their structural similarity. This step is especially problematic when the input structures are relatively heterogeneous, as this requires the insertion or deletion of a large number of atoms during the simulations. Such perturbations are prone to errors and usually need a larger number of lambda windows to complete successfully, thus increasing calculation time. Therefore, to increase the reliability of the results and avoid wasting computing time on poorly designed perturbations, early recognition and subsequent modification of problematic alchemical networks are necessary.
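The idea of connecting ligands by structural similarity and flagging problematic perturbations early can be sketched as follows. This is a generic maximum spanning tree built with Kruskal's algorithm, not the algorithm used by LOMAP or by the workflow described here; the ligand names, similarity scores and the 0.4 threshold are hypothetical values chosen for illustration.

```python
def build_network(scores, min_score=0.4):
    """Connect all ligands using the most similar pairs first.

    scores:    dict mapping (ligand_a, ligand_b) -> similarity in [0, 1]
    min_score: hypothetical cut-off below which an edge is flagged as weak
    Returns (edges, weak_edges), each a list of (ligand_a, ligand_b, score).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    # Visit pairs from most to least similar; keep a pair only if it
    # joins two groups of ligands that are not yet connected.
    for (a, b), s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            edges.append((a, b, s))

    weak_edges = [e for e in edges if e[2] < min_score]
    return edges, weak_edges


# Hypothetical similarity scores for four ligands.
scores = {
    ("lig1", "lig2"): 0.9,
    ("lig2", "lig3"): 0.8,
    ("lig1", "lig3"): 0.7,
    ("lig3", "lig4"): 0.3,
    ("lig1", "lig4"): 0.2,
    ("lig2", "lig4"): 0.1,
}
edges, weak_edges = build_network(scores)
print(edges)
print(weak_edges)
```

In this toy data set lig4 can only be reached through a low-similarity edge, which is the kind of perturbation that would be a candidate for modification before any simulation is run. A spanning tree contains no cycles, so a practical network would add further edges on top of it.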
A number of open-source applications are available to assist in various parts of the FEP workflow. These include various applications implemented in the Sire[3] molecular simulation package for network setup and results evaluation, LOMAP[4] for network generation, AmberTools[5] for preparing the topology files and SOMD[6] for running the simulations. However, installing and using these relatively complex tools requires expertise. In order to make FEP more accessible, a fully automated workflow for performing FEP calculations is under development for inclusion in version 3.0 of Cresset’s drug design software Flare. The workflow can be accessed through a graphical user interface (GUI) or a Python application programming interface (API). The original open-source code base was enhanced to automatically create additional molecules serving as intermediates for ligand pairs which are too dissimilar to be directly mutated into one another. To increase the workflow’s reliability, frequent cycle closures are ensured during the generation of the perturbation network, and energy convergence checks during postprocessing are enforced. Furthermore, an effort is underway to deduce heuristics to infer the optimal number of lambda windows for each perturbation, in order to avoid unnecessary calculations and minimise the probability of insufficient sampling.
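Cycle closures are useful because free energy is a state function: the relative free energies summed around any closed loop of perturbations should be zero, so a large residual points to a poorly converged edge. The sketch below shows that check in isolation. It is an illustration under assumed inputs, not code from Sire, SOMD or Flare; the function name, the ligand labels and the ddG values are hypothetical.

```python
def cycle_closure_error(ddg, cycle):
    """Signed sum of relative free energies around a closed cycle (kcal/mol).

    ddg:   dict mapping (a, b) -> computed relative free energy for a -> b
    cycle: list of ligand names; the last ligand is implicitly linked
           back to the first
    An edge traversed against its computed direction contributes with
    the opposite sign.
    """
    total = 0.0
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        if (a, b) in ddg:
            total += ddg[(a, b)]
        elif (b, a) in ddg:
            total -= ddg[(b, a)]
        else:
            raise KeyError(f"no perturbation between {a} and {b}")
    return total


# Hypothetical results for a three-membered cycle.
ddg = {("A", "B"): -1.2, ("B", "C"): 0.5, ("A", "C"): -0.4}
error = cycle_closure_error(ddg, ["A", "B", "C"])
print(round(error, 2))
```

Which residual counts as acceptable is a judgment call that depends on the system and the sampling time, so no threshold is hard-coded here.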
A variety of datasets were processed using predefined default parameters, including the FEP+ dataset[2]. Results obtained with our method were broadly comparable to published reports, considering the overall reduced simulation time. During our validation of the method we found that the ideal settings for a given set of ligands and their target protein are difficult to predict in advance. Therefore, depending on the project and computational resources available, further optimisation of the results may be desirable once knowledge of the system under study has been gathered through preliminary simulations. Power users of this FEP implementation will have full control over simulation parameters via Flare’s GUI and Python API.
[1] Cournia, Z.; Allen, B.; Sherman, W. J. Chem. Inf. Model. 2017, 57 (12), 2911-2937.
[2] Wang, L.; Wu, Y.; Deng, Y. et al. J. Am. Chem. Soc. 2015, 137 (7), 2695-2703.
[3] Woods, C.; Mey, A.S.J.S.; Calabrò, G.; Bosisio, S.; Michel, J. Sire molecular simulations framework (2016). http://siremol.org. Accessed 28 January 2019.
[4] Liu, S.; Wu, Y.; Lin, T. et al. J. Comput.-Aided Mol. Des. 2013, 27 (9), 755-770.
[5] Case, D.A.; Ben-Shalom, I.Y.; Brozell, S.R. et al. AMBER 2018, University of California, San Francisco.
[6] Mey, A.S.J.S.; Jiménez, J.J.; Michel, J. J. Comput.-Aided Mol. Des. 2018, 32, 199-210.