James Middleton Poster

Combining Shape and Electrostatics in a Spectral Geometry-based 3D Molecular Descriptor

James Middleton1, Gianmarco M. Ghiandoni2, Martin Packer3, Mengdie Zhuang1 and Val Gillet1

1Information School, University of Sheffield, The Wave, 2 Whitham Rd, Sheffield S10 2AH, United Kingdom

2AstraZeneca R&D IT, Eastbrook House, Shaftesbury Road, Cambridge CB2 8DU

3AstraZeneca Early Oncology R&D, Alderley Park, Macclesfield, SK10 4TG

It has been well established that shape complementarity plays an important role in molecular
recognition. However, shape information alone is not sufficient to describe accurately the binding
process between a drug molecule and a therapeutic target, which also requires accounting for their
electrostatic complementarity (Weiner et al., 1982). Consequently, various representation methods
have been developed to capture both shape and electrostatic features of molecules. An example is
the alignment-invariant ElectroShape descriptor, which builds on the well-known Ultrafast
Shape Recognition (USR) descriptor (Armstrong et al., 2010). Research in this field has shown that
although alignment-invariant methods typically offer inferior performance to alignment-based 3D
approaches (Cleves et al., 2019), they are more efficient when dealing with large
databases because they avoid the computationally expensive alignment of molecules. As a
result, there is a need for a molecular descriptor that maintains the efficiency of
alignment-invariant approaches while offering performance that is more competitive with
established alignment-based methods.

Seddon et al. (2019) developed an alignment-invariant molecular descriptor known as MOLSG, which
is based on the concepts of spectral geometry. The MOLSG descriptor captures shape information
by applying the Laplace-Beltrami operator (LBO) to a molecular surface embedded in
3D space. The intrinsic shape information yielded by the LBO is then fed into a local
geometry descriptor (LGD) method to give a more refined representation of the captured shape
information. To enable comparisons between molecules, a global geometry descriptor is generated
from the LGD. In the original MOLSG of Seddon et al. (2019), this is achieved using covariance
matrices, which capture global shape information as shown by Tuzel et al. (2006). The
covariance descriptor takes the LGD as input and produces a covariance matrix that describes how
different geometrical features of the LGD relate to each other across the molecule.
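
The covariance step described above can be illustrated with a minimal Python sketch. It assumes the LGD is available as an array with one row per surface point and one column per local feature. The function name `covariance_descriptor` and the synthetic input are hypothetical, and the sketch is not the MOLSG implementation.

```python
import numpy as np


def covariance_descriptor(lgd: np.ndarray) -> np.ndarray:
    """Return the feature-by-feature covariance matrix of an LGD array.

    lgd has shape (n_points, n_features): one row per surface point,
    one column per local geometry feature (assumed layout).
    """
    if lgd.ndim != 2 or lgd.shape[0] < 2:
        raise ValueError("lgd must be 2D with at least two surface points")
    # rowvar=False treats columns as variables and rows as observations.
    return np.cov(lgd, rowvar=False)


# Synthetic stand-in for a real LGD: 500 surface points, 6 features.
rng = np.random.default_rng(0)
lgd = rng.normal(size=(500, 6))

C = covariance_descriptor(lgd)

# Reordering the surface points leaves the descriptor unchanged.
C_shuffled = covariance_descriptor(lgd[rng.permutation(len(lgd))])
print(C.shape, np.allclose(C, C_shuffled))
```

The resulting matrix has a fixed size set by the number of features, whatever the number of surface points, and it does not depend on the order in which the points are listed.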

In our work, the MOLSG descriptor workflow was modified to include electrostatic information
whilst retaining the beneficial alignment-invariance property of the original shape descriptor.
Electrostatic potential (ESP) surfaces are first computed using the graph-convolutional deep
neural network developed by Rathi et al. (2020), which has been shown to produce ESPs of
comparable quality to computationally expensive Density-Functional Theory (DFT) ESP surfaces
in a fraction of the time. The MOLSG approach has been extended to incorporate this
electrostatic information, forming the Electro-MOLSG descriptor. The electrostatic information is
represented using alignment-invariant 1D histograms, which take inspiration from the colour
histograms popularized by Swain & Ballard (1991) in content-based image retrieval. We describe
how these histograms can be parametrized to control the influence on virtual screening
performance of both molecule size (histogram normalization) and the overall resolution of the
distribution-based descriptors (bin width). Subsequently, we demonstrate that shape and
electrostatic features can be combined using a weighted similarity metric. Finally, we benchmark
Electro-MOLSG against a series of established descriptors in a ligand-based virtual screening
setting. Initial findings suggest that Electro-MOLSG outperforms the shape-only implementation
of MOLSG.
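
As a rough illustration of the histogram parametrization and weighted combination described above, the following Python sketch bins per-point ESP values into a 1D histogram and combines a shape similarity with an electrostatic similarity. All function names, the ESP value range, the default bin width, the use of histogram intersection and the linear weighting are assumptions made for illustration. The abstract does not specify these details.

```python
import numpy as np


def esp_histogram(esp_values, bin_width=0.05, value_range=(-1.0, 1.0),
                  normalize=True):
    """Bin surface ESP values into a 1D histogram (hypothetical parameters)."""
    lo, hi = value_range
    if bin_width <= 0 or hi <= lo:
        raise ValueError("bin_width must be positive and hi must exceed lo")
    n_bins = max(1, int(round((hi - lo) / bin_width)))
    edges = np.linspace(lo, hi, n_bins + 1)
    # Clip so that out-of-range values fall into the first or last bin.
    clipped = np.clip(np.asarray(esp_values, dtype=float), lo, hi)
    counts, _ = np.histogram(clipped, bins=edges)
    counts = counts.astype(float)
    if normalize and counts.sum() > 0:
        counts /= counts.sum()  # removes the dependence on surface size
    return counts


def histogram_intersection(h1, h2):
    """Overlap of two histograms, scaled by the smaller total (assumed variant)."""
    denom = min(h1.sum(), h2.sum())
    return float(np.minimum(h1, h2).sum() / denom) if denom > 0 else 0.0


def combined_similarity(shape_sim, esp_sim, weight=0.5):
    """Linear blend of shape and electrostatic similarity (assumed form)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * shape_sim + (1.0 - weight) * esp_sim


# Synthetic ESP samples standing in for two molecular surfaces.
rng = np.random.default_rng(1)
h_a = esp_histogram(rng.normal(0.0, 0.2, size=800))
h_b = esp_histogram(rng.normal(0.1, 0.2, size=1200))

esp_sim = histogram_intersection(h_a, h_b)
print(combined_similarity(shape_sim=0.8, esp_sim=esp_sim, weight=0.6))
```

Because each histogram is built from scalar ESP values alone, it does not change when the molecule is rotated or translated. With `normalize=False` the raw counts would retain information about surface size, which is the trade-off the normalization parameter controls.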

[1] Armstrong, M. S., Morris, G. M., Finn, P. W., Sharma, R., Moretti, L., Cooper, R. I. & Richards, W. G. (2010). ElectroShape: fast molecular similarity calculations incorporating shape, chirality and electrostatics. Journal of Computer-Aided Molecular Design, 24(9), 789–801.
[2] Cleves, A. E., Johnson, S. R. & Jain, A. N. (2019). Electrostatic-field and surface-shape similarity for virtual screening and pose prediction. Journal of Computer-Aided Molecular Design, 33, 865–886. https://doi.org/10.1007/s10822-019-00236-6
[3] Rathi, P. C., Ludlow, R. F. & Verdonk, M. L. (2020). Practical High-Quality Electrostatic Potential Surfaces for Drug Discovery Using a Graph-Convolutional Deep Neural Network. Journal of Medicinal Chemistry, 63(16), 8778–8790. https://doi.org/10.1021/acs.jmedchem.9b01129
[4] Seddon, M. P., Cosgrove, D. A., Packer, M. J. & Gillet, V. J. (2019). Alignment-Free Molecular Shape Comparison Using Spectral Geometry: The Framework. Journal of Chemical Information and Modeling, 59(1), 98–116. https://doi.org/10.1021/acs.jcim.8b00676
[5] Swain, M. J. & Ballard, D. H. (1991). Color Indexing. International Journal of Computer Vision, 7(1).
[6] Weiner, P. K., Langridge, R., Blaney, J. M., Schaefer, R. & Kollman, P. A. (1982). Electrostatic potential molecular surfaces. Proceedings of the National Academy of Sciences of the United States of America, 79(12), 3754–3758. https://doi.org/10.1073/pnas.79.12.3754