Shape- and Colour-Based Partial Matching Applied to Ligand-Based Virtual Screening
Savíns Puertas-Martín1,2 and Valerie J. Gillet1
1 Information School, University of Sheffield, Sheffield, S10 2AH, United Kingdom
2 Supercomputing – Algorithms Research Group (SAL), University of Almería, Agrifood Campus of International Excellence, ceiA3, 04120, Almería, Spain.
Ligand-based virtual screening (LBVS) is a computational technique used to identify potential drug candidates by searching chemical databases for molecules with properties similar to those of known active compounds. Shape is a commonly used property in LBVS, and various 3D approaches have been developed based on atomic distances, atom-centred Gaussians, and molecular fields [1]. However, existing LBVS algorithms typically compare entire molecules, which can overlook specific substructures or fragments that are crucial for protein binding. To address this limitation, we focus on a more specific strategy called fragment alignment or partial matching (PM). PM aims to identify pairs of molecules that overlap to a high degree over parts of their surfaces. To this end, we propose aligning 3D molecular structures using a point cloud representation, and we first address the problem of matching a fragment to a larger molecule of which it is a part.
Aligning two objects by their point clouds, formally known as the registration problem, is a well-known problem in areas such as robotics, virtual reality and autonomous driving [2], but has scarcely been exploited in drug discovery. Point clouds have recently been used for matching molecular surfaces in the sensaas approach [3]. However, this approach has some limitations: (i) several resolutions of the point clouds must be applied in order to find the optimum alignment, and (ii) the alignment relies on stochastic methods, which leads to variable results. To improve on previous approaches, we propose to perform the point cloud alignment in two stages. First, a global optimisation establishes the most similar points between the two clouds, based on invariant characteristics of the positions of the points [4]. Subsequently, a local alignment is applied with an iterative closest point (ICP) optimiser that takes into account not only the coordinates but also the type of atom to which each point belongs [5]. This approach is deterministic and therefore provides a more stable method that overcomes the two limitations described above and translates into reduced computational time.
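As an illustration only, the local stage can be sketched in plain NumPy. This is not the authors' implementation: it assumes the global stage has already placed the fragment close to its target pose, it uses brute-force nearest-neighbour search, and it handles atom types with a hard filter (a point may only be paired with a point of the same type), which is a simplification of a colour-aware ICP objective. The names kabsch and typed_icp, the integer type codes, and the helical toy data are all hypothetical. It also assumes every atom type in the fragment occurs in the larger molecule.

```python
import numpy as np


def kabsch(source, target):
    """Least-squares rigid transform (R, t) such that target ~ R @ source + t."""
    src_mean, tgt_mean = source.mean(axis=0), target.mean(axis=0)
    H = (source - src_mean).T @ (target - tgt_mean)   # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))            # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t


def typed_icp(frag_xyz, frag_types, mol_xyz, mol_types, max_iter=50, tol=1e-9):
    """Deterministic ICP in which points are only paired with same-type points."""
    moved = frag_xyz.copy()
    # True where a fragment point and a molecule point have different types.
    type_mismatch = frag_types[:, None] != mol_types[None, :]
    prev_err = np.inf
    for _ in range(max_iter):
        # Squared distances between every fragment point and every molecule point.
        d2 = ((moved[:, None, :] - mol_xyz[None, :, :]) ** 2).sum(axis=2)
        d2[type_mismatch] = np.inf                    # forbid cross-type pairs
        nearest = d2.argmin(axis=1)
        R, t = kabsch(moved, mol_xyz[nearest])
        moved = moved @ R.T + t
        err = np.sqrt(((moved - mol_xyz[nearest]) ** 2).sum(axis=1).mean())
        if abs(prev_err - err) < tol:                 # RMSD no longer changing
            break
        prev_err = err
    return moved, nearest, err


# Toy data (hypothetical): 20 points on a helix with cycling atom-type codes.
i = np.arange(20)
mol_xyz = np.column_stack([1.5 * np.cos(0.9 * i), 1.5 * np.sin(0.9 * i), 0.8 * i])
mol_types = np.array([6, 7, 8])[i % 3]

# The fragment is points 5..12 of the molecule, slightly rotated and shifted
# to mimic the rough pose that a global stage might return.
frag_idx = np.arange(5, 13)
a = 0.05
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a), np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
frag_xyz = mol_xyz[frag_idx] @ Rz.T + np.array([0.1, -0.1, 0.05])
frag_types = mol_types[frag_idx]

aligned, nearest, err = typed_icp(frag_xyz, frag_types, mol_xyz, mol_types)
```

Because no random sampling is involved, repeated runs on the same input give the same correspondences and the same final pose, which is the property the two-stage scheme relies on. In this toy case the perturbation is small relative to the point spacing, so the fragment is mapped back onto the points it was taken from; a real starting pose would come from the global, feature-based stage.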
References
[1] Venkatraman, V., Pérez-Nueno, V. I., Mavridis, L., & Ritchie, D. W. (2010). Journal of Chemical Information and Modeling, 50(12), 2079–2093.
[2] Zhu, H., Guo, B., Zou, K., Li, Y., Yuen, K.-V., Mihaylova, L., & Leung, H. (2019). Sensors, 19(5), 1191.
[3] Douguet, D., & Payan, F. (2020). Molecular Informatics, 39(8).
[4] Rusu, R. B., Blodow, N., & Beetz, M. (2009). 2009 IEEE International Conference on Robotics and Automation, 3212–3217.
[5] Park, J., Zhou, Q.-Y., & Koltun, V. (2017). Proceedings of the IEEE International Conference on Computer Vision (ICCV).