Glolloc: A Global-local Mixture of Experts Model and its Application to Small Molecule Drug Discovery
Jerome G. P. Wicker and Richard Sherhod
Benevolent AI, Babraham Campus, Cambridge CB22 3AT
Glolloc[1] is a recently proposed neural network architecture designed for quantitative structure-activity relationship (QSAR) applications. The algorithm is a Mixture-of-Experts (MoE) ensemble, which aims to improve performance compared to conventional multilayer perceptron neural networks by intelligently combining predictions from both global and local experts. This approach removes the need to maintain separate local models for each area of chemical space for a particular endpoint, or for multiple endpoints. It also provides a mechanism for introspecting the model by interrogating the weights assigned to each expert.
There are two main scenarios in which a Glolloc model can be deployed. The first is a single-task approach, in which the endpoint of interest is modelled using multiple structure experts, each corresponding to a different scaffold of interest present in the dataset. Explicitly separating these scaffolds allows effects that are relevant only to particular chemical series to be modelled individually while still being trained as part of the same overall model. The original publication described in detail how to automatically identify the SMARTS patterns that define the experts; we show here that commercially available series identification methods can be used in place of that approach.
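As an illustration of how predictions from a global expert and scaffold-specific experts might be combined, the following is a minimal sketch and not the published Glolloc implementation. It assumes that each molecule comes with a precomputed binary membership vector (for example from SMARTS matching or a series identification tool), that the global expert applies to every molecule, and that gating weights come from a softmax restricted to the applicable experts. The names masked_softmax and combine_experts, and all numbers, are hypothetical.

```python
import numpy as np

def masked_softmax(logits, mask):
    """Softmax over experts, restricted to experts whose mask entry is 1."""
    logits = np.asarray(logits, dtype=float)
    mask = np.asarray(mask)
    # Experts that do not apply to a molecule get a logit of -inf (weight 0).
    masked = np.where(mask > 0, logits, -np.inf)
    # Subtract the row maximum for numerical stability; it is finite because
    # the global expert (column 0) is assumed to apply to every molecule.
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=1, keepdims=True)

def combine_experts(expert_preds, gate_logits, membership):
    """Return the weighted prediction per molecule and the expert weights."""
    weights = masked_softmax(gate_logits, membership)
    preds = (weights * np.asarray(expert_preds, dtype=float)).sum(axis=1)
    return preds, weights

# Hypothetical toy data: 2 molecules, 3 experts (global, scaffold A, scaffold B).
membership = np.array([[1, 1, 0],    # molecule 0 contains scaffold A
                       [1, 0, 0]])   # molecule 1 matches no scaffold
expert_preds = np.array([[6.0, 7.0, 5.0],
                         [6.5, 9.0, 9.0]])
gate_logits = np.zeros((2, 3))       # untrained gate: equal logits

pred, weights = combine_experts(expert_preds, gate_logits, membership)
print(weights)  # rows: [0.5, 0.5, 0.0] and [1.0, 0.0, 0.0]
print(pred)     # [6.5, 6.5]
```

In this sketch the returned weights are the quantity one would inspect to see how much each expert contributed to a given prediction; a molecule that matches no scaffold falls back entirely on the global expert.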
The second scenario is a multitask neural network. Glolloc can train a single global expert, equivalent to a traditional multilayer perceptron model. In addition, an expert can be assigned to each target endpoint, so that those targets are modelled explicitly within a single overall model rather than by training a separate single-task model for each one. We demonstrate the utility of this approach on a live drug discovery project, modelling on- and off-target activity. We also extend the use cases for Glolloc to endpoints outside conventional QSAR activity prediction that can be crucial for drug discovery, such as disease-relevant ADME properties.
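One way to picture the multitask setup is as an assignment table that records, for each measurement, which experts may contribute to its prediction. The sketch below is an assumption about how such a table could be laid out, not a description of the Glolloc code: each row is one (molecule, endpoint) measurement, column 0 is the global expert and is always active, and column k + 1 is the expert for endpoint k. The function name task_membership and the endpoint indices are hypothetical.

```python
import numpy as np

def task_membership(task_ids, n_tasks):
    """Build a binary expert-assignment table for multitask data.

    task_ids: integer endpoint index (0 .. n_tasks - 1) per measurement.
    Returns an array of shape (n_measurements, n_tasks + 1).
    """
    task_ids = np.asarray(task_ids, dtype=int)
    mask = np.zeros((task_ids.size, n_tasks + 1), dtype=int)
    mask[:, 0] = 1                                    # global expert, always on
    mask[np.arange(task_ids.size), task_ids + 1] = 1  # endpoint-specific expert
    return mask

# Hypothetical example: three measurements on endpoints 0, 2 and 1,
# e.g. one on-target assay and two off-target assays.
mask = task_membership([0, 2, 1], n_tasks=3)
print(mask)
# [[1 1 0 0]
#  [1 0 0 1]
#  [1 0 1 0]]
```

Under this layout every measurement is seen by the global expert, which can learn trends shared across endpoints, while exactly one endpoint expert captures the behaviour specific to that assay.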
References
[1] Gaspar, H.A.; Seddon, M.P. (2022) Glolloc: Mixture of global and local experts for molecular activity prediction. MLDD workshop, ICLR 2022 https://openreview.net/forum?id=Mdj229oYWa3