Retrosynthetic Analysis and Molecular Pairs, a Match Made in Heaven at Sosei Heptares?
Robert T Smith1, Benjamin G Tehan1, Conor G Scully1, Miles Congreve1, Andreas Bender2, Chris de Graaf1
2University of Cambridge
Recent computational methods for retrosynthetic analysis have resulted in several algorithms that give synthetically tractable routes to potential leads [1, 2]. This poster will describe the deployment of an integrated molecule generation and retrosynthetic analysis tool at Sosei Heptares, combining compound enumeration approaches, scoring/ranking methodologies, and data-driven identification of efficient chemical synthesis routes.
The described approach aims to address challenges in the enumeration, chemical diversification, post-processing, and scoring/ranking of new chemical matter within drug discovery programs, including:
i) Matched molecular pair (MMP) analysis [3, 4] has been used in drug discovery programs for several years and is now widespread. However, the algorithm relies on a rich database of compounds covering a wide range of substituents, and therefore does not capture new or recent advances in synthetic and medicinal chemistry knowledge.
ii) The enumeration of compounds that are not ‘MedChem’ friendly (for example, those containing reactive functional groups) requires significant post-processing prior to synthetic efforts. Reactive group filters [5, 6] were developed to filter high-throughput screening sets and can be used to remove these reactive compounds. However, provided the reactive groups do not interfere with the reaction, they can serve as additional handles for further chemical diversification.
iii) Identifying which chemical groups are incompatible with a chemical reaction has previously proved difficult, requiring trial reactions or a multitude of searches in the chemical literature. With the advent of retrosynthetic analysis, an indication of whether a functional group will interfere with a given synthetic reaction can now be derived.
Our poster will describe how combining retrosynthetic analysis and prospective forward enumeration with matched molecular pairs enables a project to enumerate a large proportion of the synthetic space around a given hit molecule and, in combination with predictions for ADMET and drug-likeness, to prioritise target molecules in a highly efficient manner.
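As a rough illustration of the enumerate, filter, and prioritise idea described above, the following toy Python sketch may help. It is not the tool deployed at Sosei Heptares. The transformation table, the flagged groups, the reaction tolerance table, and the scores are all hypothetical placeholders standing in for mined MMP rules, a reactive group filter, the output of a retrosynthetic analysis, and ADMET/drug-likeness predictions respectively.

```python
from dataclasses import dataclass

# Hypothetical MMP-style transformations: the substituent on the left may be
# swapped for any substituent on the right. Real rules would be mined from a
# compound database; these are invented for illustration.
TRANSFORMS = {
    "H": ["F", "Cl", "CH3"],
    "CH3": ["CF3", "CHO"],
}

# Hypothetical groups that a reactive group filter would flag.
FLAGGED_GROUPS = {"CHO"}

# Hypothetical table of flagged groups that each planned reaction tolerates,
# standing in for information derived from retrosynthetic analysis.
TOLERATED = {
    "amide_coupling": set(),
    "suzuki_coupling": {"CHO"},
}

# Placeholder scores standing in for ADMET / drug-likeness predictions.
TOY_SCORES = {"F": 0.8, "Cl": 0.6, "CH3": 0.7, "CF3": 0.5, "CHO": 0.4}


@dataclass(frozen=True)
class Candidate:
    scaffold: str
    substituent: str
    score: float


def enumerate_substituents(substituent):
    """Return the substituents reachable by one transformation."""
    return TRANSFORMS.get(substituent, [])


def is_compatible(substituent, reaction):
    """Keep unflagged groups, plus flagged groups the reaction tolerates."""
    if substituent not in FLAGGED_GROUPS:
        return True
    return substituent in TOLERATED.get(reaction, set())


def toy_score(substituent):
    """Look up a placeholder score; unknown substituents score 0.0."""
    return TOY_SCORES.get(substituent, 0.0)


def prioritise(scaffold, substituent, reaction, score_fn=toy_score):
    """Enumerate, drop incompatible groups, and rank by descending score."""
    candidates = [
        Candidate(scaffold, s, score_fn(s))
        for s in enumerate_substituents(substituent)
        if is_compatible(s, reaction)
    ]
    return sorted(candidates, key=lambda c: c.score, reverse=True)


if __name__ == "__main__":
    for reaction in ("amide_coupling", "suzuki_coupling"):
        ranked = prioritise("scaffold-A", "CH3", reaction)
        print(reaction, [c.substituent for c in ranked])
```

In this sketch a flagged group is only discarded when the planned reaction does not tolerate it, so the same group can be kept as a diversification handle under one route and removed under another.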
[1] Segler MHS, Preuss M, Waller MP; Learning to Plan Chemical Syntheses; arXiv:1708.04202v1
[2] Christ CD, Zentgraf M, Kriegl JM; Mining Electronic Laboratory Notebooks: Analysis, Retrosynthesis and Reaction Based Enumeration; J. Chem. Inf. Model.; 2012, 52, 1745-1756; doi: 10.1021/ci300116p
[3] Hussain J, Rea C; Computationally efficient algorithm to identify matched molecular pairs (MMPs) in large data sets; J. Chem. Inf. Model.; 2010, 50, 339-348
[4] Dalke A, Hert J, Kramer C; mmpdb: An Open-Source Matched Molecular Pair Platform for Large Multiproperty Data Sets; J. Chem. Inf. Model.; 2018, 58, 902-910
[5] Hann M, Hudson B, Lewell X, Lifely R, Miller L, Ramsden N; Strategic Pooling of Compounds for High-Throughput Screening; J. Chem. Inf. Comput. Sci.; 1999, 39, 897-902
[6] Pearce BC, Sofia MJ, Good AC, Drexler DM, Stock DA; An empirical process for the design of high-throughput screening deck filters; J. Chem. Inf. Model.; 2006, 46, 1060-1068
[7] Segler MHS, Waller MP; Neural-Symbolic Machine Learning for Retrosynthesis and Reaction Prediction; Chem. Eur. J.; 2017, 23, 5966-5971