Abstract Details


Poster 31: The MQN-Mapplet: Interactive Access to Millions of Molecules on your Desktop

Mahendra Awale¹, Ruud van Deursen², Jean-Louis Reymond¹
¹ Dept. of Chemistry and Biochemistry, University of Bern, Switzerland
² Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Chemical space is the ensemble of all organic molecules to be considered when searching for new drugs, and it can be broadly divided into known and unknown chemical space. Known chemical space contains all the molecules that have been synthesized to date and is partly represented by publicly available databases such as DrugBank (~6500), ChEMBL (>1.1 M), ZINC (~22 M) and PubChem (~32.5 M). Unknown chemical space, on the other hand, contains everything that has not yet been synthesized or discovered and offers a practically infinite number of possible molecules. Even the number of possible drug-like small molecules alone has been estimated to reach 10^60 compounds. A tiny fraction of this unknown chemical space is available in the form of virtual chemical libraries, such as: a) the Pfizer Global Virtual Library (PGVL), which lists approximately 10^12 virtual molecules that could potentially be synthesized [1]; and b) GDB-11 (~26.5 M), GDB-13 (~977 M) and GDB-17 (~150 B), which enumerate all organic small molecules of up to 11, 13 and 17 heavy atoms, respectively, that are possible following simple rules of chemical stability [2-4].

Given the vast amount of available chemical information, a persistent challenge is the visualization and navigation of this “chemical space” in a way that gives a quick and broad view of its content. One approach to this problem uses the concept of multidimensional property spaces, in which the dimensions are assigned to selected numerical descriptors of molecular structure [5]. Principal component analysis (PCA) can then be used to project this multidimensional property space into a lower-dimensional space, typically a 2D or 3D space, which can be visualized.
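The projection step described above can be illustrated with a minimal sketch. It assumes a tiny, made-up table of four integer descriptors per molecule (the values and the names `toy_descriptors` and `pca_project` are hypothetical), and it finds the principal axes by power iteration with deflation using only the Python standard library. This is one simple way to compute a PCA, not necessarily the procedure used by the authors.

```python
import math

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def dominant_axis(matrix, iterations=1000):
    """Power iteration: (eigenvalue, unit eigenvector) for the largest eigenvalue."""
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left in this matrix
            break
        v = [x / norm for x in w]
    value = sum(v[i] * matrix[i][j] * v[j] for i in range(d) for j in range(d))
    return value, v

def pca_project(rows, n_components=2):
    """Project each row onto the first n_components principal axes."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    axes = []
    for _ in range(n_components):
        value, axis = dominant_axis(cov)
        axes.append(axis)
        # Deflation: remove the axis just found so the next pass finds the next one.
        cov = [[cov[i][j] - value * axis[i] * axis[j] for j in range(d)]
               for i in range(d)]
    coords = [[sum(c * a for c, a in zip(row, axis)) for axis in axes]
              for row in centered]
    return coords, axes

# Hypothetical toy data: 6 molecules x 4 integer descriptors (not real MQN values).
toy_descriptors = [
    [6, 0, 1, 0],
    [7, 1, 1, 0],
    [12, 2, 2, 1],
    [13, 2, 2, 1],
    [20, 4, 3, 2],
    [21, 5, 3, 2],
]

coords, axes = pca_project(toy_descriptors)
for point in coords:
    print(round(point[0], 3), round(point[1], 3))
```

Each printed pair is the 2D map position of one molecule. For databases of millions of molecules a numerical library would be the practical choice; the pure-Python version is only meant to make the steps explicit.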

Here we report the development of the MQN-Mapplet, a Java application giving interactive access to the structures of small molecules in large databases via color-coded maps of their chemical space. These maps are projections from a 42-dimensional property space defined by 42 integer-valued descriptors called molecular quantum numbers (MQN) [6]. MQN count different types of atoms, bonds, rings, polar groups and topological features in molecules, and thereby categorize molecules by size, rigidity and polarity rather than by substructure. Despite its simplicity, MQN-space is relevant to biological activities. In contrast to other database browsing sites, the MQN-Mapplet lets one start exploring chemical space without any query molecule. An option is nevertheless provided to locate a molecule of interest on the map, which can then be used as a point of origin for further exploration. While navigating chemical space, the user has access to most of the available structural information. Additionally, the MQN-Mapplet allows the identification of analogs as neighbors on the MQN-map or in the original 42-dimensional MQN-space. To our knowledge, this type of interactive exploration tool is unprecedented for very large databases such as PubChem and GDB-13 (almost one billion molecules). The application is freely available for download at www.gdb.unibe.ch.
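The idea of finding analogs as neighbors in descriptor space can be sketched as follows. The abstract does not state which distance measure is used, so the city-block (Manhattan) distance here is an assumption, chosen because it is a natural fit for integer count vectors. The molecule names, the five-element vectors (shortened from 42 for readability) and the function names are all hypothetical.

```python
def city_block(a, b):
    """City-block (Manhattan) distance between two equal-length count vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_neighbors(query, database, k=3):
    """Return the k database entries closest to the query as (name, distance)."""
    ranked = sorted(
        database.items(),
        # Sort by distance first, then by name so ties are resolved deterministically.
        key=lambda item: (city_block(query, item[1]), item[0]),
    )
    return [(name, city_block(query, vector)) for name, vector in ranked[:k]]

# Hypothetical descriptor vectors; real MQN vectors have 42 entries.
toy_database = {
    "mol_A": [6, 1, 0, 1, 2],
    "mol_B": [6, 1, 1, 1, 2],
    "mol_C": [7, 2, 0, 1, 3],
    "mol_D": [12, 0, 3, 2, 0],
}

query_vector = [6, 1, 0, 1, 3]
print(nearest_neighbors(query_vector, toy_database, k=3))
# [('mol_A', 1), ('mol_B', 2), ('mol_C', 2)]
```

A linear scan like this is fine for a handful of molecules. Searching a database of close to a billion entries would need an index or pre-sorted storage, which is outside the scope of this sketch.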

[1] M. Boehm et al., J. Med. Chem., 2008, 51, 2468.
[2] T. Fink, H. Bruggesser and J.-L. Reymond, Angew. Chem. Int. Ed., 2005, 44, 1504.
[3] L. C. Blum and J.-L. Reymond, J. Am. Chem. Soc., 2009, 131, 8732.
[4] L. Ruddigkeit, R. van Deursen, L. C. Blum and J.-L. Reymond, J. Chem. Inf. Model., 2012, 52, 2864.
[5] R. S. Pearlman and K. M. Smith, Drug Discovery Des., 1998, 9, 339.
[6] K. T. Nguyen and J.-L. Reymond, ChemMedChem, 2009, 4, 1803.
