Abstract Details

Poster 39: Accelerating Data Fusion using High Performance Computing (Multi-core Processors and Graphics Processing Unit)

Mohd-Norhadri Hilmi1, Mostafa AlBarmawi1, Nurul Malim1, Nur'aini Abd Rashid1, Shereena M Arif2
1School of Computer Sciences, Universiti Sains Malaysia
2Information School, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia
Virtual screening is a popular approach for screening chemical compounds in pharmaceutical and agrochemical research. In drug discovery, virtual screening is used to search for bioactive compounds against specific target compounds in order to retrieve novel compounds that can lead to drug candidates. Because screening chemical compounds requires pair-wise comparison of compounds, large compound databases (currently 100 million substances) create a major computational bottleneck in tasks such as similarity searching, particularly as the databases grow in size. Recent studies have highlighted the need for computational efficiency in chemoinformatics, specifically in high-throughput screening.
The emergence of approaches that enhance virtual screening by combining similarity searching results into a unified list adds a further computational constraint. Such methods, known as data fusion, could benefit greatly from the massively parallel nature of the GPU.
Hence, in this research, we use multicore processors and the GPU to accelerate data fusion techniques in similarity-based virtual screening, aiming for better throughput and performance. Two types of data fusion are addressed in this study. The first, similarity fusion, combines the results of similarity searches that use multiple similarity coefficients and a single reference compound. The second, group fusion, combines the results of similarity searches based on multiple reference compounds over the same database representation (ECFP4 and ECFC4 fingerprints) with a single similarity coefficient. The proposed parallel versions of similarity fusion and group fusion each ran more than 10x faster than their sequential counterparts against compounds from the MDDR (MDL Drug Data Report) database.
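
The two fusion types described above can be illustrated with a small sequential sketch in Python. It is not the authors' implementation: the abstract does not name the fusion rules or the similarity coefficients, so this sketch assumes sum-of-ranks for similarity fusion, maximum score for group fusion, and the Tanimoto and Dice coefficients. The function names, the toy database, and the representation of fingerprints as sets of "on" bit positions are all hypothetical. Each compound is scored independently of the others, which is the kind of work that might be spread across CPU cores or GPU threads.

```python
from typing import Callable, Dict, FrozenSet, List

# Assumption: a fingerprint is modelled as the set of its "on" bit positions.
Fingerprint = FrozenSet[int]
Coefficient = Callable[[Fingerprint, Fingerprint], float]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a: Fingerprint, b: Fingerprint) -> float:
    """Dice coefficient: twice the shared bits divided by total bits set."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def similarity_fusion(reference: Fingerprint,
                      database: Dict[str, Fingerprint],
                      coefficients: List[Coefficient]) -> List[str]:
    """One reference, several coefficients; fused by sum of ranks (assumed rule)."""
    rank_sums = {cid: 0 for cid in database}
    for coeff in coefficients:
        # Rank the whole database under this coefficient, best score first.
        ranked = sorted(database,
                        key=lambda cid: coeff(reference, database[cid]),
                        reverse=True)
        for rank, cid in enumerate(ranked, start=1):
            rank_sums[cid] += rank
    # A lower rank sum means the compound ranked highly under more coefficients.
    return sorted(database, key=lambda cid: rank_sums[cid])


def group_fusion(references: List[Fingerprint],
                 database: Dict[str, Fingerprint],
                 coeff: Coefficient) -> List[str]:
    """Several references, one coefficient; fused by maximum score (assumed rule)."""
    fused = {cid: max(coeff(ref, fp) for ref in references)
             for cid, fp in database.items()}
    return sorted(database, key=lambda cid: fused[cid], reverse=True)


# Hypothetical toy data, for illustration only.
toy_db = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2}),
    "C": frozenset({7, 8}),
}
ref_1 = frozenset({1, 2, 3, 4})
ref_2 = frozenset({7, 8, 9})

sim_ranking = similarity_fusion(ref_1, toy_db, [tanimoto, dice])
grp_ranking = group_fusion([ref_1, ref_2], toy_db, tanimoto)
```

In this toy case, group fusion lifts compound C above B because C closely matches the second reference, whereas similarity fusion with the first reference alone ranks C last.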
