Predicting Comprehensive Interactomes: Case Study of the Empirical Upper Limit

Citation

Kevin Dick, Bahram Samanfar, Elroy Cober, James R. Green: Predicting comprehensive interactomes: case study of the empirical upper limit. International Conference on Biomedical and Health Informatics, IEEE-EMBS 2019, Chicago, USA.

Plain language summary

NA

Abstract

The design of novel therapeutics emerges from an understanding of the molecular mechanisms and pathogenesis of a given disease. Recently, the engineering of short synthetic binding macromolecules has been proposed as an alternative therapy to synthetic compounds. To both investigate the pathogenesis of disease and explore the protein-protein interaction space within or between organisms, we can predict the comprehensive interactome of various organisms. Such a task is often extremely computationally demanding, resulting in the need for high-performance computing (HPC) infrastructure. In this paper, we provide experiential advice from the prediction of the largest-ever collection of interactomes. Seven prediction schemas involving soybean (Glycine max), the soybean cyst nematode (Heterodera glycines), and human (Homo sapiens) were run, collectively comprising over 25 billion interactions and 1.8 terabytes of prediction data, at an estimated cost of $50,404.03 had they been executed on the Amazon Web Services compute infrastructure. Given the significance of these organisms to agricultural health and the global economy, these interactomes will form the basis of future testable hypotheses. Furthermore, a suite of job management tools is made available to any researchers looking to leverage HPC infrastructure in their own work: github.com/chazingtheinfinite/suite-HPC-scripts. Finally, a compute cluster dashboard tracking the demand for resources on five publicly available clusters is made available at cu-bic.ca/clusters/.
Index Terms—Machine Learning, High-Performance Computing, Protein-Protein Interaction

Publication date

2019-05-19