eCEO: an efficient Cloud Epistasis cOmputing model in genome-wide association study

Bioinformatics. 2011 Apr 15;27(8):1045-51. doi: 10.1093/bioinformatics/btr091. Epub 2011 Mar 2.

Abstract

Motivation: Recent studies suggest that a combination of multiple single nucleotide polymorphisms (SNPs) can have a more significant association with a specific phenotype than individual SNPs do. However, discovering epistasis (the epistatic interactions of SNPs) among a large number of SNPs is a computationally challenging task. We are therefore motivated to develop efficient and effective solutions for identifying epistatic interactions of SNPs.

Results: In this article, we propose an efficient Cloud-based Epistasis cOmputing (eCEO) model for large-scale epistatic interaction analysis in genome-wide association studies (GWAS). Given a large number of combinations of SNPs, our eCEO model distributes them so that the load is balanced across the processing nodes. Moreover, the model efficiently processes each combination of SNPs to determine the significance of its association with the phenotype. We have implemented and evaluated eCEO on our own cluster of more than 40 nodes. The experimental results demonstrate that the model is computationally efficient, flexible, scalable and practical. In addition, we have deployed eCEO on the Amazon Elastic Compute Cloud; this deployment further confirms its efficiency and ease of use in a public cloud.
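
The two steps described in the Results (distributing SNP combinations evenly across nodes, then scoring each combination against the phenotype) can be illustrated with a minimal Python sketch. The abstract does not state which partitioning scheme or test statistic eCEO uses, so everything here is an assumption for illustration: the function names `partition_pairs` and `chi_square_pair` are hypothetical, only pairwise combinations are considered, genotypes are coded 0/1/2, the phenotype is coded 0 (control) or 1 (case), and a Pearson chi-square statistic stands in for the significance measure.

```python
from itertools import combinations, islice
from math import comb


def partition_pairs(num_snps, num_nodes):
    """Split all C(num_snps, 2) SNP pairs into near-equal chunks, one per node.

    Hypothetical helper: chunk sizes differ by at most one pair, which is one
    simple way to balance load when every pair costs about the same to score.
    """
    total = comb(num_snps, 2)
    base, extra = divmod(total, num_nodes)
    pairs = combinations(range(num_snps), 2)
    chunks = []
    for node in range(num_nodes):
        size = base + (1 if node < extra else 0)
        chunks.append(list(islice(pairs, size)))
    return chunks


def chi_square_pair(geno_a, geno_b, phenotype):
    """Pearson chi-square statistic for a 9 x 2 contingency table.

    Rows are the joint genotypes of two SNPs (each coded 0, 1 or 2);
    columns are control (0) and case (1). This statistic is an assumed
    stand-in, not necessarily the measure used by eCEO.
    """
    counts = [[0, 0] for _ in range(9)]
    for a, b, y in zip(geno_a, geno_b, phenotype):
        counts[3 * a + b][y] += 1
    n = len(phenotype)
    col_totals = [sum(row[c] for row in counts) for c in (0, 1)]
    stat = 0.0
    for row in counts:
        row_total = sum(row)
        for c in (0, 1):
            expected = row_total * col_totals[c] / n
            if expected > 0:  # skip empty cells to avoid division by zero
                stat += (row[c] - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Toy data: 4 samples, 3 SNPs, split across 2 hypothetical nodes.
    genotypes = [[0, 0, 1, 1], [0, 0, 0, 0], [2, 1, 2, 1]]
    phenotype = [0, 0, 1, 1]
    for node_id, chunk in enumerate(partition_pairs(len(genotypes), 2)):
        for i, j in chunk:
            score = chi_square_pair(genotypes[i], genotypes[j], phenotype)
            print(node_id, (i, j), round(score, 3))
```

In a real deployment each chunk would be sent to a separate worker rather than looped over in one process; the number of pairs grows as n(n-1)/2, which is why balanced partitioning matters for large SNP sets.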

Availability: The source code of eCEO is available at http://www.comp.nus.edu.sg/~wangzk/eCEO.html.

Contact: [email protected].

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Epistasis, Genetic*
  • Genome-Wide Association Study*
  • Models, Statistical*
  • Phenotype
  • Polymorphism, Single Nucleotide*
  • Software