Detecting genome-wide epistases based on the clustering of relatively frequent items

Bioinformatics. 2012 Jan 1;28(1):5-12. doi: 10.1093/bioinformatics/btr603. Epub 2011 Nov 3.

Abstract

Motivation: In genome-wide association studies (GWAS), up to millions of single nucleotide polymorphisms (SNPs) are genotyped for thousands of individuals. However, conventional single-locus approaches are usually unable to detect the gene-gene interactions underlying complex diseases. Because the search space for high-order interactions is huge, many existing multi-locus approaches are slow and may suffer from low detection power on GWAS data.

Results: In this article, we develop a simple, fast and effective algorithm to detect genome-wide multi-locus epistatic interactions based on the clustering of relatively frequent items. Extensive experiments on simulated data show that our algorithm is fast and more powerful in general than some recently proposed methods. On a real genome-wide case-control dataset for age-related macular degeneration (AMD), the algorithm has identified genotype combinations that are significantly enriched in the cases.
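
As a rough illustration of what testing one candidate interaction involves, the sketch below scores a single SNP pair by comparing joint genotype counts between cases and controls with a Pearson chi-square statistic. It is not the algorithm developed in the article, whose clustering step is not described in this abstract. The function name `pair_chi_square`, the 0/1/2 genotype coding and the boolean case labels are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def pair_chi_square(geno_a, geno_b, is_case):
    """Pearson chi-square statistic for a 9x2 table of joint genotypes
    (rows) versus case/control status (columns).

    Hypothetical helper for illustration only. Genotypes are assumed to be
    coded 0, 1 or 2; is_case holds True for cases and False for controls.
    """
    counts = Counter(zip(geno_a, geno_b, is_case))
    n = len(is_case)
    n_case = sum(1 for c in is_case if c)
    col_totals = {True: n_case, False: n - n_case}

    stat = 0.0
    for ga, gb in product(range(3), repeat=2):
        row_total = counts[(ga, gb, True)] + counts[(ga, gb, False)]
        if row_total == 0:
            continue  # genotype combination never observed, contributes nothing
        for status in (True, False):
            expected = row_total * col_totals[status] / n
            if expected > 0:
                observed = counts[(ga, gb, status)]
                stat += (observed - expected) ** 2 / expected
    return stat
```

Scoring every SNP pair this way grows quadratically with the number of SNPs, and higher-order combinations grow faster still, which is the search-space problem the Motivation paragraph refers to.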

Availability: http://www.cs.ucr.edu/~minzhux/EDCF.zip

Contact: [email protected]; [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Epistasis, Genetic*
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study*
  • Humans
  • Macular Degeneration / genetics
  • Polymorphism, Single Nucleotide