Mixture model for sub-phenotyping in GWAS.

David Warde-Farley, Michael Brudno, Quaid Morris, Anna Goldenberg

Abstract

Genome-wide association (GWA) studies have led to the discovery of genetic variants underlying several complex diseases, including Crohn's disease and age-related macular degeneration (AMD). Still, geneticists find that in the majority of studies the effect size, even when significant, tends to be very small. Several factors contribute to this problem, such as rare variants, complex relationships among SNPs (epistatic effects), and heterogeneity of the phenotype. In this work we focus on addressing phenotypic heterogeneity. We introduce the problem of identifying, from GWAS data, separate genotypic markers from overlapping mixtures of clinically indistinguishable phenotypes. We propose a generative model for this scenario and derive an expectation-maximization (EM) procedure to fit the model to data, as well as a novel screening procedure designed to identify skew specific to certain phenotypic regimes. We present results on several simulated datasets as well as preliminary findings from applying the model to a type 2 diabetes dataset.

Publication

Pac. Symp. Biocomput., 17: 363-374, 2012

Date

December, 2011
