Showing 1–2 of 2 results for author: Keys, K L

Search v0.5.6 released 2020-02-24

arXiv:1902.05189 [pdf, other]

stat.AP q-bio.GN

doi 10.1007/s00439-019-02001-z

OPENMENDEL: A Cooperative Programming Project for Statistical Genetics

Authors: Hua Zhou, Janet S. Sinsheimer, Christopher A. German, Sarah S. Ji, Douglas M. Bates, Benjamin B. Chu, Kevin L. Keys, Juhyun Kim, Seyoon Ko, Gordon D. Mosher, Jeanette C. Papp, Eric M. Sobel, Jing Zhai, Jin J. Zhou, Kenneth Lange

Abstract: Statistical methods for genomewide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, a collaborative effort of statistical geneticists is required to develop open source software targeted to genetic epidemiology. Our attempt to meet this need is called the OPENMENDEL project (https://openmendel.github.io). It aims to (1) enable interactive and reproducible analyses with informative intermediate results, (2) scale to big data analytics, (3) embrace parallel and distributed computing, (4) adapt to rapid hardware evolution, (5) allow cloud computing, (6) allow integration of varied genetic data types, and (7) foster easy communication between clinicians, geneticists, statisticians, and computer scientists. This article reviews and makes recommendations to the genetic epidemiology community in the context of the OPENMENDEL project.

Submitted 13 February, 2019; originally announced February 2019.

Comments: 16 pages, 2 figures, 2 tables

Journal ref: Human Genetics, pp 1-11, 2019 Mar 26

arXiv:1608.01398 [pdf, ps, other]

stat.ML

doi 10.1002/gepi.22068

Iterative Hard Thresholding for Model Selection in Genome-Wide Association Studies

Authors: Kevin L. Keys, Gary K. Chen, Kenneth Lange

Abstract: A genome-wide association study (GWAS) correlates marker variation with trait variation in a sample of individuals. Each study subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here we assume that subjects are unrelated and collected at random and that trait values are normally distributed or transformed to normality. Over the past decade, researchers have been remarkably successful in applying GWAS analysis to hundreds of traits. The massive amount of data produced in these studies present unique computational challenges. Penalized regression with LASSO or MCP penalties is capable of selecting a handful of associated SNPs from millions of potential SNPs. Unfortunately, model selection can be corrupted by false positives and false negatives, obscuring the genetic underpinning of a trait. This paper introduces the iterative hard thresholding (IHT) algorithm to the GWAS analysis of continuous traits. Our parallel implementation of IHT accommodates SNP genotype compression and exploits multiple CPU cores and graphics processing units (GPUs). This allows statistical geneticists to leverage commodity desktop computers in GWAS analysis and to avoid supercomputing. We evaluate IHT performance on both simulated and real GWAS data and conclude that it reduces false positive and false negative rates while remaining competitive in computational time with penalized regression. Source code is freely available at https://github.com/klkeys/IHT.jl.

Submitted 24 July, 2017; v1 submitted 3 August, 2016; originally announced August 2016.

Comments: 13 pages, 1 figure, 4 tables

Journal ref: Genetic Epidemiology 2017:41(8), 756–768
