Maximum Likelihood for Gaussian Process Classification and Generalized Linear Mixed Models under Case-Control Sampling

Weissbrod, Omer; Kaufman, Shachar; Golan, David; Rosset, Saharon

arXiv:1801.03901 (stat)

[Submitted on 11 Jan 2018 (v1), last revised 24 Apr 2019 (this version, v3)]

Abstract: Modern data sets in various domains often include units that were sampled non-randomly from the population and have a latent correlation structure. Here we investigate a common form of this setting, where every unit is associated with a latent variable, all latent variables are correlated, and the probability of sampling a unit depends on its response. Such settings often arise in case-control studies, where the sampled units are correlated due to spatial proximity, family relations, or other sources of relatedness. Maximum likelihood estimation in such settings is challenging from both a computational and statistical perspective, necessitating approximations that take the sampling scheme into account. We propose a family of approximate likelihood approaches which combine composite likelihood and expectation propagation. We demonstrate the efficacy of our solutions via extensive simulations. We utilize them to investigate the genetic architecture of several complex disorders collected in case-control genetic association studies, where hundreds of thousands of genetic variants are measured for every individual, and the underlying disease liabilities of individuals are correlated due to genetic similarity. Our work is the first to provide a tractable likelihood-based solution for case-control data with complex dependency structures.

Subjects:	Methodology (stat.ME)
Cite as:	arXiv:1801.03901 [stat.ME]
	(or arXiv:1801.03901v3 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1801.03901
Journal reference:	JMLR 20(108):1-30, 2019

Submission history

From: Omer Weissbrod
[v1] Thu, 11 Jan 2018 17:56:06 UTC (2,125 KB)
[v2] Thu, 17 May 2018 14:32:17 UTC (248 KB)
[v3] Wed, 24 Apr 2019 15:00:25 UTC (274 KB)
