Latent Gaussian process with composite likelihoods for data-driven disease stratification

Ramchandran, Siddharth; Koskinen, Miika; Lähdesmäki, Harri

arXiv:1909.01614v1 (stat)

[Submitted on 4 Sep 2019 (this version), latest version 20 Apr 2021 (v4)]

Abstract: Data-driven techniques for identifying disease subtypes using medical records can greatly benefit the management of patients' health and unravel the underpinnings of diseases. Clinical patient records are typically collected from disparate sources and result in high-dimensional data that require multiple likelihoods and contain noisy and missing values. Probabilistic methods capable of analysing large-scale patient records have a central role in biomedical research and are expected to become even more important once data-driven personalised medicine is established in clinical practice. In this work we propose an unsupervised, generative model that can identify clusters of patients in a latent space while making use of all available data (i.e. in a heterogeneous data setting with noisy and missing values). We use Gaussian process latent variable models (GPLVM) and deep neural networks to create a non-linear dimensionality reduction technique for heterogeneous data. The effectiveness of our model is demonstrated on clinical data of Parkinson's disease patients treated at the HUS Helsinki University Hospital. We identify sub-groups in the heterogeneous patient data, evaluate the robustness of the findings, and interpret cluster characteristics.
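To make the phrase "multiple likelihoods with noisy and missing values" concrete, the following is a minimal sketch of a composite log-likelihood for one patient record. It is not the authors' implementation: the abstract does not specify which likelihoods are used, so the Gaussian and Bernoulli choices, the function names, the `noise_var` parameter, and the use of `None` to mark a missing value are all assumptions made for illustration. The decoded latent function values are taken as given; no Gaussian process or neural network is fitted here.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def gaussian_logpdf(x, mean, var):
    # Log density of a univariate Gaussian with the given mean and variance.
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def bernoulli_logpmf(x, logit):
    # Log probability of x in {0, 1} when P(x = 1) = sigmoid(logit).
    return x * logit - softplus(logit)


def composite_log_likelihood(row, decoded, kinds, noise_var=1.0):
    """Sum per-feature log-likelihoods, skipping missing entries.

    row     : observed values for one patient (None marks a missing value)
    decoded : function values produced for each feature (assumed given)
    kinds   : "gaussian" or "bernoulli" for each feature (hypothetical labels)
    """
    total = 0.0
    for x, f, kind in zip(row, decoded, kinds):
        if x is None:
            # A missing value contributes nothing to the objective.
            continue
        if kind == "gaussian":
            total += gaussian_logpdf(x, f, noise_var)
        elif kind == "bernoulli":
            total += bernoulli_logpmf(x, f)
        else:
            raise ValueError(f"unknown likelihood kind: {kind}")
    return total


# Hypothetical record: one continuous value, one missing value, one binary flag.
record = [1.0, None, 1]
function_values = [1.0, 5.0, 0.0]
feature_kinds = ["gaussian", "gaussian", "bernoulli"]
print(composite_log_likelihood(record, function_values, feature_kinds))
```

Because missing entries are skipped rather than imputed, every observed value of every patient can still contribute to the objective, which is one plausible reading of "making use of all available data".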

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Applications (stat.AP)
Cite as:	arXiv:1909.01614 [stat.ML]
	(or arXiv:1909.01614v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1909.01614

Submission history

From: Siddharth Ramchandran
[v1] Wed, 4 Sep 2019 08:30:22 UTC (4,091 KB)
[v2] Wed, 17 Jun 2020 11:12:01 UTC (1,309 KB)
[v3] Wed, 25 Nov 2020 18:32:04 UTC (2,746 KB)
[v4] Tue, 20 Apr 2021 14:57:11 UTC (3,067 KB)
