Interactive cohort exploration for spinocerebellar ataxias using synthetic cohort data for visualization
Authors:
Philipp Wegner,
Sebastian Schaaf,
Mischa Uebachs,
Marcus Grobe-Einsler,
Thomas Klockgether,
Juliane Fluck,
Jennifer Faber
Abstract:
Motivation: Visualization is a crucial step in understanding clinical data and deriving hypotheses from it. For clinicians, however, visualization often requires great effort because they lack technical knowledge about data handling and plotting. Our application, SCAview, offers an easy-to-use solution with an intuitive design that supports various kinds of plotting functions. The aim was to provide an intuitive solution with a low entry barrier for clinical users. Little to no onboarding is required before creating plots, while the complexity of questions can grow up to specific corner cases. To allow for an easy start and testing with SCAview, we incorporated a synthetic cohort dataset based on real data of rare neurological movement disorders: the most common autosomal-dominantly inherited spinocerebellar ataxias (SCAs) types 1, 2, 3, and 6 (SCA1, SCA2, SCA3, and SCA6). Methods: We created a Django-based backend application that serves the data to a React-based frontend, which uses Plotly for plotting. A synthetic cohort was created to deploy a version of SCAview without violating any data protection guidelines. Here, we added normally distributed noise to the data to prevent re-identification while keeping distributions and general correlations. Results: This work presents SCAview, a user-friendly, interactive web-based service that enables data visualization in a clickable interface allowing intuitive graphical handling. The service is deployed and can be tested with a synthetic cohort created from a large, longitudinal dataset from observational studies in the most common SCAs.
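The noise step mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not state how the noise was scaled or applied, so everything here is an assumption made for illustration: the function name `add_gaussian_noise`, the `relative_sd` parameter (noise standard deviation as a fraction of the sample standard deviation), and the example values are hypothetical and are not taken from SCAview.

```python
import random
import statistics


def add_gaussian_noise(values, relative_sd=0.1, seed=None):
    """Return a copy of `values` with zero-mean normal noise added.

    Assumption: the noise standard deviation is `relative_sd` times the
    sample standard deviation of `values`. The abstract does not specify
    the actual scaling used for the synthetic cohort.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    noise_sd = statistics.stdev(values) * relative_sd
    return [v + rng.gauss(0.0, noise_sd) for v in values]


# Hypothetical clinical scores, not real cohort data.
scores = [8.0, 12.5, 15.0, 9.5, 20.0, 17.5, 11.0, 14.0]
synthetic = add_gaussian_noise(scores, relative_sd=0.1, seed=42)

# Individual values change, while the overall mean stays close to the original.
print(statistics.mean(scores), statistics.mean(synthetic))
```

With a small `relative_sd`, each value is perturbed while summary statistics stay near those of the input, which matches the abstract's stated goal of keeping distributions and general correlations. Whether a given noise level is sufficient to prevent re-identification is a separate question that this sketch does not address.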
Submitted 13 June, 2023; v1 submitted 29 October, 2022;
originally announced October 2022.
Integrative Data Semantics through a Model-enabled Data Stewardship
Authors:
Philipp Wegner,
Sebastian Schaaf,
Mischa Uebachs,
Daniel Domingo-Fernández,
Yasamin Salimi,
Stephan Gebel,
Astghik Sargsyan,
Colin Birkenbihl,
Stephan Springstubbe,
Thomas Klockgether,
Juliane Fluck,
Martin Hofmann-Apitius,
Alpha Tom Kodamullil
Abstract:
Motivation: The importance of clinical data in understanding the pathophysiology of complex disorders has prompted the launch of multiple initiatives designed to generate patient-level data from various modalities. While these studies can reveal important findings relevant to the disease, each study captures different yet complementary aspects and modalities which, when combined, generate a more comprehensive picture of disease aetiology. However, achieving this requires a global integration of data across studies, which proves challenging given the lack of interoperability of cohort datasets. Results: Here, we present the Data Steward Tool (DST), an application that allows for semi-automatic semantic integration of clinical data into ontologies, global data models, and data standards. We demonstrate the applicability of the tool in the field of dementia research by establishing a Clinical Data Model (CDM) in this domain. The CDM currently consists of 277 common variables covering demographics (e.g. age and gender), diagnostics, neuropsychological tests, and biomarker measurements. The DST, combined with this disease-specific data model, shows how interoperability between multiple heterogeneous dementia datasets can be achieved.
Submitted 17 November, 2021;
originally announced November 2021.