-
Development of a dynamic type 2 diabetes risk prediction tool: a UK Biobank study
Authors:
Nikola Dolezalova,
Massimo Cairo,
Alex Despotovic,
Adam T. C. Booth,
Angus B. Reed,
Davide Morelli,
David Plans
Abstract:
Diabetes affects over 400 million people and is among the leading causes of morbidity worldwide. Identification of high-risk individuals can support early diagnosis and prevention of disease development through lifestyle changes. However, the majority of existing risk scores require information about blood-based factors which are not obtainable outside of the clinic. Here, we aimed to develop an accessible solution that could be deployed digitally and at scale. We developed a predictive 10-year type 2 diabetes risk score using 301 features derived from 472,830 participants in the UK Biobank dataset while excluding any features which are not easily obtainable via a smartphone. Using a data-driven feature selection process, 19 features were included in the final reduced model. A Cox proportional hazards model slightly outperformed a DeepSurv model trained using the same features, achieving a concordance index of 0.818 (95% CI: 0.812-0.823), compared to 0.811 (95% CI: 0.806-0.815). The final model showed good calibration. This tool can be used for clinical screening of individuals at risk of developing type 2 diabetes and to foster patient empowerment by broadening their knowledge of the factors affecting their personal risk.
Submitted 20 April, 2021;
originally announced April 2021.
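Illustrative note on the metric: the abstract above compares models by concordance index. The sketch below is a minimal pure-Python version of Harrell's C, not the authors' code. The function name, the toy data, and the choice to skip pairs with tied follow-up times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times       -- follow-up time per participant
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification (assumption): skip tied times, and skip pairs whose
        # earlier time is censored, because their true order is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four participants, the third one censored.
toy_c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(round(toy_c, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.818 and 0.811 should be read.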
-
Development of digitally obtainable 10-year risk scores for depression and anxiety in the general population
Authors:
D. Morelli,
N. Dolezalova,
S. Ponzo,
M. Colombo,
D. Plans
Abstract:
The burden of depression and anxiety in the world is rising. Identification of individuals at increased risk of developing these conditions would help to target them for prevention and ultimately reduce the healthcare burden. We developed a 10-year predictive algorithm for depression and anxiety using the full cohort of over 400,000 UK Biobank (UKB) participants without pre-existing depression or anxiety using digitally obtainable information. From the initial 204 variables selected from UKB, processed into > 520 features, iterative backward elimination using a Cox proportional hazards model was performed to select predictors which account for the majority of its predictive capability. Baseline and reduced models were then trained for depression and anxiety using both Cox and DeepSurv, a deep neural network approach to survival analysis. The baseline Cox model achieved concordance of 0.813 and 0.778 on the validation dataset for depression and anxiety, respectively. For the DeepSurv model, respective concordance indices were 0.805 and 0.774. After feature selection, the depression model contained 43 predictors and the concordance index was 0.801 for both Cox and DeepSurv. The reduced anxiety model, with 27 predictors, achieved concordance of 0.770 in both models. The final models showed good discrimination and calibration in the test datasets. We developed predictive risk scores with high discrimination for depression and anxiety using the UKB cohort, incorporating predictors which are easily obtainable via smartphone. If deployed in a digital solution, they would allow individuals to track their risk, as well as provide some pointers on how to decrease it through lifestyle changes.
Submitted 20 April, 2021;
originally announced April 2021.
-
Development of an accessible 10-year Digital CArdioVAscular (DiCAVA) risk assessment: a UK Biobank study
Authors:
Nikola Dolezalova,
Angus B. Reed,
Alex Despotovic,
Bernard Dillon Obika,
Davide Morelli,
Mert Aral,
David Plans
Abstract:
Background: Cardiovascular diseases (CVDs) are among the leading causes of death worldwide. Predictive scores providing personalised risk of developing CVD are increasingly used in clinical practice. Most scores, however, utilise a homogeneous set of features and require the presence of a physician.
Objective: The aim was to develop a new risk model (DiCAVA) using statistical and machine learning techniques that could be applied in a remote setting. A secondary goal was to identify new patient-centric variables that could be incorporated into CVD risk assessments.
Methods: Across 466,052 participants, Cox proportional hazards (CPH) and DeepSurv models were trained using 608 variables derived from the UK Biobank to investigate the 10-year risk of developing a CVD. Data-driven feature selection reduced the number of features to 47, after which reduced models were trained. Both models were compared to the Framingham score.
Results: The reduced CPH model achieved a c-index of 0.7443, whereas DeepSurv achieved a c-index of 0.7446. Both CPH and DeepSurv were superior in determining the CVD risk compared to the Framingham score. Minimal difference was observed when cholesterol and blood pressure were excluded from the models (CPH: 0.741, DeepSurv: 0.739). The models show very good calibration and discrimination on the test data.
Conclusion: We developed a cardiovascular risk model that has very good predictive capacity and encompasses new variables. The score could be incorporated into clinical practice and utilised in a remote setting, without the need to include cholesterol. Future studies will focus on external validation across heterogeneous samples.
Submitted 20 April, 2021;
originally announced April 2021.
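Illustrative note on the risk score: a fitted Cox proportional hazards model is commonly turned into an absolute 10-year risk with the standard relation risk = 1 - S0(10)^exp(lp), where S0(10) is the baseline 10-year survival and lp is the linear predictor. The sketch below applies that relation. The coefficients, covariate means, and baseline survival are invented placeholders, not values from the DiCAVA model.

```python
import math


def ten_year_risk(covariates, coefficients, baseline_survival_10y, means=None):
    """Absolute 10-year risk from a Cox proportional hazards model.

    covariates            -- dict of feature name -> individual's value
    coefficients          -- dict of feature name -> log hazard ratio (beta)
    baseline_survival_10y -- S0(10): survival at 10 years when lp == 0
    means                 -- optional dict of centring values per feature
    """
    if means is None:
        means = {name: 0.0 for name in coefficients}
    # Linear predictor: sum of beta * (value - centring value).
    lp = sum(
        beta * (covariates[name] - means[name])
        for name, beta in coefficients.items()
    )
    # Proportional hazards: S(t | x) = S0(t) ** exp(lp).
    return 1.0 - baseline_survival_10y ** math.exp(lp)


# Placeholder parameters (hypothetical, for illustration only).
BETAS = {"age": 0.05, "smoker": 0.6}
MEANS = {"age": 60.0, "smoker": 0.0}
S0_10Y = 0.95

average_person = ten_year_risk({"age": 60.0, "smoker": 0.0}, BETAS, S0_10Y, MEANS)
older_smoker = ten_year_risk({"age": 70.0, "smoker": 1.0}, BETAS, S0_10Y, MEANS)
print(round(average_person, 3), older_smoker > average_person)
```

Because the score depends only on the coefficients and one baseline survival value, it can be evaluated on a device without access to the training cohort, which fits the remote-setting use described in the abstract.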