Application of Machine Learning Algorithms to Predict Uncontrolled Diabetes Using the All of Us Research Program Data

Tadesse M. Abegaz, Muktar Ahmed, Fatimah Sherbeny, Vakaramoko Diaby, Hongmei Chi, Askal Ayalew Ali

Research output: Contribution to journal › Article › peer-review

Abstract

Predictive models for uncontrolled diabetes mellitus are scarce. The present study applied several machine learning algorithms to multiple patient characteristics to predict uncontrolled diabetes. Patients with diabetes older than 18 years from the All of Us Research Program were included. Random forest, extreme gradient boosting, logistic regression, and weighted ensemble models were employed. Patients with a record of uncontrolled diabetes based on International Classification of Diseases (ICD) codes were identified as cases. A set of features including basic demographics, biomarkers, and hematological indices was included in the models. The random forest model achieved the highest accuracy in predicting uncontrolled diabetes, 0.80 (95% CI: 0.79–0.81), compared with 0.74 (95% CI: 0.73–0.75) for extreme gradient boosting, 0.64 (95% CI: 0.63–0.65) for logistic regression, and 0.77 (95% CI: 0.76–0.79) for the weighted ensemble model. The maximum area under the receiver operating characteristic curve was 0.77 (random forest model), while the minimum was 0.7 (logistic regression model). Potassium levels, body weight, aspartate aminotransferase, height, and heart rate were important predictors of uncontrolled diabetes. In conclusion, the random forest model performed best, and serum electrolytes and physical measurements were important features. Machine learning techniques may be used to predict uncontrolled diabetes by incorporating these clinical characteristics.
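
The abstract reports accuracy with 95% confidence intervals, area under the ROC curve, and a weighted ensemble, but does not state how these were computed. The following is a minimal sketch of one common way to obtain each quantity, assuming a normal-approximation (Wald) interval for accuracy, a rank-based AUC, and a weighted average of predicted probabilities for the ensemble. All function names, weights, and the toy data are hypothetical and are not taken from the study.

```python
import math


def weighted_ensemble(prob_lists, weights):
    """Weighted average of per-model predicted probabilities (assumed scheme)."""
    total = sum(weights)
    n = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n)
    ]


def accuracy_with_ci(y_true, y_pred, z=1.96):
    """Accuracy with a Wald (normal-approximation) confidence interval."""
    n = len(y_true)
    acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    # Clip to [0, 1] because the Wald interval can overshoot on small samples.
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)


def roc_auc(y_true, scores):
    """AUC as the chance a random case outscores a random control (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels (1 = uncontrolled diabetes) and made-up model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7, 0.7, 0.1]
model_b = [0.6, 0.7, 0.6, 0.4, 0.5, 0.3, 0.4, 0.2]

ensemble = weighted_ensemble([model_a, model_b], weights=[0.6, 0.4])
y_pred = [1 if p >= 0.5 else 0 for p in ensemble]

acc, ci_low, ci_high = accuracy_with_ci(y_true, y_pred)
auc = roc_auc(y_true, ensemble)
print(f"accuracy={acc:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f}), AUC={auc:.4f}")
```

On a sample this small the Wald interval is very wide; with thousands of patients it narrows to a range comparable to the intervals quoted in the abstract. Bootstrap or exact binomial intervals are equally plausible alternatives.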

Original language: English
Article number: 1138
Number of pages: 14
Journal: Healthcare (Switzerland)
Volume: 11
Issue number: 8
Publication status: Published - 2 Apr 2023
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • All of Us Research Program
  • machine learning
  • prediction
  • serum electrolytes
  • uncontrolled diabetes
