TY - JOUR
T1 - Feasibility of Machine Learning and Logistic Regression Algorithms to Predict Outcome in Orthopaedic Trauma Surgery
AU - Oosterhoff, Jacobien H.F.
AU - Gravesteijn, Benjamin Y.
AU - Karhade, Aditya V.
AU - Jaarsma, Ruurd L.
AU - Kerkhoffs, Gino M.M.J.
AU - Ring, David
AU - Schwab, Joseph H.
AU - Steyerberg, Ewout W.
AU - Doornberg, Job N.
AU - Machine Learning Consortium
AU - Bhandari, Mohit
AU - Beks, Reinier
AU - Bulstra, Anne Eva
AU - Bzovsky, Sofia
AU - Goslings, J. Carel
AU - Guyatt, Gordon
AU - Hendrickx, Laurent
AU - Langerhuizen, David
AU - Mallee, Wouter H.
AU - Nelissen, Rob
AU - Poolman, Rudolf
AU - Stirler, Vincent
AU - Tornetta, Paul
AU - Schemitsch, Emil H.
AU - Schipper, Inger B.
AU - Swiontkowski, Marc
AU - Sanders, David
AU - Sprague, Sheila
AU - Walter, Stephen D.
PY - 2022/3/16
Y1 - 2022/3/16
N2 - Background: Statistical models using machine learning (ML) have the potential for more accurate estimates of the probability of binary events than logistic regression. The present study used existing data sets from large musculoskeletal trauma trials to address the following study questions: (1) Do ML models produce better probability estimates than logistic regression models? (2) Are ML models influenced by different variables than logistic regression models? Methods: We created ML and logistic regression models that estimated the probability of a specific fracture (posterior malleolar involvement in distal spiral tibial shaft and ankle fractures, scaphoid fracture, and distal radial fracture) or adverse event (subsequent surgery [after distal biceps repair or tibial shaft fracture], surgical site infection, and postoperative delirium) using 9 data sets from published musculoskeletal trauma studies. Each data set was split into training (80%) and test (20%) subsets. Fivefold cross-validation of the training set was used to develop the ML models. The best-performing model was then assessed in the independent testing data. Performance was assessed by (1) discrimination (c-statistic), (2) calibration (slope and intercept), and (3) overall performance (Brier score). Results: The mean c-statistic was 0.01 higher for the logistic regression models compared with the best ML models for each data set (range, -0.01 to 0.06). There were fewer variables strongly associated with variation in the ML models, and many were dissimilar from those in the logistic regression models. Conclusions: The observation that ML models produce probability estimates comparable with logistic regression models for binary events in musculoskeletal trauma suggests that their benefit may be limited in this context.
AB - Background: Statistical models using machine learning (ML) have the potential for more accurate estimates of the probability of binary events than logistic regression. The present study used existing data sets from large musculoskeletal trauma trials to address the following study questions: (1) Do ML models produce better probability estimates than logistic regression models? (2) Are ML models influenced by different variables than logistic regression models? Methods: We created ML and logistic regression models that estimated the probability of a specific fracture (posterior malleolar involvement in distal spiral tibial shaft and ankle fractures, scaphoid fracture, and distal radial fracture) or adverse event (subsequent surgery [after distal biceps repair or tibial shaft fracture], surgical site infection, and postoperative delirium) using 9 data sets from published musculoskeletal trauma studies. Each data set was split into training (80%) and test (20%) subsets. Fivefold cross-validation of the training set was used to develop the ML models. The best-performing model was then assessed in the independent testing data. Performance was assessed by (1) discrimination (c-statistic), (2) calibration (slope and intercept), and (3) overall performance (Brier score). Results: The mean c-statistic was 0.01 higher for the logistic regression models compared with the best ML models for each data set (range, -0.01 to 0.06). There were fewer variables strongly associated with variation in the ML models, and many were dissimilar from those in the logistic regression models. Conclusions: The observation that ML models produce probability estimates comparable with logistic regression models for binary events in musculoskeletal trauma suggests that their benefit may be limited in this context.
KW - Machine learning
KW - logistic regression models
KW - musculoskeletal trauma
KW - Orthopaedics
UR - http://www.scopus.com/inward/record.url?scp=85126963439&partnerID=8YFLogxK
U2 - 10.2106/JBJS.21.00341
DO - 10.2106/JBJS.21.00341
M3 - Article
C2 - 34921550
AN - SCOPUS:85126963439
SN - 0021-9355
VL - 104
SP - 544
EP - 551
JO - Journal of Bone and Joint Surgery: American Volume
JF - Journal of Bone and Joint Surgery: American Volume
IS - 6
ER -