Investigating the Correlation of Various Biochemical Indicators with Bone Mineral Density and the Application of Machine Learning Algorithm in the Construction of Osteoporosis Risk Prediction Model

Bing Li, Xiaoqun Hou, Yuelei Wang, Ting Wang, Feng Shen, Xiaxia Chen, Junhua Fu

Article ID: 7836
Vol 38, Issue 2, 2024
DOI: https://doi.org/10.23812/j.biol.regul.homeost.agents.20243802.106
Received: 20 February 2024; Accepted: 20 February 2024; Available online: 20 February 2024; Issue release: 20 February 2024

Abstract

Background: Advances in artificial intelligence have made machine learning (ML) a promising tool for the diagnosis and prevention of osteoporosis. This study aimed to explore the correlation between blood biochemical indicators and bone mineral density (BMD) and to construct an osteoporosis risk prediction model using ML algorithms.
Methods: Biochemical marker data were obtained from 3892 participants, who were categorized into three groups: normal bone density, low bone density, and osteoporosis. Eight algorithms, namely Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Decision Tree (DT), Neural Network (NN), Gradient Boosting Decision Tree (GBDT), Support Vector Machine (SVM), and Naïve Bayes (NB), were used to construct predictive models on the training dataset. Model performance was assessed on the test dataset using the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC), as well as the area under the precision-recall (PR) curve (prAUC). Variable importance plots and SHapley Additive exPlanations (SHAP) plots were generated to identify the contributing factors in the optimal model.
Results: Among these models, the RF model performed best, with a prAUC of 0.866. Parathyroid hormone (PTH), total procollagen type I N-terminal propeptide (T-PINP), age, beta-collagen special sequence (β-CTX), 1,25-dihydroxyvitamin D3 (1,25(OH)2VD3), N-terminal middle segment osteocalcin (N-MID), weight, height, phosphorus, body mass index (BMI), and coronary artery disease (CAD) contributed significantly to the models' predictions, with a particularly substantial impact on the predictions of the RF model.
Conclusion: The predictive models built with the eight algorithms (RF, XGBoost, LR, DT, NN, GBDT, SVM, and NB) demonstrated excellent performance, and the RF model showed the best predictive efficacy among them.
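
The two evaluation metrics named in the Methods, the area under the ROC curve and the area under the PR curve, can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the labels and scores are toy values, the function names `roc_auc` and `average_precision` are hypothetical, and the step-wise average precision used here is only one common estimator of the PR-curve area. The abstract does not state which estimator produced the reported prAUC.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive case
    scores higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve:
    sum over thresholds of precision * (increase in recall)."""
    total_pos = sum(1 for y in y_true if y == 1)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Sort cases from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    area = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        # Cases with the same score share one threshold.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        recall = tp / total_pos
        precision = tp / (tp + fp)
        area += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return area


# Toy example (assumed values): 1 = osteoporosis, 0 = not osteoporosis.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(f"ROC AUC: {roc_auc(y_true, scores):.3f}")
print(f"PR AUC (average precision): {average_precision(y_true, scores):.3f}")
```

The PR-curve area depends on the proportion of positive cases, so it is often reported alongside the ROC AUC when the outcome groups are imbalanced.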


Keywords

osteoporosis; biochemical indicators; machine learning; prediction model; random forest



Copyright (c) 2024 Bing Li, Xiaoqun Hou, Yuelei Wang, Ting Wang, Feng Shen, Xiaxia Chen, Junhua Fu




This site is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).