Proceedings of International Conference on Applied Innovation in IT  ·  2026/03/31  ·  Vol. 14  ·  Issue 1  ·  pp. 1475–1481
Predictive Modeling of Diabetes Risk Using Logistic Regression and
Mursal Luaibi Saad, Samah Sahi, Marwah Sami Kzar and Nada Abdulkareem Hameed
Background: Type 2 Diabetes Mellitus (T2DM) is a growing global health challenge that calls for robust predictive models to enable prompt intervention. This study sought to develop and validate a logistic regression-based predictive framework for diabetes risk using electronic health record (EHR) data. A retrospective cohort of 10,000 adults with no prior diabetes was derived from anonymized EHRs. Candidate predictors included demographics, vital signs, laboratory biomarkers, comorbidities, and medication history. Data preprocessing included handling outliers, imputing missing values, and making features more consistent. We fitted logistic regression with elastic net regularization and split the data into training, validation, and independent test sets. Model performance was assessed with AUROC, AUPRC, calibration, Brier score, and decision curve analysis. On the test set, the model achieved an AUROC of 0.81 and an AUPRC of 0.46, with good calibration and consistent performance across subgroups. Logistic regression was more interpretable than the machine learning comparators while reaching similar accuracy. An interpretable, EHR-based logistic regression model offers a useful and clinically significant method for predicting diabetes risk. Future research should extend validation to diverse populations and investigate the integration of advanced AI methodologies.
Keywords: Type 2 Diabetes, Electronic Health Records, Logistic Regression, Predictive Modeling, Calibration, Clinical Decision Support.
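
Three of the evaluation measures named in the abstract (AUROC, Brier score, and the net benefit used in decision curve analysis) can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy labels and probabilities, and the 0.2 risk threshold are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a random case gets a higher predicted risk than a
    random non-case; tied predictions count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome
    (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one risk threshold pt, as used in decision curve
    analysis: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical toy data (not from the study): 1 = developed diabetes.
y_true = [0, 0, 1, 0, 1, 1, 0, 0]
y_prob = [0.05, 0.10, 0.35, 0.20, 0.60, 0.15, 0.30, 0.08]

if __name__ == "__main__":
    print("AUROC:", round(auroc(y_true, y_prob), 3))
    print("Brier score:", round(brier_score(y_true, y_prob), 3))
    print("Net benefit at 0.2:", round(net_benefit(y_true, y_prob, 0.2), 3))
```

Evaluating `net_benefit` over a range of thresholds and comparing it with the "treat all" and "treat none" strategies gives the decision curve; AUPRC and calibration plots would typically come from an established statistics library rather than hand-written code.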

Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0  ·  This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License

ISSN 2199-8876