Proceedings of International Conference on Applied Innovation in IT  ·  2023/03/09  ·  Vol. 11  ·  Issue 1  ·  pp. 75–80
Comparative Analysis of Machine Learning Models for Diabetes Prediction
Zoran Stojanoski, Marija Kalendar and Hristijan Gjoreski
This paper analyzes the benchmark Diabetes dataset, which consists of eight commonly measured characteristics. The goal of the study is to present a comparative analysis of six machine learning models that predict diabetes, together with several preprocessing techniques (under- and over-sampling, feature standardization). The results demonstrate that machine learning algorithms can achieve high accuracy in diabetes prediction, enabling early detection and better outcomes for patients. The paper shows that ensemble learning methods, such as the Extra Trees Classifier and the Random Forest Classifier, combined with appropriate data preprocessing, can reach 86% accuracy on the diabetes classification problem. These findings highlight the potential of machine learning to play a valuable role in the prediction and management of diabetes, leading to improved quality of life and health outcomes for patients.
Keywords: Machine Learning, Diabetes Prediction, Feature Analysis, ML Models Comparison.
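The two preprocessing steps named in the abstract, feature standardization and over-sampling, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which over-sampling variant was used, so plain random over-sampling is assumed here, and the helper names `standardize` and `random_oversample` as well as the toy values are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def standardize(rows):
    """Scale each feature column to zero mean and unit variance (z-score)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [row for row, lab in zip(rows, labels) if lab == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Made-up toy values for two hypothetical features; 1 = positive class.
# These numbers are illustrative only and are not taken from the dataset.
X = [[150.0, 34.0], [85.0, 26.0], [180.0, 23.0], [90.0, 28.0], [115.0, 25.0]]
y = [1, 0, 1, 0, 0]

X_scaled = standardize(X)
X_balanced, y_balanced = random_oversample(X_scaled, y)
print(Counter(y_balanced))
```

In a real evaluation, resampling is normally applied to the training split only, so that duplicated rows cannot leak into the test data and inflate the reported accuracy.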

Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0 (Creative Commons Attribution-ShareAlike 4.0 International License).

ISSN 2199-8876