Proceedings of International Conference on Applied Innovation in IT
2023/03/09, Volume 11, Issue 1, pp.75-80

Comparative Analysis of Machine Learning Models for Diabetes Prediction

Zoran Stojanoski, Marija Kalendar and Hristijan Gjoreski

Abstract: This paper analyzes the benchmark Diabetes dataset, which consists of eight commonly measured characteristics. The goal of the study is to present a comparative analysis of six machine learning models that predict diabetes, together with several preprocessing techniques (under- and over-sampling, feature standardization). The study investigates various approaches and presents results demonstrating that machine learning algorithms can achieve high accuracy in diabetes prediction, which could enable early detection and better outcomes for patients. The paper shows that ensemble learning methods, such as Extra Trees Classifier and Random Forest Classifier, combined with appropriate data preprocessing techniques, can reach 86% accuracy in diabetes prediction. The paper highlights the potential for machine learning to play a valuable role in the prediction and management of diabetes, leading to improved quality of life and health outcomes for patients.

Keywords: Machine Learning, Diabetes Prediction, Feature Analysis, ML Models Comparison.

DOI: 10.25673/101916
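
Two of the preprocessing techniques named in the abstract, feature standardization and random over-sampling, can be illustrated with a minimal sketch. The abstract does not say which implementation the authors used, so this is only an assumption-laden illustration in plain Python: the function names `standardize` and `random_oversample` and the toy values are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Guard against constant columns, whose standard deviation is zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy data with two features per row (hypothetical values, not from the paper).
X = [[85.0, 22.0], [90.0, 25.0], [100.0, 27.0], [95.0, 24.0],
     [160.0, 35.0], [170.0, 38.0]]
y = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(standardize(X), y)
```

In a real experiment, the scaling statistics and the over-sampling would normally be computed on the training split only, so that no information from the test split leaks into the model.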












Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0


ISSN 2199-8876
Publisher: Anhalt University of Applied Sciences
