This paper compared the performance of three classification techniques, the Hoeffding Tree, the Logistic Model Tree (LMT), and the Random Tree, on an anemia dataset of 539 samples described by 11 blood-test features. Each technique was evaluated with 10-fold cross-validation, and classification time was recorded as well. In the first step, no feature reduction was applied: LMT achieved the highest accuracy, about 85.53%, followed by the Random Tree and the Hoeffding Tree, which trailed by approximately 1 and 2 percentage points, respectively. In terms of classification time, the Random Tree was the fastest at about 0.02 seconds, while the Hoeffding Tree and LMT took longer because of their heavier mathematical computations. In the second step, feature selection was applied, eliminating features considered to have less impact on the prediction results, such as Gender, Age, and WB. The reduced feature set improved the results of all three techniques, reaching 96.27% for LMT, 94.42% for the Hoeffding Tree, and 91.26% for the Random Tree. These results highlight which features matter in the anemia dataset and suggest that, for a clinical dataset of this kind, selecting features carefully can improve accuracy and lower processing time.
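The evaluation protocol described above (10-fold cross-validation with timing, run once on all features and once on a reduced set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' setup: the data are synthetic, the number of classes (3) and the dropped column indices are hypothetical, and scikit-learn's `DecisionTreeClassifier` with random feature subsets is swapped in as a rough stand-in for the Random Tree. The Hoeffding Tree and LMT are not part of scikit-learn and are not reproduced here.

```python
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier


def evaluate(X, y, n_splits=10, seed=0):
    """Return (mean accuracy, elapsed seconds) for k-fold cross-validation."""
    # Stand-in for a Random Tree: a decision tree that considers a random
    # subset of features at each split (an assumption, not the paper's tool).
    model = DecisionTreeClassifier(max_features="sqrt", random_state=seed)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    start = time.perf_counter()
    scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
    elapsed = time.perf_counter() - start
    return scores.mean(), elapsed


# Synthetic placeholder data: 539 samples and 11 features (sizes from the text).
# The values are random and carry no clinical meaning; 3 classes is assumed.
X, y = make_classification(
    n_samples=539, n_features=11, n_informative=6, n_redundant=2,
    n_classes=3, random_state=0,
)

# Hypothetical column indices standing in for Gender, Age, and WB.
dropped = [0, 1, 2]
kept = [i for i in range(X.shape[1]) if i not in dropped]

acc_all, t_all = evaluate(X, y)
acc_sel, t_sel = evaluate(X[:, kept], y)
print(f"all 11 features: accuracy={acc_all:.4f}, time={t_all:.3f}s")
print(f"8 features kept: accuracy={acc_sel:.4f}, time={t_sel:.3f}s")
```

Because the data are synthetic, the printed numbers will not match the accuracies reported in the paper; the sketch only shows how the two runs are compared under the same folds.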
Keywords
Anemia; Logistic Model Tree; Hoeffding Tree; Random Tree; Prediction