HYBRIDIZATION OF MACHINE LEARNING ALGORITHM FOR THE PREDICTION OF HYPOTHYROID
Abstract
Thyroid disease is one of the most progressive endocrine disorders in the human population today, and predicting it is a critical task in clinical data analysis. Machine learning (ML) has proven effective for decision-making and prediction from the enormous volume of data generated in the healthcare domain. However, few studies have combined hybridized machine learning classifiers with a hybrid pre-processing technique to address the class imbalance problem in medical datasets. This research aims to hybridize machine learning algorithms for the prediction of hypothyroidism using a dual pre-processing technique that combines SMOTE (Synthetic Minority Over-sampling Technique) and Resample filtering to handle class imbalance. The hybrid algorithm is evaluated both with and without the dual pre-processing technique. WEKA 3.8.3 was used as the analysis tool. Four machine learning classification algorithms (J48, Random Forest, Simple Logistic, and AdaBoostM1) were compared, both as single classifiers and as hybrids. A public dataset obtained from the UCI repository was used, and AdaBoostM1 combined with J48 achieved the highest accuracy, 99.8%, when the data were pre-processed with the combination of SMOTE and Resample.
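As an illustration of the dual pre-processing idea, the following is a minimal sketch in plain Python and not the WEKA filters used in this study. It assumes numeric features only, uses SMOTE-style interpolation between a minority instance and one of its nearest minority neighbours, and models Resample as a simple draw with replacement. The function names, the toy data, and the class labels are hypothetical and chosen only for the example.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority rows by interpolating between a minority
    row and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the row itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def resample_with_replacement(rows, labels, seed=0):
    """Draw len(rows) instances uniformly at random, with replacement
    (an assumed, simplified stand-in for a Resample filter)."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Toy, made-up data: 8 majority rows and 3 minority rows.
majority = [[float(x), float(y)] for x in range(4) for y in range(2)]
minority = [[10.0, 10.0], [11.0, 10.0], [10.0, 12.0]]

# Step 1: SMOTE-style oversampling of the minority class.
synthetic = smote_oversample(minority, n_synthetic=5, k=2)

# Step 2: resample the balanced set before handing it to a classifier.
rows = majority + minority + synthetic
labels = ["negative"] * len(majority) + ["hypothyroid"] * (len(minority) + len(synthetic))
rows, labels = resample_with_replacement(rows, labels)
```

In this sketch the two steps are applied in sequence, so the classifier would see a class distribution that is first balanced by synthetic minority instances and then redrawn at random. The order and parameters are assumptions of the example; the study's actual filter settings are not stated in the abstract.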