Journal of Data Acquisition and Processing

1 Jan 2023, Volume 38 Issue 1

Article

1. AN EFFICIENT HETEROGENEOUS ENSEMBLE SMOTE BASED LEARNING MODEL FOR DIABETES MELLITUS PREDICTION
Sandeep H, Dr. B.K Raghavendra
Journal of Data Acquisition and Processing, 2023, 38 (1): 1340-1352.

Abstract

The primary goal of this work is to develop a heterogeneous ensemble model that uses machine learning techniques to diagnose diabetes in patients. The proposed model addresses the problem of class imbalance by means of several sampling methods: up-sampling, down-sampling, and the synthetic minority oversampling technique (SMOTE). Several feature selection techniques, including Ranksum, Principal Component Analysis (PCA), Univariate Logistic Regression (ULOGR), Cross-Correlation Analysis (CRA), Gini Score, and Information Gain (IGFR), are then applied to the preprocessed data to identify the relevant features. On the PIMA dataset, a variety of classification methods, notably LR, SVM, Naive Bayes, Bagging, Adaboost, and PNN, are used to predict whether a sample is diabetic. The results show that the MVE ensemble learning method combined with SMOTE-sampled data yields the best performance, with 95.81% accuracy and an AUC of 0.94.
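The two core ideas named in the abstract, SMOTE oversampling and combining several classifiers by vote, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `smote` and `majority_vote`, the parameter `k`, and the toy data are all hypothetical, and the sketch assumes that "MVE" refers to a majority-voting ensemble, which the abstract does not state.

```python
import random
from collections import Counter


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def majority_vote(predictions):
    """Hard vote: return the label predicted by most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Toy minority-class samples (made up for illustration, not PIMA data).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6)

# Hypothetical labels from three different classifiers for one sample.
final_label = majority_vote([1, 0, 1])
```

Because every synthetic point lies on a line segment between two real minority samples, the oversampled class stays inside the region the original minority data already occupies. In practice, oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.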

Keywords

Machine Learning, SVM, Adaboost, NB, SMOTE
