Bimonthly Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
05 July-September 2023, Volume 38 Issue 4
Abstract
Type 2 diabetes arises from an imbalance in glucose consumption in the body, which eventually leads to disorders of the circulatory, nervous and immune systems. Many studies have addressed prediction of this disease using various clinical and pathological parameters, and with advances in technology, many machine learning (ML) techniques have also been incorporated to improve prediction accuracy. This paper explores data preprocessing and analyzes its effect on ML algorithms. The experimental setup uses two datasets: PIMA, obtained from Kaggle, and LS, a locally generated and validated dataset. In total, 5 ML algorithms and 8 different scaling techniques are evaluated. Without preprocessing the data with any scaler, accuracy on the PIMA dataset ranges from 46.99% to 69.88%; with scalers it improves to as much as 77.92%. For the LS dataset, accuracy without scalers is as low as 78.67%, and it improves to 100% with two labels, as the LS dataset is small and controlled. Different scalers have different impacts at the preprocessing stage for both the PIMA and LS datasets. Introducing scalers at the preprocessing stage visibly improves accuracy, so selecting a scaler suited to the dataset can be expected to improve efficiency.
Keywords
Type 2 diabetes, ML algorithms, scaler for preprocessing, PIMA dataset, LS dataset
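
The effect of scaling that the abstract reports can be illustrated with a minimal sketch. The abstract does not name the 8 scalers or the 5 algorithms, so everything here is an assumption chosen for illustration: min-max and z-score scaling stand in for the scalers, a 1-nearest-neighbour classifier stands in for the ML algorithms, and the tiny two-feature dataset is invented (it is not PIMA or LS). The first feature has a large numeric range but carries no class information; the second has a small range and separates the classes.

```python
import math

def fit_min_max(rows):
    """Per-column (min, range) from the training rows; a zero range falls back to 1."""
    return [(min(c), (max(c) - min(c)) or 1.0) for c in zip(*rows)]

def fit_standard(rows):
    """Per-column (mean, population std) from the training rows; a zero std falls back to 1."""
    params = []
    for c in zip(*rows):
        mean = sum(c) / len(c)
        std = math.sqrt(sum((v - mean) ** 2 for v in c) / len(c))
        params.append((mean, std or 1.0))
    return params

def transform(rows, params):
    """Apply (value - offset) / scale to every column."""
    return [[(v - off) / scale for v, (off, scale) in zip(row, params)] for row in rows]

def predict_1nn(train_X, train_y, x):
    """Label of the closest training row under Euclidean distance."""
    nearest = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[nearest]

def accuracy(train_X, train_y, test_X, test_y):
    hits = sum(predict_1nn(train_X, train_y, x) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)

# Invented toy data: [large-range uninformative feature, small-range informative feature]
train_X = [[100, 0.1], [900, 0.2], [200, 0.9], [800, 0.8]]
train_y = [0, 0, 1, 1]
test_X = [[210, 0.15], [890, 0.85], [110, 0.88], [790, 0.12], [105, 0.12]]
test_y = [0, 1, 1, 0, 0]

scalers = {"none": None, "min-max": fit_min_max, "standard": fit_standard}
results = {}
for name, fit in scalers.items():
    if fit is None:
        tr, te = train_X, test_X
    else:
        params = fit(train_X)  # fit on training rows only, then reuse for test rows
        tr, te = transform(train_X, params), transform(test_X, params)
    results[name] = accuracy(tr, train_y, te, test_y)

for name, acc in results.items():
    print(f"{name:>8}: {acc:.2f}")
```

On this contrived data the unscaled classifier is dominated by the large-range feature, while either scaler lets the informative feature contribute to the distance. The size of the gap is an artefact of how the toy data was built and says nothing about the figures reported for PIMA or LS. In practice a library such as scikit-learn would normally supply the scalers and classifiers.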