Abstract:
Diabetes is a major public health concern, particularly in countries like Bangladesh, where delayed diagnosis and limited healthcare resources exacerbate its impact. Traditional diagnostic methods are often costly and time-consuming, leading to late-stage detection and increased health complications. This study explores the application of machine learning techniques to diabetes prediction, using a dataset of key clinical parameters such as blood glucose level, BMI, and HbA1c, together with personal attributes such as age, gender, smoking history, hypertension, and heart disease. Two datasets were used to better understand the patterns among Bangladeshi diabetic patients: one consists solely of the collected data, while the other combines the collected data with a dataset from Kaggle. Several machine learning models, including Logistic Regression, Decision Trees, Random Forest, Support Vector Machines (SVM), and XGBoost, were evaluated for their predictive accuracy. Experimental results on the combined dataset indicate that ensemble models, particularly Random Forest and XGBoost, performed best, with accuracy exceeding 97%. The findings highlight the potential of AI-driven predictive analytics for enhancing diagnosis, optimizing resource allocation, and supporting data-driven decision-making in healthcare. Future work may draw exclusively on data from Bangladeshi diabetic patients, gathered from wearable devices and electronic medical records, paving the way for multi-disease prediction and improved patient outcomes.