Comparative Analysis of Machine Learning Algorithms for Diabetes Prediction: Finding the Optimal Approach


  • Aftab UL Nabie, Department of Computer Science, South China University of Technology, China
  • Neetesh Kumar, Department of Computer Science & Information Technology, TIEST, NED University, Pakistan
  • Waqas Chander, Department of Electrical Engineering, Mehran University of Engineering and Technology, Pakistan
  • Sunil Kumar, Department of Electronics Engineering, Quaid Awam University of Engineering and Technology, Pakistan
  • Muhammad Waqas Pasha, Department of Computing, Hamdard University, Pakistan
  • Rajesh Kumar, Department of Computer Science, University of Palermo, Italy


Machine learning, Classification, Prediction, Support Vector Machines


Diabetes is a chronic disease whose threat to human health is growing rapidly. It arises from a complex interplay of factors, including obesity and elevated blood glucose levels. Central to its onset is the disruption of insulin function, which leads to abnormal metabolism and raised blood sugar. In this paper, we address this problem with machine learning techniques. We apply several machine learning algorithms, including Support Vector Machine (SVM) and Random Forest (RF), to the Pima Indian Diabetes (PID) dataset and aim to identify the most effective algorithm for forecasting the onset of diabetes. Using these techniques, our objective is to identify individuals at risk early, enabling timely intervention and preventive measures such as lifestyle and dietary adjustments. The study has two objectives: first, to develop and implement a predictive model for diabetes using machine learning techniques; and second, to explore effective strategies for building such a model.
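The comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic (eight numeric features and a binary label, generated by the hypothetical helper `make_synthetic`), and the two candidates, `MajorityClass` and `NearestCentroid`, are toy stand-ins rather than SVM or RF. The hypothetical helper `cross_val_accuracy` only assumes that each candidate offers `fit` and `predict`, an interface that estimators such as scikit-learn's `SVC` and `RandomForestClassifier` also follow, so real models and the PID data could presumably be substituted.

```python
import random
from statistics import mean


class MajorityClass:
    """Baseline stand-in: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroid:
    """Toy stand-in: predicts the class whose feature mean is closest."""

    def fit(self, X, y):
        self.centroids_ = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids_[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: sq_dist(x, self.centroids_[c]))
            for x in X
        ]


def cross_val_accuracy(make_model, X, y, k=5, seed=0):
    """Mean accuracy over k shuffled folds; a fresh model is built per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in fold])
        correct = sum(p == y[i] for p, i in zip(preds, fold))
        scores.append(correct / len(fold))
    return mean(scores)


def make_synthetic(n=200, n_features=8, seed=0):
    """Synthetic two-class data (an assumption, not the PID dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.35 else 0
        # Positive cases are shifted by one standard deviation per feature.
        X.append([rng.gauss(1.0 * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


X, y = make_synthetic()
candidates = {"majority_class": MajorityClass, "nearest_centroid": NearestCentroid}
results = {name: cross_val_accuracy(cls, X, y) for name, cls in candidates.items()}
best = max(results, key=results.get)

for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
print("best:", best)
```

Scoring every candidate on the same shuffled folds keeps the comparison fair, and including a majority-class baseline shows whether a model learns anything beyond the class imbalance.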


How to Cite

Aftab UL Nabie, Kumar, N., Chander, W., Sunil Kumar, Muhammad Waqas Pasha, & Rajesh Kumar. (2024). Comparative Analysis of Machine Learning Algorithms for Diabetes Prediction: Finding the Optimal Approach. International Journal of Computer (IJC), 51(1), 33–42. Retrieved from