Comparison of Machine Learning Models for Classification and Detection of Heart Disease
DOI: https://doi.org/10.37859/coscitech.v6i2.9811
Abstract
Heart disease is one of the leading causes of death worldwide, so early detection is an important part of prevention efforts. This study aims to build a heart disease risk prediction model from patient clinical data using the Random Forest algorithm. The dataset consists of 303 records with 13 features, such as blood pressure, cholesterol, and maximum heart rate, plus one target attribute. Data preprocessing included converting invalid values such as question marks ('?') to missing values and removing incomplete records to preserve the integrity of the dataset. After data exploration and analysis of the correlations between features, the model was trained with the Random Forest algorithm because of its suitability for multiclass classification and its resistance to overfitting. The initial evaluation shows that the model predicts well, reaching an accuracy score of 0.89. This study demonstrates that a Random Forest-based machine learning approach is effective in systematically identifying heart disease risk, so it has the potential to serve as a decision support tool in preventive healthcare.
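To make the workflow concrete, the steps described in the abstract can be sketched in Python with pandas and scikit-learn. This is an illustrative sketch only, assuming the data are available as a CSV file; the file name heart.csv, the column name target, the 80/20 split, and the forest size are assumptions, not details taken from the paper.

import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the clinical data and turn invalid '?' entries into missing values.
df = pd.read_csv("heart.csv")  # hypothetical file name
df = df.replace("?", np.nan)

# Remove incomplete records to preserve dataset integrity, then cast to numeric.
df = df.dropna().apply(pd.to_numeric)

# Explore feature-target correlations before modelling.
print(df.corr()["target"].sort_values(ascending=False))

# Train a Random Forest classifier on an 80/20 train/test split (assumed split ratio).
X = df.drop(columns=["target"])
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Report prediction accuracy on the held-out test set.
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))

Random Forest's resistance to overfitting comes from averaging many decorrelated trees, each grown on a bootstrap sample with a random subset of features considered at every split, which is why it is a common choice for clinical tabular data of this size.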