Comparison of Algorithms Using the SMOTE Technique for Classifying Brain Stroke
Abstract
Stroke is a deadly disease caused by a sudden, rapidly progressing disturbance in brain function, and its early symptoms are difficult to recognize. Data mining can support disease diagnosis, and this research applies it to the classification of brain stroke. The dataset, obtained from Kaggle, contains 4891 records, but its classes are imbalanced. To balance the classes, the Synthetic Minority Oversampling Technique (SMOTE) is applied with the aim of improving accuracy. Four classification algorithms, Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), are then compared to determine which performs best. LR, RF, and SVM achieved the highest scores: 95% accuracy, 95% precision, 100% recall, and a 97% F1-score. KNN scored lower, with 90% accuracy, 95% precision, 85% recall, and a 90% F1-score.
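The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the general idea, written in plain Python: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `f1_score`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for illustration and do not come from the paper. The second helper simply applies the standard F1 formula to the precision and recall values reported above.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch only; not the implementation used in the paper.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (Euclidean distance)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy minority class with two numeric features.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=3)
print(len(new_points))                 # 6

# Precision and recall as reported in the abstract.
print(round(f1_score(0.95, 1.00), 2))  # 0.97 (LR, RF, SVM)
print(round(f1_score(0.95, 0.85), 2))  # 0.9  (KNN)
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays inside the region the original minority data already occupies. The F1 values computed from the reported precision and recall round to the 97% and 90% figures given in the abstract.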