Student Dropout Prediction: Enrollment-Based Early Warning with Machine Learning

Authors

  • Febri Andika Putra STMIK Citra Mandiri Padangsidimpuan
  • Syahisro Mirajdandi Politeknik Lembaga Pendidikan dan Pengembangan Profesi Indonesia
  • Nandra Nandra Politeknik Lembaga Pendidikan dan Pengembangan Profesi Indonesia
  • Bisma Okmarizal Universitas Islam Negeri Jurai Siwo Lampung
  • Sandy Mulyanda Institute Bisnis dan Teknologi Pelita Indonesia

DOI: https://doi.org/10.37859/jf.v15i3.10714
Keywords: student dropout, early warning system, machine learning, enrollment data, binary classification

Abstract

Student dropout remains a major challenge in higher education because it affects study continuity, institutional performance, and the efficiency of academic service planning. This study develops a machine learning–based Early Warning System (EWS) that uses data available at enrollment and is updated after the first semester. Using the public dataset “Predict Students’ Dropout and Academic Success” (n = 4,424), the original three-class outcome (Dropout, Enrolled, Graduate) is simplified into a binary target, with dropout treated as the positive class. Two feature scenarios are evaluated: (1) enrollment-only features for pre-entry screening and (2) enrollment plus first-semester indicators for updating risk scores. Three models are compared: class-balanced Logistic Regression, class-balanced Random Forest, and Gradient Boosting. Model performance is assessed using accuracy; precision, recall, and F1 score for the dropout class; balanced accuracy; and ROC-AUC. In the enrollment-only scenario, Logistic Regression achieves the best early-warning performance (recall = 0.697; F1 score = 0.651). After first-semester features are added, performance improves (recall = 0.792; F1 score = 0.779). Beyond model comparison, the study adds an operational perspective through confusion-matrix simulation and probability-threshold analysis, which weigh missed at-risk cases against follow-up workload.
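
The binary target and the probability-threshold analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The helper names (`to_binary`, `confusion_at`, `threshold_report`), the toy labels, and the toy risk scores are all hypothetical. The sketch also assumes that both Enrolled and Graduate map to the negative class, which the abstract does not state explicitly.

```python
from typing import Dict, List, Sequence, Tuple

# Label name taken from the three-class outcome listed in the abstract.
POSITIVE_LABEL = "Dropout"


def to_binary(labels: Sequence[str]) -> List[int]:
    """Map Dropout -> 1 and every other outcome -> 0 (assumed mapping)."""
    return [1 if label == POSITIVE_LABEL else 0 for label in labels]


def confusion_at(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when scores >= threshold are flagged at-risk."""
    tp = fp = fn = tn = 0
    for truth, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and truth == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def threshold_report(
    y_true: Sequence[int], scores: Sequence[float], thresholds: Sequence[float]
) -> List[Dict[str, float]]:
    """Summarize missed cases and follow-up workload at each threshold."""
    rows = []
    for t in thresholds:
        tp, fp, fn, tn = confusion_at(y_true, scores, t)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        rows.append(
            {
                "threshold": t,
                "flagged": tp + fp,  # students needing follow-up (workload)
                "missed": fn,        # at-risk students not flagged
                "precision": precision,
                "recall": recall,
                "f1": f1,
            }
        )
    return rows


# Hypothetical toy data, for illustration only (not from the study's dataset).
labels = ["Dropout", "Graduate", "Enrolled", "Dropout",
          "Graduate", "Dropout", "Enrolled", "Graduate"]
scores = [0.81, 0.12, 0.47, 0.55, 0.30, 0.38, 0.62, 0.05]

y_true = to_binary(labels)
report = threshold_report(y_true, scores, [0.3, 0.5, 0.7])
for row in report:
    print(row)
```

Lowering the threshold flags more students, so fewer at-risk cases are missed but the follow-up workload grows. Raising it has the opposite effect.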

References

V. Realinho, J. Machado, L. Baptista, and M. V. Martins, “Predicting Student Dropout and Academic Success,” Data, vol. 7, no. 11, p. 146, 2022.

J. Kabáthová and M. Drlík, “Towards predicting student’s dropout in university courses using different machine learning techniques,” Appl. Sci., vol. 11, no. 7, p. 3130, 2021.

M. V. Martins, L. Baptista, J. Machado, and V. Realinho, “Multi-Class Phased Prediction of Academic Performance and Dropout in Higher Education,” Appl. Sci., vol. 13, no. 8, p. 4702, 2023.

R. da Silva, V. B. Realinho, L. M. Baptista, and M. V. Martins, “Forecasting Students Dropout: A UTAD University Study,” Futur. Internet, vol. 14, no. 3, p. 76, 2022.

M. G. Carballo-Mendívil, M. M. Inzunza-González, et al., “Predicting Student Dropout from Day One: XGBoost-Based Early Detection Model Using First-Year Data,” Appl. Sci., 2025.

J. Niyogisubizo, L. L. Niyigena, and A. G. Lopez, “Predicting student’s dropout in university classes using ensemble machine learning approaches,” Comput. Educ. Artif. Intell., vol. 3, 2022.

E. F. Villegas-Ch, X. A. Palacios-Pacheco, and W. L. Tipán-Pachacama, “Applying Machine Learning and Academic Analytics to Predict Dropout and Performance in Higher Education,” Sustainability, vol. 15, no. 19, 2023.

S. Kim, Y. Kim, and M. Cho, “Using Machine Learning and Deep Learning Models to Predict Dropout Risk in College Students,” Appl. Sci., vol. 13, no. 10, 2023.

F. B. Kurniawan and L. Farokhah, “Aplikasi Cerdas Prediksi Kelulusan Mahasiswa Berbasis Website Menggunakan Metode Support Vector Machine (SVM),” J. FASILKOM (teknologi Inf. dan ILmu KOMputer), vol. 15, no. 1, pp. 155–162, 2025, doi: 10.37859/jf.v15i1.8767.

F. I. Rachman, S. Mujadilah, T. Wahyuni, and L. Anas, “Prediksi Tingkat Kelulusan Menggunakan K-Means Pada Program Studi Informatika Unismuh Makassar,” J. FASILKOM (teknologi Inf. dan ILmu KOMputer), vol. 13, no. 3, pp. 504–510, 2023, doi: 10.37859/jf.v13i3.6061.

Published

2025-12-31