Movie Rating Classification Based on Genre Using XGBoost and LightGBM with SHAP Analysis
DOI: https://doi.org/10.37859/jf.v16i1.11273
Abstract
Movie ratings are often used as an indicator of film quality and audience satisfaction. With movie data widely available on online platforms, machine learning techniques can be used to analyze the relationship between film characteristics and rating patterns. One attribute that can influence movie ratings is genre. This study classifies movie ratings based on genre using the XGBoost and LightGBM algorithms and analyzes the contribution of each genre using SHAP (SHapley Additive exPlanations). Movie data were collected from The Movie Database (TMDB) API and passed through several preprocessing stages: genre separation, data cleaning, one-hot encoding, and rating categorization. The dataset was then split into training and testing sets at a 70:30 ratio. XGBoost achieved an accuracy of 0.53, slightly higher than LightGBM at 0.52. SHAP analysis indicates that Horror, Drama, Action, and Comedy have the highest global importance in the classification model, while for high-rating class predictions Drama makes the largest contribution. The findings indicate that movie genre has a measurable influence on rating classification, although a genre's importance in the model does not always align with its average rating.
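The preprocessing stages named in the abstract (genre separation, data cleaning, one-hot encoding, rating categorization, and the 70:30 split) can be sketched in plain Python. This is only an illustrative sketch, not the authors' pipeline: the sample records, the field names, the "|" genre separator, the three-class rating thresholds, and the random seed are all assumptions, since the abstract does not state them.

```python
import random

# Hypothetical records standing in for TMDB API results. The field names and
# the "|" genre separator are assumptions, not the study's actual schema.
movies = [
    {"title": "A", "genres": "Drama|Comedy", "vote_average": 7.8},
    {"title": "B", "genres": "Horror", "vote_average": 5.1},
    {"title": "C", "genres": "Action|Drama", "vote_average": 6.4},
    {"title": "D", "genres": "", "vote_average": 6.0},         # no genre
    {"title": "E", "genres": "Comedy", "vote_average": None},  # no rating
    {"title": "F", "genres": "Action|Horror", "vote_average": 4.9},
    {"title": "G", "genres": "Drama", "vote_average": 8.2},
    {"title": "H", "genres": "Comedy|Action", "vote_average": 6.9},
    {"title": "I", "genres": "Horror|Drama", "vote_average": 7.0},
    {"title": "J", "genres": "Action", "vote_average": 5.9},
    {"title": "K", "genres": "Comedy|Drama", "vote_average": 7.3},
    {"title": "L", "genres": "Horror|Comedy", "vote_average": 5.5},
]


def split_genres(raw):
    """Genre separation: turn a delimited string into a list of genre names."""
    return [g.strip() for g in raw.split("|") if g.strip()]


def rating_class(score):
    """Rating categorization with assumed thresholds (not from the paper)."""
    if score < 6.0:
        return "low"
    if score < 7.0:
        return "medium"
    return "high"


def one_hot(genres, vocabulary):
    """One-hot encoding: 1 if the movie has the genre, else 0."""
    return [1 if g in genres else 0 for g in vocabulary]


def train_test_split(rows, test_ratio=0.3, seed=42):
    """Shuffle the rows reproducibly and hold out test_ratio of them."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Data cleaning: drop records with a missing genre list or a missing rating.
clean = [
    m for m in movies
    if split_genres(m["genres"]) and m["vote_average"] is not None
]

# Build the genre vocabulary, then the feature rows and class labels.
vocabulary = sorted({g for m in clean for g in split_genres(m["genres"])})
X = [one_hot(split_genres(m["genres"]), vocabulary) for m in clean]
y = [rating_class(m["vote_average"]) for m in clean]

# 70:30 split of (features, label) pairs.
train, test = train_test_split(list(zip(X, y)), test_ratio=0.3)

print(vocabulary)
print(len(train), "training rows,", len(test), "testing rows")
```

The resulting feature rows and labels are in the shape a gradient-boosting classifier such as XGBoost or LightGBM expects, so the model fitting and SHAP steps described above would follow from here.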
License
Copyright (c) 2026 Aprinia Salsabila Roiqoh, Rizky Parlika, Firza Prima Aditiawan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright Notice
An author who publishes in the Jurnal FASILKOM (teknologi inFormASi dan ILmu KOMputer) agrees to the following terms:
- The author retains the copyright and grants the journal the right of first publication of the work, simultaneously licensed under the Creative Commons Attribution-ShareAlike 4.0 License, which allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- The author may enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- The author is permitted and encouraged to post the work online (e.g., in institutional repositories or on a personal website) prior to and during the submission process, as this can lead to productive exchanges as well as earlier and greater citation of the published work (See The Effect of Open Access).
Read more about the Creative Commons Attribution-ShareAlike 4.0 Licence here: https://creativecommons.org/licenses/by-sa/4.0/.













