Clickbait Text Classification with Deep Learning Hybrid LSTM-CNN Method
DOI: https://doi.org/10.37859/coscitech.v7i1.8609
Abstract
This study classifies news titles into two categories, clickbait and non-clickbait, using a hybrid LSTM-CNN method. The dataset consists of 14,878 news titles in two categories, with 6,285 clickbait titles and 8,693 non-clickbait titles, obtained from Kaggle. The research stages comprise data preprocessing (cleaning, tokenizing, stopword removal, and stemming) followed by text representation using the Word2Vec algorithm. The dataset is then split into training and test data at a ratio of 80:20. The hybrid LSTM-CNN model is used because it combines CNN's strength in extracting local features with LSTM's ability to capture sequential relationships between words. Model performance was evaluated using a confusion matrix, yielding 77.07% accuracy, 70% recall, 73% precision, and a 71% F1-score. The hybrid LSTM-CNN model performed slightly better than the separate models, with accuracy rising from 77% to 77.07%, a gain of 0.07 percentage points. These results indicate that combining LSTM and CNN is a workable approach to classifying news titles as clickbait or non-clickbait, giving a small improvement over the previous model.
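The 80:20 split and the confusion-matrix evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the fixed random seed, and the choice of clickbait (label 1) as the positive class are assumptions made for illustration, and the abstract does not say how precision, recall, and F1 were averaged across the two classes.

```python
import random


def split_80_20(items, test_ratio=0.2, seed=0):
    """Shuffle a copy of `items` and return (train, test).

    The seed and the plain (non-stratified) shuffle are assumptions;
    the abstract only states the 80:20 ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN, treating `positive` as the clickbait label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels only; these are not results from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(metrics(*confusion_counts(y_true, y_pred)))
```

In practice the predictions would come from the trained model on the held-out 20% of the titles; the toy labels here only show how the four reported scores follow from the confusion-matrix counts.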










