Automatic text summarization of Indonesian-language articles using the maximum marginal relevance method

  • Zaky Idhafi Universitas Islam Negeri Sultan Syarif Kasim
  • Surya Agustian Universitas Islam Negeri Sultan Syarif Kasim
  • Febi Yanto Universitas Islam Negeri Sultan Syarif Kasim
  • Nazruddin Safaat H Universitas Islam Negeri Sultan Syarif Kasim
Keywords: automated text summarization, cosine similarity, MMR, maximum marginal relevance, ROUGE

Abstract

Automated text summarization is a method for extracting the essence of one or more text documents. An automatic text summarizer makes reading, searching, and understanding information faster and more efficient. This study proposes the Maximum Marginal Relevance (MMR) method to carry out the summarization process automatically. The method was developed and tested on 150 Indonesian-language article documents. The summary is generated from similarity scores between sentences, calculated using cosine similarity. MMR's performance in producing summaries was evaluated using ROUGE (Recall-Oriented Understudy for Gisting Evaluation), which compares the generated summaries to gold-standard summaries. At a compression rate of 50%, the F1 scores on ROUGE-1, ROUGE-2, and ROUGE-L were 71.86%, 64.18%, and 71.56%, respectively. At a compression rate of 30%, the F1 scores on ROUGE-1, ROUGE-2, and ROUGE-L were 62.95%, 53.61%, and 62.47%, respectively. These scores are better than those reported in previous studies.
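
The sentence selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; several details are assumptions that the abstract does not state: sentences are represented as raw term-frequency vectors without stemming or stop-word removal, the MMR query is the whole document, the compression rate is read as the fraction of sentences kept, and the trade-off weight `lam` defaults to a hypothetical 0.7. The function names `tf_vector`, `cosine`, and `mmr_summarize` are illustrative only.

```python
import math
import re
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (an assumed representation)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_summarize(sentences, compression_rate=0.5, lam=0.7):
    """Greedily pick sentences by MMR and return them in document order."""
    if not sentences:
        return []
    vectors = [tf_vector(s) for s in sentences]
    # Assumption: the whole document acts as the query.
    query = tf_vector(" ".join(sentences))
    # Assumption: compression rate = fraction of sentences kept.
    k = max(1, round(len(sentences) * compression_rate))
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        # Relevance to the query minus similarity to already chosen sentences.
        relevance = cosine(vectors[i], query)
        redundancy = max(
            (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]
```

In this sketch, a larger `lam` favors sentences that resemble the document as a whole, while a smaller `lam` penalizes sentences that repeat what has already been selected.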

Published
2023-12-29
How to Cite
Idhafi, Z., Agustian, S., Yanto, F., & Safaat H, N. (2023). Peringkas teks otomatis pada artikel berbahasa indonesia menggunakan metode maximum marginal relevance. Jurnal CoSciTech (Computer Science and Information Technology), 4(3), 609-618. https://doi.org/10.37859/coscitech.v4i3.6311