Detection of Hate Speech in Indonesian Language on Twitter Using Machine Learning Algorithm
Abstract
Hate speech is an act of communication in which an individual or group provokes or insults other individuals or groups. It is prohibited because it can trigger violence and prejudice, whether on the part of those who make the statements or of their victims. This study aims to find the best algorithm for detecting hate speech by comparing the Decision Tree, Naive Bayes, Support Vector Machine and Random Forest algorithms, using N-gram-based word scoring (TF-IDF) with Unigram, Bigram and Trigram features, implemented in Python. The results show that the Naive Bayes algorithm performs best with the Trigram feature, achieving an accuracy of 88.57%, a precision of 96.75% and a recall of 99.34%.
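As an illustration of the N-gram-based TF-IDF scoring named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names `word_ngrams` and `tfidf`, the whitespace tokenization, the weighting formula (relative term frequency times ln(N / df)) and the toy tweets are all assumptions made for this example.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams (joined by spaces) of a lowercased text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(documents, n):
    """Compute one {ngram: score} dict per document.

    Assumed weighting (not taken from the paper):
    tf  = count / number of n-grams in the document
    idf = ln(N / df), with df the number of documents containing the n-gram
    """
    grams_per_doc = [word_ngrams(doc, n) for doc in documents]

    # Document frequency: count each n-gram at most once per document.
    doc_freq = Counter()
    for grams in grams_per_doc:
        doc_freq.update(set(grams))

    total_docs = len(documents)
    scores = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        length = len(grams)
        scores.append({
            gram: (count / length) * math.log(total_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return scores


# Hypothetical toy tweets, invented for illustration only.
tweets = [
    "kamu memang orang bodoh",
    "kamu memang orang baik",
    "selamat pagi semua",
]
unigram_scores = tfidf(tweets, 1)
bigram_scores = tfidf(tweets, 2)
trigram_scores = tfidf(tweets, 3)
```

Each resulting dictionary can be treated as a sparse feature vector for one tweet. N-grams shared by many tweets receive low scores, while n-grams specific to a single tweet receive high ones, which is what lets a classifier such as Naive Bayes separate the classes.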
Copyright (c) 2019 Prosiding CELSciTech
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.