Detection of Hate Speech in Indonesian Language on Twitter Using Machine Learning Algorithm

  • Febby Apri Wenando
  • Evans Fuad

Abstract

Hate speech is an act of communication by an individual or group that provokes or insults other individuals or groups. It is prohibited because it can trigger violence and prejudice, whether on the part of those who make the statements or of their victims. This study aims to find the best algorithm for detecting hate speech in Indonesian-language tweets by comparing the Decision Tree, Naive Bayes, Support Vector Machine, and Random Forest algorithms, using N-gram-based word scoring (TF-IDF) with Unigram, Bigram, and Trigram features, implemented in Python. The results show that the Naive Bayes algorithm performs best with the Trigram feature, achieving an accuracy of 88.57%, a precision of 96.75%, and a recall of 99.34%.
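
As an illustration of the N-gram-based TF-IDF word scoring named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization and the plain textbook weighting tf × log(N / df); the abstract does not state the exact tokenizer or TF-IDF variant used in the study. The function names `word_ngrams` and `tfidf` and the sample sentences are hypothetical placeholders.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of text as space-joined strings."""
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n):
    """Score every n-gram of every document with textbook TF-IDF.

    tf  = occurrences of the n-gram in the document / n-grams in the document
    idf = log(N / df), with df the number of documents containing the n-gram
    """
    grams_per_doc = [word_ngrams(doc, n) for doc in docs]
    # Document frequency: count each n-gram once per document.
    df = Counter(g for grams in grams_per_doc for g in set(grams))
    total_docs = len(docs)
    scores = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        scores.append({
            g: (c / len(grams)) * math.log(total_docs / df[g])
            for g, c in counts.items()
        })
    return scores


if __name__ == "__main__":
    # Hypothetical placeholder sentences, not data from the study.
    sample = [
        "you are all great people",
        "you are all terrible people",
        "have a nice day",
    ]
    for n, name in [(1, "Unigram"), (2, "Bigram"), (3, "Trigram")]:
        print(name, tfidf(sample, n)[1])
```

Each returned dictionary maps an n-gram to its score for one document; turning these into fixed-length vectors would give the input for classifiers such as the four compared in the study. N-grams that occur in every document receive a score of zero under this weighting, which is one reason library implementations often use a smoothed variant instead.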

Published
2019-12-09