Volume 5 Issue 4 December 2018 - February 2019
Research Paper
Profiling Inappropriate Users’ Tweets Using Deep Long Short-Term Memory (LSTM) Neural Network
Abubakar Umar*, Sulaimon A. Bashir**, Laud Charles Ochei***, Ibrahim A. Adeyanju****
*-** Department of Computer Science, Federal University of Technology, Minna, Nigeria.
*** Robert Gordon University, Aberdeen, UK.
**** Department of Computer Engineering, Federal University, Oye-Ekiti, Nigeria.
Umar, A., Bashir, S. A., Ochei, L. C., & Adeyanju, I. A. (2019). Profiling Inappropriate Users' Tweets Using Deep Long Short-Term Memory (LSTM) Neural Network. i-manager's Journal on Pattern Recognition, 5(4), 27-43. https://doi.org/10.26634/jpr.5.4.15864
Abstract
In recent times, big Internet companies have come under increased pressure from governments and NGOs to remove inappropriate material from social media platforms (e.g., Twitter, Facebook, YouTube). A typical example of this problem is the posting of hateful, abusive, and violent tweets on Twitter, which has been blamed for inciting hatred and violence and for causing societal disturbances. Manually identifying such tweets and the people who post them is very difficult because of the large number of active users and the frequency with which such tweets are posted. Existing approaches to identifying inappropriate tweets have focused on detecting the tweets themselves without identifying the users who post them. This paper proposes an approach that can automatically identify different types of inappropriate tweets together with the users who post them. The proposed approach is based on a user profiling algorithm that uses a deep Long Short-Term Memory (LSTM) neural network trained to detect abusive language. Using word embedding features learned from the training set, the network classifies users' tweets into different abusive language categories. The user profiling algorithm then uses the classes assigned to each user's tweets to place that user in an abusive language category. Experiments on the test set show that the deep LSTM-based abusive language detection model achieved an accuracy of 89.14% in classifying a tweet as bigoted, offensive, racist, extremism-related, or neutral. The user profiling algorithm achieved an accuracy of 83.33% in predicting whether a user is a bigot, a racist, an extremist, a user of offensive language, or neutral.
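
The abstract does not specify the network configuration, so the following is a minimal sketch of a stacked (deep) LSTM tweet classifier of the kind described, written with the Keras API. The vocabulary size, maximum sequence length, embedding dimension, and layer widths are illustrative assumptions, not the authors' reported settings.

    from tensorflow.keras import Input, Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

    VOCAB_SIZE = 20000   # assumed vocabulary size
    MAX_LEN = 50         # assumed maximum tweet length in tokens
    EMBED_DIM = 100      # assumed embedding dimension
    NUM_CLASSES = 5      # bigoted, offensive, racist, extremism-related, neutral

    model = Sequential([
        Input(shape=(MAX_LEN,)),
        # Word embedding features learned from the training set, as in the paper
        Embedding(VOCAB_SIZE, EMBED_DIM),
        # Two stacked LSTM layers make the network "deep"; the first must
        # return the full sequence so the second can consume it
        LSTM(128, return_sequences=True),
        Dropout(0.5),
        LSTM(64),
        # Softmax over the five abusive-language categories
        Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()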
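Likewise, the abstract states only that the profiling algorithm uses the classes assigned to a user's tweets. One plausible realization is a majority-vote rule with a neutrality threshold, sketched below; the threshold value and the fallback to "neutral" are assumptions, not the paper's exact algorithm.

    from collections import Counter

    def profile_user(tweet_labels, threshold=0.5):
        # tweet_labels: per-tweet class predictions for one user, e.g.
        # ["racist", "neutral", "racist", "offensive", "racist"]
        counts = Counter(tweet_labels)
        label, freq = counts.most_common(1)[0]
        # Assumed rule: if no abusive category accounts for at least
        # `threshold` of the user's tweets, profile the user as neutral
        if label != "neutral" and freq / len(tweet_labels) < threshold:
            return "neutral"
        return label

    # Example: three of five tweets were classified as racist
    print(profile_user(["racist", "neutral", "racist", "offensive", "racist"]))
    # prints "racist"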