Bimonthly    Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
   
      30 Dec 2022, Volume 37 Issue 5   
    Article

    SOCIAL MEDIA HATE SPEECH DETECTION USING NLP AND DEEP LEARNING TECHNIQUE
    Professor Loveleen Kaur Pabla1, Dr. Prashant Kumar Jain2 and Dr. Prabhat Patel2
    Journal of Data Acquisition and Processing, 2022, 37 (5): 1922-1939.

    Abstract

    In many countries, people use offensive language in everyday communication, both online and offline. Whether every abusive exchange between two parties constitutes hate speech, however, remains a subject of investigation. This paper therefore focuses on differentiating hate speech from offensive language. The work has three parts: first, a study of recent developments in classifying hate speech on social media; second, a proposed algorithm for separating hate speech text from normal and offensive-language text; and third, an algorithm to identify the source of a hate spreader. The review of recent literature is organized into reviews and surveys, hate speech classification as a binary classification problem, and hate speech detection as a multi-class classification problem. A model for hate speech classification is then proposed, comprising data pre-processing, natural language processing (NLP), and Term Frequency-Inverse Document Frequency (TF-IDF) based feature extraction. The features are used to train a 2D Convolutional Neural Network (CNN) and a Support Vector Machine (SVM) model. Finally, an algorithm is proposed to identify the source of the hate spreader. A dataset available on Kaggle containing hate speech, offensive language, and normal text is used for experimental analysis. According to the findings, NLP features alone do not provide good accuracy on social media text. TF-IDF-based features alone achieve higher accuracy than NLP-based features, and a combination of both feature sets gives more accurate results than either technique individually.
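
    The TF-IDF feature extraction step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the smoothed IDF variant, the function names, and the three example sentences are all assumptions made for illustration, and the abstract does not specify which TF-IDF formulation the paper uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a simplistic stand-in for pre-processing)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of raw text strings."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (one common variant, assumed here): log((1 + N) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical toy corpus, not taken from the Kaggle dataset.
docs = ["you are awful", "have a nice day", "you have a nice smile"]
vocab, X = tfidf_vectors(docs)
```

    Each row of `X` is a fixed-length feature vector over `vocab` that could be passed to a classifier such as an SVM. Terms that occur in fewer documents (here "awful") receive a larger weight than terms shared across documents (here "you").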

    Keywords

    Hate speech detection, Offensive language, Text mining, Natural language processing, Deep Learning.

