Multi-class hate speech detection in the Norwegian language using FAST-RNN and multilingual fine-tuned transformers

Hashmi, Ehtesham; Yayilgan, Sule Yildirim

doi:10.1007/s40747-024-01392-5

Multi-class hate speech detection in the Norwegian language using FAST-RNN and multilingual fine-tuned transformers

Original Article
Open access
Published: 21 March 2024

Volume 10, pages 4535–4556, (2024)
Cite this article

Download PDF

You have full access to this open access article

Complex & Intelligent Systems Aims and scope Submit manuscript

Multi-class hate speech detection in the Norwegian language using FAST-RNN and multilingual fine-tuned transformers

Download PDF

947 Accesses
2 Citations
Explore all metrics

Abstract

The growth of social networks has provided a platform for individuals with prejudiced views, allowing them to spread hate speech and target others based on their gender, ethnicity, religion, or sexual orientation. While positive interactions within diverse communities can considerably enhance confidence, it is critical to recognize that negative comments can hurt people’s reputations and well-being. This emergence emphasizes the need for more diligent monitoring and robust policies on these platforms to protect individuals from such discriminatory and harmful behavior. Hate speech is often characterized as an intentional act of aggression directed at a specific group, typically meant to harm or marginalize them based on certain aspects of their identity. Most of the research related to hate speech has been conducted in resource-aware languages like English, Spanish, and French. However, low-resource European languages, such as Irish, Norwegian, Portuguese, Polish, Slovak, and many South Asian, present challenges due to limited linguistic resources, making information extraction labor-intensive. In this study, we present deep neural networks with FastText word embeddings using regularization methods for multi-class hate speech detection in the Norwegian language, along with the implementation of multilingual transformer-based models with hyperparameter tuning and generative configuration. FastText outperformed other deep learning models when stacked with Bidirectional LSTM and GRU, resulting in the FAST-RNN model. In the concluding phase, we compare our results with the state-of-the-art and perform interpretability modeling using Local Interpretable Model-Agnostic Explanations to achieve a more comprehensive understanding of the model’s decision-making mechanisms.

A Survey on Deep Learning Models to Detect Hate Speech and Bullying in Social Media

BERT-based ensemble learning for multi-aspect hate speech detection

Article 03 January 2023

Online Hate Speech Identification Using Fine-tuned ALBERT

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

The complexity and challenges of hate speech in the digital era

As digital technology advances, the era of social computing has significantly enhanced the way individuals interact, especially noticeable in the use of social media platforms and chat forums [30]. The concept of Hate Speech (HS), often veiled in complexity, holds diverse interpretations across regions and cultures, presenting significant hurdles in its detection and control, particularly in our digital age. HS appears in several forms [12], including cyberbullying [71], flaming [47], profanity [25], abusive language [50], toxicity [58], and discrimination [72]. While there is no universally accepted definition of HS, Nobata et al. [48] presented the most widely accepted: "any form of communication that denigrates a specific group of individuals based on attributes such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other distinguishing characteristics". Several studies align on a similar depiction of HS; [16, 34, 63, 65], characterizing it as an intentional act of hostility towards a specific group, influenced by real or perceived characteristics that constitute the group’s identity. The huge increase in disparaging remarks on Twitter and other cyber platforms is leading to physical violence in the real world. As a result, the research community considers the automated detection of hate-related content on Twitter as a significant challenge [76]. Online HS is at the junction of various societal disputes [26]. It demonstrates the revolutionary impact of technology by bringing both opportunities and difficulties. It is difficult to attain a balance between fundamental rights [27], such as freedom of expression [11], and the defense of human dignity.

Regulatory measures and the global response to online hate speech

Maintaining a safe and pleasant online environment may be extremely difficult because of how amplified such behavior can become when there is anonymity and disconnection from consequences in the real world [51]. Effective and accurate methods to identify and resolve these problems require immediate and keen attention because of their rapid growth and the nature of evolution. As the custodian of freedom of expression, UNESCO actively promotes mutual understanding through all forms of mass communication, including the Internet and social media [23, 40]. On the 31st of May 2016, a voluntary code to stop illegal HS online was introduced as a result of cooperation between the European Commission and Information Communications Technology (ICT) companies. This program mandates the removal of all content that aligns with the definition of HS as set forth by the European Union (E-U) [4]. With the outbreak of the COVID-19 pandemic, there has been a worldwide increase in HS and discrimination, prompting governments at all levels, from local to national, to emphasize the significance of community resilience. Furthermore, the impact of hatred and misinformation during the pandemic has been seen all around the world [19, 33]. The EU has established measures to control how external firms interact and combat the spread of hatred and its code of conduct has shown significant improvement in recent years.^{Footnote 1} These guidelines explicitly state that it is unlawful to participate in any activity that encourages or incites violence against a group or an individual, identifiable by characteristics, such as race, skin color, religion, ancestry, or cultural association [28]. The following figures; 1, 2 depict hate crime incidents in the US. From 2007 to 2020, an increased tendency was seen; however, stability was recorded from 2020 to 2021, indicating potential advances in handling hate.

This paper seeks to detect the HS for the Norwegian dataset by incorporating Deep Learning (DL) and multilingual transformer-based models with hyperparameter tuning. Next, the contributions of this research work are summarized, followed by how the rest of the paper is organized.

Work contributions

The contributions of this paper are as follows.

1.
In this paper, our primary contribution is the refinement and application of established HS detection methodologies through the use of regularization methods, hyperparameter tuning, and generative configurations. This approach has been meticulously applied to a baseline dataset in the Norwegian dialect, which includes class categories like neutral, provocative, offensive, moderately hateful, and hateful. The aim is to significantly enhance HS detection capabilities specifically for the Norwegian language and within these distinct categories.
2.
In addressing our classification problem, we strategically employed supervised FastText embeddings, offering distinct advantages over unsupervised FastText and other word embeddings. The supervised FastText embeddings are fine-tuned to the nuances of Norwegian HS data, capturing domain-specific context and enhancing the performance of sequential DL-based models, which include Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Convolutional Neural Network (CNN).
3.
We performed state-of-the-art multilingual transformer-based models, such as Multilingual Bidirectional Encoder Representations from Transformers (mBERT), ELECTRA, and FLAN-T5, along with Norwegian Language Models (LMs) like Nor-T5, Nor-BERT, Scandi-BERT, and nb-BERT. Notably, these Norwegian LMs, previously unexplored in the context of HS detection, were utilized with the implementation of hyperparameter tuning to optimize their performance for our task.
4.
This work involves the implementation of prompt-based fine-tuning using two different techniques, including few-shot and full fine-tuning with generative configuration. This approach allows us to harness the power of transformer-based models and adapt them to our specific task.
5.
Based on the best performance of Bidirectional LSTM and GRU (BiLSTM-GRU), we compared our results with the baseline study and performed the interpretability modeling with Local Interpretable Model-Agnostic Explanations (LIME) to achieve a more comprehensive understanding of the model’s decision-making mechanisms.

Structure of the paper

The rest of the paper is structured as follows: Sect. 2 discusses the existing research work on HS. Section 3 explains the proposed work methodology. Section 4 focuses on the results and discussions. Section 5 is based on the comparison of the results with the baseline methods. Section 6 is related to the interpretability modeling with LIME. Section 7 presents the conclusion and future work.

Related work

Recent advancements in Artificial Intelligence (AI) and Natural Language Processing (NLP) have heightened the prominence of HS detection, leading to the development of various innovative methods in this field [43]. These techniques enhance the understanding of HS and its implications, including monitoring social media and analyzing public discourse. Moreover, the primary focus of these studies has been on well-resourced languages such as English. This emphasis on languages with abundant resources has created a disparity in hate speech research, especially for languages with fewer resources, such as Irish, Portuguese, Norwegian, and various South Asian languages [5].

Machine learning-based methods

H. Elzayady et al. [18] introduced a method for detecting HS in Arabic dialects using a combination of classical machine learning (ML) and DL-based approaches in two phases, incorporating personality traits. In the first phase, the AraPersonality dataset was used, applying correlation validation between personality traits and HS. In the second phase, the Term Frequency-Inverse Document Frequency (TF-IDF) was used for feature extraction. These features were then input into ML models, including Decision Tree (DT) [15], Random Forest (RF), Logistic Regression (LR), Support Vector Machine, and Extreme Gradient Boosting (XGBoost), alongside DL models, such as CNN, LSTM, BiLSTM, and GRU. In their research, Mittal et al. [45] focused on HS detection in the English language using an ML-based approach. They employed the XGBoost model with a Count Vectorizer (CV) for feature extraction. Additionally, they integrated the LIME model to interpret the predictions made by the machine learning algorithm. Their methodology achieved an F1-score of 0.94, demonstrating its effectiveness.

William et al. [75] addressed a tertiary classification problem in HS detection using ML-based methods. They employed both TF-IDF Word2Vec for feature extraction, finding that TF-IDF yielded better results compared to Word2Vec embeddings. The study implemented various models, including SVM, RF, AdaBoost, and KNN. Among these, SVM was found to outperform the other models in terms of accuracy. In their study, Akuma et al. [1] analyzed a dataset of HS and offensive language from kaggle^{Footnote 2} using four ML-based algorithms: KNN, DT, LR, and NB, along with two distinct word embeddings, BoW and TF-IDF. Their work showed that DT when integrated with TF-IDF achieved the best accuracy score of 0.92 when compared to the other models in their research. A. Khanday et al. [32] conducted HS detection on Twitter using COVID-19-related tweets, applying ML and ensemble learning techniques with TF-IDF and BoW. They collected 30,000 tweets during the pandemic, of which 11,000 were annotated as containing hate-related content. The Stochastic Gradient Boosting (SGB) classifier emerged as the most effective, achieving an accuracy and F1-score of 0.98.

Deep learning and transformer-based methods

Saleh et al. [61] conducted binary classification for HS detection using BiLSTM and the transformer-based model BERT. Their research included three publicly available datasets: [16, 73, 74]. They employed three different types of embeddings: domain-specific, Word2Vec, and Global Vectors for Word Representation (GloVe) Word embeddings. The BERT model achieved a 96% F1-score on a combined balanced dataset, outperforming other DL-based methods. S. Nagar et al. [46] introduced a novel approach for HS, utilizing two publicly available datasets [21] and [22]. Their proposed model, named the Variational Graph Auto-Encoder (VGAC), leverages multi-modal data by combining two distinct features: the textual content of tweets and the social network structure of the users who posted them. Initially, the text from a tweet is encoded using a chosen text encoder, and then, it undergoes further processing with a Fully Forward Neural Network (FFNN). Concurrently, the user’s features, which include the social network structure, language usage, and metadata, are encoded using a social network encoder. By integrating the encoded text and user features, their framework aims to comprehensively understand the context of a tweet. Khan et al. [31] presented a deep neural network architecture for sentiment categorization in code-mixed texts. CNN layers are utilized for feature selection, and LSTM layers are applied to capture long-term dependencies in textual input. They also used several word embedding techniques, such as Word2Vec Continuous Bag of Words (CBoW), GLOVE, and FastText. A similar approach was used by Nagra et al. [46] where they conducted SA at the sentence level for RU using Faster Recurrent CNN (FR-CNN) on the RUSA-19 dataset.

Awal et al. [5] proposed a multilingual Model-Agnostic Meta-Learning (MAML) [52] method for detecting HS, employing different publicly available datasets. The base models used in their study were mBERT and XLM-R, alongside datasets from Founta et al. [21], i Orts, [49], Mandl et al. [39], and Bosco et al. [10]. In this study, their proposed model, HATE-MAML, outperformed the baseline models by over 3% in accuracy. In their study, Mazari et al. [41] performed multi-label HS detection using ensemble learning methods. They employed two different word embeddings, FastText and GloVe, and also trained a BERT model combined with BiLSTM and BiGRU, utilizing a dataset from the Kaggle.^{Footnote 3} The multi-labels in their study included categories, such as ’identity hate’, ’threat’, ’insult’, ’obscene’, ’toxic’, and ’severe toxic’. Ali et al. [2] performed a tertiary classification of HS detection on Twitter for the Urdu language. This classification was divided into three categories: hate speech, offensive, and neutral. They utilized deep learning-based models, such as LSTM and GRU, stacked with FastText embeddings, and also implemented a transformer-based BERT model using the Hugging Face tokenizer. Among these models, BERT emerged as the most accurate, achieving a notable accuracy score of 0.73. A similar approach was undertaken by Mehta et al. [42], where they applied traditional ML algorithms, SVM, MNB, RF, LR, and DL-based model LSTM and the transformer-based BERT model. Among these, LSTM emerged as the most effective, achieving an impressive accuracy score of 0.98. After reviewing the existing literature, we conclude that many studies addressed HS using traditional ML and DL-based methods for online data. We will do this analysis using DL, and multilingual transformers with hyperparameter tuning methods, instruction fine-tuning, and with generative configurations in a different way that will provide us with a deep understanding of these approaches.

Table 1 represents the comparative analysis of the current SOTA studies. The prevailing existing methods in HS detection have shown a tendency to underutilize multilingual transformers and language-specific transformers, particularly those that leverage the increasingly popular prompt-based fine-tuning technique in generative AI. Additionally, these methods have primarily focused on word embedding techniques, often giving less attention to the crucial aspects of regularization and hyperparameter tuning that are essential for ensuring the robust performance of algorithms. In contrast, our work not only integrates these advanced transformer models and emphasizes the importance of regularization but also pioneers in applying prompt-based fine-tuning with generative configuration and explainable AI for multi-class HS detection in low-resource Norwegian language.

Table 1 Comparative analysis of state-of-the-art methods

Multi-class hate speech detection in the Norwegian language using FAST-RNN and multilingual fine-tuned transformers

Abstract

Similar content being viewed by others

A Survey on Deep Learning Models to Detect Hate Speech and Bullying in Social Media

BERT-based ensemble learning for multi-aspect hate speech detection

Online Hate Speech Identification Using Fine-tuned ALBERT

Introduction

The complexity and challenges of hate speech in the digital era

Regulatory measures and the global response to online hate speech

Work contributions

Structure of the paper

Related work

Machine learning-based methods

Deep learning and transformer-based methods

Methodology

Dataset

Data preprocessing

Word embedding

Modeling approaches

DL-based models

Transformer-based models

Multilingual transformers

mBERT

ELECTRA

FLAN-T5

Norwegian LMs

Generative configuration

Prompt based fine-tuning

Few-shot fine-tuning

Full fine-tuning

Results and discussion

DL-based models

Transformer-based models

Comparison of the results with the state-of-the-art

Interpretability modeling with LIME

Conclusion and future work

Data availability and access

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflict of interest

Ethical and informed consent for data used

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation