BERT-based ensemble learning approach for sentiment analysis
H. Chouikhi et al.
1 Introduction
Practically, the Arabic language has a complex nature, due to its ambiguity and rich morphological system. This nature, combined with the variety of dialects and the lack of resources, represents a challenge to the progress of Arabic sentiment analysis (ASA) research. The major contributions of our present work are as follows:
2 Related work
The learning-based approaches to ASA can be classified into two categories: classical machine learning approaches and deep learning approaches.
Machine learning (ML) methods have been widely used for sentiment analysis. ML addresses sentiment analysis as a text classification problem. Many approaches, such as the support vector machine (SVM), maximum entropy (ME), the naive Bayes (NB) algorithm, and artificial neural networks (ANNs), have been proposed to handle ASA. NB and SVM are the most widely exploited machine learning algorithms for solving the sentiment classification problem [6]. Al-Rubaiee et al. [7] performed polarity classification and rating classification using SVM, multinomial NB (MNB), and Bernoulli NB (BNB). They achieved 90% accuracy for polarity classification and 50% for rating classification.
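The following minimal sketch (ours, not code from the cited works) illustrates this classical pipeline: TF-IDF features feeding an SVM and a multinomial NB classifier via scikit-learn. The tiny corpus is a hypothetical placeholder; real experiments would use LABR- or AJGT-scale data.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder corpus (1 = positive, 0 = negative); purely illustrative.
train_texts = ["خدمة ممتازة", "فيلم رائع", "تجربة سيئة", "منتج رديء"]
train_labels = [1, 1, 0, 0]
test_texts = ["كتاب رائع", "خدمة سيئة"]
test_labels = [1, 0]

for clf in (LinearSVC(), MultinomialNB()):
    # Sentiment analysis as text classification: TF-IDF features + classifier.
    pipe = make_pipeline(TfidfVectorizer(), clf)
    pipe.fit(train_texts, train_labels)
    print(type(clf).__name__, pipe.score(test_texts, test_labels))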
The use of deep learning (DL) is less common in Arabic SA than in English SA. [8] proposed an approach based on a recurrent neural network (RNN) that is trained on a constructed sentiment treebank and improved sentence-level sentiment analysis on English datasets. [9] used a CNN model for SA tasks and the Stanford segmenter to perform tweet tokenization and normalization, with Word2vec word embeddings on the ASTD dataset. [10] used an LSTM-CNN model with only two unbalanced classes (positive and negative) among the four classes (objective, subjective positive, subjective negative, and subjective mixed) from ASTD.
Since its release in 2018, many pretrained versions of BERT [18] have been proposed for sequence-learning tasks such as ASA. The recent trend in sentiment analysis is based on the BERT representation. Let us briefly recall BERT and the different versions that handle Arabic texts. BERT (Bidirectional Encoder Representations from Transformers) is pre-trained by conditioning on both left and right context in all layers, unlike previous language representation models. Applying BERT to an NLP task only requires fine-tuning one additional output layer for the downstream task (see Figure 1).
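As a hedged sketch of this fine-tuning setup, the snippet below loads a pretrained Arabic BERT checkpoint and adds a single randomly initialized classification layer through the Hugging Face transformers API; the checkpoint id asafaya/bert-base-arabic is our assumption, since the paper does not fix one at this point.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "asafaya/bert-base-arabic"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(name)
# num_labels=2 attaches exactly one new output layer for binary sentiment.
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

batch = tokenizer(["نص تجريبي"], return_tensors="pt", padding=True)
logits = model(**batch).logits  # scores over the two sentiment classes
print(logits.shape)             # torch.Size([1, 2])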
3 Proposed approach
– First, we split the input texts into tokens by tokenization. Figure 2 presents the result of the Arabic-BERT and mBERT tokenizers applied to an example sentence (S). We observe that the Arabic-BERT tokenizer is more appropriate for Arabic because it considers the characteristics of Arabic morphology (a tokenizer sketch follows this list).
Fig. 2. Comparison between the Arabic BERT (a) and mBERT (b) tokenizers.
– Second, we convert each text to BERT’s input format by adding the special [CLS] token at the beginning of each text and the [SEP] token between sentences and at the end. Then we run BERT to get the vector representation of each word.
– Finally, we add a classification layer on top of the [CLS] token representation to predict the text’s sentiment polarity.
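The sketch below mirrors the first two steps: it compares the two tokenizers on a sample sentence and shows that the encoding call inserts the [CLS]/[SEP] tokens automatically. The checkpoint ids and the sample sentence are our assumptions, not the paper's exact sentence S.

from transformers import AutoTokenizer

arabic_bert = AutoTokenizer.from_pretrained("asafaya/bert-base-arabic")  # assumed id
mbert = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

s = "أحب هذا الكتاب كثيرا"             # hypothetical example sentence
print(arabic_bert.tokenize(s))          # morphology-aware subword pieces
print(mbert.tokenize(s))                # typically more, less meaningful pieces

# Step 2: full encoding adds the special tokens ([CLS] ... [SEP]) for us.
print(arabic_bert(s)["input_ids"])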
Data augmentation (DA) consists of artificially increasing the size of the training dataset by generating new data points from existing data. It is used for low-resource languages, such as Arabic, to avoid overfitting and to create more diversity in the dataset. Data augmentation techniques can be applied at the character, word, and sentence levels. There are various data augmentation methods, including Easy Data Augmentation (EDA) methods, text generation, and back translation. This work uses a back-translation strategy [32], translating Arabic sentences into English and back into Arabic. We ran back translation on all the available ASA datasets, including the AJGT, LABR, HARD, and LargeASA datasets.
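A minimal back-translation sketch is given below, under the assumption that MarianMT checkpoints (Helsinki-NLP/opus-mt-ar-en and Helsinki-NLP/opus-mt-en-ar) back the Arabic-English round trip; the paper does not specify which translation system it uses.

from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    tok = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tok(texts, return_tensors="pt", padding=True)
    return tok.batch_decode(model.generate(**batch), skip_special_tokens=True)

def back_translate(arabic_texts):
    # Arabic -> English -> Arabic; the round trip yields paraphrased variants.
    english = translate(arabic_texts, "Helsinki-NLP/opus-mt-ar-en")
    return translate(english, "Helsinki-NLP/opus-mt-en-ar")

print(back_translate(["الخدمة في هذا الفندق ممتازة"]))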
In this section, we propose an ASA model based on Arabic BERT. As mentioned in the original paper, Arabic BERT is available in four versions: bert-mini-arabic, bert-medium-arabic, bert-base-arabic, and bert-large-arabic. We applied a grid-search strategy to find the best Arabic BERT version with the best hyperparameters [31]. Table 1 reports the hyperparameters of Arabic BERT for ASA obtained after our fine-tuning. We used the AJGT dataset [21] as the testing dataset.
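A sketch of this grid search follows. The candidate grids and the helper fine_tune_and_eval are hypothetical: the helper stands for a fine-tuning run on AJGT that returns test accuracy, and the actual search space is the one behind Table 1.

import itertools

VERSIONS = ["asafaya/bert-mini-arabic", "asafaya/bert-medium-arabic",
            "asafaya/bert-base-arabic", "asafaya/bert-large-arabic"]
LEARNING_RATES = [2e-5, 3e-5, 5e-5]   # assumed candidate values
BATCH_SIZES = [16, 32]                # assumed candidate values

def fine_tune_and_eval(checkpoint, lr, batch_size):
    """Hypothetical: fine-tune `checkpoint` on AJGT, return test accuracy."""
    raise NotImplementedError  # plug in a Trainer-based fine-tuning loop here

best = None
for version, lr, bs in itertools.product(VERSIONS, LEARNING_RATES, BATCH_SIZES):
    acc = fine_tune_and_eval(version, lr, bs)
    if best is None or acc > best[0]:
        best = (acc, version, lr, bs)
print("best configuration:", best)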
Among all the works cited, the approach of Safaya et al. [3] is the closest to ours. Figure 3 depicts the proposed architecture for Arabic SA. Our architecture consists of three blocks. The first block is the text preprocessing step, where we use an Arabic BERT tokenizer to split words into tokens. The second block is the training model: the Arabic BERT model is used with only 8 encoder layers (the medium configuration [3]), and the outputs of the last four hidden layers are concatenated to get a fixed-size representation. The third block is the classifier, where we use a dropout layer for regularization and a fully connected layer for the output.
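The following PyTorch sketch is our reading of these three blocks: a medium Arabic BERT encoder, the concatenated [CLS] vectors of the last four hidden layers as a fixed-size representation, then dropout and a fully connected classifier. The checkpoint id and the dropout rate are assumptions.

import torch
import torch.nn as nn
from transformers import AutoModel

class ASAMediumBERT(nn.Module):
    def __init__(self, name="asafaya/bert-medium-arabic", num_labels=2):
        super().__init__()
        # 8-layer (medium) encoder; expose all hidden states.
        self.bert = AutoModel.from_pretrained(name, output_hidden_states=True)
        hidden = self.bert.config.hidden_size
        self.dropout = nn.Dropout(0.1)  # regularization (assumed rate)
        self.classifier = nn.Linear(4 * hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Take the [CLS] position (index 0) from the last four hidden layers
        # and concatenate them into one fixed-size feature vector.
        cls_states = [h[:, 0] for h in out.hidden_states[-4:]]
        features = torch.cat(cls_states, dim=-1)
        return self.classifier(self.dropout(features))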
– Large-scale Arabic Book Reviews (LABR) [20] contains more than 63,000 book reviews in Arabic.
– Arabic Jordanian General Tweets (AJGT) [21] contains 1,800 tweets annotated as positive or negative.
– Large-scale Arabic Sentiment Analysis (LargeASA). We aggregate the HARD, LABR, and AJGT datasets into a large corpus for ASA. This dataset is available upon request.
Table 4 reports the variation of the accuracy according to the method used and the dataset. It shows that our model (ASA-medium BERT) and AraBERT [4] achieve very similar results. Our model gives the best results on the LABR, AJGT, and ArSenTD-Lev [23] datasets, while [4] gives the best results on the ASTD and HARD datasets. We found a slight difference in accuracy between the two works (92.6% compared to 91% on the ASTD dataset [24], and 86.7% compared to 87% on the LABR dataset). However, our model gives a very good result on the ArSenTD-Lev dataset (75%, compared to accuracy values that do not exceed 60% with the other models).
The first row block of Table 5 compares the three base models used in the stacking approach. It shows that medium Arabic BERT is the best performing and mBERT the worst performing of the three. We consider different stacking strategies for these base models to strengthen them, as sketched below.
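One such stacking strategy, sketched under assumptions here, feeds the base models' class probabilities to a logistic-regression meta-classifier; predict_proba_of is a hypothetical wrapper around each fine-tuned model's softmax outputs, and in practice the meta-features should come from out-of-fold predictions to avoid leakage.

import numpy as np
from sklearn.linear_model import LogisticRegression

def predict_proba_of(model, texts):
    """Hypothetical: return an (n_samples, n_classes) probability matrix."""
    raise NotImplementedError

def stack_predict(base_models, train_texts, train_labels, test_texts):
    # Column-stack each base model's probabilities as meta-features.
    meta_train = np.hstack([predict_proba_of(m, train_texts) for m in base_models])
    meta_test = np.hstack([predict_proba_of(m, test_texts) for m in base_models])
    meta_clf = LogisticRegression(max_iter=1000).fit(meta_train, train_labels)
    return meta_clf.predict(meta_test)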
5 Conclusion
In this paper, we proposed a BERT-based ensemble learning approach for Arabic sentiment analysis. We used medium Arabic BERT, AraBERT, and mBERT as base models. First, we showed that by fine-tuning the Arabic BERT model we outperform the state of the art for ASA. Second, the experimental results showed that the stacking strategy improves accuracy. As a continuation of this contribution, we plan to generalize our results to the case of sentiment analysis with intensities and to investigate more data augmentation techniques.
References
1. Chouikhi, H., Chniter, H., Jarray, F.: Stacking BERT based Models for Arabic Sentiment Analysis. In Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KEOD, 144-150. (2021). DOI: 10.5220/0010648400003064.
2. Dragoni, M., Poria, S., Cambria, E.: OntoSenticNet: A commonsense ontology for
sentiment analysis. IEEE Intelligent Systems, 33(3), 77-85.(2018).
3. Safaya, A., Abdullatif, M., Yuret, D.: Kuisail at semeval-2020 task 12: Bert-cnn
for offensive speech identification in social media. In Proceedings of the Fourteenth
Workshop on Semantic Evaluation, 2054-2059.(2020, December).
4. Antoun, W., Baly, F., Hajj, H.: Arabert: Transformer-based model for arabic lan-
guage understanding. arXiv preprint arXiv:2003.00104.(2020).
5. Devlin, J., Chang, M. W., Lee, K., Toutanova, K.: Bert: Pre-training of
deep bidirectional transformers for language understanding. arXiv preprint
arXiv:1810.04805.(2018).
6. Imran, A., Faiyaz, M., Akhtar, F.: An enhanced approach for quantitative prediction
of personality in facebook posts. International Journal of Education and Manage-
ment Engineering (IJEME), 8(2), 8-19.(2018).
7. Al-Rubaiee, H., Qiu, R., Li, D.: Identifying Mubasher software products through
sentiment analysis of Arabic tweets. In 2016 International Conference on Industrial
Informatics and Computer Systems (CIICS) (pp. 1-6). IEEE.(2016, March).
8. Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A. Y., Potts,
C.: Recursive deep models for semantic compositionality over a sentiment treebank.
In Proceedings of the 2013 conference on empirical methods in natural language
processing,1631-1642.(2013, October).
9. Rangel, F., Rosso, P., Charfi, A., Zaghouani, W., Ghanem, B., Sánchez-Junquera,
J.: Overview of the track on author profiling and deception detection in arabic.
Working Notes of FIRE 2019. CEUR-WS. org, vol. 2517, 70-83. (2019).
10. Alhumoud, S., Albuhairi, T., Alohaideb, W.: Hybrid sentiment analyser for Arabic
tweets using R. In 2015 7th International Joint Conference on Knowledge Discovery,
Knowledge Engineering and Knowledge Management (IC3K), Vol. 1, 417-424. IEEE.
(2015, November).
11. Zahran, M. A., Magooda, A., Mahgoub, A. Y., Raafat, H., Rashwan, M., Atyia,
A.: Word representations in vector space and their applications for arabic. In Inter-
national Conference on Intelligent Text Processing and Computational Linguistics,
430-443. Springer, Cham. (2015, April).
12. ElJundi, O., Antoun, W., El Droubi, N., Hajj, H., El-Hajj, W., Shaban, K.: hULMonA: The universal language model in arabic. In Proceedings of the fourth arabic natural language processing workshop, 68-77. (2019, August).
13. Lan, W., Chen, Y., Xu, W., Ritter, A.: An empirical study of pre-trained trans-
formers for Arabic information extraction. arXiv preprint arXiv:2004.14519.(2020).
14. Abdul-Mageed, M., Elmadany, A., Nagoudi, E. M. B.: ARBERT & MARBERT: deep bidirectional transformers for Arabic. arXiv preprint arXiv:2101.01785. (2020).
15. Farha, I. A., Magdy, W.: From arabic sentiment analysis to sarcasm detection:
The arsarcasm dataset. In Proceedings of the 4th Workshop on Open-Source Arabic
Corpora and Processing Tools, with a Shared Task on Offensive Language Detection,
32-39. (2020, May).
16. Abdelali, A., Hassan, S., Mubarak, H., Darwish, K., Samih, Y.: Pre-training bert
on arabic tweets: Practical considerations. arXiv preprint arXiv:2102.10684. (2021).
17. Grave, E., Bojanowski, P., Gupta, P., Joulin, A., Mikolov, T.: Learning word vec-
tors for 157 languages. arXiv preprint arXiv:1802.06893. (2018).
18. Devlin, J., Chang, M. W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidi-
rectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
(2018).
19. Elnagar, A., Khalifa, Y. S., Einea, A.: Hotel Arabic-reviews dataset construction for
sentiment analysis applications. In Intelligent natural language processing: Trends
and applications, 35-52. Springer, Cham. (2018).
20. Aly, M., Atiya, A.: Labr: A large scale arabic book reviews dataset. In Proceed-
ings of the 51st Annual Meeting of the Association for Computational Linguistics,
(Volume 2: Short Papers), 494-498. (2013, August).
21. Alomari, K. M., ElSherif, H. M., Shaalan, K.: Arabic tweets sentimental analysis
using machine learning. In International Conference on Industrial, Engineering and
Other Applications of Applied Intelligent Systems, 602-610. Springer, Cham. (2017,
June).
22. Joulin, A., Grave, E., Bojanowski, P., Mikolov, T.: Bag of tricks for efficient text
classification. arXiv preprint arXiv:1607.01759. (2016).
23. Baly, R., Khaddaj, A., Hajj, H., El-Hajj, W., Shaban, K. B.: Arsentd-lev: A multi-
topic corpus for target-based sentiment analysis in arabic levantine tweets. arXiv
preprint arXiv:1906.01830. (2019).
24. Nabil, M., Aly, M., Atiya, A.: Astd: Arabic sentiment tweets dataset. In Proceed-
ings of the 2015 conference on empirical methods in natural language processing,
2515-2519. (2015, September).
25. Ghanem, B., Karoui, J., Benamara, F., Moriceau, V., Rosso, P.: Idat at fire2019:
Overview of the track on irony detection in Arabic tweets. In Proceedings of the
11th Forum for Information Retrieval Evaluation, 10-13. (2019, December).
26. Shoukry, A., Rafea, A.: Sentence-level Arabic sentiment analysis. In 2012 interna-
tional conference on collaboration technologies and systems (CTS), 546-550. IEEE.
(2012, May).
27. Alhumoud, S., Albuhairi, T., Alohaideb, W.: Hybrid sentiment analyzer for Arabic
tweets using R. In 2015 7th International Joint Conference on Knowledge Discovery,
Knowledge Engineering and Knowledge Management (IC3K), Vol. 1, 417-424. IEEE.
(2015, November).
28. Eskander, R., Rambow, O.: Slsa: A sentiment lexicon for standard arabic. In Pro-
ceedings of the 2015 conference on empirical methods in natural language processing,
2545-2550. (2015, September).
29. Dahou, A., Elaziz, M. A., Zhou, J., Xiong, S.: Arabic sentiment classification using
convolutional neural network and differential evolution algorithm. Computational
intelligence and neuroscience. (2019).
30. Harrat, S., Meftouh, K., Smaili, K.: Machine translation for Arabic dialects (survey). Information Processing & Management, 56(2), 262-273. (2019).
31. Chouikhi, H., Chniter, H., Jarray, F.: Arabic sentiment analysis using BERT
model. In International Conference on Computational Collective Intelligence, 621-
632. Springer, Cham. (2021, September).
32. Ma, J., and Li, L.: Data Augmentation For Chinese Text Classification Using Back-
Translation. In Journal of Physics: Conference Series (Vol. 1651, No. 1, p. 012039).
IOP Publishing. (2020, November).