An Ensemble-based Model for Sentiment Analysis of Kurdish Tweets
DOI:
https://doi.org/10.14500/aro.12255
Keywords:
Deep learning, Ensemble method, Kurdish language, Machine learning, RoBERTa word embedding, Sentiment analysis
Abstract
Thousands of comments are generated daily on social media in the Kurdistan Region. Sentiment analysis (SA) of these comments is valuable for organizations. The Kurdish language has three main dialects: Sorani (Central), Northern, and Southern. This study focuses on Sorani SA, where existing methods have limited accuracy. The proposed ensemble combines diverse models to improve sentiment classification. The first phase of the method is preprocessing and word embedding using RoBERTa. The second phase consists of four proposed models, namely K-nearest neighbor, support vector machine, multilayer perceptron long short-term memory (LSTM), and bidirectional-LSTM (Bi-LSTM), which are used as classifiers. Finally, the ensemble weighted averaging technique is used to generate the final classification. The proposed model is evaluated first on an unbalanced dataset of 24,211 Sorani tweets and then on the same dataset after balancing. The Bi-LSTM model attained an accuracy of 89.87% on the balanced dataset, and the proposed ensemble method increased the accuracy to 91.76%, which exceeds the reported state-of-the-art methods for Kurdish SA.
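The weighted averaging step described in the abstract can be illustrated with a minimal sketch. It assumes that each classifier outputs a probability per sentiment class and that each classifier has a scalar weight; the two class labels, the three example classifiers, their probabilities, and the weights below are hypothetical values chosen for illustration, not figures from the study.

```python
from typing import List, Sequence


def weighted_average_ensemble(
    model_probs: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine per-model class probabilities into one weighted average."""
    if len(model_probs) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(model_probs[0])
    combined = [0.0] * n_classes
    for probs, weight in zip(model_probs, weights):
        if len(probs) != n_classes:
            raise ValueError("all models must score the same classes")
        for c, p in enumerate(probs):
            # Normalize the weight so the result stays a distribution.
            combined[c] += (weight / total) * p
    return combined


def predict(
    model_probs: Sequence[Sequence[float]],
    weights: Sequence[float],
    labels: Sequence[str],
) -> str:
    """Return the label with the highest weighted-average probability."""
    combined = weighted_average_ensemble(model_probs, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return labels[best]


# Hypothetical class labels and classifier outputs for a single tweet.
LABELS = ["negative", "positive"]
probs = [
    [0.60, 0.40],  # assumed output of a distance-based classifier
    [0.45, 0.55],  # assumed output of a margin-based classifier
    [0.20, 0.80],  # assumed output of a recurrent classifier
]
weights = [0.2, 0.3, 0.5]  # assumed weights, e.g. from validation accuracy

combined = weighted_average_ensemble(probs, weights)
label = predict(probs, weights, LABELS)
print(combined, label)
```

In this sketch the first classifier alone would vote "negative", but the more heavily weighted classifiers pull the averaged distribution toward "positive". How the weights are actually chosen in the study is not stated in the abstract; deriving them from validation accuracy is only one common option.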
References
Abdulla, S., and Hama, M.H., 2015. Sentiment analyses for Kurdish social network texts using naive bayes classifier. Journal of University of Human Development, 1(4), pp.393-397. DOI: https://doi.org/10.21928/juhd.v1n4y2015.pp393-397
Abdullah, A.A., Abdulla, S.H., Toufiq, D.M., Maghdid, H.S., Rashid, T.A., Farho, P.F., Sabr, S.S., Taher, A.H., Hamad, D.S., Veisi, H., and Asaad, A.T., 2024. NER-RoBERTa: Fine-Tuning RoBERTa for Named Entity Recognition (NER) within Low-Resource Languages. [arXiv Preprint].
Abdullah, M., and Shaikh, S., 2018. TeamUNCC at SemEval-2018 Task 1: Emotion Detection in English and Arabic Tweets Using Deep Learning. In: Proceedings of the 12th International Workshop on Semantic Evaluation. pp.350-357. DOI: https://doi.org/10.18653/v1/S18-1053
Ahmadi, S., 2020. KLPT - Kurdish Language Processing Toolkit. In: Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS). pp.72-84. DOI: https://doi.org/10.18653/v1/2020.nlposs-1.11
Alowisheq, A., Al-Twairesh, N., Altuwaijri, M., Almoammar, A., Alsuwailem, A., Albuhairi, T., Alahaideb, W., and Alhumoud, S., 2021. MARSA: Multi-domain Arabic resources for sentiment analysis. IEEE Access, 9, pp.142718-142728. DOI: https://doi.org/10.1109/ACCESS.2021.3120746
Al-Smadi, M., Qawasmeh, O., Al-Ayyoub, M., Jararweh, Y., and Gupta, B., 2018. Deep recurrent neural network vs. Support vector machine for aspect-based sentiment analysis of Arabic hotels’ reviews. Journal of Computational Science, 27, pp.386-393. DOI: https://doi.org/10.1016/j.jocs.2017.11.006
Amin, M.H.S.M., Al-Rassam, O., and Faeq, Z.S., 2022. Kurdish language sentiment analysis: Problems and challenges. Mathematical Statistician and Engineering Applications, 71(4), pp.3282-3293. DOI: https://doi.org/10.17762/msea.v71i4.890
Ashraf, M.R., Jana, Y., Umer, Q., Jaffar, M.A., Chung, S., and Ramay, W.Y., 2023. BERT-based sentiment analysis for low-resourced languages: A case study of Urdu language. IEEE Access, 11, pp.110245-110259. DOI: https://doi.org/10.1109/ACCESS.2023.3322101
Awlla, K., and Veisi, H., 2022. Central Kurdish sentiment analysis using deep learning. Journal of University of Anbar for Pure Science, 16(2), pp.119-130. DOI: https://doi.org/10.37652/juaps.2022.176501
Badawi, S., 2023. KMD: A New Kurdish Multilabel Emotional Dataset for the Kurdish Sorani Dialect. In: Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023). pp.308-315.
Badawi, S., Kazemi, A., and Rezaie, V., 2024. KurdiSent: A corpus for Kurdish sentiment analysis. Language Resources and Evaluation, pp.1-20.
Bordoloi, M., and Biswas, S.K., 2023. Sentiment analysis: A survey on design framework, applications and future scopes. Artificial Intelligence Review, 56, pp.12505-12560. DOI: https://doi.org/10.1007/s10462-023-10442-2
Chouikhi, H., Chniter, H., and Jarray, F., 2021. Arabic Sentiment Analysis Using BERT Model. In: International Conference on Computational Collective Intelligence. Springer International Publishing, Cham, pp.621-632. DOI: https://doi.org/10.1007/978-3-030-88113-9_50
Esmaili, K.S., Eliassi, D., Salavati, S., Aliabadi, P., Mohammadi, A., Yosefi, S., and Hakimi, S., 2013. Building a Test Collection for Sorani Kurdish. In: 2013 ACS International Conference on Computer Systems and Applications (AICCSA). IEEE, pp.1-7. DOI: https://doi.org/10.1109/AICCSA.2013.6616470
Eyvazi-Abdoljabbar, S., Kim, S., Feizi-Derakhshi, M.R., Farhadi, Z., and Mohammed, D.A., 2024. An Ensemble-based Model for Sentiment Analysis of Persian Comments on Instagram Using Deep Learning Algorithms. IEEE Access. DOI: https://doi.org/10.1109/ACCESS.2024.3473617
Heikal, M., Torki, M., and El-Makky, N., 2018. Sentiment analysis of Arabic tweets using deep learning. Procedia Computer Science, 142, pp.114-122. DOI: https://doi.org/10.1016/j.procs.2018.10.466
Hossin, M., Sulaiman, M.N., Mustapha, A., Mustapha, N., and Rahmat, R.W., 2011. A Hybrid Evaluation Metric for Optimizing Classifier. In: 2011 3rd Conference on Data Mining and Optimization (DMO). IEEE, pp.165-170. DOI: https://doi.org/10.1109/DMO.2011.5976522
Jaf, S., and Ramsay, A., 2014. Stemmer and a POS Tagger for Sorani Kurdish. In: 6th International Conference on Corpus Linguistics.
Karim, S.H.T., 2024. Kurdish social media sentiment corpus: Misyar marriage perspectives. Data in Brief, 57, p.110989. DOI: https://doi.org/10.1016/j.dib.2024.110989
Mahmud, D., Abdalla, B.A., and Faraj, A., 2023. Twitter sentiment analysis for Kurdish language. Qalaai Zanist Journal, 8(4), pp.1132-1144. DOI: https://doi.org/10.25212/lfu.qzj.8.4.42
Medhat, W., Hassan, A., and Korashy, H., 2014. Sentiment analysis algorithms and applications: A survey. Ain Shams Engineering Journal, 5(4), pp.1093-1113. DOI: https://doi.org/10.1016/j.asej.2014.04.011
Mohammed, F.S., Zakaria, L., Omar, N., and Albared, M.Y., 2012. Automatic Kurdish SORANi Text Categorization using N-Gram based Model. In: 2012 International Conference on Computer and Information Science (ICCIS). Vol. 1, IEEE, pp.392-395. DOI: https://doi.org/10.1109/ICCISci.2012.6297277
Mustafa, A.M., and Rashid, T.A., 2018. Kurdish stemmer preprocessing steps for improving information retrieval. Journal of Information Science, 44(1), pp.15-27. DOI: https://doi.org/10.1177/0165551516683617
Paredes-Valverde, M.A., Colomo-Palacios, R., Salas-Zárate, M.D.P., and Valencia-García, R., 2017. Sentiment analysis in Spanish for improvement of products and services: A deep learning approach. Scientific Programming, 2017(1), p.1329281. DOI: https://doi.org/10.1155/2017/1329281
Pouyanfar, S., Sadiq, S., Yan, Y., Tian, H., Tao, Y., Reyes, M.P., Shyu, M.L., Chen, S.C., and Iyengar, S.S., 2018. A survey on deep learning: Algorithms, techniques, and applications. ACM Computing Surveys, 51(5), pp.1-36. DOI: https://doi.org/10.1145/3234150
Roshanfekr, B., Khadivi, S., and Rahmati, M., 2017. Sentiment Analysis Using Deep Learning on Persian Texts. In: 2017 Iranian Conference on Electrical Engineering (ICEE). IEEE, pp.1503-1508. DOI: https://doi.org/10.1109/IranianCEE.2017.7985281
Sarker, I.H., 2021. Deep learning: A comprehensive overview on techniques, taxonomy, applications and research directions. SN Computer Science, 2(6), p.420. DOI: https://doi.org/10.1007/s42979-021-00815-1
Shakeel, M.H., Faizullah, S., Alghamidi, T., and Khan, I., 2020. Language Independent Sentiment Analysis. In: 2019 International Conference on Advances in the Emerging Computing Technologies (AECT). IEEE, pp.1-5. DOI: https://doi.org/10.1109/AECT47998.2020.9194186
Sumit, S.H., Hossan, M.Z., Al Muntasir, T., and Sourov, T., 2018. Exploring Word Embedding for Bangla Sentiment Analysis. In: 2018 International Conference on Bangla Speech and Language Processing (ICBSLP). IEEE, pp.1-5. DOI: https://doi.org/10.1109/ICBSLP.2018.8554443
Tsai, C.F., Chen, K., Hu, Y.H., and Chen, W.K., 2020. Improving text summarization of online hotel reviews with review helpfulness and sentiment. Tourism Management, 80, p.104122. DOI: https://doi.org/10.1016/j.tourman.2020.104122
Vateekul, P., and Koomsubha, T., 2016. A Study of Sentiment Analysis Using Deep Learning Techniques on Thai Twitter Data. In: 2016 13th International Joint Conference on Computer Science and Software Engineering (JCSSE). IEEE, pp.1-6. DOI: https://doi.org/10.1109/JCSSE.2016.7748849
Wady, S.H., Badawi, S., and Kurt, F., 2024. A Kurdish Sorani twitter dataset for language modelling. Data in Brief, 57, p.110967. DOI: https://doi.org/10.1016/j.dib.2024.110967
Walther, G., and Sagot, B., 2010. Developing a Large-Scale Lexicon for a Less-Resourced Language: General Methodology and Preliminary Experiments on Sorani Kurdish. In: Proceedings of the 7th SaLTMiL Workshop on Creation and Use of basic Lexical Resources for Less-Resourced Languages (LREC 2010 Workshop).
Yu, Y., Si, X., Hu, C., and Zhang, J., 2019. A review of recurrent neural networks: LSTM cells and network architectures. Neural Computation, 31(7), pp.1235-1270. DOI: https://doi.org/10.1162/neco_a_01199
License
Copyright (c) 2025 Sabat S. Muhamad, Abdulhady A. Abdullah, Hakem Beitollahi, Shamal A. Abdullah, Rezhin S. Shahab, Ashna D. Zrar

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who choose to publish their work with Aro agree to the following terms:
- Authors retain the copyright to their work and grant the journal the right of first publication. The work is simultaneously licensed under a Creative Commons Attribution License [CC BY-NC-SA 4.0]. This license allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors have the freedom to enter into separate agreements for the non-exclusive distribution of the journal's published version of the work. This includes options such as posting it to an institutional repository or publishing it in a book, as long as proper acknowledgement is given to its initial publication in this journal.
- Authors are encouraged to share and post their work online, including in institutional repositories or on their personal websites, both prior to and during the submission process. This practice can lead to productive exchanges and increase the visibility and citation of the published work.
By agreeing to these terms, authors acknowledge the importance of open access and the benefits it brings to the scholarly community.
Accepted 2025-10-24
Published 2025-12-11







ARO Journal is a scientific, peer-reviewed, periodical, diamond open access journal (OAJ) that has no article processing charge (APC) or article submission charge (ASC).