EMOTION PREDICTION IN INDONESIAN TEXT USING THE INDOBERT MODEL

Authors

  • Ade Chandra Saputra, Universitas Palangka Raya
  • Agus Sehatman Saragih, Universitas Palangka Raya
  • Deddy Ronaldo, Universitas Palangka Raya

DOI:

https://doi.org/10.47111/jti.v19i1.17617

Keywords:

NLP, Emotion, IndoBERT, Indonesian, Emotion Prediction

Abstract

This study aims to predict emotions in Indonesian-language text using the IndoBERT model. Emotions play an essential role in human communication and have a significant impact on sentiment analysis and natural language processing. In Indonesia, the lack of datasets and models optimized for emotion analysis in the Indonesian language poses a major challenge. This research utilizes IndoBERT, a BERT-based model pre-trained specifically for Indonesian, to predict six emotion categories: anger, sadness, happiness, love, fear, and disgust. The research methodology comprises data collection from the social media platform X, data preprocessing, emotion labeling, model training, and performance evaluation using accuracy, precision, recall, and F1-score. The model achieves an overall accuracy of 73%, performing strongly on emotions such as "disgust" and "fear," although it sometimes misclassifies similar emotions such as "happiness" and "love." These findings indicate that IndoBERT has significant potential for emotion prediction in Indonesian and provide a foundation for developing more culturally relevant NLP technologies for Indonesia.
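The preprocessing and per-class evaluation steps described in the abstract can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the stopword set, the colloquial-normalization lexicon, and the sample predictions below are invented placeholders, and the real study fine-tunes IndoBERT rather than using hand-written rules.

```python
# Sketch of the preprocessing and evaluation steps outlined in the abstract.
# STOPWORDS, COLLOQUIAL, and the sample labels are illustrative assumptions,
# not the study's actual resources or results.

STOPWORDS = {"yang", "dan", "di", "ke"}          # assumed sample Indonesian stopwords
COLLOQUIAL = {"gak": "tidak", "bgt": "banget"}   # assumed sample normalization lexicon
LABELS = ["anger", "sadness", "happiness", "love", "fear", "disgust"]

def preprocess(text: str) -> str:
    """Case folding, colloquial normalization, then stopword removal."""
    tokens = text.lower().split()
    tokens = [COLLOQUIAL.get(t, t) for t in tokens]
    return " ".join(t for t in tokens if t not in STOPWORDS)

def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for a single emotion label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

if __name__ == "__main__":
    print(preprocess("Aku gak suka bgt yang ini"))  # -> "aku tidak suka banget ini"
    # Toy gold labels vs. predictions, to show the overall-accuracy computation.
    y_true = ["anger", "love", "happiness", "fear"]
    y_pred = ["anger", "happiness", "happiness", "fear"]
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    print(f"accuracy = {accuracy:.2f}")
    for label in LABELS:
        p, r, f1 = per_class_metrics(y_true, y_pred, label)
        print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In the actual study these per-class scores would be computed over IndoBERT's predictions on a held-out test set; the confusion between "happiness" and "love" reported in the abstract would show up here as false positives for one label counted as false negatives for the other.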

References

Albab, M. A., Karuniawati, P., & Fawaiq, H. (2023). Implementation of Indonesian Stopword Removal for Improved Text Classification Accuracy. Journal of Language Technology and Computation, 14(1), 35–42. https://doi.org/10.3390/jltc.v14i1.500.

Cambria, E., Poria, S., Bajpai, R., & Schuller, B. (2017). SenticNet 5: Discovering Conceptual Primitives for Sentiment Analysis and Emotion Detection. Proceedings of the 2017 Conference on Artificial Intelligence (AAAI), 1795–1801.

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186. https://doi.org/10.18653/v1/N19-1423.

Gupta, S. (2021). Fine-Tuning Pre-trained Language Models for Text Classification: A Comparative Study. Journal of Computational Linguistics and Natural Language Processing, 9(1), 67–75. https://doi.org/10.1016/j.jclnlp.2021.01.005.

Miyajiwala, A., Patel, R., & Desai, H. (2022). Stopword Removal Techniques for Text Preprocessing in Natural Language Processing. International Journal of Advanced Computer Science and Applications, 13(4), 220–227. https://doi.org/10.14569/IJACSA.2022.0130428.

Munezero, M., Montero, C. S., Sutinen, E., & Pajunen, J. (2014). Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in Text. IEEE Transactions on Affective Computing, 5(2), 101–111. https://doi.org/10.1109/TAFFC.2014.2317187.

Rosid, M., Anwar, R., & Fadillah, N. (2020). Stemming Algorithms for Indonesian Text: A Comparative Study. Indonesian Journal of Computational Linguistics, 8(3), 145–153. https://doi.org/10.31227/osf.io/ijcl2020.

Salsabila, A., Pratama, H., & Utami, L. (2018). Colloquial Indonesian Lexicon: Dataset for Normalizing Informal Indonesian Text. Proceedings of the 2018 International Conference on Asian Language Processing (IALP), 245–250. https://doi.org/10.1109/IALP.2018.8629152

Sari, R., & Ruldeviyani, Y. (2020). Case Folding Techniques in Text Preprocessing for Sentiment Analysis on Social Media. Journal of Information Systems and Informatics, 5(2), 115–123. https://doi.org/10.31763/jisi.v5i2.350.

Wang, B., Zhang, W., & Zhou, H. (2021). A New Framework for Emotion Detection Using Convolutional Neural Networks and NLP. Journal of Machine Learning and Language Processing, 12(3), 102–119.

Wilie, B., Vincentio, T., Winata, G. I., Cahyawijaya, S., Li, X., Lim, Z., Mahendra, R., Kuncoro, A., Ruder, S., & Fung, P. (2020). IndoBERTweet: A Pre-trained Language Model for Indonesian Twitter with Sociocultural Awareness. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2572–2580. https://doi.org/10.18653/v1/2020.emnlp-main.204.

Published

2025-01-31