Comparative Study of the Performance of Naïve Bayes, SVM, and K-NN Algorithms for Sentiment Analysis and Topic Modeling of #KaburAjaDulu Hashtags

Authors

  • Sonia Tikamidia, Airlangga University, Surabaya, Indonesia
  • Imam Yuadi, Airlangga University, Surabaya, Indonesia

DOI:

https://doi.org/10.38035/dijemss.v7i1.5119

Keywords:

#KaburAjaDulu, Sentiment Analysis, LDA

Abstract

The #KaburAjaDulu hashtag, widely discussed on the platform X, reflects growing anxiety among Indonesia's younger generation about socio-economic conditions and the direction of state policy. This research assesses public perception of the hashtag through sentiment analysis and topic modeling. Data were collected from tweets posted by X users from May to June 2025. The methods include text preprocessing; sentiment classification using the Naïve Bayes, Support Vector Machine (SVM), and K-Nearest Neighbors (K-NN) algorithms; and topic modeling with Latent Dirichlet Allocation (LDA). The results show that SVM performed best, with 98.93% accuracy and the best balance between precision and recall. The Naïve Bayes model was also competitive but tended to favour the positive class. In contrast, K-NN showed the lowest performance, which the study attributes to its inability to overcome the curse of dimensionality in the TF-IDF representation. LDA topic modeling identified three main themes: the employment crisis, distrust of institutions due to corruption, and the dilemma between nationalism and migration. These three topics indicate deep psychological conflicts experienced by young people. The findings support Self-Determination Theory, which emphasizes the importance of autonomy, competence, and social connection for an individual's attachment to their environment. When these needs are not fulfilled, migration intentions arise as a form of escape or as an adaptive strategy. This research offers a practical contribution to the design of human resources (HR) policies based on social data. The approach can also serve as the basis for a real-time system for monitoring public perception.
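
The abstract compares classifiers on both accuracy and the balance between precision and recall, and notes that one model tends to favour the positive class. The following is a minimal sketch of why accuracy alone can hide such a bias. The labels are hypothetical and are not data from the study, and the helper name `binary_metrics` is an assumption introduced only for illustration.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="positive"):
    """Return accuracy, precision, recall and F1, treating `positive` as the target class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth: 6 positive and 4 negative tweets (illustrative only).
y_true = ["positive"] * 6 + ["negative"] * 4
# Hypothetical classifier that leans positive: it labels 8 of the 10 tweets as positive,
# so it recovers every positive tweet but mislabels two negative ones.
y_biased = ["positive"] * 8 + ["negative"] * 2

positive_view = binary_metrics(y_true, y_biased, positive="positive")
negative_view = binary_metrics(y_true, y_biased, positive="negative")

print("positive class:", positive_view)
print("negative class:", negative_view)
```

In this toy case the overall accuracy is the same from either view, yet recall for the negative class is only half of what it is for the positive class. Per-class precision and recall expose an imbalance that a single accuracy figure does not.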


Published

2025-10-15

How to Cite

Tikamidia, S., & Yuadi, I. (2025). Comparative Study of the Performance of Naïve Bayes, SVM, and K-NN Algorithms for Sentiment Analysis and Topic Modeling of #KaburAjaDulu Hashtags. Dinasti International Journal of Education Management and Social Science, 7(1), 237–247. https://doi.org/10.38035/dijemss.v7i1.5119