Comparative Analysis of Machine Learning Algorithms for Indonesian Twitter Sentiment Classification on the Jakarta–Bandung High-Speed Rail Project
DOI:
https://doi.org/10.65780/bima.v1i1.3
Keywords:
sentiment analysis, social media, machine learning, Indonesian language, Support Vector Machine
Abstract
The rapid growth of social media in Indonesia has opened new opportunities to gauge public opinion on major national initiatives. One of the most controversial is the Jakarta–Bandung High-Speed Railway (KCJB), which has drawn mixed responses because of its financial, environmental, and socio-political implications. To meet the need for systematic analysis, this study applies sentiment analysis to Indonesian-language Twitter data to evaluate public perspectives on the KCJB project. The methodology proceeds in stages: data collection via the Twitter API, text preprocessing, manual labeling of tweets as positive or negative, and feature extraction using the Term Frequency–Inverse Document Frequency (TF-IDF) method. Four machine learning algorithms, namely Naïve Bayes, Support Vector Machine (SVM), K-Nearest Neighbors (K-NN), and Random Forest, were trained and validated on stratified data splits, with performance evaluated using accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). The results show that SVM consistently outperforms the other models, achieving up to 73% accuracy with balanced precision and recall, as well as the highest AUC value. These findings confirm the robustness of SVM in handling high-dimensional Indonesian text. Beyond its academic contribution to sentiment analysis in low-resource languages, this research has practical implications: it offers policymakers and businesses data-driven insights for real-time monitoring, strategic communication, and informed decision-making.
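As an illustration of two steps named in the abstract, TF-IDF feature extraction and evaluation by accuracy, precision, recall, and F1-score, the following is a minimal pure-Python sketch. It is not the study's implementation. The abstract does not state which TF-IDF weighting was used, so the variant here (term count divided by document length, multiplied by the natural log of N over document frequency) is an assumption, as are the function names `tfidf` and `binary_metrics`. The example tweets and labels are invented toy data, not drawn from the study's corpus. AUC is omitted because it requires classifier scores rather than hard labels.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def binary_metrics(y_true, y_pred, positive="positive"):
    """Accuracy, precision, recall, and F1 with respect to one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented toy tweets (already lowercased and cleaned), for illustration only.
toy_docs = ["kereta cepat bagus", "kereta cepat mahal"]
toy_vectors = tfidf(toy_docs)

# Invented gold labels and predictions, for illustration only.
toy_true = ["positive", "positive", "negative", "negative"]
toy_pred = ["positive", "negative", "negative", "positive"]
toy_scores = binary_metrics(toy_true, toy_pred)
```

In this toy corpus, terms shared by both documents ("kereta", "cepat") receive zero weight, while the distinguishing terms ("bagus", "mahal") carry all of it. This is the property that makes TF-IDF useful for separating sentiment-bearing words from topic words that appear in nearly every tweet about the project.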














