• Dedy Sugiarto Universitas Trisakti
  • Syandra Sari
  • Anung Barlianto Ariwibowo
  • Fitria Nabilah Putri
  • Dimmas Mulya
  • Tasya Aulia
  • Arviandri Naufal Zaki


rice, product review, sentiment analysis, support vector machine, naive Bayes, logistic regression, k-nearest neighbor, TF-IDF, bag of words


This study compares the performance of sentiment classification for product purchase reviews on the Shopee marketplace using four classification algorithms, namely support vector machine (SVM), naïve Bayes (NB), logistic regression (LR), and k-nearest neighbor (KNN), each paired with one of two feature extraction models: term frequency-inverse document frequency (TF-IDF) and bag of words (BOW). Data were collected by extracting rice product reviews from the Shopee website with a web scraping technique and saving them as a CSV file. A total of 3531 product reviews were obtained; after pre-processing to eliminate duplicate reviews, 464 reviews remained, of which 16.17% have a negative label (rating 1 or 2), 15.52% have a neutral label (rating 3), and 68.32% have a positive label (rating 4 or 5). This distribution of ratings shows that the data are imbalanced. The experimental results show that the combination of LR with TF-IDF performs best, with an accuracy of 80%.
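The labeling rule stated in the abstract (ratings 1 or 2 are negative, 3 is neutral, 4 or 5 are positive) and the duplicate-removal step can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy reviews, and the duplicate criterion (same text after trimming and lowercasing, plus same rating) are assumptions made for illustration only.

```python
from collections import Counter


def rating_to_label(rating):
    """Map a 1-5 star rating to a sentiment label, following the abstract."""
    if rating in (1, 2):
        return "negative"
    if rating == 3:
        return "neutral"
    if rating in (4, 5):
        return "positive"
    raise ValueError(f"rating out of range: {rating}")


def deduplicate(reviews):
    """Drop repeated reviews, keeping the first occurrence.

    Assumption: two reviews count as duplicates when their text
    (trimmed, case-insensitive) and their rating both match.
    """
    seen = set()
    unique = []
    for text, rating in reviews:
        key = (text.strip().lower(), rating)
        if key not in seen:
            seen.add(key)
            unique.append((text, rating))
    return unique


def label_shares(reviews):
    """Return the percentage of reviews per sentiment label."""
    counts = Counter(rating_to_label(rating) for _, rating in reviews)
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Toy data (hypothetical, not taken from the study's dataset).
sample = [
    ("Beras pulen dan wangi", 5),
    ("Beras pulen dan wangi", 5),  # exact duplicate, removed below
    ("Pengiriman cepat", 4),
    ("Biasa saja", 3),
    ("Banyak kutu", 1),
]

unique = deduplicate(sample)
shares = label_shares(unique)
print(len(unique), shares)
```

Computing the class shares after deduplication, as here, is what exposes an imbalance like the one reported in the abstract; with a skewed label distribution, accuracy alone can favor the majority class.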


