Utilizing NLP Sentiment Analysis Approach to Categorize Amazon Reviews against an Extended Testing Set


  • Arman Sarraf, Department of Electrical and Software Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada


Machine Learning, NLP, Sentiment Analysis


Sentiment analysis, also known as opinion mining, is a central task in natural language processing (NLP). It determines the polarity of a text, that is, whether the text conveys positive or negative sentiment. In e-commerce, sentiment analysis is particularly valuable: it gives businesses a nuanced understanding of how customers perceive their brand and products, as reflected in customer reviews, which supports market understanding and strategic decision-making. This study analyzed the Amazon food reviews dataset, augmented the original dataset with newly generated data, and then preprocessed the data through text cleaning, stop-word removal, lemmatization, and stemming. Machine learning models were then built, trained, and evaluated on features produced by NLP feature extraction techniques, both to address the sentiment analysis task and to investigate the impact of increased data volume on model performance. Three feature extraction techniques were used: term frequency-inverse document frequency (TF-IDF), Word2Vec (W2V), and Bag of Words (BoW). Three machine learning models, namely Logistic Regression, Decision Tree, and Random Forest, were designed, implemented, and assessed. The models' performance was compared after hyperparameter optimization to determine the most effective approach. The models performed consistently, with accuracy ranging from 85% to 89% on the testing dataset. The Logistic Regression model with BoW features outperformed the other models, reaching 89% accuracy on the testing dataset after optimization.
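As a rough illustration of two of the feature extraction techniques named in the abstract, the sketch below builds BoW count vectors and TF-IDF vectors for a few toy reviews using only the Python standard library. It is not the study's implementation: the sample reviews, the whitespace tokenizer, and all function names are assumptions made for illustration, and the TF-IDF variant shown is the plain unsmoothed form tf × log(N / df), which differs from library implementations that add smoothing and normalization.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace only.
    return text.lower().split()


def build_vocabulary(docs):
    # Sorted list of every distinct token seen in the corpus.
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def bow_vector(doc, vocab):
    # Bag of Words: raw count of each vocabulary term in the document.
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]


def tfidf_vectors(docs, vocab):
    # Unsmoothed TF-IDF: (term count / document length) * log(N / df).
    n_docs = len(docs)
    tokenized = [tokenize(d) for d in docs]
    df = {t: sum(1 for toks in tokenized if t in toks) for t in vocab}
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([(counts[t] / total) * idf[t] for t in vocab])
    return vectors


# Hypothetical toy reviews, not taken from the Amazon dataset.
reviews = ["great taste great price", "bad taste", "great value"]
vocab = build_vocabulary(reviews)
bow = [bow_vector(r, vocab) for r in reviews]
tfidf = tfidf_vectors(reviews, vocab)
```

In this sketch, BoW keeps only term counts, while TF-IDF down-weights terms that occur in many reviews, so a word such as "bad" that appears in a single review receives more weight than "taste", which appears in two. Either set of vectors could then be passed to a classifier such as logistic regression.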


How to Cite

Arman Sarraf. (2024). Utilizing NLP Sentiment Analysis Approach to Categorize Amazon Reviews against an Extended Testing Set. International Journal of Computer (IJC), 50(1), 107–116. Retrieved from https://ijcjournal.org/index.php/InternationalJournalOfComputer/article/view/2199