Opinion Text Analysis Using Artificial Intelligence
Volume 8 Issue 3, May-June 2024 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
INTRODUCTION
A social media platform such as Facebook provides an opportunity for users to share their views and opinions, as well as to connect, communicate, and contribute to specific topics through short messages referred to as comments. This can be accomplished using text, images, and videos, among other things, and users can interact by clicking the like, comment, and repost icons. As more individuals use social media, the study of data available online can shed light on people's evolving views, conduct, and cognition [1]. As a result, employing Twitter or Facebook data for sentiment analysis has grown more common. The growing interest in social media analysis has increased the focus on text analysis technologies such as Natural Language Processing (NLP) and Artificial Intelligence (AI) [2]. Text analysis makes it possible to assess the sentiments and attitudes of specific target groups. Most of the existing literature concentrates on English texts; however, there is an increasing interest in multilingual analysis [3]. Text analysis can be performed by retrieving subjective comments about a specific issue and assigning them sentiments such as positive, negative, and neutral [4].
Sentiment analysis can be applied to social media data to investigate variations in people's behavior, feelings, and opinions, for example by characterizing how political campaigns spread. In this work, we use social media research to explore the sentiments of young people. This article examines the sentiment of tweets using multiple methodologies, including lexical and machine learning techniques. The time required is a major issue for existing machine learning approaches, posing a hurdle for firms seeking to move their operations to automated workflows. Deep learning techniques have been applied to a variety of real-world applications, including sentiment analysis. Deep learning techniques use various algorithms to extract information from raw data, such as texts or tweets, and represent it in specific types of models. These models are then used to derive information from new datasets that they have not seen before.
@ IJTSRD | Unique Paper ID – IJTSRD64847 | Volume – 8 | Issue – 3 | May-June 2024 Page 168
International Journal of Trend in Scientific Research and Development @ www.ijtsrd.com eISSN: 2456-6470
MACHINE LEARNING APPROACH
Machine learning algorithms can build classifiers that perform sentiment categorization from feature vectors. The process mainly comprises data collection and cleaning, feature extraction, training the classifier on the data, and evaluating the outcomes [11]. When machine learning techniques are employed, the dataset should be separated into two parts: training and testing. The training set helps the classifier learn text features, while the test set evaluates its performance. Classifiers (for example, Support Vector Machines) assign text to predefined categories. Machine learning is a common technique for text classification among scholars. However, the accuracy of the same classifier can vary significantly across different types of text, so a classifier ought to be trained independently on the feature vectors of each kind of text. In the following step, the tweet data must be vectorized and divided into a training set and a test set, after which the sentiment labels can be predicted by employing various classification models.
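The vectorization and train/test split described above can be sketched as follows. This is a minimal illustration using only the Python standard library: the bag-of-words representation, the 25% test fraction, the function names, and the sample comments are all assumptions made for the example, not the configuration used in this study.

```python
import random
from collections import Counter

def build_vocabulary(texts):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Return a bag-of-words count vector; tokens outside vocab are ignored."""
    counts = Counter(text.lower().split())
    vector = [0] * len(vocab)
    for token, count in counts.items():
        if token in vocab:
            vector[vocab[token]] = count
    return vector

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labelled comments, invented for illustration only.
data = [
    ("great phone love it", "positive"),
    ("terrible service never again", "negative"),
    ("love the new update", "positive"),
    ("worst purchase ever", "negative"),
]

train, test = train_test_split(data)
# The vocabulary is built from the training set only, so the test set
# does not leak into feature extraction.
vocab = build_vocabulary(text for text, _ in train)
X_train = [vectorize(text, vocab) for text, _ in train]
X_test = [vectorize(text, vocab) for text, _ in test]
```

Building the vocabulary from the training portion alone keeps the test set unseen, which is what allows it to measure how the classifier behaves on new text.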
Support Vector Machine Classification
The goal of this method is to find a linear separator (hyperplane) in the feature vector space that separates the different classes of input vectors. The support vectors are the training points closest to the separator, and the primary goal is to identify the hyperplane that lies as far as possible from them. Setting up SVC involves computing the distance between the hyperplane and the nearest support vectors, known as the margin; maximizing that margin to find the optimal hyperplane for the given data; and using this hyperplane as the decision boundary between the classes. Once the hyperplane has been found, the extracted text features can be fed into the classifier to predict the outcomes.
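As a rough illustration of margin maximization, the sketch below trains a linear classifier by sub-gradient descent on the L2-regularized hinge loss. This is a simple stand-in for a full SVC solver, not the implementation used in this study; the toy 2-D points and the hyperparameters `lam`, `lr`, and `epochs` are assumed values chosen for the example.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights w and bias b by sub-gradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Point is misclassified or inside the margin:
                # the hinge term is active and pulls w toward yi * xi.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Only the regularizer acts; shrinking w widens the margin.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Classify x by which side of the hyperplane w.x + b = 0 it falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical, linearly separable 2-D points for illustration only.
X = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

In this toy set, the points nearest the boundary, `[2, 2]` and `[-2, -2]`, are the ones that trigger updates once the other points are safely outside the margin, which mirrors the role of the support vectors.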
K-Nearest Neighbor (K-NN)
The K-Nearest Neighbors (KNN) method is a prominent machine learning methodology for classification and
regression problems. It is based on the assumption that similar data points have similar labels or
values. During the training phase, the KNN algorithm simply stores the complete training dataset as a reference.
When making predictions, it estimates the distance between the input data point and all of the training instances
using a distance metric such as Euclidean distance. The method then identifies the K nearest neighbors of the
input data point based on their distances. In the case of classification, the algorithm uses the most prevalent class
label among the K neighbors to predict the label for the input data point. Regression uses the average or
weighted average of the target values of the K neighbors to forecast the value of the input data point[12].
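The prediction step can be sketched in a few lines. This minimal example uses Euclidean distance, majority voting for classification, and an unweighted mean for regression; the function names, the default `k=3`, and the toy data are assumptions for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def nearest_targets(train_X, train_y, query, k):
    """Return the labels or values of the k training points closest to query."""
    ranked = sorted(
        (euclidean(x, query), target) for x, target in zip(train_X, train_y)
    )
    return [target for _, target in ranked[:k]]

def knn_classify(train_X, train_y, query, k=3):
    """Predict the most common label among the k nearest neighbors."""
    neighbors = nearest_targets(train_X, train_y, query, k)
    return Counter(neighbors).most_common(1)[0][0]

def knn_regress(train_X, train_y, query, k=3):
    """Predict the unweighted mean target of the k nearest neighbors."""
    neighbors = nearest_targets(train_X, train_y, query, k)
    return sum(neighbors) / len(neighbors)

# Hypothetical 2-D feature vectors, invented for illustration only.
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = ["negative", "negative", "negative", "positive", "positive", "positive"]
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

label_near_origin = knn_classify(train_X, labels, [0.5, 0.5])
label_far = knn_classify(train_X, labels, [5.5, 5.5])
value_at_origin = knn_regress(train_X, values, [0, 0])
```

Because every prediction scans the whole training set, the cost of a query grows with the number of stored examples, which is the trade-off for having no explicit training step.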
Linear Discriminant Analysis (LDA)
As the name implies, LDA is a linear framework for classification and dimensionality reduction. It is a statistical approach that divides data into categories by detecting patterns in the features that differentiate between classes. LDA seeks a straight line or plane that best divides the groups while minimizing the overlap between classes. By increasing the separation between classes, it allows new data points to be classified accurately. Simply put, LDA helps make sense of data by determining the most effective way to split the groups, which aids tasks such as pattern detection and classification.
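For two classes, the separating direction can be computed in closed form as Fisher's linear discriminant. The sketch below does this for 2-D points in plain Python; it assumes equal class priors and a shared within-class scatter, and the data and function names are hypothetical.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, mu):
    """2x2 scatter matrix: sum of outer products of the centered points."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return [[sxx, sxy], [sxy, syy]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def fit_lda(class0, class1):
    """Return the projection direction w and the decision threshold.
    w = S_W^{-1} (m1 - m0); the threshold is the midpoint of the projected means."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold

def lda_predict(w, threshold, x):
    """Project x onto w and compare against the threshold."""
    return 1 if dot(w, x) > threshold else 0

# Hypothetical 2-D points for two classes, invented for illustration only.
class0 = [[1, 2], [2, 1], [2, 3], [3, 2]]
class1 = [[6, 7], [7, 6], [7, 8], [8, 7]]

w, threshold = fit_lda(class0, class1)
```

Projecting onto `w` reduces each point to a single number, which is the dimensionality-reduction side of LDA; comparing that number with the threshold is the classification side.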
Database: We used Sentiment140, an automatically labelled database of tweets, together with the Facebook comments sentiment analysis database [13]. We combined the data from these two databases and fed it into our model.
In this research, strategies for text cleaning, polarity calculation, and sentiment classification are devised and optimized using two distinct sentiment analysis approaches: lexical and machine-learning-based methodologies. We then compare the results of the various approaches, including their output and prediction accuracy. Machine-learning-based techniques require labelled tweets, but manually annotating a significant amount of data typically takes too long. As a result, 8000 tweets/comments are picked at random in this study, with an average of roughly 1000 tweets per sentiment category.
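The text-cleaning and lexical polarity steps can be illustrated with a small sketch. The cleaning rules, the six-word lexicon, and its scores are hypothetical stand-ins chosen for the example; a real study would rely on a published sentiment lexicon and a cleaning procedure tuned to its data.

```python
import re

# Hypothetical mini-lexicon with assumed scores, for illustration only.
LEXICON = {
    "good": 1, "great": 2, "love": 2,
    "bad": -1, "terrible": -2, "hate": -2,
}

def clean_text(text):
    """Lowercase the text, then strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove links
    text = re.sub(r"@\w+", " ", text)           # remove user mentions
    text = re.sub(r"[^a-z\s]", " ", text)       # drop digits, punctuation, '#'
    return " ".join(text.split())               # collapse whitespace

def polarity(text):
    """Sum the lexicon scores of the cleaned tokens and map the total to a label."""
    score = sum(LEXICON.get(token, 0) for token in clean_text(text).split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

cleaned = clean_text("@bob I LOVE this!! https://t.co/xyz #great")
```

A lexical scorer of this kind needs no labelled data, which is its main advantage over the machine-learning-based approach, but it cannot handle negation or sarcasm without further rules.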
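Precision is the proportion of observations predicted as positive that are actually positive. With TP, FP, and FN denoting the numbers of true positives, false positives, and false negatives, it is computed as:
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (2)$$
Recall is the proportion of genuine positive observations that are correctly identified, computed using:
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (3)$$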
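The F1 Score is the harmonic mean of precision and recall, balancing the two values. It can be computed as follows:
$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4)$$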
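The three metrics of this section can be computed directly from predicted and true labels. The sketch below uses the standard definitions in terms of true positives, false positives, and false negatives for one chosen class; the label lists are invented for illustration and are not results from this study.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true positives, false positives, and false negatives
    for the class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, fn

def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F1); a metric with a zero denominator is 0.0."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical labels, invented for illustration only.
y_true = ["pos", "pos", "pos", "neg", "neg", "neu"]
y_pred = ["pos", "pos", "neg", "pos", "neg", "neu"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="pos")
```

With more than two sentiment categories, the same function can be called once per category, treating that category as the positive class and all others as negative.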
and Science, 2014, pp. 899–904.
[7] M. Soleymani, D. Garcia, B. Jou, B. Schuller, S.-F. Chang, and M. Pantic, “A survey of multimodal sentiment analysis,” Image Vis. Comput., vol. 65, pp. 3–14, 2017.
[8] Y. Kim, “Convolutional neural networks for sentence classification,” arXiv preprint, 2014.
[10] S. Zahoor and R. Rohilla, “Twitter sentiment analysis using lexical or rule based approach: a case study,” in 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), 2020, pp. 537–542.
[11] O. Adwan, M. Al-Tawil, A. Huneiti, R. Shahin, A. A. Zayed, and R. Al-Dibsi, “Twitter sentiment analysis approaches: A survey,” Int. J. Emerg. Technol. Learn., vol. 15, no. 15, pp. 79–93, 2020.
[13] “Facebook comments Sentiment analysis,” [Online]. Available: https://www.kaggle.com/code/mortena/facebook-comments-sentiment-analysis