Use of Supervised Machine Learning Classifiers for Online Fake Review Detection
Maysara Mazin Badr Alsaad1*, Prof. Dr. Hiren Joshi2
1 PhD Research Scholar, Department of Computer Science, Rollwala Computer Centre, Gujarat University, Navarangpura, Ahmedabad 380009, Gujarat, India. [email protected]
2 Professor, Department of Computer Science, Rollwala Computer Centre, Gujarat University, Navarangpura, Ahmedabad 380009, Gujarat, India. [email protected]
Abstract
Social media and e-commerce sites have prompted online communities to use reviews to give feedback on goods, products, and services; such reviews help people analyze customer opinions when making buying choices and help corporations improve product quality. Online shoppers can thereby boost or damage the reputation of competing brands. However, the spread of fake reviews misleads readers, making such reviews a worrying problem. This study proposes a supervised learning method for detecting fraudulent reviews in online textual content. The work separates fake reviews from honest reviews using machine learning classifiers. Experimental findings are compared across standard assessment measures, and the proposed system's performance is compared with a baseline. The research concludes that supervised machine learning techniques can be useful in identifying fraudulent reviews, but their effectiveness depends largely on the features that are chosen. The examination of various feature extraction and selection techniques shows that, in terms of accuracy, AUC, and other performance metrics, the SVM classifier with N-Gram feature extraction and CV feature selection outperforms the other classifiers and feature selection techniques. According to the research, N-Gram feature extraction and CV feature selection may be helpful in spotting fraudulent reviews on e-commerce platforms, which would assist customers in making wise selections and increase the reliability of online reviews.
Keywords: Social Media, E-commerce, Fake Reviews, Spam Detection, Machine Learning, ML Classification, Naïve Bayes, SVM.
are used in testing for optimizing results. Pre-prepared vectors are used to increase knowledge for future training. Mewada et al. (2022) presented a Rating and Review Processing Method that computes an overall spam score marking a review as helpful or unhelpful based on its ratings. The best findings rely on Amazon product review data collected with Python scraping tools. Further studies are introduced below. Jáñez-Martino et al. (2022) group content into spam and ham for crossover arrangement in opinion spam detection. A weighting scheme prioritizes spam. Performance metrics such as Accuracy, Precision, and Recall analyze the model's performance. The shortcomings of the approach are a limited feature set and manual feature selection.

Aljabri et al. (2023) used several supervised learning classification systems to identify spam reviews. Their system reached 85.63% accuracy. The research may be expanded by studying unsupervised and semi-supervised machine learning methods. Li, J., et al. (2023) suggested an unsupervised online spam recognition approach that frames detection as a density-based anomaly estimation problem.

The suggested research comprises several phases: (1) evaluating aspect rating counts, (2) an aspect-rating-dependent local outlier factor technique, and (3) viewpoint positioning for spam reviews (a minimal sketch of this density-based idea is given at the end of this review). The results showed the model is persuasive and outperforms current methods. Kaddoura et al. (2023) developed a spam detection algorithm using chosen features to handle Arabic social media material. The proposed system employs a novel supervised method specific to the Arabic language. Interesting findings with 91.73% precision are achieved on imbalanced datasets. Li et al. (2023) introduced a CNN.

Qayyum et al. (2023) study text representation for fake review recognition. The authors experimented on a dataset using two categorization methods, analyzed through feature aggregations of Decision Tree and neural network algorithms. The implementations demonstrate that sentence-weighted neural networks are more useful than other network-dependent approaches. The position of each phrase may be computed to adjust its weight using a memory network-based approach. Bali proposes an ML and N-gram study (Bali et al., 2019) for fake news estimation. Random Forest and Naïve Bayes classifiers are applied to the extracted features. With 92% accuracy, the outcome is excellent. This method also works on other recent datasets. Active learning was used by Yan et al. (2023) to identify misleading and honest feedback. Only authorized users can review. Positive, negative, and neutral reviews are categorized utilizing Natural Language Processing and text mining approaches.

Martis et al. (2023) categorize spectators' film reviews with J48 classifiers. TP, TN, and Accuracy are compared between the J48 and Random Forest algorithms. Eshtehardian et al. (2022) suggested a continual fake detection method, which surveys and gathers spammers
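As referenced above, the density-based idea attributed to Li, J., et al. (2023) can be illustrated with a brief sketch. This is not the authors' implementation; it only shows, on assumed toy aspect-rating vectors, how a local outlier factor could flag reviews whose rating patterns deviate from their neighbours.

```python
# Illustrative sketch only: a local-outlier-factor pass over per-aspect rating
# vectors, loosely following the density-based idea described above.
# The data and parameters are assumptions, not the cited authors' code.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Each row: one review's ratings for a few aspects (assumed toy data).
aspect_ratings = np.array([
    [5, 5, 4],   # typical reviews
    [4, 5, 5],
    [4, 4, 4],
    [5, 4, 5],
    [1, 5, 1],   # inconsistent rating pattern -> candidate spam
])

# LocalOutlierFactor compares each point's local density with that of its
# neighbours; fit_predict returns -1 for outliers and 1 for inliers.
lof = LocalOutlierFactor(n_neighbors=3, contamination=0.2)
labels = lof.fit_predict(aspect_ratings)

for i, label in enumerate(labels):
    status = "candidate spam" if label == -1 else "normal"
    print(f"review {i}: {status} (LOF score {-lof.negative_outlier_factor_[i]:.2f})")
```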
Table 7 (excerpt). Results with Count Vector (CV) feature selection and vocabulary size n = 100K; the last six columns give Precision, Recall, and F-score for each of the two classes.

Features | Classifier | Learning Time (s) | Accuracy (%) | AUC (%) | Type I Error (%) | Type II Error (%) | Precision (%) | Recall (%) | F-score (%) | Precision (%) | Recall (%) | F-score (%)
(TF-IDF) | SVM | 858.00 | 84 | 84.05 | 8.38 | 7.57 | 85 | 83 | 84 | 83 | 85 | 84
(TF-IDF) | L.R | 0.29 | 84 | 84.07 | 8.35 | 7.58 | 85 | 83 | 84 | 83 | 85 | 84
(TF-IDF) | D.T | 10.10 | 74 | 73.91 | 13.06 | 13.03 | 74 | 74 | 74 | 74 | 74 | 74
(TF-IDF) | A.B | 53.60 | 81 | 81.28 | 9.69 | 9.03 | 82 | 81 | 81 | 81 | 82 | 81
(TF-IDF) | N.B | 0.04 | 85 | 85.37 | 6.74 | 7.88 | 85 | 87 | 86 | 86 | 84 | 85
(TF-IDF) | R.F | 348.00 | 84 | 84.28 | 8.57 | 7.15 | 85 | 83 | 84 | 83 | 86 | 84
(N-Gram) | SVM | 27931.00 | 87 | 86.51 | 6.75 | 6.74 | 87 | 87 | 87 | 86 | 86 | 86
(N-Gram) | L.R | 0.56 | 87 | 86.98 | 6.49 | 6.53 | 87 | 87 | 87 | 87 | 87 | 87
(N-Gram) | D.T | 18.60 | 76 | 75.95 | 12.97 | 11.07 | 77 | 74 | 76 | 75 | 78 | 76
(N-Gram) | A.B | 56.20 | 84 | 84.07 | 8.14 | 7.79 | 84 | 84 | 84 | 84 | 84 | 84
This table shows the results of several machine learning models trained to classify reviews as either truthful (Class == 1) or fake (Class == 0). The models were trained on feature vectors extracted using different methods, namely Word Vectors (WV) and Count Vectors (CV), with different vocabulary sizes (n = 20K and n = 100K), and using different classifiers (NB, RF, SVM, LR, DT, and AdaBoost). The test results report several performance metrics for each combination of feature extraction method, classifier, and vocabulary size. These metrics include:
- Learning Time (s): the time it took to train the model.
- Accuracy (%): the percentage of correctly classified reviews.
- AUC (%): the area under the Receiver Operating Characteristic (ROC) curve, which measures the trade-off between the True Positive Rate (TPR) and the False Positive Rate (FPR) of the classifier.
- Type I Error (%): the percentage of falsely categorized fake reviews.
- Type II Error (%): the percentage of falsely categorized truthful reviews.
- Precision (%): the percentage of correctly categorized reviews among all reviews categorized as truthful (Class == 1).
- Recall (%): the percentage of correctly categorized reviews among all truthful reviews (Class == 1).
- F-score (%): the harmonic mean of precision and recall, which balances the trade-off between them.
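As a concrete illustration of how these metrics map onto a binary prediction, the following sketch computes them with scikit-learn; y_true, y_pred, and y_score are assumed toy values, not the study's data.

```python
# Illustrative computation of the reported metrics with scikit-learn.
# y_true/y_pred/y_score are assumed toy values, not the study's data.
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

y_true  = [1, 0, 1, 1, 0, 1, 0, 0]                   # 1 = truthful, 0 = fake
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]                   # hard labels from a classifier
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3]   # predicted probability of class 1

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print("Accuracy (%):     ", 100 * accuracy_score(y_true, y_pred))
print("AUC (%):          ", 100 * roc_auc_score(y_true, y_score))
print("Type I error (%): ", 100 * fp / (fp + tn))   # fake reviews labelled truthful
print("Type II error (%):", 100 * fn / (fn + tp))   # truthful reviews labelled fake
print("Precision (%):    ", 100 * precision_score(y_true, y_pred))
print("Recall (%):       ", 100 * recall_score(y_true, y_pred))
print("F-score (%):      ", 100 * f1_score(y_true, y_pred))
```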
The results also showed that the best-performing models achieved an accuracy of around 87% and an AUC of around 87%, indicating that they are able to distinguish between truthful and fake reviews with a high degree of accuracy. The SVM and Logistic Regression classifiers generally performed better than the other classifiers, while the Naive Bayes classifier had the shortest learning time. The N-Gram method generally performed better than the Word Vector method. Increasing the vocabulary size from 20K to 100K generally improved performance. However, the Decision Tree classifier generally performed poorly, with low accuracy and high type I and type II error rates.

The results in Table 7 above show that the SVM and Logistic Regression classifiers generally performed the best across most combinations of feature extraction methods and vocabulary size, achieving accuracy and AUC scores of up to 87%. These classifiers are known for their ability to handle high-dimensional data and to learn complex decision boundaries, which may explain their superior performance in this task. The Naive Bayes classifier generally had the shortest learning time, but its performance was slightly lower than that of the SVM and Logistic Regression classifiers. Naive Bayes is a simple but effective probabilistic classifier that assumes independence between features, but this assumption may not hold in high-dimensional feature spaces, which could limit its performance. The Decision Tree classifier generally performed poorly compared to the other classifiers, with lower accuracy and higher type I and type II error rates. Decision trees are simple and interpretable models that recursively split the feature space into regions based on thresholds, but they may suffer from overfitting and instability, especially in high-dimensional spaces. The N-Gram method generally performed better than the Word Vector method, which may be attributed to its ability to capture local dependencies between words in the text, which are important for detecting patterns of deception and sentiment. However, the Word Vector method may be more suitable for capturing global semantic relationships between words and phrases. Increasing the vocabulary size from 20K to 100K generally improved performance, indicating that a larger vocabulary can capture more fine-grained distinctions between words and phrases, thus enhancing the accuracy of the model. However, this comes at the cost of increased computational complexity and memory requirements. Overall, the results suggest that machine learning models can be effective at detecting fake reviews, but the choice of feature extraction approach, vocabulary size, and classifier can have a significant impact on performance. It is essential to rigorously assess and compare various models on a representative dataset to select the best-performing one for a given task.
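A minimal sketch of the kind of comparison summarised above is given below. It assumes a CSV file reviews.csv with 'review' and 'label' (0/1) columns and standard scikit-learn components; the parameter choices (word uni/bi-grams, a 100K vocabulary cap) only mirror the settings discussed and do not reproduce the exact experiments.

```python
# Minimal sketch of comparing classifiers over count-vector n-gram features.
# The file reviews.csv and its 'review'/'label' columns are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("reviews.csv")          # assumed file: columns 'review', 'label'
X_train, X_test, y_train, y_test = train_test_split(
    df["review"], df["label"], test_size=0.2, random_state=42)

# Count vectors over word uni/bi-grams, capped at a 100K vocabulary.
vectorizer = CountVectorizer(ngram_range=(1, 2), max_features=100_000)
Xtr = vectorizer.fit_transform(X_train)
Xte = vectorizer.transform(X_test)

models = {
    "SVM": LinearSVC(),
    "LR": LogisticRegression(max_iter=1000),
    "NB": MultinomialNB(),
    "DT": DecisionTreeClassifier(),
}

for name, model in models.items():
    model.fit(Xtr, y_train)
    pred = model.predict(Xte)
    print(f"{name}: accuracy = {accuracy_score(y_test, pred):.3f}")
```

Swapping the vectorizer (for example, a TF-IDF or word-vector representation, or a different max_features value) is one way to explore, in spirit, the feature-extraction and vocabulary-size comparison reported in Table 7.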
results, including increased accuracy, recall, precision, and F-measure. The suggested strategy outperforms baseline techniques, as shown by the acquired results.

Through our proposed work, the following list of opinion mining tasks is recommended to assist businesses and merchants in gathering and evaluating a significant volume of consumer reviews:
a) Sentiment classification, which indicates whether a viewpoint is neutral, positive (ham), or negative (spam).
b) Learning about the attributes of an entity that has been reviewed and obtaining the reviewer's viewpoint regarding a specific item.
c) Comparative language and the ability to find relationships between one item and several related objects.
d) Supervised machine-learning techniques surpassed human judgment in distinguishing between genuine and false opinions, classifying consumer opinions with the better accuracy of the two.
e) False opinions affect customers in two ways: 1) they influence them to make poor choices when making a purchase, and 2) they cause them to lose faith in online product reviews.

4. Conclusion
This work uses supervised ML techniques, such as SVM with specified parameters, to categorize content into spam and non-spam reviews. We also tested different ML classifiers and reviewed their outcomes. Different preprocessing approaches are used to reduce noise before providing text to the ML classifier. In the experimental results, SVM outperforms other ML classifiers such as XGBoost, KNN, Random Forest, Naïve Bayes, DT, and LR in fake review classification. Compared to the other classifiers, K-Nearest Neighbors has the weakest performance.
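The noise-reduction step mentioned in the conclusion can be illustrated with a small, assumed cleaning routine; the exact preprocessing rules used in this work are not reproduced here, so the steps below (lower-casing, URL and punctuation removal, stop-word filtering) are only a typical example.

```python
# Illustrative text-cleaning step before vectorisation; the specific rules are
# assumptions about typical noise reduction, not this study's exact pipeline.
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "to", "of"}

def clean_review(text: str) -> str:
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)     # drop punctuation and symbols
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)

print(clean_review("GREAT product!!! Buy now at http://example.com, the BEST :)"))
# -> "great product buy now at best"
```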
Limitations of the Work
1. The imbalanced dataset in this work led to low performance of the ML classifiers.
2. The dataset is separated into testing and training sets using a random splitting approach.
3. This work uses only TF-IDF feature engineering.
4. The small dataset size (5573) in this work impacts classifier outcomes, indicating the need to expand the dataset for better results.

Future Paths
A balanced dataset improves ML classifier performance.

Acknowledgment
The experimental work described in this research was conducted in the lab of the Department of Computer Science, Rollwala Computer Centre, Gujarat University.

References
Aljabri, M., Zagrouba, R., Shaahid, A., Alnasser, F., Saleh, A., & Alomari, D. M. (2023). Machine learning-based social media bot detection: a comprehensive literature review. Social
Pramanik, S. (2022). Utilizing Machine Learning and Deep Learning in Cybersecurity: An Innovative Approach. Wiley eBooks, 271–293. https://doi.org/10.1002/9781119795667.ch12
Khanh, P. T., Ngoc, T. T. H., & Pramanik, S. (2023). Future of smart agriculture techniques and applications. In Advances in environmental engineering and green technologies book series (pp. 365–378). https://doi.org/10.4018/978-1-6684-9231-4.ch021
Li, J., Hu, J., Zhang, P., & Yang, L. (2023). Exposing collaborative spammer groups through the review-response graph. Multimedia Tools and Applications, 82(14), 21687–21700. https://doi.org/10.1007/s11042-023-14650-4
Liu, S., & Lee, I. (2019). Extracting features with medical sentiment lexicon and position encoding for drug reviews. Health Information Science and Systems, 7(1). https://doi.org/10.1007/s13755-019-0072-6
Mandal, A., Dutta, S., & Pramanik, S. (2021). Machine intelligence of PI from geometrical figures with variable parameters using SCILab. In Advances in systems analysis, software engineering, and high performance computing book series (pp. 38–63). https://doi.org/10.4018/978-1-7998-7701-1.ch003
Martis, E., Deo, R., Rastogi, S., Chhaparia, K., & Biwalkar, A. (2023). A proposed system for understanding the consumer opinion of a product using sentiment analysis. In Advances in intelligent systems and computing (pp. 555–568). https://doi.org/10.1007/978-981-19-5443-6_42
Meslie, Y., Enbeyle, W., Pandey, B. K., Pramanik, S., Pandey, D., Dadeech, P., Belay, A., & Saini, A. K. (2021). Machine Intelligence-Based Trend Analysis of COVID-19 for total daily confirmed cases in Asia and Africa. In Advances in systems analysis, software engineering, and high performance computing book series (pp. 164–185). https://doi.org/10.4018/978-1-7998-7701-1.ch009
Mewada, A., & Dewang, R. K. (2022). A comprehensive survey of various methods in opinion spam detection. Multimedia Tools and Applications, 82(9), 13199–13239. https://doi.org/10.1007/s11042-022-13702-5
Ngoc, T. T. H., Khanh, P. T., & Pramanik, S. (2023). Smart Agriculture using a soil monitoring system. In Advances in environmental engineering and green technologies book series (pp. 200–220). https://doi.org/10.4018/978-1-6684-9231-4.ch011
Padminivalli, S. J. R. K., V., Rao, M. V. P. C. S., & Narne, N. S. R. (2023). Sentiment based emotion classification in unstructured textual data using dual susceptibility mapping. Bulletin of Engineering Geology and the Environment, 81(4). https://doi.org/10.1007/s10064-022-02615-0
Zhan, P., Qin, X., Zhang, Q., & Sun, Y. (2023). Output-Only modal identification based on auto-regressive Spectrum-Guided symplectic geometry mode decomposition. Journal of Vibration Engineering & Technologies. https://doi.org/10.1007/s42417-022-00832-1
Zhao, P., Ma, Z., Gill, T., & Ranaweera, C. (2023). Social media sentiment polarization and its impact on product adoption. Marketing Letters, 34(3), 497–512. https://doi.org/10.1007/s11002-023-09664-9