
ISSN: 1002-2082

Use of Supervised Machine Learning Classifiers for Online Fake Review Detection
Maysara Mazin Badr Alsaad¹*, Prof. Dr. Hiren Joshi²
¹ PhD Research Scholar, Department of Computer Science, Rollwala Computer Centre, Gujarat
University, Navarangpura, Ahmedabad 380009, Gujarat, India.
² Professor, Department of Computer Science, Rollwala Computer Centre,
Gujarat University, Navarangpura, Ahmedabad 380009, Gujarat, India.

Abstract
Social media and e-commerce sites have prompted online communities to use reviews to give feedback on goods, products, and services. Reviews help people analyze customer feedback when making buying choices and help corporations improve manufacturing quality. Internet shoppers can support or degrade the reputation of competing brands. However, the dissemination of fake reviews misleads people, making these reviews a worrying problem. This study proposes a supervised learning method for detecting fraudulent reviews in online textual content. The work uses machine learning classifiers to separate bogus reviews from honest ones. Experimental findings are compared using assessment measures, and the performance of the proposed system is compared to the baseline. The research concludes that supervised machine learning techniques may be useful in identifying fraudulent reviews, but how well these techniques work depends largely on the features that are chosen. An examination of various feature extraction and selection techniques shows that, in terms of accuracy, AUC, and other performance metrics, the SVM classifier with N-Gram feature extraction and CV feature selection performs better than other classifiers and feature selection techniques. According to the research, N-Gram feature extraction and CV feature selection may be helpful in spotting fraudulent reviews on e-commerce platforms. This would assist customers in making wise selections and increase the reliability of online reviews.
Keywords: Social Media, E-commerce, Fake Reviews, Spam Detection, Machine Learning, ML Classification, Naïve Bayes, SVM.

1. Introduction
During the current social media revolution, users and stakeholders rely heavily on customer reviews, because online communities, like businesses, constantly use online text content (reviews) in product sales and purchases and in reviewing client comments. The writing of phony social media reviews to promote or demote competitors' items is on the rise (Pramanik et al., 2021). The problem dupes online and in-store buyers. Thus, designing a system that detects and sorts material into bogus (spam) and genuine (ham) reviews to help online communities make smart

49 | International Conference 2024 (8th March 2024) Open Access Article



purchases and analyze client feedback quickly is very important. Previous fake review detection studies used supervised ML and lexicon-based methods (Padminivalli et al., 2023) and utilized sentiment-based grading to detect fraudulent and actual social media reviews. However, we recommend using SVM-based supervised ML to identify actual (ham) and spam text. Our approach differs from simply applying supervised ML to a labeled dataset. The suggested supervised machine learning fake review estimation method is compared with a standard baseline and with different supervised machine learning classifiers. The research will use SVM, a supervised machine learning technique, on a standard fake review dataset for binary classification of text into spam and ham. Moreover, to boost efficiency, SVM parameters should be optimized for the suggested system classifier.

1.1 Problem Definition
Supervised identification of textual spam in bogus reviews is a growing challenge for ML algorithms because of the diversity of spamicity in text. This research tackles spam review categorization from text utilizing supervised machine learning. Given input reviews $\{sr_1, sr_2, sr_3, \ldots, sr_n\}$, the goal is to create a prediction model. The model assigns a spamicity class $P_c \in \{0, 1\}$ to each text review $sr_i$, where spam is 1 and ham is 0.

1.2 Research Queries
This study addresses the following research questions:
Query 1. How can supervised ML (Praveenkumar et al., 2023) be implemented to categorize text into spam or ham?
Query 2. How can the efficiency of supervised ML in classifying text into ham and spam categories be assessed?
Query 3. How efficient is the suggested approach, compared with research on comparable tasks, for spam review classification?

1.3 Goals and Objectives
The key research goals are:
1. Using supervised ML to categorize text as spam and ham.
2. Assessing the effectiveness of supervised machine learning in classifying text into spam and ham categories.
3. Assessing the suggested method's efficacy against comparable research in identifying spam reviews.

1.4 Research Contributions
Contributions of this research include:
1. Text classification into fake and genuine by employing the supervised machine learning method.
2. Assessing the efficiency of the suggested system in classifying text into fake and genuine groups.
3. Evaluation of the suggested approach against current methods for spam estimation.

1.5 Literature Survey
Butt et al. (2022) used LSTM and RNN deep learning to classify spam. RNN spam categorization, SMS spam, and Twitter data


are used in testing for optimizing results. Pre-prepared vectors are used to increase future preparation knowledge. Mewada et al. (2022) presented a Rating and Review Processing Method that identifies the general spam score of a review as supportive or unaccommodating based on review ratings. The best findings rely on product survey data scraped from Amazon using Python scraping tools. Different studies are to be introduced later. Jáñez-Martino et al. (2022) group material for crossover arranging as spam and ham in opinion spam detection. A weighting scheme prioritizes spam. Performance metrics such as Accuracy, Precision, and Recall are used to analyze the model's performance. The shortcomings of the approach are a limited set of skills and manual feature selection.
Aljabri et al. (2023) used several supervised learning classification systems to identify spam reviews. Their system reached 85.63% accuracy. The research may be expanded by studying unsupervised and semi-supervised machine learning methods. Li, J., et al. (2023) suggested an unsupervised approach to online spam recognition, which treats the task as a density-dependent anomaly estimation problem.
The suggested research comprises several phases: (1) evaluating the viewpoint rating count, (2) an aspect-rating-dependent local outlier factor technique, and (3) viewpoint positioning for spam reviews. The results proved the model is persuasive and outperforms current methods. Kaddoura et al. (2023) developed a spam detection algorithm using chosen features to handle Arabic social media material. The suggested system employs a novel supervised method tailored mostly to the Arabic language. Interesting findings, with 91.73% precision on imbalanced datasets, are reported. Li et al. (2023) introduced a CNN.
Qayyum et al. (2023) studied text representation for fake review recognition. The authors experimented with a dataset using 2 categorization methods. This was analyzed using feature aggregations of Decision Tree and neural network algorithms. Implementations demonstrate that sentence-weighted neural networks are more useful than other network-dependent approaches. The location may be calculated to prolong the job weight of each phrase using a memory network-based approach. Bali et al. (2019) propose an ML and N-gram study for fake news estimation. Random Forest and Naïve Bayes classifiers are utilized in feature extraction. With 92% accuracy, the outcome is excellent. This method works on other recent datasets. Active learning was used by Yan et al. (2023) to identify misleading and honest feedback. Only authorized users can review. Positive, negative, and neutral reviews are categorized utilizing Natural Language Processing and text mining approaches.
Martis et al. (2023) use J48 classifiers to categorize film reviews written by spectators. TP, TN, and Accuracy are contrasted between the J48 and Random Forest algorithms. Eshtehardian et al. (2022) suggested a continual fake detection method, which surveys, and gathers spammers


simultaneously by utilizing a graph model of an Amazon dataset. Results from experiments show the suggested technique beats all baseline methods in Accuracy. Enhanced outcomes were attained with every social and semantic highlight. This work may be expanded to include all features employing network structure and an innovative iterative technique. Building a domain-dependent polarity vocabulary was addressed by Liu & Lee (2019), who suggested deriving the vocabulary from reviews utilizing SWN (SentiWordNet 3.0), sufficient word count, and regular updates. A CNN model was presented by Balim & Özkan (2023) for coordinating item-based review features using a word creation approach. The Naïve Bayes classification algorithm is utilized for classification, and the CNN model predicts better than the SVM-dependent approach. Future predictions may become more accurate if other reviews are examined.
Zhan et al. (2023) offer the Near Point Auto-Regressive (NPAR) method for analyzing authentic product reviews. The system effectively distinguishes spam opinions, as shown by testing findings. This study may be expanded to find time series abnormalities.
Srinivasarao & Sharaff (2023) developed an automatic system for email feature classification to identify and classify spam. The first phase involves developing and extracting characteristics from an email corpus using automated transformation and aggregation methods. The system allows for automatic feature engineering and spam categorization using scalable algorithms. Further research may optimize the framework for more effective feature classification operations.
Javed et al. (2021) investigated Yelp's filtering approach to identify fraudulent reviews on a company website. The researchers utilized a supervised method trained on Yelp's filtered reviews. The findings demonstrate that Yelp's screening is reliable. Kathuria et al. (2023) use SentiWordNet, NLTK, and word count tools to identify review spam based on consistency. The suggested technique identified spam effectively.
Gawlikowski et al. (2023) suggested a deeper understanding of incorrect views. Experiments are conducted across and through areas to explore a broader classifier. Compared to SVM, the SAGE classifier yields superior results. A machine learning method was presented by Duma et al. (2023) to identify bogus reviews. The authors focused on supervised machine learning methods. Upcoming component engineering methodologies may identify useful aspects for online false reviews by using varied datasets.
Zhao et al. (2023) suggested classifying consumer opinions as favorable or negative to aid in making informed purchases. They made the choice by assessing spam review behavior. The authors presented a successful method that displays ordered emotion shown as “Chernoff’s face”.
Yao et al. (2022) suggested the PU-learning strategy to identify misleading beliefs. The PU-learning approach employs Random Forest and Decision Tree classifiers


as learning techniques. Using a one-class Decision Tree classifier is appropriate for training with few misleading beliefs. This work may be expanded by including PU-learning and self-training methods.
Ghanem & Erbay (2022) suggested a solution to reduce spam comments utilizing machine learning approaches and subject identification. The technique used the standard dataset and three classification approaches to analyze the presentation. The decision trees classifier yields the most effective spam detection system implementation.
The remaining article is organized as: Section 2 methodology; Section 3 presents results and discussion, then a conclusion and further work in Section 4.

2. Proposed Method
The proposed spam detection system includes modules for dataset collecting, noise reduction, supervised learning, and performance evaluation. Figure 1 depicts the suggested system block diagram.

2.1 Data Gathering
The benchmark source¹ provides the fake review dataset. Table 1 describes the gathered dataset. The dataset is divided into 20% for testing and 80% for training. Out of 6384 reviews, 474 are spam and 4825 are real (Pramanik, 2023).

Table 1. Statistics of dataset
Dataset | Detailing | No. of Reviews | Review Counts in Labeled Categories: SPAM | NON-SPAM
F1 | Fake Reviews | 8836 | 1299 (13.7%) | 7537 (86.3%)

Training Data: As previously noted, 80% of the dataset is used for training. Table 2 shows specimens from the training dataset.

Table 2. Training data specimen records
Review No. | Reviews | Label
A | OK see to know better | Ham
B | Are you perfectly fit? Know on 12th July | Spam

Testing Data: This method divides the dataset into training and testing portions (Jayasingh et al., 2022) by randomly selecting a fraction. A simple split divides the dataset into 2 constituents: training and testing. To support cross-validation, the data is divided into 3 parts of different sizes: training, validation, and testing. In the training step the model is trained, and its performance is then assessed on the validation part to guard against overtraining. If the outcome is positive, the training dataset is applied to the validation point. Specimen testing dataset entries (Chellam et al., 2023) are in Table 3.
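The random 80/20 partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_reviews`, the fixed seed, and the toy reviews are hypothetical and are not taken from the study's implementation; only the 20% test fraction comes from the text.

```python
import random


def split_reviews(reviews, labels, test_fraction=0.2, seed=0):
    """Randomly partition parallel lists into training and testing parts.

    test_fraction=0.2 mirrors the 80/20 split stated in the text; the seed
    and the function name are illustrative assumptions.
    """
    if len(reviews) != len(labels):
        raise ValueError("reviews and labels must have the same length")
    n = len(reviews)
    test_size = round(n * test_fraction)
    # Draw the test indices without replacement; the rest form the training set.
    test_idx = set(random.Random(seed).sample(range(n), test_size))
    train_x, train_y, test_x, test_y = [], [], [], []
    for i, (text, label) in enumerate(zip(reviews, labels)):
        if i in test_idx:
            test_x.append(text)
            test_y.append(label)
        else:
            train_x.append(text)
            train_y.append(label)
    return train_x, train_y, test_x, test_y


# Toy data (hypothetical): ten reviews with alternating labels.
reviews = [f"review {i}" for i in range(10)]
labels = ["spam" if i % 2 else "ham" for i in range(10)]
train_x, train_y, test_x, test_y = split_reviews(reviews, labels)
print(len(train_x), len(test_x))
```

With ten toy reviews the split yields eight training items and two testing items. Fixing the seed keeps the partition reproducible between runs, which makes classifier comparisons on the same split easier to interpret.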


Table 3. Sample items from the testing dataset
Sl. No. | Reviews | Label
A | How to do you get | Ham
B | Messags Hey Honey it is 5 weeks since no word talk send 200$ to receive fund | Spam

First, we construct a CSV dataset. A sample of reviews is included in Tables 2 and 3 for training and testing purposes. Both spam and real reviews are included.
Algo 1 displays pseudo-code for partitioning the dataset into training and testing: a short summary of the steps for dividing the dataset.
Algo 1: A series of partitions for the training and testing dataset
Step1. Dataset division in testing/training:
Step2. FixTrain A = [M]
Step3. FixTrain B = [M]
Step4. FixTest A = [M]
Step5. FixTest B = [M]
Step6. FixTotalTestSize = M * 20%
       FixTotalIndex = RANDOM(0, M - 1, TotalTestSize)
Step7. for IMT = 0 to N do
Step8. SetTemporaryArray = [M]
Step9. for ELEMENTS in T-ID[IMT] do
Step10. TemporaryArray.Add(ELEMENTS)
Step11. if TotalIndex.holds(IMT), then
Step12. Test A.Append(TemporaryArray)
Step13. Test B.Append(IMTReviewList[1])
Step14. else
Step15. Train A.Append(TemporaryArray)
Step16. Train B.Add(TemporaryArray)

2.2 Noise Reduction
Various preprocessing procedures are used on the dataset. Examples are tokenization, the addition of significant symbols, and hashtag removal. Algo 2 demonstrates Python-based preprocessing operations utilizing the Colab framework.
Pre-processing stages are briefly described below:
Tokenization: The text is broken down into small fragments during the tokenization process. The Python environment tokenizes using the NLTK tokenizer (Khanh et al., 2023).
Stop Word Removal: The stop words do not affect sentiment categorization. A precompiled list of stop words is used to delete them progressively. Stop words include “a”, “the” & “is”. Algo 2 lists the tasks of the preprocessing implementation.
Text preprocessing in fake review detection refers to the application of various techniques on raw review text to convert it into a suitable format for further analysis. The following steps are applied in review text preprocessing:
- Removing Punctuation and Special Characters: Punctuation marks, such as commas, periods, and quotes, along with special characters, do not hold significant meaning in text data, and they are eliminated to decrease the data's dimensionality.
- Tokenization: It is the process of segmenting the text into individual words, or tokens, essential for most natural language processing (NLP) tasks (Ngoc et al., 2023). We used word-based tokenization for splitting the review text into separate words.
- Stop Word Removal: Stop words are frequently used words, like "the," "and," and "a" that do not add much value to the review text content. Removing these words can reduce the data's dimensionality and improve the performance of machine learning algorithms.
- Stemming: Techniques like stemming and lemmatization are used to reduce words to their base or root forms. Stemming involves removing the suffixes from words to create a stem, while lemmatization uses a vocabulary and morphological analysis to transform words to their base forms.
- Lowercasing letters: It converts all capital letters in the review text into small letters.

2.3 Supervised ML Classifier Implementation
The spam detection framework uses SVM, a supervised machine learning approach, to categorize text as spam or non-spam (Hossain et al., 2023). Several researchers have used supervised ML to classify spam based on bogus reviews. This study examines current research.
Algo 2: A series of pre-processing execution stages
Step1. PreProcessing
Step2. FixComments = [M]
Step3. FixDictionaries = [M]
Step4. FixIteratedRow = 0
Step5. for IteratedRow to N.Count - 1 do
Step6. FixCWords = divide(Para[IteratedRow], “ ”)
Step7. FixListOfComments = [M]
Step8. FixListOfPunctuations: . , : ;
Step9. FixStopWords: "the", "an", "is", "are"
Step10. for each CHARS in CWords do
Step11. if CHARS AT LAST IS A PUNCTUATION
Step12. remove PUNCTUATION
Step13. else
Step14. if CHARS DOES NOT CONSTITUTE A STOP WORD
Step15. CommentsList.APPEND(CHARS)
Step16. if a dictionary does not include CHARS
Step17. Dictionaries[CHARS] = 0

Table 4. Sample reviews before and after preprocessing
Review | Prior to Cleaning | Following Cleaning: Tokenization | Following Cleaning: Removing stop words
A | OK bro see to know better | OK bro see to know better (Adjective) (Noun) (Verb) | Bro see to know better
B | Your free gift is delivered Text the code to 954402 for verifying | Your free gift is delivered Text the code to 954402 for verifying (Pronoun) (Adjective) (Noun) (Verb) (Adverb) | Free gift Text the code to 954402 verifying

The effectiveness of the suggested framework is assessed by comparing implemented outcomes to standard approaches. We employ an SVM classifier with customized parameters on the benchmark spam dataset, whereas other researchers utilized supervised machine learning approaches for fake review categorization. The SVM classifier yields more accurate results when categorizing textual material (Table 4).

Figure 1. Structure of suggested spam detection framework

3. Results and Discussion
In Figure 1, the model is trained with labeled textual reviews/tweets with spam and ham classes (Pramanik & Bandyopadhyay, 2022). Next, the test module evaluates the trained model. To address overfitting, model validation is conducted.

Feature Engineering
A crucial step in supervised machine learning is feature engineering. Methods for feature representation in the suggested research include:
i) TF ii) TF and IDF iii) FV
where TF = term frequency, IDF = inverse document frequency, and FV = feature vector.
1. Term Frequency: It counts the occurrences of a term in a given review.
2. TF and IDF: Together they are used on the dataset to determine the relevance of words.
3. FV: It turns the input review into a token count matrix.

SVM: The classifier handles both linear and nonlinear cases of supervised ML classification. The Support Vector Machine classifies the data into categories by finding the hyperplane that splits the data into groups. The primary notion of SVM in sentiment classification is determining the hyperplane (Dushyant et al., 2022) that divides the collection. The statistical description of SVM is:
$$P = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} \tag{1}$$
where $P$ is the review dataset, $x_i$ is a review, and $y_i$ is the value that shows whether the item belongs to a class.

3.1 Different ML Classifiers
Besides SVM, we tested various ML classifiers such as Logistic Regression (Kaushik et al., 2022), Naive Bayes


(Bhattacharya et al., 2021), Random Forest (Mandal et al., 2021), KNN (Meslie et al., 2021), XGBoost, and Decision Tree. They are explained below:

Logistic Regression (LR): The LR classifier categorizes reviews into polarity classes using testing and training datasets. Logistic Regression is a fast predictive classifier that avoids overfitting and generalizes well. The LR performs better on a fresh dataset. The Logistic Regression equation is:
$$y = \alpha_0 + \alpha_1 y_1 + \alpha_2 y_2 \tag{2}$$
where $\alpha_0$ is the constant term, while the remaining terms define the decision boundary.

Naive Bayes (NB): It is a supervised machine learning classifier used in classification and regression. NB classifiers are based on Bayes' theorem and belong to the family of probabilistic classifiers. Applying Naive Bayes classifiers to large or small datasets yields generally satisfactory results. More input features provide better results for the Naive Bayes classifier. In mathematical terms, the equation is:
$$p(a \mid b) = \frac{p(b \mid a)\, p(a)}{p(b)} \tag{3}$$

Random Forest (RF): It is more flexible than others based on hyperparameter adjustment. The categorization findings are mostly relevant and efficient. The RF classifier is often utilized in classification and regression problems. Every RF is made up of decision trees. The RF classifier having various decision trees yields accurate generalizations. The mathematical equation is:
$$x = \frac{1}{y} \sum_{y=1}^{z} y_x(z') \tag{4}$$
where $x$ denotes the number of replacement examples, $y_x$ is the training tree classification, and $y$ is an occurrence of training from $x$, $y$.

K-Nearest Neighbour (KNN): KNN solves regression and classification problems. The K-Nearest Neighbour classifier is often utilized for large-scale industrial classification. It follows instance-based learning. It is a lazy learner: training stores the training examples in an N-dimensional space. The KNN classifies new instances using a majority vote of the K nearest neighbours.

XGBoost: It is built on the Gradient Boosting framework. It yields good outcomes in distributed environments like Hadoop, SGE, and MapReduce.

Decision Trees (DT): DT classifiers are often utilized in regression, classification, and other problems. They use supervised machine learning. The nodes indicate features for categorization, whereas the branches represent feature values. The categorization process begins by sorting attributes from the base nodes. Divide-and-conquer strategies are utilized to build the tree structure. The equation is:
$$(x, y) = (b_1, b_2, b_3, \ldots, b_g, y) \tag{5}$$
The subset is indicated by $y$, base nodes are $x$, and tree leaves are indicated by $g$.

3.2 Working Procedure of the Suggested Approach


The supervised learning method classifies fake and real reviews by inputting reviews, preprocessing them, and using the SVM classifier to categorize them as spam (SPAM) or legitimate (HAM). The dataset is split into training (80%) and testing (20%) constituents. Labeled data is sent to the classifier during training. After training, the ML classifiers are verified on the remaining testing data. The outcome is assessed using precision, F1-measure, accuracy, and recall. Steps for the recommended technique are outlined in Algo 3.

3.3 Contrasting Classifier Performance
Comparing the proposed SVM classifier to existing ML classifiers provides a qualitative assessment of its effectiveness in predicting false and legitimate reviews from textual content. We assess classifier performance using Precision, Recall, Accuracy, and F1-Measure (Pramanik, 2022). The comparison study in Section 4 provides an empirical assessment of the suggested SVM classifier for predicting false and authentic reviews from textual content.
After evaluating different machine learning classifiers on the dataset, we recommend the SVM classifier for the finest classification outcomes on the bogus (spam) review dataset.
Algo 3: Framework implementation
Step1: Polarity-categorized arrays in accordance with polarities
Step2: POLARITY: (“SPAM” and “HAM”)
Step3: Classifiers: [‘Support Vector Machine’, ‘K-Nearest Neighbour’, ‘Extreme Gradient Boosting’, ‘Decision Tree’, ‘Random Forest’, ‘Naïve Bayes’, ‘Logistic Regression’]
Step4: Begin
Step5: Scan SMS
Step6: Fix TextStream = Scan(Polarity)
Step7: Preprocessing
Step8: Tokens
Step9: Fix TokenStream = GetTextStream
Step10: Stop words Remove
Step11: PlainStream = SeparateStopWords(TokenStream) Punctuation
Step12: Dividing entire Data using Training or Testing
Step13: Fix TestingLength = 20%
Step14: Testing-A, Testing-B = Divide(TextStream, TestingLength)
Step15: VectorCounting(TextStream)
Step16: IDF and TF
Step17: ApplyingClassifier
Step18: ModelingClassifiers = ObtainClassifiers()
Step19: Fix ModelClassification = fit(Training-A, Training-B)
Step20: Estimations
Step21: Fix PredictionModel = ModelClassification: predicts(Test-A)
Step22: Accuracy
Step23: FixModelAccuracy = ObtainAccuracy(PredictModel, Test-A)
Step24: Confusion Matrix
Step25: FixConfusionMatrix = ObtainConfusionMatrix(Test-B, PredictModel)
Step26: Perform


Step27: ObtainMeasure(BMeasure, PreciseModel)
Step28: Return ObtainClassificationReport(Test-B, PredictModel, POLARITIES)
This section answers research questions via experimentation and analysis.

3.4 First Research Query Answer
To answer Query 1, "How to apply different ML classifiers on a dataset of spam reviews for predicting spam and genuine reviews," the SVM algorithm was applied to the dataset of fake reviews for estimating spam and ham reviews from Short Messaging Services. The Support Vector Machine technique serves regression and classification tasks. In the training phase, the data X is used to predict the target variable Y. The SVM classifier parameters are given in Table 5.

Table 5. Support Vector Machine Classifier Parameters
Parameters | Descriptions
D: float, optional (default = 1.0) | Regularization parameter; the regularization is inversely proportional to D.
Kernel: string, optional (default = radial basis function) | Kernel type; must be one of polynomial, linear, rbf, or a callable. Default value = radial basis function.
Degree: int, optional (default = 3) | Degree of the polynomial kernel.
gamma: scale or float, optional (default = scale) | Kernel coefficient for the polynomial, sigmoid, and rbf kernels.
shrinking heuristic: boolean, optional (default = true) | Whether to use the shrinking heuristic.
probability estimation: boolean, optional (default = false) | Enables probability estimates.
Cache size: float, optional | Sets the kernel cache size.

3.5 Second Research Query Answer
For Query 2, “How to evaluate the efficiency of different ML classifiers to predict spam and genuine reviews?”, several classifiers were applied to the fake review dataset. Details are provided below. The first experiment:
The experiment is conducted using a dataset of 5572 reviews, categorized as “spam and genuine”. Table 8 displays performance assessment results for various machine learning classifiers, including Random Forest, XGBoost, SVM, KNN, Decision Tree, NB, and Logistic Regression. We utilized Precision, F1-score, Accuracy, and Recall measures. The SVM classifier outperforms various machine learning classifiers in recall (99%), accuracy (98.92%), precision (99%), and F1-Score (99%).
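For reference, the four measures named above follow directly from the counts of a binary confusion matrix. The sketch below is a plain-Python illustration and not the authors' evaluation code; it assumes that spam is treated as the positive class, and the function name `binary_metrics` and the toy labels are made up for the example.

```python
def binary_metrics(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive class.
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example (hypothetical labels): 1 TP, 1 FN, 1 FP, 2 TN.
y_true = ["spam", "spam", "ham", "ham", "ham"]
y_pred = ["spam", "ham", "spam", "ham", "ham"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced spam data, accuracy alone can look high even when few spam reviews are caught, which is one reason precision, recall, and F1 are reported alongside it.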


Table 6. Various machine learning classifier experiments
Classifier | Accuracy (%) | Precision | Recall | F1-score
KNN | 86.88 | 0.88 | 0.89 | 0.87
SVM | 98.75 | 0.95 | 0.95 | 0.96
DT | 98.48 | 0.94 | 0.96 | 0.95
XGBoost | 98.32 | 0.96 | 0.97 | 0.97
LR | 97.12 | 0.96 | 0.97 | 0.98
NB | 96.78 | 0.97 | 0.95 | 0.97
RF | 97.33 | 0.97 | 0.96 | 0.96

Table 6 displays the classification performance of several classifiers on the 8836 spam reviews dataset. The SVM achieves 98.75% accuracy, while the accuracy of the K-Nearest Neighbour classifier is 86.88%. After analyzing all classifiers, SVM provides the finest classification results, whereas K-Nearest Neighbour yields the lowest results in F1-score, Recall, Precision, and Accuracy.
Table 6 data indicate that the SVM classifier yields the finest outcomes, validated by literature.
1. Most text categorization problems are linearly separable: The fake review dataset categorizes data into 2 labels, ham and spam, depending on class utilization in the training dataset. For linearly separable data, the SVM classifier yields the finest classification results.
2. High-dimensional input space: Text classifier learning requires over 10,000 features. A large feature space might cause overfitting (Chandan et al., 2022). SVM is capable of handling vast feature spaces. The dataset includes over 11000 dimensions, indicating that SVM provides the most accurate classification results.

Poor Performance Classifiers
Table 6 indicates that the K-NN classifier performs much worse than the others in classification. The KNN classifier has poor accuracy because of its reliance on a majority voting mechanism for class identification, which may not be suitable when the nearest neighbours lie at varied distances from the test data. Therefore, K-Nearest Neighbour classifiers have poor accuracy on the dataset.
Various factors affecting poor classification in ML classifiers include dataset size, class total, dataset organization, training/testing ratio, and case count.

Classifiers' Best Performances
Table 8 shows that the Support Vector Machine classifier outperforms different ML classifiers in F1-score, Accuracy, Precision, and Recall. The Support Vector Machine classifier with the parameters described in the previous section is recommended for classifying reviews as spam or authentic.

Different Classifier Cross-validation Results
This section presents comparative results from the experiments conducted in this work for fake review detection using the Amazon reviews dataset with two feature selection and extraction methods, TF-IDF (Word Vector) and Bag of Words (N-gram), both used to select various sets of


vocabulary represented in n-words extracted from the test set. Table 7 gives the detailed testing results of the ML classifiers.

Table 7. Testing results of ML classifiers
Columns: Feature Extraction | Classifier | Learning Time (S) | Accuracy (%) | AUC (%) | Type I Error (%) | Type II Error (%) | Truthful reviews (Class == 1): Precision (%) | Recall (%) | F-score (%) | Fake reviews (Class == 0): Precision (%) | Recall (%) | F-score (%)

WV n=20K (TFIDF) | N.B | 0.01 | 81.01 | 81 | 9.8 | 9.1 | 82 | 80 | 81 | 80 | 82 | 81
WV n=20K (TFIDF) | R.F | 25.60 | 80.71 | 81 | 10.3 | 8.98 | 82 | 80 | 81 | 80 | 82 | 81
WV n=20K (TFIDF) | SVM | 28.00 | 82.39 | 82 | 9.3 | 8.3 | 83 | 82 | 82 | 82 | 83 | 82
WV n=20K (TFIDF) | L.R | 0.10 | 82.55 | 83 | 9.7 | 7.75 | 84 | 81 | 82 | 81 | 84 | 83
WV n=20K (TFIDF) | D.T | 0.87 | 72.89 | 73 | 13.93 | 13.17 | 74 | 72 | 73 | 72 | 73 | 73
WV n=20K (TFIDF) | A.B | 11.90 | 79.28 | 0.79 | 11.57 | 9.15 | 81 | 77 | 79 | 78 | 81 | 80
CV n=20K (N-Gram) | N.B | 0.01 | 85.01 | 85 | 7 | 7.98 | 85 | 86 | 85 | 86 | 84 | 85
CV n=20K (N-Gram) | R.F | 29.80 | 82.98 | 83 | 9.23 | 7.78 | 84 | 82 | 83 | 82 | 84 | 83
CV n=20K (N-Gram) | SVM | 96.00 | 81.5 | 81 | 9.25 | 9.25 | 82 | 82 | 82 | 81 | 81 | 81
CV n=20K (N-Gram) | L.R | 0.11 | 84.83 | 85 | 7.62 | 7.55 | 85 | 85 | 85 | 85 | 85 | 85
CV n=20K (N-Gram) | D.T | 1.18 | 72.13 | 72 | 14.67 | 13.2 | 73 | 71 | 72 | 71 | 73 | 72
CV n=20K (N-Gram) | A.B | 12.00 | 81.91 | 82 | 9.42 | 8.67 | 83 | 81 | 82 | 81 | 82 | 82
WV n=100K (TFIDF) | N.B | 0.02 | 82.56 | 83 | 8.92 | 8.52 | 83 | 82 | 83 | 82 | 83 | 82
WV n=100K (TFIDF) | R.F | 245.00 | 82.51 | 83 | 9.12 | 8.36 | 83 | 82 | 83 | 82 | 83 | 83
WV n=100K (TFIDF) | SVM | 858.00 | 84.05 | 84 | 8.38 | 7.57 | 85 | 83 | 84 | 83 | 85 | 84
WV n=100K (TFIDF) | L.R | 0.29 | 84.07 | 84 | 8.35 | 7.58 | 85 | 83 | 84 | 83 | 85 | 84
WV n=100K (TFIDF) | D.T | 10.10 | 73.91 | 74 | 13.06 | 13.03 | 74 | 74 | 74 | 74 | 74 | 74
WV n=100K (TFIDF) | A.B | 53.60 | 81.28 | 81 | 9.69 | 9.03 | 82 | 81 | 81 | 81 | 82 | 81
CV n=100K (N-Gram) | N.B | 0.04 | 85.37 | 85 | 6.74 | 7.88 | 85 | 87 | 86 | 86 | 84 | 85
CV n=100K (N-Gram) | R.F | 348.00 | 84.28 | 84 | 8.57 | 7.15 | 85 | 83 | 84 | 83 | 86 | 84
CV n=100K (N-Gram) | SVM | 27931.00 | 86.51 | 87 | 6.75 | 6.74 | 87 | 87 | 87 | 86 | 86 | 86
CV n=100K (N-Gram) | L.R | 0.56 | 86.98 | 87 | 6.49 | 6.53 | 87 | 87 | 87 | 87 | 87 | 87
CV n=100K (N-Gram) | D.T | 18.60 | 75.95 | 76 | 12.97 | 11.07 | 77 | 74 | 76 | 75 | 78 | 76
CV n=100K (N-Gram) | A.B | 56.20 | 84.07 | 84 | 8.14 | 7.79 | 84 | 84 | 84 | 84 | 84 | 84
This table shows the results of several machine learning models trained to classify reviews as either truthful (Class == 1) or fake (Class == 0). The models were trained on feature vectors extracted using different methods, namely Word Vectors (WV) and Count Vectors (CV), with different vocabulary sizes (n=20K and n=100K), and using different classifiers (NB, RF, SVM, LR, DT, and AdaBoost). The testing results report several performance metrics for each combination of feature extraction method, classifier, and vocabulary size. These metrics include:
- Learning Time (S): the time it took to train the model.
- Accuracy (%): the percentage of correctly classified reviews.
- AUC (%): the area under the Receiver Operating Characteristic (ROC) curve that measures the trade-off between the True Positive Rate (TPR) and the False Positive Rate (FPR) of the classifier.
- Type I Error (%): the percentage of falsely categorized fake reviews.
- Type II Error (%): the percentage of falsely categorized truthful reviews.
- Precision (%): the percentage of correctly categorized reviews among all reviews categorized as truthful (Class == 1).
- Recall (%): the percentage of correctly categorized reviews among all truthful reviews (Class == 1).
- F-score (%): the harmonic mean of precision and recall, which balances the trade-off between them.

The results show that the best-performing models, Logistic Regression (86.98%) and SVM (86.51%) with Count Vectors at n=100K, achieved an accuracy of around 87% and an AUC of around 87%, indicating that they distinguish between truthful and fake reviews with a high degree of accuracy. The SVM and Logistic Regression classifiers generally performed better than the other classifiers, while the Naive Bayes classifier had the shortest learning time. The N-Gram method generally performed better than the Word Vector method. Increasing the vocabulary size from 20K to 100K generally improved performance. However, the Decision Tree classifier generally performed poorly, with low accuracy and high type I and type II error rates.
The results in Table 7 show that the SVM and Logistic Regression classifiers generally performed the best across most combinations of feature extraction method and vocabulary size, achieving accuracy and AUC scores of up to 87%. These classifiers are known for their ability to handle high-dimensional data and to learn complex decision boundaries, which may explain their superior performance in this task. The Naive Bayes classifier generally had the shortest learning time, but its performance was slightly lower than that of the SVM and Logistic Regression classifiers. Naive Bayes is a simple but effective probabilistic classifier that assumes independence between features, but this assumption may not hold in high-dimensional feature spaces, which could limit its performance. The Decision Tree classifier generally performed poorly compared to the other classifiers, with lower accuracy and higher type I and type II error rates.
Decision trees are simple and interpretable models that recursively split the feature space into regions based on thresholds, but they may suffer from overfitting and instability, especially in high-dimensional spaces. The N-Gram method generally performed better than the Word Vector method, which may be attributed to its ability to capture local dependencies between words in the text, which are important for detecting patterns of deception and sentiment. However, the Word Vector method may be more suitable for capturing global semantic relationships between words and phrases. Increasing the vocabulary size from 20K to 100K generally improved performance, indicating that a larger vocabulary can capture more fine-grained distinctions between words and phrases, thus enhancing the accuracy of the model. However, this comes at the cost of increased computational complexity and memory requirements: for example, the SVM learning time with Count Vectors rose from 96 s at n=20K to 27,931 s at n=100K. Overall, the results suggest that machine learning models can be effective at detecting fake reviews, but the choice of feature extraction approach, vocabulary size, and classifier can have a considerable impact on performance. It is essential to rigorously assess and compare various models on a representative dataset to select the best-performing one for a given task.
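The metrics reported in Table 7 can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the evaluation script used in this work. The function names are hypothetical, and the AUC is computed with the pairwise-ranking formulation, which equals the area under the ROC curve. The reading of the two error types is also an assumption: Type I error is taken as truthful reviews wrongly flagged as fake and Type II error as fake reviews passed as truthful, each as a share of all test reviews. That reading is consistent with Table 7, where the two errors sum to roughly 100 minus the accuracy (for NB with WV at n=20K, 9.8 + 9.1 = 18.9 against 100 - 81.01 = 18.99).

```python
def review_metrics(y_true, y_pred):
    """Percent metrics for the truthful class (label 1); fake reviews are label 0."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # truthful review kept as truthful
        elif truth == 1:
            fn += 1  # truthful review flagged as fake
        elif pred == 1:
            fp += 1  # fake review passed as truthful
        else:
            tn += 1  # fake review caught as fake
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        # Assumed reading of the error types, as shares of all test reviews.
        "type_i_error": 100.0 * fn / total,
        "type_ii_error": 100.0 * fp / total,
        "precision": 100.0 * precision,
        "recall": 100.0 * recall,
        "f_score": 100.0 * f_score,
    }


def auc_percent(y_true, scores):
    """AUC as the share of (truthful, fake) pairs ranked correctly; ties count half."""
    truthful = [s for t, s in zip(y_true, scores) if t == 1]
    fake = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for s_truthful in truthful:
        for s_fake in fake:
            if s_truthful > s_fake:
                wins += 1.0
            elif s_truthful == s_fake:
                wins += 0.5
    return 100.0 * wins / (len(truthful) * len(fake))


if __name__ == "__main__":
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    predictions = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]
    print(review_metrics(labels, predictions))
    print(auc_percent(labels, scores))
```

The pairwise AUC is quadratic in the number of reviews, so it suits small checks like this one. For a full test set, a sort-based implementation would be the more practical choice.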


To answer Query 3, "What is the efficiency of the proposed classifier compared to the baseline method?", the recommended classifier is compared with prior studies.

Comparing Baseline Approaches
Mani et al. (2018) used an ensemble strategy, which helped obtain a higher accuracy score. Kumar et al. (2018), on the other hand, used both univariate and multivariate distributions across user ratings. In Ban et al. (2018), a hybrid architecture of SVM with NN slightly improved the classification results, while Saeed, Rady & Gharib (2019) increased model performance by using N-gram feature extraction and negation handling.
Joni Salminen et al. (2022) created a mobile model to identify spam and ham SMS, comparing ML classifiers such as Support Vector Machine, K-Nearest Neighbors, Logistic Regression, Decision Tree, and Random Forest. In that comparison, the SVM classifier had the highest accuracy. Most recently, Sami Ben Jabeur et al. (2023) compared ML classifiers for spam email prediction. They used three ML classifiers (NB, J48, and MLP) and found that the MLP classifier yielded the highest accuracy. Table 8 compares the outcomes of these baseline approaches.

Table 8. Comparison with the results of a typical study approach (Kaddoura et al., 2022)

| Research Works | Approach | Outcomes |
|---|---|---|
| Mani et al. (2018) | NB, RF, and SVM | Accuracy: 87.68%; Precision: 89%; Recall: 85% |
| Kumar et al. (2018) | LR, K-NN, NB, RF, SVM | Accuracy: 76%; F1-Score: 79% |
| Ban et al. (2018) | SVM & NN | Precision: 85%; Recall: 84% |
| Saeed, Rady & Gharib (2019) | NB, SVM, K-NN, RF and NN | Accuracy: 95.25%; Recall: 91.75%; Precision: 98.66%; F1-Score: 95.08% |
| Joni Salminen et al. (2022) | Support Vector Machine | Accuracy: 97.35% |
| Sami Ben Jabeur et al. (2023) | SVM & NB | Accuracy: 94% |
| Our work | Support Vector Machine | Accuracy: 98.75%; F1-score: 99%; Recall: 99%; Precision: 99% |

The SVM classifier approach for fake review detection yielded encouraging


results, including increased accuracy, recall, precision, and F-measure. The suggested strategy outperforms baseline techniques, as shown by the acquired results.
Through our proposed work, the following list of opinion mining jobs is recommended to assist businesses and merchants in gathering and evaluating a significant volume of consumer reviews:
a) Sentiment classification, which indicates if a viewpoint is neutral, positive (ham), or negative (spam).
b) Learning about the attributes of an entity that has been reviewed and obtaining the reviewer's viewpoint regarding a specific item.
c) Comparative language and the ability to find relationships between one thing and several related objects.
d) Supervised machine-learning techniques surpassed human judgment in distinguishing between genuine and false opinions, classifying consumer opinions with the best accuracy between the two.
e) False opinions affect customers in two ways: 1) they influence them to make poor choices when making a purchase, and 2) they cause them to lose faith in online product reviews.

4. Conclusion
This work uses supervised ML techniques, such as SVM with specified parameters, to categorize content into spam and non-spam reviews. We also tested different ML classifiers and reviewed their outcomes. Different preprocessing approaches are used to reduce noise before providing text to the ML classifier. In the experimental results, SVM outperforms other ML classifiers such as XGBoost, KNN, Random Forest, Naïve Bayes, DT, and LR in fake review classification. Compared to the other classifiers, K-Nearest Neighbors has the weakest performance.

Limitations of work
1. The imbalanced dataset in this work led to a low performance of ML classifiers.
2. The dataset is separated into testing and training using the random splitting approach.
3. This work uses just TF-IDF feature engineering.
4. The low dataset size (5573) in this work impacts classifier outcomes, indicating the necessity to expand the dataset for enhanced outcomes.

Future Paths
A balanced dataset improves ML classifier performance.

Acknowledgment
The experimental work described in this research was conducted in the lab of the Department of Computer Science, Rollwala Computer Centre, Gujarat University.

References
Aljabri, M., Zagrouba, R., Shaahid, A., Alnasser, F., Saleh, A., & Alomari, D. M. (2023). Machine learning-based social media bot detection: a comprehensive literature review. Social


Network Analysis and Mining, 13(1). https://doi.org/10.1007/s13278-022-01020-5
Bali, A. P. S., Fernandes, M., Choubey, S., & Goel, M. (2019). Comparative performance of machine learning algorithms for fake news detection. In Communications in computer and information science (pp. 420–430). https://doi.org/10.1007/978-981-13-9942-8_40
Balim, C., & Özkan, K. (2023). Creating an AI fashioner through deep learning and computer vision. Evolving Systems. https://doi.org/10.1007/s12530-023-09498-w
Bhattacharya, A., Ghosal, A., Obaid, A. J., Krit, S., Shukla, V. K., Mandal, K., & Pramanik, S. (2021). Unsupervised Summarization Approach With Computational Statistics of Microblog Data. Advances in Systems Analysis, Software Engineering, and High Performance Computing Book Series, 23–37. https://doi.org/10.4018/978-1-7998-7701-1.ch002
Butt, U. A., Amin, R., Aldabbas, H., Mohan, S., Alouffi, B., & Ahmadian, A. (2022). Cloud-based email phishing attack using machine and deep learning algorithm. Complex & Intelligent Systems, 9(3), 3043–3070. https://doi.org/10.1007/s40747-022-00760-3
Chandan, R. R., Soni, S., Raj, A. N. J., Veeraiah, V., Dhabliya, D., Pramanik, S., & Gupta, A. (2022). Genetic algorithm and machine learning. In Advances in healthcare information systems and administration book series (pp. 167–182). https://doi.org/10.4018/978-1-6684-5656-9.ch009
Chellam, V. V., Veeraiah, V., Khanna, A., Sheikh, T., Pramanik, S., & Dhabliya, D. (2023). A Machine Vision-Based Approach for Tuberculosis identification in chest X-Rays images of patients. In Lecture notes in networks and systems (pp. 23–32). https://doi.org/10.1007/978-981-99-3315-0_3
Duma, R. A., Niu, Z., Nyamawe, A. S., Tchaye-Kondi, J., & Yusuf, A. A. (2023). A Deep Hybrid Model for fake review detection by jointly leveraging review text, overall ratings, and aspect ratings. Soft Computing, 27(10), 6281–6296. https://doi.org/10.1007/s00500-023-07897-4
Dushyant, K., Muskan, G., Gupta, A., & Pramanik, S. (2022). Utilizing Machine Learning and Deep Learning in Cybersecurity: An Innovative Approach. In Cyber Security and Digital Forensics: Challenges and Future Trends (pp. 271–293). Wiley. https://doi:10.1002/9781119795667.ch1
Eshtehardian, S. A., & Khodaygan, S. (2022). A continuous RRT*-based path planning method for non-holonomic mobile robots using B-spline curves. Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-021-03625-8
Gawlikowski, J., Tassi, C. R. N., Ali, M., Lee, J., Humt, M., Feng, J., Kruspe, A., Triebel, R., Jung, P., Roscher, R., Shahzad, M., Wen, Y., Bamler, R., & Zhu, X. X. (2023). A survey of uncertainty in deep neural networks. Artificial Intelligence Review. https://doi.org/10.1007/s10462-023-10562-9
Ghanem, R., & Erbay, H. (2022). Spam detection on social networks using deep contextualized word representation. Multimedia Tools and Applications, 82(3), 3697–3712. https://doi.org/10.1007/s11042-022-13397-8
Hossain, M., Ho, R. C., & Trajkovski, G. (Eds.). (2023). Handbook of Research on AI and Machine Learning Applications in Customer Support and Analytics. IGI Global. https://doi.org/10.4018/978-1-6684-7105-0
Jabeur, S. B., Ballouk, H., Arfi, W. B., & Sahut, J. (2023). Artificial intelligence applications in fake review detection: Bibliometric analysis and future avenues for research. Journal of Business Research, 158, 113631. https://doi.org/10.1016/j.jbusres.2022.113631
Jáñez-Martino, F., Alaiz-Rodríguez, R., González-Castro, V., Fidalgo, E., & Alegre, E. (2022). A review of spam email detection: analysis of spammer strategies and the dataset shift problem. Artificial Intelligence Review, 56(2), 1145–1173. https://doi.org/10.1007/s10462-022-10195-4
Javed, M. S., Majeed, H., Mujtaba, H., & Beg, M. O. (2021). Fake reviews classification using deep learning ensemble of shallow convolutions. Journal of Computational Social Science, 4(2), 883–902. https://doi.org/10.1007/s42001-021-00114-y
Jayasingh, R. J., S, J. K. R. J., Telagathoti, D. B., Sagayam, K. M., Sagayam, K. M., Pramanik, S., Jena, O. P., & Bandyopadhyay, S. K. (2022). Speckle noise removal by SORAMA segmentation in digital image processing to facilitate precise robotic surgery. International Journal of Reliable and Quality E-healthcare, 11(1), 1–19. https://doi.org/10.4018/ijrqeh.295083
Kaddoura, S., Alex, S. A., Itani, M., Henno, S., AlNashash, A., & Hemanth, D. J. (2023). Arabic spam tweets classification using deep learning. Neural Computing and Applications, 35(23), 17233–17246. https://doi.org/10.1007/s00521-023-08614-w
Kathuria, A., Gupta, A., & Singla, R. (2023). AOH-SENTI: Aspect-Oriented Hybrid Approach to sentiment analysis of students’ feedback. SN Computer Science, 4(2). https://doi.org/10.1007/s42979-022-01611-1
Kaushik, D., Garg, M., Annu, Gupta, A., & Pramanik, S. (2022). Utilizing Machine Learning and Deep Learning in Cybersecurity: An Innovative Approach. Wiley eBooks, 271–293. https://doi.org/10.1002/9781119795667.ch12
Khanh, P. T., Ngoc, T. T. H., & Pramanik, S. (2023). Future of smart agriculture techniques and applications. In Advances in environmental engineering and green technologies book series (pp. 365–378). https://doi.org/10.4018/978-1-6684-9231-4.ch021
Li, J., Hu, J., Zhang, P., & Yang, L. (2023). Exposing collaborative spammer groups through the review-response graph. Multimedia Tools and Applications, 82(14), 21687–21700. https://doi.org/10.1007/s11042-023-14650-4
Liu, S., & Lee, I. (2019). Extracting features with medical sentiment lexicon and position encoding for drug reviews. Health Information Science and Systems, 7(1). https://doi.org/10.1007/s13755-019-0072-6
Mandal, A., Dutta, S., & Pramanik, S. (2021). Machine intelligence of PI from geometrical figures with variable parameters using SCILab. In Advances in systems analysis, software engineering, and high performance computing book series (pp. 38–63). https://doi.org/10.4018/978-1-7998-7701-1.ch003
Martis, E., Deo, R., Rastogi, S., Chhaparia, K., & Biwalkar, A. (2023). A proposed system for understanding the consumer opinion of a product using sentiment analysis. In Advances in intelligent systems and computing (pp. 555–568). https://doi.org/10.1007/978-981-19-5443-6_42
Meslie, Y., Enbeyle, W., Pandey, B. K., Pramanik, S., Pandey, D., Dadeech, P., Belay, A., & Saini, A. K. (2021). Machine Intelligence-Based Trend Analysis of COVID-19 for total daily confirmed cases in Asia and Africa. In Advances in systems analysis, software engineering, and high performance computing book series (pp. 164–185). https://doi.org/10.4018/978-1-7998-7701-1.ch009
Mewada, A., & Dewang, R. K. (2022). A comprehensive survey of various methods in opinion spam detection. Multimedia Tools and Applications, 82(9), 13199–13239. https://doi.org/10.1007/s11042-022-13702-5
Ngoc, T. T. H., Khanh, P. T., & Pramanik, S. (2023). Smart Agriculture using a soil monitoring system. In Advances in environmental engineering and green technologies book series (pp. 200–220). https://doi.org/10.4018/978-1-6684-9231-4.ch011
Padminivalli, S. J. R. K., V., Rao, M. V. P. C. S., & Narne, N. S. R. (2023). Sentiment based emotion classification in unstructured textual data using dual stage deep model. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-023-16314-9
Pramanik, S. (2022). Carpooling solutions using machine learning tools. In Advances in IT standards and standardization research series (pp. 14–28). https://doi.org/10.4018/978-1-7998-9795-8.ch002
Pramanik, S. (2023). An adaptive image steganography approach depending on integer wavelet transform and genetic algorithm. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-023-14505-y
Pramanik, S., & Bandyopadhyay, S. K. (2022). Identifying disease and diagnosis in females using Machine learning. In IGI Global eBooks (pp. 3120–3143). https://doi.org/10.4018/978-1-7998-9220-5.ch187
Pramanik, S., Sagayam, K. M., & Jena, O. P. (2021). Machine learning frameworks in cancer detection. E3S Web of Conferences, 297, 01073. https://doi.org/10.1051/e3sconf/202129701073
Praveenkumar, S., Veeraiah, V., Pramanik, S., Basha, S. M., Neto, A. V. L., De Albuquerque, V. H. C., & Gupta, A. (2023). Prediction of patients’ incurable diseases utilizing deep learning approach. In Lecture notes in networks and systems (pp. 33–44). https://doi.org/10.1007/978-981-99-3315-0_4
Qayyum, H., Farooq, A., Nawaz, M., & Nazir, T. (2023). FRD-LSTM: a novel technique for fake reviews detection using DCWR with the Bi-LSTM method. Multimedia Tools and Applications, 82(20), 31505–31519. https://doi.org/10.1007/s11042-023-15098-2
Salminen, J., Kandpal, C., Kamel, A., Jung, S., & Jansen, B. J. (2022). Creating and detecting fake reviews of online products. Journal of Retailing and Consumer Services, 64, 102771. https://doi.org/10.1016/j.jretconser.2021.102771
Srinivasarao, U., & Sharaff, A. (2023). Machine intelligence based hybrid classifier for spam detection and sentiment analysis of SMS messages. Multimedia Tools and Applications, 82(20), 31069–31099. https://doi.org/10.1007/s11042-023-14641-5
Kaddoura, S., Chandrasekaran, G., Elena Popescu, D., & Duraisamy, J. H. (2022). A systematic literature review on spam content detection and classification. https://doi.org/10.7717/peerj-cs.830
Yan, J. (2023). Multivariate Modeling with Copulas and Engineering Applications. In Springer handbooks (pp. 931–945). https://doi.org/10.1007/978-1-4471-7503-2_46
Yao, J., Qin, S., Qiao, S., Liu, X., Zhang, L., & Chen, J. (2022). Application of a two-step sampling strategy based on deep neural network for landslide susceptibility mapping. Bulletin of Engineering Geology and the Environment, 81(4). https://doi.org/10.1007/s10064-022-02615-0
Zhan, P., Qin, X., Zhang, Q., & Sun, Y. (2023). Output-Only modal identification based on auto-regressive Spectrum-Guided symplectic geometry mode decomposition. Journal of Vibration Engineering & Technologies. https://doi.org/10.1007/s42417-022-00832-1
Zhao, P., Ma, Z., Gill, T., & Ranaweera, C. (2023). Social media sentiment polarization and its impact on product adoption. Marketing Letters, 34(3), 497–512. https://doi.org/10.1007/s11002-023-09664-9
