Key Data Extraction and Emotion Analysis of Digital Shopping Based On BERT
2.Sentiment Analysis
3.Text summarization
4.Aspect mining
5.Topic modelling
Techniques :
1.Seq2seq
6.Transformers
Why BERT (Bidirectional Encoder Representations from Transformers)?
Builds on recent work in pre-training contextual representations.
Well suited to context-heavy text, because each token is encoded using both its left and right context.
Pre-trained BERT models are available for English and 103 other languages and can be
fine-tuned to fit your needs. For example, the English model can be fine-tuned for
sentiment analysis.
Problem Statement
To accelerate digital sales by identifying key trends and predicting their performance
in the current market.
Challenge targets :
• Companies find it difficult to understand current public needs during the
pandemic.
• Knowing what people need, and making predictions from it, boosts digital sales,
since online shopping is currently the most preferred option.
Target :
Data source: Kaggle datasets.
Focus on review content and on a collection of positive and negative words for training.
Pre-train and fine-tune the BERT model with category-wise words.
The vectorized output is classified with TF-IDF and CRF models to identify the emotion.
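As an illustration of the TF-IDF part only, the sketch below computes TF-IDF weights for a few made-up review snippets in plain Python. The sample reviews, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for the example, not the project's actual pipeline; the CRF step is not shown.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log((1 + N) / (1 + df)) + 1   (smoothed variant, an assumption here)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights

# Made-up reviews, for illustration only.
reviews = [
    "great phone great battery",
    "poor battery",
    "great service",
]
weights = tf_idf(reviews)
for review, row in zip(reviews, weights):
    print(review, "->", sorted(row, key=row.get, reverse=True))
```

Terms that are frequent inside one review but rare across the collection get the highest weight, which is why "poor" outranks "battery" in the second review.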
System Architecture & Modules Explained
• Steps:
1. Obtain scraped product reviews across different categories.
2. Data wrangling.
3. EDA with pre-trained BERT together with a neural network classifier.
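The data wrangling step could look roughly like the sketch below, which cleans scraped review strings and drops empty or duplicate entries. The cleaning rules, the function names `clean_review` and `wrangle`, and the sample rows are all hypothetical; the real rules depend on how the reviews were scraped.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")
SPACE_RE = re.compile(r"\s+")

def clean_review(text):
    """Normalize one scraped review string."""
    text = html.unescape(text)      # decode entities such as &amp;
    text = TAG_RE.sub(" ", text)    # drop HTML tags left over from scraping
    text = SPACE_RE.sub(" ", text)  # collapse runs of whitespace
    return text.strip()

def wrangle(rows):
    """Clean (category, review) pairs; drop empty and duplicate reviews, keep order."""
    seen = set()
    cleaned = []
    for category, review in rows:
        review = clean_review(review)
        key = (category, review.lower())  # duplicates are matched case-insensitively
        if review and key not in seen:
            seen.add(key)
            cleaned.append((category, review))
    return cleaned

# Made-up scraped rows, for illustration only.
raw = [
    ("electronics", "Great   phone!<br/>Battery lasts &amp; lasts."),
    ("electronics", "great phone! Battery lasts & lasts."),
    ("books", "   "),
]
print(wrangle(raw))
```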
Major modules :
• Load the BERT classifier and tokenizer along with the input modules.
• Configure the loaded BERT model and train it for fine-tuning.
• Make predictions with the fine-tuned model.
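To make the tokenizer and input module more concrete, here is a minimal sketch of how a BERT-style tokenizer turns text into model inputs: greedy longest-match-first WordPiece splitting, [CLS] and [SEP] markers, padding, and an attention mask. The ten-entry vocabulary and its ids are hypothetical (a real BERT vocabulary has roughly 30,000 entries with different ids), and in practice a library tokenizer would be loaded instead.

```python
# Hypothetical miniature vocabulary; the ids are just list positions.
VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]",
         "the", "battery", "is", "un", "##believ", "##able"]
TOKEN_TO_ID = {token: idx for idx, token in enumerate(VOCAB)}

def wordpiece(word):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in TOKEN_TO_ID:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matches, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def encode(text, max_len):
    """Return tokens, input ids, and attention mask, padded or truncated to max_len."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word))
    tokens = tokens[: max_len - 1] + ["[SEP]"]
    input_ids = [TOKEN_TO_ID[t] for t in tokens]
    attention_mask = [1] * len(input_ids)  # 1 = real token, 0 = padding
    padding = max_len - len(input_ids)
    return (tokens,
            input_ids + [TOKEN_TO_ID["[PAD]"]] * padding,
            attention_mask + [0] * padding)

tokens, input_ids, attention_mask = encode("The battery is unbelievable", max_len=10)
print(tokens)
print(input_ids)
print(attention_mask)
```

The input ids and attention mask are the two sequences the fine-tuned classifier would consume for each review.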
System Architecture
The model consists of the BERT encoder, a dropout layer, and a classifier. The IMDB dataset is used for
binary sentiment classification, that is, to decide whether a review is positive or negative. It contains
25,000 movie reviews for training and 25,000 for testing, all of them labeled.
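The dropout layer and classifier that sit on top of BERT can be sketched in plain Python as follows. The 4-dimensional `pooled` vector is a hypothetical stand-in for the encoder's pooled output (768 values for BERT-base), and the weights, bias, and dropout rate of 0.1 are made-up numbers; a real model would learn the weights during fine-tuning.

```python
import math
import random

def dropout(vector, rate, training, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vector]

def linear(vector, weights, bias):
    """One logit per weight row: dot(row, vector) + bias."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(pooled, weights, bias, rate=0.1, training=False, rng=None):
    """Dropout (active only in training) followed by a linear layer and softmax."""
    rng = rng or random.Random(0)
    hidden = dropout(pooled, rate, training, rng)
    return softmax(linear(hidden, weights, bias))

# Hypothetical stand-in for the encoder output and for learned parameters.
pooled = [0.5, -1.0, 0.25, 2.0]
weights = [[0.1, 0.2, -0.3, 0.4],    # row for class 0 (negative)
           [-0.1, 0.3, 0.2, -0.5]]   # row for class 1 (positive)
bias = [0.0, 0.1]
probs = classify(pooled, weights, bias)
print(probs)
```

Dropout is only applied while training; at prediction time the head reduces to a linear layer followed by softmax over the two classes.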
1. Apply weight decay to all parameters except 'bias' and 'LayerNorm'.
2. Lookahead optimizer (improves learning stability and lowers the variance of its inner
optimizer).
3. OneCycleLRWithWarmup with 0 warmup steps and cosine annealing from 5e-5 to 1e-8.
4. Gradient accumulation for large-batch training.
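The four training settings above can each be reduced to a small formula, sketched here in plain Python. The parameter names, the weight decay value of 0.01, the Lookahead step size `alpha`, and the step count are assumptions for illustration; `cosine_lr` is a plain cosine formula standing in for the scheduler class named above, not that class itself.

```python
import math

NO_DECAY = ("bias", "LayerNorm")

def split_decay_groups(param_names, weight_decay=0.01):
    """Item 1: no weight decay for names containing 'bias' or 'LayerNorm'."""
    decay = [n for n in param_names if not any(k in n for k in NO_DECAY)]
    no_decay = [n for n in param_names if any(k in n for k in NO_DECAY)]
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]

def lookahead_sync(slow, fast, alpha=0.5):
    """Item 2: after k inner steps, move the slow weights toward the fast weights."""
    return [s + alpha * (f - s) for s, f in zip(slow, fast)]

def cosine_lr(step, total_steps, lr_max=5e-5, lr_min=1e-8):
    """Item 3: cosine annealing from lr_max at step 0 to lr_min at the last step."""
    progress = step / max(1, total_steps - 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

def accumulate(micro_batch_grads):
    """Item 4: average the gradients of several micro-batches before one update."""
    n = len(micro_batch_grads)
    return [sum(g) / n for g in zip(*micro_batch_grads)]

# Hypothetical parameter names in the style of a BERT classifier.
names = [
    "encoder.layer.0.output.dense.weight",
    "encoder.layer.0.output.dense.bias",
    "encoder.layer.0.output.LayerNorm.weight",
    "classifier.weight",
    "classifier.bias",
]
groups = split_decay_groups(names)
schedule = [cosine_lr(step, total_steps=100) for step in range(100)]
print(groups[0]["params"])
print(schedule[0], schedule[-1])
```

With gradient accumulation, the effective batch size is the micro-batch size times the number of accumulated micro-batches, which is what allows large-batch training on limited memory.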
BERT model for classification
After two epochs, the model reaches 96.22% accuracy, which is about 6% higher than logistic
regression.
To improve the result further: fine-tune with a frozen encoder.
Test content :