An Intelligent Chatbot Using Deep Learning With Bidirectional RNN and Attention Model
Article history: Received 12 April 2020; Received in revised form 15 May 2020; Accepted 16 May 2020; Available online 10 June 2020

Keywords: Deep learning; Chatbot; Bidirectional RNN and Attention model; Tensorflow; Neural Machine Translation

Abstract: This paper presents the modeling and performance in deep learning computation of an Assistant Conversational Agent (Chatbot), built with the Tensorflow software library and, in particular, its Neural Machine Translation (NMT) model. Acquiring knowledge for modeling is one of the most important tasks, and preprocessing that knowledge is quite difficult. A Bidirectional Recurrent Neural Network (BRNN) containing attention layers is used, so that input sentences with a large number of tokens (sentences with more than 20–40 words) can be replied to with more appropriate conversation. The dataset used for training the model is taken from Reddit. The model is developed to perform English-to-English translation. The main purpose of this work is to improve the perplexity and learning rate of the model and to find the Bleu score for translation within the same language. The experiments are conducted with Tensorflow using Python 3.6. The perplexity, learning rate, Bleu score and average time per 1000 steps are 56.10, 0.0001, 30.16 and 4.5 h respectively. One epoch is completed at 23,000 steps. The paper also studies the MacBook Air as a system for neural networks and deep learning.

© 2019 Elsevier Ltd. All rights reserved.
Selection and peer-review under responsibility of the scientific committee of the 3rd International Conference on Science and Engineering of Materials.
1. Introduction

A Chatbot can be defined as software that helps humans make coherent conversation with a machine using natural language such as English. The conversation can be engaging at times, with large vocabularies and a broad range of conversational topics. Recently, the use of deep learning has increased in industry, and the Chatbot is one of its applications [1–3]. Fig. 1 shows a user employing the Chatbot for its various applications. This paper helps to create an open-domain Chatbot, which can later be restricted to a particular domain if needed, as shown in Fig. 1. This can be done by making changes in the dataset, i.e., training the model with knowledge of the particular domain. Due to the open-domain nature of the Chatbot, it can be used to build an Artificial Intelligence Assistant that can hold real-life conversations with its user on any topic and in any situation. To make deep learning usable by everyone, the major deep learning library Tensorflow was implemented by Google [4] and made available as open source. Tensorflow [5] is a Python-friendly library bundled with machine learning and deep learning (neural network) models and algorithms. This paper shows the construction of a Chatbot with the Neural Machine Translation (NMT) model, which is an improvement on the sequence-to-sequence model. The Chatbot uses a Bidirectional Recurrent Neural Network (BRNN) [6]. The BRNN was chosen because conversation, i.e., input to the Chatbot, is dynamic, which means the length of the input is not fixed. The BRNN is also supported by an attention mechanism [7,8], which further increases the capacity of the model to remember longer sequences of sentences. The concept of the Bidirectional Recurrent Neural Network can be understood as two independent Recurrent Neural Networks (RNNs) [9] taken together, sending signals through their layers in opposite directions. A BRNN can thus be seen as a neural network connecting two hidden layers of opposite directions to a single output. This helps the network to have both forward and backward information at every step, i.e., to receive information from both past and future states. The input is fed to one direction in normal time order and to the other in reverse order. The concept of Extended Long Short-Term Memory (ELSTM) [10] can also be used, with the Dependent BRNN (DBRNN), as it helps to improve results by 30% on labeled data. The training of a BRNN is done in the same way as for an RNN, as the two directional neurons do not interact with one another; the weights are updated only once the forward and backward passes are done [11].
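As a concrete illustration of this bidirectional idea (a minimal sketch only, not the exact network configuration used in this paper; layer sizes here are assumptions), a bidirectional recurrent layer can be expressed in TensorFlow/Keras by wrapping a recurrent layer so that one copy reads the sequence in normal time order and a second copy reads it reversed:

```python
# Minimal sketch of a bidirectional RNN over token sequences.
# Sizes are illustrative assumptions, not the paper's configuration.
import tensorflow as tf

vocab_size, embed_dim, units = 15000, 128, 256  # vocab size as in Section 3.1.3

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    # Bidirectional runs one GRU forward in time and a second copy on the
    # reversed sequence, concatenating both hidden states at every step.
    tf.keras.layers.Bidirectional(
        tf.keras.layers.GRU(units, return_sequences=True)),
    tf.keras.layers.Dense(vocab_size),  # per-step scores over the vocabulary
])
model.build(input_shape=(None, None))
model.summary()
```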
* Corresponding author.
E-mail address: [email protected] (M. Dhyani).
https://doi.org/10.1016/j.matpr.2020.05.450
2214-7853/© 2019 Elsevier Ltd. All rights reserved.
Fig. 2 shows the general BRNN architecture with the hidden states for the forward and backward directions. The variables 'O', 'I' and 'H' denote the 'Output', 'Input' and 'Hidden' states respectively. The values {X1, X2, ..., Xn} are input signals to the network and the values {Y1, Y2, ..., Yn} are the computed output signals from the network.

1.1. System details

This paper also evaluates the MacBook as a system for deep learning. Table 1 shows the system specification and other software details such as the operating system and the version of TensorFlow used. The technology is also being upgraded every day; even the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU) are becoming faster [12]. The system used is the 2017 MacBook Air series by Apple. The laptop was at room temperature during the whole time the model was trained.

Table 1
System and Software Specification.

System      Specification
CPU         Intel(R) Core(TM) i5-5350U CPU @ 1.80 GHz; RAM: 8 GB 1600 MHz DDR3
GPU         Intel HD Graphics 6000, 1536 MB, 48 EUs @ 300–1000 (Boost) MHz
Software    OS: macOS Mojave (64-bit), Version 10.14.6; Tensorflow Version 1.15.0rc1

2. Literature review

There has been an increase in the development of conversational agent systems in the commercial market, such as in the retail, banking and education sectors. Research is being done to improve accuracy and to make conversation between Chatbot and user as close as possible to real-world conversation. Apart from the traditional rule-based techniques used earlier in Chatbot development and other straightforward machine learning algorithms, advanced concepts and techniques such as Natural Language Processing (NLP) and deep learning techniques like Deep Neural Networks (DNN) and Deep Reinforcement Learning (DRL) are being used to model and train Chatbot systems. In the early days, translation was done by breaking up source sentences into multiple tokens or chunks and then translating phrase-by-phrase.
Sequence-to-Sequence has been a popular model based on neural networks and deep learning, used in Neural Machine Translation [13]. It is used in tasks like machine translation, speech recognition and text summarization. The Sequence-to-Sequence model has an encoder and a decoder, both built from vanilla RNNs [14] by default. The encoder takes the source sentence as input and builds a thought vector. A thought vector is a vector containing a sequence of numbers that represents the meaning of the sentence. Finally, the decoder processes the thought vector fed to it and emits a translation, called the target sequence or sentence. But the vanilla RNN fails when long sequences of sentences are fed to the model, as information needs to be remembered; this information frequently becomes larger for bigger datasets and creates a bottleneck for RNN networks. Variations of the RNN, such as the BRNN with an attention mechanism, Long Short-Term Memory (LSTM) or the Gated Recurrent Unit (GRU) [15], have to be used to handle this failure on longer sequences.
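The encoder–decoder ("thought vector") pipeline described above can be sketched as follows. This is an illustrative outline under assumed layer sizes, not the Kyoto-NMT implementation or the exact model trained in this paper:

```python
# Illustrative seq2seq encoder-decoder sketch (assumed sizes): the encoder
# compresses the source sentence into a fixed "thought vector"; the decoder
# expands that vector into the target sequence.
import tensorflow as tf

vocab, dim, units = 15000, 128, 256

enc_in = tf.keras.Input(shape=(None,), dtype="int32")
x = tf.keras.layers.Embedding(vocab, dim)(enc_in)
_, thought = tf.keras.layers.GRU(units, return_state=True)(x)  # thought vector

dec_in = tf.keras.Input(shape=(None,), dtype="int32")
y = tf.keras.layers.Embedding(vocab, dim)(dec_in)
y = tf.keras.layers.GRU(units, return_sequences=True)(y, initial_state=thought)
logits = tf.keras.layers.Dense(vocab)(y)  # next-token scores per step

model = tf.keras.Model([enc_in, dec_in], logits)
```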
To understand the meaning behind a sentence, the intention, facts and emotions described in it must be analyzed. In [3], the statistical difference was 63.33% for identifying two groups or personalities on the basis of their sentiments. The techniques of NLP and machine learning help to analyze sentence sentiment deeply and to make a comfortable environment for humans to converse with a machine. If the dataset is text, sentiment features can be classified at the text level or the document level [16]. Deep learning methods like the Convolutional Neural Network (CNN) [17] and the RNN are used in document sentiment classification, as it is the more difficult of the two classifications mentioned above. Chatbots have also been developed for young adolescents, engaging them in their most preferred channel of communication, the smartphone, and successfully guiding them to adult-focused care; in [2], the engagement time was 97%. Kyoto-NMT is an open-source implementation of the NMT paradigm [18]. It uses Chainer, a deep learning framework. It uses two layers of LSTM with an attention model in its recurrent neural network, and whitespace for making tokens in data preparation. The vector size used is 1000. The training data is a sentence-aligned parallel corpus stored in UTF-8 text files: one for source-language sentences and the other for target-language sentences. A JSON file is created containing all the parameters. SQLite is used for database creation and for keeping track of training progress. The Bleu score is computed by applying greedy search on the validation dataset. As real translation was involved (Japanese-to-English), the Bleu score reached was 26.22. Snowbot is a Chatbot developed on Tensorflow and MXNet [19]. It uses the Cornell Movie Dialog Corpus with 221,282 QA pairs (22 MB) and Twitter chat with 377,265 QA pairs containing emoji in sentences (51 MB). The Chatbot was developed on a desktop having an NVIDIA GTX 1080 8 GB and 5 GB RAM. The neural network contains two layers of LSTM with 256 hidden units. The vocabulary size was 50,000. The model completed 7 epochs in 1 h. The perplexity was 90 and 135 for the Cornell and Twitter datasets respectively.

2.1. Chatbot performance metrics across industries

Currently, there are many performance metrics, and certain measurement standards are followed across industry for Chatbots [20]. Different organizations need Chatbots according to the nature of their work and the market surrounding it. One of the most important performance metrics for a Chatbot is the structure and length of its conversation. The length of the output sentence must be appropriate and in context with the conversation being held; the shorter and simpler the structure of the output sentence, the faster the solution, which does increase the customer satisfaction rate. To understand this metric with an example from the banking sector: in banks, the Chatbot is mainly used to guide the user through the bank's policies, schemes and other customer inquiries about their account. This serves users in performing their tasks quicker, and also lowers human call assistance, thus cutting service costs. Consumer satisfaction also brings up a second metric, the retention rate, which is very important for Chatbot success. Companies aim for a significantly high retention rate, indicating customer satisfaction. Automated calls and Chatbot messengers are being used to replace other communication mediums (i.e., lowering call volumes handled by humans). The retention rate increases when Chatbots are better trained to support users in managing their accounts without speaking to a human assistant. Another metric is the ability of the Chatbot to produce a personalized reply to the user. This means that the Chatbot should take in the source sentence, understand and analyze it, and produce an output statement mapped to the particular problem or query of the user. Companies try to personalize and customize the output statement according to each user's need, like a bank suggesting a relevant offer or credit card scheme to the user according to the balance in the user's account, salary, current loans and spending history. For example, Erica, Bank of America's AI virtual assistant Chatbot, combines predictive analytics and NLP to help its users access their balance information, get tips on saving money based on their spending habits, make transactions between accounts and schedule meetings at financial centers and banks. E-market and retail Chatbots make an engaging environment for users to shop. Through this environment, the Chatbot transforms itself into a personal assistant for shopping. Unlike banking and financial sector Chatbots, retail conversational agents are designed to aim for a higher number of conversational steps, holding the user's attention, providing details and encouraging them to browse more and ultimately purchase the product. For instance, eBay's ShopBot helps users find the best deals from its list of a billion products. It is easy to talk to, like a friend, whether one is searching for a specific product or browsing to find something new. The studies discussed above show networks designed for small datasets and for short input sentences, which are not fit for real-life conversation, as humans tend to speak in longer sentences. To handle real-world conversation, a model like the BRNN is important for knowing the conversation's context and references, from the past as well as the future. The attention mechanism is an important attachment to the network, as it helps to weigh particular references from the input sentences. Also, there are no hard metrics for evaluating Chatbot performance [21], but parameters like perplexity, learning rate and Bleu score can indicate how close one can approach while training the model.

3. Methodology

The model, a BRNN with an attention model, helps not only with short but also with longer token sequences. The attention model helps to remember longer sequences at a time and also helps with the context problem, where both historical and future information is required. In the real world, language may not necessarily be in perfect sequence; sometimes one has to use context, hearing the full conversation before going back and responding to the words leading up to that point. Humans tend to speak in longer sentences to convey meaning. This is the reason the combination of a BRNN and an attention model is the right choice for Chatbots. The BRNN structure forms an acyclic graph, as can be seen in Fig. 3. The signals are of two types, forward and backward: $\overrightarrow{a}$ is the forward recurrent component and $\overleftarrow{a}$ is the backward recurrent component. There are input values $X = \{X_1, X_2, \ldots, X_n\}$ at timestamps $t = \{1, 2, \ldots, n\}$, and the predicted values are $\hat{Y} = \{\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_n\}$. The prediction $\hat{Y}$ at time $t$, which is an activation function $g(\cdot)$, is computed as:

$$\hat{Y}^{\langle t \rangle} = g\left(W_y\left[\overrightarrow{a}^{\langle t \rangle},\ \overleftarrow{a}^{\langle t \rangle}\right] + P_y\right) \qquad (1)$$
Fig. 3. Source input demonstration into BRNN with hidden layer to describe forward and backward flow of information.
where $W_y$ is the weight according to the input and the magnitude set in the network, and $\overrightarrow{a}^{\langle t \rangle}$ and $\overleftarrow{a}^{\langle t \rangle}$ are the forward activation at time $t$ and the backward activation at time $t$ respectively. $P_y$ is the computed (or predicted) value from the previous neuron in the direction in which the information is advancing. Activation functions in a neural network are the functions attached to a node (or neuron) that get triggered when the input value at the current node is relevant for making a prediction. There are many types of activation functions, but with the BRNN, the sigmoid and logistic activation functions are mostly used. For example, in the network shown in Fig. 3, a couple of statements are given as input. Statement 1: He said, "Indira went to the market." Statement 2: He said, "Indira Gandhi was a great Prime Minister." At the instant statement 2 is inserted into the network, at time step 3, where the word "Indira" is inserted, the prediction (output signal) of cell $\hat{Y}_3$ is checked. Because the network is bidirectional in nature, the forward information flowing from cells 1 and 2 to cell 3, as well as the backward flow of information from cell 9 (through all cells in between) to cell 3, helps the cell predict that 'Indira' in statement 2 is the Prime Minister and not the 'Indira' who went to the market in statement 1. If a simple RNN had been used, the output would have followed statement 1, as the network would not have had the future information. This is because RNNs are unidirectional, i.e., they run only in the positive direction of time [22].
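A small numeric sketch of Eq. (1) follows; the dimensions and values are invented purely for illustration, with $g$ taken as the sigmoid mentioned above:

```python
# Hedged numeric sketch of Eq. (1): the prediction at time t combines the
# forward and backward activations. Shapes and values are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

a_fwd = np.array([0.2, 0.7])        # forward activation at time t
a_bwd = np.array([0.5, 0.1])        # backward activation at time t
W_y = np.random.randn(3, 4) * 0.1   # output weights: 3 outputs x 4 concat features
P_y = np.zeros(3)                   # value carried from the previous neuron

a_cat = np.concatenate([a_fwd, a_bwd])   # [a_fwd ; a_bwd]
y_hat = sigmoid(W_y @ a_cat + P_y)       # Eq. (1) with g = sigmoid
print(y_hat)
```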
In the attention mechanism, an attention vector is generated by calculating a score, and the calculated vector is then retained in memory so as to choose among the best candidate vectors. The score is calculated by comparing each hidden target state with the source hidden states. To apply the attention mechanism, a single-directional RNN can be used above the BRNN structure. Each cell in the attention network is given context as an input. This type of network is also called a context-aware attention network [23]. The predicted value from each BRNN cell is taken and combined with the value from the previous state (or neuron) of the attention network to calculate the attention value. One can also say that the context is the weight of the features from different timestamps. Fig. 4 shows how the weighted value is taken in the attention cell, as described above. The context $C$ for cell 1 in the attention network can be computed as:

$$C_1 = \sum_{t'} Y^{\langle 1, t' \rangle}\, a^{\langle t' \rangle} \qquad (2)$$

where $a^{\langle t' \rangle} = \left(\overrightarrow{a}^{\langle t' \rangle}, \overleftarrow{a}^{\langle t' \rangle}\right)$ and $Y^{\langle 1, t' \rangle}$ is the value from the activation function applied on the BRNN for the prediction at each of its cells, which is used as the weight for the context computation. $Y^{\langle t, t' \rangle}$ is basically the amount of attention that $Y^{\langle t \rangle}$ should pay to $a^{\langle t' \rangle}$.

$S_n$ represents the states of the attention model, where $n \in \mathbb{W}$. $S_0$ is the primary weight, whose value is set according to the network. $\hat{Z}_t$ at time step $t$ is the attention value after computation by the attention neuron, where $t \in \mathbb{N}$. Fig. 4 is the extended structure which, when combined with the BRNN of Fig. 3, produces the complete architecture of the deep learning network. This network has a quadratic cost in time. Since, in a Chatbot, very long sentences or paragraphs are generally not used to converse, the cost may be acceptable, though there is other research in this field aimed at reducing the quadratic time cost. The attention mechanism is one of the important methods in deep learning, especially in the areas of document classification [24], speech recognition [25] and image processing [26–28].
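Eq. (2) amounts to a weighted sum of the BRNN activations over all time steps. A minimal sketch (all numbers invented; softmax-normalized weights assumed for $Y^{\langle 1, t' \rangle}$):

```python
# Hedged sketch of the context computation in Eq. (2): the context for an
# attention cell is a weighted sum of BRNN activations over time steps t'.
import numpy as np

# concatenated (forward, backward) activations for 4 time steps, 2 features each
a = np.array([[0.2, 0.5],
              [0.7, 0.1],
              [0.4, 0.4],
              [0.1, 0.9]])

scores = np.array([0.1, 2.0, 0.3, 0.5])           # alignment scores for cell 1
weights = np.exp(scores) / np.exp(scores).sum()   # attention weights Y<1,t'>
C1 = (weights[:, None] * a).sum(axis=0)           # Eq. (2): context for cell 1
print(C1)
```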
3.1. Implementation

The procedure for implementing the methodology is depicted in Fig. 5.

3.1.1. Datasets

The Reddit dataset [29] has been used to build the database for the Chatbot. The dataset contains the comments of January 2015, in JSON format. The content of the dataset includes parent_id, comment body, score, subreddit, etc. The score is the most useful field for setting the acceptable-data criteria, as it shows that a particular comment is the most accurate reply to its parent comment (parent body). The subreddit can be used to make specific types of Chatbots, such as scientific or other particular-domain bots. A subreddit is a specific online community, and the posts associated with it are dedicated to a particular topic that people write about. The database formed after preprocessing the dataset has a size of 2.42 GB and contains 10,935,217 rows (i.e., the number of parent comment–reply comment pairs).
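For illustration, a single raw comment record in such a dump has roughly the following shape (field values invented; field names follow those listed above):

```python
# Illustrative shape of one raw Reddit comment record (values invented;
# field names follow those described in the text, not an exact dump schema).
comment = {
    "parent_id": "t1_cnas8zv",      # id of the comment being replied to
    "body": "This is the comment text that may become a training reply.",
    "score": 57,                     # used for the acceptable-data criteria
    "subreddit": "askscience",      # usable to build domain-specific bots
    "created_utc": 1420070400,      # UNIX time, kept to track ordering
}
```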
3.1.2. Preprocessing

First, a database is required for training the model, so the dataset is converted into a database with fields like parent_id, parent, comment, subreddit, score and UNIX time (to track time). To make the data more admissible, only comments that have fewer than 50 words but more than 1 word are taken (in case the reply to the parent is empty). All newline characters, '[deleted]' and '[removed]' comments, etc. are also removed. If a comment body is valid according to the acceptance criteria and has a higher score than the comment previously paired with the parent comment of the same parent_id, it replaces that comment. Also, if a comment with no parent comment is encountered, it can itself be the parent comment of some other comment (i.e., it is a main thread comment on Reddit). A minimal sketch of this acceptance test is shown below.
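The sketch below applies the stated word-count and deletion criteria; it is a simplified, hypothetical helper, not the authors' exact preprocessing code (the score threshold is handled separately, during pairing):

```python
# Hedged sketch of the acceptable-data filter described above: keep
# comments of 2-49 words and drop deleted/removed bodies.
def acceptable(body: str) -> bool:
    if body in ("[deleted]", "[removed]"):
        return False
    body = body.replace("\n", " ")   # strip newline characters
    n_words = len(body.split())
    return 1 < n_words < 50          # more than 1, fewer than 50 words

assert acceptable("short but valid reply here")
assert not acceptable("[deleted]")
```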
For database creation, the data has been paired into parent and child comments. Each comment is either a main parent comment or a reply comment, but each has a parent_id; a parent comment and its reply comments share the same parent_id, and the pairs are made in accordance with the parent_id. In the creation of the database, each parent comment is mapped to its best child (reply) comment. Any comment, whether parent or child, must have an acceptance score of two. When a new comment is encountered that matches the parent_id of a previously entered reply comment to a parent body, its score is compared with the entered reply comment's score. If the current comment has a better score than the existing mapped reply comment, the new reply comment (and its associated data) replaces the previous one; if not, the row remains unchanged. Further, if the encountered comment has a parent body not yet paired with any reply comment, the comment is mapped to its parent body; if the comment has no parent body, a new row is created for the comment, as the newly encountered comment can be a parent to some other reply comment. On creation of the database, 10,935,217 parent–reply comment pairs (rows) are created.

3.1.3. Training model

After creation of the database, the rows have to be divided into training data and test data, and for each, two files are created (i.e., parent comments and reply comments). The training data contains 3,027,254 pairs and the test data contains 5100 pairs. There are also lists of protected phrases (e.g., www.xyz.com should be a single token) and blacklisted words, to avoid feeding them to the learning network. The training files are fed to a multiprocessing tokenizer, as tokenization is CPU-intensive. The sentences are divided into tokens on the basis of spaces and punctuation, and each token acts as vocabulary. For each step, the vocabulary size is 15,000; this size is appropriate for systems having 4 GB of virtual memory. The regex module is used to formulate search patterns for the vocabulary; it is faster than the standard library and is basically used to check whether a string contains a specific search pattern. The neural network must be designed as mentioned above. Once training starts, the main hyperparameters (HParams) of concern in the metrics are the Bleu score (bleu), perplexity (ppl) and learning rate (lr). The Bleu score tells how well the model translates a sentence from one language to another; it should be as high as possible. Perplexity is a measure over the probability distribution; it tells about the model's prediction error. The learning rate reflects the model's learning progress in the network. As the language at both ends of the model in this paper is English, perplexity is more useful than the Bleu score. The learning rate is useful, but only when the model is trained with large data and for a long period of time; if the model is trained for a limited period or with less data, no significant change in the learning rate will be observed.
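The space-and-punctuation tokenization with protected phrases can be sketched with regular expressions. The patterns below are our assumptions (the paper only states the splitting criteria), using the third-party `regex` module mentioned above, though the standard `re` module behaves identically for these patterns:

```python
# Hedged sketch of space/punctuation tokenization with protected phrases
# (e.g. URLs kept as single tokens). Patterns are illustrative assumptions.
import regex

PROTECTED = regex.compile(r"(?:https?://|www\.)[^\s,]+")  # e.g. www.xyz.com
TOKEN = regex.compile(r"\w+|[^\w\s]")                      # words or punctuation

def tokenize(sentence: str) -> list:
    tokens, pos = [], 0
    for m in PROTECTED.finditer(sentence):
        tokens += TOKEN.findall(sentence[pos:m.start()])   # plain text part
        tokens.append(m.group())                           # URL as one token
        pos = m.end()
    tokens += TOKEN.findall(sentence[pos:])
    return tokens

print(tokenize("Visit www.xyz.com, it's great!"))
# ['Visit', 'www.xyz.com', ',', 'it', "'", 's', 'great', '!']
```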
3.1.4. Result and analysis

Initially, before training the model, the perplexity was 16322.15, the learning rate was 0.001 and the Bleu score was 0.00. The average time used by the system described above is between 4 and 4.5 h per 1000 steps. If the upper bound of time is taken, then for training the machine to 23,000 steps the system took 103.5 h. The perplexity, learning rate and Bleu score at step 23,000 are 56.10, 0.0001 and 21.67 respectively. The maximum Bleu score the model reached was 30.16, at the 18,000th step. The model also passed one epoch at the 23,000th step. The learning rate is low and negligible, as changes were made externally to the weights in the neural network once the training started. The performance evaluation is shown in Table 2.

Table 2
Performance evaluation.

Parameter                        Initial     After training 23,000 steps
Perplexity                       16322.15    56.10
Learning rate                    0.001       0.0001
Bleu score                       0.00        21.67
Average time (per 1000 steps)    0           4.5 h
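Perplexity relates directly to the model's average prediction error: it is the exponential of the mean cross-entropy loss (a standard relation, not stated explicitly in the paper; the loss values below are invented):

```python
# Perplexity as the exponential of the mean cross-entropy loss.
import math

losses = [9.7, 7.2, 5.1, 4.3, 4.025]             # per-step cross-entropy (nats)
perplexity = math.exp(sum(losses) / len(losses))
print(round(perplexity, 2))                       # lower is better
```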
Table 3
Performance comparison.
Ref. Parameter Ref. Domain Our Domain Ref. Result Our Result
[18] Bleu score Japanese-to-English English-to-English 22.86 30.16
[19] Perplexity Cornell Dataset (22 MB) Reddit Dataset (2.42 GB) 90 56.10
[19] Perplexity Twitter Dataset (51 MB) Reddit Dataset (2.42 GB) 135 56.10
[19] Time 83.7 h for 1,000,000 steps 103.5 h for 23,000 steps 10 epoch 1 epoch
One can also compare the performance of the Chatbots in [18] and [19] with our result, as reflected in Table 3.

The graphs of perplexity and Bleu score are shown in Fig. 6. Among the other observations in Tensorboard, train_loss decreases when the model starts training; once train_loss starts increasing after reaching a minimum point, one should stop training the model, as very little or no change in model performance will occur. This shows that excessive training of the model can lead to data loss. The smoothing of all graphs is done at a value of 0.96 for better interpretation.

The speed graph in Fig. 7 demonstrates the system speed per 1000 steps. As mentioned above, no real translation is going on in this Chatbot, but the value should still increase initially; there will also be decreases in the value at some points, as no real translation is taking place. The perplexity should fall in every case; if it does not fall, the model is not getting trained properly, and there will be no significant change in the replies of the NMT Chatbot. There is variation in the value of the system speed, as the speed depends upon the overall tasks being performed and the other open and running applications. From the analysis of and experience with the system while working on the experiment, the MacBook Air is just enough for basic deep learning model training, but not adequate beyond that. If one wants to go higher and train intermediate or advanced models, the MacBook Air (2017) hardware is not enough, and there are insignificant changes in the latest models of the MacBook Air, for instance the MacBook Air (2020) or other MacBook Air series.

With the help of the test file previously created for validating the Chatbot's replies, Table 4 gives the comparison between the source dataset's test reply comments and the Chatbot's (NMT-Chatbot) replies after 23,000 steps of training. A test reply comment is the real-world human reply to the test parent comment on Reddit; the NMT reply is the output of the Chatbot. The eight sentences are randomly picked from the parent comment field of the database created for training. The sentences have lengths ranging between 8 and 30 words. No punctuation in the sentences shown in Table 4 has been added or removed.
Fig. 7. The system speed graph (x-axis denotes number of steps and y-axis denotes system time, in seconds).
Table 4
Conversation input–output response analysis of referenced user versus NMT-Chatbot reply.
4. Conclusion and future works

A Chatbot using the deep learning NMT model with Tensorflow has been developed. The Chatbot architecture was built from a BRNN and an attention mechanism. The Chatbot knowledge base is open domain, using the Reddit dataset, and it is giving some genuine replies. In future, the model will be rewarded for relevant and sentiment-appropriate replies; this will involve the Deep Reinforcement Learning (DRL) technique. The methodology used in implementing and training the Chatbot can also be used to train domain-specific Chatbots, such as in the scientific, healthcare, security, banking, e-market and educational domains. This approach makes building a Chatbot in any domain easier and can improve existing Chatbots based on a simple RNN architecture or other neural networks by using the attention mechanism as above. To implement a domain-specific Chatbot (for healthcare, education, etc.), one can download the specific subreddit of the particular domain. Future work will also include building a healthcare Chatbot guiding patients with diseases like COVID-19 (pandemic), diabetes, high blood pressure and heart disease, etc., by providing information about the inquired disease, foods one can eat and ways to deal with several emergency situations. This Chatbot will also be powered by a recommender system. In this paper, the novel idea was to analyze the MacBook Air as a system for studying and training deep neural network models. We find the MacBook Air to be a mediocre, basic-level system for deep learning. This result can help beginning students or other professionals to choose a system wisely before starting with deep learning.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] B. Heller, M. Proctor, D. Mah, L. Jewell, B. Cheung, Freudbot: An investigation of Chatbot technology in distance education, in: EdMedia+ Innovate Learning, Association for the Advancement of Computing in Education (AACE), 2005, pp. 3913–3918.
[2] J. Beaudry, A. Consigli, C. Clark, K.J. Robinson, Getting ready for adult healthcare: designing a Chatbot to coach adolescents with special health needs through the transitions of care, J. Pediatric Nursing 49 (2019) 85–91.
[3] R. Sutoyo, A. Chowanda, A. Kurniati, R. Wongso, Designing an emotionally realistic chatbot framework to enhance its believability with AIML and information states, Proc. Computer Sci. 157 (2019) 621–628.
[4] Y.J. Mo, J. Kim, J.-K. Kim, A. Mohaisen, W. Lee, Performance of deep learning computation with TensorFlow software library in GPU-capable multi-core computing platforms, in: 2017 Ninth International Conference on Ubiquitous and Future Networks (ICUFN), IEEE, 2017, pp. 240–242.
[5] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, et al., Tensorflow: Large-scale machine learning on heterogeneous distributed systems, arXiv preprint arXiv:1603.04467 (2016).
[6] M. Schuster, K.K. Paliwal, Bidirectional recurrent neural networks, IEEE Trans. Signal Process. 45 (11) (1997) 2673–2681.
[7] X. Li, W. Zhang, Q. Ding, Understanding and improving deep learning-based rolling bearing fault diagnosis with attention mechanism, Signal Process. 161 (2019) 136–154.
[8] N. Ståhl, G. Mathiason, G. Falkman, A. Karlsson, Using recurrent neural networks with attention for detecting problematic slab shapes in steel rolling, Appl. Math. Model. 70 (2019) 365–377.
[9] J.L. Elman, Distributed representations, simple recurrent networks, and grammatical structure, Machine Learning 7 (2–3) (1991) 195–225.
[10] Y. Su, C.-C.J. Kuo, On extended long short-term memory and dependent bidirectional recurrent neural network, Neurocomputing 356 (2019) 151–161.
[11] S. Jeong, M. Ferguson, R. Hou, J.P. Lynch, H. Sohn, K.H. Law, Sensor data reconstruction using bidirectional recurrent neural network with application to bridge monitoring, Adv. Eng. Informatics 42 (2019) 100991.
[12] H. Jeon, W.H. Lee, S.W. Chung, Load unbalancing strategy for multicore embedded processors, IEEE Trans. Computers 59 (10) (2010) 1434–1440.
[13] I. Sutskever, O. Vinyals, Q.V. Le, Sequence to sequence learning with neural networks, in: Advances in Neural Information Processing Systems, 2014, pp. 3104–3112.
[14] A. Sherstinsky, Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network, Physica D: Nonlinear Phenomena 404 (2020) 132306.
[15] W. Li, H. Wu, N. Zhu, Y. Jiang, J. Tan, Y. Guo, Prediction of dissolved oxygen in a fishery pond based on gated recurrent unit (GRU), Information Processing in Agriculture (2020).
[16] F. Liu, J. Zheng, L. Zheng, C. Chen, Combining attention-based bidirectional gated recurrent neural network and two-dimensional convolutional neural network for document-level sentiment classification, Neurocomputing 371 (2020) 39–50.
[17] Y. Kim, Convolutional neural networks for sentence classification, arXiv preprint arXiv:1408.5882 (2014).
[18] F. Cromieres, Kyoto-NMT: A neural machine translation implementation in Chainer, in: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, 2016, pp. 307–311.
[19] P. Guo, Y. Xiang, Y. Zhang, W. Zhan, Snowbot: An empirical study of building Chatbot using seq2seq model with different machine learning framework.
[20] A. Przegalinska, L. Ciechanowski, A. Stroz, P. Gloor, G. Mazurek, In bot we trust: A new methodology of Chatbot performance measures, Business Horizons 62 (6) (2019) 785–797.
[21] C.-W. Liu, R. Lowe, I.V. Serban, M. Noseworthy, L. Charlin, J. Pineau, How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation, arXiv preprint arXiv:1603.08023 (2016).
[22] W. Yu, I.Y. Kim, C. Mechefske, Remaining useful life estimation using a bidirectional recurrent neural network based autoencoder scheme, Mech. Systems Signal Process. 129 (2019) 764–780.
[23] H. Li, M.R. Min, Y. Ge, A. Kadav, A context-aware attention network for interactive question answering, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 927–935.
[24] Z. Yang, D. Yang, C. Dyer, X. He, A. Smola, E. Hovy, Hierarchical attention networks for document classification, in: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016, pp. 1480–1489.
[25] W. Chan, N. Jaitly, Q.V. Le, O. Vinyals, Listen, attend and spell, arXiv preprint arXiv:1508.01211 (2015).
[26] Y. Hua, L. Mou, X.X. Zhu, Recurrently exploring class-wise attention in a hybrid convolutional and bidirectional LSTM network for multi-label aerial image classification, ISPRS J. Photogrammetry Remote Sensing 149 (2019) 188–199.
[27] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, Y. Bengio, Show, attend and tell: Neural image caption generation with visual attention, in: International Conference on Machine Learning, 2015, pp. 2048–2057.
[28] R. Kumar, A. Kumar, P. Ahmed, A benchmark dataset for devnagari document recognition research, in: 6th International Conference on Visualization, Imaging and Simulation (VIS'13), Lemesos, Cyprus, 2013, pp. 258–263.
[29] https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/?st=j9udbxta&sh=69e4fee7 [Reddit dataset].