Sequence-to-sequence Bangla Sentence Generation with LSTM Recurrent Neural Networks

Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 152 (2019) 51–58
www.elsevier.com/locate/procedia

International Conference on Pervasive Computing Advances and Applications - PerCAA 2019

Md. Sanzidul Islam a,∗, Sadia Sultana Sharmin Mousumi a, Sheikh Abujar b, Syed Akhter Hossain c

a Student, Dept. of CSE, Daffodil International University, Dhaka-1207, Bangladesh
b Lecturer, Dept. of CSE, Daffodil International University, Dhaka-1207, Bangladesh
c Dept. Head, Dept. of CSE, Daffodil International University, Dhaka-1207, Bangladesh

Abstract

Sequence-to-sequence text generation is an efficient approach for automatically converting the script of a word from a source sequence to a target sequence. Text generation is an application of natural language generation which is useful in sequence modeling tasks such as machine translation, speech recognition, image captioning, language identification, video captioning and much more. In this paper we discuss Bangla text generation using a deep learning approach, Long Short-Term Memory (LSTM), a special kind of RNN (Recurrent Neural Network). LSTM networks are suitable for analyzing sequences of text data and predicting the next word; an LSTM is a respectable solution when the goal is to predict the very next point of a given time sequence. In this article we propose an artificial Bangla text generator with LSTM, one of the earliest for this language, and the model is validated with a satisfactory accuracy rate.

© 2019 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the International Conference on Pervasive Computing Advances and Applications – PerCAA 2019.

Keywords: Language Modeling; Text Generation; NLP; Bangla Text; Sequence-to-sequence; RNN; LSTM; Deep Learning; Machine Learning
1. Introduction

Recurrent neural networks (RNNs) are a type of neural network designed for capturing information from sequences or time series data. An RNN is an extension of the feed-forward neural network, and it differs from other general neural network architectures in that it can handle inputs of variable length. Schmidhuber and Hochreiter proposed the Long Short-Term Memory (LSTM) technique in 1997 [19]. It addresses the vanishing gradient problem by adding some extra gating machinery, is very efficient and performs better than a plain RNN; it was like a revolution over recurrent neural networks. LSTM works well on sequence-based tasks and on any type of sequential data.
∗ Corresponding author. Tel.: +880 1736752047
E-mail address: [email protected]

1877-0509 © 2019 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the International Conference on Pervasive Computing Advances and Applications – PerCAA 2019.
10.1016/j.procs.2019.05.026
A plain RNN cannot handle backpropagation over long sequences very well, but an LSTM can. An RNN has a memory limitation, whereas an LSTM has no such memory problem with long-range dependencies. RNNs suffer from the same vanishing (or, less notoriously, exploding) gradient problem as fully connected networks, whereas an LSTM keeps the gradient flowing properly. LSTM is better than RNN because LSTMs are explicitly designed to avoid the long-term dependency problem: remembering information for long periods of time is practically their default behavior, not something they struggle to learn.
The neural network we built was trained on a Bangla newspaper corpus. We collected 917 days of newspaper text from the Prothom Alo online edition, and the training dataset contains properties like the following. Web scraping with Python helped a lot in automating this collection.
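As an illustration of how such a corpus can be collected automatically, the following is a minimal scraping sketch; the article URLs, the paragraph selector and the output file name are assumptions rather than the actual collection pipeline used for this dataset.

```python
import requests
from bs4 import BeautifulSoup

def scrape_article_text(url):
    """Download one article page and return its visible paragraph text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Selecting all <p> tags is an assumption; real pages need inspection.
    return "\n".join(p.get_text(strip=True) for p in soup.find_all("p"))

# Hypothetical usage: append the text of each collected article URL to one corpus file.
article_urls = []  # to be filled with Prothom Alo article URLs gathered from the archive
with open("bangla_corpus.txt", "w", encoding="utf-8") as corpus:
    for url in article_urls:
        corpus.write(scrape_article_text(url) + "\n")
```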
2. Literature Review

We are proposing a model which can generate sequence-to-sequence Bangla text. There are many research and development works in this field, but we can hardly find text generation works with LSTM for the Bangla language. That is why we decided to build our own dataset and our own prediction model.
Naveen Sankaran et al. proposed a formulation in which the recognition task is treated as the training of a sequential translation model [1]. They worked on converting words from a document directly into a Unicode sequence.
Praveen Krishnan et al. introduced an OCR system that pursues a unified architecture across seven Indian languages together with a segmentation-free method [2]. Their system was designed to support continuous learning while in use, for example from continuous user input. They worked with the BLSTM method, a variant of the general LSTM.
The results of the SIGMORPHON 2016 shared task indicated that the attentional sequence-to-sequence model of Bahdanau et al. is well suited for this task [4] [5].
Robert Ostling and Johannes Bjerva proposed a model constructed with a sequence-to-sequence artificial neural network and an LSTM architecture, which drew considerable attention from enthusiasts [6].
Yasuhisa Fujii et al. considered line-level script identification in the context of multilingual OCR. They evaluated several alternatives of an encoder-summarizer method within an up-to-date multilingual OCR framework, using an evaluation set of multi-domain line images from 232 languages in 30 scripts [7].
A DNN-based SPSS system was built by Sivanand Achanta et al., which represents the acoustic parametric sequences of utterances with a single vector using sequence-to-sequence auto-encoders [8].
Mikolov et al. established the importance of distributed representations and the ability to model arbitrarily long dependencies using recurrent RNN-based language models [9] [10].
Sutskever et al. produced meaningful sentences by adapting an RNN and learning from a character-level corpus [11]. They introduced a new RNN variant that uses multiplicative connections.
Karpathy et al. showed that an RNN language model is effective at generating image descriptions on top of a pre-trained model by training the neural network with an RNN [12]. They constructed a multimodal RNN architecture.
Zhang and Lapata also report remarkable work using RNNs to create Chinese poetry [13]. It was a good initiative at that time, able to generate several lines of a Chinese poem automatically.
Mairesse and Young proposed a phrase-based NLG method based on factored LMs that can be learned from a semantically aligned corpus [14]. They focused on crowd-sourced data and showed how to work with it.
Even though active learning was also suggested to enable learning online directly from users, the need for human-annotated alignments limits the scalability of the scheme by Mairesse et al. [15].
One more related approach, by Angeli et al., casts NLG as a pattern extraction and matching problem [16].
Kondadadi et al. show that the outputs can be further improved by an SVM ranker, making them comparable to human-authored texts [17]. They proposed an end-to-end generation technique with some local decisions.
Subhashini Venugopalan, Marcus Rohrbach and Raymond Mooney suggest a novel sequence-to-sequence model to generate captions for videos. They produce descriptions with a sequence-to-sequence model in which frames are first read sequentially and words are then generated sequentially [18].
3. Method Discussion

The LSTM network is a special type of RNN. An RNN is a neural network that attempts to model sequence- or time-dependent behavior. It does this by feeding the output of a neural network layer at time t back into the input layer at time

t + 1 (1)
Recurrent neural networks can be described as being unrolled programmatically during training and testing, so we can see something like the following [20].

The figure shows that a new word is supplied at every step together with the previous output (i.e. h_{t-1}), and that output is also passed on to the next step.
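To make the unrolling concrete, here is a minimal NumPy sketch of a plain RNN unrolled over a sequence; the weight names and the tanh update are illustrative assumptions rather than the exact network used in this work.

```python
import numpy as np

def rnn_unrolled(x_seq, W_x, W_h, b):
    """Plain RNN unrolled over time: the output h_t at time t is fed
    back in at time t + 1, together with the next input x_t."""
    h = np.zeros(W_h.shape[0])      # initial hidden state h_0
    outputs = []
    for x_t in x_seq:               # one iteration per time step
        h = np.tanh(b + x_t @ W_x + h @ W_h)   # h_{t-1} feeds the next step
        outputs.append(h)
    return outputs
```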
In principle, RNNs should be able to handle long-term dependencies, but in practice they struggle to learn them. The issue was examined in detail by Hochreiter (1991) and Bengio et al. (1994), who showed some fairly fundamental reasons why it can be difficult. That is why we use LSTM, an improved form of RNN.
LSTMs (Long Short-Term Memory networks) are a special type of RNN capable of learning long-term dependencies. They were introduced by Schmidhuber and Hochreiter in 1997 and were later refined and popularized by many people.
LSTMs are explicitly designed to avoid the long-term dependency problem. Keeping information over long periods of time is practically their default behavior, not something they struggle to learn. The graphical representation of an LSTM cell can be shown as below [20].
4. Proposed Methodology

Working with Bangla is still quite difficult, as there are not many resources and R&D works in this field. Processing Bengali text data is therefore a difficult task: the raw data is noisy and not directly suitable for machine learning or deep learning approaches. We did some preprocessing to make our dataset noise-free so that it performs at its best in the neural network; a sketch of this kind of cleaning follows.
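A minimal sketch of the kind of preprocessing meant here, assuming the cleaning consists of keeping only Bangla characters, normalizing whitespace and dropping empty lines; the exact steps used for the original dataset are not reproduced.

```python
import re

def clean_bangla_line(line):
    """Keep only Bangla characters and spaces, then normalize whitespace."""
    # The Bangla Unicode block U+0980-U+09FF is assumed to be sufficient here.
    line = re.sub(r"[^\u0980-\u09FF\s]", " ", line)
    return re.sub(r"\s+", " ", line).strip()

def build_training_lines(path):
    """Read the raw corpus file and return cleaned, non-empty lines."""
    with open(path, encoding="utf-8") as f:
        return [cleaned for cleaned in (clean_bangla_line(l) for l in f) if cleaned]
```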
In general, an LSTM network is complex compared to other methods, and it demands a lot of hardware and machine capability. Its interior activities and logic flow can be presented as below.

1) Input: First, the input is squashed with the tanh activation function to lie between -1 and 1. This can be expressed as

g = tanh(b^g + x_t U^g + h_{t-1} V^g)

where U^g and V^g are the weights applied to the input and the previous cell output, and b^g acts as the input bias. Note that the superscript g refers only to the input weights. The squashed input is then multiplied element-wise by the output of the sigmoid-activated input gate i, which decides which parts of the input are worth keeping:

g ◦ i (4)
The element-wise product of the previous state and the forget gate output determines how much of the previous state is retained:

s_{t-1} ◦ f (6)

The internal state for the current time step is then obtained by adding the gated input to this retained state:

s_t = s_{t-1} ◦ f + g ◦ i (7)

So the final cell output, with tanh squashing and the output gate o, can be expressed as:

h_t = tanh(s_t) ◦ o (9)
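The cell equations above can be summarized in a small NumPy sketch of a single LSTM step; the parameter names (U, V and b per gate) follow the notation of [20] and are assumptions about shapes, not the trained model's actual weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, s_prev, p):
    """One LSTM cell step following equations (4), (6), (7) and (9) above."""
    # Input squashing: g = tanh(b^g + x_t U^g + h_{t-1} V^g)
    g = np.tanh(p["b_g"] + x_t @ p["U_g"] + h_prev @ p["V_g"])
    # Sigmoid-activated input, forget and output gates
    i = sigmoid(p["b_i"] + x_t @ p["U_i"] + h_prev @ p["V_i"])
    f = sigmoid(p["b_f"] + x_t @ p["U_f"] + h_prev @ p["V_f"])
    o = sigmoid(p["b_o"] + x_t @ p["U_o"] + h_prev @ p["V_o"])
    s_t = s_prev * f + g * i        # internal state update, eq. (7)
    h_t = np.tanh(s_t) * o          # cell output, eq. (9)
    return h_t, s_t
```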
Finally, a very common form of the LSTM network equations can be written following Colah's famous blog post [21]. That is how the Long Short-Term Memory (LSTM) network performs its operations sequentially, and that is why it performs so well on any type of sequential data. The LSTM network activity flow can be presented as in the figure given below, where we can notice the time-step terms that make the LSTM different.
Generally a neural network contains three kinds of layers: one for taking input, one for doing the computation, and one for giving the decision. An embedding layer was taken as the initial (input) layer of the neural network. Here a single line of text is fed in at a time and trained sequentially.
Then comes the hidden layer, which is the main LSTM layer; we used 100 units for it.
The final, output layer applies an activation function named softmax. Softmax turns a vector of scores into a probability distribution over n events; it computes the probability of each target class across all possible target classes:
S(y_i) = \frac{e^{y_i}}{\sum_{j} e^{y_j}} (10)
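A minimal Keras sketch of the three-layer architecture described above (embedding input layer, an LSTM hidden layer with 100 units, and a softmax output layer); the vocabulary size, embedding dimension and sequence length are assumed values, not the ones used in the original experiments.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 10000   # assumed vocabulary size
EMBED_DIM = 100      # assumed embedding dimension
SEQ_LEN = 10         # assumed number of input words per training line

model = Sequential([
    # Input layer: embeds word indices into dense vectors
    Embedding(VOCAB_SIZE, EMBED_DIM, input_length=SEQ_LEN),
    # Hidden layer: the main LSTM layer with 100 units
    LSTM(100),
    # Output layer: softmax probabilities over the vocabulary, eq. (10)
    Dense(VOCAB_SIZE, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
```

Training would then call model.fit on sequences of word indices paired with one-hot encoded next words.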
The LSTM model is a little different from a validation perspective. Determining performance with cross validation or train-test accuracy, as is usual for a CNN model [22], is not practical here; it is actually better to test the model with real data and inspect its output. We trained on only one week's newspaper corpus because of hardware limitations. Finally we tested with different Bengali seed words, and the model generated text conditioned on the previous text. Here are two Bangla sentences generated with our model:
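The testing procedure can be sketched as a simple next-word generation loop; the tokenizer and model refer to the assumed Keras setup sketched earlier, and greedy argmax decoding is an assumption about the sampling strategy rather than the original implementation.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def generate_text(model, tokenizer, seed_text, n_words=10, seq_len=10):
    """Repeatedly predict the most probable next word and append it to the seed."""
    text = seed_text
    for _ in range(n_words):
        # Encode the current text and keep only the last seq_len tokens
        encoded = tokenizer.texts_to_sequences([text])[0]
        encoded = pad_sequences([encoded], maxlen=seq_len, truncating="pre")
        # Pick the most probable next word from the softmax output
        probs = model.predict(encoded, verbose=0)[0]
        next_word = tokenizer.index_word.get(int(np.argmax(probs)), "")
        if not next_word:
            break
        text += " " + next_word
    return text
```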
5. Future Work

In this paper we worked with limited data due to hardware limitations; later on we will enhance our dataset. In the future we will improve the model to achieve multi-task sequence-to-sequence text generation and multi-way translation, such as Bengali article and caption generation. Furthermore, we aim to pursue the possibility of extending our model to Bangla regional languages. We also plan to work on Bangla Sign Language [23] generation from sequential image data, treating it just like ordinary spoken language.
References
[1] Naveen Sankaran T, Aman Neelappa, C V Jawahar, Devanagari Text Recognition: A Transcription Based Formulation, 12th International
Conference on Document Analysis and Recognition, 25-28 Aug. 2013, Washington DC, USA.
[2] Praveen Krishnan, Naveen Sankaran T, Ajeet Kumar Singh, C V Jawahar, Towards a Robust OCR System for Indic Scripts, International Work-
shop on Document Analysis Systems, Centre for Visual Information Technology, International Institute of Information Technology Hyderabad
- 500 032, INDIA, April 2014.
[3] Amir H. Jadidinejad, Neural Machine Transliteration: Preliminary Results, arXiv:1609.04253v1 [cs.CL] 14 Sep 2016.
[4] Ryan Cotterell, Christo Kirov, John Sylak-Glassman, David Yarowsky, Jason Eisner, and Mans Hulden. The SIGMORPHON 2016 shared task: Morphological reinflection. In Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology. Association for Computational Linguistics, Berlin, Germany, pages 10–22, 2016.
[5] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio, Neural machine translation by jointly learning to align and translate, CoRR
abs/1409.0473, 2014.
[6] Robert Ostling and Johannes Bjerva, SU-RUG at the CoNLLSIGMORPHON 2017 shared task: Morphological Inflection with Attentional
Sequence-to-Sequence Models, arXiv:1706.03499v1 [cs.CL] 12 Jun 2017.
[7] Yasuhisa Fujii, Karel Driesen, Jonathan Baccash, Ash Hurst and Ashok C. Popat, Sequence-to-Label Script Identification for Multilingual
OCR, Google Research, Mountain View, CA 94043, USA, arXiv:1708.04671v2 [cs.CV] 17 Aug 2017.
[8] Sivanand Achanta, KNRK Raju Alluri and Suryakanth V Gangashetty, Statistical Parametric Speech Synthesis Using Bottleneck Representa-
tion From Sequence Auto-encoder, Speech and Vision Laboratory, IIIT Hyderabad, INDIA, arXiv:1606.05844v1 [cs.SD] 19 Jun 2016.
[9] Tomas Mikolov, Martin Karafiát, Lukáš Burget, Jan Černocký, and Sanjeev Khudanpur, Recurrent neural network based language model, In Proceedings of InterSpeech, 2010.
[10] Tomas Mikolov, Stefan Kombrink, Lukas Burget, Jan H. Cernocky and Sanjeev Khudanpur, Extensions of recurrent neural network language
model, In ICASSP, 2011 IEEE International Conference on, 2011.
[11] Ilya Sutskever, James Martens and Geoffrey E. Hinton, Generating text with recurrent neural networks, In Proceedings of the 28th International
Conference on Machine Learning (ICML-11), ACM, 2011.
[12] Andrej Karpathy and Li Fei-Fei, Deep visual semantic alignments for generating image descriptions, CoRR, 2014.
[13] Xingxing Zhang and Mirella Lapata, Chinese poetry generation with recurrent neural networks, In Proceedings of the 2014 Conference on
EMNLP, Association for Computational Linguistics, October, 2014.
[14] Francois Mairesse and Steve Young, Stochastic language generation in dialogue using factored language models, Computer Linguistics, 2014.
[15] Francois Mairesse, Milica Gasic, Filip Jurccek, Simon Keizer, Blaise Thomson, Kai Yu and Steve Young, Phrase-based statistical language
generation using graphical models and active learning, In Proceedings of the 48th ACL, ACL 10, 2010.
[16] Gabor Angeli, Percy Liang, and Dan Klein, A simple domainindependent probabilistic approach to generation, In Proceedings of the 2010
Conference on EMNLP, EMNLP 10, Association for Computational Linguistics, 2010.
[17] Ravi Kondadadi, Blake Howald, and Frank Schilder, A statistical NLG framework for aggregated planning and realization, In Proceedings of the 51st Annual Meeting of the ACL, Association for Computational Linguistics, 2013.
[18] Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell and Kate Saenko, Sequence to Sequence Video
to Text, arXiv:1505.00487 [cs.CV] or arXiv:1505.00487v3 [cs.CV] 19 Oct. 2015.
[19] Hochreiter, Sepp, and Jürgen Schmidhuber. Long short-term memory. Neural Computation 9.8 (1997): 1735-1780.
[20] Adventuresinmachinelearning.com, Keras LSTM tutorial: How to easily build a powerful deep learning language model, 2018. [Online]. Available: http://www.adventuresinmachinelearning.com/keras-lstm-tutorial/ . [Accessed: 14- Aug- 2018].
[21] Colah.github.io, Understanding LSTM Networks, 2015. [Online]. Available: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ .
[Accessed: 14- Aug- 2018].
[22] Islam, Sanzidul, et al. ”A Potent Model to Recognize Bangla Sign Language Digits Using Convolutional Neural Network.” Procedia computer
science 143 (2018): 611-618.
[23] Islam, Md Sanzidul, et al. "Ishara-Lipi: The First Complete Multipurpose Open Access Dataset of Isolated Characters for Bangla Sign Language." 2018 International Conference on Bangla Speech and Language Processing (ICBSLP). IEEE, 2018.