CNN vs. LSTM For Turkish Text Classification

This document compares the performance of CNN and LSTM models for text classification in Turkish. It experiments with the TTC-3600 dataset and finds that both CNN and LSTM can efficiently classify Turkish text while achieving good performance. Related work in the document discusses previous research on text classification with Doc2Vec, Word2Vec, and TF-IDF features.


Melih Yayla, Computer Engineering Department, Hasan Kalyoncu University, Gaziantep, Turkey
Mustafa Diyar Demirkol, Electrical-Electronics Engineering Department, Hasan Kalyoncu University, Gaziantep, Turkey
Saed Alqaraleh, Computer Engineering Department, Hasan Kalyoncu University, Gaziantep, Turkey

Abstract—In this paper, the efficiency of two state-of-the-art text classification techniques, i.e., Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), for supporting Turkish text classification has been investigated. In addition, the effect of the main preprocessing steps such as Tokenization, Stop Word Elimination, Stemming, etc. has also been studied. Several experiments using the “TTC-3600” dataset were performed, and it has been observed that both CNN and LSTM can efficiently support the Turkish language and can achieve quite good performance. Regarding data preprocessing, the results indicated that such a process improves the performance; however, for the Turkish language, it is preferable to exclude stemming. Also, a comparison of feature extraction techniques for processing the Turkish language shows that Word2Vec outperforms TF-IDF.

Index Terms—Text Classification, Turkish Language, Convolutional Neural Networks, Long Short-Term Memory, Natural Language Processing.

978-1-6654-3603-8/21/$31.00 ©2021 IEEE

Authorized licensed use limited to: Istanbul Sabahattin Zaim Univ. Downloaded on November 03,2023 at 13:59:19 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

In today’s world, the possession of knowledge or information holds an important place for people, companies, or even states. However, the extraction of this information is an essential and hard task. To overcome this problem and to obtain the requested information, information retrieval (IR) systems were developed. In this paper, we investigate two text classification techniques, i.e., Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), for supporting Turkish text classification.

In general, text classification can be defined as the process of assigning predefined categories to text documents. One example is the classification of e-mail messages as spam or not. Another example is the automatic tagging of all incoming news with a subject, for example “art”, “football” or “movies”. Text classification is also one of the most popular study topics in the field of Natural Language Processing (NLP), which aims to classify the tagged texts into the related categories (classes). Nowadays, Naive Bayes, Support Vector Machine [3], Neural Network [4] and K-nearest neighbor [5] are frequently used for text classification. However, the impressive performance achieved by Neural Networks, especially CNN and LSTM, in fields such as image classification, content-based image retrieval, self-driving cars, and many other fields has attracted researchers to use such approaches for text processing tasks such as translation, classification, etc.

II. RELATED WORK

In this section, we summarize recent research, developments, and solutions related to text classification. The study of Çelenli et al. aims to develop a centroid-based classifier. In this study, documents are represented by vectors using the Paragraph Vector model (Doc2Vec). The results of their experiments indicate that using the Distributed Bag of Words (DBOW) architecture with five epochs of classifiers paired with document embedding vectors obtains the best accuracy. Interestingly, it has also been observed that using more epochs decreases the classification accuracy of Doc2Vec. On the other hand, when data is scarce, Doc2Vec outperforms the SVM classifiers that use tf-idf representations [6].

Another study was done by Şahin, which compares the use of word2vec in the classification of seven different categories of Turkish texts with the classical bag of words (BoW) text representation. Here, each sample was expressed by a vector that is the average of the vectors of the sample’s words; then, SVM was used as the classifier. The experiments were conducted for different parameter settings of word2vec, and their effect on classification success was examined. The study observed that word2vec, with a best-measured F-score of 0.92, is better than the tf-idf weighted BoW method, with a best-measured F-score of 0.89 [2].

The classification performance of heterogeneous classifier ensembles for Turkish and English languages was investigated by Kilimci et al. For this purpose, some base learners such as multinomial naive Bayes (MNB), support vector machine (SVM), multivariate Bernoulli naive Bayes (MVBN), convolutional neural network (CNN), and random forest (RF) were used. Here, to merge the decisions of these base learners, both majority voting and stacking methods were used. Also, Word2vec and TF-IDF were used for feature representation. By applying base learners and heterogeneous ensemble systems with majority voting and stacking methods on 8 different datasets represented by TF-IDF or Word2vec, RF and CNN obtained the best results, and stacking outperformed majority voting [7].

Similar to the previous study, in [8], the effect of ensemble models while classifying Turkish texts using some classification algorithms such as naive Bayes (NB), J48 Decision Tree, K-Nearest Neighbor (K-NN), and support
vector machine (SVM) as base classifiers for the ensemble learning models was investigated. In this study, TTC-360 dataset, which consists of 13 categories such as economy, sport, art, etc., was used. The results of [8] showed that base classifiers with Boosting and Rotation Forest ensemble models were able to achieve the best accuracy rate.

On the other hand, Torunoglu et al. [9] studied the effect of different preprocessing steps on Turkish text classification. For preprocessing, stemming, stop word filtering and word weighting steps were applied. For the classification, Naïve Bayes, Naïve Bayes Multinomial (mnNB), Support Vector Machines (SVM), and K-Nearest Neighbor were used. According to their results, stemming has the lowest impact on text classification. However, they stated that stemming is more appropriate for information retrieval tasks.

III. MECHANISM OF TEXT CLASSIFICATION

In general, the following components can be considered the main ones of Text Classification.

A. Text Gathering

This step collects the samples and datasets that can be used for building the classification system and also for investigating the performance of such a system. In this study, as we aim to process the Turkish language, the “TTC-3600” Turkish data set is used. It was constructed using 3600 Turkish news items and articles, manually annotated with the following topics: Ekonomi (Economy), Kültür-Sanat (Art and Culture), Sağlık (Health), Siyaset (Politics), Spor (Sport), and Teknoloji (Technology), where each topic has 600 articles.

B. Text Pre-Processing

In general, text preprocessing is one of the important steps in information retrieval and analysis systems. It prepares the text in a more useful, workable, and proper form. The Turkish language belongs to a branch of the Altaic language family, and it is an agglutinative language, in which words are derived and inflected by suffixes. Also, the Turkish language has some specific characteristics, such as:
1) There is no masculine or feminine gender as in the Arabic and German languages.
2) Nouns that come after numbers do not take the plural suffix.
3) There are thickness-thinness and flatness-roundness harmonies in Turkish. According to the first harmony, the vowels in a word are either thick or thin, and according to the second harmony, they are either flat or round.
4) The consonants f, j, and h do not exist in original Turkish words, while they exist in words that were borrowed from other languages.
5) The number of consonants that can be found at the beginning of a word is limited. These consonants are “b, c, d, g, k, s, t, v, y”.
6) In the case the consonant c is at the beginning of the word, it will be changed to another consonant, ç.
7) The consonant n appears at the beginning of only “what” and its derivative words: what, when, why, how, and where (in Turkish, ne, ne zaman, neden, nasıl, and nerede, respectively).
8) The consonant p found at the beginning of some words was obtained by changing “b”.
9) In the Turkish alphabet, there are no “x, w, q” letters, whereas there are the letters “ç, ğ, ı, ö, ü”, which differ from the English alphabet.
10) Turkish words are read as written [11].

Some of the text preprocessing steps that are investigated in this study are, in order:

Tokenization: The first step of preprocessing is tokenization, i.e., the input text is turned into word tokens [14].

Stop Word Elimination: Stop words can be defined as the most used words in a language. However, most of the stop words have no meaning by themselves. If these words are eliminated, it becomes easier to use the most meaningful words. For stop word elimination, there are several libraries like sklearn that can be used to eliminate such words [10].

Lowercasing: One of the common text preprocessing techniques is lowercasing all characters in the text. This method helps to keep the output consistent. It is an appropriate technique for most NLP problems [10]. In other words, as shown in Table I, lowercasing creates a standard for the datasets. For example, it assists search engines in creating search indexes in a standard way, which improves their effectiveness [10].

TABLE I: Lowercase Example.
Raw                             Lowercased
İsTanBuL, İSTANBUL, İsTaNbUl    istanbul
KiTAP, KitAp, KiTaP             kitap

Stemming: Another text preprocessing technique is stemming. Stemming is a method that finds the root of the words [10]. Different techniques are used to perform this process. For the Turkish language, the most common algorithm is the SnowBall algorithm. In this algorithm, there are some rules that the coder has followed due to the Turkish language morphology [12]. The rules are:

• Turkish language has only one affix type, which is the suffix.
• In Turkish, it is not possible to have a plural suffix after a possessive suffix.
• In Turkish, a suffix can have more than one allomorph to have sound harmony.
• In Turkish, each vowel expresses a different syllable.
• In Turkish, most of the monosyllabic words are the stem.
• In Turkish, if a word possesses nominal verb suffixes, they come at the end of the word.
• In Turkish, a suffix can be treated as a noun suffix and a nominal verb suffix [12].

The different sound structures of a morpheme (although it comes to mind at the first moment as a word, we can say that it is a fragmented form of the word in a sense) is called

allomorph. Some different versions of allomorphs are listed in Table II.

TABLE II: Suffix Allomorphs [12].
Letter    Allomorph
U         ı, i, u, ü
C         c, ç
A         a, e
D         d, t
I         ı, i

Derivational suffixes create nouns like the suffixes -tion or -ness in English. Different types of suffixes are shown in Table III.

TABLE III: Derivational Suffixes [12].
a/a    Suffixes
1      -lUk
2      -CU
3      -Cuk
4      -lAş
5      -lA
6      -lAn
7      -CA
8      -lU
9      -sUz

Also, there are some verb suffixes that are used to create time tenses. Some of the verb suffixes are shown in Table IV.

TABLE IV: Nominal Verb Suffixes [12].
a/a    Suffix
1      -(y)Um
2      -sUn
3      -(y)Uz
4      -sUnUz
5      -lAr
6      -md
7      -n
8      -k
9      -nUz
10     -Dur
11     -cAsInA
12     -(y)DU
13     -(y)sA
14     -(y)mUş
15     -(y)ken

On the other hand, there are some noun suffixes. These suffixes change words and meanings. Some of the noun suffixes are shown in Table V.

TABLE V: Noun Suffixes [12].
a/a    Suffixes
1      -lAr
2      -(U)m
3      -(U)mUz
4      -(U)n
5      -(U)nUz
6      -(s)U
7      -lArl
8      -(y)U
9      -nU
10     -(n)Un
11     -(y)A
12     -nA
13     -DA
14     -Nda
15     -Dan
16     -nDAn
17     -(y)lA
18     -ki

Normalization: One of the important processes of text preprocessing is the normalization step. Normalization is a method that transforms a text into a standard form [9]. Normalization is important for text processing, especially in informal writing, where misspellings and abbreviations occur very often. This process affects the analysis of text dramatically [9], since people generally use the letters “c, g, i, o, u” instead of the Turkish-specific ones “ç, ğ, ı, ö, ü”. This situation may cause some problems when classifying such samples. Also, most people do not use vowels when they are texting or posting something. This is another problem that should preferably be solved. There are some text normalization methods, such as dictionary mappings, statistical machine translation, and spelling-correction-based approaches, that can be used in such cases [10]. For the Turkish language, there is an open-source library named Zemberek. This library uses a spelling-correction-based approach to check if a word is correctly written and gives proposals for a word. In other words, Zemberek uses some heuristics, look-up tables and language models for text normalization [13]. It is worth mentioning that, based on our ongoing experimental work, it has been noticed that word correction in general, and using the Zemberek tool in particular, can improve the overall performance of Turkish text classification systems by approximately 5%.

C. Feature Extraction

In the processing of texts, the words in the text are categorical and discrete features. It is important to encode such data to use it in the preferred algorithms. The process of extracting a list of words from the text and mapping them to a feature set that can be used by a classifier is called text feature extraction. Different types of feature extraction methods are mentioned below.

1) Traditional Methods: Count Vectorization, TF-IDF Vectorizer, and HashingVectorizer are the traditional methods of feature extraction for Text Classification [15]. In this study, TF-IDF, which is considered the state-of-the-art traditional feature extraction method, is used.

TF-IDF Vectorizer: Term Frequency can be explained as the number of appearances of a word in the related text document [15]. Equation 1 can be used to calculate the Term Frequency.

$$\mathrm{TF}(w_i) = \frac{O(w_i)}{N} \tag{1}$$

O(w_i) represents the number of occurrences of the ith word, and N is the total number of words in the used vector.

Inverse document frequency (IDF) measures how important a term is. In general, it is proved that stop words appear in

most texts frequently but have little importance. Hence, in the case of IDF, the highest score is assigned to the rare words, and a low score is assigned to the frequent words [15]. Inverse document frequency can be calculated using Equation 2.

$$\mathrm{IDF}(w_i) = \log\frac{N}{T} \tag{2}$$

T represents the number of documents that include the ith word. (In Equation 2, N is taken to be the total number of documents, as in the standard IDF definition.)

To get the overall score, i.e., TF-IDF, IDF is multiplied by TF, as shown in Equation 3 [15].

$$\mathrm{TF\text{-}IDF}(w_i) = \mathrm{TF}(w_i) \times \mathrm{IDF}(w_i) \tag{3}$$

2) Word Embedding Methods: Word embedding is a natural language modeling technique that maps words or expressions to equivalent numerical vector(s). This process helps machine learning methods to understand the given inputs by providing a vector representation of the inputs. Also, this method has some other advantages, like reducing the dimension of words and capturing the similarity of words that appear in similar contexts [20]. Word2vec, GloVe, and Fasttext are examples of word embedding approaches. Word2vec, which is considered the most suitable option for the Turkish language as stated in [1], is used in this study.

Word2vec: It is an unsupervised and prediction-based model that expresses words in vector spaces. It was invented in 2013 by Google researcher Tomas Mikolov and his team. Word2vec has two sub-methods: CBOW (Continuous Bag of Words) and Skip-Gram. Both methods are similar in general [23], and their output is represented by Equation 4.

$$\mathrm{Word2vec}(W_i) = [F_1, F_2, F_3, \ldots, F_m] \tag{4}$$

where, in our case, m is set to 300, and each F is a floating-point number.

D. Classification

The automatic classification of documents according to predefined categories is currently attracting researchers’ attention. Unsupervised, supervised and semi-supervised are the three main methods for text classification. In the last decade, the automatic text classification task has seen significant improvements using artificial intelligence algorithms such as Neural Networks, Bayesian classifiers, Decision Trees, support vector machines (SVMs), etc. In this study, the performance of CNN and LSTM was investigated, and their details are summarized below.

1) Convolutional Neural Network: The Convolutional Neural Network, which is a kind of Multilayer Perceptron, is a feed-forward neural network. It was inspired by the visual center of animals [18], and its mathematical convolution process can be considered as the response of a neuron to stimuli from its stimulus field [17], [19]. The architecture of a CNN consists of one or more convolutional layers and sub-sampling layers, followed by fully connected one(s) [16]. In the convolutional layers, the input is filtered and feature maps are obtained. In the sub-sampling layers, the feature maps are sampled. Finally, the fully connected layer generates the output based on the representation (vector) from the previous layers. In other words, each layer produces some features based on the result of the previous layer(s), and the overall structure (model) can learn the feature hierarchy by combining and training all layers. The aim here, starting with the low-level details, is to achieve effective learning up to the high-level details [16].

Fig. 1: Structure of the used CNN

2) Long Short-Term Memory: A Recurrent Neural Network (RNN) is a network of interconnected neurons, where neurons are connected with each other by weights. These kinds of neural networks are very helpful for inputs of changing sizes, automatic translation, automatic pattern recognition, etc. The flow of information in these kinds of neural networks is bidirectional, which preserves the order of the data, and they can handle long input sequences, as such a network is built on a loop with internal memory. In 1997, Hochreiter and Schmidhuber proposed a new method named Long Short-Term Memory (LSTM) that can be defined as an extension (improved version) of the RNN. LSTM can deal with the vanishing gradient problem by virtue of its memory, which enables deleting, writing, and reading information through three gates: the input gate, which permits or blocks updates; the forget gate, which deactivates an insignificant neuron depending on the weights learned by the algorithm; and the output gate, which is the control gate of the neurons [20] [21] [22].

Fig. 2: Structure of the used LSTM

E. Test & Result

In this study, multiple experiments were performed using the previously mentioned dataset “TTC-3600”. In the first experiment, the studied feature extraction methods, i.e., Word level TF-IDF, N-gram level TF-IDF, Characters level TF-IDF, and the word2vec word embedding, were compared to find out the most suitable approach for the Turkish language. In the second one, the effect of text preprocessing on the performance of both CNN and LSTM was investigated. Finally, the two state-of-the-art classification approaches, CNN and LSTM, were compared as well.

Experiment 1: Comparing the Performance of Feature Extraction Techniques for Processing Turkish Language

In this experiment, the accuracy of feature extraction techniques was measured. For this experiment, four feature extraction methods, namely the Word level TF-IDF method, the N-gram level TF-IDF method, the Characters level TF-IDF method, and Word2vec word embedding, were used. As a first attempt, the Word level TF-IDF method was used. After applying this method, CNN had 0.2 accuracy and LSTM had 0.178 accuracy. After that, the N-gram level TF-IDF method was used. By using this approach, CNN and LSTM had the same accuracy, which is 0.2. As a third attempt, the Characters level TF-IDF method was used. By applying this approach, CNN had 0.178 accuracy whereas LSTM had 0.26 accuracy. As a last attempt, the Word2vec method was used. After applying this method, CNN had 0.861 accuracy and LSTM had 0.822 accuracy. Overall, as shown in Table VI, it is clear that CNN with the Word2vec method has the highest accuracy.

TABLE VI: The accuracy of feature extraction with the pre-processing operations.
Approach    Word level TF-IDF    N-gram level TF-IDF    Characters level TF-IDF    Word2Vec
CNN         0.2                  0.2                    0.178                      0.861
LSTM        0.178                0.2                    0.26                       0.822

Experiment 2: The Effects of Pre-Processing the Turkish Text

In the second experiment, the effect of pre-processing on the accuracy of the classification process was measured. As shown in Table VII, in the first step, none of the pre-processing methods were used. In this setting, CNN had 0.896 accuracy and LSTM had 0.8 accuracy. Secondly, the full pre-processing steps were used. After applying full pre-processing, CNN had 0.913 accuracy whereas LSTM had 0.9 accuracy. As the last part, pre-processing without stemming was also investigated, and CNN had 0.922 accuracy while LSTM had 0.91 accuracy. After all these attempts, it is clear that pre-processing without stemming allows CNN to have the highest accuracy.

TABLE VII: The accuracy of classification with and without the pre-processing operation.
Approach    Without Pre-processing    Full Pre-processing    Pre-processing Without Stemming
CNN         0.896                     0.913                  0.922
LSTM        0.8                       0.9                    0.91

Experiment 3: CNN vs. LSTM

In this experiment, based on the results of the previous two experiments, two systems for CNN and LSTM have been implemented. These systems use the Word2vec feature extraction technique, which was found to be the best among the studied methods. Also, pre-processing without stemming was used, which provided the best accuracy among the studied pre-processing settings. To obtain more accurate results, this experiment was repeated in five iterations. The results of all the iterations and their average are shown in Table VIII. Overall, LSTM had an accuracy of 0.9294 whereas CNN had an accuracy of 0.9278. However, even though the difference between the obtained accuracies is small, the execution time of CNN is almost 1/3 of the time of LSTM.

TABLE VIII: The Accuracy Comparison of CNN and LSTM.
Iteration    CNN       LSTM
1st          0.93      0.926
2nd          0.935     0.939
3rd          0.922     0.917
4th          0.926     0.93
5th          0.926     0.935
avg          0.9278    0.9294

Overall, based on our results and the results obtained by Doğru H.B. et al. [24], where the two studies were conducted using the same dataset, i.e., “TTC-3600”, using deep learning such as CNN and LSTM improves the performance of Turkish text classification. In more detail, the traditional methods such as Support Vector Machines (SVM), Naive Bayes, and Random Forest were able to achieve an accuracy of 86.39%, 85.00%, and 84.17%, respectively. Hence, on average, CNN increases the accuracy by at least 6.39 percentage points over these traditional methods; similarly, LSTM increases the accuracy by at least 6.55 percentage points.

F. Conclusion and Future Work

In this study, various experiments were performed to observe the effect of different text classification steps and methods on the accuracy rates of Turkish text classification. The TTC-3600 dataset, which contains news collected from six different agencies and news portals and is also available online, is used.

Also, two popular classifiers, CNN and LSTM, are used. To find the best accuracy rates, different versions of pre-processing and feature extraction methods are used with the mentioned classifiers. First of all, different feature extraction

methods, which are word level TF-IDF, N-gram level TF-IDF, characters level TF-IDF, and word2vec, are used. After this experiment, it was found that the word2vec method has the best accuracy rate. After that, the effect of pre-processing methods is evaluated by calculating the accuracy of the classifiers with pre-processing, without pre-processing, and with pre-processing without stemming. It is clear that pre-processing without stemming has the best accuracy rate.

As the last step, five iterations were made with CNN and LSTM, using pre-processing without stemming and the word2vec method, to find the best approach. As a result, it is seen that the average accuracy of CNN is 0.9278, and the average accuracy of LSTM is 0.9294. It is observed that the accuracies are close, but regarding execution time, CNN requires almost 1/3 of the time of LSTM.

For future work, our study can be extended by using some state-of-the-art word embedding methods like ELMo and XLNet. Also, for the system classifier, some other artificial intelligence algorithms such as Support Vector Machine (SVM), Decision Tree, and Bayesian Classifier can be integrated with deep features (i.e., extracting the output of selected layer(s) of the used deep learning model).

REFERENCES
[1] Alqaraleh, Saed, and Merve Işık. (2020). Efficient Turkish tweet classification system for crisis response. Turkish Journal of Electrical Engineering & Computer Sciences 28, no. 6 (2020): 3168-3182.
[2] Sahin G., “Turkish document classification based on Word2Vec and SVM classifier”, 2018 3rd International Conference on Computer Science and Engineering (UBMK), 2018, Antalya, Turkey.
[3] Sebastiani, F., “Machine Learning in Automated Text Categorization”, ACM Computing Survey, pp. 1-47, 2002.
[4] S. N. Sivanandam, S. N. Deepa, “Principles of Soft Computing”.
[5] Gonde Guo, Hui Wang, David Bell, Yaxin Bi and Kieran Greer, “KNN Model Based Approach in classification”, pp. 986-996, 2003.
[6] Celenli H. I., Ozturk S. T., Sahin G., Gerek A., Ganiz M. C., “Document Embedding Based Supervised Methods for Turkish Text Classification”. 3rd International Conference on Computer Science and Engineering (UBMK), 20-23 September 2018, p. 477-482, Sarajevo, Bosnia-Hercegovina.
[7] Kilimci, Z., Akyokus, S.: Deep Learning- and Word Embedding-Based Heterogeneous Classifier Ensembles for Text Classification. Complexity 2018, 1-10 (2018).
[8] Kılınç, D. (2016). The Effect of Ensemble Learning Models on Turkish Text Classification. Celal Bayar University Journal of Science, 12 (2).
[9] Torunoğlu, D., Çakırman, E., Ganiz, M.C., Akyokuş, S., Gürbüz, M.Z. (2011). Analysis of Preprocessing Methods on Classification of Turkish Texts. INISTA 2011, June, 2011, Istanbul, Türkiye.
[10] KDnuggets. (2019). All you need to know about text preprocessing for NLP and Machine Learning - KDnuggets. [online] Available at: https://www.kdnuggets.com/2019/04/text-preprocessing-nlp-machine-learning.html [Accessed 8 Nov. 2019].
[11] Eteration. (2019). Türkçe Doğal Dil İşlemede Zemberek – eteration. [online] Available at: https://www.turkedebiyati.org/turkcenin-ozellikleri.html
[12] O. Tunçelli, (2019). Turkish Stemmer for Python – GitHub. [online] Available at: https://github.com/otuncelli/turkish-stemmer-python
[13] TH. Tuna, (2019). Turkish Text Normalization – GitHub. [online] Available at: https://github.com/ahmetaa/zemberek-nlp/tree/master/normalization
[14] Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
[15] Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
[16] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, 2015.
[17] K. Fukushima, “A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position,” Biol. Cybern., vol. 36, no. 4, pp. 193–202, 1980.
[18] D. H. Hubel and T. N. Wiesel, “Receptive fields and functional architecture of monkey striate cortex,” J. Physiol., vol. 195, no. 1, pp. 215–243, Mar. 1968.
[19] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[20] Naili, Marwa and Habacha, Anja and Ben Ghezala, Henda. (2017). Comparative study of word embedding methods in topic segmentation. Procedia Computer Science. 112. 340-349. 10.1016/j.procs.2017.08.009.
[21] Rhanoui, Maryem and Mikram, Mounia and Yousfi, Siham and Barzali, Soukaina. (2019). A CNN-BiLSTM Model for Document-Level Sentiment Analysis. Machine Learning and Knowledge Extraction. 1. 832-847. 10.3390/make1030048.
[22] Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
[23] Mikolov, Tomas and Sutskever, Ilya and Chen, Kai and Corrado, G.s and Dean, Jeffrey. (2013). Distributed Representations of Words and Phrases and their Compositionality. Advances in Neural Information Processing Systems. 26.
[24] Doğru, Hasibe Büşra and Hameed, Alaa and Tilki, Sahra and Jamil, Akhtar. (2021). Comparative Analysis of Deep Learning and Traditional Machine Learning Models for Turkish Text Classification.
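
As a supplementary illustration of the lowercasing step described in Section III-B (Table I), the sketch below shows one possible way to lowercase Turkish text in Python. It is not the authors' implementation: the helper name `turkish_lower` is hypothetical, and the approach assumes that the capitals İ and I should map to i and ı, which Python's built-in `str.lower()` does not do on its own.

```python
import unicodedata


def turkish_lower(text: str) -> str:
    """Lowercase text with the Turkish dotted/dotless i mapping (hypothetical helper)."""
    # Normalize first so that "I" followed by a combining dot becomes the single letter U+0130.
    text = unicodedata.normalize("NFC", text)
    # Map the two Turkish-specific capitals before calling lower():
    # U+0130 (capital dotted I) -> "i", and ASCII "I" -> U+0131 (dotless i).
    # Plain str.lower() would turn U+0130 into "i" plus a combining dot and "I" into "i".
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


# The raw spellings from Table I.
raw_samples = ["İsTanBuL", "İSTANBUL", "İsTaNbUl", "KiTAP", "KitAp", "KiTaP"]
lowered = [turkish_lower(sample) for sample in raw_samples]
print(lowered)
```

Handling the two capitals explicitly keeps the sketch free of locale settings; a production system might instead rely on a locale-aware library.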

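
Equations 1 to 3 in Section III-C can be worked through on a toy corpus. The following is a minimal sketch under stated assumptions, not the vectorizer used in the experiments: the function names and the three-document corpus are invented for illustration, N in Equation 2 is assumed to be the number of documents, and the natural logarithm is assumed because the paper does not state the base.

```python
import math
from collections import Counter


def tf(word, doc_tokens):
    # Equation 1: occurrences of the word divided by the number of words in the document.
    return Counter(doc_tokens)[word] / len(doc_tokens)


def idf(word, corpus):
    # Equation 2: log(N / T), where N is the number of documents (assumption)
    # and T is the number of documents containing the word.
    # Assumes the word occurs in at least one document, otherwise T would be zero.
    containing = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / containing)


def tf_idf(word, doc_tokens, corpus):
    # Equation 3: product of the two scores.
    return tf(word, doc_tokens) * idf(word, corpus)


# Hypothetical, already tokenized and lowercased documents.
corpus = [
    ["ekonomi", "ve", "spor"],
    ["spor", "ve", "sağlık"],
    ["siyaset", "ve", "teknoloji"],
]
scores = {word: tf_idf(word, corpus[0], corpus) for word in corpus[0]}
print(scores)
```

In this toy corpus the word "ve" (and) occurs in every document, so its IDF and therefore its TF-IDF score is zero, which matches the remark that frequent stop words receive low scores while rarer words such as "ekonomi" score higher.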

978-1-6654-3603-8/21/$31.00 ©2021 IEEE
