
A Review of Text Classification Based on Deep Learning

Yifan Zhou
Wuhan University, China
[email protected]

ABSTRACT
Text classification is the process of assigning a piece of text to one or more predetermined classes. Text categorization has important applications in redundancy filtering, organization management, information retrieval, index building, ambiguity resolution, and text filtering. This paper introduces the research background of text classification and surveys research trends in text classification at home and abroad. Text classification is an essential component of many NLP problems, and neural network models have achieved extraordinary results on it. We therefore discuss how common deep learning methods deal with text classification, including the Convolutional Neural Network (CNN), the Recurrent Convolutional Neural Network (RCNN), Long Short-Term Memory (LSTM), and fastText. A CNN constructs the representation of text through convolution operations, an RNN does well in capturing contextual information, and LSTM is explicitly designed for time-series data and learning long-term dependencies. In addition, we introduce distributed representations such as Continuous Bag of Words (CBOW) and Skip-Gram, and analyze the advantages of the word2vec model over one-hot encoding.

CCS Concepts
• Information systems → Database management system engines • Computing methodologies

Keywords
Text classification; Word2Vec; CNN; RCNN; LSTM; fastText

1. INTRODUCTION
Text classification is a basic task in the field of natural language processing, with many application scenarios such as news category classification on news websites, sentiment analysis, and information retrieval. There are two mainstream approaches to text classification in natural language processing: traditional machine learning methods and deep learning methods. Feature extraction is the key technical point of traditional machine learning methods; the usual techniques are features based on the bag-of-words model, pLSA, and LDA. However, the text representations formed by these feature extraction methods are high-dimensional and sparse, and their expressive power is limited. At the same time, traditional feature extraction ignores word order and context information, and it separates the feature extraction process from the model design process.

In contrast to traditional machine learning methods, deep learning uses an end-to-end model: text features are extracted automatically through a neural network, that is, learned by the model itself, and are then fed into the later layers of the model for training. The deep learning model expresses text as continuous dense vectors, solving the problem of text representation, and then automatically acquires feature expressiveness through network structures such as CNNs and RNNs. Traditional machine learning text classification involves text preprocessing, feature extraction, and model building as separate steps, whereas deep learning text classification forms an end-to-end structure covering feature extraction, text representation, and model construction.

2. WORD VECTOR REPRESENTATION
A deep learning model cannot accept raw text as input; it can only handle numeric tensors. The process of converting text data into numeric tensors is called text vectorization. There are two main methods of text vectorization: one-hot encoding and word embeddings.

One-hot encoding associates each word with a unique integer index i and converts that index into a binary vector V of length N (where N is the vocabulary size) whose i-th element is 1 and whose remaining elements are all 0. The vectors obtained by one-hot encoding are high-dimensional and sparse: the dimensionality equals the number of words in the vocabulary, and most elements are zero. Although this makes text easy to process as vectors, the relationships between words are lost, and the syntactic and semantic similarity between words cannot be effectively represented. Therefore, the second method, word embedding, is used in this work.

A word embedding W: words → R^n is a parameterized function that maps words to real-valued vectors, e.g. W("cat") = (0.2, −0.4, 0.7, …) and W("mat") = (0.0, 0.6, −0.1, …). Although the word embedding model also maps words into a vector space, for a vocabulary on the order of thousands of words the embedding dimensionality is still much smaller than the one-hot dimensionality. At the same time, the vectors obtained by word embedding are continuous floating-point vectors, and words with similar meanings are mapped to nearby positions in the vector space, so more information can be packed into a low-dimensional space. Moreover, unlike one-hot encoding, word embeddings are learned from data.
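To make the contrast concrete, here is a minimal Python sketch (using NumPy) of both representations for a toy vocabulary; the vocabulary, the embedding dimension, and the randomly initialized embedding matrix are illustrative assumptions rather than details from the paper.

```python
import numpy as np

# Toy vocabulary (illustrative assumption, not from the paper).
vocab = ["cat", "sat", "on", "the", "mat"]
word_to_index = {w: i for i, w in enumerate(vocab)}
N = len(vocab)   # vocabulary size -> one-hot dimensionality
n = 3            # embedding dimensionality (n << N in practice)

def one_hot(word):
    """Length-N binary vector whose i-th element is 1 for word index i."""
    v = np.zeros(N)
    v[word_to_index[word]] = 1.0
    return v

# A word embedding is just an N x n matrix of parameters; it is random
# here for illustration, whereas in practice it is learned from data
# (e.g. with word2vec or jointly with the classifier).
embedding_matrix = np.random.uniform(-0.5, 0.5, size=(N, n))

def embed(word):
    """Dense n-dimensional vector: a row lookup in the embedding matrix."""
    return embedding_matrix[word_to_index[word]]

print(one_hot("cat"))   # sparse, high-dimensional, e.g. [1. 0. 0. 0. 0.]
print(embed("cat"))     # dense, low-dimensional, e.g. [ 0.21 -0.43  0.07]
```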

Figure 1. One-hot word vectors (sparse, high-dimensional) vs. word embeddings (dense, low-dimensional).

There are two main methods for obtaining word embeddings, one static and the other non-static:

(1) Static mode. Another model is first used to pre-train the word embeddings, and the trained embeddings are then plugged into the classification model. During training, the model does not update the word vectors, which is a form of transfer learning. The static mode is suitable for situations where the amount of data is small.

(2) Non-static mode. The word embeddings are learned while the text classification task itself is being trained. For example, random word vectors are used at the beginning and are then learned in the same way as the other weights of the network. Another option is the fine-tuning variant of the non-static mode, which initializes the word vectors with pre-trained word2vec vectors and adjusts them during training to accelerate convergence. Directly randomizing the word vectors works well if there are sufficient training data and resources.

The Word2Vec model infers the vector of each word from its context. Using maximum likelihood, it maximizes the probability of the target word w given the preceding context h. If two words can replace each other in the same contexts, then the distance between their vectors is very small. The Word2Vec model is trained on every sentence in the dataset by sliding a fixed-size window over the sentence and predicting the vector of the word in the middle of the window from its surrounding context. The output of the model is called an embedding matrix.

Word2Vec has two modes: CBOW (Continuous Bag of Words) and Skip-Gram. CBOW infers the target word from its surrounding context, while Skip-Gram, conversely, predicts the surrounding context from the target word. CBOW is typically used when the amount of data is small, whereas Skip-Gram works well on large corpora.

Figure 2. Word2Vec Model.
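As an illustration of the two modes, the sketch below trains CBOW and Skip-Gram models with the gensim library (assuming gensim 4.x); the toy corpus and the hyperparameter values are arbitrary choices for demonstration, not settings from the paper.

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (illustrative only).
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

# sg=0 selects CBOW (predict the centre word from its window),
# sg=1 selects Skip-Gram (predict the window from the centre word).
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# Each trained model exposes the learned embedding matrix through .wv;
# rows can be read back as dense word vectors.
print(cbow.wv["cat"].shape)              # (50,)
print(skipgram.wv.most_similar("cat"))   # nearest neighbours in vector space
```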


3. DEEP LEARNING MODEL FOR TEXT CLASSIFICATION
In this section we show how to build models for text categorization from vectorized text, mainly introducing the application of the CNN, RCNN, C-LSTM, and fastText deep learning models to text classification.

3.1 CNN
The paper [1] proposes that the core idea of a CNN model for sentence-level classification is to use local features: the CNN extracts different local features through different convolution kernels. The figure below shows the structure of a CNN for text classification, which mainly consists of an input layer, a convolution layer, a pooling layer, and a fully connected layer. x_i denotes the i-th word of the sentence, mapped to a k-dimensional word vector. A sentence of length n can then be expressed as x_{1:n} = x_1 ⊕ x_2 ⊕ … ⊕ x_n, where ⊕ is the concatenation operator. The convolution operation involves a filter w ∈ R^{hk}, which is applied to windows of h words to produce new features. For example, the feature c_i is generated from the window of words x_{i:i+h−1} as c_i = f(w · x_{i:i+h−1} + b), where b is a bias term and f is a nonlinear function. Applying the filter to every window of size h yields a feature map. A max-pooling operation then extracts the maximum value of the feature map as the feature for that filter; taking the maximum automatically determines which words play a key role in the classification. Besides, max-pooling reduces the number of parameters and computations and helps prevent over-fitting.

Figure 3. TextCNN Model
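Below is a minimal sketch of this TextCNN architecture in Keras (tf.keras), with three parallel filter sizes, max-pooling over time, and a softmax output; the vocabulary size, embedding dimension, sequence length, filter counts, and number of classes are placeholder assumptions, not the configuration used in [1].

```python
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder hyperparameters (illustrative assumptions).
vocab_size, embed_dim, seq_len, num_classes = 20000, 128, 100, 4

inputs = layers.Input(shape=(seq_len,), dtype="int32")

# Static vs. non-static word vectors: pass weights=[pretrained_matrix]
# and trainable=False for the static mode, or trainable=True to fine-tune.
x = layers.Embedding(vocab_size, embed_dim)(inputs)

# One branch per filter size h: convolution over windows of h words,
# then max-pooling over time keeps the strongest feature per filter.
pooled = []
for h in (3, 4, 5):
    c = layers.Conv1D(filters=100, kernel_size=h, activation="relu")(x)
    pooled.append(layers.GlobalMaxPooling1D()(c))

concat = layers.Concatenate()(pooled)
outputs = layers.Dense(num_classes, activation="softmax")(concat)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```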
In the CNN model the features are the word vectors, which can again be static or non-static. The static method uses pre-trained word vectors such as word2vec and does not update them during training; it can be regarded as a form of transfer learning and tends to work well especially when the amount of data is not large. The non-static method updates the word vectors during training; the recommended variant is fine-tuning, which initializes the word vectors with pre-trained word2vec vectors and adjusts them during training to accelerate convergence. Of course, if there are sufficient training data and resources, direct random initialization of the word vectors can also work well.

The process above uses one filter to extract one kind of feature; in practice multiple different filters are used to extract different kinds of features. These features form the next layer, and a fully connected softmax layer finally outputs a probability distribution over the category labels.

3.2 RCNN
One of the biggest problems with CNN is its fixed filter size: on the one hand, longer sequence information cannot be modeled; on the other hand, tuning the filter size as a hyper-parameter is cumbersome. The essence of the CNN here is feature extraction from text, and a model more commonly used in natural language processing is the Recurrent Convolutional Neural Network (RCNN), which expresses contextual information better. Specifically, in the text classification task a bidirectional RNN can be understood, in a sense, as capturing variable-length and bidirectional n-gram information. The design of the RCNN for classification problems is introduced in [2]; the following figure shows a schematic diagram of the network structure. In the simplest example, the output for the last word is connected directly to a fully connected softmax layer.

In [2], a method for capturing the semantics of text using an RNN is proposed. The model is a bidirectional recurrent neural network whose input is a document D consisting of the word sequence w_1, w_2, … and whose output is a class label; p(k | D, θ) denotes the probability of the document belonging to class k, where θ are the parameters of the network. Each word is represented by itself together with its context, and with the help of the context a more precise word representation can be obtained. In this model, c_l(w_i) is the left context of word w_i and c_r(w_i) is its right context, with c_l(w_i) = f(W^{(l)} c_l(w_{i−1}) + W^{(sl)} e(w_{i−1})), where W^{(l)} is a matrix that transforms the previous hidden context into the next hidden layer, W^{(sl)} is a matrix that combines the semantics of the current word with the left context of the next word, and f is a non-linear activation function; the right context c_r(w_i) is computed symmetrically from w_{i+1}. In Equation (3) the representation of word w_i is defined as the concatenation of the left-side context vector c_l(w_i), the word embedding e(w_i), and the right-side context vector c_r(w_i):

x_i = [c_l(w_i); e(w_i); c_r(w_i)].    (3)

Figure 4. RCNN Model

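As a rough sketch of this architecture, the following Keras model uses a bidirectional LSTM as the recurrent context encoder (a common simplification of the exact recurrence in [2]), concatenates the contexts with the word embeddings to form x_i, and adds the tanh transformation, max-pooling, and softmax output layers described in the next paragraph; all sizes are placeholder assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder sizes (illustrative assumptions).
vocab_size, embed_dim, seq_len, hidden, num_classes = 20000, 128, 100, 100, 4

inputs = layers.Input(shape=(seq_len,), dtype="int32")
e = layers.Embedding(vocab_size, embed_dim)(inputs)           # e(w_i)

# Bidirectional recurrence supplies the left/right context vectors c_l, c_r.
contexts = layers.Bidirectional(
    layers.LSTM(hidden, return_sequences=True))(e)

# x_i = [c_l(w_i); e(w_i); c_r(w_i)], followed by a tanh transformation y_i^(2).
x = layers.Concatenate()([contexts, e])
y2 = layers.Dense(hidden, activation="tanh")(x)

# Element-wise max over time (the max-pooling layer y^(3)), then softmax output.
y3 = layers.GlobalMaxPooling1D()(y2)
outputs = layers.Dense(num_classes, activation="softmax")(y3)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```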
After that, we obtain the representation x_i of the word w_i. First a linear transformation is applied, then a tanh activation, and the result y_i^{(2)} is sent to the next layer. When the representations of all words have been calculated, a max-pooling layer is applied; the max function is element-wise, so the k-th element of y^{(3)} is the maximum of the k-th elements of the y_i^{(2)}. The pooling layer converts texts of various lengths into a fixed-length vector, allowing information from the entire text to be captured. The last part of the model is the output layer; similar to traditional neural networks, it is defined as y^{(4)} = W^{(4)} y^{(3)} + b^{(4)}, and finally the softmax function is applied to y^{(4)}.

3.3 C-LSTM
In [3], the authors propose a combined CNN+LSTM model (C-LSTM) for sentence representation and text classification. C-LSTM uses a CNN to extract higher-level phrase representations of a sentence and then feeds these phrase representations into an LSTM layer to obtain the sentence representation, so it can learn both the local features of phrases and the semantics of the whole sentence.

The structure of the C-LSTM model is shown in the figure below. Blocks of the same color in the feature map layer and the window feature sequence layer correspond to features for the same window, and the dashed lines connect the feature of a window with its source feature map. The input text is convolved with filters to form feature maps; squares of the same color in the feature maps are mapped to the same position in the window feature sequence layer. Max-pooling selects the most important features in the window feature sequence layer, which are then fed into the LSTM units. The hidden state of the last time step of the LSTM serves as the text representation, and a softmax layer is added at the end, so the final output of the entire model is produced from the last hidden unit of the LSTM. The model is trained by minimizing the cross-entropy loss, with the parameters learned by stochastic gradient descent using the RMSprop update rule.

Figure 5. C-LSTM Model

LSTM is designed for sequence data, and pooling would break the sequence organization because the selected features are discontinuous. The LSTM architecture has a chain of repeated modules, one per time step, as in a standard RNN. Since LSTM is explicitly designed for time-series data and for learning long-term dependencies, it is placed on top of the convolution layer to learn such dependencies in the sequence of higher-level features.

The core idea of adding attention on top of CNN and RNN is that a text should be weighted differently depending on the context to which it belongs, which is obviously more reasonable. The attention mechanism reconsiders the importance of each sentence, improving on models in which the RNN or LSTM only uses its hidden variables. Although CNN and RNN achieve significant results in text categorization tasks, they share the deficiency that the model cannot be explained well. The attention mechanism is a commonly used way of modeling long-term memory in natural language processing; it can intuitively show the contribution of each word to the result and has basically become standard in Seq2Seq models. In fact, text categorization can also be understood as a special kind of Seq2Seq task in a certain sense, so it is natural to consider introducing the attention mechanism into these models.

3.4 fastText
fastText is a fast text classification algorithm that often reaches metrics similar to deep networks while being many orders of magnitude faster to train. Using a linear model with a rank constraint, it can be trained on one billion words in ten minutes while achieving performance comparable to the state of the art. The model structure is very similar to CBOW in word2vec, and both are classification models, but fastText predicts a document's category label whereas CBOW predicts the middle word. The fastText model has only one hidden layer and one output layer, as shown below.

Figure 6. fastText Model

The input of fastText is a sentence represented by N-gram features x_1, x_2, …, x_n, which are embedded and averaged to form the hidden variable, i.e. the feature representation of the text. The hidden layer of fastText is thus the average of the input embeddings, obtained through the weight matrix A, which is a look-up table over the words; this is equivalent to weighting and summing the individual word vectors to obtain the sentence vector. The output layer is obtained by multiplying the hidden layer by another weight matrix B. When there are many classes the computation becomes expensive, with complexity O(kh), where k is the number of classes and h is the dimension of the text representation. fastText therefore uses a hierarchical softmax based on a Huffman coding tree to reduce the complexity to O(h log2(k)); each node carries the probability of the path from the root to that node.

In order to take word order into account, fastText uses N-gram features. The N-gram input vectors are randomly initialized, and since there are far more N-grams than words, they cannot all be stored explicitly; fastText uses the hashing trick to map all n-grams into buckets, so that n-grams falling into the same bucket share an embedding vector. fastText is a simple baseline method for text classification; unlike the unsupervised word vectors of word2vec, the word features of the fastText model can be averaged together to form a good sentence representation.

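A classifier of this kind can be trained with the open-source fastText Python package, as in the sketch below; the file names, label format, and hyperparameter values are placeholders for illustration, not settings reported in the paper.

```python
import fasttext

# train.txt holds one example per line in the form:
#   __label__sports some tokenized text ...
# (file names and labels are placeholders for this sketch).
model = fasttext.train_supervised(
    input="train.txt",
    lr=0.5,            # learning rate
    epoch=10,          # passes over the training data
    wordNgrams=2,      # bag of bigrams to capture some word order
    loss="hs",         # hierarchical softmax: O(h log2(k)) instead of O(kh)
)

# Predict the most probable label (and its probability) for a new text.
labels, probs = model.predict("the match ended in a late winning goal")
print(labels, probs)

# Evaluate precision/recall at 1 on a held-out file (placeholder name).
print(model.test("valid.txt"))
```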
Compared with neural-network-based classification algorithms, fastText has three major advantages. 1) Training speed: on a standard multicore CPU, fastText can be trained on more than one billion words in less than ten minutes and can classify 500,000 sentences among 312K classes in less than a minute. 2) fastText does not require pre-trained word vectors; it trains its own. 3) fastText has two important optimizations, hierarchical softmax and N-grams; using hierarchical softmax instead of a full softmax, combined with Huffman coding, reduces the complexity to the logarithmic level.

4. CONCLUSION AND FUTURE WORK
Deep neural network models such as the CNN were first successful in the field of imaging, and their strength is capturing local correlation. In text classification tasks, a CNN can automatically extract key information similar to n-grams in sentences. An RNN can capture contextual information as far as possible when learning word representations with a recurrent structure, and the representation of the text can then be constructed with a convolutional neural network. LSTM is able to learn phrase-level features through a convolutional layer and to learn long-term dependencies over the sequences of higher-level representations. Besides, with the development of language models such as BERT and ERNIE, results better than those of many deep neural network models have been achieved, so in future work we will pay more attention to solving text classification with language models such as BERT and ERNIE.

5. REFERENCES
[1] Kim Y. Convolutional Neural Networks for Sentence Classification[J]. Eprint Arxiv, 2014.
[2] Lai S, Xu L, Liu K, Zhao J. Recurrent Convolutional Neural Networks for Text Classification[J]. AAAI Conference on Artificial Intelligence, 2013.
[3] Zhou C, Sun C, Liu Z, et al. A C-LSTM Neural Network for Text Classification[J]. Computer Science, 2015, 1(4): 39-44.
[4] Joulin A, Grave E, Bojanowski P, et al. Bag of Tricks for Efficient Text Classification[J]. 2016.
[5] Aggarwal C C, Zhai C. A Survey of Text Classification Algorithms[J]. 2012.
[6] Duchi J, Hazan E, Singer Y. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization[J]. Journal of Machine Learning Research, 2011, 12: 2121-2159.
[7] Dong L, Wei F, Liu S, Zhou M, Xu K. A Statistical Parsing Framework for Sentiment Classification[J]. CoRR, abs/1401.6330, 2014.
[8] Graves A, Mohamed A, Hinton G. Speech Recognition with Deep Recurrent Neural Networks[J]. Proceedings of ICASSP, 2013.
[9] Bengio Y, Courville A, Vincent P. Representation Learning: A Review and New Perspectives[J]. IEEE TPAMI, 2013.
[10] Zhang Y, Wallace B. A Sensitivity Analysis of Convolutional Neural Networks for Sentence Classification[J]. Computer Science, 2015.
