
2016 IEEE International Conference on Systems, Man, and Cybernetics • SMC 2016 | October 9-12, 2016 • Budapest, Hungary

Semantic Text Classification with Tensor Space Model-based Naïve Bayes

Han-joon Kim
School of Electrical and Computer Engineering, University of Seoul, Seoul, Korea
[email protected]

Jiyun Kim
School of Electrical and Computer Engineering, University of Seoul, Seoul, Korea
[email protected]

Jinseog Kim
Department of Applied Statistics, Dongguk University, Gyeongju, Korea
[email protected]

Abstract—This paper presents a semantic naïve Bayes classification technique that is based upon our tensor space model for text representation. In our work, each Wikipedia article is defined as a single concept, and a document is represented as a 2nd-order tensor. Our method extends the conventional naïve Bayes by incorporating semantic concept features into the term feature statistics under the tensor space model. Through extensive experiments using three popular document collections, we show that the proposed method significantly outperforms the conventional naïve Bayes. Surprisingly, the classification performance reaches almost 100% in terms of F1-measure when using the Reuters-21578 and 20Newsgroups document collections.

Keywords—text classification; naïve Bayes; vector space; tensor space; Wikipedia; semantics; concepts

I. INTRODUCTION

Text classification is the task of automatically assigning an unknown textual document to one or more appropriate classes. Nowadays, the most popular approach to text classification is to use machine learning techniques that inductively build a classification model of pre-defined classes from a training set of labeled documents. This approach has been used widely in spam email filtering, sentiment analysis, readability assessment, and article triage. Popular learning methods include naïve Bayes (NB), k-nearest neighbors (k-NN), decision trees, and support vector machines (SVM). In our work, we focus on improving the naïve Bayes learning algorithm because it is a simple yet accurate technique in spite of its unrealistic independence assumption. Besides, the NB algorithm has a number of advantages compared with other learning algorithms.

In general, an NB learning process is much faster than that of other machine learning methods since its classification model can be developed with a single pass over the training documents. Basically, machine learning algorithms should effectively deal with the curse-of-dimensionality problem since text data have a huge number of term features. In this respect, the NB algorithm is less sensitive to this problem than other learning algorithms. Moreover, the NB algorithm is suitable for operational text classification systems since it is very easy to incrementally update the classification model due to its simplicity; when new documents are given as training data, the current word feature statistics are updated and additional feature evaluation is immediately carried out without re-processing the past training data. This characteristic is essential when the document collection is highly evolutionary. More importantly, the NB algorithm does not require a complex generalization process, unlike support vector machines and decision trees; it only has to calculate the feature statistics per class.

Because of the above advantages, there have been many studies to improve the naïve Bayes text classifier in two aspects. One is to combine it with meta-learning algorithms such as EM [1, 2], boosting [3], and active learning [4, 5], and the other is to enrich the representation of documents with external or internal semantic features [6, 7, 8, 9]. Recently, as good-quality external knowledge sources such as Wikipedia (http://en.wikipedia.org/), WordNet (http://wordnet.princeton.edu/), and the Open Directory Project (https://www.dmoz.org/) have been built, the second approach has often been attempted to enhance the NB algorithm. Of course, without the help of external knowledge, internal (or latent) semantic features can also be derived through singular value decomposition (SVD) [9]. The important point is that the semantic features should capture the correct meanings of the terms in a document in order to improve NB classification performance.

In our work, we propose a semantic naïve Bayes text classifier that is based upon the tensor space text model proposed in our previous research. In [10], we proposed a text model conforming to the definition of the ‘concept’ in the FCA (Formal Concept Analysis) framework. The model represents a document not as a vector but as a matrix (i.e., a 2nd-order tensor) that reflects the relationship between term features and semantic features. To realize this semantically enriched text model, we employ the Wikipedia encyclopedia as an external knowledge source in which each article is defined as a single semantic concept.


II. PRELIMINARIES

A. Tensor Space Model for Text Representation

In the vector space model, documents are represented as vectors in which each element has a weight. In contrast, our semantic tensor space model represents documents as 2nd-order tensors (i.e., S × T matrices), where S is the number of concepts (or semantics) and T is the number of terms indexed; the two modes correspond to the vector spaces for the concepts and the terms. We regard the ‘concept space’ as an independent space on an equal footing with the ‘term’ and ‘document’ spaces used in the VSM.

A concept is defined by a pair of an intent and an extent according to the formal concept analysis (FCA) principle [11]. The ‘extent’ is the set of instances that are included in the concept, and the ‘intent’ is the set of all attributes common to the instances included in the extent. In our work, the extent that represents a concept consists of a set of documents related to the concept, and the intent consists of a set of keywords extracted from that set of documents. Figure 1 illustrates the term-document matrix and the term-document-concept tensor representations for a given corpus. To represent a document corpus, rather than a term-document matrix, we can build a 3rd-order tensor with three distinct spaces: document, term, and concept. As a result, we can naturally represent terms or concepts as matrices; given a 3rd-order tensor of a document corpus, we can represent a component of each space using the other two vector spaces. That is, we can represent a document as a concept-by-term matrix, a term as a concept-by-document matrix, and a concept as a term-by-document matrix.

Fig. 1. A 3rd-order tensor of a document corpus

For a particular document corpus, we need to define a concept space to build up the text tensor. To define each dimension of the concept space, we specify a Wikipedia page as a ‘concept’. After choosing the appropriate Wikipedia articles by regarding each document as a query, we can automatically generate a reasonable semantic space for the tensor space [12].

Figure 2 illustrates an example of a matrix representation of a document. The concept-by-term matrix provides information on the concepts that exist in the document. If necessary, a document can be expressed as a 1st-order tensor (i.e., a vector) by summing all the components of each row or column; in other words, a document can be represented by a concept vector as well as a term vector. Thus we can say that the model is a generalization of the conventional vector space model. With this document representation, semantic features of concepts associated with the literal features of terms help to significantly improve the performance of text classification.

Fig. 2. Representing a document as a term-by-concept matrix
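To make the term-by-concept representation concrete, the following minimal Python/NumPy sketch builds one document's matrix. The vocabulary, concept labels, and relatedness weights here are hypothetical, and the actual weighting scheme of the paper (including its concept-window weighting) may differ.

```python
import numpy as np

# Hypothetical vocabulary and concept space (each concept stands for a Wikipedia article).
terms = ["engine", "fuel", "brake", "scripture", "prayer"]
concepts = ["Automobile", "Christianity"]

# Illustrative term-concept relatedness weights; the paper's weighting may differ.
relatedness = {
    ("engine", "Automobile"): 0.9, ("fuel", "Automobile"): 0.8,
    ("brake", "Automobile"): 0.7, ("scripture", "Christianity"): 0.9,
    ("prayer", "Christianity"): 0.8,
}

def document_matrix(token_counts):
    """Represent one document as a |T| x |S| term-by-concept matrix (2nd-order tensor)."""
    X = np.zeros((len(terms), len(concepts)))
    for i, t in enumerate(terms):
        for j, s in enumerate(concepts):
            # Cell weight: term frequency scaled by its relatedness to the concept.
            X[i, j] = token_counts.get(t, 0) * relatedness.get((t, s), 0.0)
    return X

doc = {"engine": 3, "fuel": 1, "prayer": 2}
X = document_matrix(doc)
print(X.sum(axis=1))  # summing over concepts recovers a term-oriented vector
print(X.sum(axis=0))  # summing over terms yields a concept vector for the document
```

The two sums at the end correspond to the 1st-order (vector) views of the document mentioned above: a term vector and a concept vector.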
B. Naïve Bayes Learning Framework

Naïve Bayes (NB) text classification systems produce their classification model as the result of a learning (estimation) process based on the Naïve Bayes learning algorithm. The estimated classification model consists of two kinds of parameters: the term probability estimates θ_{t|c} and the class prior probabilities θ_c; that is, the classification model is θ = (θ_{t|c}, θ_c). Each parameter can be estimated according to maximum a posteriori (MAP) estimation.

For classifying a given document, the Naïve Bayes learning system estimates the posterior probability of each class via Bayes' rule; that is, Pr(c|d) = Pr(c) Pr(d|c) / Pr(d), where Pr(c) is the class prior probability that any random document from the document collection belongs to the class c, Pr(d|c) is the probability that a randomly chosen document from the documents in class c is the document d, and Pr(d) is the probability that a randomly chosen document from the whole collection is the document d. The document d is then assigned to the class argmax_{c∈C} Pr(c|d), i.e., the class with the largest posterior probability.¹ Here, the document d is represented as a bag of words (w_1, w_2, …, w_{|d|}) in which multiple occurrences of words are preserved. Moreover, Naïve Bayes assumes that the terms in a document are mutually independent and that the probability of a term occurrence is independent of its position within the document. This assumption allows simplifying the classification function:

\Phi(d) = \arg\max_{c \in C} \Pr(c \mid d) = \arg\max_{c \in C} \Pr(c) \cdot \prod_{i=1}^{|d|} \Pr(w_i \mid c)   (1)

¹ argmax_x F(x) is the value of x for which F(x) has the largest value.
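As a concrete illustration of the decision rule in Eq. (1), the following minimal Python sketch scores classes in log space (to avoid underflow when multiplying many small probabilities). The parameter names and data layout are assumptions made for illustration, not the paper's implementation.

```python
import math

def classify(tokens, log_prior, log_likelihood):
    """Eq. (1) in log space: argmax_c [log Pr(c) + sum_i log Pr(w_i | c)].

    log_prior:      dict mapping class -> log Pr(c)
    log_likelihood: dict mapping class -> {term: log Pr(t | c)}
    """
    best_class, best_score = None, -math.inf
    for c, lp in log_prior.items():
        score = lp
        for w in tokens:
            # Terms outside the indexed vocabulary are simply skipped in this sketch;
            # Laplace smoothing (Eq. (2) below) prevents zero probabilities for known terms.
            score += log_likelihood[c].get(w, 0.0)
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```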

SMC_2016 004207
2016 IEEE International Conference on Systems, Man, and Cybernetics • SMC 2016 | October 9-12, 2016 • Budapest, Hungary

To generate this classification function, Pr(c) can be simply estimated by counting the frequency with which each class value c_j occurs in the set of training documents D_t, where Pr(c_j|d_i) ∈ {0,1} is given by the class label; that is,

\Pr(c_j) = \theta_{c_j} = \frac{\sum_{d_i \in D_t} \Pr(c_j \mid d_i)}{|D_t|}.

As for Pr(t|c), its maximum likelihood estimate is TF(t,c) / Σ_{t'∈V} TF(t',c), where TF(t,c) is the number of occurrences of term t in the class c and V denotes the set of significant words extracted from the training documents. However, this estimation can produce a biased underestimate of the probability, and it gives a probability of zero for any word that does not occur in some category. To avoid this problem, the estimate for Pr(t|c) can be adjusted by Laplace's law of succession as follows:

\Pr(t \mid c) = \theta_{t \mid c} = \frac{TF(t,c) + 1}{\sum_{t' \in V} TF(t',c) + |V|}   (2)

Therefore, the learning of the Naïve Bayes classifier does not require any statistics other than those already collected in TF, and no further generalization process is necessary.
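The following minimal sketch estimates the conventional NB parameters (class priors and the Laplace-smoothed term probabilities of Eq. (2)) from labeled, tokenized documents; the function and variable names are illustrative assumptions, and the output matches the form consumed by the classify() sketch above.

```python
import math
from collections import Counter, defaultdict

def train_nb(labeled_docs):
    """labeled_docs: iterable of (tokens, class_label) pairs."""
    doc_count = Counter()          # number of training documents per class
    tf = defaultdict(Counter)      # tf[c][t] = TF(t, c)
    vocab = set()
    for tokens, c in labeled_docs:
        doc_count[c] += 1
        tf[c].update(tokens)
        vocab.update(tokens)

    n_docs = sum(doc_count.values())
    log_prior = {c: math.log(n / n_docs) for c, n in doc_count.items()}

    log_likelihood = {}
    for c in doc_count:
        total = sum(tf[c].values())
        # Eq. (2): (TF(t,c) + 1) / (sum_t' TF(t',c) + |V|)
        log_likelihood[c] = {t: math.log((tf[c][t] + 1) / (total + len(vocab)))
                             for t in vocab}
    return log_prior, log_likelihood
```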
III. SEMANTIC NAÏVE BAYES TEXT CLASSIFICATION

A. Semantically Extending the Naïve Bayes

In this section, we semantically extend the conventional Naïve Bayes under the tensor space model. Note that it is necessary to incorporate the semantic information into the first NB parameter θ_{t|c} = Pr(t|c). As mentioned earlier, when additional semantic information for a document exists in our tensor space model, a document can be represented as a matrix of terms and semantics:

d = \begin{pmatrix}
x_{11} & \cdots & x_{1j} & \cdots & x_{1|S|} \\
\vdots &        & \vdots &        & \vdots   \\
x_{i1} & \cdots & x_{ij} & \cdots & x_{i|S|} \\
\vdots &        & \vdots &        & \vdots   \\
x_{|V|1} & \cdots & x_{|V|j} & \cdots & x_{|V||S|}
\end{pmatrix}.   (3)

Here, x_{ij} is a random variable for the i-th term and the j-th piece of semantic information, V is the set of terms indexed, and S is the set of semantic information (i.e., concepts) in the 2nd-order tensor space (i.e., the term-by-concept matrix) for each of the training documents. [13] proposed an approach similar to our model to improve k-NN classification performance, in which another term space is produced by folding term vectors. However, it is not straightforward to properly determine the terms required for the new term space, and the limitations of using terms as semantic units remain. In our work, with the 2nd-order tensor (matrix) representation of a document and the conditional independence assumption of the conventional Naïve Bayes, the likelihood of a document can be approximated as follows:

\Pr(d \mid c) = \Pr(x_{ij} \mid c;\ i = 1,\ldots,|V|,\ j = 1,\ldots,|S|) \approx \prod_{i=1}^{|V|} \prod_{j=1}^{|S|} \Pr(x_{ij} \mid c).   (4)

Let t_i = Σ_j x_{ij} and s_j = Σ_i x_{ij}; we then assume that the cell probability Pr(x_{ij}|c) for a given class in the term-by-concept matrix is approximated by θ_{x_{ij}|c} = P(x_{ij}|c) ≈ P(t_i, s_j|c). Since Pr(t_i, s_j|c) = Pr(t_i|s_j, c) · Pr(s_j|c), we can estimate the term probability for a given class and semantic concept as

\hat{\Pr}(t \mid s, c) = \frac{w(t,s,c) + 1}{\sum_{t' \in V} w(t',s,c) + |V|}   (5)

and estimate the semantic probability as

\hat{\Pr}(s \mid c) = \frac{w(s,c) + 1}{\sum_{s' \in S} w(s',c) + |S|}   (6)

respectively, where w(t, s, c) denotes the weighted value of each element in the tensor space, which accounts for the concept s of term t in the documents of class c. Note that [7] proposed a similar Naïve Bayes classification method that incorporates inherent semantic information obtained by applying latent topic models to the training documents without external knowledge. In contrast, the advantage of our learning method is that the meaning of a term occurring in a document can be captured more correctly with the external Wikipedia articles, and this is reflected in Equations (5) and (6) through the 2nd-order document representation.

B. Estimating Semantic Naïve Bayes Learning Parameters with the 2nd-order Text Representation

Fig. 3. Estimating the learning parameters with the term-by-concept matrix for document representation

Figure 3 depicts the 2nd-order tensor (i.e., term-by-concept matrix) for a training document d in our tensor space model. In Equation (5), the value of w(t, s, c) corresponds to the weighted value of each cell in the matrix, and |V| denotes the size of the term space (i.e., the number of terms indexed). The value of Σ_{t∈V} w(t, s, c) in the denominator can be easily obtained by summing up the weight value of each cell along the term space, which equals the w(s, c) in the numerator of Equation (6). In addition, Σ_{s∈S} w(s, c) can be obtained by summing up the w(s, c) of each concept s over the semantic concept space. In short, once the 3rd-order tensor for a given set of training documents is developed, our Naïve Bayes learning can be conducted easily through sum operations over the term-by-concept matrix.
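As a sketch of how Equations (5) and (6) reduce to sums over the term-by-concept matrices, the following Python code assumes each training document is given as a |V| × |S| NumPy weight matrix together with its class label. The names are illustrative, and the smoothing mirrors the Laplace adjustment of Eq. (2) rather than any further details of the paper.

```python
import numpy as np
from collections import defaultdict

def estimate_semantic_nb(labeled_matrices, n_terms, n_concepts):
    """labeled_matrices: iterable of (X, c), where X is a |V| x |S| term-by-concept
    weight matrix for one document and c is its class label.

    Returns, per class c:
      p_t_given_sc[c][t, s] ~ Eq. (5): (w(t,s,c)+1) / (sum_t' w(t',s,c) + |V|)
      p_s_given_c[c][s]     ~ Eq. (6): (w(s,c)+1)  / (sum_s' w(s',c) + |S|)
    """
    # Accumulate the class-level weights w(t, s, c) by summing document matrices per class.
    w = defaultdict(lambda: np.zeros((n_terms, n_concepts)))
    for X, c in labeled_matrices:
        w[c] += X

    p_t_given_sc, p_s_given_c = {}, {}
    for c, W in w.items():
        w_sc = W.sum(axis=0)                                       # w(s,c): sum along the term space
        p_t_given_sc[c] = (W + 1.0) / (w_sc + n_terms)              # Eq. (5), broadcast over terms
        p_s_given_c[c] = (w_sc + 1.0) / (w_sc.sum() + n_concepts)   # Eq. (6)
    return p_t_given_sc, p_s_given_c
```

In line with Section III.B, w(s, c) is obtained by summing the tensor along the term axis, and Σ_{s'} w(s', c) by summing that vector over the concept space; a classifier would then combine these estimates with the class priors in the spirit of Eq. (4).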

IV. EXPERIMENTS

In this section, we study the effectiveness of the proposed semantic Naïve Bayes learning method. Furthermore, we examine the performance of text classification as a function of the size of the concept window. Terms occurring in documents are semantically weighted along a pre-defined semantic concept space with a so-called ‘concept window’ that defines the context; for instance, if the size of the concept window is too small, the meanings of terms might not be well captured. Eventually, we show that our method significantly outperforms the conventional Naïve Bayes learning method through extensive experiments with three popular document collections.

A. Experimental Setup

In order to evaluate our proposed method, we used three controlled subsets of the 20Newsgroups, Reuters-21578, and OHSUMED document collections, which are accepted as clean collections and are thus commonly used to evaluate various machine learning algorithms for applications including text classification. The 20Newsgroups collection was gathered from Usenet newsgroups. It has approximately 20,000 documents, which are partitioned across 20 different newsgroups. In our work, we selected the 500 largest documents belonging to 9 distinct newsgroups including Autos, Christian, and Electronics. The Reuters-21578 collection originates from the Reuters newswire in the year 1987. For a more reliable evaluation, we generated a subset of the Reuters collection in which documents are not skewed over categories (or topics). We first selected the documents belonging to the 7 most frequent categories, including Acq, Earn, Crude, Interest, Ship, Money-fx, and Trade, and then chose approximately 1,750 documents that have only a single topic, so as to avoid the ambiguity of documents with multiple topics. Lastly, the OHSUMED collection comes from the on-line medical information database MEDLINE, which contains titles and abstracts from 270 medical journals. As for the classification metric, the classification results are discussed with respect to the micro-averaged F1-measure, which is the harmonic mean of precision and recall for the classification results; it varies from 0 to 1 and is proportionally related to classification effectiveness.
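For reference, micro-averaged F1 pools the per-class decision counts before computing precision and recall; a minimal sketch with hypothetical count inputs is shown below.

```python
def micro_f1(per_class_counts):
    """per_class_counts: iterable of (tp, fp, fn) tuples, one per class.

    Micro-averaging sums the counts over all classes first, then computes
    precision, recall, and their harmonic mean (F1).
    """
    tp = sum(c[0] for c in per_class_counts)
    fp = sum(c[1] for c in per_class_counts)
    fn = sum(c[2] for c in per_class_counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```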
B. Classification Evaluation

As baselines against which to measure the classification performance, we chose the conventional Naïve Bayes and Jing's Naïve Bayes [7]. In this experiment, the semantic concept space consists of more than 50 Wikipedia articles that contain top-ranked terms in each document collection. As expected, our semantic Naïve Bayes learning gives superior classification results on the test data, as shown in Table I. When classifying the documents in the 20Newsgroups and Reuters-21578 collections, our method yields almost perfect classification results. Moreover, we had expected the classification results to depend on the size of the concept window; however, we found that the classification performance is not sensitive to the size of the concept window as long as the size of the semantic space (i.e., the number of Wikipedia articles selected) is greater than 20, as seen in the table.

TABLE I. CLASSIFICATION RESULTS IN TERMS OF F1-MEASURE

  Dataset          Conventional NB   Jing's NB   Tensor Space-based Naïve Bayes
                                                 (by concept window size)
  20Newsgroups          69.8            85.4      5: 99.5   15: 99.7   25: 99.9
  Reuters-21578         80.9            85.8      5: 97.9   15: 98.6   25: 98.2
  OHSUMED               41.7             -        5: 89.3   15: 90.0   25: 89.0

V. CONCLUSIONS

This paper proposed a semantic Naïve Bayes classification method that utilizes our semantic tensor space model for text representation. To overcome the lack of semantics in the bag-of-words model, our NB learning method introduces additional semantic features that correspond to the meanings of each term in a document; the semantic features are composed from external Wikipedia articles chosen with awareness of the given training documents. As a result, in our classification learning framework, the conventional term feature statistics are split into statistics over term features and semantic features (or concepts). Through extensive experiments, we showed that the proposed method achieves almost perfect classification when classifying the documents in the 20Newsgroups and Reuters-21578 collections. In the future, we plan to design MapReduce algorithms to efficiently analyze large textual datasets, since our tensor space model is very sparse, as only a small fraction of the terms and semantics appear in any given document.

ACKNOWLEDGMENT

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF-2015R1D1A1A09061299) funded by the Ministry of Education, and was also supported by the Mid-career Researcher Program through the National Research Foundation of Korea (NRF) grant funded by the Korea government (MISP) (No. NRF-2013R1A2A2A01017030).

REFERENCES

[1] K. Nigam, A. McCallum, S. Thrun, and T. M. Mitchell, "Text classification from labeled and unlabeled documents using EM," Machine Learning, vol. 39, no. 2, pp. 103–134, 2000.
[2] T. Tsuruoka and J. Tsujii, "Training a naïve Bayes classifier via the EM algorithm with a class distribution constraint," Proceedings of the 7th Conference on Natural Language Learning (HLT-NAACL 2003), pp. 127–134, 2003.
[3] H. J. Kim, J. U. Kim, and Y. G. Ra, "Boosting naïve Bayes text classification using uncertainty-based selective sampling," Neurocomputing, vol. 67, pp. 403–410, 2005.


[4] S. A. Engelson and I. Dagan, "Committee-based sample selection for probabilistic classifiers," Journal of Artificial Intelligence Research, vol. 11, pp. 335–360, 1999.
[5] S. B. Kim, K. S. Han, H. C. Rim, and S. H. Myaeng, "Some effective techniques for naive Bayes text classification," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 11, pp. 1457–1466, 2006.
[6] S. Hassan, M. Rafi, and M. S. Shaikh, "Comparing SVM and naïve Bayes classifiers for text categorization with Wikitology as knowledge enrichment," Proceedings of the 14th IEEE International Multitopic Conference, pp. 31–34, 2011.
[7] H. Jing, Y. Tsao, K. U. Chen, and H. M. Wang, "Semantic naïve Bayes classifier for document classification," Proceedings of the International Joint Conference on Natural Language Processing, pp. 1117–1123, 2013.
[8] J. Kramer and C. Gordon, "Improvement of a naïve Bayes sentiment classifier using MRS-based features," Lexical and Computational Semantics, vol. 22, pp. 22–29, 2014.
[9] T. Liu, Z. Chen, B. Zhang, W. Y. Ma, and G. Wu, "Improving text classification using local latent semantic indexing," Proceedings of the IEEE International Conference on Data Mining, pp. 162–169, 2004.
[10] H. J. Kim, K. J. Hong, and J. Y. Chang, "Semantically enriching text representation model for document clustering," Proceedings of the 30th ACM Symposium on Applied Computing, pp. 922–925, 2015.
[11] R. Wille, "Formal concept analysis as mathematical theory of concepts and concept hierarchies," Formal Concept Analysis, Springer Berlin Heidelberg, pp. 1–33, 2009.
[12] K. J. Hong and H. J. Kim, "A semantic search technique with Wikipedia-based text representation model," Proceedings of the IEEE International Conference on Big Data and Smart Computing, pp. 177–182, 2016.
[13] D. Cai, X. He, and J. Han, "Tensor space model for document analysis," Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 625–626, 2006.
