International Journal of Scientific & Engineering Research, Volume 2, Issue 9, September-2011
ISSN 2229-5518

Ontology Based Text Categorization - Telugu Documents
Mrs. A. Kanaka Durga, Dr. A. Govardhan
Abstract: In this paper, we introduce a new method of ontology based text classification and retrieval for Telugu documents. Many text categorization techniques are based on word and/or phrase analysis of the text. Term frequency analysis signifies the importance of a term within a document. Two terms within a document can have the same frequency, but one term may contribute more to the meaning of a sentence than the other. Our aim is to capture the semantics of a text. The model we developed captures the terms that present the concepts in the text and thus identifies the topic of the document. We introduce a new concept-based model which analyzes terms at the sentence and document levels. This concept-based model effectively discriminates between terms that are unimportant with respect to sentence semantics and terms which hold the concepts that represent the sentence meaning. The limitations of keyword-based search are overcome by the use of an ontology, which is a motivation of semantic IR. The retrieval model is based on an adaptation of the classic vector-space model. In a learning stage, the concepts of the ontology are associated with related words and their weights from pre-classified documents. In the main process, the words and their mutual relations are extracted from the target documents, and the concepts of the ontology are used to map the target document. The test results are described in the paper, and we explain how concept-based classification is superior to word-based classification for Telugu documents.

Index Terms: Concept-based model, IR, Ontology, Retrieval model, Term frequency, Text categorization, Telugu documents.

1. INTRODUCTION
In the current paper we have focused our efforts on electronic documents in the Telugu language.

The Telugu Language: Telugu is the second most spoken language in India after Hindi. Telugu belongs to the South Central Dravidian subgroup of the Dravidian family of languages. It has recently been awarded Classical status. Telugu has been the language of choice for lyrical compositions because of its vowel-ending words, and is rightly called the "Italian of the East". Words in Dravidian languages, especially in Telugu, are long and complex. Telugu, like other Dravidian languages, is morphologically rich and agglutinative in nature. Telugu has 16 vowels and 40 consonants.

Text Categorization: Text categorization (TC) is the classification of documents with respect to a set of one or more pre-existing categories. TC is a hard and very useful operation, frequently applied to assign subject categories to documents, to route and filter texts, or as a part of natural language processing systems. In the past, several methods proposed for text categorization were typically based on the classical Bag-of-Words model, where each term or term stem is an independent feature. The disadvantages of this classical representation are:
a) It ignores any relation between words, so learning algorithms are restricted to detecting patterns in the terminology used, while conceptual patterns remain ignored.
b) The representation space has a high dimensionality.
In this article, we propose a new method for text categorization, which is based on the use of the WordNet ontology to capture the relations between the words. In this approach, terms are merged with their associated concepts extracted from the ontology to form a hybrid model for text representation. We have undertaken a series of experiments on Telugu documents which highlight the positive contribution of this approach.

Ontology: An ontology does not necessarily prescribe norms for construction, definition or expression. A conceptual description of ontology, including concept, attribute, entity and association description, with the main purpose of knowledge sharing and reuse, is given by Jade Goldstein [2]. In an ontology, concepts (classes) are abstract sets; attributes (properties) describe the characteristics of objects; entities (individuals, instances) are real things; and associations (relations) name the link between two concepts or entities.

An ontology is a formal, explicit specification of a shared conceptualization. In the context of knowledge sharing, we will use the term ontology to mean a specification of a conceptualization. That is, an ontology is a description of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as a set of concept definitions, but is more general.

The proposed work is an efficient way of extracting text from Telugu documents and performing information retrieval on those documents.

Related work in this area is explained in Section 2. Our proposed work and its layout are explained in Section 3. Results and performance are dealt with in Section 4. Section 5 states the conclusions and further work to be done.

2. LITERATURE SURVEY
Semantics has been introduced at various linguistic levels (word level, sentence level and document content extraction level) and at various stages of Information Retrieval such as query and document
representation, and in indexing. Any attempt to bring in semantics needs to balance the amount of complex natural language processing required against the increase in retrieval performance. It is important to note that the pre-processing done for document representation is an offline, one-time process whose benefit to retrieval performance is obtained on every query. The main modules of IR are pre-processing, indexing and retrieval. A set of documents is given as input to the pre-processing phase, where the stop words and punctuation are removed. The stemmer then reduces the content words to root words, and the POS (Part of Speech) tagger determines their parts of speech. Basically, a document can be represented as a bag of words using the Boolean model. The bag of words, however, does not provide a ranking of the retrieved documents.

To overcome this limitation, keyword-based search has been put forward, where precision and recall are improved, but this also gives some ambiguous results. The use of ontology is the motivation of semantic information retrieval. A semantic search engine is viewed as a tool that gets formal ontology-based queries (e.g., in RDQL, RQL, SPARQL, etc.) from a client, executes them against a knowledge base (KB), and returns ontology values that satisfy the query. These techniques typically use Boolean search models based on an ideal view of the information space as consisting of non-ambiguous, non-redundant, formal pieces of ontological knowledge.

Conceptual search, i.e., search based on meaning rather than just character strings, has been the motivation of a large body of research in the IR field since long before semantic information retrieval emerged. This drive can be found in popular and widely explored areas such as Latent Semantic Indexing, linguistic conceptualization approaches, or the use of thesauri and taxonomies to improve retrieval.

Those proposals are commonly based on shallow and sparse conceptualizations, usually considering very few different types of relations between concepts and low information specificity levels. The model we propose considers a much more detailed and densely populated conceptual space in the form of an ontology-based KB. Though it is difficult to obtain such a rich conceptual space, this is one of the major targets addressed by the Semantic Web research community.

Our approach combines the flexibility and generality of an IR model for unstructured search spaces with the expressiveness and detail of a structured relational model, which describes some of the knowledge involved in the unstructured information space in a structured and formal way, with powerful and precise data querying facilities. An ontology-based approach can be relied upon since it enables further inferencing capabilities that can be exploited to enhance the retrieval process. By building upon an ontology-based layer, our model benefits from semantic data integration facilities.

Mayfield and Finin combine ontology-based techniques and text-based retrieval in sequence. We share with Mayfield et al. the idea that semantic search should be a complement of keyword-based search as long as not enough ontologies and metadata are available.

3. DETAILS OF THE WORK CARRIED OUT
To start with, each text document is tokenized so that it gives rise to a set of words. The efficiency levels are low when we apply any of the conventional classifier methodologies to these words. These low efficiency levels are attributed to the inflected forms of the words. To overcome this defect, we use a morphological analyzer tool to get the root words. As a next step, domain-specific key words are identified. A text classifier is applied to the key words selected from the Telugu document, and we found that the classifier efficiency on key words is better than on the raw words. We found that when we applied the ontological classification, there was a large improvement in the classifier efficiency. Here we have used the Ontology_Dictionary (WordNet - Telugu) developed by the Centre for Applied Linguistics and Translation Studies (CALTS-UOH), University of Hyderabad (Central University), for feature grouping. All words which are grouped based on their features are termed a word class/concept. With the help of the ontology, terms that are found in and around the same concept are mapped into one dimension. This helps in excluding or disambiguating the terms that are present in many concepts due to semantic ambiguity.

3.1 An Illustration Using Ontology Based Classification for Telugu Document

Telugu: "AarDika mMtrito, kAryadarsito muKhya mMtri assembly lo mMtaNalu"
English: "Chief Minister discussed with the Finance Minister and secretary in the assembly"

Words: {AarDika, mMtrito, muKhya, mMtri, assembly lo, mMtaNalu}
Root word: mMtri
Key words: {mMtri, assembly}
Feature Grouping: {mMtri, muKhyamMtri, kAryadarsi}
Word Class/Concept: {mMtri, muKhyamMtri, kAryadarsi}

[Figure: ontology tree rooted at Rajakeeyalu. Node labels by level, top to bottom: PaRtI, Assembly; then Party gurtu, party office, Abhyarthi, Palaka Paksham, prati pakshmam; then Speaker, Dy Speaker, Mantrulu, Sakhalu, Floorleader.]

On the pre-processed document, root words are extracted through the ontology at concept level. Noun words are identified, and their frequencies are computed and preserved in the data bank. On the nouns thus retrieved, feature-matrix clusters are developed. We have calculated a representative feature vector for each concept node in the ontology. We have then measured the similarity of two such class vectors by a simple cosine measure.
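The cosine measure between two concept-node vectors can be illustrated with a minimal Python sketch. The function name `cosine_similarity`, the sparse dictionary representation, and the noun counts below are assumptions made for illustration only; they are not taken from the experiments reported in this paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)

# Hypothetical noun frequencies for two concept nodes (invented values).
assembly_vector = {"mMtri": 3.0, "Speaker": 2.0, "Sakhalu": 1.0}
party_vector = {"Abhyarthi": 2.0, "mMtri": 1.0}

similarity = cosine_similarity(assembly_vector, party_vector)
print(f"cosine similarity = {similarity:.4f}")
```

A value near 1 indicates that two concept nodes share most of their weighted nouns, while a value near 0 indicates that they share almost none.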
3.2 Equations
Algorithm:
1. Start
2. Morph Analysis (finding base words)
3. Apply Ontology
4. Find the sub-category
5. Recognize the parent node as a category of the respective document
6. If the parent has a child, repeat the process (iterate steps 3-5)
7. Otherwise take the parent as the final category

3.3 Vector Space Model

We used the vector space model to weigh terms and calculate feature vectors.
The weight of a term is given as: $w_{ik} = tf_{ik} \times idf_k$
where
$tf_{ik}$ is the number of occurrences of term $t_k$ in document $i$, and
$idf_k$ is the inverse document frequency of the term $t_k$ in the collection of documents.
A commonly used measure for the inverse document frequency is:
$idf_k = \log(N / n_k)$
where
$N$ is the total number of documents in the collection, and
$n_k$ is the number of documents which contain the term $t_k$.

Ontology based classification is carried out in three steps:
Step I : Ontology creation
Step II : Calculating relevance score
Step III : Text classification

Experiments were conducted on a small sample of 400 Telugu documents, which were broadly categorised into two categories, namely rajakeeyalu (Politics) and Aatalu (Sports). Out of these 400 documents, 80% were used as training documents and the remaining 20% were used as testing documents.

3.4 Hypothesis
H1: On ontology based text categorization, more distinctive features are mapped towards the right of the ontology scale.
H0: On ontology based text categorization, less distinctive features are mapped towards the left of the ontology scale.

If H1 is true, we accept that the efficiency levels are better. To measure performance, we calculated the recall rate and the precision rate:
Recall rate = a / b and precision rate = a / c
where
a = number of documents which are classified into the category correctly,
b = number of documents of the category in the testing data,
c = number of documents which are classified into the category.

Table 1. Groups of Misclassification

Result type                                  No. of texts
Texts assigned to the subclass category      20
Texts assigned to the superclass category    4
Texts assigned to the other category         40
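The recall rate and precision rate defined in this section can be computed as in the following minimal Python sketch. The function name `recall_and_precision` and the eight labelled test documents are hypothetical, chosen only to make the counts a, b and c easy to follow; they are not the results of the experiments.

```python
def recall_and_precision(true_labels, predicted_labels, category):
    """Return (recall, precision) for one category using the counts a, b, c."""
    pairs = list(zip(true_labels, predicted_labels))
    # a: documents classified into the category correctly
    a = sum(1 for t, p in pairs if t == category and p == category)
    # b: documents of the category in the testing data
    b = sum(1 for t in true_labels if t == category)
    # c: documents classified into the category, correctly or not
    c = sum(1 for p in predicted_labels if p == category)
    recall = a / b if b else 0.0
    precision = a / c if c else 0.0
    return recall, precision

# Hypothetical test set: four documents per category (invented labels).
true_labels = ["rajakeeyalu"] * 4 + ["Aatalu"] * 4
predicted_labels = [
    "rajakeeyalu", "rajakeeyalu", "rajakeeyalu", "Aatalu",
    "rajakeeyalu", "rajakeeyalu", "Aatalu", "Aatalu",
]

recall, precision = recall_and_precision(true_labels, predicted_labels, "rajakeeyalu")
print(f"recall = {recall:.2f}, precision = {precision:.2f}")
```

In this invented example a = 3, b = 4 and c = 5 for the rajakeeyalu category, so the recall rate is 3/4 and the precision rate is 3/5.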

3.5 FIGURES: FLOW PROCESS OF ONTOLOGY BASED TEXT CATEGORIZATION FOR TELUGU DOCUMENTS

[Figure: flow diagram with the stages Telugu Text Document, Tokenizing, Words, Morphological Analyzer, Root words, Key words, Text Classify, Ontology_Dictionary, Feature grouping, Word Classes, Text Classify.]
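The early stages of this flow, combined with the term weighting of Section 3.3, can be sketched in Python as follows. This is only a rough illustration under stated assumptions: the lookup tables `ROOT_FORMS` and `CONCEPT_OF` are hypothetical stand-ins for the Morphological Analyzer and the Ontology_Dictionary, the second and third documents are invented, the natural logarithm is assumed for the inverse document frequency, and stop-word removal and key-word selection are omitted.

```python
import math
from collections import Counter

# Hypothetical stand-ins for the Morphological Analyzer and the
# Ontology_Dictionary; the real tools would replace these lookup tables.
ROOT_FORMS = {"mMtrito": "mMtri", "kAryadarsito": "kAryadarsi"}
CONCEPT_OF = {"mMtri": "mMtri", "muKhyamMtri": "mMtri", "kAryadarsi": "mMtri"}

def concept_features(text):
    """Tokenize, reduce to root words, and count occurrences per concept."""
    tokens = text.replace(",", " ").split()
    roots = [ROOT_FORMS.get(token, token) for token in tokens]
    # Words of the same word class are mapped into one dimension.
    concepts = [CONCEPT_OF.get(root, root) for root in roots]
    return Counter(concepts)

def tf_idf(documents):
    """Weight each concept as w_ik = tf_ik * log(N / n_k)."""
    counts = [concept_features(doc) for doc in documents]
    n_docs = len(counts)
    # n_k: number of documents that contain concept k at least once
    doc_freq = Counter(term for c in counts for term in c)
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in c.items()}
        for c in counts
    ]

SENTENCE = "AarDika mMtrito, kAryadarsito muKhya mMtri assembly lo mMtaNalu"
documents = [SENTENCE, "assembly lo Speaker", "Aatalu"]  # last two are invented

weights = tf_idf(documents)
for term, weight in sorted(weights[0].items()):
    print(f"{term}: {weight:.3f}")
```

Because mMtrito, kAryadarsito and mMtri all fall into the same word class, they contribute to a single dimension of the feature vector, which is the dimensionality reduction that the ontology is meant to provide.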

IJSER © 2011
http://www.ijser.org
4. CONCLUSION

Earlier research has shown that, in conventional methods, misclassified items are not accessible. Further, it is also easier to develop weakly thesauruses than conventional methods. In this paper, we have proposed an ontology model for Telugu documents and shown that the efficiency of text classification is better with this model than with the conventional methods.

5. ACKNOWLEDGMENTS
We sincerely thank Dr. G. Uma Maheswara Rao for providing the Ontology_Dictionary for Telugu and the Morphological Analyzer tool.

6. REFERENCES
[1] F. Sebastiani, "Machine Learning in Automated Text Categorization," ACM Computing Surveys, vol. 34, no. 1, pp. 1-47, 2002.

[2] Jade Goldstein, Mark Kantrowitz, Vibhu Mittal, and Jaime Carbonell, "Summarizing Text Documents: Sentence Selection and Evaluation Metrics," in ACM SIGIR 1999, pp. 121-128, 1999.

[3] Dr. G. Uma Maheswara Rao, Morphological Analyser, at the centre for ALTS, University of Hyderabad.

[4] Dr. G. Uma Maheswara Rao and research team, "Ontology_Dictionary-Telugu", at the centre for ALTS, University of Hyderabad.

[5] A. Karthikeyan et al., "A Novel Approach Using Semantic Information Retrieval for Tamil Documents", International Journal of Engineering Science and Technology, vol. 2(9), pp. 4424-4433, 2010.

[6] S. M. Chaware et al., "A Survey: Issues of Semantic Matching for Indian Languages Using Ontology", International Journal of Information Technology and Knowledge Management, vol. 2(2), pp. 351-354, 2010.
