Text Classification Based On Random Forest Algorithm
Traditional random forest classifiers perform poorly on text when the extracted
text features are of low quality, so a random forest method designed for text
information is proposed. Because the quality of the decision trees in a
traditional random forest is difficult to control, a weighted voting mechanism is
proposed to improve the quality of the decision trees. The algorithm uses the tr-k
text feature extraction method to improve the quality and diversity of the text
features, and uses the BERT word vector generation model to represent the text.
Experimental results obtained in a Python environment show that this method
achieves better text classification results than an IDF-based random forest.
EXISTING SYSTEM:
The existing system uses Naive Bayes. In Naive Bayes, texts are classified
based on posterior probabilities computed from the presence of different
classes of words in the texts. The classifier assumes that the words in a text
are conditionally independent given the class. This assumption makes a Naive
Bayes classifier far cheaper to compute than non-naive Bayesian approaches,
which are exponential in complexity. However, Naive Bayes is found to be a less
accurate model for text classification.
Algorithm: Naive Bayes.
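
As an illustration of the posterior-based classification described above, the
following is a minimal sketch of a multinomial Naive Bayes classifier in plain
Python. The function names, the toy documents and the use of add-one (Laplace)
smoothing are assumptions made for this example; the existing system's actual
implementation is not specified in this document.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class from tokenized docs."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(tokens, class_counts, word_counts, vocab):
    """Return the class with the highest log-posterior score."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        # Log prior of the class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Independence assumption: per-word likelihoods are multiplied
            # (added in log space). Add-one smoothing is an assumption here.
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, for illustration only.
docs = [
    ["goal", "match", "team"],
    ["election", "vote", "party"],
    ["team", "win", "match"],
    ["party", "policy", "vote"],
]
labels = ["sports", "politics", "sports", "politics"]
model = train_naive_bayes(docs, labels)
print(predict(["team", "vote", "match"], *model))
```

Because each word contributes an independent factor, the score for a class is a
simple sum of logarithms, which is why the computation stays cheap.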
PROPOSED SYSTEM:
The proposed method is based on the random forest algorithm and is used to perform
text classification. In the traditional random forest algorithm, the number and
quality of the selected features are important. For books and other large-volume
texts in particular, the greater the number and the higher the quality of the text
features (the attributes of the classification decision trees), the better the
classification effect. Therefore, this paper proposes a tr-k method, which
combines TF-IDF, TextRank and K-means, to improve the effect of text
classification. TF-IDF stands for term frequency-inverse document frequency.
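
To make the TF-IDF component concrete, the sketch below computes TF-IDF weights
in plain Python. It assumes one common variant, in which the term frequency is
the term count divided by the document length and the inverse document
frequency is log(N / df). The paper's exact weighting formula is not given
here, and the function name and toy documents are hypothetical. The TextRank
and K-means steps of tr-k are not shown.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in docs for term in set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            # tf = count / document length, idf = log(N / df) (assumed variant)
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
docs = [
    ["random", "forest", "text"],
    ["text", "classification"],
    ["forest", "tree", "tree"],
]
weights = tf_idf(docs)
for doc_weights in weights:
    print(doc_weights)
```

Under this variant, a term that appears in every document receives a weight of
zero, while a term that is frequent in one document and rare elsewhere receives
a high weight.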
⮚ The tr-k method combines TF-IDF, TextRank and K-means to improve the
effect of text classification.
⮚ The random forest algorithm (RFA) has achieved good results in biochip,
information extraction and other fields.
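
The weighted voting mechanism mentioned in the abstract can be sketched as
follows. This is only an illustration under stated assumptions: each tree's
weight is taken to be a hypothetical validation accuracy, and the function name
and the example values are invented for this sketch. How the paper actually
derives the weights is not described in this document.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-tree class predictions using per-tree weights."""
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        # Each tree adds its weight to the class it predicts.
        scores[label] += weight
    # The class with the largest total weight wins.
    return max(scores, key=scores.get)


# Hypothetical predictions from three trees and assumed per-tree weights.
tree_predictions = ["sports", "politics", "politics"]
tree_weights = [0.95, 0.40, 0.45]
print(weighted_vote(tree_predictions, tree_weights))
```

In this example an unweighted majority vote would pick "politics", whereas the
weighted vote picks "sports" because the single high-weight tree outweighs the
two low-weight trees.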
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
REFERENCE:
R. Kingsy Grace and B. Suganya, Department of Computer Science and Engineering,
Sri Ramakrishna Engineering College, Coimbatore, India, "Research of text
classification based on random forest algorithm," 2020 6th International
Conference on Advanced Computing and Communication Systems (ICACCS).
Date added to IEEE Xplore: 23 April 2020. INSPEC Accession Number: 19557097.
DOI: 10.1109/ICACCS48705.2020.9074233