NLP Text Classification Week 4
Natural Language Processing
1. Automatic or semi-automatic processing of human language
2. Can be used for various applications, such as:
● Intent Classification
● Topic Labeling
General Process
Data → Pre-process to the desired text format → Features → Feed the data to the model → Prediction → Output the class
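The pipeline above can be sketched end to end in plain Python. The keyword-rule "model" is a hypothetical stand-in for a trained classifier, and the vocabulary and function names are illustrative:

```python
import re

def preprocess(text):
    """Pre-process: lowercase and tokenize into words."""
    return re.findall(r"[a-z']+", text.lower())

def extract_features(tokens, vocab):
    """Features: bag-of-words counts over a fixed vocabulary."""
    return [tokens.count(w) for w in vocab]

def predict(features, vocab):
    """Toy stand-in model: 'positive' if 'good' outweighs 'bad'."""
    score = features[vocab.index("good")] - features[vocab.index("bad")]
    return "positive" if score > 0 else "negative"

vocab = ["good", "bad", "product"]
tokens = preprocess("This product is really good!")
feats = extract_features(tokens, vocab)
print(predict(feats, vocab))  # positive
```

Each stage maps onto one box of the flow: data in, pre-processing, feature extraction, model, predicted class out.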
● Dataset: Amazon Reviews
● Features: Bag-of-words + TF*IDF
● Features: Word2vec
One-hot encoding
- Creates a binary vector the length of the vocabulary, with a 1 at the index of the target word
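A minimal sketch of one-hot encoding over a small vocabulary (names are illustrative):

```python
def one_hot(word, vocab):
    """Binary vector with a single 1 at the word's index in the vocabulary."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

vocab = ["the", "cat", "sat"]
print(one_hot("cat", vocab))  # [0, 1, 0]
```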
Bag-of-words
- Uses the count of each vocabulary word in the document as its feature value
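The same idea as a short sketch, counting each vocabulary word in a document:

```python
from collections import Counter

def bag_of_words(tokens, vocab):
    """Count of each vocabulary word in the given document."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

vocab = ["the", "cat", "sat", "mat"]
doc = "the cat sat on the mat".split()
print(bag_of_words(doc, vocab))  # [2, 1, 1, 1]
```

Unlike one-hot encoding, the vector holds counts rather than a single 1, so repeated words are reflected in the features.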
TF*IDF
- Term Frequency * Inverse Document Frequency
● Words that occur frequently across documents are typically not important and carry less weight (stopwords such as “is, are, the, etc.”)
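A hand-rolled sketch, assuming the common definitions tf(t, d) = count(t, d) / len(d) and idf(t) = log(N / df(t)). A word that appears in every document gets idf = log(1) = 0, which is exactly why frequent stopwords end up with little or no weight:

```python
import math

def tf_idf(docs):
    """TF*IDF matrix for a list of tokenized documents."""
    n = len(docs)
    vocab = sorted({t for d in docs for t in d})
    # Document frequency: in how many documents each term appears.
    df = {t: sum(t in d for d in docs) for t in vocab}
    rows = []
    for d in docs:
        tf = {t: d.count(t) / len(d) for t in vocab}
        rows.append([tf[t] * math.log(n / df[t]) for t in vocab])
    return vocab, rows

docs = [["the", "cat", "sat"], ["the", "dog"], ["the", "cat"]]
vocab, weights = tf_idf(docs)
print(weights[0][vocab.index("the")])  # 0.0 -- "the" appears in every document
```

Library implementations (e.g. scikit-learn's TfidfVectorizer) use smoothed variants of idf, so exact values differ, but the ranking intuition is the same.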
● Naive Bayes
● K-nearest neighbors
● Multilayer Perceptron
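As an illustration of the first model in the list, here is a compact multinomial Naive Bayes with add-one smoothing over bag-of-words counts. It is a sketch with made-up toy data, not a production implementation; the k-NN and MLP options are usually taken from a library such as scikit-learn:

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        # Log class priors from label frequencies.
        self.priors = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        self.vocab = set()
        for d, c in zip(docs, labels):
            self.counts[c].update(d)
            self.vocab.update(d)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        v = len(self.vocab)

        def log_prob(c):
            # Sum of log-likelihoods with add-one smoothing, plus the prior.
            lp = self.priors[c]
            for t in doc:
                lp += math.log((self.counts[c][t] + 1) / (self.totals[c] + v))
            return lp

        return max(self.classes, key=log_prob)

nb = NaiveBayes().fit(
    [["great", "movie"], ["loved", "it"], ["awful", "movie"], ["hated", "it"]],
    ["pos", "pos", "neg", "neg"],
)
print(nb.predict(["great", "it"]))  # pos
```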