Natural Language Processing
Sentiment Analysis
1. Sentiment analysis identifies the sentiment across several posts, or even
within a single post where the emotion is not explicitly expressed.
2. Companies use it to identify opinions and sentiments to understand what
customers think about their products and services.
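A simple way to see how sentiment analysis works is a lexicon-based scorer: count positive and negative words and compare. The word lists below are a made-up toy lexicon for illustration, not any standard resource; real systems use much larger lexicons or machine learning models.

```python
# Toy word lists (hypothetical, for illustration only).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}

def sentiment(text):
    """Label text positive/negative/neutral by counting lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great product"))  # positive
print(sentiment("The service was terrible"))   # negative
```

Note that such a counter cannot handle negation ("not good") or sarcasm, which is why companies typically rely on trained models rather than word counts alone.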
Virtual Assistants
1. Nowadays, Google Assistant, Cortana, Siri, Alexa, etc. have become an integral
part of our lives. Not only can we talk to them, but they can also make our
lives easier.
2. By accessing our data, they can help us in keeping notes of our tasks, making
calls for us, sending messages, and a lot more.
3. With the help of speech recognition, these assistants can not only detect our
speech but can also make sense of it.
4. According to recent research, a lot more advancements are expected in this
field in the near future.
Chatbots
Script bots                                  | Smart bots
---------------------------------------------|---------------------------------------------
Script bot functioning is very limited,      | Smart bots are flexible and powerful.
as they are less powerful.                   |
Script bots work around a script which       | Smart bots work on bigger databases and
is programmed in them.                       | other resources directly.
No or little language processing skills      | NLP and machine learning skills are
required.                                    | required.
Sentence 1: “His face turned red after he found out that he took the wrong
bag.”
Possible meanings:
• Is he feeling ashamed because he took another person’s bag instead of his?
• Is he feeling angry because he did not manage to steal the bag that he had
been targeting?
• This statement is correct in syntax, but does it make any sense?
• In human language, a perfect balance of syntax and semantics is important
for better understanding.
1. Sentence Segmentation
Under sentence segmentation, the whole corpus is divided into sentences. Each
sentence is taken as a separate unit of data, so the whole corpus is reduced to
a list of sentences.
Example:
1. Sentence: “You want to see the dreams with close eyes and achieve them?”
2. Tokenisation: Each sentence is further divided into tokens (words, numbers,
and special characters).
Tokens: You | want | to | see | the | dreams | with | close | eyes | and |
achieve | them | ?
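The sentence segmentation and tokenisation steps above can be sketched in Python with the standard `re` module. The second sentence in the corpus below is made up to show the segmentation producing more than one sentence.

```python
import re

corpus = ("You want to see the dreams with close eyes and achieve them? "
          "Work hard for it.")

# Sentence segmentation: split the corpus after sentence-ending punctuation.
sentences = re.split(r"(?<=[.?!])\s+", corpus)

# Tokenisation: split a sentence into word tokens and punctuation tokens.
tokens = re.findall(r"\w+|[^\w\s]", sentences[0])

print(sentences)
print(tokens)
```

Running this yields the same token list as the worked example: each word becomes a token, and the `?` is kept as its own token.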
3. Removing Stopwords, Special Characters and Numbers
In this step, the tokens which are not necessary are removed from the token
list. To make it easier for the computer to focus on meaningful terms, these
words are removed. A removed token could also be a number or a special
character.
Stopwords: Stopwords are the words that occur very frequently in the corpus but
do not add any value to it.
Examples: a, an, and, are, as, for, it, is, into, in, if, on, or, such, the, there, to.
Example:
1. “You want to see the dreams with close eyes and achieve them?”
The removed words would be: to, the, and, ?
2. The outcome would be: “You want see dreams with close eyes achieve them”
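A minimal sketch of stopword removal, using the example token list and the stopword list given above: keep only alphabetic tokens that are not stopwords, which also drops the `?` special character.

```python
# The stopword list from the notes above.
STOPWORDS = {"a", "an", "and", "are", "as", "for", "it", "is", "into",
             "in", "if", "on", "or", "such", "the", "there", "to"}

tokens = ["You", "want", "to", "see", "the", "dreams", "with", "close",
          "eyes", "and", "achieve", "them", "?"]

# Keep alphabetic tokens that are not stopwords; numbers and special
# characters fail isalpha() and are removed too.
filtered = [t for t in tokens if t.isalpha() and t.lower() not in STOPWORDS]
print(filtered)
```

The output matches the worked example: `['You', 'want', 'see', 'dreams', 'with', 'close', 'eyes', 'achieve', 'them']`.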
4. Converting text to a common case
In this step, we convert the whole text into a single case, preferably lower
case. This ensures that the machine does not treat the same word as two
different words just because of different cases.
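Case conversion is a one-line step in Python; the token list here continues the running example.

```python
tokens = ["You", "want", "see", "Dreams"]

# Lowercase every token so "Dreams" and "dreams" map to the same word.
lowered = [t.lower() for t in tokens]
print(lowered)  # ['you', 'want', 'see', 'dreams']
```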
5. Stemming: Stemming is a technique used to extract the base form of the words
by removing affixes from them. It is just like cutting down the branches of a
tree to its stems.
Word: dreams → Affix removed: -s → Stem: dream
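Stemming can be sketched as naive suffix stripping. The rules below are a toy illustration only; real stemmers such as the Porter stemmer apply many more rules, and stems need not be dictionary words.

```python
def stem(word):
    """Strip a common suffix if enough of the word remains (toy stemmer)."""
    for suffix in ("ing", "ly", "es", "s", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print(stem("dreams"))     # dream
print(stem("achieving"))  # achiev  (the stem need not be a real word)
```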
Bag of Words
Step 3: Create document vector In this step, the vocabulary is written in the top
row. Now, for each word in the document, if it matches with the vocabulary, put a
1 under it. If the same word appears again, increment the previous value by 1.
And if the word does not occur in that document, put a 0 under it.
Finally, the words have been converted to numbers. These numbers are the values
of each document. Here, we can see that since we have a small amount of data,
frequent words like ‘are’ and ‘and’ also receive high values.
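The document-vector step above can be sketched directly: build the vocabulary from all documents, then count each vocabulary word per document. The two toy documents below are made up for illustration.

```python
# Two tokenised toy documents (hypothetical examples).
docs = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],
]

# Vocabulary: every unique word across all documents, in sorted order.
vocab = sorted({w for d in docs for w in d})

# Document vector: for each vocabulary word, count its occurrences in the
# document (0 if absent, incremented for each repeat).
vectors = [[d.count(w) for w in vocab] for d in docs]

print(vocab)
for v in vectors:
    print(v)
```

Each row of `vectors` is one document expressed as numbers, exactly as described in Step 3: a 0 where the vocabulary word is absent, and a count where it appears.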