NLP Pipeline
Introduction
Natural Language Processing (NLP) is a subfield of Artificial
Intelligence (AI) that enables machines to understand, interpret,
and generate human language. This is what makes natural
human-computer interaction possible.
What is Artificial Intelligence (AI)?
Definition: AI enables computers to mimic human
intelligence.
AI can reason, learn, and make decisions.
Examples: Chatbots, autonomous vehicles, recommendation
systems.
What is Machine Learning (ML)?
What is Natural Language Processing (NLP)?
A branch of AI that enables computers to understand and
interpret human language.
Key tasks: Sentiment analysis, automatic text summarization,
speech recognition.
Examples: Google Translate, virtual assistants (Siri, Alexa),
chatbots.
Introduction to NLP
NLP enables machines to understand and process human
language.
It has evolved from rule-based systems to AI-driven
technologies.
NLP impacts industries like healthcare, finance, and customer
service.
The Genesis of NLP (1950s – 1960s)
Alan Turing’s 1950 paper introduced the Turing Test.
WORD TOKENIZATION
LOWER CASING
• Lower casing:
Convert all text to a single case, preferably lowercase, so that the
same word written in different cases (for example, "Apple" and
"apple") is treated as one token. This reduces the vocabulary size
and the complexity of later steps.
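
The lower casing step can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own code: the function name `to_lowercase` and the sample sentence are assumptions made for the example.

```python
from collections import Counter


def to_lowercase(text: str) -> str:
    """Return the text with every cased character converted to lowercase."""
    return text.lower()


# Hypothetical sample text, chosen only to show the effect of the step.
sample = "Apple released a new phone. I like the apple logo."

# Without lower casing, "Apple" and "apple" are counted as different words.
raw_counts = Counter(sample.split())

# After lower casing, both spellings collapse into a single entry.
lowered = to_lowercase(sample)
lowered_counts = Counter(lowered.split())

print(lowered)
print(raw_counts["Apple"], raw_counts["apple"], lowered_counts["apple"])
```

One trade-off to keep in mind: lower casing also erases distinctions that may matter later, such as "Apple" the company versus "apple" the fruit, so some pipelines run named entity recognition before this step.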
Punctuation removal
• Punctuation removal: In this step, all punctuation marks
present in the text are removed.
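
Punctuation removal can be sketched with the Python standard library alone. The snippet below is one possible approach, assuming that "punctuation" means the ASCII characters in `string.punctuation`; the function name `remove_punctuation` and the sample sentence are made up for the example.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: non-ASCII marks (e.g. curly quotes) are out of scope here.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation(text: str) -> str:
    """Return the text with all ASCII punctuation characters removed."""
    return text.translate(_PUNCT_TABLE)


# Hypothetical sample text.
sample = "Hello, world! NLP is fun: isn't it?"
cleaned = remove_punctuation(sample)
print(cleaned)  # Hello world NLP is fun isnt it
```

Note that the apostrophe in "isn't" is deleted as well, which turns the word into "isnt". For tweet data, blanket removal also strips "#" and "@", so hashtags and mentions may need to be handled before this step.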
Stop word removal
Real Time Tweet Data
TOKENIZATION
Stop word removal
PART OF SPEECH TAGGING
POS TAGGING OF TWEET DATA
STEMMING AND LEMMATIZATION
• Stemming strips the last few characters from a word to reduce it to a
stem, which often produces incorrect meanings or spellings.
• Lemmatization considers the context and converts the word to its
meaningful base form, which is called the lemma.
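
The contrast between the two techniques can be illustrated with a toy sketch. Everything here is an assumption made for teaching purposes: `naive_stem` is a crude suffix stripper (not the Porter algorithm), and `toy_lemmatize` uses a tiny hand-written lookup table keyed by word and part of speech in place of a real dictionary. In practice a library such as NLTK or spaCy would be used instead.

```python
# Suffixes tried in order; a deliberately crude rule set for illustration.
SUFFIXES = ("ing", "ies", "ed", "es", "s")


def naive_stem(word: str) -> str:
    """Chop the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical lemma table: (word, part of speech) -> lemma.
# The part-of-speech tag stands in for the "context" a lemmatizer uses.
LEMMA_TABLE = {
    ("studies", "NOUN"): "study",
    ("studies", "VERB"): "study",
    ("running", "VERB"): "run",
    ("better", "ADJ"): "good",
    ("meeting", "NOUN"): "meeting",
    ("meeting", "VERB"): "meet",
}


def toy_lemmatize(word: str, pos: str) -> str:
    """Look up the lemma for a word in context; fall back to the word itself."""
    return LEMMA_TABLE.get((word, pos), word)


# Stemming produces non-words such as "stud" and "runn".
print(naive_stem("studies"), naive_stem("running"))

# Lemmatization returns dictionary forms and depends on the part of speech.
print(toy_lemmatize("studies", "VERB"), toy_lemmatize("running", "VERB"))
print(toy_lemmatize("meeting", "NOUN"), toy_lemmatize("meeting", "VERB"))
```

The stemmer is fast and needs no dictionary, but its output may not be a real word. The lemmatizer needs a vocabulary and a part-of-speech tag, which is why POS tagging usually comes before lemmatization in the pipeline.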
STEMMING OF THE TWEET
LEMMATIZATION OF THE TWEET
Named entity recognition
Output of Pre-Processed Tweet
Word cloud of Pre-Processed Tweet
Feature engineering