Natural Language Processing (NLP) (A Complete Guide)
Last updated on Jan 11, 2023
TABLE OF CONTENTS
Introduction
What is Natural Language Processing (NLP)
Why Does Natural Language Processing (NLP) Matter?
What is Natural Language Processing (NLP) Used For?
How Does Natural Language Processing (NLP) Work?
Top Natural Language Processing (NLP) Techniques
Six Important Natural Language Processing (NLP) Models
Programming Languages, Libraries, And Frameworks For Natural Language Processing (NLP)
Controversies Surrounding Natural Language Processing (NLP)
https://www.deeplearning.ai/resources/natural-language-processing/ 1/26
9/4/23, 2:14 PM Natural Language Processing (NLP) [A Complete Guide]
RELEVANT COURSES
Natural Language Processing Specialization
Machine Learning Specialization
Deep Learning Specialization
Introduction
Natural Language Processing (NLP) is one of the hottest areas of artificial intelligence
(AI) thanks to applications like text generators that compose coherent essays,
chatbots that fool people into thinking they’re sentient, and text-to-image programs
that produce photorealistic images of anything you can describe. Recent years have
brought a revolution in the ability of computers to understand human languages,
programming languages, and even biological and chemical sequences, such as DNA
and protein structures, that resemble language. The latest AI models are unlocking
these areas to analyze the meanings of input text and generate meaningful,
expressive output.
Toxicity classification is a branch of sentiment analysis that aims to classify not only hostile intent in general but also particular categories such as threats, insults, obscenities, and hatred toward certain identities. The input to such a
model is text, and the output is generally the probability of each class of toxicity.
Toxicity classification models can be used to moderate and improve online
conversations by silencing offensive comments, detecting hate speech, or
scanning documents for defamation.
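As a sketch of the input/output contract described above (text in, a probability per toxicity class out), here is a toy multi-label scorer. The keyword lists and weights are invented for illustration; a real system would learn them from labeled data rather than hand-pick them.

```python
import math

# Hypothetical keyword lists per toxicity class, for illustration only.
CLASS_KEYWORDS = {
    "threat": {"kill", "destroy", "hurt"},
    "insult": {"idiot", "stupid", "loser"},
    "obscenity": {"damn"},
}

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def toxicity_scores(text: str) -> dict:
    """Return an independent probability per toxicity class (multi-label)."""
    tokens = text.lower().split()
    scores = {}
    for label, keywords in CLASS_KEYWORDS.items():
        hits = sum(1 for t in tokens if t in keywords)
        # A bias of -2 keeps scores low when no keywords match;
        # each keyword hit adds 2 logits of evidence.
        scores[label] = sigmoid(2.0 * hits - 2.0)
    return scores

print(toxicity_scores("you stupid loser"))
```

Because each class gets its own independent score, one comment can be flagged as both an insult and a threat, which is how moderation systems typically frame the task.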
Machine translation automates translation between different languages. The
input to such a model is text in a specified source language, and the output is the
text in a specified target language. Google Translate is perhaps the most famous
mainstream application. Such models are used to improve communication
between people on social-media platforms such as Facebook or Skype. Effective
approaches to machine translation can distinguish between words with similar
meanings. Some systems also perform language identification; that is, classifying
text as being in one language or another.
Companies like Grammarly use grammatical error correction systems to provide a better writing experience to their customers. Schools also use them to grade student essays.
Topic modeling is an unsupervised text mining task that takes a corpus of
documents and discovers abstract topics within that corpus. The input to a topic
model is a collection of documents, and the output is a list of topics that defines
words for each topic as well as assignment proportions of each topic in a
document. Latent Dirichlet Allocation (LDA), one of the most popular topic
modeling techniques, tries to view a document as a collection of topics and a
topic as a collection of words. Topic modeling is being used commercially to help
lawyers find evidence in legal documents.
Text generation, more formally known as natural language generation (NLG),
produces text that’s similar to human-written text. Such models can be fine-
tuned to produce text in different genres and formats — including tweets, blogs,
and even computer code. Text generation has been performed using Markov
processes, LSTMs, BERT, GPT-2, LaMDA, and other approaches. It’s particularly
useful for autocomplete and chatbots.
Autocomplete predicts what word comes next, and autocomplete systems of
varying complexity are used in chat applications like WhatsApp. Google uses
autocomplete to predict search queries. One of the most famous models for
autocomplete is GPT-2, which has been used to write articles, song lyrics,
and much more.
Chatbots automate one side of a conversation while a human conversant
generally supplies the other side. They can be divided into the following two
categories:
Database query: We have a database of questions and answers, and we
would like a user to query it using natural language.
Conversation generation: These chatbots can simulate dialogue with a
human partner. Some are capable of engaging in wide-ranging
conversations. A high-profile example is Google’s LaMDA, which provided
such human-like answers to questions that one of its developers was
convinced that it had feelings.
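Text generation via a Markov process, one of the approaches listed above, can be sketched in a few lines: build a table of which words follow which in a corpus, then repeatedly sample the next word from the observed followers of the current one. The corpus and random seed here are arbitrary choices for illustration.

```python
import random
from collections import defaultdict

def build_bigram_model(corpus: str) -> dict:
    """Map each word to the list of words observed to follow it."""
    words = corpus.split()
    model = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model: dict, start: str, length: int, seed: int = 0) -> str:
    """Walk the chain, sampling each next word from the observed followers."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break  # dead end: the current word was never followed by anything
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat the cat ate the fish"
model = build_bigram_model(corpus)
print(generate(model, "the", 5))
```

Modern systems like GPT-2 and LaMDA replace the bigram table with a neural network conditioned on a much longer context, but the generate-one-token-at-a-time loop is the same idea.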
Information retrieval finds the documents that are most relevant to a query. This
is a problem every search and recommendation system faces. The goal is not to
answer a particular query but to retrieve, from a collection of documents that may number in the millions, the set most relevant to the query.
Document retrieval systems mainly execute two processes: indexing and
matching. In most modern systems, indexing is done by a vector space model
through Two-Tower Networks, while matching is done using similarity or distance
scores. Google recently integrated its search function with a multimodal
information retrieval model that works with text, image, and video data.
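A minimal sketch of the indexing-and-matching pipeline described above, using plain bag-of-words vectors and cosine similarity. A production system would index learned embeddings (for example from a two-tower network) instead of raw word counts; the documents below are invented for illustration.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Similarity score between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Return the k documents most similar to the query (matching step)."""
    q = vectorize(query)
    index = [(doc, vectorize(doc)) for doc in documents]  # indexing step
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

docs = [
    "how to train a neural network",
    "recipes for chocolate cake",
    "neural network architectures for NLP",
]
print(retrieve("neural network", docs, k=2))
```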
Inverse Document Frequency (IDF): How important is the term in the whole corpus?
IDF(word in a corpus) = log(number of documents in the corpus / number of documents that include the word)
A word is important if it occurs many times in a document. But that creates a problem.
Words like “a” and “the” appear often. And as such, their TF score will always be high.
We resolve this issue by using Inverse Document Frequency, which is high if the word
is rare and low if the word is common across the corpus. The TF-IDF score of a term
is the product of TF and IDF.
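The TF and IDF formulas above can be computed directly. A minimal sketch in Python, with a toy corpus invented for illustration:

```python
import math
from collections import Counter

def tf(word: str, document: list) -> float:
    """Term frequency: how often the word appears in this document."""
    return Counter(document)[word] / len(document)

def idf(word: str, corpus: list) -> float:
    """Inverse document frequency: log(total docs / docs containing the word)."""
    containing = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / containing)

def tf_idf(word: str, document: list, corpus: list) -> float:
    """TF-IDF is the product of the two scores."""
    return tf(word, document) * idf(word, corpus)

corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "quantum computing is the future".split(),
]
# "the" appears in every document, so its IDF (and TF-IDF) is zero.
print(tf_idf("the", corpus[0], corpus))
# "cat" is rarer across the corpus, so it scores higher.
print(tf_idf("cat", corpus[0], corpus))
```

Note how the IDF term zeroes out the score of "the" despite its high term frequency, which is exactly the fix the paragraph above describes.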
The Naive Bayes classifier computes the joint probability of a text's words and each candidate label, and predicts the label whose joint distribution has the highest probability. The naive assumption in the Naive Bayes model is that the individual words are independent.
Thus:
P(text|label) = P(word_1|label) * P(word_2|label) * … * P(word_n|label)
In NLP, such statistical methods can be applied to solve problems such as spam
detection or finding bugs in software code.
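A minimal Naive Bayes spam detector following the factorization above, with add-one (Laplace) smoothing so unseen words don't zero out the product. The tiny training set is invented for illustration:

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)   # label -> word frequencies
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        words = text.lower().split()
        n_docs = sum(self.label_counts.values())
        best_label, best_logp = None, -math.inf
        for label in self.label_counts:
            # log P(label) + sum over words of log P(word | label);
            # logs avoid underflow from multiplying many small probabilities.
            logp = math.log(self.label_counts[label] / n_docs)
            total = sum(self.word_counts[label].values())
            for w in words:
                logp += math.log((self.word_counts[label][w] + 1)
                                 / (total + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

texts = ["win free money now", "meeting at noon tomorrow",
         "free prize claim now", "project update attached"]
labels = ["spam", "ham", "spam", "ham"]
model = NaiveBayes().fit(texts, labels)
print(model.predict("claim your free money"))
```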
Decision trees are a class of supervised classification models that split the
dataset based on different features to maximize information gain in those splits.
Latent Dirichlet Allocation (LDA) is used for topic modeling. LDA tries to view a
document as a collection of topics and a topic as a collection of words. LDA is a
statistical approach. The intuition behind it is that we can describe any topic
using only a small set of words from the corpus.
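A compact (and deliberately unoptimized) collapsed Gibbs sampler illustrating how LDA assigns a topic to each word and topic proportions to each document. The corpus, topic count, hyperparameters, and seed are arbitrary choices for illustration:

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Tiny collapsed Gibbs sampler for LDA."""
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})
    # Counts: topic per word position, doc-topic, topic-word, topic totals.
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    ndk = [[0] * n_topics for _ in docs]
    nkw = [defaultdict(int) for _ in range(n_topics)]
    nk = [0] * n_topics
    for di, d in enumerate(docs):
        for wi, w in enumerate(d):
            t = z[di][wi]
            ndk[di][t] += 1; nkw[t][w] += 1; nk[t] += 1
    for _ in range(iters):
        for di, d in enumerate(docs):
            for wi, w in enumerate(d):
                t = z[di][wi]  # remove this word's current assignment
                ndk[di][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
                # Resample proportional to P(topic | doc) * P(word | topic).
                weights = [(ndk[di][k] + alpha) * (nkw[k][w] + beta)
                           / (nk[k] + V * beta) for k in range(n_topics)]
                t = rng.choices(range(n_topics), weights=weights)[0]
                z[di][wi] = t
                ndk[di][t] += 1; nkw[t][w] += 1; nk[t] += 1
    # Per-document topic proportions (the "assignment proportions" above).
    theta = [[(ndk[di][k] + alpha) / (len(d) + n_topics * alpha)
              for k in range(n_topics)] for di, d in enumerate(docs)]
    return theta, nkw

docs = [
    "puppy dog leash bark".split(),
    "dog bark puppy".split(),
    "stock market trade price".split(),
    "market price stock".split(),
]
theta, topic_words = lda_gibbs(docs)
print(theta)
```

Libraries such as Gensim and scikit-learn provide production-quality LDA implementations; this sketch just exposes the counts the model maintains.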
Hidden Markov models: Markov models are probabilistic models that decide the next state of a system based on the current state. For example, in NLP, we might suggest the next word based on the previous word. We can model this as a Markov model where we might find the transition probabilities of going from word1 to word2, that is, P(word2|word1). Then we can use a product of these transition probabilities to estimate the likelihood of a whole word sequence.
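The transition probabilities described above, and the product over them, can be estimated directly from bigram counts. A minimal sketch (the corpus is invented for illustration):

```python
from collections import Counter, defaultdict

def transition_probs(corpus: str) -> dict:
    """Estimate P(next | prev) from bigram counts."""
    words = corpus.split()
    bigrams = Counter(zip(words, words[1:]))
    totals = Counter(words[:-1])  # times each word appears as a 'prev'
    probs = defaultdict(dict)
    for (prev, nxt), count in bigrams.items():
        probs[prev][nxt] = count / totals[prev]
    return probs

def sentence_prob(sentence: str, probs: dict) -> float:
    """Product of transition probabilities over the sentence's bigrams."""
    words = sentence.split()
    p = 1.0
    for prev, nxt in zip(words, words[1:]):
        p *= probs.get(prev, {}).get(nxt, 0.0)
    return p

corpus = "the cat sat on the mat the cat slept"
probs = transition_probs(corpus)
print(probs["cat"])          # e.g. {'sat': 0.5, 'slept': 0.5}
print(sentence_prob("the cat sat", probs))
```

A hidden Markov model adds unobserved states (for example part-of-speech tags) that emit the observed words, but the chain-of-transition-probabilities machinery is the same.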
Recurrent Neural Network (RNN): Many deep learning techniques for text classification process nearby words together using n-grams or a sliding window (as CNNs do), so they can treat “New York” as a single unit. However, they can't capture the context provided by a longer text sequence. They don't learn the
sequential structure of the data, where every word is dependent on the previous
word or a word in the previous sentence. RNNs remember previous information
using hidden states and connect it to the current task. The architectures known
as Gated Recurrent Unit (GRU) and long short-term memory (LSTM) are types of
RNNs designed to remember information for an extended period. Moreover, the
bidirectional LSTM/GRU keeps contextual information in both directions, which is
helpful in text classification. RNNs have also been used to generate mathematical
proofs and translate human thoughts into words.
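To illustrate the hidden-state idea, here is a single recurrent step in plain Python. The weights are random and untrained, and the dimensions and toy embeddings are arbitrary; the point is only to show how state is carried from one word to the next.

```python
import math
import random

def rnn_step(x, h, Wxh, Whh, b):
    """One recurrent step: the new hidden state mixes the current input with
    the previous hidden state, so earlier words influence later ones."""
    size = len(h)
    return [math.tanh(sum(Wxh[i][j] * x[j] for j in range(len(x)))
                      + sum(Whh[i][j] * h[j] for j in range(size))
                      + b[i])
            for i in range(size)]

# Hypothetical setup: 3-dimensional word vectors, 4-dimensional hidden state.
rng = random.Random(0)
in_dim, hid_dim = 3, 4
Wxh = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hid_dim)]
Whh = [[rng.uniform(-1, 1) for _ in range(hid_dim)] for _ in range(hid_dim)]
b = [0.0] * hid_dim

sentence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # toy embeddings
h = [0.0] * hid_dim
for x in sentence:          # the hidden state is carried across time steps
    h = rnn_step(x, h, Wxh, Whh, b)
print(h)
```

GRUs and LSTMs replace the plain tanh update with gated updates that decide what to keep and what to forget, which is what lets them remember information over longer spans.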
Eliza was developed in the mid-1960s to try to pass the Turing Test; that is, to
fool people into thinking they’re conversing with another human being rather than
a machine. Eliza used pattern matching and a series of rules without encoding
the context of the language.
Tay was a chatbot that Microsoft launched in 2016. It was supposed to tweet like
a teen and learn from conversations with real users on Twitter. The bot adopted
phrases from users who tweeted sexist and racist comments, and Microsoft
deactivated it not long afterward. Tay illustrates some points made by the
“Stochastic Parrots” paper, particularly the danger of not debiasing data.
BERT and his Muppet friends: Many deep learning models for NLP are named
after Muppet characters, including ELMo, BERT, BigBird, ERNIE, Kermit, Grover,
RoBERTa, and Rosita. Most of these models are good at providing contextual
embeddings and enhanced knowledge representation.
Generative Pre-Trained Transformer 3 (GPT-3) is a 175 billion parameter model
that can write original prose with human-equivalent fluency in response to an
input prompt. The model is based on the transformer architecture. The previous
version, GPT-2, is open source. Microsoft acquired an exclusive license to access
GPT-3’s underlying model from its developer OpenAI, but other users can interact
with it via an application programming interface (API). Several groups including
EleutherAI and Meta have released open source interpretations of GPT-3.
Language Model for Dialogue Applications (LaMDA) is a conversational chatbot
developed by Google. LaMDA is a transformer-based model trained on dialogue
rather than the usual web text. The system aims to provide sensible and specific
responses to conversations. Google developer Blake Lemoine came to believe
that LaMDA is sentient. Lemoine had detailed conversations with the AI about its rights and personhood. During one of these conversations, the AI changed
Lemoine’s mind about Isaac Asimov’s third law of robotics. Lemoine claimed that
LaMDA was sentient, but the idea was disputed by many observers and
commentators. Subsequently, Google placed Lemoine on administrative leave for
distributing proprietary information and ultimately fired him.
Mixture of Experts (MoE): While most deep learning models use the same set of
parameters to process every input, MoE models aim to provide different
parameters for different inputs based on efficient routing algorithms, achieving high model capacity without a proportional increase in computational cost.
https://www.deeplearning.ai/resources/natural-language-processing/ 20/26
9/4/23, 2:14 PM Natural Language Processing (NLP) [A Complete Guide]
Many other languages including JavaScript, Java, and Julia have libraries that
implement NLP methods.
High cost: Training and running large NLP models requires computational resources that are out of reach for many small companies. Some experts worry that this could block many capable engineers from contributing to innovation in AI.
Black box: When a deep learning model renders an output, it’s difficult or
impossible to know why it generated that particular result. While traditional
models like logistic regression enable engineers to examine the impact of individual features on the output, neural network methods in natural language
processing are essentially black boxes. Such systems are said to be “not
explainable,” since we can’t explain how they arrived at their output. An effective
approach to achieve explainability is especially important in areas like banking,
where regulators want to confirm that a natural language processing system
doesn’t discriminate against some groups of people, and law enforcement, where
models trained on historical data may perpetuate historical biases against certain
groups.
“Nonsense on stilts”: Writer Gary Marcus has criticized deep learning-based NLP for
generating sophisticated language that misleads users into believing that natural language algorithms understand what they are saying and are capable of more sophisticated reasoning than is currently possible.
COURSE
Machine Learning Specialization
A foundational set of three courses that introduces beginners to the
fundamentals of learning algorithms. Prerequisites include high-school math
and basic programming skills.
View Course
COURSE
Deep Learning Specialization
An intermediate set of five courses that help learners get hands-on experience
building and deploying neural networks, the technology at the heart of today’s
most advanced NLP and other sorts of AI models.
View Course
COURSE
Natural Language Processing Specialization
An intermediate set of four courses that provide learners with the theory and
application behind the most relevant and widely used NLP models.
View Course
If you want to learn more about NLP, try reading research papers. Work through the
papers that introduced the models and techniques described in this article. Most are
easy to find on arxiv.org. You might also take a look at these resources:
The Batch: A weekly newsletter that tells you what matters in AI. It’s the best way
to keep up with developments in deep learning.
NLP News: A newsletter from Sebastian Ruder, a research scientist at Google,
focused on what’s new in NLP.
Papers with Code: A web repository of machine learning research, tasks,
benchmarks, and datasets.
We highly recommend learning to implement basic algorithms (linear and logistic
regression, Naive Bayes, decision trees, and vanilla neural networks) in Python. The
next step is to take an open-source implementation and adapt it to a new dataset or
task.
Conclusion
NLP is one of the fastest-growing research domains in AI, with applications that involve
tasks including translation, summarization, text generation, and sentiment analysis.
Businesses use NLP to power a growing number of applications, both internal — like
detecting insurance fraud, determining customer sentiment, and optimizing aircraft
maintenance — and customer-facing, like Google Translate.
Aspiring NLP practitioners can begin by familiarizing themselves with foundational AI
skills: performing basic mathematics, coding in Python, and using algorithms like
decision trees, Naive Bayes, and logistic regression. Online courses can help you build
your foundation. They can also help as you proceed into specialized topics.
Specializing in NLP requires a working knowledge of things like neural networks,
frameworks like PyTorch and TensorFlow, and various data preprocessing techniques.
The transformer architecture, which has revolutionized the field since it was
introduced in 2017, is an especially important architecture.
NLP is an exciting and rewarding discipline, and has potential to profoundly impact
the world in many positive ways. Unfortunately, NLP is also the focus of several
controversies, and understanding them is also part of being a responsible
practitioner. For instance, researchers have found that models will parrot biased
language found in their training data, whether it’s counterfactual, racist, or
hateful. Moreover, sophisticated language models can be used to generate
disinformation. A broader concern is that training large models produces substantial
greenhouse gas emissions.
This page is only a brief overview of what NLP is all about. If you have an appetite for
more, DeepLearning.AI offers courses for everyone on their NLP journey, from AI beginners to those who are ready to specialize. No matter your current level of
expertise or aspirations, remember to keep learning!