Introduction To NLP


Natural Language Processing (NLP)
1. Introduction to Natural Language Processing
1.1 Differences between NLP, NLU, and NLG
1.2 The Study of Language
1.3 Applications of Natural Language Understanding
1.4 Evaluating Language Understanding Systems
1.5 The Different Levels of Language Analysis
1.6 Representations and Understanding
1.7 The Organization of Natural Language Understanding Systems
1. Introduction to NLP
1.1 NLP vs. NLU vs. NLG
▶ What is NLP?
• NLP uses methods from various disciplines, such as computer
science, artificial intelligence, linguistics, and data science, to
enable computers to understand human language in both written
and spoken forms.
• NLP uses machine learning and deep learning techniques to
complete tasks such as language translation or question
answering.
• NLP takes unstructured data and converts it into a structured
data format (e.g. through named entity recognition (NER) and the
identification of word patterns), using methods like tokenization,
which splits text into units, and stemming and lemmatization,
which reduce words to their root forms.
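As a rough illustration of the preprocessing steps named above, the following sketch tokenizes a sentence and strips a few suffixes. The suffix list and the `toy_stem` function are assumptions made for this example; real stemmers and lemmatizers use far richer rules and dictionaries.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+", text.lower())

# Hypothetical suffix list for illustration; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "s")

def toy_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The parser parsed two parsing errors.")
stems = [toy_stem(t) for t in tokens]
```

Here "parsed" and "parsing" both reduce to the same stem, which is the point of the technique: different surface forms are mapped to one structured key.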
• Different approaches have been used for different types of
language tasks:
  • Hidden Markov Models (HMMs) are used for part-of-speech (POS)
  tagging.
  • Recurrent Neural Networks (RNNs) help to generate an appropriate
  sequence of text.
  • N-grams, a simple language model (LM), assign probabilities to
  sentences or phrases, which can be used to estimate how likely a
  response is.
• These techniques work together to support popular technology
such as chatbots, or speech recognition products like Amazon’s
Alexa or Apple’s Siri.
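To make the n-gram idea concrete, here is a minimal bigram language model sketch. The two-sentence corpus and the function names are assumptions for illustration, and no smoothing is applied, so any unseen word pair gives a probability of zero.

```python
from collections import Counter

# Tiny hypothetical corpus; <s> and </s> mark sentence boundaries.
corpus = [
    "<s> the book is in the folder </s>",
    "<s> the report is in the folder </s>",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words[:-1])            # count each word as a bigram context
    bigrams.update(zip(words, words[1:]))  # count adjacent word pairs

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

def sentence_prob(sentence):
    """Multiply bigram probabilities across the sentence (no smoothing)."""
    words = sentence.split()
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(prev, word)
    return prob
```

A sentence seen in the corpus receives a non-zero probability, while a reordering such as "the folder is in the book" scores zero because the pair ("folder", "is") never occurs.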

▶ What is NLU?
• NLU is a subset of NLP, which uses syntactic and semantic analysis of
text and speech to determine the meaning of a sentence.
• NLU also establishes a relevant ontology: a data structure which
specifies the relationships between words and phrases. While humans
naturally do this in conversation, the combination of these analyses is
required for a machine to understand the intended meaning of different
texts. For example, let’s take the following two sentences:
  • Alice is swimming against the current.
  • The current version of the report is in the folder.
  In the first sentence "current" is a noun (a flow of water); in the
  second it is an adjective (the latest version). NLU must tell the two
  apart.
• Example: Sentiment Analysis.

▶ What is NLG?
• NLG is another subset of NLP. While NLU focuses on computer
reading comprehension, NLG enables computers to write.
• NLG is the process of producing a human language text response based
on some data input. This text can also be converted into a speech format
through text-to-speech services.
• NLG also encompasses text summarization capabilities that generate
summaries from input documents while maintaining the integrity of the
information.
• Early NLG systems used templates to generate text. Based on some data
or query, an NLG system would fill in the blanks, like a game of Mad Libs.
Over time, NLG systems have evolved with the application of
HMMs, RNNs, and transformers, enabling more dynamic text
generation in real time.
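The template-based ("Mad Libs") style of generation can be sketched in a few lines. The template text, the field names, and the sample record below are assumptions made up for this example.

```python
from string import Template

# Hypothetical template and data record, for illustration only.
WEATHER_TEMPLATE = Template("In $city it is $temp degrees and $condition today.")

def generate(record):
    """Fill the template's blanks from a dict of field values."""
    return WEATHER_TEMPLATE.substitute(record)

text = generate({"city": "Paris", "temp": 21, "condition": "sunny"})
```

The output is fluent but rigid: every record yields the same sentence shape, which is the limitation that later statistical and neural approaches address.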
NLP vs. NLU vs. NLG summary
• NLP seeks to convert unstructured language data into a
structured data format to enable machines to understand speech
and text and formulate relevant, contextual responses. Its
subtopics include NLU and NLG.
• NLU focuses on machine reading comprehension through
grammar and context, enabling it to determine the
intended meaning of a sentence.
• NLG focuses on text generation, or the construction of text in
English or other languages, by a machine and based on a
given dataset.

1.2 The Study of Language
▶ Language is one of the fundamental
aspects of human behavior and is a
crucial component of our lives.
▶ In written form it serves as a long-term
record of knowledge from one
generation to the next.
▶ In spoken form it serves as our
primary means of coordinating
our day-to-day behavior with
others.
▶ Language is studied in several different academic
disciplines. Each discipline defines its own set of
problems and has its own methods for addressing
them.
▶ The linguist, for instance, studies the structure of
language itself, considering questions such as why
certain combinations of words form sentences, but
others do not, and why a sentence can have some
meanings but not others.
▶ The psycholinguist, on the other hand, studies the
processes of human language production and
comprehension, considering questions such as how
people identify the appropriate structure of a sentence
and when they decide on the appropriate meaning for
words.
▶ The philosopher considers how words can mean
anything at all and how they identify objects in the world.
▶ Philosophers also consider what it means to have beliefs,
goals, and intentions, and how these cognitive
capabilities relate to language.
▶ The goal of the computational linguist is
to develop a computational theory of
language, using the notions of
algorithms and data structures from
computer science.
▶ To build a computational model, you
must take advantage of what is
known from all the other disciplines.
▶ Figure 1.2 summarizes these
different approaches to studying
language.
Figure 1.2 The major disciplines studying language

Discipline: Linguists
Typical Problems: How do words form phrases and sentences? What constrains the possible meanings for a sentence?
Tools: Intuitions about well-formedness and meaning; mathematical models of structure (for example, formal language theory, model theoretic semantics)

Discipline: Psycholinguists
Typical Problems: How do people identify the structure of sentences? How are word meanings identified? When does understanding take place?
Tools: Experimental techniques based on measuring human performance; statistical analysis of observations

Discipline: Philosophers
Typical Problems: What is meaning, and how do words and sentences acquire it? How do words identify objects in the world?
Tools: Natural language argumentation using intuition about counter-examples; mathematical models (for example, logic and model theory)

Discipline: Computational Linguists
Typical Problems: How is the structure of sentences identified? How can knowledge and reasoning be modeled? How can language be used to accomplish specific tasks?
Tools: Algorithms, data structures; formal models of representation and reasoning; AI techniques (search and representation methods)
Motivations for Developing Computational Models
▶ The scientific motivation is to obtain a better
understanding of how language works. It recognizes
that any one of the other traditional disciplines does
not have the tools to completely address the problem
of how language comprehension and production work.
▶ Computational models may provide very specific
predictions about human behavior that can then be
explored by the psycholinguist. By continuing in this
process, we may eventually acquire a deep
understanding of how human language processing
occurs.

1.3 Applications of Natural Language
Understanding
▶ The applications of NLU can be divided
into two major classes: text-based
applications and dialogue-based
applications.
▶ Text-based applications involve the
processing of written text, such as
books, newspapers, reports, manuals,
e-mail messages, and so on. These
are all reading-based tasks.

▶ Dialogue-based applications involve
human-machine communication. Most
naturally this involves spoken language,
but it also includes interaction using
keyboards.
▶ Examples of text-based applications include:
▶ Finding appropriate documents on certain topics from a
database of texts
▶ Extracting information from messages or articles on
certain topics
▶ Translating documents from one language to another
▶ Summarizing texts for certain purposes
▶ Story Understanding

Text-based NL Research
Areas
▶ Information Retrieval
▶ For example, consider the task of finding
newspaper articles on a certain topic in a
large database.
▶ Many techniques have been developed that
classify documents by the presence of certain
keywords in the text.
▶ You can then retrieve articles on a certain
topic by looking for articles that contain the
keywords associated with that topic.
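A minimal sketch of keyword-based retrieval as described above: each topic has a set of keywords, and articles are ranked by how many of them they contain. The articles, topics, and keyword sets are invented for illustration.

```python
# Hypothetical mini-collection of articles and topic keywords.
ARTICLES = {
    "a1": "The central bank raised interest rates to curb inflation.",
    "a2": "The striker scored twice in the cup final.",
    "a3": "Inflation slowed as the bank held rates steady.",
}
TOPIC_KEYWORDS = {
    "economy": {"inflation", "rates", "bank"},
    "sport": {"striker", "cup", "final"},
}

def words(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return {w.strip(".,").lower() for w in text.split()}

def retrieve(topic, min_hits=1):
    """Return ids of articles with at least min_hits topic keywords, best first."""
    keywords = TOPIC_KEYWORDS[topic]
    scored = [(len(keywords & words(text)), doc_id) for doc_id, text in ARTICLES.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for hits, doc_id in ranked if hits >= min_hits]
```

Note that this approach matches surface words only; it has no notion of what an article means, which is exactly the limitation that motivates deeper language understanding.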
Machine
Translation
▶ Some machine translation systems have been built
that are based on pattern matching.
▶ The translation is accomplished by finding the best
set of patterns that match the input and producing
the associated output in the other language.
▶ This technique can produce reasonable results in
some cases but sometimes produces completely
wrong translations because of its inability to use
an understanding of content to disambiguate
word senses and sentence meanings
appropriately.
▶ In contrast, other machine translation systems
operate by producing a representation of the
meaning of each sentence in one language, and
then producing a sentence in the other language
that realizes the same meaning.
Dialogue-based
Applications
▶ Question-Answering systems, where
natural language is used to query a
database
▶ Automated Customer Service
over the telephone
▶ Tutoring Systems, where the
machine interacts with a student
▶ Spoken Language Control of a
machine (for example, voice control of
a VCR or computer)
▶ General Cooperative Problem-Solving
Systems (for example, a
system that helps a person plan and
schedule freight shipments).
▶ Some of the problems faced by
dialogue systems are quite
different from those in text-based
systems.
▶ First, the language used is very
different, and the system needs to
participate actively in order to
maintain a natural, smooth-flowing
dialogue.
▶ Dialogue requires the use of
acknowledgments to verify that things
are understood, and an ability to both
recognize and generate clarification
sub-dialogues when something is not
clearly understood.
▶ A Speech Recognition System need not
involve any language understanding. For
instance, voice-controlled computers and
VCRs are entering the market now. These do
not involve natural language understanding in
any general way. Rather, the words recognized
are used as commands, much like the
commands you send to a VCR using a remote
control.
▶ Speech Recognition is concerned only with
identifying the words spoken from a given
speech signal, not with understanding how
words are used to communicate.
▶ To be an understanding system, the speech
recognizer would need to feed its input to a
natural language understanding system,
producing what is often called a Spoken
Language Understanding System.
1.4 Evaluating Language Understanding Systems

How can you tell if a system works?

1. Black Box Evaluation: it evaluates
system performance without looking
inside to see how it works.
▶ Run the program and see how well it
performs the task it was designed
to do.
▶ If the program is meant to answer
questions about a database of facts,
you might ask it questions to see how
good it is at producing the correct
answers.
2. Glass Box Evaluation: you look inside
at the structure of the system to identify
various subcomponents of a system and
then evaluate each one with appropriate
tests.
▶ The problem with glass box evaluation is
that it requires some consensus on what
the various components of an NL system
should be.
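Black box evaluation of a question-answering program can be sketched as follows. The `toy_system` function and the test set are hypothetical stand-ins; the point is that the scoring function looks only at outputs, never at how the system produces them.

```python
def toy_system(question):
    """Hypothetical stand-in for the system under test: looks up canned answers."""
    facts = {"capital of france": "Paris", "capital of jordan": "Amman"}
    return facts.get(question.lower().rstrip("?"), "unknown")

# Hypothetical test set of (question, expected answer) pairs.
TEST_SET = [
    ("Capital of France?", "Paris"),
    ("Capital of Jordan?", "Amman"),
    ("Capital of Peru?", "Lima"),
]

def black_box_accuracy(system, test_set):
    """Score the system only on its outputs, never inspecting its internals."""
    correct = sum(1 for question, expected in test_set if system(question) == expected)
    return correct / len(test_set)

accuracy = black_box_accuracy(toy_system, TEST_SET)
```

A glass box evaluation would instead test each internal component (for example, the parser or the lexicon lookup) with its own dedicated test set.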

1.5 The Different Levels of Language
Analysis
▶ An NL system must use considerable knowledge about
the structure of the language itself, including what
the words are, how words combine to form sentences,
what the words mean, how word meanings contribute
to sentence meanings, and so on.
▶ It must also draw on general world knowledge and
reasoning abilities. For example, to answer
questions or to participate in a conversation, a person
not only must know a lot about the structure of the
language being used, but also must know about the
world in general and the conversational setting in
particular.

▶ Phonetic and phonological knowledge -
concerns how words are related to the sounds
that realize them. Such knowledge is crucial
for speech-based systems.
▶ Morphological knowledge - concerns how
words are constructed from more basic
meaning units called morphemes. A
morpheme is the primitive unit of meaning
in a language (for example, the meaning of
the word "friendly" is derivable from the
meaning of the noun "friend" and the suffix
"-ly", which transforms a noun into an
adjective).
▶ Syntactic knowledge - concerns how words
can be put together to form correct sentences
and determines what structural role each
word plays in the sentence and what phrases
are subparts of what other phrases.
▶ Semantic knowledge - concerns what words
mean and how these meanings combine in
sentences to form sentence meanings. This is
the study of context-independent meaning -
the meaning a sentence has regardless of the
context in which it is used.
▶ Pragmatic knowledge - concerns how
sentences are used in different situations and
how use affects the interpretation of the
sentence.
▶ Discourse knowledge - concerns how the
immediately preceding sentences affect the
interpretation of the next sentence. This
information is especially important for interpreting
pronouns and for interpreting the temporal
aspects of the information conveyed.
▶ World knowledge - includes the general
knowledge about the structure of the world that
language users must have in order to, for example,
maintain a conversation. It includes what each
language user must know about the other user’s
beliefs and goals.
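The morphological example given earlier ("friendly" = "friend" + "-ly") can be sketched as a lookup over stems and suffixes. The mini-lexicon and the category mapping below are assumptions for illustration, not a complete account of English morphology.

```python
# Hypothetical mini-lexicon: stems with their category, and suffixes that
# map one category to another.
STEMS = {"friend": "noun", "quick": "adjective"}
SUFFIXES = {"ly": {"noun": "adjective", "adjective": "adverb"}}

def analyze(word):
    """Split word into (stem, suffix, resulting category), or None if no analysis is found."""
    for suffix, mapping in SUFFIXES.items():
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in STEMS and STEMS[stem] in mapping:
            return stem, "-" + suffix, mapping[STEMS[stem]]
    return None
```

With this lexicon, "friendly" is analyzed as a noun stem plus "-ly" giving an adjective, while a word such as "fly" gets no analysis because "f" is not a known stem.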

1.6 Representations and
Understanding
▶ A crucial component of understanding involves
computing a representation of the meaning of
sentences and texts.
▶ Why not simply use the sentence itself as a
representation of its meaning?
▶ One reason is that most words have multiple
meanings, which we will call senses. The word
"cook", for example, has a sense as a verb and a
sense as a noun; "dish" has multiple senses as a
noun as well as a sense as a verb.

▶ This ambiguity would inhibit the system from
making the appropriate inferences needed to
model understanding.
▶ The disambiguation problem appears much
easier than it is because people do not
generally notice ambiguity.
▶ While a person does not seem to consider
each of the possible senses of a word when
understanding a sentence, a program must
explicitly consider them one by one.

▶ To represent meaning, we must have a
more precise language. The tools to do
this come from mathematics and logic
and involve the use of formally
specified representation languages.
▶ Formal languages are specified from
very simple building blocks. The most
fundamental is the notion of an atomic
symbol which is distinguishable from
any other atomic symbol simply based
on how it is written.
▶ Useful representation languages
have the following two properties:
▶ The representation must be precise
and unambiguous.
▶ You should be able to express every distinct
reading of a sentence as a distinct formula
in the representation.
▶ The representation should capture the
intuitive structure of the natural language
sentences that it represents.
▶ For example, sentences that appear to be
structurally similar should have similar
structural representations, and the
meanings of two sentences that are
paraphrases of each other should be
closely related to each other.
Syntax: Representing Sentence
Structure
▶ The syntactic structure of a sentence
indicates the way that words in the
sentence are related to each other.
▶ This structure indicates how the words
are grouped together into phrases,
what words modify what other words,
and what words are of central
importance in the sentence.
▶ In addition, this structure may identify
the types of relationships that exist
between phrases and can store other
information about the particular
sentence structure that may be
needed for later processing.
▶ For example, consider the
following sentences:
1. John sold the book to Mary.
2. The book was sold to Mary by
John.
▶ These sentences share certain
structural properties. In each, the
noun phrases are "John", "Mary", and
"the book". In other respects, these
sentences are significantly different.

▶ For instance, you could only give sentence 1 as an
answer to the question "What did John
do for Mary?"
▶ Sentence 2 is a much better
continuation of a sentence beginning
with the phrase "After it fell in the
river", as sentences 3 and 4 show.
▶ Following the standard convention in
linguistics, we will use an asterisk (*)
before any example of an ill-formed or
questionable sentence.

3. *After it fell in the river, John sold Mary
the book.
4. After it fell in the river, the book was sold to
Mary by John.
5. *John are in the corner.
6. *John put the book.
▶ Sentence 5 is ill-formed because the subject and the verb do not
agree in number (the subject is singular and the verb is plural),
▶ while 6 is ill-formed because the verb put requires some modifier
that describes where John put the object.

▶ In fact, a robust system should be able
to understand ill-formed sentences
whenever possible. This might suggest
that agreement checks can be
ignored, but this is not so.
▶ Agreement checks are
essential for eliminating
potential ambiguities.
▶ Consider sentences 7 and 8, which are
identical except for the number
feature of the main verb, yet
represent two quite distinct
interpretations.
7. Flying planes are dangerous.
8. Flying planes is dangerous.
▶ If you did not check subject-verb
agreement, these two sentences would
be indistinguishable and ambiguous.
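The role of the agreement check in sentences 7 and 8 can be sketched as a filter over candidate readings of the subject. The two readings and their number features are written by hand here as an assumption; a real system would derive them from a grammar.

```python
# Two hypothetical analyses of the subject "flying planes", each paired with
# the number feature of its head.
SUBJECT_READINGS = [
    ("planes that are flying", "plural"),           # "planes" is the head noun
    ("the activity of flying planes", "singular"),  # the gerund "flying" is the head
]
VERB_NUMBER = {"are": "plural", "is": "singular"}

def consistent_readings(verb):
    """Keep only the subject readings whose number agrees with the verb."""
    return [gloss for gloss, number in SUBJECT_READINGS if number == VERB_NUMBER[verb]]
```

With the check in place each verb form selects exactly one reading; without it, both readings would survive for both sentences.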
▶ Most syntactic representations of
language are based on the notion of
Context-Free Grammars (CFG),
which represent sentence structure in
terms of what phrases are subparts of
other phrases.

▶ Figure 1.4 shows two different
structures for the sentence "Rice
flies like sand".
▶ The two structures give further
details on the structure of the noun
phrase and verb phrase and identify
the part of speech for each word.
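A small context-free grammar is enough to reproduce two structures for "Rice flies like sand". The grammar rules and lexicon below are assumptions chosen for this sketch and are not taken from the figure; the parser simply enumerates every tree the grammar allows.

```python
# Hypothetical toy lexicon: each word lists its possible parts of speech.
LEXICON = {"rice": {"N"}, "flies": {"N", "V"}, "like": {"V", "P"}, "sand": {"N"}}

# Hypothetical toy grammar: each phrase type lists its possible expansions.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("N",), ("N", "N")],
    "VP": [("V", "NP"), ("V", "PP")],
    "PP": [("P", "NP")],
}

def parses(symbol, words):
    """Return every tree that derives the word sequence from symbol."""
    words = tuple(words)
    trees = []
    # A single word can be labelled directly with a part of speech.
    if len(words) == 1 and symbol in LEXICON.get(words[0], ()):
        trees.append((symbol, words[0]))
    for rhs in RULES.get(symbol, []):
        if len(rhs) == 1:
            trees += [(symbol, sub) for sub in parses(rhs[0], words)]
        else:
            # Try every split point for a two-part expansion.
            for i in range(1, len(words)):
                for left in parses(rhs[0], words[:i]):
                    for right in parses(rhs[1], words[i:]):
                        trees.append((symbol, left, right))
    return trees

trees = parses("S", "rice flies like sand".split())
```

One tree treats "rice" as the subject and "flies" as the verb (rice flies in the way sand does); the other treats "rice flies" as a compound noun and "like" as the verb (a kind of fly that is fond of sand).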

The Logical
Form
▶ The structure of a sentence doesn’t by itself determine its
meaning. For example, the NP "the catch" can
have different meanings depending on
whether the speaker is talking about a
baseball game or a fishing expedition.
▶ Both these interpretations have the same
syntactic structure, and the different
meanings arise from an ambiguity concerning
the sense of the word "catch".
▶ Once the correct sense is identified, say the
fishing sense, there still is a problem in
determining what fish are being referred to.
The intended meaning of a sentence depends
on the situation in which the sentence is
produced.
▶ A useful division is between context-independent
meaning and context-dependent meaning.
▶ The fact that "catch" may refer to a baseball
move, or the results of a fishing expedition is
knowledge about English and is independent
of the situation in which the word is used.
▶ On the other hand, the fact that a particular
NP "the catch" refers to what Jack caught
when fishing yesterday is contextually
dependent. The representation of the
context-independent meaning of a sentence
is called its logical form.

▶ The logical form encodes possible word
senses and identifies the semantic
relationships between the words and
phrases.
▶ These relationships are often captured
using an abstract set of semantic
relationships between the verb and its NPs.
▶ In particular, in both sentences 1 and 2
previously given, the action described is a
selling event, where "John" is the seller, "the
book" is the object being sold, and "Mary" is
the buyer.
▶ These roles are instances of the abstract
semantic roles AGENT, THEME, and TO-POSS
(for final possessor), respectively.
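The idea that sentences 1 and 2 share one logical form can be sketched by mapping both surface patterns onto the same set of semantic roles. The regular expressions below are hypothetical and cover only these two sentence shapes; they stand in for a real parser.

```python
import re

# Hypothetical surface patterns for the verb "sell"; each names the semantic
# role of every captured group.
PATTERNS = [
    (re.compile(r"^(\w+) sold (the \w+) to (\w+)\.$", re.IGNORECASE),
     ("AGENT", "THEME", "TO-POSS")),
    (re.compile(r"^(the \w+) was sold to (\w+) by (\w+)\.$", re.IGNORECASE),
     ("THEME", "TO-POSS", "AGENT")),
]

def logical_form(sentence):
    """Return a role dictionary for the sentence, or None if no pattern matches."""
    for pattern, roles in PATTERNS:
        match = pattern.match(sentence)
        if match:
            form = dict(zip(roles, match.groups()))
            form["THEME"] = form["THEME"].lower()  # normalise "The book" / "the book"
            form["EVENT"] = "SELL"
            return form
    return None

active = logical_form("John sold the book to Mary.")
passive = logical_form("The book was sold to Mary by John.")
```

Both calls yield the same role assignment, even though the syntactic structures of the two sentences differ.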

NLP - Prof. A. T. Al-Taani
▶ Once the semantic relationships
are determined, some word
senses may be impossible and
thus eliminated from
consideration.
▶ Consider the sentence
9. Jack invited Mary to the
Halloween ball.
▶ The word "ball", which by itself is
ambiguous between the plaything that
bounces and the formal dance event,
can only take the latter sense in
sentence 9, because the verb "invite"
only makes sense with this interpretation.
▶ One of the key tasks in semantic
interpretation is to consider what
combinations of the individual word
meanings can combine to create
coherent sentence meanings.
▶ Exploiting such interconnections
between word meanings can greatly
reduce the number of possible word
senses for each word in a given
sentence.
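Eliminating word senses through the verb's requirements can be sketched as a type check. The sense inventory, the semantic types, and the verb constraints below are assumptions invented for this example.

```python
# Hypothetical sense inventory and verb constraints, for illustration only.
SENSES = {"ball": {"round plaything": "physical_object", "formal dance": "event"}}
ALLOWED_TYPES = {"invite to": {"event"}, "kick": {"physical_object"}}

def possible_senses(verb, noun):
    """Return the senses of noun whose semantic type the verb accepts."""
    allowed = ALLOWED_TYPES[verb]
    return [sense for sense, sem_type in SENSES[noun].items() if sem_type in allowed]
```

For sentence 9, the "invite to" constraint leaves only the dance sense of "ball", while a verb like "kick" would leave only the plaything sense.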

The Final Meaning
Representation
▶ The final representation needed is a
general knowledge representation
(KR), which the system uses to
represent and reason about its
application domain.
▶ The goal of contextual interpretation
is to take a representation of the
structure of a sentence and its logical
form, and to map this into some
expression in the KR that allows the
system to perform the appropriate
task in the domain.
▶ In a question-answering application, a
question might map to a database
query; in a story-understanding
application, a sentence might map into
a set of expressions that represent the
situation that the sentence describes.
▶ We will assume that the first-order
predicate calculus (FOPC) is the final
representation language because it is
relatively well known, well studied, and
precisely defined.
1.7 The Organization of NLU
Systems
▶ Figure 1.5 shows the organization of
NLU systems.
▶ There are interpretation processes that
map from one representation to the
other. For instance, the process that
maps a sentence to its syntactic
structure and logical form is called the
parser. It uses knowledge about words
and word meanings (the lexicon) and a
set of rules defining the legal structures
(the grammar) in order to assign a
syntactic structure and a logical form to
an input sentence.
▶ An alternative organization could
perform syntactic processing first and
then perform semantic interpretation
on the resulting structures.
▶ Combining the two has considerable
advantages because it leads to a
reduction in the number of possible
interpretations, since every proposed
interpretation must simultaneously
be syntactically and semantically well
formed. For example, consider the
following two sentences:
10. Visiting relatives can be trying.
11. Visiting museums can be trying.
▶ These two sentences have
identical syntactic structure, so
both are syntactically
ambiguous.
▶ In sentence 10, the subject might be
relatives who are visiting you or the
event of you visiting relatives. Both
of these alternatives are
semantically valid, and you would
need to determine the appropriate
sense by using the
contextual mechanism.
▶ Sentence 11 has only one possible semantic
interpretation, since museums are not objects
that can visit other people; rather they must
be visited.
▶ In a system with separate syntactic and
semantic processing, there would be two
syntactic interpretations of sentence 11,
one of which the semantic interpreter
would eliminate later.
▶ If syntactic and semantic processing are
combined, however, the system will be able to
detect the semantic anomaly as soon as it
interprets the phrase "visiting museums", and
thus will never build the incorrect syntactic
structure in the first place.
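The benefit of combining syntactic and semantic processing can be sketched with sentences 10 and 11: each candidate reading of "visiting X" is checked semantically as it is proposed, so an anomalous one is never built. The `CAN_VISIT` feature set is a hypothetical stand-in for real semantic knowledge.

```python
# Hypothetical semantic feature: which nouns denote things able to visit someone.
CAN_VISIT = {"relatives"}

def readings(noun):
    """List the semantically valid readings of 'visiting <noun>'."""
    candidates = [
        # The noun is the object of "visiting": always acceptable here.
        ("object", f"the event of visiting {noun}", True),
        # The noun is the one doing the visiting: needs a noun that can visit.
        ("subject", f"{noun} who are visiting", noun in CAN_VISIT),
    ]
    return [gloss for _role, gloss, semantically_ok in candidates if semantically_ok]
```

"Visiting relatives" keeps both readings and needs context to decide between them, while "visiting museums" is left with a single reading from the start.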
▶ Continuing through Figure 1.5, the process that
transforms the syntactic structure and logical
form into a final meaning representation is called
contextual processing.
▶ This process includes issues such as
▶ identifying the objects referred to by NPs such as
definite descriptions (for example, "the man") and
pronouns,
▶ the analysis of the temporal aspects of the new
information conveyed by the sentence,
▶ the identification of the speaker’s intention (for
example, whether "Can you lift that rock" is a
yes/no question or a request),
▶ as well as all the inferential processing required to
interpret the sentence appropriately within the
application domain.
▶ It uses knowledge of the discourse context
(determined by the sentences that
preceded the current one) and knowledge
of the application to produce a final
representation.
▶ The system would then perform whatever
reasoning tasks are appropriate for the
application.
▶ When this requires a response to the user,
the meaning that must be expressed is
passed to the generation component of
the system.
▶ It uses knowledge of the discourse context,
plus information on the grammar and
lexicon, to plan the form of an utterance,
which then is mapped into words by a
realization process.
