Module 2 - Natural Language Processing: Paulo Gomes DEI - FCTUC, 2006/2007
Language Processing
Paulo Gomes
• Morphological Analysis
• Syntactic Analysis
• Semantic Analysis
• Applications
Paulo Gomes ATAI 06/07 2
Introduction to NLP
• Example:
– “But I have promises to keep, and miles to go before I sleep.”
[Miller, 2001]
– Using the word definitions:
Word        Definitions       Combinations
But                  11                 11
I                     3                 33
have                 16                528
promises              7               3696
to                   21              77616
keep                 17            1319472
and                   5            6597360
miles                 5           32986800
to                   21          692722800
go                   29        20088961200
before               10       200889612000
I                     3       602668836000
sleep                 6      3616013016000
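Each value in the Combinations column is the running product of the definition counts up to that word. The sketch below recomputes that column from the counts in the table; the names `DEFINITIONS` and `cumulative_combinations` are illustrative and do not come from the slides.

```python
from itertools import accumulate
from operator import mul

# (word, number of dictionary definitions), copied from the table above.
DEFINITIONS = [
    ("But", 11), ("I", 3), ("have", 16), ("promises", 7), ("to", 21),
    ("keep", 17), ("and", 5), ("miles", 5), ("to", 21), ("go", 29),
    ("before", 10), ("I", 3), ("sleep", 6),
]

def cumulative_combinations(definitions):
    """Return the running product of the definition counts."""
    counts = [count for _, count in definitions]
    return list(accumulate(counts, mul))

combinations = cumulative_combinations(DEFINITIONS)
for (word, count), total in zip(DEFINITIONS, combinations):
    print(f"{word:<10}{count:>4}{total:>16}")
```

The last value printed is the number of possible readings of the whole sentence if every definition of every word could combine freely.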
Morphological Analysis
• Tokenization:
– The | University | of | Coimbra | is | 700 | years | old | .
– Problem:
• Identification of Compound Names (Named Entity
Recognition).
• Word/term identification.
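A minimal sketch of the tokenization step and of the compound-name problem, assuming a regular-expression tokenizer and a fixed list of known names. The list `COMPOUND_NAMES` and both function names are hypothetical; a real system would rely on a named entity recognizer rather than a lookup list.

```python
import re

# Hypothetical list of compound names (assumption, not from the slides).
COMPOUND_NAMES = [("University", "of", "Coimbra")]

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def merge_compounds(tokens, compounds=COMPOUND_NAMES):
    """Join token sequences that match a known compound name."""
    merged, i = [], 0
    while i < len(tokens):
        for name in compounds:
            if tuple(tokens[i:i + len(name)]) == name:
                merged.append(" ".join(name))
                i += len(name)
                break
        else:
            # No compound name starts at this position: keep the token.
            merged.append(tokens[i])
            i += 1
    return merged

tokens = tokenize("The University of Coimbra is 700 years old.")
print(" | ".join(tokens))
print(" | ".join(merge_compounds(tokens)))
```

The first line printed is the plain token sequence; the second treats "University of Coimbra" as a single term.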
Syntactic Analysis
The | University of Coimbra | is | 700 | years | old | .
• Parsing:
– Full Parsing
• Defines the sentence structure using a parse tree.
– Shallow Parsing
• Defines the sentence structure using chunks.
• Full Parsing:
[Figure: parse tree with root S, NP and VP on the next level, and NP and ADJP on the level below]
• Shallow Parsing:
• [NP The/DT University of Coimbra/NNP ]
• [VP is/VBZ ]
• [NP 700/CD years/NNS ]
• [ADJP old/JJ ]
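The chunks above can be reproduced with a very small rule: map each part-of-speech tag to a chunk label and group consecutive tokens that share a label. This is only a sketch under that assumption; the mapping `TAG_TO_CHUNK` is hypothetical and chosen to fit this one sentence, whereas real shallow parsers use richer rules or trained models.

```python
from itertools import groupby

# Hypothetical tag-to-chunk mapping, chosen only to fit the example sentence.
TAG_TO_CHUNK = {
    "DT": "NP", "NNP": "NP", "NN": "NP", "NNS": "NP", "CD": "NP",
    "VB": "VP", "VBZ": "VP",
    "JJ": "ADJP",
}

def chunk(tagged_tokens):
    """Group consecutive (word, tag) pairs that map to the same chunk label."""
    label_of = lambda pair: TAG_TO_CHUNK.get(pair[1], "O")
    return [(label, list(group)) for label, group in groupby(tagged_tokens, key=label_of)]

def format_chunk(label, tokens):
    """Render a chunk in the bracketed word/TAG notation used above."""
    return "[" + label + " " + " ".join(f"{w}/{t}" for w, t in tokens) + " ]"

tagged = [
    ("The", "DT"), ("University of Coimbra", "NNP"), ("is", "VBZ"),
    ("700", "CD"), ("years", "NNS"), ("old", "JJ"),
]
chunks = chunk(tagged)
for label, tokens in chunks:
    print(format_chunk(label, tokens))
```

Unlike full parsing, nothing here relates the chunks to each other; the output is a flat sequence.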
Semantic Analysis
• Syntax-Driven Semantic Analysis:
– Example:
• “Vegetarians eat fruit.”
• Parse tree: [NP [NN Vegetarians]] [VP [VB eat] [NN fruit]]
• Semantic representation: ∀x,y : vegetarian(x) ∧ fruit(y) ⇒ eats(x,y)
– Pipeline: Natural Language Text → Parser → Parse Tree → Semantic Analysis → Semantic Representation
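As a rough illustration of the syntax-driven idea, the sketch below turns the tagged sentence "Vegetarians eat fruit" into the formula given for this example. It assumes a single NN VB NN pattern and a hand-written lexicon; `LEXICON` and `sentence_to_logic` are hypothetical names, not part of the course material.

```python
# Hypothetical lexicon mapping surface words to predicate names (assumption).
LEXICON = {"Vegetarians": "vegetarian", "eat": "eats", "fruit": "fruit"}

def sentence_to_logic(parse):
    """Build a formula from a parse of the form NN VB NN."""
    tags = [tag for _, tag in parse]
    if tags != ["NN", "VB", "NN"]:
        raise ValueError("only the NN VB NN pattern is handled in this sketch")
    subject, verb, obj = (LEXICON[word] for word, _ in parse)
    # The subject and object become unary predicates; the verb relates them.
    return f"∀x,y : {subject}(x) ∧ {obj}(y) ⇒ {verb}(x,y)"

parse = [("Vegetarians", "NN"), ("eat", "VB"), ("fruit", "NN")]
formula = sentence_to_logic(parse)
print(formula)
```

The structure of the formula follows the structure of the parse, which is what makes the analysis syntax-driven.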
• “Bank”: what does it mean?
NLP Applications
• Question & Answering (Q&A)
– Brain boost:
http://www.brainboost.com/
[Figure: Natural Language Question → Web Search (Google, ...) → Web → Resulting Web Pages → Answer extraction]
http://www.google.com/language_tools?hl=en
– Eliza
http://www-ai.ijs.si/eliza-cgi-bin/eliza_script
http://www.simonlaven.com/
[Figure: Selected Documents; Temporal distribution of the selected documents]
[Figure: Natural Language Question → SQL (SELECT) → Database → Resulting Table(s) → Answer Extraction and Formatting]
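The stages named in this figure (natural language question, SQL SELECT, database, resulting table, answer extraction and formatting) can be sketched end to end with the standard `sqlite3` module. Everything specific here is an assumption made for illustration: the `institutions` table, the single supported question pattern, and the function names.

```python
import re
import sqlite3

def question_to_sql(question):
    """Translate one hypothetical question pattern into a SELECT query."""
    match = re.fullmatch(r"How old is (?:the )?(.+)\?", question.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("unsupported question")
    sql = "SELECT name, age_years FROM institutions WHERE name = ?"
    return sql, (match.group(1),)

def answer(connection, question):
    """Run the generated query and format the resulting table as a sentence."""
    sql, params = question_to_sql(question)
    rows = connection.execute(sql, params).fetchall()
    if not rows:
        return "No answer found."
    name, age = rows[0]
    return f"{name} is {age} years old."

# Hypothetical in-memory database holding the fact from the earlier example.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE institutions (name TEXT, age_years INTEGER)")
connection.execute("INSERT INTO institutions VALUES ('University of Coimbra', 700)")

reply = answer(connection, "How old is the University of Coimbra?")
print(reply)
```

A realistic interface would derive the query from a parse of the question instead of a fixed pattern; the pattern only stands in for that step.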
[Figure: Document List; Ontology; Concept Selection; Information about the selected document]
[Figure: Information Routing System: Email1 ... EmailN and News1 ... NewsN are routed, using an Ontology + User profiles, to User 1 ... User N]
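A toy version of the Information Routing System labelled in this figure, assuming that each user profile is a flat set of concepts and that an item is delivered to a user when its text mentions one of them. The profiles, the sample items, and the `route` function are all hypothetical; the ontology in the figure is reduced here to plain keyword sets.

```python
# Hypothetical user profiles: each user is described by a set of concepts.
USER_PROFILES = {
    "User 1": {"football", "music"},
    "User 2": {"economy"},
}

def route(items, profiles):
    """Deliver each (title, text) item to users whose profile matches its text."""
    deliveries = {user: [] for user in profiles}
    for title, text in items:
        words = set(text.lower().split())
        for user, concepts in profiles.items():
            # A non-empty intersection means the item mentions a profile concept.
            if words & concepts:
                deliveries[user].append(title)
    return deliveries

# Hypothetical incoming items, named after the labels in the figure.
items = [
    ("News1", "Economy grows this year"),
    ("Email1", "Tickets for the football match"),
    ("News2", "New music festival announced"),
]
deliveries = route(items, USER_PROFILES)
for user, titles in deliveries.items():
    print(user, "<-", ", ".join(titles))
```

Matching on concepts from an ontology rather than on raw words would let an item about "soccer" reach a user interested in "football"; this sketch does not attempt that.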
• Questions?