Natural Language Processing
Instructor: Prof. Lu Wang
Computer Science and Engineering
University of Michigan
https://web.eecs.umich.edu/~wangluxy/
1
Time and Location
• Time: Mondays and Wednesdays, 10:30 am - 12 pm
• Location: online via Zoom (the link is provided on Canvas and Piazza;
anyone with a umich.edu email can join Piazza for discussions)
2
Course Webpage
• https://web.eecs.umich.edu/~wangluxy/courses/eecs498_wn2021/eecs498_wn2021.html
• Slides, (tentative) schedule of lecture topics, and upcoming due dates
• You can also go to the instructor’s web page and find it from there:
• https://web.eecs.umich.edu/~wangluxy
3
The Goal
• Study fundamental tasks in NLP
• Given the remote teaching mode, we will take small breaks (e.g., 5 minutes)
every 15-20 minutes, depending on the progress
• During the break, you’ll have the chance to write down questions in a shared Google doc
4
Prerequisites
• Programming
• Ability to write code proficiently in some programming language (Python
recommended)
• Courses
• Algorithms
• Probability and statistics
• Linear algebra (optional but highly recommended)
• Supervised machine learning (also optional but highly recommended)
5
Prerequisites
Great notes on probability, statistics, and linear algebra
• Probability and Statistics for Data Science, by Carlos Fernandez-Granda
• https://cims.nyu.edu/~cfgranda/pages/stuff/probability_stats_for_DS.pdf
• No need to be proficient in all aspects!
6
Textbook and References
• Main textbook
• Dan Jurafsky and James H. Martin, "Speech and Language Processing, 2nd Edition",
Prentice Hall, 2009.
• We will also use some material from the 3rd edition (for the parts that are available).
• http://web.stanford.edu/~jurafsky/slp3/
• Other references
• Jacob Eisenstein, "Introduction to Natural Language Processing", The MIT Press,
2019
• Chris Manning and Hinrich Schütze, "Foundations of Statistical Natural Language
Processing", MIT Press, 1999
7
Topics of the Course (tentative)
Basic concepts
• Language Modeling
• Part-of-Speech Tagging
• Text Classification
• Syntax: Formal Grammars of English, Syntactic Parsing, Statistical Parsing, Dependency Parsing
• Semantics: Vector-Space, Lexical Semantics, Semantics with Dense Vectors
Applications
• Information Extraction
• Summarization
• Question Answering
• Sentiment Analysis
• Dialog Systems and Chatbots
• Machine Translation
• Coreference Resolution
• Discourse Analysis
8
Grading
• Assignments (60%)
• 4 assignments, 15% each
• Project (35%) (details coming soon)
• Participation (5%)
• Classes: attend, ask and answer questions, participate in discussions…
• Piazza: help your peers, address questions…
9
Course Project
• An NLP-related project
10
Course Project Grading
• The problem needs to be well-defined, useful, and practical.
11
Sample Projects
• Text style transfer (impolite -> polite, positive->negative)
• https://web.eecs.umich.edu/~wangluxy/courses/eecs498_wn2021/material_eecs498_wn21/report1.pdf
• https://web.eecs.umich.edu/~wangluxy/courses/eecs498_wn2021/material_eecs498_wn21/report5.pdf
12
More Project Samples
• Stanford NLP class
• http://web.stanford.edu/class/cs224n
• Notice its focus on deep learning
• Your project can use any machine learning technique(s) on a natural
language processing problem, and shouldn’t be limited to deep learning only.
13
Course Project
• Talk to the instructor and IAs about project topics!
• Zoom meetings (~10 minutes) will be arranged during the week of Feb 1st.
14
Course Project Grading
• Three reports
• One-page proposal (5%), due on Feb 12th at 11:59pm.
• Progress report, with code (8%)
• Final, with code (12%)
• One presentation
• In class (7%)
• Feedback on other teams' presentations (3%)
15
Audience Award
• Bonus points!
• All teams vote for their favorite project(s) after the
presentations.
• The team that gets the most votes will be awarded a 1%
bonus!
16
Submission and Late Policy
• Programming language
• Python (recommended)
17
Submission and Late Policy
• Late submissions lose 20 points (out of 100) for each late day (i.e., each
24-hour period).
• Each student has a total budget of 8 late days for the semester; the late
penalty applies only after this budget is used up.
• For a group submission, every group member is charged the same number of
late days, if any.
18
Get in touch!
• All materials and schedule can be found on the course webpage:
• https://web.eecs.umich.edu/~wangluxy/courses/eecs498_wn2021/eecs498_wn2021.html
• Office hours
• Prof. Lu Wang: Wednesdays, 12pm - 1pm (Zoom link is provided on Piazza and Canvas)
• IA Yue Kuang, Thursdays 5pm - 6pm, online via Zoom
• IA Ruobing Wang, Tuesdays 8pm - 9pm, online via Zoom
• Piazza
• http://piazza.com/umich/winter2021/eecs498004 (please sign up)
• All course-related questions should go here!
19
What is Natural Language Processing?
• Allowing machines to communicate with humans
21
What does it mean to understand a language?
• Phonology: sound waves
• Morphology and lexemes: words
• Syntax: parse trees
• Semantics, pragmatics, and discourse: meanings
23
What does it mean to understand a language?
Levels ordered from shallower to deeper analysis:
• Phonology
• Morphology
• Lexemes
• Syntax
• Semantics
• Pragmatics
• Discourse
24
Syntax, Semantics, Pragmatics
• Syntax concerns the proper ordering of words and its effect on meaning.
• The dog bit the boy.
• The boy bit the dog.
• Bit boy dog the the.
• Semantics concerns the (literal) meaning of words, phrases, and sentences.
• “plant” as a photosynthetic organism
• “plant” as a manufacturing facility
• “plant” as the act of sowing
• Pragmatics concerns the overall communicative and social context and its
effect on interpretation.
• Honest or dishonest?
• Context 1: Kyle and Ellen would like to see a movie. Kyle has $20 in his pocket. Tickets cost $8 each.
• Context 2: Kyle and Ellen would like to see a movie. Kyle has $20 in his pocket. Tickets cost $10 each.
• Kyle: “I have $8.”
26
Where is NLP used?
27
Commercial World
28
Social World
• Disaster Relief
• Chatbots for Mental Health
• Detecting abusive language in online posts
29
Text Classification: Disaster Response
31
Sentiment in Restaurant Reviews
A very bad (one-star) review:
Dan Jurafsky, Victor Chahuneau, Bryan R. Routledge, and Noah A. Smith. 2014. Narrative
framing of consumer sentiment in online restaurant reviews. First Monday 19:4
32
What is the language of bad reviews?
• Negative sentiment language
horrible awful terrible bad disgusting
• Past narratives about people
waited, didn’t, was
he, she, his, her,
manager, customer, waitress, waiter
• Frequent mentions of we and us
... we were ignored until we flagged down a waiter to get our waitress
…
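The cues listed above can be turned into simple counts over a review. The following is a minimal sketch, assuming plain lowercase word matching; the group names and the `count_cues` function are illustrative and are not taken from the cited study.

```python
import re
from collections import Counter

# Cue words copied from the slide; the group names are illustrative labels.
CUES = {
    "negative": {"horrible", "awful", "terrible", "bad", "disgusting"},
    "past_narrative": {"waited", "didn't", "was"},
    "people": {"he", "she", "his", "her", "manager", "customer", "waitress", "waiter"},
    "first_person_plural": {"we", "us"},
}

def count_cues(review: str) -> Counter:
    """Count how many tokens of the review fall into each cue group."""
    # Normalize curly apostrophes so that "didn’t" matches "didn't".
    text = review.lower().replace("’", "'")
    tokens = re.findall(r"[a-z']+", text)
    counts = Counter()
    for token in tokens:
        for group, words in CUES.items():
            if token in words:
                counts[group] += 1
    return counts

snippet = "we were ignored until we flagged down a waiter to get our waitress"
print(count_cues(snippet))
```

On the review snippet from the slide, this counts two first-person-plural mentions and two people mentions. Counts like these could serve as features for a text classifier, but they are only surface cues.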
33
Personal Assistants
34
Question Answering: IBM’s Watson
35
Recommendation Engines
If you bought….
36
Why is NLP challenging?
37
Ambiguity is Ubiquitous
• Speech Recognition
• “recognize speech” vs. “wreck a nice beach”
• “youth in Asia” vs. “euthanasia”
• Syntactic Analysis
• “I ate spaghetti with chopsticks” vs. “I ate spaghetti with meatballs.”
• Semantic Analysis
• “The dog is in the pen.” vs. “The ink is in the pen.”
• “I put the plant in the window” vs. “Ford put the plant in Mexico”
• Pragmatic Analysis
• From “The Pink Panther Strikes Again”:
• Clouseau: Does your dog bite?
Hotel Clerk: No.
Clouseau: [bowing down to pet the dog] Nice doggie.
[Dog barks and bites Clouseau in the hand]
Clouseau: I thought you said your dog did not bite!
Hotel Clerk: That is not my dog.
41
Ambiguity
Find at least 6 meanings of this sentence:
I made her duck
• I cooked waterfowl for her benefit (to eat)
• I cooked waterfowl belonging to her
• I created the (plaster?) waterfowl she owns
• I caused her to quickly lower her head or body
• I recognized the true identity of her spy waterfowl
• I waved my magic wand and turned her into undifferentiated waterfowl
43
Ambiguity
• “I caused her to quickly lower her head or body.”
• Part of speech: “duck” can be a noun or a verb.
• “I cooked waterfowl belonging to her.”
• Part of speech: “her” can be a possessive pronoun (“of her”) or a dative pronoun (“for her”).
• “I made the (plaster) duck statue she owns.”
• Word meaning: “make” can mean “create” or “cook”.
44
Ambiguity is Explosive
• Ambiguities compound to generate enormous numbers of possible
interpretations.
• In English, a sentence ending in n prepositional phrases has at least 2^n
syntactic interpretations:
• “I saw the man with the telescope.”: 2 parses
• “I saw the man on the hill with the telescope.”: 5 parses
• “I saw the man on the hill in Texas with the telescope”: 14 parses
• “I saw the man on the hill in Texas with the telescope at noon.”: 42 parses
• “I saw the man on the hill in Texas with the telescope at noon on Monday”:
132 parses
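The parse counts listed above (2, 5, 14, 42, 132) coincide with the Catalan numbers C_2 through C_6. The sketch below reproduces them and compares each with 2^n. The function names are illustrative, and the mapping from n prepositional phrases to C_(n+1) is an assumption that matches the listed counts; it is not derived from a grammar here.

```python
from math import comb

def catalan(k: int) -> int:
    """k-th Catalan number: C_k = (2k choose k) / (k + 1)."""
    return comb(2 * k, k) // (k + 1)

def parses_with_pps(n: int) -> int:
    """Parse count for a sentence ending in n prepositional phrases.

    Assumption: the count equals C_(n+1), which matches the slide's numbers.
    """
    return catalan(n + 1)

for n in range(1, 6):
    # Columns: number of PPs, number of parses, 2^n for comparison.
    print(n, parses_with_pps(n), 2 ** n)
```

Each added prepositional phrase multiplies the number of parses by a factor that approaches 4, so enumerating every interpretation quickly becomes impractical.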
45
Humor and Ambiguity
• Many jokes rely on the ambiguity of language:
• Policeman to little boy: “We are looking for a thief with a bicycle.” Little boy:
“Wouldn’t you be better using your eyes?”
• Why is the teacher wearing sunglasses? Because the class is so bright.
• Groucho Marx: One morning I shot an elephant in my pajamas. How he got
into my pajamas, I’ll never know.
• She criticized my apartment, so I knocked her flat.
• Noah took all of the animals on the ark in pairs. Except the worms, they came
in apples.
46
Why is Language Ambiguous?
• Having a unique linguistic expression for every possible
conceptualization that could be conveyed would make language
overly complex and linguistic expressions unnecessarily long.
• Allowing resolvable ambiguity permits shorter linguistic expressions,
i.e. data compression.
• Language relies on people’s ability to use their knowledge and
inference abilities to properly resolve ambiguities.
• Infrequently, disambiguation fails, i.e. the compression is lossy.
48
More difficulties: Non-standard language
50
Syntactic Tasks
51
Word Segmentation
• Breaking a string of characters into a sequence of words.
• In some written languages (e.g. Chinese) words are not separated by spaces.
• Even in English, characters other than white-space can be used to separate
words [e.g. , ; . - : ( ) ]
• Examples from English URLs:
• jumptheshark.com ⇒ jump the shark .com
• twitter.com/realdonaldtrump ⇒ twitter .com real donald trump
• myspace.com/pluckerswingbar
⇒ myspace .com pluckers wing bar
⇒ myspace .com plucker swing bar
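The ambiguity in the last example can be made concrete by enumerating every split of a string into known words. This is a minimal sketch, assuming a tiny hand-picked vocabulary; `VOCAB` and `segmentations` are hypothetical names, and real segmenters score candidate splits (e.g., with a language model) rather than listing them all.

```python
from typing import List

# Hypothetical toy vocabulary, chosen only to reproduce the slide's example.
VOCAB = {"pluckers", "plucker", "wing", "swing", "bar"}

def segmentations(text: str, vocab=VOCAB) -> List[List[str]]:
    """Return every way to split `text` into a sequence of vocabulary words."""
    if not text:
        return [[]]  # exactly one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in vocab:
            # Extend each segmentation of the remainder with this prefix.
            for rest in segmentations(text[end:], vocab):
                results.append([prefix] + rest)
    return results

for words in segmentations("pluckerswingbar"):
    print(" ".join(words))
```

Both readings from the slide come out as valid, so a segmenter needs additional evidence to choose between them.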
52
Morphological Analysis
• Morphology is the field of linguistics that studies the internal structure of words.
(Wikipedia)
• A morpheme is the smallest linguistic unit that has semantic meaning (Wikipedia)
• e.g. “carry”, “pre”, “ed”, “ly”, “s”
• Morphological analysis is the task of segmenting a word into its morphemes:
• carried ⇒ carry + ed (past tense)
• independently ⇒ in + (depend + ent) + ly
• Googlers ⇒ (Google + er) + s (plural)
• unlockable ⇒ un + (lock + able) ?
⇒ (un + lock) + able ?
53
Part Of Speech (POS) Tagging
• Annotate each word in a sentence with a part-of-speech.
• I/Pro ate/V the/Det spaghetti/N with/Prep meatballs/N .
• John/PN saw/V the/Det saw/N and/Con decided/V to/Part take/V it/Pro to/Prep the/Det table/N .
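The second example shows why tagging needs context. The sketch below assigns each word one fixed tag from a lookup table and then lists where that guess disagrees with the tags on the slide. It is a deliberately naive baseline; `LEXICON` is a hypothetical table written for this example, not a real tagger's dictionary.

```python
# Hypothetical one-tag-per-word lexicon using the slide's tag names.
LEXICON = {
    "john": "PN", "saw": "V", "the": "Det", "and": "Con",
    "decided": "V", "to": "Part", "take": "V", "it": "Pro", "table": "N",
}

def lookup_tag(words):
    """Tag each word with its single lexicon entry, ignoring context."""
    # Unknown words default to noun, a common fallback for this kind of baseline.
    return [(w, LEXICON.get(w.lower(), "N")) for w in words]

sentence = "John saw the saw and decided to take it to the table".split()
gold = "PN V Det N Con V Part V Pro Prep Det N".split()  # tags from the slide

tagged = lookup_tag(sentence)
# Collect (word, guessed tag, slide tag) wherever the context-free guess is wrong.
errors = [(w, guess, g) for (w, guess), g in zip(tagged, gold) if guess != g]
print(errors)
```

The two errors are the second "saw" (a noun) and the second "to" (a preposition): words whose tag depends on their neighbors.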
54
Phrase Chunking
• Find all non-recursive noun phrases (NPs) and verb phrases (VPs) in a
sentence.
• [NP I] [VP ate] [NP the spaghetti] [PP with] [NP meatballs].
• [NP He ] [VP reckons ] [NP the current account deficit ] [VP will narrow ] [PP
to ] [NP only # 1.8 billion ] [PP in ] [NP September ]
55
Syntactic Parsing
• Produce the correct syntactic parse tree for a sentence.
56
Semantic Tasks
57
Word Sense Disambiguation (WSD)
• Words in natural language usually have a fair number of different
possible meanings.
• Ellen has a strong interest in computational linguistics.
• Ellen pays a large amount of interest on her credit card.
• For many tasks (question answering, translation), the proper sense of
each ambiguous word in a sentence must be determined.
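One simple way to pick a sense is to compare the words around the ambiguous word with a short description of each sense and choose the sense with the largest overlap. This is a gloss-overlap (Lesk-style) heuristic that does not appear on the slide. The sketch below assumes two hand-written glosses for "interest"; `SENSES`, `STOPWORDS`, and `disambiguate` are hypothetical names.

```python
# Hypothetical sense inventory: sense label -> short gloss written for this sketch.
SENSES = {
    "curiosity": "a feeling of wanting to learn more about a subject such as linguistics",
    "finance": "money paid for borrowing money such as on a loan or credit card",
}
STOPWORDS = {"a", "of", "to", "on", "in", "for", "or", "as", "has", "her",
             "such", "about", "more"}

def content_words(text: str) -> set:
    """Lowercase, strip trailing punctuation, and drop stopwords."""
    words = (w.strip(".,").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}

def disambiguate(sentence: str, target: str = "interest") -> str:
    """Pick the sense whose gloss shares the most content words with the sentence."""
    context = content_words(sentence) - {target}
    return max(SENSES, key=lambda s: len(context & content_words(SENSES[s])))

s1 = "Ellen has a strong interest in computational linguistics."
s2 = "Ellen pays a large amount of interest on her credit card."
print(disambiguate(s1), disambiguate(s2))
```

The result depends entirely on how the glosses are worded, which is one reason practical systems learn sense distinctions from data instead.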
58
Semantic Role Labeling (SRL)
• For each clause, determine the semantic role played by each noun
phrase that is an argument to the verb.
• Roles: agent, patient, source, destination, instrument
• [agent John] drove [patient Mary] from [source Austin] to [destination Dallas] in [instrument his Toyota Prius].
• [instrument The hammer] broke [patient the window].
• Also referred to as “case role analysis,” “thematic analysis,” and
“shallow semantic parsing”
59
Semantic Parsing
• A semantic parser maps a natural-language sentence to a complete,
detailed semantic representation (logical form).
• For many applications, the desired output is immediately executable
by another program.
• Example: Mapping an English database query to Prolog:
How many cities are there in the US?
answer(A, count(B, (city(B), loc(B, C),
const(C, countryid(USA))),
A))
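To illustrate what "immediately executable" means, the sketch below evaluates a Python counterpart of the logical form above against a tiny fact base. The data and the `count_cities_in` function are hypothetical. A real semantic parser would generate the query from the sentence; here it is written by hand.

```python
# Hypothetical toy fact base standing in for a geography database:
# city name -> country identifier.
CITY_LOCATIONS = {
    "austin": "usa",
    "dallas": "usa",
    "ann arbor": "usa",
    "toronto": "canada",
}

def count_cities_in(country_id: str) -> int:
    """Hand-written counterpart of the slide's count(B, (city(B), loc(B, C), ...)) query."""
    # B ranges over the cities; loc(B, C) with C fixed to one country filters them.
    return sum(1 for country in CITY_LOCATIONS.values() if country == country_id)

# "How many cities are there in the US?"
print(count_cities_in("usa"))
```

The hard part of semantic parsing is producing the query from the question, not running it.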
60
Textual Entailment
• Determine whether one natural language sentence entails (implies)
another under an ordinary interpretation.
• E.g., “A soccer game with multiple males playing.” → “Some men are
playing a sport.”
61
Pragmatics/Discourse Tasks
62
Anaphora Resolution/Co-Reference
• Determine which phrases in a document refer to the same underlying
entity.
• John put the carrot on the plate and ate it.
• Bush started the war in Iraq. But the president needed the consent of
Congress.
• Some cases require difficult reasoning.
• Today was Jack's birthday. Penny and Janet went to the store. They were going
to get presents. Janet decided to get a kite. "Don't do that," said Penny. "Jack
has a kite. He will make you take it back."
63
More Application-driven Tasks
64
Information Extraction (IE)
• Identify phrases in language that refer to specific types of entities and
relations in text.
• Named entity recognition is the task of identifying names of people, places,
organizations, etc. in text.
• [person Michael Dell] is the CEO of [organization Dell Computer Corporation] and lives in [place Austin Texas].
• Relation extraction identifies specific relations between entities.
• Michael Dell is the CEO of Dell Computer Corporation and lives in Austin Texas.
• CEO-of(Michael Dell, Dell Computer Corporation)
• lives-in(Michael Dell, Austin Texas)
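Relation extraction on the example sentence can be sketched with hand-written surface patterns. This is a minimal illustration, assuming that entity names are runs of capitalized words; the patterns and the `extract_relations` function are hypothetical, and such patterns break on most real text.

```python
import re

NAME = r"[A-Z]\w+(?: [A-Z]\w+)*"  # one or more capitalized words

# Hypothetical surface patterns, written only for the slide's sentence.
PATTERNS = {
    "ceo_of": re.compile(rf"(?P<arg1>{NAME}) is the CEO of (?P<arg2>{NAME})"),
    "lives_in": re.compile(rf"(?P<arg1>{NAME}) is .*? lives in (?P<arg2>{NAME})"),
}

def extract_relations(sentence: str):
    """Return (entity, relation, entity) triples for each pattern that matches."""
    triples = []
    for relation, pattern in PATTERNS.items():
        match = pattern.search(sentence)
        if match:
            triples.append((match.group("arg1"), relation, match.group("arg2")))
    return triples

sentence = ("Michael Dell is the CEO of Dell Computer Corporation "
            "and lives in Austin Texas.")
for triple in extract_relations(sentence):
    print(triple)
```

The output is one triple per relation, which is the structured form that downstream applications such as question answering can query.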
65
Question Answering
• Directly answer natural language questions based on information
presented in a corpus of textual documents (e.g. the web).
• Who is the president of the United States?
• Donald Trump
66
Text Summarization
• Produce a short summary of one or many longer document(s).
• Article: An international team of scientists studied diet and mortality in
135,335 people between 35 and 70 years old in 18 countries, following
them for an average of more than seven years. Diet information
depended on self-reports, and the scientists controlled for factors
including age, sex, smoking, physical activity and body mass index. The
study is in The Lancet. Compared with people who ate the lowest 20
percent of carbohydrates, those who ate the highest 20 percent had a 28
percent increased risk of death. But high carbohydrate intake was not
associated with cardiovascular death. …
68
Machine Translation
• Translate a sentence from one natural language to another.
• 我喜欢汉堡 → I like burgers.
69
Ambiguity Resolution is Required for Translation
• Syntactic and semantic ambiguities must be properly resolved for
correct translation:
• “John plays the guitar.” → “John 弹 吉他”
• “John plays soccer.” → “John 踢 足球”
• An apocryphal story is that an early MT system gave the following
results when translating from English to Russian and then back to
English:
• “The spirit is willing but the flesh is weak.” → “The liquor is good but the meat
is spoiled.”
• “Out of sight, out of mind.” → “Invisible idiot.”
71
Bias and Ethics
72
Resolving Ambiguity
• Choosing the correct interpretation of linguistic utterances requires
(commonsense) knowledge of:
• Syntax
• An agent is typically the subject of the verb
• Semantics
• Michael and Ellen are names of people
• August is the name of a month (and of a person)
• Toyota is a car company and Prius is a brand of car
• Pragmatics
• Social norms and communicative goals
• Asking a question, expecting an answer
• World knowledge
• Credit cards require users to pay financial interest
• Agents must be animate and a hammer is not animate
74
State of the Art
• Learning from large amounts of text data (cf. rule-based methods)
• Supervised learning or unsupervised learning
• Statistical machine learning-based methods
• The probabilistic knowledge acquired allows robust processing that handles
linguistic regularities as well as exceptions.
• Now mostly neural network-based methods
75
Related Fields
• Artificial Intelligence
• Machine Learning
• Linguistics
• Cognitive science
• Logic
• Data science
• Political science
• Education
• Economics
• …many more
76
Relevant Scientific Conferences and Journals
• Association for Computational Linguistics (ACL)
• North American Chapter of the Association for Computational Linguistics (NAACL)
• Empirical Methods in Natural Language Processing (EMNLP)
• International Conference on Computational Linguistics (COLING)
• Conference on Computational Natural Language Learning (CoNLL)
• Transactions of the Association for Computational Linguistics (TACL)
• Journal of Computational Linguistics (CL)
77