Lecture 20-23 Part of Speech Tagging

The document discusses part of speech (POS) tagging, distinguishing between open class (nouns, verbs, adjectives, adverbs) and closed class (determiners, pronouns, prepositions) words. It highlights the challenges of POS tagging in English, including ambiguity and the performance of current tagging methods, which achieve about 97% accuracy. Various algorithms for POS tagging, such as Hidden Markov Models and neural sequence models, are mentioned, along with the importance of training data and morphological analysis for unknown words.


Part Of Speech Tagging

Open class (lexical) words
• Nouns
  • Proper: IBM, Italy
  • Common: cat / cats, snow
• Verbs
  • Main: see, registered
• Adjectives: old, older, oldest
• Adverbs: slowly
• Numbers: 122,312, one
• … more

Closed class (functional) words
• Modals: can, had
• Determiners: the, some
• Prepositions: to, with
• Conjunctions: and, or
• Particles: off, up
• Pronouns: he, its
• Interjections: Ow, Eh
• … more


Open vs. Closed classes
• Closed:
• determiners: a, an, the
• pronouns: she, he, I
• prepositions: on, under, over, near, by, …
• Open:
• Nouns, Verbs, Adjectives, Adverbs.
Why Part of Speech Tagging?
• Parsing
• Machine Translation
• Sentiment or affective tasks
• Text To Speech
• Meaning
How difficult is POS tagging in English?
• Roughly 15% of word types are ambiguous
• But those 15% tend to be very common words, so approximately 60% of word tokens are ambiguous
• Words often have more than one POS: back
• The back door = JJ
• On my back = NN
• Win the voters back = RB
• Promised to back the bill = VB
• The POS tagging problem is to determine the POS tag for a particular
instance of a word.
POS tagging performance
• How many tags are correct? (Tag accuracy)
• About 97% currently
• But baseline is already 90%
• Baseline is performance of stupidest possible method
• Tag every word with its most frequent tag
• Tag unknown words as nouns
• Partly easy because
• Many words are unambiguous
• You get points for them (the, a, etc.) and for punctuation marks!
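The baseline above can be sketched in a few lines; the tiny tagged corpus here is invented for illustration, and the NN fallback for unknown words follows the slide:

```python
from collections import Counter, defaultdict

# Hypothetical tiny tagged corpus; a real baseline is trained on a treebank.
train = [("the", "DT"), ("back", "JJ"), ("door", "NN"),
         ("on", "IN"), ("my", "PRP$"), ("back", "NN"),
         ("win", "VB"), ("the", "DT"), ("voters", "NNS"), ("back", "NN")]

# Count (word, tag) pairs, then keep each word's most frequent tag.
counts = defaultdict(Counter)
for word, tag in train:
    counts[word][tag] += 1
most_frequent = {w: c.most_common(1)[0][0] for w, c in counts.items()}

def baseline_tag(word):
    # Unknown words default to NN (noun), as the slide suggests.
    return most_frequent.get(word, "NN")

print(baseline_tag("the"))     # DT
print(baseline_tag("back"))    # NN (2 of its 3 training occurrences)
print(baseline_tag("unseen"))  # NN fallback for unknown words
```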
Sources of information
• What are the main sources of information for POS tagging?
• Knowledge of neighboring words
• Bill saw that man yesterday
  NNP NN DT NN NN
  VB VB(D) IN VB NN
  (each column lists candidate tags for the word above it)
• Knowledge of word probabilities
• man is rarely used as a verb….
• The latter proves the most useful, but the former also helps
More and Better Features ➔ Feature-based tagger
• Can do surprisingly well just looking at a word by itself:
  • Word: the → DT
  • Lowercased word: importantly → RB
  • Prefixes: unfathomable: un- → JJ
  • Suffixes: Importantly: -ly → RB
  • Capitalization: Meridian: CAP → NNP
  • Word shapes: 35-year: d-x → JJ
• Then build a supervised machine learning model to predict the tag: P(t|w)
Hidden Markov Model
Zero Order Markov Model (Unigram Model)

p(x_1 … x_n, y_1 … y_{n+1}) = ∏_{i=1}^{n+1} q(y_i) · ∏_{i=1}^{n} e(x_i | y_i)

First Order Markov Model (Bigram Model)

p(x_1 … x_n, y_1 … y_{n+1}) = ∏_{i=1}^{n+1} q(y_i | y_{i-1}) · ∏_{i=1}^{n} e(x_i | y_i)
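Under the bigram model, the joint probability of a tagged sentence is just a product of transition (q) and emission (e) terms. A minimal sketch with invented toy tables (the start symbol `<s>` and stop symbol `</s>` stand for y_0 and y_{n+1}):

```python
# Hypothetical transition q(tag | prev) and emission e(word | tag) tables.
q = {("PRON", "<s>"): 0.6, ("VERB", "PRON"): 0.9,
     ("NOUN", "VERB"): 0.8, ("</s>", "NOUN"): 0.7}
e = {("he", "PRON"): 0.3, ("eats", "VERB"): 0.2, ("rice", "NOUN"): 0.25}

def joint_prob(words, tags):
    """p(x_1..x_n, y_1..y_{n+1}) under the bigram HMM above."""
    p = 1.0
    prev = "<s>"                       # y_0 is the start symbol
    for w, t in zip(words, tags):
        p *= q[(t, prev)] * e[(w, t)]  # q(y_i | y_{i-1}) * e(x_i | y_i)
        prev = t
    return p * q[("</s>", prev)]       # y_{n+1} is the stop symbol

print(joint_prob(["he", "eats", "rice"], ["PRON", "VERB", "NOUN"]))
```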
Unknown Words
• The strongest source of information for guessing the part of speech of unknown words is morphology:
  • words that end in -s are likely to be plural nouns (NNS),
  • words ending with -ed tend to be past participles (VBN),
  • words ending with -able tend to be adjectives (JJ).
Unknown Words
• Store, for each final letter sequence (word suffix) of up to 10 letters, the statistics of the tags it was associated with in training.
• We are thus computing, for each suffix of length i, the probability of the tag t_i given the suffix letters: P(t_i | l_{n-i+1} … l_n)
• Back-off is used to smooth these probabilities with successively shorter suffixes.
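A minimal sketch of the suffix-statistics idea; the tiny training list is invented, and the hard "longest seen suffix" back-off here is a simplification of the smoothed back-off the slide describes:

```python
from collections import Counter, defaultdict

# Tag counts per suffix of up to 10 letters (hypothetical training words).
suffix_tags = defaultdict(Counter)
for word, tag in [("registered", "VBN"), ("talked", "VBN"),
                  ("readable", "JJ"), ("cats", "NNS"), ("doors", "NNS")]:
    for i in range(1, min(10, len(word)) + 1):
        suffix_tags[word[-i:]][tag] += 1

def p_tag_given_suffix(word, tag):
    """P(tag | suffix), backing off from the longest seen suffix."""
    for i in range(min(10, len(word)), 0, -1):  # longest suffix first
        counts = suffix_tags.get(word[-i:])
        if counts:
            return counts[tag] / sum(counts.values())
    return 0.0

# The unseen word "unfolded" backs off to the suffix "ed", seen with VBN.
print(p_tag_given_suffix("unfolded", "VBN"))
```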
Unknown Words
• We can compute the likelihood p(w_i | t_i) (Prob(word | tag)) that HMMs require by Bayesian inversion (i.e., using Bayes' rule):

p(w_i | t_i) = P(t_i | w_i) · p(w_i) / p(t_i)
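A toy numeric illustration of the inversion; all three probabilities are made up for the example:

```python
# Hypothetical probabilities for one word/tag pair.
p_tag_given_word = 0.9   # P(NN | "snow"), e.g. from suffix statistics
p_word = 0.001           # p("snow")
p_tag = 0.15             # p(NN)

# Bayes' rule: p(word | tag) = P(tag | word) * p(word) / p(tag)
p_word_given_tag = p_tag_given_word * p_word / p_tag
print(p_word_given_tag)  # ≈ 0.006
```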
Tagging Problem
• Let |S| = 50 tags and sequence length n = 15.
• Brute-force enumeration must score |S|^n = 50^15 candidate tag sequences, which is infeasible, so we need an efficient decoding algorithm.
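The size of the search space can be checked directly; the n·|S|² step count for Viterbi assumes the standard dynamic-programming formulation:

```python
# Brute force must score every tag sequence: |S|^n candidates.
S, n = 50, 15
print(S ** n)       # 30517578125000000000000000 (about 3.05e25)

# Viterbi dynamic programming needs only about n * |S|^2 steps.
print(n * S ** 2)   # 37500
```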
Standard Algorithms for POS Tagging
• HMM (with Viterbi algorithm)
• Neural Sequence Models (RNN, Transformers)
• Large Language Models (like BERT), fine-tuned

• All require hand-labelled training data, and all achieve about equal performance (97% on English)
Training set
1. She/PRON eats/VERB fish/NOUN
2. He/PRON eats/VERB rice/NOUN
3. She/PRON likes/VERB rice/NOUN
4. Rice/NOUN is/VERB tasty/ADJ
5. Fish/NOUN is/VERB tasty/ADJ

• Test sentence: He likes fish rice → tag sequence: PRON VERB NOUN NOUN
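The transition and emission counts from these five sentences can be estimated and used to tag the test sentence. A sketch using greedy bigram decoding with a small smoothing constant (a simplification of full Viterbi; words are lowercased):

```python
from collections import Counter, defaultdict

# The five training sentences from the slide, lowercased.
train = [[("she", "PRON"), ("eats", "VERB"), ("fish", "NOUN")],
         [("he", "PRON"), ("eats", "VERB"), ("rice", "NOUN")],
         [("she", "PRON"), ("likes", "VERB"), ("rice", "NOUN")],
         [("rice", "NOUN"), ("is", "VERB"), ("tasty", "ADJ")],
         [("fish", "NOUN"), ("is", "VERB"), ("tasty", "ADJ")]]

# Count transitions q(tag | prev) and emissions e(word | tag).
trans, emit = defaultdict(Counter), defaultdict(Counter)
for sent in train:
    prev = "<s>"
    for w, t in sent:
        trans[prev][t] += 1
        emit[t][w] += 1
        prev = t

tags_set = list(emit.keys())

def greedy_tag(words, alpha=0.1):
    """Greedy left-to-right decoding; alpha smooths unseen transitions."""
    tags, prev = [], "<s>"
    for w in words:
        best = max(tags_set,
                   key=lambda t: (trans[prev][t] + alpha) * emit[t][w])
        tags.append(best)
        prev = best
    return tags

print(greedy_tag(["he", "likes", "fish", "rice"]))
```

The smoothing constant matters here: NOUN → NOUN never occurs in training, so without it the unseen transition would zero out the correct tag for the final "rice".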
Reading
• Chapter 8, Speech and Language Processing, Third Edition (Jurafsky & Martin)
