
Natural Language Processing 1


Lecture 5: Lexical and distributional semantics

Katia Shutova

ILLC
University of Amsterdam

12 November 2018

Semantics

Compositional semantics:
- studies how the meanings of phrases are constructed out of the meanings of individual words
- principle of compositionality: the meaning of each whole phrase is derivable from the meanings of its parts
- sentence structure conveys some of this meaning: it is obtained from the syntactic representation

Lexical semantics:
- studies how the meanings of individual words can be represented and induced
Words and concepts

What is lexical meaning?

- recent results in psychology and cognitive neuroscience give us some clues
- but we don’t have the whole picture yet
- different representations proposed, e.g.
  - formal semantic representations based on logic,
  - or taxonomies relating words to each other,
  - or distributional representations in statistical NLP
- but none of the representations gives us a complete account of lexical meaning

How to approach lexical meaning?


- Formal semantics: set-theoretic approach
  e.g., cat′: the set of all cats; bird′: the set of all birds.
- meaning postulates, e.g.

  ∀x[bachelor′(x) → man′(x) ∧ unmarried′(x)]

- Limitations, e.g. is the current Pope a bachelor?
- Defining concepts through enumeration of all of their features is, in practice, highly problematic
- How would you define e.g. chair, tomato, thought, democracy? – impossible for most concepts
- Prototype theory offers an alternative to set-theoretic approaches

Prototype theory

- introduced the notion of graded semantic categories
- no clear boundaries
- no requirement that a property or set of properties be shared by all members
- certain members of a category are more central or prototypical (i.e. instantiate the prototype)
  furniture: chair is more prototypical than stool

Eleanor Rosch (1975). Cognitive Representation of Semantic Categories. Journal of Experimental Psychology.

Prototype theory (continued)

- Categories form around prototypes; new members added on the basis of resemblance to the prototype
- Features/attributes generally graded
- Category membership a matter of degree
- Categories do not have clear boundaries

Semantic relations

Hyponymy: IS-A

dog is a hyponym of animal
animal is a hypernym of dog

- hyponymy relationships form a taxonomy
- works best for concrete nouns
- multiple inheritance: e.g., is coin a hyponym of both metal and money?

Other semantic relations

Meronymy: PART-OF e.g., arm is a meronym of body, steering wheel is a meronym of car (piece vs part)
Synonymy e.g., aubergine/eggplant
Antonymy e.g., big/little
Also:
Near-synonymy/similarity e.g., exciting/thrilling, slim/slender/thin/skinny

WordNet

- large-scale, open-source resource for English
- hand-constructed
- wordnets being built for other languages
- organized into synsets: synonym sets (near-synonyms)
- synsets connected by semantic relations

S: (v) interpret, construe, see (make sense of; assign a meaning to) "How do you interpret his behavior?"
S: (v) understand, read, interpret, translate (make sense of a language) "She understands French"; "Can you read Greek?"
Polysemy

Polysemy and word senses

The children ran to the store


If you see this man, run!
Service runs all the way to Cranbury
She is running a relief operation in Sudan
the story or argument runs as follows
Does this old car still run well?
Interest rates run from 5 to 10 percent
Who’s running for treasurer this year?
They ran the tapes over and over again
These dresses run small

Polysemy
- homonymy: unrelated word senses. bank (raised land) vs bank (financial institution)
- bank (financial institution) vs bank (in a casino): related but distinct senses
- regular polysemy and sense extension:
  - zero-derivation, e.g. tango (N) vs tango (V), or rabbit, turkey, halibut (meat / animal)
  - metaphorical senses, e.g. swallow [food], swallow [information], swallow [anger]
  - metonymy, e.g. he played Bach; he drank his glass
- vagueness: nurse, lecturer, driver
- cultural stereotypes: nurse, lecturer, driver
No clear-cut distinctions.

Word sense disambiguation


- Needed for many applications
- relies on context, e.g. collocations: striped bass (the fish) vs bass guitar
Methods:
- supervised learning (a toy sketch follows below):
  - assume a predefined set of word senses, e.g. WordNet
  - need a large sense-tagged training corpus (difficult to construct)
- semi-supervised learning (Yarowsky, 1995):
  - bootstrap from a few examples
- unsupervised sense induction:
  - e.g. cluster the contexts in which a word occurs
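A toy sketch of WSD as supervised classification over bag-of-context-words features, using scikit-learn (assumed installed). The tiny sense-tagged "corpus" below is invented purely for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented sense-tagged contexts for the ambiguous word "bass".
train_contexts = [
    "caught a striped bass in the lake",      # sense: fish
    "grilled bass served with lemon",         # sense: fish
    "played the bass guitar on stage",        # sense: music
    "turned up the bass on the amplifier",    # sense: music
]
train_senses = ["fish", "fish", "music", "music"]

# Bag-of-words features over the context + Naive Bayes classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(train_contexts, train_senses)

print(clf.predict(["a loud bass guitar solo"]))   # -> ['music']
```

In practice the training data would be a large sense-tagged corpus and the features would include collocations and wider context windows, but the classification setup is the same.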
Word sense disambiguation

WSD by semi-supervised learning

Yarowsky, David (1995). Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. Proceedings of ACL.

Disambiguating plant (factory vs vegetation senses):

1. Find contexts in training corpus:

sense   training example
?       company said that the plant is still operating
?       although thousands of plant and animal species
?       zonal distribution of plant life
?       company manufacturing plant is in Orlando
etc

Yarowsky (1995): schematically

Initial state

[Figure: all occurrences of plant are still unlabelled, shown as a scatter of ? marks]

2. Identify some seeds to disambiguate a few uses:
‘plant life’ for the vegetation use (A)
‘manufacturing plant’ for the factory use (B)

sense   training example
?       company said that the plant is still operating
?       although thousands of plant and animal species
A       zonal distribution of plant life
B       company manufacturing plant is in Orlando
etc

Seeds

[Figure: the seed collocations label small clusters: points near ‘manu.’ become B (factory), points near ‘life’ become A (vegetation); all other points remain ?]

3. Train a decision list classifier on the Sense A / Sense B examples.
Rank features by log-likelihood ratio:

  log ( P(Sense_A | f_i) / P(Sense_B | f_i) )

reliability   criterion                           sense
8.10          plant life                          A
7.58          manufacturing plant                 B
6.27          animal within 10 words of plant     A
etc
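A minimal sketch of this ranking step, assuming we already have counts of how often each feature co-occurs with each sense. The counts below are invented, and the add-alpha smoothing is a simple stand-in rather than Yarowsky's exact estimator:

```python
import math

# Hypothetical feature/sense co-occurrence counts (invented for illustration).
counts = {
    "plant life":                      {"A": 120, "B": 1},
    "manufacturing plant":             {"A": 1,   "B": 85},
    "animal within 10 words of plant": {"A": 60,  "B": 2},
}

def llr(feature, alpha=0.1):
    """Smoothed log-likelihood ratio log P(A | f) / P(B | f)."""
    a = counts[feature]["A"] + alpha
    b = counts[feature]["B"] + alpha
    return math.log(a / b)

# Decision list: features sorted by the absolute value of the ratio,
# each predicting the sense it favours.
decision_list = sorted(
    ((abs(llr(f)), f, "A" if llr(f) > 0 else "B") for f in counts),
    reverse=True,
)
for reliability, feature, sense in decision_list:
    print(f"{reliability:5.2f}  {feature:35s}  {sense}")
```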

4. Apply the classifier to the training set and add reliable examples to the A and B sets.

sense   training example
?       company said that the plant is still operating
A       although thousands of plant and animal species
A       zonal distribution of plant life
B       company manufacturing plant is in Orlando
etc

5. Iterate steps 3 and 4 until convergence.

Iterating:

[Figure: the A and B regions grow outward from the seeds as new examples are labelled; new features such as ‘animal’ (sense A) and ‘company’ (sense B) are picked up along the way]

Final:

[Figure: at convergence nearly every training example is labelled A or B]

6. Apply the classifier to the unseen test data.

- ‘one sense per discourse’ can be used as an additional refinement
- Yarowsky’s experiments were nearly all on homonyms: these principles may not hold as well for sense extension

Problems with WSD as supervised classification

Yarowsky reported an accuracy of 95%, but ...
- on ‘easy’ homonymous examples
- real performance is around 75% (supervised)
- need to predefine word senses (not theoretically sound)
- need a very large training corpus (difficult to annotate; humans do not agree)
- learn a model for individual words — no real generalisation
Better way:
- unsupervised sense induction (but a very hard task)

Distributional hypothesis

“You shall know a word by the company it keeps” (Firth)
“The meaning of a word is defined by the way it is used” (Wittgenstein)

it was authentic scrumpy, rather sharp and very strong
we could taste a famous local product — scrumpy
spending hours in the pub drinking scrumpy
Cornish Scrumpy Medium Dry. £19.28 - Case

Scrumpy

[Figure: photo of scrumpy, a strong West Country cider]

Distributional hypothesis

This leads to the distributional hypothesis about word meaning:
- the context surrounding a given word provides information about its meaning;
- words are similar if they share similar linguistic contexts;
- semantic similarity ≈ distributional similarity.
Models

Distributional semantics

Distributional semantics: a family of techniques for representing word meaning based on (linguistic) contexts of use.

1. Count-based models:
- vector space models
- dimensions correspond to elements in the context
- words are represented as vectors, or higher-order tensors

2. Prediction models:
- train a model to predict plausible contexts for a word
- learn word representations in the process
Count-based models

Count-based approaches: the general intuition

- The semantic space has dimensions which correspond to possible contexts – features.
- For our purposes, a distribution can be seen as a point in that space (the vector being defined with respect to the origin of that space).
- scrumpy [...pub 0.8, drink 0.7, strong 0.4, joke 0.2, mansion 0.02, zebra 0.1...] (a similarity sketch follows below)
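Once words are points in this space, similarity can be measured geometrically, most commonly as the cosine of the angle between vectors. A minimal sketch with made-up vectors over a few hypothetical context dimensions (the cider and castle vectors are invented for contrast):

```python
import math

# Toy context vectors over the dimensions [pub, drink, strong, joke, mansion, zebra],
# loosely following the scrumpy example above.
scrumpy = [0.8, 0.7, 0.4, 0.2, 0.02, 0.1]
cider   = [0.7, 0.8, 0.3, 0.1, 0.01, 0.0]   # hypothetical similar word
castle  = [0.1, 0.0, 0.2, 0.0, 0.9,  0.0]   # hypothetical dissimilar word

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(scrumpy, cider))    # high: similar distributions
print(cosine(scrumpy, castle))   # low: dissimilar distributions
```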

Vectors

[Figure: word vectors plotted in the semantic space]

Feature matrix

         feature1   feature2   ...   featuren
word1    f1,1       f2,1       ...   fn,1
word2    f1,2       f2,2       ...   fn,2
...
wordm    f1,m       f2,m       ...   fn,m

The notion of context

1. Word windows (unfiltered): n words on either side of the lexical item (a counting sketch follows below).
Example: n=2 (5-word window):
| The prime minister acknowledged the | question.
minister [ the 2, prime 1, acknowledged 1, question 0 ]
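A minimal sketch of collecting unfiltered word-window counts for a target word (pure Python; lowercased whitespace tokenisation is an assumption made for brevity):

```python
from collections import Counter

def window_contexts(tokens, target, n=2):
    """Count words appearing within n positions of each occurrence of target."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            window = tokens[max(0, i - n):i] + tokens[i + 1:i + 1 + n]
            counts.update(window)
    return counts

sentence = "the prime minister acknowledged the question".split()
print(window_contexts(sentence, "minister", n=2))
# -> the: 2, prime: 1, acknowledged: 1  (question falls outside the window)
```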

Context

2. Word windows (filtered): n words on either side, removing some words (e.g. function words, some very frequent content words) via a stop-list or by POS tag.
Example: n=2 (5-word window), stop-list:
| The prime minister acknowledged the | question.
minister [ prime 1, acknowledged 1, question 0 ]

Context

3. Lexeme window (filtered or unfiltered): as above, but using stems.
Example: n=2 (5-word window), stop-list:
| The prime minister acknowledged the | question.
minister [ prime 1, acknowledge 1, question 0 ]

Context

4. Dependencies (directed links between heads and dependents). The context for a lexical item is the dependency structure it belongs to (various definitions; see the sketch below).
Example:
The prime minister acknowledged the question.
minister [ prime_a 1, acknowledge_v 1 ]
minister [ prime_a_mod 1, acknowledge_v_subj 1 ]
minister [ prime_a 1, acknowledge_v+question_n 1 ]
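One way to extract such dependency-based contexts is with an off-the-shelf parser. A minimal sketch using spaCy (assuming the en_core_web_sm model is installed); the relation labels it produces differ from the slide's notation:

```python
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")
doc = nlp("The prime minister acknowledged the question.")

contexts = Counter()
target = "minister"
for tok in doc:
    if tok.text == target:
        # dependents of the target, e.g. the modifier "prime"
        for child in tok.children:
            contexts[f"{child.lemma_}_{child.dep_}"] += 1
        # the head the target attaches to, e.g. "acknowledge" with the subject relation
        if tok.head is not tok:
            contexts[f"{tok.head.lemma_}_{tok.dep_}"] += 1

print(contexts)
# e.g. Counter({'prime_amod': 1, 'the_det': 1, 'acknowledge_nsubj': 1})
```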

Parsed vs unparsed data: examples

word (unparsed)   word (parsed)
meaning_n         or_c+phrase_n
derive_v          and_c+phrase_n
dictionary_n      syllable_n+of_p
pronounce_v       play_n+on_p
phrase_n          etymology_n+of_p
latin_j           portmanteau_n+of_p
ipa_n             and_c+deed_n
verb_n            meaning_n+of_p
mean_v            from_p+language_n
hebrew_n          pron_rel_+utter_v
usage_n           for_p+word_n
literally_r       in_p+sentence_n

Dependency vectors
word (Subj)   word (Dobj)
come_v        use_v
mean_v        say_v
go_v          hear_v
speak_v       take_v
make_v        speak_v
say_v         find_v
seem_v        get_v
follow_v      remember_v
give_v        read_v
describe_v    write_v
get_v         utter_v
appear_v      know_v
begin_v       understand_v
sound_v       believe_v
occur_v       choose_v

Context weighting

- Binary model: if context c co-occurs with word w, the value of word w's vector for dimension c is 1, 0 otherwise.
  ... [a long long long example for a distributional semantics] model ... (n=4)
  ... {a 1} {dog 0} {long 1} {sell 0} {semantics 1} ...
- Basic frequency model: the value of word w's vector for dimension c is the number of times that c co-occurs with w.
  ... [a long long long example for a distributional semantics] model ... (n=4)
  ... {a 2} {dog 0} {long 3} {sell 0} {semantics 1} ...

Characteristic model
- Weights given to the vector components express how characteristic a given context is for word w.
- Pointwise Mutual Information (PMI); a small computation sketch follows below:

  PMI(w, c) = log [ P(w, c) / (P(w) P(c)) ] = log [ P(w) P(c|w) / (P(w) P(c)) ] = log [ P(c|w) / P(c) ]

  where

  P(c) = f(c) / Σk f(ck),   P(c|w) = f(w, c) / f(w)

  so that

  PMI(w, c) = log [ f(w, c) · Σk f(ck) / (f(w) f(c)) ]

f(w, c): frequency of word w in context c
f(w): frequency of word w in all contexts
f(c): frequency of context c
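A minimal sketch of turning raw co-occurrence counts into PMI (or PPMI) weights, following the count-based definition above; the tiny count table is invented for illustration:

```python
import math

# Hypothetical co-occurrence counts f(w, c) (rows: words, columns: contexts).
f_wc = {
    "scrumpy": {"pub": 8, "drink": 7, "strong": 4, "zebra": 0},
    "zoo":     {"pub": 1, "drink": 1, "strong": 0, "zebra": 6},
}

contexts = next(iter(f_wc.values())).keys()
f_c = {c: sum(row[c] for row in f_wc.values()) for c in contexts}   # f(c)
total = sum(f_c.values())                                           # Σk f(ck)
f_w = {w: sum(row.values()) for w, row in f_wc.items()}             # f(w)

def pmi(w, c):
    if f_wc[w][c] == 0:
        return float("-inf")            # log 0; unseen pairs are usually clipped
    return math.log(f_wc[w][c] * total / (f_w[w] * f_c[c]))

def ppmi(w, c):
    return max(0.0, pmi(w, c)) if f_wc[w][c] else 0.0

print(round(ppmi("scrumpy", "pub"), 2))
print(round(ppmi("zoo", "zebra"), 2))
```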

What semantic space?

- Entire vocabulary.
  + All information included – even rare contexts
  - Inefficient (100,000s of dimensions). Noisy (e.g. 002.png|thumb|right|200px|graph_n). Sparse.
- Top n words with highest frequencies.
  + More efficient (2,000-10,000 dimensions). Only ‘real’ words included.
  - May miss out on infrequent but relevant contexts.

Word frequency: Zipfian distribution

[Figure: Zipfian distribution of word frequencies – a few words are very frequent, most are rare]

What semantic space?

- Singular Value Decomposition (SVD): the number of dimensions is reduced by exploiting redundancies in the data (a small sketch follows below).
  + Very efficient (200-500 dimensions). Captures generalisations in the data.
  - SVD matrices are not interpretable.
- Non-negative matrix factorization (NMF): similar to SVD in spirit, but performs the factorization differently.
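A minimal sketch of reducing a count (or PPMI) matrix with truncated SVD using numpy; the random matrix here is a stand-in for a real word-by-context matrix:

```python
import numpy as np

# Stand-in for an m-by-n word-by-context count/PPMI matrix.
rng = np.random.default_rng(0)
M = rng.random((500, 2000))

# Truncated SVD: keep only the top k singular values/vectors.
k = 100
U, S, Vt = np.linalg.svd(M, full_matrices=False)
word_vectors = U[:, :k] * S[:k]       # k-dimensional word representations

print(word_vectors.shape)             # (500, 100)
```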
Getting distributions from text

Our reference text

Douglas Adams, Mostly harmless


The major difference between a thing that might go wrong and
a thing that cannot possibly go wrong is that when a thing that
cannot possibly go wrong goes wrong it usually turns out to be
impossible to get at or repair.
- Example: produce distributions using a word-window, PMI-based model

The semantic space

Douglas Adams, Mostly harmless


The major difference between a thing that might go wrong and
a thing that cannot possibly go wrong is that when a thing that
cannot possibly go wrong goes wrong it usually turns out to be
impossible to get at or repair.
- Assume we keep only open-class words.
- Dimensions:

difference   impossible   thing
get          major        turns
go           possibly     usually
goes         repair       wrong

Frequency counts...

Douglas Adams, Mostly harmless


The major difference between a thing that might go wrong and
a thing that cannot possibly go wrong is that when a thing that
cannot possibly go wrong goes wrong it usually turns out to be
impossible to get at or repair.
- Counts:

difference 1   impossible 1   thing 3
get 1          major 1        turns 1
go 3           possibly 2     usually 1
goes 1         repair 1       wrong 4

Conversion into 5-word windows...

Douglas Adams, Mostly harmless


The major difference between a thing that might go wrong and
a thing that cannot possibly go wrong is that when a thing that
cannot possibly go wrong goes wrong it usually turns out to be
impossible to get at or repair.
- ∅ ∅ the major difference
- ∅ the major difference between
- the major difference between a
- major difference between a thing
- ...

Distribution for wrong

Douglas Adams, Mostly harmless


The major difference between a thing that [might go wrong and
a] thing that cannot [possibly go wrong is that] when a thing that
cannot [possibly go [wrong goes wrong] it usually] turns out to
be impossible to get at or repair.
- Distribution (frequencies):

difference 0   impossible 0   thing 0
get 0          major 0        turns 0
go 3           possibly 2     usually 1
goes 2         repair 0       wrong 2

Distribution for wrong

Douglas Adams, Mostly harmless


The major difference between a thing that [might go wrong and
a] thing that cannot [possibly go wrong is that] when a thing that
cannot [possibly go [wrong goes wrong] it usually] turns out to
be impossible to get at or repair.
- Distribution (PPMIs); one entry is worked through below:

difference 0   impossible 0    thing 0
get 0          major 0         turns 0
go 0.70        possibly 0.70   usually 0.70
goes 1         repair 0        wrong 0.40
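As a worked check of one entry, assuming the natural logarithm and taking f(wrong) = 10 (the total of the frequency distribution above) and Σk f(ck) = 20 (the total of the corpus counts):

P(go | wrong) = f(wrong, go) / f(wrong) = 3 / 10 = 0.3
P(go) = f(go) / Σk f(ck) = 3 / 20 = 0.15
PPMI(wrong, go) = max(0, log(0.3 / 0.15)) = log 2 ≈ 0.70

The possibly and usually entries come out the same way; the exact values of the remaining entries depend on how the overlapping windows are counted.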

You might also like