NLP Module 4
Subject In-charge
Ms. Pradnya Sawant
Assistant Professor
Room No. 405
email: [email protected]
Module 4
Semantic Analysis
Contents
▪ Introduction, meaning representation; Lexical
Semantics; Corpus study
▪ Study of various language dictionaries like WordNet,
BabelNet
▪ Relations among lexemes & their senses –
Homonymy, Polysemy, Synonymy, Hyponymy
▪ Semantic Ambiguity
▪ Word Sense Disambiguation (WSD); Knowledge-based
approach (Lesk's Algorithm)
▪ Supervised (Naïve Bayes, Decision List)
▪ Introduction to Semi-supervised method (Yarowsky)
and Unsupervised (Hyperlex)
Module 4
Lecture 1
▪ Lexical Semantics
▪ Relations among lexemes & their senses –
Homonymy, Polysemy, Synonymy, Hyponymy
Lexical Semantics
● Lexical Semantics is the study of word meaning.
● A lexeme is a pairing of a particular form (orthographic or
phonological) with its meaning.
● A lexicon is a finite list of lexemes.
● A lexeme is represented by a lemma.
● A lemma or citation form is the grammatical form that is used
to represent a lexeme.
○ Carpets → lemma : carpet
Lexical Semantics
● The specific forms sung or sing are called
wordforms.
● The process of mapping from a wordform to a
lemma is called lemmatization.
● Lemmatization is not always deterministic, since
it may depend on the context.
● E.g. the wordform found can map to the lemma
find (meaning ‘to locate’) or the lemma found (‘to
create an institution’).
Lexical Semantics
● Lemmas are Part-of-Speech specific; thus the
wordform tables has two possible lemmas, the noun
table and the verb table.
● One way to do lemmatization is via the
morphological parsing algorithms.
● But a lemma is not necessarily the same as the stem
from the morphological parse.
● E.g. celebrations →
stem : celebrate ; lemma : celebration.
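● A minimal sketch of POS-sensitive lemmatization using NLTK's WordNetLemmatizer (the choice of tool is an assumption; the slides do not prescribe one):
```python
# Lemmatization sketch with NLTK (tool choice is an assumption).
import nltk
nltk.download("wordnet", quiet=True)
from nltk.stem import WordNetLemmatizer

wnl = WordNetLemmatizer()

# Lemmas are POS-specific, so the part of speech must be supplied:
print(wnl.lemmatize("tables", pos="n"))  # 'table' (noun lemma)
print(wnl.lemmatize("tables", pos="v"))  # 'table' (verb lemma)
# Context-dependent case from the slide: 'found' as a form of 'find'.
print(wnl.lemmatize("found", pos="v"))   # 'find'
```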
Word Senses
● The meaning of a lemma can vary enormously given
the context.
● Consider two uses of the lemma bank, meaning
something like ‘financial institution’ and ‘river
bank’, respectively:
○ E.g. Instead, a bank can hold the investments in a
custodial account in the client’s name.
○ But as agriculture burgeons on the east bank, the river
will shrink even more.
Homonymy
• A sense (or word sense) is a discrete
representation of one aspect of the meaning of a
word.
• We will represent each sense by placing a
superscript on the orthographic form of the lemma
as in bank1 and bank2.
• The two senses are homonyms, and the relation
between the senses is one of homonymy.
Polysemy
● Polysemy is the phenomenon of a single word having multiple
related senses, e.g. bank as a financial institution and bank as
the building that houses such an institution.
Metonymy
● Metonymy is the use of one aspect of a concept or entity to
refer to a related aspect, e.g. using an author's name
(Jane Austen) to refer to that author's works.
Zeugma
● Zeugma conjoins two uses of a word so that two distinct
senses are forced at once; it serves as a test for distinct senses:
○ Which flights serve breakfast?
○ Does Lufthansa serve Philadelphia?
○ ?Does Lufthansa serve breakfast and Philadelphia?
Homophones
● Homophones are senses linked to lemmas with the same
pronunciation but different spellings, e.g. wood/would,
to/two/too.
Homographs
• Homographs are distinct senses linked to lemmas with the
same orthographic form but different pronunciations.
• This problem, related to that of homophones, arises in speech
synthesis, which must choose the correct pronunciation.
• E.g.
• She let him lead her into the center of the room. (/liːd/, verb)
• The pipes in the house were made of lead. (/lɛd/, noun)
How to define a sense?
• Can we just look in a dictionary?
How to define a sense?
• One approach to defining a word sense mirrors dictionary
definitions: a sense is defined via its relationships with
other senses.
• The second computational approach to meaning
representation is to create a small finite set of semantic
primitives, atomic units of meaning, and then create each
sense definition out of these primitives.
Synonymy
• Synonymy is a relation between word senses rather than
between words.
• E.g. big and large.
• These may seem to be synonyms in the following sentences:
• How big is that plane?
• Would I be flying on a large or small plane?
• But note the following sentence where we cannot substitute
large for big:
• Miss Nelson became a kind of big sister to Benjamin.
• Miss Nelson became a kind of large sister to Benjamin.
Antonym
● Antonyms are senses that are opposite with respect to one
feature of their meaning, e.g. long/short, big/little, rise/fall.
Antonym
● From one perspective, antonyms have very different
meanings, since they are opposite.
● From another perspective, they have very similar
meanings, since they share almost all aspects of their
meaning except their position on a scale, or their
direction. Thus automatically distinguishing
synonyms from antonyms can be difficult.
Hyponymy
• One sense is a hyponym of another sense if the first
sense is more specific, denoting a subclass of the
other.
• car is a hyponym of vehicle
• dog is a hyponym of animal
• mango is a hyponym of fruit.
• We can define hypernymy more formally by saying
that the class denoted by the superordinate
extensionally includes the class denoted by the
hyponym.
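● Since the superordinate's class extensionally includes the hyponym's class, the relation can be checked by walking up the hypernym hierarchy; a minimal sketch assuming NLTK's WordNet interface:
```python
# Check 'car is a hyponym of vehicle' by collecting all hypernyms of car.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

car = wn.synset("car.n.01")
vehicle = wn.synset("vehicle.n.01")

# closure() walks the hypernym relation transitively up the hierarchy.
ancestors = set(car.closure(lambda s: s.hypernyms()))
print(vehicle in ancestors)  # True: car denotes a subclass of vehicle
```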
Meronymy, Holonymy
● Meronymy is the part-whole relation: wheel is a meronym of car.
● Holonymy is the reverse relation: car is a holonym of wheel.
Module 4
Lecture 2
▪ WordNet
▪ Robust Word Sense Disambiguation (WSD)
WordNet
• It is a database of Lexical Relations.
• The most commonly used resource for English sense
relations is the WordNet lexical database.
• WordNet consists of three separate databases:
• For nouns
• For verbs
• For adjectives and adverbs;
• NB: closed class words are not included.
WordNet
● Each database consists of a set of lemmas, each one
annotated with a set of senses.
● The WordNet 3.0 release has
○ 117,097 nouns
○ 11,488 verbs
○ 22,141 adjectives
○ 4,601 adverbs.
● WordNet can be accessed via the web or downloaded
and accessed locally.
Synset
• The set of near-synonyms for a WordNet sense is
called a synset (for synonym set)
• Synsets are an important primitive in WordNet.
• Synsets actually constitute the senses associated with
WordNet entries
• It is synsets, not wordforms, lemmas or individual
senses, that participate in most of the lexical sense
relations in WordNet.
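● A short sketch of looking up synsets, again assuming NLTK's WordNet interface; each synset groups the near-synonymous lemmas that share one sense:
```python
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

# The homonymous lemma 'bank' participates in several noun synsets.
for syn in wn.synsets("bank", pos=wn.NOUN)[:2]:
    print(syn.name(), syn.lemma_names())  # a synset = set of near-synonyms
    print("  gloss:", syn.definition())
```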
BabelNet
• BabelNet groups words in different languages into
collections of synonyms known as Babel synsets. For each
Babel synset, BabelNet provides short definitions in many
languages obtained from WordNet and Wikipedia.
• BabelNet is automatically created by linking the largest
multilingual Web encyclopedia, Wikipedia, with the most
popular computational lexicon of English, WordNet.
• The integration is carried out by means of an automatic
mapping in which lexical gaps in resource-poor
languages are filled with the help of statistical machine
translation.
Module 4
Lecture 3
▪ Robust Word Sense Disambiguation (WSD)
▪ Dictionary-based approach
Approaches to WSD
1. Knowledge-Based Methods
● These rely on lexical knowledge resources such as
machine-readable dictionaries, thesauri and WordNet
(e.g. the Lesk algorithm) rather than on annotated corpora.
Approaches to WSD
2. Supervised Methods
● Machine learning methods are trained on sense-annotated
corpora.
● These methods assume that context can provide
enough evidence to disambiguate the sense.
● Context is represented as a set of “features” of
words.
● These methods rely on a substantial amount of
manually sense-tagged corpora.
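● As a concrete illustration of the supervised setting (a sketch only; the tiny sense-tagged corpus below is invented), a Naïve Bayes classifier can be trained on bag-of-words context features:
```python
# Supervised WSD sketch: Naive Bayes over bag-of-words context features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

contexts = [
    "the boat drifted toward the river bank",
    "he deposited the check at the bank downtown",
    "fish swam near the muddy bank of the stream",
    "the bank approved the loan application",
]
senses = ["bank1 (river)", "bank2 (finance)",
          "bank1 (river)", "bank2 (finance)"]

vec = CountVectorizer()
X = vec.fit_transform(contexts)       # context -> feature vector
clf = MultinomialNB().fit(X, senses)  # train on sense-tagged examples

test = vec.transform(["she waited at the bank to open an account"])
print(clf.predict(test))              # ['bank2 (finance)']
```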
Approaches to WSD
3. Semi-Supervised Methods
● These start from a small seed set of labeled examples and a
large untagged corpus, and iteratively bootstrap a larger
training set (e.g. Yarowsky's algorithm).
Approaches to WSD
4. Unsupervised Methods
● The senses can be induced from text by clustering
word occurrences by using some measure of
similarity of the context.
● This task is called word sense induction or
discrimination.
● They have the potential to overcome the knowledge-
acquisition bottleneck, since they do not depend on
manual annotation effort.
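● A minimal illustration of sense induction (our sketch, not the Hyperlex algorithm itself): represent each occurrence by a bag-of-words context vector and cluster the occurrences, treating each cluster as one induced sense:
```python
# Word sense induction sketch: cluster occurrences of 'bass' by context.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import KMeans

occurrences = [
    "caught a huge bass while fishing with a new rod",
    "played a walking bass line on the guitar",
    "the bass weighed over four pounds on the fishing scale",
    "turned up the bass on the guitar amplifier",
]

X = CountVectorizer().fit_transform(occurrences)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # occurrences in the same cluster share one induced sense
```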
Module 4
Lecture 4
▪ Knowledge-based approach (Lesk's Algorithm)
Lesk Algorithm
● The Lesk algorithm chooses the sense whose dictionary
definition (gloss) shares the most words with the glosses of
the neighboring words in the context.
Lesk Algorithm
● E.g. Finding the appropriate sense of cone in the
phrase pine cone given the following definitions:
● Pine:
a. Kind of evergreen tree
b. Waste away through sorrow or illness
● Cone:
a. Solid body which narrows to a point
b. Something of this shape whether solid or hollow
c. Fruit of certain evergreen trees
Lesk Algorithm
● The Lesk algorithm will select cone (c) as the
correct sense, since two of the words in its entry,
evergreen and tree, overlap with the words in the
entry for pine.
● Neither of the other entries has any overlap with
the words in the definition of pine.
● Disadvantage of Lesk Algorithm : The dictionary
entries for the target words are relatively short and
may not provide sufficient material to create
adequate classifiers.
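● A minimal sketch of the gloss-overlap computation on exactly the dictionary entries above (the normalization details are our own simplifications):
```python
# Simplified Lesk on the slide's glosses: pick the sense of 'cone' whose
# definition overlaps most with the definitions of the context word 'pine'.
pine = {
    "a": "kind of evergreen tree",
    "b": "waste away through sorrow or illness",
}
cone = {
    "a": "solid body which narrows to a point",
    "b": "something of this shape whether solid or hollow",
    "c": "fruit of certain evergreen trees",
}
STOP = {"a", "of", "or", "to", "the", "this", "which", "whether", "through"}

def tokens(gloss):
    # crude normalization: lowercase, drop stopwords, strip plural -s
    words = [w for w in gloss.lower().split() if w not in STOP]
    return {w[:-1] if w.endswith("s") and not w.endswith("ss") else w
            for w in words}

context = set().union(*(tokens(g) for g in pine.values()))
best = max(cone, key=lambda s: len(tokens(cone[s]) & context))
print(best, tokens(cone[best]) & context)  # c {'evergreen', 'tree'}
```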
Module 4
Lecture 5
▪ Dictionary-based approach
Stand-alone Approach
• In this approach, sense disambiguation is performed
independently of compositional semantic analysis.
Example:
● Which airlines serve New York?
● Which airlines serve breakfast?
● Disambiguation: the selectional restrictions associated with
the senses of serve, together with the semantic type
information of ‘New York’ and ‘breakfast’, pick out the
correct sense in each sentence.
● The predicate selects the correct sense of an ambiguous
argument by eliminating the senses that fail to match one
of its selectional restrictions.
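● A toy sketch of the idea (the sense labels and type inventory below are invented for illustration): each sense of serve restricts the semantic type of its object, so the argument's type eliminates the incompatible senses:
```python
# Toy selectional restrictions: sense names and types are illustrative only.
SERVE_SENSES = {
    "serve/operate-flights-to": "LOCATION",  # 'serve New York'
    "serve/provide-food":       "FOOD",      # 'serve breakfast'
}
SEMANTIC_TYPE = {"new york": "LOCATION", "breakfast": "FOOD"}

def disambiguate_serve(obj):
    t = SEMANTIC_TYPE[obj.lower()]
    # keep only the senses whose restriction matches the argument's type
    return [sense for sense, req in SERVE_SENSES.items() if req == t]

print(disambiguate_serve("New York"))   # ['serve/operate-flights-to']
print(disambiguate_serve("breakfast"))  # ['serve/provide-food']
```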
Module 4
Lecture 6
▪ Dictionary-based approach
Collocational Features
● Collocational features encode information about words at
specific positions to the left and right of the target word,
typically the words themselves and their parts of speech.
Collocational Features
● E.g. An electric guitar and bass player stand off to
one side which is not really part of the scene.
● A feature vector consisting of two words to the right
and left of the target word, along with their
respective POS is:
guitar(NN1) and (CJC) player(NN1) stand(VVB)
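● A short sketch that reproduces the vector above; the POS tags for words other than those shown on the slide are our assumptions (CLAWS-style tags):
```python
# Build a collocational feature vector: words and POS tags at positions
# -2..+2 around the target 'bass'. Tags outside the slide are assumed.
tagged = [("an", "AT0"), ("electric", "AJ0"), ("guitar", "NN1"),
          ("and", "CJC"), ("bass", "NN1"), ("player", "NN1"),
          ("stand", "VVB"), ("off", "AVP")]

def collocational_features(tagged, i, window=2):
    feats = []
    for j in range(i - window, i + window + 1):
        if j != i and 0 <= j < len(tagged):
            feats.extend(tagged[j])  # the word and its POS tag
    return feats

target = [w for w, _ in tagged].index("bass")
print(collocational_features(tagged, target))
# ['guitar', 'NN1', 'and', 'CJC', 'player', 'NN1', 'stand', 'VVB']
```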
Co-occurrence Feature
● A co-occurrence feature records how often a given word
occurs anywhere within a window around the target word,
regardless of its exact position.
Co-occurrence Feature
● The features are thus a small number of frequently used
content words.
● This kind of feature is effective at capturing the general
topic of the discourse in which the target word occurs.
● E.g. co-occurrence vector for the word bass would
have the following words as features:
○ fishing, big, sound, player, fly, rod, pound,
double, playing, guitar
● Using these words as features with a window size of 10,
the example below is represented by the following vector:
Example: An electric guitar and bass player stand off
to one side which is not really part of the scene.
[0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
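● A short sketch that reproduces this vector (the window-extraction details are our assumptions):
```python
# Count each feature word within 10 words of the target 'bass'.
features = ["fishing", "big", "sound", "player", "fly",
            "rod", "pound", "double", "playing", "guitar"]

sentence = ("an electric guitar and bass player stand off to one side "
            "which is not really part of the scene").split()

i = sentence.index("bass")
window = sentence[max(0, i - 10):i] + sentence[i + 1:i + 11]
print([window.count(f) for f in features])
# [0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
```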
Module 4
Lecture 7
▪ Supervised (Naïve Bayes, Decision List)
▪ Introduction to Semi-supervised method
(Yarowsky) and Unsupervised (Hyperlex)
Bootstrapping Approaches
● Yarowsky's semi-supervised algorithm starts from a small
seed set of labeled examples for each sense, trains a classifier,
labels the most confidently classified unlabeled examples, and
repeats, exploiting the heuristics of one sense per collocation
and one sense per discourse.
Unsupervised Methods
● Hyperlex induces senses with no labeled data: it builds a
co-occurrence graph from the target word's contexts and
identifies highly interconnected hubs in the graph, each
corresponding to one induced sense.