02 - Morphological Analysis
Typical Use case ….
Absolutely loving the new update to the app. Great job! → Positive Review
Very disappointed with the customer service, not helpful at all. → Negative Review
I noticed the store has extended its hours. Interesting move. → Neutral Comment
Does anyone know if this product is available in blue? → Enquiry
Just tried the new cafe downtown, and it's amazing! → Praise, Positive Feedback
I'm having trouble logging into my account, can you assist me? → Support Request
My order has been delayed for two weeks now, what's going on? → Complaint
.
.
What are your store hours on weekends?
Can I get more information about the warranty on the laptop models you sell?
Suggestions Service Enquiry Complaint Top Mgmt
0.45 0.72 0.35 0.85 0.15
What is Morphology?
In linguistics, morphology is the study of the internal structure of words: how they
are formed and how they relate to other words in the same language. It analyzes
words and their parts, such as stems, root words, prefixes, and suffixes, and how
these components are arranged or modified to create different meanings.
Morphology also looks at parts of speech, intonation and stress, and the ways
context can change a word's pronunciation and meaning.
Root:
A root is the most basic, irreducible part of a word that carries its core meaning.
Unlike stems, roots cannot be broken down into smaller parts and, in their most
basic form, have no prefixes, suffixes, or infixes attached. Roots form the base upon
which stems and ultimately full words are built. In many cases, the root is the same
as the stem.
Types of Morphemes (contd)
For the word "reaction," the root is "act."
Prefix: "re-" (meaning again or back)
Root: "act" (basic action or doing)
Suffix: "-ion" (denoting the action or condition of)
"Reaction" refers to 'the action of doing something again or in response.'

In "writer," the root is "write."
Root: "write" (basic action: to form letters or words)
Suffix: "-er" (one who does the action)
"Writer" refers to 'one who writes.'
Part of Speech:
A part of speech is a category of words in a language that have similar grammatical properties.
Common parts of speech include nouns, verbs, adjectives, adverbs, pronouns,
prepositions, conjunctions, and interjections. Each part of speech plays a specific role
in a sentence, contributing to the sentence's overall meaning and structure.
Understanding parts of speech is crucial for analyzing and constructing sentences
effectively.
Nouns: Words that name people, places, things, or ideas.
Example: "computer," "Paris," "happiness."
Adjectives: Words that describe or modify nouns.
Example: "red," "quick," "intelligent."
Verbs: Words that express actions, occurrences, or states of being.
Example: "run," "is," "think."
Adverbs: Words that modify verbs, adjectives, or other adverbs, often indicating
manner, place, time, or degree.
Example: "quickly," "there," "very."
Conjunctions: Words that join words, phrases, or clauses.
Example: "and," "but," "because."
Inflectional morphology
Adds information to a word consistent with its context within a sentence.
Examples
• Number (singular versus plural)
automaton → automata
• Walk → walks
• Case (nominative versus accusative versus…)
he, him, his, …
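The number examples above can be sketched in code. This is a minimal illustration, assuming a tiny hand-written table of irregular plurals and a simplified English "-s/-es" rule. The names IRREGULAR_PLURALS, add_s, pluralize and third_person_singular are hypothetical and not part of any library.

```python
# Hypothetical mini-lexicon of irregular plurals (assumption for illustration only).
IRREGULAR_PLURALS = {
    "automaton": "automata",
    "child": "children",
}


def add_s(word: str) -> str:
    """Apply a simplified English rule: '-es' after sibilant endings, otherwise '-s'."""
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    return word + "s"


def pluralize(noun: str) -> str:
    """Inflect a noun for plural number: irregular lookup first, then the regular rule."""
    return IRREGULAR_PLURALS.get(noun, add_s(noun))


def third_person_singular(verb: str) -> str:
    """Inflect a regular verb for third person singular present (walk -> walks)."""
    return add_s(verb)


print(pluralize("automaton"))         # automata
print(pluralize("box"))               # boxes
print(third_person_singular("walk"))  # walks
```

The irregular table is consulted before the rule, which is why "automaton" does not become "automatons" in this sketch.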
Morphology Analysis Approaches
Morphological analysis may be defined as the process of obtaining grammatical
information from tokens, given their suffix information. Morphological analysis can be
performed in three ways:
1. Morpheme-based morphology (or an item and arrangement approach),
2. Word-based morphology (or a word and paradigm approach), and
3. Lexeme-based morphology (or an item and process approach).
1. Morpheme-based morphology
Morpheme-based morphology analyzes and describes the structure of words by
breaking them down into their smallest meaningful units, called morphemes. There
are two main types of morphemes in morpheme-based morphology.
Free Morphemes: These can stand alone as words (e.g., "book", "go").
Bound Morphemes: These cannot stand alone and must be attached to a free
morpheme (e.g., prefixes like "un-", suffixes like "-ing"). Words are formed by
combining these morphemes in a linear arrangement.
Word: "Unhappiness"
Structure: [Prefix "Un-"] + [Root "happy"] + [Suffix "-ness"]
This structure shows that the word "unhappiness" is composed of three morphemes:
"un-" (a prefix), "happy" (a root), and "-ness" (a suffix). Each morpheme contributes to
the overall meaning of the word.
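A rough sketch of this linear, item-and-arrangement view follows. It assumes small hand-written lists of prefixes, suffixes and roots (PREFIXES, SUFFIXES, ROOTS are hypothetical, not taken from any library) and tries each prefix/suffix combination until the remainder matches a known root. Real analyzers rely on much larger lexicons and richer spelling rules.

```python
# Hypothetical affix and root inventories (assumptions for illustration only).
PREFIXES = ("un", "re")
SUFFIXES = ("ness", "ion", "er", "ing")
ROOTS = {"happy", "act", "write", "book", "go"}


def restore_root(stem):
    """Map a stripped stem back to a known root, undoing two common
    spelling changes: final 'i' -> 'y' (happi -> happy) and dropped 'e' (writ -> write)."""
    candidates = [stem]
    if stem.endswith("i"):
        candidates.append(stem[:-1] + "y")
    candidates.append(stem + "e")
    for candidate in candidates:
        if candidate in ROOTS:
            return candidate
    return None


def segment(word):
    """Split a word into [prefix-] + root + [-suffix] if a known root is found."""
    word = word.lower()
    prefix_options = [""] + [p for p in PREFIXES if word.startswith(p)]
    suffix_options = [""] + [s for s in SUFFIXES if word.endswith(s)]
    for prefix in prefix_options:
        for suffix in suffix_options:
            stem = word[len(prefix):len(word) - len(suffix)]
            root = restore_root(stem)
            if root is not None:
                parts = []
                if prefix:
                    parts.append(prefix + "-")
                parts.append(root)
                if suffix:
                    parts.append("-" + suffix)
                return parts
    return [word]  # no analysis found: treat the word as a single unit


print(segment("Unhappiness"))  # ['un-', 'happy', '-ness']
print(segment("reaction"))     # ['re-', 'act', '-ion']
print(segment("writer"))       # ['write', '-er']
```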
2. Word-based morphology
Word-based morphology takes words, rather than morphemes, as the central units of
morphological analysis. It emphasizes the full forms of words instead of segmenting
them into constituent morphemes, in contrast to morpheme-based morphology, which
breaks words down into the smallest units of meaning. Words are treated as
indivisible wholes or as bases to which processes are applied, and the analysis looks
at how words change as whole units through processes like inflection, derivation,
and compounding.
There is less focus on dividing the word into prefixes, stems, and suffixes. Instead,
the processes that affect the word as a whole are examined.
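One way to picture the word and paradigm idea is a table of whole word forms per lexeme, with no segmentation into morphemes. The sketch below assumes a hand-written table. PARADIGMS, inflect and analyze are hypothetical names chosen for illustration.

```python
# Hypothetical paradigm table: each lexeme lists its full word forms by grammatical cell.
PARADIGMS = {
    "take": {
        "base": "take",
        "third_singular": "takes",
        "past": "took",
        "present_participle": "taking",
        "past_participle": "taken",
    },
    "walk": {
        "base": "walk",
        "third_singular": "walks",
        "past": "walked",
        "present_participle": "walking",
        "past_participle": "walked",
    },
}


def inflect(lexeme, cell):
    """Generation: look up the whole word form that fills a paradigm cell."""
    return PARADIGMS[lexeme][cell]


def analyze(form):
    """Analysis: find every (lexeme, cell) pair whose stored form matches the word."""
    return [
        (lexeme, cell)
        for lexeme, cells in PARADIGMS.items()
        for cell, stored in cells.items()
        if stored == form
    ]


print(inflect("take", "past"))  # took
print(analyze("walked"))        # [('walk', 'past'), ('walk', 'past_participle')]
```

Irregular forms such as "took" need no special handling here, because the table stores whole words rather than stems plus affixes.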
Stemming
Stemming algorithms aim to remove the affixes a word carries for, e.g., grammatical
role, tense, or derivational morphology, leaving only the stem of the word. This is a
difficult problem because of irregular words (e.g., common verbs in English),
complicated morphological rules, and part-of-speech and sense ambiguities.
NLTK algorithms
- PorterStemmer
- SnowballStemmer
- LancasterStemmer
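The three stemmers listed above can be compared side by side. This is a minimal sketch, assuming the nltk package is installed. The word list is an arbitrary choice for illustration.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

# One instance of each stemmer; SnowballStemmer needs a language name.
stemmers = {
    "Porter": PorterStemmer(),
    "Snowball": SnowballStemmer("english"),
    "Lancaster": LancasterStemmer(),
}

# Arbitrary sample words, including an irregular verb form ("took").
words = ["running", "flies", "happily", "took"]

for name, stemmer in stemmers.items():
    # stem() returns the stripped form, which is not always a dictionary word.
    print(name, [stemmer.stem(word) for word in words])
```

The outputs differ between algorithms, and a stem need not be a real word. The Porter stemmer, for example, reduces "running" to "run" but leaves the irregular form "took" unchanged.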
Lemmatization
Lemmatization is another technique for reducing inflected words to a common base
form. It describes the algorithmic process of identifying an inflected word’s “lemma”
(dictionary form) based on its intended meaning.
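The role of intended meaning can be illustrated with a toy lookup keyed on both the word and its part of speech. LEMMA_TABLE and lemmatize below are hypothetical and hand-written for illustration. In practice a lexical resource is used instead, for example NLTK's WordNetLemmatizer, which needs the WordNet data to be downloaded first.

```python
# Hypothetical lemma table keyed by (word form, part of speech).
# "v" = verb, "n" = noun, "a" = adjective (labels chosen for this sketch).
LEMMA_TABLE = {
    ("took", "v"): "take",
    ("taken", "v"): "take",
    ("saw", "v"): "see",   # "I saw the film"
    ("saw", "n"): "saw",   # "a wood saw"
    ("mice", "n"): "mouse",
    ("better", "a"): "good",
}


def lemmatize(word, pos):
    """Return the dictionary form for a word given its part of speech.
    Unknown (word, pos) pairs fall back to the lowercased word itself."""
    key = (word.lower(), pos)
    return LEMMA_TABLE.get(key, word.lower())


print(lemmatize("took", "v"))  # take
print(lemmatize("saw", "v"))   # see
print(lemmatize("saw", "n"))   # saw
```

Unlike a stemmer, the lookup returns a real dictionary word and can give different answers for the same spelling depending on the part of speech.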
POS
Part of natural language processing is determining the role of each word or token in
a body of text. In the world of NLP, we call this process part-of-speech (POS)
tagging. The NLTK package comes with a function pos_tag() that makes this job
relatively seamless, and gives us a good starting point.
VB verb, base form – take
VBD verb, past tense – took
VBG verb, gerund/present participle – taking
VBN verb, past participle – taken
VBP verb, non-3rd person sing. present – take
VBZ verb, 3rd person sing. present – takes
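The verb tags above can be turned into a small lookup for reading tagger output. The sketch assumes input in the shape pos_tag() returns, a list of (token, tag) pairs. The tagged sentence here is written by hand so that the example runs without the NLTK tagger model, and VERB_TAGS and describe_verbs are hypothetical names.

```python
# Descriptions of the verb tags listed above.
VERB_TAGS = {
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBG": "verb, gerund/present participle",
    "VBN": "verb, past participle",
    "VBP": "verb, non-3rd person sing. present",
    "VBZ": "verb, 3rd person sing. present",
}

# Hand-written (token, tag) pairs, standing in for tagger output.
tagged = [
    ("She", "PRP"),
    ("takes", "VBZ"),
    ("notes", "NNS"),
    ("and", "CC"),
    ("took", "VBD"),
    ("photos", "NNS"),
]


def describe_verbs(tagged_tokens):
    """Keep only the verb tokens and attach a readable description of each tag."""
    return [
        (token, tag, VERB_TAGS[tag])
        for token, tag in tagged_tokens
        if tag in VERB_TAGS
    ]


for token, tag, description in describe_verbs(tagged):
    print(f"{token}\t{tag}\t{description}")
```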
Syntax Analysis
Sentiment Analysis
Entity Analysis
Entity Sentiment Analysis
Text Classification
Popular NLP Tools (contd)
The analyzeSyntax method returns details about the linguistic structure of the given
text. For each token in the text, the Natural Language API provides information about
its internal structure (morphology) and its role in the sentence (syntax).