NLP Unit-5
Applications of NLP
• What uses of the computer involve
language?
• What language use is involved?
• What are the main problems?
• How successful are they?
Speech applications
• Speech recognition (Speech-to-text)
– Uses
• As a general interface to any text-based application
• Text dictation
• Speech understanding
– Not the same: computer must understand intention, not necessarily exact
words
– Uses
• As a general interface to any application where meaning is important rather than
text
• As part of speech translation
• Difficulties
– Separating speech from background noise
– Filtering of performance errors (disfluencies)
– Recognizing individual sound distinctions (similar phonemes)
– Variability in human speech
– Ambiguity in language (homophones)
Speech applications
• Voice recognition
– Not really a linguistic issue
– But shares some of the techniques and problems
Word processing
• Check and correct spelling, grammar and style
• Types of spelling errors
– Non-existent words
• Easy to identify
• But suggested correction not always appropriate
– Accidental homographs
• Deliberate ‘errors’
– Foreign words
– Proper names, neologisms
– Illustrations of spelling errors!
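The non-existent-word case above (easy to identify, harder to correct well) can be sketched as a lexicon lookup plus a Levenshtein edit-distance search for the closest suggestion. The tiny lexicon is purely illustrative, not a real dictionary:

```python
# Toy spell checker: flag non-words (not in the lexicon) and suggest the
# nearest lexicon entry by Levenshtein (edit) distance.

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[m][n]

LEXICON = {"the", "cat", "sat", "on", "mat", "hat"}

def check(word):
    """Return the word if known, else the closest lexicon entry."""
    if word in LEXICON:
        return word
    return min(LEXICON, key=lambda w: edit_distance(word, w))
```

Note this is exactly why "suggested correction not always appropriate": the nearest word by edit distance need not be the intended one.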
Better word processing
• Spell checking for homonyms
• Grammar checking
• Tuned to the user
– You can (already) add your own auto-corrections
– Non-native users (‘Interference checking’)
– Dyslexics and other special needs users
• Intelligent word processing
– Find/replace that knows about morphology, syntax
Text prediction
• Speed up word processing
• Facilitate text dictation
• At lexical level, already seen in SMS
• More sophisticated prediction might be based
on a corpus of previously seen texts
• Especially useful in repeated tasks
– Translation memory
– Authoring memory
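Lexical-level prediction of the kind seen in SMS can be sketched with a bigram model learned from previously seen texts; the mini corpus below is an illustrative assumption:

```python
from collections import Counter, defaultdict

# Bigram next-word prediction: count which word follows which in a
# corpus of previous texts, then suggest the most frequent continuations.

def train_bigrams(corpus):
    """Count word -> next-word frequencies over the corpus."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for w1, w2 in zip(words, words[1:]):
            model[w1][w2] += 1
    return model

def predict(model, word, k=3):
    """Return the k most likely continuations of `word`."""
    return [w for w, _ in model[word.lower()].most_common(k)]

# Illustrative "previously seen texts" (e.g. a translation/authoring memory)
corpus = [
    "please send the report today",
    "please send the invoice",
    "send the report by email",
]
model = train_bigrams(corpus)
```

This is why prediction pays off in repeated tasks: the more repetitive the previous texts, the sharper the bigram counts.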
Dialogue systems
• Computer enters a dialogue with user
– Usually specific cooperative task-oriented dialogue
– Often over the phone
– Examples?
• Usually speech-driven, but text also appropriate
• Modern application is automatic transaction processing
• Limited domain may simplify language aspect
• Domain ‘model’ will play a big part
• Simplest case: choose closest match from (hidden) menu
of expected answers
• More realistic versions involve significant problems
Dialogue systems
• Apart from speech recognition and
synthesis issues, NL components include …
• Topic tracking
• Anaphora resolution
– Use of pronouns, ellipsis
• Reply generation
– Cooperative responses
– Appropriate use of anaphora
(also known as)
Conversation machines
• Another old AI goal (cf. Turing test)
• Also (amazingly) for amusement
• Mainly speech, but also text based
• Early famous approaches include ELIZA, which
showed what you could do by cheating
• Modern versions have a lot of NLP, especially
discourse modelling, and focus on the language
generation component
QA systems
• NL interface to knowledge database
• Handling queries in a natural way
• Must understand the domain
• Even if typed, dialogue must be natural
• Handling of anaphora
e.g. When is the next flight to Sydney? 6.50
And the one after? 7.50
What about Melbourne then? 7.20
OK I’ll take the last one.
IR systems
• Like QA systems, but the aim is to retrieve
information from textual sources that contain the
info, rather than from a structured database
• Two aspects
– Understanding the query
– Processing text to find the answer
• Named Entity Recognition
Named entity recognition
• Typical textual sources involve names
(people, places, corporations), dates,
amounts, etc.
• NER seeks to identify these strings and
label them
• Clues are often linguistic
• Also involves recognizing synonyms, and
processing anaphora
Automatic summarization
• Renewed interest since mid 1990s, probably
due to growth of WWW
• Different types of summary
– indicative vs. informative
– abstract vs. extract
– generic vs. query-oriented
– background vs. just-the-news
– single-document vs. multi-document
Automatic summarization
• topic identification
• stereotypical text structure
• cue words
• high-frequency indicator phrases
• intratext connectivity
• discourse structure centrality
• topic fusion
• concept generalization
• semantic association
• summary generation
• sentence planning to achieve information compaction
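Several of the ingredients above (topic identification via high-frequency indicator terms, extract-style summary generation) can be combined into a minimal extractive summarizer. The stopword list and scoring scheme are simplified assumptions:

```python
from collections import Counter
import re

# Extractive summarization sketch: score each sentence by the average
# document frequency of its non-stopword terms, keep the top-n sentences.

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it", "my"}

def summarize(text, n=1):
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)  # crude "topic identification": frequent terms

    def score(sentence):
        terms = [w for w in re.findall(r"\w+", sentence.lower())
                 if w not in STOPWORDS]
        return sum(freq[t] for t in terms) / max(len(terms), 1)

    return sorted(sentences, key=score, reverse=True)[:n]

# Illustrative three-sentence document
doc = ("Solar power is growing. "
       "Solar panels convert sunlight into solar power. "
       "My cat sleeps.")
```

This gives a generic, single-document, indicative extract; the other summary types listed above (informative, query-oriented, multi-document) need richer features such as discourse structure.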
Text mining
• Discovery by computer of new, previously
unknown information, by automatically extracting
information from different written resources
(typically Internet).
• Similar to data mining (e.g. using consumer
purchasing patterns to predict which products to
place close together on shelves), but based on
textual information.
• Big application area is biosciences.
Text mining
• preprocessing of document collections (text
categorization, term extraction)
• storage of the intermediate representations
• techniques to analyze these intermediate
representations (distribution analysis,
clustering, trend analysis, association rules,
etc.)
• visualization of the results.
Story understanding
• An old AI application
• Involves …
– Inference
– Ability to paraphrase (to demonstrate
understanding)
• Requires access to real-world knowledge
• Often coded in “scripts” and “frames”
Machine Translation
• Oldest non-numerical application of computers
• Involves processing of source-language as in other
applications, plus …
– Choice of target-language words and structures
– Generation of appropriate target-language strings
• Main difficulty: source-language analysis and/or
cross-lingual transfer implies varying levels of
“understanding”, depending on similarities
between the two languages
• MT ≠ tools for translators, but some overlap
Machine Translation
• First approaches perhaps most intuitive: look up
words and then do local rearrangement
• “Second generation” took linguistic approach:
grammars, rule systems, elements of AI
• Recent (since 1990) trend to use empirical
(statistical) approach based on large corpora of
parallel text
– Use existing translations to “learn” translation models,
either a priori (Statistical MT ≈ machine learning) or on
the fly (Example-based MT ≈ case-based reasoning)
– Convergence of empirical and rationalist (rule-based)
approaches: learn models based on treebanks or similar.
Language teaching
• CALL
• Grammar checking but linked to models of
– The topic
– The learner
– The teaching strategy
• Grammars (etc) can be used to create
language-learning exercises and drills
Assistive computing
• Interfaces for disabled
• Many devices involve language issues, e.g.
– Text simplification or summarization for users
with low literacy (partially sighted, dyslexic,
non-native speaker, illiterate, etc.)
– Text completion (predictive or retrospective)
• Works on basis of probabilities or previous
examples
Conclusion
• Many different applications
• But also many common elements
– Basic tools (lexicons, grammars)
– Ambiguity resolution
– Need for (but impossibility of having) real-world
knowledge
• Humans are really very good at language
– Can understand noisy or incomplete messages
– Good at guessing and inferring
What is SA & OM? (Sentiment Analysis & Opinion Mining)
• Identify the orientation of opinion in a piece of text
[Figure: Sentiment Analysis shown at the intersection of Machine Learning and Natural Language Processing]
Why sentiment analysis?
• Movie: is this review positive or negative?
• Products: what do people think about the new iPhone?
• Public sentiment: how is consumer confidence? Is despair
increasing?
• Politics: what do people think about this candidate or issue?
• Prediction: predict election outcomes or market trends from
sentiment
• Emotion: brief organically synchronized … evaluation of a major event
– angry, sad, joyful, fearful, ashamed, proud, elated
• Mood: diffuse non-caused low-intensity long-duration change in subjective feeling
– cheerful, gloomy, irritable, listless, depressed, buoyant
• Interpersonal stances: affective stance toward another person in a specific interaction
– friendly, flirtatious, distant, cold, warm, supportive, contemptuous
• Attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons
– liking, loving, hating, valuing, desiring
• Personality traits: stable personality dispositions and typical behavior tendencies
– nervous, anxious, reckless, morose, hostile, jealous
Sentiment Analysis
• Sentiment analysis is the detection of attitudes
“enduring, affectively colored beliefs, dispositions
towards objects or persons”
1. Holder (source) of attitude
2. Target (aspect) of attitude
3. Type of attitude
• From a set of types
– Like, love, hate, value, desire, etc.
• Or (more commonly) simple weighted polarity:
– positive, negative, neutral, together with strength
4. Text containing the attitude
Sentiment Analysis
• Simplest task:
– Is the attitude of this text positive or negative?
• More complex:
– Rank the attitude of this text from 1 to 5
• Advanced:
– Detect the target, source, or complex attitude
types
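The simplest task above, positive vs. negative, can be sketched with a lexicon-based polarity count. The word lists and the one-word negation window are illustrative assumptions, not a real sentiment lexicon:

```python
# Lexicon-based polarity: classify a text as positive/negative/neutral by
# counting opinion words, with crude handling of negation.

POSITIVE = {"good", "great", "excellent", "love", "like", "amazing"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "boring", "poor"}

def polarity(text):
    words = text.lower().split()
    score = 0
    for i, w in enumerate(words):
        w = w.strip(".,!?")
        sign = 1 if w in POSITIVE else -1 if w in NEGATIVE else 0
        # crude negation handling: flip if the previous word is a negator
        if sign and i > 0 and words[i - 1] in {"not", "never", "no"}:
            sign = -sign
        score += sign
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

Extending the return value from a label to the raw score gives the "rank from 1 to 5" variant; detecting holder and target requires the parsing-based techniques discussed later.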
Finding sentiment of a sentence
Finding aspect/attribute/target of sentiment
Summary on Sentiment
• Emotion:
– Detecting annoyed callers to dialogue system
– Detecting confused/frustrated versus confident students
• Mood:
– Finding traumatized or depressed writers
• Interpersonal stances:
– Detection of flirtation or friendliness in conversations
• Personality traits:
– Detection of extroverts
Detection of Friendliness
• Friendly speakers use collaborative
conversational style
– Laughter
– Less use of negative emotional words
– More sympathy
• “That’s too bad”, “I’m sorry to hear that”
– More agreement
• “I think so too”
Named Entity (NE) Recognition
Why do NER?
• Key part of Information Extraction system
• Robust handling of proper names essential
for many applications such as
Summarization, IR, anaphora resolution, etc.
• Pre-processing for different classification
levels
• Information filtering
• Information linking
What is NER ?
• NER involves identification of proper names
in texts, and classification into a set of
predefined categories of interest.
• Three universally accepted categories:
• Person, location and organisation
• Other common tasks: recognition of date/time
expressions, measures (percent, money, weight
etc), email addresses etc.
• Other domain-specific entities: names of
Drugs, Genes, medical conditions, names of
ships, bibliographic references etc.
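A minimal rule-based sketch of this identify-and-classify step, using a toy gazetteer for the three universal categories plus a regex for year expressions. Real systems add contextual clues and statistical models; the gazetteer entries here are illustrative:

```python
import re

# Rule-based NER sketch: find capitalized tokens and 4-digit numbers,
# classify via a (toy) gazetteer or as DATE; everything else is skipped.

GAZETTEER = {
    "john": "PERSON",
    "sydney": "LOCATION",
    "ibm": "ORGANIZATION",
}

def tag_entities(text):
    entities = []
    for m in re.finditer(r"\b[A-Z][a-z]*\b|\b\d{4}\b", text):
        token = m.group()
        if token.lower() in GAZETTEER:
            entities.append((token, GAZETTEER[token.lower()]))
        elif re.fullmatch(r"\d{4}", token):
            entities.append((token, "DATE"))
    return entities
```

On the slide's own example, "John sold 5 companies in 2002.", this yields a PERSON and a DATE. Its weaknesses illustrate why NER is not just list matching: unknown names are missed, and context (is "Washington" a person or a place?) is ignored.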
9/28/2024
NER Definition
• Named entity recognition (NER) (also known as entity
identification (EI) and entity extraction) is the task that locates
and classifies atomic elements in text into predefined categories such
as the names of persons, organizations, locations, expressions of
times, quantities, monetary values, percentages, etc.
John sold 5 companies in 2002.
What is not NER?
• NER is not event recognition.
• NER does not create templates.
• NER does not perform co-reference or entity linking,
– though these processes are often implemented alongside
NER as part of a larger IE system.
• NER is not just matching text strings with pre-defined
lists of names.
– It recognises entities which are being used as entities in
a given context.
• NER is not an easy task!
What is a Named Entity?
• Named Entities are
– Noun phrases
– Rigid designators: a rigid designator denotes the same
thing in all possible worlds in which that thing exists,
and does not denote anything else in those possible
worlds in which that thing does not exist
Example feature vectors (for WSD of “bass”)
• Collocational features:
[guitar, NN, and, CC, player, NN, stand, VB, and guitar, player stand]
• Bag-of-words features:
[fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band]
Applying Naive Bayes to WSD
• P(c) is the prior probability of that sense
– Counting in a labeled training set
• P(w|c) is the conditional probability of a word given a particular sense
– P(w|c) = count(w,c) / count(c)
• We get both of these from a tagged corpus like SemCor
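These counts can be turned into a working classifier in a few lines. The sense-labeled examples for "bass" below stand in for a corpus like SemCor, and add-one smoothing is used to avoid zero probabilities (a standard refinement, not on the slide):

```python
from collections import Counter, defaultdict
import math

# Naive Bayes WSD sketch: estimate P(c) and P(w|c) by counting in a
# sense-labeled corpus, then choose the sense maximizing
# log P(c) + sum over context words of log P(w|c).

def train(labeled):
    """labeled: list of (sense, context-sentence) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, context in labeled:
        sense_counts[sense] += 1
        for w in context.lower().split():
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab

def disambiguate(context, sense_counts, word_counts, vocab):
    total = sum(sense_counts.values())
    best, best_lp = None, float("-inf")
    for sense in sense_counts:
        lp = math.log(sense_counts[sense] / total)            # log P(c)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context.lower().split():
            # add-one smoothed log P(w|c)
            lp += math.log((word_counts[sense][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

# Illustrative sense-labeled training data for "bass"
data = [
    ("fish", "caught a huge bass while fishing the river"),
    ("fish", "the bass took the fly rod line"),
    ("music", "the bass player joined the guitar band"),
    ("music", "playing bass guitar in the band"),
]
```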
Graph-based methods
• First, WordNet can be viewed as a graph
• senses are nodes
• relations (hypernymy, meronymy) are edges
• Also add edge between word and unambiguous gloss words
[Figure: fragment of the WordNet graph around drink_v^1, with sense nodes such as food_n^1, liquid_n^1, beverage_n^1, milk_n^1, toast_n^4, drink_n^1, sip_v^1, sup_v^1, drinking_n^1, consumer_n^1, drinker_n^1, potation_n^1, consumption_n^1 connected by hypernymy/meronymy edges]
How to use the graph for WSD
• Insert target word and words in its sentential context into the
graph, with directed edges to their senses
– e.g. “She drank some milk”
• Choose the sense with highest “PageRank”
[Figure: context words “drink” and “milk” linked to their candidate senses, e.g. drink_v^5 and milk_n^4, in the graph]
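The procedure can be sketched with a hand-rolled personalized PageRank over a toy sense graph standing in for WordNet. The graph, edge set, and node names are illustrative, not real WordNet sense IDs:

```python
# Graph-based WSD sketch: personalized PageRank over a toy sense graph.
# Context-word senses receive the teleport mass; the candidate sense of
# the target word with the highest resulting rank wins.

def pagerank(graph, teleport, damping=0.85, iters=50):
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {}
        for n in nodes:
            incoming = sum(rank[m] / len(graph[m]) for m in nodes if n in graph[m])
            new[n] = (1 - damping) * teleport.get(n, 0.0) + damping * incoming
        rank = new
    return rank

# Undirected toy graph around senses of "drink" and "milk"
edges = [
    ("drink_v1", "beverage_n1"), ("beverage_n1", "milk_n1"),
    ("drink_v1", "drinker_n1"), ("drink_v5", "booze_n1"),
    ("milk_n1", "food_n1"),
]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

# "She drank some milk": put all teleport mass on the context word's sense
teleport = {"milk_n1": 1.0}
ranks = pagerank(graph, teleport)
best = max(["drink_v1", "drink_v5"], key=ranks.get)  # -> "drink_v1"
```

Because drink_v1 is reachable from milk_n1 (via beverage_n1) while drink_v5 is not, the rank mass flowing from the context singles out the "consume a beverage" sense.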
Semi-Supervised Learning
• Problem: supervised and dictionary-based approaches
require large hand-built resources
• What if you don’t have so much training data?
• Solution: Bootstrapping
– Generalize from a very small hand-labeled seed set
Bootstrapping
• For bass
• Rely on the “One sense per collocation” rule
– A word recurring in collocation with the same word will almost
surely have the same sense
[Figure: seed-labeled instances of “bass” growing into larger labeled regions over successive bootstrapping iterations]
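A Yarowsky-style sketch of this bootstrapping loop for "bass": seed collocations label the examples they match, and each newly labeled example contributes its collocates as new rules for the next pass. The mini corpus, seeds, and stopword filter are illustrative:

```python
# Bootstrapping WSD sketch ("one sense per collocation"): grow a rule set
# (collocate word -> sense) from a tiny hand-labeled seed set.

def bootstrap(corpus, seeds, rounds=3):
    rules = dict(seeds)   # collocate word -> sense
    labels = {}           # sentence index -> assigned sense
    for _ in range(rounds):
        for i, sent in enumerate(corpus):
            if i in labels:
                continue
            words = set(sent.lower().split())
            hits = {rules[w] for w in words if w in rules}
            if len(hits) == 1:                      # unambiguous match
                sense = hits.pop()
                labels[i] = sense
                # grow the rule set (crude stopword/target filter)
                for w in words - {"bass", "a", "the", "in"}:
                    rules.setdefault(w, sense)
    return labels

# Illustrative mini corpus and hand-labeled seed collocations
corpus = [
    "caught a bass while fishing",      # fish, via seed "fishing"
    "bass player tuned guitar amp",     # music, via seed "player"
    "caught another bass in the lake",  # fish, via learned rule "caught"
]
seeds = {"fishing": "fish", "player": "music"}
```

The third sentence matches no seed, yet gets labeled because "caught" was learned from the first: exactly the generalization from a small seed set that the slide describes.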
Summary
• Word Sense Disambiguation: choosing correct
sense in context
• Applications: MT, QA, etc.
• Three classes of Methods
• Supervised Machine Learning: Naive Bayes classifier
• Thesaurus/Dictionary Methods
• Semi-Supervised Learning
• Main intuition
Classification Methods: Supervised Machine Learning (Dan Jurafsky)
• Any kind of classifier
• Naïve Bayes
• Logistic regression
• Support-vector machines
• k-Nearest Neighbors
• …