Lecture 1

The document discusses the requirements and grading for a Natural Language Processing course. It covers background knowledge needed in machine learning and Python. The grading will be based on a mid-term exam, quizzes and assignments, final exam, and a final project done in teams of 4-5 students.

Uploaded by

HARSHITA RATHORE
Copyright
© All Rights Reserved

Natural Language Processing

Course Requirements
Background
• Machine Learning
• Proficiency in Python: Programming assignments and projects will
require use of Python, NumPy and Pandas.
Books:
1. Practical Natural Language Processing with Python by Mathangi Sri.
2. Daniel Jurafsky and James H. Martin. 2009. Speech and Language
Processing: An Introduction to Natural Language Processing, Computational
Linguistics, and Speech Recognition. 2nd edition. Prentice-Hall.
Grading
• Mid-Term (30%)
• Quiz & Assignments (20%)
• End term (30%)
• Final project (20%): Teams of 4-5 students
Why Study NLP?
• Large volumes of textual data
news articles, web pages, scientific articles, patents, emails, government documents, tweets, Facebook posts, comments, Quora ...
• Natural language processing helps computers communicate with humans in their own language and scales other language-related tasks.
• NLP makes it possible for computers to read text, hear speech, interpret it, measure sentiment, and determine which parts are important.
• Given the volume of unstructured data generated every day, from medical records to social media, automation will be critical to analyzing text and speech data fully and efficiently.

• Structuring a highly unstructured data source
• Human language is astoundingly complex and diverse. We express ourselves in infinite ways.
• NLP is important because it helps resolve ambiguity in language and adds useful numeric structure to the data for many downstream applications, such as speech recognition or text analytics.
What is NLP?
• Making machines understand human language.
• Design, implement, and test systems that process human languages
for practical applications.
Goal of NLP
• Automatic query correction
• Search engines and query completion
• Sentiment analysis
Other Goals of NLP
• Who will be the winner of the 2024 Lok Sabha election?
• Spam detection
• Machine translation: Google Translate can load a page in English or in any other language for you
• Text summarization: given a long news article or scientific article, can it be summarized in brief?
Alpha-creation
• Earnings Uncertainty and Attention
• Abstract
• I document a positive relationship between earnings uncertainty and attention to subsequent earnings
releases. I use the percentage of uncertain words in 10-K or 10-Q filings as the measure of ex ante earnings
uncertainty. Earnings releases of high uncertainty firms are accompanied by higher Google search volume,
higher Bloomberg readership, higher abnormal trading volume, and faster analyst response. Low uncertainty
stocks underreact more to earnings surprises suggesting that the lower attention received by them is
insufficient to fully incorporate new information immediately following earnings. The findings are consistent
with attention constrained investors allocating less attention to low uncertainty firms.

Kottimukkalur, B. (2018). Earnings uncertainty and attention. Available at SSRN 3287470.
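The measure described in the abstract above, the percentage of uncertain words in a filing, can be sketched in a few lines of Python. This is only an illustration: the word list `UNCERTAIN_WORDS` and the function name `uncertainty_pct` are hypothetical stand-ins, and the dictionary actually used in the paper is not given in these slides.

```python
import re

# Hypothetical stand-in word list (an assumption); the paper's actual
# dictionary of uncertain words is not reproduced in these slides.
UNCERTAIN_WORDS = {"may", "might", "could", "uncertain", "approximately", "possibly", "risk"}

def uncertainty_pct(text):
    """Return the percentage of tokens in `text` that are in UNCERTAIN_WORDS."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in UNCERTAIN_WORDS)
    return 100.0 * hits / len(tokens)

filing = "Revenue may decline and results could vary"
# 2 of the 7 tokens ("may", "could") are in the list.
print(round(uncertainty_pct(filing), 2))  # 28.57
```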
Why is NLP hard?
Lexical Ambiguity:
• It is word-level ambiguity: the ambiguity associated with the meaning of a single word.
• The task is to determine the meaning of each word.
• The same word can be used in several different senses, and a single word can be
a noun, adjective, or verb.
• Ex 1: Will Will will Will’s will?
• Ex 2: Rose rose to put rose roes on her rows of roses.
• Ex 3: Buffalo buffalo Buffalo buffalo buffalo Buffalo buffalo.
Why is NLP hard?
• What are the various interpretations of “will” in:
Will Will will Will’s will?
• 1. Will → Modal verb
• 2. Will → Name of a person
• 3. will → Verb (to bequeath)
• 4. Will’s → Name of a person (the same Will or a different Will?)
• 5. will → Noun (a testament)
Why is NLP hard?
• Rose rose to put rose roes on her rows of roses.
• Rose - name of a person (noun)
• rose - past tense of rise, meaning to get up (verb)
• rose roes - a roe is a fish egg, and rose here is an adjective
referring to a color
• So... Rose got up to put rose-colored fish eggs on her rows of rose flowers
• rose has been used as a noun, verb, and adjective
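To get a feel for how quickly word-level ambiguity multiplies, here is a minimal sketch that counts the possible category assignments for a sentence before any disambiguation. The lexicon `LEXICON` and the function `count_readings` are assumptions made for illustration; the categories follow the "will" and "rose" examples above, and every other word is treated as unambiguous, which is a simplification.

```python
import re

# Toy lexicon (an assumption for illustration): the categories each word can
# take, following the examples above. Unlisted words count as unambiguous.
LEXICON = {
    "will": ["modal verb", "proper noun", "verb", "noun"],
    "rose": ["proper noun", "verb", "adjective", "noun"],
}

def count_readings(sentence):
    """Count word-category assignments before any disambiguation."""
    total = 1
    # Lowercase and keep alphabetic runs, so "Will's" yields "will" and "s".
    for token in re.findall(r"[a-z]+", sentence.lower()):
        total *= len(LEXICON.get(token, ["unambiguous"]))
    return total

print(count_readings("Will Will will Will's will?"))                       # 4**5 = 1024
print(count_readings("Rose rose to put rose roes on her rows of roses."))  # 4**3 = 64
```

Most of these assignments are ungrammatical, which is why part-of-speech taggers use context to rule them out.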
Why is NLP hard?
Structural Ambiguity/Syntactic Ambiguity:
• It concerns how words are related to each other in a sentence.
• A sentence may have multiple syntactic structures.
• The same sentence can be interpreted in multiple ways. This kind of ambiguity
occurs when a sentence can be parsed in different ways.
Ex: I saw a man on a hill with a telescope
Ambiguity:
• Who has the telescope?
Ex: Visiting relatives can be boring
Ambiguity:
• Visiting a relative’s house can be boring?
• Relatives visiting your place can be boring?
Structural Ambiguity/Syntactic Ambiguity
• I saw a man on a hill with a telescope
These are some interpretations of the sentence shown above.
There is a man on the hill, and I watched him with my telescope.
There is a man on the hill, and he has a telescope.
I’m on a hill, and I saw a man using my telescope.
I’m on a hill, and I saw a man who has a telescope.
There is a man on a hill, and I saw him something with my telescope.
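Structural ambiguity can be made concrete by counting parse trees. The sketch below uses the CKY algorithm with a small hand-written grammar in Chomsky normal form; the grammar, the lexicon, and the name `count_parses` are assumptions for illustration, not part of the course material. Under this toy grammar the telescope sentence receives 5 distinct parses, which arise purely from where the two prepositional phrases attach and need not correspond one-to-one with the readings listed above.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (an assumption for illustration).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "i": "NP", "saw": "V", "a": "Det",
    "man": "N", "hill": "N", "telescope": "N",
    "on": "P", "with": "P",
}

def count_parses(words):
    """Count the parse trees rooted in S (raises KeyError on unknown words)."""
    n = len(words)
    # chart[i][j][A] = number of ways category A derives words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1][LEXICON[word.lower()]] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n]["S"]

print(count_parses("I saw a man".split()))                               # 1
print(count_parses("I saw a man with a telescope".split()))              # 2
print(count_parses("I saw a man on a hill with a telescope".split()))    # 5
```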
Semantic Ambiguity
• The task is to determine the meaning of a sentence.
• Semantic ambiguity arises when the meaning of the words themselves is ambiguous or can be
misinterpreted. In other words, it happens when a sentence contains an ambiguous word or phrase.

Ex: Mary knows a little French.

“little French” is ambiguous: we don’t know whether it refers to the French language or to a person.

Ex: It is very warm here.
• What is the actual temperature? In India, 40 degrees is warm; in England, 25 degrees is warm.
Pragmatic ambiguity
• The task is to determine meaning in context, e.g., to infer the speech acts of language.
• Pragmatics is the study of what is not directly spoken, that is, of how
people use language.
• Pragmatic ambiguity arises when the meaning of the words of a sentence is not
specific, so the sentence supports different meanings.
• It can be described as a sentence having multiple interpretations
depending on context.
• The intended sense cannot be determined from the grammatical form of the sentence alone;
these multiple interpretations give rise to ambiguity.
• Ex: The chicken is ready to eat.
• Case 1: The chicken is ready to eat its breakfast (semantic meaning)
• Case 2: The cooked chicken is ready to be served (pragmatic meaning)
Why is NLP hard?
• See you, I will text you later == C U I’ll txt u l8r
• C = see
• U = you
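One simple way to handle such text-speak is a lookup table that maps each abbreviation to its expansion. The sketch below is a minimal illustration built only from the example above; `SMS_LEXICON` and `normalize_sms` are hypothetical names, and a real normalizer would need a far larger table plus handling for punctuation and casing.

```python
# Lookup table built from the example above (an assumption for illustration).
SMS_LEXICON = {"c": "see", "u": "you", "txt": "text", "l8r": "later"}

def normalize_sms(message):
    """Replace known abbreviations word by word; leave other words unchanged."""
    return " ".join(SMS_LEXICON.get(word.lower(), word) for word in message.split())

print(normalize_sms("C U I'll txt u l8r"))  # see you I'll text you later
```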

IDIOMS
Dark horse
Ball in your court
Burn the midnight oil
Steps of Natural Language Processing
Lexical Analysis/Morphological Processing

In lexical analysis, we divide a whole chunk of text into paragraphs, sentences,
and words. It involves identifying and analyzing the structure of words.
In this phase, paragraphs and sentences are broken into tokens, which are
the smallest units of text.
For example, “He is from IIMK.” is divided into [‘He’, ‘is’, ‘from’, ‘IIMK’, ‘.’].
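The tokenization step above can be sketched with Python's standard `re` module. This is a minimal illustration rather than a production tokenizer; the function name `tokenize` and the pattern are assumptions, and libraries such as NLTK or spaCy handle many cases (contractions, URLs, abbreviations) that this pattern does not.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # A run of word characters, or one character that is neither
    # a word character nor whitespace (i.e. punctuation).
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("He is from IIMK."))  # ['He', 'is', 'from', 'IIMK', '.']
```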
Syntax Analysis
Syntax analysis checks whether a sentence is well formed and breaks it up into a structure that shows the
syntactic relationships between the different words.
Ex: “The school goes to the boy” would be rejected by a syntax analyzer or parser.
Syntactic analysis (parser): reads the tokens and outputs a parse tree or reports syntax
errors.
Semantic Analysis

• Semantic analysis captures the meaning of the given text while taking
into account context, the logical structure of sentences, and grammatical
roles.
• Ex: “I ate hot ice cream” will be rejected by the semantic analyzer
because it doesn’t make sense.
• In semantic analysis, the relations between lexical items are identified.
• Sentence 1: Students love IIMK.
• Sentence 2: IIMK loves students.
• The same words in a different arrangement mean something different. Semantic
analysis tries to understand how combinations of individual words form the
meaning of the text.
Pragmatic Analysis
• Pragmatic analysis helps users discover the intended
effect of an utterance by applying a set of rules.
• Ex: “close the window?” should be interpreted as a request
instead of an order.
An example of tasks in natural language understanding
The NLP pyramid
Morphology
In morphology, most of the operations are at the word level.
• Prefixes/Suffixes
• Singularization/Pluralization
• Gender detection
• Word inflection
• Lemmatization
• Spellchecking
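As a small illustration of a word-level operation from the list above, here is a rule-based pluralizer for regular English nouns. It is a sketch under simplifying assumptions: the function name `pluralize` and the three suffix rules are chosen for illustration, and irregular forms (for example "child" or "mouse") are not handled.

```python
VOWELS = "aeiou"

def pluralize(noun):
    """Pluralize a regular English noun using three suffix rules."""
    lower = noun.lower()
    # Sibilant endings take "es": box -> boxes, church -> churches.
    if lower.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    # Consonant + y becomes "ies": city -> cities (but boy -> boys).
    if lower.endswith("y") and len(lower) > 1 and lower[-2] not in VOWELS:
        return noun[:-1] + "ies"
    return noun + "s"

for word in ["box", "city", "boy", "word"]:
    print(word, "->", pluralize(word))
```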
Syntax
Syntax usually works on sentences.
Building Syntax Trees/Parse tree
Building Dependency Trees
Semantics
Semantics derives meaning from text
Named Entity Extraction
Relation Extraction
Semantic Role Labelling
Word Sense Disambiguation
Pragmatics
Pragmatics analyses the text as a whole. It’s about
determining underlying narrative threads, topics, references.
Some tasks are:
Coreference / Anaphora resolution (find out which word
refers to what.
Example: Rahul is fine. He [Rahul]’s in no danger.)
Topic segmentation
Lexical chains
Text Summarization
Illustration of different levels of text representation
