1 Introduction


Natural Language Processing

Book: Speech and Language Processing, by D. Jurafsky & J. H. Martin, Upper Saddle River, NJ: Prentice Hall (2000).
What is natural language processing?

• Process information contained in natural language text.

• Computational Linguistics (CL),

• Human Language Technology (HLT),

• Natural Language Engineering (NLE)


NLP for Machines

• Analyse, understand and generate human language just like humans do.

• Apply computational techniques to the language domain.

• Explain linguistic theories and use those theories to build systems that can be of social use.

• Started off as a branch of Artificial Intelligence.

• Borrows from Linguistics, Psycholinguistics, Cognitive Science and Statistics.


NLP Applications
Production-Level Applications

• A computer program in Canada accepts daily weather data and automatically generates weather reports in English and French.

• Over 1,000,000 translation requests daily are processed by the Babel Fish system available through Altavista.

• A visitor to Cambridge, MA can ask a computer about places to eat using only spoken language. The system returns relevant information from a database of facts about the restaurant scene.
Prototype-Level Applications

• Computers grade student essays in a manner indistinguishable from human graders.

• An automated reading tutor intervenes, through speech, when the reader makes a mistake or asks for help.

• A computer watches a video clip of a soccer game and produces a report about what it has seen.

• A computer predicts upcoming words and expands abbreviations to help people with disabilities to communicate.
Why NLP?

• Language is a hallmark of human intelligence.

• Text is the largest repository of human knowledge and is growing quickly.

• The goal is computer programs that understand text or speech.


History of NLP

• In 1950, Alan Turing published an article titled “Computing Machinery and Intelligence”, which proposed what is now called the Turing test as a criterion of intelligence.

• Some notably successful natural language systems developed in the 1960s were SHRDLU, a natural language system working in restricted “blocks worlds” with restricted vocabularies, and ELIZA, which was written between 1964 and 1966.
Components and Process of NLP

• Components of NLP

• Linguistics and Language

• Steps of NLP

• Techniques and Methods


Components of NLP

• Natural Language Understanding: taking some spoken/typed sentence and working out what it means.

• Natural Language Generation: taking some formal representation of what you want to say and working out a way to express it in a natural (human) language.
Components of NLP

• Natural Language Understanding:
Mapping the given input in the natural language into a useful representation.

• Different levels of analysis are required:

• Morphological analysis
• Syntactic analysis
• Semantic analysis
• Discourse analysis
Components of NLP

Natural Language Generation:

•Producing output in the natural language from some internal representation.

Different levels of synthesis are required:

•Deep planning (what to say)

•Syntactic generation

Note: NL Understanding is much harder than NL Generation, but both of them are hard.
Linguistics and language

Linguistics is the science of language.

Its study includes:

•Sounds, studied in phonology
•Word formation, studied in morphology
•Sentence structure, studied in syntax
•Meaning, studied in semantics
•Understanding in context, studied in pragmatics
Steps of NLP

1. Morphological and Lexical Analysis
2. Syntactic Analysis
3. Semantic Analysis
4. Discourse Integration
5. Pragmatic Analysis
Morphological and Lexical Analysis

• The lexicon of a language is its vocabulary that includes its words and expressions.

• Morphology is the analysis, identification and description of the structure of words.

• Lexical analysis involves dividing a text into paragraphs, sentences and words.
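
The division of a text into paragraphs, sentences and words can be sketched roughly as below. This is a minimal illustration only: the function names are made up for this example, and the regular expressions are naive assumptions (blank lines separate paragraphs, sentences end with ".", "!" or "?"), so abbreviations such as "e.g." would be split incorrectly.

```python
import re

def split_paragraphs(text):
    # Assumption: paragraphs are separated by one or more blank lines.
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def split_words(sentence):
    # Keep runs of letters/digits, allowing an internal apostrophe (e.g. "it's").
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", sentence)

def lexical_analysis(text):
    # Result: list of paragraphs, each a list of sentences, each a list of words.
    return [[split_words(s) for s in split_sentences(p)]
            for p in split_paragraphs(text)]

sample = "Mother was baking. The apple pie was baking.\n\nMother baked an apple pie."
for paragraph in lexical_analysis(sample):
    print(paragraph)
```

A real system would usually rely on a trained tokenizer rather than hand-written patterns like these.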
Syntactic Analysis

• Syntax concerns the proper ordering of words and its effect on meaning.

• This involves analysing the words in a sentence to reveal the grammatical structure of the sentence.

• The words are transformed into a structure that shows how the words are related to each other.
• For example: “the girl the go to the school”. This would definitely be rejected by the English syntactic analyser.
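
How a syntactic analyser might reject such a sentence can be sketched with a toy grammar and a CKY-style recogniser. The grammar and lexicon below are hypothetical and cover only a handful of words; they ignore agreement, so "go" and "goes" are both simply verbs. The point is only that no combination of rules spans the ill-formed word order.

```python
# Hypothetical toy grammar in binary form: (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("VP", "V", "PP"),
    ("PP", "P", "NP"),
]
# Hypothetical lexicon mapping each word to its possible categories.
LEXICON = {
    "the": {"Det"}, "girl": {"N"}, "school": {"N"},
    "go": {"V"}, "goes": {"V"}, "to": {"P"},
}

def recognizes(words, start="S"):
    """Return True if the toy grammar can derive the word sequence."""
    n = len(words)
    if n == 0:
        return False
    chart = {}
    # Fill in the categories of single words.
    for i, word in enumerate(words):
        chart[i, i + 1] = set(LEXICON.get(word.lower(), ()))
    # Combine adjacent spans bottom-up, shortest spans first.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):
                for parent, left, right in RULES:
                    if left in chart[i, k] and right in chart[k, j]:
                        cell.add(parent)
            chart[i, j] = cell
    return start in chart[0, n]

print(recognizes("the girl goes to the school".split()))    # accepted
print(recognizes("the girl the go to the school".split()))  # rejected
```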
Semantic Analysis

• Semantics concerns the (literal) meaning of word, phrases and sentences.

• This step abstracts the dictionary meaning, or the exact meaning, from the context.

• The structures which are created by the syntactic analyser are assigned meaning.

For example: “colourless blue idea”. This would be rejected by the analyser, as colourless and blue do not make any sense together.
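
One simple way to picture this kind of rejection is a feature clash check. The sketch below is an assumption made for illustration, not a description of any particular analyser: each word carries a small hypothetical set of features, and a phrase is rejected when two words assign contradictory values to the same feature.

```python
# Hypothetical semantic features; real lexicons are far richer.
FEATURES = {
    "colourless": {"has_colour": False},
    "blue": {"has_colour": True},
    "green": {"has_colour": True},
    "new": {},
    "idea": {"physical": False},
    "car": {"physical": True},
}

def is_semantically_consistent(phrase):
    """Return False if two words give conflicting values for one feature."""
    combined = {}
    for word in phrase.lower().split():
        for name, value in FEATURES.get(word, {}).items():
            if name in combined and combined[name] != value:
                return False  # e.g. has_colour is both False and True
            combined[name] = value
    return True

print(is_semantically_consistent("colourless blue idea"))  # False
print(is_semantically_consistent("blue car"))              # True
```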
Discourse Integration

• Sense of the context

• The meaning of any single sentence depends upon the sentences that precede it and also influences the meaning of the sentences that follow it.

For example: the word “it” in the sentence “she wanted it” depends upon the prior discourse context.
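
A very rough sketch of using prior discourse to interpret “it” is given below. It assumes a most-recent-mention heuristic and a small hypothetical list of candidate nouns; real discourse integration takes gender, number, syntax and salience into account.

```python
# Hypothetical set of nouns that "it" may refer to.
CANDIDATE_NOUNS = {"pie", "pitcher", "window", "book"}

def resolve_it(previous_sentences, nouns=CANDIDATE_NOUNS):
    """Return the most recently mentioned candidate noun, or None."""
    for sentence in reversed(previous_sentences):
        for word in reversed(sentence.lower().split()):
            word = word.strip(".,!?\"'")
            if word in nouns:
                return word
    return None

context = ["Mother baked an apple pie."]
print(resolve_it(context))  # the referent proposed for "it" in "She wanted it."
```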
Pragmatic Analysis

• Pragmatics concerns the overall communicative and social context and its effect on
interpretation.

• It means abstracting or deriving the purposeful use of language in situations.

• It covers, importantly, those aspects of language that require world knowledge.

• The main focus is on reinterpreting what was said in terms of what was actually meant.

For example: “close the window?” should be interpreted as a request rather than an order.
Natural Language Generation

• NLG is the process of constructing natural language outputs from non-linguistic inputs

• NLG can be viewed as the reverse process of NL understanding

• An NLG system may have the following main parts:

i) Discourse Planner: decides what will be generated and which sentences to produce
ii) Surface Realizer: realises a sentence from its internal representation
iii) Lexical Selection: selects the correct words to describe the concepts
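
The three parts can be sketched as a tiny template-based generator for weather data. Everything here is hypothetical: the field names (`sky`, `high_c`), the temperature thresholds and the sentence template are assumptions chosen for illustration, not a description of any deployed system.

```python
def discourse_planner(data):
    """Decide what to say: one message per field that is present."""
    messages = []
    if "sky" in data:
        messages.append(("sky", data["sky"]))
    if "high_c" in data:
        messages.append(("high", data["high_c"]))
    return messages

def lexical_selection(message):
    """Choose words for one message and return a sentence specification."""
    kind, value = message
    if kind == "high":
        # Assumed thresholds for picking a describing word.
        word = "hot" if value >= 30 else "cold" if value <= 5 else "mild"
        return {"subject": "the day", "verb": "will be",
                "complement": f"{word}, with a high of {value} degrees"}
    return {"subject": "the sky", "verb": "will be", "complement": value}

def surface_realizer(spec):
    """Turn the specification into a capitalised, punctuated sentence."""
    sentence = f"{spec['subject']} {spec['verb']} {spec['complement']}."
    return sentence[0].upper() + sentence[1:]

def generate(data):
    return " ".join(surface_realizer(lexical_selection(m))
                    for m in discourse_planner(data))

print(generate({"sky": "cloudy", "high_c": 12}))
# The sky will be cloudy. The day will be mild, with a high of 12 degrees.
```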
Techniques and methods

Machine Learning

•The learning procedures used during machine learning automatically focus on the most common cases.
•By contrast, when rules are written by hand it is often not at all obvious where the effort should be directed.
•Hand-written rules are also prone to human error.
Techniques and methods

Statistical inference

•Automatic learning procedures can make use of statistical inference algorithms.
•These are used to produce models that are robust to unfamiliar input, e.g. input containing words or structures that have not been seen before.
•This amounts to making intelligent guesses.
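
One common way to obtain this robustness is smoothing. The sketch below uses add-one (Laplace) smoothing on a unigram word model, chosen here only as a simple illustration; the class name is made up. Every word, including one never seen in training, receives a non-zero probability.

```python
from collections import Counter

class SmoothedUnigramModel:
    """Unigram word model with add-one (Laplace) smoothing."""

    def __init__(self, tokens):
        self.counts = Counter(tokens)
        self.total = sum(self.counts.values())
        # One extra vocabulary slot stands in for all unseen words.
        self.vocab_size = len(self.counts) + 1

    def probability(self, word):
        # Unseen words have count 0 but still get probability 1 / (N + V).
        return (self.counts[word] + 1) / (self.total + self.vocab_size)

model = SmoothedUnigramModel("the water is in the pitcher".split())
print(model.probability("the"))    # seen twice in training
print(model.probability("arrow"))  # never seen, yet greater than zero
```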
Techniques and methods

Input database and Training data

•Systems based on automatically learned rules can be made more accurate simply by supplying more input data.
•However, systems based on hand-written rules can only be made more accurate by increasing the complexity of the rules, which is a much more difficult task.
Natural language vs. Computer language

•Ambiguity is the primary difference between natural and computer languages.

•Formal programming languages are designed to be unambiguous. They can be defined by a grammar that produces a unique parse for each sentence in the language.

•Programming languages are also designed for efficient (deterministic) parsing.

•They are deterministic context-free languages (DCFLs).


Future of NLP

•Make computers able to solve problems and think like humans, perform activities that humans cannot perform, and do so more efficiently than humans.

•As natural language understanding or readability improves, computers, machines and devices will be able to learn from the information online and apply what they have learned in the real world.
Difficulties in NLP

•“The person who done it- it’s their fault”

•“The man who knew him went left”
“The man who knew he went left”

•Mother was baking.
The apple pie was baking.
Mother baked an apple pie.
An apple pie was baked by mother.
Lexical ambiguity

•“Time flies like an arrow.”

•“I saw that gas can explode.”

•“They should have scheduled meeting.”

•“Visiting relatives can be annoying.”

•Note: ambiguity due to class of word.
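
To see how quickly word-class ambiguity multiplies, the sketch below enumerates every part-of-speech assignment for “Time flies like an arrow.” The lexicon is a hypothetical simplification listing only a few classes per word.

```python
from itertools import product

# Hypothetical lexicon: possible word classes for each word.
POS_LEXICON = {
    "time": ["NOUN", "VERB"],
    "flies": ["NOUN", "VERB"],
    "like": ["VERB", "PREP"],
    "an": ["DET"],
    "arrow": ["NOUN"],
}

def tag_sequences(sentence):
    """Return every combination of word classes allowed by the lexicon."""
    words = sentence.lower().strip(".").split()
    options = [POS_LEXICON[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]

sequences = tag_sequences("Time flies like an arrow.")
print(len(sequences))  # 2 * 2 * 2 * 1 * 1 = 8 candidate taggings
for seq in sequences:
    print(seq)
```

Most of the eight taggings are not grammatical sentences, so a tagger or parser must rule them out using context.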


Lexical ambiguity

•“The pitcher fell and broke”

•“The water is in the pitcher.”

•“John drank a pitcher.”

•“John drank a tall pitcher while watching the baseball game.”

•“She approached the bank.”

•Note: ambiguity due to multiple definitions.
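
A simple word-overlap heuristic, loosely in the spirit of dictionary-based sense disambiguation, shows why these sentences are hard. The sense inventory and cue words below are hypothetical. The heuristic picks the sense whose cue words overlap most with the sentence.

```python
import re

# Hypothetical sense inventory: sense name -> cue words.
SENSES = {
    "pitcher": {
        "container": {"water", "drank", "broke", "fell", "pour"},
        "baseball player": {"baseball", "game", "throw", "team"},
    },
}

def disambiguate(word, sentence):
    """Pick the sense with the largest cue-word overlap, or None if no overlap."""
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(cues & context) for sense, cues in SENSES[word].items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(disambiguate("pitcher", "The water is in the pitcher."))
print(disambiguate("pitcher",
                   "John drank a tall pitcher while watching the baseball game."))
```

On the second sentence the heuristic chooses "baseball player", because "baseball" and "game" outnumber "drank", even though John is drinking from a container. Nearby words alone can mislead a system.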


Syntactic ambiguity

•“John saw the woman in the park with a telescope.”
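
The number of structures behind this sentence can be counted with a small CKY-style chart. The grammar and lexicon below are hypothetical and deliberately tiny. They allow a prepositional phrase to attach either to a noun phrase or to a verb phrase, which is the source of the ambiguity.

```python
from collections import defaultdict

# Hypothetical toy grammar: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP modifies the verb phrase
    ("NP", "NP", "PP"),   # PP modifies the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "john": ["NP"], "saw": ["V"], "the": ["Det"], "a": ["Det"],
    "woman": ["N"], "park": ["N"], "telescope": ["N"],
    "in": ["P"], "with": ["P"],
}

def count_parses(words, start="S"):
    """Count distinct parse trees for the words under the toy grammar."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, symbol) -> number of trees over words[i:j]
    for i, word in enumerate(words):
        for tag in LEXICON.get(word.lower(), []):
            chart[i, i + 1, tag] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]

sentence = "John saw the woman in the park with a telescope"
print(count_parses(sentence.split()))  # 5 readings under this grammar
```

Under this grammar the sentence has five parses: each prepositional phrase may attach to the seeing or to a preceding noun, as long as attachments do not cross.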
