NLP Unit 1 Notes

Natural language processing

Uploaded by

royalnaveen757
Copyright
© All Rights Reserved

UNIT 1

What is Natural Language Processing (NLP)?


• The process of computer analysis of input provided in a human language (natural language),
and conversion of this input into a useful form of representation.
• The field of NLP is primarily concerned with getting computers to perform useful and
interesting tasks with human languages.

Forms of Natural Language


Language is one of the fundamental aspects of human behaviour and is a crucial component of our lives.
Language takes two forms:
1. Written Text
2. Speech
In written form it serves as a long-term record of knowledge from one generation to the next.
In spoken form it serves as our primary means of coordinating our day-to-day behaviour with others.

Applications of NLP:
There are two classes of applications:
1. Text based applications
2. Dialogue based applications

Text Based Applications


➢ Finding appropriate documents on certain topics from a database of texts (finding books in
library)
➢ Extracting information from messages on certain topics
➢ Translating documents from one language to another
➢ Summarizing texts for certain purposes

Dialogue based applications


This type of application involves human-machine communication.
➢ Question-answering systems, where NL is used to query a database.
➢ Automated customer services over the telephone.
➢ Tutoring systems, where the machine interacts with a student.
➢ Spoken language control of a machine
➢ General cooperative problem solving systems.

Levels of Knowledge of Language


1. Phonological Knowledge – concerns how words are related to the sounds that realize them.
2. Morphological Knowledge – concerns how words are constructed from more basic meaning units
called morphemes.
A morpheme is the primitive unit of meaning in a language.
3. Syntactic Knowledge – concerns how words can be put together to form correct sentences, and determines
what structural role each word plays in the sentence and what phrases are subparts of other phrases.
4. Semantic Knowledge – concerns what words mean and how these meanings combine in sentences
to form sentence meaning. This is the study of context-independent meaning.
5. Pragmatic Knowledge – concerns how sentences are used in different situations and how use
affects the interpretation of the sentence.
6. Discourse Knowledge – concerns how the immediately preceding sentences affect the interpretation
of the next sentence.
• For example, interpreting pronouns and interpreting the temporal aspects of the information.
7. World Knowledge – includes general knowledge about the world, including what each language user
must know about the other’s beliefs and goals.
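
As a rough illustration of morphological knowledge, the sketch below splits a word into a stem and a suffix morpheme. The suffix list and the function name `split_morphemes` are assumptions made for this example and are not part of the notes; a real morphological analyzer would need a full lexicon and spelling rules.

```python
# Hypothetical, tiny suffix inventory; a real analyzer would use a full lexicon.
SUFFIXES = ["ness", "ing", "ly", "ed", "s"]

def split_morphemes(word):
    """Return [stem, suffix] for the first matching suffix, else [word]."""
    for suffix in SUFFIXES:
        # Require a stem of at least three letters to avoid splitting short words.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return [word[:-len(suffix)], suffix]
    return [word]

print(split_morphemes("friendly"))  # ['friend', 'ly']
print(split_morphemes("walked"))    # ['walk', 'ed']
print(split_morphemes("cat"))       # ['cat']
```

This simple suffix stripping will mis-split many words (for example, irregular forms), which is one reason morphological knowledge is treated as its own level.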

Evaluating Language Understanding Systems


Evaluation varies from application to application.
To evaluate a system is to run the program and see how well it performs the task it was designed to
do.
There are two methods of evaluation:
1. Black Box Evaluation
2. Glass Box Evaluation
If the system is designed to participate in simple conversations on a topic, you might try conversing
with it.
This is called Black Box Evaluation because it evaluates system performance without looking inside
to see how it works.
Identifying the various subcomponents of a system and then evaluating each one with appropriate tests
is called Glass Box Evaluation.
In this method we look inside at the structure of the system.
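
The difference between the two methods can be sketched with a toy system. Everything here (the two-stage design, the lexicon, and the function names) is a hypothetical example rather than a system described in the notes.

```python
# Toy two-stage system; all names and the lexicon are hypothetical.
LEXICON = {"the": "DET", "dog": "NOUN", "cat": "NOUN", "chased": "VERB"}

def tokenize(sentence):
    """Subcomponent 1: lowercase the sentence and split it on whitespace."""
    return sentence.lower().split()

def tag(tokens):
    """Subcomponent 2: look up each token's category, 'UNK' if missing."""
    return [LEXICON.get(token, "UNK") for token in tokens]

def count_nouns(sentence):
    """The whole system: how many nouns does the sentence contain?"""
    return tag(tokenize(sentence)).count("NOUN")

# Black box evaluation: check only the final output of the whole system.
black_box_ok = count_nouns("The dog chased the cat") == 2

# Glass box evaluation: check each subcomponent separately.
glass_box_ok = (
    tokenize("The dog") == ["the", "dog"]
    and tag(["the", "dog"]) == ["DET", "NOUN"]
)

print(black_box_ok, glass_box_ok)
```

A failing black box check says only that something is wrong; the glass box checks help locate which subcomponent is responsible.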

Representations and Understanding


A crucial component of understanding involves computing a representation of the meaning of
sentences and texts.
Why not simply use the sentence itself as a representation of its meaning?
Ans: Most words have multiple meanings, which are called senses.
Ex: ‘Cook’ has a verb sense and a noun sense.
‘Dish’ has a verb sense and a noun sense.
‘Still’ has senses as a noun, verb, adjective and adverb.
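
To see why the raw sentence is too ambiguous to serve as its own meaning, the sketch below counts how many category assignments a word sequence allows before any grammar rules some of them out. The sense table copies the three examples above; the function name `candidate_readings` is an assumption for illustration.

```python
# Parts of speech for each word, taken from the examples in the notes.
SENSES = {
    "cook": ["noun", "verb"],
    "dish": ["noun", "verb"],
    "still": ["noun", "verb", "adjective", "adverb"],
}

def candidate_readings(words):
    """Multiply the number of senses per word; unknown words count as one."""
    total = 1
    for word in words:
        total *= len(SENSES.get(word.lower(), ["unknown"]))
    return total

print(candidate_readings(["cook", "still"]))          # 2 * 4 = 8
print(candidate_readings(["cook", "dish", "still"]))  # 2 * 2 * 4 = 16
```

The count grows multiplicatively with sentence length, so a representation that picks out one reading at a time is needed.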

Formal languages are specified from very simple building blocks.


Useful representation languages have the following two properties:
1. The representation must be precise and unambiguous.
Every distinct reading of a sentence should have a distinct formula in the representation.
2. The representation should capture the intuitive structure of the natural language sentences that
it represents.
Sentences that appear to be structurally similar should have similar structural representations.

1. Syntax: Representing Sentence Structure


The syntactic structure of a sentence indicates the way that words in the sentence are related to each
other.
This structure indicates:
• How the words are grouped together into phrases
• What words modify what other words
• What words are of central importance in the sentence
➢ This structure may identify the types of relationships that exist between phrases.
➢ It can also store other information about the particular sentence structure.
➢ Most syntactic representations of language are based on the notion of Context Free Grammars.
➢ CFGs represent sentence structure in terms of what phrases are subparts of other phrases.
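
A minimal sketch of how a CFG describes phrases as subparts of other phrases is given below. The grammar rules, the lexicon, and the function names are assumptions chosen for illustration; the parser is a simple top-down backtracking one and would loop on left-recursive rules, which this toy grammar avoids.

```python
# Hypothetical toy grammar: each symbol maps to its possible expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["DET", "N"], ["NAME"]],
    "VP": [["V", "NP"], ["V"]],
}
# Hypothetical toy lexicon: each word has exactly one category here.
LEXICON = {"the": "DET", "dog": "N", "cat": "N",
           "john": "NAME", "saw": "V", "slept": "V"}

def parse(symbol, words, start):
    """Yield (tree, next_position) for each way symbol can begin at start."""
    if symbol not in GRAMMAR:
        # Lexical category: it must match the word at this position.
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rule in GRAMMAR[symbol]:
        for children, end in match_sequence(rule, words, start):
            yield (symbol, *children), end

def match_sequence(symbols, words, start):
    """Yield (list_of_trees, next_position) for a sequence of symbols."""
    if not symbols:
        yield [], start
        return
    for tree, mid in parse(symbols[0], words, start):
        for rest, end in match_sequence(symbols[1:], words, mid):
            yield [tree] + rest, end

def parse_sentence(sentence):
    """Return every S tree that covers the whole sentence."""
    words = sentence.lower().split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

for tree in parse_sentence("the dog saw john"):
    print(tree)
```

Each nested tuple is a phrase, and its elements after the label are the subparts of that phrase, so the NP "the dog" appears as a subpart of S.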

2. The Logical Form


➢ The representation of the context independent meaning of a sentence is called its Logical Form.
➢ The logical form encodes possible word senses and identifies the semantic relationships between
the words and phrases.
➢ Many of these relationships are often captured using an abstract set of semantic relationships
between the verb and its noun phrases.
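
One way to picture semantic relationships between a verb and its noun phrases is the sketch below. The role names `AGENT` and `THEME`, the `LogicalForm` class, and the helper `to_logical_form` are assumptions for illustration and are not defined in the notes.

```python
from dataclasses import dataclass, field

@dataclass
class LogicalForm:
    """A predicate plus a mapping from role names to their fillers."""
    predicate: str
    roles: dict = field(default_factory=dict)

    def __str__(self):
        args = " ".join(f"[{role} {filler}]" for role, filler in self.roles.items())
        return f"({self.predicate} {args})"

def to_logical_form(verb, subject, obj, passive=False):
    """Map syntactic positions to semantic roles (hypothetical helper)."""
    if passive:
        # In a passive sentence the surface subject is the thing acted upon.
        subject, obj = obj, subject
    return LogicalForm(verb.upper(), {"AGENT": subject, "THEME": obj})

# "John broke the window" vs. "The window was broken by John"
active = to_logical_form("break", "john", "window")
passive = to_logical_form("break", "window", "john", passive=True)

print(active)             # (BREAK [AGENT john] [THEME window])
print(active == passive)  # True
```

The two sentences differ in syntactic structure but receive the same logical form, which is the kind of abstraction the semantic relationships are meant to provide.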

3. The Final Meaning Representation


➢ The final representation needed is a general knowledge representation, which the system uses
to represent and reason about its application domain.
➢ The goal of contextual interpretation is to take a representation of the structure of a sentence
and its logical form, and to map this into some expression in the Knowledge representation
that allows the system to perform the appropriate task in the domain.

The Organization of Natural Language Understanding Systems


The process that maps a sentence to its syntactic structure and logical form is called the Parser.
It uses knowledge of words and word meanings (the lexicon) and a set of rules defining the legal
structures (the grammar) in order to assign a syntactic structure and a logical form to an input
sentence.

The process that transforms the syntactic structure and logical form into a final meaning representation
is called contextual processing.
It uses knowledge of the discourse context and knowledge of the application to produce a final
representation.
A grammar can serve two tasks: the understanding task and the generation task.
If a grammar supports both, it is called a bidirectional grammar.
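
The two processing stages described above can be sketched as a small pipeline. Both functions are hypothetical stand-ins: the "parser" only handles three-word subject-verb-object input, and the contextual step only resolves the pronoun "it" to the most recently mentioned entity.

```python
def parse(sentence):
    """Stage 1 stand-in: build a flat logical form from 'SUBJECT VERB OBJECT'."""
    subject, verb, obj = sentence.lower().rstrip(".").split()
    return {"predicate": verb, "agent": subject, "theme": obj}

def contextual_processing(logical_form, discourse_context):
    """Stage 2 stand-in: replace 'it' with the most recently mentioned entity."""
    resolved = dict(logical_form)  # copy so the logical form stays unchanged
    for role in ("agent", "theme"):
        if resolved[role] == "it" and discourse_context:
            resolved[role] = discourse_context[-1]
    return resolved

# Entities mentioned in earlier sentences, oldest first (hypothetical context).
context = ["table", "vase"]

logical_form = parse("John dropped it.")
final_meaning = contextual_processing(logical_form, context)

print(logical_form)   # theme is still the pronoun 'it'
print(final_meaning)  # theme resolved to 'vase'
```

The logical form is the same in any context, while the final meaning depends on what was said before, which mirrors the split between parsing and contextual processing.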

NLP - an inter-disciplinary Field


• NLP borrows techniques and insights from several disciplines.
• Linguistics: How do words form phrases and sentences? What constrains the possible meanings
of a sentence?
• Computational Linguistics: How is the structure of sentences identified? How can
knowledge and reasoning be modeled?
• Computer Science: Algorithms for automata and parsers.
• Engineering: Stochastic techniques for ambiguity resolution.
• Psychology: What linguistic constructions are easy or difficult for people to learn to use?
• Philosophy: What is meaning, and how do words and sentences acquire it?
