
Parsing and Syntax

The document discusses syntactic analysis and parsing in natural language processing. It defines syntax as how words are related in a sentence and syntactic analysis as analyzing phrases, word modifications, and central words. Syntactic analysis is used in applications like grammar checking, question answering, and machine translation. The document also covers parsing strategies like top-down and bottom-up, and techniques to improve efficiency like separating lexical rules and chunking.

Uploaded by

Gezish

Parsing and Syntax

• Syntax refers to the way words are related to each other in a sentence.
• Syntactic analysis determines:
  - how words are grouped together into phrases;
  - which words modify other words;
  - which words are of central importance to the sentence.
• Syntactic analysis is used in many NLP applications, such as:
  - grammar checking
  - question answering
  - information extraction
  - machine translation
Noun Phrases
• student, the student, that student, two students, many students
• clever student
• a student of computer science
• AU students, long queues, the student with long hair, the city where I lived
Adjective Phrases
• incredibly short
• rather difficult
• very happy
• unbelievably quick
• exceedingly sorry about the mistake
• amazingly rich in minerals
Verb Phrases
• turn, turn on, is turning on, have been working
• threatened to throw himself into the window
• was an understandable reaction by the visitors
• is amazingly rich in minerals
Prepositional Phrases
• on the table
• across the world
• over your head
• in the hotel
• to their house
Adverbial Phrases
• immediately
• unbelievably quickly
• very carefully
Simple Sentences
• The computer is on the table.
• He went home.
• They are always happy.
Complex Sentences
• He was driving the car that he bought from his father.
• We rented our house to friends while we were abroad.
Examples: Simple Sentences
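One way to picture how the phrase types above combine into a simple sentence is a labelled tree. The sketch below encodes one plausible analysis of "The computer is on the table" as nested tuples; the category labels (S, NP, VP, PP, Det, N, V, P) and the particular bracketing are assumptions chosen for illustration, not an analysis taken from the slides.

```python
# A tree node is a tuple (label, child, child, ...); a leaf is a plain string.
# The bracketing below is one hypothetical analysis of the example sentence.
TREE = ("S",
        ("NP", ("Det", "The"), ("N", "computer")),
        ("VP", ("V", "is"),
               ("PP", ("P", "on"),
                      ("NP", ("Det", "the"), ("N", "table")))))


def leaves(node):
    """Return the words of the tree from left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def bracketed(node):
    """Render the tree in labelled-bracket notation."""
    if isinstance(node, str):
        return node
    inner = " ".join(bracketed(child) for child in node[1:])
    return "[" + node[0] + " " + inner + "]"


if __name__ == "__main__":
    print(" ".join(leaves(TREE)))
    print(bracketed(TREE))
```

Reading the leaves from left to right recovers the sentence, while the labels record which words are grouped into which phrase.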
Concepts: Alphabet, String, and Language
• Formal language theory treats a language as a mathematical object defined by an alphabet, strings, and a grammar.
• Alphabet: a finite set of symbols.
  - Binary alphabet: {0, 1}
  - Decimal alphabet: {0, 1, 2, …, 9}
  - English alphabet: {a, b, c, …, z, A, B, C, …, Z}
• String: a finite sequence of symbols from an alphabet.
  - Binary strings: 0100101, 01101, 00110
  - Decimal strings: 176392, 12, 398702
  - English strings: killed, Abebe, lion, the
• Language: a (potentially infinite) set of strings over an alphabet.
  - Binary language: {0100101, 01101, 00110, …}
  - Decimal language: {176392, 12, 398702, …}
  - English language: {killed, Abebe, lion, the, …}
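The three definitions can be made concrete in a few lines of Python. This is a minimal sketch: the helper names `is_string_over` and `strings_up_to`, and the sample language "binary strings with an even number of 0s", are hypothetical choices made only for illustration.

```python
from itertools import product

# An alphabet is a finite set of symbols.
BINARY = {"0", "1"}


def is_string_over(s, alphabet):
    """True if every symbol of s belongs to the alphabet."""
    return all(symbol in alphabet for symbol in s)


def strings_up_to(alphabet, max_len):
    """Enumerate every string over the alphabet with length 0..max_len."""
    for length in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=length):
            yield "".join(symbols)


def in_even_zeros(s):
    """Membership test for a hypothetical language:
    binary strings containing an even number of 0s."""
    return is_string_over(s, BINARY) and s.count("0") % 2 == 0


if __name__ == "__main__":
    print(is_string_over("0100101", BINARY))
    print([s for s in strings_up_to(BINARY, 2) if in_even_zeros(s)])
```

A language may be infinite, so it is represented here by a membership test rather than by listing its strings.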
Grammar: a formalism for generating the strings of a language by repeatedly replacing symbols.
- A grammar is a 4-tuple G = (N, T, P, S), where
• N is a finite set of non-terminal symbols. In natural languages, these can be syntactic categories, phrases, or sentences.
• T is a finite set of terminal symbols (disjoint from N). It consists of the elements of the target language, such as the words and letters of a natural language.
• P is a finite set of production rules of the form α → β, with at least one non-terminal in α.
• S is a member of N called the start symbol (a special non-terminal symbol). In natural languages, the start symbol is the sentence.
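As a rough illustration of the 4-tuple, the sketch below stores N, T, P, and S in a small data class and replays a derivation by replacing symbols one rule at a time. The toy grammar, the class name `Grammar`, and the helper names are all hypothetical; only the words "Abebe", "killed", "the", and "lion" come from the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # T
    productions: tuple       # P: pairs (lhs, rhs), each a tuple of symbols
    start: str               # S


# Hypothetical toy grammar for illustration only.
TOY = Grammar(
    nonterminals=frozenset({"S", "NP", "VP", "Det", "N", "V"}),
    terminals=frozenset({"Abebe", "killed", "the", "lion"}),
    productions=(
        (("S",), ("NP", "VP")),     # 0
        (("NP",), ("Abebe",)),      # 1
        (("NP",), ("Det", "N")),    # 2
        (("VP",), ("V", "NP")),     # 3
        (("Det",), ("the",)),       # 4
        (("N",), ("lion",)),        # 5
        (("V",), ("killed",)),      # 6
    ),
    start="S",
)


def is_well_formed(g):
    """Check the constraints stated in the definition of G = (N, T, P, S)."""
    return (g.start in g.nonterminals
            and g.nonterminals.isdisjoint(g.terminals)
            and all(any(s in g.nonterminals for s in lhs)
                    for lhs, _ in g.productions))


def rewrite_leftmost(form, lhs, rhs):
    """Replace the leftmost occurrence of lhs in the sentential form by rhs."""
    n = len(lhs)
    for i in range(len(form) - n + 1):
        if form[i:i + n] == lhs:
            return form[:i] + rhs + form[i + n:]
    raise ValueError("rule does not apply to " + " ".join(form))


def derive(g, rule_indices):
    """Apply the listed productions in order, starting from the start symbol."""
    form = (g.start,)
    steps = [form]
    for index in rule_indices:
        lhs, rhs = g.productions[index]
        form = rewrite_leftmost(form, lhs, rhs)
        steps.append(form)
    return steps


if __name__ == "__main__":
    for step in derive(TOY, [0, 1, 3, 6, 2, 4, 5]):
        print(" ".join(step))
```

Each printed line is one sentential form; the last line contains only terminal symbols, so it is a string of the language generated by the toy grammar.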
Hierarchy of Grammars/Languages
• Also known as the Chomsky classification, the hierarchy ranks grammars and languages by their expressiveness.
• The classes are defined by placing different constraints on the production rules, which yields different levels of structural complexity in the sentences that can be described.
• The Chomsky classification consists of the following four levels of grammars/languages:
  - Type 0 (Unrestricted / Recursively Enumerable)
  - Type I (Context-Sensitive)
  - Type II (Context-Free)
  - Type III (Regular)
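The constraints that separate the four levels can be sketched as a check over production rules. The function below is an illustrative simplification built on stated assumptions: regular rules are taken in right-linear form (A → a or A → aB), context-sensitive rules are taken in the equivalent non-contracting form (the left side is no longer than the right side), and rules with an empty right-hand side are out of scope. The function name and the sample rule sets are hypothetical, and the result uses the numbers 3 to 0 for Types III to 0.

```python
def chomsky_type(productions, nonterminals):
    """Return the most restrictive type (3, 2, 1 or 0) that every rule satisfies.

    productions: iterable of (lhs, rhs) pairs, each a tuple of symbols.
    Assumption: no rule has an empty right-hand side.
    """
    productions = list(productions)

    def is_nt(symbol):
        return symbol in nonterminals

    def is_regular(lhs, rhs):
        # Right-linear form: A -> a  or  A -> a B
        return (len(lhs) == 1 and is_nt(lhs[0])
                and len(rhs) in (1, 2)
                and not is_nt(rhs[0])
                and (len(rhs) == 1 or is_nt(rhs[1])))

    def is_context_free(lhs, rhs):
        # A single non-terminal on the left-hand side.
        return len(lhs) == 1 and is_nt(lhs[0]) and len(rhs) >= 1

    def is_context_sensitive(lhs, rhs):
        # Non-contracting form: the rule never shortens the string.
        return any(is_nt(s) for s in lhs) and len(lhs) <= len(rhs)

    for rank, test in ((3, is_regular),
                       (2, is_context_free),
                       (1, is_context_sensitive)):
        if all(test(lhs, rhs) for lhs, rhs in productions):
            return rank
    return 0


if __name__ == "__main__":
    right_linear = [(("S",), ("a", "S")), (("S",), ("b",))]
    nested = [(("S",), ("a", "S", "b")), (("S",), ("a", "b"))]
    print(chomsky_type(right_linear, {"S"}))
    print(chomsky_type(nested, {"S"}))
```

Because the checks run from the most to the least restrictive class, a grammar is reported at the lowest level of the hierarchy whose constraints all of its rules meet.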
Hierarchy of Grammars/Languages: Type 0 (Unrestricted)
Hierarchy of Grammars/Languages: Type I (Context-Sensitive)
Hierarchy of Grammars/Languages: Type II (Context-Free)
Hierarchy of Grammars/Languages: Type III (Regular)
Parsing
• is the process of recognizing sentences and assigning structure to them.
• is a derivation process that identifies the structure of sentences using a given grammar.
• can be treated as a special case of a search problem.
  - Two basic search strategies are used:
    - top-down strategy
    - bottom-up strategy
  - Methods of improving efficiency:
    - storing lexical rules separately
    - chunking
Parsing Strategies: Top-down Parsing
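A top-down parser starts from the start symbol and expands rules until the predicted symbols match the words of the input. The sketch below is a minimal backtracking recognizer under stated assumptions: the toy grammar is hypothetical, it contains no left-recursive rules (which would make this naive search loop forever), and the function only answers yes or no rather than building a tree.

```python
# Hypothetical toy grammar: non-terminal -> list of alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Name"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"]],
    "N": [["lion"]],
    "Name": [["Abebe"]],
    "V": [["killed"]],
}


def expand(symbols, words):
    """True if the symbol sequence can derive exactly the word sequence."""
    if not symbols:
        return not words  # success only if all words are consumed
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Non-terminal: predict each alternative in turn (backtracking search).
        return any(expand(rhs + rest, words) for rhs in GRAMMAR[first])
    # Terminal: it must match the next input word.
    return bool(words) and words[0] == first and expand(rest, words[1:])


def recognize_top_down(sentence):
    """Search from the start symbol S down to the words of the sentence."""
    return expand(["S"], sentence.split())


if __name__ == "__main__":
    print(recognize_top_down("Abebe killed the lion"))
    print(recognize_top_down("killed the lion"))
```

The search is driven entirely by the grammar: the input words are consulted only when a predicted terminal has to be matched.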
Parsing Strategies: Bottom-up Parsing
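A bottom-up parser starts from the words and repeatedly replaces a matched right-hand side by its left-hand side until only the start symbol remains. One common way to organize this is shift-reduce search; the sketch below is a minimal backtracking shift-reduce recognizer over a hypothetical toy grammar, offered as an illustration of the strategy rather than as the specific procedure shown in the slides.

```python
# Hypothetical toy grammar as (lhs, rhs) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("NP", ("Name",)),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("lion",)),
    ("Name", ("Abebe",)),
    ("V", ("killed",)),
]


def search(stack, words):
    """Backtracking shift-reduce search; stack and words are tuples."""
    if stack == ("S",) and not words:
        return True
    # Reduce: replace a right-hand side on top of the stack by its left-hand side.
    for lhs, rhs in RULES:
        n = len(rhs)
        if stack[-n:] == rhs and search(stack[:-n] + (lhs,), words):
            return True
    # Shift: move the next input word onto the stack.
    if words:
        return search(stack + (words[0],), words[1:])
    return False


def recognize_bottom_up(sentence):
    """Search from the words of the sentence up to the start symbol S."""
    return search((), tuple(sentence.split()))


if __name__ == "__main__":
    print(recognize_bottom_up("Abebe killed the lion"))
    print(recognize_bottom_up("killed the lion"))
```

In contrast to the top-down strategy, the search here is driven by the input: structure is only proposed when the words on the stack support it.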
Towards Efficient Parsing: Separating Lexical Rules
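One reading of this idea is that rules of the form category → word are moved out of the grammar into a lexicon, so the parser looks up each word's categories once instead of searching through lexical rules. The sketch below follows that reading; the lexicon, the phrase rules, and the function names are hypothetical, and the parser is the same naive top-down recognizer used only to keep the example short.

```python
# Lexical rules stored separately: word -> set of possible categories.
LEXICON = {
    "the": {"Det"},
    "lion": {"N"},
    "Abebe": {"Name"},
    "killed": {"V"},
}

# Phrase-structure rules mention categories only, never individual words.
PHRASE_RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Name"]],
    "VP": [["V", "NP"], ["V"]],
}


def derives(symbols, tags):
    """True if the symbols can derive the sequence of per-word category sets."""
    if not symbols:
        return not tags
    first, rest = symbols[0], symbols[1:]
    if first in PHRASE_RULES:
        return any(derives(rhs + rest, tags) for rhs in PHRASE_RULES[first])
    # Pre-terminal category: check it against the looked-up categories.
    return bool(tags) and first in tags[0] and derives(rest, tags[1:])


def recognize(sentence):
    """Look every word up in the lexicon once, then parse over categories."""
    words = sentence.split()
    if any(word not in LEXICON for word in words):
        return False  # unknown word: no lexical entry
    return derives(["S"], [LEXICON[word] for word in words])


if __name__ == "__main__":
    print(recognize("Abebe killed the lion"))
    print(recognize("Abebe killed the tiger"))
```

With this split, adding vocabulary means adding lexicon entries only, and the number of grammar rules the search has to consider stays small.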
Towards Efficient Parsing: Chunking
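Chunking is commonly understood as grouping words into flat, non-overlapping phrases without building a full parse tree. The sketch below illustrates that idea for noun phrases only, assuming the input is already tagged and that a noun-phrase chunk has the shape optional determiner, any adjectives, one or more nouns. The tag names, the pattern, and the function name are assumptions made for illustration.

```python
def chunk_noun_phrases(tagged):
    """Group Det? Adj* N+ sequences into NP chunks; pass other tokens through.

    tagged: list of (word, tag) pairs.
    Returns a list of (label, words) pairs.
    """
    chunks = []
    i = 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "Det":                          # optional determiner
            j += 1
        while j < len(tagged) and tagged[j][1] == "Adj":   # any adjectives
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1] == "N":     # one or more nouns
            k += 1
        if k > j:
            # At least one noun was found: emit the whole span as one chunk.
            chunks.append(("NP", [word for word, _ in tagged[i:k]]))
            i = k
        else:
            # No noun follows: keep the current token as it is.
            chunks.append((tagged[i][1], [tagged[i][0]]))
            i += 1
    return chunks


if __name__ == "__main__":
    sentence = [("Put", "V"), ("the", "Det"), ("file", "N"),
                ("in", "P"), ("the", "Det"), ("folder", "N")]
    for label, words in chunk_noun_phrases(sentence):
        print(label, " ".join(words))
```

A parser that runs after such a step can treat each chunk as a single unit, which shortens the input it has to search over.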
Applications of parsing
• Machine translation (English → tree operations → Chinese)
• Speech synthesis from parses
• Speech recognition using parsing
  - Put the file in the folder.
  - Put the file and the folder.
• Grammar checking
• Indexing for information retrieval
• Information extraction