
Spurious Ambiguity

• Most parse trees of most NL sentences make no sense.
Parsing
• Given a string of terminals and a CFG, determine if the string can be
  generated by the CFG.
  – Also return a parse tree for the string.
  – Also return all possible parse trees for the string.
• Must search the space of derivations for one that derives the given
  string (a minimal search sketch follows below).
  – Top-Down Parsing: Start searching the space of derivations from the
    start symbol.
  – Bottom-Up Parsing: Start searching the space of reverse derivations
    from the terminal symbols in the string.
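
To make this search concrete, here is a minimal top-down derivation search
in Python. This is an illustrative sketch, not code from the slides: the toy
grammar (taken from the parsing example on the next slide) and the names
parses/accepts are assumptions.

# Top-down search over leftmost derivations for a toy CFG.
# Note: a left-recursive grammar would loop forever; fine for this toy one.
GRAMMAR = {
    "VP":      [["Verb", "NP"]],
    "NP":      [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "Verb":    [["book"]],
    "Det":     [["that"]],
    "Noun":    [["flight"]],
}

def parses(symbols, words):
    """Yield True whenever `symbols` can derive exactly `words`."""
    if not symbols:
        yield words == []                 # derivation done: all words consumed?
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                  # non-terminal: expand each production
        for rhs in GRAMMAR[first]:
            yield from parses(rhs + rest, words)
    elif words and words[0] == first:     # terminal: must match the next word
        yield from parses(rest, words[1:])

def accepts(start, sentence):
    return any(parses([start], sentence.split()))

print(accepts("VP", "book that flight"))  # True
print(accepts("VP", "book flight that"))  # False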
Parsing Example

Parse tree for "book that flight", shown in bracketed form:

(VP (Verb book)
    (NP (Det that)
        (Nominal (Noun flight))))
Treebanks
• English Penn Treebank: the standard corpus for testing syntactic
  parsing; consists of 1.2M words of text from the Wall Street Journal
  (WSJ).
• Typical to train on about 40,000 parsed sentences and test on an
  additional standard disjoint test set of 2,416 sentences.
• Chinese Penn Treebank: 100K words from the Xinhua news service.
• Other corpora exist in many languages; see the Wikipedia article
  "Treebank".

First WSJ Sentence

( (S
(NP-SBJ
(NP (NNP Pierre) (NNP Vinken) )
(, ,)
(ADJP
(NP (CD 61) (NNS years) )
(JJ old) )
(, ,) )
(VP (MD will)
(VP (VB join)
(NP (DT the) (NN board) )
(PP-CLR (IN as)
(NP (DT a) (JJ nonexecutive) (NN director) ))
(NP-TMP (NNP Nov.) (CD 29) )))
(. .) ))
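
Bracketings like this can be loaded programmatically. Here is a sketch
using NLTK (an assumption: NLTK is not mentioned in the slides and must be
installed separately; the outer unlabeled bracket is stripped with the
remove_empty_top_bracketing flag):

from nltk import Tree

wsj_0001 = """( (S
  (NP-SBJ
    (NP (NNP Pierre) (NNP Vinken) )
    (, ,)
    (ADJP
      (NP (CD 61) (NNS years) )
      (JJ old) )
    (, ,) )
  (VP (MD will)
    (VP (VB join)
      (NP (DT the) (NN board) )
      (PP-CLR (IN as)
        (NP (DT a) (JJ nonexecutive) (NN director) ))
      (NP-TMP (NNP Nov.) (CD 29) )))
  (. .) ))"""

# The treebank wraps each sentence in an extra unlabeled bracket;
# remove_empty_top_bracketing drops it so the root is the S node.
tree = Tree.fromstring(wsj_0001, remove_empty_top_bracketing=True)
print(tree.label())              # S
print(" ".join(tree.leaves()))   # Pierre Vinken , 61 years old , will join ...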
Parsing Evaluation Metrics
• PARSEVAL metrics measure the fraction of constituents that match
  between the computed and human parse trees. If P is the system's parse
  tree and T is the human parse tree (the "gold standard"):
  – Recall = (# correct constituents in P) / (# constituents in T)
  – Precision = (# correct constituents in P) / (# constituents in P)
• Labeled precision and labeled recall additionally require the
  non-terminal label on a constituent node to match for the constituent
  to count as correct.
• F1 is the harmonic mean of precision and recall: F1 = 2PR / (P + R).

Computing Evaluation Metrics

Correct Tree T (the gold standard) for "book the flight through Houston":

(S (VP (Verb book)
       (NP (Det the)
           (Nominal (Nominal (Noun flight))
                    (PP (Prep through)
                        (NP (Proper-Noun Houston)))))))

Computed Tree P, which mis-attaches the PP to the VP instead of the Nominal:

(S (VP (VP (Verb book)
           (NP (Det the)
               (Nominal (Noun flight))))
       (PP (Prep through)
           (NP (Proper-Noun Houston)))))

# Constituents: 12 in T, 12 in P
# Correct Constituents: 10
Recall = 10/12 = 83.3%    Precision = 10/12 = 83.3%    F1 = 83.3%
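
This computation is easy to replicate. A sketch in Python (illustrative:
the parseval function name is an assumption; each constituent is encoded
as a (label, start, end) span over word positions, read off the two trees
above, and preterminal nodes are included to match the slide's count of 12):

from collections import Counter

# Constituent spans of the correct tree T and the computed tree P for
# "book the flight through Houston" (word positions 0-5, end-exclusive).
T = [("S",0,5), ("VP",0,5), ("Verb",0,1), ("NP",1,5), ("Det",1,2),
     ("Nominal",2,5), ("Nominal",2,3), ("Noun",2,3), ("PP",3,5),
     ("Prep",3,4), ("NP",4,5), ("Proper-Noun",4,5)]
P = [("S",0,5), ("VP",0,5), ("VP",0,3), ("Verb",0,1), ("NP",1,3),
     ("Det",1,2), ("Nominal",2,3), ("Noun",2,3), ("PP",3,5),
     ("Prep",3,4), ("NP",4,5), ("Proper-Noun",4,5)]

def parseval(gold, test):
    correct = sum((Counter(gold) & Counter(test)).values())  # multiset overlap
    recall, precision = correct / len(gold), correct / len(test)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

print(parseval(T, P))   # (0.8333..., 0.8333..., 0.8333...) -- 10 of 12 match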
Treebank Results
• Results of current state-of-the-art systems on the
English Penn WSJ treebank are slightly greater than
90% labeled precision and recall.
