Unit 5

This document covers sentence construction in NLP, focusing on constituency and dependency parsing techniques for analyzing grammatical structures. It explains feature structures, unification, and their roles in ensuring subject-verb agreement and modular grammar representation. Additionally, it discusses canonical forms, first-order predicate calculus, and the importance of robustness in parsing for real-world applications.

Uploaded by sujanst100
📘 Unit 5: Sentence Construction & Unification

🔷 1. Sentence-Level Construction in NLP


✅ What is Sentence Construction?
Sentence construction in NLP means understanding how a sentence is structured grammatically —
how words are grouped into phrases, and how those phrases form a complete sentence.
This is important because:
 Computers don’t inherently understand language.
 NLP needs structure to do translation, summarization, or question answering.
 Structure helps identify the function of each word: Is it a subject? A verb? A modifier?
For example:
“The boy ate the apple.”

Without structure, a machine sees just a string of words.


With sentence construction rules, it knows:
 “The boy” is a noun phrase (NP)
 “ate the apple” is a verb phrase (VP)

🔷 2. Constituency vs Dependency Parsing


Parsing is the process of analyzing the syntactic structure of a sentence. There are two primary
parsing techniques:

🔹 2.1 Constituency Parsing (Phrase Structure Grammar)


 Focuses on grouping words into nested phrases.
 Based on Context-Free Grammar (CFG).
 Breaks sentences into phrases like NP (noun phrase) and VP (verb phrase).
✅ Example:
Sentence: "The cat sat on the mat."
S
├── NP
│   ├── DT (The)
│   └── NN (cat)
└── VP
    ├── V (sat)
    └── PP
        ├── IN (on)
        └── NP
            ├── DT (the)
            └── NN (mat)

 NP → "The cat"
 VP → "sat on the mat"
 PP → "on the mat"
✅ Use-case: Grammar checking, sentence generation.

🔹 2.2 Context-Free Grammar (CFG)


CFG is a formal way to describe syntactic structures using rules.

✅ Components:
 Non-terminals (A, B, S, NP, VP, etc.)
 Terminals (actual words like “the”, “boy”)
 Production rules (A → B C)
 Start symbol (usually S)

🔹 CFG Rules for our example:

S → NP VP
NP → Det N
VP → V NP
Det → 'the'
N → 'boy' | 'dog'
V → 'saw'

🔹 Explanation:
 S → NP VP: A sentence (S) consists of a noun phrase followed by a verb phrase.
 NP → Det N: A noun phrase is made up of a determiner and a noun.
 VP → V NP: A verb phrase consists of a verb followed by a noun phrase.
 Det → 'the': 'the' is a determiner.
 N → 'boy' | 'dog': Both 'boy' and 'dog' are nouns.
 V → 'saw': 'saw' is a verb.
🧠 Applying the Rules Step-by-Step
Let’s parse: “The boy saw the dog”
Step 1: Start with S (sentence)
→ S → NP VP
Step 2: Expand NP
→ NP → Det N
→ Det = 'the'
→ N = 'boy'
✅ First NP = “the boy”
Step 3: Expand VP
→ VP → V NP
→ V = 'saw'
→ NP → Det N
→ Det = 'the', N = 'dog'
✅ Second NP = “the dog”
Final Structure:

S
├── NP
│   ├── Det (the)
│   └── N (boy)
└── VP
    ├── V (saw)
    └── NP
        ├── Det (the)
        └── N (dog)

This tree tells the computer:


 “the boy” is the subject
 “saw the dog” is the action performed
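The step-by-step derivation above can be mimicked in code. Below is a minimal sketch of a top-down parser hard-wired to the six rules of this toy grammar (an illustration only; real systems use chart parsers such as those in NLTK):

```python
# Toy CFG from this section, written as a Python dict:
# each non-terminal maps to a list of productions (right-hand sides).
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["boy"], ["dog"]],
    "V":   [["saw"]],
}

def parse(symbol, tokens, pos):
    """Try to expand `symbol` starting at tokens[pos].
    Returns (tree, next_position) on success, or None."""
    # Case 1: the symbol rewrites directly to the current word (a terminal rule)
    if pos < len(tokens) and [tokens[pos]] in GRAMMAR.get(symbol, []):
        return (symbol, tokens[pos]), pos + 1
    # Case 2: try each production, parsing its right-hand side left to right
    for production in GRAMMAR.get(symbol, []):
        children, p = [], pos
        for sub in production:
            result = parse(sub, tokens, p)
            if result is None:
                break
            child, p = result
            children.append(child)
        else:
            return (symbol, children), p
    return None

tree, end = parse("S", "the boy saw the dog".split(), 0)
print(tree)
```

Running it prints the parse as nested tuples, matching the tree drawn above.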

🔹 2.3 Dependency Parsing


✅ 1. What is Dependency Parsing?
Dependency parsing is a method of analyzing the grammatical structure of a sentence based on the
relationships between words.
Instead of grouping words into phrases like NP or VP (as in constituency parsing), dependency parsing
focuses on:
 Which word depends on which other word
 Identifying the head (main word) and its dependents

🔹 Key Concepts:
Term Meaning
Head The central word in a phrase (usually a verb or noun)
Dependent A word that adds meaning to the head (like subject, object, modifier)
Arc A directed connection from head to dependent
Root The main verb of the sentence (it has no parent)
Dependency parsing builds a tree structure, with one root and connections (arcs) to all
other words.

✅ 2. Dependency vs. Constituency Parsing


Feature             Constituency Parsing                          Dependency Parsing
Structure           Breaks sentence into phrases (NP, VP)         Focuses on head-dependent word relations
Grammar Type        Based on Context-Free Grammar (CFG)           Based on Dependency Grammar
Units of Analysis   Phrases                                       Words
Output              Parse tree with constituents                  Tree with arcs between head and dependents
Applications        Syntax analysis, phrase structure modeling    Question answering, machine translation

✅ 3. How Dependency Parsing Works – Step-by-Step


Let’s take a sample sentence:
"The cat chased the mouse."

🧾 Step 1: Identify POS Tags


Word POS Explanation
The DT Determiner
cat NN Noun (subject)
chased VBD Verb (main action)
the DT Determiner
mouse NN Noun (object)

🧾 Step 2: Assign the Root


The main verb is usually the root of the tree.
"chased" → ROOT

🧾 Step 3: Find Dependencies


 "cat" → subject of "chased" → nsubj

 "mouse" → object of "chased" → dobj

 "The" → determiner of "cat" → det

 "the" → determiner of "mouse" → det

✅ Final Dependency Tree (Text View):

chased (ROOT)
├── cat (nsubj)
│   └── The (det)
└── mouse (dobj)
    └── the (det)

✅ Dependency Relations Table:


Word Relation Head
chased ROOT -
cat nsubj chased
The det cat
mouse dobj chased
the det mouse
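The triples in the table above are easy to manipulate in code. A small sketch that stores them as (word, relation, head) tuples and groups each head's dependents (hand-built data, not a real parser):

```python
from collections import defaultdict

# (word, relation, head) triples taken from the table above;
# the ROOT has no head, so its head is None.
triples = [
    ("The", "det", "cat"),
    ("cat", "nsubj", "chased"),
    ("chased", "ROOT", None),
    ("the", "det", "mouse"),
    ("mouse", "dobj", "chased"),
]

# Index dependents by their head so the tree can be walked top-down
children = defaultdict(list)
for word, relation, head in triples:
    if head is not None:
        children[head].append((word, relation))

print(children["chased"])  # [('cat', 'nsubj'), ('mouse', 'dobj')]
```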

✅ 4. Common Dependency Relation Tags


Tag Full Form Example
nsubj nominal subject He runs fast
dobj direct object I saw a movie
iobj indirect object She gave him a gift
det determiner The dog barked
amod adjectival modifier The tall building
advmod adverbial modifier She sang beautifully
prep prepositional modifier Book is on the table
pobj object of preposition on the table
These tags come from the Stanford-style dependency scheme used by spaCy's English models, a close relative of the Universal Dependencies (UD) scheme (which uses obj in place of dobj).
✅ 5. Practical Dependency Parsing using spaCy
import spacy

# Load the small English model
nlp = spacy.load("en_core_web_sm")

# Input sentence
doc = nlp("The cat chased the mouse.")

# Print each token, its dependency relation, and its head
for token in doc:
    print(f"{token.text} → {token.dep_} → {token.head.text}")

✅ Output:

The → det → cat
cat → nsubj → chased
chased → ROOT → chased
the → det → mouse
mouse → dobj → chased

✅ Why Dependency Parsing is Powerful:


 Works well for free-word-order languages (e.g., Nepali, German)
 Directly gives subject, object, modifiers — helpful in:
 Chatbots
 Translators
 Grammar correction tools
 More compact than constituency trees

📝 Summary of This Chunk


Topic Description
Dependency Parsing Analyze word-to-word grammatical relationships
Head & Dependent Head = main word, Dependent = modifier/helper
Root The main action verb of the sentence
Common Relations nsubj, dobj, det, prep, pobj, etc.
spaCy Python library for fast and easy dependency parsing

🔷 4. Feature Structures in NLP
Feature Structures are sets of attribute-value pairs used to represent grammatical details like
number, gender, tense, etc.
These structures provide rich information about linguistic elements, useful for checking agreement
and unification.

✅ Example:
For word “cats”:

[Category: noun, Number: plural]

For verb “runs”:

[Category: verb, Tense: present, Person: 3rd, Number: singular]

🔷 5. Unification in NLP
Unification is the process of merging two feature structures. It’s successful only when their values
are compatible (i.e., no contradiction).

✅ Why Unification?
 Ensures subject-verb agreement
 Allows modular representation of grammar rules
 Enables robust parsing and semantic analysis

✅ Example:
NP: [Number: singular]
VP: [Number: singular]
→ ✅ Unification succeeds
But:
NP: [Number: plural], VP: [Number: singular]
→ ❌ Unification fails (incompatible)
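For flat feature structures like the ones above, unification can be sketched as a dictionary merge that fails on any conflicting value (a simplification — full unification also handles nested structures and shared variables):

```python
def unify(fs1, fs2):
    """Merge two flat feature structures (attribute-value dicts).
    Returns the merged structure, or None if a shared feature conflicts."""
    merged = dict(fs1)
    for feature, value in fs2.items():
        if feature in merged and merged[feature] != value:
            return None  # incompatible values -> unification fails
        merged[feature] = value
    return merged

print(unify({"Number": "singular"}, {"Number": "singular", "Person": "3rd"}))
print(unify({"Number": "plural"}, {"Number": "singular"}))  # None
```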

🔷 6. Coordination, Noun Phrase, and Subcategorization


🔹 6.1 Noun Phrase (NP)
 Group of words centered around a noun.
 Includes determiners, adjectives, and modifiers.
"The black dog", "A sweet little child"

🔹 6.2 Coordination
 Joining similar grammatical units using conjunctions.
"John and Mary", "Fast but safe"

🔹 6.3 Subcategorization
 Specifies what kind of arguments a verb can take.
 Helps distinguish between verbs like:
 "give" (needs subject, object, indirect object)
 "sleep" (only needs subject)
"She gave him a book."
Subcat Frame: [Verb → NP NP]
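A subcategorization lexicon can be sketched as a mapping from each verb to the complement frame it requires (toy entries invented for illustration):

```python
# Hypothetical subcat lexicon: verb -> list of required complements
# (the subject is assumed and not listed in the frame)
SUBCAT = {
    "give":  ["NP", "NP"],  # ditransitive: gave [him] [a book]
    "see":   ["NP"],        # transitive:   saw [a movie]
    "sleep": [],            # intransitive: only a subject
}

def licenses(verb, complements):
    """Check whether the observed complements match the verb's frame."""
    return SUBCAT.get(verb) == complements

print(licenses("give", ["NP", "NP"]))  # True
print(licenses("sleep", ["NP"]))       # False
```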

🔷 7. Canonical Form & Expressiveness


🔹 Canonical Form
 The standard or base form of a sentence.
 Used for normalization before analysis.
"The boy was given a book by the teacher."
Canonical: "The teacher gave the boy a book."

🔹 Expressiveness
 The degree to which a language or grammar can represent meaning.
 More expressive grammars can handle complex language and semantics.

🔷 8. Basics of FOPC (First Order Predicate Calculus)


FOPC is a logic-based language used to represent semantic meaning in NLP.

✅ Why Use FOPC?


 Allows machines to reason about sentence meanings.
 Used in question answering, semantic parsing, and inference.

✅ Example:
Sentence: "All humans are mortal."
FOPC: ∀x (Human(x) → Mortal(x))
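Over a finite domain this formula can be checked mechanically, using the equivalence P → Q ≡ ¬P ∨ Q. A toy model with invented individuals:

```python
# Toy model for ∀x (Human(x) → Mortal(x))
domain = ["socrates", "plato", "rock"]
human = {"socrates", "plato"}           # interpretation of Human(x)
mortal = {"socrates", "plato", "rock"}  # interpretation of Mortal(x)

# Human(x) → Mortal(x) rewritten as (not Human(x)) or Mortal(x)
holds = all((x not in human) or (x in mortal) for x in domain)
print(holds)  # True
```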
🔷 9. Semantic Analysis: Syntax-Driven Integration
Semantic analysis is the process of converting a sentence into a structured meaning representation.

🔸 Syntax-Driven Semantic Analysis


 Meaning is derived from the syntactic structure.
 Combines parsing with feature structures and FOPC.

🔷 10. Attachment & Integration, Robustness


🔹 Attachment
 Deciding where a phrase attaches in a sentence.
 Common issue in prepositional phrases.
"She saw the man with a telescope."
Who has the telescope?

🔹 Integration
 Combining syntax, semantics, and features into one analysis.
 Essential for applications like dialogue systems and machine reasoning.

🔹 Robustness
 The parser should handle errors, informal text, or missing grammar.
 Important for real-world applications (e.g., chatbots, web search).

🔷 🧪 Practicals You Can Try


🔹 Stanza (formerly StanfordNLP) – Dependency Parsing

import stanza

# Download the English models (needed once)
stanza.download('en')
nlp = stanza.Pipeline(lang='en', processors='tokenize,pos,lemma,depparse')
doc = nlp("The dog chased the cat.")
for sent in doc.sentences:
    sent.print_dependencies()

🔹 spaCy – Feature Structures & Dependency Parsing

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The boy saw a dog with a telescope.")
for token in doc:
    print(f"{token.text} --> {token.dep_} --> {token.head.text}")

📘 Summary Table
Topic Description
Parsing Analyze sentence structure using grammar
Constituency Parsing Phrase-based structure (NP, VP)
Dependency Parsing Word-to-word grammar relationships
CFG Rule-based grammar system
Feature Structures Attribute-value pairs (e.g., tense, number)
Unification Matching feature structures
Canonical Form Standard sentence format
FOPC Logic representation of meaning
Semantic Analysis Extracting sentence meaning
Attachment & Robustness Error handling & ambiguity resolution
