Practice Questions NLP
Q1. Show the parse tree for the sentence "the dog is angry at the cat".
Q2. Construct the parse tree (top-down) for each of the sentences given below using the given grammar. Which of the sentences are not recognized by the grammar? Justify your answer.
S -> NP VP
VP -> V
VP -> V NP
NP -> Det N
N -> cat
N -> dog
N -> mouse
V -> hates
V -> sneezes
V -> sees
(i) the dog sneezes the cat
(ii) the mouse hates
(iii) the cat the mouse hates
(iv) the mouse hates the mouse
Answer
Sentences (i), (ii), and (iv) are recognized by the grammar. Each parse tree starts with S -> NP VP; the NP is expanded with NP -> Det N, and the VP with VP -> V NP for (i) and (iv) ("sneezes the cat", "hates the mouse") or VP -> V for (ii) ("hates" with no object).
Sentence (iii), "the cat the mouse hates", is not recognized. Justification: after the initial NP "the cat", the remaining words "the mouse hates" would have to form a VP, but both VP rules begin with a V, and "the" can only be a Det. The grammar has no rule that lets an object NP precede the verb, so no top-down derivation covers the whole sentence.
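The membership checks above can be automated with a small brute-force top-down recognizer. This is a minimal sketch in plain Python; the GRAMMAR dictionary transcribes the rules above, and the function names are illustrative:

```python
# Grammar from Q2: nonterminals map to lists of right-hand sides.
# Any symbol not in the dictionary is treated as a terminal word.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "VP":  [["V"], ["V", "NP"]],
    "NP":  [["Det", "N"]],
    "Det": [["the"]],
    "N":   [["cat"], ["dog"], ["mouse"]],
    "V":   [["hates"], ["sneezes"], ["sees"]],
}

def parses(symbol, words, start):
    """Yield every end position where `symbol` can span words[start:end]."""
    if symbol not in GRAMMAR:                      # terminal symbol
        if start < len(words) and words[start] == symbol:
            yield start + 1
        return
    for rhs in GRAMMAR[symbol]:                    # try each production
        ends = {start}
        for part in rhs:                           # extend every partial span
            ends = {e2 for e in ends for e2 in parses(part, words, e)}
        yield from ends

def recognized(sentence):
    words = sentence.split()
    return len(words) in parses("S", words, 0)

for s in ["the dog sneezes the cat", "the mouse hates",
          "the cat the mouse hates", "the mouse hates the mouse"]:
    print(f"{s!r}: {recognized(s)}")   # only sentence (iii) is False
```

Exponential in the worst case, but fine for toy grammars like this one; a chart (CYK/Earley) parser would be the scalable choice.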
Q3. Show all possible parse trees for the sentence "bronze pots clatter" and calculate the probability
of each tree using PCFG.
S -> Noun VP [0.5]
S -> NP Verb [0.5]
VP -> Verb Noun [1.0]
NP -> Adj Noun [1.0]
Adj -> "bronze" [1.0]
Noun -> "bronze" [0.4]
Noun -> "pots" [0.3]
Noun -> "clatter" [0.3]
Verb -> "bronze" [0.3]
Verb -> "pots" [0.5]
Verb -> "clatter" [0.2]
Answer
Parse Tree 1:
S -> Noun VP
Noun -> "bronze"
VP -> Verb Noun
Verb -> "pots"
Noun -> "clatter"
Probability Calculation:
P(T1)=P(S→Noun VP)×P(Noun→"bronze")×P(VP→Verb Noun)×P(Verb→"pots")×P(Noun→"clatter")
P(T1)=0.5×0.4×1.0×0.5×0.3
P(T1)=0.03
Parse Tree 2:
S -> NP Verb
NP -> Adj Noun
Adj -> "bronze"
Noun -> "pots"
Verb -> "clatter"
Probability Calculation:
P(T2)=P(S→NP Verb)×P(NP→Adj Noun)×P(Adj→"bronze")×P(Noun→"pots")×P(Verb→"clatter")
P(T2)=0.5×1.0×1.0×0.3×0.2
P(T2)=0.03
Both parse trees T1 and T2 are possible for the sentence "bronze pots clatter" with the provided
PCFG, and both have the same probability of 0.03. This illustrates how natural language sentences
can have multiple valid parses, and how PCFGs can be used to calculate the probability of each
parse, which can be useful in choosing the most likely parse in natural language processing
applications.
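The two hand computations can be checked mechanically: represent each tree as the list of rules it uses and multiply their probabilities. This is a minimal sketch; the RULE_PROB dictionary simply transcribes the PCFG above:

```python
# PCFG from Q3: each (lhs, rhs) rule maps to its probability.
RULE_PROB = {
    ("S", ("Noun", "VP")): 0.5,
    ("S", ("NP", "Verb")): 0.5,
    ("VP", ("Verb", "Noun")): 1.0,
    ("NP", ("Adj", "Noun")): 1.0,
    ("Adj", ("bronze",)): 1.0,
    ("Noun", ("bronze",)): 0.4,
    ("Noun", ("pots",)): 0.3,
    ("Noun", ("clatter",)): 0.3,
    ("Verb", ("bronze",)): 0.3,
    ("Verb", ("pots",)): 0.5,
    ("Verb", ("clatter",)): 0.2,
}

def tree_prob(rules):
    """A tree's probability is the product of the probabilities of its rules."""
    p = 1.0
    for r in rules:
        p *= RULE_PROB[r]
    return p

# Tree 1: S -> Noun VP, Noun -> bronze, VP -> Verb Noun, Verb -> pots, Noun -> clatter
t1 = [("S", ("Noun", "VP")), ("Noun", ("bronze",)),
      ("VP", ("Verb", "Noun")), ("Verb", ("pots",)), ("Noun", ("clatter",))]
# Tree 2: S -> NP Verb, NP -> Adj Noun, Adj -> bronze, Noun -> pots, Verb -> clatter
t2 = [("S", ("NP", "Verb")), ("NP", ("Adj", "Noun")),
      ("Adj", ("bronze",)), ("Noun", ("pots",)), ("Verb", ("clatter",))]

print(round(tree_prob(t1), 4), round(tree_prob(t2), 4))  # 0.03 0.03
```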
Q4. Create a unigram and bigram word model for the following corpus:
Have fun on the school trip.
Ask the teacher if you have any problem.
I have fun just looking around.
(i) Predict the probability of occurrence of the next word as ‘any’ after the given word ‘have’.
(ii) Predict the probability of occurrence of the next word as ‘fun’ after the given word ‘have’.
(iii) Predict the probability of sentence ‘Ask the teacher if you have any problem’ considering Bigram.
(iv) Predict the probability of sentence ‘Ask the teacher if you have any problem’ considering Trigram.
Answer
Unigram Model
The unigram model considers each word independently. Lowercasing the corpus gives 20 tokens in total (6 + 8 + 6).
P("have") = 3/20
P("any") = 1/20
P("fun") = 2/20
Bigram Model
The bigram model considers a pair of words. To calculate the probability of a word given the previous
word, we use:
P(word2 | word1) = Count(word1 word2) / Count(word1)
Count("have") = 3 (treating "Have" and "have" as the same word)
Count("have any") = 1
Count("have fun") = 2
P("any" | "have") = 1/3
P("fun" | "have") = 2/3
(i) The probability of occurrence of the next word as ‘any’ after the given word ‘have’:
P("any" | "have") = 1/3 ≈ 0.33
(ii) The probability of occurrence of the next word as ‘fun’ after the given word ‘have’:
P("fun" | "have") = 2/3 ≈ 0.67
(iii) The probability of the sentence ‘Ask the teacher if you have any problem’ under the bigram model:
P(Sentence | Bigram)=P(Ask | START)×P(the | Ask)×P(teacher | the)×P(if | teacher)×P(you | if)×P(have | you)×P(any | have)×P(problem | any)×P(END | problem)
P(Ask | START) = Count(START Ask) / Count(START) = 1/3 (since "Ask" is the starting word in one out of three sentences)
P(the | Ask) = Count(Ask the) / Count(Ask) = 1/1
P(teacher | the) = Count(the teacher) / Count(the) = 1/2
P(if | teacher) = Count(teacher if) / Count(teacher) = 1/1
P(you | if) = Count(if you) / Count(if) = 1/1
P(have | you) = Count(you have) / Count(you) = 1/1
P(any | have) = Count(have any) / Count(have) = 1/3
P(problem | any) = Count(any problem) / Count(any) = 1/1
P(END | problem) = Count(problem END) / Count(problem) = 1/1
(assuming every sentence is padded with START and END tokens)
P(Sentence | Bigram) = 1/3 × 1 × 1/2 × 1 × 1 × 1 × 1/3 × 1 × 1
P(Sentence | Bigram) = 1/18
So, the probability of the sentence "Ask the teacher if you have any problem" under the bigram model and the given corpus is 1/18, or approximately 0.0556.
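As a cross-check, the bigram counts and the sentence probability can be recomputed directly from the raw corpus. This is a minimal sketch in plain Python; the <s> and </s> tokens play the role of START and END:

```python
from collections import Counter

# Q4 corpus, lowercased and with sentence-boundary tokens added.
corpus = [
    "have fun on the school trip",
    "ask the teacher if you have any problem",
    "i have fun just looking around",
]
unigrams, bigrams = Counter(), Counter()
for line in corpus:
    tokens = ["<s>"] + line.split() + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p_bigram(w2, w1):
    """Maximum-likelihood estimate P(w2 | w1) = Count(w1 w2) / Count(w1)."""
    return bigrams[(w1, w2)] / unigrams[w1]

sent = ["<s>"] + "ask the teacher if you have any problem".split() + ["</s>"]
p = 1.0
for w1, w2 in zip(sent, sent[1:]):
    p *= p_bigram(w2, w1)
print(p)  # 1/3 * 1/2 * 1/3 = 1/18 ≈ 0.0556
```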
(iv) To calculate the probability of this sentence using the trigram model, we need the probability of each word given the two previous words, i.e., P(word_n | word_n−2, word_n−1). Padding each sentence with two START tokens, every trigram in the test sentence occurs exactly once for each occurrence of its two-word history, except the first one: P(Ask | START START) = 1/3, since "Ask" begins one of the three sentences.
P(Sentence | Trigram) = 1/3 × 1 × 1 × 1 × 1 × 1 × 1 × 1 × 1
P(Sentence | Trigram) = 1/3 ≈ 0.333
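The trigram calculation can be checked the same way. This is a minimal sketch; two <s> padding tokens give every word, including the first, a two-word history:

```python
from collections import Counter

# Q4 corpus, lowercased, padded with two <s> tokens and one </s> token.
corpus = [
    "have fun on the school trip",
    "ask the teacher if you have any problem",
    "i have fun just looking around",
]
trigrams, bigrams = Counter(), Counter()
for line in corpus:
    t = ["<s>", "<s>"] + line.split() + ["</s>"]
    trigrams.update(zip(t, t[1:], t[2:]))
    bigrams.update(zip(t, t[1:]))

# P(w3 | w1 w2) = Count(w1 w2 w3) / Count(w1 w2), multiplied over the sentence.
sent = ["<s>", "<s>"] + "ask the teacher if you have any problem".split() + ["</s>"]
p = 1.0
for w1, w2, w3 in zip(sent, sent[1:], sent[2:]):
    p *= trigrams[(w1, w2, w3)] / bigrams[(w1, w2)]
print(p)  # only P(ask | <s> <s>) = 1/3 is below 1, so p = 1/3
```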