Practice Questions
Q. Convert the following context-free grammar (CFG) into Chomsky Normal Form (CNF):
S -> a B B B | b A A A
A -> a | A s | b B B
B -> b | b S | A a a
where S is the start symbol, A and B are non-terminal symbols, and a, b, and s are terminal symbols.
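This grammar has no ε-productions and no unit productions, so only the TERM step (a fresh nonterminal for each terminal appearing in a long rule) and the BIN step (binarising right-hand sides longer than two) are needed. A minimal Python sketch; the fresh names T_a, T_b, T_s, X1, X2, ... are illustrative choices, not part of the question:

```python
grammar = {
    "S": [["a", "B", "B", "B"], ["b", "A", "A", "A"]],
    "A": [["a"], ["A", "s"], ["b", "B", "B"]],
    "B": [["b"], ["b", "S"], ["A", "a", "a"]],
}
terminals = {"a", "b", "s"}

def to_cnf(grammar):
    new_rules = {}
    term_map = {}      # terminal -> fresh nonterminal (TERM step)
    counter = [0]

    def term_nt(t):
        # introduce T_x -> x the first time terminal x occurs in a long rule
        if t not in term_map:
            term_map[t] = f"T_{t}"
            new_rules.setdefault(f"T_{t}", []).append([t])
        return term_map[t]

    for lhs, prods in grammar.items():
        for rhs in prods:
            if len(rhs) == 1:            # A -> terminal is already CNF
                new_rules.setdefault(lhs, []).append(rhs)
                continue
            # TERM: replace terminals inside long rules
            rhs = [term_nt(s) if s in terminals else s for s in rhs]
            # BIN: binarise from the right until the rule has length 2
            while len(rhs) > 2:
                counter[0] += 1
                aux = f"X{counter[0]}"
                new_rules.setdefault(aux, []).append(rhs[-2:])
                rhs = rhs[:-2] + [aux]
            new_rules.setdefault(lhs, []).append(rhs)
    return new_rules

for lhs, prods in sorted(to_cnf(grammar).items()):
    for rhs in prods:
        print(lhs, "->", " ".join(rhs))
```

For instance, S -> a B B B comes out as S -> T_a X2 with X2 -> B X1, X1 -> B B, and T_a -> a.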
Q. Apply the CYK parsing algorithm to generate the parse table for the input sentence “the pilot flew the plane to Delhi” using the given grammar in CNF.
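The CNF grammar referred to in the question is not reproduced here, so the recognizer sketch below assumes a small illustrative fragment (in particular, it lexicalises "Delhi" directly as an NP); the rules are placeholders, not the grammar from the question:

```python
from itertools import product

lexical = {   # A -> terminal (assumed toy lexicon)
    "the": {"Det"}, "pilot": {"N"}, "flew": {"V"},
    "plane": {"N"}, "to": {"P"}, "Delhi": {"NP"},
}
binary = {    # A -> B C (assumed toy rules)
    ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"},
    ("P", "NP"): {"PP"}, ("VP", "PP"): {"VP"},
    ("NP", "VP"): {"S"},
}

def cyk(words):
    n = len(words)
    # table[i][j] holds the nonterminals that derive words[i:j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = set(lexical.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                      # split point
                for b, c in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary.get((b, c), set())
    return table

words = "the pilot flew the plane to Delhi".split()
table = cyk(words)
print("S" in table[0][len(words) - 1])   # True iff the sentence parses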
Q. Consider the sentence “Sing loudly to Dance joyfully” that needs to be tagged using a Hidden Markov Model
(HMM). Two stochastic probability matrices are given: matrix A for state transitions and matrix B for emission
probabilities, with two states being Adjective (Adj) and Adverb (Adv).
The state transition matrix A and the emission probability matrix B are provided as follows:
Matrix A (State Transition Probabilities):
From \ To    Adj    Adv
Adj          0.6    0.4
Adv          0.3    0.7
Matrix B (Emission Probabilities):
State \ Obs. "Adj"  "Adv"
Adj          0.5    0.5
Adv          0.4    0.6
1. Draw the HMM model with states Adj and Adv, transition probabilities A_ij, and emission probabilities B_ij for the two ambiguous words of the sentence, W1=Sing and W2=Dance.
2. Determine the most likely state (Adj or Adv) when the observed output is “Adverb”, and calculate the corresponding probability for W1=Sing and W2=Dance.
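For part 2, a hedged Viterbi sketch under two assumptions not stated in the question: a uniform initial state distribution, and the observation "Adv" emitted at both time steps (matrix B indexes observations by the tag labels):

```python
import numpy as np

states = ["Adj", "Adv"]
A = np.array([[0.6, 0.4],       # transition P(next state | Adj)
              [0.3, 0.7]])      # transition P(next state | Adv)
B = np.array([[0.5, 0.5],       # emission P("Adj"/"Adv" | Adj)
              [0.4, 0.6]])      # emission P("Adj"/"Adv" | Adv)
obs = [1, 1]                    # "Adv" observed for W1=Sing and W2=Dance
pi = np.array([0.5, 0.5])       # assumed uniform start distribution

v = pi * B[:, obs[0]]           # Viterbi scores after the first word
back = []
for o in obs[1:]:
    trans = v[:, None] * A      # trans[i, j] = v[i] * A[i][j]
    back.append(trans.argmax(axis=0))
    v = trans.max(axis=0) * B[:, o]

path = [int(v.argmax())]        # backtrace the best state sequence
for ptr in reversed(back):
    path.append(int(ptr[path[-1]]))
path.reverse()
print([states[i] for i in path], float(v.max()))   # ['Adv', 'Adv'] 0.126
```

With these numbers the best path is Adv → Adv with probability 0.5 × 0.6 × 0.7 × 0.6 = 0.126.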
Q. The following shows a simple context-free grammar (CFG) for a fragment of English. Show the parse tree for the sentence “the dog is angry at the cat”.
Q. Show all possible parse trees for the sentence "bronze pots clatter" and calculate the probability
of each tree using PCFG.
S -> Noun VP [0.5]
S -> NP Verb [0.5]
VP -> Verb Noun [1.0]
NP -> Adj Noun [1.0]
Adj -> "bronze" [1.0]
Noun -> "bronze" [0.4]
Noun -> "pots" [0.3]
Noun -> "clatter" [0.3]
Verb -> "bronze" [0.3]
Verb -> "pots" [0.5]
Verb -> "clatter" [0.2]
Answer
Parse Tree 1:
S -> Noun VP
Noun -> "bronze"
VP -> Verb Noun
Verb -> "pots"
Noun -> "clatter"
Probability Calculation:
P(T1) = P(S→Noun VP) × P(Noun→"bronze") × P(VP→Verb Noun) × P(Verb→"pots") × P(Noun→"clatter")
P(T1) = 0.5 × 0.4 × 1.0 × 0.5 × 0.3 = 0.03
Parse Tree 2:
S -> NP Verb
NP -> Adj Noun
Adj -> "bronze"
Noun -> "pots"
Verb -> "clatter"
Probability Calculation:
P(T2)=P(S→NP Verb)×P(NP→Adj Noun)×P(Adj→"bronze")×P(Noun→"pots")×P(Verb→"clatter")
P(T2) = 0.5 × 1.0 × 1.0 × 0.3 × 0.2 = 0.03
Both parse trees T1 and T2 are possible for the sentence "bronze pots clatter" under the given PCFG, and both have the same probability, 0.03. This illustrates that a natural language sentence can have multiple valid parses, and that a PCFG assigns each parse a probability that can be used to choose the most likely one in natural language processing applications.
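Since a tree's probability under a PCFG is just the product of the probabilities of the rules it uses, the arithmetic above is easy to verify mechanically. A minimal Python sketch (the tuple encoding of rules is an illustrative choice):

```python
from functools import reduce

rule_prob = {   # the grammar above, keyed by (lhs, rhs)
    ("S", ("Noun", "VP")): 0.5,  ("S", ("NP", "Verb")): 0.5,
    ("VP", ("Verb", "Noun")): 1.0, ("NP", ("Adj", "Noun")): 1.0,
    ("Adj", ("bronze",)): 1.0,
    ("Noun", ("bronze",)): 0.4, ("Noun", ("pots",)): 0.3,
    ("Noun", ("clatter",)): 0.3,
    ("Verb", ("bronze",)): 0.3, ("Verb", ("pots",)): 0.5,
    ("Verb", ("clatter",)): 0.2,
}

def tree_prob(rules):
    # probability of a tree = product of the probabilities of its rules
    return reduce(lambda p, r: p * rule_prob[r], rules, 1.0)

t1 = [("S", ("Noun", "VP")), ("Noun", ("bronze",)),
      ("VP", ("Verb", "Noun")), ("Verb", ("pots",)), ("Noun", ("clatter",))]
t2 = [("S", ("NP", "Verb")), ("NP", ("Adj", "Noun")),
      ("Adj", ("bronze",)), ("Noun", ("pots",)), ("Verb", ("clatter",))]
print(tree_prob(t1), tree_prob(t2))   # 0.03 0.03
```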
Q. Given the following PCFG rules, show all possible parse trees for the sentence "red birds sing"
and calculate the probability of each tree.
S -> Noun VP [0.6]
S -> NP Verb [0.4]
VP -> Verb Noun [0.9]
VP -> Verb Adj [0.1]
NP -> Adj Noun [1.0]
Adj -> "red" [1.0]
Noun -> "red" [0.2]
Noun -> "birds" [0.5]
Noun -> "sing" [0.3]
Verb -> "red" [0.1]
Verb -> "birds" [0.2]
Verb -> "sing" [0.7]
Q. Using the PCFG provided, construct all possible parse trees for the sentence "green fish swim"
and compute their respective probabilities.
S -> NP VP [0.7]
S -> Noun VP [0.3]
VP -> Verb NP [0.8]
VP -> Verb Noun [0.2]
NP -> Adj Noun [1.0]
Adj -> "green" [1.0]
Noun -> "green" [0.3]
Noun -> "fish" [0.4]
Noun -> "swim" [0.3]
Verb -> "green" [0.2]
Verb -> "fish" [0.3]
Verb -> "swim" [0.5]
Q. Create a unigram and bigram word model for the following corpus:
Have fun on the school trip.
Ask the teacher if you have any problem.
I have fun just looking around.
(i) Predict the probability that the next word is ‘any’ given the current word ‘have’.
(ii) Predict the probability that the next word is ‘fun’ given the current word ‘have’.
(iii) Predict the probability of the sentence ‘Ask the teacher if you have any problem’ using the bigram model.
(iv) Predict the probability of the sentence ‘Ask the teacher if you have any problem’ using the trigram model.
Answer
Unigram model
The unigram model considers each word independently.
P("have")=2/18
P("any")=1/18
P ("fun")=2/82
Bigram Model
The bigram model considers a pair of words. To calculate the probability of a word given the previous
word, we use:
P(word2 | word1) = Count(word1 word2) / Count(word1)
Count("have any") = 1
Count("have fun") = 2
Count("have") = 3
P("any" | "have") = 1/3
P("fun" | "have") = 2/3
(i) The probability that the next word is ‘any’ given the current word ‘have’:
P("any" | "have") = 1/3 ≈ 0.33
(ii) The probability that the next word is ‘fun’ given the current word ‘have’:
P("fun" | "have") = 2/3 ≈ 0.67
(iii) The probability of the sentence ‘Ask the teacher if you have any problem’ under the bigram model:
P(Sentence | Bigram) = P(Ask | START) × P(the | Ask) × P(teacher | the) × P(if | teacher) × P(you | if) × P(have | you) × P(any | have) × P(problem | any) × P(END | problem)
P(Ask | START) = Count(START Ask) / Count(START) = 1/3 (since "Ask" starts one of the three sentences)
P(the | Ask) = Count(Ask the) / Count(Ask) = 1/1
P(teacher | the) = Count(the teacher) / Count(the) = 1/2
P(if | teacher) = Count(teacher if) / Count(teacher) = 1/1
P(you | if) = Count(if you) / Count(if) = 1/1
P(have | you) = Count(you have) / Count(you) = 1/1
P(any | have) = Count(have any) / Count(have) = 1/3
P(problem | any) = Count(any problem) / Count(any) = 1/1
P(END | problem) = Count(problem END) / Count(problem) = 1/1
(assuming that every sentence ending with "problem" is followed by an END token)
P(Sentence | Bigram) = 1/3 × 1 × 1/2 × 1 × 1 × 1 × 1/3 × 1 × 1
P(Sentence | Bigram) = 1/18
So, the probability of the sentence "Ask the teacher if you have any problem" under the bigram model, given the provided corpus, is 1/18, or approximately 0.0556.
(iv) To calculate the probability of this sentence using the trigram model, we find the probability of each word given the two previous words, i.e., P(word_n | word_{n-2}, word_{n-1}). Every trigram of the sentence occurs exactly once in the corpus with a unique continuation, so each factor is 1 except the first: as with the bigram model, P(Ask | START, START) = 1/3, since "Ask" begins one of the three sentences.
P(Sentence | Trigram) = 1/3 × 1 × 1 × 1 × 1 × 1 × 1 × 1 × 1 = 1/3
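For a quick check of these counts, here is a minimal Python sketch; lowercasing and the <s>/</s> sentence markers are modelling assumptions, not given in the question:

```python
from collections import Counter

corpus = [
    "Have fun on the school trip",
    "Ask the teacher if you have any problem",
    "I have fun just looking around",
]
sents = [["<s>"] + s.lower().split() + ["</s>"] for s in corpus]

words = [w for s in sents for w in s if w not in ("<s>", "</s>")]
unigrams = Counter(words)
bigrams = Counter((a, b) for s in sents for a, b in zip(s, s[1:]))

def p_bigram(w2, w1):
    # P(w2 | w1) = Count(w1 w2) / Count(w1 as a bigram context)
    ctx = sum(c for (a, _), c in bigrams.items() if a == w1)
    return bigrams[(w1, w2)] / ctx

print(unigrams["have"] / len(words))   # P(have) = 3/20
print(p_bigram("any", "have"))         # 1/3
print(p_bigram("fun", "have"))         # 2/3

sent = ["<s>"] + "ask the teacher if you have any problem".split() + ["</s>"]
p = 1.0
for w1, w2 in zip(sent, sent[1:]):
    p *= p_bigram(w2, w1)
print(p)                               # 1/18 ≈ 0.0556
```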
Q. Consider the following corpus and answer Questions 1–5 below.
Corpus:
Time flies like an arrow.
Fruit flies like a banana.
She likes to have fruit for breakfast.
Question 1:
Create a unigram model for the above corpus and use it to:
(i) Predict the probability of occurrence of the word 'like'.
(ii) Predict the probability of occurrence of the word 'flies'.
Question 2:
Create a bigram model for the same corpus and use it to:
(i) Predict the probability of the next word being 'flies' given the current word is 'Time'.
(ii) Predict the probability of the next word being 'an' given the current word is 'like'.
Question 3:
Using the bigram model, predict the probability of the sentence 'Fruit flies like a banana'
occurring in the corpus.
Question 4:
Create a trigram model for the corpus and use it to:
(i) Predict the probability of the next word being 'arrow' given the previous two words are
'Time flies'.
(ii) Predict the probability of the next word being 'for' given the previous two words are
'to have'.
Question 5:
Using the trigram model, predict the probability of the sentence 'She likes to have fruit for
breakfast' occurring in the corpus.