2019 Main
MAIN EXAMINATION
JUNE 2019
INSTRUCTIONS:
Attempt all questions.
Please start each question on a new page.
Show all workings where appropriate.
Write in ink. Rough work can be done in pencil on the reverse side of each
page and will not be marked.
You are advised to leave your answers as fractions where possible.
For any questions needing part-of-speech tagging, use the tagset in
Appendix A.
University of KwaZulu-Natal, June 2019 Examination: COMP316 W1 2
Question 1 - REGEX, FSAs, FSTs, Words and Word Forms [20 marks]
1.1 The transition table (matrix) in Table 1 defines the duck language. Study the table and answer the
following questions:
Table 1
State   k    w    a      k    !    ϵ
0       1
1            2
2                 2, 3
3                        4
4                             5
5:
[3, 1]
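The following is a minimal sketch of how a non-deterministic transition table such as Table 1 can be simulated. The transitions are an assumed reading of the table (0 to 1 on k, 1 to 2 on w, 2 to 2 or 3 on a, 3 to 4 on k, 4 to 5 on !, state 5 accepting, no ϵ moves); the names `TRANSITIONS`, `ACCEPTING` and `accepts` are illustrative only.

```python
# Assumed reading of Table 1: one entry per row, no epsilon transitions.
TRANSITIONS = {
    (0, "k"): {1},
    (1, "w"): {2},
    (2, "a"): {2, 3},   # non-deterministic: stay in 2 or move on to 3
    (3, "k"): {4},
    (4, "!"): {5},
}
ACCEPTING = {5}


def accepts(text, start=0):
    """Simulate the automaton by tracking every state reachable so far."""
    current = {start}
    for symbol in text:
        current = {
            nxt
            for state in current
            for nxt in TRANSITIONS.get((state, symbol), set())
        }
        if not current:          # no state can read this symbol
            return False
    return bool(current & ACCEPTING)


for candidate in ["kwak!", "kwaaak!", "kwk!", "kwak"]:
    print(candidate, accepts(candidate))
```

Tracking a set of states avoids backtracking: the choice at state 2 is kept open until the input resolves it.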
1.2 Draw the Finite State Transducer (FST) for the following English Orthographic rule. Assume the
exceptions given in the rule are all the exceptions that exist to the rule.
Words ending in an ‘o’ preceded by a consonant usually take the suffix ‘es’ to form the plural, e.g.
potato-es, volcano-es, torpedo-es. Exceptions to this rule are: pianos, Eskimos.
[7]
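As a quick way to check the behaviour an FST for this rule should have, the rule can be written as a plain function. This is a sketch of the rule only, not of the transducer itself. It assumes that every other noun simply takes ‘s’ and that the two listed exceptions are the only ones; `pluralize` is a hypothetical helper name.

```python
VOWELS = set("aeiou")
EXCEPTIONS = {"piano", "eskimo"}   # the only exceptions named in the question


def pluralize(noun):
    """Add 'es' after consonant + 'o', otherwise 's' (assumed default)."""
    base = noun.lower()
    consonant_o = (
        len(base) >= 2
        and base.endswith("o")
        and base[-2].isalpha()
        and base[-2] not in VOWELS
    )
    if consonant_o and base not in EXCEPTIONS:
        return noun + "es"
    return noun + "s"


for noun in ["potato", "volcano", "torpedo", "piano", "Eskimo"]:
    print(noun, "->", pluralize(noun))
```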
1.3 Figure 1 shows a transducer(s) for the consonant-doubling spelling rule for English verbs. The rule
doubles a consonant if it is preceded by a consonant and a vowel when converting the verb into its -ing
or -ed form, e.g. begin-beginning, beg-begging. Use Figure 1 to answer the following questions:
Figure 1 (transducer diagram with states q1 to q8)
[2, 7]
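The doubling rule as worded in 1.3 can be sketched as a small function for checking example outputs. This follows only the wording of the question (final consonant preceded by a vowel that is itself preceded by a consonant); it ignores stress and letters such as w, x and y, so it over-applies to some real English verbs. `add_suffix` is a hypothetical name and does not model the states of Figure 1.

```python
VOWELS = set("aeiou")


def add_suffix(verb, suffix):
    """Attach 'ing' or 'ed', doubling a final consonant after consonant + vowel."""
    v = verb.lower()
    ends_cvc = (
        len(v) >= 3
        and v[-1].isalpha() and v[-1] not in VOWELS    # final consonant
        and v[-2] in VOWELS                            # preceded by a vowel
        and v[-3].isalpha() and v[-3] not in VOWELS    # preceded by a consonant
    )
    return verb + (verb[-1] if ends_cvc else "") + suffix


for verb in ["begin", "beg", "jump", "read"]:
    print(verb, "->", add_suffix(verb, "ing"))
```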
[3]
2.2 Use the POS-tagged corpus in Figure 2 to answer the following questions:
A/DT bat/NN is/VBZ flying/VBG. A/DT flying/JJ bat/NN is/VBZ scary/JJ. I/PRP saw/VBZ
Mary/NN flying/VBG a/DT bat/NN. She/PRP likes/VBZ flying/VBG a/DT scary/JJ bat/NN.
I/PRP like/VBZ flying/VBG to/TO JHB/NN. Bats/NN are/VBZ flying/JJ mammals/NN.
Flying/VBG planes/NN is/VBZ scary/JJ.
Figure 2
a. Assuming a bigram model, determine which of the following phrases is the most likely
sequence of the words scary, flying, bat, a, and is, given the corpus in Figure 2:
i. A scary bat is flying.
ii. Flying a bat is scary.
b. Given the training data in Figure 2, use a stochastic POS tagger with bigram transition
probabilities to determine whether the word “flying” in the phrase “a flying Mary is scary”
is a JJ or VBG. Assume that the word “a” is a determiner and the word “Mary” is a NN.
[9, 8]
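The counting behind both parts of 2.2 can be sketched as follows. The sketch makes several assumptions that the question does not state: words are lower-cased, each sentence is prefixed with a start marker `<s>`, no end marker and no smoothing are used, and a candidate tag for a word is scored as P(tag | previous tag) × P(word | tag) × P(next tag | tag). All function names are illustrative. Exact fractions are used, in line with the instructions on the cover page.

```python
from collections import Counter
from fractions import Fraction

# Figure 2, one sentence per entry, sentence-final periods dropped.
CORPUS = [
    "A/DT bat/NN is/VBZ flying/VBG",
    "A/DT flying/JJ bat/NN is/VBZ scary/JJ",
    "I/PRP saw/VBZ Mary/NN flying/VBG a/DT bat/NN",
    "She/PRP likes/VBZ flying/VBG a/DT scary/JJ bat/NN",
    "I/PRP like/VBZ flying/VBG to/TO JHB/NN",
    "Bats/NN are/VBZ flying/JJ mammals/NN",
    "Flying/VBG planes/NN is/VBZ scary/JJ",
]


def parse(sentence):
    """Split 'word/TAG' tokens into (lower-cased word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        word, tag = token.rsplit("/", 1)
        pairs.append((word.lower(), tag))
    return pairs


def ngram_counts(sequences):
    """Unigram and bigram counts with an assumed <s> start marker."""
    uni, bi = Counter(), Counter()
    for seq in sequences:
        seq = ["<s>"] + list(seq)
        uni.update(seq)
        bi.update(zip(seq, seq[1:]))
    return uni, bi


def ratio(num, den):
    """Maximum-likelihood estimate as an exact fraction (0 if unseen history)."""
    return Fraction(num, den) if den else Fraction(0)


SENTENCES = [parse(s) for s in CORPUS]
word_uni, word_bi = ngram_counts([[w for w, _ in s] for s in SENTENCES])
tag_uni, tag_bi = ngram_counts([[t for _, t in s] for s in SENTENCES])
emit = Counter(pair for s in SENTENCES for pair in s)


def phrase_prob(phrase):
    """Product of bigram probabilities P(w_i | w_{i-1}) over the phrase."""
    tokens = ["<s>"] + phrase.lower().split()
    prob = Fraction(1)
    for prev, word in zip(tokens, tokens[1:]):
        prob *= ratio(word_bi[(prev, word)], word_uni[prev])
    return prob


def tag_score(prev_tag, tag, word, next_tag):
    """P(tag | prev_tag) * P(word | tag) * P(next_tag | tag)."""
    return (
        ratio(tag_bi[(prev_tag, tag)], tag_uni[prev_tag])
        * ratio(emit[(word, tag)], tag_uni[tag])
        * ratio(tag_bi[(tag, next_tag)], tag_uni[tag])
    )


for phrase in ["A scary bat is flying", "Flying a bat is scary"]:
    print(phrase, phrase_prob(phrase))
for tag in ["JJ", "VBG"]:
    print(tag, tag_score("DT", tag, "flying", "NN"))
```

Changing any of the assumptions (for example adding an end-of-sentence marker) changes the individual probabilities, so the working should state which convention is used.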
3.1 Use the Context Free Grammar (CFG) in Figure 3 to answer the following questions:
a. Complete the Context Free Grammar in Figure 3 so that it generates (at least) the
following sentences:
i. Vincent loved Mia.
ii. Mia gave the gun to the man.
iii. Sboniso knew the robber that fell.
GRAMMAR                  LEXICON
S → NP VP                ProperNoun → Vincent | Mia
NP → ProperNoun          Det → a | her
NP → ProperNoun Rel      Noun → gun | robber
Nominal → Noun           Wh → who | that
Nominal → Noun Rel       Prep → to
Rel → Wh VP              IV → fell
VP → IV                  TV → loved
VP → TV NP               DV → gave
VP → DV NP PP
PP → Prep NP
Figure 3
b. Convert the completed grammar from your answer to Question 3.1 (a) into Chomsky
Normal Form (CNF).
[5, 4]
3.2 Use the grammar in Figure 4, and chart[0] of an Earley parser in Table 3, to construct chart[1] and
chart[2] of the Earley parser on the sentence “Menzi called Nompilo from Durban”. Each entry in your
charts should have the state, its end points, and the operation that placed it in the chart, as shown in
Table 3.
GRAMMAR                  LEXICON
S → NP VP                Noun → “Menzi”
NP → NP PP               Noun → “Nompilo”
NP → Noun                Noun → “Durban”
VP → Verb NP             Verb → “called”
VP → VP PP               Prep → “from”
PP → Prep NP
Figure 4
Table 3
Chart [0]   S → . NP VP     [0, 0]   predictor
            NP → . NP PP    [0, 0]   predictor
Chart [1]   …               …        …
[11]
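The predictor, scanner and completer operations can be sketched as a small Earley recognizer over the grammar in Figure 4. This is an illustrative sketch under stated assumptions, not a model answer: it seeds chart[0] with a dummy state `GAMMA -> . S`, treats Noun, Verb and Prep as parts of speech that are only scanned (never predicted), and represents a state as `(lhs, rhs, dot, origin)`. The exact set and order of states may differ from the convention expected in Table 3.

```python
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("Noun",)],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "Noun": {"Menzi", "Nompilo", "Durban"},
    "Verb": {"called"},
    "Prep": {"from"},
}


def earley(words, start="S"):
    """Return the list of charts; each entry is ((lhs, rhs, dot, origin), op)."""
    charts = [[] for _ in range(len(words) + 1)]

    def add(k, state, op):
        if state not in [s for s, _ in charts[k]]:
            charts[k].append((state, op))

    add(0, ("GAMMA", (start,), 0, 0), "dummy start state")  # assumed seed
    for k in range(len(words) + 1):
        i = 0
        while i < len(charts[k]):            # chart[k] grows while we scan it
            (lhs, rhs, dot, origin), _ = charts[k][i]
            i += 1
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in GRAMMAR:           # predictor: expand a non-terminal
                    for prod in GRAMMAR[nxt]:
                        add(k, (nxt, prod, 0, k), "predictor")
                elif k < len(words) and words[k] in LEXICON.get(nxt, ()):
                    # scanner: the next word matches the expected part of speech
                    add(k + 1, (nxt, (words[k],), 1, k), "scanner")
            else:                            # completer: advance waiting states
                for (l2, r2, d2, o2), _ in list(charts[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(k, (l2, r2, d2 + 1, o2), "completer")
    return charts


def show(k, chart):
    """Print each state with its end points [origin, k] and its operation."""
    for (lhs, rhs, dot, origin), op in chart:
        body = " ".join(rhs[:dot] + (".",) + rhs[dot:])
        print(f"{lhs} -> {body}  [{origin}, {k}]  {op}")


charts = earley("Menzi called Nompilo from Durban".split())
for k in (1, 2):
    print(f"Chart [{k}]")
    show(k, charts[k])
```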
4.2 Consider the following sentence, L, in Figure 5. The word “line” is in focus.
About three years ago, he nearly gave up because he had nothing to sell, now his
shelves are full, and towels and clothes hang from a line every day of the week.
Figure 5
a. Give a collocational feature vector for the word “line” in L, given a window size of six (6)
(3 words to the left and 3 words to the right).
b. Give a bag-of-words feature vector for the word “line” in L, given a window of size 6, and
the following word feature list: [written, school, speech, day, major, hang, sell, nothing,
rope, week].
[4, 3]
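Both feature vectors can be sketched in a few lines. The sketch assumes simple lower-cased tokenisation with punctuation dropped, a collocational vector made of words only (a fuller version would usually pair each word with its part-of-speech tag), and bag-of-words entries that count occurrences inside the window. The names `context_window`, `collocational` and `bag_of_words` are illustrative.

```python
import re

SENTENCE = (
    "About three years ago, he nearly gave up because he had nothing to sell, "
    "now his shelves are full, and towels and clothes hang from a line every "
    "day of the week."
)
FEATURE_WORDS = [
    "written", "school", "speech", "day", "major",
    "hang", "sell", "nothing", "rope", "week",
]


def context_window(tokens, target, left=3, right=3):
    """Return the words to the left and right of the first match of target."""
    i = tokens.index(target)
    return tokens[max(0, i - left):i], tokens[i + 1:i + 1 + right]


tokens = re.findall(r"[a-z]+", SENTENCE.lower())
left, right = context_window(tokens, "line")

# Collocational features keep word order and position around the target.
collocational = left + right

# Bag-of-words features ignore position: one count per feature word.
bag_of_words = [collocational.count(word) for word in FEATURE_WORDS]

print(collocational)
print(bag_of_words)
```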
4.3 State any two (2) advantages of using word vectors for meaning representation over using
thesaurus-based techniques.
[2]
Figure 6
5.2 A sequence-to-sequence model for Neural Machine Translation has two major components: the
encoder and the decoder. Briefly explain the function of each of these components in Neural
Machine Translation.
[4]
S → VP                                                                 1.0
VP → Verb NP                                                           0.75
VP → Verb NP PP                                                        0.1
VP → Verb PP NP                                                        0.15
NP → NP PP                                                             0.3
NP → Det Noun                                                          0.7
PP → Prep Noun                                                         1.0
Det → the | …                                                          0.1
Verb → cut | ask | find | …                                            0.1
Prep → with | in | …                                                   0.1
Noun → envelope | grandma | scissors | men | shirt | summer | …        0.1
Figure 7
c. Given the direct translation for the Spanish sentence “Cortar la camisa con las tijeras” in
Figure 8, use your answer to Question 5.3 (b) to determine which of the two sentences in
Table 4 is the most likely fluent translation of the Spanish sentence.
Table 4
Figure 8
[6, 5, 4]
Appendix A