
UNIVERSITY OF KWAZULU-NATAL

SCHOOL OF MATHEMATICS, STATISTICS


& COMPUTER SCIENCE

MAIN EXAMINATION

JUNE 2019

COURSE AND CODE


Natural Language Processing COMP316 W1

DURATION: 3 Hours TOTAL MARKS: 100

INTERNAL EXAMINER: Mr. E. Jembere


EXTERNAL EXAMINER: Dr. P. Q. T. Mtshali, DUT

THIS EXAM CONSISTS OF NINE (9) PAGES INCLUDING THIS ONE.


PLEASE ENSURE THAT YOU HAVE ALL PAGES.

INSTRUCTIONS:
 Attempt all questions.
 Please start each question on a new page.
 Show all workings where appropriate.
 Write in ink. Rough work can be done in pencil on the reverse side of each
page and will not be marked.
 You are advised to leave your answers as fractions where possible.
 For any questions needing part-of-speech tagging, use the tagset in
Appendix A.
University of KwaZulu-Natal, June 2019 Examination: COMP316 W1 2

Question 1- REGEX, FSAs, FSTs, Words and Word Forms [20 marks]
1.1 The transition table (matrix) in Table 1 defines the duck language. Study the table and answer the
following questions:

Table 1

        k    w    a      k    !    ϵ
0       1
1            2
2                 2, 3
3                        4
4                             5
5:

a. Draw a graphical representation of the FSA represented by Table 1.


b. Write a regular expression for the FSA you gave in Question (1.1.a).

[3, 1]
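Not part of the question, but the kind of transition table in Table 1 can be simulated directly. The sketch below assumes the usual diagonal reading of the table (state 0 moves on ‘k’, state 1 on ‘w’, and so on, with state 5 accepting); that reading is an assumption reconstructed from the layout, not something the table states explicitly.

```python
# Minimal sketch of simulating a nondeterministic FSA from a transition table.
# ASSUMPTION: the table is read diagonally (0 -k-> 1, 1 -w-> 2, 2 -a-> {2,3},
# 3 -k-> 4, 4 -!-> 5) and state 5 is the accepting state.
TRANSITIONS = {
    (0, "k"): {1},
    (1, "w"): {2},
    (2, "a"): {2, 3},   # nondeterministic: loop on 'a' or move on
    (3, "k"): {4},
    (4, "!"): {5},
}
ACCEPTING = {5}

def accepts(s: str) -> bool:
    """Track the set of states the NFA could be in after each input symbol."""
    states = {0}
    for ch in s:
        states = set().union(*(TRANSITIONS.get((q, ch), set()) for q in states))
        if not states:
            return False
    return bool(states & ACCEPTING)

print(accepts("kwak!"), accepts("kwaaak!"), accepts("kwk!"))
```

Keeping a *set* of current states is the standard way to run an NFA without first determinising it.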

1.2 Draw the Finite State Transducer (FST) for the following English orthographic rule. Assume the
exceptions given in the rule are all the exceptions that exist to the rule.

Words ending in an ‘o’ preceded by a consonant usually take the suffix ‘-es’ to form the plural, e.g.
potato-es, volcano-es, torpedo-es. Exceptions to this rule are: pianos, Eskimos.

[7]

1.3 Figure 1 shows a transducer for the consonant doubling spelling rule for English verbs. The rule
doubles a consonant if it is preceded by a consonant and a vowel when converting the verb into its -ing
or -ed form, e.g. begin-beginning, beg-begging. Use Figure 1 to answer the following questions:

[FST diagram with states q1, q2, q3, q4, q5, q6, q7, q8; the arc labels are not recoverable in this copy.]

Figure 1

a. Is the FST deterministic or non-deterministic on the input? Justify your answer.


b. Give the State Transition Table for the Transducer in Figure 1.

[2, 7]

Question 2- Language Models and Part of Speech [20 marks]


2.1 Table 2 shows an excerpt of bigram parameters for an English language model. Use the data
in Table 2 to generate the most likely six-word phrase starting with the word “I”.
Table 2

         I      food   Chinese  for    want   like   to     eat    lunch
for      0      0.047  0.23     0      0      0      0      0      0.42
I        0.002  0      0        0.001  0.45   0.33   0      0.036  0
Chinese  0      0.52   0        0      0      0      0.004  0      0.25
like     0      0.33   0.65     0      0      0      0.3    0.003  0.123
want     0.002  0.23   0.0065   0      0      0      0.66   0.001  0.09
food     0      0      0        0.33   0      0.012  0.002  0      0
to       0.003  0      0.0027   0      0      0.12   0      0.28   0.006
eat      0      0.006  0.37     0      0      0      0      0      0.007
lunch    0      0      0        0.004  0      0      0.008  0      0

[3]
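One way to work with a table like Table 2 is to pick, at each step, the most probable successor of the previous word. The sketch below does exactly that; note that greedy search is not guaranteed to find the globally most likely phrase in general, though it is a reasonable way to attack a small table like this one. The zero entries are omitted from the dictionary.

```python
# Hedged sketch: greedy next-word generation from the Table 2 bigram
# parameters (rows are the previous word, columns the next word).
BIGRAM = {
    "for":     {"food": 0.047, "Chinese": 0.23, "lunch": 0.42},
    "I":       {"I": 0.002, "for": 0.001, "want": 0.45, "like": 0.33, "eat": 0.036},
    "Chinese": {"food": 0.52, "to": 0.004, "lunch": 0.25},
    "like":    {"food": 0.33, "Chinese": 0.65, "to": 0.3, "eat": 0.003, "lunch": 0.123},
    "want":    {"I": 0.002, "food": 0.23, "Chinese": 0.0065, "to": 0.66, "eat": 0.001, "lunch": 0.09},
    "food":    {"for": 0.33, "like": 0.012, "to": 0.002},
    "to":      {"I": 0.003, "Chinese": 0.0027, "like": 0.12, "eat": 0.28, "lunch": 0.006},
    "eat":     {"food": 0.006, "Chinese": 0.37, "lunch": 0.007},
    "lunch":   {"for": 0.004, "to": 0.008},
}

def greedy_phrase(start: str, length: int) -> list[str]:
    """Extend the phrase one word at a time with the argmax successor."""
    words = [start]
    while len(words) < length:
        nxt = max(BIGRAM[words[-1]].items(), key=lambda kv: kv[1])
        words.append(nxt[0])
    return words

print(" ".join(greedy_phrase("I", 6)))
```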

2.2 Use the POS-tagged corpus in Figure 2 to answer the following questions:

A/DT bat/NN is/VBZ flying/VBG. A/DT flying/JJ bat/NN is/VBZ scary/JJ. I/PRP saw/VBZ
Mary/NN flying/VBG a/DT bat/NN. She/PRP likes/VBZ flying/VBG a/DT scary/JJ bat/NN.
I/PRP like/VBZ flying/VBG to/TO JHB/NN. Bats/NN are/VBZ flying/JJ mammals/NN.
Flying/VBG planes/NN is/VBZ scary/JJ.

Figure 2

a. Assuming a bigram model, determine which of the following phrases is the most likely
sequence of the words scary, flying, bat, a, and is, given the corpus in Figure 2:
i. A scary bat is flying
ii. Flying a bat is scary
b. Given the training data in Figure 2, use a stochastic POS tagger with bigram transition
probabilities to determine whether the word “flying” in the phrase “a flying Mary is scary”
is a JJ or a VBG. Assume that the word “a” is a determiner and the word “Mary” is an NN.
[9, 8]
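The disambiguation in (b) comes down to counting. The sketch below estimates transition and emission probabilities from the Figure 2 corpus by maximum likelihood (no smoothing, and tag unigram counts are used as the transition denominator, a simplification that ignores sentence boundaries) and scores P(tag | DT) x P("flying" | tag) x P(NN | tag) for the two candidate tags. Tokens are lower-cased for convenience.

```python
from collections import Counter

# The POS-tagged corpus of Figure 2 as (word, tag) pairs, one list per sentence.
CORPUS = [
    [("a", "DT"), ("bat", "NN"), ("is", "VBZ"), ("flying", "VBG")],
    [("a", "DT"), ("flying", "JJ"), ("bat", "NN"), ("is", "VBZ"), ("scary", "JJ")],
    [("i", "PRP"), ("saw", "VBZ"), ("mary", "NN"), ("flying", "VBG"), ("a", "DT"), ("bat", "NN")],
    [("she", "PRP"), ("likes", "VBZ"), ("flying", "VBG"), ("a", "DT"), ("scary", "JJ"), ("bat", "NN")],
    [("i", "PRP"), ("like", "VBZ"), ("flying", "VBG"), ("to", "TO"), ("jhb", "NN")],
    [("bats", "NN"), ("are", "VBZ"), ("flying", "JJ"), ("mammals", "NN")],
    [("flying", "VBG"), ("planes", "NN"), ("is", "VBZ"), ("scary", "JJ")],
]

trans, emit, tag_count = Counter(), Counter(), Counter()
for sent in CORPUS:
    for (w, t) in sent:
        emit[(t, w)] += 1
        tag_count[t] += 1
    for (_, t1), (_, t2) in zip(sent, sent[1:]):
        trans[(t1, t2)] += 1

def p_trans(t1, t2):
    return trans[(t1, t2)] / tag_count[t1]

def p_emit(t, w):
    return emit[(t, w)] / tag_count[t]

# Score each candidate tag for "flying" in the context DT _ NN.
scores = {}
for tag in ("JJ", "VBG"):
    scores[tag] = p_trans("DT", tag) * p_emit(tag, "flying") * p_trans(tag, "NN")
print(scores)
```

Because DT is never followed by VBG anywhere in the corpus, the VBG reading scores zero under this unsmoothed model.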

Question 3 - Formal Grammars and Parsing [20 marks]

3.1 Use the Context Free Grammar (CFG) in Figure 3 to answer the following questions:
a. Complete the Context Free Grammar in Figure 3 so that it generates (at least) the
following sentences:
i. Vincent loved Mia.
ii. Mia gave the gun to the man.
iii. Sboniso knew the robber that fell.

GRAMMAR                   LEXICON

S → NP VP                 ProperNoun → Vincent | Mia
NP → ProperNoun           Det → a | her
NP → ProperNoun Rel       Noun → gun | robber
Nominal → Noun            Wh → who | that
Nominal → Noun Rel        Prep → to
Rel → Wh VP               IV → fell
VP → IV                   TV → loved
VP → TV NP                DV → gave
VP → DV NP PP
PP → Prep NP

Figure 3

b. Convert the completed grammar from your answer to Question (3.1.a) into Chomsky
Normal Form (CNF).
[5, 4]

3.2 Use the Grammar in Figure 4, and chart[0] of an Earley parser in Table 3, to do chart[1] and chart[2]
of the Earley parser on the sentence “Menzi called Nompilo from Durban”. Each entry in your
charts should have the state, its end points, and the operation that placed it in the chart as shown in
Table 3.

GRAMMAR LEXICON

S → NP VP Noun → “Menzi”
NP → NP PP Noun → “Nompilo”
NP → Noun Noun → “Durban”
VP → Verb NP Verb → “called”
VP → VP PP Prep → “from”
PP → Prep NP

Figure 4
Table 3

Chart[i]    State           Endpoints   Operation
Chart[0]    γ → . S         [0, 0]      dummy start state
            S → . NP VP     [0, 0]      predictor
            NP → . NP PP    [0, 0]      predictor
            NP → . Noun     [0, 0]      predictor
Chart[1]    …               …           …

[11]
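The predictor, scanner and completer operations behind Table 3 can be sketched as a compact recognizer. The grammar and lexicon below are those of Figure 4, and the GAMMA item mirrors the dummy start state in chart[0]; this is an illustrative sketch of the algorithm, not a model answer for the chart entries themselves.

```python
# Compact Earley-recognizer sketch for the Figure 4 grammar.
# A state is (lhs, rhs, dot, start); charts[i] holds states ending at i.
GRAMMAR = {
    "S":  [("NP", "VP")],
    "NP": [("NP", "PP"), ("Noun",)],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {"Menzi": "Noun", "Nompilo": "Noun", "Durban": "Noun",
           "called": "Verb", "from": "Prep"}

def earley(words):
    n = len(words)
    charts = [[] for _ in range(n + 1)]

    def add(chart, state):
        if state not in chart:
            chart.append(state)

    add(charts[0], ("GAMMA", ("S",), 0, 0))
    for i in range(n + 1):
        for state in charts[i]:          # list grows while we iterate; that is intended
            lhs, rhs, dot, start = state
            if dot < len(rhs) and rhs[dot] in GRAMMAR:           # predictor
                for prod in GRAMMAR[rhs[dot]]:
                    add(charts[i], (rhs[dot], prod, 0, i))
            elif dot < len(rhs):                                 # scanner
                if i < n and LEXICON.get(words[i]) == rhs[dot]:
                    add(charts[i + 1], (lhs, rhs, dot + 1, start))
            else:                                                # completer
                for (l2, r2, d2, s2) in charts[start]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(charts[i], (l2, r2, d2 + 1, s2))
    return any(s == ("GAMMA", ("S",), 1, 0) for s in charts[n])

print(earley("Menzi called Nompilo from Durban".split()))
```

Deduplicating states in `add` is what keeps the left-recursive rules (NP → NP PP, VP → VP PP) from looping forever.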

Question 4 - Lexical and Computational Lexical Semantics [15 marks]


4.1 Here is the WordNet entry for “sentence”:
Noun
I. S: (n) sentence (a string of words satisfying the grammatical rules of a language)
"he always spoke in grammatical sentences"
II. S: (n) conviction, judgment of conviction, condemnation, sentence ((criminal law)
a final judgment of guilty in a criminal case and the punishment that is imposed)
"the conviction came as no surprise"
III. S: (n) prison term, sentence, time (the period of time a prisoner is imprisoned) "he
served a prison term of 15 months"; "his sentence was 5 to 10 years"; "he is doing
time in the county jail"
Verb
IV. S: (v) sentence, condemn, doom (pronounce a sentence on (somebody) in a court
of law) "He was condemned to ten years in prison"

a. List all pairs of senses that are polysemous.


b. For the senses that are polysemous, which senses are related by a metonymy
relationship?
c. What is the difference between a homophone and a homograph relationship between
word forms?
[3, 1, 2]

4.2 Consider the following sentence, L, in Figure 5. The word “line” is in focus.

About three years ago, he nearly gave up because he had nothing to sell, now his
shelves are full, and towels and clothes hang from a line every day of the week.

Figure 5

a. Give a collocational feature vector for the word “line” in L, given a window size of six (6)
(3 words to the left and 3 words to the right).
b. Give a bag-of-words feature vector for the word “line” in L, given a window of size 6, and
the following word feature list; [written, school, speech, day, major, hang, sell, nothing,
rope, week].
[4, 3]
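The bag-of-words vector in (b) is a binary indicator over the given feature list, restricted to the ±3-word window around the focus word. The sketch below assumes naive whitespace tokenisation with punctuation as separate tokens (a convention choice; it does not affect the window around “line” here).

```python
# Sketch of the 4.2(b) bag-of-words feature vector for "line" with window 6.
SENTENCE = ("About three years ago , he nearly gave up because he had nothing "
            "to sell , now his shelves are full , and towels and clothes hang "
            "from a line every day of the week .").split()
FEATURES = ["written", "school", "speech", "day", "major",
            "hang", "sell", "nothing", "rope", "week"]

def bow_vector(tokens, focus, features, k=3):
    """1 if the feature word occurs within k tokens of the focus, else 0."""
    i = tokens.index(focus)
    window = tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]
    return [1 if f in window else 0 for f in features]

print(bow_vector(SENTENCE, "line", FEATURES))
```

Note that “sell” and “nothing” appear in the sentence but fall outside the window, so their features stay 0.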

4.3 State any two (2) advantages of using word vectors for meaning representation over using
thesaurus-based techniques.
[2]

Question 5 – Applications [25 marks]


5.1 Figure 6 shows an architecture for a typical Factoid Question Answering System. The
architecture shows three major components of the system, which are the Question Processing,
Passage Retrieval and Answer Processing components.

Figure 6

a. Briefly discuss what is involved in the question processing component.


b. State one NLP technique that is used for answer extraction in the answer processing
component and state how the technique is used.
[4, 2]

5.2 A sequence-to-sequence model for Neural Machine Translation has two major components: the
encoder and the decoder. Briefly explain the function of each of these components in Neural
Machine Translation.
[4]

5.3 Use the grammar in Figure 7 to answer the following questions:


a. Draw:
i. the two (2) parse trees for the sentence “Cut the shirt with scissors”, and
ii. the parse tree for the sentence “Cut with scissors the shirt”,
using the grammar in Figure 7.
b. Assuming that the parse trees you drew in Question 5.3(a) are all the parse trees for the
given sentences, use the parse trees to calculate the probability (language model) of each
of the two sentences in Question 5.3(a).
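The arithmetic behind (b) can be sketched generically: a parse tree's probability is the product of the probabilities of the rules used in its derivation, and a sentence's probability is the sum over all of its parse trees. The rule-probability lists below are illustrative placeholders, not the model answer for the two exam sentences.

```python
from math import prod

def tree_prob(rule_probs):
    """Probability of one parse tree: product of its rule probabilities."""
    return prod(rule_probs)

def sentence_prob(trees):
    """Probability of a sentence: sum over the probabilities of its trees."""
    return sum(tree_prob(t) for t in trees)

# Hypothetical derivation using S -> VP (1.0), VP -> Verb NP (0.75),
# Verb -> cut (0.1), NP -> Det Noun (0.7), Det -> the (0.1), Noun -> shirt (0.1):
print(tree_prob([1.0, 0.75, 0.1, 0.7, 0.1, 0.1]))
```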

PRODUCTION RULE PROBABILITY

S → VP 1.0
VP → Verb NP 0.75
VP → Verb NP PP 0.1
VP → Verb PP NP 0.15
NP → NP PP 0.3
NP → Det Noun 0.7
PP → Prep Noun 1.0
Det → the|… 0.1
Verb → cut| ask| find|… 0.1
Prep → with| in| … 0.1
Noun → envelope|grandma|scissors|men|shirt|summer|… 0.1

Figure 7

c. Given the direct translation for the Spanish sentence “Cortar la camisa con las tijeras” in
Figure 8, use your answer to Question 5.3(b) to determine which of the two sentences in
Table 4 is the most likely fluent translation of the Spanish sentence.
Table 4

Index   English sentence Ei             P(F|Ei)
1       Cut with scissors the shirt     0.2
2       Cut the shirt with scissors     0.1


Cortar la camisa con las tijeras

Cut the shirt with scissors

Figure 8

[6, 5, 4]
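The decision in 5.3(c) is the noisy-channel rule: choose the English sentence E maximising P(F|E) x P(E). The P(F|E) values come from Table 4; the P(E) values below are placeholders standing in for the language-model probabilities computed in 5.3(b).

```python
# Noisy-channel sketch for 5.3(c). P(E) values are PLACEHOLDERS: substitute
# the language-model probabilities from your answer to 5.3(b).
candidates = {
    "Cut with scissors the shirt": {"p_f_given_e": 0.2, "p_e": 1.0e-6},
    "Cut the shirt with scissors": {"p_f_given_e": 0.1, "p_e": 2.3e-6},
}

best = max(candidates,
           key=lambda e: candidates[e]["p_f_given_e"] * candidates[e]["p_e"])
print(best)
```

With these placeholder numbers the fluency term P(E) outweighs the smaller translation probability; the outcome depends on the actual 5.3(b) values.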

Appendix A
