Bahir Dar University Bahir Dar Institute of Technology Faculty of Computing Department of Computer Science
Training data:
I am Sam
Sam I am
Sam I like
Sam I do like
do I like Sam
Assume that we use a bigram language model based on the above training data. What is the most probable next word predicted by the model for each of the following word sequences? Show your work.
(1) Sam . . .
(2) Sam I do . . .
(4) do I like . . .
Solution:
Bigram probabilities are estimated from the training data by maximum likelihood: the probability that word W follows word W_(i-1) is [2]

P(W | W_(i-1)) = count(W_(i-1), W) / count(W_(i-1))
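
As a cross-check (not part of the original solution), the following minimal Python sketch builds the bigram counts from the five training sentences above and estimates P(W | W_(i-1)) with the formula just given; the names bigram_prob and predict_next are illustrative choices, not from the assignment.

from collections import Counter, defaultdict

# Training sentences from the assignment.
sentences = [
    "I am Sam",
    "Sam I am",
    "Sam I like",
    "Sam I do like",
    "do I like Sam",
]

# Count each bigram (W_(i-1), W) and each history word W_(i-1).
bigram_counts = defaultdict(Counter)
history_counts = Counter()
for sentence in sentences:
    words = sentence.split()
    for prev, curr in zip(words, words[1:]):
        bigram_counts[prev][curr] += 1
        history_counts[prev] += 1

def bigram_prob(prev, curr):
    # P(curr | prev) = count(prev, curr) / count(prev)
    if history_counts[prev] == 0:
        return 0.0
    return bigram_counts[prev][curr] / history_counts[prev]

def predict_next(prev):
    # Return the word(s) tied for the highest P(w | prev).
    candidates = bigram_counts.get(prev)
    if not candidates:
        return []
    best = max(candidates.values())
    return sorted(w for w, count in candidates.items() if count == best)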
(1) Sam . . .
In the training data, whenever "Sam" is followed by another word, that word is "I": count(Sam, I) = 3 and "Sam" occurs as a history 3 times, so P(I | Sam) = 3/3 = 1. The most probable next word is "I".
(2) Sam I do . . .
Under a bigram model only the last word, "do", matters. "do" is followed by "like" once (in "Sam I do like") and by "I" once (in "do I like Sam"), so P(like | do) = P(I | do) = 1/2. Therefore "I" and "like" are equally probable as the next word.
(4) do I like . . .
Again only the last word, "like", matters. "like" is followed by another word only once, by "Sam" (in "do I like Sam"), so P(Sam | like) = 1/1 = 1. The most probable next word is "Sam".
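
Running the predict_next helper from the sketch above on the three prompts reproduces the hand-computed answers (the comments show what the sketch would print under the assumptions stated there):

print(predict_next("Sam"))   # ['I']           -> (1) Sam ...
print(predict_next("do"))    # ['I', 'like']   -> (2) Sam I do ... (tie)
print(predict_next("like"))  # ['Sam']         -> (4) do I like ...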