Module2 Ch3 B
1. Non-word errors
▪ An error resulting in a word that does not appear in a given lexicon or is not a valid
orthographic word.
Neural Nets
Rule-based Techniques
Example
Transformation of the string “tutor” to “tumour” and its associated edit transcript
(MMRMIM), where M = match, R = replace, I = insert and D = delete.
M M R M I M
t u t o r
t u m o u r
Minimum Edit Distance = 2
Gayana M N Module 2 – Word Level Analysis 25
Minimum Edit Distance
Example
Transformation of the string “vintner” to “writers” and its associated edit transcript
(RIMDMDMMI).
R I M D M D M M I
v _ i n t n e r _
w r i _ t _ e r s
Minimum Edit Distance = 5
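The two transcripts above can be replayed mechanically. The following is a minimal sketch, assuming that M and R each consume one character from both strings, I consumes one character of the target, and D consumes one character of the source; the function name `align` is hypothetical and not part of the slides.

```python
def align(source, target, transcript):
    """Lay out source over target under an edit transcript.

    Returns (top_row, bottom_row, cost), with "_" marking a gap.
    """
    top, bottom, cost = [], [], 0
    i = j = 0  # read positions in source and target
    for op in transcript:
        if op in "MR":  # match or replace: one character from each string
            if op == "M" and source[i] != target[j]:
                raise ValueError("M used on different characters")
            top.append(source[i])
            bottom.append(target[j])
            cost += 1 if op == "R" else 0
            i += 1
            j += 1
        elif op == "I":  # insert: gap in the source row
            top.append("_")
            bottom.append(target[j])
            cost += 1
            j += 1
        elif op == "D":  # delete: gap in the target row
            top.append(source[i])
            bottom.append("_")
            cost += 1
            i += 1
        else:
            raise ValueError("unknown operation: " + op)
    if i != len(source) or j != len(target):
        raise ValueError("transcript does not cover both strings")
    return " ".join(top), " ".join(bottom), cost


print(align("tutor", "tumour", "MMRMIM"))
print(align("vintner", "writers", "RIMDMDMMI"))
```

Every operation other than M adds one to the cost, which is how the distances 2 and 5 arise from the two transcripts.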
Aligning “tutor” with “tumour” under the transcript, with “_” marking a gap:
M M R M I M
t u t o _ r
t u m o u r
Edit Distance = 2
A different alignment of the same two strings costs more, so it is not the minimum:
M M D I M I M
t u t _ o _ r
t u _ m o u r
Edit Distance = 3
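The two alignments of “tutor” and “tumour” show that the minimum edit distance is the cheapest cost over all possible alignments. The sketch below is one way to check this, assuming a cost of 1 for every insertion, deletion and substitution; the names `transcript_cost` and `min_edit_distance` are hypothetical, and the memoised recursion is an illustration rather than the method used on the slides.

```python
from functools import lru_cache


def transcript_cost(transcript):
    """Cost of a transcript: every operation except M (match) costs 1."""
    return sum(1 for op in transcript if op != "M")


def min_edit_distance(source, target):
    """Cheapest cost over all alignments, by memoised recursion on prefixes."""

    @lru_cache(maxsize=None)
    def d(i, j):
        if i == 0:  # empty source prefix: insert j characters
            return j
        if j == 0:  # empty target prefix: delete i characters
            return i
        sub = 0 if source[i - 1] == target[j - 1] else 1
        return min(d(i - 1, j) + 1,        # delete
                   d(i, j - 1) + 1,        # insert
                   d(i - 1, j - 1) + sub)  # match or replace

    return d(len(source), len(target))


print(transcript_cost("MMRMIM"))             # first alignment
print(transcript_cost("MMDIMIM"))            # second alignment
print(min_edit_distance("tutor", "tumour"))  # minimum over all alignments
```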
Example:
S1 = vintner (7 characters)
S2 = writers (7 characters)
Cost = 1 for each edit operation
Initialise the first column with D(i, 0) = i and the first row with D(0, j) = j:
     #  w  r  i  t  e  r  s
#    0  1  2  3  4  5  6  7
v    1
i    2
n    3
t    4
n    5
e    6
r    7
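The initialisation step can be sketched as follows. This is a minimal illustration, assuming the table is stored as a list of rows with S1 along the rows and S2 along the columns; the function name `init_table` is hypothetical.

```python
def init_table(source, target):
    """Create the (len(source)+1) x (len(target)+1) table with borders filled."""
    rows, cols = len(source) + 1, len(target) + 1
    D = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        D[i][0] = i  # D(i, 0) = i
    for j in range(cols):
        D[0][j] = j  # D(0, j) = j
    return D


D = init_table("vintner", "writers")
for label, row in zip("#vintner", D):
    print(label, row)
```

Only the first row and first column carry meaningful values at this point; the inner cells are placeholders until the recurrence fills them.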
Each remaining cell is computed from its left, upper and diagonal neighbours:
D(i, j) = min [ D(i, j − 1) + delete_cost,
                D(i − 1, j) + insert_cost,
                D(i − 1, j − 1) + sub_cost(S1i, S2j) ]
where sub_cost(S1i, S2j) is 0 when the two characters match and 1 otherwise.
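The recurrence can be turned into a table-filling routine. The sketch below is one possible reading, assuming all three costs default to 1 as in the example; the function name `edit_distance_table` is hypothetical. The parameter labels follow the slide's formula, and with unit costs the insert and delete roles are symmetric.

```python
def edit_distance_table(source, target, insert_cost=1, delete_cost=1, sub_cost=1):
    """Fill the whole table D for source (rows) against target (columns)."""
    rows, cols = len(source) + 1, len(target) + 1
    D = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        D[i][0] = i  # D(i, 0) = i
    for j in range(cols):
        D[0][j] = j  # D(0, j) = j
    for i in range(1, rows):
        for j in range(1, cols):
            # Substitution is free when the characters match.
            sub = 0 if source[i - 1] == target[j - 1] else sub_cost
            D[i][j] = min(D[i][j - 1] + delete_cost,  # left neighbour
                          D[i - 1][j] + insert_cost,  # upper neighbour
                          D[i - 1][j - 1] + sub)      # diagonal neighbour
    return D


table = edit_distance_table("vintner", "writers")
print(table[-1][-1])  # bottom-right cell: the minimum edit distance
```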
First cell, D(1, 1): v and w differ, so the substitution costs 1 and
D(1, 1) = min [ D(1, 0) + 1, D(0, 1) + 1, D(0, 0) + 1 ] = min [ 2, 2, 1 ] = 1
     #  w  r  i  t  e  r  s
#    0  1  2  3  4  5  6  7
v    1  1
i    2
n    3
t    4
n    5
e    6
r    7
Row v: v matches no character of “writers”, so each cell is its left neighbour + 1:
     #  w  r  i  t  e  r  s
#    0  1  2  3  4  5  6  7
v    1  1  2  3  4  5  6  7
i    2
n    3
t    4
n    5
e    6
r    7
Row i: i matches the i of “writers”, so D(2, 3) = D(1, 2) + 0 = 2; every other cell adds 1
to its cheapest neighbour:
     #  w  r  i  t  e  r  s
#    0  1  2  3  4  5  6  7
v    1  1  2  3  4  5  6  7
i    2  2  2  2  3  4  5  6
n    3
t    4
n    5
e    6
r    7
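The row-by-row filling shown above only ever looks at the previous row and at the cell to the left. A minimal sketch of that single step, assuming unit costs and a hypothetical helper name `next_row`:

```python
def next_row(prev_row, source_char, target):
    """Compute one table row from the row above it (unit costs assumed)."""
    row = [prev_row[0] + 1]  # first column: D(i, 0) = i
    for j, target_char in enumerate(target, start=1):
        sub = 0 if source_char == target_char else 1
        row.append(min(row[j - 1] + 1,           # left neighbour
                       prev_row[j] + 1,          # upper neighbour
                       prev_row[j - 1] + sub))   # diagonal neighbour
    return row


header = list(range(len("writers") + 1))   # row "#": 0 1 2 ... 7
row_v = next_row(header, "v", "writers")
row_i = next_row(row_v, "i", "writers")
print(row_v)
print(row_i)
```

Calling `next_row` once per character of “vintner” reproduces the table one row at a time.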
Row n: n matches no character of “writers”, so every cell adds 1 to its cheapest neighbour:
     #  w  r  i  t  e  r  s
#    0  1  2  3  4  5  6  7
v    1  1  2  3  4  5  6  7
i    2  2  2  2  3  4  5  6
n    3  3  3  3  3  4  5  6
t    4
n    5
e    6
r    7
Row t: t matches the t of “writers”, so D(4, 4) = D(3, 3) + 0 = 3:
     #  w  r  i  t  e  r  s
#    0  1  2  3  4  5  6  7
v    1  1  2  3  4  5  6  7
i    2  2  2  2  3  4  5  6
n    3  3  3  3  3  4  5  6
t    4  4  4  4  3  4  5  6
n    5
e    6
r    7
The remaining rows n, e and r are filled in the same way. The bottom-right cell holds the
result:
     #  w  r  i  t  e  r  s
#    0  1  2  3  4  5  6  7
v    1  1  2  3  4  5  6  7
i    2  2  2  2  3  4  5  6
n    3  3  3  3  3  4  5  6
t    4  4  4  4  3  4  5  6
n    5  5  5  5  4  4  5  6
e    6  6  6  6  5  4  5  6
r    7  7  6  7  6  5  4  5
Minimum Edit Distance = D(7, 7) = 5
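Once the table is complete, an edit transcript can be read off by walking back from the bottom-right cell to the top-left. The sketch below is an illustration under unit costs, not a procedure given on the slides; the names `build_table` and `backtrace` are hypothetical. Because several neighbours can tie, the walk may return a different transcript from RIMDMDMMI, but any transcript it returns has the same minimum cost.

```python
def build_table(source, target):
    """Unit-cost edit distance table, source along rows, target along columns."""
    D = [[0] * (len(target) + 1) for _ in range(len(source) + 1)]
    for i in range(len(source) + 1):
        D[i][0] = i
    for j in range(len(target) + 1):
        D[0][j] = j
    for i in range(1, len(source) + 1):
        for j in range(1, len(target) + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            D[i][j] = min(D[i][j - 1] + 1, D[i - 1][j] + 1, D[i - 1][j - 1] + sub)
    return D


def backtrace(source, target, D):
    """Recover one optimal transcript (M, R, I, D) from a filled table."""
    i, j = len(source), len(target)
    ops = []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if source[i - 1] == target[j - 1] else 1
            if D[i][j] == D[i - 1][j - 1] + sub:  # diagonal step
                ops.append("M" if sub == 0 else "R")
                i, j = i - 1, j - 1
                continue
        if i > 0 and D[i][j] == D[i - 1][j] + 1:  # step up: delete source char
            ops.append("D")
            i -= 1
        else:                                     # step left: insert target char
            ops.append("I")
            j -= 1
    return "".join(reversed(ops))


table = build_table("vintner", "writers")
transcript = backtrace("vintner", "writers", table)
print(transcript, sum(op != "M" for op in transcript))
```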