Lecture 13 Translation and Terminology Lecture Notes
Approaches to MT:
Direct:
◦ the translation is based on large dictionaries and word-by-word translation with some
simple grammatical adjustments
◦ The translation unit of the approach is usually a word (based on lists of words)
◦ Example: Systran, one of the oldest systems still in use (1969)
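The direct approach can be illustrated with a minimal sketch: a dictionary lookup per word plus one reordering rule. The English-French lexicon, the word classes and the function name `direct_translate` are all hypothetical and chosen only for illustration; this is not how Systran or any real system is implemented.

```python
# Toy bilingual dictionary (English -> French); entries are illustrative only.
LEXICON = {
    "the": "le",
    "black": "noir",
    "cat": "chat",
    "sleeps": "dort",
}

# Hypothetical word classes used by the single reordering rule below.
ADJECTIVES = {"black"}
NOUNS = {"cat"}


def direct_translate(sentence: str) -> str:
    """Translate word by word, after one simple grammatical adjustment."""
    words = sentence.lower().split()

    # Adjustment: English "adjective noun" becomes "noun adjective".
    reordered = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and words[i] in ADJECTIVES and words[i + 1] in NOUNS:
            reordered.extend([words[i + 1], words[i]])
            i += 2
        else:
            reordered.append(words[i])
            i += 1

    # Word-by-word lookup; words missing from the dictionary pass through unchanged.
    return " ".join(LEXICON.get(word, word) for word in reordered)


print(direct_translate("The black cat sleeps"))
```

Anything outside the word list is left untranslated, which shows why this approach needs very large dictionaries.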
Knowledge-based: follows the linguistic and computational instructions supplied by human
researchers in linguistics and programming
Corpus-based (statistical or example-based)
replaces traditional rule-based approaches
computers can learn the translations of terminology from previously translated materials
outputs are often ungrammatical
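How a computer might learn term translations from previously translated materials can be sketched with simple co-occurrence counting. The four sentence pairs, the function name `learn_term` and the choice of the Dice coefficient as the association score are assumptions made for this illustration; real statistical systems use far larger corpora and more elaborate alignment models.

```python
from collections import Counter
from typing import Optional

# Tiny hypothetical parallel corpus (English, French); illustrative only.
PARALLEL = [
    ("the contract is valid", "le contrat est valide"),
    ("the contract is new", "le contrat est nouveau"),
    ("the payment is valid", "le paiement est valide"),
    ("the payment is late", "le paiement est tardif"),
]


def learn_term(source_term: str, corpus=PARALLEL) -> Optional[str]:
    """Return the target word most strongly associated with source_term."""
    src_count = 0            # sentence pairs whose source side contains the term
    tgt_counts = Counter()   # sentence pairs containing each target word
    joint_counts = Counter() # pairs containing both the term and the target word

    for src, tgt in corpus:
        tgt_words = set(tgt.split())
        tgt_counts.update(tgt_words)
        if source_term in src.split():
            src_count += 1
            joint_counts.update(tgt_words)

    if not joint_counts:
        return None  # the term never occurs in the corpus

    # Dice coefficient: 2 * joint / (source count + target count).
    scores = {
        word: 2 * joint / (src_count + tgt_counts[word])
        for word, joint in joint_counts.items()
    }
    return max(scores, key=scores.get)


print(learn_term("contract"))
```

The sketch learns word pairings only and knows nothing about word order or agreement, which is one reason purely corpus-based output can be ungrammatical.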
Hybrid (also statistics-based)
incorporates higher level abstract syntax rules
it is difficult to merge the fundamentally different approaches
Recently:
Neural Machine Translation (NMT) (Google): an end-to-end learning approach for automated
translation, with the potential to overcome many of the weaknesses of conventional phrase-based
translation systems.
Google's system was introduced in 2016
Advantage: the ability to learn the mapping directly from input text to the associated output text.
Its architecture typically consists of two recurrent neural networks (RNNs), one to consume the
input text sequence and one to generate translated output text.
Three main uses:
assists comprehension, gives quick access to information
assists in the writing of texts
assists in translation
Quality of MT depends:
on the linguistic affinity between the source language (SL) and the target language (TL)
on the quality of the source document
Advantages
Speed (time-effective)
Keeps terminology uniform
Useful for banking and retrieving terminology
Disadvantages
Can be used only in limited subject areas
Only for commercially viable languages, not for low-density languages.
Needs intensive revision and post-editing
◦ increases costs
◦ reduces speed
Computer-assisted translation (CAT) tools typically provide two databases: a Translation Memory (TM)
and a termbase (TDB).
A CAT system (Trados, Déjà Vu, Memsource, memoQ, etc.) provides assistance for the translator:
the software breaks the text to be translated into segments
the translator validates the corresponding target text
the software memorises the source segment and the target segment as being linguistic equivalents
If the source segment appears in the text again, the software automatically proposes the
memorised translation.
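The workflow above (segment, validate, memorise, propose) can be sketched as a minimal exact-match translation memory. The names `segment`, `TranslationMemory`, `validate` and `propose` are hypothetical, and the sentence-splitting rule is a deliberate simplification; commercial CAT tools use configurable segmentation rules and their own storage formats.

```python
import re
from typing import Dict, List, Optional


def segment(text: str) -> List[str]:
    """Split text into segments after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [part.strip() for part in parts if part.strip()]


class TranslationMemory:
    """Stores validated source/target segment pairs (exact matches only)."""

    def __init__(self) -> None:
        self._units: Dict[str, str] = {}

    def validate(self, source: str, target: str) -> None:
        # The translator confirms the target text; the pair is memorised.
        self._units[source] = target

    def propose(self, source: str) -> Optional[str]:
        # Return the memorised translation if this exact segment was seen before.
        return self._units.get(source)


tm = TranslationMemory()
segments = segment("Press the button. Wait ten seconds. Press the button.")

for seg in segments:
    suggestion = tm.propose(seg)
    if suggestion is not None:
        print("From memory:", suggestion)
    elif seg == "Press the button.":
        # Stand-in for the human translator validating a new translation.
        tm.validate(seg, "Appuyez sur le bouton.")
```

On the third segment the memory proposes the translation validated for the first one, without the translator retyping it.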
Advantages
Saves time
Interactive
Ensures consistency in teamwork
Disadvantages
can only handle a text simplistically, as a series of linguistic segments
training is necessary for efficient use
time is necessary for the creation of an extensive database
Human translation
The first stage in human translation is complete comprehension of the source language text. This
comprehension operates on several levels:
Semantic level
Syntactic level
Pragmatic level
There are at least five types of knowledge used in the translation process of specialised texts:
Linguistic knowledge
◦ of the source language
◦ of the target language
Knowledge of equivalents between the source and target languages
Knowledge of the subject field, as well as general knowledge
Knowledge of socio-cultural aspects (source and target cultures)