OmegaT Documentation - EN
OmegaT 3.5 - User's Guide
Vito Smolej
Publication date
Abstract
This document is the official user's guide to OmegaT, the free Computer Aided Translation tool. It also
contains installation instructions.
Table of Contents
1. About OmegaT - introduction
  1. OmegaT highlights
  2. Summary of chapters
2. Learn to use OmegaT in 5 minutes!
  1. Set up a new project
  2. Translate the file
  3. Validate your tags
  4. Generate the translated file
  5. Few more things to remember
3. Installing and running OmegaT
  1. Windows Users
  2. Linux (Intel) Users
  3. Mac OS X Users
  4. Other Systems
  5. Drag and drop
  6. Using Java Web Start
  7. Starting OmegaT from the command line
  8. Building OmegaT From Source
4. The user interface
  1. Main OmegaT window, other windows and dialogs
  2. OmegaT main window
  3. Other windows
5. Menu and Keyboard shortcuts
  1. Main Menu
  2. Keyboard shortcuts
6. Project properties
  1. Generalities
  2. Languages
  3. Options
  4. File locations
7. File Filters
  1. File filters dialog
  2. Filter options
  3. Edit filter dialog
8. OmegaT Files and Folders
  1. Translation project files
  2. User settings files
  3. Application files
9. Files to translate
  1. File formats
  2. Other file formats
  3. Right to left languages
10. Editing behavior
11. Working with plain text
  1. Default encoding
  2. The OmegaT solution
12. Working with formatted text
  1. Formatting tags
  2. Tag operations
  3. Tag group nesting
  4. Tag group overlapping
  5. Tag validation options
  6. Tag group validation
  7. Hints for tags management
13. Translation memories
  1. Translation memories in OmegaT
  2. Reusing translation memories
List of Figures
4.1. OmegaT main window
4.2. Matches pane
4.3. Matches pane setup
4.4. multi-word entry in the glossary
4.5. Comments pane
4.6. Tag validation window
4.7. project statistics
4.8. Match statistics
8.1. OmegaT project
8.2. OmegaT projects and subfolders
10.1. Editing behavior options
12.1. Tag validation entry
17.1. Regex Tester
18.1. Merriam Webster dictionary - use
19.1. Glossary pane
19.2. multiple words entries in glossaries - example
21.1. Google Translate - example
22.1. Spellchecker setup
22.2. Using spellchecker
E.1. The LanguageTool in OmegaT
List of Tables
4.1. Main OmegaT window
4.2. Other windows
4.3. Settings dialogs
4.4. Pane widgets
4.5. Main Window - counters
4.6. Match pane setup
5.1. Main Menu
5.2. Project menu
5.3. Copy/cut/paste shortcuts
5.4. Edit menu
5.5. Go To menu
5.6. View menu
5.7. Tools menu
5.8. Options menu
5.9. Help menu
5.10. Project management shortcuts
5.11. Editing shortcuts
5.12. Moving around shortcuts
5.13. Various shortcuts
17.1. Regex - Flags
17.2. Regex - Character
17.3. Regex - Quotation
17.4. Regex - Classes for Unicode blocks and categories
17.5. Regex - Character classes
17.6. Regex - Predefined character classes
17.7. Regex - Boundary matchers
17.8. Regex - Greedy quantifiers
17.9. Regex - Reluctant (non-greedy) quantifiers
17.10. Regex - Logical operators
17.11. Regex - Examples of regular expressions in translations
A.1. ISO 639-1/639-2 Language code list
B.1. Key behavior in the editor
H.1. Project Menu
H.2. Edit Menu
H.3. GoTo Menu
H.4. View Menu
H.5. Tools Menu
H.6. Options Menu
H.7. Help Menu
Chapter 1. About OmegaT - introduction
1. OmegaT highlights
OmegaT is a free, multi-platform computer-aided translation (CAT) tool with the following
highlights:
• Translation memory: OmegaT stores your translations in a translation memory file. At the
same time, it can use memory files from previous translations for reference. Translation
memories can be very useful in a translation where there are numerous repetitions or
reasonably similar segments of text. OmegaT uses translation memories to store your
previous translations and then suggest likely translations for the text you are currently
working on.
These translation memories can be very useful when a document that has already been
translated needs to be updated. Unchanged sentences are automatically translated, while
updated sentences are shown with the translation of the most similar, older sentence.
Modifications to the original document are thus handled with greater ease. If you are
supplied with previously created translation memories, for example by your translation
agency or your client, OmegaT is able to use these as reference memories.
OmegaT uses the standard TMX file format to store and access translation memories, which
guarantees that you can exchange your translation material with other CAT applications
supporting this file format.
• Translation process: Imagine having to translate anything from a single file to a folder
containing subfolders, each with a number of files in a variety of formats. When you let
OmegaT know the files that you need to translate, it looks for the files it understands based
on file filtering rules, recognizes the textual parts within them, splits up the text groups
according to the segmentation rules, and displays the segments one by one so that you
can proceed with the translation. OmegaT stores your translations and proposes possible
translations from similar segments registered in the translation memory files. When you
are ready to view the final product, you can export the translated files, open them in the
appropriate application and view the translation in the final format...
2. Summary of chapters
This documentation is intended both as a tutorial and as a reference guide. Here is a short
summary of the chapters and their contents.
• Learn to use OmegaT in 5 minutes!: this chapter is intended as a quick tutorial for
beginners as well as people who already know CAT tools, showing the complete procedure
from opening a new translation project through to completing the translation.
• Installing and running OmegaT: this chapter is useful when you first begin using
OmegaT. It contains the specific instructions on how to install OmegaT and run it on
Windows, Mac OS X and Linux. For advanced users, the chapter describes the command
line mode and its possibilities.
• The user interface, Main Menu and Keyboard Shortcuts: these two chapters are likely
to be heavily consulted, since they explain the user interface of OmegaT and the functions
available via the main menu and the keyboard shortcuts.
• Project properties, OmegaT Files and Folders: a project in the context of OmegaT is
the piece of work that OmegaT as a CAT tool is able to handle. The first of these chapters
describes the project properties, such as the source and target languages. The second
describes the various subfolders and files in a translation project and their role, as well as
other user and application files associated with OmegaT.
• Editing field behavior: a short chapter describing how to set up the editing behavior of
the segment being translated.
• Working with plain text and Working with formatted text: these two chapters explain
certain important points concerning texts to be translated, such as the encoding set (in the
case of plain text files) and tag handling (in the case of formatted text).
• Translation memories: explains the role of the various subfolders containing translation
memories, and provides information on other important aspects relating to translation
memories.
• Miscellanea: deals with other issues of interest, such as how to avoid losing data.
• Keyboard shortcuts in the editor: the list of shortcuts used in the editor.
• Team Projects
• Keyword index: an extensive keyword index is provided to help the reader find the
relevant information.
Chapter 2. Learn to use OmegaT in 5 minutes!
1. Set up a new project
Note: On an Apple Mac, use the Command key instead of the Control key.
To start using OmegaT, first create a project that will hold all your files, such as your source
file, translation memories, glossaries, and eventually your translated file. In the Project menu,
select New... and type a name for your project. Remember where you are creating the project,
because you will need to return to it later.
After you give your project a name, the Create New Project dialog will open. At the top of
that dialog, select your source file's language and the language that your translated file will
be, and click OK to continue.
If you are interested in other settings of this dialog, you can return to it any time by pressing
Ctrl+E.
Next, the Project Files dialog opens. Click on Copy Files to Source Folder... to select your
source files. OmegaT will then copy the selected files to the /source/ subfolder of your newly
created project. After the source files have loaded in the Editor pane, you can close the Project
Files dialog.
3. Validate your tags
The tags in OmegaT are greyed, so they are easy to recognise. They are protected, so that
you cannot modify their contents, but you can delete them, enter them by hand or move them
around in the target sentence. However, if you make mistakes when typing the formatting
tags, your translated files might fail to open. Therefore, press Ctrl+Shift+V before you
generate your translated files, to validate that your tags are correct.
5. Few more things to remember
• You can create a new project for each new job, and you can add new source files to a
project at any time.
• To remind yourself of the project's initial settings, open the project properties dialog by
pressing Ctrl+E. To see a list of files in the project, open the Project Files dialog by pressing
Ctrl+L.
• At the end of your translation, OmegaT exports three translation memories called level1,
level2 and omegat to your project folder. The level1 and level2 memories can be shared with
users of other translation programs. The memory named omegat can be used by OmegaT
itself, in future projects that you create. If you place such translation memory files in the
/tm/ subfolder of a project, OmegaT will automatically search them for similar segments,
called "fuzzy matches".
• You can add a new term to the glossary by pressing Ctrl+Shift+G, or copy existing
glossaries to the /glossary/ subfolder of your project folder, and OmegaT will automatically
look up words in them.
• It is often useful to search for words and phrases in the source text and in your translation,
so press Ctrl+F for the Text Search dialog at any time.
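The two folder-related tips above can be sketched as a small shell helper for Linux or Mac OS X. The function name add_reference_files and all file names are hypothetical; only the /tm/ and /glossary/ subfolder names come from this guide.

```shell
# Sketch: copy a reference translation memory and a glossary into a project.
# add_reference_files is an illustrative name, not an OmegaT command.
add_reference_files() {
  project_dir="$1"     # the OmegaT project folder
  tmx_file="$2"        # a translation memory from an earlier job
  glossary_file="$3"   # an existing glossary file

  # The subfolders normally exist already; -p makes this harmless either way.
  mkdir -p "$project_dir/tm" "$project_dir/glossary"
  cp "$tmx_file" "$project_dir/tm/"
  cp "$glossary_file" "$project_dir/glossary/"
}
```

A call might look like add_reference_files ~/translations/my_project ~/old_job/omegat.tmx ~/terms/glossary.txt, where every path is a placeholder. OmegaT then consults the copied files the next time the project is loaded.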
Chapter 3. Installing and running OmegaT
1. Windows Users
1.1. Downloading the package
Do you have a Java implementation compatible with Oracle's Java 1.6 JRE?
This package is bundled with Oracle's Java Runtime Environment. This JRE will not interfere
with other Java implementations that may already be installed on your system.
1.2. Installing OmegaT
To install OmegaT, double-click on the program you have downloaded.
At the beginning of the installation you can select the language to be used during the
installation. In the following window you can indicate that the language selected is to be used
in OmegaT. If you check the corresponding checkbox, the OmegaT.l4J.ini file is modified to
use the language selected (see next section for details). Later, after you have accepted the
license agreement, the setup program asks whether you wish to create a folder in the
start menu, and whether you wish to create a shortcut on the desktop and in the quick launch
bar. You can also create these shortcuts later by dragging OmegaT.exe to the desktop or to the
start menu. The last frame offers to display the readme and changes files for the version
you have installed.
1.3. Running OmegaT
Once OmegaT is installed, you can click on OmegaT.jar to launch it directly, or you can launch
it from the command line.
The simplest way to launch OmegaT, however, is to execute the OmegaT.exe program. The
options for the program start-up in this case will be read from the OmegaT.l4J.ini file, which
resides in the same folder as the exe file and which you can edit to reflect your setup.
The following example for the INI file reserves 1GB of memory, requests French as the user
language and Canada as the country:
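A minimal sketch of such a file follows. It assumes that the INI file takes one Java option per line and that lines starting with # are treated as comments. To avoid overwriting an existing configuration, the snippet writes to a hypothetical OmegaT.l4J.ini.example file from a POSIX-style shell; the three option lines can equally be typed into the real file with a text editor.

```shell
# Sketch only: writes an example file next to the real OmegaT.l4J.ini.
# The file name OmegaT.l4J.ini.example is an illustrative assumption.
cat > OmegaT.l4J.ini.example <<'EOF'
# Memory: reserve 1 GB
-Xmx1024M
# User language: French
-Duser.language=fr
# User country: Canada
-Duser.country=CA
EOF
```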
Advice: if OmegaT works slowly in Remote Desktop sessions under Windows, you may use
this option:
-Dsun.java2d.noddraw=false
1.4. Upgrading OmegaT
This information applies only to the "Traditional" Windows versions of OmegaT. It does not
apply to the Web Start versions, which are upgraded automatically, nor to cross-platform
versions installed on Windows.
If you already have a version of OmegaT installed on your PC and wish to upgrade to a more
recent version, you have two options:
• Install over the existing installation. To do this, simply select the same installation
folder as the existing installation when installing the new version. The "old" version
of OmegaT will be overwritten, but any settings from it will be retained. This includes
preferences set from within OmegaT, any changes you have made to your OmegaT.l4J.ini
file, and also your launch script (.bat file), if you are using one.
With this method, you may also download the "Windows without JRE" version, since the new
installation will use your existing JRE.
• Install to a new folder. This will enable you to keep both versions side-by-side, which
you may wish to do until you feel comfortable with the new version. This method will also
use preferences and settings you have made from within OmegaT. In this case, however:
• If you have made changes to your OmegaT.l4J.ini file and/or are using a .bat file, you
must copy these over.
• If your existing OmegaT installation is a "Windows with JRE" version, the new version
must also be a "Windows with JRE" version.
2. Linux (Intel) Users
This package is bundled with Oracle's Java Runtime Environment. This JRE will not interfere
with other Java implementations that may already be installed on your system.
2.2. Installing OmegaT
Unpack/untar the downloaded file. This will create an omegat/ folder in the working folder in
which you will find all the files needed to run OmegaT. To untar the .tar.gz file:
$ tar xf downloaded_file.tar.gz
• Press Alt+F2 to show KRunner. Type kmenuedit and press Enter to run the command. The
KMenuEditor appears. In KMenuEditor select File -> New Item.
• Then, after selecting a suitable menu, add a submenu/item with File -> New Submenu and
File -> New Item. Enter OmegaT as the name of the new item.
• In the "Command" field, use the navigation button to find your OmegaT launch script (the
file named OmegaT in the unpacked folder), and select it.
• Click on the icon button (to the right of the Name/Description/Comment fields), select
Other Icons -> Browse, and navigate to the /images subfolder in the OmegaT application
folder. Select the OmegaT.png icon.
2.3.2. GNOME Users
You can add OmegaT to your menus as follows:
• Enter "OmegaT" in the "Name" field; in the "Command" field, use the navigation button to
find your OmegaT launch script (the file named OmegaT in the unpacked folder). Select it
and confirm with OK.
• Click on the icon button, then hit Browse... and navigate to the /images subfolder in the
OmegaT application folder. Select the OmegaT.png icon. GNOME may fail to display the
icon files in the available formats and initially appear to expect an SVG file, but if the folder
is selected, the files should appear and OmegaT.png can be selected.
2.4. Running OmegaT
You can launch OmegaT from the command line with a script that includes start-up options or
you can click on OmegaT.jar to launch it directly. Methods differ depending on the distribution.
Make sure that your PATH settings are correct and that .jar files are properly associated with
a Java launcher. Check "Command line launching" below for more information.
3. Mac OS X Users
3.1. Downloading the package
The OmegaT package includes the Java 1.8 JRE.
Download OmegaT_3.n.n_Mac.zip.
3.2. Installing OmegaT
Double-click on OmegaT_3.n.n_Mac.zip to unpack it. This creates a folder called OmegaT.
The folder contains two files: index.html and OmegaT.app. Copy the folder to a suitable folder
(e.g. Applications). Once you have done this, you can delete the OmegaT_3.n.n_Mac.zip file,
since it is no longer needed.
3.4. Running OmegaT
Double-click on OmegaT.app or click on its location in the Dock.
You can modify OmegaT's behaviour by editing the Properties as well as the
Configuration.properties file in the package.
3.5. Mac OS X goodies
OmegaT.app can be accessed from the Mac OS X Services. You can thus select a word
anywhere in OmegaT and use Services to check this word, for instance in Spotlight or in
Google. You can also use AppleScript or Automator to create Services or scripts that will
automate frequent actions.
4. Other Systems
This information applies to systems such as Solaris SPARC/x86/x64, Linux x64/PowerPC and
Windows x64.
Do you have a Java implementation compatible with Oracle's Java 1.6 JRE?
• I don't know: open a terminal and type "java -version". If a "command not found" or similar
message is returned, it is likely that Java is not installed on your system
• No: obtain a Java JRE for your system (see below) and download
OmegaT_3.n.n_Without_JRE.zip.
Oracle provides JREs for Solaris SPARC/x86 (Java 1.6) and for Linux x64, Solaris x64,
Windows x64 (Java 1.6) at http://www.oracle.com/technetwork/java/archive-139210.html
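The "java -version" check mentioned above can be wrapped in a short shell sketch for Unix-like systems. The function name check_java is an illustrative assumption. The sketch only reports whether a java command is reachable through PATH, and does not verify that the version is compatible.

```shell
# Sketch: report whether a "java" command can be found on this system.
check_java() {
  if command -v java >/dev/null 2>&1; then
    # "java -version" writes to standard error, hence the 2>&1 redirection.
    echo "java found: $(java -version 2>&1 | head -n 1)"
  else
    echo "java not found"
  fi
}

check_java
```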
4.2. Installing OmegaT
To install OmegaT, simply unpack the OmegaT_3.n.n_Without_JRE.zip file. This creates an
./OmegaT_3.n.n_Without_JRE/ folder in the working folder with all the files necessary to run
OmegaT.
4.4. Running OmegaT
Once OmegaT is installed, you can launch it directly from the command line, you can create a
script that includes launch parameters for the command line or you can click on OmegaT.jar
to launch it directly. Methods differ depending on the distribution. Make sure that your PATH
settings are correct and that .jar files are properly associated with a Java launcher. Check
"Command line launching" below for more information.
6. Using Java Web Start
http://omegat.sourceforge.net/webstart/OmegaT.jnlp
Download the file OmegaT.jnlp and then click on it. During the installation, depending on
your operating system, you may receive several security warnings. The permissions you give
to this version (which may appear as "unrestricted access to the computer") are identical
to the permissions you give to the local version, i.e., they allow access to the hard drive of
the computer. Subsequent clicks on OmegaT.jnlp will check for any upgrades, install them,
if there are any, and then start OmegaT. After the initial installation you can, of course,
also use OmegaT.jnlp when you are offline.
Privacy: OmegaT Java Web Start does not save any of your information beyond the computer
on which you are running it. The application runs on your machine only. Your documents and
translation memories remain on your computer, and the OmegaT project will have no access
to your work or information.
Note that if you need or wish to use any of the launch command arguments (see above), you
must use the normal installation.
7. Starting OmegaT from the command line
To launch OmegaT, you must normally type two commands. The first of these is:
cd {folder}
where {folder} is the name of the folder, with complete path, in which your OmegaT program
- specifically, the file OmegaT.jar - is located. In practice, this command will therefore be
something like this:
On Windows
cd C:\Program Files\OmegaT
On Mac OS X
cd <OmegaT.app location>/OmegaT.app/Contents/Resources/Java/
On Linux
cd /usr/local/omegat
This command changes the folder to the folder containing the executable OmegaT file. The
second command is the command which actually launches OmegaT. In its most basic form,
this command is:
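Assuming the usual Java convention for starting an executable JAR file, the basic form is sketched below. The command is stored in a variable (LAUNCH_CMD, a name chosen for this illustration) and printed rather than executed, so the snippet is safe to paste anywhere; typing the quoted command itself in the OmegaT folder starts the program.

```shell
# Assumed basic launch command, to be run from the folder containing OmegaT.jar.
LAUNCH_CMD="java -jar OmegaT.jar"

# Printed here instead of executed; type the command itself to start OmegaT.
echo "$LAUNCH_CMD"
```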
Pay attention to the capitalization: on operating systems other than Windows, the program
will not start if you enter omegat instead of OmegaT!
This method has a particular benefit of being suitable for finding causes of problems: if an
error occurs during use of the program, an error message is output in the terminal window
which may contain useful information on the cause of the error.
The above method is a somewhat impractical way of launching a program routinely. For this
reason, the two commands described above are contained in a file (a "script", also called a
".bat file" on Windows systems).
When this file is executed, the commands within it are automatically carried out.
Consequently, to make changes to the launch command, it is sufficient to modify the file.
A list of possible arguments is given below. Advanced users can obtain more information on
the arguments by typing man java in the terminal window.
The "-Duser.language=XX" argument causes OmegaT to use the language specified rather
than the language of the user's operating system. "XX" in the command stands for the two-letter
code of the desired language. To launch OmegaT with a French interface (for example
on a Russian operating system), the command would therefore be:
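As a sketch of such a command, assuming java is on the PATH and the current folder contains OmegaT.jar: the hypothetical helper below only prints the command line, so it can be checked before it is run. The -Duser.language argument is placed before the jar file name, because anything after the jar file name is passed to the application rather than to Java.

```shell
# Hypothetical helper: print the launch command for a two-letter language code.
omegat_language_cmd() {
    printf 'java -Duser.language=%s -jar OmegaT.jar\n' "$1"
}

omegat_language_cmd fr    # French interface
```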
• User country
-Duser.country=XX Besides the language, you can also specify the country, for example
CN or TW in case of the Chinese language. To display the instant start guide in the desired
language, you need to specify both the language and the country. This is necessary even
if there's only one combination available, like pt_BR in case of Portuguese / Brazil.
• Memory assignment
-XmxZZM This command assigns more memory to OmegaT. By default, 512 MB are
assigned, so there is no advantage in assigning less than this figure. "ZZ" stands for
the amount of memory assigned, in megabytes. The command to launch OmegaT with
assignment of 1024 MB (1 gigabyte) of memory is therefore:
-Dhttp.proxyPort=NNNN The port number your system uses to access the proxy server.
• Google Translate V2
• Microsoft Translator
Make sure that you have a free Microsoft account. You’ll need this to sign in to Windows Azure
Marketplace [http://datamarket.azure.com/dataset/bing/microsofttranslator#schema] and use the
Translator service. Note that up to 2M characters per month are free of charge. The two entries
required are available in your account page [https://datamarket.azure.com/account] under
Primary account key and Customer-ID:
-Dmicrosoft.api.client_id=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
-Dmicrosoft.api.client_secret=XXXX9xXxX9xXXxxXXX9xxX99xXXXX9xx9XXxXxXXXXX=
• Yandex Translate
Make sure that you have a free Yandex account. You’ll need this to be able to obtain
and use a Yandex Translate API key. API keys can be requested using the API key request
form [http://api.yandex.com/key/form.xml?service=trnsl], and viewed on the My Keys
[http://api.yandex.com/key/keyslist.xml] page.
-Dyandex.api.key=trnsl.1.1.XXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXX.XXXXXXXXXXX
Arguments can be combined: to launch OmegaT with all the examples described above, the
command would be:
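A combined command can be sketched as follows. This is a hedged illustration limited to the language, country and memory arguments; the proxy and machine translation arguments are left out, and the helper name is hypothetical. The function only prints the command, assuming java is on the PATH and OmegaT.jar is in the current folder.

```shell
# Hypothetical helper: print a launch command that combines several arguments.
# $1 = language code, $2 = country code, $3 = memory in megabytes.
omegat_combined_cmd() {
    printf 'java -Duser.language=%s -Duser.country=%s -Xmx%sM -jar OmegaT.jar\n' \
        "$1" "$2" "$3"
}

# Brazilian Portuguese interface with 1024 MB of memory:
omegat_combined_cmd pt BR 1024
```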
7.3.1. Prerequisites
To run OmegaT in the command line mode, a valid OmegaT project must be present. The
location does not matter, since you have to add it to the command line at the start-up anyway.
If you need altered settings, the configuration files must be available. This can be achieved
in two ways:
• Run OmegaT normally (with the GUI) and specify the settings. If you start OmegaT in console
mode, it will use the same settings.
• If you can't run OmegaT normally (no graphical environment available): copy the settings
files from some other OmegaT installation on another machine to a specific folder. The
location does not matter, since you can add it to the command line at startup. The relevant
files are filters.conf and segmentation.conf and can be found in the user home folder (e.g.
C:\Documents and Settings\%User%\OmegaT under Windows, %user%/.omegat/ under Linux).
java -jar OmegaT.jar <project-dir> \
--config-dir=/path/to/config-files/ \
--config-file=/path/to/config-file/ \
--mode=console-translate|console-createpseudotranslatetmx|console-align \
--source-pattern={regexp}
Explanation:
• <project-dir> tells OmegaT where to find the project to translate. If given, OmegaT starts
in console mode and will translate the given project.
• --config-dir=<config-dir> tells OmegaT in which folder the configuration files are stored.
If not given, OmegaT reverts to default values (OmegaT folder under user home or, if
unavailable, the current working folder). Note the double hyphen (--).
• --mode=... starts OmegaT in console mode to perform one of the following services
automatically:
• --mode=console-translate
In this mode, OmegaT will attempt to translate the files in /source/ with the available
translation memories. This is useful to run OmegaT on a server with TMX files
automatically fed to a project.
• --mode=console-createpseudotranslatetmx
In this mode OmegaT will create a TMX for the whole project, based on the source files
only. You specify the TMX file to be created with
--pseudotranslatetmx=allsegments.tmx --pseudotranslatetype=[equal|empty]
• --mode=console-align
In this mode, OmegaT will align the Java properties files found in the /source/ folder of
the project to the contents found at the specified location. The resulting TMX is stored in
the /omegat/ folder under the name align.tmx.
An additional parameter is required in this case, specifying the location of the target data:
--alignDir={location of translated files}
alignDir must contain a translation in the target language of the project. For instance, if
the project is EN->FR, alignDir must contain a bundle ending with _fr.
• --source-pattern={regexp}
When a mode has been specified, this option selects the files to be processed automatically.
If the parameter is not specified, all files will be processed. Here are a few typical examples
to limit your choice:
• .*\.html
All HTML files will be translated - note that the period in the usual *.html has to be escaped
(\.) as specified by the rules for regular expressions
• test\.html
Only the file test.html at the root of the source folder will be translated. If there are other
files named test.html in other folders, they will be ignored.
• dir-10\\test\.html
Only the file test.html in the folder dir-10 will be processed. Again note that the backslash
is escaped as well.
• --output-tag-validation-={regexp}
• --tag-validation=[abort|warn] outputFileName
This option enables tag validation in batch mode. If abort is selected, the tag validator
will stop on the first invalid segment. If warn is specified, the tag validator will process all
segments and write warnings about any segments with invalid tags into the file specified.
• --no-team addresses projects set up for team work. Use it if OmegaT is not to synchronize
the project contents.
• --disable-project-locking allows the same project to be opened, under Windows, with several
instances of OmegaT. By default, under Windows, omegat.project is locked, and an error
message is received when trying to open a project already opened in another instance of
OmegaT. With that option, no locking occurs.
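To see how the --source-pattern examples above select files, the following sketch tests relative paths against a pattern with grep -E. It rests on two assumptions that are not stated in this guide: that the pattern has to match the whole relative path, and that grep's extended regular expressions behave like OmegaT's for simple patterns such as these. The function name and the sample paths are made up for the illustration.

```shell
# Hypothetical helper: succeed when the whole path ($2) matches the pattern ($1).
source_pattern_matches() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

# Try the first example pattern against a few sample paths.
for path in index.html test.html docs/test.html notes.txt; do
    if source_pattern_matches '.*\.html' "$path"; then
        echo "$path: processed"
    else
        echo "$path: skipped"
    fi
done
```

Note the single quotes around the pattern: they stop the shell from interpreting the backslashes and the asterisk before the pattern is used.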
7.3.3. Quiet option
An extra command line parameter specific to console mode is --quiet. In quiet mode, less
information is logged to the screen. The messages you would usually find in the status bar are not
displayed.
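A console-mode call that combines the options described above might look like the sketch below. The helper name, project path and pattern are placeholders; the function only prints the command, so the result can be inspected before it is run on a real project.

```shell
# Hypothetical helper: print a console-translate command; nothing is executed.
# $1 = project folder, $2 = regular expression for --source-pattern.
console_translate_cmd() {
    printf "java -jar OmegaT.jar %s --mode=console-translate --source-pattern='%s' --quiet\n" \
        "$1" "$2"
}

console_translate_cmd /path/to/project '.*\.html'
```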
This will create a full distribution of OmegaT in the ./dist/ folder, where you will find all the
files necessary to run OmegaT.
Chapter 4. The user interface
1. Main OmegaT window, other windows and dialogs
The OmegaT main window contains the main menu, status bar and several panes. Additional
windows are available, as well as dialogs, used to change OmegaT project settings. The
information below summarizes their use and how they are invoked:
Table 4.2. Other windows
Tag Validation window     Used to validate tags (open with Ctrl+Shift+V, close with Esc)
Help browser              Used to display the user manual (open with F1)
Statistics window         Used to display the statistics of the project (open with Tools →
                          Statistics)
Match statistics window   Used to display the match statistics of the project (open with
                          Tools → Match statistics)
Table 4.3. Settings dialogs
Project properties   Used to modify the project folders and languages (access via the Ctrl+E
                     shortcut or Project → Properties..., close via Esc)
Font                 Used to modify the font used by OmegaT to display source, translation,
                     matches and glossary terms (access via Options → Font..., close via Esc)
File filters         Used to adjust the handling of supported file formats (access via
                     Options → File Filters..., close via Esc)
The main window consists of several panes, the main menu and a status bar. You can change
the position of any pane or even undock it to a separate window by clicking and dragging
the pane by its name. Depending on the pane status, different signs can appear at its top
right corner:
Note
If you cannot see all the panes (whether opened or minimized), selecting Options → Restore
Main Window will restore them to the state defined at installation.
Table 4.4. Pane widgets
You can overlap panes if desired. When this is done the panes display a tab at the top. The
separators between the panes can be dragged to resize panes. Should you lose track of your
changes to the user interface, you can use Options → Restore Main Window at any time to
return to the original layout.
It is possible to drag and drop files to each pane, which will react accordingly.
• Editor pane: If an OmegaT project file (omegat.project) is dropped on this pane, the
corresponding project will be opened, after any open project has been closed. Other dropped files
will be copied to the source folder. This also applies to the Project files window.
• Fuzzy Matches pane: Dropped .tmx files will be copied to the tm folder.
• Glossary pane: Dropped files with known glossary extensions (.txt, .tab, etc.) will be copied
to the glossary folder.
The counters in the lower right corner keep track of the progress of the translation (numbers
in the left hand column refer to the figure above):
From a practical point of view, the most important pair of numbers is the second pair: it tells
you how much you have done so far in relation to the total, which is the second number. The
project in the example is evidently finished, as all the unique segments have been translated.
2.1. Editor pane
This is where you type and edit your translation. The Editor pane displays the text of the
partially translated document: the text already translated is displayed in translation while
the untranslated text is displayed in the original language. The displayed text is split into
segments and you may scroll through the document and double-click on any segment to open
and edit it. In the above case, the segments already translated are shown in yellow.
One of the above segments is the current segment. It is the segment that is displayed in two
parts. The upper part is in the source language, in bold characters with a green background
color; the lower part is the editing field, which ends with a marker: <segment nnnn>,
where nnnn is the number of the segment in the project. Use the upper part as a reference and
replace or modify the contents of the editing field with your translation.
Note: the segment marker displays <segment nnnn +yy more> when the segment is non-
unique. In that case, yy is the number of other occurrences of the segment in the project.
Depending upon the preferred editing behavior, the editing field for the untranslated segment
may be empty, contain the source text, or contain the translation of the string most similar
to the one to be translated. When you move to another segment, the translation is validated
and stored. If you want the translation to be the same as the source, simply make the editing
field empty by removing all the text (select all with Ctrl+A and delete with Del). OmegaT is
able to store translations that are identical to the source. This is useful for documents that
contain trade marks, names or other proper nouns, or parts in a third language that do not
require translation. See Translation editing for more details.
If you right-click on the Editor pane, a pop-up menu opens, offering Cut, Copy, Paste (i.e. the
same functions as Ctrl+X, Ctrl+C and Ctrl+V), Go to segment and Add glossary entry
functions. In addition, when the right-click occurs on an opened segment, other options
concerning alternative translations are offered, for example to jump to another
instance of a non-unique segment.
It is possible to drag text from anywhere in the main window and to drop it within the segment.
Text dragged from outside the target segment is copied, while text dragged from within
the segment is moved.
By default, it is not possible to select words in the source segment using the keyboard rather
than the mouse. Pressing the F2 key lets you move the cursor into the source segment (or
anywhere in the editor) with the keyboard arrows. In this mode, "Cursor lock off" is displayed
at the bottom of the pane. To return to the standard mode, "Cursor lock on", press F2
again.
The match viewer shows the most similar segments from translation memories, both from
internal project translation memory created in real time as you translate your project and
from ancillary translation memories you have imported from your earlier jobs, or received
from your client or translation agency.
When you move to the next segment, the first fuzzy match (the one with the best matching
percentage) is automatically selected. You may select a different match by pressing Ctrl+2,
3, 4, or 5. Of course, pressing Ctrl+5 will have no effect, if there is no match #5. To use the
selected match in your translation, use Ctrl+R to replace the target field with the match or
use Ctrl+I to insert it at the cursor position.
The matching percentage is roughly equivalent to taking the number of common words in
the matched and the matching segment and dividing by the number of words in the longer
of the two. The selected fuzzy match is highlighted in bold, words that are missing in the
segment you are translating are colored blue and words adjacent to the missing parts green.
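The rough rule above can be sketched in a few lines of shell and awk. This is only the approximation described in the text (common words divided by the word count of the longer segment), not OmegaT's actual matching code; the function name is hypothetical and words are simply split on whitespace.

```shell
# Hypothetical helper: rough match percentage between two segments.
# Counts the words the segments have in common and divides by the
# number of words in the longer segment.
match_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { n1 = split($0, a, " "); for (i = 1; i <= n1; i++) seen[a[i]]++ }
        NR == 2 {
            n2 = split($0, b, " ")
            for (i = 1; i <= n2; i++)
                if (seen[b[i]] > 0) { common++; seen[b[i]]-- }
            longer = (n1 > n2) ? n1 : n2
            if (longer == 0) { print 0; exit }
            printf "%d\n", (100 * common) / longer
        }'
}

match_percent "Context menu command" "Context menu commands"   # prints 66
```

Two of the three words are shared, so the estimate is 2/3, truncated to 66.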
In the above example the source segment is Context menu command. The top match is
100%, because all words match. So do the next two matches, and the match #4 is similar,
but different. The line with the matching percentage also includes the name of the translation
memory containing the match. If there's no file name displayed, the source is the internal
project translation memory. Orphan segments (the match #2) describe segments in the
default project translation memory that have no corresponding source segment.
There are actually three match estimates available (66/66/30 in the case of the match #4
above). They are defined as follows:
• default OmegaT match - number of matched words - with numerals and tags ignored -
divided by the total word count
The figure above shows the default match display template. The contents can be customized
using the following variables:
2.3. Glossary pane
The Glossary pane allows you to access your own collection of expressions and specialist
terminology which you have built up in your glossary files. It shows translations of terms
found in the current segment. The source segment in the example below was “Context menu
command”, as in the Fuzzy Matches example above, and the terms shown were found in the
available glossaries (Microsoft's Term Collection and the Slovenian Linux User Group glossary).
If you have the TransTips option activated (Options → TransTips), you can right-click on the
highlighted word in the source segment to open a pop-up menu with suggested translations,
as offered by your glossary. Selecting one of them will insert it at the current cursor position
into the target segment. You can also highlight your preferred alternative in the glossary pane
and insert it into the target by right-clicking on the selection.
2.4. Dictionary pane
Dictionaries are the electronic equivalents of printed dictionaries like Merriam Webster,
Duden, Larousse etc., that you may have on your desk. See more about them in the chapter
on Dictionaries.
2.6. Notes pane
The translator can add notes to the opened segment, for instance to come back later to the
segment and redo the translation, check that alternative translations are correct or to ask
colleagues for their opinion. You can browse through notes using GoTo → Next Note and GoTo
→ Previous Note.
2.7. Comments pane
Some file formats specialized for translation work, for instance PO, allow the inclusion
of comments. This way the translator can be given context about the segment to
be translated. In the example below, the author of the PO file included a warning for the
translator to check the length of the translation:
Figure 4.5. Comments pane
2.9. Main menu
The main menu provides access to all OmegaT functions. See the Main Menu appendix for
a full description of all menus and menu items. The most frequently used functions are
accessible with keyboard shortcuts, so once you become accustomed to them, you will no
longer need to browse through the menus while translating. See chapter Menu and Keyboard
shortcuts for details.
2.10. Status bar
The status bar displays work-flow related messages at the bottom of the main window. This
bar gives the user feedback on specific operations that are in progress. It also displays the
number of fuzzy and glossary matches for the current segment.
3. Other windows
3.1. Project files
The Project Files window lists the project files and displays other project information. It is
displayed automatically when OmegaT loads a project.
Use Ctrl+L to open and Esc to close it. The Project Files Window displays the following
information:
• the total number of translatable files in the project. These are the files present in the /
source folder in a format that OmegaT is able to recognize. This number is displayed in
brackets, next to the "Project file" title
• the list of all translatable files in the project. Clicking on any file will open it for translation.
Typing any text will open a Filter field where parts of filenames can be entered. You can
select a file with Up and Down keys, and open it for translation by pressing Enter
Note: filenames (in the first column) can be sorted alphabetically by clicking on the header. It
is also possible to change the position of a filename by clicking on it and pressing the Move ...
buttons.
Right-clicking on a filename opens a pop-up menu that lets you open the source file and (if it
exists) the target file.
• File entries include their names, file filter types, their encoding and the number of segments
each file contains
• the total number of segments, the number of unique segments in the whole project, and
the number of unique segments already translated are shown at the bottom
The set of Unique segments is computed by taking all the segments and removing all
duplicate segments. (The definition of “unique” is case-sensitive: "Run" and "run" are treated
as being different)
The difference between "Number of segments" and "Number of unique segments" provides
an approximate idea of the number of repetitions in the text. Note however that the numbers
do not indicate how relevant the repetitions are: they could mean relatively long sentences
repeated a number of times (in which case you are fortunate) or it could describe a table of
keywords (not so fortunate). The project_stats.txt located in the omegat folder of your project
contains more detailed segment information, broken down by file.
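As a small illustration of these counts, assume the segments have been written one per line to a text stream. This is a simplification made for the sketch; it is not how OmegaT stores segments, and the function name is hypothetical. The comparison is exact, so "Run" and "run" are counted as different segments.

```shell
# Hypothetical helper: read one segment per line and report the counts.
segment_counts() {
    awk '
        { total++; if (!($0 in seen)) { seen[$0] = 1; unique++ } }
        END { printf "segments=%d unique=%d repetitions=%d\n", total, unique, total - unique }
    '
}

printf '%s\n' "Run" "run" "Run" "Open file" | segment_counts
# prints: segments=4 unique=3 repetitions=1
```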
Modifying the segmentation rules may have the effect of modifying the number of segments/
unique segments. This, however, should generally be avoided once you have started
translating the project. See the chapter Segmentation rules for more information.
Adding files to the project: You can add source files to the project by clicking on the
"Import Source Files..." button. This copies the selected files to the source folder and reloads
the project to import the new files. You can also add source files from Internet pages, written
in MediaWiki, by clicking on the "Import from MediaWiki" button and providing the corresponding
URL.
3.2. Search window
You can use the search window to find specific segments in a project. You can also have
several search windows open simultaneously. To open a new search window, use Ctrl+F in
the Main window. The search window consists of a text field for search strings or keywords,
flags and radio buttons for setting up the search and a display area containing the results of
the search. See the chapter Searches for more information about the search window.
3.3. Tag validation
The tag validation window detects and lists any tag errors and inconsistencies in the
translation. Open the window with Ctrl+Shift+V. The window features a three-column table with
a link to the segment and its source and target contents:
Tags are highlighted in bold blue for easy comparison between the original and the translated
contents. Click on the link to jump to the segment in the Editor pane. Correct the error if
necessary and press Ctrl+Shift+V to return to the tag validation window to correct other errors. In
the first and third case above tags are paired incorrectly, and in the second case the < sign
is missing from the starting tag.
Tag errors are cases in which the tags in the translation do not correspond in order and
number to the original segment. Some tag scenarios flagged in the tag validation window
are necessary and are benign, others will cause problems when the translated document is
created. Tags generally represent some kind of formatting in the original text. Simplifying the
original text formatting in the source file before commencing translation greatly contributes
to reducing the number of tags.
3.4. Statistics
The statistics window, accessed via Tools → Statistics, shows the statistics of the current
OmegaT project, both in summary form and in detail for every file to be translated.
The statistics shown are also available as a tab-separated project_stats.txt file (subfolder omegat),
ready to be loaded into a spreadsheet program for the user's convenience. You can use
Ctrl+A, Ctrl+C, Ctrl+V to copy/paste the contents.
Figure 4.7. project statistics
3.5. Match statistics
The match statistics are accessed via Tools → Match Statistics. The evaluation is rather CPU
intensive and can be time-consuming, so a progress bar is shown during the calculation.
As far as categories are concerned, the de facto industry standard of classifying matches
into the following groups is used: Repetitions, Exact match, 95%-100%, 85%-94%, 75%-84%,
50%-74% and No match. This information is computed for segments as well as for words
and for characters (without and including spaces). Note that there could be minor differences
between the OmegaT counts and the numbers provided by other CAT tools.
Figure 4.8. Match statistics
Note that these totals are a good (or as good as they can be) approximation of the work
involved in the project and thus can serve as a basis for your cost and price calculations.
Spaces between segments are not taken into account in the last column. Repetitions stand for
identical segments present several times in the text. The first segment and its contents will
be classified as "no match", and the rest of them as a repetition of the first. If the translation
for several identical source segments already exists in the translation memory of the project,
these segments, together with other, already translated unique segments, will be classified
as an "Exact match". The number of unique segments, if needed, is provided in the standard
statistics window, regardless of whether they have been translated or not.
The rest of the categories (50-100%) involve untranslated segments with a fuzzy match.
Fuzzy matches can come from the /tm folder as well, and not just from the internal translation
memory in /omegat, as is the case for repetitions and exact matches. The only difference with
matches from the project_save translation memory is that external TMs cannot give exact
matches, only 100%. If one does not wish to use external TMs for counting, one will either
have to empty the /tm folder or change the project setup (temporarily) so that the value for /
tm points to a different location.
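The percentage bands used for fuzzy matches can be expressed as a simple lookup. The sketch below only covers the fuzzy bands and "No match"; repetitions and exact matches are determined by comparing segments, not by a percentage, so they are left out. The function name is hypothetical, and the input is assumed to be a whole number from 0 to 100.

```shell
# Hypothetical helper: map a match percentage to its statistics band.
match_band() {
    pct=$1
    if   [ "$pct" -ge 95 ]; then echo "95%-100%"
    elif [ "$pct" -ge 85 ]; then echo "85%-94%"
    elif [ "$pct" -ge 75 ]; then echo "75%-84%"
    elif [ "$pct" -ge 50 ]; then echo "50%-74%"
    else echo "No match"
    fi
}

match_band 88    # prints 85%-94%
```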
The Match Statistics are tab-separated and you can use Ctrl+A, Ctrl+C, Ctrl+V to copy/
paste them, for instance into a spreadsheet or into your cost-accounting application. Once
computed, the data are also available in omegat/project_stats_match.txt. Note that the file is
time-stamped, as the calculation (contrary to the standard statistics) is not instantaneous
and can thus quickly become obsolete.
3.6. Help browser
The help browser (which displays this manual) can be opened by pressing F1 or navigating
to Help → User Manual... in the main menu. In the window, the manual and two buttons are
displayed: Back and Contents. The user manual is an HTML document with links to different
chapters. Clicking on a link as you would do in a web browser brings the desired page to
the front.
The user manual is located in the docs subfolder under the OmegaT installation folder, so you
can, for instance, view the English documentation by opening the docs/en/index.html
file in your browser. Opening the user manual in this way also enables you to follow external
links, as the built-in help browser does not accept external Internet links.
Chapter 5. Menu and Keyboard shortcuts
1. Main Menu
All of OmegaT's functions are available through the menu bar at the top of the Editor window.
Most functions are also available via keyboard shortcuts. Shortcuts are activated by pressing
Ctrl and a letter. Some shortcuts involve other keys. For readability purposes, letters are
written in uppercase here. Ctrl is used on Windows, UNIX and UNIX-like operating systems
with keyboards featuring a Ctrl or Control key. Mac users should use Cmd+key
instead. The "Cmd" key either has a "command" label or an apple symbol on Apple keyboards.
You can customize existing shortcuts or add new ones according to your needs. See Appendix
- Shortcuts Customization.
Table 5.1. Main Menu
Project Edit Go to View Tools Options Help
1.1. Project
Table 5.2. Project menu
New...                          Ctrl+Shift+N   Creates and opens a new project. The dialog to create a
                                               project is the same as to edit the project. See Chapter 6,
                                               Project properties
Download Team Project...                       Creates a local copy of a remote OmegaT project.
Open...                         Ctrl+O         Opens a previously created project.
Open Recent Project                            Gives access to the five most recently edited projects.
                                               Clicking on one will save the current project, close it
                                               and open the other project.
Copy Files to Source Folder...                 Copies the selected files to the source folder and reloads
                                               the project to load the new files.
Download MediaWiki Page...                     Downloads units from MediaWiki pages, based on the URL
                                               entered.
Reload                          F5             Reloads the project to take external changes in source
                                               files, legacy translation memories, glossaries and project
                                               settings into account.
Close                           Ctrl+Shift+W   Saves the translation and closes the project.
Save                            Ctrl+S         Saves the internal translation memory to the hard disk.
                                               OmegaT automatically saves translations every 10 minutes
1.2. Edit
Note: Items that are found in most applications (copy/cut/paste) are not displayed in this
menu, but are available using your system shortcuts. For example:
Table 5.3. Copy/cut/paste shortcuts
Copy    Ctrl+C   Copies the selected text to the clipboard.
Cut     Ctrl+X   Copies the selected text to the clipboard and deletes the selected text.
Paste   Ctrl+V   Pastes the text from the clipboard at the cursor position.
Table 5.4. Edit menu
Undo Last Action Ctrl+Z Restores the status before the
last editing action was taken.
Pressing Ctrl+Shift+F displays the last already opened Search window (instead of opening a
new one).
Search and Replace...   Ctrl+K   Opens a new Search and replace window.
1.3. Go to
Table 5.5. Go To menu
1.4. View
Table 5.6. View menu
Mark Translated Segments               If checked, the translated segments will be marked in yellow.
Mark Untranslated Segments             If checked, the untranslated segments will be marked in violet.
Display Source Segments                If checked, source segments will be shown and marked in green.
                                       If not checked, source segments will not be shown.
Mark Non-Unique Segments               If checked, non-unique segments will be marked in pale grey.
Mark Segments with Notes               If checked, segments with notes will be marked in cyan. This
                                       marking has priority over Mark Translated Segments and Mark
                                       Untranslated Segments.
Mark Non-breakable Spaces              If checked, non-breakable spaces will be displayed with a grey
                                       background.
Mark Whitespace                        If checked, white spaces will be displayed with a small dot.
Mark Bidirectional Algorithm Control   This option displays bidirectional control characters
Characters                             [http://www.w3.org/International/questions/qa-bidi-controls]
Mark Auto-Populated Segments           If checked, the background of all segments where the target
                                       segment has been auto-populated (from TMXs placed in /tm/auto
                                       for example) is displayed in colour. The colours are displayed
                                       as long as the "Save auto-populated status" (in Options/
                                       Editing behaviour..) option is checked. Usual translations
                                       inserted from the auto folder are displayed in orange. Other
                                       translations, identified specifically in the TMX, can be
                                       displayed using different colours. For technical details, see
                                       the Request For Enhancement
                                       [http://sourceforge.net/p/omegat/feature-requests/963/]
Aggressive Font Fallback               Check this option in case OmegaT does not display some glyphs
                                       properly (even if the fonts containing the relevant glyphs
Note: colors are customizable through the Options / Custom colours... dialog.
1.5. Tools
Table 5.7. Tools menu
Validate Tags                        Ctrl+Shift+V: Checks for missing or displaced tags in formatted
                                     files. Will display a list of segments with tag errors and
                                     possible inconsistencies. See Tag Validation and Chapter 12,
                                     Working with formatted text.
Validate Tags for Current Document   Same as above, but only for the current document in translation.
Statistics                           Opens a new window and displays the project statistics, i.e. the
                                     project totals and totals for every file in the project.
Match Statistics                     Displays the Match Statistics for the project: the number of
                                     repetitions, exact matches, fuzzy matches and no-matches, for
                                     segments, words and in characters.
Match Statistics per File            Displays the Match Statistics for each file of the project: the
                                     number of repetitions, exact matches, fuzzy matches and
                                     no-matches, for segments, words and in characters.
Scripting...                         Opens a dialog box where the location of scripts can be set, and
                                     where scripts can be written, run and associated with a shortcut
                                     (see Scripts window)
1.6. Options
Table 5.8. Options menu
Use TAB To Advance    Sets segment validation key to Tab instead of the default Enter. This
                      option is useful for some Chinese, Japanese or Korean character input
                      systems.
Always Confirm Quit   The program will ask for confirmation before closing down.
Machine Translate     Allows you to activate/deactivate the Machine Translation tools offered.
                      When active, Ctrl+M will insert the suggestion into the target part of the
                      current segment.
1.7. Help
Table 5.9. Help menu
User Manual... F1: Opens this manual in the default browser.
About... Displays copyright, credits and license
information.
Last Changes... Displays the list of new functionalities,
enhancements and bug fixes for each new
release.
Log... Displays the current log file. The title of the
dialog reflects the file actually used (which
depends on how many instances of OmegaT
are running concurrently).
2. Keyboard shortcuts
The following shortcuts are available from the main window. When another window is in the
foreground, click on the main window to bring it to the foreground or press Esc to close the
other window.
Shortcuts are activated by pressing Ctrl and a letter. Some shortcuts involve other keys. For
readability purposes, letters are written in uppercase here.
Ctrl is used on Windows, UNIX and UNIX-like operating systems with keyboards featuring a
Ctrl / Control key. Mac users should instead use the cmd+key. On Apple keyboards the cmd
key either has a command label or an Apple icon on it.
• Project management
• Editing
• Moving around
• Reference windows
• Other
2.1. Project management
Table 5.10. Project management shortcuts
Open project Ctrl+O Displays a dialog to locate an
existing project.
Save Ctrl+S Saves the current work to the
internal translation memory
(file project_save.tmx located
in the project's omegat
folder).
Close Project Shift+Ctrl+W Closes the current project.
Create Translated Documents Ctrl+D Creates the translated
documents in the project's
Target folder and creates
translation memory files
(level1, level2 and omegat
tmx files) in the project's root
folder.
Project properties Ctrl+E Displays the project's settings
for modification, if required.
2.2. Editing
Table 5.11. Editing shortcuts
Undo last action Ctrl+Z Undoes the last editing
actions in the current target
segment
Redo last action Ctrl+Y Redoes the last editing
actions in the current target
segment
Select match #N Ctrl+#N #N is a digit from 1 to 5.
The shortcut selects the Nth
match displayed in the match
window (the first match is
selected by default)
Replace with match Ctrl+R Replaces the current target
segment contents with the
selected match (the first
match is selected by default)
Insert match Ctrl+I Inserts the selected match
at the cursor position in the
current target segment (the
first match is inserted by
default)
Replace with source Ctrl+Shift+R Replaces the current target
segment contents with the
source text contents
2.3. Moving around
Table 5.12. Moving around shortcuts
Next Untranslated Segment Ctrl+U Moves the editing field to
the next segment that is
not registered in the project's
translation memory
Next Segment Ctrl+N, Enter or Return Moves the editing field to the
next segment.
Previous Segment Ctrl+P Moves the editing field to the
previous segment
Segment number... Ctrl+J Moves to the segment
number entered
Back in history... Ctrl+Shift+P Moves one segment back in
history
Forward in history... Ctrl+Shift+N Moves one segment forward
in history
2.4. Other
Table 5.13. Various shortcuts
Project files listing Ctrl+L Displays the Project files
listing
Validate Tags Ctrl+Shift+V Opens the Tag validation
window.
Export Selection Shift+Ctrl+C Exports the current selection
or the current source, if no
text has been selected. The
text is exported to a plain text
file.
Search Project Ctrl+F Opens a new Search window.
Chapter 6. Project properties
1. Generalities
The Project → Properties... (Ctrl+E) dialog is used to define and modify the project folders
and languages.
It is possible to modify the project properties during a translation session. Note that changes
to the project setup may have consequences, especially when the project has already
been started. Until you have some experience with OmegaT, it is safest to consider all settings
final once the translation has started – unless of course you realize a major mistake has been
made. See the section Preventing data loss for ways and means of protecting your work.
2. Languages
You can either enter the source and target languages by hand or use the drop-down
menus. Bear in mind that changing the languages may render the currently used translation
memories useless, since their language pair may no longer match the new languages.
Tokenizers corresponding to the selected languages are displayed. See the Tokenizers appendix
for details.
3. Options
Enable Sentence-level segmentation The segmentation settings only address the way the source
files are handled by OmegaT. The predominant way of
segmenting the sources is sentence-level segmentation, so
this check box should normally be left checked.
Segmentation... The segmentation rules are generally valid across all
projects. The user, however, may need to create a set
of rules specific to the project in question. Use this button
to open a dialog, activate the check box Project specific
segmentation rules, then proceed to adjust the segmentation
rules as desired. The new set of rules will be stored
together with the project and will not interfere with the
general set of segmentation rules. To delete project specific
segmentation rules, uncheck the check box. See chapter
Source Segmentation for more information on segmentation
rules.
File Filters... In a similar fashion to the above, the user can create project-specific
File filters, which will be stored together with the
project and will be valid for the current project only. To create
a project-specific set of file filters, click on the File Filters...
button, then activate the Enable project specific filters check box
in the window that opens. A copy of the changed filters
configuration will be stored with the project. To delete project
specific file filters, uncheck the check box. Note that in the
menu Options->File Filters, the global user filters are changed,
not the project filters. See chapter File filters for more on the
subject.
Hint: the set of file filters for a given project is stored as project/omegat/filters.xml.
Remove Tags When enabled, all the formatting tags are removed from
source segments. This is especially useful when dealing with
texts where inline formatting is not really useful (e.g., OCRed
PDF, badly converted .odt or .docx, etc.). Normally it
should always be possible to open the target documents,
as only inline tags are removed. Non-visible formatting (i.e.,
formatting which doesn't appear as tags in the OmegaT editor) is retained
in target documents.
4. File locations
Here you can select different subfolders, for instance the subfolder with source files, subfolder
for target files etc. If you enter names of folders that do not exist yet, OmegaT creates them
for you. In case you decide to modify project folders, keep in mind that this will not move
existing files from old folders to the new location.
Click on Exclusions... to define the files or folders that will be ignored by OmegaT. An ignored
file or folder:
• is not copied to the /target folder during the translated files creation process.
In the Exclusion patterns dialog, it is possible to Add or Remove a pattern, or edit one by
selecting a line and pressing F2. It is possible to use wildcards, using the Ant syntax
[https://ant.apache.org/manual/dirtasks.html#patterns].
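The wildcard behaviour of Ant-style patterns can be illustrated with a short Python sketch. This is not OmegaT's own code: the function name ant_to_regex and the sample patterns are hypothetical, and the sketch assumes relative paths written with forward slashes. It covers only the basic rules of the Ant syntax: * matches within one path segment, ? matches one character, ** matches any number of folders, and a pattern ending in / stands for everything below that folder.

```python
import re


def ant_to_regex(pattern):
    """Translate a simplified Ant-style pattern into a compiled regular expression."""
    pattern = pattern.replace("\\", "/")
    if pattern.endswith("/"):
        # In Ant, a trailing slash is shorthand for "everything below this folder".
        pattern += "**"
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:[^/]+/)*")   # zero or more whole folders
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")            # anything, including further folders
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")         # anything inside one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")          # exactly one character
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


# Hypothetical exclusion patterns and the paths they would ignore.
for rule, path in [("**/.svn/**", "docs/.svn/entries"),
                   ("**/*.bak", "docs/old.bak"),
                   ("*.txt", "docs/a.txt")]:
    print(rule, path, bool(ant_to_regex(rule).match(path)))
```

Note that *.txt does not reach into subfolders, whereas **/*.txt would.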
Chapter 7. File Filters
OmegaT features highly customizable filters, enabling you to configure numerous aspects.
File filters are pieces of code capable of:
• Reading the document in some specific file format. For instance, plain text files.
To see which file formats can be handled by OmegaT, see the menu Options > File Filters ...
Most users will find the default file filter options sufficient. If this is not the case, open the
main dialog by selecting Options → File Filters... from the main menu. You can also enable
project-specific file filters, which will only be used on the current project, by selecting the File
Filters... option in Project Properties.
You can enable project specific filters via Project → Properties.... Click on the File Filters...
button and activate the check box Enable project specific filters. A copy of the filters
configuration will be stored with the project in this case. If you later change filters, only the
project filters will be updated, while the user filters stay unchanged.
Warning! Should you change filter options whilst a project is open, you must reload the
project in order for the changes to take effect.
• Remove leading and trailing tags: uncheck this option to display all the tags including the
leading and trailing ones. Warning: in Microsoft Open XML formats (docx, xlsx, etc.), if all
tags are displayed, DO NOT write text before the first tag (it is a technical tag that must
always begin the segment).
• Preserve spaces for all tags: check this option if the source documents contain significant
spaces (for layout matters) that must not be ignored.
• Ignore file context when identifying segments with alternate translations: by default,
OmegaT uses the source file name as part of the identification of an alternate translation.
If the option is checked, the source file name will not be used, and alternative translations
will take effect in any file as long as the other context (previous/next segments, or some
sort of ID depending on the file format) matches.
2. Filter options
Several filters (Text files, XHTML files, HTML and XHTML files, OpenDocument files and
Microsoft Open XML files) have one or more specific options. To modify the options select the
filter from the list and click on Options. The available options are:
Text files
If sentence segmentation rules are active, the text will further be segmented according to
the option selected here.
PO files
If on, when a PO segment (which may be a whole paragraph) is not translated, the
translation will be empty in the target file. Technically speaking, the msgstr segment in the
PO target file, if created, will be left empty. As this is the standard behavior for PO files, it
is on by default. If the option is off, the source text will be copied to the target segment.
• Skip PO header
The option allows OmegaT to override the specification in the PO file header and use the
default for the selected target language.
XHTML Files
• Translate the following attributes: the selected attributes will appear as segments in the
Editor window.
• Start a new paragraph on: the <br> HTML tag will constitute a paragraph for segmentation
purposes.
• Skip text matching regular expression: the text matching the regular expression gets
skipped. It is shown in red in the tag validator. Text in the source segment that matches
is shown in italics.
• Do not translate the content attribute of meta-tags ... : The following meta-tags will not
be translated.
• Do not translate the content of tags with the following attribute key-value pairs (separate
with commas): a match in the list of key-value pairs will cause the content of tags to be
ignored.
It is sometimes useful to be able to make some tags untranslatable based on the value of
attributes. For example, <div class="hide"> or <span translate="no">. You can define
key-value pairs for tags to be left untranslated. For the example above, the field would contain:
class=hide, translate=no
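The effect of such a key-value list can be sketched in Python. This is only an illustration of the idea, not the filter's actual implementation: the names parse_pairs and TranslatableTextCollector are hypothetical, and the sketch assumes well-nested markup in which every skipped element has a closing tag.

```python
from html.parser import HTMLParser


def parse_pairs(field):
    """Turn 'class=hide, translate=no' into {('class', 'hide'), ('translate', 'no')}."""
    pairs = set()
    for item in field.split(","):
        key, sep, value = item.strip().partition("=")
        if sep:
            pairs.add((key.strip(), value.strip()))
    return pairs


class TranslatableTextCollector(HTMLParser):
    """Collects text, ignoring the content of tags that carry a listed key-value pair."""

    def __init__(self, pairs):
        super().__init__()
        self.pairs = pairs
        self.skip_depth = 0   # greater than 0 while inside an untranslatable element
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if self.skip_depth or any((key, value) in self.pairs for key, value in attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.texts.append(data.strip())


sample = ('<p>Keep me</p><div class="hide">Hidden <b>bold</b></div>'
          '<span translate="no">Brand</span><p>Also keep</p>')
collector = TranslatableTextCollector(parse_pairs("class=hide, translate=no"))
collector.feed(sample)
collector.close()
print(collector.texts)
```

Only the two paragraphs outside the marked elements are collected for translation.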
You can select which elements are to be translated. They will appear as separate segments
in the translation.
• Other Options:
• Aggregate tags: if checked, tags without translatable text between them will be
aggregated into single tags.
• Preserve spaces for all tags: if checked, "white space" (i.e., spaces and newlines) will be
preserved, even if not set technically in the document
• Add or rewrite encoding declaration in HTML and XHTML files: frequently the target files
must use a character encoding different from the one in the source file (whether it
is explicitly defined or implied). Using this option the translator can specify whether the
target files are to have the encoding declaration included. For instance, if the file filter
specifies UTF-8 as the encoding scheme for the target files, selecting Always will ensure that
this information is included in the translated files.
• Translate the following attributes: the selected attributes will appear as segments in the
Editor window.
• Start a new paragraph on: the <br> HTML tag will constitute a paragraph for segmentation
purposes.
• Skip text matching regular expression: the text matching the regular expression gets
skipped. It is shown in red in the tag validator. Text in the source segment that matches
is shown in italics.
• Do not translate the content attribute of meta-tags ... : The following meta-tags will not
be translated.
• Do not translate the content of tags with the following attribute key-value pairs (separate
with commas): a match in the list of key-value pairs will cause the content of tags to be
ignored.
It is sometimes useful to be able to make some tags untranslatable based on the value of
attributes. For example, <div class="hide"> or <span translate="no">. You can define
key-value pairs for tags to be left untranslated. For the example above, the field would contain:
class=hide, translate=no
• Remove HTML comments in translated document: all commented parts (between <!-- and
-->) won't be copied in the translated document.
filename patterns against the filename. For example, the pattern *.xhtml matches any file
with the .xhtml extension. If the appropriate filter is found, the file is assigned to it for
processing. For example, by default, XHTML filters are used for processing files with the .xhtml
extension. You can change or add filename patterns for files to be handled by each file filter.
Source filename patterns use wild card characters similar to those used in Searches. The '*'
character matches zero or more characters. The '?' character matches exactly one character.
All other characters represent themselves. For example, if you wish the text filter to handle
readme files (readme, read.me, and readme.txt) you should use the pattern read*.
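The two wildcards behave like the shell-style patterns in Python's standard fnmatch module, which makes it easy to try a pattern before entering it in the dialog. This is a sketch for experimentation only; the helper name matches_source_pattern is hypothetical, and fnmatch additionally understands [...] character sets, which the text above does not mention.

```python
from fnmatch import fnmatchcase


def matches_source_pattern(filename, pattern):
    """True if filename matches a pattern built from * (any run) and ? (one character)."""
    return fnmatchcase(filename, pattern)


names = ["readme", "read.me", "readme.txt", "unread.txt", "page.xhtml"]
handled_by_text_filter = [n for n in names if matches_source_pattern(n, "read*")]
handled_by_xhtml_filter = [n for n in names if matches_source_pattern(n, "*.xhtml")]

print(handled_by_text_filter)
print(handled_by_xhtml_filter)
```

The pattern read* picks up all three readme variants but not unread.txt, because the pattern must match the file name from its first character.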
Only a limited number of file formats specify a mandatory encoding. File formats that do not
specify their encoding will use the encoding you set up for the extension that matches their
name. For example, by default .txt files will be loaded using the default encoding of your
operating system. You may change the source encoding for each different source filename
pattern. Such files may also be written out in any encoding. By default, the translated file
encoding is the same as the source file encoding. Source and target encoding fields use
combo boxes with all supported encodings included. <auto> leaves the encoding choice to
OmegaT. This is how it works:
• OmegaT identifies the source file encoding by using its encoding declaration, if present
(HTML files, XML based files)
• OmegaT is instructed to use a mandatory encoding for certain file formats (Java properties
etc)
• OmegaT uses the default encoding of the operating system for text files.
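The <auto> lookup order listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not OmegaT's implementation: the function name resolve_encoding, the mandatory parameter and the two regular expressions are hypothetical, and only the first bytes of a file are inspected.

```python
import locale
import re

# Encoding declared in an XML prolog, e.g. <?xml version="1.0" encoding="ISO-8859-2"?>
XML_DECLARATION = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')
# Encoding declared in an HTML meta tag, e.g. <meta charset="utf-8">
HTML_META = re.compile(rb'<meta[^>]*charset=["\']?([A-Za-z0-9._-]+)', re.IGNORECASE)


def resolve_encoding(head, mandatory=None):
    """Pick an encoding: declaration first, then a format's mandatory encoding, then the OS default."""
    for pattern in (XML_DECLARATION, HTML_META):
        found = pattern.search(head)
        if found:
            return found.group(1).decode("ascii")
    if mandatory:
        return mandatory
    return locale.getpreferredencoding(False)


print(resolve_encoding(b'<?xml version="1.0" encoding="ISO-8859-2"?><doc/>'))
print(resolve_encoding(b"greeting=Hello", mandatory="ISO-8859-1"))
print(resolve_encoding(b"Just some plain text"))
```

The last call returns whatever the operating system reports as its default, which is why plain text files are the ones most likely to be misread.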
3.3. Target filename
Sometimes you may wish to rename the files you translate automatically, for example adding
a language code after the file name. The target filename pattern uses a special syntax, so if
you wish to edit this field, you must click Edit... and use the Edit Pattern Dialog. If you wish
to revert to default configuration of the filter, click Defaults. You may also modify the name
directly in the target filename pattern field of the file filters dialog. The Edit Pattern Dialog
offers among others the following options:
• Default is ${filename} – full filename of the source file with extension: in this case the name
of the translated file is the same as that of the source file.
• ${nameOnly} – allows you to insert only the name of the source file without the extension.
• ${targetLanguage} – the target language and country code together (of the form "XX-YY").
Additional variants are available for the variables ${nameOnly} and ${extension}. In case the
file name is ambiguous, one can apply variables of the form ${nameOnly-extension
number} and ${extension-extension number}. If for example the original file is named
Document.xx.docx, the following variables will give the following results:
• ${nameOnly-0} Document
• ${nameOnly-1} Document.xx
• ${nameOnly-2} Document.xx.docx
• ${extension-0} docx
• ${extension-1} xx.docx
• ${extension-2} Document.xx.docx
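The numbered variants can be reproduced with a short Python sketch, which may help when planning a target filename pattern. It is an illustration only: the function name expand_target_pattern is hypothetical, just the variables named above are handled, and the treatment of plain ${nameOnly} and ${extension} (split at the last dot) is an assumption rather than documented behaviour.

```python
import re

VARIABLE = re.compile(r"\$\{(filename|targetLanguage|nameOnly|extension)(?:-(\d+))?\}")


def expand_target_pattern(pattern, source_name, target_language):
    """Expand target filename variables for one source file name."""
    parts = source_name.split(".")

    def substitute(match):
        variable, index = match.group(1), match.group(2)
        if variable == "filename":
            return source_name
        if variable == "targetLanguage":
            return target_language
        if index is None:
            # Assumption: without a number, split at the last dot.
            stem, dot, ext = source_name.rpartition(".")
            if not dot:
                stem, ext = source_name, ""
            return stem if variable == "nameOnly" else ext
        count = int(index) + 1
        if variable == "nameOnly":
            return ".".join(parts[:count])    # the first N+1 dot-separated parts
        return ".".join(parts[-count:])       # the last N+1 dot-separated parts

    return VARIABLE.sub(substitute, pattern)


for variable in ("${nameOnly-0}", "${nameOnly-1}", "${extension-0}", "${extension-1}"):
    print(variable, expand_target_pattern(variable, "Document.xx.docx", "fr-FR"))
```

A pattern such as ${nameOnly-0}_${targetLanguage}.${extension-0} would, under these assumptions, turn Document.xx.docx into Document_fr-FR.docx.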
Chapter 8. OmegaT Files and Folders
OmegaT works with three types of files.
• Translation project files: These constitute a translation project. Losing them may affect the
project's integrity and your ability to complete a job. Project files are the most important
files in OmegaT. They are the files you deal with on a daily basis while translating.
• User settings files: These are created when OmegaT's behavior is modified by user
preference settings. Losing them usually results in OmegaT reverting to its "factory
settings". This can sometimes cause a little trouble when you are in the middle of a
translation.
• Application files: These are included in the package you download. Most of them are
required in order for OmegaT to function properly. If for some reason these files are lost or
corrupted, simply download and/or reinstall OmegaT to restore them all.
When you create a translation project, OmegaT automatically creates a folder with the
specified name, and a list of folders:
Figure 8.1. OmegaT project
Alternate locations for some of the folders can be chosen at project creation or during the
translation. It is therefore possible to select existing folders or create folders in locations that
reflect your work flow and project management habits. To change the location of folders after
a project has been created, open Project > Properties... in the menu or with Ctrl+E and make
the necessary changes.
In a file manager a translation project looks and acts just like any other folder. In the following
example the folder my projects contains three OmegaT projects:
Double clicking the item with the OmegaT icon is sufficient to open the project. A translation
project Example_Project created with the default settings will be created as a new subfolder
with the following structure:
1.1. Top folder
The top folder of a project always contains the file OmegaT.Project, containing project parameters
as defined in the Project properties window (Project > Properties). While the translation
is progressing, additional files (project_name-omegat.tmx, project_name-level1.tmx and
project_name-level2.tmx) are created (and updated during the process of translation) in this
folder. They contain the same translation memory contents in different forms,
to be used in future projects.
1.2. Subfolder dictionary
Initially empty, this folder will contain dictionaries you have added to the project. See chapter
Dictionaries for more on this subject.
1.3. Subfolder glossary
This folder is initially empty. It will contain glossaries you will be using in the project. See
chapter Glossaries for more on this subject.
1.4. Subfolder omegat
The omegat subfolder contains at least one and possibly several other files. The most
important file here is the project_save.tmx, that is the working translation memory for the
project. Backups of this file (with extension bak) are added progressively to this subfolder, first
at the beginning of the translation session, at its end, and while the translation progresses.
This way an inadvertent data loss is averted - see Preventing Data Loss in chapter Miscellanea.
During translation additional files may get created in this subfolder as follows
1.5. Subfolder source
The source subfolder contains files to be translated. You can add the files to it later. Note that
the structure of the source subfolder may take any form you like. If the files to be translated
are parts of a tree structure (as in a website), you need only specify the top-level subfolder
and OmegaT will maintain the entire contents, while keeping the tree structure intact.
1.6. Subfolder target
This subfolder is initially empty. To add contents to it, select Project → Create Translated
Documents (Ctrl+D). Files within the source folder, whether translated or not, are then
generated here, with the same hierarchy as present in the source subfolder. The contents
of the target subfolder will reflect the current state of the translation, as present in the
project translation memory, saved in the current /omegat/project_save.tmx. Untranslated
segments will remain in the source language.
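How the source hierarchy is mirrored in the target subfolder can be pictured with a small Python sketch. It only computes which target path corresponds to each source file and translates nothing; the function name planned_target_paths and the sample files are hypothetical, and the default subfolder names source and target are assumed.

```python
import tempfile
from pathlib import Path


def planned_target_paths(project_dir):
    """Map each file under source/ to the path it would occupy under target/."""
    source_root = Path(project_dir) / "source"
    target_root = Path(project_dir) / "target"
    return {
        src.relative_to(source_root).as_posix(): target_root / src.relative_to(source_root)
        for src in sorted(source_root.rglob("*"))
        if src.is_file()
    }


# Build a throw-away project with a small tree in its source subfolder.
with tempfile.TemporaryDirectory() as project:
    docs = Path(project) / "source" / "site" / "docs"
    docs.mkdir(parents=True)
    (docs / "index.html").write_text("<p>Hello</p>", encoding="utf-8")
    (Path(project) / "source" / "readme.txt").write_text("Hello", encoding="utf-8")

    plan = planned_target_paths(project)
    relative_targets = sorted(p.relative_to(project).as_posix() for p in plan.values())

print(relative_targets)
```

Each file keeps its position in the tree, so site/docs/index.html in source ends up as site/docs/index.html in target.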
Note that default segmentation rules and file filters can be overridden by project-specific
setup (see above). The location of user files depends upon the platform you use:
You can access that folder directly with the Options → Access Configuration Folder menu
entry.
3. Application files
OmegaT is supplied as a package that can be downloaded from SourceForge. Here a platform-
independent package in a standard Java form is considered. Alternatives include a Linux .tar
package, a Windows installer – with or without a Java Runtime Environment –, a Mac OS X
installer, and a source code package for developers.
The platform-independent package can be used on any platform with a working Java 1.6
runtime environment, including the platforms for which a specific package also exists. It is
provided as a compressed file (zip or tar archive) that you must extract to the folder of your
choice for installation. The file can usually be extracted by double-clicking on the downloaded
package. Once the archive has been extracted, a folder containing the following contents is
created:
#!/bin/bash
# Start OmegaT, passing on any command-line arguments unchanged
java -jar OmegaT.jar "$@"
Chapter 9. Files to translate
1. File formats
You can use OmegaT to translate files in a number of file formats. There are basically two
types of file formats, plain text and formatted text.
• PO files (*.po)
Other plain text file types can be handled by OmegaT by associating their file extension to a
supported file type (for example, .pod files can be associated to the ASCII text filter) and by
pre-processing them with specific segmentation rules.
PO files can contain both the source and the target text. Seen from this point of view, they
are plain text files plus translation memories. If for a given source segment there is as yet
no existing translation in the project translation memory (project_save.tmx), the current
translation will be saved in the project_save.tmx as the default translation. In case, however,
the same source segment already exists with a different translation, the new translation will
be saved as an alternative.
• ODF - OASIS Open Document Format (*.ods, *.ots, *.odt, *.ott, *.odp, *.otp)
• DocBook (*.xml)
• Visio (*.vxd)
• Schematron (*.sch)
Other formatted text file types may also be handled by OmegaT by associating their file
extensions to a supported file type, assuming that the corresponding segmentation rules will
segment them correctly.
1.3. PDF files
PDF files are a special case. They contain text formatting information, but such information
cannot be reused by OmegaT in order to create target files. Thus, PDF files are handled as
plain text files, and output files are plain text files.
If you need to reproduce text formatting (as well as other things such as drawings) in your
translation, there are three ways to try:
1. Use OmegaT’s default filter (PDF input), translate, create a target file (it will be a plain text
file), add relevant formatting and items manually.
2. Use the Iceni Infix filter. See Howto - Translating PDF files with Iceni Infix and OmegaT
[https://omegat.org/howtos/iceni_infix.html].
Note: the above information applies only to PDF files with a text layer. If you have a PDF file
made of scanned pages (sometimes such files are referred to as ‘dead’ PDFs), you need to
use an OCR (optical character recognition) program to recognize the text and convert it to
a format that can be handled by OmegaT.
External tools can be used to convert files to supported formats. The translated files will
then need to be converted back to the original format. For example, if you have an outdated
Microsoft Word version that does not handle the ODT format, here's a round trip for Word
files with the DOC extension:
The quality of formatting of the translated file will depend on the quality of the round-trip
conversion. Before proceeding with such conversions, be sure to test all options. Check the
OmegaT home page [http://www.omegat.org] for an up-to-date listing of auxiliary translation
tools.
• left justification
• right justification
Using the RTL mode in OmegaT has no influence whatsoever on the display mode of the
translated documents created in OmegaT. The display mode of the translated documents
must be modified within the application (such as Microsoft Word) commonly used to display or
modify them (check the relevant manuals for details). Using Shift+Ctrl+O causes both text
input and display in OmegaT to change. It can be used separately for all three panes (Editor,
Fuzzy Matches and Glossary) by clicking on the pane and toggling the display mode. It can
also be used in all the input fields found in OmegaT - in the search window, for segmentation
rules etc.
language is LTR and the target language is RTL, or vice versa, it may be necessary to toggle
back and forth between RTL and LTR modes to view the source and enter the target easily
in their respective modes.
If the document allows, the translator is strongly encouraged to remove style information
from the original document so that as few tags as possible appear in the OmegaT interface.
Follow the indications given in Hints for tags management. Frequently validate tags (see Tag
validation) and produce translated documents (see below and Menu) at regular intervals to
make it easier to catch any problems that arise. A hint: translating a plain text version of
the text and adding the necessary style in the relevant application at a later stage may turn
out to be less hassle.
To avoid changing the target files' display parameters each time the files are opened, it may be
possible to change the source file display parameters so that they are inherited
by the target files. Such modifications are possible in ODF files for example.
Chapter 10. Editing behavior
The dialog in Options → Editing Behavior... enables the user to select how the current
segment in the editing field is to be initialized and handled:
You translate your files by moving from segment to segment, editing each current segment
in turn. When moving between segments, you may wish to populate the editing field with
an existing translation in the fuzzy match pane or with the source text. In Options → Editing
Behavior... OmegaT offers you the following alternatives:
The source text You can have the source text inserted automatically into
the editing field. This is useful for texts containing many
trade marks or other proper nouns which must be left
unchanged.
Leave the segment empty OmegaT leaves the editing field blank. This option allows
you to enter the translation without the need to remove the
source text, thus saving you two keystrokes (Ctrl+A and Del).
Empty translations are now allowed. They are displayed as
<EMPTY> in the Editor. To create one, right-click in a segment,
and select "Set empty translation". The entry Remove
translation in the same pop-up menu also allows you to delete
the existing translation of the current segment. You can achieve
the same by clearing the target segment and pressing Enter.
Insert the best fuzzy match OmegaT inserts the translation of the string most similar to the
current source, if it is above the similarity threshold that you
have selected in this dialog. The prefix (empty by default) can
be used to tag translations made via fuzzy matches. If you add
The check boxes in the lower half of the dialog window serve the following purpose:
Attempt to convert numbers when inserting a fuzzy match If this option is checked, when a fuzzy
match is inserted, either manually or automatically, OmegaT attempts to convert
the numbers in the fuzzy matches according to the source
contents. There are a number of restrictions:
Allow the translation to be equal to source Documents for translation may contain trade marks, names
or other proper nouns that will be the same in translated
documents. There are two strategies for segments that
contain only such invariable text.
Export the segment to text files The text export function exports data from within the current
OmegaT project to plain text files. The data are exported when
the segment is opened. The files appear in the /script subfolder
in the OmegaT user files folder, and include:
Allow tag editing Uncheck this option to prevent any damage to the tags (i.e., partial
deletion) during editing. Removing an entire tag remains possible in
Validate tags when leaving a segment Check this option to be warned about differences between
source and target segment tags each time you leave a
segment.
Save auto-populated status Check this option to record in the project_save.tmx file the
information that a segment has been auto-populated, so it
can be displayed with a specific color in the Editor (if the
"Mark Auto-Populated Segments" option, in the View menu, is
checked).
Initially load this many segments By default the editor initially displays 2,000 segments, and
progressively loads more as you scroll up or down. If you have
a powerful machine, and/or if you don't like how the scrollbar
behaves during progressive loading, you can increase this
number.
Chapter 11. Working with plain text
1. Default encoding
Plain text files - in most cases files with a txt extension - contain just textual information
and offer no clearly defined way to inform the computer which language they contain. The
most that OmegaT can do in such a case, is to assume that the text is written in the same
language the computer itself uses. This is no problem for files encoded in Unicode using a 16
bit character encoding set. If the text is encoded in 8 bits, however, one can be faced with
the following awkward situation: instead of displaying the text, for Japanese characters...
The computer, running OmegaT, has Russian as the default language, and thus shows the
characters in the Cyrillic alphabet and not in Kanji.
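The effect is easy to reproduce outside OmegaT. The Python sketch below encodes a Japanese string in an 8-bit based encoding and then reads the same bytes back with a Cyrillic code page. The choice of EUC-JP and KOI8-R is an assumption made for the demonstration; any mismatched pair of legacy encodings would show the same kind of garbling.

```python
original = "日本語のテキスト"

# The bytes as they would be stored in a legacy Japanese text file.
raw_bytes = original.encode("euc_jp")

# The same bytes interpreted with a Cyrillic 8-bit encoding: every byte becomes one
# unrelated character, so the text turns into Cyrillic letters and box-drawing symbols.
misread = raw_bytes.decode("koi8_r", errors="replace")

# Interpreted with the encoding the file was actually written in, the text is intact.
correct = raw_bytes.decode("euc_jp")

print(misread)
print(correct)
```

The bytes are never damaged; only the interpretation is wrong, which is why telling OmegaT the right encoding is enough to fix the display.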
Change the encoding of your files to Unicode: open your source file in a text editor that
correctly interprets its encoding and save the file in "UTF-8" encoding. Change
the file extension from .txt to .utf8. OmegaT will automatically
interpret the file as a UTF-8 file. This is the most common-sense
alternative, sparing you problems in the long run.
Specify the encoding for your plain text files (i.e. files with a .txt extension): in the Text
files section of the file filters dialog, change the Source File Encoding from
<auto> to the encoding that corresponds to your source .txt
file, for instance to .jp for the above example.
Change the extensions of your plain text source files: for instance from .txt to .jp for Japanese
plain texts. In the Text files section of the file filters dialog, add a new Source
Filename Pattern (*.jp for this example) and select the
appropriate parameters for the source and target encoding.
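For the conversion to Unicode, a text editor is not strictly necessary. As a minimal sketch, assuming you already know the encoding the file was written in, the Python function below re-saves a file as UTF-8 with the .utf8 extension. The name convert_to_utf8 and the sample file are hypothetical, and Shift_JIS is used only as an example of a legacy encoding.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(path, source_encoding):
    """Re-save a text file as UTF-8 next to the original, using the .utf8 extension."""
    path = Path(path)
    text = path.read_text(encoding=source_encoding)
    converted = path.with_suffix(".utf8")
    converted.write_text(text, encoding="utf-8")
    return converted


# Demonstration on a throw-away file written in a legacy Japanese encoding.
with tempfile.TemporaryDirectory() as folder:
    legacy = Path(folder) / "notes.txt"
    legacy.write_bytes("日本語".encode("shift_jis"))

    result = convert_to_utf8(legacy, "shift_jis")
    result_name = result.name
    result_text = result.read_bytes().decode("utf-8")

print(result_name, result_text)
```

The original .txt file is left in place by this sketch, so it would have to be removed from the source folder by hand to avoid translating both copies.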
OmegaT has by default the following short list available to make it easier for you to deal with
some plain text files:
• .txt files are automatically (<auto>) interpreted by OmegaT as being encoded in the
computer's default encoding.
• .txt1 files are files in ISO-8859-1, which covers most Western European languages.
• .txt2 files are files in ISO-8859-2, which covers most Central and Eastern European languages.
• .utf8 files are interpreted by OmegaT as being encoded in UTF-8 (an encoding that covers
almost all languages in the world).
You can check that yourself by selecting the item File Filters in the menu Options. For
example, when you have a Czech text file (very probably written in the ISO-8859-2 code)
you just need to change the extension .txt to .txt2 and OmegaT will interpret its contents
correctly. And of course, if you wish to be on the safe side, consider converting this kind of
file to Unicode, i.e. to the .utf8 file format.
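The conversion to UTF-8 recommended above can also be scripted. The following Python sketch is an illustration only and not part of OmegaT: the function name convert_to_utf8 is hypothetical, and the script assumes that you already know the real encoding of the source file (ISO-8859-2 in the Czech example).

```python
from pathlib import Path


def convert_to_utf8(source: Path, source_encoding: str) -> Path:
    """Re-encode a plain text file as UTF-8 and store it with a .utf8 extension.

    Hypothetical helper for illustration; source_encoding must be the
    encoding the file was really written in (for example "iso-8859-2").
    """
    # Decoding fails with UnicodeDecodeError if the bytes are not valid
    # in the stated encoding; a wrong but valid guess yields wrong characters.
    text = source.read_bytes().decode(source_encoding)
    target = source.with_suffix(".utf8")
    # Working on bytes keeps the original line endings untouched.
    target.write_bytes(text.encode("utf-8"))
    return target
```

A call such as convert_to_utf8(Path("source/letter.txt"), "iso-8859-2") would write source/letter.utf8 next to the original, which OmegaT then reads as UTF-8 according to the list above. The original .txt file is left in place, so remove it from the source folder if you do not want both versions in the project.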
Chapter 12. Working with formatted text
Formatting information present in the source file usually needs to be reproduced in the
target file. The in-line formatting information made possible by the supported formats (in
particular DocBook, HTML, XHTML, Open Document Format (ODF) and Office Open XML (MS
Office 2007 and later) at the time of writing) is presented as tags in OmegaT. Normally tags
are ignored when considering the similarity between different texts for matching purposes.
Tags reproduced in the translated segment will be present in the translated document.
1. Formatting tags
Tag naming:
The tags consist of one to three characters and a number. Unique numbering allows tags that
correspond to each other to be grouped together, and differentiates between tags that have
the same shortcut character but are in fact different. The shortcut characters used try
to reflect the underlying meaning of the tag (e.g. b for bold, i for italics, etc.).
Tag numbering:
Tags are numbered incrementally by tag group. "Tag groups" in this context are either a single
tag (such as <br1>) or a pair of tags (such as <i0> and </i0>). Within a segment, the first
group (pair or single) receives the
number 0, the second the number 1 etc. The first example below has 3 tag groups (a pair, a
single, and then another pair), the second example has one group only (a pair).
Tags are always either singles or paired. Single tags indicate formatting information that does
not affect the surrounding text (an extra space or line break for example).
<br1> is a single tag and does not affect any surrounding text. Paired tags usually indicate
style information that applies to the text between the opening tag and the closing tag of a
pair. <b0> and </b0> below are paired and affect the text log.txt. Note that the opening tag
must always come before the corresponding closing tag:
OmegaT creates its tags before the process of sentence segmenting. Depending upon the
segmenting rules, the pair of tags may get separated into two consecutive segments and the
tag validation will err on the side of caution and mark the two segments.
2. Tag operations
Care must be exercised with tags. If they are accidentally changed, the formatting of the
final file may be corrupted. The basic rule is that the sequence of tags must be preserved in
the same order. However, it is possible, if certain rules are strictly followed, to deviate from
this basic rule.
Tag duplication:
To duplicate tag groups, just copy them in the position of your choice. Keep in mind that in
a pair group, the opening tag must come before the closing tag. The formatting represented
by the group you have duplicated will be applied to both sections.
Tag group deletion:
To delete tag groups, just remove them from the segment. Keep in mind that a pair group
must have both its opening and its closing tag deleted to ensure that all traces of the
formatting are properly erased; otherwise the translated file may become corrupted. By
deleting a tag group you will remove the related formatting from the translated file.
The behaviour, stated here, applies to all the source files and not just to some of the file
types, like formatted text.
OmegaT can check that programming variables (like %s for instance) in the source exist
in the translation. You can decide not to check at all, to check for simple printf variables
(like %s, %d etc.) or to check for printf variables of all types.
Activating this check box will cause OmegaT to check whether simple Java MessageFormat tags
(like {0}) are processed correctly.
A regular expression entered here will cause OmegaT to treat the detected instances as
custom tags. It checks that the number of tags and their order are identical, just as it
does for OmegaT tags.
One can enter a regular expression for unwanted contents in the target. Any matches in the
target segment will then be highlighted in red, so that they are easy to identify and correct.
When looking for fuzzy matches, the remove pattern is ignored. A fixed penalty of 5 is added if
the removed part does not match some other segment, so the match does not show up as 100%.
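The variable and custom tag checks described above amount to comparing the placeholders found in the source with those found in the translation. The Python sketch below illustrates the idea only; it is not OmegaT's implementation. The two patterns are deliberately simplified assumptions, and the function names are hypothetical.

```python
import re

# Simplified patterns chosen for this sketch; OmegaT's own patterns may differ.
SIMPLE_PRINTF = re.compile(r"%[sdf]")    # simple printf variables such as %s or %d
MESSAGE_FORMAT = re.compile(r"\{\d+\}")  # simple MessageFormat tags such as {0}


def placeholders(pattern, text):
    """Return all placeholders in text, in order of appearance."""
    return pattern.findall(text)


def same_placeholders(pattern, source, target, ordered=False):
    """Check that the target contains the same placeholders as the source.

    With ordered=True the sequence must be identical as well, which is
    the stricter rule described above for custom tags.
    """
    in_source = placeholders(pattern, source)
    in_target = placeholders(pattern, target)
    if ordered:
        return in_source == in_target
    return sorted(in_source) == sorted(in_target)
```

For the pair "Copied %d files to %s" and "%s: %d Dateien kopiert" the unordered check succeeds, because both variables are present, while the ordered check fails, because their sequence has changed.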
The tags are highlighted in bold blue for easy comparison between the original and the
translated contents. Click on the link to activate the segment in the Editor. Correct the error
if necessary (in the case above it is the missing <i2></i2> pair) and press Ctrl+Shift+V to
return to the tag validation window to correct other errors. Tag errors are cases in which
the translation does not reproduce the same tags, in the same order and number, as the
original segment. Some tag manipulations are necessary and are benign; others will cause
problems when the translated document is created.
Tags generally represent formatting in some form of the original text. Simplifying the original
formatting greatly contributes to reducing the number of tags. Where circumstances permit,
unifying used fonts, font sizes, colors, etc. should be considered, as it could simplify the
translation and reduce the potential for tag errors. Read the tag operations section to see what
can be done with tags. Remember that if you find tags a problem in OmegaT and formatting
is not extremely relevant for the current translation, removing tags may be the easiest way
out of problems.
If you need to see tags in OmegaT but do not need to retain most of the formatting in the
translated document you are free not to include tags in the translation. In this case pay extra
attention to tag pairs since deleting one side of the pair but forgetting to delete the other is
guaranteed to corrupt your document's formatting. Since tags are included in the text itself, it
is possible to use segmentation rules to create segments with fewer tags. This is an advanced
feature and some experience is required in order for it to be applied properly.
OmegaT is not yet able to detect mistakes in formatting fully automatically, so it will not
prompt you if you make an error or change formatting to fit your target language better.
Sometimes, however, your translated file may look strange, and – in the worst case – may
even refuse to open.
Chapter 13. Translation memories
1. Translation memories in OmegaT
1.1. tmx folders - location and purpose
OmegaT projects can have translation memory files - i.e. files with the extension tmx - in
several different places:
omegat folder:
The omegat folder contains the project_save.tmx and possibly a number of backup TMX files.
The project_save.tmx file contains all the segments that have been recorded in memory since
you started the project. This file always exists in the project. Its contents will always be
sorted alphabetically by the source segment.
main project folder:
The main project folder contains 3 tmx files, project_name-omegat.tmx,
project_name-level1.tmx and project_name-level2.tmx (project_name being the name of your
project).
tm folder:
The /tm/ folder can contain any number of ancillary translation memories - i.e. tmx files.
Such files can be created in any of the three varieties indicated above. Note that other CAT
tools can export (and import as well) tmx files, usually in all three forms. The best thing of
course is to use OmegaT-specific TMX files (see above), so that the in-line formatting within
the segment is retained.
Note that the TMX files in the tm folder can be compressed with gzip.
tm/auto folder:
If it is clear from the very start that translations in a given TM (or TMs) are all correct,
one can put them into the tm/auto folder and avoid confirming a lot of [fuzzy] cases.
tm/enforce folder:
If you have no doubt that a TMX is more accurate than the project_save.tmx of OmegaT, put
this TMX in /tm/enforce to overwrite existing default translations unconditionally.
tm/mt folder:
In the editor pane, when a match is inserted from a TMX contained in a folder named mt, the
background of the active segment is changed to red. The background is restored to normal
when the segment is left.
Optionally, you can let OmegaT have an additional tmx file (OmegaT-style) anywhere you
specify, containing all translatable segments of the project. See pseudo-translated memory
below.
Note that all the translation memories are loaded into memory when the project is opened.
Back-ups of the project translation memory are produced regularly (see next chapter), and
project_save.tmx is also saved/updated when the project is closed or loaded again. This
means for instance that you do not need to exit a project you are currently working on if you
decide to add another ancillary TM to it: you simply reload the project, and the changes you
have made will be included.
The locations of the various translation memories for a given project are user-defined
(see Project dialog window in Project properties).
Depending on the situation, different strategies are thus possible, for instance:
• Several projects on the same subject: keep the project structure, and change source and
target folders (Source = source/order1, target = target/order1 etc). Note that segments
from order1 that are not present in order2 and other subsequent jobs will be tagged as
orphan segments; however, they will still be useful for getting fuzzy matches.
• Several translators working on the same project: split the source files into source/Alice,
source/Bob... and allocate them to team members (Alice, Bob ...). They can then create
their own projects and deliver their own project_save.tmx when finished or when a given
milestone has been reached. The project_save.tmx files are then collected and possible
conflicts, as regards terminology for instance, get resolved. A new version of the master TM
is then created, either to be put in team members' tm/auto subfolders or to replace their
project_save.tmx files. The team can also use the same subfolder structure for the target
files. This allows them for instance to check at any moment whether the target version for
the complete project is still OK.
1.2. tmx backup
As you translate your files, OmegaT stores your work continually in project_save.tmx in the
project's /omegat subfolder.
If you believe you have lost translation data, follow the following procedure:
3. Select the backup translation memory that is most likely (e.g. the most recent one, or the
last version from the day before) to contain the data you are looking for
4. Copy it to project_save.tmx
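Selecting and copying a backup can also be done with a small script. The Python sketch below is an illustration under stated assumptions: the function restore_latest_backup is hypothetical, the file name pattern project_save.tmx.*.bak is an assumption about how the backup files in the omegat folder are named (check your own project and adjust it), and "most recent" is judged by file modification time. As a precaution that is the sketch's own choice, it keeps a copy of the current project_save.tmx before overwriting it.

```python
import shutil
from pathlib import Path


def restore_latest_backup(project_dir: Path, pattern: str = "project_save.tmx.*.bak") -> Path:
    """Copy the most recently modified backup TMX over project_save.tmx.

    The default pattern is an assumption about backup file names;
    pass a different glob pattern if your backups are named otherwise.
    """
    omegat_dir = project_dir / "omegat"
    backups = sorted(omegat_dir.glob(pattern), key=lambda path: path.stat().st_mtime)
    if not backups:
        raise FileNotFoundError(f"no backup matching {pattern!r} in {omegat_dir}")
    latest = backups[-1]
    current = omegat_dir / "project_save.tmx"
    if current.exists():
        # Keep the file that is about to be replaced, in case it is needed later.
        shutil.copy2(current, omegat_dir / "project_save.tmx.temporary")
    shutil.copy2(latest, current)
    return latest
```

Run such a script only while the project is not open in OmegaT, since project_save.tmx is updated when the project is closed or reloaded.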
The settings in your project indicate which is the source and which the target language.
OmegaT thus takes the TUV segments corresponding to the project's source and target
language codes and uses them as the source and target segments respectively. OmegaT
recognizes the language codes using the following two standard conventions:
• 2- or 3-letter language code followed by the 2-letter country code (e.g. EN-US - See
Appendix A, Languages - ISO 639 code list for a partial list of language and country codes).
If the project language codes and the tmx language codes fully match, the segments are
loaded in memory. If the languages match but not the country, the segments still get loaded.
If neither the language code nor the country code matches, the segments will be ignored.
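The matching rule above can be summarised in a few lines. The Python sketch below is a simplified illustration and not OmegaT's code: the function names are hypothetical, and it assumes that codes are written as language, optionally followed by a hyphen or underscore and a country code.

```python
def split_code(code):
    """Split a code such as 'EN-US' or 'en_us' into ('EN', 'US')."""
    normalized = code.replace("_", "-").upper()
    language, _, country = normalized.partition("-")
    return language, country


def is_loaded(project_code, tmx_code):
    """Return True if segments in the tmx language would be loaded.

    According to the rule above, only the language part decides:
    a differing country code does not prevent loading.
    """
    project_language, _ = split_code(project_code)
    tmx_language, _ = split_code(tmx_code)
    return project_language == tmx_language
```

With a project language of EN-US, segments marked EN-US or EN-GB would be loaded, while segments marked FR-FR would be ignored.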
TMX files can generally contain translation units with several candidate languages. If for a
given source segment there is no entry for the selected target language, all other target
segments are loaded, regardless of the language. For instance, if the language pair of the
project is DE-FR, it can still be of some help to see hits in the DE-EN translation, if there
are none in the DE-FR pair.
1.4. Orphan segments
The file project_save.tmx contains all the segments that have been translated since you
started the project. If you modify the project segmentation or delete files from the source,
some matches may appear as orphan strings in the Match Viewer: such matches refer
to segments that do not exist any more in the source documents, as they correspond to
segments translated and recorded before the modifications took place.
When you create the target documents in an OmegaT project, the translation memory of
the project is output in the form of three files in the root folder of your OmegaT project (see
the above description). You can regard these three tmx files (-omegat.tmx, -level1.tmx and
-level2.tmx) as an "export translation memory", i.e. as an export of your current project's
content in bilingual form.
Should you wish to reuse a translation memory from a previous project (for example because
the new project is similar to the previous project, or uses terminology which might have been
used before), you can use these translation memories as "input translation memories", i.e.
for import into your new project. In this case, place the translation memories you wish to use
in the /tm or /tm/auto folder of your new project: in the former case you will get hits from
these translation memories in the fuzzy matches viewer, and in the latter case these TMs will
be used to pre-translate your source text.
By default, the /tm folder is below the project's root folder (e.g. .../MyProject/tm), but you
can choose a different folder in the project properties dialog if you wish. This is useful if you
frequently use translation memories produced in the past, for example because they are on
the same subject or for the same customer. In this case, a useful procedure would be:
• Create a folder (a "repository folder") in a convenient location on your hard drive for the
translation memories for a particular customer or subject.
• Whenever you finish a project, copy one of the three "export" translation memory files from
the root folder of the project to the repository folder.
• When you begin a new project on the same subject or for the same customer, navigate to
the repository folder in the Project > Properties > Edit Project dialog and select it as the
translation memory folder.
Note that all the tmx files in the /tm repository are parsed when the project is opened,
so putting all the different TMs you may have on hand into this folder may unnecessarily slow
OmegaT down. You may even consider removing those that are not required any more, once
you have used their contents to fill up the project_save.tmx file.
OmegaT follows very strict procedures when loading translation memory (tmx) files. If an
error is found in such a file, OmegaT will indicate the position within the defective file at which
the error is located.
Some tools are known to produce invalid tmx files under certain conditions. If you wish to use
such files as reference translations in OmegaT, they must be repaired, or OmegaT will report
an error and fail to load them. Fixes are trivial operations and OmegaT assists troubleshooting
with the related error message. You can ask the user group for advice if you have problems.
OmegaT exports version 1.4 TMX files (both level 1 and level 2). The level 2 export is not
fully compliant with the level 2 standard, but is sufficiently close and will generate correct
matches in other translation memory tools supporting TMX Level 2. If you only need textual
information (and not formatting information), use the level 1 file that OmegaT has created.
• Create a project, separate from other projects, in the desired language pair, with an
appropriate name - note that the TMXs created will include this name.
• Copy the documents you need the translation memory for into the source folder of the
project.
• Copy the translation memories containing the translations of the documents above into
the tm/auto subfolder of the new project.
• Start the project. Check for possible Tag errors with Ctrl+T and untranslated segments with
Ctrl+U. To check everything is as expected, you may press Ctrl+D to create the target
documents and check their contents.
• When you exit the project, the TMX files in the main project folder (see above) now contain
the translations in the selected language pair for the files you have copied into the source
folder. Copy them to a safe place for future reference.
• To avoid reusing the project and thus possibly polluting future cases, delete the project
folder or archive it away from your workplace.
OmegaT interfaces to SVN and Git, two common team software versioning and revision
control systems (RCS), available under an open source license. In the case of OmegaT,
complete project folders - in other words the translation memories involved as well as source
folders, project settings etc - are managed by the selected RCS. See more in Chapter
The solution in our example is to copy the existing translation memory into the tm/
tmx2source/ subfolder and rename it to ZH_CN.tmx to indicate the target language of the
tmx. The translator will be shown English translations for source segments in Dutch and use
them to create the Chinese translation.
Important: the supporting TMX must be renamed XX_YY.tmx, where XX_YY is the target
language of the tmx, for instance to ZH_CN.tmx in the example above. The project and TMX
source languages should of course be identical - NL in our example. Note that only one TMX
for a given language pair is possible, so if several translation memories should be involved,
you will need to merge them all into the XX_YY.tmx.
All translations from source documents are also displayed in the Comment pane, in addition to
the Match pane. In the case of PO files, a 20% penalty is applied to the alternative translation
(i.e., a 100% match becomes an 80% match). The word [Fuzzy] is displayed on the source segment.
When you load a segmented TTX file, segments with source = target will be included, if "Allow
translation to be equal to source" in Options → Editing Behavior... has been checked. This
may be confusing, so you may consider unchecking this option in this case.
4. Pseudo-translated memory
Note
Of interest for advanced users only!
Before segments get translated, you may wish to pre-process them or address them in
some other way than is possible with OmegaT. For example, if you wish to create a pseudo-
translation for testing purposes, OmegaT enables you to create an additional tmx file that
contains all segments of the project. The translation in this tmx can be either equal to the
source or empty.
The tmx file can be given any name you specify. A pseudo-translated memory can be
generated with the following command line parameters:
Replace <filename> with the name of the file you wish to create, either absolute or
relative to the working folder (the folder you start OmegaT from). The second argument,
--pseudotranslatetype, is optional. Its value is either equal (default value, for source=target)
or empty (target segment is empty). You can process the generated tmx with any tool you
want. To reuse it in OmegaT, rename it to project_save.tmx and place it in the omegat folder
of your project.
A project's tmx will be upgraded only once, and will be written in upgraded form into the
project_save.tmx; legacy tmx files will be upgraded on the fly each time the project is loaded.
Note that in some cases changes in file filters in OmegaT may lead to totally different
segmentation; as a result, you will have to upgrade your translation manually in such rare
cases.
Chapter 14. Source segmentation
Translation memory tools work with textual units called segments. OmegaT has two ways to
segment a text: by paragraph or by sentence segmentation (also referred to as “rule-based
segmentation”). In order to select the type of segmentation, select Project → Properties...
from the main menu and tick or untick the check box provided. Paragraph segmentation is
advantageous in certain cases, such as highly creative or stylistic translations in which the
translator may wish to change the order of entire sentences; for the majority of projects,
however, sentence segmentation is a choice to be preferred, since it delivers better matches
from previous translations. If sentence segmentation has been selected, you can set up the
rules by selecting Options → Segmentation... from the main menu.
Dependable segmentation rules are already available for many languages, so it is likely that
you will not need to get involved with writing your own segmentation rules. On the other hand
this functionality can be very useful in special cases, where you can increase your productivity
by tuning the segmentation rules to the text to be translated.
Warning: because the text will segment differently after filter options have been changed,
you may have to start translating from scratch. At the same time the previously valid segments
in the project translation memory will turn into orphan segments. If you change segmentation
options when a project is open, you must reload the project in order for the changes to take
effect.
Structure level segmentation:
OmegaT first parses the text for structure-level segmentation. During this process it is only
the structure of the source file that is used to produce segments.
Sentence level segmentation:
After segmenting the source file into structural units, OmegaT will segment these blocks
further into sentences.
1. Segmentation rules
The process of segmenting can be pictured as follows: the cursor moves along the text, one
character at a time. At each cursor position rules, consisting of a Before and After pattern,
are applied in their given order to see if any of the Before patterns are valid for the text on
the left and the corresponding After pattern for the text on the right of the cursor. If the rule
matches, either the cursor moves on without inserting a segment break (for an exception
rule) or a new segment break is created at the current cursor position (for the break rule).
Break rule:
Separates the source text into segments. For example, "Did it make sense? I was not sure."
should be split into two segments. For this to happen, there should be a break rule for "?",
when followed by spaces and a capitalized word. To define a rule as a break rule, tick the
Break/Exception check box.
Exception rule:
Specifies what parts of text should NOT be separated. In spite of the period, "Mrs. Dalloway"
should not be split in two segments, so an exception rule should be established for Mrs (and
for Mr, for Dr, for prof etc), followed by a period. To define a rule as an exception rule,
leave the Break/Exception check box unticked.
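The cursor model described above, with its break and exception rules, can be sketched in a few lines of Python. This is an illustration of the principle only, not OmegaT's implementation: the Rule type, the segment function and the two sample rules are hypothetical, the patterns use Python's regular expression syntax rather than Java's, and the sketch trims the whitespace around each segment for readability.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    is_break: bool  # True for a break rule, False for an exception rule
    before: str     # pattern for the text to the left of the cursor
    after: str      # pattern for the text to the right of the cursor


# Illustrative rules only, listed in order of application.
RULES = [
    Rule(False, r"\b(Mrs|Mr|Dr|prof)\.", r"\s"),  # exception: titles
    Rule(True, r"[.?!]", r"\s+[A-Z]"),            # break: end of sentence
]


def segment(text, rules):
    """Split text by moving a cursor through it and applying the rules in order."""
    compiled = [
        (rule.is_break, re.compile("(?:" + rule.before + r")\Z"), re.compile(rule.after))
        for rule in rules
    ]
    segments = []
    start = 0
    for cursor in range(1, len(text)):
        left = text[:cursor]
        for is_break, before, after in compiled:
            # Before must match the text ending at the cursor,
            # After must match the text starting at the cursor.
            if before.search(left) and after.match(text, cursor):
                if is_break:
                    segments.append(text[start:cursor].strip())
                    start = cursor
                break  # the first matching rule decides for this position
    segments.append(text[start:].strip())
    return segments
```

With these two rules, "Did it make sense? I was not sure." is split after the question mark, while "Mrs. Dalloway" stays in one piece because the exception rule is listed before the break rule and therefore decides first.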
The predefined break rules should be sufficient for most European languages and Japanese.
In view of the flexibility, you may consider defining more exception rules for your source
language in order to provide more meaningful and coherent segments.
2. Rule priority
All segmentation rule sets for a matching language pattern are active and are applied in the
given order of priority, so rules for a specific language should be higher than default ones.
For example, rules for Canadian French (FR-CA) should be set higher than rules for French (FR.*),
and higher than Default (.*) ones. Thus, when translating from Canadian French the rules for
Canadian French - if any - will be applied first, followed by the rules for French and lastly,
by the Default rules.
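The selection of rule sets by language pattern can be illustrated as follows. This Python sketch is an assumption-laden illustration, not OmegaT's code: the table RULE_SETS and the function applicable_rule_sets are hypothetical, the language patterns are treated as regular expressions that must match the whole language code, and codes are assumed to be written in upper case.

```python
import re

# Hypothetical rule-set table, listed from highest to lowest priority.
RULE_SETS = [
    ("FR-CA", "Canadian French"),
    ("FR.*", "French"),
    (".*", "Default"),
]


def applicable_rule_sets(source_language, rule_sets):
    """Return the names of all rule sets whose pattern matches, in priority order."""
    return [
        name
        for pattern, name in rule_sets
        if re.fullmatch(pattern, source_language)
    ]
```

For a source language of FR-CA all three sets apply, in the order Canadian French, French, Default; for FR-FR only French and Default apply; for DE only the Default set remains.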
In order to edit or expand an existing set of rules, simply click on it in the top table. The rules
for that set will appear in the bottom half of the window.
In order to create an empty set of rules for a new language pattern click Add in the upper
half of the dialog. An empty line will appear at the bottom of the upper table (you may have
to scroll down to see it). Change the name of the rule set and the language pattern to the
language concerned and its code (see Appendix A, Languages - ISO 639 code list for a list of
language codes). The syntax of the language pattern conforms to regular expression syntax.
If your set of rules handles a language-country pair, we advise you to move it to the top using
the Move Up button.
Add the Before and After patterns. To check their syntax and their applicability, it is
advisable to use tools which allow you to see their effect directly. See the chapter on Regular
expressions. A good starting point will always be the existing rules.
Chapter 15. Searches
1. Search window
Open the Search window with Ctrl+F and enter the word or phrase you wish to search for
in the Search for box.
Alternatively, you can select a word or phrase in the Editor pane, Fuzzy matches pane
or Glossary pane and hit Ctrl+F. The word or phrase is entered in the Search for box
automatically. You can have several Search windows open at the same time, but close them
when they are no longer needed so that they do not clutter your desktop.
Click the dropdown arrow of the Search for box to access the last 10 searches.
• File > Search for selection (Ctrl+F): refocus on the search field and select all its contents.
• File > Close (Ctrl+W): close the search window (in the same way as Esc)
• Edit > Replace with source (Ctrl+Shift+R): replace with current segment source.
• Edit > Create Glossary Entry (Ctrl+Shift+G): add a new glossary item.
• '*' matches zero or more characters, from the current position in a given word to its end.
The search term 'run*' for example would match words 'run', 'runs' and 'running'.
• '?' matches any single character. For instance, 'run?' would match the word 'runs' and 'runn'
in the word 'running'.
The matches will be displayed in bold blue. Note that '*' and '?' have special meaning in
regular expressions, so wild card search, as described here, applies to exact and keyword
search only (see below).
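One way to picture the two wild cards is as a translation into a regular expression. The Python sketch below is an illustration and not OmegaT's implementation: the function wildcard_to_regex is hypothetical, and it assumes that '*' stands for any run of non-space characters and '?' for a single non-space character, which reproduces the examples given above.

```python
import re


def wildcard_to_regex(term):
    """Translate a search term with '*' and '?' into a compiled regular expression."""
    parts = []
    for char in term:
        if char == "*":
            parts.append(r"\S*")  # zero or more characters up to the end of the word
        elif char == "?":
            parts.append(r"\S")   # exactly one character
        else:
            parts.append(re.escape(char))  # everything else is taken literally
    return re.compile("".join(parts))
```

Applied to the text "run runs running", the term 'run*' finds 'run', 'runs' and 'running', while 'run?' finds 'runs' and the 'runn' part of 'running'.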
exact search:
Search for segments containing the exact string you specified. An exact search looks for a
phrase, i.e. if several words are entered, they are found only if they occur in exactly that
sequence. Searching for open file will thus find all occurrences of the string open file, but
not file opened or open input file.
keyword search:
Search for segments containing all keywords you specified, in any order. Select keyword
search to search for any number of individual full words, in any order. OmegaT displays a
list of all segments containing all of the words specified. Keyword searches are similar to a
search "with all of the words" in an Internet search engine such as Google (AND logic). Using
keyword search with open file will thus find all occurrences of the string open file, as well
as file opened, open input file, file may not be safe to open, etc.
regular expressions:
The search string will be treated as a regular expression. The search string
[a-zA-Z]+[öäüqwß] in the example above, for instance, looks for words in the target segment
containing questionable characters from a German keyboard. Regular expressions are a
powerful way to look for instances of a string. See more in the chapter Regular Expressions.
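The difference between exact and keyword search can be made concrete with a short sketch. The Python code below is an illustration only: the function names are hypothetical, matching is done on plain substrings, and case is ignored, which corresponds to leaving the case sensitive option unchecked.

```python
def exact_match(query, segment):
    """True if the segment contains the query as one uninterrupted phrase."""
    return query.lower() in segment.lower()


def keyword_match(query, segment):
    """True if the segment contains every word of the query, in any order."""
    lowered = segment.lower()
    return all(word in lowered for word in query.lower().split())
```

Searching for open file, exact_match accepts "Please open file A" but rejects "file opened" and "open input file", whereas keyword_match accepts all three, as well as "file may not be safe to open".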
In addition to one of the methods above, you can select the following options:
• case sensitive: the search will be performed for the exact string specified; i.e.
capitalization is observed.
• Space matches nbsp: when this option is checked, a space character entered in the search
field can match either a normal space character or a non-breaking space (\u00A0) character.
• Display: all matching segments: if checked, all the segments are displayed one by one,
even if they are present several times in the same document or in different documents.
• Display: file names: if checked, the name of the file where each segment is found is
displayed above each result.
• Search in Project: check Memory to include the project memory (project_save.tmx file)
in the search. Check TMs to include the translation memories located in the tm folder in
the search. Check Glossaries to include the glossaries located in the glossary folder in the
search.
• Search in Files: search in a single file or a folder containing a set of files. When searching
through files (as opposed to translation memories), OmegaT restricts the search to files
in source file formats. Consequently, although OmegaT is quite able to handle tmx files, it
does not include them in the Search files search.
If you click on the button Advanced options, additional criteria (author or changer of the
translation, date translated, excluding orphan segments, etc) can be selected. When the
Full/Half width char insensitive option is checked, searches for fullwidth forms (CJK characters)
will match halfwidth forms and vice versa.
Double-clicking on a segment opens it in the Editor for modifications (one single click does
it when Auto-sync with Editor option is checked). You can then switch back to the Search
window for the next segment found, for instance to check and, if necessary, correct the
terminology.
In the Search window, you can use standard shortcuts (Ctrl+N, Ctrl+P) to move from one
segment to another.
You may have several Search windows open at the same time. You can quickly see their
contents by looking at their title: it will contain the search term used.
NB:
• A search may be limited to 1000 items, so if you search on a common phrase, the editor
then shows only those 1000 matching entries, and not all entries that match the search
criteria.
Chapter 16. Search and Replace
1. Search window
Open the Search and replace window with Ctrl+K and enter the word or expression you wish
to replace in the Search for box.
Enter the new word or phrase (regular expressions are not supported) in the Replace with
box, then click one of the following options:
• Replace: performs a "one by one" replacement, by means of buttons in the header of
the Editor pane. Click Replace Next or Skip, then end the replacement session with Finish.
1.1. Search options
Search options are similar to the ones displayed in the Search window, with one addition:
check Untranslated in order to operate Search and replace also on segments that have not
been translated yet.
To make this possible (although Search and replace operates only on memory), OmegaT will
copy the source segment to the target segment before the replace operation occurs. If no
replacement is made in a given segment, the target segment will be “emptied”, i.e., it will
remain untranslated.
Chapter 17. Regular expressions
The regular expressions (or regex for short) used in searches and segmentation rules
are those supported by Java. Should you need more specific information, consult the Java
Regex documentation [http://download.oracle.com/javase/1.6.0/docs/api/java/util/regex/Pattern.html].
See additional references and examples below.
Note
This chapter is intended for advanced users, who need to define their own variants of
segmentation rules or devise more complex and powerful key search items.
Table 17.1. Regex - Flags
Table 17.2. Regex - Character
Table 17.3. Regex - Quotation
Note
Greedy quantifiers will match as much as they can. For example, a+ will match the
aaa in aaabbb.
Note
Non-greedy quantifiers will match as little as they can. For example, a+? will match
the first a in aaabbb.
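The two notes on quantifiers can be tried out directly. The short sketch below uses Python's re module purely for illustration; OmegaT itself uses Java regular expressions, but the greedy and non-greedy forms of + behave the same way in both for this example.

```python
import re

text = "aaabbb"

# Greedy: a+ takes as many a's as it can.
greedy = re.search(r"a+", text).group()

# Non-greedy: a+? stops as soon as the pattern is satisfied.
lazy = re.search(r"a+?", text).group()

print(greedy, lazy)
```

Here greedy holds the whole run of three a's, while lazy holds only the first a.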
Figure 17.1. Regex Tester
A nice collection of useful regex cases can be found in OmegaT itself (see Options >
Segmentation). The following list includes expressions you may find useful when searching
through the translation memory:
Chapter 18. Dictionaries
• Search for the required language combination - for instance on the dictionary links given
by the OmegaT Wiki [https://sourceforge.net/p/omegat/wiki/Reference%20Material/].
• Use an untar utility (or its equivalent, for instance WinRAR on Windows) to extract its contents
into the project folder "Dictionary". There should be three files, with the extensions dz, idx and
ifo.
Note that in addition to "source-target" dictionaries you can, using the Dictionary feature,
obtain access to information such as:
• etc...
Some of the dictionaries have no strings attached - i.e. are "Free to use", and others, like the
selection above, are under the GPL license. The following example shows Merriam Webster
10th dictionary "in action":
• Does the folder contain three files of the same name, with extensions? If only one file is
present, check its extension. If it is tar.bz, you have forgotten to unpack (untar) it.
Chapter 19. Glossaries
Glossaries are files created and updated manually for use in OmegaT.
If an OmegaT project contains one or more glossaries, any terms in the glossary which are
also found in the current segment will be automatically displayed in the Glossary viewer.
You define the location and name of the writable glossary file in the project properties dialog.
The extension must be .txt or .utf8 (if it is not, it will be added). The file must be located
within the /glossary folder, but it can be in a subfolder (e.g., glossary/sub/glossary.txt). The
file does not need to exist when you set it; it will be created (if necessary) when you add a
glossary entry. If the file already exists, no attempt is made to verify its format or character
set: new entries are always written in tab-separated format and UTF-8. As the existing content is
not touched, any damage to an existing file is limited.
1. Usage
To use an existing glossary, simply place it in the /glossary folder after creating the project.
OmegaT automatically detects glossary files in this folder when a project is opened. Terms in
the current segment which OmegaT finds in the glossary file(s) are displayed in the Glossary
pane:
Figure 19.1. Glossary pane
The word before the = sign is the source term, and its translation is (or translations are) the
words after =. A glossary entry can have a comment added. The glossary function only finds exact
matches with the glossary entry (e.g. it does not find inflected forms). New terms can be
added manually to the glossary file(s) during translation, for example in a text editor. Newly
added terms will be recognized once the changes in the text file have been saved.
The source term does not have to be a single-word item, as the next example shows:
The underlined item "pop-up menu" can be found in the glossary pane as "pojavni menu".
Highlighting it in the Glossary pane and then right-clicking inserts it at the cursor position
in the target segment.
2. File format
Glossary files are simple plain text files containing three-column, tab-delimited lists with the
source and target terms in the first and second columns respectively. The third column can
be used for additional information. You can have entries with the target column missing, i.e.
just containing the source term and the comment.
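The three-column layout described above can be read and written with very little code. The following is a minimal sketch in Python, not OmegaT's own parser; the function names parse_glossary and format_entry are hypothetical, and the sample entries are made up for illustration.

```python
def parse_glossary(text):
    """Return (source, target, comment) tuples from tab-delimited glossary text."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        # Pad short rows: the target and the comment may both be missing.
        source, target, comment = (fields + ["", ""])[:3]
        entries.append((source.strip(), target.strip(), comment.strip()))
    return entries


def format_entry(source, target="", comment=""):
    """Build one glossary line with the required TAB separators."""
    return "\t".join((source, target, comment))


sample = "pop-up menu\tpojavni menu\tUI term\nsegment\t\tno target yet\n"
entries = parse_glossary(sample)
```

Writing entries through a helper such as format_entry makes it harder to lose a TAB character, which is the usual cause of a term silently failing to appear.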
Also supported is the CSV format. This format is the same as the tab-separated one: source
term, target term, comment. Fields are separated by a comma ','. Strings can be enclosed in
quotes ", which allows having a comma inside a string:
"This is a source term, which contains a comma","c'est un terme, qui contient une virgule"
In addition to the plain text format, TBX format is also supported as a read-only glossary
format. The location of the .tbx file must be within the /glossary folder, but it can be in a
deeper folder (e.g., glossary/sub/MyGlossary.tbx).
TBX - Term Base eXchange - is the open, XML-based standard for exchanging structured
terminological data. TBX has been approved as an international standard by LISA and
ISO. If you have an existing terminology handling system, it quite possibly offers
export of terminology data in TBX format. The Microsoft Terminology Collection
[http://www.microsoft.com/Language/en-US/Terminology.aspx] can be downloaded in nearly 100
languages and can serve as a cornerstone IT glossary.
Note: the .tbx output of MultiTerm does not seem to be reliable (November 2013); it is better to
use the .tab output of MultiTerm instead.
A dialog opens, allowing you to enter the source term, target term and any comments you
may have:
The contents of glossary files are kept in memory and are loaded when the project is opened
or reloaded. Updating a glossary file is thus rather simple: press Ctrl+Shift+G and enter the
new term, its translation and any comments you may have (ensuring you press tab between
the fields) and save the file. The contents of the glossary pane will be updated accordingly.
The location of the writable glossary file can be set in the Project > Properties ... dialog. The
recognized extensions are TXT and UTF8.
Note: Of course there are other ways to create a simple file with tab-delimited
entries. Nothing speaks against using Notepad++ on Windows, GEdit on Linux
or a spreadsheet program for this purpose: any application that can handle UTF-8 (or
UTF-16 LE) and that can show white space (so that you do not miss the required TAB
characters) can be used.
4. Priority glossary
The results from the priority glossary (by default, glossary/glossary.txt) appear first
in the Glossary pane and in TransTips.
As entries can mix words from priority and non-priority glossaries, the words from the priority
glossary are displayed in bold.
• The glossary file does not have the correct extension (.tab, .utf8 or .txt).
• There is no EXACT match between the glossary entry and the source text in your document
- for instance plurals.
• There are no terms in the current segment which match any terms in the glossary.
• One or more of the above problems may have been fixed, but the project has not been
reloaded.
Problem: In the glossary pane, some characters are not displayed properly
• ...but the same characters are displayed properly in the Editing pane: the extension and
the file encoding do not match.
Chapter 20. Using TaaS in OmegaT
1. Generalities
The TaaS service at https://demo.taas-project.eu/info provides terminology services in
European languages (plus Russian). It allows accessing both public and private data, where
private glossaries (called "collections") can be extracted from existing documents, and the
target terms partly populated automatically from various sources.
2. Creating a key
To access the TaaS service, a user must create a key using
https://term.tilde.com/account/keys/create?system=omegaT.
The key must then be given to OmegaT using -Dtaas.user.key=xxxxx. The OmegaT configuration
launchers (OmegaT.l4J.ini, omegat.kaptn and Configuration.properties) contain a template
entry.
Browse TaaS Collections will allow browsing existing collections for the source and target
languages of the project, and downloading them. Private collections are displayed in bold.
The collections are downloaded as TBX glossaries in the current glossary folder.
TaaS Terminology Lookup: when checked, will allow querying TaaS data on a segment by
segment basis. All collections (public and private) will be queried for the source and target
language.
To limit the amount of data, it is possible to select a specific domain by selecting Select
TaaS Terminology Lookup Domain. In that dialog, it's possible to select All domains or
a specific one.
Chapter 21. Machine Translation
1. Introduction
Unlike user-generated translation memories (as used in OmegaT), machine
translation (MT) tools use rule-based linguistic tools to create a translation of the source
segment without the need for a translation memory. Statistical learning techniques, based on
source and target texts, are used to build a translation model. Machine translation services
have been achieving good and steadily improving results in research evaluations.
To activate any of the Machine Translation services, go to Options > Machine Translate ...
and activate the service desired. Note that they are all web-based: you will have to be on-
line if you want to use them.
2. Google Translate
Google Translate is a paid service offered by Google for translating sentences, web sites
and complete texts between an ever-growing number of languages. At the time of writing,
the list includes more than 50 languages, from Albanian to Yiddish, including of course all the
major languages. The current version of the service is billed by usage, at a price of 20
USD per million characters at the time of writing.
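A quick estimate of what a job would cost follows directly from the per-character price. The sketch below is only an illustration in Python; it hard-codes the 20 USD figure quoted above, which may no longer be current, and the function name estimate_cost_usd is hypothetical.

```python
# Price quoted in this guide at the time of writing; check the current pricing.
PRICE_PER_MILLION_CHARS_USD = 20.0


def estimate_cost_usd(char_count):
    """Estimate the Google Translate cost for a given number of source characters."""
    return char_count / 1_000_000 * PRICE_PER_MILLION_CHARS_USD


cost = estimate_cost_usd(250_000)
print(f"{cost:.2f} USD")  # 5.00 USD
```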
Important: Google Translate API v2 requires billing information for all accounts before you
can start using the service (see Pricing and Terms of Service
[https://developers.google.com/translate/v2/pricing?hl=en-US] for more). To identify yourself
as a valid user of the Google services, you use the private unique key that Google sends you
when you register for the service. See chapter Installing and Running, section Launch command
arguments, for details on how to add this key to the OmegaT environment.
The quality of the Google Translate translation depends on the one hand on the reservoir of
target-language texts and the availability of their bilingual versions, and on the other hand on
the quality of the models built. It is quite likely that, while the quality may be insufficient
in some cases, it will improve with time rather than deteriorate.
The Spanish translation is better than the Slovenian. Note that interesar and navegar in Spanish
are correctly translated as the verbs interest and sail respectively. In the Slovenian version
both words have been translated as nouns. It is actually quite probable that the Spanish
translation is based at least partially on the actual translation of the book.
Once you have activated the service, a suggestion for the translation will appear in the
Machine Translate pane every time a new source segment is opened. If you find the
suggestion acceptable, press Ctrl+M to replace the target part of the opened segment
with the suggestion. In the above segment, for instance, Ctrl+M would replace the Spanish
version with the Slovenian suggestion.
If you do not wish OmegaT to send your source segments to Google to get translated, untick
the Google Translate menu entry in Options.
Note that nothing but your source segment is sent to the MT service. The online version of
Google Translate allows the user to correct the suggestion and send the corrected segment
in. This feature, however, is not implemented in OmegaT.
4. Belazar
Belazar [http://belazar.info/] is a machine translation tool for the Russian-Belarusian
language pair.
5. Apertium
Apertium [http://www.apertium.org/] is a free/open-source machine translation platform,
initially aimed at related-language pairs, like CA, ES, GA, PT, OC and FR, but recently expanded
to deal with more divergent language pairs (such as English-Catalan). Check the web site for
the latest list of implemented language pairs.
• tools to manage the linguistic data necessary to build a machine translation system for a
given language pair
Apertium uses a shallow-transfer machine translation engine which processes the input
text in stages, as in an assembly line: de-formatting, morphological analysis, part-of-speech
It is possible to use Apertium to build machine translation systems for a variety of language
pairs; to that end, Apertium uses simple XML-based standard formats to encode the linguistic
data needed (either by hand or by converting existing data), which are compiled using the
provided tools into the high-speed formats used by the engine.
6. MyMemory (machine)
By default, MyMemory allows a maximum of 100 requests per day. By specifying an email
address, it is possible to raise this limit to 1000 requests per day.
# MyMemory email
to:
MyMemory [email protected]
• When starting OmegaT from the command line, specify in the command:
• in the Kaptain launcher (Linux only), enter the address in the corresponding field on the
"Online Services" tab.
MyMemory also offers the possibility to manage private TMs. Note: OmegaT does not
interact dynamically with them (you must export/import TMX files manually).
7. Microsoft Translator
In order to get credentials for MS Translator, follow these steps:
If you do not already have an Azure Marketplace account you will need to register one first.
3. Near the bottom you will see entries and values for:
To enable MS Translator in OmegaT, edit its launcher or see chapter Installing and Running
to learn how to start OmegaT from the command line.
8. Yandex Translate
In order to be able to use Yandex Translate in OmegaT, you need to obtain an API key from
Yandex [http://api.yandex.com/key/form.xml?service=trnsl].
The obtained API key needs to be passed to OmegaT at startup through the yandex.api.key
command-line parameter. To do that, edit the OmegaT launcher or see chapter Installing and
Running to learn how to start OmegaT from the command line.
• What is the language pair you need? Check if the selected service offers it.
• Google Translate does not work: have you applied for the Translate API service
[https://developers.google.com/translate/v2/faq]? Note that the Google Translate service is not
free of charge; see chapter Installing and Running (runtime parameters) for more on that.
• "Google Translate returned HTTP response code: 403 ...": check that the 38-character
key, entered in the pinfo.list file, is correct. Check that the Translate API service
[https://developers.google.com/translate/v2/faq] has been activated.
• Google Translate does not work even with the Google API key entered as requested: check in
Options > Machine Translate that Google Translate V2 is checked.
• Google Translate V2 reports "Bad request": check the source and target languages for your
project. Having no languages defined elicits this kind of response.
Chapter 22. Spell checker
OmegaT has a built-in spell checker based on the spelling checker used in Apache OpenOffice,
LibreOffice, Firefox and Thunderbird. It is consequently able to use the huge range of free
spelling dictionaries available for these applications.
• In your file manager, create a new folder in a suitable location in which to store spelling
dictionaries (D:\Translations\spellcheckers in the example below).
• In OmegaT, select Options > Spell Checking, then click Choose beside the Dictionary file
folder field. Navigate to and select the folder you created for dictionaries.
• Place the dictionary files you wish to use in this folder. There are essentially two ways in
which you can do this. You can either copy files manually, i.e. from elsewhere on your
system, using your file manager; or you can use OmegaT's "Install new dictionary"
function to provide a list of available dictionaries to select from. Note that the "Install"
function requires an Internet connection. The selected languages will then be installed and
will eventually appear in your spell checker setup window (this may take a while).
Copying the files manually makes sense if you already have suitable dictionary files on your
system, for instance as part of your Apache OpenOffice, LibreOffice, Firefox or Thunderbird
installation. It is simpler, however, to look for dictionaries online, using the URL of online
dictionaries field:
Figure 22.1. Spellchecker setup
Clicking on Install new dictionary button will open the Dictionary installer window, where you
can select the dictionaries you want to install.
The names of the files must correspond to the language code of your target language as
defined in the project properties dialog (Project > Properties). For example, if you have
selected ES-MX (Mexican Spanish) as the target language, the dictionary files must be named
es_MX.dic and es_MX.aff. If you only have a standard Spanish dictionary available, with
file names es_es.dic and es_es.aff for instance, you can copy these files to es_MX.dic and
es_MX.aff, and the spelling dictionary will work. Note that this will of course check for
standard (Castilian) rather than Mexican Spanish.
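Copying a dictionary pair under a second language code can also be scripted. The following is a minimal sketch in Python, assuming both files sit in the dictionary folder chosen in the spell checker setup; the function name copy_dictionary is hypothetical and not part of OmegaT.

```python
import shutil
from pathlib import Path


def copy_dictionary(folder, source_code, target_code):
    """Copy <source_code>.aff/.dic to <target_code>.aff/.dic inside folder."""
    folder = Path(folder)
    copied = []
    for ext in (".aff", ".dic"):
        src = folder / f"{source_code}{ext}"
        dst = folder / f"{target_code}{ext}"
        if not src.is_file():
            # Both files are needed, so fail early if one is missing.
            raise FileNotFoundError(src)
        shutil.copyfile(src, dst)
        copied.append(dst.name)
    return copied


# Example call (the folder path is only an illustration):
# copy_dictionary(r"D:\Translations\spellcheckers", "es_es", "es_MX")
```

The originals are left in place, so the standard Spanish dictionary remains available under its own name.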
To enable the spell checker, select Options > Spell Checking and tick the Automatically check
the spelling of text check box (see above).
Figure 22.2. Using spellchecker
Right-clicking on an underlined word (Artund in the figure above) opens a drop-down menu
listing suggestions for the correction (Art und). You can also instruct the spell checker to
ignore all the occurrences of the mis-spelled word, or add it to the dictionary.
3. Hints
If the spell checker is not working, first make sure that the check box "Automatically
check the spelling of text" in the spell checker dialog (Options > Spell checking...) is checked.
Also check the target language code of your project against the available vocabularies
in the setup window. The spell checker uses the target language code to determine the
language to be used: if the target language is Brazilian Portuguese (PT_BR), the subfolder
with vocabularies must contain the two vocabulary files, called pt_br.aff and pt_br.dic.
If you have already translated a large body of text, and then realize the target language
code of the project does not match the spell checker's language code (you specified pt_BR
as the language, but there are no pt_BR vocabularies, for instance) you can simply copy the
two corresponding files and rename them (e.g. from pt_PT.aff and pt_PT.dic to pt_BR.aff and
pt_BR.dic). Of course, it is much wiser to take a short break and download the correct versions
of the spell checker.
Note that Remove physically removes the selected vocabularies. If they are used by some
other application on your system, they will disappear from that application, too. If, for
whatever reason, you need to do this from time to time, it may make sense to copy the files
involved to a different folder, reserved just for use by OmegaT.
Chapter 23. Miscellaneous subjects
1. OmegaT Console Mode
Note
Of interest for advanced users only!
The purpose of the console (i.e. command line) mode is to permit the use of OmegaT as a
translation tool in a scripting environment. When launched in console mode, no GUI is loaded
(it will therefore work on any console) and the given project is automatically translated. An
example would be a software project with a GUI localized in a number of languages. Using the
console mode, one can make generating a localized interface a part of the build process.
1.1. Prerequisites
To run OmegaT, a valid OmegaT project must be available. The location is irrelevant, since it
must be specified explicitly on the command-line at launch.
If you need non-standard settings, the corresponding configuration files (filters.conf and
segmentation.conf) must be present. This can be achieved in two ways:
• Run OmegaT normally (with the GUI) and set the settings. If you start OmegaT in console
mode, it will use the settings you configured.
• If you are unable to run OmegaT normally (no graphical environment available): copy
the settings files from some other OmegaT installation on another machine to a specific
folder. The location does not matter, since you can add it to the command line at launch
(see below). The relevant files filters.conf and segmentation.conf can be found in the user
home folder (e.g. C:\Documents and Settings\%User%\OmegaT under Windows, %user%/.omegat/
under Linux).
--config-dir=/path/to/config-files/ \
--mode=console-translate \
--source-pattern={regexp} \
--tag-validation=[block|warn]
Explanation:
• <project-dir> tells OmegaT where to find the project to be translated. If given, OmegaT
launches in console mode and translates the given project.
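In a build script, the console-mode call is usually assembled from variables. The sketch below, in Python, builds such an argument list from the options shown above. The opening part of the command is not shown in this section, so `java -jar OmegaT.jar` is an assumption here, and the function name build_console_command is hypothetical.

```python
def build_console_command(project_dir, config_dir=None, source_pattern=None,
                          tag_validation=None, quiet=False, jar="OmegaT.jar"):
    """Assemble an argument list for a console-mode translation run."""
    # Assumption: OmegaT is started from its jar file with "java -jar".
    cmd = ["java", "-jar", jar, project_dir, "--mode=console-translate"]
    if config_dir:
        cmd.append(f"--config-dir={config_dir}")
    if source_pattern:
        cmd.append(f"--source-pattern={source_pattern}")
    if tag_validation is not None:
        if tag_validation not in ("block", "warn"):
            raise ValueError("tag_validation must be 'block' or 'warn'")
        cmd.append(f"--tag-validation={tag_validation}")
    if quiet:
        cmd.append("--quiet")
    return cmd


command = build_console_command("/path/to/project",
                                config_dir="/path/to/config-files/",
                                tag_validation="warn")
```

The resulting list can be passed to subprocess.run(command, check=True); keeping the arguments as a list avoids shell quoting problems with paths that contain spaces.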
1.3. Quiet option
An extra command line parameter specific to console mode: --quiet. In the quiet mode, less
info is logged to the screen. The messages you would usually find in the status bar are not
displayed.
alignDir must contain a translation in the target language of the project. E.g., if the project
is EN->FR, alignDir must contain a bundle ending with _fr. The resulting tmx is stored in the
omegat folder under the name align.tmx.
3. Font settings
In this dialog one can define the font used by OmegaT in the following windows:
• Search window
The dialog can be accessed via the Options → Font... item in the Main menu. The dialog
contains:
Note: In some cases it may take quite some time for OmegaT to update the display after
the font setting has been changed. This is especially the case when a large file containing
many segments is open in the editor, and/or slow hardware is used. Note also that some
fonts behave better for some language pairs than for others. In particular, if you are
translating between two languages with different alphabets/writing systems (such as Russian
and Japanese), select a font that can be used for both.
If you believe that you have lost translation data, you can use the following procedure to
restore the project to its most recently saved state, usually no more than about 10 minutes
old:
3. select the backup translation memory that is the most likely to contain the data you are
looking for
4. rename it project_save.tmx
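The last two steps can be sketched in Python as follows. This is an illustration under stated assumptions, not an OmegaT tool: it assumes the project is closed, that the backups are in the project's omegat folder, and that their names match project_save.tmx.*.bak (the pattern is a parameter so it can be adapted). It simply takes the most recently modified backup, which may not always be the one most likely to contain your data. The function name restore_latest_backup is hypothetical.

```python
import shutil
from pathlib import Path


def restore_latest_backup(omegat_dir, pattern="project_save.tmx.*.bak"):
    """Copy the most recently modified backup over project_save.tmx."""
    omegat_dir = Path(omegat_dir)
    backups = sorted(omegat_dir.glob(pattern), key=lambda p: p.stat().st_mtime)
    if not backups:
        raise FileNotFoundError(f"no backup matching {pattern} in {omegat_dir}")
    current = omegat_dir / "project_save.tmx"
    if current.exists():
        # Keep the existing file under another name instead of overwriting it.
        shutil.copyfile(current, omegat_dir / "project_save.tmx.damaged")
    shutil.copyfile(backups[-1], current)
    return backups[-1].name
```

Copying rather than renaming leaves the backup itself untouched, so the step can be repeated with an older backup if the first choice turns out to be wrong.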
• Until you are familiar with OmegaT, create translated files at regular intervals and check
that the translated file contains the latest version of your translation.
• Take particular care when making changes to the files in /source while in the middle of a
project. If the source file is modified after you have begun translating, OmegaT may be
unable to find a segment that you have already translated.
• Use these Help texts to get started. Should you run into problems, post a message in the
OmegaT user group [https://groups.yahoo.com/neo/groups/OmegaT/info]. Do not hesitate
to post in the language you feel the most familiar with.
Appendix A. Languages - ISO 639 code list
Please check the ISO 639 Code Tables [http://www.sil.org/ISO639-3/codes.asp] for further and
up-to-date information about language codes.
Appendix B. Keyboard shortcuts in the editor
This short text describes key behavior in the editor pane. The term "Move to inside segment"
means that the cursor moves to the beginning of the segment if it was previously before the
segment, and to the end of the segment if it was previously after it.
* These keys behave differently when the cursor is outside the editable segment:
• Backspace: nothing
• Delete: nothing
The "Shift" key doesn't exhibit any special behavior per se: when the "Shift" key is pressed,
all keys move the cursor in their usual manner, except in the case of the Shift+Enter
combination, which inserts a line break into the text.
The system-wide commands Select All (Ctrl+A), Paste (Ctrl+V), Cut (Ctrl+X), Copy (Ctrl+C),
Insert Match or Selection (Ctrl+I) and Insert source (Ctrl+Shift+I) act in principle on the
text within the currently open segment only.
It is possible to move from one pane to another (for instance, from the Editor to the Fuzzy
Matches pane) using Ctrl+Tab. Ctrl+Shift+Tab moves back to the previous pane. The
shortcuts Ctrl+A and Ctrl+C work in panes, allowing you to copy all or some of the information
to the clipboard.
Note that you can reassign the shortcuts to your own preferences. See Appendix ShortCut
Customization.
Appendix C. OmegaT Team Projects
1. Version control - introduction
The collaborative translation offered by OmegaT is based on the functionality of version or
revision control, widely used by the software community to maintain control of changes to the
code of a program and allow unimpeded collaboration within the development team. OmegaT
supports two of the popular version control systems (VCS for short), Apache Subversion
[http://subversion.apache.org] (often abbreviated SVN, after the command name svn) and
Git [http://git-scm.com/]. The advantages of a VC system for a team of translators can be
summarized as follows:
• Several team members can work on the translation project simultaneously without
interfering with each other
• They can share common material, like project translation memory and its glossary
• Every three minutes by default, an updated version of the shared data is available to the rest
of the team
• Conflicts - for instance alternative translations of the same segment or glossary entry - can
be monitored, resolved and merged
The following terms, to be used in the text below, deserve a short explanation:
• VCS server: the SVN or Git server is the environment where the common material is kept
and maintained on the net. The server can exist in the local network, but in the majority of
cases it will be available on the Internet, i.e. via a URL. One member of the team, the
project administrator, needs to be acquainted with handling the server side, i.e. the job of
setting up the environment, importing the OmegaT project, assigning the access rights for
the team members, resolving the conflicts, etc.
• VCS client: to interface with the server, an SVN or Git client must be installed on the
computers of the "project managers" involved in the OmegaT project. Very popular clients
for the Windows environment are TortoiseSVN [http://tortoisesvn.net/] and TortoiseGit
[http://code.google.com/p/tortoisegit/]. Other operating systems (Linux, OS X) offer similar
packages.
• repository: the place where the shared material is saved and maintained, either on a local
network or on the Internet. Project members connect with it via their VCS client.
• checkout: the operation that creates a working copy from the repository to your local
computer. The server keeps the information on checkouts, so that later commits (see below)
can be performed in an orderly fashion.
• commit: once a new local version of the checked-out material is ready, it can be committed
to the repository and thus made available to the rest of the team. The server makes
sure that any conflicting changes, due to two members working on the same checked-out
contents, will be resolved.
• administrator: the person responsible for the creation and maintaining of the repository,
i.e. taking care of the server side of the task. To avoid any problems, one person only should
have these rights at least initially.
implications in terms of confidentiality, since you are loading the original document on a
server outside of your direct control. Alternatively, to avoid this issue you can set up a private
SVN server, for example if you already have an Apache server that includes the software in
question (e.g. VisualSVN).
Once the SVN server is available, project managers must locally install an SVN client in order
to manage the project contents on their computers. For Windows we recommend TortoiseSVN
[http://tortoisesvn.net/]. For Mac you can download the client, for instance, from SourceForge
[https://sourceforge.net/projects/macsvn/]. For Linux see Subversion Commands and Scripts
[http://www.yolinux.com/TUTORIALS/Subversion.html].
2.1. Creating a repository
The procedure presented here relies on the free SVN server (limited to 2 users) offered by
ProjectLocker [http://projectlocker.com/]. Note that the creator of the repository implicitly has
the administrator rights for the repository created. Sign in to the site first or, if it is your
first time on the site, register for it and note your user name and password for future projects.
2. Type the name and description of the repository (OmegaT and OmegaT SL Localization
in the example used here).
3. Choose SVN.
Open the Projects view for your account. The URL shown under Project Services will be used
by SVN to connect clients to the SVN server you have just established. This is also the place
to add members of the team to the project and assign them their rights. Note that the team
members have to be registered first, before you can add them to the project (Note: in the
free version of ProjectLocker you are allowed only two users per project).
Projects can be managed according to your development style and needs. As with
OmegaT projects, you will need separate repositories for different language
pairs. Within a given language pair it is best to keep different subjects and/or clients in
separate repositories as well. The alternative is to have one single repository with subfolders
Project1, Project2, etc., and share the common material via common tm, glossary and
dictionary folders.
For the example shown here we chose the simplest setup: one OmegaT project in one single
repository.
Enter the URL provided by ProjectLocker into the field URL of repository. Make sure the
field Checkout directory is correct, i.e. specifies the empty folder you have created, and
press OK. Once the operation has finished, you can check the said folder: it should now
contain a subfolder .svn, and a green OK badge on its icon will show that the contents of the
folder are up-to-date:
In the next step, we will add the OmegaT files to the local folder. The following files are to be
shared among the members of the team and thus have to be included in any case:
The administrator may decide to include the following folders and their contents as well: tm,
glossary and dictionary. Also ignored_words.txt and learned_words.txt in the omegat folder
may be worth sharing and maintaining at the team level. In any case, avoid adding the bak files,
project_stats.txt and project_stats_match.txt in the omegat subfolder, as they would
needlessly bloat the repository. You might want to apply the same to the target
folder and its contents.
After copying the required files into the checkout folder you will notice that its icon has
changed: the green OK badge has changed to a red exclamation sign, signifying the change
in the local copy of the repository. The following two steps will bring the server version up
to date:
• add the copied files to the local version of the repository: right-click on the local
checkout folder and select TortoiseSVN > Add. In the dialog that opens, leave all options as
per default and click OK. The Add Finished! window, similar to the one below will appear:
• commit local changes to the server: right-click on the local checkout folder and select
SVN Commit.... The Commit window (see below) opens. Check the changes to be made,
i.e. the folders and files added in this case.
Enter an appropriate message into the message window and press OK. The Commit window
will open and show the progress of the commit command. It will first commit the current
contents to the server repository and then update the local copy of the repository - i.e. the
contents of .svn subfolder - so that it is up to date with the latest repository version.
• update local files from the local repository copy - the changes received from the
server repository reside within the .svn subfolder but not yet in the files and folders
themselves. To update the local files, right-click on the checkout folder and select SVN
Update. Check the contents of the folder to confirm that the local copy of the repository
and the corresponding files and folders correspond to the latest server version:
For subsequent use, all that is needed is to open the project like any other OmegaT project.
OmegaT will recognize that it is a team project and will synchronize everything automatically,
every three minutes by default.
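The TortoiseSVN steps described above (Add, SVN Commit, SVN Update) have counterparts in the Subversion command-line client. The following Python sketch shows the same sequence; it assumes that the svn client is installed and on the PATH, and the helper names publish_commands and publish are hypothetical.

```python
import subprocess
from typing import List


def publish_commands(message: str) -> List[List[str]]:
    """Build the svn commands mirroring Add, SVN Commit and SVN Update."""
    return [
        # --force makes svn descend into already-versioned folders and add
        # only the files that are not yet under version control.
        ["svn", "add", "--force", "."],
        ["svn", "commit", "-m", message],
        ["svn", "update"],
    ]


def publish(checkout_dir: str, message: str) -> None:
    """Run the three commands inside the local checkout folder."""
    for command in publish_commands(message):
        # check=True stops at the first command that fails.
        subprocess.run(command, cwd=checkout_dir, check=True)
```

Calling publish with the path of the local checkout folder and a commit message would run the three commands in order. Building the command list separately makes it possible to inspect the commands before anything is executed.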
Appendix D. Tokenizers
1. Introduction
Tokenizers (or stemmers) improve the quality of matches by recognizing inflected words in
source and translation memory data. They also improve glossary matching.
A stemmer for English, for example, should identify the string "cats" (and possibly "catlike",
"catty" etc.) as based on the root "cat", and "stemmer", "stemming", "stemmed" as based
on "stem". A stemming algorithm reduces the words "fishing", "fished", "fish", and "fisher" to
the root word, "fish". This is especially useful for languages that attach prefixes and
suffixes to the stem. Borrowing an example from Slovenian, here is "good" in all possible
grammatically correct forms:
• lepši, lepša, lepše - comparative, nominative, masculine, feminine, neuter, respectively. Plural
form of the adjective
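To make the idea concrete, here is a deliberately naive suffix-stripping stemmer in Python. It is not the algorithm used by the tokenizers bundled with OmegaT; the suffix list and the minimum stem length are assumptions chosen only so that the English "fish" and "cat" examples above work.

```python
# Suffixes are tried in this order, longest first.
SUFFIXES = ("ing", "ed", "er", "s")
MIN_STEM_LENGTH = 3


def naive_stem(word: str) -> str:
    """Strip one known suffix, keeping at least MIN_STEM_LENGTH characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        stem = lowered[: -len(suffix)]
        if lowered.endswith(suffix) and len(stem) >= MIN_STEM_LENGTH:
            return stem
    return lowered


def same_root(first: str, second: str) -> bool:
    """Treat two words as matching when their naive stems are equal."""
    return naive_stem(first) == naive_stem(second)
```

With this sketch, "fishing", "fished", "fish" and "fisher" all reduce to "fish", so a segment containing one form can be matched against a translation memory or glossary entry containing another. A real stemmer also handles cases this toy version ignores, such as doubled consonants ("stemming") and irregular forms.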
2. Languages selection
Tokenizers are included in OmegaT and active by default. OmegaT automatically selects a
tokenizer for the source and the target language according to the language settings of the
project. It is possible to select another tokenizer (Language Tokenizer) or a different version
of the tokenizer (Tokenizer Behavior) from the Project Properties window.
In case no tokenizer exists for the current languages, OmegaT uses Hunspell instead (in that
case, make sure that relevant Hunspell dictionaries are installed).
Incompatibilities
OmegaT will not launch if tokenizers are found in the /plugin folder. Remove all the
tokenizers from the /plugin folder before starting OmegaT.
Appendix E. LanguageTool plugin
1. Introduction
LanguageTool [http://www.languagetool.org] is open-source style and grammar
proofreading software for English, French, German, Polish, Dutch, Romanian, and a number
of other languages - see the list of supported languages [http://www.languagetool.org/languages/].
You can think of LanguageTool as software that detects errors a simple spell checker
cannot detect, e.g. mixing up there/their, no/now etc. It can also detect some grammar
mistakes. It does not include spell checking. LanguageTool will find errors for which a rule
has been defined in its language-specific configuration files.
Incompatibilities
LanguageTool will not work properly if an old version is found in the /plugin folder.
Remove LanguageTool from the /plugin folder before starting OmegaT.
Appendix F. Scripts
1. Introduction
OmegaT allows you to run scripts written in different scripting languages.
2. Use
Clicking Tools > Scripting opens the Scripting window:
The Scripting window allows you to load an existing script into the text area and run it against
the currently open project. To customize the script feature, do the following:
• Load a script into the editor by clicking on its name in the list on the left panel.
• Right-click on a button from "<1>" to "<12>" in the bottom panel and select "Add". In the
above example, two scripts (position 1 and 2) have already been added.
• When you left-click on the number, the selected script will run. You can also start the
selected scripts from the main menu by using their entries in the Tools menu or by pressing
Ctrl+Alt+F# (where # is a number from 1 to 12).
By default, scripts are stored in the "scripts" folder located in the OmegaT installation folder
(the folder that contains OmegaT.jar).
You can add new scripts there; they will then appear in the list of available scripts in the
Scripting window.
3. Scripting languages
The following scripting languages have been implemented:
All the languages have access to the OmegaT object model, with the project as the top object.
The following code snippet in Groovy, for instance, scans through all the segments in all files
in the current project and, if the translation exists, prints out the source and the target of
the segment:
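// Get the list of files in the current project
files = project.projectFiles;
for (i in 0 ..< files.size())
{
    // Visit every segment (entry) of the current file
    for (j in 0 ..< files[i].entries.size())
    {
        currSegment = files[i].entries[j];
        // Look the translation up once and reuse the result
        info = project.getTranslationInfo(currSegment);
        if (info)
        {
            source = currSegment.getSrcText();
            target = info.translation;
            // Print source and target, separated by " >>>> "
            console.println(source + " >>>> " + target);
        }
    }
}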
Appendix G. OmegaT on the web
1. OmegaT sites and OmegaT SourceForge project
The OmegaT web site [http://www.omegat.org/] contains links to numerous OmegaT
resources. User support is provided on a volunteer basis at the OmegaT Yahoo! User Group
[http://tech.groups.yahoo.com/group/omegat/]. The FAQ
[http://tech.groups.yahoo.com/group/OmegaT/database?method=reportRows&tbl=1] is a good
starting point for finding answers to questions you may have. For the latest version of OmegaT,
refer to the downloads page at www.omegat.org. You can also file bug reports
[https://sourceforge.net/p/omegat/bugs/] and requests for enhancements
[https://sourceforge.net/p/omegat/feature-requests/].
2. Bug reports
Remember that every good bug report needs just three things:
• Steps to reproduce
You should add copies of files, portions of the log, screen shots, and anything else that you
think will help the developers to find and fix your bug. Note that bug reports and requests for
enhancements are publicly visible, so you should not add any sensitive files. If you wish to
keep track of what is happening to the report, register as a SourceForge user, log in and file
a bug report, or simply click Monitor at the top of the report.
If you would like to help support the continued development of OmegaT, it would be very much
appreciated - click on this link to go to the OmegaT PayPal account
[https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=9UB6Y2BBF99LL].
Appendix H. Shortcuts customization
1. Shortcuts customization
Most of the items that appear in the main menu can have a new shortcut assigned. You can
change the already assigned shortcuts and add new shortcuts by putting a shortcut definition
file in your OmegaT preferences folder (see User files location).
The shortcut definition file must be named MainMenuShortcuts.properties and must contain
at most one shortcut definition per line. Empty lines are accepted and comment lines should
start with "//". Anything after the "//" will be ignored.
The shortcut definition syntax is the following: <menu item code>=<shortcut>, where
<menu item code> is a code taken from the tables below and <shortcut> is a combination
of pressed keys specified by the user [1].
For example:
• projectOpenMenuItem=ctrl O
• editCreateGlossaryEntryMenuItem=ctrl shift G
The first is the shortcut for Open Project, the second for Create Glossary Entry.
To assign Shift+Ctrl+O to Open Project instead, the definition would read:
projectOpenMenuItem=shift ctrl O
If you are on a Mac and you want to add a Shift+Command+S shortcut to Tools → Statistics,
add the following line to your MainMenuShortcuts.properties:
toolsShowStatisticsStandardMenuItem=shift meta S
Then save the file and relaunch OmegaT. Your new shortcuts should now appear next to the
menu items you have modified. If they do not conflict with system shortcuts, they should be
available from within OmegaT.
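The file format described in this section is simple enough to check with a few lines of code. The following Python sketch parses the text of a MainMenuShortcuts.properties file according to the rules stated above: one <menu item code>=<shortcut> definition per line, empty lines allowed, and everything after "//" ignored. The function name parse_shortcuts and its error handling are assumptions; OmegaT's own loader may behave differently, and the keystroke itself is not validated here.

```python
from typing import Dict


def parse_shortcuts(text: str) -> Dict[str, str]:
    """Map menu item codes to shortcut strings."""
    shortcuts: Dict[str, str] = {}
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        # Anything after "//" is a comment.
        line = raw_line.split("//", 1)[0].strip()
        if not line:
            continue  # empty line or comment-only line
        code, separator, shortcut = line.partition("=")
        if not separator or not code.strip():
            raise ValueError(f"line {line_number}: expected <code>=<shortcut>")
        # In this sketch, a later definition of the same code replaces an earlier one.
        shortcuts[code.strip()] = shortcut.strip()
    return shortcuts


EXAMPLE = """\
// Open Project with Shift+Ctrl+O
projectOpenMenuItem=shift ctrl O

toolsShowStatisticsStandardMenuItem=shift meta S  // Mac: meta is the Command key
"""

shortcuts = parse_shortcuts(EXAMPLE)
```

Running such a check before relaunching OmegaT can catch a line that lacks the "=" separator. It does not tell you whether a shortcut conflicts with a system shortcut.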
[1] The full syntax for keystrokes (shortcuts) is defined in the following Java 1.6 documentation from Oracle (bottom of page):
Java 1.6 keystrokes shortcuts [http://docs.oracle.com/javase/6/docs/api/javax/swing/KeyStroke.html]
[2] On the Mac, the modifier meta must be used to specify the command key.
[3] The possible keyevents (keys) are listed in the following Java 1.6 documentation from Oracle: Java 1.6 keyEvents description
[http://docs.oracle.com/javase/6/docs/api/java/awt/event/KeyEvent.html]
[4] The default OmegaT shortcuts are available from Sourceforge: Default OmegaT Shortcuts
[https://sourceforge.net/p/omegat/svn/HEAD/tree/branches/release-3-6/src/org/omegat/gui/main/MainMenuShortcuts.properties]
The default OmegaT shortcuts for the Mac are also available from Sourceforge; they all use "meta" instead of "ctrl":
Default OmegaT Shortcuts for the Mac
[https://sourceforge.net/p/omegat/svn/HEAD/tree/branches/release-3-6/src/org/omegat/gui/main/MainMenuShortcuts.mac.properties]
2. Project Menu
Table H.1. Project Menu
3. Edit Menu
Table H.2. Edit Menu
4. GoTo Menu
Table H.3. GoTo Menu
Menu Item                  Default shortcut                  Menu Item Code
Next Untranslated Segment  Ctrl+U                            gotoNextUntranslatedMenuItem
Next Translated Segment    Ctrl+Shift+U                      gotoNextTranslatedMenuItem
Next Segment               Ctrl+N or Enter or Tab            gotoNextSegmentMenuItem
Previous Segment           Ctrl+P or Ctrl+Enter or Ctrl+Tab  gotoPreviousSegmentMenuItem
Segment number...          Ctrl+J                            gotoSegmentMenuItem
Next Note                  -                                 gotoNextNoteMenuItem
Previous Note              -                                 gotoPreviousNoteMenuItem
Next Unique Segment        Ctrl+Shift+Q                      gotoNextUniqueMenuItem
5. View Menu
Table H.4. View Menu
Menu Item                                        Default shortcut  Menu Item Code
Mark Translated Segments                         -                 viewMarkTranslatedSegmentsCheckB
Mark Untranslated Segments                       -                 viewMarkUntranslatedSegmentsChec
Display Source Segments                          -                 viewDisplaySegmentSourceCheckBox
Mark Non-Unique Segments                         -                 viewMarkNonUniqueSegmentsCheckB
Mark Segments with Notes                         -                 viewMarkNotedSegmentsCheckBoxM
Mark Non-breakable Spaces                        -                 viewMarkNBSPCheckBoxMenuItem
Mark Whitespace                                  -                 viewMarkWhitespaceCheckBoxMenuI
Mark Bidirectional Algorithm Control Characters  -                 viewMarkBidiCheckBoxMenuItem
Mark Auto-Populated Segments                     -                 viewMarkAutoPopulatedCheckBoxMen
Modification Info/Display None                   -                 viewDisplayModificationInfoNoneRadi
Modification Info/Display Selected               -                 viewDisplayModificationInfoSelectedR
Modification Info/Display All                    -                 viewDisplayModificationInfoAllRadioB
6. Tools Menu
Table H.5. Tools Menu
Menu Item                           Default shortcut  Menu Item Code
Validate Tags                       Ctrl+Shift+V      toolsValidateTagsMenuItem
Validate Tags for Current Document  -                 toolsSingleValidateTagsMenuItem
Statistics                          -                 toolsShowStatisticsStandardMenuItem
Match Statistics                    -                 toolsShowStatisticsMatchesMenuItem
Match Statistics per File           -                 toolsShowStatisticsMatchesPerFileMe
7. Options Menu
Table H.6. Options Menu
Menu Item                                                Default shortcut  Menu Item Code
Use TAB To Advance                                       -                 optionsTabAdvanceCheckBoxMenuIte
Always Confirm Quit                                      -                 optionsAlwaysConfirmQuitCheckBoxM
Glossary/Display Context Description for TBX Glossaries  -                 optionsGlossaryTBXDisplayContextCh
8. Help Menu
Table H.7. Help Menu
Menu Item        Default shortcut  Menu Item Code
User Manual...   F1                helpContentsMenuItem
About...         -                 helpAboutMenuItem
Last Changes...  -                 helpLastChangesMenuItem
Log...           -                 helpLogMenuItem
Appendix I. Legal notices
1. For the documentation
Copyright
The documentation distributed with OmegaT includes the User Manual and the readme.txt
document. The documentation is Copyright ©2013 Vito Smolej, ©2014-2016 Vincent Bidaux.
The author of the Chapter Learn to use OmegaT in 5 minutes! is Samuel Murray, Copyright
©2005-2012.
The documentation is a free document; you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software Foundation; either
version 3 of the License, or (if you prefer) any later version.
Warranty
The documentation is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details.
2. For the application
Copyright
OmegaT is Copyright © 2000-2016 Keith Godfrey, Zoltan Bartko, Volker Berlin, Didier
Briel, Kim Bruning, Alex Buloichik, Thomas Cordonnier, Sandra Jean Chua, Enrique Estévez
Fernández, Martin Fleurke, Wildrich Fourie, Tony Graham, Phillip Hall, Jean-Christophe Helary,
Chihiro Hio, Thomas Huriaux, Hans-Peter Jacobs, Kyle Katarn, Piotr Kulik, Ibai Lakunza Velasco,
Guido Leenders, Aaron Madlon-Kay, Fabián Mandelbaum, Manfred Martin, Adiel Mittmann,
Hiroshi Miura, John Moran, Maxym Mykhalchuk, Arno Peters, Henry Pijffers, Briac Pilpré,
Tiago Saboga, Andrzej Sawuła, Benjamin Siband, Yu Tang, Rashid Umarov, Antonio Vilei, Ilia
Vinogradov, Martin Wunderlich and Michael Zakharov.
OmegaT is free software; you can redistribute it and/or modify it under the terms of the
GNU General Public License as published by the Free Software Foundation; either version 3
of the License, or (if you prefer) any later version.
Warranty
OmegaT is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE. See the GNU General Public License for more detail.
Appendix J. Acknowledgements
1. Thank you all!
Whatever the inconsistencies, omissions and straightforward errors you may find in the
present version, I declare them all my own. This manual, however, would not be possible
without the help and support from a number of people. Explicit thanks to:
• Marc Prior: correcting my first draft was an act of love for OmegaT and the English language.
• Didier Briel: I could not do without Didier's patient and persistent help with DocBook
intricacies. Not to mention his care and diligence, keeping repositories intact and in good
order.
• Samuel Murray: for the introductory chapter "Learn to use OmegaT in 5 minutes".
• Will Helton: his final reading of the draft has spared me a lot of embarrassment. One could
only wonder, how many the and a prepositions would still be missing without his invaluable
help.
• Jean-Christophe Helary: special thanks to JC for his concise description of OmegaT run,
command line parameters and all other details, I have yet to notice.
• Last but not least: my thanks to all the contributors to OmegaT documentation tracker
[https://sourceforge.net/p/omegat/documentation/] for all the inconsistencies found in the
previous versions of the documentation. Keep up your good work!
Index
C
Comments
  Comments pane, 22
Customizing OmegaT
  Linux, 6
  OS X
    Launch parameters, 7
D
Dictionaries, 81
  Britannica, 81
  Downloading and installing, 81
  Longman, 81
  Merriam Webster, 81
    (see also Dictionaries)
  Problems with, 82
  StarDict, 81
  Webster, 81
E
Editing Behavior, 16
Encoding
  Central and Eastern European, 59
  Plain text files, 59
  Unicode, 59
  Western, 59
F
File filters, 15
  Dialog, 42, 45
  Editing, 44
  File type and name pattern, 44
  global vs project file filters, 40
  Options, 42
  Project specific file filters, 42
  Source, target - encoding, 45
File formats
  formatted, 52
    (see also Source files)
  Unformatted, 52
    (see also Source files)
Font, 15
G
Glossaries, 21, 83
  Creating a glossary, 84
  File format, 84
  Glossary pane
    multiple-words entries, 84
  Location of the writable glossary file, 85
  Microsoft Terminology collection, 84
  Priorities, 85
  Problems with glossaries, 85
  TBX format, 84
  Trados MultiTerm, 85
Glossaries, Glossary pane, 83
I
Installing OmegaT
  Linux, 6
  OS X, 7
  Other systems, 8, 8
  Windows, 5
ISO language codes, 98
K
Keyboard shortcuts, 35
  Editing, 36
  Goto, 36, 37
  Other, 37
  Project, 36
L
Languages, 98
Legal notices, 121
  For the application, 121
  For the documentation, 121
Lucene (see Tokenizer)
M
Machine Translation, 88
  Apertium, 89
  Belazar, 89
  Google Translate, 88
  Introduction, 88
  Microsoft Translator, 90
  MyMemory, 90
  Troubleshooting, 91
  Yandex Translate, 90
Match Statistics, 24
  (see also Menu Tools)
Matches
  Matches pane - figure, 19
  Matches pane setup - figure, 20
  Matches statistics, 32
Menu, 26
  Edit, 27
  Goto, 30
  Help, 35
  Options, 32
    Editing behavior..., 56
  Project, 26
  Tools, 32
  View, 31
Menu Help
  Help browser, 25
  User Manual..., 25
Menu Options
  Editing behaviour
    Converting numbers, 57
    Empty translation, 56
    Exporting the current segment, 57
    Inserting fuzzy matches, 56
    Segments with alternative translation, 57
    Translation equal to source, 57
  Font..., 96