Multidimensional Quality Metrics (MQM) Community Group
The Multidimensional Quality Metrics (MQM) Community Group fosters the development of MQM for translation and localization quality assessment and its interoperability with W3C’s Internationalization Tag Set (ITS) 2.0 recommendation. Membership is open to parties interested in contributing to or implementing MQM.
Note: Community Groups are proposed and run by the community. Although W3C hosts these
conversations, the groups do not necessarily represent the views of the W3C Membership or staff.
Derivative works that change the content of this specification MUST attribute the contributions from this work but MUST NOT claim to be “Multidimensional Quality Metrics” or “MQM”. Derivative works MUST therefore use a distinct name that does not imply endorsement of changes. However, implementations of MQM as set forth in this document may state that they implement or use MQM without any special permissions.
This version does not contain updates made in the activities of ASTM F43, which will be added in the next posted version.
Editors
Arle Lommel (DFKI)
Aljoscha Burchardt (DFKI)
Attila Görög (TAUS)
Hans Uszkoreit (DFKI)
Alan K. Melby (LTAC Global)
Acknowledgements
Special thanks are due to the following individuals for their contributions:
Serge Gladkoff (Logrus)
Leonid Glazychev (Logrus)
Kim Harris (text&form)
Dale Schultz (IBM)
Jean-François Vanreusel (Adobe)
Document status
This document contains a list of the MQM issue types. This version is a stable version and can be used for implementation.
Feedback
Feedback on this document should be submitted to info@qt21.eu.
Overview
This document defines the issue types used by the Multidimensional Quality Metrics (MQM) framework. It contains a description of the issue types.
1. MQM Issue types
MQM defines a total of over 100 issue types, as defined in this section. They are derived from an examination of major quality assessment systems, both ones based on automatic detection of issues and ones based on manual assessment by reviewers. As quality assessment systems differ considerably in the issues they check, the MQM issue types represent a (non-strict) superset of issues found in translations (as product, as opposed to process). The superset is non-strict because it represents an abstraction of various systems and, in some cases, is less granular than actual systems. For example, an existing system might distinguish between four kinds of issues related to whitespace, but MQM would categorize all of them as Whitespace (whitespace). Information on extending MQM is available in the MQM definition.
In this document MQM issue types are referred to by name followed by the MQM ID value in parentheses on a gray background, e.g., Issue name (issue-name). The issue ID values are linked to the full description of the issue below.
MQM issues exist in a hierarchy, with more specific issues lower in the hierarchy constituting “subtypes” of their parents. For example, the issue type Mistranslation (mistranslation) is a subtype of the more general issue type Accuracy (accuracy). Because the issues exist in a hierarchy, rather than as a flat list, MQM can be realized at any level of granularity. At one extreme an MQM-compliant metric could check only two high-level issues, Accuracy (accuracy) and Fluency (fluency); at the other extreme a metric could check all issues defined in MQM. In most cases the number of issues checked will be somewhere between these extremes. Guidance on selecting issue sets can be found in the MQM definition. As a general rule, metrics should check the fewest issues needed to meet the requirements of users.
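The roll-up that the hierarchy makes possible can be sketched as follows. This is a minimal illustration and not part of MQM: the `PARENT` table is deliberately partial (only the Mistranslation to Accuracy relationship is taken from the text above), and the function name `roll_up` is hypothetical.

```python
from typing import Optional

# Partial parent table keyed by MQM ID. Only the mistranslation -> accuracy
# link comes from the text above; a real table would cover every issue type.
PARENT: dict[str, Optional[str]] = {
    "accuracy": None,
    "fluency": None,
    "mistranslation": "accuracy",
}


def roll_up(issue_id: str, selected: set[str]) -> Optional[str]:
    """Return the nearest issue in `selected`, walking up from `issue_id`.

    Returns None if neither the issue nor any of its ancestors is checked
    by the metric.
    """
    current: Optional[str] = issue_id
    while current is not None:
        if current in selected:
            return current
        current = PARENT.get(current)
    return None


# A coarse metric that checks only the two high-level issues.
coarse_metric = {"accuracy", "fluency"}
print(roll_up("mistranslation", coarse_metric))  # accuracy
```

With this kind of lookup, an annotation made at a fine level of granularity can still be counted by a metric that checks only the parent issue.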
At the top level, MQM issues are grouped into eight major branches or “dimensions”: Accuracy (accuracy), Fluency (fluency), Terminology (terminology), Locale convention (locale-convention), Style (style), Verity (verity), Design (design), and Internationalization (internationalization). It also contains Other (other), used for issues that cannot be assigned elsewhere, and Compatibility (Deprecated) (compatibility), a branch that contains deprecated issues that are retained for compatibility with legacy systems, notably the LISA QA Model. These eight dimensions represent the top level in the MQM hierarchy and themselves may serve as issue types in cases where no further granularity is needed.
These dimensions may be graphically represented as follows (dimensions in bold are in the “MQM Core”):
Issue type names are not case-sensitive (i.e., “Mistranslation”, “MISTRANSLATION”, “mistranslation”, and “MiStRaNsLaTiOn” are all equivalent). The ID values, however, are case sensitive (and are always lower-case) and do not contain spaces. As a result, implementers should ensure that they do not confuse the two, even though in most cases they are nearly identical.
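One way to keep names and IDs apart in an implementation is sketched below. The name/ID pairs are a small subset taken from this document; the helper names `id_for_name` and `is_valid_id` are hypothetical.

```python
# Illustrative subset of (name, ID) pairs taken from this document.
ISSUE_IDS = {
    "Accuracy": "accuracy",
    "Mistranslation": "mistranslation",
    "Locale convention": "locale-convention",
    "Inconsistent with termbase": "termbase",
}

# Names are matched case-insensitively, so index them by casefolded form.
_BY_NAME = {name.casefold(): issue_id for name, issue_id in ISSUE_IDS.items()}
_VALID_IDS = set(ISSUE_IDS.values())


def id_for_name(name: str) -> str:
    """Resolve an issue type name (in any casing) to its ID."""
    return _BY_NAME[name.casefold()]


def is_valid_id(issue_id: str) -> bool:
    """IDs are case-sensitive: only the exact lower-case form is accepted."""
    return issue_id in _VALID_IDS


print(id_for_name("MiStRaNsLaTiOn"))   # mistranslation
print(is_valid_id("Mistranslation"))   # False
```

The last pair in the table shows why the two should not be confused: the name “Inconsistent with termbase” and the ID termbase are not simple case variants of each other.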
1.1. Hierarchical list of issue types
The following list of issue types presents the full list of MQM categories in hierarchy. Clicking on any issue type name in the lists will take the reader to the definition of the issue type in the next section. The list is separated into sections by dimension. For those dimensions where current sub-issues are defined, a “mind map” shows the hierarchy of the dimension. Clicking on the embedded image will open an SVG version in a new window. Issues in bold in the graphics or the list are part of the “MQM core”. The graphical versions include both the issue name, which might be localized or displayed in some other fashion in an application, and the ID, which MUST remain as given to provide an unambiguous reference to a particular issue type.
1.1.1. Accuracy
Accuracy issues address the relationship of the target text to the source text and can be assessed only by considering this relationship. Changes in intended meaning, addition and omission of content, and similar issues fall under this dimension.
The Compatibility dimension includes issues taken from legacy metrics that are not considered appropriate for general use in MQM (because they are related to areas not covered by MQM, such as deadlines, software functionality, or physical production). They are included only for compatibility with these older metrics and should not be used for new MQM metrics.
Fluency includes those issues about the linguistic “well-formedness” of the text that can be assessed without regard to whether the text is a translation or not. Most Fluency issues apply equally to source and target texts.
Internationalization covers areas related to the preparation of the source content for subsequent translation or localization. Internationalization issues may be detected through problems found in the target (particularly from those included in Locale convention (locale-convention)), but an Internationalization audit is generally conducted separately from a general assessment of translation quality.
Issues in Locale convention relate to the formal compliance of content with locale-specific conventions, such as use of proper number formats. If content is otherwise correctly translated and fluent but violates specific locale expectations (as defined in the translation specifications), it is addressed in this dimension. This dimension does not cover issues related to whether the content itself is appropriate for the locale (these issues are covered under Verity (verity)).
Style issues relate to what is commonly known as “Style”, defined both formally (in style guides) and informally (e.g., a “light style” or an “engaging style”). These issues are closely related to fluency, but are often treated separately by tools and quality processes and so are grouped as a separate dimension in MQM.
Terminology issues relate to the use of domain- or organization-specific terminology (i.e., the use of words to relate to specific concepts not considered part of general language). Adherence to specified terminology is widely considered an issue of central concern in both translation and content authoring. Issues in this branch should not be used for general language mistranslation (e.g., translations that would not be considered correct under reasonable circumstances), and should be reserved for issues related to terminology (e.g., a translation is reasonable but incorrect in the context of specific technical domain or for a particular organization).
Verity issues relate to the suitability of content for the target locale and audience. They do not relate to fluency or accuracy since content may be fluently written and accurately translated and still be inappropriate for the target locale or audience. For example, if a text translated for Germans in Germany refers to options available only in the UK, these portions will likely be problematic. For more details on Verity, see the discussion below.
As Verity (verity) is a relatively new concept in translation quality assessment, some examples may help clarify its usage and how it differs from general Accuracy (accuracy) or Fluency (fluency). Note that issues in this branch may be checked in separate processes relating to market compliance before translation begins or may be subject to discussion and negotiation between the requester and the provider.
The examples are:
A computer advertisement written in the U.S. directly compares a computer’s features with those of a competitor to conclude that the advertiser’s computer is better than its competitor’s. While this sort of advertisement is perfectly acceptable in the U.S., it is illegal under German law. As a result an accurate and fluent translation of the advertisement would subject the maker to potential legal penalties. The appropriate German version would therefore highlight strengths of the computer in absolute terms without reference to a competitor. Note: The necessary changes in such a case are often included under the rubric of transcreation. This issue is classified under Legal requirements (legal-requirements).
A user guide for a wireless Internet base station is translated, but includes unnecessary legal notifications from the source text and omits needed notifications for the target locale. This issue is classified under Legal requirements (legal-requirements).
A Hungarian-language novel being translated into Quebec French includes references to Hungarian popular culture and music. When translated in a way that would be appropriate for a French speaker living in Hungary who is familiar with Hungarian culture, these references are unintelligible to the target audience in Quebec. As a result the text includes Verity errors and the appropriate translation for the specifications involves substituting in appropriate Quebecois pop culture and musical references. These issues can be categorized under Culture-specific reference (culture-specific).
A text is written in a conceptually complex fashion that makes it inaccessible for its intended audience of high-school students (even though the overall register is appropriate). The suitability does not depend on the locale or whether or not the text is translated. It can be categorized as End-user suitability (end-user-suitability).
An automobile service manual for a British automobile assumes a right-hand drive automobile and includes descriptions of parts for that drive system. When the manual is used in the United States, these references are no longer appropriate to the automobile and must be modified to reflect the use of left-hand drive systems in the U.S., including swapping left and right in physical descriptions of some parts. These issues can be categorized as Locale-specific content (locale-specific-content).
A document translated from German to U.S. Spanish includes references to European type electrical plugs (which would be appropriate for the product when sold in Spain). These references must be changed to match the physical characteristics of the product as sold in the U.S. This would be classified under Locale-specific content (locale-specific-content).
A service manual omits needed steps from the description of a technical process, making it impossible to complete the process as written. (Note: this is an example of a monolingual Verity problem that exists independently of whether the text is translated or not.) This issue is categorized as Incomplete procedure (incomplete-procedure).
A text written for an audience of college-educated agriculture managers is translated for use in another country where the local managers are trained on the job and cannot be assumed to have the same education. As a result, the text contains references to knowledge and background that cannot be assumed in the target audience. This problem is an example of the type End-user suitability (end-user-suitability). Here the text must be changed and clarified for the intended audience.
In a fictional work, an individual sees people dressed in black and thinks of a funeral, but when translated in a covert translation (i.e., a translation that should appear as though it were written in the target locale and language) this association does not work because black is traditionally worn in weddings rather than funerals in the target locale. This would be classified as Culture-specific reference (culture-specific).
A verification form for password security requests users to provide the name of the street they grew up on. However, in some countries there are no street names, and addresses refer to “colonies” and blocks within a colony instead. As a result, the question does not make sense to the target audience. This issue would be categorized as Culture-specific reference (culture-specific).
1.1.10. Other
This dimension is used for issues which cannot be otherwise classified into a dimension of MQM. In cases where an unforeseen issue can be classified as belonging to a dimension, it should be classified in that dimension under the top level or using a custom issue type. In practice Other should be used extremely rarely.
This section lists all MQM issue types in alphabetical order, with the following information:
Name
The name is the English name for the issue type. This name may be localized in other languages or may be changed in a UI to reflect application-specific preferences. (For example, if an existing system is being converted to use MQM categories and already has an issue type called Terminology problem that corresponds to Terminology (terminology), the UI may display the existing name but refer to the ID value terminology internally for mapping purposes. For new English-language implementations, however, it is recommended to use the existing name to prevent confusion.)
ID
An XML identifier for the category. This ID is used to refer unambiguously to each issue type and does not change, even if a UI may display other names for the category.
Definition
A definition of the issue type
MQM Core?
(yes|no) Specifies whether the issue is in the MQM core (see the definition of the MQM Core) or not.
Automatable?
(yes|no) Informative: Indicates whether the issue may be automatically detected. Users interested in fully automatable subsets of MQM may wish to limit themselves to issues marked with “yes”. This specification does not provide any guidance on how to check issues automatically and detection may require language-specific modules or development. Success in detecting issues depends on factors outside the scope of this specification and individual systems may be able to identify issues not identified as automatable in this specification.
Parent
The parent of the issue type in the hierarchy. Each issue can be understood as a type of its parent.
Children
A list of any children to the current issue type.
Applies to
Whether the category applies to target, source, or both
Example(s)
One or more illustrative examples of the issue type
Note(s)
Any notes on usage for the issue type.
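The fields listed above map naturally onto a record type. The following is a minimal sketch of one possible representation, not a normative schema: the class name `IssueType`, its attribute names, and the helper `automatable_subset` are hypothetical. The sample record uses the values given for Compatibility (Deprecated) later in this document, with its children omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IssueType:
    name: str                       # English name; a UI may localize or rename it
    id: str                         # unambiguous identifier; does not change
    definition: str
    mqm_core: bool                  # "MQM Core?" (yes|no)
    automatable: bool               # "Automatable?" (yes|no), informative only
    parent: Optional[str] = None    # ID of the parent; None at the top level
    children: tuple[str, ...] = ()  # IDs of any child issue types
    applies_to: str = "source and target"  # or "source" / "target"
    examples: tuple[str, ...] = ()
    notes: tuple[str, ...] = ()


def automatable_subset(issues: list[IssueType]) -> list[str]:
    """Return the IDs of the issue types marked as automatable."""
    return [issue.id for issue in issues if issue.automatable]


compatibility = IssueType(
    name="Compatibility (Deprecated)",
    id="compatibility",
    definition="Items retained for compatibility with legacy metrics.",
    mqm_core=False,
    automatable=False,
)
print(automatable_subset([compatibility]))  # []
```

Keeping `id` separate from `name` lets an application display a localized or legacy name while still mapping to the MQM ID internally.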
Accuracy
ID
accuracy
Definition
The target text does not accurately reflect the source text, allowing for any differences authorized by specifications.
A form enforces a US model of street addresses that does not apply in many target languages and does not support notation of districts or other important features of addresses in some countries.
A text reads “I cannot recommend this too highly.” (The meaning can be that the speaker cannot make a good recommendation or that it is highly recommended.)
Note(s)
This issue is distinguished from ambiguous-translation by its focus on monolingual ambiguity. In cases where the translation process has introduced ambiguity, ambiguous-translation should be used instead, if it is included in a metric. However, any ambiguity in a source text would be classified under this issue.
Ambiguous translation
ID
ambiguous-translation
Definition
An unambiguous source text is translated ambiguously
A text that means that someone is highly recommended is translated as “I cannot recommend this too highly.” (The meaning can be that the speaker cannot make a good recommendation or that it is highly recommended.)
Note(s)
This issue is distinct from ambiguity in that it is limited to issues where the translation process has introduced the ambiguity.
A text is written with many embedded clauses and an excessively wordy style. While the meaning can be understood, the text is very awkward and difficult to follow.
Note(s)
Bi-di support
ID
bi-di-support
Definition
The software cannot support bi-directional scripts, such as Arabic and Hebrew
A task-management system designed in the U.S. displays Sunday as the first day of the week, while many countries list it as the last day of the week.
A website displays all dates according to the Gregorian calendar, but the target audience in much of the Middle East prefers to use the Islamic calendar.
Note(s)
This issue does not apply to cases where dates are displayed in the wrong format, but according to the right calendar system.
A tourism text translated from Arabic into English gives a year as 1435, but it should have been converted from the Islamic calendar to the Gregorian calendar year 2014.
In Turkish the upper-case form of i is İ and the lower-case form of I is ı. As a result case-changing algorithms that are not internationalized and aware of Turkish will change the case of these characters incorrectly.
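The Turkish case mapping described above can be illustrated with a small sketch. Python's built-in `str.upper()` and `str.lower()` are not locale-aware, so the helpers below (hypothetical names, covering only the dotted and dotless i) remap the Turkish-specific letters before applying the default rules. A production system would more likely rely on a locale-aware library.

```python
def turkish_upper(text: str) -> str:
    """Upper-case text using the Turkish dotted/dotless i mapping."""
    # Map the two Turkish-specific letters first, then apply default rules.
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkish_lower(text: str) -> str:
    """Lower-case text using the Turkish dotted/dotless i mapping."""
    return text.replace("İ", "i").replace("I", "ı").lower()


# Default, non-Turkish-aware behaviour loses the distinction:
print("istanbul".upper())         # ISTANBUL
# Turkish-aware behaviour keeps it:
print(turkish_upper("istanbul"))  # İSTANBUL
print(turkish_lower("ISPARTA"))   # ısparta
```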
A database application cannot process or produce text stored in ISO Latin 6 (Nordic) encoding and so cannot interface with needed legacy systems in Norway.
While individual sentences of the text are all perfectly fluent, the text as a whole does not make sense and is inconsistent with itself.
Note(s)
Since coherence applies above the segment level, this issue type would generally be assessed with a holistic metric rather than an analytic one, although any claims that a text is not coherent should be able to point to specific portions and problems. Often these specific problems can be classified as coherence or inconsistency issues in an analytic metric.
Cohesion
ID
cohesion
Definition
Portions of the text needed to connect it into an understandable whole (e.g., reference, substitution, ellipsis, conjunction, and lexical cohesion) are missing or incorrect.
An English text is missing conjunctions and particles (e.g., “thus”, “therefore”, “but”, and “however”) needed for the logic of the text to be clear.
Note(s)
Cohesion applies at the local level to incorrect or missing elements needed for the intended meaning of the text to be clear. Cohesion problems at the local level may contribute to coherence problems for the text as a whole.
Color internationalization
ID
color-internationalization
Definition
Use of colors is fixed and not adaptable to other locales
A UK-based website uses a red, white, and blue color scheme and hard-codes these colors into graphical assets and inline styles. When translated for China, these colors are inappropriate but cannot be changed because of the way they are encoded into the site.
Note(s)
This issue type does not apply merely to the use of culture-specific colors, but rather to cases where the colors are not made accessible to the localization process and so cannot be changed.
Company-specific terminology guidelines specify that a product be called the “Acme Turbo2000™”, but the text calls it the “Acme Turbo” or the “Turbo200”.
Note(s)
Should be used when it is necessary to distinguish company-specific terminology issues from more general termbase issues.
Compatibility (Deprecated)
ID
compatibility
Definition
The Compatibility extension contains items which may be used for compatibility with legacy metrics even though they would otherwise not be included in MQM. Most of these issue types are taken from the LISA QA Model documentation.
MQM Core?
no
Automatable?
no
Parent
Children
The following issue types (presented without definition) are included in the Compatibility branch:
Application compatibility
Bill of materials/runlist
Book-building sequence
Covers
Deadline
Delivery
Does not adhere to specifications
Embedded text
File format
Functional
Output device
Printing
Release guide
Spines
Style, publishing standards
Terminology, contextually inappropriate
Applies to
source and target
Example(s)
A quality process checks the LISA QA Model issue “Book-building sequence” and it is included for compatibility with legacy processes
Note(s)
Use of these categories is not recommended and these issue types are to be considered deprecated. They are included only for compatibility with legacy processes.
Since compatibility is not a coherent category, use of this category itself is not recommended in any circumstance, although the children categories listed above may be used for compatibility purposes.
A process description leaves out key steps needed to complete the process, resulting in an incomplete description of the process.
Note(s)
completeness refers to instances in which needed content is missing in the source language. For cases where material present in the source language is not present in a translation, omission should be used instead.
Complexity
ID
complexity
Definition
Different cultures expect different levels of complexity and presentation of information in user interfaces. If the amount of information is too much or too little for a culture, users will perceive the user interface negatively
A user interface developed in Sweden has a minimalist aesthetic, but when localized for China, Chinese users expect to find information in the UI that is normally hidden under various options. As a result they may find it frustrating and difficult to use.
Note(s)
Solving this problem may involve extensive adaptation of localized versions and may not be solvable by simple internationalization steps.
Concatenation
ID
concatenation
Definition
Text is concatenated in ways that will not function properly when the text is translated
A localizable string contains the following: "You have found the ".$item.". Do you wish to pick it up?". When translated this string will cause problems because the article before the item and the equivalent of “it” will both need to be changed to reflect the content of the variable $item.
Note(s)
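The problem in the example above, and one possible way around it, can be sketched in Python. The message catalog, the locale and item keys, and the function names are hypothetical, and the German strings are illustrative only.

```python
# Problematic: the sentence is assembled from fragments, so a translator never
# sees the whole sentence and cannot adjust the article or the pronoun.
def found_message_concatenated(item: str) -> str:
    return "You have found the " + item + ". Do you wish to pick it up?"


# One possible alternative: treat each complete sentence as a translatable
# unit, so article and pronoun can agree with the item in every language.
MESSAGES = {
    "en": {
        "sword": "You have found the sword. Do you wish to pick it up?",
        "lamp": "You have found the lamp. Do you wish to pick it up?",
    },
    "de": {
        "sword": "Du hast das Schwert gefunden. Möchtest du es aufheben?",
        "lamp": "Du hast die Lampe gefunden. Möchtest du sie aufheben?",
    },
}


def found_message(locale: str, item_key: str) -> str:
    """Look up the complete, pre-translated sentence for an item."""
    return MESSAGES[locale][item_key]


print(found_message("de", "lamp"))
```

Whole-sentence units trade a larger catalog for grammatical correctness; message-formatting libraries with gender or plural selection are another common approach.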
Confusable security
ID
confusable-security
Definition
Software does not provide any security protection against easily confusable characters such as Latin-script A, Greek Α, and Cyrillic А, thus allowing users to impersonate other users’ names.
Users can select user names with any valid, non-control Unicode characters. As a result a user creates the user name Тоny (with the first two letters in Cyrillic) to impersonate an administrative user with the name Tony (all in Latin script).
Note(s)
This issue has emerged with the advent of pervasive Unicode support that allows multiple scripts to be combined in input. Solving this problem requires careful parsing of input.
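As a rough illustration of the kind of input parsing involved, the sketch below flags user names that mix scripts. It is a crude heuristic under stated assumptions: the script is approximated from the first word of each letter's Unicode character name, and the function names are hypothetical. A real implementation would more likely build on the confusables data published with Unicode Technical Standard #39.

```python
import unicodedata


def scripts_used(text: str) -> set[str]:
    """Approximate the scripts in `text` from each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # e.g. "CYRILLIC CAPITAL LETTER TE" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(text: str) -> bool:
    """True if the letters in `text` appear to come from more than one script."""
    return len(scripts_used(text)) > 1


spoofed = "\u0422\u043eny"  # Cyrillic T and o followed by Latin "ny"
print(is_mixed_script(spoofed))  # True
print(is_mixed_script("Tony"))   # False
```

Mixed-script detection alone does not catch names written entirely in one look-alike script, so it should be seen as one check among several.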
Corpus conformance
ID
corpus-conformance
Definition
The content is deemed to have a level of conformance to a reference corpus. The non-conformance type reflects the degree to which the text conforms to a reference corpus given an algorithm that combines several classes of error type to produce an aggregate rating.
A text reading “The harbour connected which to printer is busy or configared not properly” is flagged by a language analysis tool as suspect based on its lack of conformance to an existing corpus.
Note(s)
One example of this issue type might involve output from a quality estimation system that delivers a warning that a text has a very low quality estimation score.
Culture-specific graphic
ID
culture-specific-graphic
Definition
Graphics embed cultural assumptions or references and cannot be changed
An English text refers to steps in a process as “First base”, “Second base”, and “Third base”, and to successful completion as a “Home run” and uses other metaphors from baseball. These prove difficult to translate and confuse the target audience in Germany.
A marketing text in Greek includes reference to popular Greek music. When translated into English these references are not understandable to the target audience.
Note(s)
In the cases of texts that were written with the intention that they be translated, this issue may indicate a broader conceptual or Internationalization problem.
A text dealing with business transactions that is translated from English into Hindi assumes that all currencies will be expressed in simple units, while the convention in India is to give such prices in lakh rupees (100,000 rupees).
An online commerce form displays all amounts in euros, but customers use the form in countries that use other currencies.
Note(s)
Corresponds to currency-format in locale-convention. This is used to mark engineering problems in the source content, not specific problems in the target.
Date format
ID
date-format
Definition
A text uses a date format inappropriate for its locale.
Design issues may exist either in documents in isolation (e.g., a second-level heading is formatted as a first-level heading) or in the relationship between source and target (e.g., headings are formatted differently between source and target).
String references are embedded in computer code rather than externalized to resource files. As a result the string content is accessible only by manipulating the source code of the application.
Error messages for a product are stored as variables directly in the source code of a software product and are therefore not localized when UI strings are sent for translation.
Note(s)
Embedded string in graphic
ID
embedded-string-in-graphic
Definition
A graphic contains embedded text as an image that cannot be edited
A service manual contains an image of a mechanical system with part labels stored in the “flattened” graphic. As a result the localization process cannot produce localized versions of the graphic.
Note(s)
Solving this problem generally requires access to the original application files used to produce the graphics rather than to rendered downstream versions used for web display, display in software UIs, or embedding in word-processing applications.
Embedded string
ID
embedded-string
Definition
Textual content is embedded in other elements in ways that make it inaccessible during the localization process.
A text describes a process to repair a device, but following the instructions leads to serious damage to the device and potential injury.
A text assumes that the reader has knowledge of advanced particle physics, but the target audience does not generally have this knowledge.
Note(s)
If the issue relates to the applicability of the content to users in a particular locale, locale-specific-content should be used instead.
End-user suitability generally applies to issues present in the source text, regardless of the target locale, but may apply in cases where there are distinct differences in audience or purpose between source and target.
Entity (such as name or place)
ID
entity
Definition
Names, places, or other “named entities” do not match
The sort routine in a spreadsheet does a simple sort by Unicode code-point sequence, and does not support needed collation sequences for various markets.
Note(s)
Specific results of this problem will often be classified under sorting.
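The effect of a plain code-point sort can be seen in a short Python sketch. The second key function is only a rough approximation for illustration (it strips accents and ignores case) and is not real locale-specific collation, which typically requires CLDR/ICU collation data. The word list and the name `approximate_key` are hypothetical.

```python
import unicodedata

words = ["Zebra", "apple", "Äpfel"]

# Sorting by code point puts upper-case before lower-case and accented
# letters after all unaccented ones.
print(sorted(words))  # ['Zebra', 'apple', 'Äpfel']


def approximate_key(text: str) -> str:
    """Crude sort key: decompose, drop combining marks, and casefold."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


print(sorted(words, key=approximate_key))  # ['Äpfel', 'apple', 'Zebra']
```

Different markets expect different orders for the same strings, which is why a single fixed key cannot replace proper collation support.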
Fixed dialog/UI size
ID
fixed-dialog-ui-size
Definition
Dialog boxes or other UI components are fixed in size and cannot adapt to different amounts of content in other languages.
Arabic text entered into a system does not display appropriate contextual variations (ligatures) and instead uses only medial character forms, rendering the result unreadable.
Note(s)
Font-rendering problems are extremely common in software that has not previously been adapted to support “complex” scripts.
Font, single/double-width (CJK only)
ID
single-double-width
Definition
Single-width characters are used when double-width are intended, or vice versa.
Warning texts are set in sans-serif, but one of them appears in a serif font.
A portion of Japanese text is set with an obliqued face (corresponding to italics in the source text) when dot accents should have been used with a non-oblique face.
Note(s)
Footnote/endnote format
ID
footnote-format
Definition
Footnotes or endnotes are placed inappropriately or use incorrect in-text symbols
An English source text uses a normal-weight serif font for body text but the Japanese translation uses a heavy-weight “gothic” (roughly, sans-serif) font appropriate for headlines only.
Note(s)
While this issue may apply to both source and target, it is most likely to apply to the target.
Grammar checker
ID
grammar-checker
Definition
A needed grammar checker is missing or does not support the required language.
A piece of software being localized for the Swedish market contains images of products available only in the U.S. market.
Note(s)
This issue and its children apply only to cases where cultural aspects of graphics are not accessible to the localization process and so cannot be adapted.
Graphics and tables
ID
graphics-tables
Definition
Issues related to the formatting of graphics and tables.
A website uses inline base64-encoded representations of some graphics to speed up load times for the page, but these graphics are thus not accessible for localization.
Note(s)
Hard-coded keyboard command
ID
hard-coded-keyboard-command
Definition
Keyboard shortcuts or other commands are hard-coded into the system and do not function when alternative keyboards are selected
A vital keyboard command in the English version involves pressing the A key, but when the program is run using a Russian keyboard layout, it is unusable because the program is waiting for U+0041 (A) instead of U+0410 (А).
A word processor has been set to use German hyphenation for a Hungarian text. As a result the word mennyi is hyphenated as men-nyi instead of the correct meny-nyi.
Note(s)
Hyphenator
ID
hyphenator
Definition
A hyphenation engine does not support a needed language
A screen shot shows a button with the text “Open other…” but the text referring to the screen shot tells the user to click on the “Open alternative…” button.
Note(s)
Improper exact TM match
ID
improper-exact-tm-match
Definition
A translation is provided as an exact match from a translation memory (TM) system, but is actually incorrect.
An HTML file contains numerous links to other HTML files; some have been updated to reflect the appropriate language version while some point to the source language version.
Note(s)
Inconsistent markup
ID
inconsistent-markup
Definition
Markup elements are inconsistent between the source and target
One part of a text is written in a light and “terse” style while other sections are written in a more wordy style.
Note(s)
Inconsistent style often emerges when multiple translators have worked on a single text. Because Inconsistent style applies to larger portions of texts, it would generally be assessed with a holistic metric rather than an analytic one.
Inconsistent use of terminology
ID
term-inconsistency
Definition
Terminology is used in an inconsistent manner within the text.
The text refers to a component as the “brake release lever”, “brake disengagement lever”, “manual brake release”, and “manual disengagement release”.
Note(s)
This issue and its children are used only to address inconsistent use of terminology. In cases where terminology is incorrect for the domain or termbase, termbase or domain-terminology should be used instead. If further detail is needed about whether the source or target text is responsible for the inconsistent use of terminology, use one of the daughter issues.
Inconsistent with domain
ID
domain-terminology
Definition
A term is used contrary to general domain expectations
A financial text is translated using “deduct” instead of “debit”. Although conceptually these could be synonyms in general language, “deduct” violates domain conventions.
Note(s)
This issue is used for cases where no termbase is specified yet common domain conventions about terminology use are violated. If a termbase was specified and the term in question violates it, termbase should be used instead, if it is included in the metric (otherwise terminology would be used).
Inconsistent with external reference
ID
external-inconsistency
Definition
The text is inconsistent with a specified external reference
Translation specifications state that quotes in a text must match the 1957 edition of a book, but the translator used the 1943 edition, which was substantially different.
Note(s)
For inconsistent terminology, options in the Terminology branch should be used instead.
Inconsistent with termbase
ID
termbase
Definition
A term is used inconsistently with a specified termbase
A termbase specifies that the term USB memory stick should be used, but the text uses USB flash drive.
Note(s)
For obvious reasons, this issue type applies only in cases where a term is specified in a termbase that was specified for use. If general domain conventions for terminology are violated instead, then domain-terminology should be used instead, if it is included in a metric.
A form validates names against a regular expression, [A-Za-z']+, but fails when a Japanese user enters a name that includes characters other than standard Roman characters.
Note(s)
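The validation failure in the example above can be reproduced with a short Python sketch. `ASCII_NAME` is the pattern quoted in the example; `UNICODE_NAME` is a hypothetical, more permissive alternative shown only for contrast, and the sample names are made up for illustration.

```python
import re

# The pattern from the example: ASCII letters and apostrophes only.
ASCII_NAME = re.compile(r"[A-Za-z']+")

# Hypothetical alternative: any Unicode letter, plus apostrophe, hyphen, space.
# [^\W\d_] means "a word character that is neither a digit nor an underscore".
UNICODE_NAME = re.compile(r"(?:[^\W\d_]|['\- ])+")


def accepts(pattern: re.Pattern, name: str) -> bool:
    """Return True if the whole name matches the given pattern."""
    return pattern.fullmatch(name) is not None


for name in ("O'Brien", "Römer", "山田"):
    print(name, accepts(ASCII_NAME, name), accepts(UNICODE_NAME, name))
```

Even the permissive pattern is an assumption about what a name may contain; for instance, it rejects names typed with combining marks unless the input is normalized first.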
Internationalization
ID
internationalization
Definition
There is a problem related to the internationalization of content.
A document assumes that all addresses use postal codes conforming to the U.S. “zip+four” convention and includes a verification step for postal codes that does not allow for non-U.S. codes.
A computer program is localized but some content remains untranslated because it was embedded in the program code and not made accessible to the translator.
Note(s)
While internationalization errors are generally detected in the target content, they refer to problems in the source that cause problems with translated/localized content. Even in cases where internationalization is not being specifically checked, if problems related to internationalization are encountered, they should generally be reported to the content creators.
As of August 2014, the intention is to expand this branch in the future with more specific issue types.
A computer program provides support only for American English keyboards and so does not work properly in Icelandic since users cannot enter text with Icelandic-specific characters.
Note(s)
Language-dependent logic
ID
language-dependent-logic
Definition
Content includes language- or locale-dependent logical assumptions that prevent it from being appropriately localized
A technical text uses a “deductive” reasoning style that cannot be easily adapted to areas expecting an “inductive” reasoning style.
Note(s)
This issue type is common when going between European and Asian markets. In some cases texts that are perfectly clear in one market may be difficult to follow in another due to culture-specific differences in logic.
Language-specific tool support
ID
language-specific-tool-support
Definition
Needed tools that specifically support required languages are missing or do not function as expected
Specifications stated that FCC regulatory notices be replaced by CE notices rather than translated, but they were translated instead, rendering the text legally problematic for use in Europe.
Note(s)
Generally used in cases where the translation does not meet requirements. Cases in which the source text does not meet legal requirements are generally critical errors that will require rewriting the source text.
Length
ID
length
Definition
There is a significant discrepancy between the source and the target text lengths.
A portion of the text displays a (non-systematic) formatting problem (e.g., a single heading is formatted incorrectly, even though other headings appear properly).
Note(s)
Locale convention
ID
locale-convention
Definition
The text does not adhere to locale-specific mechanical conventions and violates requirements for the presentation of content in the target locale.
An incorrect format for currency is used for a German text, with a comma (,) instead of a period (.) as a thousands separator.
A text translated into Japanese uses Western quote marks to indicate titles rather than the appropriate Japanese quote marks (「 and 」). (Note: this example would be categorized as quote-mark-type if the metric includes it.)
Note(s)
This issue type is distinguished from locale-specific-content in that this category refers only to whether the text is given the proper mechanical form for the locale, not whether the content applies to the locale or not. If text conforms to conventions for the locale, but does not apply to the target locale, locale-specific-content should be used instead.
Locale-specific content
ID
locale-specific-content
Definition
Content specific to the source locale does not apply to the intended target locale, audience, or purpose.
An advertising text translated for Sweden refers to special offers available only in Germany and therefore is misleading.
A manual for a printer sold in Spain describes features that apply only to versions of the printer sold in Japan and thus may confuse purchasers.
Note(s)
This issue type is distinguished from locale-convention in that this category applies to cases where text corresponds to the conventions of the target locale, but does not apply to the intended audience in the target locale. For example, if the Swedish advertising text mentioned above is properly translated and follows all mechanical locale conventions (e.g., using Swedish kronor instead of euros) but the offer does not apply to Sweden, locale-specific-content should be chosen. If, however, the text applies to the locale, but does not follow locale conventions (e.g., numbers are formatted incorrectly for the locale), locale-convention should be used instead.
Locale-specific punctuation
ID
locale-specific-punctuation
Definition
The text systematically uses punctuation not appropriate for the specified locale
A text translated from English to Japanese maintains European-style punctuation, such as full stops (.), instead of using the appropriate Japanese punctuation, such as the Japanese full stop (。).
Note(s)
Localization support
ID
localization-support
Definition
Aspects of how a software product presents locale-sensitive data are not properly internationalized
An online submission form to register for appointments with a product demonstrator does not allow data to be submitted for many countries because it validates data against a US-centric model.
Note(s)
See the subtypes for specific examples. Note that this issue type and its children apply to internationalization problems in the source, not to specific instances in a target language, although they may be discovered as problems classified under locale-convention in specific target languages. Although most of the examples of child nodes use specific instances where problems appear, they all refer to engineering problems of the source content.
An engineering software system developed in France supports only the metric system and when localized for use in the United States does not support U.S. measurement formats, rendering it unusable when users print bill of parts sheets to order components from U.S. suppliers.
Note(s)
Corresponds to measurement-format in locale-convention. This is used to mark engineering problems in the source content, not specific problems in the target.
A French-language website has not been fully localized. When the user clicks a link to one of these pages he or she should be taken to the English-language source page, but instead is taken to a blank page with no content.
Note(s)
It is common practice to allow software or websites to fall back on another language if some content is missing. For example, a partially-localized German website might display some content in English for pages that have not yet been localized.
A translation is complete, but during DTP a text box was inadvertently moved off the page and so the translated text does not appear in a rendered PDF version.
Note(s)
This issue does not refer to omitted text (i.e., text that was present in the source but not present in the translation). Instead it refers only to cases where text is present in some form but does not appear in the laid-out version. It also does not refer to text that has been truncated due to text expansion.
A chapter heading is not listed in a Table of Contents.
Note(s)
Mistranslation of technical relationship
ID
technical-relationship
Definition
Content describing the relationship(s) within a technical description is translated inaccurately with respect to technical knowledge (even if the translation otherwise appears plausible).
A physics text describes the interaction of subatomic particles in a medical scanning device. The translation seems plausible, but incorrectly conveys the relationship of two particles and is therefore incorrect.
A source text describing a piano action (the mechanism connecting a piano key to the hammer that strikes a string) is translated in a way that incorrectly conveys the relationship between two components.
Note(s)
This issue is not used for incorrect use of individual terms, which would be classified in terminology or one of its children. Rather, it is used for cases where a translation might appear to be correct but where it ends up misconveying information about a technical subject.
Instances of this issue may point to confusing source materials or to lack of translator experience in a specialized domain.
Mistranslation
ID
mistranslation
Definition
The target content does not accurately represent the source content.
A source text states that a medicine should not be administered in doses greater than 200 mg, but the translation states that it should be administered in doses greater than 200 mg (i.e., negation has been omitted).
Note(s)
Multiple terms for concept in source
ID
multiple-terms-for-concept
Definition
A single concept in the source text is expressed with multiple terms for the same concept.
A German source text uses one term for a component of a vehicle, but the target text uses “brake release lever”, “brake disengagement lever”, “manual brake release”, and “manual disengagement release” for this term in English.
Note(s)
Applies to target text only since it refers to cases where one term has multiple translations. As with term-inconsistency, termbase or one of its children should be used instead if a termbase contains a specified term for a concept and the text does not use that particular term.
Name format
ID
name-format
Definition
A text uses a name format inappropriate for its locale.
A text translated from Hungarian to English presents names with the family name first when the name order should instead be inverted to have the family name last.
A web form translated for Indonesia requires users to provide a “last name” even though many Indonesians have only a single name.
A translated text refers to “Pedro Diego Estavez” as “Mr. Estavez” rather than “Mr. Diego”.
An online registration form asks for “last name” and “first name”, resulting in confusion for users where family names are listed first (e.g., China, Japan, and Hungary) or where users have multiple family names (e.g., Spain, Portugal, Brazil) or only one name (e.g., Indonesia).
Note(s)
Corresponds to name-format in locale-convention. This is used to mark engineering problems in the source content, not specific problems in the target.
A UI has hard-coded positions in a form and when an address box is expanded to three lines to support addresses in certain locales it then overlaps other UI elements, making them unreadable.
Note(s)
Many UI frameworks automatically support dynamic adjustment of the position of UI elements. Home-built UIs are particularly prone to this problem.
Non-reversible UI
ID
non-reversible-ui
Definition
A UI cannot be reversed to support bi-directional languages
A web-form is left-aligned with multiple items per line, but when translated to Arabic the items appear in the wrong order because the UI cannot automatically adjust their layout.
Note(s)
Many UI frameworks automatically support UI adjustment for bi-directional layouts. Home-built UIs are particularly prone to this problem.
Nonallowed characters
ID
nonallowed-characters
Definition
The text includes characters that are not allowed.
A text may not include colons or forward- or back-slashes, which might cause confusion with path names on some computer systems, but it contains these characters.
Note(s)
Number format
ID
number-format
Definition
A text uses a number format inappropriate for its locale.
An online system displays all numbers with commas to delimit thousands and a full stop (.) to indicate the decimal position. This format is confusing in many locales that use other delimiters or delimit texts using hundreds separators instead of thousands separators.
Note(s)
Corresponds to number-format in locale-convention. This is used to mark engineering problems in the source content, not specific problems in the target.
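As a hedged illustration of what separating number presentation from program logic might look like, the Python sketch below formats the same value with different separators. The `SEPARATORS` table and the function `format_number` are hypothetical; a production system would normally take these conventions from a locale data library rather than a hand-written table. The sketch handles grouping by thousands only, so it does not cover locales that group digits differently.

```python
# Hypothetical per-locale separators: (grouping separator, decimal separator).
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
}


def format_number(value: float, locale_tag: str, decimals: int = 2) -> str:
    """Format value with the grouping and decimal separators of locale_tag."""
    group_sep, decimal_sep = SEPARATORS[locale_tag]
    us_style = f"{value:,.{decimals}f}"           # e.g. 1,234,567.89
    integer_part, _, fraction = us_style.partition(".")
    grouped = integer_part.replace(",", group_sep)
    return grouped + (decimal_sep + fraction if fraction else "")


for tag in SEPARATORS:
    print(tag, format_number(1234567.89, tag))
```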
Number
ID
number
Definition
Numbers are inconsistent between source and target.
A text contains words generally considered to be profanities outside of a context where they would be allowed
Images in a document depict nudity for a culture where nudity is considered offensive
An American text uses the “OK” symbol (👌) to indicate approval, but this symbol is considered offensive in Brazil.
Note(s)
If offensive is to be used, clear guidelines should be given since content offensive in one context may be acceptable in another.
In many cases offensive content may be detected in a (semi)automatic fashion through the use of lists of unacceptable phrases, often in conjunction with terminology checkers. However, automatic checkers will not be able to identify all potentially offensive content, especially as content considered unobjectionable in one context or culture may be considered highly offensive in another.
Omission
ID
omission
Definition
Content is missing from the translation that is present in the source.
A translated text should read “Number of lives remaining: $lifeNumber” but is rendered as “Number of lives remaining:”, with the variable $lifeNumber omitted
Note(s)
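Omitted variables of the kind shown in the example above lend themselves to automatic checking. The following Python sketch is one possible approach, assuming placeholders follow a `$name` convention as in the example; the pattern and the function name `missing_placeholders` are illustrative assumptions, not part of MQM.

```python
import re

# Assumed placeholder syntax: a dollar sign followed by an identifier.
PLACEHOLDER = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")


def missing_placeholders(expected: str, rendered: str) -> set:
    """Return placeholders present in the expected text but absent from the rendered text."""
    return set(PLACEHOLDER.findall(expected)) - set(PLACEHOLDER.findall(rendered))


expected = "Number of lives remaining: $lifeNumber"
rendered = "Number of lives remaining:"

print(missing_placeholders(expected, rendered))
```

A check like this only detects missing variables; omitted words or sentences still require comparison by a reviewer.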
Other
ID
other
Definition
Used for any issues not adequately covered by the MQM core or extensions. This category should be used only if it is impossible to assign an issue to an existing category with sufficient granularity.
MQM Core?
no
Automatable?
no
Parent
Children
none
Applies to
source and target
Example(s)
A quality process checks for errors produced by speech-to-text processing during conference interpretation. Because this error type is highly specific to the situation, it is not included in any predefined issue type elsewhere.
Note(s)
This category should be used only for issue types that cannot be mapped to one of the issue types listed above. If an issue type can be considered a more granular example of an existing type, it should be categorized as that type, possibly with a custom extension if the additional granularity is needed.
Over-translation
ID
over-translation
Definition
The target text is more specific than the source text
The source text refers to a “boy” but is translated with a word that applies only to young boys rather than the more general term
Note(s)
In some cases differences in concept structure between languages may render an apparent over-translation necessary. In such cases this issue should not be considered an error, although the issue may be noted for further consideration.
Overall design (layout)
ID
overall-design
Definition
Issues related to overall layout and design (versus local formatting)
A Hungarian text contains the phrase Tele van a hocipőd?, which has been translated as “Are your snow boots full?” rather than with the idiomatic meaning of “Feeling overwhelmed?”.
The regular expression ["'”’][,\.;] (i.e., a quote mark followed by a comma, full stop, or semicolon) is defined as not allowed for a project but a text contains the string ”, (closing quote followed by a comma).
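A check for the disallowed pattern in this example might be sketched in Python as follows. The regular expression is the one quoted above; the function name `find_violations` and the sample sentences are hypothetical.

```python
import re

# The project-defined pattern: a quote mark followed by a comma, full stop, or semicolon.
DISALLOWED = re.compile(r"""["'”’][,.;]""")


def find_violations(text: str) -> list:
    """Return every substring of text that matches the disallowed pattern."""
    return DISALLOWED.findall(text)


print(find_violations("She said “yes”, and left."))   # closing quote followed by a comma
print(find_violations("She said “yes,” and left."))   # comma inside the quote
```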
A translated online form validates all postal codes as consisting of exactly five numbers but the target locale uses a combination of six letters and numbers.
A formal letter uses contractions, colloquialisms, and expressions characteristic of spoken rather than written language, and thus comes across as less serious than intended.
Note(s)
Register involves a number of factors, including appropriateness of the discourse for the specific subject field, the level of formality, and the mode of discourse (e.g., written text versus transcribed speech).
The notion of register used in this document is derived from Systemic Functional Linguistics.
For uses of the improper grammatical register that do not otherwise impact style, such as German du vs. Sie, use grammatical-register instead.
Resource externalization
ID
resource-externalization
Definition
Translatable resources have not properly been externalized from functional code.
A legal notice in German uses the informal du instead of the formal Sie.
Note(s)
Standard practice in writing international code is to put all translatable resources into external resources (such as files containing UI strings). Failure to do so is a major cause of problems or failure in software localization tasks.
Sequence
ID
sequence
Definition
Sequences in graphics or text appear in a culture-specific order that does not make sense in other locales.
A graphic presents an (implicit) left-to-right ordering of events, but users in the Middle East may follow the steps in reverse order because they expect right-to-left ordering.
Note(s)
Shortcut key internationalization
ID
shortcut-key-internationalization
Definition
Software shortcut keys are set to combinations that do not make sense in all locales and cannot be changed
CTRL-S is used for saving files and cannot be changed, but some locales customarily use other keyboard shortcuts to save files.
Note(s)
Corresponds to shortcut-key in locale-convention. This is used to mark engineering problems in the source content, not specific problems in the target.
Shortcut key
ID
shortcut-key
Definition
A translated software product uses shortcuts that do not conform to locale expectations or that make no sense for the locale
The spell-checking engine used in a presentation tool localized for Korean does not include rules for the Korean language and cannot be used to spell-check Korean text.
A German matching algorithm should recognize that the names Roemer and Römer are the same name (oe and ö are alternative spelling for the same sound) but does not, thus returning only some of the appropriate matches to a query.
Note(s)
This issue may also extend to Unicode characters using different normalization forms if a matching algorithm does not consider canonical equivalence. This issue is closely related to text-indexing.
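The point about normalization forms can be shown with a small Python sketch using the standard `unicodedata` module. The two strings below render identically but differ at the code point level; the helper name `canonically_equal` is hypothetical.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "R\u00F6mer"   # ö as a single code point
decomposed = "Ro\u0308mer"   # o followed by a combining diaeresis

print(precomposed == decomposed)                   # naive comparison
print(canonically_equal(precomposed, decomposed))  # normalization-aware comparison
```

Normalization covers canonical equivalence only. Treating Roemer and Römer as the same name is a language-specific rule that has to be added separately.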
The translation of a light-hearted and humorous advertising campaign is in a serious and “heavy” style even though specifications said it should match the style of the source text.
A German text presents a telephone number in the format (xxx) xxx – xxxx instead of the expected 0xx followed by a group of digits separated into groups by spaces.
The format of telephone numbers is set in forms, databases, or other functional aspects of software and therefore cannot support telephone numbers that do not match this format.
A contact database does not store country codes and fixes all telephone numbers at 10 digits, rejecting any shorter telephone numbers. As a result, it cannot be used outside of a handful of countries that have phone numbers matching these requirements.
Note(s)
Corresponds to telephone-format in locale-convention. This is used to mark engineering problems in the source content, not specific problems in the target.
Tense/mood/aspect
ID
tense-mood-aspect
Definition
A verbal form displays the wrong tense, mood, or aspect
A French text translates English e-mail as e-mail but terminology guidelines mandated that courriel be used.
The English musicological term dog is translated (literally) into German as Hund instead of as Schnarre, as specified in a terminology database.
Note(s)
All issues specifically related to use of domain- or organization-specific terminology are included in this issue and its children. Do not use this issue if a text is simply mistranslated, i.e., if the translation would be a valid translation of the source but simply does not use the particular mandated terminology. For example, if a text translates [river] bank into Spanish as banco (a financial institution) instead of orilla (a river bank), this would be a mistranslation because banco would never be a valid term for the concept of a river bank. However, if a termbase specified that orilla should be used and the translation uses ribera instead, this would be a Terminology error because ribera is a valid term for the concept, but not the specified one.
When users enter text with accented vowels using UTF-8 encoding, these are systematically converted to other characters due to an internal text-processing routine that assumes ISO Latin-1 encoding.
Note(s)
Problems with text corruption often emerge when different systems interact with each other without considering the encoding emitted or expected by other systems.
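The corruption described in this example can be reproduced in a few lines of Python. The sketch assumes the simplest case, in which UTF-8 bytes are decoded as ISO Latin-1 exactly once; the sample word is made up for illustration.

```python
original = "café"

# What the user's system emits: UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# What a routine assuming ISO Latin-1 makes of those bytes.
garbled = utf8_bytes.decode("latin-1")

# If nothing else has altered the text, the damage can be reversed.
repaired = garbled.encode("latin-1").decode("utf-8")

print(original, garbled, repaired)
```

The repair step works only because Latin-1 maps every byte to a character; with other mismatched encodings, information may be lost for good.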
Text expandability
ID
text-expandability
Definition
Insufficient room is left to allow for text expansion
Translation specifications state that all localized versions of a service manual must preserve the same pagination as the English source, but no extra room has been left for languages that take more physical space than the source text, such as German (which may be 30% longer than the English source).
Note(s)
This issue corresponds to truncation-text-expansion in the Design dimension. This issue is used to identify instances in the source where insufficient room has been left in a document or other item containing text while truncation-text-expansion is used for specific cases where text has extended beyond the allowed bounds.
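A rough pre-translation check for expandability might look like the Python sketch below. It assumes space can be approximated by character counts, which ignores fonts and layout, and it takes the 30% figure for German from the example above. The function names are hypothetical.

```python
def required_capacity(source_chars: int, expansion_percent: int) -> int:
    """Characters needed after expansion, rounded up (integer arithmetic only)."""
    return -(-source_chars * (100 + expansion_percent) // 100)


def leaves_room(source_chars: int, available_chars: int, expansion_percent: int = 30) -> bool:
    """Return True if the available space can hold the expanded text."""
    return required_capacity(source_chars, expansion_percent) <= available_chars


# A 200-character English string in a field sized for 220 characters.
print(required_capacity(200, 30), leaves_room(200, 220))
```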
Text indexing
ID
text-indexing
Definition
When text is indexed for retrieval and processing, the indexing does not account for language-specific requirements.
The German name Römer should be indexed for retrieval as “Römer”, “Roemer”, and “Romer”, but the indexing engine uses only the first. As a result, users looking for this name in a database will not find it if they use one of the alternative forms.
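One way to produce the three index forms named in this example is sketched below in Python. The expansion table and the function names are assumptions made for illustration; the rules shown are specific to German and would differ for other languages.

```python
import unicodedata

# Assumed German-specific expansions of umlauts and eszett.
GERMAN_EXPANSIONS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def expand_umlauts(text: str) -> str:
    """Replace each umlaut with its two-letter alternative spelling."""
    return "".join(GERMAN_EXPANSIONS.get(ch, ch) for ch in text)


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_keys(name: str) -> set:
    """Return every form under which the name should be retrievable."""
    nfc = unicodedata.normalize("NFC", name)
    return {nfc, expand_umlauts(nfc), strip_diacritics(nfc)}


print(sorted(index_keys("Römer")))
```

Storing all keys for each entry lets a lookup succeed whichever spelling the user types.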
Specifications stated that English text was to be formatted according to the Chicago Manual of Style, but the text delivered followed the American Psychological Association style guide.
Note(s)
Third-party termbase
ID
terminology-third-party
Definition
The text violates terminology guidelines as specified in a termbase from a third-party.
Specifications for translation of a software application specify that UI terms be translated according to the public termbases provided by the developers of the platforms upon which it will be deployed, but certain terms are not translated consistently with these specifications.
Note(s)
Should be used only when it is necessary to distinguish terminology issues related to third-party termbases from more general termbase issues.
Time format
ID
time-format
Definition
A text uses a time format inappropriate for its locale.
A shared calendar system does not consider timezones and sends out all notifications based on the time on the server’s clock. As a result it does not send out reminders for meetings at the appropriate time.
Note(s)
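The calendar problem in the example above comes from treating the server's clock as the only time that exists. The Python sketch below shows one possible remedy: store meeting times with an explicit offset and compute reminders as UTC instants. The fixed offsets stand in for real time zones and ignore daylight saving time; the zone constants, the function `reminder_instant`, and the meeting itself are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used purely for illustration (no daylight saving rules).
ATTENDEE_ZONE = timezone(timedelta(hours=9))
SERVER_ZONE = timezone(timedelta(hours=-5))


def reminder_instant(meeting_start: datetime, lead: timedelta) -> datetime:
    """Return the moment to send a reminder, expressed in UTC."""
    if meeting_start.tzinfo is None:
        raise ValueError("meeting_start must carry time zone information")
    return (meeting_start - lead).astimezone(timezone.utc)


# A meeting at 09:00 in the attendee's zone, with a 15-minute reminder.
meeting = datetime(2024, 3, 1, 9, 0, tzinfo=ATTENDEE_ZONE)
reminder = reminder_instant(meeting, timedelta(minutes=15))

print(reminder.isoformat())
print(reminder.astimezone(SERVER_ZONE).isoformat())
```

Comparing the UTC instant with the current UTC time gives the same result no matter where the server is located.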
Truncation/text expansion
ID
truncation-text-expansion
Definition
The target text has insufficient room to display the translated text according to specifications.
The source text uses words that refer to a specific type of military officer but the target text refers to military officers in general.
Note(s)
In some cases differences in concept structure between languages may render an apparent under-translation necessary. In such cases this issue should not be considered an error, although the issue may be noted for further consideration.
The following text appears in an English translation of a German letter: “We thanked him with heart” where “with heart” is an understandable, but non-idiomatic rendering, better stated as “heartily”.
Note(s)
Unintelligible
ID
unintelligible
Definition
The exact nature of the error cannot be determined. Indicates a major breakdown in fluency.
The text states that a feature is present on a certain model of automobile when in fact it is not available.
Note(s)
Verity issues can apply to the source or target text and often emerge during translation when, for example, a factual statement is true in the source locale but not true in the target locale.
Specifications state that at least two lines of a paragraph must appear on a page (if the paragraph is more than one line), but a single line starts a page while two appear on the previous page.
This is a community initiative. This group was originally proposed on 2017-06-12 by Georg Rehm. The following people supported its creation: Georg Rehm, Arle Lommel, Phil Ritchie, Olaf-Michael Stefanov, Alan Melby, Erica Michael, Ingemar Strandvik, Richard Ishida, Felix Sasaki, Pedro Luis Díez Orzas, Yves Savourel, Merle Tenney, Tatiana Gornostay, Aljoscha Burchardt, Tomislav Novak. W3C’s hosting of this group does not imply endorsement of the activities.