
INFORMATION CONTAINED IN THIS REPORT

1. About the 2022 Nimdzi Language Technology Atlas
2. Methodology
3. Categories of the Nimdzi Atlas, explained
4. Innovations, NLP, and blockchain
5. Language technology trends
6. Machine translation
7. The 2022 Nimdzi Language Technology Atlas special focus: virtual interpreting technology
8. What the future holds
About the 2022 Nimdzi Language Technology Atlas
The Nimdzi Language Technology Atlas offers a unified view of the current language technology
landscape. This freely accessible annual report provides readers with insights into major tech
advancements and demonstrates the value and potential of technology solutions designed for the
language industry.

The Atlas is a useful tool that helps with language technology-related decision-making. Technology
providers use the Atlas both to benchmark their competition as well as to find partners. Investors
refer to it to gain a better understanding of the leading market players. Linguists and buyers of
language services turn to it to see what tools are out there to help them in their everyday jobs. It
allows students of language programs around the world to discover just how many tools may be
just a click away from being leveraged in their future careers.

That being said, market transparency is not enough to keep pace with the changing technology
environment of today. That’s why the Language Technology Atlas serves as a starting point, a map
of sorts. Only you can make your language technology journey really work for you. So, let’s open
our newly updated guide to the world of language tools, and walk this road together.
Happy travels!

Methodology
In this year’s edition of the Nimdzi Language Technology Atlas, we collected data from providers
of more than 800 technology solutions.

The data gathering behind the Atlas is based on four main sources:

1. Ongoing research and ad-hoc requests around new additions to the Atlas as well as changes
to the companies and tools already included in the 2021 edition.
2. More than 50 briefings and meetings with language technology firms, held during the first
half of 2022.
3. Publicly available data (company websites, research articles, blog posts, press releases,
webinars, published research papers, etc.).
4. Experience of Nimdzi team members who regularly use and evaluate various language tools.

These sources have given us a comprehensive understanding of the state of technology development
in the industry. Before we continue, let's review the definitions of the key language technology categories.
Categories of the Nimdzi Atlas, explained
Translation management systems
Translation management systems (TMS) are systems that feature both translation (and editing)
environments and project management modules. Core components of a typical TMS include:

• Bilingual translation environment (source and target)
• Translation memory (TM)
• Termbase (TB)
• Machine translation (MT) (optional)
• Project management features
• Built-in quality assurance (QA)

You can check out more features of a modern TMS using Nimdzi’s free TMS Feature Explorer.
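
To make the component list above concrete, here is a minimal, hypothetical sketch of how those core pieces might be modeled in code. It is an illustration only; the class and field names are our own assumptions, not any vendor's actual schema.

```python
# A minimal, illustrative model of core TMS components (assumed names, not a
# real vendor schema): bilingual segments, TM, termbase, and a built-in QA check.
from dataclasses import dataclass, field

@dataclass
class Segment:
    source: str               # bilingual environment: source text...
    target: str = ""          # ...and target text side by side
    mt_suggestion: str = ""   # optional machine translation (MT) suggestion

@dataclass
class Project:
    name: str
    segments: list[Segment] = field(default_factory=list)             # tracked by PM features
    translation_memory: dict[str, str] = field(default_factory=dict)  # TM: source -> target
    termbase: dict[str, str] = field(default_factory=dict)            # TB: term -> approved term

    def qa_check(self, segment: Segment) -> list[str]:
        """Built-in QA: flag empty targets and termbase violations."""
        issues = []
        if not segment.target:
            issues.append("empty target")
        for term, approved in self.termbase.items():
            if term in segment.source and approved not in segment.target:
                issues.append(f"term '{term}' not rendered as '{approved}'")
        return issues
```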

Inside the TMS category, there are four subcategories:

1. Localization for developers. This subcategory is reserved mostly for developer-oriented
tools that focus on enabling software teams with modern localization solutions. Both an
open-source TMS with very basic functionality and a full-fledged cloud TMS provider that
serves dev teams across the full localization cycle fall into this subcategory.

2. Enterprise-only TMS. Solutions oriented mostly toward enterprise-level customers. Can a
freelancer buy a license to such a TMS? Not necessarily. In the more usual scenario, an
enterprise grants the corresponding licenses to its team.

3. Proxy & JS-based website loc and updates on air. TMS solutions oriented toward web
localization fall into this subcategory: for example, solutions using a translation proxy or
an anti-proxy approach.

4. Generic TMS for every customer profile. Here, some of the brands present in the three
previous categories also appear. This is a subcategory for TMS-type tools that can be (and
are) used by all user types: freelancers, LSPs, enterprises. One for all, all for one, as they say!

Translation business management systems

Unlike TMS, translation business management systems do not have a bilingual translation
environment. They provide only the management features needed to run translation projects.
We call such technology a BMS or (T)BMS, since that's exactly what it does: it helps manage
business operations around translation.
Audiovisual translation tools
Here, we feature various tools and platforms for audiovisual translation enablement: from project
and asset management tools to AI-enhanced dubbing tools.

Machine translation
This section presents the major MT engine brands, subdivided into four subcategories based
on the MT providers' specialization.

Integrators
Here we list systems that integrate other systems with each other. The middleware subsection
shows the major companies that specialize in integrating various language technologies. The
products in the MT Integrators subsection not only provide smart access to MT engines but also
support certain procedures around MT, so that users can leverage MT in the best way possible.

Marketplaces and platforms


In this section, we feature platforms and marketplaces focused specifically on translation and
localization talent. In a marketplace, you can post a job and accept responses from linguists or
other professionals who are interested in doing the work for you. Then you book this talent or
directly assign the job to the chosen talent within the platform. If you’re a linguist, you sign up and
set up your profile in the system, get vetted and/or tested (on some marketplaces), and then start
receiving job offers.

There is also the platform LSP option where you not only get access to a library of linguistic
resources and agencies, but also to the workflows for the projects and PMs who support you. You
can upload your files to the platform, get an instant quote, and after quote approval and project
completion, receive the result.

Interpreting systems
At Nimdzi, we coined the umbrella term ‘virtual interpreting technology’ (VIT) to describe any kind
of technology that is used to deliver or facilitate interpreting services in the virtual realm. There
are three ways in which virtual interpreting can be performed or delivered: via over-the-phone
interpreting (OPI), video remote interpreting (VRI), or remote simultaneous interpreting (RSI).

As the name OPI suggests, two or more speakers and an interpreter use a phone to communicate.
This is an audio-only solution and the interpretation is performed consecutively. VRI is also performed
consecutively. However, in this case, there is both an audio and a video feed. Depending on the VRI
solution, users and interpreters either connect via an online platform with video calling capability
or via a mobile app. As for RSI, it directly evolved out of the field of conference interpreting and
is intended for large online meetings and events with participants of many different language
backgrounds. As the name suggests, the interpretation is performed simultaneously — that is to
say, at the same time as the speakers give their speeches.

Artificial Intelligence (AI) is increasingly entering more areas of our lives and the interpreting
market is no exception. We have included machine interpreting solutions in our definition of VIT
and are subsequently listing them in our Language Technology Atlas as well.
Interpreter management and scheduling (IMS) systems are also included in our definition of VIT
because, even though they do not focus on delivering interpreting services, they facilitate them.
An IMS is a useful tool that allows for efficient management of interpreter bookings for both onsite
and virtual interpreting assignments.

You can check out more features of VIT systems using Nimdzi’s free VIT Feature Explorer.

Quality management
This section is devoted to quality management in translation. It features three separate subcategories
which correspond to three main product types in this area: QA tools, review and evaluation tools,
and terminology management tools.

Speech recognition solutions

Also known as automatic speech recognition (ASR). This section features solutions that focus on
automatic transcription and automatic captions.

And if you’re interested in learning more about these solutions and diving deeper into the associated
terminology, here’s a Nimdzi Learning Course which is devoted exactly to explaining what’s what in
language technology.
Innovations, NLP, and blockchain
Writing about language technology year after year, we often can't help but wonder whether we'll
ever see a new idea so innovative that it could transform and reshape our industry as a whole.

There have only ever been three disruptive innovations in the language industry:

1. E-mail (enabled the notion of in-country native speakers as translators)
2. Translation memory software (TMs and the concept of CAT-tools, which then
transformed into TMS)
3. Machine translation, and Google Translate in particular.

Renato Beninatto, Co-founder, Nimdzi Insights

Diving deeper into the question of whether there’s already anything else, in July 2022 Nimdzi
put together a list of the top 25 most innovative companies in the language industry, which goes
beyond the language technology companies we normally write about in the Atlas.

As the technology landscape evolves, so does our thinking about the areas of focus for the Atlas. We
have already started to map technology systems and innovative initiatives built around language data,
whose core is big data, the simulation of natural language, and the specific innovative applications
the latter offers. There are also new companies in our industry that use blockchain technology,
peer-to-peer (P2P) approaches, payment tokens, and other concepts from the world of big IT.

Let’s take a look at some of those solutions in further detail.

Natural language processing


When thinking about the natural language processing (NLP) landscape, we aim to consider the
applied NLP solutions that both relate to and extend localization enablement, which are essential
to the content and language strategies of digital enterprises. We focus on multilingual learning and
applications that catalyze growth in new markets.

For now, we’re not yet adding a special category to the Atlas infographic, but still would like to
mention a couple of solutions that have caught our attention.

• Datasaur is an NLP data-labeling platform with annotation, collaboration, and automation
capabilities. It helps supercharge NLP models with data, and therefore enables an increase
in model performance.

• Defined.ai has an AI marketplace platform with the assets needed to build AI models. You
can also use it to monetize your AI data, tools, or models by becoming a vendor. They also
offer professional services to help deliver complex machine learning (ML) projects.

• Lexalytics offers options for integrating their text analytics APIs to add NLP into a product,
platform, or application. They perform sentiment analysis, entity extraction, and intention
detection.

• NLP Cloud Playground offers multilingual intelligence via a graphical interface that allows
you to easily try out models without actually writing code.

• Telus has an intelligent assistant platform for conversations. Brands use this platform to
enhance their CX operations.

• The BLOOM model demonstrates a more accessible NLP approach and runs in more than 20
languages. In some ways, it’s creating a new trend.

The boundaries of localization and associated technologies are expanding as the use of AI becomes
more widespread and more features are added.

Both global enterprises with a strong localized presence and localization service providers
alike are focusing on new NLP usage opportunities to enable better business decisions
and reduce operational costs.

Roman Civin, VP of Consulting at Nimdzi

Not only do automated messaging solutions and entity sentiment analysis models create distinct
types of value in specific verticals, but so do content creation and classification systems, search
intelligence, and language asset optimization. In businesses where localization drives content
strategy, localized experiences are part of the game, not an afterthought.

Even though the current applications of NLP in localization are not large in number, we will continue
tracking NLP platforms and technologies, how they help, and how they are relevant for our industry.
Blockchain
In addition to Translateme, which we included in the 2021 edition of the Atlas and mentioned in
the data-related section of the Nimdzi 100 in 2021, it's important to also mention Exfluency when
discussing the blockchain arena. The latter is a relatively new system built on the concept of privacy
by design, with the idea of creating a space for a secure multilingual asset store. They now have a
community of more than 1,000 users.

Exfluency offers two levels of anonymization (one for GDPR compliance, plus another for
anonymization of certain data according to customers' requirements). Their hybrid anonymization
is roughly 85 percent AI and 15 percent human, and they use blockchain on three levels:

• Cryptocurrency for people who are using the platform
• An implemented Trust Chain™ with P2P judgment done by single individuals, along with trust
mining, also in blockchain: who judged whom based on what
• Change management: what was changed, when, and by whom

Exfluency was created with the idea of empowering language as a natural human asset and
removing the layers of middlemen on which the regular localization supply chain is built (with the
actual linguists at the very end). They succeeded, and it already works for certain use cases. For
this concept to really disrupt the industry, though, a lot of further development is expected, for
example the implementation of customized quality management systems (for when the P2P
judgment happens).

In any case, it will be interesting to monitor how Translateme and Exfluency develop, as well as
to see more companies and use cases in the translation and localization industry with blockchain-
powered solutions.

Language technology trends


TMS
TMS can be considered a prerequisite of professional localization today, and a TMS is oftentimes
central to a client's efforts to localize their content. In fact, 92 percent of the translation and
localization managers that Nimdzi interviewed as part of its Lessons in Localization series between
July 2020 and December 2021 stated that their companies use either a commercial TMS or one
developed in-house.

There is a great variety of TMS solutions available on the market, each striving to address the
specific needs of their clients. In this year's Atlas, we reference over 160 different TMS solutions,
10 more than last year. In other words, despite the wide variety of options already out there, new
solutions continue to emerge.
Some language technology companies turn to the TMS arena from related areas — for example,
(T)BMS. Within the last year, a couple of BMS tools added translation editor functionality and
joined the TMS category. For example, Taia was moved from the BMS category to the TMS for
enterprise category.

The TMS market is experiencing growth that is, at least for some players, outpacing the
growth of the language services industry as a whole. This growth, in turn, attracts both
investment opportunities and opportunities to consolidate market position, as evidenced
by a slew of mergers and acquisitions in this segment of the technology market.

Gabriel Karandyšovský, COO, Nimdzi Insights


While it is not exactly within the scope of this research to evaluate the performance of specific
companies, a few points are worth mentioning:

• LILT is further exploring the Global Experience (GX) domain as they continue to develop fully
managed, human-powered, AI-assisted translation services. They are providing human-in-the-
loop AI with a gradient update of MT, which learns from feedback online. The company is
moving in the direction of facilitating not a particular service or TMS technology but rather an
"entire customer journey," a trend we've also seen reflected in other industry players, not
necessarily from the LT arena per se (see the STAR7 example here).

• As the data from over 400 respondents to Nimdzi's 2022 TMS Survey suggests, memoQ
and Memsource hold the position of the most favored brands on the market, featuring both
the highest number of positive experience responses and the lowest number of negative
experience responses.

• RWS, which has by now assimilated SDL's suite of technology products, reported around
GBP 100 million/USD 125 million in revenue related to technology-enabled services at the
end of 2021. This is a useful benchmark for this subsegment, as SDL's technology has long
been considered the market leader. Trados® Enterprise in our Atlas is a direct
replacement for RWS Language Cloud.

• MotionPoint and Smartling are two interesting players to consider. Today, the former
refers to itself as a tech-enabled service provider, whereas the latter started out as a pure-
play tech provider and has been branching out into providing language services. Each started
their trajectory on the opposite end of the spectrum. This speaks to a trend to keep an eye
on, especially from the point of view of LSPs wanting to expand their game: companies
combining service offerings with advanced technological capabilities.

• Smartcat is further exploring in-context preview features (for MS Word, subtitles, HTML)
and adding more integrations, for example with Figma. They are also working on automatic
post-editing (APE). The goal here is to make Smartcat more accessible to people outside the
realm of localization and the TMS environment in particular. Another area Smartcat is focusing
their development on is matching and recommending the freelancers best fit for a particular job,
that is, automatic recommendations of translators for a specific language combination based
on the content that has been slated for translation.

• In February 2022, Wordbee demoed their experiments around InDesign, dubbed Wordbee
Link for InDesign. This native InDesign multilingual solution has automated layout cloning, with
source and target designs remaining fully independent, as well as automated change tracking
for compliance, smart identification of word-to-word links, live updates, concurrent workflow
(which allows for working in both source and target simultaneously), and a whole host of other
new features.

• Seeing a growing need for better communication processes, along with smarter automation,
XTM Cloud introduced a query management module that enables users to create and access
all queries, getting the answers they need without ever having to leave the TMS. They have
developed an improved method of increasing TM leveraging, with a proprietary algorithm
that retrieves up to 30 percent more TM matches than the industry standard, resulting in time
and cost savings. The XTM team has also enhanced their automatic placement of inline tags:
tags are now more accurate, saving linguists tedious, manual work. XTM International plans
to introduce more AI features, such as automated reviews.
If we take a closer look at some of the most regularly occurring challenges associated with
selecting a TMS in 2022, we'll most likely see well-known issues with connectivity (a modern TMS
should be able to connect to a diverse set of systems, from web content repositories to home-grown
software solutions), with security, and with full compliance with GDPR as well as HIPAA, ISO/IEC
27001:2005, PCI, and other standards and protocols. How do you go about ensuring that a file does
not leave the TMS environment on the translator side, for example? Or that the document is not
compromised when transmitted via unsecured email servers? Another issue frequently reported by
enterprise customers is calculating the ROI of a TMS.

Table: Which of the following TMS solutions have you heard of?

The top five companies in the TMS arena (memoQ, Memsource, RWS, XTM, and Smartling, as the
data from the Nimdzi survey suggests) have all increased their individual brand awareness over
the past 10 years. One could conclude that top-tier mindshare status is hard for newcomers in the
TMS developer world to obtain. However, the trending data over time tells quite a different story,
showing that it is indeed possible for companies to improve their mindshare status with the right
marketing strategy (see the Memsource example).

Terminology advancements and quality management

For all the vivid TMS development, terminology management remains an area yet to be fully
tackled. Developments on this front are few in number.

Worth mentioning, however, are new features from lexiQA. They launched the first version of a
morphological terminology engine, are working on developing morphology support for more
languages, and are planning a pilot with one client at the end of the summer. The engine delivers a
significant decrease in QA false positives, with no reported false negatives. On top of that, lexiQA
also introduced a QA-as-you-type feature as well as review capabilities. An LQA mechanism available
via API takes most of the manual work out of the quality assessment process. Having mapped all
error classes onto the MQM model, lexiQA is capable of assigning a severity weighting to each error
type, while automatic QA checks produce scorecards based on specified requirements (a sketch of
the general idea follows below).
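
lexiQA's actual weights and scoring formula are not public, so the following is only a minimal sketch of the general MQM-style idea described above: error classes carry severity weights, and a per-word penalty is turned into a scorecard value. The weights and the 0-100 scale are assumptions for illustration.

```python
# A minimal sketch of MQM-style severity weighting (assumed weights and scale).
SEVERITY_WEIGHTS = {"neutral": 0, "minor": 1, "major": 5, "critical": 10}

def mqm_score(error_severities, word_count):
    """Return a 0-100 scorecard value: 100 minus the severity-weighted penalty per word."""
    penalty = sum(SEVERITY_WEIGHTS[s] for s in error_severities)
    return max(0.0, 100.0 * (1 - penalty / word_count))

# Example: a 500-word sample with two minor errors and one major error.
print(mqm_score(["minor", "minor", "major"], 500))  # -> 98.6
```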
Another name that popped up a lot in conversations with localization professionals keen on quality
was Sketch Engine. For example, OneClick Terms, an online term extractor with monolingual and
bilingual term extraction capabilities, is powered by Sketch Engine's term extraction technology.
It is both a powerful service and a platform covering many languages along with huge corpora.
The value lies in the intelligent use of language corpora, and one of the use cases is advanced
terminology extraction (we haven't added it to the infographic below, because we do not yet have
a subcategory for terminology extraction).

Another important concept to keep in mind when talking about quality is the translation error
correction (TEC) model, introduced in a paper published by LILT and the University of California
at Berkeley in June 2022. TEC automates the correction of human translations. The idea is that it
could help speed up the review process, for example by suggesting corrections. Even though some
of these suggestions may still be incorrect, professional reviewers accepted 79 percent of the TEC
suggestions overall. It has also been noted that TEC could potentially help with tackling client-specific
requirements.

A new addition to our Review and Evaluation subsection is LocalyzerQA by Lingoport. LocalyzerQA
automates linguistic review of web, mobile, and desktop applications in context without requiring
developer assistance. In their June release, new features were added to LocalyzerQA such as
global replace of selected strings across files and projects during localization review as well as new
statistics tracking for translations and reviewers.

Note that the well-known MultiTerm was removed based on updated information from RWS, and
Trados Terminology has been added instead.
Data for localization
Nowadays, the requirement of having enough data usually comes up when discussing MT
customization. However, it may also be a blocker for major IT brands that develop generic engines
for low-resource languages.

In the past, MT researchers often wondered how to get around the problem of having adequate
technology without having adequate data. This question is still relevant today, yet it is no longer
just a low-resource-language issue. Fortunately, an increasing number of tools and technologies
have been developed to deal specifically with language data.

Last year, we discussed the most prominent trends in data for AI and AI localization, highlighting
the newly introduced solutions by SYSTRAN, Omniscien, and Pactera EDGE, among others. A
major player in the AI data space that we should mention here is Appen. Founded in 1996,
Appen was historically the first LSP to recognize the importance of collecting and producing, at
scale, the quality language data required to train multilingual AI. To support its global resourcing
and production models, Appen developed an internal proprietary platform, ADAP (Appen's Data
Annotation Platform), that helps procure and annotate training data, enabling the world's largest
MT, ASR, and NLU solutions relied upon by many of the language technology suppliers listed in
our Atlas.

Appen has several proprietary systems, including Ontology Studio, which helps with the creation
of custom multilingual ontologies across multiple locales for market-specific search relevance and
recommendation engines; Ampersand, for automated and human-in-the-loop speech recognition
data processing and transcription editing; and ADAP (available internally and via subscription),
which is a resource management and data curation/collection platform.

Crowdsourcing language data is one option, but there's also the option of data synthesis. As the
name suggests, data is synthesized, or created artificially, in order to overcome the limitations
of real-world data. It's cheaper, doesn't contain personal information (unlike human-generated
data), and has numerous other benefits. Nonetheless, synthetic data currently accounts for only 1
percent of all market data. However, Gartner forecasts that by 2027 this market segment will grow
to USD 1.15 billion (a 48 percent CAGR).

Synthetic voices and AI-enhanced dubbing


Speaking of synthetic data, let’s not forget to discuss the rapidly developing arena of synthetic
voices and ‘AI dubbing’. As we noted last year, AI-enhanced technology for video localization
extended its scalability outside the entertainment industry. Synthetic voices are also used in
e-learning, educational materials, broadcasting, and advertising.
Voiseed, who was already on our radar last year in the “AI-enhanced dubbing tools” subcategory of
the Atlas, just won a PIC challenge at LocWorld Berlin with “A New AI-based Technology to Synthesize
Controllable, Expressive Speech in Multiple Languages, with Multiple Voices.”

New additions to this category include Aloud by Google and Dubverse. The former is part of Area
120, Google's in-house incubator for new products and services, while the latter offers AI-powered
video dubbing using text-to-speech (TTS), advanced MT, and AI. The platform features human-like AI
voices from a range of more than 100 speakers of various genders, ages, and styles to match a
particular content type.

Interestingly, everyone that enters this arena seemingly dubs their solution a “new way to dub” despite
the fact that this category of the Atlas already includes 25 such companies and solutions.

As early as 2020, South African news provider Media24 built its own synthetic voice generator. While
we’re on the subject of Africa, let’s also mention Abena AI, a voice assistant fluent in Twi, the most
widely spoken language in Ghana. The Abena AI app for Android is called “Africa’s first hands-free
offline voice assistant” and opens up voice AI to those who don’t speak English or other common voice
assistant languages.
In 2021, African demand fueled the USD 3.4 million investment by Mozilla Common Voice to create
linguistic databases as a precursor for a voice assistant that can speak Swahili. The Common Voice
project itself was launched five years ago to support voice technology developers who do not
necessarily have access to proprietary data. It has become a platform where anyone can donate their
voice to an open-source data bank. It now includes more than 9,000 hours of audio in 60 different
languages.

It is expected that there will be further developments in the area of building native voice
assistants, including adding both more African languages and additional features that are not yet
common in voice assistants like Alexa or Siri.

As noted by Waverly Labs earlier this year, the world has been slowly embracing a "not-only-English
approach." Waverly Labs has built technology for near-instantaneous communication among multiple
languages and dialects, including their latest solutions: Subtitles, Audience, and Ambassador
Interpreter. Subtitles facilitates transactions and service exchanges between different language groups.
Designed as a two-sided screen, the device's mic array picks up the conversation between two parties,
sends it to the cloud for translation, and flashes the processed text onto the screen. "It's almost like
watching a subtitled movie, but at the bank, hotel, airline, grocery store, or other such organization
or business."

And for English-speaking countries where people want to understand what is said on the screen
without using subtitles, there are also new tools appearing on the market. A piece of technology to
keep an eye on is the auto-dubbing solution Klling, by the AI startup KLleon from Singapore. Klling is
an app that dubs media content in English, Korean, Chinese, and Japanese. So if you want to watch
Korean content in English, you can use this dubbing solution. They also have an interesting product
called Klone.
Klone enables the creation of an AI virtual assistant/chatbot using a virtual avatar. AI virtual chatbots
can be used as a user interface for people to interact with. Klone's underlying deep learning
technology, called Deep Human, requires a single photo and 30 seconds of voice data to create a digital
human. The resulting auto-multilingual dubbing solution is able to dub the video into five languages
while maintaining the person's voice, with automatic lip sync enabled. Samsung uses Klone for their
AI virtual assistant for Samsung Display.

Digital humans for use in the metaverse and virtual influencers in general are nothing new, but they
are becoming ever more popular and widespread. In June 2022, Lava, the “first Armenian” virtual
influencer appeared on the scene, created with the help of Unreal Engine. Lava publishes content in
English but does not yet talk — unlike South Korean AI anchor Kim Ju-ha.

In her case, the cable TV network MBN even has a page for their AI anchor news labeled “Virtual
Reporter News Pick,” where the AI anchor explains, “I was created through deep learning 10 hours
of video of Kim Ju-ha, learning the details of her voice, the way she talks, facial expressions, the way
her lips move, and the way she moves her body.” As MBN officials stated in 2020, “News reporting
using an AI anchor enables quick news delivery in times of emergency for 24 hours non-stop.”

Will we see more international examples of news delivered with the help of such AI? Probably so. And
not only in news and broadcasting. For example, KLleon's own virtual human trended on TikTok with
over 2 million views.

Amazon, Microsoft, and Meta


In October 2021, Microsoft Translator reached a major milestone: it was capable of translating more
than 100 languages. In July 2022, Meta announced that it had achieved another milestone related to
the No Language Left Behind (NLLB) initiative: NLLB-200, an AI model that can translate content to
and from 200 different languages. This is almost twice the number of languages covered by current
state-of-the-art models.
In the company’s first announcement of NLLB, it was noted that they were particularly interested
in making MT more accessible for speakers of low-resource languages. The model has already
been put to use. For example, Meta partnered with the Wikimedia Foundation to give Wikipedia
editors access to the technology to quickly translate Wikipedia articles into low-resource languag-
es that do not have a particularly prominent presence on the site. Meta AI is also providing up to
USD 200,000 of grants to nonprofit organizations for other real world applications for NLLB-200.
The model itself is open-sourced.

In addition to Meta, in July 2022 Amazon also announced another way to "break through language
barriers." They proposed combining three of their services (Amazon Transcribe, Amazon Translate,
and Amazon Polly) to produce a near-real-time speech-to-speech solution whose aim is to quickly
translate a source speaker's live voice input into a spoken target language, with zero ML experience
required. Fully managed AWS services work together in a Python script, using the AWS SDK
for the text translation and text-to-speech portions and an asynchronous streaming SDK for audio
input transcription.

As suggested by Amazon, the workflow is as follows:

1. Audio is ingested by the Python SDK.
2. Amazon Transcribe converts the speech to text via its streaming SDK.
3. Amazon Translate converts the text between languages. It also has several useful features,
like creating and using custom terminology to handle mapping between industry-specific terms.
4. Amazon Polly converts the translated text back to speech.
5. Audio is output to speakers.

Source: Amazon
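
Below is a minimal sketch of the middle of this chain using boto3, the AWS SDK for Python. Amazon's actual sample feeds live audio through the asynchronous streaming SDK for Amazon Transcribe; for brevity, this sketch assumes the source speech has already been transcribed and covers only the Translate and Polly steps. The region, language codes, and voice are illustrative assumptions.

```python
# Sketch of steps 3-5 above: translate transcribed text, then synthesize speech.
import boto3

translate = boto3.client("translate", region_name="us-east-1")
polly = boto3.client("polly", region_name="us-east-1")

def translate_and_speak(transcribed_text: str) -> bytes:
    """Translate already-transcribed English text to Spanish and return MP3 audio."""
    # Step 3: Amazon Translate converts the text between languages.
    result = translate.translate_text(
        Text=transcribed_text,
        SourceLanguageCode="en",
        TargetLanguageCode="es",
    )
    # Step 4: Amazon Polly converts the translated text back to speech.
    speech = polly.synthesize_speech(
        Text=result["TranslatedText"],
        OutputFormat="mp3",
        VoiceId="Lucia",  # a Spanish Polly voice
    )
    return speech["AudioStream"].read()

# Step 5: persist the audio so it can be played through speakers.
with open("translated_speech.mp3", "wb") as f:
    f.write(translate_and_speak("Hello, how can I help you today?"))
```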

Real-life use cases for this may include the medical and business domains, as well as events. Amazon
developers also encouraged users to think about how they could use these services in other ways to
deliver multilingual support for services or media.
Machine translation
Better MT?
Even though the language industry largely agrees on the benefits of MT in its current state, the
billions of words processed by MT engines each and every day may still result in translations that
are, overall, not quite good enough. For this reason, a number of companies have taken on the
mission of changing this via incremental improvements of their technologies aimed at bettering
the MT output quality.

This is one of the reasons the MT market is so complex and dynamic. But with so many options
available, the good news is that, unlike early adopters of MT, modern users don't necessarily need
to study the market themselves or evaluate engines in-house. One good example of speeding things
up with customization is adaptive MT. Adaptive MT engines are capable of learning from
corrections in real time. Known examples of such engines are LILT and ModernMT.

However, even though the concept of adaptive MT means improving engines fast, which is in itself
a clear benefit, it may not be the best option for every use case, simply because the people who
introduce the corrections from which the engines learn on the fly can and do make mistakes. Then
there's also the issue of who has the final say when multiple "correctors" may be improving
the same engine and introducing contradictory changes. As confirmed by Globalese, whose specialty
is precisely MT customization, engines should be improved using quality-approved data.
Asynchronous retraining of a custom engine (e.g., on a daily basis) with qualified data can be more
efficient than real-time learning based on unconfirmed content.
Moreover, there are now technology companies that can help with preparing data, MT engine
assessment, and the overall implementation of an MT program at an enterprise, be it MTPE training
for linguists or setting up effective MT customization workflows. As we've already covered examples
of such providers in last year's report, this time we're going to focus on a couple of more recent
examples of "better MT."

New generation MT glossary


Most glossaries available on the market still have search-and-replace functionality. But we’re already
seeing changes to this approach. For example, DeepL launched a glossary feature in May 2020
that allows users to define and enforce custom terminology, but they didn’t stop there. They also
launched new language models that more accurately convey the meaning of translated sentences,
while overcoming the challenge of industry-specific professional jargon.

With the continuous improvement in MT technology, engines are expected to get even better,
enabling everyone to use glossary terms with morphologically correct inflections.
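
For illustration, here is a minimal sketch of how such a glossary can be enforced through DeepL's API via their official `deepl` Python package. The auth key and glossary entries are placeholders; the feature itself is documented by DeepL, but this particular pairing of terms is our own example.

```python
# Sketch: define and enforce custom terminology with the DeepL API.
import deepl

translator = deepl.Translator("YOUR_AUTH_KEY")  # placeholder key

# Create a glossary that pins a source term to a required target term.
glossary = translator.create_glossary(
    "Tech glossary",
    source_lang="EN",
    target_lang="DE",
    entries={"machine translation": "maschinelle Übersetzung"},
)

result = translator.translate_text(
    "Our machine translation keeps improving.",
    source_lang="EN",   # required when a glossary is used
    target_lang="DE",
    glossary=glossary,
)
print(result.text)
```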

TAUS Data-Enhanced MT
In the beginning of 2022, a brand new TAUS service was launched: TAUS Data-Enhanced Machine
Translation (DEMT). TAUS DEMT is set to deliver affordable, customizable, high-quality MT
output with a single click, using the training datasets most relevant to the user's source files. This
is a type of real-time MT where training and customization happen based on selected datasets.

Evaluations performed by third-party MT experts showed that the available TAUS datasets used
in the customization of Amazon Translate improved the BLEU score measured on the test sets by
more than 6 BLEU points on average, and by at least 2 BLEU points, in the medical, ecommerce,
and finance domains. This amounts to an increase of 15.3 percent on average (see the check below).

Source: Evaluating the Output of Machine Translation Systems
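
As a hedged back-of-the-envelope check, the two reported figures together imply an average pre-customization baseline of roughly 39 BLEU, assuming the 15.3 percent relative increase refers to the 6-point average gain:

```python
# Implied baseline, assuming the relative increase refers to the average gain.
avg_gain = 6.0          # average BLEU gain from TAUS data customization
rel_increase = 0.153    # reported average relative increase
print(f"implied average baseline: {avg_gain / rel_increase:.1f} BLEU")  # ~39.2
```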

As a result, TAUS DEMT delivers quality similar to post-edited MT. But delivery is virtually in real
time, and prices are 50 to 80 percent lower than for the "human-in-the-loop" service.
Fuzzy matches in MT engines
Speaking of TAUS, according to their research, anything below an 85 percent fuzzy match in
Romance languages is potentially better handled by MT than by translation memory (TM). That’s
most likely why, for instance, the MT suggestions in MateCat are allocated an 85 percent match by
default. In the absence of higher percentage matches, the MT will be displayed.

The representation of MT as a fuzzy match, like in a TM, can give users an idea of the extent to
which the MT output may be used. And while we're on this subject, it's worth mentioning what
Memsource engineers are trying to do. They have addressed the overall feeling of uncertainty
inherent in the MT evaluation process with their machine translation quality estimation (MTQE)
feature, which helps users automatically estimate the quality of MT, especially when there's no
match from the TM. Based on MTQE results, users can decide how to treat the output. Knowing that
linguists save time thanks to quality scores for TM matches, Memsource decided to provide a
similar option for MT.

Looking at the question of fuzzy matches in MT from another perspective, there is ongoing research
into combining TM+MT in a single segment. This actually might become another MT innovation:
part of a given segment may be found in a TM, and therefore presented to the linguist, but another
part (missing from the TM) comes from MT.

MT as a technology has reached a certain level of maturity in terms of baseline quality
and customization, which implies some serious setbacks for traditional translation
memory technology.

Jourik Ciesielski, Nimdzi Insights

A fuzzy TM match entails a predefined error threshold: if you populate a 90 percent match,
you know for sure that 10 percent of the segment needs to be corrected. The edit distance in a
machine-translated segment can be smaller, especially when the MT comes from a well-trained
model that includes company-specific or domain-specific terminology. As a consequence, some
organizations already prefer MT over fuzzy TM matches below 95 percent. A minimal sketch of
this threshold logic follows.
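
The sketch scores the best TM fuzzy match for a new segment and falls back to MT below a cutoff. Python's difflib serves as a stand-in for a real CAT tool's fuzzy matching algorithm, and the 95 percent cutoff reflects the preference some organizations report above; both are assumptions for illustration.

```python
# Sketch: prefer a TM fuzzy match above a cutoff, otherwise fall back to MT.
from difflib import SequenceMatcher

FUZZY_CUTOFF = 0.95  # some organizations already prefer MT below ~95%

def fuzzy_score(source: str, tm_source: str) -> float:
    """Approximate a fuzzy match score between a new segment and a TM entry."""
    return SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()

def pick_suggestion(source, tm_source, tm_target, mt_output):
    if fuzzy_score(source, tm_source) >= FUZZY_CUTOFF:
        return tm_target  # high fuzzy match: populate the TM target
    return mt_output      # otherwise the MT output likely needs fewer edits

print(pick_suggestion(
    "Click the Save button to keep your changes.",
    "Click the Save button to store your changes.",
    "Klicken Sie auf Speichern, um Ihre Änderungen zu sichern.",
    "Klicken Sie auf Speichern, um Ihre Änderungen zu behalten.",
))
```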

Tone of voice
Paying attention to the tone of voice helps a text appear more aligned and sound more human. In
writing, signals like emotion, body language, gestures, voice, and so on have to be conveyed through
the tone of voice, and MT is learning to reflect this significant part of modern communication.

In translation, this is especially useful for conversational scenarios in languages where the tone of
formality matters, like German or French. DeepL already has a trigger for formal/informal speech
and features native tone-of-voice control. Formal/informal options are also available in Amazon
Translate. Intento is likewise leveraging existing technology for tone of voice. Their focus
represents an effort to control involuntary bias when using MT. This involves taking practical steps
toward dodging these biases by piloting a tone-of-voice control that works independently of the MT
providers: they added MT-agnostic NLP, which enables tone-of-voice control and provides a
wider choice of MT engines for such cases.
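
DeepL's formal/informal trigger mentioned above is exposed as a `formality` parameter in their API. A minimal sketch via the official `deepl` Python package follows; the auth key is a placeholder. In German, the setting flips "you" between the formal "Sie" and the informal "du."

```python
# Sketch: toggle DeepL's formal/informal tone-of-voice setting.
import deepl

translator = deepl.Translator("YOUR_AUTH_KEY")  # placeholder key

for formality in ("more", "less"):  # "more" = formal, "less" = informal
    result = translator.translate_text(
        "How are you?",
        target_lang="DE",
        formality=formality,
    )
    print(formality, "->", result.text)
```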
The 2022 Nimdzi Language Technology Atlas special focus: virtual interpreting technology
Remote interpreting solutions have been both in development and in use for a long time now.
However, prior to the COVID-19 pandemic, uptake was slow. The onset of the pandemic changed
this drastically, and, ever since, it seems that the growth, innovation, and investment in this field
has been unstoppable. Once considered an afterthought or sub-par alternative to onsite services,
remote interpreting has stepped out of the shadows to become the key to continuity of business
and care in many industries.

Because so much has happened and is still happening in VIT, this year’s Language Technology
Atlas is dedicating a special section to this thriving field within the larger market for language
technology.
Virtual interpreting goes mainstream
An interesting side-effect of the boom in remote interpreting is that interpreting has gone more
mainstream. This trend can be observed across different segments of the interpreting market
— from vaccine centers across the US being equipped with portable, on-demand video remote
interpreting (VRI) devices (e.g., from AMN Language Services), to Walgreens pharmacies
partnering with VRI platform VOYCE to enable efficient communication between customers and
employees (including language access for the Deaf and hard-of-hearing through sign language
interpreting).

However, this trend is particularly noticeable in the remote simultaneous interpreting (RSI) space.
The onset of the pandemic appears to have opened the door to new clients, so that these days RSI
is no longer limited to conference interpreting (its field of origin). Instead, RSI has started to branch
out into other areas of the market. For instance, LSPs suddenly received requests
for RSI for parent-teacher conferences and other school events. Local governments reached out
wanting to add RSI to their town hall meetings and COVID-19 announcements. Educators from
various fields, including healthcare education, have added RSI to their classes, and at least one
large esports company is looking to add RSI to its virtual live events. So not only did the forced
move to the virtual realm remove fears and concerns surrounding remote interpreting on the side
of existing clients, it also created a whole new set of opportunities and brought interpreting to
clients who previously never even considered it.

A likely explanation for this development is the popularity of and increased exposure to
(monolingual) video conferencing platforms like Zoom, Microsoft Teams, and Google Meet, which
have been booming ever since in-person meetings became restricted. Although these platforms
were already being used prior to March 2020, the pandemic took things to a whole new level as
video conferencing became the norm in more or less every area of society: from businesses to
governments to schools to the average person, no matter the age group, no matter the setting
(weddings, birthday parties, and funerals included).

But sooner or later the now well-known Zoom fatigue started to set in, and people began looking
for ways to make their virtual meetings more engaging, exploring new features and meeting
formats, including multilingual meetings facilitated by RSI.

These days, it really seems like RSI is everywhere: from US President Joe Biden holding a virtual
Leaders Summit on Climate, facilitated by Zoom with RSI from Interprefy, in April 2021, to virtual
wine tours in multiple languages in 2022.

Because RSI is in such high demand right now and there is so much innovation happening in this
field, a large portion of our VIT special will focus on RSI this year.
What’s the difference between VRI and RSI?
Before delving deeper into RSI and the different ways it can be performed, it is important to briefly
distinguish it from VRI because both are remote interpreting solutions that use audio and video.

From a service standpoint, the main difference is that VRI is performed consecutively (speakers
and interpreters taking turns), while RSI is performed simultaneously (interpreting at the same time
as the original speaker). And while VRI is predominantly used for smaller meetings or in healthcare
and public sector settings where only two different languages need to be supported, RSI is meant
for larger events and conferences with people from many different language backgrounds.

From a business standpoint, it is also worth highlighting that interpreters are paid either a day rate
or half-day rate for RSI for larger events or conferences. In addition, since the beginning of the RSI
boom, hourly rates are becoming increasingly common for shorter assignments of only one or two
hours. In comparison, VRI assignments are charged by the minute.

What are the different types of RSI platforms?


Over the past two and a half years, we have learned about a whole host of different solutions for
RSI. There are so many that it can be easy to get them all mixed up. So we have broken them down
into four categories.

1. Video conferencing platforms without an RSI feature
2. Video conferencing platforms with an RSI feature
3. Standalone designated RSI platforms
4. Virtual booth RSI platforms

Let’s take a look at each category in more detail.

1. Video conferencing platforms without an RSI feature

An example of a video conferencing platform which doesn't have its own RSI capabilities is Skype.
However, that doesn't mean it is impossible to do simultaneous interpreting on Skype; it just
means that a workaround is needed. This could be:

• A separate over-the-phone interpreting (OPI) line for the interpretation.
• Two simultaneous meetings, with the interpreters joining both and the listeners only joining
the meeting with their language.
• The use of a second audio channel via a social networking application, such as WhatsApp,
so that participants can hear the interpretation.
• An integration with a standalone RSI platform. Interprefy, KUDO, and cAPPisco, for instance,
can be added to MS Teams via the Add-on Store. In addition, other RSI platforms also have
the option of feeding the original audio from the MS Teams’ meeting into their platform and
the audio from the interpreters (who work on the RSI platform) back into MS Teams.
2. Video conferencing platforms with an RSI feature
Zoom is the biggest de facto RSI platform, judging by the number of meetings. Platforms like
Zoom, Webex, Google Meet, and Microsoft Teams (MS Teams) fall under this category. MS Teams
is the latest video conferencing platform to have added an RSI feature, in August 2022. Their RSI
tool will be discussed in more detail in the section on video conferencing platforms. Video
conferencing platforms like these were not designed with multilingualism and RSI in mind but
added an RSI feature onto their interface when demand for remote multilingual meetings peaked
during the pandemic. Consequently, the RSI features on these platforms are relatively limited. For
example, Zoom only added relay (i.e., when an interpreter interprets from a colleague's output
rather than from the original speaker, in cases where the interpreter doesn't work with the current
speaker's language) in spring 2022, and the Google Meet interpreting extension doesn't allow for
multiple booths. Features like a mute button, individual interpreter chats, a handover feature/button,
a timer, and an audio volume-control button are often missing from these platforms as well.

3. Standalone designated RSI platforms


This type of platform can host its own meetings but RSI is its raison d’être. Many examples of such
platforms can be seen in our Tech Atlas. The interpreter control panel is quite complete and often
aims to resemble that of an in-person booth as much as possible. There are two typical scenarios
for the use of designated RSI platforms:

1. The meeting takes place on the RSI platform in which all participants — speakers, attendees,
and interpreters — join the same platform.
2. The meeting takes place on a different platform, e.g. Zoom, but the interpretation happens
on the RSI platform. In this case, speakers and participants join the original meeting on Zoom
and only the interpreters log into the RSI platform. The original audio from the meeting (and
potentially the video) and the interpreter’s output are fed back and forth between the two
platforms.

In both cases, whether the meeting happens on the RSI platform or not, the interpreters are
typically not visible to the speakers and attendees (although some platforms can enable this upon
request). They act in the background, just as they would in an onsite meeting when they interpret
from a physical booth.

4. Virtual booth RSI platforms


Just like with a physical soundproof booth at an onsite meeting, these platforms function alongside
the original meeting taking place on a video conferencing platform. The two major distinctions
from standalone RSI platforms are that virtual booths do not integrate with video conferencing
platforms but, rather, run in parallel and that they don’t function as standalone meeting platforms.
When using this technology, interpreters join the original meeting on a video conferencing
platform so they can access the audio and video feeds directly. From there, the interpreters listen
to the speeches and deliver the interpretation into the original meeting. However, the interpreters’
rendition is also transmitted into the virtual booth tool alongside the original meeting, so that the
interpreters can listen to each other’s interpretation and take relay.

Interpreters and clients may prefer virtual booths to standalone RSI platforms for three primary
reasons:

1. The process of injecting the original meeting audio into an RSI platform has been known
to degrade sound or add artifacts. Therefore, because the interpreters are in the same
meeting as the participants and their audio is fed directly into the meeting, using a virtual
booth may reduce the risk of sound degradation in the injection process.
2. As the original audio is not fed into the virtual booth, the risk of copyright and privacy
issues is minimized.
3. This type of technology can potentially work with any video conferencing platform, virtual
event organizer, or streaming platform, making it quite versatile.

Which RSI solution is best for me?


The best RSI solution for you or your clients will depend on your requirements and budget. Here,
we have created a brief table outlining the different RSI solutions and their advantages and
disadvantages.
What do interpreters themselves think?
Conference interpreters were in large part forced to adopt RSI during the COVID-19 pandemic as
a means to keep their businesses afloat. A couple of years down the line, feelings are still mixed.
The audio quality of remote speakers is a major thorn in the side of RSI interpreters and, despite
repeated attempts to educate speakers, the problem persists. At the time of writing, in July 2022,
European Parliament interpreters are on strike and reducing interpreting for remote participants
due to the poor quality of their audio. On the flip side, other interpreters have expressed their
willingness to work from home, in comfortable conditions, without needing to travel left, right,
and center for conferences.

In terms of the best RSI solution for interpreters, the answers are manifold. Consultant interpreters
and interpreters who have their own clients largely tend to prefer being in the original meeting
when performing RSI. By being in the original meeting, the interpreter can be visible when needed
and can communicate directly with the client and participants. They don't have to hand the client
over to the RSI platform but can manage everything themselves, including the team of interpreters.
Being a participant in the original meeting is also useful for highlighting poor sound issues,
dealing with technical glitches, and even setting up the interpreter channels. However, it is important
to bear in mind that the interpreter is going above and beyond their role of being "just an interpreter"
in such cases. They are wearing multiple hats, including those of educator, technician, mediator, and
chief interpreter.

The opposite is also true. There are many interpreters who do not want to step outside their role of
interpreter. The situation explained in the previous paragraph adds tremendous cognitive load to
the interpreter, who is juggling these different roles, and it could have a negative impact on their
interpreting performance. In some cases, less tech-savvy interpreters prefer to just log onto the
RSI platform and do their thing, that is, interpreting, without having to worry about secondary
devices, sound mixers, setting up the meeting for participants, or booth channels for interpreters.
For the interpreters who prefer this scenario, a standalone designated RSI platform is probably the
best bet because there is no interaction with the client and no need to be in the original meeting
at all.
Innovation in virtual interpreting and multilingual meetings
As already alluded to earlier, there has been a lot of innovation on both the RSI and the overall VIT
scenes ever since their popularity went through the roof with the onset of the pandemic. The latest,
most relevant developments are discussed in this section.

Video conferencing platforms: Popular and catching up on the RSI front

In the 2021 Language Technology Atlas, we included Zoom for the first time. This year's Atlas also
includes Webex by Cisco as well as the Google Meet simultaneous interpreting extension. While
these products do not have interpreting as their raison d'être, they are the largest RSI platforms by
sheer number of meetings.

In many cases, clients prefer to use regular video conferencing platforms for all of their virtual
meetings, including multilingual ones. This is for a number of reasons, of which the most common
ones have been summarized below:

1. The client already has the platform license. It can be quite a lengthy process to validate the
use of a different platform for company meetings. So clients tend to prefer to stick to the
one they already have the license for and that they are familiar with.

2. Video conferencing platforms tend to have a lower price tag than designated RSI platforms.

3. These platforms are renowned for their ease of use for clients and participants. This is
particularly relevant for Zoom, which is easy to set up even with the addition of interpreters.

4. Most video conferencing platforms are relatively stable and don’t require excessive
bandwidth, meaning that they can even be used in places with lower internet speeds.

5. Event management features are often quite complete on these video conferencing
platforms.

While the interpreter experience may be better on designated RSI platforms, most clients don’t
prioritize this aspect. Their needs tend to focus on the points mentioned above.

It is the strength of video conferencing platforms in all of the other areas aside from RSI
that makes them so strong even on the RSI scene. As long as they have a functioning RSI
feature, they are incredibly competitive.

Rosemary Hynes, Interpreting Researcher, Nimdzi Insights

The fact that tech giants from outside the industry have not just taken note of the RSI boom but
acted on it is yet another confirmation of how strong the demand for multilingual meetings has
become. What started with a rudimentary feature on Zoom has taken on a life of its own, with new
announcements released all the time by different market players.

In spring 2022, Zoom added relay to its simultaneous interpreting feature, potentially making it more attractive to clients and interpreters alike. The feature isn't perfect, as it still requires a secondary device or a sound mixer to hear the floor and the booth partner, but it has removed one of the major difficulties of using Zoom for RSI: the complicated workarounds previously needed for relay.

Webex has quite a complete RSI feature, which includes interpreter handover, relay, and a volume mixer to listen to both the floor and the booth partner. The Google Meet RSI extension is perhaps the most primitive at the time of writing: it requires two Google Meet meetings to be open at the same time, and the interpretation is unidirectional, meaning that the interpreter cannot interpret back from the interpreted meeting into the original meeting if a participant has a question.

MS Teams released its own RSI feature in August 2022, which was both welcomed and frowned upon by different industry actors. On the one hand, it was hailed as a game changer by companies that have the Microsoft license and use MS Teams for their meetings. This is because multilingual meetings can finally take place on the platform without having to use a workaround or an integration with a separate RSI platform. On the other hand, as with all regular video conferencing platforms that have added an RSI feature, its RSI capabilities remain limited. For example:

• The interpreting booths are unidirectional, which means that interpreters cannot work in both directions (e.g., from Spanish into English and from English into Spanish), thereby obliging clients to book twice as many interpreters for a bilingual meeting.
• There is no relay feature, which is essential for large multilingual meetings.
• Interpreters cannot join as MS Teams users external to the client's organization.

The MS Teams RSI feature is still in its beta phase, so it is too early to tell how it will hold up in
real-life scenarios and what impact it will have on the market. The announcement made waves in the interpreting community and industry circles alike, and some even proclaimed that it could be the death knell for dedicated RSI platforms. However, here at Nimdzi we believe otherwise, for three main reasons:

1. MS Teams was not the first, but rather the most recent video conferencing platform to add
an RSI feature. In particular, Zoom’s presence in this space already changed the market
significantly — and the market adapted.
2. As outlined in an earlier section of this report, there is not just one RSI platform anymore. It's a big market that keeps expanding, and different RSI solutions are suitable for different clients. Many corporate clients like using regular video conferencing platforms for their multilingual meetings, but as outlined above, these platforms have their limitations. In other words: the more specialized the need, the more specialized the platform needs to be, which is where dedicated RSI platforms have the upper hand.
3. It’s as much about the clients as it is about the interpreters. More often than not, clients do
not know where to start when it comes to multilingual meetings. So interpreters are taking on
the role of consulting their clients on best practices and which platform to use. Considering
that interpreters typically prefer platforms that have a complete set of interpreting features,
are easy to use, and comply with safety measures related to audio quality, in such cases
regular video conferencing platforms often do not come out on top.

While it is too early to tell what the impact on the market will be, for the reasons mentioned above,
it is most likely that this latest addition to the RSI marketplace will make a rather small splash after
all.
Virtual booths and interpreter videos

We already mentioned the virtual booth in our section about the different types of RSI platforms.
Still, it deserves a mention here, as it is also one of the latest innovations on the RSI scene and one
that ties in with another development, namely interpreter videos.

Several studies, including ones by the UN, have found that remote interpreting is more stressful for interpreters than onsite work, and that this stress is worsened when the interpreters cannot see each other. Without a visual, the interpreters cannot tell whether their booth partner is present, paying attention during tricky speeches, or even ready to take over the microphone. Of course, the interpreters can message each other, but this adds to the already tremendous cognitive load of remote interpreting.

To overcome this issue, some platforms have provided their interpreters with video as well as
audio to enable them to see each other. This reassures the interpreters that their partner is present
and may also facilitate the handover through hand signals. While the booth partners can see one
another, the interpreters remain invisible to the meeting participants. This way the interpreters still
have the privacy of the booth, but with the added value of having visual contact with each other.

The latest RSI platforms tend to incorporate interpreter video into their interfaces, and we believe that the more established RSI platforms might also include this feature in future versions. In addition, the aforementioned virtual booths also enable interpreters in a virtual meeting
to see and speak to each other without being heard or seen by the meeting attendees, just like in
a physical interpreting booth.

Multilingual livestreaming

Up until this point, the scenarios described have all focused on regular, scheduled virtual meetings, events, or webinars. However, for a while now, livestreamed events and interviews have been picking up steam (see Nimdzi LIVE, for instance). Let's briefly clarify the difference between the two.

Webinars and regular virtual events or meetings typically focus on a smaller target audience of
invited or registered attendees. Livestreamed events, on the other hand, are designed to reach a
much larger audience and can be broadcast to potentially hundreds or thousands of viewers. They
also do not necessarily need to be scheduled in advance (although this is also possible) but can
happen spontaneously. In most cases, livestreams are broadcast to a company’s or person’s social
media and YouTube channels.

The majority of RSI solutions on the market today are focused on the first group — scheduled events. However, tech providers in the VIT space have started to recognize the potential of bringing RSI to livestreamed events. The main challenge is to ensure that the original audio and video as well as the interpreter's audio output are well synchronized, so that there is minimal to no delay.

Akkadu, a China-based RSI platform, appears to be ahead of the game in this regard. Akkadu uses RTMP (Real-Time Messaging Protocol) technology to synchronize the different audio and video feeds so that audiences get the full experience without a delay between the video and the interpretation audio.

To further improve its livestreaming capabilities, Akkadu has recently released its Video Player. This is a livestreaming video player that can simply be embedded into a client's webpage using an iFrame. In this case, the original video and interpretation audio have already been synchronized by Akkadu behind the scenes using RTMP technology and packaged into an easy-to-use video player for the client. The video player on the client's webpage can livestream the original meeting, and listeners can choose which language to listen to and even access chat in the video player. It is therefore easy to set up, while also being customizable and already synchronized.
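
To make the embedding model concrete, here is a minimal sketch of how such a player could be dropped into a webpage with a few lines of TypeScript. The player URL, query parameter, and element ID are invented placeholders for illustration only and do not reflect Akkadu's actual embed API.

    // Hypothetical sketch: embedding a synchronized livestream player via an iFrame.
    // The URL, query parameter, and element ID below are invented placeholders
    // and do not reflect Akkadu's actual embed API.
    function embedLivestreamPlayer(containerId: string, eventId: string): void {
      const container = document.getElementById(containerId);
      if (!container) {
        throw new Error(`No element with id "${containerId}" found on the page`);
      }
      const iframe = document.createElement("iframe");
      // The player page serves video and interpretation audio that the provider
      // has already synchronized server-side; viewers pick a language in the UI.
      iframe.src = `https://player.example.com/live/${eventId}?defaultLang=en`;
      iframe.width = "960";
      iframe.height = "540";
      iframe.allow = "autoplay; fullscreen"; // permissions the embedded player needs
      iframe.style.border = "0";
      container.appendChild(iframe);
    }

    // Example usage on the client's page:
    // embedLivestreamPlayer("player-container", "town-hall-2022");

Because all of the synchronization happens on the provider's side, the client's web team only needs to host this one element, which is precisely what makes the approach easy to set up.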

The multilingual meeting provider

Both in this year’s Nimdzi 100 and in last year’s Interpreting Index we wrote about the need for
a Multilingual Meeting Provider (MMP). Many companies already describe themselves as
facilitating multilingual meetings — and they do. However, what we typically see in the market is companies offering RSI, VRI, live captioning, or machine interpreting, or a combination of a few of these services. Interviews with market players show that the needs of clients are shifting, and buyers are increasingly looking for a provider that can do it all.

Clients don’t want to go to one company for their interpreting needs, to another for translation
and again to another for captioning — and potentially all for the same event. Especially as
new meeting formats are emerging all the time and from clients who never used language
services before. These kinds of clients do not want to have to think about the complexity of
language services. What they want is one provider that can facilitate all their requirements
for multilingual meetings and events in the virtual and the physical world.

Sarah Hickey, VP of Research, on behalf of Nimdzi Insights

When we wrote about this before, we asked who was going to fill this gap and in what way (through
partnerships, acquisitions, building or buying new tech, adding services, etc.). Since then, Nimdzi
has identified two solutions which may provide an answer to this question and which have started
filling the MMP gap in two different ways.
Bridge GCS

Bridge GCS describes itself as a virtual events platform that enables immersive and interactive
experiences. The platform was designed with multilingualism in mind and offers a myriad of language solutions, such as RSI, multilingual closed captioning, real-time translation of chats (using MT), and AI-powered subtitling. It also offers a localized interface in ten languages, including localized waiting room messages, automatic emails, and data analytics.

However, Bridge GCS does not just focus on multilingualism but is also very strong on the event
management front, providing features such as real-time analytics, RTMP streaming, backstage
communication, breakout rooms, integrations with social media as well as CRMs, direct uploading
of videos into meetings, and Q&A moderation.

When the platform was built, it was designed with different roles in mind — interpreter, technician,
moderator/presenter, event planner, host, and participant — and provides corresponding features
that cater to each one.

This mix of features enabling a multilingual experience (RSI, captioning, subtitling, chats with MT), paired with its superior event management capabilities, is the reason we consider Bridge GCS to be on the road to becoming a true MMP.

vSpeeq

vSpeeq is another company that focuses on facilitating multilingual events. However, the company
is neither an RSI platform nor an event management platform. Instead, vSpeeq describes itself as
an ecommerce platform for language services. The main purpose of vSpeeq is to let clients quickly select and purchase the language services they need for an upcoming event, and in this respect the user experience really does resemble that of ecommerce platforms such as Amazon. Clients can simply select the language service they want from the website, add it to their cart, and pay for it online. vSpeeq then handles the rest in the background and facilitates the client's language needs on the day of the event and on the platform the client chooses.

For now, vSpeeq predominantly offers translation services and RSI and has its own pool of
translators and interpreters. However, vSpeeq plans to expand its offerings to create a one-stop shop for language services.

It is important to stress again that vSpeeq is not an RSI platform — the company provides the
service, but not the technology. For the RSI technology, vSpeeq has partnered with a third party.
This is why we do not list vSpeeq in the RSI section of our Atlas but in the “Platform LSP” category
instead.
The first crossover acquisition
In a first-of-its-kind acquisition, Boostlingo announced on March 23, 2022, that it had partnered with Interpreter Intelligence (an interpreter management and scheduling platform) and VoiceBoxer (an RSI platform). This shows us two things:

1. The VIT market is growing and maturing. By acquiring Interpreter Intelligence, Boostlingo
has bought its main competitor. Interpreter Intelligence had gained a solid reputation as an
IMS platform and had also added its own OPI and VRI solutions. The decision to partner
marks another defining moment for the VIT industry, as it’s the first time that one VIT
provider has bought another.

2. RSI has become so popular that tech and service providers from other parts of the interpreting
market have realized that it’s time to get a piece of the RSI pie.

In the past, RSI moved in its own circles far removed from VRI, OPI, and other areas of
interpreting. This is because RSI typically comes with a very different client base, born
out of the conferencing sector. Now, however, we can see those circles starting to overlap
and new frontiers are on the horizon. Tech providers in the interpreting world are asking
themselves ‘Should we invest in developing new software ourselves or buy a competitor
who is already an expert in this area?’

Rosemary Hynes, Interpreting Researcher, Nimdzi Insights

The acquisition of VoiceBoxer by Boostlingo, therefore, is a smart move and will help the company stay competitive at a time when RSI is branching out of its niche into the mainstream. The
acquisition is also a first example of how RSI is gradually being considered a standard interpreting
service, needed to complete the full package on offer.

Interpreting for telehealth continues to grow

In other Nimdzi publications, we already reported that demand for remote interpreting in the healthcare sector increased by more than 50 percent once patients were asked to call before making an in-person visit due to COVID-19-related safety measures.

Remote interpreting in healthcare is certainly nothing new. However, the lockdown restrictions
and spike in requests created new challenges. For example, before March 2020, VRI typically only
required two channels — one for the interpreter and one for the doctor and patient, who usually
were in the same room. However, once lockdowns hit, the situation shifted to all three parties
typically being in different locations. This created a need for VRI with three-way call capabilities,
which presented a technical challenge for established providers.

In addition, companies offering VRI or OPI reported a spike in requests from telehealth vendors looking for ways to integrate interpreting into their own platforms. So, in this segment of the interpreting industry too, the race for integrations began.

This trend is confirmed by investment and mergers and acquisitions in this field. For instance, AMN
Healthcare — a multi-billion dollar company in the healthcare staffing industry — acquired Stratus
Video in February 2020 (rebranded to AMN Language Services). The acquisition and subsequent
integration of remote interpreting services into AMN’s telehealth platform happened just in time
for the pandemic, which gave the company a competitive advantage once lockdowns hit.

In a similar deal, UpHealth — a telehealth service provider — acquired Cloudbreak Health and its
remote interpreting solution Martti in June 2021. The acquisition of Cloudbreak allowed UpHealth
to integrate remote interpreting services into their platform, thereby expanding their reach to
include people of all language backgrounds and thus increasing their value proposition to existing
and new clients. Prior to the acquisition by UpHealth, in February 2020, Cloudbreak had already
received USD 10 million in funding.

Last but not least, Jeenie, a US-based VRI platform, raised USD 9.3 million in a Series A funding
round on March 31, 2022. The company specializes in healthcare interpreting.

What all of this shows us is that the demand for interpreting services in the telehealth field
continues to grow. The pandemic created the framework for people who resisted remote
interpreting to embrace it. And now that the genie is out of the bottle, and providers, clients,
and interpreters alike have embraced remote interpreting and are well set up for it, it is
hard to go back.

Sarah Hickey, VP of Research, on behalf of Nimdzi Insights

The use of AI for interpreting

Not much has changed on the machine interpreting front since our 2021 edition of the Language
Technology Atlas. However, there are a few things worth mentioning and a few others worth repeating.

Machine interpreting

Some people will insist that machine interpreting does not exist but that it should exclusively be
called speech-to-speech translation because the output is not the same as if an interpreter were to
assess a speech and give their rendition in another language. Those people are not wrong. However,
couldn’t we say the same about machine translation? It’s not the same as human translation and
yet we have come to accept the term. The distinction is already being made by adding the term
“machine.”

Whatever you decide to call it, machine interpreting or speech-to-speech translation has come a
long way. At this point in time, machine interpreting solutions are already ready for use — just not
for all use cases. What the machines are good at is processing pure information, so assignments
that are more technical in nature or require less nuance are optimal. What the machines are not
(yet) good at is conveying emotion, irony, or tone, as well as transferring gender from one language
to another. This is where the expertise of human interpreters is required, at least for the foreseeable
future.
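
For context, most current speech-to-speech systems are built as a cascade of three components: speech recognition (ASR), machine translation (MT), and speech synthesis (TTS). The sketch below outlines this generic flow with a hypothetical placeholder interface rather than any vendor's actual API.

    // Conceptual sketch of a cascaded speech-to-speech translation pipeline:
    // speech recognition (ASR) -> machine translation (MT) -> speech synthesis (TTS).
    // The interface is a hypothetical placeholder, not a real vendor API.
    type AudioChunk = Float32Array; // raw audio samples

    interface SpeechToSpeechPipeline {
      transcribe(audio: AudioChunk, sourceLang: string): Promise<string>; // ASR
      translate(text: string, sourceLang: string, targetLang: string): Promise<string>; // MT
      synthesize(text: string, targetLang: string): Promise<AudioChunk>; // TTS
    }

    async function interpretChunk(
      pipeline: SpeechToSpeechPipeline,
      audio: AudioChunk,
      sourceLang: string,
      targetLang: string
    ): Promise<AudioChunk> {
      // Each stage only sees the output of the previous one.
      const transcript = await pipeline.transcribe(audio, sourceLang);
      const translated = await pipeline.translate(transcript, sourceLang, targetLang);
      return pipeline.synthesize(translated, targetLang);
    }

Because the middle step operates on plain text, paralinguistic information such as tone, irony, and emotion is stripped out at the text bottleneck, which helps explain the limitations described above.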

Two years ago we wrote that the majority of machine interpreting solutions on the market targeted individual consumers, e.g., in the form of handheld devices for tourists. While this is still true, it appears that the tide may be (slowly) turning. The B2B market for machine interpreting solutions is growing. This is evidenced, for example, by the fact that Wordly has added a Zoom integration for multilingual captions.

US-based company Wordly predominantly provides speech-to-text and speech-to-speech translation for conferences. The latest development now shows that RSI providers are not the only
ones catching a ride on the Zoom boom. It also brings us back to the heightened interest from
buyers of all industries to make virtual meetings more accessible — in many different ways and
on different budgets. And all of this is bringing the language industry closer to the focus of mainstream consumers.

Computer-assisted interpreting (CAI) tools

This section wouldn’t be complete without the mention of CAI tools, which continue to be one of
the “hottest” new developments on the RSI scene.

The purpose of CAI tools is to act as a form of AI booth mate for interpreters performing (remote) simultaneous interpreting. They allow interpreters to extract terminology and build their glossary within seconds. During an assignment, they can also call out figures and names, and instantly convert units (e.g., for measurements and currencies). The goal is to make the interpreter's preparation time more efficient and to ease the cognitive load during the assignment.
What the future holds
Regardless of turbulent world events, the market forecasts for language technology are generally quite promising. There is a lot of interest in this sphere from an audience broader than the usual localization crowd — from everyday users to major investors.

The big IT narrative around language technology is well captured in Meta's July statement: "Translation is one of the most exciting areas in AI because of its impact on people's everyday lives."

With the rise of AI, NLP, and MT, language technology is no longer perceived as a peripheral area of the language industry. That is one of the reasons why many other language technology matrices, catalogs, platforms, and lists are emerging here and there — in addition to Nimdzi's Atlas. This trend itself can be considered indicative of the ever-increasing interest in the language technology arena. As visibility is essential to informed decision-making, we are glad to see the increasing popularity of the subject as well as of the Nimdzi Language Technology Atlas itself.

Even though we publish our free report annually, the data contained in the Atlas infographics, much like the market itself, is subject to change. Therefore, we update the visuals more often than once a year. So let's stay in touch: don't hesitate to reach out to us at [email protected] and tell us about your favorite localization tools and new language technology solutions that should be on everyone's radar. Let's join forces to properly track how the language technology landscape evolves in the years to come.
Copyright 2022 Nimdzi Insights.
All rights reserved.
