Five Things You Didn’t Know about MT

Machine translation is arguably the trendiest topic in the localisation industry at the moment. Virtually any piece of news coming from the industry contains at least a mention of it. Yet even with so much information available, some aspects of machine translation might still be completely new to you.

1. Machine translation is not as new as you might think

Before you read on, try to quickly guess the year in which machine translation was born.

Widespread excitement about MT started with the rise of neural machine translation in 2016. That is when MT started getting a lot of attention, so anyone who guessed that MT was born in the last 2-3 years cannot be faulted.

In 2016, Google published its paper boldly entitled “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”, ambitiously suggesting that the era in which translations produced by humans and machines are indistinguishable had arrived.

But that is not where machine translation began.

Machine translation has been around since the 1950s, or more specifically since the beginning of the Cold War. At that time, the US Army did not have enough staff who could read and understand Russian. They needed a way of translating large volumes of Russian-language intelligence on a daily basis, so the US Army asked IBM for help.

In response to those needs, the Georgetown-IBM experiment took place at the IBM headquarters in New York in 1954. As a result of the experiment, for the first time in history, a computer, the IBM type 701, automatically translated 60 Russian sentences into English.

At that time, IBM famously said that “we’ll have perfect machine translation within five years”. We all know how that turned out…

2. A new type of business is here

The rise in machine translation’s popularity has raised questions about how relevant human translators will be to translation processes in the years to come, and about the general shape of the profession.

But we should not forget that machine translation engines could not exist without human input. Before they can produce any translations, they need to be trained on large bilingual corpora. And those corpora usually consist of translation memories: source sentences stored along with their translations produced by human linguists.

Without that input, machines would have no means of learning language patterns and, as a result, producing translations.
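To make the idea of a translation memory as training data more concrete, here is a minimal Python sketch. It assumes a translation memory can be modelled as a list of source and target segment pairs; the `TranslationUnit` class, the `to_parallel_corpus` helper and the English-German sample segments are all hypothetical, made up for illustration rather than taken from any real tool or dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationUnit:
    """One translation memory entry: a source segment and its human translation."""
    source: str
    target: str


def to_parallel_corpus(units):
    """Split translation units into two aligned lists (source side, target side).

    Line i of the source list corresponds to line i of the target list,
    which is the shape of bilingual data an MT engine is typically trained on.
    Units with an empty side are skipped.
    """
    sources, targets = [], []
    for unit in units:
        if unit.source.strip() and unit.target.strip():
            sources.append(unit.source.strip())
            targets.append(unit.target.strip())
    return sources, targets


# Hypothetical sample entries (English -> German), for illustration only.
memory = [
    TranslationUnit("Thank you", "Danke"),
    TranslationUnit("Good morning", "Guten Morgen"),
    TranslationUnit("Untranslated segment", ""),  # skipped: no translation yet
]

sources, targets = to_parallel_corpus(memory)
for src, tgt in zip(sources, targets):
    print(f"{src} -> {tgt}")
```

Real translation memories usually carry extra metadata and live in dedicated exchange formats, but the aligned-pairs idea is the part that matters for training.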

The increasing need for the good-quality bilingual resources required to create and train machine translation engines has led to the emergence of a completely new type of business.

Companies that specialise in creating these extensive language resources and selling them to other businesses, which can then use them for machine learning purposes, are quickly gaining popularity.

A great example is Flitto, a South Korean platform where language data is collected by means of crowdsourced translation. Flitto names Baidu, Tencent, Microsoft and Expedia among its clients.

3. Google translates 143 billion words a day 

Google Translate launched in 2006. Little did anyone know how popular it would become in the following years. Whether it is due to its extensive language coverage or the constant enhancement of its features and capabilities, Google CEO Sundar Pichai reports that the app translates a jaw-dropping 143 billion words every day. That points to heavy usage by consumers all around the world.

That number encompasses all language combinations available in the app. Translating the same number of words would take humans just under 318 million hours of round-the-clock work. This equals about 13.2 million days, or roughly 36,300 years.

We can all agree that humans would just not have the patience or capacity to process that kind of volume. Thankfully, human linguists can instead take care of the mission-critical translations that would not be suitable for machines.
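As a rough check of that arithmetic, the Python sketch below converts a daily word count into human working time. The rate of 450 words per hour is an assumption inferred from the hour figure above, not something stated in the article, and the conversion treats a day as 24 hours of continuous work.

```python
WORDS_PER_DAY = 143_000_000_000  # daily volume quoted in the article
HUMAN_WORDS_PER_HOUR = 450       # assumption: inferred, not stated in the article


def human_effort(words, words_per_hour):
    """Return the (hours, days, years) one human would need to translate `words`.

    Days are 24-hour days and years are 365-day years, i.e. non-stop work.
    """
    hours = words / words_per_hour
    days = hours / 24
    years = days / 365
    return hours, days, years


hours, days, years = human_effort(WORDS_PER_DAY, HUMAN_WORDS_PER_HOUR)
print(f"{hours / 1e6:.1f} million hours")
print(f"{days / 1e6:.1f} million days")
print(f"{years:,.0f} years")
```

Changing `HUMAN_WORDS_PER_HOUR` shows how sensitive the result is to the assumed human translation speed.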

4. Machine translation lets you work smarter, not harder

If you are a linguist, you are constantly searching for tools that help you work smarter, not harder. You’ve got your CAT tool that leverages translation memory, you’ve got your terminology management tools that help you find the right terms quickly, and you might even have some project management tools in place that help you keep on top of the several projects you are involved in.

Using these tools together falls under the recently coined concept of augmented translation. Another way to augment translation processes, and as a result increase daily word throughput, is to use machine translation.

TAUS estimates that a human linguist translates roughly 400 words per hour, which gives 2,800 words in a 7-hour working day.

Skilful use of machine translation on relevant types of content can increase working speed by 100%.

Using machine translation suggestions with an adequate level of post-editing in CAT tools, especially in segments containing strings of numbers, dates and well-structured short sentences, can make a considerable difference.

Some language service providers report that, thanks to the use of machine translation in certain language combinations, their linguists are able to reach over 1,000 words per hour.
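The throughput figures above can be put side by side with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes the speed-up applies evenly across the whole 7-hour working day.

```python
BASELINE_WORDS_PER_HOUR = 400  # TAUS estimate quoted in the article
HOURS_PER_DAY = 7              # working day used in the article


def daily_throughput(words_per_hour, hours_per_day=HOURS_PER_DAY):
    """Words translated in one working day at a constant hourly rate."""
    return words_per_hour * hours_per_day


def with_speedup(words_per_hour, increase_pct):
    """Hourly rate after a percentage increase (100 means twice as fast)."""
    return words_per_hour * (1 + increase_pct / 100)


scenarios = {
    "human only": BASELINE_WORDS_PER_HOUR,
    "MT + post-editing (+100%)": with_speedup(BASELINE_WORDS_PER_HOUR, 100),
    "reported by some LSPs": 1000,
}

for label, rate in scenarios.items():
    print(f"{label}: {rate:.0f} words/hour, {daily_throughput(rate):.0f} words/day")
```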

5. There is no single kind of machine producing translations

The term “machine translation” itself can be misleading because it is not specific enough. There isn’t one single kind of machine, and it is important to choose the type of machine translation technology that is best suited to your needs.

Machine translation can be rule-based, statistical or neural. It can also be hybrid, which means it combines the strengths of two different frameworks for better translation results.

Machine translation can be generic – not trained to handle any specific type of content, like the publicly available Google Translate platform. But it can also be adaptive, which means it automatically learns from any corrections made by post-editors in real time and adapts to the content being translated by leveraging client- or subject matter-specific translations stored in translation memories.

In short, there are multiple aspects of machine translation that make each solution and provider different. The choice depends on the language combinations, the desired level of quality and the subject matter of the source content.

There is always something new to discover, especially in such a fast-moving area as machine translation, which is bound to grow even further while continuing to bring out the best of human skills and technology.
