How to Manage an Audio Localization Project


Voice recognition is getting popular. Whether it’s the hot new ‘home hub’ gadgets such as Amazon Echo, digital assistants such as Siri or the increasing popularity of voice search, voice-first technology is on the rise. By 2020 it’s thought that 50% of all searches will use voice, and smart home hubs are now the fastest-growing consumer technology. It seems like it’s time for brands to respond to the new consumer expectation to use voice-first technologies in everyday life.

But the human voice (or an AI imitation of it) isn’t just being used in assistants such as Alexa. It is still used in training materials, radio ads, recorded safety announcements and many other applications.

The rising popularity of video content in social media has led to an increased demand for voiceover services. With all kinds of products increasingly being sold to international audiences, it’s becoming more common for audio materials to be translated and localized for new audiences.

Audio localization projects are usually pretty complex to manage, particularly if you’re working across multiple target markets at once. Transcreating the original script is just one small part of the challenge.

For some projects, you’ll need to carefully tailor the timing of the audio recording to match up to other actions, such as a critical point in an instruction video. That can be hard to do in other languages, particularly if things need to be expressed in a longer or shorter sentence than the original recording.
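As a rough illustration of the timing problem, the sketch below flags translated lines that are likely to overrun the time available for them in a video. It is a minimal sketch under stated assumptions, not an established industry tool: the `Segment` structure, the function names and the default rate of 150 words per minute are all hypothetical, and counting words only works for languages that separate words with spaces (it would not suit Chinese or Japanese scripts, for example). A real voice actor's pace will vary, so treat the result as a prompt to re-check the script rather than a measurement.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str            # hypothetical name for a line or scene in the script
    slot_seconds: float   # time available in the video for this line
    text: str             # translated script line


def estimated_seconds(text: str, words_per_minute: float) -> float:
    """Rough spoken duration from a word count and an assumed speaking rate."""
    return len(text.split()) * 60.0 / words_per_minute


def overruns(segments, words_per_minute: float = 150.0):
    """Return (label, estimated seconds, slot seconds) for lines likely too long."""
    flagged = []
    for seg in segments:
        est = estimated_seconds(seg.text, words_per_minute)
        if est > seg.slot_seconds:
            flagged.append((seg.label, round(est, 1), seg.slot_seconds))
    return flagged


if __name__ == "__main__":
    script = [
        Segment("intro", 3.0, "uno dos tres cuatro cinco seis siete ocho nueve diez"),
        Segment("step-1", 5.0, "uno dos tres cuatro cinco seis siete ocho nueve diez"),
    ]
    for label, est, slot in overruns(script):
        print(f"{label}: about {est}s of speech for a {slot}s slot")
```

Running a check like this on the translated script before booking studio time gives the translator a chance to shorten a line while edits are still cheap.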

If you’re taking a stance on how particular words are pronounced, you’ll also need to provide a guide to help voice actors get it right. An example might be your brand name or any industry-specific terminology that a non-specialist voice actor may not know how to pronounce.

Preparing your voice recording team with a thorough brief will be crucial to the success of your audio localization project. The brief should detail how particular words or phrases are pronounced and where the audio needs to match visual elements of a video.

It’s a very good idea to have the script thoroughly reviewed by a fresh pair of eyes and a native speaker. Do this before making the recording, as it’s much easier to make edits at the paper stage. Ultimately, this approach is the most cost-effective.

It’s well worth paying for professionals to do your audio recording, both in terms of the voice actors you use and the recording team. Professional sound engineers are not only likely to have better equipment (and understand how to use it) but they can also clean up any recordings so you get more professional-sounding audio quality.

It’s also worth thinking about rights payments. Bear in mind where and for how long you will use the content when deciding whether to buy the rights in perpetuity or to purchase them with time or platform limitations.

Growing need

Because audio content is increasingly popular, we’re likely to see a growing demand for voice localization services. The fast-moving nature of social media content and the technology industry means that there’s also intense pressure to turn audio localization projects around very quickly. Subtitling, while not itself audio, is also a very common way of supporting the dialogue in video content, and it can reduce costs compared with voice-over.

There are some trends afoot to personalize video content to a greater extent than is now common. An example would be if a brand made the same video 10 times with a slight change each time to adapt it to a different market or audience.

This isn’t yet common practice, and it’s still too expensive for many brands, but personalization seems to be the direction content marketing is heading in. If this trend takes off and users come to expect more personalized video content, it is likely to have a real impact on the size of the average localization project.


Growing demand isn’t always matched by improved supply, particularly when language skills are needed. There’s a shortage of translators in some languages and some language pairs are particularly hard to recruit good linguists for.

It’s worth bearing this in mind if you’re planning voice localization in less common language pairs. You may need to plan further ahead to allow time to recruit the appropriate voiceover artist or find a reputable supplier to work with.

Technology is already helping to make up some of the shortfalls. In a few years’ time, we may see AI playing more of a role in audio localization. As machine translation improves, and artificial voices start to sound more human and natural, there may be simple AI alternatives to the current model of recruiting human actors and localization teams.

Subtitles are known to increase video viewing by 40%, so AI could also be used to help create multilingual subtitles. Content on platforms such as YouTube and Facebook is often viewed on mobile without sound, or watched by users who may not be native speakers of the language the video was created in.

Getting accents right

It may be hard for Brits to believe it, but some parts of the world actually aren’t obsessed with people’s accents. It’s been said that UK accents are so granular it’s sometimes possible to place people’s accents and dialects to within two miles.

Some Welsh valleys claim to have dialects and accents distinct from those of the next valley. Anyone raised in the UK tends to have strong associations with many different accents. As comedian Bill Bailey points out, you never hear a war correspondent with a West Country accent.

There’s a massive slice of class awareness behind much of the UK’s internal language discrimination. But there are other associations too, some positive and some negative.

The Yorkshire accent is apparently perceived to be the most intelligent-sounding one, and Scottish accents are seen as friendly and honest, but there are biases against accents from Birmingham and Liverpool. These biases affect how voice actors are cast for roles and also influence decisions such as where call centers are located.

Although the Scottish accent can be difficult to understand, it’s widely considered to be the most friendly and trustworthy. So much so that Scotland is said to be turning into a call center nation.

Other parts of the world are far less attuned to accents than the UK, although associations still exist. For example, the traditional Beijing dialect and its quirks of pronunciation are seen as rather comic by other Chinese speakers, perhaps not dissimilar to the way a traditional Cockney accent would be perceived and portrayed in the UK’s media.

Accent is something to factor in to audio localization projects. But there are also other subtleties. Some cultures tend to see women’s voices as having less authority, while others might have greater age bias. Some parts of the world have a far greater tolerance for cutesy voices.

It’s important to understand the expectations around accents and regional dialects when you’re casting for voice talent. It’s also critical to bring in a team with cultural understanding of the target market and audience to work on your audio localization project.

Written by Demetrius Williams
Demetrius Williams is a Digital Marketing Specialist at TranslateMedia and has previous eCommerce experience working with a number of luxury brands in the fashion and beauty industry. He enjoys photography, binge-watching Netflix and can often be found roaming around London with a camera in his hand.