We’ve come across so many informative infographics and handy interactive tools on the web recently that we felt it was time to share and celebrate them.
Here’s a beautiful visual representation of the relative proportions of native speakers of the world’s largest languages.
Created for the South China Morning Post, this represents the relative numbers of native speakers for languages that have more than 50 million speakers. Whilst there are over seven thousand languages in the world overall, only 23 have more than 50 million native speakers. These 23 are represented by the sections with thick black borders.
But this visualization also represents where those native speakers are located. English makes up a sizeable yellow chunk of the diagram. This section is further split into different areas representing the native speakers in countries such as the US and UK. What’s particularly interesting is how many native speakers there are in places such as Sierra Leone and Malaysia.
Compare this to languages such as Japanese, Marathi and Telugu, all of which show far less regional spread. The way the English language has established itself in other countries betrays Britain’s colonial past.
Spanish – another language spread by an active imperial past – is also widespread. Although Japan’s colonial history included some language dissemination, such as in Taiwan, it did not produce anything like the number of speakers gained by other colonial languages such as English.
There’s also a strong political history behind the stark line bisecting the Bengali language between the speaker populations in India and Bangladesh.
It’s a diagram that reveals a great deal about the distribution of the world’s major languages. What it doesn’t show is how dominant these 23 big languages truly are.
More than half the world’s population counts one of these 23 major languages as their mother tongue. But when it comes to learning languages, even these big languages aren’t represented proportionally. English is being learnt by a huge number of students (1.5 billion in total), whilst ‘only’ 30 million students are studying Chinese.
The Language Tree
This stunning linguistic family tree published on sssscomic.com using data from ethnologue.com may not be comprehensive or without controversy but it’s certainly beautiful.
It focuses on the Indo-European family and the Uralic family, and aims to represent (via the relative size of the different ‘branches’) the volumes of native speakers that were thought to have existed prior to the year 0.
Entire language groups are missing from this tree, including the Semitic languages, such as Arabic and Hebrew. It also excludes ‘dead’ languages such as Ancient Greek and Latin, and other smaller languages. Whilst it would be fascinating to see how the rest fit in, including all the world’s languages would turn this tree into a forest!
The relationships between the languages that are shown on the tree are inevitably rather controversial – many people might debate how the Catalan, Romansh, and Sardinian languages should be positioned relative to one another. That’s a question to leave to the linguists, who’ll enjoy the discussion. For our purposes, it’s best to take this visualization at face value.
Key takeaways as far as we’re concerned include the way Armenian and Greek each sit on their own independent branch, and another reminder that Hungarian and Finnish share common roots.
What’s also interesting is that, in these early days, Spanish and English had larger speaker populations than Bengali and Hindi. All four languages still endure today with roughly similar proportions in terms of their relative speaker populations.
We’re pretty impressed with Langscape, a really handy resource for language learners.
Langscape is another attempt at visualizing language diversity, combining geographical mapping of language distribution with knowledge resources that include demographic information about the speaker population and audio recordings of how the language sounds.
What’s particularly interesting about Langscape is that it shows the language distributions within political boundaries, so a good understanding of the linguistic complexity of a country can be easily obtained.
Areas such as India, and the entirety of sub-Saharan Africa, show as patchwork: a pattern that reflects how fragmented the language landscape is in these regions. By contrast, areas such as Northern Europe are strikingly homogeneous when it comes to language.
Companies thinking about entering a market such as China or Brazil may find it helpful to see how their customers are likely to be distributed in terms of mother tongue language before they develop their localization strategies.
Maps such as Langscape are helpful as they demonstrate the fallacy of thinking merely in terms of political markets: many of these languages spread across political borders rather than stopping at them. Brands thinking of entering new markets may find it helpful to think of targeting particular groups of language speakers rather than customers of particular nationalities. That’s especially true in complex places such as China, where language is just one way of segmenting this huge market.
Langscape helpfully illustrates the size of the population of language speakers if you click into each language and scroll down the page. This is something that isn’t easily understood by looking at the geographical distribution because population density varies so much.
But the map doesn’t tell the whole story when it comes to the language capabilities of the various populations. Consumers in linguistically diverse markets such as India are likely to have capabilities across multiple languages – far more than in markets such as the US and UK, where people are less likely to speak more than one language.
Whilst your customers are likely to be more comfortable operating in their mother tongue, it’s important to research what the local audience’s language capabilities are before you enter a market. For instance, many Indians might not have mother-tongue English but they may be willing to engage with your business in this language.
This map makes some small efforts towards portraying how languages can be internally divided into different dialects and versions. Punjabi – ranked tenth among the world’s most widely spoken languages – is here portrayed as being split into Eastern and Western versions (also known as Gurmukhi and Lahanda). The latter has about twice as many speakers.
However, there are many more recognized dialects of Punjabi than this map reflects. Some of these dialects are arguably languages in their own right. The Langscape map doesn’t reflect the true picture of diversity even within the Punjabi language.
Brands engaging with new language audiences would be wise to seek advice on the true language picture of their audience before they dive in. But they need to be aware that the language that is spoken in a particular region is not the only consideration when expanding abroad.
Population size, relative levels of wealth, internet penetration, social media usage and search engine market share will also have an impact on a brand’s ability to succeed in foreign markets and determine, to a certain extent, the strategies employed to enter these regions successfully.
We were inspired by the many interactive tools online and have developed the International Business Case Builder to allow brands to determine which countries to target and which channels and social networks to feature in their international marketing strategy.