Languages are dying, but is the internet to blame?
Meanwhile, the internet, much like the written word, struggles to reflect the linguistic diversity of the spoken word. According to the UN, just 500 languages are used online, with Google search supporting 348, Wikipedia 290, Facebook 80, Twitter 28 and LinkedIn 24. While those numbers have increased in recent years, they now appear to be plateauing. In terms of the number of languages it supports, the internet is reaching saturation.
Despite an increasing diversity of languages online, English is still dominant. Of the 10 million most popular websites, 55.2 percent are in English. The only other languages to make much of a mark are French, German, Japanese, Russian and Spanish -- each accounting for between 4 and 5.8 percent of those websites.
The internet's western bias is best understood when considering Hindi, a native language for some 310 million people worldwide. Although Hindi is the fourth most spoken native language, less than 0.1 percent of the 10 million most popular websites are in Hindi. Hindi's problem is not unique; the Chinese languages, grouped together, are used on just 2.4 to 2.8 percent of websites, despite having an estimated 1.2 billion native speakers.
And it isn't just the words -- the symbols used to write words online are also problematic. Many of the internet's estimated 3.2 billion users can't read or understand Latin text, so www.facebook.com is meaningless and hard to remember. The creation of the Internationalised Domain Name system, which allows for web addresses to be displayed in Arabic, Chinese, Cyrillic, Tamil, Hebrew and others, is starting to make the internet more easily understandable around the world. It might seem trivial, but only having domain names available in the Latin alphabet skews the balance of power in favour of old, established economies.
Domain names that don't use the Latin alphabet, such as this one in Greek, make navigating online easier for non-English speakers
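For readers curious about what happens behind such an address, here is a minimal sketch of the conversion an internationalised domain name goes through. It assumes Python's built-in "idna" codec, which follows an older version of the standard than current browsers, and it uses a made-up Greek name (roughly "example.test") rather than any address mentioned in this article.

```python
# Illustrative only: a hypothetical Greek name, not a real site.
unicode_name = "παράδειγμα.δοκιμή"

# Python's built-in "idna" codec converts each non-ASCII label into
# an ASCII-compatible "xn--" form (Punycode) that the DNS can carry.
ascii_name = unicode_name.encode("idna").decode("ascii")

# Decoding reverses the step, which is how software can show the
# native-script address back to the reader.
round_trip = ascii_name.encode("ascii").decode("idna")

print(ascii_name)                  # the ASCII form sent over the network
print(round_trip == unicode_name)  # the two forms name the same address
```

The point of the sketch is that the native-script name and its ASCII form are two views of the same address: the user reads and types one, while the underlying network, built around Latin characters, carries the other.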
Understanding how best to represent linguistic diversity online remains a major challenge. Facebook estimates that making the internet useful for 80 percent of the world's population requires content to be available in only 92 languages. Wikipedia is getting close, with 52 languages supporting 100,000 articles or more. But it would be wrong, not to say pointless, for the internet to support every single living spoken language.
According to Giuseppe Longobardi, a professor in the department of language and linguistic science at the University of York, the internet is ill-suited to support small, already dying languages. He explains that many of these languages are spoken only by small village communities whose members would never use the internet to communicate with other native speakers.
"Hindi can expand its use on the internet, but it is not in danger of extinction," Longobardi tells WIRED. "Several Brazilian or Australian native languages are in danger, but cannot be rescued by the internet because their speakers are already too few to productively visit websites."
For Longobardi, language diversity online is a reflection of how a multilingual world communicates. Small languages will remain small, while a core of widely-spoken languages will continue to dominate the internet.
"Precisely because the internet encompasses and affects a small number of languages, it may even expand the diversity of the language used beyond current limits, but will never resurrect the languages and dialects which are dying," he explains.
Despite being the native language for 310 million people worldwide, Hindi is used on less than 0.1 percent of the world's 10 million most popular websites
But while it is arguably important that minority and endangered languages survive, it isn't feasible that they all become languages of the internet. According to Friederike Luepke, professor of language documentation and description at SOAS, University of London, understanding multilingualism is key.
"By far the majority of the world's population is multilingual, and most people do not read and write small languages but only the official languages of their countries or languages of wider communication," Luepke tells WIRED.
"The internet should support these communities in their need for symbolic recognition while developing communication strategies that take account of their actual practised pattern of multilingualism."
Rather than depriving smaller languages of a voice, Luepke believes the internet might actually be creating new linguistic diversity, albeit in a way that resists easy measurement. "In particular on social media, where users cultivate a register that is written but very reminiscent of oral communication, multilingual speakers draw in very creative fashion on their entire multilingual repertoire, but in a way that defies easy categorisation of language," she says.
On social media, the boundaries between different languages are becoming blurred. As speakers move away from standard, traditional languages, and as the rules of grammar and spelling break down, social media can foster multilingualism and support linguistic diversity.
"This could be a model for the future," Luepke argues. But concerns about linguistic diversity online miss the real issue: access. In areas such as West Africa, people aren't offline because they can't read the language; they are offline because of poor infrastructure and a high cost of entry.
"The real issue is that the vast majority of people worldwide [are] still excluded from internet access in the first place," Luepke says. "Mobile phone technology is going to be decisive in offering internet in a more equitable manner, but it is still far from closing this really worrying gap."
James Temperton