Workshop 297 Report: Digital Inclusion Through a Multilingual Internet
Summary
On October 10, 2023, in Kyoto, IGF Workshop #297 brought subject matter experts together with interested members of the IGF community to discuss the policy issues involved in achieving a multilingual Internet. Discussion covered the policy questions listed below, along with related topics: the important role of domain names in promoting linguistic diversity online, the connection between meaningful connectivity and multilingualism on the Internet, the current dominance of the English language on the Internet, the idea of a “language justice” movement, the foundational requirement of Universal Acceptance for a multilingual Internet, and recognition of the fact that many different stakeholders must work together to achieve it.
Background
Meaningful connectivity, which the ITU describes as “a safe, satisfying, enriching, and productive online experience, at an affordable cost,” promotes the broader goal of digital inclusion, one of the themes of IGF 2023.1 Ensuring that people have the ability to use the Internet in their own language is one aspect of meaningful connectivity. Internet users should be able not only to access content in their native language, but also to easily make their local language content available to others. The Internet’s Domain Name System (DNS), and its ability to support scripts not based on the American Standard Code for Information Interchange (ASCII), is key to their being able to do so.2
The Internet’s naming system was designed to help people find resources – websites, videos, email servers, images – on the Internet. Normally, the language of a given domain name bears a strong correlation with the language of the Internet resource it points to.
Introduced in 1983, the DNS catered only to the Latin script. Its first few Top Level Domains (TLDs) were based on English words, for example .com (commercial), .net (for network providers), or .org (originally “the miscellaneous TLD for organizations that didn’t fit anywhere else”).3 It was in this environment that the World Wide Web was introduced six years later, which paved the way for an English-dominant web. One 2023 study estimates that 55.6% of global domains primarily use English, and UNESCO estimates that only 14 languages are present on more than 1% of domains.4
Over the years, the ICANN community worked to “internationalize” the DNS by introducing “Internationalized Domain Names” or IDNs into the Internet’s naming system. Over 150 IDNs, including both generic and country code TLDs, have been added to the system and delegated to different entities around the world, including .موقع (.site in Arabic), .みんな (.everyone in Japanese), .ລາວ (.lao in Lao), and .中国 (.china in Chinese).5
Consider a traveler whose native language is Arabic and whose email address is in the Arabic script. On the way to the airport, they download the airline's mobile app to check in. Though the app's content may be available in Arabic, when they enter their email address to sign in, the app does not recognize it and responds, "Please enter a valid email address." The traveler may have no option other than to obtain a free email address in the Latin script from a large email provider in order to use the app.
Making IDNs available to domain name registries, registrars, and registrants is an important first step towards achieving a multilingual Internet. But these domains – and related email addresses – must also be capable of being used on the Internet.
To achieve a multilingual online environment, Internet applications and systems must be capable of treating all domain names and email addresses equally, regardless of the script they are in. In other words, there must be Universal Acceptance of all domain names and email addresses. The low rate of “Universal Acceptance readiness” across the Internet is holding back its transformation into a multilingual network.
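The difference between a system that is and is not Universal Acceptance ready can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `LEGACY_PATTERN` is a hypothetical example of the kind of ASCII-only check an application might contain, not a pattern taken from any real product, and `ua_friendly_is_valid` is a deliberately loose structural check rather than a full implementation of the internationalized email standards. The sample addresses are invented.

```python
import re

# Hypothetical legacy rule: ASCII letters only, and a TLD of 2 or 3 characters.
LEGACY_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,3}")


def legacy_is_valid(address: str) -> bool:
    """Accept only addresses that fit the ASCII-only legacy pattern."""
    return LEGACY_PATTERN.fullmatch(address) is not None


def ua_friendly_is_valid(address: str) -> bool:
    """Loose structural check that does not assume any particular script."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain or "." not in domain:
        return False
    try:
        # Python's built-in codec converts each label to its ASCII-compatible
        # form and raises UnicodeError for labels it cannot represent.
        domain.encode("idna")
    except UnicodeError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "user@example.com",          # legacy ASCII address
        "info@example.photography",  # newer TLD longer than three characters
        "دعم@مثال.موقع",             # address written in the Arabic script
    ]
    for address in samples:
        print(address, legacy_is_valid(address), ua_friendly_is_valid(address))
```

Under these assumptions, the legacy check accepts only the first address, while the looser check accepts all three. In practice, an application would also need its mail and storage layers to handle such addresses, which this sketch does not attempt to show.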
There are approximately 7,000 languages and dialects spoken in the world (2,300 in Asia, 2,100 in Africa, 1,300 in the Pacific, 1,000 in the Americas, and 280 in Europe), and 5.35 billion people online, but the content on the Internet does not mirror the world’s linguistic diversity.6 Out of 7,000 languages, only about ten languages have “any substantial online presence.”7 This report summarizes the discussions during Workshop 297, which led to the conclusion that multistakeholder collaboration to promote Universal Acceptance is necessary to increase the online presence of other languages and drive us towards a more multilingual Internet.
Key Points:
- The Internet would be more inclusive if it were multilingual;
- Policies to promote Universal Acceptance are needed for the Internet to be multilingual;
- There exists a great, latent demand for Internationalized Domain Names and Universal Acceptance solutions;
- A multilingual Internet can only be achieved through multistakeholder collaboration.
Featured Speakers (in order of opening statements):
- Alan Davidson, Assistant Secretary of Commerce for Communications and Information, United States Department of Commerce
- Ram Mohan, Chief Strategy Officer, Identity Digital
- Dawit Bekele, Regional Vice President for Africa, Internet Society
- Edmon Chung, Chief Executive Officer, DotAsia Organisation
- Theresa Swinehart, Senior Vice President, Global Domains and Strategy, ICANN
- Akinori Maemura, Chief Policy Officer, Japan Network Information Center
Policy Questions:
- What are the barriers that are keeping people from using the Internet in their own language?
- How can we surmount these barriers through technical and policy coordination?
- In what ways does Internet multilingualism support the broader goal of digital inclusion?
Discussion
Beginning with the basic premise that “if you don't have access to the Internet, or you can't afford it, you can't use the Internet in any language,” discussants identified various economic, social, political, and technical barriers that prevent people from using the Internet in their own language.8
A participant from the African region observed that many of the languages used in Africa are not recognized as official languages within their own countries which, in turn, makes it difficult to obtain support for developing content in those languages. A dearth of local language content poses a significant barrier to meaningful connectivity; members of a language community are less inclined to use the Internet when there is no content available in their own language for them to engage with. Further, while “there are many people who are literate in their own language,” the “devices and platforms they want to use are not localized.”9 Most of the devices that we use to connect to the Internet – mobile phones and laptops for example – do not support the less widely-used languages and scripts. And while some of the larger email services – such as Gmail or Outlook – support non-Latin scripts, many others do not.
Taking into account the foregoing, discussants then turned to address a more fundamental, if not architectural, barrier: the capability of the Internet’s websites, applications, and other systems to recognize Internationalized Domain Names and email addresses in non-Latin scripts just as they would the Internet’s legacy domains, like .com or .org.
Technical and policy work within the ICANN community over the years has advanced linguistic diversity in the Internet’s naming system. Participants recounted their experiences in building “the technology that allows Internationalized Domain Names and e-mail addresses to work on the Internet in a secure and stable manner.”10 This work saw the addition of tens of thousands of new and different characters to a system that began with “just 26 [Latin] characters plus…10 digits for domain names.”11
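One piece of this technology can be sketched briefly: an Internationalized Domain Name written in its native script is converted to an ASCII-compatible form beginning with "xn--" before it is looked up in the DNS, so the underlying system still handles only the original letters, digits, and hyphen. The Python sketch below uses the language's built-in `idna` codec. The function names `to_ascii_form` and `to_unicode_form` and the sample names under "example" are assumptions made for illustration. Note also that the built-in codec follows the older IDNA 2003 rules, whereas the later IDNA 2008 rules differ in some details, so this is an illustration rather than a reference implementation.

```python
def to_ascii_form(domain: str) -> str:
    """Convert a domain name in any script to its ASCII-compatible form."""
    # The codec processes the name label by label (IDNA 2003 rules).
    return domain.encode("idna").decode("ascii")


def to_unicode_form(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to its native script."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # Illustrative names using two of the IDN TLDs mentioned in this report.
    for name in ("example.中国", "example.みんな"):
        ascii_form = to_ascii_form(name)
        print(name, "->", ascii_form, "->", to_unicode_form(ascii_form))
```

A name that is already in ASCII, such as example.com, passes through unchanged, which is part of why IDNs could be added without altering the DNS itself.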
At the same time, these participants recognized that creating IDNs was just the first step; those who register the new domains must also be able to use them on the Internet. While websites and email applications are accustomed to legacy domains in the Latin script, most are not programmed to accept IDNs. The Internet cannot become multilingual until software applications treat all Top Level Domains equally, regardless of the script they are in.12 To do this, the multistakeholder Internet community must “build policy and other governance systems that encourage the universal acceptance” of all domain names and email addresses.13 Together, IDNs and Universal Acceptance (UA) are the foundation for the next billion users to meaningfully come online.
Discussants made clear that the technical challenges of IDNs and UA have already been addressed. As one participant stated, “the core technical challenges, and the core technical contours of this problem are already solved…That is not something that we ought to be focusing effort on.”14 Instead, the community must address the globally low rate of UA readiness, which many believe now constitutes the greatest barrier to achieving a multilingual Internet. The 2023 UA Readiness Report noted that while UA readiness within modern web browsers is largely positive and improving, not a single social media site tested for usability with an internationalized email address was UA ready.15 This is one illustration of the challenges that occur when the many layers of technology that underpin the use of domain names (including software, platforms, and standards) are not yet UA ready.
Despite reports that Internet multilingualism presents a $10 billion growth opportunity, market forces alone seem unlikely to drive UA readiness. Most Internet users – including many governments – do not know that technology to enable multilingualism exists. For many, the possibility of a multilingual Internet, or the opportunities it offers, will not naturally come to mind because the Internet evolved in the English language and the Latin script. Because end users have adapted to the linguistic landscape of the current Internet, it is hard for them to imagine navigating the Internet in any language they want, or signing in to accounts or signing up for new services with an email address in any script.
Some communities may be more comfortable with or have become used to using the Internet in English. During the session, a native Japanese speaker explained that using domain names in Latin script, instead of Japanese, became a preference because, with such a compact set of characters, identifying locations on the Internet was a much simpler process.16
Many others, however, may experience a greater need for the use of their native language online, including to be able to preserve the use of their language and to promote cultural traditions.
Discussants noted that there exists a great latent demand for multilingual Internet content and for Internationalized Domain Names to convey that content to the right audiences. The demand is latent because, until Internet users understand that a multilingual Internet is possible, they cannot express the kind of demand that would prompt software developers and companies to change. Quite simply, this latent demand will remain unmet until UA readiness levels across the Internet begin to increase. Policy interventions are therefore needed. Incentives should motivate suppliers to implement UA fully across all services.
One discussant gave an example from government procurement. Procurement policies could favor vendors that can demonstrate that their systems are UA ready. Governments, at little to no cost, could create an impetus for those who are competing for business to say to themselves, “we need to prioritize universal acceptance.”17
“The power to…express yourself in your own language is incredibly important. We all know it. [When] we have to experience a conversation in a language that is not our own, it is challenging. It takes a different kind of energy. It takes a different form of recognition of the words and what they mean and the interpretations around it.”18
The Internet has become an invaluable resource for education, economic mobility, innovation, accessing government services, social connection, and cultural preservation around the world. These are benefits that remain out of reach to many. Further, as Internet connectivity expands and more of the unconnected are brought online, pressure will mount to address online linguistic exclusion.
Connectivity is just the baseline. Once connected, the goal should be to enable people to thrive online, and to thrive, people need to be able to use the Internet in their own language. This will contribute to the creation of a digital sphere that is diverse and inclusive. This will also enable users to take full advantage of the benefits outlined above, including accessing government services and engaging with local content online.
Internet multilingualism could also make the Internet safer. Users who must use the Internet in an unfamiliar language are more susceptible to fraud and scams, including when dealing with URLs they cannot understand or verify. Creating and enabling linguistic diversity online supports digital inclusion because it attracts more people online and gives those already online greater confidence and trust in the Internet.
“We need a multilingual Internet movement.”19
Language preservation was discussed during the workshop, and it was noted that the International Decade of Indigenous Languages, as declared by the United Nations General Assembly, was underway. Taking the specific challenges around Universal Acceptance into account – for example, its inaccessibility to most as a “technical topic,” as well as perceived trepidation by some to take on the challenge of non-ASCII scripts – one discussant suggested that the IGF community may be just the community to start a “language justice” movement, similar to the climate justice movement.
References
Published June 2024. This report was drafted by NTIA, as an IGF WS297 co-organizer, in collaboration with IGF WS297 participants, which included representatives from DotAsia, ICANN, ISOC, JPNIC, and Identity Digital.
1 “About the Universal and Meaningful Connectivity,” International Telecommunication Union.
2 The American Standard Code for Information Interchange, or ASCII, is the most common character encoding format for text data in computers and on the Internet. The ASCII character set includes the 26 letters of the Latin Alphabet, but not other scripts.
3 Top level domains, or the characters that follow the final dot in a domain name, play an important role in connecting end users to their intended website via the DNS lookup process. An IETF Request for Comments (RFC) published in 1994 described the structure of the Domain Name System and set out both generic TLDs (gTLDs) and country-code TLDs (ccTLDs).
J. Postel, “RFC 1591: Domain Name System Structure and Delegation,” IETF, March 1994.
4 Russell Brandom, “What languages dominate the internet?,” Rest of World, June 7, 2023.
5 “UASG 047 UA-Readiness Report FY23,” Universal Acceptance Steering Group, September 26, 2023.
6 Per one speaker, sixty percent of Internet content is in English.
Total global Internet users figure (5.35 billion) is based on a January 2024 estimate. Ani Petrosyan, “Number of internet and social media users worldwide as of January 2024,” Statista, Jan 31, 2024.
7 “How can we achieve a multilingual internet?,” Digwatch Event Report, December 8, 2021.
8 D Bekele, IGF 2023 – Day 2 – WS #297 Digital Inclusion Through a Multilingual Internet, October 10, 2023, Transcript, (pagination not available).
9 Id. D Bekele, Transcript.
10 Id. E Chung, Transcript.
11 Ibid.
12 UA ensures that all domain names, including new generic top-level domains (gTLDs), Internationalized Domain Names (IDNs), and internationalized email addresses can be used by all Internet-enabled applications, devices, and systems. To this end, UA has implications beyond multilingualism, including for gTLDs longer than 3 characters.
13 R Mohan, IGF 2023 – Day 2 – WS #297 Digital Inclusion Through a Multilingual Internet, October 10, 2023, Transcript, (pagination not available).
14 Id. R Mohan, Transcript; The Universal Acceptance Steering Group (UASG), which was formed in 2015, has developed a comprehensive library of documentation for technology and email providers to implement UA.
15 “Universal Acceptance Readiness Report FY23,” Universal Acceptance Steering Group, September 26, 2023.
16 IGF 2023 – Day 2 – WS #297 Digital Inclusion Through a Multilingual Internet, October 10, 2023, Transcript (pagination not available); One discussant, whose native language is Japanese, explained that adapting to use the Latin alphabet on the Internet, instead of his native script, came easily. Using ASCII became a preference because, with such a compact set of characters, identifying locations on the Internet was a simple process.
17 Id. R Mohan, Transcript.
18 Id. T Swinehart, Transcript.
19 Id. E Chung, Transcript.