Meta’s New AI Model Translates 200 Languages


Meta has launched a new AI model, NLLB-200, that can translate 200 different languages and improves the quality of translations across its technologies by an average of 44 per cent.

NLLB-200 makes current technologies accessible in a wider range of languages, and in the future will help make virtual experiences more accessible as well.

In the absence of high-quality translation tools for hundreds of languages, billions of people today can’t access digital content or participate fully in conversations and communities online in their preferred or native languages. This is particularly an issue for hundreds of millions of people who speak the many languages of Africa and Asia.

To help people connect better today and be part of the metaverse of tomorrow, Meta’s AI researchers created No Language Left Behind (NLLB), an effort to develop high-quality machine translation capabilities for most of the world’s languages. The company has announced an important breakthrough in NLLB: a single AI model called NLLB-200, which translates 200 different languages with results far more accurate than what previous technology could accomplish.

Compared with previous AI research, NLLB-200’s translations scored an average of 44 per cent higher in quality. For some African and Indian languages, NLLB-200’s translations were more than 70 per cent more accurate.

To evaluate and improve NLLB-200, the company built FLORES-200, a dataset that enables researchers to assess the model’s performance in 40,000 different language directions. FLORES-200 measures NLLB-200’s performance in each language to confirm that the translations are high quality.
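
The figure of 40,000 language directions can be sanity-checked with simple arithmetic. The sketch below assumes that a "direction" means an ordered pair of distinct languages (source to target) and that exactly 200 languages are covered; the function name and the short language labels are hypothetical and used only for illustration.

```python
from itertools import permutations


def count_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    return num_languages * (num_languages - 1)


# Cross-check the closed form against explicit enumeration on a small case.
# The labels are illustrative placeholders, not official language codes.
small = ["eng", "swh", "orm", "jav"]
assert count_directions(len(small)) == len(list(permutations(small, 2)))

# With 200 languages, each language can be translated into 199 others.
directions_200 = count_directions(200)
print(directions_200)  # 39800, which rounds to the quoted 40,000
```

Under these assumptions the count is 200 × 199 = 39,800, consistent with the rounded figure in the text.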

And to help other researchers improve their translation tools, Meta is opening the NLLB-200 models and the FLORES-200 dataset to developers, in addition to the model training code and the code for re-creating the training dataset.

Meta is also awarding up to $200,000 in grants for impactful uses of NLLB-200 to researchers and nonprofit organisations with initiatives focused on sustainability, food security, gender-based violence, education or other areas in support of the UN Sustainable Development Goals. Nonprofits interested in using NLLB-200 to translate two or more African languages, as well as researchers working in linguistics, machine translation and language technology, are invited to apply.

These research advancements will support more than 25 billion translations served every day in Feed on Facebook, Instagram and other technologies. Readers can explore a demo of NLLB-200 and take a deeper dive into how the model was developed.

Expanded Translation and Greater Inclusion

A handful of languages, including English, Mandarin, Spanish and Arabic, dominate the web. Native speakers of these very widely spoken languages may take for granted how meaningful it is to read something in their own mother tongue. NLLB will help more people read things in their preferred language, rather than always requiring an intermediary language that often gets the sentiment or content wrong.

This work can also help advance other technologies, like building assistants that work well in languages such as Javanese and Uzbek, or creating systems to take Bollywood movies and add accurate subtitles in Swahili or Oromo.

As the metaverse begins to take shape, the ability to build technologies that work well in a wider range of languages will help to democratise access to immersive experiences in virtual worlds.