Meta recently announced a Language Technology Partner Program in collaboration with UNESCO. The program will collect voice recordings and written text in multiple languages to support the development of future open source artificial intelligence (AI), with a specific focus on minority languages that are overlooked in the digital environment.
According to Meta, the program is seeking partners who can provide more than 10 hours of voice recordings with transcriptions, large amounts of written text, and sets of translated sentences. Working with these partners, Meta hopes to integrate the languages into its AI speech recognition and translation models, which will eventually be released as open source.
Confirmed partners so far include the government of Nunavut, a territory in northern Canada where some residents speak Inuit languages collectively known as Inuktut. Meta said in its blog: “Our efforts focus specifically on underserved languages to support UNESCO’s work. Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background.”
To complement the program, Meta will also release an open source machine translation benchmark for evaluating the performance of language translation models. Designed by linguists, the benchmark supports seven languages and can be accessed and contributed to through the AI development platform Hugging Face.
Meta presents both initiatives as philanthropic, but the company also stands to benefit from improved speech recognition and translation models. Meta continues to expand the number of languages supported by its AI assistant, Meta AI, and is testing features such as voice translation for Instagram Reels, which allows creators to dub their speech and sync it automatically.
Although Meta's efforts in language processing deserve attention, the company has been widely criticized for its handling of non-English content. Reports indicate that nearly 70% of Italian- and Spanish-language COVID-19 misinformation on Facebook went unflagged, compared with only 29% of similar English-language content. Leaked documents also show that Arabic content is often mislabeled as hate speech. Meta said it is taking steps to improve its translation and content moderation technology to address these problems.
Meta's program not only advances the development of AI technology but also contributes to the protection of global linguistic diversity.