The editor of Downcodes has learned that Teuken-7B, a 7-billion-parameter language model supporting all 24 official EU languages, has been released on Hugging Face. The model was developed by the OpenGPT-X research project and is available as open source. Unlike most AI language models, which are English-centric, Teuken-7B was trained from scratch, with roughly half of its training data drawn from non-English European languages, which is intended to help it handle multiple European languages.
The development team says Teuken-7B performs well in all the languages it was trained on and is particularly reliable in non-English languages. To measure how language models perform in European languages, the project team also created a new European LLM leaderboard, which goes beyond earlier standard benchmarks that were based mainly on English.
This release marks a significant step in Europe's push for multilingual AI models and gives developers an open tool for cross-language applications and research.
The open source release of Teuken-7B opens new possibilities for multilingual AI and reflects Europe's efforts to develop AI technology independently. Its multilingual capabilities should make it easier for developers worldwide to build cross-language applications, and the model may find use in a wider range of fields in the future.