Shenzhen Yuanxiang Information Technology Co., Ltd. (XVERSE) has launched China's largest open-source Mixture of Experts (MoE) large language model, XVERSE-MoE-A36B. With 255B total parameters and 36B activated parameters, its performance is comparable to, or even exceeds, that of many models with far more parameters. The model delivers significant improvements in training time and inference performance, substantially reduces the cost per token, and provides strong support for low-cost deployment of AI applications. This breakthrough marks notable progress for China in the field of large language models and pushes domestic open-source technology toward a leading international position. XVERSE-MoE-A36B is fully open source and free for commercial use, providing a valuable resource for small and medium-sized enterprises, researchers, and developers.
XVERSE-MoE-A36B has 255B total parameters and 36B activated parameters. Its performance is comparable to that of large models with more than 100B parameters, a cross-tier performance leap. The model cuts training time by 30%, doubles inference performance, and significantly reduces the cost per token, making low-cost deployment of AI applications feasible.
XVERSE's full family of high-performance models has been fully open-sourced and is unconditionally free for commercial use, giving many small and medium-sized enterprises, researchers, and developers more choices. The MoE architecture breaks through the limitations of traditional scaling laws by combining expert models specialized in different subdomains: it expands model scale while maintaining peak model performance and reducing the computational cost of training and inference.
In multiple authoritative evaluations, XVERSE-MoE has significantly outperformed many comparable models, including the domestic 100B-scale MoE model Skywork-MoE, the long-dominant MoE model Mixtral-8x22B, and the 314B-parameter open-source MoE model Grok-1-A86B.
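To make the MoE idea concrete, here is a minimal sketch of a top-k routed expert layer. It is not XVERSE's actual implementation; the expert count, hidden sizes, and top-k value are illustrative only. The point it demonstrates is that the layer holds parameters for all experts, yet each token only runs through a small subset of them, which is where the training and inference savings come from.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal top-k MoE layer: only a subset of expert FFNs runs per token."""
    def __init__(self, d_model=512, d_ff=1024, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                        # x: (batch, seq, d_model)
        scores = self.router(x)                  # (batch, seq, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e   # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out

# Smoke test: total parameters cover all 8 experts, but each token only
# activates top_k=2 of them per forward pass.
layer = MoELayer()
y = layer(torch.randn(2, 16, 512))
print(y.shape)  # torch.Size([2, 16, 512])
```

Production MoE models add load-balancing losses and fused expert kernels on top of this basic routing scheme, but the activated-parameter arithmetic (36B activated out of 255B total, in XVERSE-MoE-A36B's case) follows the same principle.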
Free model downloads
Hugging Face: https://huggingface.co/xverse/XVERSE-MoE-A36B
ModelScope: https://modelscope.cn/models/xverse/XVERSE-MoE-A36B
Github: https://github.com/xverse-ai/XVERSE-MoE-A36B
Inquiries: [email protected]
Official website: chat.xverse.cn
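For reference, the following is a rough sketch of loading the model from the Hugging Face repository listed above with the transformers library. The argument choices (dtype, device placement, trust_remote_code) and the prompt are assumptions and should be checked against the repository's README and your available hardware; a 36B-activated-parameter model still requires multiple high-memory GPUs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xverse/XVERSE-MoE-A36B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",       # keep the checkpoint's native precision
    device_map="auto",        # shard across available GPUs (requires accelerate)
    trust_remote_code=True,   # assumed: the repo may ship custom modeling code
)

inputs = tokenizer("Beijing's top attractions:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```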
The open-source release and free commercial use of XVERSE-MoE-A36B lower the threshold for AI applications and will greatly promote the development and adoption of artificial intelligence technology in China. Its strong performance and easy access provide powerful tools and resources for AI developers and researchers at home and abroad. We look forward to seeing more innovative applications built on this model in the future.