xAI has open-sourced Grok-1, its 314-billion-parameter Mixture-of-Experts (MoE) model, attracting widespread attention in the artificial intelligence community. The model's weights and network architecture are fully open. Grok-1 was trained from scratch on a large corpus of text data, without fine-tuning for any specific application, and roughly 25% of its weights are active for a given token. Training used a custom stack built on the JAX library and the Rust language, and the release is under the Apache 2.0 license, which makes it convenient for developers to use and build upon. The open-source release gives researchers a valuable resource for study and also pushes the field forward. Although some researchers argue that its openness could go further, the release of Grok-1 is nonetheless notable progress in artificial intelligence.
Musk's xAI announced the open-source release of "Grok-1", a 314-billion-parameter Mixture-of-Experts model, with fully open weights and network architecture. The model was trained from scratch on a large amount of text data, with no application-specific fine-tuning, and about 25% of the MoE model's weights are active per token. It was trained with a custom stack built on the JAX library and Rust, is licensed under Apache 2.0, and its popularity continues to grow. The model repository provides JAX sample code, which requires substantial GPU memory to run, along with a magnet link for downloading the weight files. Some researchers have assessed Grok-1 as less open than LLaMA-2: while the release documents the model architecture, they have called for more details to be made public.
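The "25% active weights" figure comes from the MoE design: a router selects only a few experts per token, so most parameters sit idle on any given forward pass. Below is a minimal, scaled-down sketch of top-k expert routing in plain Python/NumPy (the real Grok-1 stack is JAX and Rust; all dimensions, the router, and the expert count here are illustrative assumptions, not Grok-1's actual configuration). With 2 of 8 experts chosen per token, the active expert share is 25%.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, scaled-down dimensions -- Grok-1 itself has 314B parameters.
D_MODEL, D_FF, N_EXPERTS, TOP_K = 16, 64, 8, 2

# One weight matrix per expert (real MoE experts are full feed-forward blocks).
experts = [rng.standard_normal((D_MODEL, D_FF)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]        # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only TOP_K of N_EXPERTS expert matrices are ever touched for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(D_MODEL)
y = moe_forward(x)
print(y.shape)                                           # (64,)
print(f"active expert share: {TOP_K / N_EXPERTS:.0%}")   # 25%
```

This is why a 314B-parameter model can run with far less compute per token than a dense model of the same size: the router's sparsity, not the total parameter count, determines the per-token cost.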
Despite some controversy over its degree of openness, Grok-1's strong performance and permissive license make it a model worth watching, and one likely to spur further development of large language models. Looking ahead, we hope more open-source projects of this kind will emerge to jointly advance artificial intelligence technology. Obtaining and running the model does require some technical expertise, but that does not diminish its contribution to AI research.