Meta recently announced that it will build two superclusters, each equipped with 24,000 H100 GPUs, for training its next-generation large language model Llama-3. The project uses an advanced RoCEv2 network and Tectonic/Hammerspace NFS/FUSE network storage to improve training efficiency and data-access speed. Llama-3 is expected to go online in late April or mid-May, may be a multi-modal model, and Meta plans to continue releasing it as open source. The move highlights Meta's determination and resources for continued investment in large AI models, and its future development is worth watching.
According to the announcement on Meta's official website, the two 24K-H100 clusters are designed specifically for Llama-3 training, and Meta plans to have roughly 600,000 H100s' worth of compute by the end of 2024. This large-scale investment in compute points to further growth in AI model training, and the release of Llama-3 is worth looking forward to: its possible multi-modal capabilities and open-source strategy could have a profound impact on the AI field, and the ambitious 600,000-H100 plan demonstrates Meta's strength and future direction in artificial intelligence.