Oak Ridge National Laboratory in the United States has achieved a major breakthrough: using Frontier, the world's most powerful supercomputer, researchers trained a ChatGPT-scale language model with roughly one trillion parameters while using only about 8% of the machine's computing power. Through innovative distributed training and parallelism techniques, the team achieved 100% weak scaling efficiency, providing valuable experience and a technical reference for training even larger language models in the future. The work not only demonstrates the power of supercomputing, but also highlights challenges, such as memory limits, that large-scale language model training must overcome.
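As a rough illustration (not drawn from the original report), weak scaling efficiency compares the time per training step when the number of GPUs and the total workload grow in proportion; 100% means adding hardware adds capacity with no per-GPU slowdown. A minimal sketch, with made-up timings:

```python
# Illustrative only: weak scaling keeps the work per GPU fixed while the GPU
# count grows, so in the ideal case the time per step does not change.
def weak_scaling_efficiency(t_base_step: float, t_scaled_step: float) -> float:
    """Ratio of step time on the small configuration to step time on the
    scaled-up configuration; 1.0 (i.e. 100%) means perfect weak scaling."""
    return t_base_step / t_scaled_step

# Hypothetical numbers: 1.00 s per step on 128 GPUs, still 1.00 s per step on
# 3,072 GPUs with proportionally more total work -> 100% efficiency.
print(weak_scaling_efficiency(1.00, 1.00))  # 1.0
```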
The breakthrough came from Oak Ridge National Laboratory, where the research team trained a trillion-parameter language model on the Frontier supercomputer, reaching 100% weak scaling efficiency through distributed training and parallelism. Even so, training large language models remains difficult, and memory constraints in particular must be addressed. The work provides practical experience for training huge language models in the future and highlights the key role of distributed training and parallel computing. It also opens new possibilities for the field of artificial intelligence and suggests that large-scale training will develop in a more efficient and energy-saving direction, making efficient use of computing resources an important focus for future large language models.
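For readers unfamiliar with how such a model is spread across a machine like Frontier, the sketch below shows one way a GPU budget could be factored into tensor, pipeline, and data parallelism. The specific numbers are assumptions for illustration, not ORNL's published configuration:

```python
# Assumed example of splitting a GPU budget across three parallelism axes
# (tensor, pipeline, data); the figures are illustrative, not from the study.
total_gpus = 3072                      # roughly 8% of Frontier's GPU count, per the article
tensor_parallel = 8                    # each layer's weights sharded over 8 GPUs
pipeline_parallel = 48                 # model layers split into 48 sequential stages
data_parallel = total_gpus // (tensor_parallel * pipeline_parallel)  # 8 replicas

assert tensor_parallel * pipeline_parallel * data_parallel == total_gpus
print(f"{data_parallel} data-parallel replicas, each spanning "
      f"{tensor_parallel * pipeline_parallel} GPUs")
```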