OpenAI and DeepMind, two of the leading artificial intelligence labs, differ significantly in their research on scaling laws for large language models (LLMs). Scaling laws aim to predict how changes in parameter count, training data volume, and compute affect model performance. The results shape how future models are designed and trained, and so influence both the direction of AI development and, more broadly, the future of human-machine coexistence. This article examines the two companies' differing perspectives, methods, and contributions in scaling-law research, and briefly introduces related research progress in China.
OpenAI and DeepMind take different views and approaches in scaling-law research. Scaling laws predict how a large model's loss changes as the parameter count, the amount of training data, and the compute budget change. Because pre-training a large language model involves a trade-off among model size, data volume, and training cost, scaling laws help guide design decisions before an expensive training run begins. DeepMind proposes that model size and data volume should scale in equal proportion, while OpenAI's results favor devoting more of any additional budget to larger models.
The two companies also have different track records. DeepMind developed AlphaGo and AlphaFold, demonstrating the potential of deep reinforcement learning and neural networks, while OpenAI developed the GPT series, demonstrating the capabilities of generative models. DeepMind has also proposed the Levels of AGI classification, which describes the different stages of AI development.
The research shows that the three factors affecting model performance (parameters, data, and compute) interact with one another. DeepMind's Chinchilla model, trained according to the equal-scaling recipe with 70 billion parameters and about 1.4 trillion tokens, outperformed the much larger 280-billion-parameter Gopher model. In China, Baichuan Intelligence and the Mingde large model have also contributed to scaling-law research.
The competition between OpenAI and DeepMind in scaling-law research not only advances AI technology but also provides practical guidance for designing and optimizing future large models. Their different research paths and results together build a richer and more complete body of knowledge for the field, which ultimately benefits the whole industry and society.
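The trade-off among model size, data volume, and training cost described in this article can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions, not either lab's actual method: it uses the common approximation that training compute is about 6 × parameters × tokens, a rule of thumb of roughly 20 training tokens per parameter associated with Chinchilla-style training, and an approximate exponent of 0.73 for the share of extra compute that OpenAI's earlier results assign to model size. The function names `compute_optimal_split` and `scale_allocation` are hypothetical and chosen for this example.

```python
import math

# Assumption: training compute C (in FLOPs) is roughly 6 * N * D,
# where N is the parameter count and D is the number of training tokens.
FLOPS_PER_PARAM_TOKEN = 6.0


def compute_optimal_split(compute_flops, tokens_per_param=20.0):
    """Split a compute budget into (parameters, tokens).

    Assumes C = 6 * N * D and a fixed ratio D = r * N, which gives
    N = sqrt(C / (6 * r)). The default r = 20 is a rule of thumb,
    not an exact constant.
    """
    if compute_flops <= 0 or tokens_per_param <= 0:
        raise ValueError("compute_flops and tokens_per_param must be positive")
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


def scale_allocation(n_params, n_tokens, compute_multiplier, param_exponent=0.5):
    """Scale an existing (N, D) pair when compute grows by a factor k.

    N grows as k**a and D as k**(1 - a), so N * D (and therefore C)
    grows by exactly k. a = 0.5 scales both in equal proportion;
    a larger a (for example about 0.73) puts most of the growth
    into model size.
    """
    if compute_multiplier <= 0:
        raise ValueError("compute_multiplier must be positive")
    n_new = n_params * compute_multiplier ** param_exponent
    d_new = n_tokens * compute_multiplier ** (1.0 - param_exponent)
    return n_new, d_new


if __name__ == "__main__":
    budget = 5.88e23  # FLOPs, illustrative budget
    n, d = compute_optimal_split(budget)
    print(f"baseline:      N = {n:.3e} params, D = {d:.3e} tokens")

    # Quadruple the compute under the two allocation styles.
    for label, a in [("equal scaling", 0.5), ("model-heavy", 0.73)]:
        n4, d4 = scale_allocation(n, d, 4.0, param_exponent=a)
        print(f"{label}: N = {n4:.3e} params, D = {d4:.3e} tokens")
```

Under these assumptions, a budget of 5.88e23 FLOPs corresponds to about 70 billion parameters and 1.4 trillion tokens. Quadrupling the budget doubles both quantities under equal scaling, whereas the model-heavy exponent grows the parameter count faster and the token count more slowly for the same total compute.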