It is not only OpenAI's much-anticipated next-generation model Orion: Anthropic, another star artificial intelligence (AI) start-up and a rival of Google and OpenAI, has also reportedly run into bottlenecks in developing advanced AI models.
On Wednesday, November 13th, Eastern Time, Bloomberg reported, citing two people familiar with the matter, that OpenAI completed the first round of Orion's training in September, hoping the model would greatly surpass previous versions and move the company closer to its goal of AI that surpasses humans. But Orion did not perform as well as the company hoped: as of late summer, the model was performing poorly when asked coding questions it had not been trained on.
People familiar with the matter said that, overall, Orion's progress over OpenAI's existing models has so far been smaller than the leap GPT-4 made over GPT-3.5.
The report also quoted three other people familiar with the matter as saying that Google's upcoming new version of Gemini did not meet internal expectations, and that Anthropic postponed the planned release of its Claude model known as 3.5 Opus.
The report argues that all three companies face multiple challenges in developing AI models. It is increasingly difficult for them to find untapped, high-quality human-generated training data; Orion's unsatisfactory coding performance, for example, stems in part from a lack of sufficient coding data to train on. And modest improvements in model performance may not be enough to justify the huge costs of building and running a new model, or to meet the expectations that come with a major upgrade.
These development bottlenecks challenge the scaling law that many start-ups, and even technology giants, treat as a guiding principle. They also call into question the feasibility of massive AI investment aimed at achieving artificial general intelligence (AGI).
Wall Street News has previously noted that this law, proposed by OpenAI as early as 2020, holds that a large model's final performance depends mainly on three quantities: the amount of compute, the number of model parameters, and the amount of training data, and that it is basically unrelated to the model's specific architecture (number of layers, depth, width). In July this year, Microsoft's chief technology officer (CTO) Kevin Scott also defended the law, saying that the scaling law still applies to the industry today: as large models are scaled up, the marginal benefits do not diminish.

Coincidentally, the media reported last week that OpenAI had found Orion "didn't make that big a leap," with progress far smaller than between the previous two generations of flagship models. The finding directly challenges the scaling law the AI field has been chasing. With high-quality training data dwindling and computational costs climbing, OpenAI's researchers have had to start exploring whether other approaches can improve model performance. OpenAI, for example, is building more code-writing capability into its models and is trying to develop software that can take over a PC to complete web-browser or application tasks by performing clicks, cursor movements, and other actions.
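For reference, the 2020 OpenAI paper behind this idea (Kaplan et al., "Scaling Laws for Neural Language Models") fit empirical power laws rather than deriving a theorem. As a minimal sketch, writing L for the model's test loss, N for the number of parameters, D for the number of training tokens, and C for the compute budget, the fits take roughly the form

L(N) ≈ (N_c / N)^α_N,  L(D) ≈ (D_c / D)^α_D,  L(C) ≈ (C_c / C)^α_C,

where N_c, D_c, and C_c are fitted constants and the exponents reported in the original paper are small (on the order of α_N ≈ 0.076, α_D ≈ 0.095, α_C ≈ 0.05). Because the exponents are small, each fixed reduction in loss requires roughly an order-of-magnitude increase in parameters, data, or compute, which is why exhausting high-quality data and rising compute costs bear so directly on the economics described above.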
OpenAI has also set up a dedicated team, led by Nick Ryder, who previously ran pre-training, to explore how to get the most out of limited training data and how to adjust the application of scaling methods so that model improvements remain steady.
Regarding Wednesday's Bloomberg report, a Google DeepMind spokesperson said the company is "pleased with the progress of Gemini and we will share more information when it is ready." OpenAI declined to comment. Anthropic also declined to comment, but pointed to a blog post published on Monday featuring remarks its CEO Dario Amodei made during a five-hour podcast.
Amodei said that what people call a scaling law is not a law at all; the name is a misnomer. It is not a universal law but an empirical regularity. Amodei expects scaling laws to continue to hold, but he is not certain. He said there are "a lot of things" that could "disrupt" progress toward more powerful AI in the coming years, including that "we could run out of data." But he is optimistic that AI companies will find a way to overcome any obstacles.
Commenting on the Bloomberg report, Nosson Weissman, founder of NossonAI, a company that builds customized AI solutions for enterprises, said the report did not trouble him: first, he saw no statements from genuine experts who have made significant contributions to the field of AI; second, the industry regularly sees significant modeling progress; and finally, he believes the news media loves to create drama, and this report simply comes with a nicely dramatic headline.