OpenAI’s newly released o3 AI model has attracted industry attention for both its powerful performance and its high operating costs. The model achieved impressive results on the ARC-AGI benchmark, but at a cost of more than $1,000 per task, far more than its predecessor. This highlights the tension between performance gains and cost control in artificial intelligence models, and has reignited the debate over diminishing returns from the "scaling" approach. This article takes an in-depth look at the o3 model’s performance, cost, and future development.
OpenAI’s recently launched o3 AI model is considered its most powerful artificial intelligence product to date, but its running costs are staggering, with a single task costing more than $1,000.
According to TechCrunch, the new model applies a technique known as "test-time compute" when tackling complex problems: it spends more time reasoning and exploring multiple possibilities before settling on an answer. As a result, OpenAI’s engineers expect o3 to produce better responses to complex prompts.
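OpenAI has not disclosed how o3’s extra inference-time reasoning actually works, but a common way to spend more compute at answer time is to sample many candidate answers and keep the most consistent one. The minimal Python sketch below illustrates that general idea only; the `sample_answer` stand-in, the toy prompt, and the per-sample success rate are illustrative assumptions, not details of o3.

```python
import random
from collections import Counter

def sample_answer(prompt: str, rng: random.Random) -> str:
    """Stand-in for one model call. A real system would sample the model
    with some temperature; here we simulate a noisy answer to a toy prompt."""
    return "42" if rng.random() < 0.6 else str(rng.randint(0, 99))

def test_time_compute(prompt: str, n_samples: int, seed: int = 0) -> str:
    """Spend more compute at inference time: draw many candidates and
    return the answer they converge on most often (majority vote)."""
    rng = random.Random(seed)
    candidates = [sample_answer(prompt, rng) for _ in range(n_samples)]
    answer, _count = Counter(candidates).most_common(1)[0]
    return answer

if __name__ == "__main__":
    prompt = "What is 6 * 7?"
    # A bigger sample budget is more likely to land on the right answer,
    # but the compute bill grows in direct proportion to it.
    for budget in (1, 8, 64):
        print(budget, "samples ->", test_time_compute(prompt, budget))
```

The cost implication is the point: whatever search o3 actually performs, accuracy bought this way scales with the number of candidate paths explored, which is why the high-compute ARC-AGI runs are so expensive.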
According to François Chollet, creator of the ARC-AGI benchmark, o3 achieved a score of 87.5% in its "high-compute mode," nearly triple the previous-generation o1 model’s score of 32%, a significant performance improvement. However, this elaborate reasoning process comes with enormous overhead: to reach that score, o3’s compute cost exceeded $1,000 per task, consuming roughly 170 times more computing power than the low-compute version of o3 and costing significantly more than its predecessor, which came in at less than $4 per task.
This has drawn industry attention to the tension between the o3 model’s performance and its operating costs. On the one hand, the dramatic jump in o3’s score seems to prove that artificial intelligence models can still improve through "scaling," that is, by adding processing power and training data. On the other hand, criticism of the diminishing returns of scaling is growing. Although o3’s gains come mainly from improving its "reasoning" approach rather than from simple scaling, its high operating costs remain a real cause for concern.
Even the low-compute version of o3, which scored 76% on the benchmark, costs about $20 per task. That is relatively cheap by comparison, yet still several times more expensive than its predecessor. Moreover, considering that ChatGPT Plus charges only US$20 per month, OpenAI faces enormous cost pressure as it raises the level of intelligence available to users.
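For a sense of scale, the per-task figures cited above can be put side by side. The snippet below simply divides the dollar amounts reported here (plus the $20 monthly ChatGPT Plus price); it is a back-of-the-envelope illustration, not OpenAI pricing data.

```python
# Rough comparison of the per-task costs cited in the article (USD).
o3_high_compute_per_task = 1_000   # reported lower bound per task
o3_low_compute_per_task = 20       # low-compute o3, per task
o1_per_task = 4                    # reported upper bound for o1, per task
chatgpt_plus_monthly = 20          # Plus subscription, per month

print(f"low-compute o3 vs o1: ~{o3_low_compute_per_task / o1_per_task:.0f}x per task")
print(f"high-compute o3 vs low-compute o3: ~{o3_high_compute_per_task / o3_low_compute_per_task:.0f}x per task")
print(f"one high-compute task ~ {o3_high_compute_per_task / chatgpt_plus_monthly:.0f} months of a Plus subscription")
```

Note that the roughly 50x cost ratio implied by these figures is not the same as the reported 170x gap in computing power; per-task dollar cost and compute consumed are related but distinct measures.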
In a blog post about the benchmark results, Chollet noted that while o3 is approaching human-level performance, "the cost is still high and not yet economical." He estimated that the human labor cost of solving ARC-AGI tasks is about $5 per task, while the energy consumed amounts to only a few cents. Still, he is optimistic that "the cost-effectiveness is likely to improve significantly in the coming months and years." For now, o3 has not been released to the public, and its "mini" version is expected to launch in January next year.
Highlights:
A single task run in the o3 model’s high-compute mode costs over $1,000, underscoring how expensive the model is to operate.
On the ARC-AGI benchmark, o3 scored 87.5%, nearly triple the previous-generation o1 model’s score.
At present, o3 has not been released to the public, and the "mini version" is expected to be launched in January next year.
All in all, the o3 AI model demonstrates the strong development potential of artificial intelligence technology, but it also exposes the challenges posed by high costs. Going forward, balancing performance gains against cost control will be a key issue for the field, and the anticipated "mini" version of o3 will be worth watching to see whether it can cut costs while maintaining strong performance.