Recently, Google and OpenAI, the giants in the AI field, have been engaged in fierce competition, and the speed of new models released by both parties is staggering. This competition is not only reflected in the competition of model performance, but also reflects the rapid development of AI technology iterations and the ability to respond quickly to the market. Google and OpenAI’s investment and innovation in technology research and development will continue to promote progress in the field of artificial intelligence and bring more convenient and smarter services to users.
Recently, the competition between Google and OpenAI has heated up again. Just one day after the new version of GPT-4o topped the AI competition list, Google launched the latest experimental model Gemini-Exp-1121, quickly regaining the championship. Just a week ago, Google released Gemini-Exp-1114, which seemed to indicate that Google responded very quickly to the dynamics of OpenAI.
Jack Rae, chief scientist of Google DeepMind, said that this was a "blitz", implying that the iteration speed of post-training is faster than pre-training.
According to official information, Gemini-Exp-1121 has been significantly improved in many aspects, mainly reflected in the enhancement of coding capabilities, reasoning capabilities and visual understanding capabilities. In addition, this model has reached a level comparable to the current top o1-preview and New Sonnet3.5 in the style control of complex prompt words.
In actual tests, Gemini-Exp-1121 also performed better than the new version of GPT-4o in handling comics understanding. Its answers were more comprehensive and it could clearly use subtitles and bold emphasis to present information. In the classic Animal Crossing River logical reasoning question, Gemini-Exp-1121's answer was completely correct, showing stronger logical reasoning ability. On the other hand, the new version of GPT-4o made some mistakes.
At the same time, OpenAI is also actively developing new features. Recently, the code for the "Live Camera" video function was discovered in the latest version of ChatGPT, which marks its progress in voice and visual recognition. OpenAI users will also experience this capability for the first time when using Advanced Speech Mode, showing its intention to expand the use of this feature in the future.
It is foreseeable that next year the main method of communication with Chatbot may gradually shift from traditional text dialogue to voice and more intelligent agent services. This change may be led by the launch of the "live camera" function.
Highlight:
Google's new model Gemini-Exp-1121 quickly overtook GPT-4o after reaching the top and returned to the top of the AI competition.
Gemini-Exp-1121 has improved its coding, reasoning and visual understanding capabilities and performed well.
OpenAI is developing a "real-time camera" function, which may change the way of communicating with AI in the future.
All in all, the competition between Google and OpenAI has promoted the rapid development of AI technology. In the future, the way AI interacts with humans will be more diversified and intelligent, which is worth looking forward to.