Recently, the competition between AI giants Google and OpenAI has intensified. The two parties are fiercely competing in the development and iteration speed of large-scale language models. Google’s newly released Gemini-Exp-1121 model has surpassed OpenAI’s GPT-4o in many key indicators, once again occupying the top spot in the AI competition. The editor of Downcodes will give you an in-depth understanding of this "blitz" in the field of AI, as well as the latest advances in technology and functionality between both parties.
Recently, the competition between Google and OpenAI has heated up again. Just one day after the new version of GPT-4o topped the AI competition list, Google launched the latest experimental model Gemini-Exp-1121, quickly regaining the championship. Just a week ago, Google released Gemini-Exp-1114, which seems to indicate that Google responded very quickly to the dynamics of OpenAI.
Jack Rae, chief scientist of Google DeepMind, said that this was a "blitz", implying that the iteration speed of post-training is faster than pre-training.
According to official information, Gemini-Exp-1121 has been significantly improved in many aspects, mainly reflected in the enhancement of coding capabilities, reasoning capabilities and visual understanding capabilities. In addition, this model has reached a level comparable to the current top o1-preview and New Sonnet3.5 in the style control of complex prompt words.
In actual tests, Gemini-Exp-1121 also performed better than the new version of GPT-4o in handling comics understanding. Its answers were more comprehensive and it could clearly use subtitles and bold emphasis to present information. In the classic Animal Crossing River logical reasoning question, Gemini-Exp-1121's answer was completely correct, showing stronger logical reasoning ability. On the other hand, the new version of GPT-4o made some mistakes.
At the same time, OpenAI is also actively developing new features. Recently, the code for the "Live Camera" video function was discovered in the latest version of ChatGPT, which marks its progress in voice and visual recognition. OpenAI users will also experience this capability for the first time when using Advanced Speech Mode, showing its intention to expand the use of this feature in the future.
It is foreseeable that next year the main method of communication with Chatbot may gradually shift from traditional text dialogue to voice and more intelligent agent services. This change may be led by the launch of the "live camera" function.
This AI competition is still going on, with Google and OpenAI chasing each other, indicating that AI technology will continue to make breakthroughs and innovations in the future, bringing more convenient and smarter services to users. Let's wait and see who will win in the end!