AI agents that take over human devices were once just a premise of science fiction movies, but today they have become a hot topic in the stock market.
On October 23, Anthropic, an American large-model AI company, launched the new Claude 3.5 Haiku and an upgraded version of Claude 3.5 Sonnet. Sonnet brings a new capability called "computer use", which lets it operate a computer the way a human does: viewing the screen, moving the cursor, clicking, and typing through a virtual keyboard.
Two days later, Zhipu AI followed Anthropic's lead and released AutoGLM, with a clear goal of "controlling" the user's mobile phone as a personal assistant. It can independently perform personalized tasks such as interacting with WeChat, placing takeout orders, and even grabbing red envelopes. It is mainly targeted at common user operations in apps such as WeChat, Taobao, Meituan, and Xiaohongshu.
These two products mark AI's transition from chatbots to systems that use tools to solve problems, moving AI agents a step closer to practical, real-world products.
This wave of AI agent launches immediately sent shockwaves through the capital market.
When the market opened on the morning of October 28, Zhipu-related concept stocks quickly hit the daily limit. Parallel Technology, Capital Online, Startup Dark Horse, Doushen Education, Chuanzhi Education, and Dianguang Media all rose sharply, and many of them hit their daily limits, with gains of 20% to 30%.
The capital market's rapid response reflects high expectations for the commercialization prospects of AI agents. But since the relevant applications are still at an early stage, market sentiment and speculation cannot be ruled out as drivers of this rally.
Are AI agents a long-term technology trend or a short-lived craze?
From a technical point of view, the rise of AI agents built around "Computer Use" and "Phone Use" marks AI's development from pure language understanding toward the execution of complex tasks.
Anthropic's Claude Sonnet and Zhipu's AutoGLM not only handle natural language conversations but also directly control the user's device to perform specific operations. This is a new stage of human-computer interaction. Anthropic's Sonnet demonstration shows that it can handle tasks such as writing code and analyzing data, and can even try a different approach when an error occurs. This flexibility shows that AI is beginning to have a certain "execution power."
Zhipu's AutoGLM focuses on the mobile phone scenario. It uses OCR to recognize the UI components on the user's phone screen and chain-of-thought training to understand what each component does, and then executes operations according to instructions, such as automating WeChat interactions and e-commerce orders.
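The "recognize the components, then act on them" step described above can be illustrated with a minimal sketch. This is not AutoGLM's actual implementation, which has not been described at this level of detail; the `UIElement` type, the `plan_tap` function, and the sample screen are all hypothetical, and the OCR stage is assumed to have already produced a list of labeled elements with coordinates.

```python
from dataclasses import dataclass


@dataclass
class UIElement:
    """A hypothetical on-screen component, as an OCR stage might report it."""
    text: str  # label recognized on the screen
    x: int     # horizontal position of the element's center, in pixels
    y: int     # vertical position of the element's center, in pixels


def find_element(screen, label):
    """Return the first element whose text contains the label (case-insensitive)."""
    for element in screen:
        if label.lower() in element.text.lower():
            return element
    return None


def plan_tap(screen, label):
    """Turn an instruction like 'tap the X button' into a concrete action."""
    element = find_element(screen, label)
    if element is None:
        # The agent cannot act on a component it did not recognize.
        return {"action": "fail", "reason": f"no element labelled {label!r}"}
    return {"action": "tap", "x": element.x, "y": element.y}


# Assumed OCR output for an illustrative ordering screen.
screen = [UIElement("Search", 40, 20), UIElement("Order now", 180, 620)]
print(plan_tap(screen, "order now"))
```

A real agent would repeat this cycle after every action, re-reading the screen before choosing the next step, which is also where the accuracy and cross-platform limits discussed below come in.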
However, such products still have limitations in user experience and commercialization.
Although AutoGLM makes mobile phone operations more intelligent, it also raises privacy and security concerns: will users give up some privacy protection for convenience? In addition, AutoGLM still requires clear instructions and is limited in cross-platform adaptability and operational accuracy; truly seamless automation will require continuous optimization.
AutoGLM also has room for improvement in terms of true "intelligence". For example, CITIC Securities pointed out in a research report that in the official demonstration video, AutoGLM paid more than 18 yuan when ordering a Luckin Coffee, a clear premium. It seems that it has not yet mastered the complicated coupon-grabbing tactics these brands rely on.
On the commercialization front, Zhipu and Honor established a joint laboratory for large AI model technology in September, giving the industry a glimpse of the potential of AI agents in terminal devices. However, because only a limited number of mobile phone brands support this feature, real large-scale adoption will still take time. According to IDC, AI mobile phones and AI PCs will account for more than 50% and 80% of the Chinese market, respectively, in 2027.
Judging from the layout actions of technology giants, AI Agent is indeed an important battlefield in the field of large models.
According to public information, OpenAI is expected to launch its own AI agent software, Orion, by the end of the year, and Apple will add Apple Intelligence to iOS 18.1 next month. Microsoft has open-sourced the screen parsing tool OmniParser, which can complete tasks such as automatic ticket booking. Google's Gemini 2.0 is expected to launch in December, and the company is developing a similar project, "Project Jarvis", to automate web page tasks in Chrome.
This means that AI agents continue to move from laboratory products to mass-market applications, and the giants behind them are stepping up their efforts to capture the market.
Venture capital trends in Silicon Valley show that more and more companies are shifting from AI infrastructure to the application layer, and AI applications for specific verticals are booming. However, current AI agent technology still faces challenges, such as weak cross-platform operation, heavy reliance on explicit instructions, and personalization that still needs work. To fully enter the mainstream market, AI agents must not only improve their functionality but also earn public trust on privacy and data security.
In the short term, the application scope of AI agents is still limited, but the efficiency and convenience they bring are attractive enough. Once the technical and privacy issues are resolved, AI agents will have a greater opportunity to bring intelligent applications into everyday life.