What’s the solution to AI’s $600 billion conundrum? Startup executives discuss big model costs and commercialization
Author: Eve Cole
Updated: 2024-11-16 11:42:01
Beijing News Shell Finance (Reporters Bai Jinlei and Chen Weicheng) — From October 25 to 26, RTE2024, the 10th Real-Time Internet Conference, co-sponsored by the RTE Developer Community and Shengwang, was held in Beijing. A roundtable titled "AI's $600 Billion Problem: From Infrastructure to Commercialization" drew industry attention; the panelists were Jia Yangqing, founder and CEO of Lepton AI; Wei Wei, partner at MiniMax; Zeng Guoyang, co-founder and CTO of Wall-Facing Intelligence; and Wang Tiezhen, engineer at Hugging Face.

"AI's $600 billion problem" comes from an article by David Cahn, a partner at Sequoia Capital. Cahn argued that the gap between the enormous investment in AI (artificial intelligence) infrastructure and the actual revenue it generates is too large, and that AI is approaching the tipping point of a bubble. Even so, AI may be the next transformative technology wave, and falling GPU (graphics processing unit) compute prices will ultimately benefit long-term innovation and startups, while investors bear the losses.

[Photo: RTE2024, the 10th Real-Time Internet Conference. Provided by the interviewee]

On AI infrastructure, Jia Yangqing offered two core views. First, models of a given size will keep getting more capable, especially through techniques such as distillation and compression; the current Llama 3.2 3B model can approach the capabilities of the earlier 70B Llama models. Second, apart from a few leading companies, more and more companies will build their next-generation models through "open source plus fine-tuning," so open-source architectures will become increasingly common. "The advantage of an open-source model is its ecosystem and community. In practice, many people can take an open-source model and fine-tune it, but the open-source model alone is not enough to solve every problem."
Wang Tiezhen said, "Going forward we will see more and more Infra (infrastructure) and Realtime (real-time processing) work. Everyone needs to pay attention not only to the open-source models themselves, but also to the infrastructure and data loop around them, so that open-source models run better and faster. Realtime work requires TTS (text-to-speech) plus large models; if the two can be combined in some way and placed closer to the edge, the results can be very good."

How should we view the cost of training and inference for large models? Zeng Guoyang said, "As technology advances, compute will certainly keep getting cheaper, and models of a given capability will keep getting smaller, but the savings in compute cost will ultimately be reinvested in training more powerful models. On the road to AGI (artificial general intelligence), users will mainly feel the models becoming more powerful; the change in cost will be hard to perceive." He also mentioned that because Wall-Facing Intelligence focuses on on-device models, he cares deeply about making models run faster on the device; in actual deployment, the team applies quantization, compression, and even sparsification to reduce deployment overhead.

Jia Yangqing, for his part, argued that cost should not be a blocker. He predicted that inference costs will drop to one-tenth of today's level within a year. When building an application, entrepreneurs can therefore do their cost accounting at one-tenth of the current cost of running the application to see whether the business works; once models and hardware are deployed at scale, costs will fall further still.

Recent reports indicate that OpenAI is disbanding its "AGI Readiness" team, which focused on AI safety research. How do the AI company founders present view AI safety and ethics? Jia Yangqing offered an analogy: aircraft are subject to many safety requirements, while rocket manufacturing is given more flexibility. He speculated that OpenAI's move may be intended to speed up early development, or may reflect the view that AI security does not go beyond traditional security categories, in which case traditional data security and cloud security are sufficient safeguards.
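The back-of-the-envelope accounting Jia Yangqing described can be sketched in a few lines of Python. All figures here are hypothetical placeholders, not real prices; the point is only the exercise of re-running the numbers at one-tenth of today's inference cost:

```python
# Hypothetical unit economics for an LLM-backed app, illustrating
# Jia Yangqing's suggestion: check viability assuming inference costs
# fall to one-tenth of today's level. All numbers are made up.

def monthly_margin(requests, tokens_per_request,
                   price_per_m_tokens, revenue_per_request):
    """Gross margin per month for a simple pay-per-token app."""
    cost = requests * tokens_per_request / 1e6 * price_per_m_tokens
    revenue = requests * revenue_per_request
    return revenue - cost

# Loses money at a hypothetical $5 per million tokens...
today = monthly_margin(1_000_000, 2_000, 5.00, 0.008)      # -> -2000.0

# ...but turns profitable at one-tenth of that price.
in_a_year = monthly_margin(1_000_000, 2_000, 0.50, 0.008)  # -> 7000.0
```

An application that is underwater at current prices can thus pencil out under the one-tenth assumption, which is the calculation Jia suggested entrepreneurs run before writing an idea off.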
Wang Tiezhen said that it is still early to worry about AI replacing humans, but AI has already had negative effects in some areas, such as deepfake videos that are hard to tell from real ones and their impact on teenagers' mental health; these problems also hold many entrepreneurial opportunities.

At the event, Shengwang announced that it and MiniMax are polishing China's first Realtime API (real-time application programming interface). How, then, should we view the practical potential of audio and video multimodal models? Wei Wei said that with the arrival of multimodality, the boundaries of generative AI will keep expanding and the industry's transformation will accelerate. On the product and user-service side, Wei Wei has found that text, voice, music, and video models can greatly improve the efficiency of creators in art, film, television, music, and other fields, and can give them new ideas and methods. Wang Tiezhen added that if video generation can exceed movie-grade quality without requiring multiple attempts, some users will be willing to try it even at a high price.