Meta Platforms has released new streamlined versions of its Llama models, Llama 3.2 1B and 3B, which enable large language models to run stably on ordinary smartphones and tablets. The editor of Downcodes explains this breakthrough and its significance in detail.
Meta Platforms today released new streamlined versions of its Llama model, the Llama 3.2 1B and 3B, which for the first time allow large language models to run stably on ordinary smartphones and tablets. By combining quantization-aware training with optimization algorithms, the new versions cut file size by 56%, reduce running-memory requirements by 41%, and process text up to four times faster than the original models while maintaining output quality, and they can continuously process text of up to 8,000 characters.
When tested on Android phones, Meta's compressed AI models (built with the SpinQuant and QLoRA techniques) were significantly faster and more efficient than the standard versions: the smaller models ran up to four times faster while using less memory.
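The savings come from quantization: storing model weights in low-bit integers instead of 32-bit floats. The toy sketch below illustrates the general idea with symmetric 4-bit quantization of a small weight list; it is an illustrative assumption only, not Meta's actual SpinQuant or QLoRA implementation, which operates on full tensors with more sophisticated scaling.

```python
# Toy sketch of symmetric 4-bit weight quantization (illustrative only;
# not Meta's production scheme). Signed 4-bit values span [-7, 7].

def quantize_int4(weights):
    """Map float weights to integers in [-7, 7] plus a per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]

weights = [0.82, -0.31, 0.05, -0.67, 0.44]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)

# Each 4-bit value needs 1/8 the storage of a 32-bit float, which is why
# quantized checkpoints shrink so dramatically, at the cost of a small
# per-weight rounding error.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(max_err, 4))
```

In practice the rounding error is kept small by quantizing per channel or per block and, as in quantization-aware training, by simulating the rounding during fine-tuning so the model learns to compensate.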
In tests on a OnePlus 12 phone, the compressed versions delivered performance comparable to the standard versions while running far more efficiently, effectively addressing the long-standing shortage of computing power on mobile devices. Meta has chosen an open, cooperative market strategy, working closely with mainstream mobile processor makers such as Qualcomm and MediaTek. The new versions are distributed simultaneously through the official Llama website and the Hugging Face platform, giving developers convenient access.
This strategy contrasts sharply with that of other industry giants. While Google and Apple integrate new AI technologies deeply into their own operating systems, Meta's open route gives developers more room to innovate. The release marks a shift in AI processing from centralized servers to personal devices: local processing both better protects user privacy and delivers faster responses.
This technological breakthrough could trigger changes as significant as those of the personal-computer era, although it still faces challenges such as device performance requirements and developers' platform choices. As mobile hardware continues to improve, the advantages of on-device processing will become increasingly apparent. Through open cooperation, Meta hopes to push the industry toward greater efficiency and safety and to open new avenues for mobile application development.
The streamlined Llama models bring new possibilities to mobile AI applications, and Meta's open-cooperation strategy offers a lesson for the rest of the industry. As the technology matures and spreads, on-device AI processing is likely to become a mainstream trend.