Meta Platforms today released streamlined versions of its Llama models, the Llama 3.2 1B and 3B, achieving for the first time in the industry the stable operation of large language models on ordinary smartphones and tablets. By combining quantization training techniques with optimization algorithms, the new versions cut file size by 56% and runtime memory by 41% while maintaining the original processing quality, run up to four times faster than the originals, and can process 8,000 characters of text in a single pass. This breakthrough marks a shift in data processing from centralized servers to personal devices and opens new paths for future mobile application development.
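For a rough sense of where savings on that scale come from, here is a back-of-envelope sketch in Python. The parameter count, byte widths, and the mixed-precision split are illustrative assumptions rather than Meta's published recipe; they simply show how quantizing most weights from 16-bit to 4-bit, while keeping a fraction at higher precision, can land near the reported size reduction.

```python
# Back-of-envelope estimate of model size before and after weight quantization.
# All numbers below are illustrative assumptions, not Meta's published figures.

PARAMS = 1.24e9              # assumed parameter count for a "1B"-class model
BYTES_BF16 = 2.0             # 16-bit weights: 2 bytes per parameter
BYTES_INT4 = 0.5             # 4-bit weights: 0.5 bytes per parameter
HIGH_PRECISION_SHARE = 0.25  # assumed fraction of weights kept at 16-bit
                             # (e.g. embeddings and other sensitive layers)

full_size = PARAMS * BYTES_BF16
quant_size = PARAMS * (
    HIGH_PRECISION_SHARE * BYTES_BF16
    + (1 - HIGH_PRECISION_SHARE) * BYTES_INT4
)

print(f"full precision : {full_size / 1e9:.2f} GB")
print(f"quantized      : {quant_size / 1e9:.2f} GB")
print(f"size reduction : {1 - quant_size / full_size:.0%}")
```

With these particular assumptions the arithmetic lands near the 56% figure in the announcement; the real scheme makes per-layer precision choices, so the actual split will differ.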
When tested on Android phones, Meta's compressed models, built with the SpinQuant and QLoRA quantization approaches, showed significant speed and efficiency gains over the standard versions: the smaller models ran roughly four times faster, while memory usage fell by about 41%.
In hands-on testing on the OnePlus 12, the compressed versions delivered quality comparable to the standard models while running far more efficiently, easing the long-standing problem of limited computing power on mobile devices. Meta has chosen an open, partnership-driven market strategy, working closely with mainstream mobile processor manufacturers such as Qualcomm and MediaTek. The new versions are being released simultaneously through the official Llama website and the Hugging Face platform, giving developers convenient access.
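As an illustration of that distribution path, the sketch below uses the huggingface_hub client to fetch a quantized checkpoint. The repository ID is an assumed example rather than a confirmed name, and access to Meta's Llama repositories requires accepting the license and authenticating, which is omitted here.

```python
# Minimal sketch: pulling a quantized Llama 3.2 checkpoint from Hugging Face.
# The repo ID below is an assumed example; check the official Llama site or the
# meta-llama organization on Hugging Face for the actual model names.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8",  # assumed ID
    # token="hf_...",  # gated repos need an access token after license approval
)
print("model files downloaded to:", local_dir)
```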
This open strategy stands in sharp contrast to other industry giants: where Google and Apple are integrating new AI capabilities deeply into their own operating systems, Meta's open route gives developers more room to innovate. The release also marks a shift in data processing from centralized servers to personal devices; local processing not only better protects user privacy but also delivers a faster response experience.
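To show what that local processing looks like to a developer, here is a minimal sketch using the open-source llama-cpp-python bindings to run a quantized model entirely on local hardware. The model file name and generation settings are illustrative assumptions, and Meta's own mobile deployments go through its partner toolchains rather than this library.

```python
# Minimal on-device inference sketch using llama-cpp-python
# (pip install llama-cpp-python). The model path is an assumed placeholder
# for a locally stored quantized checkpoint.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3.2-1b-instruct-q4.gguf",  # assumed local quantized file
    n_ctx=8192,     # long context window, matching the release's long-text claim
    n_threads=4,    # a handful of local CPU cores, no server round-trip
)

output = llm(
    "Summarize why on-device inference helps protect user privacy.",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```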
This technological breakthrough could trigger changes as sweeping as the spread of the personal computer, although challenges remain around device performance requirements and developers' choice of platform. As mobile hardware continues to improve, the advantages of localized processing will become more apparent. Through open cooperation, Meta hopes to push the whole industry toward more efficient and secure development and to open new paths for future mobile applications.
Meta's open-cooperation strategy sets it apart from other technology giants, gives developers broader room to innovate, and points toward a future of localized AI processing. If this breakthrough changes how mobile devices are used as profoundly as the PC era did, its further development will be worth watching.