Meta is actively advancing the development of its large language model, Llama, with the goal of creating autonomous machine intelligence that can genuinely fit into daily life and reason robustly. This article explores Meta's improvement strategies, training methods, and future plans for the Llama models, including the highly anticipated progress of Llama 4. Meta is committed to building Llama into an AI system that can efficiently handle complex tasks and adapt to dynamically changing environments, which will have a profound impact on the field of artificial intelligence.
Recently, Meta's chief AI scientist Yann LeCun said that autonomous machine intelligence (AMI) can genuinely help with people's daily lives. Meta is working to improve the reasoning capabilities of its Llama models, hoping to compete with top models such as GPT-4o.
Meta's vice president Manohar Paluri mentioned that they are exploring how to make the Llama model not only "plan" but also evaluate its decisions in real time and adjust when conditions change. This iterative approach combines chain-of-thought techniques to achieve autonomous machine intelligence that effectively integrates perception, reasoning, and planning.
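To make that iterative approach concrete, here is a minimal, hypothetical sketch of a plan-evaluate-adjust loop. Every name and rule in it (make_plan, observe, the sub-goals) is an illustrative stand-in; in Meta's description, the planning and evaluation would be carried out by the Llama model itself through chain-of-thought reasoning, not hand-written logic.

```python
# A minimal, hypothetical sketch of the "plan, evaluate, adjust" loop described above.
# Everything here is a toy stand-in for what would be model-driven reasoning.

def make_plan(subgoals, conditions):
    # Toy planner: one step per remaining sub-goal, tagged with current conditions.
    return [f"{g} (planned under: {conditions})" for g in subgoals]

def observe(step_index):
    # Toy environment: conditions change once, just before the second step.
    return "changed conditions" if step_index == 1 else None

def run(subgoals):
    conditions = "initial conditions"
    plan = make_plan(subgoals, conditions)
    i = 0
    while i < len(plan):
        update = observe(i)                    # evaluate the situation before acting
        if update is not None and update != conditions:
            conditions = update
            # Re-plan only the remaining steps under the new conditions.
            plan = plan[:i] + make_plan(subgoals[i:], conditions)
            print("re-planned from step", i)
        print("executing:", plan[i])
        i += 1

run(["gather information", "choose an option", "carry it out"])
```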
Furthermore, Paluri emphasized that for AI reasoning in "non-verifiable domains," models need to break complex tasks down into manageable steps and adapt dynamically. For example, planning a trip requires not only booking a flight but also handling real-time changes such as weather, which may force the route to be re-planned. Meta also recently introduced the Dualformer model, which can dynamically switch between the fast, intuitive and slow, deliberate modes of thinking seen in human cognition, allowing it to solve complex tasks effectively.
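The fast/slow switching idea can be pictured with a toy routing example in the spirit of the trip-planning scenario above. This is not the actual Dualformer mechanism (Dualformer is a trained transformer, not a hand-coded rule); the difficulty estimate, the "intuitive" heuristic, and the exhaustive search below are all hypothetical stand-ins used only to show the mode-switching pattern.

```python
# Toy illustration of switching between a fast "intuitive" mode and a slow,
# deliberate mode, loosely inspired by Dualformer's fast/slow thinking.
import itertools

def estimate_difficulty(stops):
    return len(stops)                          # toy proxy: more stops = harder task

def fast_route(stops):
    return sorted(stops)                       # cheap "intuition": may be suboptimal

def slow_route(stops, dist):
    # Deliberate reasoning: exhaustively try every ordering and keep the shortest.
    return list(min(itertools.permutations(stops),
                    key=lambda order: sum(dist[a][b] for a, b in zip(order, order[1:]))))

def plan_route(stops, dist, threshold=3):
    if estimate_difficulty(stops) <= threshold:
        return fast_route(stops)               # fast mode for easy cases
    return slow_route(stops, dist)             # slow mode for hard cases

dist = {"A": {"B": 1, "C": 5}, "B": {"A": 1, "C": 2}, "C": {"A": 5, "B": 2}}
print(plan_route(["A", "B"], dist))                     # easy: fast, intuitive answer
print(plan_route(["C", "A", "B"], dist, threshold=2))   # harder: slow, exhaustive search
```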
Regarding the training of the Llama models, Meta uses self-supervised learning (SSL), which helps the model learn broad data representations across multiple domains and gives it flexibility. Meanwhile, reinforcement learning from human feedback (RLHF) refines the model's performance on specific tasks. The combination of the two makes the Llama models effective at generating high-quality synthetic data, especially in domains where language data is scarce.
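As a rough mental model of that two-stage recipe, the sketch below pairs a self-supervised next-token objective with a reward-weighted fine-tuning step. It is a heavily simplified, self-contained toy in PyTorch, not Meta's pipeline: the tiny GRU model, the random "corpus," and the placeholder reward function are all assumptions, and the REINFORCE-style update merely stands in for the preference-model-plus-PPO setup typically used in RLHF.

```python
# Toy two-stage sketch: self-supervised pretraining, then a simplified RLHF-like step.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM = 100, 32

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)
    def forward(self, tokens):                 # tokens: (batch, seq)
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                    # logits: (batch, seq, vocab)

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stage 1: self-supervised learning -- predict the next token from raw sequences,
# no human labels needed. Random tokens stand in for a large multi-domain corpus.
for _ in range(10):
    batch = torch.randint(0, VOCAB, (8, 16))
    logits = model(batch[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: a stand-in for RLHF -- sample token choices, score them with a reward
# (here a trivial placeholder; in practice a learned model of human preferences),
# and nudge the policy toward higher-reward outputs with a policy-gradient update.
def toy_reward(seq):                           # placeholder for a preference model
    return (seq % 2 == 0).float().mean()       # "prefers" even tokens, purely illustrative

for _ in range(10):
    prompt = torch.randint(0, VOCAB, (4, 8))
    dist = torch.distributions.Categorical(logits=model(prompt))
    sample = dist.sample()                     # sampled next-token choices per position
    reward = torch.stack([toy_reward(s) for s in sample])
    logp = dist.log_prob(sample).sum(dim=1)
    loss = -(reward * logp).mean()             # reward-weighted (REINFORCE-style) loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```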
Regarding the release of Llama 4, Meta CEO Mark Zuckerberg revealed in an interview that the team has started pre-training Llama 4. He also mentioned that Meta is building a computing cluster and data infrastructure for Llama 4, which is expected to be a major improvement. Paluri joked that if Zuckerberg were asked when it would be released, he might say "today," highlighting the company's rapid pace in AI development.
Meta hopes to continue launching new Llama versions in the coming months to keep improving its AI capabilities. With such frequent updates, developers can expect significant upgrades in each release.
Key points:
- Meta's chief AI scientist believes that autonomous machine intelligence will help improve daily life.
- The Llama models will combine self-supervised learning and reinforcement learning from human feedback to improve multi-domain reasoning capabilities.
- Pre-training for Llama 4 has begun, and the model is expected to launch around 2025.
All in all, Meta's continued investment and innovation in the Llama models demonstrate its ambitions in the field of artificial intelligence. The future development of Llama is worth looking forward to, and its continuously improving capabilities will profoundly influence the way people live and work.