Microsoft recently released three powerful Phi-3.5 AI models, namely Phi-3.5-mini-instruct, Phi-3.5-MoE-instruct, and Phi-3.5-vision-instruct, optimized respectively for lightweight reasoning, mixture-of-experts architectures, and multi-modal tasks. The release marks significant progress for Microsoft in multi-lingual and multi-modal artificial intelligence, further solidifying its leading position in the field. All three models are released under the MIT open-source license, giving developers a wide range of application possibilities.
Microsoft announced the release of three new Phi-3.5 models, further consolidating its leading position in the development of multi-lingual and multi-modal artificial intelligence. The three new models are Phi-3.5-mini-instruct, Phi-3.5-MoE-instruct, and Phi-3.5-vision-instruct, each targeting different application scenarios.
Phi-3.5-mini-instruct is a lightweight AI model with 3.8 billion parameters, making it well suited to environments with limited computing power. It supports a context length of 128K tokens and is specifically optimized for instruction following, which makes it a good fit for tasks such as code generation, mathematical problem solving, and logical reasoning. Despite its small size, the model is impressively competitive on multi-lingual and multi-turn dialogue tasks, surpassing other models in its class.
Model page: https://huggingface.co/microsoft/Phi-3.5-mini-instruct
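For developers who want to try it out, the sketch below shows one way to run Phi-3.5-mini-instruct with the Hugging Face transformers library. The prompt, dtype, and device settings are illustrative assumptions rather than part of Microsoft's announcement; adjust them for your hardware.

```python
# Minimal sketch: chat-style generation with Phi-3.5-mini-instruct via
# Hugging Face transformers. The prompt and the dtype/device settings are
# illustrative assumptions, not prescribed by Microsoft.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bfloat16 support
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a number is prime."}
]
# apply_chat_template formats the conversation the way the model expects.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```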
Phi-3.5-MoE-instruct is a “mixture-of-experts” model that combines multiple specialized sub-networks (“experts”), each focused on a particular kind of task. It has 41.9 billion parameters in total, supports a context length of 128K tokens, and delivers strong performance across a variety of reasoning tasks. The model does very well on code, mathematics, and multi-lingual understanding, even surpassing larger models on some benchmarks, for example beating OpenAI's GPT-4o mini on MMLU (Massive Multitask Language Understanding).
Model page: https://huggingface.co/microsoft/Phi-3.5-MoE-instruct
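To make the mixture-of-experts idea concrete, here is a toy sketch of top-k expert routing. This is not Microsoft's implementation; the dimensions and the 16-expert, top-2 configuration are assumptions for illustration. The key point is that a small gating network scores the experts for each token and only the top-scoring experts run, which is how a model with 41.9 billion total parameters can keep the compute per token low.

```python
# Toy sketch of mixture-of-experts routing (not Microsoft's implementation).
# A gating network picks the top-k experts per token; only those experts run.
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, dim=64, num_experts=16, top_k=2):  # assumed configuration
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # router: scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        scores = self.gate(x)                              # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)     # top-k experts per token
        weights = weights.softmax(dim=-1)                  # normalize routing weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e                      # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```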
Phi-3.5-vision-instruct is an advanced multi-modal AI model that integrates text and image processing, suited to tasks such as image understanding, optical character recognition, chart and table analysis, and video summarization. It also supports a 128K-token context length and can handle complex multi-frame vision tasks.
Model page: https://huggingface.co/microsoft/Phi-3.5-vision-instruct
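As an illustration of multi-modal usage, the sketch below asks Phi-3.5-vision-instruct to describe an image via transformers. The image URL and question are placeholders and the generation settings are assumptions; the <|image_1|> placeholder follows the prompt convention documented on the model page.

```python
# Hedged sketch: image understanding with Phi-3.5-vision-instruct.
# The image URL and question are placeholders; settings are assumptions.
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

# Images are referenced in the prompt with numbered <|image_N|> placeholders.
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)
prompt = "<|user|>\n<|image_1|>\nSummarize what this chart shows.<|end|>\n<|assistant|>\n"

inputs = processor(prompt, [image], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
# Decode only the generated answer, skipping the prompt tokens.
print(processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```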
To train these three models, Microsoft carried out large-scale data processing. Phi-3.5-mini-instruct was trained on 3.4 trillion tokens over 10 days using 512 H100-80G GPUs; Phi-3.5-vision-instruct was trained on 500 billion tokens over 6 days; and Phi-3.5-MoE-instruct was trained on 4.9 trillion tokens over 23 days.
It is worth noting that all three Phi-3.5 models are released under the MIT open-source license, so developers can freely use, modify, and distribute them. This not only reflects Microsoft's support for the open-source community, but also lets more developers integrate cutting-edge AI capabilities into their own applications.
Highlights:
Microsoft has launched three new AI models targeting lightweight reasoning, mixture-of-experts, and multi-modal tasks.
Phi-3.5-MoE outperforms GPT-4o mini in benchmark tests.
All three models are released under the MIT open-source license, and developers can freely use and modify them.
All in all, the three Phi-3.5 models released by Microsoft, with their strong performance, broad range of application scenarios, and open licensing, will undoubtedly have a profound impact on the field of artificial intelligence. They give developers and researchers powerful tools and point to new directions for future AI development.