Downcodes editor reports: Google released a Japanese version of its Gemma AI model at Gemma Developer Day in Tokyo. The compact model, with only 2 billion parameters, delivers performance comparable to GPT-3.5 and can run on mobile devices. It performs well in Japanese while avoiding the "catastrophic forgetting" problem that often plagues small models during multilingual fine-tuning, retaining its English ability. Google has also opened up the model weights, training materials, and examples, and launched a competition with prizes of up to $150,000 to encourage developers to adapt Gemma to more local languages and promote global communication.
The newly released Gemma model handles Japanese well while maintaining its English capability. This is particularly important for small models: when fine-tuned on a new language, they can suffer "catastrophic forgetting," in which newly learned knowledge overwrites previously learned information. Gemma overcame this problem and demonstrates strong language processing in both languages.
Also worth noting, Google immediately released the model's weights, training materials, and examples through platforms such as Kaggle and Hugging Face to help developers get started faster. This means developers can easily run the model locally, which opens up more possibilities, especially for edge computing applications.
To encourage more international developers, Google has also launched a competition called "Unlocking Global Communication with Gemma" with prizes of up to US$150,000. The program is designed to help developers adapt Gemma models to local languages. Projects are already underway in Arabic, Vietnamese, and Zulu. In India, developers are working on the "Navarasa" project, which plans to optimize the model to support 12 Indian languages, while another team is working on fine-tuning support for Korean dialects.
The Gemma 2 series of models aims to achieve higher performance with fewer parameters. Gemma 2 holds its own against comparable models from other companies such as Meta; in some cases, the 2-billion-parameter Gemma 2 can even surpass models with 70 billion parameters, such as LLaMA-2. Developers and researchers can obtain the Gemma-2-2B model and other Gemma models through the free tiers of Hugging Face, Google AI Studio, and Google Colab, and they are also available in the Vertex AI Model Garden. A minimal local-inference sketch follows the links below.
Google AI Studio: https://aistudio.google.com/app/prompts/new_chat?model=gemma-2-2b-it
Hugging Face: https://huggingface.co/google
Google Colab: https://ai.google.dev/gemma/docs/keras_inference?hl=de
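For developers who want to try the model right away, here is a minimal sketch of running the instruction-tuned Gemma-2-2B locally with the Hugging Face transformers library. It assumes transformers and PyTorch are installed and that you have accepted the Gemma license on Hugging Face and authenticated (the weights are gated); the Japanese prompt is just an illustration of the release's headline capability.

```python
# Minimal sketch: run Gemma-2-2B-it locally via Hugging Face transformers.
# Assumptions: `pip install transformers torch`, the Gemma license accepted
# on Hugging Face, and an authenticated session (huggingface-cli login).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# A Japanese prompt ("Please briefly introduce yourself."), since this
# release highlights Japanese support.
messages = [{"role": "user", "content": "自己紹介を簡単にしてください。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Without a GPU the same code runs on CPU, only more slowly, which is consistent with the model's lightweight, on-device positioning.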
All in all, the release of the Gemma model gives developers a powerful tool and opens new possibilities for artificial intelligence in multilingual applications. Its lightweight design and open sharing of resources will help popularize AI technology, and its future development and applications are worth looking forward to.