Stability AI has announced Stable Diffusion 3.5, a family of three text-to-image generation models. The lineup targets researchers, enterprise customers and enthusiasts alike, offering different parameter counts and performance profiles to match different hardware and use cases. The update is meant to address the shortcomings of the earlier Stable Diffusion 3 release and to compete with other leading AI image generation tools.
Stability AI recently launched Stable Diffusion 3.5, its latest deep learning text-to-image generation model. The release comprises three improved open models aimed at different groups of users, including researchers, enterprise customers and enthusiasts.
Of the three, Stable Diffusion 3.5 Large is the most powerful model in the series, with 8.1 billion parameters. Its image quality and close adherence to prompts make it well suited to professional use, and it generates images at resolutions of up to 1 megapixel.
Stable Diffusion 3.5 Large Turbo is a distilled version of Stable Diffusion 3.5 Large. It still produces high-quality images but is considerably faster than Stable Diffusion 3.5 Large, completing generation in just 4 steps, which makes it a good fit for users who need quick results.
The third model, Stable Diffusion 3.5 Medium, has 2.5 billion parameters. It uses the improved MMDiT-X architecture and updated training methods, and is designed to run "out of the box" on consumer hardware. It balances image quality with ease of customization and generates images from 0.25 to 2 megapixels.
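To make the megapixel figures above concrete, the following minimal sketch converts a target megapixel count and aspect ratio into pixel dimensions. The function name dimensions_for and the choice to snap each side to a multiple of 64 are assumptions made for illustration, not requirements stated by Stability AI.

```python
import math


def dimensions_for(megapixels: float, aspect_ratio: float = 1.0,
                   multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) close to the requested megapixel count.

    aspect_ratio is width / height. Each side is snapped to the nearest
    multiple of `multiple` (an assumed, hardware-friendly step size).
    """
    pixels = megapixels * 1_000_000
    height = math.sqrt(pixels / aspect_ratio)
    width = height * aspect_ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one step.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(dimensions_for(0.25))        # lower end of the Medium range
print(dimensions_for(1.0))         # the 1 megapixel figure quoted for Large
print(dimensions_for(2.0))         # upper end of the Medium range
print(dimensions_for(1.0, 16 / 9)) # a widescreen image at about 1 megapixel
```

Under these assumptions, 0.25 megapixels corresponds to roughly 512 x 512 pixels, 1 megapixel to 1024 x 1024, and 2 megapixels to about 1408 x 1408.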
The release follows Stable Diffusion 3 Medium, launched in June, which fell short of expectations. Stability AI decided to respond with a more substantial update. The company says it hopes the update will restore its competitiveness against rivals such as OpenAI's DALL-E and Midjourney.
A key technical change in the new models is the introduction of query-key normalization. It makes the models easier to customize and more responsive to prompts: specific prompts yield more consistent results, while broader prompts yield a wider variety of interpretations.
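The article does not describe how query-key normalization is implemented. As a rough illustration of the general idea, the sketch below normalizes each query and key vector before the attention dot product, so the attention scores stay bounded regardless of how large the raw vectors are. The use of RMS normalization, the omission of a learned gain, and all function names are assumptions chosen for illustration; this is not Stability AI's code.

```python
import math


def rms_normalize(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is about 1 (no learned gain)."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_logits(queries, keys, qk_norm=True):
    """Scaled dot-product scores, optionally with query-key normalization."""
    if qk_norm:
        queries = [rms_normalize(q) for q in queries]
        keys = [rms_normalize(k) for k in keys]
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [[scale * sum(a * b for a, b in zip(q, k)) for k in keys]
            for q in queries]


def attention(queries, keys, values, qk_norm=True):
    """Single-head attention: a softmax-weighted average of the value vectors."""
    weights = [softmax(row) for row in attention_logits(queries, keys, qk_norm)]
    dim = len(values[0])
    return [[sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
            for row in weights]


# Deliberately large raw activations to show the effect of normalization.
q = [[1000.0, -2000.0, 500.0, 0.0]]
k = [[3000.0, 100.0, -50.0, 20.0], [1.0, 2.0, 3.0, 4.0]]
v = [[1.0, 0.0], [0.0, 1.0]]

print(attention_logits(q, k, qk_norm=False))  # scores grow with input magnitude
print(attention_logits(q, k, qk_norm=True))   # scores stay within sqrt(dim)
print(attention(q, k, v))
```

With normalization, each score is bounded in magnitude by the square root of the vector dimension, which keeps the softmax from saturating when activations grow during training or fine-tuning.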
The Stable Diffusion 3.5 models are released under Stability AI's community license, which permits free non-commercial use. Entities with annual revenue below US$1 million may also use the models commercially at no cost, while those above that threshold must apply for an enterprise license.
Model weights for self-hosting are available on Hugging Face, and the models can also be accessed through Stability AI's API. ControlNets, which offer advanced image customization options, are expected to launch in the coming days.
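For readers who want to self-host, the sketch below shows one way to load the weights with the Hugging Face diffusers library and its StableDiffusion3Pipeline class. Only the 4-step figure for Large Turbo comes from this article; the other step counts and guidance scales are assumed starting points, and the generate helper and MODEL_SETTINGS table are hypothetical names. It also assumes a CUDA-capable GPU, recent versions of torch and diffusers, and that any license prompt on the Hugging Face model page has been accepted.

```python
# Illustrative per-model settings. Only the 4-step value for Large Turbo is
# taken from the article; the remaining numbers are assumed starting points.
MODEL_SETTINGS = {
    "stabilityai/stable-diffusion-3.5-large": {
        "num_inference_steps": 28, "guidance_scale": 3.5},
    "stabilityai/stable-diffusion-3.5-large-turbo": {
        "num_inference_steps": 4, "guidance_scale": 0.0},
    "stabilityai/stable-diffusion-3.5-medium": {
        "num_inference_steps": 40, "guidance_scale": 4.5},
}


def generate(prompt: str,
             repo_id: str = "stabilityai/stable-diffusion-3.5-large-turbo",
             device: str = "cuda"):
    """Load the chosen model (downloading it if needed) and return a PIL image."""
    # Imported lazily so the settings table can be inspected without a GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        repo_id, torch_dtype=torch.bfloat16)
    pipe = pipe.to(device)
    return pipe(prompt, **MODEL_SETTINGS[repo_id]).images[0]
```

A call such as generate("a red fox in fresh snow").save("fox.png") would then write the result to disk. Loading the pipeline once and reusing it across prompts would be the better design for anything beyond a quick trial.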
Official site:
https://stability.ai/stable-image
Hugging Face pages for the three versions:
https://huggingface.co/stabilityai/stable-diffusion-3.5-large
https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo
https://huggingface.co/stabilityai/stable-diffusion-3.5-medium
Key points:
Stable Diffusion 3.5 ships in three model versions to meet different user needs.
Stable Diffusion 3.5 Large Turbo generates images faster, which suits rapid creation.
The new models introduce query-key normalization, which improves customization and responsiveness to prompts.
In short, the Stable Diffusion 3.5 series is a major upgrade to Stability AI's text-to-image lineup. Its multi-version strategy and technical improvements are expected to improve the user experience and help the company hold its ground in a fiercely competitive market. Visit the links above to try the new models.