Open-Sora Plan v1.2 is here! At the core of this update is a new 3D full attention architecture, which changes how the model represents the physical world and takes it from two-dimensional to three-dimensional modeling. Along with this better grasp of the physical world, text-to-video generation has been significantly enhanced: the generated videos are clearer and more consistent, the space and time dimensions are handled better, and inference is significantly faster. Let us take a look at the detailed updates in Open-Sora Plan v1.2.
Open-Sora Plan has been upgraded again! The latest version, Open-Sora Plan v1.2, introduces a new 3D full attention architecture that improves the model's understanding of the physical world.
Main highlights of this update:
New 3D full attention architecture: The new architecture gives the model a qualitative leap in understanding the physical world. It no longer thinks only in two dimensions; it can now take in the three-dimensional world from every angle, with no blind spots!
Upgraded text-to-video generation: Type a piece of text, and the model presents you with a lifelike video.
Improved clarity and consistency: With the new architecture and an optimized VAE structure, videos generated by Open-Sora Plan are sharper and their content is more coherent. Say goodbye to blur!
Unified handling of space and time: The new 3D full attention architecture addresses a major limitation of the previous version, which handled the space and time dimensions in separate steps rather than simultaneously. What does this mean? Generated videos improve noticeably in both spatial quality and temporal smoothness!
Much faster inference: The optimized CausalVideoVAE structure not only improves the model's performance but also sharply increases inference speed. Efficiency fans, rejoice!
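To make the difference between a 2+1D design and 3D full attention concrete, here is a minimal Python sketch. It is an illustration only, not code from the Open-Sora Plan repository: the function names and the tiny 4-frame, 3x3 token grid are assumptions chosen for readability. It simply counts which token pairs may attend to each other when space and time are handled in separate layers versus jointly.

```python
from itertools import product


def token_grid(frames, height, width):
    """List every (t, y, x) token position in a toy video latent."""
    return list(product(range(frames), range(height), range(width)))


def spatial_allowed(q, k):
    # 2+1D spatial layer: a token only sees tokens in its own frame.
    return q[0] == k[0]


def temporal_allowed(q, k):
    # 2+1D temporal layer: a token only sees the same (y, x) spot in every frame.
    return q[1:] == k[1:]


def full_3d_allowed(q, k):
    # 3D full attention: every token sees every token in the clip.
    return True


def count_pairs(tokens, allowed):
    """Count (query, key) pairs permitted by an attention pattern."""
    return sum(1 for q in tokens for k in tokens if allowed(q, k))


def reachable_in_one_2plus1d_layer(q, k):
    """True if q can attend to k in a single spatial or temporal layer."""
    return spatial_allowed(q, k) or temporal_allowed(q, k)


if __name__ == "__main__":
    # Hypothetical toy sizes: 4 frames of 3x3 tokens (36 tokens in total).
    tokens = token_grid(frames=4, height=3, width=3)
    print("spatial pairs :", count_pairs(tokens, spatial_allowed))
    print("temporal pairs:", count_pairs(tokens, temporal_allowed))
    print("full 3D pairs :", count_pairs(tokens, full_3d_allowed))
    # A token in frame 0 and a different location in frame 1:
    print("direct link in 2+1D:", reachable_in_one_2plus1d_layer((0, 0, 0), (1, 2, 2)))
```

On this toy grid, the spatial and temporal layers cover 324 and 144 token pairs respectively, while full 3D attention covers all 1296. Under the 2+1D pattern, a token cannot directly see a different location in another frame, whereas full attention links the two in a single layer, at the cost of more computation.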
Looking back at the development history of Open-Sora Plan, the pace of progress is striking. As recently as May 2024, version v1.1.0 still used a 2+1D model architecture and served mainly for exploratory training. Now, just a few months later, it has evolved into a model that attends over all three dimensions of a video at once! At such a speed, even Darwin would exclaim that the theory of evolution needs rewriting!
The coolest thing is that the Open-Sora Plan team hides nothing! The code, data, and models are all open source, so the instructions for building it are put right in front of you. Their goal is simple: let everyone become a god of video creation! This open, sharing attitude will undoubtedly accelerate progress in AI video generation technology.
The release of Open-Sora Plan v1.2.0 marks a new era for video generation models. It not only significantly improves visual representation compression and inference efficiency, but also points the way for future development.
Project address: https://top.aibase.com/tool/open-sora-plan-v1-2
The release of Open-Sora Plan v1.2 signals that AI video generation technology has entered a new stage of development, and its open-source nature gives technological progress a strong push. We look forward to more surprises from Open-Sora Plan in the future!