Shengshu Technology Releases Vidu 1.5 Video Generation Model, Overcoming the "Multi-Subject Consistency" Problem

Author: Eve Cole  Update Time: 2025-02-12 03:32:01

More than a hundred days after Vidu's launch, Shengshu Technology has released Vidu 1.5, which it describes as a world-leading breakthrough in understanding diverse inputs and solving the "consistency" problem. According to the company, the release marks the entry of visual models into the "context" era and lays a foundation for faster progress toward artificial general intelligence (AGI). Vidu 1.5 is presented not as a simple feature upgrade but as a sign of emergent intelligence in visual models: it has stronger in-context learning and memory management, and it can generate a high-quality video in less than 30 seconds.

More than 100 days after Vidu's launch, Shengshu Technology announced Vidu 1.5, a new version that it says reaches a world-leading level, particularly in understanding diverse inputs and in solving the "consistency" problem.

The launch of Vidu 1.5 marks the entry of visual models into a new "context" era and accelerates the arrival of artificial general intelligence (AGI). From its global launch, Vidu could generate characters consistently: by locking onto a character's facial features, it addressed a key pain point in video generation. In September, Vidu was the first in the world to release a "subject consistency" feature, which extended facial consistency to whole-body consistency and widened the scope to any subject, including animals, objects, and virtual characters. Vidu's technical breakthroughs fall into three areas: precise control of complex subjects, natural consistency of facial features and dynamic expressions, and multi-subject consistency.

Vidu 1.5 shows a new "emergence of intelligence" in visual models, demonstrating strong in-context learning. This means the visual model can not only understand and imagine, but also manage memory during generation. Vidu 1.5 keeps its industry-leading generation efficiency, producing a video in less than 30 seconds. Vidu follows a general-purpose design philosophy consistent with that of LLMs (large language models): it casts every task as a problem with visual input and visual output, uses a single Transformer to model variable-length inputs and outputs in a unified way, and gains intelligence by compressing video data.
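To make the idea of feeding variable-length input to a single Transformer more concrete, here is a minimal sketch of one common approach: concatenating any number of reference inputs into one token sequence, then padding a batch and building an attention mask. This is an illustration only, not Shengshu's implementation, which the article does not describe. The function names, the padding id, and the token ids are all hypothetical.

```python
from typing import List, Tuple

PAD = 0  # hypothetical padding token id, chosen for this sketch only


def pack_context(references: List[List[int]], prompt: List[int]) -> List[int]:
    """Concatenate any number of reference-token lists and a prompt into one sequence."""
    sequence: List[int] = []
    for ref in references:
        sequence.extend(ref)
    sequence.extend(prompt)
    return sequence


def pad_batch(sequences: List[List[int]]) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad sequences to a common length; mask is 1 for real tokens, 0 for padding."""
    max_len = max(len(s) for s in sequences)
    tokens = [s + [PAD] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return tokens, mask


# Two requests with different numbers of reference subjects (token ids are made up).
one_subject = pack_context([[11, 12, 13]], prompt=[7, 8])
three_subjects = pack_context([[11, 12], [21, 22, 23], [31]], prompt=[7, 8, 9])
tokens, mask = pad_batch([one_subject, three_subjects])
```

In this sketch, a request with one reference subject and a request with three end up in the same batch, and the mask tells the model which positions to ignore. The same packing function handles any number of references, which is the sense in which one model can serve inputs of different lengths.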

The launch of Vidu 1.5 not only improves the controllability of the video model, but also enables consistent generation across multiple angles, multiple subjects, and multiple elements from flexible and varied inputs. This marks the emergence of visual intelligence and accelerates the arrival of AGI. Vidu is no longer just a high-quality, efficient video generator; it can also incorporate contextual information and memory during generation. This is a "big leap" for visual-modality intelligence: visual models will gain stronger cognitive abilities and become an important piece of the AGI puzzle.

Try it at: www.vidu.studio

The release of Vidu 1.5 is both a technological breakthrough and a milestone for visual intelligence. It opens new possibilities for the future development of AGI, and its applications and innovations in more fields are worth looking forward to. Visit the address above to experience visual intelligence for yourself!