Kuaishou has launched CineMaster, a new text-to-video generation framework with 3D perception capabilities that has been dubbed the video version of ControlNet, offering users unprecedented creative freedom. It lets users precisely control the placement of objects and the movement of the camera in a generated video by combining text prompts with control signals such as depth maps, camera trajectories, and object labels, enabling fine-grained control over the generated content. This pushes AI video generation to a new level and promises to substantially improve both the efficiency and the expressive range of video creation.
The core advantage of CineMaster is its strong controllability. Beyond generating videos from conventional text prompts, users can fine-tune the result with control signals such as depth maps, camera trajectories, and object labels, producing more deliberate and personalized work. Kuaishou has also built a pipeline for extracting 3D bounding boxes and camera trajectories from large-scale video data, providing strong data support for training CineMaster. The CineMaster project page is now live; interested users can visit cinemaster-dev.github.io/.
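To make the control signals described above concrete, the sketch below models what a per-frame camera trajectory and 3D object bounding boxes might look like as inputs to such a system. All class and field names here are illustrative assumptions for explanation only; they are not CineMaster's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class BoundingBox3D:
    """An object's 3D box at one frame (hypothetical representation):
    labeled center (x, y, z), size (w, h, d), and yaw angle."""
    label: str
    center: tuple
    size: tuple
    yaw: float = 0.0

@dataclass
class CameraPose:
    """Camera position and look-at target for a single frame."""
    position: tuple
    look_at: tuple

@dataclass
class ControlSignals:
    """Bundle of conditioning inputs: a text prompt, a per-frame
    camera track, and per-frame object boxes (frame index -> boxes)."""
    prompt: str
    camera_track: list = field(default_factory=list)
    object_boxes: dict = field(default_factory=dict)

def linear_camera_track(start, end, num_frames):
    """Interpolate a straight dolly move between two camera positions,
    one pose per generated frame."""
    track = []
    for i in range(num_frames):
        t = i / max(num_frames - 1, 1)
        pos = tuple(s + t * (e - s) for s, e in zip(start, end))
        track.append(CameraPose(position=pos, look_at=(0.0, 0.0, 0.0)))
    return track

# Example: a 16-frame push-in toward a car placed via a 3D bounding box.
signals = ControlSignals(
    prompt="a red car driving down a rainy street at dusk",
    camera_track=linear_camera_track((0.0, 1.5, -5.0), (0.0, 1.5, -2.0), 16),
)
signals.object_boxes[0] = [BoundingBox3D("car", (0.0, 0.5, 3.0), (1.8, 1.4, 4.2))]
print(len(signals.camera_track))
```

A structure like this also suggests why Kuaishou's extraction pipeline matters: training a model to follow such signals requires paired videos with recovered camera trajectories and 3D boxes at scale.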
The emergence of CineMaster suggests that AI video generation is about to enter a new wave of development. Its powerful controls and accessible workflow should give users a richer creative experience and push video content creation further forward. We look forward to the surprises and innovations CineMaster brings in the future.