On December 19, Google released VideoPoet, its latest video generation model. VideoPoet can generate videos up to 10 seconds long and automatically produce a matching soundtrack and sound effects based on the video content.
Unlike most earlier video generators, VideoPoet is built on a large language model rather than a diffusion model. This allows it to combine multiple capabilities, such as text-to-video generation, video inpainting, and video stylization, in a single model, making it considerably more flexible and efficient to use. VideoPoet also extends videos autoregressively: it repeatedly predicts the frames that follow, conditioned on the clip's last frame, creating the impression that a video can be extended indefinitely.
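The extension mechanism described above can be sketched as a simple autoregressive loop. Everything below is an illustrative assumption, not VideoPoet's actual (unreleased) API: the model is stood in by a trivial function so that the loop structure itself is runnable.

```python
# Hypothetical sketch of autoregressive video extension. Neither function
# name comes from VideoPoet; a trivial stand-in replaces the real model,
# and each "frame" is just an integer index.

def predict_next_clip(last_frame, n_frames=8):
    """Stand-in for the model: predict a short clip conditioned on the
    current last frame."""
    return [last_frame + i + 1 for i in range(n_frames)]

def extend_video(video, target_len):
    """Repeatedly predict new frames, each time conditioning on the
    current last frame, until the video reaches target_len frames."""
    video = list(video)
    while len(video) < target_len:
        clip = predict_next_clip(video[-1])
        video.extend(clip)
    return video[:target_len]

extended = extend_video([0, 1, 2], 20)
print(len(extended))   # 20
print(extended[:6])    # [0, 1, 2, 3, 4, 5]
```

Because each prediction step only needs the tail of the existing video, the same loop can in principle run for as long as the user wants, which is what makes the video feel infinitely extendable.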
VideoPoet marks a significant step forward in video generation technology. Its broad capabilities and ease of use position it for wide adoption across many fields, offering users a richer and more convenient video creation experience. We look forward to the new features and applications VideoPoet will bring in the future.