Rhymes AI has released Allegro-TI2V, a text-image-to-video generation model. It supports a context length of up to 79.2K, outputs video at 720×1280 pixels, and offers two generation modes, subsequent video generation and intermediate video generation, which make video creation more efficient. The model is released under the Apache 2.0 license and is freely available to users.
Rhymes AI recently released Allegro-TI2V, a text-image-to-video generation model. It gives creative professionals a new visual storytelling tool and illustrates the potential of generative AI in creative work.
Allegro-TI2V supports context lengths of up to 79.2K, equivalent to 88 frames of video. It outputs video at 720×1280 pixels and 15 frames per second, and users can optionally interpolate the result to 30 FPS for different application scenarios. The model combines a 175-million-parameter VideoVAE with a 2.8-billion-parameter VideoDiT, which together let it follow the user's text prompt and the content of the initial image. Allegro-TI2V also supports multiple precision modes (FP32, BF16, FP16). In BF16 mode, generating a video requires only 9.3 GB of GPU memory, which lowers hardware requirements.
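The figures above can be cross-checked with a little arithmetic. The sketch below computes the clip duration at each frame rate and the context length per frame. It also tries one possible explanation for the 79.2K figure: the compression and patch factors used here are assumptions that are not stated in this article, and "79.2K" is assumed to be a token count.

```python
# Figures quoted in the article.
CONTEXT_LENGTH = 79_200       # "79.2K", assumed here to be a token count
NUM_FRAMES = 88
WIDTH, HEIGHT = 720, 1280
NATIVE_FPS = 15
INTERPOLATED_FPS = 30

# ASSUMPTIONS (not stated in the article): the VideoVAE compresses video
# 4x in time and 8x8 in space, and the VideoDiT groups latents into 2x2 patches.
VAE_TEMPORAL_FACTOR = 4
VAE_SPATIAL_FACTOR = 8
PATCH_SIZE = 2


def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Playback length of a clip with the given frame count and frame rate."""
    return num_frames / fps


def estimated_context_length(num_frames: int, width: int, height: int) -> int:
    """Token count implied by the assumed compression and patch factors."""
    latent_frames = num_frames // VAE_TEMPORAL_FACTOR
    tokens_wide = width // VAE_SPATIAL_FACTOR // PATCH_SIZE
    tokens_high = height // VAE_SPATIAL_FACTOR // PATCH_SIZE
    return latent_frames * tokens_wide * tokens_high


duration = clip_duration_seconds(NUM_FRAMES, NATIVE_FPS)
# Interpolating to 30 FPS keeps the duration and roughly doubles the frame count.
interpolated_frames = round(duration * INTERPOLATED_FPS)
tokens_per_frame = CONTEXT_LENGTH / NUM_FRAMES
estimate = estimated_context_length(NUM_FRAMES, WIDTH, HEIGHT)

print(f"duration at {NATIVE_FPS} FPS: {duration:.2f} s")
print(f"frames after interpolation to {INTERPOLATED_FPS} FPS: about {interpolated_frames}")
print(f"context length per frame: {tokens_per_frame:.0f}")
print(f"estimated context length under the assumptions: {estimate}")
```

Under these assumptions the estimate matches the quoted 79.2K, and 88 frames at 15 FPS come to just under 6 seconds of video.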
Allegro-TI2V introduces two new generation modes:
Subsequent video generation: given a text prompt and an initial frame, the model generates the video content that follows. This mode helps creators produce videos that match a chosen theme and style.
Intermediate video generation: given the first and last frames of a video, the model generates natural transitional frames between them, which is difficult to achieve with traditional video editing.
These modes give creators a more efficient and flexible way to produce video.
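The difference between the two modes comes down to which frames are supplied by the user and which are left for the model to generate. The sketch below is purely illustrative: the function name and mode strings are hypothetical, and it does not reflect how Allegro-TI2V is implemented internally.

```python
def known_frame_mask(num_frames: int, mode: str) -> list[bool]:
    """Return one flag per frame: True if the user supplies it, False if generated.

    Hypothetical helper for illustration only; not part of any Allegro-TI2V API.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames")
    mask = [False] * num_frames
    mask[0] = True  # both modes start from a user-supplied first frame
    if mode == "subsequent":
        pass  # everything after the first frame is generated
    elif mode == "intermediate":
        mask[-1] = True  # the last frame is also supplied
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return mask


for mode in ("subsequent", "intermediate"):
    mask = known_frame_mask(88, mode)
    print(f"{mode}: {sum(mask)} supplied, {mask.count(False)} generated")
```

For an 88-frame clip, subsequent generation leaves 87 frames to the model, while intermediate generation leaves 86.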
Rhymes AI has released Allegro-TI2V under the Apache 2.0 license, making it readily available to researchers, developers, and content creators. To get started, users need Python 3.10+, PyTorch 2.4+, and CUDA 12.4+.
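Before downloading the model, it can be useful to confirm that the local environment meets these minimums. The following is a minimal sketch, not an official setup script: the helper names are made up for this example, and the CUDA check reads the CUDA version PyTorch was built against, which may differ from the system toolkit.

```python
import re
import sys


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a string such as '2.4.0+cu124' into (2, 4, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(found: str, minimum: tuple[int, ...]) -> bool:
    """True if the version string is at least the required minimum."""
    return parse_version(found) >= minimum


def check_environment() -> dict:
    """Map each requirement to True, False, or None (could not be determined)."""
    report = {"python": sys.version_info[:2] >= (3, 10)}
    try:
        import torch
    except ImportError:
        report["pytorch"] = None
        report["cuda"] = None
        return report
    report["pytorch"] = meets_minimum(torch.__version__, (2, 4))
    cuda = torch.version.cuda  # None on CPU-only builds
    report["cuda"] = meets_minimum(cuda, (12, 4)) if cuda else None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        status = "unknown" if ok is None else ("ok" if ok else "too old")
        print(f"{name}: {status}")
```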
Allegro-TI2V has potential applications in film production, game development, digital art, and creative prototyping. According to data provided by the developer, a single H100 GPU generates a 6-second video in about 20 minutes. With a configuration of 8 H100 GPUs, the generation time drops to 3 minutes.
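These timings imply a speedup and a parallel efficiency that are easy to work out. The sketch below uses only the two figures quoted above; since the single-GPU time is given as "about 20 minutes", the results should be read as rough estimates.

```python
SINGLE_GPU_MINUTES = 20   # "about 20 minutes" on one H100
MULTI_GPU_MINUTES = 3     # on 8 H100 GPUs
NUM_GPUS = 8


def speedup(single_time: float, parallel_time: float) -> float:
    """How many times faster the multi-GPU run is than the single-GPU run."""
    return single_time / parallel_time


def parallel_efficiency(single_time: float, parallel_time: float, workers: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return speedup(single_time, parallel_time) / workers


s = speedup(SINGLE_GPU_MINUTES, MULTI_GPU_MINUTES)
e = parallel_efficiency(SINGLE_GPU_MINUTES, MULTI_GPU_MINUTES, NUM_GPUS)
print(f"speedup: {s:.2f}x, parallel efficiency: {e:.0%}")
```

In other words, 8 GPUs deliver roughly a 6.7x speedup, or a little over 80% of ideal linear scaling.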
Model page: https://huggingface.co/rhymes-ai/Allegro-TI2V
Product page: https://rhymes.ai/blog-details/allegro-advanced-video-generation-model
With its capabilities and ease of use, Allegro-TI2V is positioned to advance video content creation and open new possibilities for the creative industry. Its open-source release also encourages wider community participation and further technical development.