AWS launches Nova series of generative AI models, supporting text, image and video generation

Author: Eve Cole Update Time: 2024-12-17 17:48:01

Amazon Web Services (AWS) launched the Nova family of multimodal generative AI models at its re:Invent conference. The family covers text, image, and video generation and aims to provide faster, lower-cost AI solutions. It includes four text generation models (Micro, Lite, Pro, and Premier), as well as the image generation model Nova Canvas and the video generation model Nova Reel, so that users can match a model to the complexity of their task. The models support multiple languages and are available through Amazon Bedrock, where users can fine-tune and optimize them. AWS also says it will launch a speech-to-speech model and an "any-to-any" model in the future to further expand the Nova family.

At the re:Invent conference on Tuesday, Amazon Web Services (AWS) announced Nova, its new family of multimodal generative AI models. The release includes four text generation models: Micro, Lite, Pro, and Premier. AWS also launched the image generation model Nova Canvas and the video generation model Nova Reel.

Amazon CEO Andy Jassy said the Micro, Lite, and Pro models would begin rolling out to AWS customers that day, while the Premier model is expected in early 2025. The Nova family is designed to handle multiple input types, including text, images, and video. The text generation models are optimized for 15 languages, with English as the primary one.

Nova text generation models

The Nova text generation models differ in features and specifications. Micro has the lowest latency and the fastest responses, but it only supports text input and output, which makes it suitable for tasks that need quick processing. Lite processes text, image, and video inputs quickly, while Pro offers a balance of accuracy, speed, and cost. Premier is the most capable model, designed for complex workloads and for advanced applications that require customized models.

The models also differ in context window size. Micro can process up to about 100,000 words, while Lite and Pro can handle about 225,000 words, 15,000 lines of code, or 30 minutes of video footage. AWS said that by early 2025, the context window for some Nova models will expand to 2 million tokens.
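To make these figures concrete, here is a minimal Python sketch that checks which models could accept a given document, using only the approximate word counts quoted above. The names `APPROX_WORD_CAPACITY`, `count_words`, and `models_that_fit`, along with the `headroom` safety margin, are illustrative assumptions and not part of any AWS API. Real limits are defined in tokens, so a word count is only a rough estimate.

```python
# Approximate per-request word capacities quoted in the article.
# Real limits are measured in tokens, so treat these as rough guides.
APPROX_WORD_CAPACITY = {
    "micro": 100_000,
    "lite": 225_000,
    "pro": 225_000,
}


def count_words(text: str) -> int:
    """Count whitespace-separated words (a crude stand-in for tokenization)."""
    return len(text.split())


def models_that_fit(text: str, headroom: float = 0.9) -> list[str]:
    """Return the model tiers whose approximate capacity covers the text.

    headroom keeps a safety margin because word counts only approximate
    token counts. The 0.9 default is an arbitrary illustrative choice.
    """
    words = count_words(text)
    return [
        name
        for name, capacity in APPROX_WORD_CAPACITY.items()
        if words <= capacity * headroom
    ]


if __name__ == "__main__":
    sample = "word " * 150_000  # 150,000 words: too long for Micro
    print(models_that_fit(sample))
```

A check like this is only a first filter. Before sending a large document, it is safer to measure its actual token count and compare that against the limit documented for the chosen model.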

Jassy said the Nova models are the fastest and lowest-cost in their class. They can be fine-tuned on Amazon Bedrock, AWS's AI development platform, to further improve speed and efficiency. In addition, the Nova models can work with proprietary systems and APIs to carry out a variety of automated tasks.

Nova Canvas and Nova Reel

In addition to the text generation models, AWS launched an image generation tool and a video generation tool: Nova Canvas and Nova Reel. Nova Canvas lets users generate and edit images from prompts and provides control over the color scheme and layout of the generated images. Nova Reel can generate videos of up to six seconds from prompts or reference images, and lets users adjust camera movement, including pan, rotation, and zoom.

[Image: sample images generated by Nova Canvas]

Although Reel is currently limited to six-second videos, AWS says a version that produces longer videos will be available soon. AWS has also built responsible-use controls into these tools, including watermarking and content moderation, to avoid generating harmful content.

Jassy also revealed that AWS is developing a speech-to-speech model, which is expected to launch in the first quarter of 2025. This model will take speech as input and generate natural, human-like speech. AWS is also developing an "any-to-any" model, expected in mid-2025, that supports multimodal conversion across text, speech, images, and video.

AWS has kept the details of its training data confidential, and says it will offer an indemnification policy covering copyright claims to protect its customers.

Project page: https://aws.amazon.com/cn/ai/generative-ai/nova/

Official blog: https://aws.amazon.com/cn/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/

All in all, the launch of the Nova family marks a new stage in AWS's multimodal generative AI offering. Its range of capabilities, its focus on speed and cost, and its emphasis on responsible use give users new options for text, image, and video generation. The planned speech-to-speech and "any-to-any" models are worth watching.