Fugatto, a new generative AI model unveiled by NVIDIA, takes an unusually broad approach to audio processing. It can generate a mixture of music, voice, and sound, and it follows instructions that users supply as text and audio files, which opens up a wide range of audio effects. The editor of Downcodes walks you through what this model does and how it could bring sounds once confined to science fiction movies into everyday production work.
Fugatto, whose full name is "Foundational Generative Audio Transformer Opus 1", is an audio processing model built on generative AI. Where some other AI models only compose music or only modify speech, Fugatto can generate or transform any mixture of music, speech, and sounds, guided by any combination of text prompts and audio files.
Fugatto's features are relevant to a wide range of users, including music producers, advertising agencies, language learning tool developers, and game developers. Music producers can use it to quickly experiment with different musical styles, vocals, and instruments, or to add effects to existing songs and improve their sound quality. Advertising agencies can use it to give a voice-over different accents and emotions, making it easier to adapt an advertisement for different regions and target groups. Language learning tool developers can use Fugatto to deliver course content in any voice the user chooses, such as that of a family member or friend, to make learning more personal. Game developers can use Fugatto to modify in-game audio assets in real time as the game progresses, or to create new sound effects from text instructions and audio input.
What sets Fugatto apart is that it does more than carry out specific instructions: it can also create sounds that have never been heard before. For example, it can make a trumpet bark like a dog or a saxophone meow like a cat. The aim is that whatever the user can describe in a prompt, Fugatto can attempt to create.
Another notable ability of Fugatto is combining instructions that it learned separately during training to produce more complex effects. For example, users can ask it to generate speech with a French accent and a sad emotion. Fugatto also lets users fine-tune each instruction, such as how heavy the accent is or how intense the sadness is, giving them the kind of fine-grained control an artist would expect.
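To make the idea of adjustable, combinable instructions more concrete, here is a minimal toy sketch in Python. It is not Fugatto's API or its actual method, neither of which is described in this article. The names `TOY_VECTORS` and `combine`, the three-dimensional vectors, and the 0-to-1 weight range are all assumptions made up for illustration. The sketch only shows the general idea of scaling each instruction by a weight before combining them.

```python
# Toy illustration only: hypothetical names and made-up vectors,
# not Fugatto's real interface or internals.

# Each instruction maps to an invented 3-dimensional "attribute vector".
TOY_VECTORS = {
    "french accent": (1.0, 0.0, 0.0),
    "sad": (0.0, 1.0, 0.0),
}


def combine(weighted_instructions):
    """Return the weighted sum of the toy vectors.

    weighted_instructions is a list of (instruction_text, weight) pairs,
    where weight is assumed to lie between 0.0 (ignore) and 1.0 (full strength).
    """
    total = [0.0, 0.0, 0.0]
    for text, weight in weighted_instructions:
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {text!r} must be between 0 and 1")
        vector = TOY_VECTORS[text]
        for i, component in enumerate(vector):
            total[i] += weight * component
    return tuple(total)


# A heavy French accent with only mild sadness.
blended = combine([("french accent", 0.9), ("sad", 0.3)])
print(blended)
```

Changing the two weights changes how strongly each attribute contributes, which mirrors, in a very simplified way, the accent and emotion adjustments described above.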
Fugatto can also generate sounds that change over time, such as a storm that approaches from a distance, with thunder that builds in intensity and then slowly fades away. Users can control how the sound evolves and create a variety of vivid sound effects.
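One simple way to picture a sound that changes over time is as an intensity curve defined by a few keyframes. The sketch below is a generic piecewise-linear envelope in Python, not a description of how Fugatto works internally. The function name `envelope`, the keyframe times, and the 0-to-1 intensity scale are assumptions chosen for illustration.

```python
# Generic piecewise-linear envelope; a hypothetical illustration,
# not Fugatto's actual mechanism.

def envelope(keyframes, t):
    """Linearly interpolate the intensity at time t (in seconds).

    keyframes is a list of (time, intensity) pairs with strictly
    increasing times. Times outside the range clamp to the nearest end.
    """
    times = [time for time, _ in keyframes]
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("keyframe times must be strictly increasing")
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    if t >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            fraction = (t - t0) / (t1 - t0)
            return v0 + fraction * (v1 - v0)


# Assumed storm shape: silent at 0 s, peak thunder at 20 s, faded out by 40 s.
storm = [(0.0, 0.0), (20.0, 1.0), (40.0, 0.0)]
for second in (0, 10, 20, 30, 40):
    print(second, envelope(storm, second))
```

Moving the middle keyframe earlier or later shifts when the storm peaks, and adding more keyframes gives finer control over the curve.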
Fugatto is a collaborative effort between researchers from around the world, with team members from countries such as India, Brazil, China, Jordan and South Korea. Their diverse backgrounds give Fugatto greater multi-accent and multi-language capabilities.
Fugatto builds on NVIDIA's years of research in speech modeling, audio coding, and audio understanding. The full model has 2.5 billion parameters and was trained on a cluster of NVIDIA DGX systems equipped with 32 NVIDIA H100 Tensor Core GPUs.
Fugatto points to the considerable potential of generative AI in audio. Its combination of flexible generation and instruction-based control could open up new possibilities in music, film, games, education, and other fields. It will be worth watching how Fugatto goes on to shape the way audio is created.
Official blog: https://blogs.nvidia.com/blog/fugatto-gen-ai-sound-model/