Doubao App released its latest "end-to-end" large voice model on January 20, 2025, a major update to its real-time voice call feature. The update marks significant progress for Doubao in voice interaction: instead of relying on the traditional cascade of ASR, LLM, and TTS, it integrates speech recognition, understanding, and generation into a single model, making voice interaction smoother and more intelligent. The focus of the update is making voice interaction more human-like, so that the AI can better understand and respond to human emotions.
On January 20, 2025, Doubao App officially released its latest "end-to-end" voice model and made important updates to its real-time voice call feature. The release marks another step forward for Doubao in voice interaction: it moves beyond the earlier cascade of ASR (automatic speech recognition), LLM (large language model), and TTS (text-to-speech), and integrates speech recognition, understanding, and generation in a single model.
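The difference between a cascade and an end-to-end model can be illustrated with a toy sketch. This is not Doubao's implementation: the `Audio` type and the `asr`, `llm`, `tts`, `cascade`, and `end_to_end` functions are all hypothetical stand-ins, and the sketch only assumes that a cascade passes plain text between stages, so the speaker's tone is not available to the later stages.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Hypothetical stand-in for a speech clip."""
    words: str    # what was said
    emotion: str  # how it was said (tone, mood)


def asr(audio: Audio) -> str:
    # Speech recognition keeps only the words; the tone is dropped here.
    return audio.words


def llm(text: str) -> str:
    # Placeholder for a text-only language model.
    return f"reply to: {text}"


def tts(text: str) -> Audio:
    # The synthesizer never saw the speaker's tone, so it stays neutral.
    return Audio(words=text, emotion="neutral")


def cascade(audio: Audio) -> Audio:
    # Three separate stages connected only by text.
    return tts(llm(asr(audio)))


def end_to_end(audio: Audio) -> Audio:
    # A single model sees words and tone together and can answer in kind.
    return Audio(words=f"reply to: {audio.words}", emotion=audio.emotion)


user = Audio(words="I failed my exam", emotion="sad")
cascade_reply = cascade(user)
e2e_reply = end_to_end(user)
print(cascade_reply.emotion)
print(e2e_reply.emotion)
```

In this sketch both paths produce the same words, but only the single-model path still has the user's tone available when it forms the reply.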
According to testing by "Smart Emergence", the biggest highlight of the new version of Doubao is its human-like expressiveness and emotional output, which make conversations more fluent and intelligent. In particular, the "Soul Singer" and "Various Master" modes let Doubao not only sing but also take on a wide range of roles, and they have become a new favorite among users. For example, when users asked Doubao to imitate the voice of celebrity Yu Shuxin, Doubao not only reproduced the celebrity's tone but also playfully showed a personality of its own.
Even more notable, Doubao can improvise songs in natural conversation without complicated instructions or specialized prompts. Users can ask Doubao to sing at any time and can even specify the theme of the lyrics. Although Doubao occasionally makes small mistakes, its response speed and improvisational ability are impressive and show how human-like it has become.
In addition, Doubao's two newly added personality modes, "the little bag" and "the exaggerated master", bring users something fresh. These modes let Doubao express different emotions and styles in different situations, making interactions more fun and more realistic.
As voice interaction technology continues to develop, this update not only extends Doubao's application scenarios to areas such as emotional companionship and psychological counseling, but also brings AI's emotional communication closer to that of humans. This shift could help Doubao secure a place in a highly competitive market and shape the future development of AI interaction.
This update to Doubao App is not only a technical breakthrough but, more importantly, a qualitative leap in user experience. It points to a new direction for AI interaction, and further innovations from Doubao are worth looking forward to.