Hume AI has announced the launch of "Voice Control," an experimental feature that lets users customize personalized AI voices without any programming or AI expertise. Through intuitive virtual sliders, users can precisely adjust ten dimensions of voice characteristics, such as gender, confidence, and enthusiasm, to create a unique voice suited to different application scenarios. The feature builds on Hume's previously launched Empathic Voice Interface 2 (EVI2), further improving the naturalness, emotional expressiveness, and customizability of generated speech.
Hume AI, a startup focused on emotionally intelligent voice interfaces, recently launched an experimental feature called "Voice Control."
The tool is designed to help developers and users create personalized AI voices without any coding, prompt engineering, or sound-design skills: users customize a voice to their needs by precisely adjusting its characteristics.
This new feature builds on the company's previously launched Empathic Voice Interface 2 (EVI2) and enhances the naturalness, emotional responsiveness, and customizability of generated speech. Unlike traditional voice cloning, Hume's product focuses on delivering unique, expressive voices for a variety of applications, including customer service chatbots, digital assistants, tutoring, tour guides, and accessibility features.
Voice Control lets developers adjust voice characteristics along ten distinct dimensions, including gender, assertiveness, enthusiasm, and confidence:
"- Masculine/Feminine: the gendered quality of the voice, ranging between more masculine and more feminine.
- Assertiveness: the firmness of the voice, between timid and bold.
- Buoyancy: the density of the voice, between deflated and buoyant.
- Confidence: the degree of certainty in the voice, between shy and confident.
- Enthusiasm: the excitement in the voice, between calm and enthusiastic.
- Nasality: the openness of the voice, between clear and nasal.
- Relaxedness: the stress in the voice, between tense and relaxed.
- Smoothness: the texture of the voice, between smooth and staccato.
- Softness: the energy behind the voice, between gentle and forceful.
- Tightness: how contained the voice is, between tight and breathy."
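The ten dimensions above can be modeled as a simple slider state. Below is a minimal sketch in Python, assuming each slider is normalized to the range −1.0 to 1.0; the attribute names and value range are illustrative assumptions, not Hume's actual API:

```python
from dataclasses import dataclass, fields

@dataclass
class VoiceSliders:
    """Hypothetical slider state for the ten Voice Control dimensions.

    Each value is normalized to [-1.0, 1.0], where -1.0 is the first
    endpoint of a dimension (e.g. masculine, timid, deflated) and
    +1.0 is the second (e.g. feminine, bold, buoyant).
    """
    masculine_feminine: float = 0.0
    assertiveness: float = 0.0
    buoyancy: float = 0.0
    confidence: float = 0.0
    enthusiasm: float = 0.0
    nasality: float = 0.0
    relaxedness: float = 0.0
    smoothness: float = 0.0
    softness: float = 0.0
    tightness: float = 0.0

    def __post_init__(self):
        # Clamp every slider into the assumed [-1, 1] range.
        for f in fields(self):
            v = float(getattr(self, f.name))
            setattr(self, f.name, max(-1.0, min(1.0, v)))

# Example: a calm, confident, slightly feminine voice.
sliders = VoiceSliders(masculine_feminine=0.4, confidence=0.8, enthusiasm=-0.3)
```

Leaving every slider at 0.0 corresponds to the unmodified base voice, which matches the workflow described below: start from a base voice, then move only the sliders you need.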
Users can fine-tune these properties in real time via virtual sliders, making customization simple and direct. The feature is currently available on Hume's platform, and users can access it simply by registering for free.
Voice Control is currently in beta and integrates with Hume's Empathic Voice Interface (EVI), making it usable across a wide range of applications. Developers can select a base voice, adjust its characteristics, and preview the results in real time, and the resulting voice remains repeatable and stable from session to session, a key requirement for real-time applications such as customer service bots and virtual assistants.
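The select-adjust-preview workflow can be sketched abstractly. The sketch below uses a placeholder `synthesize` function rather than Hume's real API (the function name and the configuration shape are illustrative assumptions); the point it demonstrates is that the same stored configuration always reproduces the same voice, which is where session-to-session repeatability comes from:

```python
import json

def synthesize(text, base_voice, adjustments):
    """Placeholder for a TTS call; returns a string description, not audio."""
    knobs = ", ".join(f"{k}={v:+.1f}" for k, v in sorted(adjustments.items()))
    return f"[{base_voice}] ({knobs}) -> {text!r}"

# 1. Select a base voice and adjust its characteristics via slider values.
config = {"base_voice": "narrator",
          "adjustments": {"confidence": 0.8, "enthusiasm": -0.3}}

# 2. Preview the result.
preview = synthesize("Hello!", config["base_voice"], config["adjustments"])

# 3. Persist the configuration; reloading it reproduces the same voice,
#    giving the session-to-session consistency the article describes.
restored = json.loads(json.dumps(config))
assert synthesize("Hello!", restored["base_voice"], restored["adjustments"]) == preview
```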
The influence of EVI2 is evident in the Voice Control functionality. That earlier model introduced features such as conversational prompting and multilingual capabilities that broadened the scope of voice-AI applications. For example, EVI2 supports sub-second response times for natural, instant conversation and allows speaking style to be adjusted dynamically during an interaction, making it a versatile tool for businesses.
This move addresses the AI industry's dependence on preset voices: many brands and applications struggle to find voices that fit their needs. Hume's goal is to develop emotionally intelligent voice AI and push the industry forward. When EVI2 was released in September 2024, it had already significantly improved the latency and cost-effectiveness of voice generation and offered a safer alternative to voice cloning.
Hume's research-driven approach is at the heart of its product development, combining cross-cultural voice recordings with emotional-survey data. This methodology underpins both EVI2 and the newly launched Voice Control, allowing them to capture human perception of voice in fine detail.
As competition in the market intensifies, Hume's positioning around personalized voices and emotional intelligence helps it stand out in the voice-AI field. Going forward, Hume plans to expand Voice Control with more adjustable dimensions, improved voice quality, and a larger selection of base voices.
Official blog: https://www.hume.ai/blog/introducing-voice-control
Highlights:
- **Hume AI has launched a "Voice Control" feature that lets users easily create personalized AI voices.**
- **The feature requires no coding skills; users adjust voice characteristics through sliders.**
- **Hume aims to meet diverse application needs through personalized, emotionally intelligent voice AI.**
All in all, Hume AI's Voice Control feature brings unprecedented convenience to AI voice customization. Its personalization and emotional intelligence will greatly expand the use of AI voices across fields, and its future development and feature upgrades are worth watching.