Hume AI has announced voice control, an experimental feature that lets users create highly personalized AI voices without any coding or specialized expertise. The feature builds on Hume's Empathic Voice Interface 2 (EVI2), which improved the naturalness, emotional expressiveness, and customizability of synthesized speech, and gives developers and users an unprecedented level of control to craft unique voices for customer service chatbots, digital assistants, educational tools, and more. The technology could reshape the voice AI industry and offer users a more thoughtful, personalized voice experience.
Hume AI, a startup focused on emotionally intelligent voice interfaces, recently launched an experimental feature called "voice control." The new tool is designed to help developers and users create personalized AI voices without any coding, prompt engineering, or sound-design skills: users customize a voice to suit their needs by precisely adjusting its characteristics.
The feature builds on the company's previously launched Empathic Voice Interface 2 (EVI2), which enhanced the naturalness, emotional responsiveness, and customizability of synthesized speech. Unlike traditional voice-cloning technology, Hume's product focuses on delivering unique, expressive voices for applications as diverse as customer service chatbots, digital assistants, tutoring, tour guides, and accessibility features.
Voice control allows developers to adjust voice characteristics along ten distinct dimensions, including gender, assertiveness, enthusiasm, confidence, and more:
- Masculine/Feminine: the gendered quality of the voice, ranging between more masculine and more feminine.
- Assertiveness: the firmness of the voice, ranging between timid and bold.
- Buoyancy: the density of the voice, ranging between deflated and buoyant.
- Confidence: the degree of certainty in the voice, ranging between shy and confident.
- Enthusiasm: the excitement in the voice, ranging between calm and enthusiastic.
- Nasality: the openness of the voice, ranging between clear and nasal.
- Relaxedness: the tension in the voice, ranging between tense and relaxed.
- Smoothness: the texture of the voice, ranging between smooth and staccato.
- Softness: the energy behind the voice, ranging between gentle and powerful.
- Tightness: how contained the voice sounds, ranging between tight and breathy.
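The ten sliders above can be pictured as a simple data structure. The sketch below is illustrative only: the field names follow the list above, and the [-1, 1] range (left pole to right pole) is an assumed convention, not Hume's documented scale.

```python
from dataclasses import dataclass, fields

def _clamp(x: float) -> float:
    """Clamp a slider value to the assumed [-1.0, 1.0] range."""
    return max(-1.0, min(1.0, x))

@dataclass
class VoiceAttributes:
    """Hypothetical container for the ten voice-control dimensions.

    -1.0 is the left pole of each dimension (e.g. timid) and +1.0 the
    right pole (e.g. bold); 0.0 leaves the base voice unchanged.
    """
    gender: float = 0.0         # masculine (-1) to feminine (+1)
    assertiveness: float = 0.0  # timid (-1) to bold (+1)
    buoyancy: float = 0.0       # deflated (-1) to buoyant (+1)
    confidence: float = 0.0     # shy (-1) to confident (+1)
    enthusiasm: float = 0.0     # calm (-1) to enthusiastic (+1)
    nasality: float = 0.0       # clear (-1) to nasal (+1)
    relaxedness: float = 0.0    # tense (-1) to relaxed (+1)
    smoothness: float = 0.0     # staccato (-1) to smooth (+1)
    softness: float = 0.0       # gentle (-1) to powerful (+1)
    tightness: float = 0.0      # tight (-1) to breathy (+1)

    def __post_init__(self):
        # Clamp every dimension so out-of-range slider input is tolerated.
        for f in fields(self):
            setattr(self, f.name, _clamp(getattr(self, f.name)))

attrs = VoiceAttributes(assertiveness=0.8, enthusiasm=2.5)
print(attrs.assertiveness, attrs.enthusiasm)  # 0.8 1.0
```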
Users can fine-tune these attributes in real time via virtual sliders, making customization simple and straightforward. The feature is currently available on Hume's platform, and users can access it simply by registering for free.
Voice control is currently available in beta and integrates with Hume's Empathic Voice Interface (EVI), making it usable across a wide range of applications. Developers select a base voice, adjust its characteristics, and preview the results in real time. The process guarantees repeatability and stability from session to session, a key requirement for real-time applications such as customer service bots or virtual assistants.
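The select-adjust-preview loop described above might be modeled as in the sketch below. The `VoiceSession` class, its methods, and the base-voice name are illustrative stand-ins, not Hume's actual SDK; the point is that the configuration is plain deterministic data, which is what makes a tuned voice repeatable across sessions.

```python
from dataclasses import dataclass, field

@dataclass
class VoiceSession:
    """Illustrative stand-in for a voice-control editing session."""
    base_voice: str
    adjustments: dict = field(default_factory=dict)

    def adjust(self, attribute: str, value: float) -> None:
        # Record the slider position for one dimension, clamped to the
        # assumed [-1, 1] range.
        self.adjustments[attribute] = max(-1.0, min(1.0, value))

    def preview(self) -> dict:
        # A real client would synthesize and play audio here; this sketch
        # just returns the configuration that would be sent, showing why
        # the same settings reproduce the same voice every session.
        return {"base_voice": self.base_voice,
                "adjustments": dict(self.adjustments)}

# Hypothetical base-voice name, chosen for illustration.
session = VoiceSession(base_voice="example-base-voice")
session.adjust("assertiveness", 0.6)
session.adjust("nasality", -0.3)
print(session.preview())
```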
EVI2's influence is evident in the voice control functionality. The model introduced features such as conversational prompting and multilingual capability that broadened the scope of voice AI applications; it supports sub-second response times for natural, immediate conversations, and it allows speaking styles to be adjusted dynamically during an interaction, making it a versatile tool for businesses.
The move addresses the AI industry's dependence on preset voices: many brands and applications struggle to find voices that match their needs. Hume's goal is to develop emotionally intelligent voice AI and push the industry forward. When EVI2 was released in September 2024, it had already significantly improved latency and cost-effectiveness and offered a secure alternative to voice cloning.
Hume's research-driven approach is at the heart of its product development, combining cross-cultural voice recordings with emotional survey data. This methodology underpins both EVI2 and the newly launched voice control, allowing them to capture human perception of voice in fine detail.
As market competition intensifies, Hume's positioning around personalized, emotionally intelligent voices sets it apart in the voice AI field. Going forward, Hume plans to expand voice control with additional adjustable dimensions, improved audio quality, and a larger selection of base voices.
Official blog: https://www.hume.ai/blog/introducing-voice-control
Highlights:

**Hume AI has launched a "voice control" feature that lets users easily create personalized AI voices.**

**The feature requires no coding skills; users adjust voice characteristics with sliders.**

**Hume aims to meet diverse application needs through personalized, emotionally intelligent voice AI.**
All in all, Hume AI's voice control feature opens new possibilities for AI voice customization. Its convenience and personalization are expected to drive wider adoption of voice AI technology and give users a more humane voice-interaction experience. As the feature continues to improve and expand, Hume AI is well positioned to become a leader in the voice AI field.