CoMoSVC: Innovative technology that converts one person’s singing voice into another person’s singing voice

Author: Eve Cole | Update Time: 2025-01-21 11:48:02

Researchers from the Hong Kong University of Science and Technology and Microsoft Research Asia have developed CoMoSVC, a singing voice conversion technology that transforms one person's singing voice into another's. At its core is a combination of a diffusion-based teacher model and the self-consistency property of consistency models. This lets CoMoSVC convert audio much faster than iterative diffusion sampling while keeping conversion quality high, which could change how music production and audio processing are done.

CoMoSVC strikes this balance between quality and speed in two stages: a diffusion-based teacher model is trained for singing voice conversion, and a student model is then distilled from it using the self-consistency property. Traditional diffusion-based methods generate audio through an iterative sampling process with many denoising steps. CoMoSVC instead samples in a single step, which greatly speeds up processing while maintaining high-quality conversion. The result is a more efficient and convenient approach to audio conversion, opening up more possibilities for creation and expression in areas such as music production.
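To make the difference between iterative and one-step sampling concrete, here is a minimal toy sketch in Python. It is not CoMoSVC's code: it assumes a one-dimensional Gaussian "data" distribution so that both the teacher denoiser and the consistency function have exact closed forms, whereas a real system would use trained neural networks operating on audio features. All names and constants (`MU`, `S`, `T_MAX`, `T_MIN`, `teacher_denoiser`, `iterative_sample`, `consistency_sample`) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Assumed toy setup: data ~ N(MU, S^2), noisy sample x_t = x_0 + t * noise.
MU, S = 2.0, 0.5
T_MAX, T_MIN = 80.0, 0.002  # assumed range of noise levels


def teacher_denoiser(x, t):
    """Exact posterior mean E[x_0 | x_t] for Gaussian data.

    Stands in for a trained diffusion teacher network.
    """
    return MU + (S**2 / (S**2 + t**2)) * (x - MU)


def iterative_sample(x, n_steps):
    """Multi-step sampling: Euler integration of the probability-flow ODE
    dx/dt = (x - D(x, t)) / t from T_MAX down to T_MIN.
    Costs one denoiser call per step.
    """
    ts = np.geomspace(T_MAX, T_MIN, n_steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        slope = (x - teacher_denoiser(x, t_cur)) / t_cur
        x = x + (t_next - t_cur) * slope
    return x


def consistency_sample(x, t=T_MAX):
    """One-step sampling: the consistency function maps any point (x_t, t)
    on an ODE trajectory directly to that trajectory's start at T_MIN.
    Here it is the closed-form solution of the toy ODE; in practice a
    student network would be trained to approximate this mapping.
    """
    return MU + (x - MU) * np.sqrt((S**2 + T_MIN**2) / (S**2 + t**2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_noise = MU + T_MAX * rng.standard_normal(5)  # start from (almost) pure noise
    print("1000 Euler steps:", iterative_sample(x_noise, 1000))
    print("one step        :", consistency_sample(x_noise))
```

In this toy setting the two samplers land on nearly the same outputs, but the iterative one evaluates the denoiser once per step while the one-step version needs a single evaluation. That gap in the number of model evaluations is where the speedup described above comes from.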

Because it is efficient and convenient, CoMoSVC could find use in music creation, speech synthesis, and related fields, giving users more audio processing options and encouraging further innovation in audio technology. The speedup from one-step sampling also opens up new possibilities for real-time audio processing.