English | Simplified Chinese
Awesome-ChatTTS is an officially recommended ChatTTS resource summary project. You are welcome to recommend it or recommend it in issues.
If you think this project is helpful for you to understand and use ChatTTS, please give me some rewards and support.
Note
The following projects are community resources. For the official information, please go to the source warehouse 2noise/ChatTTS.
Website | type |
---|---|
Original Web | Original web version experience |
Forge Web | Forge Enhanced Edition Experience |
Linux | Python installation package |
Samples | Tone seed example |
Cloning | Tone cloning experience |
project | Star | Highlights |
---|---|---|
jianchang512/ChatTTS-ui | Provides API interface that can be called in third-party applications | |
6drf21e/ChatTTS_colab | Provide streaming output, support long audio generation and part-character reading | |
lenML/ChatTTS-Forge | Provides vocal enhancement and background noise reduction, with additional prompt words available | |
CCmahua/ChatTTS-Enhanced | Supports batch processing of files and exports of SRT files | |
HKoon/ChatTTS-OpenVoice | Sound cloning with OpenVoice |
project | Star | Highlights |
---|---|---|
6drf21e/ChatTTS_Speaker | Tone character marking and stability evaluation | |
AIFSH/ComfyUI-ChatTTS | ComfyUi version, which can be introduced as a workflow node | |
MaterialShadow/ChatTTS-manager | Provides a tone management system and WebUI interface |
After actual testing, there is a significant difference in the effect of generating spk_emb
each time the specified tone seed value is generated and reusing pre-generated spk_emb
. It is recommended to use .pt
tone files or tone codes (string representations).
The tone seeds were initially marked and stable evaluation in the ChatTTS_Speaker project, and the right tone can be quickly selected through examples.
When used in the official WebUI, you can directly copy the tone code and replace the value in 9. Speaker Embedding
to achieve tone control.
When used in Python scripts, refer to the compression scheme in issue#07 to achieve tone control.
spk = torch . load ( "asset/seed_1332_restored_emb.pt" , map_location = torch . device ( 'cpu' )). detach ()
spk_emb_str = compress_and_encode ( spk )
params_infer_code = ChatTTS . Chat . InferCodeParams (
spk_emb = spk_emb_str , # add sampled speaker
temperature = .0003 , # using custom temperature
top_P = 0.7 , # top P decode
top_K = 20 , # top K decode
)
video | Highlights |
---|---|
Brother Tongji Zihao | Detailed deployment tutorial from entry to advanced |
ZTFS | Mac M1 deployment tutorial |
King - Bao Bao | Windows Deployment Tutorial |
video | Highlights |
---|---|
Sam Witteveen | Introduction to the English version |
After recent iterations, the problems in the source repository code have been basically solved. If you encounter problems, it is recommended to check the Chinese version of the official description document in detail first. If you have any questions, you can continue to view this document.
The original project needs to download the corresponding model from HuggingFace. If you cannot access the Internet smoothly and scientifically, you will not be able to complete this step. As an alternative, you can download the model and configuration from modelscope and configure the local path.
Important
The model library on the Magic Tower is maintained by volunteers and does not guarantee that all models are up to date. Please verify it yourself if necessary.
pip install modelscope
# 在开头导入依赖,并下载模型和配置
from modelscope import snapshot_download
model_dir = snapshot_download ( 'zlj2546/ChatTTS' )
# 第 118 行修改模型路径
ret = chat . load_models ( 'custom' , custom_path = model_dir )
When running in the IDE, the script cannot run smoothly due to the relative path of the file.
It is recommended to refer to the instructions in the quick startup of the official documentation and run it directly in the terminal.
Make sure that you are in the project root directory when executing the following command.
python examples/web/webui.py
The generated audio will be saved to
./output_audio_n.mp3
python examples/cmd/run.py " Your text 1. " " Your text 2. "
This problem occurs because the official code does not cover all the time when dealing with Chinese punctuation, for example ?
Symbols such as , …
are not processed, resulting in an error during model generation.
You can manually delete similar Chinese punctuation marks, or modify the code in ChatTTS/utils/infer_utils.py
to add missing punctuation marks to the dictionary of character_map
on lines 103.
character_map = {
'…' : '' ,
'—' : ',' ,
'_' : ',' ,
'?' : ',' ,
}
The GPU requires at least 4G video memory, otherwise the CPU will be used. For related issues, please refer to the instructions in the ChatTTS-ui project.
1. load_models() got an unexpected keyword argument 'source'
See FAQs for details - Model cannot be downloaded
2. cannot import name 'CommitOperationAdd' from 'huggingface_hub'
See FAQs for details - Model cannot be downloaded
3. FileNotFoundError:[Erzno 2] No such file or directory: 'C:\Users\xxx\.cache\huggingface\hub\models--2Noise--ChatTTS\snapshots
See FAQs for details - Model cannot be downloaded
4. local variable 'Normalizer' referenced before assignment
You need to install pynini
and WeTextProcessing
dependencies after completing the environment configuration.
conda install -c conda-forge pynini=2.1.5 && pip install WeTextProcessing
5. download to Local path D:pythonlprojectChatTTSChatTTS failed.
Execute scripts directly in the IDE, and an error will be reported due to file path problems. See FAQ for details - Cannot run in the IDE
6. ModuleNotFoundError : No module named'Cython'
The Python execution path is not found, Windows devices need to configure the environment path according to the tutorial