English README | Tip Project | Discord Discussion Group
A simple local web interface that uses ChatTTS to synthesize text into speech in the browser. It supports Chinese, English, and mixed text with numbers, and provides an API interface.
Original ChatTTS project. Starting from version 0.96, source-code deployment requires ffmpeg to be installed first. The previous timbre files (csv and pt) are no longer usable; please fill in the timbre values and regenerate them. Get the timbre
[Sponsor]
302.AI is an AI supermarket that brings together the world's top AI brands, with pay-as-you-go pricing, zero monthly fees, and no barrier to using all kinds of AI.
Comprehensive features, simple and easy to use, pay-as-you-go with no entry barrier, separation of administrators and users
Interface preview
Alphanumeric symbol control character mixed effect
On first run, the model is downloaded from huggingface.co or GitHub into the asset directory. If the network is unstable the download may fail; if so, download it separately.
After downloading and unzipping, you will see an asset folder containing multiple pt files. Copy all the pt files into the asset directory, then restart the software.
GitHub download address: https://github.com/jianchang512/ChatTTS-ui/releases/download/v1.0/all-models.7z
Baidu network disk download address: https://pan.baidu.com/s/1yGDZM9YNN7kW9e7SFo8lLw?pwd=ct5x
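The "copy all pt files into the asset directory" step above can be scripted. A minimal sketch, assuming the archive was unzipped into a folder such as `all-models/` (the function name and paths are illustrations, not part of the project):

```python
import shutil
from pathlib import Path

def copy_pt_files(extracted_dir: str, asset_dir: str) -> list:
    """Copy every .pt file from the unzipped model folder into the asset directory."""
    src, dst = Path(extracted_dir), Path(asset_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for pt in src.rglob("*.pt"):  # find .pt files at any depth
        shutil.copy2(pt, dst / pt.name)
        copied.append(pt.name)
    return sorted(copied)
```

Restart the software after copying so the models are picked up.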
Pull project repository
Clone the project in any path, for example:
git clone https://github.com/jianchang512/ChatTTS-ui.git chat-tts-ui
Start Runner
Enter the project directory:
cd chat-tts-ui
Start the container and view the initialization log:
GPU version:
docker compose -f docker-compose.gpu.yaml up -d
CPU version:
docker compose -f docker-compose.cpu.yaml up -d
docker compose logs -f --no-log-prefix
Visit ChatTTS WebUI
When the log prints 启动:['0.0.0.0', '9966'] ("started"), access IP:9966 in a browser, for example:
http://127.0.0.1:9966
http://192.168.1.100:9966
Get the latest code from the main branch:
git checkout main
git pull origin main
Next, rebuild and update to the latest image:
docker compose down
GPU version:
docker compose -f docker-compose.gpu.yaml up -d --build
CPU version:
docker compose -f docker-compose.cpu.yaml up -d --build
docker compose logs -f --no-log-prefix
Configure a Python 3.9-3.11 environment and install ffmpeg (yum install ffmpeg or apt-get install ffmpeg, depending on your distribution).
Create an empty directory /data/chattts
and execute the command cd /data/chattts && git clone https://github.com/jianchang512/chatTTS-ui .
Create a virtual environment python3 -m venv venv
Activate virtual environment source ./venv/bin/activate
Install dependencies pip3 install -r requirements.txt
If CUDA acceleration is not required, execute
pip3 install torch==2.2.0 torchaudio==2.2.0
If CUDA acceleration is required, execute
pip install torch==2.2.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118
pip install nvidia-cublas-cu11 nvidia-cudnn-cu11
You also need to install CUDA11.8+ ToolKit, please search for the installation method yourself or refer to https://juejin.cn/post/7318704408727519270
In addition to CUDA, AMD GPUs can also be used for acceleration; this requires installing ROCm and the ROCm build of PyTorch. AMD GPUs work with ROCm out of the box in PyTorch, with no additional code changes needed.
pip3 install torch==2.2.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/rocm6.0
After the installation is complete, you can use the rocm-smi command to view the AMD GPU in the system. You can also use the following Torch code (query_gpu.py) to query the current AMD GPU Device.
import torch

print(torch.__version__)
if torch.cuda.is_available():
    device = torch.device("cuda")  # a CUDA device object (ROCm GPUs also appear as "cuda")
    print('Using GPU:', torch.cuda.get_device_name(0))
    print(torch.cuda.get_device_properties(0))
else:
    device = torch.device("cpu")
    print('Using CPU')
Using the above code (taking an AMD Radeon Pro W7900 as an example), the device is reported as follows.
$ python ~/query_gpu.py
2.4.0.dev20240401+rocm6.0
Using GPU: AMD Radeon PRO W7900
Execute python3 app.py
to start, and the browser window will automatically open with the default address http://127.0.0.1:9966
(Note: by default the model is downloaded from the ModelScope hub, which does not support proxy downloads; please turn off your proxy.)
Configure a Python 3.9-3.11 environment, install git, and execute the command brew install libsndfile git python@3.10, then continue with:
brew install ffmpeg
export PATH="/usr/local/opt/python@3.10/bin:$PATH"
source ~/.bash_profile
source ~/.zshrc
Create an empty directory /data/chattts
and execute the command cd /data/chattts && git clone https://github.com/jianchang512/chatTTS-ui .
Create a virtual environment python3 -m venv venv
Activate virtual environment source ./venv/bin/activate
Install dependencies pip3 install -r requirements.txt
Install torch pip3 install torch==2.2.0 torchaudio==2.2.0
Execute python3 app.py
to start, and the browser window will automatically open with the default address http://127.0.0.1:9966
(Note: by default the model is downloaded from the ModelScope hub, which does not support proxy downloads; please turn off your proxy.)
Download Python 3.9-3.11 and be sure to check Add Python to environment variables during installation.
Download ffmpeg.exe and place it in the ffmpeg folder in the software directory
Download and install git, https://github.com/git-for-windows/git/releases/download/v2.45.1.windows.1/Git-2.45.1-64-bit.exe
Create an empty folder D:/chattts and enter it. Type cmd in the address bar and press Enter; in the cmd window that opens, execute the command git clone https://github.com/jianchang512/chatTTS-ui .
Create a virtual environment and execute the command python -m venv venv
To activate the virtual environment, execute .\venv\scripts\activate
To install dependencies, execute pip install -r requirements.txt
If CUDA acceleration is not required,
Execute pip install torch==2.2.0 torchaudio==2.2.0
If CUDA acceleration is required, execute
pip install torch==2.2.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118
You also need to install CUDA11.8+ ToolKit, please search for the installation method yourself or refer to https://juejin.cn/post/7318704408727519270
Execute python app.py
to start, and the browser window will automatically open with the default address http://127.0.0.1:9966
(Note: by default the model is downloaded from the ModelScope hub, which does not support proxy downloads; please turn off your proxy.)
If GPU memory is less than 4 GB, the CPU will be used instead.
Under Windows or Linux, if video memory is greater than 4 GB on an NVIDIA card but the CPU is still used after source-code deployment, try uninstalling torch and reinstalling the CUDA build:
pip uninstall -y torch torchaudio
pip install torch==2.2.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118
CUDA 11.8+ must be installed.
By default, the app checks whether ModelScope is reachable; if so, the model is downloaded from ModelScope, otherwise from huggingface.co.
After version 0.96, due to the ChatTTS kernel upgrade, pt files downloaded from this site (https://modelscope.cn/studios/ttwwaaa/ChatTTS_Speaker) can no longer be used directly.
A conversion script, cover-pt.py, has therefore been added. Users of the Win integration package can download cover-pt.exe, place it in the same directory as app.exe, and double-click to run it.
Running python cover-pt.py converts the files in the speaker directory whose names start with seed_ and end with _emb.pt (the default file names after downloading) into a usable encoding format. Converted files are renamed to end with _emb-cover.pt.
Example: if the file speaker/seed_2155_restored_emb.pt exists, it will be converted to speaker/seed_2155_restored_emb-cover.pt, and the original pt file will be deleted, leaving only the converted one.
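The naming convention above can be sketched in a few lines. Note that this only illustrates how file names are matched and renamed; the actual encoding conversion is done by cover-pt.py (the helper name here is an illustration):

```python
from pathlib import Path
from typing import Optional

def cover_name(pt_path: str) -> Optional[str]:
    """Map a downloaded timbre file name to its converted name,
    following the seed_*_emb.pt -> seed_*_emb-cover.pt convention.
    Returns None if the file does not match the pattern."""
    p = Path(pt_path)
    if p.name.startswith("seed_") and p.name.endswith("_emb.pt"):
        # strip the trailing ".pt" and append "-cover.pt"
        return str(p.with_name(p.name[:-len(".pt")] + "-cover.pt"))
    return None
```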
The default address is http://127.0.0.1:9966. To change it, open the .env file in the project directory and change WEB_ADDRESS=127.0.0.1:9966 to a suitable IP and port, such as WEB_ADDRESS=192.168.0.10:9966, so the service can be accessed from the LAN.
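Editing the .env file by hand works fine; for scripted deployments, a minimal sketch that rewrites only the WEB_ADDRESS key and leaves other lines untouched (the helper name is an illustration):

```python
from pathlib import Path

def set_web_address(env_path: str, address: str) -> None:
    """Rewrite the WEB_ADDRESS=... line in a .env file, e.g. to 192.168.0.10:9966."""
    path = Path(env_path)
    out, replaced = [], False
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("WEB_ADDRESS="):
            out.append(f"WEB_ADDRESS={address}")
            replaced = True
        else:
            out.append(line)  # keep every other setting as-is
    if not replaced:
        out.append(f"WEB_ADDRESS={address}")
    path.write_text("\n".join(out) + "\n", encoding="utf-8")
```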
Request method: POST
Request address: http://127.0.0.1:9966/tts
Request parameters:
text: str| Required, the text to be synthesized into speech
voice: optional, default 2222; the number that determines the timbre. You can choose one of 2222 | 7869 | 6653 | 4099 | 5099; otherwise a random voice is used.
prompt: str| optional, default empty, set laughter and pause, for example [oral_2][laugh_0][break_6]
temperature: float| optional, default 0.3
top_p: float| optional, default 0.7
top_k: int| optional, default 20
skip_refine: int| Optional, default 0, 1=skip refine text, 0=not skip
custom_voice: int| optional, default 0; a custom seed value used when obtaining the timbre, must be an integer greater than 0. If set, it takes precedence and voice is ignored.
Return: json data
Successful return: {code:0, msg:'ok', audio_files:[dict1, dict2]}
where audio_files is an array of dicts; each dict is {filename: absolute path of the wav file, url: downloadable wav URL}
Return on failure:
{code:1, msg:'reason for the failure'}
# API call example
import requests
res = requests.post('http://127.0.0.1:9966/tts', data={
"text": "若不懂无需填写",
"prompt": "",
"voice": "3333",
"temperature": 0.3,
"top_p": 0.7,
"top_k": 20,
"skip_refine": 0,
"custom_voice": 0
})
print(res.json())
# success
{code:0, msg:'ok', audio_files:[{filename: 'E:/python/chattts/static/wavs/20240601-22_12_12-c7456293f7b5e4dfd3ff83bbd884a23e.wav', url: 'http://127.0.0.1:9966/static/wavs/20240601-22_12_12-c7456293f7b5e4dfd3ff83bbd884a23e.wav'}]}
# error
{code:1, msg:'error reason'}
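Given the success and error shapes above, a small helper can validate a /tts response and pull out the downloadable URLs (the function name is an illustration, not part of the API):

```python
def extract_audio_urls(resp: dict) -> list:
    """Return the wav URLs from a /tts JSON response, or raise on an error response."""
    if resp.get("code") != 0:
        # error responses look like {code:1, msg:'reason for the failure'}
        raise RuntimeError(f"TTS failed: {resp.get('msg')}")
    # success responses carry audio_files: [{filename: ..., url: ...}, ...]
    return [item["url"] for item in resp.get("audio_files", [])]
```

Call it on the parsed response, e.g. `urls = extract_audio_urls(res.json())`.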
To call ChatTTS from pyVideoTrans, upgrade pyVideoTrans to 1.82+ (https://github.com/jianchang512/pyvideotrans) and select ChatTTS in the main interface.