pyannote.audio
Version 3.3.1
Using the pyannote.audio open-source toolkit in production? Consider switching to pyannoteAI for better and faster options.
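pyannote.audio speaker diarization toolkit
pyannote.audio is an open-source toolkit written in Python for speaker diarization. Built on the PyTorch machine learning framework, it comes with state-of-the-art pretrained models and pipelines that can be further fine-tuned on your own data for even better performance.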
1. Install pyannote.audio with `pip install pyannote.audio`
2. Accept the pyannote/segmentation-3.0 user conditions
3. Accept the pyannote/speaker-diarization-3.1 user conditions
4. Create an access token at hf.co/settings/tokens.

```python
import torch
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="HUGGINGFACE_ACCESS_TOKEN_GOES_HERE")

# send pipeline to GPU (when available)
if torch.cuda.is_available():
    pipeline.to(torch.device("cuda"))

# apply pretrained pipeline
diarization = pipeline("audio.wav")

# print the result
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
# start=0.2s stop=1.5s speaker_0
# start=1.8s stop=3.9s speaker_1
# start=4.2s stop=5.7s speaker_0
# ...
```
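Once the turns are available, a common next step is to summarize them per speaker. The sketch below is only an illustration: `speaking_time` is a hypothetical helper, not part of pyannote.audio, and it works on plain `(start, end, speaker)` tuples so that it runs without a model or an audio file. The sample turns are the ones from the example output above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def speaking_time(turns: Iterable[Tuple[float, float, str]]) -> Dict[str, float]:
    """Sum the duration (in seconds) of all turns for each speaker label.

    Hypothetical helper for illustration; not a pyannote.audio function.
    """
    totals: Dict[str, float] = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


# Turns copied from the example output above.
example_turns = [
    (0.2, 1.5, "speaker_0"),
    (1.8, 3.9, "speaker_1"),
    (4.2, 5.7, "speaker_0"),
]

totals = speaking_time(example_turns)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f}s")
```

With a real pipeline result, the same tuples could be built from the loop shown earlier, for example `((turn.start, turn.end, speaker) for turn, _, speaker in diarization.itertracks(yield_label=True))`.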
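Pretrained speech separation pipeline, by Clément Pagés

Out of the box, the pyannote.audio speaker diarization pipeline v3.1 is expected to be better (and faster) than v2.x. These numbers are diarization error rates (in %):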
Benchmark | v2.1 | v3.1 | pyannoteAI |
---|---|---|---|
AISHELL-4 | 14.1 | 12.2 | 11.9 |
AliMeeting (channel 1) | 27.4 | 24.4 | 22.5 |
AMI (IHM) | 18.9 | 18.8 | 16.6 |
AMI (SDM) | 27.1 | 22.4 | 20.9 |
AVA-AVD | 66.3 | 50.0 | 39.8 |
CALLHOME (part 2) | 31.6 | 28.4 | 22.2 |
DIHARD 3 (full) | 26.9 | 21.7 | 17.2 |
Earnings21 | 17.0 | 9.4 | 9.0 |
Ego4D (dev.) | 61.5 | 51.2 | 43.8 |
MSDWild | 32.8 | 25.3 | 19.8 |
RAMC | 22.5 | 22.2 | 18.4 |
REPERE (phase 2) | 8.2 | 7.8 | 7.6 |
VoxConverse (v0.3) | 11.2 | 11.3 | 9.4 |
Diarization error rate (in %)
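For readers unfamiliar with the metric, diarization error rate is conventionally the sum of false alarm, missed detection, and speaker confusion durations divided by the total duration of reference speech. The sketch below shows that arithmetic and how a relative reduction between two table entries can be computed. The function names are hypothetical, the durations in the first example are made up for illustration, and the exact scoring setup behind the table (collar, handling of overlapped speech) is not stated in this section.

```python
def diarization_error_rate(false_alarm: float, missed_detection: float,
                           confusion: float, total: float) -> float:
    """Return DER in percent from error durations and total reference speech.

    All arguments are durations in seconds. Hypothetical helper for
    illustration; not a pyannote.audio function.
    """
    if total <= 0:
        raise ValueError("total reference speech duration must be positive")
    return 100.0 * (false_alarm + missed_detection + confusion) / total


def relative_reduction(old: float, new: float) -> float:
    """Return the relative reduction from `old` to `new`, in percent."""
    return 100.0 * (old - new) / old


# Made-up durations (seconds), for illustration only.
der = diarization_error_rate(false_alarm=3.0, missed_detection=5.0,
                             confusion=4.0, total=100.0)
print(f"DER = {der:.1f}%")

# AVA-AVD row of the table: 66.3 (v2.1) versus 50.0 (v3.1).
ava_avd_gain = relative_reduction(66.3, 50.0)
print(f"AVA-AVD relative DER reduction, v2.1 to v3.1: {ava_avd_gain:.1f}%")
```

Because DER is a ratio of durations, it can exceed 100% when false alarms are large, which is worth keeping in mind when reading the higher numbers in the table.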
If you use pyannote.audio, please use the following citations:
```bibtex
@inproceedings{Plaquet23,
  author={Alexis Plaquet and Hervé Bredin},
  title={{Powerset multi-class cross entropy loss for neural speaker diarization}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
}

@inproceedings{Bredin23,
  author={Hervé Bredin},
  title={{pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
}
```
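The commands below will set up the pre-commit hooks and packages needed for developing the pyannote.audio library, then run the test suite.

```bash
pip install -e .[dev,testing]
pre-commit install
pytest
```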