spear tts pytorch下載 - spear tts pytorch原始碼下載

spear tts pytorch

Ai源碼

0.4.8

下載

Spear-TTS - Pytorch

在 Pytorch 實現 Spear-TTS - 多說話者文字轉語音注意網絡

這裡建構的文本到語義模組將用於 SoundStorm 進行調節。

欣賞

他們慷慨的贊助工作和開源尖端人工智慧研究的穩定性
Lucas Newman 完成了反向翻譯部分以及波束搜尋解碼！
Lucas Newman 完成了最終文字到語意轉換器的訓練程式碼！

安裝

$ pip install spear-tts-pytorch

用法

 import torch

from audiolm_pytorch import HubertWithKmeans

from spear_tts_pytorch import (
    TextToSemantic ,
    SemanticToTextDatasetGenerator ,
    GeneratedAudioTextDataset ,
    MockDataset
)

wav2vec = HubertWithKmeans (
    checkpoint_path = './hubert_base_ls960.pt' ,
    kmeans_path = './hubert_base_ls960_L9_km500.bin'
)

model = TextToSemantic (
    wav2vec = wav2vec ,
    dim = 512 ,
    num_text_token_ids = 256 ,
    heads = 8 ,
    target_kv_heads = 2 , # grouped query attention, for memory efficient decoding
    source_depth = 1 ,
    target_depth = 1
)

ds = MockDataset ( 10 )

dataset_generator = SemanticToTextDatasetGenerator (
    model = model ,
    dataset = ds ,
    folder = './output_folder'
)

dataset_generator ( max_length = 2 )

generated_dataset = GeneratedAudioTextDataset (
    folder = './output_folder'
)

assert len ( generated_dataset ) == 10

托多

引文

 @misc { kharitonov2023speak ,
    title   = { Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision } , 
    author  = { Eugene Kharitonov and Damien Vincent and Zalán Borsos and Raphaël Marinier and Sertan Girgin and Olivier Pietquin and Matt Sharifi and Marco Tagliasacchi and Neil Zeghidour } ,
    year    = { 2023 } ,
    eprint  = { 2302.03540 } ,
    archivePrefix = { arXiv } ,
    primaryClass = { cs.SD }
}

 @inproceedings { dao2022flashattention ,
    title   = { Flash{A}ttention: Fast and Memory-Efficient Exact Attention with {IO}-Awareness } ,
    author  = { Dao, Tri and Fu, Daniel Y. and Ermon, Stefano and Rudra, Atri and R{'e}, Christopher } ,
    booktitle = { Advances in Neural Information Processing Systems } ,
    year    = { 2022 }
}

 @misc { shi2023enhance ,
    title   = { Enhance audio generation controllability through representation similarity regularization } , 
    author  = { Yangyang Shi and Gael Le Lan and Varun Nagaraja and Zhaoheng Ni and Xinhao Mei and Ernie Chang and Forrest Iandola and Yang Liu and Vikas Chandra } ,
    year    = { 2023 } ,
    eprint  = { 2309.08773 } ,
    archivePrefix = { arXiv } ,
    primaryClass = { cs.SD }
}

 @article { Ainslie2023GQATG ,
    title   = { GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints } ,
    author  = { Joshua Ainslie and James Lee-Thorp and Michiel de Jong and Yury Zemlyanskiy and Federico Lebr'on and Sumit K. Sanghai } ,
    journal = { ArXiv } ,
    year    = { 2023 } ,
    volume  = { abs/2305.13245 } ,
    url     = { https://api.semanticscholar.org/CorpusID:258833177 }
}

 @inproceedings { Leviathan2022FastIF ,
    title   = { Fast Inference from Transformers via Speculative Decoding } ,
    author  = { Yaniv Leviathan and Matan Kalman and Y. Matias } ,
    booktitle = { International Conference on Machine Learning } ,
    year    = { 2022 } ,
    url     = { https://api.semanticscholar.org/CorpusID:254096365 }
}

展開

附加信息

版本 0.4.8
類型 Ai源碼
更新時間 2025-01-15
大小 110.97KB
來自於 Github

相關應用

GitHub sgrebnov/cordova plugin background download

2024-11-05
pytorch image models

2024-11-03
F5 TTS ComfyUI

2024-11-02
Wa ch the greatest of all time 2024 ull ovie Online For Fr e Strea ings At Home

2024-11-02
wolfs 2024 f llmo ie f lmyz lla dow load ree 7 0p 4 0p a d 10 0p

2024-11-01
語音開發英文資料(TTS使用指南Delphi版)

2009-05-28

爲您推薦

chat.petals.dev

其他源碼

1.0.0
GPT Prompt Templates

其他源碼

1.0.0
GPTyped

其他源碼

GPTyped 1.0.5
node telegram bot api

Ai源碼

v0.50.0
typebot.io

Ai源碼

v3.1.2
python wechaty getting started

Ai源碼

1.0.0
waymo open dataset

其他源碼

December 2023 Update
termwind

其他類別

v2.3.0
wp functions

其他類別

1.0.0

相關資訊全部