spear tts pytorch

AI Source Code

0.4.8

Spear-TTS - Pytorch

Implementation of Spear-TTS, a multi-speaker text-to-speech attention network, in Pytorch

The text-to-semantic module built here will be used to condition SoundStorm.

Appreciation

Stability for their generous sponsorship to work on and open source cutting-edge artificial intelligence research
Lucas Newman for completing the backtranslation portion, as well as beam search decoding!
Lucas Newman for completing the final text-to-semantic transformer training code!

Install

$ pip install spear-tts-pytorch

Usage

import torch

from audiolm_pytorch import HubertWithKmeans

from spear_tts_pytorch import (
    TextToSemantic,
    SemanticToTextDatasetGenerator,
    GeneratedAudioTextDataset,
    MockDataset
)

# HuBERT + k-means quantizer that turns raw audio into semantic token ids
# (both checkpoint files must already exist at these paths)
wav2vec = HubertWithKmeans(
    checkpoint_path = './hubert_base_ls960.pt',
    kmeans_path = './hubert_base_ls960_L9_km500.bin'
)

# text <-> semantic encoder / decoder transformer
model = TextToSemantic(
    wav2vec = wav2vec,
    dim = 512,
    num_text_token_ids = 256,
    heads = 8,
    target_kv_heads = 2, # grouped query attention, for memory efficient decoding
    source_depth = 1,
    target_depth = 1
)

# mock dataset with 10 items, for demonstration only
ds = MockDataset(10)

# backtranslation: generate text for each audio item and write the pairs to disk
dataset_generator = SemanticToTextDatasetGenerator(
    model = model,
    dataset = ds,
    folder = './output_folder'
)

dataset_generator(max_length = 2)

# read the generated audio / text pairs back from the same folder
generated_dataset = GeneratedAudioTextDataset(
    folder = './output_folder'
)

assert len(generated_dataset) == 10
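
The `target_kv_heads = 2` setting above is commented as grouped query attention "for memory efficient decoding". As a rough illustration of why that helps, the sketch below counts the key / value elements a decoder would cache per sequence when 8 query heads share 2 key / value heads. This is back-of-the-envelope arithmetic and does not call the library: `kv_cache_elements`, `dim_head = 64` and `seq_len = 1024` are hypothetical names and values chosen for the example, not taken from spear-tts-pytorch.

```python
def kv_cache_elements(num_kv_heads, dim_head, seq_len, num_layers, batch_size = 1):
    """Number of cached key + value elements during autoregressive decoding."""
    keys_and_values = 2  # one tensor for keys, one for values
    return keys_and_values * batch_size * num_layers * num_kv_heads * seq_len * dim_head

def query_heads_per_kv_head(heads, kv_heads):
    """How many query heads share each key / value head."""
    if heads % kv_heads != 0:
        raise ValueError('heads must be divisible by kv_heads')
    return heads // kv_heads

heads = 8            # from the usage example
target_kv_heads = 2  # from the usage example
target_depth = 1     # from the usage example
dim_head = 64        # assumption, for illustration only
seq_len = 1024       # assumption, for illustration only

# cache size if every query head kept its own keys / values
full_cache = kv_cache_elements(heads, dim_head, seq_len, target_depth)

# cache size when query heads are grouped onto 2 key / value heads
grouped_cache = kv_cache_elements(target_kv_heads, dim_head, seq_len, target_depth)

group_size = query_heads_per_kv_head(heads, target_kv_heads)

print(f'query heads per kv head: {group_size}')
print(f'cache elements, full multi-head: {full_cache}')
print(f'cache elements, grouped query:   {grouped_cache}')
```

Under these assumptions the cache shrinks by the same factor as the number of query heads per key / value head, independent of the chosen `dim_head` and `seq_len`.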

Todo

Citations

@misc{kharitonov2023speak,
    title   = {Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision},
    author  = {Eugene Kharitonov and Damien Vincent and Zalán Borsos and Raphaël Marinier and Sertan Girgin and Olivier Pietquin and Matt Sharifi and Marco Tagliasacchi and Neil Zeghidour},
    year    = {2023},
    eprint  = {2302.03540},
    archivePrefix = {arXiv},
    primaryClass = {cs.SD}
}

@inproceedings{dao2022flashattention,
    title   = {Flash{A}ttention: Fast and Memory-Efficient Exact Attention with {IO}-Awareness},
    author  = {Dao, Tri and Fu, Daniel Y. and Ermon, Stefano and Rudra, Atri and R{\'e}, Christopher},
    booktitle = {Advances in Neural Information Processing Systems},
    year    = {2022}
}

@misc{shi2023enhance,
    title   = {Enhance audio generation controllability through representation similarity regularization},
    author  = {Yangyang Shi and Gael Le Lan and Varun Nagaraja and Zhaoheng Ni and Xinhao Mei and Ernie Chang and Forrest Iandola and Yang Liu and Vikas Chandra},
    year    = {2023},
    eprint  = {2309.08773},
    archivePrefix = {arXiv},
    primaryClass = {cs.SD}
}

@article{Ainslie2023GQATG,
    title   = {GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints},
    author  = {Joshua Ainslie and James Lee-Thorp and Michiel de Jong and Yury Zemlyanskiy and Federico Lebr{\'o}n and Sumit K. Sanghai},
    journal = {ArXiv},
    year    = {2023},
    volume  = {abs/2305.13245},
    url     = {https://api.semanticscholar.org/CorpusID:258833177}
}

@inproceedings{Leviathan2022FastIF,
    title   = {Fast Inference from Transformers via Speculative Decoding},
    author  = {Yaniv Leviathan and Matan Kalman and Y. Matias},
    booktitle = {International Conference on Machine Learning},
    year    = {2022},
    url     = {https://api.semanticscholar.org/CorpusID:254096365}
}

Additional Information

Version 0.4.8
Type AI Source Code
Updated 2025-01-15
Size 110.97KB
Source GitHub
