Deepstory

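Deepstory is an artwork that combines natural language generation (NLG) with GPT-2, text-to-speech (TTS) with Deep Convolutional TTS, speech animation with Speech-Driven Animation, and image animation with the First Order Motion Model.

In short, it turns text, whether written or generated, into a video in which a character is animated to tell your story in his or her own voice.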
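The four stages named above can be pictured as a simple chain. The following is only a minimal sketch of that idea, not Deepstory's actual code: the `Pipeline` class, its field names, and the use of `bytes` for audio and video are all hypothetical, and each stage is a stand-in callable rather than a real model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pipeline:
    """Chains four stages; every stage is a plain callable supplied by the caller."""

    generate_text: Callable[[str], List[str]]        # prompt -> sentences (NLG stage)
    synthesize: Callable[[str], bytes]               # sentence -> audio (TTS stage)
    animate_speech: Callable[[bytes], bytes]         # audio -> driving video (speech animation)
    animate_image: Callable[[bytes, bytes], bytes]   # (image, driving video) -> final video

    def run(self, prompt: str, image: bytes) -> List[bytes]:
        """Produce one video per generated sentence."""
        videos = []
        for sentence in self.generate_text(prompt):
            audio = self.synthesize(sentence)
            driving = self.animate_speech(audio)
            videos.append(self.animate_image(image, driving))
        return videos


if __name__ == "__main__":
    # Stand-in stages that only tag their input, to show the data flow.
    demo = Pipeline(
        generate_text=lambda prompt: [prompt + " one", prompt + " two"],
        synthesize=lambda sentence: sentence.encode(),
        animate_speech=lambda audio: b"drive:" + audio,
        animate_image=lambda image, driving: image + b"|" + driving,
    )
    for video in demo.run("hi", b"IMG"):
        print(video)
```

Keeping each stage behind a plain callable is one way to swap models (for example a different voice) without touching the rest of the chain.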

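You can convert an image into a video like this:

It provides a comfortable web interface and a backend written in Flask for creating your own stories.

It supports Transformer models and pytorch-dctts models.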

Live demo

Colab (flask-ngrok): https://colab.research.google.com/drive/1HYCPUmFw5rN8kvZdwzFpfBlaUMWPNHas?usp=sharing

Video (in case you need instructions): https://blog.thetobysiu.com/video/

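Updates

Redesigned the interface, especially the whole GPT2 interface
GPT2 now supports loading text from the original data, so it can continue generating stories based on the books
Worked out the token limit in GPT2; it now only infers up to the nearest (1024 - predict length) tokens
GPT2 supports an interactive mode, which generates several batches of sentences and provides an interface for adding them
Sentence-to-speaker mapping system; it no longer replaces all speakers by default
Text normalization now happens in the synthesis stage, so punctuation is retained and can be used to vary durations in the synthesized audio
Synthesized audio now all goes into a temp folder, and it is trimmed so that the animated video is more accurate (the sda model was also trained on short clips)
Combined audio now has variable silences according to punctuation
Basically, rewrote the web interface and a lot of the code...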
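The token limit mentioned in the updates can be illustrated with a short sketch. GPT-2 has a context window of 1024 tokens, so a long source text has to be cut down to leave room for the tokens to be predicted. The function below is an assumption about how that might look, not the project's code: `clip_prompt` and `predict_length` are hypothetical names, token IDs are plain integers, and "nearest" is read here as "the most recent tokens before the continuation point".

```python
CONTEXT_WINDOW = 1024  # GPT-2's maximum sequence length, in tokens


def clip_prompt(token_ids, predict_length, context_window=CONTEXT_WINDOW):
    """Keep only the most recent tokens so that prompt + prediction fits the window."""
    if not 0 < predict_length < context_window:
        raise ValueError("predict_length must be between 1 and context_window - 1")
    budget = context_window - predict_length
    # Slicing from the end keeps the text closest to where generation continues.
    return list(token_ids[-budget:])


if __name__ == "__main__":
    tokens = list(range(3000))           # stand-in for a tokenized book
    prompt = clip_prompt(tokens, predict_length=100)
    print(len(prompt))                   # 924
    print(prompt[0])                     # 2076
```

A text shorter than the budget passes through unchanged, so the same call works for both short prompts and whole books.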

A Colab version is coming soon!
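The trimming and punctuation-based silence described in the updates can be sketched as follows. This is an illustration only: the sample rate, the pause lengths, the trim threshold, and all function names are assumptions made for the example (the project's real values are not given here), and audio clips are represented as plain lists of float samples.

```python
import re

SAMPLE_RATE = 22050  # assumed sample rate, in samples per second

# Assumed pause lengths in seconds for each punctuation mark.
PAUSE_SECONDS = {",": 0.2, ";": 0.3, ":": 0.3, ".": 0.5, "!": 0.5, "?": 0.5}
DEFAULT_PAUSE = 0.1  # used when a chunk has no trailing punctuation


def split_with_punctuation(text):
    """Split text into (chunk, last trailing punctuation mark) pairs."""
    pairs = []
    for match in re.finditer(r"([^,;:.!?]+)([,;:.!?]*)", text):
        chunk = match.group(1).strip()
        if chunk:
            marks = match.group(2)
            pairs.append((chunk, marks[-1] if marks else ""))
    return pairs


def trim_silence(clip, threshold=0.01):
    """Drop leading and trailing samples whose magnitude is below the threshold."""
    start, end = 0, len(clip)
    while start < end and abs(clip[start]) < threshold:
        start += 1
    while end > start and abs(clip[end - 1]) < threshold:
        end -= 1
    return clip[start:end]


def silence_samples(mark, sample_rate=SAMPLE_RATE):
    """Number of silent samples to insert after a chunk ending in `mark`."""
    return round(PAUSE_SECONDS.get(mark, DEFAULT_PAUSE) * sample_rate)


def combine(clips, marks, sample_rate=SAMPLE_RATE):
    """Concatenate trimmed clips, padding each with punctuation-based silence."""
    combined = []
    for clip, mark in zip(clips, marks):
        combined.extend(trim_silence(clip))
        combined.extend([0.0] * silence_samples(mark, sample_rate))
    return combined


if __name__ == "__main__":
    pairs = split_with_punctuation("Hello, world. How are you?")
    print(pairs)
    # Stand-in audio: 100 audible samples padded with 50 silent ones on each side.
    clips = [[0.0] * 50 + [0.1] * 100 + [0.0] * 50 for _ in pairs]
    audio = combine(clips, [mark for _, mark in pairs])
    print(len(audio) / SAMPLE_RATE, "seconds")
```

Splitting before normalization is what makes this possible: once punctuation has been stripped from the text, there is nothing left to look up a pause for.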

Interface

Folder structure

 Deepstory
├── animator.py
├── app.py
├── data
│   ├── dctts
│   │   ├── Geralt
│   │   │   ├── ssrn.pth
│   │   │   └── t2m.pth
│   │   ├── LJ
│   │   │   ├── ssrn.pth
│   │   │   └── t2m.pth
│   │   └── Yennefer
│   │       ├── ssrn.pth
│   │       └── t2m.pth
│   ├── fom
│   │   ├── vox-256.yaml
│   │   ├── vox-adv-256.yaml
│   │   ├── vox-adv-cpk.pth.tar
│   │   └── vox-cpk.pth.tar
│   ├── gpt2
│   │   ├── Waiting for Godot
│   │   │   ├── config.json
│   │   │   ├── default.txt
│   │   │   ├── merges.txt
│   │   │   ├── pytorch_model.bin
│   │   │   ├── special_tokens_map.json
│   │   │   ├── text.txt
│   │   │   ├── tokenizer_config.json
│   │   │   └── vocab.json
│   │   └── Witcher Books
│   │       ├── config.json
│   │       ├── default.txt
│   │       ├── merges.txt
│   │       ├── pytorch_model.bin
│   │       ├── special_tokens_map.json
│   │       ├── text.txt
│   │       ├── tokenizer_config.json
│   │       └── vocab.json
│   ├── images
│   │   ├── Geralt
│   │   │   ├── 0.jpg
│   │   │   └── fx.jpg
│   │   └── Yennefer
│   │       ├── 0.jpg
│   │       ├── 1.jpg
│   │       ├── 2.jpg
│   │       ├── 3.jpg
│   │       ├── 4.jpg
│   │       └── 5.jpg
│   └── sda
│       ├── grid.dat
│       └── image.bmp
├── deepstory.py
├── generate.py
├── modules
│   ├── dctts
│   │   ├── audio.py
│   │   ├── hparams.py
│   │   ├── __init__.py
│   │   ├── layers.py
│   │   ├── ssrn.py
│   │   └── text2mel.py
│   ├── fom
│   │   ├── animate.py
│   │   ├── dense_motion.py
│   │   ├── generator.py
│   │   ├── __init__.py
│   │   ├── keypoint_detector.py
│   │   ├── sync_batchnorm
│   │   │   ├── batchnorm.py
│   │   │   ├── comm.py
│   │   │   ├── __init__.py
│   │   │   └── replicate.py
│   │   └── util.py
│   └── sda
│       ├── encoder_audio.py
│       ├── encoder_image.py
│       ├── img_generator.py
│       ├── __init__.py
│       ├── rnn_audio.py
│       ├── sda.py
│       └── utils.py
├── README.md
├── requirements.txt
├── static
│   ├── bootstrap
│   │   ├── css
│   │   │   └── bootstrap.min.css
│   │   └── js
│   │       └── bootstrap.min.js
│   ├── css
│   │   └── styles.css
│   └── js
│       └── jquery.min.js
├── templates
│   ├── animate.html
│   ├── deepstory.js
│   ├── gen_sentences.html
│   ├── gpt2.html
│   ├── index.html
│   ├── map.html
│   ├── models.html
│   ├── sentences.html
│   ├── status.html
│   └── video.html
├── test.py
├── text.txt
├── util.py
└── voice.py
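
Given the layout above, one way a backend could find the available voices, GPT2 models and character images is to scan the `data` folder. The sketch below assumes exactly that and nothing more: `list_models` and `list_images` are hypothetical helpers, and the required file names are simply taken from the tree.

```python
from pathlib import Path

# File names taken from the tree above.
DCTTS_FILES = {"ssrn.pth", "t2m.pth"}
GPT2_FILES = {"config.json", "pytorch_model.bin", "vocab.json", "merges.txt"}


def list_models(root, required):
    """Return names of subfolders of `root` that contain every required file."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        folder.name
        for folder in root.iterdir()
        if folder.is_dir() and required <= {p.name for p in folder.iterdir()}
    )


def list_images(root):
    """Map each character folder to its sorted .jpg file names."""
    root = Path(root)
    if not root.is_dir():
        return {}
    return {
        folder.name: sorted(p.name for p in folder.glob("*.jpg"))
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


if __name__ == "__main__":
    data = Path("data")
    print(list_models(data / "dctts", DCTTS_FILES))
    print(list_models(data / "gpt2", GPT2_FILES))
    print(list_images(data / "images"))
```

Folders that are missing a required file are skipped rather than reported as errors, so a half-downloaded model simply does not show up in the list.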