emoji_vid_gen

Emoji Video Generator


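EmojiVidGen is a fun tool that creates videos from text files. It takes as input a plain text file containing a script (similar to a story or a dialogue) and turns that script into a video. EmojiVidGen is built on a plugin system, which makes it possible to experiment with different models and languages. All you need is some imagination and typing skills!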

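Key Features

Converts text files into visually appealing videos
Automatically generates voice-overs, images, and sound effects
Designed to run smoothly on a computer with 8 GB of RAM, with reasonable processing speed even without a GPU
Uses a variety of generative AI models to do its work
Built on a plugin system that makes it easy to extend
Lets you switch between different models and spoken languages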

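Although EmojiVidGen was originally intended for having fun with GenAI, it has considerable potential for producing engaging content, especially in capable hands. The project is experimental and was created mainly for educational purposes, to explore the possibilities of AI-driven video creation.

This software is intended for educational purposes only. You use it at your own discretion and risk. Please note that the AI models used in this code may carry restrictions on commercial use.

Installation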

sudo apt update
sudo apt install espeak ffmpeg

git clone https://github.com/code2k13/emoji_vid_gen
cd emoji_vid_gen
wget https://github.com/googlefonts/noto-emoji/raw/main/fonts/NotoColorEmoji.ttf

pip install -r requirements.txt

Sample Script

Note: A script should always begin with an Image: directive.

Image: Cartoon illustration showing a beautiful landscape with mountains and a road.
Audio: Tranquil calm music occasional chirping of birds.
Title: EmojiVidGen
?: Emoji vid gen is a tool to create videos from text files using AI.
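
The sample above is line-oriented: each line is a label (a directive such as Image, Audio, or Title, or a character), a colon, and the text. A minimal sketch of how such a file could be split into pairs and checked against the "starts with Image:" rule follows. `parse_script` and `speakers` are hypothetical helpers written for illustration; they are not the parser EmojiVidGen itself uses.

```python
from typing import List, Set, Tuple

# Directive names seen in the sample script; any other label is treated
# as a character (this split is an assumption made for the sketch).
DIRECTIVES = {"Image", "Audio", "Title"}


def parse_script(text: str) -> List[Tuple[str, str]]:
    """Split a script into (label, content) pairs, skipping blank lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Only the first colon separates label from content, so the
        # content itself may contain further colons.
        label, sep, content = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in line: {line!r}")
        entries.append((label.strip(), content.strip()))
    if not entries or entries[0][0] != "Image":
        raise ValueError("a script should start with an Image: directive")
    return entries


def speakers(entries: List[Tuple[str, str]]) -> Set[str]:
    """Return the labels that are not known directives."""
    return {label for label, _ in entries if label not in DIRECTIVES}
```

Running a check like this before a long generation job can catch a malformed script early, before any model is loaded.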

How to Run

python generate_video.py stories/hello.txt hello.mp4

A Full-Featured Example

Image: A single trophy kept on table. comic book style.
Audio: Upbeat introduction music for cartoon show.
Title: Emoji Quiz Showdown
?: "Welcome to the Emoji Quiz Showdown! Are you ready to test your knowledge?"
?: "Meow! I'm ready!"
?: "Woof! Let's do this!"
Image: Cartoon illustration of the Eiffel Tower.
?: "First question What is the capital of France?"
Audio: suspenseful music playing.
?: "Paris!"
Audio: people applauding sound
Image: Cartoon illustration of Mount Everest.
?: "Correct! One point for the cat! Next question What is the tallest mountain in the world?"
Audio: suspenseful music playing.
?: "Mount Everest!"
Audio: people applauding sound
Image: Cartoon illustration of a water molecule.
?: "Right again! One point for the dog! Next question What is the chemical symbol for water?"
Audio: suspenseful music playing.
?: "H2O!"
Audio: people applauding sound
Image: Cartoon illustration of a globe with seven continents.
?: "Correct! Another point for the cat! Last question How many continents are there on Earth?"
Audio: suspenseful music playing.
?: "Seven!"
Audio: people applauding sound
?: "Correct! It's a tie! You both did great! Thanks for playing the Emoji Quiz Showdown!"

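Narrator

The emoji ?️ is reserved for the narrator. Starting a line with it makes the system produce only sound, with no image drawn over the background.

Using Presets

If you followed the video generation instructions above, you may have noticed that the default setup uses espeak as the text-to-speech engine, which produces robotic-sounding output. EmojiVidGen is built around an internal structure of plugins, each of which can change how a task is carried out or which model is used.

For example, you can assign a specific plugin to each type of generation task: text-to-image, text-to-audio, or text-to-speech. Because each plugin works with its own model and method, configuring these settings one by one can be overwhelming. To simplify this, I introduced the concept of presets. You apply a preset by passing the --preset option to generate_video.py.

For example, the command below uses the preset named local_medium.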

python generate_video.py stories/hello.txt hello.mp4 --preset local_medium

All presets are stored in the ./presets folder. To create a new preset (say, custom_preset), create a new custom_preset.yaml file in the ./presets folder and use it like this:

python generate_video.py stories/hello.txt hello.mp4 --preset custom_preset
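
When the same story is rendered with several presets, the command line can be assembled in Python. The sketch below only builds the argument list from the invocation shown in this README (script path, output path, optional --preset); `build_command` is a hypothetical helper, and no other flags are assumed.

```python
import sys
from typing import List, Optional


def build_command(script: str, output: str, preset: Optional[str] = None) -> List[str]:
    """Assemble the generate_video.py invocation as an argument list."""
    cmd = [sys.executable, "generate_video.py", script, output]
    if preset:
        cmd += ["--preset", preset]
    return cmd


# Usage, assuming the current directory is the emoji_vid_gen checkout:
#   import subprocess
#   subprocess.run(build_command("stories/hello.txt", "hello.mp4", "custom_preset"), check=True)
```

Passing a list rather than a single shell string avoids quoting problems when file names contain spaces.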

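Note that the voice used in the characters section must be supported by the selected text_to_speech provider. Images should ideally be PNG files with a square aspect ratio and a transparent background.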
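
The image recommendation can be checked before a run by reading the PNG header, using only the standard library. This is a rough sketch: `check_character_image` is a hypothetical helper, it only tests whether the file has an alpha channel (it cannot tell whether the background is actually transparent), and palette PNGs that carry transparency in a tRNS chunk are reported as having no alpha channel.

```python
import struct
from typing import List

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_character_image(path: str) -> List[str]:
    """Return a list of warnings for a character image; empty means it looks fine."""
    with open(path, "rb") as f:
        # 8-byte signature + 4-byte chunk length + 4-byte chunk type
        # + the first 10 bytes of IHDR (width, height, bit depth, color type).
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return ["not a PNG file"]
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    warnings = []
    if width != height:
        warnings.append(f"not square: {width}x{height}")
    # Color types 4 (greyscale + alpha) and 6 (RGB + alpha) have an alpha channel.
    if color_type not in (4, 6):
        warnings.append("no alpha channel")
    return warnings
```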

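Available Presets

Preset name	Description
openai_basic	Uses OpenAI for text-to-speech (standard) and image generation (DALL-E 2 @ 512x512). Requires the `OPENAI_API_KEY` environment variable to be set.
openai_medium	Similar to openai_basic, but uses DALL-E 3 @ 1024x1024. Requires the `OPENAI_API_KEY` environment variable to be set.
local_basic	Uses Huggingface's Stable Diffusion pipeline with the `stabilityai/sd-turbo` model for text-to-image. Uses `espeak` for text-to-speech and Huggingface's AudioLDM pipeline for text-to-audio.
local_basic_gpu	Same as local_basic, but with cuda support enabled.
local_medium	Similar to local_basic, but uses `brave` as the text-to-speech engine and the `stabilityai/sdxl-turbo` model for text-to-image.
local_medium_gpu	Same as local_medium, but with cuda support enabled.
eleven_medium	Same as local_medium, but with `ElevenLabs` text-to-speech API support enabled. Requires the `ELEVEN_API_KEY` variable to be defined in the `.env` file, internet access, and an ElevenLabs account.
parler_medium	Same as local_medium, but with `parler` text-to-speech support enabled.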

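Configuring Characters

Sometimes you may not want to use emojis as the characters in your video, or you may want a different voice for each character. This can now be done with the characters section of the preset YAML file. An example of such a section is given below: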

global:
  width: 512
  height: 512
  use_cuda: "false"
  characters:
    - name: "?"
      voice: "fable"

    - name: "?"
      image: "/workspace/emoji_vid_gen/cat.png"
      voice: "alloy"

    - name: "?"
      image: "/workspace/emoji_vid_gen/dog.png"
      voice: "echo"

text_to_speech:
  provider: openai
  voice: Nova

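Creating Custom Presets

Work in progress.

About the Cache

EmojiVidGen uses a caching mechanism to keep the assets produced during video creation, each associated with the specific "prompt" that was used. This is very useful when refining a video iteratively, because assets do not have to be regenerated over and over. Note, however, that the .cache directory is not cleared automatically. It is advisable to clear it when you finish one video project and start another.

Tip: To force a cached asset to be recreated, make a small change to its "prompt", such as adding a space or a punctuation mark.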
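
Why a one-character change forces regeneration can be seen in a toy model of a prompt-keyed cache. The sketch below is an illustration under assumptions: it keeps entries in an in-memory dictionary and derives the key from a SHA-256 hash of the asset kind and the exact prompt text. How EmojiVidGen actually builds its keys and stores entries under .cache is not described here, and `PromptCache` is a hypothetical class.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Toy in-memory cache keyed by the exact prompt text."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}
        self.misses = 0  # number of times the generator had to run

    @staticmethod
    def key(kind: str, prompt: str) -> str:
        # Any change to the prompt, even one trailing space, yields a new key.
        return hashlib.sha256(f"{kind}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_create(self, kind: str, prompt: str, generate: Callable[[str], bytes]) -> bytes:
        k = self.key(kind, prompt)
        if k not in self._store:
            self.misses += 1
            self._store[k] = generate(prompt)  # stands in for the expensive model call
        return self._store[k]
```

With a cache like this, repeating a prompt verbatim reuses the stored asset, while "a cat" and "a cat " are two different entries.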

Using Pre-created Assets

Make sure the asset files are present in the .cache folder, then write the script like this:

Image: .cache/existing_background_hd.png
Audio: Funny opening music jingle.
Title: EmojiVidGen
?: .cache/existing_speech.wav
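
In the script above, some values are file paths and others are prompts. One plausible way to tell them apart is to test whether the value names an existing file. The sketch below shows that idea only; `classify_value` is a hypothetical helper, and the rule EmojiVidGen itself applies may differ.

```python
import os
from typing import Tuple


def classify_value(value: str) -> Tuple[str, str]:
    """Return ("file", path) if value names an existing file, else ("prompt", value)."""
    candidate = value.strip()
    if os.path.isfile(candidate):
        return ("file", candidate)
    return ("prompt", candidate)
```

Under this rule a mistyped path would silently be treated as a prompt, so it can be worth listing the "file" and "prompt" decisions before starting a long run.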

Changing the Default Image Width and Height

Copy an appropriate preset file and modify the following lines:

global:
  width: 1152
  height: 896

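Note: This setting does affect the output of Stable Diffusion. Not all resolutions work equally well. For more information, see https://replicate.com/guides/stable-diffusion/how-to-use/ . Stable Diffusion seems to work well with square aspect ratios.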
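
The copy-and-edit step can be scripted. The sketch below copies a preset file and rewrites its first width and height lines with a regular expression, so it needs no YAML library. `copy_preset_with_size` is a hypothetical helper. The multiple-of-8 check reflects a common requirement of Stable Diffusion pipelines and is an assumption here, not something this README states.

```python
import re
from pathlib import Path


def copy_preset_with_size(src: str, dst: str, width: int, height: int) -> None:
    """Copy a preset YAML file, rewriting its width and height lines."""
    if width % 8 or height % 8:
        # Assumption: the image pipeline expects dimensions divisible by 8.
        raise ValueError("width and height should be multiples of 8")
    text = Path(src).read_text(encoding="utf-8")
    for key, value in (("width", width), ("height", height)):
        # Match a line such as "  width: 512", keeping its indentation.
        pattern = rf"^([ \t]*{key}[ \t]*:[ \t]*)\d+[ \t]*$"
        text, count = re.subn(pattern, rf"\g<1>{value}", text, count=1, flags=re.MULTILINE)
        if count == 0:
            raise ValueError(f"no '{key}:' line found in {src}")
    Path(dst).write_text(text, encoding="utf-8")


# Usage, with hypothetical file names inside the presets folder:
#   copy_preset_with_size("presets/local_basic.yaml", "presets/wide.yaml", 1152, 896)
```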

Known Issues

When using the espeak text-to-speech provider, you will see this error message:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pyttsx3/drivers/espeak.py", line 171, in _onSynth
    self._proxy.notify('finished-utterance', completed=True)
ReferenceError: weakly-referenced object no longer exists

Ignore this error for now; it does not affect the output.

If you get the error below, delete the .cache directory:

  File "plyvel/_plyvel.pyx", line 247, in plyvel._plyvel.DB.__init__
  File "plyvel/_plyvel.pyx", line 88, in plyvel._plyvel.raise_for_status
plyvel._plyvel.IOError: b'IO error: lock .cache/asset/LOCK: Resource temporarily unavailable'
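
Deleting the cache directory can be done from Python as well as from the shell. A minimal sketch, assuming the cache lives in a .cache folder under the current working directory as the paths in this README suggest; `clear_cache` is a hypothetical helper, not part of EmojiVidGen.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir: str = ".cache") -> bool:
    """Delete the cache directory; return True if something was removed."""
    path = Path(cache_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

This removes everything in the folder, including any pre-created assets placed there by hand, so copy those elsewhere first.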

Citations

@misc{lacombe-etal-2024-parler-tts,
  author = {Yoach Lacombe and Vaibhav Srivastav and Sanchit Gandhi},
  title = {Parler-TTS},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/parler-tts}}
}

@misc{lyth2024natural,
  title = {Natural language guidance of high-fidelity text-to-speech with synthetic annotations},
  author = {Dan Lyth and Simon King},
  year = {2024},
  eprint = {2402.01912},
  archivePrefix = {arXiv},
  primaryClass = {cs.SD}
}


Additional Information

Version 1.0.0
Type AI source code
Updated 2024-12-30
Size 3.67MB
Source Github
