dalle flow下載 - dalle flow源代碼下載

dall·e流：用於從文本創建高清圖像的人類在循環工作流程
人類在線^？從文本創建高清圖像的工作流程

dall·e流是一種交互式工作流程，用於從文本提示中生成高清圖像。首先，它利用dall·e-mega，Glid-3 XL和穩定的擴散來生成像徵候選者，然後調用剪輯 - 服務以對候選者的提示進行排名。首選的候選者被餵入GLID-3 XL進行擴散，這通常會豐富質地和背景。最後，候選人通過Swinir將候選人提升到1024x1024。

DALL·E流是用Jina在客戶端服務器體系結構中構建的，這使其具有很高的可擴展性，非阻滯流和現代的Pythonic界面。客戶端可以通過TLS通過GRPC/WebSocket/HTTP與服務器進行交互。

為什麼要在循環中？生成藝術是一個創造過程。儘管達爾·e釋放了人們的創造力的最新進展，但擁有單個單獨的輸出UX/UI將想像力鎖定在單一可能性上，無論這個單一的結果多麼好，這都是不好的。通過將生成藝術形式化為迭代程序，dall·e流是單線的替代方法。

用法

dall·e流在客戶端服務器體系結構中。

客戶用法
服務器使用情況，即部署自己的服務器

更新

？ 2022/10/27 RealEsrgan Upscaler已被添加。
配x 2022/10/26使用clip-as-Service在grpcs://api.clip.jina.ai:2096 （需要jina >= v3.11.0 ），您需要首先從這裡獲得訪問權限。有關更多詳細信息，請參閱使用剪貼即服務。
？從提示符中添加了2022/9/25基於自動夾的分割。
？已添加了2022/8/17到圖像的文本，以進行穩定擴散。為了使用它，您需要同意他們的TOS，下載權重，然後在Docker或flow_parser.py中啟用標誌。
配x 2022/8/8開始使用clip-as-Service作為外部執行者。現在，您可以根據需要輕鬆部署自己的剪輯執行程序。由於這種改進而發生了一個很小的破壞變化，因此請重新打開Google Colab中的筆記本。
配x 2022/7/6演示服務器遷移到AWS EKS，以獲得更好的可用性和魯棒性，服務器URL現在更改為grpcs://dalle-flow.dev.jina.ai 。現在，所有連接都使用TLS加密，請重新打開Google Colab中的筆記本。
配x 2022/6/25由於GPU配額不超出6/25 0:00-12：00之間的意外停機時間。新服務器現在有2個GPU，在客戶端筆記本中添加HealthCheck。
2022/6/3將圖像的默認數量減少到每個途徑2，4用於擴散。
？ 2022/6/21現在可以在Docker Hub上獲得預製圖像！該圖像可以在CUDA 11.6上開箱即用。修復夾子服務中的上游錯誤。
配x 2022/5/23修復了夾子服務中的上游錯誤。此錯誤使得第二個擴散步驟與給定文本無關。事實證明，新的Dockerfile在AWS EC2 p2.x8large實例上可以再現。
2022/5/13B刪除TLS作為CloudFlare提供100次超時，使Dalle流動在可用的情況下，請重新打開Google Colab！的筆記本！
？ 2022/5/13新的大型檢查站！現在，所有連接都與TLS一起，請重新打開Google Colab！的筆記本！
？ 2022/5/10添加了Dockerfile！現在，您可以輕鬆地部署自己的dall·e流。新的大型檢查站！較小的內存腳印，整個流程現在可以使用21GB內存的一個GPU 。
？ 2022/5/7新的MEGA檢查點和GLID3上的多重優化：較少的內存 - 腳印，使用clip-as-Service的ViT-L/14@336px ， steps 100->200 。
？ 2022/5/6 dall·e Flow剛剛更新了！請重新打開Google Colab中的筆記本！
- 修訂了第一步：產生了16個候選人，從達爾·埃加（Dall·E Mega）產生了8個，8個候選人，來自Glid3-XL；然後以剪貼畫為服務的排名。
- 提高了流動效率：總體速度，包括擴散和升級現在要快得多！

畫廊

a realistic photo of a muddy dog A scientist comparing apples and oranges, by Norman Rockwell an oil painting portrait of the regal Burger King posing with a Whopper Eternal clock powered by a human cranium, artstation another planet amazing landscape The Decline and Fall of the Roman Empire board game kickstarter A raccoon astronaut with the cosmos reflecting on the glass of his helmet dreaming of the stars, digital art A photograph of an apple that is a disco ball, 85 mm lens, studio lighting a cubism painting Donald trump happy cyberpunk oil painting of a hamster drinking tea outside Colossus of Rhodes by Max Ernst landscape with great castle in middle of forest an medieval oil painting of Kanye west feels satisfied while playing chess in the style of Expressionism An oil pastel painting of an annoyed cat in a spaceship dinosaurs at the brink of a nuclear disaster fantasy landscape with medieval city GPU chip in the form of an avocado, digital art a giant rubber duck in the ocean Paddington bear as austrian emperor in antique black & white photography a rainy night with a superhero perched above a city, in the style of a comic book A synthwave style sunset above the reflecting water of the sea, digital art an oil painting of ocean beach front in the style of Titian an oil painting of Klingon general in the style of Rubens city, top view, cyberpunk, digital realistic art an oil painting of a medieval cyborg automaton made of magic parts and old steampunk mechanics a watercolour painting of a top view of a pirate ship sailing on the clouds a knight made of beautiful flowers and fruits by Rachel ruysch in the style of Syd brak a 3D render of a rainbow colored hot air balloon flying above a reflective lake a teddy bear on a skateboard in Times Square cozy bedroom at night an oil painting of monkey using computer the diagram of a search machine invented by Leonardo da Vinci A stained glass window of toucans in outer space a campfire in the woods at night with the milky-way galaxy in the sky Bionic killer robot made of AI scarab beetles The Hanging Gardens of Babylon in the middle of a city, in the style of Dalí painting oil of Izhevsk a hyper realistic photo of a marshmallow office chair fantasy landscape with city ocean beach front view in Van Gogh style An oil painting of a family reunited inside of an airport, digital art antique photo of a knight riding a T-Rex a top view of a pirate ship sailing on the clouds an oil painting of a humanoid robot playing chess in the style of Matisse a cubism painting of a cat dressed as French emperor Napoleon a husky dog wearing a hat with sunglasses A mystical castle appears between the clouds in the style of Vincent di Fate golden gucci airpods realistic photo

客戶

使用客戶非常容易。以下步驟最好在Jupyter Notebook或Google Colab中運行。

您需要先安裝Docarray和Jina：

pip install " docarray[common]>=0.13.5 " jina

我們為您提供了一台演示服務器：

配x由於大量要求，我們的服務器可能會延遲響應。但是，我們對保持正常運行時間的高度有信心。您還可以在此處遵循指令來部署自己的服務器。

 server_url = 'grpcs://dalle-flow.dev.jina.ai'

步驟1：通過dall生成·e Mega

現在，讓我們定義提示：

 prompt = 'an oil painting of a humanoid robot playing chess in the style of Matisse'

讓我們將其提交給服務器並可視化結果：

 from docarray import Document

doc = Document ( text = prompt ). post ( server_url , parameters = { 'num_images' : 8 })
da = doc . matches

da . plot_image_sprites ( fig_size = ( 10 , 10 ), show_index = True )

在這裡，我們生成了24個候選者，從達勒 - 梅加（Dalle-Mega）產生了8個候選者，8個來自GLID3 XL的候選者，以及穩定擴散的8個候選者，這是在num_images中定義的，大約需要約2分鐘。如果對您來說太長，則可以使用較小的值。

步驟2：通過GLID3 XL選擇和完善

24個候選人用剪貼畫和服務對象進行排序，索引0是由Clip評判的最佳候選人。當然，您可能會有所不同。注意左上角的數字嗎？選擇您最喜歡的一個並獲得更好的視圖：

 fav_id = 3
fav = da [ fav_id ]
fav . embedding = doc . embedding
fav . display ()

現在，讓我們將選定的候選物提交到服務器以進行擴散。

 diffused = fav . post ( f' { server_url } ' , parameters = { 'skip_rate' : 0.5 , 'num_images' : 36 }, target_executor = 'diffusion' ). matches

diffused . plot_image_sprites ( fig_size = ( 10 , 10 ), show_index = True )

這將根據所選圖像提供36張圖像。您可以通過給skip_rate一個接近零值或接近一個值來迫使其接近給定圖像來使模型更具即興創作。整個過程大約需要約2分鐘。

步驟3：通過Swinir選擇和高檔

選擇您最喜歡的圖像，然後仔細觀察：

 dfav_id = 34
fav = diffused [ dfav_id ]
fav . display ()

最後，提交到最後一步的服務器：將其提升到1024 x 1024px。

 fav = fav . post ( f' { server_url } /upscale' )
fav . display ()

就是這樣！這是一個。如果不滿意，請重複該程序。

順便說一句，docarray是一個強大且易於使用的數據結構，用於非結構化數據。對於在跨/多模式域工作的數據科學家來說，它是超級富有成效的。要了解有關docarray的更多信息，請查看文檔。

伺服器

您可以按照以下說明託管自己的服務器。

硬件要求

dall·e流在其峰值處需要一個帶有21GB VRAM的GPU。所有服務都被擠進了這個GPU，其中包括（大致）

達勒〜9GB
GLID擴散〜6GB
穩定擴散〜8GB（batch_size = 4 in config.yml ，512x512）
Swinir〜3GB
剪輯VIT-L/14-336PX〜3GB

以下合理的技巧可用於進一步降低VRAM：

Swinir可以移至CPU（-3GB）
剪輯可以委派給夾子as-service免費服務器（-3GB）

它需要在硬盤驅動器上至少有50GB的空間，主要用於下載驗證的型號。

需要高速互聯網。下載模型時，慢速/不穩定的互聯網可能會引起令人沮喪的超時。

僅測試CPU的環境，可能無法正常工作。 Google Colab可能會投擲OOM，因此也無法正常工作。

服務器架構

如果您已經安裝了Jina，則可以通過以下方式生成以上流程圖：

 # pip install jina
jina export flowchart flow.yml flow.svg

穩定的擴散權重

如果您想使用穩定的擴散，則首先需要在網站上註冊一個帳戶，並同意模型的條款和條件。登錄後，您可以找到通往此處所需的模型的版本：

compvis / sd-v1-5 inpainting.ckpt

在下載“權重”部分下，單擊sd-v1-x.ckpt的鏈接。寫作時的最新權重為sd-v1-5.ckpt 。

DOCKER用戶：將此文件放入名為ldm/stable-diffusion-v1的文件夾中，並重命名IT model.ckpt 。仔細按照以下說明進行操作，因為默認情況下未啟用SD。

本地用戶：將此文件放入dalle/stable-diffusion/models/ldm/stable-diffusion-v1/model.ckpt中，完成“本地運行”下的其餘步驟後。仔細按照以下說明進行操作，因為默認情況下未啟用SD。

在Docker中運行

預製圖像

我們提供了可以直接拉動的預製碼頭圖像。

docker pull jinaai/dalle-flow:latest

自己構建

我們提供了一個Dockerfile，該碼頭使您可以將服務器從框中運行。

我們的Dockerfile將CUDA 11.6用作基本圖像，您可能需要根據系統進行調整。

git clone https://github.com/jina-ai/dalle-flow.git
cd dalle-flow

docker build --build-arg GROUP_ID= $( id -g ${USER} ) --build-arg USER_ID= $( id -u ${USER} ) -t jinaai/dalle-flow .

該建築將需要10分鐘的平均互聯網速度，這將導致18GB Docker的圖像。

運行容器

要運行它，只需做：

docker run -p 51005:51005 
  -it 
  -v $HOME /.cache:/home/dalle/.cache 
  --gpus all 
  jinaai/dalle-flow

另外，您也可以使用一些啟用或禁用的工作流程運行，以防止內存外崩潰。為此，通過以下環境變量之一：

 DISABLE_DALLE_MEGA
DISABLE_GLID3XL
DISABLE_SWINIR
ENABLE_STABLE_DIFFUSION
ENABLE_CLIPSEG
ENABLE_REALESRGAN

例如，如果您想禁用GLID3XL工作流，請運行：

docker run -e DISABLE_GLID3XL= ' 1 ' 
  -p 51005:51005 
  -it 
  -v $HOME /.cache:/home/dalle/.cache 
  --gpus all 
  jinaai/dalle-flow

首次運行將需要大約10分鐘的時間，而平均互聯網速度將需要大約10分鐘。
-v $HOME/.cache:/root/.cache避免在每個Docker運行中下載重複的模型。
-p 51005:51005的第一部分是您的主機公共端口。如果您公開服務，請確保人們可以訪問此端口。其第二個標準桿是Flow.yml中定義的端口。
如果要使用穩定的擴散，則必須使用ENABLE_STABLE_DIFFUSION手動啟用它。
如果要使用Clipseg，則必須使用ENABLE_CLIPSEG手動啟用它。
如果要使用RealEsrgan，則必須使用ENABLE_REALESRGAN手動啟用它。

穩定擴散和Docker的特殊說明

穩定的擴散只有在您下載了權重並使它們作為虛擬卷中提供時，才能啟用穩定的擴散，同時為SD啟用環境標誌（ ENABLE_STABLE_DIFFUSION ） 。

您應該以前將權重放入名為ldm/stable-diffusion-v1的文件夾中，並將其標記為model.ckpt 。在下面的YOUR_MODEL_PATH/ldm上替換您的系統中的路徑，以將權重輸送到Docker映像中。

docker run -e ENABLE_STABLE_DIFFUSION= " 1 " 
  -e DISABLE_DALLE_MEGA= " 1 " 
  -e DISABLE_GLID3XL= " 1 " 
  -p 51005:51005 
  -it 
  -v YOUR_MODEL_PATH/ldm:/dalle/stable-diffusion/models/ldm/ 
  -v $HOME /.cache:/home/dalle/.cache 
  --gpus all 
  jinaai/dalle-flow

您應該在運行後像以下內容一樣看到屏幕：

請注意，與本地運行不同，在Docker內部運行可能會提供更少的生動進度鍵，顏色日誌和打印。這是由於碼頭容器中終端的局限性。它不會影響實際用法。

本地運行

本地運行需要一些手動步驟，但通常更容易調試。

克隆回購

mkdir dalle && cd dalle
git clone https://github.com/jina-ai/dalle-flow.git
git clone https://github.com/jina-ai/SwinIR.git
git clone --branch v0.0.15 https://github.com/AmericanPresidentJimmyCarter/stable-diffusion.git
git clone https://github.com/CompVis/latent-diffusion.git
git clone https://github.com/jina-ai/glid-3-xl.git
git clone https://github.com/timojl/clipseg.git

您應該具有以下文件夾結構：

 dalle/
 |
 |-- Real-ESRGAN/
 |-- SwinIR/
 |-- clipseg/
 |-- dalle-flow/
 |-- glid-3-xl/
 |-- latent-diffusion/
 |-- stable-diffusion/

安裝輔助存儲庫

 cd dalle-flow
python3 -m virtualenv env
source env/bin/activate && cd -
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
pip install numpy tqdm pytorch_lightning einops numpy omegaconf
pip install https://github.com/crowsonkb/k-diffusion/archive/master.zip
pip install git+https://github.com/AmericanPresidentJimmyCarter/[email protected]
pip install basicsr facexlib gfpgan
pip install realesrgan
pip install https://github.com/AmericanPresidentJimmyCarter/xformers-builds/raw/master/cu116/xformers-0.0.14.dev0-cp310-cp310-linux_x86_64.whl && 
cd latent-diffusion && pip install -e . && cd -
cd stable-diffusion && pip install -e . && cd -
cd SwinIR && pip install -e . && cd -
cd glid-3-xl && pip install -e . && cd -
cd clipseg && pip install -e . && cd -

如果您使用的話，我們需要下載幾個型號：GLID-3-XL：

 cd glid-3-xl
wget https://dall-3.com/models/glid-3-xl/bert.pt
wget https://dall-3.com/models/glid-3-xl/kl-f8.pt
wget https://dall-3.com/models/glid-3-xl/finetune.pt
cd -

clipseg和RealESRGAN都要求您設置正確的緩存文件夾路徑，通常是$ HOME/。

安裝流程

 cd dalle-flow
pip install -r requirements.txt
pip install jax~=0.3.24

啟動服務器

現在，您在dalle-flow/下，運行以下命令：

 # Optionally disable some generative models with the following flags when
# using flow_parser.py:
# --disable-dalle-mega
# --disable-glid3xl
# --disable-swinir
# --enable-stable-diffusion
python flow_parser.py
jina flow --uses flow.tmp.yml

您應該立即看到此屏幕：

首先，下載DALL·E Mega型號和其他必要型號將需要〜8分鐘。程序運行只需大約1分鐘即可傳達成功消息。

當一切準備就緒時，您會看到：

恭喜！現在，您應該能夠運行客戶端。

您可以根據需要修改和擴展服務器流量，例如更改模型，添加持久性，甚至自動啟動到Instagram/opensea。使用Jina和Docarray，您可以輕鬆地製作dall·e Flow Cloud-native並準備生產。

使用剪貼即服務

為了減少VRAM的使用情況，您可以將CLIP-as-service用作外部執行程序，可在grpcs://api.clip.jina.ai:2096 。
首先，確保您從控制台網站或CLI創建了一個訪問令牌，如下

jina auth token create < name of PAT > -e < expiration days >

然後，您需要從flow.yml更改執行者相關的配置（ host ， port ， external ， tls和grpc_metadata ）。

...
  - name : clip_encoder
    uses : jinahub+docker://CLIPTorchEncoder/latest-gpu
    host : ' api.clip.jina.ai '
    port : 2096
    tls : true
    external : true
    grpc_metadata :
      authorization : " <your access token> "
    needs : [gateway]
...
  - name : rerank
    uses : jinahub+docker://CLIPTorchEncoder/latest-gpu
    host : ' api.clip.jina.ai '
    port : 2096
    uses_requests :
      ' / ' : rank
    tls : true
    external : true
    grpc_metadata :
      authorization : " <your access token> "
    needs : [dalle, diffusion]

您也可以使用flow_parser.py自動生成並使用CLIP-as-service作為外部執行程序來生成和運行流程：

python flow_parser.py --cas-token " <your access token>'
jina flow --uses flow.tmp.yml

配x grpc_metadata僅在Jina v3.11.0之後可用。如果您使用的是舊版本，請升級到最新版本。

現在，您可以在流中使用免費的CLIP-as-service 。

支持

為了擴展dall·e流，您需要熟悉Jina和Docarray。
加入我們的Discord社區，與其他社區成員聊天。
加入我們的工程所有人聚會，討論您的用例並學習Jina的新功能。
- 什麼時候？每個月的第二個星期二
- 在哪裡？ Zoom（請參閱我們的公共事件日曆/.ical）並在YouTube上進行直播
在我們的YouTube頻道上訂閱最新視頻教程