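PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

This repository contains PyTorch model definitions, pre-trained weights, and inference/sampling code for our paper exploring weak-to-strong training of Diffusion Transformers for 4K text-to-image generation. You can find more visualizations on our project page.

Junsong Chen*, Chongjian Ge*, Enze Xie*†, Yue Wu*, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li
Huawei Noah's Ark Lab, Dalian University of Technology, The University of Hong Kong, The Hong Kong University of Science and Technology

Contributions are welcome!

Building on the earlier PixArt-α project, we will try to keep this repo as simple as possible so that everyone in the PixArt community can use it.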

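Breaking News

(New) April 24, 2024. Diffusers now supports PixArt-Σ! Congrats! Remember to update your diffusers checkpoint once to make it available.
(New) April 24, 2024. LoRA code is released!
(✅ New) April 23, 2024. PixArt-Σ 2K checkpoint is released!
(✅ New) April 16, 2024. PixArt-Σ online demo is available!
(✅ New) April 16, 2024. PixArt-α-DMD One Step Generator training code is fully released!
(✅ New) April 11, 2024. PixArt-Σ demo and PixArt-Σ pipeline! PixArt-Σ supports diffusers through a patch for a quick trial.
(✅ New) April 10, 2024. PixArt-α-DMD one-step sampler demo code and PixArt-α-DMD 512px checkpoint are released!
(✅ New) April 9, 2024. PixArt-Σ 1024px checkpoint is released!
(✅ New) April 6, 2024. PixArt-Σ 256px and 512px checkpoints are released!
(✅ New) March 29, 2024. PixArt-Σ training and inference code and toy data are released!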

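Contents

-Main

Weak-to-Strong
Training
Inference
Use diffusers
Launch demo
Available models

-Guidance

Feature extraction* (optional)
One-step generation (DMD)
LoRA & DoRA
[LCM: coming soon]
[ControlNet: coming soon]
[ComfyUI: coming soon]
Data reformatting* (optional)

-Others

Acknowledgements
Citation
TODO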

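Comparison with PixArt-α

Model	T5 token length	VAE	2K/4K
PixArt-Σ	300	SDXL	✅
PixArt-α	120	SD1.5

Model	Sample 1	Sample 2	Sample 3
PixArt-Σ
PixArt-α
Prompt	Close-up, gray-haired, bearded man in his 60s, observing passersby, in wool coat and brown beret, glasses, cinematic.	Body shot, a French woman, photography, French streets background, backlight, rim light, Fujifilm.	Photorealistic close-up video of two pirate ships battling each other as they sail inside a cup of coffee.

Prompt details

Sample 1 full prompt: An extreme close-up of a gray-haired man with a beard in his 60s, he is deep in thought pondering the history of the universe as he sits at a cafe in Paris, his eyes focus on people offscreen as they walk as he sits mostly motionless, he is dressed in a wool coat suit coat with a button-down shirt, he wears a **brown beret** and glasses and has a very professorial appearance, and at the end he offers a subtle closed-mouth smile as if he found the answer to the mystery of life, the lighting is very cinematic with the golden light and the Parisian streets and city in the background, depth of field, cinematic 35mm film.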

Dependencies and Installation

Python >= 3.9 (Anaconda or Miniconda recommended)
PyTorch >= 2.0.1+cu11.7

conda create -n pixart python==3.9.0
conda activate pixart
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia

git clone https://github.com/PixArt-alpha/PixArt-sigma.git
cd PixArt-sigma
pip install -r requirements.txt
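
After installation, a quick sanity check can confirm that the interpreter and PyTorch meet the requirements listed above. The following is a minimal sketch, not part of the repository: the function name `check_environment` and the report keys are chosen for illustration, and PyTorch is imported lazily so the check also runs where it is not installed yet.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 9)):
    """Return a small report on whether the interpreter and PyTorch look usable."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": None,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` on a GPU machine, the installed PyTorch build and the CUDA driver are the first things to compare.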

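How to Train

1. PixArt Training

First of all.

We started a new repository to build a more user-friendly and more compatible codebase. The main model structure is the same as PixArt-α, so you can still develop your features on top of the original repository. This repo will also support PixArt-α in the future.

Tip

You can now train the model without extracting features beforehand. We reworked the data structure from the PixArt-α codebase so that everyone can start training, inference, and visualization from the very beginning without any pain.

1.1 Download the toy dataset

Download the toy dataset first. The dataset structure for training is:

cd ./pixart-sigma-toy-dataset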

Dataset Structure
├──InternImgs/  (images are saved here)
│  ├──000000000000.png
│  ├──000000000001.png
│  ├──......
├──InternData/
│  ├──data_info.json    (meta data)
Optional (the feature directories below):
│  ├──img_sdxl_vae_features_1024resolution_ms_new    (SDXL-VAE image features, same name as images except .npy extension)
│  │  ├──000000000000.npy
│  │  ├──000000000001.npy
│  │  ├──......
│  ├──caption_features_new
│  │  ├──000000000000.npz
│  │  ├──000000000001.npz
│  │  ├──......
│  ├──sharegpt4v_caption_features_new    (run tools/extract_caption_feature.py to generate caption T5 features, same name as images except .npz extension)
│  │  ├──000000000000.npz
│  │  ├──000000000001.npz
│  │  ├──......
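
Before training with precomputed features, it can help to confirm that every image has a matching feature file, since the features share the image's file name apart from the extension. The sketch below is an illustration under that naming assumption only; `find_missing_features` is a hypothetical helper, not a script shipped with the repository, and the demo builds a throwaway directory instead of touching real data.

```python
import tempfile
from pathlib import Path


def find_missing_features(dataset_root, feature_dir, feature_ext):
    """List image stems in InternImgs/ with no file in InternData/<feature_dir>/."""
    root = Path(dataset_root)
    image_stems = sorted(p.stem for p in (root / "InternImgs").glob("*.png"))
    features = root / "InternData" / feature_dir
    return [s for s in image_stems if not (features / f"{s}{feature_ext}").is_file()]


# Demo on a temporary directory that mimics the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "InternImgs").mkdir()
    (root / "InternData" / "caption_features_new").mkdir(parents=True)
    for stem in ("000000000000", "000000000001"):
        (root / "InternImgs" / f"{stem}.png").touch()
    # Only the first image gets a caption feature file.
    (root / "InternData" / "caption_features_new" / "000000000000.npz").touch()
    missing = find_missing_features(root, "caption_features_new", ".npz")
    print(missing)  # ['000000000001']
```

For the real dataset, the same call would be pointed at `./pixart-sigma-toy-dataset` with the feature directory and extension of interest.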

1.2 Download the pretrained checkpoints

# SDXL-VAE, T5 checkpoints
git lfs install
git clone https://huggingface.co/PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers output/pretrained_models/pixart_sigma_sdxlvae_T5_diffusers

# PixArt-Sigma checkpoints
python tools/download.py  # optional: set HF_ENDPOINT=https://hf-mirror.com in the environment to use a Hugging Face mirror

1.3 You are ready to train!

Select the config file you need from the config files directory.

python -m torch.distributed.launch --nproc_per_node=1 --master_port=12345 \
          train_scripts/train.py \
          configs/pixart_sigma_config/PixArt_sigma_xl2_img512_internalms.py \
          --load-from output/pretrained_models/PixArt-Sigma-XL-2-512-MS.pth \
          --work-dir output/your_first_pixart-exp \
          --debug
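
When switching between configs, checkpoints, or GPU counts, it may be convenient to assemble the launch command programmatically. The following is a minimal sketch that only rebuilds the command shown above as an argument list; `build_train_command` is a hypothetical helper, and the flags are exactly those that appear in the command, with no others assumed.

```python
import shlex
import sys


def build_train_command(config, load_from, work_dir, nproc_per_node=1,
                        master_port=12345, debug=False):
    """Assemble the training launch command as an argument list."""
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        f"--master_port={master_port}",
        "train_scripts/train.py",
        config,
        "--load-from", load_from,
        "--work-dir", work_dir,
    ]
    if debug:
        cmd.append("--debug")
    return cmd


cmd = build_train_command(
    config="configs/pixart_sigma_config/PixArt_sigma_xl2_img512_internalms.py",
    load_from="output/pretrained_models/PixArt-Sigma-XL-2-512-MS.pth",
    work_dir="output/your_first_pixart-exp",
    debug=True,
)
print(shlex.join(cmd))
# To start training from the repository root: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting issues when paths contain spaces.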

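How to Test

1. Quick start with Gradio

To get started, first install the required dependencies. Make sure you have downloaded the checkpoint files from the models (coming soon) to the output/pretrained_models folder, then run on your local machine: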

# SDXL-VAE, T5 checkpoints
git lfs install
git clone https://huggingface.co/PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers output/pretrained_models/pixart_sigma_sdxlvae_T5_diffusers

# PixArt-Sigma checkpoints
python tools/download.py

# demo launch
python scripts/interface.py --model_path output/pretrained_models/PixArt-Sigma-XL-2-512-MS.pth --image_size 512 --port 11223

2. Integration in diffusers

Important

Upgrade your diffusers to make PixArtSigmaPipeline available!

pip install git+https://github.com/huggingface/diffusers

For diffusers<0.28.0, check this script for help.

import torch
from diffusers import Transformer2DModel, PixArtSigmaPipeline

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weight_dtype = torch.float16

# Load the PixArt-Sigma transformer weights.
transformer = Transformer2DModel.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    subfolder="transformer",
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
# Build the pipeline from the shared SDXL-VAE and T5 components.
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers",
    transformer=transformer,
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
pipe.to(device)

# Enable memory optimizations.
# pipe.enable_model_cpu_offload()

prompt = "A small cactus with a happy face in the Sahara desert."
image = pipe(prompt).images[0]
image.save("./cactus.png")

3. PixArt Demo

pip install git+https://github.com/huggingface/diffusers

# PixArt-Sigma 1024px
DEMO_PORT=12345 python app/app_pixart_sigma.py

# PixArt-Sigma One step Sampler(DMD)
DEMO_PORT=12345 python app/app_pixart_dmd.py

Then try a simple example at http://your-server-ip:12345.

4. Convert a .pth checkpoint into the diffusers version

Download directly from Hugging Face

or run:

pip install git+https://github.com/huggingface/diffusers

python tools/convert_pixart_to_diffusers.py --orig_ckpt_path output/pretrained_models/PixArt-Sigma-XL-2-1024-MS.pth --dump_path output/pretrained_models/PixArt-Sigma-XL-2-1024-MS --only_transformer=True --image_size=1024 --version sigma

⏬ Available Models

All models will be downloaded automatically here. You can also download them manually from this URL.

Model	#Params	Checkpoint path	Download in OpenXLab
T5 & SDXL-VAE	4.5B	Diffusers: pixart_sigma_sdxlvae_T5_diffusers	coming soon
PixArt-Σ-256	0.6B	pth: PixArt-Sigma-XL-2-256x256.pth Diffusers: PixArt-Sigma-XL-2-256x256	coming soon
PixArt-Σ-512	0.6B	pth: PixArt-Sigma-XL-2-512-MS.pth Diffusers: PixArt-Sigma-XL-2-512-MS	coming soon
PixArt-α-512-DMD	0.6B	Diffusers: PixArt-Alpha-DMD-XL-2-512x512	coming soon
PixArt-Σ-1024	0.6B	pth: PixArt-Sigma-XL-2-1024-MS.pth Diffusers: PixArt-Sigma-XL-2-1024-MS	coming soon
PixArt-Σ-2K	0.6B	pth: PixArt-Sigma-XL-2-2K-MS.pth Diffusers: PixArt-Sigma-XL-2-2K-MS	coming soon
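
The training and demo commands in this README expect the .pth files under output/pretrained_models. As a small sketch of how a script might resolve those paths, the lookup below copies the file names from the table above; the short labels such as `sigma-512` and the helper `checkpoint_path` are invented for this example and are not part of the repository.

```python
from pathlib import Path

# File names copied from the table above; the keys are labels chosen for this sketch.
PTH_CHECKPOINTS = {
    "sigma-256": "PixArt-Sigma-XL-2-256x256.pth",
    "sigma-512": "PixArt-Sigma-XL-2-512-MS.pth",
    "sigma-1024": "PixArt-Sigma-XL-2-1024-MS.pth",
    "sigma-2k": "PixArt-Sigma-XL-2-2K-MS.pth",
}


def checkpoint_path(label, root="output/pretrained_models"):
    """Return the expected local path of a .pth checkpoint."""
    try:
        return Path(root) / PTH_CHECKPOINTS[label]
    except KeyError:
        raise ValueError(f"unknown checkpoint label: {label!r}") from None


path = checkpoint_path("sigma-512")
print(path, "(found)" if path.is_file() else "(not downloaded yet)")
```

The resulting path can be passed to `--load-from` for training or `--model_path` for the Gradio demo.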

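TODO List

We will try our best to release

Acknowledgements

Thanks to PixArt-α, DiT, and OpenDMD for their wonderful work and codebases!
Thanks to Diffusers for their excellent technical support and collaboration!
Thanks to Hugging Face for sponsoring the demo!

BibTeX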

@misc{chen2024pixartsigma,
  title={PixArt-Sigma: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation},
  author={Junsong Chen and Chongjian Ge and Enze Xie and Yue Wu and Lewei Yao and Xiaozhe Ren and Zhongdao Wang and Ping Luo and Huchuan Lu and Zhenguo Li},
  year={2024},
  eprint={2403.04692},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
