MagViT2 - Pytorch
Implementation of MagViT2 from Language Model Beats Diffusion - Tokenizer is Key to Visual Generation in Pytorch. This currently holds SOTA for video generation / understanding.
The Lookup Free Quantizer proposed in the paper can be found in a separate repository. It should probably be explored for all other modalities, starting with audio.
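For intuition, here is a toy sketch of the lookup-free quantization idea (an illustration only, not the separate repository's API): each latent channel is quantized independently to {-1, 1}, and the token id is simply the binary number formed by the signs, so a 10-dimensional latent gives a 2^10 = 1024 entry "codebook" without any embedding lookup. The helper name toy_lfq is hypothetical.

import torch

def toy_lfq(latents):                               # latents: (batch, dim)
    bits = (latents > 0).long()                     # per-channel sign -> {0, 1}
    quantized = bits.float() * 2 - 1                # quantized values in {-1, 1} (no straight-through estimator here)
    powers = 2 ** torch.arange(latents.shape[-1])   # 1, 2, 4, ... for interpreting the bits as a binary code
    indices = (bits * powers).sum(dim = -1)         # integer token ids
    return quantized, indices

quantized, token_ids = toy_lfq(torch.randn(4, 10))  # token_ids lie in [0, 1024)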
Please join in if you are interested in replicating the tokenizer proposed in this paper out in the open.
Update: Tencent has used the code in this repository and open sourced a working model.
- StabilityAI and 🤗 Huggingface for the generous sponsorship, as well as my other sponsors, for affording me the independence to open source artificial intelligence
- Louis Serrano for sharing some early initial runs, validating that the overall architecture converges with finite scalar quantization
- You? If you are a talented research engineer / scientist, feel free to contribute to cutting edge open source science!
$ pip install magvit2-pytorch
import torch

from magvit2_pytorch import (
    VideoTokenizer,
    VideoTokenizerTrainer
)

tokenizer = VideoTokenizer(
    image_size = 128,
    init_dim = 64,
    max_dim = 512,
    codebook_size = 1024,
    layers = (
        'residual',
        'compress_space',
        ('consecutive_residual', 2),
        'compress_space',
        ('consecutive_residual', 2),
        'linear_attend_space',
        'compress_space',
        ('consecutive_residual', 2),
        'attend_space',
        'compress_time',
        ('consecutive_residual', 2),
        'compress_time',
        ('consecutive_residual', 2),
        'attend_time',
    )
)

trainer = VideoTokenizerTrainer(
    tokenizer,
    dataset_folder = '/path/to/a/lot/of/media', # folder of either videos or images, depending on setting below
    dataset_type = 'videos',                    # 'videos' or 'images', prior papers have shown pretraining on images to be effective for video synthesis
    batch_size = 4,
    grad_accum_every = 8,
    learning_rate = 2e-5,
    num_train_steps = 1_000_000
)

trainer.train()

# after a lot of training ...

# can use the EMA of the tokenizer

ema_tokenizer = trainer.ema_tokenizer

# mock video

video = torch.randn(1, 3, 17, 128, 128)

# tokenizing video to discrete codes

codes = ema_tokenizer.tokenize(video) # (1, 9, 16, 16) <- in this example, time downsampled by 4x and space downsampled by 8x. flatten token ids for (non)-autoregressive training (see the sketch after this snippet)

# sanity check

decoded_video = ema_tokenizer.decode_from_code_indices(codes)

assert torch.allclose(
    decoded_video,
    ema_tokenizer(video, return_recon = True)
)
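The comment on the tokenize call above mentions flattening the token ids for (non)-autoregressive training. A minimal sketch of that flattening, assuming the (1, 9, 16, 16) code shape from the example (the flatten / reshape calls below are plain torch tensor ops, not part of the tokenizer's API):

import torch

codes = torch.randint(0, 1024, (1, 9, 16, 16))  # stand-in for ema_tokenizer.tokenize(video)

token_ids = codes.flatten(1)                    # (1, 2304) sequence for a (non)-autoregressive transformer
restored  = token_ids.reshape(1, 9, 16, 16)     # reshape back to the code grid before decoding

assert torch.equal(codes, restored)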
To track your experiments with Weights & Biases, set use_wandb_tracking = True on the VideoTokenizerTrainer, and then use the .trackers context manager.
trainer = VideoTokenizerTrainer(
    use_wandb_tracking = True,
    ...
)

with trainer.trackers(project_name = 'magvit2', run_name = 'baseline'):
    trainer.train()
MagViT2 Tokenizer:
- decode_from_codebook_indices should be able to accept flattened ids, reshape them to the correct feature map dimensions, and decode back to video
- improvise an RQ video transformer, since residual LFQ now actually makes sense

MaskGit
@misc{yu2023language,
    title   = {Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation},
    author  = {Lijun Yu and José Lezama and Nitesh B. Gundavarapu and Luca Versari and Kihyuk Sohn and David Minnen and Yong Cheng and Agrim Gupta and Xiuye Gu and Alexander G. Hauptmann and Boqing Gong and Ming-Hsuan Yang and Irfan Essa and David A. Ross and Lu Jiang},
    year    = {2023},
    eprint  = {2310.05737},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

@inproceedings{dao2022flashattention,
    title   = {Flash{A}ttention: Fast and Memory-Efficient Exact Attention with {IO}-Awareness},
    author  = {Dao, Tri and Fu, Daniel Y. and Ermon, Stefano and Rudra, Atri and R{\'e}, Christopher},
    booktitle = {Advances in Neural Information Processing Systems},
    year    = {2022}
}

@article{Zhang2021TokenST,
    title   = {Token Shift Transformer for Video Classification},
    author  = {Hao Zhang and Y. Hao and Chong-Wah Ngo},
    journal = {Proceedings of the 29th ACM International Conference on Multimedia},
    year    = {2021}
}

@inproceedings{Arora2023ZoologyMA,
    title   = {Zoology: Measuring and Improving Recall in Efficient Language Models},
    author  = {Simran Arora and Sabri Eyuboglu and Aman Timalsina and Isys Johnson and Michael Poli and James Zou and Atri Rudra and Christopher R{\'e}},
    year    = {2023},
    url     = {https://api.semanticscholar.org/CorpusID:266149332}
}