lumiere pytorch

Version 0.0.24

Lumiere - Pytorch

Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch

Yannic's paper review

Since the paper is mostly a few key ideas layered on top of a text-to-image model, this repository takes it a step further and extends the new Karras U-net to video.
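One common way to reuse a 2D image layer on video is to fold the frame axis into the batch axis, run the layer, and unfold again. The sketch below illustrates that general idea only; it is an assumption for illustration, not this repository's actual mechanism, and `apply_per_frame` is a hypothetical helper, not part of `lumiere_pytorch`.

```python
import torch
from torch import nn

def apply_per_frame(image_module: nn.Module, video: torch.Tensor) -> torch.Tensor:
    """Apply a 2D image module independently to every frame of a video.

    `video` is assumed to be laid out as (batch, channels, frames, height, width).
    """
    b, c, f, h, w = video.shape
    # fold frames into the batch axis: (b, c, f, h, w) -> (b * f, c, h, w)
    frames = video.permute(0, 2, 1, 3, 4).reshape(b * f, c, h, w)
    out = image_module(frames)
    # unfold back to video layout: (b * f, c', h', w') -> (b, c', f, h', w')
    _, c_out, h_out, w_out = out.shape
    return out.reshape(b, f, c_out, h_out, w_out).permute(0, 2, 1, 3, 4)

# a plain 2D convolution stands in for a pretrained image layer
conv = nn.Conv2d(3, 16, kernel_size = 3, padding = 1)
video = torch.randn(2, 3, 8, 32, 32)
features = apply_per_frame(conv, video)
```

Handled this way, frames never exchange information, which is why a video model also needs layers that mix along the frame axis.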

Appreciation

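A16Z Open Source AI Grant Program and Hugging Face for the generous sponsorships, as well as my other sponsors, for affording me the independence to open source current artificial intelligence research.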

Install

$ pip install lumiere-pytorch

Usage

import torch
from lumiere_pytorch import MPLumiere

from denoising_diffusion_pytorch import KarrasUnet

# base image U-net from denoising-diffusion-pytorch
karras_unet = KarrasUnet(
    image_size = 256,
    dim = 8,
    channels = 3,
    dim_max = 768,
)

# wrap the image U-net for video; the *_module_names lists name submodules of karras_unet
lumiere = MPLumiere(
    karras_unet,
    image_size = 256,
    unet_time_kwarg = 'time',
    conv_module_names = [
        'downs.1',
        'ups.1',
        'downs.2',
        'ups.2',
    ],
    attn_module_names = [
        'mids.0'
    ],
    upsample_module_names = [
        'ups.2',
        'ups.1',
    ],
    downsample_module_names = [
        'downs.1',
        'downs.2'
    ]
)

# mock noised video: (batch, channels, frames, height, width)
noised_video = torch.randn(2, 3, 8, 256, 256)
# one diffusion time value per batch element
time = torch.ones(2,)

denoised_video = lumiere(noised_video, time = time)

# the wrapped model returns a video of the same shape as its input
assert noised_video.shape == denoised_video.shape
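The strings passed above, such as 'downs.1' and 'mids.0', look like dotted submodule paths of the kind PyTorch reports through `named_modules()`. As a minimal sketch under that assumption, the snippet below lists the available paths on a small stand-in network and checks that a set of wanted names exists before they are handed to a wrapper. `ToyUnet` is hypothetical and does not reflect the real layout of `KarrasUnet`.

```python
import torch
from torch import nn

class ToyUnet(nn.Module):
    """Hypothetical stand-in with downs / mids / ups lists; not the real KarrasUnet."""
    def __init__(self):
        super().__init__()
        self.downs = nn.ModuleList([nn.Conv2d(3, 8, 3, padding = 1), nn.Conv2d(8, 16, 3, padding = 1)])
        self.mids = nn.ModuleList([nn.Conv2d(16, 16, 3, padding = 1)])
        self.ups = nn.ModuleList([nn.Conv2d(16, 8, 3, padding = 1), nn.Conv2d(8, 3, 3, padding = 1)])

toy = ToyUnet()

# every addressable submodule path, skipping the root module (whose name is '')
available = [name for name, _ in toy.named_modules() if name]

# names we would like to pass to a wrapper; report any that do not exist
wanted = ['downs.1', 'mids.0', 'ups.1']
missing = [name for name in wanted if name not in available]

# get_submodule resolves a dotted path to the module object itself
target = toy.get_submodule('downs.1')
```

Running the same listing on an actual U-net instance shows which names are valid for that model.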

Todo

Citations

@inproceedings{BarTal2024LumiereAS,
    title   = {Lumiere: A Space-Time Diffusion Model for Video Generation},
    author  = {Omer Bar-Tal and Hila Chefer and Omer Tov and Charles Herrmann and Roni Paiss and Shiran Zada and Ariel Ephrat and Junhwa Hur and Yuanzhen Li and Tomer Michaeli and Oliver Wang and Deqing Sun and Tali Dekel and Inbar Mosseri},
    year    = {2024},
    url     = {https://api.semanticscholar.org/CorpusID:267095113}
}

@article{Karras2023AnalyzingAI,
    title   = {Analyzing and Improving the Training Dynamics of Diffusion Models},
    author  = {Tero Karras and Miika Aittala and Jaakko Lehtinen and Janne Hellsten and Timo Aila and Samuli Laine},
    journal = {ArXiv},
    year    = {2023},
    volume  = {abs/2312.02696},
    url     = {https://api.semanticscholar.org/CorpusID:265659032}
}