Video Genrator text to video

Version 1.0.0

Generate video from text

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Setup

Requirements

pip install -r requirements.txt

Weights

[Stable Diffusion] Stable Diffusion is a latent text-to-image diffusion model that can generate photo-realistic images from any text input. Pretrained Stable Diffusion models can be downloaded from Hugging Face (e.g., Stable Diffusion v1-4, v2-1). You can also use fine-tuned Stable Diffusion models trained on different styles (e.g., Modern Disney, Redshift).
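One way to fetch the pretrained weights into the local folder that the inference code in this README expects (`./checkpoints/stable-diffusion-v1-4`) is sketched below. It assumes the `huggingface_hub` package is installed and that the v1-4 weights are published under the repo id `CompVis/stable-diffusion-v1-4`. The helper names `checkpoint_dir` and `download_weights` are illustrative and not part of Tune-A-Video.

```python
from pathlib import Path


def checkpoint_dir(repo_id: str, root: str = "./checkpoints") -> Path:
    """Map a Hub repo id such as 'CompVis/stable-diffusion-v1-4' to a local folder."""
    # Keep only the model name so the result matches ./checkpoints/stable-diffusion-v1-4
    return Path(root) / repo_id.split("/")[-1]


def download_weights(repo_id: str, root: str = "./checkpoints") -> Path:
    """Download a full model snapshot from the Hugging Face Hub (several GB)."""
    # Imported lazily so checkpoint_dir works even without huggingface_hub installed
    from huggingface_hub import snapshot_download

    target = checkpoint_dir(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_weights("CompVis/stable-diffusion-v1-4")` would then populate the folder; the repo id is an assumption, so substitute the id of whichever base model you actually use.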

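[DreamBooth] DreamBooth is a method for personalizing text-to-image models such as Stable Diffusion using only a few images (3 to 5) of a subject. Tuning a video on a DreamBooth model enables personalized text-to-video generation for a specific subject. Some public DreamBooth models are available on Hugging Face (e.g., mr-potato-head). You can also train your own DreamBooth model by following this training example.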

Usage

Training

To fine-tune a text-to-image diffusion model for text-to-video generation, run this command:

accelerate launch train_tuneavideo.py --config="configs/man-skiing.yaml"
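If you want to launch training from a script, for example to run several configs back to back, the command above can be wrapped as in the following sketch. It assumes `accelerate` is on your PATH and that you run it from the repository root. `build_train_command` and `run_training` are hypothetical helper names, not part of the repository.

```python
import shlex
import subprocess
from typing import List


def build_train_command(config_path: str) -> List[str]:
    """Return the argument list for the training command shown above."""
    return ["accelerate", "launch", "train_tuneavideo.py", f"--config={config_path}"]


def run_training(config_paths: List[str]) -> None:
    """Run one fine-tuning job per config file, stopping at the first failure."""
    for path in config_paths:
        cmd = build_train_command(path)
        print("Running:", shlex.join(cmd))
        # check=True raises CalledProcessError if training exits with an error
        subprocess.run(cmd, check=True)
```

For example, `run_training(["configs/man-skiing.yaml"])` runs the same job as the command above. Passing the arguments as a list avoids shell quoting problems with config paths.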

Inference

Once training is done, run inference:

from tuneavideo.pipelines.pipeline_tuneavideo import TuneAVideoPipeline
from tuneavideo.models.unet import UNet3DConditionModel
from tuneavideo.util import save_videos_grid
import torch

# Base text-to-image weights and the output folder written by training
pretrained_model_path = "./checkpoints/stable-diffusion-v1-4"
my_model_path = "./outputs/man-skiing"

# Load the fine-tuned 3D UNet and plug it into the pipeline in half precision
unet = UNet3DConditionModel.from_pretrained(my_model_path, subfolder='unet', torch_dtype=torch.float16).to('cuda')
pipe = TuneAVideoPipeline.from_pretrained(pretrained_model_path, unet=unet, torch_dtype=torch.float16).to("cuda")
pipe.enable_xformers_memory_efficient_attention()
pipe.enable_vae_slicing()

# Start sampling from the DDIM-inverted latents saved during training
prompt = "spider man is skiing"
ddim_inv_latent = torch.load(f"{my_model_path}/inv_latents/ddim_latent-500.pt").to(torch.float16)
video = pipe(prompt, latents=ddim_inv_latent, video_length=24, height=512, width=512, num_inference_steps=50, guidance_scale=12.5).videos

# Save the result as a GIF named after the prompt
save_videos_grid(video, f"./{prompt}.gif")
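To render several prompts from the same inverted latents, the inference call can be wrapped in a small loop. The sketch below reuses the sampling arguments from the snippet above. `prompt_to_filename` and `generate_all` are illustrative helpers rather than part of `tuneavideo`, and they expect you to pass in the `pipe`, `ddim_inv_latent`, and `save_videos_grid` objects you already created.

```python
import re


def prompt_to_filename(prompt: str, ext: str = "gif") -> str:
    """Turn a prompt into a file name without spaces or punctuation."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    return f"{slug}.{ext}"


def generate_all(pipe, latents, prompts, save_fn, out_dir="."):
    """Render each prompt from the same inverted latents and save the result.

    pipe, latents, and save_fn are assumed to be the pipeline,
    ddim_inv_latent, and save_videos_grid from the inference snippet.
    """
    paths = []
    for prompt in prompts:
        video = pipe(prompt, latents=latents, video_length=24, height=512,
                     width=512, num_inference_steps=50, guidance_scale=12.5).videos
        path = f"{out_dir}/{prompt_to_filename(prompt)}"
        save_fn(video, path)
        paths.append(path)
    return paths
```

For example, `generate_all(pipe, ddim_inv_latent, ["spider man is skiing"], save_videos_grid)` would write `./spider-man-is-skiing.gif`. Slugifying the prompt keeps commas and spaces out of the output file names.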

Results

Pretrained T2I (Stable Diffusion)

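Input video	Output videos

"A man is skiing"	"Wonder Woman is skiing"	"A little girl is skiing"

"A rabbit is eating a watermelon"	"A cat is eating a watermelon on the table"	"A puppy is eating a cheeseburger on the table, comic style"

"A jeep car is moving on the road"	"A car is moving on the road, cartoon style"	"A car is moving on the snow"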