deep daze 다운로드 - deep daze 소스 코드 다운로드

딥 데이즈

푸른 언덕 위에 안개

잔디 위에 부서진 접시

우주적인 사랑과 관심

군중 속의 시간여행자

전염병 중의 삶

햇살 가득한 숲속의 명상적 평화

완전히 붉은 이미지를 그리는 남자

LSD에 대한 환각적인 경험

이게 뭔가요?

OpenAI의 CLIP 및 Siren을 사용하여 텍스트를 이미지로 생성하기 위한 간단한 명령줄 도구입니다. 이 기술을 발견하고 훌륭한 이름을 생각해낸 공로는 Ryan Murdock에게 돌아갑니다!

원본 노트

새로운 단순화된 노트북

이를 위해서는 Nvidia GPU 또는 AMD GPU가 필요합니다.

권장: 16GB VRAM
최소 요구 사항: 4GB VRAM(매우 낮음 설정 사용, 아래 사용 지침 참조)

설치하다

$ pip install deep-daze

윈도우 설치

Python이 설치되어 있다고 가정합니다.

명령 프롬프트를 열고 현재 버전의 Python 디렉터리로 이동합니다.

  pip install deep-daze

예

$ imagine " a house in the forest "

Windows의 경우:

관리자로 명령 프롬프트 열기

  imagine " a house in the forest "

그게 다야.

메모리가 충분하다면 --deeper 플래그를 추가하여 더 나은 품질을 얻을 수 있습니다.

$ imagine " shattered plates on the ground " --deeper

고급의

진정한 딥러닝 방식에서는 레이어가 많을수록 더 나은 결과를 얻을 수 있습니다. 기본값은 16 이지만 리소스에 따라 32 로 늘릴 수 있습니다.

$ imagine " stranger in strange lands " --num-layers 32

용법

CLI

NAME
    imagine

SYNOPSIS
    imagine TEXT < flags >

POSITIONAL ARGUMENTS
    TEXT
        (required) A phrase less than 77 tokens which you would like to visualize.

FLAGS
    --img=IMAGE_PATH
        Default: None
        Path to png/jpg image or PIL image to optimize on
    --encoding=ENCODING
        Default: None
        User-created custom CLIP encoding. If used, replaces any text or image that was used.
    --create_story=CREATE_STORY
        Default: False
        Creates a story by optimizing each epoch on a new sliding-window of the input words. If this is enabled, much longer texts than 77 tokens can be used. Requires save_progress to visualize the transitions of the story.
    --story_start_words=STORY_START_WORDS
        Default: 5
        Only used if create_story is True. How many words to optimize on for the first epoch.
    --story_words_per_epoch=STORY_WORDS_PER_EPOCH
        Default: 5
        Only used if create_story is True. How many words to add to the optimization goal per epoch after the first one.
    --story_separator:
        Default: None
        Only used if create_story is True. Defines a separator like ' . ' that splits the text into groups for each epoch. Separator needs to be in the text otherwise it will be ignored
    --lower_bound_cutout=LOWER_BOUND_CUTOUT
        Default: 0.1
        Lower bound of the sampling of the size of the random cut-out of the SIREN image per batch. Should be smaller than 0.8.
    --upper_bound_cutout=UPPER_BOUND_CUTOUT
        Default: 1.0
        Upper bound of the sampling of the size of the random cut-out of the SIREN image per batch. Should probably stay at 1.0.
    --saturate_bound=SATURATE_BOUND
        Default: False
        If True, the LOWER_BOUND_CUTOUT is linearly increased to 0.75 during training.
    --learning_rate=LEARNING_RATE
        Default: 1e-05
        The learning rate of the neural net.
    --num_layers=NUM_LAYERS
        Default: 16
        The number of hidden layers to use in the Siren neural net.
    --batch_size=BATCH_SIZE
        Default: 4
        The number of generated images to pass into Siren before calculating loss. Decreasing this can lower memory and accuracy.
    --gradient_accumulate_every=GRADIENT_ACCUMULATE_EVERY
        Default: 4
        Calculate a weighted loss of n samples for each iteration. Increasing this can help increase accuracy with lower batch sizes.
    --epochs=EPOCHS
        Default: 20
        The number of epochs to run.
    --iterations=ITERATIONS
        Default: 1050
        The number of times to calculate and backpropagate loss in a given epoch.
    --save_every=SAVE_EVERY
        Default: 100
        Generate an image every time iterations is a multiple of this number.
    --image_width=IMAGE_WIDTH
        Default: 512
        The desired resolution of the image.
    --deeper=DEEPER
        Default: False
        Uses a Siren neural net with 32 hidden layers.
    --overwrite=OVERWRITE
        Default: False
        Whether or not to overwrite existing generated images of the same name.
    --save_progress=SAVE_PROGRESS
        Default: False
        Whether or not to save images generated before training Siren is complete.
    --seed=SEED
        Type: Optional[]
        Default: None
        A seed to be used for deterministic runs.
    --open_folder=OPEN_FOLDER
        Default: True
        Whether or not to open a folder showing your generated images.
    --save_date_time=SAVE_DATE_TIME
        Default: False
        Save files with a timestamp prepended e.g. ` %y%m%d-%H%M%S-my_phrase_here `
    --start_image_path=START_IMAGE_PATH
        Default: None
        The generator is trained first on a starting image before steered towards the textual input
    --start_image_train_iters=START_IMAGE_TRAIN_ITERS
        Default: 50
        The number of steps for the initial training on the starting image
    --theta_initial=THETA_INITIAL
        Default: 30.0
        Hyperparameter describing the frequency of the color space. Only applies to the first layer of the network.
    --theta_hidden=THETA_INITIAL
        Default: 30.0
        Hyperparameter describing the frequency of the color space. Only applies to the hidden layers of the network.
    --save_gif=SAVE_GIF
        Default: False
        Whether or not to save a GIF animation of the generation procedure. Only works if save_progress is set to True.

애벌칠

Mario Klingemann이 처음 고안하고 공유한 기술을 사용하면 텍스트로 이동하기 전에 시작 이미지로 생성기 네트워크를 프라이밍할 수 있습니다.

사용하려는 이미지의 경로를 지정하고 선택적으로 초기 학습 단계 수를 지정하기만 하면 됩니다.

$ imagine ' a clear night sky filled with stars ' --start_image_path ./cloudy-night-sky.jpg

프라이밍된 시작 이미지

그런 다음 A pizza with green pepper.

이미지 해석을 위한 최적화

생성기 네트워크만 프라이밍하는 대신 최적화 목표로 이미지를 공급할 수도 있습니다. 그런 다음 Deepdaze는 해당 이미지에 대한 자체 해석을 렌더링합니다.

$ imagine --img samples/Autumn_1875_Frederic_Edwin_Church.jpg

원본 이미지:

네트워크의 해석:

원본 이미지:

네트워크의 해석:

텍스트와 이미지 결합에 최적화

$ imagine " A psychedelic experience. " --img samples/hot-dog.jpg

네트워크의 해석:

새로운 기능: 스토리 만들기

텍스트의 일반 모드에서는 77개의 토큰만 허용됩니다. 전체 스토리/단락/노래/시를 시각화하려면 create_story True 로 설정하세요.

로버트 프로스트(Robert Frost)의 "눈 내리는 저녁 숲가에 멈춰서기"라는 시를 보면 다음과 같습니다. "이게 누구의 숲인지 알 것 같습니다. 하지만 그의 집은 마을에 있습니다. 그는 내가 여기 멈춰 그의 숲이 눈으로 가득 차는 것을 지켜보는 것을 보지 못할 것입니다. 내 작은 말은 숲과 얼어붙은 호수 사이 근처에 농가도 없이 멈추는 것을 이상하게 생각하는 것 같습니다. 그는 마구 종을 흔들어 뭔가 실수가 있는지 묻습니다. 그리고 숲은 아름답고 어둡고 깊지만 나에게는 지켜야 할 약속이 있고 잠들기 전에 가야 할 길이 있고 잠들기 전에 가야 할 길이 있습니다."

우리는 다음을 얻습니다:

Whose_woods_these_are_I_think_I_know._His_house_is_in_the_village_though._He_.mp4

파이썬

Python에서 `deep_daze.Imagine` 호출합니다.

 from deep_daze import Imagine

imagine = Imagine (
    text = 'cosmic love and attention' ,
    num_layers = 24 ,
)
imagine ()

4번째 반복마다 진행 상황 저장

insert_text_here.00001.png, insert_text_here.00002.png, ...최대 (total_iterations % save_every) 형식으로 이미지를 저장합니다.

 imagine = Imagine (
    text = text ,
    save_every = 4 ,
    save_progress = True
)

각 이미지 앞에 현재 타임스탬프를 추가합니다.

타임스탬프와 시퀀스 번호가 모두 포함된 파일을 생성합니다.

예: 210129-043928_328751_insert_text_here.00001.png, 210129-043928_512351_insert_text_here.00002.png, ...

 imagine = Imagine (
    text = text ,
    save_every = 4 ,
    save_progress = True ,
    save_date_time = True ,
)

높은 GPU 메모리 사용량

사용 가능한 vram이 16GiB 이상인 경우 약간의 공간을 두고 이러한 설정을 실행할 수 있습니다.

 imagine = Imagine (
    text = text ,
    num_layers = 42 ,
    batch_size = 64 ,
    gradient_accumulate_every = 1 ,
)

평균 GPU 메모리 사용량

 imagine = Imagine (
    text = text ,
    num_layers = 24 ,
    batch_size = 16 ,
    gradient_accumulate_every = 2
)

매우 낮은 GPU 메모리 사용량(4GiB 미만)

8GiB vram 미만의 카드에서 이 작업을 실행하고 싶다면 image_width를 낮출 수 있습니다.

 imagine = Imagine (
    text = text ,
    image_width = 256 ,
    num_layers = 16 ,
    batch_size = 1 ,
    gradient_accumulate_every = 16 # Increase gradient_accumulate_every to correct for loss in low batch sizes
)

VRAM 및 속도 벤치마크:

이 실험은 2060 Super RTX 및 3700X Ryzen 5를 사용하여 수행되었습니다. 먼저 매개변수(bs = 배치 크기)를 언급한 다음 메모리 사용량 및 경우에 따라 초당 훈련 반복을 언급합니다.

이미지 해상도 512의 경우:

bs 1, num_layers 22: 7.96GB
bs 2, num_layers 20: 7.5GB
bs 16, num_layers 16: 6.5GB

이미지 해상도 256의 경우:

bs 8, num_layers 48: 5.3GB
bs 16, num_layers 48: 5.46GB - 2.0 it/s
bs 32, num_layers 48: 5.92GB - 1.67it/s
bs 8, num_layers 44: 5GB - 2.39 it/s
bs 32, num_layers 44, grad_acc 1: 5.62GB - 4.83it/s
bs 96, num_layers 44, grad_acc 1: 7.51GB - 2.77it/s
bs 32, num_layers 66, grad_acc 1: 7.09GB - 3.7it/s

@NotNANtoN은 44개 레이어와 1~8 에포크 교육을 포함하는 배치 크기 32개를 권장합니다.

이건 어디로 가는 걸까요?

이것은 단지 티저일 뿐입니다. 우리는 자연어를 이용해 이미지, 소리 등 무엇이든 마음대로 생성할 수 있게 될 것입니다. 홀로데크는 곧 우리 삶에서 현실이 될 것입니다.

이 기술을 발전시키는 데 관심이 있다면 Pytorch 또는 Mesh Tensorflow용 DALL-E 복제 작업에 참여하세요.

대안

Big Sleep - CLIP 및 Big GAN의 생성기

인용

 @misc { unpublished2021clip ,
    title  = { CLIP: Connecting Text and Images } ,
    author = { Alec Radford, Ilya Sutskever, Jong Wook Kim, Gretchen Krueger, Sandhini Agarwal } ,
    year   = { 2021 }
}

 @misc { sitzmann2020implicit ,
    title   = { Implicit Neural Representations with Periodic Activation Functions } ,
    author  = { Vincent Sitzmann and Julien N. P. Martel and Alexander W. Bergman and David B. Lindell and Gordon Wetzstein } ,
    year    = { 2020 } ,
    eprint  = { 2006.09661 } ,
    archivePrefix = { arXiv } ,
    primaryClass = { cs.CV }
}

확장하다

deep daze

딥 데이즈

이게 뭔가요?

설치하다

윈도우 설치

예

고급의

용법

CLI

애벌칠

이미지 해석을 위한 최적화

텍스트와 이미지 결합에 최적화

새로운 기능: 스토리 만들기

파이썬

Python에서 `deep_daze.Imagine` 호출합니다.

4번째 반복마다 진행 상황 저장

각 이미지 앞에 현재 타임스탬프를 추가합니다.

높은 GPU 메모리 사용량

평균 GPU 메모리 사용량

매우 낮은 GPU 메모리 사용량(4GiB 미만)

VRAM 및 속도 벤치마크:

이건 어디로 가는 걸까요?

대안

인용

DAZE CAM 앱

딥 필드

딥헌터 게임

딥디

딥 레이스: 전투

깊은 룬

chat.petals.dev

GPT Prompt Templates

GPTyped

node telegram bot api

typebot.io

python wechaty getting started

waymo open dataset

termwind

wp functions

deep daze

딥 데이즈

이게 뭔가요?

설치하다

윈도우 설치

예

고급의

용법

CLI

애벌칠

이미지 해석을 위한 최적화

텍스트와 이미지 결합에 최적화

새로운 기능: 스토리 만들기

파이썬

Python에서 deep_daze.Imagine 호출합니다.

4번째 반복마다 진행 상황 저장

각 이미지 앞에 현재 타임스탬프를 추가합니다.

높은 GPU 메모리 사용량

평균 GPU 메모리 사용량

매우 낮은 GPU 메모리 사용량(4GiB 미만)

VRAM 및 속도 벤치마크:

이건 어디로 가는 걸까요?

대안

인용

Python에서 `deep_daze.Imagine` 호출합니다.