DALL-E in Mesh-Tensorflow [WIP]

Open-AI's DALL-E in Mesh-Tensorflow.

If this is as efficient as GPT-Neo, this repo should be able to train models up to, and larger than, the size of Open-AI's DALL-E (12B params).

No pretrained models... Yet.

Thanks to Ben Wang for the tf vae implementation as well as getting the mtf version working, and to Aran Komatsuzaki for help building the mtf VAE and the input pipeline.

Setup

git clone https://github.com/EleutherAI/GPTNeo
cd GPTNeo
pip3 install -r requirements.txt

Training Setup

Runs on TPUs. It is untested on GPUs but should work in theory. The example configs are designed to run on a TPU v3-32 pod.

To set up TPUs, sign up for Google Cloud Platform and create a storage bucket.

Create your VM through a Google shell (https://ssh.cloud.google.com/) with ctpu up --vm-only so that it can connect to your Google bucket and TPUs, then set up the repo as above.

VAE pretraining

DALLE needs a pretrained VAE to compress images into tokens. To run the VAE pretraining, set the dataset paths in configs/vae_example.json to glob paths pointing to a dataset of jpgs, and set the image size to the appropriate size.

  "dataset": {
    "train_path": "gs://neo-datasets/CIFAR-10-images/train/**/*.jpg",
    "eval_path": "gs://neo-datasets/CIFAR-10-images/test/**/*.jpg",
    "image_size": 32
  }
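
If you would rather script this edit than change the file by hand, something like the following may work. It is a minimal sketch that assumes the config file is plain JSON with the dataset keys shown above. The helper names set_dataset_paths and rewrite_config_file are hypothetical and are not part of the repo.

```python
import copy
import json


def set_dataset_paths(config, train_path, eval_path, image_size):
    """Return a copy of `config` with updated dataset settings.

    Hypothetical helper: assumes the layout shown above, i.e. a
    "dataset" object holding train_path, eval_path and image_size.
    """
    if image_size <= 0:
        raise ValueError("image_size must be a positive integer")
    updated = copy.deepcopy(config)
    # Only the three dataset keys are touched; other keys are preserved.
    updated.setdefault("dataset", {}).update(
        {
            "train_path": train_path,
            "eval_path": eval_path,
            "image_size": image_size,
        }
    )
    return updated


def rewrite_config_file(path, train_path, eval_path, image_size):
    """Load a JSON config, update its dataset section, and save it back."""
    with open(path) as f:
        config = json.load(f)
    updated = set_dataset_paths(config, train_path, eval_path, image_size)
    with open(path, "w") as f:
        json.dump(updated, f, indent=2)
    return updated
```

For example, rewrite_config_file("configs/vae_example.json", "gs://your-bucket/train/**/*.jpg", "gs://your-bucket/test/**/*.jpg", 64) would point the example config at a bucket of your own (the bucket name here is a placeholder).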

Once this is all set up, create your TPU, then run:

python train_vae_tf.py --tpu your_tpu_name --model vae_example

The training logs image tensors and loss values. To check progress, you can run:

tensorboard --logdir your_model_dir

Dataset Creation [DALL-E]

Once the VAE is pretrained, you can move on to DALL-E.

Currently we are training on a dummy dataset. A public, large-scale dataset for DALL-E is in the works. In the meantime, to generate some dummy data, run:

python src/data/create_tfrecords.py

This should download CIFAR-10 and generate some random captions to act as text inputs.

Custom datasets should be formatted in a folder, with a jsonl file in the root folder containing caption data and paths to the respective images, as follows:

 Folder structure:

        data_folder
            jsonl_file
            folder_1
                img1
                img2
                ...
            folder_2
                img1
                img2
                ...
            ...

jsonl structure:
    {"image_path": "folder_1/img1", "caption": "some words"}
    {"image_path": "folder_2/img2", "caption": "more words"}
    ...

You can then use the create_paired_dataset function in src/data/create_tfrecords.py to encode the dataset into tfrecords for use in training.
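
Before encoding, it can help to generate the jsonl file programmatically and confirm that every listed image exists. The sketch below shows one way to do that under stated assumptions: the file name captions.jsonl, the set of image suffixes, and both helper names are hypothetical, and the script does not call create_paired_dataset itself.

```python
import json
from pathlib import Path

# Assumption: adjust to the image types actually present in your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_caption_index(data_folder, captions, jsonl_name="captions.jsonl"):
    """Write one JSON object per line for each captioned image.

    `captions` maps a path relative to data_folder (for example
    "folder_1/img1.jpg") to its caption. Images without a caption
    are skipped. Returns the jsonl path and the number of records.
    """
    root = Path(data_folder)
    out_path = root / jsonl_name
    n_written = 0
    with open(out_path, "w") as f:
        for img in sorted(root.rglob("*")):
            if not img.is_file() or img.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            rel = img.relative_to(root).as_posix()
            if rel not in captions:
                continue
            record = {"image_path": rel, "caption": captions[rel]}
            f.write(json.dumps(record) + "\n")
            n_written += 1
    return out_path, n_written


def find_missing_images(data_folder, jsonl_name="captions.jsonl"):
    """Return image paths listed in the jsonl file that are not on disk."""
    root = Path(data_folder)
    missing = []
    with open(root / jsonl_name) as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if not (root / record["image_path"]).is_file():
                missing.append(record["image_path"])
    return missing
```

Using json.dumps for each record keeps paths and captions correctly quoted and escaped, which hand-written lines can easily get wrong.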

Once the dataset is created, copy it over to your bucket with gsutil:

gsutil cp -r DALLE-tfrecords gs://neo-datasets/

And finally, run training with:

python train_dalle.py --tpu your_tpu_name --model dalle_example

Config Guide

VAE:

 {
  "model_type": "vae",
  "dataset": {
    "train_path": "gs://neo-datasets/CIFAR-10-images/train/**/*.jpg", # glob path to training images
    "eval_path": "gs://neo-datasets/CIFAR-10-images/test/**/*.jpg", # glob path to eval images
    "image_size": 32 # size of images (all images will be cropped / padded to this size)
  },
  "train_batch_size": 32, 
  "eval_batch_size": 32,
  "predict_batch_size": 32,
  "steps_per_checkpoint": 1000, # how often to save a checkpoint
  "iterations": 500, # number of batches to infeed to the tpu at a time. Must be < steps_per_checkpoint
  "train_steps": 100000, # total training steps
  "eval_steps": 0, # run evaluation for this many steps every steps_per_checkpoint
  "model_path": "gs://neo-models/vae_test2/", # directory in which to save the model
  "mesh_shape": "data:16,model:2", # mapping of processors to named dimensions - see mesh-tensorflow repo for more info
  "layout": "batch_dim:data", # which named dimensions of the model to split across the mesh - see mesh-tensorflow repo for more info
  "num_tokens": 512, # vocab size
  "dim": 512, 
  "hidden_dim": 64, # size of hidden dim
  "n_channels": 3, # number of input channels
  "bf_16": false, # if true, the model is trained with bfloat16 precision
  "lr": 0.001, # learning rate [by default learning rate starts at this value, then decays to 10% of this value over the course of the training]
  "num_layers": 3, # number of blocks in the encoder / decoder
  "train_gumbel_hard": true, # whether to use hard or soft gumbel_softmax
  "eval_gumbel_hard": true
}
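
Two of the constraints noted in the comments above can be checked before launching a run. The sketch below is not part of the repo. It assumes the config has been loaded into a dict without the inline comments, and that the mesh is meant to cover every core of the pod (32 for a v3-32, which matches data:16,model:2). The function names are hypothetical.

```python
import math


def parse_mesh_shape(mesh_shape):
    """Turn a string like "data:16,model:2" into {"data": 16, "model": 2}."""
    dims = {}
    for part in mesh_shape.split(","):
        name, size = part.split(":")
        dims[name.strip()] = int(size)
    return dims


def check_training_config(config, num_cores):
    """Return a list of problems found in a config dict (empty if none)."""
    problems = []
    # Stated in the config guide: iterations must be < steps_per_checkpoint.
    if config["iterations"] >= config["steps_per_checkpoint"]:
        problems.append("iterations must be < steps_per_checkpoint")
    # Assumption: the product of the mesh dimensions should equal the
    # number of cores available on the TPU pod.
    mesh_size = math.prod(parse_mesh_shape(config["mesh_shape"]).values())
    if mesh_size != num_cores:
        problems.append(
            f"mesh_shape covers {mesh_size} processors, expected {num_cores}"
        )
    return problems
```

With the example values above (iterations 500, steps_per_checkpoint 1000, mesh_shape "data:16,model:2") and num_cores=32, the returned list is empty.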

DALL-E:

 {
  "model_type": "dalle",
  "dataset": {
    "train_path": "gs://neo-datasets/DALLE-tfrecords/*.tfrecords", # glob path to tfrecords data
    "eval_path": "gs://neo-datasets/DALLE-tfrecords/*.tfrecords",
    "image_size": 32 # size of images (all images will be cropped / padded to this size)
  },
  "train_batch_size": 32, # see above
  "eval_batch_size": 32,
  "predict_batch_size": 32,
  "steps_per_checkpoint": 1000,
  "iterations": 500,
  "train_steps": 100000,
  "predict_steps": 0,
  "eval_steps": 0,
  "n_channels": 3,
  "bf_16": false,
  "lr": 0.001,
  "model_path": "gs://neo-models/dalle_test/",
  "mesh_shape": "data:16,model:2",
  "layout": "batch_dim:data",
  "n_embd": 512, # size of embedding dim
  "text_vocab_size": 50258, # vocabulary size of the text tokenizer
  "image_vocab_size": 512, # vocabulary size of the vae - should equal num_tokens above
  "text_seq_len": 256, # length of text inputs (all inputs longer / shorter will be truncated / padded)
  "n_layers": 6, 
  "n_heads": 4, # number of attention heads. For best performance, n_embd / n_heads should equal 128
  "vae_model": "vae_example" # path to or name of vae model config
}
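
The DALL-E config has to agree with the VAE config it points to, and the comments above suggest a head size of 128. A small consistency check along these lines could catch mismatches early. This is a sketch with a hypothetical function name. It assumes both configs are loaded as dicts, and it treats an n_embd that does not divide evenly by n_heads as an error, which is the usual multi-head attention convention rather than something this README states.

```python
def check_dalle_config(dalle_config, vae_config, preferred_head_dim=128):
    """Return (errors, warnings) for a DALL-E config and its VAE config."""
    errors, warnings = [], []

    # Stated in the config guide: image_vocab_size should equal num_tokens.
    if dalle_config["image_vocab_size"] != vae_config["num_tokens"]:
        errors.append(
            "image_vocab_size ({}) does not match VAE num_tokens ({})".format(
                dalle_config["image_vocab_size"], vae_config["num_tokens"]
            )
        )

    n_embd = dalle_config["n_embd"]
    n_heads = dalle_config["n_heads"]
    if n_embd % n_heads != 0:
        # Assumption: the embedding is split evenly across attention heads.
        errors.append("n_embd must be divisible by n_heads")
    elif n_embd // n_heads != preferred_head_dim:
        # Stated in the config guide as a performance recommendation only.
        warnings.append(
            "n_embd / n_heads is {}, recommended {}".format(
                n_embd // n_heads, preferred_head_dim
            )
        )
    return errors, warnings
```

The example configs pass cleanly: 512 / 4 gives a head size of 128, and image_vocab_size matches num_tokens at 512.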


Additional Information

Version: 1.0.0
Type: AI Source Code
Last Updated: 2025-01-16
Size: 38.35KB
Source: GitHub
