Q-transformer (version 0.3.0)

Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Functions, from Google DeepMind

I will keep the single-action Q-learning logic around only for a final comparison with the proposed autoregressive Q-learning over multiple actions. It also serves as education for myself and the public.
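To make the contrast between the two concrete, here is a minimal sketch of the one-step target in each case, following the per-dimension backup described in the Q-Transformer paper. It is plain Python for illustration and not this library's implementation: the function names and argument layout are assumptions, and everything beyond the basic Bellman backup is left out.

```python
from typing import List, Sequence


def single_action_target(
    reward: float,
    q_next: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    bootstrap = 0.0 if done else gamma * max(q_next)
    return reward + bootstrap


def autoregressive_targets(
    reward: float,
    q_per_dim: Sequence[Sequence[float]],
    q_next_first_dim: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> List[float]:
    """Return one target per action dimension.

    q_per_dim[i] holds the Q-values over bins for dimension i at the current
    timestep, conditioned on the dataset's actions for the dimensions before i.
    q_next_first_dim holds the Q-values for the first dimension at the next
    timestep.
    """
    targets = []
    last = len(q_per_dim) - 1
    for i in range(len(q_per_dim)):
        if i < last:
            # Inner dimensions: no reward and no discount, only the max
            # over the bins of the following dimension.
            targets.append(max(q_per_dim[i + 1]))
        else:
            # Final dimension: reward plus the discounted bootstrap from
            # the first dimension of the next timestep.
            targets.append(single_action_target(reward, q_next_first_dim, gamma, done))
    return targets


targets = autoregressive_targets(
    reward=1.0,
    q_per_dim=[[0.1, 0.5], [0.2, 0.9], [0.3, 0.4]],
    q_next_first_dim=[1.0, 2.0],
    gamma=0.5,
)
print(targets)  # [0.9, 0.4, 2.0]
```

Only the last action dimension sees the reward and the discount; every earlier dimension simply backs up the best value of the dimension after it, which is what lets a multi-dimensional action be treated as a sequence of discrete choices.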

The autoregressive Q-learning formulation has been reproduced by Kotb et al.

Install

$ pip install q-transformer
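After installing, one way to confirm which release ended up in the environment is to query the package metadata from the standard library. This is a small sketch; the helper name `installed_version` is hypothetical, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "q-transformer") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version())
```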

Usage

import torch

from q_transformer import (
    QRoboticTransformer,
    QLearner,
    Agent,
    ReplayMemoryDataset
)

# the attention model

model = QRoboticTransformer(
    vit = dict(
        num_classes = 1000,
        dim_conv_stem = 64,
        dim = 64,
        dim_head = 64,
        depth = (2, 2, 5, 2),
        window_size = 7,
        mbconv_expansion_rate = 4,
        mbconv_shrinkage_rate = 0.25,
        dropout = 0.1
    ),
    num_actions = 8,
    action_bins = 256,
    depth = 1,
    heads = 8,
    dim_head = 64,
    cond_drop_prob = 0.2,
    dueling = True
)

# you need to supply your own environment, by overriding BaseEnvironment

from q_transformer.mocks import MockEnvironment

env = MockEnvironment(
    state_shape = (3, 6, 224, 224),
    text_embed_shape = (768,)
)

# env.init()     should return instructions and initial state: Tuple[str, Tensor[*state_shape]]
# env(actions)   should return rewards, next state, and done flag: Tuple[Tensor[()], Tensor[*state_shape], Tensor[()]]

# agent is a class that allows the q-model to interact with the environment to generate a replay memory dataset for learning

agent = Agent(
    model,
    environment = env,
    num_episodes = 1000,
    max_num_steps_per_episode = 100,
)

agent()

# Q learning on the replay memory dataset on the model

q_learner = QLearner(
    model,
    dataset = ReplayMemoryDataset(),
    num_train_steps = 10000,
    learning_rate = 3e-4,
    batch_size = 4,
    grad_accum_every = 16,
)

q_learner()

# after much learning
# your robot should be better at selecting optimal actions

video = torch.randn(2, 3, 6, 224, 224)

instructions = [
    'bring me that apple sitting on the table',
    'please pass the butter'
]

actions = model.get_optimal_actions(video, instructions)
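As a rough illustration of what selecting optimal actions means under an autoregressive Q-function, the sketch below picks one bin per action dimension greedily, each choice conditioned on the earlier ones, and then maps bins back to continuous values. It is plain Python and does not call the library: `greedy_autoregressive_bins`, `bin_to_value`, and `toy_q_fn` are hypothetical names, and the `[-1, 1]` action range is an assumption, since the usage above does not say how bins map to continuous actions or what `get_optimal_actions` returns.

```python
from typing import Callable, List, Sequence


def greedy_autoregressive_bins(
    q_fn: Callable[[Sequence[int]], Sequence[float]],
    num_actions: int,
) -> List[int]:
    """Pick one bin per action dimension, each conditioned on the earlier picks."""
    chosen: List[int] = []
    for _ in range(num_actions):
        q_values = q_fn(chosen)  # Q-values over bins for the next dimension
        best_bin = max(range(len(q_values)), key=lambda b: q_values[b])
        chosen.append(best_bin)
    return chosen


def bin_to_value(bin_index: int, num_bins: int, low: float = -1.0, high: float = 1.0) -> float:
    """Map a bin index to the centre of its interval in [low, high] (assumed range)."""
    width = (high - low) / num_bins
    return low + (bin_index + 0.5) * width


def toy_q_fn(prefix: Sequence[int]) -> List[float]:
    """Hypothetical Q-function with 4 bins: prefers the bin after the last one chosen."""
    target = (prefix[-1] + 1) % 4 if prefix else 2
    return [1.0 if b == target else 0.0 for b in range(4)]


bins = greedy_autoregressive_bins(toy_q_fn, num_actions=3)
values = [bin_to_value(b, num_bins=4) for b in bins]
print(bins)    # [2, 3, 0]
print(values)  # [0.25, 0.75, -0.75]
```

With `num_actions = 8` and `action_bins = 256` as in the model above, the same loop would make 8 sequential choices over 256 bins each, rather than searching a joint space of 256^8 actions.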

Appreciation

StabilityAI, A16Z Open Source AI Grant Program, and Huggingface for the generous sponsorships, as well as my other sponsors, for affording me the independence to open source current artificial intelligence research

Todo

Citations

@inproceedings{qtransformer,
    title     = {Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions},
    author    = {Yevgen Chebotar and Quan Vuong and Alex Irpan and Karol Hausman and Fei Xia and Yao Lu and Aviral Kumar and Tianhe Yu and Alexander Herzog and Karl Pertsch and Keerthana Gopalakrishnan and Julian Ibarz and Ofir Nachum and Sumedh Sontakke and Grecia Salazar and Huong T Tran and Jodilyn Peralta and Clayton Tan and Deeksha Manjunath and Jaspiar Singh and Brianna Zitkovich and Tomas Jackson and Kanishka Rao and Chelsea Finn and Sergey Levine},
    booktitle = {7th Annual Conference on Robot Learning},
    year      = {2023}
}

@inproceedings{dao2022flashattention,
    title     = {Flash{A}ttention: Fast and Memory-Efficient Exact Attention with {IO}-Awareness},
    author    = {Dao, Tri and Fu, Daniel Y. and Ermon, Stefano and Rudra, Atri and R{\'e}, Christopher},
    booktitle = {Advances in Neural Information Processing Systems},
    year      = {2022}
}

@inproceedings{Kumar2023MaintainingPI,
    title   = {Maintaining Plasticity in Continual Learning via Regenerative Regularization},
    author  = {Saurabh Kumar and Henrik Marklund and Benjamin Van Roy},
    year    = {2023},
    url     = {https://api.semanticscholar.org/CorpusID:261076021}
}


Additional information

Version: 0.3.0
Type: AI source code
Last updated: 2025-01-14
Size: 1.42 MB
Source: GitHub
