memory efficient attention pytorch

Version 0.1.6

Memory Efficient Attention Pytorch (obsolete)

Implementation of memory-efficient multi-head attention, as proposed in the paper Self-attention Does Not Need O(n²) Memory. In addition, the module takes care of masking, causal masking, and cross attention.
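To make the idea concrete, here is a minimal pure-Python sketch of the running-softmax trick for a single query vector. It is not this repository's implementation, and the helper names (dot, naive_attention, bucketed_attention) are hypothetical. Keys and values are visited one bucket at a time, and only a running maximum, a running normalizer, and a running weighted sum of values are kept, so the full row of attention scores is never stored.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def naive_attention(q, keys, values):
    # Reference: materializes every score for this query at once.
    scale = len(q) ** -0.5
    scores = [dot(q, k) * scale for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))]

def bucketed_attention(q, keys, values, k_bucket_size=2):
    # Same result, but only one bucket of scores exists at any time.
    scale = len(q) ** -0.5
    running_max = -math.inf          # largest score seen so far
    running_sum = 0.0                # softmax normalizer, relative to running_max
    acc = [0.0] * len(values[0])     # weighted sum of values, relative to running_max
    for start in range(0, len(keys), k_bucket_size):
        k_bucket = keys[start:start + k_bucket_size]
        v_bucket = values[start:start + k_bucket_size]
        scores = [dot(q, k) * scale for k in k_bucket]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old maximum
        # (exp(-inf) is 0.0, so the first bucket starts from zero).
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, v_bucket):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * vj for a, vj in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]

random.seed(0)
q = [random.gauss(0, 1) for _ in range(4)]
keys = [[random.gauss(0, 1) for _ in range(4)] for _ in range(7)]
values = [[random.gauss(0, 1) for _ in range(3)] for _ in range(7)]

full = naive_attention(q, keys, values)
bucketed = bucketed_attention(q, keys, values, k_bucket_size=2)
```

Queries are independent of one another, so bucketing along the query dimension amounts to running this loop for one group of queries at a time.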

This repository also contains a naive, non-CUDA implementation of the improvements made by Tri Dao in his Flash Attention 2 paper, for educational purposes. It is a game changer for attention and for building long-context transformers.

Update: from now on, you should just use the F.scaled_dot_product_attention function in Pytorch 2.0 for built-in Flash Attention v1 support, or use Flash Attention v2 from the official repository.
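As a rough illustration of that recommendation, the sketch below calls F.scaled_dot_product_attention with causal masking and compares it against a hand-written reference. The tensor sizes are arbitrary assumptions chosen to run quickly on CPU. Which kernel PyTorch dispatches to depends on the device, dtype, and PyTorch version, so this run is not guaranteed to use Flash Attention.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Assumed toy sizes: (batch, heads, sequence length, dimension per head)
batch, heads, seq_len, dim_head = 1, 8, 128, 64
q = torch.randn(batch, heads, seq_len, dim_head)
k = torch.randn(batch, heads, seq_len, dim_head)
v = torch.randn(batch, heads, seq_len, dim_head)

# Built-in attention; is_causal=True applies the autoregressive mask
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Hand-written reference that materializes the full score matrix
scores = (q @ k.transpose(-2, -1)) * dim_head ** -0.5
future = torch.ones(seq_len, seq_len, dtype=torch.bool).triu(1)  # True above the diagonal
scores = scores.masked_fill(future, float("-inf"))
reference = scores.softmax(dim=-1) @ v
```

The inputs here are already split into heads. The Attention module in this package also applies the query, key, value, and output projections, which you would write yourself when calling the function directly.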

Install

$ pip install memory-efficient-attention-pytorch

Usage

For an autoregressive language model

import torch
from memory_efficient_attention_pytorch import Attention

attn = Attention(
    dim = 512,
    dim_head = 64,                # dimension per head
    heads = 8,                    # number of attention heads
    causal = True,                # autoregressive or not
    memory_efficient = True,      # whether to use memory efficient attention (can be turned off to test against normal attention)
    q_bucket_size = 1024,         # bucket size along queries dimension
    k_bucket_size = 2048          # bucket size along key / values dimension
).cuda()

x = torch.randn(1, 65536, 512).cuda()
out = attn(x) # (1, 65536, 512)
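For a sense of scale, the arithmetic below estimates the size of the attention score tensor for the settings above. It assumes float32 scores and that all heads of one query/key bucket pair are held in memory together. The helper score_matrix_bytes is hypothetical, and the figures ignore activations, gradients, and any other buffers.

```python
def score_matrix_bytes(q_len, k_len, heads, batch=1, bytes_per_element=4):
    # Bytes needed to hold a (batch, heads, q_len, k_len) tensor of scores
    return batch * heads * q_len * k_len * bytes_per_element

# Full attention over the whole 65536-token sequence
full = score_matrix_bytes(65536, 65536, heads=8)

# One bucket pair with q_bucket_size = 1024 and k_bucket_size = 2048
bucket = score_matrix_bytes(1024, 2048, heads=8)

print(f"full scores:   {full / 2**30:.0f} GiB")    # full scores:   128 GiB
print(f"bucket scores: {bucket / 2**20:.0f} MiB")  # bucket scores: 64 MiB
```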

Cross attention

import torch
from memory_efficient_attention_pytorch import Attention

cross_attn = Attention(
    dim = 512,
    dim_head = 64,
    heads = 8,
    memory_efficient = True,
    q_bucket_size = 1024,
    k_bucket_size = 2048
).cuda()

x = torch.randn(1, 65536, 512).cuda()
context = torch.randn(1, 65536, 512).cuda()
mask = torch.ones(1, 65536).bool().cuda()

out = cross_attn(x, context = context, mask = mask) # (1, 65536, 512)

Citations

@misc{rabe2021selfattention,
    title   = {Self-attention Does Not Need $O(n^2)$ Memory},
    author  = {Markus N. Rabe and Charles Staats},
    year    = {2021},
    eprint  = {2112.05682},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

@misc{liu2021swin,
    title   = {Swin Transformer V2: Scaling Up Capacity and Resolution},
    author  = {Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo},
    year    = {2021},
    eprint  = {2111.09883},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

@article{Dao2022FlashAttentionFA,
    title   = {FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness},
    author  = {Tri Dao and Daniel Y. Fu and Stefano Ermon and Atri Rudra and Christopher R{\'e}},
    journal = {ArXiv},
    year    = {2022},
    volume  = {abs/2205.14135}
}

@article{dao2023flashattention2,
    title   = {Flash{A}ttention-2: Faster Attention with Better Parallelism and Work Partitioning},
    author  = {Dao, Tri},
    year    = {2023}
}
