memory efficient attention pytorch

Memory Efficient Attention Pytorch (obsolete)

Implementation of memory-efficient multi-head attention as proposed in the paper Self-attention Does Not Need O(n²) Memory. In addition, the module takes care of masking, causal masking, and cross attention.
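
The core idea can be sketched in a few lines: process the keys and values in buckets, and keep a running maximum and running softmax denominator so that partial results can be rescaled as each new bucket arrives. The sketch below is a single-head illustration under those assumptions, not this repository's implementation; the function name chunked_attention is hypothetical, and k_bucket_size is borrowed from the usage example further down.

```python
import math
import torch

def chunked_attention(q, k, v, k_bucket_size=4):
    """Single-head attention computed one key/value bucket at a time.

    q: (n_q, d), k: (n_k, d), v: (n_k, d_v).
    Only an (n_q, k_bucket_size) block of scores exists at any moment,
    never the full (n_q, n_k) attention matrix.
    """
    scale = 1.0 / math.sqrt(q.shape[-1])
    n_q = q.shape[0]
    opts = dict(dtype=q.dtype, device=q.device)
    running_max = torch.full((n_q, 1), float("-inf"), **opts)
    running_sum = torch.zeros(n_q, 1, **opts)        # softmax denominator so far
    acc = torch.zeros(n_q, v.shape[-1], **opts)      # unnormalized output so far

    for start in range(0, k.shape[0], k_bucket_size):
        k_chunk = k[start:start + k_bucket_size]
        v_chunk = v[start:start + k_bucket_size]
        scores = (q @ k_chunk.T) * scale             # (n_q, bucket)
        new_max = torch.maximum(running_max, scores.amax(dim=-1, keepdim=True))
        correction = torch.exp(running_max - new_max)  # rescales earlier partial results
        weights = torch.exp(scores - new_max)
        running_sum = running_sum * correction + weights.sum(dim=-1, keepdim=True)
        acc = acc * correction + weights @ v_chunk
        running_max = new_max

    return acc / running_sum

torch.manual_seed(0)
q = torch.randn(6, 16, dtype=torch.float64)
k = torch.randn(10, 16, dtype=torch.float64)
v = torch.randn(10, 16, dtype=torch.float64)
out = chunked_attention(q, k, v, k_bucket_size=4)
reference = torch.softmax((q @ k.T) / math.sqrt(16), dim=-1) @ v
print(torch.allclose(out, reference))
```

The result is exact attention rather than an approximation; the saving comes only from never materializing the full score matrix. Bucketing along the query dimension as well (q_bucket_size in the usage example) would bound the remaining (n_q, bucket) block in the same way.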

This repository also contains a naive, non-CUDA implementation of the improvements made by Tri Dao in his Flash Attention 2 paper, for educational purposes. It is a turning point for attention and for building long-context transformers.

Update: from now on, you should use the F.scaled_dot_product_attention function in Pytorch 2.0 for built-in Flash Attention v1 support, or use Flash Attention v2 from the official repository.
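
As a rough illustration of the recommended replacement, the sketch below calls F.scaled_dot_product_attention with causal masking on small random tensors. It assumes PyTorch 2.0 or later; the tensor sizes are arbitrary, and whether a fused Flash Attention kernel is actually selected depends on the device, dtype, and PyTorch build. Unlike the Attention module in this repository, the function applies no input or output projections, so it expects queries, keys, and values that are already split into heads.

```python
import torch
import torch.nn.functional as F

batch, heads, seq_len, dim_head = 1, 8, 128, 64

# scaled_dot_product_attention expects (batch, heads, seq_len, dim_head)
q = torch.randn(batch, heads, seq_len, dim_head)
k = torch.randn(batch, heads, seq_len, dim_head)
v = torch.randn(batch, heads, seq_len, dim_head)

# is_causal=True lets position i attend only to positions <= i
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)
```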

Install

$ pip install memory-efficient-attention-pytorch

Usage

For an autoregressive language model

import torch
from memory_efficient_attention_pytorch import Attention

attn = Attention(
    dim = 512,
    dim_head = 64,                # dimension per head
    heads = 8,                    # number of attention heads
    causal = True,                # autoregressive or not
    memory_efficient = True,      # whether to use memory efficient attention (can be turned off to test against normal attention)
    q_bucket_size = 1024,         # bucket size along queries dimension
    k_bucket_size = 2048          # bucket size along key / values dimension
).cuda()

x = torch.randn(1, 65536, 512).cuda()
out = attn(x) # (1, 65536, 512)
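
To give a sense of scale for these settings, the arithmetic below compares the size of the full attention score tensor at sequence length 65536 with the size of one query-bucket by key-bucket block. It is a back-of-the-envelope sketch: it assumes float32 scores and that one query bucket is scored against one key bucket at a time across all heads, and it ignores every other activation. The helper name attn_matrix_bytes is hypothetical.

```python
def attn_matrix_bytes(q_len, k_len, heads=1, batch=1, bytes_per_el=4):
    """Bytes needed to hold a (batch, heads, q_len, k_len) score tensor."""
    return batch * heads * q_len * k_len * bytes_per_el

seq_len, heads = 65536, 8
full = attn_matrix_bytes(seq_len, seq_len, heads=heads)   # all scores at once
bucket = attn_matrix_bytes(1024, 2048, heads=heads)       # one bucket pair

print(f"full score tensor: {full / 2**30:.0f} GiB")    # 128 GiB
print(f"one bucket pair:   {bucket / 2**20:.0f} MiB")  # 64 MiB
```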

Cross attention

import torch
from memory_efficient_attention_pytorch import Attention

cross_attn = Attention(
    dim = 512,
    dim_head = 64,
    heads = 8,
    memory_efficient = True,
    q_bucket_size = 1024,
    k_bucket_size = 2048
).cuda()

x = torch.randn(1, 65536, 512).cuda()
context = torch.randn(1, 65536, 512).cuda()
mask = torch.ones(1, 65536).bool().cuda()

out = cross_attn(x, context = context, mask = mask) # (1, 65536, 512)
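
For comparison, a masked cross-attention call can also be sketched with F.scaled_dot_product_attention from PyTorch 2.0 or later. The sketch assumes a True-means-keep convention for the mask, which is what the all-ones mask above suggests, and it uses small, arbitrary sizes with tensors already split into heads. It is an illustration of the masking semantics, not a drop-in replacement for the module above, which also applies its own projections.

```python
import torch
import torch.nn.functional as F

batch, heads, dim_head = 1, 8, 64
q_len, ctx_len = 16, 32

q = torch.randn(batch, heads, q_len, dim_head)     # queries would come from x
k = torch.randn(batch, heads, ctx_len, dim_head)   # keys / values would come from context
v = torch.randn(batch, heads, ctx_len, dim_head)

# (batch, ctx_len) boolean mask, True = this context position may be attended to
mask = torch.ones(batch, ctx_len, dtype=torch.bool)
mask[:, 24:] = False                               # treat the last 8 positions as padding

# reshape to (batch, 1, 1, ctx_len) so the mask broadcasts over heads and queries
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask[:, None, None, :])
print(out.shape)
```

Masked-out context positions receive zero attention weight, so the output matches attention computed over only the first 24 context positions.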

Citations

@misc{rabe2021selfattention,
    title   = {Self-attention Does Not Need $O(n^2)$ Memory},
    author  = {Markus N. Rabe and Charles Staats},
    year    = {2021},
    eprint  = {2112.05682},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

@misc{liu2021swin,
    title   = {Swin Transformer V2: Scaling Up Capacity and Resolution},
    author  = {Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo},
    year    = {2021},
    eprint  = {2111.09883},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

@article{Dao2022FlashAttentionFA,
    title   = {FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness},
    author  = {Tri Dao and Daniel Y. Fu and Stefano Ermon and Atri Rudra and Christopher R{\'e}},
    journal = {ArXiv},
    year    = {2022},
    volume  = {abs/2205.14135}
}

@article{dao2023flashattention2,
    title   = {Flash{A}ttention-2: Faster Attention with Better Parallelism and Work Partitioning},
    author  = {Dao, Tri},
    year    = {2023}
}

Additional information

Version: 0.1.6
Type: AI source code
Updated: 2025-01-14
Size: 34.87MB
Source: GitHub
