memory efficient attention pytorch

Memory Efficient Attention Pytorch (obsolete)

Implementation of memory-efficient multi-head attention as proposed in the paper Self-attention Does Not Need O(n²) Memory. In addition, the module takes care of masking, causal masking, and cross attention.
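The core idea can be sketched without any dependencies. The snippet below is an illustrative sketch, not the code of this package: it processes the keys and values for a single query in buckets, keeping only a running maximum, a running softmax denominator and a running weighted sum, so the full row of attention scores never has to be held at once. The function names (`scaled_scores`, `naive_attention`, `chunked_attention`) are made up for this example. Queries are independent of each other, so bucketing them as well is just an outer loop.

```python
import math

def scaled_scores(q, keys):
    """Dot product of one query with each key, scaled by 1/sqrt(dim)."""
    scale = len(q) ** -0.5
    return [scale * sum(a * b for a, b in zip(q, k)) for k in keys]

def naive_attention(q, keys, values):
    """Reference: materializes every score for the query before the softmax."""
    scores = scaled_scores(q, keys)
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))]

def chunked_attention(q, keys, values, k_bucket_size):
    """Same output, but only k_bucket_size scores exist at any one time."""
    running_max = float("-inf")        # largest score seen so far
    running_sum = 0.0                  # softmax denominator, relative to running_max
    acc = [0.0] * len(values[0])       # weighted sum of values, relative to running_max
    for start in range(0, len(keys), k_bucket_size):
        k_chunk = keys[start:start + k_bucket_size]
        v_chunk = values[start:start + k_bucket_size]
        scores = scaled_scores(q, k_chunk)
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated so far to the new maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, v_chunk):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]

# Small deterministic inputs: 10 keys of dimension 4, values of dimension 3.
q = [0.1, -0.2, 0.3, 0.4]
keys = [[math.sin(i + j) for j in range(4)] for i in range(10)]
values = [[math.cos(i * j) for j in range(3)] for i in range(10)]

reference = naive_attention(q, keys, values)
chunked = chunked_attention(q, keys, values, k_bucket_size=3)
```

Both functions should agree up to floating point rounding, whatever bucket size is chosen.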

This repository also contains a naive non-CUDA implementation of the improvements made by Tri Dao in his Flash Attention 2 paper, for educational purposes. It is a game changer for attention and for building long-context transformers.

Update: from now on, you should just use the F.scaled_dot_product_attention function in Pytorch 2.0 for built-in Flash Attention v1 support, or use Flash Attention v2 from the official repository.
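A minimal sketch of the recommended replacement, assuming PyTorch 2.0 or later. The tensor sizes are arbitrary and chosen small enough to run on CPU; which backend PyTorch picks (flash, memory efficient, or plain math) depends on the device, dtype and build, so this only checks that the fused call matches a hand-written causal attention.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

batch, heads, seq_len, dim_head = 1, 8, 128, 64
q = torch.randn(batch, heads, seq_len, dim_head)
k = torch.randn(batch, heads, seq_len, dim_head)
v = torch.randn(batch, heads, seq_len, dim_head)

# Fused attention; inputs are laid out as (batch, heads, seq_len, dim_head).
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Hand-written reference that materializes the full (seq_len x seq_len) score matrix.
scores = (q @ k.transpose(-2, -1)) * dim_head ** -0.5
future = torch.ones(seq_len, seq_len).triu(1).bool()   # True above the diagonal
scores = scores.masked_fill(future, float("-inf"))
reference = scores.softmax(dim=-1) @ v
```

The `out` tensor has the same shape as `q`; splitting a `(batch, seq_len, dim)` input into heads and merging them back afterwards is left to the caller.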

Install

$ pip install memory-efficient-attention-pytorch

Usage

For an autoregressive language model

import torch
from memory_efficient_attention_pytorch import Attention

attn = Attention(
    dim = 512,
    dim_head = 64,                # dimension per head
    heads = 8,                    # number of attention heads
    causal = True,                # autoregressive or not
    memory_efficient = True,      # whether to use memory efficient attention (can be turned off to test against normal attention)
    q_bucket_size = 1024,         # bucket size along queries dimension
    k_bucket_size = 2048          # bucket size along key / values dimension
).cuda()

x = torch.randn(1, 65536, 512).cuda()
out = attn(x) # (1, 65536, 512)
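To get a feel for what the bucket sizes buy, here is a rough back-of-the-envelope calculation for the settings above. It is only a sketch: it counts the attention score matrix alone, assumes float32, and ignores activations, gradients and whatever the module allocates internally.

```python
seq_len = 65536
heads = 8
q_bucket_size = 1024
k_bucket_size = 2048
bytes_per_float = 4  # assumption: float32 scores

# Normal attention: one (seq_len x seq_len) score matrix per head.
full_scores_bytes = heads * seq_len * seq_len * bytes_per_float

# Bucketed attention: one (q_bucket_size x k_bucket_size) tile per head at a time.
tile_scores_bytes = heads * q_bucket_size * k_bucket_size * bytes_per_float

num_q_buckets = seq_len // q_bucket_size
num_k_buckets = seq_len // k_bucket_size

print(f"full score matrix: {full_scores_bytes / 2**30:.0f} GiB")
print(f"one tile of scores: {tile_scores_bytes / 2**20:.0f} MiB")
print(f"tiles to visit: {num_q_buckets} x {num_k_buckets}")
```

Under these assumptions the full score matrix would take 128 GiB, while a single tile takes 64 MiB. Larger buckets mean fewer tiles to loop over at the cost of more memory per tile.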

Cross attention

import torch
from memory_efficient_attention_pytorch import Attention

cross_attn = Attention(
    dim = 512,
    dim_head = 64,
    heads = 8,
    memory_efficient = True,
    q_bucket_size = 1024,
    k_bucket_size = 2048
).cuda()

x = torch.randn(1, 65536, 512).cuda()
context = torch.randn(1, 65536, 512).cuda()
mask = torch.ones(1, 65536).bool().cuda()

out = cross_attn(x, context = context, mask = mask) # (1, 65536, 512)
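The all-ones mask above lets every position through. For padded batches, a boolean mask can be built from the true sequence lengths, as in the sketch below. It assumes that `True` marks positions to attend to and that the mask indexes context positions, which is consistent with the example above but is not stated there; `lengths_to_mask` is a hypothetical helper, not part of the package.

```python
import torch

def lengths_to_mask(lengths, max_len):
    """Boolean mask of shape (batch, max_len); True where position < length."""
    positions = torch.arange(max_len)
    return positions[None, :] < lengths[:, None]

# Two sequences padded to length 6, with 5 and 3 real tokens respectively.
lengths = torch.tensor([5, 3])
mask = lengths_to_mask(lengths, max_len=6)
```

The resulting tensor can be moved to the GPU with `.cuda()` and passed as `mask` in the same way as in the example above.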

Citations

@misc{rabe2021selfattention,
    title   = {Self-attention Does Not Need $O(n^2)$ Memory},
    author  = {Markus N. Rabe and Charles Staats},
    year    = {2021},
    eprint  = {2112.05682},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

@misc{liu2021swin,
    title   = {Swin Transformer V2: Scaling Up Capacity and Resolution},
    author  = {Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo},
    year    = {2021},
    eprint  = {2111.09883},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

@article{Dao2022FlashAttentionFA,
    title   = {FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness},
    author  = {Tri Dao and Daniel Y. Fu and Stefano Ermon and Atri Rudra and Christopher R{\'e}},
    journal = {ArXiv},
    year    = {2022},
    volume  = {abs/2205.14135}
}

@article{dao2023flashattention2,
    title   = {Flash{A}ttention-2: Faster Attention with Better Parallelism and Work Partitioning},
    author  = {Dao, Tri},
    year    = {2023}
}

Additional information

Version: 0.1.6
Type: AI source code
Last updated: 2025-01-14
Size: 34.87MB
Source: GitHub
