memory efficient attention pytorch ดาวน์โหลด - memory efficient attention pytorch ดาวน์โหลดซอร์สโค้ด

memory efficient attention pytorch

โค้ดแหล่งที่มา AI

0.1.6

ดาวน์โหลด

Pytorch ความสนใจอย่างมีประสิทธิภาพของหน่วยความจำ (ล้าสมัย)

การใช้ความสนใจแบบหลายหัวที่มีประสิทธิภาพของหน่วยความจำตามที่เสนอในรายงาน การเอาใจใส่ตนเองไม่จำเป็นต้องมีหน่วยความจำ O(n²) นอกจากนี้ โมดูลจะดูแลเรื่องการมาสก์ การมาสก์เชิงสาเหตุ และความสนใจแบบข้าม

พื้นที่เก็บข้อมูลนี้ยังประกอบด้วยการนำการปรับปรุงที่ Tri Dao กระทำโดย Tri Dao อย่างไร้เดียงสาด้วยเอกสาร Flash Attention 2 ของเขาเพื่อวัตถุประสงค์ทางการศึกษา มันเป็นตัวเปลี่ยนเกมสำหรับความสนใจและสร้างหม้อแปลงบริบทยาว

อัปเดต: จากนี้ไป คุณควรใช้ฟังก์ชัน F.scaled_dot_product_attention ใน Pytorch 2.0 สำหรับการรองรับ Flash Attention v1 ในตัว - หรือใช้ Flash Attention v2 ที่พื้นที่เก็บข้อมูลอย่างเป็นทางการ

ติดตั้ง

$ pip install memory-efficient-attention-pytorch

การใช้งาน

สำหรับโมเดลภาษาแบบถอยหลังอัตโนมัติ

 import torch
from memory_efficient_attention_pytorch import Attention

attn = Attention (
    dim = 512 ,
    dim_head = 64 ,                # dimension per head
    heads = 8 ,                    # number of attention heads
    causal = True ,                # autoregressive or not
    memory_efficient = True ,      # whether to use memory efficient attention (can be turned off to test against normal attention)
    q_bucket_size = 1024 ,         # bucket size along queries dimension
    k_bucket_size = 2048          # bucket size along key / values dimension
). cuda ()

x = torch . randn ( 1 , 65536 , 512 ). cuda ()
out = attn ( x ) # (1, 65536, 512)

ข้ามความสนใจ

 import torch
from memory_efficient_attention_pytorch import Attention

cross_attn = Attention (
    dim = 512 ,
    dim_head = 64 ,
    heads = 8 ,
    memory_efficient = True ,
    q_bucket_size = 1024 ,
    k_bucket_size = 2048
). cuda ()

x = torch . randn ( 1 , 65536 , 512 ). cuda ()
context = torch . randn ( 1 , 65536 , 512 ). cuda ()
mask = torch . ones ( 1 , 65536 ). bool (). cuda ()

out = cross_attn ( x , context = context , mask = mask ) # (1, 65536, 512)

การอ้างอิง

 @misc { rabe2021selfattention ,
    title   = { Self-attention Does Not Need $O(n^2)$ Memory } , 
    author  = { Markus N. Rabe and Charles Staats } ,
    year    = { 2021 } ,
    eprint  = { 2112.05682 } ,
    archivePrefix = { arXiv } ,
    primaryClass = { cs.LG }
}

 @misc { liu2021swin ,
    title   = { Swin Transformer V2: Scaling Up Capacity and Resolution } ,
    author  = { Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo } ,
    year    = { 2021 } ,
    eprint  = { 2111.09883 } ,
    archivePrefix = { arXiv } ,
    primaryClass = { cs.CV }
}

 @article { Dao2022FlashAttentionFA ,
    title   = { FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness } ,
    author  = { Tri Dao and Daniel Y. Fu and Stefano Ermon and Atri Rudra and Christopher R'e } ,
    journal = { ArXiv } ,
    year    = { 2022 } ,
    volume  = { abs/2205.14135 }
}

 @article { dao2023flashattention2 ,
  title     = { Flash{A}ttention-2: Faster Attention with Better Parallelism and Work Partitioning,
  author    = {Dao, Tri},
  year      = {2023}
}

ขยาย

ข้อมูลเพิ่มเติม

เวอร์ชัน 0.1.6
ประเภท โค้ดแหล่งที่มา AI
เวลาอัปเดต 2025-01-14
ขนาด 34.87MB
มาจาก Github

แอปที่เกี่ยวข้อง

efficient language detector

2024-11-06
Parameter Efficient Transfer Learning Benchmark

2024-11-06
pytorch image models

2024-11-03
ความทรงจำในวันเมย์เดย์

2023-04-07
ความทรงจำที่สดใส: ไม่มีที่สิ้นสุด

2022-07-29
Memory Hall ระบบเว็บไซต์ส่วนตัวที่เรียบง่าย

2010-12-10

แนะนำสำหรับคุณ

chat.petals.dev

ซอร์สโค้ดอื่น ๆ

1.0.0
GPT Prompt Templates

ซอร์สโค้ดอื่น ๆ

1.0.0
GPTyped

ซอร์สโค้ดอื่น ๆ

GPTyped 1.0.5
node telegram bot api

โค้ดแหล่งที่มา AI

v0.50.0
typebot.io

โค้ดแหล่งที่มา AI

v3.1.2
python wechaty getting started

โค้ดแหล่งที่มา AI

1.0.0
waymo open dataset

ซอร์สโค้ดอื่น ๆ

December 2023 Update
termwind

หมวดหมู่อื่นๆ

v2.3.0
wp functions

หมวดหมู่อื่นๆ

1.0.0

ข้อมูลที่เกี่ยวข้อง ทั้งหมด