Download ChatRWKV - ChatRWKV source code

ChatRWKV

AI source code

1.0.0


ChatRWKV (pronounced "RwaKuv", rʌkuv in IPA, from its 4 major parameters: R, W, K, V)

RWKV homepage: https://www.rwkv.com

Please check https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py first.

ChatRWKV is like ChatGPT but powered by my RWKV (100% RNN) language model, which is the only RNN (as of now) that can match transformers in quality and scaling, while being faster and saving VRAM. Training sponsored by Stability and EleutherAI :)

Our latest version is RWKV-6 https://arxiv.org/abs/2404.05892 (preview models: https://huggingface.co/BlinkDL/temp )

RWKV-6 3B demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-1

RWKV-6 7B demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2

[Image: RWKV-v5 benchmark]

RWKV-LM main repo: https://github.com/BlinkDL/RWKV-LM (explanation, fine-tuning, training, etc.)

Chat demo for developers: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py

RWKV Discord: https://discord.gg/bDSBUMeFpc (7k+ members)

Twitter: https://twitter.com/BlinkDL_AI

Homepage: https://www.rwkv.com/

Cutting-edge RWKV weights: https://huggingface.co/BlinkDL

HF-compatible RWKV weights: https://huggingface.co/RWKV

Use v2/convert_model.py to convert a model for a given strategy, for faster loading and lower CPU RAM usage.

Note that RWKV_CUDA_ON will build a CUDA kernel (much faster and saves VRAM). Here is how to build it ("pip install ninja" first):

# How to build in Linux: set these and run v2/chat.py
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# How to build in win:
Install VS2022 build tools (https://aka.ms/vs/17/release/vs_BuildTools.exe select Desktop C++). Reinstall CUDA 11.7 (install VC++ extensions). Run v2/chat.py in "x64 native tools command prompt".

RWKV pip package: https://pypi.org/project/rwkv/ (please always check for the latest version and upgrade)

https://github.com/cgisky1980/ai00_rwkv_server Fastest GPU inference API with Vulkan (good for NVIDIA/AMD/Intel)

https://github.com/cryscan/web-rwkv Backend for ai00_rwkv_server

https://github.com/saharNooby/rwkv.cpp Fast CPU/cuBLAS/CLBlast inference: int4/int8/fp16/fp32

https://github.com/JL-er/RWKV-PEFT lora/pissa/Qlora/Qpissa/state tuning

https://github.com/RWKV/RWKV-infctx-trainer Infctx trainer

World demo script: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_WORLD.py

Raven Q&A demo script: https://github.com/BlinkDL/ChatRWKV/blob/main/v2/benchmark_more.py

[Image: ChatRWKV strategy]

RWKV in 150 lines (model, inference, text generation): https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_in_150_lines.py

RWKV v5 in 250 lines (with tokenizer too): https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_v5_demo.py

Building your own RWKV inference engine: begin with https://github.com/BlinkDL/ChatRWKV/blob/main/src/model_run.py which is easier to understand (used by https://github.com/BlinkDL/ChatRWKV/blob/main/chat.py).

RWKV preprint: https://arxiv.org/abs/2305.13048

[Image: RWKV paper]

RWKV v6 illustrated:

[Image: RWKV-v6]

Cool community RWKV projects:

https://github.com/saharNooby/rwkv.cpp Fast i4 i8 fp16 fp32 CPU inference using ggml

https://github.com/harrisonvanderbyl/rwkv-cpp-cuda Fast Windows/Linux and cuda/rocm/vulkan GPU inference (no need for Python and PyTorch)

https://github.com/Blealtan/RWKV-LM-LoRA LoRA fine-tuning

https://github.com/josStorer/RWKV-Runner Cool GUI

More RWKV projects: https://github.com/search?o=desc&q=rwkv&s=updated&type=Repositories

ChatRWKV v2: with "stream" and "split" strategies, and INT8. 3G VRAM is enough to run RWKV 14B :) https://github.com/BlinkDL/ChatRWKV/tree/main/v2
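A rough back-of-the-envelope check helps put the 3G VRAM figure in context. The sketch below counts weights only; the helper name `weight_gib` and the round 14e9 parameter count are illustrative assumptions, and activations, RNN state, and framework overhead are ignored.

```python
# Rough weight-memory estimate; counts parameters only (assumption: no
# activations, RNN state, or framework overhead are included).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

def weight_gib(n_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30

n_params = 14e9  # nominal size of a "14B" model (assumed round number)
for dtype in ("fp32", "fp16", "int8"):
    print(f"{dtype}: {weight_gib(n_params, dtype):.1f} GiB")

# Fraction of the INT8 weights that fits in a 3 GiB budget at any one time.
resident_fraction = 3 / weight_gib(n_params, "int8")
print(f"resident fraction at 3 GiB: {resident_fraction:.0%}")
```

Under these assumptions the INT8 weights alone are far larger than 3 GiB, so only a fraction of the layers can sit on the GPU at once. That is presumably where the "stream" strategy comes in, with the remaining layers moved to the GPU as they are needed.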

import os
# Set these before importing rwkv
os.environ["RWKV_JIT_ON"] = '1'
os.environ["RWKV_CUDA_ON"] = '0'  # if '1' then use CUDA kernel for seq mode (much faster)
from rwkv.model import RWKV  # pip install rwkv
model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-1b5/RWKV-4-Pile-1B5-20220903-8040', strategy='cuda fp16')

out, state = model.forward([187, 510, 1563, 310, 247], None)  # use 20B_tokenizer.json
print(out.detach().cpu().numpy())  # get logits
out, state = model.forward([187, 510], None)
out, state = model.forward([1563], state)  # RNN has state (use deepcopy if you want to clone it)
out, state = model.forward([310, 247], state)
print(out.detach().cpu().numpy())  # same result as above
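The state-passing pattern above can be illustrated without downloading a model. The sketch below uses a toy stand-in (`ToyRNN` is hypothetical and is not RWKV) whose `forward(tokens, state)` has the same call shape. It shows that feeding tokens in chunks gives the same output as feeding them all at once, and that `copy.deepcopy` lets you branch from a saved state.

```python
import copy

class ToyRNN:
    """Hypothetical stand-in with the same call shape as model.forward()."""

    def forward(self, tokens, state):
        # The state is a one-element list so that it is mutable in place.
        state = [0.0] if state is None else state
        for t in tokens:
            # Simple recurrence: decay the old state, mix in the new token.
            state[0] = 0.5 * state[0] + t
        out = state[0]  # stands in for the logits of the last token
        return out, state

model = ToyRNN()
tokens = [187, 510, 1563, 310, 247]

# All tokens at once.
out_full, _ = model.forward(tokens, None)

# Same tokens in three chunks, carrying the state forward.
out, state = model.forward(tokens[:2], None)
out, state = model.forward(tokens[2:3], state)
saved = copy.deepcopy(state)  # snapshot before continuing
out_chunked, state = model.forward(tokens[3:], state)

# Branch from the snapshot; forward() mutated `state` but not `saved`.
out_branch, _ = model.forward(tokens[3:], copy.deepcopy(saved))

print(out_full == out_chunked == out_branch)
```

The toy mutates its state in place on purpose, so that continuing without a copy would overwrite the snapshot.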

[Image: RWKV eval]

Here is https://huggingface.co/BlinkDL/rwkv-4-raven/blob/main/RWKV-4-Raven-14B-v7-Eng-20230404-ctx4096.pth in action: [Image: ChatRWKV]

When you build an RWKV chatbot, always check the text corresponding to the state, in order to prevent bugs.

Never call raw forward() directly. Instead, put it in a function that records the text corresponding to the state.
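One way to follow this advice is a thin wrapper that appends to a text log every time it advances the state. In the sketch below, `TrackedState`, `feed`, `snapshot`, `restore`, and the `encode` callable are hypothetical names, and `EchoModel` is a stub that stands in for a real model so the example runs on its own.

```python
import copy

class TrackedState:
    """Keeps the RNN state together with the exact text that produced it."""

    def __init__(self, model, encode):
        self.model = model    # any object with forward(tokens, state)
        self.encode = encode  # text -> list of token ids (tokenizer assumed)
        self.state = None
        self.text = ""        # everything fed so far, in order
        self.out = None

    def feed(self, text):
        """Advance the state with `text` and record it."""
        self.out, self.state = self.model.forward(self.encode(text), self.state)
        self.text += text
        return self.out

    def snapshot(self):
        """Return an independent (text, state) pair for later rollback."""
        return self.text, copy.deepcopy(self.state)

    def restore(self, snap):
        text, state = snap
        self.text, self.state = text, copy.deepcopy(state)


class EchoModel:
    """Stub model: the state is just the list of tokens seen so far."""

    def forward(self, tokens, state):
        state = [] if state is None else state
        state.extend(tokens)
        return tokens[-1], state


session = TrackedState(EchoModel(), encode=lambda s: [ord(c) for c in s])
session.feed("Bob: hi\n\n")
snap = session.snapshot()
session.feed("Alice:")
print(repr(session.text))  # text that matches the current state
session.restore(snap)
print(repr(session.text))  # back to the text at the snapshot
```

Because the text and the state only change together inside `feed` and `restore`, printing `session.text` at any point shows exactly what the model has seen.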

(For v4-raven models, use Bob/Alice. For v4/v5/v6-world models, use User/Assistant.)

The best chat format (check whether your text is in this format): Bob: xxxxxxxxxxxxxxxxxx\n\nAlice: xxxxxxxxxxxxx\n\nBob: xxxxxxxxxxxxxxxx\n\nAlice:

There should not be any space after the final "Alice:". The generation result will have a space at the beginning, and you can simply strip it.
You can use \n in xxxxx, but avoid \n\n. So simply do xxxxx = xxxxx.strip().replace('\r\n','\n').replace('\n\n','\n')
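The formatting rules above can be collected into a small helper. This is a minimal sketch; `clean` and `build_prompt` are hypothetical names and are not part of the rwkv package.

```python
def clean(msg: str) -> str:
    """Normalize one message so it never contains a blank line."""
    msg = msg.strip().replace("\r\n", "\n")
    # Repeat the replacement so runs of 3+ newlines also collapse to one
    # (a small extension of the single replace() in the text above).
    while "\n\n" in msg:
        msg = msg.replace("\n\n", "\n")
    return msg

def build_prompt(turns, bot="Alice"):
    """turns: list of (speaker, message); returns a prompt ending in 'bot:'."""
    parts = [f"{speaker}: {clean(msg)}" for speaker, msg in turns]
    parts.append(f"{bot}:")  # no trailing space after the final name
    return "\n\n".join(parts)

turns = [
    ("Bob", "  Hello!\r\n\r\nHow are you?  "),
    ("Alice", "I am fine."),
    ("Bob", "Tell me a joke."),
]
prompt = build_prompt(turns)
print(repr(prompt))

# The generated reply is expected to start with a space; strip it.
reply = " Why did the RNN cross the road?"  # made-up reply for illustration
print(reply.strip())
```

For the world models, the same helper would be called with "User" as the speaker name in `turns` and `bot="Assistant"`.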

If you are building your own RWKV inference engine, begin with https://github.com/BlinkDL/ChatRWKV/blob/main/src/model_run.py which is easier to understand (used by https://github.com/BlinkDL/ChatRWKV/blob/main/chat.py).

The latest "Raven"-series Alpaca-style-tuned RWKV 14B and 7B models are very good (almost ChatGPT-like, and good at multi-round chat too). Download: https://huggingface.co/BlinkDL/rwkv-4-raven

Results from previous, older models: [Image: ChatRWKV]