This project aims to eliminate the barriers to using large language models by automating everything for you. All you need is a lightweight executable of just a few megabytes. The project also provides an interface compatible with the OpenAI API, which means that every ChatGPT client is an RWKV client.
You can deploy backend-python on a server and use this program as a client only. Fill in your server address in the API URL field on the Settings page.
If you are deploying a public service, limit the request size at the API gateway to prevent excessive resource usage from overly long prompts. Also restrict the upper limit of max_tokens in requests to suit your situation: https://github.com/josStorer/RWKV-Runner/blob/master/backend-python/utils/rwkv.py#L567. The default is le=102400, which in extreme cases can make a single response consume significant resources.
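As a rough illustration of the two limits described above, the following sketch validates a request body before it is forwarded to the backend. It is not part of RWKV-Runner: the function name check_request and both limit values are hypothetical and should be tuned to your hardware and gateway.

```python
import json

# Hypothetical limits for illustration only; tune them for your deployment.
MAX_BODY_BYTES = 16 * 1024   # reject overly long prompts
MAX_TOKENS_LIMIT = 1024      # far below the backend default of le=102400


def check_request(raw_body: bytes) -> dict:
    """Validate a JSON request body and clamp max_tokens.

    Raises ValueError if the body is too large, is not valid JSON,
    or is not a JSON object.
    """
    if len(raw_body) > MAX_BODY_BYTES:
        raise ValueError("request body too large")
    body = json.loads(raw_body)
    if not isinstance(body, dict):
        raise ValueError("request body must be a JSON object")
    # Clamp max_tokens only when the client supplied one
    requested = body.get("max_tokens")
    if requested is not None:
        body["max_tokens"] = min(int(requested), MAX_TOKENS_LIMIT)
    return body
```

A gateway or reverse proxy would typically enforce the size limit itself; the clamp on max_tokens is the part that needs to understand the request body.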
The default configs enable custom CUDA kernel acceleration, which is much faster and uses much less VRAM. If you encounter compatibility issues (such as garbled output), go to the Configs page and turn off Use Custom CUDA kernel to Accelerate, or try upgrading your GPU driver.
If Windows Defender claims this is a virus, you can try downloading v1.3.7_win.zip and letting it update automatically to the latest version, or add it to the trusted list (Windows Security -> Virus & threat protection -> Manage settings -> Exclusions -> Add or remove exclusions -> Add an exclusion -> Folder -> RWKV-Runner).
For different tasks, adjusting API parameters can achieve better results. For example, for translation tasks, you can try setting Temperature to 1 and Top_P to 0.3.
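A minimal sketch of such a request, assuming the backend accepts the OpenAI-style field names temperature and top_p on its chat completions endpoint. The prompt wording and the helper names build_translation_payload and send are made up for illustration.

```python
import json
import urllib.request

# Default local backend address; adjust host and port for your deployment.
API_URL = "http://127.0.0.1:8000/chat/completions"


def build_translation_payload(text: str) -> dict:
    """Build a chat request using the suggested translation settings."""
    return {
        "messages": [
            # Hypothetical prompt wording; use whatever suits your model
            {"role": "user", "content": f"Translate into English: {text}"}
        ],
        "temperature": 1,
        "top_p": 0.3,
    }


def send(payload: dict, url: str = API_URL) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_translation_payload("...")) requires a running backend with a model already loaded.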
git clone https://github.com/josStorer/RWKV-Runner

# Then start the backend inference service. Request the /switch-model API to load a model;
# refer to the API documentation: http://127.0.0.1:8000/docs
cd RWKV-Runner
python ./backend-python/main.py

# Or
cd RWKV-Runner/frontend
npm ci
npm run build # Compile the frontend
cd ..
python ./backend-python/webui_server.py # Start the frontend service separately

# Or
python ./backend-python/main.py --webui # Start the frontend and backend services at the same time

# Help info
python ./backend-python/main.py -h
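Once the backend is running, a model still has to be loaded through /switch-model. The sketch below only prepares such a request. The body field names "model" and "strategy" are assumptions, as is the helper name build_switch_model_request; confirm the actual schema in the API documentation at http://127.0.0.1:8000/docs before relying on it.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"


def build_switch_model_request(model_path: str, strategy: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to /switch-model."""
    # ASSUMPTION: the body fields are named "model" and "strategy";
    # check the schema shown at http://127.0.0.1:8000/docs.
    body = {"model": model_path, "strategy": strategy}
    return urllib.request.Request(
        f"{BASE_URL}/switch-model",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Pass the returned object to urllib.request.urlopen to send it. The model path and strategy string you supply are specific to your installation.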
ab -p body.json -T application/json -c 20 -n 100 -l http://127.0.0.1:8000/chat/completions
body.json:
{
  "messages": [
    {
      "role": "user",
      "content": "Hello"
    }
  ]
}
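If ab is not available, roughly the same load can be generated from Python. This is a simplified sketch, not a replacement for ab: it sends the same body with 20 workers and 100 requests but does not measure latency or throughput. The names post_once and run_load are hypothetical.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://127.0.0.1:8000/chat/completions"
BODY = {"messages": [{"role": "user", "content": "Hello"}]}


def post_once(_index: int) -> int:
    """Send one request and return the HTTP status code.

    urlopen raises an exception for HTTP error statuses, so a returned
    code means the request succeeded.
    """
    request = urllib.request.Request(
        URL,
        data=json.dumps(BODY).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def run_load(total: int = 100, concurrency: int = 20, send=post_once) -> list:
    """Issue `total` requests using at most `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(send, range(total)))
```

Calling run_load() with its defaults mirrors the -c 20 -n 100 options above; the send parameter exists so the function can be exercised without a live server.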
Note: v1.4.0 improved the quality of the embeddings API. The generated results are not compatible with previous versions. If you use the embeddings API to build knowledge bases or similar, please regenerate them.
If you are using langchain, just use OpenAIEmbeddings(openai_api_base="http://127.0.0.1:8000", openai_api_key="sk-")
import numpy as np
import requests


def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


values = [
    "I am a girl",
    "我是个女孩",
    "私は女の子です",
    "广东人爱吃福建人",
    "我是个人类",
    "I am a human",
    "that dog is so cute",
    "私はねこむすめです、にゃん♪",
    "宇宙级特大事件!号外号外!"
]

# Request an embedding for each sentence from the local backend
embeddings = []
for v in values:
    r = requests.post("http://127.0.0.1:8000/embeddings", json={"input": v})
    embedding = r.json()["data"][0]["embedding"]
    embeddings.append(embedding)

# Compare every sentence against the first one
compared_embedding = embeddings[0]
embeddings_cos_sim = [cosine_similarity(compared_embedding, e) for e in embeddings]

# Print the sentences from most to least similar
for i in np.argsort(embeddings_cos_sim)[::-1]:
    print(f"{embeddings_cos_sim[i]:.10f} - {values[i]}")
Tip: You can download https://github.com/josStorer/sgm_plus and unzip it to the program's assets/sound-font directory to use it as an offline sound source. Please note that if you are compiling the program from source code, do not place it in the source code directory.
If you don't have a MIDI keyboard, you can use virtual MIDI input software like Virtual Midi Controller 3 LE, along with loopMIDI, to use a regular computer keyboard as MIDI input.