A one-click LLM solution for Gensokyo and OneBotV11
Beginner - Simple bot access: connect the bot to 6 major platforms, including QQ
Intermediate - one-api integration tutorial: visual management of LLM APIs
Intermediate - LLM API configuration examples - Chinese providers
Intermediate - LLM API configuration examples - international providers
Ready to use - quick deployment on Telegram
Ready to use - quick deployment on Discord
Ready to use - quick deployment on Kook
Simple - SillyTavern - Hunyuan
Simple - SillyTavern - Doubao
Supports all OneBotV11-standard frameworks. Supports HTTP API and reverse WS, supports streaming, and supports multiple configuration files (multiple sets of prompts).
Ultra-small footprint, with built-in SQLite for context maintenance; proxy supported.
One-click connection to the Gensokyo framework: only the reverse HTTP address for receiving messages and the forward HTTP address for calling the send API need to be configured.
Context is maintained automatically in an SQLite database; in conversation mode, use the reset command to clear it.
System prompt, role card, and context length are all configurable.
An OpenAI-flavored API with automatic context is also exposed externally (the classic three parameters: id, parent id, message).
It can run as a plain API, or connect with one click to QQ Channel bots on the QQ bot open platform.
Can convert GPT-style SSE output, sending either incremental SSE or only the newly generated part.
Memory-safe SSE in concurrent environments: simultaneous bidirectional SSE transmission is maintained for multiple users (see the sketch below).
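A minimal sketch of this idea (an illustration, not the project's actual implementation): per-user SSE channels guarded by a mutex, so concurrent registrations and sends never race.

```go
package ssehub

import "sync"

// sseHub is a hypothetical sketch of memory-safe per-user SSE fan-out:
// each user gets a dedicated channel, and the map is guarded by an RWMutex
// so concurrent registrations and sends never race.
type sseHub struct {
	mu      sync.RWMutex
	clients map[string]chan string // userID -> outbound SSE event stream
}

func newSSEHub() *sseHub {
	return &sseHub{clients: make(map[string]chan string)}
}

// Register creates (or replaces) the stream for a user.
func (h *sseHub) Register(userID string) chan string {
	h.mu.Lock()
	defer h.mu.Unlock()
	ch := make(chan string, 16) // buffered so one slow reader doesn't block senders
	h.clients[userID] = ch
	return ch
}

// Send delivers an event to one user without blocking the others.
func (h *sseHub) Send(userID, event string) {
	h.mu.RLock()
	ch, ok := h.clients[userID]
	h.mu.RUnlock()
	if ok {
		select {
		case ch <- event:
		default: // drop if this user's buffer is full rather than stall everyone
		}
	}
}
```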
Tencent Hunyuan
Tencent Yuanqi
Baidu Wenxin (ERNIE)
Alibaba Tongyi Qianwen
Zhipu AI (ChatGLM)
ByteDance Volcano Engine (Doubao)
OpenAI
Groq
RWKV Runner
One-API
Converts the APIs of these platforms into a unified API structure, provides context, and supports SSE responses.
By setting each platform's token in the yml and setting AllApi=true, you can switch between and call them simultaneously, as sketched below.
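A hedged illustration of such a config.yml fragment: allApi appears elsewhere in this README, but the per-platform token key names below are assumptions, not the project's confirmed schema.

```yaml
# Illustrative only: the token key names below are hypothetical.
allApi: true
# hunyuanSecretKey: "..."   # Tencent Hunyuan credentials
# wenxinAccessToken: "..."  # Baidu Wenxin credentials
# glmApiKey: "..."          # Zhipu GLM credentials
```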
Gensokyo Framework - QQ Open Platform
Gensokyo Framework - Discord
Gensokyo Framework - Kook
Gensokyo Framework - WeChat Official Account / Subscription Account
Gensokyo Framework - Telegram
All OneBotV11 implementations
Multiple layers of security measures to keep developers and llm applications as safe as possible.
Multiple rounds of simulated QA can be set to reinforce the role prompt; reset replies and safe-word replies can be customized. The first layer of security.
Supports interconnecting multiple gsk-llm instances to form ai-agent applications, for example one llm organizing prompts for another, or auditing its prompts. The second layer of security.
A vector safe-word list: a sensitive-word interception list based on vector similarity, applied before text replacement. The third layer of security.
Ultra-efficient text IN-OUT replacement rules implemented with the Aho-Corasick algorithm can replace a large number of keywords with their corresponding new keywords (see the sketch after this list). The fourth layer of security.
The result can additionally be passed through the Baidu or Tencent text moderation APIs. The fifth layer of security.
Logs are fully recorded, and the command-line parameter -test quickly runs a security self-test script from test.txt.
The command-line parameter -mlog formats all currently stored logs as QA pairs; review them daily, extract new security rules from real scenarios, and continuously improve security. The sixth layer of security.
Language filtering lets the llm accept only the specified language, automatically converting Traditional Chinese to Simplified Chinese, so security rules apply in the domain you know best. The seventh layer of security.
A prompt length limit controls security in the most basic way, preventing malicious users from constructing long prompts. The eighth layer of security.
Through these methods, build an llm conversational bot that is as safe as possible.
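As referenced in the fourth layer above, here is a minimal sketch of multi-keyword IN-OUT replacement. It uses Go's standard strings.Replacer, which performs a single-pass, trie-based scan in the same spirit as Aho-Corasick; the project itself implements the Aho-Corasick algorithm directly, and the keyword pairs below are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// old/new keyword pairs; strings.Replacer substitutes in a single pass
	// over the input, so replaced text is never re-scanned.
	replacer := strings.NewReplacer(
		"forbiddenA", "safeA",
		"forbiddenB", "safeB",
	)
	fmt.Println(replacer.Replace("user text containing forbiddenA and forbiddenB"))
	// Output: user text containing safeA and safeB
}
```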
Text IN-OUT double-layer replacement lets you dynamically replace and modify internal prompts yourself, which is safer and more powerful.
A vector data table structure designed on SQLite enables caching to save money, with customizable cache hit rate and accuracy (see the sketch below).
A specialized application optimized for high-efficiency, high-performance, high-QPS scenarios, with no redundant functions or commands, designed entirely around digital humans.
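A minimal sketch of the cache-hit decision described above (assumed logic, not the project's exact code): compare the incoming question's embedding with cached question vectors by cosine similarity and reuse the cached answer when the best score clears a configurable threshold. Raising the threshold raises accuracy but lowers the hit rate.

```go
package vectorcache

import "math"

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Lookup scans cached (vector, answer) rows, e.g. loaded from the SQLite
// vector table, and returns the answer whose question vector is most
// similar to the query, if that similarity clears the threshold.
func Lookup(query []float64, vectors [][]float64, answers []string, threshold float64) (string, bool) {
	bestScore, bestIdx := -1.0, -1
	for i, v := range vectors {
		if s := cosine(query, v); s > bestScore {
			bestScore, bestIdx = s, i
		}
	}
	if bestIdx >= 0 && bestScore >= threshold {
		return answers[bestIdx], true
	}
	return "", false
}
```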
Run the gensokyo-llm executable from the command line.
Configure config.yml and start it; it then listens on the configured port and serves the /conversation API.
Supports middleware development: between the Gensokyo framework layer and gsk-llm's HTTP request, middleware can implement vector expansion, database expansion, and dynamic modification of the user's question (a sketch follows below).
Supports reverse WS connections and simultaneous connections to multiple OneBotV11 HTTP APIs.
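A minimal sketch of such a middleware (the route, port, and downstream URL are assumptions for illustration): it receives the framework's request, rewrites the user's question, and forwards it to gsk-llm.

```go
package main

import (
	"bytes"
	"io"
	"log"
	"net/http"
)

// gskLLMAddr is an assumed address for a gsk-llm /gensokyo endpoint.
const gskLLMAddr = "http://127.0.0.1:46230/gensokyo?prompt=example"

func main() {
	http.HandleFunc("/middleware", func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// A real middleware could expand the question here with vector
		// retrieval or database lookups before forwarding it.
		modified := bytes.ReplaceAll(body, []byte("original"), []byte("rewritten"))

		resp, err := http.Post(gskLLMAddr, "application/json", bytes.NewReader(modified))
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		io.Copy(w, resp.Body) // relay gsk-llm's reply back to the caller
	})
	log.Fatal(http.ListenAndServe(":9000", nil))
}
```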
This document describes the API calling methods and the configuration file format, to help users use and configure the service correctly.
The `/conversation` and `/gensokyo` endpoints of this system support specifying a configuration via the query parameter `?prompt=xxx`. The `prompt` parameter selects a YAML configuration file located in the `prompts` folder next to the executable (exe), allowing API behavior and responses to be adjusted dynamically.
The prompts folder must contain a default keyboard.yml for generating bubbles; its system prompt must follow the rules of the JSON bubble generator.
Configuration files must follow the YAML format below. The example configuration file here shows how to define dialogue content for different roles:
```yaml
Prompt:
  - role: "system"
    content: "Welcome to the system. How can I assist you today?"
  - role: "user"
    content: "I need help with my account."
  - role: "assistant"
    content: "I can help you with that. What seems to be the problem?"
  - role: "user"
    content: "aaaaaaaaaa!"
  - role: "assistant"
    content: "ooooooooo?"
settings:
  # The following are general settings, identical to config.yml
  useSse: true
  port: 46233
```
/gensokyo endpoint
The system supports additional `prompt` and `api` parameters when making requests to the `/gensokyo` endpoint. The `api` parameter allows specifying a full endpoint such as `conversation_ernie`. To enable this feature, turn on the `allApi` option in the configuration.
Example request:
```
GET /gensokyo?prompt=example&api=conversation_ernie
```
List of supported endpoints (requires configuration `allApi: true`):
```go
http.HandleFunc("/conversation_gpt", app.ChatHandlerChatgpt)
http.HandleFunc("/conversation_hunyuan", app.ChatHandlerHunyuan)
http.HandleFunc("/conversation_ernie", app.ChatHandlerErnie)
http.HandleFunc("/conversation_rwkv", app.ChatHandlerRwkv)
http.HandleFunc("/conversation_tyqw", app.ChatHandlerTyqw)
http.HandleFunc("/conversation_glm", app.ChatHandlerGlm)
```
/conversation endpoint
Like `/gensokyo`, the `/conversation` endpoint supports an additional `prompt` parameter.
Example request:
```
GET /conversation?prompt=example
```
prompt parameter analysis
The given `prompt` parameter references the corresponding YAML file in the `prompts` folder of the executable's directory (for example `xxxx.yml`, where `xxxx` is the value of the `prompt` parameter).
By writing many prompt yml files, you can switch character cards; within the same character, you can switch storylines and scenes.
For the YAML file's configuration format, see the YAML configuration file format section. The configuration items listed below support dynamic overriding per request:
Every parameter supports configuration override.
If any are missing and need override support, please submit an issue.
All boolean values must be specified explicitly in the overriding yml file; otherwise they are treated as false.
Dynamic configuration override is a feature I conceived myself. With it, you can achieve recursion between configuration files: for example, pass prompt=a in your middleware, specify lotus in a.yml so that it calls itself, and set the next prompt parameter in the lotus address to b; b then specifies c, c specifies d, and so on.
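A hedged illustration of that recursion; the lotus key's exact semantics are inferred from the description above, and the file names are hypothetical:

```yaml
# a.yml (hypothetical): lotus points back at this instance,
# with the next hop's prompt parameter set to b.
settings:
  lotus: "http://127.0.0.1:46230/gensokyo?prompt=b"
# b.yml would point on to c, c to d, and so on.
```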
This project implements prompt control flow and a controllable way of constructing context. Based on this project's multiple configuration files, conditional jumps and switches between configuration files can be realized.
Users can cycle among multiple sets of prompts according to conditions, in order or selectively, to realize text romance games, adventure games, non-linear multi-branch storylines, and similar works: a streaming prompt system.
- [x] promptMarks:
  - BranchName: "去逛街路上"
    Keywords: ["坐车", "走路", "触发"]
  - BranchName: "在家准备"
    Keywords: ["等一下", "慢慢", "准备"]
- [x] enhancedQA: true
- [x] promptChoicesQ:
  - Round: 1
    ReplaceText: "回家吧"
    Keywords: ["我累了", "不想去了"]
  - Round: 2
    ReplaceText: "我们打车去"
    Keywords: ["快点去", "想去", "早点"]
  - Round: 3
    ReplaceText: "我们走着去"
    Keywords: ["不着急", "等下"]
  - Round: 1
    ReplaceText: "放松一下"
    Keywords: [] # equivalent to enhancedChoices = false
- [x] promptChoicesA: same as above.
- [x] promptCoverQ: Q only, no A; same format as above. Choices entries are appended, cover entries overwrite.
- [x] promptCoverA: # same as above
- [x] switchOnQ:
  - round: 1
    switch: ["故事退出分支", "下一个分支"]
    keywords: ["不想", "累了", "想", "不累"]
- [x] switchOnA:
  - round: 1
    switch: ["晚上分支"]
    keywords: ["时间不早了"]
- [x] exitOnQ:
  - round: 1
    keywords: ["退出", "忘了吧", "重置", "无聊"]
- [x] exitOnA:
  - round: 1
    keywords: ["退出", "我是一个AI", "我是一个人工", "我是一个基于"]
- [x] envType: 0 # 0 = no scene description; 1 = send the scene description before this round's llm reply; 2 = send it after. Scene descriptions support [image:xxx][pic:xxx][图片:xxx][背景:xxx] tags, where xxx is a relative or absolute path under the exe's working directory
- [x] envPics: [] # AI is still too slow at this stage, so these are specified manually; the array holds multiple entries, each starting with 1:, 2:, etc. to indicate which round it belongs to
- [x] envContents: [] # to skip a round, give the text entry as just "2:" (and likewise "2:" for the image), meaning that round's text and image are empty
- [x] promptChanceQ:
  - probability: 50
    text: "让我们休息一下"
  - probability: 30
    text: "继续前进"
  - probability: 70
    text: "停下来看看周围"
  - probability: 10
    text: "尝试一些新东西"
The above parameters all live in the settings section of a multi-configuration file. You can decide the prompt length of each scene, and each scene's promptMarksLength, to control the granularity of the plot.
Story mode trigger method 1: middleware control. Call the /gensokyo endpoint yourself with different prompt parameters, and switch branches manually.
Set the HTTP API address of the OB11 bot framework in gsk-llm. The OB11 plugin application does not send messages itself; it only makes conditional judgments on message content, acting as a control middleware and leaving the control conditions to the developer.
Story mode trigger method 2: configure switchOnQ and switchOnA in the default configuration file config.yml to switch branches automatically based on keywords.
Combined with the configuration file's ability to advance the story via the prompt parameter, a basic prompt-centric AI storyline can be realized. In addition, a matching -keyboard.yml must be designed for each prompt.yml to generate bubbles (an example layout follows).
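For example, following the naming convention above (the story names reuse the BranchName examples from the promptMarks section), the prompts folder might look like this:

```
prompts/
├── keyboard.yml            # default bubble generator
├── 去逛街路上.yml           # a story-branch prompt file
├── 去逛街路上-keyboard.yml  # its paired bubble config
├── 在家准备.yml
└── 在家准备-keyboard.yml
```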
When a promptMarks entry's keywords are [], the prompt file is switched once promptMarksLength is reached; promptMarksLength represents the context length maintained by this prompt file.
When promptMarksLength is less than 0, the subsequent branches are read from promptMarks and one is switched to at random.
When promptMarkType=1, triggering is condition-based, and also fires when promptMarksLength is reached.
For configuration details, see process control-promptmarks.md
This branch is triggered when the user or the model says the mark (you must write the prompt yourself so the llm can say it under the agreed conditions).
You can use the current story fragment's system prompt and QA to guide the AI into outputting the switch words you agreed on, designing multiple trigger words for each target branch and letting the large model decide the story's direction on its own.
When enhancedQA is false, the predefined QA in the configuration file is added above the user's QA and persists in the llm's memory (without affecting the overall dialogue direction), exerting a weak influence.
When enhancedQA is true, I tried moving the predefined QA from the top of the context down to just before the user's current conversation, but the effect was not ideal.
Currently it is blended and merged with the user's recent history QA, guiding the user's input to a certain extent and thereby influencing the direction of the story.
The "configuration control flow" parameter is introduced, which is a method that is less flexible than ai-agent, but has higher plot controllability, lower generation speed and lower cost.
promptChoicesQ & promptChoicesA Documentation: Process Control-promptchoicesQ Process Control-promptCoverQ Process Control-promptChanceQ
switchOnQ switches the current branch when matching text is found in Q; switchOnA does the same for A. Their configuration format is the same as promptChoices.
Process control-switchonQA
exitOnQ exits the current branch when one of the specified keywords is detected. Process control-exitonQA
promptMarks, switchOnQ, and switchOnA are functionally the same: they all jump to a branch based on keywords. promptMarks runs first, without distinguishing rounds or Q/A; switchOnQ and switchOnA are more specific, distinguishing Q from A and distinguishing rounds, enabling precise jumps.
If a fixed branch never needs to be switched away from, set that yml's promptMarksLength to 99999:
```yaml
promptMarksLength: 99999
```
This avoids accidentally switching to a nonexistent branch, which would corrupt the session.
Configuration control flow is simple and intuitive: dialogue logic is managed through configuration files that are easy to maintain. Non-technical staff, such as plot writers, can learn the configuration rules and update the dialogue logic directly, without any programming knowledge.
High plot determinism: given the same input and configuration, the plot direction is generally consistent, which is essential for coherent and predictable dialogue plots.
Low cost: context is cleverly combined and replaced rather than processed by multiple AIs at once, consuming roughly the same number of tokens as a normal conversation.
Fast: results are generated like normal dialogue QA, and plots are written like game scripts.
A low-cost AI story and novel solution suited to individual developers and small teams: low cost, high speed, high controllability. Its quality improves directly as models and prompts improve.
For dialogue-driven plot scenarios where the plot is relatively fixed, dialogue paths are preset, and updates are infrequent, configuration control flow is the better fit: it offers a high degree of controllability and an easy-to-understand management method.
If the dialogue system requires a high degree of interactivity and personalization, or the plot changes are complex and must adapt dynamically to users' specific feedback and behavior, an AI-based agent solution may be more appropriate, at the price of higher technical investment and maintenance costs.
This section describes the specific endpoint information for communicating with the API.
| Property | Details |
|---|---|
| URL | http://localhost:46230/conversation |
| Method | POST |
The request body that the client should send to the server must be in JSON format. The following table details the data type and description of each field.
| Field | Type | Description |
|---|---|---|
| message | String | The content of the message sent by the user |
| conversationId | String | Unique identifier of the current conversation session |
| parentMessageId | String | Identifier of the previous message associated with this one |
The following JSON object shows the structure of the request body when sending a request to this API endpoint:
```json
{
    "message": "我第一句话说的什么",
    "conversationId": "07710821-ad06-408c-ba60-1a69bf3ca92a",
    "parentMessageId": "73b144d2-a41f-4aeb-b3bb-8624f0e54ba6"
}
```
This example shows how to construct a request body containing the message content, the unique identifier of the current conversation session, and the identifier of the previous message. This format both satisfies the server's processing rules and helps maintain a consistent conversation context. A client sketch follows below.
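A minimal Go client sketch for this endpoint; the URL and field names come from the tables above, and the identifier values reuse the example request:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// conversationRequest mirrors the request-body fields documented above.
type conversationRequest struct {
	Message         string `json:"message"`
	ConversationID  string `json:"conversationId"`
	ParentMessageID string `json:"parentMessageId"`
}

func main() {
	reqBody, err := json.Marshal(conversationRequest{
		Message:         "我第一句话说的什么",
		ConversationID:  "07710821-ad06-408c-ba60-1a69bf3ca92a",
		ParentMessageID: "73b144d2-a41f-4aeb-b3bb-8624f0e54ba6",
	})
	if err != nil {
		log.Fatal(err)
	}
	resp, err := http.Post("http://localhost:46230/conversation",
		"application/json", bytes.NewReader(reqBody))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Decode the response fields documented in the table below.
	var result map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		log.Fatal(err)
	}
	fmt.Println(result["response"], result["messageId"])
}
```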
A successful response returns status code `200` and a JSON object with the following fields:
| Field | Type | Description |
|---|---|---|
| response | String | The interface's response message content |
| conversationId | String | Unique identifier of the current conversation |
| messageId | String | Unique identifier of the current message |
| details | Object | Additional usage details |
| usage | Object (within details) | Usage details, such as token counts |
```json
{
    "response": "回答内容",
    "conversationId": "c9b8746d-aa8c-44b3-804a-bb5ad27f5b84",
    "messageId": "36cc9422-da58-47ec-a25e-e8b8eceb47f5",
    "details": {
        "usage": {
            "prompt_tokens": 88,
            "completion_tokens": 2
        }
    }
}
```
Runs on a wide range of architectures (native Android is not yet supported; sqlitev3 requires cgo). Because cgo cross-compilation is complicated, for ARM or other architectures you can try compiling natively on the corresponding system architecture.
Call it as an API, or connect directly to QQ Channel.
Auditor request parameters
When a request is sent to another gsk-llm instance acting as an auditor, the JSON format that should be returned is as follows:
```json
{"result": %s}
```
Here `%s` is a placeholder that will be replaced with a concrete floating-point value.
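On the caller's side, parsing that auditor reply might look like this minimal sketch (assumed handling, based only on the format above):

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	raw := []byte(`{"result": 0.87}`) // example auditor reply
	var reply struct {
		Result float64 `json:"result"`
	}
	if err := json.Unmarshal(raw, &reply); err != nil {
		panic(err)
	}
	// The caller can then compare reply.Result against its own threshold.
	fmt.Println(reply.Result)
}
```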
Bubble generation results
When requesting another gsk-llm instance to generate bubbles, the JSON format that should be returned is as follows:
```json
["", "", ""]
```
That is, the bubble generation result is an array of strings: the returned result specifies three different bubbles, or fewer (at most 3).
It is no longer necessary to run multiple gsk-llm instances to implement agent-like functions. Based on the new multi-configuration override, the prompt parameter, and the lotus feature, a single instance can issue requests to itself to implement complex features such as bubble generation and story advancement.
GetAIPromptkeyboardPath can be the instance's own address, and may carry prompt parameters.
When using middleware to specify the prompt parameter, the configuration lives in the prompts folder and is named in the format xxx-keyboard.yml. If no middleware is used, specify the prompt parameter in the path and place the corresponding xxx.yml in the prompts folder.
Set the /conversation address of the gsk-llm instance that works jointly on system prompts; by convention, the system prompt must return a JSON text array (of 3 items).
This project draws on the ideas of the following well-known projects and implements a simplified AI text control flow configuration format:
Rasa
Twine
Inklewriter