LangChain-Chatchat (formerly Langchain-ChatGLM)
Based on large language models such as ChatGLM and application frameworks such as Langchain, this is an open-source RAG and Agent application project that can be deployed offline.
Overview
Function introduction
0.3.x feature overview
Supported model inference frameworks and models
Get started quickly
pip installation and deployment
Source code installation and deployment/development deployment
Docker deployment
Project Milestones
Contact us
A question-answering application over local knowledge bases, implemented with the LangChain approach. The goal is to build a knowledge base Q&A solution that is friendly to Chinese scenarios and open-source models and can run fully offline.
Inspired by GanymedeNil's project document.ai and the ChatGLM-6B Pull Request created by Alex Zhangji, this project established a local knowledge base Q&A application that can be built end-to-end with open-source models. In the latest version, frameworks such as Xinference and Ollama can be used to access models such as GLM-4-Chat, Qwen2-Instruct, and Llama3; relying on the langchain framework, the services can be called through APIs provided via FastAPI, or operated through a Streamlit-based WebUI.
✅ This project supports the mainstream open-source LLMs, Embedding models, and vector databases on the market, and enables fully offline private deployment with open-source models. It also supports calling the OpenAI GPT API, and will continue to expand access to various models and model APIs.
⛓️ The implementation principle of this project is shown in the figure below. The process includes: loading files -> reading text -> text segmentation -> text vectorization -> question vectorization -> matching the top-k text chunks most similar to the question vector -> adding the matched text to the prompt as context, together with the question -> submitting to the LLM to generate an answer.
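As an illustration only (not the project's actual code), the pipeline above can be condensed into a short self-contained Python sketch; the bag-of-words embedding and the `call_llm` placeholder are toy stand-ins for a real embedding model and LLM:

```python
# Minimal sketch of the pipeline described above: load -> split -> embed ->
# retrieve top-k -> build prompt -> LLM. The bag-of-words "embedding" and the
# call_llm placeholder are toy stand-ins, not the project's implementation.
import math
from collections import Counter

def split_text(text, chunk_size=200):
    # Split the document into fixed-size chunks.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(text):
    # Toy embedding: a sparse bag-of-words vector. Real systems use a dense model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(chunks, query, k=3):
    # Match the top-k chunks most similar to the question vector.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:k]

def build_prompt(context, question):
    # Add the matched text to the prompt as context, together with the question.
    return "Answer based on the context.\n\nContext:\n" + "\n".join(context) + "\n\nQuestion: " + question

def call_llm(prompt):
    return "<model answer>"  # placeholder: replace with a real LLM call

document = "Langchain-Chatchat is an offline-deployable RAG and Agent application. " * 20
chunks = split_text(document)
question = "What kind of application is this?"
prompt = build_prompt(top_k(chunks, question), question)
print(call_llm(prompt))
```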
Principle introduction video
From a document processing perspective, the implementation process is as follows:
This project does not involve fine-tuning or training, but fine-tuning or training can be used to improve its results.
The code used in the AutoDL image has been updated to version v0.3.0 of this project.
The Docker image will be updated in the near future.
If you want to contribute to this project, please visit the development guide for more information on development and deployment.
Function | 0.2.x | 0.3.x |
---|---|---|
Model access | Local: fastchat Online: XXXModelWorker | Local: model_provider, supports most mainstream model loading frameworks Online: oneapi All model access is compatible with the openai sdk |
Agent | ❌ Unstable | ✅ Optimized for ChatGLM3 and Qwen; Agent capabilities are significantly improved |
LLM conversation | ✅ | ✅ |
Knowledge base conversation | ✅ | ✅ |
Search engine conversation | ✅ | ✅ |
File conversation | ✅ Vector retrieval only | ✅ Unified into the File RAG function, supporting multiple retrieval methods such as BM25+KNN (see the sketch after this table) |
Database conversation | ❌ | ✅ |
Multimodal image conversation | ❌ | ✅ qwen-vl-chat is recommended |
ARXIV literature conversation | ❌ | ✅ |
Wolfram conversation | ❌ | ✅ |
Text-to-image | ❌ | ✅ |
Local knowledge base management | ✅ | ✅ |
WebUI | ✅ | ✅ Better multi-session support, custom system prompts, ... |
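The File RAG row above mentions combining BM25 with vector (KNN) retrieval. Purely as an illustration of hybrid scoring (not the project's implementation), here is a minimal sketch that fuses a BM25 score with a cosine score; the whitespace tokenizer, toy embedding, and 0.5/0.5 fusion weights are arbitrary assumptions:

```python
# Hybrid retrieval sketch: fuse a BM25 lexical score with a cosine (KNN-style)
# vector score. Tokenizer, toy embedding, and fusion weights are illustrative.
import math
from collections import Counter

docs = [
    "chatchat supports local knowledge bases",
    "agents can call tools automatically",
    "vector retrieval finds semantically similar text",
]
tokenized = [d.split() for d in docs]
N = len(docs)
avgdl = sum(len(t) for t in tokenized) / N
df = Counter(term for t in tokenized for term in set(t))  # document frequency

def bm25(query_tokens, doc_tokens, k1=1.5, b=0.75):
    tf = Counter(doc_tokens)
    score = 0.0
    for term in query_tokens:
        if term not in tf:
            continue
        idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(doc_tokens) / avgdl)
        score += idf * tf[term] * (k1 + 1) / norm
    return score

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, w_bm25=0.5, w_vec=0.5):
    q_tokens = query.split()
    q_vec = Counter(q_tokens)  # toy embedding; a real system uses a dense model
    scored = []
    for doc, tokens in zip(docs, tokenized):
        score = w_bm25 * bm25(q_tokens, tokens) + w_vec * cosine(q_vec, Counter(tokens))
        scored.append((score, doc))
    return sorted(scored, reverse=True)

print(hybrid_search("local knowledge retrieval"))
```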
The core functions of version 0.3.x are implemented by Agents, but users can also invoke tools manually:
Operation mode | Implemented function | Applicable scenario |
---|---|---|
Check "Enable Agent" and select multiple tools | The LLM calls tools automatically | Models with Agent capability such as ChatGLM3/Qwen, or online APIs |
Check "Enable Agent" and select a single tool | The LLM only parses tool parameters | The model's Agent capability is limited and it cannot select tools reliably; the user wants to choose the function manually |
Uncheck "Enable Agent" and select a single tool | Fill in the parameters manually to call the tool, without the Agent function | The model in use has no Agent capability |
Uncheck all tools and upload an image | Image conversation | Multimodal models such as qwen-vl-chat |
For more features and updates, please try them out in an actual deployment.
This project already supports mainstream recent open-source large language models and Embedding models such as GLM-4-Chat and Qwen2-Instruct. Users need to start the model deployment framework themselves and connect it to the project by modifying the configuration. The local model deployment frameworks supported by this project are as follows:
Model deployment framework | Xinference | LocalAI | Ollama | FastChat |
---|---|---|---|---|
OpenAI API interface alignment | ✅ | ✅ | ✅ | ✅ |
Accelerated inference engines | GPTQ, GGML, vLLM, TensorRT, mlx | GPTQ, GGML, vLLM, TensorRT | GGUF, GGML | vLLM |
Supported model types | LLM, Embedding, Rerank, Text-to-Image, Vision, Audio | LLM, Embedding, Rerank, Text-to-Image, Vision, Audio | LLM, Text-to-Image, Vision | LLM, Vision |
Function Call | ✅ | ✅ | ✅ | / |
More platform support (CPU, Metal) | ✅ | ✅ | ✅ | ✅ |
Heterogeneous hardware | ✅ | ✅ | / | / |
Cluster | ✅ | ✅ | / | / |
Documentation link | Xinference Documentation | LocalAI Documentation | Ollama Documentation | FastChat Documentation |
Available models | Models supported by Xinference | Models supported by LocalAI | Models supported by Ollama | Models supported by FastChat |
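All of the frameworks in the table above expose an OpenAI-compatible API, so a deployed model can be smoke-tested with the official openai Python SDK before being wired into Chatchat. A minimal sketch, assuming Xinference is serving locally on its default port 9997 and a model named `qwen1.5-chat` has been launched (both are assumptions; adjust to your deployment):

```python
# Smoke-test an OpenAI-compatible endpoint with the official openai SDK.
# base_url (Xinference's default local port 9997) and the model name are
# assumptions; point them at your own deployment.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-needed-locally")

# List the models the framework is currently serving.
for model in client.models.list():
    print(model.id)

# Send a minimal chat completion request.
resp = client.chat.completions.create(
    model="qwen1.5-chat",  # assumed name; use one printed by models.list()
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```

If this works end to end, the same base_url and model name are what you will later enter in the model platform configuration described below.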
In addition to the above local model loading frameworks, the project also supports accessing online APIs through the One API framework, including OpenAI ChatGPT, Azure OpenAI API, Anthropic Claude, Zhipu Qingyan, Baichuan, and other commonly used online APIs.
Note
About loading local models with Xinference: Xinference's built-in models are downloaded automatically. To have it load a model you have already downloaded, start the Xinference service, then run `streamlit run xinference_manager.py` in the project's tools/model_loaders directory and follow the page prompts to set a local path for the chosen model.
In terms of software, this project supports Python 3.8-3.11 environments and has been tested on Windows, macOS, and Linux.
In terms of hardware, since version 0.3.0 supports access to different model deployment frameworks, the project can run under different hardware conditions such as CPU, GPU, NPU, and MPS.
Starting from version 0.3.0, Langchain-Chatchat is provided as an installable Python library. To install, execute:
```bash
pip install langchain-chatchat -U
```
Important
To ensure the Python library is the latest version, it is recommended to use the official PyPI source or the Tsinghua mirror.
Note
Because the Xinference model deployment framework requires additional Python dependencies when connecting to Langchain-Chatchat, the following installation method is recommended when using it with Xinference:
```bash
pip install "langchain-chatchat[xinference]" -U
```
Starting from version 0.3.0, Langchain-Chatchat no longer loads models directly from a local model path provided by the user. The model types involved, including LLM, Embedding, Reranker, and (in the future) multimodal models, are now accessed through mainstream model inference frameworks such as Xinference, Ollama, LocalAI, FastChat, and One API.
Therefore, before starting the Langchain-Chatchat project, please make sure the model inference framework is running and the required models have been loaded.
Here we take Xinference as an example. Please refer to the Xinference documentation for framework deployment and model loading.
Warning
To avoid dependency conflicts, place Langchain-Chatchat and model deployment frameworks such as Xinference in separate Python virtual environments, e.g. conda, venv, or virtualenv.
Starting from version 0.3.1, Langchain-Chatchat uses local yaml files for configuration. Users can view and modify the contents directly, and the server will pick up changes automatically without a restart.
Set the root directory where Chatchat stores configuration files and data files (optional)
```bash
# on linux or macos
export CHATCHAT_ROOT=/path/to/chatchat_data

# on windows
set CHATCHAT_ROOT=/path/to/chatchat_data
```
If this environment variable is not set, the current directory will be automatically used.
Perform initialization
```bash
chatchat init
```
This command does the following:
- Create all required data directories
- Copy samples knowledge base content
- Generate the default yaml configuration files
Modify the configuration files
Configure the model (model_settings.yaml)
Configure model access according to the model inference framework you selected and the models you loaded (see the Xinference example above). For details, refer to the comments in model_settings.yaml. Mainly modify the following:
```yaml
# The default LLM name
DEFAULT_LLM_MODEL: qwen1.5-chat

# The default Embedding name
DEFAULT_EMBEDDING_MODEL: bge-large-zh-v1.5

# Change the keys of `llm_model, action_model` in `LLM_MODEL_CONFIG` to the corresponding LLM model
# Modify the corresponding model platform information in `MODEL_PLATFORMS`
```
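As a quick sanity check after editing, the file can be parsed to confirm the values the server will read. A minimal sketch using PyYAML; the file location under CHATCHAT_ROOT and the `platform_name` key are assumptions based on the generated defaults, so adjust to what `chatchat init` actually produced:

```python
# Sanity-check model_settings.yaml after editing (requires PyYAML).
# The file location and the `platform_name` key are assumptions; adjust to
# the files `chatchat init` generated in your CHATCHAT_ROOT.
import os
import yaml

root = os.environ.get("CHATCHAT_ROOT", ".")
with open(os.path.join(root, "model_settings.yaml"), encoding="utf-8") as f:
    settings = yaml.safe_load(f)

print("Default LLM      :", settings.get("DEFAULT_LLM_MODEL"))
print("Default Embedding:", settings.get("DEFAULT_EMBEDDING_MODEL"))
for platform in settings.get("MODEL_PLATFORMS", []):
    print("Platform         :", platform.get("platform_name"))
```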
Configure knowledge base path (basic_settings.yaml) (optional)
The default knowledge base is located at CHATCHAT_ROOT/data/knowledge_base. If you want to place the knowledge base elsewhere, or want to connect to an existing knowledge base, you can modify the corresponding directory here:
```yaml
# Default storage path of the knowledge base
KB_ROOT_PATH: D:\chatchat-test\data\knowledge_base

# Default storage path of the database. If you use sqlite, you can modify
# DB_ROOT_PATH directly; if you use another database, modify
# SQLALCHEMY_DATABASE_URI instead.
DB_ROOT_PATH: D:\chatchat-test\data\knowledge_base\info.db

# Knowledge base information database connection URI
SQLALCHEMY_DATABASE_URI: sqlite:///D:\chatchat-test\data\knowledge_base\info.db
```
Configure knowledge base (kb_settings.yaml) (optional)
The FAISS knowledge base is used by default. If you want to connect to other types of knowledge bases, you can modify DEFAULT_VS_TYPE and kbs_config.
Warning
Before initializing the knowledge base, please ensure that the model inference framework and the corresponding embedding model have been started, and that model access has been configured as described above.
```bash
chatchat kb -r
```
For more options, see `chatchat kb --help`.
The following log output indicates success:
```
----------------------------------------------------------------------------------------------------
Knowledge base name : samples
Knowledge base type : faiss
Embedding model     : bge-large-zh-v1.5
Knowledge base path : /root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/knowledge_base/samples
Total files         : 47
Files ingested      : 42
Knowledge entries   : 740
Time used           : 0:02:29.701002
----------------------------------------------------------------------------------------------------
Total time          : 0:02:33.414425
```
Note
A frequently asked question about knowledge base initialization: the process gets stuck. This often occurs in newly created virtual environments and can be confirmed as follows:
```python
from unstructured.partition.auto import partition
```
If this statement hangs and never returns, execute the following commands:
```bash
pip uninstall python-magic-bin
# check the version of the uninstalled package
pip install 'python-magic-bin=={version}'
```
Then follow the instructions in this section to re-create the knowledge base.
Start the project:

```bash
chatchat start -a
```
When the following interface appears, the startup is successful:
Warning
Since the default listening address DEFAULT_BIND_HOST configured by chatchat is 127.0.0.1, the service cannot be accessed from other IPs. If you need to access it via the machine's IP (e.g., on a Linux server), change the listening address to 0.0.0.0 in basic_settings.yaml.
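Because the service is built on FastAPI, which auto-publishes an OpenAPI schema, you can list the exact endpoints your running version exposes. A minimal sketch, assuming the API listens on the default 127.0.0.1:7861 (an assumption; match host and port to your basic_settings.yaml):

```python
# Discover the running server's endpoints via FastAPI's auto-generated
# OpenAPI schema. Host and port are assumptions; match your basic_settings.yaml.
import json
from urllib.request import urlopen

with urlopen("http://127.0.0.1:7861/openapi.json") as resp:
    schema = json.load(resp)

for path, methods in sorted(schema["paths"].items()):
    print(", ".join(m.upper() for m in methods), path)
```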
For database conversation configuration, please see the database conversation configuration instructions.
Please refer to the development guide for source code installation and deployment.
```bash
docker pull chatimage/chatchat:0.3.1.3-93e2c87-20240829

# mirror for mainland China
docker pull ccr.ccs.tencentyun.com/langchain-chatchat/chatchat:0.3.1.3-93e2c87-20240829
```
Important
It is strongly recommended to deploy with docker-compose; please refer to README_docker for details.
The structure of 0.3.x has changed significantly, so we strongly recommend redeploying according to the documentation. The following steps are not guaranteed to be 100% compatible or successful. Remember to back up important data in advance!
First follow the steps in Installation and Deployment above to configure the runtime environment and modify the configuration files. Then copy the knowledge_base directory of the 0.2.x project to the configured DATA directory.
April 2023: Langchain-ChatGLM 0.1.0 was released, supporting local knowledge base Q&A based on the ChatGLM-6B model.

August 2023: Langchain-ChatGLM was renamed Langchain-Chatchat, and version 0.2.0 was released, using fastchat as the model loading solution and supporting more models and databases.

October 2023: Langchain-Chatchat 0.2.5 was released, introducing Agent features; the open-source project won third prize in the hackathon held by Founder Park & Zhipu AI & Zilliz.

December 2023: The Langchain-Chatchat open-source project passed 20K GitHub stars.

June 2024: Langchain-Chatchat 0.3.0 was released, bringing a new project structure.
Let us look forward to the future of the Chatchat story...
The code of this project is licensed under the Apache-2.0 License.
Langchain-Chatchat project WeChat communication group: if you are also interested in this project, you are welcome to join the group chat to participate in discussions and exchanges.
Langchain-Chatchat project official WeChat account: welcome to scan the QR code to follow.
If this project is helpful to your research, please cite us:
```bibtex
@software{langchain_chatchat,
  title        = {{langchain-chatchat}},
  author       = {Liu, Qian and Song, Jinke and Huang, Zhiguo and Zhang, Yuxuan and glide-the and liunux4odoo},
  year         = 2024,
  journal      = {GitHub repository},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/chatchat-space/Langchain-Chatchat}}
}
```