HuixiangDou Download - HuixiangDou Source code download

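HuixiangDou is a professional knowledge assistant based on LLM.

Advantages:

- A three-stage pipeline of preprocessing, rejection and response
  - chat_in_group copes with the group chat scenario and answers user questions without message flooding; see 2401.08772, 2405.02817, the hybrid retrieval report and the precision report
  - chat_with_repo is for real-time streaming chat
- No training required, with CPU-only, 2G, 10G, 20G and 80G configurations
- A complete suite of Web, Android and pipeline source code, industrial-grade and commercially viable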
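
The three stages (preprocess, reject, respond) can be pictured with a toy sketch. This is not HuixiangDou's implementation: the function names, the token-overlap score and the 0.3 threshold are all hypothetical, and the real pipeline relies on embedding retrieval and an LLM rather than word matching.

```python
import re

def preprocess(message: str) -> str:
    """Stage 1: normalize whitespace and case of the incoming message."""
    return re.sub(r"\s+", " ", message).strip().lower()

def relevance(query: str, document: str) -> float:
    """Toy score: fraction of query tokens that also appear in the document."""
    q = set(re.findall(r"\w+", query))
    d = set(re.findall(r"\w+", document.lower()))
    return len(q & d) / len(q) if q else 0.0

def pipeline(message: str, knowledge: dict[str, str], threshold: float = 0.3):
    """Run preprocess -> reject -> respond and return (state, reference)."""
    query = preprocess(message)
    # Stage 2: find the best-matching document and reject weak matches.
    best_ref, best_score = None, 0.0
    for name, text in knowledge.items():
        score = relevance(query, text)
        if score > best_score:
            best_ref, best_score = name, score
    if best_score < threshold:
        return ("unrelated", None)
    # Stage 3: a real system would prompt an LLM with the retrieved text here.
    return ("success", best_ref)

kb = {"installation.md": "How to install mmpose with pip and mim"}
print(pipeline("How to install mmpose ?", kb))
print(pipeline("How is the weather today ?", kb))
```

The rejection stage is what keeps the assistant quiet in a group chat: a message that matches nothing in the knowledge base never reaches the response stage.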

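Check out the scenes in which HuixiangDou is running, and join the WeChat group to try the AI assistant inside.

If this helps you, please give it a star.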

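New Features

Our Web version has been released to OpenXLab, where you can create a knowledge base, update positive and negative examples, turn on web search, test chat, and integrate into Feishu/WeChat groups. For a demonstration, see BiliBili and YouTube.

The Web version's API for Android also supports other devices. See the Python sample code.

[2024/09] Inverted indexer makes the LLM prefer the knowledge base
[2024/09] Code retrieval
[2024/08] chat_with_readthedocs, see how to integrate
[2024/07] Image and text retrieval & removal of langchain
[2024/07] Hybrid knowledge graph and dense retrieval improve the F1 score by 1.7%
[2024/06] Evaluation of chunksize, splitter and text2vec model
[2024/05] wkteam WeChat access, parsing images & URLs, support for coreference resolution
[2024/05] SFT LLM on an NLP task, F1 increased by 29%
  LoRA-Qwen1.5-14B, LoRA-Qwen1.5-32B, alpaca data, arXiv
[2024/04] RAG annotation of SFT Q&A data and examples
[2024/04] Release of the Web front-end and back-end service source code
[2024/03] New personal WeChat integration and prebuilt APK
[2024/02] [Experimental feature] WeChat group integration of multimodal to achieve OCR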

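Support Status

LLM	File Format	Retrieval Method	Integration	Preprocessing
InternLM2/InternLM2.5, Qwen1.5~2.5, puyu, StepFun, KIMI, DeepSeek, GLM (ZHIPU), SiliconCloud, Xi-Api	pdf, word, excel, ppt, html, markdown, txt	Dense for documents, Sparse for code, Knowledge graph, Internet search, SourceGraph, Image and text	WeChat (android/wkteam), Lark, OpenXLab Web, Gradio Demo, HTTP Server, Read the Docs	Coreference resolution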

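Hardware Requirements

The following are the GPU memory requirements for the different features. The configurations differ only in which options are turned on.

Configuration Example	GPU Memory Requirement	Description
config-cpu.ini	-	Use the siliconcloud API, text only
config-2G.ini	2GB	Use an openai API (such as kimi, deepseek and stepfun), text retrieval only
config-multimodal.ini	10GB	Use an openai API for the LLM, image and text retrieval
[Standard Edition] config.ini	19GB	Local deployment of the LLM, single modality
config-advanced.ini	80GB	Local LLM, anaphora resolution, single modality, practical for WeChat groups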
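
As a rough aid for reading the table, the sketch below maps available GPU memory to the most demanding configuration that still fits. The helper `pick_config` is hypothetical and not part of HuixiangDou; the thresholds are copied from the table, and because the rows also differ in features (multimodal versus local LLM), the result is an upper bound rather than a recommendation.

```python
# (minimum GPU memory in GB, configuration file), copied from the table.
CONFIGS = [
    (0, "config-cpu.ini"),
    (2, "config-2G.ini"),
    (10, "config-multimodal.ini"),
    (19, "config.ini"),
    (80, "config-advanced.ini"),
]

def pick_config(gpu_mem_gb: float) -> str:
    """Return the most demanding configuration that fits in gpu_mem_gb."""
    if gpu_mem_gb < 0:
        raise ValueError("GPU memory cannot be negative")
    chosen = CONFIGS[0][1]
    for required, name in CONFIGS:
        if gpu_mem_gb >= required:
            chosen = name
    return chosen

for mem in (0, 8, 24, 80):
    print(mem, "->", pick_config(mem))
```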

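Running the Standard Edition

We use the standard edition (local LLM, text retrieval) as the introductory example. The other editions differ only in their configuration options.

I. Download and install dependencies

Click to agree to the BCE model agreement, then log in to huggingface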

huggingface-cli login

Install dependencies

# parsing `word` format requirements
apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
# python requirements
pip install -r requirements.txt
# For python3.8, install faiss-gpu instead of faiss

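II. Create a knowledge base and ask questions

Use the mmpose documents to build an mmpose knowledge base and filter questions. If you have your own documents, just put them under repodir.

Copy and execute all of the following commands (including the '#' symbols).

# Download the knowledge base, we only take the documents of mmpose as an example. You can put any of your own documents under `repodir`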
cd HuixiangDou
mkdir repodir
git clone https://github.com/open-mmlab/mmpose    --depth=1 repodir/mmpose

# Save the features of repodir to workdir, and update the positive and negative example thresholds into `config.ini`
mkdir workdir
python3 -m huixiangdou.service.feature_store
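
Before building the feature store, it can help to check what actually sits under `repodir`. The sketch below counts files by suffix. The helper `inventory` is hypothetical, and the suffix set is an assumption derived from the formats named in the support table (pdf, word, excel, ppt, html, markdown, txt); the project may recognize a different set.

```python
from collections import Counter
from pathlib import Path

# Assumed suffixes for the formats named in the support table.
SUFFIXES = {".pdf", ".docx", ".xlsx", ".pptx", ".html", ".md", ".txt"}

def inventory(repodir: str) -> Counter:
    """Count files under repodir by suffix, keeping only the assumed formats."""
    counts = Counter()
    root = Path(repodir)
    if not root.is_dir():
        return counts  # nothing to scan yet
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in SUFFIXES:
            counts[path.suffix.lower()] += 1
    return counts

for suffix, n in sorted(inventory("repodir").items()):
    print(f"{suffix}: {n}")
```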

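After running, test with python3 -m huixiangdou.main --standalone. The assistant should now reply to mmpose-related questions (those covered by the knowledge base) and decline to answer weather questions.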

python3 -m huixiangdou.main --standalone

+----------------------------+-------------+---------------------------+-----------------+
|           Query            |    State    |           Reply           |   References    |
+============================+=============+===========================+=================+
| How to install mmpose ?    | success     | To install mmpose, plea.. | installation.md |
+----------------------------+-------------+---------------------------+-----------------+
| How is the weather today ? | unrelated.. | ..                        |                 |
+----------------------------+-------------+---------------------------+-----------------+
Input your question here, type `bye` for exit:
..

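Note

If restarting the LLM every time is too slow, first run python3 -m huixiangdou.service.llm_server_hybrid; then open a new window and run only python3 -m huixiangdou.main each time, without restarting the LLM.

You can also run a simple Web UI with gradio: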

python3 -m huixiangdou.gradio_ui

Or run a server that listens on port 23333; the default pipeline is chat_with_repo:

python3 -m huixiangdou.server

# test async API
curl -X POST http://127.0.0.1:23333/huixiangdou_stream -H "Content-Type: application/json" -d '{"text": "how to install mmpose","image": ""}'
# cURL sync API
curl -X POST http://127.0.0.1:23333/huixiangdou_inference -H "Content-Type: application/json" -d '{"text": "how to install mmpose","image": ""}'
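
The synchronous call can also be made from Python with the standard library alone. This is a minimal sketch: the endpoint, port and payload fields are the ones in the curl command, while the helper names `build_request` and `ask` are hypothetical. The response schema is not described here, so the sketch returns the raw body as text.

```python
import json
import urllib.request

def build_request(text: str, image: str = "",
                  base_url: str = "http://127.0.0.1:23333") -> urllib.request.Request:
    """Build the POST request for the synchronous huixiangdou_inference endpoint."""
    payload = json.dumps({"text": text, "image": image}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/huixiangdou_inference",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(text: str, timeout: float = 60.0) -> str:
    """Send the question and return the raw response body as text."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return resp.read().decode("utf-8")

# Usage (requires `python3 -m huixiangdou.server` to be running):
# print(ask("how to install mmpose"))
```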

Please update the repodir documents, good_questions and bad_questions, and try your own domain knowledge (medical, financial, power, etc.).
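
Updating the question lists can be scripted. The sketch below appends new questions to a JSON file without duplicating existing ones. It assumes the file is a flat JSON array of strings, which this README does not state, and the helper `add_questions` is hypothetical. The FAQ gives the paths as resource/good_questions.json and resource/bad_questions.json.

```python
import json
from pathlib import Path

def add_questions(json_path: str, new_questions: list[str]) -> list[str]:
    """Append unseen, non-empty questions to a JSON list file; return the merged list."""
    path = Path(json_path)
    # Assumption: the file holds a flat JSON array of strings.
    merged = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    for question in new_questions:
        question = question.strip()
        if question and question not in merged:
            merged.append(question)
    path.write_text(json.dumps(merged, ensure_ascii=False, indent=2), encoding="utf-8")
    return merged

# Usage:
# add_questions("resource/good_questions.json", ["How do I install mmpose?"])
# add_questions("resource/bad_questions.json", ["How is the weather today?"])
```

After changing either file, re-run the feature store so the thresholds are recomputed.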

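III. Integrate into Feishu and WeChat groups

- One-way sending to a Feishu group
- Two-way Feishu group receiving and sending, with message recall
- Personal WeChat access via Android
- Personal WeChat access via wkteam

IV. Deploy the web front end and back end

We provide the typescript front-end and python back-end source code:

- Multi-tenant management
- Zero-programming access to Feishu and WeChat
- k8s friendly

It is the same as the OpenXLab app; please read the web deployment document.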

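Other Configurations

CPU-only Edition

If no GPU is available, model inference can be completed with the siliconcloud API.

Taking docker miniconda+Python3.11 as an example, install the CPU dependencies and run:

# Start container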
docker run -v /path/to/huixiangdou:/huixiangdou -p 7860:7860 -p 23333:23333 -it continuumio/miniconda3 /bin/bash
# Install dependencies
apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
python3 -m pip install -r requirements-cpu.txt
# Establish knowledge base
python3 -m huixiangdou.service.feature_store --config_path config-cpu.ini
# Q&A test
python3 -m huixiangdou.main --standalone --config_path config-cpu.ini
# gradio UI
python3 -m huixiangdou.gradio_ui --config_path config-cpu.ini

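If the installation is too slow, a pre-installed image is available on Docker Hub. Simply substitute it when starting docker.

2G Cost-effective Edition

If your GPU memory exceeds 1.8G, or you want the most cost-effective setup, use this configuration. It drops the local LLM and uses a remote LLM instead; everything else is the same as the standard edition.

Taking siliconcloud as an example, fill the API token obtained from the official website into config-2G.ini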

# config-2G.ini
[llm]
enable_local = 0   # Turn off local LLM
enable_remote = 1  # Only use remote
..
remote_type = "siliconcloud"   # Choose siliconcloud
remote_api_key = "YOUR-API-KEY-HERE" # Your API key
remote_llm_model = "alibaba/Qwen1.5-110B-Chat"

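Note

In the worst case, each Q&A round calls the LLM 7 times, so it is subject to the free-tier RPM (requests per minute) limit. You can adjust the rpm parameter in config.ini.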
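
The effect of the RPM limit is simple arithmetic. The sketch below uses the worst case of 7 calls per Q&A round stated in the note; the function names and the example quota of 30 requests per minute are illustrative assumptions, not values from any provider.

```python
WORST_CASE_CALLS = 7  # LLM calls per Q&A round in the worst case, per the note

def worst_case_throughput(rpm: int) -> int:
    """Whole worst-case Q&A rounds per minute that fit within an RPM quota."""
    if rpm < 0:
        raise ValueError("rpm cannot be negative")
    return rpm // WORST_CASE_CALLS

def seconds_per_question(rpm: int) -> float:
    """Minimum seconds one worst-case round needs when calls are paced at rpm."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return WORST_CASE_CALLS * 60.0 / rpm

print(worst_case_throughput(30))  # 30 // 7 = 4 rounds per minute
print(seconds_per_question(30))   # 7 * 60 / 30 = 14.0 seconds
```

A quota below 7 requests per minute cannot complete even one worst-case round within a minute, which is why the rpm setting matters for free-tier keys.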

Execute the following command to get the Q&A results

python3 -m huixiangdou.main --standalone --config_path config-2G.ini # Start all services at once

10G Multimodal Edition

If you have 10G of GPU memory, you can additionally support image and text retrieval. Just change the models used in config.ini.

# config-multimodal.ini
# !!! Download `https://huggingface.co/BAAI/bge-visualized/blob/main/Visualized_m3.pth` to `bge-m3` folder !!!
embedding_model_path = "BAAI/bge-m3"
reranker_model_path = "BAAI/bge-reranker-v2-minicpm-layerwise"

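Note:

- You need to download Visualized_m3.pth manually into the bge-m3 directory
- Install FlagEmbedding from the main branch, where we have fixed a bug. Here you can download bpe_simple_vocab_16e6.txt.gz
- Install requirements/multimodal.txt

Run gradio to test, and check the image and text retrieval results.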

python3 tests/test_query_gradio.py

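80G Complete Edition

The "HuixiangDou" in the WeChat experience group has all features enabled:

- Serper search and SourceGraph search enhancement
- Group chat images and WeChat public account parsing
- Text coreference resolution
- Hybrid LLM
- The knowledge base covers 12 openmmlab repositories (1700 documents) and refuses small talk

Please read the following topics:

- Hybrid knowledge graph and dense retrieval
- Refer to the config-advanced.ini configuration to improve results
- Anaphora resolution training for the group chat scenario
- Use wkteam WeChat access, integrating image parsing, public account parsing and anaphora resolution
- Use rag.py to annotate SFT training data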

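Android Tools

Contributors have provided Android tools for interacting with WeChat. The solution is based on system-level APIs and can, in principle, control any UI (not limited to communication software).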

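FAQ

What if the robot is too cold or too chatty?
- Fill the questions that should be answered in real scenarios into resource/good_questions.json, and the questions that should be rejected into resource/bad_questions.json.
- Adjust the topic content in repodir so that the markdown documents in the main library contain nothing irrelevant.
Re-run feature_store to update the thresholds and the feature library.
Note: you can also modify reject_throttle in config.ini directly. Generally speaking, 0.5 is a high value and 0.2 is too low.
Launch is normal, but the process runs out of memory at runtime?
LLMs built on the transformers architecture need more memory for long text. In that case, apply kv cache quantization to the model (see the lmdeploy quantization description), then deploy the Hybrid LLM Service independently with docker.
How do I access other local LLMs, and what if the results are not ideal afterwards?
- Open the hybrid llm service and add a new LLM inference implementation.
- Refer to test_intention_prompt and the test data, adjust the prompt and threshold for the new model, and update them in prompt.py.
What if the response is too slow or requests always fail?
- Refer to the hybrid llm service to add exponential backoff and retransmission.
- Replace the local LLM with an inference framework such as lmdeploy instead of native huggingface/transformers.
What if the GPU memory is too low?
In this case a local LLM cannot run, and only a remote LLM combined with text2vec can execute the pipeline. Make sure config.ini uses only the remote LLM and turns the local LLM off.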
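
The FAQ's suggestion of exponential backoff and retransmission can be sketched as a generic retry wrapper. This is not the code in the hybrid llm service: the name `call_with_backoff`, its parameters and the jitter factor are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_backoff(func: Callable[[], T], retries: int = 5,
                      base_delay: float = 1.0, max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying failures with exponentially growing, jittered delays."""
    if retries < 0:
        raise ValueError("retries cannot be negative")
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")

# Usage with a hypothetical remote call:
# answer = call_with_backoff(lambda: query_remote_llm("how to install mmpose"))
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real waiting. In production code it would be reasonable to catch only the error types that indicate rate limiting or transient network failure.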

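Acknowledgements

- KIMI: long-text LLM, supports direct file upload
- FlagEmbedding: BAAI RAG group
- BCEmbedding: Chinese-English bilingual embedding model
- Langchain-ChatChat: application of Langchain and ChatGLM
- GrabRedEnvelope: WeChat red packet grabber

Citation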

@misc{kong2024huixiangdou,
      title={HuiXiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance},
      author={Huanjun Kong and Songyang Zhang and Jiaying Li and Min Xiao and Jun Xu and Kai Chen},
      year={2024},
      eprint={2401.08772},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{kong2024labelingsupervisedfinetuningdata,
      title={Labeling supervised fine-tuning data with the scaling law}, 
      author={Huanjun Kong},
      year={2024},
      eprint={2405.02817},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2405.02817}, 
}
