
Libre Chat

Easily configure and deploy a fully self-hosted chatbot web service based on open source Large Language Models (LLMs), such as Mixtral or Llama 2, without any knowledge of machine learning.

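A free and open source chatbot web service with a UI and an API.

- Fully self-hosted, not tied to any service, and offline capable. Forget about API keys: models and embeddings can be downloaded in advance, and the training and inference processes can run offline if required.
- Web API described using the OpenAPI specification: GET/POST operations, plus a WebSocket for streaming responses.
- Chat web UI with streaming responses and markdown rendering that works well on desktop and mobile. An alternative gradio-based UI is also available.
- Easy to set up, no programming required: configure the service in a YAML file and start it with one command.
- Available as a pip package or as a Docker image.
- No GPU required: it runs even on a laptop CPU. That said, running on CPU alone can be quite slow (up to 1 minute to answer a document-based question on a recent laptop).
- Uses LangChain and llama.cpp to run inference locally.
- Various types of agents can be deployed:
  - Generic conversation: no additional training is needed, just configure settings such as the template prompt.
  - Document-based question answering (experimental): automatically builds similarity vectors from documents uploaded through the API UI. The chatbot uses them to answer questions and returns the documents that were used to generate the answer (PDF, CSV, HTML, JSON, markdown, and more are supported).
- Readable logs to understand what is going on.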
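
To make the document-based question answering idea more concrete, here is a minimal sketch of similarity-based retrieval. It ranks a few toy vectors by cosine similarity against a query vector and keeps the best matches. Everything in it is an assumption made for illustration: the vectors and file names are made up, and `top_k_similar` is a hypothetical helper. Libre Chat itself relies on LangChain and a vectorstore rather than on this code.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: List[float], doc_vectors: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Hypothetical helper: return the k documents most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in doc_vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings", made up for illustration only
docs = {
    "netherlands.pdf": [0.9, 0.1, 0.0],
    "logging.md": [0.0, 0.2, 0.9],
    "drugs.csv": [0.1, 0.9, 0.1],
}
query_vector = [1.0, 0.2, 0.0]

results = top_k_similar(query_vector, docs, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In a real deployment the vectors would come from an embedding model and the top matches would be passed to the LLM as context, along with their source documents.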

Documentation

For more details on how to use Libre Chat, check the documentation at vemonet.github.io/libre-chat

Work in progress

Warning

This project is a work in progress, so use it with caution.

The items below are features we plan to work on in the future. Feel free to share comments or requests in the issues.

- Stream the response to the WebSocket to show words as they are generated
- Add a button to let the user stop the chatbot generation
- Add authentication mechanisms? (OAuth/OpenID Connect) #5
- Add conversational history? https://milvus.io/blog/conversational-memory-in-langchain.md
- Add an admin dashboard web UI to let users upload/inspect/delete documents for QA, and browse/edit the chatbot configuration
- Kubernetes deployment (Helm chart?)

Deploy with Docker

If you just want to quickly deploy it using the pre-trained model Mixtral-8x7B-Instruct, you can use docker:

docker run -it -p 8000:8000 ghcr.io/vemonet/libre-chat:main

You can configure the deployment using environment variables. For this, using docker compose and a .env file is easier. First create the docker-compose.yml file:

version: "3"
services:
  libre-chat:
    image: ghcr.io/vemonet/libre-chat:main
    volumes:
      # Share folders from the current directory to the /data dir in the container
      - ./chat.yml:/data/chat.yml
      - ./models:/data/models
      - ./documents:/data/documents
      - ./embeddings:/data/embeddings
      - ./vectorstore:/data/vectorstore
    ports:
      - 8000:8000

Then create a chat.yml file with your configuration in the same folder as docker-compose.yml:

llm:
  model_path: ./models/mixtral-8x7b-instruct-v0.1.Q2_K.gguf
  model_download: https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q2_K.gguf
  temperature: 0.01    # Controls how creative, but also potentially wrong, the model can be. 0 is safe, 1 is adventurous
  max_new_tokens: 1024 # Max number of tokens the LLM can generate
  # Always use input for the human input variable with a generic agent
  prompt_variables: [input, history]
  prompt_template: |
    You are an assistant, please help me

    {history}
    User: {input}
    AI Assistant:

vector:
  vector_path: null # Path to the vectorstore to do QA retrieval, e.g. ./vectorstore/db_faiss
  # Set to null to deploy a generic conversational agent
  vector_download: null
  embeddings_path: ./embeddings/all-MiniLM-L6-v2 # Path to embeddings used to generate the vectors, or use directly from HuggingFace: sentence-transformers/all-MiniLM-L6-v2
  embeddings_download: https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/all-MiniLM-L6-v2.zip
  documents_path: ./documents # Path to documents to vectorize
  chunk_size: 500             # Maximum size of chunks, in terms of number of characters
  chunk_overlap: 50           # Overlap in characters between chunks
  chain_type: stuff           # Or: map_reduce, reduce, map_rerank. More details: https://docs.langchain.com/docs/components/chains/index_related_chains
  search_type: similarity     # Or: similarity_score_threshold, mmr. More details: https://python.langchain.com/docs/modules/data_connection/retrievers/vectorstore
  return_sources_count: 2     # Number of sources to return when generating an answer
  score_threshold: null       # If using the similarity_score_threshold search type. Between 0 and 1

info:
  title: "Libre Chat"
  version: "0.1.0"
  description: |
    Open source and free chatbot powered by [LangChain](https://python.langchain.com) and [llama.cpp](https://github.com/ggerganov/llama.cpp)
  examples:
  - What is the capital of the Netherlands?
  - Which drugs are approved by the FDA to mitigate Alzheimer symptoms?
  - How can I create a logger with timestamp using python logging?
  favicon: https://raw.github.com/vemonet/libre-chat/main/docs/docs/assets/logo.png
  repository_url: https://github.com/vemonet/libre-chat
  public_url: https://chat.semanticscience.org
  contact:
    name: Vincent Emonet
    email: [email protected]
  license_info:
    name: MIT license
    url: https://raw.github.com/vemonet/libre-chat/main/LICENSE.txt
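
To illustrate how `prompt_template` and `prompt_variables` fit together, the sketch below fills the `{history}` and `{input}` placeholders with plain Python string formatting. This is only an approximation under assumptions: in the actual service the substitution is handled through LangChain, and both `render_prompt` and the way past turns are formatted are hypothetical choices made for this example.

```python
from typing import List, Tuple

# Same shape as the prompt_template in chat.yml
PROMPT_TEMPLATE = (
    "You are an assistant, please help me\n"
    "\n"
    "{history}\n"
    "User: {input}\n"
    "AI Assistant:"
)


def render_prompt(template: str, history: List[Tuple[str, str]], user_input: str) -> str:
    """Hypothetical helper: fill the {history} and {input} placeholders."""
    # Assumption: each past turn is rendered as a User / AI Assistant pair
    history_text = "\n".join(
        f"User: {question}\nAI Assistant: {answer}" for question, answer in history
    )
    return template.format(history=history_text, input=user_input)


past_turns = [("What is the capital of the Netherlands?", "Amsterdam.")]
prompt = render_prompt(PROMPT_TEMPLATE, past_turns, "And of Belgium?")
print(prompt)
```

The template ends with `AI Assistant:` so that the model continues the text with the assistant's next reply.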

Finally, start your chat service with:

docker compose up

Usage with pip

This package requires Python >=3.8. Simply install it with pipx or pip:

pip install libre-chat

⌨️ Use as a command-line interface

You can easily start a new chat web service, including the UI and the API, from your terminal:

libre-chat start
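
Once the service is running, its web API can be called from any HTTP client. The sketch below builds a POST request using only the Python standard library. Note the assumptions: the service is taken to listen on `http://localhost:8000` (the port used in the Docker example), and the `/prompt` path and `{"prompt": ...}` JSON body are assumed here, so check the OpenAPI documentation served by your deployment for the actual operations.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: default local deployment
PROMPT_PATH = "/prompt"             # assumption: verify in the service's OpenAPI docs


def build_prompt_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the question."""
    payload = json.dumps({"prompt": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + PROMPT_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, timeout: float = 120.0) -> dict:
    """Send the question to a running service and return the decoded JSON reply."""
    # Generous timeout because inference on CPU can be slow
    with urllib.request.urlopen(build_prompt_request(question), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running server; call ask(...) once the service is up
request = build_prompt_request("What is the capital of the Netherlands?")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it possible to inspect the request without a running service.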

Provide a specific configuration file:

libre-chat start config/chat-vectorstore-qa.yml

To rebuild the vectorstore:

libre-chat build --vector vectorstore/db_faiss --documents documents
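
Building a vectorstore involves cutting documents into chunks before they are embedded, which is what the `chunk_size` and `chunk_overlap` settings control. The sketch below shows the basic idea with a naive character-based splitter. It is a simplification under assumptions: `split_into_chunks` is a hypothetical helper, and the text splitters in LangChain are more elaborate (they try to cut on separators such as paragraph breaks).

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Hypothetical helper: cut text into overlapping fixed-size character chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# 1200 characters of sample text made of repeating digits
sample_text = "".join(str(i % 10) for i in range(1200))
chunks = split_into_chunks(sample_text, chunk_size=500, chunk_overlap=50)
print([len(chunk) for chunk in chunks])  # [500, 500, 300]
```

The overlap means the end of one chunk is repeated at the start of the next, which reduces the chance that a sentence relevant to a question is split across two chunks with no shared context.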

Get a full rundown of the available options with:

libre-chat --help

Use with Python

Alternatively, you can use this package in Python scripts:

import logging

import uvicorn
from libre_chat import ChatConf, ChatEndpoint, Llm

logging.basicConfig(level=logging.getLevelName("INFO"))
conf = ChatConf(
    model_path="./models/mixtral-8x7b-instruct-v0.1.Q2_K.gguf",
    vector_path=None,
)
llm = Llm(conf=conf)
print(llm.query("What is the capital of the Netherlands?"))

# Create and deploy a FastAPI app based on your LLM
app = ChatEndpoint(llm=llm, conf=conf)
uvicorn.run(app)