chroma
0.5.20
Chroma - the open-source embedding database.
The fastest way to build Python or JavaScript LLM apps with memory!
pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path
The core API is only 4 functions (run our Google Colab or Replit template):
import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()
# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")
# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    documents=["This is document1", "This is document2"],  # we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well
    metadatas=[{"source": "notion"}, {"source": "google-docs"}],  # filter on these!
    ids=["doc1", "doc2"],  # unique for each doc
)
# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"},  # optional filter
    # where_document={"$contains": "search_string"}  # optional filter
)
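Integrations: LangChain (Python and JS), LlamaIndex, and more coming soon.
For example, in the "Chat your data" use case, documents are composed into the context window of an LLM such as GPT3 for additional summarization or analysis.
What are embeddings?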
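Embedding something turns it into a list of numbers, such as [1.2, 2.1, ....]. This process makes documents "understandable" to a machine learning model. An embedding database (also known as a vector database) stores embeddings and lets you search by nearest neighbors, rather than by substrings as a traditional database does. By default, Chroma uses Sentence Transformers to embed for you, but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.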
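To make "search by nearest neighbors" concrete, here is a small brute-force sketch in plain Python. It is not how Chroma is implemented; the helper names `cosine_similarity` and `nearest_neighbors`, the choice of cosine similarity, and the 3-dimensional toy vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, store, k=2):
    """Return the ids of the k stored vectors most similar to `query`."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,  # highest similarity first
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy "embedding database": document id -> embedding vector.
store = {
    "doc1": [1.2, 2.1, 0.0],
    "doc2": [0.0, 0.3, 2.5],
    "doc3": [1.0, 2.0, 0.1],
}

top = nearest_neighbors([1.1, 2.0, 0.0], store, k=2)
print(top)
```

The query vector points in nearly the same direction as `doc1` and `doc3`, so those two are returned, even though no substring matching is involved. A linear scan like this becomes slow as the collection grows, which is one reason a dedicated embedding database is useful.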
Chroma is a rapidly developing project. We welcome PR contributors and ideas for how to improve the project.
#contributing channel
Good first issue tag
Release Cadence
We currently release new tagged versions of the pypi and npm packages on Mondays. Hotfixes go out at any time during the week.
Apache 2.0