⚡ Building LLM-powered applications in Ruby ⚡
For a deep Rails integration, see the langchainrb_rails gem.
Available for paid consulting engagements! Email me.
Install the gem and add it to the application's Gemfile by executing:
bundle add langchainrb
If bundler is not being used to manage dependencies, install the gem by executing:
gem install langchainrb
Additional gems may be required. They are not included by default so that you can include only what you need.
require "langchain"
The Langchain::LLM module provides a unified interface for interacting with various Large Language Model (LLM) providers. This abstraction allows you to easily switch between different LLM backends without changing your application code.
All LLM classes inherit from Langchain::LLM::Base and provide a consistent interface for common operations.
Most LLM classes can be initialized with an API key and optional default options:
llm = Langchain::LLM::OpenAI.new(
  api_key: ENV["OPENAI_API_KEY"],
  default_options: { temperature: 0.7, chat_model: "gpt-4o" }
)
Use the embed method to generate an embedding for a given text:
response = llm.embed(text: "Hello, world!")
embedding = response.embedding
Accepted parameters for embed():
- text: (required) The input text to embed.
- model: (optional) The name of the model to use; otherwise the default embedding model is used.
Use the complete method to generate a completion for a given prompt:
response = llm.complete(prompt: "Once upon a time")
completion = response.completion
Accepted parameters for complete():
- prompt: (required) The input prompt for the completion.
- max_tokens: (optional) The maximum number of tokens to generate.
- temperature: (optional) Controls the randomness of the generation. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more deterministic.
- top_p: (optional) An alternative to temperature that controls the diversity of generated tokens.
- n: (optional) The number of completions to generate for each prompt.
- stop: (optional) Sequences at which the API will stop generating further tokens.
- presence_penalty: (optional) Penalizes new tokens based on their presence in the text so far.
- frequency_penalty: (optional) Penalizes new tokens based on their frequency in the text so far.
Use the chat method to generate a chat completion:
messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What's the weather like today?" }
  # Google Gemini and Google VertexAI expect messages in a different format:
  # { role: "user", parts: [{ text: "why is the sky blue?" }] }
]
response = llm.chat(messages: messages)
chat_completion = response.chat_completion
Accepted parameters for chat():
- messages: (required) An array of message objects representing the conversation history.
- model: (optional) The specific chat model to use.
- temperature: (optional) Controls the randomness of the generation.
- top_p: (optional) An alternative to temperature that controls the diversity of generated tokens.
- n: (optional) The number of chat completion choices to generate.
- max_tokens: (optional) The maximum number of tokens to generate in the chat completion.
- stop: (optional) Sequences at which the API will stop generating further tokens.
- presence_penalty: (optional) Penalizes new tokens based on their presence in the text so far.
- frequency_penalty: (optional) Penalizes new tokens based on their frequency in the text so far.
- logit_bias: (optional) Modifies the likelihood of specified tokens appearing in the completion.
- user: (optional) A unique identifier representing your end user.
- tools: (optional) A list of tools the model may call.
- tool_choice: (optional) Controls how the model calls functions.
Thanks to the unified interface, you can easily switch between different LLM providers by changing the class you instantiate:
# Using Anthropic
anthropic_llm = Langchain::LLM::Anthropic.new(api_key: ENV["ANTHROPIC_API_KEY"])
# Using Google Gemini
gemini_llm = Langchain::LLM::GoogleGemini.new(api_key: ENV["GOOGLE_GEMINI_API_KEY"])
# Using OpenAI
openai_llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
Each LLM method returns a response object that provides a consistent interface for accessing the results:
- embedding: Returns the embedding vector
- completion: Returns the generated text completion
- chat_completion: Returns the generated chat completion
- tool_calls: Returns the tool calls made by the LLM
- prompt_tokens: Returns the number of tokens in the prompt
- completion_tokens: Returns the number of tokens in the completion
- total_tokens: Returns the total number of tokens used
Note: While the core interface is consistent across providers, some LLMs may offer additional features or parameters. Consult the documentation for each LLM class to learn about provider-specific capabilities and options.
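As an illustration, the unified response shape can be mimicked with a small stub (a hypothetical class, not part of the gem) that maps the accessors above onto an OpenAI-style raw payload:

```ruby
# A minimal stand-in for an LLM response object, illustrating the unified
# accessors listed above. The gem's real response classes wrap the raw
# provider payload; this stub hard-codes one.
class StubResponse
  def initialize(raw)
    @raw = raw
  end

  # The generated chat completion text
  def chat_completion
    @raw.dig("choices", 0, "message", "content")
  end

  # Token accounting, useful for tracking usage and cost
  def prompt_tokens
    @raw.dig("usage", "prompt_tokens")
  end

  def completion_tokens
    @raw.dig("usage", "completion_tokens")
  end

  def total_tokens
    @raw.dig("usage", "total_tokens")
  end
end

response = StubResponse.new(
  "choices" => [{ "message" => { "content" => "Hello!" } }],
  "usage" => { "prompt_tokens" => 9, "completion_tokens" => 3, "total_tokens" => 12 }
)
response.chat_completion # => "Hello!"
response.total_tokens    # => 12
```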
Create a prompt with input variables:
prompt = Langchain::Prompt::PromptTemplate.new(template: "Tell me a {adjective} joke about {content}.", input_variables: ["adjective", "content"])
prompt.format(adjective: "funny", content: "chickens") # "Tell me a funny joke about chickens."
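For intuition, the `{variable}` substitution that a prompt template performs can be approximated in a few lines of plain Ruby (an illustrative re-implementation, not the gem's actual code):

```ruby
# Naive re-implementation of {variable} substitution as performed by a
# prompt template. Placeholders without a matching variable are left as-is.
def format_template(template, vars)
  template.gsub(/\{(\w+)\}/) do
    key = Regexp.last_match(1).to_sym
    vars.key?(key) ? vars[key].to_s : Regexp.last_match(0)
  end
end

format_template("Tell me a {adjective} joke about {content}.",
                adjective: "funny", content: "chickens")
# => "Tell me a funny joke about chickens."
```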
Creating a PromptTemplate using just a prompt and no input_variables:
prompt = Langchain::Prompt::PromptTemplate.from_template("Tell me a funny joke about chickens.")
prompt.input_variables # []
prompt.format # "Tell me a funny joke about chickens."
Save the prompt template to a JSON file:
prompt.save(file_path: "spec/fixtures/prompt/prompt_template.json")
Loading a new prompt template from a JSON file:
prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/prompt_template.json")
prompt.input_variables # ["adjective", "content"]
Create a prompt with few-shot examples:
prompt = Langchain::Prompt::FewShotPromptTemplate.new(
  prefix: "Write antonyms for the following words.",
  suffix: "Input: {adjective}\nOutput:",
  example_prompt: Langchain::Prompt::PromptTemplate.new(
    input_variables: ["input", "output"],
    template: "Input: {input}\nOutput: {output}"
  ),
  examples: [
    { "input": "happy", "output": "sad" },
    { "input": "tall", "output": "short" }
  ],
  input_variables: ["adjective"]
)
prompt.format(adjective: "good")
# Write antonyms for the following words.
#
# Input: happy
# Output: sad
#
# Input: tall
# Output: short
#
# Input: good
# Output:
Save the prompt template to a JSON file:
prompt.save(file_path: "spec/fixtures/prompt/few_shot_prompt_template.json")
Loading a new prompt template from a JSON file:
prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/few_shot_prompt_template.json")
prompt.prefix # "Write antonyms for the following words."
Loading a new prompt template from a YAML file:
prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/prompt_template.yaml")
prompt.input_variables #=> ["adjective", "content"]
Parse LLM text responses into structured output, such as JSON.
You can use the StructuredOutputParser to generate a prompt that instructs the LLM to provide a JSON response adhering to a specific JSON schema:
json_schema = {
  type: "object",
  properties: {
    name: {
      type: "string",
      description: "Person's name"
    },
    age: {
      type: "number",
      description: "Person's age"
    },
    interests: {
      type: "array",
      items: {
        type: "object",
        properties: {
          interest: {
            type: "string",
            description: "A topic of interest"
          },
          levelOfInterest: {
            type: "number",
            description: "A value between 0 and 100 of how interested the person is in this interest"
          }
        },
        required: ["interest", "levelOfInterest"],
        additionalProperties: false
      },
      minItems: 1,
      maxItems: 3,
      description: "A list of the person's interests"
    }
  },
  required: ["name", "age", "interests"],
  additionalProperties: false
}
parser = Langchain::OutputParsers::StructuredOutputParser.from_json_schema(json_schema)
prompt = Langchain::Prompt::PromptTemplate.new(template: "Generate details of a fictional character.\n{format_instructions}\nCharacter description: {description}", input_variables: ["description", "format_instructions"])
prompt_text = prompt.format(description: "Korean chemistry student", format_instructions: parser.get_format_instructions)
# Generate details of a fictional character.
# You must format your output as a JSON value that adheres to a given "JSON Schema" instance.
# ...
Then parse the LLM response:
llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
llm_response = llm.chat(messages: [{ role: "user", content: prompt_text }]).completion
parser.parse(llm_response)
# {
# "name" => "Kim Ji-hyun",
# "age" => 22,
# "interests" => [
# {
# "interest" => "Organic Chemistry",
# "levelOfInterest" => 85
# },
# ...
# ]
# }
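The core of what the parser does after the model replies can be approximated with stdlib JSON plus a required-keys check (a simplified sketch; the real parser validates against the full schema):

```ruby
require "json"

# Simplified structured-output parsing: extract the JSON payload from the
# model's text and check the schema's required top-level keys.
def parse_structured(llm_text, required_keys)
  # Models often wrap JSON in prose or code fences; grab the outermost object.
  json = llm_text[/\{.*\}/m] or raise "no JSON object found"
  data = JSON.parse(json)
  missing = required_keys - data.keys
  raise "missing keys: #{missing.join(", ")}" unless missing.empty?
  data
end

reply = <<~TEXT
  Here is the character:
  {"name": "Kim Ji-hyun", "age": 22, "interests": [{"interest": "Organic Chemistry", "levelOfInterest": 85}]}
TEXT
person = parse_structured(reply, %w[name age interests])
person["name"] # => "Kim Ji-hyun"
```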
If the parser fails to parse the LLM response, you can use the OutputFixingParser. It sends an error message, the previous output, and the original prompt text to the LLM, asking for a "fixed" response:
begin
  parser.parse(llm_response)
rescue Langchain::OutputParsers::OutputParserException => e
  fix_parser = Langchain::OutputParsers::OutputFixingParser.from_llm(
    llm: llm,
    parser: parser
  )
  fix_parser.parse(llm_response)
end
Alternatively, if you don't need to handle the OutputParserException, you can simplify the code:
# we already have the `StructuredOutputParser`:
# parser = Langchain::OutputParsers::StructuredOutputParser.from_json_schema(json_schema)
fix_parser = Langchain::OutputParsers::OutputFixingParser.from_llm(
  llm: llm,
  parser: parser
)
fix_parser.parse(llm_response)
See here for a concrete example.
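The retry-and-fix pattern behind this can be sketched with a stubbed "fixer" model (hypothetical stand-ins; the real OutputFixingParser prompts the LLM with the error, the bad output, and the original instructions):

```ruby
require "json"

# A stubbed "fixer": given broken output and the parse error, return a
# corrected version. This one simply appends the missing closing characters;
# in the real parser, an LLM produces the fix.
fixer = ->(bad_output, _error) { bad_output + "\"}" }

# Try to parse; on failure, ask the fixer for a corrected output and retry.
def parse_with_fixing(text, fixer, attempts: 2)
  attempts.times do
    begin
      return JSON.parse(text)
    rescue JSON::ParserError => e
      text = fixer.call(text, e.message)
    end
  end
  JSON.parse(text) # final attempt; raises if still broken
end

result = parse_with_fixing('{"name": "Kim', fixer)
# => {"name" => "Kim"}
```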
RAG is a methodology that assists LLMs in generating accurate and up-to-date information. A typical RAG workflow follows 3 steps: retrieve documents relevant to the question, augment the prompt with the retrieved context, and have the LLM generate the answer.
Langchain.rb provides a convenient unified interface on top of supported vector search databases that makes it easy to configure your index, add data, query, and retrieve from it.
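The three steps can be sketched end-to-end with toy components (the bag-of-words "embedder" below is a stand-in for a real embedding model, and the final step would call the LLM):

```ruby
# Toy RAG pipeline: embed the question, retrieve the nearest document,
# and build an augmented prompt for the LLM.
def embed(text)
  text.downcase.scan(/\w+/).tally # word counts as a crude "embedding"
end

def similarity(a, b)
  (a.keys & b.keys).sum { |w| a[w] * b[w] } # dot product over shared words
end

documents = [
  "The Eiffel Tower is located in Paris.",
  "Ruby was created by Yukihiro Matsumoto."
]
index = documents.map { |doc| [embed(doc), doc] }

question = "Who created Ruby?"
q_vec = embed(question)

# Step 1: retrieve the most relevant document
context = index.max_by { |vec, _doc| similarity(q_vec, vec) }.last
# Step 2: augment the prompt with the retrieved context
prompt = "Answer using this context:\n#{context}\n\nQuestion: #{question}"
# Step 3: generate -- in a real pipeline:
#   llm.chat(messages: [{ role: "user", content: prompt }])
```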
Database | Open-source | Cloud offering |
---|---|---|
Chroma | ✅ | ✅ |
Epsilla | ✅ | ✅ |
Hnswlib | ✅ | |
Milvus | ✅ | ✅ Zilliz Cloud |
Pinecone | | ✅ |
Pgvector | ✅ | ✅ |
Qdrant | ✅ | ✅ |
Weaviate | ✅ | ✅ |
Elasticsearch | ✅ | ✅ |
Pick the vector search database you'll use, add the gem dependency, and instantiate the client:
gem "weaviate-ruby", "~> 0.8.9"
Pick and instantiate the LLM provider you'll use to generate embeddings:
llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
client = Langchain::Vectorsearch::Weaviate.new(
  url: ENV["WEAVIATE_URL"],
  api_key: ENV["WEAVIATE_API_KEY"],
  index_name: "Documents",
  llm: llm
)
You can instantiate any other supported vector search database:
client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
client = Langchain::Vectorsearch::Epsilla.new(...) # `gem "epsilla-ruby", "~> 0.0.3"`
client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.3"`
client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
client = Langchain::Vectorsearch::Qdrant.new(...) # `gem "qdrant-ruby", "~> 0.9.3"`
client = Langchain::Vectorsearch::Elasticsearch.new(...) # `gem "elasticsearch", "~> 8.2.0"`
Create the default schema:
client.create_default_schema
Add plain text data to your vector search database:
client.add_texts(
  texts: [
    "Begin by preheating your oven to 375°F (190°C). Prepare four boneless, skinless chicken breasts by cutting a pocket into the side of each breast, being careful not to cut all the way through. Season the chicken with salt and pepper to taste. In a large skillet, melt 2 tablespoons of unsalted butter over medium heat. Add 1 small diced onion and 2 minced garlic cloves, and cook until softened, about 3-4 minutes. Add 8 ounces of fresh spinach and cook until wilted, about 3 minutes. Remove the skillet from heat and let the mixture cool slightly.",
    "In a bowl, combine the spinach mixture with 4 ounces of softened cream cheese, 1/4 cup of grated Parmesan cheese, 1/4 cup of shredded mozzarella cheese, and 1/4 teaspoon of red pepper flakes. Mix until well combined. Stuff each chicken breast pocket with an equal amount of the spinach mixture. Seal the pocket with a toothpick if necessary. In the same skillet, heat 1 tablespoon of olive oil over medium-high heat. Add the stuffed chicken breasts and sear on each side for 3-4 minutes, or until golden brown."
  ]
)
Or use the file parsers to load, parse, and index data into your database:
my_pdf = Langchain.root.join("path/to/my.pdf")
my_text = Langchain.root.join("path/to/my.txt")
my_docx = Langchain.root.join("path/to/my.docx")
client.add_data(paths: [my_pdf, my_text, my_docx])
Supported file formats: docx, html, pdf, text, json, jsonl, csv, xlsx, eml, pptx.
Retrieve similar documents based on the query string passed in:
client.similarity_search(
  query:,
  k: # number of results to be retrieved
)
Retrieve similar documents based on the query string passed in, via the HyDE technique:
client.similarity_search_with_hyde()
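HyDE (Hypothetical Document Embeddings) first asks the LLM to write a hypothetical answer to the query, then searches by that answer's embedding rather than the raw query's. A stubbed sketch of the idea (hypothetical helpers, not the gem's internals):

```ruby
# HyDE in miniature: generate a hypothetical answer, embed it, and use that
# embedding for the similarity search. The LLM and embedder are stubs.
hypothetical_answer = lambda do |query|
  # A real implementation would call something like:
  #   llm.complete(prompt: "Write a passage answering: #{query}")
  "Ruby was created by Yukihiro Matsumoto in the mid-1990s."
end

embed = ->(text) { text.downcase.scan(/\w+/).tally }
overlap = ->(a, b) { (a.keys & b.keys).sum { |w| a[w] * b[w] } }

documents = [
  "The Eiffel Tower is located in Paris.",
  "Matz, Yukihiro Matsumoto, designed the Ruby language."
]

query = "Who made Ruby?"
hyde_vec = embed.call(hypothetical_answer.call(query))
best = documents.max_by { |doc| overlap.call(hyde_vec, embed.call(doc)) }
```

The hypothetical answer shares far more vocabulary with the relevant document than the terse query does, which is the intuition behind HyDE.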
Retrieve similar documents based on the embedding passed in:
client.similarity_search_by_vector(
  embedding:,
  k: # number of results to be retrieved
)
RAG-based querying:
client.ask(question: "...")
Langchain::Assistant is a powerful and flexible class that combines Large Language Models (LLMs), tools, and conversation management to create intelligent, interactive assistants. It is designed to handle complex conversations, execute tools, and provide coherent responses based on the context of the interaction.
llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
assistant = Langchain::Assistant.new(
  llm: llm,
  instructions: "You're a helpful AI assistant",
  tools: [Langchain::Tool::NewsRetriever.new(api_key: ENV["NEWS_API_KEY"])]
)
# Add a user message and run the assistant
assistant.add_message_and_run!(content: "What's the latest news about AI?")
# Supply an image to the assistant
assistant.add_message_and_run!(
  content: "Show me a picture of a cat",
  image_url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
)
# Access the conversation thread
messages = assistant.messages
# Run the assistant with automatic tool execution
assistant.run(auto_tool_execution: true)
# If you want to stream the response, you can add a response handler
assistant = Langchain::Assistant.new(
  llm: llm,
  instructions: "You're a helpful AI assistant",
  tools: [Langchain::Tool::NewsRetriever.new(api_key: ENV["NEWS_API_KEY"])]
) do |response_chunk|
  # ...handle the response stream
  # print(response_chunk.inspect)
end
assistant.add_message(content: "Hello")
assistant.run(auto_tool_execution: true)
Note that streaming is not currently supported for all LLMs.
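A response handler typically accumulates chunks into the full reply as they arrive. With stubbed chunks (the delta shape below is an assumption; actual chunk structure varies by provider):

```ruby
# Accumulate streamed chunks into a full response. The chunk shape mimics a
# delta-style streaming payload; real chunks come from the provider's API.
chunks = [
  { "delta" => { "content" => "The sky " } },
  { "delta" => { "content" => "is blue." } }
]

buffer = +""
handler = ->(chunk) do
  piece = chunk.dig("delta", "content")
  buffer << piece if piece
end

chunks.each { |c| handler.call(c) }
buffer # => "The sky is blue."
```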
Accepted parameters:
- llm: The LLM instance to use (required)
- tools: An array of tool instances (optional)
- instructions: System instructions for the assistant (optional)
- tool_choice: Specifies how tools should be selected. Default: "auto". A specific tool function name can also be passed in; this will force the Assistant to always use this function.
- parallel_tool_calls: Whether to make multiple parallel tool calls. Default: true
- add_message_callback: A callback function (proc, lambda) that is called when any message is added to the conversation (optional)
  assistant.add_message_callback = ->(message) { puts "New message: #{message}" }
- tool_execution_callback: A callback function (proc, lambda) that is called right before a tool is executed (optional)
  assistant.tool_execution_callback = ->(tool_call_id, tool_name, method_name, tool_arguments) { puts "Executing tool_call_id: #{tool_call_id}, tool_name: #{tool_name}, method_name: #{method_name}, tool_arguments: #{tool_arguments}" }
Key methods:
- add_message: Adds the user message to the messages array
- run!: Processes the conversation and generates responses
- add_message_and_run!: Combines adding a message and running the assistant
- submit_tool_output: Manually submit output to a tool call
- messages: Returns a list of ongoing messages
Built-in tools:
- Langchain::Tool::Calculator: Useful for evaluating math expressions. Requires gem "eqn".
- Langchain::Tool::Database: Connect to your SQL database. Requires gem "sequel".
- Langchain::Tool::FileSystem: Interact with the file system (read and write).
- Langchain::Tool::RubyCodeInterpreter: Useful for evaluating generated Ruby code. Requires gem "safe_ruby" (in need of a better solution).
- Langchain::Tool::NewsRetriever: A wrapper around NewsApi.org to fetch news articles.
- Langchain::Tool::Tavily: A wrapper around Tavily AI.
- Langchain::Tool::Weather: Calls the Open Weather API to retrieve the current weather.
- Langchain::Tool::Wikipedia: Calls the Wikipedia API.
Langchain::Assistant can be easily extended with custom tools by creating classes that extend the Langchain::ToolDefinition module and implement the required methods:
class MovieInfoTool
  extend Langchain::ToolDefinition

  define_function :search_movie, description: "MovieInfoTool: Search for a movie by title" do
    property :query, type: "string", description: "The movie title to search for", required: true
  end

  define_function :get_movie_details, description: "MovieInfoTool: Get detailed information about a specific movie" do
    property :movie_id, type: "integer", description: "The TMDb ID of the movie", required: true
  end

  def initialize(api_key:)
    @api_key = api_key
  end

  def search_movie(query:)
    ...
  end

  def get_movie_details(movie_id:)
    ...
  end
end
movie_tool = MovieInfoTool.new(api_key: "...")
assistant = Langchain::Assistant.new(
  llm: llm,
  instructions: "You're a helpful AI assistant that can provide movie information",
  tools: [movie_tool]
)
assistant.add_message_and_run(content: "Can you tell me about the movie 'Inception'?")
# Check the response in the last message in the conversation
assistant.messages.last
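Automatic tool execution boils down to a loop: send the messages, and if the model responds with a tool call, execute it, append the result, and repeat until a plain answer comes back. A stubbed sketch of that loop (hypothetical message shapes, not the gem's internals):

```ruby
# Toy auto-tool-execution loop. The "model" first requests a tool call,
# then answers once the tool result is in the conversation.
def stub_model(messages)
  if messages.any? { |m| m[:role] == "tool" }
    { role: "assistant", content: "The movie Inception came out in 2010." }
  else
    { role: "assistant", tool_call: { name: "search_movie", args: { query: "Inception" } } }
  end
end

tools = {
  "search_movie" => ->(query:) { "#{query} (2010), directed by Christopher Nolan" }
}

messages = [{ role: "user", content: "Tell me about Inception" }]
loop do
  reply = stub_model(messages)
  messages << reply
  break unless reply[:tool_call] # plain answer: we're done

  call = reply[:tool_call]
  result = tools[call[:name]].call(**call[:args])
  messages << { role: "tool", content: result }
end
messages.last[:content] # the assistant's final answer
```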
The Assistant includes error handling for invalid inputs, unsupported LLM types, and tool execution failures. It uses a state machine to manage the conversation flow and handle different scenarios gracefully.
The Evaluations module is a collection of tools that can be used to evaluate and track the performance of the outputs of LLMs and your RAG (Retrieval-Augmented Generation) pipelines.
Ragas helps you evaluate your Retrieval-Augmented Generation (RAG) pipelines. The implementation is based on this paper and the original Python repo. Ragas tracks the following 3 metrics and assigns a 0.0 - 1.0 score to each: faithfulness, answer relevance, and context relevance.
# We recommend using Langchain::LLM::OpenAI as your llm for Ragas
ragas = Langchain::Evals::Ragas::Main.new(llm: llm)
# The answer that the LLM generated
# The question (or the original prompt) that was asked
# The context that was retrieved (usually from a vectorsearch database)
ragas.score(answer: "", question: "", context: "")
# =>
# {
# ragas_score: 0.6601257446503674,
# answer_relevance_score: 0.9573145866787608,
# context_relevance_score: 0.6666666666666666,
# faithfulness_score: 0.5
# }
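The combined ragas_score is the harmonic mean of the three metric scores, which you can verify against the sample output above:

```ruby
# Combine the three Ragas metrics into an overall score via harmonic mean.
def ragas_score(answer_relevance:, context_relevance:, faithfulness:)
  scores = [answer_relevance, context_relevance, faithfulness]
  scores.size / scores.sum { |s| 1.0 / s }
end

score = ragas_score(
  answer_relevance: 0.9573145866787608,
  context_relevance: 0.6666666666666666,
  faithfulness: 0.5
)
# score ≈ 0.66013, matching the ragas_score in the sample output
```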
Additional examples available: /examples
Langchain.rb uses the standard Ruby Logger mechanism and defaults to the same level value (currently Logger::DEBUG).
To show all log messages:
Langchain.logger.level = Logger::DEBUG
By default, the logger logs to STDOUT. In order to configure the log destination (i.e. log to a file), do:
Langchain.logger = Logger.new("path/to/file", **Langchain::LOGGER_OPTIONS)
If you're having issues installing the unicode gem required by pragmatic_segmenter, try running:
gem install unicode -- --with-cflags="-Wno-incompatible-function-pointer-types"
1. git clone https://github.com/andreibondarev/langchainrb.git
2. cp .env.example .env, then fill out the environment variables in .env
3. bundle exec rake to ensure that the tests pass and to run standardrb
4. bin/console to load the gem in a REPL session. Feel free to add your own instances of LLMs, Tools, Agents, etc. and experiment with them.
5. Optionally, install git hooks: gem install lefthook && lefthook install -f
Join us in the Langchain.rb Discord server.
Bug reports and pull requests are welcome on GitHub at https://github.com/andreibondarev/langchainrb.
The gem is available as open source under the terms of the MIT License.