LangChain is a framework that makes it easy to work with LLMs (Large Language Models).
LangChain Basic explains example code for each LangChain component.
This section describes how to apply LangChain to a SageMaker endpoint created with Falcon FM. It uses the SageMaker endpoint obtained by installing Falcon FM through SageMaker JumpStart (e.g., jumpstart-dft-hf-llm-falcon-7b-instruct-bf16).
Referring to Falcon's input and output formats, register the ContentHandler's transform_input and transform_output as shown below.
import json

from langchain import PromptTemplate, SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler

class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
        input_str = json.dumps({'inputs': prompt, 'parameters': model_kwargs})
        return input_str.encode('utf-8')

    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generated_text"]
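The handler can be exercised without a live endpoint by round-tripping a fake response. The stub below mirrors the two methods above as plain functions; the [{"generated_text": ...}] response shape follows Falcon's output as used in transform_output:

```python
import io
import json

def transform_input(prompt: str, model_kwargs: dict) -> bytes:
    # request body the endpoint receives
    return json.dumps({'inputs': prompt, 'parameters': model_kwargs}).encode('utf-8')

def transform_output(output) -> str:
    # Falcon returns a JSON list: [{"generated_text": "..."}]
    response_json = json.loads(output.read().decode('utf-8'))
    return response_json[0]['generated_text']

body = transform_input("Tell me a joke", {"max_new_tokens": 300})
fake = io.BytesIO(json.dumps([{"generated_text": "A joke."}]).encode('utf-8'))
text = transform_output(fake)
# text == 'A joke.'
```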
Register the llm for the SageMaker Endpoint with endpoint_name, aws_region, parameters, and content_handler as shown below.
import boto3

endpoint_name = 'jumpstart-dft-hf-llm-falcon-7b-instruct-bf16'
aws_region = boto3.Session().region_name
parameters = {
    "max_new_tokens": 300
}
content_handler = ContentHandler()

llm = SagemakerEndpoint(
    endpoint_name=endpoint_name,
    region_name=aws_region,
    model_kwargs=parameters,
    content_handler=content_handler
)
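Only max_new_tokens is set above, but the parameters dict is passed straight through as the Hugging Face text-generation payload's "parameters" field, so other generation options can be added. A hedged fuller example (names follow the HF text-generation payload convention; verify against your model's documentation):

```python
# additional generation parameters, passed as model_kwargs to the endpoint
parameters = {
    "max_new_tokens": 300,     # cap on generated tokens
    "temperature": 0.7,        # sampling temperature
    "top_p": 0.9,              # nucleus sampling cutoff
    "do_sample": True,         # sample instead of greedy decoding
    "return_full_text": False, # return only the completion, not the prompt
}
```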
You can check that the llm works as follows.
llm("Tell me a joke")
The result looks like this.
I once told a joke to a friend, but it didn't work. He just looked
Web loader - you can load web pages with LangChain.
from langchain.document_loaders import WebBaseLoader
from langchain.indexes import VectorstoreIndexCreator

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
index = VectorstoreIndexCreator().from_loaders([loader])
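VectorstoreIndexCreator embeds each chunk of the loaded page and, at query time, retrieves the chunks most similar to the question. A toy stand-in for that retrieval step, with bag-of-words counts in place of real embeddings (real indexes use dense embedding vectors):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # toy "embedding": bag-of-words counts
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

chunks = [
    "The agent decomposes a complicated task into smaller subgoals",
    "Paris is the capital of France",
]
query = "How does the agent break down a task"
# retrieval: pick the chunk most similar to the query
best = max(chunks, key=lambda c: cosine(embed(c), embed(query)))
# best is the agent chunk
```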
After defining a template as shown below, you can define an LLMChain and run it. For details, see langchain-sagemaker-endpoint-Q&A.ipynb.
from langchain import PromptTemplate, LLMChain

template = "Tell me a {adjective} joke about {content}."
prompt = PromptTemplate.from_template(template)
llm_chain = LLMChain(prompt=prompt, llm=llm)
outputText = llm_chain.run(adjective="funny", content="chickens")
print(outputText)
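The two steps above can be sketched without LangChain installed: from_template infers the input variables from the {placeholders}, and run fills them in before calling the llm. A minimal stand-in using Python's own string formatting:

```python
import string

template = "Tell me a {adjective} joke about {content}."

# infer input variables from the placeholders, as from_template does
input_variables = [field for _, field, _, _ in string.Formatter().parse(template) if field]

# filling the template is what the chain does before calling the llm
prompt_text = template.format(adjective="funny", content="chickens")
# prompt_text == "Tell me a funny joke about chickens."
```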
The result looks like this.
Why did the chicken cross the playground? To get to the other slide!
Question answering over documents is done with langchain.chains.question_answering. For details, see langchain-sagemaker-endpoint-Q&A.ipynb.
Define the prompt template.
template = """Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
Answer:"""
prompt = PromptTemplate(
    template=template, input_variables=["context", "question"]
)
Create a document with langchain.docstore.document.
from langchain.docstore.document import Document

example_doc_1 = """
Peter and Elizabeth took a taxi to attend the night party in the city. While in the party, Elizabeth collapsed and was rushed to the hospital.
Since she was diagnosed with a brain injury, the doctor told Peter to stay besides her until she gets well.
Therefore, Peter stayed with her at the hospital for 3 days without leaving.
"""

docs = [
    Document(
        page_content=example_doc_1,
    )
]
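Before running the chain, it helps to see what the default "stuff" QA strategy does under the hood: it concatenates the page_content of all input documents into {context} and fills the prompt. A minimal sketch, with plain strings standing in for Document objects:

```python
template = """Use the following pieces of context to answer the question at the end.

{context}

Question: {question}
Answer:"""

page_contents = [
    "Peter stayed with her at the hospital for 3 days without leaving.",
]

# the "stuff" strategy simply joins all documents into one context block
context = "\n\n".join(page_contents)
filled = template.format(context=context, question="How long was Elizabeth hospitalized?")
# filled is the full prompt string sent to the llm
```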
Now perform the question answering.
from langchain.chains.question_answering import load_qa_chain

question = "How long was Elizabeth hospitalized?"
chain = load_qa_chain(prompt=prompt, llm=llm)
output = chain({"input_documents": docs, "question": question}, return_only_outputs=True)
print(output)
The result looks like this.
{'output_text': ' 3 days'}
langchain-sagemaker-endpoint-pdf-summary.ipynb explains how to do PDF summarization with the Falcon FM based SageMaker Endpoint.
First, read the PDF file stored in S3 with PyPDF2 and extract the text.
import boto3
import sagemaker
import PyPDF2
from io import BytesIO

sess = sagemaker.Session()
s3_bucket = sess.default_bucket()
s3_prefix = 'docs'
s3_file_name = '2016-3series.pdf'  # file name in S3

s3r = boto3.resource("s3")
doc = s3r.Object(s3_bucket, s3_prefix + '/' + s3_file_name)
contents = doc.get()['Body'].read()
reader = PyPDF2.PdfReader(BytesIO(contents))

raw_text = []
for page in reader.pages:
    raw_text.append(page.extract_text())
contents = '\n'.join(raw_text)

new_contents = str(contents).replace("\n", " ")
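The join/replace cleanup above can be sanity-checked offline with fake page texts standing in for PyPDF2 output (separate variable names are used so this doesn't clobber the pipeline above):

```python
# fake page texts stand in for page.extract_text() results
fake_pages = ["BMW 3 Series\nOwner's Manual", "Chapter 1\nControls"]
joined = '\n'.join(fake_pages)       # same as '\n'.join(raw_text) above
cleaned = joined.replace('\n', ' ')  # same as the replace("\n", " ") step
# cleaned == "BMW 3 Series Owner's Manual Chapter 1 Controls"
```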
Since the document is large, split it into chunks with RecursiveCharacterTextSplitter and store them as Documents. Then summarize with load_summarize_chain.
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(new_contents)
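RecursiveCharacterTextSplitter tries a hierarchy of separators ("\n\n", "\n", " ", "") so chunks tend to end on natural boundaries. A much-simplified fixed-width sketch of the chunking idea only (not the real algorithm):

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> list:
    """Fixed-width chunking; the real splitter also respects separators.
    Assumes chunk_overlap < chunk_size so the window always advances."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=0)
# chunks == ['abcd', 'efgh', 'ij']
```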
from langchain.docstore.document import Document

docs = [
    Document(
        page_content=t
    ) for t in texts[:3]
]
from langchain.chains.summarize import load_summarize_chain
from langchain.prompts import PromptTemplate

prompt_template = """Write a concise summary of the following:
{text}
CONCISE SUMMARY """

PROMPT = PromptTemplate(template=prompt_template, input_variables=["text"])
chain = load_summarize_chain(llm, chain_type="stuff", prompt=PROMPT)
summary = chain.run(docs)
Amazon Bedrock can be used as the llm in the same way.

from langchain import Bedrock
from langchain.embeddings import BedrockEmbeddings

llm = Bedrock()
print(llm("explain GenAI"))
LangChain Documentation
LangChain - github
SageMaker Endpoint
2-Lab02-RAG-LLM
AWS Kendra LangChain Extensions
QA and Chat over Documents
LangChain - Modules - Language models - LLMs - Integrations - SageMakerEndpoint
LangChain - Ecosystem - Integrations - SageMaker Endpoint
Ingest knowledge base data to Vector DB