Instructor is the most popular Python library for working with structured outputs from large language models (LLMs), with over 600,000 downloads per month. Built on top of Pydantic, it provides a simple, transparent, and user-friendly API to manage validation, retries, and streaming responses. Get ready to supercharge your LLM workflows with the community's top choice!
If your company uses Instructor a lot, we'd love to feature your logo on our website! Please fill out this form
Install Instructor with a single command:
pip install -U instructor
Now, let's see Instructor in action with a simple example:
import instructor
from pydantic import BaseModel
from openai import OpenAI


# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Patch the OpenAI client
client = instructor.from_openai(OpenAI())

# Extract structured data from natural language
user_info = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

print(user_info.name)
#> John Doe
print(user_info.age)
#> 30
Instructor provides a powerful hooks system that allows you to intercept and log various stages of the LLM interaction process. Here's a simple example demonstrating how to use hooks:
import instructor
from openai import OpenAI
from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int


# Initialize the OpenAI client with Instructor
client = instructor.from_openai(OpenAI())


# Define hook functions
def log_kwargs(**kwargs):
    print(f"Function called with kwargs: {kwargs}")


def log_exception(exception: Exception):
    print(f"An exception occurred: {str(exception)}")


client.on("completion:kwargs", log_kwargs)
client.on("completion:error", log_exception)

user_info = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserInfo,
    messages=[
        {"role": "user", "content": "Extract the user name: 'John is 20 years old'"}
    ],
)
"""
{
    'args': (),
    'kwargs': {
        'messages': [
            {
                'role': 'user',
                'content': "Extract the user name: 'John is 20 years old'",
            }
        ],
        'model': 'gpt-4o-mini',
        'tools': [
            {
                'type': 'function',
                'function': {
                    'name': 'UserInfo',
                    'description': 'Correctly extracted `UserInfo` with all the required parameters with correct types',
                    'parameters': {
                        'properties': {
                            'name': {'title': 'Name', 'type': 'string'},
                            'age': {'title': 'Age', 'type': 'integer'},
                        },
                        'required': ['age', 'name'],
                        'type': 'object',
                    },
                },
            }
        ],
        'tool_choice': {'type': 'function', 'function': {'name': 'UserInfo'}},
    },
}
"""

print(f"Name: {user_info.name}, Age: {user_info.age}")
#> Name: John, Age: 20
This example demonstrates how hooks provide valuable insight into the function's inputs and any errors, enhancing debugging and monitoring capabilities.
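Once you no longer need a hook, you can detach it. A minimal sketch, assuming your installed version exposes the off and clear methods described in the hooks documentation:

# Remove a single handler from one event (assumes client.off is available)
client.off("completion:kwargs", log_kwargs)

# Remove every handler registered for one event (assumes client.clear accepts an event name)
client.clear("completion:error")

# Or drop all hooks across all events
client.clear()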
Instructor also supports Anthropic models:

import instructor
from anthropic import Anthropic
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_anthropic(Anthropic())

# note that client.chat.completions.create will also work
resp = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    system="You are a world class AI that excels at extracting user data from a sentence",
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

assert isinstance(resp, User)
assert resp.name == "Jason"
assert resp.age == 25
Make sure to install cohere and set your system environment variable with export CO_API_KEY=<YOUR_COHERE_API_KEY>.
pip install cohere
import instructor
import cohere
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_cohere(cohere.Client())

# note that client.chat.completions.create will also work
resp = client.chat.completions.create(
    model="command-r-plus",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

assert isinstance(resp, User)
assert resp.name == "Jason"
assert resp.age == 25
Make sure you install the Google AI Python SDK and set the GOOGLE_API_KEY environment variable with your API key. Gemini tool calling also requires jsonref to be installed.
pip install google-generativeai jsonref
import instructor
import google.generativeai as genai
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


# genai.configure(api_key=os.environ["API_KEY"])  # alternative API key configuration
client = instructor.from_gemini(
    client=genai.GenerativeModel(
        model_name="models/gemini-1.5-flash-latest",  # model defaults to "gemini-pro"
    ),
    mode=instructor.Mode.GEMINI_JSON,
)
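The patched client can then be used like the other providers. A minimal sketch, assuming the Gemini integration accepts the same messages and response_model keywords as the Anthropic example above (client.chat.completions.create should also work):

resp = client.messages.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

assert isinstance(resp, User)
assert resp.name == "Jason"
assert resp.age == 25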
Alternatively, you can call Gemini from the OpenAI client. You'll have to set up gcloud, get set up on Vertex AI, and install the Google Auth library.
pip install google-auth
import google.auth
import google.auth.transport.requests
import instructor
from openai import OpenAI
from pydantic import BaseModel

creds, project = google.auth.default()
auth_req = google.auth.transport.requests.Request()
creds.refresh(auth_req)

# Pass the Vertex endpoint and authentication to the OpenAI SDK
PROJECT = 'PROJECT_ID'
LOCATION = (
    'LOCATION'  # https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations
)
base_url = f'https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT}/locations/{LOCATION}/endpoints/openapi'

client = instructor.from_openai(
    OpenAI(base_url=base_url, api_key=creds.token), mode=instructor.Mode.JSON
)  # JSON mode is req'd


class User(BaseModel):
    name: str
    age: int


resp = client.chat.completions.create(
    model="google/gemini-1.5-flash-001",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

assert isinstance(resp, User)
assert resp.name == "Jason"
assert resp.age == 25
Instructor also works with litellm, which provides a unified interface to many model providers:

import instructor
from litellm import completion
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_litellm(completion)

resp = client.chat.completions.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

assert isinstance(resp, User)
assert resp.name == "Jason"
assert resp.age == 25
This was always the dream for Instructor, but because of the way we patched the OpenAI client it wasn't possible to get typing to work well. Now, with the new client, typing works great! We've also added a few create_* methods to make it easier to create iterables and partials, and to access the original completion.
create
import openai
import instructor
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_openai(openai.OpenAI())

user = client.chat.completions.create(
    model="gpt-4-turbo-preview",
    messages=[
        {"role": "user", "content": "Create a user"},
    ],
    response_model=User,
)
Now if you use an IDE, you can see that the type is correctly inferred.
await create
This will also work correctly with the asynchronous client.
import openai
import instructor
from pydantic import BaseModel

client = instructor.from_openai(openai.AsyncOpenAI())


class User(BaseModel):
    name: str
    age: int


async def extract():
    return await client.chat.completions.create(
        model="gpt-4-turbo-preview",
        messages=[
            {"role": "user", "content": "Create a user"},
        ],
        response_model=User,
    )
Notice that simply because we return the create method, the extract() function will return the correct User type.
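To actually run the coroutine, use the standard asyncio entry point, for example:

import asyncio

# extract() awaits the async client and returns a validated User instance
user = asyncio.run(extract())
print(user)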
create_with_completion
You can also return the original completion object.
import openai
import instructor
from pydantic import BaseModel

client = instructor.from_openai(openai.OpenAI())


class User(BaseModel):
    name: str
    age: int


user, completion = client.chat.completions.create_with_completion(
    model="gpt-4-turbo-preview",
    messages=[
        {"role": "user", "content": "Create a user"},
    ],
    response_model=User,
)
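The second return value is the provider's raw response, so you can inspect fields the validated model doesn't carry. For example, with OpenAI's completion object (assuming the usage field is populated for your request):

print(user)  # the validated Pydantic model

# Token accounting lives on the raw OpenAI response, not on the Pydantic model
print(completion.usage.total_tokens)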
create_partial
In order to handle streams, we still support Iterable[T] and Partial[T], but to simplify type inference we've also added create_iterable and create_partial methods!
import openai
import instructor
from pydantic import BaseModel

client = instructor.from_openai(openai.OpenAI())


class User(BaseModel):
    name: str
    age: int


user_stream = client.chat.completions.create_partial(
    model="gpt-4-turbo-preview",
    messages=[
        {"role": "user", "content": "Create a user"},
    ],
    response_model=User,
)

for user in user_stream:
    print(user)
    #> name=None age=None
    #> name=None age=None
    #> name=None age=None
    #> name=None age=None
    #> name=None age=None
    #> name=None age=None
    #> name='John Doe' age=None
    #> name='John Doe' age=None
    #> name='John Doe' age=None
    #> name='John Doe' age=30
    #> name='John Doe' age=30

    # name=None age=None
    # name='' age=None
    # name='John' age=None
    # name='John Doe' age=None
    # name='John Doe' age=30
Notice now that the inferred type is Generator[User, None].
create_iterable
We get an iterable of objects when we want to extract multiple objects.
import openai
import instructor
from pydantic import BaseModel

client = instructor.from_openai(openai.OpenAI())


class User(BaseModel):
    name: str
    age: int


users = client.chat.completions.create_iterable(
    model="gpt-4-turbo-preview",
    messages=[
        {"role": "user", "content": "Create 2 users"},
    ],
    response_model=User,
)

for user in users:
    print(user)
    #> name='John Doe' age=30
    #> name='Jane Doe' age=28

    # User(name='John Doe', age=30)
    # User(name='Jane Smith', age=25)
We invite you to contribute evals in pytest as a way to monitor the quality of the OpenAI models and the instructor library. To get started, check out the evals for Anthropic and OpenAI and contribute your own evals in the form of pytest tests. These evals will be run once a week and the results will be posted.
If you want to help out, check out some of the issues marked as good-first-issue or help-wanted here. They could be anything from code improvements, a guest blog post, or a new cookbook.
We also provide some added CLI functionality for easy convenience:
instructor jobs : this helps with the creation of fine-tuning jobs with OpenAI. Simply use instructor jobs create-from-file --help to get started.
instructor files : manage your uploaded files with ease. You'll be able to create, delete, and upload files all from the command line.
instructor usage : instead of heading to the OpenAI site each time, you can monitor your usage from the CLI and filter by date and time period. Note that usage often takes ~5-10 minutes to update from OpenAI's side.
This project is licensed under the terms of the MIT License.