Our mission is to simplify the LLM landscape. Unify lets you:
- Use any LLM from any provider: With a single interface, you can use all LLMs from all providers by simply changing one string. There's no need to manage multiple API keys or handle different input-output formats. Unify handles all of that for you!
- Improve LLM performance: Add your own custom tests and evals, and benchmark your own prompts across all models and providers. Compare quality, cost and speed, and iterate on your system prompt until all test cases pass, and then you can deploy your app!
- Route to the best LLM: Improve quality, cost and speed by routing to the ideal model and provider for each individual prompt.
Just install the package:
pip install unifyai
Then sign up to get your API key, and you're ready to go!
import unify
client = unify.Unify("gpt-4o@openai", api_key=<your_key>)
client.generate("hello world!")
Note
We recommend using python-dotenv to add UNIFY_KEY="My API Key" to your .env file, so that you can avoid the api_key argument used above. For the rest of this guide, we will assume you set your key as an environment variable.
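For example, a minimal sketch of that setup, assuming you have installed python-dotenv and created a .env file containing UNIFY_KEY="My API Key":
from dotenv import load_dotenv
import unify

load_dotenv()  # reads .env and exposes UNIFY_KEY as an environment variable
client = unify.Unify("gpt-4o@openai")  # no api_key argument needed now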
You can list all models, providers and endpoints (<model>@<provider> pairs) as follows:
models = unify.list_models()
providers = unify.list_providers()
endpoints = unify.list_endpoints()
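Each function returns a list of plain strings, so you can inspect or filter the results however you like. A small usage sketch (the endpoint name below is illustrative; check that it appears in your own list first):
print(len(models), "models,", len(providers), "providers,", len(endpoints), "endpoints")
if "gpt-4o@openai" in endpoints:  # illustrative endpoint name
    client = unify.Unify("gpt-4o@openai")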
You can also filter within these functions as follows:
import random
anthropic_models = unify.list_models("anthropic")
client.set_endpoint(random.choice(anthropic_models) + "@anthropic")
latest_llama3p1_providers = unify.list_providers("llama-3.1-405b-chat")
client.set_endpoint("llama-3.1-405b-chat@" + random.choice(latest_llama3p1_providers))
openai_endpoints = unify.list_endpoints("openai")
client.set_endpoint(random.choice(openai_endpoints))
mixtral8x7b_endpoints = unify.list_endpoints("mixtral-8x7b-instruct-v0.1")
client.set_endpoint(random.choice(mixtral8x7b_endpoints))
If you want to change the endpoint, model or provider, you can do so using the .set_endpoint, .set_model and .set_provider methods respectively.
client . set_endpoint ( "mistral-7b-instruct-v0.3@deepinfra" )
client . set_model ( "mistral-7b-instruct-v0.3" )
client . set_provider ( "deepinfra" )
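As a usage sketch, you could swap providers for the same model mid-session and compare the answers. The second provider name below is purely illustrative; check unify.list_providers for your model first:
client.set_endpoint("mistral-7b-instruct-v0.3@deepinfra")
first = client.generate("Summarise Newton's first law in one sentence.")
client.set_provider("together-ai")  # hypothetical alternative provider, for illustration only
second = client.generate("Summarise Newton's first law in one sentence.")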
You can influence the model's persona using the system_message argument in the .generate function:
response = client.generate(
    user_message="Hello Llama! Who was Isaac Newton?", system_message="You should always talk in rhymes"
)
If you'd like to send multiple messages using the .generate function, you should use the messages argument as follows:
messages = [
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"}
]
res = client.generate(messages=messages)
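Building on this, here is a sketch of a simple multi-turn loop that keeps appending to the message history (it assumes, as in the examples above, that .generate returns the assistant reply as a string):
history = []
for user_turn in ["Who won the world series in 2020?", "Where was it played?"]:
    history.append({"role": "user", "content": user_turn})
    reply = client.generate(messages=history)
    history.append({"role": "assistant", "content": reply})
    print(reply)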
When querying LLMs, you often want to keep many aspects of the prompt fixed, and only change a small subset of it on each subsequent call.
For example, you might want to fix the temperature, the system prompt and the tools available, whilst passing different user messages coming from a downstream application. All clients in Unify make this very simple via default arguments, which can be specified in the constructor, and can also be set at any time using setter methods.
For example, the following code will pass temperature=0.5 to all subsequent requests, without it needing to be repeatedly passed into the .generate() method.
client = unify . Unify ( "claude-3-haiku@anthropic" , temperature = 0.5 )
client . generate ( "Hello world!" )
client . generate ( "What a nice day." )
All parameters can also be retrieved with getters and set via setters:
client = unify . Unify ( "claude-3-haiku@anthropic" , temperature = 0.5 )
print ( client . temperature ) # 0.5
client . set_temperature ( 1.0 )
print ( client . temperature ) # 1.0
Passing a value into the .generate() method will override the default value specified for the client.
client = unify . Unify ( "claude-3-haiku@anthropic" , temperature = 0.5 )
client . generate ( "Hello world!" ) # temperature of 0.5
client . generate ( "What a nice day." , temperature = 1.0 ) # temperature of 1.0
For handling multiple user requests simultaneously, such as in a chatbot application, it's recommended to process them asynchronously. A minimal example using AsyncUnify is given below:
import unify
import asyncio
async_client = unify.AsyncUnify("llama-3-8b-chat@fireworks-ai")
asyncio.run(async_client.generate("Hello Llama! Who was Isaac Newton?"))
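If your application is already running inside an event loop (for example a web server), you can simply await the call from within your own coroutine instead of using asyncio.run. A minimal sketch:
async def handle_request(user_message: str) -> str:
    # assumes async_client from the snippet above is in scope
    return await async_client.generate(user_message)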
A more applied example, processing multiple requests in parallel, can then be run as follows:
import unify
import asyncio
clients = dict()
clients["gpt-4o@openai"] = unify.AsyncUnify("gpt-4o@openai")
clients["claude-3-opus@anthropic"] = unify.AsyncUnify("claude-3-opus@anthropic")
clients["llama-3-8b-chat@fireworks-ai"] = unify.AsyncUnify("llama-3-8b-chat@fireworks-ai")

async def generate_responses(user_message: str):
    responses_ = dict()
    for endpoint_, client in clients.items():
        responses_[endpoint_] = await client.generate(user_message)
    return responses_

responses = asyncio.run(generate_responses("Hello, how's it going?"))
for endpoint, response in responses.items():
    print("endpoint: {}".format(endpoint))
    print("response: {}\n".format(response))
Functionality-wise, the asynchronous and synchronous clients are identical.
You can enable streaming responses by setting stream=True in the .generate function.
import unify
client = unify.Unify("llama-3-8b-chat@fireworks-ai")
stream = client.generate("Hello Llama! Who was Isaac Newton?", stream=True)
for chunk in stream:
    print(chunk, end="")
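The chunks are plain strings, so if you also want the full response at the end, a small sketch is to collect them as you stream:
stream = client.generate("Hello Llama! Who was Isaac Newton?", stream=True)
full_response = "".join(chunk for chunk in stream)  # consumes the stream into one string
print(full_response)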
It works in exactly the same way with the async client.
import unify
import asyncio
async_client = unify.AsyncUnify("llama-3-8b-chat@fireworks-ai")

async def stream():
    async_stream = await async_client.generate("Hello Llama! Who was Isaac Newton?", stream=True)
    async for chunk in async_stream:
        print(chunk, end="")

asyncio.run(stream())
To learn more about our more advanced API features, benchmarking, and LLM routing, check out our comprehensive docs!