Supercharge Generative AI Serving with Friendli
The Friendli Client offers convenient interfaces for interacting with the endpoint services provided by Friendli Suite, the ultimate solution for serving generative AI models. Designed for flexibility and performance, it supports both synchronous and asynchronous operations, making it easy to integrate powerful AI capabilities into your applications.
To get started with Friendli, install the client package using pip:
pip install friendli-client
Important
You must set the FRIENDLI_TOKEN environment variable before initializing the client instance with client = Friendli(). Alternatively, you can provide the value of your personal access token as the token argument when creating the client, as shown below:
from friendli import Friendli

client = Friendli(token="YOUR PERSONAL ACCESS TOKEN")
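If you prefer the environment-variable approach instead, a minimal sketch looks like the following; the token value is a placeholder, and in practice you would typically export FRIENDLI_TOKEN in your shell or deployment configuration rather than setting it in code.

import os

# Placeholder token for illustration; normally FRIENDLI_TOKEN is exported in the shell
# or injected by your deployment environment rather than set in code.
os.environ["FRIENDLI_TOKEN"] = "YOUR PERSONAL ACCESS TOKEN"

from friendli import Friendli

client = Friendli()  # Reads FRIENDLI_TOKEN from the environment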
Friendli Serverless Endpoints offer a simple, click-and-play interface for accessing popular open-source models such as Llama 3.1. With pay-per-token billing, they are ideal for exploration and experimentation.
To interact with models hosted on serverless endpoints, provide the model code you want to use in the model argument. Refer to the pricing table for a list of available model codes and their pricing.
from friendli import Friendli

client = Friendli()

chat_completion = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)
Friendli Dedicated Endpoints enable you to run your custom generative AI models on dedicated GPU resources.
To interact with dedicated endpoints, provide the endpoint ID in the model argument.
import os
from friendli import Friendli

client = Friendli(
    team_id=os.environ["TEAM_ID"],  # If not provided, the default team is used.
    use_dedicated_endpoint=True,
)

chat_completion = client.chat.completions.create(
    model=os.environ["ENDPOINT_ID"],
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)
Friendli Container is ideal for users who prefer to serve LLMs within their own infrastructure. By deploying the Friendli Engine in containers on your on-premise or cloud GPUs, you maintain complete control over your data and operations, ensuring security and compliance with internal policies.
from friendli import Friendli

client = Friendli(base_url="http://0.0.0.0:8000")

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)
The asynchronous client (AsyncFriendli) uses the same interface; simply await the API calls:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
    )
    print(chat_completion.choices[0].message.content)

asyncio.run(main())
To stream the response, pass stream=True and iterate over the returned chunks:

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
The async client (AsyncFriendli) uses the same interface to stream the response.
import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main() -> None:
    stream = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())
If your endpoint serves a Multi-LoRA model, you can send a request to one of the adapters by providing the adapter route in the model argument.
For Friendli Dedicated Endpoints, provide the endpoint ID and the adapter route separated by a colon (:).
import os
from friendli import Friendli

client = Friendli(
    team_id=os.environ["TEAM_ID"],  # If not provided, the default team is used.
    use_dedicated_endpoint=True,
)

chat_completion = client.lora.completions.create(
    model=f"{os.environ['ENDPOINT_ID']}:{os.environ['ADAPTER_ROUTE']}",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
For Friendli Container, just provide the adapter name.
import os
from friendli import Friendli

client = Friendli(base_url="http://0.0.0.0:8000")

chat_completion = client.lora.completions.create(
    model=os.environ["ADAPTER_NAME"],
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
Important
gRPC is only supported by Friendli Container, and only the streaming API of v1/completions is available.
When Friendli Container is running in gRPC mode, the client can interact with the gRPC server by being initialized with the use_grpc=True argument.
from friendli import Friendli

client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,  # Only streaming mode is available
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
The client uses httpx to send HTTP requests. You can provide a custom httpx.Client when initializing Friendli.
import httpx
from friendli import Friendli

with httpx.Client() as http_client:
    client = Friendli(http_client=http_client)
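As a sketch of why a custom client can be useful, the example below passes an explicit timeout; the timeout value is illustrative, and any other httpx.Client configuration (connection limits, proxies, and so on) could be supplied the same way.

import httpx
from friendli import Friendli

# Illustrative timeout; tune it to your own workload.
with httpx.Client(timeout=httpx.Timeout(30.0)) as http_client:
    client = Friendli(http_client=http_client)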
For the async client, you can provide an httpx.AsyncClient.
import httpx
from friendli import AsyncFriendli

async with httpx.AsyncClient() as http_client:
    client = AsyncFriendli(http_client=http_client)
When using the gRPC interface, you can likewise provide a custom grpc.Channel when initializing the client.

import grpc
from friendli import Friendli

with grpc.insecure_channel("0.0.0.0:8000") as channel:
    client = Friendli(use_grpc=True, grpc_channel=channel)
You can use the same interface for the async client.
import grpc.aio
from friendli import AsyncFriendli

async with grpc.aio.insecure_channel("0.0.0.0:8000") as channel:
    client = AsyncFriendli(use_grpc=True, grpc_channel=channel)
The Friendli client provides several methods for managing and releasing resources.
Both the Friendli and AsyncFriendli clients can hold network connections or other resources during their lifetime. To ensure these resources are properly released, you should either call the close() method or use the client within a context manager.
from friendli import Friendli

client = Friendli()

# Use the client for various operations...

# When done, close the client to release resources
client.close()
For the async client, the pattern is similar:
import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

# Use the client for various async operations...

# When done, close the client to release resources
await client.close()
You can also use a context manager to automatically close the client and release resources when the block is exited, making it a safer and more convenient way to manage resources.
from friendli import Friendli

with Friendli() as client:
    ...
For async usage:
import asyncio
from friendli import AsyncFriendli

async def main():
    async with AsyncFriendli() as client:
        ...

asyncio.run(main())
When using streaming responses, it is crucial to close the HTTP connection properly after the interaction is complete. By default, the connection is closed automatically once all data in the stream has been consumed (i.e., when the for-loop reaches the end). However, if streaming is interrupted by an exception or other issue, the connection may remain open and will not be released until it is garbage-collected. To ensure that all underlying connections and resources are released correctly, it is important to close the connection explicitly, particularly when streaming is terminated prematurely.
from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)

try:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
finally:
    stream.close()  # Ensure the stream is closed after use
For async streams:
import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main():
    stream = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )
    try:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
    finally:
        await stream.close()  # Ensure the stream is closed after use

asyncio.run(main())
You can also use a context manager to automatically close the stream and release resources when the block is exited, making it a safer and more convenient way to manage resources.
from friendli import Friendli

client = Friendli()

with client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
) as stream:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
For async streams:
import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main():
    async with await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    ) as stream:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())
When using the gRPC interface with streaming, you may want to cancel an ongoing stream operation before it completes. This is particularly useful if you need to stop the stream due to a timeout or some other condition.
For synchronous gRPC streams:
from friendli import Friendli

client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)

try:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
except SomeException:
    stream.cancel()  # Cancel the stream in case of an error or interruption
For async gRPC streams:
import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli(base_url="0.0.0.0:8000", use_grpc=True)

async def main():
    stream = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )
    try:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
    except SomeException:
        stream.cancel()  # Cancel the stream in case of an error or interruption

asyncio.run(main())
You can also call the generation API directly with the CLI.
friendli api chat-completions create \
  -g "user Tell me how to make a delicious pancake" \
  -m meta-llama-3.1-8b-instruct
For more information about the friendli command, run friendli --help in your terminal shell. This will give you a detailed list of available options and usage instructions.
Tip
Check out our official documentation to learn more!