Friendli Client

Supercharge Generative AI Serving with Friendli


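The Friendli Client offers a convenient interface for interacting with the endpoint services provided by Friendli Suite, the ultimate solution for serving generative AI models. Designed for flexibility and performance, it supports both synchronous and asynchronous operations, making it easy to integrate powerful AI capabilities into your applications.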

Installation

To get started with Friendli, install the client package using pip:

pip install friendli-client

Important

You must set the FRIENDLI_TOKEN environment variable before initializing the client instance with client = Friendli(). Alternatively, you can pass your personal access token as the token argument when creating the client:

from friendli import Friendli

client = Friendli(token="YOUR PERSONAL ACCESS TOKEN")
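
One way to make the token requirement explicit is to resolve the token before constructing the client. The sketch below is only an illustration: resolve_token is a hypothetical helper, not part of friendli-client, and the only thing it takes from the text above is the FRIENDLI_TOKEN variable name.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return an explicit token if given, else FRIENDLI_TOKEN from the environment.

    Hypothetical helper for illustration; not part of friendli-client.
    """
    env = os.environ if env is None else env
    resolved = token or env.get("FRIENDLI_TOKEN")
    if not resolved:
        # Fail early with a clear message instead of at the first request.
        raise RuntimeError(
            "No token found: set FRIENDLI_TOKEN or pass token=... explicitly."
        )
    return resolved
```

The result could then be passed along as Friendli(token=resolve_token()).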

Friendli Serverless Endpoints

Friendli Serverless Endpoints offer a simple, click-and-play interface for accessing popular open-source models such as Llama 3.1. Billing is pay-per-token, which makes them well suited to exploration and experimentation.

To interact with a model hosted by serverless endpoints, pass the model code you want to use as the model argument. See the pricing table for the list of available model codes and their prices.

from friendli import Friendli

client = Friendli()

chat_completion = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)

Friendli Dedicated Endpoints

Friendli Dedicated Endpoints let you run your custom generative AI models on dedicated GPU resources.

To interact with a dedicated endpoint, pass the endpoint ID as the model argument.

import os
from friendli import Friendli

client = Friendli(
    team_id=os.environ["TEAM_ID"],  # If not provided, default team is used.
    use_dedicated_endpoint=True,
)

chat_completion = client.chat.completions.create(
    model=os.environ["ENDPOINT_ID"],
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)
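
The example above reads TEAM_ID with os.environ[...], which raises KeyError when the variable is missing, even though the comment says the team is optional. A minimal sketch of one way to handle that is shown here; dedicated_client_kwargs is a hypothetical helper, and it assumes that leaving out team_id selects the default team, as the comment states.

```python
import os
from typing import Any, Dict, Mapping, Optional


def dedicated_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for a dedicated-endpoint client.

    Hypothetical helper: TEAM_ID is treated as optional, so team_id is
    omitted when the variable is unset and the default team applies.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {"use_dedicated_endpoint": True}
    team_id = env.get("TEAM_ID")
    if team_id:
        kwargs["team_id"] = team_id
    return kwargs


print(dedicated_client_kwargs({"TEAM_ID": "my-team"}))
```

The client would then be created with Friendli(**dedicated_client_kwargs()).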

Friendli Container

Friendli Container is a good fit for users who want to serve LLMs within their own infrastructure. By deploying the Friendli Engine in containers on your on-premise or cloud GPUs, you keep full control over your data and operations, which helps with security and compliance with internal policies.

from friendli import Friendli

client = Friendli(base_url="http://0.0.0.0:8000")

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)

Async Usage

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
    )
    print(chat_completion.choices[0].message.content)


asyncio.run(main())
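
A common reason to reach for the async client is to issue several requests concurrently. The sketch below shows that pattern with asyncio.gather. It does not call the Friendli API: fake_ask is a hypothetical stand-in for a coroutine that would await client.chat.completions.create(...) and return the reply text.

```python
import asyncio
from typing import Awaitable, Callable, List


async def ask_all(ask: Callable[[str], Awaitable[str]], prompts: List[str]) -> List[str]:
    """Send all prompts concurrently and return the answers in prompt order."""
    return await asyncio.gather(*(ask(p) for p in prompts))


async def fake_ask(prompt: str) -> str:
    """Stand-in for a real request; sleeps briefly to mimic network latency."""
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"


answers = asyncio.run(ask_all(fake_ask, ["pancakes", "waffles"]))
print(answers)  # ['answer to: pancakes', 'answer to: waffles']
```

asyncio.gather preserves the order of its inputs, so each answer lines up with its prompt even if the requests finish out of order.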

Streaming Usage

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
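
When the full reply is needed as a single string rather than printed piece by piece, the chunks can be accumulated. This is a minimal sketch that relies only on the chunk.choices[0].delta.content shape used in the example above; fake_chunk is a hypothetical stand-in so the snippet runs without a server.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def collect_text(stream: Iterable) -> str:
    """Join the delta.content pieces of a chat-completion stream into one string."""
    parts = []
    for chunk in stream:
        # A chunk may carry no text (content is None), so fall back to "".
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)


def fake_chunk(text: Optional[str]) -> SimpleNamespace:
    """Stand-in for a streamed chunk; mirrors only the attributes read above."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [fake_chunk("Mix "), fake_chunk(None), fake_chunk("and fry.")]
print(collect_text(demo_stream))  # Mix and fry.
```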

The async client (AsyncFriendli) streams responses through the same interface.

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main() -> None:
    stream = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)


asyncio.run(main())

Advanced Usage

Sending Requests to LoRA Adapters

If your endpoint serves a Multi-LoRA model, you can send a request to one of the adapters by passing the adapter route as the model argument.

For Friendli Dedicated Endpoints, pass the endpoint ID and the adapter route separated by a colon (:).

import os
from friendli import Friendli

client = Friendli(
    team_id=os.environ["TEAM_ID"],  # If not provided, default team is used.
    use_dedicated_endpoint=True,
)

chat_completion = client.lora.completions.create(
    model=f"{os.environ['ENDPOINT_ID']}:{os.environ['ADAPTER_ROUTE']}",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
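
The endpoint-ID-plus-route format is easy to get wrong with stray whitespace or an empty variable. The sketch below shows one way to build the string defensively. adapter_model_id is a hypothetical helper, and the IDs in the demo call are placeholders rather than real values.

```python
def adapter_model_id(endpoint_id: str, adapter_route: str) -> str:
    """Join an endpoint ID and an adapter route with a colon, rejecting empty parts."""
    endpoint_id = endpoint_id.strip()
    adapter_route = adapter_route.strip()
    if not endpoint_id or not adapter_route:
        raise ValueError("Both endpoint_id and adapter_route are required")
    return f"{endpoint_id}:{adapter_route}"


# Placeholder values for illustration only.
print(adapter_model_id("my-endpoint-id", "my-adapter-route"))  # my-endpoint-id:my-adapter-route
```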

For Friendli Container, just pass the adapter name.

import os
from friendli import Friendli

client = Friendli(base_url="http://0.0.0.0:8000")

chat_completion = client.lora.completions.create(
    model=os.environ["ADAPTER_NAME"],
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)

Using the gRPC Interface

Important

gRPC is supported only by Friendli Container, and only the streaming API of v1/completions is available.

When Friendli Container is running in gRPC mode, the client can talk to the gRPC server if it is initialized with the use_grpc=True argument.

from friendli import Friendli

client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,  # Only streaming mode is available
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
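
Note that the HTTP examples in this document pass base_url with a scheme (http://0.0.0.0:8000), while the gRPC example passes a bare host:port (0.0.0.0:8000). Assuming that difference is intentional, a configuration that stores a single URL could derive the gRPC form as sketched below. grpc_target is a hypothetical helper, not part of friendli-client.

```python
from urllib.parse import urlsplit


def grpc_target(base_url: str) -> str:
    """Return host:port from a URL, or the input unchanged if it has no scheme."""
    if "://" not in base_url:
        return base_url
    # netloc is the "host:port" portion of a URL such as http://host:port/path
    return urlsplit(base_url).netloc


print(grpc_target("http://0.0.0.0:8000"))  # 0.0.0.0:8000
print(grpc_target("0.0.0.0:8000"))         # 0.0.0.0:8000
```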

Configuring the HTTP Client

The client uses httpx to send HTTP requests. You can pass a customized httpx.Client when initializing Friendli.

import httpx
from friendli import Friendli

with httpx.Client() as http_client:
    client = Friendli(http_client=http_client)

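For the async client, you can pass an httpx.AsyncClient.

import asyncio
import httpx
from friendli import AsyncFriendli

async def main() -> None:
    # httpx.AsyncClient is an async context manager, so it needs "async with"
    async with httpx.AsyncClient() as http_client:
        client = AsyncFriendli(http_client=http_client)

asyncio.run(main())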

Configuring the gRPC Channel

import grpc
from friendli import Friendli

with grpc.insecure_channel("0.0.0.0:8000") as channel:
    client = Friendli(use_grpc=True, grpc_channel=channel)

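You can use the same interface for the async client.

import asyncio
import grpc.aio
from friendli import AsyncFriendli

async def main() -> None:
    # "async with" is only valid inside a coroutine
    async with grpc.aio.insecure_channel("0.0.0.0:8000") as channel:
        client = AsyncFriendli(use_grpc=True, grpc_channel=channel)

asyncio.run(main())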

Managing Resources

The Friendli client provides several ways to manage and release resources.

Closing the Client

Both the Friendli and AsyncFriendli clients can hold network connections or other resources during their lifetime. To make sure these resources are released properly, either call the close() method or use the client within a context manager.

from friendli import Friendli

client = Friendli()

# Use the client for various operations...

# When done, close the client to release resources
client.close()

For the asynchronous client, the pattern is similar:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main() -> None:
    # Use the client for various async operations...

    # When done, close the client to release resources
    await client.close()

asyncio.run(main())

You can also use a context manager, which closes the client and releases its resources automatically when the block exits. This is a safer and more convenient way to manage resources.

from friendli import Friendli

with Friendli() as client:
    ...

For asynchronous use:

import asyncio
from friendli import AsyncFriendli

async def main():
    async with AsyncFriendli() as client:
        ...


asyncio.run(main())

Managing Streaming Responses

When using streaming responses, it is important to close the HTTP connection properly once the interaction is complete. By default, the connection is closed automatically when all data from the stream has been consumed (that is, when the for loop reaches the end). However, if streaming is interrupted by an exception or another problem, the connection may stay open and may not be released until it is garbage-collected. To make sure all underlying connections and resources are released, close the stream explicitly, especially when streaming ends prematurely.

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)

try:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
finally:
    stream.close()  # Ensure the stream is closed after use

For asynchronous streaming:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main():
    stream = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )

    try:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
    finally:
        await stream.close()  # Ensure the stream is closed after use

asyncio.run(main())

The stream can also be used as a context manager, which closes it when the block exits:

from friendli import Friendli

client = Friendli()

with client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
) as stream:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

For asynchronous streaming:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main():
    async with await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    ) as stream:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())
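
The cleanup guarantee described in this section can be demonstrated without a server. The sketch below uses the standard library's contextlib.closing, which calls close() on exit for any object that has that method. FakeStream is a hypothetical stand-in, not a Friendli class; it raises midway through iteration to simulate an interrupted stream.

```python
from contextlib import closing


class FakeStream:
    """Stand-in for a streaming response: iterable, with a close() method."""

    def __init__(self, pieces):
        self._pieces = list(pieces)
        self.closed = False

    def __iter__(self):
        for piece in self._pieces:
            if piece is None:
                # Simulate the connection dropping partway through.
                raise ConnectionError("stream interrupted")
            yield piece

    def close(self):
        self.closed = True


stream = FakeStream(["Mix ", None, "never reached"])
received = []
try:
    with closing(stream) as s:
        for piece in s:
            received.append(piece)
except ConnectionError:
    pass  # the error still propagates; closing() only guarantees cleanup

print(received, stream.closed)  # ['Mix '] True
```

The stream is closed even though iteration failed, which is the same effect as the try/finally form shown earlier in this section.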

Canceling a gRPC Stream

When using the gRPC interface with streaming, you may want to cancel an ongoing stream before it completes. This is particularly useful when the stream has to be stopped because of a timeout or some other condition.

For synchronous gRPC streaming:

from friendli import Friendli

client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)

try:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
except Exception:  # Narrow this to the exception(s) that should trigger cancellation
    stream.cancel()  # Cancel the stream in case of an error or interruption

For asynchronous gRPC streaming:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli(base_url="0.0.0.0:8000", use_grpc=True)

async def main():
    stream = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )

    try:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
    except Exception:  # Narrow this to the exception(s) that should trigger cancellation
        stream.cancel()  # Cancel the stream in case of an error or interruption

asyncio.run(main())
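
The timeout case mentioned above can be sketched as a deadline check inside the read loop. This is an illustration under stated assumptions rather than a Friendli feature: FakeGrpcStream is a hypothetical stand-in that exposes only iteration and cancel(), and read_with_deadline is a hypothetical helper.

```python
import time


class FakeGrpcStream:
    """Stand-in for a gRPC stream: iterable, with a cancel() method."""

    def __init__(self, pieces, delay):
        self._pieces = list(pieces)
        self._delay = delay
        self.cancelled = False

    def __iter__(self):
        for piece in self._pieces:
            if self.cancelled:
                return
            time.sleep(self._delay)  # mimic waiting for the next chunk
            yield piece

    def cancel(self):
        self.cancelled = True


def read_with_deadline(stream, timeout_s):
    """Consume chunks until the deadline passes, then cancel the stream."""
    deadline = time.monotonic() + timeout_s
    received = []
    for piece in stream:
        received.append(piece)
        if time.monotonic() >= deadline:
            stream.cancel()  # stop the in-flight stream instead of draining it
            break
    return received


stream = FakeGrpcStream(["a", "b", "c", "d", "e"], delay=0.05)
chunks = read_with_deadline(stream, timeout_s=0.12)
print(chunks, stream.cancelled)
```

The deadline is checked only between chunks, so a stream that stalls without producing anything would not be interrupted by this loop; that case needs a timeout at the transport level.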

CLI Examples

You can also call the generation APIs directly with the CLI.

friendli api chat-completions create \
  -g "user Tell me how to make a delicious pancake" \
  -m meta-llama-3.1-8b-instruct

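For more information about the friendli command, run friendli --help in your terminal shell. This prints a detailed list of the available options and usage instructions.

Tip

Check out the official documentation to learn more.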
