Friendli Client

Supercharge Generative AI Serving with Friendli

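Friendli Client offers a convenient interface for interacting with the endpoint services provided by Friendli Suite, the ultimate solution for serving generative AI models. Designed for flexibility and performance, it supports both synchronous and asynchronous operations, making it easy to integrate powerful AI capabilities into your applications.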

Installation

To get started with Friendli, install the client package using pip:

pip install friendli-client

Important

You must set the FRIENDLI_TOKEN environment variable before initializing the client instance with client = Friendli(). Alternatively, you can pass your personal access token as the token argument when creating the client:

from friendli import Friendli

client = Friendli(token="YOUR PERSONAL ACCESS TOKEN")
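
The two ways of supplying a token described above can be combined in a small helper. The following is only a sketch: resolve_token is a hypothetical function that is not part of friendli-client, and the precedence it applies (explicit argument first, then FRIENDLI_TOKEN) is an assumption made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicit token if given, otherwise FRIENDLI_TOKEN from env.

    Hypothetical helper; fails early with a clear message when no token exists.
    """
    token = explicit or env.get("FRIENDLI_TOKEN")
    if not token:
        raise RuntimeError(
            "No token found: set FRIENDLI_TOKEN or pass a token explicitly."
        )
    return token
```

The result could then be passed along as Friendli(token=resolve_token()), so a missing token surfaces before any request is made.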

Friendli Serverless Endpoints

Friendli Serverless Endpoints offer a simple, click-and-play interface for accessing popular open-source models such as Llama 3.1. With pay-per-token billing, they are ideal for exploration and experimentation.

To interact with models hosted by serverless endpoints, provide the model code you want to use in the model argument. Refer to the pricing table for the list of available model codes and their prices.

from friendli import Friendli

client = Friendli()

chat_completion = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)
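
The messages argument above is a plain list of role/content dictionaries, so it can be assembled by a small function. This is a minimal sketch: build_messages is a hypothetical helper, and the "system" role is an assumption borrowed from common chat-completion APIs, since this README only shows the "user" role.

```python
from typing import Dict, List, Optional


def build_messages(
    user_prompt: str,
    system_prompt: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Build a messages list in the shape used by chat.completions.create."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        # Assumption: the endpoint accepts a "system" role message.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages
```

The returned list can be passed as messages=build_messages("Tell me how to make a delicious pancake").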

Friendli Dedicated Endpoints

Friendli Dedicated Endpoints let you run your custom generative AI models on dedicated GPU resources.

To interact with a dedicated endpoint, provide the endpoint ID in the model argument.

import os
from friendli import Friendli

client = Friendli(
    team_id=os.environ["TEAM_ID"],  # If not provided, default team is used.
    use_dedicated_endpoint=True,
)

chat_completion = client.chat.completions.create(
    model=os.environ["ENDPOINT_ID"],
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)

Friendli Container

Friendli Container is a good fit for users who prefer to serve LLMs within their own infrastructure. By deploying the Friendli Engine in containers on your on-premise or cloud GPUs, you keep complete control over your data and operations, which helps ensure security and compliance with internal policies.

from friendli import Friendli

client = Friendli(base_url="http://0.0.0.0:8000")

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
print(chat_completion.choices[0].message.content)

Async Usage

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
    )
    print(chat_completion.choices[0].message.content)


asyncio.run(main())
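
One reason to reach for the async client is to issue several requests concurrently. The sketch below assumes only the chat.completions.create call shape shown above; ask_all is a hypothetical helper, and client is typed loosely so that any object exposing that awaitable method works.

```python
import asyncio
from typing import Any, List


async def ask_all(client: Any, model: str, prompts: List[str]) -> List[str]:
    """Send one chat completion per prompt concurrently; keep prompt order."""

    async def ask(prompt: str) -> str:
        completion = await client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return completion.choices[0].message.content

    # gather() runs the coroutines concurrently and returns results in order.
    return await asyncio.gather(*(ask(p) for p in prompts))
```

Inside an async function this could be called as answers = await ask_all(client, "meta-llama-3.1-8b-instruct", prompts), where client is an AsyncFriendli instance.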

Streaming Usage

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

The async client (AsyncFriendli) uses the same interface to stream the response.

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main() -> None:
    stream = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)


asyncio.run(main())
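
When the full text is needed rather than incremental printing, the streamed deltas can be joined into one string. This is a sketch only: collect_text is a hypothetical helper that relies on nothing beyond the chunk.choices[0].delta.content shape used in the examples above.

```python
from typing import Any, Iterable


def collect_text(stream: Iterable[Any]) -> str:
    """Concatenate the delta contents of a synchronous chat stream."""
    parts = []
    for chunk in stream:
        # A delta may carry no content, so fall back to an empty string.
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)
```

Accumulating parts in a list and joining once avoids repeated string concatenation on long responses.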

Advanced Usage

Sending Requests to LoRA Adapters

If your endpoint is serving a Multi-LoRA model, you can send a request to one of the adapters by providing the adapter route in the model argument.

For Friendli Dedicated Endpoints, provide the endpoint ID and the adapter route separated by a colon (:).

import os
from friendli import Friendli

client = Friendli(
    team_id=os.environ["TEAM_ID"],  # If not provided, default team is used.
    use_dedicated_endpoint=True,
)

chat_completion = client.lora.completions.create(
    model=f"{os.environ['ENDPOINT_ID']}:{os.environ['ADAPTER_ROUTE']}",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)

For Friendli Container, provide only the adapter name.

import os
from friendli import Friendli

client = Friendli(base_url="http://0.0.0.0:8000")

chat_completion = client.lora.completions.create(
    model=os.environ["ADAPTER_NAME"],
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
)
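
The two naming rules above (endpoint ID plus adapter route joined by a colon for Dedicated Endpoints, adapter name alone for Friendli Container) can be captured in one function. lora_model_id is a hypothetical helper written for illustration, not a friendli-client API.

```python
from typing import Optional


def lora_model_id(adapter: str, endpoint_id: Optional[str] = None) -> str:
    """Build the value of the model argument for a LoRA adapter request.

    With an endpoint ID (Dedicated Endpoints): "<endpoint_id>:<adapter>".
    Without one (Friendli Container): the adapter name as-is.
    """
    if endpoint_id is None:
        return adapter
    return f"{endpoint_id}:{adapter}"
```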

Using the gRPC Interface

Important

gRPC is supported only by Friendli Container, and only the streaming API of v1/completions is available.

When Friendli Container is running in gRPC mode, the client can interact with the gRPC server by initializing it with the use_grpc=True argument.

from friendli import Friendli

client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,  # Only streaming mode is available
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

Configuring the HTTP Client

The client uses httpx to send HTTP requests. You can provide a customized httpx.Client when initializing Friendli.

import httpx
from friendli import Friendli

with httpx.Client() as http_client:
    client = Friendli(http_client=http_client)

For the async client, you can provide an httpx.AsyncClient.

import asyncio
import httpx
from friendli import AsyncFriendli

async def main() -> None:
    # httpx.AsyncClient must be entered with "async with" inside a coroutine
    async with httpx.AsyncClient() as http_client:
        client = AsyncFriendli(http_client=http_client)

asyncio.run(main())

Configuring the gRPC Channel

import grpc
from friendli import Friendli

with grpc.insecure_channel("0.0.0.0:8000") as channel:
    client = Friendli(use_grpc=True, grpc_channel=channel)

You can use the same interface for the async client.

import asyncio
import grpc.aio
from friendli import AsyncFriendli

async def main() -> None:
    # "async with" is only valid inside a coroutine
    async with grpc.aio.insecure_channel("0.0.0.0:8000") as channel:
        client = AsyncFriendli(use_grpc=True, grpc_channel=channel)

asyncio.run(main())

Resource Management

The Friendli client provides several ways to manage and release resources.

Closing the Client

Both the Friendli and AsyncFriendli clients can hold network connections or other resources during their lifetime. To ensure these resources are properly released, either call the close() method or use the client within a context manager.

from friendli import Friendli

client = Friendli()

# Use the client for various operations...

# When done, close the client to release resources
client.close()

For the asynchronous client, the pattern is similar:

import asyncio
from friendli import AsyncFriendli

async def main() -> None:
    client = AsyncFriendli()

    # Use the client for various async operations...

    # When done, close the client to release resources
    await client.close()

asyncio.run(main())

You can also use a context manager, which closes the client and releases its resources automatically when the block exits. This is a safer and more convenient way to manage resources.

from friendli import Friendli

with Friendli() as client:
    ...

For asynchronous use:

import asyncio
from friendli import AsyncFriendli

async def main():
    async with AsyncFriendli() as client:
        ...


asyncio.run(main())

Managing Streaming Responses

When using streaming responses, it is important to close the HTTP connection properly once the interaction is complete. By default, the connection is closed automatically when all data from the stream has been consumed (that is, when the for loop reaches the end). However, if streaming is interrupted by an exception or another problem, the connection may stay open and is not released until it is garbage-collected. To make sure all underlying connections and resources are released, close the connection explicitly, especially when streaming ends prematurely.

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)

try:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
finally:
    stream.close()  # Ensure the stream is closed after use

For asynchronous streaming:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main():
    stream = await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )

    try:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
    finally:
        await stream.close()  # Ensure the stream is closed after use

asyncio.run(main())

You can also use a context manager so that the stream is closed automatically when the block exits:

from friendli import Friendli

client = Friendli()

with client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
) as stream:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

For asynchronous streaming:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()

async def main():
    async with await client.chat.completions.create(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    ) as stream:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())
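
Stopping a stream early is the case where explicit closing matters most, because the loop never reaches the end of the data. The following sketch shows one way to cap the amount of text read while still guaranteeing the close; read_until and max_chars are hypothetical names, and the stream is assumed only to be iterable and to expose close() as in the examples above.

```python
from typing import Any


def read_until(stream: Any, max_chars: int) -> str:
    """Read from a stream until max_chars is reached, then close it."""
    text = ""
    try:
        for chunk in stream:
            text += chunk.choices[0].delta.content or ""
            if len(text) >= max_chars:
                break  # Stop early; the finally clause still closes the stream
    finally:
        stream.close()
    return text[:max_chars]
```

Because close() sits in the finally clause, it runs on normal completion, on the early break, and when an exception propagates.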

Canceling a gRPC Stream

When using the gRPC interface with streaming, you may want to cancel an ongoing stream before it completes. This is particularly useful when you need to stop the stream because of a timeout or some other condition.

For synchronous gRPC streaming:

from friendli import Friendli

client = Friendli(base_url="0.0.0.0:8000", use_grpc=True)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake",
        }
    ],
    stream=True,
)

try:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
except SomeException:  # Placeholder: replace with the exception you want to handle
    stream.cancel()  # Cancel the stream in case of an error or interruption

For asynchronous gRPC streaming:

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli(base_url="0.0.0.0:8000", use_grpc=True)

async def main():
    stream = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake",
            }
        ],
        stream=True,
    )

    try:
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
    except SomeException:  # Placeholder: replace with the exception you want to handle
        stream.cancel()  # Cancel the stream in case of an error or interruption

asyncio.run(main())
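
The timeout case mentioned above could be handled along the following lines. This is a sketch under stated assumptions: stream_with_deadline and timeout_s are hypothetical names, the stream is assumed to be iterable and to expose cancel() as in the examples above, and the deadline is only checked between chunks, so a stalled stream would not be interrupted by this approach.

```python
import time
from typing import Any


def stream_with_deadline(stream: Any, timeout_s: float) -> str:
    """Consume a stream, canceling it once timeout_s seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    parts = []
    for chunk in stream:
        parts.append(chunk.choices[0].delta.content or "")
        if time.monotonic() > deadline:
            stream.cancel()  # Stop the in-flight stream
            break
    return "".join(parts)
```

time.monotonic() is used instead of time.time() so that system clock adjustments cannot shorten or extend the deadline.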

CLI Examples

You can also call the generation APIs directly from the CLI.

friendli api chat-completions create \
  -g "user Tell me how to make a delicious pancake" \
  -m meta-llama-3.1-8b-instruct

For more information about the friendli command, run friendli --help in your terminal shell. It prints a detailed list of available options and usage instructions.

Tip

Check out the official documentation to learn more!
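
To drive the same command from a script, the argument list can be built in Python and handed to subprocess. This is a sketch: chat_cli_args is a hypothetical helper, and the "-g" value format (role, a space, then the message) is inferred from the single example in this README rather than from the CLI reference.

```python
import shlex
from typing import List


def chat_cli_args(prompt: str, model: str, role: str = "user") -> List[str]:
    """Build the argv list for the chat-completions command shown above."""
    return [
        "friendli", "api", "chat-completions", "create",
        "-g", f"{role} {prompt}",  # Assumed format: "<role> <message>"
        "-m", model,
    ]


args = chat_cli_args(
    "Tell me how to make a delicious pancake",
    "meta-llama-3.1-8b-instruct",
)
# Print a shell-quoted version of the command for inspection.
print(shlex.join(args))
```

Passing the list to subprocess.run(args, check=True) would execute it without going through a shell, which avoids quoting problems in the prompt text.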
