uni-api

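Introduction

For personal use, one/new-api is too complex and has many commercial features that individuals do not need. If you do not want a complicated frontend interface and prefer support for more models, you can try uni-api. This project unifies the management of large language model APIs: it lets you call multiple backend services through a single unified API interface, converts them all to the OpenAI format, and supports load balancing. Currently supported backend services include OpenAI, Anthropic, Gemini, Vertex, Cohere, Groq, Cloudflare, OpenRouter, and more.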

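Features

No frontend: API channels are configured purely through a configuration file. You can run your own API station just by writing one file, and the documentation includes a detailed, beginner-friendly configuration guide.
Unified management of multiple backend services, supporting providers such as OpenAI, DeepSeek, OpenRouter, and other APIs in the OpenAI format. Supports OpenAI DALL-E 3 image generation.
Also supports Anthropic, Gemini, Vertex AI, Cohere, Groq, and Cloudflare. Vertex supports both the Claude and Gemini APIs.
Supports native tool use (function calling) for OpenAI, Anthropic, Gemini, and Vertex.
Supports the native image recognition APIs of OpenAI, Anthropic, Gemini, and Vertex.
Supports four types of load balancing.
1. Channel-level weighted load balancing, which distributes requests according to the weights of different channels. It is not enabled by default and requires channel weights to be configured.
2. Vertex regional load balancing and high concurrency, which can increase Gemini and Claude concurrency by up to (number of regions) times. It is enabled automatically without additional configuration.
3. Except for Vertex region-level load balancing, all APIs support channel-level sequential load balancing, which improves the immersive translation experience. It is not enabled by default and requires SCHEDULING_ALGORITHM to be set to round_robin.
4. Automatic API-key-level round-robin load balancing across multiple API keys in a single channel.
Supports automatic retry: when an API channel fails to respond, the next API channel is retried automatically.
Supports channel cooldown: when an API channel fails to respond, the channel is automatically excluded and cooled down, and requests to it stop. When the cooldown period ends, the channel is automatically restored until it fails again, at which point it is cooled down again.
Supports fine-grained model timeout settings, allowing a different timeout for each model.
Supports fine-grained permission control. Wildcards can be used to set which models are available to an API key through which channels.
Supports rate limiting. A limit is an integer number of requests per period, for example 2/min (2 requests per minute), 5/hour (5 per hour), 10/day (10 per day), 10/month (10 per month), or 10/year (10 per year). The default is 60/min.
Supports multiple standard OpenAI-format interfaces: /v1/chat/completions, /v1/images/generations, /v1/audio/transcriptions, /v1/moderations, /v1/models.
Supports OpenAI moderation, which can review user messages. If an inappropriate message is found, an error message is returned. This reduces the risk of the backend API being banned by the provider.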
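
The rate-limit notation in the feature list (for example 2/min or 15/min,10/day) can be read as a comma-separated list of "count/period" pairs. The following is a minimal sketch of such a parser, not uni-api's own code: the function name `parse_rate_limit` is hypothetical, and the month and year lengths (30 and 365 days) are assumptions made only for illustration.

```python
from typing import List, Tuple

# Window lengths in seconds. The month and year lengths are assumptions
# (30 and 365 days); uni-api may count these periods differently.
WINDOW_SECONDS = {
    "min": 60,
    "hour": 3600,
    "day": 86400,
    "month": 30 * 86400,
    "year": 365 * 86400,
}


def parse_rate_limit(spec: str) -> List[Tuple[int, int]]:
    """Parse a string such as "15/min,10/day" into (max_requests, window_seconds) pairs."""
    limits = []
    for part in spec.split(","):
        count_text, _, unit = part.strip().partition("/")
        if unit not in WINDOW_SECONDS:
            raise ValueError(f"unknown rate-limit unit: {unit!r}")
        limits.append((int(count_text), WINDOW_SECONDS[unit]))
    return limits


print(parse_rate_limit("2/min"))          # [(2, 60)]
print(parse_rate_limit("15/min,10/day"))  # [(15, 60), (10, 86400)]
```

When several constraints are given, a request would have to satisfy all of them, so the tightest window that is currently exhausted decides whether the request is rejected.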

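Usage

A configuration file is required to start uni-api. There are two ways to provide it:

The first method is to set the CONFIG_URL environment variable to the URL of a configuration file, which uni-api downloads automatically at startup.
The second method is to mount a configuration file named api.yaml into the container.

Method 1: Mount the `api.yaml` configuration file to start uni-api

The configuration file must be filled in before starting uni-api, and it must be named api.yaml. You can configure multiple models, and each model can be served by multiple backend services with load balancing. Below is an example of the smallest api.yaml configuration file that can run: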

providers:
  - provider: provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address, required
    api: sk-YgS6GTi0b4bEabc4C # Provider's API Key, required, automatically uses base_url and api to get all available models through the /v1/models endpoint.
  # Multiple providers can be configured here, each provider can configure multiple API Keys, and each API Key can configure multiple models.
api_keys:
  - api: sk-Pkj60Yf8JFWxfgRmXQFWyGtWUddGZnmi3KlvowmRWpWpQxx # API Key, user request uni-api requires API key, required
  # This API Key can use all models, that is, it can use all models in all channels set under providers, without needing to add available channels one by one.

Detailed advanced configuration of api.yaml:

providers:
  - provider: provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address, required
    api: sk-YgS6GTi0b4bEabc4C # Provider's API Key, required
    model: # Optional, if model is not configured, all available models will be automatically obtained through base_url and api via the /v1/models endpoint.
      - gpt-4o # Usable model name, required
      - claude-3-5-sonnet-20240620: claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
      - dall-e-3

  - provider: anthropic
    base_url: https://api.anthropic.com/v1/messages
    api: # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - sk-ant-api03-bNnAOJyA-xQw_twAA
      - sk-ant-api02-bNnxxxx
    model:
      - claude-3-5-sonnet-20240620: claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
    tools: true # Whether to support tools, such as generating code, generating documents, etc., default is true, optional

  - provider: gemini
    base_url: https://generativelanguage.googleapis.com/v1beta # base_url supports v1beta/v1, only for Gemini model use, required
    api: # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - AIzaSyAN2k6IRdgw123
      - AIzaSyAN2k6IRdgw456
      - AIzaSyAN2k6IRdgw789
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash-exp-0827: gemini-1.5-flash # After renaming, the original model name gemini-1.5-flash-exp-0827 cannot be used, if you want to use the original name, you can add the original name in the model, just add the line below to use the original name
      - gemini-1.5-flash-exp-0827 # Add this line, both gemini-1.5-flash-exp-0827 and gemini-1.5-flash can be requested
    tools: true
    preferences:
      api_key_rate_limit: 15/min # Each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # api_key_rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      api_key_cooldown_period: 60 # Each API Key will be cooled down for 60 seconds after encountering a 429 error. Optional, the default is 0 seconds. When set to 0, the cooling mechanism is not enabled. When there are multiple API keys, the cooling mechanism will take effect.
      api_key_schedule_algorithm: round_robin # Set the request order of multiple API Keys, optional. The default is round_robin, and the optional values are: round_robin, random. It will take effect when there are multiple API keys. round_robin is polling load balancing, and random is random load balancing.
      model_timeout: # Model timeout, in seconds, default 100 seconds, optional
        gemini-1.5-pro: 10 # Model gemini-1.5-pro timeout is 10 seconds
        gemini-1.5-flash: 10 # Model gemini-1.5-flash timeout is 10 seconds
        default: 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the timeout is also 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
      proxy: socks5://[username]:[password]@[ip]:[port] # Proxy address, optional. Supports socks5 and http proxies, default is not used.

  - provider: vertex
    project_id: gen-lang-client-xxxxxxxxxxxxxx # Description: Your Google Cloud project ID. Format: String, usually composed of lowercase letters, numbers, and hyphens. How to obtain: You can find your project ID in the project selector of the Google Cloud Console.
    private_key: "-----BEGIN PRIVATE KEY-----\nxxxxx\n-----END PRIVATE" # Description: Private key for Google Cloud Vertex AI service account. Format: A JSON formatted string containing the private key information of the service account. How to obtain: Create a service account in Google Cloud Console, generate a JSON formatted key file, and then set its content as the value of this environment variable.
    client_email: xxxxxxxxxx@xxxxxxx.gserviceaccount.com # Description: Email address of the Google Cloud Vertex AI service account. Format: Usually a string like "xxxxxx@xxxxxxx.gserviceaccount.com". How to obtain: Generated when creating a service account, or you can view the service account details in the "IAM and Admin" section of the Google Cloud Console.
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-pro: gemini-1.5-pro-search # Only supports using the gemini-1.5-pro-search model to request uni-api when using the Vertex Gemini API, to automatically use the Google official search tool.
      - claude-3-5-sonnet@20240620: claude-3-5-sonnet
      - claude-3-opus@20240229: claude-3-opus
      - claude-3-sonnet@20240229: claude-3-sonnet
      - claude-3-haiku@20240307: claude-3-haiku
    tools: true
    notes: https://xxxxx.com/ # You can put the provider's website, notes, official documentation, optional

  - provider: cloudflare
    api: f42b3xxxxxxxxxxq4aoGAh # Cloudflare API Key, required
    cf_account_id: 8ec0xxxxxxxxxxxxe721 # Cloudflare Account ID, required
    model:
      - '@cf/meta/llama-3.1-8b-instruct': llama-3.1-8b # Rename model, @cf/meta/llama-3.1-8b-instruct is the provider's original model name, must be enclosed in quotes, otherwise yaml syntax error, llama-3.1-8b is the renamed name, you can use a simple name to replace the original complex name, optional
      - '@cf/meta/llama-3.1-8b-instruct' # Must be enclosed in quotes, otherwise yaml syntax error

  - provider: other-provider
    base_url: https://api.xxx.com/v1/messages
    api: sk-bNnAOJyA-xQw_twAA
    model:
      - causallm-35b-beta2ep-q6k: causallm-35b
      - anthropic/claude-3-5-sonnet
    tools: false
    engine: openrouter # Force the use of a specific message format, currently supports gpt, claude, gemini, openrouter native format, optional

api_keys:
  - api: sk-KjjI60Yf0JFWxfgRmXqFWyGtWUd9GZnmi3KlvowmRWpWpQRo # API Key, required for users to use this service
    model: # Models that can be used by this API Key, required. Default channel-level polling load balancing is enabled, and each request model is requested in sequence according to the model configuration. It is not related to the original channel order in providers. Therefore, you can set different request sequences for each API key.
      - gpt-4o # Usable model name, can use all gpt-4o models provided by providers
      - claude-3-5-sonnet # Usable model name, can use all claude-3-5-sonnet models provided by providers
      - gemini/* # Usable model name, can only use all models provided by providers named gemini, where gemini is the provider name, * represents all models
    role: admin

  - api: sk-pkhf60Yf0JGyJxgRmXqFQyTgWUd9GZnmi3KlvowmRWpWqrhy
    model:
      - anthropic/claude-3-5-sonnet # Usable model name, can only use the claude-3-5-sonnet model provided by the provider named anthropic. Models with the same name from other providers cannot be used. This syntax will not match the model named anthropic/claude-3-5-sonnet provided by other-provider.
      - <anthropic/claude-3-5-sonnet> # By adding angle brackets on both sides of the model name, it will not search for the claude-3-5-sonnet model under the channel named anthropic, but will take the entire anthropic/claude-3-5-sonnet as the model name. This syntax can match the model named anthropic/claude-3-5-sonnet provided by other-provider. But it will not match the claude-3-5-sonnet model under anthropic.
      - openai-test/text-moderation-latest # When message moderation is enabled, the text-moderation-latest model under the channel named openai-test can be used for moderation.
      - sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo/* # Support using other API keys as channels
    preferences:
      SCHEDULING_ALGORITHM: fixed_priority # When SCHEDULING_ALGORITHM is fixed_priority, use fixed priority scheduling: always use the first channel that provides the requested model. This is the default value. Optional values are: fixed_priority, round_robin, weighted_round_robin, lottery, random.
      # When SCHEDULING_ALGORITHM is random, use random load balancing: randomly pick a channel that provides the requested model.
      # When SCHEDULING_ALGORITHM is round_robin, use polling load balancing: request the channels that provide the requested model in order.
      AUTO_RETRY: true # Whether to automatically retry, automatically retry the next provider, true for automatic retry, false for no automatic retry, default is true. Also supports setting a number, indicating the number of retries.
      rate_limit: 15/min # Supports rate limiting, each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      ENABLE_MODERATION: true # Whether to enable message moderation, true for enable, false for disable, default is false, when enabled, it will moderate the user's message, if inappropriate messages are found, an error message will be returned.

  # Channel-level weighted load balancing configuration example
  - api: sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo
    model:
      - gcp1/*: 5 # The number after the colon is the weight, weight only supports positive integers.
      - gcp2/*: 3 # The size of the number represents the weight, the larger the number, the greater the probability of the request.
      - gcp3/*: 2 # In this example, the weights of all channels total 10, so out of 10 requests, 5 go to the gcp1/* models, 3 to the gcp2/* models, and 2 to the gcp3/* models.

    preferences:
      SCHEDULING_ALGORITHM: weighted_round_robin # Requests follow the weighted order only when SCHEDULING_ALGORITHM is weighted_round_robin and the channels above have weights. When SCHEDULING_ALGORITHM is lottery, a channel that provides the requested model is picked randomly according to the weights. Channels without weights automatically fall back to round_robin polling load balancing.
      AUTO_RETRY: true

preferences: # Global configuration
  model_timeout: # Model timeout, in seconds, default 100 seconds, optional
    gpt-4o: 10 # Model gpt-4o timeout is 10 seconds, gpt-4o is the model name, when requesting models like gpt-4o-2024-08-06, the timeout is also 10 seconds
    claude-3-5-sonnet: 10 # Model claude-3-5-sonnet timeout is 10 seconds, when requesting models like claude-3-5-sonnet-20240620, the timeout is also 10 seconds
    default: 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the default timeout is 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
    o1-mini: 30 # Model o1-mini timeout is 30 seconds, when requesting models starting with o1-mini, the timeout is 30 seconds
    o1-preview: 100 # Model o1-preview timeout is 100 seconds, when requesting models starting with o1-preview, the timeout is 100 seconds
  cooldown_period: 300 # Channel cooldown time, in seconds, default 300 seconds, optional. When a model request fails, the channel will be automatically excluded and cooled down for a period of time, and will not request the channel again. After the cooldown time ends, the model will be automatically restored until the request fails again, and it will be cooled down again. When cooldown_period is set to 0, the cooling mechanism is not enabled.
  error_triggers: # Error triggers, when the message returned by the model contains any of the strings in the error_triggers, the channel will return an error. Optional
    - The bot's usage is covered by the developer
    - process this request due to overload or policy
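
The model rules listed under api_keys in the configuration above (bare model name, provider/model, provider/*, and the angle-bracket form) can be summarised as a small matching function. This is a sketch of the behaviour described in the comments, under the assumption that wildcards follow shell-style matching; `rule_allows` is a hypothetical name, and weights and the "other API key as a channel" form are left out.

```python
from fnmatch import fnmatchcase


def rule_allows(rule: str, provider: str, model: str) -> bool:
    """Return True if an api_keys model rule permits `model` served by `provider`."""
    if rule.startswith("<") and rule.endswith(">"):
        # Angle brackets: the whole text is a literal model name, slash included.
        return model == rule[1:-1]
    if "/" in rule:
        # "provider/model" form: the part before the first slash names the channel.
        rule_provider, _, rule_model = rule.partition("/")
        return provider == rule_provider and fnmatchcase(model, rule_model)
    # Bare model name: any provider offering that model is allowed.
    return model == rule


# gemini/* covers every model of the provider named gemini, and nothing else.
print(rule_allows("gemini/*", "gemini", "gemini-1.5-pro"))
print(rule_allows("gemini/*", "vertex", "gemini-1.5-pro"))
# The bracketed form matches other-provider's model literally named anthropic/claude-3-5-sonnet.
print(rule_allows("<anthropic/claude-3-5-sonnet>", "other-provider", "anthropic/claude-3-5-sonnet"))
```

The angle-bracket case is checked first because a bracketed rule also contains a slash and would otherwise be split into a provider and a model.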

Mount the configuration file and start the uni-api Docker container:

docker run --user root -p 8001:8000 --name uni-api -dit \
-v ./api.yaml:/home/api.yaml \
yym68686/uni-api:latest

Method 2: Start uni-api using the `CONFIG_URL` environment variable

After writing the configuration file as described in Method 1, upload it to a cloud drive, get a direct link to the file, and then start the uni-api Docker container with the CONFIG_URL environment variable:

docker run --user root -p 8001:8000 --name uni-api -dit \
-e CONFIG_URL=http://file_url/api.yaml \
yym68686/uni-api:latest

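Environment variables

CONFIG_URL: Download address of the configuration file. It can be a local file or a remote file.
TIMEOUT: Request timeout, default is 100 seconds. The timeout controls how long uni-api waits before switching to the next channel when a channel does not respond. Optional
DISABLE_DATABASE: Whether to disable the database, default is false, optional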
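
As a quick illustration of the defaults listed above, the sketch below reads the three variables from a mapping such as `os.environ`. It is not uni-api's own startup code: `Settings` and `load_settings` are hypothetical names, and treating any capitalisation of "true" as enabled is an assumption.

```python
from typing import Mapping, NamedTuple, Optional


class Settings(NamedTuple):
    config_url: Optional[str]
    timeout: int
    disable_database: bool


def load_settings(env: Mapping[str, str]) -> Settings:
    """Read the three documented variables, applying the documented defaults."""
    return Settings(
        config_url=env.get("CONFIG_URL"),
        timeout=int(env.get("TIMEOUT", "100")),
        # Assumption: any capitalisation of "true" enables the flag.
        disable_database=env.get("DISABLE_DATABASE", "false").strip().lower() == "true",
    )


print(load_settings({}))
print(load_settings({"CONFIG_URL": "http://file_url/api.yaml", "DISABLE_DATABASE": "true"}))
```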

Vercel Remote Deployment

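After clicking the one-click deployment button, set the environment variable CONFIG_URL to the direct link of the configuration file and DISABLE_DATABASE to true, then click Create to create the project. After deployment, manually set the Function Max Duration to 60 seconds in the Vercel project panel under Settings -> Functions, then open the Deployments menu and click Redeploy so that the timeout becomes 60 seconds. If you do not redeploy, the default timeout remains at the original 10 seconds. Do not delete and recreate the Vercel project. Instead, click Redeploy in the Deployments menu of the currently deployed Vercel project so that the Function Max Duration change takes effect.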

Ubuntu Deployment

In the repository's Releases, find the latest version of the binary for your platform, for example a file named uni-api-linux-x86_64-0.0.99.pex. Download the binary to the server and run it:

wget https://github.com/yym68686/uni-api/releases/download/v0.0.99/uni-api-linux-x86_64-0.0.99.pex
chmod +x uni-api-linux-x86_64-0.0.99.pex
./uni-api-linux-x86_64-0.0.99.pex

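Serv00 Remote Deployment (FreeBSD 14.0)

First, log in to the panel. In Additional services, click the Run your own applications tab and enable the option to run your own programs, then go to Port reservation in the panel and open a random port.

If you do not have your own domain name, go to WWW websites in the panel and delete the default domain name provided. Then create a new domain whose Domain is the one you just deleted. After clicking Advanced settings, set the Website type to Proxy domain and point the Proxy port to the port you just opened. Do not select Use HTTPS.

Log in to the serv00 server over SSH and run the following commands: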

git clone --depth 1 -b main --quiet https://github.com/yym68686/uni-api.git
cd uni-api
python -m venv uni-api
tmux new -s uni-api
source uni-api/bin/activate
export CFLAGS="-I/usr/local/include"
export CXXFLAGS="-I/usr/local/include"
export CC=gcc
export CXX=g++
export MAX_CONCURRENCY=1
export CPUCOUNT=1
export MAKEFLAGS="-j1"
CMAKE_BUILD_PARALLEL_LEVEL=1 cpuset -l 0 pip install -vv -r requirements.txt
cpuset -l 0 pip install -vv -r requirements.txt

Press Ctrl+B and then D to detach from tmux, and wait a few hours for the installation to complete. Once it has finished, run the following commands:

tmux attach -t uni-api
source uni-api/bin/activate
export CONFIG_URL=http://file_url/api.yaml
export DISABLE_DATABASE=true
# Modify the port, xxx is the port, modify it yourself, corresponding to the port opened in the panel Port reservation
sed -i '' 's/port=8000/port=xxx/' main.py
sed -i '' 's/reload=True/reload=False/' main.py
python main.py

Press Ctrl+B and then D to detach from tmux and leave the program running in the background. At this point, you can use uni-api from other chat clients. curl test script:

curl -X POST https://xxx.serv00.net/v1/chat/completions \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-xxx' \
-d '{"model": "gpt-4o","messages": [{"role": "user","content": "Hello"}]}'

Reference documents:

https://docs.serv00.com/python/

https://linux.do/t/topic/201181

https://linux.do/t/topic/218738

Docker Local Deployment

Start the container

# -e CONFIG_URL: not needed if the local configuration file is already mounted
# -v ./api.yaml: not needed if CONFIG_URL is already set
# -v ./uniapi_db: not needed if you do not want to save statistical data
docker run --user root -p 8001:8000 --name uni-api -dit \
-e CONFIG_URL=http://file_url/api.yaml \
-v ./api.yaml:/home/api.yaml \
-v ./uniapi_db:/home/data \
yym68686/uni-api:latest

Or, if you want to use Docker Compose, here is a docker-compose.yml example:

services:
  uni-api:
    container_name: uni-api
    image: yym68686/uni-api:latest
    environment:
      - CONFIG_URL=http://file_url/api.yaml # If a local configuration file is already mounted, there is no need to set CONFIG_URL
    ports:
      - 8001:8000
    volumes:
      - ./api.yaml:/home/api.yaml # If CONFIG_URL is already set, there is no need to mount the configuration file
      - ./uniapi_db:/home/data # If you do not want to save statistical data, there is no need to mount this folder

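CONFIG_URL is the URL of a remote configuration file that uni-api downloads automatically. For example, if it is inconvenient to edit the configuration file on a particular platform, you can upload the file to a hosting service and give uni-api a direct link to it; that link is the CONFIG_URL. If you use a locally mounted configuration file, there is no need to set CONFIG_URL. It is meant for situations where mounting the configuration file is not convenient.

Run the Docker Compose container in the background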

docker-compose pull
docker-compose up -d

Docker build

docker build --no-cache -t uni-api:latest -f Dockerfile --platform linux/amd64 .
docker tag uni-api:latest yym68686/uni-api:latest
docker push yym68686/uni-api:latest

One-click restart of the Docker image

set -eu
docker pull yym68686/uni-api:latest
docker rm -f uni-api
docker run --user root -p 8001:8000 -dit --name uni-api \
-e CONFIG_URL=http://file_url/api.yaml \
-v ./api.yaml:/home/api.yaml \
-v ./uniapi_db:/home/data \
yym68686/uni-api:latest
docker logs -f uni-api

RESTful curl test

curl -X POST http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${API}" \
-d '{"model": "gpt-4o","messages": [{"role": "user", "content": "Hello"}],"stream": true}'
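
The same streaming request can be issued from Python with only the standard library. The sketch below builds the request and decodes OpenAI-style server-sent event lines ("data: {...}" terminated by "data: [DONE]"). `build_chat_request` and `iter_stream_chunks` are hypothetical helper names, and the key sk-xxx is a placeholder.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def build_chat_request(base_url: str, api_key: str, model: str, content: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, with streaming enabled."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "stream": True,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def iter_stream_chunks(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield decoded JSON chunks from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)


request = build_chat_request("http://127.0.0.1:8000", "sk-xxx", "gpt-4o", "Hello")
print(request.get_method(), request.full_url)
# Sending the request needs a running uni-api instance:
# with urllib.request.urlopen(request) as response:
#     for chunk in iter_stream_chunks(response):
#         print(chunk)
```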

pex Linux packaging:

VERSION=$(cat VERSION)
pex -D . -r requirements.txt \
    -c uvicorn \
    --inject-args 'main:app --host 0.0.0.0 --port 8000' \
    --platform linux_x86_64-cp-3.10.12-cp310 \
    --interpreter-constraint '==3.10.*' \
    --no-strip-pex-env \
    -o uni-api-linux-x86_64-${VERSION}.pex

macOS packaging:

VERSION=$(cat VERSION)
pex -r requirements.txt \
    -c uvicorn \
    --inject-args 'main:app --host 0.0.0.0 --port 8000' \
    -o uni-api-macos-arm64-${VERSION}.pex

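Sponsors

We thank the following sponsors for their support:

@PowerHunter: ¥2000
@IOI: ¥50

How to sponsor us

If you would like to support our project, you can sponsor us in the following ways:

PayPal
USDT-TRC20, USDT-TRC20 wallet address: TLFbqSv5pDu5he43mVmK1dNx7yBMFeN7d8
WeChat
Alipay

Thank you for your support!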

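FAQ

Why does the error Error processing request or performing moral check: 404: No matching model found always appear?

Setting ENABLE_MODERATION to false fixes this issue. When ENABLE_MODERATION is true, the API key must be able to use the text-moderation-latest model. If text-moderation-latest is not provided in the provider model settings, an error is raised saying that the model cannot be found.

How do I prioritize requests to a specific channel? How do I set channel priority?

Set the channel order directly in api_keys. No other settings are required. Sample configuration file: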

providers:
  - provider: ai1
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

  - provider: ai2
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

api_keys:
  - api: sk-1234
    model:
      - ai2/*
      - ai1/*

With this configuration, ai2 is requested first, and ai1 is requested only if ai2 fails.
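
The ordering behaviour above amounts to trying each channel in the configured order and moving on when one fails. A minimal sketch, assuming a caller-supplied `send` function that raises on failure; `request_with_fallback` and `fake_send` are hypothetical names, not part of uni-api.

```python
from typing import Callable, Sequence


def request_with_fallback(channels: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each channel in configured order and return the first successful reply."""
    errors = []
    for channel in channels:
        try:
            return send(channel)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{channel}: {exc}")
    raise RuntimeError("all channels failed: " + "; ".join(errors))


def fake_send(channel: str) -> str:
    """Hypothetical backend call: ai2 is down, ai1 answers."""
    if channel == "ai2":
        raise ConnectionError("upstream unavailable")
    return f"reply from {channel}"


print(request_with_fallback(["ai2", "ai1"], fake_send))  # reply from ai1
```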

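What is the behavior behind the various scheduling algorithms, such as fixed_priority, weighted_round_robin, lottery, random, and round_robin?

All scheduling algorithms are enabled by setting api_keys.(api).preferences.SCHEDULING_ALGORITHM in the configuration file to one of fixed_priority, weighted_round_robin, lottery, random, or round_robin.

fixed_priority: Fixed priority scheduling. Every request is always served by the first channel that provides the requested model. If an error occurs, uni-api switches to the next channel. This is the default scheduling algorithm.
weighted_round_robin: Weighted round-robin load balancing. Channels that provide the requested model are requested in the weighted order set in api_keys.(api).model in the configuration file.
lottery: Lottery (weighted random) load balancing. A channel that provides the requested model is picked at random according to the weights set in api_keys.(api).model in the configuration file.
round_robin: Round-robin load balancing. Channels that provide the requested model are requested in the order configured in api_keys.(api).model in the configuration file. See the previous question on how to set channel priority.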
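
To make the difference between weighted_round_robin and lottery concrete, here is a small sketch using the weights 5, 3, and 2 from the advanced configuration. It illustrates the two ideas only: the simple "repeat each channel weight times" expansion is an assumption, and uni-api may interleave channels differently within a cycle.

```python
import itertools
import random
from collections import Counter
from typing import Dict, Iterator, Optional


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Cycle through channels so each appears `weight` times per full cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def lottery(weights: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Pick one channel at random with probability proportional to its weight."""
    rng = rng or random.Random()
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


weights = {"gcp1/*": 5, "gcp2/*": 3, "gcp3/*": 2}
picker = weighted_round_robin(weights)
# Ten consecutive picks cover exactly one cycle: 5, 3, and 2 requests.
print(Counter(next(picker) for _ in range(10)))
# A lottery pick is random; only the long-run proportions follow the weights.
print(lottery(weights))
```

The weighted round robin is deterministic over each cycle of ten requests, while the lottery only approaches the 5:3:2 split over many requests.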

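How should base_url be filled in correctly?

Except for the special channels shown in the advanced configuration, every OpenAI-format provider needs the complete base_url, which means base_url must end with /v1/chat/completions. If you are using GitHub models, base_url should be https://models.inference.azure.com/chat/completions.

How does the model timeout work? What is the priority between the channel-level timeout setting and the global model timeout setting?

The channel-level timeout setting takes priority over the global model timeout setting. The order of priority is: channel-level model timeout setting > channel-level default timeout setting > global model timeout setting > global default timeout setting > environment variable TIMEOUT.

Adjusting the model timeout helps avoid timeout errors on some channels. If you encounter the error {'error': '500', 'details': 'fetch_response_stream Read Response Timeout'}, try increasing the model timeout.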
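
The priority chain can be sketched as a lookup that walks the five levels in order. This is an illustration under assumptions: model names are matched by prefix (the configuration comments say a gpt-4o entry also applies to gpt-4o-2024-08-06), the longest matching key wins, and `match_model` and `resolve_timeout` are hypothetical names.

```python
from typing import Dict, Optional


def match_model(model: str, table: Dict[str, int]) -> Optional[int]:
    """Return the entry whose key is the longest prefix of `model`, ignoring "default"."""
    candidates = [key for key in table if key != "default" and model.startswith(key)]
    if not candidates:
        return None
    return table[max(candidates, key=len)]


def resolve_timeout(
    model: str,
    channel_timeouts: Dict[str, int],
    global_timeouts: Dict[str, int],
    env_timeout: int = 100,
) -> int:
    """Apply: channel model > channel default > global model > global default > TIMEOUT."""
    for value in (
        match_model(model, channel_timeouts),
        channel_timeouts.get("default"),
        match_model(model, global_timeouts),
        global_timeouts.get("default"),
    ):
        if value is not None:
            return value
    return env_timeout


channel = {"gemini-1.5-pro": 10}
global_ = {"o1-mini": 30, "default": 20}
print(resolve_timeout("gemini-1.5-pro-002", channel, global_))  # 10, channel-level model entry
print(resolve_timeout("o1-mini-2024-09-12", channel, global_))  # 30, global model entry
print(resolve_timeout("some-other-model", channel, global_))    # 20, global default
print(resolve_timeout("some-other-model", {}, {}))              # 100, TIMEOUT default
```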

How does api_key_rate_limit work? How do I set the same rate limit for multiple models?

If you want to set the same frequency limit for the four models gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, and gemini-1.5-pro-002, you can configure it as follows:

api_key_rate_limit:
  gemini-1.5-pro: 1000/min

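This matches every model whose name contains the string gemini-1.5-pro. The frequency limit for all four models, gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, and gemini-1.5-pro-002, is set to 1000/min. The matching logic of the api_key_rate_limit field is explained below using this sample configuration: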

api_key_rate_limit:
  gemini-1.5-pro: 1000/min
  gemini-1.5-pro-002: 500/min

Suppose a request now arrives for the model gemini-1.5-pro-002.

First, uni-api tries to match the model name exactly in api_key_rate_limit. A rate limit is set for gemini-1.5-pro-002, so its limit is 500/min. If the requested model is gemini-1.5-pro-latest instead, api_key_rate_limit has no entry for gemini-1.5-pro-latest, so uni-api looks for a configured model that shares a prefix with gemini-1.5-pro-latest. It finds gemini-1.5-pro, and the rate limit for gemini-1.5-pro-latest is therefore 1000/min.
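
The two-step lookup just described (exact match first, then a partial match) can be sketched as follows. `resolve_rate_limit` is a hypothetical name. Treating the partial match as "configured key contained in the requested name", preferring the longest such key, and falling back to a default entry are assumptions based on the wording above and on the default key shown in the advanced configuration.

```python
from typing import Dict, Optional


def resolve_rate_limit(model: str, limits: Dict[str, str]) -> Optional[str]:
    """Look up the rate limit for `model`: exact key, then partial match, then default."""
    if model in limits:
        return limits[model]
    partial = [key for key in limits if key != "default" and key in model]
    if partial:
        # Assumption: when several keys match, the most specific (longest) one wins.
        return limits[max(partial, key=len)]
    return limits.get("default")


limits = {"gemini-1.5-pro": "1000/min", "gemini-1.5-pro-002": "500/min"}
print(resolve_rate_limit("gemini-1.5-pro-002", limits))     # 500/min
print(resolve_rate_limit("gemini-1.5-pro-latest", limits))  # 1000/min
```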
