uni api下載-uni uni api源代碼下載

Uni-Api

英語|中國人

介紹

為了個人用途，一個/new-api過於復雜，因為個人不需要的許多商業功能。如果您不想要復雜的前端界面，並且更喜歡對更多型號的支持，則可以嘗試使用Uni-API。這是一個統一大型語言模型API的管理項目，使您可以通過單個統一的API接口調用多個後端服務，將它們全部轉換為OpenAI格式，並支持負載平衡。目前支持的後端服務包括：OpenAI，人類，雙子座，頂點，Cohere，Groq，Cloudflare，OpenRouter等。

特徵

沒有用於配置API通道的前端純配置文件。您只需編寫文件即可運行自己的API站，並且該文檔具有詳細的配置指南，即對初學者友好。
多個後端服務的統一管理，以OpenAI格式為OpenAI，DeepSeek，OpenRouter和其他API等提供商提供支持。支持Openai Dalle-3圖像生成。
同時支持人類，雙子座，頂點AI，cohere，groq，Cloudflare。頂點同時支持Claude和Gemini API。
支持OpenAI，人類，雙子座，頂點本機工具使用功能調用。
支持OpenAI，人類，雙子座，頂點本地圖像識別API。
支持四種類型的負載平衡。
1. 支持通道級的加權負載平衡，從而可以根據不同的通道權重分配請求。默認情況下不啟用它，需要配置通道權重。
2. 支持頂點區域負載平衡和高並發性，可以將雙子座和克勞德並發增加到（API *區域數量）次。自動啟用，無需其他配置。
3. 除了頂點區域級負載平衡外，所有API都支持通道級的順序負載平衡，從而增強了沉浸式翻譯體驗。默認情況下不啟用它，並且需要配置SCHEDULING_ALGORITHM為round_robin 。
4. 支持單個通道中多個API鍵的自動API鍵級圓形旋轉負載平衡。
支持自動重試，當API通道響應失敗時，會自動重試下一個API通道。
支持通道冷卻：當API頻道響應失敗時，該通道將自動排除並冷卻一段時間，並且對通道的請求將停止。冷卻期結束後，該模型將自動恢復直到再次失效為止，此時將再次冷卻。
支持細粒度的模型超時設置，為每個型號提供不同的超時持續時間。
支持細粒度的許可控制。支持使用通配符設置可用於API密鑰通道的特定型號的支持。
限制支持率，您可以將每分鐘的最大請求設置為整數，例如2/分鐘，每分鐘2次，5/小時，每小時5次，10/10次，每天10次，每天10次，10月10日，10次，每月10次，每年10次，每年10次。默認值為60/min。
支持多個標準的OpenAI格式接口： /v1/chat/completions ， /v1/images/generations ， /v1/audio/transcriptions ， /v1/moderations ， /v1/models 。
支持OpenAI審核的道德審查，可以對用戶信息進行道德審查。如果找到不適當的消息，將返回錯誤消息。這降低了提供者禁止後端API的風險。

用法方法

要啟動Uni-API，必須使用配置文件。從配置文件開始有兩種方法：

第一種方法是使用CONFIG_URL環境變量填充配置文件URL，當Uni-API啟動時將自動下載。
第二種方法是將一個名為api.yaml的配置文件安裝到容器中。

方法1：安裝`api.yaml`配置文件以啟動UNI-API

您必須提前填寫配置文件才能啟動uni-api ，並且必須使用名為api.yaml的配置文件啟動uni-api ，您可以配置多個模型，每個模型都可以配置多個後端服務，並支持負載平衡。以下是可以運行的最小api.yaml配置文件的示例：

 providers :
  - provider : provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url : https://api.your.com/v1/chat/completions # Backend service API address, required
    api : sk-YgS6GTi0b4bEabc4C # Provider's API Key, required, automatically uses base_url and api to get all available models through the /v1/models endpoint.
  # Multiple providers can be configured here, each provider can configure multiple API Keys, and each API Key can configure multiple models.
api_keys :
  - api : sk-Pkj60Yf8JFWxfgRmXQFWyGtWUddGZnmi3KlvowmRWpWpQxx # API Key, user request uni-api requires API key, required
  # This API Key can use all models, that is, it can use all models in all channels set under providers, without needing to add available channels one by one.

api.yaml的詳細高級配置：

 providers :
  - provider : provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url : https://api.your.com/v1/chat/completions # Backend service API address, required
    api : sk-YgS6GTi0b4bEabc4C # Provider's API Key, required
    model : # Optional, if model is not configured, all available models will be automatically obtained through base_url and api via the /v1/models endpoint.
      - gpt-4o # Usable model name, required
      - claude-3-5-sonnet-20240620 : claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
      - dall-e-3

  - provider : anthropic
    base_url : https://api.anthropic.com/v1/messages
    api : # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - sk-ant-api03-bNnAOJyA-xQw_twAA
      - sk-ant-api02-bNnxxxx
    model :
      - claude-3-5-sonnet-20240620 : claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
    tools : true # Whether to support tools, such as generating code, generating documents, etc., default is true, optional

  - provider : gemini
    base_url : https://generativelanguage.googleapis.com/v1beta # base_url supports v1beta/v1, only for Gemini model use, required
    api : # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - AIzaSyAN2k6IRdgw123
      - AIzaSyAN2k6IRdgw456
      - AIzaSyAN2k6IRdgw789
    model :
      - gemini-1.5-pro
      - gemini-1.5-flash-exp-0827 : gemini-1.5-flash # After renaming, the original model name gemini-1.5-flash-exp-0827 cannot be used, if you want to use the original name, you can add the original name in the model, just add the line below to use the original name
      - gemini-1.5-flash-exp-0827 # Add this line, both gemini-1.5-flash-exp-0827 and gemini-1.5-flash can be requested
    tools : true
    preferences :
      api_key_rate_limit : 15/min # Each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # api_key_rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      api_key_cooldown_period : 60 # Each API Key will be cooled down for 60 seconds after encountering a 429 error. Optional, the default is 0 seconds. When set to 0, the cooling mechanism is not enabled. When there are multiple API keys, the cooling mechanism will take effect.
      api_key_schedule_algorithm : round_robin # Set the request order of multiple API Keys, optional. The default is round_robin, and the optional values are: round_robin, random. It will take effect when there are multiple API keys. round_robin is polling load balancing, and random is random load balancing.
      model_timeout : # Model timeout, in seconds, default 100 seconds, optional
        gemini-1.5-pro : 10 # Model gemini-1.5-pro timeout is 10 seconds
        gemini-1.5-flash : 10 # Model gemini-1.5-flash timeout is 10 seconds
        default : 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the timeout is also 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
      proxy : socks5://[username]:[password]@[ip]:[port] # Proxy address, optional. Supports socks5 and http proxies, default is not used.

  - provider : vertex
    project_id : gen-lang-client-xxxxxxxxxxxxxx # Description: Your Google Cloud project ID. Format: String, usually composed of lowercase letters, numbers, and hyphens. How to obtain: You can find your project ID in the project selector of the Google Cloud Console.
    private_key : " -----BEGIN PRIVATE KEY----- n xxxxx n -----END PRIVATE " # Description: Private key for Google Cloud Vertex AI service account. Format: A JSON formatted string containing the private key information of the service account. How to obtain: Create a service account in Google Cloud Console, generate a JSON formatted key file, and then set its content as the value of this environment variable.
    client_email : [email protected] # Description: Email address of the Google Cloud Vertex AI service account. Format: Usually a string like "[email protected]". How to obtain: Generated when creating a service account, or you can view the service account details in the "IAM and Admin" section of the Google Cloud Console.
    model :
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-pro : gemini-1.5-pro-search # Only supports using the gemini-1.5-pro-search model to request uni-api when using the Vertex Gemini API, to automatically use the Google official search tool.
      - claude-3-5-sonnet@20240620 : claude-3-5-sonnet
      - claude-3-opus@20240229 : claude-3-opus
      - claude-3-sonnet@20240229 : claude-3-sonnet
      - claude-3-haiku@20240307 : claude-3-haiku
    tools : true
    notes : https://xxxxx.com/ # You can put the provider's website, notes, official documentation, optional

  - provider : cloudflare
    api : f42b3xxxxxxxxxxq4aoGAh # Cloudflare API Key, required
    cf_account_id : 8ec0xxxxxxxxxxxxe721 # Cloudflare Account ID, required
    model :
      - ' @cf/meta/llama-3.1-8b-instruct ' : llama-3.1-8b # Rename model, @cf/meta/llama-3.1-8b-instruct is the provider's original model name, must be enclosed in quotes, otherwise yaml syntax error, llama-3.1-8b is the renamed name, you can use a simple name to replace the original complex name, optional
      - ' @cf/meta/llama-3.1-8b-instruct ' # Must be enclosed in quotes, otherwise yaml syntax error

  - provider : other-provider
    base_url : https://api.xxx.com/v1/messages
    api : sk-bNnAOJyA-xQw_twAA
    model :
      - causallm-35b-beta2ep-q6k : causallm-35b
      - anthropic/claude-3-5-sonnet
    tools : false
    engine : openrouter # Force the use of a specific message format, currently supports gpt, claude, gemini, openrouter native format, optional

api_keys :
  - api : sk-KjjI60Yf0JFWxfgRmXqFWyGtWUd9GZnmi3KlvowmRWpWpQRo # API Key, required for users to use this service
    model : # Models that can be used by this API Key, required. Default channel-level polling load balancing is enabled, and each request model is requested in sequence according to the model configuration. It is not related to the original channel order in providers. Therefore, you can set different request sequences for each API key.
      - gpt-4o # Usable model name, can use all gpt-4o models provided by providers
      - claude-3-5-sonnet # Usable model name, can use all claude-3-5-sonnet models provided by providers
      - gemini/* # Usable model name, can only use all models provided by providers named gemini, where gemini is the provider name, * represents all models
    role : admin

  - api : sk-pkhf60Yf0JGyJxgRmXqFQyTgWUd9GZnmi3KlvowmRWpWqrhy
    model :
      - anthropic/claude-3-5-sonnet # Usable model name, can only use the claude-3-5-sonnet model provided by the provider named anthropic. Models with the same name from other providers cannot be used. This syntax will not match the model named anthropic/claude-3-5-sonnet provided by other-provider.
      - <anthropic/claude-3-5-sonnet> # By adding angle brackets on both sides of the model name, it will not search for the claude-3-5-sonnet model under the channel named anthropic, but will take the entire anthropic/claude-3-5-sonnet as the model name. This syntax can match the model named anthropic/claude-3-5-sonnet provided by other-provider. But it will not match the claude-3-5-sonnet model under anthropic.
      - openai-test/text-moderation-latest # When message moderation is enabled, the text-moderation-latest model under the channel named openai-test can be used for moderation.
      - sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo/* # Support using other API keys as channels
    preferences :
      SCHEDULING_ALGORITHM : fixed_priority # When SCHEDULING_ALGORITHM is fixed_priority, use fixed priority scheduling, always execute the channel of the first model with a request. Default is enabled, SCHEDULING_ALGORITHM default value is fixed_priority. SCHEDULING_ALGORITHM optional values are: fixed_priority, round_robin, weighted_round_robin, lottery, random.
      # When SCHEDULING_ALGORITHM is random, use random polling load balancing, randomly request the channel of the model with a request.
      # When SCHEDULING_ALGORITHM is round_robin, use polling load balancing, request the channel of the model used by the user in order.
      AUTO_RETRY : true # Whether to automatically retry, automatically retry the next provider, true for automatic retry, false for no automatic retry, default is true. Also supports setting a number, indicating the number of retries.
      rate_limit : 15/min # Supports rate limiting, each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      ENABLE_MODERATION : true # Whether to enable message moderation, true for enable, false for disable, default is false, when enabled, it will moderate the user's message, if inappropriate messages are found, an error message will be returned.

  # Channel-level weighted load balancing configuration example
  - api : sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo
    model :
      - gcp1/* : 5 # The number after the colon is the weight, weight only supports positive integers.
      - gcp2/* : 3 # The size of the number represents the weight, the larger the number, the greater the probability of the request.
      - gcp3/* : 2 # In this example, there are a total of 10 weights for all channels, and 10 requests will have 5 requests for the gcp1/* model, 2 requests for the gcp2/* model, and 3 requests for the gcp3/* model.

    preferences :
      SCHEDULING_ALGORITHM : weighted_round_robin # Only when SCHEDULING_ALGORITHM is weighted_round_robin and the above channel has weights, it will request according to the weighted order. Use weighted polling load balancing, request the channel of the model with a request according to the weight order. When SCHEDULING_ALGORITHM is lottery, use lottery polling load balancing, request the channel of the model with a request according to the weight randomly. Channels without weights automatically fall back to round_robin polling load balancing.
      AUTO_RETRY : true

preferences : # Global configuration
  model_timeout : # Model timeout, in seconds, default 100 seconds, optional
    gpt-4o : 10 # Model gpt-4o timeout is 10 seconds, gpt-4o is the model name, when requesting models like gpt-4o-2024-08-06, the timeout is also 10 seconds
    claude-3-5-sonnet : 10 # Model claude-3-5-sonnet timeout is 10 seconds, when requesting models like claude-3-5-sonnet-20240620, the timeout is also 10 seconds
    default : 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the default timeout is 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
    o1-mini : 30 # Model o1-mini timeout is 30 seconds, when requesting models starting with o1-mini, the timeout is 30 seconds
    o1-preview : 100 # Model o1-preview timeout is 100 seconds, when requesting models starting with o1-preview, the timeout is 100 seconds
  cooldown_period : 300 # Channel cooldown time, in seconds, default 300 seconds, optional. When a model request fails, the channel will be automatically excluded and cooled down for a period of time, and will not request the channel again. After the cooldown time ends, the model will be automatically restored until the request fails again, and it will be cooled down again. When cooldown_period is set to 0, the cooling mechanism is not enabled.
  error_triggers : # Error triggers, when the message returned by the model contains any of the strings in the error_triggers, the channel will return an error. Optional
    - The bot's usage is covered by the developer
    - process this request due to overload or policy

安裝配置文件並啟動Uni-API Docker容器：

docker run --user root -p 8001:8000 --name uni-api -dit 
-v ./api.yaml:/home/api.yaml 
yym68686/uni-api:latest

方法兩個：使用`CONFIG_URL`環境變量啟動uni-api

根據方法One編寫配置文件後，將其上傳到雲磁盤，獲取文件的直接鏈接，然後使用CONFIG_URL環境變量來啟動Uni-API Docker容器：

docker run --user root -p 8001:8000 --name uni-api -dit 
-e CONFIG_URL=http://file_url/api.yaml 
yym68686/uni-api:latest

環境變量

config_url：配置文件的下載地址，可以是本地文件或遠程文件，可選的
超時：請求超時，默認值為100秒。超時可以控制一個通道未響應時切換到下一個通道所需的時間。選修的
disable_database：是否禁用數據庫，默認值為false，可選

Vercel遠程部署

單擊上面的一單擊部署按鈕後，將環境變量CONFIG_URL設置為配置文件的直接鏈接，將DISABLE_DATABASE設置為true，然後單擊創建創建項目。部署後，您需要在設置下的Vercel項目面板中手動將功能最大持續時間設置為60秒 - >函數，然後單擊“部署”菜單，然後單擊Redeploy到Redeploy，將超時設置為60秒。如果您不重新部署，則默認超時將保留在原始的10秒鐘。請注意，您不應刪除Vercel項目並重新創建它；取而代之的是，在當前部署的Vercel項目中的“部署”菜單中單擊Redeploy，以使功能最大持續時間修改生效。

Ubuntu部署

在倉庫版本中，查找相應的二進製文件的最新版本，例如一個名為uni-api-linux-x86_64-0.0.0.0.99.pex的文件。在服務器上下載二進製文件並運行它：

wget https://github.com/yym68686/uni-api/releases/download/v0.0.99/uni-api-linux-x86_64-0.0.99.pex
chmod +x uni-api-linux-x86_64-0.0.99.pex
./uni-api-linux-x86_64-0.0.99.pex

SERV00遠程部署（FreeBSD 14.0）

首先，登錄到面板，在其他服務中單擊“選項卡”運行您自己的應用程序以啟用自己的程序運行您自己的程序，然後轉到面板端口預訂以隨機打開端口。

如果您沒有自己的域名，請轉到面板www網站並刪除提供的默認域名。然後創建一個新的域，該域是您剛剛刪除的域。單擊高級設置後，將網站類型設置為代理域，代理端口應指向您剛打開的端口。請勿選擇使用HTTP。

SSH登錄到SERV00服務器，執行以下命令：

git clone --depth 1 -b main --quiet https://github.com/yym68686/uni-api.git
cd uni-api
python -m venv uni-api
tmux new -s uni-api
source uni-api/bin/activate
export CFLAGS= " -I/usr/local/include "
export CXXFLAGS= " -I/usr/local/include "
export CC=gcc
export CXX=g++
export MAX_CONCURRENCY=1
export CPUCOUNT=1
export MAKEFLAGS= " -j1 "
CMAKE_BUILD_PARALLEL_LEVEL=1 cpuset -l 0 pip install -vv -r requirements.txt
cpuset -l 0 pip install -r -vv requirements.txt

CTRL+BD退出TMUX，等待幾個小時的安裝完成，並且安裝完成後，執行以下命令：

tmux attach -t uni-api
source uni-api/bin/activate
export CONFIG_URL=http://file_url/api.yaml
export DISABLE_DATABASE=true
# Modify the port, xxx is the port, modify it yourself, corresponding to the port opened in the panel Port reservation
sed -i ' ' ' s/port=8000/port=xxx/ ' main.py
sed -i ' ' ' s/reload=True/reload=False/ ' main.py
python main.py

使用CTRL+BD退出TMUX，使程序可以在後台運行。此時，您可以在其他聊天客戶端中使用Uni-API。捲曲測試腳本：

curl -X POST https://xxx.serv00.net/v1/chat/completions 
-H ' Content-Type: application/json ' 
-H ' Authorization: Bearer sk-xxx ' 
-d ' {"model": "gpt-4o","messages": [{"role": "user","content": "Hello"}]} '

參考文檔：

https://docs.serv00.com/python/

https://linux.do/t/topic/201181

https://linux.do/t/topic/218738

Docker本地部署

啟動容器

docker run --user root -p 8001:8000 --name uni-api -dit 
-e CONFIG_URL=http://file_url/api.yaml  # If the local configuration file has already been mounted, there is no need to set CONFIG_URL
-v ./api.yaml:/home/api.yaml  # If CONFIG_URL is already set, there is no need to mount the configuration file
-v ./uniapi_db:/home/data  # If you do not want to save statistical data, there is no need to mount this folder
yym68686/uni-api:latest

或者，如果您想使用Docker組合，這裡是Docker-Compose.yml示例：

 services :
  uni-api :
    container_name : uni-api
    image : yym68686/uni-api:latest
    environment :
      - CONFIG_URL=http://file_url/api.yaml # If a local configuration file is already mounted, there is no need to set CONFIG_URL
    ports :
      - 8001:8000
    volumes :
      - ./api.yaml:/home/api.yaml # If CONFIG_URL is already set, there is no need to mount the configuration file
      - ./uniapi_db:/home/data # If you do not want to save statistical data, there is no need to mount this folder

config_url是可以自動下載的遠程配置文件的URL。例如，如果您不舒服地修改了某個平台上的配置文件，則可以將配置文件上傳到託管服務，並為UNI-API提供直接鏈接以下載，即config_url。如果您使用的是本地安裝的配置文件，則無需設置config_url。當不方便地安裝配置文件時，使用config_url。

在後台運行Docker組合容器

docker-compose pull
docker-compose up -d

Docker Build

docker build --no-cache -t uni-api:latest -f Dockerfile --platform linux/amd64 .
docker tag uni-api:latest yym68686/uni-api:latest
docker push yym68686/uni-api:latest

一單擊重新啟動Docker圖像

 set -eu
docker pull yym68686/uni-api:latest
docker rm -f uni-api
docker run --user root -p 8001:8000 -dit --name uni-api 
-e CONFIG_URL=http://file_url/api.yaml 
-v ./api.yaml:/home/api.yaml 
-v ./uniapi_db:/home/data 
yym68686/uni-api:latest
docker logs -f uni-api

恢復捲曲測試

curl -X POST http://127.0.0.1:8000/v1/chat/completions 
-H " Content-Type: application/json " 
-H " Authorization: Bearer ${API} " 
-d ' {"model": "gpt-4o","messages": [{"role": "user", "content": "Hello"}],"stream": true} '

PEX Linux包裝：

VERSION= $( cat VERSION )
pex -D . -r requirements.txt 
    -c uvicorn 
    --inject-args ' main:app --host 0.0.0.0 --port 8000 ' 
    --platform linux_x86_64-cp-3.10.12-cp310 
    --interpreter-constraint ' ==3.10.* ' 
    --no-strip-pex-env 
    -o uni-api-linux-x86_64- ${VERSION} .pex

MacOS包裝：

VERSION= $( cat VERSION )
pex -r requirements.txt 
    -c uvicorn 
    --inject-args ' main:app --host 0.0.0.0 --port 8000 ' 
    -o uni-api-macos-arm64- ${VERSION} .pex

贊助商

我們感謝以下贊助商的支持：

@PowerHunter：¥2000
@IOI：¥50

如何讚助我們

如果您想支持我們的項目，則可以通過以下方式贊助我們：

貝寶
USDT-TRC20，USDT-TRC20錢包地址： TLFbqSv5pDu5he43mVmK1dNx7yBMFeN7d8
微信
支付寶

謝謝您的支持！

常問問題

為什麼錯誤Error processing request or performing moral check: 404: No matching model found總是出現的？

將enable_moderation設置為false將解決此問題。當enable_moderation為True時，API必須能夠使用文本模型最終模型，並且如果您在提供商模型設置中未提供最終文本模型 - 最終，則會發生錯誤，表明找不到該模型。

如何優先考慮特定頻道的請求，如何設置頻道的優先級？

直接在API_KEYS中設置通道順序。無需其他設置。示例配置文件：

 providers :
  - provider : ai1
    base_url : https://xxx/v1/chat/completions
    api : sk-xxx

  - provider : ai2
    base_url : https://xxx/v1/chat/completions
    api : sk-xxx

api_keys :
  - api : sk-1234
    model :
      - ai2/*
      - ai1/*

這樣，首先請求AI2，如果失敗，請求AI1。

各種調度算法背後的行為是什麼？例如，fixed_priority，加權_ROUND_ROBIN，彩票，隨機，round_robin？

所有調度算法都需要通過設置api_keys啟用。（API）.preferences.preferences.scheduling_algorithm配置文件中的任何值：fixed_priority，firesed_round_robibin，lottery，lottery，lottery，lottery，andar inter，randy，round_robin。

fixe_priority：固定優先級計劃。所有請求始終由首先具有用戶請求的模型頻道執行。如果發生錯誤，它將切換到下一個通道。這是默認的調度算法。
加權_ROUND_ROBIN：加權圓形旋轉負載平衡，根據配置文件api_keys中設置的權重訂單。
彩票：繪製圓形旋轉負載平衡，根據配置文件api_keys中的權重。（API）.model隨機隨機請求模型的通道。
Round_robin：圓形載荷負載平衡，請求根據配置文件API_KEYS中的配置訂單。您可以檢查有關如何設置頻道優先級的上一個問題。

base_url應該如何正確填充？

除了高級配置中顯示的某些特殊頻道外，所有OpenAI格式提供商都需要完全填寫base_url，這意味著base_url必須以/v1/chat/completions結尾。如果您使用的是github型號，則應將base_url填充為https://models.inference.ai.ai.azure.com/chat/completions，而不是Azure的URL。

模型超時時間如何工作？渠道級超時設置和全局模型超時設置的優先級是什麼？

頻道級超時設置的優先級高於全局模型超時設置。優先順序是：通道級模型超時設置>頻道級默認超時設置>全局模型超時設置>全局默認超時設置>“環境變量timeout”。

通過調整模型超時時間，您可以避免某些通道的錯誤時間。如果遇到錯誤{'error': '500', 'details': 'fetch_response_stream Read Response Timeout'} ，請嘗試增加模型超時時間。

API_KEY_RATE_LIMIT如何工作？如何為多個型號設置相同的速率限制？

如果要為GEMINI-1.5-PRO-LATEST，Gemini-1.5-Pro，Gemini-1.5-Pro-001，Gemini-1.5-Pro-002設置相同的頻率限制，同時可以這樣設置：

 api_key_rate_limit :
  gemini-1.5-pro : 1000/min

這將匹配所有包含Gemini-1.5-Pro字符串的模型。這四個型號的頻率限制，Gemini-1.5-Pro-latest，Gemini-1.5-Pro，Gemini-1.5-Pro-001，Gemini-1.5-Pro-002，都將設置為1000/min。配置API_KEY_RATE_LIMIT字段的邏輯如下，這是一個示例配置文件：

 api_key_rate_limit :
  gemini-1.5-pro : 1000/min
  gemini-1.5-pro-002 : 500/min

目前，如果有型號Gemini-1.5-Pro-002的請求。

首先，Uni-API將嘗試在API_KEY_RATE_LIMIT中精確匹配該模型。如果設置了Gemini-1.5-Pro-002的速率限制，則GEMINI-1.5-PRO-002的速率限制為500/min。 If the requested model at this time is not gemini-1.5-pro-002, but gemini-1.5-pro-latest, since the api_key_rate_limit does not have a rate limit set for gemini-1.5-pro-latest, it will look for any model with the same prefix as gemini-1.5-pro-latest that has been set, thus the rate limit for gemini-1.5-pro-latest will be set to 1000/min.

星曆史

展開

uni api

Uni-Api

介紹

特徵

用法方法

方法1：安裝`api.yaml`配置文件以啟動UNI-API

方法兩個：使用`CONFIG_URL`環境變量啟動uni-api

環境變量

Vercel遠程部署

Ubuntu部署

SERV00遠程部署（FreeBSD 14.0）

Docker本地部署

贊助商

如何讚助我們

常問問題

星曆史

evolution api

歐洲大學

施法者大學

大學

破碎的大學

坦克大學

chat.petals.dev

GPT Prompt Templates

GPTyped

waymo open dataset

Sunamu

chat.petals.dev

waymo open dataset

termwind

wp functions

uni api

Uni-Api

介紹

特徵

用法方法

方法1：安裝api.yaml配置文件以啟動UNI-API

方法兩個：使用CONFIG_URL環境變量啟動uni-api

環境變量

Vercel遠程部署

Ubuntu部署

SERV00遠程部署（FreeBSD 14.0）

Docker本地部署

贊助商

如何讚助我們

常問問題

星曆史

方法1：安裝`api.yaml`配置文件以啟動UNI-API

方法兩個：使用`CONFIG_URL`環境變量啟動uni-api