uni-api

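Introduction

For personal use, One/New-API is too complex and carries many commercial features that individuals do not need. If you do not want a complicated front-end interface and would like support for more models, you can try uni-api. The project unifies the management of large language model APIs: it lets you call multiple backend services through a single unified API interface, converts them all to the OpenAI format, and supports load balancing. Currently supported backend services include OpenAI, Anthropic, Gemini, Vertex, Cohere, Groq, Cloudflare, OpenRouter, and more.

Features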

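No front end: API channels are configured purely through a configuration file. You can run your own API station just by writing one file, and the documentation includes a detailed, beginner-friendly configuration guide.
Unified management of multiple backend services, supporting providers such as OpenAI, DeepSeek, OpenRouter, and other APIs in the OpenAI format. Supports OpenAI DALL-E 3 image generation.
Also supports Anthropic, Gemini, Vertex AI, Cohere, Groq, and Cloudflare. Vertex supports both the Claude and Gemini APIs.
Supports native tool use (function calling) for OpenAI, Anthropic, Gemini, and Vertex.
Supports the native image recognition APIs of OpenAI, Anthropic, Gemini, and Vertex.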
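Supports four types of load balancing:
1. Channel-level weighted load balancing, which distributes requests according to channel weights. Not enabled by default; channel weights must be configured.
2. Vertex regional load balancing and high concurrency, which can increase Gemini and Claude concurrency by up to (number of APIs × number of regions) times. Enabled automatically without additional configuration.
3. Apart from Vertex region-level load balancing, all APIs support channel-level sequential load balancing, which improves the immersive translation experience. Not enabled by default; SCHEDULING_ALGORITHM must be set to round_robin.
4. Automatic API-key-level round-robin load balancing across multiple API keys in a single channel.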
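Automatic retry: when an API channel fails to respond, the next API channel is retried automatically.
Channel cooldown: when an API channel fails to respond, the channel is automatically excluded and cooled down for a period of time, during which no requests are sent to it. After the cooldown period ends, the channel is restored automatically until it fails again, at which point it is cooled down again.
Fine-grained model timeout settings, allowing a different timeout duration for each model.
Fine-grained permission control: wildcards can be used to set the specific models available to each API key's channels.
Rate limiting: the maximum number of requests per time period is set as an integer count and a unit, for example 2/min (2 per minute), 5/hour (5 per hour), 10/day (10 per day), 10/month (10 per month), or 10/year (10 per year). The default is 60/min.
Supports multiple standard OpenAI-format endpoints: /v1/chat/completions, /v1/images/generations, /v1/audio/transcriptions, /v1/moderations, /v1/models.
Supports OpenAI moderation, which reviews user messages for inappropriate content. If an inappropriate message is found, an error message is returned. This reduces the risk of the backend API being banned by providers.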
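
The rate-limit strings mentioned above (for example 15/min,10/day) can be read as a list of independent constraints. The following is a minimal sketch of that idea, not uni-api's actual implementation: the function names are hypothetical, and the lengths chosen for month (30 days) and year (365 days) are assumptions made for illustration.

```python
# Seconds per unit. The month and year lengths are assumptions for this sketch.
UNIT_SECONDS = {
    "min": 60,
    "hour": 3600,
    "day": 86400,
    "month": 30 * 86400,
    "year": 365 * 86400,
}


def parse_rate_limit(spec):
    """Parse a string such as "15/min,10/day" into (count, window_seconds) pairs."""
    limits = []
    for part in spec.split(","):
        count, unit = part.strip().split("/")
        limits.append((int(count), UNIT_SECONDS[unit]))
    return limits


def allowed(timestamps, now, limits):
    """Sliding-window check: every constraint must still have room for one more request."""
    return all(
        sum(1 for t in timestamps if now - t < window) < count
        for count, window in limits
    )


limits = parse_rate_limit("15/min,10/day")
print(limits)
print(allowed([0.0, 1.0], now=2.0, limits=parse_rate_limit("2/min")))
```

A request is accepted only when all constraints pass, which is why a key limited to 15/min,10/day stops at the tenth request of the day even if the per-minute budget is not used up.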

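Usage

To start uni-api, a configuration file must be used. There are two ways to provide it:

Mount a configuration file named api.yaml into the container (Method 1).
Use the CONFIG_URL environment variable to provide the URL of the configuration file, which is downloaded automatically when uni-api starts (Method 2).

Method 1: Mount the `api.yaml` configuration file to start uni-api

To start uni-api, write the configuration file in advance. The file must be named api.yaml. You can configure multiple models, and each model can be served by multiple backend services with load balancing. Below is an example of the minimal api.yaml configuration file that can be run: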

providers:
  - provider: provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address, required
    api: sk-YgS6GTi0b4bEabc4C # Provider's API Key, required, automatically uses base_url and api to get all available models through the /v1/models endpoint.
  # Multiple providers can be configured here, each provider can configure multiple API Keys, and each API Key can configure multiple models.
api_keys:
  - api: sk-Pkj60Yf8JFWxfgRmXQFWyGtWUddGZnmi3KlvowmRWpWpQxx # API Key that users must send when calling uni-api, required
  # This API Key can use all models, that is, it can use all models in all channels set under providers, without needing to add available channels one by one.

Detailed advanced configuration of api.yaml:

providers:
  - provider: provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address, required
    api: sk-YgS6GTi0b4bEabc4C # Provider's API Key, required
    model: # Optional, if model is not configured, all available models will be automatically obtained through base_url and api via the /v1/models endpoint.
      - gpt-4o # Usable model name, required
      - claude-3-5-sonnet-20240620: claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
      - dall-e-3

  - provider: anthropic
    base_url: https://api.anthropic.com/v1/messages
    api: # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - sk-ant-api03-bNnAOJyA-xQw_twAA
      - sk-ant-api02-bNnxxxx
    model:
      - claude-3-5-sonnet-20240620: claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
    tools: true # Whether to support tools, such as generating code, generating documents, etc., default is true, optional

  - provider: gemini
    base_url: https://generativelanguage.googleapis.com/v1beta # base_url supports v1beta/v1, only for Gemini model use, required
    api: # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - AIzaSyAN2k6IRdgw123
      - AIzaSyAN2k6IRdgw456
      - AIzaSyAN2k6IRdgw789
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash-exp-0827: gemini-1.5-flash # After renaming, the original model name gemini-1.5-flash-exp-0827 cannot be used, if you want to use the original name, you can add the original name in the model, just add the line below to use the original name
      - gemini-1.5-flash-exp-0827 # Add this line, both gemini-1.5-flash-exp-0827 and gemini-1.5-flash can be requested
    tools: true
    preferences:
      api_key_rate_limit: 15/min # Each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # api_key_rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      api_key_cooldown_period: 60 # Each API Key will be cooled down for 60 seconds after encountering a 429 error. Optional, the default is 0 seconds. When set to 0, the cooling mechanism is not enabled. When there are multiple API keys, the cooling mechanism will take effect.
      api_key_schedule_algorithm: round_robin # Set the request order of multiple API Keys, optional. The default is round_robin, and the optional values are: round_robin, random. It will take effect when there are multiple API keys. round_robin is polling load balancing, and random is random load balancing.
      model_timeout: # Model timeout, in seconds, default 100 seconds, optional
        gemini-1.5-pro: 10 # Model gemini-1.5-pro timeout is 10 seconds
        gemini-1.5-flash: 10 # Model gemini-1.5-flash timeout is 10 seconds
        default: 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the timeout is also 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
      proxy: socks5://[username]:[password]@[ip]:[port] # Proxy address, optional. Supports socks5 and http proxies, default is not used.

  - provider: vertex
    project_id: gen-lang-client-xxxxxxxxxxxxxx # Description: Your Google Cloud project ID. Format: String, usually composed of lowercase letters, numbers, and hyphens. How to obtain: You can find your project ID in the project selector of the Google Cloud Console.
    private_key: "-----BEGIN PRIVATE KEY-----\nxxxxx\n-----END PRIVATE" # Description: Private key for Google Cloud Vertex AI service account. Format: A JSON formatted string containing the private key information of the service account. How to obtain: Create a service account in Google Cloud Console, generate a JSON formatted key file, and then set its content as the value of this environment variable.
    client_email: [email protected] # Description: Email address of the Google Cloud Vertex AI service account. Format: Usually a string like "[email protected]". How to obtain: Generated when creating a service account, or you can view the service account details in the "IAM and Admin" section of the Google Cloud Console.
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-pro: gemini-1.5-pro-search # Only supports using the gemini-1.5-pro-search model to request uni-api when using the Vertex Gemini API, to automatically use the Google official search tool.
      - claude-3-5-sonnet@20240620: claude-3-5-sonnet
      - claude-3-opus@20240229: claude-3-opus
      - claude-3-sonnet@20240229: claude-3-sonnet
      - claude-3-haiku@20240307: claude-3-haiku
    tools: true
    notes: https://xxxxx.com/ # You can put the provider's website, notes, official documentation, optional

  - provider: cloudflare
    api: f42b3xxxxxxxxxxq4aoGAh # Cloudflare API Key, required
    cf_account_id: 8ec0xxxxxxxxxxxxe721 # Cloudflare Account ID, required
    model:
      - '@cf/meta/llama-3.1-8b-instruct': llama-3.1-8b # Rename model, @cf/meta/llama-3.1-8b-instruct is the provider's original model name, must be enclosed in quotes, otherwise yaml syntax error, llama-3.1-8b is the renamed name, you can use a simple name to replace the original complex name, optional
      - '@cf/meta/llama-3.1-8b-instruct' # Must be enclosed in quotes, otherwise yaml syntax error

  - provider: other-provider
    base_url: https://api.xxx.com/v1/messages
    api: sk-bNnAOJyA-xQw_twAA
    model:
      - causallm-35b-beta2ep-q6k: causallm-35b
      - anthropic/claude-3-5-sonnet
    tools: false
    engine: openrouter # Force the use of a specific message format, currently supports gpt, claude, gemini, openrouter native format, optional

api_keys:
  - api: sk-KjjI60Yf0JFWxfgRmXqFWyGtWUd9GZnmi3KlvowmRWpWpQRo # API Key, required for users to use this service
    model: # Models that can be used by this API Key, required. Channels are tried in the order listed here, which is independent of the channel order in providers, so each API key can have its own request sequence.
      - gpt-4o # Usable model name, can use all gpt-4o models provided by providers
      - claude-3-5-sonnet # Usable model name, can use all claude-3-5-sonnet models provided by providers
      - gemini/* # Usable model name, can only use all models provided by providers named gemini, where gemini is the provider name, * represents all models
    role: admin

  - api: sk-pkhf60Yf0JGyJxgRmXqFQyTgWUd9GZnmi3KlvowmRWpWqrhy
    model:
      - anthropic/claude-3-5-sonnet # Usable model name, can only use the claude-3-5-sonnet model provided by the provider named anthropic. Models with the same name from other providers cannot be used. This syntax will not match the model named anthropic/claude-3-5-sonnet provided by other-provider.
      - <anthropic/claude-3-5-sonnet> # By adding angle brackets on both sides of the model name, it will not search for the claude-3-5-sonnet model under the channel named anthropic, but will take the entire anthropic/claude-3-5-sonnet as the model name. This syntax can match the model named anthropic/claude-3-5-sonnet provided by other-provider. But it will not match the claude-3-5-sonnet model under anthropic.
      - openai-test/text-moderation-latest # When message moderation is enabled, the text-moderation-latest model under the channel named openai-test can be used for moderation.
      - sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo/* # Support using other API keys as channels
    preferences:
      SCHEDULING_ALGORITHM: fixed_priority # When SCHEDULING_ALGORITHM is fixed_priority, use fixed priority scheduling: always send the request to the first channel that provides the requested model. This is the default. Optional values are: fixed_priority, round_robin, weighted_round_robin, lottery, random.
      # When SCHEDULING_ALGORITHM is random, use random load balancing: pick a channel that provides the requested model at random.
      # When SCHEDULING_ALGORITHM is round_robin, use polling load balancing: request the channels that provide the requested model in order.
      AUTO_RETRY: true # Whether to automatically retry, automatically retry the next provider, true for automatic retry, false for no automatic retry, default is true. Also supports setting a number, indicating the number of retries.
      rate_limit: 15/min # Supports rate limiting, each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      ENABLE_MODERATION: true # Whether to enable message moderation, true for enable, false for disable, default is false, when enabled, it will moderate the user's message, if inappropriate messages are found, an error message will be returned.

  # Channel-level weighted load balancing configuration example
  - api: sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo
    model:
      - gcp1/*: 5 # The number after the colon is the weight, weight only supports positive integers.
      - gcp2/*: 3 # The size of the number represents the weight, the larger the number, the greater the probability of the request.
      - gcp3/*: 2 # In this example, the weights of all channels add up to 10, so out of 10 requests, 5 go to the gcp1/* model, 3 to the gcp2/* model, and 2 to the gcp3/* model.

    preferences:
      SCHEDULING_ALGORITHM: weighted_round_robin # Requests follow the weighted order only when SCHEDULING_ALGORITHM is weighted_round_robin and the channels above have weights. When SCHEDULING_ALGORITHM is lottery, a channel that provides the requested model is drawn at random according to the weights. Channels without weights automatically fall back to round_robin polling load balancing.
      AUTO_RETRY: true

preferences: # Global configuration
  model_timeout: # Model timeout, in seconds, default 100 seconds, optional
    gpt-4o: 10 # Model gpt-4o timeout is 10 seconds, gpt-4o is the model name, when requesting models like gpt-4o-2024-08-06, the timeout is also 10 seconds
    claude-3-5-sonnet: 10 # Model claude-3-5-sonnet timeout is 10 seconds, when requesting models like claude-3-5-sonnet-20240620, the timeout is also 10 seconds
    default: 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the default timeout is 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
    o1-mini: 30 # Model o1-mini timeout is 30 seconds, when requesting models starting with o1-mini, the timeout is 30 seconds
    o1-preview: 100 # Model o1-preview timeout is 100 seconds, when requesting models starting with o1-preview, the timeout is 100 seconds
  cooldown_period: 300 # Channel cooldown time, in seconds, default 300 seconds, optional. When a model request fails, the channel will be automatically excluded and cooled down for a period of time, and will not request the channel again. After the cooldown time ends, the model will be automatically restored until the request fails again, and it will be cooled down again. When cooldown_period is set to 0, the cooling mechanism is not enabled.
  error_triggers: # Error triggers, when the message returned by the model contains any of the strings in the error_triggers, the channel will return an error. Optional
    - The bot's usage is covered by the developer
    - process this request due to overload or policy
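
The matching rules described in the comments on the api_keys model entries (plain name, provider/model, provider/*, and the angle-bracket form) can be sketched as a small function. This is an illustrative reading of those comments, not uni-api's own code: rule_matches is a hypothetical name, and weights and the API-key-as-channel form are left out.

```python
def rule_matches(rule, provider, model):
    """Return True if an api_keys model rule allows `model` served by `provider`.

    `model` is the name the channel exposes (the renamed name if one is set).
    """
    # <name>: treat the whole bracketed text as a literal model name.
    if rule.startswith("<") and rule.endswith(">"):
        return rule[1:-1] == model
    # provider/model or provider/*: restrict the rule to one named channel.
    if "/" in rule:
        rule_provider, rule_model = rule.split("/", 1)
        return rule_provider == provider and rule_model in ("*", model)
    # Plain model name: any channel that provides this model.
    return rule == model


print(rule_matches("gemini/*", "gemini", "gemini-1.5-pro"))
print(rule_matches("anthropic/claude-3-5-sonnet", "other-provider", "anthropic/claude-3-5-sonnet"))
print(rule_matches("<anthropic/claude-3-5-sonnet>", "other-provider", "anthropic/claude-3-5-sonnet"))
```

The angle brackets exist because a slash is otherwise read as the separator between channel name and model name, which would make a model literally named anthropic/claude-3-5-sonnet unreachable.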

Mount the configuration file and start the uni-api Docker container:

docker run --user root -p 8001:8000 --name uni-api -dit \
-v ./api.yaml:/home/api.yaml \
yym68686/uni-api:latest

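Method 2: Use the `CONFIG_URL` environment variable to start uni-api

After writing the configuration file as described in Method 1, upload it to a cloud drive, get the direct link to the file, and then start the uni-api Docker container with the CONFIG_URL environment variable: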

docker run --user root -p 8001:8000 --name uni-api -dit \
-e CONFIG_URL=http://file_url/api.yaml \
yym68686/uni-api:latest

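Environment variables

CONFIG_URL: download address of the configuration file, which can be a local file or a remote file. Optional.
TIMEOUT: request timeout, 100 seconds by default. The timeout controls how long uni-api waits before switching to the next channel when a channel does not respond. Optional.
DISABLE_DATABASE: whether to disable the database, false by default. Optional.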
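
As a rough illustration of how these three variables and their defaults fit together, the sketch below reads them from an environment mapping. The function name and the returned dictionary keys are hypothetical, and treating only the string "true" (case-insensitive) as enabling DISABLE_DATABASE is an assumption of this sketch.

```python
import os


def read_settings(env=None):
    """Read the documented environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        "config_url": env.get("CONFIG_URL"),  # optional, no default
        "timeout": int(env.get("TIMEOUT", "100")),  # seconds
        "disable_database": env.get("DISABLE_DATABASE", "false").lower() == "true",
    }


print(read_settings({"TIMEOUT": "60", "DISABLE_DATABASE": "true"}))
```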

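Vercel remote deployment

After clicking the one-click deploy button, set the environment variable CONFIG_URL to the direct link of the configuration file and DISABLE_DATABASE to true, then click Create to create the project. After deployment, manually set Function Max Duration to 60 seconds under Settings -> Functions in the Vercel project panel, then open the Deployments menu and click Redeploy so that the 60-second timeout takes effect. Without a redeploy, the timeout stays at the default of 10 seconds. Do not delete and recreate the Vercel project; instead, click Redeploy in the Deployments menu of the currently deployed project so that the Function Max Duration change takes effect.

Ubuntu deployment

In the repository's Releases, find the latest version of the binary for your platform, for example a file named uni-api-linux-x86_64-0.0.99.pex. Download the binary to the server and run it: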

wget https://github.com/yym68686/uni-api/releases/download/v0.0.99/uni-api-linux-x86_64-0.0.99.pex
chmod +x uni-api-linux-x86_64-0.0.99.pex
./uni-api-linux-x86_64-0.0.99.pex

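Serv00 remote deployment (FreeBSD 14.0)

First, log in to the panel. Under Additional services, click the Run your own applications tab and enable the option that lets you run your own programs. Then go to Port reservation in the panel and open a random port.

If you do not have your own domain name, go to WWW websites in the panel and delete the default domain name provided. Then create a new domain, using the domain you just deleted as its name. Click Advanced settings, set the website type to Proxy domain, and point the proxy port to the port you just opened. Do not select Use HTTPS.

Log in to the Serv00 server over SSH and run the following commands: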

git clone --depth 1 -b main --quiet https://github.com/yym68686/uni-api.git
cd uni-api
python -m venv uni-api
tmux new -s uni-api
source uni-api/bin/activate
export CFLAGS="-I/usr/local/include"
export CXXFLAGS="-I/usr/local/include"
export CC=gcc
export CXX=g++
export MAX_CONCURRENCY=1
export CPUCOUNT=1
export MAKEFLAGS="-j1"
CMAKE_BUILD_PARALLEL_LEVEL=1 cpuset -l 0 pip install -vv -r requirements.txt
cpuset -l 0 pip install -vv -r requirements.txt

Press Ctrl+B, then D to detach from tmux, and wait for the installation to finish. Once it has finished, run the following commands:

tmux attach -t uni-api
source uni-api/bin/activate
export CONFIG_URL=http://file_url/api.yaml
export DISABLE_DATABASE=true
# Modify the port, xxx is the port, modify it yourself, corresponding to the port opened in the panel Port reservation
sed -i '' 's/port=8000/port=xxx/' main.py
sed -i '' 's/reload=True/reload=False/' main.py
python main.py

Press Ctrl+B, then D to detach from tmux so that the program keeps running in the background. uni-api can now be used from other chat clients. A curl test script:

curl -X POST https://xxx.serv00.net/v1/chat/completions \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-xxx' \
-d '{"model": "gpt-4o","messages": [{"role": "user","content": "Hello"}]}'

Reference documents:

https://docs.serv00.com/python/

https://linux.do/t/topic/201181

https://linux.do/t/topic/218738

Docker local deployment

Start the container

# Set CONFIG_URL only if no local configuration file is mounted.
# Mount ./api.yaml only if CONFIG_URL is not set.
# Mount ./uniapi_db only if you want to save statistical data.
docker run --user root -p 8001:8000 --name uni-api -dit \
-e CONFIG_URL=http://file_url/api.yaml \
-v ./api.yaml:/home/api.yaml \
-v ./uniapi_db:/home/data \
yym68686/uni-api:latest

Or, if you want to use Docker Compose, here is a docker-compose.yml example:

services:
  uni-api:
    container_name: uni-api
    image: yym68686/uni-api:latest
    environment:
      - CONFIG_URL=http://file_url/api.yaml # If a local configuration file is already mounted, there is no need to set CONFIG_URL
    ports:
      - 8001:8000
    volumes:
      - ./api.yaml:/home/api.yaml # If CONFIG_URL is already set, there is no need to mount the configuration file
      - ./uniapi_db:/home/data # If you do not want to save statistical data, there is no need to mount this folder

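CONFIG_URL is the URL of a remote configuration file that is downloaded automatically. For example, if it is inconvenient to modify the configuration file on a given platform, you can upload it to a hosting service and give uni-api the direct link as CONFIG_URL. If you use a locally mounted configuration file, there is no need to set CONFIG_URL; it is intended for cases where mounting the file is not convenient.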
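
The choice between a mounted file and CONFIG_URL can be pictured as follows. This is a sketch only: load_config_text is a hypothetical helper, and the order shown here (use the local file when it exists, otherwise download from CONFIG_URL) is an assumption rather than documented uni-api behaviour.

```python
import os
import urllib.request
from pathlib import Path


def load_config_text(local_path="api.yaml", env=None):
    """Return the raw YAML text of the configuration file."""
    env = os.environ if env is None else env
    path = Path(local_path)
    if path.is_file():
        # A mounted file is present, so CONFIG_URL is not needed.
        return path.read_text(encoding="utf-8")
    url = env.get("CONFIG_URL")
    if not url:
        raise FileNotFoundError("no api.yaml found and CONFIG_URL is not set")
    # Download the remote configuration file.
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling load_config_text("/home/api.yaml") would read the mounted file from the path used in the Docker examples; with no file at that path, the text is fetched from CONFIG_URL instead.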

Run the Docker Compose container in the background

docker-compose pull
docker-compose up -d

Docker build

docker build --no-cache -t uni-api:latest -f Dockerfile --platform linux/amd64 .
docker tag uni-api:latest yym68686/uni-api:latest
docker push yym68686/uni-api:latest

One-click restart of the Docker image

set -eu
docker pull yym68686/uni-api:latest
docker rm -f uni-api
docker run --user root -p 8001:8000 -dit --name uni-api \
-e CONFIG_URL=http://file_url/api.yaml \
-v ./api.yaml:/home/api.yaml \
-v ./uniapi_db:/home/data \
yym68686/uni-api:latest
docker logs -f uni-api

RESTful curl test

curl -X POST http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${API}" \
-d '{"model": "gpt-4o","messages": [{"role": "user", "content": "Hello"}],"stream": true}'
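
The same request can be built from Python with only the standard library. The sketch below constructs the request without sending it; build_chat_request is a hypothetical helper, and the base URL and the sk-xxx key are placeholders taken from the examples in this document.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, prompt):
    """Build (but do not send) a POST to the OpenAI-format chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request("http://127.0.0.1:8000", "sk-xxx", "gpt-4o", "Hello")

# To send it to a running instance and print the streamed lines:
# with urllib.request.urlopen(request) as response:
#     for line in response:
#         print(line.decode("utf-8"), end="")
```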

PEX Linux packaging:

VERSION=$(cat VERSION)
pex -D . -r requirements.txt \
    -c uvicorn \
    --inject-args 'main:app --host 0.0.0.0 --port 8000' \
    --platform linux_x86_64-cp-3.10.12-cp310 \
    --interpreter-constraint '==3.10.*' \
    --no-strip-pex-env \
    -o uni-api-linux-x86_64-${VERSION}.pex

macOS packaging:

VERSION=$(cat VERSION)
pex -r requirements.txt \
    -c uvicorn \
    --inject-args 'main:app --host 0.0.0.0 --port 8000' \
    -o uni-api-macos-arm64-${VERSION}.pex

Sponsors

We thank the following sponsors for their support:

@PowerHunter: ¥2000
@ioi: ¥50

How to sponsor us

If you would like to support the project, you can sponsor us in the following ways:

PayPal
USDT-TRC20, USDT-TRC20 wallet address: TLFbqSv5pDu5he43mVmK1dNx7yBMFeN7d8
WeChat
Alipay

Thank you for your support!

FAQ

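Why does the error Error processing request or performing moral check: 404: No matching model found appear?

Setting ENABLE_MODERATION to false fixes this. When ENABLE_MODERATION is true, the API key must be able to use the text-moderation-latest model. If text-moderation-latest is not provided in any provider's model settings, an error is raised saying that the model cannot be found.

How do I prioritize requests to a specific channel, and how do I set a channel's priority?

Set the channel order directly in api_keys. No other settings are required. Sample configuration file: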

providers:
  - provider: ai1
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

  - provider: ai2
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

api_keys:
  - api: sk-1234
    model:
      - ai2/*
      - ai1/*

With this configuration, ai2 is requested first, and ai1 is requested only if ai2 fails.
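
The combination of ordered fallback and channel cooldown can be sketched as below. This is an illustration under stated assumptions, not uni-api's implementation: ChannelPool and the send callback are hypothetical names, any exception is treated as a channel failure, and the clock is injectable only so the behaviour can be demonstrated without waiting.

```python
import time


class ChannelPool:
    """Try channels in configured order and cool down the ones that fail."""

    def __init__(self, channels, cooldown_period=300.0, clock=time.monotonic):
        self.channels = list(channels)
        self.cooldown_period = cooldown_period  # seconds; 0 disables cooldown
        self.clock = clock
        self.cooling_until = {}  # channel name -> time it becomes usable again

    def available(self):
        """Channels that are not cooling down, in their configured order."""
        now = self.clock()
        return [
            c for c in self.channels
            if self.cooling_until.get(c, float("-inf")) <= now
        ]

    def call(self, send):
        """Send through the first channel that works; fall back to the next on error."""
        last_error = None
        for channel in self.available():
            try:
                return send(channel)
            except Exception as exc:  # a real proxy would inspect the error type
                last_error = exc
                if self.cooldown_period > 0:
                    self.cooling_until[channel] = self.clock() + self.cooldown_period
        raise RuntimeError("all channels failed or are cooling down") from last_error
```

With channels ["ai2", "ai1"], a failure on ai2 makes the same call fall through to ai1, and ai2 is skipped by later calls until cooldown_period seconds have passed.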

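How do the different scheduling algorithms behave? For example, fixed_priority, weighted_round_robin, lottery, random, and round_robin?

Every scheduling algorithm is enabled by setting api_keys.(api).preferences.SCHEDULING_ALGORITHM in the configuration file to one of the values fixed_priority, weighted_round_robin, lottery, random, or round_robin.

fixed_priority: fixed priority scheduling. Every request is always sent to the first channel that provides the requested model. On error, it switches to the next channel. This is the default scheduling algorithm.
weighted_round_robin: weighted round-robin load balancing. Channels that provide the requested model are requested in the weighted order set in api_keys.(api).model in the configuration file.
lottery: lottery load balancing. A channel that provides the requested model is chosen at random according to the weights set in api_keys.(api).model in the configuration file.
round_robin: round-robin load balancing. Channels that provide the requested model are requested in the order configured in api_keys.(api).model in the configuration file. See the previous question for how to set channel priority.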
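
To make the differences concrete, here is a small sketch of the four algorithms described above. It is not taken from uni-api's source: the function names are hypothetical, and the weighted variant uses the well-known "smooth" weighted round-robin scheme as an assumption about how a weighted order might be produced.

```python
import random


def fixed_priority(channels):
    """Keep the configured order: the first channel is always tried first."""
    return list(channels)


def round_robin(channels, request_index):
    """Rotate the starting channel by one position for each new request."""
    start = request_index % len(channels)
    return channels[start:] + channels[:start]


def weighted_round_robin(weights):
    """Return one full cycle in which each channel appears `weight` times.

    Smooth variant: heavy channels are interleaved with light ones, not bunched.
    """
    total = sum(weights.values())
    current = dict.fromkeys(weights, 0)
    cycle = []
    for _ in range(total):
        for name, weight in weights.items():
            current[name] += weight
        pick = max(current, key=current.get)
        current[pick] -= total
        cycle.append(pick)
    return cycle


def lottery(weights, rng=random):
    """Pick one channel at random, with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


weights = {"gcp1": 5, "gcp2": 3, "gcp3": 2}
cycle = weighted_round_robin(weights)
print(cycle)
```

Over one cycle of ten requests the weighted order sends five to gcp1, three to gcp2, and two to gcp3, while lottery reaches the same proportions only on average.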

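How should base_url be filled in correctly?

Except for the special channels shown in the advanced configuration, all OpenAI-format providers need the complete base_url, which means base_url must end with /v1/chat/completions. If you are using GitHub Models, base_url should be https://models.inference.ai.azure.com/chat/completions, not Azure's URL.

How does the model timeout work? What is the priority between channel-level and global model timeout settings?

Channel-level timeout settings take priority over global model timeout settings. The priority order is: channel-level model timeout > channel-level default timeout > global model timeout > global default timeout > the TIMEOUT environment variable.

Adjusting the model timeout can prevent timeout errors on some channels. If you encounter the error {'error': '500', 'details': 'fetch_response_stream Read Response Timeout'}, try increasing the model timeout.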
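
The priority order can be written down as a short lookup chain. The following is a sketch of that order, assuming prefix matching on model names as the configuration comments describe; resolve_timeout and lookup are hypothetical names, and the tables mirror the model_timeout sections of api.yaml.

```python
def lookup(table, model):
    """Return the timeout of the first entry whose key the model name starts with."""
    for key, seconds in table.items():
        if key != "default" and model.startswith(key):
            return seconds
    return None


def resolve_timeout(model, channel_timeouts=None, global_timeouts=None, env_timeout=100):
    """Apply the documented priority, from most specific to least specific."""
    channel_timeouts = channel_timeouts or {}
    global_timeouts = global_timeouts or {}
    candidates = (
        lookup(channel_timeouts, model),   # channel-level model timeout
        channel_timeouts.get("default"),   # channel-level default timeout
        lookup(global_timeouts, model),    # global model timeout
        global_timeouts.get("default"),    # global default timeout
    )
    for seconds in candidates:
        if seconds is not None:
            return seconds
    return env_timeout                     # TIMEOUT environment variable


channel = {"gemini-1.5-pro": 10, "default": 20}
global_table = {"gpt-4o": 30, "default": 40}
print(resolve_timeout("gpt-4o-2024-08-06", channel, global_table))
```

Note that a channel-level default wins over a global entry for the exact model, which is easy to overlook when a channel sets only default.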

How does api_key_rate_limit work? How do I set the same rate limit for multiple models?

To set the same rate limit for the four models gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, and gemini-1.5-pro-002, you can configure:

api_key_rate_limit:
  gemini-1.5-pro: 1000/min

This matches every model whose name contains the string gemini-1.5-pro, so the rate limit for all four models gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, and gemini-1.5-pro-002 is set to 1000/min. The matching logic of the api_key_rate_limit field is as follows; here is a sample configuration file:

api_key_rate_limit:
  gemini-1.5-pro: 1000/min
  gemini-1.5-pro-002: 500/min

Suppose a request arrives for the model gemini-1.5-pro-002.

uni-api first tries to match the model name exactly against api_key_rate_limit. Because a rate limit is set for gemini-1.5-pro-002, that model's limit is 500/min. If the requested model is gemini-1.5-pro-latest instead, api_key_rate_limit has no exact entry for it, so uni-api looks for a configured entry that shares a prefix with gemini-1.5-pro-latest and finds gemini-1.5-pro; the limit for gemini-1.5-pro-latest is therefore 1000/min.
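
The two-step lookup (exact match first, then prefix match) can be sketched as follows. rate_limit_for is a hypothetical name, and preferring the longest matching prefix and falling back to a default entry are assumptions of this sketch, not confirmed details of uni-api.

```python
def rate_limit_for(model, limits):
    """Find the rate-limit string that applies to `model`."""
    # Step 1: an exact entry always wins.
    if model in limits:
        return limits[model]
    # Step 2: otherwise use a configured key that the model name starts with.
    prefixes = [k for k in limits if k != "default" and model.startswith(k)]
    if prefixes:
        return limits[max(prefixes, key=len)]
    # Step 3: fall back to the default entry, if one is configured.
    return limits.get("default")


limits = {"gemini-1.5-pro": "1000/min", "gemini-1.5-pro-002": "500/min"}
for name in ("gemini-1.5-pro-002", "gemini-1.5-pro-latest", "gemini-1.5-pro-001"):
    print(name, rate_limit_for(name, limits))
```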
