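Python Bindings for `llama.cpp`

Simple Python bindings for @ggerganov's llama.cpp library. This package provides:

Low-level access to the C API via a ctypes interface.
High-level Python API for text completion
- OpenAI-like API
- LangChain compatibility
- LlamaIndex compatibility
OpenAI-compatible web server
- Local Copilot replacement
- Function calling support
- Vision API support
- Multiple models

Documentation is available at https://llama-cpp-python.readthedocs.io/en/latest.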

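Installation

Requirements:

Python 3.8+
C compiler
- Linux: gcc or clang
- Windows: Visual Studio or MinGW
- MacOS: Xcode

To install the package, run:

pip install llama-cpp-python

This will also build llama.cpp from source and install it alongside this Python package.

If this fails, add --verbose to the pip install command to see the full cmake build log.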

Pre-built Wheel (New)

It is also possible to install a pre-built wheel with basic CPU support.

pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu

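Installation Configuration

llama.cpp supports a number of hardware acceleration backends to speed up inference, as well as backend-specific options. See the llama.cpp README for a full list.

All llama.cpp cmake build options can be set via the CMAKE_ARGS environment variable or via the --config-settings / -C cli flag during installation.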

Environment Variables

# Linux and Mac
CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS" \
  pip install llama-cpp-python

# Windows
$env:CMAKE_ARGS = "-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS"
pip install llama-cpp-python

CLI / requirements.txt

They can also be set via the pip install -C / --config-settings command and saved to a requirements.txt file:

pip install --upgrade pip # ensure pip is up to date
pip install llama-cpp-python \
  -C cmake.args="-DGGML_BLAS=ON;-DGGML_BLAS_VENDOR=OpenBLAS"

# requirements.txt

llama-cpp-python -C cmake.args="-DGGML_BLAS=ON;-DGGML_BLAS_VENDOR=OpenBLAS"
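
The two forms above differ only in how the options are joined: the CMAKE_ARGS environment variable takes space-separated -D flags, while -C cmake.args takes the same flags separated by semicolons. A minimal sketch of a helper that produces both strings from one dictionary; the function names are hypothetical and not part of llama-cpp-python:

```python
def cmake_flags(options):
    """Turn {"GGML_BLAS": "ON"} into ["-DGGML_BLAS=ON"], keeping insertion order."""
    return [f"-D{name}={value}" for name, value in options.items()]


def as_env_value(options):
    """Value for the CMAKE_ARGS environment variable (space-separated flags)."""
    return " ".join(cmake_flags(options))


def as_config_setting(options):
    """Argument for `pip install -C` (semicolon-separated flags)."""
    return "cmake.args=" + ";".join(cmake_flags(options))


blas = {"GGML_BLAS": "ON", "GGML_BLAS_VENDOR": "OpenBLAS"}
print(as_env_value(blas))       # -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
print(as_config_setting(blas))  # cmake.args=-DGGML_BLAS=ON;-DGGML_BLAS_VENDOR=OpenBLAS
```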

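Supported Backends

Below are some common backends, their build commands, and any additional environment variables required.

OpenBLAS (CPU)

To install with OpenBLAS, pass the GGML_BLAS and GGML_BLAS_VENDOR options through the CMAKE_ARGS environment variable before installing:

CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python

CUDA

To install with CUDA support, pass GGML_CUDA=on through the CMAKE_ARGS environment variable before installing:

CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python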

Pre-built Wheel (New)

It is also possible to install a pre-built wheel with CUDA support, as long as your system meets some requirements:

CUDA version is 12.1, 12.2, 12.3, 12.4 or 12.5
Python version is 3.10, 3.11 or 3.12

pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/<cuda-version>

Where <cuda-version> is one of the following:

cu121: CUDA 12.1
cu122: CUDA 12.2
cu123: CUDA 12.3
cu124: CUDA 12.4
cu125: CUDA 12.5

For example, to install the CUDA 12.1 wheel:

pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu121
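
The mapping from CUDA version to wheel index is mechanical, so it can be scripted. The sketch below only builds the URL from the versions listed above; the helper name is hypothetical, and it does not check which CUDA version is actually installed:

```python
BASE_URL = "https://abetlen.github.io/llama-cpp-python/whl"
SUPPORTED_CUDA = ("12.1", "12.2", "12.3", "12.4", "12.5")


def cuda_wheel_index(cuda_version):
    """Return the --extra-index-url for a CUDA version string such as "12.1"."""
    if cuda_version not in SUPPORTED_CUDA:
        raise ValueError(f"no pre-built wheel listed for CUDA {cuda_version}")
    major, minor = cuda_version.split(".")
    return f"{BASE_URL}/cu{major}{minor}"


print(cuda_wheel_index("12.1"))
```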

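Metal

To install with Metal (MPS), pass GGML_METAL=on through the CMAKE_ARGS environment variable before installing:

CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python

Pre-built Wheel (New)

It is also possible to install a pre-built wheel with Metal support, as long as your system meets some requirements:

MacOS version is 11.0 or later
Python version is 3.10, 3.11 or 3.12

pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/metal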

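hipBLAS (ROCm)

To install with hipBLAS / ROCm support for AMD cards, pass GGML_HIPBLAS=on through the CMAKE_ARGS environment variable before installing:

CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install llama-cpp-python

Vulkan

To install with Vulkan support, pass GGML_VULKAN=on through the CMAKE_ARGS environment variable before installing:

CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python

SYCL

To install with SYCL support, load the Intel oneAPI environment and pass GGML_SYCL=on through the CMAKE_ARGS environment variable before installing:

source /opt/intel/oneapi/setvars.sh
CMAKE_ARGS="-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python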

RPC

To install with RPC support, pass GGML_RPC=on through the CMAKE_ARGS environment variable before installing:

CMAKE_ARGS="-DGGML_RPC=on" pip install llama-cpp-python

Windows Notes

Error: Can't find 'nmake' or 'CMAKE_C_COMPILER'

If the build complains that it can't find 'nmake', '?' or CMAKE_C_COMPILER, you can extract w64devkit as described in the llama.cpp repository and add the compiler paths manually to CMAKE_ARGS before running pip install:

$env:CMAKE_GENERATOR = "MinGW Makefiles"
$env:CMAKE_ARGS = "-DGGML_OPENBLAS=on -DCMAKE_C_COMPILER=C:/w64devkit/bin/gcc.exe -DCMAKE_CXX_COMPILER=C:/w64devkit/bin/g++.exe"

See the instructions above and set CMAKE_ARGS to the BLAS backend you want to use.

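MacOS Notes

Detailed MacOS Metal GPU install documentation is available at docs/install/macos.md

M1 Mac Performance Issue

Note: If you are using an Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports the arm64 architecture. For example:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh

Otherwise, the installation will build the x86 version of llama.cpp, which will be 10x slower on an Apple Silicon (M1) Mac.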
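
One way to check which architecture the current interpreter was built for, before installing, is the standard-library platform module. This is a small sketch; the helper name is hypothetical:

```python
import platform


def is_arm64_python(machine=None):
    """True when the interpreter reports an arm64 (aarch64) build."""
    machine = machine or platform.machine()
    return machine.lower() in ("arm64", "aarch64")


if platform.system() == "Darwin" and not is_arm64_python():
    print("This Python is not an arm64 build; llama.cpp would be compiled for x86_64.")
else:
    print(f"Interpreter architecture: {platform.machine()}")
```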

M Series Mac Error: `(mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))`

Try installing with

CMAKE_ARGS="-DCMAKE_OSX_ARCHITECTURES=arm64 -DCMAKE_APPLE_SILICON_PROCESSOR=arm64 -DGGML_METAL=on" pip install --upgrade --verbose --force-reinstall --no-cache-dir llama-cpp-python

Upgrading and Reinstalling

To upgrade and rebuild llama-cpp-python, add the --upgrade --force-reinstall --no-cache-dir flags to the pip install command to ensure the package is rebuilt from source.

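High-level API

API Reference

The high-level API provides a simple managed interface through the Llama class.

Below is a short example demonstrating how to use the high-level API for basic text completion: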

from llama_cpp import Llama

llm = Llama(
    model_path="./models/7B/llama-model.gguf",
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # seed=1337,  # Uncomment to set a specific seed
    # n_ctx=2048,  # Uncomment to increase the context window
)
output = llm(
    "Q: Name the planets in the solar system? A: ",  # Prompt
    max_tokens=32,  # Generate up to 32 tokens, set to None to generate up to the end of the context window
    stop=["Q:", "\n"],  # Stop generating just before the model would generate a new question
    echo=True,  # Echo the prompt back in the output
)  # Generate a completion, can also call create_completion
print(output)

By default llama-cpp-python generates completions in an OpenAI-compatible format:

{
  "id": "cmpl-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "object": "text_completion",
  "created": 1679561337,
  "model": "./models/7B/llama-model.gguf",
  "choices": [
    {
      "text": "Q: Name the planets in the solar system? A: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto.",
      "index": 0,
      "logprobs": None,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 28,
    "total_tokens": 42
  }
}

Text completion is available through the __call__ and create_completion methods of the Llama class.
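
Since the result is a plain dictionary, the generated text and token counts can be read with ordinary indexing. The sketch below works on a hand-copied version of the sample output above, so it runs without a model; with echo=True the prompt is part of "text" and is stripped here:

```python
# Sample response copied from the example output above.
output = {
    "id": "cmpl-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "object": "text_completion",
    "created": 1679561337,
    "model": "./models/7B/llama-model.gguf",
    "choices": [
        {
            "text": "Q: Name the planets in the solar system? A: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto.",
            "index": 0,
            "logprobs": None,
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 14, "completion_tokens": 28, "total_tokens": 42},
}

prompt = "Q: Name the planets in the solar system? A: "
choice = output["choices"][0]
text = choice["text"]
# Remove the echoed prompt, if present, to keep only the generated answer.
answer = text[len(prompt):] if text.startswith(prompt) else text

print(answer)
print(choice["finish_reason"], output["usage"]["total_tokens"])
```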

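Pulling models from Hugging Face Hub

You can download Llama models in gguf format directly from Hugging Face using the from_pretrained method. You'll need to install the huggingface-hub package to use this feature (pip install huggingface-hub).

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen2-0.5B-Instruct-GGUF",
    filename="*q8_0.gguf",  # glob pattern matched against the files in the repo
    verbose=False,
)

By default from_pretrained will download the model to the Hugging Face cache directory. You can then manage installed model files with the huggingface-cli tool.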
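
The filename argument in the example is a glob pattern rather than an exact name. The sketch below illustrates glob matching with the standard-library fnmatch module against a hypothetical file listing; it is an assumption that the library resolves the pattern in a similar way, and the helper is not part of llama-cpp-python:

```python
import fnmatch


def pick_file(pattern, repo_files):
    """Return the single file name matching a glob pattern, or raise."""
    matches = [name for name in repo_files if fnmatch.fnmatch(name, pattern)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match for {pattern!r}, got {matches}")
    return matches[0]


# Hypothetical repository listing, for illustration only.
files = ["model-q4_0.gguf", "model-q8_0.gguf", "README.md"]
print(pick_file("*q8_0.gguf", files))  # model-q8_0.gguf
```

A pattern that matches several files (for example "*.gguf" here) is ambiguous, which is why the sketch insists on exactly one match.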

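Chat Completion

The high-level API also provides a simple interface for chat completion.

Chat completion requires that the model knows how to format the messages into a single prompt. The Llama class does this using pre-registered chat formats (i.e. chatml, llama-2, gemma, etc.) or by providing a custom chat handler object.

The model will format the messages into a single prompt using the following order of precedence:

Use the chat_handler if provided
Use the chat_format if provided
Use the tokenizer.chat_template from the gguf model's metadata (should work for most new models, older models may not have this)
Otherwise, fall back to the llama-2 chat format

Set verbose=True to see the selected chat format.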
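
The order of precedence can be written as a small decision function. This is an illustrative sketch of the rules as stated above, not the library's actual implementation; the function name and return shape are hypothetical:

```python
def resolve_chat_format(chat_handler=None, chat_format=None, metadata=None):
    """Return (source, value) describing which formatter would be chosen."""
    metadata = metadata or {}
    if chat_handler is not None:
        return ("chat_handler", chat_handler)
    if chat_format is not None:
        return ("chat_format", chat_format)
    template = metadata.get("tokenizer.chat_template")
    if template is not None:
        return ("gguf_template", template)
    # Nothing was provided and the model carries no template.
    return ("chat_format", "llama-2")


print(resolve_chat_format(chat_format="chatml"))
print(resolve_chat_format())
```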

from llama_cpp import Llama

llm = Llama(
    model_path="path/to/llama-2/llama-model.gguf",
    chat_format="llama-2",
)
llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant who perfectly describes images."},
        {
            "role": "user",
            "content": "Describe this image in detail please.",
        },
    ]
)

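Chat completion is available through the create_chat_completion method of the Llama class.

For OpenAI API v1 compatibility, you can use the create_chat_completion_openai_v1 method, which returns pydantic models instead of dicts.

JSON and JSON Schema Mode

To constrain chat responses to only valid JSON or a specific JSON Schema, use the response_format argument in create_chat_completion.

JSON Mode

The following example will constrain the response to valid JSON strings only.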

from llama_cpp import Llama

llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant that outputs in JSON.",
        },
        {"role": "user", "content": "Who won the world series in 2020"},
    ],
    response_format={
        "type": "json_object",
    },
    temperature=0.7,
)

JSON Schema Mode

To constrain the response further to a specific JSON Schema, add the schema to the schema property of the response_format argument.

from llama_cpp import Llama

llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant that outputs in JSON.",
        },
        {"role": "user", "content": "Who won the world series in 2020"},
    ],
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {"team_name": {"type": "string"}},
            "required": ["team_name"],
        },
    },
    temperature=0.7,
)
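
The constrained reply still arrives as a string in the message content, so it has to be decoded before use. A minimal sketch, assuming the OpenAI-style chat response layout (choices[0]["message"]["content"]); the response below is a hand-written stand-in rather than real model output, and the helper name is hypothetical:

```python
import json


def parse_json_reply(response, required=()):
    """Decode the assistant message as JSON and check that required keys exist."""
    content = response["choices"][0]["message"]["content"]
    data = json.loads(content)
    missing = [key for key in required if key not in data]
    if missing:
        raise KeyError(f"reply is missing required keys: {missing}")
    return data


# Hand-written stand-in for a chat completion result.
sample_response = {
    "choices": [
        {"message": {"role": "assistant", "content": '{"team_name": "Los Angeles Dodgers"}'}}
    ]
}
reply = parse_json_reply(sample_response, required=["team_name"])
print(reply["team_name"])
```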

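Function Calling

The high-level API supports OpenAI-compatible function and tool calling. This is possible through the functionary pre-trained models chat format or through the generic chatml-function-calling chat format.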

from llama_cpp import Llama

llm = Llama(model_path="path/to/chatml/llama-model.gguf", chat_format="chatml-function-calling")
llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary",
        },
        {
            "role": "user",
            "content": "Extract Jason is 25 years old",
        },
    ],
    tools=[{
        "type": "function",
        "function": {
            "name": "UserDetail",
            "parameters": {
                "type": "object",
                "title": "UserDetail",
                "properties": {
                    "name": {
                        "title": "Name",
                        "type": "string",
                    },
                    "age": {
                        "title": "Age",
                        "type": "integer",
                    },
                },
                "required": ["name", "age"],
            },
        },
    }],
    tool_choice={
        "type": "function",
        "function": {
            "name": "UserDetail",
        },
    },
)
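
In the OpenAI tool-calling format, each call carries the function name and a JSON-encoded arguments string. The sketch below assumes the result follows that layout and uses a hand-written stand-in response; the helper name is hypothetical:

```python
import json


def extract_tool_calls(response):
    """Return a list of (function_name, arguments_dict) from a chat completion."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        # "arguments" is a JSON string, not a dict, so decode it.
        calls.append((function["name"], json.loads(function["arguments"])))
    return calls


# Hand-written stand-in for a tool-calling result.
sample = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {
                            "name": "UserDetail",
                            "arguments": '{"name": "Jason", "age": 25}',
                        },
                    }
                ],
            }
        }
    ]
}
print(extract_tool_calls(sample))  # [('UserDetail', {'name': 'Jason', 'age': 25})]
```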

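Functionary v2

Various gguf-converted files for this set of models are available in their respective HF repositories. Functionary is able to intelligently call functions and also analyze any provided function outputs to generate coherent responses. All v2 models of functionary support parallel function calling. You can provide either functionary-v1 or functionary-v2 for the chat_format when initializing the Llama class.

Due to discrepancies between llama.cpp and HuggingFace's tokenizers, it is required to provide an HF tokenizer for functionary. The LlamaHFTokenizer class can be initialized and passed into the Llama class. This will override the default llama.cpp tokenizer used in the Llama class. The tokenizer files are already included in the respective HF repositories hosting the gguf files.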

from llama_cpp import Llama
from llama_cpp.llama_tokenizer import LlamaHFTokenizer

llm = Llama.from_pretrained(
    repo_id="meetkai/functionary-small-v2.2-GGUF",
    filename="functionary-small-v2.2.q4_0.gguf",
    chat_format="functionary-v2",
    tokenizer=LlamaHFTokenizer.from_pretrained("meetkai/functionary-small-v2.2-GGUF"),
)

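NOTE: There is no need to provide the default system messages used in Functionary, as they are added automatically in the Functionary chat handler. Thus, the messages should contain just the chat messages and/or system messages that provide additional context for the model (e.g. datetime, etc.).

Multi-modal Models

llama-cpp-python supports multi-modal models such as llava1.5, which allow the language model to read information from both text and images.

Below are the supported multi-modal models and their respective chat handlers (Python API) and chat formats (Server API).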

| Model | `LlamaChatHandler` | `chat_format` |
|---|---|---|
| llava-v1.5-7b | `Llava15ChatHandler` | `llava-1-5` |
| llava-v1.5-13b | `Llava15ChatHandler` | `llava-1-5` |
| llava-v1.6-34b | `Llava16ChatHandler` | `llava-1-6` |
| moondream2 | `MoondreamChatHandler` | `moondream2` |
| nanollava | `NanollavaChatHandler` | `nanollava` |
| llama-3-vision-alpha | `Llama3VisionAlphaChatHandler` | `llama-3-vision-alpha` |
| minicpm-v-2.6 | `MiniCPMv26ChatHandler` | `minicpm-v-2.6` |

Then you'll need to use a custom chat handler to load the clip model and process the chat messages and images.

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

chat_handler = Llava15ChatHandler(clip_model_path="path/to/llava/mmproj.bin")
llm = Llama(
    model_path="./path/to/llava/llama-model.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,  # n_ctx should be increased to accommodate the image embedding
)
llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant who perfectly describes images."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}},
            ],
        },
    ]
)

You can also pull the model from the Hugging Face Hub using the from_pretrained method.

from llama_cpp import Llama
from llama_cpp.llama_chat_format import MoondreamChatHandler

chat_handler = MoondreamChatHandler.from_pretrained(
    repo_id="vikhyatk/moondream2",
    filename="*mmproj*",
)

llm = Llama.from_pretrained(
    repo_id="vikhyatk/moondream2",
    filename="*text-model*",
    chat_handler=chat_handler,
    n_ctx=2048,  # n_ctx should be increased to accommodate the image embedding
)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}},
            ],
        }
    ]
)
# Chat completions return the reply under "message", not "text"
print(response["choices"][0]["message"]["content"])

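Note: Multi-modal models also support tool calling and JSON mode.

Loading a Local Image

Images can be passed as base64-encoded data URIs. The following example demonstrates how to do this.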

import base64

def image_to_base64_data_uri(file_path):
    # Read the file as bytes and embed it in a data URI (no spaces allowed inside the URI)
    with open(file_path, "rb") as img_file:
        base64_data = base64.b64encode(img_file.read()).decode("utf-8")
        return f"data:image/png;base64,{base64_data}"

# Replace 'file_path.png' with the actual path to your PNG file
file_path = "file_path.png"
data_uri = image_to_base64_data_uri(file_path)

messages = [
    {"role": "system", "content": "You are an assistant who perfectly describes images."},
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": data_uri}},
            {"type": "text", "text": "Describe this image in detail please."},
        ],
    },
]
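
The helper above hard-codes image/png. For other formats, a variant can guess the MIME type from the file extension with the standard-library mimetypes module. This is a sketch with a hypothetical function name; the demo writes placeholder bytes to a temporary file purely to show the output shape:

```python
import base64
import mimetypes
import os
import tempfile


def file_to_data_uri(file_path):
    """Encode an image file as a data URI, inferring the MIME type from its extension."""
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot determine an image type for {file_path!r}")
    with open(file_path, "rb") as img_file:
        encoded = base64.b64encode(img_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.jpg")
    with open(demo_path, "wb") as f:
        f.write(b"abc")  # placeholder bytes, not a real image
    demo_uri = file_to_data_uri(demo_path)

print(demo_uri)  # data:image/jpeg;base64,YWJj
```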

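Speculative Decoding

llama-cpp-python supports speculative decoding, which allows the model to generate completions based on a draft model.

The fastest way to use speculative decoding is through the LlamaPromptLookupDecoding class.

Just pass this as a draft model to the Llama class during initialization.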

from llama_cpp import Llama
from llama_cpp.llama_speculative import LlamaPromptLookupDecoding

llama = Llama(
    model_path="path/to/model.gguf",
    # num_pred_tokens is the number of tokens to predict. 10 is the default and generally
    # good for GPU; 2 performs better for CPU-only machines.
    draft_model=LlamaPromptLookupDecoding(num_pred_tokens=10),
)
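
To give an intuition for what a prompt-lookup draft does, the toy sketch below proposes draft tokens by finding the most recent earlier occurrence of the trailing n-gram and copying what followed it. It is a simplified illustration of the general idea, not the library's implementation; the function and its max_ngram_size parameter are assumptions made for this example. In real speculative decoding the main model then verifies the proposed tokens.

```python
def prompt_lookup_draft(token_ids, max_ngram_size=2, num_pred_tokens=10):
    """Propose draft tokens by matching the trailing n-gram earlier in the sequence."""
    for ngram_size in range(max_ngram_size, 0, -1):
        if len(token_ids) <= ngram_size:
            continue
        tail = token_ids[-ngram_size:]
        # Scan backwards so the most recent earlier match wins; the upper bound
        # excludes the trailing n-gram itself.
        for start in range(len(token_ids) - ngram_size - 1, -1, -1):
            if token_ids[start:start + ngram_size] == tail:
                end = start + ngram_size
                return token_ids[end:end + num_pred_tokens]
    return []  # no earlier occurrence, so nothing to propose


history = [1, 2, 3, 4, 1, 2]
print(prompt_lookup_draft(history, num_pred_tokens=2))  # [3, 4]
```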

Embeddings

To generate text embeddings, use create_embedding or embed. Note that you must pass embedding=True to the constructor upon model creation for these to work properly.

import llama_cpp

llm = llama_cpp.Llama(model_path="path/to/model.gguf", embedding=True)

embeddings = llm.create_embedding("Hello, world!")

# or create multiple embeddings at once

embeddings = llm.create_embedding(["Hello, world!", "Goodbye, world!"])

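There are two primary notions of embeddings in a Transformer-style model: token level and sequence level. Sequence-level embeddings are produced by "pooling" token-level embeddings together, usually by averaging them or using the first token.

Models that are explicitly geared towards embeddings will usually return sequence-level embeddings by default, one for each input string. Non-embedding models, such as those designed for text generation, will typically return only token-level embeddings, one for each token in each sequence. Thus the dimensionality of the return type will be one higher for token-level embeddings.

In some cases you can control the pooling behavior using the pooling_type flag on model creation. You can ensure token-level embeddings from any model using LLAMA_POOLING_TYPE_NONE. The reverse, getting a generation-oriented model to yield sequence-level embeddings, is currently not possible, but you can always do the pooling manually.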
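
Manual pooling by averaging can be done in a few lines. The sketch below assumes token-level embeddings arrive as a list of equal-length float lists (one per token) and averages them element-wise; the function name is hypothetical and the input is toy data:

```python
def mean_pool(token_embeddings):
    """Average token-level embeddings into one sequence-level embedding."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    n_tokens = len(token_embeddings)
    # zip(*rows) iterates over columns, i.e. one embedding dimension at a time.
    return [sum(column) / n_tokens for column in zip(*token_embeddings)]


tokens = [[1.0, 2.0], [3.0, 4.0]]  # two tokens, two dimensions
print(mean_pool(tokens))  # [2.0, 3.0]
```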

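Adjusting the Context Window

The context window of the Llama models determines the maximum number of tokens that can be processed at once. By default, this is set to 512 tokens, but can be adjusted based on your requirements.

For instance, if you want to work with larger contexts, you can expand the context window by setting the n_ctx parameter when initializing the Llama object:

from llama_cpp import Llama

llm = Llama(model_path="./models/7B/llama-model.gguf", n_ctx=2048)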
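
A rough way to decide whether n_ctx needs raising is to compare the prompt length plus the generation budget against the window. This is a simplified sketch under the assumption that prompt and generated tokens share the same window; the helper is hypothetical:

```python
def fits_in_context(prompt_tokens, max_tokens, n_ctx=512):
    """True when the prompt plus the requested completion fit in the window."""
    return prompt_tokens + max_tokens <= n_ctx


print(fits_in_context(14, 32))                # True: 46 tokens fit in 512
print(fits_in_context(500, 32))               # False: 532 tokens exceed 512
print(fits_in_context(500, 32, n_ctx=2048))   # True after raising n_ctx
```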

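OpenAI Compatible Web Server

llama-cpp-python offers a web server which aims to act as a drop-in replacement for the OpenAI API. This allows you to use llama.cpp compatible models with any OpenAI-compatible client (language libraries, services, etc.).

To install the server package and get started:

pip install 'llama-cpp-python[server]'
python3 -m llama_cpp.server --model models/7B/llama-model.gguf

Similar to the hardware acceleration section above, you can also install with GPU (cuBLAS) support like this:

CMAKE_ARGS="-DGGML_CUDA=on" FORCE_CMAKE=1 pip install 'llama-cpp-python[server]'
python3 -m llama_cpp.server --model models/7B/llama-model.gguf --n_gpu_layers 35

Navigate to http://localhost:8000/docs to see the OpenAPI documentation.

To bind to 0.0.0.0 to enable remote connections, use python3 -m llama_cpp.server --host 0.0.0.0. Similarly, to change the port (default is 8000), use --port.

You probably also want to set the prompt format. For chatml, use

python3 -m llama_cpp.server --model models/7B/llama-model.gguf --chat_format chatml

That will format the prompt according to how the model expects it. You can find the prompt format in the model card. For possible options, see llama_cpp/llama_chat_format.py and look for lines starting with "@register_chat_format".

If you have huggingface-hub installed, you can also use the --hf_model_repo_id flag to load a model from the Hugging Face Hub.

python3 -m llama_cpp.server --hf_model_repo_id Qwen/Qwen2-0.5B-Instruct-GGUF --model '*q8_0.gguf'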
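
Once the server is running, any HTTP client can talk to it. The sketch below builds a chat request with only the standard library, assuming the server listens on the default http://localhost:8000 and exposes the OpenAI-style /v1/chat/completions route; the "local-model" name is a placeholder and the helper names are hypothetical. The actual network call is left commented out so the snippet runs without a server:

```python
import json
import urllib.request


def build_chat_request(messages, base_url="http://localhost:8000", model="local-model"):
    """Build a POST request for the OpenAI-style chat completions endpoint."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


request = build_chat_request([{"role": "user", "content": "Hello"}])
print(request.full_url)  # http://localhost:8000/v1/chat/completions
# reply = send(request)
# print(reply["choices"][0]["message"]["content"])
```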

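Web Server Features

Local Copilot replacement
Function calling support
Vision API support
Multiple models

Docker image

A Docker image is available on GHCR. To run the server:

docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/llama-model.gguf ghcr.io/abetlen/llama-cpp-python:latest

Docker on termux (requires root) is currently the only known way to run this on phones; see the termux support issue.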

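Low-level API

API Reference

The low-level API is a direct ctypes binding to the C API provided by llama.cpp. The entire low-level API can be found in llama_cpp/llama_cpp.py and directly mirrors the C API in llama.h.

Below is a short example demonstrating how to use the low-level API to tokenize a prompt: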

import llama_cpp
import ctypes

llama_cpp.llama_backend_init(False)  # Must be called once at the start of each program
params = llama_cpp.llama_context_default_params()
# use bytes for char * params
model = llama_cpp.llama_load_model_from_file(b"./models/7b/llama-model.gguf", params)
ctx = llama_cpp.llama_new_context_with_model(model, params)
max_tokens = params.n_ctx
# use ctypes arrays for array params
tokens = (llama_cpp.llama_token * int(max_tokens))()
n_tokens = llama_cpp.llama_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, max_tokens, llama_cpp.c_bool(True))
llama_cpp.llama_free(ctx)

Check out the examples folder for more examples of using the low-level API.
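
The two conventions used in the low-level example (bytes for char * parameters, ctypes arrays for array parameters) can be tried with plain ctypes, without loading llama.cpp. The sketch assumes a token id is a 32-bit integer, as llama_token is in llama.h; token_t is a local stand-in, not an import from llama_cpp:

```python
import ctypes

# Assumption: token ids are 32-bit integers, mirroring llama_token in llama.h.
token_t = ctypes.c_int32

prompt = "Q: Name the planets in the solar system? A: "
prompt_bytes = prompt.encode("utf-8")  # char * parameters take bytes

max_tokens = 16
tokens = (token_t * max_tokens)()      # array parameters take a ctypes array (zero-initialized)

# A C function would fill the buffer and return how many entries it wrote;
# here two entries are written by hand to show how results are read back.
tokens[0], tokens[1] = 1, 2
n_tokens = 2
print(list(tokens[:n_tokens]))  # [1, 2]
```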

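Documentation

Documentation is available at https://llama-cpp-python.readthedocs.io/. If you find any issues with the documentation, please open an issue or submit a PR.

Development

This package is under active development and I welcome any contributions.

To get started, clone the repository and install the package in editable / development mode: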

git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python

# Upgrade pip (required for editable mode)
pip install --upgrade pip

# Install with pip
pip install -e .

# if you want to use the fastapi / openapi server
pip install -e .[server]

# to install all optional dependencies
pip install -e .[all]

# to clear the local build cache
make clean

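You can also test out specific commits of llama.cpp by checking out the desired commit in the vendor/llama.cpp submodule and then running make clean and pip install -e . again. Any changes in the llama.h API will require changes to the llama_cpp/llama_cpp.py file to match the new API (additional changes may be required elsewhere).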