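S&P AI Benchmarks Demo Pipeline

This repository shows how models can be evaluated on S&P AI Benchmarks. All configured models are listed in config.py. You can easily add your own models to the config, or run Hugging Face models using command-line options.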

Setup

Please download the questions from the submission page of our S&P AI Benchmarks website and save them directly within this folder as benchmarks-pipeline/benchmark_questions.json.

# We recommend using python 3.10.6 with pyenv
pyenv install 3.10.6
pyenv local 3.10.6
virtualenv -p python3.10.6 .benchmarks
source .benchmarks/bin/activate

# Install the requirements in your local environment
pip install -r requirements.txt

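Hardware requirements: most models that run quickly on a CPU will not perform well on this benchmark; we recommend using a system with GPUs. To set the device, use the --device_map parameter.

Design Decisions

We provide the prompts used for evaluation; currently, all models use the same prompt for a given question type. We allow models multiple attempts to generate an answer in the expected format. Without this retry step, we find that some models are unduly penalized by our answer parsing: they produce the correct answer in the wrong format. We therefore allow up to 10 attempts to generate an answer in the expected format. The source code in this repository does this by default, but the number of attempts can be controlled with the -t, --answer_parsing_tries_alloted parameter.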
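
The retry step described above can be sketched as follows. This is a minimal illustration and not the repository's actual implementation: the names `parse_answer` and `answer_with_retries`, the `generate` callable, and the assumption that the expected format is a bare number are all hypothetical.

```python
import re
from typing import Callable, Optional

# Assumption: the expected format is a bare number such as "2", "-4.5" or "13428.0".
_NUMBER = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_answer(text: str) -> Optional[float]:
    """Return the number in `text`, or None if it is not in the expected format."""
    match = _NUMBER.match(text)
    return float(match.group(1)) if match else None


def answer_with_retries(
    generate: Callable[[str], str], prompt: str, tries_allotted: int = 10
) -> Optional[float]:
    """Call `generate` up to `tries_allotted` times until its output parses."""
    for _ in range(tries_allotted):
        answer = parse_answer(generate(prompt))
        if answer is not None:
            return answer
    return None  # every attempt produced a malformed answer


# Hypothetical model stub: the first reply is malformed, the second one parses.
replies = iter(["The answer is 2", "2.0"])
print(answer_with_retries(lambda prompt: next(replies), "What is 1 + 1?"))  # prints 2.0
```

With `tries_allotted=1` the same stub would yield `None`, which shows how a model that knows the answer can still be scored as wrong when no retries are allowed.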

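Usage

We provide a number of configurations for both open-source and proprietary models in config.py. To use one of those models, pass the code listed for it in config.py. You can also configure a Hugging Face model via command-line arguments.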

python main.py -m Mistral-7B-v0.1-cot
# or:
python main.py -n mistralai/Mistral-7B-v0.1 --prompt_style cot --max_new_tokens 12 --answer_parsing_tries_alloted 1

The output CSV contains two columns, the question ID and the answer, with no header row. See results/Mistral-7B-v0.1-cot.csv for an example output.

# A snapshot from the example output.
35c06bfe-60a7-47b4-ab82-39e138abd629,13428.0
33c7bd71-e5a3-40dd-8eb0-5000c9353977,-4.5
7b60e737-4f0a-467b-9f73-fa5714d8cdbb,41846.0
0a3f6ada-b8d3-48cc-adb4-270af0e08289,2.0
03999e5f-05ee-4b71-95ad-c5a61aae4858,2.0
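
Because the file has no header row, a reader has to assign the column meanings itself. The sketch below shows one way to load such a file with Python's standard csv module; the helper name `load_answers` is hypothetical and not part of this repository, and the sample rows are copied from the snapshot above.

```python
import csv
import io
from typing import Dict, TextIO


def load_answers(handle: TextIO) -> Dict[str, str]:
    """Map question ID to answer text from a headerless two-column CSV."""
    answers: Dict[str, str] = {}
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        question_id, answer = row[0], row[1]
        answers[question_id] = answer
    return answers


# In-memory sample so the sketch runs without the results file.
sample = io.StringIO(
    "35c06bfe-60a7-47b4-ab82-39e138abd629,13428.0\n"
    "33c7bd71-e5a3-40dd-8eb0-5000c9353977,-4.5\n"
)
answers = load_answers(sample)
print(answers["33c7bd71-e5a3-40dd-8eb0-5000c9353977"])  # prints -4.5
```

To read a real results file, pass a handle opened with `open(path, newline="")` instead of the in-memory sample.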

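Configuring a New Model

To add a new model, add an entry to the _CONFIG variable in config.py. For example, the following snippet adds the Zephyr model with a custom default max_new_tokens. You must also select the prompt creator you want to use, which controls the prompt created for each question. We provide two: code_prompt_creator and cot_prompt_creator.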

_CONFIG = {
    # ... existing entries ...
    "example-zepyhr-code": lambda: (
        HFChatModel(
            "HuggingFaceH4/zephyr-7b-beta",
            device_map="auto",
            generation_kwargs={"max_new_tokens": 2048},
        ),
        code_prompt_creator,
    ),
}

For this particular model, you could also use the command line directly:

python main.py -n HuggingFaceH4/zephyr-7b-beta --prompt_style code --max_new_tokens 2048 --device_map auto
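
The _CONFIG snippet above maps each model code to a zero-argument lambda that returns a (model, prompt creator) pair. One plausible reason for that shape is that no model is constructed until its entry is actually called. The sketch below illustrates the pattern with stand-ins only: `StubModel`, `stub_prompt_creator` and `STUB_CONFIG` are hypothetical, and the prompt creator signature shown here is an assumption rather than the repository's real interface.

```python
from typing import Callable, Dict, Tuple


class StubModel:
    """Hypothetical stand-in for a model class such as HFChatModel."""

    def __init__(self, name: str, max_new_tokens: int) -> None:
        print(f"loading {name}")  # a real model would load weights here
        self.name = name
        self.max_new_tokens = max_new_tokens


def stub_prompt_creator(question: str) -> str:
    """Hypothetical stand-in for code_prompt_creator or cot_prompt_creator."""
    return f"Answer the following question.\n{question}"


PromptCreator = Callable[[str], str]

# Each value is a zero-argument factory, so nothing is built when the dict is defined.
STUB_CONFIG: Dict[str, Callable[[], Tuple[StubModel, PromptCreator]]] = {
    "stub-small": lambda: (StubModel("stub-small", 12), stub_prompt_creator),
    "stub-large": lambda: (StubModel("stub-large", 2048), stub_prompt_creator),
}

# Only the selected entry is called, so only this one model is constructed.
model, prompt_creator = STUB_CONFIG["stub-large"]()
print(model.max_new_tokens)
print(prompt_creator("What is 2 + 2?"))
```

Deferring construction this way keeps looking up one model code cheap even when the config lists many large models.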

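Uploading

Upload your results to S&P AI Benchmarks! See the page at https://benchmarks.kensho.com.

Contact

This repository is meant to serve as a template for further experimentation!

Please reach out to [email protected] with any questions.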

Additional Information

Version: 1.0.0
Type: Other source code
Updated: 2025-02-22
Size: 44.94KB
Source: GitHub
