axolotl

v0.5.2

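Axolotl is a tool designed to streamline the fine-tuning of various AI models, offering support for multiple configurations and architectures.

Features:

Train various Huggingface models such as llama, pythia, falcon, mpt
Supports fullfinetune, lora, qlora, relora, and gptq
Customize configurations using a simple yaml file or CLI overwrite
Load different dataset formats, use custom formats, or bring your own tokenized datasets
Integrated with xformers, flash attention, liger kernel, rope scaling, and multipacking
Works with a single GPU or multiple GPUs via FSDP or Deepspeed
Easily run with Docker locally or on the cloud
Log results and optionally checkpoints to wandb, mlflow or Comet
And more!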

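Axolotl
- Table of Contents
- Axolotl supports
- Quickstart ⚡
  - Usage
- Advanced Setup
  - Environment
    - Docker
    - Conda/Pip venv
    - Cloud GPU
    - Bare Metal Cloud GPU
      - LambdaLabs
      - GCP
    - Windows
    - Mac
    - Google Colab
    - Launching on public clouds via SkyPilot
    - Launching on public clouds via dstack
  - Dataset
  - Config
    - All Config Options
  - Train
    - Preprocess dataset
    - Multi-GPU
      - DeepSpeed
      - FSDP
      - FSDP + QLoRA
      - Weights & Biases Logging
      - Special Tokens
    - Liger Kernel
  - Inference Playground
  - Merge LORA to base
- Common Errors
  - Tokenization Mismatch b/w Inference & Training
- Debugging Axolotl
- Need help?
- Badge ❤️
- Community Showcase
- Contributing
- Sponsors
  - Diamond Sponsors - Contact directly
  - Gold Sponsors - $5000/mo
  - Silver Sponsors - $1000/mo
  - Bronze Sponsors - $500/mo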

Axolotl provides a unified repository for fine-tuning
a variety of AI models with ease.

Go ahead and Axolotl questions!

Axolotl supports

	fp16/fp32	lora	qlora	gptq	gptq w/flash attn	flash attn	xformers attn
llama	✅	✅	✅	✅	✅	✅	✅
Mistral	✅	✅	✅	✅	✅	✅	✅
Mixtral-MoE	✅	✅	✅	❓	❓	❓	❓
Mixtral8X22	✅	✅	✅	❓	❓	❓	❓
Pythia	✅	✅	✅	❌	❌	❌	❓
cerebras	✅	✅	✅	❌	❌	❌	❓
btlm	✅	✅	✅	❌	❌	❌	❓
mpt	✅	❌	❓	❌	❌	❌	❓
falcon	✅	✅	✅	❌	❌	❌	❓
gpt-j	✅	✅	✅	❌	❌	❓	❓
XGen	✅	❓	✅	❓	❓	❓	✅
phi	✅	✅	✅	❓	❓	❓	❓
RWKV	✅	❓	❓	❓	❓	❓	❓
Qwen	✅	✅	✅	❓	❓	❓	❓
Gemma	✅	✅	✅	❓	❓	✅	❓
Jamba	✅	✅	✅	❓	❓	✅	❓

✅: supported ❌: not supported ❓: untested

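Quickstart ⚡

Get started with Axolotl in just a few steps! This quickstart guide walks you through setting up and running a basic fine-tuning task.

Requirements: Nvidia GPU (Ampere architecture or newer for bf16 and Flash Attention), Python >=3.10 and PyTorch >=2.3.1.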
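As a quick sanity check of the requirements above, the following sketch reports whether the local Python and PyTorch versions meet the stated minimums and whether the first visible GPU is Ampere-class or newer. The helper names `parse_version` and `check_requirements` are illustrative and not part of Axolotl, and PyTorch is only inspected if it can be imported.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '2.3.1+cu121' into a tuple like (2, 3, 1)."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def check_requirements():
    """Compare the local environment with the minimums stated in the quickstart."""
    report = {"python_ok": sys.version_info >= (3, 10)}
    try:
        import torch  # optional: only inspected when installed
    except ImportError:
        report["torch_ok"] = False
        report["ampere_or_newer"] = False
        return report
    report["torch_ok"] = parse_version(torch.__version__) >= (2, 3, 1)
    # Ampere GPUs report compute capability 8.x; newer architectures report more.
    report["ampere_or_newer"] = bool(
        torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8
    )
    return report


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

A "no" for `ampere_or_newer` does not rule out training; it only suggests that bf16 and Flash Attention may be unavailable on that machine.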

git clone https://github.com/axolotl-ai-cloud/axolotl
cd axolotl

pip3 install packaging ninja
pip3 install -e '.[flash-attn,deepspeed]'

Usage

# preprocess datasets - optional but recommended
CUDA_VISIBLE_DEVICES="" python -m axolotl.cli.preprocess examples/openllama-3b/lora.yml

# finetune lora
accelerate launch -m axolotl.cli.train examples/openllama-3b/lora.yml

# inference
accelerate launch -m axolotl.cli.inference examples/openllama-3b/lora.yml \
    --lora_model_dir="./outputs/lora-out"

# gradio
accelerate launch -m axolotl.cli.inference examples/openllama-3b/lora.yml \
    --lora_model_dir="./outputs/lora-out" --gradio

# remote yaml files - the yaml config can be hosted on a public URL
# Note: the yaml config must directly link to the **raw** yaml
accelerate launch -m axolotl.cli.train https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/examples/openllama-3b/lora.yml
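The note about linking to the raw YAML is easy to get wrong when copying a URL from the GitHub web UI. The sketch below rewrites a github.com "blob" page URL into its raw.githubusercontent.com form. The function name `to_raw_github_url` is hypothetical and not part of Axolotl, and the rewrite assumes the branch or tag name contains no slash.

```python
from urllib.parse import urlparse


def to_raw_github_url(url):
    """Rewrite a github.com 'blob' page URL to its raw.githubusercontent.com form."""
    parts = urlparse(url)
    if parts.netloc == "raw.githubusercontent.com":
        return url  # already a raw link
    segments = parts.path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/blob/<ref>/<path to file>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub file URL: {url}")
    owner, repo, _, ref, *file_path = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, ref, *file_path])


page = "https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/openllama-3b/lora.yml"
print(to_raw_github_url(page))
```

The printed URL can then be passed to `accelerate launch -m axolotl.cli.train` in place of a local config path.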

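Advanced Setup

Environment

Docker

docker run --gpus '"all"' --rm -it axolotlai/axolotl:main-latest

Or run on the current files for development:

docker compose up -d

Tip

If you want to debug axolotl or prefer to use Docker as your development environment, see the debugging guide's section on Docker.

Docker advanced

A more powerful Docker command to run would be this:

docker run --privileged --gpus '"all"' --shm-size 10g --rm -it --name axolotl --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --mount type=bind,src="${PWD}",target=/workspace/axolotl -v ${HOME}/.cache/huggingface:/root/.cache/huggingface axolotlai/axolotl:main-latest

It additionally:

Prevents memory issues when running e.g. deepspeed (for example, SIGBUS/signal 7 errors) through the --ipc and --ulimit args.
Persists the downloaded HF data (models etc.) and your modifications to the axolotl code through the --mount / -v args.
The --name argument simply makes it easier to refer to the container in vscode (Dev Containers: Attach to Running Container...) or in your terminal.
The --privileged flag gives all capabilities to the container.
The --shm-size 10g argument increases the shared memory size. Use this if you see exitcode: -7 errors using deepspeed.

More information on the nvidia website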

Conda/Pip venv

Install python >= 3.10
Install pytorch stable https://pytorch.org/get-started/locally/

Install Axolotl along with python dependencies

pip3 install packaging
pip3 install -e '.[flash-attn,deepspeed]'

(Optional) Login to Huggingface to use gated models/datasets.
```
huggingface-cli login
```
Get the token at huggingface.co/settings/tokens

Cloud GPU

For cloud GPU providers that support docker images, use axolotlai/axolotl-cloud:main-latest

On Latitude.sh use this direct link
On JarvisLabs.ai use this direct link
On RunPod use this direct link

Bare Metal Cloud GPU

LambdaLabs

Install python

sudo apt update
sudo apt install -y python3.10

sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1
sudo update-alternatives --config python # pick 3.10 if given option
python -V # should be 3.10

Install pip

wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py

Install Pytorch https://pytorch.org/get-started/locally/
Follow the instructions in the quickstart.
Run

pip3 install protobuf==3.20.3
pip3 install -U --ignore-installed requests Pillow psutil scipy

Set path

export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH

GCP

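Use a Deeplearning linux OS with cuda and pytorch installed. Then follow the instructions in the quickstart.

Make sure to run the below to uninstall xla.

pip uninstall -y torch_xla[tpu]

Windows

Please use WSL or Docker!

Mac

Use the below instead of the install method in the quickstart.

pip3 install -e '.'

More info: mac.md

Google Colab

Please use this example notebook.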

Launching on public clouds via SkyPilot

To launch on GPU instances (both on-demand and spot instances) on 7+ clouds (GCP, AWS, Azure, OCI, and more), you can use SkyPilot:

pip install "skypilot-nightly[gcp,aws,azure,oci,lambda,kubernetes,ibm,scp]"  # choose your clouds
sky check

Get the example YAMLs of using Axolotl to finetune mistralai/Mistral-7B-v0.1:

git clone https://github.com/skypilot-org/skypilot.git
cd skypilot/llm/axolotl

Use one command to launch:

# On-demand
HF_TOKEN=xx sky launch axolotl.yaml --env HF_TOKEN

# Managed spot (auto-recovery on preemption)
HF_TOKEN=xx BUCKET=<unique-name> sky spot launch axolotl-spot.yaml --env HF_TOKEN --env BUCKET

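Launching on public clouds via dstack

To launch on GPU instances (both on-demand and spot instances) on public clouds (GCP, AWS, Azure, Lambda Labs, TensorDock, Vast.ai, and CUDO), you can use dstack.

Write a job description in YAML as below:

# dstack.yaml
type: task

image: axolotlai/axolotl-cloud:main-latest

env:
  - HUGGING_FACE_HUB_TOKEN
  - WANDB_API_KEY

commands:
  - accelerate launch -m axolotl.cli.train config.yaml

ports:
  - 6006

resources:
  gpu:
    memory: 24GB..
    count: 2

Then simply run the job with the dstack run command. Append the --spot option if you want a spot instance. The dstack run command shows you the cheapest instance across the configured cloud services:

pip install dstack
HUGGING_FACE_HUB_TOKEN=xxx WANDB_API_KEY=xxx dstack run . -f dstack.yaml # --spot

For further and more fine-grained use cases, please refer to the official dstack documents and the detailed description of the axolotl example in the official repository.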

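Dataset

Axolotl supports a variety of dataset formats. It is recommended to use JSONL. The schema of the JSONL depends on the task and the prompt template you wish to use. Instead of JSONL, you can also use a HuggingFace dataset with a column for each JSONL field.

See the documentation for more information on how to use different dataset formats.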
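To make the JSONL recommendation concrete, here is a minimal sketch that writes and re-reads a small file with one JSON object per line. It assumes the alpaca-style field names `instruction`, `input`, and `output`; the records and the helper names `write_jsonl` and `read_jsonl` are made up for illustration, and other dataset types expect different fields.

```python
import json
import os
import tempfile

# Hypothetical records in the alpaca layout: instruction, optional input, output.
records = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Name a prime number.", "input": "", "output": "7"},
]


def write_jsonl(rows, path):
    """Write one JSON object per line, which is all that JSONL requires."""
    with open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read the file back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "data.jsonl")
write_jsonl(records, path)
assert read_jsonl(path) == records  # round trip preserves every field
print(f"wrote {len(records)} records to {path}")
```

A file produced this way could be referenced from a config as a local dataset with `type: alpaca`.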

Config

See the examples for a quick start. It is recommended to duplicate one and modify it to your needs. The most important options are:

model

base_model: ./llama-7b-hf # local or huggingface repo

Note: The code will load the right architecture.

dataset

datasets:
  # huggingface repo
  - path: vicgalle/alpaca-gpt4
    type: alpaca

  # huggingface repo with specific configuration/subset
  - path: EleutherAI/pile
    name: enron_emails
    type: completion # format from earlier
    field: text # Optional[str] default: text, field to use for completion data

  # huggingface repo with multiple named configurations/subsets
  - path: bigcode/commitpackft
    name:
      - ruby
      - python
      - typescript
    type: ... # unimplemented custom format

  # chat_template https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/conversation.html#chat_template
  - path: ...
    type: chat_template
    chat_template: chatml # defaults to tokenizer's chat_template

  # local
  - path: data.jsonl # or json
    ds_type: json # see other options below
    type: alpaca

  # dataset with splits, but no train split
  - path: knowrohit07/know_sql
    type: context_qa.load_v2
    train_on_split: validation

  # loading from s3 or gcs
  # s3 creds will be loaded from the system default and gcs only supports public access
  - path: s3://path_to_ds # Accepts folder with arrow/parquet or file path like above. Supports s3, gcs.
    ...

  # Loading Data From a Public URL
  # - The file format is `json` (which includes `jsonl`) by default. For different formats, adjust the `ds_type` option accordingly.
  - path: https://some.url.com/yourdata.jsonl # The URL should be a direct link to the file you wish to load. URLs must use HTTPS protocol, not HTTP.
    ds_type: json # this is the default, see other options below.

loading

load_in_4bit: true
load_in_8bit: true

bf16: auto # require >=ampere, auto will detect if your GPU supports this and choose automatically.
fp16: # leave empty to use fp16 when bf16 is 'auto'. set to false if you want to fallback to fp32
tf32: true # require >=ampere

bfloat16: true # require >=ampere, use instead of bf16 when you don't want AMP (automatic mixed precision)
float16: true # use instead of fp16 when you don't want AMP

Note: The repo does not do 4-bit quantization.

lora

adapter: lora # 'qlora' or leave blank for full finetune
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
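To give a feel for what `lora_r`, `lora_alpha`, and `lora_target_modules` control, the sketch below counts the adapter weights that the settings above would add, using the standard LoRA formulation: each wrapped weight matrix of shape (d_out, d_in) gains two low-rank factors with r * (d_in + d_out) weights in total, and the update is scaled by alpha / r. The hidden size of 4096 and the 32 layers are assumptions for illustration, not values taken from any Axolotl config.

```python
def lora_trainable_params(r, module_shapes):
    """Count adapter weights: each wrapped (d_out, d_in) matrix adds r * (d_in + d_out)."""
    return sum(r * (d_in + d_out) for d_out, d_in in module_shapes.values())


def lora_scaling(lora_alpha, r):
    """Standard LoRA scales the low-rank update by alpha / r."""
    return lora_alpha / r


# Assumed shapes for one decoder layer of a hypothetical model with hidden size 4096.
shapes = {"q_proj": (4096, 4096), "v_proj": (4096, 4096)}
num_layers = 32  # assumption for illustration

per_layer = lora_trainable_params(r=8, module_shapes=shapes)
print(per_layer)               # 131072
print(per_layer * num_layers)  # 4194304
print(lora_scaling(16, 8))     # 2.0
```

Doubling `lora_r` doubles the adapter size, and adding more entries to `lora_target_modules` adds one term per wrapped matrix.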

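All Config Options

See these docs for all config options.

Train

Run

accelerate launch -m axolotl.cli.train your_config.yml

Tip

You can also reference a config file that is hosted on a public URL, for example accelerate launch -m axolotl.cli.train https://yourdomain.com/your_config.yml

Preprocess dataset

You can optionally pre-tokenize the dataset with the following before finetuning. This is recommended for large datasets.

Set dataset_prepared_path: to a local folder for saving and loading the pre-tokenized dataset.
(Optional): Set push_dataset_to_hub: hf_user/repo to push it to Huggingface.
(Optional): Use --debug to see preprocessed examples.

python -m axolotl.cli.preprocess your_config.yml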

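Multi-GPU

Below are the options available in axolotl for training with multiple GPUs. Note that DeepSpeed is the current recommended multi-GPU option because FSDP may experience loss instability.

DeepSpeed

Deepspeed is an optimization suite for multi-GPU systems that allows you to train much larger models than you might typically be able to fit into your GPU's VRAM. More information about the various optimization types for deepspeed is available at https://huggingface.co/docs/accelerate/main/en/usage_guides/deepspeed#what-is-integrated

We provide several default deepspeed JSON configurations for ZeRO stage 1, 2, and 3.

deepspeed: deepspeed_configs/zero1.json

accelerate launch -m axolotl.cli.train examples/llama-2/config.yml --deepspeed deepspeed_configs/zero1.json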
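As a rough guide to choosing between the ZeRO stage 1, 2, and 3 configurations, the sketch below estimates per-GPU memory for model state during a full fine-tune. It follows the accounting used in the ZeRO paper for mixed-precision Adam (2 bytes of fp16 weights, 2 bytes of fp16 gradients, and 12 bytes of fp32 optimizer state per parameter) and ignores activations, buffers, and fragmentation. The 7B model on 8 GPUs is an assumed example, and `zero_bytes_per_gpu` is an illustrative helper, not a DeepSpeed or Axolotl function.

```python
def zero_bytes_per_gpu(num_params, num_gpus, stage):
    """Approximate model-state memory per GPU for mixed-precision Adam training.

    Per parameter: 2 bytes of fp16 weights, 2 bytes of fp16 gradients and
    12 bytes of fp32 optimizer state. Activations are not included.
    """
    weights, grads, optim = 2 * num_params, 2 * num_params, 12 * num_params
    if stage >= 1:
        optim /= num_gpus    # stage 1 shards optimizer state
    if stage >= 2:
        grads /= num_gpus    # stage 2 also shards gradients
    if stage >= 3:
        weights /= num_gpus  # stage 3 also shards the weights
    return weights + grads + optim


params, gpus = 7e9, 8  # hypothetical 7B-parameter model on 8 GPUs
for stage in (0, 1, 2, 3):
    gib = zero_bytes_per_gpu(params, gpus, stage) / 2**30
    print(f"ZeRO stage {stage}: {gib:.1f} GiB per GPU")
```

Adapter-based runs such as LoRA keep far less optimizer state, so these figures are an upper-end estimate that applies to full fine-tuning only.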

FSDP

llama FSDP

fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_offload_params: true
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer

FSDP + QLoRA

Axolotl supports training with FSDP and QLoRA, see these docs for more information.

Weights & Biases Logging

Make sure your WANDB_API_KEY environment variable is set (recommended) or that you log in to wandb with wandb login.

wandb options

wandb_mode:
wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

Comet Logging

Make sure your COMET_API_KEY environment variable is set (recommended) or that you log in to Comet with comet login.

Comet options

use_comet:
comet_api_key:
comet_workspace:
comet_project_name:
comet_experiment_key:
comet_mode:
comet_online:
comet_experiment_config:

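Special Tokens

It is important to have special tokens such as delimiters, end-of-sequence, and beginning-of-sequence in your tokenizer's vocabulary. This helps you avoid tokenization issues and helps your model train better. You can do this in axolotl like this:

special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
tokens: # these are delimiters
  - "<|im_start|>"
  - "<|im_end|>"

When you include these tokens in your axolotl config, axolotl adds them to the tokenizer's vocabulary.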

Liger Kernel

Liger Kernel: Efficient Triton Kernels for LLM Training

https://github.com/linkedin/Liger-Kernel

Liger (LinkedIn GPU Efficient Runtime) Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. The Liger Kernel composes well and is compatible with both FSDP and Deepspeed.

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true
liger_layer_norm: true
liger_fused_linear_cross_entropy: true

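Inference Playground

Axolotl allows you to load your model in an interactive terminal playground for quick experimentation. The config file is the same config file used for training.

Pass the appropriate flag to the inference command, depending on what kind of model was trained:

Pretrained LORA:

python -m axolotl.cli.inference examples/your_config.yml --lora_model_dir="./lora-output-dir"

Full weights finetune:

python -m axolotl.cli.inference examples/your_config.yml --base_model="./completed-model"

Full weights finetune with a prompt from a text file:

cat /tmp/prompt.txt | python -m axolotl.cli.inference examples/your_config.yml \
  --base_model="./completed-model" --prompter=None --load_in_8bit=True

-- With gradio hosting

python -m axolotl.cli.inference examples/your_config.yml --gradio

Please use --sample_packing False if you have sample packing turned on and receive an error similar to the one below:

RuntimeError: stack expects each tensor to be equal size, but got [1, 32, 1, 128] at entry 0 and [1, 32, 8, 128] at entry 1

Merge LORA to base

The following command merges your LORA adapter with your base model. You can optionally pass the argument --lora_model_dir to specify the directory where your LORA adapter was saved; otherwise, this is inferred from output_dir in your axolotl config file. The merged model is saved in the sub-directory {lora_model_dir}/merged.

python3 -m axolotl.cli.merge_lora your_config.yml --lora_model_dir="./completed-model"

You may need to use the gpu_memory_limit and/or lora_on_cpu config options to avoid running out of memory. If you still run out of CUDA memory, you can try to merge in system RAM with

CUDA_VISIBLE_DEVICES="" python3 -m axolotl.cli.merge_lora ...

although this will be very slow, and using the config options above is recommended instead.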

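Common Errors

See also the FAQ and the debugging guide.

If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it:

Please reduce any of the below

micro_batch_size
eval_batch_size
gradient_accumulation_steps
sequence_len

If that does not help, try running without deepspeed and without accelerate (replace "accelerate launch" with "python") in the command.

Using adamw_bnb_8bit might also save you some memory.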
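The arithmetic behind these settings can be sketched as follows. Activation memory generally grows with the number of tokens processed in one forward/backward pass, which is bounded by micro_batch_size times sequence_len, while the number of sequences per optimizer step in data-parallel training is the product of micro_batch_size, gradient_accumulation_steps, and the GPU count. The values below are hypothetical and the helper names are illustrative, not part of Axolotl.

```python
def tokens_per_micro_batch(micro_batch_size, sequence_len):
    """Upper bound on tokens processed in one forward/backward pass on one GPU."""
    return micro_batch_size * sequence_len


def effective_batch_size(micro_batch_size, gradient_accumulation_steps, num_gpus=1):
    """Number of sequences that contribute to a single optimizer step."""
    return micro_batch_size * gradient_accumulation_steps * num_gpus


# Hypothetical settings before and after halving micro_batch_size.
before = {"micro_batch_size": 4, "gradient_accumulation_steps": 2}
after = {"micro_batch_size": 2, "gradient_accumulation_steps": 2}
sequence_len = 2048

for label, cfg in (("before", before), ("after", after)):
    tokens = tokens_per_micro_batch(cfg["micro_batch_size"], sequence_len)
    batch = effective_batch_size(**cfg)
    print(f"{label}: {tokens} tokens per pass, effective batch size {batch}")
```

Halving micro_batch_size here halves both numbers, so the change also alters the effective batch size that the learning rate was tuned for.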

failed (exitcode: -9)

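This usually means your system has run out of system memory. As with running out of VRAM, you should consider reducing the same settings. Additionally, look into upgrading your system RAM, which should be simpler than a GPU upgrade.

RuntimeError: expected scalar type Float but found Half

Try setting fp16: true

NotImplementedError: No operator found for memory_efficient_attention_forward ...

Try turning off xformers.

accelerate config missing

It's safe to ignore it.

NCCL Timeouts during training

See the NCCL guide.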

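Tokenization Mismatch b/w Inference & Training

For many formats, Axolotl constructs prompts by concatenating token ids after tokenizing strings. The reason for concatenating token ids rather than operating on strings is to maintain precise accounting for attention masks.

If you decode a prompt constructed by axolotl, you might see spaces between tokens (or a lack thereof) that you do not expect, especially around delimiters and special tokens. When you are starting out with a new format, you should always do the following:

1. Materialize some data using python -m axolotl.cli.preprocess your_config.yml --debug, and then decode the first few rows with your model's tokenizer.
2. During inference, right before you pass a tensor of token ids to your model, decode these tokens back into a string.
3. Make sure the inference string from #2 looks exactly like the data you fine-tuned on from #1, including spaces and new lines. If they aren't the same, adjust your inference server accordingly.
4. As an additional troubleshooting step, you can look at the token ids from #1 and #2 to make sure they are identical.

Misalignment between your prompts during training and inference can cause models to perform very poorly, so it is worth checking this. See this blog post for a concrete example.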
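The comparison of token ids described in the troubleshooting steps above can be sketched in plain Python. The helpers `first_mismatch` and `describe` are hypothetical and not part of Axolotl, and the token ids are made up. In practice the two lists would come from the preprocessed training data and from the inference server.

```python
def first_mismatch(train_ids, infer_ids):
    """Return the index of the first differing token id, or None if the lists match."""
    for index, (a, b) in enumerate(zip(train_ids, infer_ids)):
        if a != b:
            return index
    if len(train_ids) != len(infer_ids):
        return min(len(train_ids), len(infer_ids))  # one sequence is a prefix of the other
    return None


def describe(train_ids, infer_ids, context=3):
    """Build a short report showing the ids around the first difference."""
    index = first_mismatch(train_ids, infer_ids)
    if index is None:
        return "token ids are identical"
    start = max(0, index - context)
    return (
        f"first difference at position {index}: "
        f"train={train_ids[start:index + 1]} infer={infer_ids[start:index + 1]}"
    )


# Made-up ids: the inference prompt has an extra token (for example a stray space).
train = [1, 32001, 1792, 13, 15043]
infer = [1, 32001, 29871, 1792, 13, 15043]
print(describe(train, infer))
```

Decoding the ids around the reported position with the model's tokenizer usually shows whether the difference is a space, a newline, or a missing special token.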

Debugging Axolotl

See this debugging guide for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.

Need help?

Join our Discord server where our community members can help you.

Need dedicated support? Please contact us at ✉️[email protected] for dedicated support options.

Badge ❤️

Building something cool with Axolotl? Consider adding a badge to your model card.

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)

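Community Showcase

Check out some of the projects and models that have been built using Axolotl! Have a model you'd like to add to our Community Showcase? Open a PR with your model.

Open Access AI Collective

Minotaur 13b
Manticore 13b
Hippogriff 30b

PocketDoc Labs

Dan's PersonalityEngine 13b LoRA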

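Contributing

Please read the contributing guide

Bugs? Please check the open issues, or else create a new issue.

PRs are greatly welcome!

Please run the quickstart instructions followed by the below to set up the environment: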

pip3 install -r requirements-dev.txt -r requirements-tests.txt
pre-commit install

# test
pytest tests/

# optional: run against all files
pre-commit run --all-files

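Thanks to all of our contributors to date. Help drive open source AI progress forward by contributing to Axolotl.

Sponsors

OpenAccess AI Collective is run by volunteer contributors such as winglian, NanoCode012, tmm1, mhenrichsen, casper-hansen, hamelsmu and many more who help us accelerate forward by fixing bugs, answering community questions and implementing new features. Axolotl needs donations from sponsors for the compute needed to run our unit and integration tests, troubleshoot community issues, and provide bounties. If you love axolotl, consider sponsoring the project via GitHub Sponsors or Ko-fi, or reach out directly to [email protected].

Diamond Sponsors - Contact directly

Gold Sponsors - $5000/mo

Silver Sponsors - $1000/mo

Bronze Sponsors - $500/mo

JarvisLabs.ai

Additional information

Version: v0.5.2
Type: Other source code
Updated: 2024-12-02
Size: 2.04MB
Source: GitHub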
