llama3 playground
1.0.0
A complete, ready-to-run environment for fine-tuning Llama 3 models with a custom dataset and running inference on the fine-tuned models.
Note: This has so far been tested only on NVIDIA RTX 2080 and NVIDIA Tesla T4 GPUs. It has not been tested on other GPU classes or on CPUs.
Run this command on your host machine to check which NVIDIA GPU you have installed.
nvidia-smi
This should display your GPU information:
```
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04             Driver Version: 535.171.04   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080        Off | 00000000:01:00.0  On |                  N/A |
| 22%   38C    P8              17W / 215W |     197MiB / 8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
```
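If a script needs to gate on available GPU memory, the memory column of this output can be parsed programmatically. A minimal sketch; the sample line and the helper name are illustrative only, not part of the repository:

```python
import re

# A memory column as printed by nvidia-smi (sample taken from the output above)
SAMPLE = "| 22%   38C    P8    17W / 215W |    197MiB /  8192MiB |      0%      Default |"

def parse_memory(line: str) -> tuple[int, int]:
    """Extract (used_mib, total_mib) from an nvidia-smi memory column."""
    m = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", line)
    if m is None:
        raise ValueError("no memory figures found in line")
    return int(m.group(1)), int(m.group(2))

used, total = parse_memory(SAMPLE)  # → (197, 8192)
```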
```shell
git clone https://github.com/amithkoujalgi/llama3-playground.git
cd llama3-playground
bash build.sh
bash run.sh
```
This starts a Docker container with the following services:
| Service | Externally accessible endpoint | Internal port | Description |
|---|---|---|---|
| Supervisor | http://localhost:8884 | 9001 | For running training on custom datasets and viewing the logs of the trainer process |
| FastAPI server | http://localhost:8883/docs | 8070 | API for accessing the model server |
| JupyterLab server | http://localhost:8888/lab | 8888 | Access the JupyterLab interface to browse the container and update/experiment with the code |
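Since the container takes a moment to come up, it can help to wait for a service port before calling its API. A small readiness-check sketch (the ports are the ones from the table above; the helper itself is not part of the repository):

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Poll until a TCP port accepts connections, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# e.g. wait_for_port("localhost", 8883) before calling the FastAPI server
```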
Note: All processes (OCR, training, and inference) use the GPU, and running more than one of them concurrently leads to out-of-memory (OOM) errors. To avoid this, the system is designed to run only one process at any given point in time (i.e., only one instance of OCR, training, or inference can run at once).
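One way to enforce that one-process-at-a-time constraint in your own scripts is an advisory file lock around every GPU-heavy step. This is an illustrative sketch, not the mechanism the playground itself uses; the lock path and class name are arbitrary:

```python
import fcntl

LOCK_PATH = "/tmp/llama3-playground.gpu.lock"  # hypothetical path

class GpuLock:
    """Advisory file lock so only one GPU-heavy process (OCR, training,
    or inference) runs at a time, avoiding CUDA OOM errors."""

    def __init__(self, path: str = LOCK_PATH):
        self.path = path
        self._fh = None

    def __enter__(self):
        self._fh = open(self.path, "w")
        # Raises BlockingIOError immediately if another process holds the lock
        fcntl.flock(self._fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return self

    def __exit__(self, *exc):
        fcntl.flock(self._fh, fcntl.LOCK_UN)
        self._fh.close()
```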
Feel free to update the code to suit your needs.
To run training, go to the terminal and enter:
playground --train
To list the models, go to the terminal and enter:
playground -l
This generates the models under `/app/data/trained-models/`. The trainer script produces 2 models: the model suffixed with `lora-adapters`.

To run OCR:
```shell
cd /app/llama3_playground/core
python ocr.py -f "/app/sample.pdf"
```
To understand what the options mean, go to JupyterLab and run `python ocr.py -h`.
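If you drive OCR runs from Python rather than the shell, a thin subprocess wrapper is enough. A hypothetical sketch (`run_ocr` is not a function the repository provides; the `script` argument is injectable so the wrapper can be exercised without a GPU):

```python
import subprocess

def run_ocr(pdf_path: str, script=("python", "ocr.py"), timeout: float = 600.0) -> str:
    """Run the OCR script on a PDF and return its stdout; raises on non-zero exit."""
    cmd = [*script, "-f", pdf_path]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    result.check_returncode()
    return result.stdout
```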
To run inference with RAG:
```shell
cd /app/llama3_playground/core
python infer_rag.py \
  -m "llama-3-8b-instruct-custom-1720802202" \
  -d "/app/data/ocr-runs/123/text-result.txt" \
  -q "What is the employer name, address, telephone, TIN, tax year end, type of business, plan name, Plan Sequence Number, Trust ID, Account number, is it a new plan or existing plan as true or false, are elective deferrals and roth deferrals allowed as true or false, are loans permitted as true or false, are life insurance investments permitted and what is the eligibility Service Requirement selected?" \
  -t 256 \
  -e "Alibaba-NLP/gte-base-en-v1.5" \
  -p "There are checkboxes in the text that denote the value as selected if the text is [Yes], and unselected if the text is [No]. The checkbox option's value can either be before the selected value or after. Keep this in context while responding and be very careful and precise in picking these values. Always respond as JSON. Keep the responses precise and concise."
```
To understand what the options mean, go to JupyterLab and run `python infer_rag.py -h`.
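Conceptually, `infer_rag.py` retrieves the chunks of the OCR text most relevant to the question and passes them to the model. The sketch below substitutes simple keyword overlap for the `gte-base-en-v1.5` embedding similarity the script actually uses, purely to illustrate the chunk-and-retrieve step; all names here are hypothetical:

```python
import re

def chunks(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def top_k(question: str, parts: list[str], k: int = 3) -> list[str]:
    """Rank chunks by keywords shared with the question (embedding stand-in)."""
    q = tokens(question)
    return sorted(parts, key=lambda c: -len(q & tokens(c)))[:k]
```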
If you don't have the NVIDIA Container Toolkit installed on your host, you will need to do this.
```shell
# Configure the production repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Optionally, configure the repository to use experimental packages
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Update the packages list from the repository
sudo apt-get update

# Install the NVIDIA Container Toolkit packages
sudo apt-get install -y nvidia-container-toolkit
```
For other environments, refer to this.
Example: synchronous inference on a model with context text:

```shell
curl --silent -X 'POST' \
  'http://localhost:8883/api/infer/sync/ctx-text' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model_name": "llama-3-8b-instruct-custom-1720690384",
    "context_data": "You are a magician who goes by the name Magica",
    "question_text": "Who are you?",
    "prompt_text": "Respond in a musical and Shakespearean tone",
    "max_new_tokens": 50
  }' | jq -r ".data.response"
```
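The same endpoint can be called from Python with only the standard library. In this sketch, the response shape is assumed from the `jq -r ".data.response"` filter above, and the helper names are hypothetical:

```python
import json
from urllib import request

API_URL = "http://localhost:8883/api/infer/sync/ctx-text"

def extract_response(body: str) -> str:
    """Equivalent of `jq -r .data.response` on the API's JSON reply."""
    return json.loads(body)["data"]["response"]

def infer_sync(payload: dict, url: str = API_URL) -> str:
    """POST a JSON payload and return the generated text."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
    )
    with request.urlopen(req) as resp:
        return extract_response(resp.read().decode("utf-8"))
```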
Example: synchronous OCR on an uploaded PDF file:

```shell
curl -X 'POST' \
  'http://localhost:8883/api/ocr/sync/pdf' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@your_file.pdf;type=application/pdf'
```
Check whether an OCR process is currently running; the endpoint returns `true` while a job is in progress, otherwise it returns `false`:

```shell
curl -X 'GET' \
  'http://localhost:8883/api/ocr/status' \
  -H 'accept: application/json'
```
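Because only one OCR job may run at a time (see the note above), a client can poll this endpoint and wait for it to report no running job before submitting the next PDF. A sketch with an injectable status function so it can be tested offline; `wait_until_idle` is hypothetical, not part of the repository:

```python
import time
from typing import Callable

def wait_until_idle(get_status: Callable[[], bool],
                    poll_seconds: float = 2.0,
                    timeout: float = 300.0) -> bool:
    """Poll until `get_status()` (True while an OCR job is running) returns False.

    Returns True once the service is idle, False if `timeout` expires first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not get_status():
            return True
        time.sleep(poll_seconds)
    return False
```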