pip install refuel-autolabel
https://docs.refuel.ai/
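Access to large, clean, and diverse labeled datasets is a critical component of any successful machine learning effort. State-of-the-art LLMs like GPT-4 can label data automatically with high accuracy, at a fraction of the cost and time of manual labeling.
Autolabel is a Python library to label, clean, and enrich text datasets with any Large Language Model (LLM) of your choice.
Check out our technical report to learn more about the performance of RefuelLLM-v2 on our benchmark. You can replicate the benchmark yourself by following the steps below: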
cd autolabel/benchmark
curl https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/data.zip -o data.zip
unzip data.zip
python benchmark.py --model $model --base_dir benchmark-results
python results.py --eval_dir benchmark-results
cat results.csv
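You can benchmark a model by replacing $model with the name of the model to be benchmarked. If it is an API-hosted model such as gpt-3.5-turbo, gpt-4-1106-preview, claude-3-opus-20240229, gemini-1.5-pro-preview-0409, or another Autolabel-supported model, just write the name of the model. If the model to be benchmarked is supported by vLLM, pass the local path or the Hugging Face path corresponding to the model. The benchmark runs with the same prompts for all models.
results.csv will contain one row for each model that was benchmarked. See benchmark/results.csv for an example.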
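To benchmark several models in one go, the commands above can be generated from a list of model names. The following is a minimal sketch that only assembles the command lines already shown (benchmark.py with --model and --base_dir, then results.py with --eval_dir); the helper name build_benchmark_commands and the choice of models are illustrative assumptions, not part of Autolabel.

```python
import shlex


def build_benchmark_commands(models, base_dir="benchmark-results"):
    """Return argv lists: one benchmark.py run per model, then one results.py run.

    Hypothetical helper for illustration; it only mirrors the shell commands
    listed in the benchmark steps.
    """
    commands = [
        ["python", "benchmark.py", "--model", model, "--base_dir", base_dir]
        for model in models
    ]
    # Aggregate all runs into results.csv once every model has been benchmarked.
    commands.append(["python", "results.py", "--eval_dir", base_dir])
    return commands


if __name__ == "__main__":
    for argv in build_benchmark_commands(["gpt-3.5-turbo", "gpt-4-1106-preview"]):
        # Print the shell-quoted command; nothing is executed here.
        print(shlex.join(argv))
```

Each argv list could be passed to subprocess.run(argv, check=True) from inside autolabel/benchmark once data.zip has been downloaded and unpacked.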
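Autolabel provides a simple three-step process for labeling data: describe the task in a configuration file, preview the prompt and estimated cost with a dry run, and run the labeling job on your dataset.
Let's imagine we are building an ML model to analyze the sentiment of movie reviews. We have a dataset of movie reviews that we would like to get labeled first. For this case, here is what the example dataset and configuration look like: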
{
    "task_name": "MovieSentimentReview",
    "task_type": "classification",
    "model": {
        "provider": "openai",
        "name": "gpt-3.5-turbo"
    },
    "dataset": {
        "label_column": "label",
        "delimiter": ","
    },
    "prompt": {
        "task_guidelines": "You are an expert at analyzing the sentiment of movie reviews. Your job is to classify the provided movie review into one of the following labels: {labels}",
        "labels": [
            "positive",
            "negative",
            "neutral"
        ],
        "few_shot_examples": [
            {
                "example": "I got a fairly uninspired stupid film about how human industry is bad for nature.",
                "label": "negative"
            },
            {
                "example": "I loved this movie. I found it very heart warming to see Adam West, Burt Ward, Frank Gorshin, and Julie Newmar together again.",
                "label": "positive"
            },
            {
                "example": "This movie will be played next week at the Chinese theater.",
                "label": "neutral"
            }
        ],
        "example_template": "Input: {example}\nOutput: {label}"
    }
}
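The same configuration can be built and sanity-checked in Python before it is written to config.json. This is a sketch under stated assumptions: check_config and render_example are hypothetical helpers written for this example, not Autolabel functions, and filling the template with str.format only approximates how the placeholders might be substituted.

```python
import json

# Same task definition as the JSON configuration, expressed as a Python dict.
config = {
    "task_name": "MovieSentimentReview",
    "task_type": "classification",
    "model": {"provider": "openai", "name": "gpt-3.5-turbo"},
    "dataset": {"label_column": "label", "delimiter": ","},
    "prompt": {
        "task_guidelines": (
            "You are an expert at analyzing the sentiment of movie reviews. "
            "Your job is to classify the provided movie review into one of "
            "the following labels: {labels}"
        ),
        "labels": ["positive", "negative", "neutral"],
        "few_shot_examples": [
            {
                "example": "I got a fairly uninspired stupid film about how "
                "human industry is bad for nature.",
                "label": "negative",
            },
            {
                "example": "I loved this movie. I found it very heart warming "
                "to see Adam West, Burt Ward, Frank Gorshin, and Julie Newmar "
                "together again.",
                "label": "positive",
            },
            {
                "example": "This movie will be played next week at the Chinese theater.",
                "label": "neutral",
            },
        ],
        "example_template": "Input: {example}\nOutput: {label}",
    },
}


def check_config(cfg):
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    prompt = cfg["prompt"]
    allowed = set(prompt["labels"])
    problems = []
    if "{labels}" not in prompt["task_guidelines"]:
        problems.append("task_guidelines has no {labels} placeholder")
    for index, shot in enumerate(prompt["few_shot_examples"]):
        if shot["label"] not in allowed:
            problems.append(
                f"few-shot example {index} has unknown label {shot['label']!r}"
            )
    return problems


def render_example(cfg, example, label=""):
    """Fill example_template with str.format (an approximation for illustration)."""
    return cfg["prompt"]["example_template"].format(example=example, label=label)


if __name__ == "__main__":
    for problem in check_config(config):
        print("config problem:", problem)
    # json.dump escapes the newline in example_template as \n in the file.
    with open("config.json", "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4)
    print(render_example(config, "A rare exception to the rule that great "
                                 "literature makes disappointing films."))
```

Catching a few-shot example whose label is missing from the labels list before any API call is made avoids paying for a run whose prompt contradicts itself.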
Initialize the labeling agent and pass it the config:
from autolabel import LabelingAgent, AutolabelDataset
agent = LabelingAgent(config='config.json')
Preview an example prompt that will be sent to the LLM:
ds = AutolabelDataset('dataset.csv', config='config.json')
agent.plan(ds)
This prints:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100/100 0:00:00 0:00:00
┌──────────────────────────┬─────────┐
│ Total Estimated Cost │ $0.538 │
│ Number of Examples │ 200 │
│ Average cost per example │ 0.00269 │
└──────────────────────────┴─────────┘
─────────────────────────────────────────
Prompt Example:
You are an expert at analyzing the sentiment of movie reviews. Your job is to classify the provided movie review into one of the following labels: [positive, negative, neutral]
Some examples with their output answers are provided below:
Example: I got a fairly uninspired stupid film about how human industry is bad for nature.
Output:
negative
Example: I loved this movie. I found it very heart warming to see Adam West, Burt Ward, Frank Gorshin, and Julie Newmar together again.
Output:
positive
Example: This movie will be played next week at the Chinese theater.
Output:
neutral
Now I want you to label the following example:
Input: A rare exception to the rule that great literature makes disappointing films.
Output:
─────────────────────────────────────────────────────────────────────────────────────────
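The figures in the cost table printed by the dry run are related by simple arithmetic: the average cost per example is the total estimated cost divided by the number of examples. The sketch below reproduces that calculation and extrapolates it to a larger dataset. The function names are illustrative, and the linear extrapolation is an assumption: real cost depends on the length of each input and on the provider's pricing.

```python
def average_cost_per_example(total_cost, num_examples):
    """Total estimated cost divided by the number of examples."""
    if num_examples <= 0:
        raise ValueError("num_examples must be positive")
    return total_cost / num_examples


def projected_cost(avg_cost, num_examples):
    """Rough linear projection; assumes new examples resemble the sampled ones."""
    return avg_cost * num_examples


if __name__ == "__main__":
    # Values taken from the dry-run output: $0.538 for 200 examples.
    avg = average_cost_per_example(0.538, 200)
    print(f"average cost per example: {avg:.5f}")
    print(f"projected cost for 10,000 examples: ${projected_cost(avg, 10_000):.2f}")
```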
Finally, we can run the labeling on a subset or the entirety of the dataset:
ds = agent.run(ds)
The output dataframe contains the label column:
ds.df.head()
                                                text  ...  MovieSentimentReview_llm_label
0  I was very excited about seeing this film, ant...  ...                        negative
1  Serum is about a crazy doctor that finds a ser...  ...                        negative
4  I loved this movie. I knew it would be chocked...  ...                        positive
...
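A quick way to inspect the result is to count how often each label was assigned. The sketch below works on plain dictionaries so that it runs without Autolabel installed; it assumes the rows were exported with pandas' ds.df.to_dict("records") and uses only the three rows visible in the output above (texts truncated as shown). The helper name label_counts is illustrative.

```python
from collections import Counter

# Column name as it appears in the output dataframe.
LABEL_COLUMN = "MovieSentimentReview_llm_label"

# The three rows visible in the head() output, with text truncated as printed.
rows = [
    {"text": "I was very excited about seeing this film, ant...", LABEL_COLUMN: "negative"},
    {"text": "Serum is about a crazy doctor that finds a ser...", LABEL_COLUMN: "negative"},
    {"text": "I loved this movie. I knew it would be chocked...", LABEL_COLUMN: "positive"},
]


def label_counts(records, column=LABEL_COLUMN):
    """Count how many records received each label."""
    return dict(Counter(record[column] for record in records))


if __name__ == "__main__":
    for label, count in sorted(label_counts(rows).items()):
        print(f"{label}: {count}")
```

A heavily skewed distribution on a sample is often a cheap early signal that the task guidelines or few-shot examples need adjusting.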
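Refuel provides access to hosted open-source LLMs for labeling and for estimating confidence. This is helpful because you can calibrate a confidence threshold for your labeling task and then route less confident labels to humans, while still getting the benefits of auto-labeling for the confident examples.
To use Refuel-hosted LLMs, you can request access here.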
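The routing idea described above can be sketched as a simple threshold split. Everything specific here is an assumption made for illustration: the "confidence" field name, the 0.8 threshold, and the toy rows are hypothetical and do not reflect Autolabel's actual output schema or any measured scores.

```python
def route_by_confidence(rows, threshold, confidence_key="confidence"):
    """Split rows into (auto_accepted, needs_human_review) by confidence score.

    Rows at or above the threshold keep their LLM label; the rest are set
    aside for a human annotator. The key name is a hypothetical placeholder.
    """
    auto_accepted, needs_review = [], []
    for row in rows:
        if row[confidence_key] >= threshold:
            auto_accepted.append(row)
        else:
            needs_review.append(row)
    return auto_accepted, needs_review


if __name__ == "__main__":
    # Toy rows with made-up confidence scores, for illustration only.
    toy_rows = [
        {"id": 0, "label": "negative", "confidence": 0.97},
        {"id": 1, "label": "neutral", "confidence": 0.55},
        {"id": 2, "label": "positive", "confidence": 0.80},
    ]
    accepted, review = route_by_confidence(toy_rows, threshold=0.8)
    print("auto-accepted ids:", [row["id"] for row in accepted])
    print("human-review ids:", [row["id"] for row in review])
```

In practice the threshold would be calibrated on a small labeled sample, trading off the share of examples that need human review against the accuracy of the auto-accepted labels.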
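Check out our public roadmap to learn more about ongoing and planned improvements to the Autolabel library.
We are always looking for suggestions and contributions from the community. Join the discussion on Discord or open a GitHub issue to report bugs and request features.
Autolabel is a rapidly developing project. We welcome contributions in all forms: bug reports, pull requests, and ideas for improving the library.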