pip install refuel-autolabel
https://docs.refuel.ai/
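Access to large, clean and diverse labeled datasets is a critical component of any successful machine learning effort. State-of-the-art LLMs like GPT-4 can label data automatically with high accuracy, at a fraction of the cost and time of manual labeling.
Autolabel is a Python library to label, clean and enrich text datasets with any large language model (LLM) of your choice.
Check out our technical report to learn more about the performance of RefuelLLM-v2 on our benchmark. You can replicate the benchmark by following the steps below: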
cd autolabel/benchmark
curl https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/data.zip -o data.zip
unzip data.zip
python benchmark.py --model $model --base_dir benchmark-results
python results.py --eval_dir benchmark-results
cat results.csv
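You can benchmark the model of interest by replacing $model with the name of the model to benchmark. If it is an API-hosted model such as gpt-3.5-turbo, gpt-4-1106-preview, claude-3-opus-20240229, gemini-1.5-pro-preview-0409 or another Autolabel-supported model, just pass the model name. If the model to benchmark is supported by vLLM, pass either the local path or the Hugging Face path of the model. This runs the benchmark with the same prompts for all models.
results.csv will contain one row for each model that was benchmarked. See benchmark/results.csv for an example.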
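Once the benchmark finishes, the results file can also be inspected from Python. The following is a minimal sketch that assumes only that results.csv has a header row and one row per model; the actual column names depend on the benchmark script and are not assumed here. The helper name `load_benchmark_rows` is hypothetical.

```python
import csv
from pathlib import Path


def load_benchmark_rows(path):
    """Read a CSV file with a header row into a list of dicts (one per row)."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    results_path = Path("results.csv")
    if results_path.exists():
        # Each row is expected to describe one benchmarked model.
        for row in load_benchmark_rows(results_path):
            print(row)
    else:
        print("results.csv not found; run the benchmark first")
```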
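Autolabel provides a simple three-step process for labeling data: specify the labeling guidelines and the LLM in a config, preview the prompt with a dry run, and run the labeling on your dataset.
Let's imagine we are building an ML model to analyze the sentiment of movie reviews. We have a dataset of movie reviews that we'd like to get labeled first. For this case, here's what the example dataset and configuration will look like: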
{
"task_name" : "MovieSentimentReview" ,
"task_type" : "classification" ,
"model" : {
"provider" : "openai" ,
"name" : "gpt-3.5-turbo"
},
"dataset" : {
"label_column" : "label" ,
"delimiter" : ","
},
"prompt" : {
"task_guidelines" : "You are an expert at analyzing the sentiment of movie reviews. Your job is to classify the provided movie review into one of the following labels: {labels}" ,
"labels" : [
"positive" ,
"negative" ,
"neutral"
],
"few_shot_examples" : [
{
"example" : "I got a fairly uninspired stupid film about how human industry is bad for nature." ,
"label" : "negative"
},
{
"example" : "I loved this movie. I found it very heart warming to see Adam West, Burt Ward, Frank Gorshin, and Julie Newmar together again." ,
"label" : "positive"
},
{
"example" : "This movie will be played next week at the Chinese theater." ,
"label" : "neutral"
}
],
"example_template" : "Input: {example}\nOutput: {label}"
}
}
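To see how the placeholders in this config fit together, the sketch below fills `{labels}` in `task_guidelines` and `{example}` / `{label}` in `example_template` using plain Python string formatting. It only illustrates the substitution; it is not Autolabel's internal prompt builder. The helper name `render_prompt`, the comma-separated label formatting, the blank-line layout and the newline inside the template are assumptions made for this example.

```python
# A trimmed copy of the "prompt" section of the config, as a Python dict.
PROMPT_CONFIG = {
    "task_guidelines": (
        "You are an expert at analyzing the sentiment of movie reviews. "
        "Your job is to classify the provided movie review into one of "
        "the following labels: {labels}"
    ),
    "labels": ["positive", "negative", "neutral"],
    "few_shot_examples": [
        {
            "example": "I got a fairly uninspired stupid film about how "
                       "human industry is bad for nature.",
            "label": "negative",
        },
        {
            "example": "This movie will be played next week at the Chinese theater.",
            "label": "neutral",
        },
    ],
    "example_template": "Input: {example}\nOutput: {label}",
}


def render_prompt(prompt_config, new_input):
    """Build one prompt string: guidelines, few-shot examples, then the query."""
    labels = ", ".join(prompt_config["labels"])
    guidelines = prompt_config["task_guidelines"].format(labels=labels)
    template = prompt_config["example_template"]
    shots = [
        template.format(example=shot["example"], label=shot["label"])
        for shot in prompt_config["few_shot_examples"]
    ]
    # The query reuses the template with an empty label for the model to fill in.
    query = template.format(example=new_input, label="")
    return "\n\n".join([guidelines, *shots, query])


if __name__ == "__main__":
    print(render_prompt(PROMPT_CONFIG, "A rare exception to the rule that "
                                       "great literature makes disappointing films."))
```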
Initialize the labeling agent and pass it the config:
from autolabel import LabelingAgent, AutolabelDataset
agent = LabelingAgent(config='config.json')
Preview an example prompt that will be sent to the LLM:
ds = AutolabelDataset('dataset.csv', config='config.json')
agent.plan(ds)
This prints:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100/100 0:00:00 0:00:00
┌──────────────────────────┬─────────┐
│ Total Estimated Cost │ $0.538 │
│ Number of Examples │ 200 │
│ Average cost per example │ 0.00269 │
└──────────────────────────┴─────────┘
─────────────────────────────────────────
Prompt Example:
You are an expert at analyzing the sentiment of movie reviews. Your job is to classify the provided movie review into one of the following labels: [positive, negative, neutral]
Some examples with their output answers are provided below:
Example: I got a fairly uninspired stupid film about how human industry is bad for nature.
Output:
negative
Example: I loved this movie. I found it very heart warming to see Adam West, Burt Ward, Frank Gorshin, and Julie Newmar together again.
Output:
positive
Example: This movie will be played next week at the Chinese theater.
Output:
neutral
Now I want you to label the following example:
Input: A rare exception to the rule that great literature makes disappointing films.
Output:
─────────────────────────────────────────────────────────────────────────────────────────
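The cost table in this output is simple arithmetic: the average cost per example is the total estimated cost divided by the number of examples. The small check below uses the figures printed above; the function name `average_cost_per_example` is hypothetical and not part of Autolabel.

```python
def average_cost_per_example(total_cost, num_examples):
    """Return the mean cost per labeled example."""
    if num_examples <= 0:
        raise ValueError("num_examples must be positive")
    return total_cost / num_examples


if __name__ == "__main__":
    # Figures taken from the plan output: $0.538 total for 200 examples.
    avg = average_cost_per_example(0.538, 200)
    print(f"{avg:.5f}")  # 0.00269
```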
Finally, we can run the labeling on a subset or the entirety of the dataset:
ds = agent.run(ds)
The output dataframe contains the label column:
ds.df.head()
                                                text  ... MovieSentimentReview_llm_label
0  I was very excited about seeing this film, ant...  ...                       negative
1  Serum is about a crazy doctor that finds a ser...  ...                       negative
4  I loved this movie. I knew it would be chocked...  ...                       positive
...
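Refuel provides access to hosted open-source LLMs for labeling and for estimating confidence. This is helpful because you can calibrate a confidence threshold for your labeling task and then route less confident labels to humans, while still getting the benefits of auto-labeling for the confident examples.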
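The routing idea can be sketched in a few lines. Assuming each labeled row carries a label and a confidence score between 0 and 1, rows at or above a chosen threshold are kept as auto-labeled and the rest are queued for human review. The field names `label` and `confidence`, the threshold value and the function name `route_by_confidence` are illustrative assumptions, not Autolabel's output schema.

```python
def route_by_confidence(rows, threshold=0.8):
    """Split rows into (auto_labeled, needs_review) by confidence score."""
    auto_labeled, needs_review = [], []
    for row in rows:
        if row["confidence"] >= threshold:
            auto_labeled.append(row)
        else:
            needs_review.append(row)
    return auto_labeled, needs_review


if __name__ == "__main__":
    # Hypothetical labeled rows; the scores are made up for illustration.
    labeled = [
        {"text": "I loved this movie.", "label": "positive", "confidence": 0.97},
        {"text": "It was fine, I guess.", "label": "neutral", "confidence": 0.55},
    ]
    auto, review = route_by_confidence(labeled, threshold=0.8)
    print(len(auto), "auto-labeled;", len(review), "sent to human review")
```

In practice the threshold would be calibrated on a small set of examples with known labels, trading off how many rows go to reviewers against the accuracy of the rows that are accepted automatically.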
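To use Refuel-hosted LLMs, you can request access here.
Check out our public roadmap to learn more about ongoing and planned improvements to the Autolabel library.
We are always looking for suggestions and contributions from the community. Join the discussion on Discord or open a GitHub issue to report bugs and request features.
Autolabel is a rapidly developing project. We welcome contributions in all forms: bug reports, pull requests and ideas for improving the library.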