该存储库包含有关纸质WICE的数据集和代码:Wikipedia(EMNLP 2023)索赔的现实世界中的索赔。
作者:Ryo Kamoi,Tanya Goyal,Juan Diego Rodriguez,Greg Durrett
@inproceedings { kamoi-etal-2023-wice ,
title = " {W}i{CE}: Real-World Entailment for Claims in {W}ikipedia " ,
author = " Kamoi, Ryo and
Goyal, Tanya and
Rodriguez, Juan and
Durrett, Greg " ,
editor = " Bouamor, Houda and
Pino, Juan and
Bali, Kalika " ,
booktitle = " Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing " ,
month = dec,
year = " 2023 " ,
address = " Singapore " ,
publisher = " Association for Computational Linguistics " ,
url = " https://aclanthology.org/2023.emnlp-main.470 " ,
pages = " 7561--7583 " ,
}
WICE是基于自然主张的精细元素文本构成数据集,并从Wikipedia提取的证据对。 Given a sentence in Wikipedia and the corresponding article(s) it cites, we annotate the entailment label, a list of sentences in the cited article(s) that support the claim sentence, and tokens in the claim that are unsupported by the article( S)。
This dataset can be used to evaluate a variety of tasks, but is primarily designed for three tasks: entailment classification, evidence sentence retrieval, and non-supported tokens detection.
数据/intailment_retrieval包括用于完成和检索任务的WICE数据集。数据/intailment_retrieval/索赔包括具有原始索赔和数据/intailment_retrieval/sublaim的数据,其中包括带有分解索赔的数据(使用索赔要求进行良好的注释)。
每个子目录都包含用于火车,开发和测试集的JSONL文件。这是JSONL文件中数据的示例:
{
"label" : " partially_supported " ,
"supporting_sentences" : [[ 5 , 15 ], [ 15 , 17 ]],
"claim" : " Arnold is currently the publisher and editorial director of Media Play News, one of five Hollywood trades and the only one dedicated to the home entertainment sector. " ,
"evidence" : [ list of evidence sentences ],
"meta" : { "id" : " dev02986 " , "claim_title" : " Roger Hedgecock " , "claim_section" : " Other endeavors. " , "claim_context" : [ paragraph ]}
}
label
:intailment标签在{ supported
, partially_supported
, not_supported
}中supporting_sentences
:支持句子的索引列表。所有提供的支撑句子集都是有效的(在上面的示例中, [5, 15]
和[5, 17]
都被注释为正确的支持句子集,其中包含相同的信息)。claim
:威基百科的句子evidence
:引用网站中的句子清单meta
claim_title
:包括claim
的Wikipedia页面的标题claim_section
:包括claim
的节claim_context
: claim
之前的句子数据/non_supported_tokens包括用于非支持令牌检测任务的WICE数据集。我们仅提供注释为partially_supported
子声明的注释。我们用低通道协议过滤了数据点(有关详细信息,请参阅本文)。
{
"claim" : " Irene Hervey appeared in over fifty films and numerous television series. " ,
"claim_tokens" : [ " Irene " , " Hervey " , " appeared " , " in " , " over " , " fifty " , " films " , " and " , " numerous " , " television " , " series " , " . " ],
"non_supported_spans" : [ false , false , false , false , true , true , false , false , false , false , false , false ],
"evidence" : [ list of evidence sentences ],
"meta" : { "id" : " test00561-1 " , "claim_title" : " Irene Hervey " , "claim_section" : " Abstract. " , "claim_context" : " Irene Hervey was an American film, stage, and television actress. " }
}
claim_tokens
:索赔中的令牌列表non_supported_spans
:对应于claim_tokens
的bool列表( true
是不支持的令牌) 索赔_split目录包括索赔分裂的提示,这是一种使用GPT-3分解索赔的方法。在这项工作的实验中,我们使用不同的数据集使用不同的提示,因此我们为WICE,VITAMINC,PAWS和FRANK(XSUM)提供了提示。
当您在WICE上评估核心分类模型时,除非您的模型可以使用很长的输入上下文处理,否则您必须从证据文章中检索证据句子作为第一步。请参阅我们的论文,以获取评估WICE输入长度有限的模型的可能方法。
如果评估证据检索模型,则可以在数据/intailment_retrieval中使用数据。
If you are looking for simple NLI datasets with short evidence that do not require any retrieval models (like SNLI, MNLI, and ANLI), you can use our oracle retrieval dataset. Oracle检索数据集模拟了您具有完美证据检索模型的情况。当您在此Oracle检索数据上报告结果时,您需要清楚地提到使用Oracle检索数据集,而不是原始的WICE数据集。
我们提供了使用GPT-3.5和GPT-4在Oracle检索数据集上复制实验的代码。有关详细信息,请参阅Code_and_resources/code/readme.md。
请参阅licence.md文件。