FACTOR

This repo contains the data from the AI21 Labs paper "Generating Benchmarks for Factuality Evaluation of Language Models".

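Data

We include the following FACTOR benchmarks for evaluating the factuality of language models:

Wiki-FACTOR: based on the validation split of the Wikipedia section of The Pile. The dataset consists of 2994 examples.
News-FACTOR: based on Reuters articles extracted from the RefinedWeb dataset. The dataset consists of 1036 examples.
Expert-FACTOR: based on the validation and test splits of ExpertQA, a long-form question answering dataset. The benchmark consists of 236 examples.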
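
As a quick sanity check on a downloaded copy, the example counts above can be compared with the number of rows in each benchmark file. The sketch below is only an illustration: it assumes each CSV has a single header row and one example per row, and `count_examples` is a hypothetical helper, not part of the repo. Only `./data/wiki_factor.csv` is named in this README, so paths to the other files are left to the reader.

```python
import csv


def count_examples(csv_path):
    """Count data rows in a CSV file, assuming one header row (an assumption)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)   # handles quoted fields that span several lines
        next(reader, None)       # skip the assumed header row
        return sum(1 for _ in reader)
```

If the assumptions hold, `count_examples("./data/wiki_factor.csv")` would be expected to match the 2994 examples stated above.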

Evaluation

Setup

To install the libraries required by our repo, run:

pip install -r requirements.txt

To use a PyTorch version built for your specific CUDA version, install it before running the command above.
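
To see whether a CUDA-specific PyTorch build is already in place before installing the requirements, a small check like the following may help. It is a sketch rather than part of the repo: `describe_torch_install` is a hypothetical helper, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_install():
    """Return a short description of the installed PyTorch build, if any."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    cuda_build = torch.version.cuda  # None when the build has no CUDA support
    if cuda_build is None:
        return f"PyTorch {torch.__version__} (no CUDA support in this build)"
    return (
        f"PyTorch {torch.__version__} built for CUDA {cuda_build}; "
        f"GPU available: {torch.cuda.is_available()}"
    )


print(describe_torch_install())
```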

List of Language Models

In the paper, we report results for the following models (replace $MODEL_NAME with one of them).

GPT-2: gpt2, gpt2-medium, gpt2-large, gpt2-xl
GPT-Neo: EleutherAI/gpt-neo-1.3B, EleutherAI/gpt-neo-2.7B, EleutherAI/gpt-j-6B
OPT: facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, facebook/opt-6.7b, facebook/opt-13b, facebook/opt-30b, facebook/opt-66b

Evaluation Script

To evaluate a model on the FACTOR datasets, use the following command:

python eval_factuality.py \
--data_file ./data/wiki_factor.csv \
--output_folder $OUTPUT_DIR \
--model_name $MODEL_NAME
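
When evaluating several models from the list above, it can be convenient to generate one command per model. The following sketch uses only the three flags shown in the command above; the `./results` output root, the per-model folder naming, and the `build_command` helper are illustrative assumptions, not part of the repo.

```python
import shlex

# Any subset of the model names listed in this README.
MODELS = ["gpt2", "gpt2-medium", "facebook/opt-125m"]


def build_command(model_name, data_file="./data/wiki_factor.csv", output_root="./results"):
    """Build the argument list for one evaluation run (hypothetical helper)."""
    # Replace "/" so each model gets a flat output folder (naming is an assumption).
    output_folder = f"{output_root}/{model_name.replace('/', '_')}"
    return [
        "python", "eval_factuality.py",
        "--data_file", data_file,
        "--output_folder", output_folder,
        "--model_name", model_name,
    ]


for model in MODELS:
    # Print the commands for review; nothing is executed here.
    print(shlex.join(build_command(model)))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repo root once the printed commands look right.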

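License

wiki_factor, expert_factor and code: released under the MIT license.
news_factor: the benchmark is derived from the RefinedWeb dataset. The public extract is made available under the ODC-By 1.0 license; users should also abide by the Common Crawl ToU: https://commoncrawl.org/terms-of-use/.

Citation

If you find our paper or code helpful, please cite our paper: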

 @article{muhlgay2023generating,
  title={Generating benchmarks for factuality evaluation of language models},
  author={Muhlgay, Dor and Ram, Ori and Magar, Inbal and Levine, Yoav and Ratner, Nir and Belinkov, Yonatan and Abend, Omri and Leyton-Brown, Kevin and Shashua, Amnon and Shoham, Yoav},
  journal={arXiv preprint arXiv:2307.06908},
  year={2023}
}
