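FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

Model/Data Links | Installation | Usage | Leaderboard | Citing

Official repository for the paper FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions. Official evaluation can be done by installing the mteb library and evaluating your MTEB-compatible model with zero (or only a few) lines of code changes!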

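Links

Binary	Description
FollowIR-7B	7B parameter model that reranks documents given a query and instructions. It is fine-tuned from Mistral-7B on the datasets below.
FollowIR-train	The dataset used to train FollowIR-7B. It consists of TREC instructions and queries, plus GPT-generated synthetic documents that have been filtered.
FollowIR-train-raw	The pre-filtered version of the training set above. It was not used for model training because some of the GPT-generated data is incorrect.

You can also find the individual annotated test data (Robust04, Core17, and News21), although the format is best suited for use with MTEB's evaluation code.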

Installation

If you wish to reproduce the experiments in the paper, you can use the following code:

git clone https://github.com/orionw/FollowIR.git
cd FollowIR/
conda create -n followir python=3.9 -y
conda activate followir
pip install -r requirements.txt
bash launch_all_jobs.sh

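Usage

If your model is SentenceTransformer compatible and requires no special tokens for concatenating the query and instructions, you can simply use the following one-line command: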

mteb -m $MODEL_NAME -t $DATASET

Run it once for each dataset in {Robust04InstructionRetrieval, Core17InstructionRetrieval, News21InstructionRetrieval}.
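The per-dataset invocation can also be scripted. The following is a minimal sketch, assuming the mteb command is available on your PATH; the helper names build_mteb_commands and run_all are hypothetical and are not part of the repository.

```python
import subprocess

# Dataset names as listed in this README.
FOLLOWIR_TASKS = [
    "Robust04InstructionRetrieval",
    "Core17InstructionRetrieval",
    "News21InstructionRetrieval",
]


def build_mteb_commands(model_name, tasks=FOLLOWIR_TASKS):
    """Return one `mteb -m MODEL -t TASK` argument list per task."""
    return [["mteb", "-m", model_name, "-t", task] for task in tasks]


def run_all(model_name):
    """Run each command in turn; check=True stops at the first failure."""
    for cmd in build_mteb_commands(model_name):
        subprocess.run(cmd, check=True)
```

Calling run_all with your model name would launch the three evaluations sequentially. Building the argument lists separately makes it easy to print them for a dry run first.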

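If you have a bi-encoder model but want to do something other than simply appending the instruction to the query with a space, you can extend DenseRetrievalExactSearch and check for instructions in kwargs. See models/base_sentence_transformers/ as a starting point for small modifications and models/e5/ for an example with larger modifications.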
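As an illustration of the "check for instructions in kwargs" idea, here is a minimal standalone sketch. It does not subclass DenseRetrievalExactSearch; the function name merge_instructions, the template parameter, and the assumption that instructions arrives as a list aligned with the queries are all hypothetical, so check the referenced model directories for the actual interface.

```python
def merge_instructions(queries, template="{query} {instruction}", **kwargs):
    """Combine each query with its instruction, if any were passed in kwargs.

    Assumption (not confirmed by the README): kwargs["instructions"] is a
    list aligned one-to-one with `queries`, and entries may be None.
    """
    instructions = kwargs.get("instructions")
    if instructions is None:
        # No instructions supplied: leave the queries unchanged.
        return list(queries)
    if len(instructions) != len(queries):
        raise ValueError("queries and instructions must have the same length")

    merged = []
    for query, instruction in zip(queries, instructions):
        if instruction is None:
            merged.append(query)
        else:
            merged.append(template.format(query=query, instruction=instruction))
    return merged
```

The default template reproduces the plain "append with a space" behaviour; passing something like template="Instruction: {instruction} Query: {query}" shows how a model-specific format could be swapped in before the queries are encoded.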

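Reranker Usage

Rerankers have now been added to MTEB! If you are using a reranker model, you will need to extend the DenseRetrievalExactSearch class and define an __init__ and a predict function (see the models/rerankers section for a variety of reranker examples). Your predict function should take in input_to_rerank, a collection of (query, passage, instruction) tuples that can be unpacked as follows: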

# If there are no instructions, instructions will be a list of Nones.
# Instructions will be present for all of the FollowIR datasets.
queries, passages, instructions = list(zip(*input_to_rerank))

Your predict function should use these and return a list containing one score for each tuple.
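To make the expected shape of predict concrete, here is a minimal sketch. The class ToyOverlapReranker is hypothetical: it is a plain Python class rather than a DenseRetrievalExactSearch subclass, and it scores by word overlap instead of calling a real model. Only the unpacking step and the "one score per tuple" contract come from the description above.

```python
class ToyOverlapReranker:
    """Stand-in reranker; a real one would extend DenseRetrievalExactSearch."""

    def __init__(self, lowercase=True):
        self.lowercase = lowercase

    def _tokens(self, text):
        text = text.lower() if self.lowercase else text
        return set(text.split())

    def predict(self, input_to_rerank):
        if not input_to_rerank:
            return []
        # Same unpacking as in the README; instructions may contain None.
        queries, passages, instructions = list(zip(*input_to_rerank))

        scores = []
        for query, passage, instruction in zip(queries, passages, instructions):
            # Placeholder scoring: fraction of prompt words found in the passage.
            prompt = query if instruction is None else f"{query} {instruction}"
            prompt_tokens = self._tokens(prompt)
            overlap = prompt_tokens & self._tokens(passage)
            scores.append(len(overlap) / max(len(prompt_tokens), 1))
        return scores
```

In a real reranker, the body of the loop would be replaced by a batched forward pass of the model over the (query, instruction, passage) inputs, while the return value stays a flat list with one score per input tuple.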

Citing

If you found the code, data, or model useful, feel free to cite:

@misc{weller2024followir,
      title={FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions},
      author={Orion Weller and Benjamin Chang and Sean MacAvaney and Kyle Lo and Arman Cohan and Benjamin Van Durme and Dawn Lawrie and Luca Soldaini},
      year={2024},
      eprint={2403.15246},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}


Additional information

Version: 1.0.0
Type: Other source code
Updated: 2024-12-26
Size: 83.12 MB
Source: GitHub
