LM SupCon

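Contrastive Learning for Prompt-based Few-shot Language Learners

This repository contains the implementation of the paper "Contrastive Learning for Prompt-based Few-shot Language Learners" by Yiren Jian, Chongyang Gao, and Soroush Vosoughi, accepted to NAACL 2022.

If you find this repository useful for your research, please consider citing the paper.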

@inproceedings{jian-etal-2022-contrastive,
    title = "Contrastive Learning for Prompt-based Few-shot Language Learners",
    author = "Jian, Yiren  and
      Gao, Chongyang  and
      Vosoughi, Soroush",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.naacl-main.408",
    pages = "5577--5587",
    abstract = "The impressive performance of GPT-3 using natural language prompts and in-context learning has inspired work on better fine-tuning of moderately-sized models under this paradigm. Following this line of work, we present a contrastive learning framework that clusters inputs from the same class for better generality of models trained with only limited examples. Specifically, we propose a supervised contrastive framework that clusters inputs from the same class under different augmented {``}views{''} and repel the ones from different classes. We create different {``}views{''} of an example by appending it with different language prompts and contextual demonstrations. Combining a contrastive loss with the standard masked language modeling (MLM) loss in prompt-based few-shot learners, the experimental results show that our method can improve over the state-of-the-art methods in a diverse set of 15 language tasks. Our framework makes minimal assumptions on the task or the base model, and can be applied to many recent methods with little modification.",
}

Our code borrows heavily from LM-BFF and SupCon (/src/losses.py).
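
For orientation, the supervised contrastive objective that SupCon is known for can be written in a few lines. What follows is a minimal pure-Python sketch of that loss and is not the code in /src/losses.py; the function names `l2_normalize` and `supcon_loss`, the default temperature of 0.07, and the toy embeddings are assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supcon_loss(embeddings, labels, temperature=0.07):
    """Supervised contrastive loss, averaged over anchors that have a positive."""
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class partner contributes nothing
        # Scaled similarities between the anchor and every other example.
        logits = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                  for a in range(n) if a != i}
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch (hypothetical): two examples per class, 2-d embeddings.
labels = [0, 0, 1, 1]
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # classes well separated
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]      # classes interleaved
print(supcon_loss(clustered, labels), supcon_loss(mixed, labels))
```

The loss is lower when examples of the same class sit close together and examples of different classes sit apart, which is the clustering behaviour the paper's abstract describes.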

Requirements

This repository has been tested with Ubuntu 18.04.5 LTS, Python 3.7, PyTorch 1.6.0, and CUDA 10.1. Experiments with RoBERTa-base require a 48 GB GPU, and experiments with RoBERTa-large require 4x 48 GB GPUs. We ran our experiments on Nvidia RTX-A6000 and RTX-8000, but Nvidia A100 with 40 GB should also work.

Download data

We use the pre-processed datasets from LM-BFF (SST-2, SST-5, MR, CR, MPQA, Subj, TREC, CoLA, MNLI, SNLI, QNLI, RTE, MRPC, QQP). LM-BFF provides scripts that download and prepare the datasets. Simply run the commands below.

cd data
bash download_dataset.sh

Then use the following command to generate the 16-shot datasets used in our study.

python tools/generate_k_shot_data.py
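
The idea behind k-shot data generation can be sketched as follows: for each random seed, draw k examples per class from the full training set. This is an illustration only and does not reproduce the logic of tools/generate_k_shot_data.py; the function name `sample_k_shot` and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def sample_k_shot(examples, k, seed):
    """Pick k examples per label, reproducibly for a given seed.

    `examples` is a list of (text, label) pairs.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    subset = []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < k:
            raise ValueError(f"label {label!r} has only {len(pool)} examples")
        subset.extend(rng.sample(pool, k))
    return subset


# Toy data standing in for a real training set (two classes, 50 examples each).
toy = [(f"sentence {i}", i % 2) for i in range(100)]
for seed in (13, 21, 42, 87, 100):
    split = sample_k_shot(toy, k=16, seed=seed)
    print(seed, len(split))
```

Using one seed per split keeps each few-shot subset reproducible while still letting results be averaged over several different subsets.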

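Running our fine-tuning

The primary prompts (templates) used for the tasks are pre-defined in run_experiments.sh. The auxiliary templates used when generating multiple views of the inputs for contrastive learning can be found in /auto_template/$TASK.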
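
As a rough sketch of how an "augmented view" might be assembled, assume a view is the input wrapped in a randomly chosen template, followed by one randomly chosen demonstration per class. The template strings, the `{text}` and `{word}` placeholder convention, the label words, and the function name `make_view` are assumptions for illustration and are not taken from the files in /auto_template/$TASK.

```python
import random


def make_view(text, templates, demos_by_label, label_words, rng):
    """Build one view: a random template plus one random demonstration per class."""
    template = rng.choice(templates)
    # The query keeps a mask token where the model must predict the label word.
    parts = [template.format(text=text, word="<mask>")]
    for label in sorted(demos_by_label):
        demo = rng.choice(demos_by_label[label])
        # Demonstrations reuse the same template with the label word filled in.
        parts.append(template.format(text=demo, word=label_words[label]))
    return " ".join(parts)


# Hypothetical templates, demonstrations, and label words.
templates = ["{text} It was {word}.", "{text} A {word} one.", "{text} All in all {word}."]
demos_by_label = {0: ["A dull film.", "Painfully slow."], 1: ["A joy to watch.", "Brilliant."]}
label_words = {0: "terrible", 1: "great"}

rng = random.Random(13)
view_a = make_view("The plot drags.", templates, demos_by_label, label_words, rng)
view_b = make_view("The plot drags.", templates, demos_by_label, label_words, rng)
print(view_a)
print(view_b)
```

Calling `make_view` repeatedly on the same input yields differently worded contexts for one example, and such views of the same class are what a contrastive loss would pull together.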

Assuming you have one GPU in your system, here is an example of running our fine-tuning on SST-5 (random templates and random demonstrations for the "augmented views" of the inputs).

for seed in 13 21 42 87 100   # random seeds for different train-test splits
do
    for bs in 40   # batch size
    do
        for lr in 1e-5    # learning rate for MLM loss
        do
            for supcon_lr in 1e-5    # learning rate for SupCon loss
            do
                TAG=exp \
                TYPE=prompt-demo \
                TASK=sst-5 \
                BS=$bs \
                LR=$lr \
                SupCon_LR=$supcon_lr \
                SEED=$seed \
                MODEL=roberta-base \
                bash run_experiment.sh
            done
        done
    done
done

rm -rf result/
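
The nested loops above amount to a grid over seeds, batch sizes, and the two learning rates, with each combination handed to run_experiment.sh through environment variables. As a sketch, the same grid could be enumerated in Python; the helper name `experiment_envs` is hypothetical, and the launch call is left commented out because it needs the repository checkout and the prepared data.

```python
import itertools
import os
import subprocess


def experiment_envs(seeds, batch_sizes, lrs, supcon_lrs, task, model,
                    few_shot_type="prompt-demo", tag="exp"):
    """Yield one environment-variable dict per point in the hyperparameter grid."""
    for seed, bs, lr, supcon_lr in itertools.product(seeds, batch_sizes, lrs, supcon_lrs):
        yield {
            "TAG": tag,
            "TYPE": few_shot_type,
            "TASK": task,
            "BS": str(bs),
            "LR": lr,
            "SupCon_LR": supcon_lr,
            "SEED": str(seed),
            "MODEL": model,
        }


grid = list(experiment_envs([13, 21, 42, 87, 100], [40], ["1e-5"], ["1e-5"],
                            task="sst-5", model="roberta-base"))
print(len(grid), grid[0])

# To actually launch (requires the repository and the generated k-shot data):
# for env in grid:
#     subprocess.run(["bash", "run_experiment.sh"], env={**os.environ, **env}, check=True)
```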

Our framework also applies to prompt-based methods without demonstrations, i.e., TYPE=prompt (in this case, we only randomly sample templates for generating the "augmented views"). The results are saved in log.

Using RoBERTa-large as the base model requires 4 GPUs, each with 48 GB of memory. You first need to edit line 20 of src/models.py so that it reads def __init__(self, hidden_size=1024).

for seed in 13 21 42 87 100   # random seeds for different train-test splits
do
    for bs in 10   # batch size for each GPU, total batch size is then 40
    do
        for lr in 1e-5    # learning rate for MLM loss
        do
            for supcon_lr in 1e-5    # learning rate for SupCon loss
            do
                TAG=exp \
                TYPE=prompt-demo \
                TASK=sst-5 \
                BS=$bs \
                LR=$lr \
                SupCon_LR=$supcon_lr \
                SEED=$seed \
                MODEL=roberta-large \
                bash run_experiment.sh
            done
        done
    done
done

rm -rf result/

Collecting results

python tools/gather_result.py --condition "{'tag': 'exp', 'task_name': 'sst-5', 'few_shot_type': 'prompt-demo'}"

This collects the results from log and computes the mean and standard deviation over the 5 train-test splits.
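
The aggregation step can be sketched as: parse the --condition string as a Python dictionary literal, keep the log entries whose fields match it, and report the mean and standard deviation of the metric. This is an illustration under stated assumptions and not the code of tools/gather_result.py; the log-entry layout, the field name `test_acc`, the use of the population standard deviation, and the numbers (placeholders, not results from the paper) are all made up for the example.

```python
import ast
import statistics


def gather(entries, condition_str, metric="test_acc"):
    """Return (mean, std, count) of `metric` over entries matching the condition."""
    condition = ast.literal_eval(condition_str)  # safely parses a dict literal
    matched = [e for e in entries
               if all(e.get(key) == value for key, value in condition.items())]
    scores = [e[metric] for e in matched]
    if not scores:
        raise ValueError("no log entries match the condition")
    return statistics.mean(scores), statistics.pstdev(scores), len(scores)


# Placeholder log entries, one per seed, plus one entry for a different task.
entries = [
    {"tag": "exp", "task_name": "sst-5", "few_shot_type": "prompt-demo",
     "seed": seed, "test_acc": acc}
    for seed, acc in zip((13, 21, 42, 87, 100), (0.48, 0.49, 0.50, 0.51, 0.52))
]
entries.append({"tag": "exp", "task_name": "trec", "few_shot_type": "prompt-demo",
                "seed": 13, "test_acc": 0.90})

cond = "{'tag': 'exp', 'task_name': 'sst-5', 'few_shot_type': 'prompt-demo'}"
mean, std, count = gather(entries, cond)
print(f"{count} runs: mean={mean:.3f}, std={std:.3f}")
```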

Contacts

For any questions, please contact the authors.

Acknowledgements

Thanks to LM-BFF and SupCon for the preliminary implementations.
