LLaMa2lang

LLaMa3 support has been added

LLaMa2lang v0.6

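This repository contains convenience scripts to fine-tune LLaMa3-8B (or any other foundation model) for chat in any language other than English. The rationale is that LLaMa3 is trained primarily on English data; it works to some degree in other languages, but its performance there is poor compared to English.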

Combine the power of fine-tuning with the power of RAG: check out our RAG Me Up repository on RAG, which can be used on top of models tuned with LLaMa2Lang.

TL;DR

 pip install -r requirements.txt

# Translate OASST1 to target language
python translate.py m2m target_lang checkpoint_location

# Combine the checkpoint files into a dataset
python combine_checkpoints.py input_folder output_location

# Finetune
python finetune.py tuned_model dataset_name instruction_prompt

# Optionally finetune with DPO (RLHF)
python finetune_dpo.py tuned_model dataset_name instruction_prompt

# Run inference
python run_inference.py model_name instruction_prompt input

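What it does

The process we follow to tune a foundation model such as LLaMa3 for a specific language is as follows:

1. We load a dataset that contains Q&A/instruction pairs.
2. We translate the entire dataset to a given target language.
3. We load the translated dataset and extract threads by recursively selecting each prompt together with only its highest-ranked answer, continuing through to subsequent prompts.
4. We turn the threads into prompts following a given template (customizable).
5. We use QLoRA and PEFT to fine-tune a base foundation model's instruct version on this dataset.
   - Optionally, we use QLoRA and PEFT to fine-tune with DPO to extend the model's capabilities further and teach it to prefer chosen answers over rejected ones. Note that your base dataset must contain this information.
   - As an alternative to DPO, you can achieve the same using ORPO.
6. We run inference using the newly trained model.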
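
The thread extraction step can be illustrated with a small, self-contained sketch. It is not the code used by finetune.py. It assumes messages are plain dicts keyed by the default column names listed in the usage section (message_id, parent_id, rank, text), and it assumes that a lower rank value means a better answer and that a missing rank sorts last.

```python
from collections import defaultdict


def extract_threads(messages):
    """Return one thread (a list of messages) per root prompt.

    Assumption: a lower rank value is better; a missing rank sorts last.
    """
    children = defaultdict(list)
    roots = []
    for msg in messages:
        if msg["parent_id"] is None:
            roots.append(msg)
        else:
            children[msg["parent_id"]].append(msg)

    def rank_key(msg):
        rank = msg.get("rank")
        return float("inf") if rank is None else rank

    threads = []
    for root in roots:
        thread = [root]
        current = root
        # Walk down the tree, keeping only the top-ranked reply at each level.
        while children[current["message_id"]]:
            current = min(children[current["message_id"]], key=rank_key)
            thread.append(current)
        threads.append(thread)
    return threads


messages = [
    {"message_id": "p1", "parent_id": None, "role": "prompter", "rank": None, "text": "Q1"},
    {"message_id": "a1", "parent_id": "p1", "role": "assistant", "rank": 1, "text": "worse answer"},
    {"message_id": "a2", "parent_id": "p1", "role": "assistant", "rank": 0, "text": "best answer"},
    {"message_id": "p2", "parent_id": "a2", "role": "prompter", "rank": None, "text": "Q2"},
]
threads = extract_threads(messages)
print([m["text"] for m in threads[0]])  # ['Q1', 'best answer', 'Q2']
```

Only one path per conversation tree survives, which keeps the fine-tuning data small and limited to the best-rated answers.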

Supported paradigms

Translation

OPUS
M2M
MADLAD
mBART
NLLB
Seamless (Large only)
Tower Instruct (can correct spelling mistakes)

Base datasets

The following have been tested, but potentially more will work:

OASST1
OASST2

Supported foundation models

LLaMa3
LLaMa2
Mistral
(Unofficial) Mixtral 8x7B

Roadmap

[L2L-6] Investigate interoperability with other libraries (Axolotl, llamacpp, unsloth)
[L2L-7] Allow for different quantizations next to QLoRA (GGUF, GPTQ, AWQ)
[L2L-10] Support extending the tokenizer and vocabulary

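Cost and runtime

The above process can be run entirely on a free Google Colab T4 GPU. The last step, however, only runs successfully with a short enough context window and a batch size of at most 2. In addition, the translation in step 2 takes about 36 hours in total for any given language, so it should be run in multiple sessions if you want to stick with a free Google Colab GPU.

Our fine-tuned models for step 5 were trained on an A40 on vast.ai; each model cost less than a dollar and completed in about 1.5 hours.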
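
Splitting a long translation run across sessions comes down to writing checkpoints and skipping the ones that already exist. The sketch below illustrates that idea only. The file naming (chunk_<offset>.json) and the translate_fn callback are hypothetical and do not describe how translate.py stores or resumes its checkpoints.

```python
import json
import tempfile
from pathlib import Path


def translate_in_checkpoints(records, translate_fn, checkpoint_dir, checkpoint_n=400):
    """Translate records in chunks of checkpoint_n, skipping chunks already on disk.

    Returns the number of checkpoint files written during this call.
    """
    out = Path(checkpoint_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = 0
    for start in range(0, len(records), checkpoint_n):
        target = out / f"chunk_{start}.json"  # hypothetical file naming
        if target.exists():
            continue  # this chunk was finished in an earlier session
        chunk = [translate_fn(r) for r in records[start:start + checkpoint_n]]
        target.write_text(json.dumps(chunk, ensure_ascii=False), encoding="utf-8")
        written += 1
    return written


with tempfile.TemporaryDirectory() as tmp:
    data = [f"sentence {i}" for i in range(5)]
    # str.upper stands in for a real translation model.
    first = translate_in_checkpoints(data, str.upper, tmp, checkpoint_n=2)
    second = translate_in_checkpoints(data, str.upper, tmp, checkpoint_n=2)
    print(first, second)  # 3 0
```

The second call writes nothing because every chunk is already on disk, which is the behaviour you want when a Colab session is interrupted and restarted.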

Usage

Make sure pytorch is installed and working in your environment (use of CUDA preferable): https://pytorch.org/get-started/locally/
Clone the repo and install the requirements.

pip install -r requirements.txt

Translate your base dataset to your designated target language.

 usage: translate.py [-h] [--quant8] [--quant4] [--base_dataset BASE_DATASET] [--base_dataset_text_field BASE_DATASET_TEXT_FIELD] [--base_dataset_lang_field BASE_DATASET_LANG_FIELD]
                    [--checkpoint_n CHECKPOINT_N] [--batch_size BATCH_SIZE] [--max_length MAX_LENGTH] [--cpu] [--source_lang SOURCE_LANG]
                    {opus,mbart,madlad,m2m,nllb,seamless_m4t_v2,towerinstruct} ... target_lang checkpoint_location

Translate an instruct/RLHF dataset to a given target language using a variety of translation models

positional arguments:
  {opus,mbart,madlad,m2m,nllb,seamless_m4t_v2,towerinstruct}
                        The model/architecture used for translation.
    opus                Translate the dataset using HelsinkiNLP OPUS models.
    mbart               Translate the dataset using mBART.
    madlad              Translate the dataset using Google's MADLAD models.
    m2m                 Translate the dataset using Facebook's M2M models.
    nllb                Translate the dataset using Facebook's NLLB models.
    seamless_m4t_v2     Translate the dataset using Facebook's SeamlessM4T-v2 multimodal models.
    towerinstruct       Translate the dataset using Unbabel's Tower Instruct. Make sure your target language is in the 10 languages supported by the model.
  target_lang           The target language. Make sure you use language codes defined by the translation model you are using.
  checkpoint_location   The folder the script will write (JSONized) checkpoint files to. Folder will be created if it doesn't exist.

options:
  -h, --help            show this help message and exit
  --quant8              Optional flag to load the translation model in 8 bits. Decreases memory usage, increases running time
  --quant4              Optional flag to load the translation model in 4 bits. Decreases memory usage, increases running time
  --base_dataset BASE_DATASET
                        The base dataset to translate, defaults to OpenAssistant/oasst1
  --base_dataset_text_field BASE_DATASET_TEXT_FIELD
                        The base dataset's column name containing the actual text to translate. Defaults to text
  --base_dataset_lang_field BASE_DATASET_LANG_FIELD
                        The base dataset's column name containing the language the source text was written in. Defaults to lang
  --checkpoint_n CHECKPOINT_N
                        An integer representing how often a checkpoint file will be written out. To start off, 400 is a reasonable number.
  --batch_size BATCH_SIZE
                        The batch size for a single translation model. Adjust based on your GPU capacity. Default is 10.
  --max_length MAX_LENGTH
                        How much tokens to generate at most. More tokens might be more accurate for lengthy input but creates a risk of running out of memory. Default is unlimited.
  --cpu                 Forces usage of CPU. By default GPU is taken if available.
  --source_lang SOURCE_LANG
                        Source language to select from OASST based on lang property of dataset

If you need more parameters for the different translation models, run:

 python translate.py [MODEL] -h

Be sure to specify model-specific parameters first, before the common parameters from the list above. Example calls:

# Using M2M with 4bit quantization and a different batch size to translate Dutch
python translate.py m2m nl ./output_nl --quant4 --batch_size 20

# Using madlad 7B with 8bit quantization for German with different max_length
python translate.py madlad --model_size 7b de ./output_de --quant8 --batch_size 5 --max_length 512

# Be sure to use target language codes that the model you use understands
python translate.py mbart xh_ZA ./output_xhosa
python translate.py nllb nld_Latn ./output_nl
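
If you launch translations from a script, a small helper can keep the argument order consistent. The sketch below is a hypothetical convenience wrapper, not part of the repository. It only mirrors the ordering used in the example calls above: model, model-specific parameters, target language, checkpoint location, then common parameters.

```python
import shlex


def build_translate_command(model, target_lang, checkpoint_location,
                            model_args=(), common_args=()):
    """Assemble a translate.py call in the same order as the example calls."""
    return ["python", "translate.py", model, *model_args,
            target_lang, checkpoint_location, *common_args]


cmd = build_translate_command(
    "madlad", "de", "./output_de",
    model_args=["--model_size", "7b"],
    common_args=["--quant8", "--batch_size", "5", "--max_length", "512"],
)
print(shlex.join(cmd))
# python translate.py madlad --model_size 7b de ./output_de --quant8 --batch_size 5 --max_length 512
```

The resulting list can be passed directly to subprocess.run without going through a shell.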

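Combine the JSON arrays from the checkpoint files into a Huggingface Dataset, then either write it to disk or publish it to Huggingface. The script tries to write to disk by default and falls back to publishing to Huggingface if the folder doesn't exist on disk. To publish to Huggingface, make sure your HF_TOKEN environment variable is set up as per the documentation.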

 usage: combine_checkpoints.py [-h] input_folder output_location

Combine checkpoint files from translation.

positional arguments:
  input_folder     The checkpoint folder used in translation, with the target language appended.
                   Example: "./output_nl".
  output_location  Where to write the Huggingface Dataset. Can be a disk location or a Huggingface
                   Dataset repository.

options:
  -h, --help       show this help message and exit
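
Conceptually, combining checkpoints means concatenating JSON arrays. The sketch below shows that idea with the standard library only. It assumes every checkpoint is a *.json file holding one JSON array and that all files sit directly in the input folder; the folder layout written by translate.py may differ. Converting the resulting list into a Huggingface Dataset (for example with datasets.Dataset.from_list) is left out here.

```python
import json
import tempfile
from pathlib import Path


def combine_checkpoints(input_folder):
    """Concatenate every JSON array found in input_folder into one list."""
    combined = []
    # Sorting keeps the result deterministic across runs.
    for path in sorted(Path(input_folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            combined.extend(json.load(handle))
    return combined


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.json").write_text(json.dumps([{"text": "hallo"}]), encoding="utf-8")
    Path(tmp, "b.json").write_text(json.dumps([{"text": "wereld"}]), encoding="utf-8")
    rows = combine_checkpoints(tmp)
    print(rows)  # [{'text': 'hallo'}, {'text': 'wereld'}]
```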

Turn the translated messages into chat/instruct/prompt threads and fine-tune a foundation model's instruct version using LoRA and PEFT.

 usage: finetune.py [-h] [--base_model BASE_MODEL] [--base_dataset_text_field BASE_DATASET_TEXT_FIELD] [--base_dataset_rank_field BASE_DATASET_RANK_FIELD] [--base_dataset_id_field BASE_DATASET_ID_FIELD] [--base_dataset_parent_field BASE_DATASET_PARENT_FIELD]
                   [--base_dataset_role_field BASE_DATASET_ROLE_FIELD] [--quant8] [--noquant] [--max_seq_length MAX_SEQ_LENGTH] [--num_train_epochs NUM_TRAIN_EPOCHS] [--batch_size BATCH_SIZE] [--threads_output_name THREADS_OUTPUT_NAME] [--thread_template THREAD_TEMPLATE]
                   [--padding PADDING]
                   tuned_model dataset_name instruction_prompt

Finetune a base instruct/chat model using (Q)LoRA and PEFT

positional arguments:
  tuned_model           The name of the resulting tuned model.
  dataset_name          The name of the dataset to use for fine-tuning. This should be the output of the combine_checkpoints script.
  instruction_prompt    An instruction message added to every prompt given to the chatbot to force it to answer in the target language. Example: "You are a generic chatbot that always answers in English."

options:
  -h, --help            show this help message and exit
  --base_model BASE_MODEL
                        The base foundation model. Default is "NousResearch/Meta-Llama-3-8B-Instruct".
  --base_dataset_text_field BASE_DATASET_TEXT_FIELD
                        The dataset's column name containing the actual text to translate. Defaults to text
  --base_dataset_rank_field BASE_DATASET_RANK_FIELD
                        The dataset's column name containing the rank of an answer given to a prompt. Defaults to rank
  --base_dataset_id_field BASE_DATASET_ID_FIELD
                        The dataset's column name containing the id of a text. Defaults to message_id
  --base_dataset_parent_field BASE_DATASET_PARENT_FIELD
                        The dataset's column name containing the parent id of a text. Defaults to parent_id
  --base_dataset_role_field BASE_DATASET_ROLE_FIELD
                        The dataset's column name containing the role of the author of the text (eg. prompter, assistant). Defaults to role
  --quant8              Finetunes the model in 8 bits. Requires more memory than the default 4 bit.
  --noquant             Do not quantize the finetuning. Requires more memory than the default 4 bit and optional 8 bit.
  --max_seq_length MAX_SEQ_LENGTH
                        The maximum sequence length to use in finetuning. Should most likely line up with your base model's default max_seq_length. Default is 512.
  --num_train_epochs NUM_TRAIN_EPOCHS
                        Number of epochs to use. 2 is default and has been shown to work well.
  --batch_size BATCH_SIZE
                        The batch size to use in finetuning. Adjust to fit in your GPU vRAM. Default is 4
  --threads_output_name THREADS_OUTPUT_NAME
                        If specified, the threads created in this script for finetuning will also be saved to disk or HuggingFace Hub.
  --thread_template THREAD_TEMPLATE
                        A file containing the thread template to use. Default is threads/template_fefault.txt
  --padding PADDING     What padding to use, can be either left or right.
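
The --quant8 and --noquant flags select between three precision modes, with 4 bit as the default. The sketch below shows one way to resolve them. The function name, the returned labels, and the choice to reject both flags together are assumptions made for illustration; the help text above does not say how finetune.py handles that combination.

```python
import argparse


def resolve_quantization(quant8=False, noquant=False):
    """Map the two flags onto a single mode; 4 bit is the default."""
    if quant8 and noquant:
        # Assumption: treat the combination as a user error.
        raise ValueError("--quant8 and --noquant cannot be combined")
    if noquant:
        return "none"
    if quant8:
        return "8bit"
    return "4bit"


parser = argparse.ArgumentParser()
parser.add_argument("--quant8", action="store_true")
parser.add_argument("--noquant", action="store_true")
args = parser.parse_args(["--quant8"])
print(resolve_quantization(args.quant8, args.noquant))  # 8bit
```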

6.1 [OPTIONAL] Finetune using DPO (similar to RLHF)

 usage: finetune_dpo.py [-h] [--base_model BASE_MODEL] [--base_dataset_text_field BASE_DATASET_TEXT_FIELD] [--base_dataset_rank_field BASE_DATASET_RANK_FIELD] [--base_dataset_id_field BASE_DATASET_ID_FIELD] [--base_dataset_parent_field BASE_DATASET_PARENT_FIELD] [--quant8]
                       [--noquant] [--max_seq_length MAX_SEQ_LENGTH] [--max_prompt_length MAX_PROMPT_LENGTH] [--num_train_epochs NUM_TRAIN_EPOCHS] [--batch_size BATCH_SIZE] [--threads_output_name THREADS_OUTPUT_NAME] [--thread_template THREAD_TEMPLATE] [--max_steps MAX_STEPS]
                       [--padding PADDING]
                       tuned_model dataset_name instruction_prompt

Finetune a base instruct/chat model using (Q)LoRA and PEFT using DPO (RLHF)

positional arguments:
  tuned_model           The name of the resulting tuned model.
  dataset_name          The name of the dataset to use for fine-tuning. This should be the output of the combine_checkpoints script.
  instruction_prompt    An instruction message added to every prompt given to the chatbot to force it to answer in the target language. Example: "You are a generic chatbot that always answers in English."

options:
  -h, --help            show this help message and exit
  --base_model BASE_MODEL
                        The base foundation model. Default is "NousResearch/Meta-Llama-3-8B-Instruct".
  --base_dataset_text_field BASE_DATASET_TEXT_FIELD
                        The dataset's column name containing the actual text to translate. Defaults to text
  --base_dataset_rank_field BASE_DATASET_RANK_FIELD
                        The dataset's column name containing the rank of an answer given to a prompt. Defaults to rank
  --base_dataset_id_field BASE_DATASET_ID_FIELD
                        The dataset's column name containing the id of a text. Defaults to message_id
  --base_dataset_parent_field BASE_DATASET_PARENT_FIELD
                        The dataset's column name containing the parent id of a text. Defaults to parent_id
  --quant8              Finetunes the model in 8 bits. Requires more memory than the default 4 bit.
  --noquant             Do not quantize the finetuning. Requires more memory than the default 4 bit and optional 8 bit.
  --max_seq_length MAX_SEQ_LENGTH
                        The maximum sequence length to use in finetuning. Should most likely line up with your base model's default max_seq_length. Default is 512.
  --max_prompt_length MAX_PROMPT_LENGTH
                        The maximum length of the prompts to use. Default is 512.
  --num_train_epochs NUM_TRAIN_EPOCHS
                        Number of epochs to use. 2 is default and has been shown to work well.
  --batch_size BATCH_SIZE
                        The batch size to use in finetuning. Adjust to fit in your GPU vRAM. Default is 4
  --threads_output_name THREADS_OUTPUT_NAME
                        If specified, the threads created in this script for finetuning will also be saved to disk or HuggingFace Hub.
  --thread_template THREAD_TEMPLATE
                        A file containing the thread template to use. Default is threads/template_fefault.txt
  --max_steps MAX_STEPS
                        The maximum number of steps to run DPO for. Default is -1 which will run the data through fully for the number of epochs but this will be very time-consuming.
  --padding PADDING     What padding to use, can be either left or right.
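
DPO needs to know which answer is preferred and which is rejected, which is why the base dataset must contain ranking information. The sketch below shows one possible way to derive such pairs. It is not the logic of finetune_dpo.py. It assumes messages are dicts keyed by the default column names from the usage section, that a lower rank is better, and it uses only the direct parent prompt as context, ignoring earlier turns.

```python
from collections import defaultdict


def build_preference_pairs(messages):
    """Pair each prompt with its best (chosen) and worst (rejected) ranked answer."""
    by_id = {m["message_id"]: m for m in messages}
    replies = defaultdict(list)
    for m in messages:
        if m["role"] == "assistant" and m.get("rank") is not None:
            replies[m["parent_id"]].append(m)

    pairs = []
    for parent_id, answers in replies.items():
        if len(answers) < 2 or parent_id not in by_id:
            continue  # no preference signal without two ranked answers
        ordered = sorted(answers, key=lambda m: m["rank"])
        pairs.append({
            "prompt": by_id[parent_id]["text"],
            "chosen": ordered[0]["text"],
            "rejected": ordered[-1]["text"],
        })
    return pairs


ranked_messages = [
    {"message_id": "p1", "parent_id": None, "role": "prompter", "rank": None, "text": "Q1"},
    {"message_id": "a1", "parent_id": "p1", "role": "assistant", "rank": 0, "text": "good"},
    {"message_id": "a2", "parent_id": "p1", "role": "assistant", "rank": 2, "text": "bad"},
    {"message_id": "a3", "parent_id": "p1", "role": "assistant", "rank": 1, "text": "okay"},
]
print(build_preference_pairs(ranked_messages))
# [{'prompt': 'Q1', 'chosen': 'good', 'rejected': 'bad'}]
```

Prompts with fewer than two ranked answers are skipped, since they carry no preference signal.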

6.1 [OPTIONAL] Finetune using ORPO (similar to RLHF)

 usage: finetune_orpo.py [-h] [--base_model BASE_MODEL] [--base_dataset_text_field BASE_DATASET_TEXT_FIELD] [--base_dataset_rank_field BASE_DATASET_RANK_FIELD] [--base_dataset_id_field BASE_DATASET_ID_FIELD] [--base_dataset_parent_field BASE_DATASET_PARENT_FIELD] [--quant8]
                        [--noquant] [--max_seq_length MAX_SEQ_LENGTH] [--max_prompt_length MAX_PROMPT_LENGTH] [--num_train_epochs NUM_TRAIN_EPOCHS] [--batch_size BATCH_SIZE] [--threads_output_name THREADS_OUTPUT_NAME] [--thread_template THREAD_TEMPLATE] [--max_steps MAX_STEPS]
                        [--padding PADDING]
                        tuned_model dataset_name instruction_prompt

Finetune a base instruct/chat model using (Q)LoRA and PEFT using ORPO (RLHF)

positional arguments:
  tuned_model           The name of the resulting tuned model.
  dataset_name          The name of the dataset to use for fine-tuning. This should be the output of the combine_checkpoints script.
  instruction_prompt    An instruction message added to every prompt given to the chatbot to force it to answer in the target language. Example: "You are a generic chatbot that always answers in English."

options:
  -h, --help            show this help message and exit
  --base_model BASE_MODEL
                        The base foundation model. Default is "NousResearch/Meta-Llama-3-8B-Instruct".
  --base_dataset_text_field BASE_DATASET_TEXT_FIELD
                        The dataset's column name containing the actual text to translate. Defaults to text
  --base_dataset_rank_field BASE_DATASET_RANK_FIELD
                        The dataset's column name containing the rank of an answer given to a prompt. Defaults to rank
  --base_dataset_id_field BASE_DATASET_ID_FIELD
                        The dataset's column name containing the id of a text. Defaults to message_id
  --base_dataset_parent_field BASE_DATASET_PARENT_FIELD
                        The dataset's column name containing the parent id of a text. Defaults to parent_id
  --quant8              Finetunes the model in 8 bits. Requires more memory than the default 4 bit.
  --noquant             Do not quantize the finetuning. Requires more memory than the default 4 bit and optional 8 bit.
  --max_seq_length MAX_SEQ_LENGTH
                        The maximum sequence length to use in finetuning. Should most likely line up with your base model's default max_seq_length. Default is 512.
  --max_prompt_length MAX_PROMPT_LENGTH
                        The maximum length of the prompts to use. Default is 512.
  --num_train_epochs NUM_TRAIN_EPOCHS
                        Number of epochs to use. 2 is default and has been shown to work well.
  --batch_size BATCH_SIZE
                        The batch size to use in finetuning. Adjust to fit in your GPU vRAM. Default is 4
  --threads_output_name THREADS_OUTPUT_NAME
                        If specified, the threads created in this script for finetuning will also be saved to disk or HuggingFace Hub.
  --thread_template THREAD_TEMPLATE
                        A file containing the thread template to use. Default is threads/template_fefault.txt
  --max_steps MAX_STEPS
                        The maximum number of steps to run ORPO for. Default is -1 which will run the data through fully for the number of epochs but this will be very time-consuming.
  --padding PADDING     What padding to use, can be either left or right.

Run inference using the newly created QLoRA model.

 usage: run_inference.py [-h] model_name instruction_prompt input

Script to run inference on a tuned model.

positional arguments:
  model_name          The name of the tuned model that you pushed to Huggingface in the previous
                      step.
  instruction_prompt  An instruction message added to every prompt given to the chatbot to force
                      it to answer in the target language.
  input               The actual chat input prompt. The script is only meant for testing purposes
                      and exits after answering.

options:
  -h, --help          show this help message and exit

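Choosing the right translation model

How do I know which translation model to choose for my target language?

We have you covered with our benchmark.py script, which helps you make a reasonable guess (the dataset we use is the same one the OPUS models were trained on, so the outcomes always favor OPUS). For usage, see the help of this script below. Models are loaded with 4-bit quantization and run on a small sample of the OPUS books subset.

Be sure to use the most commonly occurring languages in your base dataset as source_language and your target translation language as target_language. For OASST1, for example, be sure to run at least en and es as source languages.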

 usage: benchmark.py [-h] [--cpu] [--start START] [--n N] [--max_length MAX_LENGTH] source_language target_language included_models

Benchmark all the different translation models for a specific source and target language to find out which performs best. This uses 4bit quantization to limit GPU usage. Note:
the outcomes are indicative - you cannot assume corretness of the BLEU and CHRF scores but you can compare models against each other relatively.

positional arguments:
  source_language       The source language you want to test for. Check your dataset to see which occur most prevalent or use English as a good start.
  target_language       The source language you want to test for. This should be the language you want to apply the translate script on. Note: in benchmark, we use 2-character
                        language codes, in constrast to translate.py where you need to specify whatever your model expects.
  included_models       Comma-separated list of models to include. Allowed values are: opus, m2m_418m, m2m_1.2b, madlad_3b, madlad_7b, madlad_10b, madlad_7bbt, mbart,
                        nllb_distilled600m, nllb_1.3b, nllb_distilled1.3b, nllb_3.3b, seamless

options:
  -h, --help            show this help message and exit
  --cpu                 Forces usage of CPU. By default GPU is taken if available.
  --start START         The starting offset to include sentences from the OPUS books dataset from. Defaults to 0.
  --n N                 The number of sentences to benchmark on. Defaults to 100.
  --max_length MAX_LENGTH
                        How much tokens to generate at most. More tokens might be more accurate for lengthy input but creates a risk of running out of memory. Default is 512.
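
To get a feel for how relative comparison between models works, here is a simplified character n-gram F-score in the spirit of chrF. It is not the metric implementation used by benchmark.py and its values are not comparable to real chrF scores. The function names, the n-gram range, and the sample sentences are all illustrative assumptions.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count all character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def simple_chrf(hypothesis, reference, max_n=3, beta=2.0):
    """Average character n-gram F-beta score over n = 1..max_n."""
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # text shorter than n characters
        overlap = sum((hyp & ref).values())
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        scores.append((1 + beta**2) * precision * recall / (beta**2 * precision + recall))
    return sum(scores) / len(scores) if scores else 0.0


reference = "de kat zit op de mat"
candidates = {
    "model_a": "de kat zit op de mat",   # hypothetical model outputs
    "model_b": "een hond loopt buiten",
}
ranking = sorted(candidates, key=lambda name: simple_chrf(candidates[name], reference), reverse=True)
print(ranking)  # ['model_a', 'model_b']
```

As with the benchmark script, the absolute numbers matter less than the ordering they produce between models.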

Datasets and models

We have already created, and will continue to create, numerous datasets and models. Want to help democratize LLMs? Clone the repo, create datasets and models for other languages, then open a PR.

Translated oasst1 datasets


Dutch UnderstandLing/oasst1_nl	Spanish UnderstandLing/oasst1_es	French UnderstandLing/oasst1_fr	German UnderstandLing/oasst1_de
Catalan xaviviro/oasst1_ca	Portuguese UnderstandLing/oasst1_pt	Arabic HeshamHaroon/oasst-arabic	Italian UnderstandLing/oasst1_it
Russian UnderstandLing/oasst1_ru	Hindi UnderstandLing/oasst1_hi	Chinese UnderstandLing/oasst1_zh	Polish chrystians/oasst1_pl
Japanese UnderstandLing/oasst1_jap	Basque xezpeleta/oasst1_eu	Bengali UnderstandLing/oasst1_bn	Turkish UnderstandLing/oasst1_tr

Language-specific ❗LLaMa3-8B❗ chat model adapters

Before using these models, make sure you have access to Meta's LLaMa3-8B model and have set your HF_TOKEN.


UnderstandLing/Llama-3-8B-Instruct-nl Dutch	UnderstandLing/Llama-3-8B-Instruct-es Spanish	UnderstandLing/Llama-3-8B-Instruct-fr French	UnderstandLing/Llama-3-8B-Instruct-de German
UnderstandLing/Llama-3-8B-Instruct-pt Portuguese	UnderstandLing/Llama-3-8B-Instruct-it Italian	UnderstandLing/Llama-3-8B-Instruct-hi Hindi	UnderstandLing/Llama-3-8B-Instruct-ru Russian

Translated LLaMa2 thread chat prompt datasets


Dutch UnderstandLing/oasst1_nl_threads	Spanish UnderstandLing/oasst1_es_threads	French UnderstandLing/oasst1_fr_threads	German UnderstandLing/oasst1_de_threads
Catalan xaviviro/oasst1_ca_threads	Portuguese UnderstandLing/oasst1_pt_threads	Arabic HeshamHaroon/oasst-arabic_threads	Italian UnderstandLing/oasst1_it_threads
Russian UnderstandLing/oasst1_ru_threads	Hindi UnderstandLing/oasst1_hi_threads	Chinese UnderstandLing/oasst1_zh_threads	Polish chrystians/oasst1_pl_threads
Japanese UnderstandLing/oasst1_jap_threads	Basque xezpeleta/oasst1_eu_threads	Bengali UnderstandLing/oasst1_bn_threads	Turkish UnderstandLing/oasst1_tr_threads

Language-specific LLaMa2-7B chat model adapters


UnderstandLing/llama-2-7b-chat-nl Dutch	UnderstandLing/llama-2-7b-chat-es Spanish	UnderstandLing/llama-2-7b-chat-fr French	UnderstandLing/llama-2-7b-chat-de German
xaviviro/llama-2-7b-chat-ca Catalan	UnderstandLing/llama-2-7b-chat-pt Portuguese	HeshamHaroon/llama-2-7b-chat-ar Arabic	UnderstandLing/llama-2-7b-chat-it Italian
UnderstandLing/llama-2-7b-chat-ru Russian	UnderstandLing/llama-2-7b-chat-hi Hindi	UnderstandLing/llama-2-7b-chat-zh Chinese	chrystians/llama-2-7b-chat-pl-polish-polski Polish
xezpeleta/llama-2-7b-chat-eu Basque	UnderstandLing/llama-2-7b-chat-bn Bengali	UnderstandLing/llama-2-7b-chat-tr Turkish

Language-specific Mistral chat model adapters


UnderstandLing/Mistral-7B-Instruct-v0.2-nl Dutch	UnderstandLing/Mistral-7B-Instruct-v0.2-es Spanish	UnderstandLing/Mistral-7B-Instruct-v0.2-de German

Language-specific LLaMa2-13B chat model adapters


UnderstandLing/llama-2-13b-chat-nl Dutch	UnderstandLing/llama-2-13b-chat-es Spanish	UnderstandLing/llama-2-13b-chat-fr French

Language-specific Mixtral-8x7B chat model adapters


UnderstandLing/Mixtral-8x7B-Instruct-nl Dutch

Empirical performance

Dutch

<s>[INST] <<SYS>> Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft. <</SYS>> Wat is de hoofdstad van Nederland? [/INST] Amsterdam</s>

<s>[INST] <<SYS>> Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft. <</SYS>> Wat is de hoofdstad van Nederland? [/INST] Amsterdam</s><s>[INST] Hoeveel inwoners heeft die stad? [/INST] 850 duizend inwoners (2023)</s>

<s>[INST] <<SYS>> Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft. <</SYS>> Wie is de minister-president van Nederland? [/INST] Mark Rutte is sinds 2010 minister-president van Nederland. Hij is meerdere keren herkozen.</s>

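The Dutch examples above follow the LLaMa2 chat layout, with the instruction prompt wrapped in <<SYS>> markers on the first turn only. The sketch below reproduces that layout for a list of (user, assistant) turns. The function name is hypothetical, and the repository's actual thread template is customizable and may differ for other base models.

```python
def render_llama2_thread(instruction_prompt, turns):
    """Render (user, assistant) turns in the layout of the Dutch examples."""
    parts = []
    for index, (user, assistant) in enumerate(turns):
        # The system block is only included in the first turn.
        system = f"<<SYS>> {instruction_prompt} <</SYS>> " if index == 0 else ""
        parts.append(f"<s>[INST] {system}{user} [/INST] {assistant}</s>")
    return "".join(parts)


rendered = render_llama2_thread(
    "Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft.",
    [
        ("Wat is de hoofdstad van Nederland?", "Amsterdam"),
        ("Hoeveel inwoners heeft die stad?", "850 duizend inwoners (2023)"),
    ],
)
print(rendered)
```
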
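FAQ

Q: Why do you translate the full OASST1/2 dataset first? Wouldn't it be faster to translate only the highest-ranked threads?
A: Creating the threads first and then translating them would save quite a lot of throughput time, but we provide full OASST1/2 translations to the community because we believe they can be useful on their own.
Q: How well do the fine-tunes perform compared to vanilla LLaMa3?
A: We do not have formal benchmarks, but getting LLaMa3 to consistently speak a language other than English in the first place is challenging, if not impossible. The non-English output it does produce is often grammatically broken. Our fine-tunes do not show this behavior.
Q: Can I use other frameworks for fine-tuning?
A: Yes, you can. We use Axolotl for training on multi-GPU setups.
Q: Can I mix different translation models?
A: Absolutely. We think running the translation with multiple models might even improve performance. To do so, stop a translation early and continue from the checkpoints by re-running the translate script with a different translation model.

Funding

We are actively looking for funding to democratize AI and advance its applications. If you want to invest, contact us at [email protected].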
