KnowAgent

Knowledge-Augmented Planning for LLM-Based Agents.

Paper • Web

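Our development is grounded on several key steps. First, we create an extensive action knowledge base that consolidates action planning knowledge pertinent to specific tasks. This knowledge base acts as an external reservoir of information that steers the model's action generation process. Next, by converting the action knowledge into text, we enable the model to understand it deeply and use it when creating action trajectories. Finally, through a knowledgeable self-learning phase, we use trajectories developed from the model's iterative processes to continually improve its understanding and application of action knowledge. This process not only strengthens the agents' planning capabilities but also enhances their potential for application in complex situations.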
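
As a rough illustration of the first two steps, the sketch below stores action knowledge as a table of allowed action transitions, renders it as text for a prompt, and checks a candidate action path against it. The action names, the rule table, and both helper functions are hypothetical and are not taken from the KnowAgent code.

```python
# Hypothetical action knowledge base: each action maps to the actions
# that are allowed to follow it. The names are illustrative only.
ACTION_RULES = {
    "Start": ["Search", "Retrieve"],
    "Retrieve": ["Retrieve", "Search", "Lookup", "Finish"],
    "Search": ["Search", "Retrieve", "Lookup", "Finish"],
    "Lookup": ["Lookup", "Search", "Retrieve", "Finish"],
}


def knowledge_to_text(rules):
    """Render the rules as plain text that could be placed in a prompt."""
    lines = []
    for action, successors in rules.items():
        lines.append(f"{action} -> {', '.join(successors)}")
    return "\n".join(lines)


def is_valid_path(rules, actions):
    """Check that every step in `actions` follows an allowed transition."""
    previous = "Start"
    for action in actions:
        if action not in rules.get(previous, []):
            return False
        previous = action
    return True


print(knowledge_to_text(ACTION_RULES))
print(is_valid_path(ACTION_RULES, ["Search", "Lookup", "Finish"]))  # True
print(is_valid_path(ACTION_RULES, ["Lookup", "Finish"]))            # False
```

Keeping the rules in a plain dictionary means the same table can serve both as prompt text and as a validity check on generated paths.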

Table of Contents

- News
- Installation
- Planning Path Generation
- Knowledgeable Self-Learning
- Citation
- Acknowledgement

News

[2024-03] We released a new paper: "KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents".

Installation

To get started with KnowAgent, follow these simple installation steps:

git clone https://github.com/zjunlp/KnowAgent.git
cd KnowAgent
pip install -r requirements.txt

The HotpotQA and ALFWorld datasets are placed under Path_Generation/hotpotqa_run/data and Path_Generation/alfworld_run/data, respectively. For further configuration, we recommend following the original setup instructions of ALFWorld and FastChat.
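
Before running anything, it can help to confirm that the dataset directories are where the scripts expect them. The following is a minimal sketch, assuming it is run from the repository root; the helper name `missing_data_dirs` is made up for this example.

```python
from pathlib import Path

# Dataset directories named in the installation notes, relative to the
# repository root.
DATA_DIRS = [
    "Path_Generation/hotpotqa_run/data",
    "Path_Generation/alfworld_run/data",
]


def missing_data_dirs(repo_root="."):
    """Return the expected dataset directories that do not exist."""
    root = Path(repo_root)
    return [d for d in DATA_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All dataset directories found.")
```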

Planning Path Generation

Planning path generation is essential to KnowAgent. The scripts for running it are in the Path_Generation directory, specifically run_alfworld.sh and run_hotpotqa.sh, and can be executed with bash. To adapt them to your needs, change the mode parameter to switch between training (train) and testing (test), or change the llm_name parameter to use a different LLM.

cd Path_Generation

# For training with HotpotQA
python run_hotpotqa.py --llm_name llama-2-13b --max_context_len 4000 --mode train --output_path ../Self-Learning/trajs/

# For testing with HotpotQA
python run_hotpotqa.py --llm_name llama-2-13b --max_context_len 4000 --mode test --output_path output/
    
# For training with ALFWorld
python alfworld_run/run_alfworld.py --llm_name llama-2-13b --mode train --output_path ../Self-Learning/trajs/

# For testing with ALFWorld
python alfworld_run/run_alfworld.py --llm_name llama-2-13b --mode test --output_path output/

The trajectories synthesized by Llama-{7,13,70}b-chat, before filtering, are released on Google Drive.
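
The four commands in this section differ only in task, mode, and output path, so they can be assembled programmatically when sweeping over several models. The sketch below uses only the flags shown above; the `build_command` helper and the `SCRIPTS` table are illustrative and not part of the repository.

```python
import shlex

# Entry points as they appear in the commands above, relative to Path_Generation.
SCRIPTS = {
    "hotpotqa": "run_hotpotqa.py",
    "alfworld": "alfworld_run/run_alfworld.py",
}


def build_command(task, mode, llm_name="llama-2-13b"):
    """Assemble the argument list for one path generation run."""
    if task not in SCRIPTS:
        raise ValueError(f"unknown task: {task}")
    if mode not in ("train", "test"):
        raise ValueError(f"unknown mode: {mode}")
    cmd = ["python", SCRIPTS[task], "--llm_name", llm_name]
    if task == "hotpotqa":
        # Only the HotpotQA commands above pass a context length.
        cmd += ["--max_context_len", "4000"]
    # Training trajectories go to the self-learning directory.
    output_path = "../Self-Learning/trajs/" if mode == "train" else "output/"
    cmd += ["--mode", mode, "--output_path", output_path]
    return cmd


for task in SCRIPTS:
    for mode in ("train", "test"):
        print(shlex.join(build_command(task, mode)))
```

The returned list can be passed to `subprocess.run(cmd, cwd="Path_Generation", check=True)` to launch a run from the repository root.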

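Knowledgeable Self-Learning

After obtaining the planning paths and their corresponding trajectories, the knowledgeable self-learning process begins. The generated trajectories must first be converted to the Alpaca format using the scripts in the Self-Learning directory.

For the first iteration, follow the steps outlined in traj_reformat.sh: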

cd Self-Learning
# For HotpotQA
python train/Hotpotqa_reformat.py --input_path trajs/KnowAgentHotpotQA_llama-2-13b.jsonl --output_path train/datas

# For ALFWorld
python train/ALFWorld_reformat.py --input_path trajs/KnowAgentALFWorld_llama-2-13b.jsonl --output_path train/datas
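
For orientation, an Alpaca-format example is a JSON object with `instruction`, `input`, and `output` fields. The sketch below shows what such a conversion could look like for a single JSONL record. The `prompt` and `trajectory` field names are assumptions made for this example; the schema actually read by the reformat scripts may differ.

```python
import json


def to_alpaca(record):
    """Map one trajectory record to an Alpaca-style training example.

    The "prompt" and "trajectory" field names are assumptions made for
    this sketch; the real JSONL schema may differ.
    """
    return {
        "instruction": record["prompt"],
        "input": "",
        "output": record["trajectory"],
    }


def convert_lines(jsonl_lines):
    """Convert an iterable of JSONL lines, skipping blank lines."""
    return [to_alpaca(json.loads(line)) for line in jsonl_lines if line.strip()]


# Placeholder record; the ellipses stand in for real prompt and trajectory text.
sample = [
    json.dumps({"prompt": "Question: ...", "trajectory": "Thought 1: ..."}),
    "",
]
print(json.dumps(convert_lines(sample), indent=2))
```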

For subsequent iterations, knowledge-based trajectory filtering and merging must be performed before running the trajectory reformatting script. This can be done with traj_merge_and_filter.sh:

python trajs/traj_merge_and_filter.py \
    --task HotpotQA \
    --input_path1 trajs/datas/KnowAgentHotpotQA_llama-2-13b_D0.jsonl \
    --input_path2 trajs/datas/KnowAgentHotpotQA_llama-2-13b_D1.jsonl \
    --output_path trajs/datas
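
This README does not spell out the filtering and merging criteria, so the following is only a sketch of one plausible scheme: drop trajectories whose actions fall outside an allowed set or do not end with a finishing action, then keep the highest-reward trajectory per task across the two iterations. The record keys (`id`, `actions`, `reward`), the action set, and both functions are hypothetical.

```python
def merge_and_filter(d0, d1, is_valid):
    """Merge two trajectory sets, keeping one valid trajectory per task.

    Each record is a dict with hypothetical keys "id", "actions" and
    "reward". When both sets contain the same id, the record with the
    higher reward wins; records that fail `is_valid` are dropped.
    """
    best = {}
    for record in list(d0) + list(d1):
        if not is_valid(record):
            continue
        current = best.get(record["id"])
        if current is None or record["reward"] > current["reward"]:
            best[record["id"]] = record
    return list(best.values())


ALLOWED = {"Search", "Retrieve", "Lookup", "Finish"}  # illustrative action set


def ends_with_finish(record):
    """Accept only trajectories that use allowed actions and end with Finish."""
    actions = record["actions"]
    return bool(actions) and actions[-1] == "Finish" and set(actions) <= ALLOWED


d0 = [
    {"id": "q1", "actions": ["Search", "Finish"], "reward": 0.5},
    {"id": "q2", "actions": ["Search", "Jump"], "reward": 1.0},
]
d1 = [
    {"id": "q1", "actions": ["Retrieve", "Finish"], "reward": 1.0},
    {"id": "q3", "actions": ["Lookup", "Finish"], "reward": 1.0},
]
merged = merge_and_filter(d0, d1, ends_with_finish)
print([r["id"] for r in merged])
```

Here `q2` is dropped because it contains an action outside the allowed set, and the `q1` trajectory from the second set replaces the lower-reward one from the first.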

Next, start the self-learning process by running train.sh and train_iter.sh, located at Self-Learning/train.sh and Self-Learning/train_iter.sh:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 deepspeed train/train_lora.py \
    --model_name_or_path llama-2-13b-chat \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --data_path datas/data_knowagent.json \
    --output_dir models/Hotpotqa/M1 \
    --num_train_epochs 5 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 10000 \
    --save_total_limit 1 \
    --learning_rate 1e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fp16 True \
    --model_max_length 4096 \
    --gradient_checkpointing True \
    --q_lora False \
    --deepspeed /data/zyq/FastChat/playground/deepspeed_config_s3.json \
    --resume_from_checkpoint False
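
When adapting this command to a different number of GPUs, two derived quantities are worth recomputing: the effective batch size per optimizer step and the LoRA scaling factor. The short calculation below uses only values from the command above and assumes the standard LoRA formulation, in which the low-rank update is scaled by alpha / r.

```python
# Values copied from the training command above.
num_gpus = len("0,1,2,3,4,5,6,7".split(","))  # CUDA_VISIBLE_DEVICES
per_device_train_batch_size = 2
gradient_accumulation_steps = 1
lora_r = 8
lora_alpha = 16

# Sequences consumed per optimizer step across all devices.
effective_batch_size = (
    num_gpus * per_device_train_batch_size * gradient_accumulation_steps
)

# In the standard LoRA formulation the low-rank update is scaled by alpha / r.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size)  # 16
print(lora_scaling)          # 2.0
```

With fewer GPUs, raising `--gradient_accumulation_steps` by the same factor keeps the effective batch size at 16.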

Citation

@article{zhu2024knowagent,
  title={KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents},
  author={Zhu, Yuqi and Qiao, Shuofei and Ou, Yixin and Deng, Shumin and Zhang, Ningyu and Lyu, Shiwei and Shen, Yue and Liang, Lei and Gu, Jinjie and Chen, Huajun},
  journal={arXiv preprint arXiv:2403.03101},
  year={2024}
}

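Acknowledgement

We thank the creators and contributors of the following projects, which have significantly influenced the development of KnowAgent:
- FastChat: our training module code is adapted from FastChat. Integration with open models through LangChain is also facilitated via FastChat.
- BOLAA: the inference module code is implemented based on BOLAA.
- Additional baseline code from ReAct, Reflexion, and FireAct has been used, showcasing a diverse range of approaches and methodologies.
Our heartfelt thanks go out to all contributors for their invaluable contributions to the field.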