KnowAgent

Knowledge-Augmented Planning for LLM-Based Agents.

Paper • Web

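Our development is grounded in several key steps. First, we build an extensive action knowledge base that consolidates action planning knowledge for specific tasks. This knowledge base serves as an external reservoir of information and steers the model's action generation process. Next, by converting the action knowledge into text, we enable the model to understand it deeply and use it when generating action trajectories. Finally, through a knowledgeable self-learning phase, we use the trajectories produced over the model's iterations to continually improve its understanding and application of action knowledge. This process not only strengthens the agent's planning ability but also improves its potential for application in complex situations.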
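To make the first two steps concrete, here is a minimal sketch of an action knowledge base and its conversion to text. The rule table, the action names, and the helper functions `rules_to_text` and `is_valid_trajectory` are illustrative assumptions and are not taken from the KnowAgent code.

```python
# Hypothetical action knowledge base: each action maps to the actions that
# may follow it. These rules are illustrative assumptions, not the project's.
ACTION_RULES = {
    "Start": ["Search", "Retrieve"],
    "Search": ["Search", "Retrieve", "Lookup", "Finish"],
    "Retrieve": ["Retrieve", "Search", "Lookup", "Finish"],
    "Lookup": ["Lookup", "Search", "Retrieve", "Finish"],
}


def rules_to_text(rules):
    """Render the rule table as plain-text lines suitable for a prompt."""
    lines = []
    for action, successors in rules.items():
        lines.append(f"{action} => ({', '.join(successors)})")
    return "\n".join(lines)


def is_valid_trajectory(actions, rules):
    """Return True if every consecutive pair of actions is allowed."""
    previous = "Start"
    for action in actions:
        # An action with no entry in the table (e.g. Finish) has no successors.
        if action not in rules.get(previous, []):
            return False
        previous = action
    return True


if __name__ == "__main__":
    print(rules_to_text(ACTION_RULES))
    print(is_valid_trajectory(["Search", "Lookup", "Finish"], ACTION_RULES))
```

The same table can serve two purposes in such a design: its text form goes into the prompt, and the validity check can later be reused to screen generated trajectories.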

Table of Contents

News
Installation
Planning Path Generation
Knowledgeable Self-Learning
Citation
Acknowledgement

News

[2024-03] We release a new paper: "KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents".

Installation

To get started with KnowAgent, follow these simple installation steps:

git clone https://github.com/zjunlp/KnowAgent.git
cd KnowAgent
pip install -r requirements.txt

The HotpotQA and ALFWorld datasets are placed under Path_Generation/hotpotqa_run/data and Path_Generation/alfworld_run/data, respectively. For further configuration, we recommend following the original setup instructions of ALFWorld and FastChat.

Planning Path Generation

Planning path generation is an essential part of KnowAgent. The scripts for running it are in the Path_Generation directory, specifically run_alfworld.sh and run_hotpotqa.sh, and can be executed with bash. To adapt them to your needs, set the mode parameter to switch between training (train) and testing (test), and change the llm_name parameter to use a different LLM.

cd Path_Generation

# For training with HotpotQA
python run_hotpotqa.py --llm_name llama-2-13b --max_context_len 4000 --mode train --output_path ../Self-Learning/trajs/

# For testing with HotpotQA
python run_hotpotqa.py --llm_name llama-2-13b --max_context_len 4000 --mode test --output_path output/
    
# For training with ALFWorld
python alfworld_run/run_alfworld.py --llm_name llama-2-13b --mode train --output_path ../Self-Learning/trajs/

# For testing with ALFWorld
python alfworld_run/run_alfworld.py --llm_name llama-2-13b --mode test --output_path output/
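The four commands above differ only in the task, the mode, and the output path. The sketch below assembles the same command lines programmatically, using only the flags shown above. The helper `build_command` is a hypothetical convenience wrapper, not part of the repository, and it assumes it is used from inside Path_Generation.

```python
import shlex


def build_command(task, mode, llm_name="llama-2-13b"):
    """Return the argv list for one path-generation run.

    `task` is "hotpotqa" or "alfworld"; `mode` is "train" or "test".
    Output paths mirror the commands in this README.
    """
    if mode not in ("train", "test"):
        raise ValueError(f"unknown mode: {mode}")
    # Training trajectories feed the self-learning stage; test output stays local.
    output_path = "../Self-Learning/trajs/" if mode == "train" else "output/"
    if task == "hotpotqa":
        argv = ["python", "run_hotpotqa.py", "--llm_name", llm_name,
                "--max_context_len", "4000"]
    elif task == "alfworld":
        argv = ["python", "alfworld_run/run_alfworld.py", "--llm_name", llm_name]
    else:
        raise ValueError(f"unknown task: {task}")
    return argv + ["--mode", mode, "--output_path", output_path]


if __name__ == "__main__":
    for task in ("hotpotqa", "alfworld"):
        for mode in ("train", "test"):
            print(shlex.join(build_command(task, mode)))
```

The returned list can be passed directly to `subprocess.run` if you prefer to launch the runs from Python rather than from a shell.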

We release the trajectories synthesized by Llama-{7,13,70}b-chat, prior to filtering, on Google Drive.

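♟️Knowledgeable Self-Learning

After obtaining the planning paths and their corresponding trajectories, the knowledgeable self-learning process begins. The generated trajectories must first be converted to the Alpaca format using the scripts in the Self-Learning directory.

For the initial iteration, follow the steps outlined in traj_reformat.sh.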

cd Self-Learning
# For HotpotQA
python train/Hotpotqa_reformat.py --input_path trajs/KnowAgentHotpotQA_llama-2-13b.jsonl --output_path train/datas

# For ALFWorld
python train/ALFWorld_reformat.py --input_path trajs/KnowAgentALFWorld_llama-2-13b.jsonl --output_path train/datas
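For orientation, an Alpaca-style training example is a JSON object with the keys `instruction`, `input`, and `output`. The sketch below shows what such a conversion could look like. The input field names `prompt`, `question`, and `trajectory` are hypothetical, and the actual reformat scripts may read different fields and apply additional processing.

```python
import json


def to_alpaca(record):
    """Map one trajectory record to an Alpaca-style example.

    The input keys ("prompt", "question", "trajectory") are assumptions
    made for illustration; the real trajectory files may differ.
    """
    return {
        "instruction": record["prompt"],
        "input": record["question"],
        "output": record["trajectory"],
    }


def convert_jsonl(lines):
    """Convert an iterable of JSONL lines, skipping blank lines."""
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        examples.append(to_alpaca(json.loads(line)))
    return examples


if __name__ == "__main__":
    sample = ['{"prompt": "Solve the task.", "question": "Q?", '
              '"trajectory": "Thought 1: ..."}']
    print(json.dumps(convert_jsonl(sample), indent=2))
```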

For subsequent iterations, you must perform knowledge-based trajectory filtering and merging before running the trajectory reformatting script. This can be done with traj_merge_and_filter.sh.

python trajs/traj_merge_and_filter.py \
    --task HotpotQA \
    --input_path1 trajs/datas/KnowAgentHotpotQA_llama-2-13b_D0.jsonl \
    --input_path2 trajs/datas/KnowAgentHotpotQA_llama-2-13b_D1.jsonl \
    --output_path trajs/datas
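As a rough mental model of this step, the sketch below merges two sets of trajectories and filters them. It assumes that filtering drops trajectories failing a validity check and that merging keeps the shortest valid trajectory per task. Both criteria, the record fields `task_id` and `actions`, and the check `ends_with_finish` are assumptions for illustration; the logic in traj_merge_and_filter.py may differ.

```python
def ends_with_finish(actions):
    """Hypothetical validity check: a trajectory must end with Finish."""
    return bool(actions) and actions[-1] == "Finish"


def merge_and_filter(trajs_a, trajs_b, is_valid):
    """Keep, per task id, the shortest trajectory that passes `is_valid`."""
    best = {}
    for record in list(trajs_a) + list(trajs_b):
        if not is_valid(record["actions"]):
            continue  # filtering: drop trajectories that fail the check
        current = best.get(record["task_id"])
        # merging: prefer the shorter of two valid trajectories for a task
        if current is None or len(record["actions"]) < len(current["actions"]):
            best[record["task_id"]] = record
    return list(best.values())


if __name__ == "__main__":
    d0 = [{"task_id": 1, "actions": ["Search", "Lookup", "Finish"]},
          {"task_id": 2, "actions": ["Search"]}]
    d1 = [{"task_id": 1, "actions": ["Search", "Finish"]},
          {"task_id": 2, "actions": ["Retrieve", "Finish"]}]
    for record in merge_and_filter(d0, d1, ends_with_finish):
        print(record)
```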

Next, start the self-learning process by running train.sh and train_iter.sh, located at Self-Learning/train.sh and Self-Learning/train_iter.sh.

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 deepspeed train/train_lora.py \
    --model_name_or_path llama-2-13b-chat \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --data_path datas/data_knowagent.json \
    --output_dir models/Hotpotqa/M1 \
    --num_train_epochs 5 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 10000 \
    --save_total_limit 1 \
    --learning_rate 1e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fp16 True \
    --model_max_length 4096 \
    --gradient_checkpointing True \
    --q_lora False \
    --deepspeed /data/zyq/FastChat/playground/deepspeed_config_s3.json \
    --resume_from_checkpoint False
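Two derived quantities help when adapting this command to different hardware. The sketch below computes the effective global batch size, assuming one data-parallel process per visible GPU, and the LoRA scaling factor under the standard LoRA formulation (alpha divided by rank). The function names are hypothetical helpers and are not part of the training script.

```python
def effective_batch_size(cuda_visible_devices, per_device_batch, grad_accum):
    """Global batch size, assuming one data-parallel worker per visible GPU."""
    num_gpus = len(cuda_visible_devices.split(","))
    return num_gpus * per_device_batch * grad_accum


def lora_scaling(lora_alpha, lora_r):
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return lora_alpha / lora_r


if __name__ == "__main__":
    # Values taken from the command above.
    print(effective_batch_size("0,1,2,3,4,5,6,7", 2, 1))
    print(lora_scaling(16, 8))
```

If you train on fewer GPUs, raising gradient_accumulation_steps by the same factor keeps the effective batch size unchanged under these assumptions.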

Citation

@article{zhu2024knowagent,
  title={KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents},
  author={Zhu, Yuqi and Qiao, Shuofei and Ou, Yixin and Deng, Shumin and Zhang, Ningyu and Lyu, Shiwei and Shen, Yue and Liang, Lei and Gu, Jinjie and Chen, Huajun},
  journal={arXiv preprint arXiv:2403.03101},
  year={2024}
}

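Acknowledgement

We thank the creators and contributors of the following projects, which have significantly influenced the development of KnowAgent:
- FastChat: Our training module code is adapted from FastChat. FastChat also facilitates integration with open models through LangChain; see the LangChain and FastChat integration documentation for details.
- BOLAA: The inference module code is implemented based on BOLAA.
- Additional baseline code from ReAct, Reflexion, FireAct, and others is used to showcase a variety of approaches and methodologies.
Our heartfelt thanks go to all contributors for their invaluable contributions to the field!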