GreaseLM: Graph REASoning Enhanced Language Models for Question Answering

This repository provides the source code and data for the paper GreaseLM: Graph REASoning Enhanced Language Models for Question Answering (ICLR 2022 spotlight). If you use our code, processed data, or pretrained models, please cite:

 @inproceedings{zhang2021greaselm,
  title={GreaseLM: Graph REASoning Enhanced Language Models},
  author={Zhang, Xikun and Bosselut, Antoine and Yasunaga, Michihiro and Ren, Hongyu and Liang, Percy and Manning, Christopher D and Leskovec, Jure},
  booktitle={International Conference on Learning Representations},
  year={2021}
}

[Figure: GreaseLM model architecture]

1. Dependencies

Python == 3.8
PyTorch == 1.8.0
transformers == 3.4.0
torch-geometric == 1.7.0

Run the following commands to create a conda environment (assuming CUDA 10.1):

conda create -y -n GreaseLM python=3.8
conda activate GreaseLM
pip install numpy==1.18.3 tqdm
pip install torch==1.8.0+cu101 torchvision -f https://download.pytorch.org/whl/torch_stable.html
pip install transformers==3.4.0 nltk spacy
pip install wandb
conda install -y -c conda-forge tensorboardx
conda install -y -c conda-forge tensorboard

# for torch-geometric
pip install torch-scatter==2.0.7 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-cluster==1.5.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-sparse==0.6.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-spline-conv==1.2.1 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-geometric==1.7.0 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
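
Because the pins above are strict, it can help to confirm what actually got installed before training. The following is a minimal sketch, not part of the repository: the helper names `installed_versions` and `find_mismatches` are hypothetical, and the pin table simply copies a few versions from the commands above.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINNED = {
    "torch": "1.8.0",
    "transformers": "3.4.0",
    "torch-geometric": "1.7.0",
    "torch-scatter": "2.0.7",
    "torch-sparse": "0.6.9",
}


def installed_versions(names):
    """Return {distribution name: installed version, or None if missing}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pinned, installed):
    """List (name, wanted, got) for every package that deviates from its pin."""
    mismatches = []
    for name, wanted in pinned.items():
        got = installed.get(name)
        # Ignore local build tags such as "+cu101" when comparing.
        if got is None or got.split("+")[0] != wanted:
            mismatches.append((name, wanted, got))
    return mismatches


if __name__ == "__main__":
    for name, wanted, got in find_mismatches(PINNED, installed_versions(PINNED)):
        print(f"{name}: expected {wanted}, found {got}")
```

Run inside the activated conda environment, the script prints one line per package that is missing or deviates from its pin, and nothing if everything matches.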

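2. Download data

Download and preprocess data yourself

Preprocessing the data yourself can take a long time. If you would rather download the preprocessed data directly, skip to the next subsection.

Download the raw ConceptNet, CommonsenseQA, and OpenBookQA data with: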

 ./download_raw_data.sh

You can then preprocess the raw data by running:

 CUDA_VISIBLE_DEVICES=0 python preprocess.py -p <num_processes>

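You can choose which GPU to use by setting CUDA_VISIBLE_DEVICES=... at the beginning of the command. The script will:

Set up ConceptNet (e.g., extract English relations from ConceptNet and merge the original 42 relation types into 17 types)
Convert the QA datasets into .jsonl files (e.g., stored in data/csqa/statement/)
Identify all concepts mentioned in the questions and answers
Extract a subgraph for each question-answer pair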
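
The final step can be pictured with a toy example. The sketch below is an illustration only: it assumes, hypothetically, that the subgraph for a question-answer pair keeps the grounded concepts plus every node that sits on a two-hop path between a question concept and an answer concept. The actual logic lives in preprocess.py and may differ; the function name `extract_subgraph` and the sample triples are made up for this example.

```python
from collections import defaultdict


def extract_subgraph(edges, question_concepts, answer_concepts):
    """Keep grounded concepts plus any node bridging a question and an answer concept."""
    neighbors = defaultdict(set)
    for head, _relation, tail in edges:
        # Treat the graph as undirected for reachability.
        neighbors[head].add(tail)
        neighbors[tail].add(head)

    q_set, a_set = set(question_concepts), set(answer_concepts)
    nodes = q_set | a_set
    for q in q_set:
        for a in a_set:
            # Nodes adjacent to both q and a lie on a 2-hop path q - x - a.
            nodes |= neighbors[q] & neighbors[a]

    # Retain only the triples whose endpoints both survived.
    kept_edges = [(h, r, t) for h, r, t in edges if h in nodes and t in nodes]
    return nodes, kept_edges


# Hypothetical ConceptNet-style triples: (head, relation, tail).
toy_edges = [
    ("bird", "capableof", "fly"),
    ("fly", "usedfor", "travel"),
    ("bird", "isa", "animal"),
    ("car", "usedfor", "travel"),
]
nodes, kept = extract_subgraph(toy_edges, ["bird"], ["travel"])
```

Here `fly` is kept because it connects `bird` and `travel`, while `animal` and `car` are dropped because each touches only one side.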

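The scripts to download and preprocess the MedQA-USMLE data and the biomedical knowledge graph based on Disease Database and DrugBank are provided in utils_biomed/.

Directly download preprocessed data

For your convenience, if you do not want to preprocess the data yourself, you can download all of the preprocessed data here. Download the archives into the top-level directory of this repository and unzip them, then move the medqa_usmle and ddb folders into the data/ directory.

Resulting file structure

The resulting file structure should look like this: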

 .
├── README.md
├── data/
    ├── cpnet/                 (preprocessed ConceptNet)
    ├── csqa/
        ├── train_rand_split.jsonl
        ├── dev_rand_split.jsonl
        ├── test_rand_split_no_answers.jsonl
        ├── statement/             (converted statements)
        ├── grounded/              (grounded entities)
        ├── graphs/                (extracted subgraphs)
        ├── ...
    ├── obqa/
    ├── medqa_usmle/
    └── ddb/
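
A quick way to confirm the layout before training is to check that the directories exist. This is a small sketch rather than a script shipped with the repository: `missing_dirs` is a hypothetical helper, and the directory list is copied from the tree above.

```python
from pathlib import Path

# Directories taken from the tree above, relative to the repository root.
EXPECTED_DIRS = [
    "data/cpnet",
    "data/csqa/statement",
    "data/csqa/grounded",
    "data/csqa/graphs",
    "data/obqa",
    "data/medqa_usmle",
    "data/ddb",
]


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```

Calling `missing_dirs(".")` from the repository root returns an empty list when every directory is in place, and otherwise names the ones still to be downloaded or moved.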

3. Training GreaseLM

To train GreaseLM on CommonsenseQA, run:

 CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh csqa --data_dir data/

You can specify up to 2 GPUs by setting CUDA_VISIBLE_DEVICES=... at the beginning of the command.

Similarly, to train GreaseLM on OpenbookQA, run:

 CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh obqa --data_dir data/

To train GreaseLM on MedQA-USMLE, run:

 CUDA_VISIBLE_DEVICES=0 ./run_greaselm__medqa_usmle.sh
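
If you launch runs from Python rather than from a shell, the GPU selection can be passed through the process environment instead of a command prefix. The sketch below is one possible wrapper, not something the repository provides: `build_run` and `launch` are hypothetical names, and the script path `./run_greaselm.sh` is assumed to be the training script in the repository root.

```python
import os
import subprocess


def build_run(script, args, gpu_ids):
    """Return (argv, env) for launching a repo script on the given GPUs."""
    # The training commands above accept at most two GPUs.
    if not 1 <= len(gpu_ids) <= 2:
        raise ValueError("specify one or two GPU ids")
    env = dict(os.environ)
    # Equivalent to prefixing the shell command with CUDA_VISIBLE_DEVICES=...
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    return [script, *args], env


def launch(script, args, gpu_ids):
    """Run the script and raise CalledProcessError if it exits non-zero."""
    argv, env = build_run(script, args, gpu_ids)
    return subprocess.run(argv, env=env, check=True)
```

For example, `launch("./run_greaselm.sh", ["csqa", "--data_dir", "data/"], [0, 1])` would start CommonsenseQA training on GPUs 0 and 1 when called from the repository root.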

4. Pretrained model checkpoints

You can download a pretrained GreaseLM model for CommonsenseQA here, which achieves an IH-dev accuracy of 79.0 and an IH-test accuracy of 74.0.

You can also download a pretrained GreaseLM model for OpenbookQA here, which achieves a test accuracy of 84.8.

You can also download a pretrained GreaseLM model for MedQA-USMLE here, which achieves a test accuracy of 38.5.

5. Evaluating a pretrained model checkpoint

To evaluate a pretrained GreaseLM model checkpoint on CommonsenseQA, run:

 CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh csqa --data_dir data/ --load_model_path /path/to/checkpoint

You can specify up to 2 GPUs by setting CUDA_VISIBLE_DEVICES=... at the beginning of the command.

Similarly, to evaluate a pretrained GreaseLM model checkpoint on OpenbookQA, run:

 CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh obqa --data_dir data/ --load_model_path /path/to/checkpoint

To evaluate a pretrained GreaseLM model checkpoint on MedQA-USMLE, run:

 INHERIT_BERT=1 CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh medqa_usmle --data_dir data/ --load_model_path /path/to/checkpoint

6. Use your own dataset

Convert your dataset to {train,dev,test}.statement.jsonl in .jsonl format (see data/csqa/statement/train.statement.jsonl for reference).
Create a directory at data/{yourdataset}/ to store the .jsonl files.
Modify preprocess.py and run subgraph extraction on your data.
Modify utils/parser_utils.py to support your own dataset.
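
The first two steps amount to writing one JSON object per line into the right place. The sketch below shows one way to do that under stated assumptions: `write_statements` is a hypothetical helper, the `statement/` subdirectory mirrors the CommonsenseQA layout rather than anything the steps above require, and the fields of each record should be copied from data/csqa/statement/train.statement.jsonl rather than from this example.

```python
import json
from pathlib import Path


def write_statements(records, dataset_name, split, data_root="data"):
    """Write one JSON object per line to <data_root>/<dataset>/statement/<split>.statement.jsonl."""
    # Assumption: mirror the csqa layout by using a statement/ subdirectory.
    out_dir = Path(data_root) / dataset_name / "statement"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{split}.statement.jsonl"
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            # JSON Lines: exactly one serialized object per line.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return out_path
```

Calling it once each for the `train`, `dev`, and `test` splits produces the three files that the remaining steps expect to find.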

7. Acknowledgment

This repository is built upon the following work:

 QA-GNN: Question Answering using Language Models and Knowledge Graphs
https://github.com/michiyasunaga/qagnn

Many thanks to the authors and developers!

Additional information

Version: 1.0.0
Type: AI source code
Last updated: 2024-12-30
Size: 50MB
Source: GitHub