Bias CRS

1.0.0

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation

Accepted at EMNLP 2023 (Findings)

Commands for preparing the environment

apt-get update
apt-get install build-essential -y

Preparing the Python environment (for torch 1.12.0):

Option 1: 
pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.12.0+cu113.html
export LD_LIBRARY_PATH="/opt/conda/lib/:$LD_LIBRARY_PATH"

Option 2:
# TORCH and CUDA must match the torch build installed on the next line
export TORCH=1.12.0
export CUDA=cu113
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-geometric

pip install -r requirements.txt
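
The wheel index passed to -f has to match the installed torch build, otherwise pip may fall back to compiling the extensions from source. The following is a minimal sketch of how that URL can be derived from a torch version string, assuming the https://data.pyg.org/whl/torch-<version>+<cuda>.html pattern used in Option 1. The helper name pyg_wheel_index is hypothetical and not part of this repository.

```python
def pyg_wheel_index(torch_version: str, default_cuda: str = "cpu") -> str:
    """Build the PyG wheel index URL for a version string such as '1.12.0+cu113'.

    Assumption: the version string has the form '<version>+<cuda tag>'.
    If the '+<cuda tag>' suffix is missing, default_cuda is used instead.
    """
    base, _, cuda = torch_version.partition("+")
    return f"https://data.pyg.org/whl/torch-{base}+{cuda or default_cuda}.html"


print(pyg_wheel_index("1.12.0+cu113"))
```

With torch already installed, the argument could be taken from torch.__version__. Some builds report the version without a CUDA suffix, in which case the tag should be passed explicitly through default_cuda.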

If installation fails with "command 'x86_64-linux-gnu-gcc' failed with exit status 1", install the Python development headers (replace 3.x with your Python version):
apt-get install python3.x-dev

Quick test

python run_bias_crs.py --config config/crs/tgredial/tgredial.yaml

Experiments were conducted with the ReDial, KGSF, KBRD and TGReDial models and evaluated on the ReDIAL and TGReDIAL datasets.

Data augmentation

Synthetic dialogues are generated and prepared in the data_aug folder by first running [data_prep_gen_*.ipynb] and then [gen_convert_*.ipynb] (* stands for the dataset name).

The data augmentation itself is implemented in [bias_crs/data/dataloader/base.py]. The number of items to augment via popNudge can be changed there.

For every run, the experimental results are saved under the [data/bias/] directory, in a folder named after the model and dataset, in a file titled [bias_anlytic_data.csv].
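
As a rough illustration of that layout, the sketch below reads the result file of one run using only the standard library. The function name load_bias_results and the run_name argument are hypothetical. The exact naming of the per-run folder is not specified here, so check the contents of data/bias/ for the actual folder names.

```python
import csv
from pathlib import Path


def load_bias_results(run_name, root="data/bias"):
    """Read bias_anlytic_data.csv for one run into a list of dicts.

    run_name is the folder named after the model and dataset
    (hypothetical argument; the real folder names come from your runs).
    """
    path = Path(root) / run_name / "bias_anlytic_data.csv"
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

All values come back as strings. For the analysis code further down, loading the same path with pandas.read_csv is likely more convenient.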

The corresponding analysis of the recommendation results, using the Cross-Episode Popularity and User Intent-Oriented Popularity scores, can be found in the [analysis] folder.

Cross-Episode Popularity computation

import numpy as np
from scipy.stats import pearsonr

# Assumes `data` is a pandas DataFrame with 'conv_id', 'Prediction_items' and
# 'pop_bias' columns, and `pop_score_dict` maps each item to its popularity score.

def compute_pop_scores(pop_score_dict, items):
    # Items without a popularity entry default to 0.0
    return [pop_score_dict.get(item, 0.0) for item in items]

data['pop_scores'] = [compute_pop_scores(pop_score_dict, items) for items in data['Prediction_items']]

cep_scores = []
prev_conv_id, prev_pop_scores = None, None
for _, row in data.iterrows():
    if row['conv_id'] != prev_conv_id:
        # First episode of a conversation: use the default value
        cep_scores.append(0.5)
    else:
        # Absolute Pearson correlation with the previous episode's popularity scores
        cep_scores.append(np.abs(pearsonr(row['pop_scores'], prev_pop_scores)[0]))
    prev_conv_id, prev_pop_scores = row['conv_id'], row['pop_scores']

data['cep_score'] = cep_scores
data['cep_pop_score'] = data['cep_score'] * data['pop_bias']
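
To make the score concrete: for consecutive episodes of one conversation, the score is the absolute Pearson correlation between the popularity scores of their recommended items, and the first episode of each conversation receives the default value 0.5. The sketch below reproduces this on toy data in pure Python. The helpers pearson and toy_cep_scores and all numbers are illustrative assumptions, not taken from the repository.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def toy_cep_scores(episodes, default=0.5):
    """episodes: list of (conv_id, pop_scores) tuples in dialogue order."""
    scores, prev_id, prev_pop = [], None, None
    for conv_id, pop in episodes:
        if conv_id != prev_id:
            scores.append(default)  # first episode of a conversation
        else:
            scores.append(abs(pearson(pop, prev_pop)))
        prev_id, prev_pop = conv_id, pop
    return scores


toy = [
    ("c1", [0.9, 0.5, 0.1]),
    ("c1", [0.8, 0.4, 0.0]),  # same popularity profile as the previous episode
    ("c2", [0.2, 0.6, 0.4]),  # new conversation, so the default applies
]
print(toy_cep_scores(toy))
```

The second episode of c1 scores close to 1 because its popularity profile is a shifted copy of the first. A constant list of popularity scores makes the correlation undefined (division by zero here), so such rows need separate handling.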

User Intent-Oriented Popularity computation

data['target_pop_score'] = data['target_item_index'].map(pop_score_dict)
data['UIOP'] = np.abs(data['pop_bias'] - data['target_pop_score'])
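
In other words, UIOP is the absolute gap between the popularity of what was recommended (pop_bias) and the popularity of the item the user was actually after. A minimal sketch with made-up numbers follows. The function name uiop and the toy dictionary are hypothetical.

```python
def uiop(pop_bias, target_item, pop_score_dict):
    """Absolute gap between recommendation popularity and target item popularity."""
    return abs(pop_bias - pop_score_dict[target_item])


# Toy popularity scores (illustrative values only)
toy_pop_score_dict = {"item_a": 0.75, "item_b": 0.25}

print(uiop(0.75, "item_a", toy_pop_score_dict))  # popular recommendations, popular target
print(uiop(0.75, "item_b", toy_pop_score_dict))  # popular recommendations, niche target
```

A value near 0 means the popularity of the recommendations matches that of the target item, while a larger value means the two diverge. Unlike the pandas version, where map yields NaN for a target missing from pop_score_dict, this sketch raises a KeyError.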

Citation

@inproceedings{wang2023improving,
    title={Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation},
    author={Xi Wang and Hossein A. Rahmani and Jiqun Liu and Emine Yilmaz},
    booktitle={Proceedings of EMNLP 2023 (Findings)},
    year={2023}
}