LaBERT

Length-Controllable Image Captioning (ECCV 2020)

This repository provides the implementation of the paper "Length-Controllable Image Captioning".

Installation

conda create --name labert python=3.7
conda activate labert

conda install pytorch=1.3.1 torchvision cudatoolkit=10.1 -c pytorch
pip install h5py tqdm transformers==2.1.1
pip install git+https://github.com/salaniz/pycocoevalcap
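After installation, a quick import check can confirm that the environment is complete. The following is a minimal sketch, not part of the repository: the helper name `check_modules` is hypothetical, and the module list is simply read off the install commands above. It only reports whether each module imports and which version string it exposes, if any.

```python
import importlib

# Module names taken from the install commands above.
REQUIRED_MODULES = [
    "torch", "torchvision", "h5py", "tqdm", "transformers", "pycocoevalcap",
]


def check_modules(names):
    """Map each module name to its version string, "unknown", or None if missing.

    Hypothetical helper for illustration; it is not provided by LaBERT.
    """
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module is not installed in the active environment.
            report[name] = None
            continue
        # Not every package exposes __version__, so fall back to "unknown".
        report[name] = str(getattr(module, "__version__", "unknown"))
    return report


if __name__ == "__main__":
    for name, version in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {version if version is not None else 'MISSING'}")
```

Run it inside the activated `labert` environment. Any line printed as `MISSING` points to an install step that should be repeated.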

Data and Pre-trained Models

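Prepare the MSCOCO data by following the linked instructions.
Download the pre-trained BERT and Faster R-CNN from Baidu Cloud Disk [code: 0j9f] or Google Drive.
- It is a single unified checkpoint file that contains the pre-trained BERT-base and the fc6 layer of Faster R-CNN.
Download our pre-trained LaBERT model from Baidu Cloud Disk [code: fpke] or Google Drive.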

Scripts

Train

python -m torch.distributed.launch \
  --nproc_per_node=$NUM_GPUS \
  --master_port=4396 train.py \
  save_dir $PATH_TO_TRAIN_OUTPUT \
  samples_per_gpu $NUM_SAMPLES_PER_GPU

Continue training

python -m torch.distributed.launch \
  --nproc_per_node=$NUM_GPUS \
  --master_port=4396 train.py \
  save_dir $PATH_TO_TRAIN_OUTPUT \
  samples_per_gpu $NUM_SAMPLES_PER_GPU \
  model_path $PATH_TO_MODEL
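The training and resume commands differ only in the trailing `model_path` pair, so they can be assembled by one function when launching from Python. This is a hedged sketch: `build_train_command` is a hypothetical helper, and the paths in the usage example are placeholders. The argument layout (launcher flags first, then `key value` pairs after `train.py`) is copied from the commands above.

```python
import shlex


def build_train_command(num_gpus, save_dir, samples_per_gpu,
                        model_path=None, master_port=4396):
    """Assemble the launch command as an argument list for subprocess.run.

    Hypothetical helper for illustration; it mirrors the commands in this
    README and is not part of the repository.
    """
    cmd = [
        "python", "-m", "torch.distributed.launch",
        # No space after '=': the launcher expects a single token here.
        f"--nproc_per_node={num_gpus}",
        f"--master_port={master_port}",
        "train.py",
        "save_dir", str(save_dir),
        "samples_per_gpu", str(samples_per_gpu),
    ]
    if model_path is not None:
        # Passing a checkpoint resumes training from it.
        cmd += ["model_path", str(model_path)]
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace them with your own paths and sizes.
    fresh = build_train_command(4, "out/train", 16)
    resume = build_train_command(4, "out/train", 16,
                                 model_path="out/train/model.pth")
    for cmd in (fresh, resume):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Pass the returned list to `subprocess.run(cmd, check=True)` from the repository root to start training. Assuming `samples_per_gpu` is the per-process batch size, as its name suggests, the effective batch size is `num_gpus * samples_per_gpu`.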

Inference

python inference.py \
  model_path $PATH_TO_MODEL \
  save_dir $PATH_TO_TEST_OUTPUT \
  samples_per_gpu $NUM_SAMPLES_PER_GPU

Evaluate

python evaluate.py \
  --gt_caption data/id2captions_test.json \
  --pd_caption $PATH_TO_TEST_OUTPUT/caption_results.json \
  --save_dir $PATH_TO_TEST_OUTPUT
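Evaluation reads `caption_results.json` from the inference output directory, so it helps to confirm that both input files exist before launching it. The sketch below is illustrative only: `build_eval_command` and `missing_files` are hypothetical helpers, and the output directory in the usage example is a placeholder. The flags and file names are taken from the command above.

```python
import os


def build_eval_command(test_output_dir,
                       gt_caption="data/id2captions_test.json"):
    """Assemble the evaluate.py command as an argument list.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    # Join the directory and file name so the path contains no stray space.
    pd_caption = os.path.join(test_output_dir, "caption_results.json")
    return [
        "python", "evaluate.py",
        "--gt_caption", gt_caption,
        "--pd_caption", pd_caption,
        "--save_dir", test_output_dir,
    ]


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    # Placeholder directory; use the save_dir passed to inference.py.
    cmd = build_eval_command("out/test")
    inputs = [cmd[cmd.index("--gt_caption") + 1],
              cmd[cmd.index("--pd_caption") + 1]]
    absent = missing_files(inputs)
    if absent:
        print("Missing inputs, run the earlier steps first:", absent)
    else:
        print("Ready to run:", " ".join(cmd))
```

A missing `caption_results.json` usually means inference has not been run yet, or that it wrote to a different `save_dir`.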

Citation

If this project helps your research, please consider citing our paper in your publications.

@article{deng2020length,
  title={Length-Controllable Image Captioning},
  author={Deng, Chaorui and Ding, Ning and Tan, Mingkui and Wu, Qi},
  journal={arXiv preprint arXiv:2007.09580},
  year={2020}
}