PuMer (ACL 2023)

This repository is the official implementation of the paper "PuMer: Pruning and Merging Tokens for Efficient Vision Language Models".

Usage

Install

Install miniforge (same as conda, but more portable).

Create a Python environment: conda env create -f env.yaml

Activate it: conda activate pumer

Clone this repo: [email protected]:csarron/pumer.git

Test CUDA: python -c "import torch;print(torch.cuda.is_available())"
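The one-line CUDA check can be expanded into a slightly more informative report. The following is a minimal sketch, not part of the repository; the function name `cuda_status` is hypothetical, and it only uses standard `torch.cuda` calls.

```python
import importlib.util


def cuda_status():
    """Return a small report on whether PyTorch can see a CUDA device."""
    # Avoid an ImportError if the conda environment has not been created yet.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    import torch

    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    return {
        "torch_installed": True,
        "cuda_available": available,
        # Human-readable device names, one per visible GPU.
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }


if __name__ == "__main__":
    print(cuda_status())
```

If `cuda_available` is false inside the activated `pumer` environment, the output of `python -m torch.utils.collect_env` is usually the next thing to inspect.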

Get the torch environment: python -m torch.utils.collect_env

Install: pip install -e .

For local development: pip install -e ".[dev]"

env-frozen.yaml was generated via: conda env export | grep -v "^prefix: | pumer==" > env-frozen.yaml
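The grep filter appears intended to drop two kinds of lines from the export: the machine-specific `prefix:` line and the `pumer==` entry for the editable install. Note that plain `grep` treats `|` literally unless it is escaped as `\|` (GNU grep) or `-E` is used, so the sketch below assumes alternation was the intent. The function name `filter_env_export` and the sample text (including its version numbers and path) are made up for illustration.

```python
def filter_env_export(text):
    """Drop the `prefix:` line and any ` pumer==` entry from a conda export."""
    kept = []
    for line in text.splitlines():
        # Assumed intent of the grep pattern: either alternative removes the line.
        if line.startswith("prefix: ") or " pumer==" in line:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Illustrative input only; a real export comes from `conda env export`.
    sample = (
        "name: pumer\n"
        "dependencies:\n"
        "  - python=3.9\n"
        "  - pip:\n"
        "    - pumer==1.0.0\n"
        "    - torch==1.13.0\n"
        "prefix: /home/user/miniforge3/envs/pumer\n"
    )
    print(filter_env_export(sample), end="")
```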

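Prepare data and pretrained models

See notes/data.md for data preprocessing.

See cli/prep/convert_ckpt.py to convert the original pretrained METER and ViLT checkpoints.

The file layout after preparation is: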

# tree -h data
├── [4.0K]  ckpt
│   └── [4.0K]  converted
│       ├── [4.0K]  meter_pretrain_384
│       │   ├── [ 674]  config.json
│       │   └── [1.3G]  pytorch_model.bin
│       ├── [4.0K]  meter_pretrain_irtr_384
│       │   ├── [ 729]  config.json
│       │   └── [1.2G]  pytorch_model.bin
│       ├── [4.0K]  meter_pretrain_nlvr2_288
│       │   ├── [ 674]  config.json
│       │   └── [1.3G]  pytorch_model.bin
│       ├── [4.0K]  vilt_pretrain
│       │   ├── [ 619]  config.json
│       │   └── [518M]  pytorch_model.bin
│       ├── [4.0K]  vilt_pretrain_irtr
│       │   ├── [ 718]  config.json
│       │   └── [426M]  pytorch_model.bin
│       └── [4.0K]  vilt_pretrain_nlvr2
│           ├── [ 619]  config.json
│           └── [518M]  pytorch_model.bin
├── [4.0K]  datasets
│   ├── [4.0K]  irtr
│   │   ├── [390K]  flickr30k-test.jsonl
│   │   ├── [ 11M]  flickr30k-train.jsonl
│   │   ├── [397K]  flickr30k-val.jsonl
│   │   ├── [ 10M]  mscoco-restval.jsonl
│   │   ├── [1.7M]  mscoco-test.jsonl
│   │   ├── [ 28M]  mscoco-train.jsonl
│   │   └── [1.7M]  mscoco-val.jsonl
│   ├── [4.0K]  nlvr2
│   │   ├── [3.6M]  dev.json
│   │   ├── [3.6M]  test1.json
│   │   └── [ 39M]  train.json
│   ├── [4.0K]  snli-ve
│   │   ├── [ 16M]  snli_ve_dev.jsonl
│   │   ├── [ 16M]  snli_ve_test.jsonl
│   │   └── [464M]  snli_ve_train.jsonl
│   └── [4.0K]  vqa2
│       ├── [ 57K]  vqa2_ans2label.json
│       ├── [ 39K]  vqa2_label2ans.json
│       ├── [161K]  vqa2-small.jsonl
│       ├── [ 45M]  vqa2-test2015.jsonl
│       ├── [ 71M]  vqa2-train2014.jsonl
│       └── [ 34M]  vqa2-val2014.jsonl
└── [4.0K]  lmdb
    ├── [ 13G]  coco-test2015.lmdb
    ├── [ 19G]  coco-trainval2014.lmdb
    ├── [4.2G]  flickr30k_images.lmdb
    ├── [837M]  nlvr2-dev.lmdb
    ├── [837M]  nlvr2-test1.lmdb
    └── [ 11G]  nlvr2-train.lmdb
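Before launching a run, it can help to confirm that the prepared files are where the listing above says they should be. This is a minimal sketch, not a script shipped with the repository: `missing_paths` and `EXPECTED` are hypothetical names, and `EXPECTED` contains only a subset of the entries from the tree.

```python
from pathlib import Path

# A subset of the entries from the tree listing; extend as needed.
EXPECTED = [
    "ckpt/converted/vilt_pretrain/config.json",
    "ckpt/converted/vilt_pretrain/pytorch_model.bin",
    "ckpt/converted/meter_pretrain_384/config.json",
    "ckpt/converted/meter_pretrain_384/pytorch_model.bin",
    "datasets/vqa2/vqa2-train2014.jsonl",
    "datasets/vqa2/vqa2_ans2label.json",
    "datasets/nlvr2/train.json",
    "datasets/snli-ve/snli_ve_train.jsonl",
    "lmdb/coco-trainval2014.lmdb",
    "lmdb/nlvr2-train.lmdb",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the data directory is named `data`, as in `tree -h data`.
    for rel in missing_paths("data"):
        print("missing:", rel)
```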

Training and evaluation

See notes/cmd.md for example usage.

See https://huggingface.co/csarron for finetuned checkpoints (-ft is the original finetuned model; p0.x-r0.x-t0.x-xxx is our PuMer model):

vilt-vqa2-ft
vilt-vqa2-p0.1-r0.3-t0.2-258
vilt-ve-ft 
vilt-ve-p0.1r0.3t0.2-2468 
vilt-nlvr2-ft 
vilt-nlvr2-p0.1r0.3t0.2-258
meter-vqa2-ft
meter-vqa2-p0.2r0.2t0.2-0246
meter-ve-ft 
meter-ve-p0.3r0.5t0.2-0246 
meter-nlvr2-ft 
meter-nlvr2-p0.3r0.5t0.2-246
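The checkpoint names follow a regular pattern, so they can be split into fields when scripting over several models. The sketch below is an illustration only: `parse_checkpoint_name` is a hypothetical helper, and because this document does not define what `p`, `r`, `t`, or the trailing digits mean, the fields are returned under those literal names without interpretation. The trailing digits are kept as a string so that a leading zero (as in `0246`) is preserved.

```python
import re

# Accepts both spellings seen in the list: "p0.1-r0.3-t0.2" and "p0.1r0.3t0.2".
_NAME = re.compile(
    r"^(?P<model>vilt|meter)-(?P<task>vqa2|ve|nlvr2)-"
    r"(?:ft|p(?P<p>\d+\.\d+)-?r(?P<r>\d+\.\d+)-?t(?P<t>\d+\.\d+)-(?P<suffix>\d+))$"
)


def parse_checkpoint_name(name):
    """Split a checkpoint name into model, task, and the p/r/t/suffix fields."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    fields = match.groupdict()
    is_pumer = fields["p"] is not None
    return {
        "model": fields["model"],
        "task": fields["task"],
        "pumer": is_pumer,  # False for the plain "-ft" finetuned baselines
        "p": float(fields["p"]) if is_pumer else None,
        "r": float(fields["r"]) if is_pumer else None,
        "t": float(fields["t"]) if is_pumer else None,
        "suffix": fields["suffix"],
    }


if __name__ == "__main__":
    for name in ["vilt-vqa2-ft", "vilt-vqa2-p0.1-r0.3-t0.2-258", "meter-ve-p0.3r0.5t0.2-0246"]:
        print(parse_checkpoint_name(name))
```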

Profiling FLOPs

See notes/profile.md
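As rough intuition for why reducing the number of tokens lowers the FLOP count, here is a generic back-of-the-envelope estimate for one standard transformer encoder layer. It is not the repository's profiler and does not reproduce the procedure in notes/profile.md; the function names, the default feed-forward ratio of 4, and the example sizes are assumptions, and biases, softmax, and layer normalization are ignored.

```python
def layer_macs(num_tokens, hidden_size, ffn_ratio=4):
    """Approximate multiply-accumulates for one transformer encoder layer."""
    n, d = num_tokens, hidden_size
    projections = 4 * n * d * d               # Q, K, V, and output projections
    attention = 2 * n * n * d                 # QK^T scores plus weighted sum of V
    feed_forward = 2 * ffn_ratio * n * d * d  # two linear layers: d -> ffn_ratio*d -> d
    return projections + attention + feed_forward


def layer_flops(num_tokens, hidden_size, ffn_ratio=4):
    """Count each multiply-accumulate as two floating-point operations."""
    return 2 * layer_macs(num_tokens, hidden_size, ffn_ratio)


if __name__ == "__main__":
    # Illustrative sizes only: halving the token count at hidden size 768.
    full = layer_flops(200, 768)
    reduced = layer_flops(100, 768)
    print(f"relative cost after halving tokens: {reduced / full:.3f}")
```

The linear terms scale with the token count and the attention term scales with its square, so removing tokens in earlier layers saves compute in every layer that follows.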

FAQ

Set TRANSFORMERS_OFFLINE=1 after the first use; otherwise it will sometimes report a 504 error because it keeps looking things up online.
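The variable can be exported in the shell, or set from Python at the top of a script. The sketch below shows the second option; `enable_offline_mode` is a hypothetical helper, and setting the variable before `transformers` is imported is a precaution assumed here rather than something this README states. It only makes sense once the required files are already in the local cache, that is, after the first online run.

```python
import os


def enable_offline_mode():
    """Ask the transformers library to use only locally cached files."""
    os.environ["TRANSFORMERS_OFFLINE"] = "1"
    return os.environ["TRANSFORMERS_OFFLINE"]


enable_offline_mode()

# Import transformers only after the variable is set (assumed precaution):
# import transformers
```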

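Misc

Ignore the code in src/pumer/model/pruner.py (deprecated and unused); it needs cleanup.
The current codebase contains a lot of clutter and experimental code unrelated to the PuMer implementation; please ignore it.

Citation

@inproceedings{cao-etal-2023-pumer,
    title = "{P}u{M}er: Pruning and Merging Tokens for Efficient Vision Language Models",
    author = "Cao, Qingqing and Paranjape, Bhargavi and Hajishirzi, Hannaneh",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.721",
    pages = "12890--12903",
}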
