TransPose

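Introduction

TransPose is a human pose estimation model based on a CNN feature extractor, a Transformer encoder, and a prediction head. Given an image, the attention layers built into the Transformer can efficiently capture long-range spatial relationships between keypoints and reveal which dependencies the predicted keypoint locations rely on most.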

Architecture

[arXiv 2012.14214] [Paper] [Demo notebook]

TransPose: Keypoint Localization via Transformer, Sen Yang, Zhibin Quan, Mu Nie, Wankou Yang, ICCV 2021

Model Zoo

We choose two types of CNNs as backbone candidates: ResNet and HRNet. The derived convolutional blocks are ResNet-Small, HRNet-Small-W32, and HRNet-Small-W48.

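| Model | Backbone | #Attention layers | d | h | #Heads | #Params | AP (COCO val, GT bbox) | Download |
|---|---|---|---|---|---|---|---|---|
| TransPose-R-A3 | ResNet-S | 3 | 256 | 1024 | 8 | 5.2M | 73.8 | model |
| TransPose-R-A4 | ResNet-S | 4 | 256 | 1024 | 8 | 6.0M | 75.1 | model |
| TransPose-H-S | HRNet-S-W32 | 4 | 64 | 128 | 1 | 8.0M | 76.1 | model |
| TransPose-H-A4 | HRNet-S-W48 | 4 | 96 | 192 | 1 | 17.3M | 77.5 | model |
| TransPose-H-A6 | HRNet-S-W48 | 6 | 96 | 192 | 1 | 17.5M | 78.1 | model |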
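
As a rough sanity check on the #Params column, the sketch below counts the parameters of one encoder layer. It assumes a standard Transformer encoder layer (multi-head self-attention, a two-layer feed-forward network with hidden size h, and two LayerNorms, all with biases). That layout is an assumption, not something read from this repository's code, and `encoder_layer_params` is a hypothetical helper.

```python
def encoder_layer_params(d: int, h: int) -> int:
    """Parameter count of one standard Transformer encoder layer (assumed layout)."""
    attention = 4 * d * d + 4 * d      # Q, K, V and output projections, with biases
    feed_forward = 2 * d * h + h + d   # d -> h -> d linear layers, with biases
    layer_norms = 2 * 2 * d            # two LayerNorms, each with scale and shift
    return attention + feed_forward + layer_norms


if __name__ == "__main__":
    # d and h values taken from the table above.
    print("ResNet-S variants:", encoder_layer_params(256, 1024))
    print("HRNet-S-W48 variants:", encoder_layer_params(96, 192))
```

Under this assumption, one extra layer with d=256 and h=1024 adds about 0.79M parameters, in line with the gap between TransPose-R-A3 and TransPose-R-A4. The number of heads does not enter the count, because the heads share the same projection matrices.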

Quick use

A web demo is available.

You can load the TransPose-R-A4 or TransPose-H-A4 model, pretrained on the COCO train2017 dataset, directly from Torch Hub with:

import torch

tpr = torch.hub.load('yangsenius/TransPose:main', 'tpr_a4_256x192', pretrained=True)
tph = torch.hub.load('yangsenius/TransPose:main', 'tph_a4_256x192', pretrained=True)

Results on COCO val2017, using a person detector with 56.4 human AP on the COCO val2017 dataset

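| Model | Input size | FPS* | GFLOPs | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TransPose-R-A3 | 256x192 | 141 | 8.0 | 0.717 | 0.889 | 0.788 | 0.680 | 0.786 | 0.771 | 0.930 | 0.836 | 0.727 | 0.835 |
| TransPose-R-A4 | 256x192 | 138 | 8.9 | 0.726 | 0.891 | 0.799 | 0.688 | 0.798 | 0.780 | 0.931 | 0.845 | 0.735 | 0.844 |
| TransPose-H-S | 256x192 | 45 | 10.2 | 0.742 | 0.896 | 0.808 | 0.706 | 0.810 | 0.795 | 0.935 | 0.855 | 0.752 | 0.856 |
| TransPose-H-A4 | 256x192 | 41 | 17.5 | 0.753 | 0.900 | 0.818 | 0.717 | 0.821 | 0.803 | 0.939 | 0.861 | 0.761 | 0.865 |
| TransPose-H-A6 | 256x192 | 38 | 21.8 | 0.758 | 0.901 | 0.821 | 0.719 | 0.828 | 0.808 | 0.939 | 0.864 | 0.764 | 0.872 |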

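Note:

We computed the average FPS* by testing 100 samples from the COCO val dataset on a single NVIDIA 2080Ti GPU. FPS may fluctuate between test runs.
We trained the models on different hardware platforms: 1 x RTX 2080Ti GPU (TP-R-A4), 4 x TITAN Xp GPUs (TP-H-S, TP-H-A4), and 4 x Tesla P40 GPUs (TP-H-A6).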
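
For readers who want to reproduce a figure like FPS*, here is a minimal timing sketch. It is not the authors' benchmark script: `average_fps` and its `run_once` argument are hypothetical, and `run_once` stands in for one forward pass on one sample.

```python
import time
from typing import Callable, Iterable


def average_fps(run_once: Callable[[object], object],
                samples: Iterable[object],
                warmup: int = 0) -> float:
    """Average frames per second of run_once over the samples after warmup."""
    items = list(samples)
    for sample in items[:warmup]:
        run_once(sample)               # untimed warm-up calls
    timed = items[warmup:]
    if not timed:
        raise ValueError("no samples left to time after warmup")
    start = time.perf_counter()
    for sample in timed:
        # For a GPU model, run_once should wait for the device to finish
        # before returning, otherwise only the kernel launches are timed.
        run_once(sample)
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in workload: sleep for about 1 ms per "frame".
    print(average_fps(lambda _: time.sleep(0.001), range(100), warmup=10))
```

Because the result depends on the hardware and on whatever else is running, repeated measurements will usually differ, which matches the fluctuation mentioned in the note.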

Results on COCO test-dev2017, using a person detector with 60.9 human AP on the COCO test-dev2017 dataset

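| Model | Input size | #Params | GFLOPs | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TransPose-H-S | 256x192 | 8.0M | 10.2 | 0.734 | 0.916 | 0.811 | 0.701 | 0.793 | 0.786 | 0.950 | 0.856 | 0.745 | 0.843 |
| TransPose-H-A4 | 256x192 | 17.3M | 17.5 | 0.747 | 0.919 | 0.822 | 0.714 | 0.807 | 0.799 | 0.953 | 0.866 | 0.758 | 0.854 |
| TransPose-H-A6 | 256x192 | 17.5M | 21.8 | 0.750 | 0.922 | 0.823 | 0.713 | 0.811 | 0.801 | 0.954 | 0.867 | 0.759 | 0.859 |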

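Visualization

Jupyter Notebook Demo

Given an input image, a pretrained TransPose model, and the predicted locations, we can visualize the spatial dependencies of the predicted locations, with a threshold applied to the attention scores.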

TransPose-R-A4 with threshold=0.00

TransPose-R-A4 with threshold=0.01

TransPose-H-A4 with threshold=0.00

TransPose-H-A4 with threshold=0.00075
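
The threshold in these examples can be read as a cutoff on the attention scores: positions scoring below it are hidden. The sketch below shows one such cutoff, assuming the attention map for a predicted keypoint is a 2D grid of non-negative scores. The function name and toy values are hypothetical, and the notebook may apply the threshold differently.

```python
from typing import List


def threshold_attention(attention: List[List[float]],
                        threshold: float) -> List[List[float]]:
    """Keep scores that reach the threshold and set the rest to zero."""
    return [
        [score if score >= threshold else 0.0 for score in row]
        for row in attention
    ]


if __name__ == "__main__":
    toy_map = [[0.0005, 0.02],
               [0.3, 0.008]]
    print(threshold_attention(toy_map, 0.00))  # every score is kept
    print(threshold_attention(toy_map, 0.01))  # only 0.02 and 0.3 survive
```

With a threshold of 0.00 nothing is filtered out. Raising the threshold leaves only the positions that contribute most to the prediction.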

Getting started

Installation

Clone this repository. We will refer to the directory you cloned it into as ${POSE_ROOT}.
```
git clone https://github.com/yangsenius/TransPose.git
```
Install PyTorch>=1.6 and torchvision>=0.7, following the official PyTorch website.
Install the package dependencies. Make sure your Python environment is >=3.7.
```
pip install -r requirements.txt
```
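Under ${POSE_ROOT}, create the output directory (trained models and files) and the log directory (TensorBoard logs), then build the libraries under lib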
```
mkdir output log
cd ${POSE_ROOT}/lib
make
```
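
The version requirements above (Python>=3.7, PyTorch>=1.6, torchvision>=0.7) can be checked with a short script. This is a minimal sketch with hypothetical helpers `parse_version` and `meets_minimum`. It compares numeric components, so that a version such as 1.10 is not mistaken for being older than 1.6.

```python
import re
import sys
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.7.1+cu110' into (1, 7, 1), ignoring any local build suffix."""
    numbers = []
    for part in text.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    print("Python >= 3.7:", sys.version_info >= (3, 7))
    # With PyTorch installed, pass torch.__version__ and
    # torchvision.__version__ instead of these sample strings.
    print("PyTorch >= 1.6:", meets_minimum("1.10.0+cu113", "1.6"))
    print("torchvision >= 0.7:", meets_minimum("0.6.1", "0.7"))
```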

Download the pretrained models from the releases of this repository into the following directory structure

 ${POSE_ROOT}
 `-- models
     `-- pytorch
         |-- imagenet
         |   |-- hrnet_w32-36af842e.pth
         |   |-- hrnet_w48-8ef0771d.pth
         |   |-- resnet50-19c8e357.pth
         |-- transpose_coco
         |   |-- tp_r_256x192_enc3_d256_h1024_mh8.pth
         |   |-- tp_r_256x192_enc4_d256_h1024_mh8.pth
         |   |-- tp_h_32_256x192_enc4_d64_h128_mh1.pth
         |   |-- tp_h_48_256x192_enc4_d96_h192_mh1.pth
         |   |-- tp_h_48_256x192_enc6_d96_h192_mh1.pth
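
The checkpoint names under transpose_coco appear to encode the hyperparameters from the model table: backbone, input size, number of encoder layers (enc), d, h, and number of heads (mh). That reading is inferred from the file names listed above, not from any documented naming scheme. The sketch below parses names of that shape; `parse_checkpoint_name` is a hypothetical helper.

```python
import re
from typing import Dict, Optional, Union

# Pattern inferred from names such as tp_h_48_256x192_enc4_d96_h192_mh1.pth
CHECKPOINT_PATTERN = re.compile(
    r"tp_(?P<backbone>[rh])(?:_(?P<backbone_width>\d+))?"
    r"_(?P<input_size>\d+x\d+)"
    r"_enc(?P<layers>\d+)_d(?P<d>\d+)_h(?P<h>\d+)_mh(?P<heads>\d+)\.pth"
)


def parse_checkpoint_name(name: str) -> Dict[str, Union[str, int, None]]:
    """Split a checkpoint file name into the fields it appears to encode."""
    match = CHECKPOINT_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    fields = match.groupdict()
    width: Optional[str] = fields["backbone_width"]
    return {
        "backbone": fields["backbone"],
        "backbone_width": int(width) if width is not None else None,
        "input_size": fields["input_size"],
        "layers": int(fields["layers"]),
        "d": int(fields["d"]),
        "h": int(fields["h"]),
        "heads": int(fields["heads"]),
    }


if __name__ == "__main__":
    print(parse_checkpoint_name("tp_r_256x192_enc4_d256_h1024_mh8.pth"))
    print(parse_checkpoint_name("tp_h_48_256x192_enc6_d96_h192_mh1.pth"))
```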

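Data Preparation

We follow the steps of HRNet to prepare the COCO train/val/test datasets and annotations. The person detection results can be downloaded from OneDrive or GoogleDrive. Please download or link them to ${POSE_ROOT}/data/coco/, and make them look like this: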

 ${POSE_ROOT}/data/coco/
 |-- annotations
 |   |-- person_keypoints_train2017.json
 |   `-- person_keypoints_val2017.json
 |-- person_detection_results
 |   |-- COCO_val2017_detections_AP_H_56_person.json
 |   `-- COCO_test-dev2017_detections_AP_H_609_person.json
 `-- images
     |-- train2017
     |   |-- 000000000009.jpg
     |   |-- ...
     `-- val2017
         |-- 000000000139.jpg
         |-- ...
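
Before training or testing, it can help to confirm that the layout above is in place. The following sketch lists which of the expected files and directories are missing under a given COCO root. `missing_entries` is a hypothetical helper, and the default path `data/coco` assumes the script is run from ${POSE_ROOT}.

```python
from pathlib import Path
from typing import List

# Entries taken from the directory tree above (individual images omitted).
EXPECTED = [
    "annotations/person_keypoints_train2017.json",
    "annotations/person_keypoints_val2017.json",
    "person_detection_results/COCO_val2017_detections_AP_H_56_person.json",
    "person_detection_results/COCO_test-dev2017_detections_AP_H_609_person.json",
    "images/train2017",
    "images/val2017",
]


def missing_entries(coco_root: Path) -> List[str]:
    """Return the expected relative paths that do not exist under coco_root."""
    return [rel for rel in EXPECTED if not (coco_root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries(Path("data/coco"))
    if missing:
        for rel in missing:
            print("missing:", rel)
    else:
        print("COCO layout looks complete")
```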

Training and Testing

Testing on the COCO val2017 dataset

python tools/test.py --cfg experiments/coco/transpose_r/TP_R_256x192_d256_h1024_enc4_mh8.yaml TEST.USE_GT_BBOX True

Training on the COCO train2017 dataset

python tools/train.py --cfg experiments/coco/transpose_r/TP_R_256x192_d256_h1024_enc4_mh8.yaml
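
Both commands read their settings from the YAML file passed via --cfg. In the test command, the trailing `TEST.USE_GT_BBOX True` looks like a dotted-key override applied on top of that file, a convention common in HRNet-style code bases. The sketch below illustrates that convention on a plain nested dict. It is an assumption about how such overrides behave, not this repository's parsing code, and `apply_overrides` is a hypothetical helper.

```python
import ast
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], opts: List[str]) -> Dict[str, Any]:
    """Apply KEY VALUE pairs such as ['TEST.USE_GT_BBOX', 'True'] in place."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node[name]          # KeyError if the section is unknown
        if leaf not in node:
            raise KeyError(key)        # refuse to create new, misspelled keys
        try:
            value = ast.literal_eval(raw)   # 'True' -> True, '4' -> 4
        except (ValueError, SyntaxError):
            value = raw                     # anything else stays a string
        node[leaf] = value
    return config


if __name__ == "__main__":
    cfg = {"TEST": {"USE_GT_BBOX": False}}
    apply_overrides(cfg, ["TEST.USE_GT_BBOX", "True"])
    print(cfg)
```

Rejecting unknown keys is a deliberate choice in this sketch: a typo in an override then fails loudly instead of being silently ignored.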

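Acknowledgements

Many thanks to these papers and their open-source code: HRNet, DETR, DarkPose.

License

This repository is released under the MIT License.

Citation

If you find this repository useful, please give it a star or consider citing our work: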

@inproceedings{yang2021transpose,
  title={TransPose: Keypoint Localization via Transformer},
  author={Yang, Sen and Quan, Zhibin and Nie, Mu and Yang, Wankou},
  booktitle={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}
