TransPose

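Introduction

TransPose is a human pose estimation model based on a CNN feature extractor, a Transformer encoder, and a prediction head. Given an image, the attention layers built into the Transformer can efficiently capture long-range spatial relationships between keypoints and explain which dependencies the predicted keypoint locations rely on most.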
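
To make the role of the attention layers concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not taken from the TransPose code: the function names `softmax` and `self_attention` and the toy input are assumptions for illustration, and the learned query/key/value projections of a real Transformer layer are left out. Each row of the returned weights shows how strongly one position attends to every other position, independent of how far apart they are.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head self-attention without learned projections (illustrative only).

    x: list of n feature vectors, each a list of d floats.
    Returns (outputs, weights), where weights[i][j] is how much position i
    attends to position j.
    """
    d = len(x[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in x:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in x]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights

# Toy sequence: positions 0 and 2 carry the same feature, position 1 differs.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
```

In this toy input, position 0 attends to position 2 exactly as strongly as to itself, even though position 1 sits between them, which is the distance-independent behavior the paragraph above describes.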

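Architecture

[arXiv 2012.14214] [Paper] [Demo notebooks]

TransPose: Keypoint Localization via Transformer, Sen Yang, Zhibin Quan, Mu Nie, Wankou Yang, ICCV 2021

Model Zoo

We choose two types of CNNs as backbone candidates: ResNet and HRNet. The derived convolutional blocks are ResNet-Small, HRNet-Small-W32, and HRNet-Small-W48.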

Model	Backbone	#Attention layers	d	h	#Heads	#Params	AP (COCO val, GT bbox)	Download
TransPose-R-A3	ResNet-S	3	256	1024	8	5.2M	73.8	model
TransPose-R-A4	ResNet-S	4	256	1024	8	6.0M	75.1	model
TransPose-H-S	HRNet-S-W32	4	64	128	1	8.0M	76.1	model
TransPose-H-A4	HRNet-S-W48	4	96	192	1	17.3M	77.5	model
TransPose-H-A6	HRNet-S-W48	6	96	192	1	17.5M	78.1	model

Quick Use

Try the web demo:

You can load the TransPose-R-A4 or TransPose-H-A4 model, pretrained on the COCO train2017 dataset, directly from Torch Hub with:

import torch

tpr = torch.hub.load('yangsenius/TransPose:main', 'tpr_a4_256x192', pretrained=True)
tph = torch.hub.load('yangsenius/TransPose:main', 'tph_a4_256x192', pretrained=True)

Results on COCO val2017, using a detector with a human AP of 56.4 on the COCO val2017 dataset

Model	Input size	FPS*	GFLOPs	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
TransPose-R-A3	256x192	141	8.0	0.717	0.889	0.788	0.680	0.786	0.771	0.930	0.836	0.727	0.835
TransPose-R-A4	256x192	138	8.9	0.726	0.891	0.799	0.688	0.798	0.780	0.931	0.845	0.735	0.844
TransPose-H-S	256x192	45	10.2	0.742	0.896	0.808	0.706	0.810	0.795	0.935	0.855	0.752	0.856
TransPose-H-A4	256x192	41	17.5	0.753	0.900	0.818	0.717	0.821	0.803	0.939	0.861	0.761	0.865
TransPose-H-A6	256x192	38	21.8	0.758	0.901	0.821	0.719	0.828	0.808	0.939	0.864	0.764	0.872

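Notes:

We compute the average FPS* by testing 100 samples from the COCO val dataset on a single NVIDIA 2080Ti GPU. The FPS may fluctuate between test runs.
We trained the models on different hardware platforms: 1 x RTX 2080Ti GPU (TP-R-A4), 4 x Titan XP GPUs (TP-H-S, TP-H-A4), and 4 x Tesla P40 GPUs (TP-H-A6).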
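
The FPS measurement described in the notes can be approximated with a simple timing loop. The sketch below is an assumption about how such a measurement might look, not the script used for the table: `average_fps` and `dummy_forward` are hypothetical names, the workload is a CPU placeholder rather than a model forward pass, and the warm-up count is arbitrary. Timing a GPU model would additionally require synchronizing the device before reading the clock, which this sketch leaves out.

```python
import time

def average_fps(run_once, samples, warmup=5):
    """Average frames per second of run_once, called once per sample."""
    samples = list(samples)
    # Warm-up calls are excluded from the timed section.
    for sample in samples[:warmup]:
        run_once(sample)
    start = time.perf_counter()
    for sample in samples:
        run_once(sample)
    elapsed = time.perf_counter() - start
    return len(samples) / elapsed

def dummy_forward(sample):
    """Placeholder workload standing in for a model forward pass."""
    return sum(i * i for i in range(1000))

# 100 samples, matching the sample count mentioned in the notes.
fps = average_fps(dummy_forward, range(100))
print(f"average FPS over 100 samples: {fps:.1f}")
```

Because wall-clock timing depends on system load, repeated runs of such a loop give slightly different numbers, consistent with the fluctuation mentioned above.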

Results on COCO test-dev2017, using a detector with a human AP of 60.9 on the COCO test-dev2017 dataset

Model	Input size	#Params	GFLOPs	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
TransPose-H-S	256x192	8.0M	10.2	0.734	0.916	0.811	0.701	0.793	0.786	0.950	0.856	0.745	0.843
TransPose-H-A4	256x192	17.3M	17.5	0.747	0.919	0.822	0.714	0.807	0.799	0.953	0.866	0.758	0.854
TransPose-H-A6	256x192	17.5M	21.8	0.750	0.922	0.823	0.713	0.811	0.801	0.954	0.867	0.759	0.859

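Visualization

Jupyter notebook demo

Given an input image, a pretrained TransPose model, and the predicted locations, we can visualize the spatial dependencies of the predicted locations, applying a threshold to the attention scores.

TransPose-R-A4 with threshold=0.00

TransPose-R-A4 with threshold=0.01

TransPose-H-A4 with threshold=0.00

TransPose-H-A4 with threshold=0.00075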
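
As a rough illustration of what applying a threshold to attention scores means, the sketch below zeroes out every score under the threshold in a small 2D map. This is an assumption about the general idea, not the logic of the repository's notebook; `threshold_attention`, `kept_fraction`, and the toy scores are hypothetical.

```python
def threshold_attention(attn, threshold):
    """Return a copy of a 2D attention map with scores below threshold set to 0."""
    return [[a if a >= threshold else 0.0 for a in row] for row in attn]

def kept_fraction(attn, threshold):
    """Fraction of positions whose score is at or above threshold."""
    flat = [a for row in attn for a in row]
    return sum(a >= threshold for a in flat) / len(flat)

# Toy attention map for one predicted location (values are made up).
attn = [
    [0.0005, 0.02, 0.0],
    [0.3, 0.009, 0.6705],
]
masked = threshold_attention(attn, 0.01)
```

With threshold=0.00 every position is kept, while a higher threshold leaves only the positions that contribute most, which is why the thresholded figures look sparser.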

Getting Started

Installation

Clone this repository. We will refer to the directory you cloned into as ${POSE_ROOT}
```
git clone https://github.com/yangsenius/TransPose.git
```
Install PyTorch >= 1.6 and torchvision >= 0.7 from the official PyTorch website
Install the package dependencies. Make sure the Python environment is >= 3.7
```
pip install -r requirements.txt
```
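Create the output (trained models and files) and log (TensorBoard logs) directories under ${POSE_ROOT}, then build the libraries in ${POSE_ROOT}/lib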
```
mkdir output log
cd ${POSE_ROOT}/lib
make
```
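
The version requirements in the steps above (Python >= 3.7, PyTorch >= 1.6, torchvision >= 0.7) can be checked before training with a short script. This is a sketch, not part of the repository: `parse_version` and `check_environment` are hypothetical helpers, and the version parsing is deliberately simple (it ignores local suffixes such as `+cu117`).

```python
import importlib
import sys

def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python >= 3.7 is required")
    for name, minimum in (("torch", (1, 6)), ("torchvision", (0, 7))):
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append(f"{name} is not installed")
            continue
        if parse_version(module.__version__) < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name} >= {wanted} is required, found {module.__version__}")
    return problems

if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```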

Download the pretrained models from the releases of this repository into the following directory structure

 ${POSE_ROOT}
 `-- models
     `-- pytorch
         |-- imagenet
         |   |-- hrnet_w32-36af842e.pth
         |   |-- hrnet_w48-8ef0771d.pth
         |   |-- resnet50-19c8e357.pth
         |-- transpose_coco
         |   |-- tp_r_256x192_enc3_d256_h1024_mh8.pth
         |   |-- tp_r_256x192_enc4_d256_h1024_mh8.pth
         |   |-- tp_h_32_256x192_enc4_d64_h128_mh1.pth
         |   |-- tp_h_48_256x192_enc4_d96_h192_mh1.pth
         |   |-- tp_h_48_256x192_enc6_d96_h192_mh1.pth
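
The checkpoint file names under transpose_coco appear to encode the hyperparameters listed in the Model Zoo table: for example, `enc4_d256_h1024_mh8` lines up with 4 attention layers, d=256, h=1024, and 8 heads for TransPose-R-A4. Assuming that reading is right, a small parser can recover the settings from a file name. The regular expression and the `parse_checkpoint_name` helper below are illustrative assumptions, not part of the repository.

```python
import re

# Assumed naming scheme, inferred from the file names in the tree above:
# tp_<r|h>[_<hrnet width>]_<input size>_enc<layers>_d<d>_h<h>_mh<heads>.pth
CHECKPOINT_PATTERN = re.compile(
    r"tp_(?P<family>[rh])(?:_(?P<hrnet_width>\d+))?_(?P<input_size>\d+x\d+)"
    r"_enc(?P<layers>\d+)_d(?P<d>\d+)_h(?P<h>\d+)_mh(?P<heads>\d+)\.pth"
)

def parse_checkpoint_name(filename):
    """Return the hyperparameters encoded in a file name, or None if it does not match."""
    match = CHECKPOINT_PATTERN.fullmatch(filename)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("layers", "d", "h", "heads"):
        info[key] = int(info[key])
    if info["hrnet_width"] is not None:
        info["hrnet_width"] = int(info["hrnet_width"])
    return info

r_a4 = parse_checkpoint_name("tp_r_256x192_enc4_d256_h1024_mh8.pth")
h_a6 = parse_checkpoint_name("tp_h_48_256x192_enc6_d96_h192_mh1.pth")
```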

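Data Preparation

We follow the steps of HRNet to prepare the COCO train/val/test datasets and annotations. The person detection results can be downloaded from OneDrive or GoogleDrive. Please download or link them to ${POSE_ROOT}/data/coco/, and make them look like this: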

${POSE_ROOT}/data/coco/
|-- annotations
|   |-- person_keypoints_train2017.json
|   `-- person_keypoints_val2017.json
|-- person_detection_results
|   |-- COCO_val2017_detections_AP_H_56_person.json
|   `-- COCO_test-dev2017_detections_AP_H_609_person.json
`-- images
    |-- train2017
    |   |-- 000000000009.jpg
    |   |-- ...
    `-- val2017
        |-- 000000000139.jpg
        |-- ...
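
Before launching a run, it can help to verify that the layout above is in place. The following sketch checks for the annotation files, detection results, and image directories named in the tree. `EXPECTED` and `missing_entries` are hypothetical names, and the temporary directory at the end only demonstrates the check on an incomplete layout.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED = [
    "annotations/person_keypoints_train2017.json",
    "annotations/person_keypoints_val2017.json",
    "person_detection_results/COCO_val2017_detections_AP_H_56_person.json",
    "person_detection_results/COCO_test-dev2017_detections_AP_H_609_person.json",
    "images/train2017",
    "images/val2017",
]

def missing_entries(coco_root):
    """Return the expected files or directories that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

# Demonstration on a throwaway directory that only contains images/train2017.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "images" / "train2017").mkdir(parents=True)
    missing = missing_entries(tmp)
```

For a real checkout, the function would be pointed at the data/coco directory under ${POSE_ROOT}; an empty result means every expected entry was found.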

Training & Testing

Testing on the COCO val2017 dataset

python tools/test.py --cfg experiments/coco/transpose_r/TP_R_256x192_d256_h1024_enc4_mh8.yaml TEST.USE_GT_BBOX True

Training on the COCO train2017 dataset

python tools/train.py --cfg experiments/coco/transpose_r/TP_R_256x192_d256_h1024_enc4_mh8.yaml
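
Both commands share the same shape: a script, a `--cfg` file, and optional trailing `KEY VALUE` overrides such as `TEST.USE_GT_BBOX True`. If you want to drive them from Python, for instance to loop over several configs, a helper along these lines may be convenient. `build_command` is a hypothetical wrapper, not something the repository provides.

```python
import shlex
import sys

def build_command(script, cfg, overrides=None):
    """Build an argv list: interpreter, script, --cfg <file>, then KEY VALUE pairs."""
    argv = [sys.executable, script, "--cfg", cfg]
    for key, value in (overrides or {}).items():
        argv += [key, str(value)]
    return argv

CFG = "experiments/coco/transpose_r/TP_R_256x192_d256_h1024_enc4_mh8.yaml"

test_cmd = build_command("tools/test.py", CFG, {"TEST.USE_GT_BBOX": True})
train_cmd = build_command("tools/train.py", CFG)

# Print the commands in shell-quoted form for inspection.
print(" ".join(shlex.quote(arg) for arg in test_cmd))
print(" ".join(shlex.quote(arg) for arg in train_cmd))

# To actually launch one from ${POSE_ROOT}:
#   import subprocess
#   subprocess.run(test_cmd, check=True)
```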

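Acknowledgements

Many thanks to these papers and their open-source code: HRNet, DETR, DarkPose

License

This repository is released under the MIT License.

Citation

If you find this repository useful, please give it a star, or consider citing our work: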

@inproceedings{yang2021transpose,
  title={TransPose: Keypoint Localization via Transformer},
  author={Yang, Sen and Quan, Zhibin and Nie, Mu and Yang, Wankou},
  booktitle={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}
