surgeon pytorch下载 - surgeon pytorch源代码下载

surgeon pytorch

Ai源码

0.0.4

下载

用于检查和提取 PyTorch 模型中间层的库。

为什么？

通常情况下，我们希望在不修改代码的情况下检查PyTorch 模型的中间层。这对于获取语言模型的注意力矩阵、可视化层嵌入或将损失函数应用于中间层非常有用。有时我们想要提取模型的子部分并独立运行它们，以调试它们或单独训练它们。所有这些都可以通过 Surgeon 完成，而无需更改原始模型的任何一条线。

安装

$ pip install surgeon-pytorch

用法

检查

给定 PyTorch 模型，我们可以使用get_layers显示所有层：

 import torch
import torch . nn as nn

from surgeon_pytorch import Inspect , get_layers

class SomeModel ( nn . Module ):

    def __init__ ( self ):
        super (). __init__ ()
        self . layer1 = nn . Linear ( 5 , 3 )
        self . layer2 = nn . Linear ( 3 , 2 )
        self . layer3 = nn . Linear ( 2 , 1 )

    def forward ( self , x ):
        x1 = self . layer1 ( x )
        x2 = self . layer2 ( x1 )
        y = self . layer3 ( x2 )
        return y


model = SomeModel ()
print ( get_layers ( model )) # ['layer1', 'layer2', 'layer3']

然后我们可以使用Inspect包装要检查的model ，并且在每次前向调用新模型时，我们还将输出提供的层输出（在第二个返回值中）：

 model_wrapped = Inspect ( model , layer = 'layer2' )
x = torch . rand ( 1 , 5 )
y , x2 = model_wrapped ( x )
print ( x2 ) # tensor([[-0.2726,  0.0910]], grad_fn=<AddmmBackward0>)

检查多层

我们可以提供图层列表：

 model_wrapped = Inspect ( model , layer = [ 'layer1' , 'layer2' ])
x = torch . rand ( 1 , 5 )
y , [ x1 , x2 ] = model_wrapped ( x )
print ( x1 ) # tensor([[ 0.1739,  0.3844, -0.4724]], grad_fn=<AddmmBackward0>)
print ( x2 ) # tensor([[-0.2238,  0.0107]], grad_fn=<AddmmBackward0>)

名称检查层输出

我们可以提供一个字典来获取命名输出：

 model_wrapped = Inspect ( model , layer = { 'layer1' : 'x1' , 'layer2' : 'x2' })
x = torch . rand ( 1 , 5 )
y , layers = model_wrapped ( x )
print ( layers )
"""
{
    'x1': tensor([[ 0.3707,  0.6584, -0.2970]], grad_fn=<AddmmBackward0>),
    'x2': tensor([[-0.1953, -0.3408]], grad_fn=<AddmmBackward0>)
}
"""

应用程序编程接口

 model = Inspect (
    model : nn . Module ,
    layer : Union [ str , Sequence [ str ], Dict [ str , str ]],
    keep_output : bool = True ,
)

提炼

给定 PyTorch 模型，我们可以使用get_nodes显示图形的所有中间节点：

 import torch
import torch . nn as nn
from surgeon_pytorch import Extract , get_nodes

class SomeModel ( nn . Module ):

    def __init__ ( self ):
        super (). __init__ ()
        self . layer1 = nn . Linear ( 5 , 3 )
        self . layer2 = nn . Linear ( 3 , 2 )
        self . layer3 = nn . Linear ( 1 , 1 )

    def forward ( self , x ):
        x1 = torch . relu ( self . layer1 ( x ))
        x2 = torch . sigmoid ( self . layer2 ( x1 ))
        y = self . layer3 ( x2 ). tanh ()
        return y

model = SomeModel ()
print ( get_nodes ( model )) # ['x', 'layer1', 'relu', 'layer2', 'sigmoid', 'layer3', 'tanh']

然后我们可以使用Extract提取输出，这将创建一个返回请求的输出节点的新模型：

 model_ext = Extract ( model , node_out = 'sigmoid' )
x = torch . rand ( 1 , 5 )
sigmoid = model_ext ( x )
print ( sigmoid ) # tensor([[0.5570, 0.3652]], grad_fn=<SigmoidBackward0>)

我们还可以使用新的输入节点提取模型：

 model_ext = Extract ( model , node_in = 'layer1' , node_out = 'sigmoid' )
layer1 = torch . rand ( 1 , 3 )
sigmoid = model_ext ( layer1 )
print ( sigmoid ) # tensor([[0.5444, 0.3965]], grad_fn=<SigmoidBackward0>)

多节点

我们还可以提供多个输入和输出并命名它们：

 model_ext = Extract ( model , node_in = { 'layer1' : 'x' }, node_out = { 'sigmoid' : 'y1' , 'relu' : 'y2' })
out = model_ext ( x = torch . rand ( 1 , 3 ))
print ( out )
"""
{
    'y1': tensor([[0.4437, 0.7152]], grad_fn=<SigmoidBackward0>),
    'y2': tensor([[0.0555, 0.9014, 0.8297]]),
}
"""

图形输入/输出摘要

请注意，更改输入节点可能不足以切割图形（可能还有其他依赖项连接到先前的输入）。要查看新图的所有输入，我们可以调用model_ext.summary ，它将为我们提供所有所需输入和返回输出的概述：

 import torch
import torch . nn as nn
from surgeon_pytorch import Extract , get_nodes

class SomeModel ( nn . Module ):

    def __init__ ( self ):
        super (). __init__ ()
        self . layer1a = nn . Linear ( 2 , 2 )
        self . layer1b = nn . Linear ( 2 , 2 )
        self . layer2 = nn . Linear ( 2 , 1 )

    def forward ( self , x ):
        a = self . layer1a ( x )
        b = self . layer1b ( x )
        c = torch . add ( a , b )
        y = self . layer2 ( c )
        return y

model = SomeModel ()
print ( get_nodes ( model )) # ['x', 'layer1a', 'layer1b', 'add', 'layer2']

model_ext = Extract ( model , node_in = { 'layer1a' : 'my_input' }, node_out = { 'add' : 'my_add' })
print ( model_ext . summary ) # {'input': ('x', 'my_input'), 'output': {'my_add': add}}

out = model_ext ( x = torch . rand ( 1 , 2 ), my_input = torch . rand ( 1 , 2 ))
print ( out ) # {'my_add': tensor([[ 0.3722, -0.6843]], grad_fn=<AddBackward0>)}

应用程序编程接口

应用程序编程接口

 model = Extract (
    model : nn . Module ,
    node_in : Optional [ Union [ str , Sequence [ str ], Dict [ str , str ]]] = None ,
    node_out : Optional [ Union [ str , Sequence [ str ], Dict [ str , str ]]] = None ,
    tracer : Optional [ Type [ Tracer ]] = None ,          # Tracer class used, default: torch.fx.Tracer
    concrete_args : Optional [ Dict [ str , Any ]] = None , # Tracer concrete_args, default: None
    keep_output : bool = None ,                       # Set to `True` to return original outputs as first argument, default: True except if node_out are provided
    share_modules : bool = False ,                    # Set to true if you want to share module weights with original model
)

检查与提取

Inspect类始终执行作为输入提供的整个模型，并使用特殊的钩子来记录流过的张量值。这种方法的优点是 (1) 我们不创建新模块 (2) 它允许动态执行图（即依赖于输入的for循环和if语句）。 Inspect的缺点是（1）如果我们只需要执行模型的一部分，则会浪费一些计算，（2）我们只能从nn.Module层输出值 - 没有中间函数值。

Extract类使用符号跟踪构建一个全新的模型。这种方法的优点是（1）我们可以在任何地方裁剪图形并获得仅计算该部分的新模型，（2）我们可以从中间函数（不仅仅是层）中提取值，（3）我们还可以更改输入张量。 Extract的缺点是只允许静态图（请注意，大多数模型都有静态图）。