Extract subgraph from a pre-trained model


#1

Hi,

I’m having difficulties extracting intermediate layers from a pre-trained model and feeding the pre-trained parameters into them, and I’m posting to ask if there is a clean way to do this.

For example, let’s suppose I have a graph g: A->B->C->D->E loaded using Gluon, and what I need is B->C->D.

Step 1: It is fairly easy to call g_sym = g(mx.sym.var("input")) to get the symbol, and then g_int = g_sym.get_internals() to get all internal symbols. This way I can get A->B->C->D via g_int['D_output'].

Step 2: When I try to further trim it to B->C->D, I find that the only way is to traverse the graph backwards using get_children() until I reach B, and then reconstruct the whole graph. For example, I use the following code snippet to reconstruct a Convolution symbol programmatically.

# Suppose we have recursively rebuilt the symbol graph B->C as new_C
f = getattr(mx.sym, D_op_type)  # suppose D_op_type == 'Convolution'
new_D = f(data=new_C, name=old_D.name, **old_D.list_attr())
subgraph = gluon.SymbolBlock(new_D, input)  # input = mx.sym.var("input") from Step 1

Step 3: Now, when I try to load the pretrained parameters into the new model subgraph, the problem is that the parameters cannot be loaded successfully, because the names of the implicitly created parameters (weight and bias) can differ. For a pretrained model, the weight and bias names of a given layer can be arbitrary. But with the MXNet Symbol API, creating a Convolution layer conv1 will always create its weight and bias symbols named conv1_weight and conv1_bias automatically, and I cannot find a way to modify them.

Therefore, my solution is to rename the corresponding parameters in the old graph (A->B->C->D->E) instead, to keep them consistent with the new graph. But this can be tricky and brittle, because it is not reliable to infer whether two parameters are equivalent simply by parsing their names and shapes. For example, suppose the pre-trained model has the weight and bias named conv1_w and conv1_b, whereas the new model has them named conv1_weight and conv1_bias. Even if the two parameters happen to have the same shape (although that is unlikely to be decisive), it is hard to draw the connection between the corresponding parameters of the old and new graphs.
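What I do today, roughly, is build an explicit rename map while traversing the graph in Step 2, instead of guessing from shapes. A plain-Python sketch of the idea (the parameter names here are hypothetical, and real values would be NDArrays, not strings):

```python
# Hypothetical saved parameter dict from the pre-trained model
old_params = {"conv1_w": "W-array", "conv1_b": "b-array"}

# Explicit old-name -> new-name mapping; in practice this is collected
# during the get_children() traversal, where both the old symbol and the
# reconstructed symbol are in hand at the same time
rename = {"conv1_w": "conv1_weight", "conv1_b": "conv1_bias"}

# Re-key the parameter dict so it matches the auto-generated names
new_params = {rename.get(k, k): v for k, v in old_params.items()}
print(new_params)  # {'conv1_weight': 'W-array', 'conv1_bias': 'b-array'}
```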

Has anyone done something similar before, or does anyone have an idea how to extract a subgraph from an existing symbol graph and load the parameters cleanly?

Thanks
Patrick


#2

If you are using Gluon and the layers have been chained together in a Sequential or HybridSequential block, then you can extract subgraphs by slicing, for example:

import mxnet as mx
from mxnet.gluon.model_zoo import vision

resnet18 = vision.resnet18_v1(pretrained=True, ctx=mx.cpu())
resnet18.hybridize()
print(resnet18.features[4:5])

Regarding the naming of parameters: in Gluon you can set a prefix or use a name_scope. Here is a great tutorial: https://mxnet.incubator.apache.org/tutorials/gluon/naming.html

In general, it is probably easier to load the full graph with all parameters and then extract the subgraph from it. That way, you avoid having to match the nodes of a subgraph with parameters.
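To sketch that last point (using pretrained=False here only to keep the sketch download-free; with pretrained=True the behavior is the same but the weights are the trained ones): slicing keeps references to the same child blocks, so their parameters come along automatically.

```python
import mxnet as mx
from mxnet.gluon.model_zoo import vision

# Load the full network first (pretrained=True in practice)
net = vision.resnet18_v1(pretrained=False, ctx=mx.cpu())
net.initialize()

# Slice the feature extractor: the child blocks (and their parameters)
# are shared with the full network, so nothing has to be matched by name
sub = net.features[4:6]
print(len(sub.collect_params().keys()))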