Gluon -> module : What is `label_name` and why do I need labels for Modules to run/bind?


#1

I am currently trying to convert Gluon models into Module models.

To do this successfully on a softmax-output neural net, I need to do the following:

def block2symbol(block):
    data = mx.sym.Variable('data')
    sym = block(data)
    params = block.collect_params()
    arg_params = {}
    aux_params = {}
    for k, v in params.items():
        if v._stype == 'default':
            param_data = v.data()
        else:
            raise NotImplementedError(
                "stype {} is not yet supported for parameters in block2symbol.".format(v._stype))
        arg_params[k] = param_data
        aux_params[k] = param_data
    return sym, arg_params, aux_params

# Converting gluon into module
mx_sym, args, auxs = block2symbol(net) # net is some gluon block.
# Need name = softmax so that label_names can handle softmax_label
mx_sym = mx.sym.SoftmaxOutput(data=mx_sym, name='softmax')
model = mx.mod.Module(symbol = mx_sym, context = mx.cpu(), 
                       label_names = ['softmax_label'])
model.bind(for_training=False,
           data_shapes = data_iter.provide_data, 
           label_shapes = data_iter.provide_label)
model.set_params(args, auxs)

I understand why we need everything above, except for this line:

mx_sym = mx.sym.SoftmaxOutput(data=mx_sym, name='softmax')

If I don’t include this line, and I just use mx_sym as it is, then I get the error:

KeyError: 'softmax_label'

This is associated with the label_names argument in my Module.

I looked inside the internals, and this is the difference that adding a SoftmaxOutput makes:

before:

['data',
 'hybridsequential0_conv0_weight',
 'hybridsequential0_conv0_bias',
 'hybridsequential0_conv0_fwd_output',
 'hybridsequential0_conv0_relu_fwd_output',
 'hybridsequential0_conv1_weight',
 ...
 'hybridsequential0_conv3_fwd_output',
 'hybridsequential0_conv3_relu_fwd_output',
 'hybridsequential0_flatten0_reshape0_output',
 'hybridsequential0_dense0_weight',
 'hybridsequential0_dense0_bias',
 'hybridsequential0_dense0_fwd_output',
 'hybridsequential0_dense0_relu_fwd_output',
 'hybridsequential0_dense1_weight',
 'hybridsequential0_dense1_bias',
 'hybridsequential0_dense1_fwd_output']

after:

['data',
 'hybridsequential0_conv0_weight',
 'hybridsequential0_conv0_bias',
 'hybridsequential0_conv0_fwd_output',
 'hybridsequential0_conv0_relu_fwd_output',
 'hybridsequential0_conv1_weight',
 ...
 'hybridsequential0_conv3_fwd_output',
 'hybridsequential0_conv3_relu_fwd_output',
 'hybridsequential0_flatten0_reshape0_output',
 'hybridsequential0_dense0_weight',
 'hybridsequential0_dense0_bias',
 'hybridsequential0_dense0_fwd_output',
 'hybridsequential0_dense0_relu_fwd_output',
 'hybridsequential0_dense1_weight',
 'hybridsequential0_dense1_bias',
 'hybridsequential0_dense1_fwd_output',
 'softmax_label',
 'softmax_output']

So I guess I need the _label. But what is this label and why do I need it? I can’t see the source code for SoftmaxOutput since it lives in the generated Python file. More importantly, what if we want something that’s not a softmax output? Gluon doesn’t attach a softmax symbol at the very end, so the output of that neural network could be used for anything, not necessarily passed through softmax.


#2

You don’t need a label. It is the default because that was assumed to be the most common use case: symbolic models tend to end in a loss layer, while in Gluon the loss is usually computed after the model output.

I modified your example so that the label names are purposefully left empty and inference is done without a label:

import mxnet as mx

# Get a Gluon Model 
net = mx.gluon.model_zoo.vision.resnet101_v1(pretrained=True)

# Export to symbolic format
net.hybridize()
net(mx.nd.ones((1,3,224,224)))
net.export('test', 0)

# Load symbolic model
mx_sym, args, auxs = mx.model.load_checkpoint('test',0)

# Create module
model = mx.mod.Module(symbol = mx_sym, context = mx.cpu(), label_names=[])

# Bind the data shape and load params
model.bind(for_training=False,
           data_shapes = [('data',(1,3,224,224))],
           label_shapes = None)
model.set_params(args, auxs)

# Pass data through model
data = mx.io.DataBatch([mx.nd.ones((1,3,224,224))], provide_data=model.data_shapes)
model.forward(data)
model.get_outputs()