Cannot bind model with custom loss function


#1

I trained a model with a weighted loss successfully, but when I try to load it for prediction, I get errors.
The model is trained with the weighted loss from apache/incubator-mxnet/blob/v1.0.0/example/sparse/weighted_softmax_ce.py
Then I created the multi-loss symbol using:

softmax_list = []

for i in range(len(classifier_def)):
    fc = mx.symbol.FullyConnected(data=flat,
                                  num_hidden=classifier_def[i][1],
                                  name=classifier_def[i][0])
    softmax_list.append(mx.sym.MakeLoss(mx.symbol.Custom(data=fc,
                                                         name='softmax_' + classifier_def[i][0],
                                                         positive_cls_weight=classifier_def[i][2],
                                                         op_type='weighted_softmax_ce_loss')))

softmax = mx.symbol.Group(softmax_list)
return softmax

It trains successfully. But when I try to use the model for prediction like this, binding fails:

sym_tmp, arg_params, aux_params = mx.model.load_checkpoint(base_params_path, args['load_epoch'])

devs = mx.cpu() if args['gpus'] is None else [mx.gpu(int(i)) for i in args['gpus'].split(',')]
mod = mx.mod.Module(symbol=sym_tmp,
                    context=devs,
                    label_names=None)
print mod.output_names
mod.bind(for_training=False,
         data_shapes=[('data', (args['batch_size'],) + mean_img.shape)])
mod.set_params(arg_params, aux_params, allow_missing=True)

ERROR:
mod.output_names:
['makeloss0_output', 'makeloss1_output', 'makeloss2_output', 'makeloss3_output', 'makeloss4_output', 'makeloss5_output', 'makeloss6_output', 'makeloss7_output', 'makeloss8_output', 'makeloss9_output', 'makeloss10_output', 'makeloss11_output', 'makeloss12_output', 'makeloss13_output', 'makeloss14_output', 'makeloss15_output', 'makeloss16_output', 'makeloss17_output', 'makeloss18_output', 'makeloss19_output', 'makeloss20_output', 'makeloss21_output', 'makeloss22_output', 'makeloss23_output', 'makeloss24_output', 'makeloss25_output', 'makeloss26_output', 'makeloss27_output', 'makeloss28_output', 'makeloss29_output', 'makeloss30_output', 'makeloss31_output', 'makeloss32_output', 'makeloss33_output', 'makeloss34_output', 'makeloss35_output', 'makeloss36_output', 'makeloss37_output', 'makeloss38_output', 'makeloss39_output', 'makeloss40_output', 'makeloss41_output', 'makeloss42_output']

mod.bind(for_training=False,
...      data_shapes=[('data', (40,3,224,224))])
[19:53:55] /Users/travis/build/dmlc/mxnet-distro/mxnet-build/dmlc-core/include/dmlc/logging.h:308: [19:53:55] src/executor/graph_executor.cc:413: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:
softmax_animal print_label: (), softmax_baseball_label: (), softmax_batwing_label: (), softmax_bow_label: (), softmax_boyfriend_label: (), softmax_cable-knit_label: (), softmax_cami_label: (), softmax_camo_label: (), softmax_capri_label: (), softmax_chino_label: (), …

This looks like an InferShape issue. Any suggestions?


#2

During prediction you do not feed labels to the model, so the label symbols (softmax_bow_label, etc.) have no matching inputs and their shapes cannot be inferred.
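To see which arguments are affected, you can compare the graph's argument names against what you actually provide. A rough sketch in plain Python: the helper name unresolved_arguments and the sample argument list are made up for illustration; on the real model the list would come from sym_tmp.list_arguments() and the parameters from the loaded arg_params.

```python
def unresolved_arguments(arguments, data_names, arg_params):
    """Return graph arguments that are neither fed as data nor loaded as parameters."""
    known = set(data_names) | set(arg_params)
    return [name for name in arguments if name not in known]


# Hypothetical argument list, shaped like what sym_tmp.list_arguments() returns
# for a single classifier named 'bow'.
arguments = ['data', 'bow_weight', 'bow_bias', 'softmax_bow_label']
arg_params = {'bow_weight': None, 'bow_bias': None}  # placeholder values

# Only the label is left over: nothing supplies it at prediction time.
print(unresolved_arguments(arguments, ['data'], arg_params))
```

Anything this returns is an input the executor has to infer a shape for with no information, which matches the list of *_label names in the error message.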

You need to use a different symbol graph that doesn’t contain the losses.

For example, use mx.symbol.Group(fc_list), where fc_list contains the outputs of all the FC layers.
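One way to get such a graph without rebuilding the network is to pull the FC outputs out of the loaded checkpoint symbol. This is a minimal sketch, assuming the FC layers were named classifier_def[i][0] as in the training code, so that their internal outputs are called '<name>_output'. The helper names fc_output_names and build_inference_symbol are made up for illustration, and mxnet is imported inside the function only so that the name helper can be used on its own.

```python
def fc_output_names(classifier_def):
    """Internal output names of the FC layers, e.g. 'bow' -> 'bow_output'."""
    return [entry[0] + '_output' for entry in classifier_def]


def build_inference_symbol(sym, classifier_def):
    """Group the FC outputs of a loaded symbol, leaving out the loss layers."""
    import mxnet as mx  # imported lazily; only needed when building the symbol

    internals = sym.get_internals()
    available = set(internals.list_outputs())
    names = fc_output_names(classifier_def)
    missing = [n for n in names if n not in available]
    if missing:
        raise KeyError('outputs not found in graph: %s' % missing)
    fc_list = [internals[n] for n in names]
    return mx.symbol.Group(fc_list)


# Usage (hypothetical, following the variable names in the question):
# sym_tmp, arg_params, aux_params = mx.model.load_checkpoint(
#     base_params_path, args['load_epoch'])
# mod = mx.mod.Module(symbol=build_inference_symbol(sym_tmp, classifier_def),
#                     context=devs, label_names=None)
# mod.bind(for_training=False,
#          data_shapes=[('data', (args['batch_size'],) + mean_img.shape)])
# mod.set_params(arg_params, aux_params, allow_missing=True)
```

Note that the grouped outputs are the raw FC scores, not probabilities, so you would still need to apply a softmax per classifier if you want probabilities.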