MXNet Forum

Example SSD cannot load model if trained with resnet50


I trained an SSD model via SageMaker, which AFAIK uses the code.

After training, the model is expected to be converted to a “deployable” state, which removes the loss symbols, by running the script. Afterwards, I load the model with the following code:

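import mxnet as mx

ctx = mx.cpu()  # assumption: the post does not show how ctx is defined
# Load the converted ("deploy") checkpoint saved at epoch 0.
sym, arg_params, aux_params = mx.model.load_checkpoint('deploy_ssd_vgg16_reduced_512', 0)
mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 512, 512))])
mod.set_params(arg_params, aux_params, allow_extra=True)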

This works fine as long as the model is trained with a VGG feature extractor. However, SageMaker (and hence the example code) allows training with resnet50, which produces a model that can be converted with but the resulting model can no longer be loaded with the above code. The error I am getting is:

RuntimeError: _plus12_cls_pred_conv_bias is not presented

And indeed, the BN params and a few others are missing from the param file. Maybe the deploy script has a bug with resnet50?
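For anyone hitting the same error, here is a rough sketch of how one could list which parameters the symbol expects but the param file lacks. The helper `find_missing_params` and the toy names below are hypothetical (only `_plus12_cls_pred_conv_bias` comes from the error above); it just compares name lists against the loaded dicts and assumes the only input is called `data`.

```python
def find_missing_params(arg_names, aux_names, arg_params, aux_params,
                        input_names=("data",)):
    """Return (missing_args, missing_aux): names the symbol expects
    but the loaded parameter dicts do not contain."""
    # Inputs such as 'data' are fed at runtime, so they are not parameters.
    missing_args = [n for n in arg_names
                    if n not in arg_params and n not in input_names]
    missing_aux = [n for n in aux_names if n not in aux_params]
    return missing_args, missing_aux


# Toy data standing in for a real checkpoint (illustrative names only).
arg_names = ["data", "conv0_weight", "_plus12_cls_pred_conv_bias"]
aux_names = ["bn0_moving_mean"]
arg_params = {"conv0_weight": None}
aux_params = {}

missing_args, missing_aux = find_missing_params(
    arg_names, aux_names, arg_params, aux_params)
print("missing args:", missing_args)
print("missing aux:", missing_aux)
```

With a real checkpoint, the name lists would presumably come from `sym.list_arguments()` and `sym.list_auxiliary_states()`, and the dicts from `mx.model.load_checkpoint`. If the missing list is long (e.g. all BN moving statistics), that points at the saved param file rather than the loading code.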



Looking at the deploy script, it seems it gets the network symbols from so there might be a bug in the config definitions for resnet. I haven’t been able to pinpoint what exactly, though.


Thanks for the reply. Turns out it was a SageMaker bug producing wrong model files.