MxNet has trouble saving all parameters of a network


#1

In my experiment, MXNet appears to skip some of my network's parameters when saving.

I am studying MXNet's gluoncv package (https://gluon-cv.mxnet.io/index.html). To learn from how its engineers write code, I manually build an SSD with 'gluoncv.model_zoo.ssd.SSD'. The arguments I pass to this class are the same as for the official 'ssd_512_resnet50_v1_voc' network, except classes=('car', 'pedestrian', 'truck', 'trafficLight', 'biker').

from gluoncv.model_zoo.ssd import SSD
import mxnet as mx

name = 'resnet50_v1'
base_size = 512
features = ['stage3_activation5', 'stage4_activation2']
filters = [512, 512, 256, 256]
sizes = [51.2, 102.4, 189.4, 276.4, 363.52, 450.6, 492]
ratios = [[1, 2, 0.5]] + [[1, 2, 0.5, 3, 1.0/3]] * 3 + [[1, 2, 0.5]] * 2
steps = [16, 32, 64, 128, 256, 512]
classes = ('car', 'pedestrian', 'truck', 'trafficLight', 'biker')

pretrained = True

net = SSD(network=name, base_size=base_size, features=features,
          num_filters=filters, sizes=sizes, ratios=ratios, steps=steps,
          pretrained=pretrained, classes=classes)

I then feed a synthetic input x to this network, and it raises the following error.

batch_size = 1  # not defined in the original snippet; any positive integer works
x = mx.nd.zeros(shape=(batch_size, 3, base_size, base_size))
cls_preds, box_preds, anchors = net(x)

RuntimeError: Parameter 'ssd0_expand_trans_conv0_weight' has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks

This is reasonable. The SSD uses the function 'gluoncv.nn.feature.FeatureExpander' to add new layers on top of 'resnet50_v1', and I forgot to initialize them. So I use the following code.

net.initialize()

Oho, it gives me a lot of warnings.

v.initialize(None, ctx, init, force_reinit=force_reinit)
C:\Users\Bird\AppData\Local\conda\conda\envs\ssd\lib\site-packages\mxnet\gluon\parameter.py:687: UserWarning: Parameter ‘ssd0_resnetv10_stage4_batchnorm9_running_mean’ is already initialized, ignoring. Set force_reinit=True to re-initialize.
v.initialize(None, ctx, init, force_reinit=force_reinit)
C:\Users\Bird\AppData\Local\conda\conda\envs\ssd\lib\site-packages\mxnet\gluon\parameter.py:687: UserWarning: Parameter ‘ssd0_resnetv10_stage4_batchnorm9_running_var’ is already initialized, ignoring. Set force_reinit=True to re-initialize.
v.initialize(None, ctx, init, force_reinit=force_reinit)

The 'resnet50_v1' base of the SSD is pretrained, so its parameters are already initialized and are skipped. However, these warnings are annoying.

How can I turn them off?

Here, though, comes the first problem. I would like to save the parameters of the network.

net.save_params('F:/Temps/Models_tmp/' + 'myssd.params')

The parameter file of 'resnet50_v1' ('resnet50_v1-c940b1a0.params') is 97.7MB; however, my parameter file is only 9.96MB. Is there some magic that compresses these parameters?

To test this, I open a new console and rebuild the same network. Then I load the saved parameters and feed an input to it.

net.load_params('F:/Temps/Models_tmp/' + 'myssd.params')
x = mx.nd.zeros(shape=(batch_size, 3, base_size, base_size))
cls_preds, box_preds, anchors = net(x)

The initialization error comes again.

RuntimeError: Parameter 'ssd0_expand_trans_conv0_weight' has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks

This cannot be right, because the saved file 'myssd.params' should contain all the initialized parameters of my network.

To find the block 'ssd0_expand_trans_conv0', I dug deeper into 'gluoncv.nn.feature.FeatureExpander'. There I replaced 'mx.sym.Convolution' with 'mxnet.gluon.nn.Conv2D'; the original calls are kept in the triple-quoted strings.

        '''
        y = mx.sym.Convolution(
            y, num_filter=num_trans, kernel=(1, 1), no_bias=use_bn,
            name='expand_trans_conv{}'.format(i), attr={'__init__': weight_init})
        '''
        # use_bias is the inverse of no_bias, hence "not use_bn"
        Conv1 = nn.Conv2D(channels=num_trans, kernel_size=(1, 1),
                          use_bias=not use_bn, weight_initializer=weight_init)
        y = Conv1(y)
        Conv1.initialize(verbose=True)
    '''
    y = mx.sym.Convolution(
        y, num_filter=f, kernel=(3, 3), pad=(1, 1), stride=(2, 2),
        no_bias=use_bn, name='expand_conv{}'.format(i), attr={'__init__': weight_init})
    '''
    Conv2 = nn.Conv2D(channels=f, kernel_size=(3, 3), padding=(1, 1), strides=(2, 2),
                      use_bias=not use_bn, weight_initializer=weight_init)
    y = Conv2(y)
    Conv2.initialize(verbose=True)

These new blocks can be initialized manually. However, MXNet still reports the same error.
It seems that the manual initialization has no effect.

How can I save all the parameters of my network and restore them?


#2

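You will not be able to save the parameters until you have passed a real input through the network after calling initialize(), because the shapes of the new layers' parameters are only inferred during that first forward pass.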

To summarize the steps:

  1. Grab a net = gluoncv.model_zoo.get_model('xxx'). If you are using a pretrained model, you can safely call net.save_params('xxx') right away, because all weight shapes have already been inferred, and you can skip steps 2 to 4.
  2. Call net.initialize() if the network was not loaded from pretrained weights.
  3. Run a forward pass, for example net(mx.nd.zeros((1, 3, 512, 512))), so that all parameter shapes are inferred.
  4. Call net.save_params().
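
The reason the order of these steps matters can be illustrated with a toy model of deferred initialization. This is only a sketch: ToyDense is a hypothetical class written for this example, not MXNet code, and it merely mimics the behaviour described above (a parameter cannot be allocated, and therefore cannot be saved, until a real input reveals its shape).

```python
# Toy model of deferred initialization. ToyDense is hypothetical and is NOT
# part of MXNet; it only mimics the behaviour described in the steps above.
import random


class ToyDense:
    """A dense layer whose input width is unknown until the first forward pass."""

    def __init__(self, units):
        self.units = units
        self.init_requested = False  # set by initialize()
        self.weight = None           # allocated on the first forward pass

    def initialize(self):
        # The input width is still unknown here, so allocation is deferred.
        self.init_requested = True

    def __call__(self, x):
        if not self.init_requested:
            raise RuntimeError("parameter has not been initialized")
        if self.weight is None:
            # Infer the missing dimension from the real input, then allocate.
            in_units = len(x)
            self.weight = [[random.gauss(0.0, 0.01) for _ in range(in_units)]
                           for _ in range(self.units)]
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]

    def save(self):
        if self.weight is None:
            raise RuntimeError("cannot save: parameter shape not inferred yet")
        return [row[:] for row in self.weight]


layer = ToyDense(units=2)
layer.initialize()                 # step 2

try:
    layer.save()                   # step 4 before step 3: fails
    saved_too_early = True
except RuntimeError:
    saved_too_early = False

layer([0.0, 0.0, 0.0])             # step 3: a dummy input fixes the shape
saved = layer.save()               # step 4 now succeeds
```

In the toy, the dummy input plays the same role as mx.nd.zeros((1, 3, 512, 512)) in step 3: its values are irrelevant, only its shape is used.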

#3

We will add an option to turn off the warnings when some of the parameters are already initialized.
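
Until such an option exists, one possible workaround is Python's standard warnings module, since the messages in the log above are ordinary UserWarnings. The sketch below is hedged accordingly: noisy_initialize is a hypothetical stand-in for net.initialize() so that the example runs without MXNet, and quiet_call is a helper name invented for this example.

```python
import warnings


def noisy_initialize():
    # Hypothetical stand-in for net.initialize(); it emits the same kind of
    # UserWarning as shown in the log above.
    warnings.warn(
        "Parameter 'ssd0_resnetv10_stage4_batchnorm9_running_mean' is already "
        "initialized, ignoring. Set force_reinit=True to re-initialize.",
        UserWarning,
    )
    return "initialized"


def quiet_call(fn):
    """Call fn() while hiding only the 'already initialized' UserWarnings."""
    with warnings.catch_warnings():
        # The message argument is a regular expression matched against the
        # start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message=".*is already initialized.*",
            category=UserWarning,
        )
        return fn()


result = quiet_call(noisy_initialize)
```

With a real network, the call would presumably look like quiet_call(net.initialize). The filter is restored when the with block exits, so other warnings, and the same warning elsewhere in the program, remain visible.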


#4

Thanks for your support. I have not tested your idea yet. I will report the results after running an experiment.