Initializing parameters of SymbolBlock

Hi, I am migrating from my Symbol API based code base to Gluon. Since I want to reuse networks defined in Symbol API, I thought gluon SymbolBlock would fit my needs perfectly.

But I couldn’t get parameter initialization working correctly. For example, I want to initialize convolution weights with Xavier and zero-initialize biases, batch norm ‘beta’, etc. Batch norm ‘gamma’ and ‘moving_var’ need to be initialized with all ones. I thought block.initialize(mx.init.Xavier()) would do the job, but apparently it also tries to initialize conv biases and batch norm parameters with Xavier, which raises an error.

I modified the parameter initialization logic in SymbolBlock here into something like this. This works, but ideally I don’t want to modify the MXNet code base. So, is there a good way to do proper parameter initialization of a SymbolBlock? Or is SymbolBlock meant to be used only with pre-trained models?

Hi @masahi, you can initialize each parameter individually. Here I get a model from the model zoo and save it in its symbolic format, so that we are in the same starting position. I then load the model using the new SymbolBlock.imports API to get back my symbol. Next, I split the parameters into groups according to what you described above and apply the respective initializer to each group. Finally, we can print the weights to make sure they have been initialized properly.

import mxnet as mx
from mxnet import gluon

# Save a symbolic model
net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, classes=5)
net.hybridize()
net.initialize()
net(mx.nd.ones((1,3,224,224)))
net.export('test')

# Load the symbolic model
s = gluon.nn.SymbolBlock.imports('test-symbol.json', ['data'])

# Initializing the parameters
# Ones for batch norm gamma and running variance
s.collect_params('.*gamma|.*running_var').initialize(mx.init.Constant(1))
# Zeros for batch norm beta and running mean, and for all biases
s.collect_params('.*beta|.*bias|.*running_mean').initialize(mx.init.Constant(0))
# Xavier for convolution and dense weights
s.collect_params('.*weight').initialize(mx.init.Xavier())

# Running one batch of data
s(mx.nd.ones((1,3,224,224)))

# Check results: first parameter of each group
print(list(s.collect_params('.*gamma|.*running_var').values())[0].data())
print(list(s.collect_params('.*beta|.*bias|.*running_mean').values())[0].data())
print(list(s.collect_params('.*weight').values())[0].data()[0,0,:])

Output:
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
<NDArray 64 @cpu(0)>

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
<NDArray 64 @cpu(0)>

[[ 0.03865026 -0.03501667 -0.04138449  0.01635652  0.00095572 -0.003131
  -0.01541091]
 [-0.00846649  0.0084407  -0.01027619 -0.01369495 -0.01029496  0.00449695
   0.00767175]
 [ 0.03228955 -0.03860051  0.02594707  0.010812    0.03466668  0.02166713
   0.02997286]
 [ 0.04084081 -0.01115768 -0.03034292 -0.03436436 -0.02279269 -0.02631962
  -0.02392154]
 [-0.01886583  0.00957927 -0.03969173 -0.0156552  -0.00196541  0.03107235
  -0.02484216]
 [ 0.02815529 -0.00914492 -0.00087662 -0.02381214 -0.0427332  -0.00936336
   0.04226137]
 [ 0.02236562 -0.00982476 -0.02679424  0.03971916  0.03767575 -0.01587596
   0.02418222]]
<NDArray 7x7 @cpu(0)>
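
As a side note on why a single Xavier initializer fails for biases and batch norm parameters: Xavier derives its sampling range from fan-in and fan-out, which only exist for arrays with at least two dimensions. Here is a rough pure-Python sketch of that computation. It assumes uniform sampling, the averaged fan-in/fan-out factor, and a magnitude of 3, which is my understanding of the mx.init.Xavier defaults and is not taken from the MXNet source. The function name xavier_uniform_bound is made up for this illustration.

```python
import math


def xavier_uniform_bound(shape, magnitude=3.0):
    """Return b such that weights would be drawn from U(-b, b).

    Assumes the 'avg' factor type: factor = (fan_in + fan_out) / 2.
    """
    if len(shape) < 2:
        # Biases and batch norm parameters are 1-D, so there is no fan-in/fan-out.
        raise ValueError("Xavier-style init needs at least 2 dimensions, got %r" % (shape,))
    receptive_field = 1
    for dim in shape[2:]:
        receptive_field *= dim
    fan_in = shape[1] * receptive_field
    fan_out = shape[0] * receptive_field
    return math.sqrt(magnitude / ((fan_in + fan_out) / 2.0))


# First convolution of resnet18_v1: 64 filters, 3 input channels, 7x7 kernel.
conv0_bound = xavier_uniform_bound((64, 3, 7, 7))
print(conv0_bound)

# A 1-D parameter such as a bias or batch norm gamma is rejected.
try:
    xavier_uniform_bound((64,))
    vector_error = None
except ValueError as err:
    vector_error = str(err)
print(vector_error)
```

Under those assumptions the bound for the first convolution comes out at about 0.0428, and every value in the 7x7 slice printed above falls inside that range.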

Thanks very much for the detailed answer, @ThomasDelteil. I ended up rewriting my model in Gluon. But it is good to know such a solution exists.

I will use your method later when I use other existing symbol based models in Gluon.

@masahi no worries. I have updated my answer to take advantage of the select regex parameter of collect_params(), which makes it cleaner and simpler. Thanks @safrooze for pointing this out to me!
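
When splitting parameters by regex like this, it can help to check that every parameter lands in exactly one group before initializing anything. Below is a small sketch using only Python's re module. It assumes the select pattern is applied to each parameter name with re.match (anchored at the start of the name), which is how I understand collect_params() to behave. The helper partition_by_pattern and the parameter names are hypothetical; with a real block you could pass list(s.collect_params().keys()) instead.

```python
import re


def partition_by_pattern(names, patterns):
    """Group names by the patterns they match and report gaps and overlaps."""
    compiled = [(p, re.compile(p)) for p in patterns]
    groups = {p: [] for p in patterns}
    hits = {name: 0 for name in names}
    for name in names:
        for pattern, regex in compiled:
            if regex.match(name):
                groups[pattern].append(name)
                hits[name] += 1
    unmatched = [n for n in names if hits[n] == 0]
    overlapping = [n for n in names if hits[n] > 1]
    return groups, unmatched, overlapping


# Hypothetical names in the style of an exported Gluon ResNet.
names = [
    'resnetv10_conv0_weight',
    'resnetv10_batchnorm0_gamma',
    'resnetv10_batchnorm0_beta',
    'resnetv10_batchnorm0_running_mean',
    'resnetv10_batchnorm0_running_var',
    'resnetv10_dense0_weight',
    'resnetv10_dense0_bias',
]

# Ones for gamma and variance, zeros for beta, bias and mean, Xavier for weights.
patterns = [
    '.*gamma|.*running_var',
    '.*beta|.*bias|.*running_mean',
    '.*weight',
]

groups, unmatched, overlapping = partition_by_pattern(names, patterns)
for pattern in patterns:
    print(pattern, '->', groups[pattern])
print('unmatched:', unmatched)
print('overlapping:', overlapping)
```

If unmatched is non-empty, some parameter would never be initialized. If overlapping is non-empty, a parameter would be initialized by more than one group, so the result would depend on how the initialize() calls are ordered.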
