Train model with no bias in convolution layer


#1

For specific purpose, i want to remove bias in some convolution layer of “mobilenet_ssd_300” model and training network from scratch. Mxnet and gluoncv use batchnorm layer for faster convergence when training. I tried to set “beta” and “running_mean” term of batchnorm to 0 and lr_mult=0 to make sure that they cannot learn anything. however, in output model, i could see that “running_mean” term was not completely removed. So the model still had a small shift factor. It was not my expectation.
So, how can i completely remove shift factor in batchnorm layer when training?


#2

Hi you can try setting the center parameter in the BatchNorm layer to false with the use_global_stats param also set to False. See https://mxnet.incubator.apache.org/api/python/gluon/nn.html#mxnet.gluon.nn.BatchNorm for more.

Feel free to post a follow up with more details about your use case and or code sample if this doesn’t fully address your question.


#3

the param use_global_stats is False by default, the “running_mean” is not use center parameter, so the result wasn’t change.
Btw, is it possible to remove bias completely when training with batchnorm?


#4

This is in brief how BN works in default mode. In training, it calculates mean and std of each batch and normalizes the batch using these two values. It also updates the running mean and std, but it isn’t used in training. In inference, it only uses the running mean and inference.

This default behavior can be changed by setting use_global_stats to true, in which case BN simply uses the values of running mean and std to normalize the data without modifying anything.

I’m not exactly clear on what you’re trying to do. Are you intending to have BN calculate std, but not subtract mean and only scale by std?