Change batch normalization momentum during training


#1

How can I change the momentum of mx.sym.BatchNorm during training?

Training usually converges faster if we gradually increase the momentum.

I cannot find a way to do it.


#2

Once a BatchNorm layer is created, there is no easy way to change the momentum for that layer.

However, if you plan to increase the momentum every few epochs, you can save the model's parameters and trainer state, recreate the model with a higher momentum for the BN layers, load the parameters and trainer state back, and resume training. As long as you do not do this frequently, it shouldn't add much overhead.
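
Here is a rough sketch of that save / recreate / load cycle, assuming the Gluon API (the "trainer state" above suggests `gluon.Trainer`). The names `momentum_for_epoch`, `restart_with_momentum`, `build_net` and `build_trainer` are hypothetical helpers made up for illustration, and the schedule values (0.9 to 0.99, stepping every 10 epochs) are placeholders rather than recommendations. `build_net(momentum)` is assumed to return a fresh network whose `nn.BatchNorm` layers were constructed with that momentum, and `build_trainer(net)` a fresh `gluon.Trainer` with the same optimizer settings as before.

```python
def momentum_for_epoch(epoch, start=0.9, end=0.99, step=0.03, every=10):
    """Hypothetical stepwise schedule: raise the BN momentum by `step`
    once every `every` epochs, never going above `end`."""
    return min(end, start + step * (epoch // every))


def restart_with_momentum(net, trainer, build_net, build_trainer,
                          momentum, prefix="bn_restart", ctx=None):
    """Save the current state, rebuild the model with a new BN momentum,
    then restore weights and optimizer state into the new objects.

    `net` is assumed to be a gluon Block and `trainer` a gluon.Trainer;
    `build_net` and `build_trainer` are user-supplied factories (assumptions).
    """
    params_file = prefix + ".params"
    states_file = prefix + ".states"

    # 1. Persist the weights (including BN running statistics) and optimizer state.
    net.save_parameters(params_file)
    trainer.save_states(states_file)

    # 2. Recreate the model; only the BN momentum hyperparameter differs.
    new_net = build_net(momentum)
    new_net.load_parameters(params_file, ctx=ctx)

    # 3. Recreate the trainer on the new parameters and restore its state.
    new_trainer = build_trainer(new_net)
    new_trainer.load_states(states_file)
    return new_net, new_trainer


# Possible use inside a training loop (not executed here):
#
# for epoch in range(num_epochs):
#     m = momentum_for_epoch(epoch)
#     if m != current_momentum:
#         net, trainer = restart_with_momentum(
#             net, trainer, build_net, build_trainer, m)
#         current_momentum = m
#     train_one_epoch(net, trainer)  # hypothetical
```

Since the restart only happens when the schedule actually changes the value (a handful of times over a whole run with the placeholder settings), the extra disk round trip should stay small compared with the training time.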