The idea is that I want to clip the weights of the network by their L2 norm after the optimizer updates them. This L2 norm is calculated differently for different layer types (such as conv and fc).
Does anyone have any suggestions on how to do this in MXNet? Preferably with a simple modification, so that I can still use the multi-GPU training API.
Then collect the data arrays of all the network parameters and clip them to max_norm:
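import mxnet as mx
# net is assumed to be an initialized Gluon block living on a single context
net_params = [p.data() for p in net.collect_params().values()]
max_norm = 0.1
# Rescales the arrays in place by one shared factor, based on their combined L2 norm
mx.gluon.utils.clip_global_norm(net_params, max_norm=max_norm)
print(net[0].weight.data())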
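Note that clip_global_norm uses a single norm computed over all parameters together, whereas the question asks for a norm computed per layer. The following is a minimal NumPy sketch of the arithmetic only, not MXNet code. The function names clip_global_norm_np and clip_per_unit_norm are hypothetical, and the per-layer rule (one norm per output unit for fc, one per filter for conv, with axis 0 indexing outputs) is an assumption about what "calculated differently" might mean.

```python
import numpy as np


def clip_global_norm_np(arrays, max_norm, eps=1e-8):
    """Rescale all arrays in place by one shared factor so that their
    combined L2 norm does not exceed max_norm.
    Returns the combined norm measured before clipping."""
    total_norm = float(np.sqrt(sum(float(np.sum(a * a)) for a in arrays)))
    scale = min(max_norm / (total_norm + eps), 1.0)
    if scale < 1.0:
        for a in arrays:
            a *= scale
    return total_norm


def clip_per_unit_norm(weight, max_norm, eps=1e-8):
    """Clip each output unit's (fc) or filter's (conv) weights in place.
    Assumption: axis 0 indexes output units / filters, so the norm is
    reduced over every remaining axis."""
    reduce_axes = tuple(range(1, weight.ndim))
    norms = np.sqrt(np.sum(weight * weight, axis=reduce_axes, keepdims=True))
    # Units already inside the ball get a factor of 1 and stay unchanged.
    scale = np.minimum(max_norm / (norms + eps), 1.0)
    weight *= scale
    return weight


# Hypothetical weights: fc is (out, in), conv is (out, in, kh, kw).
fc_w = np.full((2, 4), 1.0)
conv_w = np.full((3, 2, 2, 2), 0.01)

clip_per_unit_norm(fc_w, max_norm=0.1)    # each row norm was 2.0, gets scaled down
clip_per_unit_norm(conv_w, max_norm=0.1)  # each filter norm is about 0.028, left as is
```

In Gluon, the same scaling would presumably be applied to each parameter right after trainer.step(...), on every device the parameter lives on; that wiring is not shown here.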