Hi all, I’ve run into an issue that I cannot find an answer for in the tutorials. If there is one, please point it out!
I want to implement this important paper in MXNet (Symbol API): http://proceedings.mlr.press/v37/ganin15.pdf
It uses a Gradient Reversal Layer, which behaves as the identity on the forward pass but multiplies the gradients by a negative constant on the backward pass. My issue is how to ‘feed’ that constant into the Gradient Reversal Layer when performing the backward pass. I’m building the Gradient Reversal Layer as a custom operator in the following way:
################################
import mxnet as mx

class GradientReversalLayer(mx.operator.CustomOp):
    def __init__(self, ctx, lambda_param):
        super(GradientReversalLayer, self).__init__()
        self.ctx = ctx
        self.lambda_param = lambda_param

    def forward(self, is_train, req, in_data, out_data, aux):
        x = in_data[0]
        y = x  # identity on forward pass
        self.assign(out_data[0], req[0], y)

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        # reverse and scale the gradient on the backward pass
        y = -self.lambda_param * out_grad[0]
        self.assign(in_grad[0], req[0], y)

@mx.operator.register("GradientReversalLayer")
class GradientReversalLayerProp(mx.operator.CustomOpProp):
    def __init__(self, lambda_param=1.0, **kwargs):
        super(GradientReversalLayerProp, self).__init__(need_top_grad=True)
        # keyword arguments arrive as strings from mx.sym.Custom
        self.lambda_param = float(lambda_param)

    def list_arguments(self):
        return ['data']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        data_shape = in_shape[0]
        output_shape = in_shape[0]
        return [data_shape], [output_shape], []

    def infer_type(self, in_type):
        dtype = in_type[0]
        return [dtype], [dtype], []

    def create_operator(self, ctx, shapes, dtypes):
        return GradientReversalLayer(ctx, lambda_param=self.lambda_param)
################################
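To be concrete about the semantics I’m after, here is a plain-NumPy sketch (no MXNet involved; the function names are just for illustration): the forward pass is the identity, and the backward pass multiplies the incoming gradient by -lambda.

```python
import numpy as np

def grl_forward(x):
    # Identity on the forward pass
    return x

def grl_backward(out_grad, lambda_param):
    # Reverse and scale the incoming gradient on the backward pass
    return -lambda_param * out_grad

x = np.array([1.0, -2.0, 3.0])
g = np.array([0.5, 0.5, 0.5])
print(grl_forward(x))                      # same as x
print(grl_backward(g, lambda_param=2.0))   # [-1. -1. -1.]
```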
You’ll notice the parameter ‘lambda_param’, which should be different on each batch (i.e. on each backward pass). Inside my training loop, I have a fairly standard use of autograd:
#############
from mxnet import autograd

with autograd.record():
    pred = model(data)
    losses = criterion_domain(pred, domain_label)
autograd.backward(losses)
#############
How should I pass ‘lambda_param’ into the Gradient Reversal Layer when calling autograd.backward? I would guess something like autograd.backward(losses, lambda_param=2), but I can’t find any supporting documentation. I’d very much appreciate any insight (or pointers to documentation) you can provide.
Thanks,
Ben