CustomOp without backward pass

adrian · December 7, 2018, 2:52pm

Hi everyone,

What happens in gluon if I do not define a backward pass for a custom operator? Does it use autograd from the forward pass? If not, how can I implement the backward pass so that autograd of the forward pass is used? I am aware that I could inherit from HybridBlock instead, where I do not have to overwrite the backward pass.

safrooze · December 12, 2018, 2:19am

Typically one would implement a custom op only if the operation cannot be performed using ndarray operators, in which case both the forward and backward equations must be hand-implemented.
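To make "hand-implemented" concrete, here is a minimal sketch in plain Python (no MXNet involved). The function names `forward`, `backward` and `numeric_grad` are made up for illustration. It writes the forward equation and its hand-derived gradients for a toy expression, then checks the gradients against central finite differences, which is a common sanity check when writing a backward pass by hand.

```python
# Toy stand-in for a hand-written operator: plain Python floats, no MXNet.
def forward(x, y, k):
    """Forward equation: out = x * y + k."""
    return x * y + k

def backward(out_grad, x, y):
    """Hand-derived gradients w.r.t. x, y and k, scaled by the incoming gradient."""
    return out_grad * y, out_grad * x, out_grad

def numeric_grad(f, args, i, eps=1e-6):
    """Central finite difference of f with respect to args[i]."""
    hi = list(args)
    lo = list(args)
    hi[i] += eps
    lo[i] -= eps
    return (f(*hi) - f(*lo)) / (2 * eps)

args = (3.0, 4.0, 0.5)
analytic = backward(1.0, args[0], args[1])
numeric = tuple(numeric_grad(forward, args, i) for i in range(3))
```

If `analytic` and `numeric` disagree by more than a small tolerance, the hand-written backward equation is probably wrong.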

If your forward equation can be implemented using ndarray operators, there is no need to implement custom operators. Technically your equations can simply be under the autograd scope:

with autograd.record():
    out = x * y + k

Alternatively if you are composing a neural network that has several layers and you simply want one of the layers to be custom, then what you want is to implement a custom Gluon block, not a custom operator:

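from mxnet import gluon

class MyBlock(gluon.HybridBlock):
    def __init__(self, **kwargs):
        super(MyBlock, self).__init__(**kwargs)
        # k is a learnable bias in this case
        self.k = self.params.get('k', shape=(100, 0))

    def hybrid_forward(self, F, x, y, k):
        # Registered parameters such as k are passed in as keyword arguments
        return x * y + k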

adrian · December 13, 2018, 7:20pm

Thanks for your reply. As I said, I am aware that I can solve the issue by implementing a HybridBlock, and that is what I am using now. I was just very surprised by how my network behaved before, when I used a custom operator without a backward pass, and was curious what was happening there.
In my opinion, when a custom operator is defined without a backward pass, it should either use autograd for backward or throw an error to indicate that the backward behaviour is very unlikely to do what the user would expect. I think it just blocks gradients at the moment.

safrooze · December 13, 2018, 7:53pm

If you look at the CustomOp class, which you need to inherit from when you write a custom operator, the backward() method in the base class doesn't do anything, which means that the gradient w.r.t. the input is zero if backward() isn't implemented.
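The effect can be illustrated with a toy model in plain Python. This is only a sketch: `ToyCustomOp`, `Square`, `SquareWithBackward` and `run` are hypothetical names, not MXNet code, and it assumes the framework hands the operator a zero-initialised gradient buffer.

```python
class ToyCustomOp:
    """Hypothetical stand-in for a CustomOp-style base class (not MXNet code)."""

    def forward(self, in_data, out_data):
        raise NotImplementedError

    def backward(self, out_grad, in_data, out_data, in_grad):
        pass  # base class does nothing, so in_grad is never written


class Square(ToyCustomOp):
    """Overrides forward only; backward falls through to the no-op base."""

    def forward(self, in_data, out_data):
        out_data[0] = in_data[0] ** 2


class SquareWithBackward(Square):
    """Adds the hand-derived gradient d(x^2)/dx = 2x."""

    def backward(self, out_grad, in_data, out_data, in_grad):
        in_grad[0] = out_grad[0] * 2 * in_data[0]


def run(op, x):
    in_data, out_data = [x], [0.0]
    in_grad = [0.0]  # assumption: the framework provides a zeroed buffer
    op.forward(in_data, out_data)
    op.backward([1.0], in_data, out_data, in_grad)
    return out_data[0], in_grad[0]


no_backward = run(Square(), 3.0)                # (9.0, 0.0)
with_backward = run(SquareWithBackward(), 3.0)  # (9.0, 6.0)
```

Both operators produce the same forward output, but only the one that overrides backward() writes a non-zero input gradient, so everything upstream of the forward-only operator receives no learning signal.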

The system cannot use autograd because one should really use a CustomOp only when there are no equivalent NDArray operations for the task, in which case the operations are performed in numpy, which cannot be captured by autograd.

An error is not thrown because there can be many instances in which only forward is of relevance to the user. I do, however, think that perhaps a warning log is justified when backward() of the base class is invoked. Care to submit a PR for a warning log?
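A rough sketch of what such a warning could look like, again on a hypothetical `ToyCustomOp` base class rather than MXNet's actual source. It uses Python's standard `warnings` module as an assumption; a `logging.warning` call would serve the same purpose.

```python
import warnings


class ToyCustomOp:
    """Hypothetical base class; not MXNet's actual CustomOp."""

    def backward(self, out_grad, in_data, out_data, in_grad):
        # Warn instead of failing, since forward-only use is legitimate.
        warnings.warn(
            "%s does not override backward(); input gradients are not computed"
            % type(self).__name__,
            RuntimeWarning,
        )


class ForwardOnly(ToyCustomOp):
    pass


# Capture the warning so it can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ForwardOnly().backward([], [], [], [])
```

Users who only care about the forward pass would see one message and could filter it, while users who forgot to implement backward() would get a hint about why their gradients vanish.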

adrian · December 13, 2018, 7:56pm

Thanks, that answers all my questions and it totally makes sense.
I might open a PR when I have some free time.
