CustomOp without backward pass


Hi everyone,

What happens in gluon if I do not define a backward pass for a custom operator? Does it use autograd from the forward pass? If not, how can I implement the backward pass so that autograd of the forward pass is used? I am aware that I could inherit from HybridBlock instead, where I do not have to overwrite the backward pass.


Typically, one would implement a custom op when the operation cannot be performed using ndarray operators, in which case both the forward and backward equations must be hand-implemented.

If your forward equation can be implemented using ndarray operators, there is no need to implement a custom operator. Your equations can simply run under the autograd scope:

with autograd.record():
    out = x * y + k
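For instance, here is a minimal self-contained version of the snippet above (the array values are just for illustration):

```python
import mxnet as mx
from mxnet import autograd

x = mx.nd.array([1.0, 2.0, 3.0])
y = mx.nd.array([4.0, 5.0, 6.0])
k = mx.nd.array([0.5, 0.5, 0.5])
x.attach_grad()  # ask autograd to compute the gradient w.r.t. x

with autograd.record():  # record the forward computation
    out = x * y + k

out.backward()  # autograd derives the backward pass automatically
print(x.grad)   # d(out)/dx = y
</imports>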

Alternatively if you are composing a neural network that has several layers and you simply want one of the layers to be custom, then what you want is to implement a custom Gluon block, not a custom operator:

class MyBlock(gluon.HybridBlock):
    def __init__(self, **kwargs):
        super(MyBlock, self).__init__(**kwargs)
        self.k = self.params.get('k', shape=(100,))  # k is a learnable bias in this case

    def hybrid_forward(self, F, x, y, k):
        return x * y + k


Thanks for your reply. As I said, I am aware that I can solve the issue by implementing a HybridBlock, and that is what I am using now. I was just very surprised by how my network behaved before, when I used a custom operator without a backward pass, and was curious what was happening there.
In my opinion, when a custom operator is defined without a backward pass, it should either use autograd for the backward pass, or throw an error to indicate that the backward behaviour is very unlikely to do what the user expects. I think it just blocks gradients at the moment.


If you look at the CustomOp class, which you need to inherit from when you write a custom operator, you'll see that the backward() method in the base class doesn't do anything, which means the gradient w.r.t. the input is zero if backward() isn't implemented.

The system cannot use autograd here because a CustomOp should really only be used when there are no NDArray-equivalent operations for the task; in that case, the operations are performed in numpy, which autograd cannot trace.

An error is not thrown because there can be many instances in which only forward is of relevance to the user. I do, however, think that perhaps a warning log is justified when backward() of the base class is invoked. Care to submit a PR for a warning log? :slight_smile:


Thanks, that answers all my questions and it totally makes sense.
Might open a PR when I have some free time :slight_smile: