 Automatic Differentiation

Why give it an `nd.array([10, 1., .1, .01])`?
What do I get in `x.grad` if I don’t pass the head gradient to the `backward` function? Isn’t it dz/dx?

Hi @vermicelli,

I think `backward` should be applied to `y` here, not `z`; that would make more sense to me.

And then the example should show a case where you calculate dz/dy manually (possibly even without using MXNet), and still use autograd for dy/dx, so that dz/dx ends up in `x.grad`, as you pointed out.

Something like this example:

``````import mxnet as mx

x = mx.nd.array([0., 1., 2., 3.])
x.attach_grad()  # allocate space for x.grad

# record the computation so autograd can differentiate it
with mx.autograd.record():
    y = x * 2

# dy/dz calculated outside of autograd
dydz = mx.nd.array([10, 1., .1, .01])
y.backward(dydz)
# autograd multiplies this head gradient by dy/dx, giving dz/dx,
# even though dz/dy was computed outside of autograd
print(x.grad)
``````
``````[20.    2.    0.2   0.02]
<NDArray 4 @cpu(0)>
``````

@mli @smolix please confirm? This is quite a complex example for an intro. Are there many use cases of this that you’ve seen in the wild?

Thank you for your reply. This makes sense to me. But I think the ‘dy/dz’ in the comment `# dy/dz calculated outside of autograd` should be ‘dz/dy’. My understanding of your example is that you let MXNet do the autograd on dy/dx, which should be 2, and told autograd that you already had the dz/dy part, computed manually, which is `[10, 1., .1, .01]`. Autograd then stores dz/dy * dy/dx in `x.grad` as the final result. Am I right?
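That reading checks out arithmetically. As a minimal sketch, assuming a hypothetical downstream scalar z = sum(w * y) with w = `[10, 1., .1, .01]` (any z with that dz/dy would do), the chain rule in plain NumPy reproduces the values shown for `x.grad`:

```python
import numpy as np

# Hypothetical choice of z: z = sum(w * y), so dz/dy = w.
w = np.array([10., 1., .1, .01])

x = np.array([0., 1., 2., 3.])
y = 2 * x           # the forward pass from the example
dzdy = w            # the head gradient passed to backward()
dydx = 2.0          # derivative of y = 2x, the part autograd computes
dzdx = dzdy * dydx  # chain rule: dz/dx = dz/dy * dy/dx

print(dzdx)         # elementwise [20., 2., 0.2, 0.02], matching x.grad
```

Note that dz/dx here does not depend on the value of `x` at all, because y = 2x is linear; the head gradient alone determines the scaling.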