Obtaining second-order derivatives of a function with respect to arbitrary parameters in the computation graph



We have an implementation of a recurrent network in MXNet and are trying to obtain the second-order derivatives of a loss function with respect to arbitrary (ideally all) parameters in the computational graph.

Is there a way to do this? Any help would be nice :slight_smile:


It’s unclear to me how many operators support higher-order gradients at this time, so it may not work on your network, but there is an interface that should allow you to do it, provided all the operators involved support it.


You can find documentation for it on this page, but you have to scroll down because for some reason, there isn’t an anchor link for it at the top.

The gist should be something like this (I haven’t tested it):

import mxnet as mx

with mx.autograd.record():
  output = net(x)
  loss = loss_func(output)
  # create_graph=True records the backward pass itself so it can be
  # differentiated again; [z] is the parameter(s) you want
  dz = mx.autograd.grad(loss, [z], create_graph=True)

dz[0].backward()  # the actual parameters should now hold second-order gradients


Thanks! This issue (https://github.com/apache/incubator-mxnet/issues/10002) seems to imply that not all operators support this yet.

I’ll try the interface you’ve pointed to and report operators it fails on to the contributors.