How to get gradients using symbol API


#1

This is probably a silly question but I’m having a difficult time learning the autograd API and the Symbol API. I can’t seem to figure out how to compute the gradient of a function when the function is using symbols and not NDArrays. For example:

import mxnet as mx
from mxnet import autograd

x_in = mx.nd.array([2])
x_in.attach_grad()

X = mx.sym.Variable("X")

with autograd.record():
    F = X * X
    execute = F.bind(ctx=mx.cpu(0),args={'X' : x_in})
    out = execute.forward()
    
    grad = autograd.grad(out[0], [x_in])

This code gives an error: “Cannot differentiate node because it is not in a computational graph.”

I feel like I’m missing some sort of fundamental information about how the autograd API and the Symbol API work, but I can’t seem to find examples of gradients being calculated with symbols.

Thanks for any help!


#2

You don’t need to explicitly use autograd when using the Symbol API. Set the is_train argument to True on your forward pass and the executor will keep the information needed to run a backward pass and give you back the computed gradients.
You also need to allocate the memory for your gradients up front, through the args_grad argument of bind.

If you want to use symbols, the Module API is good at hiding these low-level details from you.

Otherwise I would suggest using Gluon :smile: (rough sketches of both of these approaches follow the example below)

import mxnet as mx

x_in = mx.nd.array([2])

X = mx.sym.Variable("X")
F = X * X

# args_grad provides the storage where the gradient w.r.t. X will be written
executor = F.bind(ctx=mx.cpu(0), args={"X": x_in}, args_grad={"X": mx.nd.zeros((1,))})

out = executor.forward(is_train=True).copy()  # is_train=True keeps what is needed for backward

# Note: 'out' ([4.]) is passed back as the head gradient, so the result is dF/dX * 4 = 16;
# pass mx.nd.ones((1,)) instead to get dF/dX = 4.
executor.backward(out)
print(executor.grad_arrays)
[[16.]<NDArray 1 @cpu(0)>]
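
For comparison, here is the Gluon-style, imperative way of getting the same gradient, using autograd directly on NDArrays instead of Symbols (a minimal sketch, assuming MXNet 1.x, not from the answer above):

import mxnet as mx
from mxnet import autograd

x_in = mx.nd.array([2])
x_in.attach_grad()            # allocate storage for the gradient w.r.t. x_in

with autograd.record():       # record NDArray operations (Symbol ops are not recorded)
    F = x_in * x_in

F.backward()                  # default head gradient of ones
print(x_in.grad)              # [4.] <NDArray 1 @cpu(0)>, i.e. dF/dx = 2x = 4 at x = 2

And a rough sketch of the Module API route mentioned above; the data names, shapes and calls here are my own illustration, not from the original answer:

import mxnet as mx

X = mx.sym.Variable("X")
F = X * X

mod = mx.mod.Module(symbol=F, data_names=["X"], label_names=None)
mod.bind(data_shapes=[("X", (1,))], inputs_need_grad=True)  # ask for gradients w.r.t. the inputs
mod.init_params()                                           # no learnable parameters here, but required before forward
batch = mx.io.DataBatch(data=[mx.nd.array([2.0])])
mod.forward(batch, is_train=True)
mod.backward(out_grads=[mx.nd.ones((1,))])                  # head gradient of ones -> dF/dX
print(mod.get_input_grads()[0])                             # [4.] <NDArray 1 @cpu(0)>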

#3

Is it possible to compute the Hessian or other higher-order gradients with this method? For example, is there a symbolic gradient operator whose output you could differentiate again?


#4

@bschrift please see this thread for second-order derivatives: Obtaining second order derivatives for a function wrt arbitrary parameters in the computation graph
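
For reference, a rough sketch of one way to get a second-order derivative with autograd, using autograd.grad with create_graph=True (an assumption on my part; it requires a recent MXNet version where the operators involved, such as sin, register higher-order gradients — see the linked thread for details):

import mxnet as mx
from mxnet import autograd

x = mx.nd.array([1.0])
x.attach_grad()

with autograd.record():
    y = mx.nd.sin(x)
    # create_graph=True records the gradient computation itself so it can be differentiated again
    dy_dx = autograd.grad(y, [x], create_graph=True, retain_graph=True)[0]

dy_dx.backward()
print(x.grad)   # d2y/dx2 = -sin(x) ≈ [-0.841] at x = 1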