Help using mx.nd.sgd.update for R optimizers


#1

Optimizers in the R package perform a lengthy update of the state, which results in high memory consumption. As I understand it, garbage collection isn’t automatically performed on C objects, so a manual gc() is needed in certain circumstances, which impairs performance.
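To illustrate why the copying path is costly, here is a minimal base-R sketch (not MXNet code; the environment merely stands in for an NDArray handle) contrasting copy-on-modify updates, which allocate a fresh array each step, with reference-style in-place updates:

```r
# Copy-on-modify: returns a NEW vector; the caller's w is untouched,
# and each training step leaves a dead copy behind for gc() to collect.
update_by_copy <- function(w, g, lr) {
  w - lr * g
}

# Reference semantics: an environment is passed by reference, so the
# caller sees the mutation and no extra full-size copy accumulates.
update_in_place <- function(state, lr) {
  state$w <- state$w - lr * state$g
  invisible(state)
}

w <- c(1, 2, 3)
g <- c(0.1, 0.1, 0.1)
w2 <- update_by_copy(w, g, 0.5)  # w unchanged; w2 is a separate allocation

state <- new.env()
state$w <- c(1, 2, 3)
state$g <- c(0.1, 0.1, 0.1)
update_in_place(state, 0.5)      # state$w mutated in place
```

The in-place NDArray update functions aim for the second behaviour, mutating the executor's arrays directly instead of churning out temporaries.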

My current attempt to implement the in-place update of the weights on the executor, as done in the Python package, is the following:

mx.nd.sgd.mom.update(weight = weight, grad = grad, mom = state$mom, lr = lr, wd = wd, rescale_grad = rescale.grad, clip_gradient = clip_gradient, out = weight)

The weight and grad are the same function arguments as in the current optimizers and come from exec$ref.arg.arrays and exec$ref.grad.arrays. The call correctly updates both the weight in the executor and the state, but it returns an error message:

Error in mx.nd.sgd.mom.update(weight = exec$arg.arrays$fc_weight, grad = exec$grad.arrays$fc_weight, : ./ndarray.h:87: RCheck failed: ptr_->writable && !ptr_->moved Passing a read only NDArray to mutate function

The error can be bypassed by wrapping the call in try(), but during training R crashes randomly after a varying number of iterations (anywhere between 20 and 150) with no clear cause (memory consumption remains low and does not appear to be the culprit).

During model training, the parameter update is performed with the following call:

for (i in seq_len(ndevice)) { updaters[[i]](train.execs[[i]]$ref.arg.arrays, train.execs[[i]]$ref.grad.arrays) }
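For anyone unfamiliar with the pattern above: each updaters[[i]] is a closure that carries its own optimizer state for one device. A minimal base-R analogue (plain vectors instead of NDArrays; lr and momentum values are arbitrary) looks like this:

```r
# Each updater closes over its own state environment, mirroring how
# updaters[[i]] keeps per-executor momentum between calls.
make_updater <- function(lr = 0.1, momentum = 0.9) {
  state <- new.env()
  state$mom <- NULL
  function(weight, grad) {
    if (is.null(state$mom)) state$mom <- numeric(length(weight))
    state$mom <- momentum * state$mom - lr * grad
    weight + state$mom  # updated weight (a copy here, unlike NDArray out=)
  }
}

ndevice <- 2
updaters <- lapply(seq_len(ndevice), function(i) make_updater())

w <- c(1, 1)
w <- updaters[[1]](w, grad = c(1, 1))  # first step: mom = -lr * grad
```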

Any help in figuring out where the glitch is with this approach would be greatly appreciated; it seems nearly functional and would likely solve the problematic memory consumption.