When update_on_kvstore = None, kv_store simply sums the gradients and passes them to the worker. But where does kv_store initialize the value in key_value to 0?
If the stored value is not reset to 0 before every iteration, the gradients would accumulate across iterations and the sum would be wrong.
Where and how should I do this initialization, if it is needed?
The problem has been solved: the default update operation of kv_store is Assign.
I found that the gradients are automatically summed during push, and the summed gradient is then assigned to the corresponding key_value in kv_store. The previous value is overwritten, so no reset to 0 is needed.
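For anyone who lands here with the same question, this is a toy model of that behavior in plain Python, not MXNet code. The names `store` and `toy_push` are made up for illustration, and the sketch only mimics the sum-then-assign semantics described above; it says nothing about how kv_store is implemented internally.

```python
# Toy model of push with the default Assign update (not MXNet code).
store = {}  # hypothetical stand-in for the key_value table in kv_store


def toy_push(key, grads):
    """Sum the gradients pushed by all devices, then overwrite the stored value."""
    # Element-wise sum across devices.
    merged = [sum(column) for column in zip(*grads)]
    # Assign: the previous value is replaced, not added to.
    store[key] = merged


# Iteration 1: two devices push their gradients for key 3.
toy_push(3, [[1.0, 2.0], [3.0, 4.0]])
first = list(store[3])

# Iteration 2: the old sum is overwritten, so nothing carries over.
toy_push(3, [[0.5, 0.5], [0.5, 0.5]])
second = list(store[3])

print(first, second)
```

The second push leaves only the sum of the second set of gradients under the key, which is why the result stays correct without zeroing anything between iterations.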
Can I change the merge process of push in kv_store?
Yes, you can change the merge process by calling kv._set_updater(), which replaces the default Assign update with your own function:
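import mxnet as mx
kv = mx.kv.create('local')       # setup assumed for illustration
kv.init(3, mx.nd.ones((2, 3)))
a = mx.nd.zeros((2, 3))
def update(key, input, stored):
    print("update on key: %d" % key)
    stored += input * 2          # combine with the stored value instead of assigning
kv._set_updater(update)
kv.push(3, mx.nd.ones((2, 3)))   # each push now goes through update()
kv.pull(3, out=a)
print(a.asnumpy())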
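One detail worth spelling out: as far as I can tell, the updater has to modify `stored` in place, the way `stored += ...` does for an NDArray, because assigning a new object to the name `stored` only changes the local variable. Here is a plain-Python sketch of the difference, with lists standing in for NDArrays and hypothetical function names:

```python
# Plain-Python illustration (lists stand in for NDArrays; names are hypothetical).


def rebinding_update(key, pushed, stored):
    # Builds a new list and rebinds the local name only;
    # the caller's object is left unchanged.
    stored = [s + 2 * p for s, p in zip(stored, pushed)]


def in_place_update(key, pushed, stored):
    # Slice assignment writes into the caller's object.
    stored[:] = [s + 2 * p for s, p in zip(stored, pushed)]


value_a = [1.0, 1.0]
rebinding_update(3, [1.0, 1.0], value_a)

value_b = [1.0, 1.0]
in_place_update(3, [1.0, 1.0], value_b)

print(value_a, value_b)
```

Only `value_b` changes. In an actual updater, `stored += input * 2` and `stored[:] = ...` are both in-place operations on the NDArray, whereas `stored = stored + input * 2` would leave the value in kv_store untouched.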
You can also find more information in this tutorial: https://mxnet.incubator.apache.org/tutorials/python/kvstore.html