I am working on multi-GPU training of RNNs using Gluon.
I have set the context as follows:
ctx = [mx.gpu(i) for i in range(num_gpus)]
And the begin state is defined as follows:
def begin_state(self, *args, **kwargs):
    return self.core.begin_state(*args, **kwargs)
But then the following code:
hidden = model.begin_state(func=mx.nd.zeros, batch_size=batch_size, ctx=ctx)
gives me the following error:
Invalid context string [gpu(0), gpu(1), gpu(2), gpu(3)]
I know this happens because I am passing a list of contexts where a single context is expected. How do I distribute the hidden state across all the GPUs?
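
To make the goal concrete, here is a minimal sketch of the pattern that seems to be needed: one hidden state per context rather than one state for the whole list. The helper `per_device_states` and the stand-in `fake_begin_state` are hypothetical names introduced only for illustration, and the sketch assumes the global batch is split evenly across the devices.

```python
def per_device_states(begin_state, ctx_list, batch_size, **kwargs):
    """Build one initial hidden state per context.

    begin_state: callable accepting batch_size= and ctx= keywords
                 (for example model.begin_state from the code above).
    ctx_list:    list of single contexts, e.g. [mx.gpu(0), mx.gpu(1)].
    batch_size:  global batch size; assumed divisible by len(ctx_list).
    """
    if batch_size % len(ctx_list) != 0:
        raise ValueError("batch_size must be divisible by the number of contexts")
    per_device = batch_size // len(ctx_list)
    # Call begin_state once per device, passing a single context each time.
    return [begin_state(batch_size=per_device, ctx=c, **kwargs) for c in ctx_list]


# Stand-in for model.begin_state so the sketch runs without MXNet or GPUs.
def fake_begin_state(batch_size, ctx, **kwargs):
    return {"ctx": ctx, "batch_size": batch_size}


states = per_device_states(
    fake_begin_state, ["gpu(0)", "gpu(1)", "gpu(2)", "gpu(3)"], batch_size=32
)
```

With the real model, the call would presumably look like `per_device_states(model.begin_state, ctx, batch_size, func=mx.nd.zeros)`, with each input batch split the same way (for example with `gluon.utils.split_and_load`) so that every slice is paired with the state on its own GPU.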