Error "Parameter was not initialized on context cpu(0)"

I’m training resnet18_v1 on the ImageNet dataset with the officially provided Gluon training script: https://gluon-cv.mxnet.io/_downloads/3bb06a6d6d085b1bb501b30aaf6c21c5/train_imagenet.py

When I tried to access the model parameters via:

params = dict(net.collect_params())
weight = params["resnetv10_conv0_weight"].data()

An error occurred: Parameter 'resnetv10_conv0_weight' was not initialized on context cpu(0).

I encounter this error only if the model is trained with multiple GPUs (--num-gpus > 1).

When I use --num-gpus 1, everything works smoothly.

So how can the error be resolved?

Can you check with the following command where the parameters are located?

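# list_ctx() lists every context that holds a copy; calling .data() without a ctx would raise the same error here
print(params["resnetv10_conv0_weight"].list_ctx())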

You may have to manually copy the parameters to the right context using NDArray.copyto: https://beta.mxnet.io/api/ndarray/_autogen/mxnet.ndarray.NDArray.copyto.html

@kaizhao you need to specify the context you want the data from, for example:
params["resnetv10_conv0_weight"].data(ctx=mx.gpu(0))

Ok I got it. Thanks @NRauschmayr also.

I have another question: if I make some changes to the parameters by

p = params["conv1_weight"].data(ctx=mx.gpu(0))
p = change_the_params(p)
params["conv1_weight"].set_data(p)

How can I sync the changes to all other devices? Or will it be done automatically by calling .set_data()?