I have a machine running nvidia-docker with MXNet inside the container. The first GPU command, such as mx.nd.ones(3, ctx=mx.gpu()), takes a very long time to complete: about 2 minutes. Outside Docker, on the same host, the first GPU call takes only a couple of seconds. (Both the host and the Docker image use MXNet 1.2.1.)
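For reference, here is a minimal sketch of how the delay could be measured, assuming Python 3 and a working MXNet GPU build. The helper names (time_call, first_gpu_call, report) are made up for illustration. Because MXNet runs operations asynchronously, the sketch calls wait_to_read() so the timer stops only once the array actually exists on the GPU.

```python
import importlib
import time


def time_call(fn):
    """Run fn() once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn()
    return time.perf_counter() - start, result


def first_gpu_call(mx):
    """Allocate a small array on the default GPU and wait until it is ready."""
    x = mx.nd.ones(3, ctx=mx.gpu())
    # MXNet queues work asynchronously; block until the result is computed.
    x.wait_to_read()
    return x


def report():
    """Print import time plus the first and second GPU call times."""
    # Time the import separately so it is not confused with GPU start-up.
    t_import, mx = time_call(lambda: importlib.import_module("mxnet"))
    t_first, _ = time_call(lambda: first_gpu_call(mx))
    t_second, _ = time_call(lambda: first_gpu_call(mx))
    print("import mxnet:    %.2f s" % t_import)
    print("first GPU call:  %.2f s" % t_first)
    print("second GPU call: %.2f s" % t_second)
```

Calling report() once inside the container and once on the host gives comparable numbers for the two environments.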
To make matters more confusing, I also have a desktop machine, and when I run MXNet in nvidia-docker there, the first GPU call is fast.
Both machines should be using the same Docker image. So why is the first MXNet GPU call so slow only inside nvidia-docker on one machine, when that same machine is fast outside of nvidia-docker?
Are there some settings I can play with to avoid this issue?