I’m very happy to see that blog post on AMP: https://medium.com/apache-mxnet/simplify-mixed-precision-training-with-mxnet-amp-dc2564b1c7b0
I don’t understand the part about the type error at instantiation:
“This error occurs because the SSD script from GluonCV, before the actual training, launches the network once on the CPU context in order to obtain anchors for the data loader, and the CPU context does not support some of the FP16 operations, like Conv or Dense layers. We will fix this by changing the get_dataloader() function to use the GPU context for anchor generation:”
What should we do exactly? Instantiate the net on the GPU and run the anchor generation on the GPU too?
EDIT: When I try the proposed solution (instantiate on GPU and generate anchors on GPU), I get this:
```
terminate called after throwing an instance of 'dmlc::Error'
  what():  [14:06:06] /home/travis/build/dmlc/mxnet-distro/mxnet-build/3rdparty/mshadow/mshadow/./tensor_gpu-inl.h:35: Check failed: e == cudaSuccess: CUDA: initialization error
```