How to use AMP with GluonCV SSD?

Hi,
I’m very happy to see that blog post on AMP: https://medium.com/apache-mxnet/simplify-mixed-precision-training-with-mxnet-amp-dc2564b1c7b0
I don’t understand the part about the type error at instantiation:
“This error occurs because the SSD script from GluonCV, before the actual training, launches the network once on the CPU context in order to obtain anchors for the data loader, and the CPU context does not support some of the FP16 operations, like Conv or Dense layers. We will fix this by changing the get_dataloader() function to use the GPU context for anchor generation:”

What should we do? Instantiate the net on GPU and do the anchor generation on GPU too?
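
For reference, here is my reading of that fix, sketched against the get_dataloader() from GluonCV's train_ssd.py (this is just my guess at the intended change; ctx is my GPU context, and the batchify helpers come from gluoncv.data.batchify):

import mxnet as mx
from mxnet import autograd, gluon
from gluoncv.data.batchify import Tuple, Stack
from gluoncv.data.transforms.presets.ssd import SSDDefaultTrainTransform

def get_dataloader(net, train_dataset, data_shape, batch_size, num_workers, ctx):
    width, height = data_shape, data_shape
    # dummy forward pass in train mode, now on the GPU context instead of CPU
    with autograd.train_mode():
        _, _, anchors = net(mx.nd.zeros((1, 3, height, width), ctx=ctx))
    batchify_fn = Tuple(Stack(), Stack(), Stack())
    return gluon.data.DataLoader(
        train_dataset.transform(SSDDefaultTrainTransform(width, height, anchors)),
        batch_size, shuffle=True, batchify_fn=batchify_fn,
        last_batch='rollover', num_workers=num_workers)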

EDIT: When I do the proposed solution (instantiate on GPU and generate anchors on GPU), I get this:
terminate called after throwing an instance of 'dmlc::Error'
what(): [14:06:06] /home/travis/build/dmlc/mxnet-distro/mxnet-build/3rdparty/mshadow/mshadow/./tensor_gpu-inl.h:35: Check failed: e == cudaSuccess: CUDA: initialization error

cheers

Instantiating the net on GPU, generating the anchors on GPU, and then copying them to CPU seems to work:

import mxnet as mx
import gluoncv as gcv
from mxnet import autograd

net = gcv.model_zoo.get_model(args.basemodel, pretrained=True, ctx=ctx[0])

# dummy forward pass in train mode (on GPU) to obtain the anchors
with autograd.train_mode():
    _, _, anchors = net(mx.nd.zeros((1, 3, image_size, image_size), ctx=ctx[0]))

# copy the anchors to CPU for use in the data loader
anchors = anchors.as_in_context(mx.cpu())
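
For completeness, the AMP calls around this are the ones from the blog post (a rough sketch only; data, cls_targets, box_targets come from the train loader batch, the learning rate is a placeholder, and mbox_loss is GluonCV's SSDMultiBoxLoss):

import gluoncv as gcv
from mxnet import autograd, gluon
from mxnet.contrib import amp

amp.init()  # call once at startup, before the network is created

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.001})
amp.init_trainer(trainer)
mbox_loss = gcv.loss.SSDMultiBoxLoss()

# inside the training loop: scale the loss before calling backward
with autograd.record():
    cls_preds, box_preds, _ = net(data)
    sum_loss, cls_loss, box_loss = mbox_loss(cls_preds, box_preds, cls_targets, box_targets)
    with amp.scale_loss(sum_loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)
trainer.step(batch_size)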

Thanks @olivcruche for sharing your solution. Indeed, there is a little-documented feature of the GluonCV SSD model: when you call the model under autograd.train_mode(), you get back the anchors as well. The issue is that with mixed precision, if you do that forward pass on CPU, it crashes, because the CPU context does not support the FP16 operators involved.

Hence, to get the anchors, you need to compute them on GPU and then copy the result to CPU, just like you did in your code snippet. The copy back to CPU matters because the anchors are consumed by the data loader's transform, which runs in CPU worker processes; keeping them on the GPU is most likely what triggered the CUDA initialization error you saw.
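
To illustrate the feature (a minimal sketch; the model name is just an example from the model zoo):

import mxnet as mx
import gluoncv as gcv
from mxnet import autograd

ctx = mx.gpu(0)
net = gcv.model_zoo.get_model('ssd_512_resnet50_v1_voc', pretrained=True, ctx=ctx)
x = mx.nd.zeros((1, 3, 512, 512), ctx=ctx)

# inference mode: the SSD model returns (class ids, scores, bounding boxes)
ids, scores, bboxes = net(x)

# train mode: it instead returns (class predictions, box predictions, anchors)
with autograd.train_mode():
    cls_preds, box_preds, anchors = net(x)

anchors = anchors.as_in_context(mx.cpu())  # copy to CPU before handing to the data loader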
