I am trying to understand automatic mixed precision (AMP) and went through this link. I have a few doubts, and it would be great if someone could help me clear them up:
It seems that during training the model is initialized in fp32 and all inputs and outputs stay in fp32, with AMP performing the conversions to fp16 internally. Is it possible to specify that fp16 inputs should be used?
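For context, this is the training flow as I understand it from the tutorial (a minimal Gluon sketch with a dummy network and random fp32 data; the actual model and data pipeline are placeholders):

```python
import mxnet as mx
from mxnet import autograd, gluon
from mxnet.contrib import amp

amp.init()  # patch operators for mixed precision; must run before building the net

net = gluon.nn.Dense(10)
net.initialize(ctx=mx.gpu())
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
amp.init_trainer(trainer)  # enable dynamic loss scaling on this trainer

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
data = mx.nd.random.uniform(shape=(32, 100), ctx=mx.gpu())  # fp32 input
label = mx.nd.zeros((32,), ctx=mx.gpu())

with autograd.record():
    out = net(data)            # AMP casts eligible ops to fp16 internally
    loss = loss_fn(out, label)
    with amp.scale_loss(loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)
trainer.step(data.shape[0])
```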
My goal is to move everything except the last layer to fp16 so that I can run with roughly double the batch size. I checked the GitHub code, and the init method has options for specifying which ops to keep in fp32 and which to cast to fp16, which should work for me. However, how do I specify the Sigmoid activation there? Based on the tutorial, the lists seem to take operator names like Convolution and SoftmaxOutput rather than lower-level names like conv2d. I am using
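Something like the following is what I had in mind; the 'sigmoid' entry in fp32_ops and the Activation tuple in conditional_fp32_ops are my guesses at how AMP expects the operator to be named:

```python
from mxnet.contrib import amp

amp.init(
    target_dtype='float16',
    # Guess: the standalone sigmoid operator (mx.nd.sigmoid / mx.sym.sigmoid)
    fp32_ops=['sigmoid'],
    # Guess: sigmoid reached via the generic Activation op with act_type='sigmoid'
    conditional_fp32_ops=[('Activation', 'act_type', ['sigmoid'])],
)
```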
The inference section in the tutorial mentions that the model should be converted for inference. Does that mean I need to convert it even when running validation after every epoch? Also, can the model be converted before training (to reduce its size) so that training happens only on the reduced model?
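For reference, this is the conversion step I am asking about (a sketch using convert_hybrid_block on a hybridized Gluon model; a pretrained ResNet stands in for my network):

```python
import mxnet as mx
from mxnet import gluon
from mxnet.contrib import amp

ctx = mx.gpu()
net = gluon.model_zoo.vision.resnet18_v1(pretrained=True, ctx=ctx)
net.hybridize()
x = mx.nd.random.uniform(shape=(1, 3, 224, 224), ctx=ctx)
_ = net(x)  # one fp32 forward pass so the cached graph exists

# Cast the graph/params for fp16 inference; is this required for per-epoch
# validation too, or only for a final deployed model?
net_fp16 = amp.convert_hybrid_block(net)
out = net_fp16(x)
```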