Check failed: e == CUDNN_STATUS_SUCCESS (2 vs. 0) cuDNN: CUDNN_STATUS_ALLOC_FAILED


#1

Hello, I hit a bug when using a net. I define a function that returns a network, and training works fine, but an error occurs during prediction. The error message is as follows:

src/operator/nn/./cudnn/cudnn_convolution-inl.h:553: Check failed: e == CUDNN_STATUS_SUCCESS (2 vs. 0) cuDNN: CUDNN_STATUS_ALLOC_FAILED

Has anyone seen this type of error?
Thanks! :relaxed:


#2

The error appears to be a memory allocation failure: cuDNN could not get the memory it asked for. Have you double-checked how you're using your network in inference? Inference usually needs less memory than training, unless you're doing batch inference with much larger batches than you used in training.
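
If the prediction step does pass the whole dataset through the network in one call, one thing to try is splitting it into smaller batches. Below is a minimal sketch of that idea in plain Python. It is not taken from your code: `predict_in_chunks`, `predict_fn` and `fake_predict` are hypothetical names, and `fake_predict` only stands in for your network's forward pass so the example runs without a GPU.

```python
def predict_in_chunks(predict_fn, samples, batch_size):
    """Call predict_fn on slices of at most batch_size samples and
    concatenate the results, so no single call sees the full dataset."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    outputs = []
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        outputs.extend(predict_fn(chunk))
    return outputs


# Hypothetical stand-in for the real network; it records the batch sizes it sees.
seen_batch_sizes = []

def fake_predict(batch):
    seen_batch_sizes.append(len(batch))
    return [x * 2 for x in batch]


results = predict_in_chunks(fake_predict, list(range(10)), batch_size=4)
print(results)
print(seen_batch_sizes)
```

Using a batch size no larger than the one that worked in training is a reasonable starting point. If the error still appears with small batches, the cause is probably something else, for example another process already holding most of the GPU memory.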