Saving and loading cuDNN autotune and graph optimization

QueensGambit · September 14, 2019, 2:50pm

Hello everyone,

I created a feature request on GitHub for saving cuDNN autotune results and graph fusion to disk so that they can be reloaded later:

https://github.com/apache/incubator-mxnet/issues/16173

In our use case, TensorRT is very helpful but requires a long start-up time.
Because of this delay, our executable does not reply to ping requests in time and is killed by external programs.

Is anyone interested in this feature, too?

Best,
~QueensGambit

NRauschmayr · September 20, 2019, 7:18am

Thanks for creating the feature request. A similar request was opened a while ago: https://github.com/apache/incubator-mxnet/issues/10567 It would certainly be helpful to cache autotune results.

mikeobr · February 6, 2020, 4:26pm

I am interested in this feature. It would make MXNet much simpler to implement for production use.

Running multiple MXNet processes on the same server carries a risk: if several cuDNN autotune runs are triggered at the same time, the resulting memory spike may cause out-of-memory errors.
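
Until autotune results can be cached, one possible workaround is to make sure only one process tunes at a time. The sketch below is only an illustration under assumptions: `exclusive_lock`, `serialized_warmup`, the lock file path, and the `warmup_fn` callback are hypothetical names, not part of MXNet, and `fcntl` is available on Unix-like systems only. The idea is that each process runs its first forward pass (the one that triggers autotune) while holding a file lock.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Block until this process holds an exclusive lock on lock_path."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # waits while another process holds it
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def serialized_warmup(warmup_fn, lock_path="/tmp/mxnet_autotune.lock"):
    """Run warmup_fn under the lock so warm-ups in different processes do not overlap.

    warmup_fn is a hypothetical callback, e.g. one forward pass on dummy input.
    """
    with exclusive_lock(lock_path):
        return warmup_fn()
```

Each process would call something like `serialized_warmup(lambda: net(dummy_input))` once after loading the model, where `net` and `dummy_input` stand in for your own model and input. Alternatively, starting the process with the environment variable `MXNET_CUDNN_AUTOTUNE_DEFAULT=0` disables autotuning entirely, which avoids the spike but may give slower convolutions.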
