GPU memory fluctuates for optimizer of Nesterov accelerated SGD?

mg0880gm · October 3, 2017, 11:55pm

I didn’t observe this with Adam or SGD. With Nesterov accelerated SGD, GPU memory was about 3.2G during initialization. After that it went to 6.9G, then 6.0G, then 4.6G, and so on, until an out-of-memory error was reported:

/home/ubuntu/src/mxnet/dmlc-core/include/dmlc/./logging.h:308: src/storage/./pooled_storage_manager.h:102: cudaMalloc failed: out of memory

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.11.0-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7fe2e107bf0c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.11.0-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet7storage23GPUPooledStorageManager5AllocEm+0x15e) [0x7fe2e20acbae]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.11.0-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet11StorageImpl5AllocEmNS_7ContextE+0x69) [0x7fe2e20b00b9]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.11.0-py2.7.egg/mxnet/libmxnet.so(+0x175f895) [0x7fe2e20d4895]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.11.0-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x93) [0x7fe2e209ccb3]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet-0.11.0-py2.7.egg/mxnet/libmxnet.so(_ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_Any_dataOS5_+0x123) [0x7fe2e20a59d3]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet-0.11.0-py2.7.egg/mxnet/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x4a) [0x7fe2e209f13a]
[bt] (7) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7fe2f7d0cc80]
[bt] (8) /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fe2fd9dc6ba]
[bt] (9) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fe2fd7123dd]

/home/ubuntu/src/mxnet/dmlc-core/include/dmlc/./logging.h:308: src/engine/./threaded_engine.h:347: src/storage/./pooled_storage_manager.h:102: cudaMalloc failed: out of memory

For comparison, GPU memory usage is about 6G with Adam and 4.5G with SGD.

I’d appreciate it if anybody could help.

smolix · October 4, 2017, 7:41pm

Different optimizers have different amounts of internal state (e.g. you might want to store momentum, diagonal preconditioners, etc.). This is why different optimizers need different amounts of storage. Have a look at optimizer.py in the mxnet codebase to see what’s going on.

TL;DR - if you’re really short of memory, use plain SGD.
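
To make the point about internal state concrete, here is a rough back-of-the-envelope sketch in plain Python (it does not call MXNet). It assumes each state buffer has the same shape and dtype as the weights, and that the buffer counts are 0 for plain SGD, 1 for SGD with momentum, 1 for Nesterov SGD, and 2 for Adam (running mean and variance). The function name `optimizer_state_bytes`, the table `STATE_BUFFERS`, and the 100M-parameter model size are hypothetical, chosen only for illustration; check optimizer.py for what your MXNet version actually allocates.

```python
# Rough estimate of the extra memory an optimizer holds for its own state.
# Assumption: every state buffer matches the weights in shape and dtype.
STATE_BUFFERS = {
    "sgd": 0,           # plain SGD without momentum keeps no state
    "sgd_momentum": 1,  # one momentum buffer per weight array
    "nag": 1,           # Nesterov SGD: also one momentum buffer
    "adam": 2,          # running mean and running variance
}


def optimizer_state_bytes(num_params, optimizer, bytes_per_value=4):
    """Return the estimated optimizer state size in bytes (float32 by default)."""
    return STATE_BUFFERS[optimizer] * num_params * bytes_per_value


if __name__ == "__main__":
    num_params = 100_000_000  # hypothetical model size
    for name in STATE_BUFFERS:
        gib = optimizer_state_bytes(num_params, name) / 2**30
        print(f"{name:13s} {gib:.2f} GiB of optimizer state")
```

Under these assumptions Nesterov SGD should need about as much state as momentum SGD and less than Adam, so this estimate covers only the steady-state difference between optimizers, not memory that keeps changing from step to step.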
