MXNet GPU freezes Python


#1

Hi,

I’ve been trying without success to run the GPU version of MXNet for a while now, so I’ve decided to seek help here.

I’m trying to run it on a GeForce GTX 960M:

C:\Users\pavel>"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe"
Sat Sep 22 11:20:57 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 411.63                 Driver Version: 411.63                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 960M   WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P8    N/A /  N/A |     40MiB /  4096MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I have mxnet version 1.3.0 installed via pip install mxnet-cu92, with CUDA 9.2 installed and added to PATH (I also tried adding cuDNN 7.1).

I’ve also tried mxnet-cu90 with CUDA 9.0, with the same result.

The issue:
Sadly, I am unable to provide any error message, because mxnet just crashes the Python interpreter.
Here is the simple code I am trying to run:

(GT) C:\Users\pavel>python
Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> mx.nd.ones((5))

[1. 1. 1. 1. 1.]
<NDArray 5 @cpu(0)>
>>> mx.nd.ones((5), ctx=mx.gpu())

(GT) C:\Users\pavel>

After this simple operation, Python just exits without any traceback or error message.
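Since the interpreter dies silently, one way to get at least an exit code (and possibly a low-level traceback) might be to run the failing call in a child process. This is only a minimal sketch: `run_isolated` and `GPU_SNIPPET` are hypothetical names made up for illustration, and it assumes that Python's built-in `faulthandler` can report something useful for this particular crash, which is not guaranteed.

```python
import subprocess
import sys


def run_isolated(code, timeout=300):
    """Run `code` in a fresh interpreter and report how it ended.

    The child is started with `-X faulthandler`, so a hard crash such as
    an access violation may leave a low-level traceback on stderr.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# The GPU call from the session above, as a one-liner for the child process.
# Not executed here; pass it to run_isolated() on the affected machine:
#   returncode, out, err = run_isolated(GPU_SNIPPET)
GPU_SNIPPET = "import mxnet as mx; print(mx.nd.ones((5,), ctx=mx.gpu()))"
```

A non-zero return code, or anything written to stderr, would give something concrete to include in a bug report even though the interactive session shows nothing.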

Could anybody point me in a direction that might help me solve the issue?

Thank you


#2

Hey @LavinaVRovine, can you try pip install mxnet-cu92==1.2.1.post1 and see if it works?

Also, I have a hunch you might be hitting an issue similar to the one linked here. Can you see whether following the tips there on increasing the cache size helps at all? https://github.com/apache/incubator-mxnet/issues/10016


#3

Try using mxnet-cu92 v1.2.0.

I hit the same problem as you with mxnet-cu92 v1.3.0. After I uninstalled v1.3.0 and installed v1.2.0, it worked. :grinning:

Good luck.
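
After swapping versions, it may be worth confirming which build pip actually left installed before retrying the GPU call. The sketch below is an illustration only: `installed_version` is a hypothetical helper, and the distribution name `mxnet-cu92` is taken from this thread.

```python
def installed_version(dist_name):
    """Return the installed version string of a pip distribution, or None."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters such as Python 3.6
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


# Prints the version string if the package is installed, otherwise None
print(installed_version("mxnet-cu92"))
```

If this still reports 1.3.0 after the downgrade, the old wheel was probably not uninstalled from the environment that Python is actually using.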


#4

That worked! Thank you