Unable to run on GPU by calling mx.gpu() even though the GPUs are running


#1

mxnet.base.MXNetError: [16:11:09] src/engine/threaded_engine.cc:318: Check failed: device_count_ > 0 (-1 vs. 0) GPU usage requires at least 1 GPU

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:04:00.0 Off |                    0 |
| N/A   28C    P0    24W / 250W |      0MiB / 12198MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  On   | 00000000:05:00.0 Off |                    0 |
| N/A   30C    P0    24W / 250W |      0MiB / 12198MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P100-PCIE...  On   | 00000000:88:00.0 Off |                    0 |
| N/A   27C    P0    24W / 250W |      0MiB / 12198MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P100-PCIE...  On   | 00000000:89:00.0 Off |                    0 |
| N/A   29C    P0    25W / 250W |      0MiB / 12198MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

#2

Hi @awesomemangosmoothie,
Can you give more information on your configuration: the CUDA, cuDNN, and MXNet versions, and how you installed MXNet (from source or pip)?
Can you also paste a minimal reproducible code sample?
Thanks
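
In case it helps, here is a rough sketch of a script that collects some of that information and doubles as a minimal reproduction. The helper name `collect_env_info` is made up for this post, and the script assumes `mx.context.num_gpus()` exists in your MXNet version (it guards against the case where it does not). It does not detect the CUDA or cuDNN versions, so please paste those separately.

```python
import os
import shutil


def collect_env_info():
    """Gather basic GPU/MXNet details into a plain dict (hypothetical helper)."""
    info = {
        # An empty or wrong value here can hide GPUs from CUDA programs.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        # True if the nvidia-smi executable is on PATH.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }

    try:
        import mxnet as mx
    except Exception as exc:  # not installed, or a shared library failed to load
        info["mxnet_error"] = "{}: {}".format(type(exc).__name__, exc)
        return info

    info["mxnet_version"] = mx.__version__

    # Assumption: num_gpus() is present; older releases may not have it.
    num_gpus = getattr(mx.context, "num_gpus", None)
    if num_gpus is not None:
        try:
            info["num_gpus"] = num_gpus()
        except mx.base.MXNetError as exc:
            info["num_gpus_error"] = str(exc).splitlines()[0]

    # Minimal reproduction: allocate a small array on the first GPU.
    try:
        mx.nd.ones((2, 2), ctx=mx.gpu(0)).asnumpy()
        info["gpu_ok"] = True
    except mx.base.MXNetError as exc:
        info["gpu_ok"] = False
        info["gpu_error"] = str(exc).splitlines()[0]

    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print("{}: {}".format(key, value))
```

Running it in the same shell or notebook kernel where the error appears, and pasting the output here, should show whether MXNet itself can see the GPUs that `nvidia-smi` lists.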


#3

I have the same problem.


#4

I saw this once too, but restarting the kernel (shell) fixed it for me.