I was trying out the new NVIDIA Nsight Compute CLI tool on my Ubuntu server. However, no matter which binary I run, it always gives me ==PROF== No kernels were profiled. Back in nvprof it was pretty straightforward to just call nvprof ./a.out, but that doesn’t work with nv-nsight-cu-cli. Did I miss something here? I didn’t find much help in the NVIDIA documentation on that.
Agreed, there’s not much documentation from NVIDIA on this! I actually tried out Nsight Compute not so long ago and documented the steps I took to get it working. Check out the steps on this post and please let me know how you get on, especially which metric you find most useful.
Just a heads up, I found Visual Profiler slightly more useful for profiling deep learning models end to end, but see if you can get anything useful out of Nsight Compute. Cheers, Thom
The thing is, I’m trying to visualize the CUDA graph structure, which seems to be supported only in Nsight Compute. However, I can’t even get the command-line profiler to start doing any work.
And you’re totally sure your script is actually using the GPU? Is a profile.nsight-cuprof-report file created in the working directory? Can you try the following example?
/usr/local/cuda-10.0/NsightCompute-1.0/nv-nsight-cu-cli -f -c 10 /home/ubuntu/anaconda3/envs/mxnet_p36/bin/python /home/ubuntu/mxnet/example/gluon/mnist/mnist.py --cuda --batch-size 500 --epochs 1
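If it helps, here is a rough Python sketch that automates the checks discussed in this thread: whether a report file shows up in the working directory, and whether the profiler output contains the "No kernels were profiled" or "==ERROR==" messages. The helper names (diagnose, run_and_diagnose) are made up for illustration, and the report file name and message strings are simply the ones quoted in this thread, so treat them as assumptions that may differ on other Nsight Compute versions.

```python
import subprocess
from pathlib import Path

# Assumption: the default report name mentioned in this thread.
REPORT_NAME = "profile.nsight-cuprof-report"


def diagnose(profiler_output, workdir="."):
    """Summarize a profiler run from its captured output and working directory."""
    return {
        # Was a report file written to the working directory?
        "report_exists": (Path(workdir) / REPORT_NAME).is_file(),
        # Did the profiler say that it saw no kernels?
        "no_kernels": "No kernels were profiled" in profiler_output,
        # Did the profiler print an ==ERROR== line (e.g. the application failed)?
        "app_error": "==ERROR==" in profiler_output,
    }


def run_and_diagnose(cmd, workdir="."):
    """Run a profiler command (given as a list of arguments) and diagnose it."""
    proc = subprocess.run(
        cmd,
        cwd=workdir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so all messages are captured
        text=True,
    )
    return diagnose(proc.stdout, workdir)
```

You would call it with the same command as above, split into a list, for example run_and_diagnose(["nv-nsight-cu-cli", "-f", "-c", "10", "./a.out"]). If report_exists is False and no_kernels is True, the application probably never launched a kernel under the profiler.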
I think I partially solved the problem, although I don’t really know the reason behind it. I was able to use nv-nsight-cu-cli on my own machine without problems. However, when I ssh into a server and do the same thing, it prints “==ERROR== The application returned an error code (11)” followed by “==WARNING== No kernels were profiled”. I don’t know why it works on my own computer but not on the server. Have you encountered this before?
I’ve only tried on a remote machine (since my local machine didn’t have an NVIDIA GPU), and I was able to generate a profile.nsight-cuprof-report file on the remote machine when running the command I previously shared over ssh. Check that the GPU is active when you’re running your own script with