Nsight Compute



I was trying the new NVIDIA Nsight Compute CLI tool on my Ubuntu server. However, no matter which binary I run, it always gives me "==PROF== No kernels were profiled." Back in nvprof it was pretty straightforward to just call nvprof ./a.out, but the same approach doesn’t work with nv-nsight-cu-cli. Did I miss something here? I couldn’t find much help in the NVIDIA documentation on this.
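To make the symptom easy to reproduce, here is a minimal Python sketch of the invocation described above. It assumes the profiler is on the PATH as nv-nsight-cu-cli and is called the same way as nvprof (profiler first, then the binary). The helper names run_profiler and no_kernels_profiled are hypothetical, made up for this illustration, and the only output the sketch relies on is the message quoted above.

```python
import shutil
import subprocess

# The message quoted in the question; any other output format is not assumed.
NO_KERNELS_MARKER = "No kernels were profiled"


def no_kernels_profiled(profiler_output: str) -> bool:
    """Return True if the profiler output contains the 'no kernels' message."""
    return NO_KERNELS_MARKER in profiler_output


def run_profiler(binary: str, profiler: str = "nv-nsight-cu-cli") -> str:
    """Run `<profiler> <binary>` and return its combined stdout and stderr.

    Assumption: the profiler takes the target binary as a positional
    argument, mirroring `nvprof ./a.out`.
    """
    if shutil.which(profiler) is None:
        raise FileNotFoundError(f"{profiler} was not found on PATH")
    result = subprocess.run(
        [profiler, binary],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        check=False,          # a non-zero exit code is still worth inspecting
    )
    return result.stdout + result.stderr
```

Calling no_kernels_profiled(run_profiler("./a.out")) would then return True whenever the message above appears, which makes it easy to test several binaries in a loop.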



Agreed, there’s not much documentation from NVIDIA on this! I actually tried out Nsight Compute not so long ago and documented the steps I took to get it working. Check out the steps in this post and please let me know how you get on, especially which metrics you find most useful.


Just a heads up, I found Visual Profiler slightly more useful for profiling deep learning models end to end, but see if you can get anything useful out of Nsight Compute. Cheers, Thom