Invoking CUDA code from MXNet without modifying the source code


#1

I have a shared object with a bunch of CUDA functions operating on plain contiguous arrays, and I would like to use this code as is to implement a set of custom operators in MXNet. The guide on adding new operators (https://mxnet.incubator.apache.org/faq/new_op.html) offers three choices:

  • A plain Python custom op
  • An NVRTC-based custom op
  • Modifying the MXNet source code to add a new operator

None of these options works for me. Instead, I am looking for a way to obtain a pointer to an NDArray's data on the GPU and invoke my shared object with this pointer as a parameter.

  • Is this possible?
  • If it is not, what is the recommended way to achieve what I need?

Thanks in advance!
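For context, the ctypes mechanics of handing a raw data pointer to a shared object look like the sketch below. MXNet does not document a stable Python API for extracting a device pointer, so this sketch uses a contiguous CPU NumPy array and libc's memcpy as stand-ins for a GPU buffer and the hypothetical entry point in your .so; with a real device pointer, the call shape would be the same.

```python
import ctypes
import numpy as np

# Stand-in for your own shared object, e.g. (hypothetical name):
#   mylib = ctypes.CDLL("libmykernels.so")
# Here we load the C library instead, so the example runs without CUDA.
libc = ctypes.CDLL(None)  # on Linux, exposes libc symbols such as memcpy

src = np.arange(8, dtype=np.float32)  # contiguous host array
dst = np.zeros_like(src)

# A contiguous NumPy array exposes its raw data pointer via .ctypes.data.
# Passing it to a C function is the same pattern you would use with a
# device pointer and your own kernel launcher.
libc.memcpy(
    ctypes.c_void_p(dst.ctypes.data),
    ctypes.c_void_p(src.ctypes.data),
    ctypes.c_size_t(src.nbytes),
)
```

The open question in this thread is how to obtain the equivalent of `.ctypes.data` for an NDArray that lives on the GPU.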

#2

Here is the link to the documentation for the run-time compilation API:
https://mxnet.incubator.apache.org/api/python/rtc/rtc.html

“The RTC package contains tools for compiling and running CUDA code from python frontend. The compiled kernels can be used stand-alone or combined with autograd.Function or operator.CustomOpProp to support differentiation.”
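To make that concrete, here is a minimal sketch of the `mx.rtc.CudaModule` workflow from that documentation, using a toy axpy kernel. The launch itself needs an MXNet build with CUDA and a GPU, so it is wrapped in a function rather than executed here; the kernel name, signature string, and launch dimensions are illustrative.

```python
# CUDA source compiled at run time by mx.rtc.CudaModule (toy example).
source = r'''
extern "C" __global__ void axpy(const float *x, float *y, float alpha) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    y[i] += alpha * x[i];
}
'''

def launch_axpy(mx, x, y, alpha=3.0):
    """Compile and launch the kernel; requires MXNet with CUDA and a GPU.

    x and y are NDArrays on mx.gpu(0).
    """
    module = mx.rtc.CudaModule(source)
    # Fetch the kernel by name together with its C signature string.
    kernel = module.get_kernel("axpy", "const float *x, float *y, float alpha")
    # Grid and block dimensions are (x, y, z) tuples; one block here.
    kernel.launch([x, y, alpha], mx.gpu(0), (1, 1, 1), (x.size, 1, 1))
```

This works well for self-contained kernel source, which is relevant to the limitation raised in the next post.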


#3

Thanks, I looked into it before asking here. Unfortunately, my CUDA code has many dependencies and therefore carries many #include statements that RTC doesn’t accept. Is there another way?