I have a shared object containing a number of CUDA functions that operate on plain contiguous arrays. I would like to use this code as is to add a set of custom operators to MXNet. The guide on adding new operators (https://mxnet.incubator.apache.org/faq/new_op.html) offers three choices:
- Plain Python custom op
- NVRTC-based custom op
- Modifying the MXNet source code to add a new operator
None of these options works for me. I am looking for a way to obtain a pointer to an NDArray's data on the GPU and call into my shared object, passing this pointer as a parameter.
- Is this possible?
- If it is not, what is the recommended method to achieve what I need?
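To make the question concrete, here is a minimal sketch of the calling pattern I have in mind. Everything in it is a stand-in so that it runs without MXNet or a GPU (on a POSIX system): libc plays the role of my shared object, `memset` plays the role of one of my CUDA functions, and a host `ctypes` buffer plays the role of the NDArray. The library name `libmy_cuda_ops.so` in the comment is hypothetical. The missing piece is how to get the equivalent of `ptr` for an NDArray living in GPU memory.

```python
import ctypes
import ctypes.util

# Stand-in for my shared object. In my real case this would be something like
# ctypes.CDLL("libmy_cuda_ops.so") (hypothetical name); libc is used here only
# so that the sketch runs anywhere on a POSIX system.
lib = ctypes.CDLL(ctypes.util.find_library("c"))

# Stand-in for one of my functions, which take a raw pointer and a size:
#   void *memset(void *ptr, int value, size_t nbytes)
lib.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
lib.memset.restype = ctypes.c_void_p

# Stand-in for the NDArray: a plain contiguous host buffer of 8 floats set to 1.0.
n = 8
buf = (ctypes.c_float * n)(*([1.0] * n))

# The raw address of the data. For an NDArray on the GPU, this is the value
# I do not know how to obtain.
ptr = ctypes.addressof(buf)

# Hand the raw pointer to the shared-object function, which works in place.
lib.memset(ptr, 0, ctypes.sizeof(buf))

result = list(buf)
```

In the real setup the function would launch a CUDA kernel on device memory rather than touch a host buffer, so the pointer would have to be a device pointer valid in the same CUDA context that MXNet uses.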
Thanks in advance!