The current ndarray.asnumpy() call does a full copy. I have a use case where this copy is too expensive, so I am looking for a way to do the conversion without it. I came up with the following snippet by reading the C++ implementation; it works as long as the ndarray is on the CPU and its dtype is float (float32).
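from ctypes import POINTER, c_float, c_uint64, cast

import numpy as np


def asnumpy_nocopy(a):
    # Assume cpu context and float32 dtype
    a.wait_to_read()  # block until pending writes to `a` have finished
    c_uint64_p = POINTER(c_uint64)
    handle = cast(a.handle, c_uint64_p)  # NDArray*
    ptr_ = cast(handle[0], c_uint64_p)  # shared_ptr<Chunk>
    dptr = cast(ptr_[0], POINTER(c_float))  # shandle.dptr
    # Wrap the raw buffer as a numpy array without copying
    return np.ctypeslib.as_array(dptr, shape=a.shape)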
This solution is very hacky and will break easily if the backend changes. Is there a better, more elegant way to do this?
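For reference, the zero-copy behaviour I am relying on can be reproduced without MXNet. This is only a minimal sketch, assuming nothing but numpy and ctypes: the buffer `buf` and the helper name `view_float_buffer` are hypothetical stand-ins for the NDArray's memory and are not part of any library.

```python
import ctypes

import numpy as np


def view_float_buffer(ptr, shape):
    """Wrap an existing float32 buffer as a numpy array without copying.

    `ptr` is assumed to be a ctypes POINTER(c_float) to at least
    prod(shape) valid elements.
    """
    return np.ctypeslib.as_array(ptr, shape=shape)


# Hypothetical stand-in for the memory owned by an NDArray on CPU.
buf = (ctypes.c_float * 6)(0, 1, 2, 3, 4, 5)
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_float))

view = view_float_buffer(ptr, (2, 3))
view[0, 0] = 42.0  # writes through to the underlying buffer, no copy involved
```

Because the resulting array is only a view over memory it does not own, I assume the same holds for the MXNet case: the NDArray would have to stay alive (and not be reallocated) for as long as the numpy view is in use.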