Data copy between CPU and GPU on Jetson TX1

I use a Jetson TX1 for inference. It is said that data is copied between CPU and GPU memory. Since the CPU and GPU share the same memory on the TX1, is there a way to avoid using CPU memory and just keep the data on the GPU? Thank you.

I am not too familiar with the Jetson particulars; maybe @kellen can help here.

More generally, in MXNet, an NDArray can be placed on a particular compute context by calling .as_in_context().

For example:

a = mx.nd.ones((10, 10)) will allocate on the CPU by default. But you can do
b = mx.nd.ones((10, 10), ctx=mx.gpu()) to allocate the array directly on the GPU.
You can also copy a to the GPU by doing a.as_in_context(mx.gpu()). Note that this returns a new array on the GPU; a itself stays on the CPU.