What happened? One of the arrays from the same iterator is on the CPU and the other is on the GPU. Is this a bug?


#1

One of the arrays from the same iterator is on the CPU and the other is on the GPU. Is this a bug?
I don’t know where the problem is. Can someone please help me?


#2

If you look at the code of ArrayDataset, at line 151, you can see:

            if isinstance(data, ndarray.NDArray) and len(data.shape) == 1:
                data = data.asnumpy()

This means that if one of the NDArrays is one-dimensional, it gets converted to a NumPy array. The DataLoader then loads it back as an NDArray, which by default lives on the CPU.

Usually that’s what you want, since metrics are typically computed on the CPU. If you want to keep y on the GPU, I recommend adding a dummy extra dimension using .expand_dims():

import mxnet as mx
from mxnet import nd

x = nd.ones((5, 6), mx.gpu())
y = nd.ones((5,), mx.gpu())
y = y.expand_dims(axis=1)  # (5,) -> (5, 1): no longer 1-D, so it stays on the GPU
print(y.shape)
dataset = mx.gluon.data.ArrayDataset(x, y)
dataloader = mx.gluon.data.DataLoader(dataset, batch_size=1)
for data in dataloader:
    print(data)
    break

Output:

(5, 1)
[
[[ 1.  1.  1.  1.  1.  1.]]
<NDArray 1x6 @gpu(0)>, 
[[ 1.]]
<NDArray 1x1 @gpu(0)>]

#4

Thank you very much for helping me solve this problem. The explanation was very detailed. Thank you! :+1: