Integer labels in NDArrayIter

Is there a way to make NDArrayIter spit out integer labels? It always converts labels to float, and in the example below I can’t convert back as some significant digits get lost. Thanks!

from mxnet.io import NDArrayIter
import numpy as np

X_train = np.array([1,2,3,4,5])
Y_train = np.array([1,2,3,4,5111111122])

train_data = NDArrayIter(X_train, Y_train, 2, shuffle=False)

train_data.label[0]

Output:
('softmax_label',
[ 1.00000000e+00 2.00000000e+00 3.00000000e+00 4.00000000e+00
5.11111117e+09]
<NDArray 5 @cpu(0)>)

You can use an MXNet NDArray instead of a NumPy ndarray for Y_train and get integer labels. Here is an example:

import mxnet as mx
import numpy as np


X_train = mx.nd.array([1,2,3,4,5])
Y_train = mx.nd.array([1,2,3,4,5111111122], dtype='int64')

train_data = mx.io.NDArrayIter(X_train, Y_train, 2, shuffle=False)
print(train_data.label[0])

AFAIK, this is not possible with NumPy ndarrays, since the dtype of the NumPy ndarray is not passed along when the MXNet NDArray is constructed from it.
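For context on the lost digits in the original post: the printed value 5.11111117e+09 is consistent with the label being stored as float32, which I am assuming here because it is MXNet's default dtype. A minimal sketch using only the standard library (the helper name to_float32 is made up for illustration) shows that 5111111122 has no exact single-precision representation:

```python
import struct

def to_float32(value):
    """Round a Python number to the nearest IEEE 754 single-precision value."""
    packed = struct.pack("f", float(value))
    return struct.unpack("f", packed)[0]

label = 5111111122

# Between 2**32 and 2**33, neighbouring float32 values are 512 apart,
# so most integers in that range cannot be stored exactly.
as_float32 = to_float32(label)

print(int(as_float32))           # 5111111168
print(int(as_float32) == label)  # False: the original label is not recoverable
print(int(to_float32(4)) == 4)   # True: small labels survive the conversion
```

This is why casting the float labels back to integers afterwards does not help; the dtype has to stay int64 all the way through.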

Thanks, it worked! However, for some strange reason, it switched to floats again when I set shuffle=True.

This is because when shuffle is True, the dtype of the NDArray isn't passed to the array util. Thanks for letting us know, I have opened an MXNet issue for this: https://github.com/apache/incubator-mxnet/issues/8430

Thanks @anirudh2290! What do you think of this workaround:

import mxnet as mx
from mxnet.gluon.data.dataset import ArrayDataset

X_train = mx.nd.array([[1],[2],[3],[4],[5]], dtype='int64')
Y_train = mx.nd.array([[1],[2],[3],[4],[5111111122]], dtype='int64')

dl = mx.gluon.data.DataLoader(ArrayDataset(X_train, Y_train), 2, shuffle=True)

# d1 is the data batch, d2 the label batch
for (d1, d2) in dl:
    print(d2)

Output:
[[5111111122]
[ 3]]
<NDArray 2x1 @cpu(0)>

[[1]
[4]]
<NDArray 2x1 @cpu(0)>

[[2]]
<NDArray 1x1 @cpu(0)>

@avolozin Yep, that works too! You can also use 1D arrays like the ones in your initial example.