Size and stepsize of DataIter

Hi,
Is there a way to find out the size (number of batches) of a DataIter (e.g. DataIter, MXDataIter, …)?
A straightforward “len(data_iter)” does not work.
And secondly, is there a way to set a “stepsize” in the data iterator?
Thanks,
David

Hi David (@dfferstl), I'm not sure I understand your question, but did you try data_iter.data[0][1].shape? Here is some sample code to play with:

import mxnet as mx 

data = mx.ndarray.random_randint(0, 10, (100,3))
label = mx.ndarray.random_randint(0, 10, (100,))
data_iter = mx.io.NDArrayIter(data=data, label=label, batch_size=30)

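for batch in data_iter:
    # data_iter.data is a list of (name, NDArray) pairs, so this prints the
    # shape of the full data array on every iteration, not of the batch
    print(data_iter.data[0][1].shape)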

If you need to know the batch size, you can call

data_iter.batch_size

Thanks for the quick answer. I want to get the total number of batches in an MXDataIter, and I could not find a function that returns it.

Are you asking about the total number of iterations in that for loop? That is the total number of samples divided by the batch size, rounded up when the last partial batch is kept. For the example above it is 100 // 30 + 1 = 4, because the default for last_batch_handle is pad. If you switch it to discard, it is 100 // 30 = 3.
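
To make the arithmetic above concrete, here is a minimal sketch in plain Python. The helper name num_batches is made up for illustration and is not part of MXNet; it only covers the pad and discard settings mentioned above and assumes you already know the number of samples.

```python
import math


def num_batches(num_samples, batch_size, last_batch_handle="pad"):
    """Hypothetical helper (not an MXNet function): batches per epoch."""
    if last_batch_handle == "pad":
        # The last partial batch is padded and kept, so round up.
        return math.ceil(num_samples / batch_size)
    if last_batch_handle == "discard":
        # The last partial batch is dropped, so round down.
        return num_samples // batch_size
    raise ValueError("only 'pad' and 'discard' are handled in this sketch")


print(num_batches(100, 30))             # pad
print(num_batches(100, 30, "discard"))  # discard
```

Rounding up instead of writing N // B + 1 also gives the right count when the number of samples is an exact multiple of the batch size (e.g. 90 samples with batch size 30 is 3 batches, not 4).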

Here you can find more info about NDArrayIter:

https://beta.mxnet.io/api/gluon-related/_autogen/mxnet.io.NDArrayIter.html

Yes, I want to know the total number of iterations for an MXDataIter object. I know how to calculate the number of batches in theory, but my question was how to get this number from the MXDataIter object itself (without stepping through the for loop and counting the steps).
This is necessary, for example, when loading the data from a rec data container where it is unclear how many samples it contains.

So I assume there is no way to find this out yet, is there?
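
For reference, the counting fallback mentioned above could look roughly like the sketch below. It is not a built-in MXNet feature, and it does step through the data once. It only assumes that the iterator can be looped over and has a reset() method, as MXNet data iterators do. count_batches and ToyIter are names made up here; ToyIter is just a stand-in so the snippet runs without MXNet or a rec file.

```python
def count_batches(data_iter):
    """Count batches with one full pass, then rewind the iterator."""
    data_iter.reset()                 # start from the first batch
    n = sum(1 for _ in data_iter)     # one pass over all batches
    data_iter.reset()                 # leave the iterator ready for reuse
    return n


class ToyIter:
    """Hypothetical stand-in for a data iterator that yields n batches."""

    def __init__(self, n):
        self.n = n
        self.i = 0

    def reset(self):
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        self.i += 1
        return self.i


print(count_batches(ToyIter(4)))
```

The pass costs one read of the whole dataset, so it makes sense to do it once and cache the result rather than calling it every epoch.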