Is there a way to find out the size (number of batches) of a DataIter (e.g. DataIter, MXDataIter, …)?
A straightforward “len(data_iter)” does not work.
And secondly, is there a way to set a “stepsize” in the data iterator?
Hi David (@dfferstl), I'm not sure I understand your question, but did you try data_iter.data.shape? Here is some sample code to play with:
    import mxnet as mx

    data = mx.ndarray.random_randint(0, 10, (100,3))
    label = mx.ndarray.random_randint(0, 10, (100,))
    data_iter = mx.io.NDArrayIter(data=data, label=label, batch_size=30)

    for batch in data_iter:
        print(data_iter.data.shape)
If you need to know the batch size, you can call
Thanks for the quick answer. I want to see the total number of batches in an MXDataIter, and I could not find a function that returns that.
Are you asking about the total number of iterations in that for loop? This would be the total number of samples divided by the batch size, rounded up or down depending on how the last incomplete batch is handled. For the example above, it would be 100 // 30 + 1 = 4, because the default for last_batch_handle is pad. If you switch it to discard, it would be 100 // 30 = 3.
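The arithmetic above can be written as a small helper. This is only a sketch: the function name expected_num_batches is made up for illustration and is not part of MXNet, and it covers only the pad and discard settings mentioned here.

```python
def expected_num_batches(num_samples, batch_size, last_batch_handle="pad"):
    """Estimate how many batches one epoch yields (hypothetical helper, not an MXNet API)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full, remainder = divmod(num_samples, batch_size)
    if last_batch_handle == "pad":
        # An incomplete final batch is padded and still returned.
        return full + (1 if remainder else 0)
    if last_batch_handle == "discard":
        # An incomplete final batch is dropped.
        return full
    raise ValueError("this sketch only covers 'pad' and 'discard'")


print(expected_num_batches(100, 30))             # 4
print(expected_num_batches(100, 30, "discard"))  # 3
print(expected_num_batches(90, 30))              # 3, no remainder so nothing to pad
```

Note that the "+ 1" only applies when the sample count is not an exact multiple of the batch size, which is why the helper checks the remainder.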
Here you can find more info about NDArrayIter:
Yes, I want to know the total number of iterations for an MXDataIter object. I know how to calculate the number of batches in theory, but my question was how to get this number from the MXDataIter object itself (without stepping through the for loop and counting the steps).
This is necessary, for example, when loading the data from a rec data container, where it is unclear how many samples it contains.
So I assume there is no way to find this out yet, is there?
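In case it helps as a workaround, here are two fallbacks, sketched under assumptions. Both function names are hypothetical and not part of MXNet. The first simply runs through one epoch and rewinds, so it relies only on the iterator being iterable and having a reset() method. The second assumes the rec file was written together with an .idx index file that has one "key<TAB>offset" line per record (as far as I know, this is what im2rec.py produces), so counting its lines gives the sample count without touching the data.

```python
def count_batches(data_iter):
    """Count batches by iterating over one full epoch, then rewind the iterator."""
    data_iter.reset()                  # start from the beginning of the epoch
    count = sum(1 for _ in data_iter)  # one pass over the data
    data_iter.reset()                  # leave the iterator ready for reuse
    return count


def count_idx_records(idx_path):
    """Count records listed in an .idx file (assumed: one non-empty line per record)."""
    with open(idx_path) as f:
        return sum(1 for line in f if line.strip())
```

The first approach costs a full pass over the data, so it is probably best done once and cached. With the second, the number of batches then follows from the sample count and the batch size in the usual way.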