Hi,

Is there a way to find out the size (number of batches) of a DataIter (e.g. DataIter, MXDataIter, …)?

A straightforward `len(data_iter)` does not work.

And secondly, is there a way to set a “stepsize” in the data iterator?

Thanks,

David

# Size and stepsize of DataIter

Hi David (@dfferstl), I'm not sure I understand your question, but did you try `data_iter.data[0][1].shape`? Here is some sample code to play with:

```
import mxnet as mx
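data = mx.ndarray.random_randint(0, 10, (100, 3))
label = mx.ndarray.random_randint(0, 10, (100,))
data_iter = mx.io.NDArrayIter(data=data, label=label, batch_size=30)
for batch in data_iter:
    # data_iter.data is a list of (name, NDArray) pairs, so this prints
    # the shape of the full data array on every iteration
    print(data_iter.data[0][1].shape)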
```

If you need to know the batch size, you can call

`data_iter.batch_size`

Thanks for the quick answer. I want to get the total number of batches in an MXDataIter, and I could not find a function that returns it.

Are you asking about the total number of iterations in that for loop? This would be the total number of samples divided by the batch size, rounded up or down depending on how the last partial batch is handled. For the example above, it would be *100 // 30 + 1 = 4* as the default for **last_batch_handle** is **pad**. If you switch it to **discard**, then it would be *100 // 30 = 3*.
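In case it helps, here is a minimal sketch of that arithmetic in plain Python. The helper name `num_batches` is made up for illustration and is not part of MXNet. It only covers the **pad** and **discard** cases mentioned above.

```python
def num_batches(num_samples, batch_size, last_batch_handle="pad"):
    """Hypothetical helper: batches per epoch for a known sample count."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Split into the number of full batches and the leftover samples
    full, remainder = divmod(num_samples, batch_size)
    if last_batch_handle == "pad":
        # A partial last batch is padded and counted as one more batch
        return full + (1 if remainder else 0)
    if last_batch_handle == "discard":
        # A partial last batch is dropped
        return full
    raise ValueError("only 'pad' and 'discard' are handled in this sketch")


print(num_batches(100, 30))             # 4
print(num_batches(100, 30, "discard"))  # 3
```

Note that with **pad** the extra batch only appears when there is a remainder, so 90 samples with a batch size of 30 give 3 batches, not 4.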

Here you can find more info about NDArrayIter:

https://beta.mxnet.io/api/gluon-related/_autogen/mxnet.io.NDArrayIter.html

Yes, I want to know the total number of iterations for an MXDataIter object. I know how to calculate the number of batches in theory, but my question was how to get this number from the MXDataIter object itself (without stepping through the for loop and counting the steps).

This is necessary, e.g., when loading the data from a rec data container where it is unclear how many samples it contains.

So I assume there is no way to find this out yet, is there?
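One possible workaround, sketched here under assumptions rather than as a confirmed API: if the rec file comes with an index file (one tab-separated key/offset line per record, as indexed RecordIO files are usually written), the sample count can be read from that file without touching the iterator. The function names and the `round_up` flag below are hypothetical, and whether the last partial batch counts depends on how the iterator is configured.

```python
import math


def count_records(idx_path):
    """Count records by counting non-empty lines in the index file.

    Assumes one line per record in the index file.
    """
    with open(idx_path) as f:
        return sum(1 for line in f if line.strip())


def batches_per_epoch(idx_path, batch_size, round_up=True):
    """Hypothetical helper: estimate batches per epoch from the index file."""
    n = count_records(idx_path)
    # round_up=True counts a partial last batch, round_up=False drops it
    return math.ceil(n / batch_size) if round_up else n // batch_size
```

This does not answer whether the iterator object itself exposes the number of batches; it only avoids a full pass over the data when an index file is available.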