What is the meaning of the `handle` parameter in `MXDataIter`, which I am using to build my own custom `CSVIter`?


Hi guys,
I find that the current mxnet.io.CSVIter does not support run-time image augmentation when reading from .csv files. So I am writing my own CSV data iterator that inherits from the superclass MXDataIter, because the MXNet docs say MXDataIter is the Python wrapper class for the C++ CSVIter. Below is my code for myCSVIter:

import mxnet as mx

class myCSVIter(mx.io.MXDataIter):
    def __init__(self, handle, augs, **kwargs):
        super(myCSVIter, self).__init__(handle, **kwargs)
        self.augs = augs  # list of callables, each mapping image -> image

    def reset(self):
        super(myCSVIter, self).reset()

    def __next__(self):
        return self.next()

    def next(self):
        try:
            data_batch = super(myCSVIter, self).next()
            image = data_batch.data[0]
            label = data_batch.label[0]
            for aug in self.augs:
                image = aug(image)   # apply image augmentations
            # DataBatch expects lists of NDArrays for data and label
            return mx.io.DataBatch([image], [label])
        except StopIteration:
            raise

However, I don’t know what the handle parameter of mx.io.MXDataIter means. So I tried passing a CSVIter object (train_img_iter) to it, created as follows:

train_img_iter = mx.io.CSVIter(data_csv='train_datas_tmp.csv', data_shape=(4, 640, 480),
                               label_csv='train_labels_tmp.csv', label_shape=(640, 480),
                               batch_size=1, dtype='float32')

That is, I pass train_img_iter as the handle parameter when constructing an instance of myCSVIter, but I get an error: Don’t know how to convert parameter 1.

So, my questions are:

  1. Is my approach to implementing a custom CSVIter (with image augmentation enabled) correct?
  2. What is the meaning of the handle parameter in MXDataIter?
  3. Is there any way to read .csv data files and apply image augmentations at the same time?

  1. You need to inherit from mx.io.DataIter, not from mx.io.MXDataIter, so you don’t actually need to deal with handle directly. See this tutorial for an example of how to implement a custom iterator: http://mxnet.incubator.apache.org/tutorials/basic/data.html

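  2. handle is a handle to the underlying C++ iterator. It is a low-level pointer rather than a Python iterator object, which is most likely why passing a CSVIter instance fails with the conversion error. You don’t need to work with it directly.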

  3. I don’t know of any out-of-the-box way to read CSV files and apply augmentation, so it seems you will need to implement something like that yourself. However, if you are using Gluon, you may be able to make your life a little easier by using a transform. Take a look at Dataset and DataLoader: https://mxnet.incubator.apache.org/tutorials/gluon/datasets.html
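
To make points 1 and 3 a bit more concrete, here is a minimal sketch of the wrapping idea: keep the existing iterator as a member instead of passing it as handle, and apply the augmentations in next(). It is plain Python so that it runs without MXNet; the names AugmentingIter, ListIter, Batch and make_batch are made up for this illustration and are not part of the MXNet API.

```python
from collections import namedtuple

# Stand-in for a batch object with .data and .label lists (hypothetical).
Batch = namedtuple("Batch", ["data", "label"])


class ListIter:
    """Tiny stand-in for an inner iterator such as a CSV reader (hypothetical)."""

    def __init__(self, batches):
        self.batches = batches
        self.pos = 0

    def reset(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.batches):
            raise StopIteration
        batch = self.batches[self.pos]
        self.pos += 1
        return batch


class AugmentingIter:
    """Wraps an iterator exposing next()/reset() and augments each batch's data."""

    def __init__(self, inner, augs, make_batch):
        self.inner = inner            # the wrapped iterator, not a C handle
        self.augs = list(augs)        # callables mapping image -> image
        self.make_batch = make_batch  # builds the outgoing batch from (image, label)

    def __iter__(self):
        return self

    def reset(self):
        self.inner.reset()

    def next(self):
        batch = self.inner.next()     # StopIteration from the inner iterator propagates
        image = batch.data[0]
        label = batch.label[0]
        for aug in self.augs:
            image = aug(image)        # apply augmentations in order
        return self.make_batch(image, label)

    __next__ = next                   # support the Python 3 iterator protocol


def flip(image):
    """Toy augmentation: reverse the sequence."""
    return image[::-1]


source = ListIter([Batch([[1, 2]], [0]), Batch([[3, 4]], [1])])
it = AugmentingIter(source, [flip], lambda d, l: Batch([d], [l]))
results = [b.data[0] for b in it]
```

With MXNet, the same wrapper could presumably take train_img_iter as inner and `lambda d, l: mx.io.DataBatch([d], [l])` as make_batch. To use it with Module.fit you would additionally inherit from mx.io.DataIter and forward provide_data and provide_label from the inner iterator, as the tutorial linked above does for its custom iterator.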


Thanks for your reply.

I will try inheriting from mx.io.DataIter to read my .csv data iteratively and apply the augmentations at the same time.