Utilizing sample weights (not class weights) while training Gluon/Symbol model



I have a use case where each sample image in the training data has a weight associated with it. This weight is not correlated with the class label; it indicates the credibility of the data source the image was obtained from. I would like to use these weights while training a Gluon/Symbol model with a Softmax/Sigmoid loss.

I see that MXNet offers a sample_weight argument in mxnet.gluon.loss.SigmoidBinaryCrossEntropyLoss(), which is typically used to balance imbalanced data. In all the examples I have found online, the weights are generated from the class label while each data batch is processed. At that point we only have the data batch and no other information about the individual images. For our use case, we need a way to iterate over each image in the batch and map its image ID to a sample weight. Is there any way to achieve this?
Links to related examples would be helpful.



You can add the sample weights to the dataset that the DataLoader reads from, so each batch yields its weights alongside the data and labels.
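If the weights are keyed by image ID, they first have to be lined up with the images in dataset order. A minimal sketch of that step in plain Python, assuming a list of IDs and an ID-to-weight dictionary (image_ids, weight_by_id and align_weights are hypothetical names, not part of MXNet):

```python
# Hypothetical inputs: image IDs in dataset order and a credibility weight per ID.
image_ids = ["img_001", "img_002", "img_003"]
weight_by_id = {"img_001": 1.0, "img_002": 0.3, "img_003": 0.8}

def align_weights(ids, weight_lookup, default=1.0):
    """Return one [weight] row per ID, in the same order as ids.

    IDs missing from the lookup fall back to `default` (an assumption;
    raising an error may be preferable in a real pipeline).
    """
    return [[weight_lookup.get(image_id, default)] for image_id in ids]

weights = align_weights(image_ids, weight_by_id)
```

Each weight is wrapped in its own one-element row so that, once converted to an array (for example with mx.nd.array), it has shape (N, 1) and can be passed to the dataset as the third array.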

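import mxnet as mx
from mxnet import gluon

# data, label and weight must have the same first dimension (one weight per sample).
# For per-sample weighting, weight should be broadcastable to the prediction
# shape, e.g. (N, 1) for predictions of shape (N, C).
dataset = gluon.data.ArrayDataset(data, label, weight)
dataloader = gluon.data.DataLoader(dataset, batch_size=batch_size)

# Create the loss once, outside the training loop.
loss_function = gluon.loss.SigmoidBinaryCrossEntropyLoss()

for data, label, weight in dataloader:
    data = data.as_in_context(ctx)
    label = label.as_in_context(ctx)
    weight = weight.as_in_context(ctx)
    output = model(data)
    # The third argument is sample_weight: each sample's loss is scaled by its weight.
    loss = loss_function(output, label, weight)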
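To make the effect of the third argument concrete, here is a plain-Python sketch of the arithmetic, assuming the loss multiplies each sample's cross-entropy term by that sample's weight. It is an illustration rather than MXNet code, and sigmoid_bce and weighted_batch_loss are hypothetical helper names:

```python
import math

def sigmoid_bce(logit, label):
    """Numerically stable binary cross-entropy computed from a raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def weighted_batch_loss(logits, labels, weights):
    """Scale each sample's loss by its own weight."""
    return [w * sigmoid_bce(x, y) for x, y, w in zip(logits, labels, weights)]

logits = [2.0, -1.0, 0.5]
labels = [1.0, 0.0, 1.0]
weights = [1.0, 0.0, 2.0]  # second sample is ignored, third counts double

losses = weighted_batch_loss(logits, labels, weights)
```

A weight of 0 removes a sample from the loss entirely, while a weight of 2 makes it contribute as much as two identical samples would. Images from less credible sources can therefore be given weights below 1 to reduce their influence on the gradient.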


That worked perfectly. Thanks!