GluonCV YOLOv3 training

Hi,
I can't make sense of the official GluonCV YOLOv3 tutorial.

At the beginning of the tutorial, it says to use:

from gluoncv.data.batchify import Tuple, Stack, Pad
from mxnet.gluon.data import DataLoader

batchify_fn = Tuple(Stack(), Pad(pad_val=-1))
train_loader = DataLoader(train_dataset.transform(train_transform), batch_size, shuffle=True,
                          batchify_fn=batchify_fn, last_batch='rollover', num_workers=num_workers)
val_loader = DataLoader(val_dataset.transform(val_transform), batch_size, shuffle=False,
                        batchify_fn=batchify_fn, last_batch='keep', num_workers=num_workers)

Then, seemingly out of nowhere, in the training loop at the bottom of the page, the code changes to this:

from gluoncv.data.transforms import presets

train_transform = presets.yolo.YOLO3DefaultTrainTransform(width, height, net)
# return stacked images, center_targets, scale_targets, gradient weights, objectness_targets, class_targets
# additionally, return padded ground truth bboxes, so there are 7 components returned by dataloader
batchify_fn = Tuple(*([Stack() for _ in range(6)] + [Pad(axis=0, pad_val=-1) for _ in range(1)]))
train_loader = DataLoader(train_dataset.transform(train_transform), batch_size, shuffle=True,
                          batchify_fn=batchify_fn, last_batch='rollover', num_workers=num_workers)

What is the difference between these two snippets? They seem to conflict.

Hi Olivier,

I think it has to do with the fact that at the beginning of the tutorial, it's just going through the basic transformations for the network in inference mode, i.e. that transform only operates on the image you pass in (normalization, conversion to tensor, maybe some random augmentations, etc.).
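
As a rough sketch (reusing the val_dataset, width and height from the tutorial), the inference/val transform returns just two arrays per sample, which is why that first batchify_fn only needs a Stack() for the images and a Pad() for the variable-length labels:

from gluoncv.data.transforms import presets

# quick check of what the val transform produces for a single sample
val_transform = presets.yolo.YOLO3DefaultValTransform(width, height)
img, label = val_transform(*val_dataset[0])   # just (image, label)
print(img.shape, label.shape)                 # roughly (3, height, width) and (num_objects, 6) for VOC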

However, when you want to train the model, for YOLO specifically (and for other object detection models), you need to generate some extra targets for the YOLO loss, e.g. objectness scores and scale targets. This is because the YOLO loss consists of comparing what the model predicts for each of those (objectness, scale, center) against the generated targets. In order to do this, you have to pass the net to the train transform so it can use the network (its anchors and feature-map sizes) to pre-generate those targets. On top of that, the output of the network changes in training mode: instead of just predicting bounding boxes and class labels, it returns the losses.
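
By contrast (again just a sketch, reusing the train_dataset, width, height and net from the tutorial), a single sample pushed through the train transform comes back as seven arrays, which is exactly what that second batchify_fn with six Stack()s and one Pad() is built for:

from gluoncv.data.transforms import presets

train_transform = presets.yolo.YOLO3DefaultTrainTransform(width, height, net)
sample = train_transform(*train_dataset[0])
print(len(sample))   # 7: image, objectness, center, scale, weights, class targets, padded gt boxes
for arr in sample:
    print(arr.shape)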

See the source code for YOLO3DefaultTrainTransform for more detail: https://gluon-cv.mxnet.io/_modules/gluoncv/data/transforms/presets/yolo.html#YOLO3DefaultTrainTransform.
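
To make the last point concrete, here is a minimal sketch of the training-mode forward pass (adapted from the tutorial's training-loop snippet; net and train_loader as in your second block): under autograd's training scope the network takes the image, the padded ground-truth boxes and the five prefetched targets, and returns the four loss terms instead of box predictions.

from mxnet import autograd

for ib, batch in enumerate(train_loader):
    # batch: [image, objectness, center, scale, weights, class targets, padded gt boxes]
    with autograd.record():
        input_order = [0, 6, 1, 2, 3, 4, 5]   # image, gt boxes, then the five fixed targets
        obj_loss, center_loss, scale_loss, cls_loss = net(*[batch[o] for o in input_order])
        sum_loss = obj_loss + center_loss + scale_loss + cls_loss
        # in a real loop: autograd.backward(sum_loss); trainer.step(batch_size)
    break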