Default YOLOv3 does not improve

Hi,

I am trying to run the train_yolo3.py example script with every setting left at its default except the batch size. All the losses are nan during training, and after a whole epoch the reported mAP is 0.0. Training does appear to be running (there is load on my GPU), but the model won't improve.

This happens with the default VOC dataset and also with a custom dataset; I have not tested with COCO.

python3 train_yolo3.py --network darknet53 --dataset voc --gpus 0,1,2,3,4,5,6,7 --batch-size 64 -j 16 --log-interval 100 --lr-decay-epoch 160,180 --epochs 200 --syncbn --warmup-epochs 4

Here are the parameters used for the model zoo.

Note that the batch size can have a big influence on the final results, since many hyperparameters (the learning rate especially) are tied to it. If you train with a smaller batch size, try lowering the learning rate from its default and see if that helps.
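
As a rough illustration of how the learning rate can be tied to the batch size, here is a minimal sketch of the common linear scaling heuristic: scale the learning rate by the same factor as the batch size. The helper name `scale_learning_rate` is hypothetical, and the reference values (batch size 64, learning rate 0.001) are assumptions, so check the defaults reported by `train_yolo3.py --help` before relying on them.

```python
def scale_learning_rate(base_lr, base_batch_size, batch_size):
    """Linear scaling heuristic: keep lr / batch_size constant.

    This is a rule of thumb, not something the training script does for you.
    """
    if base_batch_size <= 0 or batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * batch_size / base_batch_size


# Assumed reference configuration (verify against the script's defaults).
REFERENCE_LR = 0.001
REFERENCE_BATCH_SIZE = 64

if __name__ == "__main__":
    # Print a suggested learning rate for a few smaller batch sizes.
    for batch_size in (64, 16, 8):
        lr = scale_learning_rate(REFERENCE_LR, REFERENCE_BATCH_SIZE, batch_size)
        print(f"--batch-size {batch_size} --lr {lr:g}")
```

If the losses are still nan with the scaled value, lowering the learning rate further is a reasonable next step, since the heuristic only gives a starting point.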