I am using the vgg16_atrous_voc model straight out of the zoo (pre-trained). My overall project is about object tracking where occlusion is a major problem. I am experimenting with a classic shell game: in my case 3 cups and 1 M&M. That is, start with the M&M under cup #2, shuffle the cups, and the computer should tell where it ends up. I’m only at the SSD phase. I am training my SSD to detect red cups, blue cups, yellow cups, M&Ms and hands (5 classes). I’m using 3 different colored cups because it will help me later with labeled data when working on the actual shuffle videos. With that background, here is my question:
So far I have 6700 training images and 2200 validation images. The model is excellent at drawing bounding boxes around the cups: SUPER accurate. However, it seems to be color blind. Its ability to distinguish red/yellow/blue is appalling. It tends to think blue cups are yellow, red cups are blue, etc. If there are 4 cups of the same color in an image, it typically gives all of them the same label, but the wrong color. The errors are not randomly distributed.
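To put numbers on "not randomly distributed", one option is to tabulate true color versus predicted color for detections that have already been matched to a ground-truth box. The following is only a sketch: the class names, the function names, and the demo pairs are all hypothetical, and the matching step (e.g. by IoU) is assumed to happen elsewhere.

```python
from collections import Counter

# Hypothetical label names; substitute the names used in your own dataset.
CLASSES = ["red_cup", "blue_cup", "yellow_cup", "mm", "hand"]

def confusion_matrix(pairs, classes=CLASSES):
    """Rows are true classes, columns are predicted classes.

    `pairs` is an iterable of (true_label, predicted_label) tuples, one per
    detection that was already matched to a ground-truth box.
    """
    counts = Counter(pairs)
    return [[counts[(t, p)] for p in classes] for t in classes]

def dominant_errors(matrix, classes=CLASSES):
    """For each true class, return the most frequent wrong prediction, if any."""
    errors = {}
    for i, row in enumerate(matrix):
        wrong = [(n, classes[j]) for j, n in enumerate(row) if j != i and n > 0]
        if wrong:
            errors[classes[i]] = max(wrong)[1]
    return errors

# Made-up pairs purely for illustration, not results from the actual model.
demo_pairs = (
    [("blue_cup", "yellow_cup")] * 3
    + [("blue_cup", "blue_cup")]
    + [("red_cup", "blue_cup")] * 2
)
demo_matrix = confusion_matrix(demo_pairs)
for name, row in zip(CLASSES, demo_matrix):
    print(f"{name:>10}: {row}")
print(dominant_errors(demo_matrix))
```

A consistent off-diagonal pattern (for example, blue almost always predicted as yellow) would point to something systematic rather than to noisy labels.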
A sample of the model’s output as it trains is at the end of this post. What I have tried so far:
- I removed _color_distort from the default SSD transformations, since it looked like it would be a major problem. Removing it didn’t really make much difference.
- I tried VOC07MApMetric (each class converges to 0.90909), then switched to VOCMApMetric (currently in use).
- Each time I add training data, it improves slightly, but it still acts like it is color blind.
- I visually reviewed the training data. Out of 100s of images I found no labeling mistakes: red cups are labeled as red cups.
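A side note on the 0.90909 value: it equals 10/11, which is what an 11-point interpolated AP gives when precision is 1.0 at ten of the eleven recall thresholds and 0 at the last one. The sketch below shows that arithmetic under the PASCAL VOC 2007 11-point rule. It is not GluonCV's implementation, the function name is my own, and the precision/recall values are made up.

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated average precision (PASCAL VOC 2007 style).

    For each threshold t in 0.0, 0.1, ..., 1.0, take the maximum precision
    among points whose recall is >= t (0 if there is none), then average.
    """
    total = 0.0
    for k in range(11):
        t = k / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11.0

# Hypothetical curve: perfect precision, but recall tops out just below 1.0,
# so the threshold t = 1.0 contributes 0 and the other ten contribute 1.
demo_recalls = [0.25, 0.5, 0.75, 0.95]
demo_precisions = [1.0, 1.0, 1.0, 1.0]
print(round(ap_11_point(demo_recalls, demo_precisions), 5))  # 0.90909
```

If that is what is happening, the identical per-class value may say more about the metric's coarse thresholds than about the classes behaving identically.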
I need some clues: why does the model do so poorly at telling the cup colors apart?
INFO:root:[Epoch 203][Batch 99], Speed: 19.662 samples/sec, CrossEntropy=0.669, SmoothL1=0.187
INFO:root:[Epoch 203][Batch 199], Speed: 19.257 samples/sec, CrossEntropy=0.667, SmoothL1=0.198
INFO:root:[Epoch 203] Training cost: 429.114, CrossEntropy=0.667, SmoothL1=0.198
INFO:root:[Epoch 203] Validation:
INFO:root:[Epoch 204][Batch 99], Speed: 19.764 samples/sec, CrossEntropy=0.655, SmoothL1=0.186
INFO:root:[Epoch 204][Batch 199], Speed: 21.072 samples/sec, CrossEntropy=0.664, SmoothL1=0.195
INFO:root:[Epoch 204] Training cost: 422.747, CrossEntropy=0.665, SmoothL1=0.196
INFO:root:[Epoch 204] Validation:
INFO:root:[Epoch 205][Batch 99], Speed: 21.189 samples/sec, CrossEntropy=0.652, SmoothL1=0.189
INFO:root:[Epoch 205][Batch 199], Speed: 20.269 samples/sec, CrossEntropy=0.654, SmoothL1=0.194
INFO:root:[Epoch 205] Training cost: 424.573, CrossEntropy=0.656, SmoothL1=0.196
INFO:root:[Epoch 205] Validation:
INFO:root:[Epoch 206][Batch 99], Speed: 15.724 samples/sec, CrossEntropy=0.666, SmoothL1=0.202
INFO:root:[Epoch 206][Batch 199], Speed: 21.082 samples/sec, CrossEntropy=0.661, SmoothL1=0.199
INFO:root:[Epoch 206] Training cost: 423.563, CrossEntropy=0.660, SmoothL1=0.197
INFO:root:[Epoch 206] Validation:
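To see the trend without reading every line, the per-epoch summary lines can be pulled out of the log with a regular expression. This is a minimal sketch that assumes the exact line format shown in this post; the function name and the `SAMPLE` variable are my own.

```python
import re

# Matches the per-epoch summary lines, e.g.
# "INFO:root:[Epoch 203] Training cost: 429.114, CrossEntropy=0.667, SmoothL1=0.198"
EPOCH_RE = re.compile(
    r"\[Epoch (\d+)\] Training cost: ([\d.]+), CrossEntropy=([\d.]+), SmoothL1=([\d.]+)"
)

def parse_epoch_summaries(log_text):
    """Return a list of (epoch, cross_entropy, smooth_l1) tuples."""
    rows = []
    for m in EPOCH_RE.finditer(log_text):
        rows.append((int(m.group(1)), float(m.group(3)), float(m.group(4))))
    return rows

# The four summary lines copied from the log in this post.
SAMPLE = """\
INFO:root:[Epoch 203] Training cost: 429.114, CrossEntropy=0.667, SmoothL1=0.198
INFO:root:[Epoch 204] Training cost: 422.747, CrossEntropy=0.665, SmoothL1=0.196
INFO:root:[Epoch 205] Training cost: 424.573, CrossEntropy=0.656, SmoothL1=0.196
INFO:root:[Epoch 206] Training cost: 423.563, CrossEntropy=0.660, SmoothL1=0.197
"""

for epoch, ce, sl1 in parse_epoch_summaries(SAMPLE):
    print(f"epoch {epoch}: CrossEntropy={ce:.3f} SmoothL1={sl1:.3f}")
```

Over these four epochs CrossEntropy only moves between 0.667 and 0.656, so the classification loss looks close to flat at this point in training.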